The end of cloud-first: What compute everywhere actually looks like

In 2016, I was working on software for field area network gateways — routers installed in substations and roadside utility cabinets, expected to run unattended for years. Each gateway sat at the root of a low-power wireless mesh connecting thousands of smart meters. The radios were slow, the links were lossy and the backhaul was expensive.

We didn’t debate architecture. We reacted to constraints.

The gateways validated meter events, aggregated readings and made local decisions before anything reached a centralized system. Raw telemetry rarely left the field; the network couldn’t support it, and the latency of a round-trip to a datacenter would have broken real-time grid operations.

At the time, no one called this “compute everywhere.” It wasn’t a strategy — it was simply the only design that worked.

Years later, I saw the same pattern repeat in very different systems: video pipelines that moved inference closer to where content was already served, ML models pushed onto devices that once only forwarded data, and cloud platforms evolving to support compute outside centralized regions. Cloud-first didn’t fail; the assumptions underneath it stopped holding.

This article isn’t about distributed compute as a trend. It’s about the mechanics: where inference actually runs, how data really moves and what breaks operationally once workloads leave the cloud.

What “compute everywhere” actually means

“Compute everywhere” isn’t a rebrand of edge computing. It’s a recognition that modern systems need computation at multiple tiers — and that those tiers must cooperate.

A useful way to think about it is as a spectrum:

Device layer

Sensors and microcontrollers doing basic filtering or inference locally. If you want a concrete reference point, the community around the Edge AI Foundation (formerly the tinyML Foundation) is a good place to start.

Gateway layer

Aggregation points that translate protocols, correlate events and decide what’s worth sending upstream — often using constrained-network IP and routing stacks such as:

  • 6LoWPAN
  • RPL

Edge layer

Compute running in regional points of presence (PoPs) close to data sources and users, often operated by CDN or edge-compute providers.

Cloud layer

Centralized resources for training models, coordinating fleets and doing analytics that benefit from global context.

The shift isn’t about replacing the cloud. It’s about recognizing that “where should this computation run?” no longer has a default answer.

Why the equation changed

A few forces converged to break the cloud-first assumption.

IoT was never cloud-first

Early industrial IoT systems were built around the assumption that data was expensive to move. Devices were power-constrained, often battery-operated and communicated over lossy, low-bandwidth networks.

In utility and smart-metering deployments, that reality shaped the stack. Standards such as IEEE 802.15.4g were developed specifically for Smart Utility Networks operating under these constraints.

Shipping everything upstream wasn’t just inefficient; it was often impossible. Architectures assumed local aggregation and selective reporting because the network simply could not sustain continuous raw backhaul.

That constraint wasn’t new.

What changed was the data — and how quickly it stopped being manageable.

Data got heavier

As systems began incorporating cameras, radar, lidar and high-frequency industrial sensors, payloads stopped looking like measurements and started looking like streams.

A single video feed can consume sustained megabits per second even after compression. Multiply that across dozens of cameras in a factory, retail location or intersection, and continuous upstream transport stops being a cost optimization problem and becomes a hard architectural constraint.
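To see how quickly this compounds, here is the back-of-the-envelope arithmetic with illustrative numbers; the camera count, per-camera bitrate and uplink capacity below are assumptions, not measurements from any specific deployment:

```python
# Back-of-the-envelope uplink math with illustrative numbers:
# 48 cameras at a sustained 4 Mbit/s each (post-compression),
# against a 100 Mbit/s site uplink.
cameras = 48
mbps_per_camera = 4          # assumed sustained bitrate per feed
uplink_mbps = 100            # assumed site uplink capacity

aggregate_mbps = cameras * mbps_per_camera          # 192 Mbit/s
oversubscription = aggregate_mbps / uplink_mbps     # 1.92x over capacity

# Monthly transfer if you somehow shipped it all upstream:
seconds_per_month = 30 * 24 * 3600
tb_per_month = aggregate_mbps / 8 * seconds_per_month / 1e6  # MB/s -> TB
```

At these (modest) numbers the site needs nearly twice its uplink just for raw video, and a month of footage is on the order of 62 TB — before any other traffic competes for the same path.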

This is why large-scale video and sensor deployments rarely ship raw data upstream once they move past pilot scale. Bandwidth adds up faster than most teams expect, and the cost isn’t just monetary.

Links saturate. Latency gets spiky. And once uplinks are congested, failures start to couple: a problem that used to be isolated to one site suddenly bleeds into the broader system because everyone is fighting for the same constrained path.

At a macro level, industry data reflects the same pressure. IDC’s Global DataSphere research (summarized in a Seagate-sponsored report) captures the scale of global data growth and how much of it originates outside centralized data centers.

Network forecasts tell a similar story: Cisco’s Annual Internet Report consistently highlights video as a major driver of IP traffic growth.

Those reports don’t tell you how to design your system, but they explain why the old defaults keep breaking.

In practice, teams respond in remarkably similar ways across domains. They reduce data before it moves. They aggregate, filter and extract features close to where data is generated. Raw data stays local unless there’s a clear operational or analytical reason to ship it upstream.
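That reduce-before-you-move pattern can be sketched minimally. The `summarize` step below is hypothetical: it collapses a window of raw readings into a compact summary and ships raw values only for out-of-band events, which is the shape such filters tend to take regardless of domain:

```python
from statistics import mean

def summarize(readings, threshold):
    """Reduce a window of raw sensor readings to what's worth shipping
    upstream: a compact summary, plus raw values only for anomalies.
    Illustrative sketch, not any specific product's pipeline."""
    window_mean = mean(readings)
    anomalies = [r for r in readings if abs(r - window_mean) > threshold]
    return {
        "count": len(readings),
        "mean": round(window_mean, 2),
        "min": min(readings),
        "max": max(readings),
        "anomalies": anomalies,   # raw data leaves the site only here
    }

# e.g. a window of voltage samples with one out-of-band spike
window = [230.1, 229.8, 230.4, 251.7, 230.0]
summary = summarize(window, threshold=10.0)
```

The upstream payload is a handful of fields per window instead of every sample — and the ratio improves as windows grow.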

Once volume crosses a certain threshold, compute follows it, not as a matter of fashion, but because the network is no longer a neutral substrate.

ML learned to run small

Until recently, meaningful inference required GPUs in centralized clusters. That constraint shaped architectures as much as any design preference.

That’s no longer true.

Post-training quantization, model distillation and hardware-aware optimization are now mainstream — and supported directly in production toolchains. Google’s edge documentation for post-training quantization (via LiteRT / TensorFlow Lite workflows) is a good concrete reference.
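The core idea behind post-training quantization can be shown without any toolchain: map float weights onto int8 values with an affine scale and zero point, trading a small accuracy loss for a 4x size reduction versus float32. This is a toy illustration of the arithmetic, not what LiteRT actually does internally:

```python
def quantize_int8(weights):
    """Affine post-training quantization of float weights to int8.
    Toy sketch of the idea only; production toolchains add calibration,
    per-channel scales and operator-aware handling."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0          # guard all-equal weights
    zero_point = round(-lo / scale) - 128   # int8 value representing 0.0
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from int8 codes."""
    return [(v - zero_point) * scale for v in q]

weights = [-1.0, -0.5, 0.0, 0.5, 1.0]
q, scale, zp = quantize_int8(weights)
approx = dequantize(q, scale, zp)   # each value within one scale step
```

Each recovered weight lands within one quantization step of the original, which is why accuracy often degrades only slightly while model size and memory bandwidth drop sharply.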

As a result, models that once demanded datacenter-class hardware can now run within power and memory budgets measured in single-digit watts, particularly when paired with purpose-built edge accelerators and optimized runtimes. (Again, the Edge AI Foundation community is a useful signpost here.)

What made this viable wasn’t a single breakthrough, but a convergence: smaller models, better tooling and edge hardware able to run inference continuously without blowing power or cost budgets.

Physics and regulations

Some constraints are absolute. In fiber, propagation delay alone sets a floor on latency. A commonly used rule of thumb is roughly 4.9 microseconds per kilometer, often rounded to 5 µs/km.
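That rule of thumb makes latency floors easy to compute before any architecture debate starts. A quick sketch, where the 2,000 km distance is illustrative:

```python
# Fiber propagation delay rule of thumb: ~4.9 microseconds per km, one way.
US_PER_KM = 4.9

def rtt_floor_ms(distance_km):
    """Minimum round-trip time from propagation alone, ignoring
    serialization, queuing and processing delays, which only add to it."""
    return 2 * distance_km * US_PER_KM / 1000

# e.g. a site roughly 2,000 km from the nearest cloud region (illustrative):
floor_ms = rtt_floor_ms(2000)   # ~19.6 ms before any work happens
```

If a control loop needs single-digit milliseconds, no amount of cloud optimization beats that floor; the computation has to move closer.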

Regulatory constraints are just as unforgiving. Data residency and processing requirements under GDPR and similar frameworks shape where certain data can be processed.

Edge inference helps keep sensitive data local, with only aggregated or anonymized results sent upstream.

What this looks like in production

Once you accept those constraints, you keep seeing the same architectural shapes.

Decisions at the grid edge

In utility systems, milliseconds matter. Fault detection and isolation must happen locally to maintain grid stability. Gateways execute control logic continuously, while centralized systems focus on planning, analytics and model updates.

The cloud remains essential — but it’s not in the real-time control loop.

Video processed near where content lives

CDN operators and edge platforms increasingly provide compute capabilities at or near their PoPs. When video is already distributed close to users for delivery efficiency, processing it locally avoids unnecessary data movement.

You can see this kind of edge/cloud split discussed in live video analytics work, including Microsoft Research’s Rocket project.

Devices now decide

Across industrial and retail environments, devices that once forwarded raw measurements now filter, classify and act locally. Central systems still matter for aggregation, long-term analysis and retraining — but they’re no longer in the critical path for every decision the system makes.

Operational complexity

Here’s where the “compute everywhere” pitch gets fuzzy. The tooling evolved fast. Operating it is slower, harder work.

Deployment isn’t continuous anymore

Cloud deployments assume constant connectivity. Edge devices do not. Some synchronize once a day. Others disappear for weeks.

Updating software or models turns into a logistics problem: staged rollouts, health checks and the ability to stop or roll back when things go wrong. Those patterns show up explicitly in job-based fleet update mechanisms — for example, AWS IoT jobs.
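Those rollout mechanics can be sketched generically. The wave-based loop below is a hypothetical interface, not the AWS IoT Jobs API; it shows the shape most fleet-update systems converge on: expand in stages, check health after each wave, halt before a bad update reaches the whole fleet:

```python
def staged_rollout(fleet, update, wave_fracs=(0.01, 0.10, 0.50, 1.0),
                   max_failure_rate=0.05):
    """Push an update in expanding waves; halt if a wave's failure rate
    exceeds the threshold. `update(device)` returns True on a healthy
    post-update check. Hypothetical sketch, not any vendor's API."""
    done = 0
    for frac in wave_fracs:
        target = int(len(fleet) * frac)
        wave = fleet[done:target]
        if not wave:
            continue
        failures = sum(0 if update(d) else 1 for d in wave)
        if failures / len(wave) > max_failure_rate:
            # stop expanding; surviving devices keep the old version
            return {"status": "halted", "updated": done}
        done = target
    return {"status": "complete", "updated": done}
```

The small first wave is the point: a broken update burns one device, not thousands of field units you can’t easily reach.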

Partial failures are normal

In fleets of thousands of devices, something is always broken. Power issues, network partitions, hardware variation and firmware bugs create a steady state of partial failure.

Observability is harder, too. A silent device might be offline — or dead. Distinguishing between the two requires explicit design, often based on heartbeats and deadlines rather than continuous metrics.
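A heartbeat-and-deadline liveness check might look like the sketch below; the interval and thresholds are illustrative, and the statuses are deliberately hedged because silence never proves a device is dead:

```python
import time

class HeartbeatTracker:
    """Deadline-based liveness: 'suspect' after one missed interval,
    'presumed-offline' after several. Without out-of-band evidence we
    can't distinguish a network partition from a dead device."""

    def __init__(self, interval_s=60, missed_for_offline=5):
        self.interval = interval_s
        self.offline_after = interval_s * missed_for_offline
        self.last_seen = {}

    def beat(self, device_id, now=None):
        # record a heartbeat; `now` is injectable for testing
        self.last_seen[device_id] = now if now is not None else time.time()

    def status(self, device_id, now=None):
        now = now if now is not None else time.time()
        seen = self.last_seen.get(device_id)
        if seen is None:
            return "unknown"
        silent = now - seen
        if silent < self.interval:
            return "healthy"
        if silent < self.offline_after:
            return "suspect"
        return "presumed-offline"
```

Note that the design decides based on deadlines, not continuous metrics: the question is “has the deadline passed?”, which works even when the device cannot push anything at all.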

Fleet diversity

Over time, edge fleets drift. Hardware revisions, firmware versions and configuration exceptions accumulate. A model that works on most devices fails on a minority due to subtle differences no one documented.

Maintaining homogeneity becomes an operational necessity, not an aesthetic preference.

How teams actually decide what runs where

The teams that navigate this transition well don’t start with an “edge strategy.” They start by asking uncomfortable questions about their workload.

  • Where does the data originate, and what does it cost to move? Data gravity usually matters more than latency. If data is generated at the edge, shipping models outward is often cheaper and simpler than pulling raw data back to the cloud.
  • What constraints are non-negotiable? Physics sets latency floors. Regulations restrict data movement. Power and connectivity shape what you can assume about availability. When one of these forces compute outward, it’s better to accept it early than fight it later.
  • What are you actually optimizing for? I’ve seen teams push inference to the edge in the name of “latency” when their application could tolerate hundreds of milliseconds. The result was a large increase in operational complexity with no user-visible benefit. Measure what actually matters before you distribute anything.
  • Can you operate it? This is the question teams skip. Running edge infrastructure requires skills many cloud-native organizations don’t have: embedded systems experience, fleet management and tolerance for intermittent connectivity. If you can’t reliably update devices or reason about partial failures, keeping workloads centralized is often the safer choice.

The new default

Compute everywhere isn’t a new layer you bolt onto an existing architecture. It’s a change in what teams assume by default.

The cloud didn’t become irrelevant. It stopped being the reflexive answer to every placement question.

Organizations that navigate this well don’t frame the problem as edge versus cloud. They treat the device-to-cloud continuum as a design space and make explicit choices within it. Inference runs close to where data is generated. Training and coordination stay centralized, where aggregation pays off. Analytics lives where global visibility actually adds value.

What surprised me wasn’t that teams moved compute out of the cloud. It was how rarely they did it because they wanted to — and how often they did it because they had to.

This article is published as part of the Foundry Expert Contributor Network.


March 23, 2026