What goes where: How AI is forcing a new workload placement strategy

The first AI infrastructure conversations I keep getting pulled into sound like cloud debates. Should this run in a hyperscale cloud? Do we need private capacity? Is sovereign cloud enough? Can we keep the model in one place and the retrieval layer in another? Those are reasonable opening questions. In my experience, they rarely determine whether an AI workload will be operationally sound, economically defensible and governable at scale. They are just the entry point.

More than once, I have watched a meeting begin with broad posture language – cloud first, hybrid by exception, private where required – and then shift the moment someone describes the actual workload. It has to pull from internal content that cannot move freely. It sits within a workflow where response time matters. It may call systems of record. It may have to stay within a jurisdictional boundary. It may look cheap in a pilot and expensive once inference, storage, network movement and monitoring become persistent. Once the workload becomes concrete, the old posture language starts to thin out.

My previous piece argued that serious enterprise AI governance starts above the tool, in the control plane that determines what AI can see, touch and do. This is the question that follows immediately. Once an enterprise can govern AI, it still has to decide where each workload should run. That is becoming the more consequential infrastructure decision now, because AI is exposing the limits of a broad cloud posture and prompting a more practical discussion about fit.

AI is breaking the old cloud shorthand

For years, many organizations could frame cloud strategy in relatively simple terms. Cloud-first was often enough as a guiding policy, even if the reality underneath was always messier. AI changes that. McKinsey recently noted that AI compute is now primarily split between training and inference, and that those workloads are already reshaping site selection, power strategy and architectural design across hyperscaler portfolios. At the same time, Uptime Institute’s 2025 survey describes an industry grappling with rising costs, worsening power constraints and the challenge of meeting AI-driven density demands. That combination should tell leaders something important: AI is not just adding more demand to the existing cloud conversation. It is changing the variables inside it.

Part of the reason is that AI is not a single workload category. Retrieval-heavy use cases create different pressures than large-scale inference. Fine-tuning has a different economic and infrastructure profile than agentic workflows connected to enterprise systems. Batch AI processing behaves differently from user-facing workloads that depend on speed and locality. Some workloads are spiky and experimental, while others quickly settle into steady operational demand. Once those differences become visible, the real issue is no longer whether private cloud is back or whether hyperscale remains dominant. The issue is whether the enterprise has a defensible way to decide what goes where and why.

The cleaner way to frame it is this: AI is turning cloud strategy back into a workload placement discipline. The question is no longer which cloud posture sounds right in the abstract, but which environment best fits the workload’s economics, data movement, latency, risk and operating constraints once the workload becomes real.

This is not nostalgia for private cloud

That distinction matters because some of the louder narratives about AI infrastructure still boil down to a familiar headline: “Private cloud is back.” In some cases, yes, parts of the AI stack are moving closer to enterprise boundaries. But that does not automatically mean the market is swinging backward. Uptime’s recent analysis of cloud repatriation makes the balance clear: Costs are pushing some workloads back toward enterprise data centers, but most organizations are still running several public clouds alongside on-premises environments in a hybrid model, and overall cloud usage is not collapsing. What is happening is more selective. Enterprises are becoming less ideological.

In practice, the reasons are more about discipline than nostalgia. Some AI workloads perform better in the hyperscale cloud because access to frontier models, elastic capacity and faster experimentation still matter more than anything else. Other workloads start to lean the other way once inference becomes steady, data movement becomes expensive, retrieval must sit near sensitive enterprise content or the operating environment cannot tolerate long network paths. Predictable demand changes the economics. So does locality. So does control. That is not a throwback. It is architecture growing up again.

You can see the market reacting to this directly. Microsoft’s recent Sovereign Cloud expansion is framed as a continuum spanning public and private environments, including fully disconnected operations and local AI inference. AWS now positions its European Sovereign Cloud around data residency, operational autonomy and resiliency requirements. Google’s Vertex AI documentation distinguishes where data remains at rest from where machine learning processing occurs. Vendor announcements do not settle the issue. They do show where the market is moving and why enterprises are rethinking placement more seriously.
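The point in this section that predictable demand changes the economics can be made concrete with simple arithmetic. The sketch below is a minimal illustration, not a pricing model: the function names, the per-request price and the fixed monthly cost are all hypothetical numbers chosen for the example, and it deliberately ignores storage, network movement and monitoring, which would shift the threshold in practice.

```python
def monthly_cost_on_demand(requests_per_month: int, price_per_1k_requests: float) -> float:
    """Pay-per-use cost: scales linearly with request volume."""
    return requests_per_month / 1000 * price_per_1k_requests


def break_even_requests(fixed_monthly_cost: float, price_per_1k_requests: float) -> float:
    """Monthly volume above which fixed capacity costs less than pay-per-use."""
    return fixed_monthly_cost / price_per_1k_requests * 1000


# Hypothetical figures for illustration only, not vendor pricing.
PRICE_PER_1K = 2.0        # currency units per 1,000 inference requests
FIXED_MONTHLY = 20_000.0  # currency units per month for dedicated capacity

pilot = monthly_cost_on_demand(500_000, PRICE_PER_1K)        # small pilot volume
steady = monthly_cost_on_demand(30_000_000, PRICE_PER_1K)    # steady production volume
threshold = break_even_requests(FIXED_MONTHLY, PRICE_PER_1K)

print(f"pilot: {pilot:,.0f}  steady: {steady:,.0f}  break-even: {threshold:,.0f} requests/month")
```

Under these assumed numbers, the pilot costs far less than dedicated capacity and the steady workload costs three times as much, which is the pattern of a workload that looks cheap in a pilot and expensive once usage becomes persistent.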

Sovereignty is not a label

This is where the sovereignty discussion either becomes serious or devolves into branding. In most leadership conversations, sovereignty is used as shorthand for “keep it local.” That is too loose to be useful. The European Commission’s Cloud Sovereignty Framework treats sovereignty as a set of explicit objectives with required assurance levels, not as a marketing adjective. eu-LISA’s sovereign cloud brief makes a similar point from a public-sector perspective, tying the issue to data localization, governance, compliance, jurisdiction, transparency and operational control. That is much closer to the real decision space.

For AI workloads, sovereignty usually raises several questions at once. Where is data stored at rest? Where is processing occurring? Whose law applies if something is disputed or compelled? Who can administer the environment? What dependencies remain with the provider? What evidence survives an audit, incident review or regulatory challenge? Those questions matter more for AI than for a generic application migration because AI systems often blend model access, retrieval, data movement, tool invocation and action pathways within a single operating pattern. A workload can satisfy residency on paper and still fail the broader control test in practice.

That is also why private or sovereign environments help only if the control layer remains modern. If identity is inconsistent, policy enforcement is fragmented, audit evidence is weak or observability disappears as a workload moves closer to the enterprise, the organization has not solved the problem. It has merely relocated it. A sovereign label does not substitute for strong policy, traceability and operating discipline.
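One way to keep sovereignty from collapsing into a label is to record an answer to each of the questions above separately. The following is a minimal sketch under that assumption; the class and field names are hypothetical and do not come from the European Commission framework, the eu-LISA brief or any vendor.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SovereigntyCheck:
    """One flag per question in the text; field names are illustrative."""
    data_at_rest_in_jurisdiction: bool
    processing_in_jurisdiction: bool
    applicable_law_understood: bool
    administration_controlled: bool
    provider_dependencies_documented: bool
    audit_evidence_retained: bool


def unmet(check: SovereigntyCheck) -> list[str]:
    """Return the names of the questions that are not yet satisfied."""
    return [f.name for f in fields(check) if not getattr(check, f.name)]


# Hypothetical workload: data stays local, but processing happens elsewhere
# and audit evidence is weak.
example = SovereigntyCheck(
    data_at_rest_in_jurisdiction=True,
    processing_in_jurisdiction=False,
    applicable_law_understood=True,
    administration_controlled=True,
    provider_dependencies_documented=True,
    audit_evidence_retained=False,
)

print(unmet(example))
```

The example workload satisfies residency on paper and still leaves two questions open, which is the gap between a residency claim and the broader control test.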

What better organizations do differently

The stronger organizations I see are not trying to settle the whole argument with a single-platform doctrine. They are building repeatable placement logic. Usually, that starts with a small set of questions, not a giant framework. What does the workload cost when usage becomes steady rather than experimental? How much data must move, and how often? Which response times actually matter to the business process? Which data classes and jurisdictions are involved? What observability and audit evidence will be needed if this workload becomes material? How hard would it be to move or redesign later if the economics or regulatory conditions change?

Those questions quickly elevate the quality of the conversation. They shift it from product preference to operating model territory. They also bring the right people into the room. Placement is not just a cloud team decision. It pulls in architecture, security, data, platform, infrastructure and operating leadership because the answer is rarely just about where compute happens to sit. It is about trust boundaries, failure modes, unit economics and the conditions under which an AI workload becomes part of real work.

The better organizations also separate workload classes earlier than most. They do not let a retrieval-heavy assistant over internal knowledge use the same placement logic as large-scale model training. They do not treat an agent that can take action in enterprise systems the same way they treat a passive assistant. They do not apply the same assumptions to a batch-processing pipeline and to a user-facing operational workload with tight latency expectations. It sounds obvious. In practice, many organizations still miss it, and a weak AI strategy often starts there.
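Repeatable placement logic can start as nothing more than a consistent record per workload. The sketch below is one hypothetical way to capture the workload classes and pressures described in this article; the names, the fields and the two-way grouping of signals are assumptions made for illustration, and the output is a list of pressures to discuss, not a placement decision.

```python
from dataclasses import dataclass
from enum import Enum


class WorkloadClass(Enum):
    RETRIEVAL_ASSISTANT = "retrieval-heavy assistant"
    MODEL_TRAINING = "large-scale model training"
    ACTING_AGENT = "agent acting in enterprise systems"
    BATCH_PIPELINE = "batch-processing pipeline"
    USER_FACING = "latency-sensitive user-facing workload"


@dataclass(frozen=True)
class WorkloadProfile:
    name: str
    workload_class: WorkloadClass
    steady_demand: bool           # usage is steady rather than experimental
    heavy_data_movement: bool     # large or frequent data transfer
    latency_sensitive: bool       # response time matters to the business process
    restricted_data: bool         # data class or jurisdiction limits apply
    needs_frontier_models: bool
    needs_elastic_capacity: bool


def placement_signals(w: WorkloadProfile) -> dict[str, list[str]]:
    """List the pressures in each direction; this informs a decision, it does not make one."""
    toward_enterprise = [
        label
        for label, present in [
            ("steady demand", w.steady_demand),
            ("expensive data movement", w.heavy_data_movement),
            ("latency-sensitive", w.latency_sensitive),
            ("sensitive or restricted data", w.restricted_data),
        ]
        if present
    ]
    toward_hyperscale = [
        label
        for label, present in [
            ("frontier model access", w.needs_frontier_models),
            ("elastic capacity", w.needs_elastic_capacity),
        ]
        if present
    ]
    return {
        "toward_enterprise_boundary": toward_enterprise,
        "toward_hyperscale": toward_hyperscale,
    }


# Two hypothetical workloads that should not share placement logic.
assistant = WorkloadProfile(
    name="internal knowledge assistant",
    workload_class=WorkloadClass.RETRIEVAL_ASSISTANT,
    steady_demand=True,
    heavy_data_movement=True,
    latency_sensitive=True,
    restricted_data=True,
    needs_frontier_models=False,
    needs_elastic_capacity=False,
)
training = WorkloadProfile(
    name="model training run",
    workload_class=WorkloadClass.MODEL_TRAINING,
    steady_demand=False,
    heavy_data_movement=False,
    latency_sensitive=False,
    restricted_data=False,
    needs_frontier_models=False,
    needs_elastic_capacity=True,
)

for workload in (assistant, training):
    print(workload.name, placement_signals(workload))
```

The value of a record like this is less in the code than in the habit: every workload is described in the same terms before anyone argues about platforms.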

The next leadership question

The wrong question for this phase is, “Which side of the cloud debate are we on?” The right one is not even, “Is private cloud back?” Both are still posture questions. The better question is narrower and harder: What should run where, and on what basis?

This article is published as part of the Foundry Expert Contributor Network.