Over the past two years, I’ve seen a noticeable shift in how technology leaders talk about AI infrastructure. Twelve months ago, the conversation was dominated by GPU availability and cost. But today, the questions being asked are far less binary. CIOs and CTOs alike are asking whether specialized AI cloud providers, now being referred to as neoclouds, are really mature enough to support long-term enterprise growth strategies. They want to understand where these new-wave providers fit alongside hyperscalers such as Amazon Web Services (AWS), Microsoft Azure and Google Cloud, and what role they can play within a balanced, resilient cloud environment.
Neoclouds have grown quickly by moving away from the “jack of all trades” approach of traditional hyperscalers, and have instead homed in on one specialist proposition: delivering GPU capacity for AI workloads at a lower price than general-purpose cloud providers. This niche has allowed them to scale extremely fast. According to Synergy Research Group, neocloud revenues exceeded $23 billion in 2025 — a 200% increase over the previous year — and are expected to reach $180 billion by the end of the decade. But rapid growth alone does not answer the question that matters most to business decision-makers: How sustainable and enterprise-ready are these platforms?
From my perspective, the right way to approach this is not to view neoclouds as replacements for hyperscalers, nor as experimental side projects. They are specialized infrastructure providers emerging in response to a genuine market need. The more relevant question for CIOs is how to integrate them intelligently into an architecture that remains portable, resilient and aligned with regulatory and performance requirements.
Training is about capacity, but inference is about experience
When I speak with enterprise teams considering neoclouds, I often start by asking a simple question: Are you primarily training models, or are you running inference at scale? And that answer shapes almost everything that follows.
AI training is typically centralized. Large datasets are moved to where compute is abundant and cost-efficient, often in locations where power, land and cooling are easier to secure. In that environment, raw capacity and price per GPU hour rightly dominate the discussion. Connectivity still matters, especially when fine-tuning models with fresh data, but it’s rarely a dealbreaker. The workload can tolerate some distance because the main objective is throughput.
Inference turns that idea on its head. Once a model is deployed and serving users, responsiveness becomes the make-or-break factor. Every interaction with an AI agent or application, whether machine-to-machine or between a user and a machine, depends on how quickly a request travels to the model and how fast the response returns. The physics of distance don’t disappear simply because we’re talking about digital systems. In many ways, it’s a “phygital” ecosystem where the balance has to be just right. If the infrastructure sits too far from users, the experience will be slow and cumbersome, and many AI use cases will simply break. Most of us remember when online video meant staring at a buffering icon while content loaded; that’s the position AI inference is in today, except an AI system responsible for managing an autonomous factory or interpreting signals from a self-driving car can’t afford to wait.
This is what I mean when I say the evaluation criteria need to evolve alongside the technology. When inference becomes the priority, as it often now is, it’s no longer sufficient to compare GPU specifications and headline pricing. You need to understand where a provider is located, how broadly it is distributed and how efficiently it connects to your users, partners and data sources. Inference rewards proximity, resilience and well-architected connectivity. As more AI use cases move into production, those factors begin to influence business outcomes directly, from customer satisfaction to employee productivity.
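To put some numbers on that proximity argument, here’s a minimal back-of-the-envelope sketch in Python. The ~200,000 km/s figure for light in optical fiber is a standard approximation; the distances are purely illustrative:

```python
# Back-of-the-envelope: best-case round-trip time (RTT) imposed by
# distance alone. Light in optical fiber travels at roughly 200,000 km/s
# (about two-thirds of c), so physics sets a hard floor on latency
# before any routing, queuing or model compute time is added.

SPEED_IN_FIBER_KM_PER_MS = 200.0  # ~200,000 km/s, expressed per millisecond

def min_rtt_ms(distance_km: float) -> float:
    """Lower bound on round-trip time for a given one-way fiber distance."""
    return 2 * distance_km / SPEED_IN_FIBER_KM_PER_MS

# Illustrative one-way distances from a user to an inference endpoint.
for label, km in [("same metro", 50), ("same region", 1_000), ("cross-continent", 6_000)]:
    print(f"{label:>16}: >= {min_rtt_ms(km):5.1f} ms RTT before any compute")
```

Real-world round trips sit well above this physical floor once routing, peering and last-mile access are added, which is why a provider’s interconnection posture matters as much as its GPU inventory.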
Balancing cost with risk
One of the reasons neoclouds have attracted so much attention is that their pricing is easy to compare. GPU hourly rates are published, benchmarks are shared and procurement teams can quickly calculate potential savings against hyperscaler offerings. Some of those savings can be very alluring — according to the Uptime Institute, an Nvidia DGX H100 instance cost an average of $98 per hour in 2025 when purchased from a hyperscaler, while an equivalent instance from a neocloud cost $34, a saving of roughly 65%. That bottom-line clarity is appealing when AI budgets are under pressure. But what is less obvious during early evaluations is the provider’s connectivity posture. For inference-heavy workloads, connectivity shapes latency, resilience and ultimately the user experience. When I review providers in this space, I look beyond hardware specifications and price. I want to understand where they are physically located, how widely they are distributed and how well interconnected they are with access networks, enterprise networks and other cloud environments.
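As a quick sanity check on those headline rates, here’s a minimal cost sketch assuming the Uptime Institute hourly figures quoted above; the fleet size and utilization are illustrative assumptions, not benchmarks:

```python
# Rough annual cost comparison using the hourly rates cited above.
# Fleet size and utilization are illustrative assumptions only.

HYPERSCALER_RATE = 98.0  # USD per instance-hour (Uptime Institute, 2025)
NEOCLOUD_RATE = 34.0     # USD per instance-hour for an equivalent instance

instances = 8                 # assumed fleet size
hours_per_year = 24 * 365     # assumed full utilization

hyperscaler_cost = HYPERSCALER_RATE * instances * hours_per_year
neocloud_cost = NEOCLOUD_RATE * instances * hours_per_year
saving = 1 - neocloud_cost / hyperscaler_cost

print(f"Hyperscaler: ${hyperscaler_cost:,.0f}/yr")
print(f"Neocloud:    ${neocloud_cost:,.0f}/yr")
print(f"Saving:      {saving:.0%}")  # ~65% at these rates
```

The exact figure matters less than the dynamic: the hourly delta compounds quickly at fleet scale, which is why these comparisons are so easy for procurement teams to make, and why the less visible connectivity costs deserve equal scrutiny.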
Enterprises can approach this pragmatically if they know what to look for. Public routing and peering data offer insight into how interconnected a platform really is and whether it relies on a narrow set of upstream providers. Geographic spread, diversity of interconnection points and proximity to users all influence performance and continuity. I also advise applying the same principles many organizations already use in their multi-cloud strategies. Few place all critical workloads with a single provider, and that same discipline should guide AI infrastructure decisions. Best practice envisages a diversity of providers, with failovers and redundancy central to infrastructure design. Here, neoclouds can become an additional component in this mix, providing technical specialization and potentially a strong geographical footprint within a particular region. Whatever providers are chosen, architecting for portability and interoperability from the outset reduces exposure and preserves both agility and resilience.
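To make “architecting for portability” concrete, here’s a hypothetical sketch of a provider-agnostic inference client with ordered failover. The endpoint URLs and response format are invented for illustration and do not correspond to any real provider’s API:

```python
# Hypothetical sketch: route inference requests across multiple providers
# with simple ordered failover. Endpoints are invented placeholders; a real
# deployment would add health checks, retries with backoff and monitoring.

import json
import urllib.request

# Ordered by preference: e.g., a regional neocloud first for latency and
# cost, a hyperscaler as the resilient fallback. URLs are illustrative only.
ENDPOINTS = [
    "https://inference.neocloud.example/v1/generate",
    "https://inference.hyperscaler.example/v1/generate",
]

def infer(prompt: str, timeout_s: float = 2.0) -> str:
    """Try each provider in order; fail over on error or timeout."""
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    for url in ENDPOINTS:
        try:
            req = urllib.request.Request(
                url, data=payload, headers={"Content-Type": "application/json"}
            )
            with urllib.request.urlopen(req, timeout=timeout_s) as resp:
                return json.loads(resp.read())["output"]
        except Exception:
            continue  # fall through to the next provider in the list
    raise RuntimeError("All inference providers failed")
```

Keeping the request and response contract identical across providers is what makes this kind of failover straightforward; the moment a workload leans on one provider’s proprietary features, that portability erodes.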
Sovereignty is becoming an architectural consideration
Alongside performance and resilience, I’m also seeing data sovereignty move from a policy discussion to a basic design requirement. Data protection regulation has already reshaped how enterprises think about storage and processing. The introduction of the General Data Protection Regulation (GDPR) in the European Union forced many organizations to revisit where personal data was stored, how it moved across borders and which third parties had access to it. For some, that meant restructuring cloud deployments, renegotiating contracts or localizing certain workloads. As AI becomes more embedded in decision-making, similar questions are now being asked about where models are trained, where they are hosted and which jurisdictions govern access to them.
What’s different is that sovereignty isn’t just a compliance exercise anymore. It now affects operational control. If an AI model underpins customer experiences or the automation of services such as fraud detection, businesses need confidence that it cannot be altered, restricted or accessed in ways that undermine the organization’s interests. That makes transparency around data flows and infrastructure location an absolute necessity. So, when evaluating neocloud providers, I encourage decision-makers to ask where data is processed, how traffic moves between regions and what safeguards exist around jurisdictional control. Neoclouds, as younger and often more regional or local players, can help meet these requirements. As regulatory frameworks mature and geopolitical tensions continue to shape technology policy, these early architectural decisions will carry ever greater consequences as AI becomes more deeply embedded in processes and systems.
What maturity looks like
Neoclouds now have a foothold in the market, but they’re going to need to evolve and differentiate themselves in new ways if they are to survive the fierce headwinds from hyperscalers. Their early appeal was built on cost efficiency and rapid access to GPU capacity. But as AI continues its ascent, leaders need to ask more probing questions about performance, resilience, visibility and control. From my perspective, the fact that these questions are being asked at all is a sign of maturity rather than uncertainty. The most effective strategy right now is to view neoclouds as a complementary addition to the hyperscale cloud landscape rather than an either-or decision. When used for the right workloads, they can sit alongside established hyperscalers, adding flexibility, choice and data sovereignty. They are specialized infrastructure providers responding to real demand, and organizations should approach them as they do any new technology or provider — thoughtfully and with due consideration to how they will function as part of a broader architecture that supports portability, transparency and resilience.
AI will continue to reshape connectivity strategies over the coming years, and as it does, infrastructure decisions that once felt purely technical will increasingly influence customer experience, regulatory posture and operational continuity. Decision-makers who evaluate neoclouds through that wider lens will be the ones who rise above the hype and create a sustained advantage for their businesses.
This article is published as part of the Foundry Expert Contributor Network.

