Artificial intelligence (AI) and high-performance computing (HPC) have emerged as key areas of opportunity for innovation and business transformation.
The challenge for IT leaders is to enable these high-density workloads with the right IT infrastructure, and increasingly the community is discussing advanced cooling technologies like liquid cooling.
While direct liquid cooling (DLC) is being deployed in data centers today more than ever before, would you be surprised to learn that we’ve been deploying it in our data center designs at Digital Realty since 2015? Did you also know that liquid cooling isn’t always the right choice for every high-density AI or HPC workload?
In this post, I’ll cover the basics of the data center cooling needs for high-density workloads like AI and HPC, and how Digital Realty’s legacy of innovation has prepared us to support the acceleration of demand for advanced cooling techniques of all kinds, including liquid cooling.
I’ll also share case studies from our innovation journey that demonstrate how enabling innovation is about having the right strategy and the right partners, rather than a one-size-fits-all approach.
Cooling needs of high-density workloads
The density of an AI or HPC deployment determines its unique cooling needs.
The power density requirements for AI and HPC can be 5-10 times higher than other data center use cases. Traditional workloads tend to be in the range of 5-8 kW per rack.
Some computing hardware may enable power densities exceeding 100 kW per rack, and peak density in the data center could reach 150 kW per rack over the next couple of years.
Traditional workload densities can be air cooled. Broadly speaking, however, most AI and HPC workloads require specialized cooling such as direct liquid cooling (DLC), air-assisted liquid cooling (AALC), or a rear-door heat exchanger.
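To make the density figures above concrete, here is a minimal Python sketch of the arithmetic. It uses only the numbers quoted in this section (5-8 kW per rack for traditional workloads, a 5-10x multiplier for AI and HPC, and a projected 150 kW per rack peak). The function and constant names are illustrative assumptions, not part of any Digital Realty tooling.

```python
# Figures taken from the text above; names are hypothetical, for illustration only.
TRADITIONAL_KW_PER_RACK = (5.0, 8.0)   # typical traditional workload range
AI_HPC_MULTIPLIER = (5.0, 10.0)        # "5-10 times higher"
PROJECTED_PEAK_KW_PER_RACK = 150.0     # projected peak density


def implied_density_range(base_range, multiplier_range):
    """Return the (low, high) kW/rack range implied by scaling base_range."""
    low = base_range[0] * multiplier_range[0]
    high = base_range[1] * multiplier_range[1]
    return (low, high)


def multiple_of_traditional(kw_per_rack, base_range=TRADITIONAL_KW_PER_RACK):
    """Express a rack density as a multiple of the traditional range's upper bound."""
    return kw_per_rack / base_range[1]


ai_low, ai_high = implied_density_range(TRADITIONAL_KW_PER_RACK, AI_HPC_MULTIPLIER)
print(f"Implied AI/HPC density: {ai_low:.0f}-{ai_high:.0f} kW per rack")
print(f"150 kW/rack is {multiple_of_traditional(PROJECTED_PEAK_KW_PER_RACK):.2f}x an 8 kW rack")
```

Read this way, the 5-10x multiplier implies roughly 25-80 kW per rack, so the 100-150 kW figures sit well beyond that typical range.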
Not all AI & HPC workloads require liquid cooling
Requirements for liquid cooling vary by hardware vendor, the specific hardware itself, and the workload type. Liquid cooling is not appropriate for all hardware or every scenario.
Even in the age of AI, not every rack will draw 100 kW, and some may not demand specialized advanced cooling at all.
For example, inferencing deployments tend to be less power-hungry than training deployments and may be adequately cooled with traditional air-cooling techniques. Likewise, traditional machine learning generally requires fewer resources, while deep learning and generative AI require massive environments due to their complexity.
It is important for IT leaders to understand that different AI and HPC workloads have different cooling needs and that not every data center partner will have the specialized knowledge or infrastructure capabilities to enable the technology.
The requirements for each deployment will vary, so it’s important to work with a partner who will design a customized solution and not depend on a one-size-fits-all approach. That’s why Digital Realty’s legacy of data center design expertise with advanced cooling makes a difference for our customers.
Strategies for innovation
Digital Realty’s global data center platform, PlatformDIGITAL®, was chosen to be the home of many groundbreaking AI and HPC workloads.
We’ve learned that to enable innovation, a few key strategies help us not only keep pace with technology but also stay a step ahead.
IT strategies to support AI and HPC workloads must enable:
- Agility
- Scale
- Sustainable growth
These case studies from our own innovation journey over the last decade highlight these strategies in action. They also demonstrate how our expertise and innovation strategy help us identify the right solution for the situation rather than relying on a one-size-fits-all approach.
Innovation case studies
Enable scale: A high-capacity trading engine with liquid cooling
2015 was a transformative year for us at Digital Realty; it was also my first year with the company. We embarked on an ambitious project to build the foundation for a global financial services company that specializes in algorithmic high-frequency trading.
A significant part of this venture was a strategic shift from traditional air cooling to advanced liquid cooling down to the chip level to support HPC clusters. This engineering feat not only enhanced the cooling system’s efficiency but also meant that we were able to scale our technology to continue to support our client as their deployment grew to nearly 6 MW.
Investing in next-generation liquid cooling technology was a decision that we knew would enable our customer beyond their immediate needs and establish a capability with a focus on long-term scalability and sustainability.
Enable sustainable growth: Supercomputing with adaptable design
Recently, we partnered with a European customer to develop a sophisticated supercomputer environment that included up to 70 kW per rack in a mixed environment. The customer needed to deploy quickly while also complying with new sustainability regulations.
Waiting 3-5 years to build a new data center was not an option, so our ability to retrofit existing facilities was key to getting the customer up and running faster. Starting with an energy-efficient facility that we built in 2013, we met their demanding requirements for high power density and connectivity with minimal changes to the facility. This enabled a 400% faster deployment.1
Our customer projected a 30% improvement in energy efficiency by switching to liquid cooling.1 They also benefitted from Digital Realty’s aquifer thermal energy storage (ATES) cooling system and fully renewable energy sources to achieve CO2 targets set by local sustainability regulations.
Our ability to develop retrofit designs shows our commitment to both cutting-edge and agile design that enables sustainable, and timely, growth. Our design principles ensure our infrastructure will meet not just the present needs but also requirements decades into the future.
Enable agility: A flexible, future-proofed generative AI deployment
Today, we’re playing a key role in the advancement of generative AI (GenAI). We’re working with a customer that’s integrating over 30,000 of the most advanced GPUs into one massive platform.
To enable advanced computing performance, the deployment requires that every GPU be connected in a single computing cluster. They needed a data center platform provider that could help them deploy quickly to start getting the value from their GPU investment, which was even more challenging given their specialized design requirements.
Our investment strategy is aimed at anticipating future demand, which enabled us to match them with a shell-ready facility whose designs were already in place. Our agile, modular design approach enabled us to solve their complex design challenges while maintaining 99% of the original design, which meant we could start building sooner.
Our agile approach will enable them to deploy in as little as 12 months instead of the 36 months a custom build would require.1 Our customers’ requirements are changing rapidly, as are the technology and the solutions to meet them. That’s why agility needs to be a core strategy for enabling innovation.
Even though this is the definition of an advanced AI workload, direct liquid cooling was not the best choice for cooling. This is a good example of why a one-size-fits-all approach to high-density workload cooling doesn’t work.
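As a rough illustration of the timeline comparison in this case study, the Python sketch below works out what a 12-month deployment means against a 36-month custom build. Both figures come from the text, and the 12-month figure is a projected best case. The function names are hypothetical.

```python
def speedup_factor(baseline_months, actual_months):
    """How many times faster the actual timeline is than the baseline."""
    return baseline_months / actual_months


def time_saved_fraction(baseline_months, actual_months):
    """Fraction of the baseline timeline that is avoided."""
    return (baseline_months - actual_months) / baseline_months


CUSTOM_BUILD_MONTHS = 36   # custom-build timeline quoted in the text
RETROFIT_MONTHS = 12       # projected best-case timeline quoted in the text

print(f"Speedup: {speedup_factor(CUSTOM_BUILD_MONTHS, RETROFIT_MONTHS):.1f}x")
print(f"Time saved: {time_saved_fraction(CUSTOM_BUILD_MONTHS, RETROFIT_MONTHS):.0%}")
```

Under these figures, the deployment is three times faster, saving 24 months, or about two thirds of the custom-build timeline.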
Beyond infrastructure: Fostering a culture of innovation
Another key element in executing these innovation strategies is your team of people. For all IT leaders, it’s important to remember that our achievements aren’t just about infrastructure: they’re about the culture of innovation we’ve cultivated.
At Digital Realty, our talented teams bring a legacy of innovation and engineering for which we’ve received multiple awards as trailblazers in the data center space.
Our culture of innovation at Digital Realty enables alignment with our customers, ensuring that our partners are comfortable that they can grow with Digital Realty far into the future.
A vision for the future
My role as Chief Technology Officer at Digital Realty is to understand our customers’ technological needs and ensure that Digital Realty can support those needs, not only for today but for tomorrow.
As we look to the future, we remain dedicated to not just participating in the technological landscape but actively shaping it. Our mission is to enable our customers’ innovation by enabling agility, scale, and sustainable growth.
Sustainability is particularly important to us. We continue to expand our coverage of carbon-free and renewable power sources to keep pace with customer demand (we have more than 1 gigawatt of solar and wind energy under contract), and we have begun to use alternative-fuel secondary power solutions to further reduce the lifecycle carbon footprint of our data centers.
We’ll focus on applying the best technology in time to meet our customers’ needs, rather than wholesale deploying the status quo and forcing the customers of tomorrow to accept yesterday’s limitations. This approach is what has enabled Digital Realty to deliver the examples highlighted throughout this post, as well as to meet all manner of other customer needs around the globe.
Our adaptability, innovative spirit, and rich heritage are what make us a unique and enduring company in the ever-evolving world of technology.
Building a legacy of innovation does not happen overnight, but at Digital Realty we’ve learned that we’re always moving in the right direction when we’re true to our values and focused on how we can best serve our customers’ needs.
Join us at Digital Realty as we continue to define the future of technology. Stay innovative, reach out to us, and let’s deploy AI and HPC in a way that transforms your organization.
Learn more about AI-ready data center infrastructure:
- Harness AI’s Potential and Navigate Disruption with Digital Realty
- Are Data Centers Obsolete in the Age of AI? Not on our Watch
- Integrating AI with Legacy Infrastructure
1 Projected outcome for this customer as compared to their existing infrastructure prior to being deployed and connected on PlatformDIGITAL® or compared with alternative solutions available at time of purchase.