How Data Centers Are Harnessing AI Workloads for Enhanced Cloud, LLM, and Inference Capabilities

Nov. 8, 2024
The changes within the data center sector are coming quickly, as performance capabilities and service delivery speeds continue to grow. At the center of that change is AI, along with the performance and infrastructure requirements for delivering it to customers.

If nothing else, October 2024 was a month in which AI-dominated industry announcements went a long way toward shaping the future of the data center. From infrastructure to storage designs, the common denominator is the focus on AI and how such services can be delivered to the customer.

Building for AI has become the de facto standard for the latest and greatest releases across the industry, and supporting this AI expansion will leave its imprint on data centers for the foreseeable future.

To wit: as artificial intelligence demands evolve rapidly, both for supporting AI development and for enabling AI delivery, cloud vendors and data infrastructure providers are ramping up capabilities to meet the needs of AI training workloads as well as the oncoming wave of inference workloads.

While not a comprehensive listing, here are some recent announcements from Oracle, Nvidia, Cerebras, DigitalOcean, and Lightbits Labs, each of which brings unique solutions to the table, creating flexible and scalable infrastructures for diverse AI applications.

Standardizing AI Infrastructure

To address the challenges of deploying AI clusters at scale, the Open Compute Project (OCP) has launched its Open Systems for AI initiative. This initiative fosters a collaborative, multi-vendor ecosystem aimed at developing standardized AI data center infrastructure.

Nvidia’s and Meta’s contributions to this project, such as Nvidia’s MGX-based GB200-NVL72 platform and Meta’s Catalina AI Rack architecture, are crucial for advancing common standards for AI computing clusters, reducing costs and operational silos for data centers.

Equipment vendors such as Vertiv are also announcing their own dedicated support for AI-intensive data center configurations, and Nvidia has announced its own reference architectures for enterprise-class AI deployments.

These collaborations aim to tackle key obstacles, including power density, cooling, and specialized computation hardware, with liquid-cooled racks and compute trays that support efficient, high-density operations.

By creating a multi-vendor, interoperable supply chain, OCP facilitates quicker adoption and a lower barrier to entry for organizations seeking to deploy AI infrastructure. Reference architectures, from the OCP and others, make these deployments more achievable in shorter time frames.

Scaling AI with Zettascale Superclusters

Oracle’s launch of its Oracle Cloud Infrastructure (OCI) Supercluster, in collaboration with Nvidia, represents a leap in scale and performance.

The new OCI Zettascale Cluster supports up to 131,072 Blackwell GPUs, reaching 2.4 zettaFLOPS of peak performance.
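
For a sense of scale, a quick back-of-the-envelope check helps put that figure in perspective. The sketch below assumes the headline number refers to low-precision, sparse peak throughput (the FP8/FP4 figures typically quoted for AI accelerators) rather than FP64:

```python
# Rough per-GPU peak implied by the headline numbers.
# Assumption: 2.4 zettaFLOPS refers to low-precision, sparse peak throughput.

total_peak_flops = 2.4e21   # 2.4 zettaFLOPS
gpu_count = 131_072         # maximum Blackwell GPUs in a Zettascale cluster

per_gpu_petaflops = total_peak_flops / gpu_count / 1e15
print(f"Implied peak per GPU: {per_gpu_petaflops:.1f} PFLOPS")
# Roughly 18.3 PFLOPS per GPU at low precision
```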

OCI’s Superclusters are tailored to offer high-performance computing capabilities, supporting workloads that demand extensive computational power, such as large language model (LLM) training and data-intensive simulations.

Key to OCI’s offering is flexibility in deployment, enabling customers to use AI infrastructure in their preferred locations while complying with data sovereignty requirements.

For instance, WideLabs in Brazil leverages OCI’s high-performance infrastructure to develop a Portuguese LLM, using OCI's Nvidia H100 GPUs and Kubernetes Engine for scalable, secure workloads within Brazil.

This capability is especially beneficial in regions with stringent data sovereignty requirements where data residency and security are prioritized.

By building out worldwide infrastructure for OCI services, Oracle strengthens its ability to support workloads that require stringent adherence to local laws and regulations.

Other notable uses of the service include Zoom, which relies on OCI’s generative AI inference capabilities to enhance its Zoom AI Companion, providing users with real-time drafting, summarizing, and brainstorming assistance.

Breaking Speed Barriers in AI Inference

With a focus specifically on AI inference, Cerebras Systems has set a new standard by delivering 2,100 tokens per second on the Llama 3.1 70B model, performance 16 times faster than current GPU-based solutions.

Leveraging its proprietary Wafer Scale Engine 3 (WSE-3), Cerebras Inference provides massive memory bandwidth, allowing it to handle large models without the latency challenges seen in other systems.

This capability is instrumental in real-time applications, where speed and responsiveness are critical.
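
To put the throughput number in context, a simple comparison shows what it means for response latency. This is an illustration only; the GPU baseline below is a placeholder derived from the quoted 16x figure, not a measured benchmark:

```python
# Time to stream a 500-token completion at different decode rates.
# The GPU-baseline rate is a placeholder implied by the quoted 16x speedup.

completion_tokens = 500
cerebras_tokens_per_sec = 2100
gpu_baseline_tokens_per_sec = cerebras_tokens_per_sec / 16  # ~131 tokens/s

for label, rate in [("Cerebras WSE-3", cerebras_tokens_per_sec),
                    ("GPU baseline (implied)", gpu_baseline_tokens_per_sec)]:
    print(f"{label}: {completion_tokens / rate:.2f} s for {completion_tokens} tokens")
# About 0.24 s on Cerebras vs. roughly 3.8 s on the implied GPU baseline:
# the difference between a conversational pause and a noticeable wait.
```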

The speed advantage that Cerebras offers has drawn clients such as GlaxoSmithKline (GSK), which is exploring AI-driven research agents to enhance drug discovery.

In fields such as voice AI, companies like LiveKit have benefited from the accelerated pipeline, enabling seamless speech-to-text and text-to-speech processes that operate faster than typical GPU-based inference alone.

Simplifying AI Deployments

Addressing the complexity of getting AI/ML workloads set up and configured for specific use cases, DigitalOcean, in collaboration with the collaborative AI community Hugging Face, has introduced 1-Click Models, a tool that simplifies the deployment of AI models such as Llama 3 and Mistral on DigitalOcean GPU Droplets.

This new feature aims to streamline the otherwise complex process of setting up AI and ML models in the cloud, allowing developers to quickly deploy inference endpoints with minimal setup.

By eliminating the need for intricate configuration and security setups, DigitalOcean’s 1-Click Models democratize access to powerful AI models, with the goal of making them accessible to a broader audience.

Integrated with Hugging Face’s Generative AI Services (HUGS), DigitalOcean 1-Click Models provide continuous updates and optimizations, ensuring users have access to the latest improvements in AI model performance.
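
Once a 1-Click Model is running on a GPU Droplet, querying it should require little more than pointing a standard client at the endpoint. The sketch below assumes the HUGS deployment exposes an OpenAI-compatible chat API; the droplet address, bearer token, and model name are placeholders:

```python
# Minimal sketch: querying a deployed 1-Click Model inference endpoint.
# Assumes an OpenAI-compatible chat API; the address, token, and model name
# below are placeholders, not actual values.

from openai import OpenAI

client = OpenAI(
    base_url="http://<your-gpu-droplet-address>/v1",  # placeholder endpoint
    api_key="<your-bearer-token>",                     # placeholder token
)

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",     # example model name
    messages=[{"role": "user",
               "content": "Summarize this quarter's capacity report in three bullets."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```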

Climate-Aligned AI Cloud Solutions

Proving that the demands of AI infrastructures go well beyond AI/ML hardware performance, Lightbits Labs, a pioneer in software-defined NVMe over TCP storage, has partnered with Crusoe Energy Systems, self-described as “The World’s Favorite AI-First Cloud,” to expand high-performance, climate-conscious AI infrastructure.

Crusoe’s data centers are powered by a combination of stranded and clean energy sources, reducing the environmental impact of AI workloads.

Lightbits’ software-defined storage delivers high performance with low latency, ideal for AI workloads that demand consistent and high-speed access to data.

Crusoe’s expanded use of Lightbits storage meets the needs of AI developers by providing a flexible, scalable infrastructure that ensures high availability and durability.

The partnership allows Crusoe to offer its AI cloud users an optimized environment that includes storage that scales to meet demand, especially for applications such as LLM training and generative AI.

Each of these solutions contributes to a more robust, accessible AI ecosystem, addressing the challenges of scale, efficiency, and usability.

Such innovations are paving the way for future advancements by building an infrastructure that will encourage widespread adoption of AI technologies across a variety of business sectors.

 


About the Author

David Chernicoff

David Chernicoff is an experienced technologist and editorial content creator with the ability to see the connections between technology and business, to get the most from both, and to explain the needs of business to IT and of IT to business.
