Rethinking Resilience: Preparing Data Centers for the Next AI Wave

May 24, 2024
Tim Hysell, Co-founder and CEO of ZincFive, explains why the need for a resilient power infrastructure has never been more pressing for data center operators.

Recent high-profile outages have underscored the critical importance of data center resilience. Twitter's (now X) outage during Rihanna's Super Bowl Halftime Show and Microsoft's eight-hour outage affecting Teams, Outlook, and M365 caused widespread disruptions for millions of users. Even more alarming was the outage experienced by Australian telecommunications provider Optus, which led to transport delays, banking issues, and cut hospital phone lines for 12 hours, affecting over 10 million users (nearly 40% of the population) and 400,000 businesses.

The consequences of data center outages are further emphasized by Uptime Intelligence's 2024 data center outage report. The report reveals that 55% of operators experienced an outage in the past three years, with more than half of respondents reporting that their most recent significant outage cost over $100,000, and 16% stating that these outages cost more than $1 million.

As AI pushes data center energy consumption to unprecedented levels, the need for a resilient power infrastructure has never been more pressing. The International Energy Agency (IEA) projects that data center electricity usage will double by 2026, while training newer AI models consumes 50 times more electricity than previous generations. As various sectors further integrate AI into their operations, the need for the facilities that power these services to maintain resilience grows both in importance and difficulty.

Faced with these challenges, data center operators must take decisive action to enhance their resilience and adjust to the demanding requirements of AI. By addressing the most common causes of outages, namely power issues and human error, and by closing gaps such as irregular or incomplete testing of uninterruptible power supply (UPS) systems, data center operators can ensure the reliability and stability of their facilities in an increasingly complex and demanding technological landscape.

Resolving these issues requires examining their causes first. Power issues consistently emerge as the most common cause of serious outages, according to Uptime Intelligence's March 2024 report. A staggering 42% of respondents pointed to UPS failure as the leading cause of power-related outages, while 30% of incidents involved issues with the transfer switch to a generator, and 20% were attributed to generator failure itself.

Human error also plays a significant role in nearly 40% of data center outages, highlighting the critical importance of proper training and adherence to established procedures. Among those who reported an outage caused by human error, 48% cited failure of staff to follow procedures, while 45% pointed to incorrect procedures.

These findings underscore the urgent need for data center operators to prioritize both the modernization and upkeep of their power infrastructure and their staff's ongoing training and education. Implementing comprehensive staff training and process reviews presents a significant opportunity to reduce human error-related outages.

To limit the risk of power-related outages, data center operators should regularly maintain their backup power systems and test them rigorously under real-world conditions. They can also adopt more advanced and reliable UPS battery technologies, such as nickel-zinc. Unlike lead-acid and lithium-ion backup batteries, nickel-zinc batteries continue to discharge and carry the load even when a cell in the battery string becomes weak or depleted. This allows the battery string to continue operating and turns what would otherwise be an emergency into a simple note for replacement at the next planned maintenance cycle, with no added maintenance costs or operational impact.

Nickel-zinc batteries offer several additional benefits that increase data centers’ reliability and efficiency. Unlike lithium-ion batteries, they are incapable of thermal runaway at the cell level and can operate reliably at higher temperatures, which can also lead to lower cooling costs. They also boast a greater power density than their counterparts, delivering the same amount of power in a significantly smaller footprint. This allows operators to save valuable space for income-generating equipment like servers and racks, while still providing ample backup power for intensive AI applications. Nickel-zinc batteries are also more sustainable than lead-acid and lithium-ion batteries and can serve as a convenient drop-in replacement for lead-acid batteries, easing the transition to this more advanced technology.

As data center energy consumption continues to soar due to the increasing demands of AI and other power-intensive applications, ensuring a resilient power infrastructure has become more critical than ever. By embracing comprehensive staff training, rigorous testing and maintenance practices, and safe, reliable and sustainable battery technologies like nickel-zinc, data center operators can significantly bolster the resilience of their facilities and ensure they are well-equipped to handle the ever-increasing demands of the digital landscape. Taking proactive steps to address the root causes of outages and implementing innovative solutions helps facility operators ensure that customers can rely on them – no matter what the future holds.

About the Author

Tim Hysell

Tim Hysell is the co-founder and CEO of ZincFive and has over three decades of entrepreneurial success in founding, owning, and directing profitable business operations in renewable energy, banking, manufacturing, and medical devices. ZincFive is a leader in the innovation and delivery of nickel-zinc batteries and power solutions. Contact ZincFive to learn how nickel-zinc chemistry can provide high power density and performance for mission-critical applications.
