Data Center Maintenance: What Should Be Included?

Oct. 10, 2016
In this week’s Voices of the Industry, Robert McClary, Chief Operating Officer at FORTRUST, discusses data center maintenance and lifecycle strategies and what they should include.

Robert McClary
Chief Operating Officer, FORTRUST

According to the most recent market studies, businesses are continuing to leverage colocation services as critical parts of the corporate infrastructure. Sandler Research predicted that the global colocation market would expand at a compound annual growth rate of more than 12 percent through 2020, driven by shrinking enterprise IT budgets along with rising needs for critical application support and accessibility.

As companies increasingly rely on colocation providers, it becomes even more important for these data centers to be properly maintained. Even a single minute of downtime can cost an organization thousands of dollars and its reputation in the industry, making it absolutely imperative that service providers do everything in their power to ensure round-the-clock uptime.

This is where robust data center maintenance and lifecycle strategies come into play, and become such a pivotal part of facility processes. But what, exactly, should customers ask colocation providers in regard to maintenance, and what advantages can these maintenance strategies bring to the table?

What makes maintenance and lifecycle strategies so important?

In the current IT landscape, downtime isn’t just costly in terms of dollars and cents – it can also do untold damage to a brand’s reputation. This is particularly true if the colocation facility is supporting client-facing resources that are imperative to customer service.

Robert McClary, FORTRUST Chief Operating Officer, pointed out that poor maintenance and lifecycle strategies are the second most likely cause of unplanned data center downtime, behind only human error and poor capacity management. Even the most optimally designed data centers cannot make up for a lack of proper system maintenance and upkeep.

“The designed reliability of a data center does not make up for poor maintenance and lifecycle strategies,” McClary wrote in FORTRUST’s eBook, A Data Center Operations Guide for Maximum Reliability. “Maintenance and lifecycle strategies are core to a data center’s ability to continuously provide high-availability service delivery and uptime over a long period of time.”

Different types of maintenance

When it comes to maintenance strategies, there are a few different types that colocation customers should look for. Gaining details about these processes is paramount, as it will show the provider’s dedication to uptime within the facility.

McClary noted that a comprehensive strategy here should include:

  • Regular and thorough inspections: Data center staff should continually inspect systems and equipment to ensure they are in proper working order. This includes daily inspections of generators, water temperature, fuel levels, plenum pressures, the operating parameters of electrical and mechanical distribution systems, and other system parameters and configurations.
  • Continuous testing: Facility employees should also test specific systems to ensure that they are operating within the correct parameters. Processes here can encompass infrared testing, load testing and fail-over testing.
  • Predictive maintenance: This is a critical part of the data center’s strategy. Predictive maintenance leverages measurements and other data analysis to recognize any changes, trends or irregularities that could point to a potential failure. In this way, staff members can address these issues before they lead to an outage.
  • Preventive maintenance: McClary explained that preventive maintenance is meant to “keep a piece of equipment or component operating at its optimum level or an action that prolongs its lifecycle.” This type of maintenance can include filter or oil changes, as well as cleaning heat exchangers and electrical systems.
  • Corrective maintenance: Finally, staff members should leverage corrective maintenance processes when it comes time for a system or component to be repaired or replaced. Fixing a leak or replacing a bearing or valve would fall under corrective maintenance.
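
The predictive maintenance item above can be illustrated with a minimal Python sketch. It is not drawn from FORTRUST's tooling: the function name, the bearing-temperature readings and the three-standard-deviation threshold are all hypothetical, chosen only to show how a trend in routine measurements might be flagged before it becomes a failure.

```python
from statistics import mean, stdev


def flag_drift(readings, baseline_size=5, threshold=3.0):
    """Return the indices of readings that drift away from a baseline.

    The first `baseline_size` readings are treated as normal operation.
    A later reading is flagged when it sits more than `threshold`
    standard deviations from the baseline mean.
    """
    baseline = readings[:baseline_size]
    center = mean(baseline)
    spread = stdev(baseline)
    flagged = []
    for index, value in enumerate(readings[baseline_size:], start=baseline_size):
        # A perfectly flat baseline has zero spread, so this simple rule
        # cannot judge deviations against it; nothing is flagged then.
        if spread > 0 and abs(value - center) > threshold * spread:
            flagged.append(index)
    return flagged


# Hypothetical daily bearing temperatures in degrees Celsius.
bearing_temps = [61.0, 60.5, 61.2, 60.8, 61.1, 61.0, 61.3, 63.9, 65.2]
print(flag_drift(bearing_temps))  # [7, 8]
```

In this example the last two readings are flagged, which is the kind of early signal that would prompt an inspection rather than a repair after the fact. A real program would likely use longer baselines and equipment-specific limits.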

With strategies including predictive and preventive maintenance in place, the potential for system failure is considerably reduced. These processes enable facility workers to pinpoint and address issues before they cause an unplanned or even a planned outage.

“Do not be a break-fix organization that waits for failure before it takes action,” McClary recommended. “I am of the belief that it is not impossible or difficult to predict issues in equipment before failure. In fact, I believe that if you have a strong maintenance and lifecycle strategy, unpredicted failure becomes at the very least, a random event.”

Parts of the lifecycle strategy

It’s also critical to ensure that facility managers have a lifecycle strategy in place. McClary explained that this includes both a preventive and predictive maintenance program in conjunction with other best practices to boost the equipment lifecycle. Activities to look for here include strategies for:

  • Replacing before failure: Many systems and components are meant to be replaced at set intervals, once their useful life has expired. Failing to replace this equipment on schedule increases the chances of failure and unplanned downtime.
  • Rotation: Similarly, some components need to be rotated according to a specific schedule to ensure performance and balance.
  • Replacement: Finally, customers should ensure that facility staff members have a strategy that dictates the proper times to replace equipment. This procedure ensures that critical systems aren’t interrupted in the process.
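
As a rough illustration of the replace-before-failure idea in the list above, the following Python sketch checks an inventory against planned replacement dates. Everything in it is an assumption made for the example: the `Component` class, the component names, the install dates, the service-life figures and the 90-day lead window are hypothetical and do not represent manufacturer recommendations or FORTRUST's procedures.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Component:
    name: str
    installed: date
    service_life_days: int  # hypothetical replacement interval

    def replace_by(self) -> date:
        """Date by which the component should be replaced."""
        return self.installed + timedelta(days=self.service_life_days)


def due_for_replacement(components, today, lead_days=90):
    """List components whose replacement date falls within the lead window."""
    cutoff = today + timedelta(days=lead_days)
    return [c.name for c in components if c.replace_by() <= cutoff]


# Hypothetical inventory; dates and intervals are illustrative only.
inventory = [
    Component("UPS battery string A", date(2012, 12, 1), 4 * 365),
    Component("CRAH fan bearing 3", date(2016, 3, 1), 2 * 365),
]
print(due_for_replacement(inventory, today=date(2016, 10, 10)))
# ['UPS battery string A']
```

The lead window is the design choice that matters here: flagging equipment some weeks ahead of its replacement date leaves room to schedule the work so that critical systems are not interrupted.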

Additional best practices

Colocation customers should also ensure that their service providers are following other maintenance and lifecycle best practices.

This includes being aware of and incorporating equipment manufacturer recommendations into their overall processes. In many cases, facility staff shouldn’t just be following these recommendations, but exceeding them to ensure that equipment performs at optimal levels and that its lifecycle can be prolonged.

Customers should also look to ensure that their provider prioritizes preventive and predictive maintenance over corrective maintenance.

“Understand the cost of corrective maintenance is much greater over the long term,” McClary wrote. “Ask any classic car buff, and they will tell you the same thing! Regular preventive maintenance will save you money over the long haul.”

In addition, it’s best to ensure that critical processes including maintenance and lifecycle procedures are handled in-house, and that these activities are not outsourced to a third party. Facility managers should be incredibly selective about which processes are carried out by external vendors. As a rule of thumb, less than 20 percent of these overall procedures should be outsourced.

“A skilled operations team with ownership of the maintenance and lifecycle strategies is core to a data center’s critical systems infrastructure’s ability to continuously provide high-availability service delivery and uptime over a long amount of time,” McClary noted. “Maintenance and lifecycle strategy must be a routine. Attention to detail and ownership is contagious if the tone is set and emphasized at every level in the organization!”

FORTRUST has delivered 100 percent continuous critical systems uptime for more than 15 years. To find out more about FORTRUST’s specific maintenance and lifecycle strategies, contact them for a tour of their Denver Data Center today.

Robert D. McClary is Chief Operating Officer at FORTRUST, where he is responsible for the overall supervision of business operations, high-profile construction and strategic technical direction. Robert developed and implemented the process controls and procedures that support the continuous uptime and reliability that FORTRUST Denver has delivered since 2001. He is considered one of the leading experts on management and operations in the data center industry and was selected as a finalist by AFCOM for Data Center Manager of the Year. To give back to the data center community and promote youth technology education, Robert serves on the Board of Directors as President of the Rocky Mountain Chapter of the 7×24 Exchange and chairs the Board of Directors for KidsTek.

About the Author

Voices of the Industry

Our Voice of the Industry feature showcases guest articles on thought leadership from sponsors of Data Center Frontier. For more information, see our Voices of the Industry description and guidelines.
