In this week’s Voices of the Industry, Susanna Kass, EVP, Innovation and Sustainability Strategy, at BASELAYER, discusses saving energy and money by busting the ghost servers.
Data centers around the world are experiencing an increasingly high level of demand for compute capacity. As time passes, more and more server capacity is deployed, but data centers and businesses are not getting the full benefit of the servers they have. They’re wasting energy running inefficiently at low utilization, or keeping servers online that aren’t in use anymore. These comatose or “ghost” servers consume energy while continuing to demand the cooling, staffing, and facility infrastructure required to support them. In addition, these ghost servers generate further costs such as server licenses and hardware and software maintenance.
Data Center Managers now have the ability to identify and power-cap comatose servers, leading to thousands of kilowatt hours in reduced energy consumption and a smaller carbon footprint.
Alternatively, servers can be made available for re-provisioning, allowing IT and facility managers to defer CAPEX and OPEX on new servers and supporting power distribution and cooling infrastructure, leading to millions of dollars in savings.
The Problem & Opportunity:
According to McKinsey and Company, up to 30% of servers in enterprise data centers are comatose, consuming power but not delivering information services. The numbers are staggering, with an estimated 3.6 million comatose servers in the U.S. and 10 million worldwide, and the count is growing. At a conservative cost of $3,000 per server, this represents at least $30 billion in idle capital.
In a recent Uptime Institute Server Roundup, participants identified approximately 20,000 comatose servers. Powering down these servers could reduce IT load by 5 megawatts (MW) and the associated cooling and infrastructure load by another 4 MW. Applying these savings to the 10 million comatose servers worldwide could cut total load by more than 4 gigawatts (GW), with corresponding reductions in cost and carbon emissions. The comatose infrastructure could also be re-provisioned to support additional IT loads rather than purchasing new servers with their associated infrastructure and energy costs.
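The scaling above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it derives per-server averages from the Server Roundup totals and assumes every comatose server worldwide matches that average, which is a simplification. The variable names are ours, not from any tool.

```python
# Figures quoted in the text above.
ROUNDUP_SERVERS = 20_000              # comatose servers found in the Server Roundup
ROUNDUP_IT_LOAD_MW = 5.0              # IT load removed by powering them down
ROUNDUP_OVERHEAD_MW = 4.0             # associated cooling and infrastructure load
GLOBAL_COMATOSE_SERVERS = 10_000_000  # worldwide estimate
COST_PER_SERVER_USD = 3_000           # conservative cost per server

# Per-server averages implied by the Roundup totals (1 MW = 1,000,000 W).
it_watts_per_server = ROUNDUP_IT_LOAD_MW * 1e6 / ROUNDUP_SERVERS
overhead_watts_per_server = ROUNDUP_OVERHEAD_MW * 1e6 / ROUNDUP_SERVERS

# Assumption: the global comatose fleet looks like the Roundup average.
watts_per_server = it_watts_per_server + overhead_watts_per_server
global_load_gw = watts_per_server * GLOBAL_COMATOSE_SERVERS / 1e9
idle_capital_usd = GLOBAL_COMATOSE_SERVERS * COST_PER_SERVER_USD

print(f"IT power per server:       {it_watts_per_server:.0f} W")
print(f"Overhead power per server: {overhead_watts_per_server:.0f} W")
print(f"Global load reduction:     {global_load_gw:.1f} GW")
print(f"Idle capital:              ${idle_capital_usd / 1e9:.0f} billion")
```

Under these assumptions the global figure comes out at 4.5 GW, in line with the "more than 4 gigawatts" estimate, and the idle capital matches the $30 billion quoted earlier.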
Power utilities bill their enterprise customers based on three main factors related to levels of energy use as shown in Figure 1. These appear as itemized charges on the monthly utility bill.
Energy Costs (kWh): These are the direct charges for consuming kilowatt hours of power, calculated from power consumption and the utility rate (c/kWh). Rates vary significantly, from less than 5c to more than 14c per kWh, depending on geography and power availability.
Utility Demand Charges represent the overhead costs associated with sizing the transmission and distribution infrastructure to support maximum demand levels. Demand charges vary by utility but $10/kW/month is typical.
System Capacity Charges are the costs associated with power generation. These charges vary widely by geographic location and region. For this paper we assume a rate of $50/kW/year.
Illustrative Example of Comatose Server Energy Cost
According to Uptime Institute’s recent Server Roundup, an average data center server consumes 250 watts of power at the device in an idle state. The table below illustrates the total annual cost for such a server. The calculations assume Utility Demand Charges of $10/kW/month and System Capacity Charges of $50/kW/year.
Table: Annual Totals Per Server (Per-Server Power Consumption)
|Server Power (kWh)|Overhead Power PUE (kWh)|Total Power (kWh)|Energy Charges @ 10.25c per kWh|Utility|Total Energy Cost|
In an enterprise data center farm of 10,000 servers with 70% active and 30% comatose, this represents $1,443,000 in wasted annual energy costs alone. These costs can be significantly reduced or eliminated by using the BASELAYER RunSmart Operating System.
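The per-server cost behind the $1,443,000 figure can be sketched as follows. The 250 W idle draw, the 10.25c/kWh energy rate, and the demand and capacity charges come from the text. The PUE of 1.8 is our assumption, inferred from the Server Roundup ratio of 4 MW of overhead per 5 MW of IT load, and the function and variable names are hypothetical.

```python
IDLE_SERVER_KW = 0.250                  # idle draw at the device (from the text)
ASSUMED_PUE = 1.8                       # ASSUMPTION: inferred from 4 MW overhead per 5 MW IT load
ENERGY_RATE_USD_PER_KWH = 0.1025        # 10.25c per kWh
DEMAND_CHARGE_USD_PER_KW_MONTH = 10.0   # Utility Demand Charges
CAPACITY_CHARGE_USD_PER_KW_YEAR = 50.0  # System Capacity Charges
HOURS_PER_YEAR = 8760

def annual_cost_per_server(server_kw=IDLE_SERVER_KW, pue=ASSUMED_PUE):
    """Return the annual utility cost components for one always-on idle server."""
    total_kw = server_kw * pue                    # server plus facility overhead
    total_kwh = total_kw * HOURS_PER_YEAR
    energy_usd = total_kwh * ENERGY_RATE_USD_PER_KWH
    demand_usd = total_kw * DEMAND_CHARGE_USD_PER_KW_MONTH * 12
    capacity_usd = total_kw * CAPACITY_CHARGE_USD_PER_KW_YEAR
    return {
        "total_kwh": total_kwh,
        "energy_usd": energy_usd,
        "demand_usd": demand_usd,
        "capacity_usd": capacity_usd,
        "total_usd": energy_usd + demand_usd + capacity_usd,
    }

costs = annual_cost_per_server()
comatose_servers = 10_000 * 30 // 100   # 30% of a 10,000-server farm

for name, value in costs.items():
    print(f"{name:>12}: {value:,.2f}")
print(f"Fleet waste: ${round(costs['total_usd']) * comatose_servers:,} per year")
```

With the per-server total rounded to whole dollars, this reproduces the article's fleet-wide figure. A different PUE or utility tariff would change the result proportionally.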
CAPEX and OPEX Savings for IT and Facilities
As demand for computing resources expands within an enterprise, the IT and Facilities teams must build out new infrastructure in a data center or contract services from a cloud provider. Either way it represents significant additional cost. A powerful alternative is to identify and re-provision comatose servers to support new and expanding compute demands. In so doing, IT and Facilities teams can defer large CAPEX investments and operational expenses.
Figure 2 shows an example of costs associated with servers in an enterprise data center. The data was modeled with an AWS Total Cost of Ownership Modeling tool.
Two scenarios were modeled based on 2016 server and data center cost data. Both are typical configurations for a large data center. Key conclusions for a representative enterprise data center with 10,000 servers are as follows:
CAPEX Benefits: Capital cost for a 10,000 server farm including direct and indirect contributions ranges from $55M to $122M. Assuming that 30% of the servers are identified and recovered for re-provisioning, this would allow IT and Facilities teams to defer between $16.6M and $36.6M of capital expense that would otherwise be spent on new servers and supporting infrastructure.
OPEX Benefits: Operational expense for a farm of 10,000 servers (including energy, power distribution, space, cooling, and IT equipment maintenance) ranges between $19.5M and $40.1M per year. Identifying and placing back into active service the 30% of servers in a comatose state would defer between $5.8M and $12.0M in additional annual operational expenses.
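The deferral ranges above follow from applying the 30% comatose share to the modeled cost ranges. The sketch below assumes each recovered server displaces one new server at the same modeled cost, which is a simplification. The helper name is hypothetical.

```python
COMATOSE_SHARE = 0.30  # share of servers assumed recoverable (from the text)

# Modeled cost ranges for a 10,000-server farm, in millions of dollars.
CAPEX_RANGE_MUSD = (55.0, 122.0)
OPEX_RANGE_MUSD = (19.5, 40.1)   # per year

def deferred_range(cost_range_musd, share=COMATOSE_SHARE):
    """Scale a (low, high) cost range by the share of servers recovered."""
    low, high = cost_range_musd
    return (low * share, high * share)

capex_low, capex_high = deferred_range(CAPEX_RANGE_MUSD)
opex_low, opex_high = deferred_range(OPEX_RANGE_MUSD)

print(f"Deferred CAPEX: ${capex_low:.1f}M to ${capex_high:.1f}M")
print(f"Deferred OPEX:  ${opex_low:.2f}M to ${opex_high:.2f}M per year")
```

The low CAPEX bound computes to $16.5M rather than the $16.6M quoted, possibly because the quoted range endpoints are themselves rounded.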
Clearly, the energy, capital, and other operational cost savings from identifying comatose servers are a powerful motivation to find these high-cost assets and either power them down or re-provision them. BASELAYER’s RunSmart OS delivers the capability to do precisely that.
More Efficient Data Centers
Today there are tools that help data center operators monitor performance and power metrics across global data center deployments and allow corporate IT customers to manage their own demand-side energy and performance metrics by location, device, and sensor. Look for tools that offer full transparency into available power capacity, power consumption, and performance metrics (including renewable energy % mix and energy efficiency), provide carbon neutrality reporting against emission targets, and support simplified, informed decision-making. Ideally these tools will integrate with other industry-standard tools such as the Intel DCM platform, which enables identification and mitigation of resource waste by dynamically capping the power consumption of unused and underutilized servers.
To determine server activity status, tools like RunSmart monitor every server, analyzing both total power consumption and power fluctuation levels. Based on these criteria, a “Ghost Server Score” is calculated and the server is color coded (green through red), as shown in Figure 3. Servers that have a high probability of being in a comatose state are flagged for IT staff, who can then make informed decisions on next steps. Options include having RunSmart cap the power level on the server to minimize energy waste, or flagging the server for re-provisioning.
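The article does not publish how the Ghost Server Score is computed. The sketch below is therefore a hypothetical scoring scheme, not RunSmart's method. It combines the two signals named above: how close average power sits to idle draw, and how little it fluctuates. The thresholds, function names, and three-color mapping are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev

def _clamp01(x):
    """Limit a value to the range [0, 1]."""
    return max(0.0, min(1.0, x))

def ghost_server_score(power_samples_w, idle_power_w=250.0,
                       flat_cv=0.02, busy_cv=0.20):
    """Hypothetical score in [0, 1]; higher means more likely comatose.

    idle_power_w defaults to the 250 W idle draw cited in the article;
    flat_cv and busy_cv are assumed fluctuation thresholds.
    """
    if not power_samples_w:
        raise ValueError("need at least one power sample")
    avg = mean(power_samples_w)
    if avg <= 0:
        return 0.0  # no draw at all: already off, nothing to cap
    # Coefficient of variation: fluctuation relative to average draw.
    cv = pstdev(power_samples_w) / avg
    # 1.0 when the power trace is flat, 0.0 when it swings like a busy server.
    flatness = _clamp01((busy_cv - cv) / (busy_cv - flat_cv))
    # 1.0 at or below idle draw, falling to 0.0 at twice idle draw.
    near_idle = _clamp01((2 * idle_power_w - avg) / idle_power_w)
    return flatness * near_idle

def score_color(score):
    """Map a score to an assumed green/yellow/red band."""
    if score >= 0.66:
        return "red"
    if score >= 0.33:
        return "yellow"
    return "green"

flat_trace = [250, 251, 249, 250]        # steady draw at idle level
busy_trace = [300, 450, 380, 520, 290]   # power follows a varying workload

for label, trace in (("flat", flat_trace), ("busy", busy_trace)):
    s = ghost_server_score(trace)
    print(f"{label}: score={s:.2f} color={score_color(s)}")
```

A production tool would likely use longer observation windows and per-model idle baselines. A score like this should only flag servers for review by IT staff, as the article describes.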
Energy waste, stranded capital costs, and flow-on operational expenses resulting from inactive “ghost” servers are a serious problem in enterprise IT environments today. There are tools that allow IT staff to easily identify idle servers and make intelligent decisions on next steps, leading to reduced energy costs and carbon emissions, and millions of dollars in IT and Facilities savings from lowered operational costs and deferred capital expenses.
Submitted by Susanna Kass, EVP, Innovation and Sustainability Strategy, at BASELAYER. Susanna can be contacted via LinkedIn. BASELAYER® RunSmart OS integrates with Intel® Data Center Manager to enhance its intelligent control platform and offer customers greater transparency into their IT and critical infrastructure.