HPC is in Transition, and Tilting Towards the Cloud

May 23, 2023
The next generations of HPC technologies will move on from the existing models, and the investment of the hyperscalers will be critical to their success

The future of supercomputing and high performance computing (HPC) will need to be driven by the hyperscale operators, as the scientific community, the traditional driver of HPC, lacks the financial clout to be the prime mover, according to Professor Daniel Reed, the keynote speaker at this week's ISC High Performance conference in Hamburg, Germany.

“The leading edge HPC market is too small, the procurements are too infrequent, the funding is too small, and the financial risk to vendors is too high, while the size and scale of the hyperscaler and deep learning markets are too large to ignore,” says Reed, the Presidential Professor in Computational Science at the University of Utah and chair of the U.S. National Science Board.

HPC on demand is becoming an important part of the business world, and it is rapidly being joined by AI/ML on demand. Google, Amazon Web Services, Microsoft, Oracle and others are adding significant AI capabilities that can be made available to any customer with the wherewithal to make the investment in those services.

Second tier cloud providers and specialty providers are also getting into the AI/ML space by providing not just on-demand capabilities but also colocation services that can support the power and cooling demands of building and training large language models (LLMs) and deploying the inference engines businesses are demanding.

Reed, in his paper Reinventing High Performance Computing: Challenges and Opportunities, makes it clear that new technologies, a new perspective, and a better understanding of how future development will be funded are necessary. To achieve this desired future, Reed and co-authors Jack Dongarra of Oak Ridge National Laboratory and computer scientist Dennis Gannon propose six maxims to guide not just future development for HPC but also its use.

Maxim One: Semiconductor constraints dictate new approaches

The days of throwing more conventional CPUs and GPUs at the problem are coming to an end. The limitations of current architectures plus the cost of developing new fabs should be driving standardization for chiplet designs that can be used in standardized HPC architectures.

Maxim Two: End-to-end hardware/software co-design is essential

The future is in application-designed HPC, including custom processors designed for software solutions, which we are already seeing in many of the AI startup companies we recently covered. Reed also believes that significant investment in research by government, along with hardware vendors and hyperscalers, will be necessary. The DOE’s recent investment in new cooling technologies is a good example of this.

Maxim Three: Prototyping at scale is required to test new ideas

Custom silicon and new software ideas will need to be built and tested to explore the most effective way to get to the next level. This means taking risks, and the government and the big cloud players will need to be willing to take on those risks to find the next generation of answers.

Maxim Four: The space of leading edge HPC applications is far broader now than in the past

This is a statement of fact. The focus of HPC applications continues to expand: the increased visibility of the technology, as it was used in the design of COVID-19 vaccines, has made businesses actively look for ways that HPC can solve their everyday business problems.

Maxim Five: Cloud economics have changed the supply chain ecosystem

Annual hyperscaler spending on hardware dwarfs the amount of money budgeted every few years to build current-technology supercomputers. It will be important for governments and businesses to collaborate with the cloud operators and smartphone developers on the next generation of hardware and HPC services.

Maxim Six: The societal implications of technical issues really matter

It used to be said that computing was touching on every aspect of our lives. That is now true of high performance computing: from vaccine design to daily weather forecasting, our lives are constantly being touched and influenced by HPC. AI use will increase that interaction, and the ethical considerations of the impact of this cutting edge technology will require constant oversight.

The belief is that we are at another crossroads in the design and use of computing. Commoditization got us to this point, but the next generation of HPC will likely be purpose built to solve specific problems and classes of problems. Change is coming. 

About the Author

David Chernicoff

David Chernicoff is an experienced technologist and editorial content creator with the ability to see the connections between technology and business while figuring out how to get the most from both and to explain the needs of business to IT and IT to business.
