US planning for two exascale supercomputers


The US Titan supercomputer (Source: Oak Ridge National Laboratory)

23 November 2016

The US believes it will be ready by 2019 to seek vendor proposals to build two exascale supercomputers, each costing roughly $200 million (€187 million) to $300 million (€283 million).

The two systems will be built at the same time and will be ready for use by 2023, although it is possible one of the systems could be ready a year earlier, according to US Department of Energy officials.

But the scientists and vendors developing exascale systems do not yet know whether president-elect Donald Trump’s administration will change direction. The incoming administration is a wild card. Supercomputing was not a topic during the campaign, and Trump’s dismissal of climate change as a hoax, in particular, has researchers nervous that science funding may suffer.

SC16
At the annual supercomputing conference SC16 in Salt Lake City, a panel of government scientists outlined the exascale strategy developed by the Obama administration. When the session was opened to questions, the first two were about Trump. One attendee quipped that “pointed-head geeks are not going to be well appreciated.”

Another person in the audience, John Sopka, a high-performance computing software consultant, asked how the science community will defend itself from claims that “you are taking the money from the people and spending it on dreams,” referring to exascale systems.

Paul Messina, a computer scientist and distinguished fellow at Argonne National Laboratory who heads the Exascale Computing Project, appeared sanguine. “We believe that an important goal of the exascale computing project is to help economic competitiveness and economic security,” said Messina. “I could imagine that the administration would think that those are important things.”

Politically, there ought to be a lot in HPC’s favour. A broad array of industries rely on government supercomputers to conduct scientific research, improve products, attack disease, create new energy systems and understand climate, among many other uses. Defence and intelligence agencies also rely on large systems.

The ongoing exascale research funding, for which the US budget is $150 million (€142 million) this year, will help with advances in software, memory, processors and other technologies that ultimately filter out to the broader commercial market.

This is very much a global race, which is something the Trump administration will have to be mindful of. China, Europe and Japan are all developing exascale systems.

Chinese plans
China plans to have an exascale system ready by 2020. China, Europe and Japan see exascale, and the computing advances required to achieve it, as a pathway to challenging America’s tech dominance.

“I’m not losing sleep over it yet,” said Messina, of the possibility that the incoming Trump administration may have different supercomputing priorities. “Maybe I will in January.”

The US will award the exascale contracts to vendors with two different architectures. This is not a new approach and is intended to help keep competition at the highest end of the market. Recent supercomputer procurements include systems built on the IBM Power architecture, Nvidia’s Volta GPU and Cray-built systems using Intel chips.

The timing of these exascale systems, ready for 2023, is also designed to take advantage of the upgrade cycles at the national labs. The large systems that will be installed in the next several years will be ready for replacement by the time exascale systems arrive.

Performance milestone
The last big performance milestone in supercomputing occurred in 2008 with the development of a petaflop system. An exaflop system is 1,000 times faster than a petaflop system, and building one is challenging because of the limits of Moore’s Law, a 1960s-era observation that the number of transistors on a chip doubles about every two years.

“Now we’re at the point where Moore’s Law is just about to end,” said Messina in an interview. That means the key to building something faster “is by having much more parallelism, and many more pieces. That’s how you get the extra speed.”

An exascale system will solve a problem 50 times faster than the 20-petaflop systems in use in government labs today.
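The 50-fold figure follows from the peak rates alone: an exaflop is 1,000 petaflops, and 1,000 divided by 20 is 50. A minimal sketch of that arithmetic, assuming peak floating-point rate is the only factor (real application speedups also depend on parallel efficiency and data movement); the function name is illustrative and does not come from the Exascale Computing Project.

```python
PETAFLOP = 10**15  # floating-point operations per second
EXAFLOP = 10**18   # 1,000 petaflops


def peak_speedup(new_flops, old_flops):
    """Ratio of peak rates: an upper bound, not a measured speedup."""
    return new_flops / old_flops


# Exascale system versus a 20-petaflop system
print(peak_speedup(EXAFLOP, 20 * PETAFLOP))  # 50.0
```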

Development work has begun on the systems and applications that can utilise hundreds of millions of simultaneous parallel events. “How do you manage it — how do you get it all to work smoothly?” said Messina.

Energy consumption
Another major problem is energy consumption. An exascale machine can be built today using current technology, but such a system would likely need its own power plant. The US wants an exascale system that can operate on 20 megawatts and certainly no more than 30 megawatts.
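The power target can be restated as an efficiency target. A rough sketch, assuming the machine sustains a full exaflop at the stated power draw (a simplification; the helper name is hypothetical):

```python
EXAFLOP = 10**18   # floating-point operations per second
MEGAWATT = 10**6   # watts


def gigaflops_per_watt(flops, watts):
    """Energy efficiency implied by a given rate and power budget."""
    return flops / watts / 10**9


# Efficiency needed at the 20 MW goal and at the 30 MW ceiling
for megawatts in (20, 30):
    efficiency = gigaflops_per_watt(EXAFLOP, megawatts * MEGAWATT)
    print(f"{megawatts} MW: {efficiency:.1f} gigaflops per watt")
```

Under these assumptions, the 20-megawatt goal works out to about 50 gigaflops per watt, and the 30-megawatt ceiling to roughly 33.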

Scientists will have to come up with a way “to vastly reduce the amount of energy it takes to do a calculation,” said Messina. The applications and software development are critical because most of the energy is used to move data. And new algorithms will be needed.

About 500 people are working at universities and national labs on the DOE’s coordinated effort to develop the software and other technologies exascale will need.

Aside from the cost of building the systems, the US will spend millions funding the preliminary work. Vendors want to keep the intellectual property of what they develop. If it costs, for instance, $50 million (€47 million) to develop a certain aspect of a system, the US may ask the vendor to pay 40% of that cost if it wants to keep the intellectual property.

A key goal of the US research funding is to avoid creation of one-off technologies that can only be used in these particular exascale systems.

“We have to be careful,” Terri Quinn, a deputy associate director for HPC at Lawrence Livermore National Laboratory, said at the SC16 panel session. “We don’t want them (vendors) to give us capabilities that are not sustainable in a business market.”

The work under way will help ensure that the technology research is far enough along to enable the vendors to respond to the 2019 request for proposals.

Delivering advances
Supercomputers can deliver advances in modelling and simulation. Instead of building physical prototypes, engineers can model a product virtually on a supercomputer. This can cut the time it takes to bring something to market, whether a new drug or a car engine. Increasingly, HPC is used in big data and is helping improve cybersecurity through rapid analysis; artificial intelligence and robotics are other fields with strong HPC demand.

China will likely beat the US in developing an exascale system, but the real test will be how useful the systems prove to be.

Messina said the US approach is to develop an exascale ecosystem involving vendors, universities and the government. The hope is that the exascale systems will not only have a wide range of applications ready for them, but applications that are relatively easy to program. Messina wants to see these systems put to immediate and broad use.

“Economic competitiveness does matter to a lot of people,” said Messina.

IDG News Service