HPC: Calculating activity
High performance computing is giving enterprises the ability to find answers to questions they want to ask and those they did not know to ask, writes ALEX MEEHAN
10 April 2019
The world of supercomputing can seem a remote one to most IT specialists, something that happens in university laboratories or in the headquarters of equipment manufacturers.
But increasingly the kind of high performance computing (HPC) that is capable of truly prodigious feats of number crunching is becoming relevant to enterprises as well as researchers. And the reason is down to two words – Big Data.
The amount of data being produced continues to grow exponentially, and with it the use cases for computers capable of working with large data sets. At the same time, technology itself is advancing, and applications such as artificial intelligence and data analytics are placing ever larger demands on processing power.
But just what is high performance computing? At a time when the average data centre packs an enormous processing punch and even a laptop can boast terabyte storage, what qualifies a set of compute resources to be called high performance or high end?
According to Michael Johnston, senior research engineer and manager for IBM Research Ireland, HPC is the ability to use compute power at much larger scales than your average desktop or laptop could ever hope to manage.
“You need certain specific compute resources in order for a system to qualify as an HPC system, and they’re not common – you don’t get these off the shelf. Usually an HPC system would use supercomputers made up of many powerful nodes with GPUs that are very expensive, connected with a high performance interconnect and a shared file system.”
“The whole thing is capable of running and processing huge amounts of data, working with files that wouldn’t even fit on the average laptop for example.”
Huge data capacity and huge parallel processing power – those are the hallmarks of HPC, with systems potentially using hundreds of nodes together to perform a task. What that task is can vary hugely but according to Johnston, HPC allows users to answer questions that either couldn’t otherwise be answered at all, or couldn’t be answered within a useful period of time.
“The exact questions are really industry dependent. For example, we’ve been working for a long time with the Hartree Centre in the UK which is a focal point for where HPC meets industry. Large businesses come there with their problems and we see how HPC can help,” he said.
“What each company wants to do can be very different. Some have huge analysis problems and want to gain insight from data that is terabytes in size. For example, we worked for a long time with Unilever, which designs consumer products made of mixtures of chemicals.”
“A large part of Unilever’s R&D process involves research into new products and processes – essentially they want to make things better and cheaper. They run virtual analogs of experiments because that can be done faster than sending someone into a lab wearing a lab coat for five days. Instead of five days they have their answer in five hours.”
Other applications IBM has facilitated include engineering projects where users have created digital twins of products and then examined how they behave in a computerised environment before they are built in real life. To do this, the models used don’t have to be exact replicas of reality – you can still get a lot of insight from coarse-grained models, as distinct from finely grained models.
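The coarse-versus-fine trade-off can be shown with a toy “virtual experiment”. The example below, which is purely illustrative and not drawn from IBM’s or Unilever’s work, simulates Newton’s law of cooling twice: once with a coarse time step and once with a fine one. The coarse run does a tiny fraction of the work yet lands close to the exact answer, which is the sense in which coarse-grained models still deliver insight.

```python
import math

def simulate_cooling(t_end, dt, T0=90.0, T_env=20.0, k=0.1):
    """Explicit-Euler digital twin of Newton's law of cooling:
    dT/dt = -k * (T - T_env)."""
    T, t = T0, 0.0
    while t < t_end:
        T += -k * (T - T_env) * dt
        t += dt
    return T

# Coarse-grained model: 30 big steps. Fine-grained model: 30,000 small steps.
coarse = simulate_cooling(30.0, dt=1.0)
fine = simulate_cooling(30.0, dt=0.001)

# Closed-form solution for comparison: T(t) = T_env + (T0 - T_env) * e^(-k t)
exact = 20.0 + 70.0 * math.exp(-0.1 * 30.0)
```

Here the coarse run is within about half a degree of the exact answer at a thousandth of the cost; real engineering models make the same bargain at vastly larger scale.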
“In the future, being able to process huge data sets is going to be a central part of how large companies will excel. That’s recognised across the EU which is why we’re seeing projects at the early stage now. Lots of companies are sitting on potential gold mines of data but they’re challenged by the question of how to build programmes that can analyse these data sets and get the insight out of them,” said Johnston.
According to Prof Jean-Christophe Desplat, director of the Irish Centre for High End Computing (ICHEC), countries like Ireland absolutely must have HPC facilities to remain competitive in an international context.
“Nowadays almost every country uses HPC resources and nations simply cannot operate without them. This is not only true in order to support national research but also to develop commercial activities as more and more companies are processing large amounts of data or so-called Big Data,” he said.
“In fact, this is a key trend as the primary use of HPC has now become the processing of data, with this combination of HPC and Big Data being described as high performance data analysis (HPDA). It has now become very difficult to speak of HPC or Big Data without mentioning the other as both domains have become intimately linked.”
Countries like the US, China, Germany and Japan have the largest supercomputing processing capabilities, but smaller countries like Ireland are increasing their reliance on supercomputers. Ireland was recently confirmed as a founder member of the European High Performance Computing Joint Undertaking (EuroHPC JU), an initiative that will invest €1 billion in European supercomputing facilities and HPC R&D over the coming years.
“At ICHEC we host the new national supercomputer, called Kay, which has just come online. It caters for a wide range of research and development activities across all disciplines. The applications researchers run on the system typically require lots of computing power to run larger or longer simulations, or to process large amounts of data,” said Desplat.
“This allows research to operate in domains including nanotechnology, where researchers simulate interactions between atoms and molecules to develop new materials. The supercomputer also allows us to carry out weather forecasting and climate modelling at higher resolution, which leads to more accurate and localised predictions.”
Other applications include running simulations for medical device development and processing large amounts of biological data for genomics to better understand diseases and treatments. The new system is also a powerful platform in the field of artificial intelligence, allowing users to train and deploy large neural networks which have become the foundation of many new discoveries.
Finally, looking at a more futuristic use, the machine will be deployed to create ICHEC’s quantum computing environment for the development of quantum applications and libraries.
All this is fascinating, but enterprise-class companies might well look at this and ask how can HPC be harnessed for their specific needs?
“I think the knowledge and understanding of HPC in the business community is growing. At ICHEC we are here to collaborate with businesses on improving their software and data systems. By moving code from CPUs to GPUs businesses can improve the efficiency of their computational requirements and ultimately lower their operating costs,” said Desplat.
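Moving code from CPUs to GPUs, as Desplat describes, typically means rewriting per-element loops as whole-array operations, because that data-parallel form maps onto thousands of GPU cores. A minimal sketch, using NumPy as a stand-in: libraries such as CuPy deliberately mirror NumPy’s array API, so code written this way can often be moved to a GPU by swapping the import (assuming a CUDA-capable machine, which this example does not require).

```python
import numpy as np

def distances_loop(points, query):
    """Scalar loop: the style that keeps work serial on a single CPU core."""
    out = []
    for p in points:
        out.append(sum((a - b) ** 2 for a, b in zip(p, query)) ** 0.5)
    return out

def distances_vectorised(points, query):
    """One array expression: the data-parallel form that ports to GPUs."""
    return np.sqrt(((points - query) ** 2).sum(axis=1))

rng = np.random.default_rng(0)
pts = rng.random((1000, 3))   # 1,000 points in 3-D, a toy workload
q = np.array([0.5, 0.5, 0.5])

serial = distances_loop(pts, q)
parallel = distances_vectorised(pts, q)
```

Both functions compute the same Euclidean distances; the second expresses the work as bulk array arithmetic, which is what lets an accelerator run it efficiently.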
ICHEC works closely with Intel, and Brian Quinn, director of European innovation for Intel Labs Europe, said that in his opinion it’s very difficult to discuss HPC in isolation because effectively it functions as part of a broader IT landscape.
“HPC has a role in a chain from data centre through to storage through to clients through to edge computing etc – it’s about finding the right infrastructure to execute what you want to do. Clearly HPC is part of that, but it sits in the context of a broader compute footprint,” he said.
“More and more, we’re seeing traditional types of compute, such as those found in data centres or in HPC infrastructure, and we’re seeing a convergence of that infrastructure in a virtual sense. The workload, driven by data, goes to the right place at the right time to do what needs to be done. It can be quite fluid across the different types of compute infrastructure.”
While data centres are traditionally more suited to transaction-based workloads, according to Quinn HPC is able to deal with much more complex compute workloads involving high-level processing. What that means is that it can handle things like scientific workloads and complex modelling.
“If you’re a car company and you want to model a new car, you’re going to need an enormous amount of compute power to model all the variables, from tyre width to the information system in the car to all the sensors that are in it etc – it’s enormous. HPC is ideal for that,” said Quinn.
“Likewise, a lot of scientific modelling and workloads are suitable for HPC. Testing hypotheses and running through lots and lots of data. And of course when it comes to artificial intelligence, you’re typically working with a lot of deep learning and the teaching of a neural network how to learn from lots and lots of data examples. That’s all highly intense in terms of processing.”
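The “teaching of a neural network” that Quinn mentions comes down to a training loop: show the network labelled examples, measure its error, and nudge its weights against that error, millions of times over. A deliberately tiny sketch, training a single neuron by gradient descent to learn the AND function from four examples (an invented toy problem, not one from the article):

```python
import math

# Labelled examples: inputs and the AND of those inputs.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

w1, w2, b = 0.0, 0.0, 0.0   # the network's parameters, initially untrained
lr = 0.5                     # learning rate: size of each corrective nudge

for epoch in range(5000):
    for (x1, x2), y in data:
        # Forward pass: the neuron's prediction via a sigmoid activation.
        pred = 1.0 / (1.0 + math.exp(-(w1 * x1 + w2 * x2 + b)))
        err = pred - y
        # Backward pass: adjust each weight against its share of the error.
        w1 -= lr * err * x1
        w2 -= lr * err * x2
        b -= lr * err

def predict(x1, x2):
    return 1.0 / (1.0 + math.exp(-(w1 * x1 + w2 * x2 + b)))
```

Deep learning is this same loop with millions of weights and millions of examples, which is exactly why it is, in Quinn’s phrase, highly intense in terms of processing.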
Catherine Doyle, regional sales director for enterprise with Dell EMC, agrees that high performance computing is likely to grow in significance in the Irish market, as the application of processing power to Big Data sets becomes an ever more valuable proposition.
“As Big Data becomes more important and companies want answers to their questions now – not later, not next week but immediately – then HPC will become more and more important. Once that data is attached to the right network, HPC will process that data extremely quickly,” she said.
“We’re talking about data that is traditionally thought of as complex. As artificial intelligence evolves and increasingly quick decisions are required, then HPC will become more important. It’s that simple.”
According to Doyle, social media is a good example of unstructured data. HPC can crunch that data extremely quickly and produce a useful dataset that a bot can then use, for example, to generate intelligent answers.
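What turning unstructured social media posts into a “useful dataset” means in practice can be sketched in a few lines. The example below is a naive illustration of the idea only: the posts, keyword lists and keyword-counting approach are all invented here, and production pipelines use far more sophisticated language models over vastly larger data.

```python
import re
from collections import Counter

# Hypothetical raw, unstructured posts.
posts = [
    "Love the new phone, battery life is great!",
    "Terrible service, would not recommend.",
    "Battery died after a day. Not great.",
]

# Toy keyword lists standing in for a real sentiment model.
POSITIVE = {"love", "great", "recommend"}
NEGATIVE = {"terrible", "died", "not"}

def structure(post):
    """Reduce one free-text post to a structured record."""
    words = re.findall(r"[a-z']+", post.lower())
    counts = Counter(words)
    return {
        "tokens": len(words),
        "positive": sum(counts[w] for w in POSITIVE),
        "negative": sum(counts[w] for w in NEGATIVE),
    }

dataset = [structure(p) for p in posts]
```

Each record now has a fixed schema that downstream systems, a bot included, can query, which is the structured-from-unstructured transformation Doyle describes.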
“The same is true in medicine or law – any enterprise field where number crunching is a factor. Unstructured Big Data is where HPC shines. The question facing people dealing with Big Data is how to make sense of it and get answers out of it quickly. In the past, you’d have had to buy a very specific big and expensive system to complete each task separately, and therefore only certain industries would have had access to it,” she said.
“While people have been talking about Big Data for a few years, the perception exists that it’s something easy to deal with. However, the reality is that the technology is still in its infancy in terms of dealing with Big Data. HPC is part of the solution, but it’s a thorny set of problems being dealt with. This technology is becoming a lot more important to companies for this exact reason.”
Dell EMC sells a lot of its HPC products to research facilities and universities as well as to financial institutions, but Doyle reports that it is increasingly seeing sales to a wider range of buyers in the enterprise space as well.
“We’re now seeing sales happen across the board and our Isilon range of object-based storage is increasingly being bought for Big Data applications because you can use it as fast storage. It allows you to create a large scale data repository at a reasonable cost,” said Doyle.
But what’s reasonable? HPC isn’t cheap but Doyle said the question of whether it’s expensive is really a ‘how long is a piece of string’ question.
“How much data do you need to crunch? How regularly are you going to be doing that? In general it’s not that HPC is expensive to get involved with but rather that the trick to doing it competitively lies in using the right software sitting in a layer over your data to extract information,” she said.
“If you take loads of raw data from many different sources in a variety of formats, how do you collate that and interrogate it in a meaningful fashion? Whether that’s expensive or not varies massively.”