Extreme Data

Longform
Image: Stockfresh

15 May 2014

But Connaughton says that the notion of a ‘data scientist’ is in some ways part of the current hype. “I think there is a current myth that there is this mythical data scientist with some sort of double first degree in maths who understands everything in an industry. I suggest that when we talk about ‘data science’ we are in fact always talking about a team of people, who can combine the statistical analysis with the IT and the domain knowledge and whatever else might contribute in a cross-disciplinary way.”

On the other hand, he suggests also that “We could get hung up on specialist skills and processes when in fact one of the key concepts about Big Data, and so also Extreme Data, is that of information discovery. There is a real sense in which that is counter-intuitive to building more specialised and targeted systems.” But the key point, Connaughton believes, is that right now the limiting factor is not our technology but the culture of organisations.

Industry sectors are building systems that are software-driven in the sense of providing for the best performance of their significant applications. In some respects many enterprises are now data focussed rather than computing focussed, across all segments and industries. They all understand that big data is only useful in giving the possibility of extracting value from analytics, Dr Philippe Trautman, HP

Workload key
But as we all know, very often the culture of organisations is driven by practical and commercial imperatives — in ICT terms, their key workloads. Dr Philippe Trautman of HP is in charge of its HPC business in EMEA. His group is focussed on those solutions and customers with some kind of HPC workloads, whether they actually have Big Data or not. For HP, as for the industry in general, the major current application markets include computer-aided engineering, design and manufacture, geophysical sciences and oil and gas exploration, life sciences and energy, along with the longer-standing sectors such as financial services, meteorology, and government and academic research.

HP’s solutions for these industries are largely based on HP Moonshot, its year-old range of advanced servers that are essentially software-defined. “In talking to our clients I often describe our Moonshot solutions as software-driven and workload-defined computing. We see it as a component of HPC, not necessarily aimed at the very top end but certainly at the practical industrial and commercial levels, for example recently an oil and gas project with 150 petabytes of data.”

“IT investment focussed on workloads is particularly appropriate for that minority of organisations — or their divisions — that are dependent on a small number of key applications. The oil and gas sector provides good examples and we have specific architectures and accelerated technologies for them, usually working with specialist ISVs who have the depth of domain expertise.

“We are seeing similar trends in manufacturing and automotive industries as well as in aerospace where Airbus is a prominent client of HP. These sectors are building systems that are software-driven in the sense of providing for the best performance of their significant applications. In some respects many enterprises are now data focussed rather than computing focussed, across all segments and industries. They all understand that big data is only useful in giving the possibility of extracting value from analytics.

“Financial services and banks have understood this for a long time and today I would point to oil and gas as a sector which is very quickly adopting this approach in analysing the data from seismic studies. You can actually see more money being invested on the data side than on the actual computing, which shows how the balance has swung.

“Today it is all soft, the data and the applications, and we design the platforms and the systems to match the workloads,” Dr Trautman says. “For instance a Moonshot array might have several special purpose cartridges, perhaps even bespoke, plus general purpose computing. The key point is that it is a unified computing resource with some specialist and targeted elements for efficiency and performance.”

We can take data from multiple sensors — in fact every single person or car could be a multi-sensor platform. All of these, together with fixed point and passive sensors, can be a dynamic data infrastructure. It is in the fusion of all of those possible data sets that we can gain insights and value, Dr Randy Cogill, IBM Research


TechCentral.ie