Nvidia accelerates path to AI for IoT, hyperscale DCs
28 September 2017
It is safe to say the Internet of Things (IoT) era has arrived, as we live in a world where things are being connected at a pace never seen before. Cars, video cameras, parking meters, building facilities and anything else one can think of are being connected to the Internet, generating massive quantities of data.
The question is how to interpret that data and understand what it means. Processing this much data manually is clearly not feasible, which is why most web-scale companies have embraced artificial intelligence (AI) as a way to create new services that leverage the data. These include speech recognition, natural language processing, real-time translation, predictive services and contextual recommendations. Every major cloud provider and many large enterprises have AI initiatives underway.
However, many data centres are not outfitted with enough processing power for AI inferencing. For those not familiar with the different phases of AI, training is teaching the AI new capabilities from an existing set of data. Inferencing is applying that learning to new data sets. Facebook’s image recognition and Amazon’s recommendation engine are both good examples of inferencing.
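To make the distinction concrete, here is a minimal sketch in Python. It is purely illustrative and has nothing to do with Nvidia's software: the data points and the function names `train` and `infer` are assumptions made up for this example, and a simple least-squares line fit stands in for a real neural network. Training learns parameters from existing data; inferencing applies those parameters to an input the model has not seen.

```python
# Illustrative only: a straight-line fit stands in for a real AI model.
# The data and the names train/infer are hypothetical, not from any Nvidia API.

def train(xs, ys):
    """Training: learn a slope and intercept from an existing data set."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def infer(model, x):
    """Inferencing: apply the learned parameters to new data."""
    slope, intercept = model
    return slope * x + intercept

# Existing (made-up) data that follows y = 2x + 1.
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [1.0, 3.0, 5.0, 7.0]

model = train(train_x, train_y)   # done once, ahead of time
prediction = infer(model, 10.0)   # done repeatedly, on each new input
print(prediction)
```

Training happens once on the historical data, while inferencing runs every time a new input arrives, which is why inferencing capacity matters so much in a production data centre.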
At its GPU Technology Conference (GTC) in China, Nvidia announced TensorRT 3, which promises to improve the performance and cut the cost of inferencing. TensorRT 3 takes very complex networks, then optimises and compiles them to get the best possible performance for AI inferencing. It acts as AI “middleware” so the data can be run through any framework and sent to any GPU. Nvidia has shown why GPUs are much better suited to AI applications than CPUs, and it offers a wide range of GPUs to match the type of application and the processing power required.
Unlike other GPU vendors, Nvidia does not rely on great silicon alone. Instead, it takes an architectural approach, combining software, development tools and hardware into an end-to-end solution.
During his keynote, CEO Jensen Huang showed figures in which TensorRT 3 running on Nvidia GPUs delivered performance that was 150x better than CPU-based systems for translation and 40x better for images, which will save its customers huge amounts of money and offer a better quality of service.
Nvidia also introduced the DeepStream SDK, which delivers low-latency video analytics in real time. Video inferencing has become a key part of smart cities but is being used in entertainment, retail and other industries as well.
The company also announced an upgrade to CUDA, its accelerated computing software platform. Version 9 is now optimised for the new Tesla V100 GPU accelerators, the highest-end GPUs and ideal for AI, HPC and graphically intense applications such as virtual reality.
In addition, Huawei, Inspur and Lenovo are using Nvidia’s HGX reference architecture to offer Volta-based systems. The server manufacturers will be granted early access to HGX architectures for data centres and to design guidelines. The HGX architecture is the same one used by Microsoft and Facebook today, meaning Asia-Pac-based organisations can have access to the same GPU-based servers as the leading web-scale cloud providers.
The world is changing quickly, and market leaders will be defined by the organisations that have the most data and the technologies to interpret it. Core to that is GPU-based machine learning and AI, as these systems can analyse data far faster than people.
IDG News Service