How 5G and edge GPUs will unleash AI in the IoT platform (Reader Forum)
The combination of 5G and edge computing promises to fast-track the use of artificial intelligence (AI) in the Internet of Things (IoT). With speeds 10 to 20 times faster than 4G and dramatically lower latency, 5G makes it far more feasible to process AI workloads locally at the edge, where the data is gathered, rather than sending it to the cloud or a data center — a slower and more expensive path.
This speed and reduced latency are essential to an array of IoT applications such as smart cities, transportation, intelligent manufacturing, e-health, and smart farming. This is why Gartner predicts that 75 percent of enterprise-generated data will be processed outside a traditional data center or cloud by 2022, compared with 10 percent today.
There’s a common misconception, however, that processing at the edge always means doing so on the IoT endpoints themselves. For compute-intensive AI workloads, that is simply not practical: most endpoints lack the resources to handle them.
Deploying AI and machine learning as part of an IoT platform therefore requires a different approach: offloading AI tasks from the sensors, actuators, robots, and other endpoints to fast computational units — GPUs — located nearby.
5G’s ability to enable this offloading with minimal latency creates the responsive architecture needed to apply AI to the vast streams of data produced by IoT devices.
So what specifically do enterprise leaders and IoT architects need to know about executing this approach so they can benefit from the advantages of AI at the edge?
To start, understand that creating an AI-powered application has two key engineering parts.
One is providing the machine learning algorithm with training data to learn from. This carries large compute requirements when dealing with, for example, all the data streams collected by sensors in a smart factory.
The second is inference, or execution — applying the insights that the trained models have yielded in order to, for example, tweak how a machine in the factory operates based on patterns detected by the sensors. This is also a complex and often compute-heavy job.
Processing all this data by using rented servers on the public cloud can get expensive quickly, increase complexity, and cause the aforementioned latency issues. The same goes for relying on private data centers for the work.
The key question that must always be asked is whether the trained model can “fit” on the device. Though tools such as TensorFlow Lite have made progress in shrinking trained models in recent years, latency remains an issue for compute-heavy applications such as speech recognition or video analysis, because of the limited processing power of the IoT devices themselves.
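The “fit” question can be framed as a back-of-the-envelope memory check. The sketch below is illustrative, not a real profiler: the function name, the 2x runtime-overhead factor, and the parameter counts are assumptions, though the underlying idea — quantizing weights from 4-byte floats to 1-byte integers, as TensorFlow Lite’s quantization does, to shrink the footprint roughly fourfold — is standard practice.

```python
def model_fits_on_device(num_params: int,
                         bytes_per_param: int,
                         device_budget_bytes: int,
                         overhead_factor: float = 2.0) -> bool:
    """Rough check: do the model's weights, plus an assumed overhead
    for activations and runtime buffers, fit in the memory the IoT
    device can spare for inference?"""
    footprint = num_params * bytes_per_param * overhead_factor
    return footprint <= device_budget_bytes

# A 5M-parameter model quantized to int8 (1 byte per weight),
# on a device that can spare 16 MB: ~10 MB footprint, it fits.
print(model_fits_on_device(5_000_000, 1, 16 * 1024**2))   # True
# The same model in float32 (4 bytes per weight): ~40 MB, it doesn't.
print(model_fits_on_device(5_000_000, 4, 16 * 1024**2))   # False
```

When the answer comes back false — as it usually does for speech or video models — the workload is a candidate for offloading.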
Thus, the answer to the “fit” question is usually no, and it’s time to consider offloading the AI model to a nearby server.
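Even when a model technically fits, offloading can still win on latency. A simple decision rule — hypothetical, with illustrative timings, not a production scheduler — compares on-device inference time against the 5G round trip plus inference on a nearby edge GPU:

```python
def should_offload(local_infer_ms: float,
                   network_rtt_ms: float,
                   server_infer_ms: float) -> bool:
    """Offload when the round trip to a nearby edge GPU plus its
    inference time beats running the model on the device itself."""
    return network_rtt_ms + server_infer_ms < local_infer_ms

# Video analysis: 800 ms on a constrained device vs. a 10 ms 5G
# round trip plus 40 ms on an edge GPU -> offload.
print(should_offload(local_infer_ms=800, network_rtt_ms=10,
                     server_infer_ms=40))   # True
# A tiny keyword-spotting model at 5 ms on-device -> keep it local.
print(should_offload(local_infer_ms=5, network_rtt_ms=10,
                     server_infer_ms=1))    # False
```

The rule makes 5G’s role concrete: with 4G round trips of 50 ms or more, far fewer workloads clear the offloading bar.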
Here’s where GPUs come in. Originally designed as a single-chip processor for managing and enhancing video and graphics performance, GPUs have evolved into a key AI and machine learning accelerator. Businesses, systems integrators, and network operators can configure edge GPUs to offload computation from a wide array of IoT devices.
By tapping into capacity from an edge GPU service provider, IoT businesses and practitioners can focus on domain expertise such as building a great robot or reducing manufacturing defects instead of infrastructure concerns.
Tools such as Kubeflow can further help operationalize and manage these AI workflows in the enterprise with a single developer interface.
The marriage of 5G and GPUs will expand the possibilities of what edge computing can accomplish and unlock AI solutions that would otherwise not be possible. In doing so, it will help pave the way for many exciting IoT scenarios.