In the future, AI models will also learn on edge devices with small, energy-efficient microcontrollers. This avoids latency, lowers energy costs and reduces data security risks.
Machine learning (ML) ranks among the defining IT topics of our time. In manufacturing and industry in particular, this implementation of artificial intelligence is paving the way for the digital transformation. It is often assumed that this involves enormous quantities of data being transferred to a cloud, where they are processed by complex models. For many applications, however, this approach is undesirable for both technical and data protection reasons.
For example, processing in the cloud entails long reaction times as well as high demand for bandwidth and energy. In addition, there are concerns regarding data protection and security. And, last but not least, purely cloud-based infrastructures are reaching their limits with the exponential growth in the number of internet-capable devices.
One practical solution is the decentralized processing of data. Here, the ML models sit where the action is – at the edge. Training still takes place in the cloud or on local servers. However, inference – decision-making based on newly recorded field data – happens at the edge of the network, where the machine learning algorithms run on microcontrollers with no operating system and limited memory.
The TinyML Foundation, founded in 2019, responded to these constraints with a combination of hardware and software that enables machine learning on resource-constrained microcontrollers. This was realized with the support of Google’s open-source library TensorFlow Lite, which ports models trained in the cloud to embedded systems.
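Porting a cloud-trained model to a microcontroller typically involves quantizing its float32 weights to int8 so they fit the tight memory budget. The sketch below shows affine int8 quantization in principle; the function names and the toy weights are illustrative assumptions, not TensorFlow Lite's actual API:

```python
def quantize_int8(weights):
    """Map float weights onto int8 via an affine scheme: q = round(w / scale) + zero_point."""
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / 255.0 or 1.0     # guard against a constant tensor
    zero_point = int(round(-128 - w_min / scale))
    q = [max(-128, min(127, int(round(w / scale)) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values; the error is at most one quantization step."""
    return [(qi - zero_point) * scale for qi in q]

weights = [-1.0, -0.5, 0.0, 0.5, 1.0]
q, scale, zp = quantize_int8(weights)
restored = dequantize(q, scale, zp)            # close to the originals, at a quarter of the size
```

Storing the weights as int8 cuts their footprint by a factor of four compared with float32, at the cost of a bounded rounding error per weight.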
One example is the activation command of voice-controlled personal assistants like Siri. The “always-on” application waits for the voice command around the clock using a highly energy-efficient microcontroller with only a few kilobytes of memory. Only when it receives the command does the main CPU make additional functions available.
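The gating pattern described above can be sketched in a few lines. Everything here is a hypothetical Python illustration rather than real firmware: `detector` stands in for the tiny always-on model, and the returned frame indices stand in for the interrupt that would wake the main CPU:

```python
def run_always_on(frames, detector, threshold=0.9):
    """Scan incoming audio frames with a low-power detector and
    record the frames at which the main CPU would be woken."""
    wake_events = []
    for i, frame in enumerate(frames):
        if detector(frame) >= threshold:
            wake_events.append(i)   # on real hardware: assert an interrupt line
    return wake_events

# toy detector: each "frame" is already a keyword score
scores = [0.1, 0.2, 0.95, 0.3, 0.92]
events = run_always_on(scores, detector=lambda f: f)   # → [2, 4]
```

The point of the design is that the expensive component runs only on the rare frames where the cheap component fires.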
Researchers at the Massachusetts Institute of Technology (MIT) have now gone one step further. They are also carrying out the training process in the edge. For example, an intelligent keyboard could continuously learn from how the user types.
Current solutions for on-device training require more than 500 megabytes of memory – far more than the 256-kilobyte capacity of most microcontrollers. The MIT researchers' algorithms and frameworks, by contrast, get by with less than a quarter of a megabyte of memory. The result is a training process that takes just a few minutes.
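The memory figures quoted above amount to a reduction of roughly three orders of magnitude, as a quick back-of-the-envelope calculation shows:

```python
baseline_kb = 500 * 1024   # > 500 MB required by current on-device training solutions
mcu_kb = 256               # typical microcontroller memory capacity
mit_kb = 0.25 * 1024       # "less than a quarter of a megabyte" (MIT approach)

print(baseline_kb / mit_kb)   # at least a 2000x reduction versus the baseline
print(mit_kb <= mcu_kb)       # and it fits within the 256 KB budget
```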
With the new approach, IoT devices can not only perform inference, but also continuously update their AI models with newly collected data. Thanks to the low resource usage, even deep learning with artificial neural networks may soon be possible on such devices.
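Continuously updating a model from newly collected data boils down to running gradient steps on the device itself. As a toy illustration – not the MIT algorithm – here is online SGD fitting a one-weight linear model, one sample at a time, as field data arrives:

```python
def sgd_step(w, x, y, lr=0.1):
    """One online gradient step for the model y ≈ w * x with squared-error loss."""
    grad = 2.0 * (w * x - y) * x
    return w - lr * grad

w = 0.0                                              # weight shipped with the device
for x, y in [(1.0, 2.0), (2.0, 4.0), (1.5, 3.0)]:    # newly recorded field data
    w = sgd_step(w, x, y)                            # w drifts toward the true slope of 2
```

Each step needs only the current sample and the weight itself, which is what makes this style of updating feasible in a few kilobytes of working memory.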
Initial tests of the framework were carried out with a computer vision model, which learned to recognize people in photos after just ten minutes of training. The method was 20 times faster than current approaches. In the future, the insights gained will be used to shrink even complex models without compromising accuracy. This could help to reduce the CO2 footprint of large ML models.
Ji Lin, Ligeng Zhu, Wei-Ming Chen, Wei-Chen Wang, Chuang Gan, Song Han: