by Gustav Evertsson, Vinnter AB
Tiny machine learning, or TinyML, is altering the shape and nature of the machine learning landscape.
During the last two decades we have seen a boom in machine learning like never before, although as a technology machine learning is much older than that. More recently, major research milestones such as long short-term memory (LSTM) networks, ImageNet, and the introduction of GPUs have made machine learning a feasible option for many problems. Faster internet connections and ever larger memory devices (both for storage and RAM) have also made the data needed to train machine learning models more available. Companies like Google, Amazon, and Facebook have in many cases opened up technology originally developed for in-house use, and are now driving the development of new machine learning algorithms in many different ways. Cloud providers also offer machine learning environments, making it easy for organisations both to get started with the technology and to scale up when needed.
In parallel with this we have seen a growth spurt in the Internet of Things, with computing power becoming cheaper and less power-hungry, so that it can now be added to a wide range of things to make them smarter. In the IoT boom we have also seen how sensors of all kinds can be used to monitor a diverse range of conditions: for example the environment, our own health, or the device itself.
The standard way to handle all this data has been to send it to the cloud for processing. However, because bandwidth is normally expensive or limited, and sensors can generate a lot of data in a short time, most of this information is simply discarded before it can be analysed. But if machine learning analysis can be applied locally on the IoT devices themselves, these losses can be avoided, and new possibilities open up.
These two technologies are now combining into what is called tiny machine learning (TinyML): an environment in which the processing power is sufficient to run machine learning models even in small, power-constrained applications, together with direct access to sensor data. On the software side, improvements in machine learning models have not only extended their capabilities, but also made them more efficient when applied to the simpler tasks more often associated with IoT devices.
The algorithms used in tiny machine learning are in essence much the same as those in traditional ML operations, with initial model training typically occurring on a local computer, or in the cloud. After this initial training, the model is condensed to produce a more compact package, in a process called deep compression. Two techniques often employed at this stage are pruning and knowledge distillation.
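To make pruning concrete, the simplest variant is magnitude-based pruning: weights whose absolute value falls below a threshold are set to zero, leaving a sparse model that compresses well. The sketch below is a plain-Python illustration of the idea (the weight values are made up; real frameworks prune whole tensors and usually fine-tune the model afterwards to recover accuracy):

```python
def prune_by_magnitude(weights, sparsity=0.5):
    """Zero out the smallest-magnitude weights until the requested
    fraction (sparsity) of them is zero."""
    n_prune = int(len(weights) * sparsity)
    # The magnitude below which weights are considered unimportant.
    threshold = sorted(abs(w) for w in weights)[n_prune]
    return [0.0 if abs(w) < threshold else w for w in weights]

weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = prune_by_magnitude(weights, sparsity=0.5)
print(pruned)  # [0.8, 0.0, 0.3, 0.0, -0.6, 0.0]
```

The zeroed weights need not be stored explicitly, which is what makes the pruned model smaller once it is encoded in a sparse format.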
Once this distillation is complete, the model is quantised to reduce its storage footprint and to convert it to a format compatible with the connected device. Encoding may also be applied if it is necessary to reduce the size of the model further.
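The core idea of quantisation can be shown in a few lines: map the model's 32-bit floating-point weights onto 8-bit integers using a scale and a zero-point, cutting storage to roughly a quarter. The sketch below is a simplified, plain-Python version of this affine scheme; real toolchains such as TF Lite apply it per tensor or per channel:

```python
def quantize_int8(values):
    """Affine-quantise a list of floats to int8 via a scale and zero-point."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255.0               # one int8 step in float units
    zero_point = round(-128 - lo / scale)   # int8 value that represents float 0.0
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from the int8 representation."""
    return [(qi - zero_point) * scale for qi in q]

weights = [-1.0, -0.2, 0.0, 0.5, 1.5]
q, scale, zp = quantize_int8(weights)
restored = dequantize(q, scale, zp)
# Each restored value differs from the original by at most one step (scale).
```

The accuracy cost is bounded by the step size, which is why quantisation usually loses very little model quality while saving 75% of the weight storage.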
The model is then converted into a format which can be interpreted by a light neural network interpreter, such as TensorFlow Lite (TF Lite).
TensorFlow, by Google, is one of the most popular machine learning libraries. In 2017, TensorFlow Lite was released, targeting mobile applications, and TensorFlow Lite Micro (released in 2019) targets even smaller microcontroller applications. These two platforms have made the process of shrinking a model to fit embedded devices a lot easier. It is now possible to develop and train machine learning models on high-performance desktop or cloud machines, then deploy them on embedded platforms while still using the same API.
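As a minimal sketch of that workflow, the snippet below exports a Keras model through the TF Lite converter. The tiny two-layer network here is only a stand-in for whatever model was actually trained in the cloud, and `Optimize.DEFAULT` enables post-training quantisation:

```python
import tensorflow as tf

# Stand-in for a model trained on a desktop or cloud machine.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # post-training quantisation
tflite_model = converter.convert()  # a bytes object in the .tflite flat format

# The serialized model can now be shipped to the device.
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```

The resulting `.tflite` file is what the TF Lite (or TF Lite Micro) interpreter loads on the device, so the same trained model moves unchanged from the cloud to the embedded target.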
Edge Processing For TinyML
As IoT devices and applications become more integrated with mission- and business-critical use cases, the response time of information processing at data centres or in the cloud will not be quick enough. There may also be situations where hundreds of IoT sensors need to connect to the cloud simultaneously, creating network congestion.
Processing the data at the edge brings several benefits. From the point of view of privacy and data protection laws, auditing and compliance are much easier to handle when the data never leaves the device. Securing the information is also easier, because it can be very short-lived, consumed as soon as it is read from the sensor.
Guaranteeing Energy Efficiency For Tiny Machine Learning
In many cases, processing data locally at the network edge consumes far less energy than transmitting it to the cloud or a data centre, so battery life can improve. Some TinyML devices are capable of operating continuously for a year on a battery the size of a coin. This opens up options for remote environmental monitoring in areas like agriculture, weather prediction, or the study of earthquakes.
Network latency is also reduced when the data does not have to be transmitted back and forth to the cloud. For example, augmented reality is data-intensive, and any delay in video processing is very noticeable.
Looking to the future, the cost in power of CPU and memory is expected to continue to fall, but not that of radio transmission, where we seem to be approaching the physical limit of how much data per Wh we can send. This will only strengthen the case for TinyML: we will likely see ultra-low-power ML devices running for years on small coin-cell batteries, transmitting data to the cloud only when anomalies are detected. We are also beginning to see microprocessors designed specifically for machine learning applications, like the Syntiant NDP100, with a footprint of only 1.4 x 1.8 mm and power consumption of less than 140 μW while performing voice recognition. Another example is the Edge TPU by Google, an ASIC designed to run ML models while consuming only a few watts of power.