Feature: Embedded AI
Engineers turn to several strategies that balance performance with reliability. Determinism is enforced by freezing trained models (locking weights to prevent runtime drift) and using static memory allocation to eliminate timing variability, ensuring predictable, real-time responses. Certifiability is tackled through hybrid architectures, pairing neural networks with rule-based guardrails, and through formal verification methods that mathematically bound model behaviour, enabling compliance with standards like ISO 26262 or FDA guidelines. For resilience against adversarial conditions, models are hardened via adversarial training (exposing them to perturbed data during development) and input sanitisation techniques (e.g., noise filtering), while runtime monitors track anomalies such as sudden confidence drops to flag potential attacks. Secure update protocols (e.g., cryptographically signed OTA patches) and redundancy (e.g., voting across multiple models) further mitigate risks, creating layered defences that align AI/ML flexibility with embedded systems’ rigid safety and security requirements.
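As a rough illustration of the runtime monitoring mentioned above, the sketch below flags a sudden drop in prediction confidence against a recent baseline. It is a minimal Python example; the class name, window length and threshold are invented for illustration and do not come from any particular framework.

```python
# Minimal sketch of a runtime confidence monitor that flags sudden drops
# against a short moving baseline. All names and thresholds are illustrative.
from collections import deque

class ConfidenceMonitor:
    def __init__(self, window=20, drop_threshold=0.3):
        self.history = deque(maxlen=window)   # recent top-class confidences
        self.drop_threshold = drop_threshold  # drop below baseline treated as anomalous

    def check(self, confidence):
        """Return True if this inference should be flagged for fallback handling."""
        flagged = False
        if len(self.history) >= 3:            # wait for a minimal baseline
            baseline = sum(self.history) / len(self.history)
            flagged = (baseline - confidence) > self.drop_threshold
        self.history.append(confidence)
        return flagged

monitor = ConfidenceMonitor()
for conf in (0.91, 0.93, 0.90, 0.35):         # sudden drop on the last inference
    if monitor.check(conf):
        print(f"anomaly flagged at confidence {conf:.2f}")
```

In a deployed system such a flag would typically hand control to a rule-based fallback or a degraded safe mode rather than trusting the model’s output.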
Advancements in AI/ML for embedded systems
Techniques like model compression (pruning, quantisation, etc.) help to shrink AI/ML models without sacrificing accuracy. Coupled with specialised hardware like neuromorphic chips and ultra-efficient AI accelerators, these innovations are transforming embedded systems. Pruning and quantisation are
two of the most widely adopted and effective methods of neural network or AI model compression. Pruning strips away redundant or non-critical components from a neural network. For example, a model trained to recognise objects might initially have 10,000 connections between its artificial neurons. By analysing which connections contribute least to accurate predictions (e.g., those with near-zero weights), engineers can remove 40% of them, leaving a model that uses only 60% of the original computational resources while retaining 95% of its performance.
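To make the idea concrete, here is a minimal NumPy sketch of magnitude-based pruning, in which the weights with the smallest absolute values are zeroed out. The layer size, the 40% ratio and the random weights are illustrative only, chosen to mirror the figures in the example above.

```python
# Minimal sketch of magnitude-based weight pruning with NumPy.
# Layer size, 40% pruning ratio and random weights are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(100, 100))         # one dense layer, 10,000 connections

prune_ratio = 0.40                            # remove the 40% smallest-magnitude weights
threshold = np.quantile(np.abs(weights), prune_ratio)
mask = np.abs(weights) >= threshold           # keep only the larger-magnitude connections
pruned = weights * mask

print(f"connections kept: {mask.mean():.0%}") # roughly 60% of the original connections
```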
Quantisation addresses a fundamental challenge in embedded AI: the inefficiency of high-precision calculations. Neural networks typically represent parameters (weights and activations) as 32-bit floating-point numbers, a format more suited to supercomputer analysis than to a smartwatch. Quantisation simplifies this by shrinking the “format” of the numbers the model uses. For instance, converting 32-bit values to 8-bit integers reduces memory usage by 75% and accelerates computation, as smaller numbers require fewer processing cycles. This precision trade-off is carefully calibrated, like compressing a high-resolution photo to a smaller file while retaining enough detail to recognise faces.
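A minimal sketch of the idea, again in NumPy, uses a simple affine (scale plus zero-point) mapping from float32 to int8; this is a simplified illustration of the trade-off, not the quantisation scheme of any particular toolchain.

```python
# Minimal sketch of affine (scale + zero-point) quantisation of float32 weights
# to int8, purely to illustrate the size/precision trade-off described above.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(scale=0.1, size=1000).astype(np.float32)

# Map the observed float range onto the int8 range [-128, 127].
w_min, w_max = weights.min(), weights.max()
scale = (w_max - w_min) / 255.0
zero_point = np.round(-128 - w_min / scale)

quantised = np.clip(np.round(weights / scale + zero_point), -128, 127).astype(np.int8)
dequantised = (quantised.astype(np.float32) - zero_point) * scale

print(f"storage: {weights.nbytes} B float32 -> {quantised.nbytes} B int8")  # 75% smaller
print(f"max round-trip error: {np.abs(weights - dequantised).max():.5f}")   # precision given up
```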
In addition to model compression methods, running AI in embedded systems is further helped by hardware advancements such as graphics processing units (GPUs), tensor processing units (TPUs) and neural processing units (NPUs). General-purpose CPUs, while versatile, struggle to handle the intense computational demands of modern AI models on energy-constrained devices. This gap
Figure 1: Neural network model pruning
Figure 2: Neural network model quantisation