Feature: Embedded design
To simplify this process, Arm provides the Synchronous Data Streaming (SDS) Framework, an open-source, free-to-use framework that enables embedded devices to record and replay sensor data in a structured format; see Figures 4 and 5. Key features include support for multiple sensor types (accelerometers, gyroscopes, audio, video, etc.), recording to embedded file systems, USB storage, serial links, or TCP/IP to cloud servers, data replay for testing and simulation, and portability across all Cortex-M based microcontrollers. The SDS framework can run as a dedicated ML data logger or be integrated into a full application, allowing extended capture of fresh data once the system is initially deployed.
Low-level hardware access is provided through CMSIS-Driver, a standardised driver interface that simplifies portability across different microcontrollers. Within SDS, a specialised CMSIS driver called vStream manages sensor data streams. This driver abstracts common embedded sensors and provides a consistent interface for recording datasets, enabling developers to capture training data directly from the target hardware and ensuring the dataset accurately reflects real-world operating conditions.
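Once a stream has been recorded on the device, it typically needs to be decoded on the host before it can be used for training. As a minimal sketch, the reader below assumes a simple hypothetical record layout (a 32-bit millisecond timestamp followed by three float32 accelerometer samples); the actual SDS file format is defined by the framework and may differ.

```python
# Hypothetical reader for a recorded 3-axis accelerometer stream.
# Record layout (an assumption for illustration, NOT the official SDS format):
#   uint32 timestamp_ms, followed by three float32 samples (x, y, z).
import struct
from typing import Iterator, Tuple

RECORD = struct.Struct("<Ifff")  # little-endian: uint32 + 3x float32

def read_records(path: str) -> Iterator[Tuple[int, float, float, float]]:
    """Yield (timestamp_ms, x, y, z) tuples from a raw recording file."""
    with open(path, "rb") as f:
        while chunk := f.read(RECORD.size):
            if len(chunk) < RECORD.size:
                break  # ignore a truncated trailing record
            yield RECORD.unpack(chunk)
```

A decoder of this shape is the natural bridge between on-device capture and the Python data-analysis stage discussed later.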
Figure 6: The Arm Vela compiler is an additional step in the design workflow that readies the model to run on the Ethos NPU
This means engineers can design and train ML models without needing specialised local hardware.
Visual Studio Code
Another essential tool is Visual Studio Code (VS Code), which has become a widely adopted development environment for embedded toolchains. In addition to supporting traditional C and C++ development, it provides powerful extensions for working with Jupyter notebooks. A recently released Google Colab extension allows VS Code to edit notebooks locally whilst executing code in the cloud environment. This hybrid workflow provides the best of both worlds: local editing and project integration, high-performance cloud computing for training models, and a local alternative to Jupyter Labs.
Data acquisition: The hidden challenge
A common misconception is that ML projects begin with designing neural networks. In reality, data acquisition is often the most time-consuming and critical stage. If you are designing a custom embedded system, it is unlikely that a suitable public dataset already exists. Instead, engineers must collect data from the sensors used in the final design. These could be accelerometers and gyroscopes for activity recognition, microphones for voice detection, cameras for visual classification, or environmental sensors for anomaly detection. The quality of the dataset directly determines the quality of the trained model.
Early testing with Arm Virtual Hardware
Even with a good dataset, testing machine learning algorithms on physical hardware can be inconvenient in the early development stages. Hardware may not yet exist, or access may be limited. An effective solution is Arm Virtual Hardware (AVH), which provides a simulation of each Cortex-M processor, including CPU execution, virtual peripheral interfaces and Ethos NPU simulation. The AVH simulation creates a generalised microcontroller platform, allowing engineers to develop and test their application before migrating it onto real hardware. One particularly useful feature of this system is the ability to
stream previously recorded sensor data into the simulated system using a vStream driver. Results from the ML model can then be streamed through a second driver and recorded on a local hard drive. With this approach, engineers can record sensor data with the SDS framework, feed that data into the AVH simulation, execute the ML model, and record and analyse the model output. This creates a fast design-test loop that accelerates algorithm experimentation. The same system may also be integrated into continuous integration pipelines, ensuring that new code changes do not degrade model performance.
Once data is captured, the next step is cleaning, analysing and visualising it. Python has become the dominant language for this stage of the workflow, largely due to its extensive ecosystem of scientific libraries. Three libraries are particularly important. NumPy provides high-performance numerical arrays and mathematical operations, and forms the foundation for many other ML tools. Pandas builds on NumPy to provide powerful tools for working with structured datasets, supporting data cleaning, filtering, aggregation and time-series analysis; for sensor data streams, Pandas is invaluable.
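The continuous-integration check described above can be sketched in a few lines: a previously validated ("golden") run of the model on the replayed sensor data is compared against the output of the current build, and the pipeline fails if the two diverge. The function name and tolerance below are illustrative assumptions, not part of the AVH tooling.

```python
# Sketch of the "record -> replay -> compare" regression check.
# The golden/candidate arrays would come from two AVH runs of the model
# on the same replayed sensor recording; the tolerance is an assumption.
import numpy as np

def outputs_match(golden: np.ndarray, candidate: np.ndarray,
                  tolerance: float = 1e-3) -> bool:
    """Return True if the replayed model outputs stay within tolerance
    of a previously validated ("golden") run."""
    if golden.shape != candidate.shape:
        return False
    return bool(np.max(np.abs(golden - candidate)) <= tolerance)
```

In a CI pipeline, a failing comparison would mark the build as broken, catching regressions in model accuracy before they reach hardware.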
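The kind of cleaning and time-series work Pandas excels at can be illustrated with a short example. The column names and the 100 Hz sample rate below are assumptions for the sketch; real recordings would use whatever layout the capture tooling produces.

```python
# Illustrative clean-up of a 3-axis accelerometer log with Pandas/NumPy.
# Assumed input columns: timestamp_ms, x, y, z (names are hypothetical).
import numpy as np
import pandas as pd

def clean_accel_log(df: pd.DataFrame) -> pd.DataFrame:
    """Index by time, regularise the sample grid, and add a magnitude column."""
    df = df.copy()
    df["t"] = pd.to_timedelta(df["timestamp_ms"], unit="ms")
    df = df.set_index("t").sort_index()
    # Resample to a uniform 10 ms (100 Hz) grid and bridge short dropouts
    df = df[["x", "y", "z"]].resample("10ms").mean().interpolate(limit=5)
    df["magnitude"] = np.sqrt(df["x"]**2 + df["y"]**2 + df["z"]**2)
    return df
```

The resulting uniformly sampled frame is ready for windowing, feature extraction or plotting, which is where this stage of the workflow leads next.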
www.electronicsworld.co.uk May 2026 19