To exploit sparse neural networks efficiently in hardware, a specialised sparse architecture is needed, plus an encoder and decoder for the operations, which most chips don't have. Binary and ternary networks are extreme optimisations, reducing every mathematical operation to a bit manipulation. Most AI chips and GPUs only have 8-bit, 16-bit or floating-point calculation units, so there are no performance or power-efficiency gains to be had from such extremely low precisions.
The MLPerf Inference v0.5 results, published at the end of 2019, confirmed all these challenges. Looking at the results for Nvidia's flagship T4, it achieves efficiency as low as 13%. This means that, whilst Nvidia claims 130TOPS of peak performance for the T4 card, a real-life AI model such as SSD MobileNet-v1 can utilise only 16.9TOPS of the hardware. Vendor TOPS numbers used for chip promotion are therefore not meaningful metrics.
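To make the efficiency figure concrete, the short Python sketch below simply reproduces the arithmetic behind the 13% number from the peak and achieved throughputs quoted above; no new measurements are assumed.

# Utilisation = achieved throughput / vendor-claimed peak throughput.
peak_tops = 130.0       # peak INT8 throughput claimed for the Nvidia T4
achieved_tops = 16.9    # MLPerf Inference v0.5 result for SSD MobileNet-v1

utilisation = achieved_tops / peak_tops
print(f"Usable fraction of peak: {utilisation:.1%}")  # prints "Usable fraction of peak: 13.0%"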
Track record
Whilst some chips may be very good at AI inference acceleration, they almost always accelerate only a portion of the application. In one smart-retail example, pre-processing includes multi-stream video decode, followed by conventional computer-vision algorithms to resize, reshape and format-convert the videos. Post-processing includes object tracking and database look-up. The end customer rarely cares about how fast the AI inference is running, but rather about whether it can meet the video-stream performance and/or real-time responsiveness of the full application pipeline, as the sketch after this paragraph illustrates. Most chips struggle to accelerate the whole application, which requires not only individual workload acceleration but also system-level dataflow optimisation.
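As a rough illustration of why whole-pipeline performance matters more than inference speed alone, the sketch below models a video pipeline as a chain of stages and shows that the end-to-end frame rate is set by the slowest stage. The stage names and timings are hypothetical, chosen only to mirror the smart-retail example above.

# End-to-end throughput of a streaming pipeline is limited by its slowest stage.
# Stage latencies (ms per frame) are illustrative, not measured values.
stages_ms = {
    "video decode": 8.0,
    "resize/reshape/format convert": 6.0,
    "AI inference": 4.0,
    "object tracking": 10.0,
    "database look-up": 5.0,
}

bottleneck, worst_ms = max(stages_ms.items(), key=lambda kv: kv[1])
fps = 1000.0 / worst_ms  # assuming one frame in flight per pipelined stage
print(f"Bottleneck: {bottleneck} ({worst_ms} ms) -> {fps:.0f} frames/s")

# Accelerating only the inference stage changes nothing here:
stages_ms["AI inference"] = 1.0
fps_after = 1000.0 / max(stages_ms.values())
print(f"After a 4x faster inference engine: {fps_after:.0f} frames/s (unchanged)")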
Finally, in real production settings such as automotive, industrial automation and medical, it's critical to have a chip that's functional-safety certified, with guaranteed longevity and strong security and authentication features. Again, many
emerging AI chips and GPUs lack such a track record. However, Xilinx is rising to these AI productisation
challenges. Our devices have up to 8x the internal memory of state-of-the-art GPUs, and the memory hierarchy is user-customisable, which is critical for achieving "usable" hardware TOPS on modern networks with layers such as depthwise convolution. User-programmable FPGA logic allows a custom layer to be implemented most efficiently, keeping it from becoming a system bottleneck. For sparse neural networks, Xilinx devices have long been used in sparse-matrix-based signal-processing applications such as communications, and users can design specialised encoders, decoders and sparse-matrix engines in the FPGA fabric. Lastly, for binary and ternary operations, Xilinx FPGAs use look-up tables (LUTs) to implement bit-level manipulation, delivering close to 1PetaOPS, or 1,000TOPS, when using binary instead of 8-bit integer arithmetic.
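To give a flavour of the bit-level arithmetic referred to above, the sketch below computes a binarised dot product with XNOR and popcount, which is the operation that LUT fabric maps onto very cheaply. It is a conceptual illustration in plain Python, not Xilinx code or an actual LUT configuration.

# Binary-network dot product: elements in {-1, +1} are packed as bits (1 -> +1, 0 -> -1);
# the dot product then reduces to XNOR followed by a popcount.
def binary_dot(a_bits: int, w_bits: int, n: int) -> int:
    """Dot product of two n-element {-1, +1} vectors packed into integers."""
    xnor = ~(a_bits ^ w_bits) & ((1 << n) - 1)   # 1 wherever the signs agree
    matches = bin(xnor).count("1")               # popcount
    return 2 * matches - n                       # +1 per match, -1 per mismatch

# Example (bits written MSB-first): a = [+1, -1, +1, +1], w = [+1, +1, -1, +1]
a = 0b1011
w = 0b1101
print(binary_dot(a, w, 4))   # prints 0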
Industry-approved
In regard to whole-application acceleration, Xilinx's Vitis has already been adopted in production across industries to accelerate non-deep-learning workloads, including sensor fusion, conventional computer vision, DSP algorithms, path planning and motor control. Xilinx now has over 900 hardware-accelerated libraries published under the Vitis brand, enabling significant speed-ups in typical workloads. Xilinx is known for the quality of its devices, confirmed by their adoption in safety-critical environments such as space, automotive, industrial automation and surgery-assistant robots.
Xilinx's new unified software platform, Vitis, combines AI with software development, enabling developers to accelerate their applications on heterogeneous platforms and target applications from cloud computing to embedded end points. Vitis plugs into standard environments, uses open-source technology, and is free. Within Vitis, the AI component provides tools to optimise, quantise and compile trained models, and delivers specialised APIs for applications from the edge to the cloud, all with best-in-class inference performance and efficiency.
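As a rough illustration of what the "quantise" step does conceptually, the sketch below applies simple symmetric post-training quantisation to a weight tensor with NumPy. This is not the Vitis AI tool flow or API, whose calibration and optimisation are considerably more sophisticated; it only shows the basic idea of mapping floating-point weights to INT8.

import numpy as np

def quantise_int8(weights: np.ndarray):
    """Symmetric per-tensor post-training quantisation to INT8 (illustrative only)."""
    scale = np.max(np.abs(weights)) / 127.0          # map the largest magnitude to 127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

# Round-trip a random weight tensor and inspect the quantisation error.
w = np.random.randn(64, 64).astype(np.float32)
q, scale = quantise_int8(w)
w_hat = q.astype(np.float32) * scale
print("max abs error:", np.max(np.abs(w - w_hat)))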