Technology
Intel boosts SBC, DSP systems
Latest Intel processors advance embedded DSP and SBC system design
By Ian Stalker and Alan Baldus
Intel’s latest generation of Core i7 processors is a game-changer for embedded military DSP and SBC system designs. For the first time, they bring support for Serial RapidIO, the OpenVPX (VITA 65) board interconnect fabric of choice. Even better, the performance of vector math processing, critical for DSP applications, is effectively doubled with Intel’s new 256-bit AVX instruction set.
The key performance metric for military DSP systems is the speed of floating-point arithmetic, typically expressed in GFLOPS (billions of floating-point operations per second). In recent history, these DSP systems were commonly built using Texas Instruments 320C40 and 320C6701k and Analog Devices SHARC dedicated DSP processors, which were themselves followed by several generations of PowerPC/Power Architecture processors with AltiVec. All of these processors offered good floating-point performance per watt, and all were available from vendors with a track record of supporting military embedded customers. Now, with the introduction of Intel’s 2nd Generation Core i7-2715QE quad-core processor, the design of x86-based embedded military DSP systems and high-performance SBCs takes a significant leap forward.
Intel refers to its product introduction cadence as the “Tick-Tock” model. A “tick” is when Intel delivers new silicon process technology with increased transistor density, bringing enhanced performance and energy efficiency in a smaller version of an existing microarchitecture. The 2nd Generation Intel Core i7 is a “tock,” which is when an entirely new microarchitecture is introduced on an existing semiconductor process technology. Using the 32 nm process introduced with the Westmere generation, the 2nd Generation Core i7 (previously code-named “Sandy Bridge”) features many architectural improvements, especially in the cache subsystem, that lead to improved performance per clock cycle. It is the nature of microprocessor design that revised architectures typically provide incremental performance improvements. However, the 2nd Generation Core i7 has delivered a major leap forward in the signal processing capability of the processor, thanks to the new 256-bit wide Intel Advanced Vector Extensions (AVX) floating-point instruction set, which supersedes the earlier 128-bit Streaming SIMD Extensions (SSE) instructions.
While the new Core i7 brings many advantages for DSP system designs, SBCs used in conjunction with Core i7-based DSP engines also benefit. SBCs can now take advantage of the first-ever support for Serial RapidIO on Intel Architecture, the result of an upcoming PCIe2-to-Serial RapidIO2 bridge chip from IDT that will provide a common communications path and improve interoperability in a complete system. The new Intel processor also supports 16 lanes of Gen2 PCIe for full-bandwidth communications across high-performance processor cards. Intel’s hyperthreading technology runs two execution threads on each core, enabling greater utilization of the execution units and providing improved power efficiency. Published reports show performance increases of 7 to 34 percent due to hyperthreading alone.
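To put the 16 lanes of Gen2 PCIe in perspective, the following is a rough back-of-the-envelope sketch rather than a measured figure. It assumes the standard PCIe Gen2 signalling rate of 5 GT/s per lane and 8b/10b line encoding; neither number appears in the article, and the function name is purely illustrative. Packet and protocol overhead are ignored, so real throughput will be lower.

```python
# Rough PCIe Gen2 bandwidth arithmetic (a sketch, not a measured figure).
GEN2_TRANSFER_RATE_GT_S = 5.0   # assumption: PCIe Gen2 signalling rate per lane
ENCODING_EFFICIENCY = 8 / 10    # assumption: Gen2 uses 8b/10b line encoding
LANES = 16                      # lane count quoted in the article


def pcie_gen2_bandwidth_gbytes(lanes: int) -> float:
    """Raw one-direction bandwidth in GB/s, before protocol overhead."""
    gbits_per_lane = GEN2_TRANSFER_RATE_GT_S * ENCODING_EFFICIENCY
    return lanes * gbits_per_lane / 8  # convert bits to bytes


if __name__ == "__main__":
    total = pcie_gen2_bandwidth_gbytes(LANES)
    print(f"x{LANES} Gen2 link: about {total:.1f} GB/s per direction")
```

Under these assumptions each lane carries roughly 0.5 GB/s in each direction, so the full x16 interface works out to about 8 GB/s per direction.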
The AVX 256-bit difference
Prior to the introduction of Intel’s new 256-bit AVX, developers of military DSP systems typically turned to 128-bit AltiVec-enabled CPUs for vectorized signal processing functions. In the past few years, development of new AltiVec-enabled processors slowed significantly, leaving DSP system developers with limited options. In the meantime, Intel continued to invest in its own high-performance vectorized processing solution with continual enhancements to Intel Streaming SIMD Extensions (SSE), the 128-bit wide predecessor to AVX, capable of operating on four 32-bit floating-point values simultaneously. Intel SSE also featured support for double-precision floating point, a feature not available in AltiVec. In Intel’s earlier multicore processors, each core was provided with its own SSE unit, so raw floating-point performance scaled with the number of cores. In the new Core i7, Intel has upgraded SSE with AVX, doubling the vector width to 256 bits.
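The effect of doubling the register width can be illustrated with simple arithmetic. The sketch below is not AVX code; it only counts how many floating-point values fit in one vector register and how many vector operations are needed to cover an array. The helper names and the 1,024-element array size are hypothetical, chosen for illustration.

```python
import math

FLOAT32_BITS = 32   # single-precision element size
FLOAT64_BITS = 64   # double-precision element size


def lanes(register_bits: int, element_bits: int = FLOAT32_BITS) -> int:
    """Number of elements that fit side by side in one vector register."""
    return register_bits // element_bits


def vector_ops_needed(n_elements: int, register_bits: int,
                      element_bits: int = FLOAT32_BITS) -> int:
    """Vector operations required to touch every element once."""
    return math.ceil(n_elements / lanes(register_bits, element_bits))


if __name__ == "__main__":
    n = 1024  # hypothetical array length
    for name, width in (("SSE", 128), ("AVX", 256)):
        print(f"{name}: {lanes(width)} floats per register, "
              f"{vector_ops_needed(n, width)} vector ops for {n} floats")
```

A 128-bit register holds four single-precision values and a 256-bit register holds eight, so the same array is covered in half as many vector operations. For double precision the counts are two and four.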
28 VME and Critical Systems / Spring 2011
This doubled vector processing performance is a significant milestone in DSP system design. DSP algorithms used in critical military applications such as radar, SIGINT, and image processing depend on the precision of floating-point numbers combined with the speed of processing. With AVX, the new Core i7 doubles the peak floating-point throughput available with SSE. In actual FFT kernels, AVX has been benchmarked at up to 1.8x faster than SSE (Figure 1). The AVX instruction set was also designed to support future extensions, which hints at wider implementations to come.
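The relationship between the theoretical doubling and the benchmarked 1.8x can be made concrete with a short calculation. This is a sketch under stated assumptions: the 2.1 GHz clock and the per-cycle throughput figures (one vector add plus one vector multiply issued per cycle) are not given in the article and are used here only for illustration. The core count and the 1.8x figure come from the text.

```python
CORES = 4                    # quad-core, per the article
CLOCK_GHZ = 2.1              # ASSUMPTION: base clock, not stated in the article
SSE_FLOPS_PER_CYCLE = 8      # ASSUMPTION: 4-wide add + 4-wide multiply per cycle
AVX_FLOPS_PER_CYCLE = 16     # ASSUMPTION: 8-wide add + 8-wide multiply per cycle
MEASURED_FFT_SPEEDUP = 1.8   # "up to 1.8x" FFT figure quoted in the article


def peak_gflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical single-precision peak: cores x GHz x flops per cycle."""
    return cores * clock_ghz * flops_per_cycle


def fraction_of_ideal(measured_speedup: float, ideal_speedup: float) -> float:
    """How much of the theoretical speedup a benchmark actually realizes."""
    return measured_speedup / ideal_speedup


if __name__ == "__main__":
    sse_peak = peak_gflops(CORES, CLOCK_GHZ, SSE_FLOPS_PER_CYCLE)
    avx_peak = peak_gflops(CORES, CLOCK_GHZ, AVX_FLOPS_PER_CYCLE)
    ideal = avx_peak / sse_peak
    print(f"SSE peak: {sse_peak:.1f} GFLOPS, AVX peak: {avx_peak:.1f} GFLOPS")
    print(f"Ideal speedup: {ideal:.1f}x, FFT benchmark realizes "
          f"{fraction_of_ideal(MEASURED_FFT_SPEEDUP, ideal):.0%} of it")
```

Whatever clock rate is assumed, the ratio of the two peaks is 2x, so a 1.8x FFT result corresponds to about 90 percent of the ideal gain. Real kernels rarely reach the full factor because memory traffic and data shuffling do not scale with register width.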
Serial RapidIO onboard
Serial RapidIO is the preferred fabric for the types of processor-to-processor communications required by demanding military DSP systems, because of its reliable packet transmission and its ability to deliver low and predictable latencies. These benefits of RapidIO messaging are ideal for the large peer-to-peer clusters of processors typically used in complex signal processing applications. With the Intel 2nd Generation Core i7, Serial RapidIO is supported on Intel architecture-based OpenVPX/VITA 65 embedded boards for the first time, with an easy, cost-effective interconnect provided by IDT’s upcoming PCI Express (PCIe) Gen2-to-Serial RapidIO protocol conversion bridge.
Before this newest generation of Core i7, the lack of Serial RapidIO support on Intel platforms severely limited the viability of using Intel architecture in DSP multiprocessor system designs. Solutions for Intel have included support for fabrics such as InfiniBand and Gigabit/10 Gigabit Ethernet, which are not embraced in military applications because of their non-industrial temperature silicon and relatively high power consumption. For SBCs, where the requirement is typically a single processor communicating with I/O, these fabrics have been sufficient,