HPC 2014-15 | Processors
will be integrated in future Xeon and Xeon Phi processors. The benefits of increased integration (e.g. faster communication, lower power use) are clear, but some HPC users are concerned that they may be locked into an Intel-only architecture.
What about Nvidia? Nvidia is the current leader in accelerated computing, with its HPC business having grown by 40 per cent in 2013. Hundreds of universities teach CUDA (Nvidia’s HPC programming language for GPUs), and there are hundreds of thousands of CUDA
developers, while OpenACC and OpenCL provide alternative programming approaches for GPUs. Nvidia is also building ARM SoCs that include an integrated GPU, and is working in the OpenPOWER consortium to develop more powerful chips for analytics incorporating OpenPOWER processors and Nvidia GPUs. In addition, most vendors planning to deploy ARM in HPC are collaborating with Nvidia.
What about ARM? For ARM chips to make a breakthrough in HPC it is important that they deliver a significant advantage over Intel offerings in terms of performance per watt, while also delivering on a number of key issues such as 64-bit support and strong floating-point performance. A number of products and initiatives are now working towards these goals, including Applied Micro’s X-Gene family, HP Moonshot and Nvidia’s Tegra product family (incorporating ARM processors and an Nvidia GPU).
What about IBM? IBM has had good success in HPC with its POWER servers and power-efficient BlueGene family. In order to meet the needs of future generations of HPC users, IBM has opened up the POWER architecture by licensing it, and related technologies, to members of the OpenPOWER consortium. This move brings a wide range of skills to the POWER architecture, and will result in new processor variants being developed, beyond those that would have fed the interests of IBM on its own. Members of the consortium with a special interest in HPC include Mellanox and Nvidia, while Altera also sees this as an opportunity to expand the use of FPGAs in HPC.
A wafer with Intel Xeon processor E7 v2 family chips, each made of 4.3 billion 22nm transistors
Are vector processors dead? Industry trends may indicate that future HPC processors are more likely to be commodity, mobile or embedded components, but NEC believes that there is a future for modern variants of vector processors. The SX-ACE is a modern implementation of NEC’s vector architecture that delivers an order of magnitude better power efficiency than the previous SX-9 system. One of the key features of a vector system is fast access to data, and the SX-ACE offers 64 GB/s of memory bandwidth per core, which is well balanced against the 64 Gflop/s performance of each vector processing core.
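To make the 'well balanced' claim concrete, the ratio of memory bandwidth to peak compute can be worked out directly. The short Python sketch below is illustrative only: the function name `machine_balance` is an assumption introduced here, and the only inputs are the two per-core SX-ACE figures quoted above.

```python
def machine_balance(bandwidth_gb_per_s: float, peak_gflops: float) -> float:
    """Return the bytes of memory traffic available per floating-point operation.

    Both arguments are per-core figures; the name and signature are
    hypothetical, chosen for this illustration.
    """
    if peak_gflops <= 0:
        raise ValueError("peak_gflops must be positive")
    return bandwidth_gb_per_s / peak_gflops


# Per-core figures quoted in the text for the SX-ACE.
sx_ace_balance = machine_balance(bandwidth_gb_per_s=64.0, peak_gflops=64.0)
print(f"SX-ACE core: {sx_ace_balance:.2f} bytes per flop")
```

A ratio of one byte per flop means that each core can, at peak, fetch one byte from memory for every floating-point operation it performs.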
What about accelerators? Accelerators have been used for number crunching since the early days of HPC, but only by a very small percentage of users. The first accelerators included array processors from FPS, Numerix, and CSPI, while more recent devices included IBM’s Cell processor and the ClearSpeed CSX600. There are now two dominant technologies in this growing
segment, which is driven by the need for increased compute power and a reduced electricity bill. GPUs have seen growing use in recent years as compute accelerators, with ease of use improving through software tools such as CUDA, OpenACC, and OpenCL. Nvidia is the dominant GPU supplier to the HPC industry. The new kid on the block, ironically, is Intel, whose Xeon Phi Many Integrated Core architecture has a lower cost of entry for porting applications (although tuning for optimal performance is not dissimilar to tuning for GPUs). The next generation of the Xeon Phi family will be self-hosting, so the term ‘accelerator’ may no longer be appropriate.
What about more radical approaches? It may take a breakthrough using radical technology to deliver the low-power, high-performance processors required by future generations of HPC systems. There are several potential technologies that are currently being tried and tested. Adapteva’s Epiphany multicore architecture
offers up to 4,096 cores per chip, delivering 5.6 Gflop/s (single precision) at an energy efficiency of 70 Gflop/s per watt. On the Green500 (a variant of the TOP500 list that is focused on the most power efficient HPC systems), the top