HPC_YEARBOOK

HPC 2015-16 | Processors

should be developed through a co-design process that balances classical computational speed and data-centric memory and communications architectures to deliver performance at the one-to-ten Exaflop level, with addressable memory in the Exabyte range.’ To further these objectives it recommended setting up a programme to be managed by the National Nuclear Security Administration (NNSA) and the Office of Science (both part of the DOE). The DOE’s FastForward2 project aims to accomplish this through public-private partnerships to accelerate the development of critical component technologies needed for extreme-scale computing. The project awarded $100 million in contracts to five US companies: IBM was tasked with memory research; Intel, Nvidia, and Cray were all separately awarded contracts for node research; while AMD was given both memory and node research.

Diversity of processors

The change in processor technology, according to Steve Conway, vice president of high-performance computing at the market research company IDC, means that users are now looking to develop more specialised tools for specific jobs. So the general-purpose x86 technology that has dominated HPC for the last 15 years or so is now being joined by accelerators and other, less mainstream technologies such as DSPs and FPGAs.

Roy Kim, group product manager in Nvidia’s Tesla HPC business unit, stated: ‘It’s an interesting time right now for HPC. A few years ago, Nvidia GPUs were the only accelerators on the market and a lot of customers were getting used to how to think about and how to program GPUs. Now if you talk to most industry experts, they pretty much believe that the path to the future of HPC, the path to Exascale, will be using accelerator technology.’ The drive is to find different ways to accelerate the parallel sections of code, as this is seen as the area that can provide the largest gain in performance.

HPC feels the Power of IBM

The move from commodity HPC hardware to a more specialised model is exemplified by IBM’s Power processor. This is a reduced instruction set computer (RISC) processor (the acronym originally stood for Performance Optimization With Enhanced RISC) and so very different from the general-purpose x86. It has a system-on-a-chip design, integrating processors, memory, and networking logic into a single chip. Currently four out of the top 10 supercomputers in the Top500 list are IBM BlueGene machines, all with the Power BQC processor.

Its prominence in HPC is largely down to two factors. The first is IBM’s success in the US DOE’s Coral programme. The DOE brought together several of its national laboratories in the joint Collaboration of Oak Ridge, Argonne, and Livermore (hence the name, Coral) to coordinate investments in supercomputing technologies, streamline procurement, and reduce costs, with the aim of developing supercomputers that will be five to seven times more powerful when fully deployed than today’s fastest systems. Two out of the three systems procured through the Coral programme, costing around $325 million, will make use of a combination of IBM Power architecture, Nvidia’s Volta GPU, and Mellanox’s interconnect technologies.

The technological development behind these partnerships underlies a highly strategic move, particularly clear in the partnership between IBM and Nvidia. While Nvidia was developing its NVLink interconnect technology, which will enable CPUs and GPUs to exchange data five to 12 times faster than they can today, IBM set to work with Nvidia to integrate the technology with its own CPUs. The forethought and planning that went into this technology integration demonstrate a partnership that has been in the making for some time.

This exemplifies, in HPC, a second factor in the growth of interest in the Power processor and its associated architecture more generally: IBM’s efforts to promote a wider ‘ecosystem’ of developers as well as other hardware manufacturers around the Power architecture, through the creation of the OpenPOWER Foundation. IBM’s recognition that it needs others in the wider computing community to write software for its hardware perhaps reflects a recognition of the success of Nvidia, which put an immense amount of effort into building up a community of users of its Cuda language for GPUs. It is significant therefore that, earlier this year, IBM recruited Sumit Gupta from

“Exascale machines should be developed through a co-design process that balances classical computational speed and data-centric memory and communications architectures”

Intel I7-4790K ‘Devil’s Canyon’ processor

The Xeon Phi-based ‘Stampede’ supercomputer, housed at the Texas Advanced Computing Center, can perform nearly 10 quadrillion operations per second, making it one of the most powerful supercomputers in the world


Thomas McConnell

Intel
