predictions for 2013

system at Oak Ridge National Laboratory, which uses the K20X. An alternative metric is the Green500 list that ranks the most energy-efficient supercomputers. The upper reaches of that list are dominated by IBM BlueGene/Q systems, but top ranked is the National Institute for Computational Sciences’ Beacon system at the University of Tennessee that uses Xeon Phi. Third on the list is the Titan system – with an AMD GPU-accelerated entry at number two. So Nvidia lands some piercing blows with its strong position on the Top500 list, including the top spot, while Intel counters with several new entries for pre-production versions of Xeon Phi, including top position on the Green500 list. Nvidia wins this round 10-9.

That is, the main program runs on a normal processor and offloads computationally intensive portions of code and related data to the accelerator. This is the model that has been used since the days of array processors. The alternative methods that Intel offers are symmetric (where the workload is shared between the normal processor and what Intel calls the coprocessor), and many-core only (where the whole program runs on the Phi). The symmetric approach can also be used with GPUs, and the many-core only approach may be of limited use because of the limited memory available on the coprocessor.

Many HPC applications spend much of their time executing library functions, so the availability of highly tuned mathematical libraries is very important. The K20X has broad library support, but the functions available for the Phi in offload mode are, as yet, limited, although this will surely change during 2013.

Nvidia has also been working with the HPC industry to provide a standard approach for programming accelerators. This has resulted in the OpenACC initiative, which uses compiler directives that specify loops and regions of code in standard C, C++ and Fortran to be offloaded from a host CPU to an accelerator, providing portability across operating systems, host CPUs and accelerators. OpenACC is supported by supercomputer vendor Cray and compiler companies CAPS and PGI as well as Nvidia, and is being discussed for incorporation into a future version of the mainstream OpenMP standard.

There are arguments in favour of both Nvidia’s Cuda/OpenACC approach and Intel’s focus on its standards-supporting tool chain. But users don’t care about the details of such debates – they want tools that work and are easy to use. When OpenACC becomes part of OpenMP and users can build a single version of an application that can (with minimal tuning) run effectively on both platforms, then the industry will have achieved something worthwhile. For now, it’s 10-10.
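To make the directive-based offload model in this round concrete, here is a minimal sketch in C. It assumes an OpenACC-capable compiler; the function name saxpy_offload and the choice of a SAXPY loop are illustrative and do not come from either vendor's material. The single directive marks the loop for offload and states which arrays are copied to and from the accelerator.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative kernel (hypothetical name): y = a * x + y.
 * With an OpenACC compiler, the directive asks for the loop to run on the
 * accelerator: x is copied in, y is copied in and back out.
 * A compiler without OpenACC support ignores the directive and the loop
 * simply runs on the host CPU. */
void saxpy_offload(int n, float a, const float *x, float *y)
{
    assert(n >= 0 && x != NULL && y != NULL);

    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The same source builds with or without accelerator support, which is the portability argument for directives in miniature; how well it performs on a given device still depends on the compiler and on tuning.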

Round seven: applications

Many popular HPC applications have already been tuned for Nvidia GPUs, and while similar work is underway for the Xeon Phi, it is some way behind, so Nvidia wins this round 10-9. Key application areas are computational chemistry, material science, climate and weather, physics and CAE. Key applications that are already accelerated on Nvidia GPUs include Amber, Charmm, Gromacs, Lammps, Namd, DL_POLY, QMCPack, Quantum Espresso, Chroma, Denovo, Ansys Mechanical, MSC Nastran and Simulia Abaqus. Intel is working with Accelrys, Altair, Ansys, CD-adapco and MSC Software, among others. This work is important, but Intel is playing catch-up.

Round eight: the Top500 list

The Top500 list describes the 500 fastest computers in the world, and is published twice a year at the US and European supercomputing conferences, SC and ISC respectively. Three years ago only seven of the Top500 systems used accelerators – either IBM PowerXCell 8i or Clearspeed CSX600. Since then, the number of accelerated systems has grown to 62. This rapid growth has been driven by Nvidia, which supports 50 of these 62 systems, while there are seven pre-production Xeon Phi systems in the latest list. Of the top 100 systems, Xeon Phi features in four systems while Nvidia GPUs support 13 systems – including the fastest computer in the world, the Cray XK7 Titan

Conclusion

There is little doubt that Intel’s participation in this market segment will validate the approach and will help grow the market for Intel, Nvidia and others. But although Nvidia may lose market share to Intel through 2013, the potential market will grow more quickly than Nvidia loses share – so both Intel and Nvidia will have a good year.

What does the fight scorecard look like?

Round  Category           Intel  Nvidia
1      DP performance         9      10
2      SP performance         9      10
3      Memory                10       9
4      Power consumption     10      10
5      Price                 10       9
6      Programming           10      10
7      Applications           9      10
8      Top500 list            9      10
       Total                 76      78
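The totals are simply the sum of the eight round scores, which can be checked with a short sketch in C. The array and function names are hypothetical; the scores are those on the scorecard, in round order.

```c
#include <assert.h>
#include <stddef.h>

enum { ROUNDS = 8 };

/* Round order: DP performance, SP performance, memory, power consumption,
 * price, programming, applications, Top500 list. */
static const int intel_scores[ROUNDS]  = { 9,  9, 10, 10, 10, 10,  9,  9};
static const int nvidia_scores[ROUNDS] = {10, 10,  9, 10,  9, 10, 10, 10};

/* Sum one contender's scores over the given number of rounds. */
int tally(const int *scores, size_t rounds)
{
    assert(scores != NULL);

    int total = 0;
    for (size_t i = 0; i < rounds; ++i) {
        total += scores[i];
    }
    return total;
}
```

Calling tally on each array gives 76 for Intel and 78 for Nvidia, a two-point margin spread over three Nvidia round wins against two for Intel.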

Both of the combatants have won important rounds, but no-one has yet landed a knock-out blow. Nvidia is the incumbent and there is a large body of GPU software available, while Intel is, well, Intel. Its argument about using the same software platform as mainstream Xeon processors will be persuasive to many, even if it is not entirely convincing. Bottom line – who will the winner be? In 2013, both will be winners. Intel will take market share from Nvidia, but the overall accelerator market will grow, so a year from now we will see more Nvidia GPU and more Intel Xeon Phi systems in the Top500 list. A rematch in a year’s time should prove even more interesting.

With more than 30 years’ experience in the IT industry, initially writing compilers and development tools for HPC platforms, John Barr is an independent HPC industry analyst specialising in the technology transitions towards exascale.

