high-performance computing


PGAS requires collaboration from users, software and hardware suppliers. But there is a serious risk that this approach will not be generally accepted until it is too late. When the HPC industry has gone


through paradigm shifts in the past, the impact of these changes has been largely confined to the hard-core HPC industry. However, HPC is today a strategic asset for many companies beyond traditional HPC users, so major changes to applications will have far-reaching effects. Lippert believes that some applications


can evolve to exascale, while others will need to consider new approaches. The HIGH-Q Club at Jülich supports applications that can fully exploit the compute, memory and/or network capabilities of its 458,000-core IBM BlueGene/Q system. The club now has 12 members, with a further 10 working on qualification.
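The PGAS (partitioned global address space) model mentioned above gives every worker a view of one global address space, partitioned so that each worker "owns" a slice with cheap local access but can still reach remote slices. The following is a toy, single-node emulation using Python threads; all names and the layout are illustrative, not taken from any real PGAS system such as UPC, Coarray Fortran or Chapel, which partition the space across distributed-memory nodes.

```python
import threading

NWORKERS, CHUNK = 4, 5
# The "global address space": one flat array, logically partitioned
# into NWORKERS contiguous chunks, one per worker.
space = [None] * (NWORKERS * CHUNK)

def worker(rank):
    lo = rank * CHUNK
    # Accesses with local affinity: each worker fills its own partition.
    for i in range(lo, lo + CHUNK):
        space[i] = rank
    # A "remote" access: read an element owned by the neighbouring
    # worker, using the same global indexing as local accesses.
    _ = space[((rank + 1) % NWORKERS) * CHUNK]

threads = [threading.Thread(target=worker, args=(r,)) for r in range(NWORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(space)
```

The point of the sketch is the uniform indexing: local and remote data are addressed identically, and only the partition layout tells the programmer (and the runtime) which accesses are cheap.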


Resilience
Applications today, even those running on the fastest supercomputers delivering in excess of 10 petaflop/s, assume that the system will always operate correctly. But exascale systems will use so many components that it is unlikely that the whole system will ever be operating normally. System software must therefore track the state of the system and pass information about failed (or poorly performing) components to applications, which in turn must be built to operate correctly in such an uncertain environment. According to Parsons, handling the lack of resilience of not only




computation, but also communication and storage, will be a major issue for exascale systems. Ramirez thinks that while current petascale applications should be able to run on exascale systems, a new generation of applications built on fault-tolerant algorithms will be required to scale resiliently to the full size of the machine. Lippert proposes a different approach, suggesting that virtualisation of compute, memory, interconnect and storage could hide reliability issues from exascale applications. However, no hypervisor support that could make this a reality has yet been announced.
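One of the simplest patterns behind the fault-tolerant algorithms Ramirez describes is checkpoint/restart: persist known-good state periodically, and on a component failure roll back and resume rather than crash. The sketch below is a minimal, hypothetical illustration of that control flow — the function name, the simulated failure rate and the counter-as-state are all assumptions for the example, not taken from any real exascale runtime.

```python
import random

def run_with_checkpoints(steps, failure_rate=0.2, seed=42):
    """Advance a computation step by step; on a simulated transient
    failure, roll back to the last checkpoint instead of aborting."""
    rng = random.Random(seed)   # deterministic failures for the demo
    checkpoint = 0              # last known-good state (here, a counter)
    completed = 0
    retries = 0
    while completed < steps:
        if rng.random() < failure_rate:
            completed = checkpoint   # failure: restart from checkpoint
            retries += 1
            continue
        completed += 1               # one unit of useful work
        if completed % 10 == 0:
            checkpoint = completed   # persist state periodically
    return completed, retries

done, retries = run_with_checkpoints(100)
print(done, retries)
```

The trade-off a real system must tune is checkpoint frequency: checkpointing too often wastes I/O bandwidth, too rarely and each failure discards more recomputed work — a balance that becomes acute at exascale component counts.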


28 SCIENTIFIC COMPUTING WORLD


Conclusion
What Ramirez finds most valuable about exascale is not the high-end systems themselves, but the low-cost, low-power capabilities that the required technology advances will bring: petascale systems in a single rack with a power draw of 20 kW, and terascale capabilities in portable devices. These systems will deliver high value to society, especially in healthcare, where doctors will be able to deliver real-time diagnoses rather than waiting weeks for access to expensive specialist systems. Sterling is convinced that we will see the


first exascale system before the end of the decade. But, he says: ‘The question is not “will we have an exascale system?”, but “will we have the right one?”.’ It is worth bearing in mind that the first teraflop/s machines, like the Cray T3E and Intel’s ASCI Red system that were operational in the late 1990s, seemed unbearably complicated and difficult to program, yet we now have devices like the Intel Xeon Phi and Nvidia K20 GPU accelerators that routinely deliver a sustained teraflop/s. So, however tough the problems seem to be, the HPC industry will overcome them; in time the challenges of exascale will be solved and we will be looking towards zettascale machines. Perhaps a sign of the times is that the chief architect of the


Cray T3E was Steve Oberlin, who is now CTO at Nvidia. It is worth remembering that the world


is a naturally parallel place, so while many current algorithms may not cope with the billions of threads that exascale systems may require, a new breed of applications that do not compress the natural parallelism of the universe may be able to succeed. Mike Bernhardt, former publisher of


The Exascale Report, and now a marketing evangelist at Intel, says: ‘We are indeed taking some big steps into a new parallel universe. With the impressive breakthroughs in cooling technology and processor fabric integration, building an exascale machine is something we can do today, albeit not in any practical or affordable fashion. That will, of course, improve. But what would we do with such a machine, other than run some benchmarks? The biggest hurdle to picking up traction in this new parallel universe is developing an exascale-level architecture and the new programming models needed to support a new generation of parallel applications.’


With more than 30 years’ experience in the IT industry, initially writing compilers and development tools for HPC platforms, John Barr is an independent HPC industry analyst specialising in technology transitions.


@scwmagazine l www.scientific-computing.com



