High-performance computing
Technology challenges As already noted, it is impossible to reach exascale just by doing more of the same, but bigger and faster. Power consumption is the largest elephant in the room, but it is not alone. In many areas, progress towards exascale systems and applications will come not from incremental change, but from doing things differently. The main issues that must be addressed before exascale systems become a reality include:
Power consumption The most efficient large-scale HPC system today is Tsubame 2.5 at the Tokyo Institute of Technology, which has a peak performance of 5.6 petaflop/s and consumes 1.4 MW. If that system were scaled up to an exaflop/s it would consume around 250 MW, which is at least an order of magnitude too much.
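That 250 MW figure is simple linear scaling from the numbers quoted above, as the short sketch below shows (it assumes power scales linearly with peak performance, which ignores any efficiency gains along the way):

```c
/* Back-of-envelope: linearly scale Tsubame 2.5's power draw to an exaflop/s.
 * The 5.6 Pflop/s and 1.4 MW figures are those quoted in the text; linear
 * scaling with peak performance is an assumption made for illustration. */
#include <stdio.h>

int main(void)
{
    const double peak_pflops = 5.6;     /* Tsubame 2.5 peak, petaflop/s */
    const double power_mw    = 1.4;     /* Tsubame 2.5 power, MW        */
    const double exa_pflops  = 1000.0;  /* 1 exaflop/s = 1,000 Pflop/s  */

    double scaled_mw = power_mw * (exa_pflops / peak_pflops);
    printf("Naive exascale power: %.0f MW\n", scaled_mw);   /* ~250 MW */
    return 0;
}
```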
Massive scalability Exascale systems will have millions of processor cores, and exascale applications will have billions of parallel threads.
Heterogeneity It is generally accepted that exascale systems will be heterogeneous, with the computation being handled by highly parallel, low-power devices such as Intel Xeon Phi or Nvidia GPU accelerators. The first exascale system may also use exotic technologies to hit the performance target. At the Big Data and Extreme Computing meeting in Fukuoka, Japan, earlier this year, Japan's Office for Promotion of Computing Science announced a collaboration between a number of Japanese computer vendors and research institutes that planned to build an exascale system by the end of this decade. With an investment in excess of $1 billion, and a target power consumption of less than 40 MW, the system will use what they call an extreme SIMD architecture with thousands of processing elements per chip, including on-chip memory and interconnect. This architecture is aimed at computationally intensive applications such as N-body, molecular dynamics (MD) and stencil codes. The motivation for Japan to do this, however, is unclear, as the commercial impact of technologies leveraging the K computer (which was the world's fastest computer in 2011) has so far been limited.
Integration The majority of the power consumed by supercomputers today is not used for computation, but for moving data around the system. A higher level of integration for components such as interconnect and memory will both speed computation and reduce power consumption.
Resilience Exascale systems will use so many components that it is unlikely the whole system will ever be operating normally. The hardware, system software and applications must cope with both degraded and failed components.
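The article does not prescribe a mechanism, but one common application-level tactic for coping with failures is periodic checkpointing, so that a job can restart from its last saved state rather than from the beginning. The sketch below is purely illustrative: the state layout, file name and checkpoint interval are all assumptions.

```c
/* Minimal application-level checkpoint/restart sketch. The state struct,
 * file name and checkpoint interval are illustrative assumptions only. */
#include <stdio.h>
#include <string.h>

#define CHECKPOINT_FILE "state.chk"   /* hypothetical file name */
#define N 1000000

struct state { long step; double field[N]; };

static int save_checkpoint(const struct state *s)
{
    FILE *f = fopen(CHECKPOINT_FILE, "wb");
    if (!f) return -1;
    size_t ok = fwrite(s, sizeof *s, 1, f);
    return (fclose(f) == 0 && ok == 1) ? 0 : -1;
}

static int load_checkpoint(struct state *s)
{
    FILE *f = fopen(CHECKPOINT_FILE, "rb");
    if (!f) return -1;                  /* no checkpoint yet: start fresh */
    size_t ok = fread(s, sizeof *s, 1, f);
    fclose(f);
    return ok == 1 ? 0 : -1;
}

int main(void)
{
    static struct state s;              /* static: too large for the stack */
    if (load_checkpoint(&s) != 0)
        memset(&s, 0, sizeof s);        /* no saved state: start at step 0 */

    for (; s.step < 10000; s.step++) {
        /* ... one simulation step would update s.field here ... */
        if (s.step % 1000 == 0)
            save_checkpoint(&s);        /* a node failure after this point
                                           loses at most 1,000 steps */
    }
    return 0;
}
```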
Programming methodologies and applications There are two schools of thought regarding the programming methodologies required to build exascale applications. Some HPC experts think that it is feasible to extend today's MPI plus OpenMP plus an accelerator programming model to exascale. Others believe that a radical rethink is required, and that new methods, algorithms, and tools will be needed to build exascale applications.
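For readers unfamiliar with the incumbent model, the sketch below shows the first two of those layers: MPI message passing between ranks (typically one or a few per node) and OpenMP threading within each rank. The accelerator layer is omitted, and the workload is purely illustrative.

```c
/* Minimal MPI + OpenMP hybrid sketch: MPI ranks across nodes, OpenMP
 * threads within each rank. Problem size and workload are illustrative. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(int argc, char **argv)
{
    int provided, rank, nranks;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    double local = 0.0, global = 0.0;

    /* Thread-level parallelism inside each MPI rank (no MPI calls here,
     * consistent with MPI_THREAD_FUNNELED). */
    #pragma omp parallel for reduction(+:local)
    for (long i = rank; i < N; i += nranks)
        local += 1.0 / (double)(i + 1);

    /* Process-level parallelism: combine the per-rank partial sums. */
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("partial harmonic sum = %f (ranks=%d, threads=%d)\n",
               global, nranks, omp_get_max_threads());

    MPI_Finalize();
    return 0;
}
```

Such a code is typically launched with mpirun (or the site's batch system), with the per-rank thread count controlled by OMP_NUM_THREADS.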
Skills There is a serious lack of parallel programming skills, both at the entry level and at the very high end. As most mobile phones and tablets, and all computers, are now multicore devices, all programmers should be taught the rudiments of parallel programming. Today this happens only at a small number of universities, although the number of entry-level parallel programming courses is growing. The challenges of programming systems with thousands or millions of cores are far more complex than programming a simple multicore device, but most high-end supercomputer sites have to train their own staff, as only a handful of universities or research facilities provide this level of training.
Bill Kramer, who leads the @Scale Science
and Technologies unit at the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign, believes that the biggest challenges the industry faces in moving to exascale computing are that Moore's Law no longer delivers regular reductions in power consumption, that moving data is costly, and that memory and I/O capabilities are out of step with advances in compute power.
To minimise the amount of data moved around a system (an activity that consumes more power and takes more time than actually processing the data), application writers should consider whether all of their data structures really need double-precision floating point, as the use of single-precision data could halve an application's memory requirement and data transfer time.
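The saving is easy to quantify: a double-precision value occupies eight bytes and a single-precision value four, so converting a large array halves both its memory footprint and the bytes that must be moved. A minimal illustration with an arbitrary array length follows; whether single precision is numerically acceptable is, of course, entirely application-dependent.

```c
/* Illustration of the memory saving from single vs double precision.
 * The array length is arbitrary; the precision trade-off must be judged
 * per application. */
#include <stdio.h>

int main(void)
{
    const long n = 100000000L;   /* 1e8 elements, e.g. one field on a grid */

    double dbl_gb = n * sizeof(double) / 1e9;   /* 8 bytes per element */
    double sgl_gb = n * sizeof(float)  / 1e9;   /* 4 bytes per element */

    printf("double precision: %.1f GB, single precision: %.1f GB\n",
           dbl_gb, sgl_gb);   /* 0.8 GB vs 0.4 GB: half the data to move */
    return 0;
}
```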
Applications Unless there are applications that can exploit such a system, there is no point in building an exascale machine. The number of applications that can use the full capabilities of petascale systems today is relatively small, but programming and application-design skills are improving fast as supercomputing centres around the world focus on application porting, tuning, and training. The jury is still out on what an exascale programming environment will look like.