Creating balance in HPC

Robert Roe investigates the motivation behind the architectural changes to Europe’s fastest supercomputer, Piz Daint, housed at the Swiss National Supercomputing Centre


The flagship supercomputer at the Swiss National Supercomputing Centre (CSCS), Piz Daint, named after a mountain in the Alps, currently delivers 7.8 petaflops of compute performance, or 7.8 quadrillion mathematical calculations per second. A recently announced upgrade will double its peak performance, thanks to a refresh using the latest Intel Xeon CPUs and 4,500 Nvidia Tesla P100 GPUs.

Thomas Schulthess, professor of computational physics at ETH Zurich and director of the Swiss National Supercomputing Centre, said: ‘We will put both systems into a single fabric. It will be one fabric with two different node architectures and we will have Data Warp nodes as well.’

During the upgrade, the CPUs and accelerators will be updated and the system will be combined with the Piz Dora supercomputer, also housed at the Swiss centre, to create a single, unified HPC system containing both CPU/GPU nodes and purely CPU-based nodes.

This upgrade is key to the future development of supercomputing at the Swiss centre, both for high-resolution simulations and for the field of data science, which requires the analysis of enormous volumes of data.

Today, materials science, geophysics, life sciences and climate sciences all use data- and CPU-intensive simulations. With the new hardware, researchers will be able to perform these simulations more efficiently.

Creating a balanced infrastructure

However, the planned upgrade is not purely to increase performance. As with many HPC centres, CSCS has many data-intensive applications and, as these continue to scale, they face increasing challenges around memory bandwidth.

[Image: The Swiss National Supercomputing Centre, Lugano, Switzerland]

Schulthess explained the reasoning behind the Piz Daint upgrade: ‘Currently on Piz Daint we have a bottleneck between the GPU and the CPU; that is one of the motivations for changing the configuration of the node. All of our climate codes and the seismic codes are bandwidth bound; we have many applications that are bandwidth bound.’

To solve this challenge, Schulthess and his colleagues decided to add high-bandwidth memory (HBM) and introduce Cray’s Data Warp technology. Data Warp is an IO accelerator that uses SSDs to increase storage performance, reducing the bottlenecks associated with the system’s most data-intensive applications.

‘Although slightly reduced in physical size, Piz Daint will become considerably more powerful as a result of the upgrade, particularly because we will be able to increase bandwidth significantly in the most important areas,’ said Schulthess. ‘Piz Daint will remain an energy-efficient, balanced system, but will now offer increased flexibility.’

[Image: Piz Daint supercomputer housed at the Swiss National Supercomputing Centre]

Scientific Computing World, @scwmagazine

Along with introducing the latest Nvidia GPUs and CPUs, Cray’s Data Warp technology and, crucially, high-bandwidth memory (HBM), the upgrade switches the system from PCIe Gen 2 to PCIe Gen 3. Both HBM and the move to PCIe Gen 3 directly address memory bandwidth, providing a more balanced system for future users.

‘The system will be more balanced in the future because on the K20X and Sandy Bridge they were talking to each other with PCIe Gen 2, but now they will talk on Gen 3,’ said Schulthess. ‘If you imagine that the application problem is spread across the GPU memory of many nodes, now the GPUs talk all the way to the network interface circuit using PCIe Gen 3.’
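The size of the gain from moving between PCIe generations can be estimated with simple arithmetic. The sketch below is illustrative only: it uses the published per-lane signalling rates and line encodings for PCIe Gen 2 (5 GT/s, 8b/10b) and Gen 3 (8 GT/s, 128b/130b), and it assumes a 16-lane link, which the article does not specify. The function name is hypothetical, and the results are theoretical one-direction peaks, not measured figures for Piz Daint.

```python
def pcie_bandwidth_gb_s(transfer_rate_gt_s, payload_bits, encoded_bits, lanes=16):
    """Theoretical one-direction PCIe bandwidth in GB/s.

    transfer_rate_gt_s: raw signalling rate per lane in gigatransfers/s.
    payload_bits / encoded_bits: line-encoding efficiency (e.g. 8/10).
    lanes: link width; 16 is an assumption, not a figure from the article.
    """
    usable_gbit_per_lane = transfer_rate_gt_s * payload_bits / encoded_bits
    return usable_gbit_per_lane * lanes / 8  # 8 bits per byte


gen2 = pcie_bandwidth_gb_s(5.0, 8, 10)      # Gen 2: 5 GT/s, 8b/10b encoding
gen3 = pcie_bandwidth_gb_s(8.0, 128, 130)   # Gen 3: 8 GT/s, 128b/130b encoding

print(f"PCIe Gen 2 x16: {gen2:.2f} GB/s")
print(f"PCIe Gen 3 x16: {gen3:.2f} GB/s")
print(f"Gen 3 / Gen 2:  {gen3 / gen2:.2f}x")
```

Under these assumptions the link between CPU and GPU roughly doubles in theoretical throughput, which is consistent with the bottleneck Schulthess describes being eased rather than removed.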
