Thus, while Adept, a three-year project which started in September 2013, has EPCC, Uppsala University in Sweden, and Ghent University in Belgium as participants, it also involves the telecoms company Ericsson AB from Sweden and a small Edinburgh-based company, Alpha Data, which specialises in FPGAs and the like for digital signal processing, imaging systems, communications, military and aerospace, as well as high-performance computing.

Ericsson's interest in Adept stems from LTE – the Long Term Evolution of wireless data communications technology, intended to increase the capacity and speed of wireless data networks using digital signal processing techniques. LTE also involves redesigning and simplifying the architecture of wireless communications networks, moving them to an IP-based system so as to reduce transfer latency significantly.

Mobile phone base stations have to cope with huge quantities of data, voice, and video, and they have to do so with minimal energy consumption. Like all telecoms companies, therefore, Ericsson faces choices about which hardware to invest in, and needs tools to help it decide what to buy next.

The Adept project is providing a way for software developers to explore the design space, but it also provides the reverse function: if you know your code, it can give you an idea of what hardware to buy to run it efficiently – something invaluable to LTE technology providers.

Adept started in 2013 and, according to Lecomber, Allinea too has been working to find ways of developing energy profiling as part of performance tools for the past couple of years, in partnership with the University of Warwick in the UK.

Allinea specialises in providing high-level benchmarking tools that allow people to understand the performance of their application without needing to see the source code, as well as software development tools. Now the company is adding an energy-efficiency component to its toolkits. Lecomber pointed out that Allinea Performance Reports can be run on 'a real code on a real workload, and you can see at a glance how much time you are spending in I/O; how much time you're spending in MPI; thread synchronisation down to the processor level; you can see how much time you spend getting stuff from main memory; and how much time doing floating-point vector operations.' Adding the energy profiler 'will guide you on how you can run the application to use less energy,' he said.

As an application runs, Allinea's software will be taking energy measurements 'so you can see the spikes in the energy over the execution of an application, and tie those spikes to areas of your code and focus on those. The tool will allow people to go almost down to the line level in looking at the energy usage of their applications.'
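Conceptually, this kind of attribution comes from sampling an energy counter on a timer and aligning the timestamps with markers the program emits as it enters each phase. The sketch below is not Allinea's implementation – just a minimal illustration of the idea, with a hypothetical read_energy_joules() standing in for whatever counter the platform provides:

```python
# Minimal sketch of timestamped energy sampling correlated with code phases.
# Not Allinea's implementation; read_energy_joules() is a hypothetical
# stand-in for a real platform counter such as RAPL.
import threading
import time

samples: list[tuple[float, float]] = []   # (timestamp, cumulative joules)
phases: list[tuple[float, str]] = []      # (timestamp, phase label)

def read_energy_joules() -> float:
    raise NotImplementedError("wire up a real counter here")

def sampler(stop: threading.Event, period_s: float = 0.1) -> None:
    """Background thread: record the counter at a fixed period."""
    while not stop.is_set():
        samples.append((time.time(), read_energy_joules()))
        time.sleep(period_s)

def mark(label: str) -> None:
    """Call at the start of each code region to tag subsequent samples."""
    phases.append((time.time(), label))

# Usage idea: start sampler() in a thread, call mark("solver"), mark("io"),
# etc. around regions of interest, then bin samples by the enclosing phase
# to see which part of the code the energy spikes belong to.
```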


Measuring power consumption
But in order to reduce power consumption, it must first be measured. In the Adept project, energy measurement is EPCC's role – developing the benchmarks – while Alpha Data is providing the power measurement board that filters data so as to get the highest-resolution power reading.




According to Johnson, EPCC has a set of different types of hardware in the laboratory so that it can test performance on different architectures and then provide feedback on the accuracy of the model's predictions as compared with the actual energy consumption.

For Allinea's David Lecomber, 'There is no perfect measuring system. There is always a slight delay; capacitance effects; even the granularity of the frequency of the sampling.' However, he pointed out that Intel has the RAPL (Running Average Power Limit) interface, which gives a lot of processor-level data on energy. Vendors such as Cray have built in decent measuring systems that bring in server-level energy information. 'All the vendors are keen to make that information available, so you can see it in the operating system and understand the energy performance,' he said.
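On Linux, RAPL counters are exposed through the powercap sysfs interface. A minimal sketch of deriving package power from the cumulative energy counter – assuming an Intel CPU, a recent kernel with intel-rapl support, and sufficient permissions to read the file:

```python
# Minimal sketch: read Intel RAPL package energy via the Linux powercap
# interface. Assumes /sys/class/powercap/intel-rapl:0 exists; on recent
# kernels reading energy_uj may require root. Paths vary by system.
import time

RAPL_DIR = "/sys/class/powercap/intel-rapl:0"

def read_energy_uj() -> int:
    """Return the cumulative package-0 energy counter in microjoules."""
    with open(f"{RAPL_DIR}/energy_uj") as f:
        return int(f.read())

def average_power_watts(interval_s: float = 1.0) -> float:
    """Sample the counter twice and derive average power over the interval."""
    e0 = read_energy_uj()
    time.sleep(interval_s)
    e1 = read_energy_uj()
    # The counter wraps at max_energy_range_uj; wrap handling omitted here.
    return (e1 - e0) / interval_s / 1e6

if __name__ == "__main__":
    print(f"Package power: {average_power_watts():.1f} W")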


Holistic approach
Allinea's focus has been at the RAPL level and also on the energy consumption of accelerators such as GPUs. He pointed out: 'You can see the spike in energy over the entire application every time you go out to a GPU. You are using more energy, but it is doing the work and the trade-off is clear. You are using more energy, but you are going to finish the computation an awful lot quicker.' However, 'if you are spinning up a GPU, you still have the CPUs running', so Lecomber stressed the need to understand energy consumption at the level of the whole system, not just RAPL and the accelerator – 'that's what we are bringing in'.
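The trade-off Lecomber describes is simply energy = power × time, summed over every component drawing power. A small illustrative calculation – all wattages and runtimes below are hypothetical, chosen only to show the arithmetic:

```python
# Hypothetical figures illustrating energy-to-solution for a CPU-only run
# versus a GPU-offloaded run; real numbers depend entirely on the system.
def energy_kj(power_w: float, seconds: float) -> float:
    return power_w * seconds / 1000.0

# CPU-only run: 150 W for 600 s.
cpu_only = energy_kj(150, 600)                      # 90 kJ

# GPU run: the GPU draws 300 W for 100 s, but the host CPUs still draw
# 80 W while it works, so a whole-system view must count both.
gpu_run = energy_kj(300, 100) + energy_kj(80, 100)  # 38 kJ

print(f"CPU only      : {cpu_only:.0f} kJ")
print(f"GPU + idle host: {gpu_run:.0f} kJ")
# Higher instantaneous power, much shorter runtime: less total energy.
```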


Axel Auweter, team leader for energy efficiency in the Deep project, has an even wider concept: that of the whole system. The electricity consumed by a supercomputer is rejected to the environment as heat and, in many countries, it can be used for heating offices and other spaces in the rest of the building. But this requires water at a particular temperature, which depends on the outside environment as well as on what is going on inside the machine itself.

Deep is investigating hardware prototypes, not just software, and the machine is lavishly equipped with energy and heat sensors to monitor its operations. But the wider environment (in principle, the building itself) and its demand for energy are also being taken into account, and the system software has been designed to optimise energy consumption as a whole, he said. This could even mean that the operating system runs the computer less efficiently, in terms of energy consumption, in order to ensure that the system as a whole decreases its demand for energy.


Optimise applications, not systems
Allinea has been testing its energy-efficiency software for some months, ahead of the formal launch. Lecomber said: 'On some codes, you could slow the processor down by 10 to 15 per cent – and reduce the energy by similar amounts – yet the computation finished in exactly the same time.' The outcome underscores his theme that the best route to energy efficiency is to optimise for performance.
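Lecomber's observation is typical of memory-bound codes: if runtime is dominated by waits on main memory, lowering the core clock barely changes time-to-solution while cutting power. A back-of-the-envelope model of that effect – the power figures and the assumption of unchanged runtime are illustrative, not measurements from Allinea:

```python
# Illustrative model of frequency scaling on a memory-bound code.
# Assumption: runtime is fixed by memory bandwidth, so a ~12% lower clock
# leaves time-to-solution unchanged while dropping package power.
runtime_s = 600.0        # hypothetical time-to-solution, identical in both cases

power_full_w = 200.0     # hypothetical package power at full clock
power_scaled_w = 175.0   # hypothetical package power at the lower clock

energy_full = power_full_w * runtime_s / 1000.0      # kJ
energy_scaled = power_scaled_w * runtime_s / 1000.0  # kJ

saving = 100.0 * (1.0 - energy_scaled / energy_full)
print(f"Energy: {energy_full:.0f} kJ -> {energy_scaled:.0f} kJ "
      f"({saving:.0f}% saved, same runtime)")
```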


He stressed that the issue is not the performance of the site's HPC system, as measured by the canonical Linpack benchmark used to draw up the Top500 list: 'Linpack is irrelevant if your HPC system is doing, say, OpenFoam.' Rather, the issue is the performance of each individual application on that system: 'Sites really need to get serious on the benchmarking of applications for performance.' (A discussion of some of the drawbacks of Linpack as a measure of system performance can be found on page 22.)

Lecomber warned: 'It is very easy to use an HPC system inefficiently. If your users are eating hours of cluster time, but their application is poorly configured – because it is using too many MPI processes and has lost efficiency by doing that, or maybe the thread/process balance is bad and you're losing half the time to synchronisation – that's energy and time you're losing. It's too easy to assume that codes are well optimised for the systems they are running on. Benchmarking of applications will enable HPC centres to understand more about what's going on in their system, and where their energy is actually going. That will lead to improvements in their throughput – the actual amount of science per watt.'
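One way a centre might act on that advice is to benchmark energy-to-solution across launch configurations rather than assuming the default is optimal. A minimal sketch of such a scan – the application name and rank counts are hypothetical, and the energy hook reads a single RAPL package as in the earlier sketch, where a real deployment would sum all packages and nodes:

```python
# Sketch: scan MPI rank counts, record time- and energy-to-solution, and
# report the most energy-efficient configuration. "./my_app" and the rank
# counts are placeholders; substitute the site's real measurement source.
import subprocess
import time

def read_total_energy_joules() -> float:
    # Package 0 only, via the Linux powercap interface; counter wrap over
    # long runs is ignored here for brevity.
    with open("/sys/class/powercap/intel-rapl:0/energy_uj") as f:
        return int(f.read()) / 1e6

def benchmark(ranks: int) -> tuple[float, float]:
    """Run the app at a given rank count; return (seconds, joules)."""
    e0, t0 = read_total_energy_joules(), time.time()
    subprocess.run(["mpirun", "-np", str(ranks), "./my_app"], check=True)
    e1, t1 = read_total_energy_joules(), time.time()
    return t1 - t0, e1 - e0

results = {n: benchmark(n) for n in (8, 16, 32, 64)}
best = min(results, key=lambda n: results[n][1])  # minimise energy-to-solution
for n, (secs, joules) in results.items():
    print(f"{n:3d} ranks: {secs:7.1f} s  {joules/1000:8.1f} kJ")
print(f"Most energy-efficient configuration: {best} ranks")
```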

