HPC in the cloud: is it real?

John Barr offers his take on the future of cloud computing and HPC

Optimising high-performance computing applications is all about understanding both the application and the target platform. HPC developers worry about memory bandwidth, data placement, cache behaviour and the floating point performance of the target processors and compute accelerators in order to deliver the very best performance. Cloud computing, on the other hand, is all about virtualisation, which hides details of the target architecture from the application. Cloud offers both resource usage and business model flexibility, which is good – but perhaps not for HPC applications, or is it?

The transition from grid to cloud

The term grid computing became popular in technical computing circles around the turn of the century to describe an on-demand model for the use of distributed compute resources – the word ‘grid’ being chosen as it is analogous to the way that the power grid delivers pervasive access to electricity. Grid approaches could be used at many levels. One characterisation, proposed by Wolfgang Gentzsch while director of grid computing at Sun Microsystems, comprised cluster grids, enterprise grids, and the global grid. Cluster grids are now simply called clusters, for which grid middleware is used to ensure effective resource management.

An enterprise grid has much in common with what is now called a private cloud, while public clouds and the global grid have many similarities. There are many definitions of both grid and cloud computing. The exact terminology doesn’t really matter, the key points being that the emergence of cloud computing owed much to the work of the pioneers in grid computing, and that a major difference between grid and cloud is the exploitation of virtualisation in the cloud, although virtualisation is often not included in HPC-specific cloud offerings.

HPC systems deliver insight to scientists and engineers through the simulation of everything from subatomic particles to stars and galaxies, for purposes as diverse as fighting disease, climate modelling, and designing complex products such as cars and aircraft. But few organisations can keep large HPC facilities busy full time, so many cannot justify buying their own personal supercomputer. This is where the cloud comes in. By accessing HPC facilities in the cloud an organisation can use (and pay for) the facility it needs, when it needs it – with no responsibility for paying for or managing the resource beyond the period when it is required by a project. The utility business model of cloud computing is very powerful, but can cloud deliver the same value for HPC that it does for outsourced email, web applications or business applications delivered through software-as-a-service (SaaS)?

HPC cloud offerings can be described – using very broad brushstrokes – as falling into one of two camps. On the one hand are simple clusters, while on the other hand are highly tuned facilities, adding compute accelerators and high-performance interconnects to standard clusters. HPC applications deliver high performance through exploiting a high degree of parallelism. Different classes of application behave in different ways. Some, for example a Monte Carlo simulation of financial risk analysis, can compute many independent calculations before combining all of the results at the end of the run. Others, such as weather forecasting, require the regular exchange of data between the many parallel threads of computation. So some HPC applications can run effectively on standard cloud infrastructures, while others require specific HPC capabilities.
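To make the first class of application concrete, here is a minimal Python sketch of the Monte Carlo pattern described above: many independent chunks of work, with the results combined only once at the end of the run. It is an illustration under stated assumptions, not a real risk model. The function names simulate_chunk and estimate_tail_probability are hypothetical, the standard normal 'loss' and the threshold of 2.0 are stand-ins chosen for simplicity, and the thread pool merely stands in for a set of cloud nodes (Python threads will not speed up this CPU-bound loop; a real deployment would use separate processes or machines).

```python
import random
from concurrent.futures import ThreadPoolExecutor


def simulate_chunk(seed, n_trials, threshold=2.0):
    """Run n_trials independent trials and count those exceeding threshold.

    Assumption: each trial draws a standard normal 'loss' as a placeholder
    for whatever a real risk model would compute.
    """
    rng = random.Random(seed)  # private generator: no state shared between chunks
    return sum(1 for _ in range(n_trials) if rng.gauss(0.0, 1.0) > threshold)


def estimate_tail_probability(n_chunks, trials_per_chunk, workers=4):
    """Farm the chunks out to workers, then combine the counts once at the end."""
    seeds = range(n_chunks)  # one seed per chunk keeps the run reproducible
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = list(pool.map(lambda s: simulate_chunk(s, trials_per_chunk), seeds))
    # The only 'communication' in the whole run is this final reduction.
    return sum(counts) / (n_chunks * trials_per_chunk)


if __name__ == "__main__":
    print(estimate_tail_probability(n_chunks=8, trials_per_chunk=50_000))
```

Because no chunk ever waits for another, the number of workers changes only how long the run takes, not the answer. A weather code that exchanges boundary data at every time step has no such freedom, which is why the interconnect matters so much more for that class of application.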

It’s the latency, stupid

Some HPC applications are highly scalable and can run efficiently in a standard cloud environment, while others are so dependent on fast communication between nodes that they need a dedicated cluster designed with HPC in mind. High latency (or indeed low bandwidth) of communications can limit performance in several ways. First, it can just slow the application down. Second, it can limit scalability. Latency that may be acceptable when 12 nodes are communicating may become a limiting factor at hundreds of nodes. Finally, the latency effect of cloudbursting can mean that an application that runs effectively in a private or public cloud won’t work if it cloudbursts to a hybrid cloud as the latency between a private and public cloud is often orders of magnitude more than for communications within a cloud.

Many cloud providers appreciate that some HPC applications are more demanding than mainstream applications, and provide special offerings to meet the needs of the HPC community, including non-virtualised solutions, HPC platform-as-a-service, HPC applications in a SaaS model, systems architected to meet the needs of HPC applications and a range of HPC services. This section highlights a cross section of these, but is far from complete as offerings are evolving all the time, and the flexibility of cloud means that new companies can appear out of nowhere and quickly deliver sophisticated solutions leveraging the cloud.

Advania is a Nordic IT company that hosts facilities for corporate clients worldwide. Among its offerings is an HPC cloud service for the academic community in Sweden, Denmark and Norway, which it delivers from its energy-efficient Thor data centre in Iceland.

The Amazon Elastic Compute Cloud (EC2) provides two types of cluster specially configured for HPC needs, cluster compute and cluster GPU instances, which both use 10 Gbps Ethernet networks. The nodes in the GPU instances
