Liquid cooling drives new science

Liquid cooling is driving more complex scientific research in HPC centres and enabling computing environments at the edge

As HPC systems continue to increase in density, and with more usage of GPU and accelerator technologies, controlling cooling and power efficiency becomes an increasingly difficult challenge. State-of-the-art HPC environments will often leverage large parallel processors, dual-socket servers and possibly also GPUs to deliver performance. The increasing density that these users require means that liquid cooling plays a critical role.

Liquid cooling not only helps to deliver high-value research but can also help organisations reduce energy costs and increase the efficiency of their data centres. Power and energy efficiency can be a limiting factor in many modern data centres, so reducing the amount of energy used to cool computing hardware can improve the power usage effectiveness (PUE) rating of a data centre, unlocking more energy that can be used to power computing hardware.

Liquid cooling uses either cool or warm liquid rather than cold air to dissipate heat from computer and server components. This allows for higher component performance and reliability, higher densities and decreased data centre operating expenses. Take a look at some of the latest and most powerful supercomputers announced over the last 12 months and you will see a rising trend in the use of liquid cooling to support these critical research systems.

There is an influx of new users that want to make use of AI and ML, and this will continue as the technologies become more ubiquitous and easier to use. This, in turn, is driving larger user bases in academia with more demands for dense computing configurations that can support HPC and AI applications.
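To make the PUE argument concrete, the short Python sketch below uses the standard definition of PUE (total facility energy divided by IT equipment energy). The function names and all energy figures are hypothetical, chosen only to illustrate how a smaller cooling overhead lowers PUE and leaves more of a fixed energy budget for computing hardware; they do not describe any real data centre.

```python
def pue(it_kwh: float, cooling_kwh: float, other_overhead_kwh: float = 0.0) -> float:
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if it_kwh <= 0:
        raise ValueError("IT energy must be positive")
    total_kwh = it_kwh + cooling_kwh + other_overhead_kwh
    return total_kwh / it_kwh


def it_energy_available(facility_budget_kwh: float, pue_value: float) -> float:
    """Energy left for IT equipment under a fixed facility energy budget."""
    if pue_value < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return facility_budget_kwh / pue_value


if __name__ == "__main__":
    # Hypothetical figures (assumptions for illustration, not measurements).
    air_cooled = pue(it_kwh=1000.0, cooling_kwh=500.0, other_overhead_kwh=100.0)
    liquid_cooled = pue(it_kwh=1000.0, cooling_kwh=200.0, other_overhead_kwh=100.0)

    budget_kwh = 1600.0  # fixed facility energy budget, also hypothetical
    print(f"PUE with higher cooling overhead: {air_cooled:.2f}")
    print(f"PUE with lower cooling overhead:  {liquid_cooled:.2f}")
    print(f"IT energy at the higher PUE: {it_energy_available(budget_kwh, air_cooled):.0f} kWh")
    print(f"IT energy at the lower PUE:  {it_energy_available(budget_kwh, liquid_cooled):.0f} kWh")
```

A PUE of 1.0 would mean every unit of energy reaches the computing hardware, so the closer a facility gets to that figure, the more of its power envelope is available for research workloads.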

14 Scientific Computing World Summer 2021

Durham opts for liquid cooled processors in COSMA 8

Distributed Research utilising Advanced Computing (DiRAC) provides distributed high-performance computing services to those studying in fields such as astrophysics, cosmology, solar system science and nuclear physics. DiRAC’s computing resources are distributed over four university sites (Cambridge, Leicester, Durham and Edinburgh), each tailored to specific use cases. The DiRAC Memory Intensive (DiRAC MI) system at Durham University focuses on cosmological simulations, and so is known as COSMA, short for Cosmology Machine. The first generation of COSMA came online in 2001, prior to the establishment of DiRAC, and the latest iteration is COSMA 8. Successive upgrades tend to overlap, so the current DiRAC service is actually three clusters (COSMA 6, 7 and 8), while local Durham users continue to use COSMA 5.

“The cumulative effects of unproductive time spent by every scientist across multiple departments or sites can negatively impact business outcomes”

COSMA 8 utilises dual 280-watt AMD Epyc 7H12 processors per node, each with a 2.6GHz base clock frequency and 64 cores, installed in Dell Cloud Service C-series chassis with a 2U form factor and custom CoolIT water cooling. COSMA 8’s initial installation comprises 32 nodes, each with 1TB of RAM, but will eventually have 360 nodes. The previous system used 5120 Intel Xeon Skylake processors.

This cooling solution helps DiRAC push the computing performance of each node with two 64-core CPUs and large amounts of memory. ‘If you’re doing large-scale cosmological simulations of the universe, you need a lot of RAM,’ said Alastair Basden, technical manager for the DiRAC MI service at Durham University. ‘They can have run times of months, and then, after they’ve produced their data, which will be snapshots of the universe at lots of different time steps and different redshifts, then years are spent in processing and analysis.’

‘For our current system (COSMA 7), we have about 18GB of RAM per core,’ said Basden. ‘When you compare that with typical systems of two to four gigabytes, that’s a significant uplift. We do large-scale cosmological simulations of the universe, right from the start of the Big Bang up until the present day.’

For the eighth generation of COSMA, Durham University opted to use AMD’s Epyc processors. Basden said Durham opted for AMD to enable much larger simulation data sets with faster execution to speed up the discovery process during cosmological investigations. DiRAC@Durham is also exploring GPU acceleration with AMD Radeon Instinct graphics cards and plans to use them for running AI workloads built on frameworks such as TensorFlow.

COSMA 8 entered service in a prototype stage last October. The rest of the system is undergoing installation and is expected to be ready by October 2021.

‘This will help us to understand the nature of the universe, dark matter, dark energy and how the universe was formed,’ said Basden. ‘It’s really going to help us drill down to a fundamental understanding of the world that we live in.’
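The node figures quoted for COSMA 8 can be combined into a rough per-core memory estimate, sketched below in Python. The `NodeSpec` class and the helper function are hypothetical names used only for this illustration. The result is derived arithmetic, not a figure quoted by Durham, and it assumes that 1TB means 1024GB and that all 360 planned nodes match the initial node specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Per-node figures as quoted in the article (hypothetical container type)."""
    sockets: int
    cores_per_socket: int
    ram_gb: int

    @property
    def cores(self) -> int:
        return self.sockets * self.cores_per_socket

    @property
    def ram_per_core_gb(self) -> float:
        return self.ram_gb / self.cores


def cluster_totals(node: NodeSpec, node_count: int) -> tuple[int, int]:
    """Return (total cores, total RAM in GB) for identical nodes."""
    return node.cores * node_count, node.ram_gb * node_count


# Two 64-core processors and 1TB of RAM per node; 1TB taken as 1024GB (assumption).
cosma8_node = NodeSpec(sockets=2, cores_per_socket=64, ram_gb=1024)

if __name__ == "__main__":
    print(f"Cores per node: {cosma8_node.cores}")
    print(f"RAM per core:   {cosma8_node.ram_per_core_gb:.1f} GB")
    for label, count in (("initial", 32), ("planned", 360)):
        cores, ram_gb = cluster_totals(cosma8_node, count)
        print(f"{label}: {count} nodes, {cores} cores, {ram_gb / 1024:.0f} TB RAM")
```

Under these assumptions each node has 128 cores, so the per-core figure follows directly from dividing the node memory by that count.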

Liquid immersion at the edge

There is also increasing interest in how liquid cooling can support science and research in edge computing. A recent announcement from Iceotope and Hewlett Packard demonstrates how the technology can be paired with computing
