SCW Spring 2020

HIGH PERFORMANCE COMPUTING

products feature a proprietary valve design that delivers optimal flow,’ stated Langer. ‘Any liquid cooling implementation, regardless of manufacturer, will save customers money because the systems are more efficient, but end customers should care about QD reliability and ease of use as well.

‘If a site currently is not running with liquid cooling, conversion to liquid cooling is most likely on the facilities master plan.

‘Research centres have expert teams evaluating capital investment and operational expenses to better understand overall TCO.’

Sustainability was a primary concern for Asperitas as it aimed to develop its technology not just for HPC but also for hyperscale, which requires much higher efficiency than a typical HPC installation.

‘It is a combination of different elements. We started to develop the solution for scale and that is very different than a solution for HPC and supercomputing, which is always a custom solution that is always a niche project. You might do one or two projects a year as a solutions provider,’ explained Bouricius. ‘In our case, the end game that we had in mind was a solution that could also be used by hyperscale cloud providers. That was the extreme scale that we developed this for.’

The Asperitas team achieved increased efficiency by designing the system around natural convection of the liquid, based on temperature, rather than using pumps. ‘This means that the cooling fluid is not moved around by pumps or any other mechanical force but by natural convection,’ added Bouricius.

‘This makes it a very stable and reliable solution. We do not mix the fluids because this results in a lower average temperature. We have a layering of temperatures where all the components that

8 Scientific Computing World Spring 2020

need to be cooled are at a lower temperature and in the top layer of the system the fluid temperature is quite high. This allows for the solution to be rather independent of climate, because we can cool the system with a water temperature of 45 degrees and we can still optimise for other temperatures.’

Cooling the next generation

Mertel and his colleagues at TMGcore decided to embrace the high-performance potential of two-phase immersion cooling technology, focusing on highly dense solutions and trying to reduce overall system size.

‘Our current generation of products focuses on high-density applications. The most powerful blade server that we have developed is 6,000 watts. It has 16 V100 Nvidia GPUs in it, an absurd amount of VRAM and dual Intel Scalable Processors. It is a beast of a server and it is a 1 OIU (one Otto Immersion unit); it is our standard one unit blade server form factor,’ said Mertel.

‘The types of densities that can be achieved through two-phase immersion really lend the technology towards servers that are able to consume large amounts of power. Today that is primarily GPU-based workloads, which is what a good number of HPC users require for their applications,’ Mertel added.

Mertel also noted that the company is developing its products for more generalised CPU-based systems. ‘The next generation of CPUs are, in terms of power consumption, starting to get on the same order of watts per square centimetre (W/cm²) as a GPU. Intel has already announced a 400-watt Intel Scalable processor TDP system and there are AMD offerings today that almost hit 300 watts, which is the same as a V100 GPU.’

‘We know that there are going to be more powerful chips and we also know, it has been publicly announced by Nvidia, that there are going to be some architecture changes with respect to the Volta GPUs. The SXM3 interface is going to support 48V power input. Modern server architectures are designed to run at 12V. 48V is something that has become prevalent thanks mainly to the efforts going on in the Open Compute Project (OCP),’ Mertel noted.

TMGcore uses 48V as its rack-level distribution voltage, which gives the company an advantage when designing these systems, while some organisations would need to redevelop their power supplies or transition boards to 48V to support these technologies.

‘We have been somewhat fortunate that we had the opportunity to jump into a few of those development efforts.

‘We have been able to see what boards that are being built from the ground up and optimised for immersion cooling might look like,’ Mertel said. ‘We know that CPUs and GPUs are going to get denser and we have developed technologies that are available today which support a 500-watt chip the size of a V100 and we are working on the development of boiling enhancements that would allow us to go beyond that.’

Mertel also notes that the company is working with chipmakers to try and get ‘boiling enhancements’ fitted to chips as they are designed, removing some of the older technology that was designed to support air and water cooling.

‘Firms building chips with integrated boiling enhancements, removing the integrated heat spreader that today would represent the primary thermal interface between the silicon and the heat sink or boilerplate. Replacing that directly with a boiling enhancement allows the chip to be directly submerged without any kind of heat sink,’ Mertel concluded.
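Mertel's remarks on 48V versus 12V distribution can be made concrete with some simple arithmetic. The short Python sketch below is illustrative only: it takes the 6,000-watt blade figure and the two voltage levels quoted in the article, and applies the basic relations P = V × I and P_loss = I² × R. The function names and the 1 milliohm path resistance are hypothetical placeholders chosen for the example, not figures from TMGcore or Nvidia.

```python
# Illustrative arithmetic only. The 6,000 W blade and the 12 V / 48 V levels
# are taken from the article; the path resistance is a made-up placeholder.


def supply_current(power_w: float, voltage_v: float) -> float:
    """Current in amps needed to deliver power_w at voltage_v (P = V * I)."""
    return power_w / voltage_v


def conduction_loss(power_w: float, voltage_v: float, resistance_ohm: float) -> float:
    """Resistive loss in watts along the supply path (P_loss = I^2 * R)."""
    current = supply_current(power_w, voltage_v)
    return current ** 2 * resistance_ohm


BLADE_POWER_W = 6000.0        # blade power quoted by Mertel
PATH_RESISTANCE_OHM = 0.001   # hypothetical 1 milliohm distribution path

for volts in (12.0, 48.0):
    amps = supply_current(BLADE_POWER_W, volts)
    loss = conduction_loss(BLADE_POWER_W, volts, PATH_RESISTANCE_OHM)
    print(f"{volts:.0f} V: {amps:.0f} A, {loss:.1f} W lost in the path")
```

Under these assumptions the same blade draws 500 amps at 12V but 125 amps at 48V. Because resistive loss scales with the square of the current, the loss in an identical path is 16 times lower at the higher voltage, whatever resistance value is plugged in.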

@scwmagazine | www.scientific-computing.com

