HIGH PERFORMANCE COMPUTING
The challenges of exascale development
ROBERT ROE TAKES A LOOK AT THE CHALLENGES FACING THE DEVELOPMENT OF THE FIRST GENERATION OF EXASCALE SUPERCOMPUTERS
The recent announcement that the first scheduled US exascale system will be AMD-based, following delays to the Intel-developed Aurora system, shows that the challenges of developing these record-breaking HPC systems are causing setbacks to the original exascale timeline. Intel has struggled to deliver the 7nm processors that will be used in the Aurora system, announcing delays earlier this year. It had been thought that this might delay the Aurora system, and this has now been confirmed in an InsideHPC interview with US Under Secretary for Science Paul Dabbar, who commented: ‘They’re getting very close to the first machines that are going to be delivered next year. And the first one is going to be at Oak Ridge.’ This statement makes it clear that the Oak Ridge system, Frontier, will be delivered first, with the Aurora system at Argonne further delayed until Intel can overcome its manufacturing challenges and deliver the 7nm chips. It is now understood that Intel’s 7nm chips for PC users will shift to late 2022 or early 2023, while server-grade CPUs are now planned to launch in late 2023. The company’s 7nm GPU, codenamed Ponte Vecchio, will also be delayed until 2023.
4 Scientific Computing World Autumn 2020
Chip rivals Nvidia and AMD are both delivering 7nm chips manufactured by TSMC. Dabbar noted: ‘We are in discussions with Intel about that. I think we’re feeling good about the overall machine. I can’t go through exactly all the different options that we’re looking at for the Argonne machine, but we have a good degree of confidence that not very long after the Oak Ridge machine – that will be delivered as part of a plan to have at least one (exascale) machine up in 2021 – but not long after that we will have the Aurora machine, the Argonne machine, also.
‘The details are still being identified about exactly what we’re going to go through with Intel and their microelectronics. But we have confidence that the machine will also be delivered, and will be delivered right behind Oak Ridge. Our major partners have different components of the hardware and software stack,’ said Dabbar.
‘One of the things that will be occurring with the exascale programme is the Shasta software stack that is … already developed to a large degree by HPE Cray. That’s something they’re developing as part of running their system. They actually have deployed early versions of that. Some of their other machines that they have that are not exascale, so they already have the earlier versions that have been de-risked.’
Dabbar stressed that a big part of the deployment process is focused on developing the integration between hardware and software. ‘The software stack that is going to be riding on top of the hardware is going to be integrated. So a lot of the discussion today, and part of the deployment, is that layering of the operating system on top of the hardware.’
Although a delay to the Intel-based system had been expected since problems with the 7nm chips were first announced earlier this year, this is the first confirmation that the exascale system developed by Intel will be delayed. This highlights the difficulty of developing systems that must deliver performance and reliability at an unprecedented scale.
Calculated risks
Professor Satoshi Matsuoka, director of Riken Center for Computational Science,
@scwmagazine | www.scientific-computing.com