HIGH PERFORMANCE COMPUTING
time, the Microsoft team, led by distinguished engineer Marc Tremblay, has been developing systems for Azure and has been enhancing machine vision and natural language processing (NLP) models on the Intelligence Processing Unit (IPU), Graphcore's parallel processor.

Matt Fyles, VP for software at Graphcore, explains the company's history and the development of the IPU technology, which was originally designed, albeit many years ago, around the acceleration of the matrix multiplication workloads that are common in HPC: ‘Graphcore started around the premise that we did not believe there was an existing architecture designed for this purpose. We had all the people from the software and hardware side in Bristol who had a lot of experience in building these processors, so we could start a new company to build a new product around an architecture specifically for machine learning.’
‘In 2006 I worked at a company called ClearSpeed in Bristol, which was a parallel processing company. Alongside Sun Microsystems we built the first accelerated supercomputer. It wasn’t using GPUs; at that point we were using a custom processor architecture.’

‘The key point was that interconnect technology such as PCI-Express had become fast enough that you could offload work from a CPU to another processor without it being such a huge problem. There were also workloads that were basically large linear algebra matrix multiplication workloads, which meant that you could just build a processor that was very good at that. It was the cornerstone of a lot of existing applications in HPC,’ Fyles stated.
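The offload pattern Fyles describes, a fast host interconnect feeding a processor built for matrix multiplication, can be sketched in a few lines. The snippet below uses PyTorch purely as an illustration (it is not Graphcore's software stack), and the device choice and matrix sizes are arbitrary assumptions:

```python
import torch

# Operands start in host (CPU) memory, as in the host/offload model.
a = torch.randn(4096, 4096)
b = torch.randn(4096, 4096)

# Use an accelerator if one is present; otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# The offload step: move the operands across the interconnect
# (e.g. PCI-Express), multiply on the device, and copy the result back.
c = (a.to(device) @ b.to(device)).to("cpu")
```

The transfer across the interconnect is exactly why Fyles highlights PCI-Express speeds: offload only pays off once moving the operands is cheap relative to the computation being accelerated.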
The development of an offload engine for HPC applications was the genesis of the Graphcore IPU. While the technology has evolved a lot since then, it was not until the development of AI and ML on GPUs that the team saw the potential for the Graphcore IPU in these areas. ‘I would say that it did not really move out of HPC and into ML until 2012, when researchers took Nvidia’s CUDA stack and built what would become the first accelerated ML framework. This was done by a research group, not directly by Nvidia,’ said Fyles.
‘That enabled a whole new set of applications around ML. There has been a long transition where this technology was not used for ML, but it is still really built around those cornerstones of a very fast interconnect to a host CPU, with some form of offload, and accelerating matrix multiplication,’ comments Fyles.
‘The technology we are using today is a lot more sophisticated; it is a lot smaller and we can put more compute on a single chip, but fundamentally the programming model that we use at the very lowest level is the same as what we began using in 2006, when this whole world of accelerated computing was launched,’ stressed Fyles.
This is a somewhat transitional period for HPC users. The landscape is changing, not only from the perspective of hardware providers but also in the form of changing workloads. Evolving workloads and the explosion of AI and ML are driving new hardware and software, but they are also enabling new ways to do research and gain insight into science. This growth in users, spanning both traditional HPC users and enterprise customers that need access to scalable AI and ML hardware, presents a time of potentially huge growth for companies such as Graphcore. If smaller companies can develop specialised processing architectures for AI and get them to market in a timely manner, they could make huge strides in a short time. However, if Graphcore is to replace GPU infrastructure with its IPU technology, the company must ease the adoption of this new technology.
But, as Fyles notes, this is much easier in AI than it would be in HPC, due to the nature of the software frameworks. ‘This is one of the things that is slightly easier in the AI domain than it is in the HPC domain, because there are a set of standard frameworks or development environments that get used. In HPC you do not really have that, because a lot of it is bespoke, but in this space people write applications that are really abstracted from the hardware. If you can plug into the back of those frameworks, you can accelerate the application they are running,’ notes Fyles.
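Fyles's point about plugging into the back of frameworks can be made concrete with a short sketch. It uses PyTorch, chosen only as a familiar example of such a framework; the model and the device names are illustrative assumptions, not Graphcore's actual integration:

```python
import torch
import torch.nn as nn

# The application is expressed purely in framework terms;
# no hardware detail appears in the model definition.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

# A vendor that plugs a backend into the framework makes the same
# application run on its processor just by changing this string.
device = torch.device("cpu")  # e.g. "cuda" where that backend exists

model = model.to(device)
logits = model(torch.randn(32, 784, device=device))
```

Because the application only ever talks to the framework API, a hardware vendor that implements a backend can accelerate it without the application code changing, which is exactly the adoption shortcut Fyles contrasts with bespoke HPC codes.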
While the system is specifically optimised for AI and ML workloads, Fyles notes that a growing number of HPC workloads are being replaced or augmented by AI. One example he gave was analysing data from the Large Hadron Collider (LHC), where simulation and random number generation are used to try to predict what is happening. ‘Those types of models are being replaced with ML models that have been trained on the real data; they have been replaced with this probabilistic ML model. Other applications are focused on weather prediction, and that is really an image recognition challenge, because they feed the model simulated images that come out of the HPC simulations. They train an ML model on it and use that to infer what will happen with specific inputs. There are lots of parts of the standard simulation pipeline that are getting massive speed-ups from using these technologies,’ Fyles concluded.
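A toy sketch of the surrogate workflow Fyles describes, training an ML model on simulator output and then using cheap inference in place of the simulation, might look like the following. The `simulate` function and the scikit-learn regressor are hypothetical stand-ins, not any production pipeline:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Hypothetical stand-in for an expensive HPC simulation code.
def simulate(params: np.ndarray) -> np.ndarray:
    return np.sin(params).sum(axis=1)

# Run the simulator offline to build a training set...
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(2000, 4))
y = simulate(X)

# ...then train a statistical surrogate on the simulated data.
surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500).fit(X, y)

# Prediction is now a cheap forward pass rather than a full simulation run.
new_inputs = rng.uniform(-3.0, 3.0, size=(1, 4))
prediction = surrogate.predict(new_inputs)
```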
[Image: Graphcore’s IPU technology. Credit: Graphcore]