in his ISC18 talk on NLAFET that, ‘Linear algebra is both fundamental and ubiquitous in computational science and its vast application areas’. So many HPC applications rely heavily on BLAS that even a small performance improvement can translate into huge aggregate savings in runtime across the HPC community. The Parallel Numerical Linear Algebra for Future Extreme Scale Systems (NLAFET) project is a high-profile example in which methods and hardware popularised for AI applications, including DAGs (directed acyclic graphs) and reduced-precision hardware, are used to map BLAS operations to various hardware platforms including CPUs, GPUs and FPGAs. Similarly, Alexander Heinecke (research scientist, Intel Labs) has created the open-source libXSMM library, which can speed up batched small linear algebra operations, including with techniques that use reduced-precision arithmetic while still preserving accuracy. The advent of optimised reduced-precision hardware for AI has brought the question of ‘how much numerical precision is enough?’ to the attention of computer scientists and is, in some sense, motivating the development of new numerical approaches, such as the efforts at King Abdullah University of Science and Technology (KAUST) to build an enhanced ecosystem of numerical tools. In particular, the HiCMA linear algebra library can operate on billion-by-billion matrices in workstations containing only gigabytes of memory.
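To make the idea of using reduced precision ‘while still preserving accuracy’ concrete, the sketch below shows one classic technique, mixed-precision iterative refinement: the expensive solve is done in (emulated) single precision, while the residual and the solution update are kept in double precision. This is an illustrative assumption only, not the method used by libXSMM, NLAFET or HiCMA; the helper names (`to_f32`, `solve_low_precision`, `refine`) and the small test matrix are hypothetical and chosen purely for the example.

```python
import struct


def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def solve_low_precision(A, b):
    """Gaussian elimination with partial pivoting, with every
    intermediate result rounded to single precision."""
    n = len(b)
    # Augmented matrix [A | b], stored as float32-representable values.
    M = [[to_f32(v) for v in row] + [to_f32(b[i])] for i, row in enumerate(A)]
    for k in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = to_f32(M[i][k] / M[k][k])
            for j in range(k, n + 1):
                M[i][j] = to_f32(M[i][j] - to_f32(f * M[k][j]))
    # Back substitution, also in emulated single precision.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n]
        for j in range(i + 1, n):
            s = to_f32(s - to_f32(M[i][j] * x[j]))
        x[i] = to_f32(s / M[i][i])
    return x


def residual(A, x, b):
    """r = b - A x, computed in full double precision."""
    return [bi - sum(a * xj for a, xj in zip(row, x)) for row, bi in zip(A, b)]


def refine(A, b, iterations=5):
    """Mixed-precision iterative refinement.

    For brevity each correction re-runs the whole elimination; a
    production code would factorise once and reuse the factors.
    """
    x = solve_low_precision(A, b)
    for _ in range(iterations):
        r = residual(A, x, b)           # high precision
        d = solve_low_precision(A, r)   # low precision
        x = [xi + di for xi, di in zip(x, d)]  # high-precision update
    return x


def max_error(x, y):
    return max(abs(a - c) for a, c in zip(x, y))


# Hypothetical, well-conditioned test problem with a known solution.
A = [[4.0, 1.0, 2.0], [1.0, 5.0, 1.0], [2.0, 1.0, 6.0]]
x_true = [1.0 / 3.0, -2.0 / 7.0, 0.1]
b = [sum(a * xj for a, xj in zip(row, x_true)) for row in A]

x_low = solve_low_precision(A, b)   # single-precision answer only
x_ref = refine(A, b)                # refined towards double precision

print("low-precision error:", max_error(x_low, x_true))
print("refined error:      ", max_error(x_ref, x_true))
```

The refinement only converges this quickly when the matrix is reasonably well conditioned, which is one practical answer to the question of how much precision is enough: it depends on the problem as much as on the hardware.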

Data flow architectures

New ideas, numerical methods, and programmable hardware devices such as FPGAs open the door to data flow architectures. Cloud providers such as Microsoft Azure are already using persistent neural networks for inference. In this case, the ANN is implemented directly on the FPGA: there is no program counter or executable as on a conventional CPU or GPU. Instead, data ‘flows’ through the computational elements on the FPGA to produce the desired output. The result is high performance, low latency and low power consumption. While the adoption of non-von Neumann data flow architectures in HPC and exascale supercomputers is still a matter of speculation, it is clear that scientists are currently laying the groundwork for their use. This is a natural merging of the use of DAGs in projects such as NLAFET and programmable hardware such as FPGAs.

As the ALCF posted, ‘Because current innovation is driven by collaboration among colleagues and between disciplines, the potential for discovery through the pragmatic application of new computational resources, coupled with unrestrained data flow, staggers the imagination.’ From new hardware to new approaches, the true impact of deep and machine learning on HPC is yet to be seen. In short, the biggest impact is not technological, but rather a change in mindset that is stimulating innovation and new approaches to decades-old technologies and problems. Thus, we are at the start of a point of inflection brought about by the popularity of AI hardware, which a host of bright and innovative scientists are exploiting to bring about the convergence of AI and HPC. The ramifications are difficult to predict but will be extraordinarily fun to see.

A fully referenced version of this feature is available online

Rob Farber was a pioneer in the field of neural networks while on staff as a scientist in the theoretical division at Los Alamos National Laboratory. He works with companies and national laboratories as a consultant, and also teaches about HPC and AI technology worldwide. Rob can be reached at

