HIGH PERFORMANCE COMPUTING
Reprogrammable HPC
FPGAs provide an early insight into possible architectural specialisation options for HPC and machine learning, writes Robert Roe
Architectural specialisation is one option to continue to improve performance beyond the limits imposed by the slowdown in Moore’s Law. Using application-specific hardware to accelerate an application, or part of one, allows the use of hardware that can be much more efficient in terms of both power usage and performance. As discussed in the ISC content on page 4, this is not a tactic that can be used for all applications, because of the inherent cost of building computing hardware for a single application or workflow. However, grouping related challenges together, or identifying key workloads or code that could benefit from acceleration, is likely to become an important part of increasing application performance. Some applications are well suited to acceleration technologies such as graphics processing units (GPUs) or field-programmable gate arrays (FPGAs), which can deliver a substantial boost in performance. GPU acceleration and architectural specialisation are not new concepts, but
some experts predict they will become increasingly common as a way to improve performance and lower the energy costs of future systems.

Researchers at Lawrence Berkeley National Laboratory’s Computer Architecture Group are using an FPGA to demonstrate the potential improvement for a code based on Density Functional Theory (DFT). The project, ‘DFT Beyond Moore’s Law: Extreme Hardware Specialization for the Future of HPC’, aims to demonstrate purpose-built architectures as a potential future for HPC applications in the absence of continued scaling of Moore’s Law. The final intention would be to develop plans for a custom application-specific integrated circuit (ASIC), but the initial development will be carried out on an FPGA. While the project is still in progress, it demonstrates how particular codes, or sections of code suitable for highly parallel execution, can be implemented on FPGA technology, which could supplement a CPU or
CPU/GPU-based computing system in the future.
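To give a flavour of the kind of code that maps well onto an FPGA, the sketch below shows a generic dense matrix-vector kernel written in the style accepted by high-level synthesis (HLS) tools. The kernel, its sizes and its pragmas are illustrative assumptions, not code from the Berkeley project.

// Illustrative only: a generic dense matrix-vector product written in the
// style accepted by FPGA high-level synthesis (HLS) tools, where the inner
// loop is pipelined so that one multiply-accumulate can start every clock
// cycle. Sizes and pragmas are assumptions, not taken from the DFT project.
#include <cstdio>

constexpr int N = 256;

void matvec(const float A[N][N], const float x[N], float y[N]) {
rows:
    for (int i = 0; i < N; ++i) {
        float acc = 0.0f;
cols:
        for (int j = 0; j < N; ++j) {
#pragma HLS PIPELINE II=1   // Xilinx-style directive: one iteration per cycle
            acc += A[i][j] * x[j];
        }
        y[i] = acc;
    }
}

// Minimal host-side check so the kernel also compiles and runs as ordinary
// C++ (a standard compiler simply ignores the HLS pragma).
int main() {
    static float A[N][N], x[N], y[N];
    for (int i = 0; i < N; ++i) {
        x[i] = 1.0f;
        for (int j = 0; j < N; ++j) A[i][j] = (i == j) ? 2.0f : 0.0f;
    }
    matvec(A, x, y);
    std::printf("y[0] = %f\n", y[0]);   // expect 2.0 for this diagonal matrix
    return 0;
}

Because the loop structure and data movement are fixed at compile time, an HLS tool can replicate or deepen such a loop nest to match the available logic, which is where the efficiency gains over a general-purpose core come from.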
The search for dark matter
CERN is using Xilinx FPGAs to accelerate inference and sensor pre-processing workloads in its search for dark matter. The researchers behind the project are using the FPGAs in combination with CERN’s other computing resources to process massive quantities of high-energy particle physics data at extremely fast rates, to find clues to the origins of the universe. This requires filtering sensor data in real time to identify novel particle substructures that could contain evidence of the existence of dark matter and other physical phenomena.

A growing team of physicists and engineers from CERN,
Fermilab (Fermi National Accelerator Laboratory), the Massachusetts Institute of Technology (MIT), the University of Illinois at Chicago (UIC) and the University of Florida (UF), led by Philip Harris of MIT and Nhan Tran, a Wilson Fellow at Fermilab, wanted a flexible way to optimise custom event filters for the Compact Muon Solenoid (CMS) detector they are working on at CERN. The very high data rates of up to 150 terabytes per second in the CMS detector require event processing in real time, but the development of trigger filter algorithms was hindering the team’s ability to make progress. Harris explained the idea behind the project: ‘We were inspired after talking to a few people who had been working
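The sketch below gives a rough sense of what such a real-time filter does: a cheap cut rejects obviously uninteresting events, and a tiny neural-network layer scores the rest. Every name, size, threshold and weight in it is invented for illustration; the collaboration’s actual trigger algorithms, and the tools used to compile them into FPGA firmware, are far more sophisticated.

// Toy illustration of a trigger-style event filter: a simple energy cut
// followed by a one-hidden-layer neural-network score, written so that an
// HLS tool could map it onto FPGA logic. All names, sizes, thresholds and
// weights here are invented for the sketch, not taken from the CMS trigger.
#include <cmath>
#include <cstdio>

constexpr int N_FEATURES = 8;   // per-event summary features (assumed)
constexpr int N_HIDDEN   = 4;   // toy hidden-layer width (assumed)

struct Event {
    float features[N_FEATURES]; // e.g. jet energies and momenta (illustrative)
    float total_energy;
};

// Stage 1: reject obviously uninteresting events with a cheap threshold cut.
inline bool passes_cut(const Event& e, float threshold = 20.0f) {
    return e.total_energy > threshold;
}

// Stage 2: classifier score from a tiny dense network. In firmware the
// weights would be trained offline and baked into the logic, and std::exp
// would be replaced by a lookup table or other hardware-friendly form.
inline float nn_score(const Event& e,
                      const float w1[N_HIDDEN][N_FEATURES],
                      const float b1[N_HIDDEN],
                      const float w2[N_HIDDEN], float b2) {
    float hidden[N_HIDDEN];
layer1:
    for (int i = 0; i < N_HIDDEN; ++i) {
#pragma HLS PIPELINE II=1   // Xilinx-style directive: pipeline the dot products
        float acc = b1[i];
        for (int j = 0; j < N_FEATURES; ++j) acc += w1[i][j] * e.features[j];
        hidden[i] = acc > 0.0f ? acc : 0.0f;   // ReLU activation
    }
    float out = b2;
    for (int i = 0; i < N_HIDDEN; ++i) out += w2[i] * hidden[i];
    return 1.0f / (1.0f + std::exp(-out));     // sigmoid score in [0, 1]
}

// Keep an event only if it survives the cut and the classifier is confident.
inline bool keep_event(const Event& e,
                       const float w1[N_HIDDEN][N_FEATURES],
                       const float b1[N_HIDDEN],
                       const float w2[N_HIDDEN], float b2) {
    return passes_cut(e) && nn_score(e, w1, b1, w2, b2) > 0.5f;
}

int main() {
    // With zero weights and an output bias of 1.0 the score is sigmoid(1),
    // about 0.73, so the decision here reduces to the energy cut alone.
    static float w1[N_HIDDEN][N_FEATURES] = {}, b1[N_HIDDEN] = {}, w2[N_HIDDEN] = {};
    Event e = {};
    e.total_energy = 42.0f;
    std::printf("keep event? %d\n", keep_event(e, w1, b1, w2, 1.0f));
    return 0;
}

The appeal of the FPGA here is latency: a filter of this general shape can be evaluated in a fixed, very small number of clock cycles for every event, which is what makes selection at these data rates feasible in real time.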