SCW August/September 2018


HIGH PERFORMANCE COMPUTING

Performance tuning by setting goals: how realism guides evolution vs revolution

James Reinders is a parallel programming and HPC expert with more than 27 years’ experience working for Intel until his retirement in 2017. In this article, Reinders gives his take on the use of roofline estimation as a tool for code optimisation in HPC.

Roofline analysis is a technique that brings realism to optimisation targets. It tells us when we have tuned all we can (assuming evolution of our code), which may uncover the unsettling fact that we need a new algorithm (revolution). As a long-time teacher of optimisation techniques, I can confidently say that Roofline analysis is a must-have for anyone optimising for performance. This has not always been the case. As I will explain, today it is an important technique to draw upon when doing performance optimisation.

“Roofline analysis – a technique to know when we’ve tuned all we can (evolution) which may uncover the unsettling fact that we need a new algorithm (revolution)”

4 Scientific Computing World August/September 2017

When I mention Roofline analysis, I am often asked ‘Hasn’t that been around a while?’, usually followed by ‘What’s new?’ Excellent questions. The answers revolve around two factors: (1) the complexities (latency hiding through parallelism and memory hierarchies) of optimising for today’s processing architectures, including CPUs, GPUs, and accelerators of all kinds, and (2) new tools, based on new research, to help us deal with these complexities. In the face of increasingly complicated systems, Roofline analysis provides us with a step-by-step method to ascertain whether an algorithm has reached the end of its ability to provide more performance through continued optimisation work.
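The central check in such a method can be sketched in a few lines. In the commonly used form of the roofline model, attainable performance is bounded by the lesser of the peak compute rate and the memory bandwidth multiplied by the kernel's arithmetic intensity (floating-point operations per byte moved). The function names and machine figures below are hypothetical placeholders chosen for illustration, not measurements of any real system.

```python
def roofline_bound(peak_gflops, bandwidth_gbs, intensity):
    """Upper bound on GFLOP/s for a kernel of the given FLOP/byte intensity."""
    return min(peak_gflops, bandwidth_gbs * intensity)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Intensity (FLOP/byte) at which the memory and compute roofs meet."""
    return peak_gflops / bandwidth_gbs


def classify(peak_gflops, bandwidth_gbs, intensity):
    """Say which roof limits a kernel of the given intensity."""
    if intensity < ridge_point(peak_gflops, bandwidth_gbs):
        return "memory-bound"
    return "compute-bound"


def fraction_of_roof(measured_gflops, peak_gflops, bandwidth_gbs, intensity):
    """How close a measured rate is to its roofline bound (1.0 = at the roof)."""
    return measured_gflops / roofline_bound(peak_gflops, bandwidth_gbs, intensity)


# Hypothetical machine: 1000 GFLOP/s peak, 100 GB/s memory bandwidth.
PEAK, BW = 1000.0, 100.0

# Hypothetical kernel: 0.25 FLOP/byte, measured at 20 GFLOP/s.
bound = roofline_bound(PEAK, BW, 0.25)
kind = classify(PEAK, BW, 0.25)
closeness = fraction_of_roof(20.0, PEAK, BW, 0.25)
print(bound, kind, closeness)
```

A kernel that already sits close to its roof has little left to gain from further tuning of the same algorithm (evolution). Raising the roof itself means changing the arithmetic intensity, which usually calls for a different algorithm or data layout (revolution).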

Complexities in optimising for today’s systems

Today we are faced with a great diversity of compute devices, ranging from Intel Xeon Scalable processors and GPUs to more application-specific accelerators enabled by FPGA and ASIC technologies. It’s not the diversity that demands Roofline analysis, it’s the complexity of the architectures of the individual devices. Specifically, it is their complex abilities to hide latencies, and the sophisticated parallel compute capabilities and multilevel memory subsystems that play critical roles in such latency hiding. Years ago, performance optimisation was successful if we could reduce the number of instructions being executed. Such optimisations were nearly always rewarded by performance improvements. That is not the case today. Fortunately, Roofline analysis addresses these complications in optimisation work.
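A rough illustration of why cutting instructions no longer guarantees a speed-up: under a simplified model in which compute and memory traffic overlap perfectly, a kernel's run time cannot be lower than the larger of its compute time and its data-movement time. The sketch below applies that model to a triad-style loop, a[i] = b[i] + s * c[i], counting 2 floating-point operations and 24 bytes of traffic per element (three 8-byte doubles, ignoring caches and write-allocate traffic). The machine figures and names are hypothetical.

```python
def min_time_seconds(flops, bytes_moved, peak_flops, bandwidth):
    """Lower bound on run time when compute and memory traffic fully overlap."""
    compute_time = flops / peak_flops
    memory_time = bytes_moved / bandwidth
    return max(compute_time, memory_time)


# Hypothetical machine: 1 TFLOP/s peak compute, 100 GB/s memory bandwidth.
PEAK_FLOPS = 1.0e12
BANDWIDTH = 1.0e11

n = 100_000_000  # number of array elements

# Baseline triad: 2 FLOPs and 24 bytes per element.
baseline = min_time_seconds(2 * n, 24 * n, PEAK_FLOPS, BANDWIDTH)

# Halving the arithmetic leaves the bound unchanged: memory still dominates.
fewer_ops = min_time_seconds(1 * n, 24 * n, PEAK_FLOPS, BANDWIDTH)

# Halving the bytes moved (for example, single precision) halves the bound.
less_traffic = min_time_seconds(2 * n, 12 * n, PEAK_FLOPS, BANDWIDTH)

print(baseline, fewer_ops, less_traffic)
```

In this model the memory term dominates, so removing arithmetic does not move the bound, while reducing data movement does.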

New tools, new research, how to cope

The technique of Roofline analysis has recently seen a surge in study, resulting in some interesting papers and tutorials. Throughput optimisation techniques tend to be effective everywhere. Therefore, tuning investments using Roofline analysis done on an Intel Xeon Scalable processor-based server, where the development environments are rich and mature, will lead

@scwmagazine | www.scientific-computing.com

