

HIGH PERFORMANCE COMPUTING


“AI has been at the forefront of changing the way we interact with devices, and companies are transforming their businesses with new AI services”




Going further

‘Today’s high-end AI systems incorporate eight of our Tesla V100 GPUs connected with NVLink, through a hybrid cube mesh topology.

‘Can we go higher? Can we add more GPUs to provide our AI community with even more powerful GPUs for training?’ asked Buck. ‘One of the challenges to that is how do we scale up NVLink? When we first introduced NVLink it was a point-to-point high-speed interconnect offering 300 GB/s of bandwidth per GPU; we directly connected all of the GPUs together in what we called a hybrid cube mesh topology.

‘To go further we need to expand the capabilities of our NVLink fabric. With that we have invented a new product, which we call the NVSwitch. It is a fully switched device for building an NVLink fabric, providing up to 18 NVLink ports with 50 GB/s per port, giving a grand total of 900 GB/s of bandwidth in this fully connected internal crossbar, which is actually a two-billion-transistor switch.’


The NVSwitch fabric can enable up to 16 Tesla V100 GPUs to communicate at a speed of 2.4 terabytes per second. Huang declared that ‘the world wants a gigantic GPU; not a big one, a gigantic one; not a huge one, a gigantic one.’
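Those figures are internally consistent. As a quick sanity check (the port count, per-port rate and per-GPU NVLink bandwidth are the numbers quoted above; reading the 2.4 terabytes per second as the traffic across an 8-and-8 split of the 16 GPUs is one plausible interpretation, not something spelled out in the keynote):

    #include <stdio.h>

    int main(void) {
        /* Figures quoted in the keynote */
        const int    ports_per_switch = 18;    /* NVLink ports on one NVSwitch */
        const double gb_per_port      = 50.0;  /* GB/s per port                */
        const double gb_per_gpu       = 300.0; /* NVLink bandwidth per V100    */
        const int    gpus             = 16;    /* GPUs in the NVSwitch fabric  */

        /* 18 ports x 50 GB/s = 900 GB/s through one switch */
        printf("per-switch aggregate: %.0f GB/s\n",
               ports_per_switch * gb_per_port);

        /* Assumed reading of the 2.4 TB/s figure: with the 16 GPUs split
           8-and-8, each of the 8 GPUs on one side drives its full 300 GB/s
           across the divide: 8 x 300 GB/s = 2,400 GB/s = 2.4 TB/s */
        printf("cross-fabric estimate: %.1f TB/s\n",
               (gpus / 2) * gb_per_gpu / 1000.0);
        return 0;
    }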




Combining these new technologies into a single platform gives you Huang’s next announcement: the Nvidia DGX-2. Huang explained to the crowd at the conference keynote that the DGX-2 is the first single server capable of delivering two petaflops of computational power. The DGX-2 has the deep learning processing power of approximately 300 servers occupying 15 racks of datacentre space, while being 60 times smaller and, the company claimed, up to 18 times more power-efficient.
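The two-petaflop claim is easy to reconstruct: the DGX-2 carries 16 V100s, and each V100 is rated at 125 teraflops of mixed-precision tensor-core performance (that per-GPU rating comes from Nvidia’s published V100 specifications, not from the keynote excerpt above):

    #include <stdio.h>

    int main(void) {
        const int    gpus          = 16;    /* Tesla V100s in one DGX-2 */
        const double tensor_tflops = 125.0; /* per-GPU tensor-core peak */

        /* 16 x 125 Tflops = 2,000 Tflops = 2 petaflops */
        printf("DGX-2 peak: %.0f Tflops (%.1f Pflops)\n",
               gpus * tensor_tflops, gpus * tensor_tflops / 1000.0);
        return 0;
    }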


Disrupting HPC

As noted in the processor feature in the February/March issue of Scientific Computing World, Intel is developing FPGA technology through its acquisition of Altera and the newly formed Intel Programmable Solutions Group (PSG). Later this year the company will release its Programmable Acceleration Card (PAC), a PCIe-based FPGA accelerator card built around the Arria 10 FPGA. The PAC hooks into the server through PCIe Gen3 and comes with 8 GB of DDR4 memory, along with 128 MB of flash. Intel will also provide a supporting software stack, to give users additional support and encouragement to pick up the new technology. Some of this will be based on Intel’s own efforts, known as the Open Programmable Acceleration Engine (OPAE) technology, but support for OpenCL will also be included.
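OPAE is published as open source, and its C API follows a simple enumerate-then-open pattern. The sketch below is modelled on the hello_fpga sample in Intel’s OPAE SDK; it is a minimal illustration with error handling trimmed (a real host program would also filter on the accelerator’s GUID), not a description of anything specific to the PAC.

    #include <stdio.h>
    #include <opae/fpga.h>

    int main(void) {
        fpga_properties filter  = NULL;
        fpga_token      token;
        fpga_handle     handle;
        uint32_t        matches = 0;

        /* Build a filter that matches accelerator functions */
        fpgaGetProperties(NULL, &filter);
        fpgaPropertiesSetObjectType(filter, FPGA_ACCELERATOR);

        /* Find the first matching accelerator in the system */
        fpgaEnumerate(&filter, 1, &token, 1, &matches);
        if (matches < 1) {
            fprintf(stderr, "no accelerator found\n");
            return 1;
        }

        /* Open it; from here a host program would map MMIO
           registers and share DMA buffers with the FPGA */
        fpgaOpen(token, &handle, 0);
        printf("accelerator opened\n");

        fpgaClose(handle);
        fpgaDestroyToken(&token);
        fpgaDestroyProperties(&filter);
        return 0;
    }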


While this will go a long way to overcoming some of the trepidation around the use of FPGA technology, Intel PSG is still targeting specific applications to demonstrate the potential of FPGAs. Mike Strickland, director, solutions architect, Intel Programmable Solutions Group, noted that networking acceleration is one key aspect the company is exploring, as the additional bandwidth provided by FPGAs could be useful in this area.

‘You can take networking traffic directly into the FPGA, or the FPGA can access data directly. For instance, if you have an FPGA connected to a Xeon processor, there is an embedded PCI Express switch in the Xeon, so the FPGA could access NVMe drives and pull data directly from those drives. For data analytics acceleration there is really no way to achieve that level of performance without this multifunction play.


‘Some of the things we have been talking about for HPC carry over into the embedded space. One area where there is an absolute overlap is in military intelligence and traditional HPC. There is about an 80 per cent overlap because they have a lot of the same problems,’ added Strickland.


The addition of HBM2 memory to FPGAs could also be an important step, as it will allow higher data use and bandwidth. However, Strickland noted that, although it will be included with the PAC card, there has been less need for it in applications than might have been expected. ‘I can say that two years ago I thought





