HIGH PERFORMANCE COMPUTING

While the Sage2 project
looks to develop future alternatives to current storage technology, today scientists with fast, large-scale storage requirements often select a parallel file system. This technology is well known to be highly scalable and to deliver performance in line with the underlying hardware.
Removing bottlenecks

Today many HPC manufacturers offer different approaches to tiered data management. The aim is to provide just as much ‘fast’ storage as is necessary to deliver performance, with larger or less frequently used files kept on ‘slower’ storage technologies to reduce the overall cost. Underlying that performance, however, has to be reliability, as downtime on large-scale systems can be incredibly costly.
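To make the tiering idea concrete, the short sketch below shows one possible placement policy: files are assigned to a ‘fast’ or ‘slow’ tier based on their size and how recently they were accessed. The tier names, the 64MiB size cut-off and the 30-day cold threshold are illustrative assumptions, not details of any vendor’s product or of the Sage2 design.

```python
# Hypothetical two-tier placement policy: small, recently used files on the
# 'fast' tier, large or cold files on the cheaper 'slow' tier.
import time
from dataclasses import dataclass
from typing import Optional

FAST_TIER_MAX_BYTES = 64 * 1024 * 1024   # assumed cut-off: 64 MiB
COLD_AFTER_SECONDS = 30 * 24 * 3600      # assumed cut-off: 30 days without access

@dataclass
class FileInfo:
    path: str
    size_bytes: int
    last_access: float  # POSIX timestamp of the most recent read/write

def choose_tier(f: FileInfo, now: Optional[float] = None) -> str:
    """Return 'fast' or 'slow' for a file, based on size and access recency."""
    now = time.time() if now is None else now
    recently_used = (now - f.last_access) < COLD_AFTER_SECONDS
    if f.size_bytes <= FAST_TIER_MAX_BYTES and recently_used:
        return "fast"   # small, active files benefit most from flash
    return "slow"       # large or cold files sit on cheaper capacity storage

# Example: a 2 GiB results file untouched for 90 days lands on the slow tier.
old_file = FileInfo("/data/run42/output.h5", 2 * 1024**3,
                    time.time() - 90 * 24 * 3600)
print(choose_tier(old_file))  # -> slow
```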
Robert Murphy, director of product marketing at Panasas, comments on the importance of reliability and ease of use: ‘The majority of people in the research space use things like XFS and Isilon, and the reason they pick those is because they work out of the box. Using Panasas today you can get the great reliability that we have always been known for, and now the great performance and price-performance that off-the-shelf hardware can provide.
‘Whether you are running a single sequence or hundreds of thousands of sequences, the system is going to behave in a predictable manner. All those other NFS protocol access systems work great to a certain point, but when things really start to hammer them those systems will start to fall off in performance,’ added Murphy. ‘What makes Panasas unique from that perspective is that, just like they did at UC San Diego, they can take out the NFS protocol system and put in a Panasas system that is just as reliable and just as easy to deploy as the systems they have been used to all of
these years. But when they really load them up, and more and more data is being thrown at the cluster, the Panasas system gives them that great linear response curve as they scale up.’

Panasas helped UC San Diego
to develop its storage architecture by adopting more HPC-focused technology in the form of a parallel file system. The extra storage capacity will help researchers accelerate their work by removing bottlenecks in processing and storing the sequencing data that is critical to their research. The UC San Diego Center for
Microbiome Innovation is a research centre on the UC San Diego campus that has begun research focusing on COVID-19. It is hoped that a better understanding of the gut microbiome can tell us more about the virus and its
interaction with people. In April, the University of California San Diego announced it was expanding a collaboration with IBM, ‘AI for Healthy Living (AIHL)’, in order to help tackle the COVID-19 pandemic. AI for Healthy Living is a multi-year partnership leveraging a unique, pre-existing cohort of adults in a senior living facility to study healthy aging and the human microbiome, as part of IBM’s AI Horizons Network of university partners.
‘In this challenging time, we are pleased to be leveraging our existing work and momentum with IBM through the AI for Healthy Living collaboration in order to address COVID-19,’ said Rob Knight, UC San Diego professor of pediatrics, computer science and bioengineering, and co-director of the IBM-UCSD Artificial Intelligence Center for Healthy Living. Knight is also director of the UC San Diego Center for Microbiome Innovation.

Panasas’ Murphy noted that
while some customers may have had bad experiences with complex file systems, Panasas’ focus on ease of use and accessibility has made deploying and managing a parallel file system much easier than it was just a few years ago. ‘The big thing is that other parallel file systems need to be tuned for a specific application, either big files or small files or whatever specific workload is coming at it. With Panasas you do not have that requirement.’

One example is the
Minnesota Supercomputing Institute, which uses Panasas storage to power its supercomputer. ‘They have chemistry and life sciences codes, but they also have astrophysics, engineering and many more types of code that all run on the Panasas parallel file system,’ added Murphy.
Delivering performance

Delivering performance across all workloads is a benefit of the tiered approach to parallel file system storage. As Curtis Anderson, senior software architect at Panasas, explains, using each storage device for what it does best delivers the best price and performance. ‘The fundamental difference
is in how we approach different types of storage devices. That underlies all of the software technology, because we are basically a software company,’ notes Anderson. ‘In a traditional system it will be 100 per cent disk or 100 per cent flash, and they might have an LRU caching layer on top. What we have done instead is merge the two layers and then scale them out, so that every one of our storage nodes has a certain amount of “tiering”, even though we do not call it that internally,’ added Anderson.

The prototype at the Jülich Supercomputing Centre, Germany, used for SAGE and Sage2

For example, the new
ActiveStor Ultra from Panasas has six hard drives, SATA flash, NVMe flash and an NVDIMM stick. ‘The hard drives are really good at delivering bandwidth if all you do is give them large sequential transfers. We put all of the small files on top of the SATA flash, which is really good at handling small file transfers. The metadata is very latency-sensitive, so we move that to a database sitting on a super-fast NVMe stick, and then we use NVDIMM for our transaction log,’ said Anderson. ‘The net effect of all that is spreading the tiering across the cluster.’ This architecture is designed to ensure that there is just enough of each component to deliver the required performance without unnecessary cost. ‘In this way, every type of
storage device is doing what it is good at. Hard drives are doing large transfers. SATA flash – we have just enough for the small transfers. We have just enough NVMe for the metadata that we need to store,’ concluded Anderson. ‘That is how you get both a cost-effective and broad performance profile.’
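To make the split Anderson describes more concrete, here is a minimal sketch of how writes could be routed to those device classes within a node: large sequential file data to hard drives, small files to SATA flash, metadata to the NVMe-backed store, and the transaction log to NVDIMM. The function, the class names and the 1MiB small-file cut-off are illustrative assumptions rather than the actual PanFS implementation.

```python
# Illustrative routing of writes to device classes, loosely following the
# split described above; the names and thresholds here are assumptions.
from enum import Enum

class Device(Enum):
    HDD = "hard drives"        # good at large sequential transfers
    SATA_FLASH = "SATA flash"  # good at small file transfers
    NVME = "NVMe flash"        # latency-sensitive metadata database
    NVDIMM = "NVDIMM"          # transaction log

SMALL_FILE_LIMIT = 1 * 1024 * 1024  # assumed small-file cut-off: 1 MiB

def route_write(payload: str, size_bytes: int = 0) -> Device:
    """Pick a device class for a write within a single storage node."""
    if payload == "transaction_log":
        return Device.NVDIMM
    if payload == "metadata":
        return Device.NVME
    if payload == "file_data" and size_bytes <= SMALL_FILE_LIMIT:
        return Device.SATA_FLASH
    # Large file data goes to the hard drives as big sequential writes.
    return Device.HDD

print(route_write("metadata"))                 # Device.NVME
print(route_write("file_data", 512 * 1024))    # Device.SATA_FLASH
print(route_write("file_data", 4 * 1024**3))   # Device.HDD
```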