and GPFS are great for scratch file systems. They are great if I am going to bring in a large file set, pump it into a scheduler, and that thing is going to be spitting out intermediate data results to the file system.' Starr explained that while object storage fills a role similar to a traditional file system, it is best used when contextualising data on a lower-performance tier – such as archiving.

'The object storage domain tends to work better for large data sets that are being stored together with collections of metadata – it gives you a common format and a common set of semantics that tie both the metadata and the data together.' This can be particularly useful for scientific research, as it can provide much more than the simple raw data. Starr gave the example of an oil and gas company that may use metadata from a sensor, alongside seismic analysis data, to verify a data set or to repeat analysis at a later date. Starr said: 'I want to know all about that seismic information, not just "here is the raw data". I want to know where it was captured; I want sensor information, because I may be extrapolating "bad data".'
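In practice, the pattern Starr describes maps onto the object-storage interfaces most sites already use. The sketch below is a hypothetical illustration against the widely supported S3 API via Python's boto3 library – the endpoint, bucket, key, and metadata fields are all invented for the example, and any S3-compatible object store could stand behind it:

```python
import boto3

# Client pointed at a hypothetical S3-compatible object store.
s3 = boto3.client("s3", endpoint_url="https://objectstore.example.org")

# Data and metadata travel together under one key, with one set of semantics.
s3.put_object(
    Bucket="seismic-archive",               # hypothetical bucket
    Key="surveys/2016/survey_042.segy",     # hypothetical object key
    Body=b"...raw survey bytes...",         # stands in for the real survey data
    Metadata={                              # user metadata stored with the object
        "sensor-id": "hydrophone-17",
        "capture-location": "60.47N,3.55E",
        "calibration-state": "verified",
    },
)

# Later, the context can be recovered without pulling the data itself.
head = s3.head_object(Bucket="seismic-archive", Key="surveys/2016/survey_042.segy")
print(head["Metadata"])                     # {'sensor-id': 'hydrophone-17', ...}
```

Because the metadata rides with the object rather than in a separate database, the sensor provenance Starr wants is available wherever the data lands.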


The use of object storage opens up easier pathways for verification of data, more efficient data management, and even data analytics activities across large unstructured data reserves. However, DDN's Molly Rector stressed that, today, most HPC users are only using object storage for archiving purposes, because the performance is not comparable to a file system. She explained that most users deploy object storage for two main reasons – cost, and access to data across multiple data centres. 'It's the price of object storage versus having file system licences and higher-performance disks; it is much cheaper to put the data into WOS [DDN's object storage platform]. The other reason is if you want to move data very easily; here object storage can be really beneficial.'


Accelerating IO performance

One of the biggest developments in storage in recent years has been the removal of the bottleneck associated with moving data from storage to processor, otherwise known as IO (input/output) operations. Many major manufacturers have some form of IO acceleration within their storage portfolio, including DDN's Infinite Memory Engine (IME), Cray's DataWarp, and Spectra Logic's BlackPearl. It should be noted that BlackPearl is a storage platform that includes SSD to accelerate performance, rather than a specific IO acceleration appliance.

'We fundamentally believe that, as data sets get to be multi-petabyte, it is not feasible to try to maintain multiple different products that are not designed together,' said Rector. 'From an operational perspective, it is just not scalable. IME is just another piece of that. There are bottlenecks and there are inefficiencies in how the data gets moved around. IME is all about accelerating applications and, more specifically, taking IO bottlenecks out of the workflow.'
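The 'burst buffer' idea behind appliances such as IME and DataWarp can be sketched in a few lines: land bursty writes on a fast tier so the application is released quickly, then drain the data to the parallel file system in the background. The Python sketch below is a toy illustration of that flow – the directory names merely stand in for an NVMe pool and a parallel file system mount, and the real products do this transparently beneath the application:

```python
import shutil
import threading
from pathlib import Path
from queue import Queue

FAST_TIER = Path("nvme_burst")       # stands in for an SSD/NVMe burst pool
SLOW_TIER = Path("lustre_project")   # stands in for the parallel file system

drain_queue = Queue()

def write_checkpoint(name: str, data: bytes) -> None:
    """Land the write on the fast tier so the application returns quickly."""
    FAST_TIER.mkdir(parents=True, exist_ok=True)
    target = FAST_TIER / name
    target.write_bytes(data)
    drain_queue.put(target)          # schedule background migration

def drain_worker() -> None:
    """Move staged files to the slow tier at the file system's own pace."""
    SLOW_TIER.mkdir(parents=True, exist_ok=True)
    while True:
        src = drain_queue.get()
        shutil.move(str(src), str(SLOW_TIER / src.name))
        drain_queue.task_done()

threading.Thread(target=drain_worker, daemon=True).start()
write_checkpoint("step_0042.ckpt", b"\x00" * 1024)  # app sees only fast-tier latency
drain_queue.join()                   # toy example: wait for the drain to finish
```

The application pays only the fast-tier write cost; the slow, bandwidth-limited migration happens off the critical path, which is exactly the bottleneck these products target.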


Both Cray and DDN affirmed that they expect tiered storage to grow in the coming years, with DDN adding that it would be a standard for larger HPC sites, such as those in the Top500, within one or two years.

Bolding also commented on the success of Cray's DataWarp technology: 'We are selling a lot of DataWarp to a set of customers with a demand for high-bandwidth storage. DataWarp is our newest storage product, but already it is our second highest in terms of customer demand.'


He also stressed that, as part of Cray's systems-based mentality, the DataWarp technology is integrated with its proprietary Aries interconnect technology – something that no other storage provider can claim. 'It is a fast pool, and there is no way you can do that with IME or any other kind of platform. They do not sit on the fabric with the processors, so they have limitations on the bandwidth.'


The future of storage

Along with the incessant growth of data volumes, and the use of object storage and IO acceleration, the next trend in the development of storage technologies will revolve around increasing the intelligence of the storage stack, so that data can be moved and archived more efficiently. 'In the case of automated scientific systems that are doing massively parallel compute, you do not want to lock up cores, to lock up resources in the cluster, until you are completely ready.'

To alleviate this congestion of compute resources, Starr expects that 'control systems will move farther up into the software stack, so that the automated system starts to drive where data resides, rather than a policy-driven system hidden in an appliance.'


Removing this complexity from the current policy-driven system would allow control systems to pre-empt specific workflows, checking the availability of files so that resources can be allocated as soon as they become available. This development should help to reduce queue times and unnecessary stress on the IO performance of the storage system.
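As a rough illustration of what such workflow-aware control might look like, the hypothetical Python sketch below holds a job back until its input files are resident on the fast tier, so cores are only claimed once the data is in place – the paths, file names, and dispatch step are all invented for the example:

```python
import time
from pathlib import Path

STAGED = Path("fast_tier")           # stands in for the fast staging tier

def inputs_ready(files: list[str]) -> bool:
    """True only when every required input is resident on the fast tier."""
    return all((STAGED / name).exists() for name in files)

def submit_when_ready(job_id: str, files: list[str], poll_secs: float = 5.0) -> None:
    """Hold a job out of the queue until its data is in place."""
    while not inputs_ready(files):
        time.sleep(poll_secs)        # cores stay free for other work meanwhile
    print(f"dispatching {job_id}: all {len(files)} inputs staged")

# Demo setup only: pretend the staging system has already landed the files.
STAGED.mkdir(exist_ok=True)
for name in ("survey.h5", "sensor_meta.json"):
    (STAGED / name).touch()

submit_when_ready("seismic-run-01", ["survey.h5", "sensor_meta.json"])
```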


'Over the last 10 years or so, Lustre is what the market has been demanding'


Bolding indicated that he expects a further increase in the use of SSDs, particularly for IO acceleration. 'We are going to see an increase in the use of solid-state devices; they will still be more expensive than spinning disk, so they will mainly be used for near-line storage – very fast storage pools that need to be very flexible,' he said. 'Sometimes they need to act like a file system; sometimes they need to act as a buffer or a cache. If you want to put storage close to the compute, then you need to make it as flexible as possible in order to have it improve the speed of computation,' said Bolding.


All of these improvements ultimately focus on increasing performance – the rate at which data can be stored and retrieved – while simultaneously reducing congestion on the storage network, so that data can be moved quickly and efficiently across the system.

'Ultimately, we are looking at it from a systems perspective. How can you build a computing environment that has no limitations on bandwidth, storage, compute, or memory? That is the way that we look at the problem,' concluded Bolding.

