HPC_YEARBOOK

HPC 2013-14 | Big data

Taking a longer view

The concept of big data is nothing new, but the opportunities it can provide are. Beth Harlen reports

‘There has been a change in philosophy related to how organisations view data,’ said Gord Sissons, product marketing manager at IBM, who explained that, in the past, people would decide in advance what information they viewed as important, and devise ways to capture that information in a database where it could be manipulated and queried. ‘As the abundance of data is exploding and is increasingly available in electronic form, and as the costs of capturing and storing data are plummeting, I now see organisations thinking differently about data,’ he continued. ‘They are thinking, “Why not just store everything just in case we need it later?” While the relevance of data may not be recognised at the time of capture, analytic techniques are constantly improving, and captured data may prove to be important later on.’

Here lies both the challenge and the opportunity that ‘big data’ presents. While the term remains rather ill-defined, the general consensus is that it applies to the speed, size and sheer variety of data available today. As Sissons highlighted, the value placed on data is progressing in line with developments in key areas such as storage and analytics. But, of course, this all hinges upon the organisation being able to sustain an infrastructure that can cope with the deluge.

According to Steve Jenkins, VP for EMEA at MapR Technologies, big data becomes a problem for high-performance computing (HPC) when traditional information technology hardware and software can no longer contain, manage, and protect data that is growing rapidly in volume and variety, or provide insight into it in a timely manner. This can occur, he said, when sampling systems start to produce data faster than the storage technology can record the data stream. Another pain point for HPC is figuring out how to correlate structured and unstructured data from multiple sources in an efficient fashion. ‘Much of the world’s structured data is stored in traditional relational databases but
