Aerospace & Defence
Exploring the Big Bang with Big Data
Most scientists believe that the universe began with a Big Bang slightly less than 14 billion years ago. While the evidence suggests this is the case, they cannot yet nail it down as an indisputable fact. Wayne Warren, chief technical officer of Raima, is following many of the projects that are homing in on the answer and notes that the amount of data involved means that some robust computing is essential
Much of modern science is based on the assumption that the Big Bang theory is correct. However, while we know a lot about what has happened in the last 13 billion years, we know far less about the first billion years - and particularly the first few milliseconds after the Big Bang. If we could find out more, our understanding of the universe would increase enormously. There are many projects working on this, and most produce huge amounts of data that have to be analysed. It is perhaps natural to think that the data rates involved (500–1,000 terabytes per second in some projects) would require a very highly specialised computer. However, the opposite is true - the primary requirement is for a computer that is robust enough to operate for many years and easy enough to operate that whole teams of researchers can quickly learn to use it. Naturally, the computer has to be powerful enough to cope with the incoming data, but most of this data goes straight into memory and is only recalled when it is required. This sort of data management is often called Big Data; it is used in many fields of endeavour and is evident in everyday applications too. Regardless of the application, data management is the key to success, and a closer look reveals that many of these systems are based on everyday computer hardware.
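As a simple illustration of the 'straight into memory, recalled on demand' pattern described above, the sketch below holds incoming samples in a fixed-size in-memory buffer and only pulls them out when an analysis step asks for them. It is a minimal sketch in Python; the buffer capacity and sample format are invented for illustration and are not taken from any real project.

```python
from collections import deque

# A minimal sketch of "straight into memory, recalled on demand":
# recent samples are held in a fixed-size in-memory ring buffer and
# only pulled out when an analysis step asks for them. The capacity
# and sample format here are illustrative, not from any real project.
BUFFER_CAPACITY = 1_000_000
buffer = deque(maxlen=BUFFER_CAPACITY)  # oldest samples drop off automatically

def ingest(sample: bytes) -> None:
    """Append an incoming sample; no disk I/O on the hot path."""
    buffer.append(sample)

def recall(n: int) -> list:
    """Fetch the n most recent samples only when analysis needs them."""
    return list(buffer)[-n:]

# Example: stream in synthetic samples, then recall a slice for analysis.
for i in range(10):
    ingest(f"sample-{i}".encode())
print(recall(3))  # [b'sample-7', b'sample-8', b'sample-9']
```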
As well as capturing the incoming data, a database management system has to be quickly searchable, must provide analysis and visualisation and, in most cases, must ensure data privacy. It will use advanced methods to extract and convert the vast amounts of raw data into concise, useful information.
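To make the 'raw data in, concise information out' idea concrete, here is a toy example using Python's built-in sqlite3 module as a stand-in embedded database - not one of the systems described in this article. Raw per-sensor readings are reduced to a compact per-sensor summary with a single aggregation query; the table and sensor names are invented.

```python
import sqlite3

# A toy illustration of "raw data in, concise information out", using
# Python's built-in sqlite3 as a stand-in embedded database (not one of
# the systems described in the article). Raw per-sensor readings are
# reduced to a compact per-sensor summary with a single query.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
db.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("alpha", 1.0), ("alpha", 3.0), ("beta", 2.0), ("beta", 4.0)],
)

# Convert the raw rows into concise, useful information: count, mean, peak.
for row in db.execute(
    "SELECT sensor, COUNT(*), AVG(value), MAX(value) "
    "FROM readings GROUP BY sensor ORDER BY sensor"
):
    print(row)  # ('alpha', 2, 2.0, 3.0) then ('beta', 2, 3.0, 4.0)
```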
Projects that are out of this world
This is often achieved by using parallel processing across hundreds of thousands of servers, which parallels some of the big science projects investigating the origins of the universe. An example is the Square Kilometre Array (SKA), a radio telescope project that uses thousands of receivers spread around the world, all working in unison.
The SKA is due for completion in about ten years, by which time the data processing requirement will be a million trillion (10¹⁸) floating point operations per second - orders of magnitude faster than is currently possible. Perhaps surprisingly, the scientists are confident that Big Data computer capabilities will evolve fast enough to meet their growing needs.
Another long-term project is just coming on-stream now and will provide a mass of data for analysis over the next 18-24 months. After a seven-and-a-half-year flight, NASA's Dawn space probe has entered orbit around Ceres, the first dwarf planet to be visited by a man-made spacecraft. Its mission is to survey the entire surface and relay data back to Earth. Scientists believe that Ceres is a 'proto-planet' that will one day join with other proto-planets to form a terrestrial planet. Dawn's data should help them understand this process, and hence the formation of the solar system.
For a successful survey, multiple sensors will need to take readings constantly, so Dawn's dataset will grow very quickly, and the receiving computers back on Earth are ready to collect and classify the data for in-depth analysis. There are plans to launch many more deep-space probes in the coming years, and naturally they share a lot of common DNA with their close cousins - Earth-orbiting satellites. It is just over 50 years since the first satellites were launched, and there are thought to be about 3,000 active satellites currently in orbit. Significantly, this figure is expected to double within five to ten years and, of course, their technical capabilities are growing rapidly. As a result, the amount of data being sent back to Earth is going to increase considerably. To help accommodate this, space scientists are transitioning their satellite and probe communications from radio wave technology to very much faster laser-based solutions.
As this happens, the goalposts that currently define big data will move significantly. However, the need for robust, simple database management systems will remain.
The need for new solutions
With such growth in data volumes, new, non-traditional data management systems will be needed. The most promising solution is embedded database technology, operating on different hardware and software platforms, providing local data management and data distribution capability. These embedded systems give local applications the intelligence to analyse and distribute data, make decisions autonomously, or summarise data efficiently for distribution to other related systems. As an example, Raima's RDM embedded database technology products are cross-platform, small-footprint, in-memory and hybrid database solutions, optimised for workgroup, real-time, embedded and mobile operating systems. They are designed for distributed architectures in resource-constrained environments, and developed to fully utilise multi-core processors.
Importantly, they are suitable for running on a wide variety of platforms, and support multiple APIs and configurations, giving developers a wide range of powerful programming options. In short, they are simple systems that provide massive data handling abilities using standard computer hardware.
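Raima's own RDM APIs are not reproduced here, but the in-memory/hybrid idea can be sketched with Python's sqlite3 module standing in for the embedded database: hot, short-lived data lives in a memory-only table while durable data sits in a file, and both are reachable through one connection. The file path and table names are invented for illustration.

```python
import sqlite3, tempfile, os

# Raima's own RDM APIs are not reproduced here; this is a rough sketch of
# the in-memory / on-disk hybrid idea using sqlite3 instead: hot,
# short-lived data lives in a memory-only table, durable data sits in a
# file, and both are reachable through one connection. Names are invented.
path = os.path.join(tempfile.mkdtemp(), "durable.db")
db = sqlite3.connect(path)                       # on-disk, survives restarts
db.execute("ATTACH DATABASE ':memory:' AS hot")  # in-memory, fast, volatile

db.execute("CREATE TABLE archive (t INTEGER, value REAL)")     # on disk
db.execute("CREATE TABLE hot.latest (t INTEGER, value REAL)")  # in memory

db.execute("INSERT INTO hot.latest VALUES (1, 0.5)")
# Periodically spill the in-memory rows to the durable store.
db.execute("INSERT INTO archive SELECT * FROM hot.latest")
db.commit()
print(db.execute("SELECT * FROM archive").fetchall())  # [(1, 0.5)]
```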
While this may sound very sophisticated, Raima realises that, increasingly, the people who use their systems are not data management specialists, but experts in their own field who need transparent computer systems that deliver the data they want, when they need it and in the format they require. This data may have been collected from multiple sources and processed, so that users get the information in a manner that is ideal for them.
RDM embedded systems are designed to provide rugged, scalable, local solutions for handling large amounts of data quickly. They can run on many platforms and support multiple processor and multi-core architectures. The RDM data storage engine is multi-functional, so it can provide the best performance for an embedded application, and it has a low CPU requirement yet delivers stability in resource-constrained systems, whether they are in-memory, on-disk or a combination of the two. Importantly, RDM makes data available wherever it is needed across a system, while its automatic recovery features ensure that data will not be lost to a system failure.
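RDM's automatic recovery mechanism itself is not shown here; the sketch below only illustrates the general principle behind such guarantees - transactional durability - again using sqlite3 as a stand-in. Work that was committed before a failure survives; work from an unfinished transaction is rolled back, so the database is never left half-updated.

```python
import sqlite3

# The article's claim is about RDM's automatic recovery; this only
# illustrates the underlying principle (transactional durability) using
# sqlite3 as a stand-in. Committed work survives a failure; work in an
# unfinished transaction is rolled back, keeping the data consistent.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE log (msg TEXT)")

db.execute("INSERT INTO log VALUES ('committed')")
db.commit()                          # transaction boundary: this row is safe

try:
    db.execute("INSERT INTO log VALUES ('in-flight')")
    raise RuntimeError("simulated crash mid-transaction")
except RuntimeError:
    db.rollback()                    # recovery discards the partial work

print(db.execute("SELECT msg FROM log").fetchall())  # [('committed',)]
```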
Research into the Big Bang is changing our understanding of the universe - and it is also helping to drive a new paradigm of data handling. Scientists need robust, connected applications with data management tools that can cope with a huge, dynamic and rapidly expanding network of smart devices, and this trend is reflected in many other fields too. It is therefore fair to say that big data is becoming a commodity that more and more people will use with less and less thought for how it operates.
www.raima.com