INTERVIEWS: Industry opinion


What’s changing in the field of statistics? SCW finds out


The use of statistics is broadening, says Laurie Miles, head of analytics at SAS


There’s a buzz in the marketplace surrounding Big Data, but we focus on big analytics because it’s not the data that’s important, it’s what you do with it. The world has changed for statistics with the, if you’ll forgive the term, democratisation of analytics. It used to be that a few specialists would perform analyses in the back office, but point-and-click interfaces, wizard-driven tools and algorithms that self-learn have all meant that an increasing percentage of non-statisticians are now using analytics.


Increasingly, these tools are becoming more scalable, and growing data volumes mean that analyses need to run faster. Software is currently meeting this challenge, however.


“It’s only in the past five years that I’ve seen non-statisticians getting excited by the things you can do with data”


In the past, if someone wanted to run a complex mathematical model, such as a neural network, on a large volume of data, it might have begun on Monday morning and finished on Wednesday. We recently ran some tests with our new high-performance analytics software, and an enormous volume of data was run through a neural network in 38 seconds. Something that used to take days has come down to less than a minute. What this means is that analysts can try different approaches that previously wouldn’t have been possible due to time constraints. By having the flexibility to explore a variety of options, users get a greater level of accuracy and can take a more innovative approach.
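To make that point concrete, here is a minimal sketch in Python – using scikit-learn as a generic stand-in, not the SAS software discussed above – that times several candidate network architectures on the same data. The data set, sizes and architectures are illustrative assumptions; the idea is simply that when each fit takes seconds, several designs can be compared in one sitting rather than committing to a single multi-day run.

```python
import time
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for a large business data set
X, y = make_classification(n_samples=20_000, n_features=50, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fast fits make it practical to compare several architectures
for hidden in [(10,), (50,), (50, 25)]:
    start = time.perf_counter()
    clf = MLPClassifier(hidden_layer_sizes=hidden, max_iter=200, random_state=0)
    clf.fit(X_train, y_train)
    elapsed = time.perf_counter() - start
    print(f"{hidden}: accuracy={clf.score(X_test, y_test):.3f} in {elapsed:.1f}s")
```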


Analytics has been around for a long time, but it’s only in the past five years that I’ve seen non-statisticians getting excited by the things you can do with data. It’s still the tip of the iceberg, but in the next five years I believe we’ll see the use of statistics opening up even further. www.sas.com


Stephen Langdale, senior technical consultant at NAG, shares his view


On a basic level, there is a consideration of accuracy for every algorithm that’s written. The numerical stability has to be checked to ensure, for example, that there are no divide-by-zero errors or an accumulation of cancellation effects that could render the output meaningless. Furthermore, a lot of work needs to be done to get a computer to calculate reliable results consistently – a necessity, given that if the routines you’re relying on to make decisions are returning garbage, your decisions will be the same.
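As a generic illustration of those cancellation effects – not NAG code – the textbook one-pass variance formula subtracts two nearly equal quantities and can lose every significant digit, while the two-pass form stays stable:

```python
import numpy as np

# Data with a large mean relative to its spread exposes cancellation
rng = np.random.default_rng(0)
x = 1e8 + rng.standard_normal(100_000)  # true variance is ~1.0

# Naive one-pass formula: E[x^2] - E[x]^2 subtracts two huge,
# nearly equal numbers, so most significant digits cancel away.
naive = np.mean(x**2) - np.mean(x)**2

# Stable two-pass formula: subtract the mean first, then square.
stable = np.mean((x - np.mean(x))**2)

print(f"naive:  {naive:.6f}")   # can be wildly wrong, even negative
print(f"stable: {stable:.6f}")  # close to 1.0
```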


In terms of statistics, one of the things you need to look for is a wide variety of methods for modelling data, allowing problems to be approached from many angles. A library of fast, reliable methods is needed because, even though computers have advanced so much over the years, the amount of data people are collecting has increased many times over. It wasn’t that long ago that a data set of a few thousand records would be considered huge, whereas today many data sets are orders of magnitude larger.


As the number of computing cores continues to increase, it becomes a question of how to split up data and how to tune statistical methods, some of which were developed decades ago for serial computing. Random number generators – the building blocks of Bayesian estimation – are one key area that benefits greatly from threaded models, which allow a problem to be split up and run as concurrent calculations to save time.
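A sketch of that splitting, in Python with NumPy rather than the NAG Library: SeedSequence.spawn derives statistically independent child streams from one root seed, so concurrent workers can draw random numbers without overlapping. The estimator and sample sizes here are assumptions made purely for demonstration.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def partial_estimate(seed, n):
    # Each worker gets its own independent stream, so concurrent
    # draws do not overlap or correlate with one another.
    rng = np.random.default_rng(seed)
    return rng.standard_normal(n).mean()

if __name__ == "__main__":
    # Spawn independent child seeds from a single root seed
    children = np.random.SeedSequence(12345).spawn(4)
    with ProcessPoolExecutor(max_workers=4) as pool:
        parts = list(pool.map(partial_estimate, children, [250_000] * 4))
    # Equal-sized chunks, so the partial means combine directly
    print(f"combined mean estimate: {np.mean(parts):.6f}")
```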


“If the routines you’re relying on to make decisions return garbage, your decisions will be the same”


At NAG we have noticed that industry is moving away from simpler models towards more complex, computer-intensive ones. The NAG Library itself contains over 1,700 methods, including 11 chapters dedicated to areas of statistics. Key areas include probability distributions, random number generators, clustering, and regression ranging from multiple regression to generalised linear models to time series.
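As a generic illustration of that regression spectrum – fitted here with the statsmodels Python package, not the NAG Library – a Poisson generalised linear model extends ordinary multiple regression to count-valued responses. The synthetic data and coefficients are assumptions for the example:

```python
import numpy as np
import statsmodels.api as sm

# Synthetic count data: response depends log-linearly on one covariate
rng = np.random.default_rng(0)
x = rng.uniform(0, 2, size=500)
y = rng.poisson(np.exp(0.5 + 1.2 * x))

# Poisson GLM with the default log link
X = sm.add_constant(x)
result = sm.GLM(y, X, family=sm.families.Poisson()).fit()
print(result.params)  # estimates should land near [0.5, 1.2]
```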


Feedback from NAG’s users and from academia drives the contents of these chapters, but NAG only adds new algorithms if they’ve been stringently tested and are deemed likely to have longevity. But it’s not all that easy to predict which ones will be adopted by the wider community – it’s not always the ones you might expect. www.nag.co.uk



