PLANT MANAGEMENT/ AUTOMATION
Creating Value from Your Data
There may be huge potential benefits waiting in the data in your servers. Better data allows better decisions, of course. Banks, insurance firms, and telecom companies already own a large amount of data about their customers. These resources are useful for building a more personal relationship with each customer.
Some organizations already use data to build complex, customized models based on a very large number of input variables. In agriculture, for example, models of the influence of soil characteristics, weather, and plant types are used to improve crop yields. Airline companies and hotel chains use dynamic pricing models to improve their yield management. Data is increasingly being referred to as the “gold mine” of the 21st century.
The explosion of data available to us is prompting every business to address the right questions:
How can we create value from these resources? Very simple methods, such as counting the words used in queries submitted to company web sites, can provide good insight into how our customers’ mood is evolving. Simple statistical correlations are often used to suggest a purchase just after a customer visits a product page on the web.
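As an illustration of how little is needed for the word-counting approach, here is a minimal Python sketch. The queries and the short stop-word list are invented for the example; a real analysis would read queries from the web site's logs.

```python
import re
from collections import Counter

# Hypothetical search queries; real ones would come from web-site logs.
queries = [
    "late delivery refund",
    "refund status",
    "how to track delivery",
    "cancel order refund",
]

# A very small stop-word list, chosen only for this illustration.
STOP_WORDS = {"how", "to", "the", "a", "of"}

def count_words(texts):
    """Count lower-cased words across all texts, skipping stop words."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts

word_counts = count_words(queries)

# Show the most frequent words first.
for word, n in word_counts.most_common(3):
    print(word, n)
```

Tracking how these counts change from week to week is one simple way to follow shifts in what customers are asking about.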
If simple descriptive statistics are useful, imagine what could be achieved with advanced regression models or powerful multivariate statistical techniques, which can be applied easily with statistical software packages such as Minitab.
A simple example of the benefits of analyzing an enormous database
Let’s consider an example of how one company benefited from analyzing a very large database.
Since delays negatively impact customer perceptions and also affect productivity, airline companies routinely collect a very large amount of data related to flight delays and times required to perform tasks before departure.
Three factors underlie the rising prominence of data:
Huge volumes of data
Data acquisition has never been easier: sensors in manufacturing plants and connected objects, internet usage and web clicks, credit cards, loyalty cards, Customer Relationship Management databases, satellite images, and so on. Data can also be stored at lower cost than ever before, thanks to huge storage capacity in the cloud and elsewhere.
Unprecedented velocity
Connected devices provide data in almost real time, which can be processed very quickly. It is now possible to react to any change…almost immediately.
Incredible variety
The data collected is not restricted to billing information; every source of data is potentially valuable for a business. Besides numeric data, unstructured data such as videos and pictures is also collected.
A major worldwide airline company intended to use this data to identify which crucial aircraft preparation steps often triggered delays in departure times. The company used Minitab’s stepwise regression analysis to quickly focus on the few variables that played a major role among a large number of potential inputs. Many variables turned out to be statistically significant, but two among them clearly seemed to make a major contribution (X6 and X10).
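The idea behind this kind of variable screening can be sketched in a few lines of Python. The sketch below is a simplified forward selection that adds, at each step, the predictor that raises R² the most. It is not Minitab's stepwise procedure, which uses significance thresholds to enter and remove terms. The data is synthetic, and the `min_gain` threshold and the choice of columns standing in for X6 and X10 are assumptions made for the illustration.

```python
import numpy as np

def r_squared(X, y, cols):
    """R^2 of a least-squares fit of y on the chosen columns plus an intercept."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def forward_select(X, y, min_gain=0.01):
    """Greedily add the predictor that raises R^2 most; stop when the gain is below min_gain."""
    selected, best_r2 = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        r2_by_col = {c: r_squared(X, y, selected + [c]) for c in remaining}
        best_col = max(r2_by_col, key=r2_by_col.get)
        if r2_by_col[best_col] - best_r2 < min_gain:
            break
        selected.append(best_col)
        remaining.remove(best_col)
        best_r2 = r2_by_col[best_col]
    return selected, best_r2

# Synthetic data: 12 candidate inputs, of which only columns 5 and 9
# (standing in for X6 and X10) actually drive the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 12))
y = 3.0 * X[:, 5] + 2.0 * X[:, 9] + rng.normal(size=5000)

selected, r2 = forward_select(X, y)
print([f"X{c + 1}" for c in selected], round(r2, 3))
```

Stopping on a minimum gain in R² rather than on a p-value is a deliberate choice here: with thousands of rows, almost any predictor can pass a significance test, so a practical-size threshold gives a shorter list.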
When huge databases are used, statistical analyses may become overly sensitive and detect even very small differences, because the large sample size gives the analysis very high power. P-values therefore tend to be small (p < 0.05) for many predictors, including those with little practical effect.
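A short calculation shows the effect. The sketch below uses a two-sided z-test for the difference between two group means, with a known and equal standard deviation. The figures (a 0.2-minute difference in mean delay and a 10-minute standard deviation) are hypothetical and chosen only to show how the same tiny difference becomes "significant" as the sample grows.

```python
import math

def two_sided_p_value(mean_diff, sd, n_per_group):
    """Two-sided p-value of a z-test comparing two group means (equal, known standard deviation)."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = abs(mean_diff) / standard_error
    # For a standard normal Z, P(|Z| > z) equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))

# Hypothetical figures: the same 0.2-minute difference at three sample sizes.
p_values = {
    n: two_sided_p_value(mean_diff=0.2, sd=10.0, n_per_group=n)
    for n in (100, 10_000, 1_000_000)
}

for n, p in p_values.items():
    print(f"n per group = {n:>9,}   p = {p:.4g}")
```

The difference itself never changes; only the sample size does. This is why the size of each variable's contribution deserves as much attention as its p-value.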
However, in Minitab, if you click on Results in the regression dialogue box and select Expanded tables, the contribution from each variable is displayed. Taken together, X6 and X10 accounted for more than 80% of the overall variability (with the largest F values by far); the contributions from the remaining factors were much smaller. The airline then ran a residual analysis to validate the final model.
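One common way to express such contributions is as each predictor's sequential sum of squares divided by the total sum of squares. The Python sketch below computes that quantity on synthetic data. Whether it matches Minitab's computation in every detail is an assumption, and the predictor labels and coefficients are invented for the illustration.

```python
import numpy as np

def residual_ss(X, y, cols):
    """Residual sum of squares of a least-squares fit on the given columns plus an intercept."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def sequential_contributions(X, y, order):
    """Percentage of total variation explained by each predictor, added in the given order."""
    ss_total = residual_ss(X, y, [])  # intercept-only model gives the total sum of squares
    contributions, used, ss_prev = {}, [], ss_total
    for c in order:
        used.append(c)
        ss_now = residual_ss(X, y, used)
        contributions[c] = 100.0 * (ss_prev - ss_now) / ss_total
        ss_prev = ss_now
    return contributions

# Synthetic data: two strong predictors and two weak ones (hypothetical labels).
labels = ["X6", "X10", "X3", "X8"]
rng = np.random.default_rng(2)
X = rng.normal(size=(5000, 4))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + 0.2 * X[:, 2] + 0.1 * X[:, 3] + rng.normal(size=5000)

contrib = sequential_contributions(X, y, order=[0, 1, 2, 3])
for c, pct in contrib.items():
    print(f"{labels[c]:>4}: {pct:5.1f}%")
```

Note that sequential contributions depend on the order in which predictors are added when the predictors are correlated with one another; in this sketch they are independent, so the order matters little.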
In addition, a Principal Component Analysis (PCA, a multivariate technique) was performed in Minitab Statistical Software to describe the relations between the most important predictors and the response. Milestones were expected to be strongly correlated to the subsequent steps.
The loading plot from the principal component analysis shows how the variables may be grouped: lines that point in the same direction and lie close to one another represent variables that are strongly correlated with each other.
A group of nine variables turned out to be strongly correlated to the most important inputs (X6 and X10) and to the final delay times (Y). Delays at the X6 stage obviously affected the X7 and X8 stages (subsequent operations), and delays from X10 affected the subsequent X11 and X12 operations.
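The grouping that a loading plot reveals can be reproduced numerically. The Python sketch below computes the loadings of the first two principal components from the correlation matrix of synthetic data containing two groups of related variables, then compares the directions of the loading vectors. The variable names and the data are hypothetical, and this is a bare-bones calculation rather than a reproduction of Minitab's output.

```python
import numpy as np

def pca_loadings(data, n_components=2):
    """Loadings (eigenvectors of the correlation matrix) of the leading principal components."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order], eigvals[order] / eigvals.sum()

def cosine(u, v):
    """Cosine of the angle between two loading vectors (1 means the same direction)."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Synthetic data: three variables driven by one hidden factor, three by another.
rng = np.random.default_rng(1)
n = 2000
a = rng.normal(size=n)  # hypothetical driver of group A
b = rng.normal(size=n)  # hypothetical driver of group B
names = ["A1", "A2", "A3", "B1", "B2", "B3"]
data = np.column_stack(
    [a + 0.3 * rng.normal(size=n) for _ in range(3)]
    + [b + 0.3 * rng.normal(size=n) for _ in range(3)]
)

loadings, explained = pca_loadings(data)

# Each row is the end point of one variable's line on the loading plot.
for name, row in zip(names, loadings):
    print(name, np.round(row, 2))

same_group = cosine(loadings[0], loadings[1])   # A1 versus A2
cross_group = cosine(loadings[0], loadings[3])  # A1 versus B1
print("A1 vs A2:", round(same_group, 2), " A1 vs B1:", round(cross_group, 2))
```

Variables in the same group have loading vectors pointing in nearly the same direction, while variables from unrelated groups point in nearly perpendicular directions, which is the pattern a reader looks for on the plot.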
Conclusion
This analysis provided simple rules that this airline’s crews can follow to avoid delays.
The airline can repeat this analysis periodically to search for the next most important causes of delays. Such an approach can propel innovation and help organizations replace traditional and intuitive decision-making methods with data-driven ones.
For more information please contact:
info@minitab.co.uk.
www.minitab.com
Enquiry No. 19
SEPTEMBER/OCTOBER 2017
www.reviewonline.uk.com