ood


consumption, so that we could reduce carbon dioxide emissions? Could we predict which children would respond best to specific learning approaches? Could we learn which factors help us to adopt a healthy diet or lifestyle? All these and more are certainly possible – if we have the data!


FIRST, DO NO HARM


Perhaps a good starting point would be to make sure that we don’t use data mining in bad ways. After all, being able to predict things gives you the ability to make decisions that perhaps you shouldn’t. There isn’t enough space here to discuss this in detail, but it’s certainly not as straightforward as it might seem. A recent New York Times article described how a retailer in the US was able to predict when a customer was pregnant, even when the customer hadn’t told them. A powerful technique, but should you use it? Perhaps the best guide would be to think how you might feel if you knew that this was being done to you – would you feel it was a reasonable thing to do, or not? Bear in mind that any prediction can be wrong, so the impact of acting on a wrong prediction has to be taken into account, but remember that taking no action is also a decision!


WHERE DATA MINING IS ALREADY USED POSITIVELY


The data mining community is just beginning to get to grips with the idea that our skills can be used positively, and as such we don’t necessarily have a good focus. Three areas where people are taking direct action, each with a very different concept, are data mining competitions, self-analysis and crowdsourcing, and the open data movement.


Data mining competitions combine the competitive and the cooperative. Although their algorithm-centric approach is not best practice in data mining, there is no doubt that organisations like Kaggle have breathed new life into the idea that data mining can provide benefits whilst data miners have fun. Kaggle is running the $3 million Heritage Health Prize, aimed at predicting hospital admissions.


Self-analysis and crowdsourcing approach the issue from another angle: if it’s difficult to get hold of data that would allow you to predict useful stuff (for example, because it’s owned by commercial organisations who are reluctant to give it away), then generate the data yourself and mine it yourself. The advent of smartphones has made previous infrastructure and computational problems a thing of the past. We’re starting to see this with health data and with transport data, and this promises to be a rich source of beneficial data mining for the future.


Of course one area where we know we need to pull together is that of social provision. And in this case we know that there is an organisation that has been gathering data assiduously on our behalf for over 200 years. Unfortunately the approach that government has taken has often been one of hiding data rather than making it available, but that is changing. And with the change comes the opportunity to use data mining and analysis to take control of our own communities. What makes roads dangerous? Which properties are more at risk from burglars? Why are some schools better than others? Data mining can provide answers, and there is no reason to believe that the answers can’t be found for everyone’s benefit.


WHAT CAN YOU DO?


If you’re lucky enough to be in a company that uses data mining for the wider benefit then congratulations. Many people aren’t, but might still be interested in using their skills – what can they do? Two initiatives that are well worth checking out are UN Global Pulse and DataKind. UN Global Pulse wants to use data and data mining to give insight into international situations where traditionally we know too little, too late. Can we better predict droughts, population movement, disease patterns?


DataKind looks to bring the data mining approaches that have been so successful in commerce into the hands of charities. Although based in the US it will soon be expanding to the UK, where it will give volunteer data experts an opportunity to provide short and long term support for charities.


For further reading, search for “UN Global Pulse”, “datakind.org” and “data for the public good”.


Volume 22 – Issue 4 | December 2012

