A healthy approach to data analysis


As populations become more and more globalised, and their interactions increasingly complex, scientific computing is the glue which holds that framework together. Felix Grant explains


As this issue of Scientific Computing World appears, by a happy piece of synchronicity from my point of view, the Wellcome Collection in the UK has on show an exhibition called Dirt: the filthy reality of everyday life. One exhibit in particular is of pivotal relevance to data analytic epidemiology: Dr John Snow’s so-called ‘ghost map’. In 1854, using what would today be described as data visualisation, Dr Snow plotted cases of cholera on a map of Soho, London. From the results he deduced that a water pump was the source of infection. This was particularly impressive because water was not, at the time, suspected as a transmission vector, and the pathogenic germ theory of disease had not become generally accepted. The local council decision to disable the pump was therefore, in the circumstances, a seminal act of faith in datacentric deduction over conventional wisdom.


Seemingly unlikely causation chains are often discovered by more sophisticated variations on Snow’s theme, emerging through statistical winnowing of gathered data. More than most data analytic areas, epidemiology can benefit from pooled work by numerous users at the sharp end of their practice, as well as high level overviews, and data analysis is vital across that whole range. Those who helped me in preparing this article include theatre nurses, general practice managers, country vets and hospital porters.


THERE ARE FEW AREAS OF STUDY CLOSER THAN EPIDEMIOLOGY TO THE TRUISM WHICH HUMAN BEINGS ARE MOST LIKELY TO FORGET: THAT THEY ARE ONLY ONE TWIG ON THE SCHEMATIC TREE OF LIFE


In a more recent high profile example, again involving cholera, an outbreak in Haiti after the devastating earthquake seems to have been traced to a tragic ‘confluence of circumstances’[1] arising from the aid effort itself. Identification of the apparent initial import vector didn’t require any sophisticated analysis in this case, but patterns of spatial spread within the country subsequent to that were a different matter. In an unfunded study of data from census and hospitalisation records (using Madonna software; widely used software from the University of California at Berkeley) Tuite and others were able to model transmission in a way that, ‘despite limited surveillance data... closely reproduces reported disease patterns’.[2] Madonna is powerful, sophisticated software for analysing dynamic systems at


12 SCIENTIFIC COMPUTING WORLD


www.scientific-computing.com

