high-performance computing
At the Sanger Institute’s data centre, the computer systems can store up to four petabytes of data
been corrupted. Human DNA is automatically separated from non-human DNA and moved to a secured storage location. Replicas of the data are stored in multiple locations to protect against data loss due to equipment failure. Once the data are stored, researchers run queries against
the metadata catalogue to locate data of interest. For example, they can search for data by study ID. The Sanger Institute divides projects into ‘iRODS Zones’ with distinct data management policies; federation of the zones allows controlled data-sharing between projects. Data federation through iRODS will be a
critical capability for eMedLab, a collaborative bio-research project funded at £9 million by the UK’s Medical Research Council to provide a shared offsite data centre that supports ‘Data-Driven Discovery for Personalised Medicine’. The eMedLab collaboration includes iRODS Consortium members – the Sanger Institute and University College London – as well as The Francis Crick Institute, Queen Mary University of London, and the European Bioinformatics Institute.
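The metadata lookup described above, locating data by study ID, can be illustrated with a minimal sketch. This is not the iRODS API: the `DataObject` class, the `find_by_metadata` helper, the paths, and the study IDs are all hypothetical, and an in-memory list stands in for the metadata catalogue.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    """A stored file plus its attribute-value metadata (hypothetical model)."""
    path: str
    metadata: dict = field(default_factory=dict)


def find_by_metadata(catalogue, **criteria):
    """Return the paths of objects whose metadata matches every criterion."""
    return [
        obj.path
        for obj in catalogue
        if all(obj.metadata.get(key) == value for key, value in criteria.items())
    ]


# Invented catalogue entries, for illustration only.
catalogue = [
    DataObject("/zoneA/run1/sample1.cram", {"study_id": "1234", "type": "cram"}),
    DataObject("/zoneA/run1/sample2.cram", {"study_id": "5678", "type": "cram"}),
    DataObject("/zoneA/run2/sample3.cram", {"study_id": "1234", "type": "cram"}),
]

# Locate every object that belongs to one study.
matches = find_by_metadata(catalogue, study_id="1234")
print(matches)
```

The point of the sketch is that researchers address data by what it describes rather than by where it is stored; a production catalogue would run the same kind of attribute match as a database query.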
In a related presentation at the User Group
Meeting, Vic Cornell from DataDirect Networks (DDN), which is also an iRODS Consortium member, discussed how Imperial College London (ICL) uses iRODS to comply with data management policies required for publicly funded research in the UK. ICL plays a lead role in UK MED-BIO, a collaboration that includes partners from the Institute of
Cancer Research, the European Molecular Biology Laboratory-European Bioinformatics Institute, the Universities of Oxford, Swansea and Nottingham, the MRC Clinical Sciences Centre, and the non-profit organisation MRC Human Nutrition Research. The project seeks to bring together the data, infrastructure, and expertise needed ‘to enable major advances in understanding the aetiopathogenesis of
chronic human diseases’. The proof-of-concept system demonstrates the utility of iRODS in implementing mandated policies, such as those that:
• Maintain associations between data sets and unique persistent identifiers;
• Ensure that data sets are preserved for a prescribed period of time following the last access; and
• Guarantee that archived data sets are not altered or corrupted.
The projects highlighted here are but a small
slice of the data-intensive research endeavours underway using iRODS. In genomics alone, there are nation-scale sequencing studies spinning up in the United States, Canada, Australia, Japan, South Korea, Singapore, Thailand, Kuwait, Qatar, Israel, Belgium, Luxembourg, and Estonia. Collaboration between institutions and disciplines is increasingly commonplace, and yet there will never be a one-size-fits-all data management plan for all organisations and disciplines. Also, organisations and research teams will continue to need long-term data preservation methods, so that data sets can be revisited and reanalysed as new analysis techniques emerge. At the iRODS Consortium, we feel privileged
to play a role in enabling these new data-driven efforts that will help fulfil the promise of big data. Through painstaking effort and by paying close attention to the needs of our users, we know that data management roadblocks can be overcome. More importantly, we believe the transformative discoveries that result will bring us closer to solving important problems related to human health, climate change, environmental sustainability, poverty, and more.
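As an illustration of the three mandated policies listed earlier (persistent identifiers, retention after last access, and protection against alteration), here is a minimal sketch of how such rules might be expressed as checks. It is not how iRODS or the UK MED-BIO system implements them: `ArchivedDataSet`, `archive`, `is_unaltered`, `may_delete`, the example identifier, and the ten-year retention period are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ArchivedDataSet:
    persistent_id: str     # unique identifier kept alongside the data set
    content: bytes         # stands in for the file contents
    checksum: str          # SHA-256 digest recorded at archive time
    last_access: datetime  # when the data set was last read


def archive(persistent_id, content, now):
    """Record a data set together with its identifier and checksum."""
    digest = hashlib.sha256(content).hexdigest()
    return ArchivedDataSet(persistent_id, content, digest, now)


def is_unaltered(data_set):
    """True if the current content still matches the recorded checksum."""
    return hashlib.sha256(data_set.content).hexdigest() == data_set.checksum


def may_delete(data_set, now, retention):
    """True once the retention period since the last access has elapsed."""
    return now - data_set.last_access >= retention


# Hypothetical identifier and retention period, for illustration only.
retention = timedelta(days=3650)
data_set = archive("example-id-0001", b"ACGT", datetime(2015, 1, 1))

print(is_unaltered(data_set))                                 # integrity check
print(may_delete(data_set, datetime(2016, 1, 1), retention))  # retention check
```

Storing the checksum at archive time is what makes later corruption detectable, and keying the retention rule to the last access rather than to the creation date follows the wording of the second policy.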
Air cooling system in the data centre
SCIENTIFIC COMPUTING WORLD
Dan Bedard is executive director of the iRODS Consortium, housed at RENCI at the University of North Carolina, Chapel Hill. For more information on iRODS or the consortium, please visit irods.org.
@scwmagazine
www.scientific-computing.com
Genome Research Limited