GENOMICS Gene genie


Greg Blackman on the laboratory informatics used in the rapidly evolving field of genomics


It’s been 10 years since the first draft of the human genome was announced. The rough draft of our genetic make-up was unveiled on 26 June 2000 in a joint announcement by the then US President, Bill Clinton, and UK Prime Minister, Tony Blair. The sequencing was deemed a major milestone and promised to greatly improve our understanding in areas of medicine and biotechnology.

In the decade since then, gene sequencing techniques have come a long way. The first complete human genetic code took 13 years to map, with the complete sequence being published in April 2003. Now, scientists at the Wellcome Trust’s Sanger Institute in Cambridge can sequence an entire genome in 13 hours. And where the Human Genome Project cost millions of dollars, it now costs around $10,000 to sequence a complete human genome. In fact, to celebrate the 10-year anniversary of the original draft code, the Wellcome Trust has launched the UK10K project to sequence 10,000 human genomes over three years. It will look at the genetic makeup of 4,000 healthy volunteers and 6,000 people known to suffer from serious medical conditions, including neuro-developmental disorders, congenital heart disease and obesity.

Current sequencing technology can produce terabytes of data, which puts it firmly in the realm of high-performance computing (HPC). The Sanger Institute can generate up to 120 terabytes of raw data per week from its sequencing machines, and has a total of 12 HPC clusters, eight of which run Platform LSF, an HPC management software solution.

‘Traditionally, genomics wasn’t a field in which the HPC industry was interested,’ comments Justin Johnson, director of bioinformatics at EdgeBio. ‘Now that sequencers can generate terabases of data in a week, institutes conducting genomics work are starting to think on the petabyte and exabyte scale for data generation. The introduction of next-generation sequencers has turned the


12 SCIENTIFIC COMPUTING WORLD AUGUST/SEPTEMBER 2010 www.scientific-computing.com


traditional compute infrastructure on its head. It’s not just a software tools problem, but also a compute problem.’

EdgeBio is a research reagents company providing nucleic acid purification products. Within the last 18 months, it has expanded its services to include next-generation sequencing and bioinformatics consulting. The company operates two SOLiD 4 and four SOLiD 3Plus sequencers from Applied Biosystems, which generate terabytes of data on a weekly basis. It is carrying out the sequencing for a methylation project between Life Technologies and Virginia Commonwealth University, which involves 1,575 human samples.

‘Traditionally, the cost was in the sequencing itself, because it was such a lengthy process,’ says Johnson. ‘Cost of sequencing has now lowered dramatically, but the informatics hurdles involved in finding value in the data remain and are expanding due to new applications possible with next-gen technologies.’

EdgeBio uses software from CLC Bio, a company providing a wide variety of next-generation sequence analysis tools, and has built its informatics infrastructure around CLC Bio’s genomics server. It has incorporated its own internally developed algorithms and tools, alongside open-source tools.


Linking genetics to disease

Human health research uses genomic technologies to gain a better understanding of the genetics behind certain conditions. Research centres will tap into clinical samples stored in biorepositories and conduct genotyping to search for any genetic variations that could be linked to the disease. The Respiratory, Genetics and Epidemiology division of the Channing Laboratory, itself a research division of Brigham and Women’s Hospital and Harvard Medical School, runs a high-throughput genotyping laboratory, which looks for novel mutations associated with disease risk. The lab handles a biorepository for research

