Reports

Capturing protein-coding genes across highly divergent species

Chenhong Li1,3, Michael Hofreiter2, Nicolas Straube3, Shannon Corrigan3, and Gavin J.P. Naylor3

1 Key Laboratory of Exploration and Utilization of Aquatic Genetic Resources, Shanghai Ocean University, Ministry of Education, Shanghai, China, 2 Department of Biology, The University of York, Heslington, UK, and 3 Hollings Marine Laboratory, College of Charleston, Charleston, SC, USA

BioTechniques 54:321-326 (June 2013) doi 10.2144/000114039

Keywords: comparative biology; cross-species capture; DNA hybridization capture; next-generation sequencing; protein-coding genes; phylogenetics; target enrichment

Supplementary material for this article is available at

DNA hybridization capture combined with next-generation sequencing can be used to determine the sequences of hundreds of target genes across hundreds of individuals in a single experiment. However, the approach has thus far only been successfully applied to capture targets that are highly similar in sequence to the bait molecules. Here we introduce modifications that extend the reach of the method to allow efficient capture of highly divergent homologous target sequences using a single set of baits. These modifications have important implications for comparative biology.

Next-generation sequencing is revolutionizing our understanding of the links between genotype and phenotype at scales that were unimaginable a few years ago. Genome-wide association studies are now routinely conducted to explore the genomic basis for phenotypic variation (1) or disease predisposition (2), as well as to identify the genomic elements that underlie adaptive evolutionary responses (3).

Although such genome-level association studies are powerful, they are labor-intensive to execute, as they require not only sequencing of entire genomes, but also that the collected genomic data be carefully parsed, assembled, and accurately annotated. This is most easily done when closely related and well-annotated reference genomes are available for comparison. However, genome assembly and annotation become substantially more difficult with increasing evolutionary distance to the reference genome. Indeed, it is often easier to assemble the genomes of divergent species de novo than it is to compare them to a reference that differs in genome organization. This said, most studies do not require whole genome comparisons. Comparative medical and physiological studies, for example, usually center on a specific set of genes relevant to a pathway of interest (4). In the same vein, most molecular evolutionary and phylogenetic studies focus on the comparison of small subsets of homologous genes.

Unfortunately, no technology yet exists that allows researchers to efficiently explore genetic variation across evolutionarily divergent organisms for large sets of pre-specified target genes, such as those that determine pelage color in vertebrates (5), or those involved in particular physiological adaptations (6,7). PCR amplification, the main and for a long time the only technology for targeting specific DNA regions for sequencing, is generally too labor-intensive, costly, and inconsistent to generate such multigene data sets. In principle, DNA hybridization capture technology, hereafter referred to as gene capture (8,9), should be up to the task, as it allows for hundreds of pre-specified genes to be targeted and isolated for sequencing in a single experiment. Indeed, gene capture techniques do work well when baits are used to interrogate libraries with limited sequence divergence (10–13), such as those for the same species or for closely related species, or when baits are designed to target ultraconserved elements among species that otherwise have highly divergent genomic sequences (14,15). Unfortunately, existing gene capture protocols do not work well when targeting genes whose sequences are highly divergent from those of the baits that are used to capture them, although we know from first principles and from experience with Southern blotting (16) that they should.

Here we describe a gene capture method that is effective for capturing a pre-specified set of protein-coding genes across species that have been evolving independently for hundreds of millions of years, using a single bait array. We have achieved this by tuning the stringency of the hybridization and washing steps of the procedure to optimize retention of target versus non-target DNA fragments, and by conducting two rounds of gene capture whereby the captured products from the first round are used as templates to augment a second round of gene capture (see detailed Supplementary Protocol).
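The rationale for a second round of capture can be illustrated with a small back-of-the-envelope calculation. The sketch below is not part of the published protocol: it assumes, purely for illustration, that each round retains a fixed fraction of target fragments and a smaller fixed fraction of non-target fragments, and that the rounds act independently. The function name and all numerical values are hypothetical.

```python
def on_target_fraction(initial_fraction, target_retention,
                       offtarget_retention, rounds):
    """Return the on-target share of a library after repeated capture rounds.

    Toy model (an assumption, not the authors' method): every round keeps
    `target_retention` of the target fragments and `offtarget_retention`
    of the non-target fragments, independently of earlier rounds.
    """
    for name, value in (("initial_fraction", initial_fraction),
                        ("target_retention", target_retention),
                        ("offtarget_retention", offtarget_retention)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    if rounds < 0:
        raise ValueError("rounds must be non-negative")

    target = initial_fraction            # relative amount of target DNA
    offtarget = 1.0 - initial_fraction   # relative amount of non-target DNA
    for _ in range(rounds):
        target *= target_retention
        offtarget *= offtarget_retention

    total = target + offtarget
    return target / total if total > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical values: targets are 0.1% of the library; relaxed
    # stringency keeps half of the targets and 1% of everything else.
    for n in range(3):
        share = on_target_fraction(0.001, 0.5, 0.01, n)
        print(f"rounds={n}: on-target share = {share:.3f}")
```

Under these assumed numbers, a single relaxed-stringency round leaves the library dominated by non-target fragments, whereas feeding the captured products into a second round raises the on-target share to a clear majority. The real retention values depend on bait-target divergence and on the hybridization and washing conditions, which this toy model does not attempt to describe.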

Method summary: Modified DNA hybridization conditions allow gene capture across widely divergent homologous sequences.

Vol. 54 | No. 6 | 2013 | p. 321 | Co-authors: Nicolas Straube3, Shannon Corrigan3, and Gavin J.P. Naylor3 (3 Hollings Marine Laboratory)
