May 2015


Table 1. Properties of the prepared mate-pair libraries.

| Library | Targeted mate distance in size selection (kb) | Peak in library size distribution* (bp) | # of PCR cycles | Read length (nt)** | Duplicate read pairs (%) | Proportion of usable read pairs*** (%) |
|---|---|---|---|---|---|---|
| Library 1 | 1–6 | 869 | 10 | 101 | 0.48 | 33.6 |
| Library 1 | 1–6 | 869 | 10 | 171 | 0.38 | 59.9 |
| Library 2 | 1–6 | 545 | 8 | 171 | 0.60 | 65.4 |
| Library 3 | 11–18 | 501 | 10 | 171 | 3.08 | 68.1 |

Note that for each library 15 million read pairs were randomly sampled with the program seqtk (https://github.com/lh3/seqtk) and used in the measures of duplicate read pair % and the proportions of usable reads by the program NextClip.

*This column shows the values reported by the Bioanalyzer (see Supplementary Figure S1).

**Reads with a length of 101 nucleotides (nt) were obtained by trimming the end of 171 nt reads using the program fastx_trimmer (http://hannonlab.cshl.edu/fastx_toolkit/).

***This column shows the proportion of read pairs that have a junction adaptor and are longer than 25 nt after the removal of the adaptor.
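The subsampling step described in the note to Table 1 can be illustrated with a minimal Python sketch. It is not how seqtk is implemented; it assumes uncompressed four-line FASTQ records with mates listed in the same order in both files, and the names read_fastq and sample_pairs, as well as the seed value, are hypothetical. The sketch uses reservoir sampling so that the two mates of a pair are always kept or discarded together.

```python
import random
from itertools import islice


def read_fastq(handle):
    """Yield FASTQ records as 4-line tuples (header, sequence, plus, quality)."""
    while True:
        record = tuple(line.rstrip("\n") for line in islice(handle, 4))
        if len(record) < 4:
            # End of file (or a truncated trailing record): stop iterating.
            return
        yield record


def sample_pairs(r1_handle, r2_handle, n, seed=11):
    """Reservoir-sample up to n read pairs, keeping mates together.

    Assumption: both files list the mates in the same order.
    """
    rng = random.Random(seed)
    reservoir = []
    pairs = zip(read_fastq(r1_handle), read_fastq(r2_handle))
    for i, pair in enumerate(pairs):
        if i < n:
            # Fill the reservoir with the first n pairs.
            reservoir.append(pair)
        else:
            # Replace a stored pair with probability n / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                reservoir[j] = pair
    return reservoir
```

Holding 15 million read pairs in memory is demanding, so in practice a dedicated tool such as seqtk, run with the same seed on both files, is likely the more practical choice; the sketch only shows why sampling pairs rather than individual reads preserves the mate relationship.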


(Table 1). The following statistics are based on random sampling of an adjusted number of sequenced reads among the different libraries (15 million read pairs). In comparison to Library 1, sequencing Library 2 with PE 171 cycles resulted in the largest proportion of read pairs (65.4%) that are regarded as usable in genome scaffolding (those in Categories A, B and C as defined by NextClip) (Table 1). This comparison shows that longer reads largely contribute to the increased proportion of usable reads, an effect that is enhanced with intensive shearing. Moreover, we sequenced a library prepared according to our modified protocol targeting the mate distance range of 11–18 kb (Library 3), and this library yielded a comparable proportion of usable reads with only 10 PCR cycles (Table 1).

Using the mate-pair reads, we computed tentative genome scaffolding using the Platanus (version 1.2.1) program (7). Scaffolding with mate-pair reads extensively increased the overall sequence continuity, as shown by the increased N50 values. Moreover, these libraries provided improved coverage of the genome (Table 2) as indicated by the proportion of core eukaryotic genes (CEG) detected with the program CEGMA version 2.4 (8). Scaffolding using reads from Library 2 sequenced with 171 cycles had the best score (Table 2), albeit evaluation of genome assemblies requires multifaceted assessment (9). The power of our modified protocol was most evident when multiple libraries with varied mate distances were employed in scaffolding (Table 2).

In our trials using P. picta genomic DNA, we targeted relatively broad size ranges of tagmented DNA fragments (1–6 kb and 11–18 kb), but for a different species, we succeeded in preparing mate-pair libraries with narrower mate distance ranges (e.g., 12–16 kb). It is of note that our protocol enabled library preparation with no more than 10 PCR cycles even for >10 kb mate distance to yield 20 µL of library solution of up to 7 nM (0.14 pmol), which kept the PCR duplicate rates down to less than 4% among 15 million read pairs (Table 1).
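The stated library yield can be checked with a short calculation: a volume of 20 µL at 7 nM corresponds to the 0.14 pmol quoted above. The Python sketch below spells out the unit conversion; the function name library_amount_pmol is hypothetical and is used only for illustration.

```python
def library_amount_pmol(volume_ul, concentration_nm):
    """Return the amount of library (pmol) for a volume in µL and a molarity in nM."""
    # 1 nM = 1 nmol/L = 1 fmol/µL, so volume (µL) x molarity (nM) gives fmol.
    femtomoles = volume_ul * concentration_nm
    # 1 pmol = 1000 fmol.
    return femtomoles / 1000.0


print(library_amount_pmol(20, 7))  # 0.14
```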

Table 2. Contribution of mate-pair reads to genome scaffolding.

| Reads used* | Scaffold N50 (kb) | Max scaffold length (kb) | Gene coverage (%)** |
|---|---|---|---|
| PE only | 13.3 | 1819.1 | 58.9 |
| PE + MP (Library 1, 101 nt) | 25.9 | 1420.0 | 71.3 |
| PE + MP (Library 1, 171 nt) | 41.0 | 1790.6 | 75.4 |
| PE + MP (Library 2, 171 nt) | 51.8 | 1792.6 | 77.4 |
| PE + MP (Libraries 2 and 3, both 171 nt) | 382.9 | 2425.0 | 82.7 |

*This column indicates read types [paired-end (PE) or mate-pair (MP)] and choice of MP libraries used in the scaffolding step after assembling only paired-end (PE) reads.

**This column shows the numbers of core eukaryotic genes (CEGs) detected as “complete” by CEGMA runs with the “--vrt” option.
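For readers less familiar with the N50 statistic reported in Table 2, the following minimal Python sketch shows the standard definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The function name n50 is hypothetical, and assembly programs may apply additional conventions (for example, a minimum scaffold length) that are not modeled here.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # total assembly length is the N50.
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one sequence length")


print(n50([2, 2, 2, 3, 3, 4, 8, 8]))  # 8
```

Because N50 depends only on the length distribution, it says nothing about correctness of the joins, which is one reason it is usually read alongside a gene-content measure such as the CEGMA result in the same table.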

256 www.BioTechniques.com
