10-K Library

Here, we present a collection of draft genome sequences for 11,515 Actinobacteria isolates (the 10-K library). These strains come from widely diverse environments, including the South China Sea, the Xinjiang desert, Canada, Korea, and traditional Chinese medicines (Figure 1). To reduce the cost of genome sequencing, we extracted genomic DNA (gDNA) from the 11,515 individual strains and arranged the samples in 127 96-well plates. After pooling the gDNA samples within each plate, we sequenced the 127 pools on an Illumina HiSeq 2500. The paired-end reads from each pool were assembled separately into contigs, giving a total sequence length of 26.2 Gb. Because each pool mixes gDNA from many strains, this strategy may have led to a large number of small contigs (Table 1). We therefore strongly recommend that users mine our database for genes with short sequences rather than for whole gene clusters. We expect the 10-K library to greatly increase the potential for discovering novel natural products and synthetic biology components.
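The pooling figures above imply some simple averages, which the following Python sketch works out. It is only a back-of-the-envelope illustration: the function name `summarize_pooling` and its parameters are hypothetical, and the per-pool and per-strain values assume that strains and assembled sequence are spread evenly across pools, which the text does not state.

```python
import math


def summarize_pooling(n_strains, n_plates, wells_per_plate, total_assembled_bp):
    """Derive average pooling statistics from the reported totals.

    Assumption (not from the source): strains and assembled sequence
    are distributed evenly across the pools.
    """
    capacity = n_plates * wells_per_plate
    return {
        # Fewest plates that could hold every strain, one strain per well
        "min_plates_needed": math.ceil(n_strains / wells_per_plate),
        # Wells left unused across all plates
        "empty_wells": capacity - n_strains,
        # Average number of strains pooled per plate
        "strains_per_pool": n_strains / n_plates,
        # Average assembled length per pool, in megabases
        "assembled_mb_per_pool": total_assembled_bp / n_plates / 1e6,
        # Average assembled length per strain, in megabases
        "assembled_mb_per_strain": total_assembled_bp / n_strains / 1e6,
    }


if __name__ == "__main__":
    # Totals reported in the text: 11,515 strains, 127 plates, 26.2 Gb
    summary = summarize_pooling(11_515, 127, 96, 26.2e9)
    for key, value in summary.items():
        print(f"{key}: {value:.2f}")
```

Under these assumptions each pool holds about 91 strains on average. The per-strain figure is an average over the pooled assemblies, not a measured genome size for any individual isolate.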

Figure 1. Sample collection sites of the 10-K genome database
Table 1. Overview of the assembled contigs