Moleculo synthetic long-read sequencing is now available from Illumina as a library prep kit called “TruSeq Synthetic Long-Read DNA Library Prep Kits”. Moleculo was a San Francisco-based startup founded in 2012 by Michael Kertesz and Dmitry Pushkarev, both members of Stephen Quake’s lab. Moleculo long-read technology combines a library preparation method with a computational assembly algorithm, making it possible to create long sequence reads from Illumina short-read data.
Illumina acquired Moleculo in early 2013 and quickly began offering Moleculo long-read sequencing as a service during the summer of 2013. Now it is available as a library preparation kit for anyone interested in using it. Illumina’s product page for the library kit says that
TruSeq Synthetic Long-Read DNA Library Prep is a complete end-to-end workflow that encompasses library preparation, sequencing, and informatics.
Applications of the current library kit include genome finishing, de novo assembly, metagenomics, and whole human genome phasing. Briefly, generating synthetic long reads with the kit involves:
- fragmenting genomic DNA into pieces of about 10 kb and ligating adapters to the fragments
- distributing the fragments across 384 wells so that each well holds about 3,000 fragments
- amplifying the fragments in each well by long-range PCR, then cutting them into short fragments and barcoding those
- pooling the barcoded fragments from all wells and sequencing them together
The short reads are then assembled with the TruSeq Long-Read Assembly App to create synthetic long reads. Currently, the kit produces long reads (contigs) with a median length of 8-10 kb, and it can be used with the Illumina HiSeq 2000 and HiSeq 2500 sequencers.
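The barcode-then-pool idea can be illustrated with a small Python sketch. It is a toy example, not the TruSeq Long-Read Assembly App: the barcode labels, read sequences, and function names are invented for illustration, and the only real figures are the well count, fragments per well, and fragment length quoted above. It groups pooled reads by well barcode, as one would before assembling each well separately, and works out how much template DNA those figures imply.

```python
from collections import defaultdict

# Figures quoted in the text above.
WELLS = 384
FRAGMENTS_PER_WELL = 3_000
FRAGMENT_LENGTH_BP = 10_000


def dna_per_well_bp(fragments=FRAGMENTS_PER_WELL, length_bp=FRAGMENT_LENGTH_BP):
    """Approximate amount of template DNA in one well, in base pairs."""
    return fragments * length_bp


def group_reads_by_well(reads):
    """Group (barcode, sequence) pairs by their well barcode.

    Each group would then be assembled on its own, so reads from
    different wells never end up in the same synthetic long read.
    """
    by_well = defaultdict(list)
    for barcode, sequence in reads:
        by_well[barcode].append(sequence)
    return dict(by_well)


# Toy pooled reads; barcodes and sequences are made up for illustration.
pooled_reads = [
    ("WELL_001", "ACGTAC"),
    ("WELL_002", "TTGACA"),
    ("WELL_001", "GTACCA"),
]

reads_by_well = group_reads_by_well(pooled_reads)
per_well = dna_per_well_bp()   # template DNA in a single well
total = per_well * WELLS       # template DNA across the whole plate

print(reads_by_well)
print(f"{per_well / 1e6:.0f} Mb per well, {total / 1e9:.2f} Gb per plate")
```

With these numbers, each well holds roughly 30 Mb of template DNA and a full 384-well plate holds about 11.5 Gb.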
As examples of the TruSeq Synthetic Long-Read DNA kit in use, Illumina’s product document shows that Illumina successfully assembled the C. elegans genome (~100 Mb) and the O. sativa genome (430 Mb) using the long reads alone. Earlier, a Stanford and Illumina team showed that Moleculo long-read technology can be used to phase the human genome.
So what is next for the long-read technology? So far, Illumina has shown applications that use genomic DNA. Hopefully Illumina is also working on mRNA molecules. Illumina long reads for transcriptomics would be great for isoform identification and transcriptome assembly. That would be fantastic, since until now PacBio has been the only way to look at transcriptomes using long reads. Will we get to see the long-read technology for RNA-seq soon?