The race to sequence a human genome for $1,000 has pushed sequencing technology companies to update their sequencers rapidly and to bring out sequencers for the masses. One question that everyone interested in next-gen sequencing data wants to ask is which sequencing platform is the best. Although the answer depends on the application one is interested in, knowing what to expect from a sequencing technology is always useful.
A team of researchers from the Sanger Institute, UK compared the performance of three sequencers, Ion Torrent’s PGM, PacBio’s RS, and Illumina’s MiSeq, by sequencing four microbial genomes of varying GC content. The Sanger team published their work this week in the journal BMC Genomics, in a paper titled
A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers, by Michael Quail, Miriam E Smith, Paul Coupland, Thomas D Otto, Simon R Harris, Thomas R Connor, Anna Bertoni, Harold P Swerdlow and Yong Gu.
This is not the first time a team of researchers has compared next-gen sequencing platforms. Most recently, in June, another UK research team compared the performance of the 454 GS Junior (Roche), MiSeq (Illumina) and Ion Torrent PGM (Life Technologies) to find the best bench-top sequencer on the market. (Check here for a quick summary of their findings.)
So what is new in the BMC Genomics paper by Quail et al.?
Quail et al. include PacBio’s RS in their comparison instead of Roche’s 454 GS. In addition, they compare the three platforms with Illumina’s GAIIx and HiSeq 2000. Another important difference is that Quail et al. sequenced four microbial genomes with different GC content, in contrast to just E. coli in Loman et al. Quail et al. used
- Bordetella pertussis (average 67.7% GC content, with some regions over 90% GC content),
- Salmonella Pullorum (average 52% GC),
- Staphylococcus aureus (average 33% GC),
- Plasmodium falciparum (19.3% GC, with some regions close to 0% GC content).
The four different microbial species allowed Quail et al. to ask how these sequencing technologies perform under varying GC content. In addition to GC content, Quail et al. examined how the platforms compare with respect to read coverage distribution, SNP/variant detection and accuracy.
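For readers who want to see what "average GC content" and "regions over 90% GC" mean in practice, here is a minimal Python sketch. It is only an illustration and not the method used in the paper: the function names, the window size and the toy sequence are all made up for this example.

```python
from typing import List


def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Ambiguous bases such as N are ignored; an empty sequence gives 0.0.
    return 100.0 * gc / total if total else 0.0


def windowed_gc(seq: str, window: int = 100, step: int = 100) -> List[float]:
    """GC percentage of each full window along the sequence."""
    return [gc_content(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


# Toy sequence (hypothetical): an AT-only stretch, a GC-only stretch, a mixed stretch.
toy = "ATATATATAT" + "GCGCGCGCGC" + "ATGCATGCAT"
print(gc_content(toy))                       # genome-wide average
print(windowed_gc(toy, window=10, step=10))  # local GC content per window
```

The genome-wide figure hides the extremes that the windowed view exposes, which is why a genome with a moderate average can still contain regions that are hard to sequence.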
GC Content Matters: Ion Torrent PGM’s Trouble with AT-Rich Regions
All sequencing technologies performed well with respect to coverage when sequencing the GC-neutral genome of S. Pullorum. For the GC-rich genome of B. pertussis, sequence coverage was uniform in all technologies except Ion Torrent, whose coverage was slightly more uneven.
Among the four genomes, P. falciparum is the most complex, with many repeat regions and strongly AT-rich regions. Ion Torrent’s PGM data showed highly biased coverage on the extremely AT-rich P. falciparum genome: PGM gave very deep coverage for the GC-rich var and subtelomeric regions and poor coverage within introns and AT-rich exons. In contrast, PacBio’s RS data had even coverage on both the GC-rich and the extremely AT-rich genomes, apart from a slight GC coverage bias in the S. aureus genome.
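To make "even" versus "biased" coverage a bit more concrete, here is a small Python sketch that summarizes a list of per-position (or per-window) read depths. This is a generic illustration, not the metric used by Quail et al.: the function name, the 20%-of-mean cutoff and the depth values are assumptions chosen for the example.

```python
from statistics import mean, pstdev


def coverage_summary(depths, low_fraction=0.2):
    """Summarize coverage evenness for a list of read depths.

    cv             : coefficient of variation (std / mean); lower means more even.
    poorly_covered : share of positions with depth below low_fraction * mean.
    """
    avg = mean(depths)
    cv = pstdev(depths) / avg if avg else float("inf")
    threshold = low_fraction * avg
    poorly_covered = sum(1 for d in depths if d < threshold) / len(depths)
    return {"mean": avg, "cv": cv, "poorly_covered": poorly_covered}


# Hypothetical depths with the same mean (30x) but very different evenness.
even = [30, 31, 29, 30, 30, 30]
biased = [90, 80, 2, 3, 2, 3]  # a few very deep regions, the rest barely covered

print(coverage_summary(even))
print(coverage_summary(biased))
```

Both toy data sets have the same average depth, so the mean alone would not reveal the bias; the spread and the share of poorly covered positions do.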
Error Rate: Mature Illumina Wins As Expected
As the most mature technology, the Illumina platforms had the lowest sequencing error rates, all below 0.4%. Ion Torrent’s PGM had an error rate of 1.78%, and PacBio sequencing had an error rate of 13%. In terms of error-free reads, meaning reads without a single mismatch or indel, MiSeq had the highest percentage at 76.45%, Ion Torrent data had 15.92%, and PacBio’s data had no error-free reads.
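One rough way to see why the per-base error rate matters so much for the share of error-free reads: if errors were independent and uniform along a read (a simplifying assumption, not something the paper claims), a read of length L with per-base error rate e would be error-free with probability (1 - e)^L. The sketch below uses the error rates quoted above, but the read lengths are hypothetical placeholders, not values from the paper.

```python
def expected_error_free_fraction(error_rate: float, read_length: int) -> float:
    """Probability that a read has no errors, assuming independent per-base errors."""
    return (1.0 - error_rate) ** read_length


# (per-base error rate from the text, read length ASSUMED for illustration)
examples = {
    "Illumina-like (0.4%, 150 bp assumed)": (0.004, 150),
    "PGM-like (1.78%, 100 bp assumed)": (0.0178, 100),
    "PacBio-like (13%, 1000 bp assumed)": (0.13, 1000),
}

for label, (rate, length) in examples.items():
    fraction = expected_error_free_fraction(rate, length)
    print(f"{label}: {fraction:.3g}")
```

Real errors cluster in particular reads and sequence contexts, so observed percentages will differ from this simple model. It still shows why long reads with a double-digit error rate are almost never error-free.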
To test how the sequencing error rate affects SNP calling, Quail et al. used the S. aureus genome, which was well represented in the data from all technologies, and compared the SNP/variant calling ability of each platform.
Surprisingly, Ion Torrent PGM’s SNP calling rate was higher than that of Illumina. Ion Torrent data called ~82% of the SNPs correctly, while the Illumina platforms called 68-76% of the SNPs correctly. Among the three Illumina sequencers, MiSeq had the highest true SNP calling rate at 76%, followed by GAIIx at 70% and HiSeq at 69%.
The paper also has a nice summary table presenting all the tech specs and costs. Stay tuned for a more visual summary of it soon.