Ballgown for estimating differential expression of genes, transcripts, or exons from RNA-seq

The awesome team behind Bowtie, TopHat, and @Simplystats has a new paper on estimating differential expression of genes/isoforms from RNA-seq data. The paper is published on bioRxiv, an open preprint server for biology:

Flexible isoform-level differential expression analysis with Ballgown by Alyssa C. Frazee, Geo Pertea, Andrew E. Jaffe, Ben Langmead, Steven L. Salzberg, & Jeff T. Leek

The authors present a software suite for differential expression analysis at the isoform level. The Tuxedo suite of tools (Bowtie, TopHat, Cufflinks, Cuffmerge, and CuffDiff) contains the most widely used methods for RNA-seq expression abundance quantitation and differential expression analysis. The Ballgown software suite complements the Tuxedo suite and presents a much improved approach to differential expression analysis (when compared with CuffDiff2). Here is a really quick summary of the paper after a cursory read. This post may be of interest if you use CuffDiff2 to perform differential expression analysis.

P-value Histogram as Diagnostic Tool

The initial part of the paper shows that CuffDiff2 is extremely conservative when testing for differential expression. How can we tell whether a method is conservative or not? A quick look at the p-value histogram can help us decide how a method is working (with a few assumptions).

Consider a simple scenario where we are comparing two sample groups. When there are no differentially expressed genes, the p-value histogram would look like a uniform distribution.



When thousands of genes are differentially expressed, the p-value histogram will be skewed towards zero, as the differentially expressed genes will have low p-values.

On the other hand, when the method used is overly conservative, the p-value histogram will be skewed towards 1. One might see similar skewing when the model (or its assumptions) used in the differential expression analysis is not right.
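To make the three histogram shapes concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption for illustration and none of it comes from the paper: the expression values are standard normal, the test is a simple two-sided z-test, and the "conservative" method is mimicked by overstating the standard deviation by a factor of two. It is not how CuffDiff2 or Ballgown compute p-values.

```python
import math
import random


def z_test_pvalue(group_a, group_b, assumed_sd=1.0):
    """Two-sided z-test p-value for a difference in means.

    assumed_sd is the per-observation standard deviation the test believes in;
    setting it larger than the true value makes the test conservative.
    """
    n_a, n_b = len(group_a), len(group_b)
    diff = sum(group_a) / n_a - sum(group_b) / n_b
    se = assumed_sd * math.sqrt(1.0 / n_a + 1.0 / n_b)
    z = diff / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    return math.erfc(abs(z) / math.sqrt(2.0))


def simulate_pvalues(n_genes, n_de, effect, n_per_group=10, assumed_sd=1.0, seed=0):
    """Simulate one p-value per gene; the first n_de genes get a mean shift."""
    rng = random.Random(seed)
    pvals = []
    for g in range(n_genes):
        shift = effect if g < n_de else 0.0
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(shift, 1.0) for _ in range(n_per_group)]
        pvals.append(z_test_pvalue(a, b, assumed_sd))
    return pvals


def histogram(pvals, n_bins=10):
    """Count p-values in equal-width bins on [0, 1]."""
    counts = [0] * n_bins
    for p in pvals:
        counts[min(int(p * n_bins), n_bins - 1)] += 1
    return counts


# No differential expression: roughly flat histogram.
null_hist = histogram(simulate_pvalues(5000, 0, 0.0))
# 1000 of 5000 genes differentially expressed: spike near zero.
de_hist = histogram(simulate_pvalues(5000, 1000, 1.5))
# No differential expression, but the test overstates the SD: pile-up near one.
conservative_hist = histogram(simulate_pvalues(5000, 0, 0.0, assumed_sd=2.0))

for name, hist in [("null", null_hist), ("with DE", de_hist), ("conservative", conservative_hist)]:
    print(f"{name:>12}: {hist}")
```

Each printed list holds the counts for ten equal-width bins from 0 to 1, so the first entry is the number of p-values below 0.1 and the last is the number above 0.9.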

The initial part of the paper used such diagnostics to compare Ballgown and CuffDiff2 on real and simulated data.

CuffDiff2 is Extremely Conservative: Real Data


CuffDiff2 vs Ballgown Real data

The Ballgown paper shows this using real RNA-seq data sets and a simulated data set. One of the real data sets is a 12-sample comparison between control and lung cancer samples. Among the 4454 expressed transcripts (FPKM > 1), CuffDiff2 identified just 1 transcript as differentially expressed at the 5% FDR level. Using the same data, Ballgown's F-test identified 2178 transcripts as differentially expressed. A similar behaviour was seen in the second real data set, which has more samples.
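For readers who have not worked with an FDR threshold before, here is a small sketch of how one goes from raw p-values to a count of transcripts called at 5% FDR. It assumes the Benjamini-Hochberg adjustment, which is a common choice; I have not checked which FDR procedure the paper applied to each method, and the p-values below are made up for illustration.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for five transcripts (not from the paper).
pvals = [0.041, 0.001, 0.6, 0.008, 0.039]
qvals = benjamini_hochberg(pvals)

n_raw = sum(p <= 0.05 for p in pvals)
n_fdr = sum(q <= 0.05 for q in qvals)
print("adjusted:", [round(q, 5) for q in qvals])
print("raw p <= 0.05:", n_raw, "| called at 5% FDR:", n_fdr)
```

In this toy example four transcripts have a raw p-value below 0.05, but only two survive the adjustment, which is why the shape of the whole p-value distribution matters so much for the final counts.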

CuffDiff2 is Conservative: Simulated Data

CuffDiff2 vs Ballgown Simulated data

To understand how conservative CuffDiff2 is, the authors simulated a differential expression experiment with 10 samples in each of two groups, using the 745 known transcripts on human chromosome 22. Of these 745 transcripts, 274 were set to be differentially expressed, with a 6-fold difference between groups.

In this simulation setup, CuffDiff2 found no transcripts to be differentially expressed at 5% FDR. On the other hand, Ballgown's F-test identified 80 transcripts as differentially expressed at the same 5% FDR threshold.

Source of CuffDiff2’s Conservative Behaviour


CuffDiff 2 (Ignoring transcript-length normalization) vs Ballgown

What is more interesting is that the authors could narrow down the cause of CuffDiff2's conservative behavior. It seems that the cause is the transcript length normalization used in the CuffDiff2 software. When the transcript length normalization is ignored, CuffDiff2 produced results comparable to Ballgown's. I have not fully understood the details yet, but it is a striking behaviour. It seems that the conservative behavior was already known (the Ballgown paper references the CuffDiff2 paper and their earlier work in the manuscript).
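As a reminder of where transcript length enters an expression measure, here is the standard FPKM definition (fragments per kilobase of transcript per million mapped fragments) as a tiny Python sketch. The function and argument names are my own, and this only illustrates what length normalization means in general. It is not CuffDiff2's internal statistical model, which I have not worked through.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million).
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Same fragment count and library size, different transcript lengths.
long_isoform = fpkm(500, 2000, 20_000_000)
short_isoform = fpkm(500, 1000, 20_000_000)
print(long_isoform, short_isoform)
```

With the same number of fragments, the transcript that is half as long gets twice the FPKM, so any uncertainty in the length used for normalization carries over into the expression estimate.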

It is also worth noting that there have been a few updates to CuffDiff2 since it came out. Maybe the issue was addressed in one of them. I need to read the CuffDiff2 paper and the updates.

This post is just a quick summary of the first part of the paper. I am hoping to read the paper in detail and write up more on it soon.



  1. Hi,
    I am working with transcriptome data from 2 plant samples without replicates, only for DEG analysis, and used the HISAT2 and StringTie pipeline. I am now moving to Ballgown to check for DEGs in 2 samples and one condition. Does Ballgown accept input data without replicates? Kindly let me know about this as early as possible.
