Hacking Illumina GA IIx to Study RNA-Protein Interactions

Although RNA-protein binding is central to many biological processes, such as splicing and other post-transcriptional steps, there are few high-throughput assays for studying RNA-protein interactions. Contrast this with the advances in DNA-protein interactions. Nature Biotechnology has an interesting paper from the Greenleaf group at Stanford, who hacked an Illumina GA IIx to study RNA binding.

By hacking, or repurposing (as the authors call it), the Illumina sequencer, they were able to design an assay that quantitatively measures protein binding to over 10 million RNA targets on the Illumina flow cell surface. This makes ultra-high-throughput, quantitative measurement of RNA-protein interactions possible. The team also developed methods to analyze the images from the sequencer to measure equilibrium binding constants and dissociation kinetics.

The team, led by William Greenleaf, cleverly used the hacked Illumina sequencer to create transcribed RNA pieces that are still attached to the DNA on the flow cell. These DNA-tethered RNA molecules then served as the substrate for a fluorescently labeled RNA-binding protein. Since the protein is fluorescently labeled, each RNA binding event can be measured quantitatively using the imaging setup that is part of the sequencer.

Hacking Illumina GA IIx to Study RNA-Protein Interactions (Source: Greenleaf Lab)

Tethering Millions of RNA Pieces to DNA on Illumina Flow Cell

Briefly, instead of sequencing regular DNA samples, the team designed RNA target sequences that can be transcribed by E. coli RNA polymerase (RNAP) in the Illumina flow cell. The designed targets contained an RNAP initiation-and-stall sequence and a region coding for diverse sequence variants of the MS2 RNA. They were also barcoded so that the different RNA variants could be identified.

The really clever experimental tricks used to quantify RNA-protein binding are as follows. After sequencing the designed targets, they removed the sequenced DNA strand, regenerated dsDNA, and created a terminal biotin-streptavidin roadblock on the dsDNA fragments. Then they used E. coli RNAP to generate 26 bases of RNA, at which point the polymerase stalls. After removing any excess RNA polymerase with a wash step, they provided all four nucleotides, allowing RNAP to transcribe the variable region and stall again at the biotin-streptavidin roadblock.

Essentially, this sequence of complicated procedures gives rise to “transcribed RNA” that is still tethered to its parent DNA by RNA polymerase. This way they could create an RNA array containing over 12 million distinct clonal RNA populations comprising 1.48 × 10^5 unique sequences in a single sequencing lane.
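The numbers above imply that each unique sequence is represented by many independent clusters. Here is a minimal Python sketch of that arithmetic and of one way the redundancy could be used, namely summarizing per-cluster fluorescence by variant. The function name, the (variant, signal) input format, and the choice of the median are illustrative assumptions, not the analysis pipeline from the paper.

```python
from collections import defaultdict
from statistics import median

def summarize_by_variant(clusters):
    """Group per-cluster signals by variant and return (count, median) per variant.

    `clusters` is an iterable of (variant_id, fluorescence) pairs, one per cluster.
    This input format is hypothetical; it is not taken from the paper.
    """
    grouped = defaultdict(list)
    for variant_id, signal in clusters:
        grouped[variant_id].append(signal)
    return {v: (len(s), median(s)) for v, s in grouped.items()}

# Average redundancy implied by the figures quoted in the post.
clusters_per_lane = 12_000_000
unique_sequences = 1.48e5
mean_clusters_per_variant = clusters_per_lane / unique_sequences  # roughly 81

# Toy data: three clusters of one variant, one cluster of another.
toy_clusters = [
    ("variant_A", 100.0),
    ("variant_A", 120.0),
    ("variant_A", 110.0),
    ("variant_B", 15.0),
]
summary = summarize_by_variant(toy_clusters)
print(round(mean_clusters_per_variant), summary)
```

With roughly 80 clusters per variant on average, a robust summary such as the median would be less sensitive to a few bad clusters than a single measurement would be.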

Quantitative Measurement of RNA-Protein Binding

With RNA tethered to DNA on the Illumina flow cell, the team flowed the fluorescently labeled RNA-binding protein (MS2 coat protein) over the flow cell so that it could bind the RNA. They performed the experiment at 10 different concentrations of the RNA-binding protein. Then the team used the image analysis tools they developed to analyze the fluorescence signal and its decay, yielding binding and dissociation constants. Pretty neat hack.
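To make the idea concrete, here is a minimal Python sketch of how a binding constant and an off-rate could be extracted from such measurements for a single RNA variant. It assumes simple one-site binding, F = Fmax · c / (c + Kd), and single-exponential dissociation, F = F0 · exp(-koff · t). The function names, the grid-search fit, and the synthetic numbers are illustrative assumptions; the post does not describe the fitting procedure actually used in the paper.

```python
import math

def fit_kd(concs, signals, kd_grid):
    """Fit F = fmax * c / (c + Kd) by scanning candidate Kd values.

    For each candidate Kd the best fmax has a closed-form least-squares
    solution, so only Kd needs to be searched.
    """
    best = None
    for kd in kd_grid:
        occupancy = [c / (c + kd) for c in concs]
        fmax = (sum(o * s for o, s in zip(occupancy, signals))
                / sum(o * o for o in occupancy))
        sse = sum((s - fmax * o) ** 2 for o, s in zip(occupancy, signals))
        if best is None or sse < best[0]:
            best = (sse, kd, fmax)
    return best[1], best[2]

def fit_koff(times, signals):
    """Fit F = F0 * exp(-koff * t) by linear regression on ln(F) versus t."""
    logs = [math.log(s) for s in signals]
    n = len(times)
    t_mean = sum(times) / n
    l_mean = sum(logs) / n
    slope = (sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
             / sum((t - t_mean) ** 2 for t in times))
    return -slope

# Synthetic, noise-free equilibrium data at ten protein concentrations (made up).
true_kd, true_fmax = 5.0, 1000.0
concs = [0.1 * 3 ** i for i in range(10)]
signals = [true_fmax * c / (c + true_kd) for c in concs]
kd_grid = [10 ** (e / 100) for e in range(-200, 301)]  # log-spaced, 0.01 to 1000
kd_hat, fmax_hat = fit_kd(concs, signals, kd_grid)

# Synthetic dissociation time course after protein is washed out (made up).
true_koff = 0.02
times = [0, 30, 60, 120, 240]
decay = [800.0 * math.exp(-true_koff * t) for t in times]
koff_hat = fit_koff(times, decay)

print(kd_hat, fmax_hat, koff_hat)
```

On real images the signals would be noisy and a background term would likely be needed, so a proper nonlinear least-squares fit would be a more natural choice than this grid scan.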

How Did the Stanford Team Hack the Illumina GA IIx?

The methods section of the RNA MaP paper says that sequencing was done at California-based ELIM Biopharmaceuticals. It was not clear whether the hacking itself was also done by the company. [See the comment from one of the authors of the paper.] If you wondered how they hacked the Illumina GA IIx, here is the part of the paper describing it.

To improve the optics and allow for equilibrium measurements on an Illumina sequencer, we modified the sequencer in several ways. First, we exchanged the standard Illumina fluorescence filter to a filter optimized for SNAP-Surface 549 fluorescence emission (Semrock FF01-562/40-25). Second, we eliminated unwanted wash steps after imaging and during the ‘safe state’ mode by changing the default SCS files. C:\Illumina\SCS2.10\DataCollection\bin\Config\HCMConfig.xml was modified to: , and C:\Illumina\SCS2.10\DataCollection\bin\Config\ImageCyclePump.xml was modified to . We also shortened all the fluidics lines of the GAIIx and the associated paired-end module.


  1. Kudos to the Greenleaf team for a very elegant, nonintrusive hack to move the needle on a new analysis! I think a lot of folks would be apprehensive about tweaking their next-generation sequencer, whereas Greenleaf’s team demonstrated a practical extension of the technology for novel means!

  2. Lauren Chircus says:

    Just to clarify: Elim Bio only did the sequencing. The RNA generation and binding experiments were done on our own Illumina GAIIx. Glad you liked our paper!

  3. nextgenseek says:

    Thanks a lot for the clarification. Very interesting work.
    Earlier Greenleaf also answered the question on twitter. My bad that I did not update here.
