Introduction: What is RNA-Seq?

RNA molecule
from Wikipedia

Types of RNAs

In both prokaryotes and eukaryotes, there are 3 main types of RNA:

  • Messenger RNA, aka mRNA:
    • Represents ~1-5% of total RNA mass;
    • Protein-coding;
    • Mostly polyadenylated (at the 3′ end);
    • Very heterogeneous in terms of base sequence and size.
  • Ribosomal RNA, aka rRNA:
    • Represents ~80-90% of total RNA.
  • Transfer RNA, aka tRNA:
    • Represents ~15% of total RNA;
    • Small in size: ~75-95 nt.

But there are many more types of RNAs:

  • Micro RNA, aka miRNA:
    • Regulatory RNAs;
    • Small in size: ~20-25 nt.
  • Small nuclear RNA, aka snRNA:
    • Some snRNAs are involved in splicing.
  • And many more: lncRNA, eRNA, scaRNA, gRNA, piRNA, etc.

Estimate of RNA levels in a typical mammalian cell
from Palazzo and Lee, Frontiers in Genetics, 2015


RNA-sequencing

RNA-sequencing, aka RNA-seq, is a high-throughput sequencing technique for identifying and quantifying RNA molecules in biological samples.

RNA-Seq summary
from Wikipedia

This technology is used to assess:

  • RNA/gene/transcript expression;
  • alternatively spliced transcripts;
  • gene fusions and SNPs;
  • post-transcriptional modifications.

Other technologies for assessing RNA expression include Northern blot, real-time PCR, and hybridization-based microarrays.
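To make "expression quantification" a little more concrete, here is a minimal sketch of turning raw read counts per gene into counts per million (CPM). CPM is one common normalization and is an assumption here, as are the gene names and counts, which are invented for illustration; the text above does not prescribe any particular measure.

```python
def counts_per_million(counts):
    """Scale raw read counts so that they sum to one million.

    `counts` maps a gene name to the number of reads assigned to it.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads were counted")
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


# Hypothetical read counts for three genes in one sample.
raw_counts = {"geneA": 500, "geneB": 1500, "geneC": 8000}
cpm = counts_per_million(raw_counts)

for gene, value in cpm.items():
    print(f"{gene}: {value:.0f} CPM")
```

Scaling by the total number of reads makes samples sequenced at different depths comparable, but it does not correct for gene length.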


Technologies and protocols

RNA-seq can target different RNA populations, using different protocols:

  • Positive selection of mRNAs: polyA selection.
  • Negative selection: rRNA depletion, which retains both polyadenylated and non-polyadenylated RNAs.
  • Size selection: e.g. small RNA selection.
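Why selection matters can be seen with a little arithmetic: since rRNA makes up most of the total RNA, removing it sharply increases the share of everything else. The sketch below is only an illustration. The composition values are rough figures picked from within the ranges quoted earlier, the 99% depletion efficiency is hypothetical, and the function name `deplete` is made up for this example.

```python
# Approximate mass fractions of total RNA (illustrative values only;
# real proportions vary by cell type and condition).
composition = {"rRNA": 0.85, "tRNA": 0.12, "mRNA": 0.03}


def deplete(fractions, target, efficiency):
    """Remove a share `efficiency` (0 to 1) of `target`, then renormalize
    the remaining fractions so that they sum to 1 again."""
    remaining = dict(fractions)
    remaining[target] *= 1.0 - efficiency
    total = sum(remaining.values())
    return {name: value / total for name, value in remaining.items()}


# Hypothetical rRNA depletion that removes 99% of the rRNA.
after = deplete(composition, "rRNA", 0.99)

for name in composition:
    print(f"{name}: {composition[name]:.1%} before, {after[name]:.1%} after")
```

Under these assumptions the mRNA share rises from 3% to roughly 19%. The model only tracks relative mass and ignores everything else that happens during library preparation.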


Depending on the technology and the protocol, RNA-seq can produce:

  • Single-end short reads (50-450 nt; mainly Illumina, but also Ion Torrent and BGISEQ), which are used for gene expression quantification;
  • Paired-end reads (2 x 50-250 nt), which are useful for detecting splicing events and refining transcriptome annotation;
  • Single long reads (PacBio or Oxford Nanopore Technologies, ONT), which are used for the de novo identification of new transcripts and for improving transcriptome assembly.
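The difference between single-end and paired-end reads can be sketched on a toy cDNA fragment. This is a simplified model, not a simulator: it assumes error-free reads, a fragment longer than the read length, and the common convention in which the second read of a pair is read from the opposite strand, so it appears as the reverse complement of the fragment's other end. The fragment sequence and function names are invented for illustration.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def single_end_read(fragment, read_length):
    """Sequence only one end of the fragment."""
    return fragment[:read_length]


def paired_end_reads(fragment, read_length):
    """Sequence both ends: read 1 from the start of the fragment,
    read 2 from the other end, on the opposite strand."""
    read1 = fragment[:read_length]
    read2 = reverse_complement(fragment[-read_length:])
    return read1, read2


# Toy 31-nt fragment; real fragments are typically a few hundred nt long.
fragment = "ATGGCGTACGTTAGCCGATAGGCTTAACGGA"

read1, read2 = paired_end_reads(fragment, 10)
print("single-end:", single_end_read(fragment, 10))
print("paired-end:", read1, read2)
```

Because the two reads of a pair come from the same fragment, they can land in different exons of the same transcript, which is one reason paired-end data help with detecting splicing events.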