Introduction: What is RNA-Seq?
Figure: RNA molecule (from Wikipedia)
Types of RNAs
In both prokaryotes and eukaryotes, there are 3 main types of RNA:
- Messenger RNA, aka mRNA:
  - Represents ~1-5% of total RNA mass;
  - Protein-coding;
  - Mostly polyadenylated at the 3' end (in eukaryotes);
  - Very heterogeneous in base sequence and size.
- Ribosomal RNA, aka rRNA:
  - Represents ~80-90% of total RNA.
- Transfer RNA, aka tRNA:
  - Represents ~15% of total RNA;
  - Small in size: ~75-95 nt.
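To make these proportions concrete, here is a minimal sketch that converts the approximate mass fractions listed above into expected amounts for a given quantity of total RNA. The 1 µg input and the function name `expected_mass_ng` are assumptions chosen for illustration; the fractions are the rough ranges quoted in the list, not measurements.

```python
# Approximate mass fractions of total RNA, as (low, high) ranges
# taken from the list above. These are rough estimates.
RNA_MASS_FRACTIONS = {
    "mRNA": (0.01, 0.05),
    "rRNA": (0.80, 0.90),
    "tRNA": (0.15, 0.15),
}


def expected_mass_ng(total_rna_ng, rna_type):
    """Return the (low, high) expected mass in ng for one RNA type."""
    low, high = RNA_MASS_FRACTIONS[rna_type]
    return total_rna_ng * low, total_rna_ng * high


if __name__ == "__main__":
    total_ng = 1000.0  # hypothetical input: 1 µg of total RNA
    for rna_type in RNA_MASS_FRACTIONS:
        low, high = expected_mass_ng(total_ng, rna_type)
        print(f"{rna_type}: {low:.0f}-{high:.0f} ng")
```

Under these assumptions, only a few tens of nanograms of a 1 µg total RNA sample would be mRNA, which is why protocols that enrich mRNA or remove rRNA matter in practice.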
But there are many more types of RNAs:
- Micro RNA, aka miRNA:
  - Regulatory RNAs;
  - Small in size: ~20-25 nt.
- Small nuclear RNA, aka snRNA:
  - Some snRNAs are involved in splicing.
- And many more: lncRNA, eRNA, scaRNA, gRNA, piRNA, etc.
Figure: Estimate of RNA levels in a typical mammalian cell (from Palazzo and Lee, Frontiers in Genetics, 2015)
RNA-sequencing
RNA-sequencing, aka RNA-seq, is a high-throughput sequencing technique for identifying and quantifying RNA molecules in biological samples.
Figure: RNA-seq summary (from Wikipedia)
This technology is used to assess:
- RNA/gene/transcript expression;
- alternatively spliced transcripts;
- gene fusions and SNPs;
- post-transcriptional modifications.
Other technologies for assessing RNA expression include Northern blot, real-time PCR, and hybridization-based microarrays.
Technologies and protocols
RNA-seq can target different RNA populations, using different protocols:
- Positive selection of mRNAs: polyA selection.
- Negative selection: rRNA depletion, which also retains non-polyadenylated RNAs.
- Size selection: e.g. small RNA selection.
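The three strategies above can be pictured as filters applied to a pool of RNA molecules. The following is a toy sketch only: the `Rna` record, the molecule names, their lengths, and the 15-30 nt size window are hypothetical values chosen for illustration, and real protocols act on molecules at the bench (e.g. oligo-dT capture or probe-based rRNA removal), not on data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rna:
    """Hypothetical record describing one RNA molecule in a toy sample."""
    name: str
    kind: str        # e.g. "mRNA", "rRNA", "tRNA", "miRNA"
    length_nt: int
    has_polya: bool


def polya_selection(rnas):
    """Positive selection: keep only polyadenylated molecules."""
    return [r for r in rnas if r.has_polya]


def rrna_depletion(rnas):
    """Negative selection: remove rRNA and keep everything else."""
    return [r for r in rnas if r.kind != "rRNA"]


def size_selection(rnas, min_nt, max_nt):
    """Size selection: keep molecules whose length is within [min_nt, max_nt]."""
    return [r for r in rnas if min_nt <= r.length_nt <= max_nt]


# Toy sample; all names and lengths are made up for illustration.
SAMPLE = [
    Rna("mRNA_polyA", "mRNA", 1800, True),
    Rna("mRNA_no_polyA", "mRNA", 400, False),
    Rna("rRNA_x", "rRNA", 1900, False),
    Rna("tRNA_x", "tRNA", 76, False),
    Rna("miRNA_x", "miRNA", 22, False),
]

if __name__ == "__main__":
    print("polyA selection:", [r.name for r in polya_selection(SAMPLE)])
    print("rRNA depletion: ", [r.name for r in rrna_depletion(SAMPLE)])
    print("size 15-30 nt:  ", [r.name for r in size_selection(SAMPLE, 15, 30)])
```

In this toy sample, polyA selection keeps only the polyadenylated mRNA, whereas rRNA depletion keeps every non-rRNA molecule, including the mRNA without a polyA tail and the small RNAs.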
Depending on the technology and the protocol, RNA-seq can produce:
- Single-end short reads (50-450 nt; mainly Illumina, but also Ion Torrent and BGISEQ), which are used for gene expression quantification;
- Paired-end reads (2 x 50-250 nt), which are useful for detecting splicing events and refining transcriptome annotation;
- Single long reads (PacBio or Oxford Nanopore Technologies, ONT), which are used for de novo identification of new transcripts and for improving transcriptome assembly.
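Read length is one of the first things to check when receiving sequencing data. As a minimal sketch, assuming reads are stored in FASTQ format with exactly four lines per record (header, sequence, separator, qualities), the lengths can be summarised as shown below. The function name `read_lengths` and the two toy records are hypothetical, and the toy reads are far shorter than real ones.

```python
import io
from statistics import mean


def read_lengths(handle):
    """Return the sequence length of every record in a FASTQ stream.

    Assumes strict 4-line records: the sequence is the 2nd line of each record.
    """
    return [len(line.strip()) for i, line in enumerate(handle) if i % 4 == 1]


# Toy FASTQ content, made up for illustration.
TOY_FASTQ = (
    "@read1\n"
    "ACGTACGTAC\n"
    "+\n"
    "IIIIIIIIII\n"
    "@read2\n"
    "ACGTAC\n"
    "+\n"
    "IIIIII\n"
)

if __name__ == "__main__":
    lengths = read_lengths(io.StringIO(TOY_FASTQ))
    print(f"reads={len(lengths)} min={min(lengths)} "
          f"max={max(lengths)} mean={mean(lengths):.1f}")
```

With a real file, the same function could be given an open file handle instead of the in-memory string; for paired-end data, each of the two files would be summarised separately.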