Reducing Biases in Small RNA Sequencing
Poster Feb 19, 2015
Adam Morris Ph.D., Dawn Obermoeller, Masoud Toloue Ph.D.
The past decade has seen an explosion of interest in cataloging the small RNA repertoires of animal and plant species, and in understanding the biological function of small RNAs. Distinguishing closely related small RNAs is difficult using hybridization-based approaches, since imperfectly matched small RNAs may still hybridize to primers or immobilized probes. These considerations have led to the realization that high-throughput sequencing is the most practical method for large-scale studies that aim to identify and enumerate small RNAs in various species and tissues.
Unfortunately, next-generation sequencing (NGS) approaches to small RNA analysis are not without their own challenges. Several studies have now shown that entire datasets, including those in miRBase, contain severe sequence bias; that is, the small RNA abundance reported by sRNA-seq does not accurately reflect actual expression. Significant effort has gone into identifying the cause of this misrepresentation, and it is now generally accepted that bias in sRNA-seq libraries is introduced primarily during the ligation steps of library preparation. Specifically, RNA ligases show sequence-specific preferences when ligating flow cell adapters to small RNAs, resulting in preferential inclusion of some small RNAs in sRNA-seq libraries at the expense of others. Simply using two different adapter sequences during ligation can result in up to 30-fold apparent differential expression for some microRNAs. No single adapter sequence ligates efficiently to all small RNAs, indicating that the target sequence, as well as the adapter sequence, is a source of bias.
Our approach to overcoming ligation bias in sRNA-seq libraries involves using a pool of adapters having randomized sequences at the ligation site, where each adapter sequence is present in vast molar excess over any given small RNA in the sample. Experiments show that most of the bias in adapter ligation is due to the sequence of 2-4 adapter nucleotides adjacent to the target junction.
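To give a sense of the size of such a pool, the following minimal Python sketch enumerates every adapter variant produced by randomizing a few positions at the ligation junction. It is an illustration under stated assumptions, not the protocol's actual adapter design: the function name randomized_pool, the constant region "GATTACA", the DNA alphabet, and the choice of four randomized positions are all hypothetical.

```python
from itertools import product

BASES = "ACGT"  # assumption: a DNA adapter; swap in "ACGU" for an RNA adapter


def randomized_pool(constant, n_random, junction_at_start=True):
    """Return every adapter variant with n_random randomized junction bases.

    constant          -- fixed portion of the adapter (hypothetical placeholder)
    n_random          -- number of randomized positions next to the junction
    junction_at_start -- True if the ligation junction is at the start of the
                         adapter string, False if it is at the end
    """
    variants = ("".join(p) for p in product(BASES, repeat=n_random))
    if junction_at_start:
        return [v + constant for v in variants]
    return [constant + v for v in variants]


# "GATTACA" is a made-up constant region, used only for illustration.
pool = randomized_pool("GATTACA", 4)
print(len(pool))  # 4**4 = 256 distinct adapter sequences
print(pool[:3])   # ['AAAAGATTACA', 'AAACGATTACA', 'AAAGGATTACA']
```

Randomizing four positions yields 256 distinct adapters, so each individual sequence makes up only 1/256 of the pool. This is why the total adapter concentration has to be high enough for every variant to remain in molar excess over any given small RNA.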
Using our randomized adapter strategy, we prepared and sequenced small RNA libraries from both synthetic small RNAs and small RNA isolated from human brain. Libraries prepared with randomized adapters showed significantly more even coverage, owing to reduced ligase bias. We will demonstrate why our new streamlined small RNA-seq protocol is critical for those who need to accurately assess small RNA abundance using high-throughput sequencing.
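One way to put a number on "even coverage" for a synthetic pool is sketched below. It assumes the synthetic small RNAs were mixed in equimolar amounts, so an unbiased library would give equal read counts. The function name coverage_evenness, the two metrics chosen, and all read counts are hypothetical illustrations, not measurements from this work.

```python
from statistics import mean, pstdev


def coverage_evenness(counts):
    """Summarize how evenly reads are spread across an equimolar pool.

    counts -- dict mapping small RNA name to read count (all counts > 0)
    Returns (coefficient of variation, max/min fold range); for a perfectly
    even library these are 0.0 and 1.0 respectively.
    """
    values = list(counts.values())
    if min(values) <= 0:
        raise ValueError("all read counts must be positive")
    cv = pstdev(values) / mean(values)
    fold_range = max(values) / min(values)
    return cv, fold_range


# Made-up read counts for three synthetic small RNAs (illustration only).
fixed_adapter = {"synth_1": 3000, "synth_2": 100, "synth_3": 900}
randomized_adapter = {"synth_1": 1200, "synth_2": 1000, "synth_3": 800}

for label, counts in [("fixed", fixed_adapter), ("randomized", randomized_adapter)]:
    cv, fold_range = coverage_evenness(counts)
    print(f"{label}: CV={cv:.2f}, fold range={fold_range:.1f}")
```

A lower coefficient of variation and a fold range closer to 1 both indicate less ligation bias. Libraries should be normalized to the same sequencing depth before their values are compared.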