Optimizing RNA Sequencing of FFPE Samples
App Note / Case Study
Last Updated: July 1, 2024
Published: May 17, 2024
Formalin-fixed paraffin-embedded (FFPE) samples held in clinical laboratories are an invaluable resource for translational research, especially in the era of personalized medicine.
However, FFPE processing and tissue storage are known to cause sample degradation, resulting in low sample input and ultimately limiting gene expression-based biomarker discovery using RNA sequencing. As a result, samples need to be enriched for mature mRNAs by depleting highly abundant ribosomal RNAs (rRNAs) before sequencing can occur.
This application note evaluates several commercially available rRNA depletion kits, providing an optimized protocol for the efficient removal of rRNA from FFPE samples.
Download this app note to ensure:
- Superior sequencing data from degraded RNA, including FFPE-derived RNA
- A streamlined workflow that generates high-quality libraries
- Improved reproducibility and consistency in RNA sequencing results
SureSelect XT HS2 RNA Sequencing
of Ribosomal RNA-Depleted Samples
Author
Katherine Felts, Kristina Vsevolodova,
L. Scott Basehore, Natalia
Novoradovskaya and Scott Happe,
Agilent Technologies, Inc.
Abstract
In this application note, we present the use of commercially available ribosomal
RNA (rRNA) depletion kits in conjunction with SureSelect XT HS2 RNA reagents
for Illumina sequencers. A panel of kits that employ different approaches for the
depletion of rRNA sequences was examined, and data from two of the better-
performing kits is presented. Finally, a detailed SureSelect XT HS2 rRNA protocol
is provided.
Introduction
Next-generation sequencing (NGS) of RNA can provide
useful information regarding expression profiles, splice
variants, fusion transcripts, and post-transcriptional
modifications. To allow for efficient transcript analysis, it is
highly beneficial to remove abundant RNA species from the
test sample. Three common approaches to enriching RNA
samples are isolation of mRNA using oligo(dT), enrichment
with targeted probes, and depletion of ribosomal RNA. As
ribosomal RNA is the most highly abundant component
of total RNA (80 to 90%), its elimination is a requirement
before any analysis of the transcriptome. Targeted RNA
sequencing (RNA-Seq) is a focused evaluation of selected
transcripts and is shown to generate quality libraries
using formalin-fixed, paraffin-embedded (FFPE) samples.
However, if a method to interrogate an unbiased population
is required, then mRNA enrichment or rRNA depletion are
more suitable methods to use. Oligo(dT) enrichment for
mRNA is only advisable for use on high-quality samples as
highly fragmented samples will result in large sample losses
and a population of terminal ends. The use of rRNA depletion
over mRNA enrichment enables researchers to prepare
RNA libraries from both high-quality and highly fragmented
samples (such as FFPE) and examine non-polyadenylated
RNA species that may be of interest.
The Agilent SureSelect XT HS2 RNA portfolio does not
currently provide a commercial ribosomal depletion module.
Therefore, to support customers wishing to prepare rRNA-
depleted XT HS2 RNA libraries, we optimized a procedure
and evaluated it with several commercially available depletion
kits. Table 1 describes the five rRNA depletion kits that were
examined in an initial round of testing. All the kits target rRNA
species from human, mouse, and rat (H/M/R); however, only
human RNA was tested in these experiments. In addition to
human rRNA, Kit J also targets bacterial rRNA and human
beta-globin RNA. Four different modes of action for rRNA
depletion were represented by the kits tested. Three of the
modes of action depleted the RNA sample before library
preparation, and one targeted DNA after the library was
constructed (Kit K). Based on performance, ease of use, and
compatibility with the existing RNA XT HS2 reagents and
workflow, two kits were chosen for more extensive testing
(Kit O and Kit M).
Kit O targets rRNA with sequence-specific DNA
oligonucleotides followed by treatment with RNase H, which
digests DNA/RNA hybrids. The procedure consists of three
incubations followed by a SPRI purification of the depleted
RNA. RNA-specific SPRI beads are purchased separately. Kit
M targets rRNA sequences using tagged oligos that are then
removed from the solution by tag-binding magnetic beads. Kit
M takes less time to complete and includes all the reagents
needed to perform the depletion. However, it requires more
hands-on steps than Kit O. Data from running these two kits
using both high-quality and FFPE RNA is presented.
Table 1. Commercially available rRNA depletion kits tested. The five kits detailed in Table 1 were all examined for compatibility with the Agilent SureSelect RNA
XT HS2 chemistry and workflow. Four different modes of action are represented within the five kits tested. Four of the kits process the RNA sample before library
construction, but one kit targets DNA after library construction (Kit K). The time to perform the depletion, manufacturer recommended input ranges, FFPE sample
compatibility, and our assessment of each of the kits are also included.
Experimental
RNA sources
Universal Human Reference RNA (UHRR) for qPCR was obtained from Agilent Technologies (Santa Clara, CA p/n 750500-
41). SeraSeq FFPE Fusion RNA Reference Material v4 was purchased from SeraCare (Gaithersburg, MD p/n 0710-0496). A
melanoma FFPE tissue section was obtained from Cureline Human Biospecimen CRO (Brisbane, CA, p/n custom).
RNA isolation and qualification
RNA was isolated from FFPE tissue sections using the RNeasy FFPE Kit from Qiagen (Germantown, MD p/n 73504). The
concentration of each extraction was determined on a NanoDrop spectrophotometer (Thermo Fisher Scientific, Waltham, MA).
Samples were further analyzed on an Agilent Bioanalyzer 2100 system using the RNA 6000 Nano kit (Santa Clara, CA p/n 5067-1511).
DV200 values (% of sample with >200 nt fragment length) were determined by performing a smear analysis using the
Bioanalyzer software.
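The DV200 definition above (percent of the sample longer than 200 nt) can be illustrated with a minimal sketch. It assumes the electropherogram has already been reduced to hypothetical (fragment size, signal) bins; the Bioanalyzer smear analysis integrates the actual trace, so this is an approximation of the idea rather than the instrument's calculation. The function name and the example numbers are illustrative only.

```python
from typing import Sequence


def dv200(sizes_nt: Sequence[float], signal: Sequence[float],
          threshold_nt: float = 200.0) -> float:
    """Percent of total signal contributed by fragments longer than threshold_nt."""
    if len(sizes_nt) != len(signal):
        raise ValueError("sizes_nt and signal must have the same length")
    total = sum(signal)
    if total <= 0:
        raise ValueError("total signal must be positive")
    # Only bins strictly above the threshold count toward DV200.
    above = sum(s for size, s in zip(sizes_nt, signal) if size > threshold_nt)
    return 100.0 * above / total


# Hypothetical binned trace for a degraded sample (not data from this study).
sizes = [50, 100, 150, 200, 300, 500, 1000, 2000]
signal = [10, 20, 30, 40, 50, 30, 15, 5]
print(f"DV200 = {dv200(sizes, signal):.1f}%")  # prints "DV200 = 50.0%"
```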
SureSelect XT HS2 RNA library preparation
RNA samples were depleted of ribosomal RNA using either Kit O or Kit M according to the manufacturer's instructions, except
that final sample volumes were adjusted to accommodate the RNA XT HS2 start volume of 10 µL. Libraries were prepared from
depleted RNA samples using the SureSelect XT HS2 RNA Library Prep kit for Illumina (Agilent Technologies p/n G9993A-D) and
the procedure provided in the Appendix.
Sequencing and data analysis
Libraries were analyzed on an Illumina HiSeq 4000 by paired-end sequencing using a 2 x 150 read format. FASTQ files were
aligned to human reference genome GRCh38 using STAR RNA-Seq aligner v2.7.2. Libraries were downsampled to 20 million
reads for analysis. PCR duplicates were marked using a Picard duplicate marking tool (Broad Institute), and the duplication
rate and estimated library size were reported from the output. The duplicate-marked .bam files were then analyzed using
RNA-SeQC v1.1.8 (Broad Institute), and strand specificity and % mapped reads were reported from the output. The residual
percent rRNA was determined by aligning FASTQ files to an rRNA reference file using BWA-MEM; the percentage of reads
aligned corresponds to the percent of rRNA in the library. Gene expression profiles, measured in transcripts per million (TPM),
were generated using RSEM v0.8.1. Correlation coefficients were calculated from the TPM output using Microsoft Excel. Fusion
transcripts were identified using STAR-Fusion v1.8.0.
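Two of the metrics described above are simple ratios, and a short sketch may make the arithmetic concrete. It assumes the read counts have already been extracted from the aligner and duplicate-marking outputs; the function names and example counts are hypothetical, and the duplicate-marking tool's own definition (which treats read pairs and optical duplicates specifically) may differ in detail.

```python
def percent_residual_rrna(rrna_aligned_reads: int, total_reads: int) -> float:
    """Percent of reads that align to the rRNA reference."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * rrna_aligned_reads / total_reads


def percent_duplication(duplicate_reads: int, examined_reads: int) -> float:
    """Percent of examined reads flagged as duplicates (simplified definition)."""
    if examined_reads <= 0:
        raise ValueError("examined_reads must be positive")
    return 100.0 * duplicate_reads / examined_reads


# Hypothetical counts for a library downsampled to 20 million reads.
print(percent_residual_rrna(150_000, 20_000_000))   # 0.75
print(percent_duplication(3_000_000, 20_000_000))   # 15.0
```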
Results and Discussion
High-quality RNA
To assess performance of the two rRNA depletion methods, libraries were prepared from UHRR using a wide range of
inputs. Inputs of 10 to 1,000 ng for both depletion methods resulted in high-quality libraries. The strand specificities and
mapping rates were high for both kits, but the mapping rates were slightly better for Kit O. An input of 1 ng is not recommended
by the manufacturer for either SureSelect XT HS2 RNA or the Kit O depletion kit, and the results in Figure 1 support that
recommendation. For both test kits, the 1 ng input showed a steep increase in the duplication rate in addition to
reduced library sizes. Both methods removed rRNA effectively (all libraries had less than 1% residual rRNA). However, the highest input
for Kit M showed an increase in residual rRNA, indicating that 1,000 ng may be close to the upper input limit of the kit. The
column graph to the right of the metrics table shows more detail regarding the residual rRNA content of the individual libraries.
FFPE RNA
Because of the poor condition of FFPE samples, not all the RNA molecules in a given sample are of sufficient quality to be
manipulated in enzymatic reactions. In addition, very small fragments get washed away during SPRI purification, which also
contributes to sample loss. Low inputs can also result in the undesirable formation of adaptor dimers during library construction.
Because of these negative impacts, higher inputs of FFPE are strongly recommended to make up for the loss of "usable" template.
With this in mind, the range of inputs tested for the melanoma FFPE sample started at 50 ng on the low end. The FFPE libraries
depleted using Kit M have four to five times more residual rRNA than Kit O-depleted samples. Due to the contribution from residual
rRNA, the duplication rates of Kit M libraries appear lower, and the library sizes appear larger. The rRNA fragments
incorporated into these libraries carry adaptors with molecular barcodes (MBCs), so they are counted as unique molecules
by the duplicate marking software. Their presence increases apparent library complexity and reduces the number of duplicates. However, these
are uninformative library members that take space in the library away from more meaningful ones. This conclusion is supported
by the lower mapping rates observed for Kit M, as only a few species of rRNA map to the GRCh38 reference file. The mapping
rates for the Kit O depletion method are higher, which demonstrates the benefit of minimizing the rRNA content. As anticipated, all
library metrics improved with increased input for both depletion methods. If available, higher inputs of FFPE RNA (> 100 ng) are expected to
be highly beneficial and are recommended. The chart to the right of the metrics table shows the percent residual rRNA
for the individual libraries. One of the 50 ng Kit O duplicates has unusually high residual rRNA compared to all the other Kit O libraries tested and is
considered to be an outlier.
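The reasoning about molecular barcodes and duplication rates can be shown with a toy example. The sketch below is not the duplicate-marking software used in this study; it assumes a simplified model in which a read is identified by a hypothetical (chromosome, start, strand) position plus its MBC, and it ignores paired-end coordinates and MBC sequencing errors that real tools account for.

```python
from typing import Iterable, NamedTuple


class Read(NamedTuple):
    chrom: str
    start: int
    strand: str
    mbc: str  # molecular barcode sequence attached with the adaptor


def duplication_rate(reads: Iterable[Read], use_mbc: bool) -> float:
    """Fraction of reads that are duplicates under the chosen uniqueness key."""
    reads = list(reads)
    if not reads:
        raise ValueError("no reads supplied")
    if use_mbc:
        keys = {(r.chrom, r.start, r.strand, r.mbc) for r in reads}
    else:
        keys = {(r.chrom, r.start, r.strand) for r in reads}
    return 1.0 - len(keys) / len(reads)


# Four hypothetical reads at one position; three carry distinct MBCs.
reads = [
    Read("chr1", 1000, "+", "AAA"),
    Read("chr1", 1000, "+", "AAA"),  # same position and MBC: a true duplicate
    Read("chr1", 1000, "+", "CCC"),
    Read("chr1", 1000, "+", "GGG"),
]
print(duplication_rate(reads, use_mbc=False))  # 0.75
print(duplication_rate(reads, use_mbc=True))   # 0.25
```

In this model, abundant fragments that pile up at the same coordinates but carry different MBCs are counted as distinct molecules, which lowers the duplication rate and raises the apparent library size without adding informative content.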
Figure 2. RNA sequencing metrics of rRNA-depleted libraries prepared from fragmented FFPE RNA. Duplicate rRNA depletion reactions were performed on 50,
100, 250, and 1,000 ng of FFPE RNA using both the Kit O and Kit M rRNA depletion kits. SureSelect XT HS2 RNA libraries were prepared from the resulting RNA
samples. Libraries were then sequenced and analyzed as described. The sequencing metrics table represents the average results from duplicate libraries. The bar
graph shows the percent residual rRNA for each individual library. Kit O libraries are represented by orange bars and Kit M libraries are represented by blue bars.
Gene expression
To determine if the gene expression profiles of rRNA-depleted RNA XT HS2 libraries were significantly altered by the
ribodepletion process, a correlation analysis comparing an unbiased control data set to libraries prepared from rRNA-depleted
samples was performed. To minimize bias, a ground truth library was prepared from 10 ng
unenriched UHRR. This library was deep sequenced and all rRNA sequences were manually removed from the resulting data
set. The test data sets were RNA XT HS2 libraries prepared from duplicate 10 ng UHRR samples that had been depleted using the Kit O and Kit M
rRNA depletion kits. RNA-Seq by expectation-maximization (RSEM) analysis was performed on all the data sets and the
transcripts per million (TPM) outputs were used as input for the correlation study. Correlation coefficients for pairings of all
five libraries are presented in Table 2. A high degree of correlation was observed for all the libraries (r ≥ 0.94), and differences
observed between the two depletion kits were negligible.
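The correlation step can be sketched as follows. This assumes the coefficient reported is a Pearson correlation computed directly on TPM values for genes shared between two libraries; the note does not state whether values were log-transformed or filtered first, so those choices, the function names, and the gene labels are assumptions for illustration.

```python
import math
from typing import Mapping, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def tpm_correlation(lib_a: Mapping[str, float], lib_b: Mapping[str, float]) -> float:
    """Correlate two gene -> TPM tables over the genes present in both."""
    shared = sorted(set(lib_a) & set(lib_b))
    return pearson_r([lib_a[g] for g in shared], [lib_b[g] for g in shared])


# Hypothetical TPM tables (not data from this study).
ground_truth = {"GENE_A": 10.0, "GENE_B": 250.0, "GENE_C": 3.5, "GENE_D": 80.0}
depleted = {"GENE_A": 12.0, "GENE_B": 240.0, "GENE_C": 4.0, "GENE_D": 75.0}
print(round(tpm_correlation(ground_truth, depleted), 3))
```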
Table 2. Gene expression correlation analysis of rRNA-depleted SureSelect XT HS2 RNA libraries to a ground truth library. Duplicate reactions of 10 ng UHRR
were ribodepleted using Kit O and Kit M according to manufacturer instructions. SureSelect XT HS2 RNA libraries were constructed from the resulting depleted
RNA samples and sequenced on the Illumina HiSeq 4000 platform. As a reference, a library from an unenriched 10 ng UHRR sample was constructed then deep
sequenced. The rRNA sequences were then manually removed from the resulting data to provide a ground truth data set for comparison. A TPM gene expression
analysis was performed on each library. Correlation coefficients (r) were calculated from the TPM using Microsoft Excel and are presented below.
Fusion detection
To investigate rRNA depletion kit performance for the
detection of gene fusion transcripts, RNA XT HS2 libraries
were prepared from duplicate 50 ng input reactions of
SeraSeq FFPE Fusion RNA v4 (DV200 60%). Sixteen fusions
are reported for this reference sample, as listed in Table
3. STAR-Fusion analysis software was used to report the
fusions detected in each of the libraries. Junction and
supporting reads were combined and those resulting values
were averaged between replicate libraries and reported in
Table 3. Samples depleted using Kit M showed higher fusion
counts overall and higher counts for 13 of the 16 fusions on
the list. In addition, the Kit M libraries detected all 16 fusions,
but Kit O missed one (EGFR-SEPT14). When considered in the
context of the other sequencing metrics, the level of fusion
detection observed here for Kit M-depleted libraries is even
more notable. As observed in the melanoma FFPE libraries example
(Figure 2), the SeraSeq FFPE Kit M-depleted libraries had
higher amounts of residual rRNA and lower mapping rates
than those prepared with the Kit O depletion kit (data not
shown). Taking this into consideration, the fusion counts for
Kit M libraries would be expected to be even higher if an equal
number of mapped reads had been analyzed.
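The fusion read arithmetic described in this section (combine junction and supporting reads, average across replicates, then consider the number of mapped reads) can be sketched as below. The counts and mapped-read totals are hypothetical, and the per-million scaling is only one possible way to express the "equal number of mapped reads" comparison; it is not a calculation reported in this note.

```python
from typing import Sequence, Tuple


def mean_fusion_support(replicates: Sequence[Tuple[int, int]]) -> float:
    """Average of (junction + supporting) read counts across replicate libraries."""
    if not replicates:
        raise ValueError("at least one replicate is required")
    combined = [junction + supporting for junction, supporting in replicates]
    return sum(combined) / len(combined)


def per_million_mapped(count: float, mapped_reads: int) -> float:
    """Scale a read count to reads per million mapped reads."""
    if mapped_reads <= 0:
        raise ValueError("mapped_reads must be positive")
    return count * 1_000_000 / mapped_reads


# Hypothetical (junction, supporting) counts for one fusion in two replicates.
support = mean_fusion_support([(12, 8), (10, 6)])
print(support)  # 18.0

# The same raw count is a larger fraction of a library with fewer mapped reads.
print(per_million_mapped(support, 15_000_000))  # hypothetical lower-mapping library
print(per_million_mapped(support, 18_000_000))  # hypothetical higher-mapping library
```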
Table 3. Fusion detection. RNA was isolated from SeraSeq FFPE Fusion
RNA Reference v4 curls using the Qiagen RNeasy FFPE Kit. Duplicate 50
ng input reactions were ribodepleted using both Kit O and Kit M according
to the manufacturer instructions. SureSelect XT HS2 RNA libraries were
constructed from the rRNA-depleted samples and sequenced on the
Illumina HiSeq 4000 platform. Samples were normalized to 20 million reads
and fusions were determined using the STAR-Fusion (Broad Institute) analysis
tool. Junction and supporting read numbers for the fusions detected were
combined and results from replicate samples were averaged.
Conclusion
Five rRNA depletion kits were initially screened for use in
conjunction with the SureSelect XT HS2 RNA reagents and
workflow. Three of those kits fit seamlessly into the workflow
and performed well. Two of those kits, Kit O and Kit M, were
selected for extensive testing with both high-quality and FFPE
RNA samples. Both kits performed well on high-quality RNA,
but Kit O was significantly better at removing rRNA from FFPE
samples. An input range of 10 to 1,000 ng of high-quality
RNA resulted in quality libraries for both kits tested. Higher
inputs for FFPE RNA improved library quality for both kits. At
least 100 ng input of FFPE RNA is recommended if available.
Both kits produced libraries with gene expression profiles
that had a high correlation to that of a ground truth control
library and each other. Kit M was superior to Kit O in detecting
gene fusions. Overall, both rRNA depletion kits performed
well in addition to being reliable and easy to use. Either kit
is recommended for use with the SureSelect XT HS2 RNA
chemistry and the protocol is provided in the Appendix.