8 Tips for Improving Your NGS Libraries
How To Guide Oct 22, 2018 | by Natasha Beeton-Kempen, Ph.D.
For most laboratory workflows, you’re only as good as your starting material. This is particularly true for next-generation sequencing (NGS). Library quality is all-important in ensuring you receive high-quality data at the end of what can be a long and laborious pipeline. Ideally, your NGS library should consist of purified target DNA sequences with sufficient yield, minimal bias, and a homogeneous size distribution, all successfully ligated to the appropriate sequencing adapters. There’s an ever-growing list of available library preparation kits, mostly aimed at specific NGS platforms. Each platform and each type of NGS experiment (whole genome, exome, targeted, RNA-Seq, ChIP-Seq, etc.) will require its own optimization; however, several tips and tricks for preparing NGS libraries hold true across all applications. We’ve compiled some handy pointers below.
Use high-quality starting material
This may seem obvious, but it is often overlooked. Your sample may contain contaminants that remain even after sample extraction and purification. This is of course particularly true for “dirty” samples such as fecal swabs and environmental water samples. However, contaminants can also be introduced during the purification process itself – for example, some purification kits elute samples in buffers with high EDTA concentrations that can inhibit the repair, ligase, and polymerase enzymes involved in library construction. Phenol, chloroform, and certain salts can also have a deleterious effect. Further, your sample may contain other genetic material (DNA or RNA) that will compete with your target. For example, the presence of host genomic DNA can affect the sequencing of bacterial pathogens. It is therefore critical to carefully select the appropriate sample extraction and purification approach to enrich your target and minimize contaminants.
Use sufficient starting material
While sequencing platforms offer protocols for sequencing low amounts of input DNA, even to the point of single-cell sequencing, wherever possible you should use at least the minimum input DNA recommended by the manufacturer for your application. A critical part of this is using a reliable DNA/RNA quantification method that is not biased by any other components in your sample. Your sample should also fall well within the linear range of the chosen assay to ensure accurate quantification.
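As a rough illustration of the two checks described above, the sketch below flags a reading that sits too close to the edges of an assay's linear range and a sample whose total mass falls short of a kit's minimum input. The `Assay` class, the `check_input` function, the 10% safety margin, and all numbers are hypothetical placeholders; take the real range and minimum input from your assay and kit documentation. It also assumes the reading is expressed in the same units as the assay range.

```python
from dataclasses import dataclass


@dataclass
class Assay:
    """Hypothetical description of a quantification assay's linear range."""
    name: str
    lower_ng_per_ul: float
    upper_ng_per_ul: float


def check_input(conc_ng_per_ul, volume_ul, assay, min_input_ng, margin=0.1):
    """Return a list of problems; an empty list means both checks passed.

    margin is the fraction of the linear range to stay away from at each
    end, as a crude stand-in for "well within the linear range".
    """
    problems = []
    span = assay.upper_ng_per_ul - assay.lower_ng_per_ul
    safe_low = assay.lower_ng_per_ul + margin * span
    safe_high = assay.upper_ng_per_ul - margin * span
    if not (safe_low <= conc_ng_per_ul <= safe_high):
        problems.append(
            f"{conc_ng_per_ul} ng/uL is outside the safe window "
            f"{safe_low:.1f}-{safe_high:.1f} ng/uL of {assay.name}"
        )
    total_ng = conc_ng_per_ul * volume_ul
    if total_ng < min_input_ng:
        problems.append(
            f"total input {total_ng:.1f} ng is below the minimum {min_input_ng} ng"
        )
    return problems


if __name__ == "__main__":
    # Placeholder assay range and kit minimum, for illustration only.
    assay = Assay("example fluorometric assay", 1.0, 100.0)
    for problem in check_input(95.0, 1.0, assay, min_input_ng=100.0):
        print("WARNING:", problem)
```

If a reading falls outside the safe window, diluting or concentrating the sample and re-measuring is usually preferable to trusting the number.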
Be aware of potential sources of bias
The methods and reagents used to prepare your library fragments and ligate the adapters can result in sequencing bias. This is a particularly important consideration when sequencing amplicons, as the primers and PCR conditions used can lead to biased representation of certain areas of your sample genome or exome. Using enzyme digestion to fragment your sample can also result in over- or under-representation of certain areas of the genome, as digestion enzymes prefer certain sequences over others for their activity. Some genomic areas may be over-digested, so that the resulting fragments are too short and are purified away, while others may be under-digested, so that the fragments are too long and are likewise purified away. The GC content of your sample may also affect amplification steps, and this needs to be considered and mitigated.
To obtain cells for DNA extraction or to examine endogenous levels of specific proteins, cells are usually collected through centrifugation. After collecting the cell pellets, do not discard the supernatant into the sink. Instead, transfer the supernatant into another container and autoclave it or treat it with bleach to ensure that all remaining cells in the supernatant are killed. Then, you can discard the treated supernatant into the sink. Likewise, treat the used tissue culture plates with bleach prior to disposing of them in the biohazard bin.
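To make the GC-content consideration in this section concrete, here is a minimal sketch that computes the GC fraction of a sequence and of sliding windows along it, which is one simple way to spot regions that may amplify poorly. The function names, window size, and step are illustrative assumptions, not part of any kit or pipeline; ambiguous bases such as N are simply ignored.

```python
def gc_fraction(seq):
    """Fraction of G/C among the unambiguous bases (A, C, G, T) in seq."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T bases")
    return sum(base in "GC" for base in counted) / len(counted)


def windowed_gc(seq, window=100, step=50):
    """Return (start, GC fraction) for each full window along seq."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Toy sequence: a GC-rich stretch followed by an AT-rich stretch.
    toy = "GCGCGGCC" * 5 + "ATATTAAT" * 5
    for start, gc in windowed_gc(toy, window=20, step=20):
        print(start, round(gc, 2))
```

Windows with very high or very low GC fractions are candidates for uneven coverage after PCR, so they are worth checking against your coverage data.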
Optimize the fragmentation conditions for your sample
The duration of fragmentation is critical – fragmenting for too long or for too short a time will both lead to reduced sequencing coverage. This applies to fragmentation by enzyme digestion, sonication, or acoustic shearing. Some samples are inherently unstable and prone to degradation, such as RNA viruses and mRNA. For such samples, you are already starting the process with “pre-digested” material and you need to take particular care during library preparation.
Prevent cross-contamination of samples
When multiplexing several samples for sequencing in a single run, it is essential that you do not cross-contaminate your samples, as this will invalidate your data. Change your gloves frequently, especially when you notice any kind of spill; always change tips between pipetting steps; avoid touching the pipette shaft against the insides of tubes; spin tubes down before opening; and use filtered tips to prevent aspiration of samples into the pipette. All basic, yet crucial!
Keep other best laboratory practices in mind
All sample solutions should be stored appropriately depending on the type of sample and the length of storage (4°C for short-term storage of DNA, −20°C for short-term storage of RNA or long-term storage of DNA, and −80°C for long-term storage of DNA and RNA). DNA should be stored in a slightly basic buffer, while RNA should be stored in RNase-free water. Aliquot your sample solutions to avoid damage from repeated freeze-thawing. Use tubes made of material that binds minimal amounts of DNA and RNA to prevent sample loss. All work surfaces should be wiped down with an appropriate product before use, and particular care should be taken when working with RNA to prevent RNase digestion. Designated work areas and laminar flow hoods may be necessary to prevent sample contamination. Mix reagents thoroughly before use (be gentle with enzymes though), and work on ice throughout. Pipette carefully to avoid volume errors – where possible, use master mixes as this reduces the effect of pipetting error.
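The master mix advice above comes down to simple arithmetic: scale each per-reaction volume by the number of reactions plus a small overage to cover pipetting losses. The sketch below shows that calculation; the function name, the 10% overage, and the reagent volumes are hypothetical examples, so use the volumes from your own protocol.

```python
def master_mix(per_reaction_ul, n_reactions, overage=0.1):
    """Scale per-reaction volumes (uL) to a master mix.

    overage is the extra fraction prepared to cover pipetting losses,
    e.g. 0.1 prepares enough for 10% more reactions than needed.
    """
    if n_reactions < 1:
        raise ValueError("n_reactions must be at least 1")
    if overage < 0:
        raise ValueError("overage cannot be negative")
    scale = n_reactions * (1 + overage)
    return {name: round(vol * scale, 2) for name, vol in per_reaction_ul.items()}


if __name__ == "__main__":
    # Placeholder recipe, for illustration only.
    recipe = {"ligation buffer": 5.0, "ligase": 1.0, "water": 4.0}
    for reagent, volume in master_mix(recipe, n_reactions=8).items():
        print(f"{reagent}: {volume} uL")
```

Pipetting one larger volume per reagent, rather than many small ones, is what reduces the relative error per reaction.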
Accurately quantify your library output
After all the work required to prepare a good library, it would be a shame to then quantify it inaccurately for input into the next step of the sequencing pipeline. If you overestimate your library concentration, you will load too little, which reduces coverage. If you underestimate your library concentration, you will load too much, which can lead to various issues depending on the platform. When multiplexing samples, it is also important to correctly normalize the different libraries to be pooled to ensure similar read distribution for each sample.
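For normalization, a mass concentration usually has to be converted to a molar one before libraries can be pooled in equal amounts. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA, then computes how much library and diluent to combine to reach a target molarity. The function names and example values are illustrative assumptions, and the mean fragment length would normally come from your own fragment-size QC.

```python
def library_nM(conc_ng_per_ul, mean_fragment_bp, g_per_mol_per_bp=660.0):
    """Convert a dsDNA concentration in ng/uL to nM.

    Uses the approximation of ~660 g/mol per base pair of dsDNA.
    """
    if mean_fragment_bp <= 0:
        raise ValueError("mean_fragment_bp must be positive")
    return conc_ng_per_ul * 1e6 / (g_per_mol_per_bp * mean_fragment_bp)


def dilution_volumes(stock_nM, target_nM, final_ul):
    """Return (library uL, diluent uL) to reach target_nM in final_ul."""
    if target_nM > stock_nM:
        raise ValueError("target concentration exceeds the stock concentration")
    library_ul = final_ul * target_nM / stock_nM
    return library_ul, final_ul - library_ul


if __name__ == "__main__":
    # Hypothetical libraries: (ng/uL, mean fragment length in bp).
    libraries = {"sample_A": (6.6, 500), "sample_B": (3.3, 400)}
    for name, (conc, size) in libraries.items():
        molarity = library_nM(conc, size)
        lib_ul, diluent_ul = dilution_volumes(molarity, target_nM=4.0, final_ul=10.0)
        print(f"{name}: {molarity:.1f} nM, mix {lib_ul:.2f} uL library "
              f"with {diluent_ul:.2f} uL diluent")
```

Once every library is at the same molarity, combining equal volumes gives an approximately equimolar pool, which is what supports a similar read distribution per sample.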
Perform QC throughout library construction
Wherever possible, perform all the quality control (QC) procedures recommended by the manufacturer of your library kit. Devise and optimize your own QC procedures if necessary, particularly for key steps such as sample fragmentation and adapter ligation.