Sample reliability and data analysis: Overcoming the challenge
Reliable samples are vital for generating scientifically valid, high quality data. Without rigorous scrutiny of samples, it is impossible for a researcher to truly understand what is behind the results their experimentation produces. Alongside sample reliability, data analysis is seen as a major challenge in modern biology, particularly as the amount of data produced by researchers continues to increase. If data analysis is allowed to become a bottleneck, any amount of high quality data can go to waste, leaving the patterns that lie within it undiscovered.
Fortunately for researchers, a wide range of solutions are emerging on the market which allow them to confirm the quality of their samples and analyze the data they produce in an efficient, meaningful way.
To discuss the issues of sample reliability and data analysis and how Agilent is helping scientists address them, we spoke to Herman Verrelst, VP and general manager, genomics and clinical applications division for Agilent Technologies.
AB: We have read that many researchers are unaware of the problem of sample reliability, leading to low quality data. How is Agilent helping researchers overcome this?
Herman Verrelst (HV): Knowing the quality/integrity and quantity of starting material is critical to the success of an experiment as it informs the researcher as to whether protocol modifications are needed or if the sample is unsuitable for downstream processing. In addition, understanding the quality and quantity of each sample as it progresses through key protocol steps provides important information about the success of sample preparation. Agilent is a leader in providing tools for the quality assessment and quantitation of RNA, DNA and protein samples, which include:
The Agilent 2100 Bioanalyzer System – provides high quality digital data on sizing, quantitation and quality control of DNA, RNA, proteins and cells on a single platform. The RNA Integrity Number (RIN) is the industry standard for assessment of RNA integrity, while the High Sensitivity DNA assay was developed in response to the needs of researchers who require accurate quality control (QC) and quantitation of low amounts of DNA (down to 5 pg/µl), such as those performing next generation sequencing (NGS).
The 4200 TapeStation System – offers automated processing of up to 96 DNA or RNA samples with unattended walkaway operation for quick and reliable sample quality control within any NGS, microarray (aCGH) or quantitative PCR workflow. The Genomic DNA (gDNA) ScreenTape assay, run on the Agilent 4200 TapeStation system, separates and analyzes DNA samples ranging up to more than 60,000 base pairs. The assay provides a numeric measurement of gDNA quality, the DNA Integrity Number (DIN), which standardizes gDNA QC across multiple samples. The gDNA assay is ideal for the high-throughput screening of samples such as gDNA derived from formalin-fixed paraffin-embedded (FFPE) tissue.
The NGS FFPE QC Kit – FFPE tissue represents a valuable sample source for molecular cancer research. These samples provide a contextual snapshot of the tissue at a specific time point and stage of disease but are highly challenging to analyze, especially using next generation sequencing. DNA derived from FFPE tissue is often highly fragmented, cross-linked with protein and high in single-stranded DNA. This makes adaptor ligation and amplification difficult, and both steps are critical for successful preparation of sequencing libraries.
The Agilent NGS FFPE QC kit is a qPCR-based assay that enables functional quality assessment of input DNA prior to preparation of next generation sequencing libraries. The kit assesses the integrity of the DNA and accurately quantifies the amplifiable template going into library preparation. The resulting sample quality score can also be used to optimize the SureSelectXT NGS target enrichment workflows. When used in conjunction with the Agilent AriaMx qPCR instrument, this assay provides accurate and rapid assessment of FFPE sample quality.
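As a rough illustration of how quality and quantity readings might feed the decision described above (proceed, modify the protocol, or reject the sample), the following is a minimal Python sketch. It is not an Agilent tool or API: the `SampleQC` record, the `triage` function and every cut-off value are hypothetical and chosen only for illustration. A lab would substitute thresholds validated for its own assay and sample type.

```python
from dataclasses import dataclass

# Hypothetical cut-offs for illustration only. They are NOT vendor
# recommendations; each lab would validate its own values.
MIN_INTEGRITY_PROCEED = 7.0   # at or above this, run the standard protocol
MIN_INTEGRITY_MODIFY = 4.0    # between the two cut-offs, adapt the protocol
MIN_CONCENTRATION_NG_PER_UL = 1.0


@dataclass
class SampleQC:
    """QC readings for one sample (hypothetical record layout)."""
    sample_id: str
    integrity: float                # an integrity score such as RIN or DIN
    concentration_ng_per_ul: float  # measured concentration


def triage(sample: SampleQC) -> str:
    """Return 'proceed', 'modify_protocol' or 'unsuitable' for a sample."""
    too_dilute = sample.concentration_ng_per_ul < MIN_CONCENTRATION_NG_PER_UL
    too_degraded = sample.integrity < MIN_INTEGRITY_MODIFY
    if too_dilute or too_degraded:
        return "unsuitable"
    if sample.integrity < MIN_INTEGRITY_PROCEED:
        return "modify_protocol"
    return "proceed"


if __name__ == "__main__":
    # Made-up readings, used only to exercise the three outcomes.
    batch = [
        SampleQC("S1", integrity=8.5, concentration_ng_per_ul=20.0),
        SampleQC("S2", integrity=5.0, concentration_ng_per_ul=20.0),
        SampleQC("S3", integrity=9.0, concentration_ng_per_ul=0.2),
    ]
    for s in batch:
        print(s.sample_id, triage(s))
```

Keeping the cut-offs as named constants makes the acceptance criteria explicit and easy to audit, which matters when the same rule is applied across many samples.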
AB: The bottleneck of data analysis is always cited as a significant hurdle. How do Agilent’s informatics products help, and how much further on are we than 2-3 years ago?
HV: Agilent has developed a complete portfolio of complementary software solutions that enables the rapid analysis and interpretation of NGS and microarray data without the need for complex and costly bioinformatics infrastructure.
For NGS data analysis, the Agilent SureCall software addresses the needs of clinical researchers from analysis through to reporting of their target enrichment NGS data, eliminating data analysis as a bottleneck. Analysis in SureCall begins with raw reads from Illumina HiSeq/MiSeq, or aligned reads from Ion Torrent, sequencing of genomic DNA enriched with Agilent’s SureSelect, OneSeq, or HaloPlex Target Enrichment. A simple 3-step workflow combines best-in-class open source algorithms with an in-house developed variant caller optimized for data obtained from cancer and constitutional samples alike. SureCall supports a variety of analyses, including Single Sample, Tumor-Normal pair, Trio, and OneSeq CNV and Mutation Analysis. The SureCall software is available at no cost to users of Agilent’s NGS target enrichment products.
For CGH microarray analysis, the Agilent CytoGenomics software addresses the needs of cytogeneticists for analysis and triage of their CGH and CGH+SNP data from both constitutional and hematological cancer samples. CytoGenomics provides a streamlined workflow that is automation enabled for data upload and analysis, and contains optimized algorithms for accurate detection of copy-number changes and copy-neutral variations, including LOH and UPD. Clinical researchers can suppress, classify, edit, and annotate aberrations and generate reports. The software was designed specifically for cytogenetic research, to put data into biological context. Like SureCall, CytoGenomics is available at no cost to users of Agilent’s CGH microarray products.
Whilst significant progress has been made in developing solutions for handling large volumes of data both in terms of storage and computation (particularly for NGS) and developing better algorithms for identification of single nucleotide variants, insertions/deletions, amplifications/duplications (rearrangements remain challenging), the next challenge is functional interpretation of the genomic data.
To address this, Agilent acquired Cartagenia (in May 2015), a provider of software solutions for variant assessment and reporting of clinical genomics data from next-generation sequencing and microarrays. Uniquely geared to routine clinical labs, Cartagenia's solutions are FDA-registered as exempt Class I Medical Devices in the U.S. and as Class I Medical Devices in Europe. The Cartagenia Bench platform enables technicians, lab directors and clinicians to visualize, assess and report clinical genetics data in the context of patient information.
With Cartagenia Bench, labs can build an internal knowledge base, build variant assessment SOPs, automate report drafting, and access a wide range of community-validated, private and premium content resources, whether for oncology or inherited disease.
Cartagenia's platform also provides deep support for consortia of collaborating labs. Data-sharing has become an essential requirement for the community, and through private and public consortia, users can connect and pool their knowledge on rare diseases and actionable findings.
AB: With regard to overcoming the issues associated with genetics research, how much is this about providing the right tools and how much is it about improving education?
HV: There are fundamental challenges associated with human genetics and cancer genetics research, such as sample availability and quality, tumour heterogeneity, cost efficiency, time-to-results, and data analysis and interpretation. Clinical researchers require high quality, precision genomics solutions that enable them to overcome these challenges and obtain accurate and meaningful results. Agilent is committed to providing such solutions, as evidenced by our recent product introductions: the GenetiSure PreScreen microarray for analysis of single cells in embryos by comparative genomic hybridization (CGH); the OneSeq target enrichment assay, which detects genome-wide CNVs, LOH and mutations in a single reaction; the HaloPlexHS target enrichment system, which identifies very low allele frequency variants in cancer samples; and Cartagenia Bench, a clinical-grade data interpretation support software (described in more detail in the answer to the previous question).
To realize the promise of precision medicine, it is important to couple the availability of high sensitivity, high accuracy genomics solutions with education and collaboration. Agilent partners closely with our customers to educate others on a variety of topics including:
- Sharing best practices and experiences in overcoming common challenges
- The utility of combining multiple technologies like microarrays, NGS and FISH to orthogonally cross-validate identified disease-associated variants and to obtain a more comprehensive view of disease
- The potential utility of molecular technologies for cancer diagnostics and how they can be implemented through easy-to-use workflow solutions
AB: Do you believe that we are now nearing a stage where major breakthroughs will become commonplace?
HV: The post-genomic era is an exciting time and advances in genetics research have been greatly accelerated by next generation sequencing. For example, whole exome sequencing using Agilent’s SureSelect Human All Exon kits has helped genetic researchers identify the causal genes for over 50 Mendelian disorders and many more complex disorders since the kits were launched 5 years ago.
We expect to see more disease-associated biomarkers and gene targets being robustly identified using precision genomics technologies like microarrays and NGS and their application in tests and treatments for personalizing care.
Herman Verrelst was speaking to Ashley Board, Managing Editor at Technology Networks.