Raising the Standards of Data Integrity in Modern Research
Blog May 29, 2018 | by Ruairi J Mackenzie, Science Writer for Technology Networks
As the research landscape changes, the standards of data integrity that guide research must also change. New data tools, systems and formats provide pitfalls and opportunities for advancing the cause of data integrity. In this blog, we discuss integrity in modern research with John Sadler, Vice President and General Manager of Agilent’s Software & Informatics Division.
Ruairi Mackenzie (RM): There is an increasing awareness and enforcement of data integrity law in biomedical research. What has driven this vigilance?
John Sadler (JS): While the research realm has often placed less focus on data integrity, recent high-profile cases of research misconduct, undetected error, and irreproducible data have sparked vigilance on many levels. These issues have even been covered in mainstream media outlets, bringing researchers and their grantors under scrutiny.
This public backlash against government-sponsored research, particularly in the US, has triggered greater vigilance. Researchers are responding by adopting more data integrity tools. These tools are similar to those used much later in the pharmaceutical and biopharmaceutical development process. For example, we are seeing an increase in the use of electronic laboratory notebooks and laboratory information management systems in research. These tools enable the capture of data and documentation of the laboratory workflow as it is happening. To enable much greater auditing of the data in question, many of these electronic resources also include the electronic controls used by laboratories that operate under 21 CFR Part 11.
RM: 95% of data integrity issues are not fraudulent and are simply down to poor data management. Nevertheless, are accusations of fraud a concern for companies with poor data integrity standards? What steps can companies take to avoid this risk?
JS: In many cases, issues with data integrity start with a poorly documented process or series of experiments. The most important step in guaranteeing data integrity is having a detailed record of who, what, when and why. Additionally, a lack of documentation means that it is difficult, if not impossible, to reproduce exactly what was executed in the lab. As a result, the findings may become irreproducible.
As mentioned before, there are many electronic tools that facilitate the collection, consolidation and auditing of data. A new breed of workflow-specific laboratory information management systems (LIMS) is making system setup, management and operation much simpler, enabling the laboratory to focus on generating great science. These tools track inventory, samples, procedures, and analytical results in a single location, enabling much easier reconstruction of the work. They also track lab-wide metadata, adding another level of evidence for determining the source of any issues.
Implementing these digital systems to automatically capture experimental data and metadata across the workflow is a great place to start. Beyond that, there are tools for scientific data management that can further document a detailed record of how the data was collected, handled, and archived.
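To make the "who, what, when and why" record concrete, here is a minimal Python sketch of what one captured entry might look like. It is an illustration only, not how any particular ELN or LIMS stores data: the `ExperimentRecord` class, the `make_record` helper, and the sample values are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import hashlib
import json


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class ExperimentRecord:
    who: str          # person or instrument that performed the step
    what: str         # the action that was carried out
    when: str         # ISO 8601 timestamp in UTC
    why: str          # purpose of the step
    data_sha256: str  # fingerprint of the raw data captured with the step


def make_record(who, what, why, raw_data: bytes, when=None) -> ExperimentRecord:
    """Build a record, stamping it with the current UTC time by default."""
    when = when or datetime.now(timezone.utc)
    return ExperimentRecord(
        who=who,
        what=what,
        when=when.isoformat(),
        why=why,
        data_sha256=hashlib.sha256(raw_data).hexdigest(),
    )


# Hypothetical example values
record = make_record(
    who="analyst_01",
    what="HPLC run, sample S-001",
    why="stability study, week 4",
    raw_data=b"raw detector output",
)
print(json.dumps(asdict(record), indent=2))
```

Storing a hash of the raw data alongside the metadata means a later reviewer can check that the archived file is still the one the record describes.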
RM: Increasingly, research is an automated process. Do you think this will improve the standards of integrity in the biomedical field?
JS: With the majority of issues resulting from poor data management and documentation, automation is most likely to have a positive impact on data integrity in research.
To some degree, the success of a data integrity tool will depend on the implementation. However, automating a process requires that its procedures be documented electronically. Most automated systems have controls to track the who, what, where, when and how. These controls have permission processes in place that prevent users from changing or deleting information in the system. Automated systems should reduce manual interactions, in turn reducing the opportunity for human error or fraud and leaving a more detailed record of the experiments, resulting in better data integrity.
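One way to picture a control that guards against changed or deleted information is an append-only audit trail in which every entry carries a hash of the entry before it. The sketch below is a simplified illustration under that assumption. It is not a 21 CFR Part 11 implementation and not taken from any vendor's product, and the `AuditTrail` class and its methods are hypothetical. Rather than blocking edits outright, it makes them detectable.

```python
import hashlib
import json
from datetime import datetime, timezone


class AuditTrail:
    """Append-only log; each entry is chained to the previous one by hash."""

    GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(body):
        # Serialize with sorted keys so the hash is reproducible
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def append(self, who, what, when=None):
        when = when or datetime.now(timezone.utc).isoformat()
        prev_hash = self._entries[-1]["hash"] if self._entries else self.GENESIS
        body = {"who": who, "what": what, "when": when, "prev_hash": prev_hash}
        entry = dict(body, hash=self._digest(body))
        self._entries.append(entry)
        return dict(entry)  # return a copy so callers cannot edit the log

    def verify(self):
        """Return True only if no entry was altered, removed, or reordered."""
        prev_hash = self.GENESIS
        for entry in self._entries:
            body = {k: entry[k] for k in ("who", "what", "when", "prev_hash")}
            if entry["prev_hash"] != prev_hash or entry["hash"] != self._digest(body):
                return False
            prev_hash = entry["hash"]
        return True


# Hypothetical example values
trail = AuditTrail()
trail.append("analyst_01", "created sample S-001")
trail.append("instrument_07", "acquired chromatogram for S-001")
print(trail.verify())
```

Because each hash depends on the one before it, editing or deleting an earlier entry breaks the chain for everything that follows, which is what makes the change visible at audit time.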
RM: Increasing the portability and security of data by using modern data formats and cloud technologies is also increasingly important in modern research. Do you think cloud analytics and storage technologies help or hinder the drive for data integrity?
JS: Standardization of data reduces the need to have specialized vendor-specific data viewers, reducing the complexity of reviewing historical data.
Agilent is a member of the Allotrope Foundation, “an international consortium of pharmaceutical, biopharmaceutical, and other scientific research-intensive industries” working together to develop a common data format for the laboratory. We believe this will enable laboratories to operate more efficiently and better understand their data, and we are incorporating Allotrope data file support into our informatics systems.
Common data formats, such as the Allotrope format, enable much stronger searching of data repositories. Cloud and analytics are very powerful technologies upon which we can build great systems for capture and curation of experimental information and data. These technologies are enabling labs to rethink the way they do their work. The computing power of the cloud makes analytics and visualization of very large data sets possible.
In the future, combining these modern tools with common data formats will make it much easier to spot anomalies and abnormalities, rapidly highlighting possible data integrity issues. This will not only help the drive for data integrity, but will also enable us to have the detailed context or metadata needed to re-evaluate historical data with new perspectives, ultimately making research more reliable and efficient.
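As a rough illustration of what spotting anomalies can mean once results sit in a common, searchable format, the sketch below flags readings that lie far from the rest using a modified z-score based on the median. This is one simple, widely used heuristic chosen for the example, not a method described in the interview. The `flag_anomalies` function, the 3.5 cutoff, and the sample readings are assumptions.

```python
from statistics import median


def flag_anomalies(values, threshold=3.5):
    """Return the indexes of values whose modified z-score exceeds threshold.

    The score uses the median and the median absolute deviation (MAD),
    so a single extreme reading does not distort the baseline.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread to measure against, so nothing is flagged
    # 0.6745 scales the MAD to be comparable with a standard deviation
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical replicate measurements; the sixth one looks out of line
readings = [10.1, 9.9, 10.0, 10.2, 9.8, 14.7, 10.0]
print(flag_anomalies(readings))
```

A flagged value is only a prompt for review. Whether it reflects an instrument fault, a transcription error, or a genuine result still depends on the documented context around it.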
John Sadler was speaking to Ruairi J Mackenzie, Science Writer for Technology Networks