Using Data Lineage and Traceability to Optimize Publishing Potential
Frequent, high-impact publication can consolidate a researcher’s reputation and open doors to collaboration and grant award success.
Contents
The importance of keeping FAIR data
Tools to visualize data experimental processes
- What is an ELN?
- ELNs vs LIMS
Build data tracing workflows with collaboration in mind
Data lineage, traceability and good practice guidelines
- What is good laboratory practice?
- How can ELNs support GLP compliance?
Want to get a head start when it comes to publishing in high-impact journals? Frequent, high-impact publication is one of the most powerful ways a researcher can showcase their work. It can consolidate a researcher’s reputation and open doors to collaboration and grant award success. Hence, strengthening the chances of high-impact publication can help both early-career and established researchers succeed in their chosen field.
Accurate and accessible data is key to upholding publication standards in scientific journals. A number of recent papers have faced retraction after poor data reporting raised suspicions of misconduct.1
Whilst retraction is not always indicative of research fraud, the stigma associated with publication recall can throw doubt upon a researcher’s reputation. To avoid such a consequence, establishing best-practice data recording workflows is critical, and can optimize the publishing potential of original research and support success.
This article outlines the key principles of data lineage and traceability – two important factors in establishing good data recording practices. It explores how the publication potential of original research can be optimized using electronic lab notebooks for data management and sharing.
What is data lineage?
Data lineage vs traceability
Whilst lineage describes the evolution of a dataset, traceability ensures that information trails are both logical and robust. Like a chain of custody for evidence in a police investigation, data traceability ensures that information can be tracked through stages of research. Good data traceability can include clear approval processes, electronic signatures on data entries, or time stamps on data input. Such practices ensure that information is used ethically, effectively, and in compliance with governmental guidelines or industry standards.3
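To make the chain-of-custody idea concrete, the sketch below shows one way time-stamped, tamper-evident entries could be kept. It is an illustration only, not how any particular ELN stores its audit trail; the class name AuditTrail, its fields, and the example users are all hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def _digest(entry: dict) -> str:
    # Hash a canonical JSON form so the same entry always yields the same digest.
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class AuditTrail:
    """Hypothetical append-only log: each entry stores the hash of the previous one."""

    def __init__(self):
        self.entries = []

    def record(self, user, action, timestamp=None):
        entry = {
            "user": user,
            "action": action,
            # Time stamp on data input (UTC, ISO 8601) unless one is supplied.
            "timestamp": timestamp or datetime.now(timezone.utc).isoformat(),
            "prev_hash": self.entries[-1]["hash"] if self.entries else None,
        }
        entry["hash"] = _digest(entry)
        self.entries.append(entry)
        return entry

    def verify(self):
        # Recompute every hash and check that each entry links to its predecessor.
        prev_hash = None
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash or _digest(body) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True


trail = AuditTrail()
trail.record("alice", "entered raw absorbance readings")
trail.record("bob", "approved entry")
print(trail.verify())
```

Because each entry includes the hash of the one before it, editing an earlier record after the fact causes verify() to fail, which is the property a traceable information trail relies on.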
Figure 1: The roles of data lineage (blue) and data traceability (yellow) in research publishing, collaboration, and grant award success. Credit: SciNote.
Together, data lineage and traceability ensure the integrity and reproducibility of publishable results (Figure 1). Not only do they clarify how experimental results were obtained, but they also support the validation of findings and increase a researcher’s academic credibility.
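As a rough illustration of lineage, the sketch below records which datasets each file was derived from and walks back to the raw sources. The file names and the trace_to_raw helper are hypothetical, and the sketch assumes the lineage map contains no cycles.

```python
# Hypothetical lineage map: each dataset points to the datasets it was derived from.
LINEAGE = {
    "figure_2_summary.csv": ["normalized_counts.csv"],
    "normalized_counts.csv": ["raw_counts_plate1.csv", "raw_counts_plate2.csv"],
    "raw_counts_plate1.csv": [],
    "raw_counts_plate2.csv": [],
}


def trace_to_raw(dataset, lineage):
    """Return the sorted raw sources that a dataset ultimately derives from."""
    if dataset not in lineage:
        # A missing record is exactly the situation in which lineage cannot be traced.
        raise KeyError(f"no lineage recorded for {dataset!r}")
    parents = lineage[dataset]
    if not parents:
        return [dataset]  # no parents: this is a raw file
    sources = set()
    for parent in parents:
        sources.update(trace_to_raw(parent, lineage))
    return sorted(sources)


print(trace_to_raw("figure_2_summary.csv", LINEAGE))
```

A published figure whose trace ends in a KeyError, or in files nobody can open, is one whose results cannot be independently validated.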
Two papers recently published in high-impact journals faced retraction after issues arose regarding access to original data.1 Both papers shared common data sources, though the authors of neither paper had access to raw files. When the data lineage could not be traced, suspicions of fraudulent misconduct were supported, and authors’ careers were significantly negatively impacted.4
Scientific misconduct doesn’t always mean data fraud – it can also include the failure to report important experimental details.1 Without adequate data lineage and traceability, valuable information can be lost when details only become relevant later down the line. Major funding bodies across both North America and Europe have independently released new policies to reflect this, focusing on good data management and sharing practices.5
Conscientious data lineage and data traceability can help to meet these guidelines, secure funding and maximize research efficiency by reducing the need for undue experiment repeats.
The importance of keeping FAIR data
The FAIR guiding principles emphasize that all data and metadata related to scientific publications must be findable, accessible, interoperable and reusable. Achieving FAIR data is a prerequisite for proper data management. Ensuring that a research project’s data is protected, usable, and trusted is also key for achieving FAIR data. Different tools can improve data management by centralizing research-related data in one place, where it is annotated, traceable, searchable and easy to visualize and understand.6
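One simple way to put the four principles into practice is to check each dataset's metadata record before sharing it. The sketch below is a minimal example under stated assumptions: the field names and their grouping under each principle are hypothetical choices for illustration, not an official FAIR checklist.

```python
# Hypothetical metadata fields, loosely grouped by FAIR principle.
REQUIRED_FIELDS = {
    "findable": ["identifier", "title", "keywords"],
    "accessible": ["access_url"],
    "interoperable": ["file_format"],
    "reusable": ["license", "provenance"],
}


def missing_fair_fields(record: dict) -> dict:
    """Return, per principle, the metadata fields that are absent or empty."""
    gaps = {}
    for principle, fields in REQUIRED_FIELDS.items():
        absent = [name for name in fields if not record.get(name)]
        if absent:
            gaps[principle] = absent
    return gaps


# Example record with placeholder values; the provenance field is left out on purpose.
record = {
    "identifier": "doi:10.0000/example",
    "title": "Plate reader absorbance, batch 3",
    "keywords": ["absorbance", "assay"],
    "access_url": "https://example.org/datasets/batch3",
    "file_format": "text/csv",
    "license": "CC-BY-4.0",
}
print(missing_fair_fields(record))
```

A check like this does not make data FAIR by itself, but it flags records that would be hard to find or reuse later.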
Tools to visualize data experimental processes
Visualizing experimental processes is one of the easiest ways to ensure that appropriate integrity measures are in place. By mapping research tasks and processes in a visual format, scientists can find answers quickly and establish important validation steps (e.g., number of replicates, controls, quality control, etc.). Results associated with each step can be captured, ensuring that critical information is never missed. Hence, tools that support visualization can help researchers make the right choices for their data and clarify why key research decisions have been made.
Electronic lab notebooks (ELNs) can facilitate the visualization of experimental steps as well as lineage and traceability pipelines. For this reason, ELNs present an exciting opportunity for researchers looking to optimize their publication potential, prepare for collaborations, and increase grant award success.
What is an ELN?
ELNs are computer software packages designed to replace traditional, paper lab notebooks. Like paper notebooks, ELNs are used to record protocol information, raw data and experimental observations. However, many packages integrate additional lab management, protocol template, inventory management, and e-signature features.
ELNs vs LIMS
Laboratory information management systems (LIMS) also offer opportunities to record data traceability information; however, LIMS are typically designed to record structured data. As ELNs allow users to record unstructured data, such as observations and analysis, they offer significant benefits over LIMS for many academic labs.7
Some ELNs and LIMS feature attributes such as audit trails, e-signatures, and time stamps; these features can ensure robust traceability and data integrity. The chart below outlines potential laboratory needs to be considered when choosing an ELN or LIMS. Note that this is generalized; different ELN and LIMS systems might offer additional functionalities that are not included in this chart.
Table 1: Key differences between an ELN and a LIMS. Credit: SciNote.
Build data tracing workflows with collaboration in mind
The SciNote ELN offers an affordable data documentation solution for researchers looking to collaborate at any stage in their careers. The unique project/experiment/task structure and data management capabilities of SciNote can support labs as they grow. Whether data is being prepared for presentation, publication or technology transfer, SciNote allows researchers to design workflows with knowledge exchange in mind.
With SciNote, data is automatically backed up in the cloud, eliminating the risk of physical damage and ensuring that information is always available to share. Whilst fire and water damage is thankfully rare, it does pose a significant risk within the lab environment.9
SciNote is now also integrated with protocols.io – a secure collaborative research platform designed for protocol sharing and method optimization. Scientists can directly access protocols.io through SciNote without needing a protocols.io account. This feature makes searching for and importing external protocols quick and easy. External protocols can be saved, with appropriate accreditations, directly in a researcher’s own ELN, saving time and ensuring traceability.
Data lineage, traceability and good practice guidelines
What is good laboratory practice?
Good practice guidelines lay out the processes and procedures which, if followed, are known to produce the best and most ethical outcome.
Good laboratory practice (GLP) refers to the standards of non-clinical safety and regulatory studies for the development of products for human health. These principles ensure that any data generated, handled and reported during non-clinical safety studies is of high quality and integrity.11
How can ELNs support GLP compliance?
In the United States, Title 21 Code of Federal Regulations (CFR) part 11 defines the criteria for ensuring that electronic signatures and electronic records are reliable and trustworthy; only authorized personnel are granted access to electronic data.12,13 Food and Drug Administration (FDA) requirements are also shifting towards electronic data submission to streamline clinical data review.
Policies on good data management and sharing can always be subject to change. For example, the National Institutes of Health (NIH) recently issued a new data management and sharing policy to update what’s currently in place.14 Using ELNs to organize data lineage and traceability allows researchers to quickly respond to changes in traceability guidelines and data sharing policies, and safeguard data integrity. SciNote offers electronic signatures, audit trails, time stamps, user roles and permissions to support GLP compliance requirements mandated by 21 CFR part 11, while keeping the system flexible and easy to use.
Increase productivity and save time
Using SciNote can significantly improve productivity; individuals save approximately nine hours per week while achieving the same amount of work (Figure 2).15 Project reports can be generated automatically, making the process 90 % faster than manual alternatives.
Additionally, using templates to visualize projects and dependencies makes planning and scheduling up to 80 % faster.16 SciNote also offers additional lab management tools, such as inventory management, that can further reduce the hours spent on mundane tasks and improve productivity.
Figure 2: Comparison of time spent for weekly amount of work without and with SciNote (median/work category). Credit: SciNote.
Conclusion
Building robust data lineage and traceability workflows into scientific research saves time and increases productivity, while supporting GLP compliance. SciNote ensures easy data access and visualization to minimize wasted time on costly repeat experiments that delay publication.
By supporting best practices for data sharing and management, SciNote facilitates high-impact publication, collaboration and grant award success.
Learn more about how an ELN can support data management for publication and grant writing.