The Future of Health Data Management: Enabling a Trusted Research Environment

Increased access to health research data allows scientists and researchers to uncover findings about diseases and treatments that were previously out of reach. This data, based on genomic markers, is key to developing medicines and diagnosing patients.

 

In a recent study, researchers at Stanford University set a world record for diagnosing a patient with a rare disease: five hours and two minutes. By contrast, diagnosing a rare disease can typically take up to four years, and children typically wait six to eight years for a diagnosis.

 

Shortening the timeline for diagnosis is clearly a critical factor in living a longer and healthier life.

 

The hurdle to speeding the pace of diagnosis is that health data is often held and accessed by a single group or organization (“silos,” in other words), and patient confidentiality makes data-sharing problematic. To overcome this hurdle, researchers and organizations are leaning into a relatively new method of health data management, by establishing trusted research environments (TREs).  

 

TRE is becoming a commonly used acronym in the science and research community. In general, a TRE is a centralized computing environment that holds data securely and lets users access it for analysis. Only approved researchers can access a TRE, and no data ever leaves it. Because the data stays put, the risk to patient confidentiality is reduced.
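To make the access model concrete, here is a minimal Python sketch of the idea that analyses travel to the data rather than the other way round. The class name ToyTRE, the user "alice" and the sample records are all hypothetical, and the scalar-only check is a deliberate simplification: a real TRE relies on far more than this to protect confidentiality.

```python
from statistics import mean
from typing import Callable, Dict, List, Set

Row = Dict[str, float]


class ToyTRE:
    """Hypothetical stand-in for a TRE: rows stay inside, summaries come out."""

    def __init__(self, records: List[Row], approved_users: Set[str]) -> None:
        self._records = records
        self._approved_users = approved_users

    def run(self, user: str, analysis: Callable[[List[Row]], float]) -> float:
        # Only approved researchers may run anything.
        if user not in self._approved_users:
            raise PermissionError(f"{user} is not an approved researcher")
        # The analysis sees copies, so it cannot alter the stored records.
        result = analysis([dict(row) for row in self._records])
        # Only a single number may leave; lists or rows are refused.
        if isinstance(result, bool) or not isinstance(result, (int, float)):
            raise ValueError("only scalar summaries may leave the environment")
        return float(result)


tre = ToyTRE(
    records=[{"age": 34.0}, {"age": 51.0}, {"age": 47.0}],
    approved_users={"alice"},
)
mean_age = tre.run("alice", lambda rows: mean(row["age"] for row in rows))
print(mean_age)  # 44.0
```

The researcher never receives the rows themselves, only the result of the function that ran next to them.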

 

This is a very different approach from the traditional way researchers access data. Historically, researchers have had to download an entire dataset onto their own computer in order to analyze it. Transferring and releasing data in this way increases the risk of security problems, even when the individuals in it have been de-identified. This method also takes a considerable amount of time, which could be better spent analyzing clinical datasets.

 

Why the shift?

 

The COVID-19 pandemic revealed that the availability and standardization of patient clinical data were key to learning more about the virus and how to target it head-on. Researchers all over the world were running experiments, analyzing their findings, collecting clinical datasets and reporting their outcomes.

 

During this time, organizations became more aware of the pressing need for a new way to manage health data. For example, the UK Health Security Agency began collecting whole-genome sequencing data for COVID-infected patients in 2020. The agency recently passed one million genomes in its database, which has led to many discoveries about the virus and its variants. These findings were then shared with other countries to benefit the world.

 

Global impact of limited access

 

TREs are becoming the architectural backbone for health data in many research organizations. While this is a step in the right direction, many TREs still can’t communicate with TREs in other organizations, or even in other departments within the same organization.

 

For example, some universities have several research departments, each with its own TRE. Unfortunately, it is common for TREs that are only a wall apart within an organization to be unable to “speak” to one another. Without this ability, it is impossible to take full advantage of a TRE.

 

As the genomic sector continues to grow, TREs that can communicate will allow researchers and scientists to collaborate effectively on diagnosing and overcoming life-threatening diseases by breaking down the “silos” of health data.

 

That doesn’t mean moving data. Life sciences datasets are too large to move efficiently and, to complicate matters, many data security regulations forbid data from leaving an organization, state or nation. Consequently, it is estimated that as much as 80–90 percent of important datasets are simply unavailable to research.

 

What is required is a shift from data centralization in silos to a means for allowing data to be shared while in situ with the organizations that gathered it in the first place. No alternative is as promising for research. 
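One way to picture analysis on data that stays in situ is a federated calculation, in which each organization computes a small summary locally and only those summaries are combined. The Python sketch below is illustrative only: the function names, the two sites and their values are assumptions, and the article does not prescribe this particular technique.

```python
from typing import List, Tuple


def local_summary(values: List[float]) -> Tuple[float, int]:
    """Runs inside one organization: returns only a sum and a count."""
    return (sum(values), len(values))


def federated_mean(summaries: List[Tuple[float, int]]) -> float:
    """Combines per-site summaries without seeing any individual value."""
    total = sum(site_sum for site_sum, _ in summaries)
    count = sum(site_count for _, site_count in summaries)
    if count == 0:
        raise ValueError("no observations to average")
    return total / count


# Hypothetical measurements that never leave their home site.
site_a = [5.0, 7.0, 9.0]
site_b = [4.0, 6.0]

pooled_mean = federated_mean([local_summary(site_a), local_summary(site_b)])
print(pooled_mean)
```

The pooled result matches what a single combined dataset would give, yet each site shares only two numbers.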

 

What constitutes a trusted research environment?

 

Organizations need to consider several factors when they set out to develop a trusted research environment. The UK Health Data Research Alliance has applied the Five Safes framework to TREs. The framework comprises safe people, safe projects, safe settings, safe data and safe outputs. What follows is an overview of those components.

 

1. Safe people

 

Users need to be approved and hold the appropriate credentials to access the health data. They must not attempt to re-identify individuals, which would breach patient confidentiality, or give another party access through their credentials. Researchers and scientists must also be properly trained to use the TRE platform.

 

2. Safe projects 

 

Because TREs hold sensitive information, it is essential that the data is used only for relevant projects that benefit public health. To achieve this, TREs must have auditing in place to ensure compliance.

 

3. Safe settings

 

The cloud technology behind a TRE should never let data leave the database or let users export findings outside its controls. Researchers should be able to bring in their own algorithms for analysis, but any tools imported into the system must pass through an “airlock.” This feature allows tools to be scanned so that the security of the TRE is not compromised. Ensuring safe settings also means tracking user activity to make sure that researchers and their work are approved and appropriate.
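As a rough illustration of scanning a tool in the airlock, the Python sketch below parses a submitted script and flags imports of modules that could move data over a network or start other programs. The blocklist and the function name airlock_scan are assumptions made for this example; a static import check on its own would not be sufficient protection in a real TRE.

```python
import ast
from typing import List

# Hypothetical blocklist: modules that could send data out or launch programs.
BLOCKED_MODULES = {"socket", "subprocess", "urllib", "http", "ftplib", "smtplib"}


def airlock_scan(source: str) -> List[str]:
    """Return the blocked modules a script imports; an empty list means it passes."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            # Compare on the top-level package, e.g. "urllib" for "urllib.request".
            root = name.split(".")[0]
            if root in BLOCKED_MODULES:
                found.add(root)
    return sorted(found)


safe_tool = "import statistics\nprint(statistics.mean([1, 2, 3]))"
risky_tool = "import socket\nfrom urllib.request import urlopen"

print(airlock_scan(safe_tool))   # []
print(airlock_scan(risky_tool))  # ['socket', 'urllib']
```

The script is only parsed, never executed, so the scan itself cannot trigger whatever the tool would do.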

 

4. Safe data

 

Data within the TRE must be secure, with patients de-identified so that researchers cannot re-identify them. The data must also be cleaned and its quality verified, so that it is relevant to the approved project. Safe data can open up new research opportunities that will benefit the general public.
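Two common building blocks for de-identification are replacing direct identifiers with a keyed pseudonym and checking how small the smallest group of look-alike records is. The Python sketch below shows both under stated assumptions: the field names, the secret key and the sample records are hypothetical, and passing these checks does not by itself guarantee that re-identification is impossible.

```python
import hashlib
import hmac
from collections import Counter
from typing import Dict, List

# Hypothetical direct identifiers to strip from every record.
DIRECT_IDENTIFIERS = {"name", "patient_number"}


def pseudonymize(record: Dict[str, str], key: bytes) -> Dict[str, str]:
    """Drop direct identifiers and add a keyed, repeatable pseudonym."""
    digest = hmac.new(key, record["patient_number"].encode("utf-8"), hashlib.sha256)
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["pseudo_id"] = digest.hexdigest()[:16]
    return cleaned


def smallest_group(records: List[Dict[str, str]], quasi_identifiers: List[str]) -> int:
    """Size of the rarest combination of quasi-identifiers (the k in k-anonymity)."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0


raw = [
    {"name": "A. Example", "patient_number": "001", "age_band": "40-49", "region": "North"},
    {"name": "B. Example", "patient_number": "002", "age_band": "40-49", "region": "North"},
    {"name": "C. Example", "patient_number": "003", "age_band": "50-59", "region": "South"},
]
secret_key = b"demo-key-held-by-the-data-custodian"

safe_rows = [pseudonymize(r, secret_key) for r in raw]
k = smallest_group(safe_rows, ["age_band", "region"])
print(k)  # 1: one person is unique on age band and region
```

A result of 1 signals that at least one record is still unique on its quasi-identifiers and may need coarser bands before it is made available.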

 

5. Safe outputs

 

As mentioned under Safe settings, TREs must have barriers in place between the database and the researchers who access the data. These barriers (or “airlocks”) let the system track requests and transactions in both directions to ensure that everything is approved, secure and safe.
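One check an output airlock might apply is small-number suppression, where counts below a threshold are withheld because they could point to individual patients. The Python sketch below assumes a threshold of 5 and hypothetical variant labels; the article does not specify which disclosure rules a TRE should use.

```python
from typing import Dict, List, Optional, Tuple


def release_counts(
    counts: Dict[str, int], threshold: int = 5
) -> Tuple[Dict[str, Optional[int]], List[str]]:
    """Suppress small counts and return an audit trail of what was withheld."""
    released: Dict[str, Optional[int]] = {}
    audit_log: List[str] = []
    for label, n in counts.items():
        if n >= threshold:
            released[label] = n
        else:
            # None marks a suppressed cell; the true value stays inside the TRE.
            released[label] = None
            audit_log.append(f"suppressed '{label}' (count below {threshold})")
    return released, audit_log


requested = {"variant_a": 120, "variant_b": 3}
released, audit_log = release_counts(requested)
print(released)   # {'variant_a': 120, 'variant_b': None}
print(audit_log)  # ["suppressed 'variant_b' (count below 5)"]
```

Returning the audit trail alongside the released table mirrors the idea that every transaction through the airlock is tracked.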

 

When a TRE meets all five of these requirements, the organization has a fully trusted research environment.

 

Conclusion

 

Genomic health data brings unique challenges for storage, management, analysis and collaboration, due both to the scale of the datasets and to the sensitivity of what they contain. TREs are becoming the architecture that bridges this gap, allowing health data to be both scaled and secured.