The Inside Scoop on Data Registries



Just producing good data in a research or clinical context is not enough. You must be able to reliably capture that data and be able to count on its completeness and accuracy. Choosing the right data capture technique is key, and a data registry is one such technique. Dr. Steven Labkoff, global head of clinical & healthcare informatics at the registry science practice Quantori, is a pioneer and a leader in the field. In this interview, we discuss use cases for registries, how they can improve data integrity and how they compare to electronic health record (EHR)-based approaches.


Ruairi Mackenzie (RM): What is a data registry?

Steven Labkoff (SL): A data registry is a data collection process and program that is usually focused on answering very specific sets of questions. The questions can range from scientific to administrative. A data registry can vary depending on its focus – the study subject. For example, it can be focused on a “patient”, a “clinical practice” (like a cardiology or internal medicine practice), a “medical device” (like a pacemaker), or even the output from a research lab. 

The data stored in a registry tends to be longitudinal in nature, meaning that data is collected on the same variables over time so that differences over time can be measured.
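As an illustration (not from the interview), longitudinal data can be sketched as repeated measurements of the same variable for the same subject, from which changes between time points are computed. The `Observation` record, the subject ID and the variable name below are hypothetical placeholders.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    subject_id: str  # hypothetical identifier for the study subject
    variable: str    # hypothetical variable name
    taken_on: date
    value: float


def changes_over_time(observations, subject_id, variable):
    """Return (date, change since the previous measurement) pairs."""
    series = sorted(
        (o for o in observations
         if o.subject_id == subject_id and o.variable == variable),
        key=lambda o: o.taken_on,
    )
    # Pair each measurement with the one taken immediately before it.
    return [(later.taken_on, later.value - earlier.value)
            for earlier, later in zip(series, series[1:])]


# Made-up measurements, deliberately entered out of date order.
observations = [
    Observation("P001", "hemoglobin_g_dl", date(2023, 1, 10), 13.0),
    Observation("P001", "hemoglobin_g_dl", date(2023, 7, 10), 12.0),
    Observation("P001", "hemoglobin_g_dl", date(2023, 4, 10), 12.5),
]
deltas = changes_over_time(observations, "P001", "hemoglobin_g_dl")
```

Because the same variable is recorded each time, every entry in `deltas` is directly comparable with the others.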

RM: What are the best use cases for a registry approach to data management?

SL: There are a variety of use cases where a registry can advance science. But there are more potential uses outside research. There are at least seven large buckets of use cases into which most registry work can be classified. These include:

·        Medical Device or Medication

·        Clinical Research

·        Care Delivery

·        Natural History

·        Health Economic / Outcome Research

·        Safety Surveillance

·        Quality Improvement

The main difference between the use cases revolves around the study subject. For example, the study subject in a Medical Device registry might be the actual device, rather than the patient in whom the device is implanted. This focus can allow for rapid identification of patients with the implant in the case of a recall or problem. On the other hand, if the study subject is a medical practice, the registry might be used to track administrative tasks that need to be reported to a regulatory agency such as CMS or a state medical board. If the registry is focused on a specific set of patients with a rare cancer, then the unit of study would be the patient, but only patients who have that rare cancer.
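To make the device-as-study-subject idea concrete, here is a minimal sketch (an illustration, not something described in the interview) of a registry indexed by device model, so that a recall can be mapped to the affected patients in one lookup. The model names, serial numbers and patient IDs are invented.

```python
from collections import defaultdict

# Hypothetical implant records: (device_model, serial_number, patient_id)
implants = [
    ("PaceModelA", "SN-1001", "P001"),
    ("PaceModelB", "SN-2001", "P002"),
    ("PaceModelA", "SN-1002", "P003"),
]


def index_by_model(records):
    """Group implant records by device model (the study subject)."""
    by_model = defaultdict(list)
    for model, serial, patient_id in records:
        by_model[model].append((serial, patient_id))
    return by_model


def patients_affected_by_recall(by_model, recalled_model):
    """Return the sorted, de-duplicated patient IDs for a recalled model."""
    return sorted({pid for _, pid in by_model.get(recalled_model, [])})


registry = index_by_model(implants)
affected = patients_affected_by_recall(registry, "PaceModelA")
```

Here `affected` contains the two patients with a `PaceModelA` implant, found without opening any individual medical record.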

RM: How is registry data used by different stakeholders – organizations, patients etc.?

SL: Registries are used in different ways, depending on the sponsor/stakeholder. For example, if a patient advocacy organization (PAO) is the sponsor, their interest may be to find new approaches to researching a rare disease on behalf of those with a particular ailment or disorder and their families. The PAO may have a specific interest in advancing research that may not be possible in the confines of a university setting, largely because university-based research has constraints – especially around data sharing between institutions. A PAO can work across boundaries, at the regional or national level, and collect data and information that can later be channelled for further research via a grant process.

If the stakeholder happens to be a medical society, its interest may revolve around helping its members collect data about their own practices to report up to a state or national regulatory agency, like the Centers for Medicare & Medicaid Services (CMS). For example, there are Medicare reimbursement programs known as MIPS and MACRA – to get the most out of Medicare reimbursements, a practice needs to collect data about the services it provides and how completely it meets certain metrics put forth by CMS. The state medical society may have an interest in aggregating all this data at the state level to be able to compare practices with each other. This type of activity is best managed using a registry.

RM: Does a registry approach guarantee better data completeness and integrity?

SL: The concept of using a registry does not inherently guarantee completeness of data. Nevertheless, by its nature, using a registry should usually help with data completeness. It is ultimately up to the team that manages the registry, its processes and data. That said, the entire reason a registry structure is adopted is to ensure that the data is of higher quality (cleaner, more complete and longitudinal in nature) than data obtained by other means of collection and curation. Generally, there are members of the registry team whose primary job is to review the data as it’s being gathered, before it is entered. Once the data is reviewed and entered, other team members ensure that it is both complete and well curated. Ultimately, this is why Registry Science exists.

However, using a registry program does not ensure that the data will be well curated. Various business processes, human interfaces and data science techniques are needed to ensure high-quality, well-curated and longitudinal data becomes the base expectation for the program.

The curators and stewards of registries have the task of ensuring that the final data for a given entry is accurate, complete and, where needed, researched to ensure accuracy. In addition, there are specific efforts made to connect disparate data sets together at the patient level (in the case of a disease registry), to provide a similar set of observations or other data about the study subject at a given point of time – and then to repeat those observations, lab tests, etc., to record any changes over time.
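One way to picture the curation step is a completeness check that flags entries with missing required fields so a curator can research and fill them in. The sketch below is an illustration under assumed field names and made-up entries, not a description of any particular registry's process.

```python
# Hypothetical set of fields a registry entry must contain.
REQUIRED_FIELDS = ("patient_id", "visit_date", "diagnosis_code", "lab_result")


def missing_fields(entry, required=REQUIRED_FIELDS):
    """Return the required fields that are absent, None or empty."""
    return [f for f in required if entry.get(f) in (None, "")]


def completeness_report(entries):
    """Return (share of complete entries, {entry index: missing fields})."""
    to_review = {}
    for i, entry in enumerate(entries):
        gaps = missing_fields(entry)
        if gaps:
            to_review[i] = gaps
    complete = len(entries) - len(to_review)
    rate = complete / len(entries) if entries else 0.0
    return rate, to_review


# Made-up entries; the second one has two gaps for a curator to follow up.
entries = [
    {"patient_id": "P001", "visit_date": "2023-01-10",
     "diagnosis_code": "DX-EXAMPLE", "lab_result": 13.0},
    {"patient_id": "P002", "visit_date": "2023-01-12",
     "diagnosis_code": "", "lab_result": None},
]
rate, to_review = completeness_report(entries)
```

The report only identifies gaps; resolving them still depends on the people and processes the registry team puts in place.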

RM: In a clinical context, how does a registry approach compare to an EHR-based approach?

SL: An EHR tends to collect medical information as a by-product of clinical care. Data about a clinician’s findings, the patient’s labs, radiology or pathology results are posted chronologically in the EHR; however, there is nothing that dictates when data is collected, its frequency, or its completeness. Indeed, when an event happens, a medical record becomes the repository of all information collected about the day-to-day events that occur based on a patient’s care.

A registry, on the other hand, tries very hard to focus on a specific question – say, the longevity of a medical implant. A pacemaker registry, for example, holds data about the specific product (the pacemaker), the electronic wires implanted in the patient (leads), the surgeon or cardiologist who performs the implant procedure, the date and time of implantation and any changes to programming. While all of this information is also stored in the medical record, there are many reasons why having it segregated out of the EHR for epidemiological study is critically important. By having all this data bucketed together, one can much more easily study events, such as battery failures, lead breaks or the longevity of one device versus another, without having to dive deep into the medical records of all of the patients who had that brand of pacemaker implanted. There are both safety and administrative reasons why being able to study this information in this form is critical to both the patients and the device manufacturer (and it is far faster and more organized than diving into the EHR every time a question needs to be answered).

An EHR, by its very nature, tends to only have data that is accrued in the course of normal clinical care. It is seldom highly curated – meaning that if there are missing data elements, they are not generally researched, investigated and later filled in – especially if they are not needed in the course of normal clinical care.

A registry, in contrast, can be used to collect multiple types of data about the same research subject (a patient, a practice, or other study subject). As an example, when the study subject is a patient, it can be used to collect and curate healthcare data from an EHR, genomic data on tissue specimens, radiology results, pathology reports, and patient-reported outcomes (survey data) – all on the same patient, repeatedly collected at intervals, reviewed and curated (cleaned up) to produce highly specific, highly curated, longitudinal data on the subject. One of the keys in figuring out the complexity of disease is understanding what’s happening with the patient from multiple points of view – call it a 360-degree point of view. With a multimodal registry, all of these different modalities of data are collected over time. This allows for the beginning of a true 360-degree view of the study subject.
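As a rough illustration of the multimodal idea (not taken from the interview), records from several sources can be linked on a shared patient identifier and ordered by date to give one timeline per patient. The sources, IDs and payloads below are invented, and real record linkage is considerably more involved than matching on a single ID.

```python
from collections import defaultdict

# Hypothetical records: (patient_id, ISO date, modality, payload)
ehr = [("P001", "2023-01-10", "ehr", "clinic visit note")]
genomics = [("P001", "2023-02-01", "genomics", "tissue sequencing result")]
surveys = [
    ("P001", "2023-01-20", "survey", "patient-reported outcome"),
    ("P002", "2023-01-05", "survey", "patient-reported outcome"),
]


def build_timelines(*sources):
    """Merge records from all sources into one dated timeline per patient."""
    timelines = defaultdict(list)
    for source in sources:
        for patient_id, when, modality, payload in source:
            timelines[patient_id].append((when, modality, payload))
    # ISO-formatted dates sort chronologically as plain strings.
    return {pid: sorted(events) for pid, events in timelines.items()}


timelines = build_timelines(ehr, genomics, surveys)
modalities_p001 = [modality for _, modality, _ in timelines["P001"]]
```

Each patient's timeline then interleaves clinical, genomic and survey data in date order, which is the starting point for the 360-degree view.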

In conclusion, there are reasons why different tools exist. A hammer is used for a different set of purposes than a saw. Likewise, an EHR is used for an array of different use cases – most of which revolve around day-to-day care, while some extend to pseudo-registry-like functionality. A registry is a different tool, designed with different use cases in mind.

I’ll leave you with a wonderful metaphor that I use when discussing registries vs EHRs (I’m a photographer so please bear with me). To me, data in an EHR is like an individual photograph. Each photograph is filled with information (data), and looking from photo to photo over time, you might be able to see some trends if you are diligent about putting everything together. If you are photographing a flower, you might get a few shots of it as a bud, some as a fully blossomed flower, and another when it’s withered and old.  But each bit of data is discrete. You would be hard pressed to figure out when a bud blossomed into a petal, or when a specific leaf dropped.

A registry, however, is more like a time-lapse video. With enough frames in the video, you can watch the flower grow day by day. You can even determine precisely when some change in light, water, or other factor impacts the flower. That’s all possible because you have enough individual measurements that are then looked at over time in the form of a video. 

Registries allow for the compilation of data over time in a way that many other means of data analysis don’t allow. The metaphor doesn’t entirely hold; you are unlikely to be constantly recording a subject like our under-surveillance flower. But the more often the same types of data are recorded over time, the better you can figure out what’s happening in between snapshots.

Steven Labkoff was speaking to Ruairi J Mackenzie, Senior Science Writer for Technology Networks.