Accelerating Biological Discovery With Open-Source Spatial Genomics Data
Vizgen, a life science company dedicated to advancing human health by visualizing single-cell spatial genomics information, has recently launched an ongoing Data Release Program, giving the scientific community access to open-source spatial genomics data. The first dataset to be released as part of the program, the MERFISH Mouse Brain Receptor Map, is the largest public dataset for spatial genomics available to date.
Dr. George Emanuel, scientific cofounder and director of technology and partnerships at Vizgen, spoke with Technology Networks to tell us more about the program, its aims and expected impacts. In this interview, Dr. Emanuel also discusses some of the most exciting findings from the Mouse Brain Receptor Map and highlights how the datasets will benefit the Human Cell Atlas project.
Anna MacDonald (AM): Can you tell us more about the origins and aims of the Data Release Program?
George Emanuel (GE): Spatial transcriptomic profiling with single-molecule resolution across whole tissues has the potential to provide the next step forward in our understanding of complex biological systems. We put together our Data Release Program to introduce the community to the type of data our MERSCOPE platform measures for three key reasons. First, it serves as a resource that can supplement research that is already underway to better understand the functional composition of the brain. Second, this data acts as an example for researchers of how high-resolution spatial transcriptomics data may apply to their specific research questions. Third, providing a complete dataset can enable the computational community to begin developing tools for extracting relevant biological features from this type of data. Spatial genomics is an emerging field, and we want to help those who have never worked with this kind of data understand the types of insights and discoveries that the next generation of genomics will enable.
AM: What impact do you expect access to open-source spatial genomics data will have?
GE: High-quality open-source data has the potential to accelerate biological discoveries. Most academic data ends up in the public domain eventually, but typically a year or two after it was first acquired. Since MERFISH has already been validated through various top-tier scientific publications, we are comfortable with the quality of the data being generated with MERSCOPE and want to get the data into the hands of as many researchers as possible so they can start mining it for biological insight. We expect that providing access to open-source spatial genomics data will help the research community understand the extra level of information available in spatial genomics. Our vision for making this data publicly available is to empower researchers to make new discoveries with biological significance or be inspired to develop new tools for analyzing the data.
AM: How will the program fit into the Human Cell Atlas project?
GE: The Human Cell Atlas project is an initiative to create comprehensive reference maps of all human cells. A map by its nature requires spatial information, and there are no commercial technologies besides MERSCOPE on the market that can provide single-cell information for all cells across large tissues with such high multiplexing capacity. Our MERFISH Mouse Brain Receptor Map demonstrates the capacity of MERFISH to lead the Human Cell Atlas project. This data could, for example, help the Human Cell Atlas community build on their existing single-cell measurements to uncover new discoveries, or could feed into bioinformatics development pipelines. We see a lot of possibilities for ways the data could fit into the Human Cell Atlas project, and we look forward to presenting at the Human Cell Atlas Meeting at the end of June.
AM: How are the datasets generated?
GE: The datasets are generated using a prototype of the MERSCOPE platform solution. The dataset contains nine full coronal slice measurements: three positions across the mouse brain with three biological replicates at each position. Each slice was processed following our standard MERSCOPE sample prep protocol. From a fresh frozen tissue block, a 10-micron-thick slice was cut and adhered onto a MERSCOPE slide. The MERSCOPE encoding probes were then hybridized to imprint each binary barcode onto the targeted transcripts, and the sample was placed on the MERSCOPE to read out the barcode using sequential rounds of imaging. The platform acquired raw images across the whole sample and processed these images to decode the barcodes and identify the positions of each of the RNA transcripts, in addition to segmenting the cell bodies using the additional DAPI and PolyT stains. This resulted in the MERSCOPE output that was posted for all nine slices: the list of all detected transcripts and their spatial locations in three dimensions (CSV), the mosaic images (TIFF), and the output from the cell segmentation analysis, including the transcripts per cell matrix (CSV), the cell metadata (CSV) and the cell boundaries (HDF5).
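To illustrate how the output files described above relate to one another, here is a minimal Python sketch of how a table of detected transcripts could be collapsed into a transcripts-per-cell matrix. It is not Vizgen's pipeline. The column names (gene, global_x, global_y, global_z, cell_id), the use of -1 to mark transcripts that fall outside any segmented cell, and the sample rows are all assumptions made for illustration; the released CSV files may be laid out differently.

```python
import csv
import io
from collections import Counter, defaultdict


def count_transcripts_per_cell(rows, cell_key="cell_id", gene_key="gene"):
    """Return {cell_id: Counter({gene: count})} from detected-transcript rows.

    Rows whose cell id is empty or "-1" are treated as unassigned and skipped
    (an assumed convention, not one confirmed by the released files).
    """
    counts = defaultdict(Counter)
    for row in rows:
        cell = row[cell_key]
        if cell in ("", "-1"):
            continue
        counts[cell][row[gene_key]] += 1
    return counts


# Made-up rows in a hypothetical detected-transcripts layout.
SAMPLE_CSV = """gene,global_x,global_y,global_z,cell_id
Oxtr,10.5,20.1,0,1
Oxtr,11.0,20.4,1,1
Insr,12.2,19.8,0,1
Tshr,50.3,61.0,2,2
Insr,90.0,15.5,0,-1
"""

rows = csv.DictReader(io.StringIO(SAMPLE_CSV))
matrix = count_transcripts_per_cell(rows)

# Fix a column order so every cell's row lines up gene by gene.
genes = sorted({gene for per_cell in matrix.values() for gene in per_cell})

print("cell_id", genes)
for cell_id in sorted(matrix):
    print(cell_id, [matrix[cell_id][gene] for gene in genes])
```

For a real slice with a large number of detected transcripts, a dataframe library would likely be a better fit than plain dictionaries, but the grouping logic would be the same: one row per cell, one column per gene.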
AM: Why was the Mouse Receptor Map chosen as the first dataset?
GE: The brain contains an intricate arrangement of different types of cells, including numerous variations of inhibitory and excitatory neurons, astrocytes, monocytes, endothelial cells and oligodendrocytes. We still do not have a full understanding of how these cells are spatially arranged in the brain and how their spatial organization contributes to the overall emergence of complex cognitive processes, but MERFISH is a tool that allows researchers to interrogate these aspects directly. We chose the MERFISH Mouse Brain Receptor Map as the first dataset because it shows the power of spatial transcriptomics to advance biological discovery in one of the most challenging tissues to fully comprehend.
AM: Were any details of particular interest revealed?
GE: For this first dataset, we constructed a panel of 483 genes comprising ~50 canonical markers and ~450 mouse brain receptors, including GPCRs and RTKs. GPCRs mediate signaling and may play vital roles in brain aging and neurodegenerative disorders, but they are expressed at low levels and are difficult to capture with other technologies. Even though many of the GPCRs had not previously been extensively characterized across the brain, nearly all exhibited nontrivial spatial organization. Something to highlight from the dataset is that we demonstrated the ability of MERSCOPE to detect some particularly low-expression genes, including OXTR (oxytocin receptor), TSHR (thyroid stimulating hormone receptor) and INSR (insulin receptor). The ability to detect low-expression genes such as these GPCRs can aid in drug development as we begin to understand the functional significance of different GPCRs expressed across different regions of the brain, as just one example.
AM: Are you able to comment on future atlases that are in the pipeline?
GE: We plan to continue to release datasets highlighting the power of the MERSCOPE platform across different application areas. It's possible that our next data release will be more clinically relevant, such as human cancer tissue, or denser tissues where the cell segmentation is not as straightforward as in the brain. We have a number of exciting results from internal projects that could lead to more data releases. We hope to eventually build a library of accessible and well-catalogued MERFISH data for the research community at large.
Dr. George Emanuel was speaking to Anna MacDonald, Science Writer for Technology Networks.