

Sailing the Genome in Search of Safe Harbors

A collaborative research team at Harvard’s Wyss Institute and the ETH Zurich in Switzerland has identified genomic safe harbors (GSHs) in the tumultuous sea of human genome sequence in which to land therapeutic genes. As part of their validation, they inserted a fluorescent GFP reporter gene into candidate GSHs and followed its expression over time. The GSHs could enable safer and longer-lasting expression of genes in future gene and cellular therapies. This illustration won the team the cover of the Cell Reports Methods issue in which the study is published. Credit: Erik Aznauryan.

Cell and gene therapies are poised to have a major impact on the landscape of modern medicine, carrying the potential to treat an array of different diseases with unmet clinical need.

However, the number of approved, clinically adopted cell and gene therapies is small compared to the number currently in development. A major barrier to the translation of such therapies is the safe integration of therapeutic genes into the human genome. The insertion of therapeutic genes bears the risk of “off target” effects, or integration of the gene into an unintended location.

A number of different strategies have been proposed to mitigate this effect. The most recent body of work comes from a collaboration between Harvard’s Wyss Institute for Biologically Inspired Engineering, Harvard Medical School (HMS) and the ETH Zurich in Switzerland.

Published in Cell Reports Methods, the research focused on identifying “safe spots” in the genome. These locations, known as genomic safe harbors (GSHs), are areas in the genome that meet the following criteria: they can be accessed easily by genome-editing strategies, are within a safe distance from genes that possess functional properties, and permit expression of a therapeutic gene only once it has “landed” in the harbor. A simple analogy is deciding which harbor to dock a boat in: there are many considerations, and these depend on the type of boat you are sailing, the weather conditions and ease of access.

The research team adopted computational strategies that enabled the identification of 2,000 predicted GSHs. From this initial identification, they successfully validated two of the sites both in vitro and in vivo using reporter proteins.

Technology Networks interviewed the study’s first author, Dr. Erik Aznauryan, research fellow in the laboratory of Professor George Church at Harvard Medical School. Aznauryan dives into further detail on the history of GSH research, the methods adopted to validate the GSH sites and the potential applications of this research.

Molly Campbell (MC): Can you talk about the history of genomic safe harbor research, and how they were discovered?

Erik Aznauryan (EA): Three genomic sites were empirically identified in previous studies to support stable expression of genes of interest in human cells: AAVS1, CCR5 and hRosa26. All these examples were established without any a-priori safety assessment of the genomic loci they reside in.

Attempts have been made to identify human GSH sites that would satisfy various safety criteria, thus avoiding the disadvantages of existing sites. One approach developed by Sadelain and colleagues used lentiviral transduction of beta-globin and green fluorescent protein genes into induced pluripotent stem cells (iPSCs), followed by the assessment of the integration sites in terms of their linear distance from various coding and regulatory elements in the genome, such as cancer genes, miRNAs and ultraconserved regions.

They discovered one lentiviral integration site that satisfied all of the proposed criteria, demonstrating sustainable expression upon erythroid differentiation of iPSCs. However, global transcriptome profile alterations of cells with transgenes integrated into this site were not assessed. A similar approach by Weiss and colleagues used lentiviral integrations in Chinese hamster ovary (CHO) cells to identify sites supporting long-term protein expression for biotechnological applications (e.g., recombinant monoclonal antibody production). Although this study led to the evaluation of multiple sites for durable, high-level transgene expression in CHO cells, no extrapolation to human genomic sites was carried out.

Another study aimed to identify GSHs through a bioinformatic search for mCreI sites residing in loci that satisfy GSH criteria. mCreI sites are regions targeted by a monomerized version of the I-CreI homing endonuclease, which was found and characterized in green algae and is capable of making targeted staggered double-strand DNA breaks. As in previous work, several stably expressing sites were identified and proposed for synthetic biology applications in humans. However, local and global gene expression profiling following integration events in these sites has not been conducted.

All these potential GSH sites share the limitation that the search for them was constrained by lentiviral- or mCreI-based integration mechanisms. Additionally, safety assessments of some of these identified sites, as well as of the previously established AAVS1, CCR5 and hRosa26, were carried out by evaluating the differential expression of genes located solely in the vicinity of these integration sites, without observing global transcriptomic changes following integration.

A more comprehensive bioinformatic-guided and genome-wide search of GSH sites based on established criteria, followed by experimental assessment of transgene expression durability in various cell types and safety assessment using global transcriptome profiling would, thus, lead to the identification of a more reliable and clinically useful genomic region.

MC: If GSHs do not encode proteins, or RNAs with functions in gene expression, or other cellular processes – what is their function in the genome?

EA: In addition to protein coding, functional RNA coding, regulatory and structural regions of the human genome, other less well understood and inactive DNA regions exist.

A large proportion of the human genome seems to have evolved in the presence of a variety of integrating viruses which, as they inserted their DNA into the eukaryotic genome over the course of millions of years, led to the establishment of vast non-coding elements that we continue to carry to this day. Furthermore, partial duplications of functional human genes have resulted in the formation of inactive pseudogenes, which occupy space in the genome yet are not known to bear cellular functions.

Finally, the functional roles of some non-coding portions of the human genome are not yet well understood. Our search for safe harbors was conducted using the existing annotation of the human genome, and as more of its components are deciphered, the identification of genomic regions safe for gene insertion will become more informed.

MC: Are you able to discuss why some regions of the genome were previously regarded as GSHs but are now recognized as non-GSHs?

EA: In the absence of other alternatives, AAVS1, CCR5 and hRosa26 sites were historically called GSHs, as they supported the expression of genes of interest in a variety of cell types and were suitable for use in a research setting.

Their caveats (mainly, location within introns of functional genes, closely surrounded by other known protein-coding genes as well as oncogenes), however, prevent them from being used for clinical applications. Therefore, in our paper we don’t call them GSHs, and refer to our newly discovered sites as GSHs.

MC: You thoroughly scanned the genome to identify candidate loci for further study as potential GSHs. Can you discuss some of the technological methods you adopted here, and why?

EA: We used several publicly available databases to identify genomic coordinates of structural, regulatory and coding components of the human genome according to the GSH criteria we outlined at the beginning of our study (outside genes, oncogenes, lncRNAs, etc.). We used these coordinates and bioinformatic tools, such as the command-line tool bedtools, to exclude these genomic elements as well as areas adjacent to them. This left us with genomic regions, the putative GSHs, which we could then experimentally validate by inserting reporter and therapeutic genes into them, followed by transcriptomic analysis of GSH-integrated vs. non-integrated cells.
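The exclusion step described above can be illustrated with a minimal Python sketch. This is not the study's pipeline: the function names, the toy coordinates and the flank size are all hypothetical, chosen only to show the idea of padding annotated elements, merging them, and keeping what is left over. It assumes half-open intervals on a single chromosome, as in BED files, with every annotated element lying within the chromosome bounds. In bedtools terms, the same idea roughly corresponds to the `slop`, `merge` and `complement` subcommands, although the interview does not say which subcommands the team used.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), as in BED files


def pad_and_merge(features: List[Interval], flank: int, chrom_len: int) -> List[Interval]:
    """Extend each annotated element by `flank` bases on both sides,
    clip to the chromosome, and merge overlapping or touching intervals."""
    padded = sorted(
        (max(0, start - flank), min(chrom_len, end + flank))
        for start, end in features
    )
    merged: List[Interval] = []
    for start, end in padded:
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def candidate_regions(
    features: List[Interval], flank: int, chrom_len: int, min_len: int = 1
) -> List[Interval]:
    """Return the parts of the chromosome not covered by any padded
    element, keeping only gaps of at least `min_len` bases."""
    gaps: List[Interval] = []
    cursor = 0
    for start, end in pad_and_merge(features, flank, chrom_len):
        if start - cursor >= min_len:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if chrom_len - cursor >= min_len:
        gaps.append((cursor, chrom_len))
    return gaps


# Toy example (hypothetical coordinates, not real annotations):
# three annotated elements on a 1,000-base "chromosome", 50-base flank.
toy_features = [(100, 200), (250, 300), (700, 800)]
toy_candidates = candidate_regions(toy_features, flank=50, chrom_len=1000)
```

In a real analysis each class of element (genes, oncogenes, lncRNAs and so on) would likely carry its own exclusion distance, so the padding step would be run per class before merging. The single `flank` value here is a simplification.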

MC: You narrowed down your search to test five, and then two GSHs. Can you expand on your choice of reporter gene when assessing two GSHs in cell lines?

EA: Oftentimes in research you go with what is available or what is of the most interest to the lab you are currently working in.

Our case was not an exception, and we initially (up until the T cell work) used the mRuby reporter gene as it was widely available and extensively utilized and validated in our lab at ETH Zurich back then.

When I moved to the Wyss Institute at Harvard, I began collaborating with Dr. Denitsa Milanova, who was interested in testing these sites in the context of skin gene therapy, particularly the treatment of junctional epidermolysis bullosa, which is caused by mutations in genes for various anchor proteins that connect different layers of skin, among them LAMB3. For this reason, we decided to express this gene in human dermal fibroblasts, together with green fluorescent protein to have a visualizable confirmation of expression. We hope to be able to translate this study into clinics.

MC: Can you describe examples of how GSHs can be utilized in potential therapeutics?

EA: Current cell therapy approaches rely on random insertion of genes of interest into the human genome. This can be associated with potential side effects including cancerous transformation of therapeutic cells as well as eventual silencing of the inserted gene.

We hope that current cell therapies will eventually transition to therapeutic gene insertions precisely into our GSHs, which will alleviate both described concerns. Specific areas of implementation may involve safer engineering of T cells for cancer treatment: insertion of genes encoding receptors targeting tumor cells or cytokines capable of enhancing anti-tumor response.

Additionally, these sites can be used for the engineering of skin cells for therapeutic (as discussed earlier with the LAMB3 example) as well as anti-aging applications, such as expression of genes that result in youthful skin phenotype.

Finally, given the robustness of gene expression from our identified sites, they can be used for industry-scale bio-manufacturing: high-yield production of proteins of interest in human cell lines for subsequent extraction and therapeutic applications (e.g., production of clotting factors for patients with hemophilias).

MC: Are there any limitations to the research at this stage?

EA: A primary limitation to this study is the low frequency of genomic integration events using CRISPR-based knock-in tools. This means that cells in which the gene of interest successfully integrated into the GSH must be pulled out of the vastly larger population of cells without this integration.

These isolated cells would then be expanded to generate a homogeneous population of gene-bearing cells. Such a pipeline is not ideal for a clinical setting, and improvements in gene integration efficiencies are needed to help this technology translate more easily into clinics.

Our lab is currently working on developing genome engineering tools that would eventually allow large genes to be integrated into GSHs with high precision and efficiency.

MC: What impact might this study have on the cell and gene therapy development space?

EA: This study will hopefully lead to many researchers in the field testing our sites, validating them in other therapeutically relevant cell types and eventually using them in research, as well as in clinics, as more reliable, durable and safe alternatives to current viral-based random gene insertion methods.

Additionally, since in our work we shared all putative GSHs identified by our computational pipeline, we hope researchers will attempt to test sites we haven’t validated yet by implementing the GSH evaluation pipeline that we outlined in the paper. This will lead to identification of more GSHs with perhaps even better properties for clinical translation or bio-manufacturing.

MC: What are your next steps in advancing this work?

EA: We hope to one day translate our successful in vitro skin results and start using these GSHs in an in vivo context.

Additionally, we are looking forward to improving integration efficiencies into our GSHs, which would further support clinical transition of our sites.

Finally, we will evaluate the usability of our GSHs for large-scale production of therapeutically relevant proteins, thus improving the manufacturing pipeline for biologics.

Dr. Erik Aznauryan was speaking to Molly Campbell, Senior Science Writer for Technology Networks.