DNA Set To Become the World's Smallest Hard Drive
The world is generating more data than ever. The International Data Corporation predicts that global data storage demand will increase from 33 zettabytes in 2018 to 175 zettabytes by 2025 – a figure far beyond the storage capacity of currently available methods. This growth, coupled with the costs and energy requirements of maintaining and transferring data, calls for novel solutions.
The prospect of using DNA as a method for storing data has been considered since the 1950s. "DNA is the molecule that stores all the instructions needed to create us as human beings. That's a lot of information, and we have a copy of all that information in every single cell in our body," said Dr. Keith EJ Tyo, associate professor of chemical and biological engineering at the Center for Synthetic Biology, Northwestern University.
Cells vs computers
In biological cells, DNA comprises four nucleic acid bases, adenine (A), thymine (T), guanine (G) and cytosine (C), which are strung together in different combinations to form genes. Genes are transcribed and translated into proteins, the functional "workhorses" of our cells. In computers, information is stored as binary digits, or "bits", which are 1s and 0s. When these bits are read in a particular order, they act as code that tells programs what to do. The overall goal of DNA-based data storage is to encode and decode binary data to and from synthesized strands of DNA.
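To make the idea of encoding bits as bases concrete, here is a minimal Python sketch. The two-bits-per-base mapping and the function names are assumptions chosen for illustration; they are not the scheme used in the study described below, and practical systems add error correction and avoid problematic sequences.

```python
# Hypothetical mapping of two bits to one base (illustration only).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a string of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"DNA")
    print(strand, decode(strand))
```

Under this mapping each byte becomes four bases, so a three-byte message becomes a 12-base strand.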
While the field holds huge potential, current drawbacks associated with ex-vivo synthesis of DNA have limited its large-scale application. "State-of-the-art DNA chemical synthesis is expensive relative to other storage technologies, such as solid-state hard drives," Tyo explained.
Tyo and colleagues at Northwestern have developed a novel in vitro method for recording information to DNA that relies on an enzymatic system. The method, Time-sensitive Untemplated Recording using TdT for Local Environmental Signals, or TURTLES, is published in the Journal of the American Chemical Society.
"We have been working for some time on engineering DNA as a storage medium. For a long time, we studied the most common DNA polymerases that replicate DNA. We realized that this type of replicative polymerase can stall for a relatively long time, and that would not be acceptable for data storage. Instead, we started looking at non-replicative DNA polymerases and found one with exactly the right properties," he said.
Recording the environment using DNA
The DNA polymerase, called terminal deoxynucleotidyl transferase or TdT, adds nucleotide bases to the 3' end of single-stranded DNA. Which bases it prefers to add depends on the chemical conditions of its environment. Tyo explained, "Additional links are added sequentially to one end [of the DNA strand]. When a chemical – for example, cobalt – is present, the DNA that is added tends to have more As and less Gs. When cobalt is not present, the opposite is true. Therefore, we record a 1 when we allow the links to be added when cobalt is present, and a 0 when cobalt is absent." When the DNA is later read to decode the information, a region rich in A bases is read as a 1, and a region rich in G bases is read as a 0.
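The decoding rule described above can be sketched in a few lines of Python. This is an idealized illustration, not the authors' analysis: the function name `decode_bits` and the fixed-size `window` are assumptions, and in practice the number of bases added per recording interval varies from strand to strand.

```python
def decode_bits(strand: str, window: int) -> list[int]:
    """Read a strand in fixed-size regions: A-rich -> 1, otherwise -> 0.

    Assumption: every bit occupies exactly `window` bases. A region with
    equal numbers of A and G is read as 0, and a trailing partial region
    is ignored.
    """
    bits = []
    for start in range(0, len(strand) - window + 1, window):
        region = strand[start:start + window]
        bits.append(1 if region.count("A") > region.count("G") else 0)
    return bits


if __name__ == "__main__":
    # Three made-up regions: A-rich, G-rich, A-rich.
    print(decode_bits("AAATAAGGGCGGAACAAA", window=6))
```

Because the enzyme only shifts the odds of adding each base, a real readout would be statistical, averaging over many strands rather than trusting a single region.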
In this way, as the environment changes, the composition of the DNA strand being synthesized also changes. The average rate of nucleotide incorporation can also be used as a "time stamp" to indicate exactly when the environmental change occurred. "We engineer TdT to allosterically turn off in the presence of a physiologically relevant concentration of calcium. We use this engineered TdT in concert with a reference TdT to develop a two-polymerase system capable of recording a single-step change in the Ca2+ signal to within one minute over a 60-minute period," the authors write in the paper.
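The "time stamp" idea can be illustrated with a simplified sketch: locate the position where the base composition shifts, then divide by the average incorporation rate to convert position into time. Everything here is an assumption for illustration, including the function name, the single-strand input, the constant `bases_per_minute` rate and the use of A content as the signal; the published method uses a two-polymerase system and is not reproduced here.

```python
def estimate_change_time(strand: str, bases_per_minute: float) -> float:
    """Estimate when a single step change happened, in minutes.

    Tries every split point and keeps the one where the fraction of A
    bases differs most between the left and right parts of the strand.
    Assumes a constant average incorporation rate.
    """
    is_a = [1 if base == "A" else 0 for base in strand]
    total_a = sum(is_a)
    n = len(strand)
    best_pos, best_gap = 0, -1.0
    left_a = 0
    for pos in range(1, n):
        left_a += is_a[pos - 1]
        gap = abs(left_a / pos - (total_a - left_a) / (n - pos))
        if gap > best_gap:
            best_pos, best_gap = pos, gap
    return best_pos / bases_per_minute


if __name__ == "__main__":
    # Made-up strand: six G bases, then four A bases, at 2 bases per minute.
    print(estimate_change_time("GGGGGGAAAA", bases_per_minute=2.0))
```

In this toy example the composition switches after the sixth base, which at two bases per minute corresponds to a change three minutes into the recording.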
When asked how the speed of the method compares to chemically synthesizing DNA, Tyo said, "Chemical DNA synthesis works by tethering DNA to a surface and requires adding chemicals sequentially and then washing them away before adding the next chemical. This washing step creates slowness. Our approach does not require washing steps and instead all the reagents for DNA synthesis stay in the mixture and the properties of the DNA polymerase are modulated reversibly."
A call for innovation and collaboration
The study is a proof-of-principle demonstration in which the team were able to record up to 3/8 of a byte of information in one hour. There's potential to scale up here, Tyo said: "A digital picture is millions of bytes and takes a fraction of a second to read and write to your hard drive. Parallelization to millions of strands of DNA will allow significantly more and faster data storage, but we are going to address technical hurdles to increase the number of bytes and shorten the record time of one DNA chain."
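A rough back-of-the-envelope calculation shows why parallelization matters. The sketch below takes the article's figure of 3/8 of a byte (3 bits) per strand per hour as an upper bound and assumes, hypothetically, that every strand records independent bits with no error-correction overhead; the function name and the one-megabyte example are illustrative.

```python
import math

# From the article: up to 3/8 of a byte, i.e. 3 bits, per strand per hour.
BITS_PER_STRAND_PER_HOUR = 3


def strands_needed(num_bytes: int, hours: float = 1.0) -> int:
    """Strands required to record `num_bytes` within `hours`.

    Assumes each strand stores independent bits at the maximum reported
    rate, with no redundancy or error correction.
    """
    total_bits = num_bytes * 8
    bits_per_strand = BITS_PER_STRAND_PER_HOUR * hours
    return math.ceil(total_bits / bits_per_strand)


if __name__ == "__main__":
    # A one-megabyte picture recorded in one hour.
    print(strands_needed(1_000_000))
```

Under these assumptions a one-megabyte picture would need roughly 2.7 million strands recording in parallel for an hour, which is in line with the "millions of strands" mentioned in the quote.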
"This is a really exciting proof of concept for methods that could one day let us study the interactions between millions of cells simultaneously. I don't think there's any previously reported direct enzyme modulation recording system," said Namita Bhan, co-first author and a former postdoctoral researcher in the Tyo lab.
Tyo and colleagues hope that other research teams across the globe will contribute to the technology, in order for it to become viable as a commercialized concept. "Who knows, maybe if they come up with something really good, they can send us a letter explaining what they've done on a strand of DNA!" he concluded.
Keith EJ Tyo was speaking to Molly Campbell, Science Writer for Technology Networks.
Reference: Bhan N, Callisto A, Strutz J, et al. Recording temporal signals with minutes resolution using enzymatic DNA synthesis. J Am Chem Soc. 2021;143(40):16630-16640. doi:10.1021/jacs.1c07331.