The power of DNA to store information gets an update



A team of interdisciplinary researchers has discovered a new technique for storing information in DNA, in this case “The Wizard of Oz” translated into Esperanto, with unprecedented precision and efficiency. The technique takes advantage of the information storage capacity of intertwined DNA strands to encode and retrieve information in a durable and compact way.


The technique is described in a paper published this week in Proceedings of the National Academy of Sciences.

“The key advance is an encoding algorithm that enables accurate information retrieval even when DNA strands are partially damaged during storage,” said Ilya Finkelstein, associate professor of molecular biosciences and one of the study’s authors.

Humans are creating information at exponentially higher rates than we used to, contributing to the need for a way to store more information efficiently and in a long-lasting way. Companies like Google and Microsoft are among those exploring the use of DNA to store information.

“We need a way to store this data so that it is available when and where it is needed in a format that is readable,” said Stephen Jones, a research scientist who collaborated on the project with Finkelstein; Bill Press, a professor jointly appointed in computer science and integrative biology; and Ph.D. alumnus John Hawkins. “This idea takes advantage of what biology has been doing for billions of years: storing a lot of information in a very small space that lasts a long time. DNA does not take up much space, it can be stored at room temperature, and it can last for hundreds of thousands of years.”

DNA is approximately 5 million times more efficient than current storage methods. In other words, a one-milliliter droplet of DNA could store the same amount of information as two Walmarts full of data servers. DNA also requires neither permanent cooling nor hard disks, which are prone to mechanical failure.

There is only one problem: DNA is prone to errors. And errors in a genetic code are very different from errors in a computer code. Errors in computer codes tend to show up as blank spots in the code. Errors in DNA sequences show up as insertions or deletions. The problem is that when something is deleted from or added to the DNA, the entire sequence shifts, with no blank spots to alert anyone.
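The difference between an in-place error and a deletion can be seen in a few lines of Python. This is only an illustrative sketch: the 16-base strand and the helper name `mismatches` are made up for the example and do not come from the study.

```python
def mismatches(original, received):
    """Count positions where two strands disagree, comparing index by index."""
    n = min(len(original), len(received))
    return sum(1 for i in range(n) if original[i] != received[i])

# Hypothetical 16-base strand, chosen only for illustration.
original = "ACGTTGCAAGCTTAGC"

# One base replaced in place: every other base stays where it was.
substituted = original[:5] + "A" + original[6:]

# One base deleted: everything after the gap slides one position left.
deleted = original[:5] + original[6:]

print("substitution:", mismatches(original, substituted))
print("deletion:    ", mismatches(original, deleted))
```

A single substitution disturbs one position, while a single deletion misaligns most of what follows it, and nothing in the strand marks where the gap occurred.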

Previously, when information was stored in DNA, the information to be saved, such as a paragraph in a novel, was repeated 10 to 15 times. When the information was read, the repeats were compared to eliminate any insertions or deletions.

“We found a way to build the information more like a network,” said Jones. “Each piece of information reinforces other pieces of information. That way, it only needs to be read once.”
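One way to picture information that "reinforces" other information is a code in which every base depends on the bits that came before it. The sketch below is a toy built on that assumption and is not the researchers' algorithm: the names `encode`, `decode`, `_mix`, and `CONTEXT`, the use of SHA-256, and the one-bit-per-base layout are all choices made for this example. It only detects that something went wrong; it does not correct the damage the way the published method does.

```python
import hashlib

BASES = "ACGT"
CONTEXT = 8  # how many preceding bits influence each base (assumed value)

def _mix(index, history):
    """Pseudo-random value in 0..3 derived from the position and recent bits."""
    digest = hashlib.sha256(f"{index}:{history}".encode()).digest()
    return digest[0] % 4

def encode(bits):
    """Map a string of '0'/'1' characters to bases, one base per bit."""
    out = []
    for i, bit in enumerate(bits):
        history = bits[max(0, i - CONTEXT):i]
        out.append(BASES[(_mix(i, history) + int(bit)) % 4])
    return "".join(out)

def decode(strand):
    """Return (bits, None) if every base is consistent with its context,
    otherwise (bits decoded so far, index of the first inconsistent base)."""
    bits = ""
    for i, base in enumerate(strand):
        history = bits[max(0, i - CONTEXT):i]
        offset = (BASES.index(base) - _mix(i, history)) % 4
        if offset > 1:  # only offsets 0 and 1 can come from a valid bit
            return bits, i
        bits += str(offset)
    return bits, None

message = "1011001110001011"
strand = encode(message)
print(strand, decode(strand))

# Remove one base to mimic a deletion, then try to decode the damaged strand.
damaged = strand[:4] + strand[5:]
print(damaged, decode(damaged))
```

Because each base is tied to its position and to earlier bits, a strand that has shifted tends to stop making sense soon after the damage, which gives a decoder something to work with from a single read.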

The language the researchers developed also avoids sections of DNA that are prone to errors or difficult to read. The parameters of the language can also change with the type of information being stored. For example, a dropped word in a novel is not as serious as a dropped zero on a tax return.
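As a rough illustration of what avoiding hard-to-read sections could look like, the sketch below screens a strand for two properties commonly treated as troublesome in DNA synthesis and sequencing: long runs of the same base and unbalanced G/C content. The article does not say which patterns the researchers exclude, so the two checks, the thresholds, and the function names here are all assumptions.

```python
from itertools import groupby

def longest_run(strand):
    """Length of the longest stretch of one repeated base."""
    return max((len(list(group)) for _, group in groupby(strand)), default=0)

def gc_fraction(strand):
    """Fraction of bases that are G or C (0.0 for an empty strand)."""
    return sum(base in "GC" for base in strand) / len(strand) if strand else 0.0

def is_acceptable(strand, max_run=3, gc_low=0.4, gc_high=0.6):
    """Hypothetical screen; the limits are illustrative, not from the study."""
    return (longest_run(strand) <= max_run
            and gc_low <= gc_fraction(strand) <= gc_high)

for candidate in ("ACGTACGTAC", "AAAAACGTCG"):
    print(candidate, longest_run(candidate), gc_fraction(candidate),
          is_acceptable(candidate))
```

Tightening or loosening limits like these is one way the strictness of an encoding could be matched to how much a lost piece of data would matter.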

To demonstrate the recovery of information from degraded DNA, the team subjected their “Wizard of Oz” code to high temperatures and extreme humidity. Even though the DNA strands were damaged by these harsh conditions, all the information was still successfully decoded.

“We tried to address as many problems with the process as we could at the same time,” said Hawkins, who was recently with UT’s Oden Institute for Computational Engineering and Sciences. “What we ended up with is quite remarkable.”




More information:
William H. Press et al., “HEDGES error-correcting code for DNA storage corrects indels and allows sequence constraints,” PNAS (2020). DOI: 10.1073/pnas.2004821117, www.pnas.org/cgi/doi/10.1073/pnas.2004821117

Provided by the University of Texas at Austin

Citation: The power of DNA to store information gets an update (2020, July 13) retrieved on July 13, 2020 from https://phys.org/news/2020-07-power-dna.html
