Researchers from Microsoft and the University of Washington say they’ve reached a milestone in the effort to use DNA to help store the world’s rapidly growing stash of digital data.
A joint team from the Redmond company and the university said it had successfully encoded about 200 megabytes of data onto synthetic DNA molecules.
The information, which the team later successfully decoded and read, included more than 100 books, translations of the Universal Declaration of Human Rights, and a high-definition music video from the band OK Go.
That breaks the previous published record of about 22 megabytes encoded and decoded on DNA, researchers say.
“This is a concrete example that we can build computers in a very different way, that’s more than just silicon,” said Luis Ceze, a UW associate professor and the university’s principal researcher on the project. “We’re going to nature to build better computers.”
In nature, DNA contains the genetic instructions that guide the development of living organisms, encoded in the molecule’s famous double helix as a billions-long sequence of four building blocks.
In the technology industry, the dense molecules could become a powerful aid in storing and archiving photos, video and other digital information.
DNA storage of the type demonstrated in the UW lab could, theoretically, store an exabyte (one billion gigabytes) of data in about one cubic inch of DNA material, said Karin Strauss, Microsoft’s lead researcher on the project. Storing that much data by conventional methods would require a warehouse-sized data center.
“Our goal is really to build systems to show that it is possible,” Strauss said.
DNA is also highly durable.
Hard disk or flash drives can fail in a few years. Magnetic tape, the researchers say, lasts a few decades, and DVDs and other laser-read media can survive perhaps a century.
Under the right conditions, Ceze said, data encoded on DNA could be readable for thousands of years.
It is “the ultimate backup,” Ceze said. “Very dense, very durable.”
For now, though, the technology is years away from practical use. It is hindered by the relatively high cost of encoding data in DNA and decoding it, and by a process that takes far longer than the milliseconds hard drives need to call up data.
In the demonstration touted recently, researchers first translated the ones and zeros that made up their digital files into a series of “letters” that correspond to the four nucleotide building blocks of DNA.
The recipe for that sequence was then sent to Twist Bioscience, a San Francisco company that Microsoft turned to for a purchase of 10 million synthetic strands of DNA earlier this year.
Twist made the molecules to the researchers’ specifications and then sent them back to the Seattle lab, located under the UW’s electrical-engineering center. At that point, the compound looked like a few grains of salt at the bottom of a test tube.
UW and Microsoft researchers rehydrated the sample and, after a process that causes the selected DNA to replicate itself, ran it through a DNA sequencer that translated the molecules’ contents back to letters. Researchers then decoded those letters back into the digital ones and zeros of each file.
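The round trip from bits to letters and back can be illustrated with a minimal Python sketch. It assumes the simplest possible mapping, two bits per nucleotide letter; the article does not describe the team’s actual encoding, which would likely also need error correction and addressing, so the table and the function names `encode` and `decode` here are hypothetical.

```python
# Hypothetical mapping: each pair of bits becomes one of the four DNA letters.
# This is an illustrative assumption, not the researchers' published scheme.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Translate digital ones and zeros into a string of nucleotide letters."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Translate sequenced letters back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    message = b"OK Go"
    strand = encode(message)
    print(strand)                     # four letters per byte of input
    print(decode(strand) == message)  # True when the round trip is lossless
```

Under this assumed mapping every byte becomes four letters, so the letter “A” (binary 01000001) would be written as CAAC.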
The project is the work of about 20 researchers from Microsoft and the university, mixing computer-systems architects with molecular biologists and data-storage specialists.
There was, Ceze joked, an “adjustment period” for the experts in different disciplines to speak the same language.
The research began as an experimental UW program, funded by the National Science Foundation. Later, Microsoft took an interest and committed scientists from its research unit to the collaboration. The company is the project’s primary source of funding.
Cheap and efficient data storage is a growing business interest for Microsoft and other companies competing to store the world’s data.
Firms like Microsoft, Amazon.com, Google and Facebook have all found themselves in the expensive business of building massive data centers to accommodate the photos, videos and software programs customers are plugging into their online services.
For archival tasks, such as storing hospitals’ medical-imaging data or the output of companies that produce a lot of video, DNA may be an ideal solution, Strauss said.
“This is something that has extreme potential,” Ceze said.