DIGITIZING DNA! COULD THIS BE THE FUTURE OF STORAGE?
In my last blog post, I discussed an ingenious technique by which researchers managed to carry out an attack using DNA! While that no doubt sounds like something from the realm of science fiction, it’s definitely not the only way the digital world is being integrated with DNA.
The security of confidential information has never been more at risk than in today’s hyperconnected world. Keeping data safe has required new techniques for both the storage and transmission of data, and recently a method has emerged that truly pushes the boundaries of one’s imagination. Organizations such as Microsoft were already exploring data storage on DNA as far back as three years ago! In fact, last year the software giant managed to write 200 megabytes of data in DNA, including a movie and 99 literary classics. You can read all about it here.
Why Dive into DNA?
Humanity is facing a storage crisis! More data was generated in the past two years than in all of history before that point. The growth in data is clearly exponential and doesn’t show any signs of slowing down, which is why it is going to get harder for disks, flash, tape drives or any other devices to store the volumes being created.
To address this, researchers have been exploring the use of DNA as a means of storage. A single gram of DNA can store 215 petabytes of data, which is 215 million gigabytes. It’s hard to wrap one’s head around a number this large, so consider this: using DNA storage, roughly every bit of data that has ever existed so far could be stored in a container weighing only as much as a couple of pickup trucks!
The “How” Part: Where It Gets Technical
Researchers from Columbia University compressed files into a master file and then split the data into short strings of binary code made up of ones and zeros. Using an erasure-correcting algorithm called fountain codes, they randomly packaged the strings into droplets and mapped the zeros and ones in each droplet to the four nucleotide bases in DNA: A (adenine), G (guanine), C (cytosine) and T (thymine).
The algorithm deleted letter combinations known to create errors and added a ‘barcode’ to each droplet to help reassemble the files later. The files are retrieved by using a DNA sequencer and software to translate the genetic code back into binary.
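To make those steps a little more concrete, here is a minimal Python sketch of the idea, not the researchers’ actual code. The two-bits-per-base mapping, the “no more than three identical letters in a row” screening rule, the one-byte barcode and every function name are my own assumptions for illustration; the post’s sources don’t spell out those details.

```python
import random

# Assumed mapping of bit pairs to bases (illustrative; not taken from the study).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(data: bytes) -> str:
    """Translate binary data into a string of nucleotide letters."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(strand: str) -> bytes:
    """Translate nucleotide letters back into binary data."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def passes_screen(strand: str, max_run: int = 3) -> bool:
    """Hypothetical screen: reject strands with long runs of one letter."""
    run = 1
    for prev, cur in zip(strand, strand[1:]):
        run = run + 1 if cur == prev else 1
        if run > max_run:
            return False
    return True


def make_droplet(segments: list[bytes], seed: int) -> str:
    """XOR a seed-chosen random subset of equal-length segments together
    and prefix the seed as a one-byte 'barcode' (so seed must be 0..255)."""
    rng = random.Random(seed)
    count = rng.randint(1, len(segments))
    chosen = rng.sample(range(len(segments)), count)
    payload = bytes(len(segments[0]))  # all zero bytes to start
    for idx in chosen:
        payload = bytes(a ^ b for a, b in zip(payload, segments[idx]))
    return bytes_to_dna(bytes([seed]) + payload)


def encode(segments: list[bytes], n_droplets: int) -> list[str]:
    """Keep generating droplets, discarding any that fail the screen."""
    droplets = []
    seed = 0
    while len(droplets) < n_droplets and seed < 256:
        strand = make_droplet(segments, seed)
        if passes_screen(strand):
            droplets.append(strand)
        seed += 1
    return droplets
```

Because the barcode is the random seed, a decoder can replay the same random choice to learn which segments went into each droplet. Actually recovering the file from the droplets takes a further decoding step that this sketch leaves out.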
Of course, every idea has its own advantages and disadvantages. The advantages of DNA storage are that it is ultracompact and can last hundreds of years if stored in a cool, dry place. However, the researchers were able to store only 1.28 petabytes of data per gram of DNA. So far, no one has been able to achieve even half of what researchers believe DNA could actually handle. The other disadvantage is the cost. Researchers spent $7000 to synthesize the DNA they used to archive just 2 MB of data and another $2000 to read it!
Costs will likely come down as research progresses, although there is still a long way to go. We stay optimistic, as scientists believe that this could one day be a permanent way of storing data.
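To put those cost figures in perspective, here is a quick back-of-the-envelope calculation in Python. It uses only the numbers quoted above; the per-gigabyte figure assumes 1,000 MB per GB and that cost scales linearly with data size, which is my simplification and almost certainly not how real pricing would behave.

```python
# Figures quoted in the post.
SYNTHESIS_COST_USD = 7000  # cost to write (synthesize) the DNA
READ_COST_USD = 2000       # cost to read (sequence) it back
DATA_MB = 2                # amount of data archived

write_per_mb = SYNTHESIS_COST_USD / DATA_MB
read_per_mb = READ_COST_USD / DATA_MB
total_per_mb = write_per_mb + read_per_mb

# Assumption: decimal units and purely linear scaling.
total_per_gb = total_per_mb * 1000

print(f"Write: ${write_per_mb:,.0f} per MB")
print(f"Read:  ${read_per_mb:,.0f} per MB")
print(f"Total: ${total_per_mb:,.0f} per MB, about ${total_per_gb:,.0f} per GB")
```

That works out to $3,500 per MB to write and $1,000 per MB to read, so archiving a single gigabyte at these rates would run into the millions of dollars.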
Sources: bbc, datascience.columbia.edu, sciencemag.org
Blog by:
Meenakshi Rajendran