Abstract
The volume of information being generated is growing steadily. This growth brings further challenges: the information must not only be stored but also retained for prolonged periods. Reliance on DNA as a dense, high-capacity storage medium able to withstand extreme environmental conditions has increased over the past few years. There have been developments in reading and writing different forms of data on DNA, in codes for encrypting data, and in using DNA for secret writing, leading to new approaches in steganography and cryptography. This article outlines the methods adopted for storing digital data on DNA, with the pros and cons of each, along with the advantages and limitations of using DNA as a storage medium.
Acknowledgements
This work was carried out with the help of library resources. Special thanks go to the Institute of Industrial Biotechnology, Government College University, Lahore.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Disclosure
The authors affirm the integrity and quality of their research work. They also state that this work contains no plagiarism and that all points taken from other authors are cited in the text. This study is completely independent and impartial.
Research involving human participants and/or animals
This article does not contain any studies conducted on human or animal subjects.
Cite this article
Akram, F., Haq, I.u., Ali, H. et al. Trends to store digital data in DNA: an overview. Mol Biol Rep 45, 1479–1490 (2018). https://doi.org/10.1007/s11033-018-4280-y
DOI: https://doi.org/10.1007/s11033-018-4280-y