Next door to Delhi, a ‘bank’ to store the country’s digitised biological data
- November 13, 2022
- Posted by: OptimizeIAS Team
- Category: DPN Topics
Subject: Polity
Context-
- The government has for the first time set up a digitised repository where Indian researchers will store biological data from publicly funded research, reducing their dependency on American and European data banks.
About ‘Indian Biological Data Bank’-
- It has come up at the Regional Centre for Biotechnology in Faridabad.
- The data will be stored on a supercomputer called ‘Brahm’, which has four petabytes of storage [a petabyte equals 10,00,000 gigabytes (GB)].
- The government has mandated that data from all publicly funded research should be stored in this central repository.
- The bio-bank, which costs about Rs 85 crore to set up, currently accepts nucleotide sequences — the digitised genetic makeup of humans, plants, animals, and microbes.
- The bio-bank also has a backup ‘Disaster Recovery’ site at the National Informatics Centre (NIC), Bhubaneswar.
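The storage figure above can be checked with a quick calculation. The sketch below is only an illustration: it uses the decimal conversion stated in the note (one petabyte equals 10,00,000 GB), and the function name `petabytes_to_gigabytes` is made up for this example.

```python
# Decimal conversion as stated in the note: 1 PB = 10,00,000 GB (1,000,000 GB).
GB_PER_PB = 1_000_000


def petabytes_to_gigabytes(petabytes: int) -> int:
    """Convert a capacity in petabytes to gigabytes (decimal units)."""
    return petabytes * GB_PER_PB


# 'Brahm' is described as having four petabytes of storage.
brahm_capacity_gb = petabytes_to_gigabytes(4)
print(brahm_capacity_gb)  # 4000000, i.e. 40 lakh GB
```

So four petabytes works out to 40,00,000 GB (40 lakh GB).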
Types of data stored in the Bio-Bank-
- The database currently offers two mechanisms for data submission to researchers.
- One is open access, where uploaded data can be used immediately by other researchers across the country. The other is controlled access, where the data is not shared openly for a number of years before being opened up to all.
- The bio-bank now holds 200 billion base pairs of data, including 200 human genomes sequenced under the ‘1,000 Genome Project’, an international effort to map genetic variation in people.
- The project will also focus on populations that are predisposed to certain diseases.
- The database contains-
- Most of the 2.6 lakh SARS-CoV-2 genomes sequenced by the Indian SARS-CoV-2 Genomic Consortium (INSACOG).
- The government learnt from this data that the Omicron sub-variant BA.2.75 was being overtaken by a recombinant variant XBB — which is a combination of two Omicron sub-lineages, BJ.1 and BA.2.75.
- 25,000 sequences of Mycobacterium tuberculosis, which another national consortium is sequencing.
- Genomic sequences of crops such as rice, onion, tomatoes and mustard, among others.
- With genomes of humans, animals, and microbes present in the same database, it will also help researchers in studying zoonotic diseases.
- Although the database currently accepts only such genomic sequences, it is likely to expand later to store protein sequences (strings of amino acids that join together to form the various proteins found in these organisms) and imaging data such as ultrasound and MRI scans.
Significance-
- Such databases have traditionally played a key role in determining the genetic basis of various diseases and finding targets for vaccines and therapeutics.
- It will provide a platform for researchers to securely store their data within the country.
- It will also provide access to a large database of indigenous sequences for analyses.
- At present, most Indian researchers depend on the European Molecular Biology Laboratory (EMBL) and the National Center for Biotechnology Information (NCBI) databases for storing biological data.
- There are other smaller datasets available with some institutes, but those are not accessible to all.
- The Indian phenotype differs from that of other populations, so solutions based on others’ data might not be optimal.
- Moreover, India can even provide its data to Western countries.