Next door to Delhi, a ‘bank’ to store the country’s digitised biological data

November 13, 2022
Posted by: OptimizeIAS Team
Category: DPN Topics

Subject: Polity

Context-

The government has for the first time set up a digitised repository where Indian researchers will store biological data from publicly funded research, reducing their dependency on American and European data banks.

About ‘Indian Biological Data Bank’-

It has come up at the Regional Centre for Biotechnology in Faridabad.
The data will be stored on a four-petabyte supercomputer called ‘Brahm’ [a petabyte equals 10,00,000 gigabytes (GB)].
The government has mandated that data from all publicly funded research should be stored in this central repository.
The bio-bank, which cost about Rs 85 crore to set up, currently accepts nucleotide sequences, that is, the digitised genetic makeup of humans, plants, animals, and microbes.
The bio-bank also has a backup ‘Disaster Recovery’ site at the National Informatics Centre (NIC), Bhubaneswar.
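
The capacity figure above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the decimal convention used in the note (1 petabyte = 10,00,000 GB); the function names `petabytes_to_gigabytes` and `indian_grouping` are illustrative and not part of any official tool.

```python
# Decimal (SI) convention, matching the figure quoted in the note above.
GB_PER_PB = 1_000_000


def petabytes_to_gigabytes(petabytes: int) -> int:
    """Convert whole petabytes to gigabytes (1 PB = 10,00,000 GB)."""
    return petabytes * GB_PER_PB


def indian_grouping(n: int) -> str:
    """Format a non-negative integer with Indian digit grouping, e.g. 10,00,000."""
    digits = str(n)
    if len(digits) <= 3:
        return digits
    head, tail = digits[:-3], digits[-3:]
    groups = []
    # After the last three digits, the remaining digits are grouped in pairs.
    while len(head) > 2:
        groups.insert(0, head[-2:])
        head = head[:-2]
    if head:
        groups.insert(0, head)
    return ",".join(groups + [tail])


capacity_gb = petabytes_to_gigabytes(4)
print(indian_grouping(capacity_gb), "GB")
```

On this convention, the four-petabyte system corresponds to 40,00,000 GB (40 lakh GB).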

Types of data stored in the Bio-Bank-

The database currently offers two mechanisms for data submission to researchers.
One is open access, where uploaded data can be used immediately by other researchers across the country; the other is controlled access, where the data will not be openly shared for a number of years before being opened up to all.
The bio-bank now holds 200 billion base pairs of data, including 200 human genomes sequenced under the ‘1,000 Genome Project’, an international effort to map genetic variation in people.
The project will also focus on populations that are predisposed to certain diseases.
The database contains-
- Most of the 2.6 lakh SARS-CoV-2 genomes sequenced by the Indian SARS-CoV-2 Genomics Consortium (INSACOG).
  - The government learnt from this data that the Omicron sub-variant BA.2.75 was being overtaken by a recombinant variant XBB — which is a combination of two Omicron sub-lineages, BJ.1 and BA.2.75.
- 25,000 sequences of Mycobacterium tuberculosis, which another national consortium is working to sequence.
- Genomic sequences of crops such as rice, onion, tomatoes and mustard, among others.
With genomes of humans, animals, and microbes present in the same database, it will also help researchers in studying zoonotic diseases.
Although the database currently accepts only such genomic sequences, it is likely to expand later to store protein sequences (strings of amino acids that join together to form the various proteins found in these organisms) and imaging data such as ultrasound and MRI scans.
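
For a sense of scale, the 200 billion base pairs mentioned above can be compared with the four-petabyte capacity. The sketch below is only a rough, idealised estimate: it assumes each base is stored either as packed 2-bit codes (A, C, G, T) or as one byte of plain text, and it ignores quality scores, metadata, and backups, all of which make real sequence files considerably larger. These encodings are assumptions for illustration, not a description of how the bio-bank stores its data.

```python
BASE_PAIRS = 200_000_000_000   # 200 billion base pairs, as quoted in the note
CAPACITY_BYTES = 4 * 10**15    # four petabytes, decimal convention
BYTES_PER_GB = 10**9


def raw_size_bytes(bases: int, bits_per_base: int) -> int:
    """Bytes needed to hold `bases` symbols at `bits_per_base` bits each (rounded up)."""
    return (bases * bits_per_base + 7) // 8


# Assumption: 2 bits per base is the theoretical minimum for a four-letter alphabet.
packed_bytes = raw_size_bytes(BASE_PAIRS, 2)
# Assumption: 8 bits per base corresponds to one plain-text character per base.
text_bytes = raw_size_bytes(BASE_PAIRS, 8)

print("Packed (2 bits/base):", packed_bytes / BYTES_PER_GB, "GB")
print("Plain text (1 byte/base):", text_bytes / BYTES_PER_GB, "GB")
print("Share of 4 PB (plain text):", f"{text_bytes / CAPACITY_BYTES:.4%}")
```

Under these assumptions the raw sequence letters occupy a very small fraction of the available capacity, which leaves room for the planned expansion to protein sequences and imaging data.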

Significance-

Such databases have traditionally played a key role in determining the genetic basis of various diseases and finding targets for vaccines and therapeutics.
The data bank will provide a platform for researchers to securely store their data within the country.
It will also provide access to a large database of indigenous sequences for analyses.
At present, most Indian researchers depend on the European Molecular Biology Laboratory (EMBL) and the US National Center for Biotechnology Information (NCBI) databases for storing biological data.
There are other smaller datasets available with some institutes, but those are not accessible to all.
The Indian phenotype is very different and solutions based on others’ data might not be optimal.
Moreover, India can even provide its data to Western countries.