End to end genome sequencing
- August 1, 2021
- Posted by: OptimizeIAS Team
- Category: DPN Topics
Context: On May 27, a preprint titled “The complete sequence of the human genome” was posted in the online repository bioRxiv.
- Researchers at the National Human Genome Research Institute (NHGRI), part of the National Institutes of Health (NIH), have produced the first end-to-end DNA sequence of a human chromosome, the X chromosome.
- The effort is part of a broader initiative by the Telomere-to-Telomere (T2T) consortium, partially funded by NHGRI. The consortium aims to generate a complete reference sequence of the human genome.
- The T2T consortium continued its efforts with the remaining human chromosomes, having aimed to generate a complete human genome sequence in 2020.
- They have, in the process, discovered over a hundred new genes that code for proteins. The total size of the genome they have sequenced is close to 3.05 billion base pairs.
- In this study, researchers did not sequence the X chromosome from a normal human cell. Instead, they used a special cell type — one that has two identical X chromosomes. Such a cell provides more DNA for sequencing than a male cell, which has only a single copy of an X chromosome. It also avoids sequence differences encountered when analyzing two X chromosomes of a typical female cell.
Significance:
- The Human Genome Project, launched in 1990, announced what it called a complete human genome, but about 15% of it was incomplete. Because of technological limitations, scientists were unable to piece together some of the repetitive parts of the human genome.
- This study adds 200 million base pairs to the last draft of the human genome that was published in 2013. The results come with the caveat that about 0.3% may still have errors, and that among the sex chromosomes, only the X chromosome has been sequenced.
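To put these figures in perspective, the percentages can be converted into rough base-pair counts. The short sketch below uses only the numbers quoted in this article (about 3.05 billion base pairs in total, 200 million added, 0.3% possibly erroneous). It assumes the 0.3% applies to the whole sequenced genome, which the article does not state explicitly, and the variable names are illustrative.

```python
# Figures quoted in the article; all values are approximate.
GENOME_SIZE_BP = 3_050_000_000   # "close to 3.05 billion base pairs"
ADDED_BP = 200_000_000           # base pairs added to the 2013 draft

# Share of the sequenced genome that is newly added.
added_share = ADDED_BP / GENOME_SIZE_BP

# Assumption: the 0.3% caveat applies to the whole 3.05 billion base pairs.
# Integer arithmetic (3 per 1000) avoids floating-point rounding.
possibly_erroneous_bp = GENOME_SIZE_BP * 3 // 1000

print(f"Newly added share: {added_share:.1%}")
print(f"Base pairs that may still have errors: {possibly_erroneous_bp:,}")
```

Under these assumptions, the newly added portion is roughly one-fifteenth of the sequenced genome, and the 0.3% caveat corresponds to about nine million base pairs.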
Concept:
What is special about protein-coding genes?
- Long stretches of the genome do not seem to have a particular function. Protein-coding genes, on the other hand, are DNA sequences that are first transcribed into ribonucleic acid (RNA) as an intermediate step.
- The RNA is in turn used to make proteins, which are responsible for various functions such as keeping the body healthy or determining the colour of the eye. Proteins carry out the instructions encoded in the genes.
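The path from a protein-coding gene to a protein can be sketched in a few lines of code. The example below is a toy illustration only: the 15-base gene is made up, the function names are hypothetical, and the codon table holds just five entries from the standard genetic code rather than all 64.

```python
# Toy illustration: DNA coding strand -> messenger RNA -> protein.
# Only five codons from the standard genetic code are listed here.
CODON_TABLE = {
    "AUG": "Met",   # start codon, also codes for methionine
    "GCU": "Ala",
    "UGG": "Trp",
    "AAA": "Lys",
    "UAA": "STOP",  # stop codon, ends the protein
}


def transcribe(coding_strand):
    """Return the mRNA for a DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read the mRNA three bases at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


example_gene = "ATGGCTTGGAAATAA"  # hypothetical sequence, not a real gene
mrna = transcribe(example_gene)
protein = translate(mrna)
print(mrna)
print(protein)
```

Here the DNA is first copied into RNA (the intermediate step), and the RNA is then read in groups of three bases, each of which specifies one amino acid of the protein.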
Genome Sequencing:
- Genome sequencing means deciphering the exact order of base pairs in an individual's genome.
- For example, in a given piece of DNA, an adenine (A) may be followed by a guanine (G), which is followed by a thymine (T), which in turn is followed by a cytosine (C), another cytosine (C), and so on.
Whole/End to end Genome Sequencing:
- Whole-genome sequencing involves breaking the genome up into small pieces, sequencing the pieces, and reassembling the pieces into the full genome sequence.
- Whole-genome sequencing is required to find out which genes in a person’s DNA are “mutated”.
- Whole genome sequencing is the process of determining the complete DNA sequence of an organism’s genome at a single time.
- Because a human genome is incredibly long, consisting of about 6 billion bases when both copies of each chromosome are counted, DNA sequencing machines cannot read all the bases at once. Instead, researchers chop the genome into smaller pieces, then analyze each piece to yield sequences of a few hundred bases at a time. Those shorter DNA sequences must then be put back together.
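The chop-and-reassemble idea described above can be sketched with a toy example. The code below is a minimal illustration, not the method used by the T2T consortium: it cuts a made-up 20-base sequence into overlapping 8-base reads and then greedily merges the pair of reads with the longest overlap until one sequence remains. The function names, read length, and overlap threshold are assumptions chosen for the example. Real assemblers must also cope with sequencing errors, both DNA strands, and long repeats, which are exactly the regions that were hardest to piece together.

```python
def make_reads(genome, read_len, step):
    """Chop a sequence into overlapping reads of read_len bases."""
    reads = [genome[i:i + read_len]
             for i in range(0, len(genome) - read_len + 1, step)]
    # Add a final read if the regular steps did not reach the end.
    if (len(genome) - read_len) % step != 0:
        reads.append(genome[-read_len:])
    return reads


def overlap(a, b, min_len):
    """Length of the longest suffix of a that is also a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=4):
    """Repeatedly merge the two reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicate reads
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing overlaps any more; leave the pieces separate
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads)
                 if k not in (best_i, best_j)] + [merged]
    return reads


toy_genome = "AGTCCGATTACGGCATTGCA"  # hypothetical sequence without repeats
reads = sorted(make_reads(toy_genome, read_len=8, step=4))  # scramble order
contigs = greedy_assemble(reads)
print(reads)
print(contigs)
```

The toy sequence was chosen so that no stretch of four bases occurs twice, which is why the greedy merge recovers it unambiguously. When a genome contains long repeated stretches, several reads can overlap equally well in different places, and short reads alone cannot resolve the correct order.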
Advantages of End to end Genome Sequencing:
- The ability to generate truly complete sequences of chromosomes and genomes is a technical feat that will help us gain a comprehensive understanding of genome function and inform the use of genomic information in medical care.