First complete human genome

April 4, 2022
Posted by: OptimizeIAS Team
Category: DPN Topics

Subject: Science & Tech

Section: Biotech

Context:

In 2003, the Human Genome Project made history when it sequenced 92% of the human genome. But ever since then, scientists have struggled to decipher the remaining 8%. Now, a team of nearly 100 scientists from the Telomere-to-Telomere (T2T) Consortium has unveiled the complete human genome — the first time it’s been sequenced entirely, researchers say.

Genome sequencing

Definition and Description: German botanist Hans Winkler coined the word “genome” in 1920, combining the word “gene” with the suffix “-ome,” meaning “complete set,” to describe the full DNA sequence contained within each cell. Researchers still use this word a century later to refer to the genetic material that makes up an organism.

A genome is an anthology containing the DNA instructions for life. It is composed of a vast array of nucleotides (letters) that are packaged into chromosomes (chapters). Each chromosome contains genes (paragraphs), which are regions of DNA that code for the specific proteins that allow an organism to function. Genetic material is made of DNA tightly packaged into chromosomes, and only select regions of the DNA in a genome contain genes coding for proteins.

While every living organism has a genome, the size of that genome varies from species to species. An elephant uses the same form of genetic information as the grass it eats and the bacteria in its gut. But no two genomes look exactly alike. Some are short, like the genome of the insect-dwelling bacterium Nasuia deltocephalinicola, with just 137 genes across 112,000 nucleotides. Some, like the 149 billion nucleotides of the flowering plant Paris japonica, are so long that it’s difficult to get a sense of how many genes are contained within.

The human genome contains roughly 3 billion nucleotides and just under 20,000 protein-coding genes — an estimated 1 per cent of the genome’s total length. The remaining 99 per cent is non-coding DNA sequences that don’t produce proteins. Some are regulatory components that work as a switchboard to control how other genes work. Others are pseudogenes, or genomic relics that have lost their ability to function. And over half of the human genome is repetitive, with multiple copies of near-identical sequences.
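To put these proportions in perspective, the rounded figures quoted above can be turned into a quick back-of-the-envelope calculation. This is only an illustrative sketch: the variable names are made up for this example, and the results are averages implied by the article's round numbers, not measured values.

```python
# Rounded figures quoted in the text (order-of-magnitude only).
HUMAN_GENOME_NT = 3_000_000_000   # roughly 3 billion nucleotides
CODING_PERCENT = 1                # about 1 per cent codes for proteins
HUMAN_GENES = 20_000              # just under 20,000 protein-coding genes

# Integer arithmetic avoids floating-point rounding surprises.
coding_nt = HUMAN_GENOME_NT * CODING_PERCENT // 100
avg_coding_nt_per_gene = coding_nt // HUMAN_GENES
human_nt_per_gene = HUMAN_GENOME_NT // HUMAN_GENES

# Nasuia deltocephalinicola, the small bacterial genome mentioned earlier.
NASUIA_NT = 112_000
NASUIA_GENES = 137
nasuia_nt_per_gene = NASUIA_NT / NASUIA_GENES

print(f"Protein-coding nucleotides in the human genome: about {coding_nt:,}")
print(f"Average coding length per human gene: about {avg_coding_nt_per_gene:,}")
print(f"Genome nucleotides per gene, human: about {human_nt_per_gene:,}")
print(f"Genome nucleotides per gene, Nasuia: about {nasuia_nt_per_gene:.0f}")
```

On these rough numbers, the bacterial genome packs a gene into well under a thousand nucleotides on average, while the human genome spreads each protein-coding gene across a far larger stretch of mostly non-coding DNA.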

What is repetitive DNA?

Satellite DNA:

The simplest form of repetitive DNA consists of blocks of DNA repeated over and over in tandem, called satellites. The amount of satellite DNA in a genome varies from person to person, but satellites often cluster toward the ends of chromosomes in regions called telomeres. These regions protect chromosomes from degrading during DNA replication. Satellites are also found in centromeres, the region of a chromosome that helps keep genetic information intact when cells divide.

Researchers still lack a clear understanding of all the functions of satellite DNA. But because satellite DNA forms unique patterns in each person, forensic biologists and genealogists use this genomic “fingerprint” to match crime scene samples and track ancestry. Over 50 genetic disorders are linked to variations in satellite DNA, including Huntington’s disease.
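The idea of a tandem repeat whose length differs between people can be sketched in a few lines of Python. The example below counts the longest back-to-back run of a short motif in a sequence. The motif CAG is used because expansions of that repeat are associated with Huntington’s disease; the two sample sequences and the function name `longest_tandem_run` are invented for illustration and are not real patient data or a forensic method.

```python
import re

def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of motif in sequence."""
    if not motif:
        raise ValueError("motif must not be empty")
    # One or more consecutive copies of the motif.
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)

# Hypothetical toy sequences: same flanking DNA, different repeat counts.
sample_a = "TTG" + "CAG" * 5 + "CCGCAG"
sample_b = "TTG" + "CAG" * 9 + "CCGCAG"

print(longest_tandem_run(sample_a, "CAG"))
print(longest_tandem_run(sample_b, "CAG"))
```

Because the repeat count differs between the two samples while the surrounding sequence is identical, a comparison of such counts at many sites is, in simplified form, what makes repeat patterns usable as a genomic “fingerprint.”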

[Figure: 46 human chromosomes colored blue, with white telomeres, where satellite DNA tends to cluster.]

Transposable elements:

Transposable elements, sometimes called jumping genes, are DNA sequences that can move from one position in the genome to another. Some scientists have described them as selfish DNA because they can insert themselves anywhere in the genome, regardless of the consequences. As the human genome evolved, many transposable sequences accumulated mutations that suppressed their ability to move, which limits harmful interruptions. But some can likely still move about. For example, transposable element insertions are linked to a number of cases of hemophilia A, a genetic bleeding disorder.

But transposable elements are not just disruptive. They can have regulatory functions that help control the expression of other DNA sequences. When they are concentrated in centromeres, they may also help maintain the integrity of the genes fundamental to cell survival.

They can also contribute to evolution. Researchers recently found that the insertion of a transposable element into a gene important to development might be why some primates, including humans, no longer have tails. Chromosome rearrangements due to transposable elements are even linked to the genesis of new species like the gibbons of southeast Asia and the wallabies of Australia.

When the Human Genome Project first launched in 1990, technological limitations made it impossible to fully uncover repetitive regions in the genome. Available sequencing technology could only read about 500 nucleotides at a time, and these short fragments had to overlap one another in order to recreate the full sequence. Researchers used these overlapping segments to identify the next nucleotides in the sequence, incrementally extending the genome assembly one fragment at a time.
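The overlap-and-extend idea described above can be sketched as a toy greedy assembler. This is a minimal illustration under simplifying assumptions (error-free reads, a known starting fragment, reads of 6 to 7 letters instead of about 500), not the software the Human Genome Project actually used. The function names and the example sequence are hypothetical.

```python
def overlap(left: str, right: str, min_len: int) -> int:
    """Length of the longest suffix of left that equals a prefix of right."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0

def extend_assembly(seed: str, reads: list[str], min_len: int = 3) -> str:
    """Greedily extend seed one fragment at a time using the best overlap."""
    contig = seed
    remaining = list(reads)
    while remaining:
        best = max(remaining, key=lambda r: overlap(contig, r, min_len))
        size = overlap(contig, best, min_len)
        if size == 0:
            break  # no remaining fragment overlaps the end of the contig
        contig += best[size:]  # append only the new, non-overlapping part
        remaining.remove(best)
    return contig

# Hypothetical short fragments that overlap their neighbours by 4 letters.
fragments = ["ATGGCGT", "GCGTACC", "TACCTGA", "CTGAAT"]
assembled = extend_assembly(fragments[0], fragments[1:])
print(assembled)
```

This works only when each overlap is unique. In a repetitive region, many fragments share the same overlap, so a short-read assembler cannot tell which fragment truly comes next, which is why those regions stayed unresolved for so long.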

Since then, sequence patches have filled in gaps in the human genome bit by bit. In 2021, the Telomere-to-Telomere (T2T) Consortium, an international group of scientists working to complete a human genome assembly from end to end, announced that all remaining gaps were finally filled.

[Figure caption: With the completion of the first human genome, researchers are now looking toward capturing the full diversity of humanity.]

Completing the genome was made possible by improved long-read DNA sequencing technology, which can read sequences thousands of nucleotides in length. With more information to situate repetitive sequences within a larger picture, it became easier to identify their proper place in the genome. Like simplifying a 1,000-piece puzzle to a 100-piece puzzle, long-read sequences made it possible to assemble large repetitive regions for the first time. But one complete genome doesn’t capture it all. Efforts continue to create diverse genomic references that fully represent the human population and life on Earth. With more complete, “telomere-to-telomere” genome references, scientists’ understanding of the repetitive dark matter of DNA will become clearer.
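Why longer reads help can be shown with a small toy example. The sketch below builds an invented sequence containing two copies of the same repeat and counts how many places a read could fit. The sequence, the read lengths and the helper name `count_placements` are assumptions made for illustration; real long reads run to thousands of nucleotides.

```python
def count_placements(genome: str, read: str) -> int:
    """Count every position (overlaps included) where read matches genome."""
    count = 0
    start = genome.find(read)
    while start != -1:
        count += 1
        start = genome.find(read, start + 1)
    return count

# Hypothetical genome: two identical repeat blocks between unique flanks.
repeat = "ACGT" * 6
genome = "TTGACC" + repeat + "GGATCA" + repeat + "CTTAGG"

short_read = genome[8:16]   # 8 letters taken from inside the first repeat
long_read = genome[2:34]    # 32 letters: unique flank + whole repeat + unique flank

print(count_placements(genome, short_read))  # more than one possible position
print(count_placements(genome, long_read))   # exactly one possible position
```

The short read fits in several places, so its true origin is ambiguous. The long read carries unique sequence on both sides of the repeat, which anchors it to a single position.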

What is the Human Genome Project?

The Human Genome Project (HGP) was an international, collaborative research program whose goal was the complete mapping and understanding of all the genes of human beings. All of our DNA together, genes included, is known as our “genome.”

Goals:

The main goals of the Human Genome Project were first articulated in 1988 by a special committee of the U.S. National Academy of Sciences, and later adopted through a detailed series of five-year plans jointly written by the National Institutes of Health (NIH) and the Department of Energy (DoE).
Congress funded both the NIH and the DoE to embark on further exploration of this concept, and the two government agencies formalized an agreement by signing a Memorandum of Understanding to “coordinate research and technical activities related to the human genome.”
James Watson was appointed to lead the NIH component, which was dubbed the Office of Human Genome Research. The following year, the Office of Human Genome Research evolved into the National Center for Human Genome Research.
In 1990, the initial planning stage was completed with the publication of a joint research plan, “Understanding Our Genetic Inheritance: The Human Genome Project, The First Five Years, FY 1991-1995.” This initial research plan set out specific goals for the first five years of what was then projected to be a 15-year research effort.
HGP researchers deciphered the human genome in three major ways:
determining the order, or “sequence,” of all the bases in our genome’s DNA;
making maps that show the locations of genes for major sections of all our chromosomes; and
producing what are called linkage maps, through which inherited traits (such as those for genetic disease) can be tracked over generations.

Applications of the Human Genome Project

The sequencing of the human genome holds benefits for many fields, from molecular medicine to human evolution.
It helps in the identification of mutations that are linked to different forms of cancer.
It helps in identifying disease-causing genes in our body.
The sequence of the DNA is stored in databases that are available to anyone on the internet.
The U.S. National Center for Biotechnology Information (NCBI) stores the gene sequence in a database known as GenBank, along with sequences of known and hypothetical genes and proteins.
The fields of genomics and bioinformatics have grown substantially because of this project.
It has improved forensic science, where genetic fingerprinting helps to match a suspect to the biological material found at a crime scene.