Genetic code

Genetic code
• The sequence of nucleotides in deoxyribonucleic acid (DNA) and ribonucleic
acid (RNA) that determines the amino acid sequence of proteins. Though the
linear sequence of nucleotides in DNA contains the information for protein
sequences, proteins are not made directly from DNA. Instead, a messenger RNA
(mRNA) molecule is synthesized from the DNA and directs the formation of the
protein. RNA is composed of four nucleotides: adenine (A), guanine (G),
cytosine (C), and uracil (U).[1]

History of genetic code:
• Adaptor hypothesis:
• The adaptor hypothesis is part of a scheme to explain how
information encoded in DNA is used to specify the amino acid
sequence of proteins. It was formulated by Francis Crick in the mid-
1950s, together with the central dogma of molecular biology and
the sequence hypothesis.[2]

• That such adaptors do exist was discovered by Mahlon
Hoagland and Paul Zamecnik in 1958. These “soluble RNAs” are now
called transfer RNAs and mediate the translation of messenger
RNAs on ribosomes according to the rules contained in the genetic
code. Crick imagined that his adaptors would be small, perhaps 5-10
nucleotides long. In fact, they are much larger, having a more complex
role to play in protein synthesis, and are closer to 100 nucleotides in
length.

Code crackers: How the genetic code was discovered
• To crack the genetic code, researchers needed to figure out how sequences of nucleotides in a
DNA or RNA molecule could encode the sequence of amino acids in a polypeptide. Why was
this a tricky problem?
• In one of the simplest potential codes, each nucleotide in a DNA or RNA molecule might
correspond to one amino acid in a polypeptide. However, this code cannot actually work,
because there are 20 amino acids commonly found in proteins and just 4 nucleotide bases in
DNA or RNA. Thus, researchers knew that the code must involve something more complex than
a one-to-one matching of nucleotides and amino acids.

The triplet hypothesis
• In the mid-1950s, the physicist George Gamow extended this line of
thinking to deduce that the genetic code was likely composed of
triplets of nucleotides. That is, he proposed that a group
of 3 successive nucleotides in a gene might code for one amino acid
in a polypeptide.
• Gamow's reasoning was that even a doublet code (2 nucleotides per
amino acid) would not work, as it would allow for only 16 ordered
groups of nucleotides (4 squared, 4 × 4), too few to account for
the 20 standard amino acids used to build proteins. A code based on
nucleotide triplets, however, seemed promising: it would provide 64
unique sequences of nucleotides (4 cubed, 4 × 4 × 4), more than enough to
cover the 20 amino acids.
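
• The counting argument above can be checked with a few lines of Python. This is only an illustrative sketch: the function names `codeword_count` and `smallest_sufficient_length` are made up for this example and do not come from the cited sources.

```python
# Count the ordered groups of bases available for a code of a given length.
BASES = "ACGU"       # the four RNA bases
AMINO_ACIDS = 20     # standard amino acids used to build proteins


def codeword_count(length: int, alphabet_size: int = len(BASES)) -> int:
    """Number of distinct ordered groups of `length` bases."""
    return alphabet_size ** length


def smallest_sufficient_length(targets: int = AMINO_ACIDS) -> int:
    """Smallest group length that yields at least `targets` codewords."""
    length = 1
    while codeword_count(length) < targets:
        length += 1
    return length


for n in (1, 2, 3):
    print(n, codeword_count(n))        # 1 4, then 2 16, then 3 64
print(smallest_sufficient_length())    # 3
```

A singlet code (4 words) and a doublet code (16 words) both fall short of 20, so a triplet is the shortest fixed-length code that can cover all amino acids.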

Nirenberg, Khorana, and the identification of codons
• Gamow’s triplet hypothesis seemed logical and was widely accepted.
However, it had not been experimentally proven, and researchers still
did not know which triplets of nucleotides corresponded to which
amino acids.
• The cracking of the genetic code began in 1961, with work from the
American biochemist Marshall Nirenberg. For the first time,
Nirenberg and his colleagues were able to identify specific nucleotide
triplets that corresponded to particular amino acids. Their success
relied on two experimental innovations:

• A way to make artificial mRNA molecules with specific, known
sequences.
• A system to translate mRNAs into polypeptides outside of a cell (a
"cell-free" system). Nirenberg's system consisted of cytoplasm from
burst E. coli cells, which contains all of the materials needed for
translation.

• First, Nirenberg synthesized an mRNA molecule consisting only of the
nucleotide uracil (called poly-U). When he added poly-U mRNA to the
cell-free system, he found that the polypeptides made consisted
exclusively of the amino acid phenylalanine. Because the only triplet
in poly-U mRNA is UUU, Nirenberg concluded that UUU might code
for phenylalanine. Using the same approach, he was able to show
that poly-C mRNA was translated into polypeptides made exclusively
of the amino acid proline, suggesting that the triplet CCC might code
for proline.

• Other researchers, such as the biochemist Har Gobind Khorana at
the University of Wisconsin, extended Nirenberg's experiment by
synthesizing artificial mRNAs with more complex sequences. For
instance, in one experiment, Khorana generated a poly-UC
(UCUCUCUCUC…) mRNA and added it to a cell-free system similar to
Nirenberg's. The poly-UC mRNA was translated into
polypeptides with an alternating pattern of serine and leucine amino
acids. These and other results unambiguously confirmed that the
genetic code was based on triplets, or codons. Today, we know that
serine is encoded by the codon UCU, while leucine is encoded by
CUC.
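
• The poly-U, poly-C and poly-UC results can be reproduced on paper with a small translation sketch. The standard codon table is built here from a compact 64-letter string (one-letter amino acid codes, `*` for stop); the helper name `translate` is an assumption made for this example, not part of any particular library.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str, frame: int = 0) -> str:
    """Read successive triplets starting at `frame`; ignore a trailing partial codon."""
    codons = [mrna[i:i + 3] for i in range(frame, len(mrna) - 2, 3)]
    return "".join(CODON_TABLE[c] for c in codons)


print(translate("U" * 9))        # FFF  (phenylalanine only)
print(translate("C" * 9))        # PPP  (proline only)

poly_uc = "UC" * 6               # UCUCUCUCUCUC
for frame in range(3):
    print(frame, translate(poly_uc, frame))   # 0 SLSL, 1 LSL, 2 SLS
```

Whatever the starting position, poly-UC is read as alternating UCU (serine, S) and CUC (leucine, L). A doublet code would instead read the same repeated pair every time and give a single repeated amino acid, which is why the alternating product supports a triplet code.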

• By 1965, using the cell-free system and other techniques, Nirenberg,
Khorana, and their colleagues had deciphered the entire genetic
code. That is, they had identified the amino acid or "stop" signal
corresponding to each one of the 64 nucleotide codons. For their
contributions, Nirenberg and Khorana (along with another genetic
code researcher, Robert Holley) received the Nobel Prize in 1968.[3]

Characteristics of genetic code:
• Triplet nature:
A triplet code provides 64 different combinations (4 × 4 × 4), plenty of
information in the DNA molecule to specify the placement of all 20
amino acids. When experiments were performed to crack the genetic
code, it was found to be a triplet code. These three-letter
codes of nucleotides (AUG, AAA, etc.) are called codons.
• Degeneracy
The code is degenerate, which means that the same amino acid can be
coded by more than one base triplet. For example, the three amino
acids arginine, serine and leucine each have six synonymous codons.

• Nonoverlapping
The genetic code is nonoverlapping, i.e., the adjacent codons do not
overlap. A nonoverlapping code means that the same letter is not
used for two different codons. In other words, no single base can take
part in the formation of more than one codon.
• Commaless
There is no signal to indicate the end of one codon and the beginning
of the next. The genetic code is commaless (or comma-free).

• Non-ambiguity
A particular codon will always code for the same amino acid. While
the same amino acid can be coded by more than one codon (the code
is degenerate), the same codon shall not code for two or more
different amino acids (non-ambiguous).
• Universality
Although the code was worked out in the bacterium
Escherichia coli, it is valid for other organisms. This important
characteristic of the genetic code is called its universality. It means
that the same sequences of 3 bases encode the same amino acids in
nearly all life forms, from simple microorganisms to complex, multicelled
organisms such as human beings. A few exceptions are known, for example
in mitochondrial genomes.

• Polarity
The genetic code has polarity, that is, the code is always read in a
fixed direction, i.e., in the 5′ → 3′ direction.[4]
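
• Degeneracy and non-ambiguity can both be seen by counting codons per amino acid in the standard codon table. The sketch below builds the table from a compact 64-letter string (one-letter amino acid codes, `*` for stop); the variable names are chosen for this example only.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Non-ambiguity: a dictionary maps each codon to exactly one meaning.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Degeneracy: several codons can share the same amino acid.
codons_per_aa = Counter(CODON_TABLE.values())

six_fold = sorted(aa for aa, n in codons_per_aa.items() if n == 6)
single = sorted(aa for aa, n in codons_per_aa.items() if n == 1)

print(len(CODON_TABLE))        # 64 codons in total
print(six_fold)                # ['L', 'R', 'S']: leucine, arginine, serine
print(single)                  # ['M', 'W']: methionine, tryptophan
print(codons_per_aa["*"])      # 3 stop codons
```

Only methionine and tryptophan have a single codon; every other amino acid has between two and six.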

Features of genetic code
• Codon:
• Three adjacent nucleotides constitute a unit known as the codon,
which codes for an amino acid. For example, the sequence AUG is a
codon that specifies the amino acid methionine. There are 64
possible codons, three of which do not code for amino acids but
indicate the end of a protein. The remaining 61 codons specify the 20
amino acids that make up proteins. The AUG codon, in addition to
coding for methionine, usually marks the start of the protein-coding
region of an mRNA and indicates the start of a protein.

• Anti-codon:
• Anticodons are sequences of nucleotides that are complementary to
codons. They are found in tRNAs, and allow the tRNAs to bring the
correct amino acid in line with an mRNA during protein production.
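
• Because codon and anticodon pair in an antiparallel fashion, the anticodon is the reverse complement of the codon when both are written 5′ → 3′. A minimal sketch, assuming strict Watson-Crick pairing only (the function name `anticodon` is made up for this example):

```python
# Watson-Crick pairing partners in RNA.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon(codon: str) -> str:
    """Return the strictly complementary anticodon, written 5'->3'."""
    # The two strands run in opposite directions, so complement each base
    # and reverse the order to keep the 5'->3' convention.
    return "".join(PAIR[base] for base in reversed(codon))


print(anticodon("AUG"))   # CAU
print(anticodon("UUU"))   # AAA
```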

Chain initiation codon / start codon:
• A triplet of bases in mRNA with the sequence AUG (less commonly,
GUG) that acts as a "start" signal for translation, specifying the first
amino acid at the N-terminus of the polypeptide chain.

Alternate start codon:
• Alternative start codons differ from the standard AUG codon and
are found in both prokaryotes (bacteria) and eukaryotes.
• In eukaryotes, mammalian cells can initiate translation with leucine using a
specific leucyl-tRNA that decodes the codon CUG. Candida albicans uses a
CAG start codon.
• Mitochondrial genomes use alternate start codons more extensively (AUA
and AUU in humans).
• Prokaryotes use alternate start codons significantly, mainly GUG and UUG.
• In E. coli, AUG accounts for 83% of start codons, GUG for 14%, and UUG
for 3%. [5]

Chain Termination Codons/ stop codons:
• The 3 triplets UAA, UAG, UGA do not code for any amino acid. They
were originally described as non-sense codons, in contrast to the
remaining 61 codons, which are termed sense codons.
• During protein synthesis, STOP codons cause the release of the new
polypeptide chain from the ribosome. This occurs because there are
no tRNAs with anticodons complementary to the STOP codons; protein
release factors recognize them instead.[5]
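
• Start and stop codons together delimit the stretch of mRNA that is translated. The sketch below is a deliberate simplification: it assumes translation begins at the first AUG in the sequence (ignoring alternate start codons and any cellular context) and ends at the first in-frame UAA, UAG or UGA. The function name `translate_orf` is hypothetical.

```python
from itertools import product

# Standard genetic code; "*" marks the three stop codons.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_orf(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first in-frame stop."""
    start = mrna.find("AUG")
    if start == -1:
        return ""                      # no start codon, nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":          # UAA, UAG or UGA releases the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate_orf("GGAUGUUUCCCUAAGG"))   # MFP
```

In the example, the leading GG is skipped, AUG UUU CCC is read as Met-Phe-Pro, and UAA ends the chain before the trailing GG.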

What is overlapping and non-overlapping in the genetic code?
• The genetic code is non-overlapping; for example in a sequence of
ABCDEF, ABC would code the first amino acid and DEF the second
whereas in an overlapping code ABC could code for the first amino
acid and BCD the second. The genetic code has no internal
punctuation (like commas and semi-colons) such as having X in
between each codon like XABCXDEFX... since it is read sequentially
from a starting point (however it could be argued that the so called
"stop" codons function as "periods" during translation). Therefore, a
deletion or insertion mutation that does not occur in a multiple of
three results in a frame shift mutation. The genetic code is composed
of nucleotide triplets. In other words, three nucleotides in mRNA (a
codon) specify one amino acid in a protein.

• The code is non-overlapping. This means that successive triplets are read in
order. Each nucleotide is part of only one triplet codon.
• Another example is as follows (the codon being read is marked in brackets):
• Now consider this short sequence of DNA:
• AATGCT
• The first codon in the sequence is
• [AAT]GCT
• If the code were overlapping, part of one codon would also be present in the
next codon. If that were the case, then the next codon would be
• A[ATG]CT
• In this scenario, the bases AT are present in both codons, hence the
name overlapping genetic code. However, the genetic code is non-overlapping,
meaning the bases present in one codon are not present in
adjacent codons. Hence the next codon in a non-overlapping code would be
• AAT[GCT] [6]
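
• The difference between the two reading schemes for AATGCT comes down to how far the reading window moves between codons. A small sketch (the function name `split_codons` and its `step` parameter are made up for this example):

```python
def split_codons(seq: str, step: int) -> list[str]:
    """Cut `seq` into triplets, advancing `step` bases between codons."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, step)]


dna = "AATGCT"
print(split_codons(dna, step=3))   # ['AAT', 'GCT']                non-overlapping
print(split_codons(dna, step=1))   # ['AAT', 'ATG', 'TGC', 'GCT']  fully overlapping
```

With a step of 3, every base belongs to exactly one codon, which is how the real genetic code is read. With a step of 1, the bases AT are shared by the first two codons.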

Reading frame
• The start codon is critical because it determines where translation will
begin on the mRNA. Most importantly, the position of the start codon
determines the reading frame, or how the mRNA sequence is
divided up into groups of three nucleotides inside the ribosome. As
shown in the diagram below, the same sequence of nucleotides can
encode completely different polypeptides depending on the frame in
which it's read. The start codon determines which frame is chosen
and thus ensures that the correct polypeptide is produced.
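
• The effect of the reading frame can be illustrated by translating one sequence from each of its three possible starting positions. This is a sketch only: the example mRNA is invented, and the helper `translate` is a name chosen for this illustration.

```python
from itertools import product

# Standard genetic code; "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str, frame: int) -> str:
    """Translate every complete triplet starting at offset `frame` (0, 1 or 2)."""
    return "".join(
        CODON_TABLE[mrna[i:i + 3]] for i in range(frame, len(mrna) - 2, 3)
    )


mrna = "AUGGCCAAGUAA"          # invented example sequence
for frame in range(3):
    print(frame, translate(mrna, frame))
# 0 MAK*   (AUG GCC AAG UAA)
# 1 WPS    (UGG CCA AGU)
# 2 GQV    (GGC CAA GUA)
```

Only frame 0, the one set by the AUG start codon, gives Met-Ala-Lys followed by a stop; the other two frames yield unrelated amino acids.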

Wobble hypothesis
• Most amino acids are specified by more than one codon. This is called
degeneracy of the genetic code.
• To explain the possible cause of degeneracy of codons, in 1966,
Francis Crick proposed “the Wobble hypothesis”.
• According to this hypothesis, only the first two bases of the codon
have a precise pairing with the bases of the anticodon of tRNA,
while the pairing between the third base of the codon and the anticodon
may wobble (wobble means to sway or move unsteadily).
• The phenomenon permits a single tRNA to recognize more than one
codon. Therefore, although there are 61 codons for amino acids, the
number of tRNAs is far smaller (around 40), which is due to wobbling.

Wobble hypothesis :
• The wobble hypothesis states that the base at the 5′ end of the anticodon is not
as spatially confined as the other two bases, allowing it to form hydrogen bonds
with any of several bases located at the 3′ end of a codon. This leads to the
following conclusions:
• The first two bases of the codon make normal (canonical) H-bond pairs with
the 3rd and 2nd bases of the anticodon, respectively.
• At the remaining position, less stringent rules apply and non-canonical pairing
may occur. The wobble hypothesis thus proposes a more flexible set of base-
pairing rules at the third position of the codon.
• The relaxed base-pairing requirement, or “wobble,” allows the anticodon of a
single form of tRNA to pair with more than one triplet in mRNA.
• The rules: a first anticodon base U can recognize A or G in the codon, a first
base G can recognize U or C, and a first base I (inosine) can recognize U, C or A.

• Crick’s hypothesis hence predicts that the initial two ribonucleotides
of triplet codes are often more critical than the third member in
attracting the correct tRNA.

• A wobble base pair is a pairing between
two nucleotides in RNA molecules that does not follow Watson-
Crick base pair rules.
• The four main wobble base pairs are guanine-uracil (G-U), hypoxanthine-
uracil (I-U), hypoxanthine-adenine (I-A), and hypoxanthine-cytosine (I-C).
• In order to maintain consistency of nucleic acid nomenclature, “I” is used
for hypoxanthine because hypoxanthine is the nucleobase of inosine.
• Inosine displays the true qualities of wobble, in that if that is the first
nucleotide in the anticodon then any of three bases in the original codon
can be matched with the tRNA.
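
• The wobble rules can be written as a lookup table that lists which codons a given anticodon can read. In this sketch the entries for U, G and I follow the rules stated above; the entries for C and A (ordinary Watson-Crick pairing only) and the example anticodons are assumptions added for illustration, as is the function name `codons_read`.

```python
# Strict Watson-Crick partners, used for anticodon positions 2 and 3.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Codon third-position bases accepted by each anticodon first (5') base.
WOBBLE = {
    "U": "AG",    # U recognizes A or G
    "G": "UC",    # G recognizes U or C
    "I": "UCA",   # inosine recognizes U, C or A
    "C": "G",     # assumed: Watson-Crick only
    "A": "U",     # assumed: Watson-Crick only
}


def codons_read(anticodon: str) -> list[str]:
    """List the codons (5'->3') that an anticodon (5'->3') can pair with."""
    first, second, third = anticodon
    # Pairing is antiparallel: anticodon base 3 pairs with codon base 1,
    # anticodon base 2 with codon base 2, both strictly.
    prefix = PAIR[third] + PAIR[second]
    # Anticodon base 1 pairs with codon base 3 under the relaxed rules.
    return [prefix + base for base in WOBBLE[first]]


print(codons_read("IGC"))   # ['GCU', 'GCC', 'GCA']
print(codons_read("GAA"))   # ['UUU', 'UUC']
print(codons_read("CAU"))   # ['AUG']
```

A single anticodon beginning with inosine covers three codons, which is how roughly 40 tRNAs can serve all 61 sense codons.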

Significance of the Wobble Hypothesis
• Cells have a limited number of tRNAs, and wobble allows for broad
specificity.
• Wobble base pairs have been shown to facilitate many biological
functions, most clearly proven in the bacterium Escherichia coli.
• The thermodynamic stability of a wobble base pair is comparable to that
of a Watson-Crick base pair.
• Wobble base pairs are fundamental in RNA secondary structure and are
critical for the proper translation of the genetic code.
• Wobbling allows faster dissociation of tRNA from mRNA and therefore faster
protein synthesis.
• The existence of wobble minimizes the damage that can be caused by a
misreading of the code; for example, if the Leu codon CUU were misread
CUC or CUA or CUG during transcription of mRNA, the codon would still be
translated as Leu during protein synthesis.[7]

Function of Genetic Code:
• The genetic code allows cells to contain a mind-boggling amount of information.
• Consider this: a microscopic fertilized egg cell, following the instructions
contained in its genetic code, can produce a human or elephant which even has
similar personality and behaviors to those of its parents. There is a lot of
information in there!
• The development of the genetic code was vital because it allowed living things to
reliably produce products necessary for their survival – and pass instructions for
how to do the same onto the next generation.
• During DNA replication, specific base pairing ensures that the new partner
strand will contain the same sequence of base pairs, the same “code”, as the old
partner strand. Each resulting double helix contains one strand of old DNA paired
with one strand of new DNA.
• Information contained in the DNA is transformed into all of the materials of life,
using the genetic code. [8]

Types of Genetic Mutations
• A mutation, which may arise during replication and/or recombination,
is a permanent change in the nucleotide sequence of DNA. Damaged
DNA can be mutated either by substitution, deletion or insertion of
base pairs
• Point mutation: A point mutation is a type of mutation in DNA or
RNA, the cell’s genetic material, in which a single nucleotide base is
added, deleted or changed.

Types of point mutation :
• Substitution
• A substitution mutation occurs when one base pair is substituted for
another. For example, this would occur when one nucleotide
containing cytosine is accidentally substituted for one containing
guanine.
• Transition: this occurs when a purine is substituted with another
purine or when a pyrimidine is substituted with another pyrimidine.
• Transversion: when a purine is substituted for a pyrimidine or a
pyrimidine replaces a purine.
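
• Whether a substitution is a transition or a transversion depends only on the chemical class of the two bases involved. A short sketch (the function name `substitution_type` is made up for this example):

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}


def substitution_type(original: str, replacement: str) -> str:
    """Classify a single-base substitution as a transition or a transversion."""
    if original == replacement:
        raise ValueError("no substitution: both bases are the same")
    pair = {original, replacement}
    # Same chemical class on both sides means a transition.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


print(substitution_type("A", "G"))   # transition   (purine to purine)
print(substitution_type("C", "T"))   # transition   (pyrimidine to pyrimidine)
print(substitution_type("C", "G"))   # transversion (pyrimidine to purine)
```

The cytosine-to-guanine example given above is therefore a transversion.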

• There are three types of substitution mutations:
• Nonsense
• Missense
• Silent

• A nonsense mutation occurs when one nucleotide is substituted and
this leads to the formation of a stop codon instead of a codon that
codes for an amino acid. A stop codon is a certain sequence of bases
(TAG, TAA, or TGA in DNA, and UAG, UAA, or UGA in RNA) that stops
the production of the amino acid chain. It is normally found at the end
of the coding sequence of an mRNA, but if a
substitution causes it to appear in another place, it will prematurely
terminate the amino acid sequence and prevent the correct protein
from being produced.

• Missense mutation:
• A missense mutation occurs when one nucleotide is substituted and a
different codon is formed; but this time, the codon that forms is not a stop
codon. Instead, the codon specifies a different amino acid in the sequence of
amino acids. For example, if a missense substitution changes a codon
from AAG to AGG, the amino acid arginine will be produced instead of
lysine. A missense mutation is considered conservative if the amino
acid formed via the mutation has similar properties to the one that
was supposed to be formed instead. It is called non-conservative if
the amino acid has different properties that alter the structure and
function of the protein.

• In a silent mutation, a nucleotide is substituted but the same amino
acid is produced anyway. This can occur because multiple codons can
code for the same amino acid. For example, AAG and AAA both code
for lysine, so if the G is changed to an A, the same amino acid will
form and the protein will not be affected.
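
• The three kinds of substitution can be told apart by comparing what the codon meant before and after the change. The sketch below assumes the original codon is a sense codon; the function name `classify_substitution` is made up for this example.

```python
from itertools import product

# Standard genetic code; "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_substitution(before: str, after: str) -> str:
    """Classify a codon change as silent, nonsense or missense."""
    old, new = CODON_TABLE[before], CODON_TABLE[after]
    if new == old:
        return "silent"       # same amino acid, protein unchanged
    if new == "*":
        return "nonsense"     # premature stop codon
    return "missense"         # a different amino acid is inserted


print(classify_substitution("AAG", "AAA"))   # silent   (lysine -> lysine)
print(classify_substitution("AAG", "AGG"))   # missense (lysine -> arginine)
print(classify_substitution("AAG", "UAG"))   # nonsense (lysine -> stop)
```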

Insertion and Deletion
An insertion mutation occurs when an extra base pair is added to a
sequence of bases. A deletion mutation is the opposite; it occurs when a
base pair is deleted from a sequence. These two types of point mutations are
grouped together because both of them can drastically affect the sequence
of amino acids produced. With one or two bases added or deleted, all of the
three-base codons downstream of the change are altered. This is called a
frameshift mutation. For example, if a sequence of codons in DNA is normally
CCT ATG TTT and an extra A is added between the two cytosine bases, the
sequence will instead read CAC TAT GTT T. This completely changes the amino
acids that would be produced, which in turn changes the structure and function
of the resulting protein and can render it useless. Similarly, if one base were
deleted, the sequence would also shift.[9]
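
• The CCT ATG TTT example can be replayed in code to see how a single inserted or deleted base regroups every codon after it. A minimal sketch (the helper name `codons` is made up for this example):

```python
def codons(seq: str) -> list[str]:
    """Split a sequence into successive triplets; the last group may be short."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


original = "CCTATGTTT"
inserted = original[:1] + "A" + original[1:]   # extra A between the two Cs
deleted = original[:1] + original[2:]          # second C removed

print(codons(original))   # ['CCT', 'ATG', 'TTT']
print(codons(inserted))   # ['CAC', 'TAT', 'GTT', 'T']
print(codons(deleted))    # ['CTA', 'TGT', 'TT']
```

Inserting or deleting three bases at once would add or remove a whole codon but leave the remaining codons in their original frame.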

• Errors in DNA Replication
• On very rare occasions, DNA polymerase will incorporate a
noncomplementary base into the daughter strand. During the next
round of replication, the misincorporated base would lead to a
mutation. This, however, is very rare, as the exonuclease activity of the
polymerase functions as a proofreading mechanism, recognizing mismatched
base pairs and excising them.
• Errors in DNA Recombination
• DNA often rearranges itself by a process called recombination, which
proceeds via a variety of mechanisms. Occasionally DNA is lost during
recombination, leading to a mutation.

• Chemical Damage to DNA
• Many chemical mutagens, some exogenous, some man-made, some
environmental, are capable of damaging DNA. Many
chemotherapeutic drugs and intercalating agent drugs function by
damaging DNA.
• Radiation
• Gamma rays, X-rays, and even UV light can interact with compounds in
the cell, generating free radicals which cause chemical damage to
DNA.[10]
