1. DNA Polymorphisms
in the Human Genome.
Mutations and Genetic
Diseases.
Marie Kopecká 2007/2008 mkopecka@med.muni.cz
2. • GENOME
• All of the genetic material of a cell
or of an individual.
3. The Human Genome
• Nuclear genome.
• Mitochondrial genome.*
This lecture concerns the nuclear genome only,
for which the term "human genome" is used.
* 1994
4. • POLYMORPHISM, GENETIC (Ford, 1940)
• Existence of more than one normal allele
at a gene locus
• with the rarest allele exceeding a
frequency of 1%.
• A polymorphism may exist at several
levels, i.e., variants in DNA sequence,
• amino acid sequence,
• chromosomal structure, or
• phenotypic traits.
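The 1% criterion above can be checked on genotype data. The following is a minimal sketch, not a standard tool: the function names and the sample of 100 individuals are made up for illustration, and the test follows the slide's wording ("rarest allele exceeding 1%").

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Return {allele: frequency} from diploid genotypes such as ("A", "G")."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def is_polymorphic(genotypes, threshold=0.01):
    """True if the locus has more than one allele and the rarest one exceeds the threshold."""
    freqs = allele_frequencies(genotypes)
    return len(freqs) > 1 and min(freqs.values()) > threshold

# Hypothetical sample: 100 individuals (200 alleles) typed at one locus.
sample = [("A", "A")] * 95 + [("A", "G")] * 4 + [("G", "G")] * 1
print(allele_frequencies(sample))
print(is_polymorphic(sample))
```

In this invented sample the G allele occurs 6 times among 200 alleles, so the locus passes the 1% threshold.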
5. • DNA POLYMORPHISM
• A difference in DNA sequence among
individuals, groups, or populations.
• Sources include single nucleotide
polymorphisms (SNPs),
• sequence repeats,
• insertions,
• deletions and
• inversions.
6. 1866 Johann Gregor Mendel: genes (elements)
1869 Fritz Miescher: DNA (nuclein)
1882 Walther Flemming: chromosomes in mitosis
1911 Thomas Hunt Morgan: genes in chromosomes
1944 Oswald Avery, Colin MacLeod, Maclyn McCarty:
DNA is carrier of genetic information in bacteria
1952 Al Hershey, Martha Chase:
DNA is carrier of genetic information in viruses
1953 James Watson, Francis Crick: double-helix of DNA
1956 Tjio and Levan: human karyotype
1970 Hamilton Smith: restriction endonuclease HindII
1977 Fred Sanger / Allan Maxam, Walter Gilbert: DNA sequencing
1983 Kary Mullis: PCR
2001 First "draft" of human genome:
Length of DNA = 3.2 x 10^9 bp. Gene number = 26 383.
Nature 409: 860-921, (2001) - 15. 2. 2001 - HUGO,
Science 291: 1304, (2001)- 16. 2. 2001 - Celera Genomics
2006 Sharp, Cheng, Eichler, ARGHG 7 (2006): human genome 24 652 genes
7. The Human Genome:
First estimation of gene number?
• 1980 Walter Gilbert:
• The human genome has about
3 000 000 000 bp.
• The gene size is about 30 000 bp.
• How many genes in human DNA?
• 3 000 000 000 : 30 000 = 100 000
• The human genome contains about
100 000 genes !
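Gilbert's estimate is a single division, reproduced below as a small sketch. The variable names are chosen here for illustration; the 2001 gene count is the figure quoted on the timeline slide. The estimate treats the whole genome as if it were tiled with average-sized genes, which is one reason it came out so high.

```python
# Figures from the slide (Walter Gilbert, 1980).
genome_size_bp = 3_000_000_000
assumed_gene_size_bp = 30_000

# Estimate: genome length divided by the assumed average gene length.
estimated_genes = genome_size_bp // assumed_gene_size_bp
print(f"Estimated gene number: {estimated_genes:,}")

# Comparison with the 2001 draft figure quoted earlier in the lecture.
draft_2001_genes = 26_383
print(f"Overestimate factor: {estimated_genes / draft_2001_genes:.1f}x")
```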
8. The Human Genome (2001):
comparison with other genomes
Yeast 6 200 genes
Drosophila 14 000 genes
C. elegans 19 000 genes
Arabidopsis 27 000 genes
Homo sapiens ~ 25 000 genes
9. HUMAN GENOME – Important data:
1A. Length of DNA: 2.9 x 10^9 bp (Campbell, Reece 2005)
Number of genes coding proteins: 25 000
1B. Length of DNA: 3.2 x 10^9 bp (Alberts et al. 2004)
Number of genes: 31 000
2. DNA (Alberts et al. 2004)
A. UNIQUE DNA: (39%)
CODING SEQUENCES 1.5% (Exon regions of genes coding
for proteins, rRNAs and tRNAs)
Introns and regulatory sequences (~24%)
Unique non-coding DNA (~15%)
B. REPETITIVE DNA (53%)
LINEs 21%
SINEs 13%
Retroviral-like elements 8%
DNA-only transposon "fossils" 3%
Segmental duplications 3%
Simple sequence repeats 5%
C. HETEROCHROMATIN (8%) -
10. Repetitive sequences at centromeric and telomeric
regions and interspersed among the genes.
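The percentages quoted from Alberts et al. (2004) can be turned into approximate base-pair counts. The sketch below only re-adds the slide's own numbers; the dictionary layout and the helper `to_bp` are illustrative choices, and the "~" values are approximations, so the unique-DNA components need not add up exactly to the stated 39%.

```python
GENOME_BP = 3_200_000_000  # length quoted on the slide (Alberts et al. 2004)

# Percentages of the genome as listed on the slide.
composition_percent = {
    "unique": {
        "coding sequences": 1.5,
        "introns and regulatory sequences": 24,
        "unique non-coding DNA": 15,
    },
    "repetitive": {
        "LINEs": 21,
        "SINEs": 13,
        "retroviral-like elements": 8,
        "DNA-only transposon fossils": 3,
        "segmental duplications": 3,
        "simple sequence repeats": 5,
    },
}
stated_totals = {"unique": 39, "repetitive": 53, "heterochromatin": 8}

def to_bp(percent, genome_bp=GENOME_BP):
    """Convert a percentage of the genome into base pairs."""
    return round(genome_bp * percent / 100)

subtotals = {name: sum(parts.values()) for name, parts in composition_percent.items()}
for name, subtotal in subtotals.items():
    print(f"{name}: components sum to {subtotal}%, slide states {stated_totals[name]}%")
print(f"Coding sequences: about {to_bp(1.5):,} bp")
```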
12. Human genome
• As one commentator described our genome:
„In some ways our genome may resemble
your garage/ bedroom/ refrigerator/ life:
highly individualistic, but unkempt; little
evidence of organization; much accumulated
„junk“; virtually nothing ever discarded; and
the few patently valuable items
indiscriminately, apparently carelessly,
scattered throughout.”
13. Genetic Variations Within the Human Genome
With the exception of identical twins, no two people have the
exact same genome.
When the same region of the genome from two different
humans is compared, the nucleotide sequences typically
differ by about 0.1 %.
Given the size of the human genome
(~3 000 000 000 bp), that amounts to ~3 000 000
genetic differences in each maternal and paternal
chromosome set between one person and the next.
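The 0.1% figure can be illustrated by counting mismatches between two aligned sequences and scaling up to the genome. This is a toy sketch: the two 1000 bp sequences are invented, and `count_differences` is a hypothetical helper that assumes the sequences are already aligned and of equal length.

```python
def count_differences(seq_a, seq_b):
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))

# Invented 1000 bp region from two people, differing at a single position.
person_1 = "ACGT" * 250
person_2 = person_1[:499] + "C" + person_1[500:]  # T -> C at index 499

differences = count_differences(person_1, person_2)
percent = 100 * differences / len(person_1)
print(f"{differences} difference in {len(person_1)} bp = {percent}%")

# Scaling the same rate to one haploid chromosome set.
genome_size_bp = 3_000_000_000
expected = genome_size_bp * differences // len(person_1)
print(f"Expected differences per chromosome set: {expected:,}")
```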
14. DNA POLYMORPHISM / GENE MUTATION
DNA POLYMORPHISM (D.P.)=
a difference in DNA sequence among individuals, groups, or populations.
D.P. include SNPs, sequence repeats, insertions, deletions and recombination.
Example: blue eyes versus brown eyes; straight hair versus curly hair.
If a difference in DNA sequence is associated with disease, it is usually
called a genetic mutation.
Changes in DNA caused by external agents (mutagens) are also called
„mutations” rather than „polymorphisms”.
GENE MUTATION:
A change in the nucleotide sequence of a DNA molecule. Genetic mutations
are a kind of genetic polymorphism.
The term „mutation” as opposed to „polymorphism” is generally used to
refer to changes in DNA sequence which are not present in most
individuals of a species and either have been associated with disease (or
risk of disease) or have resulted from damage inflicted by external
agents (viruses, radiation, chemical mutagens.)
15. SINGLE NUCLEOTIDE POLYMORPHISM (SNP)
• SINGLE NUCLEOTIDE POLYMORPHISM
is a source of variance in a genome.
• A SNP is a single base variation in DNA.
• SNPs are the simplest form and
• the most common source of genetic polymorphism in the human
genome = 90% of all human DNA polymorphisms.
• The bulk of this variation was inherited from our ancestors.
16. SNPs (cont.)
Frequency:
1/1000 base pairs between two equivalent chromosomes.
Distribution:
SNPs are not uniformly distributed over the entire human genome,
however, they can be found in non-coding and coding regions.
Coding regions:
SNPs in a coding region may have two different effects on the resulting protein:
Synonymous: the substitution causes no amino acid change = silent mutation.
Non-synonymous: the substitution results in an alteration of the encoded amino
acid:
-a missense mutation changes the codon so that a different amino acid is built into the protein;
-a nonsense mutation results in a misplaced termination codon.
Regulatory regions: SNPs may change the amount of or timing of protein
production.
Non-coding regions: In healthy people, DNA polymorphisms probably occur
predominantly in non-coding DNA sequences or in DNA regions that do not
influence the function of coded proteins.
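The coding-region categories above (synonymous, missense, nonsense) can be sketched as a small classifier. This assumes the standard genetic code and DNA codons read on the coding strand; the function name `classify_snp` and the "stop-lost" label, which the slide does not cover, are illustrative additions.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position; "*" = stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_snp(ref_codon, alt_codon):
    """Classify a single-base change between two coding-strand DNA codons."""
    if len(ref_codon) != 3 or len(alt_codon) != 3:
        raise ValueError("codons must be three bases long")
    if sum(a != b for a, b in zip(ref_codon, alt_codon)) != 1:
        raise ValueError("codons must differ at exactly one position")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    if ref_aa == "*":
        return "stop-lost"  # not discussed on the slide
    return "missense"

print(classify_snp("GAA", "GAG"))  # Glu -> Glu
print(classify_snp("GAG", "GTG"))  # Glu -> Val
print(classify_snp("TAC", "TAA"))  # Tyr -> stop
```

The second call corresponds to a Glu to Val substitution, the kind of change seen in sickle-cell haemoglobin.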
17. SNPs
These polymorphisms are scattered throughout the genome.
More than 90% of all human genes contain at least one SNP.
Because SNPs are present at such a high density, they are
useful markers for identifying a mutated gene.
Most SNPs and other variations are genetically silent:
they affect DNA sequences in noncritical regions of the
genome. These SNPs have no effect on how we look or how our
cells function.
• However, a tiny subset of SNPs is presumably responsible
for nearly all heritable aspects of human individuality. A major
task is to learn to recognize those relatively few variations
that are functionally important against the large background
of neutral variation that distinguishes the genomes of any
two human beings.
18. DNA Polymorphisms in the Human Genome
Examples:
• Single nucleotide polymorphisms (SNPs)
• Variable Number of Tandem Repeats
(VNTR)
20. [Figure. Top left: SNP, recognition sites for one restriction endonuclease in homologous chromosomes (recognition site for restriction enzyme present / absent; gene probe). Bottom left: VNTR (CACACACA…), variability in the number of repetitive sequences. Right: electrophoresis of restriction fragments of DNA and Southern blot.]
Two types of DNA polymorphism: Single Nucleotide Polymorphism (SNP) (top
left) and Variable Number of Tandem Repeats (VNTR) (bottom left), both
resulting in Restriction Fragment Length Polymorphism (RFLP) (right).
Polymorphisms are detectable by Southern blotting, PCR, DNA sequencing, microarrays.
21. • How to use SNP or VNTR ?
• SNP and VNTR change the length of
restriction fragments of DNA (RFLP).
• It is possible to use RFLP in indirect
DNA diagnosis
(see Lab book, p. 114)
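How a SNP or a VNTR changes restriction fragment lengths can be sketched with a simulated digest. The sequences below are invented, only one strand of linear DNA is modelled, and `digest` is a hypothetical helper; the recognition site GAATTC, cut after the first G, is that of EcoRI.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting seq at every occurrence of site."""
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)  # cut position inside the site
        start = seq.find(site, start + 1)
    edges = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(edges, edges[1:])]

# SNP: allele 2 carries a single A -> G change that destroys the second site.
allele_1 = "A" * 20 + "GAATTC" + "A" * 30 + "GAATTC" + "A" * 10
allele_2 = allele_1[:58] + "G" + allele_1[59:]  # GAATTC -> GAGTTC
print("SNP allele 1:", digest(allele_1))
print("SNP allele 2:", digest(allele_2))

# VNTR: the number of CA repeats between two sites sets the fragment length.
def vntr_allele(repeats):
    """Build an invented allele with the given number of CA repeats."""
    return "T" * 10 + "GAATTC" + "CA" * repeats + "GAATTC" + "T" * 10

print("VNTR, 5 repeats:", digest(vntr_allele(5)))
print("VNTR, 8 repeats:", digest(vntr_allele(8)))
```

In both cases the alleles yield fragments of different length, which is what separates them on the gel before Southern blotting.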
22. Direct DNA diagnosis:
Direct identification of mutation in the human
genome by:
• DNA sequencing,
• Oligonucleotide probes (ASO) in Southern
blotting,
• PCR… (see Lab book, p. 113-117)
Indirect DNA diagnosis:
If the mutation cannot be identified directly,
it is traced by indirect methods, e.g. by RFLP
analysis of a SNP located near the mutation
(see Lab book, p. 114-117 and the next slides)
23. Example of linkage between a disease-causing mutated gene and a nearby SNP on the same chromosome
25. Indirect DNA diagnosis of muscle dystrophy (GR) by RFLP
[Figure labels: 6-year-old boy; normal allele for dystrophin; deletion mutation in the dystrophin allele]
Method: DNA of family members + restriction endonuclease, gel electrophoresis of restriction fragments, Southern blotting.
Conclusion: The mother's sister does not have the mutated allele.
26. Examples of gene mutations: substitution, insertion (addition),
deletion…
27. Consequences of the gene mutation for the function
of mutated protein product
deletion mutation
28. Another example: mutated enzyme
Indirect DNA diagnosis of phenylketonuria (AR) by means of RFLP
Method: DNA of family members + restriction endonuclease, gel electrophoresis and Southern blotting.
Conclusion: II. 2 will be a healthy heterozygote.
29. Rather than looking for single SNPs, it is easier to identify the whole
haplotype* block and then to identify the SNPs inside this block.
* Haplotype = a combination of alleles and further DNA markers in a linear block of DNA that is not
interrupted by recombination and that is transmitted over many generations.
30. • The investigation of the human genome
continues.
• The aim is to understand the molecular
nature of mutation of genes responsible for
human diseases.
• Identification of DNA polymorphisms of
individuals and of human haplotype map may
help to identify
• pre-dispositions to diseases,
• the diagnosis and the course of diseases and
• their potential therapy.
31. Specific terms
• ALLELE The alternative forms of a genetic character found at a given locus on a chromosome.
• DIPLOID The chromosome complement consists of two copies (homologues) of each chromosome.
In humans, the two chromosomes of each pair have different origins (one from the mother, one from the father).
• GENE (Johannsen, 1909) A hereditary factor that constitutes a single unit of hereditary material. It
corresponds to a segment of DNA that codes for the synthesis of a single polypeptide chain.
• GENE LOCUS (Morgan, 1915) The position of a gene on a chromosome.
• GENE MAP The position of gene loci on chromosomes. Physical map – the absolute position of gene
loci, their distance from each other being expressed by the number of base pairs between them. Genetic
map – expresses the distance of genetically linked loci by their frequency of recombinations.
• GENETIC MARKER A polymorphic genetic property that can be used to distinguish the parental origin
of alleles.
• GENETIC POLYMORPHISM A difference in DNA sequence among individuals, groups, or
populations. Sources include SNPs, sequence repeats, insertions, deletions and recombination.
• GENOME All of the genetic material of a cell or of an individual.
• GENOMICS The scientific field dealing with the structure and function of the entire genome.
• GENOTYPE An exact description of the genetic constitution of an individual, with respect to a single
trait or a larger set of traits. The genetic constitution of an organism as revealed by genetic or molecular
analysis, i.e. the complete set of genes, both dominant and recessive, possessed by a particular cell or
organism.
• GENOTYPING Genotyping is normally defined as detecting the genotypes of individual SNPs.
• HAPLOTYPE (haploid genotype). A combination of alleles of closely linked loci that are found in a
single chromosome and tend to be inherited together. The linear, ordered arrangement of alleles in a
chromosome. A particular pattern of sequential SNPs (or alleles) found on a single chromosome. These
SNPs tend to be inherited together over time and can serve as disease- gene markers.
• HOMOZYGOUS A diploid organism having identical alleles of a given gene on both homologous
chromosomes.
• HETEROZYGOUS A diploid organism having different alleles of a given gene on the two homologous
chromosomes.
32. • LINES Long interspersed nuclear elements = long interspersed repetitive DNA sequences.
• POLYMORPHISM, GENETIC (Ford, 1940) Existence of more than one normal allele at a gene locus
with the rarest allele exceeding a frequency of 1%. A polymorphism may exist at several levels, i.e.,
variants in DNA sequence, amino acid sequence, chromosomal structure, or phenotypic traits.
• PHENOTYPE (Johannsen, 1909) The observable effect of one or more genes on an individual or a cell.
The observable properties of an individual as they have developed under the combined influences of the
individual's genotype and the effects of environmental factors.
• PROTEOME The complete set of all protein-encoding genes or all proteins produced by them.
• PSEUDOGENE Nucleotide sequence of DNA similar to functional gene, but containing many mutations
that prevent expression.
• SINE Short interspersed nuclear element=short repetitive DNA sequences.
• SINGLE NUCLEOTIDE POLYMORPHISM (SNP) SNP is a source of variance in a genome. A SNP is a
single base variation in DNA. SNPs are the simplest form and the most common source of genetic
polymorphism in the human genome (90% of all human DNA polymorphisms).