1. EUKARYOTIC GENE
REGULATION I
Jill Howlin PhD
Canceromics Branch
Department of Oncology, Lund University
Medicon Village
http://www.med.lu.se/english/klinvetlund/canceromics/research
2. Lecture structure
Eukaryotic Gene Regulation I
A
Revision of the basics
• What is a gene, chromosome etc.
• Transcription
• mRNA processing & Splicing
• Translation…
• DNA mutations
Eukaryotic Gene Regulation II
B
Gene Structure and function
• Structure of typical gene in the genome and its
regulatory units
• How gene structure influences regulation..
A
Control of gene expression
• Producing specificity: Transcription factors
• Other forms of regulation of gene
expression, non-coding RNA, epigenetic
regulation: imprinting & methylation etc….
B
Analysis of Gene Expression & Regulation
• Analysis of gene expression in the genomics era:
transcriptomics
RNA-seq
• Studying gene regulation, DNA-protein interaction
3. Source material:
• Molecular Biology of the Cell. 5th edition. Alberts et al.
• The Cell: A Molecular Approach. 2nd edition. Cooper GM.
• Human Molecular Genetics. 4th edition. Strachan and Read
• Virtual Cell Animation Collection http://vcell.ndsu.nodak.edu/animations/
(also available as free iPhone/iPad app from iTunes)
• Wikipedia http://en.wikipedia.org/
• ENCODE explorer http://www.nature.com/encode/#/threads
6. the basics…..
• What is a gene?
• How many genes do we have?
• Cell Nucleus and the Chromosome
• What is DNA?
• The double helix
• DNA-RNA-PROTEIN -the central dogma
• What is a mutation?
• DNA replication and inheritance
7. What is a GENE?
• discrete segments of DNA (or RNA) that comprise a functional unit
of hereditary material
• make a complementary RNA molecule that serves as a template
for making a protein
• the total complement of genes in an organism or cell is known as
its Genome
• the human genome approx. 3Gbp DNA and around 20,000
protein coding genes
8. A brief History of Genetics
Gregor Mendel (father of modern genetics) published his ‘Experiments in Plant
Hybridization’ in 1866
Charles Darwin published ‘On the Origin of Species’ in 1859
the theories of both men were understood together and merged in early 20th
century
Around 1910 Thomas Hunt Morgan showed that genes are carried on chromosomes and are the mechanical
basis of heredity
The Hershey–Chase experiments 1952 by Alfred Hershey and Martha Chase confirmed
that DNA was the genetic material
9. The Double Helix
• the double helix allowed scientists to understand how DNA was replicated
• 1962 Nobel Prize James Watson, Francis Crick & Maurice Wilkins.
• Rosalind Franklin generated the X-ray diffraction images used to formulate
Watson and Crick’s hypothesis
Photo 51
11. DNA Replication
• DNA polymerase can only extend an existing DNA strand; it cannot begin the synthesis
• Helicase unwinds DNA at the replication fork; the DNA ahead is forced to rotate
• Topoisomerases: enzymes that solve the physical problems of twisting & coiling DNA
• On the lagging strand the new DNA is made in installments (Okazaki fragments, 100 to 200 nucleotides long)
12. The Cell Nucleus
DNA (and RNA) is referred to as nucleic acid because it is found in the nucleus of a
cell
Site of protein synthesis
Where ribosomes are produced
13. DNA Packaging
30nm
1nm radius, 3.6nm pitch (length of one helical turn)
Nucleosome
147bp DNA wrapped
around a complex of
8 core histones
(H2A, H2B, H3, H4)
Histone H1
The nucleosomes fold
to a 30nm fiber..
that form loops
300nm in length…
..the 300nm fiber is
compressed and folded
further to produce a
250nm wide fiber
700nm
Tight coiling of the 250nm fiber
produces the chromatid of a
chromosome
1400nm Chromosome
14. Chromosomes
• Humans have 23 pairs of chromosomes (22 pairs of autosomes and one pair of sex
chromosomes)
16. Chromosome disorders
Abnormal number: aneuploidy
• Down syndrome: 3 × chromosome 21 (trisomy 21)
• Klinefelter syndrome: 47,XXY (XXY syndrome)
Abnormal structure: deletions, duplications, translocations
• Jacobsen syndrome: 11q deletion disorder
Most chromosome abnormalities occur as an accident in the egg or sperm
17. Somatic error
Gross chromosomal abnormalities
• thousands of clustered chromosomal
rearrangements
• single catastrophic event of
fragmentation and reassembly
• cancer
BIMA30 2011-09-14/20
Chromothripsis shattering
18. Expression of Genetic Information
Chromosomal DNA functions in two ways:
- to allow replication of the genetic material (cell division &
inheritance)
- to allow expression of genetic information in an organism
the central dogma
20. RNA ribonucleic acid
• Similar to DNA but contains the sugar ribose
instead of deoxyribose
• RNA also contains the base uracil (U) in
place of thymine (T)
• RNA molecules are less stable than DNA and
are typically single-stranded
21. The Genetic Code
The coding sequence of DNA and RNA is read in sets of three nucleotides, known as codons
Base pairing: DNA (A-T, G-C), RNA (A-U, G-C)
Each codon corresponds to a specific amino acid or to a signal (e.g.: START, STOP)
AUG, which encodes the amino acid Met, is the typical ‘START’ signal
Three ‘STOP codons’ tell the translation machinery that the end of the coding sequence has
been reached.
64 (4³) possible codons (4 × 4 × 4)
&
20 standard amino acids
genetic code is redundant
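The counts on this slide can be checked with a short sketch. This is an illustration only and not part of the lecture material: it builds the standard codon table from a compact amino-acid string in UCAG order (the variable names are made up here) and counts codons, STOP signals and codons per amino acid.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acids for all 64 codons in UCAG order; "*" marks STOP.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

n_codons = len(CODON_TABLE)  # 4 x 4 x 4
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
codons_per_aa = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

print(n_codons)            # 64
print(stop_codons)         # ['UAA', 'UAG', 'UGA']
print(len(codons_per_aa))  # 20
print(codons_per_aa["L"], codons_per_aa["M"])  # 6 1
```

Leucine is specified by six codons while methionine has only one, which is what "redundant" means here: 61 sense codons are shared among 20 amino acids.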
22. Deciphering the code..
5’ 3’
N C
coding strand / sense strand / Crick strand
template strand / antisense strand / Watson strand
-NH2
positively charged amino group
-COOH
negatively charged carboxyl group
24. transcription
• DNA to RNA
basic mechanism-
DNA helix unwinds
Direction of
transcription
RNA polymerase
RNA strand
Template DNA strand
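As a rough sketch of the base-pairing logic behind transcription (not a model of RNA polymerase itself), the snippet below copies a template strand into mRNA. The function names and the 9-base template are hypothetical and chosen only for illustration.

```python
# Partner of each DNA template base when it is copied into RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Partner of each DNA base on the opposite DNA strand.
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (5'->3') made from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def coding_strand(template_3_to_5: str) -> str:
    """Return the coding (sense) DNA strand, 5'->3', for the same template."""
    return "".join(DNA_TO_DNA[base] for base in template_3_to_5)


template = "TACGGCATT"  # hypothetical template strand, written 3'->5'
mrna = transcribe(template)
print(mrna)                     # AUGCCGUAA
print(coding_strand(template))  # ATGCCGTAA
```

The mRNA has the same sequence as the coding strand with U in place of T, which is why the coding strand is also called the sense strand.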
25. cis-acting factors on the same molecule e.g.: promoter
RNA Polymerase II is the enzyme that synthesizes RNA from genes
encoding proteins
It recognizes regulatory sequences in a gene’s
promoter such as:
-TATA box
-GC box
-CAAT box
26. trans-acting factors produced elsewhere
• RNA Polymerase II requires the assembly of the basal transcription
apparatus to initiate transcription
• This consists of general transcription factors:
-TFIIA
-TFIIB
-TFIID
-TFIIF
-TFIIH
29. Capping & Poly-A-tail
• RNA capping, 5’ methyl cap, is the first modification of the mRNA as soon as it
emerges from the polymerase
• In the nucleus, the mRNA cap binds a protein complex, the CBC (cap-binding
complex)
• RNA-binding proteins and processing enzymes recognize the 3’ poly-adenylation signal, cleave the
transcript and add the Poly-A tail
• The Poly-A tail is bound by Poly-A-binding proteins
30. splicing
• following 5’ capping, the pre-mRNA must still undergo splicing before
translation into protein can occur
• removal of the intronic regions
• cleavage at the exon-intron boundaries and joining of the exons
• mature spliced mRNA also has 5’ and 3’ UTRs, untranslated regions
31. Splice junctions & the Spliceosome
splicing machinery or Spliceosome
large Protein–RNA complex (snRNA –snRNP)
made up of small nuclear RNAs and >50
proteins
3 consensus sequence sites:
• Most introns start with a GT (GU for RNA)
and end with AG (always the same)
• An A forms the branch point of the lariat
produced by splicing (can vary)
lariat
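The GU…AG rule can be illustrated with a toy splicing function. This is a minimal sketch under strong assumptions: the intron coordinates are handed to the function (the real spliceosome has to find them), the branch-point A is not checked, and the sequences are invented.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as sorted (start, end) half-open intervals.

    Each intron must start with GU and end with AG, as most introns do.
    """
    exons = []
    position = 0
    for start, end in introns:
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron {start}-{end} lacks GU...AG boundaries")
        exons.append(pre_mrna[position:start])  # keep the exon before it
        position = end
    exons.append(pre_mrna[position:])  # last exon
    return "".join(exons)


exon1, intron, exon2 = "AUGGCC", "GUAAGUCUAACUUUCAG", "GAAUAA"  # hypothetical
pre_mrna = exon1 + intron + exon2
mature = splice(pre_mrna, [(len(exon1), len(exon1) + len(intron))])
print(mature)  # AUGGCCGAAUAA
```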
33. translation
• mRNA to protein
• mature mRNA is DECODED by the ribosome to produce a specific
polypeptide (amino acid chain)
• proceeds through initiation, elongation and termination
• tRNAs, transfer RNAs: function as adaptors between the mRNA template
and the amino acids
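A minimal sketch of the decoding step, assuming the standard genetic code and a made-up mRNA: the function finds the first AUG, reads one codon at a time and stops at the first STOP codon. Initiation factors, tRNAs and the ribosome itself are not modelled.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acids for all 64 codons in UCAG order; "*" marks STOP.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first STOP."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no START codon, nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # termination
        protein.append(amino_acid)  # elongation
    return "".join(protein)


print(translate("GGAUGGCCGAAUAAGG"))  # MAE
```

The two bases before AUG and the bases after UAA stand in for the 5’ and 3’ untranslated regions: they are part of the mRNA but not of the polypeptide.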
34. tRNA
70 to 80 nucleotides long
Always CCA at the 3’ end
The anticodon binds to the codon in the mRNA sequence
36. Wobble hypothesis
There are 4³ = 64 possible codons
49 different tRNAs
In 1966, Francis Crick proposed the
Wobble hypothesis
the 5' base on the anticodon, which
binds to the 3' base on the mRNA, is
not as spatially confined as the other
two bases, and can have non-standard
base pairing
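The wobble rules can be written down as a small lookup table. The sketch below uses the pairings Crick proposed for the 5’ anticodon base (G reads C or U, U reads A or G, inosine reads A, C or U); the function name and the two example anticodons are chosen for illustration and modified bases other than inosine are ignored.

```python
# Standard pairing used at the first two codon positions.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}
# Wobble rules: 5' base of the anticodon -> 3' codon bases it can read.
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "ACU"}


def codons_read(anticodon: str) -> list[str]:
    """Return the codons (5'->3') that an anticodon (5'->3') can pair with."""
    wobble_base, middle, last = anticodon
    # Pairing is antiparallel: the anticodon 3' base meets codon position 1.
    first_two = PAIR[last] + PAIR[middle]
    return sorted(first_two + third for third in WOBBLE[wobble_base])


print(codons_read("GAA"))  # ['UUC', 'UUU'] (both phenylalanine)
print(codons_read("IGC"))  # ['GCA', 'GCC', 'GCU'] (all alanine)
```

Because one anticodon can read two or three codons that differ only in the third position, fewer tRNA species than codons are needed.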
37. Mutations
Permanent changes in the DNA sequence
Mutations can be :
spontaneously occurring (mistakes in replication, failure of DNA repair)
or induced (by radiation such as UV, or by chemical exposure to agents that bind or
react with DNA)
Types of mutations include:
• point mutations
• insertions
• deletions
• translocation
38. The effect on the resulting protein can vary and includes the following
consequences:
• Frame shift mutation : a disruption in the reading frame
The sun was hot but the old man did not get his hat
T hes unw ash otb utt heo ldm and idn otg eth ish at
Or
Th esu nwa sho tbu tth eol dma ndi dno tge thi sha t
• Nonsense mutation
premature STOP or truncated protein
• Missense
Single point mutation resulting in different amino acid
• Neutral mutation
e.g.: AAA to AGA, lysine to arginine
• Silent mutation
No effect on final protein
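The categories above can be sketched in a few lines, assuming the standard genetic code. The helper names are invented for this example; `triplets` reproduces the sentence analogy for a frame shift and `classify` labels a single-codon change as silent, missense or nonsense (a neutral mutation is a missense change to a chemically similar amino acid, which this sketch does not distinguish).

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acids for all 64 codons in UCAG order; "*" marks STOP.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def triplets(text: str, offset: int = 0) -> list[str]:
    """Read text in groups of three letters, starting at the given offset."""
    letters = text.replace(" ", "")[offset:]
    return [letters[i:i + 3] for i in range(0, len(letters) - 2, 3)]


def classify(codon_before: str, codon_after: str) -> str:
    """Label the effect of changing one codon into another."""
    old, new = CODON_TABLE[codon_before], CODON_TABLE[codon_after]
    if old == new:
        return "silent"
    if new == "*":
        return "nonsense"
    return "missense"


print(triplets("the sun was hot"))            # ['the', 'sun', 'was', 'hot']
print(triplets("the sun was hot", offset=1))  # ['hes', 'unw', 'ash']
print(classify("AAA", "AGA"))  # missense (lysine to arginine)
print(classify("GAA", "GAG"))  # silent
print(classify("UGG", "UGA"))  # nonsense
```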
39. Mutations causing disease
heritable genetic disorders (single mutation)
SCD sickle cell anaemia: a point mutation in the β-globin chain of
haemoglobin, causing the hydrophilic amino acid glutamic acid to be replaced
with the hydrophobic amino acid valine, GAG to GTG (Malaria resistance)
Cystic Fibrosis: most commonly (ΔF508) a deletion that results in a loss of
phenylalanine in the protein encoded by the CFTR gene.
40. SNPs… natural variation
≥ 1% of population
accounts for 90% of all human genetic variation
every 100 to 300 bases
coding (gene) and noncoding regions
many SNPs have no effect on cell function
GWAS
41. GWAS: genome-wide association studies
• SNP arrays, next-generation (NG) sequencing
• focus on the associations between SNP and a particular disease
• influences the risk of disease occurring in an individual
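To make "association between a SNP and a disease" concrete, here is a rough sketch of the simplest possible test on one SNP: an odds ratio and a chi-square statistic from a 2 × 2 table of allele counts. The counts are invented for illustration, and a real GWAS repeats this for very many SNPs and must correct for multiple testing, which is not shown.

```python
def odds_ratio(case_alt: int, case_ref: int, ctrl_alt: int, ctrl_ref: int) -> float:
    """Odds of carrying the alternative allele in cases relative to controls."""
    return (case_alt * ctrl_ref) / (case_ref * ctrl_alt)


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (1 degree of freedom) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical allele counts: (alternative, reference) in cases and controls.
cases = (300, 700)
controls = (200, 800)

print(round(odds_ratio(*cases, *controls), 2))      # 1.71
print(round(chi_square_2x2(*cases, *controls), 2))  # 26.67
```

An odds ratio above 1 means the allele is more common among cases, i.e. it is associated with increased risk; it does not by itself show that the SNP causes the disease.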
42. Types of SNPs… associated with the disease phenotype
SNPs occur more frequently in non-coding regions than in coding regions
Rather than causing disease, SNPs may confer differential susceptibility e.g.: Alzheimer's
disease and ApoE SNPs
45. Common misconceptions:
• The only purpose of a gene is to encode a protein
• One gene encodes one mRNA and gives rise to one specific protein
• DNA that is not a gene is considered ‘junk DNA’
"Biologists should not deceive themselves with the thought that some new class of biological
molecules, of comparable importance to proteins, remains to be discovered. This seems highly unlikely."
(Francis Crick, 1958)
46. • 2003 (14th April) completion of Human Genome Project
(99% of the genome sequenced to 99.99% accuracy)
• 2012 (5th September) The ENCODE project
regions of transcription, transcription factor association, chromatin structure and
histone modification in the entire genome
80% of the components of the human genome now have at least one
biochemical function associated with them!
Human Genome annotation
47. Size approx. 3Gbp
GENCODE consortium aims to identify all protein-coding, long non-coding RNA
and short RNA genes: >51,096 genes: 20,026 protein-coding and 31,070 non-
coding (GENCODE version 8, March 2011)
A lot fewer genes than originally thought!
~20,000 protein–coding (<1.5%)
Non-protein-coding sequences make up the vast majority of DNA
The mitochondrial genome:
16.6kb
Not a chromosome, but one circular DNA molecule
37 genes (13 protein-coding genes)
49. Gene Structure and Function: protein-coding genes
promoters, enhancers, repressors
alternative splicing, transcript variants
mRNA processing & editing
protein isoforms, proteasome-mediated destruction
50. Gene Structure and Function: non-protein-coding genes
• Pseudogenes
< 12,000
genes in the genome that never become proteins
• RNA genes: DNA encoding non-protein-coding RNA
Approx. 9,000 lncRNA and 9,000 sRNA
transfer RNA (tRNA)
ribosomal RNA (rRNA)
snRNA
snoRNAs, microRNAs
siRNAs
piRNAs
Xist
53. Pseudogenes result from:
1. Retrotransposition
mRNA transcript of a gene is spontaneously reverse transcribed back into DNA and
inserted into chromosomal DNA (looks like cDNA structure, exons, no promoter)
2. Gene duplication
Arising from gene duplication events and subsequent acquisition of deleterious
mutations (has introns, exon structure and promoter)
3. Disabled Genes
A gene disabled by a mutation that does not have a deleterious effect on the
organism and gets fixed in the population
L-gulono-γ-lactone oxidase (GULO)- biosynthesis of vitamin C (GULOP gene in humans)
54. Pseudogenes in Gene Regulation
• Pseudogene-derived small interfering RNAs (siRNA)
Some pseudogenes actually encode siRNAs that regulate the expression of protein-coding
mRNAs- ‘parent’ genes
• An miRNA decoy function was proposed for pseudogenes when it was
demonstrated that their transcripts were biologically active units in
tumor biology e.g.: PTEN
55. Control of transcription: promoters, enhancers, repressors..
• Control of transcription in eukaryotes is complex and is dependent on both cis- and
trans-acting factors including: regulatory & promoter sequences, RNA polymerase, the
general transcription factors and gene-specific activators and repressors
• RNA Pol II is responsible for transcription of mRNA from protein-coding genes [Pol I (rRNA)
and Pol III (tRNA and some small RNAs)]
56. • Pol II requires assembly of the general TFs on the promoter in order to initiate
transcription
• specific gene transcription may also be regulated by activator and repressor proteins
binding at near OR distant sites
RNA Pol II promoters vary significantly in structure but often include one or more of:
TATA box
GC box
CAAT box
However none are essential and many promoters lack them all!!
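A simple way to look for these elements in a sequence is a motif scan. The sketch below uses simplified core motifs (TATAAA, GGGCGG, CCAAT) rather than full position weight matrices, searches one strand only, and runs on two invented sequences, so it is an illustration rather than a promoter predictor.

```python
import re

# Simplified core motifs; real elements tolerate variation around these.
PROMOTER_MOTIFS = {
    "TATA box": "TATAAA",
    "GC box": "GGGCGG",
    "CAAT box": "CCAAT",
}


def find_elements(sequence: str) -> dict[str, list[int]]:
    """Return the 0-based start positions of each motif in the sequence."""
    return {
        name: [match.start() for match in re.finditer(motif, sequence)]
        for name, motif in PROMOTER_MOTIFS.items()
    }


with_elements = "GGGCGGTTCCAATCGTATAAAGC"  # hypothetical promoter
without_elements = "ACGTACGTACGTACGT"      # hypothetical promoter lacking all three

print(find_elements(with_elements))
# {'TATA box': [15], 'GC box': [0], 'CAAT box': [8]}
print(find_elements(without_elements))
# {'TATA box': [], 'GC box': [], 'CAAT box': []}
```

The second sequence returns empty lists for all three elements, which mirrors the point that a promoter can lack every one of them.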
58. Transcript variation: one gene = many mRNAs
Mainly due to the fact that one gene can have more than one promoter and more than one splicing pattern
At least half of all genes
have 2 or more promoters
Alternative splicing, as well as creating different protein isoforms, can also generate different 5’ and 3’ UTRs
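The arithmetic behind "one gene = many mRNAs" is a simple product, sketched below for a hypothetical gene with two promoters and one exon that is either included or skipped. The labels are invented, and real genes do not necessarily use every combination.

```python
from itertools import product

promoters = ["promoter 1", "promoter 2"]                # hypothetical
splice_patterns = ["exon 3 included", "exon 3 skipped"]  # hypothetical

# Every promoter can in principle be combined with every splicing pattern.
transcripts = [f"{p}, {s}" for p, s in product(promoters, splice_patterns)]

for transcript in transcripts:
    print(transcript)
print(len(transcripts))  # 4
```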
60. Role of mRNA processing in gene regulation?
• seems wasteful
• exon-intron arrangement facilitates new genetic recombination
• evidence in protein domains
• variation - several protein variants can be produced from one gene
• The 5’, 3’ modifications of mRNA also have a role in stability, transport and
recognition
61. Influence of mRNA structure in gene regulation: processing, export & initiation of translation
Export-ready (correctly spliced and polyadenylated) mRNA is assembled in the nucleus
It moves through the nuclear pore complex as a curved fiber, 5’ cap first
Final mRNA checks
Translation initiation factors
• the 5’ cap distinguishes mRNA from other types of RNA (Pol I and III transcripts)
• the 5’ cap binds the CBC (cap-binding complex)
• the Poly-A-tail is bound by Poly-A-binding proteins
• splicing junctions are marked by EJCs (exon junction complexes)
62. Quality control
• In the cytosol, the 5’cap and Poly-A-tail are recognized by the translation initiation
machinery
• EJCs serve to label the mRNA
• Nonsense-mediated decay (NMD) rids cells of mRNAs with premature termination codons
• The hUpf complex triggers NMD in the cytoplasm when EJCs are recognized downstream of the stop codon
Exon junction complexes
Correct stop codon
Premature stop codon Upf proteins
Upf triggers mRNA degradation
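The logic of the figure can be sketched as a one-line rule: if any exon junction complex sits downstream of the stop codon, the stop is treated as premature. This is a deliberate simplification with invented coordinates; real NMD has further requirements that are not modelled here.

```python
def is_nmd_target(stop_codon_pos: int, exon_junctions: list[int]) -> bool:
    """True if at least one exon junction lies downstream of the stop codon.

    Positions are nucleotide coordinates along the spliced mRNA.
    """
    return any(junction > stop_codon_pos for junction in exon_junctions)


junctions = [300, 650, 900]  # hypothetical EJC positions on a spliced mRNA

print(is_nmd_target(1100, junctions))  # False: correct stop, in the last exon
print(is_nmd_target(400, junctions))   # True: premature stop, EJCs remain downstream
```

With a correct stop codon in the last exon the ribosome has passed every EJC before terminating, so nothing is left downstream to recruit the Upf proteins.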
63. mRNA stability & degradation
• The stability of mRNAs is very variable
• A deadenylase enzyme acts to shorten the Poly-A tail in the cytoplasm
• Decapping enzymes - uncapped mRNA is rapidly degraded by exonucleases
• miRNA binding to the 3’ UTR mediates degradation
• RNAi mediated degradation
AREs (AU-rich elements) can stimulate Poly-A tail removal
50–150 nt, rich in adenosine and uridine
found in the 3’ UTRs of many mRNAs with a short half-life
fewer than 10% of genes, e.g.: growth factors
64. RNA editing
Alteration of the mRNA so that the amino acid sequence of the encoded protein differs from that predicted by the genomic DNA
A-to-I: deamination of adenosine to produce inosine
Adenosine deaminases acting on RNA (ADARs)
>1000 genes
C-to-U: deamination of cytosine to produce uracil
Apolipoprotein B: each isoform has a different role in
lipid metabolism
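Both kinds of editing can be sketched on toy sequences. The assumptions are: the standard genetic code, inosine being read as G by the translation machinery, and short invented mRNAs. The CAA to UAA change is the kind of C-to-U edit that produces the shorter Apolipoprotein B isoform, but the sequences below are not the real ApoB mRNA.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acids for all 64 codons in UCAG order; "*" marks STOP.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def edit(mrna: str, position: int, new_base: str) -> str:
    """Return a copy of the mRNA with one base replaced (0-based position)."""
    return mrna[:position] + new_base + mrna[position + 1:]


def translate(mrna: str) -> str:
    """Translate from position 0 to the first STOP; inosine (I) is read as G."""
    readable = mrna.replace("I", "G")
    protein = []
    for i in range(0, len(readable) - 2, 3):
        amino_acid = CODON_TABLE[readable[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


c_to_u = edit("AUGCAAGGC", 3, "U")  # CAA (Gln) becomes UAA (STOP)
a_to_i = edit("AUGCAGGGC", 4, "I")  # CAG (Gln) becomes CIG, read as CGG (Arg)

print(translate("AUGCAAGGC"), translate(c_to_u))  # MQG M
print(translate("AUGCAGGGC"), translate(a_to_i))  # MQG MRG
```

In both cases the genomic DNA is unchanged; only the mRNA, and therefore the protein, differs.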
65. Control at the protein level: ..lost after translation
• In principle, every step required for the process of gene expression can be
regulated
• The steps following translation are no different
• Protein folding is initiated as synthesis proceeds, often with the help of molecular
chaperones (such as Hsp70)
• The ubiquitin-proteasome system efficiently destroys incorrectly or incompletely
folded proteins and acts to limit the life of normal or correctly folded proteins
• Abnormally folded proteins can form disease causing protein aggregates
My name is …. I originally come from Dublin, Ireland, where I obtained my PhD in 2004. I’ve worked at Lund University for about 6 years, first as a post-doc at CRC in Malmö and now as a junior group leader at the Dept. of Oncology, Canceromics Branch, which is situated in Medicon Village. My main interest is, unsurprisingly, cancer research and cancer genomics, in particular breast cancer and melanoma.
The lectures are divided over 2 days: we have 2 × 2 hour slots, but we won’t need all that time, probably a little over an hour each time (1½ hours today, 1 hour next time). Today we will first revise the very basics and then go a little deeper than you’ve gone before and introduce some new concepts. Next week we will continue with more detailed concepts and finally address the experimental study of gene regulation.
From last year’s student feedback I know that some people really like the revision part while others find it boring, but we can also see from the exams that some of the basic material wasn’t well known or understood. There is likely a mixture of levels and background knowledge in the class, so I think it’s good to make sure we are all on the same page at the beginning. The purpose of the lectures is to give you a very structured overview of this particular topic. You certainly could read about it yourself, but by coming to a lecture you get it all appropriately summarized. I make sure to put the most important content and text on my slides in case there is some difficulty in understanding me: I am one of the few, if not the only, non-Swedish speakers that you have as a lecturer and I don’t want anyone to be at a disadvantage because of that.
I am happy to get feedback and/or questions now or via email. I will discuss the exam and questions next week.
You will probably be fine with newer and some not-so-old editions; these are the editions I’ve used. Some years students ask for page/chapter numbers, but because the editions vary I don’t give them; it should be obvious which chapter the material comes from. Wikipedia and Google Scholar can be very useful, but make sure that what you’re reading is backed up by references to articles or textbooks. I will also give you some article references when I discuss something that isn’t in the textbooks yet. I will speak about ENCODE data and I encourage you all to go to this resource, which has many useful articles explaining what we know, and how we know it, about transcriptional control of the human genome. You won’t find it in textbooks as most of these papers were published only this time last year. However, if you know and understand everything that is on my slides you should have no problem answering the exam questions. It is important to use the source material to make sure you understand the content of my slides.
As I understand it you should all have done the BIM12 course, so finally I would remind you to go back and look at your notes from BIM12, which covered most of what we will revise for this course.
So what do I mean by the basics? Where are the genes in a cell? Who discovered genes? Who discovered DNA?
Ok, so firstly a useful definition(s): Genes are discrete segments of DNA (or RNA) that comprise a functional unit of hereditary material. Most (!) genes function to make a complementary RNA molecule that serves as a template for making a protein. The total complement of genes in an organism or cell is known as its Genome.
Gregor Mendel was an Austrian scientist and Augustinian friar who is known as the "father of modern genetics"; he published his ‘Experiments in Plant Hybridization’ in 1866. Independently, Charles Darwin, an English naturalist, published ‘On the Origin of Species’, outlining his theory of natural selection, in 1859. However, not until the early 20th century were the theories of both men understood together and merged.
Around 1910, Thomas Hunt Morgan, an American evolutionary biologist, geneticist and embryologist who worked on the fruit fly Drosophila melanogaster, was able to demonstrate that genes are carried on chromosomes and are the mechanical basis of heredity. The proposal that chromosomes carried the factors of Mendelian inheritance was initially controversial, until 1915 when Morgan's work on inheritance and genetic linkage provided incontrovertible evidence for it.
The Hershey–Chase experiments were a series of experiments conducted in 1952 by Alfred Hershey and Martha Chase, which helped to confirm that DNA was the genetic material. They showed that when bacteriophages, which are composed of DNA and protein, infect bacteria, their DNA enters the host bacterial cell but most of their protein does not. Although the results were not conclusive, and Hershey and Chase were cautious in their interpretation, previous, contemporaneous and subsequent discoveries all served to prove that DNA is the hereditary material.
It was the Danish botanist Wilhelm Johannsen who coined the word gene in 1909 to describe the Mendelian units of heredity.
Discovery of the double helix structure allowed scientists to understand how DNA was replicated and passed from cell to cell and generation to generation. The 1962 Nobel Prize went to James Watson, Francis Crick & Maurice Wilkins. Rosalind Franklin generated the X-ray diffraction images used to formulate Watson and Crick’s hypothesis but died of ovarian cancer in 1958, so did not share in the Nobel Prize. The model was based upon the crucial X-ray diffraction image of DNA labeled as "Photo 51", from Rosalind Franklin.
The DNA double helix is a spiral polymer of nucleic acids, held together by nucleotides which base pair together. The realization that the structure of DNA is that of a double helix elucidated the mechanism of base pairing by which genetic information is stored and copied.
The genetics lectures by Torbjorn cover this, and it’s important to note that DNA replication is controlled within the context of the cell cycle; see Urban Gullberg’s lectures.
NB: some nucleic acid is also found in the cell’s mitochondria.
Each of us has enough DNA to reach from here to the sun and back, more than 300 times. How is all of that DNA packaged so tightly into a tiny nucleus? We end up with a familiar-looking chromosome structure.
Ok, so you’ve had genetics lectures with Torbjorn and so you know that …. Picture of karyotype. It was in Albert Levan’s lab in Lund in 1956 that it was discovered that humans have 46 chromosomes, not 48 as previously thought.
Torbjorn should have also covered …. Mitosis produces two daughter cells that are identical to the parent cell (cell division, cell growth). Meiosis produces daughter cells that have one half the number of chromosomes as the parent cell; meiosis enables organisms to reproduce sexually. Gametes (sperm and eggs) are haploid. I want to mention this because errors sometimes occur during the copying process. If they occur during mitosis, after conception for instance, and during life, they are known as somatic mutations. If they happen during meiosis, in producing the egg or sperm, they are referred to as germ line mutations and, as such, they will be passed on through inheritance.
Down syndrome is also known as trisomy 21 (an individual with Down syndrome has three copies of chromosome 21). 47,XXY, or XXY syndrome (Klinefelter syndrome), is a condition in which human males have an extra X chromosome. Jacobsen syndrome causes intellectual disabilities, a distinctive facial appearance, and a variety of physical problems. Most chromosome abnormalities occur as an accident in the egg or sperm, i.e. in the germ line, but the reason they have an effect is that they influence the normal and tightly controlled gene expression. Chromosome abnormalities can also be somatic.
Cancer / congenital diseases. Gross chromosome abnormalities can have a profound effect on gene regulation by disruption of normal gene structure. Chromothripsis affects localised and confined genomic regions in one or a few chromosomes. Chromothripsis is a neologism that comes from the Greek words chromo, which stands for chromosome, and thripsis, which means 'shattering into pieces'. It involves simultaneous fragmentation of distinct chromosomal regions (breakpoints show a non-random distribution) and then subsequent imperfect reassembly. These events have only properly been appreciated since the advent of sequencing, which we will discuss next week; chromothripsis was first observed in leukaemia. We will touch on this again when we discuss genome sequencing etc.
Germline & somatic replication errors influence gene expression!!
There are 64 (4³) possible codons (four possible nucleotides at each of three positions). But there are only 20 standard amino acids, therefore multiple codons can specify the same amino acid, and thus we can say that the genetic code is redundant. Redundancy is rampant in biology and with good reason: most often it serves as a safety mechanism, more than one way to do the same thing!
The DNA strand whose sequence matches the RNA (the upper strand) is the coding strand / sense strand / Crick strand. RNA is synthesized from the template strand / antisense strand / Watson strand. DNA is always written 5’-3’. A handy way to remember it is that 5’-3’ goes with N-C (the protein is written from N-terminus to C-terminus).
Obviously transcription is a dynamic process even though most of the textbooks simply show a static picture. For me the easier way to remember this is to visualize it, so I’m going to show a short clip with some narration that describes the process in a bit more detail.
Ok, so we mentioned the ‘enhancer region’ and the ‘transcription unit’, and we will discuss those shortly. For now we have synthesized RNA and it requires further processing.
After transcription, when we’ve made RNA, it is subject to post-transcriptional modifications with various purposes.
The Poly-A tail is made by adding A nucleotides (approx. 200). Once the Poly-A tail is synthesized it is available to bind Poly-A-binding proteins.
Splicing involves removal… & cleavage… to produce the final ‘open reading frame’ or coding sequence (CDS). The UTRs (untranslated regions) do not code for or form part of the final protein.
Spliceosome -It recognizes three consensus sequence sites in order to remove an intron
So finally, after processing of the RNA, now referred to as messenger RNA, translation occurs to produce the final protein product. Translation describes the process whereby the mature mRNA produced by transcription is DECODED by the ribosome to produce a specific polypeptide (amino acid chain). tRNAs (transfer RNAs) function as adaptors between the mRNA template and the amino acids which make up the polypeptide.
This is the typical schematic of tRNA (transfer RNA) that one finds in the textbooks. It is approx. 80 nucleotides long and attaches to the amino acid at one end (3’), the identity of which is determined by the codon in the mRNA sequence binding to the tRNA anticodon.
I mentioned earlier that the genetic code is redundant. There are 4³ = 64 possible codons but only 49 different classes of human tRNA, so some tRNA species must pair with more than one codon. Alternative codons may differ only in the third nucleotide.
I spoke about chromosome-level rearrangements and errors (abnormalities and chromothripsis), but you can also have errors/variations on a smaller scale, at the individual bases in the DNA. Mistakes in replication and failure of DNA repair can also result in the gross chromosome abnormalities that we discussed earlier. Mutations can also be induced by the environment.
Neutral mutation = the new amino acid is chemically similar, with no deleterious effect on fitness. Silent mutations may also be referred to as synonymous.
If a mutation is present in a germ cell, it can give rise to offspring that carry the mutation in all of their cells, and this is then a hereditary disease. If a mutation occurs in a somatic cell it will be present only in the descendants of this cell (by cell division) within the same organism. Examples of heritable genetic disorders arising from a single mutation include: SCD (sickle cell anaemia), a point mutation in the β-globin chain of haemoglobin, causing the hydrophilic amino acid glutamic acid to be replaced with the hydrophobic amino acid valine, GAG to GTG (malaria resistance); and cystic fibrosis, most commonly (ΔF508) a deletion that results in a loss of phenylalanine in the protein encoded by the CFTR gene. Cystic fibrosis transmembrane conductance regulator (CFTR) is a protein that in humans is encoded by the CFTR gene. CFTR is an ABC transporter-class ion channel that transports chloride and thiocyanate ions across epithelial cell membranes.
A SNP is a common genetic variation at the level of a single base pair. To be classified as a SNP, the change must be present in at least one percent of the general population; no known disease-causing mutation is this common! SNPs, which make up about 90% of all human genetic variation, occur every 100 to 300 bases along the 3-billion-base human genome. SNPs can occur in coding (gene) and noncoding regions of the genome. Many SNPs have no effect on cell function, but there are those that appear in a region identified via GWAS as associated with a biochemical event.
GWAS work by comparing healthy and diseased populations and looking at the variation in their DNA.
Why am I telling you about SNPs? They mostly influence gene regulation, rather than acting like mutations that affect the function of a protein. Linked SNPs (also called indicative SNPs) do not reside within genes and do not affect protein function. Nevertheless, they do correspond to a particular drug response or to the risk of getting a certain disease, i.e. they are biological markers. Causative SNPs affect the way a protein functions, correlating with a disease or influencing a person's response to medication. Causative SNPs come in two forms: coding SNPs, located within the coding region of a gene, change the amino acid sequence of the gene's protein product; non-coding SNPs, located within the gene's regulatory sequences, change the level of gene expression and, therefore, how much RNA and protein is produced. SNPs are contributing factors: unlike a mutation, where the protein either works properly or not, they change levels along a continuum. Disease-associated SNPs were consistently enriched in function-rich areas of the genome, e.g. enhancer regions and TSSs, i.e. TF binding sites (eSNPs). But even if a SNP is in a coding region, as with ApoE, the key is susceptibility rather than causation. ApoE SNPs:-
Speak about the 80% controversy
2003 (14 April) saw successful completion of the Human Genome Project with 99% of the genome sequenced to 99.99% accuracy. 2012 (5th September): the ENCODE project set out to identify all regions of transcription, transcription factor association, chromatin structure and histone modification in the human genome sequence. 80% of the components of the human genome now have at least one biochemical function associated with them.
If you can’t find these resources and you want to look them up, email me. Sequencing of the human genome has resulted in a significant downward revision of the number of genes in the genome: a lot fewer, ~20,000 protein-coding genes, and these account for less than 1.5% of the entire genome! The non-protein-coding sequences, which make up the vast majority of DNA in eukaryotes, are unlikely to be ‘junk DNA’!!! They contain regulatory regions!!! I mention the mitochondrial genome only for completeness; we won’t deal with it in this course.
Ok so lets go back a bit to the gene structure , what we’ve dealt with so far is in our revision today is really just protein coding genes and how they are processed
But thereis a lot of information in the non-protein coding regions of the genome,Apart from regulatory regions and regions of no known function there are even non-coding genes such as pseudogenes and RNA genesI will say a bit about pseudogenes, and Sebastian will go into much more detail on RNA genesLn and sXist -inactive X is coated with this transcript
I think even if you didn't hear about pseudogenes before, you heard about them in your PBL today.
Pseudogenes usually look one of two ways… depending on how they come about.
Either the entire genomic region is duplicated, carrying with it..
Or the pseudogene arises from an RNA transcript, in which case it has no….. and is called ‘processed’.
L-gulono-γ-lactone oxidase (GULO) aids in the biosynthesis of ascorbic acid (vitamin C), but it exists as a disabled gene (GULOP) in humans.
A gene that aids in vitamin C biosynthesis, which we no longer require, was mutated and disabled, but we kept the transcript.
So it turns out that, again, far from being junk DNA, pseudogenes can have roles in gene regulation too!
There's pseudogene-derived siRNA, which acts in the same way as other siRNA processes to downregulate a ‘parent’ gene.
Decoy: an miRNA from elsewhere in the genome that is supposed to downregulate a functional gene is instead intercepted by an imposter pseudogene.
Examples other than PTEN are now known where the pseudogene transcript acts as a decoy transcript to trick the cell. (The ‘real’ gene transcript is then able to escape from degradation.)
We have briefly talked about the idea of a ‘transcription unit’, taking into account the gene itself, its introns, exons and promoter, but it can also include regions much further away that function as enhancer or repressor binding sites.
They can be up- or downstream.
However none are essential and many promoters lack them all!! Biology always has exceptions!!
The ones depicted here are DNA-binding, but there are also non-DNA-binding co-regulators.
Activators and repressors can also be co-regulatory proteins or transcriptional co-regulators that do not bind DNA but interact with RNA Pol II and general TFs.
Many activators and repressors also function by alteration of the local chromatin structure, which directly influences the accessibility of the underlying DNA sequence (next week's lecture).
Ok, so once the fight between the repressors, activators and co-regulators is over and the basal TFs have assembled, we can still have various outcomes depending on the transcript produced.
And following on from the production of various transcripts.. the transcripts determine the protein isoform.
The example here is from the alpha-tropomyosin gene, and the significance of this is that, depending on the cell type, different isoforms are required.
Seems wasteful to expend energy removing portions of pre-mRNA.
The exon-intron arrangement facilitates new genetic recombination of exons from different genes; in other words, it's required for the evolution of the genome and ease of variation.
The evidence for this type of protein evolution can be seen in the organization of protein domains.
Allows for easy variation, as several protein variants can be produced from one gene. This makes it energy efficient if you think that multiple proteins can be made from the same gene.
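A small Python sketch of the "several protein variants from one gene" idea. It is a deliberate simplification: the gene, the exon names (E1 to E5) and which exons are optional are all hypothetical, and it assumes each optional (cassette) exon is included or skipped independently. Real genes such as alpha-tropomyosin are more constrained than this (cell-type-specific choices, exons that exclude one another).

```python
from itertools import product

def isoforms(exons):
    """Yield every exon combination for a gene.

    exons: list of (name, is_constitutive) pairs in 5'->3' order.
    Constitutive exons are always kept; each optional exon is
    either included or skipped (simplifying assumption).
    """
    choices = [
        ((name,),) if is_constitutive else ((name,), ())
        for name, is_constitutive in exons
    ]
    for combo in product(*choices):
        # Flatten the per-exon choices into one tuple of exon names.
        yield tuple(name for part in combo for name in part)

# Hypothetical gene: five exons, two of them optional.
GENE = [("E1", True), ("E2", False), ("E3", True), ("E4", False), ("E5", True)]

all_isoforms = list(isoforms(GENE))
for iso in all_isoforms:
    print("-".join(iso))
print(f"{len(all_isoforms)} possible transcripts from one gene")
```

With n independent optional exons the count is 2^n, so even a handful of optional exons gives many possible transcripts from a single gene.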
The 5’ cap helps distinguish mRNA from other types of RNA (Pol I and III produce uncapped RNA).
In the nucleus, the 5’ cap binds CBC (cap-binding complex), the poly-A tail is bound by poly-A binding proteins, and the splicing junctions are marked by EJCs (exon junction complexes).
These bound proteins help processing and export of the mRNA and initiation of subsequent translation.
Undergoes further QC steps in the cytosol.
Nonsense-mediated decay (NMD) rids eukaryotic cells of aberrant mRNAs containing premature termination codons: NMD helps the cell recognize a ‘true’ stop codon.
The assembly of a dynamic hUpf complex initiates in the nucleus at mRNA exon–exon junctions and triggers NMD in the cytoplasm when recognized downstream of a translation termination site.
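The logic of "a junction mark downstream of the stop codon means the stop is premature" can be sketched in a few lines of Python. This is a toy model only: the sequences and junction position are made up, the function names are mine, and it ignores everything the real machinery does (the hUpf proteins, any distance requirements between stop codon and junction).

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def first_stop_end(mrna):
    """Return the index just past the first in-frame stop codon, or None.

    The reading frame is set by the first AUG in the sequence.
    """
    start = mrna.find("AUG")
    if start < 0:
        return None
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return i + 3
    return None

def is_nmd_target(mrna, junctions):
    """Toy rule: flag the mRNA if any exon-exon junction lies downstream
    of the first in-frame stop codon.

    junctions: 0-based index of the first nucleotide after each junction
    in the spliced mRNA.
    """
    stop_end = first_stop_end(mrna)
    if stop_end is None:
        return False
    return any(j >= stop_end for j in junctions)

# Hypothetical spliced mRNAs with one exon-exon junction at position 9.
normal = "AUGGCUGAAGGAUAAGC"   # AUG GCU GAA | GGA UAA ...
mutant = "AUGGCUUAAGGAUAAGC"   # GAA -> UAA: premature stop before the junction

print(is_nmd_target(normal, [9]))  # stop lies after the last junction
print(is_nmd_target(mutant, [9]))  # junction lies downstream of the stop
```

In the normal transcript the stop codon sits after the only junction, so nothing is flagged; in the mutant the junction is still downstream when translation terminates, which is the signal described above.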
Some mRNAs have half-lives of many hours (up to 10 hours), while other, more unstable transcripts have half-lives of < 30 minutes.
AREs can stimulate poly-A tail removal.
AREs, AU-rich elements, are sequence elements of 50–150 nt that are rich in adenosine and uridine bases. They are located in the 3’-UTRs of many but not all mRNAs that have a short half-life (<10% of genes).
Genes encoding proteins such as growth factors are often targeted in this way for tight regulation.
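To get a feel for what these half-lives mean, here is a small worked example in Python. It assumes simple first-order (exponential) decay and that transcription has been switched off, neither of which is stated in the notes; the 2-hour time point is arbitrary.

```python
def fraction_remaining(hours, half_life_hours):
    """Fraction of an mRNA pool left after `hours`, assuming
    first-order decay and no new transcription (assumption)."""
    return 0.5 ** (hours / half_life_hours)

ELAPSED_HOURS = 2.0

# Half-lives taken from the notes: ~10 hours (stable) vs. 30 minutes (unstable).
stable = fraction_remaining(ELAPSED_HOURS, 10.0)
unstable = fraction_remaining(ELAPSED_HOURS, 0.5)

print(f"stable transcript:   {stable:.1%} left after {ELAPSED_HOURS:g} h")
print(f"unstable transcript: {unstable:.1%} left after {ELAPSED_HOURS:g} h")
```

After two hours most of the stable transcript is still present, while the unstable one has gone through four half-lives. That is why short-lived mRNAs suit proteins whose levels need to change quickly.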
Alteration of the amino acid sequence of the encoded protein so that it differs from that predicted by the genomic DNA sequence (can also occur in non-coding regions).
A-I deamination: takes place by formation of a dsRNA due to complementary sequence, typically in an intron.
Editing role in mRNA stability? Upf interaction?
C-U deamination
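A minimal Python illustration of how a single deamination can change what the ribosome reads. Inosine is read as guanosine during translation, so an A-to-I edit is modelled here as A to G. The function name and the two example codons are chosen for illustration, and the codon table is deliberately limited to the four codons used.

```python
# Only the codons needed for the two examples (standard genetic code).
CODON_TABLE = {"CAG": "Gln", "CGG": "Arg", "CAA": "Gln", "UAA": "Stop"}

def edit_codon(codon, position, kind):
    """Apply one deamination to a codon and return the codon as translated.

    kind: "A-to-I" (inosine is read as G) or "C-to-U".
    """
    expected, read_as = {"A-to-I": ("A", "G"), "C-to-U": ("C", "U")}[kind]
    if codon[position] != expected:
        raise ValueError(f"{kind} editing needs {expected} at position {position}")
    return codon[:position] + read_as + codon[position + 1:]

# A-to-I at the middle base: glutamine codon is read as arginine.
a_to_i = edit_codon("CAG", 1, "A-to-I")
# C-to-U at the first base: glutamine codon becomes a stop codon.
c_to_u = edit_codon("CAA", 0, "C-to-U")

print("CAG", CODON_TABLE["CAG"], "->", a_to_i, CODON_TABLE[a_to_i])
print("CAA", CODON_TABLE["CAA"], "->", c_to_u, CODON_TABLE[c_to_u])
```

Both cases give a protein that differs from what the genomic DNA predicts: one swaps an amino acid, the other truncates the protein.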
For completeness I will mention that post-translational control is extensively regulated at each step.
E.g. neuro diseases such as Alzheimer's and Parkinson's: abnormally folded proteins that escape the cell's quality control can form disease-causing protein aggregates, such as are found in many neurodegenerative diseases.