Siddhartha Swarup
Jena
RAD/10-30
gttaaaattcagcaggcagaatgaaaataaatgtcaataattttttatt
t
taaaatattcatgttttactattttgatataatttttaaagaaaaaggc
a
gaaaccactgcttattagaaggcagattttattgattttatacccctag
a
cttgttgcatatcaaacctatgtaaaaacatctataaatcaaatcatta
a
ttgcacctagtataataattctatatatggaggtaatgtttgattcttc
a
ggagctttaataacttgaagcccgtttgattgctttaaaatgatttctc
a
ttgtatttgtttatattgtatcattaagcaaaagtacagagtaagcaat
t
agtgtgattaattcctcttccataatacagtaaagcactgcctccatag
a
ccaattctctgggatccctggaaaacatctggcatccagcaagtcttga
c
Contents
S S Jena
The Human Genome Project decided to sequence smaller
genomes as a warm-up for the human genome.
This resulted in the sequencing of:
Many bacteria
Model organism genomes
Yeast, C. elegans, Arabidopsis, Drosophila
Crop plants – Rice
Comparison of these genome sequences
provided the basis for the field of “Comparative
Genomics”.
What is comparative genomics?
Analyzing & comparing
genetic material from
different species to study
evolution, gene function,
and inherited disease.
To understand what
distinguishes
different species.
Comparative genomics prior to obtaining full
genome sequence
Genome size
• Compared DNA content among species
(DAPI staining, FACS).
Single-copy and repetitive DNA
• Used hybridization kinetics (Cot curve).
• Found that the amount of repetitive DNA differed
greatly among species.
What is compared?
Gene location
Gene structure
Exon number
Exon lengths
Intron lengths
Sequence similarity
Gene characteristics
Splice sites
Codon usage
Conserved synteny
What can homology tell us?
1. Identity of genes:
- The identity of the gene in another organism
- The identity of nearby genes
- The function of the gene (if annotated)
2. Suggestions of how the gene might be
causing disease
3. Infer ancestral relationships
4. Discover principles of evolution
Negative selection: The removal of
deleterious mutations from a population; also
referred to as purifying selection.
Positive selection: The retention of
mutations that benefit an organism; also
referred to as Darwinian selection.
Homologs: Features (including DNA and
protein sequences) in species being
compared that are similar because they are
ancestrally related.
Orthologs and Paralogs
• When comparing sequences from different
genomes, we must distinguish between two
types of closely related sequences:
• Orthologs are genes in different species
that descend from a single gene in their
common ancestor.
• Paralogs are genes found in the same
species that were created through gene
duplication events.
Orthologs and Paralogs
Figure: genes A, A’, A’’ and B, B’, B’’.
A & B: Orthologs
A’ & A’’, B’ & B’’: Paralogs
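Example
Figure: an early globin gene gives rise, through a first and a second
duplication event along a time axis (t = 0 to 3), to the alpha chain and
the beta chain found in frog and human (frog alpha, human alpha, human
beta, frog beta). The figure marks pairs of these genes as orthologs or
paralogs, and all of them as homologs.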
Synteny
Regions of two genomes that show
considerable similarity in terms of sequence
and conservation of the order of genes.
Genes that are in the same relative position
on two different chromosomes.
Closely related species generally have
similar order of genes on chromosomes.
Synteny can be used to identify genes in one
species based on map-position in another.
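The idea of conserved gene order can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `collinear_blocks`, the gene names, and the rule that orthologs must sit at strictly adjacent positions are all hypothetical choices, not taken from the slides or from any synteny tool.

```python
def collinear_blocks(order_a, order_b, orthologs, min_genes=3):
    """Find runs of genes in genome A whose orthologs are adjacent in genome B.

    order_a, order_b: gene names in chromosomal order (illustrative inputs).
    orthologs: dict mapping a gene of A to its ortholog in B.
    A run may be in the same or in the reversed orientation.
    """
    pos_b = {gene: i for i, gene in enumerate(order_b)}
    # Keep only genes of A whose ortholog has a position in B.
    anchors = [(g, pos_b[orthologs[g]]) for g in order_a
               if g in orthologs and orthologs[g] in pos_b]

    blocks, current, step = [], [], 0
    for gene, pos in anchors:
        if current:
            delta = pos - current[-1][1]
            # Extend the run if the next ortholog is the direct neighbour
            # and the direction (+1 or -1) stays the same.
            if abs(delta) == 1 and (step == 0 or delta == step):
                current.append((gene, pos))
                step = delta
                continue
            if len(current) >= min_genes:
                blocks.append([g for g, _ in current])
        current, step = [(gene, pos)], 0
    if len(current) >= min_genes:
        blocks.append([g for g, _ in current])
    return blocks


order_a = ["a1", "a2", "a3", "a4", "a5", "a6", "a7"]
order_b = ["b3", "b2", "b1", "bx", "b5", "b6", "b7"]
orthologs = {f"a{i}": f"b{i}" for i in range(1, 8)}
print(collinear_blocks(order_a, order_b, orthologs))
```

In this toy input the first block is inverted in genome B and the second keeps its orientation; real synteny detection tolerates gaps and small rearrangements, which this sketch does not.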
Synteny among crop genomes:
rice, maize and wheat
Maize: evolved from tetraploid ancestor
Source: Hedges SB (2002) The origin and evolution of model organisms.
Nature Reviews Genetics 3: 838-849.
The colored blocks represent synteny blocks.
Synteny of Mouse and Human genome
When sequences from the mouse and human
genomes were compared, regions of remarkable
synteny were found.
Genes are in almost identical order for long
stretches along the chromosome.
Figure: Human Chr 14 and Mouse Chr 14.
Synteny of Mouse & Human genome
Almost entirely
syntenic
Colinearity
The one-to-one linear correspondence
between the order of codons in a coding
sequence and the order of amino acids in the
encoded protein.
A linear map of mutation sites within a gene
corresponds to the linear location of amino
acid substitutions within the polypeptide
encoded by that gene.
Comparative genome analyses
demonstrated that gene orders among related
plant species have remained largely conserved over
millions of years of evolution.
Comparative genomics exploits both similarities
and differences in the proteins, RNA and
regulatory regions of different organisms to infer
how selection has acted upon these elements.
Those elements that are responsible for
similarities between different species should be
conserved through time (negative, or purifying,
selection), while those elements responsible for
differences among species should be divergent
(positive selection).
Principle cont.
The DNA sequences encoding the proteins and
RNAs responsible for functions that were
conserved from the last common ancestor
should be preserved in contemporary genome
sequences.
Likewise, the DNA sequences controlling the
expression of genes that are regulated similarly
in two related species should also be conserved.
Conversely, sequences that encode (or control
the expression of) proteins and RNAs
responsible for differences between species will
themselves be divergent.
Evolution and sequence conservation
If there are no constraints on a DNA sequence, random
mutations will accumulate.
Over tens of millions of years these random
mutations will make two related sequences
different.
e.g. Non-coding DNA that does not have a
regulatory function tends to diverge much more
rapidly than protein-coding DNA.
Function and sequence conservation
However, if there are constraints, e.g.
• DNA codes for a protein
• A transcription factor binds the DNA
• Replication origin
then there will be sequence similarity when
related sequences are compared.
Basic rule when comparing two related
sequences:
Sequence conservation suggests functional importance
Why do we annotate genomes?
• If we find the gene in a model organism (like
the rat), then we need to know what the
homolog is in humans.
• If we find the gene in a model organism, we
need to know if it’s doing the same thing in
humans.
• If we DON’T know what gene is implicated in
a disease, we can annotate ALL the genes in
a region and find candidates for further study
Comparison of genomic sequences from different
species can help to identify:
• Gene structure
• Gene function
• Regulatory sequences
• Interactions between gene products
Comparative Genomics Tools
• Similarity search programs
– BLAST2 (Basic Local Alignment Search Tool)
– FASTA
– MUMmer (Maximal Unique Match)
(Comparisons and analyses at both Nucleic acid and
protein level)
• Other alignment programs
– DBA [DNA Block Aligner] (Jareborg et al)
– blastz (Schwartz et al.)
– BLAT, AVID
– WABA [Wobble Aware Bulk Aligner]
– DIALIGN [Diagonal ALIGNment] (Morgenstern et al.)
– SSAHA [Sequence Search and Alignment by
Hashing Algorithm]
Comparative Genomics Tools
• Comparative gene prediction programs
– Twinscan
– Doublescan
– SGP-1
• Regulatory region prediction
– Consite
• Visualization/ Sequence analysis programs
– Dot plot (e.g. Dotter)
– PIP maker (Percent Identity Plot)
– Alfresco
– VISTA (VISualization Tools for Alignments)
– ACT (Artemis comparison tool)
General Databases Useful for Comparative Genomics
• Locus Link/RefSeq:
http://www.ncbi.nih.gov/LocusLink/
• PEDANT-Protein Extraction Description ANalysis Tool
http://pedant.gsf.de
• COGs - Cluster of Orthologous Groups (of proteins)
http://www.ncbi.nih.gov/COG/
• KEGG- Kyoto Encyclopedia of Genes and Genomes
http://www.genome.ad.jp/kegg/
• MBGD - Microbial Genome Database
http://mbgd.genome.ad.jp/
• GOLD - Genome OnLine Database
http://wit.integratedgenomics.com/GOLD/
• TIGR – The Institute for Genomic Research
Comparative genomics of parasites
COMPARATIVE GENOMICS - PROCESS
• Alignment of DNA sequences is the core
process in comparative genomics.
• An alignment is a mapping of the nucleotides
in one sequence onto the nucleotides in the
other sequence, with gaps introduced into one
or the other sequence to increase the number
of positions with matching nucleotides.
• Several powerful alignment algorithms have
been developed to align two or more
sequences.
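To make the definition of an alignment concrete, here is a minimal sketch of a classic global alignment by dynamic programming (the Needleman-Wunsch approach). The slides do not name a specific algorithm here, so this choice, the function name `global_align`, and the scoring values (match +1, mismatch -1, gap -2) are assumptions made for illustration only.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for pairing a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


top, bottom, total = global_align("ACGT", "AGT")
print(top)
print(bottom)
print(total)
```

The gap in the second sequence is what the slide means by "gaps introduced into one or the other sequence". Genome-scale tools use heuristics rather than this quadratic-time table.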
Sequence similarity search
• The most frequently performed type of sequence
comparison is the sequence similarity search.
• Sequence comparisons that implicate function
are widely used:
– To determine if a newly sequenced cDNA or
genomic region encodes a gene of known
function.
– To search for similar sequences in other species
(or in the same species)
Homology searches
• Search databases of DNA sequences
• Use computer algorithms to align sequences
– Don’t require perfect matches between
sequences
– Allow for insertions, deletions and base
changes
• Most commonly used algorithms:
– BLAST
– FASTA
BLAST
Pairwise genome comparison of protein
homologs (symmetrical best hits)
http://www.ncbi.nlm.nih.gov/sutils/geneplot.cgi
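The "symmetrical best hits" idea can be sketched as follows. This is a hedged illustration, not the method of the NCBI tool: the function name `reciprocal_best_hits`, the input layout (a dictionary of similarity scores per query protein), and all protein names and scores are made up for the example.

```python
def reciprocal_best_hits(hits_ab, hits_ba):
    """Return sorted (a, b) pairs that are each other's best-scoring hit.

    hits_ab: {protein_in_A: {protein_in_B: score}}  (illustrative layout)
    hits_ba: {protein_in_B: {protein_in_A: score}}
    """
    def best(hits):
        # Best-scoring target for every query that has at least one hit.
        return {q: max(targets, key=targets.get)
                for q, targets in hits.items() if targets}

    best_ab, best_ba = best(hits_ab), best(hits_ba)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Made-up similarity scores between two small proteomes.
hits_ab = {"a1": {"b1": 250.0, "b2": 90.0},
           "a2": {"b2": 180.0},
           "a3": {"b2": 120.0}}
hits_ba = {"b1": {"a1": 245.0},
           "b2": {"a2": 175.0, "a3": 118.0}}
print(reciprocal_best_hits(hits_ab, hits_ba))
```

Here `a3` is left out: its best hit is `b2`, but `b2` prefers `a2`, which is the kind of asymmetry expected when a gene has a paralog.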
Genome databases
• Genomes at NCBI, EBI, TIGR
NCBI comparative maps
Ensembl
Genome databases
Ensembl synteny views
• Gramene (http://www.gramene.org) is a comparative
genome mapping database for grasses and a
community resource for rice (Oryza sativa).
• It combines a semi-automatically generated
database of cereal genomic and expressed sequence
tag sequences, genetic maps, map relations, and
publications, with a curated database of rice mutants
(genes and alleles), molecular markers, and proteins.
• Gramene curators read and extract detailed
information from published sources, summarize that
information in a structured format, and establish links
to related objects both inside and outside the
database.
Map Search: Comparative Maps
Gramene scope
• Rice
– whole genome sequence
• Other crop grasses
– Maize
– Sorghum
– Millet
– Sugarcane
– Wheat
– Oats
– Barley
Synteny in Gramene
Genome analyses
• Variation in
– Genome size (E. coli: 4.6 Mbp; M. pneumoniae: 0.81 Mbp; B. subtilis: 4.20 Mbp)
– GC content (B. burgdorferi: 29%; M. tuberculosis: 68%)
– Codon usage
– Amino acid composition (G, A, P, R: GC rich; I, F, Y, M, D: AT rich)
– Genome organization
• Single circular chromosomes
• Linear chromosome + extrachromosomal elements
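Two of the quantities listed above, GC content and codon usage, are simple to compute from a sequence. The following is a small sketch; the function names `gc_content` and `codon_usage` and the example sequences are illustrative assumptions, not part of the slides.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (seq.count("G") + seq.count("C")) / counted


def codon_usage(cds):
    """Count codons in a coding sequence read in frame from position 0."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


print(gc_content("ATGC"))
print(codon_usage("ATGAAAAAATAA"))
```

Comparing these values across genomes, or between one gene and the rest of its genome, is one of the simplest forms of genome analysis.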
CG: Comparisons between genomes
• Strains of the same species
• Closely related species
• Distantly related species
– List of orthologs
– Evolution of individual genes
– Evolution of organisms
Comparison of the coding regions
• Begins with the gene identification algorithm:
infer what portions of the genomic sequence
actively code for genes.
• There are four basic approaches.
4 basic categories of gene identification programs
Category                                       Algorithm
1. Based on direct evidence of transcription   EST_GENOME, sim4
2. Based on homology with known genes          PROCRUSTES
3. Statistical or ab initio approaches         Genscan, FGENES, GeneMark, Glimmer
4. Using genome comparison                     TwinScan, Rosetta
‘Trait-to-gene’ approach
A bioinformatics approach to identify genes
involved in adaptive traits is called “Trait-to-gene”.
Assumption: New genes will be created to
perform tasks required for an adaptive response.
Underlying reasoning: organisms that share a
particular trait will share related genes.
Also used to identify two different genes that
serve the same function in different organisms.
Relating traits to genes
To compare genes among species, the Trait-to-
gene approach uses the COG database (Eugene
Koonin).
COG provides a method for identifying likely orthologues
when making whole-genome comparisons among
multiple species.
Important observations with regard to
Gene Order
• Order is highly conserved in closely related
species but gets changed by rearrangements.
• With greater evolutionary distance, the
correspondence between the gene order of
orthologous genes is lost.
• Groups of genes having similar biochemical
function tend to remain localized.
Finding regulatory regions
• Called phylogenetic footprinting (by analogy with
DNase footprinting)
• Functionally important regions are mutated less.
• These cis-regulatory motifs can be determined
by:
– Finding common motifs in orthologous sequences
– Aligning orthologous sequences first, then
identifying common regions
• Previously known motifs might help
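The "align first, then identify common regions" route can be sketched with a sliding window over an existing pairwise alignment. Everything specific here is an assumption for illustration: the function name `conserved_windows`, the window size, the identity threshold, and the toy aligned sequences.

```python
def conserved_windows(aln_a, aln_b, window=8, min_identity=0.9):
    """Return (start, end) column ranges whose windows meet the identity cutoff.

    aln_a, aln_b: two rows of a pairwise alignment (same length, '-' for gaps).
    end is exclusive; overlapping qualifying windows are merged.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")

    starts = []
    for start in range(len(aln_a) - window + 1):
        pairs = zip(aln_a[start:start + window], aln_b[start:start + window])
        # A column counts as identical only if both rows carry the same base.
        matches = sum(1 for x, y in pairs if x == y and x != "-")
        if matches / window >= min_identity:
            starts.append(start)

    regions = []
    for s in starts:
        if regions and s <= regions[-1][1]:
            regions[-1][1] = max(regions[-1][1], s + window)
        else:
            regions.append([s, s + window])
    return [tuple(r) for r in regions]


# Toy alignment: only the middle block is conserved.
row_a = "AAAAAAAA" + "TATAAATA" + "CCCCCCCC"
row_b = "GGGGGGGG" + "TATAAATA" + "TTTTTTTT"
print(conserved_windows(row_a, row_b, window=8, min_identity=1.0))
```

In practice the candidate regions found this way would still need to be compared against known motifs, as the last bullet above suggests.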
Regulatory region prediction
ConSite:
– Detection of transcription factor binding sites (TFBS)
conserved in corresponding genomic sequences
from different species
Visualization
Dot plot:
A graphical method for detailed comparison of
two sequences
Software for dot plots:
DNA Strider
Dotlet
Dotter
Dottup
Dot plot
– The X axis represents the
first sequence (PHO5),
– The Y axis represents the
second sequence (PHO3)
– A dot is plotted for each
match between two
residues of the sequences.
– Diagonal lines reveal
regions of identity between
the two sequences.
A dot plot is a simple graphical representation of identical
residues between two sequences.
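A dot plot as described above can be produced in a few lines. This is a minimal text-based sketch; the function name `dot_plot` and the optional `window` parameter (requiring a run of identical residues before a dot is drawn) are illustrative assumptions and not features of any of the programs listed earlier.

```python
def dot_plot(seq_x, seq_y, window=1):
    """Return the plot as a list of text rows, one row per position of seq_y.

    A '*' is placed at column x, row y when seq_x[x:x+window] equals
    seq_y[y:y+window]; otherwise a '.' is placed.
    """
    rows = []
    for y in range(len(seq_y) - window + 1):
        row = ""
        for x in range(len(seq_x) - window + 1):
            row += "*" if seq_x[x:x + window] == seq_y[y:y + window] else "."
        rows.append(row)
    return rows


for line in dot_plot("ACGTTGCA", "ACGTAGCA"):
    print(line)
```

Regions of identity show up as diagonal runs of `*`. Raising `window` removes most of the isolated dots that arise by chance with a four-letter alphabet.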
Hypothetical whole genome map dot plots
Whole genome map dot plots of
E. coli vs B. subtilis and E. coli vs S. typhimurium
Panel labels: Random distribution; Diagonal pattern
Sequence logo
A very useful representation of the conservation
patterns is the so-called sequence logo.
This shows the conserved residues as larger
characters, where the total height of a column is
proportional to how conserved that position is.
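The column heights of a DNA sequence logo are commonly computed as information content in bits: 2 bits minus the Shannon entropy of the column. The sketch below assumes that measure, omits the small-sample correction that logo software usually applies, and uses a hypothetical function name `column_information` and toy sequences.

```python
import math
from collections import Counter


def column_information(seqs, alphabet="ACGT"):
    """Information content (bits) of each column of equal-length aligned sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")

    max_bits = math.log2(len(alphabet))  # 2 bits for DNA
    info = []
    for column in zip(*seqs):
        counts = Counter(column)
        total = sum(counts[base] for base in alphabet)
        if total == 0:
            info.append(0.0)  # column holds only gaps or unknown symbols
            continue
        entropy = 0.0
        for base in alphabet:
            if counts[base]:
                p = counts[base] / total
                entropy -= p * math.log2(p)
        info.append(max_bits - entropy)
    return info


# First column fully conserved, second column completely variable.
print(column_information(["AC", "AG", "AT", "AA"]))
```

A fully conserved column reaches the maximum height and a column where all four bases are equally frequent has height zero, which matches the description of the logo above.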
Applications of
comparative genomics
1. Gene prediction
2. Regulatory region prediction
3. Interaction mapping
How do genome comparisons help?
• When comparing genomes of different species
– Genes normally have the same exon/intron structure
(Neutral theory of evolution)
• Look for ORFs that are conserved in both
genomes
• Frequently permits accurate identification of genes
– Fugu/human comparison: found >1000 genes that
had been missed by annotation
– Mouse/human comparison indicated only about 30,000
genes in the genome
How do genome comparisons help? Cont.
• Comparison of the fruit fly genome with the human
genome showed that about 60 percent of genes
are conserved between fly and human.
• Virtually all (99%) of the protein-coding genes in
humans align with homologs in mouse, and over
80% are clear 1:1 orthologs. In most cases, the
intron-exon structures are highly conserved.
• The finding that the three wheat genomes have
highly similar gene content and order was the first
demonstration of colinearity in the grasses and a
pivotal finding in the development of comparative
genomics.
Sequence comparison example
Comparison of the human and mouse spermidine
synthase genes revealed an additional intron in
the human gene that is not found in the mouse
homologue.
Figure labels: Human, Mouse, 5,500 bp.
Relationships among the Genomes of
Rice, Foxtail Millet, and Pearl Millet
CG helps genome annotations
In prokaryotes, finding genes is relatively
easy based on open reading frames (ORFs).
In eukaryotes, we have to look for ORFs,
exons, introns, splice sites, polyA sites.
Difficulties:
• Predicted exons sometimes do not exist
• Pseudogenes
• Alternative splicing
Merit: In different species, the genes normally
have similar exon-intron structure
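The prokaryotic case, where genes can be approximated by open reading frames, can be sketched as a scan for in-frame ATG ... stop stretches. This is a deliberately simplified illustration: the function name `find_orfs` is hypothetical, only the forward strand is scanned, ATG is the only start codon considered, and the standard stop codons TAA, TAG and TGA are assumed.

```python
def find_orfs(seq, min_codons=3):
    """Return sorted (start, end) ORF coordinates on the forward strand.

    start is the index of the A of ATG; end is exclusive and includes
    the stop codon. min_codons counts the start and stop codons as well.
    """
    seq = seq.upper()
    stops = {"TAA", "TAG", "TGA"}
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens the reading frame
            elif start is not None and codon in stops:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


sequence = "CCATGAAATTTTAGCC"
for begin, end in find_orfs(sequence):
    print(begin, end, sequence[begin:end])
```

In eukaryotes this approach breaks down because introns interrupt the reading frame, which is why the slide lists splice sites, exons and introns as additional signals.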
Finding regulatory sequences
Regulatory sequences are difficult to identify
using computer programs.
Problems are:
• Most enhancer sequences have yet to be
identified
• They are usually short: 6-10 base pairs
• Those that are known are usually degenerate
– They can differ in one or more base pairs
– Still bind the cognate transcription factor
Comparisons to identify regulatory elements
Comparisons of genomes of different species
can identify cis-regulatory elements. (Neutral
theory of evolution)
Changes in intergenic regions and introns are
usually more rapid than in coding regions.
Nevertheless, regulatory elements tend to be
conserved (because these sequences bind TFs).
Conserved intergenic sequences identified by
aligning genomic regions of orthologs are
called “phylogenetic footprints” (by analogy with
DNase footprinting).
Interaction mapping
• A remarkable use of comparative genomics is
to identify interacting proteins.
• Protein-protein interactions are critical for
cellular functions like
– Transfer of information in a genetic pathway
– Scaffolding to tether other proteins
– Enzymatic reactions (multi-subunit enzymes)
– Large molecular machines such as motors
Rosetta Stone
• Interacting proteins are encoded by a single gene
in some species, whereas in other species the
same proteins are encoded by two separate genes.
• A systematic search through sequenced genomes
for these relationships should identify proteins
that interact.
• This method is called the “Rosetta Stone” approach
Rosetta Stone example
• The E. coli equivalent of the yeast protein
topoisomerase II is
encoded by two genes:
gyrase A and gyrase B.
• This suggests that gyrase B and
gyrase A interact in E. coli
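The search described above can be sketched as follows, assuming similarity hits between the separate proteins of one genome and the candidate fused proteins of another are already available. The function name `rosetta_stone_pairs`, the input layout, and every coordinate below are hypothetical and chosen only to mirror the gyrase example; they are not measured values.

```python
from itertools import combinations


def rosetta_stone_pairs(fused_hits):
    """Propose interacting protein pairs from hits on a fused protein.

    fused_hits: {fused_protein: [(protein, start, end), ...]} where each
    entry says that `protein` matches residues start..end of the fused one.
    Two different proteins matching non-overlapping regions of the same
    fused protein are reported as a candidate interacting pair.
    """
    pairs = set()
    for fused, hits in fused_hits.items():
        for (p, s1, e1), (q, s2, e2) in combinations(hits, 2):
            no_overlap = e1 <= s2 or e2 <= s1
            if p != q and no_overlap:
                pairs.add((min(p, q), max(p, q), fused))
    return sorted(pairs)


# Illustrative coordinates only.
hits = {
    "yeast_TOP2": [("gyrB", 1, 650), ("gyrA", 700, 1400)],
    "other_protein": [("p1", 1, 100), ("p2", 50, 150)],  # overlapping: ignored
}
print(rosetta_stone_pairs(hits))
```

The overlap test matters: two proteins matching the same region of a third are more likely paralogs of each other than partners fused in another species.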
1. Identification and mapping of
Leucine-rich repeat resistance gene analogs
in Bermuda grass
• 31 Bermuda grass (Cynodon spp.) disease
resistance gene analogs (BRGA) were
cloned and sequenced from diploid, triploid,
tetraploid and hexaploid Bermuda grass using
degenerate primers to target the Nucleotide
Binding Site (NBS) of the NBS-Leucine Rich
Repeat (LRR) resistance family.
(Harris et al., (2010) J.Amer.Soc.135:74-82)
2. Synteny between the centromeric regions
of wheat and rice
• It was recently discovered that rice centromeres contain
genes. This helps in studying centromere homologies
between wheat and rice chromosomes by mapping
rice centromeric regions onto wheat aneuploid stocks.
• Genome-wide comparison of wheat ESTs that were
mapped to centromeric regions against rice genome
sequences revealed high conservation and one-to-one
correspondence of centromeric regions between
wheat and rice chromosome pairs W1-R5, W2-R7,
W3-R1, W5-R12, W6-R2 and W7-R8
(Qi et al., (2009) Genetics 183:1235-1247)
3. Sequencing and comparative analysis of
conserved syntenic segment (CSS)
in the Solanaceae
• Wang et al. reported the generation and analysis of
sequences for an unduplicated conserved syntenic
segment (CSS) in the genomes of five members of
the Solanaceae.
• This analysis indicates 30 million years of plant
evolution in the absence of polyploidization.
• The sequenced segments of the potato, tomato,
pepper, eggplant, and petunia genomes are
shown alongside corresponding regions of the
Arabidopsis (At) genome.
(Wang et al., (2008) Genetics 180:391-408)
Conserved syntenic segment (CSS) in five species of Solanaceae
4. Comparative physical mapping of Rice
A comparative physical map between Oryza sativa
(AA genome type) and Oryza punctata (BB genome
type) was constructed by aligning the physical map of O.
punctata onto the O. sativa genome sequence.
The level of conservation of each genome between the two
species was determined.
The alignment suggests more divergence of intergenic
and repeat regions in comparison to gene-rich regions.
The genome of O. punctata was 8% larger than that of O. sativa,
with individual chromosome differences of 1.5 to 16.5%.
(Kim et al., (2007) Genetics:379-390)
Alignment view of the comparative physical map of O. punctata
(BB genome type) and O. sativa (AA genome type) using SyMap
What is the difference between man and ape?
Man and chimpanzee have a
genome-wide similarity of greater
than 95%.
What accounts for the differences between
the species?
A recent study suggests that they are due
to specific gene expression
differences.
– Striking differences found only in
brain
Human/ape gene expression comparisons
Panels: Blood tissue, Brain tissue, Liver tissue
CASE STUDY
Objective of the research
Classification and phylogenetic analysis of
phytohormone-related genes, from
metabolism enzymes to receptors and
signaling components, in different species.
Abstract of the work
Genetic and molecular studies in the model organism
Arabidopsis thaliana have revealed the individual
pathways of various plant hormone responses.
479 genes that were convincingly associated
with various hormone actions were selected.
By using these 479 genes as queries, a genome-wide
search for their orthologues in several species
(microorganisms, plants and animals) was performed.
Meanwhile, a comparative analysis was conducted to
evaluate their evolutionary relationship.
Result and discussion
Phylogenetic tree generated from orthologue
genes, using orthologue gene similarity as
compared to A. thaliana hormone-related genes
Protein sequence phylogenetic
tree
Distribution of orthologues in function category
of hormone-related genes in different species
The height of each bar, shown in a different color, represents the
percentage of orthologue genes in AHRG (A. thaliana hormone-related
genes) of selected plants as
compared to that of A. thaliana.
Blue – orthologues belonging to hormone metabolism related genes;
Purple – orthologues belonging to hormone transport genes;
Light yellow – orthologues belonging to genes related to signal
transduction.
Comparison of the copy numbers of AHRG orthologues in cereals –
Rice, S. bicolor, P. trichocarpa, and A. thaliana.
Different colors represent ratios that are calculated by the number of
orthologue genes in selected species versus the number of AHRG in
A. thaliana.
Conclusion
The metabolisms and functions of plant hormones are
generally more sophisticated and diversified in higher
plant species.
In particular, several phytohormone receptors and key
signaling components were not present in lower plants
or animals.
Meanwhile, as the genome complexity increases, the
orthologue genes tend to have more copies and
probably gain more diverse functions.
CASE STUDY - 2
Abstract of the research
Plant disease resistance (R) loci frequently lack
synteny between related species of cereals and
crucifers but appear to be positionally well conserved
in the Solanaceae.
In this report, a local RGA approach is adopted using
genomic information from the model Solanaceous
plant tomato to isolate R3a, a potato gene that confers
race-specific resistance to the late blight pathogen
Phytophthora infestans.
Results and discussion
The genomic regions harboring the R3 late
blight resistance locus in potato and the I2
Fusarium wilt resistance locus in tomato are
colinear (Huang et al., 2004).
A cluster of I2 gene analogues
(I2GAs) was identified in potato.
This potato I2GA cluster positionally
corresponds to the SL8D cluster of the I2
complex locus in tomato and was therefore
named the St-I2 cluster.
Results cont.
R3a candidates were identified using a local
resistance gene analogue (RGA) approach.
The syntenic relationships of R gene clusters are
highlighted using gray rectangles.
To identify I2GAs physically close to R3a, an
association analysis on bacterial artificial
chromosome (BAC) pools was conducted.
Figure: Comparative genetic maps of the
I2 complex locus in tomato and
the R3 complex locus in potato.
Results cont.
Figure: In vitro inoculation of the primary transformants of R3a. Massive
sporulation (S) and localized hypersensitive reactions (HR) are
observed on compatible and incompatible interactions, respectively.
Conclusion
In this study, genomic information from tomato was
used to isolate the potato late blight resistance gene
R3a from an ancient locus involved in plant innate
immunity in the Solanaceae.
Comparative analyses of the R3 complex locus with
the corresponding I2 complex locus in tomato suggest
that this is an ancient locus involved in plant innate
immunity against oomycetes and fungal pathogens.
However, the R3 complex locus has evolved after
divergence from tomato and the locus has experienced
a significant expansion in potato without disruption of
the flanking colinearity.
LIMITATIONS OF CG
Homologous genes are relatively well preserved
while noncoding regions tend to show varying
degrees of conservation.
Cross-species comparative genomics is influenced
by the evolutionary distance between the compared
species.
Genetic drift: how can we tell which differences are
really due to selection and important to organism function,
and which are a result of genetic drift?
Computationally intensive: large amounts of data
are being compared, and tools to
process and compare genomes are still being developed.
In order for the comparisons to be statistically relevant,
many more genomes will need to be sequenced.
The goal of comparative genomics
Due to our current ability
to annotate genomes we
can precisely place a list
of genes on the
chromosome, resulting
in something similar to a
set of lights of uniform
color and intensity.
We will be able to tell
what each gene does
and when and where it is
expressed.
Issues for the future
• Faster/better algorithms for aligning vertebrate
genomes
• Multiple alignments
– Comparing several species can give clues to which
regulatory sequences are of a basic nature, and
which are lineage specific
• Cataloguing of comparative data
• Better visualisation
– Whole syntenic region <> nucleotide level
– Multiple genome sequences
QUERIES…

Más contenido relacionado

La actualidad más candente

STRUCTURAL GENOMICS, FUNCTIONAL GENOMICS, COMPARATIVE GENOMICS
STRUCTURAL GENOMICS, FUNCTIONAL GENOMICS, COMPARATIVE GENOMICSSTRUCTURAL GENOMICS, FUNCTIONAL GENOMICS, COMPARATIVE GENOMICS
STRUCTURAL GENOMICS, FUNCTIONAL GENOMICS, COMPARATIVE GENOMICSSHEETHUMOLKS
 
Functional genomics
Functional genomicsFunctional genomics
Functional genomicsajay301
 
Forward and reverse genetics
Forward and reverse geneticsForward and reverse genetics
Forward and reverse geneticsSachin Ekatpure
 
Applications of genomics and proteomics ppt
Applications of genomics and  proteomics pptApplications of genomics and  proteomics ppt
Applications of genomics and proteomics pptIbad khan
 
Comparative genomics
Comparative genomicsComparative genomics
Comparative genomicsAthira RG
 
Genomics(functional genomics)
Genomics(functional genomics)Genomics(functional genomics)
Genomics(functional genomics)IndrajaDoradla
 
The ensembl database
The ensembl databaseThe ensembl database
The ensembl databaseAshfaq Ahmad
 
PHYSICAL MAPPING STRATEGIES IN GENOMICS
PHYSICAL MAPPING STRATEGIES IN GENOMICSPHYSICAL MAPPING STRATEGIES IN GENOMICS
PHYSICAL MAPPING STRATEGIES IN GENOMICSUsman Arshad
 
Orthologs,Paralogs & Xenologs
 Orthologs,Paralogs & Xenologs  Orthologs,Paralogs & Xenologs
Orthologs,Paralogs & Xenologs OsamaZafar16
 
ZINC FINGER NUCLEASE TECHNOLOGY
ZINC FINGER NUCLEASE TECHNOLOGYZINC FINGER NUCLEASE TECHNOLOGY
ZINC FINGER NUCLEASE TECHNOLOGYPriyesh Waghmare
 
Presentation on marker genes
Presentation on marker genesPresentation on marker genes
Presentation on marker genesTasmina Susmi
 

La actualidad más candente (20)

STRUCTURAL GENOMICS, FUNCTIONAL GENOMICS, COMPARATIVE GENOMICS
STRUCTURAL GENOMICS, FUNCTIONAL GENOMICS, COMPARATIVE GENOMICSSTRUCTURAL GENOMICS, FUNCTIONAL GENOMICS, COMPARATIVE GENOMICS
STRUCTURAL GENOMICS, FUNCTIONAL GENOMICS, COMPARATIVE GENOMICS
 
TRANSPOSON TAGGING
TRANSPOSON TAGGINGTRANSPOSON TAGGING
TRANSPOSON TAGGING
 
Structural genomics
Structural genomicsStructural genomics
Structural genomics
 
Functional genomics
Functional genomicsFunctional genomics
Functional genomics
 
Genomics
GenomicsGenomics
Genomics
 
Forward and reverse genetics
Forward and reverse geneticsForward and reverse genetics
Forward and reverse genetics
 
Genome annotation
Genome annotationGenome annotation
Genome annotation
 
Applications of genomics and proteomics ppt
Applications of genomics and  proteomics pptApplications of genomics and  proteomics ppt
Applications of genomics and proteomics ppt
 
Comparative genomics
Comparative genomicsComparative genomics
Comparative genomics
 
Comparative genomics
Comparative genomicsComparative genomics
Comparative genomics
 
Genomics(functional genomics)
Genomics(functional genomics)Genomics(functional genomics)
Genomics(functional genomics)
 
The ensembl database
The ensembl databaseThe ensembl database
The ensembl database
 
PHYSICAL MAPPING STRATEGIES IN GENOMICS
PHYSICAL MAPPING STRATEGIES IN GENOMICSPHYSICAL MAPPING STRATEGIES IN GENOMICS
PHYSICAL MAPPING STRATEGIES IN GENOMICS
 
YEAST TWO HYBRID SYSTEM
 YEAST TWO HYBRID SYSTEM YEAST TWO HYBRID SYSTEM
YEAST TWO HYBRID SYSTEM
 
Orthologs,Paralogs & Xenologs
 Orthologs,Paralogs & Xenologs  Orthologs,Paralogs & Xenologs
Orthologs,Paralogs & Xenologs
 
Genome mapping
Genome mapping Genome mapping
Genome mapping
 
ZINC FINGER NUCLEASE TECHNOLOGY
ZINC FINGER NUCLEASE TECHNOLOGYZINC FINGER NUCLEASE TECHNOLOGY
ZINC FINGER NUCLEASE TECHNOLOGY
 
Presentation on marker genes
Presentation on marker genesPresentation on marker genes
Presentation on marker genes
 
Physical mapping
Physical mappingPhysical mapping
Physical mapping
 
Gene isolation methods
Gene isolation methodsGene isolation methods
Gene isolation methods
 

Destacado

Comparative genomics presentation
Comparative genomics presentationComparative genomics presentation
Comparative genomics presentationEmmanuel Aguon
 
What is comparative genomics
What is comparative genomicsWhat is comparative genomics
What is comparative genomicsUsman Arshad
 
Comparative genomics and proteomics
Comparative genomics and proteomicsComparative genomics and proteomics
Comparative genomics and proteomicsNikhil Aggarwal
 
Var identificatn with lab tech @ sid
Var identificatn with lab tech @ sidVar identificatn with lab tech @ sid
Var identificatn with lab tech @ sidsidjena70
 
Comparative Genomics with GMOD and BioPerl
Comparative Genomics with GMOD and BioPerlComparative Genomics with GMOD and BioPerl
Comparative Genomics with GMOD and BioPerlJason Stajich
 
Darwin’s theory of evolution
Darwin’s theory of evolutionDarwin’s theory of evolution
Darwin’s theory of evolutionSteve Johnson
 
Lyn pugecte bereket folder
Lyn pugecte  bereket folderLyn pugecte  bereket folder
Lyn pugecte bereket folderbreket
 
Us case law joey femia
Us case law joey femiaUs case law joey femia
Us case law joey femiaJoeyFemiaa
 
Best friends(1)
Best friends(1)Best friends(1)
Best friends(1)yannah1103
 
Profile of audience
Profile of audienceProfile of audience
Profile of audiencerobynnwardd
 
AppCircus - Badaboom A Dino's Rhythm Game
AppCircus - Badaboom A Dino's Rhythm GameAppCircus - Badaboom A Dino's Rhythm Game
AppCircus - Badaboom A Dino's Rhythm GamePedro Kayatt
 
News & Videos about Web -- CNN.com
News & Videos about Web -- CNN.comNews & Videos about Web -- CNN.com
News & Videos about Web -- CNN.comeducatedcommuni79
 
SPRV #2 - São Paulo Realidade Virtual - Boas práticas
SPRV #2 - São Paulo Realidade Virtual - Boas práticasSPRV #2 - São Paulo Realidade Virtual - Boas práticas
SPRV #2 - São Paulo Realidade Virtual - Boas práticasPedro Kayatt
 
The Host opening analysis
The Host opening analysisThe Host opening analysis
The Host opening analysisrobynnwardd
 

Comparative genomics @ sid 2003 format

  • 4. The Human Genome Project chose to sequence smaller genomes as a warm-up for the human genome. This resulted in the sequencing of:
      - Many bacteria
      - Model organism genomes: yeast, C. elegans, Arabidopsis, Drosophila
      - Crop plants: rice
    Comparison of these genome sequences provided the basis for the field of "comparative genomics". S S Jena
  • 5. What is comparative genomics? Analyzing and comparing genetic material from different species to study evolution, gene function, and inherited disease, and to understand what makes each species unique.
  • 6. Comparative genomics prior to obtaining full genome sequences
      - Genome size: DNA content was compared among species (DAPI staining, FACS).
      - Single-copy and repetitive DNA: measured by hybridization kinetics (Cot curves); the amount of repetitive DNA was found to differ greatly among species.
  • 7. What is compared? Gene location, gene structure, exon number, exon lengths, intron lengths, sequence similarity, gene characteristics, splice sites, codon usage, and conserved synteny.
  • 8. What can homology tell us?
      1. Identity of genes: the identity of the gene in another organism, the identity of nearby genes, and the function of the gene (if annotated)
      2. Suggestions of how the gene might be causing disease
      3. Ancestral relationships
      4. Principles of evolution
  • 9. Negative selection: the removal of deleterious mutations from a population; also referred to as purifying selection.
      Positive selection: the retention of mutations that benefit an organism; also referred to as Darwinian selection.
      Homologs: features (including DNA and protein sequences) in the species being compared that are similar because they are ancestrally related.
  • 10. Orthologs and Paralogs: when comparing sequences from different genomes, we must distinguish between two types of closely related sequences.
      - Orthologs are genes in different species that descend from a single gene in their last common ancestor (they diverged through speciation).
      - Paralogs are genes within the same species that arose through gene duplication events.
  • 11. Orthologs and Paralogs (diagram): A and B are orthologs; A′ and A″, and B′ and B″, are paralogs.
  • 13. Synteny
      - Regions of two genomes that show considerable similarity in sequence and in the conserved order of genes.
      - Syntenic genes occupy the same relative position on chromosomes of two different genomes.
      - Closely related species generally have a similar order of genes on their chromosomes.
      - Synteny can be used to identify genes in one species based on map position in another.
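The idea of conserved gene order can be sketched in code. The following is a minimal illustration, not the method of any particular synteny tool: it assumes ortholog pairs have already been identified and are given as hypothetical (position in genome A, position in genome B) gene indices, and it only groups strictly adjacent genes. Real tools generally tolerate gaps and small rearrangements.

```python
def collinear_blocks(orthologs, min_genes=2):
    """Group ortholog pairs into runs where gene order is conserved.

    orthologs: list of (index_in_genome_a, index_in_genome_b) pairs.
    A run continues while both indices advance by exactly one gene,
    in the same or the inverted orientation in genome B.
    """
    blocks, current, direction = [], [], 0
    for pair in sorted(orthologs):
        if current:
            step_a = pair[0] - current[-1][0]
            step_b = pair[1] - current[-1][1]
            if step_a == 1 and abs(step_b) == 1 and direction in (0, step_b):
                current.append(pair)
                direction = step_b
                continue
            # The run is broken: keep it only if it is long enough.
            if len(current) >= min_genes:
                blocks.append(current)
        current, direction = [pair], 0
    if len(current) >= min_genes:
        blocks.append(current)
    return blocks


# Invented gene indices: one forward block, one inverted block, one stray gene.
pairs = [(0, 10), (1, 11), (2, 12), (3, 40), (4, 39), (5, 38), (6, 5)]
blocks = collinear_blocks(pairs)
for block in blocks:
    print(block)
```

Tracking the direction of the run lets an inverted segment count as one block instead of being split at every gene.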
  • 14. Synteny among crop genomes: rice, maize and wheat. Maize evolved from a tetraploid ancestor.
  • 15. The origin and evolution of model organisms. Hedges SB (2002) Nature Reviews Genetics 3: 838-849. The colored blocks represent synteny blocks.
  • 16. Synteny of mouse and human genomes: when sequences from the mouse and human genomes were compared, regions of remarkable synteny were found. Genes are in almost identical order for long stretches along the chromosome. (Figure: human Chr 14 and mouse Chr 14.)
  • 17. Synteny of mouse and human genomes: almost entirely syntenic.
  • 18. Colinearity
      - The one-to-one linear correspondence between the order of codons in a coding sequence and the order of amino acids in the encoded protein.
      - A linear map of mutation sites within a gene corresponds to the linear location of amino acid substitutions within the polypeptide encoded by that gene.
      - Comparative genome analyses have shown that gene order among related plant species has remained largely conserved over millions of years of evolution.
  • 19. Comparative genomics exploits both similarities and differences in the proteins, RNAs and regulatory regions of different organisms to infer how selection has acted upon these elements. Elements responsible for similarities between species should be conserved through time (negative, or purifying, selection), while elements responsible for differences among species should be divergent (positive selection).
  • 20. Principle (cont.): the DNA sequences encoding the proteins and RNAs responsible for functions that were conserved from the last common ancestor should be preserved in contemporary genome sequences. Likewise, the DNA sequences controlling the expression of genes that are regulated similarly in two related species should also be conserved. Conversely, sequences that encode (or control the expression of) proteins and RNAs responsible for differences between species will themselves be divergent.
  • 21. Evolution and sequence conservation: if there are no constraints on a DNA sequence, random mutations accumulate. Over tens of millions of years these mutations make two related sequences different. For example, non-coding DNA without a regulatory function tends to diverge much more rapidly than protein-coding DNA.
  • 22. Function and sequence conservation: if there are constraints, however, related sequences will still show similarity when compared. Examples of constrained sequence:
      - DNA that codes for protein
      - DNA bound by a transcription factor
      - A replication origin
    Basic rule when comparing two related sequences: sequence conservation = functional importance.
  • 23. Why do we annotate genomes?
      - If we find a gene in a model organism (such as the rat), we need to know what its homolog is in humans.
      - If we find a gene in a model organism, we need to know whether it does the same thing in humans.
      - If we do not know which gene is implicated in a disease, we can annotate all the genes in a region and find candidates for further study.
  • 24. Comparison of genomic sequences from different species can help to identify gene structure, gene function, regulatory sequences, and interactions between gene products.
  • 25. Comparative Genomics Tools
      - Similarity search programs (comparisons and analyses at both the nucleic acid and protein level): BLAST2 (Basic Local Alignment Search Tool), FASTA, MUMmer (Maximal Unique Match)
      - Other alignment programs: DBA (DNA Block Aligner; Jareborg et al.), blastz (Schwartz et al.), BLAT/AVID, WABA (Wobble Aware Bulk Aligner), DIALIGN (Diagonal ALIGNment; Morgenstern et al.), SSAHA (Sequence Search and Alignment by Hashing Algorithm)
  • 26. Comparative Genomics Tools
      - Comparative gene prediction programs: Twinscan, Doublescan, SGP-1
      - Regulatory region prediction: Consite
      - Visualization and sequence analysis programs: dot plot (e.g. Dotter), PIP maker (Percent Identity Plot), Alfresco, VISTA (VISualization Tools for Alignments), ACT (Artemis Comparison Tool)
  • 27. General Databases Useful for Comparative Genomics
      - LocusLink/RefSeq: http://www.ncbi.nih.gov/LocusLink/
      - PEDANT (Protein Extraction, Description and ANalysis Tool): http://pedant.gsf.de
      - COGs (Clusters of Orthologous Groups of proteins): http://www.ncbi.nih.gov/COG/
      - KEGG (Kyoto Encyclopedia of Genes and Genomes): http://www.genome.ad.jp/kegg/
      - MBGD (Microbial Genome Database): http://mbgd.genome.ad.jp/
      - GOLD (Genomes OnLine Database): http://wit.integratedgenomics.com/GOLD/
      - TIGR (The Institute for Genomic Research): comparative genomics of parasites
  • 28. Comparative Genomics: Process
      - Alignment of DNA sequences is the core process in comparative genomics.
      - An alignment is a mapping of the nucleotides in one sequence onto the nucleotides in the other, with gaps introduced into one or the other sequence to increase the number of positions with matching nucleotides.
      - Several powerful alignment algorithms have been developed to align two or more sequences.
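To make the definition of an alignment concrete, here is a minimal sketch of global pairwise alignment by dynamic programming (the Needleman-Wunsch approach). The slides do not name a specific algorithm at this point, so the choice of method, the scoring values (match +1, mismatch -1, gap -2) and the example sequences are all assumptions made for illustration.

```python
def global_align(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns (aligned_a, aligned_b, score) for one optimal alignment.
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align two residues
                score[i - 1][j] + gap,      # gap in seq_b
                score[i][j - 1] + gap,      # gap in seq_a
            )

    # Trace back from the bottom-right corner to recover the alignment.
    out_a, out_b = [], []
    i, j = len(seq_a), len(seq_b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(seq_a[i - 1])
                out_b.append(seq_b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[-1][-1]


aligned_a, aligned_b, total = global_align("GATTACA", "GATACA")
print(aligned_a)
print(aligned_b)
print("score:", total)
```

The gap character in the second output line marks the position where a nucleotide was inserted or deleted relative to the first sequence.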
  • 29. Sequence similarity search
      - The most frequently performed type of sequence comparison is the sequence similarity search.
      - Sequence comparisons that implicate function are widely used to determine whether a newly sequenced cDNA or genomic region encodes a gene of known function, and to search for similar sequences in other species (or in the same species).
  • 30. Homology searches
      - Search databases of DNA sequences.
      - Use computer algorithms to align sequences; these do not require perfect matches and allow for insertions, deletions and base changes.
      - Most commonly used algorithms: BLAST, FASTA
  • 32. Pairwise genome comparison of protein homologs (symmetrical best hits): http://www.ncbi.nlm.nih.gov/sutils/geneplot.cgi
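The "symmetrical best hits" criterion can be sketched as follows. This is an illustrative outline only, not the implementation behind the NCBI page: it assumes similarity scores between proteins of two genomes are already available (for instance from a similarity search), and the gene names and scores below are invented.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring subject gene."""
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical similarity scores, keyed by (query gene, subject gene).
a_vs_b = {("a1", "b1"): 250.0, ("a1", "b2"): 90.0,
          ("a2", "b2"): 180.0, ("a3", "b2"): 120.0}
b_vs_a = {("b1", "a1"): 248.0, ("b2", "a2"): 181.0,
          ("b2", "a3"): 119.0, ("b3", "a3"): 60.0}

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

Gene a3 is dropped because its best hit, b2, prefers a2; requiring agreement in both directions is what makes the hit symmetrical.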
  • 33. Genome databases: genomes at NCBI, EBI, TIGR
  • 37. Gramene (http://www.gramene.org) is a comparative genome mapping database for grasses and a community resource for rice (Oryza sativa). It combines a semi-automatically generated database of cereal genomic and expressed sequence tag sequences, genetic maps, map relations, and publications with a curated database of rice mutants (genes and alleles), molecular markers, and proteins. Gramene curators read and extract detailed information from published sources, summarize that information in a structured format, and establish links to related objects both inside and outside the database.
  • 40. Gramene scope / Synteny in Gramene
      - Rice: whole genome sequence
      - Other crop grasses: maize, sorghum, millet, sugarcane, wheat, oats, barley
  • 41. Genome analyses: variation in
      - Genome size: E. coli 4.6 Mbp; M. pneumoniae 0.81 Mbp; B. subtilis 4.20 Mbp
      - GC content: B. burgdorferi 29%; M. tuberculosis 68%
      - Codon usage
      - Amino acid composition: G, A, P, R are GC rich; I, F, Y, M, D are AT rich
      - Genome organization: single circular chromosomes; linear chromosome plus extrachromosomal elements
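Two of the quantities listed above, GC content and codon usage, are simple to compute from sequence. The sketch below is a minimal illustration; the function names and the short example sequences are invented, and ambiguous bases such as N are simply ignored when computing GC content.

```python
from collections import Counter


def gc_content(sequence):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def codon_usage(coding_sequence):
    """Count codons in a coding sequence read from its first base.

    Trailing bases that do not fill a complete codon are ignored.
    """
    seq = coding_sequence.upper()
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


print(gc_content("GGCCAATT"))
print(codon_usage("ATGGCCGCCTAA"))
```

Comparing these values between genomes gives the kind of variation the slide refers to, for example a low GC fraction in one species against a high one in another.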
  • 42. CG: Comparisons between genomes
      - Strains of the same species
      - Closely related species
      - Distantly related species: list of orthologs, evolution of individual genes, evolution of organisms
  • 43. Comparison of the coding regions
      - Begins with gene identification: inferring which portions of the genomic sequence code for genes.
      - There are four basic categories of gene identification programs:
          1. Based on direct evidence of transcription: EST_GENOME, sim4
          2. Based on homology with known genes: PROCRUSTES
          3. Statistical or ab initio approaches: Genscan, FGENES, GeneMark, Glimmer
          4. Using genome comparison: TwinScan, Rosetta
  • 44. 'Trait-to-gene' approach
      - A bioinformatics approach to identifying genes involved in adaptive traits.
      - Assumption: new genes will be created to perform tasks required for an adaptive response.
      - Underlying reasoning: organisms that share a particular trait will share related genes.
      - Also used to identify two different genes that serve the same function in different organisms.
  • 45. Relating traits to genes: to compare genes among species, the trait-to-gene approach uses the COG database (Eugene Koonin). It provides a method for identifying likely orthologues when making whole-genome comparisons among multiple species.
  • 46. Important observations with regard to gene order
      - Order is highly conserved in closely related species but is changed by rearrangements.
      - With greater evolutionary distance, there is no correspondence between the gene order of orthologous genes.
      - Groups of genes with similar biochemical function tend to remain localized.
  • 47. Finding regulatory regions
      - Called phylogenetic footprinting (by analogy with DNase footprinting).
      - Functionally important regions are mutated less.
      - These cis-regulatory motifs can be determined by finding common motifs in orthologous sequences, or by aligning orthologous sequences first and then identifying common regions.
      - Previously known motifs might help.
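The "align first, then identify common regions" route can be sketched as a sliding-window scan over an existing multiple alignment. This is a simplified illustration, not the procedure of any named footprinting tool: the window size, the identity threshold and the three aligned sequences are all invented, and a column counts as conserved only if every sequence has the same non-gap base.

```python
def conserved_windows(aligned_seqs, window=6, min_identity=1.0):
    """Return merged (start, end) column ranges of conserved windows.

    A window qualifies when the fraction of columns that are identical
    across all sequences (and gap-free) is at least min_identity.
    Ranges use 0-based columns with an exclusive end.
    """
    identical = [
        len(set(column)) == 1 and column[0] != "-"
        for column in zip(*aligned_seqs)
    ]
    merged = []
    for start in range(len(identical) - window + 1):
        fraction = sum(identical[start:start + window]) / window
        if fraction < min_identity:
            continue
        end = start + window
        if merged and start <= merged[-1][1]:
            # Overlapping or adjacent windows are joined into one region.
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


# Hypothetical aligned upstream regions from three species.
alignment = [
    "ACGTTGACGTCAAGCT",
    "TCGATGACGTCATGCA",
    "GCGCTGACGTCACGCG",
]
regions = conserved_windows(alignment)
for start, end in regions:
    print(start, end, alignment[0][start:end])
```

Lowering min_identity below 1.0 would tolerate a few variable columns inside an otherwise conserved block.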
  • 48. Regulatory region prediction. Consite: detection of transcription factor binding sites (TFBS) conserved in corresponding genomic sequences from different species.
  • 49. Visualization. Dot plot: a graphical program for detailed comparison of two sequences. Software for dot plots: DNA Strider, Dotlet, Dotter, Dottup.
  • 50. Dot plot: a simple graphical representation of identical residues between two sequences.
      - The X axis represents the first sequence (PHO5).
      - The Y axis represents the second sequence (PHO3).
      - A dot is plotted for each match between two residues of the sequences.
      - Diagonal lines reveal regions of identity between the two sequences.
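The construction described above can be sketched in a few lines. This is a minimal text-mode illustration, not the behaviour of Dotter or any other listed program; the `word` parameter (matching short words instead of single residues to reduce noise) and the toy sequences are assumptions added for the example.

```python
def dot_plot(seq_x, seq_y, word=1):
    """Build a dot matrix: rows follow seq_y, columns follow seq_x.

    A cell is True when the word starting at that position in seq_x
    equals the word starting at that position in seq_y.
    """
    n_cols = len(seq_x) - word + 1
    n_rows = len(seq_y) - word + 1
    return [
        [seq_x[j:j + word] == seq_y[i:i + word] for j in range(n_cols)]
        for i in range(n_rows)
    ]


def render(matrix):
    """Draw the matrix as text, '*' for a match and '.' otherwise."""
    return "\n".join(
        "".join("*" if cell else "." for cell in row) for row in matrix
    )


picture = render(dot_plot("ACGT", "ACGT"))
print(picture)
```

Two identical sequences give an unbroken main diagonal; insertions or deletions shift the diagonal, and repeats appear as extra parallel lines.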
  • 51. Hypothetical whole genome map dot plots
  • 52. Whole genome map dot plots of E. coli vs B. subtilis and E. coli vs S. typhimurium. (Figure labels: random distribution; diagonal pattern.)
  • 53. Sequence logo: A very useful representation of conservation patterns is the so-called sequence logo. It shows the conserved residues as larger characters, where the total height of a column is proportional to how conserved that position is.
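One common way to quantify how conserved a column is uses its information content in bits. The sketch below assumes ungapped, aligned DNA sites and ignores the small-sample correction that logo software often applies; the function names are hypothetical.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=4):
    """Information content of one alignment column in bits.

    Computed as log2(alphabet_size) minus the Shannon entropy of the column.
    """
    total = len(column)
    counts = Counter(column)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


def logo_heights(aligned_sites, alphabet_size=4):
    """Per-column letter heights: residue frequency times column information."""
    heights = []
    for column in zip(*aligned_sites):
        info = column_information(column, alphabet_size)
        total = len(column)
        heights.append({base: info * n / total for base, n in Counter(column).items()})
    return heights
```

A fully conserved DNA column reaches 2 bits, while a column with all four bases equally frequent has a height of 0.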
  • 54. Applications of comparative genomics: 1. Gene prediction 2. Regulatory region prediction 3. Interaction mapping
  • 55. How do genome comparisons help? • When comparing genomes of different species: – Genes normally have the same exon/intron structure (neutral theory of evolution) • Look for ORFs that are conserved in both genomes • Frequently permits accurate identification of genes: – Fugu/human comparison: found >1000 genes that had been missed by annotation – Mouse/human comparison indicates only about 30,000 genes in the genome
  • 56. How do genome comparisons help? (cont.) • Comparison of the fruit fly genome with the human genome showed that about 60 percent of genes are conserved between fly and human. • Virtually all (99%) of the protein-coding genes in humans align with homologs in mouse, and over 80% are clear 1:1 orthologs. In most cases, the intron-exon structures are highly conserved. • The finding that the three wheat genomes have a highly similar gene content and order was the first demonstration of colinearity in the grasses and a pivotal finding in the development of comparative genomics.
  • 57. Sequence comparison example: Comparison of the human and mouse spermidine synthase genes revealed an additional intron in the human gene that is not found in the mouse homologue. (Figure labels: Human, Mouse; 5,500 bp)
  • 58. Relationships among the Genomes of Rice, Foxtail Millet, and Pearl Millet
  • 59. CG helps genome annotation: In prokaryotes, finding genes is relatively easy based on open reading frames (ORFs). In eukaryotes, we have to look for ORFs, exons, introns, splice sites, and polyA sites. Difficulties: • Predicted exons sometimes do not exist • Pseudogenes • Alternative splicing Merit: in different species, genes normally have similar exon-intron structure
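To show why ORF-based gene finding is comparatively simple in prokaryotes, here is a minimal sketch of an ORF scan. It assumes the standard stop codons, an ATG start, and the forward strand only; the function name `find_orfs` and the `min_codons` threshold are illustrative choices, and real gene finders also use codon statistics and scan the reverse strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end) index pairs of ATG...stop ORFs on the forward strand.

    `end` is exclusive and includes the stop codon. `min_codons` counts the
    codons before the stop, including the ATG.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close the candidate and keep scanning
    return sorted(orfs)
```

In eukaryotes the same scan fails on its own because introns interrupt the reading frame, which is where conserved exon-intron structure between species becomes useful.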
  • 60. Finding regulatory sequences: Regulatory sequences are difficult to identify using computer programs. Problems are: • Most enhancer sequences have yet to be identified • They are usually short: 6-10 base pairs • Those that are known are usually degenerate: – They can differ in one or more base pairs – Still bind the cognate transcription factor
  • 61. Comparisons to identify regulatory elements: Comparisons of genomes of different species can identify cis-regulatory elements (neutral theory of evolution). Changes in intergenic regions and introns are usually more rapid than in coding regions. Nevertheless, regulatory elements tend to be conserved (because these sequences bind TFs). Conserved intergenic sequences identified by aligning genomic regions of orthologs are called “phylogenetic footprints” (by analogy with DNase footprinting).
  • 62. Interaction mapping: • A remarkable use of comparative genomics is to identify interacting proteins. • Protein-protein interactions are critical for cellular functions such as: – Transfer of information in a genetic pathway – Scaffolding to tether other proteins – Enzymatic reactions (multi-subunit enzymes) – Large molecular machines such as motors
  • 63. Rosetta Stone: • Interacting proteins are encoded by a single gene in some species, whereas in other species the same proteins are encoded by two genes. • A systematic search through sequenced genomes for these relationships should identify proteins that interact. • This method is called the “Rosetta Stone” approach.
  • 64. Rosetta Stone example: • The equivalent of the yeast protein topoisomerase II is encoded in E. coli by two genes: gyrase A and gyrase B. • This suggests that gyrase A and gyrase B interact in E. coli.
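The search logic behind this example can be sketched as follows. The sketch assumes that similarity searches have already been run and that each hit is recorded as a (gene, start, end) alignment on a candidate fused protein; the function name, the data layout and the coordinates in the usage note are all hypothetical.

```python
from itertools import combinations


def rosetta_stone_pairs(hits):
    """Predict interacting gene pairs from fused ("composite") proteins.

    `hits` maps a composite protein to a list of (gene, start, end) alignments
    of genes from another genome onto that protein. Two different genes that
    align to non-overlapping parts of the same composite protein are reported
    as a candidate interacting pair.
    """
    pairs = set()
    for composite, alignments in hits.items():
        for (g1, s1, e1), (g2, s2, e2) in combinations(alignments, 2):
            non_overlapping = e1 < s2 or e2 < s1
            if g1 != g2 and non_overlapping:
                pairs.add((composite, *sorted((g1, g2))))
    return pairs
```

For instance, `rosetta_stone_pairs({"TOP2": [("gyrB", 1, 650), ("gyrA", 700, 1400)]})` reports gyrA and gyrB as a candidate pair; the coordinates here are invented purely to illustrate the call.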
  • 65. 1. Identification and mapping of leucine-rich repeat resistance gene analogs in Bermuda grass • 31 Bermuda grass (Cynodon spp.) disease resistance gene analogs (BRGA) were cloned and sequenced from diploid, triploid, tetraploid and hexaploid Bermuda grass using degenerate primers to target the nucleotide-binding site (NBS) of the NBS-leucine-rich repeat (LRR) resistance gene family. (Harris et al. (2010) J. Amer. Soc. 135:74-82)
  • 66. 2. Synteny between the centromeric regions of wheat and rice • It was recently discovered that rice centromeres contain genes. This helps in studying centromere homologies between wheat and rice chromosomes by mapping rice centromeric regions onto wheat aneuploid stocks. • Genome-wide comparison of wheat ESTs that were mapped to centromeric regions against rice genome sequences revealed high conservation and one-to-one correspondence of centromeric regions between wheat and rice chromosome pairs W1-R5, W2-R7, W3-R1, W5-R12, W6-R2 and W7-R8. (Qi et al. (2009) Genetics 183:1235-1247)
  • 67. 3. Sequencing and comparative analysis of a conserved syntenic segment (CSS) in the Solanaceae • Wang et al. reported the generation and analysis of sequences for an unduplicated conserved syntenic segment (CSS) in the genomes of five members of the Solanaceae. • This analysis indicates 30 million years of plant evolution in the absence of polyploidization. • The sequenced segments of the potato, tomato, pepper, eggplant, and petunia genomes are shown alongside corresponding regions of the Arabidopsis (At) genome. (Wang et al. (2008) Genetics 180:391-408)
  • 68. Conserved syntenic segment (CSS) in five species of Solanaceae
  • 69. 4. Comparative physical mapping of rice: A comparative physical map between Oryza sativa (AA genome type) and Oryza punctata (BB genome type) was constructed by aligning the physical map of O. punctata onto the O. sativa genome sequence. The level of conservation of each genome between the two species was determined. The alignment suggests more divergence of intergenic and repeat regions in comparison to gene-rich regions. The genome of O. punctata was 8% larger than that of O. sativa, with individual chromosome differences of 1.5 to 16.5%. (Kim et al. (2007) Genetics:379-390)
  • 70. Alignment view of the comparative physical map of O. punctata (BB genome type) and O. sativa (AA genome type) using SyMap
  • 71. What is the difference between man and ape? Man and chimpanzee have a genome-wide similarity of greater than 95%. What accounts for the differences between the species? A recent study suggests that it is due to specific gene expression differences. – Striking differences found only in brain
  • 72. Human/ape gene expression comparisons (Figure panels: blood tissue, brain tissue, liver tissue)
  • 73. CASE STUDY
  • 74. Objective of the research: Classification and phylogenetic analysis of phytohormone-related genes, from metabolic enzymes to receptors and signaling components, in different species.
  • 75. Abstract of the work: Genetic and molecular studies in the model organism Arabidopsis thaliana have revealed the individual pathways of various plant hormone responses. The authors selected 479 genes that were convincingly associated with various hormone actions. Using these 479 genes as queries, a genome-wide search for their orthologues in several species (microorganisms, plants and animals) was performed. A comparative analysis was also conducted to evaluate their evolutionary relationships.
  • 76. Result and discussion (Figure captions: phylogenetic tree generated from orthologue genes, using orthologue gene similarity as compared to A. thaliana hormone-related genes; protein sequence phylogenetic tree)
  • 77. Distribution of orthologues in function category of hormone-related genes in different species: The height of each bar, shown in a different color, represents the percentage of orthologue genes in AHRG of selected plants as compared to that of A. thaliana. Blue: orthologues belonging to hormone metabolism related genes; purple: orthologues belonging to hormone transport genes; light yellow: orthologues belonging to genes related to signal transduction.
  • 78. Comparison of the copy numbers of AHRG orthologues in cereals – Rice, S. bicolor, P. trichocarpa, and A. thaliana. Different colors represent ratios that are calculated by the number of orthologue genes in selected species versus the number of AHRG in A. thaliana.
  • 79. Conclusion: The metabolism and functions of plant hormones are generally more sophisticated and diversified in higher plant species. In particular, several phytohormone receptors and key signaling components were not present in lower plants or animals. Moreover, as genome complexity increases, the orthologue genes tend to have more copies and probably gain more diverse functions.
  • 80. CASE STUDY 2
  • 81. Abstract of the research: Plant disease resistance (R) loci frequently lack synteny between related species of cereals and crucifers but appear to be positionally well conserved in the Solanaceae. In this report, a local RGA approach is adopted using genomic information from the model Solanaceous plant tomato to isolate R3a, a potato gene that confers race-specific resistance to the late blight pathogen Phytophthora infestans.
  • 82. Results and discussion: The genomic regions harboring the R3 late blight resistance locus in potato and the I2 Fusarium wilt resistance locus in tomato are colinear (Huang et al., 2004). A cluster of I2 gene analogues (I2GAs) was identified in potato. This potato I2GA cluster positionally corresponds to the SL8D cluster of the I2 complex locus in tomato and was therefore named the St-I2 cluster.
  • 83. Results (cont.): R3a candidates were identified using a local resistance gene analogue (RGA) approach. To identify I2GAs physically close to R3a, an association analysis on bacterial artificial chromosome (BAC) pools was conducted. (Figure: comparative genetic maps of the I2 complex locus in tomato and the R3 complex locus in potato. The syntenic relationships of R gene clusters are highlighted using gray rectangles.)
  • 84. Results (cont.): In vitro inoculation of the primary transformants of R3a. Massive sporulation (S) and localized hypersensitive reactions (HR) are observed in compatible and incompatible interactions, respectively.
  • 85. Conclusion: In this study, genomic information from tomato was used to isolate the potato late blight resistance gene R3a from an ancient locus involved in plant innate immunity in the Solanaceae. Comparative analyses of the R3 complex locus with the corresponding I2 complex locus in tomato suggest that this is an ancient locus involved in plant innate immunity against oomycetes and fungal pathogens. However, the R3 complex locus has evolved after divergence from tomato, and the locus has experienced a significant expansion in potato without disruption of the flanking colinearity.
  • 86. Limitations of CG: • Homologous genes are relatively well preserved, while noncoding regions show varying degrees of conservation. • Cross-species comparative genomics is influenced by the evolutionary distance between the compared species. • Genetic drift: how can we tell which differences result from selection and matter to organism function, and which are merely a result of genetic drift? • Computationally intensive: large amounts of data are being compared, and the tools to process and compare genomes are still being developed. • For the comparisons to be statistically relevant, many more genomes will need to be sequenced.
  • 88. The goal of comparative genomics: With our current ability to annotate genomes, we can precisely place a list of genes on the chromosome, resulting in something similar to a set of lights of uniform color and intensity. With comparative genomics, it is hoped that we will be able to tell what each gene does and when and where it is expressed.
  • 89. Issues for the future • Faster/better algorithms for aligning vertebrate genomes • Multiple alignments – Comparing several species can give clues to which regulatory sequences are of a basic nature, and which are lineage specific • Cataloguing of comparative data • Better visualisation – Whole syntenic region <> nucleotide level – Multiple genome sequences

Editor's notes

  1. The origins of the field of Comparative Genomics can be found in the strategy used by the Human Genome Project to start sequencing smaller genomes to gain experience prior to tackling the large and complex human genome. This resulted in the sequencing of several bacterial genomes as well as the genomes of some of the most widely studied model organisms including (in chronological order of sequence completion): brewer's yeast, S. cerevisiae; the nematode, C. elegans; the fruit fly, Drosophila melanogaster; and the model plant, Arabidopsis thaliana. As each genome sequence was completed, it was compared to those already in the databases and the study of the comparison of genomes came to be known as “Comparative Genomics”.
  2. Prior to the availability of fully-sequenced genomes and the acquisition of the name Comparative Genomics, genome-wide comparisons had been performed. These included determining relative genome size by comparing the DNA content of different species. One way of determining DNA content was to incubate cells with a DNA-binding dye such as DAPI and then analyze the fluorescence found in individual nuclei using a fluorescence activated cell sorter (FACS). Another parameter that could be roughly determined prior to sequencing was the relative abundance of single copy versus repetitive DNA in the genome. Genomic DNA was randomly sheared and denatured. It was then allowed to reanneal and the amount of double-stranded DNA was determined at successive time intervals. Because repetitive DNA finds a complementary strand more rapidly than single-copy DNA, the proportion of double-stranded DNA detected at early times was considered to be roughly equal to the amount of repetitive DNA in the genome. Graphs of the proportion of double stranded DNA versus time were called Cot curves. This use of hybridization kinetics was refined to identify highly repetitive DNA (the DNA that most rapidly reannealed), middle repetitive DNA (more slowly reannealing DNA) and single copy DNA (the slowest to reanneal).
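The shape of a Cot curve follows from ideal second-order reassociation kinetics, in which the fraction of DNA still single-stranded is 1 / (1 + k * C0t). The sketch below assumes that idealized model and treats the genome as a mixture of independent components; the function names and the rate constants used in the usage note are illustrative, not measured values.

```python
def fraction_single_stranded(cot, k):
    """Ideal second-order reassociation: C/C0 = 1 / (1 + k * C0t)."""
    return 1.0 / (1.0 + k * cot)


def mixed_cot_curve(cot, components):
    """Fraction still single-stranded for a genome with several components.

    `components` is a list of (genome_fraction, k) pairs whose fractions sum
    to 1. Repetitive DNA has a larger k and therefore reanneals earlier.
    """
    return sum(frac * fraction_single_stranded(cot, k) for frac, k in components)


def fraction_double_stranded(cot, components):
    """The quantity plotted in the notes: proportion of reannealed DNA."""
    return 1.0 - mixed_cot_curve(cot, components)
```

With hypothetical components `[(0.5, 10.0), (0.5, 0.1)]`, half of the DNA has reannealed at C0t = 1, and nearly all of that comes from the fast (repetitive) component.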
  3. There is a trap that is easy to fall into when comparing two genomic sequences. For a comparison of two genes to be meaningful one must know the evolutionary relationship between them. The genes can be either orthologs or paralogs. Orthologs are genes found in two species that had a common ancestor prior to the divergence of the two species. Paralogs are genes found in the same species that were created through gene duplication events. The problem arises when one compares genes from two species that are similar in sequence but are not orthologs. These genes may have different structure, function and expression patterns in the two species.
  4. This slide depicts the relationship between orthologs and paralogs. A species indicated by the green line evolves from a species represented by the red line. In the green species, gene A evolves from gene B in the red species. After the two species have diverged, gene duplication events take place in both species. In the green species these events produce A’ and A’’, while in the red species they produce B’ and B’’. A and B are the most closely related to the ancestral gene while A’, A’’, B’ and B’’ are more distantly related and have taken on new functions. Thus it is reasonable to compare A and B but not A’ with B’ or with B’’, nor should one compare B with A’ or A’’. The important question to ask is whether two genes are related by a speciation event, as is the case for orthologs, or by a gene duplication event, as is the case for paralogs.
  5. Prior to the availability of sequenced genomes, genetic and physical maps of different species were compared. When a similar order of genes was found on the chromosomes of two different species it was called “synteny.” Sometimes, duplications of chromosomal segments lead to syntenic regions within the genome of the same species. In general, it has been found that closely related species have fairly similar gene order on their chromosomes, leading to large syntenic regions. An important practical application of synteny is to identify a gene that has been genetically mapped in a species that has a very large genome. If the gene is mapped between two known genes then synteny would suggest that the gene of interest is likely to lie between the same two genes in a species with a smaller genome.
  6. One of the most dramatic examples of synteny between species whose genomes vary greatly in size is found in the crop plants, rice, maize (corn) and wheat. In this figure the chromosomes of each species are arranged in a circle. Rice has the smallest genome and is in the middle in red. Maize has a duplicated genome which is indicated by the two sets of yellow chromosomes. Wheat has a considerably larger genome than maize and is on the outside. The location of genes that control traits such as height or seed shattering are indicated on the chromosomes of the different species. Now that the rice genome is fully sequenced, wheat and maize breeders can use synteny to identify genes involved in these and other traits for which they have a genetic map position.
  7. Once genomes of related species are fully sequenced, then the precise order of genes can be compared. For Mouse and Human there are stretches of remarkable synteny like the one shown on this slide between a portion of Human chromosome 14 and mouse chromosome 12. The blue lines linking the two chromosomal regions indicate genes that are found on both chromosomes. Note that the size of the intergenic regions varies considerably, but the order of genes is nearly identical.
  8. Comparison of Mouse and Human sequences across the whole genome revealed a hodge-podge of chromosomal translocations as shown on this slide. At the bottom of the image are the 23 human chromosomes each with a different color code. Above them are the 20 mouse chromosomes showing the regions that are syntenic to a particular human chromosome through use of the human chromosome’s color. Some mouse chromosomes like the X chromosome are almost entirely syntenic to their human counterpart. Others have syntenic regions from many human chromosomes, like mouse chromosome 2 which has syntenic regions from at least seven human chromosomes.
  9. Most uses of DNA sequence comparison are based on the observation that conservation of DNA sequence is usually for a reason. The idea is that if there are no constraints on the DNA sequence then mutations will occur randomly. The accumulation of these random mutations means that over tens of millions of years of evolutionary time, two sequences that started out the same will become so different as to be unrecognizable. For example, non-coding DNA that does not have a regulatory function tends to diverge much more rapidly than protein-coding DNA.
  10. However, if there are constraints on the DNA such as coding for a protein or binding a transcription factor, then that portion of the DNA sequence will not randomly drift, but will remain relatively similar in sequence over the millions of years that two species evolve independently. Thus, the basic rule when comparing two related sequences is that sequence conservation indicates that there is functional importance. We will discuss how this rule has been used to identify various features of DNA sequence. It is important to point out that sequence conservation will also exist when the two related genomes are so closely related that insufficient time has passed to allow for sequence divergence.
  11. Once DNA sequence is known for two genomes, comparisons between them can be used for many purposes including determining gene structure (e.g. exon/intron boundaries), specifying gene function (by similarity to a gene encoding a protein of known function) and identifying regulatory sequences (such as promoters and enhancers). Comparisons of DNA sequences have even been used to determine which gene products are likely to interact with each other.
  12. By far the most frequently performed type of sequence comparison is the sequence similarity search. This is frequently called a homology search, although strictly speaking, homology refers to similarity due to a common origin and sequence similarity can also arise through convergent evolution. Whenever DNA is freshly sequenced the question naturally arises, “Does it encode something that is similar to a protein (or RNA) of known function?” To answer this question, researchers search for similar sequences in other species, or in the same species.
  13. To find out if there are any sequences that are similar to a newly acquired sequence requires searching the databases of DNA and protein sequences. This process is described in detail in the Bioinformatics chapter. Briefly, there are computer algorithms that attempt to find the best alignment between the search sequence and all the sequences in the databases. Because the sequences in other species are likely to have undergone some changes, the computer programs don’t require perfect matches between the sequences. They allow for some base or amino acid changes as well as small insertions and deletions. The most commonly used program is BLAST because it performs a thorough search in a relatively small amount of time. Another program known as FASTA is able to detect matching segments over longer stretches of sequence but takes much longer to perform a thorough search of the databases.
  14. A bioinformatics approach to identify genes involved in adaptive traits is called “Trait-to-gene”. This method is based on the assumption that in some cases at least, new genes will be created to perform tasks required for an adaptive response. Its underlying reasoning is that organisms that share a particular trait will share related genes. This method can also be used to identify two different genes that serve the same function in different organisms.
  15. To compare genes among species the Trait-to-gene approach uses a database of orthologues developed by Eugene Koonin’s group, known as the “COG database”. COG stands for “Clusters of Orthologous Groups” and it is a method for identifying likely orthologues when making whole genome comparisons among multiple species. When all species share a COG then there is little that can be learned except that it is likely to play an important role in a basic housekeeping function. However, as is illustrated in this slide, if all the species that share a particular trait have a gene that is a member of a particular COG, and species that don’t share the trait also don’t have the COG, then this is evidence that the gene may play an important role in the particular adaptive trait.
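The filtering rule in this note (present in every species with the trait, absent from every species without it) can be written as a small set operation. The sketch assumes COG membership has already been tabulated per species; the function name, species labels and COG identifiers are hypothetical.

```python
def trait_specific_cogs(cogs_by_species, trait_species):
    """Return COGs found in every trait species and in no other species.

    `cogs_by_species` maps a species name to the set of COG identifiers in
    its genome. `trait_species` lists the (at least one) species that show
    the trait of interest.
    """
    trait = set(trait_species)
    others = set(cogs_by_species) - trait
    shared_by_trait = set.intersection(*(cogs_by_species[s] for s in trait))
    seen_elsewhere = set().union(*(cogs_by_species[s] for s in others))
    return shared_by_trait - seen_elsewhere
```

A COG shared by all species, such as a housekeeping function, is removed by the final subtraction, matching the observation that universally shared COGs say little about a specific trait.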
  16. Perfectly congruent maps produce a single diagonal set of dots. Maps that differ by simple chromosomal rearrangements show recognizable patterns corresponding to those rearrangements.
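A genome map dot plot of this kind can be sketched by pairing the positions of shared genes in two gene orders. The example assumes each gene occurs once per genome and that orthologues carry the same identifier in both lists; the function name and the toy gene orders are hypothetical.

```python
def gene_order_points(order_a, order_b):
    """Return (i, j) points: gene at position i in genome A, j in genome B.

    Genes missing from genome B are skipped. Identical gene orders give
    points on the main diagonal; an inverted segment gives an anti-diagonal
    run of points.
    """
    position_b = {gene: j for j, gene in enumerate(order_b)}
    return [(i, position_b[gene]) for i, gene in enumerate(order_a) if gene in position_b]


if __name__ == "__main__":
    genome_a = list("abcdefg")
    genome_b = list("abedcfg")  # segment c-d-e inverted relative to genome A
    print(gene_order_points(genome_a, genome_b))
```

In the toy data the points (2, 4), (3, 3), (4, 2) run against the diagonal, which is the recognizable signature of an inversion.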
  17. Whole genome map dot plots show essentially random distributions of homologous genes when the maps of E. coli and B. subtilis are compared. However, a similar plot of E. coli and S. typhimurium shows an obvious diagonal and gives some evidence of the well known inversion in the 30-40 minute region (Sanderson and Hall, 1970; Riley and Krawiec, 1987).
  18. When the genomes of more than one species are compared, coding regions and exon/intron structures tend to be conserved, as expected under the neutral theory of evolution. Thus, it becomes fairly straightforward to look for ORFs that are conserved in both genomes and found in similar positions relative to surrounding ORFs. Cross-genome comparisons have been shown to be an effective means of accurately annotating genes as well as identifying new genes. For example, in the comparison of the Fugu genome with the human genome, over 1000 putative genes were discovered that had been missed by the annotation programs. Comparison of the mouse genome with the human genome indicates that there are only about 30,000 genes in both genomes. This is consistent with the predictions made from the draft human sequence but runs contrary to earlier pre-genome predictions that set the number of genes as closer to 100,000.
  19. When the human spermidine synthase gene involved in the synthesis of polyamines was compared with the homologous gene in the mouse genome it was found that there is an additional intron in the human gene which interrupts the fifth exon in the mouse gene. This is an example of a comparison that highlights how gene structure can change as species evolve.
  20. Identifying regulatory sequences in fully sequenced genomes is very challenging for computer programs. There are several issues that compound the problem. The first is that relatively few transcription factor binding sites have been characterized by biochemical or genetic experiments. Thus, many have yet to be identified. Second, most enhancer and repressor cis-elements are short, between 6 and 10 basepairs. Third, and perhaps most problematic, is the fact that most cis-elements are degenerate, meaning that two sites can differ in one or more basepairs but still bind the same transcription factor. The combination of these characteristics means that it is very difficult to write a computer program that can reliably identify cis-elements in eukaryotic genomes.
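Degeneracy is usually written with IUPAC ambiguity codes, and a short scan shows why such motifs match so readily. The sketch below assumes the standard IUPAC nucleotide codes and converts a motif into a regular expression; the function name and the motif in the usage note are illustrative only.

```python
import re

# Standard IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_sites(motif, sequence):
    """Return (position, matched_site) for every match, overlaps included."""
    body = "".join(IUPAC[base] for base in motif.upper())
    # A lookahead lets the scan report matches that overlap each other.
    pattern = re.compile("(?=(" + body + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]
```

For example, `find_sites("CACGTK", "TTCACGTGAACACGTTCC")` finds two different 6-bp sites that satisfy the same degenerate motif. Because a 6-10 bp degenerate pattern occurs by chance many times in a large genome, a match alone is weak evidence of function.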
  21. The neutral theory of evolution can greatly aid in the search for cis-regulatory elements. Because these sequences bind transcription factors they are constrained from random mutations. Thus sequence conservation in intergenic regions and introns can be used to help identify cis-regulatory elements. To detect conserved intergenic sequences usually involves aligning the genomic regions of orthologues from two or more species. The conserved sequences identified by this type of alignment are called “phylogenetic footprints.”
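Once orthologous regions are aligned, candidate footprints can be located by scanning for windows of high identity. The sketch below assumes a precomputed pairwise alignment with `-` for gaps; the function name, window size and identity threshold are illustrative defaults, not values from the original slides.

```python
def conserved_windows(aln_a, aln_b, window=10, min_identity=0.9):
    """Return start positions of alignment windows with high identity.

    `aln_a` and `aln_b` are two rows of a pairwise alignment of equal length.
    A column counts as a match only if both residues are identical and not
    gaps.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    hits = []
    for start in range(len(aln_a) - window + 1):
        pairs = zip(aln_a[start:start + window], aln_b[start:start + window])
        matches = sum(a == b and a != "-" for a, b in pairs)
        if matches / window >= min_identity:
            hits.append(start)
    return hits
```

Overlapping hit windows can then be merged into blocks. As an earlier note points out, the species compared must be distant enough that unconstrained sequence has had time to diverge, otherwise everything looks conserved.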
  22. A remarkable use of comparative genomics has been to identify proteins that are likely to interact with each other. Protein-protein interactions are critical for various types of cellular functions including: the transfer of information in a genetic pathway, providing a scaffold to tether other proteins, multi-subunit enzymes and large molecular machines such as dynein which works as a molecular motor.
  23. Normally, one would think that it would be necessary to work with proteins to be able to detect interacting partners. However, with remarkable insight, the groups of David Eisenberg and Christos Ouzounis independently realized that proteins that interact in one species can be encoded by a single gene in another species. They used this criterion to systematically search through the sequenced genomes to identify genes encoding proteins that are likely to interact with each other. Eisenberg's group called their method the “Rosetta Stone” approach.
  24. An example of the Rosetta Stone approach comes from comparing the genome of the yeast S. cerevisiae with that of the bacterium E. coli: the equivalent of the yeast protein topoisomerase II is encoded by two genes in E. coli, gyrase A and gyrase B. This is taken as evidence that gyrase A and gyrase B probably interact in E. coli.
  25. Of particular concern to some people is the very small difference at the genome level between humans and apes. Based on hybridization between the two genomes there appeared to be a 98.4% similarity, but more recent analysis of sequence comparisons suggests that it is closer to 95%, which is still highly similar. This raises the question, “If the genomes are so similar, what accounts for the important differences between the species such as language, the ability to walk upright, etc.?” Pioneering work by Svante Pääbo’s group has begun to address this question. The initial data suggest that there are striking differences in gene expression patterns in the brain, but not in other organs.
  26. In their 2002 study, Pääbo’s group used microarrays of 12,000 human genes to analyze expression patterns in rhesus monkeys, chimpanzees, and humans. The results showed little difference in the results from liver or blood among the three primates. However, when tissue from the brain was used, it showed a 5.5 fold difference between humans and chimpanzees. The difference was calculated by using the absolute change in gene expression in all 12,000 genes and then doing the interspecies comparison. This study suggests that the most dramatic differences between humans and chimpanzees, namely the disparity in cognitive abilities, might be explained by a difference in gene expression in the brain.
  27. Our current ability to annotate genomes results in a list of genes for which we still know very little about either their function or regulation. We can precisely place them on the chromosome resulting in something similar to a set of lights of uniform color and intensity. With the approaches of comparative genomics it is hoped that we will be able to soon gain enough knowledge so that we will be able to tell what each gene does and when and where it is expressed. The genome will then resemble a multi-colored set of lights for which we know both their purpose and exactly how they are controlled.