2. Genetic resistance to Marek’s Disease
• MHC (B) locus has a major influence on MD
resistance
• Several haplotypes of B locus have been found to
correlate with resistance
– B21: most resistant
– B19: susceptible
• Lines 6 and 7 (ADOL*) are B2 homozygous, but
line 6 is resistant and line 7 is susceptible to MD
• Relatively few non-MHC genes have been
identified
*Avian Disease and Oncology Laboratory, East Lansing
3. Research Goal
• Identify non-MHC genes influencing MD
resistance from a genome-wide gene and
isoform expression analysis based on RNA-Seq
data
• Generate hypotheses for studying the
mechanism controlling MD resistance
Collaboration with Hans Cheng (ADOL) and Jerry Dodgson (MSU)
Dr. Likit Preeyanon
6. Gene models and isoforms are woefully incomplete:
e.g. ENSEMBL is missing many exon-exon junctions.
De novo reconstruction
Ab initio reconstruction
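A minimal sketch of how "missing junctions" could be counted, assuming junctions are represented as (chromosome, intron start, intron end) tuples. The function name `unannotated_junctions` and all coordinates are hypothetical, not taken from the analysis.

```python
from typing import Iterable, Set, Tuple

# A junction is identified here by (chromosome, intron start, intron end).
Junction = Tuple[str, int, int]


def unannotated_junctions(
    observed: Iterable[Junction], annotated: Iterable[Junction]
) -> Set[Junction]:
    """Return junctions seen in the reads that the annotation lacks."""
    return set(observed) - set(annotated)


# Made-up coordinates for illustration only.
observed = [("chr1", 200, 300), ("chr1", 400, 600), ("chr2", 50, 90)]
annotated = [("chr1", 200, 300)]
missing = unannotated_junctions(observed, annotated)
print(sorted(missing))
```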
7. GIMME: Software for Merging Gene Models
[Diagram: assembly-based, local assembly, and reference-guided gene models are combined by GIMME (in-house software) into merged models]
8. Merged Gene Models
[Figure tracks: global assembly, local assembly, reference-guided, merged (consensus) model, newly predicted isoform]
9. Merged models connect fragmented gene models & provide new isoforms
Merged models can glue fragmented gene models and
include unannotated isoforms.
[Figure: Gene A and Gene B; tracks: reference-guided, merged model]
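To picture the "gluing": treat each gene model as a genomic span and fuse spans that overlap across model sets. This is only an interval-merging sketch, not GIMME's actual algorithm. The function name `merge_spans` and the coordinates are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) of one gene model, inclusive


def merge_spans(model_sets: List[List[Span]]) -> List[Span]:
    """Fuse overlapping spans pooled from several gene model sets."""
    pooled = sorted(span for spans in model_sets for span in spans)
    merged: List[Span] = []
    for start, end in pooled:
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous span: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Made-up coordinates: two reference-guided fragments of one gene,
# a separate gene, and one assembly-based model bridging the fragments.
reference_guided = [(100, 400), (600, 900), (2000, 2500)]
assembly_based = [(350, 650)]
merged_models = merge_spans([reference_guided, assembly_based])
print(merged_models)  # [(100, 900), (2000, 2500)]
```

The two fragments collapse into one span because the bridging model overlaps both. The unrelated gene stays separate.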
10. IDH3A Gene – now with both UTRs!
[Figure tracks: Merged, RefSeq, ENSEMBL; UTR marked]
11. IDH3A: different models, different predicted expression…
SE: single-end, PE: paired-end
[Figure labels: Not significant, Significant]
12. Differentially Expressed Genes from Different Gene Model Sets …Differ.
DE genes by EBSeq, FDR < 0.05
[Figure label: Ref-guided]
13. In addition, many of the differentially expressed genes are not
annotated in KEGG
[Figure label: Ref-guided]
14. GOseq FDR 0.05
Chicken + Human KEGG Pathway: 40 pathways
Must merge in human KEGG annotations
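A rough sketch of what "merging in" human annotations could look like before handing a custom gene-to-pathway mapping to an enrichment tool. It assumes human genes have already been mapped to chicken gene IDs via orthologs. The gene names, pathway assignments, and the function `merge_annotations` are hypothetical.

```python
import string
from collections import defaultdict
from typing import Dict, Iterable, Set


def merge_annotations(
    *sources: Dict[str, Iterable[str]]
) -> Dict[str, Set[str]]:
    """Union gene-to-pathway mappings, dropping the organism prefix."""
    merged: Dict[str, Set[str]] = defaultdict(set)
    for source in sources:
        for gene, pathways in source.items():
            for pathway in pathways:
                # "gga04145" and "hsa04145" both become "04145".
                merged[gene].add(pathway.lstrip(string.ascii_lowercase))
    return dict(merged)


# Made-up assignments for illustration only.
chicken = {"geneA": ["gga04145"], "geneB": ["gga00020"]}
human_orthologs = {"geneA": ["hsa04145", "hsa04620"], "geneC": ["hsa04620"]}
combined = merge_annotations(chicken, human_orthologs)
print(combined)
```

Here geneC has no chicken annotation at all and gains one only through the human mapping.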
17. Biological Processes (BP) categories involved in Adaptive
Immune Responses are Enriched in Line 7 (susceptible)
GO ID      Description                               Adjusted p-value
0009615    Response to virus                         0.00023
0050670    Regulation of lymphocyte proliferation    0.00048
0002252    Immune effector process                   0.00068
0051249    Regulation of lymphocyte activation       0.0027
0042129    Regulation of T cell proliferation        0.0032
0002250    Adaptive immune response                  0.0106
At an early stage of infection, elicitation of adaptive immune responses
appears to be delayed in line 6.
20. Differential Exon Usage of ITGB2 Gene from MISO
Spliced reads
Percent Spliced In (Ψ)
Read coverage
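For intuition about Ψ: it is the fraction of a gene's transcripts that include the exon. MISO infers it with a probabilistic model. The naive read-count ratio below is a simplification for illustration, not MISO's estimator. The function name `naive_psi` and the counts are hypothetical.

```python
def naive_psi(inclusion_reads: int, exclusion_reads: int) -> float:
    """Naive Percent Spliced In: inclusion reads over informative reads."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("no informative reads for this exon")
    return inclusion_reads / total


# Made-up counts: 30 reads support inclusion, 10 support skipping.
psi = naive_psi(30, 10)
print(psi)  # 0.75
```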
21. Genes with predicted differential splicing can be
categorized into four groups
Cutoff = 0.2
[Figure: four panels, one per group, showing ψ (axis from 0 to 1) for samples 6 Ctrl, 6 Inf, 7 Ctrl, 7 Inf]
Group I: 11 genes
Group II: 19 genes
Group III: 20 genes
Group IV: 1 gene
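A sketch of how a cutoff of 0.2 might be applied, assuming it refers to the absolute difference in ψ between two samples. The slide does not say which comparisons define each group, so this only lists the sample pairs that exceed the cutoff. The ψ values and the function name `pairs_over_cutoff` are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

SAMPLES = ["6 Ctrl", "6 Inf", "7 Ctrl", "7 Inf"]


def pairs_over_cutoff(
    psi: Dict[str, float], cutoff: float = 0.2
) -> List[Tuple[str, str]]:
    """Return sample pairs whose psi values differ by at least the cutoff."""
    return [
        (a, b)
        for a, b in combinations(SAMPLES, 2)
        if abs(psi[a] - psi[b]) >= cutoff
    ]


# Made-up psi values for one gene: only infected line 7 shifts.
example = {"6 Ctrl": 0.80, "6 Inf": 0.85, "7 Ctrl": 0.80, "7 Inf": 0.30}
flagged = pairs_over_cutoff(example)
print(flagged)
```

A gene like this one would be flagged in every comparison involving the infected line 7 sample and in none of the others.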
22. The main point
• We are completely at the mercy of
annotations to interpret our large-scale data.
• Need more experimental information!
• But also, better methods => better signal
23. Concluding thoughts (I)
• Computational analysis of high-throughput
sequencing data can help refine hypotheses,
but cannot conclusively resolve mechanism.
• Don’t knock “refining hypotheses”, though!
Complex biological phenomena like disease
are refractory to simplifying assumptions.
24. Concluding thoughts (II)
• Much of the -omic data being gathered by all of
you has utility far beyond your specific research
question.
• This is particularly true in “semi-model”
organisms where annotations are generally poor
and not species-specific, and where there may be
significant intra-species variation.
• How can we better share this data, to make faster
and better progress?
25. Where should we spend our -omics
money?
• Improving genomes is still expensive and
requires significant technical expertise.
• mRNAseq is inexpensive, broadly useful and
wonderful for building better gene models.
• Proteomics and metabolomics?
• Better tools, annotation, and data sharing and
exploration portals are critically important to
the future of agricultural genomics.
Thanks!
Editor's notes
Genetic resistance to MD is complex and controlled by many genes. The B locus is a major locus, and the incidence of MD varies widely among haplotypes. ADOL lines 6 and 7 share the B2 haplotype but differ greatly in MD resistance. Several studies have been conducted to identify
Larger-scale data could be used to generate hypotheses for studying the mechanism of MD resistance.
Understanding the mechanism of disease and MD resistance could lead to development of better vaccines.
As you know…
Some unique splice junctions are also found in other datasets. Ensembl models have many unique splice junctions because the models include genes and isoforms from other tissues.
Incorporating Ensembl models helps, but Ensembl models do not include all genes in our samples.
Fortunately, GOseq supports custom KEGG annotation. Most tools do not accept custom annotation, so you can only use the annotation of one species at a time.
It should be pointed out that the phagosome pathway is enriched only in line 7. The phagosome pathway, as you know, is critically important for the activation of T cells and elicitation of adaptive immune responses because genes in this pathway are involved in phagocytosis and antigen presentation.
As expected, biological processes involved in adaptive immune responses are enriched only in line 7.
If we could tag or barcode all reads, it’d be easy to estimate isoform expression.