Investigating the 3D structure of the genome with
Hi-C data analysis
Sylvain Foissac & Nathalie Villa-Vialaneix
prenom.nom@inra.fr
MIAT Seminar - Toulouse, June 2, 2017
SF & NV2 | Hi-C data analysis 1/28
Outline
1 Normalization
2 TAD identification
3 A/B compartments
4 Differential analysis
Purpose of normalization
1 within matrix normalization: make bins comparable within a matrix
(not needed for differential analysis)
2 between matrix normalization: make the same bin pair comparable
between two matrices (needed for differential analysis)
Different within matrix normalizations
to correct technical biases
(GC content, mappability...)
explicit correction [Yaffe and Tanay, 2011, Hu et al., 2012]: every factor
causing bias is identified and estimated
non-parametric correction: ICE correction using matrix balancing
[Imakaev et al., 2012]
$K_{ij} = b_i \tilde{K}_{ij} b_j$ for a matrix $\tilde{K}$ such that, $\forall\, i = 1, \ldots, p$, $\sum_{j=1}^{p} \tilde{K}_{ij}$ is constant
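The matrix balancing idea behind ICE can be sketched as follows. This is a minimal illustration, not the implementation of [Imakaev et al., 2012]: the function name `ice_balance` is hypothetical, the square-root damping of the update is an assumption made here so that the iteration also behaves well on diagonal-heavy matrices, and every bin is assumed to have a non-zero total count.

```python
from math import sqrt


def ice_balance(counts, n_iter=500, tol=1e-10):
    """Rescale a symmetric count matrix until all row sums are equal.

    Returns (balanced, bias) such that
    counts[i][j] ~= bias[i] * balanced[i][j] * bias[j].
    Assumes every row has a strictly positive sum.
    """
    p = len(counts)
    balanced = [[float(v) for v in row] for row in counts]
    bias = [1.0] * p
    for _ in range(n_iter):
        row_sums = [sum(row) for row in balanced]
        mean_sum = sum(row_sums) / p
        # Damped update (assumption of this sketch): square root of the
        # relative coverage of each bin; 1.0 everywhere means balanced.
        scale = [sqrt(s / mean_sum) for s in row_sums]
        if max(abs(s - 1.0) for s in scale) < tol:
            break
        for i in range(p):
            for j in range(p):
                balanced[i][j] /= scale[i] * scale[j]
        bias = [b * s for b, s in zip(bias, scale)]
    return balanced, bias


counts = [[10, 4, 2], [4, 6, 3], [2, 3, 8]]
balanced, bias = ice_balance(counts)
print([round(sum(row), 6) for row in balanced])  # all row sums are equal
```

The returned `bias` vector plays the role of the per-bin bias, and `balanced` the role of the corrected matrix with constant row sums.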
[Figure illustrating within matrix normalization to correct technical biases; picture from [Schmitt et al., 2016]]
Different within matrix normalizations
to take distances into account
theoretical distribution taken from [Belton et al., 2012]
$K^{d}_{ij} = \dfrac{K_{ij} - \bar{K}_{d(i,j)}}{\sigma(D_{d(i,j)})}$
with
$\bar{K}_{d}$: average counts at distance $d$
$\sigma(D_{d})$: standard deviation of the counts at distance $d$
available in HiTC [Servant et al., 2012]
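A minimal sketch of this distance-based standardization, computed diagonal by diagonal on a square symmetric matrix. It is not the HiTC code: the function name `distance_zscore` is hypothetical, the empirical mean and population standard deviation of each diagonal stand in for the theoretical distribution, and diagonals with zero variance are set to 0 by convention.

```python
from statistics import mean, pstdev


def distance_zscore(counts):
    """Center and scale each count by the mean and standard deviation
    of all counts observed at the same genomic distance (same diagonal)."""
    p = len(counts)
    z = [[0.0] * p for _ in range(p)]
    for d in range(p):
        # All bin pairs (i, i + d) lie at distance d from each other.
        diagonal = [counts[i][i + d] for i in range(p - d)]
        mu = mean(diagonal)
        sd = pstdev(diagonal)
        for i in range(p - d):
            value = (counts[i][i + d] - mu) / sd if sd > 0 else 0.0
            z[i][i + d] = value
            z[i + d][i] = value  # keep the matrix symmetric
    return z


z = distance_zscore([[4, 2, 1], [2, 8, 4], [1, 4, 6]])
print(z[0][1], z[1][2])
```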
Between matrix normalization
correct for differences in sequencing depth
standard approach: similar to RNA-seq normalization
However...
density adjustment by LOESS fit [Robinson and Oshlack, 2010]
(implemented in csaw)
Topologically Associated Domains (TADs)
[Rao et al., 2014]
TAD method jungle
Directionality index [Dixon et al., 2012]: compute the divergence between
upstream and downstream interaction counts + HMM to identify TADs
armatus [Filippova et al., 2013]: maximizes a criterion that evaluates a
within/between count ratio + combines multi-resolution results into a
consensus segmentation
segmentation method [Brault et al., 2017]: block boundary estimation in
a matrix
... (many others); interestingly, very few provide a hierarchical
clustering
Comparisons in: [Fotuhi Siahpirani et al., 2016, Dali and Blanchette, 2017]
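As an illustration of the first method, the directionality index of a bin compares its total contacts with a window of upstream bins (A) and downstream bins (B), with E = (A + B) / 2: DI = sign(B − A) × ((A − E)²/E + (B − E)²/E). The sketch below computes only this index; the HMM step that turns DI values into TADs is left out, and the function name and the `window` parameter (expressed in bins) are assumptions of this example.

```python
def directionality_index(counts, window):
    """Directionality index of each bin of a square contact matrix.

    A = contacts with the `window` bins upstream, B = contacts with the
    `window` bins downstream. Positive values indicate a downstream bias.
    """
    p = len(counts)
    di = []
    for i in range(p):
        a = sum(counts[i][j] for j in range(max(0, i - window), i))
        b = sum(counts[i][j] for j in range(i + 1, min(p, i + window + 1)))
        e = (a + b) / 2
        if e == 0 or a == b:
            di.append(0.0)  # no contacts or no bias
            continue
        sign = 1.0 if b > a else -1.0
        di.append(sign * ((a - e) ** 2 / e + (b - e) ** 2 / e))
    return di


# Two blocks of two bins each: the sign flips at the block boundary.
blocks = [[10, 10, 1, 1], [10, 10, 1, 1], [1, 1, 10, 10], [1, 1, 10, 10]]
print(directionality_index(blocks, window=1))
```

In this toy matrix the index goes from negative (bin 1) to positive (bin 2), which is the pattern expected at a domain boundary.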
DI evolution with respect to armatus TADs
CTCF at TAD boundaries
Enrichment of genomic features around TAD boundaries
Homo sapiens [Dixon et al., 2012]
Sus scrofa (PORCINET project)
Current methodological development
Constrained HAC as a way to compare/combine TADs between samples
Constrained HAC: hierarchical agglomerative clustering with contiguity constraints
Challenges (currently under development with Pierre Neuvial and Marie
Chavent):
methodological issues: what happens when using Ward’s linkage
criterion with a non-Euclidean similarity (counts of the Hi-C matrix)?
what happens when adding constraints to HAC? (partially solved)
development of the R package adjclust (Google Summer of Code
selected project)
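To make the contiguity constraint concrete, here is a small sketch in which only clusters that are adjacent along the chromosome may be merged. It is not the adjclust algorithm: average linkage on the similarity matrix is swapped in for Ward's criterion to keep the example short, and the function name `constrained_hac` is hypothetical.

```python
def constrained_hac(sim):
    """Greedy agglomerative clustering where only adjacent clusters merge.

    `sim` is a symmetric similarity matrix between consecutive bins.
    Returns the list of merges as (left_cluster, right_cluster) pairs.
    """
    clusters = [[i] for i in range(len(sim))]
    merges = []
    while len(clusters) > 1:

        def link(k):
            # Average similarity between cluster k and its right neighbor.
            left, right = clusters[k], clusters[k + 1]
            total = sum(sim[i][j] for i in left for j in right)
            return total / (len(left) * len(right))

        best = max(range(len(clusters) - 1), key=link)
        merges.append((clusters[best], clusters[best + 1]))
        clusters[best:best + 2] = [clusters[best] + clusters[best + 1]]
    return merges


sim = [[10, 10, 1, 1], [10, 10, 1, 1], [1, 1, 10, 10], [1, 1, 10, 10]]
print(constrained_hac(sim))
```

Because every cluster is a run of consecutive bins, the resulting dendrogram can be read directly as a nested segmentation of the chromosome.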
A/B compartments
[Lieberman-Aiden et al., 2009]
[Giorgetti et al., 2013]
Method (in theory):
compute Pearson correlations between bins
(using interaction counts with all the other bins
of the same chromosome)
compute eigenvectors (or perform PCA) on this
correlation matrix
assign A/B compartments according to the sign (+/-) of the PC values
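The theoretical method can be sketched in a few lines: correlate the contact profiles of the bins, extract the leading eigenvector of the correlation matrix, and split the bins by sign. This is an illustration only. The helper names are hypothetical, power iteration is used here simply to avoid external dependencies, no bin is assumed to have a constant profile, and the sign of the eigenvector is arbitrary, so deciding which group is A and which is B needs extra information.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two contact profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def leading_eigenvector(matrix, n_iter=500):
    """Leading eigenvector of a symmetric PSD matrix by power iteration."""
    p = len(matrix)
    v = [1.0 + 0.1 * i for i in range(p)]  # arbitrary non-constant start
    for _ in range(n_iter):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def compartment_signs(counts):
    """Return the first PC of the bin-bin correlation matrix and its signs."""
    p = len(counts)
    corr = [[pearson(counts[i], counts[j]) for j in range(p)] for i in range(p)]
    pc1 = leading_eigenvector(corr)
    return pc1, ["+" if x >= 0 else "-" for x in pc1]


# Checkerboard pattern: bins 0 and 2 interact together, as do bins 1 and 3.
counts = [[9, 1, 8, 2], [1, 9, 2, 8], [8, 2, 9, 1], [2, 8, 1, 9]]
pc1, signs = compartment_signs(counts)
print(signs)
```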
A/B compartments in practice
after ICE and distance-based normalizations
Method:
differentiate between A and B using the sign of the correlation between PCs
and diagonal counts
choose the relevant PC and method by maximizing −log10(p-value)
between diagonal counts in the + and − PC groups (two-group comparison with Student's t-test)
Biological validation
Filtering
In differential analysis of sequencing data, filtering is a crucial step:
removing low-count features (which have little or no chance of being found
differential) improves the power of the tests (by reducing the burden of the
multiple testing correction) and saves unnecessary computation time
it can be performed 1/ at the beginning of the analysis or after the
estimation of the parameters of the model used for differential
analysis; 2/ with a threshold fixed to an arbitrary value (minimum total count
per sample) or derived automatically from the data
for Hi-C data:
filtering was performed at the beginning of the analysis (to limit the
computational burden)
it was performed using either an arbitrary threshold or a threshold based
on an estimate of the background noise given by a quantile of
inter-chromosomal counts (as in the R package diffHic)
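The data-driven threshold described above can be sketched as follows. This is not the diffHic implementation: the layout of `pairs` (a dictionary keyed by `(chrom_a, bin_a, chrom_b, bin_b)`), the function names, the nearest-rank quantile and the default `q=0.5` are all assumptions of this example, and at least one inter-chromosomal pair is assumed to be present.

```python
def background_threshold(pairs, q=0.5):
    """Quantile of inter-chromosomal counts, used as a noise estimate."""
    inter = sorted(c for key, c in pairs.items() if key[0] != key[2])
    # Nearest-rank empirical quantile.
    return inter[min(len(inter) - 1, int(q * len(inter)))]


def filter_pairs(pairs, q=0.5):
    """Keep only the bin pairs whose count exceeds the background threshold."""
    threshold = background_threshold(pairs, q)
    return {key: c for key, c in pairs.items() if c > threshold}


pairs = {
    ("1", 0, "1", 1): 50,
    ("1", 0, "1", 2): 2,
    ("1", 0, "2", 0): 1,
    ("1", 1, "2", 0): 3,
    ("1", 2, "2", 1): 2,
}
print(background_threshold(pairs))  # threshold estimated from the 3 inter pairs
print(sorted(filter_pairs(pairs)))
```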
500 kb bins, automatic filter (removes counts below approximately 5): 96.4% of pairs filtered out
[Figure panels: before filtering, after filtering]
Exploratory analysis (500kb bins)
chromosome 1
[Figure: heatmap of pairwise cosine similarity (Frobenius norm) between the six samples, color scale from −1.0 to 1.0; off-diagonal values range from 0.8288 to 0.9118]
Samples: LW90−160216−GCCAAT, LW90−160223−CTTGTA, LW90−160308−AGTTCC, LW110−160307−CGATGT, LW110−160308−AGTCAA, LW110−160517−ACAGTG
good reproducibility between
experiments
no clear organization with respect to
the condition
all data after filtering and between
matrix normalization (LOESS)
2 outliers but PC1 is organized with
respect to the condition
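The cosine similarity with respect to the Frobenius norm, used above to compare contact matrices, is the Frobenius inner product of two matrices divided by the product of their Frobenius norms. A minimal sketch (the function name is hypothetical, and both matrices are assumed to have the same shape and to be non-zero):

```python
from math import sqrt


def frobenius_cosine(a, b):
    """Cosine of the angle between two matrices seen as vectors."""
    inner = sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    norm_a = sqrt(sum(x * x for row in a for x in row))
    norm_b = sqrt(sum(y * y for row in b for y in row))
    return inner / (norm_a * norm_b)


a = [[1, 2], [2, 1]]
b = [[2, 1], [1, 2]]
print(frobenius_cosine(a, b))
```

The measure is invariant to a global rescaling of either matrix, so it is not affected by differences in sequencing depth alone.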
Methods for differential analysis of Hi-C
Similar to RNA-seq [Lun and Smyth, 2015] and R package diffHic
(essentially a wrapper for edgeR):
count data are modeled by a negative binomial distribution
parameters (mean and variance, per gene in RNA-seq and per bin pair here) are estimated from the data: a
variance vs mean relationship is modeled
the test is either an exact test (similar to Fisher's) or a
likelihood ratio test (GLM)
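In the negative binomial parameterization used by edgeR, the variance is tied to the mean through a dispersion parameter: variance = mean + dispersion × mean². The sketch below illustrates this relationship with a crude method-of-moments estimate for a single bin pair. It is only an illustration: edgeR shares information across features when estimating dispersions, which this example does not attempt, and both function names are hypothetical.

```python
from statistics import mean, variance


def nb_variance(mu, dispersion):
    """Variance of a negative binomial with mean `mu` and given dispersion."""
    return mu + dispersion * mu ** 2


def moment_dispersion(counts):
    """Method-of-moments dispersion from replicate counts of one bin pair.

    Returns 0 when the data are not over-dispersed relative to Poisson.
    """
    mu = mean(counts)
    var = variance(counts)  # unbiased sample variance
    return max(0.0, (var - mu) / mu ** 2)


replicates = [2, 18, 10, 20, 0]
phi = moment_dispersion(replicates)
print(phi, nb_variance(mean(replicates), phi))
```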
Complementary remarks about DE analysis
Hi-C data contain more zeros than RNA-seq data: some people
propose to use a zero-inflated negative binomial distribution (unpublished as far as I
know)
the analysis provides a p-value for every pair of bins:
it is therefore based on a very large number of bin pairs at finer resolutions
(500kb after filtering: 998 623 pairs of bins; without filtering:
13 509 221 pairs of bins): problem solved for 500kb bins but still under
study for 40kb bins
tests are performed as if bin pairs were independent whereas they are
spatially correlated: the estimation of model parameters might be improved
if it were 1/ smoothed with respect to spatial proximity (similar to what is
sometimes done in methylation data analysis); 2/ performed
independently for pairs of bins at a given distance (future work).
post-analysis of the spatial distribution of p-values: work in progress with
Pierre Neuvial (submitted CNRS project)
because last page had no picture
probably not suited for the youngest
Preliminary results
913 bin pairs found differential (after multiple testing correction)
most of them are related to 3 chromosomes
parameter setting (filters...) and biological analysis are work-in-progress...
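The slides do not state which multiple testing correction was applied. As an assumption, the sketch below shows the Benjamini-Hochberg adjustment, a common choice for controlling the false discovery rate when one p-value is obtained per bin pair. The function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda k: pvalues[k])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        k = order[rank - 1]
        running_min = min(running_min, pvalues[k] * m / rank)
        adjusted[k] = running_min
    return adjusted


pvalues = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(pvalues)
significant = [k for k, p in enumerate(adjusted) if p < 0.05]
print(adjusted, significant)
```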
Differential TADs (state-of-the-art)
Detecting differential domains between the two conditions
Existing approaches:
[Fraser et al., 2015] (3 conditions, no replicate)
HMM on TAD boundaries (with a tolerance threshold) to identify
different TAD boundaries between samples
HAC on TADs, cophenetic distance to obtain local conserved structure
by using a z-score approach
The R package diffHic computes upstream/downstream counts (within ± 100 kb)
and uses the GLM model implemented in edgeR with an interaction
between stream direction (up/down) and condition.
However, the first approach does not take biological variability into account
(no replicate) and the second uses only a very aggregated criterion.
Differential TADs (perspectives)
Ideas for future work
Using constrained HAC, are we able to:
compute a consensus dendrogram using several biological replicates;
differentiate branches significantly (in which sense?) different
between conditions taking into account the within condition variability?
Conclusions and perspectives
Honestly, it’s late and I really do not believe that I will have enough time to
draw a conclusion and discuss perspectives, so...
Questions?
References
Belton, J., McCord, R., Gibcus, J., Naumova, N., Zhan, Y., and Dekker, J. (2012).
Hi-C: a comprehensive technique to capture the conformation of genomes.
Methods, 58:268–276.
Brault, V., Chiquet, J., and Lévy-Leduc, C. (2017).
Efficient block boundaries estimation in block-wise constant matrices: an application to HiC data.
Electronic Journal of Statistics, 11(1):1570–1599.
Dali, R. and Blanchette, M. (2017).
A critical assessment of topologically associating domain prediction tools.
Nucleic Acids Research, 45(6):2994–3005.
Dixon, J., Selvaraj, S., Yue, F., Kim, A., Li, Y., Shen, Y., Hu, M., Liu, J., and Ren, B. (2012).
Topological domains in mammalian genomes identified by analysis of chromatin interactions.
Nature, 485:376–380.
Filippova, D., Patro, R., Duggal, G., and Kingsford, C. (2013).
Identification of alternative topological domains in chromatin.
Algorithms for Molecular Biology, 9:14.
Fotuhi Siahpirani, A., Ay, F., and Roy, S. (2016).
A multi-task graph-clustering approach for chromosome conformation capture data sets identifies conserved modules of
chromosomal interactions.
Genome Biology, 17:114.
Fraser, J., Ferrai, C., Chiariello, A., Schueler, M., Rito, T., Laudanno, G., Barbieri, M., Moore, B., Kraemer, D., Aitken, S., Xie, S.,
Morris, K., Itoh, M., Kawaji, H., Jaeger, I., Hayashizaki, Y., Carninci, P., Forrest, A., The FANTOM Consortium, Semple, C.,
Dostie, J., Pombo, A., and Nicodemi, M. (2015).
Hierarchical folding and reorganization of chromosomes are linked to transcriptional changes in cellular differentiation.
Molecular Systems Biology, 11:852.
Giorgetti, L., Servant, N., and Heard, E. (2013).
Changes in the organization of the genome during the mammalian cell cycle.
Genome Biology, 14:142.
Hu, M., Deng, K., Selvaraj, S., Qin, Z., Ren, B., and Liu, J. (2012).
HiCNorm: removing biases in Hi-C data via Poisson regression.
Bioinformatics, 28(23):3131–3133.
Imakaev, M., Fudenberg, G., McCord, R., Naumova, N., Goloborodko, A., Lajoie, B., Dekker, J., and Mirny, L. (2012).
Iterative correction of Hi-C data reveals hallmarks of chromosome organization.
Nature Methods, 9:999–1003.
Lieberman-Aiden, E., van Berkum, N., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B., Sabo, P., Dorschner,
M., Sandstrom, R., Bernstein, B., Bender, M., Groudine, M., Gnirke, A., Stamatoyannopoulos, J., Mirny, L., Lander, E., and
Dekker, J. (2009).
Comprehensive mapping of long-range interactions reveals folding principles of the human genome.
Science, 326(5950):289–293.
Lun, A. and Smyth, G. (2015).
diffHic: a Bioconductor package to detect differential genomic interactions in Hi-C data.
BMC Bioinformatics, 16:258.
Rao, S., Huntley, M., Durand, N., Stamenova, E., Bochkov, I., Robinson, J., Sanborn, A., Machol, I., Omer, A., Lander, E., and
Lieberman Aiden, E. (2014).
A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping.
Cell, 159(7):1665–1680.
Robinson, M. and Oshlack, A. (2010).
A scaling normalization method for differential expression analysis of RNA-seq data.
Genome Biology, 11:R25.
Schmitt, A., Hu, M., and Ren, B. (2016).
Genome-wide mapping and analysis of chromosome architecture.
Nature Reviews Molecular Cell Biology, 17(12):743–755.
Servant, N., Lajoie, B., Nora, E., Giorgetti, L., Chen, C., Heard, E., Dekker, J., and Barillot, E. (2012).
HiTC: exploration of high-throughput ‘C’ experiments.
Bioinformatics, 28(21):2843–2844.
Yaffe, E. and Tanay, A. (2011).
Probabilistic modeling of Hi-C contact maps eliminates systematic biases to characterize global chromosomal architecture.
Nature Genetics, 43:1059–1065.
Fault detection and diagnosis for non-Gaussian stochastic distribution system...
 
Accounting serx
Accounting serxAccounting serx
Accounting serx
 
Accounting serx
Accounting serxAccounting serx
Accounting serx
 
Mayank
MayankMayank
Mayank
 
An Approach to Mixed Dataset Clustering and Validation with ART-2 Artificial ...
An Approach to Mixed Dataset Clustering and Validation with ART-2 Artificial ...An Approach to Mixed Dataset Clustering and Validation with ART-2 Artificial ...
An Approach to Mixed Dataset Clustering and Validation with ART-2 Artificial ...
 
Dimensionality Reduction and feature extraction.pptx
Dimensionality Reduction and feature extraction.pptxDimensionality Reduction and feature extraction.pptx
Dimensionality Reduction and feature extraction.pptx
 
NNPDF3.0: parton distributions for the LHC Run II
NNPDF3.0: parton distributions for the LHC Run IINNPDF3.0: parton distributions for the LHC Run II
NNPDF3.0: parton distributions for the LHC Run II
 
Ijricit 01-002 enhanced replica detection in short time for large data sets
Ijricit 01-002 enhanced replica detection in  short time for large data setsIjricit 01-002 enhanced replica detection in  short time for large data sets
Ijricit 01-002 enhanced replica detection in short time for large data sets
 
Atomreaktor
AtomreaktorAtomreaktor
Atomreaktor
 
The Use Of Decision Trees For Adaptive Item
The Use Of Decision Trees For Adaptive ItemThe Use Of Decision Trees For Adaptive Item
The Use Of Decision Trees For Adaptive Item
 
IRJET- Survey of Feature Selection based on Ant Colony
IRJET- Survey of Feature Selection based on Ant ColonyIRJET- Survey of Feature Selection based on Ant Colony
IRJET- Survey of Feature Selection based on Ant Colony
 

Más de tuxette

Racines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en mathsRacines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en mathstuxette
 
Méthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènesMéthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènestuxette
 
Méthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiquesMéthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiquestuxette
 
Projets autour de l'Hi-C
Projets autour de l'Hi-CProjets autour de l'Hi-C
Projets autour de l'Hi-Ctuxette
 
Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?tuxette
 
Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...tuxette
 
ASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiquesASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiquestuxette
 
Autour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWeanAutour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWeantuxette
 
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...tuxette
 
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiquesApprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiquestuxette
 
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...tuxette
 
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...tuxette
 
Journal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation dataJournal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation datatuxette
 
Overfitting or overparametrization?
Overfitting or overparametrization?Overfitting or overparametrization?
Overfitting or overparametrization?tuxette
 
Selective inference and single-cell differential analysis
Selective inference and single-cell differential analysisSelective inference and single-cell differential analysis
Selective inference and single-cell differential analysistuxette
 
SOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatricesSOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatricestuxette
 
Graph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype PredictionGraph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype Predictiontuxette
 
A short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelsA short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelstuxette
 
Explanable models for time series with random forest
Explanable models for time series with random forestExplanable models for time series with random forest
Explanable models for time series with random foresttuxette
 
Présentation du projet ASTERICS
Présentation du projet ASTERICSPrésentation du projet ASTERICS
Présentation du projet ASTERICStuxette
 

Más de tuxette (20)

Racines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en mathsRacines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en maths
 
Méthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènesMéthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènes
 
Méthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiquesMéthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiques
 
Projets autour de l'Hi-C
Projets autour de l'Hi-CProjets autour de l'Hi-C
Projets autour de l'Hi-C
 
Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?
 
Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...
 
ASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiquesASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiques
 
Autour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWeanAutour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWean
 
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
 
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiquesApprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
 
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
 
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
 
Journal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation dataJournal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation data
 
Overfitting or overparametrization?
Overfitting or overparametrization?Overfitting or overparametrization?
Overfitting or overparametrization?
 
Selective inference and single-cell differential analysis
Selective inference and single-cell differential analysisSelective inference and single-cell differential analysis
Selective inference and single-cell differential analysis
 
SOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatricesSOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatrices
 
Graph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype PredictionGraph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype Prediction
 
A short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelsA short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction models
 
Explanable models for time series with random forest
Explanable models for time series with random forestExplanable models for time series with random forest
Explanable models for time series with random forest
 
Présentation du projet ASTERICS
Présentation du projet ASTERICSPrésentation du projet ASTERICS
Présentation du projet ASTERICS
 

Último

BACTERIAL SECRETION SYSTEM by Dr. Chayanika Das
BACTERIAL SECRETION SYSTEM by Dr. Chayanika DasBACTERIAL SECRETION SYSTEM by Dr. Chayanika Das
BACTERIAL SECRETION SYSTEM by Dr. Chayanika DasChayanika Das
 
Pests of Sunflower_Binomics_Identification_Dr.UPR
Pests of Sunflower_Binomics_Identification_Dr.UPRPests of Sunflower_Binomics_Identification_Dr.UPR
Pests of Sunflower_Binomics_Identification_Dr.UPRPirithiRaju
 
GLYCOSIDES Classification Of GLYCOSIDES Chemical Tests Glycosides
GLYCOSIDES Classification Of GLYCOSIDES  Chemical Tests GlycosidesGLYCOSIDES Classification Of GLYCOSIDES  Chemical Tests Glycosides
GLYCOSIDES Classification Of GLYCOSIDES Chemical Tests GlycosidesNandakishor Bhaurao Deshmukh
 
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptxGENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptxRitchAndruAgustin
 
Unveiling the Cannabis Plant’s Potential
Unveiling the Cannabis Plant’s PotentialUnveiling the Cannabis Plant’s Potential
Unveiling the Cannabis Plant’s PotentialMarkus Roggen
 
Advances in AI-driven Image Recognition for Early Detection of Cancer
Advances in AI-driven Image Recognition for Early Detection of CancerAdvances in AI-driven Image Recognition for Early Detection of Cancer
Advances in AI-driven Image Recognition for Early Detection of CancerLuis Miguel Chong Chong
 
Oxo-Acids of Halogens and their Salts.pptx
Oxo-Acids of Halogens and their Salts.pptxOxo-Acids of Halogens and their Salts.pptx
Oxo-Acids of Halogens and their Salts.pptxfarhanvvdk
 
WEEK 4 PHYSICAL SCIENCE QUARTER 3 FOR G11
WEEK 4 PHYSICAL SCIENCE QUARTER 3 FOR G11WEEK 4 PHYSICAL SCIENCE QUARTER 3 FOR G11
WEEK 4 PHYSICAL SCIENCE QUARTER 3 FOR G11GelineAvendao
 
FBI Profiling - Forensic Psychology.pptx
FBI Profiling - Forensic Psychology.pptxFBI Profiling - Forensic Psychology.pptx
FBI Profiling - Forensic Psychology.pptxPayal Shrivastava
 
BACTERIAL DEFENSE SYSTEM by Dr. Chayanika Das
BACTERIAL DEFENSE SYSTEM by Dr. Chayanika DasBACTERIAL DEFENSE SYSTEM by Dr. Chayanika Das
BACTERIAL DEFENSE SYSTEM by Dr. Chayanika DasChayanika Das
 
ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...
ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...
ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...Chayanika Das
 
EGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer Zahana
EGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer ZahanaEGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer Zahana
EGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer ZahanaDr.Mahmoud Abbas
 
dll general biology week 1 - Copy.docx
dll general biology   week 1 - Copy.docxdll general biology   week 1 - Copy.docx
dll general biology week 1 - Copy.docxkarenmillo
 
Total Legal: A “Joint” Journey into the Chemistry of Cannabinoids
Total Legal: A “Joint” Journey into the Chemistry of CannabinoidsTotal Legal: A “Joint” Journey into the Chemistry of Cannabinoids
Total Legal: A “Joint” Journey into the Chemistry of CannabinoidsMarkus Roggen
 
Environmental Acoustics- Speech interference level, acoustics calibrator.pptx
Environmental Acoustics- Speech interference level, acoustics calibrator.pptxEnvironmental Acoustics- Speech interference level, acoustics calibrator.pptx
Environmental Acoustics- Speech interference level, acoustics calibrator.pptxpriyankatabhane
 
GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024Jene van der Heide
 
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...Sérgio Sacani
 
The Sensory Organs, Anatomy and Function
The Sensory Organs, Anatomy and FunctionThe Sensory Organs, Anatomy and Function
The Sensory Organs, Anatomy and FunctionJadeNovelo1
 

Último (20)

BACTERIAL SECRETION SYSTEM by Dr. Chayanika Das
BACTERIAL SECRETION SYSTEM by Dr. Chayanika DasBACTERIAL SECRETION SYSTEM by Dr. Chayanika Das
BACTERIAL SECRETION SYSTEM by Dr. Chayanika Das
 
Pests of Sunflower_Binomics_Identification_Dr.UPR
Pests of Sunflower_Binomics_Identification_Dr.UPRPests of Sunflower_Binomics_Identification_Dr.UPR
Pests of Sunflower_Binomics_Identification_Dr.UPR
 
GLYCOSIDES Classification Of GLYCOSIDES Chemical Tests Glycosides
GLYCOSIDES Classification Of GLYCOSIDES  Chemical Tests GlycosidesGLYCOSIDES Classification Of GLYCOSIDES  Chemical Tests Glycosides
GLYCOSIDES Classification Of GLYCOSIDES Chemical Tests Glycosides
 
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptxGENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
 
Unveiling the Cannabis Plant’s Potential
Unveiling the Cannabis Plant’s PotentialUnveiling the Cannabis Plant’s Potential
Unveiling the Cannabis Plant’s Potential
 
Advances in AI-driven Image Recognition for Early Detection of Cancer
Advances in AI-driven Image Recognition for Early Detection of CancerAdvances in AI-driven Image Recognition for Early Detection of Cancer
Advances in AI-driven Image Recognition for Early Detection of Cancer
 
Oxo-Acids of Halogens and their Salts.pptx
Oxo-Acids of Halogens and their Salts.pptxOxo-Acids of Halogens and their Salts.pptx
Oxo-Acids of Halogens and their Salts.pptx
 
WEEK 4 PHYSICAL SCIENCE QUARTER 3 FOR G11
WEEK 4 PHYSICAL SCIENCE QUARTER 3 FOR G11WEEK 4 PHYSICAL SCIENCE QUARTER 3 FOR G11
WEEK 4 PHYSICAL SCIENCE QUARTER 3 FOR G11
 
FBI Profiling - Forensic Psychology.pptx
FBI Profiling - Forensic Psychology.pptxFBI Profiling - Forensic Psychology.pptx
FBI Profiling - Forensic Psychology.pptx
 
BACTERIAL DEFENSE SYSTEM by Dr. Chayanika Das
BACTERIAL DEFENSE SYSTEM by Dr. Chayanika DasBACTERIAL DEFENSE SYSTEM by Dr. Chayanika Das
BACTERIAL DEFENSE SYSTEM by Dr. Chayanika Das
 
ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...
ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...
ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...
 
EGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer Zahana
EGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer ZahanaEGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer Zahana
EGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer Zahana
 
dll general biology week 1 - Copy.docx
dll general biology   week 1 - Copy.docxdll general biology   week 1 - Copy.docx
dll general biology week 1 - Copy.docx
 
Total Legal: A “Joint” Journey into the Chemistry of Cannabinoids
Total Legal: A “Joint” Journey into the Chemistry of CannabinoidsTotal Legal: A “Joint” Journey into the Chemistry of Cannabinoids
Total Legal: A “Joint” Journey into the Chemistry of Cannabinoids
 
Environmental Acoustics- Speech interference level, acoustics calibrator.pptx
Environmental Acoustics- Speech interference level, acoustics calibrator.pptxEnvironmental Acoustics- Speech interference level, acoustics calibrator.pptx
Environmental Acoustics- Speech interference level, acoustics calibrator.pptx
 
Let’s Say Someone Did Drop the Bomb. Then What?
Let’s Say Someone Did Drop the Bomb. Then What?Let’s Say Someone Did Drop the Bomb. Then What?
Let’s Say Someone Did Drop the Bomb. Then What?
 
AZOTOBACTER AS BIOFERILIZER.PPTX
AZOTOBACTER AS BIOFERILIZER.PPTXAZOTOBACTER AS BIOFERILIZER.PPTX
AZOTOBACTER AS BIOFERILIZER.PPTX
 
GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024
 
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
 
The Sensory Organs, Anatomy and Function
The Sensory Organs, Anatomy and FunctionThe Sensory Organs, Anatomy and Function
The Sensory Organs, Anatomy and Function
 

Investigating the 3D structure of the genome with Hi-C data analysis

  • 1. Investigating the 3D structure of the genome with Hi-C data analysis. Sylvain Foissac & Nathalie Villa-Vialaneix, prenom.nom@inra.fr. MIAT Seminar, Toulouse, 2 June 2017. SF & NV2 | Hi-C data analysis 1/28
  • 2. Outline: 1 Normalization; 2 TAD identification; 3 A/B compartments; 4 Differential analysis
  • 5. Purpose of normalization: 1 within matrix normalization: make bins comparable within a matrix (not needed for differential analysis); 2 between matrix normalization: make the same bin pair comparable between two matrices (needed for differential analysis)
  • 7. Different within matrix normalizations to correct technical biases (GC content, mappability...): explicit correction [Yaffe and Tanay, 2011, Hu et al., 2012]: every factor causing bias is identified and estimated; non-parametric correction: ICE correction using matrix balancing [Imakaev et al., 2012], $K_{ij} = b_i b_j \tilde{K}_{ij}$ for a $\tilde{K}$ such that, for all $i = 1, \dots, p$, $\sum_{j=1}^{p} \tilde{K}_{ij}$ is constant
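The matrix balancing behind the ICE correction can be sketched as follows. This is a minimal illustration, not the reference implementation of [Imakaev et al., 2012]: the function name `ice_balance`, the stopping rule and the toy counts are assumptions made for the example. It alternates between computing row sums and dividing the matrix by the outer product of the relative row sums, until all row sums are (nearly) equal.

```python
import numpy as np

def ice_balance(K, n_iter=200, tol=1e-10):
    """Iteratively balance a symmetric contact matrix (illustrative sketch).

    Returns (K_tilde, b) such that K[i, j] ~= b[i] * b[j] * K_tilde[i, j]
    and the row sums of K_tilde are approximately constant.
    """
    K = np.asarray(K, dtype=float)
    K_tilde = K.copy()
    b = np.ones(K.shape[0])
    for _ in range(n_iter):
        row_sums = K_tilde.sum(axis=1)
        # Relative row sums: dividing by the mean keeps the biases around 1.
        s = row_sums / row_sums[row_sums > 0].mean()
        s[s == 0] = 1.0  # bins without any count are left untouched
        K_tilde /= np.outer(s, s)
        b *= s  # accumulate the estimated bias of each bin
        if np.abs(s - 1.0).max() < tol:
            break
    return K_tilde, b

# Toy symmetric matrix with made-up counts.
K = np.array([[10.0, 4.0, 2.0],
              [4.0, 8.0, 6.0],
              [2.0, 6.0, 20.0]])
K_tilde, b = ice_balance(K)
print(K_tilde.sum(axis=1))  # row sums are equal after balancing
```

The vector `b` plays the role of the bin-specific bias, so no explicit list of bias factors (GC content, mappability) is needed.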
  • 8. Different within matrix normalizations to correct technical biases: picture from [Schmitt et al., 2016]
  • 9. Different within matrix normalizations to take distances into account: theoretical distribution taken from [Belton et al., 2012]; $K^{d}_{ij} = \dfrac{K_{ij} - \bar{K}_{d(i,j)}}{\sigma(D_{d(i,j)})}$ with $\bar{K}_{d}$ the average counts at distance $d$ and $\sigma(D_{d})$ the standard deviation at distance $d$; available in HiTC [Servant et al., 2012]
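A minimal sketch of this distance-based normalization is given below. It assumes that the distance between bins i and j is simply |i - j| (in number of bins) and that the mean and standard deviation are computed empirically over each diagonal of the matrix; the function name `distance_zscore` is hypothetical and this is not the HiTC implementation.

```python
import numpy as np

def distance_zscore(K):
    """Center and scale each count by the mean and standard deviation
    of all counts observed at the same bin distance |i - j|."""
    K = np.asarray(K, dtype=float)
    p = K.shape[0]
    Z = np.zeros_like(K)
    i, j = np.indices(K.shape)
    dist = np.abs(i - j)
    for d in range(p):
        mask = dist == d
        vals = K[mask]
        mu, sd = vals.mean(), vals.std()
        if sd > 0:  # diagonals with constant counts are left at 0
            Z[mask] = (vals - mu) / sd
    return Z

# Toy symmetric matrix with made-up counts.
K = np.array([[9.0, 5.0, 2.0, 1.0],
              [5.0, 8.0, 4.0, 2.0],
              [2.0, 4.0, 9.0, 6.0],
              [1.0, 2.0, 6.0, 10.0]])
Z = distance_zscore(K)
```

In this toy matrix the first off-diagonal holds the counts 5, 4 and 6, so the pair (0, 1) sits exactly at the average for its distance and gets a normalized value of 0.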
  • 12. Between matrix normalization: correct for differences in sequencing depth; standard approach: similar to RNA-seq normalization. However... density adjustment by LOESS fit [Robinson and Oshlack, 2010] (implemented in csaw)
  • 14. Topologically Associated Domains (TADs) [Rao et al., 2014]
  • 18. TAD method jungle: Directionality index [Dixon et al., 2012]: computes the divergence between upstream and downstream interaction counts, then uses an HMM to identify TADs; armatus [Filippova et al., 2013]: maximizes a criterion that evaluates a within/between count ratio, then combines multi-resolution results into a consensus segmentation; segmentation method [Brault et al., 2017]: block boundary estimation in the matrix; ... (many others). Interestingly, very few provide a hierarchical clustering. Comparisons in: [Fotuhi Siahpirani et al., 2016, Dali and Blanchette, 2017]
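As an illustration of the first method, the directionality index of [Dixon et al., 2012] compares, for each bin, the number of contacts A with an upstream window to the number of contacts B with a downstream window, using E = (A + B) / 2: DI = sign(B - A) * ((A - E)^2 / E + (B - E)^2 / E). The sketch below only computes this index; the HMM step that turns it into TAD calls is not shown, and the function name, the window size (in bins) and the toy matrix are assumptions.

```python
import numpy as np

def directionality_index(K, window):
    """Directionality index of each bin for a window given in bins."""
    K = np.asarray(K, dtype=float)
    p = K.shape[0]
    di = np.zeros(p)
    for i in range(p):
        A = K[i, max(0, i - window):i].sum()   # contacts with upstream bins
        B = K[i, i + 1:i + 1 + window].sum()   # contacts with downstream bins
        E = (A + B) / 2.0                      # expected count without bias
        if E > 0 and A != B:
            di[i] = np.sign(B - A) * ((A - E) ** 2 / E + (B - E) ** 2 / E)
    return di

# Toy matrix with two domains (bins 0-2 and bins 3-5), made-up counts.
K = np.ones((6, 6))
K[:3, :3] = 10.0
K[3:, 3:] = 10.0
di = directionality_index(K, window=2)
```

On this toy matrix the index is negative at bin 2 (mostly upstream contacts) and positive at bin 3 (mostly downstream contacts); such a sign change is the signal used to locate a boundary.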
  • 19. DI evolution with respect to armatus TADs
  • 20. CTCF at TAD boundaries
  • 21. Enrichment of genomic features around TAD boundaries: Homo sapiens [Dixon et al., 2012]; Sus scrofa (PORCINET project)
  • 23. Current methodological development: constrained HAC as a way to compare/combine TADs between samples. Constrained HAC: hierarchical clustering with contiguity constraints. Challenges (currently under development with Pierre Neuvial and Marie Chavent): methodological issues: what happens when using Ward's linkage criterion with a non-Euclidean similarity (counts of the Hi-C matrix)? what happens when adding constraints to HAC? (partially solved); development of the R package adjclust (Google Summer of Code selected project)
  • 25. A/B compartments [Lieberman-Aiden et al., 2009] [Giorgetti et al., 2013]. Method (in theory): compute Pearson correlations between bins (using interaction counts with all the other bins of the same chromosome); compute eigenvectors (or perform PCA) on this correlation matrix; assign bins to A/B compartments according to the sign (+/-) of their PC values
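The three theoretical steps can be sketched in a few lines. This is a minimal illustration on a made-up matrix, assuming that the first principal component is the relevant one; the function name `ab_compartments` is hypothetical. The sign of an eigenvector is arbitrary, so the code only separates bins into two groups and does not decide which group is A and which is B.

```python
import numpy as np

def ab_compartments(K):
    """Split bins into two groups from the sign of the first eigenvector
    of the Pearson correlation matrix between bin contact profiles."""
    K = np.asarray(K, dtype=float)
    C = np.corrcoef(K)                     # correlation between rows (bins)
    eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order
    pc1 = eigvecs[:, -1]                   # eigenvector of the largest eigenvalue
    labels = np.where(pc1 >= 0, "+", "-")
    return pc1, labels

# Toy matrix: bins {0, 1, 4} and {2, 3, 5} interact preferentially
# within their own group (made-up counts).
group = np.array([0, 0, 1, 1, 0, 1])
K = np.where(group[:, None] == group[None, :], 10.0, 2.0)
pc1, labels = ab_compartments(K)
```

Bins 0, 1 and 4 receive one sign and bins 2, 3 and 5 the other, which reproduces the plaid pattern built into the toy matrix.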
  • 27. A/B compartments in practice, after ICE and distance-based normalizations. Method: differentiate between A and B using the sign of the correlation between PCs and diagonal counts; choose the PC and the method that maximize $-\log_{10}(p\text{-value})$ when comparing diagonal counts between bins with positive and negative PC values (2-group comparison, Student test)
  • 28. Biological validation
  • 33. Filtering. In differential analysis of sequencing data, filtering is a crucial step: removing low count features (that have little or no chance of being found differential) improves the test power (by lightening the multiple testing correction) and can save unnecessary computational time. Filtering 1/ can be performed at the beginning of the analysis or after the estimation of the parameters of the model used for differential analysis; 2/ can use a threshold fixed to an arbitrary value (minimum total count per sample) or derived automatically from the data. For Hi-C data, filtering was performed at the beginning of the analysis (to limit the computational burden), using either an arbitrary threshold or a threshold based on an estimate of the background noise given by a quantile of inter-chromosomal counts (as in the R package diffHic)
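A data-driven threshold of this kind can be sketched as follows. This is an illustration of the idea only, not the diffHic implementation: the function name `filter_bin_pairs`, the use of the mean count across samples and the quantile level `q` are all assumptions. Inter-chromosomal pairs are treated as background noise, and a pair is kept only if its average count exceeds the chosen quantile of that background.

```python
import numpy as np

def filter_bin_pairs(counts, is_inter, q=0.5):
    """Keep bin pairs whose mean count exceeds a quantile of the mean
    counts of inter-chromosomal pairs (taken as background noise).

    counts: array of shape (n_pairs, n_samples)
    is_inter: one boolean per pair, True for inter-chromosomal pairs
    """
    counts = np.asarray(counts, dtype=float)
    is_inter = np.asarray(is_inter, dtype=bool)
    mean_counts = counts.mean(axis=1)
    threshold = np.quantile(mean_counts[is_inter], q)
    keep = mean_counts > threshold
    return keep, threshold

# Toy data: 6 bin pairs, 2 samples (made-up counts).
counts = np.array([[1, 1], [2, 2], [3, 3], [60, 40], [30, 50], [4, 4]])
is_inter = [True, True, True, False, False, False]
keep, threshold = filter_bin_pairs(counts, is_inter, q=0.5)
```

Here the background mean counts are 1, 2 and 3, so the median threshold is 2 and the first two pairs are filtered out.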
  • 34. Filtering (500 kb bins): automatic filter (removes counts below about 5); 96.4% of pairs filtered out. [Figure: before filtering vs. after filtering]
  • 35. Exploratory analysis (500kb bins) chromosome 1 1 0.911 1 0.8886 0.8866 1 0.8566 0.8651 0.8288 1 0.8973 0.9118 0.8912 0.8692 1 0.8935 0.9032 0.8818 0.8799 0.906 1 LW90−160216−GCCAAT LW90−160223−CTTGTA LW90−160308−AGTTCC LW110−160307−CGATGT LW110−160308−AGTCAA LW110−160517−ACAGTG LW 90−160216−G C C AAT LW 90−160223−C TTG TA LW 90−160308−AG TTC C LW 110−160307−C G ATG T LW 110−160308−AG TC AA LW 110−160517−AC AG TG −1.0 −0.5 0.0 0.5 1.0 Cosinus (Frobenius norm) good reproducibility between experiments no clear organization with respect to the condition SF & NV2 | Hi-C data analysis 21/28
  • 36. Exploratory analysis (500 kb bins), chromosome 1. [Figure: pairwise similarity between the six samples LW90-160216-GCCAAT, LW90-160223-CTTGTA, LW90-160308-AGTTCC, LW110-160307-CGATGT, LW110-160308-AGTCAA and LW110-160517-ACAGTG; cosine similarity (Frobenius norm), colour scale from −1 to 1; off-diagonal values range from 0.8288 to 0.9118.] Good reproducibility between experiments, but no clear organization with respect to the condition. [Figure: PCA of all data after filtering and between-matrix normalization (LOESS).] There are 2 outliers, but PC1 is organized with respect to the condition.
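The similarity shown on this slide can be sketched as the cosine of the angle between two contact matrices for the Frobenius inner product. The code below is an illustration under that reading of the slide; the function names are hypothetical and the matrices are assumed to be non-zero and of identical shape.

```python
import numpy as np

def frobenius_cosine(a, b):
    """Cosine similarity between two matrices of the same shape:
    <A, B>_F / (||A||_F * ||B||_F)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    inner = np.sum(a * b)  # Frobenius inner product
    # np.linalg.norm on a 2-D array defaults to the Frobenius norm.
    return inner / (np.linalg.norm(a) * np.linalg.norm(b))

def similarity_matrix(matrices):
    """Symmetric matrix of pairwise cosine similarities between samples."""
    n = len(matrices)
    sim = np.eye(n)
    for i in range(n):
        for j in range(i):
            sim[i, j] = sim[j, i] = frobenius_cosine(matrices[i], matrices[j])
    return sim
```

Because contact counts are non-negative, this similarity lies between 0 and 1 for Hi-C matrices, and it is insensitive to a global rescaling of either matrix (for example a difference in sequencing depth).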
  • 39. Methods for differential analysis of Hi-C. Similar to RNA-seq [Lun and Smyth, 2015], with the R package diffHic (essentially a wrapper for edgeR): count data are modeled by a negative binomial distribution; parameters (mean and variance per feature: per gene in RNA-seq, per bin pair here) are estimated from the data, with the variance modeled as a function of the mean; the test is either an exact test (similar to Fisher's) or a likelihood ratio test (GLM).
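For reference, the mean-variance relationship of the negative binomial model can be written as below, in the parametrization used by edgeR. The notation is an assumption introduced here, not taken from the slides: $Y_{ps}$ is the count of bin pair $p$ in sample $s$, $\mu_{ps}$ its mean and $\phi_p$ the dispersion.

```latex
Y_{ps} \sim \mathrm{NB}(\mu_{ps}, \phi_p),
\qquad
\mathbb{E}(Y_{ps}) = \mu_{ps},
\qquad
\mathrm{Var}(Y_{ps}) = \mu_{ps} + \phi_p \, \mu_{ps}^2 .
```

With $\phi_p = 0$ the model reduces to a Poisson distribution; a positive dispersion accounts for the extra variability between biological replicates.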
  • 43. Complementary remarks about DE analysis. Hi-C data contain more zeros than RNA-seq data: some people propose using a zero-inflated negative binomial distribution (unpublished as far as I know). The method provides a p-value for every pair of bins, so the analysis involves a very large number of tests at finer resolutions (500 kb bins: 998 623 pairs after filtering, 13 509 221 without filtering); the problem is solved for 500 kb bins but still under study for 40 kb bins. Tests are performed as if bin pairs were independent, whereas they are spatially correlated: estimation of model parameters might be improved if it were 1/ smoothed with respect to spatial proximity (similar to what is sometimes done in methylation data analysis); 2/ performed independently for pairs of bins at a given distance (future work). Post-analysis of the spatial distribution of p-values is work in progress with Pierre Neuvial (submitted CNRS project).
  • 44. [Picture slide: because the last page had no picture; probably not suited for the youngest]
  • 45. Preliminary results: 913 bin pairs found differential (after multiple testing correction); most of them are related to 3 chromosomes; parameter setting (filters...) and biological analysis are work in progress.
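The slides do not say which multiple testing correction was used. As an illustration only, the sketch below implements the Benjamini-Hochberg adjustment, a common choice when one p-value is obtained per bin pair; the function name is hypothetical.

```python
import numpy as np

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (control of the FDR)."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    # p_(k) * n / k for the k-th smallest p-value.
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted

# Toy example: bin pairs with adjusted p-value below 5% are called differential.
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
differential = adj < 0.05
```

This procedure assumes independent (or positively dependent) tests, which connects to the remark on the slides that bin pairs are spatially correlated.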
  • 48. Differential TADs (state of the art). Goal: detecting differential domains between the two conditions. Existing approaches: [Fraser et al., 2015] (3 conditions, no replicate) use an HMM on TAD boundaries (with a tolerance threshold) to identify TAD boundaries that differ between samples, then HAC (hierarchical agglomerative clustering) on TADs with the cophenetic distance to obtain locally conserved structures through a z-score approach; the R package diffHic computes upstream/downstream counts (within ±100 kb) and uses the GLM implemented in edgeR with an interaction between stream direction (up/down) and condition. However, the first approach does not take biological variability into account (no replicate) and the second uses only a very aggregated criterion.
  • 49. Differential TADs (perspectives). Ideas for future work: using constrained HAC, are we able to compute a consensus dendrogram from several biological replicates, and to identify branches that differ significantly (in which sense?) between conditions, taking the within-condition variability into account?
  • 51. Conclusions and perspectives. Honestly, it's late and I really do not believe that I will have enough time to make a conclusion and discuss perspectives so... Questions?
  • 52. References
    Belton, J., McCord, R., Gibcus, J., Naumova, N., Zhan, Y., and Dekker, J. (2012). Hi-C: a comprehensive technique to capture the conformation of genomes. Methods, 58:268–276.
    Brault, V., Chiquet, J., and Lévy-Leduc, C. (2017). Efficient block boundaries estimation in block-wise constant matrices: an application to HiC data. Electronic Journal of Statistics, 11(1):1570–1599.
    Dali, R. and Blanchette, M. (2017). A critical assessment of topologically associating domain prediction tools. Nucleic Acids Research, 45(6):2994–3005.
    Dixon, J., Selvaraj, S., Yue, F., Kim, A., Li, Y., Shen, Y., Hu, M., Liu, J., and Ren, B. (2012). Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature, 485:376–380.
    Filippova, D., Patro, R., Duggal, G., and Kingsford, C. (2013). Identification of alternative topological domains in chromatin. Algorithms for Molecular Biology, 9:14.
    Fotuhi Siahpirani, A., Ay, F., and Roy, S. (2016). A multi-task graph-clustering approach for chromosome conformation capture data sets identifies conserved modules of chromosomal interactions. Genome Biology, 17:114.
    Fraser, J., Ferrai, C., Chiariello, A., Schueler, M., Rito, T., Laudanno, G., Barbieri, M., Moore, B., Kraemer, D., Aitken, S., Xie, S., Morris, K., Itoh, M., Kawaji, H., Jaeger, I., Hayashizaki, Y., Carninci, P., Forrest, A., The FANTOM Consortium, Semple, C., Dostie, J., Pombo, A., and Nicodemi, M. (2015). Hierarchical folding and reorganization of chromosomes are linked to transcriptional changes in cellular differentiation. Molecular Systems Biology, 11:852.
    Giorgetti, L., Servant, N., and Heard, E. (2013). Changes in the organization of the genome during the mammalian cell cycle. Genome Biology, 14:142.
    Hu, M., Deng, K., Selvaraj, S., Qin, Z., Ren, B., and Liu, J. (2012). HiCNorm: removing biases in Hi-C data via Poisson regression. Bioinformatics, 28(23):3131–3133.
    Imakaev, M., Fudenberg, G., McCord, R., Naumova, N., Goloborodko, A., Lajoie, B., Dekker, J., and Mirny, L. (2012). Iterative correction of Hi-C data reveals hallmarks of chromosome organization. Nature Methods, 9:999–1003.
    Lieberman-Aiden, E., van Berkum, N., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B., Sabo, P., Dorschner, M., Sandstrom, R., Bernstein, B., Bender, M., Groudine, M., Gnirke, A., Stamatoyannopoulos, J., Mirny, L., Lander, E., and Dekker, J. (2009). Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science, 326(5950):289–293.
    Lun, A. and Smyth, G. (2015). diffHic: a Bioconductor package to detect differential genomic interactions in Hi-C data. BMC Bioinformatics, 16:258.
    Rao, S., Huntley, M., Durand, N., Stamenova, E., Bochkov, I., Robinson, J., Sanborn, A., Machol, I., Omer, A., Lander, E., and Lieberman Aiden, E. (2014). A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell, 159(7):1665–1680.
    Robinson, M. and Oshlack, A. (2010). A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biology, 11:R25.
    Schmitt, A., Hu, M., and Ren, B. (2016). Genome-wide mapping and analysis of chromosome architecture. Nature Reviews Molecular Cell Biology, 17(12):743–755.
    Servant, N., Lajoie, B., Nora, E., Giorgetti, L., Chen, C., Heard, E., Dekker, J., and Barillot, E. (2012). HiTC: exploration of high-throughput ‘C’ experiments. Bioinformatics, 28(21):2843–2844.
    Yaffe, E. and Tanay, A. (2011). Probabilistic modeling of Hi-C contact maps eliminates systematic biases to characterize global chromosomal architecture. Nature Genetics, 43:1059–1065.