1. A STATISTICAL MODEL FOR METHYLATION
LEVEL INFERENCE USING BS-SEQ DATA
M. BESSOUL, G. VIEJO | Université Pierre et Marie Curie | 2012
MASTER BIOINFORMATIQUE ET MODELISATION
Sunday, 17 June 2012
2. CONTENT
I. Background on DNA methylation
II. Motivations
III. BS-Seq
IV. Data simulation
V. Statistical model
VI. MethSeq
VII. Results
VIII. Discussion
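7. III. BS-SEQ
BISULFITE SEQUENCING
[Diagram: sodium bisulfite treatment followed by PCR. The original sequence G C C C T A, with two cytosines marked as methylated (m), reads G C T C T A after treatment.]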
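8. III. BS-SEQ
BISULFITE SEQUENCING
[Diagram: sodium bisulfite treatment followed by PCR turns G C C C T A, with two cytosines marked as methylated (m), into the bisulfite sequence G C T C T A.]
Alignment: BS sequence over C-less reference
[Diagram: the bisulfite sequence G C T C T A aligned over the C-less sequence G T T T T A.]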
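The conversion and the C-less reference shown on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' script: the function names `bisulfite_convert` and `c_less` are hypothetical, positions are 0-based, only the forward strand is handled, and the two methylated positions are an assumption read off the diagram.

```python
def bisulfite_convert(seq, methylated):
    """Return the sequence read after bisulfite treatment and PCR.

    An unmethylated C is read as T; a C whose 0-based position is in
    `methylated` is left unchanged.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def c_less(seq):
    """Return the reference with every C replaced by T."""
    return seq.replace("C", "T")


genome = "GCCCTA"
methylated = {1, 3}  # assumed positions of the two "m" marks in the diagram

bs_read = bisulfite_convert(genome, methylated)
reference = c_less(genome)

# Positions where the bisulfite read disagrees with the C-less reference.
mismatches = [i for i, (r, g) in enumerate(zip(bs_read, reference)) if r != g]
```

Under these assumptions, the mismatches against the C-less reference fall exactly on the methylated cytosines, which is the signal the alignment step relies on.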
11. III. BS-SEQ
ALIGNMENT
[Diagram: reads (G C T C T A) aligned over the C-less sequence (G T T T T A).]
What do we get? For every C position:
• Number of overlapping reads: y_reads
• Number of mismatches: y_c
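The two counts listed above can be computed as in the following sketch. It is only an illustration under stated assumptions: reads are given as hypothetical `(start, sequence)` pairs, already aligned to the forward strand without gaps, and any base other than T at a C position is counted as a mismatch, so sequencing errors are not separated from methylated cytosines.

```python
from collections import Counter


def count_at_c_positions(genome, reads):
    """Return {position: (y_reads, y_c)} for every covered C of `genome`.

    `reads` is an iterable of (start, sequence) pairs: 0-based, ungapped,
    forward-strand alignments (an assumed, simplified input format).
    """
    reference = genome.replace("C", "T")  # C-less reference
    coverage = Counter()    # y_reads: reads overlapping each C position
    mismatches = Counter()  # y_c: reads that disagree with the C-less reference
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos >= len(genome) or genome[pos] != "C":
                continue  # only C positions of the original genome are tallied
            coverage[pos] += 1
            if base != reference[pos]:
                mismatches[pos] += 1
    return {pos: (coverage[pos], mismatches[pos]) for pos in sorted(coverage)}


genome = "GCCCTA"
reads = [(0, "GCTCTA"), (0, "GCTTTA"), (1, "CTCT")]
counts = count_at_c_positions(genome, reads)
```

With these three toy reads, position 1 is covered three times with three mismatches, position 2 three times with none, and position 3 three times with two. The ratio y_c / y_reads would be a naive per-position estimate; the statistical model of section V is not reproduced here.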
12. IV. DATA SIMULATION
[Diagram: simulation pipeline in two stages; three of the steps are labelled ".py script".
Data simulation: GENOME (.fasta file), random methylation simulation, real profile (5mC positions), bisulfite transformation, sequencing, short reads.
Alignment and parsing: C-less index, Bowtie alignment, SAM file, parsing, BS-Seq data (coverage + mismatches at C positions), comparison with the real profile.]
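The data-simulation stage of the diagram could look roughly like the sketch below. Everything in it is an assumption made for illustration: the function names, the uniform methylation probability `rate`, the read length and the read count do not come from the slides. Sequencing errors and the reverse strand are ignored, and all reads are cut from a single converted genome, so each C is either always or never methylated.

```python
import random


def simulate_profile(genome, rate, rng):
    """Mark each C as methylated with probability `rate` (assumed model)."""
    return {i for i, b in enumerate(genome) if b == "C" and rng.random() < rate}


def bisulfite_transform(genome, profile):
    """Turn every unmethylated C into T; methylated C positions are kept."""
    return "".join(
        "T" if b == "C" and i not in profile else b
        for i, b in enumerate(genome)
    )


def sample_reads(sequence, n_reads, read_len, rng):
    """Draw error-free reads of fixed length at uniformly random starts."""
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(sequence) - read_len + 1)
        reads.append((start, sequence[start:start + read_len]))
    return reads


rng = random.Random(0)  # fixed seed so the run is reproducible
genome = "".join(rng.choice("ACGT") for _ in range(200))  # toy stand-in for the .fasta genome
profile = simulate_profile(genome, rate=0.3, rng=rng)     # "real profile" (5mC positions)
converted = bisulfite_transform(genome, profile)
reads = sample_reads(converted, n_reads=50, read_len=30, rng=rng)
```

In the pipeline shown on the slide, the short reads would then be aligned with Bowtie against the C-less index and the resulting SAM file parsed. That step is left out here because the slides give no detail about the commands or options used.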