www.cmmt.ubc.ca
JASPAR BioPython & MANTA
Anthony Mathelier, David Arenillas & Wyeth Wasserman
anthony.mathelier@gmail.com & dave@cmmt.ubc.ca
Wasserman Lab
2
Outline
● JASPAR BioPython module
– What is JASPAR?
– How to construct matrices from JASPAR files using the JASPAR BioPython module.
● MANTA
– What is stored in MANTA?
– How to interrogate the MANTA DB using Python and our web application.
3
http://jaspar.genereg.net
Mathelier et al. JASPAR 2014: an extensively expanded and updated open-access database of
transcription factor binding profiles. Nucleic Acids Res. 2014 PMID 24194598
4
Modelling Transcription Factor Binding Sites (TFBS)
Example: FOXD1
Known binding sites
gctaaGTAACAATgcgca
cttaaGTAAACATcgctc
ccaatGTAAACAAacgga
gaaagGTAAACAAtgggc
GTAAACATgtact
cttgtGTAAACAAaaagc
cttaaGTAAACACgtccg
cttatGTCAACAGtgggt
tGTAAACATtgcat
GTAAACAAtgcga
cttagGTAAACAT
tttcgTTAAGTAAaca
caaaATAAACAAcgtgc
gctaaCTAAACAGagaga
gtgttGTAAACATtggaa
taatGTAAACAAtgcgg
gaaagGTAAACATaagaa
cctaaGTAAACACaacgc
cctaaGTAAACATt
cttatGTAAACAGaggtc
PFM – Position Frequency Matrix
A [ 1  0 19 20 18  1 20  7 ]
C [ 1  0  1  0  1 18  0  2 ]
G [17  0  0  0  1  0  0  3 ]
T [ 1 20  0  0  0  1  0  8 ]
[Figure: sequence logo]
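The step from known binding sites to the PFM can be sketched in a few lines of plain Python. This is only an illustration, assuming that the upper-case letters of each site mark the aligned 8-bp core; the helper names (core, build_pfm) are made up for this example and are not part of any library.

```python
# The 20 FOXD1 sites from the slide; upper case marks the aligned core.
SITES = [
    "gctaaGTAACAATgcgca", "cttaaGTAAACATcgctc", "ccaatGTAAACAAacgga",
    "gaaagGTAAACAAtgggc", "GTAAACATgtact", "cttgtGTAAACAAaaagc",
    "cttaaGTAAACACgtccg", "cttatGTCAACAGtgggt", "tGTAAACATtgcat",
    "GTAAACAAtgcga", "cttagGTAAACAT", "tttcgTTAAGTAAaca",
    "caaaATAAACAAcgtgc", "gctaaCTAAACAGagaga", "gtgttGTAAACATtggaa",
    "taatGTAAACAAtgcgg", "gaaagGTAAACATaagaa", "cctaaGTAAACACaacgc",
    "cctaaGTAAACATt", "cttatGTAAACAGaggtc",
]


def core(site):
    """Return the upper-case (aligned) part of a site, dropping the flanks."""
    return "".join(ch for ch in site if ch.isupper())


def build_pfm(sites):
    """Count each nucleotide at each column of the aligned cores."""
    cores = [core(site) for site in sites]
    width = len(cores[0])
    if any(len(c) != width for c in cores):
        raise ValueError("aligned cores must all have the same length")
    pfm = {base: [0] * width for base in "ACGT"}
    for c in cores:
        for column, base in enumerate(c):
            pfm[base][column] += 1
    return pfm


pfm = build_pfm(SITES)
for base in "ACGT":
    print(base, pfm[base])
```

Counting the cores this way gives the same matrix as the PFM shown on the slide.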
5
Scoring putative TFBS sequences
PFM
A  [ 1  0 19 20 18  1 20  7 ]
C  [ 1  0  1  0  1 18  0  2 ]
G  [17  0  0  0  1  0  0  3 ]
T  [ 1 20  0  0  0  1  0  8 ]
PWM – Position Weight Matrix (aka PSSM – Position Specific Scoring Matrix)
A  [-1.5 -2.5  1.7  1.8  1.6 -1.5  1.8  0.4 ]
C  [-1.5 -2.5 -1.5 -2.5 -1.5  1.6 -2.5 -1.0 ]
G  [ 1.6 -2.5 -2.5 -2.5 -1.5 -2.5 -2.5 -0.6 ]
T  [-1.5  1.8 -2.5 -2.5 -2.5 -1.5 -2.5  0.6 ]
Sequence: A C G A G T T A A A C A A G C T A
Sum the PWM score at each position of the aligned window (here TTAAACAA).
Score = 9.2
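The slide does not state how the PFM is converted to the PWM. The PWM values shown are reproduced by a log2-odds score against a uniform background (0.25 per base) with a pseudocount of sqrt(N)/4 per base, so the sketch below assumes that formula. The function names are illustrative and only the forward strand is scanned.

```python
import math

# FOXD1 position frequency matrix, copied from the slide.
PFM = {
    "A": [1, 0, 19, 20, 18, 1, 20, 7],
    "C": [1, 0, 1, 0, 1, 18, 0, 2],
    "G": [17, 0, 0, 0, 1, 0, 0, 3],
    "T": [1, 20, 0, 0, 0, 1, 0, 8],
}
BACKGROUND = 0.25  # assumed uniform background frequency


def pfm_to_pwm(pfm):
    """Log2-odds PWM; the sqrt(N)/4 pseudocount is an assumption."""
    width = len(pfm["A"])
    pwm = {base: [] for base in "ACGT"}
    for col in range(width):
        n_sites = sum(pfm[base][col] for base in "ACGT")
        pseudo = math.sqrt(n_sites) / 4
        for base in "ACGT":
            freq = (pfm[base][col] + pseudo) / (n_sites + 4 * pseudo)
            pwm[base].append(math.log2(freq / BACKGROUND))
    return pwm


def score_window(pwm, window):
    """Sum the PWM score at each position of one aligned window."""
    return sum(pwm[base][col] for col, base in enumerate(window))


def best_hit(pwm, sequence):
    """Return (score, offset) of the best-scoring forward-strand window."""
    width = len(pwm["A"])
    offsets = range(len(sequence) - width + 1)
    return max((score_window(pwm, sequence[i:i + width]), i) for i in offsets)


pwm = pfm_to_pwm(PFM)
# Round to one decimal, as displayed on the slide.
rounded = {base: [round(x, 1) for x in pwm[base]] for base in "ACGT"}
best_score, best_offset = best_hit(rounded, "ACGAGTTAAACAAGCTA")
print(round(best_score, 1), best_offset)  # 9.2 5
```

The best window starts at offset 5 (TTAAACAA), which matches the score of 9.2 given on the slide.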
6
Overview of the JASPAR 2014 database
7
JASPAR Biopython modules
Extend Biopython's Bio.motifs module to support construction of TFBS matrices from JASPAR-supported formats.
➢ Bio.motifs.jaspar
– Read / write motifs encoded in the JASPAR flat file formats: sites, pfm and jaspar
➢ Bio.motifs.jaspar.db
– Search / fetch motifs from a JASPAR-formatted database.
http://biopython.org*
*Cock et al. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics. 2009 Jun 1;25(11):1422-3. PMID: 19304878
8
Constructing a matrix from a JASPAR sites formatted file
The JASPAR sites format consists of a list of known binding sites for a motif.
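The code shown on this slide was not captured in the transcript. A minimal sketch of reading a sites file with Bio.motifs might look like the following. It assumes Biopython is installed and uses an in-memory string in place of a real file. The record headers (">FOXD1 site 1" and so on) and the helper name read_sites are made up for illustration.

```python
import io

try:
    from Bio import motifs  # Biopython
except ImportError:
    motifs = None  # the sketch still loads; install Biopython to run it

# A few FOXD1 sites written in a FASTA-like "sites" layout.
# Upper-case letters mark the aligned core; headers are hypothetical.
SITES_TEXT = """\
>FOXD1 site 1
gctaaGTAACAATgcgca
>FOXD1 site 2
cttaaGTAAACATcgctc
>FOXD1 site 3
ccaatGTAAACAAacgga
>FOXD1 site 4
gaaagGTAAACAAtgggc
>FOXD1 site 5
cttatGTCAACAGtgggt
"""


def read_sites(text):
    """Parse sites-formatted text into a single Biopython motif."""
    if motifs is None:
        raise RuntimeError("Biopython is required for this example")
    return motifs.read(io.StringIO(text), "sites")


if motifs is not None:
    motif = read_sites(SITES_TEXT)
    print(motif.counts)  # frequency matrix built from the upper-case cores
```

With a file on disk, the handle would come from open() instead of io.StringIO.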
9
Constructing a matrix from a JASPAR pfm formatted file
The JASPAR pfm format simply describes a frequency matrix for a single motif.
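The original code for this slide is missing from the transcript. As a rough sketch, and assuming Biopython is available, a pfm file (four rows of counts in A, C, G, T order) could be read as shown below. The FOXD1 counts come from the earlier slide; the helper name read_pfm is illustrative.

```python
import io

try:
    from Bio import motifs  # Biopython
except ImportError:
    motifs = None  # the sketch still loads; install Biopython to run it

# FOXD1 counts in pfm layout: one row each for A, C, G and T.
PFM_TEXT = """\
 1  0 19 20 18  1 20  7
 1  0  1  0  1 18  0  2
17  0  0  0  1  0  0  3
 1 20  0  0  0  1  0  8
"""


def read_pfm(text):
    """Parse pfm-formatted text into a single Biopython motif."""
    if motifs is None:
        raise RuntimeError("Biopython is required for this example")
    return motifs.read(io.StringIO(text), "pfm")


if motifs is not None:
    motif = read_pfm(PFM_TEXT)
    print(motif.counts)
```

The pfm format carries no identifier or name, so those attributes stay unset on the resulting motif.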
10
Constructing matrices from a JASPAR jaspar formatted file
The JASPAR jaspar format allows for multiple motifs. Each record consists of a header line followed by four lines defining the frequency matrix.
Note the use of the parse rather than the read method to read multiple motifs.
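The slide's code is not in the transcript. A sketch of parsing a multi-record jaspar file, assuming Biopython is installed, is given below. The first record uses the FOXD1 counts from the earlier slide with MA0031.1 as its assumed JASPAR ID. The second record is a made-up toy motif included only to show that parse returns several motifs.

```python
import io

try:
    from Bio import motifs  # Biopython
except ImportError:
    motifs = None  # the sketch still loads; install Biopython to run it

# Two records: a header line, then four rows (A, C, G, T) per motif.
JASPAR_TEXT = """\
>MA0031.1 FOXD1
A  [ 1  0 19 20 18  1 20  7 ]
C  [ 1  0  1  0  1 18  0  2 ]
G  [17  0  0  0  1  0  0  3 ]
T  [ 1 20  0  0  0  1  0  8 ]
>TOY0001.1 toy_motif
A  [ 4  0  0 ]
C  [ 0  4  0 ]
G  [ 0  0  4 ]
T  [ 0  0  0 ]
"""


def parse_jaspar(text):
    """Parse jaspar-formatted text into a list of Biopython motifs."""
    if motifs is None:
        raise RuntimeError("Biopython is required for this example")
    return list(motifs.parse(io.StringIO(text), "jaspar"))


if motifs is not None:
    for motif in parse_jaspar(JASPAR_TEXT):
        print(motif.matrix_id, motif.name, len(motif))
```

Calling read on this text would raise an error because more than one motif is present, which is why parse is used here.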
11
Constructing matrices from a JASPAR jaspar formatted file cont'd
The frequency portions of the file can be specified in a simpler format identical to the pfm format.
12
The JASPAR DB module
Modelled after the Perl TFBS modules*. Specifically, the Bio.motifs.jaspar.db.JASPAR5 BioPython class is modelled after the TFBS::DB::JASPAR5 Perl class.
Connect to a JASPAR database:
Fetch a specific motif by its JASPAR ID:
* Lenhard et al. TFBS: Computational framework for transcription factor binding site analysis. Bioinformatics. 2002 PMID 12176838
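The connection and fetch code from this slide was not captured. The sketch below shows one way the two steps might look with the JASPAR5 class. It assumes Biopython and a MySQL driver (MySQLdb) are installed and that a JASPAR-formatted MySQL database is reachable. The connection values in the usage comment are placeholders, not real credentials, and the helper names are illustrative.

```python
def connect(host, name, user, password):
    """Open a connection to a JASPAR-formatted database.

    The import is done here because Bio.motifs.jaspar.db needs the
    MySQLdb driver, which may not be installed everywhere.
    """
    from Bio.motifs.jaspar.db import JASPAR5

    return JASPAR5(host=host, name=name, user=user, password=password)


def fetch_by_id(jdb, matrix_id):
    """Fetch a single motif by its JASPAR ID, e.g. a base ID or ID.version."""
    return jdb.fetch_motif_by_id(matrix_id)


# Usage (placeholder connection values, replace with your own):
#   jdb = connect("db.example.org", "JASPAR_DB", "reader", "secret")
#   motif = fetch_by_id(jdb, "MA0031")
#   print(motif)
```

Keeping the connection and the query in separate functions lets the same connection object be reused for several fetches.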
13
JASPAR DB module cont'd
Fetch multiple motifs according to various attributes.
Example: fetch the motifs of all the vertebrate and insect transcription factors from the CORE JASPAR collection which are part of the Forkhead family and which have an information content of at least 12 bits:
Note that selection criteria (such as 'tax_group' and 'tf_family') which allow multiple values may be specified either as a single value or as a list of values.
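The query shown on the slide is missing from the transcript. A sketch of the described selection using the fetch_motifs method of a JASPAR5 connection could look like this. The jdb argument is assumed to be an already-open JASPAR5 object, and the names FORKHEAD_CRITERIA and fetch_forkhead_motifs are made up for illustration.

```python
# Selection criteria from the example: CORE collection, vertebrates and
# insects, Forkhead family, information content of at least 12 bits.
FORKHEAD_CRITERIA = {
    "collection": "CORE",
    "tax_group": ["vertebrates", "insects"],  # list of values
    "tf_family": "Forkhead",                  # single value
    "min_ic": 12,
}


def fetch_forkhead_motifs(jdb):
    """Run the selection against an open JASPAR5 connection."""
    return jdb.fetch_motifs(**FORKHEAD_CRITERIA)


# Usage, given an open connection `jdb`:
#   for motif in fetch_forkhead_motifs(jdb):
#       print(motif.matrix_id, motif.name)
```

Here tax_group is passed as a list and tf_family as a single string, illustrating the two accepted forms.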
14
For more information...
For an overview and examples of using these modules, please
see the JASPAR sub-section under the “Reading motifs”
section of the BioPython Tutorial and Cookbook:
http://biopython.org/DIST/docs/tutorial/Tutorial.html
For more technical information see the Bio.motifs.jaspar
section of the BioPython API docs:
http://biopython.org/DIST/docs/api
15
MANTA
MongoDB for Analysis of TFBS Alteration
Mathelier et al. Cis-regulatory somatic mutations and gene-expression alteration in B-cell lymphomas.
Genome Biology. 2015. PMID 25903198
16
MANTA
DB
...gctaaGTAACAATgcgca...
...cttaaGTAAACATcgctc...
...ccaatGTAAACAAacgga...
Adapted from Szalkowski and Schmid (2010). Briefings in Bioinformatics.
17
MANTA Statistics
ChIP-seq experiments: 477
Transcription factors: 103
TFBSs: 9,510,336
Unique bases covered: 76,160,599 (~2.25% of the human genome)
AMIA TBI&CRI, March 19th-23rd, 2012
18
Variations may impact TF binding
Binding sequence: TF recognizes binding site; transcription initiated.
AGCTAGCTATATTTAAACAACACTGTCTAGCATTGCCTGATAGATGAGCCGTCGCAGCTGGA
Mutated binding sequence: TF fails to recognize binding site; transcription fails to initiate.
AGCTAGCTATATTTAATCCACACTGTCTAGCATTGCCTGATAGATGAGCCGTCGCAGCTGGA
[Figure labels: TF, 5’ UTR, Exon]
19
Assessing the impact of variations on TF binding
[Figure sequence, slides 19-27. Labels: DNA, TFBS, SNV. Slide 25: record best TFBS hit with the mutated sequence. Slides 26-27: density plot, x-axis alt/ref (0.80 to 1.10), y-axis Density (0 to 7); slide 27 marks the "Alternative" value.]
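The idea pictured in this slide sequence can be sketched as follows: introduce an SNV, rescan the windows that overlap it, record the best hit, and compare it with the reference score. This is an illustration only. It reuses the FOXD1 PWM and example sequence from the scoring slide, scans the forward strand only, and takes a plain alt/ref ratio of raw PWM sums. The slides do not give the exact score normalisation or impact formula used by MANTA, so those are assumptions, as are all function names.

```python
# FOXD1 PWM values as displayed on the scoring slide.
PWM = {
    "A": [-1.5, -2.5, 1.7, 1.8, 1.6, -1.5, 1.8, 0.4],
    "C": [-1.5, -2.5, -1.5, -2.5, -1.5, 1.6, -2.5, -1.0],
    "G": [1.6, -2.5, -2.5, -2.5, -1.5, -2.5, -2.5, -0.6],
    "T": [-1.5, 1.8, -2.5, -2.5, -2.5, -1.5, -2.5, 0.6],
}
WIDTH = 8


def score_window(window):
    """Sum the PWM score at each position of one window."""
    return sum(PWM[base][col] for col, base in enumerate(window))


def best_hit_overlapping(sequence, position):
    """Best (score, start) among windows covering `position` (0-based)."""
    first = max(0, position - WIDTH + 1)
    last = min(position, len(sequence) - WIDTH)
    return max(
        (score_window(sequence[s:s + WIDTH]), s) for s in range(first, last + 1)
    )


def snv_impact(sequence, position, alt):
    """Return (ref_score, alt_score, alt/ref) for one substitution.

    The ratio is only meaningful when the reference score is positive.
    """
    ref_score, _ = best_hit_overlapping(sequence, position)
    mutated = sequence[:position] + alt + sequence[position + 1:]
    alt_score, _ = best_hit_overlapping(mutated, position)
    return ref_score, alt_score, alt_score / ref_score


# Mutate the A at 0-based position 9 of the example sequence to T.
ref_score, alt_score, ratio = snv_impact("ACGAGTTAAACAAGCTA", 9, "T")
print(round(ref_score, 1), round(alt_score, 1), round(ratio, 2))
```

In this toy case the substitution falls inside the best reference hit, so the best score on the mutated sequence is lower and the ratio drops below 1.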
28
Example of Application of MANTA
Mathelier et al. Cis-regulatory somatic mutations and gene-expression alteration in B-cell lymphomas.
Genome Biology. 2015. PMID
29
The MANTA Database
Implemented with MongoDB (http://www.mongodb.org)
Consists of 3 collections:
Experiments
- experiment name, type, TF name, JASPAR matrix ID, etc.
Peaks
- peak position (chromosome, start, end), score, position of maximum peak height, etc.
TFBSs / SNVs
- position (chromosome, start, end), strand and score for the unmutated TFBS, plus similar information and an impact score for each position / alternative allele mutation.
30
MANTA DB with Python
Example: connect to the MANTA DB and fetch all TFBSs affected by an SNV at position 6425005 on chromosome 19.
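The code for this example was not captured in the transcript. A rough sketch using pymongo is shown below. It assumes pymongo is installed and a MANTA MongoDB instance is reachable. The connection URI, database name, collection name and field names (chrom, start, end) are all hypothetical, since the slides describe the stored information but not the actual schema.

```python
def build_snv_query(chrom, position):
    """Mongo-style filter for TFBS records overlapping one position.

    Field names are hypothetical; adapt them to the real MANTA schema.
    """
    return {
        "chrom": chrom,
        "start": {"$lte": position},
        "end": {"$gte": position},
    }


def fetch_affected_tfbs(uri, db_name, collection_name, chrom, position):
    """Return all TFBS documents overlapping the given SNV position."""
    from pymongo import MongoClient  # imported here so the sketch loads without pymongo

    client = MongoClient(uri)
    try:
        collection = client[db_name][collection_name]
        return list(collection.find(build_snv_query(chrom, position)))
    finally:
        client.close()


# Usage (all connection values are placeholders):
#   hits = fetch_affected_tfbs(
#       "mongodb://localhost:27017", "manta", "tfbs_snvs", "19", 6425005
#   )
#   for tfbs in hits:
#       print(tfbs)
print(build_snv_query("19", 6425005))
```

Separating the filter from the database call makes the query easy to inspect and adjust once the real field names are known.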
31
MANTA Web Interface
URL: http://manta.cmmt.ubc.ca/manta
Source code: https://github.com/wassermanlab/MANTA
32
33
34
Thanks!
Any questions?
Contacts:
Anthony Mathelier, anthony.mathelier@gmail.com
David Arenillas, dave@cmmt.ubc.ca
URLs:
Wasserman Lab: www.cisreg.ca
BioPython: http://biopython.org
MANTA: manta.cmmt.ubc.ca/manta

Webinar about JASPAR BioPython module and MANTA.

  • 1. www.cmmt.ubc.ca JASPAR BioPython & MANTA Anthony Mathelier, David Arenillas & Wyeth Wasserman anthony.mathelier@gmail.com & dave@cmmt.ubc.ca Wasserman Lab
  • 2. 2 Outline ● JASPAR BioPython module – What is JASPAR? – How to construct matrices from JASPAR files using the JASPAR BioPython module. ● MANTA – What is stored in MANTA? – How to interrogate the MANTA DB using Python and our web application.
  • 3. 3 http://jaspar.genereg.net Mathelier et al. JASPAR 2014: an extensively expanded and updated open-access database of transcription factor binding profiles. Nucleic Acids Res. 2014 PMID 24194598
  • 4. 4 Modelling Transcription Factor Binding Sites (TFBS) A [ 1 0 19 20 18 1 20 7 ] C [ 1 0 1 0 1 18 0 2 ] G [17 0 0 0 1 0 0 3 ] T [ 1 20 0 0 0 1 0 8 ] Example: FOXD1 PFM – Position Frequency Matrix Logo gctaaGTAACAATgcgca cttaaGTAAACATcgctc ccaatGTAAACAAacgga gaaagGTAAACAAtgggc GTAAACATgtact cttgtGTAAACAAaaagc cttaaGTAAACACgtccg cttatGTCAACAGtgggt tGTAAACATtgcat GTAAACAAtgcga cttagGTAAACAT tttcgTTAAGTAAaca caaaATAAACAAcgtgc gctaaCTAAACAGagaga gtgttGTAAACATtggaa taatGTAAACAAtgcgg gaaagGTAAACATaagaa cctaaGTAAACACaacgc cctaaGTAAACATt cttatGTAAACAGaggtc Known binding sites
  • 5. 5 Scoring putative TFBS sequences A  [ 1  0 19 20 18  1 20  7 ] C  [ 1  0  1  0  1 18  0  2 ] G  [17  0  0  0  1  0  0  3 ] T  [ 1 20  0  0  0  1  0  8 ] A  [­1.5 ­2.5  1.7  1.8  1.6 ­1.5  1.8  0.4 ] C  [­1.5 ­2.5 ­1.5 ­2.5 ­1.5  1.6 ­2.5 ­1.0 ] G  [ 1.6 ­2.5 ­2.5 ­2.5 ­1.5 ­2.5 ­2.5 ­0.6 ] T  [­1.5  1.8 ­2.5 ­2.5 ­2.5 ­1.5 ­2.5  0.6 ] A C G A G T T A A A C A A G C T A A  [­1.5 ­2.5  1.7  1.8  1.6 ­1.5  1.8  0.4 ] C  [­1.5 ­2.5 ­1.5 ­2.5 ­1.5  1.6 ­2.5 ­1.0 ] G  [ 1.6 ­2.5 ­2.5 ­2.5 ­1.5 ­2.5 ­2.5 ­0.6 ] T  [­1.5  1.8 ­2.5 ­2.5 ­2.5 ­1.5 ­2.5  0.6 ] Score = 9.2 PFM PWM – Position Weight Matrix PWM Sum score at each position (aka PSSM – Position Specific Scoring Matrix)
  • 6. Overview of the JASPAR 2014 database
  • 7. JASPAR Biopython modules. Extend Biopython's Bio.motifs module to support construction of TFBS matrices from JASPAR supported formats.
      ➢ Bio.motifs.jaspar: read / write motifs encoded in the JASPAR flat file formats: sites, pfm and jaspar
      ➢ Bio.motifs.jaspar.db: search / fetch motifs from a JASPAR formatted database.
      http://biopython.org*
      *Cock et al. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics. 2009 Jun 1;25(11):1422-3. PMID: 19304878
  • 8. Constructing a matrix from a JASPAR sites formatted file. The JASPAR sites format consists of a list of known binding sites for a motif.
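A minimal sketch of reading a sites formatted file with Biopython, assuming Biopython is installed. The file content is held in memory so the example is self-contained. The four sites are taken from the FOXD1 list on slide 4, and the header names are placeholders invented for this example.

```python
from io import StringIO

from Bio import motifs

# Sites format: a header line per site, then the site itself. Upper-case
# letters mark the aligned site. Header text here is hypothetical.
sites_text = """\
>FOXD1_example 1
gctaaGTAACAATgcgca
>FOXD1_example 2
cttaaGTAAACATcgctc
>FOXD1_example 3
ccaatGTAAACAAacgga
>FOXD1_example 4
cttatGTCAACAGtgggt
"""

# read() expects exactly one motif in the input.
motif = motifs.read(StringIO(sites_text), "sites")
print(motif.counts)
```

For a file on disk, the same call would take an open file handle in place of the `StringIO` object.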
  • 9. Constructing a matrix from a JASPAR pfm formatted file. The JASPAR pfm format simply describes a frequency matrix for a single motif.
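A comparable sketch for the pfm format, again assuming Biopython is installed and using an in-memory string in place of a file. The four rows are the FOXD1 counts from slide 4, in the order A, C, G, T.

```python
from io import StringIO

from Bio import motifs

# pfm format: four rows of counts, one per base in the order A, C, G, T.
pfm_text = """\
 1  0 19 20 18  1 20  7
 1  0  1  0  1 18  0  2
17  0  0  0  1  0  0  3
 1 20  0  0  0  1  0  8
"""

motif = motifs.read(StringIO(pfm_text), "pfm")
print(motif.counts)
print(motif.consensus)
```

The pfm format carries no identifier or name, so the resulting motif has only its counts.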
  • 10. Constructing matrices from a JASPAR jaspar formatted file. The JASPAR jaspar format allows for multiple motifs. Each record consists of a header line followed by four lines defining the frequency matrix. Note the use of the parse method rather than the read method to read multiple motifs.
  • 11. Constructing matrices from a JASPAR jaspar formatted file, cont'd. The frequency portions of the file can be specified in a simpler format identical to the pfm format.
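A sketch of parsing a multi-motif jaspar formatted file, assuming Biopython is installed. The first record uses the bracketed row style and the second uses the simpler pfm-like rows. The identifiers `EX0001.1` and `EX0002.1`, the names, and the second matrix are invented for illustration and are not JASPAR entries.

```python
from io import StringIO

from Bio import motifs

# Two records: a header line followed by four rows (A, C, G, T).
# IDs and names are hypothetical; the second matrix is a toy example.
jaspar_text = """\
>EX0001.1 FOXD1_example
A  [ 1  0 19 20 18  1 20  7 ]
C  [ 1  0  1  0  1 18  0  2 ]
G  [17  0  0  0  1  0  0  3 ]
T  [ 1 20  0  0  0  1  0  8 ]
>EX0002.1 toy_motif
4 0 0 1
0 4 0 1
0 0 4 1
0 0 0 1
"""

# parse() returns every motif in the input; read() would fail on two records.
records = motifs.parse(StringIO(jaspar_text), "jaspar")
for motif in records:
    print(motif.matrix_id, motif.name, motif.length)
```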
  • 12. The JASPAR DB module. Modelled after the Perl TFBS modules*. Specifically, the Bio.motifs.jaspar.db.JASPAR5 BioPython class is modelled after the TFBS::DB::JASPAR5 Perl class. The slide shows two steps: connecting to a JASPAR database, and fetching a specific motif by its JASPAR ID. * Lenhard et al. TFBS: Computational framework for transcription factor binding site analysis. Bioinformatics. 2002. PMID 12176838
  • 13. JASPAR DB module, cont'd. Fetch multiple motifs according to various attributes. Example: fetch the motifs of all the vertebrate and insect transcription factors from the CORE JASPAR collection which are part of the Forkhead family and which have an information content of at least 12 bits. Note that selection criteria (such as 'tax_group' and 'tf_family') which allow multiple values may be specified either as a single value or as a list of values.
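The database steps described on slides 12 and 13 might look roughly like the sketch below. It assumes Biopython, the MySQLdb driver and access to a JASPAR formatted MySQL database. The wrapper function `fetch_forkhead_motifs` and all connection values are placeholders invented here, and the `tax_group` strings are an assumption about how the groups are named in the database.

```python
def fetch_forkhead_motifs(host, name, user, password, matrix_id):
    """Connect to a JASPAR database and run the two queries from the slides.

    Hypothetical wrapper; connection details must be supplied by the caller.
    """
    # Imported here because Bio.motifs.jaspar.db requires the MySQLdb driver.
    from Bio.motifs.jaspar.db import JASPAR5

    jdb = JASPAR5(host=host, name=name, user=user, password=password)

    # Fetch one motif by its JASPAR ID.
    single = jdb.fetch_motif_by_id(matrix_id)

    # Fetch all CORE vertebrate and insect Forkhead motifs with an
    # information content of at least 12 bits. tax_group is given as a
    # list and tf_family as a single value; both forms are accepted.
    selected = jdb.fetch_motifs(
        collection="CORE",
        tax_group=["vertebrates", "insects"],
        tf_family="Forkhead",
        min_ic=12,
    )
    return single, selected
```

A call would pass real credentials and an ID, for example `fetch_forkhead_motifs("yourhost", "yourdb", "youruser", "yourpass", "MA0098")`, where every argument shown is a placeholder.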
  • 14. For more information... For an overview and examples of using these modules, please see the JASPAR sub-section under the “Reading motifs” section of the BioPython Tutorial and Cookbook: http://biopython.org/DIST/docs/tutorial/Tutorial.html For more technical information see the Bio.motifs.jaspar section of the BioPython API docs: http://biopython.org/DIST/docs/api
  • 15. MANTA: MongoDB for Analysis of TFBS Alteration. Mathelier et al. Cis-regulatory somatic mutations and gene-expression alteration in B-cell lymphomas. Genome Biology. 2015. PMID 25903198
  • 17. MANTA Statistics
      ChIP-seq experiments: 477
      Transcription factors: 103
      TFBSs: 9,510,336
      Unique bases covered: 76,160,599 (~2.25% of the human genome)
  • 18. Variations may impact TF binding (slide footer: AMIA TBI&CRI, March 19th-23rd, 2012)
      TF binding sequence: TF recognizes binding site; transcription initiated.
        AGCTAGCTATATTTAAACAACACTGTCTAGCATTGCCTGATAGATGAGCCGTCGCAGCTGGA
      Mutated binding sequence: TF fails to recognize binding site; transcription fails to initiate.
        AGCTAGCTATATTTAATCCACACTGTCTAGCATTGCCTGATAGATGAGCCGTCGCAGCTGGA
      Figure labels: TF, 5’ UTR, Exon
  • 19. Assessing the impact of variations on TF binding (figure labels: DNA, TFBS)
  • 20. Assessing the impact of variations on TF binding (figure labels: DNA, SNV)
  • 21. Assessing the impact of variations on TF binding (figure labels: DNA, SNV)
  • 22. Assessing the impact of variations on TF binding (figure labels: DNA, SNV)
  • 23. Assessing the impact of variations on TF binding (figure labels: DNA, SNV)
  • 24. Assessing the impact of variations on TF binding (figure labels: DNA, SNV)
  • 25. Assessing the impact of variations on TF binding (figure labels: DNA, SNV). Record best TFBS hit with the mutated sequence.
  • 26. Assessing the impact of variations on TF binding (figure labels: DNA, TFBS; density plot, x-axis: alt/ref, y-axis: Density)
  • 27. Assessing the impact of variations on TF binding (figure labels: DNA, SNV, Alternative; density plot, x-axis: alt/ref, y-axis: Density)
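The procedure outlined on slides 19 to 27 can be illustrated with a small sketch: score the reference sequence, apply an SNV, record the best hit among windows overlapping the mutated position, and compare alt to ref. Several things here are assumptions rather than MANTA's actual method: the PWM is the rounded FOXD1 matrix from slide 5, only the forward strand is scanned, the impact is expressed as a plain ratio of raw scores (which is only meaningful when both are positive), and the SNV applies just the first of the two substitutions shown on slide 18. The function names are hypothetical.

```python
# Rounded FOXD1 PWM values as printed on slide 5.
PWM = {
    "A": [-1.5, -2.5, 1.7, 1.8, 1.6, -1.5, 1.8, 0.4],
    "C": [-1.5, -2.5, -1.5, -2.5, -1.5, 1.6, -2.5, -1.0],
    "G": [1.6, -2.5, -2.5, -2.5, -1.5, -2.5, -2.5, -0.6],
    "T": [-1.5, 1.8, -2.5, -2.5, -2.5, -1.5, -2.5, 0.6],
}
WIDTH = 8


def best_hit(seq, pos):
    """Best (score, start) among forward-strand windows covering index pos."""
    first = max(0, pos - WIDTH + 1)
    last = min(pos, len(seq) - WIDTH)
    return max(
        (sum(PWM[base][i] for i, base in enumerate(seq[s:s + WIDTH])), s)
        for s in range(first, last + 1)
    )


def snv_impact(ref_seq, pos, alt_base):
    """Return the best ref and alt scores for a substitution at index pos."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    ref_score, _ = best_hit(ref_seq, pos)
    alt_score, _ = best_hit(alt_seq, pos)
    return ref_score, alt_score


# Reference sequence from slide 18; A>T at 0-based index 16 (illustrative SNV).
REF = "AGCTAGCTATATTTAAACAACACTGTCTAGCATTGCCTGATAGATGAGCCGTCGCAGCTGGA"
ref_score, alt_score = snv_impact(REF, 16, "T")
print(round(ref_score, 1), round(alt_score, 1), round(alt_score / ref_score, 2))
```

For this SNV the best reference hit is the TTAAACAA window, and the mutated sequence scores clearly lower, giving an alt/ref ratio below 1.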
  • 28. Example of Application of MANTA. Mathelier et al. Cis-regulatory somatic mutations and gene-expression alteration in B-cell lymphomas. Genome Biology. 2015. PMID
  • 29. The MANTA Database. Implemented with MongoDB (http://www.mongodb.org). Consists of 3 collections:
      Experiments: experiment name, type, TF name, JASPAR matrix ID, etc.
      Peaks: peak position (chromosome, start, end), score, position of maximum peak height, etc.
      TFBSs / SNVs: position (chromosome, start, end), strand and score for the unmutated TFBS, plus similar information and an impact score for each position / alt. allele mutation.
  • 30. MANTA DB with Python. Example: connect to MANTA DB and fetch all TFBS affected by an SNV at position 6425005 on chromosome 19.
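The query described on slide 30 could be sketched with pymongo as below. The transcript does not include the actual schema, so everything specific is an assumption: the field names `chrom`, `start` and `end`, the chromosome naming (`chr19` rather than `19`), and the connection URI, database and collection names passed to `fetch_tfbs_at` are all hypothetical. The MANTA source code is the place to check the real names.

```python
def build_snv_query(chrom, position):
    """Filter for TFBS records whose [start, end] interval contains position.

    The field names "chrom", "start" and "end" are hypothetical.
    """
    return {
        "chrom": chrom,
        "start": {"$lte": position},
        "end": {"$gte": position},
    }


def fetch_tfbs_at(uri, db_name, collection_name, chrom, position):
    """Run the query with pymongo; all connection details are placeholders."""
    from pymongo import MongoClient  # imported here; requires pymongo

    client = MongoClient(uri)
    try:
        collection = client[db_name][collection_name]
        return list(collection.find(build_snv_query(chrom, position)))
    finally:
        client.close()


# Build the filter for the slide's example position without connecting.
query = build_snv_query("chr19", 6425005)
print(query)
```

Keeping the filter in its own function makes it possible to inspect the query without a running database.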
  • 31. MANTA Web Interface. URL: http://manta.cmmt.ubc.ca/manta Source code: https://github.com/wassermanlab/MANTA
  • 34. Thanks! Any questions?
      Contacts: Anthony Mathelier, anthony.mathelier@gmail.com; David Arenillas, dave@cmmt.ubc.ca
      URLs: Wasserman Lab: www.cisreg.ca; BioPython: http://biopython.org; MANTA: manta.cmmt.ubc.ca/manta