www.cmmt.ubc.ca
JASPAR BioPython & MANTA
Anthony Mathelier, David Arenillas & Wyeth Wasserman
anthony.mathelier@gmail.com & dave@cmmt.ubc.ca
Wasserman Lab
2
Outline
● JASPAR BioPython module
– What is JASPAR?
– How to construct matrices from JASPAR files using
the JASPAR BioPython module.
● MANTA
– What is stored in MANTA?
– How to interrogate the MANTA DB using Python and
our web application.
3
http://jaspar.genereg.net
Mathelier et al. JASPAR 2014: an extensively expanded and updated open-access database of
transcription factor binding profiles. Nucleic Acids Res. 2014 PMID 24194598
4
Modelling Transcription Factor Binding Sites (TFBS)
Example: FOXD1
Known binding sites (uppercase: aligned site, lowercase: flanking sequence)
gctaaGTAACAATgcgca
cttaaGTAAACATcgctc
ccaatGTAAACAAacgga
gaaagGTAAACAAtgggc
GTAAACATgtact
cttgtGTAAACAAaaagc
cttaaGTAAACACgtccg
cttatGTCAACAGtgggt
tGTAAACATtgcat
GTAAACAAtgcga
cttagGTAAACAT
tttcgTTAAGTAAaca
caaaATAAACAAcgtgc
gctaaCTAAACAGagaga
gtgttGTAAACATtggaa
taatGTAAACAAtgcgg
gaaagGTAAACATaagaa
cctaaGTAAACACaacgc
cctaaGTAAACATt
cttatGTAAACAGaggtc
PFM: Position Frequency Matrix
A [ 1  0 19 20 18  1 20  7 ]
C [ 1  0  1  0  1 18  0  2 ]
G [17  0  0  0  1  0  0  3 ]
T [ 1 20  0  0  0  1  0  8 ]
[Figure: sequence logo of the motif]
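How the PFM on this slide follows from the known binding sites can be sketched in a few lines of plain Python. This is only an illustration, not the JASPAR pipeline: the helper names (core, build_pfm) are made up here, and the sketch assumes that the uppercase letters of each site are the aligned positions.

```python
from collections import Counter

# The 20 known FOXD1 binding sites from the slide.
# Uppercase = aligned site, lowercase = flanking sequence.
SITES = """
gctaaGTAACAATgcgca cttaaGTAAACATcgctc ccaatGTAAACAAacgga gaaagGTAAACAAtgggc
GTAAACATgtact cttgtGTAAACAAaaagc cttaaGTAAACACgtccg cttatGTCAACAGtgggt
tGTAAACATtgcat GTAAACAAtgcga cttagGTAAACAT tttcgTTAAGTAAaca
caaaATAAACAAcgtgc gctaaCTAAACAGagaga gtgttGTAAACATtggaa taatGTAAACAAtgcgg
gaaagGTAAACATaagaa cctaaGTAAACACaacgc cctaaGTAAACATt cttatGTAAACAGaggtc
""".split()


def core(site):
    """Keep only the uppercase (aligned) part of a site (hypothetical helper)."""
    return "".join(ch for ch in site if ch.isupper())


def build_pfm(sites, alphabet="ACGT"):
    """Count each base at each aligned position (hypothetical helper)."""
    cores = [core(site) for site in sites]
    width = len(cores[0])
    if any(len(c) != width for c in cores):
        raise ValueError("aligned sites must all have the same width")
    columns = [Counter(c[i] for c in cores) for i in range(width)]
    # A Counter returns 0 for bases never seen in a column.
    return {base: [col[base] for col in columns] for base in alphabet}


pfm = build_pfm(SITES)
for base, row in pfm.items():
    print(base, row)
```

Each column sums to 20, the number of sites, and the rows agree with the PFM shown on the slide.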
5
Scoring putative TFBS sequences
PFM
A  [ 1  0 19 20 18  1 20  7 ]
C  [ 1  0  1  0  1 18  0  2 ]
G  [17  0  0  0  1  0  0  3 ]
T  [ 1 20  0  0  0  1  0  8 ]
PWM: Position Weight Matrix (aka PSSM: Position Specific Scoring Matrix)
A  [-1.5 -2.5  1.7  1.8  1.6 -1.5  1.8  0.4 ]
C  [-1.5 -2.5 -1.5 -2.5 -1.5  1.6 -2.5 -1.0 ]
G  [ 1.6 -2.5 -2.5 -2.5 -1.5 -2.5 -2.5 -0.6 ]
T  [-1.5  1.8 -2.5 -2.5 -2.5 -1.5 -2.5  0.6 ]
Sequence to scan: A C G A G T T A A A C A A G C T A
Sum the PWM scores at each position of the window: Score = 9.2
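The slide does not state how the PWM is derived from the PFM. A minimal sketch, assuming a log2-odds score against a uniform background (0.25 per base) and a total pseudocount of sqrt(N) per column: these are assumptions, chosen because they reproduce the slide's PWM values to one decimal. The function names are illustrative, and only the forward strand is scanned.

```python
import math

# PFM for FOXD1 as shown on the slide.
PFM = {
    "A": [1, 0, 19, 20, 18, 1, 20, 7],
    "C": [1, 0, 1, 0, 1, 18, 0, 2],
    "G": [17, 0, 0, 0, 1, 0, 0, 3],
    "T": [1, 20, 0, 0, 0, 1, 0, 8],
}


def pfm_to_pwm(pfm, background=0.25):
    """Log2-odds PWM. ASSUMPTION: pseudocount sqrt(N), uniform background."""
    width = len(pfm["A"])
    pwm = {base: [] for base in pfm}
    for i in range(width):
        n = sum(pfm[base][i] for base in pfm)  # sites in this column
        pseudo = math.sqrt(n)                  # total pseudocount (assumed)
        for base in pfm:
            freq = (pfm[base][i] + pseudo * background) / (n + pseudo)
            pwm[base].append(math.log2(freq / background))
    return pwm


def scan(pwm, sequence):
    """Yield (start, window, score) for every forward-strand window."""
    width = len(pwm["A"])
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Sum the PWM score of the observed base at each position.
        score = sum(pwm[base][i] for i, base in enumerate(window))
        yield start, window, score


pwm = pfm_to_pwm(PFM)
best = max(scan(pwm, "ACGAGTTAAACAAGCTA"), key=lambda hit: hit[2])
print(best[1], round(best[2], 1))  # TTAAACAA 9.2
```

Under these assumptions the best-scoring window of the slide's sequence is TTAAACAA, with the score of 9.2 given on the slide.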
6
Overview of the JASPAR 2014 database
7
JASPAR Biopython modules
Extend Biopython's Bio.motifs module to support construction of TFBS matrices from JASPAR-supported formats.
➢ Bio.motifs.jaspar
– Read / write motifs encoded in the JASPAR flat file formats: sites, pfm and jaspar
➢ Bio.motifs.jaspar.db
– Search / fetch motifs from a JASPAR-formatted database.
http://biopython.org*
*Cock et al. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics. 2009 Jun 1;25(11):1422-3. PMID: 19304878
8
Constructing a matrix from a JASPAR sites
formatted file
The JASPAR sites format consists of a list of known binding sites for a motif.
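A minimal sketch of reading a sites file with Bio.motifs.read, assuming Biopython is installed. The file content is inlined through StringIO so the example is self-contained; it reuses four of the FOXD1 sites from the earlier slide, and the header lines are made up for illustration.

```python
from io import StringIO

from Bio import motifs

# Four FOXD1 sites in a FASTA-like sites layout.
# The ">example_site_N" headers are hypothetical placeholders.
sites_text = """\
>example_site_1
gctaaGTAACAATgcgca
>example_site_2
cttaaGTAAACATcgctc
>example_site_3
ccaatGTAAACAAacgga
>example_site_4
gaaagGTAAACAAtgggc
"""

# "sites" selects the JASPAR sites parser; only the uppercase part
# of each sequence is used to build the motif.
motif = motifs.read(StringIO(sites_text), "sites")
print(motif.counts)
```

With a real file, the StringIO object would be replaced by an open file handle.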
9
Constructing a matrix from a JASPAR pfm
formatted file
The JASPAR pfm format simply describes a frequency matrix for a single motif.
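A minimal sketch of reading a pfm file, assuming Biopython is installed. The four rows are the A, C, G and T counts of the FOXD1 matrix from the earlier slide, inlined through StringIO instead of a file on disk.

```python
from io import StringIO

from Bio import motifs

# One row per base, in the order A, C, G, T.
pfm_text = """\
 1  0 19 20 18  1 20  7
 1  0  1  0  1 18  0  2
17  0  0  0  1  0  0  3
 1 20  0  0  0  1  0  8
"""

# "pfm" selects the JASPAR pfm parser; read() expects exactly one motif.
motif = motifs.read(StringIO(pfm_text), "pfm")
print(motif.counts)
print(motif.consensus)
```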
10
Constructing matrices from a JASPAR jaspar
formatted file
The JASPAR jaspar format allows for multiple motifs. Each record consists of a header line followed by four lines defining the frequency matrix.
Note the use of the parse method rather than the read method to read multiple motifs.
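A minimal sketch of parsing a multi-motif jaspar file, assuming Biopython is installed. The identifiers and names in the header lines (EX0001.1, EX0002.1) are hypothetical, not real JASPAR accessions; the first matrix is the FOXD1 PFM from the earlier slide and the second is a toy matrix.

```python
from io import StringIO

from Bio import motifs

# Two records: a header line, then one bracketed row per base.
jaspar_text = """\
>EX0001.1 FOXD1_from_slide
A [ 1  0 19 20 18  1 20  7 ]
C [ 1  0  1  0  1 18  0  2 ]
G [17  0  0  0  1  0  0  3 ]
T [ 1 20  0  0  0  1  0  8 ]
>EX0002.1 toy_motif
A [ 4 19  0  0  0  0 ]
C [16  0 20  0  0  0 ]
G [ 0  1  0 20  0 20 ]
T [ 0  0  0  0 20  0 ]
"""

# parse() returns every motif in the file; read() would fail here
# because the file holds more than one motif.
records = motifs.parse(StringIO(jaspar_text), "jaspar")
for motif in records:
    print(motif.matrix_id, motif.name, motif.length)
```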
11
Constructing matrices from a JASPAR jaspar
formatted file cont'd
The frequency portions of the file can be specified in a simpler format identical to the pfm
format.
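The same parse call should also accept the simpler layout, where the four rows after each header are plain numbers in A, C, G, T order. A short sketch, assuming Biopython is installed; the header identifier is again hypothetical.

```python
from io import StringIO

from Bio import motifs

# Header line followed by four unlabelled rows (A, C, G, T), as in pfm.
simple_text = """\
>EX0001.1 FOXD1_from_slide
 1  0 19 20 18  1 20  7
 1  0  1  0  1 18  0  2
17  0  0  0  1  0  0  3
 1 20  0  0  0  1  0  8
"""

simple_records = motifs.parse(StringIO(simple_text), "jaspar")
for motif in simple_records:
    print(motif.matrix_id, motif.name)
    print(motif.counts)
```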
12
The JASPAR DB module
Modelled after the Perl TFBS modules*. Specifically, the Bio.motifs.jaspar.db.JASPAR5 BioPython class is modelled after the TFBS::DB::JASPAR5 Perl class.
Connect to a JASPAR database, then fetch a specific motif by its JASPAR ID.
* Lenhard et al. TFBS: Computational framework for transcription factor binding site analysis. Bioinformatics. 2002 PMID 12176838
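A minimal sketch of connecting and fetching one motif with the JASPAR5 class. The wrapper function and all connection values are placeholders introduced here; a reachable JASPAR-formatted MySQL database and the MySQLdb driver are assumed.

```python
def fetch_jaspar_motif(matrix_id, host, name, user, password):
    """Connect to a JASPAR-formatted database and fetch one motif by ID.

    Hypothetical wrapper: only JASPAR5 and fetch_motif_by_id come
    from Biopython; everything else is illustration.
    """
    # Imported lazily because Bio.motifs.jaspar.db needs the MySQLdb driver.
    from Bio.motifs.jaspar.db import JASPAR5

    jdb = JASPAR5(host=host, name=name, user=user, password=password)
    return jdb.fetch_motif_by_id(matrix_id)


# Example call (placeholder credentials and ID, not run here):
# motif = fetch_jaspar_motif("MA0004", host="localhost", name="jaspar_db",
#                            user="reader", password="secret")
# print(motif)
```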
13
JASPAR DB module cont'd
Fetch multiple motifs according to various attributes.
Example: fetch the motifs of all the vertebrate and insect transcription factors from the CORE
JASPAR collection which are part of the Forkhead family and which have an information
content of at least 12 bits:
Note that selection criteria (such as 'tax_group' and 'tf_family') which allow multiple values may be specified either as a single value or as a list of values.
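The Forkhead example above could be written roughly as follows. This is a sketch, not the code from the original slide: the wrapper function is hypothetical, and jdb is assumed to be an already connected Bio.motifs.jaspar.db.JASPAR5 object.

```python
def fetch_forkhead_motifs(jdb, min_ic=12):
    """Fetch CORE vertebrate and insect Forkhead profiles with IC >= min_ic bits.

    `jdb` is assumed to be a connected Bio.motifs.jaspar.db.JASPAR5 instance.
    """
    return jdb.fetch_motifs(
        collection="CORE",
        tax_group=["vertebrates", "insects"],  # criterion given as a list
        tf_family="Forkhead",                  # criterion given as a single value
        min_ic=min_ic,
    )


# Example call (requires a live database connection, not run here):
# for motif in fetch_forkhead_motifs(jdb):
#     print(motif.matrix_id, motif.name)
```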
14
For more information...
For an overview and examples of using these modules, please
see the JASPAR sub-section under the “Reading motifs”
section of the BioPython Tutorial and Cookbook:
http://biopython.org/DIST/docs/tutorial/Tutorial.html
For more technical information see the Bio.motifs.jaspar
section of the BioPython API docs:
http://biopython.org/DIST/docs/api
15
MANTA
MongoDB for Analysis of TFBS Alteration
Mathelier et al. Cis-regulatory somatic mutations and gene-expression alteration in B-cell lymphomas.
Genome Biology. 2015. PMID 25903198
16
MANTA
DB
...gctaaGTAACAATgcgca...
...cttaaGTAAACATcgctc...
...ccaatGTAAACAAacgga...
Adapted from Szalkowski and Schmid (2010). Briefings in Bioinformatics.
17
MANTA Statistics
ChIP-seq experiments: 477
Transcription factors: 103
TFBSs: 9,510,336
Unique bases covered: 76,160,599 (~2.25% of the human genome)
18
AMIA TBI&CRI, March 19th-23rd, 2012
Variations may impact TF binding
Binding sequence: TF recognizes binding site. Transcription initiated.
AGCTAGCTATATTTAAACAACACTGTCTAGCATTGCCTGATAGATGAGCCGTCGCAGCTGGA
Mutated binding sequence: TF fails to recognize binding site. Transcription fails to initiate.
AGCTAGCTATATTTAATCCACACTGTCTAGCATTGCCTGATAGATGAGCCGTCGCAGCTGGA
[Figure labels: TF, 5' UTR, Exon]
19
Assessing the impact of variations on TF binding
[Figure sequence, slides 19 to 27; labels: DNA, TFBS, SNV]
Record best TFBS hit with the mutated sequence
[Plot, slides 26 and 27: x-axis alt/ref (0.80 to 1.10), y-axis Density (0 to 7); slide 27 adds the label "Alternative"]
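One plausible reading of these slides, sketched in plain Python: score every window that overlaps the SNV, keep the best hit for the reference and for the mutated sequence, and compare them as alt/ref. Everything here is an assumption made for illustration (sqrt(N) pseudocount, uniform background, forward strand only, scores rescaled to the range 0 to 1, a toy sequence and SNV); MANTA's actual procedure may differ.

```python
import math

# FOXD1 PFM from the earlier slide.
PFM = {
    "A": [1, 0, 19, 20, 18, 1, 20, 7],
    "C": [1, 0, 1, 0, 1, 18, 0, 2],
    "G": [17, 0, 0, 0, 1, 0, 0, 3],
    "T": [1, 20, 0, 0, 0, 1, 0, 8],
}


def pfm_to_pwm(pfm, background=0.25):
    """Log2-odds PWM. ASSUMPTION: pseudocount sqrt(N), uniform background."""
    pwm = {base: [] for base in pfm}
    for i in range(len(pfm["A"])):
        n = sum(pfm[base][i] for base in pfm)
        pseudo = math.sqrt(n)
        for base in pfm:
            freq = (pfm[base][i] + pseudo * background) / (n + pseudo)
            pwm[base].append(math.log2(freq / background))
    return pwm


def relative_score(pwm, window):
    """Rescale the raw score to [0, 1] between the PWM's min and max (assumed)."""
    width = len(window)
    lowest = sum(min(pwm[b][i] for b in pwm) for i in range(width))
    highest = sum(max(pwm[b][i] for b in pwm) for i in range(width))
    raw = sum(pwm[base][i] for i, base in enumerate(window))
    return (raw - lowest) / (highest - lowest)


def best_hit(pwm, sequence, position):
    """Best relative score among forward-strand windows covering `position`."""
    width = len(pwm["A"])
    first = max(0, position - width + 1)
    last = min(position, len(sequence) - width)
    return max(
        relative_score(pwm, sequence[s:s + width]) for s in range(first, last + 1)
    )


def snv_impact(pwm, reference, position, alt_base):
    """Return (best ref score, best alt score, alt/ref) for one SNV."""
    mutated = reference[:position] + alt_base + reference[position + 1:]
    ref_best = best_hit(pwm, reference, position)
    alt_best = best_hit(pwm, mutated, position)
    return ref_best, alt_best, alt_best / ref_best


PWM = pfm_to_pwm(PFM)
# Toy SNV: A -> G at 0-based position 9 of the sequence scored on slide 5.
ref_best, alt_best, ratio = snv_impact(PWM, "ACGAGTTAAACAAGCTA", 9, "G")
print(f"ref={ref_best:.2f} alt={alt_best:.2f} alt/ref={ratio:.2f}")
```

A ratio below 1 means the best hit with the mutated sequence scores lower than the best hit with the reference sequence.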
28
Example of Application of MANTA
Mathelier et al. Cis-regulatory somatic mutations and gene-expression alteration in B-cell lymphomas.
Genome Biology. 2015. PMID
29
The MANTA Database
Implemented with MongoDB (http://www.mongodb.org)
Consists of 3 collections:
Experiments
- experiment name, type, TF name, JASPAR matrix ID, etc.
Peaks
- peak position (chromosome, start, end), score, position of maximum peak height, etc.
TFBSs / SNVs
- position (chromosome, start, end), strand, score for the unmutated TFBS plus similar information and impact score for each position / alt. allele mutation.
30
MANTA DB with Python
Example: connect to MANTA DB and fetch all TFBS affected by an SNV at position 6425005 on chromosome 19.
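A rough sketch of such a query with pymongo. The MANTA schema is not given in these slides, so the field names ("chrom", "snvs.pos"), the function names, and the connection values are all hypothetical; only MongoClient, find and close are real pymongo calls.

```python
def snv_query(chrom, position):
    """Build a MongoDB filter for TFBSs hit by an SNV at one position.

    ASSUMPTION: "chrom" and "snvs.pos" are hypothetical field names;
    check the real MANTA schema before using this.
    """
    return {"chrom": chrom, "snvs.pos": position}


def fetch_affected_tfbs(uri, db_name, collection_name, chrom, position):
    """Return all matching TFBS documents (requires a running MongoDB)."""
    # Imported lazily so the query helper works without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(uri)
    try:
        collection = client[db_name][collection_name]
        return list(collection.find(snv_query(chrom, position)))
    finally:
        client.close()


# Example call (placeholder URI, database and collection names, not run here):
# hits = fetch_affected_tfbs("mongodb://localhost:27017", "manta", "tfbs_snvs",
#                            "19", 6425005)
print(snv_query("19", 6425005))
```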
31
MANTA Web Interface
URL: http://manta.cmmt.ubc.ca/manta
Source code: https://github.com/wassermanlab/MANTA
32
33
34
Thanks!
Any questions?
Contacts:
Anthony Mathelier, anthony.mathelier@gmail.com
David Arenillas, dave@cmmt.ubc.ca
URLs:
Wasserman Lab: www.cisreg.ca
BioPython: http://biopython.org
MANTA: manta.cmmt.ubc.ca/manta

Webinar about JASPAR BioPython module and MANTA.
