SlideShare una empresa de Scribd logo
1 de 12
Descargar para leer sin conexión
miRNA Discovery & Prediction Algorithms
Sergei Lebedev
October 13, 2012
What is miRNA?
• microRNA or miRNA, ≈ 22 nucleotide-long non-coding RNA;
• mostly expressed in a tissue-specific manner and play crucial
roles in cell proliferation, apoptosis and differentiation during
cell development;
• thought to be involved in post-transcriptional control in plants
and animals;
• linked to disease1, for example hsa-miR-126 is associated with
retinoblastoma, breast cancer, lung cancer, kidney cancer,
asthma etc.
1
See http://www.mir2disease.org for details.
1 / 11
miRNA in action: nucleus [1]
• pri-miRNA is transcribed by RNA polymerase II and seem to
possess promoter and enchancer regions, similar to protein
coding genes;
• pri-miRNA is then cleaved into (possibly multiple)
pre-miRNA by an enzyme complex Drosha.
2 / 11
miRNA in action: cytoplasm [1]
• Dicer removes the stem-loop, leaving two complementary
sequences: miRNA and miRNA*, the latter is not known to
have any regulatory function.
• Mature miRNA base-pairs with 3’ UTR of target mRNAs and
blocks protein syntesis or causes mRNA degradation.
3 / 11
miRNA identification
• Biological methods: northern blots, qRT-PCR2, micro arrays,
RNA-seq or miRNA-seq.
• Bioinformatics to the rescue! the usual strategy: first
sequence everything, RNA-seq in this case, then try to make
sense of whatever the result is.
• In this talk: miRDeep [2], MiRAlign [3], MiRank [4].
• A lot of existing tools out of scope, most can be described
with a one liner: “We’ve developed a novel method for
miRNA identification, based on machine learning approach,
SVM, HMM!”.
2
RT for reverse transcription, not real-time.
4 / 11
mirDeep
5 / 11
MiRAlign
6 / 11
miRank: overview
• Treat miRNA identification problem as a problem of
information retrieval, where novel miRNAs are to be retrieved
from a set of candidates by the known query samples – “true”
miRNAs.
• More formally, given a set of known pre-miRNAs XQ as query
samples and a set of putative candidates XU as unknown
samples, rank XU with respect to XQ.
• To do so, compute the relevancy values fi ∈ [0, 1] for all
unknown samples, assuming fi = 1 for query samples.
• After that, simply select n ranked samples, which constitute
to predicted pre-miRNA.
• Makes sense, right?
7 / 11
miRank: how does it work?
• miRank models belief propagation process by doing Markov
random walks on a graph, where each vertex corresponds to
either known pre-miRNA or a putative candidate and two
vertices are connected by an edge if the two vertices are
“close to each other”.
• Each edge on the graph is assigned a weight wij , proportional
to the Euclidean distance between the samples vi and vj (see
next slide on how samples are represented).
• When a random walker transits from vi to vj it transmits the
relevancy information of vi to vj by the following update rule:
f
(k+1)
i = α
xj ∈XU
pij f
(k)
j +
xj ∈XQ
pij fj pij =
wij
deg(vij )
8 / 11
miRank: features
Global
• normalized minimum free energy of folding (MFE);
• normalized no. of paired nucleotides on both arms;
• normalized loop length.
Local – RNAFold
GUAGCACUAAAGUGCUUAUAGUGCAGGUAGUGUUUAGUUAUCUACUGCAUUAUGAGCACUUAAAGUACUGC
((((.(((.(((((((((((((((((.(((((......)).))))))))))))))))))))..))).))))
• Each nucleotide is either paired, denoted by a bracket (– 5’
arm, )– 3’ arm, or unpaired – .;
• Each local feature is a “word” of length 3, further
distinguished by the nucleotide in the middle position,
examples: ((., .((.
9 / 11
miRank: good parts, bad parts & magic
• The method doesn’t require any genomic annotations, except
for the set of query samples.
• ≈ 75% precision and ≈ 70% recall even with very few query
samples (1, 5) – hard to validate, because the source code
was never released.
• The notion of similarity between query samples, which defines
the graph structure is unclear, even though it looks critical for
algorithm performance.
• Two user-specified parameters, n – number of predicted
samples and α – the weight of unknown samples in the
relevancy value. How do they affect precision-recall and how
to choose them?
• Overall, it seems like miRank isn’t used much by biologists3.
3
http://www.ncbi.nlm.nih.gov/pubmed?linkname=pubmed_pubmed_
citedin&from_uid=18586744
10 / 11
References
K. Chen and N. Rajewsky.
The evolution of gene regulation by transcription factors and microRNAs.
Nat. Rev. Genet., 8(2):93–103, Feb 2007.
M. R. Friedlander, W. Chen, C. Adamidi, J. Maaskola, R. Einspanier,
S. Knespel, and N. Rajewsky.
Discovering microRNAs from deep sequencing data using miRDeep.
Nat. Biotechnol., 26(4):407–415, Apr 2008.
X. Wang, J. Zhang, F. Li, J. Gu, T. He, X. Zhang, and Y. Li.
MicroRNA identification based on sequence and structure alignment.
Bioinformatics, 21(18):3610–3614, Sep 2005.
Y. Xu, X. Zhou, and W. Zhang.
MicroRNA prediction with a novel ranking algorithm based on random
walks.
Bioinformatics, 24(13):i50–58, Jul 2008.
11 / 11

Más contenido relacionado

La actualidad más candente

De novo str_prediction
De novo str_predictionDe novo str_prediction
De novo str_predictionShwetA Kumari
 
Network-based machine learning approach for aggregating multi-modal data
Network-based machine learning approach for aggregating multi-modal dataNetwork-based machine learning approach for aggregating multi-modal data
Network-based machine learning approach for aggregating multi-modal dataSOYEON KIM
 
Particle Swarm Optimization for Gene cluster Identification
Particle Swarm Optimization for Gene cluster IdentificationParticle Swarm Optimization for Gene cluster Identification
Particle Swarm Optimization for Gene cluster IdentificationEditor IJCATR
 
Curveball Algorithm for Random Sampling of Protein Networks
Curveball Algorithm for Random Sampling of Protein NetworksCurveball Algorithm for Random Sampling of Protein Networks
Curveball Algorithm for Random Sampling of Protein NetworksAkua Biaa Adu
 
Modularity of protein folds as a tool for template free modeling of structures
Modularity of protein folds as a tool for template free modeling of structuresModularity of protein folds as a tool for template free modeling of structures
Modularity of protein folds as a tool for template free modeling of structuresPranavathiyani G
 
Protein Remote Homology Detection
Protein Remote Homology DetectionProtein Remote Homology Detection
Protein Remote Homology DetectionAlia Hamwi
 
Template free modeling of protein structures
Template free modeling of protein structuresTemplate free modeling of protein structures
Template free modeling of protein structuresPranavathiyani G
 
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...journal ijrtem
 
Gene regulatory networks
Gene regulatory networksGene regulatory networks
Gene regulatory networksMadiheh
 
Bio process
Bio processBio process
Bio processsun777
 

La actualidad más candente (19)

De novo str_prediction
De novo str_predictionDe novo str_prediction
De novo str_prediction
 
Network-based machine learning approach for aggregating multi-modal data
Network-based machine learning approach for aggregating multi-modal dataNetwork-based machine learning approach for aggregating multi-modal data
Network-based machine learning approach for aggregating multi-modal data
 
Genomics
Genomics Genomics
Genomics
 
Particle Swarm Optimization for Gene cluster Identification
Particle Swarm Optimization for Gene cluster IdentificationParticle Swarm Optimization for Gene cluster Identification
Particle Swarm Optimization for Gene cluster Identification
 
Rish seq ana
Rish seq anaRish seq ana
Rish seq ana
 
Gene expression profiling ii
Gene expression profiling  iiGene expression profiling  ii
Gene expression profiling ii
 
Curveball Algorithm for Random Sampling of Protein Networks
Curveball Algorithm for Random Sampling of Protein NetworksCurveball Algorithm for Random Sampling of Protein Networks
Curveball Algorithm for Random Sampling of Protein Networks
 
Critique
CritiqueCritique
Critique
 
Modularity of protein folds as a tool for template free modeling of structures
Modularity of protein folds as a tool for template free modeling of structuresModularity of protein folds as a tool for template free modeling of structures
Modularity of protein folds as a tool for template free modeling of structures
 
Protein Remote Homology Detection
Protein Remote Homology DetectionProtein Remote Homology Detection
Protein Remote Homology Detection
 
Intergenic segments
Intergenic segmentsIntergenic segments
Intergenic segments
 
Template free modeling of protein structures
Template free modeling of protein structuresTemplate free modeling of protein structures
Template free modeling of protein structures
 
222397 lecture 16 17
222397 lecture 16 17222397 lecture 16 17
222397 lecture 16 17
 
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...
 
Homology modeling: Modeller
Homology modeling: ModellerHomology modeling: Modeller
Homology modeling: Modeller
 
Virus Sequence Alignment and Phylogenetic Analysis 2019
Virus Sequence Alignment and Phylogenetic Analysis 2019Virus Sequence Alignment and Phylogenetic Analysis 2019
Virus Sequence Alignment and Phylogenetic Analysis 2019
 
Gene regulatory networks
Gene regulatory networksGene regulatory networks
Gene regulatory networks
 
Molecular phylogenetics
Molecular phylogeneticsMolecular phylogenetics
Molecular phylogenetics
 
Bio process
Bio processBio process
Bio process
 

Destacado

Micro RNAs and ageing Dr Felekkis 2014 (1)
Micro RNAs and ageing Dr Felekkis 2014 (1)Micro RNAs and ageing Dr Felekkis 2014 (1)
Micro RNAs and ageing Dr Felekkis 2014 (1)Marios Kyriazis
 
urinary proteomics
urinary proteomicsurinary proteomics
urinary proteomicsRavi Kumar
 
miRNA Activity in Arabidopsis thaliana
miRNA Activity in Arabidopsis thalianamiRNA Activity in Arabidopsis thaliana
miRNA Activity in Arabidopsis thalianatsandrew
 
Mi rna data analysis 2013
Mi rna data analysis 2013Mi rna data analysis 2013
Mi rna data analysis 2013Elsa von Licy
 
Screening of candidate miRNA and target binding sites specific for HIV latency
Screening of candidate miRNA and target binding sites specific for HIV latencyScreening of candidate miRNA and target binding sites specific for HIV latency
Screening of candidate miRNA and target binding sites specific for HIV latencymaysoethu
 
Comparative Genomics and de Bruijn graphs
Comparative Genomics and de Bruijn graphsComparative Genomics and de Bruijn graphs
Comparative Genomics and de Bruijn graphsBioinformaticsInstitute
 
Advanced miRNA Expression Analysis: miRNA and its Role in Human Disease Webin...
Advanced miRNA Expression Analysis: miRNA and its Role in Human Disease Webin...Advanced miRNA Expression Analysis: miRNA and its Role in Human Disease Webin...
Advanced miRNA Expression Analysis: miRNA and its Role in Human Disease Webin...QIAGEN
 
Functional Analysis of miRNA: miRNA and its Role in Human Disease Webinar Ser...
Functional Analysis of miRNA: miRNA and its Role in Human Disease Webinar Ser...Functional Analysis of miRNA: miRNA and its Role in Human Disease Webinar Ser...
Functional Analysis of miRNA: miRNA and its Role in Human Disease Webinar Ser...QIAGEN
 
MiRNA presentation
MiRNA presentationMiRNA presentation
MiRNA presentationjthouse1928
 
RNAi, miRNA & siRNA
RNAi, miRNA & siRNARNAi, miRNA & siRNA
RNAi, miRNA & siRNAsbryant89
 

Destacado (17)

Micro RNAs and ageing Dr Felekkis 2014 (1)
Micro RNAs and ageing Dr Felekkis 2014 (1)Micro RNAs and ageing Dr Felekkis 2014 (1)
Micro RNAs and ageing Dr Felekkis 2014 (1)
 
urinary proteomics
urinary proteomicsurinary proteomics
urinary proteomics
 
Mi rna array 2013
Mi rna array 2013Mi rna array 2013
Mi rna array 2013
 
miRNA Activity in Arabidopsis thaliana
miRNA Activity in Arabidopsis thalianamiRNA Activity in Arabidopsis thaliana
miRNA Activity in Arabidopsis thaliana
 
Mi rna data analysis 2013
Mi rna data analysis 2013Mi rna data analysis 2013
Mi rna data analysis 2013
 
miRNA Revolution
miRNA RevolutionmiRNA Revolution
miRNA Revolution
 
Metodologia mirna
Metodologia mirnaMetodologia mirna
Metodologia mirna
 
Screening of candidate miRNA and target binding sites specific for HIV latency
Screening of candidate miRNA and target binding sites specific for HIV latencyScreening of candidate miRNA and target binding sites specific for HIV latency
Screening of candidate miRNA and target binding sites specific for HIV latency
 
Comparative Genomics and de Bruijn graphs
Comparative Genomics and de Bruijn graphsComparative Genomics and de Bruijn graphs
Comparative Genomics and de Bruijn graphs
 
miRNA and Diabetes
miRNA and DiabetesmiRNA and Diabetes
miRNA and Diabetes
 
Mirna and its applications
Mirna and its applicationsMirna and its applications
Mirna and its applications
 
Advanced miRNA Expression Analysis: miRNA and its Role in Human Disease Webin...
Advanced miRNA Expression Analysis: miRNA and its Role in Human Disease Webin...Advanced miRNA Expression Analysis: miRNA and its Role in Human Disease Webin...
Advanced miRNA Expression Analysis: miRNA and its Role in Human Disease Webin...
 
Functional Analysis of miRNA: miRNA and its Role in Human Disease Webinar Ser...
Functional Analysis of miRNA: miRNA and its Role in Human Disease Webinar Ser...Functional Analysis of miRNA: miRNA and its Role in Human Disease Webinar Ser...
Functional Analysis of miRNA: miRNA and its Role in Human Disease Webinar Ser...
 
MiRNA presentation
MiRNA presentationMiRNA presentation
MiRNA presentation
 
RNAi, miRNA & siRNA
RNAi, miRNA & siRNARNAi, miRNA & siRNA
RNAi, miRNA & siRNA
 
miRNA & siRNA
miRNA & siRNAmiRNA & siRNA
miRNA & siRNA
 
Micro rna
Micro rnaMicro rna
Micro rna
 

Similar a Slides

Wp mi script_preamp_0613_lr
Wp mi script_preamp_0613_lrWp mi script_preamp_0613_lr
Wp mi script_preamp_0613_lrElsa von Licy
 
Toxicogenomics: microarray
Toxicogenomics: microarrayToxicogenomics: microarray
Toxicogenomics: microarrayEden D'souza
 
Genomics(functional genomics)
Genomics(functional genomics)Genomics(functional genomics)
Genomics(functional genomics)IndrajaDoradla
 
Rt2 micrornapcr arrayswhitepaper
Rt2 micrornapcr arrayswhitepaperRt2 micrornapcr arrayswhitepaper
Rt2 micrornapcr arrayswhitepaperElsa von Licy
 
miRNA Breast Cancer Prognosis -- Ingenuity Systems
miRNA Breast Cancer Prognosis -- Ingenuity SystemsmiRNA Breast Cancer Prognosis -- Ingenuity Systems
miRNA Breast Cancer Prognosis -- Ingenuity SystemsNatalie Ng
 
High throughput Data Analysis
High throughput Data AnalysisHigh throughput Data Analysis
High throughput Data AnalysisSetia Pramana
 
overview on Next generation sequencing in breast csncer
overview on Next generation sequencing in breast csnceroverview on Next generation sequencing in breast csncer
overview on Next generation sequencing in breast csncerSeham Al-Shehri
 
Microarray and dna chips for transcriptome study
Microarray and dna chips for transcriptome studyMicroarray and dna chips for transcriptome study
Microarray and dna chips for transcriptome studyBia Khan
 
Analytical Study of Hexapod miRNAs using Phylogenetic Methods
Analytical Study of Hexapod miRNAs using Phylogenetic MethodsAnalytical Study of Hexapod miRNAs using Phylogenetic Methods
Analytical Study of Hexapod miRNAs using Phylogenetic Methodscscpconf
 
Microarray @ujjwal sirohi
Microarray @ujjwal sirohiMicroarray @ujjwal sirohi
Microarray @ujjwal sirohiujjwal sirohi
 
Exploiting microRNAs for precision oncology
Exploiting microRNAs for precision oncologyExploiting microRNAs for precision oncology
Exploiting microRNAs for precision oncologyGhent University
 
Experimental methods and the big data sets
Experimental methods and the big data sets Experimental methods and the big data sets
Experimental methods and the big data sets improvemed
 
Processing Raw scRNA-Seq Sequencing Data
Processing Raw scRNA-Seq Sequencing DataProcessing Raw scRNA-Seq Sequencing Data
Processing Raw scRNA-Seq Sequencing DataAlireza Doustmohammadi
 
Applications of microarray
Applications of microarrayApplications of microarray
Applications of microarrayprateek kumar
 
miScript Single Cell Poster
miScript Single Cell PostermiScript Single Cell Poster
miScript Single Cell PosterQIAGEN
 
microRNA for Clinical Research and Tumor Analysis
microRNA for Clinical Research and Tumor AnalysismicroRNA for Clinical Research and Tumor Analysis
microRNA for Clinical Research and Tumor AnalysisBioGenex
 

Similar a Slides (20)

Micro RNA.ppt
Micro RNA.pptMicro RNA.ppt
Micro RNA.ppt
 
Transcriptomics
TranscriptomicsTranscriptomics
Transcriptomics
 
Wp mi script_preamp_0613_lr
Wp mi script_preamp_0613_lrWp mi script_preamp_0613_lr
Wp mi script_preamp_0613_lr
 
Toxicogenomics: microarray
Toxicogenomics: microarrayToxicogenomics: microarray
Toxicogenomics: microarray
 
Qi liu 08.08.2014
Qi liu 08.08.2014Qi liu 08.08.2014
Qi liu 08.08.2014
 
Genomics(functional genomics)
Genomics(functional genomics)Genomics(functional genomics)
Genomics(functional genomics)
 
Rt2 micrornapcr arrayswhitepaper
Rt2 micrornapcr arrayswhitepaperRt2 micrornapcr arrayswhitepaper
Rt2 micrornapcr arrayswhitepaper
 
miRNA Breast Cancer Prognosis -- Ingenuity Systems
miRNA Breast Cancer Prognosis -- Ingenuity SystemsmiRNA Breast Cancer Prognosis -- Ingenuity Systems
miRNA Breast Cancer Prognosis -- Ingenuity Systems
 
High throughput Data Analysis
High throughput Data AnalysisHigh throughput Data Analysis
High throughput Data Analysis
 
overview on Next generation sequencing in breast csncer
overview on Next generation sequencing in breast csnceroverview on Next generation sequencing in breast csncer
overview on Next generation sequencing in breast csncer
 
Microarray and dna chips for transcriptome study
Microarray and dna chips for transcriptome studyMicroarray and dna chips for transcriptome study
Microarray and dna chips for transcriptome study
 
Analytical Study of Hexapod miRNAs using Phylogenetic Methods
Analytical Study of Hexapod miRNAs using Phylogenetic MethodsAnalytical Study of Hexapod miRNAs using Phylogenetic Methods
Analytical Study of Hexapod miRNAs using Phylogenetic Methods
 
Microarray @ujjwal sirohi
Microarray @ujjwal sirohiMicroarray @ujjwal sirohi
Microarray @ujjwal sirohi
 
Exploiting microRNAs for precision oncology
Exploiting microRNAs for precision oncologyExploiting microRNAs for precision oncology
Exploiting microRNAs for precision oncology
 
Experimental methods and the big data sets
Experimental methods and the big data sets Experimental methods and the big data sets
Experimental methods and the big data sets
 
Genomics seminar copy
Genomics seminar   copyGenomics seminar   copy
Genomics seminar copy
 
Processing Raw scRNA-Seq Sequencing Data
Processing Raw scRNA-Seq Sequencing DataProcessing Raw scRNA-Seq Sequencing Data
Processing Raw scRNA-Seq Sequencing Data
 
Applications of microarray
Applications of microarrayApplications of microarray
Applications of microarray
 
miScript Single Cell Poster
miScript Single Cell PostermiScript Single Cell Poster
miScript Single Cell Poster
 
microRNA for Clinical Research and Tumor Analysis
microRNA for Clinical Research and Tumor AnalysismicroRNA for Clinical Research and Tumor Analysis
microRNA for Clinical Research and Tumor Analysis
 

Más de BioinformaticsInstitute

Биоинформатический анализ данных полноэкзомного секвенирования: анализ качес...
 Биоинформатический анализ данных полноэкзомного секвенирования: анализ качес... Биоинформатический анализ данных полноэкзомного секвенирования: анализ качес...
Биоинформатический анализ данных полноэкзомного секвенирования: анализ качес...BioinformaticsInstitute
 
Вперед в прошлое. Методы генетической диагностики древней днк
Вперед в прошлое. Методы генетической диагностики древней днкВперед в прошлое. Методы генетической диагностики древней днк
Вперед в прошлое. Методы генетической диагностики древней днкBioinformaticsInstitute
 
"Зачем биологам суперкомпьютеры", Александр Предеус
"Зачем биологам суперкомпьютеры", Александр Предеус"Зачем биологам суперкомпьютеры", Александр Предеус
"Зачем биологам суперкомпьютеры", Александр ПредеусBioinformaticsInstitute
 
Иммунотерапия раковых опухолей: взгляд со стороны системной биологии. Максим ...
Иммунотерапия раковых опухолей: взгляд со стороны системной биологии. Максим ...Иммунотерапия раковых опухолей: взгляд со стороны системной биологии. Максим ...
Иммунотерапия раковых опухолей: взгляд со стороны системной биологии. Максим ...BioinformaticsInstitute
 
Рак 101 (Мария Шутова, ИоГЕН РАН)
Рак 101 (Мария Шутова, ИоГЕН РАН)Рак 101 (Мария Шутова, ИоГЕН РАН)
Рак 101 (Мария Шутова, ИоГЕН РАН)BioinformaticsInstitute
 
Секвенирование как инструмент исследования сложных фенотипов человека: от ген...
Секвенирование как инструмент исследования сложных фенотипов человека: от ген...Секвенирование как инструмент исследования сложных фенотипов человека: от ген...
Секвенирование как инструмент исследования сложных фенотипов человека: от ген...BioinformaticsInstitute
 
Инвестиции в биоинформатику и биотех (Андрей Афанасьев)
Инвестиции в биоинформатику и биотех (Андрей Афанасьев)Инвестиции в биоинформатику и биотех (Андрей Афанасьев)
Инвестиции в биоинформатику и биотех (Андрей Афанасьев)BioinformaticsInstitute
 

Más de BioinformaticsInstitute (20)

Graph genome
Graph genome Graph genome
Graph genome
 
Nanopores sequencing
Nanopores sequencingNanopores sequencing
Nanopores sequencing
 
A superglue for string comparison
A superglue for string comparisonA superglue for string comparison
A superglue for string comparison
 
Биоинформатический анализ данных полноэкзомного секвенирования: анализ качес...
 Биоинформатический анализ данных полноэкзомного секвенирования: анализ качес... Биоинформатический анализ данных полноэкзомного секвенирования: анализ качес...
Биоинформатический анализ данных полноэкзомного секвенирования: анализ качес...
 
Вперед в прошлое. Методы генетической диагностики древней днк
Вперед в прошлое. Методы генетической диагностики древней днкВперед в прошлое. Методы генетической диагностики древней днк
Вперед в прошлое. Методы генетической диагностики древней днк
 
Knime & bioinformatics
Knime & bioinformaticsKnime & bioinformatics
Knime & bioinformatics
 
"Зачем биологам суперкомпьютеры", Александр Предеус
"Зачем биологам суперкомпьютеры", Александр Предеус"Зачем биологам суперкомпьютеры", Александр Предеус
"Зачем биологам суперкомпьютеры", Александр Предеус
 
Иммунотерапия раковых опухолей: взгляд со стороны системной биологии. Максим ...
Иммунотерапия раковых опухолей: взгляд со стороны системной биологии. Максим ...Иммунотерапия раковых опухолей: взгляд со стороны системной биологии. Максим ...
Иммунотерапия раковых опухолей: взгляд со стороны системной биологии. Максим ...
 
Рак 101 (Мария Шутова, ИоГЕН РАН)
Рак 101 (Мария Шутова, ИоГЕН РАН)Рак 101 (Мария Шутова, ИоГЕН РАН)
Рак 101 (Мария Шутова, ИоГЕН РАН)
 
Плюрипотентность 101
Плюрипотентность 101Плюрипотентность 101
Плюрипотентность 101
 
Секвенирование как инструмент исследования сложных фенотипов человека: от ген...
Секвенирование как инструмент исследования сложных фенотипов человека: от ген...Секвенирование как инструмент исследования сложных фенотипов человека: от ген...
Секвенирование как инструмент исследования сложных фенотипов человека: от ген...
 
Инвестиции в биоинформатику и биотех (Андрей Афанасьев)
Инвестиции в биоинформатику и биотех (Андрей Афанасьев)Инвестиции в биоинформатику и биотех (Андрей Афанасьев)
Инвестиции в биоинформатику и биотех (Андрей Афанасьев)
 
Biodb 2011-everything
Biodb 2011-everythingBiodb 2011-everything
Biodb 2011-everything
 
Biodb 2011-05
Biodb 2011-05Biodb 2011-05
Biodb 2011-05
 
Biodb 2011-04
Biodb 2011-04Biodb 2011-04
Biodb 2011-04
 
Biodb 2011-03
Biodb 2011-03Biodb 2011-03
Biodb 2011-03
 
Biodb 2011-01
Biodb 2011-01Biodb 2011-01
Biodb 2011-01
 
Biodb 2011-02
Biodb 2011-02Biodb 2011-02
Biodb 2011-02
 
Ngs 3 1
Ngs 3 1Ngs 3 1
Ngs 3 1
 
Ngs 1 0_0
Ngs 1 0_0Ngs 1 0_0
Ngs 1 0_0
 

Último

UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1DianaGray10
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding TeamAdam Moalla
 
Babel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxBabel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxYounusS2
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintMahmoud Rabie
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopBachir Benyammi
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URLRuncy Oommen
 
Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?SANGHEE SHIN
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IES VE
 
Spring24-Release Overview - Wellingtion User Group-1.pdf
Spring24-Release Overview - Wellingtion User Group-1.pdfSpring24-Release Overview - Wellingtion User Group-1.pdf
Spring24-Release Overview - Wellingtion User Group-1.pdfAnna Loughnan Colquhoun
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPathCommunity
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...DianaGray10
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.YounusS2
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Websitedgelyza
 

Último (20)

UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team
 
Babel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxBabel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptx
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 Workshop
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URL
 
Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
 
Spring24-Release Overview - Wellingtion User Group-1.pdf
Spring24-Release Overview - Wellingtion User Group-1.pdfSpring24-Release Overview - Wellingtion User Group-1.pdf
Spring24-Release Overview - Wellingtion User Group-1.pdf
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation Developers
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Website
 

Slides

  • 1. miRNA Discovery & Prediction Algorithms Sergei Lebedev October 13, 2012
  • 2. What is miRNA? • microRNA or miRNA, ≈ 22 nucleotide-long non-coding RNA; • mostly expressed in a tissue-specific manner and play crucial roles in cell proliferation, apoptosis and differentiation during cell development; • thought to be involved in post-transcriptional control in plants and animals; • linked to disease1, for example hsa-miR-126 is associated with retinoblastoma, breast cancer, lung cancer, kidney cancer, asthma etc. 1 See http://www.mir2disease.org for details. 1 / 11
  • 3. miRNA in action: nucleus [1] • pri-miRNA is transcribed by RNA polymerase II and seem to possess promoter and enchancer regions, similar to protein coding genes; • pri-miRNA is then cleaved into (possibly multiple) pre-miRNA by an enzyme complex Drosha. 2 / 11
  • 4. miRNA in action: cytoplasm [1] • Dicer removes the stem-loop, leaving two complementary sequences: miRNA and miRNA*, the latter is not known to have any regulatory function. • Mature miRNA base-pairs with 3’ UTR of target mRNAs and blocks protein syntesis or causes mRNA degradation. 3 / 11
  • 5. miRNA identification • Biological methods: northern blots, qRT-PCR2, micro arrays, RNA-seq or miRNA-seq. • Bioinformatics to the rescue! the usual strategy: first sequence everything, RNA-seq in this case, then try to make sense of whatever the result is. • In this talk: miRDeep [2], MiRAlign [3], MiRank [4]. • A lot of existing tools out of scope, most can be described with a one liner: “We’ve developed a novel method for miRNA identification, based on machine learning approach, SVM, HMM!”. 2 RT for reverse transcription, not real-time. 4 / 11
  • 8. miRank: overview • Treat miRNA identification problem as a problem of information retrieval, where novel miRNAs are to be retrieved from a set of candidates by the known query samples – “true” miRNAs. • More formally, given a set of known pre-miRNAs XQ as query samples and a set of putative candidates XU as unknown samples, rank XU with respect to XQ. • To do so, compute the relevancy values fi ∈ [0, 1] for all unknown samples, assuming fi = 1 for query samples. • After that, simply select n ranked samples, which constitute to predicted pre-miRNA. • Makes sense, right? 7 / 11
  • 9. miRank: how does it work? • miRank models belief propagation process by doing Markov random walks on a graph, where each vertex corresponds to either known pre-miRNA or a putative candidate and two vertices are connected by an edge if the two vertices are “close to each other”. • Each edge on the graph is assigned a weight wij , proportional to the Euclidean distance between the samples vi and vj (see next slide on how samples are represented). • When a random walker transits from vi to vj it transmits the relevancy information of vi to vj by the following update rule: f (k+1) i = α xj ∈XU pij f (k) j + xj ∈XQ pij fj pij = wij deg(vij ) 8 / 11
  • 10. miRank: features Global • normalized minimum free energy of folding (MFE); • normalized no. of paired nucleotides on both arms; • normalized loop length. Local – RNAFold GUAGCACUAAAGUGCUUAUAGUGCAGGUAGUGUUUAGUUAUCUACUGCAUUAUGAGCACUUAAAGUACUGC ((((.(((.(((((((((((((((((.(((((......)).))))))))))))))))))))..))).)))) • Each nucleotide is either paired, denoted by a bracket (– 5’ arm, )– 3’ arm, or unpaired – .; • Each local feature is a “word” of length 3, further distinguished by the nucleotide in the middle position, examples: ((., .((. 9 / 11
  • 11. miRank: good parts, bad parts & magic • The method doesn’t require any genomic annotations, except for the set of query samples. • ≈ 75% precision and ≈ 70% recall even with very few query samples (1, 5) – hard to validate, because the source code was never released. • The notion of similarity between query samples, which defines the graph structure is unclear, even though it looks critical for algorithm performance. • Two user-specified parameters, n – number of predicted samples and α – the weight of unknown samples in the relevancy value. How do they affect precision-recall and how to choose them? • Overall, it seems like miRank isn’t used much by biologists3. 3 http://www.ncbi.nlm.nih.gov/pubmed?linkname=pubmed_pubmed_ citedin&from_uid=18586744 10 / 11
  • 12. References K. Chen and N. Rajewsky. The evolution of gene regulation by transcription factors and microRNAs. Nat. Rev. Genet., 8(2):93–103, Feb 2007. M. R. Friedlander, W. Chen, C. Adamidi, J. Maaskola, R. Einspanier, S. Knespel, and N. Rajewsky. Discovering microRNAs from deep sequencing data using miRDeep. Nat. Biotechnol., 26(4):407–415, Apr 2008. X. Wang, J. Zhang, F. Li, J. Gu, T. He, X. Zhang, and Y. Li. MicroRNA identification based on sequence and structure alignment. Bioinformatics, 21(18):3610–3614, Sep 2005. Y. Xu, X. Zhou, and W. Zhang. MicroRNA prediction with a novel ranking algorithm based on random walks. Bioinformatics, 24(13):i50–58, Jul 2008. 11 / 11