Roeder rocky 2011_46


Chris Roeder

Conference Talk: A Distributed Framework for Computation on the Results of Large Scale NLP

A Distributed Framework for
Computation on the Results of
Large Scale NLP
Christophe Roeder, William A. Baumgartner Jr., Kevin Livingston,
Lawrence E. Hunter
(University of Colorado Anschutz Medical Campus)

Chris.Roeder@ucdenver.edu
http://compbio.ucdenver.edu

Motivation
• A vast amount of information is available
in journal articles
• Journal articles are unstructured text
• Many applications require structured
knowledge
– Curated ontologies (Gene Ontology)
– Databases (UniProt, EntrezGene)
• Challenge: extract structured knowledge
from unstructured text and integrate with
existing knowledge…at massive scale

Architecture
[Diagram; box labels listed in approximate flow order]
• Journal Articles (unstructured)
• Scaled NLP Pipeline
• RDF Documents (structured)
• Knowledge Base (Ontologies, Databases)
• Queries
• Sesame/Hadoop
• Knowledge Distilled Output (structured)
• Structured Information
• Applications (Visualization, NLP,…)

Example Application
• Concept annotation
trends over time

[Figure: annotation trends for Insulin and NOS1]

http://tinyurl.com/bio-trends
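
A trend computation of this kind can be sketched as a map step over documents followed by a reduce step that merges counts. The sketch below is illustrative only and is not the talk's implementation: the record layout (publication year, concept name), the sample data, and the function names are all assumptions, and it runs in a single process rather than on a cluster.

```python
from collections import Counter
from functools import reduce

# Hypothetical input: one list of (publication_year, concept) annotations
# per document. Real annotations would come from the NLP pipeline output.
ANNOTATIONS_BY_DOC = [
    [(2008, "Insulin"), (2008, "Insulin"), (2008, "NOS1")],
    [(2009, "Insulin")],
    [(2009, "NOS1"), (2009, "NOS1")],
]

def map_document(annotations):
    """Map step: count (year, concept) pairs within a single document."""
    return Counter(annotations)

def reduce_counts(left, right):
    """Reduce step: merge two partial count tables."""
    return left + right

def concept_trends(docs):
    """Combine the per-document counts into one (year, concept) table."""
    return reduce(reduce_counts, map(map_document, docs), Counter())

trends = concept_trends(ANNOTATIONS_BY_DOC)
for (year, concept), count in sorted(trends.items()):
    print(year, concept, count)
```

Because each document is counted independently, the map step is the part that could be spread across cluster nodes; only the merge needs to see all partial results.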

Summary
• NLP pipelines extract structured annotations
• Our framework provides massively parallel access
to these structured document annotations
• Structured representation is integrated with
knowledge base
• Affords parallelization when possible, and access
to knowledge base when necessary
• Provides integration of unstructured document text
with structured knowledge for enabling
applications such as:
– Visualization (BioJigsaw, Hanalyzer,…)
– Natural Language Understanding (OpenDMAP)
– Leveraging text data for validation and evaluation of
other methods
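
The split between parallel work and knowledge-base access can be illustrated with a small sketch: per-document counting runs in parallel, and the knowledge base is consulted only once, on the merged result. Everything here is an assumption made for illustration: the concept IDs, the dictionary standing in for the knowledge base, and the thread pool standing in for a cluster. It is not the framework's actual API.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the knowledge base: concept ID -> label.
KNOWLEDGE_BASE = {"C1": "Insulin", "C2": "NOS1"}

# Hypothetical structured annotations: document ID -> concept IDs found.
DOCUMENTS = {
    "doc-a": ["C1", "C2", "C1"],
    "doc-b": ["C2"],
    "doc-c": ["C1", "C3"],
}

def count_concepts(concept_ids):
    """Per-document step: needs no shared state, so it can run in parallel."""
    return Counter(concept_ids)

def summarize(documents, knowledge_base, workers=4):
    """Count concepts in parallel, then resolve IDs against the knowledge base."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_doc = list(pool.map(count_concepts, documents.values()))
    totals = sum(per_doc, Counter())
    # Knowledge-base access happens once, after the parallel phase.
    # IDs missing from the knowledge base keep their raw ID as the label.
    return {knowledge_base.get(cid, cid): n for cid, n in totals.items()}

summary = summarize(DOCUMENTS, KNOWLEDGE_BASE)
print(summary)
```

Deferring the lookup keeps the parallel phase free of shared dependencies, which is one plausible reading of "parallelization when possible, and access to knowledge base when necessary".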

Thank You / Questions
• http://tinyurl.com/bio-trends

• Co-authors
– William A. Baumgartner Jr. for data generation
– Kevin Livingston for RDF and Clojure help
• Grants and PIs
– Lawrence E Hunter, UCDenver SOM
• NIH 2R01LM009254-04, NIH 2R01LM008111-04A1,
NIH 5R01GM083649-02
– Karin Verspoor, UCDenver SOM
• NIH R01 LM010120-01
– Gully Burns, ISI
• NSF 0849977

Editor's notes

• Plug Kabob. Plug Open Access. Mention Elsevier collections and their size.
• Mention UIMA. Distinguish NER from normalization, and how that ID ties it into the KB.
• Putting high-precision entity recognition to work at large scale.
• Induction, abduction.
• Get around noise issues by using a LOT of data.
• Precision and recall require scale.
• Might learn something, if said often enough.
• Correlations between proteins, co-occurrence, PPI.
• Co-occurrence with other ontology terms, other extracted terms, or biological processes.
• No excuses; don't trivialize, but emphasize its value as a demo.
• Built in about a week; computation over PMC OA in 2 hours on a very modest cluster (40 cores). Inefficiencies exist as well.
• Lot of data, runs quickly.
• Demonstrates that the framework can be used quickly and works.
• Same technology can be used
• On that last point, think of correlations and similar analyses. Who knows what we'll think of with the possibilities this opens up.
