THE DATA UNIFICATION IMPERATIVE
ANDY PALMER | CO-FOUNDER, TAMR
BACKGROUND
Career is a mashup of:
start-ups + enterprise
customer + vendor
data + application
technical + business
HEALTHCARE INVESTMENTS
HUGE INVESTMENT IN ENTERPRISE IT & BIG DATA
Companies invested $3-4 Trillion in IT over last 20+ years
And now are investing billions in “Big Data” and Analytics 3.0...
DIRTY LITTLE SECRET: DATA VARIETY IN ENTERPRISE
Most investments oriented towards some “silo” in the enterprise
● application
● function
● division
● geography
Data tied to these investments is extremely siloed
BIG DATA & ANALYTICS NEED CLEAN + UNIFIED DATA
“Consider the more than $44 billion projected by Gartner to be spent on big data in
2014. The vast majority of it — $37.4 billion — is going to IT services. Enterprise software
only accounts for about a tenth. The disproportionate spending on services is a sign of
immaturity in how we manage data.” - Mahesh S. Kumar, Harvard Business Review
TACKLING THE ENTERPRISE DATA SILO PROBLEM
All are necessary but not sufficient to truly address next-gen challenges
● Democratized visualization and modeling - radical consumption heterogeneity
● SemanticWeb/LinkedData - radical source heterogeneity
● Provenance for data to improve reliability
● Rapid iteration/change requires reproducibility from source
● Desire for longitudinal data across many entities
● Need for automated data quality / assurance
Traditional approaches...
● Standardization - worth trying
● Aggregation - yes - but actually makes the problem worse
● Top-down modeling (MDM/ETL) - ok for app-specific or well-defined data
THE MYTH OF THE SINGLE TECH VENDOR SOLUTION
“Use my brand and data unification will just happen!”
REALLY?
HEALTHCARE/BIOPHARMA IS THE FRONT LINE
The diversity of data and the decentralized nature of healthcare, and specifically of biopharmaceutical research, make our industry the place where next-gen data management will develop.
TABULAR DATA IS KEY ASSET
But it’s messy ...
CURATION AT SCALE
Hiring More Data Scientists Makes the Problem Worse
Enterprise Reality vs. Goal
• Manual data collection and preparation
• Long lead time to analyses
• Limited individual view on variety of data
• Extensive rework
• No cohesive view of data efforts
• Expertise across organization underutilized
NEW TOOLS ARE NECESSARY
New transformation tools are necessary… but not sufficient to solve the enterprise data variety problem
Unified View
A few sources...
Thousands of sources
SOLUTION: BOTTOM-UP, PROBABILISTIC DATA MODELING & “COLLABORATIVE CURATION”
Time to embrace the reality of extreme data variety across the entire enterprise - “Unified Data”
Back to the future
● 1990s web: probabilistic search / website connection
● 2020s enterprise: probabilistic data source connection & curation
Requires a bottom-up, probabilistic and collaborative approach to data (complements deterministic)
● Rules for transformation are necessary but not sufficient to solve the broad problem of integration
● Mix of 80% probabilistic & 20% deterministic
● Iteratively and systematically engage data experts
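The mix of deterministic rules, probabilistic matching and expert engagement described on this slide can be sketched roughly as follows. This is a minimal illustration, not Tamr's method: the rule table, the target attribute names, the two thresholds and the use of string similarity from Python's standard `difflib` module as the "probabilistic" score are all assumptions made for the example.

```python
from difflib import SequenceMatcher

# Hypothetical deterministic rules: renames that are known with certainty.
RULES = {"pt_id": "patient_id", "dob": "birth_date"}

# Hypothetical attribute names of the unified schema.
TARGETS = ["patient_id", "birth_date", "assay_result", "dose_unit"]


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (stand-in for a learned score)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def map_attribute(name: str, accept: float = 0.85, review: float = 0.5):
    """Return (target, score, status) for one source attribute name.

    Rules are applied first; otherwise the best-scoring target is accepted
    automatically, routed to a data expert, or left unmatched, depending on
    the (assumed) thresholds.
    """
    if name in RULES:
        return RULES[name], 1.0, "rule"
    best = max(TARGETS, key=lambda t: similarity(name, t))
    score = similarity(name, best)
    if score >= accept:
        return best, score, "auto"
    if score >= review:
        return best, score, "expert_review"
    return None, score, "unmatched"


if __name__ == "__main__":
    for source_name in ["pt_id", "Patient_ID", "assay", "zzz"]:
        print(source_name, map_attribute(source_name))
```

Only the middle band of scores is sent to people, which is one way to engage data experts iteratively without asking them to review every attribute.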
CORE OF TAMR
Machine Learning with Human Insight
Identify sources, understand relationships and curate the massive variety of siloed data
[Diagram labels: Structured and Semi-structured Data Sources; Collaborative Curation; Data Experts (Source owners); Data Stewards and Curators; Data Inventory; APIs; Systems; Tools; Data Scientists; Advanced Algorithms & Machine Learning; Expert Input; Integrated Data & Metadata; Expert Directory]
FORTUNE 5 BIOPHARMA
Challenges
• 7k+ scientists
• Decentralized organization
• Assay data in spreadsheets
• 30k+ tables
• 100k+ unique attributes
• Error detection in units
Tamr Unified View
Thousands of Potential Sources
SOLUTION OVERVIEW: CDISC CONVERSION
The Problem
• Clinical trial data reported in wide variety of formats, ontologies and standards
• Underspecified attribute names, varying qualities of annotation, duplicate data, etc…
The Solution
• A scalable, replicable way to automatically unify and convert clinical trial data to CDISC format.
Benefit
• Tamr technology solves common CDISC problems: schema mapping and expert sourcing
• Faster way to aggregate and report ongoing trial data for regulatory filings
• Simplified reporting for various agency ontologies
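As a rough illustration of the conversion step, the sketch below applies a source-to-target column mapping to trial records, drops duplicate rows, and collects attribute names that have no mapping so they can be sent to an expert. The mapping table, the source column names and the `convert` helper are hypothetical; `USUBJID`, `AGE` and `SEX` are used only as illustrative target names, and nothing here is taken from the Tamr product.

```python
# Hypothetical mapping from source column names (lowercased) to target names.
MAPPING = {
    "subject": "USUBJID",
    "subj_id": "USUBJID",
    "age_yrs": "AGE",
    "gender": "SEX",
}


def convert(records):
    """Rename mapped columns, drop duplicate rows, and report unmapped columns.

    Returns (converted_rows, unmapped_column_names). Values are assumed to be
    hashable so that rows can be compared for duplicates.
    """
    converted, seen, unmapped = [], set(), set()
    for rec in records:
        row = {}
        for col, value in rec.items():
            target = MAPPING.get(col.strip().lower())
            if target is None:
                unmapped.add(col)  # candidate for expert sourcing
            else:
                row[target] = value
        key = tuple(sorted(row.items()))
        if row and key not in seen:  # skip empty and duplicate rows
            seen.add(key)
            converted.append(row)
    return converted, sorted(unmapped)


if __name__ == "__main__":
    rows, todo = convert([
        {"Subject": "001", "age_yrs": 54, "Gender": "F"},
        {"subj_id": "001", "age_yrs": 54, "gender": "F", "site": "A"},
    ])
    print(rows)
    print(todo)
```

In practice the mapping table itself would be the output of a matching step rather than being written by hand; it is hard-coded here to keep the example short.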
TAMR
Thank You

Tamr | Biogen data unification imperative
