RWE & Patient Analytics
Leveraging Databricks
A Use Case
Harini Gopalakrishnan & Martin Longpre
Sanofi
Disclaimer
• The views and opinions expressed in this presentation are those of
the individual presenters and should not be attributed to any
organization with which the presenters are employed or affiliated
• All registered trademarks cited are property of their respective
owners.
Agenda
Harini Gopalakrishnan – 20 minutes
▪ What are real-world evidence and real-world data?
▪ Advanced analytics in RWE generation
▪ Security and privacy of our data
▪ Our journey – a conceptual view of the architecture
and what we have achieved
Martin Longpre – 20 minutes
▪ Databricks implementation – our customization
▪ Demo
▪ Looking forward: where we want to partner for
improvements
Q&A – 20 minutes
Defining the Problem: Real-World Data and
Evidence
Context: How do we define RWE & RWD?
Real-world data (RWD) is a term used to
describe health care related data that are
collected outside the context of
randomized clinical trials (RCTs).
Real-world evidence (RWE) is defined as the
insight or knowledge derived from the analysis
of real-world data, conducted to respond to a
specific research question.
RWE leverages analytics on RWD to discover, develop, deliver and
provide new insights on healthcare interventions
Examples of Real-world data sources
~130 TB (EHR/claims)
~2000 TB per month in versions and transformations
Analysis in RWE: Advanced analytics methodology
Traditional analytics
• Traditional RWE statistics, meta-analysis, data modelling, propensity-score matching
Advanced analytics
• Predictive modelling, unsupervised clustering, rule extraction, model bootstrapping,
natural language processing, machine learning
Machine learning: a computer
program is said to learn from
experience (partially captured
within data), when its performance
increases with experience
Supervised techniques examples
• Logistic regression
• Markov chain
• Bayesian network
• K-nearest neighbour
Unsupervised techniques examples
• K-means clustering
• Hierarchical ascendant classification (agglomerative hierarchical clustering)
• Factor analyses
• Non-negative matrix factorization
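Propensity-score matching, listed above under traditional analytics, can be illustrated with a minimal sketch. It assumes the propensity scores have already been estimated (for example with logistic regression). The patient IDs, scores and caliper value are hypothetical, and greedy 1:1 nearest-neighbour matching is only one of several possible matching strategies, not necessarily the one used by the presenters.

```python
def match_nearest(treated, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on precomputed propensity scores.

    treated, controls: dicts mapping a patient id to a propensity score.
    Returns a list of (treated_id, control_id) pairs within the caliper.
    """
    available = dict(controls)  # controls not yet matched
    pairs = []
    # Visit treated patients in score order so the result is deterministic.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score difference.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return pairs


# Hypothetical scores for illustration only.
treated = {"T1": 0.30, "T2": 0.62, "T3": 0.91}
controls = {"C1": 0.28, "C2": 0.33, "C3": 0.60, "C4": 0.10}
pairs = match_nearest(treated, controls)
print(pairs)
```

In this toy data the third treated patient stays unmatched because no control falls within the caliper, which is the usual trade-off between match quality and sample size.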
Innovation in evidence generation
Uses of RWE – why is it valuable
https://www.healthcatalyst.com/insights/real-world-data-chief-driver-drug-development
The main reasons RWE has been
leveraged more
recently include:
• Easy availability of
compute resources for big
data
• Availability of curated,
high-quality data sources
both internally and
externally
Real-world evidence influences all aspects of the pharma value chain
Regulatory decision making
Reimbursement decisions
Clinical guidelines
Transforming RWD to Evidence: Use case in action
AI-based indication-searching approach that relies on real-world data, bringing higher confidence and reducing
biases
Data is always privacy preserved and de-identified
Sanofi: Novel Indications via AI —
Finding new treatment indications for an
approved therapy is of immense value to
pharma for drug re-purposing efforts,
R&D candidate prioritization, and overall
productivity. Sanofi wanted to develop an
AI based indication searching approach
that relies on real-world data thus
bringing a higher confidence and
reducing biases. Sanofi applied
unsupervised machine learning to create
a phenotypic cluster of patients in order
to identify relevant indications that
worked across clusters. The pipeline
crunched nearly 17 million patients with
2,700 characteristics derived from
electronic health records (EHRs). The
initial results of the novel approach
recovered 90% of known indications and
identified many more deemed credible by
development teams producing a higher
level of confidence in results and a
reduction in cost and time to market, with
fewer, faster and more targeted trials,
while minimizing attrition and risk.
https://www.gartner.com/en/newsroom/press-releases/2020-11-17-gartner-announces-winners-of-the-2020-gartner-healthcare-and-life-sciences-eye-on-innovation-award
Winner of the 2020 Gartner Healthcare and Life Sciences
Eye on Innovation Award
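The use case above relies on unsupervised machine learning to group patients into phenotypic clusters. The deck does not say which algorithm was used, so the sketch below swaps in plain k-means (one of the unsupervised techniques listed earlier) on a handful of made-up two-feature patient vectors. The feature values, the starting centroids and the function name are all hypothetical.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then update centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # New centroid = mean of the members; keep the old one if a cluster is empty.
        centroids = [
            tuple(sum(dim) / len(members) for dim in zip(*members)) if members else c
            for members, c in zip(clusters, centroids)
        ]
    labels = [min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
              for p in points]
    return labels, centroids


# Hypothetical, already-scaled patient features (two characteristics per patient).
patients = [(0.10, 0.20), (0.15, 0.25), (0.20, 0.10),
            (0.80, 0.90), (0.85, 0.80), (0.90, 0.95)]
labels, centroids = kmeans(patients, centroids=[patients[0], patients[3]])
print(labels)
```

A real pipeline over millions of patients and thousands of characteristics would use a distributed implementation rather than this in-memory loop; the sketch only shows the assign-then-update idea.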
Trust in the data and in the analysis being performed is a MUST
“Patients and consumers have a
significant role to play in the
collection of real-world data and
generation of real-world evidence,
but to be effective, patient and
consumer engagement approaches
would include considering them
partners and capturing outcomes that
are important to them.”
▪ Patient consent is a must
▪ Privacy-preserving linkage must be
performed; encryption is a key
aspect
▪ Establish a trusted patient relationship
to explain the usage of data and
consent (e.g., secondary use of
primary data)
▪ Data should not be used beyond the
intended purpose; governance
around the usage is a must
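One common building block for privacy-preserving linkage is a keyed hash: records from two sources are joined on a token rather than on the identifier itself. The deck does not describe the actual linkage method, so the following is only a sketch under stated assumptions. The identifier format, the key value and the `link_token` helper are hypothetical, and in practice the key would be held by a party other than the analysts.

```python
import hashlib
import hmac


def link_token(patient_identifier: str, secret_key: bytes) -> str:
    """Derive a linkage token with HMAC-SHA256 so raw identifiers are never shared."""
    # Normalise before hashing so trivial formatting differences still link.
    normalized = patient_identifier.strip().lower()
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key and identifiers, for illustration only.
key = b"hypothetical-key-held-outside-the-analytics-environment"
ehr_token = link_token(" Jane.Doe|1980-01-01 ", key)
claims_token = link_token("jane.doe|1980-01-01", key)
other_token = link_token("john.roe|1975-05-05", key)

print(ehr_token == claims_token)
print(ehr_token == other_token)
```

Because the token depends on the secret key, someone holding only the tokens cannot recompute them from guessed identifiers without that key.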
Our Architecture & implementation
Key aspects of an RWE Ecosystem
Data Management
▪ Secure data storage – triple encrypted with audited access control
▪ Full data lineage – complete history of every data transformation
▪ Data pipeline – designed for high-performance handling of big data
Analytics
▪ Self-service tools – filtering and querying tools for feasibility and descriptive information
▪ Interactive tools – dashboards and applications for study execution
▪ Low-level tools – R, Python and SQL for comparative analysis and advanced analytics
Access Control
▪ Multi-tenant configuration – provide each organization with its own namespace
▪ User provisioning – role-based access controlled by each organization
▪ Inherited data permissions – transformed data retains access control
Auditing and Monitoring
▪ Full auditing of user actions – log each action and generate reports
▪ Comprehensive monitoring – performance, usage, and custom actions
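The "inherited data permissions" idea above can be sketched as follows. The rule used here (a transformed data set is readable only by groups allowed on all of its inputs) is an assumption made for illustration; the deck only states that transformed data retains access control. The group and data set names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    allowed_groups: frozenset
    parents: tuple = ()


def derive(name, *parents):
    """Create a transformed data set that inherits the most restrictive access of its inputs."""
    if not parents:
        raise ValueError("a derived data set needs at least one parent")
    # Assumed rule: only groups allowed on every parent may read the result.
    allowed = frozenset.intersection(*(p.allowed_groups for p in parents))
    return Dataset(name, allowed, parents)


def can_read(user_groups, dataset):
    """True if the user belongs to at least one group allowed on the data set."""
    return bool(set(user_groups) & dataset.allowed_groups)


# Hypothetical groups and data sets.
ehr = Dataset("ehr_raw", frozenset({"rwe-ehr", "rwe-admin"}))
claims = Dataset("claims_raw", frozenset({"rwe-claims", "rwe-admin"}))
cohort = derive("linked_cohort", ehr, claims)

print(sorted(cohort.allowed_groups))
print(can_read(["rwe-ehr"], cohort))
```

With this rule a user who may read only the EHR data cannot read a cohort that also draws on claims data, so a transformation never widens access.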
What does our system offer?
▪ Powerful compute resources to handle billions of rows of data
▪ Complete history of all data updates, with the ability to bind to specific versions
▪ Complete data traceability: every transform and resulting data set is captured
▪ Robust data security and access control for all data and projects
▪ Ability to manage metadata, reference data and master data
▪ Built on a scalable data lake
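The versioning and traceability points above (every update kept as a version, every transform recorded, analyses bound to a specific version) can be sketched with a small append-only log. This is an illustrative model only, not the platform's implementation; the `LineageLog` class, its field names and the sample rows are hypothetical.

```python
import hashlib
import json


class LineageLog:
    """Append-only log: each write creates a new version and records its transform."""

    def __init__(self):
        self.entries = []

    def record(self, dataset, transform, inputs, rows):
        # The version number is the count of earlier writes to the same data set, plus one.
        version = 1 + sum(1 for e in self.entries if e["dataset"] == dataset)
        payload = json.dumps(rows, sort_keys=True).encode("utf-8")
        self.entries.append({
            "dataset": dataset,
            "version": version,
            "transform": transform,
            "inputs": list(inputs),  # (dataset, version) pairs this write was built from
            "content_sha256": hashlib.sha256(payload).hexdigest(),
        })
        return dataset, version

    def trace(self, dataset, version):
        """Return the transforms that led to this version, upstream steps first."""
        entry = next(e for e in self.entries
                     if e["dataset"] == dataset and e["version"] == version)
        steps = []
        for parent in entry["inputs"]:
            steps.extend(self.trace(*parent))
        steps.append(entry["transform"])
        return steps


log = LineageLog()
raw_v1 = log.record("ehr_raw", "ingest", [], [{"id": 1, "age": 54}])
raw_v2 = log.record("ehr_raw", "ingest", [], [{"id": 1, "age": 54}, {"id": 2, "age": 61}])
# The cohort is bound to version 2 of the raw data, not to "latest".
cohort = log.record("cohort", "filter_age_over_60", [raw_v2], [{"id": 2, "age": 61}])
print(log.trace(*cohort))
```

Binding the cohort to an explicit input version is what makes a study reproducible after the raw data has been updated again.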
Data is always privacy preserved and de-identified. We do not own the key for re-identification within this ecosystem
The conceptual architecture
[Figure: conceptual architecture diagram, for example purposes only. Labels: internal sources; external sources; data lake (Sanofi AWS), secured and access controlled at the data level; secured and traceable Sanofi-controlled environment; clinical bioinformatics; self-service analysis; advanced analytics; data augmentation; visualization / dashboards; artificial intelligence / ML; NLP; standardized analytical workflows; cohort definitions and data modelling; conventional studies; insights; data and analysis collaboration*; external collaboration (societies and consortia, academic institutions, regulatory agencies); other internal platforms.]
https://aws.amazon.com/blogs/industries/sanofi-webinar-performing-end-to-end-real-world-evidence-generation-with-traceability-and-transparency-on-aws/
When do we use Databricks?
▪ Exploratory use cases: projects where we need to run AI/ML workflows that require
GPUs, custom libraries, or NLP/sentiment analysis
▪ Cross-functional teams working on a specific project, with both internal and external stakeholders
▪ Flexibility: users can manage their own cluster profiles and size up and down based on
policy
▪ Data ingestion pipelines: migrating away from AWS Glue and Batch for cost and performance
reasons (30% improvement in costs and productivity)
▪ Delta Lake: under analysis; today data is managed directly in Parquet on S3
▪ SQL Analytics: under evaluation
Databricks Customization (1/2)
Passthrough for Security
▪ Usage of our Azure AD configuration
▪ One AD group per data type
▪ Deactivation of the DBFS file system for end users (DBFS does not align with our data restriction policies)
▪ All data access is predefined and available through /mnt
GitLab integration
▪ Integration of the Databricks Repos feature, connected directly to our enterprise GitLab services
▪ Usage of CI/CD pipelines for deploying scripts and tasks
Cluster Policies
▪ Cluster names suffixed with the policy names for audit and monitoring
▪ Limit the types of worker and driver for better budget management
▪ Enforce the termination of clusters with default values based on projects/use cases (managed by cluster policies)
Databricks Customization (2/2)
Instance Profiling
▪ Only used for specific use cases, mostly for RStudio
IAM roles and policies
▪ Fully integrated with our AWS stack
▪ IAM roles set up for S3 bucket access
▪ One home folder per user created by default (internal process)
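The three cluster-policy rules from the Databricks Customization (1/2) slide (name suffix, restricted worker and driver types, enforced termination) could be expressed as a policy definition along these lines. This is a sketch, not the presenters' actual policy. The attribute names and rule types follow the Databricks cluster policy definition format as best understood here and should be checked against the current documentation; the policy name, instance types and idle limit are hypothetical, and the policy name is assumed to contain no regex metacharacters.

```python
import json
import re


def build_cluster_policy(policy_name, allowed_node_types, max_idle_minutes):
    """Build a policy definition dict covering naming, node types and auto-termination."""
    node_rule = {
        "type": "allowlist",
        "values": allowed_node_types,
        "defaultValue": allowed_node_types[0],
    }
    return {
        # Cluster names must end with the policy name, for audit and monitoring.
        "cluster_name": {"type": "regex", "pattern": "^.*-" + policy_name + "$"},
        # Limit worker and driver types for budget management.
        "node_type_id": dict(node_rule),
        "driver_node_type_id": dict(node_rule),
        # Enforce termination of idle clusters, with a per-project default.
        "autotermination_minutes": {
            "type": "range",
            "maxValue": max_idle_minutes,
            "defaultValue": max_idle_minutes,
        },
    }


# Hypothetical values for a GPU/NLP project.
policy = build_cluster_policy("gpu-nlp", ["g4dn.xlarge", "g4dn.2xlarge"], 60)
print(json.dumps(policy, indent=2))

# Local sanity check of the naming rule before submitting the policy.
name_ok = re.fullmatch(policy["cluster_name"]["pattern"], "team-a-gpu-nlp") is not None
print(name_ok)
```

Generating the definition from code makes it straightforward to keep one policy per project or use case under version control and deploy it through a CI/CD pipeline.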
Demo and Future
Demo
Improvements
▪ Support for RStudio
▪ Data access control and policy propagation to restrict
unauthorized use of data (no lineage on data)
Summary: Our Journey and Benefits
▪ Started from a traditional warehouse 3
years ago to create an end-to-end
ecosystem for evidence generation and insights
▪ Helped move away from conventional to
more advanced analytical approaches,
leveraging the power of big data and cloud
▪ Delivered several evidence-generating
studies, i.e., studies at scale that have
impacted all aspects of the pharma value chain
with demonstrable ROI
https://www.dovepress.com/cr_data/article_fulltext/s160000/160029/img/jmdh-160029_F003.jpg
Feedback
Your feedback is important to us.
Don’t forget to rate and review the sessions.

Más contenido relacionado

La actualidad más candente

Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
Data Architecture Strategies: Data Architecture for Digital Transformation
Data Architecture Strategies: Data Architecture for Digital TransformationData Architecture Strategies: Data Architecture for Digital Transformation
Data Architecture Strategies: Data Architecture for Digital TransformationDATAVERSITY
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsDATAVERSITY
 
Enabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data VirtualizationEnabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data VirtualizationDenodo
 
DataOps - The Foundation for Your Agile Data Architecture
DataOps - The Foundation for Your Agile Data ArchitectureDataOps - The Foundation for Your Agile Data Architecture
DataOps - The Foundation for Your Agile Data ArchitectureDATAVERSITY
 
Data Architecture Best Practices for Today’s Rapidly Changing Data Landscape
Data Architecture Best Practices for Today’s Rapidly Changing Data LandscapeData Architecture Best Practices for Today’s Rapidly Changing Data Landscape
Data Architecture Best Practices for Today’s Rapidly Changing Data LandscapeDATAVERSITY
 
Data Catalog as the Platform for Data Intelligence
Data Catalog as the Platform for Data IntelligenceData Catalog as the Platform for Data Intelligence
Data Catalog as the Platform for Data IntelligenceAlation
 
How to Take Advantage of an Enterprise Data Warehouse in the Cloud
How to Take Advantage of an Enterprise Data Warehouse in the CloudHow to Take Advantage of an Enterprise Data Warehouse in the Cloud
How to Take Advantage of an Enterprise Data Warehouse in the CloudDenodo
 
Five Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data GovernanceFive Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data GovernanceDATAVERSITY
 
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...HostedbyConfluent
 
Data Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced AnalyticsData Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced AnalyticsDATAVERSITY
 
Data Governance
Data GovernanceData Governance
Data GovernanceBoris Otto
 
How to Build & Sustain a Data Governance Operating Model
How to Build & Sustain a Data Governance Operating Model How to Build & Sustain a Data Governance Operating Model
How to Build & Sustain a Data Governance Operating Model DATUM LLC
 
Data Governance Best Practices
Data Governance Best PracticesData Governance Best Practices
Data Governance Best PracticesDATAVERSITY
 
Improving Data Literacy Around Data Architecture
Improving Data Literacy Around Data ArchitectureImproving Data Literacy Around Data Architecture
Improving Data Literacy Around Data ArchitectureDATAVERSITY
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture DesignKujambu Murugesan
 
The Chief Data Officer Agenda: Metrics for Information and Data Management
The Chief Data Officer Agenda: Metrics for Information and Data ManagementThe Chief Data Officer Agenda: Metrics for Information and Data Management
The Chief Data Officer Agenda: Metrics for Information and Data ManagementDATAVERSITY
 
Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...
Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...
Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...DATAVERSITY
 
Data Modeling, Data Governance, & Data Quality
Data Modeling, Data Governance, & Data QualityData Modeling, Data Governance, & Data Quality
Data Modeling, Data Governance, & Data QualityDATAVERSITY
 

La actualidad más candente (20)

Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
Data Architecture Strategies: Data Architecture for Digital Transformation
Data Architecture Strategies: Data Architecture for Digital TransformationData Architecture Strategies: Data Architecture for Digital Transformation
Data Architecture Strategies: Data Architecture for Digital Transformation
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business Goals
 
Enabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data VirtualizationEnabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data Virtualization
 
DataOps - The Foundation for Your Agile Data Architecture
DataOps - The Foundation for Your Agile Data ArchitectureDataOps - The Foundation for Your Agile Data Architecture
DataOps - The Foundation for Your Agile Data Architecture
 
Data Architecture Best Practices for Today’s Rapidly Changing Data Landscape
Data Architecture Best Practices for Today’s Rapidly Changing Data LandscapeData Architecture Best Practices for Today’s Rapidly Changing Data Landscape
Data Architecture Best Practices for Today’s Rapidly Changing Data Landscape
 
Data Catalog as the Platform for Data Intelligence
Data Catalog as the Platform for Data IntelligenceData Catalog as the Platform for Data Intelligence
Data Catalog as the Platform for Data Intelligence
 
How to Take Advantage of an Enterprise Data Warehouse in the Cloud
How to Take Advantage of an Enterprise Data Warehouse in the CloudHow to Take Advantage of an Enterprise Data Warehouse in the Cloud
How to Take Advantage of an Enterprise Data Warehouse in the Cloud
 
Five Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data GovernanceFive Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data Governance
 
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
 
Data Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced AnalyticsData Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced Analytics
 
Data Governance
Data GovernanceData Governance
Data Governance
 
How to Build & Sustain a Data Governance Operating Model
How to Build & Sustain a Data Governance Operating Model How to Build & Sustain a Data Governance Operating Model
How to Build & Sustain a Data Governance Operating Model
 
Data Governance Best Practices
Data Governance Best PracticesData Governance Best Practices
Data Governance Best Practices
 
8 Steps to Creating a Data Strategy
8 Steps to Creating a Data Strategy8 Steps to Creating a Data Strategy
8 Steps to Creating a Data Strategy
 
Improving Data Literacy Around Data Architecture
Improving Data Literacy Around Data ArchitectureImproving Data Literacy Around Data Architecture
Improving Data Literacy Around Data Architecture
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture Design
 
The Chief Data Officer Agenda: Metrics for Information and Data Management
The Chief Data Officer Agenda: Metrics for Information and Data ManagementThe Chief Data Officer Agenda: Metrics for Information and Data Management
The Chief Data Officer Agenda: Metrics for Information and Data Management
 
Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...
Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...
Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...
 
Data Modeling, Data Governance, & Data Quality
Data Modeling, Data Governance, & Data QualityData Modeling, Data Governance, & Data Quality
Data Modeling, Data Governance, & Data Quality
 

Similar a RWE Patient Analytics with Databricks

Transforming GE Healthcare with Data Platform Strategy
Transforming GE Healthcare with Data Platform StrategyTransforming GE Healthcare with Data Platform Strategy
Transforming GE Healthcare with Data Platform StrategyDatabricks
 
Maven and google pharma r&d (1)
Maven and google pharma r&d  (1)Maven and google pharma r&d  (1)
Maven and google pharma r&d (1)Matt Barnes
 
FDA News Webinar - Inspection Intelligence
FDA News Webinar - Inspection IntelligenceFDA News Webinar - Inspection Intelligence
FDA News Webinar - Inspection IntelligenceArmin Torres
 
FDA News Webinar - Inspection Intelligence
FDA News Webinar - Inspection IntelligenceFDA News Webinar - Inspection Intelligence
FDA News Webinar - Inspection IntelligenceArmin Torres
 
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...David Peyruc
 
Roadmap to next generation digital lab
Roadmap to next generation digital labRoadmap to next generation digital lab
Roadmap to next generation digital labStephan Gürtler
 
Enabling patient-centricity-pfizer
Enabling patient-centricity-pfizerEnabling patient-centricity-pfizer
Enabling patient-centricity-pfizerDavid Teszler
 
Enabling Patient Centricity for Pfizer through AWS Cloud (LFS301-S-i) - AWS r...
Enabling Patient Centricity for Pfizer through AWS Cloud (LFS301-S-i) - AWS r...Enabling Patient Centricity for Pfizer through AWS Cloud (LFS301-S-i) - AWS r...
Enabling Patient Centricity for Pfizer through AWS Cloud (LFS301-S-i) - AWS r...Amazon Web Services
 
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?Denodo
 
Bridging Health Care and Clinical Trial Data through Technology
Bridging Health Care and Clinical Trial Data through TechnologyBridging Health Care and Clinical Trial Data through Technology
Bridging Health Care and Clinical Trial Data through TechnologySaama
 
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...Big Data Week
 
Forrester2019
Forrester2019Forrester2019
Forrester2019Ming Yuan
 
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...Denodo
 
Big Data Analytics for Healthcare Decision Support- Operational and Clinical
Big Data Analytics for Healthcare Decision Support- Operational and ClinicalBig Data Analytics for Healthcare Decision Support- Operational and Clinical
Big Data Analytics for Healthcare Decision Support- Operational and ClinicalAdrish Sannyasi
 
Optimizing_Customer_Lifecycle_with_Big_Data_Analytics_4079WP
Optimizing_Customer_Lifecycle_with_Big_Data_Analytics_4079WPOptimizing_Customer_Lifecycle_with_Big_Data_Analytics_4079WP
Optimizing_Customer_Lifecycle_with_Big_Data_Analytics_4079WPRadium Communications
 
Regulatory Intelligence
Regulatory IntelligenceRegulatory Intelligence
Regulatory IntelligenceArmin Torres
 
Building an Intelligent Biobank to Power Research Decision-Making
Building an Intelligent Biobank to Power Research Decision-MakingBuilding an Intelligent Biobank to Power Research Decision-Making
Building an Intelligent Biobank to Power Research Decision-MakingDenodo
 
Enterprise Analytics: Serving Big Data Projects for Healthcare
Enterprise Analytics: Serving Big Data Projects for HealthcareEnterprise Analytics: Serving Big Data Projects for Healthcare
Enterprise Analytics: Serving Big Data Projects for HealthcareDATA360US
 
Computer Software Assurance (CSA): Understanding the FDA’s New Draft Guidance
Computer Software Assurance (CSA): Understanding the FDA’s New Draft GuidanceComputer Software Assurance (CSA): Understanding the FDA’s New Draft Guidance
Computer Software Assurance (CSA): Understanding the FDA’s New Draft GuidanceGreenlight Guru
 
Contexti / Oracle - Big Data : From Pilot to Production
Contexti / Oracle - Big Data : From Pilot to ProductionContexti / Oracle - Big Data : From Pilot to Production
Contexti / Oracle - Big Data : From Pilot to ProductionContexti
 

Similar a RWE Patient Analytics with Databricks (20)

Transforming GE Healthcare with Data Platform Strategy
Transforming GE Healthcare with Data Platform StrategyTransforming GE Healthcare with Data Platform Strategy
Transforming GE Healthcare with Data Platform Strategy
 
Maven and google pharma r&d (1)
Maven and google pharma r&d  (1)Maven and google pharma r&d  (1)
Maven and google pharma r&d (1)
 
FDA News Webinar - Inspection Intelligence
FDA News Webinar - Inspection IntelligenceFDA News Webinar - Inspection Intelligence
FDA News Webinar - Inspection Intelligence
 
FDA News Webinar - Inspection Intelligence
FDA News Webinar - Inspection IntelligenceFDA News Webinar - Inspection Intelligence
FDA News Webinar - Inspection Intelligence
 
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...
 
Roadmap to next generation digital lab
Roadmap to next generation digital labRoadmap to next generation digital lab
Roadmap to next generation digital lab
 
Enabling patient-centricity-pfizer
Enabling patient-centricity-pfizerEnabling patient-centricity-pfizer
Enabling patient-centricity-pfizer
 
Enabling Patient Centricity for Pfizer through AWS Cloud (LFS301-S-i) - AWS r...
Enabling Patient Centricity for Pfizer through AWS Cloud (LFS301-S-i) - AWS r...Enabling Patient Centricity for Pfizer through AWS Cloud (LFS301-S-i) - AWS r...
Enabling Patient Centricity for Pfizer through AWS Cloud (LFS301-S-i) - AWS r...
 
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
 
Bridging Health Care and Clinical Trial Data through Technology
Bridging Health Care and Clinical Trial Data through TechnologyBridging Health Care and Clinical Trial Data through Technology
Bridging Health Care and Clinical Trial Data through Technology
 
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...
 
Forrester2019
Forrester2019Forrester2019
Forrester2019
 
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
 
Big Data Analytics for Healthcare Decision Support- Operational and Clinical
Big Data Analytics for Healthcare Decision Support- Operational and ClinicalBig Data Analytics for Healthcare Decision Support- Operational and Clinical
Big Data Analytics for Healthcare Decision Support- Operational and Clinical
 
Optimizing_Customer_Lifecycle_with_Big_Data_Analytics_4079WP
Optimizing_Customer_Lifecycle_with_Big_Data_Analytics_4079WPOptimizing_Customer_Lifecycle_with_Big_Data_Analytics_4079WP
Optimizing_Customer_Lifecycle_with_Big_Data_Analytics_4079WP
 
Regulatory Intelligence
Regulatory IntelligenceRegulatory Intelligence
Regulatory Intelligence
 
Building an Intelligent Biobank to Power Research Decision-Making
Building an Intelligent Biobank to Power Research Decision-MakingBuilding an Intelligent Biobank to Power Research Decision-Making
Building an Intelligent Biobank to Power Research Decision-Making
 
Enterprise Analytics: Serving Big Data Projects for Healthcare
Enterprise Analytics: Serving Big Data Projects for HealthcareEnterprise Analytics: Serving Big Data Projects for Healthcare
Enterprise Analytics: Serving Big Data Projects for Healthcare
 
Computer Software Assurance (CSA): Understanding the FDA’s New Draft Guidance
Computer Software Assurance (CSA): Understanding the FDA’s New Draft GuidanceComputer Software Assurance (CSA): Understanding the FDA’s New Draft Guidance
Computer Software Assurance (CSA): Understanding the FDA’s New Draft Guidance
 
Contexti / Oracle - Big Data : From Pilot to Production
Contexti / Oracle - Big Data : From Pilot to ProductionContexti / Oracle - Big Data : From Pilot to Production
Contexti / Oracle - Big Data : From Pilot to Production
 

Más de Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of HadoopDatabricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDatabricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringDatabricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
RWE Patient Analytics with Databricks

  • 1. RWE & Patient Analytics Leveraging Databricks: A Use Case. Harini Gopalakrishnan & Martin Longpre, Sanofi
  • 2. Disclaimer • The views and opinions expressed in this presentation are those of the individual presenter and should not be attributed to any organization with which the presenter is employed or affiliated • All registered trademarks cited are property of their respective owners.
  • 3. Agenda Harini Gopalakrishnan – 20 minutes ▪ What is real world evidence and real world data ▪ Advanced analytics in RWE generation ▪ Security and privacy of our data ▪ Our journey – a conceptual view of the architecture and what we have achieved Martin Longpre – 20 minutes ▪ Databricks implementation – our customization ▪ Demo ▪ Look forward: where we want to partner for improvements Q&A – 20 minutes
  • 4. Defining the Problem- Real World Data and Evidence
  • 5. Context: How do we define RWE & RWD. Real World Data (RWD) is a term used to describe health care related data that are collected outside the context of randomized clinical trials (RCTs). Real world evidence (RWE) is defined as the insight or knowledge derived from the analysis of real world data, conducted to respond to a specific research question. RWE leverages analytics on RWD to discover, develop, deliver and provide new insights on healthcare interventions. Examples of real-world data sources. ~130 TB (EHR/Claims). ~2000 TB per month in versions, transformations.
  • 6. Analysis in RWE: Advanced analytics methodology. Innovation in evidence generation. Traditional analytics • Traditional RWE statistics, meta-analysis, data modelling, propensity-score matching Advanced analytics • Predictive modelling, unsupervised clustering, rule extraction, model bootstrapping, natural language processing, machine learning Machine learning: a computer program is said to learn from experience (partially captured within data) when its performance increases with experience. Supervised techniques examples • Logistic regression • Markov chain • Bayesian network • K-nearest neighbour Non-supervised techniques examples • K-means clustering • Hierarchical ascendant classification • Factorial analyses • Non-negative matrix factorization
  • 7. Uses of RWE – why is it valuable. Real world evidence influences all aspects of a pharma value chain. Regulatory decision making; reimbursement decisions; clinical guidelines. The driving reasons for leveraging them more recently include: • Ease of availability of compute resources for big data • Availability of curated and high quality data sources both internally and externally. Source: https://www.healthcatalyst.com/insights/real-world-data-chief-driver-drug-development
  • 8. Transforming RWD to Evidence: Use case in action. AI based indication searching approach that relies on real-world data, thus bringing a higher confidence and reducing biases. Data is always privacy preserved and de-identified. Sanofi: Novel Indications via AI. Finding new treatment indications for an approved therapy is of immense value to pharma for drug re-purposing efforts, R&D candidate prioritization, and overall productivity. Sanofi wanted to develop an AI based indication searching approach that relies on real-world data, thus bringing a higher confidence and reducing biases. Sanofi applied unsupervised machine learning to create a phenotypic cluster of patients in order to identify relevant indications that worked across clusters. The pipeline crunched nearly 17 million patients with 2,700 characteristics derived from electronic health records (EHRs). The initial results of the novel approach recovered 90% of known indications and identified many more deemed credible by development teams, producing a higher level of confidence in results and a reduction in cost and time to market, with fewer, faster and more targeted trials, while minimizing attrition and risk. https://www.gartner.com/en/newsroom/press-releases/2020-11-17-gartner-announces-winners-of-the-2020-gartner-healthcare-and-life-sciences-eye-on-innovation-award
  • 9. Winner of the Gartner Award 2020 for Innovation in Healthcare and Life Sciences https://www.gartner.com/en/newsroom/press-releases/2020-11-17-gartner-announces-winners-of-the-2020-gartner-healthcare-and-life-sciences-eye-on-innovation-award
  • 10. Trust of data and analysis being performed is a MUST. “Patients and consumers have a significant role to play in the collection of real-world data and generation of real-world evidence, but to be effective, patient and consumer engagement approaches would include considering them partners and capturing outcomes that are important to them” ▪ Patient consent is a must ▪ Privacy preserved linkage must be performed; encryption is a key aspect ▪ Establish a trusted patient relationship to explain the usage of data and consent (e.g., secondary use of primary data) ▪ Data should not be used beyond the intended purpose – governance around the usage is a must
  • 11. Our Architecture & implementation
  • 12. Key aspects of an RWE Ecosystem. Data Management: Secure data storage – triple encrypted with audited access control; Full data lineage – complete history of every data transformation; Data pipeline – designed for high performance handling of big data. Analytics: Self-service tools – filtering and querying tools for feasibility and descriptive information; Interactive tools – dashboards and applications for study execution; Low-level tools – R, Python and SQL for comparative analysis and advanced analytics. Access Control: Multi-tenant configuration – provide each organization with their own namespace; User provisioning – role-based access controlled by each organization; Inherited data permissions – transformed data retains access control. Auditing and Monitoring: Full auditing of user actions – log each action and generate reports; Comprehensive monitoring – performance, usage, and custom actions
  • 13. What does our system offer? Powerful compute resources to handle billions of rows of data; complete history of all data updates, with ability to bind to specific versions; complete data traceability – every transform and resulting data set is captured; robust data security and access control for all data and projects; ability to manage metadata, reference data and master data; built on a scalable data lake.
  • 14. The Conceptual architecture. Data is always privacy preserved and de-identified. We do not own the KEY for re-identification within this ecosystem. Disclaimer: For example purposes only. [Architecture diagram; labels: Clinical Bioinformatics; Internal Sources; External Sources; Self Service Analysis; Advanced Analytics; Data Augmentation; Visualization / Dashboards; Data lake (Sanofi AWS); Artificial Intelligence/ML; Standardized analytical workflows; Cohort Definitions and Data Modelling; Conventional Studies (NLP); Secured and Traceable Sanofi controlled environment; Data and Analysis Collaboration*; Societies and Consortia; Academic Institutions; Regulatory Agencies; Internal sources; Insights; External Collaboration; Other Internal Platforms; Data lake (Secured and Access controlled at the data level)] Source: https://aws.amazon.com/blogs/industries/sanofi-webinar-performing-end-to-end-real-world-evidence-generation-with-traceability-and-transparency-on-aws/
  • 15. When do we use Databricks? ▪ Exploratory use cases – projects where we need to run AI/ML workflows for use cases that require GPUs, custom libraries, NLP/sentiment analysis ▪ Cross-functional teams working on a specific project – both internal and external stakeholders ▪ Flexibility: ability for users to manage their own cluster profiles – size up and down based on policy ▪ Data ingestion pipelines migrating away from AWS Glue and Batch for cost and performance reasons – 30% improvement in costs & productivity ▪ Delta Lake under analysis: today data is directly managed in Parquet/S3 ▪ SQL analytics: under evaluation
  • 16. Databricks Customization (1/2). Passthrough for Security ▪ Usage of our Azure AD configuration ▪ One AD group per data type ▪ Deactivation of the DBFS file system for end users (DBFS does not align with our data restriction policies) ▪ All data access is predefined and available through /mnt Gitlab integration ▪ Integration of the Databricks Repos feature connected directly to our enterprise Gitlab services ▪ Usage of CI/CD pipelines for deploying scripts and tasks Cluster Policies ▪ Cluster names suffixed with the policy names for audit and monitoring ▪ Limit the type of worker and driver for better budget management ▪ Enforce the termination of clusters with default values based on projects/use cases (managed by cluster policies)
  • 17. Databricks Customization (2/2): Instance Profiling; IAM roles and policies ▪ Only used for specific use cases, mostly for RStudio ▪ Fully integrated with our AWS stack ▪ IAM roles set up for S3 bucket access ▪ One home folder per user created by default (internal process)
  • 19. Improvements ▪ Support for RStudio ▪ Data access control and policy propagation to restrict unauthorized use of data – no lineage on data
  • 20. Summary – Our Journey and benefits ▪ Started from a traditional warehouse 3 years ago to create an end-to-end ecosystem for evidence generation and insights ▪ Helped move away from conventional to more advanced analytical approaches leveraging the power of big data and cloud ▪ Delivered several evidence generating studies, i.e., studies at scale that have impacted all aspects of the pharma value chain with demonstrable ROI https://www.dovepress.com/cr_data/article_fulltext/s160000/160029/img/jmdh-160029_F003.jpg
  • 21. Feedback Your feedback is important to us. Don’t forget to rate and review the sessions.
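
The cluster policy ideas on slide 16 (cluster names suffixed with the policy name, a restricted set of worker and driver types, enforced auto-termination) could be expressed roughly as below. This is only a sketch, not the presenters' actual configuration: the policy name, node types and timeout are hypothetical, and the dictionary layout follows the attribute/type style (`regex`, `allowlist`, `fixed`) used in Databricks cluster policy definitions as an assumption. The `is_compliant` helper is a local illustration and is not a Databricks API.

```python
import json
import re


def build_cluster_policy(policy_name, allowed_node_types, autotermination_minutes):
    """Build a cluster-policy-style definition as a plain dict (illustrative only)."""
    return {
        # Cluster names must end with "-<policy name>" so audits can group by policy.
        "cluster_name": {"type": "regex", "pattern": ".*-" + re.escape(policy_name)},
        # Restrict worker and driver instance types for budget control.
        "node_type_id": {"type": "allowlist", "values": list(allowed_node_types)},
        "driver_node_type_id": {"type": "allowlist", "values": list(allowed_node_types)},
        # Force idle clusters to terminate after a fixed number of minutes.
        "autotermination_minutes": {
            "type": "fixed",
            "value": autotermination_minutes,
            "hidden": True,
        },
    }


def is_compliant(policy, cluster_spec):
    """Check a requested cluster spec against the policy dict (local check only)."""
    for attribute, rule in policy.items():
        value = cluster_spec.get(attribute)
        if rule["type"] == "regex":
            # The whole value must match the pattern.
            if value is None or re.fullmatch(rule["pattern"], str(value)) is None:
                return False
        elif rule["type"] == "allowlist":
            if value not in rule["values"]:
                return False
        elif rule["type"] == "fixed":
            if value != rule["value"]:
                return False
    return True


# Hypothetical policy for an NLP project: two allowed node types, 60 minute timeout.
policy = build_cluster_policy("rwe_nlp", ["m5.xlarge", "g4dn.xlarge"], 60)
print(json.dumps(policy, indent=2))
```

A request named `team1-rwe_nlp` on `m5.xlarge` nodes with a 60 minute timeout would pass `is_compliant`, while a request for an unlisted node type or a name without the policy suffix would be rejected.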