Analyze Genomes: A Federated In-memory Database Computing Platform enabling real-time Analysis of Big Medical Data
1. Analyze Genomes: A Federated In-Memory Database Computing
Platform Enabling Real-time Analysis of Big Medical Data
Dr. Matthieu-P. Schapranow
SAPPHIRE, Orlando, USA
May 17, 2016
2. Where to find additional information?
■ Online: Visit we.analyzegenomes.com for latest research
results, slides, videos, tools, and publications
■ Offline: High-Performance In-Memory Genome Data Analysis:
In-Memory Data Management Research, Springer,
ISBN: 978-3-319-03034-0, 2014
■ In Person: Join us for Intel Tech Talks at SAPPHIRE booth 625 daily!
□ May 17 12.30pm: A Federated In-Memory Database Computing Platform Enabling
Real-time Analysis of Big Medical Data
□ May 18 12.30pm: In-Memory Apps for Next Generation Life Sciences Research
□ May 19 11.30am: In-Memory Apps Supporting Precision Medicine
Schapranow, SAPPHIRE,
May 17, 2016
A Federated In-
Memory Database
Computing Platform
for Big Medical Data
2
3. Intelligent Healthcare Networks in the 21st Century?
[Diagram: clinician, patient, researcher, pharmaceutical company, healthcare providers, hospital, research center, laboratory, and patient advocacy group, linked by direct and indirect interaction]
5. Intelligent Healthcare Networks in the 21st Century!
[Diagram: clinician, patient, researcher, pharmaceutical company, healthcare providers, hospital, research center, laboratory, and patient advocacy group, linked by direct and indirect interaction]
6. The Setting: Actors in Oncology
■ Patients
□ Individual anamnesis, family history, and background
□ Require fast access to individualized therapy
■ Clinicians
□ Identify root and extent of disease using laboratory tests
□ Evaluate therapy alternatives, adapt existing therapy
■ Researchers
□ Conduct laboratory work, e.g. analyze patient samples
□ Create new research findings and come up with treatment alternatives
7. IT Challenges: Distributed Heterogeneous Data Sources
■ Human genome/biological data
□ 600GB per full genome
□ 15PB+ in databases of leading institutes
■ Prescription data
□ 1.5B records from 10,000 doctors and 10M patients (100GB)
■ Clinical trials
□ Currently more than 30k recruiting on ClinicalTrials.gov
■ Human proteome
□ 160M data points (2.4GB) per sample
□ >3TB raw proteome data in ProteomicsDB
■ PubMed database
□ >23M articles
■ Hospital information systems
□ Often more than 50GB
■ Medical sensor data
□ Scan of a single organ in 1s creates 10GB of raw data
■ Cancer patient records
□ >160k records at NCT
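To put the volumes on this slide into perspective, the following back-of-the-envelope sketch derives a few per-item figures from the numbers quoted above. It assumes decimal units (1 GB = 10^9 bytes, 1 PB = 10^15 bytes), which the slide does not state; the function and variable names are illustrative only.

```python
# Rough per-item figures derived from the numbers on the slide.
# Assumption (not stated on the slide): decimal units are meant.
GB = 10**9
PB = 10**15


def bytes_per_item(total_bytes: int, items: int) -> float:
    """Average storage footprint of one item in a data set."""
    return total_bytes / items


# Prescription data: 1.5B records in 100 GB.
prescription_bytes_per_record = bytes_per_item(100 * GB, 1_500_000_000)

# Human proteome: 160M data points in 2.4 GB per sample.
proteome_bytes_per_point = bytes_per_item(24 * GB // 10, 160_000_000)

# Genome data: how many 600 GB full genomes fit into 15 PB?
genomes_in_15_pb = (15 * PB) // (600 * GB)

print(f"prescription record: ~{prescription_bytes_per_record:.1f} bytes")
print(f"proteome data point: {proteome_bytes_per_point:.0f} bytes")
print(f"full genomes in 15 PB: {genomes_in_15_pb}")
```

Under these assumptions a prescription record averages roughly 67 bytes, a proteome data point 15 bytes, and 15 PB corresponds to about 25,000 full genomes.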
8. Our Approach: Analyze Genomes: Real-time Analysis of Big Medical Data
■ Applications
□ Drug Response Analysis
□ Pathway Topology Analysis
□ Medical Knowledge Cockpit
□ Oncolyzer
□ Clinical Trial Recruitment
□ Cohort Analysis
□ ...
■ Combined and Linked Data (Indexed Sources)
□ Genome Data
□ Cellular Pathways
□ Genome Metadata
□ Research Publications
□ Pipeline and Analysis Models
□ Drugs and Interactions
■ In-Memory Database
□ Extensions for Life Sciences
□ Data Exchange, App Store
□ Access Control, Data Protection
□ Fair Use
□ Statistical Tools
□ Real-time Analysis
□ App-spanning User Profiles
9. Our Technology: In-Memory Database Technology
■ Combined column and row store
■ Map/Reduce
■ Single and multi-tenancy
■ Lightweight compression
■ Insert only for time travel
■ Real-time replication
■ Working on integers
■ SQL interface on columns and rows
■ Active/passive data store
■ Minimal projections
■ Group key
■ Reduction of software layers
■ Dynamic multi-threading
■ Bulk load of data
■ Object-relational mapping
■ Text retrieval and extraction engine
■ No aggregate tables
■ Data partitioning
■ Any attribute as index
■ No disk
■ On-the-fly extensibility
■ Analytics on historical data
■ Multi-core/parallelization
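The slide only names these building blocks. As one illustration of how "lightweight compression" and "working on integers" can fit together in a column store, here is a minimal sketch of dictionary encoding. It is a generic textbook technique, not the platform's actual implementation; the function names and the sample gene symbols are hypothetical.

```python
import bisect
from typing import List, Tuple


def dictionary_encode(column: List[str]) -> Tuple[List[str], List[int]]:
    """Replace each value by its position in a sorted dictionary of distinct values."""
    dictionary = sorted(set(column))
    code_of = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [code_of[value] for value in column]


def count_equal(dictionary: List[str], codes: List[int], value: str) -> int:
    """Count matches by comparing integer codes instead of strings."""
    pos = bisect.bisect_left(dictionary, value)
    if pos == len(dictionary) or dictionary[pos] != value:
        return 0  # value does not occur in the column at all
    return sum(1 for code in codes if code == pos)


# Hypothetical sample column; the values are illustrative only.
genes = ["BRAF", "KRAS", "BRAF", "TP53", "KRAS", "BRAF"]
dictionary, codes = dictionary_encode(genes)

print(dictionary)                               # distinct values, sorted
print(codes)                                    # the column as small integers
print(count_equal(dictionary, codes, "BRAF"))   # scan runs on integers only
```

The column is stored once as a short dictionary plus a list of small integers, so repeated values cost little space and scans compare integers rather than strings.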
10. Where Are All Those Clouds Going?
[Figure: Gartner's 2014 Hype Cycle for Emerging Technologies]
11. Software Requirements in Life Sciences
■ Requirements
□ Real-time data analysis
□ Maintained software
■ Restrictions
□ Data privacy
□ Data locality
□ Volume of “big medical data”
■ Solution?
□ Federated In-Memory Database System vs. Cloud Computing
12. Approach I: Multiple Cloud Service Providers
[Diagram: Site A and Site B each operate a local system with local storage and a local synchronization service; the cloud provider of each site operates a cloud synchronization service with shared cloud storage]
13. Approach II: A Single Service Provider
[Diagram: Site A and Site B both access one cloud system at a single cloud provider, consisting of a cloud synchronization service and shared cloud storage]
14. Multiple Sites Forming the Federated In-Memory Database System (FIMDB)
[Diagram: the Federated In-Memory Database System spans a cloud IMDB instance at the cloud provider, holding master data and shared algorithms, and local IMDB instances at Site A and Site B, each holding sensitive data, e.g. patient data]
15. FIMDB: Cloud Service Provider
[Diagram: federated in-memory database instances FIMDB A.1 to A.5, B.1 to B.3, and C.1 across Site A, Site B, and the cloud service provider. The federated in-memory database instance, algorithms, and applications are managed by the service provider; master data is managed by the service provider; sensitive data resides at the site]
■ Change of cloud computing paradigm:
Transfer (small) algorithms to (big) data
■ In-Memory Database (IMDB)
□ Landscape of IMDB nodes
□ Stored IMDB procedures and algorithms
□ Master data for applications
■ In-Memory File System (IMDBfs)
□ Integration of file-based tools
□ Managed services directory
□ OS binaries compiled and statically linked for
individual platforms
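The idea of transferring (small) algorithms to (big) data can be sketched as follows. This is a simplified illustration, not the platform's API: the `Site` class, the `federated_run` helper, the `has_variant` field, and the sample records are all hypothetical. Each site keeps its records to itself and returns only the aggregate that the shipped function computes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Site:
    """A site whose sensitive records never leave the object (hypothetical model)."""
    name: str
    _records: List[dict] = field(default_factory=list, repr=False)

    def run(self, algorithm: Callable[[List[dict]], dict]) -> dict:
        # The algorithm travels to the data; only its result is returned.
        return algorithm(self._records)


def count_variant_carriers(records: List[dict]) -> dict:
    """A small algorithm that produces an aggregate, not patient-level rows."""
    carriers = sum(1 for record in records if record["has_variant"])
    return {"patients": len(records), "carriers": carriers}


def federated_run(sites: List[Site], algorithm: Callable[[List[dict]], dict]) -> Dict[str, dict]:
    """Ship the same algorithm to every site and collect the per-site results."""
    return {site.name: site.run(algorithm) for site in sites}


# Illustrative data only.
site_a = Site("Site A", [{"has_variant": True}, {"has_variant": False}, {"has_variant": True}])
site_b = Site("Site B", [{"has_variant": True}, {"has_variant": False}])

results = federated_run([site_a, site_b], count_variant_carriers)
total_carriers = sum(result["carriers"] for result in results.values())
print(results)
print(total_carriers)
```

What crosses site boundaries is a function of a few lines and a handful of counts, whereas the patient-level records stay where they were created.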
16. FIMDB: Setup of a New Client
1. Establish site-to-site VPN connection between site and cloud service provider
2. Mount remote services directory
3. Install and configure local IMDB instance from services directory
4. Subscribe to and configure selected managed services
17. FIMDB: Incorporating Local Compute Resources
■ Data partitioning protects sensitive data by storing it on local hardware resources only
■ Supports parallel query execution, i.e. reduced processing time
■ Efficient use of existing hardware resources
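Parallel query execution over partitioned data can be sketched like this: every partition computes a partial aggregate at the same time, and only the partial results are merged. The sketch uses threads in one process as a stand-in for separate sites, which is an assumption made for illustration; the partition contents and function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

# Hypothetical partitions; in a federated setup each would live on local hardware.
PARTITIONS = {
    "site_a": [2.0, 4.0],
    "site_b": [6.0, 8.0],
}


def partial_aggregate(site: str) -> Tuple[float, int]:
    """Compute (sum, count) on one partition without exposing its values."""
    values = PARTITIONS[site]
    return sum(values), len(values)


def federated_mean(sites: List[str]) -> float:
    """Run the partial aggregation on all partitions in parallel and merge."""
    with ThreadPoolExecutor(max_workers=len(sites)) as pool:
        partials = list(pool.map(partial_aggregate, sites))
    total = sum(partial_sum for partial_sum, _ in partials)
    count = sum(partial_count for _, partial_count in partials)
    return total / count


mean_value = federated_mean(list(PARTITIONS))
print(mean_value)
```

Merging sums and counts, rather than averaging per-site means, keeps the result correct when partitions differ in size.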
18. What to Take Home? Test it Yourself: AnalyzeGenomes.com
■ Brings algorithms to data
■ Forms a single database across individual sites and locations
■ Master data managed by service provider whilst sensitive data resides locally
Pros                          | Cons
Single database license       | Complex operation
Easy to consume services      | Time-consuming infrastructure setup
Query propagation by IMDB     |
Only a single source of truth |
20. Keep in contact with us!
Dr. Matthieu-P. Schapranow
Program Manager E-Health & Life Sciences
Hasso Plattner Institute
August-Bebel-Str. 88
14482 Potsdam, Germany
schapranow@hpi.de
http://we.analyzegenomes.com/