SlideShare una empresa de Scribd logo
1 de 40
Descargar para leer sin conexión
Gestion des données scientifiques en imagerie in vivo
https://piv2017.sciencesconf.org/
Christophe CERIN, Université Paris13
christophe.cerin@lipn.univ-paris13.fr
Plateformes de resources numériques
Expérience USPC
Paris, December 7, 2017
Outline
2
• Motivations related to large scale platforms
• Background:
• Criteria for observing platforms
• High Performance Computing (HPC)
• Apache suite for Big Data
• Some projects
• Contribution: USPC platform
• Conclusion and perspective
Motivations related to large
scale platforms
In experimental Sciences
we need tips for using
them appropriately / best
practices
3
4
Definition of a large scale system
• System with unprecedented amounts of hardware (>1M cores,
>100 PB storage, >100Gbits/s interconnect);
• Ultra large scale system: hardware + lines of source code +
numbers of users + volumes of data;
• New problems:
• built across multiple organizations;
• often with conflicting purposes and needs;
• constructed from heterogeneous parts with complex
dependencies and emergent properties;
• continuously evolving;
• software, hardware and human failures will be the norm,
not the exception.
Since in our lives (Cloud Computing)
5
Software as a Service
SaaS
Plateform as a Service
PaaS

Infrastructure as a 

Service

IaaS
• Don’t believe them;
• Variety in the eco-system is the norm.
6
Some speeches are ready for you
In Cloud, innovation drives technology/science
Containers and no more Virtual Machines
7
Hardware
Operating System
App App App
Conventional Stack
Hardware
OS
App App App
Hypervisor
OS OS
Virtualized Stack
▪Abstraction on resources in the past: VM technologies to hide
physical properties, interactions between system layer,
application layer and the end-users
VM versus Container
8
9
In Sciences (not in business)
• Experimental approach: modeling,…,
experiments,…
• What is an experimental plan for large
scale systems and reproducible research?
• Answers depend on the system used:
• Cluster (dedicated; managed)
• Cloud (almost non supervised… or by
you)
Background
10
▪Cluster ➔Performance ➔Flops - Top500
▪Cluster ➔Power ➔Flops per watt - Green500
▪Cloud ➔Price ➔for computing, for storing…
▪Cloud: the economic model may be strange (Amazon
Spot Instances)
▪Scientific issues: how to compare cluster and cloud?
(be fair)
11
Performance criteria for observing
large scale system
12
Performance (Top500)
13
Performance (Green500)
14
Productivity
▪Productivity traditionally refers to the ratio between the quantity
of software produced and the cost spent for it.
▪Factors: Programming language used, Program size, The experience
of programmers and design personnel, The novelty of requirements,
The complexity of the program and its data, The use of structured
programming methods, Program class or the distribution method,
Program type of the application area, Tools and environmental
conditions, Enhancing existing programs or systems, Maintaining
existing programs or systems, Reusing existing modules and standard
designs, Program generators, Fourth-generation languages,
Geographic separation of development locations, Defect potentials
and removal methods, (Existing) Documentation, Prototyping before
main development begins, Project teams and organization
structures, Morale and compensation of staff
▪Parallel (http://press3.mcs.anl.gov/atpesc/files/2015/03/ESCTP-snir_215.pdf):
▪Shared memory (OpenMP):
▪CPU+vector
▪CPU+GPU
▪CPU+accelerators (Google Tensor processing unit…)
▪Message passing (MPI)
▪Sequential with the call to « parallel data structure » (Scala)
▪Map-reduce: Hadoop, Spark (In addition to Map and Reduce operations, it
supports SQL queries, streaming data, machine learning and graph data
processing.)
15
Programming model
▪Even more complicated: Cray has added "ARM Option" (i.e. CPU blade option,
using the ThunderX2) to their XC50 supercomputers
▪We need consolidations:
▪in hardware
▪in software
▪in algorithms
▪in applications
▪in experimental methods
▪hpc facilities
▪training
16
Hardware landscape
17
Apache suite for Big Data
1- Bases de Données
1.1 Apache Hbase
1.2 Apache Cassandra
1.3 CouchDB
1.4 MongoDB
2 Accès aux données/ requêtage
2.1 Pig
2.2 Hive
2.3 Livy
3 Data Intelligence
3.1 Apache Drill
3.2 Apache Mahout
3.3 H2O
4 Data Serialisation
4.1 Apache Thrift
4.2 Apache Avro
5 Data integration
5.1 Apache Sqoop
5.2 Apache Flume
5.3 Apache Chuckwa
6 Requetage
6.1 Presto
6.2 Impala
6.3 Dremel
7 Sécurité des données
7.1 Apache Metron
7.2 Sqrrl
8.2 MapReduce
8.3 Spark
9.1 Elasticsearch
10.1 Apache Hadoop
10.6 Apache Mesos
10.10 Apache Flink
10.12 Apache Zookeeper
10.15 Apache Storm
10.20 Apache Kafka
Example of a project mixing
HPC and Big Data considerations
Where theory comes before data:
physics
chemistry
Materials
...
Where data comes before theory:
Medicine
business insights
autonomy (smart car, building, city)
18
Harp: Hadoop and Collective
Communication for Iterative Computation
19
• Project http://salsaproj.indiana.edu/harp/
• Motivation:
• Communication patterns are not abstracted and
defined in Hadoop, Pregel/Giraph, Spark
• In contrary, MPI has very fine grain based
(collective) communication primitives (based on
arrays and buffers)
• Harp provides data abstractions and communication
abstractions on top of them. It can be plugged into
Hadoop runtime to enable efficient in-memory
communication to avoid HDFS read/write.
20
Harp
• The data
abstraction
has 3
categories
horizontally
and 3 levels
vertically
• Harp works as a plugin in Hadoop. The goal is to
make Hadoop cluster can schedule original
MapReduce jobs and Map-Collective jobs at the
same time.
• Collective communication is defined as
movement of partitions within tables.
• Collective communication requires
synchronization. Hadoop scheduler is modified
to schedule tasks in BSP style.
• Fault tolerance with checkpointing.
21
Harp
• For the data-scientist of nowadays;
• Hub to facilite access to Amazon cloud
resources, to share data, to collaborate
between many people through notebooks;
• RosettaHUB: https://www.rosettahub.com
• Available through Amazon Educate program
(https://s3.amazonaws.com/awseducate-
list/AWS_Educate_Institutions.pdf)
22
RosettaHub
23
24
25
26
SPC and the deployment of
virtual machines
IDV experience
(Imageries du vivant)
27
28
http://cirrus.uspc.fr
29
30
31
Technologie OpenNebula
32
33
34
35
36
• Plateforme d'imagerie ePad
http://pf-01.lab.parisdescartes.fr:1243/dcm4chee-web3/
• Plateforme PLM dans le cadre du projet DRIVE-SPC
http://pf-01.lab.parisdescartes.fr:2345/tc/webclient
• Plateforme Sis4web de SisNCom
http://pf-01.lab.parisdescartes.fr:1323/sis4web/login.php
• Benchsys
http://pf-01.lab.parisdescartes.fr:9001/
• Owncloud
http://pf-01.lab.parisdescartes.fr:1253/owncloud/index.php/login
• …
37
Examples IDV services
Conclusion and Future work
Many, many opportunities
38
• Computer hardware will continue to evolve; Trend:
accelerators instead of GPUs;
• Platforms will continue to evolve to facilitate the
management of data, the computation on data;
• Need for convergence/resonance between HPC and Big-
data eco-systems;
• SPC: needs for training, using, transforming the scientific
approaches… with research engineers for accompany the
change;
• SPC: needs for projects… and to share betweenez
communities.
39
Conclusion and perspective
Thanks to Leila Abidi, Olivier
Waldek, Roland Chervet
40

Más contenido relacionado

La actualidad más candente

51 Use Cases and implications for HPC & Apache Big Data Stack
51 Use Cases and implications for HPC & Apache Big Data Stack51 Use Cases and implications for HPC & Apache Big Data Stack
51 Use Cases and implications for HPC & Apache Big Data StackGeoffrey Fox
 
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...Geoffrey Fox
 
HPC-ABDS High Performance Computing Enhanced Apache Big Data Stack (with a ...
HPC-ABDS High Performance Computing Enhanced Apache Big Data Stack (with a ...HPC-ABDS High Performance Computing Enhanced Apache Big Data Stack (with a ...
HPC-ABDS High Performance Computing Enhanced Apache Big Data Stack (with a ...Geoffrey Fox
 
Hadoop - Architectural road map for Hadoop Ecosystem
Hadoop -  Architectural road map for Hadoop EcosystemHadoop -  Architectural road map for Hadoop Ecosystem
Hadoop - Architectural road map for Hadoop Ecosystemnallagangus
 
Introducing the hadoop ecosystem
Introducing the hadoop ecosystemIntroducing the hadoop ecosystem
Introducing the hadoop ecosystemGeert Van Landeghem
 
High Performance Data Analytics with Java on Large Multicore HPC Clusters
High Performance Data Analytics with Java on Large Multicore HPC ClustersHigh Performance Data Analytics with Java on Large Multicore HPC Clusters
High Performance Data Analytics with Java on Large Multicore HPC ClustersSaliya Ekanayake
 
High Performance Processing of Streaming Data
High Performance Processing of Streaming DataHigh Performance Processing of Streaming Data
High Performance Processing of Streaming DataGeoffrey Fox
 
Next Generation Grid: Integrating Parallel and Distributed Computing Runtimes...
Next Generation Grid: Integrating Parallel and Distributed Computing Runtimes...Next Generation Grid: Integrating Parallel and Distributed Computing Runtimes...
Next Generation Grid: Integrating Parallel and Distributed Computing Runtimes...Geoffrey Fox
 
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...inside-BigData.com
 
Apache hadoop introduction and architecture
Apache hadoop  introduction and architectureApache hadoop  introduction and architecture
Apache hadoop introduction and architectureHarikrishnan K
 
Big data | Hadoop | components of hadoop |Rahul Gulab Sing
Big data | Hadoop | components of hadoop |Rahul Gulab SingBig data | Hadoop | components of hadoop |Rahul Gulab Sing
Big data | Hadoop | components of hadoop |Rahul Gulab SingRahul Singh
 
Classifying Simulation and Data Intensive Applications and the HPC-Big Data C...
Classifying Simulation and Data Intensive Applications and the HPC-Big Data C...Classifying Simulation and Data Intensive Applications and the HPC-Big Data C...
Classifying Simulation and Data Intensive Applications and the HPC-Big Data C...Geoffrey Fox
 
Visualizing and Clustering Life Science Applications in Parallel 
Visualizing and Clustering Life Science Applications in Parallel Visualizing and Clustering Life Science Applications in Parallel 
Visualizing and Clustering Life Science Applications in Parallel Geoffrey Fox
 
HADOOP TECHNOLOGY ppt
HADOOP  TECHNOLOGY pptHADOOP  TECHNOLOGY ppt
HADOOP TECHNOLOGY pptsravya raju
 
Hadoop MapReduce Framework
Hadoop MapReduce FrameworkHadoop MapReduce Framework
Hadoop MapReduce FrameworkEdureka!
 

La actualidad más candente (20)

51 Use Cases and implications for HPC & Apache Big Data Stack
51 Use Cases and implications for HPC & Apache Big Data Stack51 Use Cases and implications for HPC & Apache Big Data Stack
51 Use Cases and implications for HPC & Apache Big Data Stack
 
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
 
HPC-ABDS High Performance Computing Enhanced Apache Big Data Stack (with a ...
HPC-ABDS High Performance Computing Enhanced Apache Big Data Stack (with a ...HPC-ABDS High Performance Computing Enhanced Apache Big Data Stack (with a ...
HPC-ABDS High Performance Computing Enhanced Apache Big Data Stack (with a ...
 
Hadoop - Architectural road map for Hadoop Ecosystem
Hadoop -  Architectural road map for Hadoop EcosystemHadoop -  Architectural road map for Hadoop Ecosystem
Hadoop - Architectural road map for Hadoop Ecosystem
 
Introducing the hadoop ecosystem
Introducing the hadoop ecosystemIntroducing the hadoop ecosystem
Introducing the hadoop ecosystem
 
High Performance Data Analytics with Java on Large Multicore HPC Clusters
High Performance Data Analytics with Java on Large Multicore HPC ClustersHigh Performance Data Analytics with Java on Large Multicore HPC Clusters
High Performance Data Analytics with Java on Large Multicore HPC Clusters
 
High Performance Processing of Streaming Data
High Performance Processing of Streaming DataHigh Performance Processing of Streaming Data
High Performance Processing of Streaming Data
 
Next Generation Grid: Integrating Parallel and Distributed Computing Runtimes...
Next Generation Grid: Integrating Parallel and Distributed Computing Runtimes...Next Generation Grid: Integrating Parallel and Distributed Computing Runtimes...
Next Generation Grid: Integrating Parallel and Distributed Computing Runtimes...
 
Hadoop
Hadoop Hadoop
Hadoop
 
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
 
Apache hadoop introduction and architecture
Apache hadoop  introduction and architectureApache hadoop  introduction and architecture
Apache hadoop introduction and architecture
 
Big data Analytics Hadoop
Big data Analytics HadoopBig data Analytics Hadoop
Big data Analytics Hadoop
 
Big data | Hadoop | components of hadoop |Rahul Gulab Sing
Big data | Hadoop | components of hadoop |Rahul Gulab SingBig data | Hadoop | components of hadoop |Rahul Gulab Sing
Big data | Hadoop | components of hadoop |Rahul Gulab Sing
 
Classifying Simulation and Data Intensive Applications and the HPC-Big Data C...
Classifying Simulation and Data Intensive Applications and the HPC-Big Data C...Classifying Simulation and Data Intensive Applications and the HPC-Big Data C...
Classifying Simulation and Data Intensive Applications and the HPC-Big Data C...
 
Visualizing and Clustering Life Science Applications in Parallel 
Visualizing and Clustering Life Science Applications in Parallel Visualizing and Clustering Life Science Applications in Parallel 
Visualizing and Clustering Life Science Applications in Parallel 
 
Big data and hadoop
Big data and hadoopBig data and hadoop
Big data and hadoop
 
Big dataanalyticsbeyondhadoop public_20_june_2013
Big dataanalyticsbeyondhadoop public_20_june_2013Big dataanalyticsbeyondhadoop public_20_june_2013
Big dataanalyticsbeyondhadoop public_20_june_2013
 
Hadoop
HadoopHadoop
Hadoop
 
HADOOP TECHNOLOGY ppt
HADOOP  TECHNOLOGY pptHADOOP  TECHNOLOGY ppt
HADOOP TECHNOLOGY ppt
 
Hadoop MapReduce Framework
Hadoop MapReduce FrameworkHadoop MapReduce Framework
Hadoop MapReduce Framework
 

Similar a C cerin piv2017_c

Matching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software ArchitecturesMatching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software ArchitecturesGeoffrey Fox
 
High Performance Computing and Big Data
High Performance Computing and Big Data High Performance Computing and Big Data
High Performance Computing and Big Data Geoffrey Fox
 
Anusua Trivedi, Data Scientist at Texas Advanced Computing Center (TACC), UT ...
Anusua Trivedi, Data Scientist at Texas Advanced Computing Center (TACC), UT ...Anusua Trivedi, Data Scientist at Texas Advanced Computing Center (TACC), UT ...
Anusua Trivedi, Data Scientist at Texas Advanced Computing Center (TACC), UT ...MLconf
 
Hadoop Tutorial.ppt
Hadoop Tutorial.pptHadoop Tutorial.ppt
Hadoop Tutorial.pptSathish24111
 
Designing Convergent HPC and Big Data Software Stacks: An Overview of the HiB...
Designing Convergent HPC and Big Data Software Stacks: An Overview of the HiB...Designing Convergent HPC and Big Data Software Stacks: An Overview of the HiB...
Designing Convergent HPC and Big Data Software Stacks: An Overview of the HiB...inside-BigData.com
 
Big data: Descoberta de conhecimento em ambientes de big data e computação na...
Big data: Descoberta de conhecimento em ambientes de big data e computação na...Big data: Descoberta de conhecimento em ambientes de big data e computação na...
Big data: Descoberta de conhecimento em ambientes de big data e computação na...Rio Info
 
myHadoop - Hadoop-on-Demand on Traditional HPC Resources
myHadoop - Hadoop-on-Demand on Traditional HPC ResourcesmyHadoop - Hadoop-on-Demand on Traditional HPC Resources
myHadoop - Hadoop-on-Demand on Traditional HPC ResourcesSriram Krishnan
 
Cloud Services for Big Data Analytics
Cloud Services for Big Data AnalyticsCloud Services for Big Data Analytics
Cloud Services for Big Data AnalyticsGeoffrey Fox
 
Introduction to Apache Hadoop
Introduction to Apache HadoopIntroduction to Apache Hadoop
Introduction to Apache HadoopChristopher Pezza
 
Designing Software Libraries and Middleware for Exascale Systems: Opportuniti...
Designing Software Libraries and Middleware for Exascale Systems: Opportuniti...Designing Software Libraries and Middleware for Exascale Systems: Opportuniti...
Designing Software Libraries and Middleware for Exascale Systems: Opportuniti...inside-BigData.com
 
Capacity Management and BigData/Hadoop - Hitchhiker's guide for the Capacity ...
Capacity Management and BigData/Hadoop - Hitchhiker's guide for the Capacity ...Capacity Management and BigData/Hadoop - Hitchhiker's guide for the Capacity ...
Capacity Management and BigData/Hadoop - Hitchhiker's guide for the Capacity ...Renato Bonomini
 
IJSRED-V2I3P43
IJSRED-V2I3P43IJSRED-V2I3P43
IJSRED-V2I3P43IJSRED
 
Rack Cluster Deployment for SDSC Supercomputer
Rack Cluster Deployment for SDSC SupercomputerRack Cluster Deployment for SDSC Supercomputer
Rack Cluster Deployment for SDSC SupercomputerRebekah Rodriguez
 
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...Perficient, Inc.
 
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習 Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習 Herman Wu
 
Bitkom Cray presentation - on HPC affecting big data analytics in FS
Bitkom Cray presentation - on HPC affecting big data analytics in FSBitkom Cray presentation - on HPC affecting big data analytics in FS
Bitkom Cray presentation - on HPC affecting big data analytics in FSPhilip Filleul
 

Similar a C cerin piv2017_c (20)

Matching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software ArchitecturesMatching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software Architectures
 
High Performance Computing and Big Data
High Performance Computing and Big Data High Performance Computing and Big Data
High Performance Computing and Big Data
 
Anusua Trivedi, Data Scientist at Texas Advanced Computing Center (TACC), UT ...
Anusua Trivedi, Data Scientist at Texas Advanced Computing Center (TACC), UT ...Anusua Trivedi, Data Scientist at Texas Advanced Computing Center (TACC), UT ...
Anusua Trivedi, Data Scientist at Texas Advanced Computing Center (TACC), UT ...
 
Hadoop Tutorial.ppt
Hadoop Tutorial.pptHadoop Tutorial.ppt
Hadoop Tutorial.ppt
 
Hadoop tutorial
Hadoop tutorialHadoop tutorial
Hadoop tutorial
 
Designing Convergent HPC and Big Data Software Stacks: An Overview of the HiB...
Designing Convergent HPC and Big Data Software Stacks: An Overview of the HiB...Designing Convergent HPC and Big Data Software Stacks: An Overview of the HiB...
Designing Convergent HPC and Big Data Software Stacks: An Overview of the HiB...
 
Big data: Descoberta de conhecimento em ambientes de big data e computação na...
Big data: Descoberta de conhecimento em ambientes de big data e computação na...Big data: Descoberta de conhecimento em ambientes de big data e computação na...
Big data: Descoberta de conhecimento em ambientes de big data e computação na...
 
myHadoop - Hadoop-on-Demand on Traditional HPC Resources
myHadoop - Hadoop-on-Demand on Traditional HPC ResourcesmyHadoop - Hadoop-on-Demand on Traditional HPC Resources
myHadoop - Hadoop-on-Demand on Traditional HPC Resources
 
Cloud Services for Big Data Analytics
Cloud Services for Big Data AnalyticsCloud Services for Big Data Analytics
Cloud Services for Big Data Analytics
 
Introduction to Apache Hadoop
Introduction to Apache HadoopIntroduction to Apache Hadoop
Introduction to Apache Hadoop
 
FC Brochure & Insert
FC Brochure & InsertFC Brochure & Insert
FC Brochure & Insert
 
useR 2014 jskim
useR 2014 jskimuseR 2014 jskim
useR 2014 jskim
 
ODSC and iRODS
ODSC and iRODSODSC and iRODS
ODSC and iRODS
 
Designing Software Libraries and Middleware for Exascale Systems: Opportuniti...
Designing Software Libraries and Middleware for Exascale Systems: Opportuniti...Designing Software Libraries and Middleware for Exascale Systems: Opportuniti...
Designing Software Libraries and Middleware for Exascale Systems: Opportuniti...
 
Capacity Management and BigData/Hadoop - Hitchhiker's guide for the Capacity ...
Capacity Management and BigData/Hadoop - Hitchhiker's guide for the Capacity ...Capacity Management and BigData/Hadoop - Hitchhiker's guide for the Capacity ...
Capacity Management and BigData/Hadoop - Hitchhiker's guide for the Capacity ...
 
IJSRED-V2I3P43
IJSRED-V2I3P43IJSRED-V2I3P43
IJSRED-V2I3P43
 
Rack Cluster Deployment for SDSC Supercomputer
Rack Cluster Deployment for SDSC SupercomputerRack Cluster Deployment for SDSC Supercomputer
Rack Cluster Deployment for SDSC Supercomputer
 
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
 
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習 Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
 
Bitkom Cray presentation - on HPC affecting big data analytics in FS
Bitkom Cray presentation - on HPC affecting big data analytics in FSBitkom Cray presentation - on HPC affecting big data analytics in FS
Bitkom Cray presentation - on HPC affecting big data analytics in FS
 

Más de Bertrand Tavitian

Más de Bertrand Tavitian (7)

14 30-mjoliot biomist-piv-2017
14 30-mjoliot biomist-piv-201714 30-mjoliot biomist-piv-2017
14 30-mjoliot biomist-piv-2017
 
9 30 fandre-dist_cnrs_piv_2017
9 30 fandre-dist_cnrs_piv_20179 30 fandre-dist_cnrs_piv_2017
9 30 fandre-dist_cnrs_piv_2017
 
M allanic piv2017_c
M allanic piv2017_cM allanic piv2017_c
M allanic piv2017_c
 
M joliot biomist-piv_2017_c
M joliot biomist-piv_2017_cM joliot biomist-piv_2017_c
M joliot biomist-piv_2017_c
 
14 00-20171207 rance-piv_c
14 00-20171207 rance-piv_c14 00-20171207 rance-piv_c
14 00-20171207 rance-piv_c
 
M kain fli-iam_piv_2017_c
M kain fli-iam_piv_2017_cM kain fli-iam_piv_2017_c
M kain fli-iam_piv_2017_c
 
Mc jacquemot piv2017_c
Mc jacquemot piv2017_cMc jacquemot piv2017_c
Mc jacquemot piv2017_c
 

Último

Book Call Girls in Yelahanka - For 7001305949 Cheap & Best with original Photos
Book Call Girls in Yelahanka - For 7001305949 Cheap & Best with original PhotosBook Call Girls in Yelahanka - For 7001305949 Cheap & Best with original Photos
Book Call Girls in Yelahanka - For 7001305949 Cheap & Best with original Photosnarwatsonia7
 
Glomerular Filtration rate and its determinants.pptx
Glomerular Filtration rate and its determinants.pptxGlomerular Filtration rate and its determinants.pptx
Glomerular Filtration rate and its determinants.pptxDr.Nusrat Tariq
 
Call Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service Availablenarwatsonia7
 
Asthma Review - GINA guidelines summary 2024
Asthma Review - GINA guidelines summary 2024Asthma Review - GINA guidelines summary 2024
Asthma Review - GINA guidelines summary 2024Gabriel Guevara MD
 
Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...
Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...
Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...narwatsonia7
 
Call Girls Hosur Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hosur Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Hosur Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hosur Just Call 7001305949 Top Class Call Girl Service Availablenarwatsonia7
 
Call Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service Availablenarwatsonia7
 
Call Girl Koramangala | 7001305949 At Low Cost Cash Payment Booking
Call Girl Koramangala | 7001305949 At Low Cost Cash Payment BookingCall Girl Koramangala | 7001305949 At Low Cost Cash Payment Booking
Call Girl Koramangala | 7001305949 At Low Cost Cash Payment Bookingnarwatsonia7
 
Call Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service Availablenarwatsonia7
 
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbersBook Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbersnarwatsonia7
 
Call Girls In Andheri East Call 9920874524 Book Hot And Sexy Girls
Call Girls In Andheri East Call 9920874524 Book Hot And Sexy GirlsCall Girls In Andheri East Call 9920874524 Book Hot And Sexy Girls
Call Girls In Andheri East Call 9920874524 Book Hot And Sexy Girlsnehamumbai
 
Call Girls Thane Just Call 9910780858 Get High Class Call Girls Service
Call Girls Thane Just Call 9910780858 Get High Class Call Girls ServiceCall Girls Thane Just Call 9910780858 Get High Class Call Girls Service
Call Girls Thane Just Call 9910780858 Get High Class Call Girls Servicesonalikaur4
 
See the 2,456 pharmacies on the National E-Pharmacy Platform
See the 2,456 pharmacies on the National E-Pharmacy PlatformSee the 2,456 pharmacies on the National E-Pharmacy Platform
See the 2,456 pharmacies on the National E-Pharmacy PlatformKweku Zurek
 
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...Miss joya
 
Mumbai Call Girls Service 9910780858 Real Russian Girls Looking Models
Mumbai Call Girls Service 9910780858 Real Russian Girls Looking ModelsMumbai Call Girls Service 9910780858 Real Russian Girls Looking Models
Mumbai Call Girls Service 9910780858 Real Russian Girls Looking Modelssonalikaur4
 
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment BookingCall Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment BookingNehru place Escorts
 
Call Girls Service In Shyam Nagar Whatsapp 8445551418 Independent Escort Service
Call Girls Service In Shyam Nagar Whatsapp 8445551418 Independent Escort ServiceCall Girls Service In Shyam Nagar Whatsapp 8445551418 Independent Escort Service
Call Girls Service In Shyam Nagar Whatsapp 8445551418 Independent Escort Serviceparulsinha
 
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...narwatsonia7
 
Hemostasis Physiology and Clinical correlations by Dr Faiza.pdf
Hemostasis Physiology and Clinical correlations by Dr Faiza.pdfHemostasis Physiology and Clinical correlations by Dr Faiza.pdf
Hemostasis Physiology and Clinical correlations by Dr Faiza.pdfMedicoseAcademics
 

Último (20)

Book Call Girls in Yelahanka - For 7001305949 Cheap & Best with original Photos
Book Call Girls in Yelahanka - For 7001305949 Cheap & Best with original PhotosBook Call Girls in Yelahanka - For 7001305949 Cheap & Best with original Photos
Book Call Girls in Yelahanka - For 7001305949 Cheap & Best with original Photos
 
Glomerular Filtration rate and its determinants.pptx
Glomerular Filtration rate and its determinants.pptxGlomerular Filtration rate and its determinants.pptx
Glomerular Filtration rate and its determinants.pptx
 
sauth delhi call girls in Bhajanpura 🔝 9953056974 🔝 escort Service
sauth delhi call girls in Bhajanpura 🔝 9953056974 🔝 escort Servicesauth delhi call girls in Bhajanpura 🔝 9953056974 🔝 escort Service
sauth delhi call girls in Bhajanpura 🔝 9953056974 🔝 escort Service
 
Call Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service Available
 
Asthma Review - GINA guidelines summary 2024
Asthma Review - GINA guidelines summary 2024Asthma Review - GINA guidelines summary 2024
Asthma Review - GINA guidelines summary 2024
 
Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...
Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...
Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...
 
Call Girls Hosur Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hosur Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Hosur Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hosur Just Call 7001305949 Top Class Call Girl Service Available
 
Call Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service Available
 
Call Girl Koramangala | 7001305949 At Low Cost Cash Payment Booking
Call Girl Koramangala | 7001305949 At Low Cost Cash Payment BookingCall Girl Koramangala | 7001305949 At Low Cost Cash Payment Booking
Call Girl Koramangala | 7001305949 At Low Cost Cash Payment Booking
 
Call Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service Available
 
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbersBook Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
 
Call Girls In Andheri East Call 9920874524 Book Hot And Sexy Girls
Call Girls In Andheri East Call 9920874524 Book Hot And Sexy GirlsCall Girls In Andheri East Call 9920874524 Book Hot And Sexy Girls
Call Girls In Andheri East Call 9920874524 Book Hot And Sexy Girls
 
Call Girls Thane Just Call 9910780858 Get High Class Call Girls Service
Call Girls Thane Just Call 9910780858 Get High Class Call Girls ServiceCall Girls Thane Just Call 9910780858 Get High Class Call Girls Service
Call Girls Thane Just Call 9910780858 Get High Class Call Girls Service
 
See the 2,456 pharmacies on the National E-Pharmacy Platform
See the 2,456 pharmacies on the National E-Pharmacy PlatformSee the 2,456 pharmacies on the National E-Pharmacy Platform
See the 2,456 pharmacies on the National E-Pharmacy Platform
 
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
 
Mumbai Call Girls Service 9910780858 Real Russian Girls Looking Models
Mumbai Call Girls Service 9910780858 Real Russian Girls Looking ModelsMumbai Call Girls Service 9910780858 Real Russian Girls Looking Models
Mumbai Call Girls Service 9910780858 Real Russian Girls Looking Models
 
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment BookingCall Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
 
Call Girls Service In Shyam Nagar Whatsapp 8445551418 Independent Escort Service
Call Girls Service In Shyam Nagar Whatsapp 8445551418 Independent Escort ServiceCall Girls Service In Shyam Nagar Whatsapp 8445551418 Independent Escort Service
Call Girls Service In Shyam Nagar Whatsapp 8445551418 Independent Escort Service
 
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
 
Hemostasis Physiology and Clinical correlations by Dr Faiza.pdf
Hemostasis Physiology and Clinical correlations by Dr Faiza.pdfHemostasis Physiology and Clinical correlations by Dr Faiza.pdf
Hemostasis Physiology and Clinical correlations by Dr Faiza.pdf
 

C cerin piv2017_c

  • 1. Gestion des données scientifiques en imagerie in vivo https://piv2017.sciencesconf.org/ Christophe CERIN, Université Paris13 christophe.cerin@lipn.univ-paris13.fr Plateformes de resources numériques Expérience USPC Paris, December 7, 2017
  • 2. Outline 2 • Motivations related to large scale platforms • Background: • Criteria for observing platforms • High Performance Computing (HPC) • Apache suite for Big Data • Some projects • Contribution: USPC platform • Conclusion and perspective
  • 3. Motivations related to large scale platforms In experimental Sciences we need tips for using them appropriately / best practices 3
  • 4. 4 Definition of a large scale system • System with unprecedented amounts of hardware (>1M cores, >100 PB storage, >100Gbits/s interconnect); • Ultra large scale system: hardware + lines of source code + numbers of users + volumes of data; • New problems: • built across multiple organizations; • often with conflicting purposes and needs; • constructed from heterogeneous parts with complex dependencies and emergent properties; • continuously evolving; • software, hardware and human failures will be the norm, not the exception.
  • 5. Since in our lives (Cloud Computing) 5 Software as a Service SaaS Plateform as a Service PaaS
 Infrastructure as a 
 Service
 IaaS
  • 6. • Don’t believe them; • Variety in the eco-system is the norm. 6 Some speeches are ready for you
  • 7. In Cloud, innovation drives technology/science Containers and no more Virtual Machines 7 Hardware Operating System App App App Conventional Stack Hardware OS App App App Hypervisor OS OS Virtualized Stack ▪Abstraction on resources in the past: VM technologies to hide physical properties, interactions between system layer, application layer and the end-users
  • 9. 9 In Sciences (not in business) • Experimental approach: modeling,…, experiments,… • What is an experimental plan for large scale systems and reproducible research? • Answers depend on the system used: • Cluster (dedicated; managed) • Cloud (almost non supervised… or by you)
  • 11. ▪Cluster ➔Performance ➔Flops - Top500 ▪Cluster ➔Power ➔Flops per watt - Green500 ▪Cloud ➔Price ➔for computing, for storing… ▪Cloud: the economic model may be strange (Amazon Spot Instances) ▪Scientific issues: how to compare cluster and cloud? (be fair) 11 Performance criteria for observing large scale system
  • 14. 14 Productivity ▪Productivity traditionally refers to the ratio between the quantity of software produced and the cost spent for it. ▪Factors: Programming language used, Program size, The experience of programmers and design personnel, The novelty of requirements, The complexity of the program and its data, The use of structured programming methods, Program class or the distribution method, Program type of the application area, Tools and environmental conditions, Enhancing existing programs or systems, Maintaining existing programs or systems, Reusing existing modules and standard designs, Program generators, Fourth-generation languages, Geographic separation of development locations, Defect potentials and removal methods, (Existing) Documentation, Prototyping before main development begins, Project teams and organization structures, Morale and compensation of staff
  • 15. ▪Parallel (http://press3.mcs.anl.gov/atpesc/files/2015/03/ESCTP-snir_215.pdf): ▪Shared memory (OpenMP): ▪CPU+vector ▪CPU+GPU ▪CPU+accelerators (Google Tensor processing unit…) ▪Message passing (MPI) ▪Sequential with the call to « parallel data structure » (Scala) ▪Map-reduce: Hadoop, Spark (In addition to Map and Reduce operations, it supports SQL queries, streaming data, machine learning and graph data processing.) 15 Programming model
  • 16. ▪Even more complicated: Cray has added "ARM Option" (i.e. CPU blade option, using the ThunderX2) to their XC50 supercomputers ▪We need consolidations: ▪in hardware ▪in software ▪in algorithms ▪in applications ▪in experimental methods ▪hpc facilities ▪training 16 Hardware landscape
  • 17. 17 Apache suite for Big Data 1- Bases de Données 1.1 Apache Hbase 1.2 Apache Cassandra 1.3 CouchDB 1.4 MongoDB 2 Accès aux données/ requêtage 2.1 Pig 2.2 Hive 2.3 Livy 3 Data Intelligence 3.1 Apache Drill 3.2 Apache Mahout 3.3 H2O 4 Data Serialisation 4.1 Apache Thrift 4.2 Apache Avro 5 Data integration 5.1 Apache Sqoop 5.2 Apache Flume 5.3 Apache Chuckwa 6 Requetage 6.1 Presto 6.2 Impala 6.3 Dremel 7 Sécurité des données 7.1 Apache Metron 7.2 Sqrrl 8.2 MapReduce 8.3 Spark 9.1 Elasticsearch 10.1 Apache Hadoop 10.6 Apache Mesos 10.10 Apache Flink 10.12 Apache Zookeeper 10.15 Apache Storm 10.20 Apache Kafka
  • 18. Example of a project mixing HPC and Big Data considerations Where theory comes before data: physics chemistry Materials ... Where data comes before theory: Medicine business insights autonomy (smart car, building, city) 18
  • 19. Harp: Hadoop and Collective Communication for Iterative Computation 19 • Project http://salsaproj.indiana.edu/harp/ • Motivation: • Communication patterns are not abstracted and defined in Hadoop, Pregel/Giraph, Spark • In contrary, MPI has very fine grain based (collective) communication primitives (based on arrays and buffers) • Harp provides data abstractions and communication abstractions on top of them. It can be plugged into Hadoop runtime to enable efficient in-memory communication to avoid HDFS read/write.
  • 20. 20 Harp • The data abstraction has 3 categories horizontally and 3 levels vertically
  • 21. • Harp works as a plugin in Hadoop. The goal is to make Hadoop cluster can schedule original MapReduce jobs and Map-Collective jobs at the same time. • Collective communication is defined as movement of partitions within tables. • Collective communication requires synchronization. Hadoop scheduler is modified to schedule tasks in BSP style. • Fault tolerance with checkpointing. 21 Harp
  • 22. • For the data-scientist of nowadays; • Hub to facilite access to Amazon cloud resources, to share data, to collaborate between many people through notebooks; • RosettaHUB: https://www.rosettahub.com • Available through Amazon Educate program (https://s3.amazonaws.com/awseducate- list/AWS_Educate_Institutions.pdf) 22 RosettaHub
  • 23. 23
  • 24. 24
  • 25. 25
  • 26. 26
  • 27. SPC and the deployment of virtual machines IDV experience (Imageries du vivant) 27
  • 29. 29
  • 30. 30
  • 32. 32
  • 33. 33
  • 34. 34
  • 35. 35
  • 36. 36
  • 37. • Plateforme d'imagerie ePad http://pf-01.lab.parisdescartes.fr:1243/dcm4chee-web3/ • Plateforme PLM dans le cadre du projet DRIVE-SPC http://pf-01.lab.parisdescartes.fr:2345/tc/webclient • Plateforme Sis4web de SisNCom http://pf-01.lab.parisdescartes.fr:1323/sis4web/login.php • Benchsys http://pf-01.lab.parisdescartes.fr:9001/ • Owncloud http://pf-01.lab.parisdescartes.fr:1253/owncloud/index.php/login • … 37 Examples IDV services
  • 38. Conclusion and Future work Many, many opportunities 38
  • 39. • Computer hardware will continue to evolve; Trend: accelerators instead of GPUs; • Platforms will continue to evolve to facilitate the management of data, the computation on data; • Need for convergence/resonance between HPC and Big- data eco-systems; • SPC: needs for training, using, transforming the scientific approaches… with research engineers for accompany the change; • SPC: needs for projects… and to share betweenez communities. 39 Conclusion and perspective
  • 40. Thanks to Leila Abidi, Olivier Waldek, Roland Chervet 40