RAMSES: Robust Analytic Models for Science at Extreme Scales
Gagan Agarwal1* Prasanna Balaprakash2 Ian Foster2* Raj Kettimuthu2 
Sven Leyffer2 Vitali Morozov2 Todd Munson2 Nagi Rao3* 
Saday Sadayappan1 Brad Settlemyer3 Brian Tierney4* Don Towsley5* 
Venkat Vishwanath2 Yao Zhang2 
1 Ohio State University 2 Argonne National Laboratory 
3 Oak Ridge National Laboratory 4 ESnet 5 UMass Amherst (* Co-PIs) 
Advanced Scientific Computing Research 
Program manager: Rich Carlson
Prediction, explanation, & optimization are 
challenging for even “simple” E2E workflows 
Source 
data 
store 
Destination 
data 
store 
Wide 
Area 
Network 
For example, file transfer, for which we want to: 
• Predict achievable throughput for a specific configuration 
• Explain factors influencing performance 
• Optimize parameter values to achieve high speeds
Prediction, explanation, & optimization are 
challenging for even “simple” E2E workflows 
Application 
OS 
FS Stack 
HBA/HCA 
Router 
LAN 
Switch 
Source 
data 
transfer 
node 
TCP 
IP 
NIC 
Application 
OS 
Router TCP 
FS Stack 
HBA/HCA 
LAN 
Switch 
IP 
NIC 
Storage Array 
Wide 
Area 
Network 
OST 
MDT 
Lustre 
file 
system 
Destination 
data transfer 
node 
OSS 
OSS 
MDS 
MDS 
+ diverse environments 
+ diverse workloads 
+ contention
85 Gbps sustained disk-to-disk over 100 
Gbps network, Ottawa—New Orleans 
Raj Kettimuthu and team, Argonne
High-speed transfers to/from AWS cloud, 
via Globus transfer service 
• UChicago → AWS S3 (US region): Sustained 2 Gbps 
– 2 GridFTP servers, GPFS file system at UChicago 
– Multi-part upload via 16 concurrent HTTP connections 
• AWS → AWS (same region): Sustained 5 Gbps 
go#s3
One Advanced 
Photon Source 
data node: 
125 destinations
Same 
node 
(1 Gbps 
link)
How to create more accurate, useful, and 
portable models of such systems? 
Simple analytical model: 
T = α + β*l 
[startup cost + sustained bandwidth] 
Experiment + regression 
to estimate α, β 
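The "experiment + regression" step can be sketched as an ordinary least-squares fit. This is a minimal illustration, assuming that l is the amount of data moved, α is the startup cost, and β is the time per unit of data; the measurements below are hypothetical, not taken from the talk.

```python
def fit_alpha_beta(sizes, times):
    """Ordinary least squares for the model T = alpha + beta * l."""
    n = len(sizes)
    mean_l = sum(sizes) / n
    mean_t = sum(times) / n
    sxx = sum((l - mean_l) ** 2 for l in sizes)
    sxy = sum((l - mean_l) * (t - mean_t) for l, t in zip(sizes, times))
    beta = sxy / sxx                  # slope: time per unit of data
    alpha = mean_t - beta * mean_l    # intercept: startup cost
    return alpha, beta


# Hypothetical measurements: transfer size (GB) and elapsed time (s).
sizes = [1.0, 2.0, 4.0, 8.0]
times = [2.5, 4.5, 8.5, 16.5]

alpha, beta = fit_alpha_beta(sizes, times)
predicted = alpha + beta * 16.0       # extrapolate to a 16 GB transfer
print(f"alpha={alpha:.3f} s, beta={beta:.3f} s/GB, T(16 GB)={predicted:.1f} s")
```

A two-parameter model like this is easy to fit but ignores contention, protocol dynamics, and storage effects, which is what motivates the richer models discussed next.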
First-principles modeling 
to better capture details 
of system & application 
components 
Data-driven modeling to 
learn unknown details of 
system & application 
components 
Model 
composition 
Model, data 
comparison
The RAMSES vision 
To develop a new science of end-to-end 
analytical performance modeling that will 
transform understanding of the behavior of 
science workflows in extreme-scale science 
environments. 
Based on integration of first-principles and 
data-driven modeling, and structured 
approach to model evaluation & composition 
The RAMSES research agenda & platform 
Modeling 
Develop, evaluate, 
and refine component 
and end-to-end models 
Tools 
Develop easy-to-use 
tools to provide end-users 
with actionable 
advice 
Estimation 
Develop and apply data-driven 
estimation methods: 
differential regression, 
surrogate models, 
etc. 
Experiments 
Extensive, automated 
experiments to test models 
& build database 
Platform components: Evaluators, Advisor, Database, Estimators, Tester
We are informed by five challenge workflows 
Transfer: High-performance, end-to-end 
file transfer 
Scattering: Capture and analysis of 
diffuse scattering experimental data 
MapReduce: Data-intensive, distributed 
data analytics 
Exascale: Performance of exascale 
application kernels on memory hierarchies 
In-situ: Configuration and placement of in-situ 
analysis computations
Transfer: End-to-end file movement 
Storage Array 
Application 
OS 
FS Stack 
HBA/HCA 
Router 
LAN 
Switch 
Source 
data 
transfer 
node 
TCP 
IP 
NIC 
Application 
OS 
TCP 
IP 
FS Stack 
HBA/HCA 
Router 
LAN 
Switch 
NIC 
Wide 
Area 
Network 
Predict: Throughput for configuration 
Explain: Factors influencing performance 
Optimize: Parameters for high speeds 
OST 
MDT 
Lustre 
file 
system 
Destination 
data transfer 
node 
OSS 
OSS 
MDS 
MDS
Scattering: Linking simulation and 
experiment to study disordered structures 
Diffuse scattering images from Ray Osborn et al., Argonne 
Experimental Sample 
scattering 
Material 
composition 
Simulated 
structure 
Simulated 
scattering 
La 60% 
Sr 40% 
Detect errors 
(secs—mins) 
Knowledge base 
Past experiments; 
simulations; literature; 
expert knowledge 
Select experiments 
(mins—hours) 
Contribute to knowledge base 
Simulations driven by 
experiments (mins—days) 
Knowledge-driven 
decision making 
Evolutionary optimization
Immediate assessment of alignment quality in 
near-field high-energy diffraction microscopy 
1 
Blue Gene/Q 
Orthros 
(All data in NFS) 
3: Generate 
Parameters 
FOP.c 
50 tasks 
25s/task 
¼ CPU hours 
Uses Swift/K 
Dataset 
360 files 
4 GB total 
1: Median calc 
75s (90% I/O) 
MedianImage.c 
Uses Swift/K 
2: Peak Search 
15s per file 
ImageProcessing.c 
Uses Swift/K 
Reduced 
Dataset 
360 files 
5 MB total 
feedback to experiment 
Detector 
4: Analysis Pass 
FitOrientation.c 
60s/task (PC) 
1667 CPU hours 
60s/task (BG/Q) 
1667 CPU hours 
Uses Swift/T 
GO Transfer 
Up to 
2.2 M CPU hours 
per week! 
ssh 
Globus Catalog 
Scientific Metadata 
Workflow Workflow Progress 
Control 
Script 
Bash 
Manual 
This is a 
single 
workflow 
3: Convert bin L 
to N 
2 min for all files, 
convert files to 
Network Endian 
format 
Before 
After 
Hemant Sharma, Justin Wozniak, Mike Wilde, Jon Almer
MapReduce: Distributing data and 
computation for data analytics 
Job Assignment 
... 
... 
Data 
Slaves 
Master 
Local Cluster 
Local 
Reduction 
... 
... 
Data 
Slaves 
Master 
Cloud 
Environment 
Job Assignment 
Local 
Reduction 
Index 
Remote data 
analysis 
Job 
assignment 
Global 
reduction
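The pattern in this diagram, a local reduction at each site followed by a global reduction, can be sketched as follows. This is an illustrative toy only: the record lists, function names, and the choice of a word-count style reduction are assumptions, not the project's actual middleware.

```python
from collections import Counter
from functools import reduce


def local_reduction(records):
    """Reduce the records held at one site (local cluster or cloud)."""
    return Counter(records)


def global_reduction(partials):
    """Combine the per-site partial results into one global result."""
    return reduce(lambda a, b: a + b, partials, Counter())


# Hypothetical data split between a local cluster and a cloud environment.
local_cluster_data = ["a", "b", "a"]
cloud_data = ["b", "c", "b"]

partials = [local_reduction(local_cluster_data), local_reduction(cloud_data)]
total = global_reduction(partials)
print(dict(total))
```

Only the small partial results cross the wide-area link, which is why the placement of the reduction steps matters for end-to-end performance.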
Exascale simulation 
Images Courtesy: Joseph Insley (Argonne) 
HACC Cosmology 
• Compute intensive phase with 
regular stride one access 
• Tree walk phase: irregular 
memory access with high 
branching and integer ops 
• 3D FFT communication intensive 
phase 
• I/O Phase 
Nek5000 CFD 
• Matrix vector product phase 
• Conjugate gradient iteration 
• Communication phase 
involving nearest neighbor 
exchange and vector 
reductions
In situ analysis on the DOE Leadership 
Compute 
Resource 
(Multi 
Petaflop, 
High Radix 
Interconnect 
Dragonfly, 
5D Torus) 
Computing Infrastructure 
I/O 
Nodes 
Switch 
Complex 
Analysis 
Nodes/Cluster 
(IB) File Server 
Nodes 
Storage System 
1536 
GB/s 
DTN Nodes 
We need to perform the right computation at 
the right place and time, taking into account 
details of the simulation, resources, and analysis 
1 
2 
3 
4
A diverse set of components 
Server 
Parallel 
computer 
Router 
Storage system 
LAN 
WAN 
TCP, UDT 
GridFTP 
File systems 
GridFTP server 
NECbone 
HACCbone 
Checksum 
Encryption 
MapReduce 
Other apps 
Transfer Y Y Y Y Y Y Y Y Y Y Y 
Scattering Y Y Y Y Y Y Y Y 
Exascale Y Y Y Y Y Y 
Distributed MapReduce Y Y Y Y Y Y Y Y Y 
In-Situ Y Y Y Y Y Y Y Y 
Develop, evaluate, and refine 
component and end-to-end 
models 
• Models from the literature 
• Fluid models for network flows 
• SKOPE modeling 
system 
Develop and apply 
data-driven 
estimation methods 
• Differential regression 
• Surrogate models 
• Other methods from literature 
Develop easy-to-use tools to 
provide end-users with 
actionable advice 
• Runtime advisor, integrated 
with Globus transfer system 
Automated experiments to 
test models and build 
database 
• Experiment design 
• Testbeds
Overview Input Output 
Workload input 
Code 
skeletons 
Parser 
Per-function 
intermediate repr. 
(Block Skeleton Trees) 
Behavior 
modeling engine 
Execution-based 
intermediate repr. 
(Bayesian execution tree) 
Transformation 
engine 
Performance 
projection 
Characterization 
engine 
Transformed 
Bayesian execution 
tree 
Hardware model 
system 
specifications 
Performance 
projection 
Schema for 
suggested 
transformations 
Synthesized 
characteristics 
Source code 
User Effort 
(semi-automated with 
a source-to-source 
translator) 
Automatic 
SKOPE language 
Back end Front end 
Bottleneck analysis 
SKOPE 
performance 
modeling 
framework
Differential regression for combining 
data from different sources 
Example of use: Predict performance on connection length L 
not realizable on physical infrastructure 
E.g., IB-RDMA or HTCP throughput on 900-mile connection 
1) Make multiple measurements of performance on path lengths d: 
– $M_S(d)$: OPNET simulation 
– $M_E(d)$: ANUE-emulated path 
– $M_U(d_i)$: Real network (USN) 
2) Compute measurement regressions on d: $\dot{M}_A(\cdot)$, $A \in \{S, E, U\}$ 
3) Compute differential regressions: $\Delta\dot{M}_{A,B}(\cdot) = \dot{M}_A(\cdot) - \dot{M}_B(\cdot)$, $A, B \in \{S, E, U\}$ 
4) Apply differential regression to obtain estimates, $C \in \{S, E\}$: 
$\mathcal{M}_U(d) = M_C(d) - \Delta\dot{M}_{C,U}(d)$ 
where $M_C(d)$ is a simulated/emulated measurement and $\Delta\dot{M}_{C,U}(d)$ is a point regression estimate
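The four steps can be sketched in a few lines. This is a minimal illustration under stated assumptions: the regressions are straight lines (real throughput profiles need not be linear), and all distances and throughput values are hypothetical. Only the final formula, emulated measurement minus differential regression, follows the slide.

```python
def linear_fit(xs, ys):
    """Least-squares line through (xs, ys); returns a callable regression."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda d: intercept + slope * d


# Step 1: hypothetical throughput measurements (Gbps) versus path length (miles).
emu_d = [0.0, 200.0, 400.0, 600.0, 900.0]        # emulated path: any length
emu_m = [10.0 - 0.004 * d for d in emu_d]
usn_d = [0.0, 200.0, 400.0, 600.0]               # real network: limited lengths
usn_m = [9.5 - 0.004 * d for d in usn_d]

# Step 2: measurement regressions for the emulator (E) and real network (U).
fit_e = linear_fit(emu_d, emu_m)
fit_u = linear_fit(usn_d, usn_m)


# Step 3: differential regression between emulator and real network.
def delta_eu(d):
    return fit_e(d) - fit_u(d)


# Step 4: estimate real-network throughput at a length that cannot be realized.
d_target = 900.0
m_e_at_target = emu_m[emu_d.index(d_target)]
estimate_u = m_e_at_target - delta_eu(d_target)
print(f"Estimated real-network throughput at {d_target:.0f} miles: {estimate_u:.2f} Gbps")
```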
We will extend the differential regression 
method in several areas 
• To compare different component models 
– E.g., different models of network elements, storage 
systems, protocol implementations 
• To compare different composite models 
– E.g., different methods for combining memory and 
CPU models 
• To compare model outputs with measurements 
Component model 
Component i 
System parameters $p_i$ 
Task size parameters $s_i$ 
Cost terms 
Performance quality model 
Experiment design (active learning) 
Analytical and empirical models 
$\hat{Q}_i(p_i, s_i)$ is a regression estimate of
End-to-end profile composition 
Source LAN 
profile 
WAN 
profile 
Destination LAN 
profile 
Configuration for 
host and edge 
devices 
Configuration 
for WAN 
devices 
Configuration for 
host and edge 
devices 
composition 
operations
End-to-end model composition & analysis 
• End-to-end model using composition 
– It is an approximation: due to component interactions 
not modelled by the composition operator 
• Actual end-to-end performance model 
– Component models are “corrected” to account for un-modelled 
effects: this form is assumed to exist 
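To make the idea of a composition operator concrete, here is a small sketch. The operator chosen here (end-to-end throughput is the minimum over segments, latency is the sum) is an assumption for illustration and is not the operator defined by the project; the `Profile` type and all numbers are hypothetical. As the slide notes, such a composition is only an approximation because it ignores interactions between components.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical performance profile of one segment of the path."""
    throughput_gbps: float
    latency_ms: float


def compose(a: Profile, b: Profile) -> Profile:
    """Assumed composition operator: bottleneck throughput, additive latency."""
    return Profile(
        throughput_gbps=min(a.throughput_gbps, b.throughput_gbps),
        latency_ms=a.latency_ms + b.latency_ms,
    )


source_lan = Profile(throughput_gbps=40.0, latency_ms=0.2)
wan = Profile(throughput_gbps=100.0, latency_ms=35.0)
destination_lan = Profile(throughput_gbps=10.0, latency_ms=0.3)

end_to_end = compose(compose(source_lan, wan), destination_lan)
print(end_to_end)
```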
Using end-to-end measurements and differential 
regression to correct regression estimates 
• Regression estimate of composed model: $\hat{Q}(p,s)$ 
– “Estimated”, since component models are “incomplete” 
as derived from first principles and/or measurements 
• Error due to regression estimate: $Q(p,s) \oplus \hat{Q}(p,s) = [\,Q(p,s) - \hat{Q}(p,s)\,]^2$ 
• Error can be mitigated using measurements: 
Corrected estimate of $Q_{p,s}$: $\hat{Q}(p,s) + \hat{\Delta}(p,s)$ 
– $\hat{Q}(p,s)$: analytical model 
– $\hat{\Delta}(p,s)$: correction from differential regression using measurements
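The correction step can be illustrated with a toy example: fit a regression to the residuals between end-to-end measurements and the analytical model, then add it back. Everything specific here is an assumption for illustration: the analytical model `q_hat` (time = s / p), the linear form of the correction, and the synthetic measurements.

```python
def q_hat(p, s):
    """Hypothetical analytical model: transfer time = size / parallelism."""
    return s / p


def fit_correction(samples):
    """Fit the correction as a line in s to the residuals Q - q_hat.

    samples: list of (p, s, measured_q) tuples.
    """
    xs = [s for _, s, _ in samples]
    rs = [q - q_hat(p, s) for p, s, q in samples]
    n = len(xs)
    mx, mr = sum(xs) / n, sum(rs) / n
    b = (sum((x - mx) * (r - mr) for x, r in zip(xs, rs))
         / sum((x - mx) ** 2 for x in xs))
    a = mr - b * mx
    return lambda p, s: a + b * s


# Synthetic end-to-end measurements with an un-modelled overhead term.
samples = [(p, s, s / p + 0.1 + 0.02 * s)
           for p, s in [(1, 1.0), (2, 2.0), (4, 4.0), (2, 8.0)]]

delta_hat = fit_correction(samples)


def corrected(p, s):
    """Corrected estimate: analytical model plus fitted correction."""
    return q_hat(p, s) + delta_hat(p, s)


def mean_squared_error(model):
    return sum((q - model(p, s)) ** 2 for p, s, q in samples) / len(samples)


print("error before:", mean_squared_error(q_hat))
print("error after: ", mean_squared_error(corrected))
```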
Performance guarantees 
• Vapnik-Chervonenkis theory: under finite VC-dim(F) 
$P\{\, I(\hat{\Delta}, \hat{Q}, p) - I(\Delta^*, \hat{Q}, p) > \epsilon \,\} < \delta(F, l, \epsilon)$ 
where $\hat{\Delta}$ is the estimated correction and $\Delta^*$ is the optimal one 
– Guarantees that error of regression estimate is close to 
optimal with a certain probability 
– Distribution-free: does not require detailed knowledge 
of error distributions – uses end-to-end measurements 
• Error of the corrected estimate: 
$I(\Delta, \hat{Q}, p) = \int [\,Q_{p,s} - \hat{Q}(p,s) - \Delta(p,s)\,]^2 \, dP_{Q_{p,s}}$
Surrogate modeling framework 
to inform choice of experiments 
Machine learning & 
optimization 
Performance 
metrics 
Informative 
configurations 
First-principles models 
Evaluation
Fluid models of network flows 
GridFTP flow i: parallelism $k_i$, RTT $R_i$, throughput $T_i(t)$ 
Bottleneck router: capacity C, loss rate $p(t)$, queue Q 
[Equations: $dT_i/dt$ for each flow and $dQ/dt$ for the bottleneck queue] 
Solve for throughputs and transfer delays 
Special case: known p 
[Equation: $T_i$ in terms of $k_i$, $R_i$, and $p$]
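A sketch of how a fluid model of this kind might be solved numerically for the special case of known p. The differential equation used here, dT/dt = k/R² − p·T²/(2k), is an assumption patterned on standard TCP fluid models with k parallel streams; it may differ from the equations the authors use. The parameter values are hypothetical.

```python
import math


def simulate_throughput(k, rtt, p, dt=0.01, steps=20000):
    """Forward-Euler integration of the assumed fluid model.

    k: number of parallel streams, rtt: round-trip time (s),
    p: packet loss rate. Returns throughput in packets per second.
    """
    throughput = 0.0
    for _ in range(steps):
        growth = k / rtt ** 2                      # additive increase of k streams
        backoff = p * throughput ** 2 / (2 * k)    # multiplicative decrease on loss
        throughput += dt * (growth - backoff)
    return throughput


def steady_state(k, rtt, p):
    """Closed form obtained by setting dT/dt = 0 in the assumed model."""
    return (k / rtt) * math.sqrt(2.0 / p)


k, rtt, p = 4, 0.05, 1e-4
simulated = simulate_throughput(k, rtt, p)
closed_form = steady_state(k, rtt, p)
print(f"simulated={simulated:.1f} pkt/s, closed form={closed_form:.1f} pkt/s")
```

Under this assumed model the steady-state throughput scales linearly with the parallelism k and inversely with RTT, which matches the familiar square-root relationship between TCP throughput and loss rate.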
Model composition 
Analytical 
models 
Performance projections 
Regression 
models 
Experiments Historical logs 
Emulators 
Code skeletons 
SKOPE 
language 
Workload 
parameters 
Source 
code 
Benchmarks 
Simulators 
SKOPE 
System models 
(current or future) 
Application behavior 
models 
Our 
multi-modal 
approach
File transfer performance projections 
System models Application behavior 
Application 
to file 
transfer 
Model composition 
Analytical 
models 
Regression 
models 
Experiments Historical logs 
Code skeletons 
SKOPE 
language 
Workload 
parameters 
Source 
code 
SKOPE 
models 
Storage, TCP, WAN 
iperf 
GridFTP 
Emulators XDD
Exascale simulation perf. projections 
System models Application behavior 
Compute, memory, models 
Model composition 
Analytical 
models 
Regression 
models 
Experiments Historical logs 
Code skeletons 
SKOPE 
language 
Workload 
parameters 
Source 
code 
SKOPE 
interconnect 
MPI 
benchmarks 
Stream 
DGEMM IOR 
Listing 1: MatMul's CPU code 
    float A[N][K], B[K][M]; 
    float C[N][M]; 
    int i, j, k; 
    for (i = 0; i < N; ++i) { 
      for (j = 0; j < M; ++j) { 
        float sum = 0; 
        for (k = 0; k < K; ++k) { 
          sum += A[i][k] * B[k][j]; 
        } 
        C[i][j] = sum; 
      } 
    } 
Listing 2: MatMul's code skeleton 
    float A[N][K] 
    float B[K][M] 
    float C[N][M] 
    /* the loop space */ 
    parallel_for (N, M) 
    : i, j 
    { 
      /* computation w/ the 
       * instruction count 
       */ 
      comp 1 
      /* streaming loop */ 
      stream k = 0:K { 
        /* load */ 
        ld A[i][k] 
        ld B[k][j] 
        comp 3 
      } 
      comp 5 
      /* store */ 
      st C[i][j] 
    } 
Notes: data parallelism is expressed as homogeneous tasks over the parallel loop space; data accesses are expressed as load and store operations, together with the accessed indices and array sizes. 
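As a worked example of what a code skeleton makes available to a performance model, the operation counts for the MatMul skeleton can be tallied directly. This sketch assumes that `comp n` denotes n compute instructions, that `ld` and `st` each denote one memory access, and that the `parallel_for (N, M)` body runs once per (i, j) pair; the function name is hypothetical and this is not the SKOPE tool itself.

```python
def matmul_skeleton_counts(N, M, K):
    """Tally operations implied by the MatMul code skeleton."""
    tasks = N * M                        # parallel_for (N, M): one task per (i, j)
    loads = tasks * K * 2                # ld A[i][k] and ld B[k][j] per stream step
    stores = tasks                       # st C[i][j] once per task
    compute = tasks * (1 + 3 * K + 5)    # comp 1, comp 3 per stream step, comp 5
    return {"tasks": tasks, "loads": loads, "stores": stores, "compute": compute}


counts = matmul_skeleton_counts(N=4, M=4, K=4)
print(counts)
```

Counts like these, combined with a hardware model, are the kind of input from which a projection of execution time can be derived.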
Application 
to exascale 
simulation
A performance database 
• We aim to collect instrumentation data in a 
central database to simplify model validation 
• We plan to use the perfSONAR measurement 
archive tool as a starting point 
– REST API on top of Cassandra and Postgres 
– Optimized for time series data 
– Will extend as needed 
– http://software.es.net/esmond/ 
Application to transfer optimization 
Performance 
predictor 
Parameter 
database 
Performance 
analyst 
Model 
refiner 
User 
feedback 
agent 
Globus 
(1) Transfer service 
description 
(3) Transfer 
performance 
(4) User 
feedback 
(2) 
Prediction 
Prediction 
Analysis 
Analysis 
Parameter 
update
Summary 
• We focus on the science of modeling: integration 
of first-principles and data-driven models; model 
composition and evaluation 
• Our challenge applications span a broad 
spectrum of DOE resources and disciplines 
• We see big opportunities for cooperation: e.g., 
on development and evaluation of component 
models 
Thanks, and for more information 
• Thanks to our sponsors: 
Advanced Scientific Computing Research 
Program manager: Rich Carlson 
• Thanks to my RAMSES project co-participants 
• For more information, please see 
https://sites.google.com/site/ramsesdoeproject/ 
ianfoster.org and @ianfoster

Más contenido relacionado

La actualidad más candente

Scalable Algorithm Design with MapReduce
Scalable Algorithm Design with MapReduceScalable Algorithm Design with MapReduce
Scalable Algorithm Design with MapReducePietro Michiardi
 
A time energy performance analysis of map reduce on heterogeneous systems wit...
A time energy performance analysis of map reduce on heterogeneous systems wit...A time energy performance analysis of map reduce on heterogeneous systems wit...
A time energy performance analysis of map reduce on heterogeneous systems wit...newmooxx
 
대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화
대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화
대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화NAVER Engineering
 
Accumulo Summit 2015: Using D4M for rapid prototyping of analytics for Apache...
Accumulo Summit 2015: Using D4M for rapid prototyping of analytics for Apache...Accumulo Summit 2015: Using D4M for rapid prototyping of analytics for Apache...
Accumulo Summit 2015: Using D4M for rapid prototyping of analytics for Apache...Accumulo Summit
 
Optimization of Incremental Queries CloudMDE2015
Optimization of Incremental Queries CloudMDE2015Optimization of Incremental Queries CloudMDE2015
Optimization of Incremental Queries CloudMDE2015József Makai
 
Large Scale Machine Learning with Apache Spark
Large Scale Machine Learning with Apache SparkLarge Scale Machine Learning with Apache Spark
Large Scale Machine Learning with Apache SparkCloudera, Inc.
 
2015-06-15 Large-Scale Elastic-Net Regularized Generalized Linear Models at S...
2015-06-15 Large-Scale Elastic-Net Regularized Generalized Linear Models at S...2015-06-15 Large-Scale Elastic-Net Regularized Generalized Linear Models at S...
2015-06-15 Large-Scale Elastic-Net Regularized Generalized Linear Models at S...DB Tsai
 
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...Jen Aman
 
HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nod...
HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nod...HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nod...
HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nod...Xiao Qin
 
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures Intel® Software
 
Performance Optimization of Deep Learning Frameworks Caffe* and Tensorflow* f...
Performance Optimization of Deep Learning Frameworks Caffe* and Tensorflow* f...Performance Optimization of Deep Learning Frameworks Caffe* and Tensorflow* f...
Performance Optimization of Deep Learning Frameworks Caffe* and Tensorflow* f...Intel® Software
 
Next generation analytics with yarn, spark and graph lab
Next generation analytics with yarn, spark and graph labNext generation analytics with yarn, spark and graph lab
Next generation analytics with yarn, spark and graph labImpetus Technologies
 
Recent progress on distributing deep learning
Recent progress on distributing deep learningRecent progress on distributing deep learning
Recent progress on distributing deep learningViet-Trung TRAN
 
2014-10-20 Large-Scale Machine Learning with Apache Spark at Internet of Thin...
2014-10-20 Large-Scale Machine Learning with Apache Spark at Internet of Thin...2014-10-20 Large-Scale Machine Learning with Apache Spark at Internet of Thin...
2014-10-20 Large-Scale Machine Learning with Apache Spark at Internet of Thin...DB Tsai
 
Generalized Linear Models in Spark MLlib and SparkR
Generalized Linear Models in Spark MLlib and SparkRGeneralized Linear Models in Spark MLlib and SparkR
Generalized Linear Models in Spark MLlib and SparkRDatabricks
 
Convolutional Neural Networks at scale in Spark MLlib
Convolutional Neural Networks at scale in Spark MLlibConvolutional Neural Networks at scale in Spark MLlib
Convolutional Neural Networks at scale in Spark MLlibDataWorks Summit
 
Going Smart and Deep on Materials at ALCF
Going Smart and Deep on Materials at ALCFGoing Smart and Deep on Materials at ALCF
Going Smart and Deep on Materials at ALCFIan Foster
 
Big Linked Data Interlinking - ExtremeEarth Open Workshop
Big Linked Data Interlinking - ExtremeEarth Open WorkshopBig Linked Data Interlinking - ExtremeEarth Open Workshop
Big Linked Data Interlinking - ExtremeEarth Open WorkshopExtremeEarth
 

La actualidad más candente (20)

Scalable Algorithm Design with MapReduce
Scalable Algorithm Design with MapReduceScalable Algorithm Design with MapReduce
Scalable Algorithm Design with MapReduce
 
A time energy performance analysis of map reduce on heterogeneous systems wit...
A time energy performance analysis of map reduce on heterogeneous systems wit...A time energy performance analysis of map reduce on heterogeneous systems wit...
A time energy performance analysis of map reduce on heterogeneous systems wit...
 
대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화
대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화
대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화
 
Accumulo Summit 2015: Using D4M for rapid prototyping of analytics for Apache...
Accumulo Summit 2015: Using D4M for rapid prototyping of analytics for Apache...Accumulo Summit 2015: Using D4M for rapid prototyping of analytics for Apache...
Accumulo Summit 2015: Using D4M for rapid prototyping of analytics for Apache...
 
Optimization of Incremental Queries CloudMDE2015
Optimization of Incremental Queries CloudMDE2015Optimization of Incremental Queries CloudMDE2015
Optimization of Incremental Queries CloudMDE2015
 
Large Scale Machine Learning with Apache Spark
Large Scale Machine Learning with Apache SparkLarge Scale Machine Learning with Apache Spark
Large Scale Machine Learning with Apache Spark
 
2015-06-15 Large-Scale Elastic-Net Regularized Generalized Linear Models at S...
2015-06-15 Large-Scale Elastic-Net Regularized Generalized Linear Models at S...2015-06-15 Large-Scale Elastic-Net Regularized Generalized Linear Models at S...
2015-06-15 Large-Scale Elastic-Net Regularized Generalized Linear Models at S...
 
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
 
HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nod...
HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nod...HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nod...
HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nod...
 
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures
 
Performance Optimization of Deep Learning Frameworks Caffe* and Tensorflow* f...
Performance Optimization of Deep Learning Frameworks Caffe* and Tensorflow* f...Performance Optimization of Deep Learning Frameworks Caffe* and Tensorflow* f...
Performance Optimization of Deep Learning Frameworks Caffe* and Tensorflow* f...
 
Next generation analytics with yarn, spark and graph lab
Next generation analytics with yarn, spark and graph labNext generation analytics with yarn, spark and graph lab
Next generation analytics with yarn, spark and graph lab
 
Recent progress on distributing deep learning
Recent progress on distributing deep learningRecent progress on distributing deep learning
Recent progress on distributing deep learning
 
2014-10-20 Large-Scale Machine Learning with Apache Spark at Internet of Thin...
2014-10-20 Large-Scale Machine Learning with Apache Spark at Internet of Thin...2014-10-20 Large-Scale Machine Learning with Apache Spark at Internet of Thin...
2014-10-20 Large-Scale Machine Learning with Apache Spark at Internet of Thin...
 
Yarn spark next_gen_hadoop_8_jan_2014
Yarn spark next_gen_hadoop_8_jan_2014Yarn spark next_gen_hadoop_8_jan_2014
Yarn spark next_gen_hadoop_8_jan_2014
 
NGBT_poster_v0.4
NGBT_poster_v0.4NGBT_poster_v0.4
NGBT_poster_v0.4
 
Generalized Linear Models in Spark MLlib and SparkR
Generalized Linear Models in Spark MLlib and SparkRGeneralized Linear Models in Spark MLlib and SparkR
Generalized Linear Models in Spark MLlib and SparkR
 
Convolutional Neural Networks at scale in Spark MLlib
Convolutional Neural Networks at scale in Spark MLlibConvolutional Neural Networks at scale in Spark MLlib
Convolutional Neural Networks at scale in Spark MLlib
 
Going Smart and Deep on Materials at ALCF
Going Smart and Deep on Materials at ALCFGoing Smart and Deep on Materials at ALCF
Going Smart and Deep on Materials at ALCF
 
Big Linked Data Interlinking - ExtremeEarth Open Workshop
Big Linked Data Interlinking - ExtremeEarth Open WorkshopBig Linked Data Interlinking - ExtremeEarth Open Workshop
Big Linked Data Interlinking - ExtremeEarth Open Workshop
 

Destacado

Foster Computational Thinking
Foster Computational ThinkingFoster Computational Thinking
Foster Computational ThinkingIan Foster
 
测试驱动的前端开发初探
测试驱动的前端开发初探测试驱动的前端开发初探
测试驱动的前端开发初探hua qiu
 
Science as a Service: How On-Demand Computing can Accelerate Discovery
Science as a Service: How On-Demand Computing can Accelerate DiscoveryScience as a Service: How On-Demand Computing can Accelerate Discovery
Science as a Service: How On-Demand Computing can Accelerate DiscoveryIan Foster
 
Prueba
PruebaPrueba
Pruebaccpq
 
GENI Engineering Conference -- Ian Foster
GENI Engineering Conference -- Ian FosterGENI Engineering Conference -- Ian Foster
GENI Engineering Conference -- Ian FosterIan Foster
 
The Discovery Cloud: Accelerating Science via Outsourcing and Automation
The Discovery Cloud: Accelerating Science via Outsourcing and AutomationThe Discovery Cloud: Accelerating Science via Outsourcing and Automation
The Discovery Cloud: Accelerating Science via Outsourcing and AutomationIan Foster
 
Accelerating data-intensive science by outsourcing the mundane
Accelerating data-intensive science by outsourcing the mundaneAccelerating data-intensive science by outsourcing the mundane
Accelerating data-intensive science by outsourcing the mundaneIan Foster
 
Computing Outside The Box June 2009
Computing Outside The Box June 2009Computing Outside The Box June 2009
Computing Outside The Box June 2009Ian Foster
 
Taming Big Data!
Taming Big Data!Taming Big Data!
Taming Big Data!Ian Foster
 
Big data at experimental facilities
Big data at experimental facilitiesBig data at experimental facilities
Big data at experimental facilitiesIan Foster
 
So Long Computer Overlords
So Long Computer OverlordsSo Long Computer Overlords
So Long Computer OverlordsIan Foster
 
Computing Outside The Box
Computing Outside The BoxComputing Outside The Box
Computing Outside The BoxIan Foster
 
Streamlined data sharing and analysis to accelerate cancer research
Streamlined data sharing and analysis to accelerate cancer researchStreamlined data sharing and analysis to accelerate cancer research
Streamlined data sharing and analysis to accelerate cancer researchIan Foster
 
Accelerating Discovery via Science Services
Accelerating Discovery via Science ServicesAccelerating Discovery via Science Services
Accelerating Discovery via Science ServicesIan Foster
 
Grid Computing July 2009
Grid Computing July 2009Grid Computing July 2009
Grid Computing July 2009Ian Foster
 
Opportunities for X-Ray science in future computing architectures
Opportunities for X-Ray science in future computing architecturesOpportunities for X-Ray science in future computing architectures
Opportunities for X-Ray science in future computing architecturesIan Foster
 
Globus Auth: A Research Identity and Access Management Platform
Globus Auth: A Research Identity and Access Management PlatformGlobus Auth: A Research Identity and Access Management Platform
Globus Auth: A Research Identity and Access Management PlatformIan Foster
 

Destacado (17)

Foster Computational Thinking
Foster Computational ThinkingFoster Computational Thinking
Foster Computational Thinking
 
测试驱动的前端开发初探
测试驱动的前端开发初探测试驱动的前端开发初探
测试驱动的前端开发初探
 
Science as a Service: How On-Demand Computing can Accelerate Discovery
Science as a Service: How On-Demand Computing can Accelerate DiscoveryScience as a Service: How On-Demand Computing can Accelerate Discovery
Science as a Service: How On-Demand Computing can Accelerate Discovery
 
Prueba
PruebaPrueba
Prueba
 
GENI Engineering Conference -- Ian Foster
GENI Engineering Conference -- Ian FosterGENI Engineering Conference -- Ian Foster
GENI Engineering Conference -- Ian Foster
 
The Discovery Cloud: Accelerating Science via Outsourcing and Automation
The Discovery Cloud: Accelerating Science via Outsourcing and AutomationThe Discovery Cloud: Accelerating Science via Outsourcing and Automation
The Discovery Cloud: Accelerating Science via Outsourcing and Automation
 
Accelerating data-intensive science by outsourcing the mundane
Accelerating data-intensive science by outsourcing the mundaneAccelerating data-intensive science by outsourcing the mundane
Accelerating data-intensive science by outsourcing the mundane
 
Computing Outside The Box June 2009
Computing Outside The Box June 2009Computing Outside The Box June 2009
Computing Outside The Box June 2009
 
Taming Big Data!
Taming Big Data!Taming Big Data!
Taming Big Data!
 
Big data at experimental facilities
Big data at experimental facilitiesBig data at experimental facilities
Big data at experimental facilities
 
So Long Computer Overlords
So Long Computer OverlordsSo Long Computer Overlords
So Long Computer Overlords
 
Computing Outside The Box
Computing Outside The BoxComputing Outside The Box
Computing Outside The Box
 
Streamlined data sharing and analysis to accelerate cancer research
Streamlined data sharing and analysis to accelerate cancer researchStreamlined data sharing and analysis to accelerate cancer research
Streamlined data sharing and analysis to accelerate cancer research
 
Accelerating Discovery via Science Services
Accelerating Discovery via Science ServicesAccelerating Discovery via Science Services
Accelerating Discovery via Science Services
 
Grid Computing July 2009
Grid Computing July 2009Grid Computing July 2009
Grid Computing July 2009
 
Opportunities for X-Ray science in future computing architectures
Opportunities for X-Ray science in future computing architecturesOpportunities for X-Ray science in future computing architectures
Opportunities for X-Ray science in future computing architectures
 
Globus Auth: A Research Identity and Access Management Platform
Globus Auth: A Research Identity and Access Management PlatformGlobus Auth: A Research Identity and Access Management Platform
Globus Auth: A Research Identity and Access Management Platform
 

RAMSES: Robust Analytic Models for Science at Extreme Scales

  • 1. Gagan Agarwal (1*), Prasanna Balaprakash (2), Ian Foster (2*), Raj Kettimuthu (2), Sven Leyffer (2), Vitali Morozov (2), Todd Munson (2), Nagi Rao (3*), Saday Sadayappan (1), Brad Settlemyer (3), Brian Tierney (4*), Don Towsley (5*), Venkat Vishwanath (2), Yao Zhang (2). Affiliations: (1) Ohio State University; (2) Argonne National Laboratory; (3) Oak Ridge National Laboratory; (4) ESnet; (5) UMass Amherst. (* Co-PIs.) Advanced Scientific Computing Research. Program manager: Rich Carlson.
  • 2. Prediction, explanation, & optimization are challenging for even “simple” E2E workflows. Diagram labels: source data store; wide area network; destination data store. For example, file transfer, for which we want to: • Predict achievable throughput for a specific configuration • Explain factors influencing performance • Optimize parameter values to achieve high speeds
  • 3. Prediction, explanation, & optimization are challenging for even “simple” E2E workflows. Diagram labels: source data transfer node (Application, OS, FS stack, HBA/HCA, TCP, IP, NIC); LAN switch; router; wide area network; router; LAN switch; destination data transfer node (Application, OS, FS stack, HBA/HCA, TCP, IP, NIC); storage array; Lustre file system (OSS, OSS, MDS, MDS, OST, MDT). + diverse environments + diverse workloads + contention
  • 4. 85 Gbps sustained disk-to-disk over 100 Gbps network, Ottawa to New Orleans. Raj Kettimuthu and team, Argonne.
  • 5. High-speed transfers to/from AWS cloud, via Globus transfer service • UChicago to AWS S3 (US region): Sustained 2 Gbps – 2 GridFTP servers, GPFS file system at UChicago – Multi-part upload via 16 concurrent HTTP connections • AWS to AWS (same region): Sustained 5 Gbps. go#s3
  • 6. One Advanced Photon Source data node: 125 destinations
  • 7. Same node (1 Gbps link)
  • 8.
  • 9.
  • 10. How to create more accurate, useful, and portable models of such systems? Simple analytical model: T = α + β·l [startup cost + sustained bandwidth]; experiment + regression to estimate α, β. First-principles modeling to better capture details of system & application components. Data-driven modeling to learn unknown details of system & application components. Model composition. Model, data comparison.
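The "experiment + regression" step for the simple model T = α + β·l can be sketched as an ordinary least-squares fit. This is a minimal illustration, not the RAMSES tooling: the function name `fit_alpha_beta`, the reading of l as a transfer size in bytes, and the synthetic measurements (0.5 s startup, 1 Gbps sustained) are all assumptions made for the example.

```python
# Sketch: estimate alpha (startup cost) and beta (per-byte cost) in
# T = alpha + beta * l by ordinary least squares.
# The measurements below are synthetic placeholders, not RAMSES data.

def fit_alpha_beta(sizes, times):
    """Return (alpha, beta) minimizing sum((alpha + beta*l - T)**2)."""
    n = len(sizes)
    mean_l = sum(sizes) / n
    mean_t = sum(times) / n
    sxx = sum((l - mean_l) ** 2 for l in sizes)
    sxy = sum((l - mean_l) * (t - mean_t) for l, t in zip(sizes, times))
    beta = sxy / sxx
    alpha = mean_t - beta * mean_l
    return alpha, beta

# Hypothetical transfers: 0.5 s startup, 1 Gbps (1.25e8 bytes/s) sustained.
TRUE_ALPHA = 0.5
TRUE_BETA = 1.0 / 1.25e8
sizes = [1e6, 1e7, 1e8, 1e9]                      # transfer sizes l, in bytes
times = [TRUE_ALPHA + TRUE_BETA * l for l in sizes]  # noise-free for clarity

alpha, beta = fit_alpha_beta(sizes, times)
bandwidth_gbps = 8.0 / beta / 1e9                 # 1/beta is bytes per second
print(f"alpha = {alpha:.3f} s, sustained bandwidth = {bandwidth_gbps:.2f} Gbps")
```

With real measurements the fit would carry noise, and the slide's point is that such a two-parameter model leaves out many system details that first-principles and data-driven models aim to capture.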
  • 11. The RAMSES vision: To develop a new science of end-to-end analytical performance modeling that will transform understanding of the behavior of science workflows in extreme-scale science environments. Based on integration of first-principles and data-driven modeling, and a structured approach to model evaluation & composition.
  • 12. The RAMSES research agenda & platform. Modeling: develop, evaluate, and refine component and end-to-end models. Tools: develop easy-to-use tools to provide end-users with actionable advice. Estimation: develop and apply data-driven estimation methods: differential regression, surrogate models, etc. Experiments: extensive, automated experiments to test models & build database. Platform components: Evaluators, Advisor, Database, Estimators, Tester.
  • 13. We are informed by five challenge workflows. Transfer: High-performance, end-to-end file transfer. Scattering: Capture and analysis of diffuse scattering experimental data. MapReduce: Data-intensive, distributed data analytics. Exascale: Performance of exascale application kernels on memory hierarchies. In-situ: Configuration and placement of in-situ analysis computations.
  • 14. Transfer: End-to-end file movement. Diagram labels: source data transfer node (Application, OS, FS stack, HBA/HCA, TCP, IP, NIC); LAN switch; router; wide area network; router; LAN switch; destination data transfer node (Application, OS, FS stack, HBA/HCA, TCP, IP, NIC); storage array; Lustre file system (OSS, OSS, MDS, MDS, OST, MDT). Predict: Throughput for configuration. Explain: Factors influencing performance. Optimize: Parameters for high speeds.
  • 15. Scattering: Linking simulation and experiment to study disordered structures. Diffuse scattering images from Ray Osborn et al., Argonne. Diagram labels: sample; experimental scattering; material composition (La 60%, Sr 40%); simulated structure; simulated scattering; detect errors (secs to mins); select experiments (mins to hours); simulations driven by experiments (mins to days); contribute to knowledge base; knowledge base (past experiments; simulations; literature; expert knowledge); knowledge-driven decision making; evolutionary optimization.
  • 16. Immediate assessment of alignment quality in near-field high-energy diffraction microscopy. Workflow diagram labels: Detector; Dataset (360 files, 4 GB total); 1: Median calc, 75s (90% I/O), MedianImage.c, uses Swift/K; 2: Peak search, 15s per file, ImageProcessing.c, uses Swift/K; Reduced dataset (360 files, 5 MB total); 3: Convert bin L to N, 2 min for all files, convert files to network-endian format; 3: Generate parameters, FOP.c, 50 tasks, 25s/task, ¼ CPU hours, uses Swift/K; 4: Analysis pass, FitOrientation.c, 60s/task (PC) 1667 CPU hours, 60s/task (BG/Q) 1667 CPU hours, uses Swift/T; feedback to experiment; Blue Gene/Q; Orthros (all data in NFS); GO Transfer; ssh; Globus Catalog; scientific metadata; workflow progress; workflow control script; Bash; manual; before; after. Up to 2.2 M CPU hours per week! This is a single workflow. Hemant Sharma, Justin Wozniak, Mike Wilde, Jon Almer.
  • 17. MapReduce: Distributing data and computation for data analytics. Diagram labels: local cluster (master, slaves, data; job assignment; local reduction); cloud environment (master, slaves, data; job assignment; local reduction); index; remote data analysis; job assignment; global reduction.
  • 18. Exascale simulation. Images courtesy: Joseph Insley (Argonne). HACC Cosmology: • Compute-intensive phase with regular stride-one access • Tree walk phase: irregular memory access with high branching and integer ops • 3D FFT communication-intensive phase • I/O phase. Nek5000 CFD: • Matrix-vector product phase • Conjugate gradient iteration • Communication phase involving nearest-neighbor exchange and vector reductions
  • 19. In situ analysis on the DOE Leadership Compute Resource (multi-petaflop; high-radix interconnect: Dragonfly, 5D torus). Diagram labels: computing infrastructure; I/O nodes; switch complex; analysis nodes/cluster (IB); file server nodes; storage system; DTN nodes; 1536 GB/s. We need to perform the right computation at the right place and time, taking into account details of the simulation, resources, and analysis.
  • 20. A diverse set of components. Table columns (components): Server; Parallel computer; Router; Storage system; LAN; WAN; TCP, UDT; GridFTP; File systems; GridFTP server; NECbone; HACCbone; Checksum; Encryption; MapReduce; Other apps. Table rows (workflows), each marked Y for the components it involves: Transfer (11 components); Scattering (8); Exascale (6); Distributed MapReduce (9); In-Situ (8).
  • 21. Develop, evaluate, and refine component and end-to-end models: • Models from the literature • Fluid models for network flows • SKOPE modeling system. Develop and apply data-driven estimation methods: • Differential regression • Surrogate models • Other methods from literature. Develop easy-to-use tools to provide end-users with actionable advice: • Runtime advisor, integrated with Globus transfer system. Automated experiments to test models and build database: • Experiment design • Testbeds
  • 22. SKOPE performance modeling framework: overview. Diagram labels: Input; Output; Front end; Back end; workload input; source code; code skeletons (SKOPE language); user effort (semi-automated with a source-to-source translator); automatic; parser; per-function intermediate representation (Block Skeleton Trees); behavior modeling engine; execution-based intermediate representation (Bayesian execution tree); transformation engine; transformed Bayesian execution tree; characterization engine; hardware model (system specifications); performance projection; bottleneck analysis; schema for suggested transformations; synthesized characteristics.
  • 23. Differential regression for combining data from different sources. Example of use: predict performance on a connection length L not realizable on physical infrastructure, e.g., IB-RDMA or HTCP throughput on a 900-mile connection.
      1) Make multiple measurements of performance on path lengths d: $M_S(d)$: OPNET simulation; $M_E(d)$: ANUE-emulated path; $M_U(d_i)$: real network (USN).
      2) Compute measurement regressions on d: $\dot{M}_A(\cdot)$, $A \in \{S, E, U\}$.
      3) Compute differential regressions: $\Delta\dot{M}_{A,B}(\cdot) = \dot{M}_A(\cdot) - \dot{M}_B(\cdot)$, $A, B \in \{S, E, U\}$.
      4) Apply differential regression to obtain estimates, $C \in \{S, E\}$: $\mathcal{M}_U(d) = M_C(d) - \Delta\dot{M}_{C,U}(d)$, where $M_C(d)$ is the simulated/emulated measurement and $\Delta\dot{M}_{C,U}(d)$ is the point regression estimate.
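The four differential-regression steps can be sketched in a few lines. This is an illustration under stated assumptions only: the regressions are taken to be straight lines in d (the slide does not specify the regression family), the helper `linear_fit` and all throughput numbers are hypothetical, and only the simulation source S is used as C.

```python
# Sketch of differential regression (steps 1-4 on the slide).
# Assumptions: linear regressions in d and synthetic throughput values.

def linear_fit(xs, ys):
    """Least-squares line through (xs, ys); returns a function d -> fitted value."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda d: intercept + slope * d

# Step 1: measurements (Gbps) versus path length d (miles); all hypothetical.
sim_d = [100, 300, 500, 700, 900]      # simulation can reach 900 miles
sim_m = [9.6, 8.8, 8.0, 7.2, 6.4]      # M_S(d)
real_d = [100, 300, 500]               # real network: shorter paths only
real_m = [8.5, 7.5, 6.5]               # M_U(d_i)

# Step 2: measurement regressions on d.
reg_sim = linear_fit(sim_d, sim_m)
reg_real = linear_fit(real_d, real_m)

# Step 3: differential regression between simulation and real network.
def delta_sim_real(d):
    return reg_sim(d) - reg_real(d)

# Step 4: estimate real-network performance at a length with no real measurement.
target = 900
estimate = sim_m[sim_d.index(target)] - delta_sim_real(target)
print(f"Estimated real-network throughput at {target} miles: {estimate:.2f} Gbps")
```

The estimate starts from the simulated measurement at the target length and removes the regressed gap between simulation and reality, so any systematic bias of the simulator that the regressions capture is subtracted out.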
  • 24. We will extend the differential regression method in several areas • To compare different component models – E.g., different models of network elements, storage systems, protocol implementations • To compare different composite models – E.g., different methods for combining memory and CPU models • To compare model outputs with measurements 24
  • 25. Component model. Diagram labels: component $i$; system parameters $p_i$; task size parameters $s_i$; cost terms; performance quality model; experiment design (active learning); analytical and empirical models. $\hat{Q}_i(p_i, s_i)$ is a regression estimate of
  • 26. End-to-end profile composition. Diagram labels: source LAN profile (configuration for host and edge devices); WAN profile (configuration for WAN devices); destination LAN profile (configuration for host and edge devices); composition operations.
  • 27. End-to-end model composition & analysis
    • End-to-end model using composition
      – It is an approximation, because component interactions are not modelled by the composition operator
    • Actual end-to-end performance model
      – Component models are “corrected” to account for un-modelled effects; this form is assumed to exist
  • 28. Using end-to-end measurements and differential regression to correct regression estimates
    • Regression estimate of composed model: $\hat{Q}(p,s)$
      – “Estimated”, since component models are “incomplete” as derived from first principles and/or measurements
    • Error due to regression estimate: $\left[ Q_{p,s} - \hat{Q}(p,s) \right]^2$
    • Error can be mitigated using measurements. Corrected estimate of $Q_{p,s}$: $\tilde{Q}(p,s) = \hat{Q}(p,s) + \hat{\Delta}(p,s)$, where $\hat{Q}(p,s)$ is the analytical model and $\hat{\Delta}(p,s)$ is the correction from differential regression using measurements
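The correction step can be illustrated with a small Python sketch. It assumes a toy analytical model and a linear form for the correction term, fitted by least squares to the residuals between end-to-end measurements and the model. The model `q_hat`, the feature choice in `fit_correction`, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def q_hat(p, s):
    """Hypothetical composed analytical model: startup cost plus size / rate."""
    return 0.5 + s / p

def fit_correction(p, s, q_measured):
    """Least-squares fit of Delta_hat(p, s) = a + b*s + c/p to the residuals
    between end-to-end measurements and the analytical estimate."""
    residual = q_measured - q_hat(p, s)
    features = np.column_stack([np.ones_like(s), s, 1.0 / p])
    coef, *_ = np.linalg.lstsq(features, residual, rcond=None)
    return lambda p_, s_: coef[0] + coef[1] * s_ + coef[2] / p_

# Synthetic end-to-end "measurements" (not from the slides): the true system
# has a fixed overhead and a per-unit-size cost that the analytical model omits.
p = np.array([1.0, 2.0, 4.0, 8.0, 1.0, 2.0, 4.0, 8.0])
s = np.array([10.0, 10.0, 10.0, 10.0, 100.0, 100.0, 100.0, 100.0])
q_measured = q_hat(p, s) + 0.2 + 0.01 * s

delta_hat = fit_correction(p, s, q_measured)
q_corrected = q_hat(p, s) + delta_hat(p, s)   # corrected estimate: Q_hat + Delta_hat

mse_uncorrected = np.mean((q_measured - q_hat(p, s)) ** 2)
mse_corrected = np.mean((q_measured - q_corrected) ** 2)
print(f"MSE before correction: {mse_uncorrected:.3f}, after: {mse_corrected:.2e}")
```

In this toy case the un-modelled effect lies exactly in the span of the correction features, so the corrected error is near zero; with real measurements the residual error would depend on the function class chosen for the correction.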
  • 29. Performance guarantees
    • Vapnik-Chervonenkis theory: under finite VC-dim($F$),
      $P\left\{ I(\hat{\Delta}, \hat{Q}, p) - I(\Delta^*, \hat{Q}, p) > \epsilon \right\} < \delta(F, l, \epsilon)$
      where $\hat{\Delta}$ is the estimated correction and $\Delta^*$ is the optimal correction
      – Guarantees that the error of the regression estimate is close to optimal with a certain probability
      – Distribution-free: does not require detailed knowledge of error distributions
      – Uses end-to-end measurements
    • Error of the corrected estimate:
      $I(\Delta, \hat{Q}, p) = \int \left[ Q_{p,s} - \hat{Q}(p,s) - \Delta(p,s) \right]^2 \, dP_{Q_{p,s}}$
  • 30. Surrogate modeling framework to inform choice of experiments. Diagram labels: machine learning & optimization; performance metrics; informative configurations; first-principles models; evaluation.
  • 31. Fluid models of network flows
    • GridFTP flow $i$: parallelism $k_i$, RTT $R_i$, throughput $T_i$
    • Bottleneck router: capacity $C$, loss rate $p$, queue $Q$
    • Coupled differential equations give $dT_i/dt$ for each flow (in terms of $k_i$, $R_i$, $T_i(t)$, and $p(t)$) and $dQ/dt$ for the router queue (in terms of $C$ and the flow throughputs $T_j$)
    • Solve for throughputs and transfer delays
    • Special case: known $p$
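As an illustration of the "known p" special case, the sketch below integrates a common textbook AIMD fluid approximation for a flow made of $k_i$ parallel TCP streams. The slide's exact equations are not recoverable, so the form used here, $dT_i/dt = k_i/R_i^2 - p\,T_i^2/(2k_i)$ with throughput in packets per second, is an assumption rather than the authors' model. All parameter values and function names are hypothetical.

```python
import numpy as np

def steady_state_throughput(k, rtt, p):
    """Fixed point of the assumed fluid model: T = (k / R) * sqrt(2 / p)."""
    return (k / rtt) * np.sqrt(2.0 / p)

def simulate_throughput(k, rtt, p, steps=6000, dt=0.01):
    """Forward-Euler integration of dT/dt = k/R^2 - p*T^2/(2k) from T = 0.

    k   : number of parallel streams per flow
    rtt : round-trip time per flow, in seconds
    p   : known, constant packet loss rate at the bottleneck
    """
    throughput = np.zeros_like(k, dtype=float)
    for _ in range(steps):
        growth = k / rtt**2                       # additive increase of k streams
        backoff = p * throughput**2 / (2.0 * k)   # multiplicative decrease on loss
        throughput = throughput + dt * (growth - backoff)
    return throughput

# Two hypothetical GridFTP flows sharing a path with loss rate 1e-4.
k = np.array([4.0, 8.0])        # parallelism
rtt = np.array([0.05, 0.10])    # seconds
p = 1e-4

simulated = simulate_throughput(k, rtt, p)
analytic = steady_state_throughput(k, rtt, p)
print("simulated (pkts/s):", simulated)
print("analytic  (pkts/s):", analytic)
```

Under this assumed model, throughput scales linearly with parallelism and inversely with RTT, which is one reason parallel streams help on long paths. Treating $p$ as known leaves out the queue dynamics, which a full model would couple back into the loss rate.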
  • 32. Our multi-modal approach. Diagram labels: performance projections; model composition; system models (current or future); application behavior models; analytical models; regression models; experiments; historical logs; emulators; simulators; benchmarks; code skeletons; SKOPE language; workload parameters; source code; SKOPE.
  • 33. Application to file transfer. Diagram labels: file transfer performance projections; model composition; system models (storage, TCP, WAN); application behavior; analytical models; regression models; experiments; historical logs; emulators; code skeletons; SKOPE language; workload parameters; source code; SKOPE models; iperf; GridFTP; XDD.
  • 34. Application to exascale simulation. Diagram labels: exascale simulation performance projections; model composition; system models (compute, memory, interconnect); application behavior; analytical models; regression models; experiments; historical logs; code skeletons; SKOPE language; workload parameters; source code; SKOPE; MPI benchmarks; Stream; DGEMM; IOR.

    Listing 1: MatMul's CPU code

        float A[N][K], B[K][M];
        float C[N][M];
        int i, j, k;
        for (i = 0; i < N; ++i) {
          for (j = 0; j < M; ++j) {
            float sum = 0;
            for (k = 0; k < K; ++k) {
              sum += A[i][k] * B[k][j];
            }
            C[i][j] = sum;
          }
        }

    Listing 2: MatMul's code skeleton

        float A[N][K]
        float B[K][M]
        float C[N][M]
        /* the loop space */
        parallel_for (N, M) : i, j
        {
          /* computation w/t
           * instruction count */
          comp 1
          /* streaming loop */
          stream k = 0:K {
            /* load */
            ld A[i][k]
            ld B[k][j]
            comp 3
          }
          comp 5
          /* store */
          st C[i][j]
        }

    Excerpt fragments (text cropped on the slide): [...] corresponding CPU [...] of a code skeleton is introduced in the comment [...] is not discussed in further [...] The following informat[...] a computational kernel. Data parallelism: [...] homogeneous tasks repeated [...] express data parallelism [...] the innermost parallel [...] A task corresponds [...] for loop. It is expressed [...] computation. Data accesses are [...] operations. The accessed indices, array sizes, and [...] be expressed as well; [...] are random unless users [...] and Listing 6).
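One way to read the code skeleton in Listing 2 is as a compact description of per-task work that can be multiplied out over the loop space. The Python sketch below does that arithmetic, assuming that `comp n` stands for n instructions and that each `ld`/`st` is one memory access; the function name and this interpretation are assumptions, not part of the SKOPE tooling.

```python
def matmul_skeleton_counts(n, m, k):
    """Total work implied by the MatMul skeleton for an N x K by K x M product.

    Per (i, j) task: comp 1, then K iterations of (2 loads + comp 3),
    then comp 5 and 1 store.
    """
    tasks = n * m                           # size of the parallel_for loop space
    instructions = tasks * (1 + 3 * k + 5)
    loads = tasks * 2 * k
    stores = tasks
    return {
        "tasks": tasks,
        "instructions": instructions,
        "loads": loads,
        "stores": stores,
        # Instructions per memory access, a rough compute-intensity indicator.
        "instructions_per_access": instructions / (loads + stores),
    }

counts = matmul_skeleton_counts(n=1024, m=1024, k=1024)
for name, value in counts.items():
    print(f"{name}: {value}")
```

Counts like these are the kind of input a hardware model could turn into a time projection, for example by dividing by an assumed instruction rate and memory bandwidth.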
  • 35. A performance database
    • We aim to collect instrumentation data in a central database to simplify model validation
    • We plan to use the perfSONAR measurement archive tool as a starting point
      – REST API on top of Cassandra and Postgres
      – Optimized for time series data
      – Will extend as needed
      – http://software.es.net/esmond/
  • 36. Application to transfer optimization. Diagram components: performance predictor; parameter database; performance analyst; model refiner; user feedback agent; Globus. Numbered flows: (1) transfer service description; (2) prediction; (3) transfer performance; (4) user feedback. Other edge labels: prediction; analysis; parameter update.
  • 37. Summary
    • We focus on the science of modeling: integration of first-principles and data-driven models; model composition and evaluation
    • Our challenge applications span a broad spectrum of DOE resources and disciplines
    • We see big opportunities for cooperation, e.g., on development and evaluation of component models
  • 38. Thanks, and for more information
    • Thanks to our sponsors: Advanced Scientific Computing Research (program manager: Rich Carlson)
    • Thanks to my RAMSES project co-participants
    • For more information, please see https://sites.google.com/site/ramsesdoeproject/, ianfoster.org, and @ianfoster

Editor's notes

  1. Yes. The entire namespace is stored on Lustre Metadata Servers (MDSs); file data is stored on Lustre Object Storage Servers (OSSs). Note that unlike many block-based clustered filesystems where the MDS is still in charge of block allocation, the Lustre MDS is not involved in file IO in any manner and is not a source of contention for file IO. The data for each file may reside in multiple objects on separate servers. Lustre 1.x manages these objects in a RAID-0 (striping) configuration, so each object in a multi-object file contains only a part of the file's data. Future versions of Lustre will allow the user or administrator to choose other striping methods, such as RAID-1 or RAID-5 redundancy. What is the difference between an OST and an OSS? As the architecture has evolved, we refined these terms. An Object Storage Server (OSS) is a server node, running the Lustre software stack. It has one or more network interfaces and usually one or more disks. An Object Storage Target (OST) is an interface to a single exported backend volume. It is conceptually similar to an NFS export, except that an OST does not contain a whole namespace, but rather file system objects.
  2. Combine simulation, emulation, and experiment: differential regression. First-principles models. Machine learning to
  3. Q: I can’t work out why the bottom two images are dimmed: some configuration option? Or, how to create nice oval around first.
  5. “Most of materials science is bottlenecked by disordered structures”—Littlewood. Solve inverse problem. How do we make this sort of application routine? Allow thousands—millions?—to contribute to the knowledge base. Challenge: takes months to do a single loop through cycle. Just as important, it is an incredibly labor intensive and expensive process.
  6. DS, NF-HEDM, FF-HEDM, and PD workflows operational. Catalog integrated into workflow; supports rich user interface. Workflows use large-scale compute resources outside of APS. Data publication service demonstrated. Parallel algorithms for 3-D image reconstruction, structure determination, etc. Globus Galaxies platform integrated with Swift for scalability.
  7. HACC: The short force evaluation kernel is compute intensive, with regular stride-one memory accesses. This kernel can be fully vectorized and/or threaded. The tree walk phase has essentially irregular, indirect memory accesses and a very high number of branch and integer operations. The 3D FFT phase is implemented with point-to-point communication operations and is executed only every long time step, which significantly reduces the overall communication complexity of the code. NEKBONE KERNEL: The Nekbone Kernel is a single-core code focused on the matrix-vector product at the heart of the spectral element method. The code allows for analysis and optimization of the performance of the matrix-vector product kernel, which is recast as a set of computationally intense matrix-matrix products with relatively low operation count and minimal data movement. NEKBONE: The Nekbone mini-app allows users to study the computationally intense linear solvers that account for a large percentage of the more intricate Nek5000 software, as well as the communication costs required for nearest-neighbor data exchanges and vector reductions. Nekbone embeds the nekbone_kernel in a conjugate gradient iteration to solve the 3D Poisson equation. Preconditioning in the current version is based on diagonal scaling, which allows for simpler code than the full multigrid structure found in Nek5000. Nekbone has been created to be easily adapted to different platforms, communication structures, and scalability studies.