www.eudat.eu
EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065
Introduction to HPC
Programming Models
Stefano Markidis
KTH, Sweden
Supercomputing - I
Use of computer simulation as a tool
for greater understanding of the real
world
Complements experimentation
and theory
Problems are increasingly
computationally challenging
Large parallel machines needed
to perform calculations
Critical to leverage parallelism in
all phases
Data access is a huge challenge
Using parallelism to obtain
performance
Finding usable, efficient, portable
I/O interfaces
Millennium project
Thermal hydraulics with Nek5000
Parallel Machines
[Figure: four cores (c) with cache ($) and DRAM]
Your First Parallel Machine = Your Laptop
Parallel Machines
[Figure: office workstation with two groups of four cores (c), each with cache ($), and DRAM]
Parallel Machines
[Figure: computing node with four groups of four cores (c), each with cache ($), plus DRAM and a NIC]
[Figure: many computing nodes, each with cores (c), cache ($), DRAM and a NIC]
HPC I/O System is also rather complex…
An HPC I/O system is attached to the supercomputer
The HPC I/O system is a supercomputer itself
Commodity
network primarily
carries storage traffic
Enterprise storage
controllers and large racks
of disks connected via
Storage nodes run
parallel file system
software and manage
Gateway nodes run
parallel file system client
software and forward I/O
Ethernet: 10 Gbit/sec
InfiniBand: 16 Gbit/sec
BG/P Tree: 6.8 Gbit/sec
Serial ATA: 3.0 Gbit/sec
HW bottleneck is here. Controllers can manage only 4.6 Gbyte/sec.
Peak I/O system bandwidth is 78.2 Gbyte/sec.
Architectural diagram of 557 TF Argonne Leadership Computing Facility Blue Gene/P I/O system
Supercomputing – II
Most modern supercomputer
hardware is built following two
principles:
Use of commodity hardware:
Intel CPUs, AMD CPUs, DDR4,
NVIDIA GPUs …
Use of parallelism to achieve
very high performance
The file systems connected to
computers are built in the same
way
Gather large numbers of
storage devices: HDDs, SSDs
Connect them together in
parallel to create a high-bandwidth,
high-capacity storage device.
Largest Supercomputers
https://www.top500.org/
Largest HPC IO Systems
https://www.vi4io.org/hpsl/2017/start
This is where
Big Data starts for HPC
Supercomputing - III
Supercomputing, n. [ sˌuːpəkəmpjˈuːtə]
A special branch of scientific computing
that turns a computation-bound problem
into an I/O-bound problem.
Why is that? I/O vs Compute Performance
Disk Access Rates over Time
HPC Programming Models
Programming models are an abstraction of parallel
computer architectures
To express algorithms conveniently without
focusing on the details of the underlying hardware
To remove complexity of architecture when
designing algorithms
To allow for high-performance
implementations
Two HPC Programming Models for Supercomputers
Problem: move value of a from p0 to p1 and then p2
Message-Passing: explicit send and receive operations (explicit
communication)
[Figure: p0, p1 and p2 start with a=12, a=77 and a=32; the value 12 is sent from p0 to p1 and then from p1 to p2, leaving a=12 on each]
PGAS: access global memory that is physically distributed (implicit
communication)
Get/Put are load/store to global memory
[Figure: p0, p1 and p2 hold a(1)=12, a(2)=77 and a(3)=32 of a global array a(1) a(2) a(3); the value 12 is stored into a(2) and then into a(3)]
How do you program a supercomputer ?
99% of the codes for supercomputers are written in
Fortran (including Fortran77) and C/C++
Other languages supporting multithreading for
on-node parallelism (Python, Java, …)
99% of the large HPC codes use MPI libraries (MP
programming model)
Used to move data from one computing node to
another but also used for on-node parallelism
Data-analytics frameworks for supercomputers
often use MPI as transport layers
MPI
MPI = standardized specification document for a
Message Passing library to support parallel computing in
C/C++ and Fortran.
Portability
High-Performance
Two main implementations:
MPICH and OpenMPI (you can install on your laptop)
Supercomputer vendors provide highly-tuned
implementations of these two.
Only four fundamental functions: MPI_Init,
MPI_Finalize, MPI_Send, MPI_Recv
Other functions are collective and involve all the
communicating processes, e.g. broadcast, scatter, …
Includes RDMA operations (one-sided), also streaming
models built atop
Simple MPI code
What is Parallel I/O?
At the program level:
Concurrent reads or writes from multiple
processes to a common file
At the system level:
A parallel file system and hardware that support
such concurrent access
Three strategies of I/O in HPC:
Spokesperson
Multiple writers multiple files
Cooperative
Spokesperson
One process performs the I/O
Easy to program
It doesn’t scale
Shared File
Multiple writers multiple files
All the processes write to individual files
Might be limited by the file system
Easy to program
It doesn’t scale
Number of files creates bottleneck with metadata operations
Number of simultaneous disk accesses creates contention for
file system resources
Cooperative Parallel I/O (Real Parallel IO)
Multiple processes write to a shared file, potentially in
a non-contiguous way
Truly IO-parallel
EUDAT Summer School, 3-7 July 2017, Crete
HPC I/O Software Stack
Applications (Weather Forecast, CFD, Astrophysics …)
High-Level I/O Libraries: HDF5, NetCDF, SionLib, ADIOS
I/O Middleware: MPI I/O
I/O Forwarding: CIOD/DVS
Parallel File System: Lustre, GPFS, …
I/O Hardware
Knowing about these layers allows you to optimize the higher levels of the
software stack
MPI I/O
Why Parallel I/O in MPI?
Writing is like sending and reading is like receiving.
Any parallel I/O system will need:
collective operations, communicators, …
Why do I/O in MPI?
Why not just POSIX?
Parallel performance
Single file (instead of one file / process)
MPI has replacement functions for POSIX I/O
Provides migration path
Multiple styles of I/O can all be expressed in MPI
Including some that cannot be expressed without
MPI
MPI I/O: the basics
I/O operations on unformatted binary files, similar to POSIX
read and write; there is no fwrite or fread.
Just like POSIX I/O, you need to
Open the file
Read or Write data to the file
Close the file
In MPI, these steps are almost the same:
Open the file: MPI_File_open
Write to the file: MPI_File_write
Close the file: MPI_File_close
An example of MPI I/O
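#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    MPI_File fh;
    int buf[1000], rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Fill the buffer so that defined values are written */
    for (int i = 0; i < 1000; i++)
        buf[i] = i;

    /* Collective call: every process in the communicator opens the file */
    MPI_File_open(MPI_COMM_WORLD, "test.out",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY,
                  MPI_INFO_NULL, &fh);

    /* Independent call: only rank 0 writes its 1000 integers */
    if (rank == 0)
        MPI_File_write(fh, buf, 1000, MPI_INT, MPI_STATUS_IGNORE);

    /* Collective call: every process closes the file */
    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}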
Example code to write to a shared file
High-Level Parallel Libraries
Provide structure to files
Well-defined, portable formats
Self-describing
APIs more appropriate for computational science
Typed data
Noncontiguous regions in memory and file
Interfaces are implemented on top of MPI-IO
HDF5
Most used high-level library in scientific codes
HDF5 = Hierarchical Data Format
HDF5 is three things:
Data model: container, data set, group and link
Library: support for parallel I/O operations
File format: hierarchical data organization in a
single file; typed, multidimensional array storage;
attributes on dataset, data
What about PGAS I/O ?
Effort in designing PGAS-like
programming systems for I/O
operations
Different parts of a shared
file are virtually mapped to
a global memory space,
that is accessible by all
processes, think about
mmap for instance
To write to disk, make a
store to global memory
To read from disk, make a
load from the global
memory
I/O system is becoming very
heterogeneous so it is good to
have a unique flat global
“memory” space to hide this
architectural complexity
[Figure: a user write a(5) = 8.7 goes to a global “memory” a(0) … a(6), which is mapped onto File 1 and File 2]
Conclusions
Supercomputers consist of several computing
nodes connected by a high-performance network
Programming models abstract supercomputer
hardware to allow for efficient implementation of
algorithms
MPI, C/C++ and Fortran are dominant
MPI I/O provides means for real parallel I/O
HDF5 is the most famous data format, library and data
model in HPC
PGAS I/O might be a viable option
Thank you!
Acknowledgements
These slides are largely based on and adapted from:
- “Parallel I/O in Practice” by Rob Ross
https://www.nersc.gov/assets/Training/pio-in-practice-sc12.pdf
- “Short introduktion on Optimizing I/O” by Cray
https://www.pdc.kth.se/education/course-resources/introduction-to-cray-xc30-xc40/feb-2015/05_Short_Intro_Optimizing-IO.pdf
- “Lecture 32: Introduction to MPI I/O” by Bill Gropp
http://wgropp.cs.illinois.edu/courses/cs598-s16/lectures/lecture32.pdf

Introduction to HPC Programming Models - EUDAT Summer School (Stefano Markidis, KTH)

  • 1. www.eudat.eu EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065 Introduction to HPC Programming Models Stefano Markidis KTH, Sweden
  • 2. Supercomputing - I Use of computer simulation as a tool for greater understanding of the real world Complements experimentation and theory Problems are increasingly computationally challenging Large parallel machines needed to perform calculations Critical to leverage parallelism in all phases Data access is a huge challenge Using parallelism to obtain performance Finding usable, efficient, portable I/O interfaces millenium project Thermal hydraulics with Nek500
  • 3. Parallel Machines c c c c 💲💲 DRAM Your First Parallel Machine = Your Laptop
  • 4. Parallel Machines c c c c 💲💲 DRAM Office Workstation c c c c 💲💲
  • 5. Parallel Machines Computing Node c c c c 💲💲 DRAM c c c c 💲💲 c c c c 💲💲 c c c c 💲💲 NIC
  • 6. c c c c 💲💲 DRAM c c c c 💲💲 c c c c 💲💲 c c c c 💲💲 NIC c c c c 💲💲 DRAM c c c c 💲💲 c c c c 💲💲 c c c c 💲💲 NIC c c c c 💲💲 c c c c 💲💲 c c c c 💲💲 c c c c 💲💲 c c c c 💲💲 c c c c 💲💲 c c c c 💲💲 c c c c c c c c c c c c c c c c c c c c
  • 7. HPC I/O System is also rather complex… An HPC I/O system is attached to supercomputer The HPC I/O system is a supercomputer itself Commodity network primarily carries storage traffic Enterprise storage controllers and large racks of disks connected via Storage nodes run parallel file system software and manage Gateway nodes run parallel file system client software and forward I/O Ethernet 10 Gbit/sec InfiniBand 16 Gbit/sec BG/P Tree 6.8 Gbit/sec Serial ATA 3.0 Gbit/sec HW bottleneck is here. Controllers can manage only 4.6 Gbyte/sec. Peak I/O system bandwidth is 78.2 Gbyte/sec. Architectural diagram of 557 TF Argonne Leadership Computing Facility Blue Gene/P I/O system
  • 8. Supercomputing – II Most of modern supercomputer hardware are built following two principles: use of commodity hardware: Intel CPUs, AMD CPUs, DDR4, NVIDIA GPU … Using parallelism to achieve very high performance The file systems connected to computers are built in the same way Gather large numbers of storage device: HDDs, SSDs Connect them together in parallel to create a high bandwidth, high capacity storage device.
  • 10. Largest HPC IO Systems https://www.vi4io.org/hpsl/2017/start This is where Big Data starts for HPC
  • 11. Supercomputing - III Supercomputing, n. [ sˌuːpəkəmpjˈuːtə] A special branch of scientific computing that turns a computation-bound problem into an I/O-bound problem.
  • 12. Why is that ? I/O vs Compute Performance Disk AccessRates over Time
  • 13. HPC Programming Models Programming models are an abstraction of parallel computer architectures To express conveniently algorithms without focusing on the details of underlying hardware To remove complexity of architecture when designing algorithms To allow for high-performance implementations
  • 14. Two HPC Programming Models for Supercomputers p0 p1 p2 a=12 a=77 a=32 a=12 a=12 12 12 Message-Passing: explicit send and receive operations (explicit communication) p0 p1 p2 a(1) = 12 a(2) = 77 a(3) = 32 a(1) a(2) a(3) a(2) = 12 a(2) = 12 a(3) = 12 a(3) = 12 PGAS: access global memory that is physically distributed (implicit communication) Get/Put are load/store to global memory Problem: move value of a from p0, to p1 and then p2
  • 15. How do you program a supercomputer? 99% of the codes for supercomputers are written in Fortran (including Fortran77) and C/C++. Other languages support multithreading for on-node parallelism (Python, Java, …). 99% of the large HPC codes use MPI libraries (the message-passing programming model). MPI is used to move data from one computing node to another, but also for on-node parallelism. Data-analytics frameworks for supercomputers often use MPI as a transport layer.
  • 16. MPI. MPI = standardized specification document for a message-passing library to support parallel computing in C/C++ and Fortran. Goals: portability and high performance. Two main implementations: MPICH and Open MPI (you can install them on your laptop). Supercomputer vendors provide highly tuned implementations of these two. Only four fundamental functions: MPI_Init, MPI_Finalize, MPI_Send, MPI_Recv. Collective functions involve all the communicating processes, e.g. broadcast, scatter, … MPI also includes RDMA (one-sided) operations, and streaming models have been built atop it.
  • 18. What is Parallel I/O? At the program level: concurrent reads or writes from multiple processes to a common file. At the system level: a parallel file system and hardware that support such concurrent access. Three strategies of I/O in HPC: (1) spokesperson, (2) multiple writers, multiple files, (3) cooperative.
  • 19. Spokesperson. One process performs the I/O. Easy to program, but it doesn’t scale. [Figure label: Shared File.]
  • 20. Multiple writers, multiple files. All the processes write to individual files. Might be limited by the file system. Easy to program, but it doesn’t scale: the number of files creates a bottleneck with metadata operations, and the number of simultaneous disk accesses creates contention for file system resources.
  • 21. Cooperative Parallel I/O (real parallel I/O). Multiple processes write to a shared file, potentially in a non-contiguous way. Truly I/O-parallel.
  • 22. EUDAT Summer School, 3-7 July 2017, Crete. HPC I/O Software Stack, from top to bottom: Applications (Weather Forecast, CFD, Astrophysics, …); High-Level I/O Libraries (HDF5, NetCDF, SionLib, ADIOS); I/O Middleware (MPI I/O); I/O Forwarding (CIOD/DVS); Parallel File System (Lustre, GPFS, …); I/O Hardware. Knowing about this allows you to optimize the higher levels of the software stack.
  • 23. MPI I/O. Why parallel I/O in MPI? Writing is like sending and reading is like receiving. Any parallel I/O system will need collective operations, communicators, … Why do I/O in MPI? Why not just POSIX? Parallel performance, and a single file (instead of one file per process). MPI has replacement functions for POSIX I/O, which provides a migration path. Multiple styles of I/O can all be expressed in MPI, including some that cannot be expressed without MPI.
  • 24. MPI I/O: the basics. MPI I/O operates on unformatted binary files; its calls are similar to read and write, and there is no fwrite nor fread. Just like with POSIX I/O, you need to (1) open the file, (2) read or write data to the file, (3) close the file. In MPI, these steps are almost the same: open the file with MPI_File_open, write to the file with MPI_File_write, close the file with MPI_File_close.
  • 25. An example of MPI I/O: example code to write to a shared file.

        #include <stdio.h>
        #include "mpi.h"

        int main(int argc, char *argv[])
        {
            MPI_File fh;
            int buf[1000], rank;

            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            /* Fill the buffer so that defined values are written. */
            for (int i = 0; i < 1000; i++)
                buf[i] = i;

            /* Collective call: every rank in the communicator opens the file. */
            MPI_File_open(MPI_COMM_WORLD, "test.out",
                          MPI_MODE_CREATE | MPI_MODE_WRONLY,
                          MPI_INFO_NULL, &fh);

            /* Only rank 0 writes its 1000 integers. */
            if (rank == 0)
                MPI_File_write(fh, buf, 1000, MPI_INT, MPI_STATUS_IGNORE);

            MPI_File_close(&fh);
            MPI_Finalize();
            return 0;
        }
  • 26. High-Level Parallel Libraries. They provide structure to files: well-defined, portable formats that are self-describing. Their APIs are more appropriate for computational science: typed data, and noncontiguous regions in memory and file. The interfaces are implemented on top of MPI-IO.
  • 27. HDF5. The most used high-level library in scientific codes. HDF5 = Hierarchical Data Format. HDF5 is three things: (1) a data model: container, data set, group and link; (2) a library: support for parallel I/O operations; (3) a file format: hierarchical data organization in a single file; typed, multidimensional array storage; attributes on dataset, data.
  • 28. What about PGAS I/O? There is an effort to design PGAS-like programming systems for I/O operations. Different parts of a shared file are virtually mapped to a global memory space that is accessible by all processes (think of mmap, for instance). To write to disk, make a store to global memory; to read from disk, make a load from global memory. The I/O system is becoming very heterogeneous, so it is good to have a single flat global “memory” space to hide this architectural complexity. [Figure: a global “memory” a(0) … a(6) is mapped onto File 1 and File 2; the user writes a(5) with the store a(5) = 8.7.]
  • 29. Conclusions. Supercomputers consist of several computing nodes connected by a high-performance network. Programming models abstract supercomputer hardware to allow for efficient implementation of algorithms. MPI, C/C++ and Fortran are dominant. MPI I/O provides the means for real parallel I/O. HDF5 is the best-known data format, library and data model in HPC. PGAS I/O might be a viable option.
  • 31. Acknowledgements. These slides are largely based on and adapted from: (1) “Parallel I/O in Practice” by Rob Ross, https://www.nersc.gov/assets/Training/pio-in-practice-sc12.pdf; (2) “Short introduktion on Optimizing I/O” by Cray, https://www.pdc.kth.se/education/course-resources/introduction-to-cray-xc30-xc40/feb-2015/05_Short_Intro_Optimizing-IO.pdf; (3) “Lecture 32: Introduction to MPI I/O” by Bill Gropp, http://wgropp.cs.illinois.edu/courses/cs598-s16/lectures/lecture32.pdf