Role of Python in HPC

Dr REEJA S R
Associate Professor
CSE Department
Dayananda Sagar University – School of Engineering
Kudlu Gate, Bangalore
Talk given at DSCE, ISE Dept., Bangalore
• Introduction to HPC
• What is Python?
• Why Python?
• Why Python for HPC?
• Python in HPC
• Challenges
• Conclusions
• What is HPC?
• When do we need HPC?
• What does HPC include?
• Rise and fall of HPC computer architectures
• There is no single clear definition
  - Computing on high-performance computers
  - Solving problems and doing research using computer modeling, simulation, and analysis
  - Engineering design using computer modeling, simulation, and analysis
• My understanding
  - Very large computational and memory requirements
  - Cannot be handled efficiently by a PC
  - "Speeds and feeds" are the keywords
• Who uses high-performance computing?
  - Research institutes, universities, and government labs
    - Weather and climate research, bioscience, energy, military, etc.
  - Engineering design: more or less every product we use
    - Automotive, aerospace, oil and gas exploration, digital media, financial simulation
    - Mechanical simulation, package design, silicon manufacturing, etc.
• Similar concepts
  - Parallel computing: computing on parallel computers
  - Supercomputing: computing on the world's 500 fastest supercomputers
• Case 1: Complete a time-consuming operation in less time
  - I am an automotive engineer
  - I need to design a new car that consumes less gasoline
  - I'd rather have the design completed in 6 months than in 2 years
  - I want to test my design using computer simulations rather than building very expensive prototypes and crashing them
• Case 2: Complete an operation under a tight deadline
  - I work for a weather prediction agency
  - I am getting input from weather stations and sensors
  - I'd like to predict tomorrow's forecast today
• Case 3: Perform a high number of operations per second
  - I am an engineer at Amazon.com
  - My web server gets 1,000 hits per second
  - I'd like my web server and databases to handle 1,000 transactions per second so that customers do not experience bad delays
• High-performance computing is fast computing
  - Computations run in parallel over many compute elements (CPU, GPU)
  - A very fast network connects the compute elements
• Hardware
  - Computer architecture
    - Vector computers, MPP, SMP, distributed systems, clusters
  - Network connections
    - InfiniBand, Ethernet, proprietary (Myrinet, Quadrics, Cray SeaStar, etc.)
• Software
  - Programming models
    - MPI (Message Passing Interface), SHMEM (Shared Memory), PGAS (Partitioned Global Address Space), etc.
  - Applications
    - Open source, commercial
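The message-passing model listed above can be illustrated without an MPI installation. What follows is a minimal sketch and not MPI itself: standard-library threads stand in for MPI ranks, and queues stand in for send and receive calls. The names `worker`, `inboxes`, and `results` are hypothetical and chosen only for this illustration; a real HPC code would use an MPI binding such as mpi4py.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """Receive one chunk, compute a partial sum, and send it back with the rank."""
    chunk = inbox.get()               # blocking receive
    outbox.put((rank, sum(chunk)))    # send the partial result to the root


n_workers = 4
data = list(range(1, 101))

# Split the data so that worker r gets elements r, r + n_workers, ...
chunks = [data[r::n_workers] for r in range(n_workers)]

inboxes = [queue.Queue() for _ in range(n_workers)]
results = queue.Queue()
threads = [
    threading.Thread(target=worker, args=(r, inboxes[r], results))
    for r in range(n_workers)
]
for t in threads:
    t.start()

# "Scatter": the root sends one chunk to each worker.
for r, chunk in enumerate(chunks):
    inboxes[r].put(chunk)

# "Gather": the root receives one (rank, partial_sum) message per worker.
partials = dict(results.get() for _ in range(n_workers))
for t in threads:
    t.join()

# "Reduce": combine the partial sums into the global result.
total = sum(partials.values())
print(total)
```

Each worker only touches the chunk it was sent, and all coordination happens through explicit messages. That is the property that lets the same program structure scale from one node to a cluster when the queues are replaced by a real interconnect.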
• Vector computers (VC): proprietary systems
  - Provided the breakthrough needed for the emergence of computational science, but they were only a partial answer
• Massively parallel processors (MPP): proprietary systems
  - High cost and a low performance/price ratio
• Symmetric multiprocessors (SMP)
  - Suffer from limited scalability
• Distributed systems
  - Difficult to use, and hard to extract parallel performance from
• Clusters: commodity and highly popular
  - High-performance computing: commodity supercomputing
  - High-availability computing: mission-critical applications
• Modern, interpreted, object-oriented, full-featured high-level programming language
• Portable (Unix/Linux, Mac OS X, Windows)
• Open source; intellectual property rights held by the Python Software Foundation
• Python versions: 2.x and 3.x (2.x reached end of life in January 2020)
• Goal: develop a small Python program that runs multiple serial executions with different load-balancing techniques applied
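The goal stated above can be sketched as a small simulation. The talk does not say which load-balancing techniques it had in mind, so the three strategies here (static block, static cyclic, and dynamic "next free worker") and the task costs are assumptions chosen for illustration. The sketch only computes how much work each worker would receive; it does not run anything in parallel.

```python
import heapq


def static_block(costs, n_workers):
    """Static: give each worker one contiguous block of tasks."""
    n = len(costs)
    loads = []
    for w in range(n_workers):
        start = w * n // n_workers
        stop = (w + 1) * n // n_workers
        loads.append(sum(costs[start:stop]))
    return loads


def static_cyclic(costs, n_workers):
    """Static: task i goes to worker i % n_workers (round-robin)."""
    loads = [0.0] * n_workers
    for i, cost in enumerate(costs):
        loads[i % n_workers] += cost
    return loads


def dynamic_greedy(costs, n_workers):
    """Dynamic: each task goes to the worker that becomes free first."""
    heap = [(0.0, w) for w in range(n_workers)]
    heapq.heapify(heap)
    for cost in costs:
        load, w = heapq.heappop(heap)
        heapq.heappush(heap, (load + cost, w))
    return [load for load, _ in sorted(heap, key=lambda item: item[1])]


# Hypothetical workload: 12 serial tasks whose cost grows from 1 to 12.
costs = [float(i) for i in range(1, 13)]

# The makespan is the load of the busiest worker, i.e. the total runtime.
makespans = {
    strategy.__name__: max(strategy(costs, 3))
    for strategy in (static_block, static_cyclic, dynamic_greedy)
}
for name, makespan in makespans.items():
    print(f"{name}: {makespan}")
```

With this workload the block split leaves one worker with 42 of the 78 cost units, while the cyclic and dynamic strategies both finish at 30. Dynamic assignment mainly pays off when task costs are not known in advance.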
• Fast program development
  - Simple syntax
  - Easy to write readable code
  - Large standard library
  - Many third-party libraries
    - NumPy, SciPy
    - mpi4py
• When you want to maximize productivity (not necessarily performance)
• Mature language with a large user base
• Huge collection of freely available software libraries
  - High-performance computing
    - Engineering, optimization, differential equations
  - Scientific datasets, analysis, visualization
  - General-purpose computing
    - Web apps, GUIs, databases, and much more
• Python can combine interpreted glue code with JIT- and AOT-compiled code
  - Write performance-critical loops and kernels in C/Fortran
  - Write high-level logic and boilerplate in Python
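The division of labor described above can be seen in a few lines. This is a minimal sketch that assumes NumPy is installed: `np.dot` runs in compiled code, while `dot_pure_python` (a hypothetical helper written for this comparison) executes the same arithmetic in the interpreter.

```python
import time

import numpy as np


def dot_pure_python(xs, ys):
    """Dot product with the loop executed by the Python interpreter."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total


n = 100_000
xs = np.arange(n, dtype=np.float64)
ys = np.full(n, 0.5)

# High-level logic stays in Python; only the kernel differs.
t0 = time.perf_counter()
slow = dot_pure_python(xs.tolist(), ys.tolist())
t1 = time.perf_counter()
fast = float(np.dot(xs, ys))   # kernel runs in compiled code
t2 = time.perf_counter()

print(f"pure Python: {t1 - t0:.4f} s, NumPy: {t2 - t1:.4f} s")
```

Both versions produce the same value; the timings depend on the machine, so none are quoted here.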
• Unicode and bytes
• array
  - Memory-efficient arrays for primitive types
• math
  - Basic mathematical operations (statistics functions live in the separate statistics module)
• sqlite3
  - File-based SQL storage engine
• collections
  - A variety of container objects (deque, Counter, and dictionary variants)
• A huge variety of third-party libraries, including NumPy, SciPy, etc.
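The standard-library modules listed above can be tried together in one short script. This sketch uses only the standard library; the table name `runs` and the sample values are made up for illustration.

```python
import math
import sqlite3
import statistics
from array import array
from collections import Counter, deque

# Unicode vs. bytes: one non-ASCII character takes two bytes in UTF-8.
text = "na\u00efve"
encoded = text.encode("utf-8")

# array: packed C doubles instead of a list of Python float objects.
samples = array("d", [1.5, 2.5, 4.0])

# math and statistics: basic numerics.
root = math.sqrt(16)
mean = statistics.mean(samples)

# collections: a bounded deque keeps only the most recent items.
window = deque(maxlen=3)
for value in range(5):
    window.append(value)
counts = Counter("mississippi")

# sqlite3: an SQL database, here held in memory instead of a file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (name TEXT, seconds REAL)")
conn.executemany("INSERT INTO runs VALUES (?, ?)", [("a", 1.5), ("b", 2.5)])
total_seconds = conn.execute("SELECT SUM(seconds) FROM runs").fetchone()[0]
conn.close()

print(len(text), len(encoded), samples.itemsize, root, list(window), total_seconds)
```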
• Libraries
  - NumPy: numerical Python library
  - SciPy: scientific libraries
  - pandas: library for data analysis
  - scikit-learn: widely used machine learning library
  - Biopython: bioinformatics library
  - Tornado: web framework and asynchronous networking library for easy concurrency
  - Database bindings: for communicating with virtually all databases and data stores, including Redis, MongoDB, HDF5, and SQL
  - Web development frameworks: for creating websites
  - OpenCV: bindings for computer vision
  - API bindings: for easy access to popular web APIs (Google, Twitter, and LinkedIn)
  - Matplotlib: python -m pip install matplotlib
• High level
  - Lowers barriers, reduces time to solution
• Interfaces with the OS, libraries, and other software
  - Makes a great glue language for automating the modern scientific workflow
  - Sage ties together the biggest open-source numerical software into a unified Python interface
  - Reduces reinventing the wheel
• Open source
  - Portable, free, transparent, verifiable
  - Scales to an arbitrary number of nodes with no license costs
• Interpreted
  - Interactive data analysis and plotting
  - Interactive parallel computing
• NumPy: array data structure
>>> from numpy.random import randn
>>> from matplotlib.pyplot import hist, show
>>> hist(randn(10000), 100)  # 10,000 standard-normal samples in 100 bins
>>> show()
>>> import math
>>> x = math.factorial(3)
>>> print("fact = %d" % x)
fact = 6
• Matrix
>>> import numpy as np
>>> np.array([[1, 2], [3, 4]])
array([[1, 2],
       [3, 4]])
>>> array = [4, 2, 6]
>>> array.append(1)
>>> print("before sorting", array)
before sorting [4, 2, 6, 1]
>>> array.sort()
>>> print("after sorting", array)
after sorting [1, 2, 4, 6]
• History of NumPy
• Features
  - A powerful N-dimensional array object
  - Sophisticated (broadcasting) functions
  - Tools for integrating C/C++ and Fortran code
  - Useful linear algebra, Fourier transform, and random number capabilities
• Development
  - Based originally on Numeric by Jim Hugunin
  - Also based on Numarray by Perry Greenfield
  - Written by Travis Oliphant to bring both feature sets together
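Broadcasting, listed under Features above, is easiest to understand from a small case. This is a minimal sketch assuming NumPy is installed; the arrays `col` and `row` are made-up inputs.

```python
import numpy as np

col = np.arange(3).reshape(3, 1)      # shape (3, 1): values 0, 1, 2
row = np.array([10, 20, 30, 40])      # shape (4,)

# Broadcasting stretches both operands to shape (3, 4) without copying them:
# table[i, j] == col[i, 0] + row[j]
table = col + row
print(table.shape)
print(table)
```

No explicit loop is written over either axis; NumPy aligns the shapes and applies the addition element by element.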
• What makes an array so much faster?
• Data layout
  - Homogeneous: every item takes up the same size block of memory
  - Single data-type objects
  - Powerful array scalar types
• Universal functions (ufuncs)
  - Functions that operate on ndarrays in an element-by-element fashion
  - Vectorized wrappers for a function
  - Built-in functions are implemented in compiled C code
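The two points above, a homogeneous memory layout and compiled element-wise functions, can be inspected directly. This is a small sketch assuming NumPy is installed; the array `values` is an arbitrary example.

```python
import math

import numpy as np

values = np.arange(1, 6, dtype=np.float64)   # 1.0, 2.0, 3.0, 4.0, 5.0

# Data layout: every element occupies the same number of bytes,
# so element i lives at a fixed offset of i * itemsize.
print(values.itemsize, values.nbytes, values.strides)

# The same computation as an interpreted loop and as a ufunc call.
looped = [math.sqrt(v) for v in values.tolist()]
vectorized = np.sqrt(values)                 # loop runs in compiled code

print(isinstance(np.sqrt, np.ufunc))
print(np.allclose(looped, vectorized))
```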
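• NumPy has a sophisticated view of data. Data types include:
  - Boolean: bool
  - Signed integers: int, int8, int16, int32, int64
  - Unsigned integers: uint8, uint16, uint32, uint64
  - Floating point: float, float16, float32, float64
  - Complex: complex, complex64, complex128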
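The practical effect of choosing among these types is the number of bytes per element. This sketch, assuming NumPy is installed, queries the size of several of the listed types by name and compares the memory used by two arrays of equal length.

```python
import numpy as np

names = ("bool", "int8", "int16", "int32", "int64", "uint8",
         "float16", "float32", "float64", "complex64", "complex128")

# Bytes per element for each named type.
sizes = {name: np.dtype(name).itemsize for name in names}
for name, size in sizes.items():
    print(f"{name:>10}: {size} byte(s)")

# The generic names map onto sized types; "float" is float64.
print(np.dtype("float") == np.float64)

# Halving the element size halves the memory for the same length.
small = np.zeros(1000, dtype=np.float32)
large = np.zeros(1000, dtype=np.float64)
print(small.nbytes, large.nbytes)
```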
• Help
>>> import matplotlib.pyplot as plt
>>> help(plt)
• Speedups
  - Use faster hardware: more cores, more cache, more GHz
  - Use CPU vector instructions
  - Bytecode interpretation, and everything is an object
  - Fast fetcher
  - Load directly into NumPy arrays
  - Improve RDBMS query speed
  - Speed up data messages
  - Cache the previous day's data
  - Switch from batch to online architecture
  - 6 process slots cut runtime to 2 hours
  - Fully parallel execution crashes the database
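The last two points above suggest capping concurrency rather than launching everything at once. The talk gives no code for this, so the following is only a sketch under assumptions: `fetch` is a hypothetical stand-in for a database query, `SlotMeter` is a helper invented here to record how many jobs run at once, and threads are used in place of the process slots mentioned on the slide.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_SLOTS = 6  # cap taken from the "6 process slots" bullet


class SlotMeter:
    """Records how many jobs are inside the guarded section at once."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self.active += 1
            self.peak = max(self.peak, self.active)
        return self

    def __exit__(self, exc_type, exc, tb):
        with self._lock:
            self.active -= 1
        return False


meter = SlotMeter()


def fetch(job_id):
    """Hypothetical job: sleeps briefly in place of a real query."""
    with meter:
        time.sleep(0.01)
        return job_id * job_id


# The pool never runs more than MAX_SLOTS jobs at the same time.
with ThreadPoolExecutor(max_workers=MAX_SLOTS) as pool:
    results = list(pool.map(fetch, range(24)))

print(results[:5], "peak concurrency:", meter.peak)
```

All 24 jobs complete, but the shared resource never sees more than six of them at once.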
• Tools for speeding up Python in HPC
  - PyMPI
  - Pynamic
  - PyTrilinos
  - ODIN
  - Seamless
• PyMPI
  - Developed to extend Python's scripting abilities to parallel and distributed codes
  - Parallel extension modules can be written
  - Modules and processing can be combined in one convenient place to simplify processing
  - A single Python script can provide setup, simulation, instrumentation, and postprocessing
• Pynamic
  - Tests a system's linking and loading capabilities
  - The Pynamic driver performs a test of the MPI functionality
  - Can also gather performance metrics, including job startup time, module import time, function visit time, and MPI test time
• PyTrilinos
  - For parallel scientific computing, PyTrilinos provides a high-level interface to Trilinos and its Tpetra parallel linear algebra library
  - This makes parallel linear algebra
    - Easier to use, via a simplified user interface
    - More intuitive, through features such as advanced indexing
    - More useful, by enabling access to it from the already extensive Python scientific software stack
• Optimized Distributed NumPy (ODIN)
  - Builds on top of NumPy
  - Provides a distributed array data structure that supports parallel array-based computations
  - Provides built-in functions that work with distributed arrays
  - Provides a framework for creating new functions that work with distributed arrays
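The idea of a distributed array with local and global views can be sketched with plain NumPy. This is not ODIN's API, which is not shown in the talk; the chunks below all live in one process, and `n_workers`, `local_chunks`, and `partial_sums` are names invented for the illustration.

```python
import numpy as np

n_workers = 4
global_array = np.arange(12, dtype=np.float64)

# "Distribute": each hypothetical worker owns one contiguous chunk.
local_chunks = np.array_split(global_array, n_workers)

# Local view: each worker applies a ufunc to its own chunk only.
local_results = [np.square(chunk) for chunk in local_chunks]

# Global view: a reduction combines one partial result per worker,
# which is the only step that would need communication.
partial_sums = [float(chunk.sum()) for chunk in local_results]
global_sum = sum(partial_sums)

print([chunk.size for chunk in local_chunks], global_sum)
```

Element-wise operations need no data exchange between workers; only the final reduction does, which is why distributed-array libraries can keep most of the work local.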
• ODIN's approach has several advantages:
  - Users have access to arrays in the same way that they think about them: either globally or locally
  - Because ODIN arrays are easier to use and reason about than the MPI equivalent, this leads to faster iteration cycles, more flexibility when exploring parallel algorithms, and an overall reduction in total time to solution
  - ODIN is designed to work with existing MPI programs
  - By using Python, ODIN can leverage the ecosystem of speed-related third-party packages, either to wrap external code or to accelerate existing Python code
  - With the power and expressiveness of NumPy array slicing, ODIN can optimize distributed array expressions. These optimizations include loop fusion and array expression analysis to select the appropriate communication strategy between worker nodes
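Loop fusion, named above as one of the optimizations, can be shown on a single node with plain NumPy. This sketch illustrates the concept only and says nothing about how ODIN implements it; the arrays `a`, `b`, and `c` are made-up inputs.

```python
import numpy as np

a = np.arange(5, dtype=np.float64)
b = np.full(5, 2.0)
c = np.ones(5)

# Unfused: a * b makes one pass and allocates a temporary array,
# then "+ c" makes a second pass over that temporary.
unfused = a * b + c

# Fused by hand: one loop computes each element in a single pass.
# (Written in Python for clarity; a fusing compiler would emit this
# loop as machine code.)
fused = np.empty_like(a)
for i in range(a.size):
    fused[i] = a[i] * b[i] + c[i]

# Middle ground in plain NumPy: reuse one buffer to avoid extra temporaries.
buf = np.multiply(a, b)
np.add(buf, c, out=buf)

print(unfused, np.array_equal(fused, unfused), np.array_equal(buf, unfused))
```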
• ODIN's basic features (distributed array creation, unary and binary ufunc application, global and local modes of interaction) are prototyped
  - They are currently being tested on systems and clusters with a small to mid-range number of nodes
• Seamless
  - Provides automatic, just-in-time compilation of Python source code
  - Aims to make node-level Python code as fast as compiled languages via dynamic compilation
  - Also allows effortless access to compiled libraries from Python, allowing easy integration of existing code bases written in statically typed languages
• Schematic relation between PyTrilinos, ODIN, and Seamless
  - Each of the three packages is standalone
  - ODIN can use Seamless and PyTrilinos and the functionality that these two packages provide
  - Seamless provides four principal features, while PyTrilinos wraps several Trilinos solver packages
• Python is too slow.
  - Seamless allows compilation to fast machine code, either dynamically or statically.
• Python is yet another language to integrate with existing software.
  - Seamless allows easy interaction between Python and other languages, and removes nearly all barriers to inter-language programming.
• The Python HPC ecosystem is too small.
  - PyTrilinos provides access to a comprehensive suite of HPC solvers. Further, ODIN will provide a library of functions and methods designed to work with distributed arrays, and its design allows access to any existing MPI routines.
• Integrating all components is too difficult.
  - ODIN provides a common framework to integrate disparate components for distributed computing.
• Performance
  - Processor capacity and memory bandwidth are scaling faster than system I/O.
  - A solution is required that provides higher overall available I/O bandwidth per socket to accelerate Message Passing Interface (MPI) rates for tomorrow's HPC deployments.
• Cost and density
  - More components in a server limit density and increase fabric cost.
  - An integrated fabric controller helps eliminate the additional cost and space required by discrete cards, enabling higher server density while freeing up a valuable PCIe slot for other storage and networking controllers.
• Reliability and power
  - Discrete interface cards consume many watts of power.
  - An interface integrated on the processor can draw less power with fewer discrete components.
• Python is a dynamic, object-oriented programming language.
• Because of its powerful and flexible syntax, Python excels as a platform for high-performance computing and scientific computing.
• Versatility, simplicity of use, high portability, and the large number of open-source modules and packages make it very popular for scientific use.
• Although pure Python is generally slower than traditional compiled languages (C or Fortran), various techniques and libraries allow you to obtain performance comparable to that of the most common compiled languages, ensuring a good balance between computational performance and time investment.
• reejasr@gmail.com
• reeja-cse@dsu.edu.in
Role of python in hpc

  • 1. Dr REEJA S R Associate Professor CSE Department Dayananda Sagar University – School of Engineering Kudlu Gate, Bangalore Given Talk in DSCE ,ISE Dept., Bangalore
  • 2.  Introduction to HPC  What is Python?  Why Python  Why Python for HPC  Python in HPC  Challenges  Conclusions
  • 3.  What is HPC?  When do we need HPC?  What does HPC Include?  Rise &Falls of HPC Computer Architecture
  • 4. • There is no clear definition  Computing on high performance computers  Solving problems / doing research using computer modeling, simulationand analysis  Engineering design using computer modeling, simulation and analysis • My understanding  A huge number of computational and memory requirements  Cannot be afforded by a PC efficiently  Speeds and feeds are the keywords • Who uses High-Performance Computing  Research institutes, universities and government labs  Weather and climate research, bioscience, energy, militaryetc.  Engineering design: more or less every product we use  Automotive, aerospace, oil and gas explorations, digital media, financialsimulation  Mechanical simulation, package designs, silicon manufacturingetc. • Similar concepts  Parallel computing: computing on parallel computers  Super computing: computing on world 500 fastest supercomputers
  • 5. • Case1: Complete a time-consuming operation in less time  I am an automotive engineer  I need to design a new car that consumes less gasoline  I’d rather have the design completed in 6 months than in 2 years  I want to test my design using computer simulations rather than building very expensive prototypes and crashing them • Case 2: Complete an operation under a tight deadline  I work for a weather prediction agency  I am getting input from weather stations/sensors  I’d like to predict tomorrow’s forecast today • Case 3: Perform a high number of operations per seconds  I am an engineer at Amazon.com  My Web server gets 1,000 hits per seconds  I’d like my web server and databases to handle 1,000 transactions per seconds so that customers do not experience bad delays
  • 6. • High-performance computing is fast computing • Computations in parallel over lots of compute elements (CPU, GPU) • Very fast network to connect between the compute elements • Hardware • Computer Architecture • Vector Computers, MPP, SMP, Distributed Systems, Clusters • Network Connections • InfiniBand, Ethernet, Proprietary (Myrinet, Quadrics, Cray- SeaStar etc.) • Software • Programming models • MPI (Message Passing Interface), SHMEM (Shared Memory), PGAS( partitioned global address space), etc. • Applications • Open source, commercial
  • 7.  Vector Computers (VC) - proprietary system  Provided the breakthrough needed for the emergence of computational science, but  they were only a partial answer  Massively Parallel Processors (MPP) - proprietary systems  High cost and a low performance/price ratio.  Symmetric Multiprocessors (SMP)  Suffers from scalability  Distributed Systems  Difficult to use and hard to extract parallel performance  Clusters – commodity and highly popular  High Performance Computing - Commodity Supercomputing  High Availability Computing - Mission Critical Applications
  • 8.  Modern, interpreted, object-oriented, full featured high level programming language  Portable (Unix/Linux, Mac OS X, Windows)  Open source, intellectual property rights held by the Python Software Foundation  Python versions: 2.x and 3.x  Goal - Develop a small python program that runs multiple serial execution with different load balancing techniques applied
  • 9.  Fast program development  Simple syntax  Easy to write readable code  Large standard library  Lots of third-party libraries  NumPy, SciPy  mpi4py
  • 10.  When you want to maximize productivity (not necessarily performance)  Mature language with a large user base  Huge collection of freely available software libraries  High Performance Computing  Engineering, optimization, differential equations  Scientific datasets, analysis, visualization  General-purpose computing  Web apps, GUIs, databases, and much more  Python combines the best of both JIT and AOT code  Write performance-critical loops and kernels in C/Fortran  Write high-level logic and “boilerplate” in Python
  • 11.  Unicode & bytes  array - memory-efficient array for primitive types  math - basic mathematical operations (statistics functions are in the separate statistics module)  sqlite3 - file-based SQL storage engine  collections - variety of container objects (deque, Counter and dictionary variants)  Huge variety of third-party libraries (NumPy, SciPy, etc.)
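A short sketch of the standard-library modules listed on this slide, using only calls from the standard library. The variable names and sample values are made up for illustration; note that `statistics` is its own module rather than part of `math`.

```python
import math
import sqlite3
import statistics
from array import array
from collections import Counter, deque

# array: compact storage for one primitive type (here C doubles).
samples = array("d", [1.0, 2.0, 3.0, 4.0])
mean_value = statistics.mean(samples)
root = math.sqrt(16)

# collections: counting and bounded queues.
word_counts = Counter("hpc python hpc".split())
recent = deque([1, 2, 3], maxlen=2)  # keeps only the last two items

# sqlite3: an SQL database in a file, or in memory as here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (name TEXT, seconds REAL)")
conn.execute("INSERT INTO runs VALUES (?, ?)", ("demo", 1.5))
rows = conn.execute("SELECT name, seconds FROM runs").fetchall()
conn.close()
```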
  • 12.  Libraries  NumPy - numerical Python library  SciPy - scientific libraries  Pandas - library for data analysis  Scikit-learn - widely used machine learning library  Biopython - bioinformatics library  Tornado - web framework and asynchronous networking library  Database bindings - for communicating with virtually all databases and data stores, including Redis, MongoDB, HDF5 and SQL  Web development frameworks - for creating websites  OpenCV - bindings for computer vision  API bindings - for easy access to popular web APIs (Google, Twitter and LinkedIn)  Matplotlib - install with: python -m pip install matplotlib
  • 13.  High level - Lowers barriers, reduces time to solution  Interfaces with the OS, libraries and other software - Makes a great glue for automating the modern scientific workflow - Sage ties together the biggest open source numeric software into a unified Python interface - Reduces reinventing of wheels  Open source - Portable, free, transparent, verifiable - Scales to an arbitrary number of nodes with no license costs  Interpreted - Interactive data analysis and plotting - Interactive parallel computing
  • 14.  NumPy: array data structure
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> plt.hist(np.random.randn(10000), 100)
>>> plt.show()
  • 17.
>>> import math
>>> x = math.factorial(3)
>>> print("fact = %d" % x)
Ans: fact = 6
 matrix
>>> import numpy as np
>>> np.array(np.matrix('1 2; 3 4'))
Ans: array([[1, 2], [3, 4]])
  • 19.  History of NumPy  Features – a powerful N-dimensional array object – sophisticated (broadcasting) functions – tools for integrating C/C++ and Fortran code – useful linear algebra, Fourier transform, and random number capabilities  Development – Based originally on Numeric by Jim Hugunin – Also based on NumArray by Perry Greenfield – Written by Travis Oliphant to bring both feature sets together
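The features listed on this slide can each be shown in a few lines. This is a minimal sketch with made-up inputs, using only standard NumPy calls (`reshape`, `np.linalg.solve`, `np.fft.rfft`, `np.random.default_rng`).

```python
import numpy as np

# Broadcasting: a (3, 1) column and a (1, 4) row combine into a (3, 4) table.
col = np.arange(3).reshape(3, 1)
row = np.arange(4).reshape(1, 4)
table = col * 10 + row

# Linear algebra: solve A x = b.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)

# Fourier transform: the spectrum of a pure tone peaks at its frequency bin.
t = np.arange(64)
signal = np.sin(2 * np.pi * 4 * t / 64)
peak_bin = int(np.argmax(np.abs(np.fft.rfft(signal))))

# Random numbers: a seeded generator gives reproducible draws.
rng = np.random.default_rng(0)
draws = rng.standard_normal(5)
```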
  • 20.  What makes an array so much faster?  Data layout – homogeneous: every item takes up the same size block of memory – single data-type objects – powerful array scalar types  Universal functions (ufuncs) – functions that operate on ndarrays in an element-by-element fashion – vectorized wrapper for a function – built-in functions are implemented in compiled C code
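To make the ufunc idea concrete, the sketch below applies the same element-by-element operation three ways. The helper `clip_unit` and the sample values are hypothetical; `np.vectorize` is shown only as a convenience wrapper, since it still calls the Python function once per element and does not give compiled-code speed.

```python
import math
import numpy as np

values = np.linspace(0.0, 1.0, 5)

# Pure-Python loop: one interpreter-level call per element.
loop_result = [math.sqrt(v) for v in values]

# Built-in ufunc: the same element-by-element operation runs in compiled C.
ufunc_result = np.sqrt(values)

# np.vectorize gives a plain Python function ufunc-like broadcasting.
def clip_unit(v):
    return min(max(v, 0.25), 0.75)

clipped = np.vectorize(clip_unit)(values)
```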
  • 21.  Data layout  homogeneous: every item takes up the same size block of memory  single data-type objects  powerful array scalar types
  • 22.  NumPy has a sophisticated view of data. Types: bool; int, int8, int16, int32, int64; uint8, uint16, uint32, uint64; float, float16, float32, float64; complex, complex64, complex128
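A brief sketch of what these types mean in practice: the dtype fixes how many bytes each element occupies, and mixed arithmetic promotes to a type that can hold both operands. The array contents are arbitrary.

```python
import numpy as np

small = np.array([1, 2, 3], dtype=np.int8)
wide = np.array([1, 2, 3], dtype=np.int64)
reals = np.array([1, 2, 3], dtype=np.float32)
cplx = np.array([1 + 2j], dtype=np.complex128)

# Bytes per element for each dtype.
sizes = {a.dtype.name: a.itemsize for a in (small, wide, reals, cplx)}

# int8 combined with float32 is promoted to float32.
promoted = (small + reals).dtype
```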
  • 24.  Speedups - Use faster hardware: more cores, more cache, more GHz - Use CPU vector instructions - Byte code and everything is in object - Fast fetcher - Load directly into a NumPy array - Improves RDBMS query speed - Speed up data message - Cache the previous day’s data - Switch from batch to online architecture - 6 process slots cut runtime to 2 hours - Fully parallel crashes the DB
  • 25.  To speed up  pyMPI  Pynamic  PyTrilinos  ODIN  Seamless
  • 26.  pyMPI - Was developed to extend Python’s scripting abilities to parallel and distributed codes - Parallel extension modules are written - Modules and processing can be combined in one convenient place to simplify processing - A single Python script can provide setup, simulation, instrumentation and postprocessing
  • 27.  Pynamic - Tests a system’s linking and loading capabilities - The Pynamic driver performs a test of the MPI functionality - Can also gather performance metrics, including job startup time, module import time, function visit time and MPI test time
  • 28.  PyTrilinos: - For parallel scientific computing, we provide a high-level interface to the Trilinos and Tpetra parallel linear algebra library. - This makes parallel linear algebra - Easier to use via a simplified user interface - More intuitive through features such as advanced indexing - More useful by enabling access to it from the already extensive Python scientific software stack.
  • 30.  Optimized Distributed NumPy (ODIN) - Builds on top of NumPy - Provides a distributed array data structure that makes parallel array-based computations straightforward - Provides built-in functions that work with distributed arrays - Provides a framework for creating new functions that work with distributed arrays
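The distributed-array idea can be illustrated without ODIN itself. The sketch below is not ODIN's API: it simulates, in a single process with plain NumPy, how a global array might be split into per-worker blocks, processed locally, and recombined. The worker count and variable names are hypothetical.

```python
import numpy as np

n_workers = 4  # hypothetical number of worker processes
global_array = np.arange(10, dtype=np.float64)

# "Distribute": each worker would own one contiguous block of the array.
local_blocks = np.array_split(global_array, n_workers)

# Local view: every worker applies the same ufunc to its own block.
local_results = [np.square(block) for block in local_blocks]

# Global view: a reduction combines one partial result per worker.
partial_sums = [block.sum() for block in local_results]
global_sum = sum(partial_sums)

# Reassembling the blocks matches the serial computation.
reassembled = np.concatenate(local_results)
```

In a real distributed setting the blocks would live on different nodes and the reduction would require communication, for example over MPI.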
  • 31. ODIN’s approach has several advantages: - Users have access to arrays in the same way that they think about them: either globally or locally. - As ODIN arrays are easier to use and reason about than the MPI equivalent, this leads to faster iterative cycles, more flexibility when exploring parallel algorithms, and an overall reduction in total time-to-solution. - ODIN is designed to work with existing MPI programs. - By using Python, ODIN can leverage the ecosystem of speed-related third-party packages, either to wrap external code or to accelerate existing Python code. - With the power and expressiveness of NumPy array slicing, ODIN can optimize distributed array expressions. These optimizations include loop fusion and array expression analysis to select the appropriate communication strategy between worker nodes.
  • 32. • ODIN’s basic features (distributed array creation, unary and binary ufunc application, global and local modes of interaction) are prototyped and are currently being tested on systems and clusters with a small to mid-range number of nodes.
  • 33.  Seamless - For automatic, just-in-time compilation of Python source code. - Seamless aims to make node-level Python code as fast as compiled languages via dynamic compilation. - It also allows effortless access to compiled libraries in Python, allowing easy integration of existing code bases written in statically typed languages.
  • 34. • Schematic relation between PyTrilinos, ODIN, and Seamless. • Each of the three packages is standalone. • ODIN can use Seamless and PyTrilinos and the functionality that these two packages provide. • Seamless provides four principal features, while PyTrilinos wraps several Trilinos solver packages.
  • 35.  Python is too slow. - Seamless allows compilation to fast machine code, either dynamically or statically.  Python is yet another language to integrate with existing software. - Seamless allows easy interaction between Python and other languages, and removes nearly all barriers to inter-language programming.  The Python HPC ecosystem is too small. - PyTrilinos provides access to a comprehensive suite of HPC solvers. Further, ODIN will provide a library of functions and methods designed to work with distributed arrays, and its design allows access to any existing MPI routines.  Integrating all components is too difficult. - ODIN provides a common framework to integrate disparate components for distributed computing.
  • 36.  Performance. - Processor capacity and memory bandwidth are scaling faster than system I/O. - A solution is required that provides higher overall available I/O bandwidth per socket to accelerate message passing interface (MPI) rates for tomorrow’s HPC deployments.  Cost and density. - More components in a server limit density and increase fabric cost. - An integrated fabric controller helps eliminate the additional costs and required space of discrete cards, enabling higher server density while freeing up a valuable PCIe slot for other storage and networking controllers.  Reliability and power. - Discrete interface cards consume many watts of power. - An integrated interface card on the processor can draw less power with fewer discrete components.
  • 37.  Python is a dynamic object-oriented programming language.  Because of its powerful and flexible syntax, Python excels as a platform for High Performance Computing and scientific computing.  Versatility, simplicity of use, high portability and the large number of open source modules and packages make it very popular for scientific use.  Although pure Python is generally slower than traditional compiled languages (C or Fortran), various techniques and libraries make it possible to obtain performance comparable to that of the most common compiled languages, giving a good balance between computational performance and time investment.