Post-processing SAR images on Xeon Phi – a porting exercise
Martin Hilgeman
HPC Consultant EMEA
Research Computing
Agenda
• Xeon Phi porting checklist
• Post-processing of SAR images
• Questions
Dell Confidential
Case study: Geospatial imaging
SAR interferometry uses satellite data to monitor terrain displacements
The phase difference is calculated between two SAR images that have been acquired:
› at different times (observation dates)
› with slightly different view angles
Code details
Application characteristics:
– Written in C90 and C++, ~150,000 lines of code
– Serial application, using Intel MKL DFTI functions
– Iterative scheme
• MKL FFTs are already in use, so there is little room for improvement there
– The FFTs are too small to run in threaded mode
• The application runs only in serial mode and must be parallelized to run at a reasonable speed on a Phi
• Phase images are divided into patches, which makes them suitable for parallelization
– Parallelize the main loop using OpenMP
Test Setup
Jobs were run on the TACC Stampede system
PowerEdge C8220 with Intel® Xeon® E5-2680 2.7 GHz
– 16 cores
– 32 GB memory
– RHEL 6.3
– Intel® Xeon Phi™ 7120P
Parallelization of big main loop
/* Original serial loop: walks the GCP linked list one node at a time */
count = 1;
if (lastGCP != NULL) {
    app = lastGCP->next;
} else {
    app = bufferGCP;
}
while (app) {
    /* Truncate the coordinates and offsets to whole numbers */
    app->loc[0] = (int) app->loc[0];
    app->loc[1] = (int) app->loc[1];
    app->offset[0] = (int) app->offset[0];
    app->offset[1] = (int) app->offset[1];
    /* <lot of work> */
    lastGCP = app;
    app = app->next;
    count++;
}
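/* OpenMP version: each iteration takes the next list node inside a critical section */
count = 1;
if (lastGCP != NULL) {
    app = lastGCP->next;
} else {
    app = bufferGCP;
}
int i = 0;
int nthreads, me;
#if defined _OPENMP
/* The backslash continues the directive onto the next line */
#pragma omp parallel \
    private(i,me,app2,err,suboffset,qc,m_block,s_block)
{
    nthreads = omp_get_num_threads();   /* every thread stores the same value */
    me = omp_get_thread_num();
#else
    nthreads = 1;
    me = 0;
#endif
    /* Assumes buffer_ngcp equals the number of nodes left in the list */
#pragma omp for schedule(static)
    for (i = 0; i < buffer_ngcp; i++) {
#pragma omp critical
        {
            /* Nodes are handed out in list order. The shared lastGCP and
               count are updated here rather than at the end of the body to
               avoid a data race; this assumes <lot of work> does not read them. */
            app2 = app;
            app = app->next;
            lastGCP = app2;
            count++;
        }
        app2->loc[0] = (int) app2->loc[0];
        app2->loc[1] = (int) app2->loc[1];
        app2->offset[0] = (int) app2->offset[0];
        app2->offset[1] = (int) app2->offset[1];
        /* <lot of work> */
    }
#if defined _OPENMP
}
#endif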
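An alternative to advancing the list pointer inside a critical section is to copy the node pointers into an array first, so that every iteration can index its own node. The sketch below is not taken from the slides: the `gcp_t` struct is a hypothetical stand-in for the application's list node, `process_gcps` is an invented name, and only the coordinate truncation from the loop above is reproduced.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the application's GCP list node. */
typedef struct gcp {
    double loc[2];
    double offset[2];
    struct gcp *next;
} gcp_t;

/* Truncates the coordinates of every node, as the loop on the slide does,
 * but reaches the nodes through a temporary array so that iterations are
 * independent. Returns the number of nodes processed, or -1 if the
 * temporary array cannot be allocated. */
long process_gcps(gcp_t *head)
{
    long n = 0;
    for (gcp_t *p = head; p != NULL; p = p->next) {
        n++;
    }
    if (n == 0) {
        return 0;
    }

    gcp_t **nodes = malloc((size_t)n * sizeof *nodes);
    if (nodes == NULL) {
        return -1;
    }
    long k = 0;
    for (gcp_t *p = head; p != NULL; p = p->next) {
        nodes[k++] = p;
    }
    assert(k == n);

    long i;
    /* Without OpenMP enabled the pragma is ignored and the loop runs serially. */
    #pragma omp parallel for schedule(static)
    for (i = 0; i < n; i++) {
        gcp_t *g = nodes[i];
        g->loc[0] = (int) g->loc[0];
        g->loc[1] = (int) g->loc[1];
        g->offset[0] = (int) g->offset[0];
        g->offset[1] = (int) g->offset[1];
        /* the per-patch work would go here */
    }

    free(nodes);
    return n;
}
```

The extra pass over the list is serial, but it is cheap compared with the per-patch work, and it removes the lock from the loop body. The last node is simply `nodes[n - 1]`, so no shared bookkeeping has to be updated inside the loop.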
Results
Configuration                           Wall clock time (s)
Original version on E5440 2.83 GHz      102.00
Original version on E5-2680 2.70 GHz     27.43
Additional source optimizations          25.83
2 threads                                17.37
4 threads                                 9.97
6 threads                                 7.17
8 threads                                 5.77
12 threads                                5.57
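The scaling behind these timings can be worked out with a short helper. This is a sketch under one stated assumption: the threaded runs are compared against the 25.83 s "Additional source optimizations" row, which is taken here to be the single-threaded baseline on the same E5-2680 host. The slides do not say so explicitly, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Speedup of a parallel run relative to a serial baseline. */
double speedup(double t_serial, double t_parallel)
{
    assert(t_parallel > 0.0);
    return t_serial / t_parallel;
}

/* Parallel efficiency: speedup divided by the number of threads. */
double efficiency(double t_serial, double t_parallel, int nthreads)
{
    assert(nthreads > 0);
    return speedup(t_serial, t_parallel) / nthreads;
}

/* Prints speedup and efficiency for the thread counts in the table. */
void print_scaling(void)
{
    static const int threads[] = { 2, 4, 6, 8, 12 };
    static const double wall[] = { 17.37, 9.97, 7.17, 5.77, 5.57 };
    const double baseline = 25.83;  /* assumed single-threaded baseline */

    for (int k = 0; k < 5; k++) {
        printf("%2d threads: speedup %.2f, efficiency %.2f\n",
               threads[k],
               speedup(baseline, wall[k]),
               efficiency(baseline, wall[k], threads[k]));
    }
}
```

Under that assumption, 12 threads give a speedup of about 4.6 over the optimized serial run, and the step from 8 to 12 threads gains very little.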
How about Intel® Xeon Phi™ performance?
• The application now runs multi-threaded using OpenMP directives
– The number of patches is greater than the number of threads on the Phi
• The memory footprint is small, so it should fit on the card
– Running in native mode is possible
• Intel MKL has complete support for Xeon Phi
– Assume that the FFTs are making optimal use of the resources
A profile
Module Summary
--------------------------------------------------------------------------------
Samples   Self %   Total %   Module
    247   59.81%    59.81%   /lib64/libc-2.12.so
     79   19.13%    78.93%   /scratch/dell-guest/app
     38    9.20%    88.14%   /opt/intel/mkl/lib/intel64/libmkl_core.so
     34    8.23%    96.37%   /opt/intel/mkl/lib/intel64/libmkl_intel_thread.so
     10    2.42%    98.79%   /opt/intel/mkl/lib/intel64/libmkl_intel_lp64.so
      5    1.21%   100.00%   /lib64/ld-2.12.so
File Summary
--------------------------------------------------------------------------------
Samples   Self %   Total %   File
    332   80.39%    80.39%   ??
     64   15.50%    95.88%   malloc.c
     10    2.42%    98.31%   interp.c
      3    0.73%    99.03%   bsearch.c
      2    0.48%    99.52%   SRC/patch.c
      2    0.48%   100.00%   printf_fp.c
Function Summary
--------------------------------------------------------------------------------
Samples   Self %   Total %   Function
    126   30.51%    30.51%   brk
     69   16.71%    47.22%   ??
     60   14.53%    61.74%   _int_malloc
     43   10.41%    72.15%   pmatch
     34    8.23%    80.39%   memcpy
     13    3.15%    83.54%   get_correlation_real_mkl
     10    2.42%    85.96%   extract2double
      8    1.94%    87.89%   __intel_new_memcpy
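The function summary attributes about 45% of the samples to `brk` and `_int_malloc`, which points at heap allocation rather than the FFTs. The slides do not say how this was addressed. One common response, sketched here purely as an assumption, is to allocate scratch buffers once per thread before the patch loop and reuse them, instead of allocating inside every iteration. The names `work_on_patch` and `process_patches` and the buffer layout are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-patch kernel: fills the scratch buffer and returns its sum. */
static double work_on_patch(int patch, double *scratch, size_t len)
{
    double sum = 0.0;
    for (size_t k = 0; k < len; k++) {
        scratch[k] = (double)patch + (double)k;
        sum += scratch[k];
    }
    return sum;
}

/* Processes npatches patches. Each thread allocates one scratch buffer
 * before the loop and reuses it for all of its patches, so malloc is
 * called once per thread rather than once per patch.
 * Returns 0 on success, -1 if any thread failed to allocate. */
int process_patches(int npatches, size_t scratch_len, double *result)
{
    assert(scratch_len > 0 && result != NULL);
    int failed = 0;

    /* Without OpenMP enabled the pragmas are ignored and this runs serially. */
    #pragma omp parallel
    {
        double *scratch = malloc(scratch_len * sizeof *scratch);
        if (scratch == NULL) {
            #pragma omp atomic write
            failed = 1;
        }

        int i;
        #pragma omp for schedule(static)
        for (i = 0; i < npatches; i++) {
            if (scratch != NULL) {
                result[i] = work_on_patch(i, scratch, scratch_len);
            }
        }
        free(scratch);
    }
    return failed ? -1 : 0;
}
```

Whether this helps in the real application depends on where its allocations actually happen, which the profile alone does not show.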
Questions?

 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 

Post-processing SAR images on Xeon Phi - a porting exercise

  • 1. Post-processing SAR images on Xeon Phi – a porting exercise
       Martin Hilgeman, HPC Consultant EMEA
  • 2. Agenda
       Research Computing, Dell Confidential
       • Xeon Phi porting checklist
       • Post-processing of SAR images
       • Questions
  • 3. Case study: Geospatial imaging
       SAR interferometry is used to monitor terrain displacements by using satellite data.
       The phase difference is calculated between two SAR images that have been acquired:
       › at different times
       › with slightly different view angles
       › on different observation dates
  • 4. Code details
       Application characteristics:
       – Written in C90 and C++, ~150,000 lines of code
       – Serial application, using Intel MKL DFTI functions
       – Iterative scheme
       • MKL FFTs are already being used, so there is little room for improvement there
       – FFTs are too small to run in threaded mode
       • Application only runs in serial mode and needs to be parallelized in order to run on a Phi at reasonable speed
       • Phase images are divided into patches, which is suitable for parallelization
       – Parallelize the main loop using OpenMP
  • 5. Test Setup
       Jobs were run on the TACC Stampede system
       PowerEdge C8220 with Intel® Xeon® E5-2680 2.7 GHz
       – 16 cores
       – 32 GB memory
       – RHEL 6.3
       – Intel® Xeon Phi™ 7120P
  • 6. Parallelization of big main loop
       Original serial loop:

           count = 1;
           if (lastGCP != NULL) {
               app = lastGCP->next;
           } else {
               app = bufferGCP;
           }
           while (app) {
               app->loc   [0] = (int) app->loc   [0];
               app->loc   [1] = (int) app->loc   [1];
               app->offset[0] = (int) app->offset[0];
               app->offset[1] = (int) app->offset[1];
               <lot of work>
               lastGCP = app;
               app = app->next;
               count++;
           }

       OpenMP version:

           count = 1;
           if (lastGCP != NULL) {
               app = lastGCP->next;
           } else {
               app = bufferGCP;
           }
           int i = 0;
           int nthreads, me;
           #if defined _OPENMP
           #pragma omp parallel \
                   private(i,me,app2,err,suboffset,qc,m_block,s_block)
           {
               nthreads = omp_get_num_threads();
               me = omp_get_thread_num();
           #else
               nthreads = 1;
               me = 0;
           #endif
           #pragma omp for schedule(static)
               for (i = 0; i < buffer_ngcp; i++) {
           #pragma omp critical
                   {
                       app2 = app;
                       app = app->next;
                   }
                   app2->loc   [0] = (int) app2->loc   [0];
                   app2->loc   [1] = (int) app2->loc   [1];
                   app2->offset[0] = (int) app2->offset[0];
                   app2->offset[1] = (int) app2->offset[1];
                   <lot of work>
                   lastGCP = app2;
                   count++;
               }
           #if defined _OPENMP
           }
           #endif
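In the OpenMP version on slide 6, `count++` and `lastGCP = app2` appear to be written by all threads without synchronization, and the critical section serializes the hand-out of list nodes. One possible alternative, sketched below, is to copy the node pointers into an array first so that each loop iteration owns exactly one node. The `gcp_t` type and the `process_gcp` and `process_all` names are hypothetical stand-ins, not taken from the application, and the placeholder comment marks where the real per-patch work would go.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the application's GCP list node. */
typedef struct gcp {
    double loc[2];
    double offset[2];
    struct gcp *next;
} gcp_t;

/* Per-node work: truncate the coordinates, as the slide's loop body does. */
void process_gcp(gcp_t *g)
{
    assert(g != NULL);
    g->loc[0]    = (int) g->loc[0];
    g->loc[1]    = (int) g->loc[1];
    g->offset[0] = (int) g->offset[0];
    g->offset[1] = (int) g->offset[1];
    /* the real per-patch work would go here */
}

/* Process every node reachable from head.
 * Returns the number of nodes processed, or -1 if allocation fails. */
int process_all(gcp_t *head)
{
    int n = 0, i;
    gcp_t *p, **nodes;

    /* Pass 1 (serial): count the nodes. */
    for (p = head; p != NULL; p = p->next)
        n++;
    if (n == 0)
        return 0;

    /* Pass 2 (serial): flatten the list into an array of pointers. */
    nodes = malloc((size_t) n * sizeof *nodes);
    if (nodes == NULL)
        return -1;
    for (p = head, i = 0; p != NULL; p = p->next)
        nodes[i++] = p;

    /* Pass 3 (parallel): iteration i touches only nodes[i], so no
     * critical section or shared counter is needed. The pragma is
     * ignored when the code is compiled without OpenMP. */
    #pragma omp parallel for schedule(static)
    for (i = 0; i < n; i++)
        process_gcp(nodes[i]);

    free(nodes);
    return n;
}
```

With this layout the caller can derive the slide's `count` from the return value and set `lastGCP` to the final node after the loop, instead of updating both inside the parallel region.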
  • 7. Results
       Configuration                            Wall clock time (s)
       Original version on E5440 2.83 GHz       102.00
       Original version on E5-2680 2.70 GHz      27.43
       Additional source optimizations           25.83
       2 threads                                 17.37
       4 threads                                  9.97
       6 threads                                  7.17
       8 threads                                  5.77
       12 threads                                 5.57
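The timings on slide 7 can be turned into speedup and parallel efficiency figures. The sketch below assumes that the 25.83 s "Additional source optimizations" row is the single-threaded baseline for the threaded runs, which the slide does not state explicitly. The function names are illustrative only.

```c
#include <assert.h>

/* Speedup of a parallel run relative to a serial baseline (times in seconds). */
double speedup(double t_serial, double t_parallel)
{
    assert(t_serial > 0.0 && t_parallel > 0.0);
    return t_serial / t_parallel;
}

/* Parallel efficiency: speedup divided by the number of threads used. */
double efficiency(double t_serial, double t_parallel, int nthreads)
{
    assert(nthreads > 0);
    return speedup(t_serial, t_parallel) / nthreads;
}
```

Under that assumption, `speedup(25.83, 17.37)` is about 1.49 (efficiency about 0.74 on 2 threads), `speedup(25.83, 5.77)` is about 4.48 (about 0.56 on 8 threads), and `speedup(25.83, 5.57)` is about 4.64 (about 0.39 on 12 threads), so scaling flattens out between 8 and 12 threads.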
  • 8. How about Intel® Xeon Phi™ performance?
       • Now runs multi-threaded using OpenMP directives
       – The number of patches is greater than the number of threads on the Phi
       • Memory footprint is small, so should fit on the card
       – Running in native mode is possible
       • Intel MKL has complete support for Xeon Phi
       – Assume that the FFTs are making optimal use of the resources
  • 9. A profile
       Module Summary
       --------------------------------------------------------------------------------
       Samples   Self %    Total %   Module
       247       59.81%    59.81%    /lib64/libc-2.12.so
       79        19.13%    78.93%    /scratch/dell-guest/app
       38         9.20%    88.14%    /opt/intel/mkl/lib/intel64/libmkl_core.so
       34         8.23%    96.37%    /opt/intel/mkl/lib/intel64/libmkl_intel_thread.so
       10         2.42%    98.79%    /opt/intel/mkl/lib/intel64/libmkl_intel_lp64.so
       5          1.21%   100.00%    /lib64/ld-2.12.so

       File Summary
       --------------------------------------------------------------------------------
       Samples   Self %    Total %   File
       332       80.39%    80.39%    ??
       64        15.50%    95.88%    malloc.c
       10         2.42%    98.31%    interp.c
       3          0.73%    99.03%    bsearch.c
       2          0.48%    99.52%    SRC/patch.c
       2          0.48%   100.00%    printf_fp.c

       Function Summary
       --------------------------------------------------------------------------------
       Samples   Self %    Total %   Function
       126       30.51%    30.51%    brk
       69        16.71%    47.22%    ??
       60        14.53%    61.74%    _int_malloc
       43        10.41%    72.15%    pmatch
       34         8.23%    80.39%    memcpy
       13         3.15%    83.54%    get_correlation_real_mkl
       10         2.42%    85.96%    extract2double
       8          1.94%    87.89%    __intel_new_memcpy
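In the function summary, `brk` and `_int_malloc` together account for about 45% of the samples, which suggests that heap allocation is a significant cost. The profile does not show where those allocations come from. If they turn out to be temporary buffers allocated and freed for every patch, one common remedy is a scratch buffer that each thread allocates once and then reuses. The sketch below illustrates that idea; `scratch_t`, `scratch_get` and `scratch_free` are hypothetical names and are not part of the application.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reusable scratch buffer: grows on demand, never shrinks. */
typedef struct {
    double *data;      /* NULL until first use */
    size_t  capacity;  /* current size in elements */
} scratch_t;

/* Return a buffer holding at least n doubles, reallocating only when the
 * current capacity is too small. Returns NULL if allocation fails, in
 * which case the existing buffer is left untouched. */
double *scratch_get(scratch_t *s, size_t n)
{
    assert(s != NULL);
    if (n > s->capacity) {
        double *p = realloc(s->data, n * sizeof *p);
        if (p == NULL)
            return NULL;
        s->data = p;
        s->capacity = n;
    }
    return s->data;
}

/* Release the buffer once, after all patches have been processed. */
void scratch_free(scratch_t *s)
{
    assert(s != NULL);
    free(s->data);
    s->data = NULL;
    s->capacity = 0;
}
```

In an OpenMP loop, each thread would hold its own `scratch_t` (initialized to `{NULL, 0}`) and call `scratch_get` inside the loop body, so that after the first few patches no further calls reach the allocator.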