Can You Get Performance from Xeon Phi Easily?
Lessons Learned from Two Real Cases
Objective
• Check the amount of work needed to use Intel Xeon Phi.
• Minimal modifications, using only pragmas.
• Two applications:
– CalcuNetW: tests the MKL libraries.
– GammaMaps: tests pragmas.
• Two modes:
– Native: compiled to execute only on the Xeon Phi.
– Offload: uses the host plus the Xeon Phi.
CalcuNetW: Calculate Measurements in Complex Networks
• Complex networks consist of sets of nodes (vertices) joined together in pairs by links (edges).
• The application calculates, for each network:
– Subgraph Centrality (SC): characterizes the participation of each node in all subgraphs of the network.
– SC odd: accounts only for paths of odd length.
– SC even: accounts only for paths of even length.
– Bipartivity: the proportion of even closed walks to the total number of closed walks in the network.
– Network Communicability for Connected Nodes, C(p,q): measures how well two nodes p and q of the network communicate.
– Network Communicability, C(G): the mean of all the C(p,q).
Mouriño J.C., Estrada E., Gomez A., “CalcuNetw: Calculate Measurements in Complex Networks”, Informe Técnico CESGA-2005-003.
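The measures listed on this slide can be illustrated with a small pure-Python sketch. It is not the CalcuNetW code, which relies on MKL. It assumes the usual matrix-function definitions from the subgraph centrality literature: SC(i) is the i-th diagonal entry of exp(A) for the adjacency matrix A, SC even and SC odd come from the even and odd terms of the Taylor series (cosh(A) and sinh(A)), bipartivity is the ratio of their traces, and C(p,q) is the (p,q) entry of exp(A). Taking the mean C(G) over node pairs with p < q is an assumption, and the function names `exp_parts` and `network_measures` are hypothetical. The truncated series is only suitable for small example graphs.

```python
import math


def mat_mul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[math.fsum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def exp_parts(adj, terms=40):
    """Return (cosh(A), sinh(A)) from a truncated Taylor series of exp(A)."""
    n = len(adj)
    even = [[0.0] * n for _ in range(n)]
    odd = [[0.0] * n for _ in range(n)]
    # term holds A^k / k!, starting with the identity matrix (k = 0).
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(terms):
        target = even if k % 2 == 0 else odd
        for i in range(n):
            for j in range(n):
                target[i][j] += term[i][j]
        # Next term: A^(k+1) / (k+1)!
        term = [[x / (k + 1) for x in row] for row in mat_mul(term, adj)]
    return even, odd


def network_measures(adj):
    """Compute the slide's measures for an undirected graph given by adj."""
    n = len(adj)
    even, odd = exp_parts(adj)
    sc_even = [even[i][i] for i in range(n)]   # closed walks of even length
    sc_odd = [odd[i][i] for i in range(n)]     # closed walks of odd length
    sc = [e + o for e, o in zip(sc_even, sc_odd)]
    comm = [[even[p][q] + odd[p][q] for q in range(n)] for p in range(n)]
    # Assumption: C(G) is averaged over distinct node pairs with p < q.
    pairs = [(p, q) for p in range(n) for q in range(p + 1, n)]
    mean_comm = sum(comm[p][q] for p, q in pairs) / len(pairs) if pairs else 0.0
    return {
        "sc": sc,
        "sc_odd": sc_odd,
        "sc_even": sc_even,
        "bipartivity": sum(sc_even) / sum(sc),
        "communicability": comm,
        "mean_communicability": mean_comm,
    }
```

For a bipartite graph such as a single edge, every closed walk has even length, so SC odd is zero and the bipartivity is 1. A triangle contains odd cycles, so its bipartivity drops below 1.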
CalcuNetW
GammaMaps: A figure-of-merit in Radiation Therapy
[Figure: 3D dose grids with X, Y, Z axes, labelled "Dose in voxel i,j,k".]
GammaMaps: A figure-of-merit in Radiation Therapy
Workflow: Read doses → Initialise and normalise → Compute gamma → Store gamma
• Application in FORTRAN 90
• Parallelised using OpenMP
• Geometric algorithm*
• 512 x 512 x 128 = 33,554,432 voxels
• Auto-vectorization
• Pragmas for offload
* T. Ju, T. Simpson, J. O. Deasy, and D. A. Low, “Geometric interpretation of the γ dose distribution comparison technique: Interpolation-free calculation,” Medical Physics, vol. 35, no. 3, p. 879, 2008.
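To make the "Compute gamma" step concrete, here is a minimal sketch of the gamma index for two small 3D dose grids. It uses a brute-force search over all voxel pairs. It is neither the interpolation-free geometric algorithm of Ju et al. cited on the slide nor the FORTRAN 90 application. The function name `gamma_map` and its parameters are hypothetical, and the formula assumed is the standard one: for each reference voxel, the minimum over evaluated voxels of sqrt(distance² / dist_tol² + dose difference² / dose_tol²).

```python
from itertools import product
from math import sqrt


def gamma_map(reference, evaluated, spacing, dose_tol, dist_tol):
    """Brute-force gamma index for two 3D dose grids of identical shape.

    reference, evaluated: nested lists indexed as dose[i][j][k]
    spacing: (dx, dy, dz) voxel size, in the same length unit as dist_tol
    dose_tol: dose-difference criterion, in the same unit as the doses
    dist_tol: distance-to-agreement criterion
    """
    nx, ny, nz = len(reference), len(reference[0]), len(reference[0][0])
    voxels = list(product(range(nx), range(ny), range(nz)))
    gamma = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for (i, j, k) in voxels:
        best = float("inf")
        for (a, b, c) in voxels:
            # Squared physical distance between the two voxel centres.
            dist2 = sum(((p - q) * s) ** 2
                        for p, q, s in zip((i, j, k), (a, b, c), spacing))
            diff = evaluated[a][b][c] - reference[i][j][k]
            best = min(best, dist2 / dist_tol ** 2 + (diff / dose_tol) ** 2)
        gamma[i][j][k] = sqrt(best)
    return gamma
```

A voxel is conventionally considered to pass when its gamma value is at most 1. The all-pairs search costs time quadratic in the number of voxels, so this sketch is only meant for tiny grids. It also shows that each reference voxel is processed independently, which is what makes the outer loop a natural candidate for OpenMP parallelisation.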
Results of Experiments
Platform
Host
  CPU model:         Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz
  Nr. of cores:      16
  Memory:            32788 MB
  Operating system:  Linux 2.6.32-279.el6.x86_64
  Compiler version:  2013U2
Intel Xeon Phi
  Model:             Beta0 Engineering Sample
  Nr. of cores:      61 at 1.09GHz
  Memory:            7936 MB
  Operating system:  MPSS Gold U1
  Compiler version:  2013U2
  GDDR technology:   GDDR5
  GDDR frequency:    2750000 kHz
• Remote access to Intel systems
• Feb. 2013
Intel Xeon Phi Affinity Policies
[Figure: four panels, each showing 4 cores (C1 to C4) with 4 hardware threads per core (HT1 to HT4) and the placement of thread IDs 0 to 7.]
COMPACT - FINE: 0 1 2 3 4 5 6 7
SCATTER - FINE: 0 4 1 5 2 6 3 7
BALANCED - FINE: 0 1 2 3 4 5 6 7
BALANCED - CORE: {0,1} {2,3} {4,5} {6,7}
• Type
– Compact
– Scatter
– Balanced
• Granularity
– Fine or Thread
– Core
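The three affinity types can be modelled with a short sketch that assigns thread IDs to cores. It is a simplified illustration, not the Intel OpenMP runtime's actual algorithm, and the function name `place_threads` is hypothetical. It assumes the layout of the affinity slide (4 cores with 4 hardware threads each) and ignores granularity, which only decides whether a thread is pinned to a single hardware thread (fine) or may float within its core (core). With the Intel OpenMP runtime, these policies are normally selected through the `KMP_AFFINITY` environment variable.

```python
def place_threads(n_threads, n_cores=4, threads_per_core=4, policy="compact"):
    """Return, for each core, the list of thread IDs placed on it."""
    if n_threads > n_cores * threads_per_core:
        raise ValueError("more threads than hardware threads")
    cores = [[] for _ in range(n_cores)]
    if policy == "compact":
        # Fill every hardware thread of a core before moving to the next core.
        for t in range(n_threads):
            cores[t // threads_per_core].append(t)
    elif policy == "scatter":
        # Round-robin across cores, so neighbouring IDs land on different cores.
        for t in range(n_threads):
            cores[t % n_cores].append(t)
    elif policy == "balanced":
        # Spread evenly across cores, keeping neighbouring IDs on the same core.
        base, extra = divmod(n_threads, n_cores)
        t = 0
        for c in range(n_cores):
            count = base + (1 if c < extra else 0)
            cores[c].extend(range(t, t + count))
            t += count
    else:
        raise ValueError("unknown policy: " + policy)
    return cores
```

Calling `place_threads(8, policy="scatter")` groups the IDs per core in the same order as the SCATTER row of the slide (0 4, 1 5, 2 6, 3 7), while `balanced` gives the pairs shown for BALANCED - CORE. Compact leaves two of the four cores idle with 8 threads.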
Results for CalcuNetW
[Figures: CalcuNetW result plots, three slides.]
Results for GammaMaps
GammaMaps: Host
[Figure: elapsed time (s), 0 to 1400, versus number of threads, 0 to 20. Series: Host, local-compact-core, local-compact-fine, local-scatter-fine, local-scatter-core.]
GammaMaps
Xeon Phi poor I/O
Conclusions
• Using the MKL library is easy and requires no changes to the code.
• Simple pragmas in the code allow the Xeon Phi to be used quickly.
• I/O performance issues on the Xeon Phi.
• Performance: 1 Xeon Phi ~ 1 Xeon E5-2680.
• Improving performance requires additional work.
Acknowledgements
The authors would like to thank Intel for providing access to the Intel Xeon Phi coprocessor.
Questions
Andrés Gómez
José Carlos Mouriño
Carmen Cotelo
Aurelio Rodríguez
The TEAM
