Kepler vs Xeon Phi: our measurements
and their complete source code
http://www.hpcmagazine.fr/en-couverture/kepler-vs-xeon-phi-nos-mesures/
Florent Duguet, PhD
CEO - Altimesh
http://www.altimesh.com/
(original article in French)
Presentation & translation by
Ronan Keryell (SILKAN / Aptina)
2 different architectures
Some functional analogies...
● Vendor data
● Flops/memop: minimum ratio of floating-point operations to
memory operations needed to avoid waiting for memory
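The flops/memop ratio above can be worked out from two vendor figures: peak arithmetic throughput and peak memory bandwidth. The sketch below is only an illustration of that arithmetic; the function name and its parameters are hypothetical and not taken from the original sources.

```c
/* Minimum number of floating-point operations a kernel must perform per
 * memory operation (one element loaded or stored) so that it is not
 * limited by memory bandwidth.
 *
 *   peak_flops      peak arithmetic throughput, in FLOP/s
 *   bandwidth_bytes peak memory bandwidth, in bytes/s
 *   element_bytes   size of one memory operation, e.g. 4.0 for a float
 *
 * Hypothetical helper, written for illustration only. */
double min_flops_per_memop(double peak_flops, double bandwidth_bytes,
                           double element_bytes)
{
    /* Number of elements the memory system can deliver per second. */
    double memops_per_second = bandwidth_bytes / element_bytes;
    return peak_flops / memops_per_second;
}
```

With placeholder figures (not the K20 or Xeon Phi vendor data) of 1000 GFLOP/s, 100 GB/s and 4-byte floats, `min_flops_per_memop(1000e9, 100e9, 4.0)` gives 40: a kernel doing fewer than 40 operations per float read would wait for memory.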
3 microbenchmarks
From theory to practice...
● 1 memory-bound: read a vector
– K20: Naïve / vectorized with float4 / through the texture cache
– Phi: Naïve / vectorized / gather / aligned vector load
● 1 compute-bound: Horner approximation of expm1(), iterated 12 times:
(expm1())^12 (= 12 add, 24 mul, 60 madd)
– K20: Naïve / vectorized with float4 or double4
– Phi: Naïve / intrinsics
● 1 latency-bound: b[i] += a[i + index[k]]
– K20: Naïve / loop interchange / __ldg (load through the read-only data cache)
– Phi: Naïve / vectorized / gather / aligned vector load
Memory bound (results, 2 slides of figures)
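As a rough idea of what the memory-bound microbenchmark ("read a vector") measures, here is a host-side C sketch, not the code from the original archive. It assumes the values read are summed so that the compiler cannot remove the loads. The second function only mimics the idea of the float4 variant by consuming four floats per iteration; both function names are hypothetical.

```c
#include <stddef.h>

/* Naive variant: stream through the vector once, one float at a time.
 * The running sum exists only to keep the loads alive. */
float read_vector_naive(const float *a, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i)
        sum += a[i];
    return sum;
}

/* Variant that consumes four consecutive floats per iteration, in the
 * spirit of a float4 load, with four independent partial sums. */
float read_vector_by4(const float *a, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;
    for (; i + 3 < n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    /* Remainder when n is not a multiple of 4. */
    for (; i < n; ++i)
        s0 += a[i];
    return (s0 + s1) + (s2 + s3);
}
```

Almost no arithmetic is done per element, so the run time of either loop is dominated by how fast the vector can be streamed from memory.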
Compute bound (results, 2 slides of figures)
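The compute-bound microbenchmark iterates a Horner-form polynomial approximation of expm1() 12 times. The sketch below illustrates that technique only. The polynomial (a degree-6 truncated Taylor series) and the function names are assumptions, so its operation mix (5 multiply-adds and 1 multiply per evaluation) is not claimed to match the 12 add / 24 mul / 60 madd of the original kernel.

```c
/* Horner evaluation of x + x^2/2! + ... + x^6/6!, an approximation of
 * expm1(x) = exp(x) - 1 for small |x|.
 * Each "p * x + c" step is one multiply-add; a compiler may or may not
 * contract it into a single FMA instruction. */
float expm1_horner(float x)
{
    float p = 1.0f / 720.0f;
    p = p * x + 1.0f / 120.0f;
    p = p * x + 1.0f / 24.0f;
    p = p * x + 1.0f / 6.0f;
    p = p * x + 0.5f;
    p = p * x + 1.0f;
    return p * x;
}

/* Apply the approximation repeatedly, feeding each result back in.
 * The original benchmark uses 12 iterations. */
float expm1_iterated(float x, int iterations)
{
    for (int it = 0; it < iterations; ++it)
        x = expm1_horner(x);
    return x;
}
```

All the work happens in registers on a single value, so throughput depends on the arithmetic units rather than on memory.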
Latency bound (results, 2 slides of figures)
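The latency-bound microbenchmark is the indirect update b[i] += a[i + index[k]]. The sketch below shows a plain version and a loop-interchanged version in host-side C. The loop order of the "naive" version, the use of non-negative offsets, and the function names are all assumptions, not details taken from the original sources.

```c
#include <stddef.h>

/* Naive order (assumed): for each output element, walk all offsets.
 * a must hold at least n + max(index[k]) elements. */
void gather_add_naive(float *b, const float *a, const size_t *index,
                      size_t n, size_t m)
{
    for (size_t i = 0; i < n; ++i)
        for (size_t k = 0; k < m; ++k)
            b[i] += a[i + index[k]];
}

/* Loop interchange: for each offset, sweep the whole vector, so the
 * inner loop reads a contiguously and index[k] is loaded once per sweep. */
void gather_add_interchanged(float *b, const float *a, const size_t *index,
                             size_t n, size_t m)
{
    for (size_t k = 0; k < m; ++k) {
        size_t offset = index[k];
        for (size_t i = 0; i < n; ++i)
            b[i] += a[i + offset];
    }
}
```

Both versions add the same terms to each b[i] in the same order, so they produce identical results; only the memory access pattern differs.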
Conclusion
● Values in parentheses are vendor data
● Warning: in this experiment an FMA counts as 1 FLOP instead of the
usual (and the manufacturers'!) 2 FLOP
● Examples available :-) at
http://www.hpcmagazine.fr/files/sources/003-Kepler-vs-Phi.zip
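To make the FLOP-counting convention concrete, the small helper below (a hypothetical name, written for illustration) counts operations with a configurable weight for multiply-adds.

```c
/* Total FLOP count for a kernel, given how many FLOP one fused
 * multiply-add is worth: 1 in the convention used in this presentation,
 * 2 in the usual vendor convention. */
long count_flops(long adds, long muls, long madds, int flops_per_madd)
{
    return adds + muls + madds * (long)flops_per_madd;
}
```

For the compute-bound kernel (12 add, 24 mul, 60 madd), `count_flops(12, 24, 60, 1)` gives 96 FLOP under this presentation's convention, while `count_flops(12, 24, 60, 2)` gives 156 FLOP under the usual one. Measured rates therefore look lower here than they would next to vendor peak figures.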
