The benefits of upgrading to Haswell Architecture and Windows 8.1:
Benchmarking of Hybrid (CPU/GPU) Parallel Processing (CUDA)-enabled MATLAB Image Processing Algorithms on GTX TITAN and GTX 780M
DIMITRIS VAYENAS, POSTGRADUATE STUDENT
DEPARTMENT OF COMPUTER SCIENCE @ THE UNIVERSITY OF OXFORD &
SOFTWARE INCUBATOR AT ISIS INNOVATION LTD.
Contents
 Introduction
 A “Real-Life” Hybrid (CPU-GPU) Algorithm
 Hardware and Software of Testing
 Performance
 Comparison
 Conclusion
 Acknowledgements
Introduction
 In this laboratory we are attempting to address the following question:
Is it worth upgrading from Ivy Bridge to the Haswell architecture in order to improve performance?
 Intel claims that its new HD 4600 integrated graphics core in the 4th generation Intel i7 processors can increase performance over the previous architecture by up to 7 times.
 What kind of performance improvements can we look forward to in “real-life examples”, and under what conditions?
A “Real-Life” Hybrid Algorithm (1/2)
 Hybrid: executes on both the CPU and the GPU
Consider a MATLAB-implemented algorithm containing the following steps:
A “Real-Life” Hybrid Algorithm (2/2)
 In the hybrid algorithm the tasks in black are performed on the GPU, while the tasks in red are performed on the CPU.
 Thus we have the usual overhead of transferring the data to and from the GPU, and the performance of the CPU also plays a significant role. This consideration is usually ignored by graphics performance benchmarks, which test either the GPU or the CPU but not both.
 Ideally we would have liked to run all tasks on the GPU; however, the current version of MATLAB does not yet support these functions on the Parallel Processing Unit.
 As we will see, the NVIDIA drivers have a substantial impact on performance.
Hardware and Software of Testing
 System I:
 SCAN Workstation with NVIDIA GTX TITAN, Intel i7 3770K @ 4.5 GHz, 32GB RAM @ 2133
MHz, SSD with over 500 MB/s at Read and Write
OS: Windows Server 2012 Datacentre Edition
NVIDIA Driver: 320.49
 System II:
 Schenker W503 with NVIDIA GTX 780M, Intel i7 4800 @ 3.5 GHz, 16 GB RAM @ 1600 MHz, SSD with over 500 MB/s at Read and Write
 A) OS: Windows Server 2012 Datacentre Edition
NVIDIA Driver: 320.49
 B) OS: Windows 8.1
NVIDIA Driver: 326.01
(Important Notice: Figures for System I on Windows 8.1 will be added here by
Wednesday 3/7/2013)
Performance (total runtimes)
Results in seconds (less is better). The bracket after each task gives the number of runs per test and where the task executes (CPU or GPU).

Task                     System I                  System II (a)             System II (b)
                         (TITAN on WinSrv 2012)    (780M on WinSrv 2012)     (780M on Win 8.1)
Edge (800/CPU)           1720.265                  1661.289                  1261.870
Regionprops (400/CPU)     956.622                   899.934                   646.883
Imfilter (1600/GPU)       339.045                   339.477                   263.572
Imresize (1200/CPU)       338.574                   295.782                   199.593
Padarray (2000/CPU)       204.734                   196.303                   149.067
Imfilter (1600/GPU)       126.362                   131.112                   101.717
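The totals above cover different numbers of runs per task, so they are easier to compare once divided by the run count. The sketch below does that arithmetic. It assumes that the number in brackets is the count of calls that make up each total; the names `TOTAL_RUNTIMES`, `mean_per_run` and `per_run_table` are illustrative and do not come from the slides.

```python
# (task, runs, System I, System II (a), System II (b)); totals in seconds,
# copied from the "Performance (total runtimes)" table.
TOTAL_RUNTIMES = [
    ("Edge", 800, 1720.265, 1661.289, 1261.870),
    ("Regionprops", 400, 956.622, 899.934, 646.883),
    ("Imfilter (first)", 1600, 339.045, 339.477, 263.572),
    ("Imresize", 1200, 338.574, 295.782, 199.593),
    ("Padarray", 2000, 204.734, 196.303, 149.067),
    ("Imfilter (second)", 1600, 126.362, 131.112, 101.717),
]


def mean_per_run(total_seconds: float, runs: int) -> float:
    """Average seconds per run, assuming `runs` calls produced the total."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    return total_seconds / runs


def per_run_table(rows):
    """Return (task, runs, mean I, mean II (a), mean II (b)) for each row."""
    table = []
    for task, runs, *totals in rows:
        means = tuple(round(mean_per_run(t, runs), 4) for t in totals)
        table.append((task, runs) + means)
    return table


if __name__ == "__main__":
    for task, runs, sys1, sys2a, sys2b in per_run_table(TOTAL_RUNTIMES):
        print(f"{task:20s} {runs:5d} {sys1:8.4f} {sys2a:8.4f} {sys2b:8.4f}")
```

Read this way, a single Edge call costs roughly two seconds on the two Windows Server 2012 configurations, which is why Edge dominates the totals despite having fewer runs than most other tasks.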
Performance (total run times)
[Chart: task time totals in seconds (less is better) for Edge (800), Regionprops (400), Imfilter (1600), Imresize (1200) and Padarray (2000); series: System I, System II (a), System II (b).]
Performance (Indicative times to process an image)
Parameters: Magnification (Mag), Fudge Factor (FF), Sigma (S) and HSize (HS)
Results in seconds.

Image processing run        System I    System II (a)    System II (b)
Mag_1_FF_0.2_S_0.2_HS_1      0.39699     0.49159          0.18465
Mag_1_FF_0.6_S_0.6_HS_61     0.46689     0.62617          0.38815
Mag_1_FF_1_S_0.8_HS_1       11.4042      8.1427           0.49579
Mag_3_FF_0.4_S_0.8_HS_41     3.1976      2.8881           1.4568
Mag_5_FF_0.4_S_0.8_HS_41     5.7096      4.4588           3.9456
Mag_7_FF_0.4_S_0.8_HS_41     9.1622     10.6905           8.4348
Mag_9_FF_0.4_S_0.8_HS_41    14.5562     17.9971          14.8889
Mag_9_FF_1_S_0.8_HS_41      28.8458     17.0872          15.5799
Performance (Indicative times to process an image)
[Chart: execution time in seconds to process specific images, one group of bars per parameter set (Magnification, Fudge Factor, Sigma and HSize); series: System I, System II (a), System II (b).]
Performance Comparison (total run times)
Percentage change in total run time; a positive value means the first-named system was faster. The bracket after each task gives the number of runs per test and where the task executes (CPU or GPU).

Task                     System II (a)    System II (b)        System II (b)
                         vs. System I     vs. System II (a)    vs. System I
Edge (800/CPU)            3.4             24.0                 26.6
Regionprops (400/CPU)     5.9             28.1                 32.4
Imfilter (1600/GPU)      -0.1             22.4                 22.3
Imresize (1200/CPU)      12.6             32.5                 41.0
Padarray (2000/CPU)       4.1             24.1                 27.2
Imfilter (1600/GPU)      -3.8             22.4                 19.5
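The slides do not state how the percentages were derived. As a check, the sketch below assumes the usual relative reduction, (baseline - candidate) / baseline × 100, and applies it to the total runtimes; under that assumption it reproduces the published figures to one decimal place. The function and variable names are illustrative only.

```python
# Total runtimes in seconds, copied from the "Performance (total runtimes)" table.
# Each tuple is ordered (System I, System II (a), System II (b)).
TOTALS = {
    "Edge (800/CPU)": (1720.265, 1661.289, 1261.870),
    "Regionprops (400/CPU)": (956.622, 899.934, 646.883),
    "Imfilter (1600/GPU), first": (339.045, 339.477, 263.572),
    "Imresize (1200/CPU)": (338.574, 295.782, 199.593),
    "Padarray (2000/CPU)": (204.734, 196.303, 149.067),
    "Imfilter (1600/GPU), second": (126.362, 131.112, 101.717),
}


def percent_reduction(baseline: float, candidate: float) -> float:
    """Runtime saved by `candidate` relative to `baseline`, in percent.

    Positive means the candidate is faster. The formula is an assumption:
    the slides do not give it, but it matches their numbers.
    """
    return (baseline - candidate) / baseline * 100.0


def comparison_rows(totals):
    """Return {task: (II(a) vs I, II(b) vs II(a), II(b) vs I)} rounded to 0.1."""
    rows = {}
    for task, (sys1, sys2a, sys2b) in totals.items():
        rows[task] = (
            round(percent_reduction(sys1, sys2a), 1),
            round(percent_reduction(sys2a, sys2b), 1),
            round(percent_reduction(sys1, sys2b), 1),
        )
    return rows


if __name__ == "__main__":
    for task, (a_vs_1, b_vs_a, b_vs_1) in comparison_rows(TOTALS).items():
        print(f"{task:30s} {a_vs_1:6.1f} {b_vs_a:6.1f} {b_vs_1:6.1f}")
```

Note that a percentage reduction of this kind is not a speed-up factor: a 50% reduction corresponds to running twice as fast.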
Performance Comparison (total run times)
[Chart: percentage change in total run time per task; series: System II (a) vs. System I, System II (b) vs. System II (a), System II (b) vs. System I.]
Performance Comparison based on the time to process an image
Parameters: Magnification (Mag), Fudge Factor (FF), Sigma (S) and HSize (HS)
Percentage change.

Image processing run        System II (a)    System II (b)        System II (b)
                            vs. System I     vs. System II (a)    vs. System I
Mag_1_FF_0.2_S_0.2_HS_1     -23.8            62.4                 53.5
Mag_1_FF_0.6_S_0.6_HS_61    -34.1            38.0                 16.9
Mag_1_FF_1_S_0.8_HS_1        28.6            93.9                 95.7
Mag_3_FF_0.4_S_0.8_HS_41      9.7            49.6                 54.4
Mag_5_FF_0.4_S_0.8_HS_41     21.9            11.5                 30.9
Mag_7_FF_0.4_S_0.8_HS_41    -16.7            21.1                  7.9
Mag_9_FF_0.4_S_0.8_HS_41    -23.6            17.3                 -2.3
Mag_9_FF_1_S_0.8_HS_41       40.8             8.8                 46.0
Performance Comparison based on the time to process an image
[Chart: percentage change in image processing time for each parameter set (Magnification, Fudge Factor, Sigma and HSize); series: System II (a) vs. System I, System II (b) vs. System II (a), System II (b) vs. System I.]
Conclusion
 The performance improvements due to the new architecture in Intel’s fourth-generation i7 family are substantial: the i7 4800 mobile CPU at 3.5 GHz beat the overclocked i7 3770K at 4.5 GHz on every CPU-bound task total, by 3.4% to 12.6%.
 NVIDIA also seems to offer improved support for its GTX 7*** series on Windows 8.1. On identical hardware, Windows 8.1 with the 326.01 driver improved on Windows Server 2012 with the 320.49 driver by up to 93.9% for one set of parameters and by over 20% on every task total.
 Obviously, measuring the performance of hybrid algorithms is similar to asking “how long is a piece of string”, but given that we see manufacturers fine-tuning their products to perform better in standard benchmarking tools, it is always wise to create your own benchmarks that fit your applications.
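As an illustration of what a home-grown benchmark of this kind might look like, here is a minimal sketch of a harness that runs a multi-stage pipeline repeatedly and totals the time spent in each stage, so that transfer and CPU stages are counted alongside the GPU ones. It is not the author's MATLAB code: the stage functions are hypothetical placeholders, and the names `benchmark` and `PIPELINE` are invented for this example.

```python
import time
from collections import defaultdict


def benchmark(stages, payload, runs=10, clock=time.perf_counter):
    """Run the whole pipeline `runs` times and total the seconds per stage.

    `stages` is an ordered list of (name, function) pairs; each function
    receives the previous stage's output and returns its own.
    """
    totals = defaultdict(float)
    for _ in range(runs):
        data = payload
        for name, stage in stages:
            start = clock()
            data = stage(data)
            totals[name] += clock() - start
    return dict(totals)


# Hypothetical stand-ins for real work; replace with calls into your own code.
def to_device(image):
    return list(image)  # placeholder for a host-to-GPU copy


def smooth(image):
    # placeholder for a GPU filter: average of each value and its right neighbour
    return [(a + b) / 2 for a, b in zip(image, image[1:] + image[-1:])]


def to_host(image):
    return list(image)  # placeholder for a GPU-to-host copy


def threshold(image):
    return [1 if value > 0.5 else 0 for value in image]  # placeholder CPU step


PIPELINE = [
    ("to_device", to_device),
    ("smooth", smooth),
    ("to_host", to_host),
    ("threshold", threshold),
]

if __name__ == "__main__":
    for name, seconds in benchmark(PIPELINE, [0.0, 1.0, 1.0, 0.0], runs=100).items():
        print(f"{name:10s} {seconds:.6f} s")
```

One caveat when adapting this to real GPU code: GPU work is often launched asynchronously, so each GPU stage should block until the device has finished before the clock is read, otherwise its cost is silently charged to a later stage.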
Acknowledgements
I would like to thank the following individuals for their help in measuring and optimising the performance of my MATLAB code, through their extensive knowledge of MATLAB and/or CUDA:
 Dr. Mike Giles, Professor of Scientific Computing at the University of Oxford; resident expert for NVIDIA and MATLAB.
 Dr. James Lebak, Parallel Computing Software Engineer at MathWorks, at the MathWorks Boston HQ.
 Captain (USMC) John Roberts, Senior Principal GPGPU Software Engineer at BAE Systems, Inc. (formerly of NVIDIA); John also heads the CUDA Vision Workbench project.
I would also like to thank XMG-Schenker for supporting my research effort through their generous sponsorship of my Schenker W503.
Hybrid CPU GPU MATLAB Image Processing Benchmarking

  • 1. The benefits of upgrading to Haswell Architecture and Windows 8.1: Benchmarking of Hybrid (CPUGPU) Parallel Processing (CUDA) – enabled, MATLAB Image Processing Algorithms in GTX TITAN and GTX 780M DIMITRIS VAYENAS, POSTGRADUATE STUDENT DEPARTMENT OF COMPUTER SCIENCE @ THE UNIVERSITY OF OXFORD & SOFTWARE INCUBATOR AT ISIS INNOVATION LTD.
  • 2. Contents  Introduction  A “Real-Life” Hybrid (CPU-GPU) Algorithm  Hardware and Software of Testing  Performance  Comparison  Conclusion  Acknowledgements
  • 3. Introduction  In this laboratory we are attempting to address the following question: Is it is worth upgrading from Ivy Bridge to a Haswell Architecture in order to improve performance?  Intel claims that its new HD 4600 Integrated Graphics Core in the 4th Generation Intel i7 processors can increase performance over the previous architecture by up to 7 times.  What kind of performance improvements can we look forward in “real life examples” and under what conditions?
  • 4. A “Real-Life” Hybrid Algorithm (1/2)
      • Hybrid: executes on both the CPU and the GPU.
      Consider a MATLAB-implemented algorithm containing the following steps:
  • 5. A “Real-Life” Hybrid Algorithm (2/2)
      • In the hybrid algorithm the tasks in black are performed on the GPU, while the tasks in red are performed on the CPU.
      • Thus we have the usual overhead of transferring the data to and from the GPU, and the performance of the CPU also plays a significant role. This consideration is usually ignored by graphics performance benchmarks, which test either the GPU or the CPU, but not both.
      • Ideally we would have liked to run all tasks on the GPU; however, the current version of MATLAB does not yet support these functions in the Parallel Processing Unit.
      • As we will see, the NVIDIA drivers have a substantial impact on performance.
  • 6. Hardware and Software of Testing
      • System I: SCAN Workstation with NVIDIA GTX TITAN, Intel i7 3770K @ 4.5 GHz, 32 GB RAM @ 2133 MHz, SSD with over 500 MB/s at read and write
          OS: Windows Server 2012 Datacentre Edition; NVIDIA Driver: 320.49
      • System II: Schenker W503 with NVIDIA GTX 780M, Intel i7 4800 @ 3.5 GHz, 16 GB RAM @ 1600 MHz, SSD with over 500 MB/s at read and write
          A) OS: Windows Server 2012 Datacentre Edition; NVIDIA Driver: 320.49
          B) OS: Windows 8.1; NVIDIA Driver: 326.01
      (Important Notice: Figures for System I on Windows 8.1 will be added here by Wednesday 3/7/2013)
  • 7. Performance (total runtimes)
      Results in seconds (less is better). Each task is listed with the number of runs per test and where it executes (CPU or GPU).
      System I = TITAN on WinSrv 2012; System II (a) = 780M on WinSrv 2012; System II (b) = 780M on Win 8.1.

      Task (runs/where)        System I    System II (a)   System II (b)
      Edge (800/CPU)           1720.265    1661.289        1261.870
      Regionprops (400/CPU)     956.622     899.934         646.883
      Imfilter (1600/GPU)       339.045     339.477         263.572
      Imresize (1200/CPU)       338.574     295.782         199.593
      Padarray (2000/CPU)       204.734     196.303         149.067
      Imfilter (1600/GPU)       126.362     131.112         101.717
  • 8. Performance (total run times)
      [Bar chart: task time totals in seconds (less is better) for Edge (800), Regionprops (400), Imfilter (1600), Imresize (1200) and Padarray (2000); series: System I, System II (a), System II (b).]
  • 9. Performance (Indicative times to process an image)
      Parameters: Magnification (Mag), Fudge Factor (FF), Sigma (S) and HSize (HS). Results in seconds.

      Image processing            System I    System II (a)   System II (b)
      Mag_1_FF_0.2_S_0.2_HS_1      0.39699     0.49159         0.18465
      Mag_1_FF_0.6_S_0.6_HS_61     0.46689     0.62617         0.38815
      Mag_1_FF_1_S_0.8_HS_1       11.4042      8.1427          0.49579
      Mag_3_FF_0.4_S_0.8_HS_41     3.1976      2.8881          1.4568
      Mag_5_FF_0.4_S_0.8_HS_41     5.7096      4.4588          3.9456
      Mag_7_FF_0.4_S_0.8_HS_41     9.1622     10.6905          8.4348
      Mag_9_FF_0.4_S_0.8_HS_41    14.5562     17.9971         14.8889
      Mag_9_FF_1_S_0.8_HS_41      28.8458     17.0872         15.5799
  • 10. Performance (Indicative times to process an image)
      Parameters: Magnification, Fudge Factor, Sigma and HSize
      [Chart: execution time in seconds to process specific images; series: System I, System II (a), System II (b).]
  • 11. Performance Comparison (total run times)
      Percentage change in total runtime; a positive value means the first-named system took less time. Each task is listed with the number of runs per test and where it executes (CPU or GPU).

      Task (runs/where)        II (a) vs. I   II (b) vs. II (a)   II (b) vs. I
      Edge (800/CPU)             3.4           24.0                26.6
      Regionprops (400/CPU)      5.9           28.1                32.4
      Imfilter (1600/GPU)       -0.1           22.4                22.3
      Imresize (1200/CPU)       12.6           32.5                41.0
      Padarray (2000/CPU)        4.1           24.1                27.2
      Imfilter (1600/GPU)       -3.8           22.4                19.5
  • 12. Performance Comparison (total run times)
      [Bar chart: percentage change per task; series: System II (a) vs. System I, System II (b) vs. System II (a), System II (b) vs. System I.]
  • 13. Performance Comparison based on the time to process an image
      Parameters: Magnification (Mag), Fudge Factor (FF), Sigma (S) and HSize (HS). Percentage change; a positive value means the first-named system took less time.

      Image processing            II (a) vs. I   II (b) vs. II (a)   II (b) vs. I
      Mag_1_FF_0.2_S_0.2_HS_1     -23.8           62.4                53.5
      Mag_1_FF_0.6_S_0.6_HS_61    -34.1           38.0                16.9
      Mag_1_FF_1_S_0.8_HS_1        28.6           93.9                95.7
      Mag_3_FF_0.4_S_0.8_HS_41      9.7           49.6                54.4
      Mag_5_FF_0.4_S_0.8_HS_41     21.9           11.5                30.9
      Mag_7_FF_0.4_S_0.8_HS_41    -16.7           21.1                 7.9
      Mag_9_FF_0.4_S_0.8_HS_41    -23.6           17.3                -2.3
      Mag_9_FF_1_S_0.8_HS_41       40.8            8.8                46.0
  • 14. Performance Comparison based on the time to process an image
      Parameters: Magnification, Fudge Factor, Sigma and HSize
      [Bar chart: percentage change in image processing time; series: System II (a) vs. System I, System II (b) vs. System II (a), System II (b) vs. System I.]
  • 15. Conclusion
      • The performance improvements due to the new architecture in Intel’s fourth-generation i7 family are substantial, as we can see from the gains of the i7 4800 Mobile CPU over the overclocked i7 3770K!
      • NVIDIA also seems to offer improved support for its GTX 7*** Series on Windows 8.1: on identical hardware, Windows 8.1 with the 326.01 driver improved on Windows Server 2012 with the 320.49 driver by up to 93.9% for one set of parameters and by over 20% in total runtime.
      • Obviously, measuring the performance of hybrid algorithms is similar to asking “how long is a piece of string”, but given that we see manufacturers fine-tuning their products in order to perform better in standard benchmarking tools, it is always wise to create your own benchmarks that fit your applications.
  • 16. Acknowledgements
      I would like to thank the following individuals for their help in measuring and optimising the performance of my MATLAB code, through their extensive knowledge of MATLAB and/or CUDA:
      • Dr. Mike Giles, Professor of Scientific Computing at the University of Oxford; resident expert for NVIDIA and MATLAB
      • Dr. James Lebak, Parallel Computing Software Engineer at MathWorks Boston HQ
      • Captain (USMC) John Roberts, Senior Principal GPGPU Software Engineer at BAE Systems, Inc. (formerly of NVIDIA); John also heads the CUDA Vision Workbench project.
      I would also like to thank XMG-Schenker for supporting my research effort through their generous sponsorship of my Schenker W503.
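
The conclusion recommends building your own benchmarks, and slides 7 and 11 report total runtimes over a fixed number of runs together with the percentage change between systems. The following is a minimal sketch of both steps, written in Python rather than the MATLAB used for the original measurements. The helper names `total_runtime` and `percent_change` are hypothetical and do not come from the slides. The formula (baseline minus candidate, divided by baseline) is an assumption inferred from the published figures: applied to the Edge totals on slide 7, it gives the 3.4, 24.0 and 26.6 listed on slide 11.

```python
import time


def total_runtime(task, runs):
    """Return the summed wall-clock time, in seconds, of `runs` calls to `task`.

    `task` is any zero-argument callable (hypothetical helper, not from the slides).
    """
    start = time.perf_counter()
    for _ in range(runs):
        task()
    return time.perf_counter() - start


def percent_change(baseline, candidate):
    """Percentage reduction in runtime relative to `baseline` (positive = faster).

    Assumed formula, inferred from the figures on slides 7 and 11.
    """
    return (baseline - candidate) / baseline * 100.0


# Total runtimes in seconds for the Edge task (800 runs), copied from slide 7.
edge = {
    "System I": 1720.265,
    "System II (a)": 1661.289,
    "System II (b)": 1261.870,
}

# Same three comparisons as the columns on slide 11.
comparisons = [
    ("System II (a)", "System I"),
    ("System II (b)", "System II (a)"),
    ("System II (b)", "System I"),
]

for new, old in comparisons:
    change = percent_change(edge[old], edge[new])
    print(f"{new} vs. {old}: {change:.1f}%")
```

To benchmark your own workload, pass it as a callable, for example `total_runtime(lambda: my_filter(image), 800)`, where `my_filter` and `image` stand in for whatever your application actually runs, and compare the totals from two systems with `percent_change`.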