COMPILER OPTIMIZATION-SPACE
EXPLORATION
S. Triantafyllis, M. Vachharajani, N. Vachharajani,
D. August


Presenter: Tanzir Musabbir
 Authors from Princeton University, NJ
 Published in the International Symposium on Code
  Generation and Optimization: Feedback-Directed
  and Runtime Optimization (CGO), 2003
OUTLINE
 Introduction
 Problems
 Some Solutions
 Their Solution – Optimization-Space Exploration
 Experiments and Results
 Conclusion
INTRODUCTION - PROCESSORS
 Have become more complex
 Incorporate additional computation resources
 The compiler can no longer rely on a simple instruction
  count to guide optimization
 It has to balance resource utilization, register usage,
  and dependences
INTRODUCTION - COMPILERS
   As a consequence
     Compilers become complex
     They apply optimizations aggressively
     They have to use predictive heuristics to decide
      where and to what extent optimizations should be
      applied
OUTLINE
 Introduction
 Problems
 Some Solutions
 Their Solution – Optimization-Space Exploration
 Experiments and Results
 Conclusion
PROBLEMS - PREDICTIVE HEURISTICS?
 Modern compilers employ predictive heuristics
 These try to determine a priori the benefit of certain
  optimizations
 They are tuned by compiler writers to give the highest
  average performance
 The resulting optimization decisions remain suboptimal
  for many individual code segments
 This leaves significant potential performance gains
  unrealized
OUTLINE
 Introduction
 Problems
 Some Solutions
 Their Solution – Optimization-Space Exploration
 Experiments and Results
 Conclusion
SOME SOLUTIONS – ITERATIVE COMPILATION
 Compiles a program multiple times with different
  optimization configurations
 By actually trying the optimizations instead of
  predicting their benefit, predictive heuristics are
  eliminated
 Results are not directly applicable to general-
  purpose architectures and applications
 Incurs large compile times
OUTLINE
 Introduction
 Problems
 Some Solutions
 Their Solution – Optimization-Space Exploration
 Experiments and Results
 Conclusion
THEIR SOLUTION – OPTIMIZATION-SPACE
EXPLORATION
 A general and practical version of iterative
  compilation
 Explores the space of optimization configurations
  through multiple compilations
THEIR SOLUTION – OPTIMIZATION-SPACE
EXPLORATION
   To address the compile time:
     Uses the experience of the compiler writer to prune
      the number of configurations that must be explored
     Uses a performance estimator instead of evaluating
      the code by execution
     Selects a custom configuration for each code segment
     Selects the next optimization configuration by examining
      the previous configurations’ characteristics
SINGLE FIXED CONFIGURATION
 A set of fixed heuristics is applied to each code
  segment
 Only one version of the code exists at any given
  time
 That version is passed from transformation to
  transformation
OSE OVER MANY CONFIGURATIONS
 OSE compiler simultaneously applies multiple
  transformation sequences on each code segment
 Each version is optimized using a different
  optimization configuration.
 The compiler emits the fittest version as determined
  by the performance evaluator
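The selection loop on this slide can be sketched in a few lines. Everything here is an illustrative stand-in, not the paper's actual interfaces: `compile_with`, `estimate_performance`, and the toy configuration dictionaries are hypothetical.

```python
# Minimal sketch of OSE's core idea: compile a code segment under several
# optimization configurations and emit the fittest version, as judged by a
# performance estimator rather than by actually running the code.

def ose_select(segment, configurations, compile_with, estimate_performance):
    """Compile `segment` under each configuration; return the (config,
    version) pair with the lowest estimated cost."""
    best = None
    for config in configurations:
        version = compile_with(segment, config)
        cost = estimate_performance(version)
        if best is None or cost < best[2]:
            best = (config, version, cost)
    return best[0], best[1]

# Toy usage: "compiling" is a no-op and the estimator pretends an unroll
# factor of 4 is ideal for this segment.
configs = [{"unroll": 1}, {"unroll": 4}, {"unroll": 8}]
chosen, _ = ose_select(
    "loop_body",
    configs,
    compile_with=lambda seg, cfg: (seg, cfg),
    estimate_performance=lambda v: abs(v[1]["unroll"] - 4),
)
print(chosen)  # {'unroll': 4}
```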
OSE – LIMITING THE SEARCH SPACE
   Optimization Space
       Derived from a set of optimization parameters
   Optimization Parameters
       Optimization level
       High Level Optimization (HLO) level
       Micro-architecture type
       Coalesce adjacent loads and stores
       HLO phase order
       Loop unroll limit
       Update dependencies after unrolling
       Perform software pipelining
OSE – LIMITING THE SEARCH SPACE
   Optimization Parameters
       Heuristic to disable software pipelining
       Allow control speculation during software pipelining
       Software pipeline outer loops
       Enable if-conversion heuristic for software pipelining
        Software pipeline loops with early exits
       Enable if conversion
       Enable non-standard predication
       Enable pre-scheduling
       Scheduler ready criterion
COMPILER CONSTRUCTION-TIME PRUNING
  Limit the total number of configurations that will be
   considered at compile time
  Construct a set S with at most N configurations
  S is chosen by measuring the impact on a
   representative set of code segments C as follows:
      S’ = the default configuration plus configurations with
       non-default parameters
      a) Run C compiled with S’ on real hardware and retain
       in S’ only the valuable configurations
      b) Form combinations of the configurations in S’ as S’’;
       repeat a) for S’’ and retain only the best N configurations
      Repeat b) until no new configurations can be generated or
       the speedup no longer improves
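The pruning loop above can be sketched as follows. This is a rough illustration under stated assumptions: configurations are modeled as frozensets of parameter settings, combining two configurations is set union, and `measure_speedup` stands in for the real-hardware runs the slide describes.

```python
# Sketch of construction-time pruning: start from single-parameter
# configurations, keep only the valuable ones, combine survivors, and repeat
# until no new configurations appear or the retained set stops improving.

def prune_configurations(seed_configs, measure_speedup, n, threshold=1.0):
    """Iteratively combine and filter configurations, keeping at most `n`."""
    # a) keep only seed configurations whose speedup beats the threshold
    retained = {c for c in seed_configs if measure_speedup(c) > threshold}
    best = sorted(retained, key=measure_speedup, reverse=True)[:n]
    while True:
        # b) form pairwise combinations of the retained configurations
        combined = {a | b for a in best for b in best if a != b}
        new = combined - set(best)
        if not new:
            break  # no new configurations can be generated
        candidates = [c for c in new if measure_speedup(c) > threshold]
        merged = sorted(set(best) | set(candidates),
                        key=measure_speedup, reverse=True)[:n]
        if merged == best:
            break  # the speedup no longer improves
        best = merged
    return best
```

A toy run with a synthetic speedup function that rewards two "good" parameters converges on their combination.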
OSE – LIMITING THE SEARCH SPACE
        Characterizing Configuration Correlations
           Build an optimization configuration tree
           Critical configurations = configurations at the same level
1. Construct O = the set of the m most important
   configurations in S for all code segments in C
2. Choose all oi in O as successors of the root node.
3. For each configuration oi in O:
4.   Construct Ci = {cj : argmaxk(pj,k) = i}, k = 1…m
5. Repeat steps 3–4 to find oi’s successors, limiting
   the code segments to Ci and the configurations
   to S − O.
OSE – LIMITING THE SEARCH SPACE
   Compile-time search
     Do a breadth-first search on the optimization
      configuration tree
     Choose the configuration that yields the best estimated
      performance
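The compile-time walk over the tree can be sketched as a level-by-level descent: score every configuration at the current level with the performance estimator, keep the best one seen so far, and descend into the winner's subtree. The tree encoding and `estimate` are illustrative assumptions.

```python
# Sketch of the compile-time search: greedy level-by-level descent through
# the optimization configuration tree, guided by the performance estimator.

def search_tree(tree, estimate):
    """`tree` maps a configuration to its subtree; return the configuration
    with the best (lowest) estimated cost along the descent."""
    best_config, best_cost = None, float("inf")
    level = tree
    while level:
        scored = {cfg: estimate(cfg) for cfg in level}
        winner = min(scored, key=scored.get)
        if scored[winner] < best_cost:
            best_config, best_cost = winner, scored[winner]
        level = level[winner]  # descend into the winning subtree
    return best_config

# Toy usage: "O2" beats "O3" at the first level, and its refinement
# "O2+unroll" improves on it at the second.
tree = {"O2": {"O2+unroll": {}, "O2+swp": {}}, "O3": {}}
cost = {"O2": 10, "O3": 12, "O2+unroll": 7, "O2+swp": 9}
best = search_tree(tree, lambda c: cost[c])
print(best)  # O2+unroll
```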
OSE – LIMITING THE SEARCH SPACE
   Limit the OSE application
     To hot code segments
     Hot code segments are identified through profiling or
      hardware performance counters during a program run
EVALUATION
     OSE Compiler Algorithm
1.   Profile the code
2.   For each Function:
3.     Compile to the high level IR
4.     Optimize using HLO
5.   For each Function:
6.     If the function is hot:
7.       Perform OSE on second HLO and CG
8.       Emit the function using the best
         configuration
9.     If the function is not hot, use the
       standard configuration
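The numbered algorithm above can be written as a small driver. Every helper here (`profile`, `compile_to_ir`, `hlo`, `ose`, `standard_compile`) is a hypothetical stand-in for the compiler phases the slide names.

```python
# Sketch of the OSE compiler driver: profile, run the front end and first HLO
# pass on every function, then apply OSE only to hot functions and the
# standard configuration to the rest.

def ose_compile(functions, profile, compile_to_ir, hlo, ose, standard_compile):
    hot_counts = profile(functions)                      # 1. profile the code
    irs = {f: hlo(compile_to_ir(f)) for f in functions}  # 2-4. IR + first HLO
    emitted = {}
    for f in functions:                                  # 5.
        if hot_counts[f] > 0:                            # 6. hot function?
            emitted[f] = ose(irs[f])                     # 7-8. OSE on HLO/CG
        else:
            emitted[f] = standard_compile(irs[f])        # 9. default config
    return emitted

# Toy usage with string-tagging stand-ins for each phase.
out = ose_compile(
    ["hot_fn", "cold_fn"],
    profile=lambda fs: {"hot_fn": 100, "cold_fn": 0},
    compile_to_ir=lambda f: f + "_ir",
    hlo=lambda ir: ir + "_hlo",
    ose=lambda ir: ir + "_ose",
    standard_compile=lambda ir: ir + "_std",
)
print(out)  # {'hot_fn': 'hot_fn_ir_hlo_ose', 'cold_fn': 'cold_fn_ir_hlo_std'}
```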
COMPILE TIME PERFORMANCE ESTIMATION
   Model based on:
     Ideal cycle count, T
     Data cache performance, Λ (L)
     Instruction cache performance, I
     Branch misprediction, B
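The slide lists the estimator's four inputs without giving the formula. A simple additive combination is sketched below as an assumption; the paper's actual model may weight or combine these terms differently.

```python
# Hypothetical additive reading of the performance estimate: the ideal
# schedule cycle count T plus stall penalties for the data cache (L),
# instruction cache (I), and branch misprediction (B) terms.

def estimated_cycles(ideal_cycles, dcache_penalty, icache_penalty,
                     branch_penalty):
    """Estimated execution time in cycles under the additive assumption."""
    return ideal_cycles + dcache_penalty + icache_penalty + branch_penalty

print(estimated_cycles(1000, 120, 40, 15))  # 1175
```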
OUTLINE
 Introduction
 Problems
 Some Solutions
 Their Solution – Optimization-Space Exploration
 Experiments and Results
 Conclusion
RESULTS
OUTLINE
 Introduction
 Problems
 Some Solutions
 Their Solution – Optimization-Space Exploration
 Experiments and Results
 Conclusion
CONCLUSION
 OSE does not incur the prohibitive compile-time
  costs of other iterative compilation approaches
 Compile time is limited in three ways
 OSE is capable of delivering significant
  performance benefits while keeping compile times
  reasonable
 It achieves more than 20% performance improvement
  in some cases for SPEC codes