(Do not be afraid of)

PHP Compiler Internals
          Sebastian Bergmann
                 August 23rd 2009
Sebastian Bergmann

   Co-Founder and
    Principal Consultant
    with thePHP.cc
   Creator of PHPUnit
   Involved in the PHP
    project since 2000
Under PHP's Hood


                                  Extensions

    (date, dom, gd, json, mysql, pcre, pdo, reflection, session, standard, …)



            PHP Core                                    Zend Engine

       Request Management                       Compilation and Execution
   File and Network Operations                Memory and Resource Allocation



                              Server API (SAPI)

                         (mod_php, FastCGI, CLI, ...)




                                                                 This slide contains material by Sara Golemon
How PHP executes code

   Lexical Analysis
      Scan the source for sequences of characters
      and convert them to a sequence of tokens
   Syntax Analysis
      Parse a sequence of tokens to determine
      their grammatical structure
   Bytecode Generation
      Generate bytecode based on the information
      gathered by analyzing the source
   Bytecode Execution
Lexical Analysis
Scan a sequence of characters
1   <?php
2   if (TRUE) {
3       print '*';
4   }
5   ?>
Lexical Analysis
Scan a sequence of characters
1 <?php                   T_OPEN_TAG
2 if (TRUE) {             T_IF
                          T_WHITESPACE
                          (
                          T_STRING
                          )
                          T_WHITESPACE
                          {
                          T_WHITESPACE
3      print '*';         T_PRINT
                          T_WHITESPACE
                          T_CONSTANT_ENCAPSED_STRING
                          ;
                          T_WHITESPACE
4 }                       }
                          T_WHITESPACE
5 ?>                      T_CLOSE_TAG
Lexical Analysis
Scan a sequence of characters
T_OPEN_TAG                    <?php
T_IF                          if
T_WHITESPACE
(
T_STRING                      TRUE
)
T_WHITESPACE
{
T_WHITESPACE
T_PRINT                       print
T_WHITESPACE
T_CONSTANT_ENCAPSED_STRING    '*'
;
T_WHITESPACE
}
T_WHITESPACE
T_CLOSE_TAG                   ?>
Lexical Analysis
Scanner Generators
   You do not want to write a scanner by
    hand
      At least when the code for the scanner should
      be efficient and maintainable
   Tools such as flex or re2c generate the
    code for a scanner from a set of rules


    <ST_IN_SCRIPTING>"if" {
      return T_IF;
    }
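To make concrete what such rules boil down to, here is a hand-written sketch in C, limited to the lexemes of the if example. It is not the scanner PHP generates with re2c: the TOK_* names and next_token() are made up for illustration, keywords are matched case-sensitively, string escapes are ignored, and whitespace is skipped instead of being returned as T_WHITESPACE.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical token ids for this sketch only. */
typedef enum {
    TOK_EOF, TOK_OPEN_TAG, TOK_IF, TOK_PRINT, TOK_STRING,
    TOK_CONSTANT_ENCAPSED_STRING, TOK_CLOSE_TAG, TOK_CHAR
} Token;

/* Scan one token starting at *p and advance *p past it. */
static Token next_token(const char **p)
{
    const char *s;

    assert(p != NULL && *p != NULL);
    s = *p;

    /* Skip whitespace (the real scanner returns T_WHITESPACE here). */
    while (isspace((unsigned char)*s)) s++;
    if (*s == '\0') { *p = s; return TOK_EOF; }

    if (strncmp(s, "<?php", 5) == 0) { *p = s + 5; return TOK_OPEN_TAG; }
    if (strncmp(s, "?>", 2) == 0)    { *p = s + 2; return TOK_CLOSE_TAG; }

    /* Single-quoted string; escape sequences are not handled. */
    if (*s == '\'') {
        const char *e = strchr(s + 1, '\'');
        *p = e ? e + 1 : s + strlen(s);
        return TOK_CONSTANT_ENCAPSED_STRING;
    }

    /* Keyword or bare identifier such as TRUE. */
    if (isalpha((unsigned char)*s) || *s == '_') {
        const char *e = s;
        size_t len;
        while (isalnum((unsigned char)*e) || *e == '_') e++;
        len = (size_t)(e - s);
        *p = e;
        if (len == 2 && strncmp(s, "if", 2) == 0) return TOK_IF;
        if (len == 5 && strncmp(s, "print", 5) == 0) return TOK_PRINT;
        return TOK_STRING;
    }

    /* Anything else is a single-character token: ( ) { } ; */
    *p = s + 1;
    return TOK_CHAR;
}
```

Calling next_token() repeatedly on the if example yields TOK_OPEN_TAG, TOK_IF, TOK_CHAR for the opening parenthesis, and so on until TOK_EOF. Every new keyword means another hand-edited branch, which is the maintenance cost a rule file avoids.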
Lexical Analysis
    PHP Tokens
    T_ABSTRACT           T_CONCAT_EQUAL                  T_ELSE                         T_FUNCTION

    T_AND_EQUAL          T_CONST                         T_ELSEIF                       T_FUNC_C

    T_ARRAY              T_CONSTANT_ENCAPSED_STRING      T_EMPTY                        T_GLOBAL

    T_ARRAY_CAST         T_CONTINUE                      T_ENCAPSED_AND_WHITESPACE      T_GOTO

    T_AS                 T_CURLY_OPEN                    T_ENDDECLARE                   T_HALT_COMPILER

    T_BAD_CHARACTER      T_DEC                           T_ENDFOR                       T_IF

    T_BOOLEAN_AND        T_DECLARE                       T_ENDFOREACH                   T_IMPLEMENTS

    T_BOOLEAN_OR         T_DEFAULT                       T_ENDIF                        T_INC

    T_BOOL_CAST          T_DIR                           T_ENDSWITCH                    T_INCLUDE

    T_BREAK              T_DIV_EQUAL                     T_ENDWHILE                     T_INCLUDE_ONCE

    T_CASE               T_DNUMBER                       T_END_HEREDOC                  T_INLINE_HTML

    T_CATCH              T_DOC_COMMENT                   T_EVAL                         T_INSTANCEOF

    T_CHARACTER          T_DO                            T_EXIT                         T_INT_CAST

    T_CLASS              T_DOLLAR_OPEN_CURLY_BRACES      T_EXTENDS                      T_INTERFACE

    T_CLASS_C            T_DOUBLE_ARROW                  T_FILE                         T_ISSET

    T_CLONE              T_DOUBLE_CAST                   T_FINAL                        T_IS_EQUAL

    T_CLOSE_TAG          T_DOUBLE_COLON                  T_FOR                          T_IS_GREATER_OR_EQUAL

    T_COMMENT            T_ECHO                          T_FOREACH                      T_IS_IDENTICAL
Lexical Analysis
    PHP Tokens
    T_IS_NOT_EQUAL             T_OBJECT_CAST               T_SR_EQUAL

    T_IS_NOT_IDENTICAL         T_OBJECT_OPERATOR           T_START_HEREDOC

    T_IS_SMALLER_OR_EQUAL      T_OLD_FUNCTION              T_STATIC

    T_LINE                     T_OPEN_TAG                  T_STRING

    T_LIST                     T_OPEN_TAG_WITH_ECHO        T_STRING_CAST

    T_LNUMBER                  T_OR_EQUAL                  T_STRING_VARNAME

    T_LOGICAL_AND              T_PAAMAYIM_NEKUDOTAYIM      T_SWITCH

    T_LOGICAL_OR               T_PLUS_EQUAL                T_THROW

    T_LOGICAL_XOR              T_PRINT                     T_TRY

    T_METHOD_C                 T_PRIVATE                   T_UNSET

    T_MINUS_EQUAL              T_PUBLIC                    T_UNSET_CAST

    T_ML_COMMENT               T_PROTECTED                 T_USE

    T_MOD_EQUAL                T_REQUIRE                   T_VAR

    T_MUL_EQUAL                T_REQUIRE_ONCE              T_VARIABLE

    T_NAMESPACE                T_RETURN                    T_WHILE

    T_NS_C                     T_SL                        T_WHITESPACE

    T_NEW                      T_SL_EQUAL                  T_XOR_EQUAL

    T_NUM_STRING               T_SR
Syntax Analysis
Parse a sequence of tokens
   You do not want to write a parser by hand
      At least when the code for the parser should
      be efficient and maintainable
   Tools such as bison or lemon generate
    the code for a parser from a set of rules

     T_IF '(' expr ')' { ... }
     statement { ... }
     elseif_list else_single { ... }
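As an illustration of what "determining grammatical structure" means, the sketch below checks a token sequence against a drastically reduced grammar. It is a hand-written recursive-descent recognizer, not the table-driven parser that bison generates for PHP. The token numbers, the Parser struct and all function names are assumptions of this sketch, and print is treated as a statement to keep the grammar tiny.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token ids; single-character tokens use their character
   code, as bison grammars do. */
enum { T_END = 0, T_IF = 256, T_PRINT, T_STRING, T_CONSTANT_ENCAPSED_STRING };

typedef struct { const int *tok; size_t pos; } Parser;

static int peek(const Parser *ps) { return ps->tok[ps->pos]; }

/* Consume the next token if it equals t; return 1 on success, else 0. */
static int match(Parser *ps, int t)
{
    if (peek(ps) != t) return 0;
    ps->pos++;
    return 1;
}

/* expr: T_STRING | T_CONSTANT_ENCAPSED_STRING */
static int parse_expr(Parser *ps)
{
    return match(ps, T_STRING) || match(ps, T_CONSTANT_ENCAPSED_STRING);
}

static int parse_statement(Parser *ps);

/* block: '{' statement* '}' */
static int parse_block(Parser *ps)
{
    if (!match(ps, '{')) return 0;
    while (peek(ps) != '}' && peek(ps) != T_END)
        if (!parse_statement(ps)) return 0;
    return match(ps, '}');
}

/* statement: block | T_IF '(' expr ')' statement | T_PRINT expr ';' */
static int parse_statement(Parser *ps)
{
    if (peek(ps) == '{') return parse_block(ps);
    if (match(ps, T_IF))
        return match(ps, '(') && parse_expr(ps) && match(ps, ')')
            && parse_statement(ps);
    if (match(ps, T_PRINT))
        return parse_expr(ps) && match(ps, ';');
    return 0;
}

/* Return 1 if the T_END-terminated array is exactly one statement. */
static int parse(const int *tokens)
{
    Parser ps = { tokens, 0 };
    assert(tokens != NULL);
    return parse_statement(&ps) && peek(&ps) == T_END;
}
```

The token sequence of the if example is accepted, while the same sequence without the closing parenthesis is rejected. In the real grammar, the { ... } actions between the symbols are where the compiler functions are called.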
PHP Bytecode
Using bytekit-cli to disassemble bytecode
1   <?php
2   if (TRUE) {
3       print '*';
4   }
5   ?>
 sb@thinkpad ~ % bytekit if.php
 bytekit-cli 1.0.0 by Sebastian Bergmann.

 Filename:            /home/sb/if.php
 Function:            main
 Number of oplines:   8

    line  #   opcode       result  operands
    ---------------------------------------------
    2     0   EXT_STMT
          1   JMPZ                 true, ->6

    3     2   EXT_STMT
          3   PRINT        ~0      '*'
          4   FREE                 ~0
    4     5   JMP                  ->6

    6     6   EXT_STMT
          7   RETURN               1
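The listing can be read as a program for a small virtual machine. The following sketch in C executes the eight oplines above. It is a toy model and not the Zend executor: the Opcode enum, the Opline struct and run() are assumptions made for this example, EXT_STMT and FREE are treated as no-ops, and printed text is collected in a buffer.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical opcodes and opline layout for this sketch only. */
typedef enum { OP_EXT_STMT, OP_JMPZ, OP_JMP, OP_PRINT, OP_FREE, OP_RETURN } Opcode;

typedef struct {
    Opcode opcode;
    int cond;          /* JMPZ: constant condition (1 = true, 0 = false) */
    int target;        /* JMPZ/JMP: number of the opline to jump to */
    const char *str;   /* PRINT: string operand */
} Opline;

/* The oplines of if.php as shown in the disassembly. */
static const Opline if_php[] = {
    { OP_EXT_STMT, 0, 0, NULL },
    { OP_JMPZ,     1, 6, NULL },   /* JMPZ true, ->6 */
    { OP_EXT_STMT, 0, 0, NULL },
    { OP_PRINT,    0, 0, "*"  },
    { OP_FREE,     0, 0, NULL },
    { OP_JMP,      0, 6, NULL },
    { OP_EXT_STMT, 0, 0, NULL },
    { OP_RETURN,   0, 0, NULL }
};

/* Run until RETURN, appending printed text to out.
   Returns the number of oplines executed. */
static int run(const Opline *ops, int n, char *out, size_t out_size)
{
    int pc = 0, executed = 0;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    for (;;) {
        const Opline *op;
        assert(pc >= 0 && pc < n);
        op = &ops[pc];
        executed++;
        switch (op->opcode) {
        case OP_JMPZ:      /* jump only if the operand is zero */
            pc = op->cond ? pc + 1 : op->target;
            break;
        case OP_JMP:
            pc = op->target;
            break;
        case OP_PRINT:
            strncat(out, op->str, out_size - strlen(out) - 1);
            pc++;
            break;
        case OP_RETURN:
            return executed;
        default:           /* EXT_STMT and FREE: nothing to do here */
            pc++;
            break;
        }
    }
}
```

Because the JMPZ operand is true, the jump to opline 6 is not taken, so all eight oplines run and the buffer ends up holding the single character '*'.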
PHP Bytecode
Using bytekit-cli to visualize bytecode
1   <?php
2   if (TRUE) {
3       print '*';
4   }
5   ?>
 sb@thinkpad ~ % bytekit --graph /tmp --format svg if.php
How if is compiled
Zend/zend_compile.c
void zend_do_if_cond(const znode *cond, znode *closing_bracket_token TSRMLS_DC)
{
    /* body developed step by step on the following slides */
}

Both parameters are of type znode:

typedef struct _znode {
    int op_type;
    union {
        zval constant;

        zend_uint var;
        zend_uint opline_num;
        zend_op_array *op_array;
        zend_op *jmp_addr;
        struct {
            zend_uint var;
            zend_uint type;
        } EA;
    } u;
} znode;



zend_do_if_cond() is called when an if statement is compiled
How if is compiled
Zend/zend_compile.c
void zend_do_if_cond(const znode *cond, znode *closing_bracket_token TSRMLS_DC)
{
    int if_cond_op_number = get_next_op_number(CG(active_op_array));
    zend_op *opline = get_next_op(CG(active_op_array) TSRMLS_CC);
}

The new opline is a zend_op:

struct _zend_op {
    opcode_handler_t handler;
    znode result;
    znode op1;
    znode op2;
    ulong extended_value;
    uint lineno;
    zend_uchar opcode;
};


Allocate a new opline in the current oparray
How if is compiled
Zend/zend_compile.c
void zend_do_if_cond(const znode *cond, znode *closing_bracket_token TSRMLS_DC)
{
    int if_cond_op_number = get_next_op_number(CG(active_op_array));
    zend_op *opline = get_next_op(CG(active_op_array) TSRMLS_CC);

    opline->opcode = ZEND_JMPZ;
}


Set the opcode of the new opline to JMPZ (jump if zero)
How if is compiled
Zend/zend_compile.c
void zend_do_if_cond(const znode *cond, znode *closing_bracket_token TSRMLS_DC)
{
    int if_cond_op_number = get_next_op_number(CG(active_op_array));
    zend_op *opline = get_next_op(CG(active_op_array) TSRMLS_CC);

    opline->opcode = ZEND_JMPZ;
    opline->op1 = *cond;
}


Set the first operand of the new opline to the if condition
How if is compiled
Zend/zend_compile.c
void zend_do_if_cond(const znode *cond, znode *closing_bracket_token TSRMLS_DC)
{
    int if_cond_op_number = get_next_op_number(CG(active_op_array));
    zend_op *opline = get_next_op(CG(active_op_array) TSRMLS_CC);

    opline->opcode = ZEND_JMPZ;
    opline->op1 = *cond;
    closing_bracket_token->u.opline_num = if_cond_op_number;
    SET_UNUSED(opline->op2);
    INC_BPC(CG(active_op_array));
}


Perform bookkeeping tasks such as marking the second operand of the
new opline as unused or incrementing the backpatching counter for the
current oparray
PHP Bytecode
    PHP Opcodes
    NOP                   IS_NOT_EQUAL             POST_INC         ADD_VAR                 UNSET_DIM

    ADD                   IS_SMALLER               POST_DEC         BEGIN_SILENCE           UNSET_OBJ

    SUB                   IS_SMALLER_OR_EQUAL      ASSIGN           END_SILENCE             FE_RESET

    MUL                   CAST                     ASSIGN_REF       INIT_FCALL_BY_NAME      FE_FETCH

    DIV                   QM_ASSIGN                ECHO             DO_FCALL                EXIT

    MOD                   ASSIGN_ADD               PRINT            DO_FCALL_BY_NAME        FETCH_R

    SL                    ASSIGN_SUB               JMPZ             RETURN                  FETCH_DIM_R

    SR                    ASSIGN_MUL               JMPNZ            RECV                    FETCH_OBJ_R

    CONCAT                ASSIGN_DIV               JMPZNZ           RECV_INIT               FETCH_W

    BW_OR                 ASSIGN_MOD               JMPZ_EX          SEND_VAL                FETCH_DIM_W

    BW_AND                ASSIGN_SL                JMPNZ_EX         SEND_VAR                FETCH_OBJ_W

    BW_XOR                ASSIGN_SR                CASE             SEND_REF                FETCH_RW

    BW_NOT                ASSIGN_CONCAT            SWITCH_FREE      NEW                     FETCH_DIM_RW

    BOOL_NOT              ASSIGN_BW_OR             BRK              FREE                    FETCH_OBJ_RW

    BOOL_XOR              ASSIGN_BW_AND            BOOL             INIT_ARRAY              FETCH_IS

    IS_IDENTICAL          ASSIGN_BW_XOR            INIT_STRING      ADD_ARRAY_ELEMENT       FETCH_DIM_IS

    IS_NOT_IDENTICAL      PRE_INC                  ADD_CHAR         INCLUDE_OR_EVAL         FETCH_OBJ_IS

    IS_EQUAL              PRE_DEC                  ADD_STRING       UNSET_VAR               FETCH_FUNC_ARG
PHP Bytecode
    PHP Opcodes
    FETCH_DIM_FUNC_ARG      INIT_STATIC_METHOD_CALL

    FETCH_OBJ_FUNC_ARG      ISSET_ISEMPTY_VAR

    FETCH_UNSET             ISSET_ISEMPTY_DIM_OBJ

    FETCH_DIM_UNSET         PRE_INC_OBJ

    FETCH_OBJ_UNSET         PRE_DEC_OBJ

    FETCH_DIM_TMP_VAR       POST_INC_OBJ

    FETCH_CONSTANT          POST_DEC_OBJ

    EXT_STMT                ASSIGN_OBJ

    EXT_FCALL_BEGIN         INSTANCEOF

    EXT_FCALL_END           DECLARE_CLASS

    EXT_NOP                 DECLARE_INHERITED_CLASS

    TICKS                   DECLARE_FUNCTION

    SEND_VAR_NO_REF         RAISE_ABSTRACT_ERROR

    CATCH                   ADD_INTERFACE

    THROW                   VERIFY_ABSTRACT_CLASS

    FETCH_CLASS             ASSIGN_DIM

    CLONE                   ISSET_ISEMPTY_PROP_OBJ

    INIT_METHOD_CALL        HANDLE_EXCEPTION
Extending the PHP Compiler
Test First!
--TEST--
unless statement
--FILE--
<?php
unless (FALSE) {
    print 'unless FALSE is TRUE, this is printed';
}

unless (TRUE) {
    print 'unless TRUE is TRUE, this is printed';
}
?>
--EXPECT--
unless FALSE is TRUE, this is printed
Extending the PHP Compiler

   Add token for unless to the scanner
   Add rule for unless to the parser
   Implement bytecode generation for
    unless in the compiler
   Add token for unless to ext/tokenizer
Add unless scanner token
Zend/zend_language_parser.y
%token   T_NAMESPACE
%token   T_NS_C
%token   T_DIR
%token   T_NS_SEPARATOR
%token   T_UNLESS
Add unless scanner token
Zend/zend_language_scanner.l
<ST_IN_SCRIPTING>"if" {
   return T_IF;
}

<ST_IN_SCRIPTING>"unless" {
   return T_UNLESS;
}

<ST_IN_SCRIPTING>"elseif" {
   return T_ELSEIF;
}

<ST_IN_SCRIPTING>"endif" {
   return T_ENDIF;
}

<ST_IN_SCRIPTING>"else" {
   return T_ELSE;
}
Add unless parser rule
Zend/zend_language_parser.y
unticked_statement:
   '{' inner_statement_list '}'
 | T_IF '(' expr ')' {
 .
 .
 | T_UNLESS '(' expr ')' {
    zend_do_unless_cond(&$3, &$4 TSRMLS_CC);
 } statement {
    zend_do_if_after_statement(&$4, 1 TSRMLS_CC);
 } {
    zend_do_if_end(TSRMLS_C);
 }
Add unless to the compiler
Zend/zend_compile.c
void zend_do_unless_cond(const znode *cond, znode *closing_bracket_token TSRMLS_DC)
{
    int unless_cond_op_number = get_next_op_number(CG(active_op_array));
    zend_op *opline = get_next_op(CG(active_op_array) TSRMLS_CC);

    opline->opcode = ZEND_JMPNZ;
    opline->op1 = *cond;
    closing_bracket_token->u.opline_num = unless_cond_op_number;
    SET_UNUSED(opline->op2);
    INC_BPC(CG(active_op_array));
}


All we have to do to generate code for the unless statement,
compared with generating code for the if statement, is to emit
JMPNZ (jump if not zero) instead of JMPZ (jump if zero)
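Why swapping the opcode is enough can be seen from a truth table. The sketch below, in C, models only the jump decision; JumpOp, jump_taken() and body_runs() are names invented for this illustration and are not part of the Zend Engine.

```c
#include <assert.h>

/* Hypothetical model of the two conditional jumps. */
typedef enum { JMPZ, JMPNZ } JumpOp;

/* Return 1 if the jump is taken, which skips the statement body. */
static int jump_taken(JumpOp op, int cond)
{
    assert(cond == 0 || cond == 1);
    return op == JMPZ ? !cond : cond;
}

/* The body runs exactly when the jump over it is not taken. */
static int body_runs(JumpOp op, int cond)
{
    return !jump_taken(op, cond);
}
```

With JMPZ the body runs for a true condition (if), with JMPNZ it runs for a false one (unless), which matches the expected output of the unless test.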
Add unless to the compiler
The generated bytecode
1   <?php
2   unless (FALSE) {
3       print '*';
4   }
5   ?>
sb@thinkpad ~ % bytekit unless.php
bytekit-cli 1.0.0 by Sebastian Bergmann.

Filename:            /home/sb/unless.php
Function:            main
Number of oplines:   8

    line  #   opcode       result  operands
    ---------------------------------------------
    2     0   EXT_STMT
          1   JMPNZ                false, ->6

    3     2   EXT_STMT
          3   PRINT        ~0      '*'
          4   FREE                 ~0
    4     5   JMP                  ->6

    6     6   EXT_STMT
          7   RETURN               1
Running the test
sb@thinkpad php-5.3-unless % make test TESTS=Zend/tests/unless.phpt

Build complete.
Don't forget to run 'make test'.


=====================================================================
PHP         : /usr/local/src/php/php-5.3-unless/sapi/cli/php
PHP_SAPI    : cli
PHP_VERSION : 5.3.1-dev
ZEND_VERSION: 2.3.0
PHP_OS      : Linux 2.6.28-14-generic #47-Ubuntu SMP Sat Jul 25 01:19:55 UTC 2009 i686 GNU/Linux
INI actual : /usr/local/src/php/php-5.3-unless/tmp-php.ini
More .INIs :
CWD         : /usr/local/src/php/php-5.3-unless
Extra dirs :
VALGRIND    : Not used
=====================================================================
Running selected tests.
PASS unless statement [Zend/tests/unless.phpt]
=====================================================================
Number of tests :    1                 1
Tests skipped   :    0 ( 0.0%) --------
Tests warned    :    0 ( 0.0%) ( 0.0%)
Tests failed    :    0 ( 0.0%) ( 0.0%)
Expected fail   :    0 ( 0.0%) ( 0.0%)
Tests passed    :    1 (100.0%) (100.0%)
---------------------------------------------------------------------
Time taken      :    0 seconds
=====================================================================
Add unless to ext/tokenizer

sb@thinkpad tokenizer % ./tokenizer_data_gen.sh
Wrote tokenizer_data.c
The End

Thank you for your interest!


These slides will be posted on
http://slideshare.net/sebastian_bergmann
Acknowledgements

   Thomas Lee, whose Python Language
    Internals presentation at OSDC 2008
    inspired this presentation
   Stefan Esser for creating the Bytekit
    extension that provides PHP bytecode
    access and analysis features
   Derick Rethans, David Soria Parra, and
    Scott MacVicar for reviewing these slides
References
   http://www.php.net/manual/en/tokens.php
   http://www.zapt.info/opcodes.html
   "Extending and Embedding PHP",
    Sara Golemon
   http://bytekit.org/
   http://github.com/sebastianbergmann/bytekit-cli/
License
    This presentation material is published under the Attribution-Share Alike 3.0 Unported
    license.
    You are free:
      ✔   to Share – to copy, distribute and transmit the work.
      ✔   to Remix – to adapt the work.
    Under the following conditions:
      ●   Attribution. You must attribute the work in the manner specified by the author or
          licensor (but not in any way that suggests that they endorse you or your use of the
          work).
      ●   Share Alike. If you alter, transform, or build upon this work, you may distribute the
          resulting work only under the same, similar or a compatible license.
    For any reuse or distribution, you must make clear to others the license terms of this
    work.
    Any of the above conditions can be waived if you get permission from the copyright
    holder.
    Nothing in this license impairs or restricts the author's moral rights.

Más contenido relacionado

Destacado

Destacado (12)

How PHP Works ?
How PHP Works ?How PHP Works ?
How PHP Works ?
 
Php under the_hood
Php under the_hoodPhp under the_hood
Php under the_hood
 
Building Custom PHP Extensions
Building Custom PHP ExtensionsBuilding Custom PHP Extensions
Building Custom PHP Extensions
 
Accelerating or Complicating PHP execution by LLVM Compiler Infrastructure
Accelerating or Complicating PHP execution by LLVM Compiler Infrastructure Accelerating or Complicating PHP execution by LLVM Compiler Infrastructure
Accelerating or Complicating PHP execution by LLVM Compiler Infrastructure
 
PHP Internals
PHP InternalsPHP Internals
PHP Internals
 
Build Programming Language Runtime with LLVM
Build Programming Language Runtime with LLVMBuild Programming Language Runtime with LLVM
Build Programming Language Runtime with LLVM
 
Understanding PHP memory
Understanding PHP memoryUnderstanding PHP memory
Understanding PHP memory
 
The Php Life Cycle
The Php Life CycleThe Php Life Cycle
The Php Life Cycle
 
Php Extensions for Dummies
Php Extensions for DummiesPhp Extensions for Dummies
Php Extensions for Dummies
 
PHP 7 new engine
PHP 7 new enginePHP 7 new engine
PHP 7 new engine
 
About Tokens and Lexemes
About Tokens and LexemesAbout Tokens and Lexemes
About Tokens and Lexemes
 
Recognition-of-tokens
Recognition-of-tokensRecognition-of-tokens
Recognition-of-tokens
 

Similar a Phpcompilerinternals 090824022750-phpapp02

Basic of Python- Hands on Session
Basic of Python- Hands on SessionBasic of Python- Hands on Session
Basic of Python- Hands on SessionDharmesh Tank
 
Advanced perl finer points ,pack&amp;unpack,eval,files
Advanced perl   finer points ,pack&amp;unpack,eval,filesAdvanced perl   finer points ,pack&amp;unpack,eval,files
Advanced perl finer points ,pack&amp;unpack,eval,filesShankar D
 
Applying Generics
Applying GenericsApplying Generics
Applying GenericsBharat17485
 
Chapter2pp
Chapter2ppChapter2pp
Chapter2ppJ. C.
 

Similar a Phpcompilerinternals 090824022750-phpapp02 (7)

Basic of Python- Hands on Session
Basic of Python- Hands on SessionBasic of Python- Hands on Session
Basic of Python- Hands on Session
 
Advanced perl finer points ,pack&amp;unpack,eval,files
Advanced perl   finer points ,pack&amp;unpack,eval,filesAdvanced perl   finer points ,pack&amp;unpack,eval,files
Advanced perl finer points ,pack&amp;unpack,eval,files
 
Applying Generics
Applying GenericsApplying Generics
Applying Generics
 
Chapter2pp
Chapter2ppChapter2pp
Chapter2pp
 
Chapter 6 Intermediate Code Generation
Chapter 6   Intermediate Code GenerationChapter 6   Intermediate Code Generation
Chapter 6 Intermediate Code Generation
 
Diving deep into twig
Diving deep into twigDiving deep into twig
Diving deep into twig
 
Generics_RIO.ppt
Generics_RIO.pptGenerics_RIO.ppt
Generics_RIO.ppt
 

Último

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 

Último (20)

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
How to Troubleshoot Apps for the Modern Connected Worker

Phpcompilerinternals 090824022750-phpapp02

  • 1. (Do not be afraid of) PHP Compiler Internals Sebastian Bergmann August 23rd 2009
  • 2. Sebastian Bergmann
      ▪ Co-Founder and Principal Consultant with thePHP.cc
      ▪ Creator of PHPUnit
      ▪ Involved in the PHP project since 2000
  • 3. Under PHP's Hood
      ▪ Extensions (date, dom, gd, json, mysql, pcre, pdo, reflection, session, standard, …)
      ▪ PHP Core: Request Management; File and Network Operations
      ▪ Zend Engine: Compilation and Execution; Memory and Resource Allocation
      ▪ Server API (SAPI) (mod_php, FastCGI, CLI, ...)
      This slide contains material by Sara Golemon
  • 4. How PHP executes code
      ▪ Lexical Analysis: Scan the source for sequences of characters and convert them to a sequence of tokens
  • 5. How PHP executes code
      ▪ Lexical Analysis
      ▪ Syntax Analysis: Parse a sequence of tokens to determine their grammatical structure
  • 6. How PHP executes code
      ▪ Lexical Analysis
      ▪ Syntax Analysis
      ▪ Bytecode Generation: Generate bytecode based on the information gathered by analyzing the source
  • 7. How PHP executes code
      ▪ Lexical Analysis
      ▪ Syntax Analysis
      ▪ Bytecode Generation
      ▪ Bytecode Execution
  • 8. Lexical Analysis: Scan a sequence of characters
      1 <?php
      2 if (TRUE) {
      3     print '*';
      4 }
      5 ?>
  • 13. Lexical Analysis: Scan a sequence of characters
      1 <?php            T_OPEN_TAG
      2 if (TRUE) {      T_IF T_WHITESPACE ( T_STRING ) T_WHITESPACE { T_WHITESPACE
      3     print '*';   T_PRINT T_WHITESPACE T_CONSTANT_ENCAPSED_STRING ; T_WHITESPACE
      4 }                } T_WHITESPACE
      5 ?>               T_CLOSE_TAG
  • 14. Lexical Analysis: Scan a sequence of characters
      T_OPEN_TAG                   <?php
      T_IF                         if
      T_WHITESPACE
      (
      T_STRING                     TRUE
      )
      T_WHITESPACE
      {
      T_WHITESPACE
      T_PRINT                      print
      T_WHITESPACE
      T_CONSTANT_ENCAPSED_STRING   '*'
      ;
      T_WHITESPACE
      }
      T_WHITESPACE
      T_CLOSE_TAG                  ?>
  • 16. Lexical Analysis: Scanner Generators
      ▪ You do not want to write a scanner by hand, at least not when the code for the scanner should be efficient and maintainable
      ▪ Tools such as flex or re2c generate the code for a scanner from a set of rules
      <ST_IN_SCRIPTING>"if" {
          return T_IF;
      }
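To make concrete what a rule such as the one above saves you from writing, here is a minimal hand-written sketch of a scanner for the `if (TRUE) { print '*'; }` example. It is not the code re2c generates and not PHP's scanner: the `toy_` names and token values are hypothetical, keywords are matched case-sensitively, and the open and close tags are left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds; the names mirror PHP's, the values are arbitrary. */
typedef enum {
    TOK_IF, TOK_PRINT, TOK_STRING, TOK_CONSTANT_ENCAPSED_STRING,
    TOK_WHITESPACE, TOK_CHAR, TOK_END
} toy_token;

/*
 * Scan one token starting at *src and advance *src past it.
 * Single characters such as '(' or ';' are reported as TOK_CHAR.
 */
toy_token toy_scan(const char **src)
{
    const char *p = *src;

    if (*p == '\0') {
        return TOK_END;
    }
    if (isspace((unsigned char)*p)) {
        while (isspace((unsigned char)*p)) p++;
        *src = p;
        return TOK_WHITESPACE;
    }
    if (*p == '\'') {
        /* Single-quoted string: consume up to and including the closing quote. */
        p++;
        while (*p != '\0' && *p != '\'') p++;
        if (*p == '\'') p++;
        *src = p;
        return TOK_CONSTANT_ENCAPSED_STRING;
    }
    if (isalpha((unsigned char)*p) || *p == '_') {
        const char *start = p;
        size_t len;

        while (isalnum((unsigned char)*p) || *p == '_') p++;
        len = (size_t)(p - start);
        *src = p;
        /* Keywords get their own token; every other word is TOK_STRING. */
        if (len == 2 && strncmp(start, "if", 2) == 0) return TOK_IF;
        if (len == 5 && strncmp(start, "print", 5) == 0) return TOK_PRINT;
        return TOK_STRING;
    }
    *src = p + 1;
    return TOK_CHAR;
}

/* Scan a whole string into out[]; returns the number of tokens, excluding TOK_END. */
size_t toy_scan_all(const char *src, toy_token *out, size_t max)
{
    size_t n = 0;
    toy_token t;

    while ((t = toy_scan(&src)) != TOK_END) {
        assert(n < max);
        out[n++] = t;
    }
    return n;
}
```

Calling `toy_scan_all()` on the body of the example script yields the same token sequence as the slides, from T_IF through the final T_WHITESPACE. Every additional keyword or literal form means another hand-maintained branch, which is the maintenance burden a rule file avoids.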
  • 17. Lexical Analysis: PHP Tokens
      T_ABSTRACT, T_AND_EQUAL, T_ARRAY, T_ARRAY_CAST, T_AS, T_BAD_CHARACTER, T_BOOLEAN_AND, T_BOOLEAN_OR, T_BOOL_CAST, T_BREAK, T_CASE, T_CATCH, T_CHARACTER, T_CLASS, T_CLASS_C, T_CLONE, T_CLOSE_TAG, T_COMMENT,
      T_CONCAT_EQUAL, T_CONST, T_CONSTANT_ENCAPSED_STRING, T_CONTINUE, T_CURLY_OPEN, T_DEC, T_DECLARE, T_DEFAULT, T_DIR, T_DIV_EQUAL, T_DNUMBER, T_DOC_COMMENT, T_DO, T_DOLLAR_OPEN_CURLY_BRACES, T_DOUBLE_ARROW, T_DOUBLE_CAST, T_DOUBLE_COLON, T_ECHO,
      T_ELSE, T_ELSEIF, T_EMPTY, T_ENCAPSED_AND_WHITESPACE, T_ENDDECLARE, T_ENDFOR, T_ENDFOREACH, T_ENDIF, T_ENDSWITCH, T_ENDWHILE, T_END_HEREDOC, T_EVAL, T_EXIT, T_EXTENDS, T_FILE, T_FINAL, T_FOR, T_FOREACH,
      T_FUNCTION, T_FUNC_C, T_GLOBAL, T_GOTO, T_HALT_COMPILER, T_IF, T_IMPLEMENTS, T_INC, T_INCLUDE, T_INCLUDE_ONCE, T_INLINE_HTML, T_INSTANCEOF, T_INT_CAST, T_INTERFACE, T_ISSET, T_IS_EQUAL, T_IS_GREATER_OR_EQUAL, T_IS_IDENTICAL
  • 18. Lexical Analysis: PHP Tokens
      T_IS_NOT_EQUAL, T_IS_NOT_IDENTICAL, T_IS_SMALLER_OR_EQUAL, T_LINE, T_LIST, T_LNUMBER, T_LOGICAL_AND, T_LOGICAL_OR, T_LOGICAL_XOR, T_METHOD_C, T_MINUS_EQUAL, T_ML_COMMENT, T_MOD_EQUAL, T_MUL_EQUAL, T_NAMESPACE, T_NS_C, T_NEW, T_NUM_STRING,
      T_OBJECT_CAST, T_OBJECT_OPERATOR, T_OLD_FUNCTION, T_OPEN_TAG, T_OPEN_TAG_WITH_ECHO, T_OR_EQUAL, T_PAAMAYIM_NEKUDOTAYIM, T_PLUS_EQUAL, T_PRINT, T_PRIVATE, T_PUBLIC, T_PROTECTED, T_REQUIRE, T_REQUIRE_ONCE, T_RETURN, T_SL, T_SL_EQUAL, T_SR,
      T_SR_EQUAL, T_START_HEREDOC, T_STATIC, T_STRING, T_STRING_CAST, T_STRING_VARNAME, T_SWITCH, T_THROW, T_TRY, T_UNSET, T_UNSET_CAST, T_USE, T_VAR, T_VARIABLE, T_WHILE, T_WHITESPACE, T_XOR_EQUAL
  • 20. Syntax Analysis: Parse a sequence of tokens
      ▪ You do not want to write a parser by hand, at least not when the code for the parser should be efficient and maintainable
      ▪ Tools such as bison or lemon generate the code for a parser from a set of rules
      T_IF '(' expr ')' { ... } statement { ... } elseif_list else_single { ... }
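As an illustration of what "determining grammatical structure" means for the rule above, here is a hand-written recursive-descent recognizer for a drastically reduced grammar. It is a different technique from the table-driven parsers that bison generates. The `p_` names, the token set, and the three-rule grammar are assumptions made for this sketch, and the token stream is assumed to contain no whitespace tokens.

```c
#include <stddef.h>

/* Hypothetical token kinds for the reduced grammar. */
typedef enum {
    P_IF, P_LPAREN, P_RPAREN, P_LBRACE, P_RBRACE,
    P_PRINT, P_LITERAL, P_SEMICOLON, P_END
} p_token;

typedef struct {
    const p_token *toks;   /* token stream, terminated by P_END */
    size_t pos;            /* index of the next unread token */
} p_parser;

/* Consume the next token if it is the wanted one. */
static int p_accept(p_parser *p, p_token want)
{
    if (p->toks[p->pos] == want) {
        p->pos++;
        return 1;
    }
    return 0;
}

/* expr: LITERAL  (a deliberately tiny stand-in for PHP's expr rule) */
static int p_expr(p_parser *p)
{
    return p_accept(p, P_LITERAL);
}

/*
 * statement: '{' statement* '}'
 *          | T_IF '(' expr ')' statement
 *          | T_PRINT expr ';'
 */
static int p_statement(p_parser *p)
{
    if (p_accept(p, P_LBRACE)) {
        while (p->toks[p->pos] != P_RBRACE && p->toks[p->pos] != P_END) {
            if (!p_statement(p)) return 0;
        }
        return p_accept(p, P_RBRACE);
    }
    if (p_accept(p, P_IF)) {
        return p_accept(p, P_LPAREN) && p_expr(p)
            && p_accept(p, P_RPAREN) && p_statement(p);
    }
    if (p_accept(p, P_PRINT)) {
        return p_expr(p) && p_accept(p, P_SEMICOLON);
    }
    return 0;
}

/* Returns 1 if toks (terminated by P_END) is exactly one valid statement. */
int p_parse(const p_token *toks)
{
    p_parser p = { toks, 0 };
    return p_statement(&p) && toks[p.pos] == P_END;
}
```

`p_parse()` accepts the token sequence for `if (TRUE) { print '*'; }` and rejects one where the opening parenthesis is missing. The `{ ... }` action blocks in the real rule are where the generated parser calls into the compiler; this sketch only recognizes input and emits nothing.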
  • 21. PHP Bytecode: Using bytekit-cli to disassemble bytecode
      1 <?php
      2 if (TRUE) {
      3     print '*';
      4 }
      5 ?>

      sb@thinkpad ~ % bytekit if.php
      bytekit-cli 1.0.0 by Sebastian Bergmann.

      Filename:          /home/sb/if.php
      Function:          main
      Number of oplines: 8

        line  #  opcode     result  operands
        -----------------------------------------------------------------------------
        2     0  EXT_STMT
              1  JMPZ               true, ->6
        3     2  EXT_STMT
              3  PRINT      ~0      '*'
              4  FREE               ~0
        4     5  JMP                ->6
        6     6  EXT_STMT
              7  RETURN             1
  • 22. PHP Bytecode: Using bytekit-cli to visualize bytecode
      1 <?php
      2 if (TRUE) {
      3     print '*';
      4 }
      5 ?>

      sb@thinkpad ~ % bytekit --graph /tmp --format svg if.php
  • 23. How if is compiled (Zend/zend_compile.c)
      void zend_do_if_cond(const znode *cond, znode *closing_bracket_token TSRMLS_DC)
      {
      }

      typedef struct _znode {
          int op_type;
          union {
              zval constant;
              zend_uint var;
              zend_uint opline_num;
              zend_op_array *op_array;
              zend_op *jmp_addr;
              struct {
                  zend_uint var;
                  zend_uint type;
              } EA;
          } u;
      } znode;

      zend_do_if_cond() is called when an if statement is compiled
  • 24. How if is compiled (Zend/zend_compile.c)
      void zend_do_if_cond(const znode *cond, znode *closing_bracket_token TSRMLS_DC)
      {
          int if_cond_op_number = get_next_op_number(CG(active_op_array));
          zend_op *opline = get_next_op(CG(active_op_array) TSRMLS_CC);
      }

      struct _zend_op {
          opcode_handler_t handler;
          znode result;
          znode op1;
          znode op2;
          ulong extended_value;
          uint lineno;
          zend_uchar opcode;
      };

      Allocate a new opline in the current oparray
  • 25. How if is compiled (Zend/zend_compile.c)
      void zend_do_if_cond(const znode *cond, znode *closing_bracket_token TSRMLS_DC)
      {
          int if_cond_op_number = get_next_op_number(CG(active_op_array));
          zend_op *opline = get_next_op(CG(active_op_array) TSRMLS_CC);

          opline->opcode = ZEND_JMPZ;
      }

      Set the opcode of the new opline to JMPZ (jump if zero)
  • 26. How if is compiled (Zend/zend_compile.c)
      void zend_do_if_cond(const znode *cond, znode *closing_bracket_token TSRMLS_DC)
      {
          int if_cond_op_number = get_next_op_number(CG(active_op_array));
          zend_op *opline = get_next_op(CG(active_op_array) TSRMLS_CC);

          opline->opcode = ZEND_JMPZ;
          opline->op1 = *cond;
      }

      Set the first operand of the new opline to the if condition
  • 27. How if is compiled (Zend/zend_compile.c)
      void zend_do_if_cond(const znode *cond, znode *closing_bracket_token TSRMLS_DC)
      {
          int if_cond_op_number = get_next_op_number(CG(active_op_array));
          zend_op *opline = get_next_op(CG(active_op_array) TSRMLS_CC);

          opline->opcode = ZEND_JMPZ;
          opline->op1 = *cond;
          closing_bracket_token->u.opline_num = if_cond_op_number;
          SET_UNUSED(opline->op2);
          INC_BPC(CG(active_op_array));
      }

      Perform bookkeeping tasks such as marking the second operand of the new opline as unused and incrementing the backpatching counter for the current oparray
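The reason `zend_do_if_cond()` records `if_cond_op_number` is that the jump target is not yet known when JMPZ is emitted; it has to be filled in later (backpatched). The following self-contained model sketches that idea only. The `toy_` types and functions are hypothetical, and folding the "after statement" and "end" steps into a single `toy_if_end()` is a simplification made here, not the Zend Engine's actual division of work.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_JMPZ, TOY_JMPNZ, TOY_JMP, TOY_PRINT, TOY_RETURN };

typedef struct {
    int opcode;
    int operand;   /* condition value, or value to print */
    int target;    /* jump target; -1 while still unknown */
} toy_op;

typedef struct {
    toy_op ops[16];
    int    count;
} toy_op_array;

/* Append a fresh opline and return its number. */
int toy_emit(toy_op_array *a, int opcode, int operand)
{
    assert(a->count < 16);
    a->ops[a->count].opcode  = opcode;
    a->ops[a->count].operand = operand;
    a->ops[a->count].target  = -1;
    return a->count++;
}

/* Start of "if (cond)": emit JMPZ with an unknown target and report where it is. */
int toy_if_cond(toy_op_array *a, int cond)
{
    return toy_emit(a, TOY_JMPZ, cond);
}

/*
 * End of the if body: emit a JMP over the (here empty) else part, then
 * backpatch both jumps to point at the next opline to be emitted.
 */
void toy_if_end(toy_op_array *a, int if_cond_op_number)
{
    int jmp = toy_emit(a, TOY_JMP, 0);

    a->ops[if_cond_op_number].target = a->count;
    a->ops[jmp].target = a->count;
}
```

Emitting `toy_if_cond()`, one `TOY_PRINT`, `toy_if_end()` and a `TOY_RETURN` produces four oplines in which both jumps target the `TOY_RETURN`, the same shape as the `JMPZ ... ->6` and `JMP ->6` pair in the bytekit listing, minus the EXT_STMT and FREE oplines.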
  • 28. PHP Bytecode: PHP Opcodes
      NOP, ADD, SUB, MUL, DIV, MOD, SL, SR, CONCAT, BW_OR, BW_AND, BW_XOR, BW_NOT, BOOL_NOT, BOOL_XOR, IS_IDENTICAL, IS_NOT_IDENTICAL, IS_EQUAL,
      IS_NOT_EQUAL, IS_SMALLER, IS_SMALLER_OR_EQUAL, CAST, QM_ASSIGN, ASSIGN_ADD, ASSIGN_SUB, ASSIGN_MUL, ASSIGN_DIV, ASSIGN_MOD, ASSIGN_SL, ASSIGN_SR, ASSIGN_CONCAT, ASSIGN_BW_OR, ASSIGN_BW_AND, ASSIGN_BW_XOR, PRE_INC, PRE_DEC,
      POST_INC, POST_DEC, ASSIGN, ASSIGN_REF, ECHO, PRINT, JMPZ, JMPNZ, JMPZNZ, JMPZ_EX, JMPNZ_EX, CASE, SWITCH_FREE, BRK, BOOL, INIT_STRING, ADD_CHAR, ADD_STRING,
      ADD_VAR, BEGIN_SILENCE, END_SILENCE, INIT_FCALL_BY_NAME, DO_FCALL, DO_FCALL_BY_NAME, RETURN, RECV, RECV_INIT, SEND_VAL, SEND_VAR, SEND_REF, NEW, FREE, INIT_ARRAY, ADD_ARRAY_ELEMENT, INCLUDE_OR_EVAL, UNSET_VAR,
      UNSET_DIM, UNSET_OBJ, FE_RESET, FE_FETCH, EXIT, FETCH_R, FETCH_DIM_R, FETCH_OBJ_R, FETCH_W, FETCH_DIM_W, FETCH_OBJ_W, FETCH_RW, FETCH_DIM_RW, FETCH_OBJ_RW, FETCH_IS, FETCH_DIM_IS, FETCH_OBJ_IS, FETCH_FUNC_ARG
  • 29. PHP Bytecode: PHP Opcodes
      FETCH_DIM_FUNC_ARG, FETCH_OBJ_FUNC_ARG, FETCH_UNSET, FETCH_DIM_UNSET, FETCH_OBJ_UNSET, FETCH_DIM_TMP_VAR, FETCH_CONSTANT, EXT_STMT, EXT_FCALL_BEGIN, EXT_FCALL_END, EXT_NOP, TICKS, SEND_VAR_NO_REF, CATCH, THROW, FETCH_CLASS, CLONE, INIT_METHOD_CALL,
      INIT_STATIC_METHOD_CALL, ISSET_ISEMPTY_VAR, ISSET_ISEMPTY_DIM_OBJ, PRE_INC_OBJ, PRE_DEC_OBJ, POST_INC_OBJ, POST_DEC_OBJ, ASSIGN_OBJ, INSTANCEOF, DECLARE_CLASS, DECLARE_INHERITED_CLASS, DECLARE_FUNCTION, RAISE_ABSTRACT_ERROR, ADD_INTERFACE, VERIFY_ABSTRACT_CLASS, ASSIGN_DIM, ISSET_ISEMPTY_PROP_OBJ, HANDLE_EXCEPTION
  • 30. Extending the PHP Compiler: Test First!
      --TEST--
      unless statement
      --FILE--
      <?php
      unless (FALSE) {
          print 'unless FALSE is TRUE, this is printed';
      }

      unless (TRUE) {
          print 'unless TRUE is TRUE, this is printed';
      }
      ?>
      --EXPECT--
      unless FALSE is TRUE, this is printed
  • 31. Extending the PHP Compiler
      ▪ Add token for unless to the scanner
      ▪ Add rule for unless to the parser
      ▪ Implement bytecode generation for unless in the compiler
      ▪ Add token for unless to ext/tokenizer
  • 32. Add unless scanner token (Zend/zend_language_parser.y)
      %token T_NAMESPACE
      %token T_NS_C
      %token T_DIR
      %token T_NS_SEPARATOR
      %token T_UNLESS
  • 33. Add unless scanner token (Zend/zend_language_scanner.l)
      <ST_IN_SCRIPTING>"if" {
          return T_IF;
      }

      <ST_IN_SCRIPTING>"unless" {
          return T_UNLESS;
      }

      <ST_IN_SCRIPTING>"elseif" {
          return T_ELSEIF;
      }

      <ST_IN_SCRIPTING>"endif" {
          return T_ENDIF;
      }

      <ST_IN_SCRIPTING>"else" {
          return T_ELSE;
      }
  • 34. Add unless parser rule (Zend/zend_language_parser.y)
      unticked_statement:
              '{' inner_statement_list '}'
          |   T_IF '(' expr ')' {
          .
          .
          |   T_UNLESS '(' expr ')' { zend_do_unless_cond(&$3, &$4 TSRMLS_CC); }
              statement { zend_do_if_after_statement(&$4, 1 TSRMLS_CC); }
              { zend_do_if_end(TSRMLS_C); }
  • 35. Add unless to the compiler (Zend/zend_compile.c)
      void zend_do_unless_cond(const znode *cond, znode *closing_bracket_token TSRMLS_DC)
      {
          int unless_cond_op_number = get_next_op_number(CG(active_op_array));
          zend_op *opline = get_next_op(CG(active_op_array) TSRMLS_CC);

          opline->opcode = ZEND_JMPNZ;
          opline->op1 = *cond;
          closing_bracket_token->u.opline_num = unless_cond_op_number;
          SET_UNUSED(opline->op2);
          INC_BPC(CG(active_op_array));
      }

      Compared with generating code for the if statement, all we have to do to generate code for the unless statement is emit JMPNZ (jump if not zero) instead of JMPZ (jump if zero)
  • 36. Add unless to the compiler: The generated bytecode
      1 <?php
      2 unless (FALSE) {
      3     print '*';
      4 }
      5 ?>

      sb@thinkpad ~ % bytekit unless.php
      bytekit-cli 1.0.0 by Sebastian Bergmann.

      Filename:          /home/sb/unless.php
      Function:          main
      Number of oplines: 8

        line  #  opcode     result  operands
        -----------------------------------------------------------------------------
        2     0  EXT_STMT
              1  JMPNZ              true, ->6
        3     2  EXT_STMT
              3  PRINT      ~0      '*'
              4  FREE               ~0
        4     5  JMP                ->6
        6     6  EXT_STMT
              7  RETURN             1
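Why swapping a single opcode is enough can be checked with a toy interpreter. The sketch below is not the Zend VM: the `u_` names, the three-opline program layout, and the use of plain integers as operands are assumptions made for illustration.

```c
#include <assert.h>

enum { U_JMPZ, U_JMPNZ, U_PRINT, U_RETURN };

typedef struct {
    int opcode;
    int operand;   /* condition value for the jumps */
    int target;    /* jump target, unused by other opcodes */
} u_op;

/* Run ops until U_RETURN; returns how many U_PRINT oplines were executed. */
int u_run(const u_op *ops)
{
    int pc = 0;
    int printed = 0;

    for (;;) {
        const u_op *op = &ops[pc];

        switch (op->opcode) {
        case U_JMPZ:   /* jump if zero */
            pc = (op->operand == 0) ? op->target : pc + 1;
            break;
        case U_JMPNZ:  /* jump if not zero */
            pc = (op->operand != 0) ? op->target : pc + 1;
            break;
        case U_PRINT:
            printed++;
            pc++;
            break;
        case U_RETURN:
            return printed;
        default:
            assert(!"unknown opcode");
            return -1;
        }
    }
}

/* Build "if (cond) print" (use_unless == 0) or "unless (cond) print" and run it. */
int u_count_prints(int use_unless, int cond)
{
    u_op ops[3];

    ops[0].opcode  = use_unless ? U_JMPNZ : U_JMPZ;
    ops[0].operand = cond;
    ops[0].target  = 2;   /* skip the body */
    ops[1].opcode  = U_PRINT;
    ops[1].operand = 0;
    ops[1].target  = -1;
    ops[2].opcode  = U_RETURN;
    ops[2].operand = 1;
    ops[2].target  = -1;
    return u_run(ops);
}
```

With `use_unless` set, the body runs only for a zero condition, which matches the expectation in the test file: `unless (FALSE)` prints and `unless (TRUE)` does not.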
  • 37. Running the test
      sb@thinkpad php-5.3-unless % make test TESTS=Zend/tests/unless.phpt

      Build complete.
      Don't forget to run 'make test'.

      =====================================================================
      PHP         : /usr/local/src/php/php-5.3-unless/sapi/cli/php
      PHP_SAPI    : cli
      PHP_VERSION : 5.3.1-dev
      ZEND_VERSION: 2.3.0
      PHP_OS      : Linux 2.6.28-14-generic #47-Ubuntu SMP Sat Jul 25 01:19:55 UTC 2009 i686 GNU/Linux
      INI actual  : /usr/local/src/php/php-5.3-unless/tmp-php.ini
      More .INIs  :
      CWD         : /usr/local/src/php/php-5.3-unless
      Extra dirs  :
      VALGRIND    : Not used
      =====================================================================
      Running selected tests.
      PASS unless statement [Zend/tests/unless.phpt]
      =====================================================================
      Number of tests :    1                 1
      Tests skipped   :    0 (  0.0%) --------
      Tests warned    :    0 (  0.0%) (  0.0%)
      Tests failed    :    0 (  0.0%) (  0.0%)
      Expected fail   :    0 (  0.0%) (  0.0%)
      Tests passed    :    1 (100.0%) (100.0%)
      ---------------------------------------------------------------------
      Time taken      :    0 seconds
      =====================================================================
  • 38. Add unless to ext/tokenizer
      sb@thinkpad tokenizer % ./tokenizer_data_gen.sh
      Wrote tokenizer_data.c
  • 39. The End Thank you for your interest! These slides will be posted on http://slideshare.net/sebastian_bergmann
  • 40. Acknowledgements
      ▪ Thomas Lee, whose Python Language Internals presentation at OSDC 2008 inspired this presentation
      ▪ Stefan Esser for creating the Bytekit extension that provides PHP bytecode access and analysis features
      ▪ Derick Rethans, David Soria Parra, and Scott MacVicar for reviewing these slides
  • 41. References
      ▪ http://www.php.net/manual/en/tokens.php
      ▪ http://www.zapt.info/opcodes.html
      ▪ "Extending and Embedding PHP", Sara Golemon
      ▪ http://bytekit.org/
      ▪ http://github.com/sebastianbergmann/bytekit-cli/
  • 42. License
      This presentation material is published under the Attribution-Share Alike 3.0 Unported license.
      You are free:
        ✔ to Share – to copy, distribute and transmit the work.
        ✔ to Remix – to adapt the work.
      Under the following conditions:
        ● Attribution. You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work).
        ● Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or a compatible license.
      For any reuse or distribution, you must make clear to others the license terms of this work.
      Any of the above conditions can be waived if you get permission from the copyright holder.
      Nothing in this license impairs or restricts the author's moral rights.