2. Hi
Julien PAULI
SensioLabs tech team (Blackfire - PHP)
Programming with PHP since early 2000s
Today working with PHP and hacking it from inside (C)
PHP Internals programmer/contributor
PHP 5.5 & 5.6 Release Manager
@julienpauli
Tech life at http://jpauli.tech
jpauli@php.net
3. What we'll cover together
PHP 7 new compiler
Compiler optimizations
PHP 7 new executor optimizers and OPCache passes
PHP 7 new references mechanism
PHP 7 new Hashtables (PHP arrays)
PHP 7 new strings management
8. How does PHP work ?
PHP code -> Parsing (lexer + parser) -> AST nodes -> Compiling -> opcodes -> Executing -> Result
9. A file-based compiler with runtime compiling
include / require / eval() = compile + execute
autoload = compile + execute, deferred until execution
(Diagram: Compile -> Execute; include/require, eval() and autoload each loop back from Execute to Compile)
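A minimal sketch of that runtime compile step. The class name LazyGreeter and the autoloader are hypothetical, and eval() stands in for the require a real autoloader would do, so the example stays in one file. It only records the order in which things happen.

```php
<?php
// Records the order in which things happen at runtime.
$events = [];

// Hypothetical autoloader: a real one would require a file. eval() stands in
// for it here; like include/require, it compiles and then executes.
spl_autoload_register(function (string $class) use (&$events): void {
    $events[] = "compile+execute: $class";
    eval("class $class { public function hello(): string { return 'hello'; } }");
});

$events[] = 'script running';
$greeter = new LazyGreeter();   // first use: the autoloader fires here
$events[] = 'object created';
$again = new LazyGreeter();     // class already compiled: no second autoload

echo implode("\n", $events), "\n";
```

The class is compiled in the middle of execution, and only once.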
10. OPCache Optimizer
OPCache will optimize the compilation result
Compiling -> opcodes -> Optimizing -> opcodes -> Caching (save to shared memory) -> opcodes -> Executing
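To check whether the shared-memory cache is actually in play, something like the following can help. It is a sketch that assumes the OPCache extension's opcache_get_status() function; on the CLI the cache is often disabled, so the fallback branch is the likely one there.

```php
<?php
// opcache_get_status() only exists when the OPCache extension is loaded,
// and it returns false when the cache is disabled.
$status = function_exists('opcache_get_status') ? opcache_get_status(false) : false;

if ($status === false || empty($status['opcache_enabled'])) {
    $report = 'OPCache is not active for this SAPI';
} else {
    $report = sprintf(
        '%d scripts cached, %.1f MB of shared memory used',
        $status['opcache_statistics']['num_cached_scripts'],
        $status['memory_usage']['used_memory'] / 1048576
    );
}

echo $report, "\n";
```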
12. PHP 7 new compiler
PHP 7 compiler is based on an AST
It plays with OPCache to optimize the runtime
The hash of every literal string is calculated at compile time
Resolves every static/literal expression
Optimizes some function calls when the result is known at compile time
Compiling a PHP file is a heavy process
13. Classes impact
Classes are heavy structures in memory
But objects are light structures in memory ;-)
Don't make PHP parse huge classes
If you won't use them fully at runtime
14. PHP 7 compiler optimizations
If the compile node can be resolved at compile time, it will be
Static expressions, static arrays initialization
PHP constants
Some functions compile to special ZendVM opcodes or are directly resolved by the compiler if possible
strlen(), is_{type}(), {type}val() casts, defined(), chr(), ord(), call_user_func() & call_user_func_array(), get_class(), get_called_class(), gettype(), count(), in_array(), array_slice(), func_num_args(), func_get_args()
Some are part of the OPCache optimizer:
function_exists(), is_callable(), extension_loaded(), constant(), dirname(), defined(), strlen()
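A small sketch of code the compiler can work on. The constant and variable names are made up for illustration. The result is the same whether or not anything is folded; the difference only shows in an opcode dump.

```php
<?php
// Static expression: a candidate for resolution at compile time.
const SECONDS_PER_DAY = 60 * 60 * 24;

// Calls on literal arguments: candidates for compile-time evaluation.
$len  = strlen('hello');
$code = ord('A');
$char = chr(66);

// Calls on a runtime value: nothing to fold, but strlen() and is_int()
// are among the functions that can compile to dedicated opcodes.
$input      = (string) getmypid();
$runtimeLen = strlen($input);
$isInt      = is_int($runtimeLen);

printf("%d %d %d %s %d\n", SECONDS_PER_DAY, $len, $code, $char, $runtimeLen);
```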
17. Namespaced function calls
We have proven that the performance difference is really tiny on real use cases.
(Usually, your IDE prepends the leading \ for you)
namespace Foo;
class Bar
{
    public function hello($str)
    {
        return "Hello" . strlen($str);
    }
}
VS
namespace Foo;
class Bar
{
    public function hello($str)
    {
        return "Hello" . \strlen($str);
    }
}
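Why the leading backslash matters to the compiler: inside a namespace, an unqualified call is looked up in the namespace first, so it cannot be tied to the built-in at compile time. A sketch, where Foo\strlen() is a hypothetical shadowing function written only to make the lookup order visible:

```php
<?php
namespace Foo;

// Hypothetical shadowing function, only here to show the lookup order.
function strlen(string $str): int
{
    return -1;
}

$unqualified = strlen('abc');   // Foo\strlen() is tried first
$qualified   = \strlen('abc');  // always the built-in

echo $unqualified, ' ', $qualified, "\n";
```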
18. Symfony example : managing several PHP versions
PHP constants are resolved at compile time
OPCache optimizer will delete the dead branch
public function unserialize($data)
{
    if (PHP_VERSION_ID >= 70000) {
        list($environment, $debug) = unserialize($data, array('allowed_classes' => false));
    } else {
        list($environment, $debug) = unserialize($data);
    }
    $this->__construct($environment, $debug);
}
On PHP 7, with the dead branch removed:
public function unserialize($data)
{
    list($environment, $debug) = unserialize($data, array('allowed_classes' => false));
    $this->__construct($environment, $debug);
}
19. PHP 7 compiler optim example, static arrays
Arrays containing only static/literal keys and values
Such arrays are fully resolved at compile time
They involve no runtime work at all
const FOO = ['bar', 'baz', 'foo', 34, [42, 'bar'=>'baz']];
21. PHP 7 new references mechanism
In PHP 7, the deep copy is postponed until COW breakage
If no COW breakage, then no copy happens at all
function foo($arg) { $arg = 'bar'; } /* full copy of the variable */
$a = 'foo';
$b = &$a;
foo($a);
$a = ['foo', 42, ['bar' , new stdclass], 'baz'];
$b = &$a;
if (count($a) == 8) { /* no zval copy here */
}
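The copy-on-write behaviour can be observed from userland with memory_get_usage(). This is a sketch; peek() and mutate() are hypothetical helpers, and the thresholds are only orders of magnitude.

```php
<?php
$a = range(1, 200000);
$b = &$a;                        // $a is now a reference

function peek(array $arg): int   // by-value parameter, read only
{
    count($arg);
    return memory_get_usage();
}

function mutate(array $arg): int // by-value parameter, written to
{
    $arg[] = 0;                  // COW breakage: the array is duplicated here
    return memory_get_usage();
}

$before      = memory_get_usage();
$peekDelta   = peek($a) - $before;
$mutateDelta = mutate($a) - $before;

printf("read only: %d bytes, after write: %d bytes\n", $peekDelta, $mutateDelta);
```

On PHP 7 the read-only call should cost next to nothing, while the write pays for a full duplicate of the array.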
23. PHP 7 hashtables
They have been fully rewritten and reworked
Nothing to say from PHP userland POV
Except perhaps for the packed array case
24. Packed arrays
If your keys are integer only (no string key)
If your keys are constantly increasing
No matter if they don't follow each other with +1
Then you'll benefit from packed arrays optimization
Packed arrays use less memory than "normal" arrays
Reduction of (table_size - 2) * 4 bytes
~4 KB for a 1000-entry table
May be noticeable for BIG arrays
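The arithmetic behind the ~4 KB figure, as a sketch. tableSize() and packedSaving() are hypothetical helpers; rounding the table size up to a power of two follows the slides, while the minimum of 8 slots is an assumption about the engine.

```php
<?php
// Hypothetical helper: table size is the entry count rounded up to a power
// of two (the minimum of 8 slots is an assumption about the engine).
function tableSize(int $entries): int
{
    $size = 8;
    while ($size < $entries) {
        $size *= 2;
    }
    return $size;
}

// Saving stated on the slide: (table_size - 2) * 4 bytes.
function packedSaving(int $entries): int
{
    return (tableSize($entries) - 2) * 4;
}

printf("%d entries: %d bytes saved\n", 1000, packedSaving(1000));
```

For 1000 entries the table size is 1024, so the saving is (1024 - 2) * 4 = 4088 bytes.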
25. Packed arrays example
const N = 1024 * 1023;
for ($i = 0; $i < N; $i++) {
    $tab[] = random_bytes(3);
}
echo memory_get_usage(); // ~67 MB

const N = 1024 * 1023;
for ($i = 0; $i < N; $i++) {
    $tab[] = random_bytes(3);
}
$tab['foo'] = 'bar';
echo memory_get_usage(); // ~71 MB

const N = 1024 * 1023;
for ($i = 0; $i < N; $i++) {
    $tab[] = random_bytes(3);
}
unset($tab[1000]);
$tab[1000] = 1000;
echo memory_get_usage(); // ~71 MB
26. Packed arrays conditions (recalled)
Do NOT use string keys
Always use increasing integer-based keys
Contiguous or not is not important
If the array is built by the compiler, keep keys within the interval [0, table-size], table-size being rounded up to the next power of two
For example, if you need lists, then you'll benefit from this optimisation
27. HashTables in PHP 7 : go further
http://jpauli.github.io/2016/04/08/hashtables.html
29. String management
In PHP 5, strings don't have their own structure
String management is hard
Leads to many string duplications
And thus many memory accesses
In PHP 7, strings share the zend_string structure
They are refcounted, thus shareable
hashes are precomputed, often at compile time
struct hack is used to compact memory
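Refcounted strings can be seen at work from userland. In this sketch the sizes and thresholds are arbitrary; it measures memory before and after sharing one large string, then after writing to one of the copies.

```php
<?php
$big = str_repeat('x', 1024 * 1024);   // one string of 1 MB

$before = memory_get_usage();
$copies = [];
for ($i = 0; $i < 100; $i++) {
    $copies[] = $big;                  // bumps the refcount, copies nothing
}
$sharedDelta = memory_get_usage() - $before;

$copies[0] .= 'y';                     // write: this copy gets its own buffer
$writeDelta = memory_get_usage() - $before - $sharedDelta;

printf("100 shared copies: %d bytes, first write: %d bytes\n", $sharedDelta, $writeDelta);
```

The hundred "copies" should only cost the array slots; the first write is what pays for a new buffer.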
30. Strings in PHP
PHP 5
zval: char *str, int len, refcount, is_ref, gc_infos, ...
PHP 7
zval: zend_string *, ...
zend_string: gc_infos, hash, size_t len, char str[1], ...
35. Encapsed string optimisation
Encapsed strings are double-quoted strings that get parsed
They need to be analyzed for variables
PHP 5 used to reallocate the string at each step
$a = "foo and $b and $c";
3 0 E > ADD_STRING ~0 'foo+and+'
1 ADD_VAR ~0 ~0, !1
2 ADD_STRING ~0 ~0, '+and+'
3 ADD_VAR ~0 ~0, !2
4 ASSIGN !0, ~0
4 5 > RETURN 1
36. Encapsed string in PHP 5
With $b = 'b' and $c = 'c', the result buffer is reallocated at every step:
foo and
foo and b
foo and b and
foo and b and c
Lots of pressure on the allocator
It needs to find a new chunk at every new allocation
It browses through a free-chunk linked list
Bad for performance
37. Encapsed string optimisation in PHP 7
PHP 7 uses a "rope", and only reallocates memory once, at the end
https://en.wikipedia.org/wiki/Rope_(data_structure)
$a = "foo and $b and $c";
L3 #0 ROPE_INIT "foo and " ~1
L3 #1 ROPE_ADD ~1 $b ~1
L3 #2 ROPE_ADD ~1 " and " ~1
L3 #3 ROPE_END ~1 $c ~0
L3 #4 ASSIGN $a ~0
L3 #5 RETURN 1
38. Encapsed strings in PHP 7
(Diagram: the rope goes through INIT, ADD, ADD, ADD, END; the result "foo and b and c" is only built at END)
Keep every piece of string as its own buffer
Stack them
At the end, merge them in one operation
39. So ?
So you'd better use encapsed strings than concatenations
$a = "foo and $b and $c";
$a = 'foo and ' . $b . ' and ' . $c;
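Both forms produce the same string; only the opcodes behind them differ. A quick check, with no timing claim attached:

```php
<?php
$b = 'b';
$c = 'c';

$encapsed = "foo and $b and $c";            // compiled to the ROPE_* opcodes in PHP 7
$concat   = 'foo and ' . $b . ' and ' . $c; // compiled to a chain of concatenations

var_dump($encapsed === $concat);
```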
41. switch-case with special cases
A switch-case statement with only strings as cases will now use a jump table instead of testing each case individually in the VM
Same for integer-only cases, starting from at least 5 cases.
switch ($a) {
    case 'bar':
        echo 'bar';
        break;
    case 'foo':
        echo 'foo';
        break;
}
PHP < 7.2:
if ($a == 'bar') {
    echo 'bar';
} elseif ($a == 'foo') {
    echo 'foo';
}
PHP >= 7.2:
in_array($a, ['bar', 'foo']);
This is O(1) in the huge majority of cases
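A sketch of a switch that fits the conditions above: string-only cases. describe() is a hypothetical function. The behaviour is the same on every PHP version; only the dispatch strategy in the VM changes.

```php
<?php
function describe(string $a): string
{
    // Only string literals as cases: eligible for the jump table in PHP >= 7.2.
    switch ($a) {
        case 'bar':
            return 'got bar';
        case 'foo':
            return 'got foo';
        default:
            return 'unknown';
    }
}

echo describe('foo'), "\n";
echo describe('qux'), "\n";
```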
42. New maths OPCode specializers
Many of them, with overflows taken care of when needed
ZEND_VM_HOT_TYPE_SPEC_HANDLER(ZEND_ADD,
    (op1_info == MAY_BE_DOUBLE && op2_info == MAY_BE_DOUBLE),
    ZEND_ADD_DOUBLE, CONST|TMPVARCV, CONST|TMPVARCV,
    SPEC(NO_CONST_CONST,COMMUTATIVE))
{
    USE_OPLINE
    zval *op1, *op2, *result;
    op1 = GET_OP1_ZVAL_PTR_UNDEF(BP_VAR_R);
    op2 = GET_OP2_ZVAL_PTR_UNDEF(BP_VAR_R);
    result = EX_VAR(opline->result.var);
    ZVAL_DOUBLE(result, Z_DVAL_P(op1) + Z_DVAL_P(op2));
    ZEND_VM_NEXT_OPCODE();
}
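What "overflows taken care of" looks like from userland, as a sketch. Whether a specialized handler is actually picked depends on what the engine can infer about the operand types, which this example does not show; it only shows the resulting values.

```php
<?php
// Values produced at runtime.
$x = (float) mt_rand(1, 10);
$y = 0.5;
$doubleSum = $x + $y;            // double + double: stays a float

$max = PHP_INT_MAX;
$one = (int) ($x / $x);          // 1, computed at runtime
$overflowed = $max + $one;       // int + int that overflows: promoted to float

var_dump(is_float($doubleSum), is_float($overflowed));
```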
45. Future of PHP
The PHP 7 branch keeps optimizing things
SSA and other dynamic optimizations have been added to the OPCache optimizer in PHP 7.1 and PHP 7.2
The PHP 7 branch keeps preparing the move to the JIT engine that should happen for PHP 8
PHP 8 will come when ready :-)
Tip: not tomorrow