Do you know how your code really works behind the scenes? It's very useful to be aware of the mechanisms and optimizations your engine performs. I will show you how your code is processed and run, and how that can affect your day-to-day development.
Performed at Code Europe 2017: https://www.codeeurope.pl
8. How does full-codegen work?
• Compiles JS code directly to native code before executing it
• Needs to compile code as quickly as possible
• Does not optimize code
• One function at a time, compiled just in time
• Uses Inline Caches (ICs) to implement loads, stores, calls, and binary, unary and comparison operations
• An IC is implemented as a stub, generated on the fly, which can be cached for common cases
• A stub starts out with no instructions; each missed call makes it more complicated
Full-codegen
10. How does Crankshaft work?
• Only selected "hot" functions are passed on for optimization.
• Hot: marked for optimization -> LazyRecompile -> optimizing
• Hot: uninitialized -> premonomorphic -> monomorphic
• Most of the time a function is marked for optimization at its second call
• Sometimes the compiler optimizes it immediately (big loops) - on-stack replacement
• Builds the Hydrogen control flow graph (SSA)
• Optimizes the Hydrogen graph (the only part that can run in parallel with JS)
• Generates the Lithium graph
• Emits native instructions
Crankshaft
11. Static single assignment form
Crankshaft
From: https://en.wikipedia.org/wiki/Static_single_assignment_form
x ← 5
x ← x - 3
x < 3?
y ← x * 2
w ← y
y ← x - 3
w ← x - y
z ← x + y
x1 ← 5
x2 ← x1 - 3
x2 < 3?
y1 ← x2 * 2
w1 ← y1
y2 ← x2 - 3
w2 ← x2 - y?
z1 ← x2 + y?
12. Static single assignment form
Crankshaft
From: https://en.wikipedia.org/wiki/Static_single_assignment_form
x ← 5
x ← x - 3
x < 3?
y ← x * 2
w ← y
y ← x - 3
w ← x - y
z ← x + y
x1 ← 5
x2 ← x1 - 3
x2 < 3?
y1 ← x2 * 2
w1 ← y1
y2 ← x2 - 3
y3 ← Φ(y1, y2)
w2 ← x2 - y3
z1 ← x2 + y3
13. What is Crankshaft optimizing?
• Inlines functions - when it's safe and they are small
• Tags values - e.g. integer, double
• Analyzes Uint32 - by default 31 bits are used for small signed integers; this allows larger values
• Canonicalization - simplifies operations
• Global Value Numbering (GVN) - eliminates redundancy
• Redundant bounds check elimination - removes unneeded checks for arrays
• Dead code elimination
Crankshaft
15. How does TurboFan work?
• Uses the "sea of nodes" concept
• Can work directly on bytecode (works perfectly with Ignition)
• Emits machine code
• Can handle more features easily (e.g. from ES6)
• More aggressive than Crankshaft
• Optimizes progressively
• Has a scheduling algorithm
TurboFan
17. How to enable Ignition?
• In Chrome it's rolled out via A/B tests, or you can use the disable-v8-ignition-turbo flag in chrome://flags
• In Node.js 7+ use the --ignition flag
Ignition
18. How does Ignition work?
• It's an interpreter
• Compiles to bytecode
• Reduces memory usage for compiling (important for mobile)
• Improves startup time
• Simplifies the compilation pipeline
• Performs some optimizations, e.g. dead-code removal
Ignition
20. How were the tests prepared?
• MacBook Pro (early 2015), i5 2.7GHz, 8GB DDR3
• Node.js 7.7.1 (V8 5.5.372.41)
• Time measured with time node script.js
• The number of iterations varies between test cases
Optimizations & memory representation by examples
21. Try/catch
// Set up number of iterations
const iterations = 1e9
// Set up function for tests
function f () {
return 1 + 2 * 3 / 4
}
// Set up function which calls function
function test () {
for (var i = 0; i < iterations; i++) {
f(i)
}
}
try {
test()
} catch (e) {
// Do nothing
}
real: 0.663s user: 0.555s sys: 0.027s
Optimizations & memory representation by examples
// Set up number of iterations
const iterations = 1e9
// Set up function for tests
function f () {
return 1 + 2 * 3 / 4
}
// Set up function which calls function
function test () {
try {
for (var i = 0; i < iterations; i++) {
f(i)
}
} catch (e) {
// Do nothing
}
}
test()
real: 8.826s user: 8.496s sys: 0.071s
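The takeaway from the benchmark above can be condensed into a small sketch (names like `hot` are mine, and exact timings will differ per machine): keep the hot loop in a function that contains no try/catch, and wrap only the call site.

```javascript
// A minimal sketch of the fast pattern: the loop lives in a function
// without try/catch, so the optimizing compiler can handle it
function hot () {
  let sum = 0
  for (let i = 0; i < 1e6; i++) sum += i
  return sum
}

let result
try {
  // only the call site is wrapped
  result = hot()
} catch (e) {
  result = -1
}
console.log(result)
```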
23. For..in
// Set up number of iterations
const iterations = 1e8
// Set up function for tests
function f () {
var obj = {}
var key
for (key in obj) {
// Do nothing
}
}
// Benchmark function
for (var i = 0; i < iterations; i++) {
f(1, 2, 3, 4, 5)
}
real: 1.416s user: 1.281s sys: 0.042s
// Set up number of iterations
const iterations = 1e8
// Set up function for tests
var key
function f () {
var obj = {}
for (key in obj) {
// Do nothing
}
}
// Benchmark function
for (var i = 0; i < iterations; i++) {
f(1, 2, 3, 4, 5)
}
real: 3.181s user: 2.868s sys: 0.055s
Optimizations & memory representation by examples
24. For..in
// Set up number of iterations
const iterations = 1e8
// Set up function for tests
function f () {
var obj = {}
var key
for (key in obj) {
// Do nothing
}
}
// Benchmark function
for (var i = 0; i < iterations; i++) {
f(1, 2, 3, 4, 5)
}
real: 1.416s user: 1.281s sys: 0.042s
// Set up number of iterations
const iterations = 1e8
// Set up function for tests
var key
function f () {
var obj = {}
for (key in obj) {
// Do nothing
}
}
// Benchmark function
for (var i = 0; i < iterations; i++) {
f(1, 2, 3, 4, 5)
}
real: 3.181s user: 2.868s sys: 0.055s
Optimizations & memory representation by examples
25. for..of vs classic for loop
// Set up number of iterations
const iterations = 1e8
// Set up data
const arr = [ 0, 1, 2, 3, 4 ]
// Set up function for tests
function f () {
for (var element of arr) {
// Do nothing
}
}
// Benchmark function
for (var i = 0; i < iterations; i++) {
f()
}
real: 13.879s user: 13.662s sys: 0.099s
// Set up number of iterations
const iterations = 1e8
// Set up data
const arr = [ 0, 1, 2, 3, 4 ]
// Set up function for tests
function f () {
for (var i = 0; i < arr.length; i++) {
var el = arr[i]
}
}
// Benchmark function
for (var i = 0; i < iterations; i++) {
f()
}
real: 0.722s user: 0.620s sys: 0.028s
Optimizations & memory representation by examples
26. for..of implementation
class Iterable {
constructor (...elements) {
this.els = elements || []
}
add (...elements) {
this.els.push(...elements)
}
*[Symbol.iterator] () {
yield* this.els
}
}
const a = new Iterable(1, 2, 3, 4)
for (let element of a) {
console.log(element)
}
class Iterable2 {
constructor (...elements) {
this.els = elements || []
}
add (...elements) {
this.els.push(...elements)
}
*[Symbol.iterator] () {
for (let i = 0; i < this.els.length; i++) {
yield this.els[i]
}
}
}
Optimizations & memory representation by examples
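As a hedged alternative to the generator-based iterators above (`Iterable3` is my name, not from the slides): at the time of the talk, generators were a deoptimization reason, so implementing the iterator protocol by hand avoided them entirely.

```javascript
// Sketch: the iterator protocol implemented by hand, without generators
class Iterable3 {
  constructor (...elements) {
    this.els = elements
  }
  [Symbol.iterator] () {
    let i = 0
    const els = this.els
    return {
      // next() returns { value, done } per the iterator protocol
      next: () => i < els.length
        ? { value: els[i++], done: false }
        : { value: undefined, done: true }
    }
  }
}

const out = []
for (const element of new Iterable3(1, 2, 3)) {
  out.push(element)
}
console.log(out.join(','))
```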
27. Mono & polymorphic operations
// Set up number of iterations
const iterations = 1e9
// Set up function for tests
function f (x) {
return x + 'a'
}
// Set up testing values
const values = [ 0, 1, 2, 3, 4, 5 ]
const length = values.length
// Benchmark function
for (var i = 0; i < iterations; i++) {
var value = values[i % length]
f(value)
}
real: 7.325s user: 6.697s sys: 0.071s
// Set up number of iterations
const iterations = 1e9
// Set up function for tests
function f (x) {
return x + 'a'
}
// Warm function with different types of values
f(undefined), f(null), f('a'), f(true)
// Set up testing values
const values = [ 0, 1, 2, 3, 4, 5 ]
const length = values.length
// Benchmark function
for (var i = 0; i < iterations; i++) {
var value = values[i % length]
f(value)
}
real: 21.012s user: 20.271s sys: 0.159s
Optimizations & memory representation by examples
29. Mono & polymorphic operations
• Different types cause different native operations
• A monomorphic stub has only one case
• A megamorphic stub has more cases
• For different cases the stub needs "safety gates" to handle the different types
Optimizations & memory representation by examples
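One practical way to act on this (a sketch of my own, not from the slides): normalize the argument to a single type before the hot call, so the `+` site only ever sees one case and its stub stays monomorphic.

```javascript
function f (x) {
  return x + 'a'
}

// Mixed types would make the call site polymorphic...
const values = [0, 'b', true]
// ...so convert once up front; f then only ever sees strings
const results = values.map(v => f(String(v)))
console.log(results.join(','))
```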
30. Objects interpretation
// Set up number of iterations
const iterations = 1e9
const obj = { 0: 1 }
// Set up function for tests
function f (i) {
obj[0] = i
}
// Benchmark function
for (var i = 0; i < iterations; i++) {
f(i)
}
real: 0.837s user: 0.729s sys: 0.030s
// Set up number of iterations
const iterations = 1e9
const obj = { x: 1 }
// Set up function for tests
function f (i) {
obj.x = i
}
// Benchmark function
for (var i = 0; i < iterations; i++) {
f(i)
}
real: 0.579s user: 0.554s sys: 0.013s
Optimizations & memory representation by examples
31. Object representation in memory
• Hash tables (dictionaries) - used for complicated objects, slow
• Fast elements - objects with integer indexes, handled differently (can be an array), small
• Fast, in-object properties - create hidden classes which share the same structure (transitions)
• Methods & prototypes - functions go into the map as constant_function
• An object can move from one representation to another
Optimizations & memory representation by examples
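The last bullet has an observable trigger (a sketch; the internal representation itself is not visible from JS, but the operation that can cause the switch is):

```javascript
// Deleting an element leaves a hole, which can push the array out of
// fast elements into a slower, dictionary-like representation
const arr = [0, 1, 2, 3]
delete arr[1]
console.log(1 in arr)   // the index is gone
console.log(arr.length) // but length is unchanged: 4
```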
32. Hidden classes
Optimizations & memory representation by examples
function Point (x, y) {
// Map M0
// ”x”: Transition to M1 at offset 12
this.x = x
// Map M1
// ”x” at 12
// ”y”: Transition to M2 at offset 16
this.y = y
// Map M2
// ”x” at 12, ”y” at 16
// ”do”: Transition to M3 <doSomething>
this.do = doSomething
// Map M3
// ”x” at 12, ”y” at 16
// ”do”: Constant_Function <doSomething>
}
function doSomething (p) { /* ……… */ }
From: https://github.com/thlorenz/v8-perf
function Point (x, y) {
this.x = x
this.y = y
if (x > 10) {
this.big = true
}
}
// Map X
const a = new Point(5, 0)
// Map Y
const b = new Point(20, 0)
33. Hidden classes
Optimizations & memory representation by examples
function Point (x, y) {
// Map M0
// ”x”: Transition to M1 at offset 12
this.x = x
// Map M1
// ”x” at 12
// ”y”: Transition to M2 at offset 16
this.y = y
// Map M2
// ”x” at 12, ”y” at 16
// ”do”: Transition to M3 <doSomething>
this.do = doSomething
// Map M3
// ”x” at 12, ”y” at 16
// ”do”: Constant_Function <doSomething>
}
function doSomething (p) { /* ……… */ }
From: https://github.com/thlorenz/v8-perf
function Point (x, y) {
// Map M0
this.x = x
// Map M1
this.y = y
// Map M2
if (x > 10) {
this.big = true
// Map M3
}
}
// Map M2
const a = new Point(5, 0)
// Map M3
const b = new Point(20, 0)
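A common remedy for the map split above (a sketch, not from the slides): initialize every property unconditionally in the constructor, so all instances walk the same transition chain and end on the same hidden class.

```javascript
function Point (x, y) {
  this.x = x
  this.y = y
  // Always define `big`; only its value depends on the condition,
  // so both instances share the same final map
  this.big = x > 10
}

const a = new Point(5, 0)
const b = new Point(20, 0)
// Same property layout for both instances
console.log(Object.keys(a).join(','))
console.log(Object.keys(b).join(','))
```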
35. Determine type of variable
// Set up number of iterations
const iterations = 1e8
// Set up function for tests
function f () {
var attr = arguments
if (1 === 1) {
attr = [ 0 ]
}
}
// Benchmark function
for (var i = 0; i < iterations; i++) {
f(i)
}
// Set up number of iterations
const iterations = 1e8
// Set up function for tests
function f () {
var attr = arguments
var x = [ 1 ]
if (1 === 1) {
x = [ 0 ]
}
}
// Benchmark function
for (var i = 0; i < iterations; i++) {
f(i)
}
real: 3.696s user: 3.466s sys: 0.064s (first example)
real: 1.679s user: 1.454s sys: 0.037s (second example)
Optimizations & memory representation by examples
36. Mutate arguments
// Set up number of iterations
const iterations = 1e9
// Set up function for tests
function f () {
// Modify arguments element
arguments[0] = 0
return arguments[0]
}
// Benchmark function
for (var i = 0; i < iterations; i++) {
f(1)
}
real: 31.372s user: 31.064s sys: 0.106s
// Set up number of iterations
const iterations = 1e9
// Set up array-like object to modify property
var dummy = { 0: 1, length: 1 }
// Set up function for tests
function f () {
// Modify dummy element
dummy[0] = 0
return arguments[0]
}
// Benchmark function
for (var i = 0; i < iterations; i++) {
f(1)
}
real: 1.854s user: 1.753s sys: 0.029s
Optimizations & memory representation by examples
38. Mutate arguments variables
// Set up number of iterations
const iterations = 1e9
// Set up function for tests
function f (arg) {
// Modify argument
arg = 0
return arguments[0]
}
// Benchmark function
for (var i = 0; i < iterations; i++) {
f(1)
}
real: 31.570s user: 31.233s sys: 0.128s
// Set up number of iterations
const iterations = 1e9
// Set up function for tests
function f (arg) {
// Modify argument
arg = 0
return arg
}
// Benchmark function
for (var i = 0; i < iterations; i++) {
f(1)
}
real: 0.699s user: 0.554s sys: 0.036s
Optimizations & memory representation by examples
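If you do need to modify incoming values, a sketch of a modern alternative (ES6+) is rest parameters: they give you a plain array with none of the aliasing magic between `arguments` and named parameters.

```javascript
// Rest parameters produce a real array; there is no hidden link
// between it and any named parameter to keep in sync
function f (...args) {
  args[0] = 0
  return args[0]
}

console.log(f(1)) // 0
```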
39. Methods to find bottlenecks
Optimizations & memory representation by examples
• Find problems with a profiler / timeline (e.g. Chrome DevTools)
• Since Node 6.3: node --inspect (or --inspect-brk to pause on start) - inspect code with DevTools
• node --trace-opt - trace optimizations of functions
• node --trace-deopt - trace deoptimizations
• node --allow-natives-syntax - native syntax for advanced scripting with V8 access
• node --prof - profile code, generates a v8.log file
• node --prof-process v8.log - process the generated log
40. Profiling Node (isolate-xxx-v8.log)
Optimizations & memory representation by examples
[Shared libraries]:
ticks total nonlib name
9 0.2% 0.0% C:\WINDOWS\system32\ntdll.dll
2 0.0% 0.0% C:\WINDOWS\system32\kernel32.dll
[JavaScript]:
ticks total nonlib name
741 17.7% 17.7% LazyCompile: am3 crypto.js:108
113 2.7% 2.7% LazyCompile: Scheduler.schedule richards.js:188
103 2.5% 2.5% LazyCompile: rewrite_nboyer earley-boyer.js:3604
103 2.5% 2.5% LazyCompile: TaskControlBlock.run richards.js:324
96 2.3% 2.3% Builtin: JSConstructCall
...
[C++]:
ticks total nonlib name
94 2.2% 2.2% v8::internal::ScavengeVisitor::VisitPointers
33 0.8% 0.8% v8::internal::SweepSpace
32 0.8% 0.8% v8::internal::Heap::MigrateObject
30 0.7% 0.7% v8::internal::Heap::AllocateArgumentsObject
...
[GC]:
ticks total nonlib name
458 10.9%
42. What is the Garbage Collector for?
• It deletes data from memory when it's no longer needed
• A garbage collector may often be slower than managing memory yourself
• The program has to stop while the garbage collector is working
• It makes memory management easier for developers
Garbage collector & memory leaks
43. How does GC work?
• A tree of nodes (with dependencies) is created
• Everything reachable from the roots (all global objects and everything in their branches) is live
• New data is usually temporary, so it should probably be destroyed early
Garbage collector & memory leaks
44. Memory zones & collecting garbage
• Young generation = new space, old generation = old space
• After surviving for a while in new space, elements are moved to old space
• Allocation in old space is fast, but collecting there is slower
• New space uses scavenge collection; old space uses mark-sweep (or mark-compact)
• Marking can be processed in chunks
• There are 3 marking states:
• White - object not yet discovered by the GC
• Grey - discovered, but not all neighbors processed
• Black - all neighbors have been fully processed
Garbage collector & memory leaks
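The marking states above can be illustrated with a toy mark phase in plain JS (purely illustrative; `refs` is a made-up field standing in for object references, not a V8 structure):

```javascript
// Toy tri-color mark phase: the grey worklist holds discovered nodes
// whose neighbors are not yet processed; black nodes are done.
// Anything never reached stays "white" and would be swept.
function markReachable (root) {
  const black = new Set()
  const grey = [root]
  while (grey.length > 0) {
    const node = grey.pop()
    if (black.has(node)) continue
    black.add(node)
    for (const child of node.refs) {
      grey.push(child)
    }
  }
  return black
}

const d = { refs: [] } // unreachable from the root
const b = { refs: [] }
const a = { refs: [b] }
const root = { refs: [a] }

const live = markReachable(root)
console.log(live.has(a), live.has(b), live.has(d)) // true true false
```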
52. Memory zones
• New space - garbage collected quickly, 1-8 MB
• Old pointer space - objects which have pointers to other objects, moved from new space after a while
• Old data space - objects which contain only raw data (e.g. strings, numbers, arrays)
• Large object space - objects larger than the size limit of the other spaces, never moved
• Code space - JITed instructions, code objects
• Cell space, map space, property cell space - Cells, PropertyCells and Maps
Garbage collector & memory leaks
53. Memory leaks: examples
Garbage collector & memory leaks
function f () {
// `obj` is never declared, so it becomes a property of the global object
obj = {
a: 1,
b: 1
}
}
• It might happen by accident
• The assignment goes to a property of this
• this may be global or window
• Attached to the root, so it will never be removed
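A sketch of guarding against this class of leak (my example, not from the slides): strict mode turns the accidental global into an immediate error instead of a silent leak.

```javascript
function f () {
  'use strict'
  try {
    // In sloppy mode this would silently create a global;
    // in strict mode it throws a ReferenceError instead
    leakedObj = { a: 1, b: 1 }
    return 'assigned'
  } catch (e) {
    return e instanceof ReferenceError ? 'caught' : 'other'
  }
}

console.log(f()) // 'caught'
```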
54. Memory leaks: examples
Garbage collector & memory leaks
// Get some data
var r = getData()
// Process data
function process () {
if (shouldProcess()) {
console.log(r.items)
}
}
// Do something in interval
setInterval(process, 10000)
• Intervals which use some data keep it alive
• When they are no longer needed - clear them
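A sketch of the fix (my names; the 3-tick cutoff is arbitrary): keep a handle to the interval, clear it when it is no longer needed, and drop the captured reference so the data can be collected.

```javascript
// Keep a handle and clear the interval (and the captured data)
// once the work is done
var r = { items: [1, 2, 3] }
var ticks = 0

var handle = setInterval(function process () {
  ticks++
  if (ticks === 3) {
    clearInterval(handle) // stop the timer...
    r = null              // ...and release the captured data for GC
  }
}, 10)
```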
55. Memory leaks: examples
Garbage collector & memory leaks
var theThing = null
function replaceThing () {
var originalThing = theThing
var unused = function () {
if (originalThing) {
console.log("hi")
}
}
theThing = {
longStr: new Array(1000000).join('*'),
someMethod: function () {
console.log(someMessage)
}
}
}
setInterval(replaceThing, 1000)
• Closures can be dangerous
• unused keeps a pointer to originalThing
• someMethod shares its closure scope with unused
• So it keeps a reference to the old theThing
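A sketch of one fix (a smaller string is used here to keep the example quick): break the shared-closure chain by clearing the captured variable once it is no longer needed, so each old `theThing` becomes collectable.

```javascript
var theThing = null

function replaceThing () {
  var originalThing = theThing
  var unused = function () {
    if (originalThing) {
      console.log('hi')
    }
  }
  theThing = {
    longStr: new Array(1000).join('*'),
    someMethod: function () {
      console.log('some message')
    }
  }
  originalThing = null // the previous theThing can now be collected
}

replaceThing()
replaceThing()
console.log(theThing.longStr.length) // 999
```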
56. Memory leaks: how to detect
Garbage collector & memory leaks
• Generate heap dumps (record in Chrome, or e.g. use the `heapdump` module)
• Chrome DevTools - JS Profiler: Comparison
• Compare two/three snapshots from different points in time - you will see what is left behind
From: http://bit.ly/2pcDTe9
• Another way: use gcore
gcore `pgrep node`
> ::findjsobjects
object_id::jsprint
object_id::findjsobjects -r
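The snapshot-comparison idea can also be approximated with nothing but Node's built-ins (a rough sketch of mine; heap snapshots give far more detail): compare `heapUsed` at two points to spot suspicious growth.

```javascript
// Minimal built-in check: retained objects show up as heap growth
// between two measurements
const before = process.memoryUsage().heapUsed

const retained = []
for (let i = 0; i < 100000; i++) {
  retained.push({ i: i }) // kept alive on purpose, like a leak would be
}

const after = process.memoryUsage().heapUsed
console.log(after > before)
```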
Even if you're not a contributor, digging deeper is essential for programmers. Where a junior looking at code may not see something, a mid or senior developer digs a little deeper. You can also spot analogies between V8 and other tools.
I'll tell you about V8 itself and its components: /list them/, how it works and how it affects your code.
We've got 45 minutes, so I'll just try to show the most important stuff and encourage you to read a little more about it. The presentation is available on the web, so you can also look at it later.
V8 is a JavaScript engine which executes your code. It's open source and mostly maintained by Google. It's used widely, e.g. in Chrome, Node.js, Opera and Vivaldi.
V8 started in 2008 as the fastest JS engine. Meanwhile, all the other engines were trying to beat it.
In 2009 Node.js, based on V8, was introduced. It showed that JS can also be used server-side.
In 2010 the first optimizing compiler, Crankshaft, was introduced, but it had some design issues. The web became more complicated, so to fix Crankshaft's problems and optimize code such as asm.js, in 2014 the V8 team revealed TurboFan, a second optimizing compiler. It is much more flexible and was enabled in Chrome in 2015. In 2016 there was a need to decrease the memory used for compiling, and Ignition was the answer to that.
2008 - 2 September
2009 - ?
2010 - 7 December
2014 - ?
2015 - ?
2016 - July
2017 - January/February
2017 goal - get rid of Crankshaft and Full-codegen
It looks a little complicated: three almost independent parts now, with only the AST reusable between them. Three JITs and one interpreter.
In 2017 the pipeline will be simplified. TurboFan can use bytecode directly: Source -> Parser -> AST -> Ignition -> Bytecode -> TurboFan -> Optimized code.
Full-codegen simply translates JS to native code. It has to be fast, to deliver execution as soon as possible; because of that, it doesn't optimize. It's a JIT compiler, so it compiles a function just when it's needed. Operations can be inlined with inline caches; an IC is implemented as a stub which caches the operations needed for specific cases. Later I will show you an example of how this affects you.
Crankshaft is a JIT that optimizes hot functions: uninitialized -> marked for optimization -> optimized (with a monomorphic stub) -> polymorphic. Most of the time it marks a function at its second call; with on-stack replacement it happens immediately. It uses two representations, Hydrogen and Lithium, with most optimizations done on Hydrogen: Hydrogen -> Lithium -> native instructions. SSA = Static Single Assignment form.
Hot: earlier it checked every 1 or 5 ms, and if it saw a function several times it was marked as hot.
We always assign to a new variable. The Φ (phi) function determines which version of a variable should be taken, according to the branch the algorithm came through.
Optimizations:
- Dynamic type feedback: gets information from the full compiler about inline caches (optimizes e.g. key => value access)
- Inlining: when it's safe, a function is inlined (not large or dangerous functions)
- Representation inference: tags values (most important for numbers: raw integer is fastest, then double)
- Static type inference: matches the type of a function (not possible most of the time, because there is only one function)
- Uint32 analysis: JS uses only 31 bits for small signed integers; marking some values as unsigned allows bigger ones
- Canonicalization: simply simplifies operations
- Global value numbering (GVN): eliminates redundancy - hashes opcodes and removes duplicates
- Loop invariant code motion (LICM): if code in a loop doesn't use the loop context, it is moved to the pre-header
- Redundant bounds check elimination: removes unneeded checks for arrays
- Array index dehoisting: reverts LICM for some operations (n++, n += 20)
- Dead code elimination
"Sea of nodes" makes optimization easier: nodes are not ordered except those participating in control flow, so any part of the execution can be changed. TurboFan can work on bytecode (Ignition) and is more aggressive. It uses optimization layers, which make handling new features easier, and it can work in parallel, increasing optimization.
Ignition is currently in a testing phase. As said before, in 2017 it will be used widely, deprecating Crankshaft and Full-codegen. In Chrome use the flag in chrome://flags; in Node use --ignition.
It's a bytecode interpreter. It reduces memory usage and simplifies the pipeline by working with TurboFan. It executes bytecode very well on real-world websites, and startup should also be faster. It also has mechanisms for optimization, e.g. dead-code removal. Right now it doesn't yet work fully as expected and is a little slower.
The number of iterations varies because sometimes I just didn't want to wait an hour for the tests.
Many people see try/catch as slow. When the compiler sees try/catch it can't optimize - try/catch is too hard because of its various cases.
But looking at the other version, we get almost 15 times faster code. When we move try/catch out of the processing function, the processing itself can be optimized.
Next, a less obvious example - we've got a simple for..in loop over some object. We try to optimize it: 100 million times we are declaring key, so let's move it outside!
No - it's slower. I haven't found the exact reason, but when the key variable is outside the closure, the loop is not optimized; probably because of loop invariant code motion (LICM). It's not obvious.
Very practical - I've hit this many times. The compiler reports ForOfStatement as a reason for not optimizing. For..of has much more going on behind it; let's look inside.
The compiler shows a lot of reasons for not optimizing, but many of them come down to a few core problems. In this case we've got generators, which are not optimizable yet.
We've got a simple function which runs a billion times, returning a number concatenated with a string. Let's make a small change.
Now we are running the function a billion + 4 times, and the time triples. That's because of the stubs I mentioned before.
So, for each operation the compiler tries to inline how it should work. When there is only one case, it's called a monomorphic stub; with more cases, megamorphic. Megamorphic stubs must be slower, as they need to cover more safety gates - strings are handled differently than numbers, etc.
Here we've got two examples which just assign values to one of an object's properties. So why is there a difference in time?
There are different methods of representing an object in memory. Hash tables - for complicated objects, slow. Fast elements - objects with integer indexes, but small (<100k elements). In-object properties - create hidden classes, based on transitions. Methods are treated differently to optimize a frequent situation. Importantly, objects can move from one representation to another - e.g. when you remove some index from an array, it might start being used as a hash table.
We have to see exactly how hidden classes work. Each time a property is assigned, another hidden class is created.
Here we've got an example in more common code - one conditional in the constructor, and two instances of Point. In memory they will not share the same structure - the first ends up with map M2, the second with M3, as the conditional result differs.
In the Map hidden class structure, the properties at the end are in-object properties, extra properties are hash map properties, and elements are the properties of fast-elements objects.
Another example with types - the compiler can't determine what type arguments is, so when you try to change the value of such a variable, it takes more time.
Last example - mutating arguments. As you know, arguments is a magic variable inside functions. It always needs to stay aligned with the named arguments. This causes problems for the compiler, as it has to keep them in sync.
For comparison, look at a function which mutates a similar (array-like) object. It's more than 15 times faster.
Anyway, if you want to mutate arguments, look at the other options. When we mutate a named argument but read the arguments object, it's still slow; when we use only the named argument, we get proper speed. Deduction: more magic = more problems.
In Chrome DevTools the most important tools are the timeline & profiler (which also shows unoptimized functions). For Node we've got many additional options; some helpful ones: --inspect, --trace-opt & --trace-deopt (trace (de)optimizations), --allow-natives-syntax which allows advanced scripting, --prof for profiling, and --prof-process for reading the resulting log.
That's what the log from node --prof-process looks like, so you can see where to find problems; you can also see external (e.g. C++) modules there.
The Garbage Collector is the component responsible for deleting data from memory when it's no longer needed. V8 uses a tracing garbage collection algorithm. Managing memory yourself can be more efficient, but with a GC you don't have to think about it much. The GC in V8 has a stop-the-world mechanism.
The garbage collector builds a graph of nodes with pointers between them. All objects reachable from a root (e.g. global or window) are live. It's safe to assume that most new data is temporary and should be destroyed early (e.g. counters in loops).
There are 2 generations: young and old. When an object survives in new space for a while, it's moved to old space. Allocation in old space is fast, but collecting there is not. New space (as said before) mostly holds temporary data, so it's cleared faster; old space uses a slower algorithm which marks live objects and sweeps the dead ones. Nowadays, marking can be processed in chunks.
Let me show a simplified example of a collection. We start from the root object and go to all its child nodes.
We checked B and it has no child nodes, but it's reachable - mark it black. A has child nodes.
Now we've checked all nodes. Time to collect.
We remove the pointer from A to D, which was stored in A.property.
As before, we go from the root to the child nodes.
Now we've traversed the whole tree, and no pointers led to D.
So D is swept from memory. This is of course simplified - in reality there are still pointers to the element until we reach A, where the pointer is removed.
New space is pretty simple, but there are other spaces in memory. We've got code space, where code & instructions are stored; spaces like cell, map or property cell space, where information needed for the compiler's job is kept; and old space split into parts - pointer space (objects with pointers), data space (raw data), and large object space (big objects which can't be stored anywhere else).
I will show you some common memory leaks. You could accidentally assign big values to globals (under the root).
Another one: timers which use some external data. You should clear them. The same applies to e.g. listeners or cached DOM elements.
A more complex example - something similar happened in the Meteor framework. We've got a function which replaces old data with new, while using the old data. There is one unused function and someMethod. The problem is that functions in the same scope share the closure: unused -> originalThing (the old value), and someMethod keeps that closure alive the whole time. Because of that, on each interval all the old data (including longStr) is leaked.
And now: how to detect leaks. In Chrome you can just use the Profiler in DevTools; in Node.js you can use the heapdump module to dump the heap. You can also import the dump into DevTools. Compare two or more snapshots and see the difference with the Comparison view.
Less common: you can use the general gcore program (Linux). With this tool you can search through memory more programmatically.