5. The Smalltalk demonstration showed three amazing features. One
was how computers could be networked; the second was how
object-oriented programming worked. But Jobs and his team paid
little attention to these attributes because they were so amazed by
the third feature, ...
19. Garbage Collection
• Reference counting (PHP, Python, ...), smart pointers
• Tracing
• Stop the world
• Copying, Mark-and-sweep, Mark-and-compact
• Generational GC
• Precise vs. conservative
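A minimal sketch of the tracing idea, using mark-and-sweep with a stop-the-world collect(). The Obj and Heap types and the explicit roots list are hypothetical, invented for illustration; a real collector finds its roots on the stack and in globals.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical object model: an object only holds references to other objects.
struct Obj {
    bool marked = false;
    std::vector<Obj*> refs;
};

// Toy heap that owns every allocated object.
struct Heap {
    std::vector<std::unique_ptr<Obj>> objects;
    std::vector<Obj*> roots;  // stand-in for stack slots and globals

    Obj* alloc() {
        objects.push_back(std::make_unique<Obj>());
        return objects.back().get();
    }

    // Mark phase: flag everything reachable from the roots.
    void mark() {
        std::vector<Obj*> worklist(roots.begin(), roots.end());
        while (!worklist.empty()) {
            Obj* o = worklist.back();
            worklist.pop_back();
            assert(o != nullptr);
            if (o->marked) continue;
            o->marked = true;
            for (Obj* child : o->refs) worklist.push_back(child);
        }
    }

    // Sweep phase: free unmarked objects and clear the mark on survivors.
    // Returns the number of objects freed.
    std::size_t sweep() {
        std::size_t freed = 0;
        std::vector<std::unique_ptr<Obj>> live;
        for (auto& o : objects) {
            if (o->marked) {
                o->marked = false;
                live.push_back(std::move(o));
            } else {
                ++freed;
            }
        }
        objects = std::move(live);  // destroys the unmarked objects
        return freed;
    }

    // Stop-the-world collection: nothing else runs between mark and sweep.
    std::size_t collect() {
        mark();
        return sweep();
    }
};
```

Unlike reference counting, this reclaims unreachable cycles: two objects that only point at each other are never marked, so the sweep frees both.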
20. Precise vs. conservative
• Conservative
• If it looks like a pointer, treat it as a pointer
• May leak memory
• Can’t move objects, so memory gets fragmented
• Precise
• Indirect vs. direct references
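A small sketch of the conservative rule "if it looks like a pointer, treat it as a pointer". The function names, the heap bounds, and the alignment parameter are assumptions made for illustration, not taken from any particular collector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Conservative test: a word "looks like a pointer" if it falls inside the
// heap range and sits on an allocation boundary.
bool looks_like_pointer(std::uintptr_t word,
                        std::uintptr_t heap_start,
                        std::uintptr_t heap_end,
                        std::size_t alignment) {
    assert(alignment != 0);
    return word >= heap_start && word < heap_end && word % alignment == 0;
}

// Scan raw words (for example a copy of the stack) and count how many
// would be treated as roots by a conservative collector.
std::size_t count_conservative_roots(const std::uintptr_t* words,
                                     std::size_t count,
                                     std::uintptr_t heap_start,
                                     std::uintptr_t heap_end,
                                     std::size_t alignment) {
    std::size_t roots = 0;
    for (std::size_t i = 0; i < count; ++i) {
        if (looks_like_pointer(words[i], heap_start, heap_end, alignment)) {
            ++roots;
        }
    }
    return roots;
}
```

An integer that happens to equal a heap address passes the same test, which is why a conservative collector may keep dead objects alive and cannot move objects: it cannot safely rewrite a word that might be a plain number.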
35. How does a JIT work?
• mmap/new/malloc (mprotect)
• Generate native code
• C cast / reinterpret_cast
• call the function
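The four steps above can be sketched as follows. This assumes Linux on x86-64 (elsewhere the function just returns -1), and the name run_jit_demo and the hand-assembled bytes are illustrative only; a real JIT emits code from its own compiler back end.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#if defined(__linux__)
#include <sys/mman.h>
#endif

using JitFn = int (*)();

// Returns 42 if the generated code ran, or -1 if the platform is not
// supported by this sketch or the OS refused the mapping.
int run_jit_demo() {
#if defined(__linux__) && defined(__x86_64__)
    // x86-64 machine code for: mov eax, 42 ; ret
    const unsigned char code[] = {0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3};
    const std::size_t size = 4096;
    assert(sizeof(code) <= size);

    // 1. Allocate writable memory.
    void* mem = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;

    // 2. "Generate" native code by copying the bytes in.
    std::memcpy(mem, code, sizeof(code));

    // 3. Make the memory executable (and no longer writable).
    if (mprotect(mem, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, size);
        return -1;
    }

    // 4. Cast the buffer to a function pointer and call it.
    JitFn fn = reinterpret_cast<JitFn>(mem);
    int result = fn();

    munmap(mem, size);
    return result;
#else
    return -1;
#endif
}
```

Writing first and only then switching the page to executable avoids having memory that is writable and executable at the same time.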
36. Trampoline (JSC x86)
// Execute the code!
inline JSValue execute(RegisterFile* registerFile,
                       CallFrame* callFrame,
                       JSGlobalData* globalData)
{
    JSValue result = JSValue::decode(
        ctiTrampoline(
            m_ref.m_code.executableAddress(),
            registerFile,
            callFrame,
            0,
            Profiler::enabledProfilerReference(),
            globalData));
    return globalData->exception ? jsNull() : result;
}

asm (
".text\n"
".globl " SYMBOL_STRING(ctiTrampoline) "\n"
HIDE_SYMBOL(ctiTrampoline) "\n"
SYMBOL_STRING(ctiTrampoline) ":" "\n"
    "pushl %ebp" "\n"
    "movl %esp, %ebp" "\n"
    "pushl %esi" "\n"
    "pushl %edi" "\n"
    "pushl %ebx" "\n"
    "subl $0x3c, %esp" "\n"
    "movl $512, %esi" "\n"
    "movl 0x58(%esp), %edi" "\n"
    "call *0x50(%esp)" "\n"
    "addl $0x3c, %esp" "\n"
    "popl %ebx" "\n"
    "popl %edi" "\n"
    "popl %esi" "\n"
    "popl %ebp" "\n"
    "ret" "\n"
);
49. One secret in V8 hidden classes
20x slower!
http://jsperf.com/test-v8-delete
50. ... in Figure 5, reads are far more common than writes: over all traces the proportion of reads to writes is 6 to 1. Deletes comprise only .1% of all events. That graph further breaks reads, writes and deletes into various specific types; prop refers to accesses ...
But properties are rarely deleted
Only 0.1% delete
[Figure 5. Instruction mix. The per-site proportion of read, write, delete, call instructions (averaged over multiple traces). Series: Read, Write and Delet for prop, hash and indx, plus Define, Create, Call, Throw, Catch.]
Source: An Analysis of the Dynamic Behavior of JavaScript Programs
56. Tagged pointer
typedef union {
    void *p;
    double d;
    long l;
} Value;
typedef struct {
    unsigned char type;
    Value value;
} Object;
Object a;
sizeof(a)??
If everything is an object, the overhead is too much for small integers
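A short sketch that answers the sizeof question for whatever platform it is compiled on. The helper object_overhead is a hypothetical name; the exact numbers depend on the ABI, so only the inequality is asserted.

```cpp
#include <cassert>
#include <cstddef>

// Same layout as on the slide, written as C++.
union Value {
    void* p;
    double d;
    long l;
};

struct Object {
    unsigned char type;  // 1 byte of tag ...
    Value value;         // ... followed by padding up to Value's alignment
};

// The one-byte tag always costs more than one byte once padding is counted.
static_assert(sizeof(Object) > sizeof(Value),
              "the type tag makes Object larger than Value");

// Bytes spent on the tag plus padding for every boxed value.
std::size_t object_overhead() {
    return sizeof(Object) - sizeof(Value);
}
```

On a typical 64-bit ABI, Value is 8 bytes and Object is 16, so a boxed small integer spends as many bytes on the tag and its padding as on the payload.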
57. Tagged pointer
On almost all systems, pointer addresses are aligned (to 4 or 8 bytes), so the low 2 or 3 bits are always 0 and can hold a tag
“The address of a block returned by malloc or realloc in the GNU system is
always a multiple of eight (or sixteen on 64-bit systems). ”
http://www.gnu.org/s/libc/manual/html_node/Aligned-Memory-Blocks.html
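A minimal sketch of low-bit tagging built on that alignment guarantee. The scheme here (low bit 1 means small integer, low bit 0 means pointer) and all names are assumptions for illustration; real engines pick their own tag assignments.

```cpp
#include <cassert>
#include <cstdint>

// One machine word holds either an aligned pointer or a small integer.
using TaggedValue = std::uintptr_t;

constexpr std::uintptr_t kIntTag = 1;  // hypothetical: low bit set = integer

// Store the integer in the upper bits and set the tag bit.
inline TaggedValue tag_int(std::intptr_t n) {
    return (static_cast<std::uintptr_t>(n) << 1) | kIntTag;
}

inline bool is_int(TaggedValue v) {
    return (v & kIntTag) != 0;
}

// Arithmetic right shift restores the sign (guaranteed since C++20,
// and what mainstream compilers did before that).
inline std::intptr_t untag_int(TaggedValue v) {
    assert(is_int(v));
    return static_cast<std::intptr_t>(v) >> 1;
}

// Aligned pointers already have a zero low bit, so they are stored as-is.
inline TaggedValue tag_ptr(void* p) {
    const std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    assert((bits & kIntTag) == 0);
    return bits;
}

inline void* untag_ptr(TaggedValue v) {
    assert(!is_int(v));
    return reinterpret_cast<void*>(v);
}
```

Small integers now live directly in the word with no heap allocation, at the cost of one bit of range.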
60. NaN-tagging (JSC 64 bit)
On a 64-bit system, pointers only use 48 bits, which means the top 16 bits are 0
* The top 16-bits denote the type of the encoded JSValue:
*
* Pointer { 0000:PPPP:PPPP:PPPP
* / 0001:****:****:****
* Double { ...
* FFFE:****:****:****
* Integer { FFFF:0000:IIII:IIII
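A sketch of an encoding that matches the layout in the comment above: pointers keep a zero top 16 bits, integers carry the FFFF tag, and doubles are shifted into the 0001 to FFFE range by adding 2^48. The function names are hypothetical, and the sketch assumes NaNs are canonical and that user-space pointers fit in 48 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

using Encoded = std::uint64_t;

constexpr Encoded kTagMask      = 0xFFFF000000000000ULL;  // top 16 bits
constexpr Encoded kIntTag       = 0xFFFF000000000000ULL;  // FFFF:0000:IIII:IIII
constexpr Encoded kDoubleOffset = 0x0001000000000000ULL;  // 2^48

// Integer: tag in the top 16 bits, 32-bit payload in the low bits.
inline Encoded encode_int(std::int32_t i) {
    return kIntTag | static_cast<std::uint32_t>(i);
}
inline bool is_int(Encoded v) { return (v & kTagMask) == kIntTag; }
inline std::int32_t decode_int(Encoded v) {
    assert(is_int(v));
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(v));
}

// Double: add 2^48 to the raw bits so the top 16 bits land in 0001..FFFE.
// Assumes the input is not a NaN with a top-16-bit pattern of FFFE or FFFF.
inline Encoded encode_double(double d) {
    Encoded bits;
    std::memcpy(&bits, &d, sizeof bits);
    return bits + kDoubleOffset;
}
inline bool is_double(Encoded v) {
    return !is_int(v) && (v & kTagMask) != 0;
}
inline double decode_double(Encoded v) {
    assert(is_double(v));
    const Encoded bits = v - kDoubleOffset;
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

// Pointer: stored unchanged, relying on the top 16 bits being zero.
inline Encoded encode_ptr(void* p) {
    const Encoded bits = static_cast<Encoded>(reinterpret_cast<std::uintptr_t>(p));
    assert((bits & kTagMask) == 0);
    return bits;
}
inline bool is_ptr(Encoded v) { return (v & kTagMask) == 0; }
```

With this layout a pointer can be used without any decoding, which is the common case when everything is an object.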
73. Built-in objects written in JS
function ArraySort(comparefn) {
  if (IS_NULL_OR_UNDEFINED(this) && !IS_UNDETECTABLE(this)) {
    throw MakeTypeError("called_on_null_or_undefined",
                        ["Array.prototype.sort"]);
  }
  // In-place QuickSort algorithm.
  // For short (length <= 22) arrays, insertion sort is used for efficiency.
  if (!IS_SPEC_FUNCTION(comparefn)) {
    comparefn = function (x, y) {
      if (x === y) return 0;
      if (%_IsSmi(x) && %_IsSmi(y)) {
        return %SmiLexicographicCompare(x, y);
      }
      x = ToString(x);
      y = ToString(y);
      if (x == y) return 0;
      else return x < y ? -1 : 1;
    };
  }
  ...
v8/src/array.js
77. Dart
• Clear syntax, Optional types, Libraries
• Performance
• Can compile to JavaScript
• But IE, WebKit and Mozilla rejected it
• What do you think?
• My thought: Will XML replace HTML? No, but thanks, Google, for
pushing the web forward
80. Expose Function
v8::Handle<v8::Value> Print(const v8::Arguments& args) {
  for (int i = 0; i < args.Length(); i++) {
    v8::HandleScope handle_scope;
    v8::String::Utf8Value str(args[i]);
    const char* cstr = ToCString(str);
    printf("%s", cstr);
  }
  return v8::Undefined();
}

v8::Handle<v8::ObjectTemplate> global = v8::ObjectTemplate::New();
global->Set(v8::String::New("print"), v8::FunctionTemplate::New(Print));
82. Node.JS
• Pros
• Async
• One language for everything
• Faster than PHP, Python
• Community
• Cons
• Lack of great libraries
• ES5 code hard to maintain
• Still too young
86. “Apple has decided to make Internet Explorer its default browser
on Macintosh.”
“Since we believe in choice, we’re going to be shipping other Internet
browsers...”
Steve Jobs
107. Rhino + invokedynamic
• Pros
• Easier to implement
• Lots of great Java Libraries
• JVM optimization for free
• Cons
• Only in JVM7
• Not fully optimized yet
• Hard to beat V8
118. [Figure 4: LLVM system architecture diagram. Labeled components: Compiler FE 1 ... Compiler FE N, Linker (IPO/IPA), Native CodeGen, JIT, Runtime Optimizer, Offline Reoptimizer. Labeled data: .o files, Libraries, LLVM, exe & LLVM, CPU, Profile Info, Profile & Trace Info.]
code in non-conforming languages is executed as “unmanaged code”. Such code is represented in native form and not in the CLI intermediate representation, so it is not exposed to CLI optimizations. These systems do not provide #2 with #1 or #3 because runtime optimization is generally only possible when using JIT code generation. They do not aim to provide ...
External static LLVM compilers (referred to as front-ends) translate source-language programs into the LLVM virtual instruction set. Each static compiler can perform three [...] tasks, of which the first and third are optional: (1) Perform language-specific optimizations, e.g., optimizing closure [...] languages with higher-order functions. (2) Translate [...]
123. Performance? good enough!
benchmark            SM      V8      gcc     ratio
fannkuch (10)        1.158   0.931   0.231   4.04
fasta (2100000)      1.115   1.128   0.452   2.47
primes               1.443   3.194   0.438   3.29
raytrace (7,256)     1.930   2.944   0.228   8.46
dlmalloc (400,400)   5.050   1.880   0.315   5.97
The first column is the name of the benchmark, and in parentheses any parameters used in running it. ...
128. All problems in computer science can be solved
by another level of indirection
David Wheeler
131. References
• The behavior of efficient virtual machine interpreters on modern architectures
• Virtual Machine Showdown: Stack Versus Registers
• The implementation of Lua 5.0
• Why Is the New Google V8 Engine so Fast?
• Context Threading: A Flexible and Efficient Dispatch Technique for Virtual Machine Interpreters
• Effective Inline-Threaded Interpretation of Java Bytecode Using Preparation Sequences
• Smalltalk-80: the language and its implementation
132. References
• Design of the Java HotSpot™ Client Compiler for Java 6
• Oracle JRockit: The Definitive Guide
• Virtual Machines: Versatile platforms for systems and processes
• Fast and Precise Hybrid Type Inference for JavaScript
• LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
• Emscripten: An LLVM-to-JavaScript Compiler
• An Analysis of the Dynamic Behavior of JavaScript Programs
133. References
• Adaptive Optimization for SELF
• Bytecodes meet Combinators: invokedynamic on the JVM
• Context Threading: A Flexible and Efficient Dispatch Technique for Virtual Machine Interpreters
• Efficient Implementation of the Smalltalk-80 System
• Design, Implementation, and Evaluation of Optimizations in a Just-In-Time Compiler
• Optimizing direct threaded code by selective inlining
• Linear scan register allocation
• Optimizing Invokedynamic
134. References
• Representing Type Information in Dynamically Typed Languages
• The Behavior of Efficient Virtual Machine Interpreters on Modern Architectures
• Trace-based Just-in-Time Type Specialization for Dynamic Languages
• The Structure and Performance of Efficient Interpreters
• Know Your Engines: How to Make Your JavaScript Fast
• IE Blog, Chromium Blog, WebKit Blog, Opera Blog, Mozilla Blog, Wingolog’s Blog, RednaxelaFX’s Blog, David Mandelin’s Blog...