JVM Internals
                         Douglas Q. Hawkins
                         http://www.slideshare.net/dougqh
                         http://www.dougqh.net
                         dougqh@gmail.com




Monday, January 23, 12
Topics
                     Java Byte Code
                         File Format
                         Byte Code Examples
                         How Java 5 & 7 Features Are Implemented
                     JVM Optimizations




Why?

Besides techie edification, why is this useful?
A better understanding of the internals can help in deciphering some of the harder problems, but better still...
You’ll know that the compiler and the JVM are doing a lot for you, letting you focus on writing readable code.
File Format


Class File Format
                CA           FE      BA            BE         Minor Version Major Version


                                                Constant Pool


                         Flags        This Class     Super Class
                                              Interfaces

                                                       Fields

                                                    Methods

                                                    Attributes


Every class file starts with the 4-byte magic number: CAFEBABE
Followed by the minor and major version - the major version indicates Java 5, 6, 7, etc.
Then a constant pool - which contains...
  constants: int, long, String, etc.
  references: method and field
  descriptors: method and field
Followed by flags: modifiers for this class/interface
Followed by a reference to this class/interface - an index into the constant pool
Followed by the super class - which is also an index into the constant pool
Followed by a list of interface references - which are indices into the constant pool
Followed by fields
Followed by methods
And, finally, attributes, which are extra meta-information about the class...
 - the name of the original source file
 - annotation information
 - information on inner (nested) classes

Class File Spec: http://java.sun.com/docs/books/jvms/second_edition/ClassFileFormat-Java5.pdf
History of CAFEBABE: http://en.wikipedia.org/wiki/Java_class_file
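
The start of the layout described above can be checked with a few lines of Java. This is only a minimal sketch: the class name ClassFileHeader and its fields are made up for illustration, and it reads just the magic number, the two version fields, and the constant pool count, not the rest of the file.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper: reads only the fixed-size start of a class file. */
final class ClassFileHeader {
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    private ClassFileHeader(int minorVersion, int majorVersion, int constantPoolCount) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    static ClassFileHeader parse(byte[] classBytes) {
        if (classBytes.length < 10) {
            throw new IllegalArgumentException("Too short to be a class file");
        }
        // ByteBuffer is big-endian by default, which matches the class file format.
        ByteBuffer in = ByteBuffer.wrap(classBytes);
        int magic = in.getInt();                    // 4 bytes: CA FE BA BE
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("Bad magic number");
        }
        int minor = in.getShort() & 0xFFFF;         // minor version comes first
        int major = in.getShort() & 0xFFFF;         // 49 = Java 5, 50 = Java 6, 51 = Java 7
        int cpCount = in.getShort() & 0xFFFF;       // number of constant pool entries plus one
        return new ClassFileHeader(minor, major, cpCount);
    }
}
```

To try it on a real class, pass in the bytes of any .class file, for example from Files.readAllBytes.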
Class File Format
                         Flags        This Class     Super Class

                         Flags shown on the slide: public, protected, private, static, final,
                         abstract, strictfp, interface, annotation, enum

Field Format
                     Flags              Name        Descriptor
                                            Attributes

                     Flags shown on the slide: public, protected, private, static, final,
                     volatile, transient

Fields consist of...
flags
followed by name - actually an index to a UTF-8 string entry in the constant pool
followed by descriptor - i.e. the field type - also an index into the constant pool
 - the type is the raw (erased) type
followed by attributes
- constant value
- specific (generic) type information - List< String >, etc.
Field Format (example values)
                     Name:        “name”
                     Descriptor:  “Ljava/lang/String;”
                     Attributes:  ConstantValue
Method Format
                     Flags              Name        Descriptor
                                            Attributes

                     Flags shown on the slide: public, protected, private, static, final,
                     synchronized, native, strictfp, varargs

Methods consist of...
flags
followed by name - actually an index to a UTF-8 string entry in the constant pool
followed by descriptor - i.e. the raw parameter types and return type
followed by attributes
- exceptions & code
- specific (generic) type information - List< String >, etc.
- specific exception information
- debugging information
Method Format (example values)
                     Name:        “main”
                     Descriptor:  “([Ljava/lang/String;)V”
                     Attributes:  Exceptions, Code
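Constant Pool
            Entry   Tag    Contents
              1     C      2
              2     UTF    10   “HelloWorld”
              3     C      4
              4     UTF    16   “java/lang/Object”
              5     UTF    6    “<init>”
              6     UTF    3    “()V”
              7     UTF    4    “Code”
              8     M      3    9
              9     N&T    5    6
             10     UTF    4    “main”
             11     UTF    22   “([Ljava/lang/String;)V”
             12     F      13   15
             13     C      14
             14     UTF    16   “java/lang/System”

            C = class, UTF = UTF-8 string (length, then text), M = method reference,
            N&T = name and type, F = field reference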

Dissect the “Hello World” example a little...
Entry 1 is a class entry - a 2-byte index to a UTF entry that contains the name
Entry 2 is the name of the class
Similarly...
Entry 3 is a class entry for the parent class - it refers to Entry 4, which is the full name of the parent class
Skip over the constructor “<init>” and focus on main
Entry 10 is the name “main” & Entry 11 is the raw type descriptor for “main”
The [Ljava/lang/String; indicates String[] - the V indicates that the method returns void
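
One way to get a feel for descriptor strings is to let the JDK build them. The sketch below uses java.lang.invoke.MethodType (available since Java 7); the class name DescriptorDemo and the second example method are assumptions made for illustration.

```java
import java.lang.invoke.MethodType;

/** Hypothetical demo class: prints descriptors the way they appear in the constant pool. */
final class DescriptorDemo {
    /** Descriptor for: void main(String[] args) */
    static String mainDescriptor() {
        return MethodType.methodType(void.class, String[].class)
                .toMethodDescriptorString();
    }

    /** Descriptor for an assumed method: int volume(int, int, int) */
    static String volumeDescriptor() {
        return MethodType.methodType(int.class, int.class, int.class, int.class)
                .toMethodDescriptorString();
    }

    public static void main(String... args) {
        System.out.println(mainDescriptor());   // ([Ljava/lang/String;)V
        System.out.println(volumeDescriptor()); // (III)I
    }
}
```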
Browsing Class File Format




           JClassLib Viewer http://www.ej-technologies.com/products/jclasslib/overview.html
ConstantValue
      public final class HelloWorld {
          public static final String MESSAGE = "Hello, World!";

          public static final void main( final String... args ) {
              System.out.println( MESSAGE );
          }
      }

Here, we can see that because the “MESSAGE” field is “static final” and initialized with a compile-time constant,
its value is stored in a “ConstantValue” attribute on the “MESSAGE” field.
Exceptions
     public interface InputStreamProvider {
     	   public abstract InputStream open() throws IOException;
     }




Exception information is also stored in an attribute.
As it turns out, the JVM makes no distinction between checked and unchecked exceptions, which has an interesting
implication...
Exceptions
public final class NewInstance {
  public static void main(String... args) {
    try {
      Class.
        forName("net.dougqh.runtime.SomeClass").
        newInstance();
    } catch (
      InstantiationException |
      IllegalAccessException |
      ClassNotFoundException e)
    {
      e.printStackTrace();
    }
  }
}

public class SomeClass {
  // Nested checked exception, inferred from SomeClass$SomeException in the stack trace below
  public static class SomeException extends Exception {}

  public SomeClass() throws SomeException {
    throw new SomeException();
  }
}

Exception in thread "main" net.dougqh.runtime.SomeClass$SomeException
    at net.dougqh.runtime.SomeClass.<init>
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0
    at sun.reflect.NativeConstructorAccessorImpl.newInstance
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance
    at java.lang.reflect.Constructor.newInstance
    at java.lang.Class.newInstance0
    at java.lang.Class.newInstance
    at net.dougqh.runtime.NewInstance.main


www.javapuzzlers.com
Because of an oversight in the original reflection API, Class.newInstance can throw a checked exception that is
neither declared nor reported by the compiler.
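
The difference is easiest to see next to Constructor.newInstance, which wraps whatever the constructor throws. This is a sketch under assumptions: SneakyNewInstance and its nested classes are hypothetical stand-ins for the classes on the slide, and Class.newInstance is deprecated on newer JDKs, so the warning is suppressed.

```java
import java.lang.reflect.InvocationTargetException;

/** Hypothetical demo: compares the two reflective ways of calling a throwing constructor. */
final class SneakyNewInstance {
    /** Checked exception, standing in for SomeException. */
    static final class SomeException extends Exception {}

    /** Class whose no-arg constructor always throws the checked exception. */
    static final class SomeClass {
        public SomeClass() throws SomeException {
            throw new SomeException();
        }
    }

    /** Class.newInstance lets the checked exception escape undeclared. */
    @SuppressWarnings("deprecation")
    static String viaClassNewInstance() {
        try {
            SomeClass.class.newInstance();
            return "none";
        } catch (InstantiationException | IllegalAccessException e) {
            return "reflective: " + e.getClass().getSimpleName();
        } catch (Exception e) {
            // Only a catch-all sees it; the compiler never asked us to handle SomeException.
            return e.getClass().getSimpleName();
        }
    }

    /** Constructor.newInstance wraps it in a declared InvocationTargetException. */
    static String viaConstructor() {
        try {
            SomeClass.class.getDeclaredConstructor().newInstance();
            return "none";
        } catch (InvocationTargetException e) {
            return "wrapped: " + e.getCause().getClass().getSimpleName();
        } catch (ReflectiveOperationException e) {
            return "reflective: " + e.getClass().getSimpleName();
        }
    }
}
```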
Generics
     public final class Generics {
       public static final List<String> getStrings() {
         return Collections.singletonList("foo");
       }
     }




Here, we can see that getStrings(), which returns List<String>, has a descriptor that uses the raw type List.
However, the exact type information is stored in the “Signature” attribute.
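
Reflection exposes both views of the same method, which makes the split between descriptor and Signature attribute visible at runtime. A small sketch, assuming a hypothetical GenericsDemo class that mirrors the slide:

```java
import java.lang.reflect.Method;
import java.util.Collections;
import java.util.List;

/** Hypothetical demo: raw return type versus generic return type of one method. */
final class GenericsDemo {
    public static List<String> getStrings() {
        return Collections.singletonList("foo");
    }

    private static Method method() {
        try {
            return GenericsDemo.class.getDeclaredMethod("getStrings");
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    /** The erased type, as recorded in the method descriptor. */
    static String rawReturnType() {
        return method().getReturnType().getName();
    }

    /** The full generic type, recovered from the Signature attribute. */
    static String genericReturnType() {
        return method().getGenericReturnType().getTypeName();
    }
}
```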
Annotations
   @Inherited
   @Retention( RetentionPolicy.RUNTIME )
   public @interface Annotation {
   	   public int foo() default 20;
   	
   	   public String bar();
   }




     @Annotation( bar="quux" )
     class Annotated {}




An annotation is just an interface
The default value for each method is stored in an AnnotationDefault attribute on that method
The annotation information on a class or method is also stored in an attribute
In this case, since the annotation has a RUNTIME RetentionPolicy, it is stored in the RuntimeVisibleAnnotations
attribute
The values supplied where the annotation is used are stored as element-value pairs inside that attribute
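
Because the retention policy is RUNTIME, the stored values can be read back through reflection. The sketch below renames the annotation to Sample (a hypothetical name, chosen to avoid clashing with java.lang.annotation.Annotation) and nests everything in one class so it is self-contained.

```java
import java.lang.annotation.Inherited;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

/** Hypothetical demo: reads annotation values back at runtime. */
final class AnnotationDemo {
    @Inherited
    @Retention(RetentionPolicy.RUNTIME)
    @interface Sample {
        int foo() default 20;   // default value, stored with the annotation method

        String bar();           // no default, must be supplied at the use site
    }

    @Sample(bar = "quux")
    static class Annotated {}

    static String describe() {
        Sample s = Annotated.class.getAnnotation(Sample.class);
        return "foo=" + s.foo()
                + ", bar=" + s.bar()
                + ", isInterface=" + Sample.class.isInterface();
    }
}
```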
Byte Code


Stack Based Virtual Machine
            0 iconst_1
            1 iconst_2
            2 iadd
            3 istore_0
            4 iload_0

            Local variable slots: 0   1   2   3


The JVM byte code format is stack-based like many other VMs: CLR, PHP, and Python
In the diagram, the green area is the heap, the bar above it shows the local variable slots, and the column on the left is the stack

Let’s look at how to add 1 + 2 together and store the result into a local variable
First, we use an iconst_1 instruction to load the constant 1 onto the stack
Java has special instructions for common numbers: -1 to 5.
Next, an iconst_2 to place 2 on the stack
Next, we use iadd, which pops the 1 & 2 off the stack, adds them together, and pushes the result back onto the stack
Next, we use an istore_0 to pop the result and store it into the first local variable slot
To load the value back from the local variable slots, we use an iload_0
Note: Similar to iconst, there are special istore/iload instructions for the most often used slots: 0-3
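
The five instructions above can be replayed with a toy interpreter. This is only an illustrative sketch: TinyStackMachine is a made-up class that understands exactly these five mnemonics as strings, and it is in no way how a real JVM is implemented.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical toy interpreter for the five instructions on the slide. */
final class TinyStackMachine {
    final Deque<Integer> stack = new ArrayDeque<>();   // operand stack
    final int[] locals = new int[4];                   // local variable slots 0-3

    void execute(String... instructions) {
        for (String insn : instructions) {
            switch (insn) {
                case "iconst_1": stack.push(1); break;
                case "iconst_2": stack.push(2); break;
                case "iadd":     stack.push(stack.pop() + stack.pop()); break;
                case "istore_0": locals[0] = stack.pop(); break;
                case "iload_0":  stack.push(locals[0]); break;
                default:
                    throw new IllegalArgumentException("Unsupported instruction: " + insn);
            }
        }
    }

    public static void main(String... args) {
        TinyStackMachine vm = new TinyStackMachine();
        vm.execute("iconst_1", "iconst_2", "iadd", "istore_0", "iload_0");
        System.out.println("slot 0 = " + vm.locals[0] + ", top of stack = " + vm.stack.peek());
    }
}
```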
Stack Based Virtual Machine (step by step)
            Step                  Stack (top first)     Local variable slots
            after 0 iconst_1      1                     (empty)
            after 1 iconst_2      2, 1                  (empty)
            during 2 iadd         1+2                   (empty)
            after 2 iadd          3                     (empty)
            after 3 istore_0      (empty)               slot 0 = 3
            after 4 iload_0       3                     slot 0 = 3
Parameters and Local Variables
static int volume(
  int width,
  int depth,
  int height )
{
  int area = width * depth;
  int volume = area * height;
  return volume;
}

 0 iload_0
 1 iload_1
 2 imul
 3 istore_3
 4 iload_3
 5 iload_2
 6 imul
 7 istore         4
 9 iload          4
11 ireturn

Local variable slots: 0 width, 1 depth, 2 height, 3 area, 4 volume


Trace through a slightly more complicated example: calculating volume
- arguments are passed into the lowest local variable slots - 0 to 2 in this case
- first, to calculate area, load width and depth from slots 0 & 1 respectively
- multiply the values on the stack, then store the result into slot 3: area
- reload area & height - slots 3 & 2 respectively
- multiply the values and store the result into slot 4: volume
- reload volume and return
Yes, the value is stored and then immediately reloaded in the byte code. Starting with Java 1.3, byte code is
essentially not optimized by javac; all optimizations are left to the JVM to perform.
Static vs Virtual Methods
int volume(
  int width,
  int depth,
  int height )
{
  int area = width * depth;
  int volume = area * height;
  return volume;
}

 0 iload_1
 1 iload_2
 2 imul
 3 istore         4
 5 iload          4
 7 iload_3
 8 imul
 9 istore         5
11 iload          5
13 ireturn

Local variable slots: 0 this, 1 width, 2 depth, 3 height, 4 area, 5 volume


In the prior example, you may have noticed that the method was static.
If the method isn’t static, then “this” is invisibly passed in the first slot (slot 0).
So, our arguments start at slot 1, and the loads and stores all change accordingly.
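
The slot numbering rule can be written down in a few lines. This sketch makes two assumptions: SlotLayout is a hypothetical helper, and every variable occupies one slot, which holds for int and references (long and double take two slots each and are not handled here).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: assigns local variable slots in declaration order. */
final class SlotLayout {
    static Map<String, Integer> assign(boolean isStatic, String... names) {
        Map<String, Integer> slots = new LinkedHashMap<>();
        int next = 0;
        if (!isStatic) {
            slots.put("this", next++);      // instance methods receive "this" in slot 0
        }
        for (String name : names) {
            slots.put(name, next++);        // parameters first, then locals
        }
        return slots;
    }

    public static void main(String... args) {
        System.out.println(assign(true, "width", "depth", "height", "area", "volume"));
        System.out.println(assign(false, "width", "depth", "height", "area", "volume"));
    }
}
```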
Hello World
        System.out.println( “Hello World” );

         0 getstatic           System.out
         3 ldc                 “Hello World”
         5 invokevirtual       PrintStream.println
         8 return

         Stack before invokevirtual (top first): “Hello World”, System.out
         Local variable slots: 0   1   2   3


Now, we know enough to understand “Hello World”

The first operation is a getstatic to load the value of System.out onto the stack
We need this reference to invoke println
Second, load the string “Hello World” onto the stack - the ldc indicates a load from the constant pool
Now, since this is a non-static method on a class, use invokevirtual to invoke PrintStream.println
This consumes the reference to System.out (which is the “this” for PrintStream.println) and the reference to “Hello World”
These values are then mapped to local slots for “this” and “msg” in the new stack frame
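
The same shape, a receiver followed by the arguments, shows up when println is called through a method handle. This is a sketch only: InvokeVirtualDemo is a hypothetical class, and it prints into an in-memory stream instead of System.out so that the result can be inspected.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

/** Hypothetical demo: a virtual call takes the receiver first, then the arguments. */
final class InvokeVirtualDemo {
    static String printlnTo(String msg) {
        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(captured, true);
        try {
            // Look up the virtual method PrintStream.println(String), descriptor (Ljava/lang/String;)V
            MethodHandle println = MethodHandles.lookup().findVirtual(
                    PrintStream.class, "println",
                    MethodType.methodType(void.class, String.class));
            // Like invokevirtual: the receiver ("this") comes first, then the argument.
            println.invoke(out, msg);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
        out.flush();
        return captured.toString();
    }
}
```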
Hello World




                                                                                        g
                                                                              s
                                                                                      ms
                                                                           thi
        System.out.println( “Hello World” );
                                                                       0          1         2        3
         0 getstatic           System.out

         3 ldc                 “Hello World”

         5 invokevirtual PrintStream.println
                                                                                  “Hello World”
         8 return



                                                                                      System.out




Monday, January 23, 12

Now, we know enough to understand “Hello World”

The first operation is a getstatic to load the value of System.out onto the stack
We need this reference to invoke println
Second, load the string “Hello World” onto the stack - the ldc indicates a load from the constant pool
Now, since this is non-static method on a class, use invokevirtual to invoke PrintStream.println
This consumes the pointer to System.out (which is the this for PrintStream.println) and the reference to “Hello
World”
These values are then mapped to local slots for “this” and “msg” in the new stack frame
Types of Method Invocations
             invokestatic - invoke static methods
             invokevirtual - invoke instance method from class
             invokeinterface - invoke instance method from interface
             invokespecial - invoke <init> / invoke super method
             invokedynamic - optimized dynamic look-up (in Java 7)





We’ve seen a call to invokevirtual, which is used for instance methods on classes, but there are other invocation types, too.
invokestatic - for static methods
invokeinterface - for methods invoked through an interface reference (rather than a class reference)
invokespecial - for direct targets - like constructors or invoking a super method where the call is not polymorphic
invokedynamic - used by scripting languages like JRuby in Java 7 for improved performance
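
To see the five invocation types side by side, here is a small sketch that should produce one of each when compiled and inspected with javap -c. The class name InvokeKinds is made up for illustration, and the invokedynamic line assumes Java 8 or later, where javac compiles lambdas to invokedynamic (this deck predates that).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical class, used only to illustrate the invocation types.
class InvokeKinds {
    static String describe() {
        // invokespecial: "new" is followed by a direct call to ArrayList.<init>
        ArrayList<String> list = new ArrayList<String>();

        // invokevirtual: the receiver's static type is the class ArrayList
        list.add("virtual");

        // invokeinterface: same object, but the static type is the interface List
        List<String> viaInterface = list;
        viaInterface.add("interface");

        // invokestatic: no receiver at all
        String joined = String.join(",", list);

        // invokedynamic: assumption - Java 8+ javac compiles this lambda with invokedynamic
        Supplier<String> supplier = () -> joined;
        return supplier.get();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Compile it and run javap -c InvokeKinds to check which instruction your own compiler picks for each call.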
New Object
     BigDecimal num =
       new BigDecimal("2.0");

         0 new                 BigDecimal
         3 dup
         4 ldc                 "2.0"
         6 invokespecial       BigDecimal.<init>
         9 astore_0

     [Diagram: local slots 0-3, with slot 0 labeled "num"; operand stack showing "2.0" above the BigDecimal reference.]


Now, let’s look at an object allocation.
The first step is to allocate an object with “new”; however, this step does not yet invoke the constructor.
It just allocates space on the heap for the object and returns a reference to the still-uninitialized object.
Unfortunately, since invoking the constructor will consume a reference to the newly allocated BigDecimal, we need to make a copy (“dup”) so that we’ll have a reference left to store into “num”.
Next, we push “2.0” onto the stack.
Then we invoke BigDecimal.<init>, which is the BigDecimal constructor.
It consumes the reference to “2.0” and the duplicate reference, leaving us with one reference to assign into “num”.

As you can see, construction is rather complicated. Some of the past security holes in the byte code verifier involved object construction because the sequence is non-trivial.
The CLR learned from this and has a single instruction (“newobj”) that both allocates and invokes the constructor, thus making byte code verification easier.

From this example, you can also see why double-checked locking is broken in Java. Construction isn’t a single step, so with reordering it is possible for a reference to an uninitialized object to be assigned to a field.
Since Java 5, the use of volatile guarantees a “happens-before”, so the field will never be assigned before the constructor is done being invoked.
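
As a minimal sketch of the volatile fix mentioned above, here is double-checked locking in the form that works under the Java 5+ memory model. The class LazyRate and its field are hypothetical names chosen for illustration; only the pattern matters.

```java
import java.math.BigDecimal;

// Hypothetical holder class, for illustration only.
class LazyRate {
    // Without volatile, another thread could see a non-null reference
    // to an object whose constructor has not finished.
    private static volatile BigDecimal rate;

    static BigDecimal get() {
        BigDecimal local = rate;                   // first check, no lock
        if (local == null) {
            synchronized (LazyRate.class) {
                local = rate;                      // second check, under the lock
                if (local == null) {
                    local = new BigDecimal("2.0"); // new, dup, ldc, invokespecial
                    rate = local;                  // volatile write publishes the finished object
                }
            }
        }
        return local;
    }
}
```

The local variable is only there to avoid reading the volatile field more often than needed; the volatile modifier on the field is the part that makes the pattern safe.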
Demo
            javap -c


Conditionals
   Original                                               Byte Code
   if ( x > 0 ) {                                           0:   iload_0
     return true;                                           1:   ifle 6
   } else {                                                 4:   iconst_1
     return false;                                          5:   ireturn
   }                                                        6:   iconst_0
                                                            7:   ireturn

   return x > 0 ? true : false;                             0:   iload_0
                                                            1:   ifle 8
                                                            4:   iconst_1
                                                            5:   goto 9
                                                            8:   iconst_0
                                                            9:   ireturn

   return ( x > 0 );                                        0:   iload_0
                                                            1:   ifle 8
                                                            4:   iconst_1
                                                            5:   goto 9
                                                            8:   iconst_0
                                                            9:   ireturn


Three ways to write a method that checks if a number is greater than 0.
The byte code is almost the same in all 3 cases.
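
If you want to try this yourself, the three variants can be dropped into one class and compared with javap -c. The class name Positive is made up for this sketch; the method bodies are the three forms from the slide.

```java
// Hypothetical class wrapping the three forms from the slide.
class Positive {
    static boolean ifElse(int x) {
        if (x > 0) {
            return true;
        } else {
            return false;
        }
    }

    static boolean ternary(int x) {
        return x > 0 ? true : false;
    }

    static boolean direct(int x) {
        return (x > 0);
    }

    public static void main(String[] args) {
        // All three columns agree for every input
        for (int x : new int[] {-1, 0, 1}) {
            System.out.println(x + ": " + ifElse(x) + " " + ternary(x) + " " + direct(x));
        }
    }
}
```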
Invoke Static
    Original                                                Decompiled
     Math.max(10, 20);                                       0:   bipush 10
                                                             2:   bipush 20
                                                             4:   invokestatic Math.max
                                                             7:   pop
                                                             8:   return





Here, we see an extra pop after the invokestatic call.
That’s because the return value of max is left on the stack. Since we don’t use it, the compiler generates a pop to discard it.
If we store the value in a variable, the pop will be replaced with an istore.
Invocations
    Original                                                 Decompiled
    FileInputStream in =                                      0:   new FileInputStream
      new FileInputStream("foo");                             3:   dup
    in.close();                                               4:   ldc "foo"
                                                              6:   invokespecial FileInputStream.<init>
                                                              9:   astore_0
                                                             10:   aload_0
                                                             11:   invokevirtual FileInputStream.close
                                                             14:   return



   Closeable in = new FileInputStream("foo");                 0:   new FileInputStream
   in.close();                                                3:   dup
                                                              4:   ldc "foo"
                                                              6:   invokespecial FileInputStream.<init>
                                                              9:   astore_0
                                                             10:   aload_0
                                                             11:   invokeinterface Closeable.close
                                                             16:   return





In the first example, close is called on the class type FileInputStream; in the second, it is called on the interface type Closeable.
In the first case, the compiler generates an invokevirtual call.
In the second case, the compiler generates an invokeinterface call.
For Loop
   static int sum( int min, int max ){
     int sum = 0;
     for ( int i=min; i<max; ++i ){
       sum += i;
     }
     return sum;
   }

   before                  0 iconst_0
                           1 istore_2
   init & test loop        2 iload_0
                           3 istore_3
                           4 goto             +10 //14
   loop body               7 iload_2
                           8 iload_3
                           9 iadd
                          10 istore_2
   inc                    11 iinc             3 by 1
   test                   14 iload_3
                          15 iload_1
                          16 if_icmplt        -9 //7
   after loop             19 iload_2
                          20 ireturn


Examine a for loop example

The first 2 ops are the initialization of “sum”, load 0 and store in “sum” (slot 2)
The next 3 ops are the loop initialization and jump to the initial test...
- load the value of “min” (slot 0) into “i” (slot 3)
- then jump to the test
The test is placed at the end since it is generally performed after the body and step portions of the loop
The test...
- loads “i” (slot 3) and “max” (slot 1)
- if “i” is less than “max”, then it jumps back 9 bytes to the start of the loop body
The loop body...
- loads and adds “sum” and “i” (slots 2 and 3) and stores the result back into “sum” (slot 2)
Then the step / increment part of the loop happens...
- which just increments “i”
Then we flow straight into the test portion
If the test fails, we flow through to the after loop portion
Here, we load “sum” (slot 2) and return the result
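
The layout described above (jump to the test first, then body, increment, and test at the bottom) can be sketched back in Java. Java has no goto, so this sketch uses an if around a do/while to get the same shape; LoopShape and sumLowered are hypothetical names, and this is an approximation of the control flow, not decompiler output.

```java
// Hypothetical class comparing the source loop with a hand-lowered version.
class LoopShape {
    // The loop as written on the slide
    static int sum(int min, int max) {
        int sum = 0;
        for (int i = min; i < max; ++i) {
            sum += i;
        }
        return sum;
    }

    // Same control flow as the byte code: the test sits at the bottom
    static int sumLowered(int min, int max) {
        int sum = 0;           // 0-1:   before
        int i = min;           // 2-3:   init
        if (i < max) {         // 4:     goto the test before the first pass
            do {
                sum += i;      // 7-10:  loop body
                ++i;           // 11:    inc
            } while (i < max); // 14-16: test, jump back to the body if true
        }
        return sum;            // 19-20: after loop
    }
}
```

Placing the test at the bottom means each iteration executes one conditional jump instead of a conditional jump plus a goto.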
Exception Handling
  static int read( InputStream in ) {
    try {
      return in.read();
    } catch ( IOException e ) {
      return -1;
    } finally {
      IoUtils.closeQuietly( in );
    }
  }

  try / finally           0 aload_0
                          1 invokevirtual InputStream.read
                          4 istore_1
                          5 aload_0
                          6 invokestatic IoUtils.closeQuietly
                          9 iload_1
                         10 ireturn
  catch / finally        11 pop
                         12 aload_0
                         13 invokestatic IoUtils.closeQuietly
                         16 iconst_m1
                         17 ireturn
  finally                18 astore_2
                         19 aload_0
                         20 invokestatic IoUtils.closeQuietly
                         23 aload_2
                         24 athrow

                         Exception Table
    start           end       handler    Exception
       0                 5      11      IOException
       0                 5      18          any
      11                 12     18          any

Now, exception handling...
Exceptions are handled through extra meta-information (the exception table) that says how to handle different types of exceptions over a range of byte-code instructions.

The finally code is inlined into the try, catch, and finally portions of the generated byte code.
(Prior to Java 6, the regular javac compiler generated “jsr” and “ret” to jump to a single block of compiled “finally” code.)

The “try / finally” section represents the normal flow.
- invoke InputStream.read
- store the result into an unnamed temporary variable (slot 1) because we still need to run the finally code
- run the finally code
- reload the temporary variable and return

The “catch / finally” section is the catching of the IOException...
The exception table says that if an IOException is raised between instructions 0 and 5 (the try), jump to 11, the catch section.
The first step is to “pop”. Pop what? In this case, the IOException, which was automatically placed on the stack. Since we don’t use it, we discard it. This implies that “e” is never assigned a local slot by the compiler.
Now, invoke IoUtils.closeQuietly (the finally block), then return -1.

The “finally” section handles any other exception: store the exception in a temporary (slot 2), run the finally code, then reload the exception and rethrow it with athrow.
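
To make the inlining concrete, here is a rough hand-written equivalent in Java with one copy of the finally code per exit path. This is a sketch, not decompiler output: the class InlinedFinally, the local closeQuietly (a stand-in for the slide's IoUtils.closeQuietly), and the failing() helper are all made up for illustration, and the "any" handler is approximated by catching RuntimeException and Error.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical class for illustration only.
class InlinedFinally {
    // Stand-in for the slide's IoUtils.closeQuietly
    static void closeQuietly(InputStream in) {
        try {
            in.close();
        } catch (IOException ignored) {
            // swallow, as a "quiet" close would
        }
    }

    // The source form from the slide
    static int read(InputStream in) {
        try {
            return in.read();
        } catch (IOException e) {
            return -1;
        } finally {
            closeQuietly(in);
        }
    }

    // Approximation of the byte code: the finally code is copied into each path
    static int readInlined(InputStream in) {
        int result;
        try {
            try {
                result = in.read();  // 0-4: call read, stash the result in a temporary
            } catch (IOException e) {
                closeQuietly(in);    // 12-13: finally copy for the catch path
                return -1;           // 16-17
            }
        } catch (RuntimeException | Error t) {
            closeQuietly(in);        // 19-20: finally copy for the "any" handler
            throw t;                 // 23-24: rethrow
        }
        closeQuietly(in);            // 5-6: finally copy for the normal path
        return result;               // 9-10
    }

    // An InputStream whose read always fails, to exercise the catch path
    static InputStream failing() {
        return new InputStream() {
            @Override
            public int read() throws IOException {
                throw new IOException("simulated failure");
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(readInlined(new ByteArrayInputStream(new byte[] {42})));
        System.out.println(readInlined(failing()));
    }
}
```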
Synchronization
  void inc() {
    synchronized ( this ) {
      ++this.num;
    }
  }

  before try              0 aload_0
                          1 dup
                          2 astore_1
                          3 monitorenter
  try / finally           4 aload_0
                          5 dup
                          6 getfield        Counter.num
                          9 iconst_1
                         10 iadd
                         11 putfield        Counter.num
                         14 aload_1
                         15 monitorexit
                         16 goto           +6 //22
  finally                19 aload_1
                         20 monitorexit
                         21 athrow
                         22 return

                         Exception Table
    start           end       handler   Exception
       4                 16     19          any
      19                 21     19          any

Interestingly enough, synchronization works the same way.
To understand synchronization, it is better to look at it as a lock and unlock within a try / finally.
And, that’s exactly how the byte code works.
And, just like a regular try / finally, the finally code is inlined in both the try and the finally portions.
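
The same lock / try / finally shape can be written out by hand with java.util.concurrent.locks.ReentrantLock. This is only an analogy under stated assumptions: a ReentrantLock is a different lock from the object's monitor, and the class names SyncShapes, MonitorCounter, and LockCounter are hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical classes for illustration only.
class SyncShapes {
    // The form from the slide: javac emits monitorenter / monitorexit
    static final class MonitorCounter {
        private int num;

        void inc() {
            synchronized (this) {
                ++this.num;
            }
        }

        int get() {
            synchronized (this) {
                return num;
            }
        }
    }

    // The same shape spelled out with an explicit lock object
    static final class LockCounter {
        private final ReentrantLock lock = new ReentrantLock();
        private int num;

        void inc() {
            lock.lock();       // plays the role of monitorenter
            try {
                ++this.num;
            } finally {
                lock.unlock(); // plays the role of monitorexit, on every exit path
            }
        }

        int get() {
            lock.lock();
            try {
                return num;
            } finally {
                lock.unlock();
            }
        }
    }
}
```

Note that the lock is taken before the try, just as monitorenter sits in the “before try” portion of the byte code.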
Synchronization
 void inc() {
   lock( this );
   try {
     ++this.num;
   } finally {
     unlock( this );
   }
 }

  before try              0 aload_0
                          1 dup
                          2 astore_1
                          3 monitorenter
  try / finally           4 aload_0
                          5 dup
                          6 getfield        Counter.num
                          9 iconst_1
                         10 iadd
                         11 putfield        Counter.num
                         14 aload_1
                         15 monitorexit
                         16 goto           +6 //22
  finally                19 aload_1
                         20 monitorexit
                         21 athrow
                         22 return

                         Exception Table
    start           end       handler   Exception
       4                 16     19          any
      19                 21     19          any
Demo
                         Java 5
                         Java 7





In these demos, I demonstrate new language features by showing Java 5 and Java 7 code and then showing what it looks like when it’s decompiled back into Java 4 code.
JAD - http://www.varaneckas.com/jad
Java 5

Auto-Boxing
      Original                                              Decompiled as Java 4
  public class AutoBoxing {                                public class AutoBoxing {
    public static void main(String[] args) {                 public static void main(String args[]) {
      Integer foo = 20;                                        Integer foo = Integer.valueOf(20);
      Integer bar = 30;                                        Integer bar = Integer.valueOf(30);

      int sum = foo + bar;                                     int sum = foo.intValue() + bar.intValue();
      System.out.println(sum);                                 System.out.println(sum);
    }                                                        }
  }                                                        }





Here, we see how auto-boxing works.
The compiler injects the necessary calls to Integer.valueOf and Integer.intValue for us.
NOTE: Even if you don’t like auto-boxing, please call Integer.valueOf rather than calling new Integer.
Unlike new, Integer.valueOf returns cached instances of Integer for commonly used values.
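
A quick way to see the cache at work is to compare references rather than values. In this sketch the class BoxCache and its method name are hypothetical; the only library call is Integer.valueOf, which is required to return cached instances for values from -128 to 127.

```java
// Hypothetical class for illustration only.
class BoxCache {
    // Mirrors what javac emits for "Integer a = value; Integer b = value;"
    static boolean boxesToSameInstance(int value) {
        Integer a = Integer.valueOf(value);
        Integer b = Integer.valueOf(value);
        return a == b; // reference comparison, on purpose
    }

    public static void main(String[] args) {
        // Values from -128 to 127 are required to come from the cache
        System.out.println(boxesToSameInstance(20));
        // Outside that range the answer depends on the JVM and its settings
        System.out.println(boxesToSameInstance(100_000));
    }
}
```

This is also why boxed values should be compared with equals rather than ==, since the result of == changes depending on whether the value happens to be cached.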
Enhanced For
     Original                                              Decompiled as Java 4
 public class EnhancedFor {                                 public class EnhancedFor {
   static void array(String[] args) {                         static void array(String args[]) {
     for ( String arg : args ) {                                String arr$[] = args;
       System.out.println(arg);                                 int len$ = arr$.length;
     }                                                          for (int i$ = 0; i$ < len$; i$++) {
   }                                                              String arg = arr$[i$];
                                                                  System.out.println(arg);
                                                                }
                                                              }

   static void iterable(                                      static void iterable(Iterable args) {
     Iterable<String> args)                                     String arg;
   {                                                            for (Iterator i$ = args.iterator();
     for ( String arg: args ) {                                   i$.hasNext(); )
       System.out.println(arg);                                 {
     }                                                            arg = (String) i$.next();
   }                                                              System.out.println(arg);
 }                                                              }
                                                              }
                                                            }




In this slide, we see how the enhanced for gets handled by the compiler.

The array for loop converts to the canonical C-style loop, with the slight difference that the array length is hoisted out of the loop as a loop invariant. (This is a rather pointless optimization, though, because the JVM would do this at runtime anyway.)

For an Iterable, a loop that uses an iterator is generated. In this example, we can also see that the compiler injects a cast to the exact type String, too.
Var-Args
    Original                                                Decompiled as Java 4
   public final class VarArgs {                               public final class VarArgs {
     public static void main(String... args) {                  public static transient void main(
       System.out.printf(                                         String[] args)
         "Hello %s %s", "Jon", "Doe");                          {
     }                                                            System.out.printf(
   }                                                               "Hello %s %s",
                                                                   new Object[] {"Jon", "Doe"});
                                                                }
                                                              }





In this example, we see var-args being used both in the signature and in the call to printf.
NOTE: I’ve declared a main method with var-args, since on a byte-code level this is still just a String[]. This
actually works just fine.

The “transient” modifier in the decompiled Java 4 is a bit amusing. This happens because Java ran out of flag bits
to use in Java 5, so they overloaded the “transient” bit which only applies to fields to mean “var-args” when
applied to methods.

In the call to printf, we can see that the compiler injects a construction of a new Object[] and passes it as the last
arg to printf.
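The shared flag bit can also be observed through reflection. The following is a minimal sketch, not part of the original slides; the class name VarArgsFlagDemo and the method greet are made up for illustration. It asks java.lang.reflect whether a var-args method reports the "transient" modifier.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class VarArgsFlagDemo {
    // Hypothetical var-args method; in the class file it is greet(String[]) with the var-args flag set.
    static String greet(String... names) {
        return "Hello " + String.join(" ", names);
    }

    // Looks up greet reflectively; wraps the checked exception so callers stay simple.
    static Method greetMethod() {
        try {
            return VarArgsFlagDemo.class.getDeclaredMethod("greet", String[].class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Method m = greetMethod();
        System.out.println("isVarArgs:   " + m.isVarArgs());
        // Same bit (0x0080) that means "transient" when it appears on a field.
        System.out.println("isTransient: " + Modifier.isTransient(m.getModifiers()));
    }
}
```

This is why a Java 4 era decompiler, which knows nothing about var-args, prints "transient" on the method.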
Enum
    Original                                              Decompiled as Java 4
   public enum AnEnum {                                   public static final class AnEnum
     FOO,                                                   extends Enum
     BAR,                                                 {
     QUUX                                                   public static final AnEnum FOO =
   }                                                          new AnEnum("FOO", 0);
                                                            public static final AnEnum BAR =
                                                              new AnEnum("BAR", 1);
                                                            public static final AnEnum QUUX =
                                                              new AnEnum("QUUX", 2);
                                                            private static final AnEnum[] $VALUES =
                                                              new AnEnum[]{FOO, BAR, QUUX};

                                                              public static AnEnum[] values() {
                                                                return (AnEnum[]) $VALUES.clone();
                                                              }
                                                              public static AnEnum valueOf(String name){
                                                                return (AnEnum)Enum.valueOf(
                                                                 AnEnum.class, name);
                                                              }
                                                              private AnEnum(String s, int i) {
                                                                super(s, i);
                                                              }
                                                          }


For enums, the compiler does a great deal of work on your behalf -- even in the simplest case.
The compiler generates a constructor that takes a name and an ordinal for each entry.
It then initializes a static final field for each constant from the original file.
These constants are all placed in the $VALUES array.
Finally, the compiler generates a values() method and a valueOf() method for each enum class.
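One visible consequence of the generated values() method is that every call clones the hidden array. A small sketch, assuming a wrapper class named EnumValuesDemo that is not in the original slides:

```java
final class EnumValuesDemo {
    enum AnEnum { FOO, BAR, QUUX }

    // values() hands out a fresh copy of the hidden array on every call.
    static boolean valuesReturnsCopy() {
        return AnEnum.values() != AnEnum.values();
    }

    // Overwriting a slot in one copy must not affect the next copy.
    static AnEnum firstAfterTampering() {
        AnEnum[] copy = AnEnum.values();
        copy[0] = AnEnum.QUUX;
        return AnEnum.values()[0];
    }

    public static void main(String[] args) {
        System.out.println(valuesReturnsCopy());
        System.out.println(firstAfterTampering());
        System.out.println(AnEnum.valueOf("BAR").ordinal());
    }
}
```

If values() is called in a hot loop, caching the array in a local variable avoids the repeated clone.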
Covariance
    Original                                                 Decompiled as Java 4
  public interface Parent {                                  public static interface Parent {
    Number calculate();                                        public abstract Number calculate();
  }                                                          }

  public class CovariantChild                                public class CovariantChild
    implements Parent                                          implements Parent
  {                                                          {
    public Integer calculate() {                               public Integer calculate() {
      return 10;                                                 return Integer.valueOf(10);
    }                                                          }
  }
                                                                 public volatile Number calculate() {
                                                                   return calculate();
                                                                 }
                                                             }





A lesser known addition to Java 5 is the ability to have a covariant return type.
Here, the child type returns a more specific type of Number -- namely Integer.

The generated code is interesting. We end up with two “calculate” methods - one that returns Integer and another
that returns Number. The one that returns Number satisfies the contract of the parent and simply calls the more
specific version that returns Integer.

Here again, we see a curious modifier on a method: “volatile”. This is another situation where Java 5 overloaded an
existing flag bit; on a method, the bit marks a compiler-generated bridge method.

For more information on why this is type-safe, look up the Liskov Substitution Principle.
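The two calculate methods can be listed with reflection. This is an illustrative sketch; the wrapper class BridgeMethodDemo and the helper countCalculate are assumptions added here, while Parent and CovariantChild mirror the slide.

```java
import java.lang.reflect.Method;

final class BridgeMethodDemo {
    interface Parent { Number calculate(); }

    static final class CovariantChild implements Parent {
        @Override public Integer calculate() { return 10; }
    }

    // Counts the declared calculate() methods, optionally only the compiler-generated bridges.
    static int countCalculate(boolean bridgesOnly) {
        int count = 0;
        for (Method m : CovariantChild.class.getDeclaredMethods()) {
            if (m.getName().equals("calculate") && (!bridgesOnly || m.isBridge())) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (Method m : CovariantChild.class.getDeclaredMethods()) {
            System.out.println(m.getReturnType().getSimpleName() + " " + m.getName()
                + "() bridge=" + m.isBridge());
        }
    }
}
```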
Java 7

Multi-Catch
    Original                                              Decompiled as Java 4
    public final class EnhancedCatch {                    public final class EnhancedCatch {
      public static void main(String[] args){               public static void main(String args[]) {
        try {                                                 try {
          Class.                                                Class.
            forName("some.package.SomeClass").                    forName("some.package.SomeClass").
            newInstance();                                         newInstance();
        } catch (                                             } catch (ReflectiveOperationException e){
          InstantiationException |                              throw new IllegalStateException(e);
          IllegalAccessException |                            }
          ClassNotFoundException e)                         }
        {                                                 }
          throw new IllegalStateException(e);
        }
      }
    }





Java 7 adds the ability to handle multiple exception types in a single catch.
Great for ugly reflection code.
Here, the catch of all the reflection exceptions simplifies to a single catch of their common parent
ReflectiveOperationException (a new base class for reflection exceptions also introduced in Java 7).
Try With Resources
    Original                                               Decompiled
   public class EnhancedTry {                              public class EnhancedTry {
     public static void main(                                public static void main(String args[])
       String[] args)                                          throws IOException
       throws IOException                                    {
     {                                                         Properties properties = new Properties();
       Properties properties =                                 InputStream in =
         new Properties();                                       new FileInputStream("my.properties");
                                                               Throwable throwable = null;
           try (InputStream in =                               try {
             new FileInputStream("my.properties"))               properties.load(in);
           {                                                   } catch (Throwable throwable1) {
             properties.load(in);                                throwable = throwable1;
           }                                                   } finally {
       }                                                         if (in != null) {
   }                                                               try {
                                                                     in.close();
                                                                   } catch (Throwable x2) {
                                                                     throwable.addSuppressed(x2);
                                                                     throw throwable;
                                                                   }
                                                                 }
                                                               }
                                                             }
                                                           }

Java 7 also enhances try by allowing it to close resources automatically.
It generates a try / finally similar to what you’d write by hand,
although it puts the resource acquisition outside the try (which is correct, but uncommon among Java
programmers).
It also does one more thing: it adds code so that if an exception happens while closing, the original
exception from the body is still propagated. Even better, the exception raised by close is added to the
suppressed list of the original exception using the new Java 7 method Throwable.addSuppressed.
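The suppression behavior is easy to see with a resource that fails on close. The sketch below is my own illustration, not from the slides; FailingResource and SuppressedDemo are hypothetical names.

```java
final class SuppressedDemo {
    // Hypothetical resource whose close() always fails.
    static final class FailingResource implements AutoCloseable {
        @Override public void close() {
            throw new IllegalStateException("close failed");
        }
    }

    // Runs a failing body inside try-with-resources and returns whatever propagates.
    static Throwable run() {
        try (FailingResource resource = new FailingResource()) {
            throw new IllegalArgumentException("body failed");
        } catch (RuntimeException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        Throwable t = run();
        System.out.println("propagated: " + t.getMessage());
        for (Throwable s : t.getSuppressed()) {
            System.out.println("suppressed: " + s.getMessage());
        }
    }
}
```

The exception from the body wins, and the one from close() is attached to it rather than lost.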
String Switch
    Original                                              Decompiled
   switch (args[0]) {                                       byte byte0 = -1;
     case "Hello":                                          switch(args[0].hashCode()) {
     System.out.println("Hello, World!");                     case 69609650: ... break;
     break;                                                   case 67278:
                                                              if(s.equals("9\uFFE7")) {
       case "Bye":                                              byte0 = 2;
       System.out.println("Good Bye, World!");                } else if(s.equals("Bye")) {
       break;                                                   byte0 = 1;
                                                              }
       case "9\uffe7":                                       break;
       System.out.println("Collision");                     }
       break;                                               switch(byte0) {
   }                                                          case 0:
                                                              System.out.println("Hello, World!");
                                                              break;

                                                                case 1:
                                                                System.out.println("Good Bye, World!");
                                                                break;

                                                                case 2:
                                                                System.out.println("Collision");
                                                                break;
                                                            }

One last example from Java 7 -- string switch

String switch is implemented as a switch on the String’s hashCode.
However, hashCode is not unique, so the generated code must also perform an equals check.

To handle this, string switch actually generates two switch statements.
The first switches on the hashCode and, after an equals check, assigns a temporary variable the index of the
matching case from the original code.

The second switches on that index, with each case containing the code from the original Java 7 case.
Here, I’ve deliberately created a hash collision, so you can see how collisions are resolved.
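The collision used on the slide can be checked directly. This is a small sketch of my own (StringSwitchDemo and respond are made-up names): "Bye" and the two-character string made of '9' followed by U+FFE7 share the hash code 67278, so the generated code has to fall back on equals to tell them apart.

```java
final class StringSwitchDemo {
    // Same three cases as the slide, returning the message instead of printing it.
    static String respond(String key) {
        switch (key) {
            case "Hello":   return "Hello, World!";
            case "Bye":     return "Good Bye, World!";
            case "9\uFFE7": return "Collision";
            default:        return "Unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println("Hello".hashCode());
        System.out.println("Bye".hashCode());
        System.out.println("9\uFFE7".hashCode());
        System.out.println(respond("Bye"));
        System.out.println(respond("9\uFFE7"));
    }
}
```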
Compiler
          Optimizations

In the next few examples, I show the original code and the code after it has been decompiled.
By doing this, we can see some of the optimizations performed by the compiler.
JAD - http://www.varaneckas.com/jad
Constant Folding
     Original                                           Decompiled
 public final class StaticInitializer {                 public final class StaticInitializer {
   private static final String LOG_FORMAT =               private static final String LOG_FORMAT =
   "Started at %d ms";                                      "Started at %d ms";

     private static final long START_TIME =                 private static final long START_TIME =
     System.currentTimeMillis();                              System.currentTimeMillis();

     private static final long START_TIME_2;                private static final long START_TIME_2 =
                                                              System.currentTimeMillis();
     static {                                           }
       START_TIME_2 = System.currentTimeMillis();
     }
 }





While modern Java compilers don’t do much optimization, they do some.
One example is constant folding -- when possible, the compiler computes simple constant expressions at compile
time.
This even includes string concatenation.
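Folding of string concatenation can be observed through reference equality, because a folded constant is treated like a literal and shares its interned instance. A minimal sketch, assuming a made-up class ConstantFoldingDemo; it is not taken from the slides.

```java
final class ConstantFoldingDemo {
    // A constant expression: folded by the compiler into the single literal "Started at 10 ms".
    static final String FOLDED = "Started at " + 10 + " ms";

    // True, because the folded constant and the literal are the same interned String.
    static boolean foldedIsLiteral() {
        return FOLDED == "Started at 10 ms";
    }

    // False: unit is not final, so the concatenation happens at run time and creates a new String.
    static boolean runtimeIsLiteral() {
        String unit = "ms";
        String built = "Started at 10 " + unit;
        return built == "Started at 10 ms";
    }

    public static void main(String[] args) {
        System.out.println(foldedIsLiteral());
        System.out.println(runtimeIsLiteral());
    }
}
```

Comparing strings with == is only done here to expose the folding; ordinary code should keep using equals.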
Constant Inlining
      Original                                     Decompiled
  public class Inlining {                          public class Inlining {
    public static final String                       public static final String
      INLINED_VERSION = "1.1.0";                      INLINED_VERSION = "1.1.0";
    public static final String                       public static final String
      NOT_INLINED_VERSION = identity("1.2.0");        NOT_INLINED_VERSION = identity("1.2.0");

      private static String identity(                  private static String identity(
        String value)                                    String value)
      {                                                {
        return value;                                    return value;
      }                                                }

      public static void print() {                     public static void print() {
        System.out.println(INLINED_VERSION);             System.out.println("1.1.0");
        System.out.println(NOT_INLINED_VERSION);         System.out.println(NOT_INLINED_VERSION);
      }                                                }
  }                                                }





Constants can also be inlined by the compiler.
In this example, the compiler inlines INLINED_VERSION in the print method; however,
it does not inline NOT_INLINED_VERSION.
The reason is that NOT_INLINED_VERSION is not a constant expression, because its initializer invokes a method.

This has implications in the byte code, too.
INLINED_VERSION will have its value set through a ConstantValue attribute.
NOT_INLINED_VERSION will be initialized in a <clinit> method generated by the compiler and
called automatically when the class is first loaded.
Dead Code Elimination
    Original                                     Decompiled
  public class DeadCodeElimination {             public class DeadCodeElimination {
    public static final boolean                    public static final boolean
      DEBUG_OFF = false;                             DEBUG_OFF = false;

      public static final boolean                    public static final boolean
        DEBUG_ON = true;                               DEBUG_ON = true;

      public static void main(String[] args) {       public static void main(String args[]) {
        if ( DEBUG_OFF ) {                             System.out.println("always");
          System.out.println("never");               }
        }                                        }

          if ( DEBUG_ON ) {
            System.out.println("always");
          }
      }
  }





Along with inlining, the compiler can perform dead code elimination.
In this case, DEBUG_OFF is never true, so the “never” print out is not generated by the
compiler.
Even in the DEBUG_ON case, the compiler realizes the if is always true and simply includes an
unconditional print of “always”.
Runtime
          Optimizations
HotSpot Lifecycle
                  1. Interpreted
                  2. Profiling
                  3. Dynamic Compilation
                  4. Dynamic Decompilation




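Client compilation kicks in after about 1,500 invocations (the client VM's default CompileThreshold)
Server compilation kicks in after about 10,000 invocations (the server VM's default CompileThreshold)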
Tiered compilation - C0, C1, C2
Method Replacement vs On-Stack Replacement

http://java.sun.com/products/hotspot/whitepaper.html
http://openjdk.java.net/groups/hotspot/docs/HotSpotGlossary.html
http://www.azulsystems.com/blog/cliff-click/2010-07-16-tiered-compilation
http://www.slideshare.net/drorbr/so-you-want-to-write-your-own-benchmark-presentation
Is This Optimized?
         double sumU = 0, sumV = 0;
         for ( int i = 0; i < 100; ++i ) {
           Vector2D vector = new Vector2D( i, i );
           synchronized ( vector ) {
              sumU += vector.getU();
              sumV += vector.getV();
           }
         }

         How many...?
         Loop Iterations      100
         Heap Allocations     100
         Method Invocations   200
         Lock Acquisitions    100


Let’s start the runtime optimization discussion with a simple question.
Is this optimized?
How many loop iterations does it do? 100
How many heap allocations? 100
How many method invocations? 200
How many lock acquisitions? 100
Surprisingly enough, the answer to all of these may actually be zero.
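To make the question concrete, here is a self-contained version of the loop. The Vector2D class is a hypothetical stand-in, since the slides never show it. Every input is a compile-time-known value and no Vector2D leaves the loop, so the result is always the same pair of sums, which is what gives an optimizer room to remove the work. Whether a particular VM actually does so is not guaranteed.

```java
final class IsThisOptimizedDemo {
    // Hypothetical stand-in for the slide's Vector2D.
    static final class Vector2D {
        private final double u, v;
        Vector2D(double u, double v) { this.u = u; this.v = v; }
        double getU() { return u; }
        double getV() { return v; }
    }

    // The loop from the slide, returning both sums so the result is observable.
    static double[] sums() {
        double sumU = 0, sumV = 0;
        for (int i = 0; i < 100; ++i) {
            Vector2D vector = new Vector2D(i, i);
            synchronized (vector) {
                sumU += vector.getU();
                sumV += vector.getV();
            }
        }
        return new double[] { sumU, sumV };
    }

    public static void main(String[] args) {
        double[] result = sums();
        System.out.println(result[0] + " " + result[1]);
    }
}
```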
Is This Optimized?
         double sumU = 0, sumV = 0;
         for ( int i = 0; i < 100; ++i ) {
           Vector2D vector = new Vector2D( i, i );
           synchronized ( vector ) {
              sumU += vector.getU();
              sumV += vector.getV();
           }
         }

         How many...?
         Loop Iterations      0
         Heap Allocations     0
         Method Invocations   0
         Lock Acquisitions    0

Common Sub-Expression
         Elimination
            int x = a + b;
            int y = a + b;




            int tmp = a + b;
            int x = tmp;
            int y = tmp;



Among the simplest optimizations is common sub-expression elimination.
Here the VM optimizes the code by only performing the calculation of “a+b” once.
http://www.slideshare.net/drorbr/so-you-want-to-write-your-own-benchmark-presentation
Array Bounds Check Elimination
            int[] nums = ...
            for ( int i = 0; i < nums.length; ++i ) {
              System.out.println( "nums[" + i + "]=" + nums[ i ] );
            }


            int[] nums = ...
            for ( int i = 0; i < nums.length; ++i ) {
              if ( i < 0 || i >= nums.length ) {
                throw new ArrayIndexOutOfBoundsException();
              }
              System.out.println( "nums[" + i + "]=" + nums[ i ] );
            }


One of the nice things about the VM is that we do not have to worry about buffer overruns, because the VM checks
array bounds for us. But how much is that costing us?
In short, often nothing. The VM recognizes common patterns and realizes that it does not need to generate the
bounds checking code.
http://www.cs.umd.edu/~vibha/330/array-bounds.pdf
Loop Invariant Hoisting
            for ( int i = 0; i < nums.length; ++i ) {
            ...
            }


            int length = nums.length;
            for ( int i = 0; i < length; ++i ) {
            ...
            }


The VM can also realize that the length of the array does not change, so it can replace looking up the length on
each test with a single store to a temporary variable and compare against that instead.
http://java.sun.com/products/hotspot/docs/whitepaper/Java_Hotspot_v1.4.1/Java_HSpot_WP_v1.4.1_1002_4.html
Loop Unrolling
            int sum = 0;
            for ( int i = 0; i < 10; ++i ) {
              sum += i;
            }

            int sum = 0;
            sum += 0;
            sum += 1;
            ...
            sum += 9;


In some situations, the loop can even be unrolled into a simple linear code segment.
Method Inlining
            Vector vector = ...
            double magnitude = vector.magnitude();

            Vector vector = ...
            double magnitude = Math.sqrt(
              vector.u*vector.u + vector.v*vector.v );

            Vector vector = ...
            double magnitude;
            if ( vector instanceof Vector2D ) {
              magnitude = Math.sqrt(
                vector.u*vector.u + vector.v*vector.v );
            } else {
              magnitude = vector.magnitude();
            }

            Call type     Inlined
            static        always
            final         always
            private       always
            virtual       often
            reflective    sometimes
            dynamic       often




http://www.ibm.com/developerworks/library/j-jtp12214/
http://openjdk.java.net/groups/hotspot/docs/HotSpotGlossary.html
http://blog.headius.com/2009/01/my-favorite-hotspot-jvm-flags.html
http://java.sun.com/developer/technicalArticles/Networking/HotSpot/inlining.html
Lock Coarsening
            StringBuffer buffer = ...
            buffer.append( "Hello" );
            buffer.append( name );
            buffer.append( "\n" );

            StringBuffer buffer = ...
            lock( buffer ); buffer.append( "Hello" ); unlock( buffer );
            lock( buffer ); buffer.append( name ); unlock( buffer );
            lock( buffer ); buffer.append( "\n" ); unlock( buffer );

            StringBuffer buffer = ...
            lock( buffer );
            buffer.append( "Hello" );
            buffer.append( name );
            buffer.append( "\n" );
            unlock( buffer );


Starting in Java 5, HotSpot optimizes locks by performing lock coarsening.
The VM realizes that repeatedly acquiring and releasing the same lock is not performant, so it may take a single
larger lock instead.
http://java.sun.com/performance/reference/whitepapers/6_performance.html#2.1
Other Lock Optimizations
               Biased Locking
               Adaptive Locking - Thread sleep vs. Spin lock





And, even more lock optimizations are possible...
- biased locking - makes it cheap for the last thread that acquired a lock to acquire it again
- adaptive locking - dynamically detects whether a lock is usually held for a short or long period
  - if it is long, the thread is put to sleep
  - if it is short, the thread will simply spin
http://java.sun.com/performance/reference/whitepapers/6_performance.html#2.1
Escape Analysis
           Point p1 = new Point( x1, y1 ), p2 = new Point( x2, y2 );
           synchronized ( p1 ) {
             synchronized ( p2 ) {
               double dx = p1.getX() - p2.getX();
               double dy = p1.getY() - p2.getY();
               double distance = Math.sqrt( dx*dx + dy*dy );
             }
           }





In Java 7, escape analysis is finally on by default.
With escape analysis, the VM can realize that an object never escapes a stack frame allowing
it to...
- elide heap allocation
- elide locks
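The distinction between escaping and non-escaping objects can be sketched as follows. This is my own illustration with made-up names (EscapeDemo, midpoint); it shows which allocations are candidates for elision, not what any particular VM will actually do.

```java
final class EscapeDemo {
    static final class Point {
        private final double x, y;
        Point(double x, double y) { this.x = x; this.y = y; }
        double getX() { return x; }
        double getY() { return y; }
    }

    // Neither Point leaves this method, so both allocations are candidates for elision.
    static double distance(double x1, double y1, double x2, double y2) {
        Point p1 = new Point(x1, y1), p2 = new Point(x2, y2);
        double dx = p1.getX() - p2.getX();
        double dy = p1.getY() - p2.getY();
        return Math.sqrt(dx * dx + dy * dy);
    }

    // The returned Point escapes to the caller, so within this method it must be a real object.
    static Point midpoint(double x1, double y1, double x2, double y2) {
        return new Point((x1 + x2) / 2, (y1 + y2) / 2);
    }

    public static void main(String[] args) {
        System.out.println(distance(0, 0, 3, 4));
        Point m = midpoint(0, 0, 2, 4);
        System.out.println(m.getX() + "," + m.getY());
    }
}
```

On HotSpot the analysis can be toggled with -XX:+DoEscapeAnalysis and -XX:-DoEscapeAnalysis, which is handy when comparing timings.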
Escape Analysis
           Point p1 = new Point( x1, y1 ), p2 = new Point( x2, y2 );
           double dx = p1.getX() - p2.getX();
           double dy = p1.getY() - p2.getY();
           double distance = Math.sqrt( dx*dx + dy*dy );




Escape Analysis
           Point p1 = new Point( x1, y1 ), p2 = new Point( x2, y2 );
           double dx = p1.getX() - p2.getX();
           double dy = p1.getY() - p2.getY();
           double distance = Math.sqrt( dx*dx + dy*dy );




            double dx = x1 - x2;
            double dy = y1 - y2;
            double distance = Math.sqrt( dx*dx + dy*dy );


Runtime Demo

                         http://code.google.com/p/caliper/





To conclude the runtime optimization section, I’ll show some micro-benchmarks illustrating some of the optimizations.
Writing microbenchmarks for a dynamically optimizing VM is devilishly hard. Fortunately, Google created a tool called Caliper
to make it easier. You can write JUnit 3-like benchmark classes to compare various implementation options.
http://www.slideshare.net/drorbr/so-you-want-to-write-your-own-benchmark-presentation
http://code.google.com/p/caliper/
Loop Variable Placement
                   Inside
            for ( int i = 0; i < ints.length; ++i ) {
              int x = ints[i];
              sum += x;
            }

                            vs.

                   Outside
            int x;
            for ( int i = 0; i < ints.length; ++i ) {
              x = ints[i];
              sum += x;
            }

                            vs.

                   No Variable
            for ( int i = 0; i < ints.length; ++i ) {
              sum += ints[i];
            }





First, let’s look at loop variable placement -- declaring the loop variable inside the loop vs. outside vs. using no
variable at all.
All three take the same amount of time to run. In fact, declaring inside or outside produces the same byte code.

My recommendation...
For a one-line loop body, skip the variable.
For a complicated loop body, declare the variable inside to keep the code easier to read and refactor.
Loop Invariant Hoisting
                 Regular For
                 for ( int i = 0; i < ints.length; ++i ) {
                   sum += ints[i];
                 }


                            vs.
                 Manual Hoisting
                 for ( int i = 0, len = ints.length; i < len; ++i ) {
                   sum += ints[i];
                 }


                            vs.
                 Enhanced For
                  for ( int x : ints ) {
                    sum += x;
                  }





Now, we’ll compare...
- the canonical loop, which checks i against array.length each time in the test
- manually hoisting the length into a len temporary variable
- using Java 5’s enhanced for
Once again, they all take the same amount of time because the VM performs the hoisting for us.
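For reference, the three loop forms can be written as complete methods like this. It is only a sketch with made-up names (LoopFormsDemo and its methods); it checks that the variants compute the same sum and is not a Caliper benchmark, so it says nothing about timing by itself.

```java
final class LoopFormsDemo {
    // Canonical loop: the test reads ints.length on every iteration.
    static long sumRegular(int[] ints) {
        long sum = 0;
        for (int i = 0; i < ints.length; ++i) {
            sum += ints[i];
        }
        return sum;
    }

    // Manual hoisting: the length is copied into len once.
    static long sumHoisted(int[] ints) {
        long sum = 0;
        for (int i = 0, len = ints.length; i < len; ++i) {
            sum += ints[i];
        }
        return sum;
    }

    // Enhanced for: the compiler generates the hoisted form for us.
    static long sumEnhanced(int[] ints) {
        long sum = 0;
        for (int x : ints) {
            sum += x;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] ints = { 1, 2, 3, 4, 5 };
        System.out.println(sumRegular(ints) + " " + sumHoisted(ints) + " " + sumEnhanced(ints));
    }
}
```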
Field Access
                   Direct
                    point.x
                    point.y


                         vs.
                    Virtual Accessor
                    point.getX()
                    point.getY()



                         vs.
                     Interface Accessor
                    point.getX()
                    point.getY()





Next, we’ll look at direct field access vs. using a virtual accessor method vs. using an interface accessor method.
Once again, the VM can optimize all of these by performing method inlining, so all three take the same amount of
time.
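The virtual and interface accessor calls look identical in source; the difference is the static type of the receiver. A sketch with hypothetical names (FieldAccessDemo, HasX), not taken from the benchmark itself:

```java
final class FieldAccessDemo {
    interface HasX { double getX(); }

    // Non-final class with a public field and an overridable accessor.
    static class Point implements HasX {
        public final double x;
        Point(double x) { this.x = x; }
        @Override public double getX() { return x; }
    }

    // Direct field read: compiles to getfield.
    static double direct(Point p) { return p.x; }

    // Call through the class type: compiles to invokevirtual.
    static double viaClass(Point p) { return p.getX(); }

    // Call through the interface type: compiles to invokeinterface.
    static double viaInterface(HasX p) { return p.getX(); }

    public static void main(String[] args) {
        Point p = new Point(3.5);
        System.out.println(direct(p) + " " + viaClass(p) + " " + viaInterface(p));
    }
}
```

Running javap -c on the compiled class shows the three different instructions.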
Lock Coarsening
                   StringBuilder - no locks
                   StringBuilder builder   = new StringBuilder();
                   builder.append( "foo"   );
                   builder.append( "bar"   );
                   builder.append( "baz"   );

                              vs.
                   StringBuffer - multiple locks
                   StringBuffer buffer = new StringBuffer();
                   buffer.append( "foo" );
                   buffer.append( "bar" );
                   buffer.append( "baz" );

                              vs.
                   StringBuffer - single lock
                   StringBuffer buffer = new StringBuffer();
                   synchronized( buffer ) {
                     buffer.append( "foo" );
                     buffer.append( "bar" );
                     buffer.append( "baz" );
                   }




Now, revisiting locking - compare...
Java 5’s StringBuilder which performs no locking
 vs.
Plain StringBuffer code - multiple separate appends
 vs.
StringBuffer - with a manually added bigger lock

The no lock version does come out slightly ahead, but it is close.
And, the attempt to manually improve performance by taking a bigger single lock actually comes in last.
Heap Elision Benchmark
                    Primitive Array
                    Arrays.sort(new int[]{...});

                              vs.
                    Boxed Array - no Comparator
                     Arrays.sort(new Integer[]{...});

                              vs.
                   Boxed Array - singleton Comparator
                     Arrays.sort(
                       new Integer[]{...},
                       IntComparator.INSTANCE);

                              vs.
                   Boxed Array - anonymous Comparator
                     Arrays.sort(
                       new Integer[]{...},
                       new Comparator<Integer>() {
                         ...
                       });


Lastly, let’s look at heap elision by sorting some arrays.
No surprise, the primitive array is the most performant.
But the no-Comparator case, the singleton Comparator case, and the anonymous Comparator case all perform the same.
Even creating an anonymous Comparator every time does not impact performance much -- in Java 7, no heap allocation
may take place at all.
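Spelled out in full, the four variants might look like the sketch below. All names here are assumptions for illustration (SortVariantsDemo, and an enum-based singleton comparator standing in for the one on the slide); the benchmark's real code is not shown in the deck.

```java
import java.util.Arrays;
import java.util.Comparator;

final class SortVariantsDemo {
    // Hypothetical singleton comparator, implemented as a one-constant enum.
    enum IntComparator implements Comparator<Integer> {
        INSTANCE;
        @Override public int compare(Integer a, Integer b) { return a.compareTo(b); }
    }

    static int[] sortPrimitive(int[] values) {
        int[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    static Integer[] sortBoxed(Integer[] values) {
        Integer[] copy = values.clone();
        Arrays.sort(copy); // natural ordering, no Comparator
        return copy;
    }

    static Integer[] sortSingleton(Integer[] values) {
        Integer[] copy = values.clone();
        Arrays.sort(copy, IntComparator.INSTANCE);
        return copy;
    }

    static Integer[] sortAnonymous(Integer[] values) {
        Integer[] copy = values.clone();
        // A new Comparator object is requested on every call.
        Arrays.sort(copy, new Comparator<Integer>() {
            @Override public int compare(Integer a, Integer b) { return a.compareTo(b); }
        });
        return copy;
    }

    public static void main(String[] args) {
        Integer[] boxed = { 3, 1, 2 };
        System.out.println(Arrays.toString(sortPrimitive(new int[] { 3, 1, 2 })));
        System.out.println(Arrays.toString(sortBoxed(boxed)));
        System.out.println(Arrays.toString(sortSingleton(boxed)));
        System.out.println(Arrays.toString(sortAnonymous(boxed)));
    }
}
```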
Is This Optimized?
         double sumU = 0, sumV = 0;
         for ( int i = 0; i < 100; ++i ) {
           Vector2D vector = new Vector2D( i, i );
           synchronized ( vector ) {
              sumU += vector.getU();
              sumV += vector.getV();
           }
         }

         How many...?
         Loop Iterations      0
         Heap Allocations     0
         Method Invocations   0
         Lock Acquisitions    0


So now, hopefully, you can see how this may truly be optimized already.
Just write clean code and trust the VM to make it fast.
If you must optimize, always profile first and use a micro-benchmarking tool like Caliper.
Recommended Reading
                         Java Puzzlers
                         By Joshua Bloch and Neal Gafter
                         http://www.javapuzzlers.com/

                         Java Specialist Newsletter
                         http://www.javaspecialists.eu

                         Brian Goetz’s Articles
                         http://www.ibm.com/developerworks/views/java/libraryview.jsp?contentarea_by=Java+technology&search_by=brian+goetz

Q&A

JVM Internals - NHJUG Jan 2012

  • 1. JVM Internals Douglas Q. Hawkins http://www.slideshare.net/dougqh http://www.dougqh.net dougqh@gmail.com Monday, January 23, 12
  • 2. Topics Java Byte Code File Format Byte Code Examples How Java 5 & 7 Features Are Implemented JVM Optimizations Monday, January 23, 12
  • 3. Why? Monday, January 23, 12 Besides techie edification, why is this useful? A better understanding of the internals can help in deciphering some of the harder problems, but better... You’ll know that the compiler and JVM are doing a lot for you letting you focus on writing readable code.
  • 5. Class File Format CA FE BA BE Minor Version Major Version Constant Pool Flags This Class Super Class Interfaces Fields Methods Attributes Monday, January 23, 12 Every file starts the magic 2-bytes: CAFEBABE Followed by major and minor version - major indicates Java 5, 6, 7, etc. Then a constant pool - which contains... constants: int, long, String, etc. references: method and field descriptors: method and field Followed by flags: modifiers for this class/interface Followed by reference to this class/interface Followed by the super class - which is an index into the constant pool Followed by a list interface references - which are indices into constant pool Followed by fields Followed by methods And, finally, attributes which are extra meta-information about the class... - the name of the original file - annotation information - information on sub-classes Class File Spec: http://java.sun.com/docs/books/jvms/second_edition/ClassFileFormat-Java5.pdf History of CAFEBABE: http://en.wikipedia.org/wiki/Java_class_file
  • 6. Class File Format CA FE BA BE Minor Version Major Version Constant Pool n Flags This Class Super Class pu te d tio ce iva te er ct ta ab tfp pr ec int ra fa ic um pr ic no Interfaces st bl ric ot al at en an fin st st Fields Methods Attributes Monday, January 23, 12 Every file starts the magic 2-bytes: CAFEBABE Followed by major and minor version - major indicates Java 5, 6, 7, etc. Then a constant pool - which contains... constants: int, long, String, etc. references: method and field descriptors: method and field Followed by flags: modifiers for this class/interface Followed by reference to this class/interface Followed by the super class - which is an index into the constant pool Followed by a list interface references - which are indices into constant pool Followed by fields Followed by methods And, finally, attributes which are extra meta-information about the class... - the name of the original file - annotation information - information on sub-classes Class File Spec: http://java.sun.com/docs/books/jvms/second_edition/ClassFileFormat-Java5.pdf History of CAFEBABE: http://en.wikipedia.org/wiki/Java_class_file
  • 7. Field Format Flags Name Descriptor pu te d lat nt iva te ile vo ie pr ec ic pr ic ns Attributes bl ot al at tra fin Monday, January 23, 12 st Fields consist of... flags followed by name - actually index to a string literal into the constant pool followed by descriptor - e.g. field type - also index into the constant pool - type is raw type followed by attributes - constant value - specific type information - List< String >, etc.
  • 8. Field Format Flags Name Descriptor “name” Attributes Monday, January 23, 12 Fields consist of... flags followed by name - actually index to a string literal into the constant pool followed by descriptor - e.g. field type - also index into the constant pool - type is raw type followed by attributes - constant value - specific type information - List< String >, etc.
  • 9. Field Format Flags Name Descriptor “Ljava/lang/String;” Attributes Monday, January 23, 12 Fields consist of... flags followed by name - actually index to a string literal into the constant pool followed by descriptor - e.g. field type - also index into the constant pool - type is raw type followed by attributes - constant value - specific type information - List< String >, etc.
  • 10. Field Format Flags Name Descriptor Attributes ConstantValue Monday, January 23, 12 Fields consist of... flags followed by name - actually index to a string literal into the constant pool followed by descriptor - e.g. field type - also index into the constant pool - type is raw type followed by attributes - constant value - specific type information - List< String >, etc.
  • 11. Method Format d ize Flags Name Descriptor pu te d al on iva te s tfp fi n hr pr ec rg ic va e pr ic Attributes tiv nc bl ra ric ot at na sy st Monday, January 23, 12 st Methods consist of... flags followed by name - actually index to a string literal into the constant pool followed by descriptor - e.g. raw parameter types and return type followed by attributes - exceptions & code - specific type information - List< String >, etc. - specific exception information - debugging information
  • 12. Method Format Flags Name Descriptor “main” Attributes Monday, January 23, 12 Methods consist of... flags followed by name - actually index to a string literal into the constant pool followed by descriptor - e.g. raw parameter types and return type followed by attributes - exceptions & code - specific type information - List< String >, etc. - specific exception information - debugging information
  • 13. Method Format Flags Name Descriptor “([Ljava/lang/String;)V” Attributes Monday, January 23, 12 Methods consist of... flags followed by name - actually index to a string literal into the constant pool followed by descriptor - e.g. raw parameter types and return type followed by attributes - exceptions & code - specific type information - List< String >, etc. - specific exception information - debugging information
  • 14. Method Format Flags Name Descriptor Attributes Exceptions Code Monday, January 23, 12 Methods consist of... flags followed by name - actually index to a string literal into the constant pool followed by descriptor - e.g. raw parameter types and return type followed by attributes - exceptions & code - specific type information - List< String >, etc. - specific exception information - debugging information
  • 15. Constant Pool C 2 UTF 10 HelloWorld C 4 UTF 16 “java/lang/Object” UTF 6 “<init>” UTF 3 “()V” UTF 4 “Code” M 3 9 N&T 5 6 UTF 4 “main” UTF 22 “([Ljava/lang/String;)V” F 13 15 C 14 UTF 16 “java/lang/System” Monday, January 23, 12 Dissect the “Hello World” example a little... Entry 1 is a class entry - a 2-byte index to a UTF entry that contains the name Entry 2 is the name of the class Similarly... Entry 3 is a class entry - referring to the parent class refers to Entry 4 which is the full name of the parent class Skip over the constructor “<init>” and focus on main Entry 10 is the name “main” & Entry 11 is the raw type descriptor for “main” The [Ljava/lang/String indicates String[] - V indicates returns void
  • 16. Browsing Class File Format JClassLib Viewer http://www.ej-technologies.com/products/jclasslib/overview.html Monday, January 23, 12 JClassLibViewer: http://www.ej-technologies.com/products/jclasslib/overview.html
  • 17. ConstantValue public final class HelloWorld { public static final String MESSAGE = "Hello, World!"; public static final void main( final String... args ) { System.out.println( MESSAGE ); } } Monday, January 23, 12 Here, we can see that because the “MESSAGE” field is “static final”. The value is stored in a “ConstantValue” attribute on the “MESSAGE” field.
  • 18. Exceptions public interface InputStreamProvider { public abstract InputStream open() throws IOException; } Monday, January 23, 12 Exception information is also stored in attribute. As it turns out the JVM, makes no distinction between checked and unchecked exceptions which has an interesting implication...
  • 19. Exceptions public final class NewInstance { public static void main(String... args) { try { public class SomeClass { Class. public SomeClass() throws SomeException { forName("net.dougqh.runtime.SomeClass"). throw new SomeException(); newInstance(); } } catch ( } InstantiationException | IllegalAccessException | ClassNotFoundException e) { e.printStackTrace(); } } } Exception in thread "main" net.dougqh.runtime.SomeClass$SomeException ! at net.dougqh.runtime.SomeClass.<init> ! at sun.reflect.NativeConstructorAccessorImpl.newInstance0 ! at sun.reflect.NativeConstructorAccessorImpl.newInstance ! at sun.reflect.DelegatingConstructorAccessorImpl.newInstance ! at java.lang.reflect.Constructor.newInstance ! at java.lang.Class.newInstance0 ! at java.lang.Class.newInstance ! at net.dougqh.runtime.NewInstance.main Monday, January 23, 12 www.javapuzzlers.com Because of an oversight in the original reflection API, Class.newInstance can throw a checked exception that is not reported by the compiler
  • 20. Generics public final class Generics { public static final List<String> getStrings() { return Collections.singletonList("foo"); } } Monday, January 23, 12 Here, we can getStrings() which returns List<String> has a descriptor of the raw-type List However, the exact type information is stored in the “Signature” attribute
  • 21. Annotations @Inherited @Retention( RetentionPolicy.RUNTIME ) public @interface Annotation { public int foo() default 20; public String bar(); } @Annotation( bar="quux" ) class Annotated {} Monday, January 23, 12 An annotation is just an inteface The default values for each method are stored in a ConstElement attribute The annotation information on a class or method is also stored in an attribute In this case, since the annotation has a RUNTIME RetentionPolicy, it is stored in the RuntimeVisibleAnnotations attribute Values for the attribute are stored in the sub-attribute ElementValuePair
  • 23. Stack Based Virtual Machine 0 iconst_1 0 1 2 3 1 iconst_2 2 iadd 3 istore_0 4 iload_0 Monday, January 23, 12 The JVM byte code format is stack-based like many other VMs: CLR, PHP, and Python In this example, the green field is heap, the bar above is local variable slots, and the column to the left is the stack Let’s look at how to add 1 + 2 together and store into a local variable First, we use an iconst_1 instruction to load onto the stack Java has special instructions for common numbers: -1 to 5. Next, an iconst_2 to place 2 on the stack Next, we use iadd which pops the 1 & 2 on the stack adds them together and stores the result back on the stack Next, we use an istore_0 to store into the first local variable slot To load value, back from the local variable slots, we use an iload_0 Note: Similar to iconst, there are special istore/iload instructions for the most often used slots: 0-3
  • 24. Stack Based Virtual Machine 0 iconst_1 0 1 2 3 1 iconst_2 2 iadd 3 istore_0 4 iload_0 1 Monday, January 23, 12 The JVM byte code format is stack-based like many other VMs: CLR, PHP, and Python In this example, the green field is heap, the bar above is local variable slots, and the column to the left is the stack Let’s look at how to add 1 + 2 together and store into a local variable First, we use an iconst_1 instruction to load onto the stack Java has special instructions for common numbers: -1 to 5. Next, an iconst_2 to place 2 on the stack Next, we use iadd which pops the 1 & 2 on the stack adds them together and stores the result back on the stack Next, we use an istore_0 to store into the first local variable slot To load value, back from the local variable slots, we use an iload_0 Note: Similar to iconst, there are special istore/iload instructions for the most often used slots: 0-3
  • 25. Stack Based Virtual Machine 0 iconst_1 0 1 2 3 1 iconst_2 2 iadd 3 istore_0 4 iload_0 2 1 Monday, January 23, 12 The JVM byte code format is stack-based like many other VMs: CLR, PHP, and Python In this example, the green field is heap, the bar above is local variable slots, and the column to the left is the stack Let’s look at how to add 1 + 2 together and store into a local variable First, we use an iconst_1 instruction to load onto the stack Java has special instructions for common numbers: -1 to 5. Next, an iconst_2 to place 2 on the stack Next, we use iadd which pops the 1 & 2 on the stack adds them together and stores the result back on the stack Next, we use an istore_0 to store into the first local variable slot To load value, back from the local variable slots, we use an iload_0 Note: Similar to iconst, there are special istore/iload instructions for the most often used slots: 0-3
  • 26. Stack Based Virtual Machine 0 iconst_1 0 1 2 3 1 iconst_2 2 iadd 3 istore_0 4 iload_0 1+2 Monday, January 23, 12 The JVM byte code format is stack-based like many other VMs: CLR, PHP, and Python In this example, the green field is heap, the bar above is local variable slots, and the column to the left is the stack Let’s look at how to add 1 + 2 together and store into a local variable First, we use an iconst_1 instruction to load onto the stack Java has special instructions for common numbers: -1 to 5. Next, an iconst_2 to place 2 on the stack Next, we use iadd which pops the 1 & 2 on the stack adds them together and stores the result back on the stack Next, we use an istore_0 to store into the first local variable slot To load value, back from the local variable slots, we use an iload_0 Note: Similar to iconst, there are special istore/iload instructions for the most often used slots: 0-3
  • 27. Stack Based Virtual Machine 0 iconst_1 0 1 2 3 1 iconst_2 2 iadd 3 istore_0 4 iload_0 3 Monday, January 23, 12 The JVM byte code format is stack-based like many other VMs: CLR, PHP, and Python In this example, the green field is heap, the bar above is local variable slots, and the column to the left is the stack Let’s look at how to add 1 + 2 together and store into a local variable First, we use an iconst_1 instruction to load onto the stack Java has special instructions for common numbers: -1 to 5. Next, an iconst_2 to place 2 on the stack Next, we use iadd which pops the 1 & 2 on the stack adds them together and stores the result back on the stack Next, we use an istore_0 to store into the first local variable slot To load value, back from the local variable slots, we use an iload_0 Note: Similar to iconst, there are special istore/iload instructions for the most often used slots: 0-3
  • 28. Stack Based Virtual Machine 0 iconst_1 0 1 2 3 1 iconst_2 3 2 iadd 3 istore_0 4 iload_0 Monday, January 23, 12 The JVM byte code format is stack-based like many other VMs: CLR, PHP, and Python In this example, the green field is heap, the bar above is local variable slots, and the column to the left is the stack Let’s look at how to add 1 + 2 together and store into a local variable First, we use an iconst_1 instruction to load onto the stack Java has special instructions for common numbers: -1 to 5. Next, an iconst_2 to place 2 on the stack Next, we use iadd which pops the 1 & 2 on the stack adds them together and stores the result back on the stack Next, we use an istore_0 to store into the first local variable slot To load value, back from the local variable slots, we use an iload_0 Note: Similar to iconst, there are special istore/iload instructions for the most often used slots: 0-3
  • 29. Stack Based Virtual Machine 0 iconst_1 0 1 2 3 1 iconst_2 3 2 iadd 3 istore_0 4 iload_0 3 Monday, January 23, 12 The JVM byte code format is stack-based like many other VMs: CLR, PHP, and Python In this example, the green field is heap, the bar above is local variable slots, and the column to the left is the stack Let’s look at how to add 1 + 2 together and store into a local variable First, we use an iconst_1 instruction to load onto the stack Java has special instructions for common numbers: -1 to 5. Next, an iconst_2 to place 2 on the stack Next, we use iadd which pops the 1 & 2 on the stack adds them together and stores the result back on the stack Next, we use an istore_0 to store into the first local variable slot To load value, back from the local variable slots, we use an iload_0 Note: Similar to iconst, there are special istore/iload instructions for the most often used slots: 0-3
  • 30. Parameters and Local Variables
      static int volume( int width, int depth, int height ) {
          int area = width * depth;
          int volume = area * height;
          return volume;
      }
    Local variable slots: 0 width, 1 depth, 2 height, 3 area, 4 volume
       0 iload_0
       1 iload_1
       2 imul
       3 istore_3
       4 iload_3
       5 iload_2
       6 imul
       7 istore 4
       9 iload 4
      11 ireturn
    Trace through a slightly more complicated example: calculating volume
    - arguments are passed into the low local variable slots: 0 - 2 in this case
    - first, to calculate area, load width and depth from slots 0 & 1 respectively
    - multiply the values on the stack, then store the result into slot 3: area
    - reload area & height from slots 3 & 2 respectively
    - multiply the values and store into slot 4: volume
    - reload volume and return
    Yes, the value is stored and then immediately reloaded in the byte code. Starting with Java 1.3, javac essentially stopped optimizing byte code; optimizations are left to the JVM to perform.
  • 31. Static vs Virtual Methods
      int volume( int width, int depth, int height ) {
          int area = width * depth;
          int volume = area * height;
          return volume;
      }
    Local variable slots: 0 this, 1 width, 2 depth, 3 height, 4 area, 5 volume
       0 iload_1
       1 iload_2
       2 imul
       3 istore 4
       5 iload 4
       7 iload_3
       8 imul
       9 istore 5
      11 iload 5
      13 ireturn
    In the prior example, you may have noticed that the method was static. If the method isn’t static, then “this” is invisibly passed in the first slot. So, our arguments start at slot 1, and the loads and stores all change accordingly.
  • 32. Hello World
      System.out.println( "Hello World" );

      0 getstatic System.out
      3 ldc "Hello World"
      5 invokevirtual PrintStream.println
      8 return
    Now, we know enough to understand “Hello World”.
    The first operation is a getstatic to load the value of System.out onto the stack. We need this reference to invoke println.
    Second, load the string “Hello World” onto the stack. The ldc indicates a load from the constant pool.
    Now, since this is a non-static method on a class, use invokevirtual to invoke PrintStream.println.
    This consumes the pointer to System.out (which is the “this” for PrintStream.println) and the reference to “Hello World”.
    These values are then mapped to the local slots for “this” and “msg” in the new stack frame.
  • 36. Types of Method Invocations
      invokestatic - invoke static methods
      invokevirtual - invoke instance method from class
      invokeinterface - invoke instance method from interface
      invokespecial - invoke <init> / invoke super method
      invokedynamic - optimized dynamic look-up (in Java 7)
    We’ve seen a call to invokevirtual, which is used for instance methods invoked through a class reference, but there are other invocation types, too.
    invokestatic - for static methods
    invokeinterface - for methods invoked through an interface reference (rather than a class reference)
    invokespecial - for direct targets, like constructors or super methods, where the call is not polymorphic
    invokedynamic - used by scripting languages like JRuby in Java 7 for improved performance
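As a rough illustration of the list above, the following sketch pairs ordinary Java calls with the instruction javac is expected to emit for each. The class and method names are invented for this example; the comments describe what `javap -c` typically shows, which is worth confirming on your own compiler.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class: each method exercises one kind of invocation.
class InvocationKinds {
    static int viaStatic() {
        return Math.max(10, 20);                     // invokestatic Math.max
    }

    static int viaVirtual() {
        ArrayList<String> list = new ArrayList<>();  // new, dup, invokespecial ArrayList.<init>
        list.add("a");                               // invokevirtual: reference has a class type
        return list.size();
    }

    static int viaInterface() {
        List<String> list = new ArrayList<>();       // same object, interface-typed reference
        list.add("a");                               // invokeinterface List.add
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(viaStatic() + " " + viaVirtual() + " " + viaInterface());
    }
}
```

Compiling this and running `javap -c InvocationKinds` is a quick way to compare the three methods side by side.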
  • 37. New Object
      BigDecimal num = new BigDecimal( "2.0" );

      0 new BigDecimal
      3 dup
      4 ldc "2.0"
      6 invokespecial BigDecimal.<init>
      9 astore_0
    Now, let’s look at an object allocation.
    The first step is to allocate the object with new; however, this step does not yet invoke the constructor. It just allocates space on the heap for the object and returns a pointer to uninitialized memory.
    Unfortunately, since invoking the constructor will consume a reference to the newly allocated BigDecimal, we need to make a copy (“dup”) so that we’ll have a reference left to store into “num”.
    Next, we push “2.0” onto the stack.
    Then we invoke BigDecimal.<init>, which is the BigDecimal constructor. It consumes the pointer to “2.0” and the duplicate reference, leaving us with one reference to assign into “num”.
    As you can see, construction is rather complicated. Some of the past security holes in the byte code verifier involved object construction because the sequence is non-trivial. The CLR learned from this and has a single “new” instruction that both allocates and invokes the constructor, thus making byte code verification easier.
    From this example, you can also see why double-checked locking is broken in Java. Construction isn’t a single step, and with reordering it is possible for a pointer to an uninitialized object to be assigned to a field. In Java 5, the use of volatile guarantees a “happens-before”, so the field will never be assigned before the constructor is done being invoked.
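The double-checked locking remark can be made concrete with a minimal sketch. The class name `LazyHolder` is hypothetical; the point is only that the field is declared volatile, which is what the Java 5 memory model requires for this idiom to be safe.

```java
import java.math.BigDecimal;

// Hypothetical lazy holder illustrating double-checked locking with volatile.
class LazyHolder {
    // Without volatile, another thread could observe a non-null reference
    // to an object whose constructor has not finished running.
    private static volatile BigDecimal instance;

    static BigDecimal get() {
        BigDecimal local = instance;               // first check, no lock taken
        if (local == null) {
            synchronized (LazyHolder.class) {
                local = instance;                  // second check, under the lock
                if (local == null) {
                    local = new BigDecimal("2.0"); // new, dup, ldc, invokespecial <init>
                    instance = local;              // volatile write publishes the finished object
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        System.out.println(get() == get());
    }
}
```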
  • 42. Demo: javap -c
  • 43. Conditionals
    Original:
      if ( x > 0 ) {
          return true;
      } else {
          return false;
      }
    Byte Code:
      0: iload_0
      1: ifle 6
      4: iconst_1
      5: ireturn
      6: iconst_0
      7: ireturn
    Original:
      return x > 0 ? true : false;
    Byte Code:
      0: iload_0
      1: ifle 8
      4: iconst_1
      5: goto 9
      8: iconst_0
      9: ireturn
    Original:
      return ( x > 0 );
    Byte Code:
      0: iload_0
      1: ifle 8
      4: iconst_1
      5: goto 9
      8: iconst_0
      9: ireturn
    Three ways to write a method that checks if a number is greater than 0. The byte code is almost the same in all 3 cases.
  • 44. Invoke Static
    Original:
      Math.max(10, 20);
    Decompiled:
      0: bipush 10
      2: bipush 20
      4: invokestatic Math.max
      7: pop
      8: return
    Here, we see an extra pop after the invokestatic call. That’s because the return value of max is left on the stack. Since we don’t use it, the compiler generates a pop to discard it. If we stored the value in a variable, the pop would be replaced with an istore.
  • 45. Invocations
    Original:
      FileInputStream in = new FileInputStream("foo");
      in.close();
    Decompiled:
       0: new FileInputStream
       3: dup
       4: ldc "foo"
       6: invokespecial FileInputStream.<init>
       9: astore_0
      10: aload_0
      11: invokevirtual FileInputStream.close
      14: return
    Original:
      Closeable in = new FileInputStream("foo");
      in.close();
    Decompiled:
       0: new FileInputStream
       3: dup
       4: ldc "foo"
       6: invokespecial FileInputStream.<init>
       9: astore_0
      10: aload_0
      11: invokeinterface Closeable.close
      16: return
    In one example, close is called on the class type FileInputStream; in the other, it is called on the interface type Closeable.
    In the first case, the compiler generates an invokevirtual call.
    In the second case, the compiler generates an invokeinterface call.
  • 46. For Loop
      static int sum( int min, int max ) {
          int sum = 0;
          for ( int i = min; i < max; ++i ) {
              sum += i;
          }
          return sum;
      }

      before loop
         0 iconst_0
         1 istore_2
      init & test
         2 iload_0
         3 istore_3
         4 goto +10 //14
      loop body
         7 iload_2
         8 iload_3
         9 iadd
        10 istore_2
      inc
        11 iinc 3 by 1
      test
        14 iload_3
        15 iload_1
        16 if_icmplt -9 //7
      after loop
        19 iload_2
        20 ireturn
    Examine a for loop example.
    The first 2 ops are the initialization of “sum”: load 0 and store it in “sum” (slot 2).
    The next 3 ops are the loop initialization and the jump to the initial test...
    - load the value of “min” (slot 0) into “i” (slot 3)
    - then jump to the test
    The test is placed at the end since it is generally performed after the body and step portions of the loop.
    The test...
    - loads “i” (slot 3) and “max” (slot 1)
    - if “i” is less than “max”, then it jumps back 9 bytes to the start of the loop body
    The loop body...
    - loads and adds “sum” and “i” (slots 2 and 3) and stores the result back into “sum” (slot 2)
    Then the step / increment part of the loop happens...
    - which just increments “i”
    Then we flow straight into the test portion.
    If the test fails, we flow through to the after loop portion. Here, we load “sum” (slot 2) and return the result.
  • 47. Exception Handling
      static int read( InputStream in ) {
          try {
              return in.read();
          } catch ( IOException e ) {
              return -1;
          } finally {
              IoUtils.closeQuietly( in );
          }
      }

      try / finally
         0 aload_0
         1 invokevirtual InputStream.read
         4 istore_1
         5 aload_0
         6 invokestatic IoUtils.closeQuietly
         9 iload_1
        10 ireturn
      catch / finally
        11 pop
        12 aload_0
        13 invokestatic IoUtils.closeQuietly
        16 iconst_m1
        17 ireturn
      finally
        18 astore_2
        19 aload_0
        20 invokestatic IoUtils.closeQuietly
        23 aload_2
        24 athrow

      Exception Table
        start  end  handler  Exception
        0      5    11       IOException
        0      5    18       any
        11     12   18       any
    Now, exception handling...
    Exceptions are handled through extra meta-information that says how to handle different types of exceptions over a range of byte code instructions.
    The finally portion is inlined in the try, catch, and finally portions of the generated byte code. (Prior to Java 6, the regular javac compiler generated “jsr” and “ret” to jump to a single block of compiled “finally” code.)
    The “try / finally” section represents the normal flow.
    - invoke InputStream.read
    - store the result into an unnamed temporary variable (slot 1) because we need to run the finally code
    - run the finally code
    - reload the temporary variable and return
    The “catch / finally” section is the catching of the IOException...
    The exception table says that if an IOException is raised between instructions 0 and 5 (the try), jump to 11, this catch section.
    The first step is a “pop”. Pop what? In this case, the IOException, which was automatically placed on the stack. Since we don’t use it, we discard it. This implies that “e” is never assigned a local variable slot by the compiler.
    Now, invoke IoUtils.closeQuietly (the finally block), then return -1.
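One way to picture the inlining described above is to write the copies of the finally block out by hand. In this sketch, `closeQuietly` is a local stand-in for the slide's `IoUtils.closeQuietly`, and `readInlined` and `failing` are hypothetical helpers; the hand-written version approximates the byte code layout rather than reproducing it exactly.

```java
import java.io.ByteArrayInputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;

class FinallyInlining {
    static int closeCalls = 0;

    // Local stand-in for the slide's IoUtils.closeQuietly (hypothetical helper).
    static void closeQuietly(Closeable c) {
        closeCalls++;
        try { c.close(); } catch (IOException ignored) { }
    }

    // What the programmer writes.
    static int read(InputStream in) {
        try {
            return in.read();
        } catch (IOException e) {
            return -1;
        } finally {
            closeQuietly(in);
        }
    }

    // Roughly what the compiler emits: one copy of the finally body per exit path.
    static int readInlined(InputStream in) {
        int result;
        try {
            result = in.read();
        } catch (IOException e) {
            closeQuietly(in);   // copy in the "catch / finally" section
            return -1;
        } catch (Throwable t) {
            closeQuietly(in);   // copy in the catch-all "finally" handler
            throw t;            // corresponds to athrow
        }
        closeQuietly(in);       // copy on the normal "try / finally" path
        return result;
    }

    // A stream whose read always fails, to exercise the catch path.
    static InputStream failing() {
        return new InputStream() {
            @Override public int read() throws IOException { throw new IOException("boom"); }
        };
    }

    public static void main(String[] args) {
        System.out.println(read(new ByteArrayInputStream(new byte[] { 42 })));
        System.out.println(readInlined(failing()));
    }
}
```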
  • 48. Synchronization
      void inc() {
          synchronized ( this ) {
              ++this.num;
          }
      }

      before try
         0 aload_0
         1 dup
         2 astore_1
         3 monitorenter
      try / finally
         4 aload_0
         5 dup
         6 getfield Counter.num
         9 iconst_1
        10 iadd
        11 putfield Counter.num
        14 aload_1
        15 monitorexit
        16 goto +6 //22
      finally
        19 aload_1
        20 monitorexit
        21 athrow
        22 return

      Exception Table
        start  end  handler  Exception
        4      16   19       any
        19     21   19       any
    Interestingly enough, synchronization works the same way. To understand synchronization, it is better to look at it as a lock and unlock within a try / finally. And, that’s exactly how the byte code works. And, just like a regular try / finally, the finally is inlined in both the try and the finally.
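The lock / try / finally reading of a synchronized block can be sketched with an explicit lock. Here `ReentrantLock` is only a stand-in for the object's monitor (the byte code uses monitorenter and monitorexit, not this class), and `MonitorSketch` and `hammer` are names invented for the example.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical counter: the same increment guarded two ways.
class MonitorSketch {
    private int syncCount;
    private int lockCount;
    private final ReentrantLock lock = new ReentrantLock(); // stand-in for the monitor

    void incSynchronized() {
        synchronized (this) {      // monitorenter ... monitorexit
            ++syncCount;
        }
    }

    void incExplicit() {
        lock.lock();               // plays the role of monitorenter
        try {
            ++lockCount;
        } finally {
            lock.unlock();         // plays the role of monitorexit on both paths
        }
    }

    // Runs both increments from several threads and returns {syncCount, lockCount}.
    static int[] hammer(int threads, int perThread) {
        MonitorSketch counter = new MonitorSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.incSynchronized();
                    counter.incExplicit();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return new int[] { counter.syncCount, counter.lockCount };
    }

    public static void main(String[] args) {
        int[] counts = hammer(4, 1000);
        System.out.println(counts[0] + " " + counts[1]);
    }
}
```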
  • 49. Synchronization
      void inc() {
          lock( this );
          try {
              ++this.num;
          } finally {
              unlock( this );
          }
      }
    The byte code and exception table are the same as on the previous slide: monitorenter plays the role of lock( this ), and monitorexit plays the role of unlock( this ).
    Interestingly enough, synchronization works the same way. To understand synchronization, it is better to look at it as a lock and unlock within a try / finally. And, that’s exactly how the byte code works. And, just like a regular try / finally, the finally is inlined in both the try and the finally.
  • 50. Demo: Java 5, Java 7
    In these demos, I demonstrate new language features by showing Java 5 and Java 7 code and then showing what it looks like when it’s decompiled back into Java 4 code.
    JAD - http://www.varaneckas.com/jad
  • 51. Java 5
    JAD - http://www.varaneckas.com/jad
  • 52. Auto-Boxing
    Original:
      public class AutoBoxing {
          public static void main(String[] args) {
              Integer foo = 20;
              Integer bar = 30;
              int sum = foo + bar;
              System.out.println(sum);
          }
      }
    Decompiled as Java 4:
      public class AutoBoxing {
          public static void main(String args[]) {
              Integer foo = Integer.valueOf(20);
              Integer bar = Integer.valueOf(30);
              int sum = foo.intValue() + bar.intValue();
              System.out.println(sum);
          }
      }
    Here, we see how auto-boxing works. The compiler injects the necessary calls to Integer.valueOf and Integer.intValue for us.
    NOTE: Even if you don’t like auto-boxing, please call Integer.valueOf rather than calling new Integer. Unlike new, Integer.valueOf returns cached instances of Integer for commonly used values.
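The caching mentioned in the note is observable with a reference comparison. This is a small sketch with an invented class name; it relies on the documented guarantee that Integer.valueOf caches values from -128 to 127, and makes no claim about values outside that range.

```java
// Hypothetical demo of the Integer.valueOf cache behind auto-boxing.
class BoxingCache {
    // Boxes the same int twice and compares the references, not the values.
    static boolean sameInstance(int value) {
        Integer a = value;   // compiled as Integer.valueOf(value)
        Integer b = value;   // compiled as Integer.valueOf(value)
        return a == b;       // reference comparison on purpose
    }

    public static void main(String[] args) {
        System.out.println(sameInstance(20));
        // Outside -128..127 the result depends on the JVM; use equals() in real code.
        System.out.println(sameInstance(100_000));
    }
}
```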
  • 53. Enhanced For
    Original:
      public class EnhancedFor {
          static void array(String[] args) {
              for ( String arg : args ) {
                  System.out.println(arg);
              }
          }

          static void iterable(Iterable<String> args) {
              for ( String arg : args ) {
                  System.out.println(arg);
              }
          }
      }
    Decompiled as Java 4:
      public class EnhancedFor {
          static void array(String args[]) {
              String arr$[] = args;
              int len$ = arr$.length;
              for (int i$ = 0; i$ < len$; i$++) {
                  String arg = arr$[i$];
                  System.out.println(arg);
              }
          }

          static void iterable(Iterable args) {
              String arg;
              for (Iterator i$ = args.iterator(); i$.hasNext(); ) {
                  arg = (String) i$.next();
                  System.out.println(arg);
              }
          }
      }
    In this slide, we see how the enhanced for gets handled by the compiler.
    The array for loop converts to the canonical C-style loop, with the one slight difference that the compiler hoists the invariant array length out of the loop. (Although, this is a rather pointless optimization because the JVM would do this at runtime anyway.)
    For an Iterable, a loop that uses an iterator is generated. In this example, we can also see that the compiler injects a cast to the exact type String, too.
  • 54. Var-Args
    Original:
      public final class VarArgs {
          public static void main(String... args) {
              System.out.printf("Hello %s %s", "Jon", "Doe");
          }
      }
    Decompiled as Java 4:
      public final class VarArgs {
          public static transient void main(String[] args) {
              System.out.printf("Hello %s %s", new Object[] {"Jon", "Doe"});
          }
      }
    In this example, we see var-args being used both in the signature and in the call to printf.
    NOTE: I’ve declared a main method with var-args. Since on a byte code level this is still just a String[], this actually works just fine.
    The “transient” modifier in the decompiled Java 4 is a bit amusing. This happens because Java ran out of flag bits to use in Java 5, so they overloaded the “transient” bit, which only applies to fields, to mean “var-args” when applied to methods.
    In the call to printf, we can see that the compiler injects a construction of a new Object[] and passes it as the last arg to printf.
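The shared flag bit can be observed through reflection. In this sketch, `VarArgsFlag`, `join`, and `joinMethod` are names made up for illustration; `Method.isVarArgs`, `Method.getModifiers`, and `Modifier.isTransient` are standard reflection calls, and the second check works because the var-args flag and the transient flag occupy the same bit (0x0080).

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical class with one var-args method to inspect.
class VarArgsFlag {
    static String join(String... parts) {
        return String.join(" ", parts);
    }

    // Looks up join; at the byte code level its parameter is just String[].
    static Method joinMethod() {
        try {
            return VarArgsFlag.class.getDeclaredMethod("join", String[].class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Method m = joinMethod();
        System.out.println("isVarArgs: " + m.isVarArgs());
        // Same bit, read with the field-oriented helper: this is what confuses the decompiler.
        System.out.println("transient bit set: " + Modifier.isTransient(m.getModifiers()));
        // The compiler builds the array for us; passing one explicitly is equivalent.
        System.out.println(join("Jon", "Doe").equals(join(new String[] { "Jon", "Doe" })));
    }
}
```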
  • 55. Enum
    Original:
      public enum AnEnum {
          FOO,
          BAR,
          QUUX
      }
    Decompiled as Java 4:
      public static final class AnEnum extends Enum {
          public static final AnEnum FOO = new AnEnum("FOO", 0);
          public static final AnEnum BAR = new AnEnum("BAR", 1);
          public static final AnEnum QUUX = new AnEnum("QUUX", 2);

          private static final AnEnum[] $VALUES = new AnEnum[]{FOO, BAR, QUUX};

          public static AnEnum[] values() {
              return (AnEnum[]) $VALUES.clone();
          }

          public static AnEnum valueOf(String name) {
              return (AnEnum) Enum.valueOf(AnEnum.class, name);
          }

          private AnEnum(String s, int i) {
              super(s, i);
          }
      }
    For enums, the compiler does a great deal of work on your behalf, even in the simplest case.
    The compiler generates a constructor that takes a label and an ordinal for each entry.
    It then initializes a static final field for each constant from the original file.
    These constants are all placed in a values array.
    Finally, the compiler generates a values() method and a valueOf() method for each enum class.
  • 56. Covariance
    Original:
      public interface Parent {
          Number calculate();
      }

      public class CovariantChild implements Parent {
          public Integer calculate() {
              return 10;
          }
      }
    Decompiled as Java 4:
      public static interface Parent {
          public abstract Number calculate();
      }

      public class CovariantChild implements Parent {
          public Integer calculate() {
              return Integer.valueOf(10);
          }

          public volatile Number calculate() {
              return calculate();
          }
      }
    A lesser known addition to Java 5 is the ability to have a covariant return type. Here, the child type returns a more specific type of Number, namely Integer.
    The generated code is interesting. We end up with two “calculate” methods: one that returns Integer and another that returns Number. The one that returns Number satisfies the contract of the parent and simply calls the more specific version that returns Integer.
    Here, again we see a curious modifier on a method: “volatile”. This is another situation where Java 5 overloaded an existing flag bit.
    For more information on why this is type-safe, look up the Liskov Substitution Principle.
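The two generated “calculate” methods can be listed with reflection. This sketch nests the slide's Parent and CovariantChild inside a hypothetical `BridgeMethods` class so it fits in one file; `Method.isBridge` is the standard way to ask whether the compiler marked a method as a bridge.

```java
import java.lang.reflect.Method;

// Hypothetical wrapper class around the slide's Parent / CovariantChild pair.
class BridgeMethods {
    interface Parent {
        Number calculate();
    }

    static class CovariantChild implements Parent {
        @Override
        public Integer calculate() {
            return 10;
        }
    }

    // Returns {number of methods named calculate, how many of them are bridges}.
    static int[] countCalculateMethods() {
        int total = 0;
        int bridges = 0;
        for (Method m : CovariantChild.class.getDeclaredMethods()) {
            if (m.getName().equals("calculate")) {
                total++;
                if (m.isBridge()) {
                    bridges++;
                }
            }
        }
        return new int[] { total, bridges };
    }

    public static void main(String[] args) {
        for (Method m : CovariantChild.class.getDeclaredMethods()) {
            System.out.println(m.getReturnType().getSimpleName() + " " + m.getName()
                    + " bridge=" + m.isBridge());
        }
    }
}
```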
  • 58. Multi-Catch
    Original:
      public final class EnhancedCatch {
          public static void main(String[] args) {
              try {
                  Class.forName("some.package.SomeClass").newInstance();
              } catch ( InstantiationException |
                        IllegalAccessException |
                        ClassNotFoundException e ) {
                  throw new IllegalStateException(e);
              }
          }
      }
    Decompiled as Java 4:
      public final class EnhancedCatch {
          public static void main(String args[]) {
              try {
                  Class.forName("some.package.SomeClass").newInstance();
              } catch (ReflectiveOperationException e) {
                  throw new IllegalStateException(e);
              }
          }
      }
    Java 7 adds the ability to handle multiple exception types in a single catch. Great for ugly reflection code.
    Here, the catch of all the reflection exceptions simplifies to a single catch of their common parent ReflectiveOperationException (a new base class for reflection exceptions also introduced in Java 7).
  • 59. Try With Resources
    Original:
      public class EnhancedTry {
          public static void main(String[] args) throws IOException {
              Properties properties = new Properties();
              try (InputStream in = new FileInputStream("my.properties")) {
                  properties.load(in);
              }
          }
      }
    Decompiled:
      public class EnhancedTry {
          public static void main(String args[]) throws IOException {
              Properties properties = new Properties();
              InputStream in = new FileInputStream("my.properties");
              Throwable throwable = null;
              try {
                  properties.load(in);
              } catch (Throwable throwable1) {
                  throwable = throwable1;
              } finally {
                  if (in != null) {
                      try {
                          in.close();
                      } catch (Throwable x2) {
                          throwable.addSuppressed(x2);
                          throw throwable;
                      }
                  }
              }
          }
      }
    Java 7 also enhances try by allowing it to automatically close resources.
    It generates a try / finally similar to what you’d write by hand, although it puts the resource acquisition outside the try (which is correct but uncommon among many Java programmers).
    However, it does one more thing: it also adds code so that if an exception happens when closing, the original exception from the body is still propagated. And, even better, the exception raised by close is added to the suppressed list of the original exception using the new Java 7 method Throwable.addSuppressed.
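The suppressed-exception behavior is easy to observe directly. In this sketch, `SuppressedDemo` and its `Noisy` resource are hypothetical; both the body and close() throw, and the caller sees the body's exception with the close() failure attached via getSuppressed().

```java
// Hypothetical demo: the body and close() both fail.
class SuppressedDemo {
    static class Noisy implements AutoCloseable {
        @Override
        public void close() {
            throw new IllegalStateException("from close");
        }
    }

    // Returns the exception that escapes the try-with-resources statement.
    static Throwable run() {
        try (Noisy resource = new Noisy()) {
            throw new IllegalArgumentException("from body");
        } catch (RuntimeException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        Throwable primary = run();
        System.out.println("primary: " + primary.getMessage());
        for (Throwable suppressed : primary.getSuppressed()) {
            System.out.println("suppressed: " + suppressed.getMessage());
        }
    }
}
```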
  • 60. String Switch
    Original:
      switch (args[0]) {
          case "Hello":
              System.out.println("Hello, World!");
              break;
          case "Bye":
              System.out.println("Good Bye, World!");
              break;
          case "9\uffe7":
              System.out.println("Collision");
              break;
      }
    Decompiled:
      byte byte0 = -1;
      switch (args[0].hashCode()) {
          case 69609650:
              ...
              break;
          case 67278:
              if (s.equals("9\uFFE7")) {
                  byte0 = 2;
              } else if (s.equals("Bye")) {
                  byte0 = 1;
              }
              break;
      }
      switch (byte0) {
          case 0:
              System.out.println("Hello, World!");
              break;
          case 1:
              System.out.println("Good Bye, World!");
              break;
          case 2:
              System.out.println("Collision");
              break;
      }
    One last example from Java 7: string switch.
    String switch is implemented as a switch on the String’s hashCode. However, hashCode is not unique, so the generated code must also perform an equals check.
    To handle this, string switch actually generates two switch statements. The first, on the hashCode, assigns a temporary variable an index for the matching case from the original code. The second then switches on that index, each case containing the code from the original Java 7 case.
    Here, I’ve deliberately created a hash collision (“Bye” and “9\uFFE7” both hash to 67278), so you can see how collisions are resolved.
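The two-stage dispatch can be written out by hand and checked against the real hash codes. This is a sketch that mirrors the decompiled shape; the class name `StringSwitchSketch` and the "no match" fallback are additions for illustration, while the hash constants are those shown on the slide.

```java
// Hypothetical hand-written version of the compiler's string switch.
class StringSwitchSketch {
    static String dispatch(String s) {
        int index = -1;
        // Stage 1: switch on the hash code, then confirm with equals.
        switch (s.hashCode()) {
            case 69609650:                          // hash of "Hello"
                if (s.equals("Hello")) index = 0;
                break;
            case 67278:                             // shared by "Bye" and the collision string
                if (s.equals("9\uFFE7")) index = 2;
                else if (s.equals("Bye")) index = 1;
                break;
        }
        // Stage 2: switch on the case index and run the original case body.
        switch (index) {
            case 0: return "Hello, World!";
            case 1: return "Good Bye, World!";
            case 2: return "Collision";
            default: return "no match";
        }
    }

    public static void main(String[] args) {
        System.out.println("Bye".hashCode() == "9\uFFE7".hashCode());
        System.out.println(dispatch("Bye"));
        System.out.println(dispatch("9\uFFE7"));
    }
}
```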
  • 61. Compiler Optimizations
    In the next few examples, I show the original code and the code after it has been decompiled. By doing this, we can see some of the optimizations performed by the compiler.
    JAD - http://www.varaneckas.com/jad
  • 62. Constant Folding
    Original:
      public final class StaticInitializer {
          private static final String LOG_FORMAT = "Started at %d ms";
          private static final long START_TIME = System.currentTimeMillis();
          private static final long START_TIME_2;

          static {
              START_TIME_2 = System.currentTimeMillis();
          }
      }
    Decompiled:
      public final class StaticInitializer {
          private static final String LOG_FORMAT = "Started at %d ms";
          private static final long START_TIME = System.currentTimeMillis();
          private static final long START_TIME_2 = System.currentTimeMillis();
      }
    While modern Java compilers don’t do much optimization, they do some. One example is constant folding: when possible, the compiler computes simple constant expressions at compile time. This even includes string concatenation.
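Folding of string concatenation can be observed indirectly: a folded constant expression becomes an interned literal, so it is the same object as the equivalent literal, whereas a concatenation performed at run time creates a new String. The sketch below uses invented names (`ConstantFolding`, `PREFIX`, `MILLIS_PER_DAY`) and compares references only to make that difference visible.

```java
// Hypothetical demo of compile-time constant folding.
class ConstantFolding {
    static final String PREFIX = "Started at ";
    static final String LOG_FORMAT = PREFIX + "%d ms";       // folded into one literal by javac
    static final int MILLIS_PER_DAY = 24 * 60 * 60 * 1000;   // folded into a single int constant

    // True because both sides are the same interned compile-time constant.
    static boolean foldedIsSameLiteral() {
        return LOG_FORMAT == "Started at %d ms";
    }

    // The parameter is not a constant, so this concatenation happens at run time
    // and yields a new String object, even though the contents are equal.
    static boolean runtimeConcatIsSameObject(String prefix) {
        return (prefix + "%d ms") == LOG_FORMAT;
    }

    public static void main(String[] args) {
        System.out.println(foldedIsSameLiteral());
        System.out.println(runtimeConcatIsSameObject(PREFIX));
        System.out.println(MILLIS_PER_DAY);
    }
}
```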
  • 63. Constant Inlining
    Original:
      public class Inlining {
          public static final String INLINED_VERSION = "1.1.0";
          public static final String NOT_INLINED_VERSION = identity("1.2.0");

          private static String identity(String value) {
              return value;
          }

          public static void print() {
              System.out.println(INLINED_VERSION);
              System.out.println(NOT_INLINED_VERSION);
          }
      }
    Decompiled:
      public class Inlining {
          public static final String INLINED_VERSION = "1.1.0";
          public static final String NOT_INLINED_VERSION = identity("1.2.0");

          private static String identity(String value) {
              return value;
          }

          public static void print() {
              System.out.println("1.1.0");
              System.out.println(NOT_INLINED_VERSION);
          }
      }
    Constants can also be inlined by the compiler.
    In this example, the compiler inlines INLINED_VERSION in the print method; however, it does not inline NOT_INLINED_VERSION. The reason is that NOT_INLINED_VERSION is a complex expression, because a method is invoked.
    This has implications in the byte code, too. INLINED_VERSION will have its value set through a ConstantValue attribute. NOT_INLINED_VERSION will be initialized in a <clinit> method generated by the compiler and called automatically when the class is first loaded.
  • 64. Dead Code Elimination
    Original:
      public class DeadCodeElimination {
          public static final boolean DEBUG_OFF = false;
          public static final boolean DEBUG_ON = true;

          public static void main(String[] args) {
              if ( DEBUG_OFF ) {
                  System.out.println("never");
              }
              if ( DEBUG_ON ) {
                  System.out.println("always");
              }
          }
      }
    Decompiled:
      public class DeadCodeElimination {
          public static final boolean DEBUG_OFF = false;
          public static final boolean DEBUG_ON = true;

          public static void main(String args[]) {
              System.out.println("always");
          }
      }
    Along with inlining, the compiler can perform dead code elimination.
    In this case, DEBUG_OFF is never true, so the “never” print out is not generated by the compiler.
    Even in the DEBUG_ON case, the compiler realizes the if is always true and simply includes an unconditional print of “always”.
  • 65. Runtime Optimizations
  • 66. HotSpot Lifecycle
      1 Interpreted -> 2 Profiling -> 3 Dynamic Compilation -> 4 Dynamic Decompilation (deoptimization) -> back to 1
    Client compilation kicks in after about 1,500 invocations by default (-XX:CompileThreshold)
    Server compilation kicks in after about 10,000 invocations by default
    Tiered compilation - C0, C1, C2
    Method Replacement vs On-Stack Replacement
    http://java.sun.com/products/hotspot/whitepaper.html
    http://openjdk.java.net/groups/hotspot/docs/HotSpotGlossary.html
    http://www.azulsystems.com/blog/cliff-click/2010-07-16-tiered-compilation
    http://www.slideshare.net/drorbr/so-you-want-to-write-your-own-benchmark-presentation
  • 67. Is This Optimized?
      double sumU = 0, sumV = 0;
      for ( int i = 0; i < 100; ++i ) {
          Vector2D vector = new Vector2D( i, i );
          synchronized ( vector ) {
              sumU += vector.getU();
              sumV += vector.getV();
          }
      }

      How many...?
        Loop Iterations     100
        Heap Allocations    100
        Method Invocations  200
        Lock Acquisitions   100
    Let’s start the runtime optimization discussion with a simple question. Is this optimized?
    How many loop iterations does it do? 100
    How many heap allocations? 100
    How many method invocations? 200
    How many lock acquisitions? 100
    Surprisingly enough, the answer to all of these may actually be zero.
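For anyone who wants to experiment with this snippet, here is a self-contained version. The slide does not show Vector2D, so the class below is an assumed shape (two double fields with getters), and `EscapeCandidate` and `sums` are names invented for this sketch. Whatever the JIT removes, the observable result has to stay the same; whether a given JVM actually eliminates the allocations and locks depends on its version and flags.

```java
// Hypothetical runnable version of the slide's loop.
class EscapeCandidate {
    // Assumed shape of the slide's Vector2D: an immutable pair of doubles.
    static final class Vector2D {
        private final double u;
        private final double v;

        Vector2D(double u, double v) {
            this.u = u;
            this.v = v;
        }

        double getU() { return u; }
        double getV() { return v; }
    }

    // Runs the loop from the slide and returns {sumU, sumV}.
    static double[] sums() {
        double sumU = 0, sumV = 0;
        for (int i = 0; i < 100; ++i) {
            Vector2D vector = new Vector2D(i, i);   // never escapes this method
            synchronized (vector) {                 // lock on an object no other thread can see
                sumU += vector.getU();
                sumV += vector.getV();
            }
        }
        return new double[] { sumU, sumV };
    }

    public static void main(String[] args) {
        double[] result = sums();
        System.out.println(result[0] + " " + result[1]);
    }
}
```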
  • 68. Is This Optimized?
    The same code as on the previous slide, after the JVM has optimized it.
      How many...?
        Loop Iterations     0
        Heap Allocations    0
        Method Invocations  0
        Lock Acquisitions   0
    Surprisingly enough, the answer to all of these may actually be zero.
  • 69. Common Sub-Expression Elimination
    Before: int x = a + b; int y = a + b;
    After: int tmp = a + b; int x = tmp; int y = tmp;
    Among the simplest optimizations is common sub-expression elimination. Here the VM optimizes the code by calculating “a+b” only once.
    http://www.slideshare.net/drorbr/so-you-want-to-write-your-own-benchmark-presentation
  • 70. Array Bounds Check Elimination
    As written: int[] nums = ... for ( int i = 0; i < nums.length; ++i ) { System.out.println( "nums[" + i + "]=" + nums[ i ] ); }
    With the implicit bounds check made explicit: int[] nums = ... for ( int i = 0; i < nums.length; ++i ) { if ( i < 0 || i >= nums.length ) { throw new ArrayIndexOutOfBoundsException(); } System.out.println( "nums[" + i + "]=" + nums[ i ] ); }
    One of the nice things about the VM is that we do not have to worry about buffer overruns, because the VM checks array bounds for us. But how much is that costing us? In short, nothing in loops like this one: the VM recognizes common patterns and realizes that it does not need to generate the bounds-checking code.
    http://www.cs.umd.edu/~vibha/330/array-bounds.pdf
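The semantics the VM must preserve can be seen in a small sketch. The names below are hypothetical; the point is only that an out-of-range index always throws, while a loop bounded by nums.length can never trigger the check.

```java
// Hypothetical demo of the bounds-check semantics the VM must preserve.
final class BoundsCheckDemo {

    // The index is provably within [0, nums.length), so a JIT compiler may
    // be able to drop the per-access check here.
    static int sum(int[] nums) {
        int total = 0;
        for (int i = 0; i < nums.length; ++i) {
            total += nums[i];
        }
        return total;
    }

    // Reading one element past the end must always throw.
    static boolean readPastEndThrows(int[] nums) {
        try {
            int ignored = nums[nums.length];
            return false;
        } catch (ArrayIndexOutOfBoundsException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] nums = { 1, 2, 3 };
        System.out.println(sum(nums));
        System.out.println(readPastEndThrows(nums));
    }
}
```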
  • 71. Loop Invariant Hoisting
    Before: for ( int i = 0; i < nums.length; ++i ) { ... }
    After: int length = nums.length; for ( int i = 0; i < length; ++i ) { ... }
    The VM can also realize that the length of the array does not change, so it can replace the lookup of the array length on each test with a single store to a temporary variable and compare against that instead.
    http://java.sun.com/products/hotspot/docs/whitepaper/Java_Hotspot_v1.4.1/Java_HSpot_WP_v1.4.1_1002_4.html
  • 72. Loop Unrolling
    Before: int sum = 0; for ( int i = 0; i < 10; ++i ) { sum += i; }
    After: int sum = 0; sum += 1; ... sum += 9;
    In some situations, the loop can even be unrolled into a simple linear code segment.
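The equivalence can be checked by hand. The sketch below, with hypothetical names, writes out the rolled loop and a fully unrolled version side by side; the unrolled form is written manually here only to illustrate what the VM may do on its own.

```java
// Hypothetical side-by-side of a rolled loop and its hand-unrolled equivalent.
final class UnrollDemo {

    // The loop as a programmer would normally write it.
    static int rolled() {
        int sum = 0;
        for (int i = 0; i < 10; ++i) {
            sum += i;
        }
        return sum;
    }

    // The same computation as straight-line code: no counter, no branch.
    static int unrolled() {
        int sum = 0;
        sum += 0;
        sum += 1;
        sum += 2;
        sum += 3;
        sum += 4;
        sum += 5;
        sum += 6;
        sum += 7;
        sum += 8;
        sum += 9;
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(rolled() + " " + unrolled());
    }
}
```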
  • 73. Method Inlining
    Original call: Vector vector = ... double magnitude = vector.magnitude();
    Inlined: Vector vector = ... double magnitude = Math.sqrt( vector.u*vector.u + vector.v*vector.v );
    Inlined behind a type guard: Vector vector = ... double magnitude; if ( vector instanceof Vector2D ) { magnitude = Math.sqrt( vector.u*vector.u + vector.v*vector.v ); } else { magnitude = vector.magnitude(); }
    How often each kind of call is inlined: static: always. final: always. private: always. virtual: often. reflective: sometimes. dynamic: often.
    http://www.ibm.com/developerworks/library/j-jtp12214/
    http://openjdk.java.net/groups/hotspot/docs/HotSpotGlossary.html
    http://blog.headius.com/2009/01/my-favorite-hotspot-jvm-flags.html
    http://java.sun.com/developer/technicalArticles/Networking/HotSpot/inlining.html
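The guarded form of inlining can be written out by hand to see what it means. This is a sketch under assumptions: Vec, Vec2D, and Vec3D are hypothetical stand-ins for the slide's Vector types, and the guard is hand-written source rather than anything the VM exposes.

```java
// Hypothetical stand-ins for the slide's Vector and Vector2D types.
interface Vec {
    double magnitude();
}

final class Vec2D implements Vec {
    final double u;
    final double v;

    Vec2D(double u, double v) {
        this.u = u;
        this.v = v;
    }

    @Override
    public double magnitude() {
        return Math.sqrt(u * u + v * v);
    }
}

final class Vec3D implements Vec {
    final double u;
    final double v;
    final double w;

    Vec3D(double u, double v, double w) {
        this.u = u;
        this.v = v;
        this.w = w;
    }

    @Override
    public double magnitude() {
        return Math.sqrt(u * u + v * v + w * w);
    }
}

final class InliningSketch {

    // Plain virtual call: the target depends on the runtime type.
    static double plainCall(Vec vector) {
        return vector.magnitude();
    }

    // Hand-written equivalent of a guarded inline: take the fast path when
    // the common type is seen, otherwise fall back to the virtual call.
    static double guardedInline(Vec vector) {
        if (vector instanceof Vec2D) {
            Vec2D v2 = (Vec2D) vector;
            return Math.sqrt(v2.u * v2.u + v2.v * v2.v);
        }
        return vector.magnitude();
    }

    public static void main(String[] args) {
        System.out.println(plainCall(new Vec2D(3, 4)));
        System.out.println(guardedInline(new Vec2D(3, 4)));
        System.out.println(guardedInline(new Vec3D(1, 2, 2)));
    }
}
```

Both paths return the same value for every input, which is what makes the transformation safe for the VM to apply based on profiling.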
  • 74. Lock Coarsening
    Source: StringBuffer buffer = ... buffer.append( "Hello" ); buffer.append( name ); buffer.append( "\n" );
    With the implicit locks shown: StringBuffer buffer = ... lock( buffer ); buffer.append( "Hello" ); unlock( buffer ); lock( buffer ); buffer.append( name ); unlock( buffer ); lock( buffer ); buffer.append( "\n" ); unlock( buffer );
    After coarsening: StringBuffer buffer = ... lock( buffer ); buffer.append( "Hello" ); buffer.append( name ); buffer.append( "\n" ); unlock( buffer );
    Starting in Java 5, HotSpot optimizes locks by performing lock coarsening. The VM realizes that repeatedly acquiring and releasing the same lock is wasteful, so it may take a single larger lock instead.
    http://java.sun.com/performance/reference/whitepapers/6_performance.html#2.1
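In real Java the lock( ) and unlock( ) steps are implicit in StringBuffer's synchronized methods. The sketch below, with hypothetical class and method names, contrasts the ordinary code with a hand-coarsened version that holds one outer lock; it illustrates the idea and is not a recommendation to write the second form.

```java
// Hypothetical comparison of per-call locking and one manually widened lock.
final class LockCoarseningSketch {

    // Each append() acquires and releases the buffer's monitor on its own.
    static String separateLocks(String name) {
        StringBuffer buffer = new StringBuffer();
        buffer.append("Hello");
        buffer.append(name);
        buffer.append("\n");
        return buffer.toString();
    }

    // One outer synchronized block: the inner acquisitions are re-entrant,
    // so the monitor is effectively taken once for all three appends.
    static String singleLock(String name) {
        StringBuffer buffer = new StringBuffer();
        synchronized (buffer) {
            buffer.append("Hello");
            buffer.append(name);
            buffer.append("\n");
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(separateLocks("World"));
        System.out.print(singleLock("World"));
    }
}
```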
  • 75. Other Lock Optimizations
    Biased Locking
    Adaptive Locking - Thread sleep vs. Spin lock
    Even more lock optimizations are possible:
    - biased locking: makes it cheap for the thread that last acquired a lock to acquire it again
    - adaptive locking: dynamically detects whether a lock is usually held for a short or a long period; if it is long, the waiting thread is put to sleep; if it is short, the waiting thread simply spins
    http://java.sun.com/performance/reference/whitepapers/6_performance.html#2.1
  • 76. Escape Analysis
    Point p1 = new Point( x1, y1 ), p2 = new Point( x2, y2 ); synchronized ( p1 ) { synchronized ( p2 ) { double dx = p1.getX() - p2.getX(); double dy = p1.getY() - p2.getY(); double distance = Math.sqrt( dx*dx + dy*dy ); } }
    In Java 7, escape analysis is finally on by default. With escape analysis, the VM can realize that an object never escapes a stack frame, allowing it to...
    - elide heap allocation
    - elide locks
  • 77. Escape Analysis
    With the locks elided: Point p1 = new Point( x1, y1 ), p2 = new Point( x2, y2 ); double dx = p1.getX() - p2.getX(); double dy = p1.getY() - p2.getY(); double distance = Math.sqrt( dx*dx + dy*dy );
  • 78. Escape Analysis
    Before: Point p1 = new Point( x1, y1 ), p2 = new Point( x2, y2 ); double dx = p1.getX() - p2.getX(); double dy = p1.getY() - p2.getY(); double distance = Math.sqrt( dx*dx + dy*dy );
    With the heap allocations elided as well: double dx = x1 - x2; double dy = y1 - y2; double distance = Math.sqrt( dx*dx + dy*dy );
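A runnable sketch of the two ends of this transformation follows. The Point class is an assumed minimal implementation (the slides show only its constructor and getters), and the scalar version is hand-written to show what the VM may effectively produce once it proves the points never escape.

```java
// Assumed minimal Point: the slides only show its constructor and getters.
final class Point {
    private final double x;
    private final double y;

    Point(double x, double y) {
        this.x = x;
        this.y = y;
    }

    double getX() { return x; }
    double getY() { return y; }
}

final class EscapeAnalysisSketch {

    // As written on the first slide: two allocations and two locks, none of
    // which is visible outside this method.
    static double distanceWithObjects(double x1, double y1, double x2, double y2) {
        Point p1 = new Point(x1, y1), p2 = new Point(x2, y2);
        synchronized (p1) {
            synchronized (p2) {
                double dx = p1.getX() - p2.getX();
                double dy = p1.getY() - p2.getY();
                return Math.sqrt(dx * dx + dy * dy);
            }
        }
    }

    // Hand-written end state: no objects, no locks, just arithmetic.
    static double distanceScalar(double x1, double y1, double x2, double y2) {
        double dx = x1 - x2;
        double dy = y1 - y2;
        return Math.sqrt(dx * dx + dy * dy);
    }

    public static void main(String[] args) {
        System.out.println(distanceWithObjects(0, 0, 3, 4));
        System.out.println(distanceScalar(0, 0, 3, 4));
    }
}
```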
  • 79. Runtime Demo
    http://code.google.com/p/caliper/
    To conclude the runtime optimization section, I’ll show some micro-benchmarks illustrating some of the optimizations. Writing micro-benchmarks for a dynamically optimizing VM is devilishly hard. Fortunately, Google created a tool called Caliper to make it easy. You can write JUnit 3-style benchmark classes to compare various implementation options.
    http://www.slideshare.net/drorbr/so-you-want-to-write-your-own-benchmark-presentation
  • 80. Loop Variable Placement
    Inside: for ( int i = 0; i < ints.length; ++i ) { int x = ints[i]; sum += x; }
    vs. Outside: int x; for ( int i = 0; i < ints.length; ++i ) { x = ints[i]; sum += x; }
    vs. No Variable: for ( int i = 0; i < ints.length; ++i ) { sum += ints[i]; }
    First, let’s look at loop variable placement: declaring the loop variable inside the loop vs. outside vs. using no variable at all. All three take the same amount of time to run. In fact, declaring inside or outside produces the same byte code. My recommendation: for a one-line loop body, skip the variable. For a complicated loop body, declare the variable inside to keep the code easier to read and refactor.
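The three placements can be put in one class to compare them. The class and method names below are hypothetical; the sketch only confirms that the variants compute the same result and leaves timing to a proper benchmarking tool.

```java
// Hypothetical class holding the three loop-variable placements.
final class LoopVariablePlacement {

    // Variable declared inside the loop body.
    static int inside(int[] ints) {
        int sum = 0;
        for (int i = 0; i < ints.length; ++i) {
            int x = ints[i];
            sum += x;
        }
        return sum;
    }

    // Variable declared once, before the loop.
    static int outside(int[] ints) {
        int sum = 0;
        int x;
        for (int i = 0; i < ints.length; ++i) {
            x = ints[i];
            sum += x;
        }
        return sum;
    }

    // No temporary variable at all.
    static int noVariable(int[] ints) {
        int sum = 0;
        for (int i = 0; i < ints.length; ++i) {
            sum += ints[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] ints = { 1, 2, 3, 4 };
        System.out.println(inside(ints) + " " + outside(ints) + " " + noVariable(ints));
    }
}
```

Disassembling the compiled class with `javap -c LoopVariablePlacement` is one way to compare the byte code of inside() and outside() for yourself.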
  • 81. Loop Invariant Hoisting
    Regular For: for ( int i = 0; i < ints.length; ++i ) { sum += ints[i]; }
    vs. Manual Hoisting: for ( int i = 0, len = ints.length; i < len; ++i ) { sum += ints[i]; }
    vs. Enhanced For: for ( int x : ints ) { sum += x; }
    Now, we’ll compare...
    - the canonical loop, which checks i against array.length on each test
    - manually hoisting the length into a len temporary variable
    - using Java 5’s enhanced for
    Once again, they all take the same amount of time because the VM performs the hoisting for us.
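For reference, the three loop forms as complete methods. The names are hypothetical and the sketch makes no timing claim of its own; it only shows that the forms are interchangeable in result, so the choice can be made on readability.

```java
// Hypothetical class holding the three loop forms from the slide.
final class LoopHoistingForms {

    // Canonical loop: reads ints.length in every loop test.
    static long regularFor(int[] ints) {
        long sum = 0;
        for (int i = 0; i < ints.length; ++i) {
            sum += ints[i];
        }
        return sum;
    }

    // Length hoisted by hand into a local variable.
    static long manualHoisting(int[] ints) {
        long sum = 0;
        for (int i = 0, len = ints.length; i < len; ++i) {
            sum += ints[i];
        }
        return sum;
    }

    // Enhanced for loop, available since Java 5.
    static long enhancedFor(int[] ints) {
        long sum = 0;
        for (int x : ints) {
            sum += x;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] ints = { 5, 10, 15 };
        System.out.println(regularFor(ints) + " " + manualHoisting(ints) + " " + enhancedFor(ints));
    }
}
```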
  • 82. Field Access
    Direct: point.x point.y
    vs. Virtual Accessor: point.getX() point.getY()
    vs. Interface Accessor: point.getX() point.getY()
    Next, we’ll look at direct field access vs. using a virtual accessor method vs. using an interface accessor method. Once again, the VM can optimize all of these by performing method inlining, so all three take the same amount of time.
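The three access styles differ only in the static type used at the call site. The sketch below uses hypothetical types, Coordinates and MutablePoint, since the slides do not show the point class used in the benchmark.

```java
// Hypothetical interface exposing the accessors.
interface Coordinates {
    double getX();
    double getY();
}

// Hypothetical point class with directly accessible fields and virtual getters.
class MutablePoint implements Coordinates {
    double x;
    double y;

    MutablePoint(double x, double y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public double getX() { return x; }

    @Override
    public double getY() { return y; }
}

final class FieldAccessSketch {

    // Direct field reads.
    static double direct(MutablePoint point) {
        return point.x + point.y;
    }

    // Virtual calls through the class type.
    static double virtualAccessor(MutablePoint point) {
        return point.getX() + point.getY();
    }

    // Interface calls through the interface type.
    static double interfaceAccessor(Coordinates point) {
        return point.getX() + point.getY();
    }

    public static void main(String[] args) {
        MutablePoint p = new MutablePoint(1, 2);
        System.out.println(direct(p) + " " + virtualAccessor(p) + " " + interfaceAccessor(p));
    }
}
```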
  • 83. Loop Variable Placement
    StringBuilder - no locks: StringBuilder builder = new StringBuilder(); builder.append( "foo" ); builder.append( "bar" ); builder.append( "baz" );
    vs. StringBuffer - multiple locks: StringBuffer buffer = new StringBuffer(); buffer.append( "foo" ); buffer.append( "bar" ); buffer.append( "baz" );
    vs. StringBuffer - single lock: StringBuffer buffer = new StringBuffer(); synchronized( buffer ) { buffer.append( "foo" ); buffer.append( "bar" ); buffer.append( "baz" ); }
    Now, revisiting locking, compare...
    - Java 5’s StringBuilder, which performs no locking
    - plain StringBuffer code with multiple separate appends
    - StringBuffer with a manually added bigger lock
    The no-lock version does come out slightly ahead, but it is close. And the attempt to manually improve performance by taking a single bigger lock actually comes in last.
  • 84. Heap Elision Benchmark
    Primitive Array: Arrays.sort(new int[]{...});
    vs. Boxed Array - no Comparator: Arrays.sort(new Integer[]{...});
    vs. Boxed Array - singleton Comparator: Arrays.sort( new Integer[]{...}, IntComparator.INSTANCE);
    vs. Boxed Array - anonymous Comparator: Arrays.sort( new Integer[]{...}, new Comparator<Integer>() { ... });
    Lastly, let’s look at heap elision by sorting some arrays. No surprise, the primitive array is the most performant. But the no-Comparator case, the singleton Comparator case, and the anonymous Comparator case all perform the same. Even creating an anonymous Comparator every time does not impact performance much: in Java 7, no heap allocation may take place at all.
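The four variants can be written out in full as follows. This is a sketch under assumptions: the slides elide the array contents and the comparator bodies, so the three-element arrays and the natural-order comparisons below are placeholders, and the IntComparator shown is a hypothetical singleton.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical singleton comparator; the slides do not show its body.
final class IntComparator implements Comparator<Integer> {
    static final IntComparator INSTANCE = new IntComparator();

    private IntComparator() {
    }

    @Override
    public int compare(Integer a, Integer b) {
        return a.compareTo(b);
    }
}

final class SortVariants {

    // Primitive array: no boxing, no comparator.
    static int[] primitive() {
        int[] values = { 3, 1, 2 };
        Arrays.sort(values);
        return values;
    }

    // Boxed array sorted by natural ordering.
    static Integer[] boxedNoComparator() {
        Integer[] values = { 3, 1, 2 };
        Arrays.sort(values);
        return values;
    }

    // Boxed array sorted with a shared comparator instance.
    static Integer[] boxedSingleton() {
        Integer[] values = { 3, 1, 2 };
        Arrays.sort(values, IntComparator.INSTANCE);
        return values;
    }

    // Boxed array sorted with a comparator allocated on every call.
    static Integer[] boxedAnonymous() {
        Integer[] values = { 3, 1, 2 };
        Arrays.sort(values, new Comparator<Integer>() {
            @Override
            public int compare(Integer a, Integer b) {
                return a.compareTo(b);
            }
        });
        return values;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(primitive()));
        System.out.println(Arrays.toString(boxedNoComparator()));
        System.out.println(Arrays.toString(boxedSingleton()));
        System.out.println(Arrays.toString(boxedAnonymous()));
    }
}
```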
  • 85. Is This Optimized?
    double sumU = 0, sumV = 0; for ( int i = 0; i < 100; ++i ) { Vector2D vector = new Vector2D( i, i ); synchronized ( vector ) { sumU += vector.getU(); sumV += vector.getV(); } }
    How many...? Loop Iterations: 0. Heap Allocations: 0. Method Invocations: 0. Lock Acquisitions: 0.
    So now, hopefully, you can see how this may truly be optimized already. Just write clean code and trust the VM to make it fast. If you must optimize, always profile first and use a micro-benchmarking tool like Caliper.
  • 86. Recommended Reading
    - Java Puzzlers, by Joshua Bloch and Neal Gafter: http://www.javapuzzlers.com/
    - Java Specialist Newsletter: http://www.javaspecialists.eu
    - Brian Goetz’s Articles: http://www.ibm.com/developerworks/views/java/libraryview.jsp?contentarea_by=Java+technology&search_by=brian+goetz