An Overview of MSIL
The .NET architecture addresses an important need - language interoperability. Instead of generating
native code that is specific to one platform, programming languages can generate code in MSIL
(Microsoft Intermediate Language) targeting the Common Language Runtime (CLR) to reap the rich
benefits provided by .NET.
Advanced programmers occasionally peek into MSIL code (using the Ildasm tool) when they are in
doubt about what is happening under the hood. It is therefore essential that the C# programmer
understands the basics of MSIL. This beginner-level article gives an overview of MSIL and of debugging
with the Ildasm tool.
System Requirements
The programming examples in this article use C# as the source language for generating MSIL code,
and so the reader is expected to have some basic understanding of C#. No prior exposure to MSIL is
necessary. In addition, the reader is assumed to know what a stack data structure is. It is preferable that
the reader has access to the Ildasm tool and the C# compiler.
Article Structure
The article has three main sections:
● An Overview of MSIL: The basics of MSIL, the data types, instruction types, and the way that
the instructions are executed are explained.
● Examining MSIL: This section covers MSIL using simple example programs.
● Debugging Using the Ildasm tool: Explains the use of the intermediate language disassembler
(Ildasm) and the way it can be used for debugging.
.NET supports several high-level languages such as C#, VB.NET and Managed C++.NET. MSIL is
designed to accommodate a wide range of languages. In .NET, the unit of deployment is the PE
(Portable Executable) file - a predefined binary standard (similar to the class files of Java). MSIL,
along with metadata, is stored inside the PE files generated by the compiler. Metadata describes the
types (their definitions, signatures, etc.) and is useful at runtime. MSIL itself is such a simple
language that it doesn't require much effort to understand.
An Overview of MSIL
MSIL is a CPU-independent, stack-based instruction set that can be efficiently converted to the native
code of a specific platform. In this stack-based approach, the representation assumes the presence of a
run-time stack, and the code is generated with that stack in mind. The runtime environment may use
the stack to evaluate expressions and store the intermediate values on the stack itself. Evaluating
code directly on a runtime stack in this way would be a form of interpretation. In practice, MSIL is
not interpreted - a Just-In-Time (JIT) compiler translates the intermediate code to native code for
the particular platform at runtime. Stack-based code facilitates portability across platforms and is
easy to verify.
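To make the stack-based idea concrete, here is a minimal C# sketch that evaluates (2 + 3) * 4 the way a stack machine would. The class and method names are hypothetical, and the sketch only imitates an evaluation stack with Stack<int>; it is not how the CLR itself is implemented.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration: imitates a stack machine with Stack<int>.
static class EvalStackSketch
{
    // Evaluates (2 + 3) * 4: operands are pushed, and each operator
    // pops its inputs and pushes its result.
    public static int Evaluate()
    {
        var stack = new Stack<int>();
        stack.Push(2);                          // push first operand
        stack.Push(3);                          // push second operand
        stack.Push(stack.Pop() + stack.Pop());  // add: pop two values, push 5
        stack.Push(4);                          // push third operand
        stack.Push(stack.Pop() * stack.Pop());  // mul: pop two values, push 20
        return stack.Pop();                     // the result is the only item left
    }

    public static void Main()
    {
        Console.WriteLine(Evaluate()); // prints 20
    }
}
```

Notice that no instruction names its operands: every operation finds its inputs on top of the stack, which is why the same code can be mapped onto very different processors.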
The MSIL:
● Supports object-oriented programming.
● Works in terms of the data types available in the .NET Framework, for example, System.String
and System.Int32.
● Provides instructions that can be classified into various types such as: loading (ld*), storing
(st*), method invocation, arithmetic operations, logical operations, control flow, memory
allocation, and exception handling.
The following section covers basic instructions using examples.
Examining MSIL
Let us start with the following simple C# code, and see how it is compiled to intermediate code.
Console.WriteLine("hello world");
The MSIL code looks like this (using the Ildasm tool that is discussed later).
// disassembled code using ildasm tool
ldstr "hello world"
call void [mscorlib]System.Console::WriteLine(string)
Now let us examine how it works:
● The ldstr (standing for 'load string') instruction indicates that a reference to the string
constant "hello world" is to be pushed onto the evaluation stack.
● The call instruction is for calling a method. Here, the call is made for the static WriteLine
method of the Console class that is available in mscorlib.dll, in the System namespace. The
WriteLine method takes a string as the argument and its return type is void.
It executes as follows:
● The ldstr instruction pushes the reference to the constant "hello world" onto the stack.
● The call instruction calls the WriteLine method, which expects a string argument and pops it
from the stack. Now the stack contains nothing. The WriteLine method then executes, prints
the message "hello world" on the screen, and returns.
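To reproduce the listing yourself, the one-line statement has to live inside a complete program. The following is a minimal sketch; the class name Hello and the file name Hello.cs are assumptions made for illustration.

```csharp
using System;

// Hypothetical file Hello.cs: the smallest complete program around the statement above.
class Hello
{
    public static void Main()
    {
        // Expected to compile to ldstr "hello world" followed by
        // a call to System.Console::WriteLine(string).
        Console.WriteLine("hello world");
    }
}
```

Assuming the C# compiler and Ildasm are on the path, compiling with csc Hello.cs and then opening the result with ildasm Hello.exe should show the two instructions inside Main. The exact output may differ slightly between compiler versions and build settings; for example, extra instructions such as nop or ret may surround them.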
As you can see, understanding the MSIL code is far from difficult! If you have prior exposure to any
assembly language, it will be very easy for you to learn MSIL.
From this simple program, let us move on to a program illustrating branching and arithmetic
instructions.
// C# source code
int i = 10;
if (i != 20)
    i = i * 20;
Console.WriteLine(i);
// disassembled MSIL code using ildasm tool
IL_0000: ldc.i4.s 10
IL_0002: stloc.0
IL_0003: ldloc.0
IL_0004: ldc.i4.s 20
IL_0006: beq.s IL_000d
IL_0008: ldloc.0
IL_0009: ldc.i4.s 20
IL_000b: mul
IL_000c: stloc.0
IL_000d: ldloc.0
IL_000e: call void [mscorlib]System.Console::WriteLine(int32)
You can see that lots of MSIL code has been generated for this simple C# code, but it is simple once
you understand what the instructions do. You can see that the instructions are preceded by IL_xxxx: -
these are labels (the hexadecimal byte offset of each instruction within the method) used so that it is
possible to 'jump' from one part of the code to another.
The ldc.i4.s (stands for 'load constant'.'four byte integer'.'single byte argument') instruction pushes the
integer constant 10 onto the stack.
The stloc.0 (stands for 'store in location'.'zeroth variable') instruction pops the integer constant 10 from
the stack and stores it in variable number 0 (local variables are numbered by counting them from
0).
The ldloc.0 (stands for 'load from location'.'zeroth variable') instruction loads the value of the variable
at location zero (i.e. variable i in the source code) and pushes it onto the stack.
The ldc.i4.s instruction pushes the integer constant 20 onto the stack.
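The beq.s (stands for 'branch if equal to'.'single byte argument') instruction pops two items from the
stack and checks if they are equal; if so, it transfers control to the instruction at the location
identified by the label IL_000d. Note that the compiler has inverted the C# condition i != 20: the
body of the if statement is skipped when the two values are equal.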
The ldloc.0 instruction pushes the value of variable i onto the stack.
The ldc.i4.s instruction pushes the integer constant 20 onto the stack.
The mul (stands for 'multiply') instruction pops two items from the stack, multiplies the values, and
pushes the result back to the stack. Now the result of the multiplication is at the top of the stack.
The stloc.0 instruction pops the top value from the stack (the result of the multiplication in this case)
and stores it in variable i.
The ldloc.0 instruction pushes the value of i onto the stack.
The call (stands for 'call the method') instruction calls the WriteLine method that takes an integer as an
argument. The WriteLine method pops the value from the stack and displays it on the screen.
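The walkthrough above can be replayed with a small interpreter written in C#. This is a teaching sketch under stated assumptions: the class BranchTrace, its instruction table, and the use of array indexes instead of byte offsets are all hypothetical, and the real CLR JIT-compiles MSIL rather than interpreting it like this. The sketch uses tuple syntax, so it assumes C# 7 or later.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mini-interpreter for the instruction sequence in the listing.
static class BranchTrace
{
    public static int Run()
    {
        var stack = new Stack<int>();
        var locals = new int[1];            // locals[0] plays the role of i
        var program = new (string Op, int Arg)[]
        {
            ("ldc", 10),   // 0: push 10
            ("stloc", 0),  // 1: pop into i
            ("ldloc", 0),  // 2: push i
            ("ldc", 20),   // 3: push 20
            ("beq", 9),    // 4: pop two; if equal, jump to index 9
            ("ldloc", 0),  // 5: push i
            ("ldc", 20),   // 6: push 20
            ("mul", 0),    // 7: pop two, push the product
            ("stloc", 0),  // 8: pop into i
            ("ldloc", 0),  // 9: push i (branch target, IL_000d in the listing)
        };

        int pc = 0;                         // index of the next instruction
        while (pc < program.Length)
        {
            var (op, arg) = program[pc++];
            switch (op)
            {
                case "ldc":   stack.Push(arg); break;
                case "stloc": locals[arg] = stack.Pop(); break;
                case "ldloc": stack.Push(locals[arg]); break;
                case "mul":   stack.Push(stack.Pop() * stack.Pop()); break;
                case "beq":
                    if (stack.Pop() == stack.Pop()) pc = arg;
                    break;
                default: throw new InvalidOperationException(op);
            }
        }
        return stack.Pop();                 // the value WriteLine would receive
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints 200
    }
}
```

Because 10 is not equal to 20, the branch is not taken, the multiplication runs, and the value left on the stack for WriteLine is 200.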
Debugging Using ILDASM Tool
Microsoft's .NET SDK ships with an IL disassembler, Ildasm.exe (usually located in the
directory Program Files\Microsoft.Net\FrameworkSDK\Bin). A disassembler loads your assemblies
and shows the MSIL code along with other details in the assembly.
This tool can be handy in debugging code once you become proficient at understanding MSIL code.
How can MSIL help in debugging?
Bugs happen in code when there is a mismatch between what we expect the code to do and what the
code actually does. If we can dig down to a lower level and see what the machine is actually doing with
our code, it is easier to spot the mismatch. That is the idea behind using ILDASM for debugging. Let us
look at an example and see how we can debug the code. The following innocent-looking code doesn't
work as you'd expect. It doesn't print "yes, o1 == o2", even though the code looks
straightforward.
int i = 10;
object o1 = i, o2 = i;
if (o1 == o2)
    Console.WriteLine("yes, o1 == o2");
Now let us dig a little deeper and see what the machine is actually doing by looking at the MSIL code
generated by the Ildasm tool:
IL_0000: ldc.i4.s 10
IL_0002: stloc.0
IL_0003: ldloc.0
IL_0004: box [mscorlib]System.Int32
IL_0009: stloc.1
IL_000a: ldloc.0
IL_000b: box [mscorlib]System.Int32
IL_0010: stloc.2
IL_0011: ldloc.1
IL_0012: ldloc.2
IL_0013: bne.un.s IL_001f
IL_0015: ldstr "yes, o1 == o2"
IL_001a: call void [mscorlib]System.Console::WriteLine(string)
IL_001f: ret
There lies the clue. Can you see that the boxing operation from int to object type is taking place twice?
As the value type is converted to a reference type, the object is allocated on the heap. Since boxing is
done twice, the two objects o1 and o2 are allocated in two different places on the heap. The bne.un.s
instruction then compares the two references, not the boxed values, so the comparison fails. We have
found where things went wrong, and this means we can make a simple correction to our code:
int i = 10;
object o1 = i, o2 = o1;
if (o1 == o2)
    Console.WriteLine("yes, o1 == o2");
Now when we look at the resulting MSIL code (again disassembling using the Ildasm tool), the boxing
is done only once, and both the references are pointing to the same object now. So, the program now
works as expected.
IL_0000: ldc.i4.s 10
IL_0002: stloc.0
IL_0003: ldloc.0
IL_0004: box [mscorlib]System.Int32
IL_0009: stloc.1
IL_000a: ldloc.1
IL_000b: stloc.2
IL_000c: ldloc.1
IL_000d: ldloc.2
IL_000e: bne.un.s IL_001a
IL_0010: ldstr "yes, o1 == o2"
IL_0015: call void [mscorlib]System.Console::WriteLine(string)
IL_001a: ret
The example shown here is simple, but it shows how the tool can be employed effectively for
debugging code.
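The boxing behaviour can also be checked directly from C#, without reading the MSIL. The sketch below is only an illustration (the class and method names are hypothetical); it uses object.ReferenceEquals to test whether two variables point to the same heap object, and Equals to compare the boxed values.

```csharp
using System;

// Hypothetical illustration of reference identity versus value equality for boxed ints.
static class BoxingCheck
{
    // Boxing i twice creates two separate heap objects.
    public static bool SeparateBoxesSameReference()
    {
        int i = 10;
        object o1 = i, o2 = i;
        return object.ReferenceEquals(o1, o2);
    }

    // Equals on a boxed int compares the values, not the references.
    public static bool SeparateBoxesEqualValues()
    {
        int i = 10;
        object o1 = i, o2 = i;
        return o1.Equals(o2);
    }

    // Copying the reference means both variables share one boxed object.
    public static bool SharedBoxSameReference()
    {
        int i = 10;
        object o1 = i, o2 = o1;
        return object.ReferenceEquals(o1, o2);
    }

    public static void Main()
    {
        Console.WriteLine(SeparateBoxesSameReference()); // prints False
        Console.WriteLine(SeparateBoxesEqualValues());   // prints True
        Console.WriteLine(SharedBoxSameReference());     // prints True
    }
}
```

If the intent of the original code was to compare the values rather than the references, calling o1.Equals(o2) is an alternative to sharing a single boxed object.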
Article Review
In this article we have explained the basics of MSIL, and using this knowledge, looked into how the
Ildasm tool can be used to help debug your code. This is only a beginner-level article, and so interested
readers are encouraged to look further into MSIL and the Ildasm tool.
All rights reserved. Copyright Jan 2004.