This presentation discusses the dynamic linker as found in Linux and its porting to the Solaris ARM platform. It starts from the basics of the linking process.
2. Types of Linking
Static Linking: All symbols are resolved at link time. For
example, the gcc -static flag links the program statically.
Disadvantage: Larger executable size
Advantage: No symbol resolution at run time, so startup is faster
Dynamic Linking: Symbols are resolved when the program is loaded
or executed. gcc links programs dynamically by default.
The executable is smaller, but resolving symbols adds some cost
at startup and on the first call to each lazily bound function.
2
2009/1/26
3. Example linking process
m.c, a.c (source files)
-> Translators (each file is separately compiled)
m.o, a.o (relocatable object files)
-> Linker (ld)
p (executable object file; contains code and data for all
functions defined in m.c and a.c)
4. What does a linker do?
Merges object files
merges multiple relocatable (.o) object files into a single executable object file
that can be loaded and executed by the loader.
Resolves external references
as part of the merging process, resolves external references.
external reference: reference to a symbol defined in another object file.
Relocates symbols
relocates symbols from their relative locations in the .o files to new absolute
positions in the executable.
updates all references to these symbols to reflect their new positions.
references can be in either code or data
code: a(); /* ref to symbol a */
data: int *xp=&x; /* ref to symbol x */
5. Executable and linkable format (ELF)
Standard binary format for object files
Derives from AT&T System V Unix
later adopted by BSD Unix variants and Linux
One unified format for relocatable object files (.o), executable
object files, and shared object files (.so)
generic name: ELF binaries
Better support for shared libraries than old a.out formats.
6. ELF object file format
File layout, starting at offset 0:
ELF header
Program header table (required for executables)
.text section
.data section
.bss section
.symtab
.rel.text
.rel.data
.debug
Section header table (required for relocatables)
ELF header
magic number, type (.o, exec, .so), machine, byte ordering, etc.
Program header table
page size, virtual addresses for memory segments (sections), segment sizes.
.text section
code
.data section
initialized (static) data
.bss section
uninitialized (static) data
“Block Started by Symbol”
“Better Save Space”
has section header but occupies no space
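As a rough illustration of the ELF header fields listed above, the following sketch checks the magic number and decodes the file type from the first bytes of a file image. The numeric constants follow the ELF specification (they are normally supplied by a system header); the function name elf_kind and the strings it returns are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offsets and values from the ELF specification. */
enum { EI_DATA = 5, EI_NIDENT = 16 };          /* layout of e_ident */
enum { ELFDATA2LSB = 1, ELFDATA2MSB = 2 };     /* byte ordering */
enum { ET_REL = 1, ET_EXEC = 2, ET_DYN = 3 };  /* e_type values */

/* Hypothetical helper: classify a file image by its ELF header.
 * Returns NULL if the buffer does not start with the ELF magic number. */
const char *elf_kind(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len < (size_t)EI_NIDENT + 2 || memcmp(buf, "\177ELF", 4) != 0)
        return NULL;

    /* e_type is the 16-bit field right after e_ident, stored in the
     * file's own byte order. Anything other than ELFDATA2MSB is read
     * as little-endian in this sketch. */
    uint16_t type;
    if (buf[EI_DATA] == ELFDATA2MSB)
        type = (uint16_t)((buf[EI_NIDENT] << 8) | buf[EI_NIDENT + 1]);
    else
        type = (uint16_t)(buf[EI_NIDENT] | (buf[EI_NIDENT + 1] << 8));

    switch (type) {
    case ET_REL:  return "relocatable object (.o)";
    case ET_EXEC: return "executable";
    case ET_DYN:  return "shared object (.so)";
    default:      return "other";
    }
}
```

Only the first 18 bytes of the file are needed, which is why tools can identify the kind of ELF binary without reading any section.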
7. ELF object file format
File layout, starting at offset 0: ELF header, Program header table (required for executables), .text section, .data section, .bss section, .symtab, .rel.text, .rel.data, .debug, Section header table (required for relocatables)
.symtab section
symbol table
procedure and static variable names
section names and locations
.rel.text section
relocation info for .text section
addresses of instructions that will need to be modified in the executable
instructions for modifying.
.rel.data section
relocation info for .data section
addresses of pointer data that will need to be modified in the merged executable
.debug section
info for symbolic debugging (gcc -g)
8. Example C program
m.c:
int e=7;
int main() {
    int r = a();
    exit(0);
}
a.c:
extern int e;
int *ep=&e;
int x=15;
int y;
int a() {
    return *ep+x+y;
}
9. Merging .o files into an executable
Relocatable object files:
system code: .text
system data: .data & .bss
m.o: main() in .text; int e = 7 in .data
a.o: a() in .text; int *ep = &e and int x = 15 in .data; int y in .bss
Executable object file (starting at offset 0):
headers
.text: system code, main(), a(), more system code
.data: system data, int e = 7, int *ep = &e, int x = 15
.bss: uninitialized data
.symtab
.debug
10. Relocating symbols and resolving external references
Symbols are lexical entities that name functions and variables.
Each symbol has a value (typically a memory address).
Code consists of symbol definitions and references.
References can be either local or external.
m.c:
int e=7;              /* def of local symbol e */
int main() {
    int r = a();      /* ref to external symbol a */
    exit(0);          /* ref to external symbol exit (defined in libc.so) */
}
a.c:
extern int e;
int *ep=&e;           /* def of local symbol ep; ref to external symbol e */
int x=15;             /* def of local symbol x */
int y;                /* def of local symbol y */
int a() {             /* def of local symbol a */
    return *ep+x+y;   /* refs of local symbols ep, x, y */
}
11. m.o relocation info
m.c:
int e=7;
int main() {
    int r = a();
    exit(0);
}
Disassembly of section .text:
00000000 <main>:
0: 55 pushl %ebp
1: 89 e5 movl %esp,%ebp
3: e8 fc ff ff ff call 4 <main+0x4>
4: R_386_PC32 a
8: 6a 00 pushl $0x0
a: e8 fc ff ff ff call b <main+0xb>
b: R_386_PC32 exit
f: 90 nop
Disassembly of section .data:
00000000 <e>:
0: 07 00 00 00
source: objdump
12. a.o relocation info (.text)
a.c:
extern int e;
int *ep=&e;
int x=15;
int y;
int a() {
    return *ep+x+y;
}
Disassembly of section .text:
00000000 <a>:
0: 55 pushl %ebp
1: 8b 15 00 00 00 00 movl 0x0,%edx
3: R_386_32 ep
7: a1 00 00 00 00 movl 0x0,%eax
8: R_386_32 x
c: 89 e5 movl %esp,%ebp
e: 03 02 addl (%edx),%eax
10: 89 ec movl %ebp,%esp
12: 03 05 00 00 00 00 addl 0x0,%eax
14: R_386_32 y
18: 5d popl %ebp
19: c3 ret
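The two relocation types in the listings above can be sketched as a small patch routine. This is a minimal sketch, not the code of any real linker: it assumes the i386 convention that the addend A is stored in the 4-byte field being patched, and it uses the standard formulas S + A for R_386_32 and S + A - P for R_386_PC32, where S is the symbol address and P is the address of the field. The function name apply_reloc and its parameter names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { R_386_32 = 1, R_386_PC32 = 2 };   /* i386 relocation type numbers */

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void write_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Patch one 4-byte field in place.
 * sym_addr   (S): final address of the referenced symbol
 * field_addr (P): final address of the field being patched
 * Returns 0 on success, -1 for a relocation type this sketch does not handle. */
int apply_reloc(uint8_t *field, int type, uint32_t sym_addr, uint32_t field_addr)
{
    assert(field != NULL);
    uint32_t addend = read_le32(field);   /* A, stored in the field itself */
    uint32_t value;

    switch (type) {
    case R_386_32:   value = sym_addr + addend;              break;  /* absolute */
    case R_386_PC32: value = sym_addr + addend - field_addr; break;  /* PC-relative */
    default:         return -1;
    }
    write_le32(field, value);
    return 0;
}
```

The bytes fc ff ff ff in each call instruction encode the addend -4, which compensates for the processor computing the call target relative to the end of the 4-byte field. With hypothetical addresses P = 0x08048534 and S = 0x08048540, the field becomes 08 00 00 00. For the R_386_32 entry on ep with S = 0x0804a018 (the address of e in the relocated executable shown later in the deck), the field becomes 18 a0 04 08.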
13. a.o relocation info (.data)
a.c:
extern int e;
int *ep=&e;
int x=15;
int y;
int a() {
    return *ep+x+y;
}
Disassembly of section .data:
00000000 <ep>:
0: 00 00 00 00
0: R_386_32 e
00000004 <x>:
4: 0f 00 00 00
15. Executable after relocation and external reference resolution (.data)
m.c:
int e=7;
int main() {
    int r = a();
    exit(0);
}
a.c:
extern int e;
int *ep=&e;
int x=15;
int y;
int a() {
    return *ep+x+y;
}
Disassembly of section .data:
0804a010 <__data_start>:
804a010: 00 00 00 00
0804a014 <p.2>:
804a014: f8 a2 04 08
0804a018 <e>:
804a018: 07 00 00 00
0804a01c <ep>:
804a01c: 18 a0 04 08
0804a020 <x>:
804a020: 0f 00 00 00
16. Strong and weak symbols
Program symbols are either strong or weak
strong: procedures and initialized globals
weak: uninitialized globals
p1.c:
int foo=5;    /* strong */
p1() {        /* strong */
}
p2.c:
int foo;      /* weak */
p2() {        /* strong */
}
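The slide only classifies the symbols. The rules usually given for Unix linkers on top of that classification are: two strong definitions of the same name are an error, a strong definition overrides a weak one, and among weak definitions any one may be chosen. The sketch below encodes those rules under that assumption; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum strength { WEAK, STRONG };

struct def {
    const char   *file;       /* object file providing the definition */
    enum strength strength;
};

/* Choose between two definitions of the same symbol name.
 * Returns the definition to keep, or NULL when both are strong,
 * which a linker would report as a multiply defined symbol. */
const struct def *resolve(const struct def *a, const struct def *b)
{
    assert(a != NULL && b != NULL);
    if (a->strength == STRONG && b->strength == STRONG)
        return NULL;      /* duplicate strong definitions */
    if (b->strength == STRONG)
        return b;         /* strong overrides weak */
    return a;             /* a is strong, or both are weak: keep the first seen */
}
```

Applied to the example, foo resolves to the initialized definition in p1.c regardless of the order in which the two files are linked, while p1 and p2 do not conflict because they have different names.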
17. Static libraries (archives)
p1.c, p2.c
-> Translators
p1.o, p2.o, plus libc.a (static library (archive) of relocatable object files
concatenated into one file)
-> Linker (ld)
p (executable object file; only contains code and data for libc functions
that are called from p1.c and p2.c)
Further improves modularity and efficiency by packaging
commonly used functions (e.g., C standard library, math library)
Linker selectively includes only the .o files in the archive that are
actually needed by the program.
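The selective inclusion described above can be sketched as a scan over the archive members. This is a deliberately simplified model: each member defines exactly one symbol, and the scan is a single pass, whereas a real linker also has to account for new undefined symbols introduced by the members it pulls in. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct member {
    const char *name;      /* archive member, e.g. "printf.o" */
    const char *defines;   /* the one symbol this member defines (simplification) */
};

/* Mark in selected[] every member that defines a currently undefined symbol.
 * Returns the number of members that would be copied into the executable. */
size_t select_members(const struct member *archive, size_t n_members,
                      const char *const *undefined, size_t n_undefined,
                      int *selected)
{
    assert(archive != NULL && undefined != NULL && selected != NULL);
    size_t count = 0;
    for (size_t m = 0; m < n_members; m++) {
        selected[m] = 0;
        for (size_t u = 0; u < n_undefined; u++) {
            if (strcmp(archive[m].defines, undefined[u]) == 0) {
                selected[m] = 1;   /* needed: include this .o file */
                count++;
                break;
            }
        }
    }
    return count;
}
```

With an archive holding atoi.o, printf.o and random.o and a program whose only undefined symbol is printf, only printf.o is selected.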
18. Creating static libraries
atoi.c, printf.c, ..., random.c
-> Translators
atoi.o, printf.o, ..., random.o
-> Archiver (ar): ar rs libc.a atoi.o printf.o … random.o
libc.a (C standard library)
Archiver allows incremental updates:
• recompile function that changes and replace .o file in archive.
22. Linking process in Solaris
In Solaris/OpenSolaris, linking is performed in two steps:
link-editing at build time and runtime linking at program startup.
Build-time linking is done by ld, the link-editor.
The link-editor, ld(1), concatenates and interprets data
from one or more input files. These files can be relocatable
objects, shared objects, or archive libraries. From these input
files, one output file is created. This file is either a relocatable
object, an executable application, or a shared object.
The link-editor is most commonly invoked as part of the
compilation environment.
23. Dynamic linker in Solaris
The runtime linker, ld.so.1, processes dynamic executables and
shared objects at runtime, binding the executable and shared
objects together to create a runnable process.
During the link-editing of a dynamic executable, a special .interp
section, together with an associated program header, is created.
This section contains a path name specifying the program’s
interpreter. The default name supplied by the link-editor is the
name of the runtime linker: /usr/lib/ld.so.1 for a 32-bit
executable and /usr/lib/64/ld.so.1 for a 64-bit executable.
The dynamic linker ld.so.1 is itself an ELF shared object. At
program startup, the system maps ld.so.1 into part of the
address space and runs its bootstrap code.
24. Link Editor Functions
The following is a summary of the link-editor's functions:
The concatenation of sections of the same characteristics from the input
relocatable objects to form new sections within the output file. The
concatenated sections can in turn be associated to output segments.
The processing of symbol table information from both relocatable objects
and shared objects to verify and unite references with definitions. The
generation of a new symbol table, or tables, within the output file.
The processing of relocation information from the input relocatable objects,
and the application of this information to the output file by updating other
input sections. In addition, output relocation sections might be generated for
use by the runtime linker.
The generation of program headers that describe all the segments that are
created.
The generation of dynamic linking information sections if necessary, which
provide information such as shared object dependencies and symbol bindings
to the runtime linker.
25. Symbol processing by Link Editor
During input file processing, all local symbols from the input relocatable
objects are passed through to the output file image. All global symbols are
accumulated internally within the link-editor. Each global symbol supplied by a
relocatable object is searched for within this internal symbol table. If a symbol
with the same name has already been encountered from a previous input file,
a symbol resolution process is called. This symbol resolution process
determines which of the two entries is kept.
On completing input file processing, and providing no fatal symbol resolution
errors have occurred, the link-editor determines if any unresolved symbol
references remain. Unresolved symbol references can cause the link-edit to
terminate.
Finally, the link-editor’s internal symbol table is added to the symbol tables of
the image being created.
26. Example
$ cat main.c
extern int u_bar;
extern int u_foo();
int t_bar;
int d_bar = 1;
d_foo()
{
return (u_foo(u_bar, t_bar, d_bar));
}
$ cc -o main.o -c main.c
$ nm -x main.o
[Index] Value Size Type Bind Other Shndx Name
...............
[8] |0x00000000|0x00000000|NOTY |GLOB |0x0 |UNDEF |u_foo
[9] |0x00000000|0x00000040|FUNC |GLOB |0x0 |2 |d_foo
[10] |0x00000004|0x00000004|OBJT |GLOB |0x0 |COMMON |t_bar
[11] |0x00000000|0x00000000|NOTY |GLOB |0x0 |UNDEF |u_bar
[12] |0x00000000|0x00000004|OBJT |GLOB |0x0 |3 |d_bar
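The Type and Bind columns in the nm output above are both packed into a single byte, st_info, of each ELF symbol table entry: the binding in the upper four bits and the type in the lower four. The sketch below decodes that byte. The numeric values follow the ELF specification; the function names and the exact formatting are assumptions made for this example, not the implementation of nm.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

enum { STB_LOCAL = 0, STB_GLOBAL = 1, STB_WEAK = 2 };    /* binding values */
enum { STT_NOTYPE = 0, STT_OBJECT = 1, STT_FUNC = 2 };   /* type values */

static const char *bind_name(uint8_t st_info)
{
    switch (st_info >> 4) {              /* same as ELF32_ST_BIND */
    case STB_LOCAL:  return "LOCL";
    case STB_GLOBAL: return "GLOB";
    case STB_WEAK:   return "WEAK";
    default:         return "?";
    }
}

static const char *type_name(uint8_t st_info)
{
    switch (st_info & 0xf) {             /* same as ELF32_ST_TYPE */
    case STT_NOTYPE: return "NOTY";
    case STT_OBJECT: return "OBJT";
    case STT_FUNC:   return "FUNC";
    default:         return "?";
    }
}

/* Write "TYPE BIND" for one symbol into out; returns snprintf's result. */
int format_symbol_info(uint8_t st_info, char *out, size_t out_size)
{
    assert(out != NULL);
    return snprintf(out, out_size, "%s %s", type_name(st_info), bind_name(st_info));
}
```

For d_foo, a global function, st_info is (STB_GLOBAL << 4) | STT_FUNC = 0x12, which this sketch formats as FUNC GLOB.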
27. ELF File processing
Sections are the smallest
indivisible units that can be
processed within an ELF file.
Segments are a collection of
sections that represent the
smallest individual units that can
be mapped to a memory image
by the dynamic linker ld.so.
28. Functions of the Dynamic Linker
The runtime linker:
Analyzes the executable’s dynamic information section
(.dynamic) and determines what dependencies are required.
Locates and loads these dependencies, analyzing their dynamic
information sections to determine if any additional
dependencies are required.
Performs any necessary relocations to bind these objects in
preparation for process execution.
Calls any initialization functions provided by the dependencies.
Passes control to the application.
Can be called upon during the application’s execution, to
perform any delayed function binding.
29. ELF Parsing by Dynamic Linker
Executable object file for example program p (starting at offset 0):
ELF header
Program header table (required for executables)
.text section
.data section
.bss section
.symtab
.rel.text
.dynamic
.debug
Section header table (required for relocatables)
Process image (virtual addr):
0x080483e0 init and shared lib segments
0x08048494 .text segment (r/o)
0x0804a010 .data segment (initialized r/w)
0x0804a3b0 .bss segment (uninitialized r/w)
30. 1. Resolving the Dependencies
When linking a dynamic executable, one or more shared objects
are explicitly referenced. These objects are recorded as
dependencies within the dynamic executable.
The runtime linker uses this dependency information to locate,
and load, the associated objects. These dependencies are
processed in the same order as the dependencies were referenced
during the link-edit of the executable.
Once all the dynamic executable’s dependencies are loaded, each
dependency is inspected, in the order the dependency is loaded,
to locate any additional dependencies. This process continues
until all dependencies are located and loaded. This technique
results in a breadth-first ordering of all dependencies.
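The breadth-first ordering described above can be sketched with a simple queue. This is an illustration only, not the Solaris implementation: the struct layout, the fixed array limits and the function name load_order are assumptions, and dependencies are given as array indices instead of path names.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OBJS 16
#define MAX_DEPS 4

struct object {
    const char *name;
    size_t      n_needed;
    size_t      needed[MAX_DEPS];   /* indices of dependencies, in link-edit order */
};

/* Compute the load order starting from objs[0] (the executable).
 * order[] doubles as the work queue. Returns the number of objects loaded. */
size_t load_order(const struct object *objs, size_t n_objs, size_t *order)
{
    assert(objs != NULL && order != NULL && n_objs > 0 && n_objs <= MAX_OBJS);
    int loaded[MAX_OBJS] = {0};
    size_t head = 0, tail = 0;

    order[tail++] = 0;
    loaded[0] = 1;

    while (head < tail) {
        /* Inspect objects in the order they were loaded. */
        const struct object *cur = &objs[order[head++]];
        for (size_t i = 0; i < cur->n_needed; i++) {
            size_t dep = cur->needed[i];
            assert(dep < n_objs);
            if (!loaded[dep]) {      /* each object is loaded only once */
                loaded[dep] = 1;
                order[tail++] = dep;
            }
        }
    }
    return tail;
}
```

If prog needs libA and libB, and libA in turn needs libC, the resulting order is prog, libA, libB, libC. A depth-first walk would have loaded libC before libB.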
31. 1. Resolving the Dependencies
The Solaris runtime linker looks in two default locations for dependencies:
/lib and /usr/lib.
The dependencies of a dynamic executable or shared object can be displayed
using ldd. For example, the file /usr/bin/cat has the following dependencies:
$ ldd /usr/bin/cat
libc.so.1 => /lib/libc.so.1
libm.so.2 => /lib/libm.so.2
The dependencies recorded in an object can be inspected using dump. Use
this command to display the file’s .dynamic section, and look for entries that
have a NEEDED tag.
$ dump -Lvp prog
prog:
[INDEX] Tag Value
[1] NEEDED libfoo.so.1
[2] NEEDED libc.so.1
[3] RUNPATH /home/me/lib:/home/you/lib
.........
32. 1. Resolving the Dependencies
The dynamic segment (pointed to by the program header) in the ELF file
contains a pointer to the file's string table (DT_STRTAB) as well as to the
DT_NEEDED entries, each of which contains the offset in the string table
for the name of a required library. The dynamic linker creates a scope list for
the executable, consisting of libraries to be loaded.
For each entry in the scope list, the linker searches for the file
containing the library. Once the file is found, the linker reads the ELF header
to find the program header, which points to the dynamic segment.
The linker maps the library into the process address space. From the dynamic
segment, it adds the library's symbol table to the chain of symbol tables, and
if the library has further dependencies, it adds those libraries to the list to be
loaded and the process continues. In practice, the linker
creates a struct link_map for each library and adds it to a
global linked list.
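The lookup of DT_NEEDED names through the string table, as described above, can be sketched as a walk over the dynamic array. The struct here is a simplified stand-in for the real ELF dynamic entry, and the string table is passed in directly rather than located through DT_STRTAB; both simplifications, and the function name needed_names, are assumptions for the sake of a short example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { DT_NULL = 0, DT_NEEDED = 1 };   /* tag values from the ELF specification */

/* Simplified stand-in for an ELF dynamic entry: a tag and its value. */
struct dyn_entry {
    int32_t  tag;
    uint32_t val;    /* for DT_NEEDED: offset of the name in the string table */
};

/* Collect pointers to the names of all needed libraries.
 * The dynamic array is terminated by a DT_NULL entry.
 * Returns the number of names stored (at most max_names). */
size_t needed_names(const struct dyn_entry *dyn, const char *strtab,
                    const char **names, size_t max_names)
{
    assert(dyn != NULL && strtab != NULL && names != NULL);
    size_t n = 0;
    for (; dyn->tag != DT_NULL; dyn++) {
        if (dyn->tag == DT_NEEDED && n < max_names)
            names[n++] = strtab + dyn->val;
    }
    return n;
}
```

Each name collected this way would then be searched for on the library path, mapped, and scanned in turn for its own DT_NEEDED entries.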
34. Parsing other sections of ELF
For dynamic linking, the dynamic linker primarily uses two
processor-specific tables, the Global Offset Table (GOT) and
the Procedure Linkage Table (PLT). Dynamic linkers support
position-independent code (PIC) through the GOT in each shared library.
The GOT contains the absolute addresses of all the static data
referenced in the program. Both the executables that use
shared libraries and the shared libraries themselves have a PLT. Just as
the GOT redirects position-independent address
calculations to absolute locations, the PLT redirects
position-independent function calls to absolute locations.
35. Parsing other sections of ELF
In the .dynamic section, the important tag types are:
DT_NEEDED: This element holds the string table offset of a
null-terminated string, giving the name of a needed library. The
offset is an index into the table recorded in the DT_STRTAB
entry.
DT_HASH: This element holds the address of the symbol hash
table which refers to the symbol table referenced by the
DT_SYMTAB element.
DT_STRTAB: This element holds the address of the string table.
DT_SYMTAB: This element holds the address of the symbol
table.
36. 2. Relocation Processing
After the runtime linker has loaded all the dependencies required
by an application, the linker processes each object and performs
all necessary relocations.
Relocation is the process of connecting symbolic references with
symbolic definitions. For example, when a program calls a
function, the associated call instruction must transfer control to
the proper destination address at execution. Relocatable files
must have information that describes how to modify their
section contents. This information allows executable and shared
object files to hold the right information for a process’s program
image.
37. 3. Loading segments in memory
The LD_BIND_NOW environment variable determines the dynamic linking
behavior. If it is set, the dynamic linker resolves all PLT entries,
that is, all relocations of type R_386_JMP_SLOT, at load
time. Otherwise, the dynamic linker binds procedure addresses
lazily, so an address is not bound
until the routine is first called.
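The difference between the two modes can be modeled in plain C with a function pointer standing in for one GOT slot. This is a toy model, not how a real PLT is implemented (which uses short assembly stubs): every name in it is hypothetical, and the "symbol lookup" is reduced to a single assignment.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*func_t)(int);

/* Stands in for a function that lives in a shared library. */
static int real_square(int x) { return x * x; }

static int resolver_stub(int x);

/* One GOT slot. With lazy binding it initially points at the resolver. */
static func_t got_slot = resolver_stub;
static int    resolve_count = 0;

/* Runs only on the first call: finds the target, patches the slot,
 * then forwards the original call. */
static int resolver_stub(int x)
{
    resolve_count++;
    got_slot = real_square;
    return got_slot(x);
}

/* What a call through the PLT amounts to: an indirect jump via the slot. */
int call_square(int x)
{
    assert(got_slot != NULL);
    return got_slot(x);
}

/* Models the effect of LD_BIND_NOW: fill the slot before any call is made. */
void bind_now(void)
{
    got_slot = real_square;
}
```

With lazy binding, the first call to call_square goes through the resolver and later calls jump straight to the target. Calling bind_now first moves that cost to load time, which is what setting LD_BIND_NOW requests for every PLT entry.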
38. 4. Delayed Function Binding
Under the lazy loading model, any dependencies that
are labeled for lazy loading are loaded only when explicitly referenced. By
taking advantage of the lazy binding of a function call, the loading of a
dependency is delayed until the function is first referenced. As a result,
objects that are never referenced are never loaded.
As a practical example, the dynamic section (.dynamic) in the output below shows
that libdebug.so.1 is marked for lazy loading. The symbol information section
(.SUNW_syminfo) records the symbol reference that triggers the loading of libdebug.so.1.
$ cc -o prog prog.c -L. -zlazyload -ldebug -znolazyload -lelf -R'$ORIGIN'
$ elfdump -d prog
Dynamic Section: .dynamic
index tag value
[0] POSFLAG_1 0x1 [ LAZY ]
[1] NEEDED 0x123 libdebug.so.1
[2] NEEDED 0x131 libelf.so.1
[3] NEEDED 0x13d libc.so.1
39. A look into Solaris runtime linker
The file dl_runtime.c (the routine names below match those in glibc's dl-runtime.c) contains the following main routines:
_dl_fixup [Resolves the PLT Symbols]
_dl_profile_fixup
_dl_call_pltexit
Other related routines:
elf_machine_plt_value
elf_machine_fixup_plt
_dl_lookup_symbol_x
40. ARM Specific ELF Header Settings
For ARM target environment, the values in the ELF header are
specifically defined. All other values are as specified in the Tool
Interface Standard Portable Formats Specification:
e_machine is set to EM_ARM (defined as 40)
e_ident[EI_CLASS] is set to ELFCLASS32
e_ident[EI_DATA] is set to:
ELFDATA2LSB for little-endian targets
ELFDATA2MSB for big-endian targets
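The header settings listed above can be checked with a few byte comparisons. The sketch below assumes the standard 32-bit ELF header layout (e_ident in bytes 0 to 15, e_type at offset 16, e_machine at offset 18); the function name is_arm_elf is made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { EI_CLASS = 4, EI_DATA = 5 };                  /* indices into e_ident */
enum { ELFCLASS32 = 1, ELFDATA2LSB = 1, ELFDATA2MSB = 2, EM_ARM = 40 };

/* Returns 1 if the header bytes match the ARM settings, 0 otherwise. */
int is_arm_elf(const uint8_t *hdr, size_t len)
{
    assert(hdr != NULL);
    if (len < 20 || memcmp(hdr, "\177ELF", 4) != 0)
        return 0;
    if (hdr[EI_CLASS] != ELFCLASS32)
        return 0;

    /* e_machine is a 16-bit field at offset 18, stored in the
     * byte order announced by e_ident[EI_DATA]. */
    uint16_t machine;
    if (hdr[EI_DATA] == ELFDATA2LSB)
        machine = (uint16_t)(hdr[18] | (hdr[19] << 8));
    else if (hdr[EI_DATA] == ELFDATA2MSB)
        machine = (uint16_t)((hdr[18] << 8) | hdr[19]);
    else
        return 0;

    return machine == EM_ARM;
}
```

A loader ported to ARM would typically perform a check of this kind before mapping an object, so that binaries built for another machine are rejected early.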
41. Special sections in ARM ELF Files
In Executable ARM ELF, all Executables have at least two Sections,
unless the linker has been invoked with -nodebug:
The Symbol Table Section:
This Section has the following attributes:
sh_name: ".symtab"
sh_type: SHT_SYMTAB
sh_addr: 0 (to indicate it is not part of the image)
The String Table Section:
This Section has the following attributes:
sh_name: ".strtab"
sh_type: SHT_STRTAB
sh_addr: 0 (to indicate it is not part of the image)
42. Special Sections in ARM ELF
Debugging Sections
ARM Executable ELF supports three types of debugging
information held in debugging Sections.
1. ASD debugging tables
These provide backwards compatibility with ARM's Symbolic Debugger. ASD
debugging information is stored in a single Section in the executable named .asd.
2. DWARF Version 1.0
3. DWARF Version 2.0