System Software Course Overview
1. Course Overview
System Software
• Introduction to System Software
• Compilers.
• Assembler.
• Loaders and linkers.
• Macro Processors.
• Character I/O under Windows.
• Files and directories management under Windows.
• Process creation under Windows.
• Inter-process communication under Windows.
2. Course Objective
• Going behind the scenes, gain a deep understanding of
how computers actually work.
• Understanding the relationship between system software
and machine architecture.
• Understanding how system software helps program
development (compilers, assemblers, linkers and
loaders), and program execution (OS, process
management, file management, device management).
• Getting basic knowledge and experience with Windows
system through programming.
3. Text Book
• System Software, An Introduction to Systems
Programming, Leland L. Beck, Addison-Wesley, 1996.
• Windows System Programming, Johnson M. Hart, Third
Edition, Addison-Wesley, 2005.
• Course Notes.
4. Chapter Overview
Introduction to System Software
• Concepts to be learned
– Application software
– System software
– Program development environment
• Compilers, assemblers, linkers, debuggers
– Program run-time environment
• Operating systems, program loaders, program
libraries
– Source program, object program, executable program
5. System Software Definition
• System software consists of a variety of programs that
support the operation of a computer (but exactly what?)
– developing programs: simplify the programming
environment by hiding machine level complexity
– running programs: enable efficient use of hardware by
sharing
• new definition: provides a general program development
environment in which programmers can create specific
applications, and lets the applications efficiently use the
system hardware
6. System Software vs. Application Software
• System software
– Support the operation and use of the computer itself
– machine dependency (not all features)
– compilers, assemblers, linkers, loaders, debuggers,
OS
• Application software
– designed as a tool to solve a specific problem
– No direct relation with the hardware
– Web browser, media players, office tools, image
processors, messengers
• text editor ?
7. Software Environments
• Program development environment
– compilers, assemblers, linkers, debuggers
– Integrated development environment (IDE)
– IDE examples: Visual C++, J Builder, Visual Basic
• Program run-time environment
– operating systems, program loaders, program
libraries
– Java run time environment
8. Steps in Creating and Running Code
C program
-> Compiler
-> Assembly language program
-> Assembler
-> Object: Machine language module (+ Object: Library routine (machine language))
-> Linker
-> Executable: Machine language program
-> Loader
-> Memory
9. System Software for Program Development
[Figure: source programs (e.g. JSUB F1, CALL F1) pass through the assembler, C/C++ compiler, or Pascal compiler to become object programs (e.g. 4B101036); the linker and loader then place them (e.g. 4B106036) into the computer hardware: devices (I/O), processor(s), memory.]
10. System Software for Program Running
[Figure: the OS (Device Manager, File Manager, Process and Resource Manager, Memory Manager) sits between the running programs (PGM1, PGM2, PGM3) and the computer hardware: devices (I/O), processor(s), memory.]
11. Other System Software
• Window system
– Provide virtual terminal to an application program
– Map virtual terminal operations so that they apply to a specific
physical region on a screen
• Database management system
– Store information on the computer’s permanent storage devices
– Provide abstract data types (schema) and creates new
application-specific software optimized for efficient
queries/updates on the data according to the schema definition
12. Strategies of Learning System Software
Functions
• For each type of system software, distinguishing among:
– Fundamental common features
– Close machine-dependent features
– Other common machine-independent features
– Major design options for structuring a particular piece
of software (ex. Single-pass versus multi-pass
processing)
– Unusual machine-dependent features (examples of
implementations on actual machines)
13. Chapter 1: Compilers
• A compiler is a language translator. It is a program that
translates a program written in a source language into
an equivalent program in a target language.
• The source language is usually a high-level
programming language and the target language is
usually the machine language of an actual computer.
Source program -> Compiler -> Target program (plus error messages)
Source and target programs are diverse & varied
Implications:
- recognize legal (and illegal) programs
- generate correct code
- manage storage of all variables and code
- agreement on format for object (or assembly) code
14. Compilers
What qualities are important in a compiler?
– 1. Correct code
– 2. Output runs fast
– 3. Compiler runs fast
– 4. Compile time proportional to program size
– 5. Support for separate compilation
– 6. Good diagnostics for syntax errors
– 7. Works well with the debugger
– 8. Good diagnostics for flow anomalies
– 9. Cross language calls
– 10. Consistent, predictable optimization
15. Compiler Design
At the highest level of abstraction,
compilers are often partitioned
into
- a front end that
deals only with language-
specific issues, and
- a back end that deals only
with machine-specific issues.
16. The Many Phases of a Compiler
- The typical compiler consists of several phases, each of which
passes its output to the next phase. It uses the
Analysis-Synthesis Model:
- Analysis: convert source code into discrete, manageable
“chunks”. Strings -> tokens -> trees
- Synthesis: convert each chunk into a piece of target code.
Trees -> intermediate code -> target code.
Source Program
1 Lexical Analyzer
2 Syntax Analyzer
3 Semantic Analyzer
4 Intermediate Code Generator
5 Code Optimizer
6 Code Generator
Target Program
(The Symbol-table Manager and the Error Handler interact with the phases.)
Phase 1, 2, 3 : Analysis
Phase 4, 5, 6 : Synthesis
17. The role of each compiler phase: Scanner
• The lexical phase (scanner) groups characters into
lexical units or tokens (Keyword, identifier, number,..etc.)
– The input to the lexical phase is a character stream.
The output is a stream of tokens.
– Regular expressions are used to define the tokens
recognized by a scanner (e.g. digit -> 0|1|..|9,
letter -> [A-Za-z], and identifier -> letter (letter|digit)*).
– The scanner can be implemented as a finite state
machine.
Example:
Position := initial + rate * 60 ;
_______ __ _____ _ ___ _ __ _
All are tokens
Blanks, Line breaks, etc. are scanned out
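The scanner described above can be sketched in a few lines of Python. This is only an illustration: the token class names (ID, NUM, ASSIGN, OP, SEMI) are assumptions, not names from the course notes, and Python's re module stands in for a hand-built finite state machine.

```python
import re

# Token classes modelled on the slide's regular expressions.
# The class names are illustrative assumptions.
TOKEN_SPEC = [
    ("ID",     r"[A-Za-z][A-Za-z0-9]*"),  # identifier -> letter (letter|digit)*
    ("NUM",    r"[0-9]+"),                # number -> digit digit*
    ("ASSIGN", r":="),
    ("OP",     r"[+*]"),
    ("SEMI",   r";"),
    ("SKIP",   r"\s+"),                   # blanks and line breaks are scanned out
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan(text):
    """Turn a character stream into a list of (token_class, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"illegal character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(scan("Position := initial + rate * 60 ;"))
```

For the example statement this yields eight tokens, one for each underlined group on the slide.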
18. The role of each compiler phase: Parser
• The parser recognizes whether a program (or sentence)
is grammatically well formed. It groups tokens into
syntactical units.
– The output of the parser is a parse tree
representation of the program.
– Context-free grammars are used to define the
program structure recognized by a parser.
assignment statement
    identifier: position
    :=
    expression
        expression
            identifier: initial
        +
        expression
            expression
                identifier: rate
            *
            expression
                number: 60
Nodes of tree are constructed using a grammar for the language
19. What is a Grammar?
• Grammar is a Set of Rules Which Govern the
Interdependencies & Structure Among the Tokens
statement is an assignment statement, or
while statement, or if
statement, or ...
assignment statement is an identifier := expression ;
expression is an (expression), or expression
+ expression, or expression *
expression, or number, or
identifier, or ...
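A grammar like this can be turned into a small recursive-descent parser. The sketch below makes some assumptions. It handles only the assignment statement, and it splits the ambiguous rule "expression + expression, or expression * expression" into the usual expression/term/factor layers so that * binds tighter than +. The tuple-based tree layout is invented for illustration.

```python
def parse_assignment(tokens):
    """Parse 'identifier := expression ;' and return a nested-tuple tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, found {tok!r}")
        pos += 1
        return tok

    def is_identifier(tok):
        return tok.isalnum() and tok[0].isalpha()

    def factor():
        # factor is a (expression), a number, or an identifier
        tok = take()
        if tok == "(":
            node = expression()
            take(")")
            return node
        if tok.isdigit():
            return ("number", tok)
        if is_identifier(tok):
            return ("identifier", tok)
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        # term is factor { * factor }
        node = factor()
        while peek() == "*":
            take("*")
            node = ("*", node, factor())
        return node

    def expression():
        # expression is term { + term }
        node = term()
        while peek() == "+":
            take("+")
            node = ("+", node, term())
        return node

    target = take()
    if not is_identifier(target):
        raise SyntaxError(f"assignment target must be an identifier, found {target!r}")
    take(":=")
    tree = (":=", ("identifier", target), expression())
    take(";")
    if peek() is not None:
        raise SyntaxError(f"unexpected trailing token {peek()!r}")
    return tree


print(parse_assignment("position := initial + rate * 60 ;".split()))
```

The input here is a pre-split token list to keep the example short; in a real compiler the tokens would come from the scanner.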
20. The role of each compiler phase: Semantic
• The semantic analysis phase analyzes the parse tree
for context-sensitive information often called the static
semantics.
• Type Checking - Legality of Operands
• Real := int + char ;
• A[int] := A[real] + int ;
• while char <> int do
• The output of the semantic analysis phase is an
annotated parse tree (augmented with semantic
actions).
Compressed tree: := (position, + (initial, * (rate, 60)))
Conversion action: := (position, + (initial, * (rate, inttoreal (60))))
21. Symbol Table/Error Handling
• Symbol Table Creation / Maintenance
– Contains Info on Each “Meaningful” Token, Typically
Identifiers
– Data Structure Created / Initialized During Lexical
Analysis
– Utilized / Updated During Later Analysis &
Synthesis
• Error Handling
– Detection of Different Errors Which Correspond to
All Phases
– What Kinds of Errors Are Found During the
Analysis Phase or Synthesis Phase?
– What Happens When an Error Is Found?
22. The role of each compiler phase:
Intermediate Code Generation
– It uses Abstract Machine Version of Code -
Independent of Architecture
– Easy to Produce and Do Final, Machine
Dependent Code Generation
– Three-Address Code: “Portable” assembly-like
language
– Every memory location can act like a register
– At most three operands per instruction
temp1 := inttoreal(60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3
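As a rough illustration of how the three-address sequence above can be produced, the sketch below walks an expression tree bottom-up and allocates a fresh temporary for every operation. The tuple-based tree layout and the function name are assumptions made for this example only.

```python
def three_address(tree):
    """Flatten an assignment tree into a list of three-address instructions."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"temp{counter}"

    def emit(node):
        # Leaves (identifiers and constants) are plain strings.
        if isinstance(node, str):
            return node
        op = node[0]
        if op == "inttoreal":
            arg = emit(node[1])
            temp = new_temp()
            code.append(f"{temp} := inttoreal({arg})")
            return temp
        left = emit(node[1])
        right = emit(node[2])
        temp = new_temp()
        code.append(f"{temp} := {left} {op} {right}")
        return temp

    _, target, expr = tree
    code.append(f"{target} := {emit(expr)}")
    return code


# id1 := id2 + id3 * inttoreal(60), i.e. position := initial + rate * 60
tree = (":=", "id1", ("+", "id2", ("*", "id3", ("inttoreal", "60"))))
for line in three_address(tree):
    print(line)
```

Run on the annotated tree for the running example, this reproduces the four instructions listed on the slide.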
23. The role of each compiler phase: Code
Optimization & Code Generator
Optimizer:
• Find More Efficient Ways to Execute Code
• Replace Code With More Optimal Statements
Code Generator:
• Translate the Optimized Intermediate Code Into Target Machine Instructions
• Assign Registers and Memory Locations to Variables
code optimizer
temp1 := id3 * 60.0
id1 := id2 + temp1
final code generator
movf id3, r2
mulf #60.0, r2
movf id2, r1
addf r2, r1
movf r1, id1
24. Compiler Passes
• number of Passes
– Single - read input file, write output file. It is
Preferred
– Multiple - Each pass may consist of several phases.
It is Easier than single, but less efficient because
it needs more memory
25. Chapter 2: Assemblers
• Concepts to be learned
– Assembler directives, forward references, two-pass assembly
– Opcode table and symbol table
– Two-pass assembly process and location counter
– Program-counter relative and base relative addressing
– Program relocation and modification records
– Literals, literal pool, and literal table
– Program blocks and multiple location counters
– Control sections and independent assembly/compilation
– External references and external definitions
– One-pass and multi-pass assemblers
26. Real Machines
• Machine architecture differs in:
• Machine code
• Instruction formats
• Addressing mode
• Registers
• Complex Instruction Set Computers (CISC)
– Relatively large and complicated instruction set, more
instruction formats, instruction lengths, and
addressing modes
– Hardware implementation is complex
– Examples: VAX and Intel x86
• Reduced Instruction Set Computers (RISC)
– Simplified design, faster and less expensive
processor development, greater reliability, faster
instruction execution times
– Examples: Sun SPARC and PowerPC
27. Simplified Instructional Computer
(SIC) Architecture
• Why the simplified instructional computer
– A hypothetical computer designed to include common
hardware features while avoiding irrelevant
complexities
– Separate the central concepts of system software
from the implementation details associated with a
particular machine
– A good starting point to begin the design of system
software for a new or unfamiliar computer.
– Two versions (upward compatible)
• Standard model
• XE version (extra equipment)
28. SIC Machine Architecture
• Memory
– 8-bit bytes; 3-byte words (24 bits); byte addresses;
total of 32,768 (2^15) bytes in the memory
• Registers - 5 registers, 24 bits in length
Mnemonic  Number  Special use
A         0       Accumulator; used for arithmetic operations
X         1       Index register; used for addressing (offset)
L         2       Linkage register; the Jump to Subroutine (JSUB) instruction stores the return address here
PC        8       Program counter; contains the address of the next instruction to be fetched for execution
SW        9       Status word; contains a variety of information, including a Condition Code (CC)
29. SIC Machine Architecture (continue)
• Data formats
– Integers are stored as 24-bit binary numbers
– Characters are stored using 8-bit ASCII codes
– No floating point hardware
• Instruction formats
8 1 15
opcode x address
– All opcodes end with the bits 00
– Flag bit x indicates indexed-addressing mode
( ): Contents of a register or a memory location
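To make the 8/1/15 layout concrete, here is a minimal sketch that packs an opcode, the x flag, and a 15-bit address into one 24-bit SIC instruction. The function name is hypothetical; the opcode values 14 (STL) and 00 (LDA) are read off the object codes in the assembly listing later in these notes.

```python
def encode_sic(opcode, address, indexed=False):
    """Pack a standard SIC instruction: 8-bit opcode, 1-bit x flag, 15-bit address."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in 8 bits")
    if not 0 <= address < 2 ** 15:
        raise ValueError("address must fit in 15 bits")
    word = (opcode << 16) | (int(indexed) << 15) | address
    return f"{word:06X}"


print(encode_sic(0x14, 0x1033))                # STL RETA
print(encode_sic(0x00, 0x1036))                # LDA LEN
print(encode_sic(0x00, 0x1036, indexed=True))  # same address with indexed addressing
```

Setting the x flag simply turns on the top bit of the 16-bit address half, which is why the indexed form of 1036 shows up as 9036.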
30. SIC Machine Architecture (continue)
• Instruction set
– Load and store registers
• LDA, LDX, STA, STX
– Integer arithmetic operations (involve register A and a
word in memory, save result in A)
• ADD, SUB, MUL, DIV
– Comparison (involves register A and a word in
memory, save result in the condition code (CC) of
SW)
• COMP
– Conditional jump instructions (according to CC)
• JLT, JEQ, JGT
– Subroutine linkage
• JSUB (jumps and places the return address in register L)
31. SIC Machine Architecture (continue)
• Input and output
– Transfer 1 byte at a time to or from the
rightmost 8 bits of register A
– Each device is assigned a unique 8-bit code
– 3 I/O instructions: Test Device (TD), Read
Data (RD) and Write Data (WD)
• Test device (TD):
– Test whether the device is ready to send/receive
– Test result is set in CC
• Read data (RD): read one byte from the device to
register A
• Write data (WD): write one byte from register A to
the device
– Repeat for each byte, time consuming
32. SIC/XE Machine Architecture
• Increased memory - Total of 1 megabyte (2^20 bytes)
• Additional registers
B 3 Base register; used for addressing
S 4 General working register
T 5 General working register
F 6 Floating-point accumulator (48 bits)
• Additional data formats
48-bit floating-point data type
1 11 36
s exponent fraction
33. SIC/XE Machine Architecture (continue)
• Varied instruction formats
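Field widths are given in bits.
Format 1 (1 byte):   op (8)
Format 2 (2 bytes):  op (8) | r1 (4) | r2 (4)
Format 3 (3 bytes):  op (6) | n (1) | i (1) | x (1) | b (1) | p (1) | e (1) | disp (12)
Format 4 (4 bytes):  op (6) | n (1) | i (1) | x (1) | b (1) | p (1) | e (1) | address (20)
Flag bits n, i, x, b, p indicate addressing modes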
34. SIC/XE Machine Architecture (continue)
Addressing mode          Flag values      Target address (TA)
Direct                   i = 1, n = 1     TA = disp (format 3); TA = address (format 4)
Relative, base relative  b = 1, p = 0     TA = (B) + disp; only with format 3
Relative, PC relative    b = 0, p = 1     TA = (PC) + disp; only with format 3
Immediate                i = 1, n = 0     TA = operand value
Indirect                 i = 0, n = 1     TA = (disp) (format 3); TA = (address) (format 4)
Simple                   i = 0, n = 0     b, p, e are part of the address field; TA = disp [15 bits]; upward compatible with SIC
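The bit layouts of formats 3 and 4 can be checked with a short encoder. This is a sketch under assumptions: the function names are hypothetical, and the format 3 inputs (opcode 14 with a PC-relative displacement of 2D) are illustrative values. The format 4 case reproduces the object code 4B101036 that appears in the program development figure earlier in these notes.

```python
def encode_format3(opcode, n, i, x, b, p, disp):
    """Pack a SIC/XE format 3 instruction (e = 0). opcode is the 8-bit value ending in 00."""
    if opcode & 0x03:
        raise ValueError("the low two bits of the opcode must be 00")
    if not -2048 <= disp <= 4095:
        raise ValueError("displacement does not fit in 12 bits")
    flags = (n << 5) | (i << 4) | (x << 3) | (b << 2) | (p << 1)
    word = ((opcode >> 2) << 18) | (flags << 12) | (disp & 0xFFF)
    return f"{word:06X}"


def encode_format4(opcode, n, i, x, address):
    """Pack a SIC/XE format 4 instruction (b = p = 0, e = 1) with a 20-bit address."""
    if opcode & 0x03:
        raise ValueError("the low two bits of the opcode must be 00")
    if not 0 <= address < 2 ** 20:
        raise ValueError("address does not fit in 20 bits")
    flags = (n << 5) | (i << 4) | (x << 3) | 1
    word = ((opcode >> 2) << 26) | (flags << 20) | address
    return f"{word:08X}"


print(encode_format3(0x14, n=1, i=1, x=0, b=0, p=1, disp=0x2D))  # PC relative
print(encode_format3(0x3C, n=1, i=1, x=0, b=0, p=1, disp=-20))   # backward jump
print(encode_format4(0x48, n=1, i=1, x=0, address=0x1036))       # +JSUB
```

A negative PC-relative displacement is stored in 12-bit two's complement, which is what the `disp & 0xFFF` mask does.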
37. SIC/XE Machine Architecture (continue)
• Additional instruction set
– Load and store new registers (LDB, STB, etc.)
– Floating-point arithmetic operations (ADDF, SUBF, MULF, DIVF)
– Register move (RMO)
– Register-to-register arithmetic operations (ADDR, SUBR, MULR,
DIVR)
– Special supervisor call instruction (SVC) – generating interrupt to
communicate with the OS
• Additional input and output feature
– Provide I/O channels, overlapping computing with I/O
– Instructions SIO, TIO, and HIO are used to start, test, and halt
the operation of I/O channels
43. SIC Programming Examples (1) Data Movement
[Figure: data movement example programs for SIC and SIC/XE, with columns for address labels, mnemonic opcode, operands, and comments.]
• SIC: assembler directives are used for defining storage
• SIC/XE: immediate addressing makes the program faster
due to fewer memory references
47. Basic Assembler Functions
• Assembler handles mnemonic operation codes,
constants, literals, directives and addressing modes
• Simple assembler and the assembly process (Role of
Assembler)
– Convert mnemonic operation codes to their machine
language equivalents
– Convert symbolic operands to their equivalent
machine addresses
– Build the machine instructions in the proper format
– Convert the data constants specified in the source
program into their internal machine representations
– Write the object program and the assembly listing
48. Basic Assembler Functions (continue)
• Assembler directives (Fig.2.1, page 45)
– START Specify name and starting address for the program
– END Indicate the end of the source program and (optionally)
specify the first executable instruction in the program
– BYTE Generate character or hexadecimal constant, occupying
as many bytes as needed to represent the constant
– WORD Generate one-word integer constant
– RESB Reserve the indicated number of bytes for a data area
– RESW Reserve the indicated number of words for a data area
• Process assembler directives
– Directives are not translated into machine instructions; they
provide instructions to the assembler itself
52. Basic Assembler Functions (continue)
Forward references (Fig.2.1, page 45)
• Definition
– A reference to a label that is defined later in
the program
– Line by line translation is problematic
• Solution
– Two passes
• First pass: scan the source program for label
definitions and assign addresses
• Second pass: perform most of the actual
instruction translation
53. Basic Assembler Functions (continue)
• Functions of the two passes
pass 1: (Define symbols) loop until the end of the program
1. read in a line of assembly code
2. assign an address to this line, then increment the address
by N (word addressing or byte addressing)
3. save address values assigned to labels in the symbol table
4. process assembler directives (constant declaration,
space reservation)
pass 2: (assemble instructions and generate object program) same loop
1. read in a line of code
2. validate and translate op code using op code table
3. change labels to addresses using the symbol table
4. process assembler directives
5. produce object program
Loc   Source Statement      Object Code
1000  COPY  START  1000
1000  FIRST STL    RETA     141033
1003  LOOP  JSUB   RD       482039
1006        LDA    LEN      001036
1009        COMP   ZERO     281030
100C        JEQ    ENDF     301015
100F        JSUB   WR       482061
1012        J      LOOP     3C1003
1015  ENDF  LDA
…
102A  EOF   BYTE   C’EOF’   454F46
…
1033  RETA  RESW   1
1036  LEN   RESW   1
…
2039  RD    LDX    ZERO     041030
…
2061  WR    LDX    ZERO     041030
…
Object program:
H^COPY ^001000^00107A
T^001000^1E^141033^482039^001036 ….
…
E^001000
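The two passes can be sketched as follows. This is a simplified illustration, not the textbook algorithm. It supports only a handful of standard SIC instructions (opcode values taken from the listing above) plus START, END, WORD, RESW, and RESB. The source is assumed to be pre-split into (label, opcode, operand) tuples, and the sample program is invented for the example.

```python
# Opcode values read off the object codes in the listing (e.g. STL -> 14).
OPTAB = {"LDA": 0x00, "LDX": 0x04, "STL": 0x14, "COMP": 0x28,
         "JEQ": 0x30, "J": 0x3C, "JSUB": 0x48}


def pass1(lines):
    """Assign an address to every statement and record label addresses in SYMTAB."""
    symtab, intermediate = {}, []
    locctr = 0
    for label, opcode, operand in lines:
        if opcode == "START":
            locctr = int(operand, 16)  # starting address is given in hex
            continue
        if opcode == "END":
            break
        if label:
            if label in symtab:
                raise ValueError(f"duplicate symbol {label}")
            symtab[label] = locctr
        intermediate.append((locctr, opcode, operand))
        if opcode in OPTAB or opcode == "WORD":
            locctr += 3
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        else:
            raise ValueError(f"invalid operation code {opcode}")
    return symtab, intermediate


def pass2(symtab, intermediate):
    """Translate opcodes and symbolic operands into 24-bit object code."""
    listing = []
    for loc, opcode, operand in intermediate:
        if opcode in OPTAB:
            if operand not in symtab:
                raise ValueError(f"undefined symbol {operand}")
            listing.append((loc, f"{(OPTAB[opcode] << 16) | symtab[operand]:06X}"))
        elif opcode == "WORD":
            listing.append((loc, f"{int(operand) & 0xFFFFFF:06X}"))
        # RESW and RESB reserve space but generate no object code
    return listing


source = [
    ("DEMO", "START", "1000"),
    ("FIRST", "LDA", "LEN"),    # forward reference to LEN
    (None, "COMP", "ZERO"),     # forward reference to ZERO
    (None, "JEQ", "FIRST"),
    ("ZERO", "WORD", "0"),
    ("LEN", "RESW", "1"),
    (None, "END", "FIRST"),
]
symtab, intermediate = pass1(source)
for loc, code in pass2(symtab, intermediate):
    print(f"{loc:04X} {code}")
```

Because pass 1 has already filled SYMTAB, the forward references to LEN and ZERO resolve without any special handling in pass 2.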
54. Basic Assembler Functions (continue)
• Two-pass assembly structure
OPTAB
Source program Pass 1 Intermediate file Pass 2 Object program
LOCCTR SYMTAB
Intermediate file contains Source statement
with :
assigned address
Error indicators
Pointers to OPTAB and SYMTAB
Etc.
55. Basic Assembler Functions (continue)
• Output object program
- assembler must write object code to some output
device for later execution
• Simple object program format (Fig.2.3, page 49)
– Header record contains program name, starting
address, length
– Text record contains machine code (translated
instructions and data) with an indication of the
addresses where these are to be loaded
– End record marks the end of object code program
(see textbook pp.49 for details)
56. Object Program Format
• Header
Col. 1 H
Col. 2~7 Program name
Col. 8~13 Starting address of object program (hex)
Col. 14~19 Length of object program in bytes (hex)
• Text
Col. 1 T
Col. 2~7 Starting address for object code in this record (hex)
Col. 8~9 Length of object code in this record in bytes (hex)
Col. 10~69 Object code, represented in hex (2 col. per byte)
• End
Col. 1 E
Col. 2~7 Address of first executable instruction in object program (hex)
1033-2038: Storage reserved by the loader
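The column layouts above map directly onto fixed-width string formatting. The helper names below are hypothetical; the sample values (program COPY, start 001000, length 00107A, and the first three object codes) are taken from the example object program earlier in these notes.

```python
def header_record(name, start, length):
    """Col. 1 'H', col. 2~7 name, col. 8~13 start address, col. 14~19 length."""
    return f"H{name:<6.6}{start:06X}{length:06X}"


def text_record(start, object_codes):
    """Col. 1 'T', col. 2~7 start address, col. 8~9 byte count, col. 10~69 object code."""
    body = "".join(object_codes)
    nbytes = len(body) // 2          # two hex columns per byte
    if nbytes > 30:
        raise ValueError("a text record holds at most 30 bytes (60 hex columns)")
    return f"T{start:06X}{nbytes:02X}{body}"


def end_record(first_executable):
    """Col. 1 'E', col. 2~7 address of the first executable instruction."""
    return f"E{first_executable:06X}"


print(header_record("COPY", 0x1000, 0x107A))
print(text_record(0x1000, ["141033", "482039", "001036"]))
print(end_record(0x1000))
```

The program name is padded with blanks to six columns, so the header is always 19 columns wide.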
57. Basic Assembler Functions (continue)
• Assembler data structure
– Operation Code Table (OPTAB)
Mnemonic code Machine code
– Symbol Table (SYMTAB)
Label Address
- LOCCTR
A variable accumulated for address assignment,
i.e., LOCCTR gives the address of the
associated label.
• Assembler algorithm
– See Fig.2.4, practice with example in Fig.2.1, Fig.2.2.
58. Data Structures for Assembler
Operation Code Table
• Contents:
– Mnemonic operation codes
– Machine language equivalents
– Instruction format and length
• During pass 1:
– Validate operation codes
– Find the instruction length to increase LOCCTR
• During pass 2:
– Determine the instruction format
– Translate the operation codes to their machine language
equivalents
• key: mnemonic code
• result: bits
• Implementation: a static hash table is usually used
• once prepared, the table is not changed
• efficient lookup is desired
• since mnemonic code is predefined, the hashing function
can be tuned a priori
59. Data Structures for Assembler (cont’d)
Symbol table
• Contents:
– Label name
– Label address
– Flags (to indicate error conditions)
– Data type or length
• During pass 1:
– Store label name and assigned address (from LOCCTR) in SYMTAB
• efficient insertion and retrieval is needed
• deletion does not occur
• During pass 2:
– Symbols used as operands are looked up in SYMTAB
• Implementation:
– a dynamic hash table for efficient insertion and retrieval
– Should perform well with non-random keys (LOOP1, LOOP2, X1,
X2).
• problem
66. Why Program Relocation
• To increase the productivity of the machine
• Want to load and run several programs at the
same time (multiprogramming)
• Must be able to load programs into memory
wherever there is room
• Actual starting address of the program is not
known until load time
67. Absolute Program
• Program with starting address specified at assembly
time
• The example SIC assembly program (Fig. 2.2) starts at
address 1000 (COPY START 1000). The following
statement means "load register A from memory address
1036":
Instruction: 55 101B LDA THREE 001036
(the address 1036 is calculated from the starting address 1000)
Instruction: 100 1036 THREE RESW 1
• The address may be invalid if the program is loaded
somewhere else.
69. What Needs to be Relocated
• Need to be modified:
– The address portion of those instructions
that use absolute (direct) addresses.
• Need not be modified:
– Register-to-register instructions (no
memory references)
– PC or base-relative addressing (relative
displacement remains the same regardless
of different starting addresses)
70. How to Relocate Addresses
• For Assembler
– For an address label, its address is assigned
relative to the start of the program (that’s why
START 0)
– provides loader with information about
• which address needs fixing
• length of address field
– Produce a modification record to store the starting
location and the length of the address field to be
modified.
• For loader
– For each modification record, add the actual
beginning address of the program to the address
field at load time.
71. Format of Modification Record
• One modification record for each address to be modified
• The length is stored in half-bytes (20 bits = 5 half-bytes)
• The starting location is the location of the byte containing the
leftmost bits of the address field to be modified.
• If the field contains an odd number of half-bytes, the starting
location begins in the middle of the first byte.
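What the loader does with one modification record can be sketched as below. The function name and the bytearray memory image are assumptions for illustration. The example relocates the format 4 instruction 4B101036 by a load address of 5000 (hex), giving 4B106036, the pair of values shown in the program development figure earlier in these notes.

```python
def apply_modification(image, start, half_bytes, load_address):
    """Add load_address to an address field of the given length (in half-bytes).

    start is the offset of the byte that contains the leftmost bits of the field.
    If the length is odd, the field begins in the middle of that first byte, so
    the high half-byte of the first byte is preserved.
    """
    nbytes = (half_bytes + 1) // 2
    field = int.from_bytes(image[start:start + nbytes], "big")
    mask = (1 << (4 * half_bytes)) - 1
    relocated = ((field & mask) + load_address) & mask
    new_value = (field & ~mask) | relocated
    image[start:start + nbytes] = new_value.to_bytes(nbytes, "big")


# A format 4 instruction at offset 0: the 20-bit address field starts in byte 1.
image = bytearray.fromhex("4B101036")
apply_modification(image, start=1, half_bytes=5, load_address=0x5000)
print(image.hex().upper())
```

The flag half-byte in front of the 20-bit address is left untouched, which is why the length is counted in half-bytes rather than whole bytes.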
73. Machine-Dependent Assembler Features
• Use register-to-register instructions whenever possible
– Take advantage of additional registers
– Reduce instruction length
– Avoid memory references; speed up execution
• Use immediate addressing as much as possible
– Avoid memory reference
– Can be combined with relative addressing
• Use indirect addressing as much as possible
– Avoids the need for another instruction
– Can be combined with relative addressing
74. Machine-Dependent Assembler Features
• Most register-to-memory instructions are assembled
using relative addressing
– Reduce instruction length
– simplify program relocation
– The displacement must fit in 12 bits;
otherwise use format 4
– Whether to use PC relative or base relative addressing is
the programmer’s choice
75. Machine-Dependent Assembler Features
Extended features reflected in code (Fig 2.5)
• Prefix denotations
@ - indirect addressing
# - immediate addressing
+ - instruction format 4 is used, no displacement
• Additional assembly directives
– BASE: Base-Relative addressing mode used
– NOBASE: cancel Base-Relative addressing
• Additional instructions
– COMPR: compare values in two registers (format 2)
76. Machine-Dependent Assembler Features
• Program Relocation (Fig.2.6, 2.7, 2.8)
– Multiprogramming; shared memory
– Load-time binding
– Relocatable program instead of absolute program
– Assembler generates relative address (assume the
program starts at 0)
– Object program includes modification record
Col.1 M
Col.2-7 Starting location of the address field to be modified,
relative to the beginning of the program
Col.8-9 Length of the address field to be modified, in half-bytes
77. Machine-Dependent Assembler Features
• Address modification
– Add the beginning address to the address field of an
instruction
– Instructions that need to be modified at load time
• Specific direct addresses
• For SIC/XE, only in format 4
– Instructions that need not be modified at load time
• Operand is not a memory address
• PC relative or base relative addressing is used
• Immediate + relative addressing is used
78. Machine-Independent Assembler Features
• Literals (Fig.2.9,2.10)
– The value of a constant operand directly stated in the instruction
– Label and BYTE statement are avoided
– Same effect as using BYTE statement, same object code
– Prefix notation: =, followed by a specification of the literal value
– Example:
45 001A ENDFIL LDA =C’EOF’ 032010
215 1062 WLOOP TD =X’05’ E32011
Literal: the assembler generates the specified value as
a constant at some other memory location
Immediate operand: the operand value is assembled as part of the
machine instruction
79. Machine-Independent Assembler Features
• Literal pool (Fig.2.9, 2.10)
– At the end of the program
– At certain locations in the program
• use directive LTORG
• containing all the literal operands used since
previous LTORG
• Keep the literal operand close to the instruction
that uses it
• Enable relative addressing, avoid using instruction
format 4
80. Machine-Independent Assembler Features
• Duplicate literals
– The same literal used in more than one place
– Store only one copy of the specified data value
– The literals =C’EOF’ and =X’454F46’ have identical
operand values
– Problems: same literal name, different values
• Literals whose value depends upon their location in the
program
• when a literal refers to any item whose value changes
(location counter)
81. Machine-Independent Assembler Features
• Assembly process for literals
Literal table (LITTAB) columns: Literal name | Operand value and length | Address assigned
Pass 1
– Literal operand recognized
• Search LITTAB
• Add literal to LITTAB if it is not present
– Encounter a LTORG or end of program
• Scan LITTAB, assign address to each literal
Pass 2
– Literal operand encountered
• Search LITTAB to obtain operand address
– Insert data values of literals into appropriate places in the
object program
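The LITTAB handling above might look like the following sketch. The class and method names are assumptions, and the starting LOCCTR value is arbitrary. Literals are keyed by their operand value rather than by name, so that duplicates such as =C'EOF' and =X'454F46' share one entry.

```python
def literal_value(literal):
    """Return the bytes denoted by a literal of the form =C'...' or =X'...'."""
    kind, body = literal[1], literal[3:-1]
    if kind == "C":
        return body.encode("ascii")
    if kind == "X":
        return bytes.fromhex(body)
    raise ValueError(f"unsupported literal {literal}")


class LiteralTable:
    def __init__(self):
        # operand value -> {"name": first spelling seen, "address": None until pooled}
        self.entries = {}

    def add(self, literal):
        """Pass 1: add the literal to LITTAB if its value is not already present."""
        self.entries.setdefault(literal_value(literal), {"name": literal, "address": None})

    def assign_addresses(self, locctr):
        """Pass 1, at LTORG or end of program: place unassigned literals in a pool."""
        for value, entry in self.entries.items():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += len(value)
        return locctr

    def address_of(self, literal):
        """Pass 2: look up the operand address for a literal."""
        return self.entries[literal_value(literal)]["address"]


littab = LiteralTable()
for lit in ["=C'EOF'", "=X'05'", "=X'454F46'"]:
    littab.add(lit)
next_loc = littab.assign_addresses(0x002D)
print(f"{littab.address_of('=C' + chr(39) + 'EOF' + chr(39)):04X}", f"{next_loc:04X}")
```

Keying on the value is one possible design; it does not handle the problem case noted on the duplicate-literals slide, where a literal's value depends on its location.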
82. Machine-Independent Assembler Features
• Symbol–defining statements (Fig 2.9, 2.10)
– EQU statement symbol EQU value
• Directly assign values to symbols
• Insert symbols into SYMTAB
– ORG statement ORG value
• Indirectly assign values to symbols
• reset LOCCTR to the specified value
• Affect the values of all labels defined until the next ORG
• Useful when defining the internal structure of the symbol
table
– Restrictions: values should be constants or expressions
involving constants and previously defined symbols
83. Machine-Independent Assembler Features
• EQU statement examples
MAXLEN EQU 4096
+LDT #MAXLEN
– Use symbol instead of numeric values, improve readability, easy to
find and change values
A EQU 0
X EQU 1
L EQU 2
– Define mnemonic names for registers; some instructions may require
register numbers instead of names (RMO)
BASE EQU R1
COUNT EQU R2
INDEX EQU R3
– Define general-purpose registers as special registers
84. Machine-Independent Assembler Features
• ORG statement Examples
STAB (100 entries), each entry laid out as:
SYMBOL (6-byte) | VALUE (3-byte) | FLAGS (2-byte)
STAB RESB 1100 ;reserve space for the symbol table
ORG STAB ;reset LOCCTR to the value of STAB
SYMBOL RESB 6 ;assign to SYMBOL the address STAB
VALUE RESW 1 ;assign to VALUE the address STAB+6
FLAGS RESB 2 ;assign to FLAGS the address STAB+9
ORG STAB+1100 ;set LOCCTR to its previous value
86. Machine-Independent Assembler Features
• Expressions
– Absolute expressions
• An expression that contains only absolute terms
• Or an expression that contains relative terms which occur in pairs,
where the terms in each such pair have opposite signs
– Relative expressions
• An expression in which all the relative terms except one can
be paired as described above; the remaining unpaired
relative term must have a positive sign.
– Error expressions
• Neither absolute nor relative expressions
• Should be flagged by the assembler as errors
87. Machine-Independent Assembler Features
• Expression terms
– Constant (absolute term)
– Label (relative term)
– Symbol defined by EQU (absolute or relative term, depending on
the expression used to define its value)
– Special term * used to refer to LOCCTR (relative)
• Type flag in SYMTAB
Symbol  Type  Value
RETADR  R     0030
BUFFER  R     0036
BUFEND  R     1036
MAXLEN  A     1000
88. Machine-Independent Assembler Features
• Expression rules
– Legal expressions are those whose value remains meaningful
when program is relocated;
– None of the relative terms may enter into a multiplication or
division operation
• Expression example (Fig.2.9, 2.10)
107 MAXLEN EQU BUFEND-BUFFER
– Illegal expressions
BUFEND + BUFFER
100 – BUFFER
3 * BUFFER
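The pairing rule for absolute and relative expressions reduces to counting signed relative terms. The sketch below assumes an expression has already been broken into (sign, type) pairs, with type "A" or "R" as in the SYMTAB type flag. Multiplication and division are left out: an expression such as 3 * BUFFER would be rejected before this check because a relative term enters a multiplication.

```python
def classify(terms):
    """Classify a sum of signed terms as 'absolute', 'relative', or 'error'.

    terms is a list of (sign, type) pairs: sign is +1 or -1, type is "A"
    (absolute) or "R" (relative). Opposite-signed relative terms cancel in pairs.
    """
    net_relative = sum(sign for sign, kind in terms if kind == "R")
    if net_relative == 0:
        return "absolute"
    if net_relative == 1:
        return "relative"
    return "error"


print(classify([(+1, "R"), (-1, "R")]))  # BUFEND - BUFFER
print(classify([(+1, "R"), (+1, "R")]))  # BUFEND + BUFFER
print(classify([(+1, "A"), (-1, "R")]))  # 100 - BUFFER
```

A net count of one positive relative term means the value still moves with the program when it is relocated, which is exactly the relative case.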
89. Machine-Independent Assembler Features
• Program Blocks (Fig.2.11, 2.12)
– Segments of code that are rearranged within a single
object program unit
– Each program block may contain several separate
segments of the source program
– The assembler provides reorganization
90. Machine-Independent Assembler Features
• Benefits of using program blocks
– Move large buffer area to the end of the object
program, avoid using format 4
– Base register is avoided
– Place literals ahead of any large data areas
– Separate source program order from object program
order
91. Machine-Independent Assembler Features
• Assembler directive USE
– indicates which portions of the source program belong
to the various blocks
– Example:
92 USE CDATA ;begin block named CDATA
103 USE CBLKS ;begin block named CBLKS
183 USE ;resume the default block
92. Machine-Independent Assembler Features
• Assembler handling for program blocks (Fig.2.12)
Pass 1
Separate LOCCTR for each block, initialized to 0
Save and restore LOCCTR values when switching between two blocks
Each label is assigned an address relative to the start of the block that
contains it, and label address is stored with block number in SYMTAB
Constructs a table that contains the starting addresses and lengths for all
blocks
Pass 2
Generate address for each symbol relative to the start of the object program
Access SYMTAB, and add the location of the symbol to the block starting
address
93. Machine-Independent Assembler Features
• Block table (Fig.2.12)
Block name Block number address Length
(default) 0 0000 0066
CDATA 1 0066 000B
CBLKS 2 0071 1000
• SYMTAB
Symbol Address Block number
LENGTH 0003 1
• Object program respecting program blocks (see
Fig.2.13, 2.14)
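The block table and the pass 2 address calculation can be reproduced with a few lines. The function names are hypothetical; the block lengths and the LENGTH symbol are the values shown in the tables above.

```python
def build_block_table(blocks):
    """Assign each block a starting address, in order, from the block lengths."""
    table = {}
    start = 0
    for number, (name, length) in enumerate(blocks):
        table[number] = {"name": name, "address": start, "length": length}
        start += length
    return table


def program_address(block_table, relative_address, block_number):
    """Pass 2: address relative to the start of the object program."""
    return block_table[block_number]["address"] + relative_address


table = build_block_table([("(default)", 0x0066), ("CDATA", 0x000B), ("CBLKS", 0x1000)])
for number, row in table.items():
    print(number, row["name"], f"{row['address']:04X}", f"{row['length']:04X}")

# LENGTH is at relative address 0003 in block 1 (CDATA)
print(f"{program_address(table, 0x0003, 1):04X}")
```

Each block starts where the previous one ends, so CDATA begins at 0066 and CBLKS at 0071, matching the table.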
94. Machine-Independent Assembler Features
• Control sections (Fig.2.15, 2.16)
– Segments that are translated into independent object
program units.
– Each control section can be loaded and relocated
independently of others.
– Programmer can assemble and manipulate each
control section separately
– Mostly used for subroutines or other logical
subdivisions of a program
• Assembler directive CSECT
– signals the start of a new control section
95. Machine-Independent Assembler Features
• External reference (Fig.2.15, 2.16)
– References between control sections
– Assembler has no idea about other control sections’ location at
execution time
– Assembler directives EXTDEF (external definition) and EXTREF (external reference)
EXTDEF Define external symbols that may be used by
other sections
EXTREF Names symbols that are used in this control
section and are defined elsewhere
– Control section names are automatically external symbols
96. Machine-Independent Assembler Features
• Assembler handling for control sections
– Separate LOCCTR for each section, initialized to 0
– Inserts an address of zero to external reference
15 0003 CLOOP +JSUB RDREC 4B100000
160 0017 +STCH BUFFER,X 57900000
190 0028 MAXLEN WORD BUFEND-BUFFER 000000
– Format 4 has to be used for external reference (relative
addressing is not possible )
– Assembler must remember (via entries in SYMTAB) in which
control section a symbol is defined
– References to unidentified external symbol are flagged as an
error.
– Same symbol name can be used in different sections
97. Machine-Independent Assembler Features
• Object program respecting control sections (Fig.2.17)
– Define record
– Refer record
– Modification record
(See p. 89 for details)
98. Assembler Design Options
• One-pass Assemblers
– Must solve the problem of forward references
– Require data items to be defined before they are referenced
– Special handling of symbols
• Two types of one-pass assemblers
1. Produces object code directly in memory for
immediate execution (load-and-go)
2. Produces object program for later execution
99. Assembler Design Options
• Load-and-go assembler features
– Produce object code directly in memory
– Load and go, no loader is needed
– Efficient assembly process, good for program
development and testing
– Generate absolute code at assembly time
100. Assembler Design Options
• Load-and-go assembler handling for symbols (Fig. 2.18,
2.19)
– Encounter a symbol operand that hasn’t been defined
– Omit the operand address
– Enter the symbol into SYMTAB, flag it as undefined
– Add the address of this symbol operand to a list of forward
references associated with the SYMTAB entry
– Encounter the definition for a symbol
– Scan the forward reference list for this symbol
– Insert the proper symbol address into the listed address
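The forward reference list can be sketched as follows. This is a minimal illustration, assuming memory is modelled as a dictionary from operand location to operand address; the class name and the addresses in the usage lines are hypothetical.

```python
class LoadAndGoSymtab:
    """Symbol table that patches forward references directly in memory."""

    def __init__(self, memory):
        self.memory = memory   # operand location -> operand address
        self.defined = {}      # symbol -> address
        self.pending = {}      # undefined symbol -> list of operand locations

    def reference(self, symbol, operand_loc):
        if symbol in self.defined:
            self.memory[operand_loc] = self.defined[symbol]
        else:
            # Omit the operand address and remember where it belongs.
            self.memory[operand_loc] = None
            self.pending.setdefault(symbol, []).append(operand_loc)

    def define(self, symbol, address):
        self.defined[symbol] = address
        # Scan the forward reference list and fill in each listed location.
        for operand_loc in self.pending.pop(symbol, []):
            self.memory[operand_loc] = address

    def undefined_symbols(self):
        return sorted(self.pending)


memory = {}
symtab = LoadAndGoSymtab(memory)
symtab.reference("RDREC", 0x2013)   # forward reference
symtab.reference("RDREC", 0x201F)   # another one
symtab.define("RDREC", 0x203D)      # both locations are patched here
```

Any symbol still listed by `undefined_symbols()` at the end of the program is an error.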
101. Assembler Design Options
• Features of the one-pass assembler that output object
programs
– Produce object programs as output
– Used on systems where external storage is slow
(eliminates the intermediate file)
– Generate extra text record in object program to
handle forward references
– Insert addresses for forward references during
loading time
102. Assembler Design Options
• One-pass assembler handling for symbols (output object
programs) (Fig. 2.20)
– Encounter a symbol operand that hasn’t been defined
– Generate the operand address as 0000
– Enter the symbol into SYMTAB, flag it as undefined
– Add the address of this symbol operand to a list of forward
references associated with the SYMTAB entry
– Encounter the definition for a symbol
– Scan the forward reference list for this symbol
– Generate Text record to insert the proper operand address into
the listed address
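When the object program has already been written out, the patch has to be expressed as an extra Text record. The sketch below assumes standard SIC (a two-byte operand address) and a Text record laid out as T, a 6-digit start address, a 2-digit byte count, and the object code; the function name and the sample addresses are illustrative.

```python
def forward_reference_records(operand_locations, symbol_address):
    """Build one Text record per pending reference to a newly defined symbol."""
    records = []
    for loc in operand_locations:
        # T + start address + byte count (02) + the 2-byte operand address.
        records.append(f"T{loc:06X}02{symbol_address:04X}")
    return records


print(forward_reference_records([0x201C], 0x2024))  # ['T00201C022024']
```

When the loader processes such a record, it overwrites the 0000 placeholder that the assembler generated earlier.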
103. Assembler Design Options
• Multi-pass assemblers (Fig.2.21)
– Eliminate the prohibition of forward references in
symbol definition
– Make as many passes as are needed to process the
definition of symbols
– The assembler still passes over the entire program only twice
– The additional passes scan only the symbol definitions stored
during Pass 1 that involve forward references
– Finally, a normal pass2 is made
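The repeated scanning of stored definitions can be illustrated with a simplified resolver. This is a sketch only: it rescans the pending definitions until no more can be resolved, rather than maintaining the dependency lists a real multi-pass assembler keeps in SYMTAB. The symbol names and values in the usage lines are illustrative.

```python
def resolve_definitions(symtab, pending):
    """Resolve symbols whose definitions contain forward references.

    pending maps a symbol to (symbols it depends on, function of symtab).
    """
    pending = dict(pending)
    while pending:
        ready = [name for name, (deps, _) in pending.items()
                 if all(dep in symtab for dep in deps)]
        if not ready:
            raise ValueError(f"unresolvable symbols: {sorted(pending)}")
        for name in ready:
            _, compute = pending.pop(name)
            symtab[name] = compute(symtab)
    return symtab


symtab = {"BUFFER": 0x1034, "BUFEND": 0x2034}
pending = {
    "HALFSZ": (["MAXLEN"], lambda t: t["MAXLEN"] // 2),
    "MAXLEN": (["BUFEND", "BUFFER"], lambda t: t["BUFEND"] - t["BUFFER"]),
    "PREVBT": (["BUFFER"], lambda t: t["BUFFER"] - 1),
}
resolve_definitions(symtab, pending)
print(f"{symtab['MAXLEN']:04X} {symtab['HALFSZ']:04X} {symtab['PREVBT']:04X}")
```

HALFSZ cannot be evaluated on the first scan because MAXLEN is still undefined; it is resolved on the next scan.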
105. Loaders and Linkers
• Concepts to be learned
– Absolute loader, relocatable loader, linking loader,
bootstrap loader
– Independent assembly/compilation and program
linking
– Static and dynamic program libraries
– Linkage editors and linking loaders
– Bootstrap loaders and program loaders
106. Chapter Overview
Loaders and Linkers
• Some concepts and definitions
– Loading, which brings the object program into memory for
execution
– Relocation, which modifies the object program so that it can be
loaded at an address different from the location originally
specified
– Linking, which combines two or more separate object programs
and supplies the information needed to allow references
between them
– Loader, a system program that performs the loading function,
and may also support relocation and linking
107. Basic Loader Functions
• Design of an absolute loader (Fig.3.1, 3.2)
– A single pass is enough
– Check the Header record to verify the program name
and size
– Read each Text record and move the object code to
the memory address it indicates
– Read the End record and jump to the specified
address to execute the program
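A minimal sketch of an absolute loader follows. It assumes the SIC object program layout (Header: H, 6-character name, 6-digit start address, 6-digit length; Text: T, 6-digit address, 2-digit byte count, object code; End: E, 6-digit address) and models memory as a `bytearray`. The function name and the sample records are hypothetical.

```python
def absolute_load(records, memory):
    """Load Header/Text/End records into memory and return the transfer address."""
    for record in records:
        kind = record[0]
        if kind == "H":
            start = int(record[7:13], 16)
            length = int(record[13:19], 16)
            if start + length > len(memory):
                raise MemoryError("program does not fit in memory")
        elif kind == "T":
            address = int(record[1:7], 16)
            nbytes = int(record[7:9], 16)
            code = bytes.fromhex(record[9:9 + 2 * nbytes])
            if len(code) != nbytes:
                raise ValueError("Text record shorter than its length field")
            memory[address:address + nbytes] = code
        elif kind == "E":
            return int(record[1:7], 16)   # the caller jumps to this address
    raise ValueError("missing End record")


memory = bytearray(0x2000)
records = ["HCOPY  001000000006", "T00100006141033482039", "E001000"]
print(f"{absolute_load(records, memory):04X}")  # 1000
```

Because the addresses in the records are absolute, no relocation or linking step is needed.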
108. Basic Loader Functions
• Representation of object program
– Hexadecimal representation in character form (wastes
memory space and execution time)
    Character in object program  ‘0’ ‘1’ ‘2’ ‘3’ ‘4’ ‘5’ ‘6’ ‘7’ ‘8’ ‘9’ ‘A’ ‘B’ ‘C’ ‘D’ ‘E’ ‘F’
    ASCII code (hex)             30  31  32  33  34  35  36  37  38  39  41  42  43  44  45  46
    Internal value (hex)         0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
    Difference (hex)             30  30  30  30  30  30  30  30  30  30  37  37  37  37  37  37
– In binary form (save memory space and execution
time, but low readability)
109. Basic Loader Functions
• A simple bootstrap loader (Fig. 3.3)
– The loader program begins at address 0 in the memory
– Load the first program to be run by the computer (OS),
– Load the object code into consecutive bytes of memory, starting
at address 80
– Simplified object program (contains only object code, no Header
record, End record, or control information)
– Object code is represented as hexadecimal digits in character
form
– Loader must convert ASCII character code to the value of the
hexadecimal digit that is represented by that character.
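The character-to-value conversion and the byte-by-byte loading can be sketched as below. The subtraction constants 30 and 37 (hex) are the standard ASCII differences for '0' to '9' and 'A' to 'F'; the function names are illustrative, and the sketch assumes the input is an even-length string of hexadecimal digits with no line breaks.

```python
def hex_digit_value(ch):
    """Convert one ASCII hexadecimal character to its numeric value."""
    code = ord(ch)
    if 0x30 <= code <= 0x39:      # '0'..'9': subtract hex 30
        return code - 0x30
    if 0x41 <= code <= 0x46:      # 'A'..'F': subtract hex 37
        return code - 0x37
    raise ValueError(f"not a hexadecimal digit: {ch!r}")


def bootstrap_load(hex_text, memory, start=0x80):
    """Pack pairs of hex characters into consecutive bytes starting at start."""
    address = start
    for i in range(0, len(hex_text), 2):
        high = hex_digit_value(hex_text[i])
        low = hex_digit_value(hex_text[i + 1])
        memory[address] = high * 16 + low
        address += 1
    return address                # first free address after the loaded code


memory = bytearray(0x100)
end = bootstrap_load("4B1000", memory)
print(memory[0x80:end].hex().upper())  # 4B1000
```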
110. Machine-dependent loader Features
• Program relocation and relocation loaders (or relative
loaders)
– For SIC/XE, processing modification record (Fig. 3.4,
3.5)
– For SIC, processing relocation bit mask (Fig. 3.6, 3.7)
• Modification records are not well suited to SIC: without
relative or immediate addressing, almost all the addresses
need to be modified.
• One relocation bit is assigned to each instruction (each SIC
instruction occupies one word).
• The relocation bits are gathered together into a bit mask (3
hexadecimal digits) following the length indicator in each
Text record.
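A relocating loader's use of the bit mask can be sketched as follows. The sketch assumes a Text record laid out as T, 6-digit start address, 2-digit byte count, 3-digit bit mask, then object code made up entirely of 3-byte words; the function name and the sample record are hypothetical.

```python
def relocate_text_record(record, load_address):
    """Return (relocated start address, list of relocated 3-byte words)."""
    start = int(record[1:7], 16)
    nbytes = int(record[7:9], 16)
    mask = int(record[9:12], 16)            # 12 relocation bits
    code = record[12:12 + 2 * nbytes]
    words = [int(code[i:i + 6], 16) for i in range(0, len(code), 6)]
    relocated = []
    for index, word in enumerate(words):
        # The leftmost mask bit belongs to the first word of the record.
        if mask & (0x800 >> index):
            word = (word + load_address) & 0xFFFFFF
        relocated.append(word)
    return start + load_address, relocated


start, words = relocate_text_record("T00000009C00140033481039000003", 0x5000)
print(f"{start:04X}", [f"{w:06X}" for w in words])
```

With mask C00 (binary 1100 0000 0000), the first two words are relocated and the third, a constant, is left unchanged.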
111. Machine-dependent loader Features
• Program linking and linking loaders (Fig.3.8,3.9,3.10)
– Loader processes Define record, Refer record, and
Modification record
– The assembler will evaluate as much of the
expression as it can
– The remaining terms are passed on to the loader via
Modification records
112. Machine-dependent loader Features
• Linking loader data structures (Fig.3.11)
– External symbol table (ESTAB)
Control section Symbol name Address Length
– Program load address (PROGADDR)
• The beginning address in memory where the linked program
is to be loaded
• OS supplies the value of PROGADDR
– Control section address (CSADDR)
• The starting address assigned to the control section currently
being scanned by the loader
• Loader uses this value to convert relative addresses to actual
addresses within the control section
– Control section length (CSLTH)
– Execution address (EXECADDR)
113. Machine-dependent loader Features
• Linking loader algorithms (Fig.3.11)
– Pass 1
• Process only Header and Define records in the object
program
• Construct ESTAB
• Assign address to each control section
• Assign addresses to external symbols
– Pass 2
• Process Text and Modification records in the object program
• Perform the actual loading, relocation, and linking
• For each Text record, move object code to the specified
address (plus the current value of CSADDR)
• For each Modification record, look up ESTAB for the
specified symbol value, add it to or subtract it from the
specified address (plus the current value of CSADDR)
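The two passes can be sketched in Python as below. To keep the example short, each control section is given as an already-parsed dictionary instead of raw records; that layout, the function names, and the sample sections PROGA/PROGB are assumptions made for illustration.

```python
def pass1(sections, progaddr):
    """Build ESTAB: an address for every control section and external symbol."""
    estab = {}
    csaddr = progaddr
    for sec in sections:
        # The control section name is itself an external symbol (offset 0).
        names = {sec["name"]: 0, **sec["defines"]}
        for symbol, relative in names.items():
            if symbol in estab:
                raise ValueError(f"duplicate external symbol {symbol}")
            estab[symbol] = csaddr + relative
        csaddr += sec["length"]
    return estab


def pass2(sections, estab, memory):
    """Load the object code, then apply every Modification record."""
    for sec in sections:
        csaddr = estab[sec["name"]]
        for relative, code in sec["text"]:
            start = csaddr + relative
            memory[start:start + len(code)] = code
        for relative, half_bytes, sign, symbol in sec["mods"]:
            start = csaddr + relative
            nbytes = (half_bytes + 1) // 2
            field = int.from_bytes(memory[start:start + nbytes], "big")
            mask = (1 << (4 * half_bytes)) - 1
            delta = estab[symbol] if sign == "+" else -estab[symbol]
            # Only the low half_bytes nibbles are modified.
            patched = (field & ~mask) | ((field + delta) & mask)
            memory[start:start + nbytes] = patched.to_bytes(nbytes, "big")


sections = [
    {"name": "PROGA", "length": 0x10, "defines": {},
     "text": [(0x0, bytes.fromhex("4B100000"))],
     "mods": [(0x1, 5, "+", "SUBR")]},
    {"name": "PROGB", "length": 0x20, "defines": {"SUBR": 0x08},
     "text": [], "mods": []},
]
memory = bytearray(0x5000)
estab = pass1(sections, progaddr=0x4000)
pass2(sections, estab, memory)
print(memory[0x4000:0x4004].hex().upper())  # 4B104018
```

PROGB is placed right after PROGA at 4010, so SUBR resolves to 4018 and that value is added into the 20-bit address field of the format 4 instruction.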
114. Machine-dependent loader Features
• Transfer address (Fig.3.11)
– The loader transfers control to the loaded program
to begin execution
• Normally, a transfer address is placed in the End record for a
main program, not for a subroutine
• If more than one control section specifies a transfer address,
the loader arbitrarily uses the last one encountered.
• If no control section contains a transfer address, the loader
uses the beginning of the linked program (PROGADDR) as
the transfer point
– Alternatively, user can enter a separate Execute
command to specify the transfer address (some
systems)
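The rule for picking the transfer address can be written as a short function. This is a sketch: it assumes the loader has collected, for each control section in load order, either the absolute transfer address from its End record or `None`; the function name is hypothetical.

```python
def choose_transfer_address(end_addresses, progaddr):
    """Return the address at which execution of the linked program begins."""
    execaddr = progaddr              # default: start of the linked program
    for address in end_addresses:
        if address is not None:
            execaddr = address       # the last one encountered wins
    return execaddr


print(f"{choose_transfer_address([None, 0x4000, None], 0x3000):04X}")  # 4000
```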
115. Machine-dependent loader Features
• Reference number (Fig.3.12)
– In Refer record, assign reference number to each
external symbol
– In Modification record, reference numbers are used
instead of symbol names
– 01 is usually assigned to the control section name
– Avoid multiple searches of ESTAB for the same
symbol during the loading of a control section
– Obtains the values for code modification by simply
indexing into an array of these values
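The indexing idea can be shown with a small table built once per control section. The sketch assumes reference numbers are assigned consecutively from 02 in the order the symbols appear in the Refer record; the function name and the ESTAB values are illustrative.

```python
def build_reference_table(csect_name, refer_symbols, estab):
    """Look up each external symbol once and store its value by reference number."""
    # Index 0 is unused so that reference number 01 maps to index 1.
    table = [None, estab[csect_name]]
    table.extend(estab[symbol] for symbol in refer_symbols)
    return table


estab = {"PROGA": 0x4000, "LISTB": 0x40C3, "ENDB": 0x40D3}
table = build_reference_table("PROGA", ["LISTB", "ENDB"], estab)
print(f"{table[2]:04X}")  # value for reference number 02 (LISTB): 40C3
```

Each Modification record then costs one array access instead of a search of ESTAB.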
116. Machine-Independent Loader Features
• Using program Libraries
– Assembled or compiled versions of the subroutines
(object programs) in organized structure
– Allow programmer to use subroutines from one or
more libraries as part of the programming language
– Library subroutines are automatically fetched, linked
with the main program and loaded.
– Standard system library (automatically incorporated)
• I/O library, math library, graphics libraries, etc.
– Other libraries (specified by parameters to the loader)
• C library, Java library (JDK), etc.
117. Machine-Independent Loader Features
• program Libraries – organized collection of object
programs
[Figure: source programs (RDREC, WRREC, ERRHANDLER, …) are assembled or compiled into object programs, which are then organized into program libraries, each with an index.]
118. Machine-Independent Loader Features
• Automatic library search
– Keep track of the external symbols that are referred to
– In Pass 1, enter symbols from each Refer record into ESTAB,
marked undefined
– When the definition is encountered, the address assigned to the
symbol is filled in to complete the entry
– At the end of Pass 1, the symbols in ESTAB that remain undefined
represent unresolved external references
– The loader searches the library or libraries specified for routines
that contain the definition of these symbols
– The loader processes the library subroutines it finds, which may
contain further external references
– Repeat the library search process until all the external references
are resolved
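The search loop can be sketched as follows. The library is modelled as a dictionary from routine name to the external symbols that routine refers to; this layout, the function name, and the sample library are assumptions for illustration.

```python
def library_search(defined, referenced, library):
    """Return (routines loaded from the library, symbols still unresolved)."""
    defined = set(defined)
    pending = set(referenced) - defined      # undefined entries after Pass 1
    loaded, missing = [], set()
    while pending:
        symbol = min(pending)                # fixed order keeps runs repeatable
        pending.discard(symbol)
        routine = library.get(symbol)
        if routine is None:
            missing.add(symbol)              # not in any library searched
            continue
        loaded.append(symbol)
        defined.add(symbol)
        # A library routine may itself contain further external references.
        pending |= set(routine["refers"]) - defined - missing
    return loaded, sorted(missing)


library = {
    "RDREC": {"refers": ["ERRHANDLER"]},
    "WRREC": {"refers": []},
    "ERRHANDLER": {"refers": ["WRREC"]},
}
print(library_search({"COPY"}, {"RDREC", "WRREC", "PLOT"}, library))
```

A routine already supplied by the programmer appears in `defined`, so it is never fetched from the library.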
119. Machine-Independent Loader Features
• Static Libraries
– Functions from static libraries are linked/loaded before execution
time
[Figure: the linking loader uses the library index to copy RDREC, WRREC, and ERRHANDLER into the COPY program before execution, so that each JSUB refers to a routine inside the loaded program.]
120. Machine-Independent Loader Features
• Overriding
– Programmer supplies his or her own routines instead
of library routines by using the same routine names
– Programmer defined routines are included as input to
the loader
– By the end of Pass1, ESTAB already contains a
complete entry for each of the programmer defined
routines.
– Library search for those routines is avoided.
121. Machine-Independent Loader Features
• Library directory
– It is not efficient to search libraries by scanning the Define
records for all of the object programs on the library
– Perform library search on the directory, which is constructed as
part of the library
– Directory gives the name of each routine and a pointer to its
address within the file
– If a subroutine is to be callable by more than one name, both
names are entered into the directory, but only one copy of the
object program is stored
– Directory for commonly used libraries may be kept in memory
permanently
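A library directory can be sketched as a mapping from routine name to the position of its object program inside the library file. The function names, the in-memory file, and the sample routines are illustrative assumptions.

```python
import io


def build_library(routines):
    """Pack object programs into one file image and build its directory.

    routines is a list of (names, object_bytes); every name in names
    becomes a directory entry pointing at the same stored copy.
    """
    data = io.BytesIO()
    directory = {}
    for names, obj in routines:
        offset = data.tell()
        data.write(obj)
        for name in names:
            directory[name] = (offset, len(obj))
    return directory, data.getvalue()


def fetch(directory, library_bytes, name):
    """Locate a routine through the directory without scanning the file."""
    offset, length = directory[name]
    return library_bytes[offset:offset + length]


directory, blob = build_library([
    (("READ", "RDREC"), b"\x01\x02"),   # one routine, two names
    (("WRREC",), b"\x03"),
])
print(directory["READ"] == directory["RDREC"], len(blob))  # True 3
```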
122. Machine-Independent Loader Features
• Loader options and commands
– Selection of alternative sources of input
INCLUDE program-name(library-name)
– Delete external symbols or entire control sections
DELETE symbol-name or DELETE csect-name
– Change external symbol name1 to name2 whenever it appears
in the object program
CHANGE name1,name2
– Specify alternative libraries to be searched before standard
system libraries
LIBRARY MYLIB
– Leave external references unresolved
NOCALL symbol-name
123. Machine-Independent Loader Features
• Other loader options
– Specify that no external references be resolved by
library search
– Specify the options of outputting a load map
– Specify the location at which execution is to begin
(overriding any information given in the object
programs)
– Control whether or not the loader should attempt to
execute the program if errors are detected during the
load
124. Machine-Independent Loader Features
• Loader options examples
– Use library routines READ and WRITE instead of programmer
defined routine RDREC and WRREC, without reassembling the
program
INCLUDE READ(UTLIB)
INCLUDE WRITE(UTLIB)
DELETE RDREC, WRREC
CHANGE RDREC, READ
CHANGE WRREC, WRITE
– It is known that in a particular execution, routines STDDEV,
PLOT, and CORREL will not be called. The following command
can instruct the loader to leave those external references
unresolved (avoid the overhead of loading and linking, save
memory space)
NOCALL STDDEV, PLOT, CORREL
125. Loader Design Options
• Linkage Editors (Fig.3.13)
– Produces a linked version of the program (load
module) before loading
– Performs relocation of all control sections relative to
the start of the linked program
– Simplify the loading process
• Only a simple relocating loader is needed
• Less modification is needed
• All the items that need to be modified have values that are
relative to the start of the linked program
• No ESTAB is required
• A single pass is enough
126. Loader Design Options
• Linkage Editors options and commands
– Perform relocation if the starting address of a program is known
– Replace subroutine in the linked version of a program
– Build packages of subroutines or other control sections that are
generally used together
– Allow external references to remain unresolved
127. Loader Design Options
• Dynamic Linking (dynamic loading, load on call)
– Postpones the linking function until execution time
– A subroutine is loaded and linked to the rest of the program
when it is first called
• Advantages of dynamic linking
– Allow several executing programs to share one copy of a
subroutine or library
– Rarely called subroutines don’t need to be loaded into memory
every time the program is run
– Allow modification to a subroutine without changing the
programs in which it is called
128. Machine-Independent Loader Features
• Dynamic libraries
– Functions from dynamic libraries are loaded and linked during
execution time
[Figure: with static linking, the static linking loader copies RDREC, WRREC, and ERRHANDLER into the COPY program; with dynamic linking, the dynamic linking loader fetches them from the indexed library when COPY calls them during execution.]
129. Loader Design Options
• Dynamic Linking process (Fig.3.14)
– A subroutine is called
– Send load-and-call service request to the OS
– OS examines its internal table
– If the subroutine is not loaded, then load it from the library
– OS passes control to the routine being called
– Subroutine completes its processing, and returns control to the
OS
– OS does some processing if necessary (release the memory or
not)
– OS returns control to the program that issued the request
– This is called execution time binding, which gives more
capabilities at a higher cost (OS intervention)
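The load-and-call sequence can be simulated with a small service object. This is only a model: routines are plain Python callables, "loading" means copying a reference into an internal table, and the class and routine names are hypothetical.

```python
class LoadOnCallService:
    """Simulates the OS service that loads a routine the first time it is called."""

    def __init__(self, library):
        self.library = library   # routine name -> callable (the "library")
        self.loaded = {}         # internal table of routines already in memory
        self.load_count = 0

    def load_and_call(self, name, *args):
        routine = self.loaded.get(name)
        if routine is None:
            # Not yet loaded: fetch it from the library and record it.
            routine = self.library[name]
            self.loaded[name] = routine
            self.load_count += 1
        # Pass control to the routine, then return its result to the caller.
        return routine(*args)


service = LoadOnCallService({"ERRHANDLER": lambda code: f"error {code} handled"})
print(service.load_and_call("ERRHANDLER", 7))
print(service.load_and_call("ERRHANDLER", 9))
print(service.load_count)  # 1: the second call reuses the loaded copy
```

A routine that is never called is never loaded, which is the memory saving dynamic linking offers.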
130. Loader Design Options
• Dynamic loading of libraries
– Entire library is loaded, used, and unloaded during execution
time under program control
[Figure: compared with the statically linked COPY program, the dynamically loaded version loads the entire library (RDREC, WRREC, …), gets the address of each function before calling it with JSUB, and closes the library when it is done.]
131. Loader Design Options
• Bootstrap loaders
– Solution 1
• An absolute loader program is permanently resident in ROM
• Start execution when some hardware signal occurs
• Executed directly in the ROM or copied to main memory and
executed there
• Load the OS or any other stand-alone programs
– Solution 2
• A bootstrap loader is added to the beginning of all object programs
that are to be loaded into an empty and idle system
• Built-in hardware function reads the first record of the loader into
memory at a fixed location
• If necessary, this record will read more records until the whole
bootstrap loader is loaded into the memory
• The bootstrap loader loads the absolute program that follows