SlideShare una empresa de Scribd logo
1 de 7
Descargar para leer sin conexión
Source vs. Object Code: A False Dichotomy
The notion of source and object code as disjoint classes of computer
code is a false dichotomy common among non-programmers. The general
public's understanding is that computer programs are written as
"source code" that is human readable and not immediately executable by
the machine. Source code is supposed to contain meaningful variable
names and helpful comments that are intended only to be read by
humans. A piece of software called a "compiler" must convert source
code to object code before the program described in the source code
can be executed. Object code cannot be read by people; it is a
sequence of bytes that encode specific machine instructions that will
be executed by the microprocessor when it runs (executes) the program.
The above statements are not exactly false; they are in fact the usual
way of explaining to laymen how a computer works. However, any
attempt to draw legal distinctions between "source" code and "object"
code, or between code that is executable and code that is not, will
immediately encounter difficulties, because these dichotomies are only
a convenient fiction. Computer scientists don't view things this way
at all. Here are some of the problems:
1. "Source" and "object" are not well-defined classes of code. They
are actually relative terms. Given a device for transforming programs
from one form to another, source code is what goes into the device,
and object code (or "target" code) is what comes out. The target code
of one device is frequently the source code of another. For example,
both early C++ compilers and the Kyoto Common Lisp compiler produced C
code as their output. C++ (or Common Lisp) was the source language; C
was the target language. This output was then run through a C
compiler to produce symbolic assembler code as output. This then had
to be run through yet another proram, the "assembler", to produce
binary machine code that could be directly executed by a processor.
In summary, programs typically go through a series of transformations
from higher level to lower level languages. The assembler language
code that a C compiler produces as "object" code is source code for
the assembler. Today, the GNU C compiler (known as gcc) will deliver
assembler language code instead of binary if the user requests it by
specifying the -S switch.
2. Even binary machine code is perfectly readable by humans. It was,
after all, designed by humans. It may be tedious to read, but this
can be helped somewhat by using a program called a disassembler to
translate the raw binary instructions back into symbolic form. For
example, the Pentium instruction "add 7 to register AL" is written
0000010000000111 in the machine's binary language; it is written in
symbolic form as "ADD AL,7". (Reference: Intel Architecture
Developer's Manual, 1997 edition, volume 2, page 3-17.) Converting
between these two forms is trivial.
The instant litigation seems to be a result of a person having read
and understood a piece of machine code. A document signed by 1) a
Norwegian minor, Jon Johansen, 2) an "anonymous German cracker" from
the group MoRE (Masters of Reverse Engineering), who is designated the
group's "top coder/disassembler", and 3) an individual referred to
only as "[dEZZY/DoD]", suggests that the original CSS "crack" was
performed by disassembling a software DVD player and rewriting the
descrambling algorithm in C. The "anonymous C source code" in my
Gallery of CSS Descramblers (http://www.cs.cmu.edu/~dst/DeCSS/Gallery)
may be the fruit of that effort.
3. Human-readable programs do not have to be compiled in order to be
executed. Compilation merely transforms the program into a form that
can be executed more efficiently. But a piece of software called an
"interpreter" can execute source code directly, without compiling it.
Programs normally run more slowly when they are being executed by an
interpreter than when they are first compiled into machine
instructions that the processor can execute directly. But they do
run. And interpreters have some advantages: they are easier to write
than compilers, and they are better suited for debugging purposes.
An interpreter for the C programming language, called EIC, is
available for free on the web at http://www.anarchos.com/eic/. There
are other C interpreters as well.
Some languages, such as MATLAB, are normally executed in interpreted
mode. Today, most laser printers have a computer inside running a
PostScript interpreter. When a document prepared in a text editor
such as Microsoft Word is to be printed, it is first converted to a
Postscript program. The Postscript interpreter inside the printer
then executes the program to construct the page image that is sent to
the printing engine. Postscript programs are easily human readable;
they do not even require a disassembler. Yet they are executable.
4. Binary "machine code" isn't necessarily directly executable by the
computer's microprocessor. Languages such as Java and PERL compile to
what is called 'byte code", or "virtual machine code". This is binary
code for an idealized, imaginary processor that does not correspond to
any particular commercial processor architecture, such as the the
Pentium, SPARC, or PowerPC architectures. The virtual machine code
must then be executed by a piece of software called a "byte code
interpreter" that simulates the imaginary processor. The advantage of
this approach is that it allows one to quickly produce Java or Perl
implementations for new computer architectures, because the same
compiler can be used. Only a new byte code interpreter is required.
Byte code interpreters are easier to write than compilers.
Another approach taken by some implementations is to compile a piece
of Java byte code into native machine instructions (e.g., Pentium
instructions, SPARC instructions, or PowerPC instructions) when that
piece begins to execute. In this scheme, the Java byte code becomes
the "source code" and the native mode instructions are the target
code.
The new Crusoe processor from Transmeta treats Pentium code as virtual
machine code. Although it appears to execute Pentium code, internally
the chip translates the code into another machine language that is
considerably more complex, in order to take better advantage of
pipelining and other optimization techniques. Hence, what the layman
would call object code is actually the source code for the Crusoe's
transformation process.
5. "Binary executable" code sometimes isn't. Consider the DECSS.EXE
file that is the subject of the instant litigation. This file
contains a program expressed as machine instructions directly
executable by a Pentium processor. It also includes "system calls",
specific to the Microsoft Windows operating system, to perform
input/output operations. If this file were downloaded to a Sun SPARC
workstation running Unix, or a PowerPC running the Macintosh operating
system, it would not be directly executable. But all is not lost.
The SPARC or PowerPC owner could construct a Pentium emulator program
to simulate the operation of the Pentium processor. Such programs are
commonplace; in fact they are a necessary step in the development of
any new processor architecture. He or she would also need to build a
small portion of a Windows emulator in order to handle the Windows
system calls. At that point, DECSS.EXE could be executed by the
emulator program. An alternative approach would be to build a
translator program to convert Pentium instructions to SPARC or PowerPC
native instructions. Again, this is a well-known technique in
computer science and not technically difficult. DECSS.EXE would then
be the input, or source code, for the emulator or translator program.
6. Executable binary code has much of the same expressive content as
code written in a symbolic language, be it assembly language or a
higher level language such as C or Lisp. Appendix A shows a small C
program for computing 5 factorial (the product of the integers from 1
to 5). Appendix B shows the assembler language output produced by gcc
version 2.95.3 on a Sun UltraSparc 170 running the Solaris operating
system. Appendix C is a dump of the binary output file produced by
the assembler. (The values are actually displayed in hexadecimal,
which is a more compact form of binary.) Appendix D is the result of
disassembling the binary file, going back to symbolic assembler
language code. Note that line number 20 in Appendix D contains an
integer compare instruction, "cmp %o0, 5". The equivalent hexadecimal
form is also shown: it is 80a22005. Appendix C shows this same
sequence in the binary file, mid-way through the line labeled 0000220.
The same instruction, "cmp %o0, 5", also appears in Appendix B, two
lines after the label .LL3. And the equivalent in the C code of
Appendix A is the sequence "i<6". The fact that the constant is 6 in
the C code and 5 in the assembly language code is the sort of thing
one can learn only by looking at the assembly language code, or the
binary file that results from it. (The reason for the difference, 5
vs. 6, has to do with the particular strategy used by the gcc compiler
to implement the FOR loop.)
The lessons that should be drawn from the above are:
1) All computer code is human readable. Some forms are simply more
convenient to read than others.
2. All computer code is expressive. Many of the ideas expressed in C
code are also expressed in the assembly language code that results
from compiling that C code, and again in the binary machine language
that is the output of the assembler. Some content may be lost, e.g.,
source code comments are typically not preserved in object code,
although variable names may be. But some ideas that are only implicit
in the source code may be made more apparent in the object code, such
as how a particular sequence of actions should be best expressed in
terms of processor operations in order to obtain maximum performance
from the machine.
3) All computer code is executable. In some instances it may be
advantageous to transform the code into another form first, but
transformation is by no means mandatory. An interpreter can be
employed instead. Interpreters are in common use in computer systems.
4) "Source" and "object" are relative terms, not absolute categories.
5) The file DECSS.EXE is a particular expression of an algorithm for
converting video files from one format to another. It expresses the
same algorithm as the C code from which it was compiled. DECSS.EXE is
coded in a binary language that is more tedious to read than the C
code, but more efficiently executable by a Pentium processor. These
are differences in degree only. If C code is protected speech because
of its expressive content --- and one can argue that a computer
program is nothing but expressive content -- then code written in a
binary machine language that expresses the same algorithm should not
be regarded any differently.
================================================================
APPENDIX A: C source file fact.c
#include <stdio.h>
void main(int argc, char *argv[]) {
int i, result;
result = 1;
for (i = 1; i<6; i++) {
result = result * i;
}
printf ("Result is: %d.n",result);
}
================================================================
APPENDIX B: assembler language file fact.s (produced by: gcc -S fact.c)
.file "fact.c"
gcc2_compiled.:
.global .umul
.section ".rodata"
.align 8
.LLC0:
.asciz "Result is: %d.n"
.section ".text"
.align 4
.global main
.type main,#function
.proc 020
main:
!#PROLOGUE# 0
save %sp, -120, %sp
!#PROLOGUE# 1
st %i0, [%fp+68]
st %i1, [%fp+72]
mov 1, %o0
st %o0, [%fp-24]
mov 1, %o0
st %o0, [%fp-20]
.LL3:
ld [%fp-20], %o0
cmp %o0, 5
ble .LL6
nop
b .LL4
nop
.LL6:
ld [%fp-24], %o0
ld [%fp-20], %o1
call .umul, 0
nop
st %o0, [%fp-24]
.LL5:
ld [%fp-20], %o0
add %o0, 1, %o1
st %o1, [%fp-20]
b .LL3
nop
.LL4:
sethi %hi(.LLC0), %o1
or %o1, %lo(.LLC0), %o0
ld [%fp-24], %o1
call printf, 0
nop
.LL2:
ret
restore
.LLfe1:
.size main,.LLfe1-main
.ident "GCC: (GNU) 2.95.2 19991024 (release)"
================================================================
APPENDIX C: binary file fact.o (produced by: gcc -c fact.c;
dumped by: od -x fact.o)
0000000 7f45 4c46 0102 0100 0000 0000 0000 0000
0000020 0001 0002 0000 0001 0000 0000 0000 0000
0000040 0000 0234 0000 0000 0034 0000 0000 0028
0000060 0008 0001 002e 7368 7374 7274 6162 002e
0000100 7465 7874 002e 726f 6461 7461 002e 7379
0000120 6d74 6162 002e 7374 7274 6162 002e 7265
0000140 6c61 2e74 6578 7400 2e63 6f6d 6d65 6e74
0000160 0000 0000 9de3 bf88 f027 a044 f227 a048
0000200 9010 2001 d027 bfe8 9010 2001 d027 bfec
0000220 d007 bfec 80a2 2005 0480 0004 0100 0000
0000240 1080 000c 0100 0000 d007 bfe8 d207 bfec
0000260 4000 0000 0100 0000 d027 bfe8 d007 bfec
0000300 9202 2001 d227 bfec 10bf fff2 0100 0000
0000320 1300 0000 9012 6000 d207 bfe8 4000 0000
0000340 0100 0000 81c7 e008 81e8 0000 0000 0000
0000360 5265 7375 6c74 2069 733a 2020 2564 2e0a
0000400 0000 0000 0000 0001 0000 0000 0000 0000
0000420 0400 fff1 0000 0001 0000 0000 0000 0000
0000440 0400 fff1 0000 0000 0000 0000 0000 0000
0000460 0300 0003 0000 0008 0000 0000 0000 0000
0000500 0000 0002 0000 0000 0000 0000 0000 0000
0000520 0300 0002 0000 0017 0000 0000 0000 0000
0000540 1000 0000 0000 001d 0000 0000 0000 0000
0000560 1000 0000 0000 0024 0000 0000 0000 0078
0000600 1200 0002 0066 6163 742e 6300 6763 6332
0000620 5f63 6f6d 7069 6c65 642e 002e 756d 756c
0000640 0070 7269 6e74 6600 6d61 696e 0000 0000
0000660 0000 003c 0000 0507 0000 0000 0000 005c
0000700 0000 0209 0000 0000 0000 0060 0000 020c
0000720 0000 0000 0000 0068 0000 0607 0000 0000
0000740 0061 733a 2057 6f72 6b53 686f 7020 436f
0000760 6d70 696c 6572 7320 342e 3220 6465 7620
0001000 3133 204d 6179 2031 3939 360a 0047 4343
0001020 3a20 2847 4e55 2920 322e 3935 2e32 2031
0001040 3939 3931 3032 3420 2872 656c 6561 7365
0001060 2900 0000 0000 0000 0000 0000 0000 0000
0001100 0000 0000 0000 0000 0000 0000 0000 0000
0001120 0000 0000 0000 0000 0000 0000 0000 0001
0001140 0000 0003 0000 0000 0000 0000 0000 0034
0001160 0000 003d 0000 0000 0000 0000 0000 0001
0001200 0000 0000 0000 000b 0000 0001 0000 0006
0001220 0000 0000 0000 0074 0000 0078 0000 0000
0001240 0000 0000 0000 0004 0000 0000 0000 0011
0001260 0000 0001 0000 0002 0000 0000 0000 00f0
0001300 0000 0011 0000 0000 0000 0000 0000 0008
0001320 0000 0000 0000 0019 0000 0002 0000 0002
0001340 0000 0000 0000 0104 0000 0080 0000 0005
0001360 0000 0005 0000 0004 0000 0010 0000 0021
0001400 0000 0003 0000 0002 0000 0000 0000 0184
0001420 0000 0029 0000 0000 0000 0000 0000 0001
0001440 0000 0000 0000 0029 0000 0004 0000 0002
0001460 0000 0000 0000 01b0 0000 0030 0000 0004
0001500 0000 0002 0000 0004 0000 000c 0000 0034
0001520 0000 0001 0000 0000 0000 0000 0000 01e0
0001540 0000 0052 0000 0000 0000 0000 0000 0001
0001560 0000 0000
0001564
================================================================
APPENDIX D: disassembly of fact.o (produced by: dis fact.o)
**** DISASSEMBLER ****
section .text
main()
0: 9d e3 bf 88 save %sp, -120, %sp
4: f0 27 a0 44 st %i0, [%fp + 68]
8: f2 27 a0 48 st %i1, [%fp + 72]
c: 90 10 20 01 mov 1, %o0
10: d0 27 bf e8 st %o0, [%fp - 24]
14: 90 10 20 01 mov 1, %o0
18: d0 27 bf ec st %o0, [%fp - 20]
1c: d0 07 bf ec ld [%fp - 20], %o0
20: 80 a2 20 05 cmp %o0, 5
24: 04 80 00 04 ble 0x34
28: 01 00 00 00 nop
2c: 10 80 00 0c ba 0x5c
30: 01 00 00 00 nop
34: d0 07 bf e8 ld [%fp - 24], %o0
38: d2 07 bf ec ld [%fp - 20], %o1
3c: 40 00 00 00 call 0x3c
40: 01 00 00 00 nop
44: d0 27 bf e8 st %o0, [%fp - 24]
48: d0 07 bf ec ld [%fp - 20], %o0
4c: 92 02 20 01 add %o0, 1, %o1
50: d2 27 bf ec st %o1, [%fp - 20]
54: 10 bf ff f2 ba 0x1c
58: 01 00 00 00 nop
5c: 13 00 00 00 sethi %hi(gcc2_compiled.), %o1
60: 90 12 60 00 or %o1, gcc2_compiled., %o0
! gcc2_compiled.
64: d2 07 bf e8 ld [%fp - 24], %o1
68: 40 00 00 00 call 0x68
6c: 01 00 00 00 nop
70: 81 c7 e0 08 ret
74: 81 e8 00 00 restore

Más contenido relacionado

La actualidad más candente

A Comparison of .NET Framework vs. Java Virtual Machine
A Comparison of .NET Framework vs. Java Virtual MachineA Comparison of .NET Framework vs. Java Virtual Machine
A Comparison of .NET Framework vs. Java Virtual MachineAbdelrahman Hosny
 
Java vs .net
Java vs .netJava vs .net
Java vs .netTech_MX
 
WEB PROGRAMMING UNIT V BY BHAVSINGH MALOTH
WEB PROGRAMMING UNIT V BY BHAVSINGH MALOTHWEB PROGRAMMING UNIT V BY BHAVSINGH MALOTH
WEB PROGRAMMING UNIT V BY BHAVSINGH MALOTHBhavsingh Maloth
 
Emscripten - compile your C/C++ to JavaScript
Emscripten - compile your C/C++ to JavaScriptEmscripten - compile your C/C++ to JavaScript
Emscripten - compile your C/C++ to JavaScript穎睿 梁
 
3. Introduction to C language ||Learn C Programming Complete.
3. Introduction to C language ||Learn C Programming Complete.3. Introduction to C language ||Learn C Programming Complete.
3. Introduction to C language ||Learn C Programming Complete.Fiaz Hussain
 
List of programming_languages_by_type
List of programming_languages_by_typeList of programming_languages_by_type
List of programming_languages_by_typePhoenix
 
Introduction to Level Zero API for Heterogeneous Programming : NOTES
Introduction to Level Zero API for Heterogeneous Programming : NOTESIntroduction to Level Zero API for Heterogeneous Programming : NOTES
Introduction to Level Zero API for Heterogeneous Programming : NOTESSubhajit Sahu
 
C plus plus for hackers it security
C plus plus for hackers it securityC plus plus for hackers it security
C plus plus for hackers it securityCESAR A. RUIZ C
 
5.Hello World program Explanation. ||C Programming tutorial.
5.Hello World program Explanation. ||C Programming tutorial.5.Hello World program Explanation. ||C Programming tutorial.
5.Hello World program Explanation. ||C Programming tutorial.Fiaz Hussain
 
C# programming language
C# programming languageC# programming language
C# programming languageswarnapatil
 
Types Of Coding Languages: A Complete Guide To Master Programming
Types Of Coding Languages: A Complete Guide To Master ProgrammingTypes Of Coding Languages: A Complete Guide To Master Programming
Types Of Coding Languages: A Complete Guide To Master Programmingcalltutors
 

La actualidad más candente (20)

A Comparison of .NET Framework vs. Java Virtual Machine
A Comparison of .NET Framework vs. Java Virtual MachineA Comparison of .NET Framework vs. Java Virtual Machine
A Comparison of .NET Framework vs. Java Virtual Machine
 
Oops index
Oops indexOops index
Oops index
 
Java vs .net
Java vs .netJava vs .net
Java vs .net
 
Learn C Language
Learn C LanguageLearn C Language
Learn C Language
 
WEB PROGRAMMING UNIT V BY BHAVSINGH MALOTH
WEB PROGRAMMING UNIT V BY BHAVSINGH MALOTHWEB PROGRAMMING UNIT V BY BHAVSINGH MALOTH
WEB PROGRAMMING UNIT V BY BHAVSINGH MALOTH
 
Emscripten - compile your C/C++ to JavaScript
Emscripten - compile your C/C++ to JavaScriptEmscripten - compile your C/C++ to JavaScript
Emscripten - compile your C/C++ to JavaScript
 
Computer programming languages
Computer programming languagesComputer programming languages
Computer programming languages
 
Python Introduction
Python IntroductionPython Introduction
Python Introduction
 
3. Introduction to C language ||Learn C Programming Complete.
3. Introduction to C language ||Learn C Programming Complete.3. Introduction to C language ||Learn C Programming Complete.
3. Introduction to C language ||Learn C Programming Complete.
 
List of programming_languages_by_type
List of programming_languages_by_typeList of programming_languages_by_type
List of programming_languages_by_type
 
How a Compiler Works ?
How a Compiler Works ?How a Compiler Works ?
How a Compiler Works ?
 
Introduction to Level Zero API for Heterogeneous Programming : NOTES
Introduction to Level Zero API for Heterogeneous Programming : NOTESIntroduction to Level Zero API for Heterogeneous Programming : NOTES
Introduction to Level Zero API for Heterogeneous Programming : NOTES
 
Srgoc dotnet_new
Srgoc dotnet_newSrgoc dotnet_new
Srgoc dotnet_new
 
C plus plus for hackers it security
C plus plus for hackers it securityC plus plus for hackers it security
C plus plus for hackers it security
 
Ctutor ashu
Ctutor ashuCtutor ashu
Ctutor ashu
 
5.Hello World program Explanation. ||C Programming tutorial.
5.Hello World program Explanation. ||C Programming tutorial.5.Hello World program Explanation. ||C Programming tutorial.
5.Hello World program Explanation. ||C Programming tutorial.
 
Introduction to C Programming
Introduction to C ProgrammingIntroduction to C Programming
Introduction to C Programming
 
C# programming language
C# programming languageC# programming language
C# programming language
 
Types Of Coding Languages: A Complete Guide To Master Programming
Types Of Coding Languages: A Complete Guide To Master ProgrammingTypes Of Coding Languages: A Complete Guide To Master Programming
Types Of Coding Languages: A Complete Guide To Master Programming
 
Programming
ProgrammingProgramming
Programming
 

Similar a Source vs object code

Reversing and Patching Machine Code
Reversing and Patching Machine CodeReversing and Patching Machine Code
Reversing and Patching Machine CodeTeodoro Cipresso
 
Introduction of c language
Introduction of c languageIntroduction of c language
Introduction of c languagefarishah
 
Applying Anti-Reversing Techniques to Machine Code
Applying Anti-Reversing Techniques to Machine CodeApplying Anti-Reversing Techniques to Machine Code
Applying Anti-Reversing Techniques to Machine CodeTeodoro Cipresso
 
Advance Android Application Development
Advance Android Application DevelopmentAdvance Android Application Development
Advance Android Application DevelopmentRamesh Prasad
 
Itroduction about java
Itroduction about javaItroduction about java
Itroduction about javasrmohan06
 
Chapter 2 Program language translation.pptx
Chapter 2 Program language translation.pptxChapter 2 Program language translation.pptx
Chapter 2 Program language translation.pptxdawod yimer
 
Introduction
IntroductionIntroduction
IntroductionKamran
 
Introduction-to-C-Part-1 (1).doc
Introduction-to-C-Part-1 (1).docIntroduction-to-C-Part-1 (1).doc
Introduction-to-C-Part-1 (1).docMayurWagh46
 

Similar a Source vs object code (20)

First session quiz
First session quizFirst session quiz
First session quiz
 
First session quiz
First session quizFirst session quiz
First session quiz
 
Reversing and Patching Machine Code
Reversing and Patching Machine CodeReversing and Patching Machine Code
Reversing and Patching Machine Code
 
C programming first_session
C programming first_sessionC programming first_session
C programming first_session
 
C programming first_session
C programming first_sessionC programming first_session
C programming first_session
 
The basics of c programming
The basics of c programmingThe basics of c programming
The basics of c programming
 
Introduction to c language
Introduction to c language Introduction to c language
Introduction to c language
 
Introduction of c language
Introduction of c languageIntroduction of c language
Introduction of c language
 
Chapter1.pdf
Chapter1.pdfChapter1.pdf
Chapter1.pdf
 
Embedded C.pptx
Embedded C.pptxEmbedded C.pptx
Embedded C.pptx
 
Applying Anti-Reversing Techniques to Machine Code
Applying Anti-Reversing Techniques to Machine CodeApplying Anti-Reversing Techniques to Machine Code
Applying Anti-Reversing Techniques to Machine Code
 
Advance Android Application Development
Advance Android Application DevelopmentAdvance Android Application Development
Advance Android Application Development
 
Itroduction about java
Itroduction about javaItroduction about java
Itroduction about java
 
Chapter 2 Program language translation.pptx
Chapter 2 Program language translation.pptxChapter 2 Program language translation.pptx
Chapter 2 Program language translation.pptx
 
Introduction
IntroductionIntroduction
Introduction
 
C++Basics2022.pptx
C++Basics2022.pptxC++Basics2022.pptx
C++Basics2022.pptx
 
The compilation process
The compilation processThe compilation process
The compilation process
 
Introduction-to-C-Part-1 (1).doc
Introduction-to-C-Part-1 (1).docIntroduction-to-C-Part-1 (1).doc
Introduction-to-C-Part-1 (1).doc
 
C# tutorial
C# tutorialC# tutorial
C# tutorial
 
C intro
C introC intro
C intro
 

Último

Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...Sean Meyn
 
Design Analysis of Alogorithm 1 ppt 2024.pptx
Design Analysis of Alogorithm 1 ppt 2024.pptxDesign Analysis of Alogorithm 1 ppt 2024.pptx
Design Analysis of Alogorithm 1 ppt 2024.pptxrajesshs31r
 
SATELITE COMMUNICATION UNIT 1 CEC352 REGULATION 2021 PPT BASICS OF SATELITE ....
SATELITE COMMUNICATION UNIT 1 CEC352 REGULATION 2021 PPT BASICS OF SATELITE ....SATELITE COMMUNICATION UNIT 1 CEC352 REGULATION 2021 PPT BASICS OF SATELITE ....
SATELITE COMMUNICATION UNIT 1 CEC352 REGULATION 2021 PPT BASICS OF SATELITE ....santhyamuthu1
 
A Seminar on Electric Vehicle Software Simulation
A Seminar on Electric Vehicle Software SimulationA Seminar on Electric Vehicle Software Simulation
A Seminar on Electric Vehicle Software SimulationMohsinKhanA
 
Power System electrical and electronics .pptx
Power System electrical and electronics .pptxPower System electrical and electronics .pptx
Power System electrical and electronics .pptxMUKULKUMAR210
 
cloud computing notes for anna university syllabus
cloud computing notes for anna university syllabuscloud computing notes for anna university syllabus
cloud computing notes for anna university syllabusViolet Violet
 
News web APP using NEWS API for web platform to enhancing user experience
News web APP using NEWS API for web platform to enhancing user experienceNews web APP using NEWS API for web platform to enhancing user experience
News web APP using NEWS API for web platform to enhancing user experienceAkashJha84
 
دليل تجارب الاسفلت المختبرية - Asphalt Experiments Guide Laboratory
دليل تجارب الاسفلت المختبرية - Asphalt Experiments Guide Laboratoryدليل تجارب الاسفلت المختبرية - Asphalt Experiments Guide Laboratory
دليل تجارب الاسفلت المختبرية - Asphalt Experiments Guide LaboratoryBahzad5
 
solar wireless electric vechicle charging system
solar wireless electric vechicle charging systemsolar wireless electric vechicle charging system
solar wireless electric vechicle charging systemgokuldongala
 
Guardians and Glitches: Navigating the Duality of Gen AI in AppSec
Guardians and Glitches: Navigating the Duality of Gen AI in AppSecGuardians and Glitches: Navigating the Duality of Gen AI in AppSec
Guardians and Glitches: Navigating the Duality of Gen AI in AppSecTrupti Shiralkar, CISSP
 
Tachyon 100G PCB Performance Attributes and Applications
Tachyon 100G PCB Performance Attributes and ApplicationsTachyon 100G PCB Performance Attributes and Applications
Tachyon 100G PCB Performance Attributes and ApplicationsEpec Engineered Technologies
 
Transforming Process Safety Management: Challenges, Benefits, and Transition ...
Transforming Process Safety Management: Challenges, Benefits, and Transition ...Transforming Process Safety Management: Challenges, Benefits, and Transition ...
Transforming Process Safety Management: Challenges, Benefits, and Transition ...soginsider
 
Carbohydrates principles of biochemistry
Carbohydrates principles of biochemistryCarbohydrates principles of biochemistry
Carbohydrates principles of biochemistryKomakeTature
 
ChatGPT-and-Generative-AI-Landscape Working of generative ai search
ChatGPT-and-Generative-AI-Landscape Working of generative ai searchChatGPT-and-Generative-AI-Landscape Working of generative ai search
ChatGPT-and-Generative-AI-Landscape Working of generative ai searchrohitcse52
 
sdfsadopkjpiosufoiasdoifjasldkjfl a asldkjflaskdjflkjsdsdf
sdfsadopkjpiosufoiasdoifjasldkjfl a asldkjflaskdjflkjsdsdfsdfsadopkjpiosufoiasdoifjasldkjfl a asldkjflaskdjflkjsdsdf
sdfsadopkjpiosufoiasdoifjasldkjfl a asldkjflaskdjflkjsdsdfJulia Kaye
 
Popular-NO1 Kala Jadu Expert Specialist In Germany Kala Jadu Expert Specialis...
Popular-NO1 Kala Jadu Expert Specialist In Germany Kala Jadu Expert Specialis...Popular-NO1 Kala Jadu Expert Specialist In Germany Kala Jadu Expert Specialis...
Popular-NO1 Kala Jadu Expert Specialist In Germany Kala Jadu Expert Specialis...Amil baba
 
The relationship between iot and communication technology
The relationship between iot and communication technologyThe relationship between iot and communication technology
The relationship between iot and communication technologyabdulkadirmukarram03
 

Último (20)

Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
 
Lecture 4 .pdf
Lecture 4                              .pdfLecture 4                              .pdf
Lecture 4 .pdf
 
Design Analysis of Alogorithm 1 ppt 2024.pptx
Design Analysis of Alogorithm 1 ppt 2024.pptxDesign Analysis of Alogorithm 1 ppt 2024.pptx
Design Analysis of Alogorithm 1 ppt 2024.pptx
 
Présentation IIRB 2024 Marine Cordonnier.pdf
Présentation IIRB 2024 Marine Cordonnier.pdfPrésentation IIRB 2024 Marine Cordonnier.pdf
Présentation IIRB 2024 Marine Cordonnier.pdf
 
SATELITE COMMUNICATION UNIT 1 CEC352 REGULATION 2021 PPT BASICS OF SATELITE ....
SATELITE COMMUNICATION UNIT 1 CEC352 REGULATION 2021 PPT BASICS OF SATELITE ....SATELITE COMMUNICATION UNIT 1 CEC352 REGULATION 2021 PPT BASICS OF SATELITE ....
SATELITE COMMUNICATION UNIT 1 CEC352 REGULATION 2021 PPT BASICS OF SATELITE ....
 
A Seminar on Electric Vehicle Software Simulation
A Seminar on Electric Vehicle Software SimulationA Seminar on Electric Vehicle Software Simulation
A Seminar on Electric Vehicle Software Simulation
 
Power System electrical and electronics .pptx
Power System electrical and electronics .pptxPower System electrical and electronics .pptx
Power System electrical and electronics .pptx
 
cloud computing notes for anna university syllabus
cloud computing notes for anna university syllabuscloud computing notes for anna university syllabus
cloud computing notes for anna university syllabus
 
News web APP using NEWS API for web platform to enhancing user experience
News web APP using NEWS API for web platform to enhancing user experienceNews web APP using NEWS API for web platform to enhancing user experience
News web APP using NEWS API for web platform to enhancing user experience
 
دليل تجارب الاسفلت المختبرية - Asphalt Experiments Guide Laboratory
دليل تجارب الاسفلت المختبرية - Asphalt Experiments Guide Laboratoryدليل تجارب الاسفلت المختبرية - Asphalt Experiments Guide Laboratory
دليل تجارب الاسفلت المختبرية - Asphalt Experiments Guide Laboratory
 
solar wireless electric vechicle charging system
solar wireless electric vechicle charging systemsolar wireless electric vechicle charging system
solar wireless electric vechicle charging system
 
Litature Review: Research Paper work for Engineering
Litature Review: Research Paper work for EngineeringLitature Review: Research Paper work for Engineering
Litature Review: Research Paper work for Engineering
 
Guardians and Glitches: Navigating the Duality of Gen AI in AppSec
Guardians and Glitches: Navigating the Duality of Gen AI in AppSecGuardians and Glitches: Navigating the Duality of Gen AI in AppSec
Guardians and Glitches: Navigating the Duality of Gen AI in AppSec
 
Tachyon 100G PCB Performance Attributes and Applications
Tachyon 100G PCB Performance Attributes and ApplicationsTachyon 100G PCB Performance Attributes and Applications
Tachyon 100G PCB Performance Attributes and Applications
 
Transforming Process Safety Management: Challenges, Benefits, and Transition ...
Transforming Process Safety Management: Challenges, Benefits, and Transition ...Transforming Process Safety Management: Challenges, Benefits, and Transition ...
Transforming Process Safety Management: Challenges, Benefits, and Transition ...
 
Carbohydrates principles of biochemistry
Carbohydrates principles of biochemistryCarbohydrates principles of biochemistry
Carbohydrates principles of biochemistry
 
ChatGPT-and-Generative-AI-Landscape Working of generative ai search
ChatGPT-and-Generative-AI-Landscape Working of generative ai searchChatGPT-and-Generative-AI-Landscape Working of generative ai search
ChatGPT-and-Generative-AI-Landscape Working of generative ai search
 
sdfsadopkjpiosufoiasdoifjasldkjfl a asldkjflaskdjflkjsdsdf
sdfsadopkjpiosufoiasdoifjasldkjfl a asldkjflaskdjflkjsdsdfsdfsadopkjpiosufoiasdoifjasldkjfl a asldkjflaskdjflkjsdsdf
sdfsadopkjpiosufoiasdoifjasldkjfl a asldkjflaskdjflkjsdsdf
 
Popular-NO1 Kala Jadu Expert Specialist In Germany Kala Jadu Expert Specialis...
Popular-NO1 Kala Jadu Expert Specialist In Germany Kala Jadu Expert Specialis...Popular-NO1 Kala Jadu Expert Specialist In Germany Kala Jadu Expert Specialis...
Popular-NO1 Kala Jadu Expert Specialist In Germany Kala Jadu Expert Specialis...
 
The relationship between iot and communication technology
The relationship between iot and communication technologyThe relationship between iot and communication technology
The relationship between iot and communication technology
 

Source vs object code

  • 1. Source vs. Object Code: A False Dichotomy The notion of source and object code as disjoint classes of computer code is a false dichotomy common among non-programmers. The general public's understanding is that computer programs are written as "source code" that is human readable and not immediately executable by the machine. Source code is supposed to contain meaningful variable names and helpful comments that are intended only to be read by humans. A piece of software called a "compiler" must convert source code to object code before the program described in the source code can be executed. Object code cannot be read by people; it is a sequence of bytes that encode specific machine instructions that will be executed by the microprocessor when it runs (executes) the program. The above statements are not exactly false; they are in fact the usual way of explaining to laymen how a computer works. However, any attempt to draw legal distinctions between "source" code and "object" code, or between code that is executable and code that is not, will immediately encounter difficulties, because these dichotomies are only a convenient fiction. Computer scientists don't view things this way at all. Here are some of the problems: 1. "Source" and "object" are not well-defined classes of code. They are actually relative terms. Given a device for transforming programs from one form to another, source code is what goes into the device, and object code (or "target" code) is what comes out. The target code of one device is frequently the source code of another. For example, both early C++ compilers and the Kyoto Common Lisp compiler produced C code as their output. C++ (or Common Lisp) was the source language; C was the target language. This output was then run through a C compiler to produce symbolic assembler code as output. This then had to be run through yet another proram, the "assembler", to produce binary machine code that could be directly executed by a processor. In summary, programs typically go through a series of transformations from higher level to lower level languages. The assembler language code that a C compiler produces as "object" code is source code for the assembler. Today, the GNU C compiler (known as gcc) will deliver assembler language code instead of binary if the user requests it by specifying the -S switch. 2. Even binary machine code is perfectly readable by humans. It was, after all, designed by humans. It may be tedious to read, but this can be helped somewhat by using a program called a disassembler to translate the raw binary instructions back into symbolic form. For example, the Pentium instruction "add 7 to register AL" is written 0000010000000111 in the machine's binary language; it is written in symbolic form as "ADD AL,7". (Reference: Intel Architecture Developer's Manual, 1997 edition, volume 2, page 3-17.) Converting between these two forms is trivial. The instant litigation seems to be a result of a person having read and understood a piece of machine code. A document signed by 1) a Norwegian minor, Jon Johansen, 2) an "anonymous German cracker" from the group MoRE (Masters of Reverse Engineering), who is designated the
  • 2. group's "top coder/disassembler", and 3) an individual referred to only as "[dEZZY/DoD]", suggests that the original CSS "crack" was performed by disassembling a software DVD player and rewriting the descrambling algorithm in C. The "anonymous C source code" in my Gallery of CSS Descramblers (http://www.cs.cmu.edu/~dst/DeCSS/Gallery) may be the fruit of that effort. 3. Human-readable programs do not have to be compiled in order to be executed. Compilation merely transforms the program into a form that can be executed more efficiently. But a piece of software called an "interpreter" can execute source code directly, without compiling it. Programs normally run more slowly when they are being executed by an interpreter than when they are first compiled into machine instructions that the processor can execute directly. But they do run. And interpreters have some advantages: they are easier to write than compilers, and they are better suited for debugging purposes. An interpreter for the C programming language, called EIC, is available for free on the web at http://www.anarchos.com/eic/. There are other C interpreters as well. Some languages, such as MATLAB, are normally executed in interpreted mode. Today, most laser printers have a computer inside running a PostScript interpreter. When a document prepared in a text editor such as Microsoft Word is to be printed, it is first converted to a Postscript program. The Postscript interpreter inside the printer then executes the program to construct the page image that is sent to the printing engine. Postscript programs are easily human readable; they do not even require a disassembler. Yet they are executable. 4. Binary "machine code" isn't necessarily directly executable by the computer's microprocessor. Languages such as Java and PERL compile to what is called 'byte code", or "virtual machine code". This is binary code for an idealized, imaginary processor that does not correspond to any particular commercial processor architecture, such as the the Pentium, SPARC, or PowerPC architectures. The virtual machine code must then be executed by a piece of software called a "byte code interpreter" that simulates the imaginary processor. The advantage of this approach is that it allows one to quickly produce Java or Perl implementations for new computer architectures, because the same compiler can be used. Only a new byte code interpreter is required. Byte code interpreters are easier to write than compilers. Another approach taken by some implementations is to compile a piece of Java byte code into native machine instructions (e.g., Pentium instructions, SPARC instructions, or PowerPC instructions) when that piece begins to execute. In this scheme, the Java byte code becomes the "source code" and the native mode instructions are the target code. The new Crusoe processor from Transmeta treats Pentium code as virtual machine code. Although it appears to execute Pentium code, internally the chip translates the code into another machine language that is considerably more complex, in order to take better advantage of pipelining and other optimization techniques. Hence, what the layman would call object code is actually the source code for the Crusoe's transformation process.
  • 3. 5. "Binary executable" code sometimes isn't. Consider the DECSS.EXE file that is the subject of the instant litigation. This file contains a program expressed as machine instructions directly executable by a Pentium processor. It also includes "system calls", specific to the Microsoft Windows operating system, to perform input/output operations. If this file were downloaded to a Sun SPARC workstation running Unix, or a PowerPC running the Macintosh operating system, it would not be directly executable. But all is not lost. The SPARC or PowerPC owner could construct a Pentium emulator program to simulate the operation of the Pentium processor. Such programs are commonplace; in fact they are a necessary step in the development of any new processor architecture. He or she would also need to build a small portion of a Windows emulator in order to handle the Windows system calls. At that point, DECSS.EXE could be executed by the emulator program. An alternative approach would be to build a translator program to convert Pentium instructions to SPARC or PowerPC native instructions. Again, this is a well-known technique in computer science and not technically difficult. DECSS.EXE would then be the input, or source code, for the emulator or translator program. 6. Executable binary code has much of the same expressive content as code written in a symbolic language, be it assembly language or a higher level language such as C or Lisp. Appendix A shows a small C program for computing 5 factorial (the product of the integers from 1 to 5). Appendix B shows the assembler language output produced by gcc version 2.95.3 on a Sun UltraSparc 170 running the Solaris operating system. Appendix C is a dump of the binary output file produced by the assembler. (The values are actually displayed in hexadecimal, which is a more compact form of binary.) Appendix D is the result of disassembling the binary file, going back to symbolic assembler language code. Note that line number 20 in Appendix D contains an integer compare instruction, "cmp %o0, 5". The equivalent hexadecimal form is also shown: it is 80a22005. Appendix C shows this same sequence in the binary file, mid-way through the line labeled 0000220. The same instruction, "cmp %o0, 5", also appears in Appendix B, two lines after the label .LL3. And the equivalent in the C code of Appendix A is the sequence "i<6". The fact that the constant is 6 in the C code and 5 in the assembly language code is the sort of thing one can learn only by looking at the assembly language code, or the binary file that results from it. (The reason for the difference, 5 vs. 6, has to do with the particular strategy used by the gcc compiler to implement the FOR loop.) The lessons that should be drawn from the above are: 1) All computer code is human readable. Some forms are simply more convenient to read than others. 2. All computer code is expressive. Many of the ideas expressed in C code are also expressed in the assembly language code that results from compiling that C code, and again in the binary machine language that is the output of the assembler. Some content may be lost, e.g., source code comments are typically not preserved in object code, although variable names may be. But some ideas that are only implicit in the source code may be made more apparent in the object code, such
  • 4. as how a particular sequence of actions should be best expressed in terms of processor operations in order to obtain maximum performance from the machine. 3) All computer code is executable. In some instances it may be advantageous to transform the code into another form first, but transformation is by no means mandatory. An interpreter can be employed instead. Interpreters are in common use in computer systems. 4) "Source" and "object" are relative terms, not absolute categories. 5) The file DECSS.EXE is a particular expression of an algorithm for converting video files from one format to another. It expresses the same algorithm as the C code from which it was compiled. DECSS.EXE is coded in a binary language that is more tedious to read than the C code, but more efficiently executable by a Pentium processor. These are differences in degree only. If C code is protected speech because of its expressive content --- and one can argue that a computer program is nothing but expressive content -- then code written in a binary machine language that expresses the same algorithm should not be regarded any differently. ================================================================ APPENDIX A: C source file fact.c #include <stdio.h> void main(int argc, char *argv[]) { int i, result; result = 1; for (i = 1; i<6; i++) { result = result * i; } printf ("Result is: %d.n",result); } ================================================================ APPENDIX B: assembler language file fact.s (produced by: gcc -S fact.c) .file "fact.c" gcc2_compiled.: .global .umul .section ".rodata" .align 8 .LLC0: .asciz "Result is: %d.n" .section ".text" .align 4 .global main .type main,#function .proc 020 main: !#PROLOGUE# 0 save %sp, -120, %sp !#PROLOGUE# 1
  • 5. st %i0, [%fp+68] st %i1, [%fp+72] mov 1, %o0 st %o0, [%fp-24] mov 1, %o0 st %o0, [%fp-20] .LL3: ld [%fp-20], %o0 cmp %o0, 5 ble .LL6 nop b .LL4 nop .LL6: ld [%fp-24], %o0 ld [%fp-20], %o1 call .umul, 0 nop st %o0, [%fp-24] .LL5: ld [%fp-20], %o0 add %o0, 1, %o1 st %o1, [%fp-20] b .LL3 nop .LL4: sethi %hi(.LLC0), %o1 or %o1, %lo(.LLC0), %o0 ld [%fp-24], %o1 call printf, 0 nop .LL2: ret restore .LLfe1: .size main,.LLfe1-main .ident "GCC: (GNU) 2.95.2 19991024 (release)" ================================================================ APPENDIX C: binary file fact.o (produced by: gcc -c fact.c; dumped by: od -x fact.o) 0000000 7f45 4c46 0102 0100 0000 0000 0000 0000 0000020 0001 0002 0000 0001 0000 0000 0000 0000 0000040 0000 0234 0000 0000 0034 0000 0000 0028 0000060 0008 0001 002e 7368 7374 7274 6162 002e 0000100 7465 7874 002e 726f 6461 7461 002e 7379 0000120 6d74 6162 002e 7374 7274 6162 002e 7265 0000140 6c61 2e74 6578 7400 2e63 6f6d 6d65 6e74 0000160 0000 0000 9de3 bf88 f027 a044 f227 a048 0000200 9010 2001 d027 bfe8 9010 2001 d027 bfec 0000220 d007 bfec 80a2 2005 0480 0004 0100 0000 0000240 1080 000c 0100 0000 d007 bfe8 d207 bfec 0000260 4000 0000 0100 0000 d027 bfe8 d007 bfec 0000300 9202 2001 d227 bfec 10bf fff2 0100 0000 0000320 1300 0000 9012 6000 d207 bfe8 4000 0000
  • 6. 0000340 0100 0000 81c7 e008 81e8 0000 0000 0000 0000360 5265 7375 6c74 2069 733a 2020 2564 2e0a 0000400 0000 0000 0000 0001 0000 0000 0000 0000 0000420 0400 fff1 0000 0001 0000 0000 0000 0000 0000440 0400 fff1 0000 0000 0000 0000 0000 0000 0000460 0300 0003 0000 0008 0000 0000 0000 0000 0000500 0000 0002 0000 0000 0000 0000 0000 0000 0000520 0300 0002 0000 0017 0000 0000 0000 0000 0000540 1000 0000 0000 001d 0000 0000 0000 0000 0000560 1000 0000 0000 0024 0000 0000 0000 0078 0000600 1200 0002 0066 6163 742e 6300 6763 6332 0000620 5f63 6f6d 7069 6c65 642e 002e 756d 756c 0000640 0070 7269 6e74 6600 6d61 696e 0000 0000 0000660 0000 003c 0000 0507 0000 0000 0000 005c 0000700 0000 0209 0000 0000 0000 0060 0000 020c 0000720 0000 0000 0000 0068 0000 0607 0000 0000 0000740 0061 733a 2057 6f72 6b53 686f 7020 436f 0000760 6d70 696c 6572 7320 342e 3220 6465 7620 0001000 3133 204d 6179 2031 3939 360a 0047 4343 0001020 3a20 2847 4e55 2920 322e 3935 2e32 2031 0001040 3939 3931 3032 3420 2872 656c 6561 7365 0001060 2900 0000 0000 0000 0000 0000 0000 0000 0001100 0000 0000 0000 0000 0000 0000 0000 0000 0001120 0000 0000 0000 0000 0000 0000 0000 0001 0001140 0000 0003 0000 0000 0000 0000 0000 0034 0001160 0000 003d 0000 0000 0000 0000 0000 0001 0001200 0000 0000 0000 000b 0000 0001 0000 0006 0001220 0000 0000 0000 0074 0000 0078 0000 0000 0001240 0000 0000 0000 0004 0000 0000 0000 0011 0001260 0000 0001 0000 0002 0000 0000 0000 00f0 0001300 0000 0011 0000 0000 0000 0000 0000 0008 0001320 0000 0000 0000 0019 0000 0002 0000 0002 0001340 0000 0000 0000 0104 0000 0080 0000 0005 0001360 0000 0005 0000 0004 0000 0010 0000 0021 0001400 0000 0003 0000 0002 0000 0000 0000 0184 0001420 0000 0029 0000 0000 0000 0000 0000 0001 0001440 0000 0000 0000 0029 0000 0004 0000 0002 0001460 0000 0000 0000 01b0 0000 0030 0000 0004 0001500 0000 0002 0000 0004 0000 000c 0000 0034 0001520 0000 0001 0000 0000 0000 0000 0000 01e0 0001540 0000 0052 0000 0000 0000 0000 0000 0001 0001560 0000 0000 0001564 ================================================================ APPENDIX D: disassembly of fact.o (produced by: dis fact.o) **** DISASSEMBLER **** section .text main() 0: 9d e3 bf 88 save %sp, -120, %sp 4: f0 27 a0 44 st %i0, [%fp + 68] 8: f2 27 a0 48 st %i1, [%fp + 72] c: 90 10 20 01 mov 1, %o0
  • 7. 10: d0 27 bf e8 st %o0, [%fp - 24] 14: 90 10 20 01 mov 1, %o0 18: d0 27 bf ec st %o0, [%fp - 20] 1c: d0 07 bf ec ld [%fp - 20], %o0 20: 80 a2 20 05 cmp %o0, 5 24: 04 80 00 04 ble 0x34 28: 01 00 00 00 nop 2c: 10 80 00 0c ba 0x5c 30: 01 00 00 00 nop 34: d0 07 bf e8 ld [%fp - 24], %o0 38: d2 07 bf ec ld [%fp - 20], %o1 3c: 40 00 00 00 call 0x3c 40: 01 00 00 00 nop 44: d0 27 bf e8 st %o0, [%fp - 24] 48: d0 07 bf ec ld [%fp - 20], %o0 4c: 92 02 20 01 add %o0, 1, %o1 50: d2 27 bf ec st %o1, [%fp - 20] 54: 10 bf ff f2 ba 0x1c 58: 01 00 00 00 nop 5c: 13 00 00 00 sethi %hi(gcc2_compiled.), %o1 60: 90 12 60 00 or %o1, gcc2_compiled., %o0 ! gcc2_compiled. 64: d2 07 bf e8 ld [%fp - 24], %o1 68: 40 00 00 00 call 0x68 6c: 01 00 00 00 nop 70: 81 c7 e0 08 ret 74: 81 e8 00 00 restore