SlideShare una empresa de Scribd logo
1 de 105
Descargar para leer sin conexión
Program Synthesis in Reverse Engineering
Rolf Rolles
M¨obius Strip Reverse Engineering
http://www.msreverseengineering.com
No Such Conference Keynote Speech
November 19th, 2014
Program Synthesis in Reverse Engineering
Overview
Desired
Behavior
Program
Synthesizer
Program
Program synthesis is an academic discipline devoted to
creating programs automatically, given an expectation of how
the program should behave.
We apply and adapt existing academic work to automate
tasks in reverse engineering.
Program Synthesis
What it is Not
“Can I have a web browser?”
Program Synthesizer
“Sure, here you go!”
Program Synthesis
What it is Not
“Can I have a web browser?”
Program Synthesizer
“Sure, here you go!”
Program synthesis:
Requires precise behavioral specifications
Operates on small scales
Is primarily researched on loop-free programs
Program Synthesis
The Simplest Possible Program Synthesizer
Starting from a behavioral specification:
int abs(int x) =
x if x >= 0
-x otherwise
Program Synthesis
The Simplest Possible Program Synthesizer
Starting from a behavioral specification:
int abs(int x) =
x if x >= 0
-x otherwise
Create a harness for testing abs(x) exhaustively:
int main(int , char **) {
for(int x = 0; x != MAX_INT; ++x) {
int r = abs(x);
if((x >= 0 && r != x) || r != -x)
return -1;
}
return 0;
}
Program Synthesis
The Simplest Possible Program Synthesizer
Starting from a behavioral specification:
int abs(int x) =
x if x >= 0
-x otherwise
Create a harness for testing abs(x) exhaustively:
int main(int , char **) {
for(int x = 0; x != MAX_INT; ++x) {
int r = abs(x);
if((x >= 0 && r != x) || r != -x)
return -1;
}
return 0;
}
Enumerate every possible function abs(int x):
int abs(int x) { return -x; }
Program Synthesis
The Simplest Possible Program Synthesizer
Starting from a behavioral specification:
int abs(int x) =
x if x >= 0
-x otherwise
Create a harness for testing abs(x) exhaustively:
int main(int , char **) {
for(int x = 0; x != MAX_INT; ++x) {
int r = abs(x);
if((x >= 0 && r != x) || r != -x)
return -1;
}
return 0;
}
Enumerate every possible function abs(int x):
int abs(int x) { return ~x; }
Program Synthesis
The Simplest Possible Program Synthesizer
Starting from a behavioral specification:
int abs(int x) =
x if x >= 0
-x otherwise
Create a harness for testing abs(x) exhaustively:
int main(int , char **) {
for(int x = 0; x != MAX_INT; ++x) {
int r = abs(x);
if((x >= 0 && r != x) || r != -x)
return -1;
}
return 0;
}
Enumerate every possible function abs(int x):
int abs(int x) { return -(x+0); }
Program Synthesis
The Simplest Possible Program Synthesizer
Starting from a behavioral specification:
int abs(int x) =
x if x >= 0
-x otherwise
Create a harness for testing abs(x) exhaustively:
int main(int , char **) {
for(int x = 0; x != MAX_INT; ++x) {
int r = abs(x);
if((x >= 0 && r != x) || r != -x)
return -1;
}
return 0;
}
Enumerate every possible function abs(int x):
int abs(int x) { return ~(x+0); }
Program Synthesis
The Simplest Possible Program Synthesizer
Starting from a behavioral specification:
int abs(int x) =
x if x >= 0
-x otherwise
Create a harness for testing abs(x) exhaustively:
int main(int , char **) {
for(int x = 0; x != MAX_INT; ++x) {
int r = abs(x);
if((x >= 0 && r != x) || r != -x)
return -1;
}
return 0;
}
Enumerate every possible function abs(int x):
int abs(int x) { return -(x+1); }
Program Synthesis
The Simplest Possible Program Synthesizer
Starting from a behavioral specification:
int abs(int x) =
x if x >= 0
-x otherwise
Create a harness for testing abs(x) exhaustively:
int main(int , char **) {
for(int x = 0; x != MAX_INT; ++x) {
int r = abs(x);
if((x >= 0 && r != x) || r != -x)
return -1;
}
return 0;
}
Enumerate every possible function abs(int x):
int abs(int x) { return ~(x+1); }
Program Synthesis
The Simplest Possible Program Synthesizer
The brute-force synthesizer:
Is extremely slow, and
Explores an infinite number of programs.
We could improve the situation by carefully choosing which
programs to generate.
Modern program synthesis goes further and does better.
Introduction
Ingredients
X86 Disassembly and Assembly
IR Translation
IR Interpreter
SMT Integration
Applications
Enumerative Program Synthesis
CPU Emulator Synthesis
Peephole Superdeobfuscation
Template-Based Program Synthesis
Metamorphic Extraction
Conclusion
Ingredients
X86 Disassembly and Assembly
28 D8
0F B6 DB
81 C3 BB BB BB BB
01 D8
X86 Disassembler
sub bl, bl
movzx ebx, bl
add ebx, 0BBBBBBBBh
add eax, ebx
28 D8
0F B6 DB
81 C3 BB BB BB BB
01 D8
X86 Assembler
sub bl, bl
movzx ebx, bl
add ebx, 0BBBBBBBBh
add eax, ebx
Given bytes, a disassembler produces X86 assembly. An
assembler does the opposite.
Ingredients
IR Translation
add eax, ecx
IR Translator
vRes = vEax + vEcx;
vZF = vRes == 0;
vSF = vRes <s 0;
vPF = Parity(vRes);
vCF = vRes <u vEax;
vOF = (vEcx ^ vRes)
& (vEax ^ vRes) <s 0;
vAF = (vRes ^ vEax ^ vEcx)
& 0x10 != 0;
vEax = vRes;
IR translators convert instructions to a symbolic representation of
their actual effects when executed.
Ingredients
IR Interpreter
vRes = vEax + vEcx;
vZF = vRes == 0;
vSF = vRes <s 0;
vPF = Parity(vRes);
vCF = vRes <u vEax;
vOF = (vEcx ^ vRes)
& (vEax ^ vRes) <s 0;
vAF = (vRes ^ vEax ^ vEcx)
& 0x10 != 0;
vEax = vRes;
IR for add eax, ecx
eax 645EDE7Bh zf 0 of 0 sf 1
ecx 0FCD02BDEh af 0 pf 1 cf 0
Input State
IR Interpreter
eax 612F0A59h zf 0 of 0 sf 0
ecx 0FCD02BDEh af 1 pf 1 cf 1
Output State
The IR interpreter executes IR statements in an input state,
producing an output state.
Ingredients
SMT Integration
vRes = vEax + vEcx;
vZF = vRes == 0;
vSF = vRes <s 0;
vPF = Parity(vRes);
vCF = vRes <u vEax;
vOF = (vEcx ^ vRes)
& (vEax ^ vRes) <s 0;
vAF = (vRes ^ vEax ^ vEcx)
& 0x10 != 0;
vEax = vRes;
IR for add eax, ecx
vZF == 1
Query
SMT Solver
vEax = 0, vEcx = 0
SMT solvers can answer questions about the values of
variables and memory within IR sequences.
We queried for a model of eax and ecx where vZF == 1.
Introduction
Ingredients
Applications
Enumerative Program Synthesis
CPU Emulator Synthesis
Peephole Superdeobfuscation
Template-Based Program Synthesis
Metamorphic Extraction
Conclusion
Program Synthesis in Reverse Engineering
CPU Emulator Synthesis
CPU
Emulator
Synthesizer
X86 Assembler
Templates
add eax, ecx
vRes = vEax + vEcx;
vZF = vRes == 0;
vSF = vRes <s 0;
vPF = Parity(vRes);
vCF = vRes <u vEax;
vOF = (vEcx ^ vRes)
& (vEax ^ vRes) <s 0;
vAF = (vRes ^ vEax ^ vEcx)
& 0x10 != 0;
vEax = vEax + vEcx;
CPU Emulator Logic
We present a re-designed version of [2] for generating CPU
emulators.
Program Synthesis in Reverse Engineering
Peephole Superdeobfuscation
Deobfuscator
Generator
push edx
mov edx, 6F6B5081h
mov esi, 0BC6D38Bh
add esi, edx
pop edx
Obfuscated Code push r321
mov r321, i321
mov r322, i322
add r322, r321
pop r321
Simplifies To
mov r322, i321+i322
Deobfuscator Rule
We apply the ideas of [1] to automatically create
deobfuscators for certain obfuscators.
Program Synthesis in Reverse Engineering
Metamorphic Extraction
Functionality
Extractor
lodsb
add al, bl
xor al, 0D2h
sub al, 0F1h
add bl, al
add al, 0B7h
sub al, 90h
push cx
mov cl, bl
add al, bl
pop cx
add al, 90h
sub al, 0B7h
push ebx
mov ebx, 8716AEF1h
push ecx
push eax
xor dword ptr [esp], ebx
; continued
We apply a technique from [3] to automatically recover
metamorphically-generated code sequences.
Introduction
Ingredients
X86 Disassembly and Assembly
IR Translation
IR Interpreter
SMT Integration
Applications
Enumerative Program Synthesis
CPU Emulator Synthesis
Peephole Superdeobfuscation
Template-Based Program Synthesis
Metamorphic Extraction
Conclusion
CPU Emulator Synthesis
Overview
CPU emulators require accurate descriptions of the operation
of the X86 processor.
Sadly, the Intel manuals are at best vague, at worst wrong.
IF 64-Bit Mode
THEN
#UD;
ELSE
IF ((AL AND 0FH) > 9) or (AF = 1)
THEN
WRONG: AL ← AL + 6; AX ← AX + 0x106
WRONG: AH ← AH + 1;
AF ← 1;
CF ← 1;
AL ← AL AND 0FH;
ELSE
AF ← 0;
CF ← 0;
AL ← AL AND 0FH;
FI;
FI;
That’s from the first instruction in the manual.
CPU Emulator Synthesis
Overview
Idea: use program synthesis to find instruction descriptions.
add eax, ecx
CPU Emulator Synthesizer
vRes = vEax + vEcx;
vZF = vRes == 0;
vSF = vRes <s 0;
vPF = Parity(vRes);
vCF = vRes <u vEax;
vOF = (vEcx ^ vRes)
& (vEax ^ vRes) <s 0;
vAF = (vRes ^ vEax ^ vEcx)
& 0x10 != 0;
vEax = vEax + vEcx;
CPU Emulator Synthesis
Overview
add eax, ecx
Behavior Sampler
I/O Pair Data
Hypothesis Generator
Templates, Substitutions
Hypotheses
Empirical Sieve
Candidate Hypotheses
Equivalence Checker
Single Result
We describe each component
next.
1. Hypothesis Generator
2. Behavior Sampler
3. Empirical Sieve
4. Equivalence Checker
CPU Emulator Synthesis
Hypothesis Generator
We generate a family of expressions from a template expression
and a list of substitutions.
Binop(_,+,_)
vEax Dword(0)
Tree representation of vEax + 0
Token Replace With
+ +,-,&,|,^
vEax vEax,vEcx
Specified Substitutions
vEax + 0 vEax - 0 vEax & 0 vEax | 0 vEax ^ 0
vEcx + 0 vEcx - 0 vEcx & 0 vEcx | 0 vEcx ^ 0
The 5 ∗ 2 = 10 expressions generated from the above
CPU Emulator Synthesis
Hypothesis Generator
A result location is a location modified by an instruction.
A hypothesis is an IR expression of equality between a result
location variable and a generated expression.
For the instruction add eax, ebx, the result locations are
vEaxout , vZFout , vSFout , vPFout , vCFout , vOFout , and vAFout .
vZFout == (vEax + vEbx) <s 0 vZFout == ((vEax + vEbx) ^ vEax) >u 0
vSFout == (vEax + vEbx) <s 0 vSFout == ((vEax + vEbx) ^ vEax) >u 0
vPFout == (vEax + vEbx) <s 0 vPFout == ((vEax + vEbx) ^ vEax) >u 0
vCFout == (vEax + vEbx) <s 0 vCFout == ((vEax + vEbx) ^ vEax) >u 0
vOFout == (vEax + vEbx) <s 0 vOFout == ((vEax + vEbx) ^ vEax) >u 0
vAFout == (vEax + vEbx) <s 0 vAFout == ((vEax + vEbx) ^ vEax) >u 0
Some hypotheses for add eax, ebx
CPU Emulator Synthesis
Behavior Sampler
add eax, ecx Input State
Behavior Sampler
Output State
Given a set of register and flag values as input:
1. JIT assemble the instruction
2. Set the processor registers and flags to the input state
3. Execute the instruction
4. Collect the values of the registers and flags as output
These I/O pairs are samples of the instruction’s behavior.
CPU Emulator Synthesis
Empirical Sieve
eaxin = 0
ecxin = 0
cfout = 0
I/O Data
CFout == (vEax == vEcx)
Hypothesis
0 == (0 == 0)
0 == 1
false
Evaluation
If a hypothesis is false in any I/O pair, it cannot describe the
instruction’s behavior, so we discard it.
In the figure above, CFout == (vEax == vEcx) is an invalid
hypothesis because it evaluates to false for the pair listed.
CPU Emulator Synthesis
Empirical Sieve
I/O Pairs Hypothesis Generator
IR Interpreter
Unfalsified Hypotheses
We generate many hypotheses, test them against the I/O
pairs, and discard any that evaluate to false.
Multiple unfalsified hypotheses may remain after the empirical
sieve. The next component removes more of them.
CPU Emulator Synthesis
Equivalence Checker
Equivalence checking uses an SMT solver to determine
whether two expressions evaluate equally for all inputs. If not,
it finds an input that causes a different evaluation.
(vEax+vEcx) == 0
Expression #1
(vEcx+vEax+vEcx+vEax) == 0
Expression #2
Two hypotheses for ZFout .
We query an SMT solver as to the satisfiability of
((vEax+vEcx) == 0) != ((vEcx+vEax+vEcx+vEax) == 0).
If UNSAT, the hypotheses always behave the same.
IF SAT, the solution contains values for vEax and vEcx that
cause the hypotheses to evaluate differently.
CPU Emulator Synthesis
Equivalence Checker
The SMT solver reports satisfiability, and gives a model where
the first hypothesis for ZFout evaluates to 0, and the other to 1.
eaxin = 3d800000h
ecxin = 42800000h
Model for ((vEax+vEcx) == 0) != ((vEcx+vEax+vEcx+vEax) == 0)
If we sample the instruction’s behavior in this input state, one
of the hypotheses must become falsified, since:
If zfout = 1, then (vEax+vEcx) == 0 is false.
If zfout = 0, then (vEcx+vEax+vEcx+vEax) == 0 is false.
Hypothesis #2 is falsified.
CPU Emulator Synthesis
Behavior Sampler
add eax, ecx
Behavior Sampler
I/O Pair Data
Hypothesis Generator
<out> == <in>,
<out> == <in> <+> <in>,
. . .
CFout == (vEax == vEcx),
CFout == (vEcx <s 0),
CFout == (vEax + vEcx <u vEax),
. . .EmpiricalSieve
CFout == (vEcx <s 0),
CFout == (vEax + vEcx <u vEax),
. . .
Equivalence Checker
CFout == vEax + vEcx <u vEax
First, we generate hy-
potheses for the instruc-
tion’s effects.
CPU Emulator Synthesis
Hypothesis Generator
add eax, ecx
Behavior Sampler
I/O Pair Data
Hypothesis Generator
<out> == <in>,
<out> == <in> <+> <in>,
. . .
CFout == (vEax == vEcx),
CFout == (vEcx <s 0),
CFout == (vEax + vEcx <u vEax),
. . .EmpiricalSieve
CFout == (vEcx <s 0),
CFout == (vEax + vEcx <u vEax),
. . .
Equivalence Checker
CFout == vEax + vEcx <u vEax
Next, we collect I/O
pairs for an instruction.
CPU Emulator Synthesis
Empirical Sieve
add eax, ecx
Behavior Sampler
I/O Pair Data
Hypothesis Generator
<out> == <in>,
<out> == <in> <+> <in>,
. . .
CFout == (vEax == vEcx),
CFout == (vEcx <s 0),
CFout == (vEax + vEcx <u vEax),
. . .Empirical Sieve
CFout == (vEcx <s 0),
CFout == (vEax + vEcx <u vEax),
. . .
Equivalence Checker
CFout == vEax + vEcx <u vEax
Remove hypotheses
that are shown to be
false by any I/O pair.
CPU Emulator Synthesis
Equivalence Checking
add eax, ecx
Behavior Sampler
I/O Pair Data
Hypothesis Generator
<out> == <in>,
<out> == <in> <+> <in>,
. . .
CFout == (vEax == vEcx),
CFout == (vEcx <s 0),
CFout == (vEax + vEcx <u vEax),
. . .Empirical Sieve
CFout == (vEcx <s 0),
CFout == (vEax + vEcx <u vEax),
. . .
Equivalence Checker
CFout == vEax + vEcx <u vEax
The equivalence checker
removes all but one hy-
pothesis.
Introduction
Ingredients
X86 Disassembly and Assembly
IR Translation
IR Interpreter
SMT Integration
Applications
Enumerative Program Synthesis
CPU Emulator Synthesis
Peephole Superdeobfuscation
Template-Based Program Synthesis
Metamorphic Extraction
Conclusion
Peephole Superdeobfuscation
Rule-Based Obfuscation
Unobfuscated Obfuscated Constraints
add r81, r82 add r81, imm8
add r81, r82
sub r81, imm8
sub r321, r322 push r323 r323 = r321
mov r323, r322 r321 = esp
sub r321, r323 r321 = esp
pop r323 r323 = esp
Obfuscator Rules
Some obfuscators use a rule set to randomly permute code.
The ordinary and obfuscated sequences must have the same
(or at least “similar”) effects when executed.
Peephole Superdeobfuscation
add al, bl sub al, 9Fh sub al, 9Fh sub al, 9Fh
add al, bl
Blah
Peephole Superdeobfuscation
add al, bl sub al, 9Fh sub al, 9Fh sub al, 9Fh
sub al, 9Fh
add al, bl add al, bl
add al, 9Fh
Replace instructions with obfuscated sequences
Obfuscation
Peephole Superdeobfuscation
add al, bl sub al, 9Fh sub al, 9Fh sub al, 9Fh
sub al, 9Fh sub al, 9Fh
sub al, 7Fh
add al, bl add al, bl
add al, 7Fh
push edx
mov dh, 22h
inc dh
dec dh
add dh, 7Dh
add al, 9Fh add al, dh
pop edx
Replace instructions with obfuscated sequences
Obfuscation
Peephole Superdeobfuscation
add al, bl sub al, 9Fh sub al, 9Fh push cx
mov ch, 9Fh
sub al, 9Fh sub al, ch
pop cx
sub al, 7Fh sub al, 7Fh
sub al, 70h
add al, bl add al, bl
add al, 70h
add al, 7Fh add al, 7Fh
push edx push edx
push ecx
mov ch, 22h
mov dh, 22h mov dh, ch
pop ecx
inc dh inc dh
dec dh dec dh
add dh, 7Dh add dh, 7Dh
add al, dh add al, dh
pop edx pop edx
Replace instructions with obfuscated sequences
Obfuscation
Peephole Superdeobfuscation
add al, bl sub al, 9Fh sub al, 9Fh push cx
mov ch, 9Fh
sub al, ch
pop cx
sub al, 7Fh
sub al, 70h
add al, bl
add al, 70h
add al, 7Fh
push edx
push ecx
mov ch, 22h
mov dh, ch
pop ecx
inc dh
dec dh
add dh, 7Dh
add al, dh
pop edx
Blah
Peephole Superdeobfuscation
add al, bl sub al, 9Fh sub al, 9Fh push cx
mov ch, 9Fh
sub al, 9Fh sub al, ch
pop cx
sub al, 7Fh sub al, 7Fh
sub al, 70h
add al, bl add al, bl
add al, 70h
add al, 7Fh add al, 7Fh
push edx push edx
push ecx
mov ch, 22h
mov dh, 22h mov dh, ch
pop ecx
inc dh inc dh
dec dh dec dh
add dh, 7Dh add dh, 7Dh
add al, dh add al, dh
pop edx pop edx
Replace obfuscated sequences with instructions
Deobfuscation with Inverse Rules
Peephole Superdeobfuscation
add al, bl sub al, 9Fh sub al, 9Fh sub al, 9Fh
sub al, 9Fh sub al, 9Fh
sub al, 7Fh
add al, bl add al, bl
add al, 7Fh
push edx
mov dh, 22h
inc dh
dec dh
add dh, 7Dh
add al, 9Fh add al, dh
pop edx
Replace obfuscated sequences with instructions
Deobfuscation with Inverse Rules
Peephole Superdeobfuscation
add al, bl sub al, 9Fh sub al, 9Fh sub al, 9Fh
sub al, 9Fh
add al, bl add al, bl
add al, 9Fh
Replace obfuscated sequences with instructions
Deobfuscation with Inverse Rules
Peephole Superdeobfuscation
add al, bl sub al, 9Fh sub al, 9Fh sub al, 9Fh
add al, bl
Blah
Deobfuscation with Inverse Rules
Peephole Superdeobfuscation
Rule-Based Deobfuscation
Obfuscated Unobfuscated Constraints
add r81, imm8 add r81, r82
add r81, r82
sub r81, imm8
push r323 sub r321, r322 r323 = r321
mov r323, r322 r321 = esp
sub r321, r323 r321 = esp
pop r323 r323 = esp
Deobfuscator Rules
We could inspect the code manually, pick out patterns, and
write deobfuscator rules.
This is Tedious And Error-Prone—.
We automate the process with program synthesis.
Peephole Superdeobfuscation
The Idea: Obfuscated/Unobfuscated Behavior Must Match
lea esp, [esp-4]
mov [esp], eax
Obfuscated
push eax
IR Interpreter
eax 645EDE7Bh
eax 0FCD02BDEh
Output State #1 Output State #2
State
Comparison
Collect obfuscated sequences.
Peephole Superdeobfuscation
The Idea: Obfuscated/Unobfuscated Behavior Must Match
lea esp, [esp-4]
mov [esp], eax
Obfuscated
push eax
IR Interpreter
eax 645EDE7Bh
esp 0FCD02BDEh
Input State
esp 0FCD02BDAh
mem[esp] = 0x645EDE7B
Output State #1
Output State #2
State
Comparison
Generate I/O pairs.
Peephole Superdeobfuscation
The Idea: Obfuscated/Unobfuscated Behavior Must Match
lea esp, [esp-4]
mov [esp], eax
Obfuscated
push eax
Collect
Registers /
Constants
eax 645EDE7Bh
eax 0FCD02BDEh
Output State #1
eax, esp, 4
Operand Parts
Output State #2
State
Comparison
Preparing for candidate generation, collect registers and constants
from the obfuscated sequence.
Peephole Superdeobfuscation
The Idea: Obfuscated/Unobfuscated Behavior Must Match
lea esp, [esp-4]
push r32
push i32
add r32, i32
Candidate Templates
push eax
Candidate
push esp
push 4
add esp, 4
add eax, 4
IR Interpreter
Candidate
Enumerator
eax 645EDE7Bh
eax 0FCD02BDEh
Output State #1
eax, esp, 4
Operand Parts
Output State #2
State
Comparison
Plug the registers and constants into the templates to generate
candidate replacements.
Peephole Superdeobfuscation
The Idea: Obfuscated/Unobfuscated Behavior Must Match
lea esp, [esp-4] push eax
Candidate
IR Interpreter
eax 645EDE7Bh
esp 0FCD02BDEh
Input State
Output State #1
esp 0FCD02BDAh
mem[esp] = 0x645EDE7B
Output State #2
State
Comparison
For each candidate, generate I/O pairs in the same input states
as for the original sequence.
Peephole Superdeobfuscation
The Idea: Obfuscated/Unobfuscated Behavior Must Match
lea esp, [esp-4]
mov [esp], eax
Obfuscated
push eax
Candidate
IR Interpreter
eax 645EDE7Bh
esp 0FCD02BDEh
Input State
esp 0FCD02BDAh
mem[esp] = 0x645EDE7B
Output State #1
esp 0FCD02BDAh
mem[esp] = 0x645EDE7B
Output State #2
State
Comparison
Compare the resulting states. We can be lenient for flags and
stack memory. If they match, the candidate is a potential
replacement for the obfuscated sequence.
Peephole Superdeobfuscation
The Idea: Obfuscated/Unobfuscated Behavior Must Match
lea esp, [esp-4]
mov [esp], eax
Obfuscated
push eax
Candidate
Equivalence Checker
eax 645EDE7Bh
eax 0FCD02BDEh
Output State #1 Output State #2
State
Comparison
Obfuscated Deobfuscated
lea esp, [esp-4] push eax
mov [esp], eax
Deobfuscator Rules
If the states matched, use an SMT solver to ensure that the
sequences are actually equivalent. Learn rules if so.
Peephole Superdeobfuscation
Generalization
Obfuscated Deobfuscated Constraints
lea esp, [esp-4] push eax
mov [esp], eax
Deobfuscator Rules
Our deobfuscator rule is specific to the register eax.
In fact, eax can be replaced with any register except esp.
Peephole Superdeobfuscation
Generalization
Obfuscated Deobfuscated Constraints
lea esp, [esp-4] push eax
mov [esp], eax
Deobfuscator Rules
Obfuscated Deobfuscated Constraints
Generalized Rules
lea esp, [esp-4] push r32 r32 = esp
mov [esp], r32
Our deobfuscator rule is specific to the register eax.
In fact, eax can be replaced with any register except esp.
Generalization removes specific register and/or immediate
values from deobfuscation rules.
Peephole Superdeobfuscation
In Action
sub esp, 4
mov [esp], eax
mov eax, ebp
push ebp
push edi
mov [esp], eax
xor [esp], 7FEE03B1h
pop ebp
sub esp, 4
mov [esp], esi
push edx
mov edx, 6F6B5081h
mov esi, 0BC6D38Bh
add esi, edx
pop edx
Obfuscated Deobfuscated Constraints
mov [esp], r322
mov r322, i321+i322
r321 = r322
We iterate through the in-
structions, trying to sim-
plify those within a window.
: bottom of current window.
: can simplify.
: simplification applied.
Peephole Superdeobfuscation
In Action
sub esp, 4
mov [esp], eax
mov eax, ebp
push ebp
push edi
mov [esp], eax
xor [esp], 7FEE03B1h
pop ebp
sub esp, 4
mov [esp], esi
push edx
mov edx, 6F6B5081h
mov esi, 0BC6D38Bh
add esi, edx
pop edx
Obfuscated Deobfuscated Constraints
sub esp, 4 push r32 r32 = esp
mov [esp], r32
mov [esp], r322
mov r322, i321+i322
r321 = r322
Learn rule and generalize.
Peephole Superdeobfuscation
In Action
push eax
mov eax, ebp
push ebp
push edi
mov [esp], eax
xor [esp], 7FEE03B1h
pop ebp
sub esp, 4
mov [esp], esi
push edx
mov edx, 6F6B5081h
mov esi, 0BC6D38Bh
add esi, edx
pop edx
Obfuscated Deobfuscated Constraints
sub esp, 4 push r32 r32 = esp
mov [esp], r32
mov [esp], r322
mov r322, i321+i322
r321 = r322
No simplification.
Peephole Superdeobfuscation
In Action
push eax
mov eax, ebp
push ebp
push edi
mov [esp], eax
xor [esp], 7FEE03B1h
pop ebp
sub esp, 4
mov [esp], esi
push edx
mov edx, 6F6B5081h
mov esi, 0BC6D38Bh
add esi, edx
pop edx
Obfuscated Deobfuscated Constraints
sub esp, 4 push r32 r32 = esp
mov [esp], r32
mov [esp], r322
mov r322, i321+i322
r321 = r322
No simplification.
Peephole Superdeobfuscation
In Action
push eax
mov eax, ebp
push ebp
push edi
mov [esp], eax
xor [esp], 7FEE03B1h
pop ebp
sub esp, 4
mov [esp], esi
push edx
mov edx, 6F6B5081h
mov esi, 0BC6D38Bh
add esi, edx
pop edx
Obfuscated Deobfuscated Constraints
sub esp, 4 push r32 r32 = esp
mov [esp], r32
mov [esp], r322
mov r322, i321+i322
r321 = r322
No simplification.
Peephole Superdeobfuscation
In Action
push eax
mov eax, ebp
push ebp
push edi
mov [esp], eax
xor [esp], 7FEE03B1h
pop ebp
sub esp, 4
mov [esp], esi
push edx
mov edx, 6F6B5081h
mov esi, 0BC6D38Bh
add esi, edx
pop edx
Obfuscated Deobfuscated Constraints
sub esp, 4 push r32 r32 = esp
mov [esp], r32
push r321 push r322 r322 = esp
mov [esp], r322
mov r322, i321+i322
r321 = r322
Learn rule and generalize.
Peephole Superdeobfuscation
In Action
push eax
mov eax, ebp
push ebp
push eax
xor [esp], 7FEE03B1h
xor [esp], 7FEE03B1h
pop ebp
sub esp, 4
mov [esp], esi
push edx
mov edx, 6F6B5081h
mov esi, 0BC6D38Bh
add esi, edx
pop edx
Obfuscated Deobfuscated Constraints
sub esp, 4 push r32 r32 = esp
mov [esp], r32
push r321 push r322 r322 = esp
mov [esp], r322
mov r322, i321+i322
r321 = r322
No simplification.
Peephole Superdeobfuscation
In Action
push eax
mov eax, ebp
push ebp
push eax
xor [esp], 7FEE03B1h
xor [esp], 7FEE03B1h
pop ebp
sub esp, 4
mov [esp], esi
push edx
mov edx, 6F6B5081h
mov esi, 0BC6D38Bh
add esi, edx
pop edx
Obfuscated Deobfuscated Constraints
sub esp, 4 push r32 r32 = esp
mov [esp], r32
push r321 push r322 r322 = esp
mov [esp], r322
push r321 mov r322, r321 r321 = esp
xor [esp], i32 xor r322, i32 r322 = esp
pop r322
mov r322, i321+i322
r321 = r322
Learn rule and generalize.
Peephole Superdeobfuscation
In Action
push eax
mov eax, ebp
push ebp
xor [esp], 7FEE03B1h
mov ebp, eax
xor ebp, 7FEE03B1h
sub esp, 4
mov [esp], esi
push edx
mov edx, 6F6B5081h
mov esi, 0BC6D38Bh
add esi, edx
pop edx
Obfuscated Deobfuscated Constraints
sub esp, 4 push r32 r32 = esp
mov [esp], r32
push r321 push r322 r322 = esp
mov [esp], r322
push r321 mov r322, r321 r321 = esp
xor [esp], i32 xor r322, i32 r322 = esp
pop r322
mov r322, i321+i322
r321 = r322
No simplification.
Peephole Superdeobfuscation
In Action
push eax
mov eax, ebp
push ebp
xor [esp], 7FEE03B1h
mov ebp, eax
xor ebp, 7FEE03B1h
sub esp, 4
mov [esp], esi
push edx
mov edx, 6F6B5081h
mov esi, 0BC6D38Bh
add esi, edx
pop edx
Obfuscated Deobfuscated Constraints
sub esp, 4 push r32 r32 = esp
mov [esp], r32
push r321 push r322 r322 = esp
mov [esp], r322
push r321 mov r322, r321 r321 = esp
xor [esp], i32 xor r322, i32 r322 = esp
pop r322
mov r322, i321+i322
r321 = r322
Apply previous simplification.
Peephole Superdeobfuscation
In Action
push eax
mov eax, ebp
push ebp
xor [esp], 7FEE03B1h
mov ebp, eax
xor ebp, 7FEE03B1h
push esi
push edx
mov edx, 6F6B5081h
mov esi, 0BC6D38Bh
add esi, edx
pop edx
Obfuscated Deobfuscated Constraints
sub esp, 4 push r32 r32 = esp
mov [esp], r32
push r321 push r322 r322 = esp
mov [esp], r322
push r321 mov r322, r321 r321 = esp
xor [esp], i32 xor r322, i32 r322 = esp
pop r322
mov r322, i321+i322
r321 = r322
No simplification.
Peephole Superdeobfuscation
In Action
push eax
mov eax, ebp
push ebp
xor [esp], 7FEE03B1h
mov ebp, eax
xor ebp, 7FEE03B1h
push esi
push edx
mov edx, 6F6B5081h
mov esi, 0BC6D38Bh
add esi, edx
pop edx
Obfuscated Deobfuscated Constraints
sub esp, 4 push r32 r32 = esp
mov [esp], r32
push r321 push r322 r322 = esp
mov [esp], r322
push r321 mov r322, r321 r321 = esp
xor [esp], i32 xor r322, i32 r322 = esp
pop r322
mov r322, i321+i322
r321 = r322
No simplification.
Peephole Superdeobfuscation
In Action
push eax
mov eax, ebp
push ebp
xor [esp], 7FEE03B1h
mov ebp, eax
xor ebp, 7FEE03B1h
push esi
push edx
mov edx, 6F6B5081h
mov esi, 0BC6D38Bh
add esi, edx
pop edx
Obfuscated Deobfuscated Constraints
sub esp, 4 push r32 r32 = esp
mov [esp], r32
push r321 push r322 r322 = esp
mov [esp], r322
push r321 mov r322, r321 r321 = esp
xor [esp], i32 xor r322, i32 r322 = esp
pop r322
mov r322, i321+i322
r321 = r322
No simplification.
Peephole Superdeobfuscation
In Action
push eax
mov eax, ebp
push ebp
xor [esp], 7FEE03B1h
mov ebp, eax
xor ebp, 7FEE03B1h
push esi
push edx
mov edx, 6F6B5081h
mov esi, 0BC6D38Bh
add esi, edx
pop edx
Obfuscated Deobfuscated Constraints
sub esp, 4 push r32 r32 = esp
mov [esp], r32
push r321 push r322 r322 = esp
mov [esp], r322
push r321 mov r322, r321 r321 = esp
xor [esp], i32 xor r322, i32 r322 = esp
pop r322
mov r322, i321+i322
r321 = r322
No simplification.
Peephole Superdeobfuscation
In Action
push eax
mov eax, ebp
push ebp
xor [esp], 7FEE03B1h
mov ebp, eax
xor ebp, 7FEE03B1h
push esi
push edx
mov edx, 6F6B5081h
mov esi, 0BC6D38Bh
add esi, edx
pop edx
Obfuscated Deobfuscated Constraints
sub esp, 4 push r32 r32 = esp
mov [esp], r32
push r321 push r322 r322 = esp
mov [esp], r322
push r321 mov r322, r321 r321 = esp
xor [esp], i32 xor r322, i32 r322 = esp
pop r322
push r321 mov r322, i321+i322 r321 = esp
mov r321, i321 mov r322, i321+i322 r322 = esp
mov r322, i322 r321 = r322
add r322, r321
pop r321
Learn rule and generalize.
Peephole Superdeobfuscation
In Action
push eax
mov eax, ebp
push ebp
xor [esp], 7FEE03B1h
mov ebp, eax
xor ebp, 7FEE03B1h
push esi
mov esi, 7B32240Ch
Obfuscated Deobfuscated Constraints
sub esp, 4 push r32 r32 = esp
mov [esp], r32
push r321 push r322 r322 = esp
mov [esp], r322
push r321 mov r322, r321 r321 = esp
xor [esp], i32 xor r322, i32 r322 = esp
pop r322
push r321 mov r322, i321+i322 r321 = esp
mov r321, i321 mov r322, i321+i322 r322 = esp
mov r322, i322 r321 = r322
add r322, r321
pop r321
Continue ad infinitum.
wtf?
Peephole Superdeobfuscation
System Output
Deobfuscator Rules
Code Generator
C CodePython Code OCaml Code
We can turn a set of rules into a program that uses
pattern-matching to implement those transformations.
We can generate a deobfuscator automatically, and also the
obfuscator that created the code in the first place!
Peephole Superdeobfuscation
Limitations
mov eax, 12h
or edx, eax
add eax, 34h
jmp @end
bswap ax
bsr edx, 12h
salc
cmc
@end:
xor edx, edx
mov eax, [esp]
add esp, 4
ret
Could simplify if we knew that edx was dead (i.e., not used
before its next modification ).
Ours is a forward analysis; dead code elimination requires
backwards analysis.
Can incorporate a liveness analysis for this purpose.
Introduction
Ingredients
X86 Disassembly and Assembly
IR Translation
IR Interpreter
SMT Integration
Applications
Enumerative Program Synthesis
CPU Emulator Synthesis
Peephole Superdeobfuscation
Template-Based Program Synthesis
Metamorphic Extraction
Conclusion
Template-Based Program Synthesis
Overview
Question: is it possible to create the function x+1 by using two of
~ (not) and/or - (neg)?
y = ~x; y = -x;
z = ~y; z = ~y;
y = ~x; y = -x;
z = -y; z = -y;
Table: All Possible Sequences
Template-Based Program Synthesis
Templates
Problem phrased as a template: what do bop1 and bop2 need to be
set to, so that f(x) == x+1 for all values of x?
bool bop1 , bop2;
int f(int x)
{
int y = bop1 ? -x : ~x;
return bop2 ? -y : ~y;
}
Template-Based Program Synthesis
Solving
After phrasing the question appropriately:
English Mathematics
Are there values of bop1, bop2 ∃ bop1, bop2 ∈ Bool ·
Such that, for all values of x ∀ x ∈ BV[32] ·
In the code
y = bop1 ? -x : ~x let y = bop1 ? -x : ~x in
z = bop2 ? -y : ~y let z = bop2 ? -y : ~y in
z == x+1 is always true? z == x+1
An SMT solver gives bop1 = false, bop2 = true.
I.e., int f(int x) { return -~x; }
Template-Based Program Synthesis
Extending the Framework
We can extend the idea to use
more than two operator types:
Or reference further constants
that the solver must provide:
char op1;
int f(int x)
{
y = op1 == 0 ? -x :
op1 == 1 ? ~x :
x-1;
return y;
}
bool op1;
char c1;
int f(int x)
{
y = op1 ? x + c1 :
x ^ c1;
return y;
}
Introduction
Ingredients
X86 Disassembly and Assembly
IR Translation
IR Interpreter
SMT Integration
Applications
Enumerative Program Synthesis
CPU Emulator Synthesis
Peephole Superdeobfuscation
Template-Based Program Synthesis
Metamorphic Extraction
Conclusion
Metamorphic Extraction
Metamorphic Decoder Generation
A metamorphic engine generates code like the following:
lodsb
op al, bl
op al, i81
op al, i82
op bl, al
Where each op can be add, sub, or xor, and i81
/ i82 are 8-bit constants. I.e., 34 ∗ 28 ∗ 28 ≈ 5.3
million possible instances.
Example metamorphic decoder:
lodsb
add al, bl
xor al, 0D2h
sub al, 0F1h
add bl, al
Metamorphic Extraction
Further Obfuscation
add al, 0B7h
sub al, 90h
push cx
mov cl, bl
add al, bl
pop cx
add al, 90h
sub al, 0B7h
push ebx
mov ebx, 8716AEF1h
push ecx
push eax
xor dword ptr [esp], ebx
; continued
Obfuscator
lodsb
add al, bl
xor al, 0D2h
sub al, 0F1h
add bl, al
The metamorphic code is then further obfuscated.
Metamorphic Extraction
Goal
lodsb
op al, bl
op al, i81
op al, i82
op bl, al
add al, 0B7h
sub al, 90h
push cx
mov cl, bl
add al, bl
pop cx
add al, 90h
sub al, 0B7h
push ebx
mov ebx, 8716AEF1h
push ecx
push eax
xor dword ptr [esp], ebx
; continued
Re-Create
Given an obfuscated sequence, we want to determine the
underlying metamorphically-generated function.
I.e., would like to know the ops and i8s.
We re-create the information instead of deobfuscating.
Let’s explore two approaches based on program synthesis.
Metamorphic Extraction
Template for Metamorphic Functionality
Template for Metamorphic Functionality
al = Load(vMem,vEdi,8);
a = op1 == 0 ? al + vBl : op1 == 1 ? al - vBl : al ^ vBl;
b = op2 == 0 ? a + c1 : op2 == 1 ? a - c1 : a ^ c1;
c = op3 == 0 ? b + c2 : op3 == 1 ? b - c2 : b ^ c2;
d = op4 == 0 ? vBl + c : op4 == 1 ? vBl - c : vBl ^ c;
X86-related variables:
al is the value read by lodsb
vBl is the initial value of bl
d is the final value of bl
Template parameters:
Constants c1, c2
Operators op1, op2, op3, op4
0:add, 1:sub, 2+:xor
lodsb
op al, bl
op al, i81
op al, i82
op bl, al
Metamorphic Extraction
Straightforward Approach Using ∃ · ∀·
Template Program
IR for obfuscated X86
IR assertions
∃ op1, op2, op3, op4, c1, c2 ·
∀ input X86 states ·
d == vBlAfter
Query
SMT Solver
Values for
op1, op2, op3, op4, c1, c2
We can solve directly using the quantifiers ∃, ∀.
Quantifiers can slow solving, so we show an alternative.
Metamorphic Extraction
Collect I/O Pairs
IR for obfuscated X86 Input State
IR Interpreter
Output State
Collect I/O pairs for the obfuscated X86 seqeuence.
Extract the important parts of the states.
al 0x00 vBl 0x00 vBlAfter 0xE1
Parts of state needed for synthesis
Metamorphic Extraction
Create Witnesses
Plug the I/O pair
al 0x00 vBl 0x00 vBlAfter 0xE1
Into the template
a = op1 == 0 ? al + vBl : op1 == 1 ? al - vBl : al ^ vBl;
b = op2 == 0 ? a + c1 : op2 == 1 ? a - c1 : a ^ c1;
c = op3 == 0 ? b + c2 : op3 == 1 ? b - c2 : b ^ c2;
d = op4 == 0 ? vBl + c : op4 == 1 ? vBl - c : vBl ^ c;
To obtain a witness:
a1 = op1 == 0 ? 0x00+0x00 : op1 == 1 ? 0x00-0x00 : 0x00^0x00;
b1 = op2 == 0 ? a1 + c1 : op2 == 1 ? a1 - c1 : a1 ^ c1;
c1 = op3 == 0 ? b1 + c2 : op3 == 1 ? b1 - c2 : b1 ^ c2;
d1 = op4 == 0 ? 0x00 + c1 : 0x00 == 1 ? 0x00 - c1 : 0x00 ^ c1;
assert(d1 == 0xE1);
Metamorphic Extraction
Synthesizing Candidate Functions
Witnesses
SMT Solver
Function Doesn’t Exist
Values for
op1, op2, op3, op4, c1, c2
UNSAT SAT
Query for template parameters that satisfy the witnesses.
If they exist, create a function from them.
a = al ^ vBl;
b = a ^ 0x4A;
c = b ^ 0x85;
d = vBl + c;
Metamorphic Extraction
Equivalence Checking
Our function is only valid for witnesses seen far.
Does its behavior always match the X86?
Synthesized Function
Obfuscated X86
IR assertions
d != vBlAfter
Query
SMT Solver
Valid State Causing Difference
UNSAT SAT
If the formula is satisfiable, the output is a counterexample:
a state causing a difference in execution.
al 0x00 vBl 0x88
Metamorphic Extraction
Refinement
If the function did not match the X86, use the output state to
create a new witness. Repeat until error or success.
Witnesses
Synthesis
Can’t SynthesizeEquivalence Check
Create New Witness Function is Valid
UNSATSAT
UNSATSAT
Metamorphic Extraction
In Action
lodsb
add al, bl
xor al, 0D2h
sub al, 0F1h
add bl, al
The deobfuscated sequence is shown for clarity. Our analysis
works upon obfuscated sequences.
a = al ^ vBl; c = b ^ 0x85; c = b ^ 0x85; c = b ^ 0x85;
Synthesized b = a ^ 0x4A;
Program c = b ^ 0x85;
d = vBl + c;
Counter-
Example
Begin with a program synthesized from the witnesses.
Try to find a counter-example.
Metamorphic Extraction
In Action
lodsb
add al, bl
xor al, 0D2h
sub al, 0F1h
add bl, al
The deobfuscated sequence is shown for clarity. Our analysis
works upon obfuscated sequences.
a = al ^ vBl; c = b ^ 0x85; c = b ^ 0x85; c = b ^ 0x85;
Synthesized b = a ^ 0x4A;
Program c = b ^ 0x85;
d = vBl + c;
Counter- vBl = 0x88
Example al = 0x00
Synthesize a program from the witnesses.
Try to find a counter-example.
Found: generate new witness.
Metamorphic Extraction
In Action
lodsb
add al, bl
xor al, 0D2h
sub al, 0F1h
add bl, al
The deobfuscated sequence is shown for clarity. Our analysis
works upon obfuscated sequences.
a = al ^ vBl; a = al + vBl; c = b ^ 0x85; c = b ^ 0x85;
Synthesized b = a ^ 0x4A; b = a ^ 0xD9;
Program c = b ^ 0x85; c = b - 0xE8;
d = vBl + c; d = vBl + c;
Counter- vBl = 0x88
Example al = 0x00
Synthesize a program from the witnesses.
Try to find a counter-example.
Metamorphic Extraction
In Action
lodsb
add al, bl
xor al, 0D2h
sub al, 0F1h
add bl, al
The deobfuscated sequence is shown for clarity. Our analysis
works upon obfuscated sequences.
a = al ^ vBl; a = al + vBl; c = b ^ 0x85; c = b ^ 0x85;
Synthesized b = a ^ 0x4A; b = a ^ 0xD9;
Program c = b ^ 0x85; c = b - 0xE8;
d = vBl + c; d = vBl + c;
Counter- vBl = 0x88 vBl = 0x20
Example al = 0x00 al = 0x10
Synthesize a program from the witnesses.
Try to find a counter-example.
Found: generate new witness.
Metamorphic Extraction
In Action
lodsb
add al, bl
xor al, 0D2h
sub al, 0F1h
add bl, al
The deobfuscated sequence is shown for clarity. Our analysis
works upon obfuscated sequences.
a = al ^ vBl; a = al + vBl; a = al + vBl; c = b ^ 0x85;
Synthesized b = a ^ 0x4A; b = a ^ 0xD9; b = a ^ 0xD3;
Program c = b ^ 0x85; c = b - 0xE8; c = b + 0x0E;
d = vBl + c; d = vBl + c; d = vBl + c;
Counter- vBl = 0x88 vBl = 0x20
Example al = 0x00 al = 0x10
Synthesize a program from the witnesses.
Try to find a counter-example.
Metamorphic Extraction
In Action
lodsb
add al, bl
xor al, 0D2h
sub al, 0F1h
add bl, al
The deobfuscated sequence is shown for clarity. Our analysis
works upon obfuscated sequences.
a = al ^ vBl; a = al + vBl; a = al + vBl; c = b ^ 0x85;
Synthesized b = a ^ 0x4A; b = a ^ 0xD9; b = a ^ 0xD3;
Program c = b ^ 0x85; c = b - 0xE8; c = b + 0x0E;
d = vBl + c; d = vBl + c; d = vBl + c;
Counter- vBl = 0x88 vBl = 0x20 vBl = 0x23
Example al = 0x00 al = 0x10 al = 0x08
Synthesize a program from the witnesses.
Try to find a counter-example.
Found: generate new witness.
Metamorphic Extraction
In Action
lodsb
add al, bl
xor al, 0D2h
sub al, 0F1h
add bl, al
The deobfuscated sequence is shown for clarity. Our analysis
works upon obfuscated sequences.
a = al ^ vBl; a = al + vBl; a = al + vBl; a = al + vBl;
Synthesized b = a ^ 0x4A; b = a ^ 0xD9; b = a ^ 0xD3; b = a ^ 0xD2;
Program c = b ^ 0x85; c = b - 0xE8; c = b + 0x0E; c = b + 0x0F;
d = vBl + c; d = vBl + c; d = vBl + c; d = vBl + c;
Counter- vBl = 0x88 vBl = 0x20 vBl = 0x23
Example al = 0x00 al = 0x10 al = 0x08
Synthesize a program from the witnesses.
Try to find a counter-example.
Metamorphic Extraction
In Action
lodsb
add al, bl
xor al, 0D2h
sub al, 0F1h
add bl, al
The deobfuscated sequence is shown for clarity. Our analysis
works upon obfuscated sequences.
a = al ^ vBl; a = al + vBl; a = al + vBl; a = al + vBl;
Synthesized b = a ^ 0x4A; b = a ^ 0xD9; b = a ^ 0xD3; b = a ^ 0xD2;
Program c = b ^ 0x85; c = b - 0xE8; c = b + 0x0E; c = b + 0x0F;
d = vBl + c; d = vBl + c; d = vBl + c; d = vBl + c;
Counter- vBl = 0x88 vBl = 0x20 vBl = 0x23
Example al = 0x00 al = 0x10 al = 0x08 NONE
Synthesize a program from the witnesses.
Try to find a counter-example.
Not found: function is valid.
Introduction
Ingredients
X86 Disassembly and Assembly
IR Translation
IR Interpreter
SMT Integration
Applications
Enumerative Program Synthesis
CPU Emulator Synthesis
Peephole Superdeobfuscation
Template-Based Program Synthesis
Metamorphic Extraction
Conclusion
New Course Offering
New training course offering on SMT-based binary program analysis.
Written for low-level people comfortable programming in
Python; no particular math or CS background required.
Learn what SMT solvers are and how to use them.
Lecture material vividly illustrated like these slides.
Students construct a minimal, yet fully functional SMT-based
program analysis framework in Python.
Dozens of small, guided programming exercises.
Code an SMT solver, X86 → IR translator, ROP compiler1
.
Available now!
1
ROP compiler application subject to potential replacement pending
forthcoming regulation of the computer security industry
Questions?
rolf@msreverseengineering.com
Check out M¨obius Strip Reverse Engineering at:
http://www.msreverseengineering.com
Program analysis training classes
Reverse engineering training classes
Consulting services
Blog, research archive, and other resources
Thanks
My proofreaders are awesome:
Igor Skochinsky
William Whistler
Vijay D’Silva
People who publish inspiring work are awesome, too.
References
Sorav Bansal and Alex Aiken.
Automatic generation of peephole superoptimizers.
In ACM Sigplan Notices, volume 41, pages 394–403. ACM,
2006.
Patrice Godefroid and Ankur Taly.
Automated synthesis of symbolic instruction encodings from
i/o samples.
In ACM SIGPLAN Notices, volume 47, pages 441–452. ACM,
2012.
Sumit Gulwani, Susmit Jha, Ashish Tiwari, and Ramarathnam
Venkatesan.
Synthesis of loop-free programs.
In ACM SIGPLAN Notices, volume 46, pages 62–73. ACM,
2011.

Más contenido relacionado

La actualidad más candente

The Evolution of Async-Programming on .NET Platform (TUP, Full)
The Evolution of Async-Programming on .NET Platform (TUP, Full)The Evolution of Async-Programming on .NET Platform (TUP, Full)
The Evolution of Async-Programming on .NET Platform (TUP, Full)jeffz
 
Apache PIG - User Defined Functions
Apache PIG - User Defined FunctionsApache PIG - User Defined Functions
Apache PIG - User Defined FunctionsChristoph Bauer
 
Synapse india dotnet development overloading operater part 3
Synapse india dotnet development overloading operater part 3Synapse india dotnet development overloading operater part 3
Synapse india dotnet development overloading operater part 3Synapseindiappsdevelopment
 
Algoritmos sujei
Algoritmos sujeiAlgoritmos sujei
Algoritmos sujeigersonjack
 
Introduction to nsubstitute
Introduction to nsubstituteIntroduction to nsubstitute
Introduction to nsubstituteSuresh Loganatha
 
The Evolution of Async-Programming (SD 2.0, JavaScript)
The Evolution of Async-Programming (SD 2.0, JavaScript)The Evolution of Async-Programming (SD 2.0, JavaScript)
The Evolution of Async-Programming (SD 2.0, JavaScript)jeffz
 
An Introduction to Property Based Testing
An Introduction to Property Based TestingAn Introduction to Property Based Testing
An Introduction to Property Based TestingC4Media
 
Asterisk: PVS-Studio Takes Up Telephony
Asterisk: PVS-Studio Takes Up TelephonyAsterisk: PVS-Studio Takes Up Telephony
Asterisk: PVS-Studio Takes Up TelephonyAndrey Karpov
 
Implementing the IO Monad in Scala
Implementing the IO Monad in ScalaImplementing the IO Monad in Scala
Implementing the IO Monad in ScalaHermann Hueck
 
EcmaScript unchained
EcmaScript unchainedEcmaScript unchained
EcmaScript unchainedEduard Tomàs
 
Javascript Uncommon Programming
Javascript Uncommon ProgrammingJavascript Uncommon Programming
Javascript Uncommon Programmingjeffz
 
All VLSI programs
All VLSI programsAll VLSI programs
All VLSI programsGouthaman V
 
Meck at erlang factory, london 2011
Meck at erlang factory, london 2011Meck at erlang factory, london 2011
Meck at erlang factory, london 2011Adam Lindberg
 
VTU DSA Lab Manual
VTU DSA Lab ManualVTU DSA Lab Manual
VTU DSA Lab ManualAkhilaaReddy
 
Gevent what's the point
Gevent what's the pointGevent what's the point
Gevent what's the pointseanmcq
 
pdx-react-observables
pdx-react-observablespdx-react-observables
pdx-react-observablesIan Irvine
 
Lecture 5, c++(complete reference,herbet sheidt)chapter-15
Lecture 5, c++(complete reference,herbet sheidt)chapter-15Lecture 5, c++(complete reference,herbet sheidt)chapter-15
Lecture 5, c++(complete reference,herbet sheidt)chapter-15Abu Saleh
 

La actualidad más candente (20)

The Evolution of Async-Programming on .NET Platform (TUP, Full)
The Evolution of Async-Programming on .NET Platform (TUP, Full)The Evolution of Async-Programming on .NET Platform (TUP, Full)
The Evolution of Async-Programming on .NET Platform (TUP, Full)
 
JVM Mechanics
JVM MechanicsJVM Mechanics
JVM Mechanics
 
06 Loops
06 Loops06 Loops
06 Loops
 
Apache PIG - User Defined Functions
Apache PIG - User Defined FunctionsApache PIG - User Defined Functions
Apache PIG - User Defined Functions
 
Synapse india dotnet development overloading operater part 3
Synapse india dotnet development overloading operater part 3Synapse india dotnet development overloading operater part 3
Synapse india dotnet development overloading operater part 3
 
Algoritmos sujei
Algoritmos sujeiAlgoritmos sujei
Algoritmos sujei
 
Introduction to nsubstitute
Introduction to nsubstituteIntroduction to nsubstitute
Introduction to nsubstitute
 
The Evolution of Async-Programming (SD 2.0, JavaScript)
The Evolution of Async-Programming (SD 2.0, JavaScript)The Evolution of Async-Programming (SD 2.0, JavaScript)
The Evolution of Async-Programming (SD 2.0, JavaScript)
 
An Introduction to Property Based Testing
An Introduction to Property Based TestingAn Introduction to Property Based Testing
An Introduction to Property Based Testing
 
Asterisk: PVS-Studio Takes Up Telephony
Asterisk: PVS-Studio Takes Up TelephonyAsterisk: PVS-Studio Takes Up Telephony
Asterisk: PVS-Studio Takes Up Telephony
 
Implementing the IO Monad in Scala
Implementing the IO Monad in ScalaImplementing the IO Monad in Scala
Implementing the IO Monad in Scala
 
EcmaScript unchained
EcmaScript unchainedEcmaScript unchained
EcmaScript unchained
 
Javascript Uncommon Programming
Javascript Uncommon ProgrammingJavascript Uncommon Programming
Javascript Uncommon Programming
 
All VLSI programs
All VLSI programsAll VLSI programs
All VLSI programs
 
Meck at erlang factory, london 2011
Meck at erlang factory, london 2011Meck at erlang factory, london 2011
Meck at erlang factory, london 2011
 
VTU DSA Lab Manual
VTU DSA Lab ManualVTU DSA Lab Manual
VTU DSA Lab Manual
 
Gevent what's the point
Gevent what's the pointGevent what's the point
Gevent what's the point
 
Meck
MeckMeck
Meck
 
pdx-react-observables
pdx-react-observablespdx-react-observables
pdx-react-observables
 
Lecture 5, c++(complete reference,herbet sheidt)chapter-15
Lecture 5, c++(complete reference,herbet sheidt)chapter-15Lecture 5, c++(complete reference,herbet sheidt)chapter-15
Lecture 5, c++(complete reference,herbet sheidt)chapter-15
 

Similar a NSC #2 - D1 01 - Rolf Rolles - Program synthesis in reverse engineering

MuVM: Higher Order Mutation Analysis Virtual Machine for C
MuVM: Higher Order Mutation Analysis Virtual Machine for CMuVM: Higher Order Mutation Analysis Virtual Machine for C
MuVM: Higher Order Mutation Analysis Virtual Machine for CSusumu Tokumoto
 
Design Patterns - Compiler Case Study - Hands-on Examples
Design Patterns - Compiler Case Study - Hands-on ExamplesDesign Patterns - Compiler Case Study - Hands-on Examples
Design Patterns - Compiler Case Study - Hands-on ExamplesGanesh Samarthyam
 
Advanced procedures in assembly language Full chapter ppt
Advanced procedures in assembly language Full chapter pptAdvanced procedures in assembly language Full chapter ppt
Advanced procedures in assembly language Full chapter pptMuhammad Sikandar Mustafa
 
Rcpp: Seemless R and C++
Rcpp: Seemless R and C++Rcpp: Seemless R and C++
Rcpp: Seemless R and C++Romain Francois
 
CorePy High-Productivity CellB.E. Programming
CorePy High-Productivity CellB.E. ProgrammingCorePy High-Productivity CellB.E. Programming
CorePy High-Productivity CellB.E. ProgrammingSlide_N
 
Functional Reactive Programming / Compositional Event Systems
Functional Reactive Programming / Compositional Event SystemsFunctional Reactive Programming / Compositional Event Systems
Functional Reactive Programming / Compositional Event SystemsLeonardo Borges
 
Compiler Construction | Lecture 14 | Interpreters
Compiler Construction | Lecture 14 | InterpretersCompiler Construction | Lecture 14 | Interpreters
Compiler Construction | Lecture 14 | InterpretersEelco Visser
 
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonLow-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonAMD Developer Central
 
Intel IPP Samples for Windows - error correction
Intel IPP Samples for Windows - error correctionIntel IPP Samples for Windows - error correction
Intel IPP Samples for Windows - error correctionPVS-Studio
 
Rcpp: Seemless R and C++
Rcpp: Seemless R and C++Rcpp: Seemless R and C++
Rcpp: Seemless R and C++Romain Francois
 
Столпы функционального программирования для адептов ООП, Николай Мозговой
Столпы функционального программирования для адептов ООП, Николай МозговойСтолпы функционального программирования для адептов ООП, Николай Мозговой
Столпы функционального программирования для адептов ООП, Николай МозговойSigma Software
 
Lesson 24. Phantom errors
Lesson 24. Phantom errorsLesson 24. Phantom errors
Lesson 24. Phantom errorsPVS-Studio
 
Compiler Construction for DLX Processor
Compiler Construction for DLX Processor Compiler Construction for DLX Processor
Compiler Construction for DLX Processor Soham Kulkarni
 
PYTHON-PROGRAMMING-UNIT-II (1).pptx
PYTHON-PROGRAMMING-UNIT-II (1).pptxPYTHON-PROGRAMMING-UNIT-II (1).pptx
PYTHON-PROGRAMMING-UNIT-II (1).pptxgeorgejustymirobi1
 
The Evolution of Async-Programming on .NET Platform (.Net China, C#)
The Evolution of Async-Programming on .NET Platform (.Net China, C#)The Evolution of Async-Programming on .NET Platform (.Net China, C#)
The Evolution of Async-Programming on .NET Platform (.Net China, C#)jeffz
 
High-Performance Haskell
High-Performance HaskellHigh-Performance Haskell
High-Performance HaskellJohan Tibell
 

Similar a NSC #2 - D1 01 - Rolf Rolles - Program synthesis in reverse engineering (20)

MuVM: Higher Order Mutation Analysis Virtual Machine for C
MuVM: Higher Order Mutation Analysis Virtual Machine for CMuVM: Higher Order Mutation Analysis Virtual Machine for C
MuVM: Higher Order Mutation Analysis Virtual Machine for C
 
[ASM]Lab4
[ASM]Lab4[ASM]Lab4
[ASM]Lab4
 
presentation
presentationpresentation
presentation
 
Design Patterns - Compiler Case Study - Hands-on Examples
Design Patterns - Compiler Case Study - Hands-on ExamplesDesign Patterns - Compiler Case Study - Hands-on Examples
Design Patterns - Compiler Case Study - Hands-on Examples
 
Advanced procedures in assembly language Full chapter ppt
Advanced procedures in assembly language Full chapter pptAdvanced procedures in assembly language Full chapter ppt
Advanced procedures in assembly language Full chapter ppt
 
Reactive x
Reactive xReactive x
Reactive x
 
Rcpp: Seemless R and C++
Rcpp: Seemless R and C++Rcpp: Seemless R and C++
Rcpp: Seemless R and C++
 
CorePy High-Productivity CellB.E. Programming
CorePy High-Productivity CellB.E. ProgrammingCorePy High-Productivity CellB.E. Programming
CorePy High-Productivity CellB.E. Programming
 
Functional Reactive Programming / Compositional Event Systems
Functional Reactive Programming / Compositional Event SystemsFunctional Reactive Programming / Compositional Event Systems
Functional Reactive Programming / Compositional Event Systems
 
Compiler Construction | Lecture 14 | Interpreters
Compiler Construction | Lecture 14 | InterpretersCompiler Construction | Lecture 14 | Interpreters
Compiler Construction | Lecture 14 | Interpreters
 
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonLow-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
 
Intel IPP Samples for Windows - error correction
Intel IPP Samples for Windows - error correctionIntel IPP Samples for Windows - error correction
Intel IPP Samples for Windows - error correction
 
Rcpp: Seemless R and C++
Rcpp: Seemless R and C++Rcpp: Seemless R and C++
Rcpp: Seemless R and C++
 
Столпы функционального программирования для адептов ООП, Николай Мозговой
Столпы функционального программирования для адептов ООП, Николай МозговойСтолпы функционального программирования для адептов ООП, Николай Мозговой
Столпы функционального программирования для адептов ООП, Николай Мозговой
 
Exploiting vectorization with ISPC
Exploiting vectorization with ISPCExploiting vectorization with ISPC
Exploiting vectorization with ISPC
 
Lesson 24. Phantom errors
Lesson 24. Phantom errorsLesson 24. Phantom errors
Lesson 24. Phantom errors
 
Compiler Construction for DLX Processor
Compiler Construction for DLX Processor Compiler Construction for DLX Processor
Compiler Construction for DLX Processor
 
PYTHON-PROGRAMMING-UNIT-II (1).pptx
PYTHON-PROGRAMMING-UNIT-II (1).pptxPYTHON-PROGRAMMING-UNIT-II (1).pptx
PYTHON-PROGRAMMING-UNIT-II (1).pptx
 
The Evolution of Async-Programming on .NET Platform (.Net China, C#)
The Evolution of Async-Programming on .NET Platform (.Net China, C#)The Evolution of Async-Programming on .NET Platform (.Net China, C#)
The Evolution of Async-Programming on .NET Platform (.Net China, C#)
 
High-Performance Haskell
High-Performance HaskellHigh-Performance Haskell
High-Performance Haskell
 

Más de NoSuchCon

NSC #2 - Challenge Solution
NSC #2 - Challenge SolutionNSC #2 - Challenge Solution
NSC #2 - Challenge SolutionNoSuchCon
 
NSC #2 - Challenge Introduction
NSC #2 - Challenge IntroductionNSC #2 - Challenge Introduction
NSC #2 - Challenge IntroductionNoSuchCon
 
NSC #2 - D3 05 - Alex Ionescu- Breaking Protected Processes
NSC #2 - D3 05 - Alex Ionescu- Breaking Protected ProcessesNSC #2 - D3 05 - Alex Ionescu- Breaking Protected Processes
NSC #2 - D3 05 - Alex Ionescu- Breaking Protected ProcessesNoSuchCon
 
NSC #2 - D3 04 - Guillaume Valadon & Nicolas Vivet - Detecting BGP hijacks
NSC #2 - D3 04 - Guillaume Valadon & Nicolas Vivet - Detecting BGP hijacksNSC #2 - D3 04 - Guillaume Valadon & Nicolas Vivet - Detecting BGP hijacks
NSC #2 - D3 04 - Guillaume Valadon & Nicolas Vivet - Detecting BGP hijacksNoSuchCon
 
NSC #2 - D3 03 - Jean-Philippe Aumasson - Cryptographic Backdooring
NSC #2 - D3 03 - Jean-Philippe Aumasson - Cryptographic BackdooringNSC #2 - D3 03 - Jean-Philippe Aumasson - Cryptographic Backdooring
NSC #2 - D3 03 - Jean-Philippe Aumasson - Cryptographic BackdooringNoSuchCon
 
NSC #2 - D3 02 - Peter Hlavaty - Attack on the Core
NSC #2 - D3 02 - Peter Hlavaty - Attack on the CoreNSC #2 - D3 02 - Peter Hlavaty - Attack on the Core
NSC #2 - D3 02 - Peter Hlavaty - Attack on the CoreNoSuchCon
 
NSC #2 - D3 01 - Thomas Braden - Exploitation of hardened MSP430-based device
NSC #2 - D3 01 - Thomas Braden - Exploitation of hardened MSP430-based deviceNSC #2 - D3 01 - Thomas Braden - Exploitation of hardened MSP430-based device
NSC #2 - D3 01 - Thomas Braden - Exploitation of hardened MSP430-based deviceNoSuchCon
 
NSC #2 - D2 06 - Richard Johnson - SAGEly Advice
NSC #2 - D2 06 - Richard Johnson - SAGEly AdviceNSC #2 - D2 06 - Richard Johnson - SAGEly Advice
NSC #2 - D2 06 - Richard Johnson - SAGEly AdviceNoSuchCon
 
NSC #2 - D2 05 - Andrea Barisani - Forging the USB Armory
NSC #2 - D2 05 - Andrea Barisani - Forging the USB ArmoryNSC #2 - D2 05 - Andrea Barisani - Forging the USB Armory
NSC #2 - D2 05 - Andrea Barisani - Forging the USB ArmoryNoSuchCon
 
NSC #2 - D2 04 - Ezequiel Gutesman - Blended Web and Database Attacks
NSC #2 - D2 04 - Ezequiel Gutesman - Blended Web and Database AttacksNSC #2 - D2 04 - Ezequiel Gutesman - Blended Web and Database Attacks
NSC #2 - D2 04 - Ezequiel Gutesman - Blended Web and Database AttacksNoSuchCon
 
NSC #2 - D2 03 - Nicolas Collignon - Google Apps Engine Security
NSC #2 - D2 03 - Nicolas Collignon - Google Apps Engine SecurityNSC #2 - D2 03 - Nicolas Collignon - Google Apps Engine Security
NSC #2 - D2 03 - Nicolas Collignon - Google Apps Engine SecurityNoSuchCon
 
NSC #2 - D2 02 - Benjamin Delpy - Mimikatz
NSC #2 - D2 02 - Benjamin Delpy - MimikatzNSC #2 - D2 02 - Benjamin Delpy - Mimikatz
NSC #2 - D2 02 - Benjamin Delpy - MimikatzNoSuchCon
 
NSC #2 - D2 01 - Andrea Allievi - Windows 8.1 Patch Protections
NSC #2 - D2 01 - Andrea Allievi - Windows 8.1 Patch ProtectionsNSC #2 - D2 01 - Andrea Allievi - Windows 8.1 Patch Protections
NSC #2 - D2 01 - Andrea Allievi - Windows 8.1 Patch ProtectionsNoSuchCon
 
NSC #2 - D1 05 - Renaud Lifchitz - Quantum computing in practice
NSC #2 - D1 05 - Renaud Lifchitz - Quantum computing in practiceNSC #2 - D1 05 - Renaud Lifchitz - Quantum computing in practice
NSC #2 - D1 05 - Renaud Lifchitz - Quantum computing in practiceNoSuchCon
 
NSC #2 - D1 03 - Sébastien Dudek - HomePlugAV PLC
NSC #2 - D1 03 - Sébastien Dudek - HomePlugAV PLCNSC #2 - D1 03 - Sébastien Dudek - HomePlugAV PLC
NSC #2 - D1 03 - Sébastien Dudek - HomePlugAV PLCNoSuchCon
 
NSC #2 - D1 02 - Georgi Geshev - Your Q is my Q
NSC #2 - D1 02 - Georgi Geshev - Your Q is my QNSC #2 - D1 02 - Georgi Geshev - Your Q is my Q
NSC #2 - D1 02 - Georgi Geshev - Your Q is my QNoSuchCon
 

Más de NoSuchCon (16)

NSC #2 - Challenge Solution
NSC #2 - Challenge SolutionNSC #2 - Challenge Solution
NSC #2 - Challenge Solution
 
NSC #2 - Challenge Introduction
NSC #2 - Challenge IntroductionNSC #2 - Challenge Introduction
NSC #2 - Challenge Introduction
 
NSC #2 - D3 05 - Alex Ionescu- Breaking Protected Processes
NSC #2 - D3 05 - Alex Ionescu- Breaking Protected ProcessesNSC #2 - D3 05 - Alex Ionescu- Breaking Protected Processes
NSC #2 - D3 05 - Alex Ionescu- Breaking Protected Processes
 
NSC #2 - D3 04 - Guillaume Valadon & Nicolas Vivet - Detecting BGP hijacks
NSC #2 - D3 04 - Guillaume Valadon & Nicolas Vivet - Detecting BGP hijacksNSC #2 - D3 04 - Guillaume Valadon & Nicolas Vivet - Detecting BGP hijacks
NSC #2 - D3 04 - Guillaume Valadon & Nicolas Vivet - Detecting BGP hijacks
 
NSC #2 - D3 03 - Jean-Philippe Aumasson - Cryptographic Backdooring
NSC #2 - D3 03 - Jean-Philippe Aumasson - Cryptographic BackdooringNSC #2 - D3 03 - Jean-Philippe Aumasson - Cryptographic Backdooring
NSC #2 - D3 03 - Jean-Philippe Aumasson - Cryptographic Backdooring
 
NSC #2 - D3 02 - Peter Hlavaty - Attack on the Core
NSC #2 - D3 02 - Peter Hlavaty - Attack on the CoreNSC #2 - D3 02 - Peter Hlavaty - Attack on the Core
NSC #2 - D3 02 - Peter Hlavaty - Attack on the Core
 
NSC #2 - D3 01 - Thomas Braden - Exploitation of hardened MSP430-based device
NSC #2 - D3 01 - Thomas Braden - Exploitation of hardened MSP430-based deviceNSC #2 - D3 01 - Thomas Braden - Exploitation of hardened MSP430-based device
NSC #2 - D3 01 - Thomas Braden - Exploitation of hardened MSP430-based device
 
NSC #2 - D2 06 - Richard Johnson - SAGEly Advice
NSC #2 - D2 06 - Richard Johnson - SAGEly AdviceNSC #2 - D2 06 - Richard Johnson - SAGEly Advice
NSC #2 - D2 06 - Richard Johnson - SAGEly Advice
 
NSC #2 - D2 05 - Andrea Barisani - Forging the USB Armory
NSC #2 - D2 05 - Andrea Barisani - Forging the USB ArmoryNSC #2 - D2 05 - Andrea Barisani - Forging the USB Armory
NSC #2 - D2 05 - Andrea Barisani - Forging the USB Armory
 
NSC #2 - D2 04 - Ezequiel Gutesman - Blended Web and Database Attacks
NSC #2 - D2 04 - Ezequiel Gutesman - Blended Web and Database AttacksNSC #2 - D2 04 - Ezequiel Gutesman - Blended Web and Database Attacks
NSC #2 - D2 04 - Ezequiel Gutesman - Blended Web and Database Attacks
 
NSC #2 - D2 03 - Nicolas Collignon - Google Apps Engine Security
NSC #2 - D2 03 - Nicolas Collignon - Google Apps Engine SecurityNSC #2 - D2 03 - Nicolas Collignon - Google Apps Engine Security
NSC #2 - D2 03 - Nicolas Collignon - Google Apps Engine Security
 
NSC #2 - D2 02 - Benjamin Delpy - Mimikatz
NSC #2 - D2 02 - Benjamin Delpy - MimikatzNSC #2 - D2 02 - Benjamin Delpy - Mimikatz
NSC #2 - D2 02 - Benjamin Delpy - Mimikatz
 
NSC #2 - D2 01 - Andrea Allievi - Windows 8.1 Patch Protections
NSC #2 - D2 01 - Andrea Allievi - Windows 8.1 Patch ProtectionsNSC #2 - D2 01 - Andrea Allievi - Windows 8.1 Patch Protections
NSC #2 - D2 01 - Andrea Allievi - Windows 8.1 Patch Protections
 
NSC #2 - D1 05 - Renaud Lifchitz - Quantum computing in practice
NSC #2 - D1 05 - Renaud Lifchitz - Quantum computing in practiceNSC #2 - D1 05 - Renaud Lifchitz - Quantum computing in practice
NSC #2 - D1 05 - Renaud Lifchitz - Quantum computing in practice
 
NSC #2 - D1 03 - Sébastien Dudek - HomePlugAV PLC
NSC #2 - D1 03 - Sébastien Dudek - HomePlugAV PLCNSC #2 - D1 03 - Sébastien Dudek - HomePlugAV PLC
NSC #2 - D1 03 - Sébastien Dudek - HomePlugAV PLC
 
NSC #2 - D1 02 - Georgi Geshev - Your Q is my Q
NSC #2 - D1 02 - Georgi Geshev - Your Q is my QNSC #2 - D1 02 - Georgi Geshev - Your Q is my Q
NSC #2 - D1 02 - Georgi Geshev - Your Q is my Q
 

Último

The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 

Último (20)

The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 

NSC #2 - D1 01 - Rolf Rolles - Program synthesis in reverse engineering

  • 1. Program Synthesis in Reverse Engineering Rolf Rolles M¨obius Strip Reverse Engineering http://www.msreverseengineering.com No Such Conference Keynote Speech November 19th, 2014
  • 2. Program Synthesis in Reverse Engineering Overview Desired Behavior Program Synthesizer Program Program synthesis is an academic discipline devoted to creating programs automatically, given an expectation of how the program should behave. We apply and adapt existing academic work to automate tasks in reverse engineering.
  • 3. Program Synthesis What it is Not “Can I have a web browser?” Program Synthesizer “Sure, here you go!”
  • 4. Program Synthesis What it is Not “Can I have a web browser?” Program Synthesizer “Sure, here you go!” Program synthesis: Requires precise behavioral specifications Operates on small scales Is primarily researched on loop-free programs
  • 5. Program Synthesis The Simplest Possible Program Synthesizer Starting from a behavioral specification: int abs(int x) = x if x >= 0 -x otherwise
  • 6. Program Synthesis The Simplest Possible Program Synthesizer Starting from a behavioral specification: int abs(int x) = x if x >= 0 -x otherwise Create a harness for testing abs(x) exhaustively: int main(int , char **) { for(int x = 0; x != MAX_INT; ++x) { int r = abs(x); if((x >= 0 && r != x) || r != -x) return -1; } return 0; }
  • 7. Program Synthesis The Simplest Possible Program Synthesizer Starting from a behavioral specification: int abs(int x) = x if x >= 0 -x otherwise Create a harness for testing abs(x) exhaustively: int main(int , char **) { for(int x = 0; x != MAX_INT; ++x) { int r = abs(x); if((x >= 0 && r != x) || r != -x) return -1; } return 0; } Enumerate every possible function abs(int x): int abs(int x) { return -x; }
  • 8. Program Synthesis The Simplest Possible Program Synthesizer Starting from a behavioral specification: int abs(int x) = x if x >= 0 -x otherwise Create a harness for testing abs(x) exhaustively: int main(int , char **) { for(int x = 0; x != MAX_INT; ++x) { int r = abs(x); if((x >= 0 && r != x) || r != -x) return -1; } return 0; } Enumerate every possible function abs(int x): int abs(int x) { return ~x; }
  • 9. Program Synthesis The Simplest Possible Program Synthesizer Starting from a behavioral specification: int abs(int x) = x if x >= 0 -x otherwise Create a harness for testing abs(x) exhaustively: int main(int , char **) { for(int x = 0; x != MAX_INT; ++x) { int r = abs(x); if((x >= 0 && r != x) || r != -x) return -1; } return 0; } Enumerate every possible function abs(int x): int abs(int x) { return -(x+0); }
  • 10. Program Synthesis The Simplest Possible Program Synthesizer Starting from a behavioral specification: int abs(int x) = x if x >= 0 -x otherwise Create a harness for testing abs(x) exhaustively: int main(int , char **) { for(int x = 0; x != MAX_INT; ++x) { int r = abs(x); if((x >= 0 && r != x) || r != -x) return -1; } return 0; } Enumerate every possible function abs(int x): int abs(int x) { return ~(x+0); }
  • 11. Program Synthesis The Simplest Possible Program Synthesizer Starting from a behavioral specification: int abs(int x) = x if x >= 0 -x otherwise Create a harness for testing abs(x) exhaustively: int main(int , char **) { for(int x = 0; x != MAX_INT; ++x) { int r = abs(x); if((x >= 0 && r != x) || r != -x) return -1; } return 0; } Enumerate every possible function abs(int x): int abs(int x) { return -(x+1); }
  • 12. Program Synthesis The Simplest Possible Program Synthesizer Starting from a behavioral specification: int abs(int x) = x if x >= 0 -x otherwise Create a harness for testing abs(x) exhaustively: int main(int , char **) { for(int x = 0; x != MAX_INT; ++x) { int r = abs(x); if((x >= 0 && r != x) || r != -x) return -1; } return 0; } Enumerate every possible function abs(int x): int abs(int x) { return ~(x+1); }
  • 13. Program Synthesis The Simplest Possible Program Synthesizer The brute-force synthesizer: Is extremely slow, and Explores an infinite number of programs. We could improve the situation by carefully choosing which programs to generate. Modern program synthesis goes further and does better.
  • 14. Introduction Ingredients X86 Disassembly and Assembly IR Translation IR Interpreter SMT Integration Applications Enumerative Program Synthesis CPU Emulator Synthesis Peephole Superdeobfuscation Template-Based Program Synthesis Metamorphic Extraction Conclusion
  • 15. Ingredients X86 Disassembly and Assembly 28 D8 0F B6 DB 81 C3 BB BB BB BB 01 D8 X86 Disassembler sub bl, bl movzx ebx, bl add ebx, 0BBBBBBBBh add eax, ebx 28 D8 0F B6 DB 81 C3 BB BB BB BB 01 D8 X86 Assembler sub bl, bl movzx ebx, bl add ebx, 0BBBBBBBBh add eax, ebx Given bytes, a disassembler produces X86 assembly. An assembler does the opposite.
  • 16. Ingredients IR Translation add eax, ecx IR Translator vRes = vEax + vEcx; vZF = vRes == 0; vSF = vRes <s 0; vPF = Parity(vRes); vCF = vRes <u vEax; vOF = (vEcx ^ vRes) & (vEax ^ vRes) <s 0; vAF = (vRes ^ vEax ^ vEcx) & 0x10 != 0; vEax = vRes; IR translators convert instructions to a symbolic representation of their actual effects when executed.
  • 17. Ingredients IR Interpreter vRes = vEax + vEcx; vZF = vRes == 0; vSF = vRes <s 0; vPF = Parity(vRes); vCF = vRes <u vEax; vOF = (vEcx ^ vRes) & (vEax ^ vRes) <s 0; vAF = (vRes ^ vEax ^ vEcx) & 0x10 != 0; vEax = vRes; IR for add eax, ecx eax 645EDE7Bh zf 0 of 0 sf 1 ecx 0FCD02BDEh af 0 pf 1 cf 0 Input State IR Interpreter eax 612F0A59h zf 0 of 0 sf 0 ecx 0FCD02BDEh af 1 pf 1 cf 1 Output State The IR interpreter executes IR statements in an input state, producing an output state.
  • 18. Ingredients SMT Integration vRes = vEax + vEcx; vZF = vRes == 0; vSF = vRes <s 0; vPF = Parity(vRes); vCF = vRes <u vEax; vOF = (vEcx ^ vRes) & (vEax ^ vRes) <s 0; vAF = (vRes ^ vEax ^ vEcx) & 0x10 != 0; vEax = vRes; IR for add eax, ecx vZF == 1 Query SMT Solver vEax = 0, vEcx = 0 SMT solvers can answer questions about the values of variables and memory within IR sequences. We queried for a model of eax and ecx where vZF == 1.
  • 19. Introduction Ingredients Applications Enumerative Program Synthesis CPU Emulator Synthesis Peephole Superdeobfuscation Template-Based Program Synthesis Metamorphic Extraction Conclusion
  • 20. Program Synthesis in Reverse Engineering CPU Emulator Synthesis CPU Emulator Synthesizer X86 Assembler Templates add eax, ecx vRes = vEax + vEcx; vZF = vRes == 0; vSF = vRes <s 0; vPF = Parity(vRes); vCF = vRes <u vEax; vOF = (vEcx ^ vRes) & (vEax ^ vRes) <s 0; vAF = (vRes ^ vEax ^ vEcx) & 0x10 != 0; vEax = vEax + vEcx; CPU Emulator Logic We present a re-designed version of [2] for generating CPU emulators.
  • 21. Program Synthesis in Reverse Engineering Peephole Superdeobfuscation Deobfuscator Generator push edx mov edx, 6F6B5081h mov esi, 0BC6D38Bh add esi, edx pop edx Obfuscated Code push r321 mov r321, i321 mov r322, i322 add r322, r321 pop r321 Simplifies To mov r322, i321+i322 Deobfuscator Rule We apply the ideas of [1] to automatically create deobfuscators for certain obfuscators.
  • 22. Program Synthesis in Reverse Engineering Metamorphic Extraction Functionality Extractor lodsb add al, bl xor al, 0D2h sub al, 0F1h add bl, al add al, 0B7h sub al, 90h push cx mov cl, bl add al, bl pop cx add al, 90h sub al, 0B7h push ebx mov ebx, 8716AEF1h push ecx push eax xor dword ptr [esp], ebx ; continued We apply a technique from [3] to automatically recover metamorphically-generated code sequences.
  • 23. Introduction Ingredients X86 Disassembly and Assembly IR Translation IR Interpreter SMT Integration Applications Enumerative Program Synthesis CPU Emulator Synthesis Peephole Superdeobfuscation Template-Based Program Synthesis Metamorphic Extraction Conclusion
  • 24. CPU Emulator Synthesis Overview CPU emulators require accurate descriptions of the operation of the X86 processor. Sadly, the Intel manuals are at best vague, at worst wrong. IF 64-Bit Mode THEN #UD; ELSE IF ((AL AND 0FH) > 9) or (AF = 1) THEN WRONG: AL ← AL + 6; AX ← AX + 0x106 WRONG: AH ← AH + 1; AF ← 1; CF ← 1; AL ← AL AND 0FH; ELSE AF ← 0; CF ← 0; AL ← AL AND 0FH; FI; FI; That’s from the first instruction in the manual.
  • 25. CPU Emulator Synthesis Overview Idea: use program synthesis to find instruction descriptions. add eax, ecx CPU Emulator Synthesizer vRes = vEax + vEcx; vZF = vRes == 0; vSF = vRes <s 0; vPF = Parity(vRes); vCF = vRes <u vEax; vOF = (vEcx ^ vRes) & (vEax ^ vRes) <s 0; vAF = (vRes ^ vEax ^ vEcx) & 0x10 != 0; vEax = vEax + vEcx;
  • 26. CPU Emulator Synthesis Overview add eax, ecx Behavior Sampler I/O Pair Data Hypothesis Generator Templates, Substitutions Hypotheses Empirical Sieve Candidate Hypotheses Equivalence Checker Single Result We describe each component next. 1. Hypothesis Generator 2. Behavior Sampler 3. Empirical Sieve 4. Equivalence Checker
  • 27. CPU Emulator Synthesis Hypothesis Generator We generate a family of expressions from a template expression and a list of substitutions. Binop(_,+,_) vEax Dword(0) Tree representation of vEax + 0 Token Replace With + +,-,&,|,^ vEax vEax,vEcx Specified Substitutions vEax + 0 vEax - 0 vEax & 0 vEax | 0 vEax ^ 0 vEcx + 0 vEcx - 0 vEcx & 0 vEcx | 0 vEcx ^ 0 The 5 ∗ 2 = 10 expressions generated from the above
  • 28. CPU Emulator Synthesis Hypothesis Generator A result location is a location modified by an instruction. A hypothesis is an IR expression of equality between a result location variable and a generated expression. For the instruction add eax, ebx, the result locations are vEaxout , vZFout , vSFout , vPFout , vCFout , vOFout , and vAFout . vZFout == (vEax + vEbx) <s 0 vZFout == ((vEax + vEbx) ^ vEax) >u 0 vSFout == (vEax + vEbx) <s 0 vSFout == ((vEax + vEbx) ^ vEax) >u 0 vPFout == (vEax + vEbx) <s 0 vPFout == ((vEax + vEbx) ^ vEax) >u 0 vCFout == (vEax + vEbx) <s 0 vCFout == ((vEax + vEbx) ^ vEax) >u 0 vOFout == (vEax + vEbx) <s 0 vOFout == ((vEax + vEbx) ^ vEax) >u 0 vAFout == (vEax + vEbx) <s 0 vAFout == ((vEax + vEbx) ^ vEax) >u 0 Some hypotheses for add eax, ebx
  • 29. CPU Emulator Synthesis Behavior Sampler add eax, ecx Input State Behavior Sampler Output State Given a set of register and flag values as input: 1. JIT assemble the instruction 2. Set the processor registers and flags to the input state 3. Execute the instruction 4. Collect the values of the registers and flags as output These I/O pairs are samples of the instruction’s behavior.
  • 30. CPU Emulator Synthesis Empirical Sieve eaxin = 0 ecxin = 0 cfout = 0 I/O Data CFout == (vEax == vEcx) Hypothesis 0 == (0 == 0) 0 == 1 false Evaluation If a hypothesis is false in any I/O pair, it cannot describe the instruction’s behavior, so we discard it. In the figure above, CFout == (vEax == vEcx) is an invalid hypothesis because it evaluates to false for the pair listed.
  • 31. CPU Emulator Synthesis Empirical Sieve I/O Pairs Hypothesis Generator IR Interpreter Unfalsified Hypotheses We generate many hypotheses, test them against the I/O pairs, and discard any that evaluate to false. Multiple unfalsified hypotheses may remain after the empirical sieve. The next component removes more of them.
  • 32. CPU Emulator Synthesis Equivalence Checker Equivalence checking uses an SMT solver to determine whether two expressions evaluate equally for all inputs. If not, it finds an input that causes a different evaluation. (vEax+vEcx) == 0 Expression #1 (vEcx+vEax+vEcx+vEax) == 0 Expression #2 Two hypotheses for ZFout . We query an SMT solver as to the satisfiability of ((vEax+vEcx) == 0) != ((vEcx+vEax+vEcx+vEax) == 0). If UNSAT, the hypotheses always behave the same. IF SAT, the solution contains values for vEax and vEcx that cause the hypotheses to evaluate differently.
  • 33. CPU Emulator Synthesis Equivalence Checker The SMT solver reports satisfiability, and gives a model where the first hypothesis for ZFout evaluates to 0, and the other to 1. eaxin = 3d800000h ecxin = 42800000h Model for ((vEax+vEcx) == 0) != ((vEcx+vEax+vEcx+vEax) == 0) If we sample the instruction’s behavior in this input state, one of the hypotheses must become falsified, since: If zfout = 1, then (vEax+vEcx) == 0 is false. If zfout = 0, then (vEcx+vEax+vEcx+vEax) == 0 is false. Hypothesis #2 is falsified.
  • 34. CPU Emulator Synthesis Behavior Sampler add eax, ecx Behavior Sampler I/O Pair Data Hypothesis Generator <out> == <in>, <out> == <in> <+> <in>, . . . CFout == (vEax == vEcx), CFout == (vEcx <s 0), CFout == (vEax + vEcx <u vEax), . . .EmpiricalSieve CFout == (vEcx <s 0), CFout == (vEax + vEcx <u vEax), . . . Equivalence Checker CFout == vEax + vEcx <u vEax First, we generate hy- potheses for the instruc- tion’s effects.
  • 35. CPU Emulator Synthesis Hypothesis Generator add eax, ecx Behavior Sampler I/O Pair Data Hypothesis Generator <out> == <in>, <out> == <in> <+> <in>, . . . CFout == (vEax == vEcx), CFout == (vEcx <s 0), CFout == (vEax + vEcx <u vEax), . . .EmpiricalSieve CFout == (vEcx <s 0), CFout == (vEax + vEcx <u vEax), . . . Equivalence Checker CFout == vEax + vEcx <u vEax Next, we collect I/O pairs for an instruction.
  • 36. CPU Emulator Synthesis Empirical Sieve add eax, ecx Behavior Sampler I/O Pair Data Hypothesis Generator <out> == <in>, <out> == <in> <+> <in>, . . . CFout == (vEax == vEcx), CFout == (vEcx <s 0), CFout == (vEax + vEcx <u vEax), . . .Empirical Sieve CFout == (vEcx <s 0), CFout == (vEax + vEcx <u vEax), . . . Equivalence Checker CFout == vEax + vEcx <u vEax Remove hypotheses that are shown to be false by any I/O pair.
  • 37. CPU Emulator Synthesis Equivalence Checking add eax, ecx Behavior Sampler I/O Pair Data Hypothesis Generator <out> == <in>, <out> == <in> <+> <in>, . . . CFout == (vEax == vEcx), CFout == (vEcx <s 0), CFout == (vEax + vEcx <u vEax), . . .Empirical Sieve CFout == (vEcx <s 0), CFout == (vEax + vEcx <u vEax), . . . Equivalence Checker CFout == vEax + vEcx <u vEax The equivalence checker removes all but one hy- pothesis.
  • 38. Introduction Ingredients X86 Disassembly and Assembly IR Translation IR Interpreter SMT Integration Applications Enumerative Program Synthesis CPU Emulator Synthesis Peephole Superdeobfuscation Template-Based Program Synthesis Metamorphic Extraction Conclusion
  • 39. Peephole Superdeobfuscation Rule-Based Obfuscation Unobfuscated Obfuscated Constraints add r81, r82 add r81, imm8 add r81, r82 sub r81, imm8 sub r321, r322 push r323 r323 = r321 mov r323, r322 r321 = esp sub r321, r323 r321 = esp pop r323 r323 = esp Obfuscator Rules Some obfuscators use a rule set to randomly permute code. The ordinary and obfuscated sequences must have the same (or at least “similar”) effects when executed.
  • 40. Peephole Superdeobfuscation add al, bl sub al, 9Fh sub al, 9Fh sub al, 9Fh add al, bl Blah
  • 41. Peephole Superdeobfuscation add al, bl sub al, 9Fh sub al, 9Fh sub al, 9Fh sub al, 9Fh add al, bl add al, bl add al, 9Fh Replace instructions with obfuscated sequences Obfuscation
  • 42. Peephole Superdeobfuscation add al, bl sub al, 9Fh sub al, 9Fh sub al, 9Fh sub al, 9Fh sub al, 9Fh sub al, 7Fh add al, bl add al, bl add al, 7Fh push edx mov dh, 22h inc dh dec dh add dh, 7Dh add al, 9Fh add al, dh pop edx Replace instructions with obfuscated sequences Obfuscation
  • 43. Peephole Superdeobfuscation add al, bl sub al, 9Fh sub al, 9Fh push cx mov ch, 9Fh sub al, 9Fh sub al, ch pop cx sub al, 7Fh sub al, 7Fh sub al, 70h add al, bl add al, bl add al, 70h add al, 7Fh add al, 7Fh push edx push edx push ecx mov ch, 22h mov dh, 22h mov dh, ch pop ecx inc dh inc dh dec dh dec dh add dh, 7Dh add dh, 7Dh add al, dh add al, dh pop edx pop edx Replace instructions with obfuscated sequences Obfuscation
  • 44. Peephole Superdeobfuscation add al, bl sub al, 9Fh sub al, 9Fh push cx mov ch, 9Fh sub al, ch pop cx sub al, 7Fh sub al, 70h add al, bl add al, 70h add al, 7Fh push edx push ecx mov ch, 22h mov dh, ch pop ecx inc dh dec dh add dh, 7Dh add al, dh pop edx Blah
  • 45. Peephole Superdeobfuscation add al, bl sub al, 9Fh sub al, 9Fh push cx mov ch, 9Fh sub al, 9Fh sub al, ch pop cx sub al, 7Fh sub al, 7Fh sub al, 70h add al, bl add al, bl add al, 70h add al, 7Fh add al, 7Fh push edx push edx push ecx mov ch, 22h mov dh, 22h mov dh, ch pop ecx inc dh inc dh dec dh dec dh add dh, 7Dh add dh, 7Dh add al, dh add al, dh pop edx pop edx Replace obfuscated sequences with instructions Deobfuscation with Inverse Rules
  • 46. Peephole Superdeobfuscation add al, bl sub al, 9Fh sub al, 9Fh sub al, 9Fh sub al, 9Fh sub al, 9Fh sub al, 7Fh add al, bl add al, bl add al, 7Fh push edx mov dh, 22h inc dh dec dh add dh, 7Dh add al, 9Fh add al, dh pop edx Replace obfuscated sequences with instructions Deobfuscation with Inverse Rules
  • 47. Peephole Superdeobfuscation add al, bl sub al, 9Fh sub al, 9Fh sub al, 9Fh sub al, 9Fh add al, bl add al, bl add al, 9Fh Replace obfuscated sequences with instructions Deobfuscation with Inverse Rules
  • 48. Peephole Superdeobfuscation add al, bl sub al, 9Fh sub al, 9Fh sub al, 9Fh add al, bl Blah Deobfuscation with Inverse Rules
  • 49. Peephole Superdeobfuscation Rule-Based Deobfuscation Obfuscated Unobfuscated Constraints add r81, imm8 add r81, r82 add r81, r82 sub r81, imm8 push r323 sub r321, r322 r323 = r321 mov r323, r322 r321 = esp sub r321, r323 r321 = esp pop r323 r323 = esp Deobfuscator Rules We could inspect the code manually, pick out patterns, and write deobfuscator rules. This is Tedious And Error-Prone—. We automate the process with program synthesis.
  • 50. Peephole Superdeobfuscation The Idea: Obfuscated/Unobfuscated Behavior Must Match lea esp, [esp-4] mov [esp], eax Obfuscated push eax IR Interpreter eax 645EDE7Bh eax 0FCD02BDEh Output State #1 Output State #2 State Comparison Collect obfuscated sequences.
  • 51. Peephole Superdeobfuscation The Idea: Obfuscated/Unobfuscated Behavior Must Match lea esp, [esp-4] mov [esp], eax Obfuscated push eax IR Interpreter eax 645EDE7Bh esp 0FCD02BDEh Input State esp 0FCD02BDAh mem[esp] = 0x645EDE7B Output State #1 Output State #2 State Comparison Generate I/O pairs.
  • 52. Peephole Superdeobfuscation The Idea: Obfuscated/Unobfuscated Behavior Must Match lea esp, [esp-4] mov [esp], eax Obfuscated push eax Collect Registers / Constants eax 645EDE7Bh eax 0FCD02BDEh Output State #1 eax, esp, 4 Operand Parts Output State #2 State Comparison Preparing for candidate generation, collect registers and constants from the obfuscated sequence.
  • 53. Peephole Superdeobfuscation The Idea: Obfuscated/Unobfuscated Behavior Must Match lea esp, [esp-4] push r32 push i32 add r32, i32 Candidate Templates push eax Candidate push esp push 4 add esp, 4 add eax, 4 IR Interpreter Candidate Enumerator eax 645EDE7Bh eax 0FCD02BDEh Output State #1 eax, esp, 4 Operand Parts Output State #2 State Comparison Plug the registers and constants into the templates to generate candidate replacements.
  • 54. Peephole Superdeobfuscation The Idea: Obfuscated/Unobfuscated Behavior Must Match lea esp, [esp-4] push eax Candidate IR Interpreter eax 645EDE7Bh esp 0FCD02BDEh Input State Output State #1 esp 0FCD02BDAh mem[esp] = 0x645EDE7B Output State #2 State Comparison For each candidate, generate I/O pairs in the same input states as for the original sequence.
  • 55. Peephole Superdeobfuscation The Idea: Obfuscated/Unobfuscated Behavior Must Match lea esp, [esp-4] mov [esp], eax Obfuscated push eax Candidate IR Interpreter eax 645EDE7Bh esp 0FCD02BDEh Input State esp 0FCD02BDAh mem[esp] = 0x645EDE7B Output State #1 esp 0FCD02BDAh mem[esp] = 0x645EDE7B Output State #2 State Comparison Compare the resulting states. We can be lenient for flags and stack memory. If they match, the candidate is a potential replacement for the obfuscated sequence.
  • 56. Peephole Superdeobfuscation The Idea: Obfuscated/Unobfuscated Behavior Must Match lea esp, [esp-4] mov [esp], eax Obfuscated push eax Candidate Equivalence Checker eax 645EDE7Bh eax 0FCD02BDEh Output State #1 Output State #2 State Comparison Obfuscated Deobfuscated lea esp, [esp-4] push eax mov [esp], eax Deobfuscator Rules If the states matched, use an SMT solver to ensure that the sequences are actually equivalent. Learn rules if so.
  • 57. Peephole Superdeobfuscation Generalization Obfuscated Deobfuscated Constraints lea esp, [esp-4] push eax mov [esp], eax Deobfuscator Rules Our deobfuscator rule is specific to the register eax. In fact, eax can be replaced with any register except esp.
  • 58. Peephole Superdeobfuscation Generalization Obfuscated Deobfuscated Constraints lea esp, [esp-4] push eax mov [esp], eax Deobfuscator Rules Obfuscated Deobfuscated Constraints Generalized Rules lea esp, [esp-4] push r32 r32 = esp mov [esp], r32 Our deobfuscator rule is specific to the register eax. In fact, eax can be replaced with any register except esp. Generalization removes specific register and/or immediate values from deobfuscation rules.
  • 59. Peephole Superdeobfuscation In Action sub esp, 4 mov [esp], eax mov eax, ebp push ebp push edi mov [esp], eax xor [esp], 7FEE03B1h pop ebp sub esp, 4 mov [esp], esi push edx mov edx, 6F6B5081h mov esi, 0BC6D38Bh add esi, edx pop edx Obfuscated Deobfuscated Constraints mov [esp], r322 mov r322, i321+i322 r321 = r322 We iterate through the in- structions, trying to sim- plify those within a window. : bottom of current window. : can simplify. : simplification applied.
  • 60. Peephole Superdeobfuscation In Action sub esp, 4 mov [esp], eax mov eax, ebp push ebp push edi mov [esp], eax xor [esp], 7FEE03B1h pop ebp sub esp, 4 mov [esp], esi push edx mov edx, 6F6B5081h mov esi, 0BC6D38Bh add esi, edx pop edx Obfuscated Deobfuscated Constraints sub esp, 4 push r32 r32 = esp mov [esp], r32 mov [esp], r322 mov r322, i321+i322 r321 = r322 Learn rule and generalize.
  • 61. Peephole Superdeobfuscation In Action push eax mov eax, ebp push ebp push edi mov [esp], eax xor [esp], 7FEE03B1h pop ebp sub esp, 4 mov [esp], esi push edx mov edx, 6F6B5081h mov esi, 0BC6D38Bh add esi, edx pop edx Obfuscated Deobfuscated Constraints sub esp, 4 push r32 r32 = esp mov [esp], r32 mov [esp], r322 mov r322, i321+i322 r321 = r322 No simplification.
  • 62. Peephole Superdeobfuscation In Action push eax mov eax, ebp push ebp push edi mov [esp], eax xor [esp], 7FEE03B1h pop ebp sub esp, 4 mov [esp], esi push edx mov edx, 6F6B5081h mov esi, 0BC6D38Bh add esi, edx pop edx Obfuscated Deobfuscated Constraints sub esp, 4 push r32 r32 = esp mov [esp], r32 mov [esp], r322 mov r322, i321+i322 r321 = r322 No simplification.
  • 63. Peephole Superdeobfuscation In Action push eax mov eax, ebp push ebp push edi mov [esp], eax xor [esp], 7FEE03B1h pop ebp sub esp, 4 mov [esp], esi push edx mov edx, 6F6B5081h mov esi, 0BC6D38Bh add esi, edx pop edx Obfuscated Deobfuscated Constraints sub esp, 4 push r32 r32 = esp mov [esp], r32 mov [esp], r322 mov r322, i321+i322 r321 = r322 No simplification.
  • 64. Peephole Superdeobfuscation In Action push eax mov eax, ebp push ebp push edi mov [esp], eax xor [esp], 7FEE03B1h pop ebp sub esp, 4 mov [esp], esi push edx mov edx, 6F6B5081h mov esi, 0BC6D38Bh add esi, edx pop edx Obfuscated Deobfuscated Constraints sub esp, 4 push r32 r32 = esp mov [esp], r32 push r321 push r322 r322 = esp mov [esp], r322 mov r322, i321+i322 r321 = r322 Learn rule and generalize.
  • 65. Peephole Superdeobfuscation In Action push eax mov eax, ebp push ebp push eax xor [esp], 7FEE03B1h xor [esp], 7FEE03B1h pop ebp sub esp, 4 mov [esp], esi push edx mov edx, 6F6B5081h mov esi, 0BC6D38Bh add esi, edx pop edx Obfuscated Deobfuscated Constraints sub esp, 4 push r32 r32 = esp mov [esp], r32 push r321 push r322 r322 = esp mov [esp], r322 mov r322, i321+i322 r321 = r322 No simplification.
  • 66. Peephole Superdeobfuscation In Action push eax mov eax, ebp push ebp push eax xor [esp], 7FEE03B1h xor [esp], 7FEE03B1h pop ebp sub esp, 4 mov [esp], esi push edx mov edx, 6F6B5081h mov esi, 0BC6D38Bh add esi, edx pop edx Obfuscated Deobfuscated Constraints sub esp, 4 push r32 r32 = esp mov [esp], r32 push r321 push r322 r322 = esp mov [esp], r322 push r321 mov r322, r321 r321 = esp xor [esp], i32 xor r322, i32 r322 = esp pop r322 mov r322, i321+i322 r321 = r322 Learn rule and generalize.
  • 67. Peephole Superdeobfuscation In Action push eax mov eax, ebp push ebp xor [esp], 7FEE03B1h mov ebp, eax xor ebp, 7FEE03B1h sub esp, 4 mov [esp], esi push edx mov edx, 6F6B5081h mov esi, 0BC6D38Bh add esi, edx pop edx Obfuscated Deobfuscated Constraints sub esp, 4 push r32 r32 = esp mov [esp], r32 push r321 push r322 r322 = esp mov [esp], r322 push r321 mov r322, r321 r321 = esp xor [esp], i32 xor r322, i32 r322 = esp pop r322 mov r322, i321+i322 r321 = r322 No simplification.
  • 68. Peephole Superdeobfuscation In Action push eax mov eax, ebp push ebp xor [esp], 7FEE03B1h mov ebp, eax xor ebp, 7FEE03B1h sub esp, 4 mov [esp], esi push edx mov edx, 6F6B5081h mov esi, 0BC6D38Bh add esi, edx pop edx Obfuscated Deobfuscated Constraints sub esp, 4 push r32 r32 = esp mov [esp], r32 push r321 push r322 r322 = esp mov [esp], r322 push r321 mov r322, r321 r321 = esp xor [esp], i32 xor r322, i32 r322 = esp pop r322 mov r322, i321+i322 r321 = r322 Apply previous simplification.
  • 69. Peephole Superdeobfuscation In Action push eax mov eax, ebp push ebp xor [esp], 7FEE03B1h mov ebp, eax xor ebp, 7FEE03B1h push esi push edx mov edx, 6F6B5081h mov esi, 0BC6D38Bh add esi, edx pop edx Obfuscated Deobfuscated Constraints sub esp, 4 push r32 r32 = esp mov [esp], r32 push r321 push r322 r322 = esp mov [esp], r322 push r321 mov r322, r321 r321 = esp xor [esp], i32 xor r322, i32 r322 = esp pop r322 mov r322, i321+i322 r321 = r322 No simplification.
  • 70. Peephole Superdeobfuscation In Action push eax mov eax, ebp push ebp xor [esp], 7FEE03B1h mov ebp, eax xor ebp, 7FEE03B1h push esi push edx mov edx, 6F6B5081h mov esi, 0BC6D38Bh add esi, edx pop edx Obfuscated Deobfuscated Constraints sub esp, 4 push r32 r32 = esp mov [esp], r32 push r321 push r322 r322 = esp mov [esp], r322 push r321 mov r322, r321 r321 = esp xor [esp], i32 xor r322, i32 r322 = esp pop r322 mov r322, i321+i322 r321 = r322 No simplification.
  • 71. Peephole Superdeobfuscation In Action push eax mov eax, ebp push ebp xor [esp], 7FEE03B1h mov ebp, eax xor ebp, 7FEE03B1h push esi push edx mov edx, 6F6B5081h mov esi, 0BC6D38Bh add esi, edx pop edx Obfuscated Deobfuscated Constraints sub esp, 4 push r32 r32 = esp mov [esp], r32 push r321 push r322 r322 = esp mov [esp], r322 push r321 mov r322, r321 r321 = esp xor [esp], i32 xor r322, i32 r322 = esp pop r322 mov r322, i321+i322 r321 = r322 No simplification.
  • 72. Peephole Superdeobfuscation In Action push eax mov eax, ebp push ebp xor [esp], 7FEE03B1h mov ebp, eax xor ebp, 7FEE03B1h push esi push edx mov edx, 6F6B5081h mov esi, 0BC6D38Bh add esi, edx pop edx Obfuscated Deobfuscated Constraints sub esp, 4 push r32 r32 = esp mov [esp], r32 push r321 push r322 r322 = esp mov [esp], r322 push r321 mov r322, r321 r321 = esp xor [esp], i32 xor r322, i32 r322 = esp pop r322 mov r322, i321+i322 r321 = r322 No simplification.
  • 73. Peephole Superdeobfuscation In Action push eax mov eax, ebp push ebp xor [esp], 7FEE03B1h mov ebp, eax xor ebp, 7FEE03B1h push esi push edx mov edx, 6F6B5081h mov esi, 0BC6D38Bh add esi, edx pop edx Obfuscated Deobfuscated Constraints sub esp, 4 push r32 r32 = esp mov [esp], r32 push r321 push r322 r322 = esp mov [esp], r322 push r321 mov r322, r321 r321 = esp xor [esp], i32 xor r322, i32 r322 = esp pop r322 push r321 mov r322, i321+i322 r321 = esp mov r321, i321 mov r322, i321+i322 r322 = esp mov r322, i322 r321 = r322 add r322, r321 pop r321 Learn rule and generalize.
  • 74. Peephole Superdeobfuscation In Action push eax mov eax, ebp push ebp xor [esp], 7FEE03B1h mov ebp, eax xor ebp, 7FEE03B1h push esi mov esi, 7B32240Ch Obfuscated Deobfuscated Constraints sub esp, 4 push r32 r32 = esp mov [esp], r32 push r321 push r322 r322 = esp mov [esp], r322 push r321 mov r322, r321 r321 = esp xor [esp], i32 xor r322, i32 r322 = esp pop r322 push r321 mov r322, i321+i322 r321 = esp mov r321, i321 mov r322, i321+i322 r322 = esp mov r322, i322 r321 = r322 add r322, r321 pop r321 Continue ad infinitum. wtf?
  • 75. Peephole Superdeobfuscation System Output Deobfuscator Rules Code Generator C CodePython Code OCaml Code We can turn a set of rules into a program that uses pattern-matching to implement those transformations. We can generate a deobfuscator automatically, and also the obfuscator that created the code in the first place!
  • 76. Peephole Superdeobfuscation Limitations mov eax, 12h or edx, eax add eax, 34h jmp @end bswap ax bsr edx, 12h salc cmc @end: xor edx, edx mov eax, [esp] add esp, 4 ret Could simplify if we knew that edx was dead (i.e., not used before its next modification ). Ours is a forward analysis; dead code elimination requires backwards analysis. Can incorporate a liveness analysis for this purpose.
  • 77. Introduction Ingredients X86 Disassembly and Assembly IR Translation IR Interpreter SMT Integration Applications Enumerative Program Synthesis CPU Emulator Synthesis Peephole Superdeobfuscation Template-Based Program Synthesis Metamorphic Extraction Conclusion
  • 78. Template-Based Program Synthesis Overview Question: is it possible to create the function x+1 by using two of ~ (not) and/or - (neg)? y = ~x; y = -x; z = ~y; z = ~y; y = ~x; y = -x; z = -y; z = -y; Table: All Possible Sequences
  • 79. Template-Based Program Synthesis Templates Problem phrased as a template: what do bop1 and bop2 need to be set to, so that f(x) == x+1 for all values of x? bool bop1 , bop2; int f(int x) { int y = bop1 ? -x : ~x; return bop2 ? -y : ~y; }
  • 80. Template-Based Program Synthesis Solving After phrasing the question appropriately: English Mathematics Are there values of bop1, bop2 ∃ bop1, bop2 ∈ Bool · Such that, for all values of x ∀ x ∈ BV[32] · In the code y = bop1 ? -x : ~x let y = bop1 ? -x : ~x in z = bop2 ? -y : ~y let z = bop2 ? -y : ~y in z == x+1 is always true? z == x+1 An SMT solver gives bop1 = false, bop2 = true. I.e., int f(int x) { return -~x; }
  • 81. Template-Based Program Synthesis Extending the Framework We can extend the idea to use more than two operator types: Or reference further constants that the solver must provide: char op1; int f(int x) { y = op1 == 0 ? -x : op1 == 1 ? ~x : x-1; return y; } bool op1; char c1; int f(int x) { y = op1 ? x + c1 : x ^ c1; return y; }
  • 82. Introduction Ingredients X86 Disassembly and Assembly IR Translation IR Interpreter SMT Integration Applications Enumerative Program Synthesis CPU Emulator Synthesis Peephole Superdeobfuscation Template-Based Program Synthesis Metamorphic Extraction Conclusion
  • 83. Metamorphic Extraction Metamorphic Decoder Generation A metamorphic engine generates code like the following: lodsb op al, bl op al, i81 op al, i82 op bl, al Where each op can be add, sub, or xor, and i81 / i82 are 8-bit constants. I.e., 34 ∗ 28 ∗ 28 ≈ 5.3 million possible instances. Example metamorphic decoder: lodsb add al, bl xor al, 0D2h sub al, 0F1h add bl, al
  • 84. Metamorphic Extraction Further Obfuscation add al, 0B7h sub al, 90h push cx mov cl, bl add al, bl pop cx add al, 90h sub al, 0B7h push ebx mov ebx, 8716AEF1h push ecx push eax xor dword ptr [esp], ebx ; continued Obfuscator lodsb add al, bl xor al, 0D2h sub al, 0F1h add bl, al The metamorphic code is then further obfuscated.
  • 85. Metamorphic Extraction Goal lodsb op al, bl op al, i81 op al, i82 op bl, al add al, 0B7h sub al, 90h push cx mov cl, bl add al, bl pop cx add al, 90h sub al, 0B7h push ebx mov ebx, 8716AEF1h push ecx push eax xor dword ptr [esp], ebx ; continued Re-Create Given an obfuscated sequence, we want to determine the underlying metamorphically-generated function. I.e., would like to know the ops and i8s. We re-create the information instead of deobfuscating. Let’s explore two approaches based on program synthesis.
  • 86. Metamorphic Extraction Template for Metamorphic Functionality Template for Metamorphic Functionality al = Load(vMem,vEdi,8); a = op1 == 0 ? al + vBl : op1 == 1 ? al - vBl : al ^ vBl; b = op2 == 0 ? a + c1 : op2 == 1 ? a - c1 : a ^ c1; c = op3 == 0 ? b + c2 : op3 == 1 ? b - c2 : b ^ c2; d = op4 == 0 ? vBl + c : op4 == 1 ? vBl - c : vBl ^ c; X86-related variables: al is the value read by lodsb vBl is the initial value of bl d is the final value of bl Template parameters: Constants c1, c2 Operators op1, op2, op3, op4 0:add, 1:sub, 2+:xor lodsb op al, bl op al, i81 op al, i82 op bl, al
  • 87. Metamorphic Extraction Straightforward Approach Using ∃ · ∀· Template Program IR for obfuscated X86 IR assertions ∃ op1, op2, op3, op4, c1, c2 · ∀ input X86 states · d == vBlAfter Query SMT Solver Values for op1, op2, op3, op4, c1, c2 We can solve directly using the quantifiers ∃, ∀. Quantifiers can slow solving, so we show an alternative.
  • 88. Metamorphic Extraction Collect I/O Pairs IR for obfuscated X86 Input State IR Interpreter Output State Collect I/O pairs for the obfuscated X86 seqeuence. Extract the important parts of the states. al 0x00 vBl 0x00 vBlAfter 0xE1 Parts of state needed for synthesis
  • 89. Metamorphic Extraction Create Witnesses Plug the I/O pair al 0x00 vBl 0x00 vBlAfter 0xE1 Into the template a = op1 == 0 ? al + vBl : op1 == 1 ? al - vBl : al ^ vBl; b = op2 == 0 ? a + c1 : op2 == 1 ? a - c1 : a ^ c1; c = op3 == 0 ? b + c2 : op3 == 1 ? b - c2 : b ^ c2; d = op4 == 0 ? vBl + c : op4 == 1 ? vBl - c : vBl ^ c; To obtain a witness: a1 = op1 == 0 ? 0x00+0x00 : op1 == 1 ? 0x00-0x00 : 0x00^0x00; b1 = op2 == 0 ? a1 + c1 : op2 == 1 ? a1 - c1 : a1 ^ c1; c1 = op3 == 0 ? b1 + c2 : op3 == 1 ? b1 - c2 : b1 ^ c2; d1 = op4 == 0 ? 0x00 + c1 : 0x00 == 1 ? 0x00 - c1 : 0x00 ^ c1; assert(d1 == 0xE1);
  • 90. Metamorphic Extraction Synthesizing Candidate Functions Witnesses SMT Solver Function Doesn’t Exist Values for op1, op2, op3, op4, c1, c2 UNSAT SAT Query for template parameters that satisfy the witnesses. If they exist, create a function from them. a = al ^ vBl; b = a ^ 0x4A; c = b ^ 0x85; d = vBl + c;
  • 91. Metamorphic Extraction Equivalence Checking Our function is only valid for witnesses seen far. Does its behavior always match the X86? Synthesized Function Obfuscated X86 IR assertions d != vBlAfter Query SMT Solver Valid State Causing Difference UNSAT SAT If the formula is satisfiable, the output is a counterexample: a state causing a difference in execution. al 0x00 vBl 0x88
  • 92. Metamorphic Extraction Refinement If the function did not match the X86, use the output state to create a new witness. Repeat until error or success. Witnesses Synthesis Can’t SynthesizeEquivalence Check Create New Witness Function is Valid UNSATSAT UNSATSAT
  • 93. Metamorphic Extraction In Action lodsb add al, bl xor al, 0D2h sub al, 0F1h add bl, al The deobfuscated sequence is shown for clarity. Our analysis works upon obfuscated sequences. a = al ^ vBl; c = b ^ 0x85; c = b ^ 0x85; c = b ^ 0x85; Synthesized b = a ^ 0x4A; Program c = b ^ 0x85; d = vBl + c; Counter- Example Begin with a program synthesized from the witnesses. Try to find a counter-example.
  • 94. Metamorphic Extraction In Action lodsb add al, bl xor al, 0D2h sub al, 0F1h add bl, al The deobfuscated sequence is shown for clarity. Our analysis works upon obfuscated sequences. a = al ^ vBl; c = b ^ 0x85; c = b ^ 0x85; c = b ^ 0x85; Synthesized b = a ^ 0x4A; Program c = b ^ 0x85; d = vBl + c; Counter- vBl = 0x88 Example al = 0x00 Synthesize a program from the witnesses. Try to find a counter-example. Found: generate new witness.
  • 95. Metamorphic Extraction In Action lodsb add al, bl xor al, 0D2h sub al, 0F1h add bl, al The deobfuscated sequence is shown for clarity. Our analysis works upon obfuscated sequences. a = al ^ vBl; a = al + vBl; c = b ^ 0x85; c = b ^ 0x85; Synthesized b = a ^ 0x4A; b = a ^ 0xD9; Program c = b ^ 0x85; c = b - 0xE8; d = vBl + c; d = vBl + c; Counter- vBl = 0x88 Example al = 0x00 Synthesize a program from the witnesses. Try to find a counter-example.
  • 96. Metamorphic Extraction In Action lodsb add al, bl xor al, 0D2h sub al, 0F1h add bl, al The deobfuscated sequence is shown for clarity. Our analysis works upon obfuscated sequences. a = al ^ vBl; a = al + vBl; c = b ^ 0x85; c = b ^ 0x85; Synthesized b = a ^ 0x4A; b = a ^ 0xD9; Program c = b ^ 0x85; c = b - 0xE8; d = vBl + c; d = vBl + c; Counter- vBl = 0x88 vBl = 0x20 Example al = 0x00 al = 0x10 Synthesize a program from the witnesses. Try to find a counter-example. Found: generate new witness.
  • 97. Metamorphic Extraction In Action lodsb add al, bl xor al, 0D2h sub al, 0F1h add bl, al The deobfuscated sequence is shown for clarity. Our analysis works upon obfuscated sequences. a = al ^ vBl; a = al + vBl; a = al + vBl; c = b ^ 0x85; Synthesized b = a ^ 0x4A; b = a ^ 0xD9; b = a ^ 0xD3; Program c = b ^ 0x85; c = b - 0xE8; c = b + 0x0E; d = vBl + c; d = vBl + c; d = vBl + c; Counter- vBl = 0x88 vBl = 0x20 Example al = 0x00 al = 0x10 Synthesize a program from the witnesses. Try to find a counter-example.
  • 98. Metamorphic Extraction In Action lodsb add al, bl xor al, 0D2h sub al, 0F1h add bl, al The deobfuscated sequence is shown for clarity. Our analysis works upon obfuscated sequences. a = al ^ vBl; a = al + vBl; a = al + vBl; c = b ^ 0x85; Synthesized b = a ^ 0x4A; b = a ^ 0xD9; b = a ^ 0xD3; Program c = b ^ 0x85; c = b - 0xE8; c = b + 0x0E; d = vBl + c; d = vBl + c; d = vBl + c; Counter- vBl = 0x88 vBl = 0x20 vBl = 0x23 Example al = 0x00 al = 0x10 al = 0x08 Synthesize a program from the witnesses. Try to find a counter-example. Found: generate new witness.
  • 99. Metamorphic Extraction In Action lodsb add al, bl xor al, 0D2h sub al, 0F1h add bl, al The deobfuscated sequence is shown for clarity. Our analysis works upon obfuscated sequences. a = al ^ vBl; a = al + vBl; a = al + vBl; a = al + vBl; Synthesized b = a ^ 0x4A; b = a ^ 0xD9; b = a ^ 0xD3; b = a ^ 0xD2; Program c = b ^ 0x85; c = b - 0xE8; c = b + 0x0E; c = b + 0x0F; d = vBl + c; d = vBl + c; d = vBl + c; d = vBl + c; Counter- vBl = 0x88 vBl = 0x20 vBl = 0x23 Example al = 0x00 al = 0x10 al = 0x08 Synthesize a program from the witnesses. Try to find a counter-example.
  • 100. Metamorphic Extraction In Action lodsb add al, bl xor al, 0D2h sub al, 0F1h add bl, al The deobfuscated sequence is shown for clarity. Our analysis works upon obfuscated sequences. a = al ^ vBl; a = al + vBl; a = al + vBl; a = al + vBl; Synthesized b = a ^ 0x4A; b = a ^ 0xD9; b = a ^ 0xD3; b = a ^ 0xD2; Program c = b ^ 0x85; c = b - 0xE8; c = b + 0x0E; c = b + 0x0F; d = vBl + c; d = vBl + c; d = vBl + c; d = vBl + c; Counter- vBl = 0x88 vBl = 0x20 vBl = 0x23 Example al = 0x00 al = 0x10 al = 0x08 NONE Synthesize a program from the witnesses. Try to find a counter-example. Not found: function is valid.
  • 101. Introduction Ingredients X86 Disassembly and Assembly IR Translation IR Interpreter SMT Integration Applications Enumerative Program Synthesis CPU Emulator Synthesis Peephole Superdeobfuscation Template-Based Program Synthesis Metamorphic Extraction Conclusion
  • 102. New Course Offering New training course offering on SMT-based binary program analysis. Written for low-level people comfortable programming in Python; no particular math or CS background required. Learn what SMT solvers are and how to use them. Lecture material vividly illustrated like these slides. Students construct a minimal, yet fully functional SMT-based program analysis framework in Python. Dozens of small, guided programming exercises. Code an SMT solver, X86 → IR translator, ROP compiler1 . Available now! 1 ROP compiler application subject to potential replacement pending forthcoming regulation of the computer security industry
  • 103. Questions? rolf@msreverseengineering.com Check out M¨obius Strip Reverse Engineering at: http://www.msreverseengineering.com Program analysis training classes Reverse engineering training classes Consulting services Blog, research archive, and other resources
  • 104. Thanks My proofreaders are awesome: Igor Skochinsky William Whistler Vijay D’Silva People who publish inspiring work are awesome, too.
  • 105. References Sorav Bansal and Alex Aiken. Automatic generation of peephole superoptimizers. In ACM Sigplan Notices, volume 41, pages 394–403. ACM, 2006. Patrice Godefroid and Ankur Taly. Automated synthesis of symbolic instruction encodings from i/o samples. In ACM SIGPLAN Notices, volume 47, pages 441–452. ACM, 2012. Sumit Gulwani, Susmit Jha, Ashish Tiwari, and Ramarathnam Venkatesan. Synthesis of loop-free programs. In ACM SIGPLAN Notices, volume 46, pages 62–73. ACM, 2011.