2. Presentation for STAB Annual Seminar Weekday 2
DISCLAIMER
• This presentation is intended for educational purposes only.
• Reverse engineering of copyrighted material is illegal and
unethical, and I in no way encourage it.
• Playing with malware is not a good idea unless you take
proper precautionary measures. Always work within a
sandboxed environment, for example in a virtual OS.
• Malware analysis is “advanced” stuff. And when you choose to
mess with a nasty app, it is expected that you know what you
are doing. Only you are responsible for what happens to your
system.
3. Presentation for STAB Annual Seminar Weekday 3
What is this heavy term?
• Reverse Engineering simply aims
at “understanding” a system
through analysis of its structure,
function and operation.
• We try to go backwards in the
development cycle: having an
implementation, we try to go
back to the analysis stage, with a
high level understanding.
• With the abstract understanding,
we try to modify parts of the existing
implementation or implement it on
our own!
[Image: Complete disassembly of a Pentax K1000 camera. Image borrowed from bitrebel.com.]
4. Presentation for STAB Annual Seminar Weekday 4
And why should someone learn this?
• Malware analysis: Analyzing malware to build anti-malware
• Bug fixing: Fixing bugs in legacy software
• Personalization/Customizations
• Academic/Learning purposes
• Removing access restrictions
• Removal of copy protection
• Compatibility
• Just for fun!
• Convey the message “Go Open Source”! :-P
5. Presentation for STAB Annual Seminar Weekday 5
Wait … Isn’t this “illegal”!?
• Public release of information obtained through reversing a
proprietary application or sharing the application after
modifying it to remove/reduce security is illegal.
• The DMCA allows reverse engineering of applications to achieve
interoperability.
• Analysis of malware is of course legal!
• “Clean room” design is perfectly legal.
• A team of examiners writes a specification for the target software.
• Several reviews ensure exclusion of any copyrighted materials.
• A separate team of developers re-implement the software.
6. Presentation for STAB Annual Seminar Weekday 6
A bit of history…
• AMD had reverse engineered Intel’s early processors (and had
outperformed all of them!).
• ReactOS, an open source clone of Windows, is still under active
development. It’s not based on *nix systems at all.
• Phoenix developed its BIOS chip by reverse engineering IBM’s
BIOS, with a clean room approach. Phoenix BIOS gave birth to
the first IBM-compatible PCs.
• Wine is reverse engineering Windows for supporting Win API.
• OpenOffice.org is reverse engineering Microsoft Office for
supporting the proprietary file formats.
• Samba enables file sharing between Windows and non-
Windows systems. But they had to reverse engineer Windows
file sharing.
7. Presentation for STAB Annual Seminar Weekday 7
What we would be discussing today…
Reverse Engineering is a whole different branch by itself. We
would only be touching upon a few important fundamental
points that would give you the basic feel of it (and hopefully
motivate you to dig further into it)…
• PE Identification: Packers, Identifying the source language
• Decompilers, Disassemblers and Debuggers
• Introduction to OllyDbg’s Features
• Where to start? Where to focus?
• Patching with OllyDbg
• Phishing and KeyGening
8. Presentation for STAB Annual Seminar Weekday 8
PEiD
• Identification of PE (Portable Executable) is an important step.
It involves identification of compilers, cryptors, packers etc…
• Gives you a starting point to look for a solution.
• Unpacking needed?
• De-compilation possible?
• Crypto libraries used?
• PEiD is by far the best PE Identification tool (470+ signatures
and can be extended with external signatures).
• If PEiD says “Nothing Found”, the application might be using a
custom packer.
[Demos: PEiD run on PEiD itself, on OllyDbg and on KeyGenMe_#6]
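Before any signature matching, an identifier first has to confirm that the file is a PE at all. A minimal sketch of that first check, assuming the well-known layout (a DOS "MZ" header whose e_lfanew field at offset 0x3C points to the "PE\0\0" signature). The function name looksLikePE is hypothetical, and this is not how PEiD itself is implemented:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true if the buffer starts with a DOS "MZ" header whose e_lfanew
// field (offset 0x3C, little-endian 32-bit) points at a "PE\0\0" signature.
// looksLikePE is an illustrative name, not part of any real tool.
bool looksLikePE(const std::vector<std::uint8_t>& image) {
    if (image.size() < 0x40) return false;              // too small for a DOS header
    if (image[0] != 'M' || image[1] != 'Z') return false;

    // Read e_lfanew byte by byte so the result does not depend on host endianness.
    std::uint32_t peOffset = 0;
    for (int i = 3; i >= 0; --i)
        peOffset = (peOffset << 8) | image[0x3C + i];

    // The 4-byte signature must lie completely inside the buffer.
    if (peOffset > image.size() || image.size() - peOffset < 4) return false;
    return image[peOffset] == 'P' && image[peOffset + 1] == 'E' &&
           image[peOffset + 2] == 0 && image[peOffset + 3] == 0;
}
```

Calling looksLikePE on the raw bytes of a file gives a quick yes/no; identifying the compiler or packer then needs signature matching on top of this.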
9. Presentation for STAB Annual Seminar Weekday 9
Packers ?!
• Packers compress the compiled program, much like data.
• They compress the code section, the data section, or both, and can
additionally encrypt them.
[Diagram: Packers turn a compiled executable (CODE, DATA) into a packed executable (CODE, DATA, STUB); unpackers perform the reverse step (unpacking). Labels: Entry Point; OEP: Original Entry Point.]
• The extractor stub decompresses and/or decrypts the data and
code. It writes them to a separate file or does it in-memory.
• Several packers exist. Most of them are selective about the
type of PE they support.
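The pack/unpack round trip can be sketched with a toy example. This is only an illustration under strong assumptions: run-length encoding stands in for a real compression scheme, and the names pack and unpackInMemory are hypothetical. Real packers such as UPX work on PE sections and are far more involved.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Toy "packer": run-length encode the payload as (count, value) pairs.
Bytes pack(const Bytes& payload) {
    Bytes packed;
    for (std::size_t i = 0; i < payload.size();) {
        std::size_t run = 1;
        // Extend the run while the byte repeats; a count must fit in one byte.
        while (i + run < payload.size() && payload[i + run] == payload[i] && run < 255)
            ++run;
        packed.push_back(static_cast<std::uint8_t>(run));
        packed.push_back(payload[i]);
        i += run;
    }
    return packed;
}

// Toy "stub": expand the (count, value) pairs back into the original bytes,
// entirely in memory, as an extractor stub would before jumping to the OEP.
Bytes unpackInMemory(const Bytes& packed) {
    Bytes payload;
    for (std::size_t i = 0; i + 1 < packed.size(); i += 2)
        payload.insert(payload.end(), static_cast<std::size_t>(packed[i]), packed[i + 1]);
    return payload;
}
```

For any byte sequence, unpackInMemory(pack(bytes)) gives back the original bytes, which is the property a stub has to guarantee for the packed code and data.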
10. Presentation for STAB Annual Seminar Weekday 10
Packers ?!
• Some of the most popular (and thus most wanted targets for
unpackers) are ASPack, Enigma, Themida, MPRESS, UPX etc.
• Most packed PEs are manually customized to avoid usage of
direct unpackers.
• Customized executables might not even be recognized by PE
identifiers! So we might not have any idea of the “behavior” of
the packer.
• Without direct unpackers, we have to manually step through
the executable, and find out the original entry point. (When we
step through code in a debugger, the application is actually in
execution, so take precaution while trying this with malware).
11. Presentation for STAB Annual Seminar Weekday 11
Why should you know the compiler?
• Java and Python compile sources to byte code, which runs on a
virtual machine, instead of machine code.
• The same is the case with managed code like .NET. C#, VB and all
.NET languages compile the code to MSIL (Microsoft
Intermediate Language).
• The meta-data present with the intermediate byte code is
enough to make satisfactory “de-compilation” possible.
• For example, Java byte code contains names of classes and all
members; types and modifiers of fields; signatures of methods.
Even names of local variables and line numbers can be saved
optionally, by setting specific attributes.
12. Presentation for STAB Annual Seminar Weekday 12
Why should you know the compiler?
• Some of the things de-compilers need to be “smart” about are:
• Structured control flow (loops and conditionals) from gotos in byte code.
• Type inference for locals, especially for generics as they are generated at
compile time.
• This is very different from the compilation with debug info (-g)
that can be used with C/C++ compilers.
• Debug info is just a mapping from binary to source file.
• If the sources are missing, debugger would not “magically” show them!
• Machine code is so much “reduced” that it’s almost impossible
to “grow” it back to a high level source code.
13. Presentation for STAB Annual Seminar Weekday 13
Some Available Decompilers…
• .NET MSIL Decompiler: Reflector (Freeware till 6.x)
[Demos: KeyGenMe #6 and UnpackMe, each opened in Reflector]
• Java Decompiler: JD (Freeware and decent; there are a dozen others!)
[Demos: SK_CrackMe opened in JD]
14. Presentation for STAB Annual Seminar Weekday 14
What if you can’t decompile?
• Well .. The fun begins!
• Disassemblers and debuggers transform the binary machine
code to assembly instructions. Debuggers offer additional
features like setting break points, stepping through the code
etc…
• OllyDbg is one of the best light-weight debuggers available. IDA
Pro is a heavy-weight and paid debugger (but it’s worth it!).
• OllyDbg supports external plugins and that makes it an even
more powerful tool.
• IDA Pro has the power of recognizing “known” library functions
by their signatures. It can highly simplify the assembly dump.
15. Presentation for STAB Annual Seminar Weekday 15
More about OllyDbg…
• Emphasizes code analysis.
• Olly contains descriptions of about 2200 standard C/C++ library
and Win32 API functions. It also contains 7800 symbolic
constants (grouped into 490 types).
• Olly can detect nested loops, switches and cascaded ifs. It can
also predict register usage.
• You can add comments at each assembly line.
• You can assemble your own expressions instantly and re-build
the program. (Great for patching!)
[Demos: KeyGenMe #2 opened in OllyDbg]
16. Presentation for STAB Annual Seminar Weekday 16
Finding the needle(s) in a haystack…
• Patience and lateral thinking!
• Examine “All referenced strings”
• Locate where the “good boy” (success) or “bad boy” (failure) message is referenced
• Examine “All inter-modular calls”
• Get used to Windows API (Olly does provide context sensitive help)
• I/O calls are a usual starting point for examination
• msvcrt.printf
• msvcrt.scanf
• USER32.GetMessageX
• USER32.GetDlgItemTextX
• Use breakpoints at suspicious DLL calls
• Use memory breakpoint at addresses storing the entered key
17. Presentation for STAB Annual Seminar Weekday 17
A (very) simple example
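[Demos: Case_1; Case_1 opened in OllyDbg]
#include <iostream>
#include <string>
using namespace std;
int main() {
    string username, password, therealpass;
    cout << "USERNAME : "; cin >> username;
    cout << "PASSWORD : "; cin >> password;
    therealpass = "TOP_SECRET";  // hard-coded secret, stored as plain text in the binary
    if (password == therealpass) cout << "You deserve an award!\n";
    else cout << "Y U No Give up?\n";
}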
• Suspicious entities in dump: (good starting points)
• both the final messages,
• therealpass “TOP_SECRET”.
• Just the hex dump gives some hints about a “possible” password and
trying this out gives good boy already!
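Spotting such strings in a dump can be automated. A minimal sketch, assuming the dump is available as a byte string and that only plain ASCII text matters; the function name printableStrings and the default minimum length of 4 are illustrative choices, similar in spirit to what a "strings"-style tool or Olly's referenced-strings view gives you:

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Collect every run of at least minLen printable ASCII characters in a raw dump.
// printableStrings is a hypothetical helper, not an OllyDbg feature.
std::vector<std::string> printableStrings(const std::string& dump, std::size_t minLen = 4) {
    std::vector<std::string> found;
    std::string current;
    for (unsigned char byte : dump) {
        if (byte < 0x80 && std::isprint(byte)) {
            current.push_back(static_cast<char>(byte));   // still inside a text run
        } else {
            if (current.size() >= minLen) found.push_back(current);
            current.clear();                              // run ended at a non-printable byte
        }
    }
    if (current.size() >= minLen) found.push_back(current);
    return found;
}
```

Run over the bytes of Case_1, a scan like this would be expected to list both final messages and the hard-coded password among its results.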
18. Presentation for STAB Annual Seminar Weekday 18
Another example (Patching)
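[Demos: Case_2; Case_2 opened in OllyDbg; Case_2_PATCHED]
#include <iostream>
#include <string>
using namespace std;
int main() {
    string username, password, therealpass;
    // Keep asking until the username is 10 to 99 characters long.
    while (username.length() > 99 || username.length() < 10) {
        cout << "USERNAME : "; cin >> username;
    }
    cin >> password;
    // Key = last 4 chars + "-" + units digit, then tens digit of the length + "-" + first 4 chars.
    therealpass = username.substr(username.length() - 4) + "-";
    therealpass += static_cast<char>('0' + username.length() % 10);
    therealpass += static_cast<char>('0' + username.length() / 10);
    therealpass += "-" + username.substr(0, 4);
    if (password == therealpass) cout << "You deserve an award!\n";
    else cout << "Y U No Give up?\n";
}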
• Patch JZ after TEST AL,AL to bypass the check and always output good boy.
• When trying to analyze “what” is done with the password (KeyGen-ing), more
interesting questions come up, such as why BPs at strcmp are useless, and why some
passwords hit no BP at memcmp while others do.
• This not only helps you make a KeyGen, but also helps you discover low level GCC
implementation details!
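Once the key derivation is understood, a KeyGen is just the same computation run on a username of your choice. A minimal sketch that mirrors the Case_2 listing; the function name keygen is hypothetical, and the assert restates the 10 to 99 character range enforced by the input loop:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Mirrors the key derivation in the Case_2 listing: last four characters,
// the units digit of the length, then its tens digit, then the first four characters.
// keygen is an illustrative name, not part of the original program.
std::string keygen(const std::string& username) {
    const std::size_t len = username.length();
    assert(len >= 10 && len <= 99);   // same range the input loop enforces
    std::string key = username.substr(len - 4) + "-";
    key += static_cast<char>('0' + len % 10);
    key += static_cast<char>('0' + len / 10);
    key += "-" + username.substr(0, 4);
    return key;
}
```

For the 15-character username "reverseengineer" this yields "neer-51-reve": the last four characters, the length digits in units-then-tens order, and the first four characters.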