The Ultimate Deobfuscator - ToorCon San Diego 2008
1. ToorConX - The Ultimate Deobfuscator
Stephan Chenette, Principal Security Researcher
Websense Security Labs
2. Agenda
The Ultimate Deobfuscator: Hooking versus Simulation
The Obfuscation “Problem”
The “Simulated Solution”
The Ultimate Deobfuscator
Concluding remarks
6. DeObfuscation…how far we’ve come…
Replacing all occurrences of "document.write" with alert.
Replacing eval with document.write
Forcing document.write to write between
<textarea></textarea> tags
But malcode authors added anti-debugging traps like
arguments.callee.toString() to break the flow of execution
if the code were changed in any way.
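The substitution tricks above can be sketched as follows. This is a minimal illustration, not any particular tool: `captured`, `fakeDocument`, `loggingEval` and the toy sample are all hypothetical, and the shims only record strings instead of executing or rendering them.

```javascript
// Hypothetical logging shims; `captured`, `fakeDocument` and `loggingEval`
// are illustrative names, not part of any real tool.
const captured = [];

// Stand-in for the browser's document object: write() records the markup.
const fakeDocument = {
  write(markup) {
    captured.push({ sink: "document.write", data: String(markup) });
  },
};

// Replacement for eval: record the decoded string instead of running it.
function loggingEval(source) {
  captured.push({ sink: "eval", data: String(source) });
}

// Toy obfuscated sample: the payload is hidden as character codes.
const obfuscatedSample = `
  var s = String.fromCharCode(100,111,99,117,109,101,110,116,46,119,114,105,
                              116,101,40,39,104,105,39,41);
  eval(s);
  document.write("<ifr" + "ame>");
`;

// Parameters named `eval` and `document` shadow the real ones inside the sample.
new Function("eval", "document", obfuscatedSample)(loggingEval, fakeDocument);

for (const entry of captured) console.log(entry.sink + ": " + entry.data);
```

Because the sample's own text is untouched here, this variant does not trip a source-inspecting trap by itself, but it only sees one layer at a time and depends on every other browser object the script touches being present.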
8. The “Simulated” solution
Various toolkits have been released for safely
investigating malicious JavaScript
– HTML Parser
– DOM Implementation
– Scripting Engine(s)/Interpreter(s)
24. The ultimate goal would be….
A headless browser basically….
Non-graphical browser that would give us all the details
internally of what’s going on without rendering the content.
That wouldn’t allow successful exploitation.
25. The Dream…
Deobfuscation results
– eval, document.write
Navigation to a new page
Pop-ups prompting user to install an exe
DNS errors
Changes to the DOM, additional iframe elements or image
elements
Loading of ActiveX controls, success or failure and CLSID
Iteration of the DOM whenever we wanted.
All of this done automatically, without a debugger
26. Another solution…
The “Ultimate” Deobfuscator…
What better to use to deobfuscate content than the full
browser itself?
This presentation focuses on Internet Explorer, but this
technique works for every browser.
27. Is it really that easy?
Hook the right functions inside the browser and just sit
back.
28. Hooking is a dirty job
Especially when you’re hooking functions that are neither
exported nor documented.
Requires changes if browser DLLs are updated
42. Conclusion
Obfuscation != Malicious
Security researchers analyzing web content must have
solid deobfuscation tools to deal with today’s malicious
webscape.
Hooking the browser is one more solution for manual
reversing of obfuscated content.
43. Thank you.
Any questions?
I will be releasing POC code for this in a blog next week.
Also…We’re hiring! One intern and one full-time position
open! San Diego position - Research/Dev skills, C, Perl,
etc.!
Check out our website and blogs
http://securitylabs.websense.com/content/blogs.aspx
http://securitylabs.websense.com/
http://www.linuxquestions.org/linux/answers/Security/Decoding_obfuscated_javascript_Simple_way
http://handlers.sans.org/dwesemann/decode/index.html
http://handlers.sans.org/dwesemann/decode/
https://isc2.sans.org/diary.html?storyid=2268
…properties of arguments.callee.toString() and how this makes analysis harder. This method allows a function to reference itself, and hence allows a function to detect modifications to its own code. Changing eval() to print() changes the function string, and with it the result. This can usually be defeated by re-defining the eval() function into a simple call to print(), but not so in this case. So let's take a look at some of the protection features in detail.
http://yaisb.blogspot.com/2006/10/defeating-dean-edwards-javascript.html
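The arguments.callee.toString() trap quoted above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the checksum, the XOR encoding, and the names `run` and `log` (standing in for eval and print) are hypothetical and not taken from any real sample. The decoder derives its key from its own source text, so editing that text changes the key and the decoded result.

```javascript
// Body of a self-checking decoder. It reads its own source through
// arguments.callee.toString() and derives the XOR key from that text.
const decoderBody = `
  var text = arguments.callee.toString();
  var key = 0;
  for (var i = 0; i < text.length; i++) key = (key + text.charCodeAt(i)) % 256;
  var out = "";
  for (var j = 0; j < encoded.length; j++) out += String.fromCharCode(encoded[j] ^ key);
  return run(out);
`;

// Functions built with the Function constructor are non-strict, so
// arguments.callee is available inside them.
const originalDecoder = new Function("encoded", "run", "log", decoderBody);

// The analyst's edit: swap the executing call for a logging call.
const tamperedDecoder = new Function(
  "encoded", "run", "log", decoderBody.replace("run(out)", "log(out)")
);

// Same checksum the decoder computes, used here to prepare the payload.
function sourceChecksum(text) {
  let key = 0;
  for (let i = 0; i < text.length; i++) key = (key + text.charCodeAt(i)) % 256;
  return key;
}

const secretPayload = "alert(1)";
const payloadKey = sourceChecksum(originalDecoder.toString());
const encodedPayload = Array.from(secretPayload, (ch) => ch.charCodeAt(0) ^ payloadKey);

const passThrough = (s) => s; // stand-in for both eval and print
const fromOriginal = originalDecoder(encodedPayload, passThrough, passThrough);
const fromTampered = tamperedDecoder(encodedPayload, passThrough, passThrough);

console.log(fromOriginal === secretPayload); // untouched decoder recovers the payload
console.log(fromTampered === secretPayload); // edited decoder derives a different key
```

The edited decoder never reports an error; it simply produces the wrong string, which is what makes this kind of trap awkward to spot during manual analysis.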
Too high level, need to see results at a lower level.
There have been several presentations on building JavaScript simulators in the last two years. Let me show you how our simulator within Websense Security Labs works…
When creating a simulator, the simulator must implement all the objects and functions that the browser would make available, such as document.write, alert, location.href, navigator, screen, etc. You also have to load external pages and all the JS code from them, which is simple to break. You also have to deal with the transaction. If any of you have attempted to write cross-browser JavaScript, you’ll also know that there are lots of differences between browsers in how they interact with the Document Object Model (DOM). Imagine the number of methods to test against to make sure the simulator has implemented everything necessary, as well as how it has implemented it.
Lacking any object or function will result in an error in the deobfuscation process.
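A minimal sketch of that failure mode, assuming a toy simulator that runs under Node.js (where no `screen` global exists). The `simulate` helper, the environment objects and the sample script are hypothetical and far simpler than any real simulator.

```javascript
// Toy simulator: runs a script with only the objects listed in `env`.
// Note that host globals still leak through; this sketch assumes Node.js,
// which does not define a `screen` object.
function simulate(script, env) {
  const names = Object.keys(env);
  const runner = new Function(...names, script);
  try {
    runner(...names.map((name) => env[name]));
    return { ok: true, error: null };
  } catch (err) {
    return { ok: false, error: err.name + ": " + err.message };
  }
}

const simOutput = [];
const minimalEnv = {
  document: { write: (s) => simOutput.push(String(s)) },
  alert: (s) => simOutput.push("alert: " + s),
};

// The sample checks a browser property before it decodes anything.
const fingerprintingScript = `
  if (screen.width > 0) { document.write("decoded payload"); }
`;

// Without `screen`, the run aborts before any output is produced.
const incompleteRun = simulate(fingerprintingScript, minimalEnv);

// Adding the missing object lets the same script reach document.write.
const completeRun = simulate(fingerprintingScript, {
  ...minimalEnv,
  screen: { width: 1024, height: 768 },
});

console.log(incompleteRun.error);
console.log(simOutput);
```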
We want to see createElement for an iframe, or setAttribute for a clsid, etc. Also CreateObject, e.g. msxml2.XMLHTTP, Shell.Application, ADODB.Stream, etc.
Cover this slide in detail…
…codes that employed one or several of the "document.referrer", "document.location" and "location.href" properties as part of the decoding process.
http://myitforum.com/cs2/blogs/cmosby/archive/2008/07/17/obfuscated-javascript-redux-sans-internet-storm-center.aspx
http://parentnode.org/javascript/obfuscated-javascript-challenge/
http://hype-free.blogspot.com/2007/02/decoding-obfuscated-javascript.html
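A short sketch of why location-dependent decoding is a problem for offline analysis. The XOR scheme, the URLs and every name below are assumptions chosen for illustration, not a reconstruction of any specific sample: the page's own URL serves as the key, so a copy analyzed from a different location decodes to garbage.

```javascript
// XOR each character code with the URL, repeating the URL as needed.
function xorWithUrl(codes, url) {
  let out = "";
  for (let i = 0; i < codes.length; i++) {
    out += String.fromCharCode(codes[i] ^ url.charCodeAt(i % url.length));
  }
  return out;
}

// Hypothetical URL the sample expects to be served from.
const servedFrom = "http://example.test/landing.html";
const hiddenScript = "document.write('<iframe>')";

// What the attacker would embed in the page: the script XORed with the URL.
const encodedCodes = Array.from(
  hiddenScript,
  (ch, i) => ch.charCodeAt(0) ^ servedFrom.charCodeAt(i % servedFrom.length)
);

// The decoder reads the key from a location-like object at run time.
function decodeAt(fakeLocation) {
  return xorWithUrl(encodedCodes, fakeLocation.href);
}

const atRealUrl = decodeAt({ href: servedFrom });
const atLocalCopy = decodeAt({ href: "file:///C:/samples/landing.html" });

console.log(atRealUrl === hiddenScript);   // correct location recovers the script
console.log(atLocalCopy === hiddenScript); // a saved local copy does not
```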
This also doesn’t work on Vista or 2003 due to ASLR and the DLL base addresses being randomized. We have it working on 2000 and XP.
document.write – writes to a webpage
eval – evaluates a string and executes it as if it were script code
document.write was pretty straightforward to hook. eval wasn’t as straightforward: I had to hook it in the middle of the function in order to get access to the buffer I wanted to output. Kinda dirty, but it works. But there are tons of other functions within these two DLLs that would be interesting to hook and log for analysis.
Let’s take a non-malicious example to start. We see x*y, or 10*20, which the browser outputs as 200. We see 2+2, which the browser outputs as 4. And we see 10+17, which makes 27. The hooking has been done in such a way that a DLL has been injected into IE, has hooked document.write and eval, and has output the results to a DebugView window.
So looking at DebugView we can see the same output that was computed for the browser.
Now let’s look at a malicious example. Left off here…
Where an incomplete simulator would fail before, a hooked decoder will have no trouble logging dynamic content.
Obfuscation does not equal malicious; we must be able to tell the difference. Malicious files, exploits and content are found on the web, and security researchers who don’t have the tools needed to analyze web content are going to be missing out on a key ability.
I will be releasing a blog next week where I’ll release the hooking code that accomplishes what I’ve talked about in this presentation. It is a subset of what we’ve written for our infrastructure, which works on a much larger scale. I also want to mention that we have two open positions, so if you’re interested drop your CV or card my way.