These are the slides accompanying the talk given by Darren Martyn at the SteelCon security conference in July 2014 about process injection using Python.
The talk covers using Python to manipulate processes by injecting code on x86, x86_64, and ARMv7l platforms, and writing a stager that automatically detects which platform it is running on and decides which shellcode to inject, and via which method.
The Proof of Concept code is available at https://github.com/infodox/steelcon-python-injection
3. What
• Manipulating what another process is doing in memory
• Memory-resident, process-less backdoors
• Doing all this in Python
4. Use Cases
• Getting around weird runtime packers, such as those used by
malware developers to obfuscate code.
• Cheating in video games!
• Extending / Modifying a program's functionality at runtime!
• Developing forensically challenging code.
5. Today…
• We will talk about being bad guys!
• Because, quite frankly, using this to develop a proof of concept
version of a sneaky malware is the best way to demonstrate
risk.
7. Forensically Challenging
• The concepts outlined here will demonstrate how to create a
piece of malware that is forensically challenging to detect.
• Note that this is not impossible to detect or mitigate against.
We will discuss mitigations and suchlike later in the talk.
8. Before I begin… The basics.
• First off, before we begin, we need to understand how the
stack works.
• Quite simple, will use the x86 stack as an example.
11. Stack – How code gets executed…
• The EIP, or Extended Instruction Pointer, is the register that
holds the address of the next instruction to be executed.
• In exploit development, overwriting a saved return address on
the stack, so that EIP ends up pointing at an attacker-controlled
address, is the “normal” way to get code to execute.
• Conceptually, what we are doing is somewhat similar to exploit
development, in that we wish to gain control of EIP and point it
at our code.
12. Stack – How we are executing code
• Instead of triggering a vulnerability such as in a buffer overflow
exploit to gain control of the EIP, we simply overwrite the code
it points at using the powers of ptrace().
• The ptrace() system call is used for debugging software. It can
both read and write arbitrary data in a process’s memory.
• This allows us to directly manipulate the process’s memory and
registers, and execute code or alter the program’s state at runtime.
13. Process Injection 101
• Attach to process
• Pause Process (this happens on attach)
• Get EIP/RIP
• Overwrite the code at EIP/RIP with shellcode
• Set EBX/RBX to 0
• Continue Process
• Shellcode runs
15. Prior art of note…
• Process injection has been done before on Linux.
• One example of prior art is “Cymothoa”, by Crossbower.
• Written in C, and released via Phrack magazine (a publication
in which hackers publish research), it worked on x86 Linux and
was extremely effective for injecting backdoors into other
processes.
16. Prior art of note…
• There is also some research done by elfmaster in vx-heaven.
• Libhijack, by lattera, implements a whole library of functions to
do this in an easy-to-use format.
• Parasite, by jtripper, also uses these techniques to inject a bind
shell into running processes.
• None of the prior art to the best of my knowledge has been in
anything other than C/ASM.
17. Python Code (warning: wall of text)
process = attach(pid) # Attach to target PID
rip = process.getInstrPointer() # get RIP
process.writeBytes(rip, shellcode)
# overwrite the code at RIP with shellcode…
process.setreg("rbx", 0) # set RBX to 0
process.cont() # Let process continue :)
22. So we need a solution…
• We have our code running in the infected process’s memory
• We need our code to not interfere with the process, and run
alongside it
• How?
24. Let’s Fork()
• Prepend our payload with some fork() shellcode
• Process is forked, new clone runs with our shellcode running in
it
• Original process continues (theoretically) unchanged
25. Prepending Fork
• “Prepending” means we affix something ahead of our main
payload.
• The fork syscall basically creates a new process, identical to the
parent, as a “child” process.
• This helps us avoid killing/damaging the parent process and
causing possible loss of data or alerting administrators to our
presence.
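The fork behaviour described above can be seen from plain Python, with no injection involved. This is only a sketch of the semantics: run_in_child() and demo() are names made up for this illustration and are not part of the talk's code. The child runs one task and exits, while the parent carries on and simply reads what the child wrote.

```python
import os


def run_in_child(task):
    """Fork; run task() in the child only; return the child's PID to the parent."""
    pid = os.fork()
    if pid == 0:
        # Child: a copy of the parent. Run the task, then exit immediately
        # so the child never falls back into the parent's code path.
        status = 0
        try:
            task()
        except Exception:
            status = 1
        finally:
            os._exit(status)
    # Parent: continues as before, now knowing the child's PID.
    return pid


def demo():
    """Have a forked child send one message to the parent over a pipe."""
    read_fd, write_fd = os.pipe()
    child = run_in_child(lambda: os.write(write_fd, b"hello from child"))
    os.close(write_fd)
    message = os.read(read_fd, 64)
    os.close(read_fd)
    _, wait_status = os.waitpid(child, 0)
    return message, os.WEXITSTATUS(wait_status)
```

Calling demo() returns the child's message together with its exit code. The prepended fork shellcode relies on the same split: one copy continues the original work, the other runs the payload.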
26. Prepending Fork
• We prepend a shellcode to our shellcode which does the
following:
Step 1: Fork parent process.
Step 2: Run our shellcode.
Note: I even drew a terrible picture to explain this!
27. Prepending Fork
(Diagram: execution flow of the process. At fork(), the parent process continues, while a child process is spawned with the shellcode in it, so it is infected.)
28. Let’s fork() – a demonstration of forking
• Demo of forking (probably pre-recorded if live not working
out).
29. Problems with forking
• With fork, we create a new process
• New process shows up in process listings
• In future, I will be playing with clone() à la Cymothoa, but I
simply could not get it working for this yet
30. Back to the python
• So far, we have scratched the surface of memory injection
• So why Python for this?
• Simplicity.
31. Python: Making the hard stuff easy
• The injection code is incredibly short
• We can very easily improve it if we feel the need
• Can spend more time working on the rest of the project (like,
say, the hard bit: shellcode!)
32. Enhancing our injector
• So, the more astute of you may be wondering why I am
clobbering the stack here, and leaving it in a fairly clobbered
state…
• In this bit, I am going to *attempt* to restore the registers
post-injection. This is not always successful, mind…
34. Restoring the Registers (1)
• This is a bit of a filthy hack, but worked well enough for me to
consider it
• Method I am using is a filthy hack and I should feel terrible
• Again, be warned. This might crash
35. Restoring the Registers (2)
• After process.cont(), we sleep for a second
• We then restore the registers to pre-injection state
• We then pray the fork prepender worked and that the stack is
now unclobbered
36. Live Demo (this may well fail)
No, really. You might want to close your eyes for this one
37. Moving swiftly on…
• Now for the extra shiny fun part
• Because it’s Python, our injector is portable
• Write once, pwn everything
• Well, kind of...
38. Writing a multi-arch injector
• Our next toy is a process injector that magically determines if
it’s on an x86 or x86_64 system, and reacts accordingly
• So, let’s write the getArch() function, eh?
• This is easy… 2 lines of code
39. Getting OS Architecture
import platform # import platform module
arch = platform.machine() # get architecture
print(arch) # print architecture (works on Python 2 and 3)
(Screenshots: test on 64-bit; test on 32-bit)
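A possible shape for the getArch() function mentioned earlier, written here as get_arch(). This is a sketch, not the code from the talk: the list of machine strings is an assumption about what platform.machine() commonly reports, and the optional machine argument exists only so the function can be tried without access to each platform.

```python
import platform


def get_arch(machine=None):
    """Map a platform.machine() string onto one of the three targets in the talk.

    Returns "x86", "x86_64" or "armv7l", or None for anything else.
    """
    machine = (machine or platform.machine()).lower()
    if machine in ("x86_64", "amd64"):
        return "x86_64"
    if machine in ("i386", "i486", "i586", "i686", "x86"):
        return "x86"
    if machine.startswith("armv7"):
        return "armv7l"
    # Unknown architecture: let the caller decide what to do.
    return None
```

Returning None rather than guessing keeps the decision about unsupported platforms with the caller.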
40. Porting to x86
• Step 1 (hard bit). Rewrite shellcode for x86.
• Step 2 (easy bit). Rewrite injector part for x86.
• Porting injector part is easy!
• s/rip/eip/
• s/rbx/ebx/
41. Demo of injection on x86
This is a demo of injecting a bindshell on x86. Not live, this box is x86_64.
42. Bonus! Porting to ARM (last-minute addition)
• x86 and x86_64 not too much of a challenge…
• Realized I could chroot Linux on Nexus 7 tablet… Which is
ARMv7l.
• Let’s explore ARMv7l…
43. ARMv7l – the important bits…
• On ARM, the “EIP” is the PC register.
• Our EBX equivalent is the R1 register.
• Other registers not so important for our purposes.
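The register names from the x86, x86_64 and ARM slides fit in one small lookup table, which is roughly what the s/rip/eip/ porting step amounts to. A sketch only: REGISTERS, registers_for() and the key names "instruction_pointer" and "zeroed" are invented for this illustration.

```python
# Register names per architecture, as named in the slides.
REGISTERS = {
    "x86": {"instruction_pointer": "eip", "zeroed": "ebx"},
    "x86_64": {"instruction_pointer": "rip", "zeroed": "rbx"},
    "armv7l": {"instruction_pointer": "pc", "zeroed": "r1"},
}


def registers_for(arch):
    """Return the register names to use for the given architecture."""
    try:
        return REGISTERS[arch]
    except KeyError:
        raise ValueError("unsupported architecture: %r" % (arch,))
```

For example, registers_for("armv7l")["instruction_pointer"] gives "pc", so the injector body can stay the same across all three platforms.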
44. ARMv7l – Challenges
• First attempts were failures because my shellcodes seemed to simply
be too large, causing segfaults.
• Segfault, or segmentation fault, being a crash. This is bad, and will
set off alarms.
• Not an ARM expert, but guessing I was clobbering other registers
with my bulky shellcodes.
• Ended up using a staged Metasploit payload. No prepend-fork yet, so
this is a work in progress! On exit, the parent does get killed!
46. Adding Logic to our injector…
• We can autodetect the target machine’s architecture.
• We can inject into all three architectures demonstrated.
• One injector to rule them all!
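The "logic" here can be as simple as a dispatch table keyed on architecture. The sketch below is hypothetical: InjectionPlan, PLANS and choose_plan() are made-up names, and the payload labels are placeholders rather than real shellcode. The only detail taken from the slides is that the ARM payload has no prepend-fork yet; treating both x86 variants as using it is an assumption.

```python
from collections import namedtuple

# What to inject and how, for one architecture. Labels are placeholders.
InjectionPlan = namedtuple("InjectionPlan", ["payload_label", "prepend_fork"])

PLANS = {
    "x86": InjectionPlan("x86 payload", True),        # assumption
    "x86_64": InjectionPlan("x86_64 payload", True),  # assumption
    # Per the ARM slide: staged payload, no prepend-fork yet.
    "armv7l": InjectionPlan("armv7l staged payload", False),
}


def choose_plan(arch):
    """Pick the plan for an architecture, refusing anything unsupported."""
    plan = PLANS.get(arch)
    if plan is None:
        raise ValueError("no payload for architecture: %r" % (arch,))
    return plan
```

Failing loudly on an unknown architecture is deliberate, since injecting the wrong shellcode would just crash the target.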
47. Testing EnchantedMushroom…
• Wrote a quick “stager” that runs Python code in memory over
SSH. “DiabolicalMouse”.
• Decided to use this to test our newly created cross architecture
injector tool, to see if we could get it working without dropping
anything to disc…
• Fairly hacky code, works for this demo’s purposes!
49. Future Ideas…
• Explore this further on OSX/*BSD and Windows.
• Automatic OS detection to go with Architecture detection?
• More architectures! MIPSLE/BE, SPARC, PPC?
• Improved shellcodes?
• Remove python-ptrace dependency entirely…
50. Future Ideas… (2)
• Implement our own ptrace using ctypes to avoid any non-
native dependencies?
• Lots of further research to do! Stuff like injecting entire ELF
files in memory and suchlike!
• Barrier to entry for exploring this stuff is very low, anyone and
everyone can do some research!
51. Limitations as an attack vector
• As is, the attacker has to be able to execute code on your
system to employ these kinds of techniques.
• This means that by the time the attacker can do this, you are
already compromised, by, say, weak login passwords, or
exploitation of vulnerable software on your system.
• This is a *POST* exploitation technique. On its own, it will not
gain you any further access to a system.
52. Forensically Detecting This…
• Forensic analysts should be aware of this vector of attack.
• Some analysts only investigate artifacts written to disc.
• This kind of analysis is useless against this kind of attack,
wherein nothing is written to disc.
• Analysts should dump the RAM on a suspected-compromised
host to determine if anything exists in-memory.
53. Forensically Detecting This…
• By comparing dumped memory against “good” samples, it may
be possible to actually detect this after the fact.
• Anti-malware solutions that actively scan memory might also
be a decent defence against this kind of thing.
• Network forensics may also assist in detection – observing the
attack actually happening, unusual network connections, etc,
could be a good indicator something has gone terribly wrong.
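Comparing dumped memory against a “good” sample could start as a page-by-page comparison. A minimal sketch, assuming the analyst already has two equal-length byte strings covering the same read-only code region; changed_pages() is a name invented here, and obtaining the dumps is out of scope.

```python
def changed_pages(known_good, suspect, page_size=4096):
    """Return the offsets of pages that differ between two memory images."""
    if len(known_good) != len(suspect):
        raise ValueError("images must cover the same region")
    changed = []
    for offset in range(0, len(known_good), page_size):
        good_page = known_good[offset:offset + page_size]
        suspect_page = suspect[offset:offset + page_size]
        if good_page != suspect_page:
            changed.append(offset)
    return changed
```

This only makes sense for regions that should never change at runtime, such as program code. Data pages differ legitimately between runs, so a hit is a lead to investigate, not proof of injection.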
54. Possible Mitigations
• Prevent access to ptrace() system call by non-root users.
• Some distributions take this approach, but badly.
• Disable ptrace() entirely on production webservers where
debugging access is not required.
• This is doable with certain kernel patches. Grsecurity locks
down ptrace quite well (but is bypassable)
• Monitor process memory and alert on any “unusual”
alterations to said memory space.
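One way to check how a given Linux host restricts ptrace() is to read the Yama ptrace_scope setting. The slides do not name Yama, so this is an assumption about which mechanism is in use: the sketch returns None on kernels without it, and read_ptrace_scope() and describe_ptrace_scope() are names invented for this example.

```python
YAMA_PTRACE_SCOPE = "/proc/sys/kernel/yama/ptrace_scope"

SCOPE_MEANINGS = {
    0: "classic: a process may attach to other processes of the same user",
    1: "restricted: a process may only attach to its own descendants",
    2: "admin-only: attaching requires CAP_SYS_PTRACE",
    3: "disabled: no process may attach",
}


def read_ptrace_scope(path=YAMA_PTRACE_SCOPE):
    """Return the Yama ptrace_scope value, or None if it cannot be read."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (IOError, OSError, ValueError):
        return None


def describe_ptrace_scope(value):
    """Turn a ptrace_scope value into a short human-readable description."""
    if value is None:
        return "Yama ptrace_scope not available on this host"
    return SCOPE_MEANINGS.get(value, "unknown value: %r" % (value,))
```

Running describe_ptrace_scope(read_ptrace_scope()) on a host gives a quick view of whether same-user attach is still allowed there.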
55. Summary
• Python makes hard things like memory hacking easy!
• Writing cross architecture in-memory malware/implants is
accessible to anyone!
• Most architectures can be owned with very little
effort/modification of existing code!
• Further research needed into forensic detection of process
manipulation and in-memory backdoors.
56. Thanks!
Thanks to the SteelCon organizers for having me here today and
allowing me to give this talk!
Also thanks to my co-workers at Xiphos Research for helping me
get this off the ground.
Finally, thanks to all of you for listening!