3. Introduction
As the field of digital forensics (DF) continues to grow, the demand for sound tools and trained examiners grows with it.
Few of today's forensic tool developers have formal training in forensic science or software engineering.
4. Meaning of digital forensics software
DF software extracts and analyzes evidence from sources such as disk images, memory dumps, network packet captures, and program executables.
5. The use of DF tools
1-Criminal investigations.
2-Internal investigations.
3-Audits.
Each of these has different standards for chain of custody, admissibility, and scientific validity.
6. Hackers hide data in several ways
Data can be hidden with encryption and steganography, but these techniques often leave detectable artifacts.
Data can also be hidden in bad sectors or in NTFS alternate data streams (ADS); for example, the command notepad file.txt:hidden opens a hidden stream attached to file.txt.
To delete files securely for good, the Gutmann algorithm overwrites the data 35 times with carefully chosen patterns.
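As a rough illustration, a multi-pass overwrite can be sketched as follows. This is a simplified sketch, not the actual 35-pass Gutmann pattern sequence; the function name and pass count are hypothetical. Note that on SSDs and copy-on-write filesystems, in-place overwrites do not guarantee the original sectors are destroyed.

```python
import os
import secrets

def overwrite_file(path, passes=3):
    """Overwrite a file's contents in place with random data, then delete it.

    Hypothetical sketch of multi-pass overwriting. The real Gutmann
    algorithm uses 35 specific patterns targeting old drive encodings.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(secrets.token_bytes(size))
            f.flush()
            os.fsync(f.fileno())  # push each pass out to the device
    os.remove(path)
```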
7. Distinct Sector Hashes for Target file detection
Hashing whole files detects file changes.
Hashing individual sectors detects changes within file segments.
The approach is probabilistic, so it does not need to hash the whole drive; sampling a subset of sectors suffices.
Distinct hashes and repeated file patterns were identified using government data sets.
An algorithm based on the statistical urn problem determines how many randomly sampled sectors must be inspected.
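The sector-hashing idea can be sketched as follows. The function names and the 512-byte sector size are assumptions for illustration; real systems store billions of known distinct sector hashes in a database and filter out low-entropy sectors.

```python
import hashlib

SECTOR_SIZE = 512  # assumed sector size; tools commonly use 512 or 4096 bytes

def sector_hashes(data, sector_size=SECTOR_SIZE):
    """Yield the MD5 digest of each fixed-size sector of `data`."""
    for off in range(0, len(data) - sector_size + 1, sector_size):
        yield hashlib.md5(data[off:off + sector_size]).hexdigest()

def find_target(drive, target):
    """Report whether any sector of `target` appears on `drive`.

    Minimal sketch: build the set of the target file's sector hashes,
    then scan the drive sector by sector for a match.
    """
    known = set(sector_hashes(target))
    return any(h in known for h in sector_hashes(drive))
```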
10. Network forensics
Network forensics challenges:
Cloud computing presents challenges that require new tools.
New frontiers in network intrusion, starting from the firewall.
Emerging Network forensic areas:
Social networks
Data mining
Digital imaging and data visualization
11. Applying network forensics in critical infrastructures
Botnets
Wireless networks still lack good forensic tools.
Sinkholes: accept, analyze, and forensically store attack traffic.
13. Smart phone security challenges
Smart phone threat model showing malware spreading from the application layer down to the operating system and hardware layers.
14. Lessons in digital forensics
The challenge of data diversity
1-Processing incomplete or corrupt data.
2-Data that will not validate.
3-Windows inconsistencies.
4-Eliminating data that are consistent.
Data Scale challenges
1-The amount of data.
2-Applying big data solutions to DF.
15. Sub-linear algorithms for reading sectors
Sub-linear algorithms operate by sampling data. Sampling is a powerful technique and can frequently find what is present, but it cannot prove the absence of data: the only way to establish that there are no written sectors on a hard drive is to read every sector.
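A minimal sketch of such a sub-linear sampling read, assuming a fixed 512-byte sector and a drive image small enough to hold in memory. The names are illustrative; note the caveat above, that a sample can show data is present, while only a full scan proves absence.

```python
import random

SECTOR = 512  # assumed sector size

def estimate_nonblank_fraction(drive, samples=1000, seed=0):
    """Estimate the fraction of non-blank (non-zero) sectors by random sampling.

    Sketch of a sub-linear read: only `samples` sectors are examined,
    regardless of drive size.
    """
    rng = random.Random(seed)
    n_sectors = len(drive) // SECTOR
    hits = 0
    for _ in range(samples):
        i = rng.randrange(n_sectors)
        sector = drive[i * SECTOR:(i + 1) * SECTOR]
        if any(sector):  # at least one non-zero byte
            hits += 1
    return hits / samples
```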
16. Temporal diversity: the never-ending upgrade cycle
Many computer users have learned that upgrades can break as much as they fix.
1-Upgrading forensics tools
2-Software versions to be upgraded
3-The EnCase forensics tool
4-Intelligent forensics tools
17. Human capital demands and limitations
1-It was found that users of DF software come overwhelmingly from backgrounds outside computer science.
2-Examiners who have substantial knowledge in one area (e.g. Windows systems) may lack it in others.
3-Developers also need skills such as opcodes, multi-threading, and the organization of processes and operating system data structures.
18. The CSI Effect
In reality, data is hard to recover, especially from hard disks.
Recovering data from hard drives typically involves decoding file system data structures, not the instant results portrayed on television.
Funding problems.
The differences between Windows Explorer and EnCase Forensic.
19. Lessons learned managing a research corpus
This project started in 1998 and has expanded to include files downloaded from US Government web servers, disk images acquired on the secondary market, and other data sources.
20. Corpus management – technical issues
1-Imaging ATA drives
Lesson: read the documentation for the computer that you are using.
Lesson: make the most of the tools that you have and follow the technical innovation
(Because you are dealing with hard disks built with different technologies, the interface and firmware details matter.)
21. 2-Automation as the key to corpus management
A process was needed for capturing each hard disk's make, model, and serial number.
Lesson: automation is key; any process that involves manual record keeping will eventually produce errors.
Lesson: useful data will outlive the system in which it is stored, so make sure the data is kept in a portable format.
22. 3-Evidence file formats (custom container files)
Trying to use his own container files did not work well, and he had to switch to standard container formats.
Lesson: avoid developing new file formats whenever possible.
Lesson: kill your darlings.
4-Crashes from bad drives
Crashes have many causes: overwritten kernel memory, a faulty drive, or buggy drivers, among others.
Lesson: many technical options remain unexplored.
23. 5- Drive failures produce better data
Algorithm 1: developed an algorithm that reads from the drive in large blocks and falls back to sector-by-sector reads when errors occur.
Algorithm 2: developed a dedicated disk imaging program to handle failing drives.
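The fallback strategy of reading large blocks and retrying sector by sector can be sketched roughly as follows. This is an illustration, not the author's actual imaging code; it assumes a hypothetical read(offset, length) callable that raises IOError on bad regions.

```python
SECTOR = 512
BLOCK = 64 * SECTOR  # read in large chunks while the drive cooperates

def image_drive(read, size):
    """Image a failing drive via a hypothetical `read(offset, length)` callable.

    Sketch of the fallback strategy: read large blocks for speed, and on
    error retry the failed block one sector at a time, substituting zeros
    for unreadable sectors so the image keeps its correct offsets.
    """
    out = bytearray()
    off = 0
    while off < size:
        n = min(BLOCK, size - off)
        try:
            out += read(off, n)
        except IOError:
            # Block failed: retry sector by sector.
            for s in range(off, off + n, SECTOR):
                m = min(SECTOR, size - s)
                try:
                    out += read(s, m)
                except IOError:
                    out += b"\x00" * m  # unreadable sector
        off += n
    return bytes(out)
```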
24. Lessons learned
Lesson: drives with some bad sectors invariably have more sensitive information on them.
Lesson: do research, and only maintain software that implements a published approach.
25. 6- Numbering and naming
Algorithm 1: developed an algorithm that generated file names automatically.
Lesson: names must be short enough to be usable but long enough to be unique.
When I started acquiring data outside the US I discovered that the country of origin was a useful part of the name;
a batch number allows different individuals in the same country to assign their own numbers without collision.
Lesson: although it is advantageous to have names that contain no semantic
content, it is significantly easier to work with names that have
some semantic meaning.
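A hypothetical naming scheme along these lines can be sketched as follows; the country-batch-item layout is an assumption for illustration, not the corpus's actual format.

```python
def corpus_name(country, batch, item):
    """Build a corpus item name like 'IN07-0042'.

    Hypothetical scheme illustrating the lesson: short enough to be
    usable, with light semantic content (country of origin and a batch
    number) so different collectors can assign numbers without collisions.
    """
    if len(country) != 2 or not country.isalpha():
        raise ValueError("country must be a 2-letter code")
    return f"{country.upper()}{batch:02d}-{item:04d}"
```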
26. 7- Path names
• Lesson: place access-control information as near to
the root of a path name as possible.
27. 8- Anti-virus and indexing
Lesson: configure anti-virus scanners and other indexing tools to ignore the directories that hold corpus data.
9- Distribution and updates
Lesson: solutions developed by other disciplines for distributing large files rarely work for DF data.
28. Corpus management–policy issues
1- Privacy issues
Lesson: just because something is legal, you may wish to think twice before you do it.
2- Illegal content: financial data, passwords, and copyrighted material
Lesson: never sell access to DF data, even if you have personal ownership.
Lesson: understand Copyright Law before copying other people’s data.
Lesson: make sure your intent is scientific research, not fraud, so that any collection of access credentials cannot be mistaken for misuse.
3- Illegal content: pornography
Lesson: do not give minors access to real DF data; do not intentionally extract pornography from corpus data.
4- Institutional Review Boards
Lesson: While IRBs exist to protect human subjects, many
have expanded their role to protect institutions and experimenters.
Unfortunately this expanded role occasionally decreases the protection afforded to human subjects.
With the IRB watching over you, it's important to watch your back.
29. Lessons learned developing DF tools
1- Platform and language
2- Parallelism and high performance computing
3- All-in-one tools vs. single-use tools
4- Evidence container file formats
30. 1- Platform and language
1- The easiest way to write multi-platform tools is to write command-line tools.
2-Although C has historically been the DF developer's language of choice, safer higher-level languages are increasingly attractive.
3-Java has a reputation for being slow, especially for computationally intensive tasks.
4-While it is easy to write programs in Python, experience to date has shown performance problems on compute-intensive workloads.
31. 2-Parallelism and high performance computing
Moving DF processing to clusters introduces communications bottlenecks, and much of the time the host computer's own processor is better than distributing the work.
32. 3- All-in-one tools vs. single-use tools
My experience argues that it is better to have a single tool than many:
If there are many tools, most investigators will want to have them all. Splitting functionality across tools multiplies packaging and support effort.
Much of what a DF tool does (data ingest, decoding, and enumerating data structures) is common to every task.
There is a finite cost to packaging, distributing, and promoting a tool. When a tool has more features, each additional feature costs less to deliver.
33. 4- Evidence container file formats
1-Tools should be allowed to process inputs in any format and transparently handle disk images in raw and common forensic container formats.
2-With network packets the situation is better, with pcap being the universal format.
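Format detection for pcap can be sketched by checking the file's global-header magic number. The function name is illustrative; real tools also validate the version fields and handle pcapng, which uses a different block structure.

```python
import struct

# pcap global-header magic numbers as read little-endian: native,
# byte-swapped, and the nanosecond-resolution variants.
PCAP_MAGICS = {0xa1b2c3d4, 0xd4c3b2a1, 0xa1b23c4d, 0x4d3cb2a1}

def is_pcap(header):
    """Report whether a byte string begins with a pcap global header."""
    if len(header) < 4:
        return False
    (magic,) = struct.unpack("<I", header[:4])
    return magic in PCAP_MAGICS
```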
35. Conclusion
1-Digital forensics is an exciting area in which to work, but it is exceedingly difficult because of the diversity and scale of the data.
2-These problems are likely to get worse over time, and our only way to survive the coming challenges is through research, automation, and shared tools.
3-In building and maintaining this corpus he encountered many problems that are instructive for the field as a whole.