Watching the Detectives
Using digital forensic techniques to
  investigate the digital persona

                    Gareth Knight
                Centre for e-Research,
Anatomy Museum, King’s College London, 8th November 2011
Overview

• Introduction to digital forensics
  • How is it used in law enforcement
  • How can it be used for research and digital
    curation
• Forensic practices
  • Media imaging
  • Hash filtering
  • Data carving
• Current/future challenges
Origin of Digital Forensics
• Emerged in 1980s as a response to increasing use of
  electronic devices for criminal activity.
• Practitioner-led approach - a set of methods applied to
  gather, retrieve and analyse potential evidence held
  on digital devices
• Emphasis upon “scientifically derived and proven
  methods” to obtain, analyse & report upon digital
  evidence (Digital Forensics Research Workshop,
  2001)
• Legal acceptability influenced by Daubert Standard:
  •   methods must be tested,
  •   Subject to peer review and publication,
  •   Possess a known error rate,
  •   Subject to standards governing their application
Intelligence gathering in law-enforcement
• Role in legal Disclosure (UK)/e-discovery (US) to obtain
  data designated as evidence in legal investigation.
• Broad intelligence gathering activities: develop & test
  hypothesis
• Several intelligence cycles developed to model
  investigation process

[Figure: Robert Clark’s target-centric approach]
[Figure: Peter Pirolli and Stuart Card sense making loop]
Value for digital archiving and research
Increasing amount of digital information:
Analysis of research activities
  •   When did an author create a notable work?
  •   What tools did they use?
  •   What sources did they consult?
  •   Is there evidence of material they abandoned?
Business function
  •   Staff have their machine appraised prior to leaving
      institution/finishing a project to identify data of long-term
      value not held elsewhere
Example: Salman Rushdie Archive
  •   Emulation of several Apple Macs owned by the author
  •   http://www.emory.edu/home/academics/libraries/salman-rushdie.html
Digital Forensics workflow

Forensic activities, as described by Digital Forensics Research Workshop (2001)



Preservation      Collection   Validation   Identification     Analysis   Interpretation   Documentation   Presentation




               Acquisition                                   Analysis                         Reporting
Data Acquisition
       Act of obtaining possession of digital data for subsequent analysis.
        Commonly achieved through creation of a disk image or clone that
                  provides a bit-for-bit copy of the disk.

  [Figure: a 60GB hard disk is captured as 1 or more files that add up to 60GB]

  Motivation for creating a disk image in forensic environment:
  1.   Backup copy avoids risk of media failure or other damage during use
  2.   Avoids risk of making inadvertent, unrecoverable change to the
       primary copy
       •      Files can be created/modified/deleted through access to disk
  3.   Enables analysis using methods and tools that are not
       possible/available in the original environment (e.g. emulation, text
       mining)
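
The imaging step can be illustrated with a minimal Python sketch. It is not one of the capture tools named in this talk: the function names (image_device, sha256_of) are hypothetical, it has no handling for unreadable sectors, and it assumes the source is attached through a write blocker so that reading it cannot alter the primary copy.

```python
import hashlib


def image_device(source, image_path, chunk_size=1024 * 1024):
    """Copy `source` byte-for-byte into `image_path`.

    Returns the SHA-256 digest of the bytes that were read, so the
    image can be verified later. `source` may be a device node or an
    ordinary file (hypothetical helper, not a real tool's API).
    """
    digest = hashlib.sha256()
    with open(source, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()


def sha256_of(path, chunk_size=1024 * 1024):
    """Re-read a file and return its SHA-256 digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing the digest returned by image_device with sha256_of(image_path) confirms that the stored image matches what was read from the source.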
Forensic Utility Belt
(1) Capture software
            Stored on bootable media (floppy, CD, USB)
            Examples: Dc3dd, DDRescue, OSFClone, FTK Imager

(2) Write Blocker
            Prevents OS writing to connected devices
            E.g. USB plug-through unit

(3) Access Devices
            Drive enclosure allows use of internal disks via USB
            Kryoflux USB disk controller allows low level disk access

(4) Destination Media
            Digital media on which the disk image will be written,
            e.g. USB hard disk
Key Questions to be addressed

            1.       What type of media do you want to
                     capture?
                 •     Floppy disk, hard disk, optical media
            2.       How can the data be accessed?
                 •     Hard disk installed within user’s computer
                 •     Accessed using appropriate reader (USB
                       hard disk caddy, floppy disk reader,
                       CD/DVD reader)
                 •     Network connected disk
            3.       Where will the acquired image be
                     stored?
                 •     External USB disk,
                 •     Network device over Ethernet/Serial, etc.
            4.       What software should you use to
                     capture the disk image?

[Figures: Different Hardware; Different Media]
Data Analysis
     Content held on digital media serves many purposes:
        •   Operating system files, e.g. Windows has 30,000+ after fresh install
        •   Software: Applications, utilities, games, etc.
        •   Log data: Windows Registry, browser cache, cookies, temp files
        •   User-generated content: Documents, images, sound, emails, etc.

     Different data layers available:
        1. Active data: Information readily available as normally seen by an OS

        2. Inactive/residual data: Information that has been deleted or modified
            •   Deleted files located in unallocated space that have yet to be overwritten
                (retrieved using undelete application)
            •   Data fragments that contain information from a partially deleted file
                (retrieved through carving)

            Inactive data useful, but need to consider ethical issues



Locating active files
Common techniques for locating user content:
•  Navigate directory structure to get a ‘feel’ for data
   files held on disk
•  Search by:
    •   File name, e.g. *report*
    •   File type, e.g. *.doc, *.pdf, etc.
    •   Creation/modification date
    •   Content type, e.g. word usage
    •   File size
•   Additional parameters configurable
Windows search is easy to perform, but does not identify
    everything, and the investigation process itself can leave
    artefacts behind, e.g. thumbs.db
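
As a rough illustration of the search criteria listed above, the following Python sketch filters a directory tree by file name pattern, minimum size and modification date. The function name find_files and its parameters are hypothetical. It assumes it is run against a read-only mounted copy of the disk image rather than the original disk, so that the search itself leaves no artefacts behind.

```python
from datetime import datetime
from fnmatch import fnmatch
from pathlib import Path


def find_files(root, name_pattern="*", min_size=0, modified_after=None):
    """Return files under `root` that satisfy every given criterion.

    name_pattern   -- shell-style pattern, e.g. "*report*" or "*.doc"
                      (matched case-insensitively)
    min_size       -- smallest file size, in bytes, to include
    modified_after -- naive local datetime; older files are skipped
    """
    matches = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        info = path.stat()
        if not fnmatch(path.name.lower(), name_pattern.lower()):
            continue
        if info.st_size < min_size:
            continue
        if modified_after is not None:
            if datetime.fromtimestamp(info.st_mtime) < modified_after:
                continue
        matches.append(path)
    return sorted(matches)
```

For example, find_files("/mnt/image", "*report*") would list every file whose name contains "report", where /mnt/image is an assumed mount point.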
Case Management Tools
          Common interface for analysing drive
          without content change
          Commercial: FTK, OSForensics
          OSS: Sleuthkit/Autopsy, Digital
          Forensics Framework, PyFlag


          Provide tools to sort/visualise data by:
             •   Name,
             •   Folder,
             •   Size,
             •   Type,
             •   Creation/Modification date
             •   Hash set
Identifying user data using
          checksums
• Checksum algorithm applied to a file
  generates a distinct (possibly unique)
  alphanumeric value
• Many different types of checksum algorithm




• Commonly used to check for
  accidental/deliberate data change/corruption
  • Generate checksum on October 1st
  • Generate checksum on October 14th & compare
    to Oct 1st value – are they the same?
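
The two-date comparison above can be sketched in Python with the standard hashlib module. The helper names (file_checksum, is_unchanged) are illustrative only, and SHA-256 is used as an assumed default; any algorithm that hashlib supports could be substituted.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Return the hex checksum of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_checksum, algorithm="sha256"):
    """Compare today's checksum with one recorded on an earlier date."""
    return file_checksum(path, algorithm) == recorded_checksum.lower()
```

The value recorded on the first date is stored alongside the file; on the second date is_unchanged returns False if even a single byte has changed.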
Hash filtering / Exclusion Hashing

• Technique to identify data files obtained from
  different sources
  • Calculate checksum (e.g. MD5, SHA-1) of one or
    more files
  • Compare each checksum against a checksum
    database indicating files known to originate from a
    third party
Checksum types
  • ‘known good’ - Files that perform a legitimate
    purpose, e.g. Operating System, application.
  • ‘known bad’ - Files that denote viruses, Trojans,
    cracker's tools, or other malicious files
  • Unknown – Files that have not been previously
    encountered.
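
A minimal Python sketch of the filtering step follows. It assumes the known-good and known-bad checksums have already been loaded into two sets of lowercase SHA-1 hex strings; parsing a real hash database such as the NSRL Reference Data Set is not shown, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha1_of(path, chunk_size=65536):
    """Return the SHA-1 hex digest of a file."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify_files(root, known_good, known_bad):
    """Sort every file under `root` into one of three categories.

    known_good / known_bad -- sets of lowercase SHA-1 hex digests
    (assumed to come from a hash database loaded elsewhere).
    """
    result = {"known good": [], "known bad": [], "unknown": []}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        checksum = sha1_of(path)
        if checksum in known_bad:
            result["known bad"].append(path)
        elif checksum in known_good:
            result["known good"].append(path)
        else:
            result["unknown"].append(path)
    return result
```

The "unknown" list is the one of most interest to a curator, since it holds the files that may be user created.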
Hash datasets – Information
                Sources
NIST National Software Reference Library (NSRL):
    • Checksums of legitimate files generated from software products
      obtained through purchase/donation.
    • Stores 10,000+ software files.
    • Reference Data Set published every 3 months & available through 3rd
      parties, such as Find-a-Hash

HashKeeper - National Drug Intelligence Center
    • Checksums gathered through criminal investigation.
    • Academic (and other) institutions must file a FoI request to gain
      access to software and database.

Online File Signature Database (OFSDB):
    • Subscription based system dependent upon user contribution.
    • Full access available through subscription of 25 USD per year

•   Currently being used by curators/archivists to distinguish between
    known third-party and potential user created files.
Practical Example
60GB hard disk: 9,698 known files, 12,974 unknown files

[Figure, left: Windows 2000 files that match the NSRL database]
[Figure, right: Unknown files that may be user created content]

  Method may be combined with other techniques, e.g. path and filename
  analysis to exclude other common files (e.g. thumbs.db)
Recovering deleted data
• Data files continue to exist in full or in part for some
  time after deletion
   • The list of disk clusters occupied by the file is relabelled as
     ‘unallocated’, i.e. available for use.
Recovering complete files
• Files may be recovered if the space has not been
  allocated to new data. Recovery software
  may be used to recreate the pointer to files
  that still exist
  • Likelihood of retrieving entire file
    decreases over time
(Data/File) Carving
 “File carving is the process of
recovering computer files from a
storage medium without the use of
the standard file-system metadata
that is typically used during a normal
file retrieval.”
http://www.techheadsitconsulting.com/f/file-carving.html



Useful for data recovery when:
     • The File system ‘pointer’ (directory
       entry) to the file has been deleted or
       corrupted.
     • Sectors allocated to data file have
       been partially overwritten
Carving Techniques
• Block-based carving
• Header/Footer Carving
• Header/Maximum (file) size Carving
• Header/Embedded length Carving
• Statistical Carving
• Semantic Carving
• Fragment Recovery Carving
• Repackaging Carving
• Fuzzy hashing Carving

http://www.forensicswiki.org/wiki/File_Carving
Header/Footer Carving

Analyse file to identify data sequences that
 match a known filetype header & footer


          Header                          Footer
   GIF    \x47\x49\x46\x38\x37\x61        \x00\x3b
   JPG    \xff\xd8\xff\xe0\x00\x10        \xff\xd9
   ZIP    PK\x03\x04                      \x3c\xac

Sample header/footer information used by Scalpel to identify files
Other carving methods
•   Header/Maximum (file size) Carving: Match header of known
    file type and extract data in sequence until a specified file size
    (e.g. 10MB) has been reached.

•   Header/Embedded Length carving: Technique for carving
    formats that store total size(length) in header, e.g. BMP, PDF,
    AVI

•   File structure based/Deep carving: Use documentation on file
    type structure to carve files

•   Smart Carving: Use documentation on file system’s data
    handling to address disk fragmentation issues
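
Header/embedded length carving can be illustrated with BMP, which stores the total file size as a 4-byte little-endian integer immediately after the 2-byte "BM" signature. The Python sketch below is a simplified illustration: the function name and the max_size sanity limit are assumptions, and it does not validate the rest of the BMP structure.

```python
import struct


def carve_bmp(data, max_size=50 * 1024 * 1024):
    """Return (offset, bytes) for each plausible BMP file in `data`.

    The length is read from the header itself, so no footer is needed.
    """
    carved = []
    pos = data.find(b"BM")
    while pos != -1:
        if pos + 6 <= len(data):
            # Bytes 2-5 of the file header hold the total file size.
            (length,) = struct.unpack_from("<I", data, pos + 2)
            # 14 bytes is the size of the BMP file header alone.
            if 14 <= length <= max_size and pos + length <= len(data):
                carved.append((pos, data[pos:pos + length]))
                pos = data.find(b"BM", pos + length)
                continue
        # Implausible length: treat as a false hit and keep looking.
        pos = data.find(b"BM", pos + 1)
    return carved
```

The length check is what separates a genuine header from the two letters "BM" occurring by chance in unrelated data.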
Data Carving tool capabilities
          A disk containing 20 deleted files (five 100 KB text files, five 5 MB
          JPEGs, five 90 MB WMV videos and five 300 MB AVI videos; file sizes
          approximate) is imaged and stored as RAW/DD

     1.   PhotoRec recovered all texts and JPGs. 3 AVIs were recovered in
          entirety, 2 were incomplete (but partially playable).
     2.   Scalpel – Recovered all JPGs and 3 incomplete (but partially
          playable) AVIs. Did not extract WMV or txt
     3.   MagicRescue – Only recovers files it has a ‘recipe’ for (JPG, AVI,
          but not txt or WMV) – recovered JPGs, but not AVI. Did not attempt
          other formats.
     4.   Foremost - unable to recover any files

     Planned Carver 2.0 may provide intelligent carving
     http://www.forensicswiki.org/wiki/Carver_2.0_Planning_Pag


Real world Experience
         Laptop containing 60GB hard disk in use for 6-7 years
•Able to extract 363 legitimate files,
but….

    • Disk fragmentation a big problem!

    • Data carving can take a loooonnng
      time – potentially weeks or months
      to perform in full

    • Software instability

    • Data carving requires a lot of disk
      space to store extracted data files

    • Large number of false positives
      (fake files) produced

    • Filestreams (e.g. images within
      container) often extracted, but not
      larger file (PowerPoint)

[Figure: Examples of Incomplete & invalid data files]
Timeline visualisation
          Chronological list of activities performed
          on the host machine

          Uses:
             • Gain understanding of research
               activities on machine
             • Investigate a specific incident

          • Traditionally concerned with file
            creation/access/modification times
          • SuperTimeline tools being developed
            that merge time data from multiple
            sources.

             • OSS Timescanner useful for
               generating log of events
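
A basic file-system timeline can be sketched in Python from the timestamps the operating system exposes. This is only the traditional single-source case, not a SuperTimeline. The function name is hypothetical, and the meaning of st_ctime is platform-dependent: it is the creation time on Windows but the last metadata change on Unix-like systems, so it is labelled "ctime" here.

```python
from datetime import datetime, timezone
from pathlib import Path


def build_timeline(root):
    """Return a chronologically sorted list of (time, label, path) events."""
    events = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        info = path.stat()
        stamps = (
            ("modified", info.st_mtime),
            ("accessed", info.st_atime),
            ("ctime", info.st_ctime),
        )
        for label, stamp in stamps:
            when = datetime.fromtimestamp(stamp, tz=timezone.utc)
            events.append((when, label, str(path)))
    events.sort()
    return events
```

Running this against a mounted, read-only copy of the image avoids updating the access times that the timeline is meant to record.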
Text Mining
Java characterisation tool (AQUA)
• Uses Apache Tika to obtain information about a file collection
  and its textual content
• Relative path, file name, size, modified date, SHA-256 digest,
  MIME type
• Word frequency of the generated Lucene index

Stanford MUSE Java tool
Mailbox analysis
• Relationships - Grouping of contacts
• Name lists (people, places, organizations)
• Sentiment analysis using word lists, mapped over time

AQUA http://wiki.opf-labs.org/display/AQuA/Characterising+Externally+Generated+Content
Stanford Muse http://vis.stanford.edu/papers/muse
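
The word-frequency idea can be shown with a small Python stand-in. It does not use Apache Tika or Lucene and is not the AQUA tool: it assumes the collection has already been reduced to plain-text files, and the function name and simple tokenising rule are illustrative only.

```python
import re
from collections import Counter
from pathlib import Path


def word_frequencies(root, pattern="*.txt", top=10):
    """Return the `top` most common words across matching text files."""
    counts = Counter()
    for path in Path(root).rglob(pattern):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="ignore")
            # Lowercase, then keep runs of letters and apostrophes.
            counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts.most_common(top)
```

Filtering out common stop words before counting would usually make the result more informative about the subject matter of a collection.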
Conclusion (1): Challenges for use of
         digital forensics in research
     Expertise of the researcher
       •   Some technical expertise req. to perform acquisition and
           analysis
     Ethics of a forensic investigation
       •   User may not realise that deleted content or scraps of content
           continue to exist - how do you communicate intent to your
           user community?
       •   Terminology is currently influenced by law enforcement
           community and is a barrier to wider use – forensics?
           Suspect?
     Capabilities of the tools
       •   No single tool is appropriate – require a combination of
           different ones
       •   Some integration is necessary to simplify process.

Conclusion (2):
             Current/Future challenges
     Multi-user systems
       • Distinguishing between data created by multiple users on
         same machine is time-consuming - requires analysis of
         timestamps and other features.
     Archiving data on 3rd party services:
       • Ethical issues associated with accessing & archiving user data
         on mail servers, Second Life, and cloud providers etc.
     Diverse device & media types:
       • Solid State devices subject to ‘wear levelling’ which purges
         inactive data
         (http://www.jdfsl.org/subscriptions/abstracts/abstract-v5n3-bell.htm)
       • Use of portable (personal/work) devices in the workplace, e.g.
         iPad, iPhone, Android devices – what is the master copy?



References
Digital Forensics and Born-Digital Content in Cultural Heritage
   Collections (2010)
http://www.clir.org/pubs/abstract/pub149abst.html
Performance Evaluation of Open-Source Disk Imaging Tools for
   Collecting Digital Evidence
http://www.kuis.edu.my/ictconf/proceedings/353_integration2010_proceedi
    ngs.pdf
The Evolution of File Carving (2009)
http://digital-assembly.com/technology/research/pubs/ieee-spm-2009.pdf
Hash Filtering techniques
http://computer-forensics.sans.org/blog/2010/02/22/extracting-known-bad-
    hashset-from-nsrl/
Digital Forensic tutorials http://computer-forensics.sans.org/blog/
Open Source Forensics http://www2.opensourceforensics.org/
Forensics Wiki http://www.forensicswiki.org/wiki/Main_Page

Más contenido relacionado

La actualidad más candente

Computer forensics
Computer  forensicsComputer  forensics
Computer forensicsLalit Garg
 
Digital forensics
Digital forensics Digital forensics
Digital forensics vishnuv43
 
Analysis of digital evidence
Analysis of digital evidenceAnalysis of digital evidence
Analysis of digital evidencerakesh mishra
 
Digital Forensics
Digital ForensicsDigital Forensics
Digital ForensicsOldsun
 
Computer forensics and its role
Computer forensics and its roleComputer forensics and its role
Computer forensics and its roleSudeshna Basak
 
Role of a Forensic Investigator
Role of a Forensic InvestigatorRole of a Forensic Investigator
Role of a Forensic InvestigatorAgape Inc
 
Introduction to computer forensic
Introduction to computer forensicIntroduction to computer forensic
Introduction to computer forensicOnline
 
The Future of Digital Forensics
The Future of Digital ForensicsThe Future of Digital Forensics
The Future of Digital Forensics00heights
 
Lecture 4,5, 6 comp forensics 19 9-2018 basic security
Lecture 4,5, 6 comp forensics 19 9-2018 basic securityLecture 4,5, 6 comp forensics 19 9-2018 basic security
Lecture 4,5, 6 comp forensics 19 9-2018 basic securityAlchemist095
 
Computer +forensics
Computer +forensicsComputer +forensics
Computer +forensicsRahul Baghla
 
Lecture 9 and 10 comp forensics 09 10-18 file system
Lecture 9 and 10 comp forensics 09 10-18 file systemLecture 9 and 10 comp forensics 09 10-18 file system
Lecture 9 and 10 comp forensics 09 10-18 file systemAlchemist095
 
Computer forensic 101 - OWASP Khartoum
Computer forensic 101 - OWASP KhartoumComputer forensic 101 - OWASP Khartoum
Computer forensic 101 - OWASP KhartoumOWASP Khartoum
 
Digital Forensics best practices with the use of open source tools and admiss...
Digital Forensics best practices with the use of open source tools and admiss...Digital Forensics best practices with the use of open source tools and admiss...
Digital Forensics best practices with the use of open source tools and admiss...Sagar Rahurkar
 

La actualidad más candente (20)

Computer forensics
Computer  forensicsComputer  forensics
Computer forensics
 
Digital forensics
Digital forensics Digital forensics
Digital forensics
 
Computer forensics
Computer forensicsComputer forensics
Computer forensics
 
Analysis of digital evidence
Analysis of digital evidenceAnalysis of digital evidence
Analysis of digital evidence
 
Digital forensic tools
Digital forensic toolsDigital forensic tools
Digital forensic tools
 
Digital Forensics
Digital ForensicsDigital Forensics
Digital Forensics
 
Computer forensics and its role
Computer forensics and its roleComputer forensics and its role
Computer forensics and its role
 
Intro to cyber forensics
Intro to cyber forensicsIntro to cyber forensics
Intro to cyber forensics
 
Digital Forensics
Digital ForensicsDigital Forensics
Digital Forensics
 
DF Process Models
DF Process ModelsDF Process Models
DF Process Models
 
Role of a Forensic Investigator
Role of a Forensic InvestigatorRole of a Forensic Investigator
Role of a Forensic Investigator
 
Introduction to computer forensic
Introduction to computer forensicIntroduction to computer forensic
Introduction to computer forensic
 
Digital Forensic
Digital ForensicDigital Forensic
Digital Forensic
 
The Future of Digital Forensics
The Future of Digital ForensicsThe Future of Digital Forensics
The Future of Digital Forensics
 
Lecture 4,5, 6 comp forensics 19 9-2018 basic security
Lecture 4,5, 6 comp forensics 19 9-2018 basic securityLecture 4,5, 6 comp forensics 19 9-2018 basic security
Lecture 4,5, 6 comp forensics 19 9-2018 basic security
 
Computer +forensics
Computer +forensicsComputer +forensics
Computer +forensics
 
Lecture 9 and 10 comp forensics 09 10-18 file system
Lecture 9 and 10 comp forensics 09 10-18 file systemLecture 9 and 10 comp forensics 09 10-18 file system
Lecture 9 and 10 comp forensics 09 10-18 file system
 
Current Forensic Tools
Current Forensic Tools Current Forensic Tools
Current Forensic Tools
 
Computer forensic 101 - OWASP Khartoum
Computer forensic 101 - OWASP KhartoumComputer forensic 101 - OWASP Khartoum
Computer forensic 101 - OWASP Khartoum
 
Digital Forensics best practices with the use of open source tools and admiss...
Digital Forensics best practices with the use of open source tools and admiss...Digital Forensics best practices with the use of open source tools and admiss...
Digital Forensics best practices with the use of open source tools and admiss...
 

Destacado

Chfi V3 Module 01 Computer Forensics In Todays World
Chfi V3 Module 01 Computer Forensics In Todays WorldChfi V3 Module 01 Computer Forensics In Todays World
Chfi V3 Module 01 Computer Forensics In Todays Worldgueste0d962
 
Legal aspects of handling cyber frauds
Legal aspects of handling cyber fraudsLegal aspects of handling cyber frauds
Legal aspects of handling cyber fraudsSagar Rahurkar
 
Cyberwar poster english
Cyberwar poster englishCyberwar poster english
Cyberwar poster englishAbbas Badran
 
02 Types of Computer Forensics Technology - Notes
02 Types of Computer Forensics Technology - Notes02 Types of Computer Forensics Technology - Notes
02 Types of Computer Forensics Technology - NotesKranthi
 
Computer forensic ppt
Computer forensic pptComputer forensic ppt
Computer forensic pptPriya Manik
 

Destacado (7)

Digital forensics
Digital forensicsDigital forensics
Digital forensics
 
Chfi V3 Module 01 Computer Forensics In Todays World
Chfi V3 Module 01 Computer Forensics In Todays WorldChfi V3 Module 01 Computer Forensics In Todays World
Chfi V3 Module 01 Computer Forensics In Todays World
 
computer forensics
computer forensicscomputer forensics
computer forensics
 
Legal aspects of handling cyber frauds
Legal aspects of handling cyber fraudsLegal aspects of handling cyber frauds
Legal aspects of handling cyber frauds
 
Cyberwar poster english
Cyberwar poster englishCyberwar poster english
Cyberwar poster english
 
02 Types of Computer Forensics Technology - Notes
02 Types of Computer Forensics Technology - Notes02 Types of Computer Forensics Technology - Notes
02 Types of Computer Forensics Technology - Notes
 
Computer forensic ppt
Computer forensic pptComputer forensic ppt
Computer forensic ppt
 

Similar a Watching the Detectives: Using digital forensics techniques to investigate the digital persona

SCA Accessioning Born-Digital Materials Workshop, Nov. 8, 2012
SCA Accessioning Born-Digital Materials Workshop, Nov. 8, 2012SCA Accessioning Born-Digital Materials Workshop, Nov. 8, 2012
SCA Accessioning Born-Digital Materials Workshop, Nov. 8, 2012peterchanws
 
Accessioning Born-Digital Materials
Accessioning Born-Digital MaterialsAccessioning Born-Digital Materials
Accessioning Born-Digital Materialspeterchanws
 
Digital Forensics in the Archive
Digital Forensics in the ArchiveDigital Forensics in the Archive
Digital Forensics in the ArchiveGarethKnight
 
Computer Forensic Tools.pptx
Computer Forensic Tools.pptxComputer Forensic Tools.pptx
Computer Forensic Tools.pptxKomalNagre4
 
Workshop 2 revised
Workshop 2 revisedWorkshop 2 revised
Workshop 2 revisedpeterchanws
 
University of Bath Research Data Management training for researchers
University of Bath Research Data Management training for researchersUniversity of Bath Research Data Management training for researchers
University of Bath Research Data Management training for researchersJez Cope
 
Computer forensics libin
Computer forensics   libinComputer forensics   libin
Computer forensics libinlibinp
 
Beauty of open source in cyber forensics
Beauty of open source in cyber forensicsBeauty of open source in cyber forensics
Beauty of open source in cyber forensicssaddamhusain hadimani
 
Group project linux helix
Group project linux helixGroup project linux helix
Group project linux helixJeff Carroll
 
computer forensic tools-Hardware & Software tools
computer forensic tools-Hardware & Software toolscomputer forensic tools-Hardware & Software tools
computer forensic tools-Hardware & Software toolsN.Jagadish Kumar
 
Analytics with unified file and object
Analytics with unified file and object Analytics with unified file and object
Analytics with unified file and object Sandeep Patil
 
Preserving and recovering digital evidence
Preserving and recovering digital evidencePreserving and recovering digital evidence
Preserving and recovering digital evidenceOnline
 
Memory Forensic: Investigating Memory Artefact
Memory Forensic: Investigating Memory ArtefactMemory Forensic: Investigating Memory Artefact
Memory Forensic: Investigating Memory ArtefactSatria Ady Pradana
 
Memory Forensic - Investigating Memory Artefact
Memory Forensic - Investigating Memory ArtefactMemory Forensic - Investigating Memory Artefact
Memory Forensic - Investigating Memory ArtefactSatria Ady Pradana
 
AntiForensics - Leveraging OS and File System Artifacts.pdf
AntiForensics - Leveraging OS and File System Artifacts.pdfAntiForensics - Leveraging OS and File System Artifacts.pdf
AntiForensics - Leveraging OS and File System Artifacts.pdfekobelasting
 

Similar a Watching the Detectives: Using digital forensics techniques to investigate the digital persona (20)

SCA Accessioning Born-Digital Materials Workshop, Nov. 8, 2012
SCA Accessioning Born-Digital Materials Workshop, Nov. 8, 2012SCA Accessioning Born-Digital Materials Workshop, Nov. 8, 2012
SCA Accessioning Born-Digital Materials Workshop, Nov. 8, 2012
 
Accessioning Born-Digital Materials
Accessioning Born-Digital MaterialsAccessioning Born-Digital Materials
Accessioning Born-Digital Materials
 
Digital Forensics in the Archive
Digital Forensics in the ArchiveDigital Forensics in the Archive
Digital Forensics in the Archive
 
Computer Forensic Tools.pptx
Computer Forensic Tools.pptxComputer Forensic Tools.pptx
Computer Forensic Tools.pptx
 
Autopsy Digital forensics tool
Autopsy Digital forensics toolAutopsy Digital forensics tool
Autopsy Digital forensics tool
 
Workshop 2 revised
Workshop 2 revisedWorkshop 2 revised
Workshop 2 revised
 
Latest presentation
Latest presentationLatest presentation
Latest presentation
 
University of Bath Research Data Management training for researchers
University of Bath Research Data Management training for researchersUniversity of Bath Research Data Management training for researchers
University of Bath Research Data Management training for researchers
 
Computer forensics libin
Computer forensics   libinComputer forensics   libin
Computer forensics libin
 
Beauty of open source in cyber forensics
Beauty of open source in cyber forensicsBeauty of open source in cyber forensics
Beauty of open source in cyber forensics
 
Group project linux helix
Group project linux helixGroup project linux helix
Group project linux helix
 
3871778
38717783871778
3871778
 
Digital forensics
Digital forensicsDigital forensics
Digital forensics
 
I say emulate
I say emulateI say emulate
I say emulate
 
computer forensic tools-Hardware & Software tools
computer forensic tools-Hardware & Software toolscomputer forensic tools-Hardware & Software tools
computer forensic tools-Hardware & Software tools
 
Analytics with unified file and object
Analytics with unified file and object Analytics with unified file and object
Analytics with unified file and object
 
Preserving and recovering digital evidence
Preserving and recovering digital evidencePreserving and recovering digital evidence
Preserving and recovering digital evidence
 
Memory Forensic: Investigating Memory Artefact
Memory Forensic: Investigating Memory ArtefactMemory Forensic: Investigating Memory Artefact
Memory Forensic: Investigating Memory Artefact
 
Memory Forensic - Investigating Memory Artefact
Memory Forensic - Investigating Memory ArtefactMemory Forensic - Investigating Memory Artefact
Memory Forensic - Investigating Memory Artefact
 
AntiForensics - Leveraging OS and File System Artifacts.pdf
AntiForensics - Leveraging OS and File System Artifacts.pdfAntiForensics - Leveraging OS and File System Artifacts.pdf
AntiForensics - Leveraging OS and File System Artifacts.pdf
 

Más de GarethKnight

Supporting Open Science in Research
Supporting Open Science in ResearchSupporting Open Science in Research
Supporting Open Science in ResearchGarethKnight
 
Making Sense of a Digital Collection
Making Sense of a Digital CollectionMaking Sense of a Digital Collection
Making Sense of a Digital CollectionGarethKnight
 
Building Sustainability: Preserving research data without breaking the bank
Building Sustainability: Preserving research data without breaking the bankBuilding Sustainability: Preserving research data without breaking the bank
Watching the Detectives: Using digital forensics techniques to investigate the digital persona

  • 1. Watching the Detectives Using digital forensic techniques to investigate the digital persona Gareth Knight Centre for e-Research, Anatomy Museum, King’s College London, 8th November 2011
  • 2. Overview • Introduction to digital forensics • How is it used in law enforcement • How can it be used for research and digital curation • Forensic practices • Media imaging • Hash filtering • Data carving • Current/future challenges
  • 3. Origin of Digital Forensics • Emerged in 1980s as a response to increasing use of electronic devices for criminal activity. • Practitioner-led approach - a set of methods applied to gather, retrieve and analyse potential evidence held on digital devices • Emphasis upon “scientifically derived and proven methods” to obtain, analyse & report upon digital evidence (Digital Forensics Research Workshop, 2001) • Legal acceptability influenced by Daubert Standard: • Methods must be tested, • Subject to peer review and publication, • Possess a known error rate, • Subject to standards governing their application
  • 4. Intelligence gathering in law-enforcement • Role in legal Disclosure (UK)/e-discovery (US) to obtain data designated as evidence in legal investigation. • Broad intelligence gathering activities: develop & test hypothesis • Several intelligence cycles developed to model investigation process (Figures: Robert Clark’s target-centric approach; Peter Pirolli and Stuart Card sense making loop)
  • 5. Value for digital archiving and research Increasing amount of digital information: Analysis of research activities • When did an author create a notable work? • What tools did they use? • What sources did they consult? • Is there evidence of material they abandoned? Business function • Staff have their machine appraised prior to leaving institution/finishing a project to identify data of long-term value not held elsewhere (Figure: Salman Rushdie Archive - Emulation of several Apple Macs owned by the author, http://www.emory.edu/home/academics/libraries/salman-rushdie.html)
  • 6. Digital Forensics workflow Forensic activities, as described by Digital Forensics Research Workshop (2001): Preservation, Collection, Validation, Identification, Analysis, Interpretation, Documentation, Presentation. Phases shown: Acquisition, Analysis, Reporting
  • 7. Data Acquisition Act of obtaining possession of digital data for subsequent analysis. Commonly achieved through creation of a disk image or clone that provides a bit-level copy of the disk, e.g. a 60GB hard disk becomes 1 or more files that add up to 60GB. Motivation for creating a disk image in forensic environment: 1. Backup copy avoids risk of media failure or other damage during use 2. Avoids risk of making inadvertent, unrecoverable change to the primary copy • Files can be created/modified/deleted through access to disk 3. Enables analysis using methods and tools that are not possible/available in the original environment (e.g. emulation, text mining)
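A minimal sketch of the bit-copy idea, assuming the source is readable as an ordinary file (for example a device node opened behind a write blocker). The function name `image_and_verify` and its parameters are illustrative and do not come from any of the capture tools named in the slides; it copies the source in chunks and hashes both sides so the copy can be checked against the original.

```python
import hashlib


def image_and_verify(source_path, image_path, chunk_size=1024 * 1024):
    """Copy source to image byte for byte; return (source_digest, image_digest)."""
    source_hash = hashlib.sha256()
    # Read the source once, hashing each chunk as it is written to the image.
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            source_hash.update(chunk)
            dst.write(chunk)

    # Re-read the finished image so the comparison reflects what is on disk.
    image_hash = hashlib.sha256()
    with open(image_path, "rb") as img:
        for chunk in iter(lambda: img.read(chunk_size), b""):
            image_hash.update(chunk)

    return source_hash.hexdigest(), image_hash.hexdigest()
```

If the two digests differ, the image should not be treated as a faithful copy. Dedicated imaging tools add features this sketch omits, such as bad-sector handling and split output files.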
  • 8. Forensic Utility Belt (1) Capture software: Stored on bootable media (floppy, CD, USB). Examples: Dc3dd, DDRescue, OSFClone, FTK Imager (2) Write Blocker: Prevents OS writing to connected devices, e.g. USB plug-through unit (3) Access Devices: Drive enclosure allows use of internal disks via USB; Kryoflux USB disk controller allows low level disk access (4) Destination Media: Digital media on which the disk image will be written, e.g. USB hard disk
  • 9. Key Questions to be addressed 1. What type of media do you want to capture? • Floppy disk, hard disk, optical media 2. How can the data be accessed? • Hard disk installed within user’s computer • Accessed using appropriate reader (USB hard disk caddy, floppy disk reader, CD/DVD reader) • Network connected disk 3. Where will the acquired image be stored? • External USB disk • Network device over Ethernet/Serial, etc. 4. What software should you use to capture the disk image? (Figures: Different Hardware; Different Media)
  • 10. Data Analysis Content held on digital media serves many purposes: • Operating system files, e.g. Windows has 30,000+ after fresh install • Software: Applications, utilities, games, etc. • Log data: Windows Registry, browser cache, cookies, temp files • User-generated content: Documents, images, sound, emails, etc. Different data layers available: 1. Active data: Information readily available as normally seen by an OS 2. Inactive/residual data: Information that has been deleted or modified • Deleted files located in unallocated space that have yet to be overwritten (retrieved using undelete application) • Data fragments that contain information from a partially deleted file (retrieved through carving) Inactive data useful, but need to consider ethical issues
  • 11. Locating active files Common techniques for locating user content: • Navigate directory structure to get a ‘feel’ for data files held on disk • Search by: • File name, e.g. *report* • File type, e.g. *.doc, *.pdf, etc. • Creation/modification date • Content type, e.g. word usage • File size • Additional parameters configurable Windows search easy to perform, but does not identify everything – investigation process can leave artefacts, e.g. thumbs.db behind
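The search criteria above can be sketched as a single filter over a directory tree. This is an illustration only: `find_files` and its parameters are hypothetical names, and it assumes the disk image has already been mounted read-only so that browsing does not alter the evidence.

```python
from fnmatch import fnmatch
from pathlib import Path


def find_files(root, name_pattern="*", min_size=0, modified_after=0.0):
    """Return files under root matching a name pattern, minimum size and date.

    name_pattern uses shell-style wildcards (e.g. "*report*" or "*.doc");
    modified_after is a POSIX timestamp in seconds.
    """
    matches = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        info = path.stat()
        if (fnmatch(path.name.lower(), name_pattern.lower())
                and info.st_size >= min_size
                and info.st_mtime >= modified_after):
            matches.append(path)
    return sorted(matches)
```

For example, `find_files("/mnt/image", "*report*")` would list every file whose name contains "report", regardless of case; the mount point is a placeholder.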
  • 12. Case Management Tools Common interface for analysing drive without content change Commercial: FTK, OSForensics OSS: Sleuthkit/Autopsy, Digital Forensics Framework, PyFlag Provide tools to sort/visualise data by: • Name, • Folder, • Size, • Type, • Creation/Modification date • Hash set
  • 13. Identifying user data using checksums • Checksum algorithm applied to a file generates a distinct (possibly unique) alphanumeric value • Many different types of checksum algorithm • Commonly used to check for accidental/deliberate data change/corruption • Generate checksum on October 1st • Generate checksum on October 14th & compare to Oct 1st value – are they the same?
  • 14. Hash filtering / Exclusion Hashing • Technique to identify data files obtained from different sources • Calculate checksum (e.g. MD5, SHA-1) of one or more files • Compare each checksum against a checksum database indicating files known to originate from a third party Checksum types • ‘Known good’ - Files that perform a legitimate purpose, e.g. Operating System, application. • ‘Known bad’ - Files that denote viruses, Trojans, cracker's tools, or other malicious files • ‘Unknown’ - Files that have not been previously encountered.
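A minimal sketch of hash filtering, assuming the reference hash sets have already been loaded into two Python sets of lowercase MD5 hex strings. Loading a real data set such as the NSRL is not shown, and `hash_filter` and `md5_of` are illustrative names rather than functions from any forensic tool.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_filter(root, known_good, known_bad):
    """Sort every file under root into known good / known bad / unknown."""
    results = {"known good": [], "known bad": [], "unknown": []}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        checksum = md5_of(path)
        if checksum in known_bad:
            label = "known bad"
        elif checksum in known_good:
            label = "known good"
        else:
            label = "unknown"  # candidate user-created content
        results[label].append(path.name)
    return results
```

The "unknown" bucket is the one of interest to a curator: it holds the files that did not match any third-party checksum and so may have been created by the user.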
  • 15. Hash datasets – Information Sources NIST National Software Reference Library (NSRL): • Checksums of legitimate files generated from software products obtained through purchase/donation. • Stores 10,000+ software files. • Reference Data Set published every 3 months & available through 3rd parties, such as Find-a-Hash HashKeeper - National Drug Intelligence Center • Checksums gathered through criminal investigation. • Academic (and other) institutions must file a FoI request to gain access to software and database. Online File Signature Database (OFSDB): • Subscription based system dependent upon user contribution. • Full access available through subscription of 25 USD per year • Currently being used by curators/archivists to distinguish between known third-party and potential user created files.
  • 16. Practical Example 60GB hard disk: 9,698 known files, 12,974 unknown files • Known: Windows 2000 files that match the NSRL database • Unknown: files that may be user created content Method may be combined with other techniques, e.g. path and filename analysis to exclude other common files (e.g. thumbs.db)
  • 17. Recovering deleted data • Data files continue to exist in full or in part for some time after deletion • The list of disk clusters occupied by the file is relabelled as ‘unallocated’, i.e. available for use. Recovering complete files • Files may be recovered if the space has not been allocated to new data: recovery software may be used to recreate the pointer to files that still exist • Likelihood of retrieving entire file decreases over time
  • 18. (Data/File) Carving  “File carving is the process of recovering computer files from a storage medium without the use of the standard file-system metadata that is typically used during a normal file retrieval.” http://www.techheadsitconsulting.com/f/file-carving.html Useful for data recovery when: • The File system ‘pointer’ (directory entry) to the file has been deleted or corrupted. • Sectors allocated to data file have been partially overwritten
  • 19. Carving Techniques • Block-based carving • Header/Footer Carving • Header/Maximum (file) size Carving • Header/Embedded length Carving • Statistical Carving • Semantic Carving • Fragment Recovery Carving • Repackaging Carving • Fuzzy hashing Carving http://www.forensicswiki.org/wiki/File_Carving
  • 20. Header/Footer Carving Analyse file to identify data sequences that match a known filetype header & footer. Sample header/footer information used by Scalpel to identify files:
      Type  Header                     Footer
      GIF   \x47\x49\x46\x38\x37\x61   \x00\x3b
      JPG   \xff\xd8\xff\xe0\x00\x10   \xff\xd9
      ZIP   PK\x03\x04                 \x3c\xac
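Header/footer carving can be sketched in a few lines. This is a simplified illustration, not Scalpel's implementation: the function name `carve_header_footer` and the `max_size` cut-off are assumptions, the image is held in memory as a `bytes` object, and fragmented files are not handled. The JPEG signature bytes are the ones listed in the Scalpel sample above.

```python
JPEG_HEADER = b"\xff\xd8\xff\xe0\x00\x10"
JPEG_FOOTER = b"\xff\xd9"


def carve_header_footer(data, header, footer, max_size=10 * 1024 * 1024):
    """Return every byte run that starts with header and ends with footer.

    The footer search is limited to max_size bytes after the header so that
    a stray header does not swallow the rest of the image.
    """
    carved = []
    start = data.find(header)
    while start != -1:
        end = data.find(footer, start + len(header), start + max_size)
        if end != -1:
            carved.append(data[start:end + len(footer)])
            # Continue searching after the file just carved.
            start = data.find(header, end + len(footer))
        else:
            # No footer in range: skip this header and look for the next.
            start = data.find(header, start + 1)
    return carved
```

Because only the signatures are checked, anything that happens to contain both byte sequences is carved, which is one source of the false positives that carving tools produce.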
  • 21. Other carving methods • Header/Maximum (file size) Carving: Match header of known file type and extract data in sequence until a specified file size (e.g. 10MB) has been reached. • Header/Embedded Length carving: Technique for carving formats that store total size(length) in header, e.g. BMP, PDF, AVI • File structure based/Deep carving: Use documentation on file type structure to carve files • Smart Carving: Use documentation on file system’s data handling to address disk fragmentation issues
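Header/embedded length carving can be illustrated with BMP, whose file header begins with the two bytes `BM` followed by the total file size as a 4-byte little-endian integer. The sketch below is illustrative only: `carve_bmp` is a hypothetical name, the image is assumed to fit in memory, and no check is made that the carved bytes form a valid picture.

```python
import struct

BMP_MAGIC = b"BM"
BMP_FILE_HEADER_SIZE = 14  # smallest plausible value for the size field


def carve_bmp(data):
    """Carve BMP candidates using the file size stored in each header."""
    carved = []
    pos = data.find(BMP_MAGIC)
    while pos != -1:
        if pos + 6 <= len(data):
            # Bytes 2..5 of the header hold the total file size.
            (size,) = struct.unpack_from("<I", data, pos + 2)
            if BMP_FILE_HEADER_SIZE <= size <= len(data) - pos:
                carved.append(data[pos:pos + size])
                pos = data.find(BMP_MAGIC, pos + size)
                continue
        # Implausible or truncated size field: treat as a false hit.
        pos = data.find(BMP_MAGIC, pos + 1)
    return carved
```

Unlike header/footer carving, no end marker is needed: the length read from the header decides where the file stops.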
  • 22. Data Carving tool capabilities A disk containing 20 deleted files (five 100KB text files, five 5MB JPEGs, five 90MB WMV videos and five 300MB AVI videos; file sizes approximate) is imaged and stored as RAW/DD 1. PhotoRec - Recovered all text files and JPGs. 3 AVIs were recovered in their entirety, 2 were incomplete (but partially playable). 2. Scalpel - Recovered all JPGs and 3 incomplete (but partially playable) AVIs. Did not extract WMV or txt 3. MagicRescue - Only recovers files it has a ‘recipe’ for (JPG, AVI, but not txt or WMV). Recovered JPGs, but not AVI. Did not attempt other formats. 4. Foremost - Unable to recover any files Planned Carver 2.0 may provide intelligent carving http://www.forensicswiki.org/wiki/Carver_2.0_Planning_Pag
  • 23. Real world Experience Laptop containing 60GB hard disk in use for 6-7 years • Able to extract 363 legitimate files, but: • Disk fragmentation a big problem! • Data carving can take a long time, potentially weeks or months to perform in full • Software instability • Data carving requires a lot of disk space to store extracted data files • Large number of false positives (fake files) produced • Filestreams (e.g. images within container) often extracted, but not the larger file (PowerPoint) (Figure: Examples of incomplete & invalid data files)
  • 24. Timeline visualisation Chronological list of activities performed on the host machine Uses: • Gain understanding of research activities on machine • Investigate a specific incident •Traditionally concerned with File creation/accessed/modification •SuperTimeline tools being developed that merge time data from multiple sources. • OSS Timescanner useful for generating log of events
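A basic file-system timeline can be sketched from the timestamps the operating system already records. This is an illustration under stated assumptions: `build_timeline` is a hypothetical name, the image is assumed to be mounted read-only, and only modification and access times are used because creation time is not reported consistently across platforms.

```python
import os
from datetime import datetime, timezone


def build_timeline(root):
    """Return (ISO timestamp, event, path) tuples in chronological order."""
    events = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            info = os.stat(path)
            events.append((info.st_mtime, "modified", path))
            events.append((info.st_atime, "accessed", path))
    events.sort()  # earliest event first
    return [
        (datetime.fromtimestamp(ts, tz=timezone.utc).isoformat(), kind, path)
        for ts, kind, path in events
    ]
```

A super-timeline would merge further sources, such as application logs, into the same sorted list.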
  • 25. Text Mining Java characterisation tool (AQUA) • Uses Apache Tika to obtain information about a file collection and its textual content • Relative path, file name, size, modified date, SHA-256 digest, MIME type • Word frequency of the generated Lucene index Stanford MUSE Java tool Mailbox analysis • Relationships - Grouping of contacts • Name lists (people, places, organizations) • Sentiment analysis using word lists, mapped over time AQUA http://wiki.opf-labs.org/display/AQuA/Characterising+Externally+Generated+Content Stanford Muse http://vis.stanford.edu/papers/muse
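The word-frequency step can be illustrated with the standard library alone. This is not how AQUA or MUSE work internally (the slides describe Java tools built on Tika and Lucene); it is a simplified sketch that assumes the text has already been extracted from the files, and `word_frequency` is a hypothetical name.

```python
import re
from collections import Counter


def word_frequency(texts, top=5):
    """Count lowercase word occurrences across an iterable of strings."""
    counts = Counter()
    for text in texts:
        # Treat runs of letters and apostrophes as words.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts.most_common(top)
```

In practice a stop-word list would be applied first so that common words do not dominate the result.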
  • 26. Conclusion (1): Challenges for use of digital forensics in research Expertise of the researcher • Some technical expertise required to perform acquisition and analysis Ethics of a forensic investigation • User may not realise that deleted content or scraps of content continue to exist - how do you communicate intent to your user community? • Terminology is currently influenced by law enforcement community and is a barrier to wider use - forensics? Suspect? Capabilities of the tools • No single tool is appropriate; a combination of different ones is required • Some integration is necessary to simplify process.
  • 27. Conclusion (2): Current/Future challenges Multi-user systems • Distinguishing between data created by multiple users on same machine is time-consuming - requires analysis of timestamps and other features. Archiving data on 3rd party services: • Ethical issues associated with accessing & archiving user data on mail servers, Second Life, cloud providers, etc. Diverse device & media types: • Solid State devices subject to ‘wear levelling’ which purges inactive data (http://www.jdfsl.org/subscriptions/abstracts/abstract-v5n3-bell.htm) • Use of portable (personal/work) devices in the workplace, e.g. iPad, iPhone, Android devices - what is the master copy?
  • 28. References Digital Forensics and Born-Digital Content in Cultural Heritage Collections (2010) http://www.clir.org/pubs/abstract/pub149abst.html Performance Evaluation of Open-Source Disk Imaging Tools for Collecting Digital Evidence http://www.kuis.edu.my/ictconf/proceedings/353_integration2010_proceedings.pdf The Evolution of File Carving (2009) http://digital-assembly.com/technology/research/pubs/ieee-spm-2009.pdf Hash Filtering techniques http://computer-forensics.sans.org/blog/2010/02/22/extracting-known-bad-hashset-from-nsrl/ Digital Forensic tutorials http://computer-forensics.sans.org/blog/ Open Source Forensics http://www2.opensourceforensics.org/ Forensics Wiki http://www.forensicswiki.org/wiki/Main_Page