2. About viaForensics
viaForensics is an innovative digital forensics
and security company providing expert services
to:
• Law Enforcement
• Government Agencies
• Corporations
• Attorneys/Individuals
3. What’s the problem?
• We want to recover as much data from devices
as possible
• People delete data, mostly the data we want!
• SQLite is a very popular data storage format
• Currently no advanced SQLite recovery tool on
the market (but stay tuned)
4. What is SQLite?
• SQLite is a widely used, lightweight database
contained in a single cross-platform file used by
developers for structured data storage
• Used in most smart phones (iPhone, Android,
Symbian, webOS)
• Used in major operating systems and
applications (Apple OS X, Google Chrome and
Chrome OS, Firefox)
5. Why do developers need structured data storage?
• Applications need to store and retrieve data
• In the past, and still today, developers create their own
file formats
• But why reinvent the wheel for basic data
storage?
• SQLite is free, open, high quality and takes care
of the messy details
6. Core SQLite characteristics (from their FAQ)
• Transactions are atomic, consistent, isolated, and durable (ACID)
even after system crashes and power failures.
• Zero-configuration - no setup or administration needed.
• A complete database is stored in a single cross-platform disk file.
• Small code footprint: 190KiB - 325KiB fully configured
• Cross-platform and easy to port to unsupported systems.
• Sources are in the public domain. Use for any purpose.
• Standalone command-line interface (CLI) client
7. SQL = Structured Query Language
• SQL is the language used to interact with many
databases, including SQLite
• Basic functions: Create, Read, Update and
Delete (CRUD)
• Transactions: Start a change and it either
completes in entirety (commit) or not at all
(rollback)
• Very powerful, many variations
8. SQL – basic commands
• SELECT – queries data from a table or tables
– SELECT rowid, address, date, text FROM message;
• INSERT INTO – adds data row to table
– INSERT INTO message VALUES (NULL, '3128781100', 1282844546, 'text message');
• UPDATE – updates data rows in tables
– UPDATE message SET date=1282846291 WHERE rowid=4;
• DELETE – deletes data rows in tables
– DELETE FROM message WHERE rowid=4;
• Many good tutorials online
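The four statements above can be tried safely against a throwaway in-memory database rather than against evidence. The following is a minimal sketch using Python's standard sqlite3 module; the four-column message table is a simplified assumption for illustration, not the full iPhone schema.

```python
import sqlite3

# In-memory database with a simplified, hypothetical 4-column message table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE message ("
    "ROWID INTEGER PRIMARY KEY AUTOINCREMENT, address TEXT, date INTEGER, text TEXT)"
)

# INSERT: NULL for ROWID lets SQLite assign the next auto-increment value.
conn.execute(
    "INSERT INTO message VALUES (NULL, ?, ?, ?)",
    ("3128781100", 1282844546, "text message"),
)

# UPDATE the date, then SELECT to read the row back.
conn.execute("UPDATE message SET date = ? WHERE rowid = ?", (1282846291, 1))
rows_after_update = conn.execute(
    "SELECT rowid, address, date, text FROM message"
).fetchall()

# DELETE removes the row from query results.
conn.execute("DELETE FROM message WHERE rowid = ?", (1,))
rows_after_delete = conn.execute("SELECT rowid FROM message").fetchall()
conn.close()

print(rows_after_update)
print(rows_after_delete)
```

Note that DELETE only guarantees the row no longer appears in query results; what happens to the bytes on disk is a separate question.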
9. Viewing a SQLite database – command line
• Command line apps
– sqlite3 for full SQLite functions
– sqlite3_analyzer for db metadata
• Linux/Mac/Windows versions
• Represents latest version
• Full source code and documentation
• http://www.sqlite.org/download.html
10. Example sqlite3 session
Run sqlite3 on database file
ahoog@linux-wks-001:~/sqlite$ ./sqlite3 iPhone-3G-313-sms.db
SQLite version 3.7.4
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite>
List tables in database
sqlite> .tables
_SqliteDatabaseProperties msg_group
group_member msg_pieces
message
Examine schema (structure) of the message table
sqlite> .schema message
CREATE TABLE message (ROWID INTEGER PRIMARY KEY AUTOINCREMENT, address TEXT, date
INTEGER, text TEXT, flags INTEGER, replace INTEGER, svc_center TEXT, group_id INTEGER,
association_id INTEGER, height INTEGER, UIFlags INTEGER, version INTEGER, subject TEXT,
country TEXT, headers BLOB, recipients BLOB, read INTEGER);
11. Example sqlite3 session - continued
View record “4” in 2 formats
sqlite> .headers on
sqlite> SELECT * FROM message WHERE ROWID = 4;
ROWID|address|date|text|flags|replace|svc_center|group_id|association_id|height|UIFlags
|version|subject|country|headers|recipients|read
4|(312) 898-4070|1282844546|Sure is a nice day out |3|0||3|1282844546|0|4|0||us|||1
sqlite> .mode line
sqlite> SELECT * FROM message WHERE ROWID = 4;
ROWID = 4
address = (312) 898-4070
date = 1282844546
text = Sure is a nice day out
flags = 3
replace = 0
svc_center =
group_id = 3
association_id = 1282844546
height = 0
UIFlags = 4
version = 0
subject =
country = us
headers =
recipients =
read = 1
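The date column in this record is a large integer. Treating it as seconds since 1970-01-01 UTC (a Unix epoch timestamp) is an assumption, but it yields a plausible date for this data set. A short sketch of the conversion in Python:

```python
from datetime import datetime, timezone

# 'date' value of record 4, read as seconds since the Unix epoch (assumption).
raw_date = 1282844546
as_utc = datetime.fromtimestamp(raw_date, tz=timezone.utc)
print(as_utc.isoformat())  # 2010-08-26T17:42:26+00:00
```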
12. sqlite3_analyzer – very useful in forensic analysis
ahoog@linux-wks-001:~/sqlite$ ./sqlite3_analyzer iPhone-3G-313-sms.db
/** Disk-Space Utilization Report For iPhone-3G-313-sms.db
Page size in bytes.................... 2048
Pages in the whole file (measured).... 14
Pages in the whole file (calculated).. 14
Pages that store data................. 13 92.9%
Pages on the freelist (per header).... 0 0.0%
Pages on the freelist (calculated).... 0 0.0%
Pages of auto-vacuum overhead......... 1 7.1%
Number of tables in the database...... 7
Number of indices..................... 4
Number of named indices............... 3
Automatically generated indices....... 1
Size of the file in bytes............. 28672
Bytes of user payload stored.......... 1833 6.4%
*** Page counts for all tables with their indices ********************
MESSAGE............................... 3 21.4%
SQLITE_MASTER......................... 3 21.4%
_SQLITEDATABASEPROPERTIES............. 2 14.3%
MSG_PIECES............................ 2 14.3%
<snip>
13. Viewing a SQLite database – SQLite Database Browser
• Freeware, public domain, open source visual tool used
to create, design and edit database files compatible with
SQLite
• Windows/Linux/Mac
• Support SQLite 3.x
• Last updated 12/2009
• http://sqlitebrowser.sourceforge.net/
• Many other (free) options listed at:
http://www.sqlite.org/cvstrac/wiki?p=ManagementTools
16. SQLite – database header format
• The first 100 bytes of the database file comprise the
database file header.
• First 5 of 22 fields
Offset | Size | Description
0 | 16 | The header string: "SQLite format 3\000"
16 | 2 | The database page size in bytes. Must be a power of two between 512 and 32768 inclusive, or the value 1 representing a page size of 65536.
18 | 1 | File format write version. 1 for legacy; 2 for WAL.
19 | 1 | File format read version. 1 for legacy; 2 for WAL.
20 | 1 | Bytes of unused "reserved" space at the end of each page. Usually 0.
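The five fields listed above can be read with a few lines of Python. This is a minimal sketch: the function name parse_header_prefix and the synthetic sample bytes are illustrative assumptions, and a real tool would read the first 100 bytes from the database file instead.

```python
import struct

def parse_header_prefix(header: bytes) -> dict:
    """Decode the first five fields (21 bytes) of a SQLite database header."""
    magic, raw_page_size, write_ver, read_ver, reserved = struct.unpack(
        ">16sHBBB", header[:21]
    )
    if magic != b"SQLite format 3\x00":
        raise ValueError("not a SQLite 3 database header")
    # A stored value of 1 stands for a page size of 65536.
    page_size = 65536 if raw_page_size == 1 else raw_page_size
    return {
        "page_size": page_size,
        "write_version": write_ver,
        "read_version": read_ver,
        "reserved_bytes": reserved,
    }

# Synthetic header for illustration: 2048-byte pages, legacy journaling,
# no reserved space at the end of each page.
sample = b"SQLite format 3\x00" + struct.pack(">HBBB", 2048, 1, 1, 0)
fields = parse_header_prefix(sample)
print(fields)
```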
17. SQLite – Organized in Pages
• Database consists of one or more pages, logical units which store
data
• Pages are numbered beginning with 1
• A page is one of the following:
Page type | Description
B-Tree page | B-Tree pages are part of the tree structures used to store database tables and indexes.
Overflow page | Overflow pages are used by particularly large database records that do not fit on a single B-Tree page.
Free page | Free pages are pages within the database file that are not being used to store meaningful data. (or so they think!)
Pointer-map page | Part of the auto-vacuum system
Locking page | The single page containing byte offset 1073741824 (2^30); reserved for file locking and never used to store data
18. B+tree and B-Tree formats – on-disk data structure
• Data structure which represents sorted data in a way
that allows for efficient insertion, retrieval and removal of
records
• Optimized for storage devices (vs. in memory) by
minimizing the number of disk accesses.
• In a B+tree, all data is stored in the leaves of the tree
instead of in both the leaves and the intermediate branch
nodes.
• A single B-Tree structure is stored using one or more
database pages. Each page contains a single B-Tree
node.
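Since each page holds one B-Tree node, the first bytes of a page show what kind of node it is and how many cells (records or keys) it holds. The sketch below decodes the fixed part of a page header as described in the SQLite file format documentation; the scratch database and the function name btree_page_header are assumptions made for illustration.

```python
import os
import sqlite3
import struct
import tempfile

PAGE_TYPES = {
    0x02: "interior index",
    0x05: "interior table",
    0x0A: "leaf index",
    0x0D: "leaf table",
}

def btree_page_header(page: bytes, is_page_one: bool = False) -> dict:
    """Decode the first 8 bytes of a b-tree page header."""
    # Page 1 begins with the 100-byte database header; its b-tree header follows.
    start = 100 if is_page_one else 0
    flag, first_freeblock, cell_count, content_start, frag_bytes = struct.unpack(
        ">BHHHB", page[start:start + 8]
    )
    return {
        "type": PAGE_TYPES.get(flag, "not a b-tree page"),
        "first_freeblock": first_freeblock,
        "cell_count": cell_count,
        "content_start": content_start,
        "fragmented_free_bytes": frag_bytes,
    }

# Build a scratch database so the example has a real page to read.
path = os.path.join(tempfile.mkdtemp(), "scratch.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE t (x)")
conn.commit()
conn.close()

with open(path, "rb") as f:
    page_one = f.read(512)  # 512 is the minimum page size; enough for both headers

info = btree_page_header(page_one, is_page_one=True)
print(info["type"], info["cell_count"])
```

Page 1 holds the root of the schema table, so with a single table defined it is a leaf table page containing one cell.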
20. SQLite storage classes and data types
• Only 5 storage classes/data types:
1. NULL: The value is a NULL value.
2. INTEGER: The value is a signed integer, stored in 1, 2, 3, 4, 6, or 8 bytes
depending on the magnitude of the value.
3. REAL: The value is a floating point value, stored as an 8-byte IEEE floating
point number.
4. TEXT: The value is a text string, stored using the database encoding (UTF-8,
UTF-16BE or UTF-16LE).
5. BLOB: The value is a blob of data, stored exactly as it was input. Often
used to store binary data
21. SQLite storage classes – on disk example
• 5 storage classes in hex on disk:
• NULL: 0x00
• INTEGER (4-byte): 0x4c76a782 = 1282844546
• REAL: 0x41B1EC2EC004D9D7 = 300691136.018949
– http://babbage.cs.qc.edu/IEEE-754/64bit.html
• TEXT (ASCII): 0x53757265206973 = Sure is
• BLOB: hard to represent binary here…see Text
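The hex values above can be checked by decoding them the way SQLite stores them: big-endian integers, big-endian IEEE 754 doubles, and encoded text. A short Python sketch, assuming UTF-8 as the database encoding:

```python
import struct

# INTEGER: 4-byte big-endian signed integer.
int_value = int.from_bytes(bytes.fromhex("4c76a782"), "big", signed=True)

# REAL: 8-byte big-endian IEEE 754 double.
(real_value,) = struct.unpack(">d", bytes.fromhex("41B1EC2EC004D9D7"))

# TEXT: raw bytes in the database encoding (UTF-8 assumed here).
text_value = bytes.fromhex("53757265206973").decode("utf-8")

print(int_value, real_value, repr(text_value))
```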
22. Variable Integers – saving space, adding confusion
• A variable-length integer or "varint" uses less space for small positive
values.
• Used in SQLite metadata (row headers, b-tree indexes, etc.)
• A varint is between 1 and 9 bytes in length.
• The varint consists of either zero or more bytes which have the high-order bit
set followed by a single byte with the high-order bit clear, or nine bytes,
whichever is shorter. The lower seven bits of each of the first eight bytes
and all 8 bits of the ninth byte are used to reconstruct the 64-bit
twos-complement integer.
• Varints are big-endian: bits taken from the earlier bytes of the varint are
more significant than bits taken from the later bytes.
• http://www.sqlite.org/fileformat.html#varint_format
• Clear? How about an example ->
23. Variable Integers – example
• Let’s say you find the following hex varint: 0x8CA06F
– Examine each byte: if it is >= 0x80 (high-order bit set), it is not the last byte
– So, we have 3 bytes: 0x8C 0xA0 0x6F (since 0x6F < 0x80 it’s
the last byte). Here’s how to convert:
* MSB: Most significant bit (left most bit)
Original Bytes 0x8C 0xA0 0x6F
Convert to binary 1000 1100 1010 0000 0110 1111
Remove MSB* 000 1100 010 0000 110 1111
Concatenate 000110001000001101111
In hex/decimal Hex: 0x03106F Decimal: 200,815
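The same conversion can be written as a small function. This is a sketch of the varint rules described on the previous slide; the name decode_varint is an illustrative choice.

```python
def decode_varint(data: bytes, offset: int = 0):
    """Return (value, bytes_consumed) for the SQLite varint at data[offset]."""
    value = 0
    for i in range(8):
        byte = data[offset + i]
        # Keep the lower 7 bits; the high-order bit is only a continuation flag.
        value = (value << 7) | (byte & 0x7F)
        if byte < 0x80:
            return value, i + 1
    # A ninth byte contributes all 8 of its bits.
    value = (value << 8) | data[offset + 8]
    if value >= 1 << 63:
        value -= 1 << 64  # interpret the 64 bits as twos-complement
    return value, 9

value, length = decode_varint(bytes.fromhex("8CA06F"))
print(value, length)  # 200815 3
```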
24. Freelist / Free page list
• When information is deleted from the database,
pages containing that data are not in active use.
• Unused pages are stored on the freelist and are
reused when additional pages are required.
• Forensic value: “Freelist leaf pages contain no
information. SQLite avoids reading or writing
freelist leaf pages in order to reduce disk I/O.”
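Whether a database has free pages at all can be read straight from its header: per the SQLite file format documentation, offset 32 holds the page number of the first freelist trunk page and offset 36 the total number of freelist pages. The sketch below builds a scratch database, deletes its rows, and reads both fields; the file name and table contents are assumptions for illustration.

```python
import os
import sqlite3
import struct
import tempfile

def freelist_info(db_path: str):
    """Return (first_trunk_page, freelist_page_count) from the database header."""
    with open(db_path, "rb") as f:
        header = f.read(100)
    # Two big-endian 32-bit integers at header offsets 32 and 36.
    return struct.unpack(">II", header[32:40])

path = os.path.join(tempfile.mkdtemp(), "scratch.db")
conn = sqlite3.connect(path)
# With auto-vacuum off, freed pages stay in the file on the freelist.
conn.execute("PRAGMA auto_vacuum = NONE")
conn.execute("CREATE TABLE message (address TEXT, text TEXT)")
conn.executemany(
    "INSERT INTO message VALUES (?, ?)",
    [("3128781100", "x" * 200)] * 500,
)
conn.commit()
conn.execute("DELETE FROM message")
conn.commit()
conn.close()

first_trunk, free_pages = freelist_info(path)
print(first_trunk, free_pages)
```

A non-zero count means the file still carries pages that are no longer part of any table. Whether their old content survives depends on settings such as secure_delete and on whether the database has been vacuumed.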
25. Rollback journal
• Created when a database is going to be updated
• Before a page is changed, the original unmodified content of that page is
written into the rollback journal.
• The rollback journal is always located in the same
directory as the database file and has the same name as
the database file but with the string "-journal" appended
• Excellent source of forensic data if recoverable
• Recoverable on many systems though some are now
writing to tmpfs/RAM disks
26. Write Ahead Log (WAL)
• New technique just introduced in 3.7.0
• Generally faster and disk I/O is more sequential (which helps us in
advanced recovery)
• All changes to the database are recorded by writing frames into the WAL.
• Transactions commit when a frame is written that contains a commit marker.
• A single WAL can and usually does record multiple transactions.
• Periodically, the content of the WAL is transferred back into the database file
in an operation called a "checkpoint".
• Forensic value: recovery of WAL files
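A recovered WAL file can be walked frame by frame. Per the SQLite file format documentation, the file starts with a 32-byte header, and each frame is a 24-byte frame header followed by one page of data. The sketch below parses only those headers; it runs on synthetic bytes built for illustration, and it skips the salt and checksum validation a real recovery tool would need.

```python
import struct

WAL_MAGICS = (0x377F0682, 0x377F0683)  # the two valid WAL magic numbers
WAL_HEADER_SIZE = 32
FRAME_HEADER_SIZE = 24

def parse_wal_header(wal: bytes) -> dict:
    """Decode the 32-byte header at the start of a WAL file."""
    (magic, version, page_size, checkpoint_seq,
     salt1, salt2, _cksum1, _cksum2) = struct.unpack(">8I", wal[:WAL_HEADER_SIZE])
    if magic not in WAL_MAGICS:
        raise ValueError("not a WAL header")
    return {"version": version, "page_size": page_size,
            "checkpoint_seq": checkpoint_seq, "salts": (salt1, salt2)}

def list_frames(wal: bytes, page_size: int) -> list:
    """Return (page_number, is_commit) for every complete frame in the WAL."""
    frames = []
    offset = WAL_HEADER_SIZE
    frame_size = FRAME_HEADER_SIZE + page_size
    while offset + frame_size <= len(wal):
        page_no, db_size_after_commit = struct.unpack(">II", wal[offset:offset + 8])
        # A non-zero database size marks the frame that commits a transaction.
        frames.append((page_no, db_size_after_commit != 0))
        offset += frame_size
    return frames

# Synthetic WAL (not real evidence): 512-byte pages, two frames, second is a commit.
page = bytes(512)
sample_wal = (
    struct.pack(">8I", 0x377F0682, 3007000, 512, 0, 1, 2, 0, 0)
    + struct.pack(">6I", 2, 0, 1, 2, 0, 0) + page
    + struct.pack(">6I", 3, 3, 1, 2, 0, 0) + page
)
header = parse_wal_header(sample_wal)
frames = list_frames(sample_wal, header["page_size"])
print(header["page_size"], frames)
```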
27. Record Format
• A record contains a header and a body, in that order. The
header:
– begins with a single varint which determines the total number of
bytes in the header. The varint value is the size of the header in
bytes including the size varint itself.
– Following the size varint are one or more additional varints, one
per column. These additional varints are called "serial type"
numbers and determine the datatype of each column
– After the final header varint, the record data immediately follows
– In a table b-tree leaf cell, two varints come immediately before the
record header: the payload size, then the rowid (here the auto-increment
integer assigned by the system)
28. Record Format – visual representation
• http://www.sqlite.org/fileformat.html#record_format
29. Record Format
Header value | Data type and size
0 | An SQL NULL value (type SQLITE_NULL). This value consumes zero bytes of space in the record's data area.
1 | An SQL integer value (type SQLITE_INTEGER), stored as a big-endian 1-byte signed integer.
2 | An SQL integer value (type SQLITE_INTEGER), stored as a big-endian 2-byte signed integer.
3 | An SQL integer value (type SQLITE_INTEGER), stored as a big-endian 3-byte signed integer.
4 | An SQL integer value (type SQLITE_INTEGER), stored as a big-endian 4-byte signed integer.
5 | An SQL integer value (type SQLITE_INTEGER), stored as a big-endian 6-byte signed integer.
6 | An SQL integer value (type SQLITE_INTEGER), stored as a big-endian 8-byte signed integer.
7 | An SQL real value (type SQLITE_FLOAT), stored as an 8-byte IEEE floating point value.
8 | The literal SQL integer 0 (type SQLITE_INTEGER). The value consumes zero bytes of space in the record's data area. Values of this type are only present in databases with a schema file format (the 32-bit integer at byte offset 44 of the database header) value of 4 or greater. (iOS4 uses this)
9 | The literal SQL integer 1 (type SQLITE_INTEGER). The value consumes zero bytes of space in the record's data area. Values of this type are only present in databases with a schema file format (the 32-bit integer at byte offset 44 of the database header) value of 4 or greater. (iOS4 uses this)
10,11 | Not used. Reserved for expansion.
bytes * 2 + 12 | Even values greater than or equal to 12 are used to signify a blob of data (type SQLITE_BLOB) (n-12)/2 bytes in length, where n is the integer value stored in the record header.
bytes * 2 + 13 | Odd values greater than 12 are used to signify a string (type SQLITE_TEXT) (n-13)/2 bytes in length, where n is the integer value stored in the record header.
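The table above maps directly onto a lookup function. The following sketch returns the storage class and the number of body bytes for a given serial type; the function name serial_type_info is an illustrative choice.

```python
def serial_type_info(serial_type: int):
    """Return (storage_class, body_size_in_bytes) for a record-header serial type."""
    fixed = {
        0: ("NULL", 0),
        1: ("INTEGER", 1),
        2: ("INTEGER", 2),
        3: ("INTEGER", 3),
        4: ("INTEGER", 4),
        5: ("INTEGER", 6),
        6: ("INTEGER", 8),
        7: ("REAL", 8),
        8: ("INTEGER", 0),  # literal 0, no body bytes
        9: ("INTEGER", 0),  # literal 1, no body bytes
    }
    if serial_type in fixed:
        return fixed[serial_type]
    if serial_type < 0 or serial_type in (10, 11):
        raise ValueError("reserved or invalid serial type")
    if serial_type % 2 == 0:
        return ("BLOB", (serial_type - 12) // 2)
    return ("TEXT", (serial_type - 13) // 2)

print(serial_type_info(0x29))  # ('TEXT', 14)
```

For example, a header byte of 0x29 (41) describes a 14-byte string, which matches a phone number formatted like "(312) 898-4070".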
30. Recovery from allocated SQLite with strings
ahoog@linux-wks-001:~/sqlite$ strings iPhone-3G-313-sms.db | less
<snip>
msg_group
(314) 267-6611us
(920) 277-1869us
(312) 898-4070us
(312) 401-1679us
(414) 331-5030us
Piece of cake! Can't wait to try em out on Sunday
text/plain
2text_0002.txt
image/jpeg
1IMG_6807.jpg?
Check out mccalister
text/plain
2text_0002.txt
image/jpeg
1IMG_6807.jpg
<snip>
32. Carving SQLite files – OS specific findings
• iOS
– Good recovery of both allocated and “latent”
SQLite files
• Android
– Excellent recovery but high repetition due to
log-structured file system repeating SQLite
header
• Other common file systems
– Good recovery from typical magnetic media
device running FAT, FAT32, NTFS, HFS, etc.
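Header-based carving starts by locating the 16-byte SQLite header string in a raw image. The sketch below is a simplified illustration, not a description of any particular carving tool: the 512-byte alignment filter is an assumption (files usually start on sector boundaries), and the sample image is synthetic.

```python
MAGIC = b"SQLite format 3\x00"

def find_sqlite_headers(image: bytes, alignment: int = 512) -> list:
    """Return offsets of candidate SQLite headers in a raw image.

    Only hits on an `alignment` boundary are kept; pass alignment=1 to keep all.
    """
    hits = []
    pos = image.find(MAGIC)
    while pos != -1:
        if pos % alignment == 0:
            hits.append(pos)
        pos = image.find(MAGIC, pos + 1)
    return hits

# Synthetic image: one sector-aligned header at 1024 and one unaligned at 1540.
image = bytes(1024) + MAGIC + bytes(500) + MAGIC
aligned_hits = find_sqlite_headers(image)
all_hits = find_sqlite_headers(image, alignment=1)
print(aligned_hits, all_hits)
```

On a log-structured file system the same header can appear many times, so a list of hits like this typically needs de-duplication before the candidate files are extracted.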
33. SQLite in Hex (really the only way to look at it)
0002270: 0000 0000 0000 0000 004d 0d12 0029 0445 .........M...).E
0002280: 0101 0001 0401 0101 0011 0000 0128 3331 .............(31
0002290: 3229 2038 3938 2d34 3037 304c 77d8 a257 2) 898-4070Lw..W
00022a0: 696c 6c20 796f 7520 676f 2067 6574 206d ill you go get m
00022b0: 6520 6120 636f 6666 6565 3f03 0003 4c77 e a coffee?...Lw
00022c0: d8a2 0000 0075 7301 3f0c 1200 2904 2901 .....us.?...).).
Name | Type | Header byte | Converted | Body value / notes
Payload size | Varint | 0x4d | 77 | Precedes the rowid; header (18 bytes) + body (59 bytes) = 77
Rowid – actual | Varint | 0x0d | 13 | So rowid = 13
Header Size | Varint | 0x12 | 18 | Length of header is 18 bytes (header size byte + 17 columns)
Rowid | NULL | 0x00 | 0 | NULL tells SQLite on insert to determine next auto increment
Address | Text | 0x29 | (41 - 13)/2 = 14 | (312) 898-4070 [14 chars - convert 0x29 to decimal, calc size]
Date | Integer | 0x04 | 4-byte integer | 0x4c77d8a2 in decimal is 1282922658 [recognize number format?]
Text | Text | 0x45 | (69 - 13)/2 = 28 | Will you go get me a coffee?
Flags | Integer | 0x01 | 1-byte integer | 0x03 = 3
Replace | Integer | 0x01 | 1-byte integer | 0x00 = 0
Svc_center | Text | 0x00 | NULL | No value, not represented in data at all
Group_id | Integer | 0x01 | 1-byte integer | 0x03 = 3
Association_id | Integer | 0x04 | 4-byte integer | 0x4c77d8a2 in decimal is 1282922658 [recognize number format?]
Height | Integer | 0x01 | 1-byte integer | 0x00 = 0
UIFlags | Integer | 0x01 | 1-byte integer | 0x00 = 0
Version | Integer | 0x01 | 1-byte integer | 0x00 = 0
Subject | Text | 0x00 | NULL | No value, not represented in data at all
Country | Text | 0x11 | (17 - 13)/2 = 2 | us
Headers | Blob | 0x00 | NULL | No value, not represented in data at all
Recipients | Blob | 0x00 | NULL | No value, not represented in data at all
Read | Integer | 0x01 | 1-byte integer | 0x01 = 1 [Last data byte]
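The manual walkthrough above can be reproduced in code. The sketch below takes the 79 bytes of the cell from the hex dump (starting at the 0x4d byte) and decodes the payload size, rowid, header and column values. Function names are illustrative, and TEXT is assumed to be UTF-8.

```python
import struct

def decode_varint(data, pos):
    """Return (value, next_pos) for the varint starting at data[pos]."""
    value = 0
    for i in range(8):
        byte = data[pos + i]
        value = (value << 7) | (byte & 0x7F)
        if byte < 0x80:
            return value, pos + i + 1
    return (value << 8) | data[pos + 8], pos + 9

def decode_value(serial_type, data, pos):
    """Return (value, next_pos) for one column body."""
    if serial_type == 0:
        return None, pos
    if serial_type in (8, 9):
        return serial_type - 8, pos
    if 1 <= serial_type <= 6:
        size = (1, 2, 3, 4, 6, 8)[serial_type - 1]
        return int.from_bytes(data[pos:pos + size], "big", signed=True), pos + size
    if serial_type == 7:
        return struct.unpack(">d", data[pos:pos + 8])[0], pos + 8
    if serial_type >= 12:
        size = (serial_type - 12) // 2
        raw = data[pos:pos + size]
        # Odd serial types are TEXT (UTF-8 assumed), even ones are BLOB.
        return (raw.decode("utf-8") if serial_type % 2 else raw), pos + size
    raise ValueError("reserved serial type")

def decode_cell(cell):
    """Decode a table leaf cell: payload size, rowid, then the record."""
    payload_len, pos = decode_varint(cell, 0)
    rowid, pos = decode_varint(cell, pos)
    header_start = pos
    header_len, pos = decode_varint(cell, pos)
    serial_types = []
    while pos < header_start + header_len:
        serial_type, pos = decode_varint(cell, pos)
        serial_types.append(serial_type)
    values = []
    for serial_type in serial_types:
        value, pos = decode_value(serial_type, cell, pos)
        values.append(value)
    return payload_len, rowid, values

# Bytes copied from the hex dump above, from offset 0x2279 through 0x22c7.
cell = bytes.fromhex(
    "4d0d1200290445"
    "01010001040101010011000001283331"
    "3229203839382d343037304c77d8a257"
    "696c6c20796f7520676f20676574206d"
    "65206120636f666665653f0300034c77"
    "d8a2000000757301"
)
payload_len, rowid, values = decode_cell(cell)
print(payload_len, rowid, values)
```

The first column comes back as None because the ROWID column is stored as NULL in the record; the real rowid (13) is the varint in front of the header.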
34. Advanced Technique
• Use the well-defined SQLite structure to develop a program
to recover SQLite rows
• Row header and data values “decay” over time due to
– Being (partially) re-allocated
– Fragmentation
• We compensated for this with a simple probability engine that
estimates the likelihood that a sequence of bytes is the row header
we are interested in
• The underlying file system can have a great impact, whether FAT,
HFSplus (iPhone) or YAFFS2 (Android)
• Look for journal files and WAL data too
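One way to picture the scoring idea is a function that tests how well the bytes at a given offset fit the record header expected for a known table. The sketch below is a hypothetical illustration only, not the engine described above: the scoring rule (fraction of columns whose serial type matches the schema, with NULL always allowed) and all names are assumptions.

```python
# Expected storage class per column, taken from the message table schema.
EXPECTED = [
    "INTEGER", "TEXT", "INTEGER", "TEXT", "INTEGER", "INTEGER", "TEXT",
    "INTEGER", "INTEGER", "INTEGER", "INTEGER", "INTEGER", "TEXT", "TEXT",
    "BLOB", "BLOB", "INTEGER",
]

def decode_varint(data, pos):
    """Return (value, next_pos); raises IndexError if the data runs out."""
    value = 0
    for i in range(8):
        byte = data[pos + i]
        value = (value << 7) | (byte & 0x7F)
        if byte < 0x80:
            return value, pos + i + 1
    return (value << 8) | data[pos + 8], pos + 9

def storage_class(serial_type):
    if serial_type == 0:
        return "NULL"
    if serial_type in (1, 2, 3, 4, 5, 6, 8, 9):
        return "INTEGER"
    if serial_type == 7:
        return "REAL"
    if serial_type >= 12:
        return "TEXT" if serial_type % 2 else "BLOB"
    return "RESERVED"

def header_score(data, offset, expected=EXPECTED):
    """Hypothetical score in [0, 1] for 'a message record header starts here'."""
    try:
        header_len, pos = decode_varint(data, offset)
        end = offset + header_len
        classes = []
        while pos < end:
            serial_type, pos = decode_varint(data, pos)
            classes.append(storage_class(serial_type))
    except IndexError:
        return 0.0
    # The header must end exactly where its size says and cover every column.
    if pos != end or len(classes) != len(expected):
        return 0.0
    matches = sum(1 for got, want in zip(classes, expected) if got in (want, "NULL"))
    return matches / len(expected)

# Record header bytes from the hex walkthrough on the earlier slide.
good = bytes.fromhex("12 00 29 04 45 01 01 00 01 04 01 01 01 00 11 00 00 01")
print(header_score(good, 0), header_score(bytes(32), 0))
```

A scanner could slide this function across unallocated space and keep offsets that score above a chosen threshold, then attempt to decode the body that follows each candidate header.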
35. Contact Us
Andrew Hoog, CIO
ahoog@viaforensics.com
http://viaforensics.com
1000 Lake St, Suite 203
Oak Park, IL 60301
Tel: 312-878-1100 | Fax: 312-268-7281