This document discusses strategies for managing large volumes of data from varied sources. It recommends identifying the data that must be retained while securely disposing of the rest, and argues that technology-assisted review can be more effective and efficient than exhaustive manual review. Search techniques (exact, Boolean, conceptual, algorithmic, and seed-set-based) are examined in terms of their advantages and limitations for finding responsive documents within data sets. Finally, the use of sampling and statistics to estimate the number of responsive documents and to support the defensibility of the review process is covered.
2. Myriad Sources
• Employee sources (internal)
• Enterprise data sources (external, managed)
• Cloud sources (external): Gmail, Google Docs
3. The End Game? To Retain What’s Needed
• Know what you need to keep
• Employ the right resources to find it
– The right tools
– The right expertise
– Deployed effectively against diverse sources
• Securely dispose of the rest
4. “Overall, the myth that exhaustive manual review is the most effective – and therefore, the most defensible – approach to document review is strongly refuted. Technology-assisted review can (and does) yield more accurate results than exhaustive manual review, with much lower effort.”

Search found “superior to manual reviews” (Richmond Journal of Law and Technology, 2011)

Maura R. Grossman & Gordon V. Cormack, “Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review,” XVII RICH. J.L. & TECH. 11 (2011), http://jolt.richmond.edu/v17i3/article11.pdf, p. 48.
5. Search Results Vary

[Scatter plot: precision (y-axis, 0.0–1.0) vs. recall (x-axis, 0.0–1.0) for NIST TREC Legal Track Interactive Task results, 2008–2010, with reference points for keyword search (Blair & Maron, 1985) and manual review (Grossman & Cormack, 2011).]

(Sponsored by the National Institute of Standards and Technology; TREC Legal Track, http://trec-legal.umiacs.umd.edu)
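The precision and recall axes of the plot are the standard retrieval metrics. A minimal Python sketch of the two definitions, using illustrative counts that are not taken from the TREC results:

```python
# Precision and recall from a confusion-matrix view of a review.
# The counts below are invented for illustration only.
def precision(true_pos: int, false_pos: int) -> float:
    """Fraction of retrieved documents that are actually responsive."""
    return true_pos / (true_pos + false_pos)

def recall(true_pos: int, false_neg: int) -> float:
    """Fraction of all responsive documents that were retrieved."""
    return true_pos / (true_pos + false_neg)

# A search that retrieves 400 documents, 300 of them responsive,
# out of 1,000 responsive documents in the collection:
print(precision(300, 100))  # 0.75
print(recall(300, 700))     # 0.3
```

A search can score high on one axis and low on the other, which is exactly the spread the TREC plot shows.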
6. Search Is Run on an Index

Token        | Locations
-------------|---------------------------
action       | 3:1; 24:10; 45:112
all          | 3:5; 4; 23
accountants  | 2:2; 41:33
business     | 2:3; 4:56
conferences  | 3:12; 7:1; 88:5; 95:1
date         | 1:1; 4:1; 5:3; 8:13
dec          | 1:3; 155:9

The same search query can return different results depending on the tool:
• Google
• Exact search
• Algorithmic search
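An inverted index like the token-to-locations table above can be sketched in a few lines of Python. The sample documents and the (doc, position) convention here are illustrative assumptions:

```python
from collections import defaultdict

# Minimal inverted index: maps each token to (doc_id, position) pairs,
# mirroring the "Token -> Locations" table on the slide.
def build_index(docs: dict[int, str]) -> dict[str, list[tuple[int, int]]]:
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for pos, token in enumerate(text.lower().split(), start=1):
            index[token].append((doc_id, pos))
    return dict(index)

docs = {
    1: "date of action",
    2: "accountants business date",
}
index = build_index(docs)
print(index["date"])  # [(1, 1), (2, 3)]
```

A query engine then consults this table instead of scanning raw text, which is why different tools with different indexing choices (tokenization, stemming, stop words) return different results for the same query.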
7. Exact Search (Boolean, Rule-Based, Modeling Linguistic Patterns)

Target documents: common cold

Query and document terms: virus, cough!, fever, congest!, loss w/3 appetite, allergies, sneez!, smoking, flu, computers, traffic, malaise, sore throat, runny nose

Properties of exact search:
o known
o adjustable
o over-inclusive – anchor
o under-inclusive – add
8. Exact Search (Rule-Based, Modeling Linguistic Patterns)

Sample query (TreC09_204_ST_Retention_Deletion, BM):

enron #w5 [data, documents, e{ }mail{s}, record{s}, evidence{s}, info{rmation}, cop{y, ies}, file{s}] #w10 [shred{s, ded, dding}, destroy{s, ed, ing}]

o known
o adjustable
o over-inclusive – anchor
o under-inclusive – add
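A hedged sketch of how a proximity-with-wildcard clause like the one above might be evaluated. The trailing `!` wildcard and the `within` helper are simplifications of my own, not any vendor's actual query language:

```python
# Toy "within N words" (w/N) proximity test over a tokenized document,
# with trailing-! wildcard terms, in the spirit of the slide's query.
def positions(tokens: list[str], term: str) -> list[int]:
    """Positions of tokens matching term; a trailing '!' is a wildcard."""
    if term.endswith("!"):
        stem = term[:-1]
        return [i for i, t in enumerate(tokens) if t.startswith(stem)]
    return [i for i, t in enumerate(tokens) if t == term]

def within(tokens: list[str], a: str, b: str, n: int) -> bool:
    """True if some match of a and some match of b are within n words."""
    pa, pb = positions(tokens, a), positions(tokens, b)
    return any(abs(i - j) <= n for i in pa for j in pb)

tokens = "please shred the enron email records before friday".split()
print(within(tokens, "enron", "shred!", 5))    # True
print(within(tokens, "enron", "destroy!", 5))  # False
```

The "adjustable" property on the slide corresponds directly to editing the term lists and window sizes in such a rule.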
9. Concept Search: Thesaurus Addition

Target documents: common cold

Query terms: virus, cough!, fever, chills, congest!, loss w/3 appetite, sneez!, flu

Thesaurus terms added behind the scenes:
• fever: heat, hotness, torridness, delirium, ecstasy, excitement, febrile, disease, ferment, fervor, fire, flush, frenzy, intensity
• virus: germ, microorganism, bacterium, bug, microbe, bacillus, ailment, disease, illness, infection, pathogen, sickness, venom

o unknown
o embedded
o not adjustable
o over-inclusive
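The thesaurus addition described above can be sketched as a simple term-substitution step. The `THESAURUS` table is a toy built from the slide's synonym lists, not a real product's thesaurus:

```python
# Concept search as query expansion: each query term is replaced by
# itself plus embedded synonyms the user never sees or controls.
THESAURUS = {
    "fever": ["heat", "delirium", "febrile", "flush", "frenzy"],
    "virus": ["germ", "microbe", "bacterium", "pathogen", "bug"],
}

def expand(query_terms: list[str]) -> list[str]:
    expanded = []
    for term in query_terms:
        expanded.append(term)
        expanded.extend(THESAURUS.get(term, []))
    return expanded

print(expand(["fever", "cough"]))
# ['fever', 'heat', 'delirium', 'febrile', 'flush', 'frenzy', 'cough']
```

This makes concrete why the slide flags concept search as unknown, embedded, not adjustable, and over-inclusive: the expansion happens inside the tool, and broad synonyms (delirium, frenzy) pull in off-target documents.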
10. Algorithmic Search

Computes a “total” for each document (Document 1 “total”, Document 2 “total”) and compares the totals, e.g., by the angles (α, β) between them.

o unknown
o embedded
o hard to adjust
o over-inclusive
o under-inclusive
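One common way to realize a document "total" is a term-frequency vector, compared by the cosine of the angle between vectors, as the α and β above suggest. A minimal sketch under that assumption; real tools use richer weightings such as TF-IDF:

```python
import math
from collections import Counter

# Document "total" as a term-frequency vector; documents are compared
# by cosine similarity (the angle between their vectors).
def vector(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(v1: Counter, v2: Counter) -> float:
    dot = sum(v1[t] * v2[t] for t in v1)
    norm = (math.sqrt(sum(c * c for c in v1.values()))
            * math.sqrt(sum(c * c for c in v2.values())))
    return dot / norm if norm else 0.0

d1 = vector("cough fever chills cough")
d2 = vector("fever chills sneezing")
d3 = vector("patent misuse counsel")
print(cosine(d1, d2) > cosine(d1, d3))  # True: d1 is closer to d2
```

Because the weighting and comparison live inside the algorithm, the user sees only the ranked result, which is why the slide marks this approach unknown, embedded, and hard to adjust.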
11. Algorithmic Search with “Seed Sets”

[Diagram: sample documents are hand-tagged responsive (R) or non-responsive (NR). On-topic documents contain terms such as cough, sneezed, malaise, congest, chill, virus, fever, runny, cold, dripping; off-topic documents contain terms such as computer, counsel, patent, misuse, cocaine, ice, crash, smoking, sleep. The R documents are combined into a responsive seed-set “total” and the NR documents into a non-responsive seed-set “total” (angles α, β), against which the remaining documents are compared.]

Seed set
o unknown
o embedded
o hard to adjust
o over-inclusive
o under-inclusive
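A nearest-centroid sketch of seed-set classification, assuming the same vector-and-cosine model as the previous slide. The seed texts are invented examples, and commercial tools use far more sophisticated models:

```python
import math
from collections import Counter

# Seed-set search: average reviewer-tagged responsive (R) and
# non-responsive (NR) seeds into two "totals", then classify new
# documents by which total they are closer to (cosine similarity).
def vec(text: str) -> Counter:
    return Counter(text.lower().split())

def cos(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(texts: list[str]) -> Counter:
    total = Counter()
    for t in texts:
        total += vec(t)
    return total

r_seed = centroid(["cough fever congestion", "runny nose sore throat fever"])
nr_seed = centroid(["patent misuse counsel", "computer crash traffic"])

def classify(text: str) -> str:
    v = vec(text)
    return "R" if cos(v, r_seed) >= cos(v, nr_seed) else "NR"

print(classify("fever and cough"))      # R
print(classify("patent counsel memo"))  # NR
```

Mistagged seeds (a cocaine document tagged R, say) shift the centroid, which is one way such a system becomes over- or under-inclusive without the user noticing.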
12. Statistics Supports Defensibility: Yield Estimate

Yield estimate: an estimate of the responsive documents in the data set.

Data set: 100,000 documents. A random 1,000-document sample contains 150 target documents.

150/1,000 = 15% target docs in the sample.
Hence an estimated 15% of 100,000 = 15,000 target docs in the data set.
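The yield arithmetic above, restated in Python, with a normal-approximation 95% confidence interval added as my own assumption (the slide gives only the point estimate):

```python
import math

# Yield estimate from a random sample, as on the slide.
population = 100_000
sample_size = 1_000
responsive_in_sample = 150

p = responsive_in_sample / sample_size   # 0.15 (15% of sample)
estimated_yield = round(p * population)  # scale up to the data set

# Normal-approximation 95% CI on the sample proportion (added here,
# not on the slide): p +/- 1.96 * sqrt(p(1-p)/n).
margin = 1.96 * math.sqrt(p * (1 - p) / sample_size)

print(estimated_yield)  # 15000
print(f"{(p - margin) * 100:.1f}%..{(p + margin) * 100:.1f}%")
```

Reporting the interval, not just the point estimate, is what lets the sampling step support defensibility: the estimate of 15,000 responsive documents carries a quantified uncertainty.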
13. Statistics Supports Defensibility: Sample of Results

“Tagged” data: 10,000 documents. A 1,000-document sample contains 700 correctly tagged docs, so 70% of tagged docs are targets.
“Not tagged” data: 90,000 documents. A 1,000-document sample contains 90 missed target docs, so 9% of not-tagged docs are targets.

10,000 × 70% correct = 7,000 target docs tagged
90,000 × 9% missed = 8,100 target docs missed

Recall = 7,000 / 15,100 ≈ 46%. More target docs were missed than tagged.
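The recall arithmetic on this slide, as a short Python check:

```python
# Recall estimated from two samples, as on the slide: precision is
# measured on a sample of the tagged set, and the miss rate on a
# sample of the not-tagged set.
tagged, not_tagged = 10_000, 90_000

precision = 700 / 1_000  # 70% of the tagged sample correctly tagged
miss_rate = 90 / 1_000   # 9% of the not-tagged sample were targets

found = tagged * precision         # 7,000 target docs tagged
missed = not_tagged * miss_rate    # 8,100 target docs missed
recall = found / (found + missed)  # 7,000 / 15,100

print(round(recall, 2))  # 0.46
```

Sampling both the tagged and not-tagged populations is essential: measuring only the tagged set would show the 70% precision but hide the fact that more target documents were missed than found.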