Julia Brickell 
General Counsel 
H5 
Your “Big Buckets” 
Are Full Of “Big Data” 
© 2014 H5
Myriad sources
[Diagram: employee sources and enterprise data sources, internal and external (managed, cloud), including Gmail and Google Docs]
The End Game? To Retain What’s Needed 
• Know what you need to keep 
• Employ the right expertise to find it 
– The right tools 
– The right expertise 
– Deployed effectively against diverse sources 
• Securely dispose of the rest
“Overall, the myth that exhaustive manual review is the most effective – and therefore, the most defensible – approach to document review is strongly refuted. Technology-assisted review can (and does) yield more accurate results than exhaustive manual review, with much lower effort.”
Search “superior to manual reviews”
Source: Maura R. Grossman and Gordon V. Cormack, Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review, Richmond Journal of Law and Technology, XVII RICH. J.L. & TECH. 11 (2011), http://jolt.richmond.edu/v17i3/article11.pdf, p. 48
Search Results Vary
NIST TREC Legal Track Interactive Task, 2008-2010
[Figure: precision-recall plot, both axes running from 0.0 to 1.0, with High Recall and High Precision regions marked. Plotted series: 2008, 2009, and 2010 task results; Keyword Search (Blair & Maron, 1985); Manual Review (Grossman & Cormack, 2011).]
(Sponsored by the National Institute of Standards and Technology; TREC Legal Track, http://trec-legal.umiacs.umd.edu)
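The two axes of the plot can be made concrete with a short calculation. The sketch below is illustrative only: the function name and the document ids are hypothetical and are not taken from the TREC data.

```python
def precision_recall(retrieved: set, relevant: set) -> tuple[float, float]:
    """Return (precision, recall) for a set of retrieved document ids."""
    hits = len(retrieved & relevant)
    # precision: share of retrieved documents that are relevant
    precision = hits / len(retrieved) if retrieved else 0.0
    # recall: share of relevant documents that were retrieved
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 10 relevant documents exist; a search returns 8,
# of which 6 are relevant.
relevant = set(range(10))
retrieved = {0, 1, 2, 3, 4, 5, 100, 101}
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

A search can score well on one measure and poorly on the other, which is why the plot shows both.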
Search is run on an index

Token         Locations
action        3:1; 24:10; 45:112;
all           3:5; 4; 23
accountants   2:2; 41::33
business      2:3; 4::56
conferences   3:12; 7:1; 88:5; 95:1
date          1:1; 4:1; 5:3; 8:13
dec           1:3; 155:9

Same search queries provide different results depending on the tool
• Google
• Exact search
• Algorithmic search
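As a rough illustration of what such an index might look like in code, the sketch below builds a token-to-locations map. It assumes each location on the slide is a document:position pair, which the slide does not state; the sample documents and function names are hypothetical.

```python
import re
from collections import defaultdict


def build_index(docs: dict[int, str]) -> dict[str, list[tuple[int, int]]]:
    """Map each lowercased token to a list of (document id, word position) pairs."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        words = re.findall(r"[a-z0-9]+", text.lower())
        for position, token in enumerate(words, start=1):
            index[token].append((doc_id, position))
    return dict(index)


def format_postings(postings: list[tuple[int, int]]) -> str:
    """Render postings in a doc:position style (an assumed reading of the slide)."""
    return "; ".join(f"{doc}:{pos}" for doc, pos in postings)


# Hypothetical two-document collection
docs = {
    1: "Date of all conferences",
    2: "The accountants business action",
}
index = build_index(docs)
print(format_postings(index["business"]))
```

How a tool tokenizes text (case folding, punctuation, stemming) decides what ends up in the index, which is one reason the same query can return different results in different tools.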
Target documents: 
common cold 
virus 
cough! 
fever 
congest! 
loss w/3 appetite
allergies 
sneez! 
smoking 
flu 
computers 
traffic 
malaise 
sore throat 
runny nose 
o known 
o adjustable 
o over-inclusive – anchor 
o under-inclusive – add 
Exact Search: Boolean, Rule-Based, Modeling Linguistic Patterns
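A minimal sketch of how exact search terms like these could be evaluated. It assumes that a trailing "!" means truncation (any word starting with the stem) and that "w/3" means within three words, which is a common convention but is not defined on the slide. All function names are hypothetical.

```python
import re


def tokens(text: str) -> list[str]:
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def matches_term(term: str, token: str) -> bool:
    """'cough!' matches any token starting with 'cough'; other terms must match exactly."""
    if term.endswith("!"):
        return token.startswith(term[:-1])
    return token == term


def contains(term: str, text: str) -> bool:
    """True if any word in the text matches the term."""
    return any(matches_term(term, t) for t in tokens(text))


def within(term_a: str, term_b: str, n: int, text: str) -> bool:
    """True if term_a and term_b occur within n words of each other, in either order."""
    toks = tokens(text)
    pos_a = [i for i, t in enumerate(toks) if matches_term(term_a, t)]
    pos_b = [i for i, t in enumerate(toks) if matches_term(term_b, t)]
    return any(abs(i - j) <= n for i in pos_a for j in pos_b)


doc = "She was coughing all night and reported a loss of her appetite."
print(contains("cough!", doc), within("loss", "appetite", 3, doc))
```

Because each rule is explicit, a reviewer can see why a document matched and tighten or extend the rule, which is what "known" and "adjustable" refer to.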
Exact Search: Rule-Based, Modeling Linguistic Patterns
enron #w5 [data, documents, e{ }mail{s}, record{s}, evidence{s}, info{rmation}, copy[y, ies], file{s}] #w10 [shred{s, ded, dding}, destroy{s, ed, ing}]
TreC09_204_ST_Retention_Deletion BM
o known 
o adjustable 
o over-inclusive – anchor 
o under-inclusive – add
Concept Search: Thesaurus addition
Target documents: 
common cold 
virus 
cough! 
fever 
chills 
congest! 
loss w/3 appetite
sneez! 
heat 
hotness 
torridness 
delirium 
ecstasy 
excitement 
febrile 
disease 
ferment 
fervor 
fire 
flush 
frenzy 
intensity 
germ 
microorganism
bacterium 
bug 
microbe 
bacillus 
ailment 
disease 
illness 
infection 
pathogen 
sickness 
flu 
venom 
o unknown 
o imbedded 
o not adjustable 
o over-inclusive
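A small sketch of why thesaurus expansion tends to be over-inclusive. The grouping of synonyms under "fever" and "virus" is an assumption made for illustration (the slide lists the words without saying which term produced them), and the three documents are hypothetical.

```python
# Assumed grouping of words from the slide under two query terms
THESAURUS = {
    "fever": ["heat", "delirium", "ecstasy", "excitement", "fervor",
              "fire", "flush", "frenzy", "intensity"],
    "virus": ["germ", "microbe", "bacterium", "bug", "pathogen", "venom"],
}


def expand_query(terms: set[str], thesaurus: dict[str, list[str]]) -> set[str]:
    """Add every thesaurus entry for each query term."""
    expanded = set(terms)
    for term in terms:
        expanded.update(thesaurus.get(term, []))
    return expanded


def search(docs: dict[str, str], terms: set[str]) -> list[str]:
    """Return ids of documents containing at least one of the terms."""
    return [doc_id for doc_id, text in docs.items()
            if terms & set(text.lower().split())]


docs = {
    "a": "patient has fever and cough",
    "b": "the fire alarm test is friday",
    "c": "found a bug in the billing code",
}
query = {"fever", "virus"}
exact_hits = search(docs, query)
expanded_hits = search(docs, expand_query(query, THESAURUS))
print(exact_hits, expanded_hits)
```

In this sketch the expanded query also returns the fire alarm and billing documents, and a user who cannot see or edit the thesaurus has no direct way to stop that.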
Algorithmic Search
Computes document “totals” and compares totals
[Figure: Document 1 “total” and Document 2 “total”, shown with angles α and β]
o unknown 
o imbedded 
o hard to adjust 
o over-inclusive 
o under-inclusive
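The slide does not name the algorithm. One common way to compute a per-document "total" and compare two of them is to turn each document into a word-count vector and measure the cosine of the angle between the vectors. The sketch below uses that technique as a stand-in; it is not necessarily what any particular tool does.

```python
import math
from collections import Counter


def vectorize(text: str) -> Counter:
    """Word-count vector: the document 'total' in this sketch."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two word-count vectors (1.0 = same direction)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


# Hypothetical documents
d1 = vectorize("cough fever cold")
d2 = vectorize("fever cold chills")
d3 = vectorize("computer virus crash")
print(round(cosine(d1, d2), 3), cosine(d1, d3))
```

The score is a single number per pair of documents, so a reviewer sees the ranking but not the individual words that drove it, which fits the "unknown" and "hard to adjust" points on the slide.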
Algorithmic search with “seed sets”
[Figure: example documents labeled R or NR, containing terms such as cough, smoking, ache, malaise, sleep, sneezed, cocaine, congest, chill, chilled, ice, virus, computer, counsel, patent, misuse, dripping, fever, trip, cold, runny, and crash. An R seed set “total” and an NR seed set “total” are shown with angles α and β.]
o unknown 
o imbedded 
o hard to adjust 
o over-inclusive 
o under-inclusive
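A minimal sketch of seed-set classification, assuming R and NR stand for responsive and not responsive, and assuming each seed set "total" is the summed word-count vector of its documents. Comparing a new document to each total by cosine similarity is one simple possibility (a nearest-centroid approach), not the method of any specific product. The seed texts are hypothetical, built from words on the slide.

```python
import math
from collections import Counter


def vectorize(text: str) -> Counter:
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def seed_total(texts: list[str]) -> Counter:
    """Sum the word counts of all documents in a seed set."""
    total = Counter()
    for text in texts:
        total.update(vectorize(text))
    return total


def classify(text: str, r_total: Counter, nr_total: Counter) -> str:
    """Label a document by whichever seed-set total it is closer to."""
    v = vectorize(text)
    return "R" if cosine(v, r_total) > cosine(v, nr_total) else "NR"


# Hypothetical seed documents
r_seeds = ["cough fever chill", "runny nose sneezed congest", "cold virus malaise ache"]
nr_seeds = ["computer virus crash", "patent counsel misuse", "ice cold trip"]
r_total, nr_total = seed_total(r_seeds), seed_total(nr_seeds)

print(classify("fever and cough with chill", r_total, nr_total))
print(classify("computer crash after virus", r_total, nr_total))
```

The result depends entirely on which documents went into the seed sets, so a different seed set can change the labels without anyone editing a rule.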
Statistics Supports Defensibility
Yield Estimate: estimate of responsive documents in data set
Data set: 100,000 documents
1,000 doc sample: 150 target docs
150/1,000 target docs in sample = 15%
Hence estimated 15,000/100,000 target docs in data set (15,000 docs estimated responsive yield)
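The yield arithmetic can be reproduced in a few lines. The margin of error is an addition that does not appear on the slide: it uses a normal approximation at roughly 95% confidence with a finite population correction, and it assumes the sample was drawn at random. The function and parameter names are hypothetical.

```python
import math


def estimate_yield(population: int, sample_size: int, sample_hits: int,
                   z: float = 1.96) -> tuple[float, float, float]:
    """Return (point estimate, low, high) for the number of target docs."""
    p = sample_hits / sample_size
    # Normal-approximation margin of error with finite population correction
    # (an assumption of this sketch, not shown on the slide).
    fpc = math.sqrt((population - sample_size) / (population - 1))
    margin = z * math.sqrt(p * (1 - p) / sample_size) * fpc
    return p * population, (p - margin) * population, (p + margin) * population


point, low, high = estimate_yield(100_000, 1000, 150)
print(f"estimated yield: {point:,.0f} (about {low:,.0f} to {high:,.0f})")
```

With the slide's numbers the point estimate is 15,000 documents, and the approximate interval runs from roughly 12,800 to 17,200.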
Statistics Supports Defensibility
Sample of Results
“Tagged” Data: 10,000 documents; 1,000 doc sample; 700 (70%) correctly tagged
“Not Tagged” Data: 90,000 documents; 1,000 doc sample; 90/1,000 target docs missed
10,000 x 70% correct = 7,000 target docs tagged
90,000 x 9% missed = 8,100 target docs missed
46% recall: 7,000/15,100
More target docs missed than tagged.
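The recall figure follows directly from the two samples. The sketch below repeats the slide's arithmetic; the function and parameter names are hypothetical.

```python
def estimate_recall(tagged_total: int, tagged_sample: int, tagged_sample_correct: int,
                    untagged_total: int, untagged_sample: int,
                    untagged_sample_missed: int) -> tuple[float, float, float]:
    """Project sample rates onto each population and return (found, missed, recall)."""
    found = tagged_total * tagged_sample_correct / tagged_sample
    missed = untagged_total * untagged_sample_missed / untagged_sample
    return found, missed, found / (found + missed)


# Numbers from the slide: 700 of 1,000 sampled tagged docs are correct,
# 90 of 1,000 sampled untagged docs are missed targets.
found, missed, recall = estimate_recall(10_000, 1000, 700, 90_000, 1000, 90)
print(f"found={found:,.0f} missed={missed:,.0f} recall={recall:.0%}")
```

Even a 9% miss rate matters here because the untagged set is nine times larger than the tagged set.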
Julia Brickell 
General Counsel 
H5 
jbrickell@H5.com 
www.H5.com

Más contenido relacionado

Destacado

Données ouvertes : faire autrement.
Données ouvertes : faire autrement. Données ouvertes : faire autrement.
Données ouvertes : faire autrement. Diane Mercier
 
Master Thesis Data Governance Maturity Model - Jan Merkus MSc
Master Thesis Data Governance Maturity Model - Jan Merkus MScMaster Thesis Data Governance Maturity Model - Jan Merkus MSc
Master Thesis Data Governance Maturity Model - Jan Merkus MScJan Merkus
 
BDM - project management in big data context.pptx
BDM -  project management in big data context.pptxBDM -  project management in big data context.pptx
BDM - project management in big data context.pptxJean-Louis Quéguiner
 
Data Governance Assessment - Jan Rutger Merkus MSc
Data Governance Assessment - Jan Rutger Merkus MScData Governance Assessment - Jan Rutger Merkus MSc
Data Governance Assessment - Jan Rutger Merkus MScJan Merkus
 
Content Governance and Workflow - Confab Intensive 2015
Content Governance and Workflow - Confab Intensive 2015Content Governance and Workflow - Confab Intensive 2015
Content Governance and Workflow - Confab Intensive 2015Content Strategy Inc.
 
The Big Data - Same Humans Problem (CIDR 2015)
The Big Data - Same Humans Problem (CIDR 2015)The Big Data - Same Humans Problem (CIDR 2015)
The Big Data - Same Humans Problem (CIDR 2015)Alexandros Labrinidis
 
Introduction to Data Management Maturity Models
Introduction to Data Management Maturity ModelsIntroduction to Data Management Maturity Models
Introduction to Data Management Maturity ModelsKingland
 
Chief Data Officer: Overcoming Data Silos for True Business Value
Chief Data Officer: Overcoming Data Silos for True Business ValueChief Data Officer: Overcoming Data Silos for True Business Value
Chief Data Officer: Overcoming Data Silos for True Business ValueCraig Milroy
 
Gartner's Maturity Model For Programme Portfolio Management (PPM)
Gartner's Maturity Model For Programme Portfolio Management (PPM)Gartner's Maturity Model For Programme Portfolio Management (PPM)
Gartner's Maturity Model For Programme Portfolio Management (PPM)Gartner
 
Gartner's ITScore for BPM Maturity
Gartner's ITScore for BPM MaturityGartner's ITScore for BPM Maturity
Gartner's ITScore for BPM MaturityGartner
 
3 Steps for Breaking Down Data & Analytic Silos
3 Steps for Breaking Down Data & Analytic Silos 3 Steps for Breaking Down Data & Analytic Silos
3 Steps for Breaking Down Data & Analytic Silos Dun & Bradstreet
 
Enterprise Data Governance for Financial Institutions
Enterprise Data Governance for Financial InstitutionsEnterprise Data Governance for Financial Institutions
Enterprise Data Governance for Financial InstitutionsSheldon McCarthy
 
Enterprise Analytics: Serving Big Data Projects for Healthcare
Enterprise Analytics: Serving Big Data Projects for HealthcareEnterprise Analytics: Serving Big Data Projects for Healthcare
Enterprise Analytics: Serving Big Data Projects for HealthcareDATA360US
 
Project governance
Project governanceProject governance
Project governanceGlen Alleman
 
The Gartner IAM Program Maturity Model
The Gartner IAM Program Maturity ModelThe Gartner IAM Program Maturity Model
The Gartner IAM Program Maturity ModelSarah Moore
 
Review of Data Management Maturity Models
Review of Data Management Maturity ModelsReview of Data Management Maturity Models
Review of Data Management Maturity ModelsAlan McSweeney
 
Ibm data governance framework
Ibm data governance frameworkIbm data governance framework
Ibm data governance frameworkkaiyun7631
 
5 Factors Impacting Your Big Data Project's Performance
5 Factors Impacting Your Big Data Project's Performance 5 Factors Impacting Your Big Data Project's Performance
5 Factors Impacting Your Big Data Project's Performance Qubole
 

Destacado (20)

Données ouvertes : faire autrement.
Données ouvertes : faire autrement. Données ouvertes : faire autrement.
Données ouvertes : faire autrement.
 
Master Thesis Data Governance Maturity Model - Jan Merkus MSc
Master Thesis Data Governance Maturity Model - Jan Merkus MScMaster Thesis Data Governance Maturity Model - Jan Merkus MSc
Master Thesis Data Governance Maturity Model - Jan Merkus MSc
 
BDM - project management in big data context.pptx
BDM -  project management in big data context.pptxBDM -  project management in big data context.pptx
BDM - project management in big data context.pptx
 
Data Governance Assessment - Jan Rutger Merkus MSc
Data Governance Assessment - Jan Rutger Merkus MScData Governance Assessment - Jan Rutger Merkus MSc
Data Governance Assessment - Jan Rutger Merkus MSc
 
Content Governance and Workflow - Confab Intensive 2015
Content Governance and Workflow - Confab Intensive 2015Content Governance and Workflow - Confab Intensive 2015
Content Governance and Workflow - Confab Intensive 2015
 
KPIs for Big Data
KPIs for Big DataKPIs for Big Data
KPIs for Big Data
 
The Big Data - Same Humans Problem (CIDR 2015)
The Big Data - Same Humans Problem (CIDR 2015)The Big Data - Same Humans Problem (CIDR 2015)
The Big Data - Same Humans Problem (CIDR 2015)
 
Introduction to Data Management Maturity Models
Introduction to Data Management Maturity ModelsIntroduction to Data Management Maturity Models
Introduction to Data Management Maturity Models
 
Chief Data Officer: Overcoming Data Silos for True Business Value
Chief Data Officer: Overcoming Data Silos for True Business ValueChief Data Officer: Overcoming Data Silos for True Business Value
Chief Data Officer: Overcoming Data Silos for True Business Value
 
Unicom Big Data Conference
Unicom  Big Data ConferenceUnicom  Big Data Conference
Unicom Big Data Conference
 
Gartner's Maturity Model For Programme Portfolio Management (PPM)
Gartner's Maturity Model For Programme Portfolio Management (PPM)Gartner's Maturity Model For Programme Portfolio Management (PPM)
Gartner's Maturity Model For Programme Portfolio Management (PPM)
 
Gartner's ITScore for BPM Maturity
Gartner's ITScore for BPM MaturityGartner's ITScore for BPM Maturity
Gartner's ITScore for BPM Maturity
 
3 Steps for Breaking Down Data & Analytic Silos
3 Steps for Breaking Down Data & Analytic Silos 3 Steps for Breaking Down Data & Analytic Silos
3 Steps for Breaking Down Data & Analytic Silos
 
Enterprise Data Governance for Financial Institutions
Enterprise Data Governance for Financial InstitutionsEnterprise Data Governance for Financial Institutions
Enterprise Data Governance for Financial Institutions
 
Enterprise Analytics: Serving Big Data Projects for Healthcare
Enterprise Analytics: Serving Big Data Projects for HealthcareEnterprise Analytics: Serving Big Data Projects for Healthcare
Enterprise Analytics: Serving Big Data Projects for Healthcare
 
Project governance
Project governanceProject governance
Project governance
 
The Gartner IAM Program Maturity Model
The Gartner IAM Program Maturity ModelThe Gartner IAM Program Maturity Model
The Gartner IAM Program Maturity Model
 
Review of Data Management Maturity Models
Review of Data Management Maturity ModelsReview of Data Management Maturity Models
Review of Data Management Maturity Models
 
Ibm data governance framework
Ibm data governance frameworkIbm data governance framework
Ibm data governance framework
 
5 Factors Impacting Your Big Data Project's Performance
5 Factors Impacting Your Big Data Project's Performance 5 Factors Impacting Your Big Data Project's Performance
5 Factors Impacting Your Big Data Project's Performance
 

Similar a How Technology-Assisted Review Can Be More Effective Than Manual for E-Discovery

Splunk for cyber_threat
Splunk for cyber_threatSplunk for cyber_threat
Splunk for cyber_threatGreg Hanchin
 
How Logilab ELN helps Organizations in Research Data Management
How Logilab ELN helps Organizations in Research Data ManagementHow Logilab ELN helps Organizations in Research Data Management
How Logilab ELN helps Organizations in Research Data ManagementAgaram Technologies
 
sience 2.0 : an illustration of good research practices in a real study
sience 2.0 : an illustration of good research practices in a real studysience 2.0 : an illustration of good research practices in a real study
sience 2.0 : an illustration of good research practices in a real studywolf vanpaemel
 
Web-scale Discovery Tools and the Backgrounding of Government Information
Web-scale Discovery Tools and the Backgrounding of Government InformationWeb-scale Discovery Tools and the Backgrounding of Government Information
Web-scale Discovery Tools and the Backgrounding of Government InformationChristopher Brown
 
Using Machine Learning to Automate Clinical Pathways
Using Machine Learning to Automate Clinical PathwaysUsing Machine Learning to Automate Clinical Pathways
Using Machine Learning to Automate Clinical Pathwaysdiannepatricia
 
LFS302_Real-World Evidence Platform to Enable Therapeutic Innovation
LFS302_Real-World Evidence Platform to Enable Therapeutic InnovationLFS302_Real-World Evidence Platform to Enable Therapeutic Innovation
LFS302_Real-World Evidence Platform to Enable Therapeutic InnovationAmazon Web Services
 
ACEDS - ZyLAB webinar - AI Based eDiscovery Analytics
ACEDS - ZyLAB webinar - AI Based eDiscovery AnalyticsACEDS - ZyLAB webinar - AI Based eDiscovery Analytics
ACEDS - ZyLAB webinar - AI Based eDiscovery AnalyticsAnnelore van der Lint
 
Querylog-based Assessment of Retrievability Bias in Delpher
Querylog-based Assessment of Retrievability Bias in DelpherQuerylog-based Assessment of Retrievability Bias in Delpher
Querylog-based Assessment of Retrievability Bias in DelpherMyriam Traub
 
Fti Journal Predictive Discovery
Fti Journal   Predictive DiscoveryFti Journal   Predictive Discovery
Fti Journal Predictive DiscoveryAlbert Kassis
 
How new ai based analytics ignite a productivity revolution in e discovery-final
How new ai based analytics ignite a productivity revolution in e discovery-finalHow new ai based analytics ignite a productivity revolution in e discovery-final
How new ai based analytics ignite a productivity revolution in e discovery-finaljcscholtes
 
AI-SDV 2021: Heiko Wongel - Machine learning tools in patent searching - are ...
AI-SDV 2021: Heiko Wongel - Machine learning tools in patent searching - are ...AI-SDV 2021: Heiko Wongel - Machine learning tools in patent searching - are ...
AI-SDV 2021: Heiko Wongel - Machine learning tools in patent searching - are ...Dr. Haxel Consult
 
2016-08-22_winning_on_technicalities_for_linkedin
2016-08-22_winning_on_technicalities_for_linkedin2016-08-22_winning_on_technicalities_for_linkedin
2016-08-22_winning_on_technicalities_for_linkedinDaniel Thornton
 
The Taverna Workflow Management Software Suite - Past, Present, Future
The Taverna Workflow Management Software Suite - Past, Present, FutureThe Taverna Workflow Management Software Suite - Past, Present, Future
The Taverna Workflow Management Software Suite - Past, Present, FuturemyGrid team
 
Equivalence is in the (ID) of the beholder
Equivalence is in the (ID) of the beholderEquivalence is in the (ID) of the beholder
Equivalence is in the (ID) of the beholdermhaendel
 
What is Objective Evidence? - EduQuest FDA Compliance Advisory
What is Objective Evidence? - EduQuest FDA Compliance AdvisoryWhat is Objective Evidence? - EduQuest FDA Compliance Advisory
What is Objective Evidence? - EduQuest FDA Compliance AdvisoryEduQuest, Inc.
 
Webinar: Modern Techniques for Better Search Relevance with Fusion
Webinar: Modern Techniques for Better Search Relevance with FusionWebinar: Modern Techniques for Better Search Relevance with Fusion
Webinar: Modern Techniques for Better Search Relevance with FusionLucidworks
 
Managing data responsibly to enable research interity
Managing data responsibly to enable research interityManaging data responsibly to enable research interity
Managing data responsibly to enable research interityIUPUI
 

Similar a How Technology-Assisted Review Can Be More Effective Than Manual for E-Discovery (20)

Splunk for cyber_threat
Splunk for cyber_threatSplunk for cyber_threat
Splunk for cyber_threat
 
How Logilab ELN helps Organizations in Research Data Management
How Logilab ELN helps Organizations in Research Data ManagementHow Logilab ELN helps Organizations in Research Data Management
How Logilab ELN helps Organizations in Research Data Management
 
Machine Learning and Multi Drug Resistant(MDR) Infections case study
Machine Learning and Multi Drug Resistant(MDR) Infections case studyMachine Learning and Multi Drug Resistant(MDR) Infections case study
Machine Learning and Multi Drug Resistant(MDR) Infections case study
 
sience 2.0 : an illustration of good research practices in a real study
sience 2.0 : an illustration of good research practices in a real studysience 2.0 : an illustration of good research practices in a real study
sience 2.0 : an illustration of good research practices in a real study
 
Web-scale Discovery Tools and the Backgrounding of Government Information
Web-scale Discovery Tools and the Backgrounding of Government InformationWeb-scale Discovery Tools and the Backgrounding of Government Information
Web-scale Discovery Tools and the Backgrounding of Government Information
 
Using Machine Learning to Automate Clinical Pathways
Using Machine Learning to Automate Clinical PathwaysUsing Machine Learning to Automate Clinical Pathways
Using Machine Learning to Automate Clinical Pathways
 
LFS302_Real-World Evidence Platform to Enable Therapeutic Innovation
LFS302_Real-World Evidence Platform to Enable Therapeutic InnovationLFS302_Real-World Evidence Platform to Enable Therapeutic Innovation
LFS302_Real-World Evidence Platform to Enable Therapeutic Innovation
 
ACEDS - ZyLAB webinar - AI Based eDiscovery Analytics
ACEDS - ZyLAB webinar - AI Based eDiscovery AnalyticsACEDS - ZyLAB webinar - AI Based eDiscovery Analytics
ACEDS - ZyLAB webinar - AI Based eDiscovery Analytics
 
Querylog-based Assessment of Retrievability Bias in Delpher
Querylog-based Assessment of Retrievability Bias in DelpherQuerylog-based Assessment of Retrievability Bias in Delpher
Querylog-based Assessment of Retrievability Bias in Delpher
 
Fti Journal Predictive Discovery
Fti Journal   Predictive DiscoveryFti Journal   Predictive Discovery
Fti Journal Predictive Discovery
 
Metopen 6
Metopen 6Metopen 6
Metopen 6
 
How new ai based analytics ignite a productivity revolution in e discovery-final
How new ai based analytics ignite a productivity revolution in e discovery-finalHow new ai based analytics ignite a productivity revolution in e discovery-final
How new ai based analytics ignite a productivity revolution in e discovery-final
 
AI-SDV 2021: Heiko Wongel - Machine learning tools in patent searching - are ...
AI-SDV 2021: Heiko Wongel - Machine learning tools in patent searching - are ...AI-SDV 2021: Heiko Wongel - Machine learning tools in patent searching - are ...
AI-SDV 2021: Heiko Wongel - Machine learning tools in patent searching - are ...
 
2016-08-22_winning_on_technicalities_for_linkedin
2016-08-22_winning_on_technicalities_for_linkedin2016-08-22_winning_on_technicalities_for_linkedin
2016-08-22_winning_on_technicalities_for_linkedin
 
The Taverna Workflow Management Software Suite - Past, Present, Future
The Taverna Workflow Management Software Suite - Past, Present, FutureThe Taverna Workflow Management Software Suite - Past, Present, Future
The Taverna Workflow Management Software Suite - Past, Present, Future
 
Equivalence is in the (ID) of the beholder
Equivalence is in the (ID) of the beholderEquivalence is in the (ID) of the beholder
Equivalence is in the (ID) of the beholder
 
What is Objective Evidence? - EduQuest FDA Compliance Advisory
What is Objective Evidence? - EduQuest FDA Compliance AdvisoryWhat is Objective Evidence? - EduQuest FDA Compliance Advisory
What is Objective Evidence? - EduQuest FDA Compliance Advisory
 
Webinar: Modern Techniques for Better Search Relevance with Fusion
Webinar: Modern Techniques for Better Search Relevance with FusionWebinar: Modern Techniques for Better Search Relevance with Fusion
Webinar: Modern Techniques for Better Search Relevance with Fusion
 
Martone grethe
Martone gretheMartone grethe
Martone grethe
 
Managing data responsibly to enable research interity
Managing data responsibly to enable research interityManaging data responsibly to enable research interity
Managing data responsibly to enable research interity
 

Más de ARMA International

Information Governance in the Cloud: Compare and Contrast (2020 update)
Information Governance in the Cloud: Compare and Contrast (2020 update)Information Governance in the Cloud: Compare and Contrast (2020 update)
Information Governance in the Cloud: Compare and Contrast (2020 update)ARMA International
 
“7 "Reasonable Steps" for Legal Holds of ESI and Other Documents
“7 "Reasonable Steps" for Legal Holds of ESI and Other Documents“7 "Reasonable Steps" for Legal Holds of ESI and Other Documents
“7 "Reasonable Steps" for Legal Holds of ESI and Other DocumentsARMA International
 
ARMA's Information Governance Implementation Model (IGIM): The Way Forward Fo...
ARMA's Information Governance Implementation Model (IGIM): The Way Forward Fo...ARMA's Information Governance Implementation Model (IGIM): The Way Forward Fo...
ARMA's Information Governance Implementation Model (IGIM): The Way Forward Fo...ARMA International
 
Jocelyn Gunter - Bringing The Information Disciplines Together
Jocelyn Gunter - Bringing The Information Disciplines TogetherJocelyn Gunter - Bringing The Information Disciplines Together
Jocelyn Gunter - Bringing The Information Disciplines TogetherARMA International
 
Nick Inglis - A Complete Circle (Open Source Knowledge, The Hubble Telescope,...
Nick Inglis - A Complete Circle (Open Source Knowledge, The Hubble Telescope,...Nick Inglis - A Complete Circle (Open Source Knowledge, The Hubble Telescope,...
Nick Inglis - A Complete Circle (Open Source Knowledge, The Hubble Telescope,...ARMA International
 
Morgan Templar - Connecting IT Strategy To Business Operations For Seamless C...
Morgan Templar - Connecting IT Strategy To Business Operations For Seamless C...Morgan Templar - Connecting IT Strategy To Business Operations For Seamless C...
Morgan Templar - Connecting IT Strategy To Business Operations For Seamless C...ARMA International
 
Ty Molchany - Information Remediation After Mergers & Acquisitions: An Auto-C...
Ty Molchany - Information Remediation After Mergers & Acquisitions: An Auto-C...Ty Molchany - Information Remediation After Mergers & Acquisitions: An Auto-C...
Ty Molchany - Information Remediation After Mergers & Acquisitions: An Auto-C...ARMA International
 
Brent Gatewood - Technologies Attack
Brent Gatewood - Technologies AttackBrent Gatewood - Technologies Attack
Brent Gatewood - Technologies AttackARMA International
 
Tod Chernikoff - Conducting large scale records inventory (handout)
Tod Chernikoff - Conducting large scale records inventory (handout)Tod Chernikoff - Conducting large scale records inventory (handout)
Tod Chernikoff - Conducting large scale records inventory (handout)ARMA International
 
Kathryn Rattigan - Cybersecurity & The Commercial Done Industry
Kathryn Rattigan - Cybersecurity & The Commercial Done IndustryKathryn Rattigan - Cybersecurity & The Commercial Done Industry
Kathryn Rattigan - Cybersecurity & The Commercial Done IndustryARMA International
 
Steve Weissman - Maximizing The Value Of Your Information Investments
Steve Weissman - Maximizing The Value Of Your Information InvestmentsSteve Weissman - Maximizing The Value Of Your Information Investments
Steve Weissman - Maximizing The Value Of Your Information InvestmentsARMA International
 
Randy Moeller - Mitigating Application Risk Upfront (Without Increased Hair L...
Randy Moeller - Mitigating Application Risk Upfront (Without Increased Hair L...Randy Moeller - Mitigating Application Risk Upfront (Without Increased Hair L...
Randy Moeller - Mitigating Application Risk Upfront (Without Increased Hair L...ARMA International
 
Jim Koziol - The Sport of Information Governance
Jim Koziol - The Sport of Information GovernanceJim Koziol - The Sport of Information Governance
Jim Koziol - The Sport of Information GovernanceARMA International
 
Steve Weissman, Patrick O'Guinn, Kevin Parker, Donda Young - Planning For Inf...
Steve Weissman, Patrick O'Guinn, Kevin Parker, Donda Young - Planning For Inf...Steve Weissman, Patrick O'Guinn, Kevin Parker, Donda Young - Planning For Inf...
Steve Weissman, Patrick O'Guinn, Kevin Parker, Donda Young - Planning For Inf...ARMA International
 
Dr. Stephanie Carter - Training Humans To Be Machines
Dr. Stephanie Carter - Training Humans To Be MachinesDr. Stephanie Carter - Training Humans To Be Machines
Dr. Stephanie Carter - Training Humans To Be MachinesARMA International
 
Michael Fillion - Data Governance In The Digitally Transformed Enterprise
Michael Fillion - Data Governance In The Digitally Transformed EnterpriseMichael Fillion - Data Governance In The Digitally Transformed Enterprise
Michael Fillion - Data Governance In The Digitally Transformed EnterpriseARMA International
 
Kevin Parker - The Leadership Journey
Kevin Parker - The Leadership JourneyKevin Parker - The Leadership Journey
Kevin Parker - The Leadership JourneyARMA International
 
Ali Daneshmand - How Does Institutional Culture Influence Information Governance
Ali Daneshmand - How Does Institutional Culture Influence Information GovernanceAli Daneshmand - How Does Institutional Culture Influence Information Governance
Ali Daneshmand - How Does Institutional Culture Influence Information GovernanceARMA International
 
Nick Inglis - Welcome To #InfoGov17 & Providence, RI
Nick Inglis - Welcome To #InfoGov17 & Providence, RINick Inglis - Welcome To #InfoGov17 & Providence, RI
Nick Inglis - Welcome To #InfoGov17 & Providence, RIARMA International
 

Más de ARMA International (20)

Information Governance in the Cloud: Compare and Contrast (2020 update)
Information Governance in the Cloud: Compare and Contrast (2020 update)Information Governance in the Cloud: Compare and Contrast (2020 update)
Information Governance in the Cloud: Compare and Contrast (2020 update)
 
“7 "Reasonable Steps" for Legal Holds of ESI and Other Documents
“7 "Reasonable Steps" for Legal Holds of ESI and Other Documents“7 "Reasonable Steps" for Legal Holds of ESI and Other Documents
“7 "Reasonable Steps" for Legal Holds of ESI and Other Documents
 
ARMA's Information Governance Implementation Model (IGIM): The Way Forward Fo...
ARMA's Information Governance Implementation Model (IGIM): The Way Forward Fo...ARMA's Information Governance Implementation Model (IGIM): The Way Forward Fo...
ARMA's Information Governance Implementation Model (IGIM): The Way Forward Fo...
 
Jocelyn Gunter - Bringing The Information Disciplines Together
Jocelyn Gunter - Bringing The Information Disciplines TogetherJocelyn Gunter - Bringing The Information Disciplines Together
Jocelyn Gunter - Bringing The Information Disciplines Together
 
Nick Inglis - A Complete Circle (Open Source Knowledge, The Hubble Telescope,...
Nick Inglis - A Complete Circle (Open Source Knowledge, The Hubble Telescope,...Nick Inglis - A Complete Circle (Open Source Knowledge, The Hubble Telescope,...
Nick Inglis - A Complete Circle (Open Source Knowledge, The Hubble Telescope,...
 
Morgan Templar - Connecting IT Strategy To Business Operations For Seamless C...
Morgan Templar - Connecting IT Strategy To Business Operations For Seamless C...Morgan Templar - Connecting IT Strategy To Business Operations For Seamless C...
Morgan Templar - Connecting IT Strategy To Business Operations For Seamless C...
 
Ty Molchany - Information Remediation After Mergers & Acquisitions: An Auto-C...
Ty Molchany - Information Remediation After Mergers & Acquisitions: An Auto-C...Ty Molchany - Information Remediation After Mergers & Acquisitions: An Auto-C...
Ty Molchany - Information Remediation After Mergers & Acquisitions: An Auto-C...
 
Brent Gatewood - Technologies Attack
Brent Gatewood - Technologies AttackBrent Gatewood - Technologies Attack
Brent Gatewood - Technologies Attack
 
Tod Chernikoff - Conducting large scale records inventory (handout)
Tod Chernikoff - Conducting large scale records inventory (handout)Tod Chernikoff - Conducting large scale records inventory (handout)
Tod Chernikoff - Conducting large scale records inventory (handout)
 
Kathryn Rattigan - Cybersecurity & The Commercial Done Industry
Kathryn Rattigan - Cybersecurity & The Commercial Done IndustryKathryn Rattigan - Cybersecurity & The Commercial Done Industry
Kathryn Rattigan - Cybersecurity & The Commercial Done Industry
 
Steve Weissman - Maximizing The Value Of Your Information Investments
Steve Weissman - Maximizing The Value Of Your Information InvestmentsSteve Weissman - Maximizing The Value Of Your Information Investments
Steve Weissman - Maximizing The Value Of Your Information Investments
 
Randy Moeller - Mitigating Application Risk Upfront (Without Increased Hair L...
Randy Moeller - Mitigating Application Risk Upfront (Without Increased Hair L...Randy Moeller - Mitigating Application Risk Upfront (Without Increased Hair L...
Randy Moeller - Mitigating Application Risk Upfront (Without Increased Hair L...
 
Jim Koziol - The Sport of Information Governance
Jim Koziol - The Sport of Information GovernanceJim Koziol - The Sport of Information Governance
Jim Koziol - The Sport of Information Governance
 
Gene Stakhov - Taxonology
Gene Stakhov - TaxonologyGene Stakhov - Taxonology
Gene Stakhov - Taxonology
 
Steve Weissman, Patrick O'Guinn, Kevin Parker, Donda Young - Planning For Inf...
Steve Weissman, Patrick O'Guinn, Kevin Parker, Donda Young - Planning For Inf...Steve Weissman, Patrick O'Guinn, Kevin Parker, Donda Young - Planning For Inf...
Steve Weissman, Patrick O'Guinn, Kevin Parker, Donda Young - Planning For Inf...
 
Dr. Stephanie Carter - Training Humans To Be Machines
Dr. Stephanie Carter - Training Humans To Be MachinesDr. Stephanie Carter - Training Humans To Be Machines
Dr. Stephanie Carter - Training Humans To Be Machines
 
Michael Fillion - Data Governance In The Digitally Transformed Enterprise
Michael Fillion - Data Governance In The Digitally Transformed EnterpriseMichael Fillion - Data Governance In The Digitally Transformed Enterprise
Michael Fillion - Data Governance In The Digitally Transformed Enterprise
 
Kevin Parker - The Leadership Journey
Kevin Parker - The Leadership JourneyKevin Parker - The Leadership Journey
Kevin Parker - The Leadership Journey
 
Ali Daneshmand - How Does Institutional Culture Influence Information Governance
Ali Daneshmand - How Does Institutional Culture Influence Information GovernanceAli Daneshmand - How Does Institutional Culture Influence Information Governance
Ali Daneshmand - How Does Institutional Culture Influence Information Governance
 
Nick Inglis - Welcome To #InfoGov17 & Providence, RI
Nick Inglis - Welcome To #InfoGov17 & Providence, RINick Inglis - Welcome To #InfoGov17 & Providence, RI
Nick Inglis - Welcome To #InfoGov17 & Providence, RI
 



How Technology-Assisted Review Can Be More Effective Than Manual for E-Discovery

  • 1. Julia Brickell General Counsel H5 Your “Big Buckets” Are Full Of “Big Data” © 2014 H5
  • 2. Myriad sources Google Docs Employee sources Internal Enterprise data sources External Managed External Cloud External Gmail Google Docs
  • 3. The End Game? To Retain What’s Needed • Know what you need to keep • Employ the right expertise to find it – The right tools – The right expertise – Deployed effectively against diverse sources • Securely dispose of the rest
  • 4. Search “superior to manual reviews”. “Overall, the myth that exhaustive manual review is the most effective – and therefore, the most defensible – approach to document review is strongly refuted. Technology-assisted review can (and does) yield more accurate results than exhaustive manual review, with much lower effort.” Maura R. Grossman & Gordon V. Cormack, Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review, XVII RICH. J.L. & TECH. 11 (2011), http://jolt.richmond.edu/v17i3/article11.pdf, p. 48
  • 5. Search Results Vary. [Figure: precision versus recall, each on a scale of 0.0 to 1.0, for the NIST TREC Legal Track Interactive Task, 2008-2010, with reference points for Keyword Search (Blair & Maron, 1985) and Manual Review (Grossman & Cormack, 2011); regions marked High Recall and High Precision.] Sponsored by the National Institute of Standards and Technology; TREC Legal Track, http://trec-legal.umiacs.umd.edu
  • 6. Search is run on an index
      Token         Locations
      action        3:1; 24:10; 45:112
      all           3:5; 4; 23
      accountants   2:2; 41::33
      business      2:3; 4::56
      conferences   3:12; 7:1; 88:5; 95:1
      date          1:1; 4:1; 5:3; 8:13
      dec           1:3; 155:9
      Same search queries provide different results depending on the tool: Google; exact search; algorithmic search
  • 7. Exact Search: Boolean, Rule-Based, Modeling Linguistic Patterns. Target documents: common cold. Terms shown: virus; cough!; fever; congest!; loss w/3 appetite; allergies; sneez!; smoking; flu; computers; traffic; malaise; sore throat; runny nose. Characteristics: known; adjustable; over-inclusive – anchor; under-inclusive – add
  • 8. Exact Search: Rule-Based, Modeling Linguistic Patterns. Example query (TreC09_204_ST_Retention_Deletion BM): enron #w5 [data, documents, e{ }mail{s}, record{s}, evidence{s}, info{rmation}, copy[y, ies], file{s}] #w10 [shred{s, ded, dding}, destroy{s, ed, ing}]. Characteristics: known; adjustable; over-inclusive – anchor; under-inclusive – add
  • 9. Concept Search: Thesaurus addition Target documents: common cold virus cough! fever chills congest! loss w/3 appetite sneez! heat hotness torridness delirium ecstasy excitement febrile disease ferment fervor fire flush frenzy intensity germ micro organism bacterium bug microbe bacillus ailment disease illness infection pathogen sickness flu venom o unknown o imbedded o not adjustable o over-inclusive
  • 10. Algorithmic Search: computes document “totals” and compares totals (Document 1 “total” α; Document 2 “total” β). Characteristics: unknown; imbedded; hard to adjust; over-inclusive; under-inclusive
  • 11. Algorithmic search with “seed sets”. [Figure: seed-set documents, each labeled R or NR, containing words such as cough, smoking, ache, malaise, sleep, sneezed, cocaine, congest, chill, chilled, ice, virus, computer crash, counsel, patent misuse, dripping, fever, trip, cold, runny; the R and NR seed sets yield seed set “totals” α and β.] Characteristics: unknown; imbedded; hard to adjust; over-inclusive; under-inclusive
  • 12. Statistics Supports Defensibility. Yield Estimate: estimate of responsive documents in data set. Data set: 100,000 documents. Sample: 1,000 docs, of which 150 are target docs, so 150/1000 = 15%. Hence an estimated 15,000 of the 100,000 docs in the data set are target docs (estimated responsive yield)
  • 13. Statistics Supports Defensibility. Sample of Results. “Tagged” data: 10,000 documents; 1,000-doc sample; 700 (70%) correctly tagged; 10,000 x 70% correct = 7,000 target docs tagged. “Not Tagged” data: 90,000 documents; 1,000-doc sample; 90/1000 (9%) target docs missed; 90,000 x 9% missed = 8,100 target docs missed. Recall: 7,000/15,100 = 46%. More target docs missed than tagged.
  • 14. Julia Brickell General Counsel H5 jbrickell@H5.com www.H5.com
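
The yield and recall arithmetic on slides 12 and 13 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `estimate_yield` and `estimate_recall` are hypothetical (they do not come from the deck), and the code reproduces the slides' point estimates only, without the confidence intervals a real defensibility analysis would likely report.

```python
def estimate_yield(population: int, sample_size: int, sample_hits: int) -> float:
    """Extrapolate the number of target docs in the population from a random sample."""
    # Multiply before dividing so whole-number inputs give an exact result.
    return population * sample_hits / sample_size


def estimate_recall(tagged_total: int, tagged_sample: int, tagged_correct: int,
                    untagged_total: int, untagged_sample: int, untagged_missed: int):
    """Estimate target docs found, target docs missed, and recall from two samples."""
    # Target docs among the tagged documents (correctly tagged).
    found = tagged_total * tagged_correct / tagged_sample
    # Target docs left behind in the not-tagged documents.
    missed = untagged_total * untagged_missed / untagged_sample
    # Recall = found / (found + missed).
    return found, missed, found / (found + missed)


if __name__ == "__main__":
    # Slide 12: 150 target docs in a 1,000-doc sample of a 100,000-doc data set.
    print(estimate_yield(100_000, 1_000, 150))  # 15000.0

    # Slide 13: 700 of 1,000 sampled tagged docs are correct (10,000 tagged in all);
    # 90 of 1,000 sampled not-tagged docs are target docs (90,000 not tagged in all).
    found, missed, recall = estimate_recall(10_000, 1_000, 700, 90_000, 1_000, 90)
    print(found, missed, f"{recall:.0%}")  # 7000.0 8100.0 46%
```

With the slide's numbers, the estimated count of missed target docs (8,100) exceeds the estimated count tagged (7,000), which is the point of slide 13: a 70% precision figure on the tagged set says little until the not-tagged set is sampled as well.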