Quantitative Methods for Lawyers - Class #8 - Bayes Rule and Conditional Probability - Professor Daniel Martin Katz

Quantitative Methods for Lawyers
Conditional Probability & Bayes Theorem
Class #8
@ computational
computationallegalstudies.com
professor daniel martin katz danielmartinkatz.com
lexpredict.com slideshare.net/DanielKatz

Conditional Probability is an Important Concept
and A Precursor to Discussing Bayes Rule
Relies on a Little Bit of Set Theory
Prob of “A Given B”:
$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$
(P(A intersect B) Divided by the Prob of B)

In a conditional probability problem, the sample space is
“reduced” to the “space” of the given outcome (e.g. if given
B, we now just care about the probability of A occurring
“inside” of B)
Given B, what’s the probability of A?
[Figure: a visual depiction of events A and B and their overlap]
Intuitively we are asking: What Share of B contains the overlap with A?

A Dice Based Example
What is the Probability of Getting a “2” if we know
that the number thrown is less than 5?


Again Here is Our Formula:

$P(\text{“2”} \mid \text{less than 5}) = \frac{P(\{2\} \cap \{1,2,3,4\})}{P(\{1,2,3,4\})}$

Okay, What is $P(\{2\} \cap \{1,2,3,4\})$?
The only element that intersects is “2”, so it is the
Prob of “2”, which is 1/6

Now What is $P(\{1,2,3,4\})$?
1/6 + 1/6 + 1/6 + 1/6 = 4/6

Okay, Let's Put it All Together:

$\frac{P(\{2\} \cap \{1,2,3,4\})}{P(\{1,2,3,4\})} = \frac{1/6}{4/6} = \frac{1}{4}$
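The dice calculation can be checked by counting outcomes. The sketch below assumes a fair six-sided die, so every outcome is equally likely; the function name conditional_probability is illustrative and not part of the original slides.

```python
from fractions import Fraction


def conditional_probability(event_a, event_b, sample_space):
    """Return P(A | B) for equally likely outcomes, as an exact fraction.

    Assumes event_b contains at least one outcome of the sample space.
    """
    p_b = Fraction(len(event_b & sample_space), len(sample_space))
    p_a_and_b = Fraction(len(event_a & event_b & sample_space), len(sample_space))
    return p_a_and_b / p_b


die = {1, 2, 3, 4, 5, 6}          # fair die: six equally likely outcomes
rolled_two = {2}                  # event A: the throw is a "2"
less_than_five = {1, 2, 3, 4}     # event B: the throw is less than 5

result = conditional_probability(rolled_two, less_than_five, die)
print(result)  # 1/4
```

Using Fraction keeps the arithmetic exact, so the result matches the hand calculation (1/6 divided by 4/6) rather than a rounded decimal.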

Monty Hall Problem
In “Let's Make a Deal” you are given the opportunity to
select one closed door of three, behind one of which there is
a prize.
The other two doors hide “goats” (or some other such
“non-prize”), or nothing at all. Once you have made your
selection, Monty Hall will open one of the remaining doors,
revealing that it does not contain the prize.

Monty Hall Problem
Assume You Picked Door #1
Now Assume Monty Has Removed Door #2
Here is the problem:
Should You Stay or Should You Switch?

Monty Hall Problem
Answer is You Should Switch
This is Counterintuitive
Key Fact: the host always opens a door that
reveals a goat (if not, the properties of the problem
would change)

Monty Hall Problem
[Figure: tree showing the probability of every possible outcome if the
player initially picks Door 1]
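The branches of such a tree can be enumerated directly. The sketch below is one way to do it, under two assumptions not spelled out on the slide: the prize is equally likely to be behind each door, and when the host has two goat doors to choose from he opens either with probability 1/2. All function names are illustrative.

```python
from fractions import Fraction

DOORS = (1, 2, 3)


def monty_hall_outcomes(pick=1):
    """List (prize door, opened door, probability) for every branch of the tree."""
    branches = []
    for prize in DOORS:
        # The host may only open a door that is neither the pick nor the prize.
        options = [d for d in DOORS if d != pick and d != prize]
        for opened in options:
            prob = Fraction(1, len(DOORS)) * Fraction(1, len(options))
            branches.append((prize, opened, prob))
    return branches


def win_probabilities(pick=1):
    """Return (P(win if you stay), P(win if you switch))."""
    stay = Fraction(0)
    switch = Fraction(0)
    for prize, _opened, prob in monty_hall_outcomes(pick):
        if prize == pick:
            stay += prob
        else:
            # With three doors, the one remaining closed door holds the prize.
            switch += prob
    return stay, switch


def switch_wins_given_opened(opened_door, pick=1):
    """Conditional probability that switching wins, given which door was opened."""
    branches = [b for b in monty_hall_outcomes(pick) if b[1] == opened_door]
    total = sum(prob for _, _, prob in branches)
    wins = sum(prob for prize, _, prob in branches if prize != pick)
    return wins / total


stay, switch = win_probabilities(pick=1)
print(stay, switch)                 # 1/3 2/3
print(switch_wins_given_opened(2))  # 2/3
```

The last line is a conditional probability in the same sense as the dice example: the sample space is reduced to the branches where Door 2 was opened, and switching wins in two thirds of that reduced space.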

Monty Hall Problem
There are 100 doors to pick from in the beginning.
You pick one door.
Monty looks at the 99 others, finds the goats, and opens all but 1.
Do you stick with your original door (1/100), or switch to the other door, which was
filtered from 99?
It’s a bit clearer now: Monty is taking a set of 99 choices and improving them by
removing 98 goats. When he’s done, he has the top door out of 99 for you to pick.
Your decision: Do you want a random door out of 100 (initial guess) or the best
door out of 99? Said another way, do you want 1 random chance or the best of 99
random chances?
We’re starting to see why Monty’s actions help us. He’s letting us choose between a
generic, random choice and a curated, filtered choice. Filtered is better.
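The 100-door intuition can also be tested by simulation. This is a rough sketch that assumes the host always leaves exactly one other door closed and never reveals the prize; the function names, the number of trials, and the seed are arbitrary choices for illustration.

```python
import random


def play(num_doors, switch, rng):
    """Play one game; return True if the player ends up with the prize."""
    prize = rng.randrange(num_doors)
    pick = rng.randrange(num_doors)
    # The host opens every other door except one. That remaining door is the
    # prize door if the player missed it, otherwise an arbitrary goat door.
    if pick == prize:
        other = rng.choice([d for d in range(num_doors) if d != pick])
    else:
        other = prize
    final = other if switch else pick
    return final == prize


def win_rate(num_doors, switch, trials=20_000, seed=0):
    """Estimate the chance of winning over many simulated games."""
    rng = random.Random(seed)
    wins = sum(play(num_doors, switch, rng) for _ in range(trials))
    return wins / trials


print("100 doors, stay:  ", win_rate(100, switch=False))
print("100 doors, switch:", win_rate(100, switch=True))
print("3 doors, switch:  ", win_rate(3, switch=True))
```

With 100 doors the estimate for staying should land near 1/100 and the estimate for switching near 99/100; with 3 doors, switching should land near 2/3.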

Spam Filtering
Fighting spam is a constant exercise. As the junk filters
become more intelligent, the spam senders come up with
innovative means to ensure their emails reach your
inbox.
The automatic identification of spam and phishing scams
is usually coupled with a “human” element. This
human element has to be weighted / blended.

Spam Filtering
When many people mark an email message as spam,
the filter will eventually “update”
(using an updating rule)
The properties of a spam message are constantly in flux.
Thus, spam filters need to be taught constantly.

Spam Filtering
The key insight is that when developing a filter we are trying
to mimic the information that allows you (as a human
reasoner) to rapidly detect that a message is spam:
(1) Message is from another country (in particular
China, Nigeria, India, etc.)
(2) Message is from a new email address
(3) ... What Else?

Bayes Rule In Spam Filtering
Some of the same properties at work in spam filtering are
those at work in E-Discovery.
Bayesian spam filters calculate the probability of a
message being spam based on its contents. Unlike
simple content-based filters, Bayesian spam filtering
learns from spam and from good mail, resulting in a
very robust, adaptive and efficient anti-spam approach
that, best of all, returns hardly any false positives.
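To make “the probability of a message being spam based on its contents” concrete, Bayes Rule for a single word can be written as P(spam | word) = P(word | spam) P(spam) / [P(word | spam) P(spam) + P(word | good mail) P(good mail)]. The sketch below applies that formula; the input numbers are hypothetical and chosen only for illustration, and a real filter would combine many words rather than one.

```python
def posterior_spam(p_word_given_spam, p_word_given_ham, p_spam):
    """Bayes Rule: P(spam | word) from the word's frequency in spam and good mail."""
    p_ham = 1.0 - p_spam
    numerator = p_word_given_spam * p_spam
    evidence = numerator + p_word_given_ham * p_ham  # overall P(word)
    return numerator / evidence


# Hypothetical inputs: the word appears in 40% of spam, in 1% of good mail
# ("ham"), and half of all incoming mail is spam.
p = posterior_spam(p_word_given_spam=0.40, p_word_given_ham=0.01, p_spam=0.50)
print(round(p, 3))  # 0.976
```

Under these assumed numbers, seeing the word moves the probability of spam from the prior of 0.50 to roughly 0.98. A word that is equally common in spam and good mail would leave the prior unchanged.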

Bayes Rule In Spam Filtering
Type 1 vs. Type 2 Error Trade Off:
Type 1 = False Positive
(convict someone / something that is innocent)
Type 2 = False Negative
(fail to convict someone / something that is guilty)
Which Would We Rather Have in This Context?

Answer: Allow Some Spam Messages to Go to Your Inbox
(accept some false negatives rather than send legitimate mail to the spam folder)

Bayes Rule In Spam Filtering
A basic scoring, content-based spam filter
looks for words and other characteristics typical of spam.
Every characteristic element is assigned a score, and a
spam score for the whole message is computed from the
individual scores.
Some scoring filters also look for characteristics of
legitimate mail, which lower the overall score.
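A scoring filter of this kind can be sketched in a few lines. Everything specific here is an assumption made for illustration: the words, their scores, and the threshold are hypothetical, and the tokenizer simply splits on whitespace without handling punctuation.

```python
# Hypothetical feature scores: positive values are typical of spam,
# negative values are typical of legitimate mail.
FEATURE_SCORES = {
    "free": 2.0,
    "winner": 3.0,
    "urgent": 1.5,
    "meeting": -2.0,
    "attached": -1.0,
}
SPAM_THRESHOLD = 4.0  # assumed cutoff; raising it means fewer false positives


def spam_score(message):
    """Add up the scores of every known characteristic found in the message."""
    words = set(message.lower().split())
    return sum(score for word, score in FEATURE_SCORES.items() if word in words)


def is_spam(message, threshold=SPAM_THRESHOLD):
    """Flag the message when its total score reaches the threshold."""
    return spam_score(message) >= threshold


print(spam_score("You are a WINNER claim your FREE prize"))   # 5.0
print(is_spam("You are a WINNER claim your FREE prize"))      # True
print(spam_score("Urgent agenda attached for the meeting"))   # -1.5
print(is_spam("Urgent agenda attached for the meeting"))      # False
```

The threshold is where the Type 1 vs. Type 2 trade off shows up: a higher cutoff lets more spam through but is less likely to flag legitimate mail.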

Train the Filter =
In light of what you have now identified as spam, update
the scoring methods or properties that the spam filter
uses
Wisdom of Crowds --> Leverage a large data set to see
what the crowd thinks is spam
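One simple way to picture “training” is a filter that keeps word counts and revises them each time a message is marked as spam or as good mail. The sketch below is a toy illustration of that idea, not a description of any particular product: the class name, the add-one smoothing, and the sample messages are all assumptions.

```python
from collections import Counter


class TrainableFilter:
    """Toy filter that updates its word statistics from labeled messages."""

    def __init__(self):
        self.spam_words = Counter()  # number of spam messages containing each word
        self.ham_words = Counter()   # same, for good mail ("ham")
        self.spam_messages = 0
        self.ham_messages = 0

    def train(self, message, is_spam):
        """Record one labeled message, e.g. a user clicking 'mark as spam'."""
        words = set(message.lower().split())
        if is_spam:
            self.spam_words.update(words)
            self.spam_messages += 1
        else:
            self.ham_words.update(words)
            self.ham_messages += 1

    def word_spam_probability(self, word):
        """Estimate P(spam | word) with Bayes Rule and add-one smoothing."""
        p_word_spam = (self.spam_words[word] + 1) / (self.spam_messages + 2)
        p_word_ham = (self.ham_words[word] + 1) / (self.ham_messages + 2)
        total = self.spam_messages + self.ham_messages
        p_spam = (self.spam_messages + 1) / (total + 2)
        numerator = p_word_spam * p_spam
        return numerator / (numerator + p_word_ham * (1 - p_spam))


spam_filter = TrainableFilter()
before = spam_filter.word_spam_probability("prize")  # no data yet

for _ in range(3):
    spam_filter.train("claim your prize now", is_spam=True)
for _ in range(2):
    spam_filter.train("lunch meeting at noon", is_spam=False)

after = spam_filter.word_spam_probability("prize")
print(before, round(after, 2))
```

Before any training the estimate sits at 0.5 (no evidence either way); after a few labeled messages it rises for words seen in spam and falls for words seen in good mail. With a large pool of users marking messages, the same update draws on what the crowd thinks is spam.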

Example From an Info Tech Company
http://www.bluewatermedia.com/support/spam-filter.html


Keep Thinking About the Relationship between
Spam Filters and E-Discovery / Automated Doc Review

Daniel Martin Katz
Illinois Tech - Chicago-Kent College of Law
