ISSUES IN AI, POLICING AND
CRIMINAL JUSTICE:
BIAS
Dr Janet Bastiman
@yssybyl
janet.bastiman@story-stream.com
STORY-STREAM.COM.
What is AI?
"Any system that makes a decision that appears to be intelligent
from specific inputs" John McCarthy (1955)
AI -> Machine Learning -> Deep Learning -> (G)AI
Visualizing and Understanding Convolutional Networks,
Zeiler and Fergus 2013
https://arxiv.org/pdf/1311.2901.pdf
Deep if there's more than one stage
of non-linear feature transformation
What is Bias?
"an unwarranted correlation between input variables and output classification"
"erroneous assumptions in the learning algorithm resulting
in missing relevant relationships" - bias (underfitting)
"noise is modelled rather than the valid outputs" - variance (overfitting)
Under vs Over
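A minimal sketch of the under- versus over-fitting contrast on synthetic data. The linear ground truth, the noise level and the three toy models (a constant, a least-squares line, a nearest-neighbour memoriser) are assumptions chosen for illustration and do not come from the slides.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

def make_data(n):
    """Synthetic data (an assumption): y = 2x plus Gaussian noise."""
    xs = [random.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + random.gauss(0.0, 1.0) for x in xs]
    return xs, ys

def mse(model, xs, ys):
    """Mean squared error of a model on a data set."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

train_x, train_y = make_data(50)
test_x, test_y = make_data(200)

mean_x = sum(train_x) / len(train_x)
mean_y = sum(train_y) / len(train_y)

def constant_model(x):
    # Too simple: ignores the input entirely (underfitting).
    return mean_y

# Ordinary least-squares line fitted to the training data.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
         / sum((x - mean_x) ** 2 for x in train_x))
intercept = mean_y - slope * mean_x

def linear_model(x):
    return intercept + slope * x

def memoriser(x):
    # Returns the label of the nearest training point, noise included
    # (overfitting: the noise is modelled as if it were signal).
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]

results = {}
for name, model in [("constant", constant_model),
                    ("linear", linear_model),
                    ("memoriser", memoriser)]:
    results[name] = (mse(model, train_x, train_y), mse(model, test_x, test_y))
    print(f"{name:10s} train MSE={results[name][0]:.3f} test MSE={results[name][1]:.3f}")
```

The memoriser reaches zero error on the data it has seen but still makes errors on fresh data, while the constant model fits neither set well.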
Poor data is also a problem
Algorithm too simple
Prediction: Cow Prediction: Horse
Industry problems
Nothing is (externally) peer reviewed
IP is in the training data, network architecture and test set - hidden
Results are cherry picked / exaggerated
This might look fine but would be embarrassing and potentially
contract ending for a brand – how disastrous could this be for a
decision on a person’s life?
Machine Washing
• Overcomplicating AI deliberately to promote its abilities
• Removes desire to question as it sounds too difficult
• Pretending something is AI when it’s not
We cannot have transparency when the general public are deliberately misled
with statistics in the interest of sensationalism.
Bias in data gathering
• Ignorance of the problem
– building a Lego kit without picture or instructions
• Asking biased questions to lead the model
– presupposing the answer
• Limiting the data to a set that supports the hypothesis
Data choices
Must be:
• representative of real world data
• varied
• sufficient
• able to define the predictions well
There will always be exceptions - a good model will handle these sensibly.
Ignorance of the problem space will lead to poor models
Data choices ...
Biased data will result in biased models
E.g. Oxbridge entry is inherently biased:
• state school candidates are:
• less likely to apply
• more likely to apply to oversubscribed courses
Hence there is an observed bias towards privately educated students
getting places.
Racial discrepancy is exacerbated without understanding the inherent
data bias
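The course-choice effect described above can be sketched with a small worked example. All counts below are hypothetical and are not real admissions data; the sketch assumes each course admits state and private applicants at exactly the same rate, and shows that the aggregate figures can still differ sharply.

```python
# Hypothetical applicant and offer counts (NOT real admissions data).
# Within each course both school types are admitted at the same rate.
applications = {
    # (school type, course): (applicants, offers)
    ("state", "oversubscribed"): (800, 80),
    ("state", "other"): (200, 80),
    ("private", "oversubscribed"): (200, 20),
    ("private", "other"): (800, 320),
}

def course_rate(school, course):
    """Offer rate for one school type on one course."""
    applicants, offers = applications[(school, course)]
    return offers / applicants

def overall_rate(school):
    """Offer rate for one school type pooled across all courses."""
    rows = [counts for (s, _), counts in applications.items() if s == school]
    total_applicants = sum(applicants for applicants, _ in rows)
    total_offers = sum(offers for _, offers in rows)
    return total_offers / total_applicants

for course in ("oversubscribed", "other"):
    print(course, course_rate("state", course), course_rate("private", course))
print("overall", overall_rate("state"), overall_rate("private"))
```

With these assumed numbers the per-course rates match (10% and 40%), yet the pooled rate is 16% for state applicants and 34% for private applicants. A model trained only on the pooled outcome would learn the school type as a predictor.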
COMPAS: Is there bias? If so, where?
• Only looked at individuals arrested for crimes
• Exact algorithms and training data unknown
• 137-point questionnaire to feed risk score
• Questions appear unsuitable
• Testing showed a significant correlation between risk scores and
reoffending
• Black individuals were given higher risk scores than white
individuals for the same crimes
Was it biased?
COMPAS: Self-Validation
• 5575 individuals
• 50.2% Black, 42.0% White, 7.8% Other
• 86.3% male
• Had been assessed by COMPAS for risk of reoffence and were
then monitored over 12 months.
• Over half the individuals scored low (1-3)
• The report switches between percentages, percentage changes and
percentages per category to show results in the best light
http://criminology.fsu.edu/wp-content/uploads/Validation-of-the-COMPAS-Risk-Assessment-Classification-Instrument.pdf
COMPAS: ProPublica Study
• Same individuals
• Concluded that there was fundamental bias, as the false positive rate
for black defendants categorised as high risk was twice that for
white defendants
Two conflicting statistical studies…
https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm
Bias-free but skewed population?
• Chouldechova (2016) reviewed both
• For all risk scores in the set, the test is fair if the probability of
reoffending at a given score is the same irrespective of group membership
• P(Y=1 | S= s, R = b) = P(Y=1 | S=s, R = w)
• COMPAS adheres well to this fairness condition
• However, FNR and FPR are related by:
• $\mathrm{FPR} = \dfrac{p}{1-p} \cdot \dfrac{1-\mathrm{PPV}}{\mathrm{PPV}} \cdot (1-\mathrm{FNR})$
• If the recidivism rate differs between the two groups then a fair test
score cannot have equal FPR and FNR across the two groups
• Either the prediction is unbiased or the error is unbiased
https://arxiv.org/pdf/1610.07524.pdf
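The relation between FPR and FNR can be checked numerically. In the sketch below, p is the prevalence (the fraction of a group that reoffends) and PPV is the positive predictive value. The function names and all numbers are hypothetical and chosen for illustration; they are not COMPAS figures.

```python
def fpr_from_identity(p, ppv, fnr):
    """FPR implied by prevalence p, PPV and FNR via the identity above."""
    return p / (1.0 - p) * (1.0 - ppv) / ppv * (1.0 - fnr)

def rates_from_counts(tp, fp, fn, tn):
    """Prevalence, PPV, FNR and FPR computed directly from a confusion matrix."""
    total = tp + fp + fn + tn
    return {
        "p": (tp + fn) / total,
        "ppv": tp / (tp + fp),
        "fnr": fn / (tp + fn),
        "fpr": fp / (fp + tn),
    }

# Sanity check: the identity reproduces the FPR of a hypothetical confusion matrix.
check = rates_from_counts(tp=65, fp=35, fn=35, tn=65)
identity_fpr = fpr_from_identity(check["p"], check["ppv"], check["fnr"])
print(check["fpr"], identity_fpr)

# Two hypothetical groups with the same PPV and FNR but different prevalence.
fpr_high = fpr_from_identity(p=0.5, ppv=0.6, fnr=0.35)
fpr_low = fpr_from_identity(p=0.4, ppv=0.6, fnr=0.35)
print(f"prevalence 0.5: FPR={fpr_high:.3f}  prevalence 0.4: FPR={fpr_low:.3f}")
```

Holding PPV and FNR equal across the two groups forces the higher-prevalence group to have the higher false positive rate (about 0.43 against 0.29 with these assumed inputs).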
Inherent trade offs
• Kleinberg et al (2016)
• Is statistical parity possible? Or should we strive for balance of
classes, so that the chance of making a mistake does not depend
on an individual's group?
• Also determined independently that you could not balance
unbiased predictions with unbiased errors.
• They define more complex feature vectors 𝜎 and avoid the case
where 𝜎 is not defined or incomplete for some individuals
• Rigorous proof that you cannot balance all sides
https://arxiv.org/pdf/1609.05807.pdf
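A small illustration of the trade-off, under stated assumptions: two hypothetical groups receive a score of either 0.2 or 0.8, and the score is calibrated within each group (the fraction of positives in a bin equals the score). All counts are invented for the example. Because the base rates differ, the average score given to people who do reoffend (balance for the positive class) and to those who do not (balance for the negative class) differs between the groups.

```python
# Hypothetical counts per group and score bin: score -> (positives, negatives).
groups = {
    "A": {0.8: (40, 10), 0.2: (10, 40)},
    "B": {0.8: (16, 4), 0.2: (16, 64)},
}

def base_rate(bins):
    """Fraction of the group that is positive."""
    positives = sum(pos for pos, _ in bins.values())
    total = sum(pos + neg for pos, neg in bins.values())
    return positives / total

def is_calibrated(bins, tol=1e-9):
    """True if, in every bin, the fraction of positives equals the score."""
    return all(abs(pos / (pos + neg) - score) < tol
               for score, (pos, neg) in bins.items())

def mean_score(bins, positive):
    """Average score over the positive (or negative) members of a group."""
    idx = 0 if positive else 1
    weight = sum(counts[idx] for counts in bins.values())
    return sum(score * counts[idx] for score, counts in bins.items()) / weight

for name, bins in groups.items():
    print(name,
          "base rate", base_rate(bins),
          "calibrated", is_calibrated(bins),
          "mean score | positive", round(mean_score(bins, True), 3),
          "mean score | negative", round(mean_score(bins, False), 3))
```

Both groups pass the calibration check, yet positives in group A average a score of 0.68 against 0.50 in group B, so the balance conditions fail even though the score is calibrated.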
Making data fair?
• Hardt et al (2016)
• Ignore all protected attributes? Ineffective due to redundant
encodings and other connected attributes
• Demographic parity? If the probability within a class varies you
cannot have positive parity and error parity at the same time
• Propose new parity – aiming to equalise both true positives and
false positives
• “fairer” is still subjective
https://ttic.uchicago.edu/~nati/Publications/HardtPriceSrebro2016.pdf
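The parity notions above can be measured on a labelled data set. The sketch below computes, per group, the positive prediction rate (compared across groups for demographic parity) and the true and false positive rates (compared across groups for the parity Hardt et al. call equalized odds). The records and the function name are hypothetical, and the code only measures the gaps; it does not implement the paper's correction procedure.

```python
from collections import defaultdict

# Hypothetical records: (group, true outcome, predicted outcome).
records = [
    ("a", 1, 1), ("a", 1, 1), ("a", 1, 0), ("a", 0, 1), ("a", 0, 0), ("a", 0, 0),
    ("b", 1, 1), ("b", 1, 0), ("b", 0, 1), ("b", 0, 0), ("b", 0, 0), ("b", 0, 0),
]

def group_rates(rows):
    """Per-group TPR, FPR and positive prediction rate."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
    for group, y_true, y_pred in rows:
        # "t"/"f": prediction correct or not; "p"/"n": predicted class.
        key = ("t" if y_true == y_pred else "f") + ("p" if y_pred == 1 else "n")
        counts[group][key] += 1
    rates = {}
    for group, c in counts.items():
        total = c["tp"] + c["fp"] + c["fn"] + c["tn"]
        rates[group] = {
            "tpr": c["tp"] / (c["tp"] + c["fn"]),
            "fpr": c["fp"] / (c["fp"] + c["tn"]),
            "positive_rate": (c["tp"] + c["fp"]) / total,
        }
    return rates

rates = group_rates(records)
for metric in ("positive_rate", "tpr", "fpr"):
    gap = abs(rates["a"][metric] - rates["b"][metric])
    print(f"{metric:14s} a={rates['a'][metric]:.3f} b={rates['b'][metric]:.3f} gap={gap:.3f}")
```

A gap of zero in `positive_rate` would indicate demographic parity; gaps of zero in both `tpr` and `fpr` would indicate equalized odds. Which gap matters most remains a judgment about the goal of the system.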
Summary
• Cannot maximise predictive success and equalise error across
unbalanced classes
• Several approaches – the right one depends on the goal of the algorithm
• Input data needs to be unbiased
“All models are wrong, but some are useful – how wrong do they have to be
to not be useful?” – George E. P. Box
Human “gut feel” decisions are accepted and, when challenged, are considered
okay as long as some of the contributing factors are explained.
Are we holding AI predictive models to an unachievably high standard that
we do not apply to humans?

AI Bias Oxford 2017
