How Do Algorithms
Become Biased?
Eva Sasson
@evasasson
#DataDayMX
What you’ll learn today
1. How to build a predictive model
2. Where in the building process bias
can be introduced
3. What the real-world ramifications are
What are all these “buzzwords”?
● Data science produces insights
● Machine learning produces predictions
● Artificial intelligence produces actions
paulvanderlaken.com
First off, let’s define bias
1. Sample bias
2. Systematic value distortion
3. Prejudice or stereotype bias
Let’s build our
model!
Predicting Property Prices
1. Get data
2. Clean & Prepare
3. Train & Test
4. Improve
5. Predict
Step 1:
Gathering the data
Getting Data
◂ Public datasets
◂ APIs
◂ Existing datasets
◂ Surveys
◂ Web scrapers: import.io, Beautiful Soup, Scrapy
Our Model: Predicting Housing Prices
Price ($$$) is our chosen prediction variable
Biggest
concentration of
bias is in the
training data itself!
Removing ‘race’ from the dataset
doesn’t remove the problem
◂ Remaining variables can proxy race
◂ If race is a useful predictor, then you have
a hole in the data
◂ Indirect discrimination
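A minimal sketch of the proxy effect, using synthetic data rather than the dataset from the talk. The group labels, the two zip codes, and the 90% segregation rate are all assumptions invented for illustration; the only point is that a column left in the data can let a model recover a column that was dropped.

```python
import random
from collections import Counter, defaultdict

random.seed(0)  # fixed seed so the sketch is repeatable

# Synthetic, hypothetical data: each person has a protected group ("A"/"B")
# and a zip code. Assumption: 90% of each group lives in "its" zip code.
rows = []
for _ in range(1000):
    group = random.choice(["A", "B"])
    home_zip = "10001" if group == "A" else "10002"
    other_zip = "10002" if group == "A" else "10001"
    zip_code = home_zip if random.random() < 0.9 else other_zip
    rows.append({"group": group, "zip": zip_code})

# "Remove race from the dataset": keep only the zip code column.
features = [{"zip": r["zip"]} for r in rows]

# Guess the dropped column from zip code alone (majority group per zip).
by_zip = defaultdict(Counter)
for r in rows:
    by_zip[r["zip"]][r["group"]] += 1
majority = {z: counts.most_common(1)[0][0] for z, counts in by_zip.items()}

hits = sum(majority[f["zip"]] == r["group"] for f, r in zip(features, rows))
proxy_accuracy = hits / len(rows)
print(f"Group recovered from zip code alone: {proxy_accuracy:.0%}")
```

In this toy setup the dropped column is mostly recoverable from zip code, which is what indirect discrimination relies on.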
Now we know the
risks of training
data...
What do we do now?
Step 2:
Explore and Clean Data
80% of data science is cleaning and preparing data
Examples of data cleaning
1. Remove duplicates
2. Remove empty columns
3. Remove irrelevant variables
4. Fill empty values with the column average, or mark them as 0
5. Remove rows that are blank for the features
most important to you
6. Standardize units
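The cleaning steps above can be sketched in plain Python. The listings, column names, and the choice of price as the "most important" feature are hypothetical, not taken from the talk's dataset. Step 3 is left out because deciding which variables are irrelevant is a judgment call that code cannot make.

```python
from statistics import mean

# Hypothetical raw listings; None marks a blank cell.
raw = [
    {"id": 1, "price": 300000, "sqft": 1200, "lot_sqm": None, "notes": None},
    {"id": 1, "price": 300000, "sqft": 1200, "lot_sqm": None, "notes": None},  # duplicate
    {"id": 2, "price": None, "sqft": 900, "lot_sqm": 50.0, "notes": None},  # no price
    {"id": 3, "price": 450000, "sqft": None, "lot_sqm": 80.0, "notes": None},
    {"id": 4, "price": 250000, "sqft": 1000, "lot_sqm": 40.0, "notes": None},
]

SQM_TO_SQFT = 10.7639  # square metres to square feet

# 1. Remove duplicates (exact repeats of a whole row).
seen, rows = set(), []
for r in raw:
    key = tuple(r.items())
    if key not in seen:
        seen.add(key)
        rows.append(dict(r))

# 2. Remove columns that are empty in every row.
empty_cols = [c for c in rows[0] if all(r[c] is None for r in rows)]
for r in rows:
    for c in empty_cols:
        del r[c]

# 5. Remove rows that are blank for the most important feature (here: price).
rows = [r for r in rows if r["price"] is not None]

# 4. Fill the remaining blanks with the column average.
for col in ("sqft", "lot_sqm"):
    avg = mean(r[col] for r in rows if r[col] is not None)
    for r in rows:
        if r[col] is None:
            r[col] = avg

# 6. Standardize units: express lot size in square feet like the other area column.
for r in rows:
    r["lot_sqft"] = r.pop("lot_sqm") * SQM_TO_SQFT

print(rows)
```

Each step encodes a choice: filling with the average versus 0, and which feature counts as "most important", both change what the model later sees.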
9,500 rows started with
5,360 rows of data remaining
4,140 duplicate and blank rows removed
Be careful in the
cleaning process!
Some variables can be
tampered with, dropped, and
some cannot.
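One hedged way to be careful is to compare who is in the data before and after a cleaning step. The sketch below uses made-up listings in which blank prices are concentrated in one area; the `area` column and the counts are assumptions for illustration only.

```python
from collections import Counter

def share_by_group(rows, key):
    """Return each group's share of the rows, e.g. {"north": 0.5, ...}."""
    counts = Counter(r[key] for r in rows)
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items()}

# Hypothetical listings: blanks are more common in one neighborhood.
before = (
    [{"area": "north", "price": 500000}] * 50
    + [{"area": "south", "price": 200000}] * 20
    + [{"area": "south", "price": None}] * 30
)
after = [r for r in before if r["price"] is not None]  # drop blank rows

shares_before = share_by_group(before, "area")
shares_after = share_by_group(after, "area")
for area in shares_before:
    print(area, f"{shares_before[area]:.0%} -> {shares_after.get(area, 0):.0%}")
```

If a group's share shrinks sharply after cleaning, the "clean" dataset under-represents it, which is a form of sample bias introduced by the cleaning itself.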
“Collecting and cleaning data is an
inherently subjective process.”
- Fabliha Ibnat, researcher at the University of
Washington
Training data =
biased, skewed,
incomplete,
human-labelled,
human-cleaned
Supplement your dataset
with new features that
might help you!
Adding additional variables by zip code
◂ Yelp count of stars
◂ Yelp average of stars
◂ Average household income
◂ Per capita income
◂ High income households (% > $200k/yr)
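Adding zip-code-level variables amounts to a lookup join. A minimal sketch follows, assuming two hypothetical lookup tables keyed by zip code; the field names and values are invented, not the talk's data.

```python
# Hypothetical per-zip-code lookup tables (all values are made up).
yelp_by_zip = {
    "10001": {"yelp_star_count": 1200, "yelp_avg_stars": 4.1},
    "10002": {"yelp_star_count": 300, "yelp_avg_stars": 3.6},
}
income_by_zip = {
    "10001": {"avg_household_income": 150000, "pct_over_200k": 0.22},
    "10002": {"avg_household_income": 60000, "pct_over_200k": 0.03},
}

listings = [
    {"id": 1, "zip": "10001", "price": 900000},
    {"id": 2, "zip": "10002", "price": 400000},
    {"id": 3, "zip": "10003", "price": 650000},  # zip missing from both tables
]

def enrich(listing, *tables):
    """Return a copy of the listing with every matching zip-level feature added."""
    out = dict(listing)
    for table in tables:
        out.update(table.get(listing["zip"], {}))
    return out

enriched = [enrich(row, yelp_by_zip, income_by_zip) for row in listings]
for row in enriched:
    print(row)
```

Every home in a zip code receives the same neighborhood-level values, so whatever bias sits in those values is attached to each listing there.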
Yelp data seems pretty
democratic; that can’t
cultivate bias, right?
Not so fast!
“As people talk about authenticity
more online, star ratings decrease,
independent of food quality”
- Sara Kay, food educator in NYC
Housing prices from Yelp stars
Low housing prices with
many ethnic restaurants
High housing prices
with few ethnic restaurants
What happened with
Amazon Express?
We have our data and
it’s cleaned and
wrangled
Now, we’re ready to build
our model with it
Step 3:
Train your model & test it
Pick your train:test ratio
80:20 split
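A minimal sketch of an 80:20 split in plain Python. The helper name `split_train_test` and the seed are assumptions; the row count reuses the 5,360 cleaned rows mentioned earlier, with integers standing in for real rows.

```python
import random

def split_train_test(rows, test_ratio=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(5360))  # stand-ins for the 5,360 cleaned rows
train, test = split_train_test(rows)
print(len(train), len(test))
```

Shuffling before splitting matters: if the rows are sorted (by neighborhood, say), an unshuffled split would test on a different population than it trained on.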
Optimize for: MAPE (mean absolute percentage error)
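MAPE is the mean absolute percentage error: the average of |actual - predicted| / |actual| over all rows. A short sketch of the metric follows; the prices are invented for illustration.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, as a fraction (0.10 means 10%)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is 0")
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical sale prices and model predictions.
actual = [200000, 400000, 1000000]
predicted = [220000, 360000, 1000000]
score = mape(actual, predicted)
print(f"MAPE: {score:.1%}")
```

Because the error is relative, a $20,000 miss counts for more on a $200,000 home than on a $1,000,000 one.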
Results table
Variable Importance for Random Forest
Variable Importance for XGBoost
13% of prediction power based on high household income
“Algorithms replicate the
status quo”
- Cathy O’Neil. Author, speaker, professor.
Step 4: Improve
Hyperparameter tuning
Experiment with Hyperparameter tuning
◂ Increase or decrease number of trees
◂ 10-fold cross validation
◂ Look at depth
◂ Random seed
◂ Where to split the data
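Of the options above, 10-fold cross validation is the easiest to show in a few lines. This is a sketch of the index bookkeeping only, with no model attached and no shuffling; the function name and the 25-row example are assumptions.

```python
def k_fold_indices(n_rows, k=10):
    """Yield (train_idx, test_idx) pairs; each row lands in exactly one test fold."""
    indices = list(range(n_rows))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

folds = list(k_fold_indices(25, k=10))
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx)
```

In practice the rows would be shuffled first, and the model would be trained and scored once per fold, with the scores averaged.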
Removing outliers leads to lowest MAPE
“Algorithms will do more justice
to the people who are easiest
to understand at the expense
of those who aren’t.”
- Michael Veale, PhD in Responsible ML at UCL
Error modeling for XGBoost
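One simple way to see where a model's errors land is to compute MAPE per group instead of one overall number. This is a sketch on invented predictions, not the talk's XGBoost results; the `segment` labels and prices are assumptions.

```python
from collections import defaultdict

def mape_by_group(records, group_key):
    """Return {group: MAPE} for records holding 'actual' and 'predicted' prices."""
    errors = defaultdict(list)
    for r in records:
        errors[r[group_key]].append(abs(r["actual"] - r["predicted"]) / abs(r["actual"]))
    return {g: sum(e) / len(e) for g, e in errors.items()}

# Hypothetical predictions: many "typical" homes, a few unusual ones.
records = (
    [{"segment": "typical", "actual": 400000, "predicted": 410000}] * 90
    + [{"segment": "unusual", "actual": 150000, "predicted": 210000}] * 10
)

overall = sum(abs(r["actual"] - r["predicted"]) / r["actual"] for r in records) / len(records)
per_group = mape_by_group(records, "segment")
print(f"overall: {overall:.1%}")
for segment, score in per_group.items():
    print(f"{segment}: {score:.1%}")
```

A low overall error can hide a much larger error for the smaller group, which is the pattern the quote above describes.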
Step 5: Predict
Put your model into action to
uncover predictive insights
Actual Value vs
Predicted Value
Issues and areas of
bias with prediction
Problems of
Big Data Hubris
Algorithms make the same
prediction every time.
Most algorithms are
secret
Human bias treated as
science. Opinion
embedded in math.
Cathy O’Neil.
Disadvantages the
already disadvantaged.
What we can make from all of this
Conclusions,
Takeaways &
Solutions
What can we do
about it?
“When we look at bias as
just a technical issue, we
are missing the point”
- Kate Crawford
Awareness
We must check
our training data
Python
Packages
FairML package
Black Box
AI Fairness 360
Transparency
What assumptions were made?
How were decisions made?
Who may be affected?
What was the underlying logic?
Who is most at risk?
Purpose limitation
(Just because it exists,
doesn’t mean it should
be used for new
purposes)
Representation!
Diversity in:
ideas
opinions
perspectives
Any Questions?
Thanks!
@evasasson
Eva Sasson

Scanning the Internet for External Cloud Exposures via SSL Certs
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 

How can algorithms be biased?

Editor's notes

  1. Hello! My name is Eva. I am SO humbled and honored to be here today to talk to you about this topic that is so important to me. Two years ago I completed an MSc in business analytics and management science, where I learned all about algorithms, including their benefits and risks. Now I’m a PMM at Sentry in SF. Please tweet me with your questions or comments.
  2. A great example for understanding the difference is an autonomous car stopping at a stop sign. Data science: understanding false negatives and finding insights, such as whether time of day matters for the car to stop. Machine learning: gathering a dataset to predict whether or not there is a stop sign. AI: taking the action of applying the brakes at the stop sign.
  3. Get data; clean, prepare, and manipulate (feature extraction); train; test; deploy; and improve.
  4. Used as a framework
  5. What are examples of this?
  6. The bias is still there, even though the variable is removed.
  7. How do we move forward with building our model?
  8. Get data; clean, prepare, and manipulate (feature extraction); train; test; deploy; and improve.
  9. Many datasets are collected by a particular entity to answer specific types of questions and accomplish a particular goal. To minimize bias at the data cleaning stage, get context on how the raw data was collected and how certain variables should be interpreted.
  10. The research on Yelp data posted on Eater shows that Mexican cuisine in the US had the highest number of people talking about “authenticity,” followed by Chinese, Thai, Japanese, and Indian.
  11. How would using this data of Yelp stars by zip code to predict housing prices affect our model?
  12. This is another dataset with human-informed, opt-in decisions. Already when using both Amazon Express and Yelp, you have Omitted Variable Bias since the service is voluntary and opt-in.
  13. Racial discrimination that comes through location-based opt-in data
  14. So back to our model
  15. Get data; clean, prepare, and manipulate (feature extraction); train; test; deploy; and improve.
  16. The “test” data is used for evaluation.
  17. We can’t and shouldn’t blindly look at prediction power; we also need to understand the variables that are driving the predictions.
  18. This will build on itself in a way that the people who created it wouldn’t necessarily have wanted. This will compound and continue a vicious cycle. Rich people’s houses would have a higher value BECAUSE they are rich.
  19. It’s a self-fulfilling prophecy - it’s reinforcing the vicious cycles of inequality in society, and disadvantages that already exist.
  20. We looked at this a bit in the last section, but we’re going to try to improve our model even more
  21. Get data; clean, prepare, and manipulate (feature extraction); train; test; deploy; and improve.
  22. Also notice that it gives rise to new variables for XGBoost.
  23. Algorithms don’t do well with outlier data, like we learned when building our model.
  24. The bell curve sits slightly to the left: the model is slightly undervaluing the property prices. This would be favorable to buyers but not to sellers. Be mindful of who this hurts.
  25. Get data; clean, prepare, and manipulate (feature extraction); train; test; deploy; and improve.
  26. This phase is when we bring the bias to life. What’s the problem with using predictive algorithms?
  27. In 2013, Google mispredicted the peak of the flu by 140%.
  28. If multiple companies used that same tool, women would have a hard time getting hired anywhere.
  29. Personal story
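  30. We can’t blindly trust the algorithm.
  31. We can check how the decisions we make affect the models we build, which in turn can affect real people’s lives. First, obviously, awareness. We’ve touched on this. That’s my whole point of talking to you about this.
  32. This is for now.
  33. Right now, it’s a fact: one small, homogeneous group of people makes decisions that affect everybody. The people who create these algorithms generally look like each other, have a similar upbringing, and talk the same way. When we have more people of color training image-recognition models, we’re less likely to have self-driving cars that can’t recognize people of color. When we have more women writing software and training data models, we’re less likely to have hiring algorithms that discriminate against women. When you bring your diverse perspective to the conversation, you change the conversation.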
  33. Right now, it’s a fact, one small, homogenous group of people, make decisions that affect everybody. The people who create this generally look like each other, with a similar upbringing, look and talk the same way. When we have more people of color training image-recognition models, we’re less likely to have self-driving cars that can’t recognize people of color. When we have more women writing software and training data models, we’re less likely to have hiring algorithms that discriminate against women. When you bring your diverse perspective to the conversation, you change the conversation.