Not Fair! Testing AI Bias and
Organizational Values
Peter Varhol and Gerie Owen
About me
• International speaker and writer
• Graduate degrees in Math, CS, Psychology
• Technology communicator
• AWS certified
• Former university professor, tech journalist
• Cat owner and distance runner
• peter@petervarhol.com
Gerie Owen
• Quality Engineering Architect
• Testing Strategist & Evangelist
• Test Manager
• Subject expert on testing for TechTarget’s SearchSoftwareQuality.com
• International and Domestic Conference Presenter
Gerie.owen@gerieowen.com
What You Will Learn
• Why bias is often an outcome of machine learning.
• How bias that reflects organizational values can be a desirable result.
• How to test bias against organizational values.
Agenda
• What is bias in AI?
• How does it happen?
• Is bias ever good?
• Building in bias intentionally
• Bias in data
• Summary
Bug vs. Bias
• A bug is an identifiable and measurable error in process or result
• Usually fixed with a code change
• A bias is a systematic inflection in decisions that produces results inconsistent with reality
• Bias can’t be fixed with a code change
How Does This Happen?
• The problem domain is ambiguous
• There is no single “right” answer
• “Close enough” can usually work
• As long as we can quantify “close enough”
• We don’t know quite why the software responds as it does
• We can’t easily trace code paths
• We choose the data
• The software “learns” from past actions
How Can We Tell If It’s Biased?
• We look very carefully at the training data
• We set strict success criteria based on the system requirements
• We run many tests
• Most change parameters only slightly
• Some use radical inputs
• Compare results to success criteria
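The test loop on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `hypothetical_model`, its two inputs, the 1% perturbation step, and the 0.05 allowed shift are all invented here and would be replaced by the real system and its real success criteria.

```python
import random

def hypothetical_model(income, debt):
    """Stand-in for the system under test (an assumption, not a real model).
    Returns an approval score clamped to [0, 1]."""
    score = 0.5 + 0.000004 * income - 0.00001 * debt
    return max(0.0, min(1.0, score))

def perturbation_failures(model, base, step=0.01, trials=200, max_shift=0.05, seed=0):
    """Nudge every input by up to +/-step (relative) and collect the cases
    where the output moves more than max_shift away from the baseline."""
    rng = random.Random(seed)
    baseline = model(**base)
    failures = []
    for _ in range(trials):
        nudged = {k: v * (1 + rng.uniform(-step, step)) for k, v in base.items()}
        if abs(model(**nudged) - baseline) > max_shift:
            failures.append(nudged)
    return failures

def radical_input_failures(model, cases):
    """Feed extreme inputs and collect any whose output leaves [0, 1]."""
    return [c for c in cases if not 0.0 <= model(**c) <= 1.0]

base = {"income": 60000, "debt": 15000}
small = perturbation_failures(hypothetical_model, base)
radical = radical_input_failures(
    hypothetical_model,
    [{"income": 0, "debt": 10**9}, {"income": 10**9, "debt": 0}],
)
print(len(small), len(radical))
```

Most trials change the inputs only slightly, a few use radical values, and both sets are judged against criteria fixed before the run.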
Amazon Can’t Rid Its AI of Bias
• Amazon created an AI to crawl the web to find job candidates
• Training data was all resumes submitted for the last ten years
• In IT, the overwhelming majority were male
• The AI “learned” that males were superior for IT jobs
• Amazon couldn’t fix that training bias
Many Systems Use Objective Data
• Electric wind sensor
• Determines wind speed and direction
• Based on the cooling of filaments
• Designed a three-layer neural network
• Then used the known data to train it
• Cooling in degrees of all four filaments
• Wind speed, direction
Can This Possibly Be Biased?
• Well, yes
• The training data could have been recorded under a single set of temperature/sunlight/humidity conditions
• Which could affect results under other conditions
• It’s a possible bias that doesn’t hurt anyone
• Or does it?
• Does anyone remember a certain O-ring?
Where Do Biases Come From?
• Data selection
• We choose training data that represents only one segment of the domain
• We limit our training data to certain times or seasons
• We overrepresent one population
• Or
• The problem domain has subtly changed
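One way to catch data selection bias before training is to compare the makeup of the training set with the makeup we expect in the problem domain. The sketch below is a minimal example; the `season` field, the 50/50 expected shares, and the 10-point tolerance are assumptions chosen only for illustration.

```python
from collections import Counter

def representation_gaps(records, field, expected_share, tolerance=0.10):
    """Return groups whose share of the training data differs from the
    expected share by more than the tolerance, as {group: (actual, expected)}."""
    counts = Counter(r[field] for r in records)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no training records supplied")
    gaps = {}
    for group, expected in expected_share.items():
        actual = counts.get(group, 0) / total
        if abs(actual - expected) > tolerance:
            gaps[group] = (round(actual, 3), expected)
    return gaps

# Hypothetical training set recorded mostly in one season.
training = [{"season": "summer"}] * 80 + [{"season": "winter"}] * 20
gaps = representation_gaps(training, "season", {"summer": 0.5, "winter": 0.5})
print(gaps)
```

A non-empty result does not prove the model is biased, but it flags a segment of the domain that the training data overrepresents or underrepresents.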
Where Do Biases Come From?
• Latent bias
• Concepts become incorrectly correlated
• Correlation does not mean causation
• But it is high enough to believe
• We could be promoting stereotypes
• This describes Amazon’s problem
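Latent bias of this kind often shows up as different selection rates for different groups. The following sketch compares those rates on made-up decisions; the group labels and counts are invented, and what ratio counts as acceptable is a policy decision that the slides do not specify.

```python
def selection_rates(decisions):
    """decisions: iterable of (group, selected) pairs -> {group: selection rate}."""
    totals, picked = {}, {}
    for group, selected in decisions:
        totals[group] = totals.get(group, 0) + 1
        picked[group] = picked.get(group, 0) + (1 if selected else 0)
    return {g: picked[g] / totals[g] for g in totals}

def rate_ratio(rates):
    """Lowest selection rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest else 1.0

# Hypothetical screening output: 100 decisions per group.
decisions = (
    [("m", True)] * 60 + [("m", False)] * 40
    + [("f", True)] * 30 + [("f", False)] * 70
)
rates = selection_rates(decisions)
ratio = rate_ratio(rates)
print(rates, ratio)
```

A low ratio is a signal to look for the correlated concept the model has picked up, not proof of its cause.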
Where Do Biases Come From?
• Interaction bias
• We may focus on keywords that users apply incorrectly
• User incorporates slang or unusual words
• “That’s bad, man”
• The story of Microsoft Tay
• It wasn’t bad, it was trained that way
Why Does Bias Matter?
• Wrong answers
• Often with no recourse
• Subtle discrimination (legal or illegal)
• And no one knows it
• Suboptimal results
• We’re not getting it right often enough
It’s Not Just AI
• All software has biases
• It’s written by people
• People make decisions on how to design and implement
• Bias is inevitable
• But can we find it and correct it?
• Do we have to?
Like This One
• A London doctor can’t get into her fitness center locker room
• The fitness center uses a “smart card” to access and record services
• While acknowledging the problem
• The fitness center couldn’t fix it
• But the software development team could
• They had hard-coded “doctor” to be synonymous with “male”
• It was meant as a convenient shortcut
About That Data
• We use data from the problem domain
• What’s that?
• In some cases, scientific measurements are accurate
• But we can choose the wrong measures
• Or not fully represent the problem domain
• But data can also be subjective
• We train with photos of one race over another
• We train with our own values of beauty
Is Bias Always Bad?
• Bias can result in suboptimal answers
• Answers that reflect the bias rather than rational thought
• But is that always a problem?
• It depends on how we measure our answers
• We may not want the most profitable answer
• Instead we want to reflect organizational values
• What are those values?
Examples of Organizational Values
• Committed to goals of equal hiring, pay, and promotion
• Will not deny credit based on location, race, or other irrelevant factors
• Will leave the environment cleaner than we found it
• Net carbon neutral
• No pollutants into atmosphere
• We will delight our customers
Examples of Organizational Values
• These values don’t maximize profit at the expense of everything
• They represent what we might stand for
• They are extremely difficult to train AI for
• Values tend to be nebulous
• Organizations don’t always practice them
• We don’t know how to measure them
• So we don’t know what data to use
• Are we achieving the desired results?
• How can we test this?
How Do We Design Systems With These Goals in Mind?
• We need data
• But we don’t directly measure the goal
• Is there proxy data?
• Training the system
• Data must reflect goals
• That means we must know or suspect the data is measuring the bias we want
Examples of Useful Data
• Customer satisfaction
• Survey data
• Complaints/resolution times
• Maintain a clean environment
• Emissions from operations/employee commute
• Recycling volume
• Equal opportunity
• Salary comparisons, hiring statistics
Sample Scenario
• “We delight our customers”
• AI apps make decisions on customer complaints
• Goal is to satisfy as many as possible
• Make it right if possible
• Train with
• Customer satisfaction survey results
• Objective assessment of customer interaction results
Testing the Bias
• Define hypotheses
• Map vague to operational definitions
• Establish test scenarios
• Specify the exact results expected
• With means and standard deviations
• Test using training data
• Measure the results in terms of definitions
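Specifying expected results "with means and standard deviations" can be made concrete with a simple check. The sketch below assumes an operational definition of "delight" as a mean satisfaction score of 4.0 with a standard deviation of 0.5; those figures, the scores, and the two-standard-error band are illustrative assumptions, not values from the talk.

```python
from math import sqrt
from statistics import mean, stdev

def meets_expectation(observed, expected_mean, expected_sd, k=2.0):
    """Check whether the observed mean lies within k standard errors
    (expected_sd / sqrt(n)) of the expected mean."""
    n = len(observed)
    if n < 2:
        raise ValueError("need at least two observations")
    obs_mean = mean(observed)
    band = k * expected_sd / sqrt(n)
    return {
        "mean": obs_mean,
        "sd": stdev(observed),
        "band": band,
        "passed": abs(obs_mean - expected_mean) <= band,
    }

# Hypothetical satisfaction scores (1-5 scale) for one test scenario.
scores = [4.1, 3.9, 4.4, 4.0, 4.2, 3.8, 4.3, 4.1]
result = meets_expectation(scores, expected_mean=4.0, expected_sd=0.5)
print(result["passed"])
```

Writing the expected mean and spread down before the run keeps the hypothesis from drifting toward whatever the system happens to produce.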
Testing the Bias
• Compare test results to the data
• That data measures your organizational values
• Is there a consistent match?
• A consistent match means that the AI is accurately reflecting organizational values
• Does it meet the goals set forth at the beginning of the project?
• Are ML recommendations reflecting values?
• If not, it’s time to go back to the drawing board
• Better operational definitions
• New data
Finally
• Test using real life data
• Put the application into production
• Confirm results in practice
• At first, side by side with human decision-makers
• Validate the recommendations with people
• Compare recommendations with results
• Yes/no – does the software reflect values
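The side-by-side comparison with human decision-makers can be reduced to an agreement rate plus a list of cases to review. This is a minimal sketch; the decision labels and the 0.9 threshold are assumptions, and the right threshold depends on the organization.

```python
def agreement_report(pairs, threshold=0.9):
    """pairs: list of (model_decision, human_decision).
    Returns the agreement rate, a yes/no verdict, and the disagreeing cases."""
    if not pairs:
        raise ValueError("no decisions to compare")
    disagreements = [(i, m, h) for i, (m, h) in enumerate(pairs) if m != h]
    rate = (len(pairs) - len(disagreements)) / len(pairs)
    return {
        "rate": rate,
        "reflects_values": rate >= threshold,
        "disagreements": disagreements,
    }

# Hypothetical complaint decisions: the model and a human agree on 18 of 20.
pairs = [("refund", "refund")] * 18 + [("deny", "refund")] * 2
report = agreement_report(pairs)
print(report["rate"], report["reflects_values"])
```

The disagreement list matters as much as the rate, because those are the cases where people can judge whether the model or the human better reflected the stated values.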
Back to Bias
• Bias isn’t necessarily bad in ML/AI
• But we need to understand it
• And make sure it reflects our goals
• Testers need to understand organizational values
• And how they represent bias
• And how to incorporate that bias into ML/AI apps
Summary
• Machine learning/AI apps can be designed to reflect organizational values
• That may not result in the best decision from a strict business standpoint
• Know your organizational values
• And be committed to maintaining them
• Test to the data that represents the values
• As well as the written values themselves
• Draw conclusions about the decisions being made
Thank You
• Peter Varhol
peter@petervarhol.com
• Gerie Owen
gerie@gerieowen.com

Advantages of Hiring UIUX Design Service Providers for Your Business
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Testing AI Bias Against Organizational Values

  • 1. Not Fair! Testing AI Bias and Organizational Values Peter Varhol and Gerie Owen
  • 2. About me • International speaker and writer • Graduate degrees in Math, CS, Psychology • Technology communicator • AWS certified • Former university professor, tech journalist • Cat owner and distance runner • peter@petervarhol.com
  • 3. Gerie Owen • Quality Engineering Architect • Testing Strategist & Evangelist • Test Manager • Subject expert on testing for TechTarget’s SearchSoftwareQuality.com • International and Domestic Conference Presenter Gerie.owen@gerieowen.com
  • 4. What You Will Learn • Why bias is often an outcome of machine learning results. • How bias that reflects organizational values can be a desirable result. • How to test bias against organizational values.
  • 5. Agenda • What is bias in AI? • How does it happen? • Is bias ever good? • Building in bias intentionally • Bias in data • Summary
  • 6. Bug vs. Bias • A bug is an identifiable and measurable error in process or result • Usually fixed with a code change • A bias is a systematic inflection in decisions that produces results inconsistent with reality • Bias can’t be fixed with a code change
  • 7. How Does This Happen? • The problem domain is ambiguous • There is no single “right” answer • “Close enough” can usually work • As long as we can quantify “close enough” • We don’t know quite why the software responds as it does • We can’t easily trace code paths • We choose the data • The software “learns” from past actions
  • 8. How Can We Tell If It’s Biased? • We look very carefully at the training data • We set strict success criteria based on the system requirements • We run many tests • Most change parameters only slightly • Some use radical inputs • Compare results to success criteria
  • 9. Amazon Can’t Rid Its AI of Bias • Amazon created an AI to crawl the web to find job candidates • Training data was all resumes submitted for the last ten years • In IT, the overwhelming majority were male • The AI “learned” that males were superior for IT jobs • Amazon couldn’t fix that training bias
  • 10. Many Systems Use Objective Data • Electric wind sensor • Determines wind speed and direction • Based on the cooling of filaments • Designed a three-layer neural network • Then used the known data to train it • Cooling in degrees of all four filaments • Wind speed, direction
  • 11. Can This Possibly Be Biased? • Well, yes • The training data could have been recorded under a single set of temperature/sunlight/humidity conditions • Which could affect results under other conditions • It’s a possible bias that doesn’t hurt anyone • Or does it? • Does anyone remember a certain O-ring?
  • 12. Where Do Biases Come From? • Data selection • We choose training data that represents only one segment of the domain • We limit our training data to certain times or seasons • We overrepresent one population • Or • The problem domain has subtly changed
  • 13. Where Do Biases Come From? • Latent bias • Concepts become incorrectly correlated • Correlation does not mean causation • But it is high enough to believe • We could be promoting stereotypes • This describes Amazon’s problem
  • 14. Where Do Biases Come From? • Interaction bias • We may focus on keywords that users apply incorrectly • User incorporates slang or unusual words • “That’s bad, man” • The story of Microsoft Tay • It wasn’t bad, it was trained that way
  • 15. Why Does Bias Matter? • Wrong answers • Often with no recourse • Subtle discrimination (legal or illegal) • And no one knows it • Suboptimal results • We’re not getting it right often enough
  • 16. It’s Not Just AI • All software has biases • It’s written by people • People make decisions on how to design and implement • Bias is inevitable • But can we find it and correct it? • Do we have to?
  • 17. Like This One • A London doctor can’t get into her fitness center locker room • The fitness center uses a “smart card” to access and record services • While acknowledging the problem • The fitness center couldn’t fix it • But the software development team could • They had hard-coded “doctor” to be synonymous with “male” • It was meant as a convenient shortcut
  • 18. About That Data • We use data from the problem domain • What’s that? • In some cases, scientific measurements are accurate • But we can choose the wrong measures • Or not fully represent the problem domain • But data can also be subjective • We train with photos of one race over another • We train with our own values of beauty
  • 19. Is Bias Always Bad? • Bias can result in suboptimal answers • Answers that reflect the bias rather than rational thought • But is that always a problem? • It depends on how we measure our answers • We may not want the most profitable answer • Instead we want to reflect organizational values • What are those values?
  • 20. Examples of Organizational Values • Committed to goals for equal hiring, pay, and promotion • Will not deny credit based on location, race, or other irrelevant factors • Will leave the environment cleaner than we found it • Net carbon neutral • No pollutants into atmosphere • We will delight our customers
  • 21. Examples of Organizational Values • These values don’t maximize profit at the expense of everything • They represent what we might stand for • They are extremely difficult to train AI for • Values tend to be nebulous • Organizations don’t always practice them • We don’t know how to measure them • So we don’t know what data to use • Are we achieving the desired results? • How can we test this?
  • 22. How Do We Design Systems With These Goals in Mind? • We need data • But we don’t directly measure the goal • Is there proxy data? • Training the system • Data must reflect goals • That means we must know or suspect the data is measuring the bias we want
  • 23. Examples of Useful Data • Customer satisfaction • Survey data • Complaints/resolution times • Maintain a clean environment • Emissions from operations/employee commute • Recycling volume • Equal opportunity • Salary comparisons, hiring statistics
  • 24. Sample Scenario • “We delight our customers” • AI apps make decisions on customer complaints • Goal is to satisfy as many as possible • Make it right if possible • Train with • Customer satisfaction survey results • Objective assessment of customer interaction results
  • 25. Testing the Bias • Define hypotheses • Map vague to operational definitions • Establish test scenarios • Specify the exact results expected • With means and standard deviations • Test using training data • Measure the results in terms of definitions
  • 26. Testing the Bias • Compare test results to the data • That data measures your organizational values • Is there a consistent match? • A consistent match means that the AI is accurately reflecting organizational values • Does it meet the goals set forth at the beginning of the project? • Are ML recommendations reflecting values? • If not, it’s time to go back to the drawing board • Better operational definitions • New data
  • 27. Finally • Test using real-life data • Put the application into production • Confirm results in practice • At first, side by side with human decision-makers • Validate the recommendations with people • Compare recommendations with results • Yes/no: does the software reflect values?
  • 28. Back to Bias • Bias isn’t necessarily bad in ML/AI • But we need to understand it • And make sure it reflects our goals • Testers need to understand organizational values • And how they represent bias • And how to incorporate that bias into ML/AI apps
  • 29. Summary • Machine learning/AI apps can be designed to reflect organizational values • That may not result in the best decision from a strict business standpoint • Know your organizational values • And be committed to maintaining them • Test to the data that represents the values • As well as the written values themselves • Draw conclusions about the decisions being made
  • 30. Thank You • Peter Varhol peter@petervarhol.com • Gerie Owen gerie@gerieowen.com
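
The testing steps on slides 25 and 26 (map a vague value to an operational definition, specify expected results with means and standard deviations, then compare measured results to them) could be sketched roughly as follows. This is a minimal illustration, not the presenters' method: the class `OperationalDefinition`, the function `check_against_definition`, the satisfaction metric, and every number are assumptions made up for the example.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass(frozen=True)
class OperationalDefinition:
    """A vague value mapped to a measurable expectation (all numbers hypothetical)."""
    value: str             # the written organizational value
    metric: str            # what is measured for each AI decision
    expected_mean: float   # target mean of the metric
    mean_tolerance: float  # how far the observed mean may fall from the target
    max_stdev: float       # largest acceptable spread of the metric


def check_against_definition(definition, observations):
    """Compare measured outcomes of AI decisions to the expected results."""
    observed_mean = mean(observations)
    observed_stdev = stdev(observations)  # sample standard deviation
    mean_ok = abs(observed_mean - definition.expected_mean) <= definition.mean_tolerance
    spread_ok = observed_stdev <= definition.max_stdev
    return {
        "mean": observed_mean,
        "stdev": observed_stdev,
        "reflects_value": mean_ok and spread_ok,
    }


# Hypothetical operational definition for "We delight our customers".
delight = OperationalDefinition(
    value="We delight our customers",
    metric="post-resolution satisfaction score (1-5)",
    expected_mean=4.5,
    mean_tolerance=0.3,
    max_stdev=0.8,
)

# Made-up survey scores for complaints the AI app decided on.
scores = [5, 4, 5, 4, 5, 4, 5, 4]
result = check_against_definition(delight, scores)
print(result["reflects_value"])
```

A result that fails the check would, in the terms of slide 26, point back to the drawing board: a better operational definition or new data, not a code fix.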