Visualization for Transparent Decisions - Big Data Expo 2019


Complex models are increasingly used to support and make decisions, such as which treatment is most effective for a patient, who is invited for a job interview, or where extra checks for fraud are needed. Such models are often based on machine learning: models are generated automatically by analyzing the data of many cases. These models are then used to give advice on new cases or even to make decisions automatically. This can be surprisingly effective, but it is not without risk. Because historical data is used, for example, there is a substantial risk that the model reproduces all the biases of the past. This development therefore raises more and more questions: are the outcomes correct, what reasoning was followed, and how can we verify that? Transparency is the goal, and the GDPR now also enforces that a human is always involved in far-reaching decisions. Providing insight into complex models is a major challenge and receives a lot of attention. The presentation gives an overview of the problem and shows examples of how visualization can help in understanding models. These examples result from research at TU/e and cover, among other things, assessing risky vessel behavior, analyzing sleep disorders, and estimating fraud.

Jack van Wijk
BIG DATA EXPO
Utrecht, 18 & 19 September 2019
Visualization for
Transparent Decisions

More decisions…
• You qualify for our special offer
• You are not admitted to our education program
• Your job application is put aside
• Your mortgage request cannot be honored
• Your research proposal is rejected
• You should get a vitrectomy
• Your probation request is declined
• You are fired
• You are arrested
Should we let the computer decide?

The challenge
• How to obtain transparency in
predictive analytics?
• How to present the evidence and
reasoning used, such that humans can
understand, validate, and judge the
results?

http://www.responsibledatascience.org/

Complex models
Increasing complexity:
• rules
• logistic regression
• decision trees
• support vector machines
• random forests
• neural networks
• deep learning networks
Size matters:
• 1000 rules?
• 100 variables?
• 50 layers?
• 10 dimensions?
• 100 trees?
• 1000’s of nodes?
• millions of nodes?

Approaches to explanation
• Model:
– White box: show how the model works
– Black box: use simplified model
• Scope:
– Global: explain for all possible cases
– Local: explain for selected cases
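
A minimal sketch of what a local, black-box explanation can look like: the model is only queried, never inspected, and each feature of one selected case is nudged to see how the output responds. The `black_box` scoring function, its three features, and the step size are hypothetical and chosen for illustration; they are not taken from the talk.

```python
from typing import Callable, List, Sequence


def black_box(x: Sequence[float]) -> float:
    """Hypothetical opaque model returning a risk score (illustration only)."""
    age, amount, claims = x
    return 0.02 * claims * claims + 0.001 * amount - 0.005 * age


def local_sensitivity(
    model: Callable[[Sequence[float]], float],
    case: Sequence[float],
    step: float = 1e-3,
) -> List[float]:
    """Estimate, for one case, how the score changes per unit change of each feature."""
    base = model(case)
    effects = []
    for i in range(len(case)):
        bumped = list(case)
        bumped[i] += step  # perturb a single feature, keep the others fixed
        effects.append((model(bumped) - base) / step)
    return effects


case = [40.0, 1000.0, 3.0]  # one selected case: age, amount, number of claims
effects = local_sensitivity(black_box, case)
for name, effect in zip(["age", "amount", "claims"], effects):
    print(f"{name}: {effect:+.4f}")
```

The resulting numbers hold only near the selected case, which is what distinguishes a local explanation from a global one.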

Case 1: Decision tree visualization
Problem:
• Support construction of decision trees
• Enable domain expert to bring in domain
knowledge
White box approach:
• Model explicitly shown
• Global
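
To illustrate the white-box, global idea in text form, the sketch below turns a small decision tree into one readable rule per leaf, so the complete model is visible at once. The tree, its feature names, and its thresholds are made up for this example; BaobabView itself is an interactive visual tool and is not reproduced here.

```python
from typing import Dict, List, Tuple

# Hypothetical tree: internal nodes test "feature <= threshold", leaves carry a label.
tree: Dict = {
    "feature": "age", "threshold": 50,
    "left": {"label": "low risk"},
    "right": {
        "feature": "score", "threshold": 0.5,
        "left": {"label": "low risk"},
        "right": {"label": "high risk"},
    },
}


def tree_to_rules(node: Dict, conditions: Tuple[str, ...] = ()) -> List[str]:
    """Walk the tree depth-first and emit one rule per leaf."""
    if "label" in node:
        condition = " and ".join(conditions) or "always"
        return [f"if {condition} then {node['label']}"]
    feature, threshold = node["feature"], node["threshold"]
    left = tree_to_rules(node["left"], conditions + (f"{feature} <= {threshold}",))
    right = tree_to_rules(node["right"], conditions + (f"{feature} > {threshold}",))
    return left + right


rules = tree_to_rules(tree)
for rule in rules:
    print(rule)
```

This works for a handful of leaves; with thousands of nodes a flat rule list stops being readable, which is where a dedicated visualization becomes useful.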

BaobabView
Stef van den Elzen, IEEE VAST 2011

Decision tree for
tumor location
head & neck
prostate
pancreas
stomach
lung
ovary
BaobabView
Stef van den Elzen, IEEE VAST 2011

Case 2: Polysomnography
• Measure brain signals during sleep
• Classify 30s intervals according to five stages
Humberto Garcia Caballero et al., EuroVis 2019
Classifying one night of sleep
takes an expert one hour
Classification with deep
learning: accuracy ≈ 85%
How to improve?
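
A rough sense of scale for these numbers, as a back-of-the-envelope sketch. The 30-second interval and the roughly 85% accuracy come from the slide; the 8-hour recording length is an assumption made here for illustration.

```python
# From the slide: 30 s intervals, roughly 85% accuracy for the deep learning classifier.
epoch_seconds = 30
accuracy = 0.85

# Assumption (not from the talk): one recording covers 8 hours of sleep.
recording_hours = 8

epochs = recording_hours * 3600 // epoch_seconds
expected_errors = round(epochs * (1 - accuracy))

print(f"{epochs} intervals per night, about {expected_errors} expected to be misclassified")
```

Under that assumption, an expert reviewing the automatic result would still face on the order of a hundred or more intervals to correct per night.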

Classification of sleep stages
Humberto Garcia Caballero et al., EuroVis 2019

Case 3: Rationale Visualization for Safety and Security
Approach:
• show strongly simplified model
• for one case
Roeland Scheepens, Steffen Michels et al., EuroVis 2015

Context
AIS-data,
radar data,
web data,
reports… on
vessels
Probabilistic first order
logic inference engine
Coast guard
Roeland Scheepens, Steffen Michels et al., EuroVis 2015

But why!?
Problem

Aha!
Roeland Scheepens, Steffen Michels et al., EuroVis 2015
Problem

Roeland Scheepens, Steffen Michels et al., EuroVis 2015
Example

Case 4: Insurance Fraud detection
MSc project Dennis Collaris
Support fraud detection team in
prioritization of cases
Approach:
• show strongly simplified model
• for one case

Starting point
Data set:
– 38,138 insurance policies
– 49 attributes per policy
– 129 confirmed fraud cases
Model:
Bagging ensemble of
– 100 Random Forest models, each with
– 500 CART decision trees
Dennis Collaris, 2018
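
The figures on this slide imply two things worth spelling out: the size of the ensemble and how rare fraud is in the data. The sketch below only does that arithmetic; all inputs are the numbers from the slide, and the "always predict no fraud" baseline is added here for comparison.

```python
# Numbers taken from the slide.
policies = 38_138
confirmed_fraud = 129
forests = 100
trees_per_forest = 500

total_trees = forests * trees_per_forest
fraud_rate = confirmed_fraud / policies

# Comparison added here: a trivial model that never flags fraud.
baseline_accuracy = 1 - fraud_rate

print(f"total decision trees: {total_trees}")
print(f"fraud rate: {fraud_rate:.3%}")
print(f"accuracy of always predicting 'no fraud': {baseline_accuracy:.2%}")
```

With 50,000 trees the model cannot be read directly, and with about one fraud case per 300 policies plain accuracy says little, which motivates explaining individual cases instead.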

[Figure: violin plots of feature importance. Dennis Collaris, 2018]

[Figure: dependence plots of features. Dennis Collaris, 2018]

[Figure: derivation and visualization of local rules. Dennis Collaris, 2018]
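
One common way to obtain a dependence curve for a black-box model is partial dependence: fix one feature at a grid value for every row, average the predictions, and repeat along the grid. The sketch below shows that technique on a toy model; whether the plots in the talk were computed this way is not stated, and the `model` function and the data rows are hypothetical.

```python
from typing import Callable, List, Sequence


def partial_dependence(
    model: Callable[[Sequence[float]], float],
    rows: List[List[float]],
    feature: int,
    grid: Sequence[float],
) -> List[float]:
    """Average model output over all rows with one feature forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature] = value  # overwrite only the feature of interest
            total += model(modified)
        curve.append(total / len(rows))
    return curve


def model(x: Sequence[float]) -> float:
    """Hypothetical black box: linear in feature 0, quadratic in feature 1."""
    return 0.5 * x[0] + x[1] * x[1]


rows = [[1.0, 0.0], [2.0, 1.0], [3.0, 2.0]]
curve = partial_dependence(model, rows, feature=0, grid=[0.0, 2.0, 4.0])
print(curve)
```

Plotting `curve` against the grid gives the dependence plot for that feature. Because the curve is an average over all rows, it can hide interactions that show up only for individual cases.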

Observations Achmea case
• Deriving explanations is hard work
• Different techniques yield different explanations
• But, domain experts did not seem to care???
Dennis Collaris, 2018

Finally
• Explaining algorithms / data science / AI
• Transparency crucial
• Many challenges ahead
• Stop for red lights!
