An Analysis of Machine- and Human-Analytics in Classification
Authors:
1. Gary K. L. Tam (Swansea University)
2. Vivek Kothari (University of Oxford)
3. Min Chen (University of Oxford)
Presented by:
Subhashis Hazarika
(Ohio State University)
Major Contribution
• An information-theoretic model that explains why a human-driven visual
analytics model of classification performs better than a purely
machine-learning model.
Overview
• Consider two classification case studies.
• Create a decision tree classifier applying standard ML algorithms.
• Create a decision tree classifier using visual analytics guided by “soft
knowledge” of a human model-developer.[1]
• Using information theory, explain why the human-centric approach
performs better than the ML approach.
• Quantify the “soft knowledge” that a human-centric approach takes
advantage of.
[1]: “Visualization of Time-Series Data in Parameter Space for Understanding Facial Dynamics”, G. K. L. Tam, H. Fang,
A. J. Aubrey, P. W. Grant, D. Marshall, M. Chen. EuroVis 2011.
Case Study A (Facial Dynamics Data)
• Input and feature extraction:
– 68 raw facial videos, each classified as one of four expressions (smile, sadness, surprise, anger).
– For each video, 14 time series representing different temporal facial features were extracted.
– For each time series, 23 quantitative measures were obtained.
– This results in 14 × 23 = 322 attributes/features per video.
• Create a decision tree using a parallel-coordinates-based visual analytics
system.
• Create a Decision Tree with standard ML algorithms (C4.5 or CART).
A: Visual Analytics Approach
A: Interactive Visualization
A: Outliers and anomalies
A: Building the D-tree
A: VA vs. ML
Case Study B (Visualization Image Classification)
• Input and feature extraction:
– 4 × 49 JPEG images classified as bubble chart, treemap, parallel coordinates, or bar graph.
– For each image, 222 features were extracted via different image classification and clustering techniques.
• Create a decision tree using a parallel-coordinates-based visual analytics
system.
• Create a Decision Tree with standard ML algorithms (C4.5 or CART).
B: Building the D-Tree
B: Comparative Evaluation
The Team
• Case Study A:
– Conducted by 7 researchers with expertise in vision, visual analytics, computer graphics
and machine learning.
– The human-centric D-Tree was constructed by a researcher who specialized in graphics
and acquired knowledge of computer vision and visual analytics during the project.
• Case Study B:
– Conducted by 2 researchers with expertise in image processing and visual analytics.
– The human-centric D-Tree was constructed by a researcher with 8 months of experience in
visual analytics.
But Why? Some Empirical Observations
• O1: Overview and Axis Distribution.
– A machine-centric approach examines many cut positions on all the axes and greedily
picks the cut with the highest quality measure.
– A human model developer, by contrast, usually first obtains a general overview of the
data and identifies important axes with promising patterns before paying detailed
attention to these axes.
• O2: General Agreement amongst Statistics.
– ML algorithms use only one metric to determine the cut.
– The human-centric (HC) approach can evaluate more than one statistic to decide the cut.
• O3: Look-ahead.
– Humans’ insight into the consequences often influences the current decision.
– Humans’ look-ahead ability enables multi-step judgement, while the ML algorithms
focus only on the current decision.
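The greedy, single-metric behaviour described in O1 to O3 can be sketched as follows. This is an illustrative sketch only, not the algorithm used in the case studies: it assumes information gain as the single quality measure, places each candidate cut at a training value (one common convention), and uses hypothetical toy data and function names.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def best_cut(values, labels):
    """Return (gain, threshold) of the best binary cut on one axis.

    Samples with value <= threshold go left, the rest go right.
    """
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best = (0.0, None)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut is possible between identical values
        threshold = pairs[i - 1][0]  # cut sits at the edge of the left group
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - remainder
        if gain > best[0]:
            best = (gain, threshold)
    return best


def greedy_split(axes, labels):
    """Examine every axis and keep the single cut with the highest gain."""
    results = {name: best_cut(vals, labels) for name, vals in axes.items()}
    name = max(results, key=lambda n: results[n][0])
    gain, threshold = results[name]
    return name, threshold, gain


# Hypothetical toy data: axis "a" separates the classes, axis "b" does not.
axes = {
    "a": [1.0, 2.0, 3.0, 7.0, 8.0, 9.0],
    "b": [5.0, 1.0, 9.0, 4.0, 2.0, 8.0],
}
labels = ["smile", "smile", "smile", "anger", "anger", "anger"]

axis, threshold, gain = greedy_split(axes, labels)
print(axis, threshold, round(gain, 3))
```

The sketch looks only one step ahead and consults only one statistic, which is exactly the limitation that O2 and O3 attribute to the ML approach.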
But Why? Some Empirical Observations
• O4: Outliers.
– Where possible, model developers avoid axes with outliers, as they may be unreliable.
– Such reasoning is not available in the ML algorithms.
• O5: Cut Positions on an Axis.
– Humans look for a cut or cuts that would allow each class to expand beyond the current
instances in the training set.
– ML algorithms decide the cuts at the very edges of a particular class.
• O6: Human (Domain) Knowledge.
– Humans incorporate their domain knowledge into their model construction process.
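The difference in cut placement described in O5 can be illustrated with a small sketch. It assumes two classes that are cleanly separated on one axis; the function names, class values and the unseen sample are hypothetical, and the midpoint rule is only one simple way of leaving each class room to expand.

```python
def edge_cut(left_class_values, right_class_values):
    """Cut at the very edge of the left class: its largest training value."""
    return max(left_class_values)


def margin_cut(left_class_values, right_class_values):
    """Cut midway through the empty gap between the two classes."""
    return (max(left_class_values) + min(right_class_values)) / 2


def classify(value, cut):
    """Assign a value to the left or right side of a cut."""
    return "left" if value <= cut else "right"


# Hypothetical training values on one axis.
sadness = [1.0, 2.0, 3.0]    # left class
surprise = [7.0, 8.0, 9.0]   # right class

# Hypothetical unseen sample, slightly beyond the training range of the left class.
unseen = 3.5

print(classify(unseen, edge_cut(sadness, surprise)))    # right
print(classify(unseen, margin_cut(sadness, surprise)))  # left
```

With the edge cut, a sample only slightly outside the training range of the left class is assigned to the other class, while the cut placed in the gap still assigns it to the nearer class.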
Information Flow
Information Theoretic Analysis
• Estimated world population: 7.4 billion.
• Assume each person has 5 variations of each of the 4 expressions.
• The number of possible scenarios to capture: 7.4 billion × 5 × 4 = 148 billion.
• The maximal entropy is log2(148 billion) ≈ 37.1 bits.
• We only know 68 cases (the raw training videos).
• That is 1.7 × 10^-8 bits (a drop in the ocean).
• [ML] Optimistically, assume the categorization retains 50% of the mutual
information. That leaves us with 8.5 × 10^-9 bits of information.
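The figures on this slide can be reproduced with a few lines of arithmetic. The slide does not state the formula for the known information; the sketch below assumes it is the fraction of scenarios covered by the training set multiplied by the maximal entropy, which matches the quoted numbers. Variable names are illustrative.

```python
import math

WORLD_POPULATION = 7.4e9   # estimated world population (from the slide)
VARIATIONS = 5             # variations per expression per person
EXPRESSIONS = 4            # smile, sadness, surprise, anger
TRAINING_VIDEOS = 68       # raw training videos
RETAINED_FRACTION = 0.5    # optimistic share of mutual information kept by the ML model

# Total number of scenarios and the entropy of a uniform choice among them.
scenarios = WORLD_POPULATION * VARIATIONS * EXPRESSIONS
max_entropy = math.log2(scenarios)

# Assumption: known information = covered fraction of scenarios * maximal entropy.
known_bits = TRAINING_VIDEOS / scenarios * max_entropy
ml_bits = RETAINED_FRACTION * known_bits

print(f"scenarios:       {scenarios:.3g}")
print(f"maximal entropy: {max_entropy:.1f} bits")
print(f"known (68 videos): {known_bits:.2g} bits")
print(f"ML estimate:       {ml_bits:.2g} bits")
```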
Information Theoretic Analysis
• [VA] The model developer may know some 200 people reasonably well and can recall
their 5 variations of 4 expressions with ease. Conservatively, that is equivalent to
4068 videos (200 × 5 × 4 + 68) instead of 68, representing 1.0 × 10^-6 bits of
known information.
• [VA] When given an arbitrary facial image, the developer can also reconstruct an
expression using imagination, e.g. at least 1 variation per expression. This ability
accounts for 29.6 billion videos (7.4 billion × 4), representing 7.4 bits of known
information. (This ability shows up in determining outliers.)
• 7.4 bits vs. 8.5 × 10^-9 bits: roughly 871 million times more information content.
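The visual-analytics estimates and the final ratio can be checked the same way. As before, this sketch assumes that known information is the covered fraction of the 148 billion scenarios multiplied by the maximal entropy; the slides give only the resulting numbers, and the variable names are illustrative.

```python
import math

WORLD_POPULATION = 7.4e9
VARIATIONS = 5
EXPRESSIONS = 4
TRAINING_VIDEOS = 68
KNOWN_PEOPLE = 200          # people the developer knows reasonably well
IMAGINED_VARIATIONS = 1     # at least one imagined variation per expression

scenarios = WORLD_POPULATION * VARIATIONS * EXPRESSIONS
max_entropy = math.log2(scenarios)


def known_bits(videos):
    """Assumed measure: covered fraction of scenarios times maximal entropy."""
    return videos / scenarios * max_entropy


# ML baseline: 68 training videos, 50% of the information retained.
ml_bits = 0.5 * known_bits(TRAINING_VIDEOS)

# Recall: 200 people x 5 variations x 4 expressions, plus the 68 training videos.
recalled_videos = KNOWN_PEOPLE * VARIATIONS * EXPRESSIONS + TRAINING_VIDEOS
recall_bits = known_bits(recalled_videos)

# Imagination: one variation per expression for every person.
imagined_videos = WORLD_POPULATION * EXPRESSIONS * IMAGINED_VARIATIONS
imagined_bits = known_bits(imagined_videos)

ratio = imagined_bits / ml_bits

print(f"recalled videos: {recalled_videos}")
print(f"recall:      {recall_bits:.2g} bits")
print(f"imagination: {imagined_bits:.1f} bits")
print(f"ratio:       {ratio / 1e6:.0f} million")
```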
Soft Knowledge and Soft Models
• Soft Knowledge: The uncaptured information not available to the
machine-centric approach.
• Soft Model: The models which make decisions based on soft knowledge.
• Examples:
1. Given a facial photo (input), imagine how the person would smile (output).
2. Given a video (input), determine if it is an outlier (output).
3. Given a set of points on an axis (input), decide how many cuts and where they are
(output).
Soft Models
Conclusion
• There is an overwhelming amount of information available to the human-centric
approach in the form of soft knowledge that cannot be utilized by a
machine-centric approach.
• It is necessary to understand and quantify the information flow in both
machine- and human-centric approaches to help design a mixed model
that performs a much better job.
• The human model developer can never be cast aside.
