Recognizing Human-Object Interactions in Still Images by Modeling the Mutual Context of Objects and Human Poses
Presented by Arwa Chittalwala, Irfan Shaikh, Heena Patel
Human-Object Interaction
• Robots interact with objects
• Automatic sports commentary: “Kobe is dunking the ball.”
• Medical care
Human-Object Interaction
Image examples: Playing saxophone, Playing bassoon, Playing saxophone.
(Previous talk: Grouplet) Grouplet is a generic feature for structured objects, or interactions of groups of objects.
Holistic image based classification (Caltech101):
  Berg & Malik, 2005: 48%
  Grauman & Darrell, 2005: 59%
  Gehler & Nowozin, 2009: 77%
  OURS: 62%
Vs.
Detailed understanding and reasoning:
  HOI activity: Tennis Forehand
Human-Object Interaction
Holistic image based classification
Detailed understanding and reasoning
• Human pose estimation (Torso, Head)
• Object detection (Tennis racket)
HOI activity: Tennis Forehand
Outline
• Background and Intuition
• Mutual Context of Object and Human Pose
  ➢ Model Representation
  ➢ Model Learning
  ➢ Model Inference
• Experiments
• Conclusion
Human pose estimation & Object detection
Human pose estimation is challenging:
• Difficult part appearance
• Self-occlusion
• Image region looks like a body part
Prior work on human pose estimation:
• Felzenszwalb & Huttenlocher, 2005
• Ren et al, 2005
• Ramanan, 2006
• Ferrari et al, 2008
• Yang & Mori, 2008
• Andriluka et al, 2009
• Eichner & Ferrari, 2009
Pose estimation is facilitated given the object is detected.
Object detection is challenging:
• Small, low-resolution, partially occluded
• Image region similar to detection target
Prior work on object detection:
• Viola & Jones, 2001
• Lampert et al, 2008
• Divvala et al, 2009
• Vedaldi et al, 2009
Object detection is facilitated given the pose is estimated.
Mutual Context
Context in Computer Vision
Previous work – Use context cues to facilitate object detection:
• Hoiem et al, 2006
• Rabinovich et al, 2007
• Oliva & Torralba, 2007
• Heitz & Koller, 2008
• Desai et al, 2009
• Divvala et al, 2009
• Murphy et al, 2003
• Shotton et al, 2006
• Harzallah et al, 2009
• Li, Socher & Fei-Fei, 2009
• Marszalek et al, 2009
• Bao & Savarese, 2010
Helpful, but only moderately: methods with context outperform those without context by ~3-4%.
Also cited: Viola & Jones, 2001; Lampert et al, 2008.
Our approach – Two challenging tasks serve as mutual context of each other.
[Figure: example results with mutual context and without context]
Mutual Context Model Representation
A: Activity (Croquet shot, Volleyball smash, Tennis forehand)
O: Object (Croquet mallet, Volleyball, Tennis racket)
H: Human pose. Intra-class variations:
• More than one H for each A;
• Unobserved during training.
P: Body parts P1, P2, …, PN. lP: location; θP: orientation; sP: scale.
f: Image evidence fO, f1, f2, …, fN. Shape context. [Belongie et al, 2002]
[Figure: graphical model with nodes A, H, O, P1 … PN and image evidence fO, f1 … fN]
Mutual Context Model Representation
Markov Random Field: $\Psi = \sum_{e \in E} w_e \psi_e$, where $\psi_e$ is the clique potential and $w_e$ is the clique weight.
• $\psi_e(A, O)$, $\psi_e(A, H)$, $\psi_e(O, H)$: Frequency of co-occurrence between A, O, and H.
• $\psi_e(O, P_n)$, $\psi_e(H, P_n)$, $\psi_e(P_m, P_n)$: Spatial relationship among object and body parts, for example
  $\psi_e(O, P_n) = \mathrm{bin}(l_O - l_{P_n}) \cdot \mathrm{bin}(\theta_O - \theta_{P_n}) \cdot \mathcal{N}(s_O / s_{P_n})$ (location, orientation, size).
• Learn structural connectivity among the body parts and the object (obtained by structure learning).
• $\psi_e(O, f_O)$ and $\psi_e(P_n, f_{P_n})$: Discriminative part detection scores. Shape context + AdaBoost [Andriluka et al, 2009] [Belongie et al, 2002] [Viola & Jones, 2001]
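As a rough illustration of how the Markov Random Field score is assembled, the sketch below evaluates the weighted sum of clique potentials for a toy configuration. Only the weighted-sum form and the location / orientation / size factorization come from the slides. The bin widths, lookup tables, Gaussian parameters, edge names, and the helper names `bin_index`, `spatial_potential`, and `total_score` are hypothetical choices made for the example.

```python
import math


def bin_index(value, width):
    """Quantize a scalar difference into an integer bin (assumed binning scheme)."""
    return math.floor(value / width)


def spatial_potential(obj, part, loc_table, ori_table, mu=1.0, sigma=0.5):
    """Toy psi(O, P_n): location term * orientation term * size term.

    loc_table / ori_table map bin indices to values; unseen bins get a small floor.
    The size term is an unnormalized Gaussian on the scale ratio (an assumption).
    """
    dx = bin_index(obj["x"] - part["x"], width=20.0)
    dy = bin_index(obj["y"] - part["y"], width=20.0)
    dtheta = bin_index(obj["theta"] - part["theta"], width=45.0)
    loc = loc_table.get((dx, dy), 1e-3)
    ori = ori_table.get(dtheta, 1e-3)
    ratio = obj["s"] / part["s"]
    size = math.exp(-0.5 * ((ratio - mu) / sigma) ** 2)
    return loc * ori * size


def total_score(potentials, weights):
    """Psi = sum over edges e of w_e * psi_e."""
    return sum(weights[e] * potentials[e] for e in potentials)


# Hypothetical object (racket) and body part (lower arm) configuration.
racket = {"x": 100.0, "y": 50.0, "theta": 90.0, "s": 1.0}
lower_arm = {"x": 70.0, "y": 45.0, "theta": 40.0, "s": 1.0}

psi_op = spatial_potential(
    racket, lower_arm, loc_table={(1, 0): 0.4}, ori_table={1: 0.5}
)
potentials = {"A-O": 0.6, "A-H": 0.3, "O-P1": psi_op}
weights = {"A-O": 1.0, "A-H": 2.0, "O-P1": 0.5}
score = total_score(potentials, weights)
```

In the full model the set of edges is itself learned, so the dictionary keys here stand in for whichever connections structure learning keeps.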
Model Learning
Input: training images, e.g. cricket shot, cricket bowling.
Goals and approach for the model $\Psi = \sum_{e \in E} w_e \psi_e$:
• Hidden human poses (hidden variables). Example shown: croquet shot.
• Structural connectivity (structure learning): hill-climbing on
  $\max_{E} \left[ \sum_{e \in E} w_e \psi_e - \frac{|E|^2}{2\sigma^2} \right]$
  where the first term is the joint density of the model and the second is a Gaussian prior on the number of edges.
• Potential parameters (parameter estimation):
  maximum likelihood for $\psi_e(A, O)$, $\psi_e(A, H)$, $\psi_e(O, H)$, $\psi_e(O, P_n)$, $\psi_e(H, P_n)$, $\psi_e(P_m, P_n)$;
  standard AdaBoost for $\psi_e(O, f_O)$, $\psi_e(P_n, f_{P_n})$.
• Potential weights (parameter estimation): max-margin learning
  $\min_{\mathbf{w}, \xi} \; \frac{1}{2} \sum_r \|\mathbf{w}_r\|_2^2 + \beta \sum_i \xi_i$
  s.t. $\forall i, r$ where $y(r) \neq y(c_i)$: $\mathbf{w}_{c_i} \cdot \mathbf{x}_i - \mathbf{w}_r \cdot \mathbf{x}_i \geq 1 - \xi_i$; $\forall i$: $\xi_i \geq 0$.
  Notations
  • xi: Potential values of the i-th image.
  • wr: Potential weights of the r-th pose.
  • y(r): Activity of the r-th pose.
  • ξi: A slack variable for the i-th image.
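To make the max-margin step more concrete, here is a minimal sketch that evaluates the objective for fixed weights, using the smallest slack that satisfies the margin constraints. It is not the authors' solver and does not optimize anything. The trade-off constant `beta`, the reading of `c[i]` as the pose index assigned to image i, and all numbers below are assumptions made for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def max_margin_objective(W, X, c, y, beta=1.0):
    """Evaluate 0.5 * sum_r ||w_r||^2 + beta * sum_i xi_i.

    W: list of weight vectors, one per pose r.
    X: list of potential-value vectors, one per image i.
    c: c[i] is the pose index assigned to image i (assumed meaning of c_i).
    y: y[r] is the activity label of pose r.
    Each slack xi_i is the smallest non-negative value satisfying
    w_{c_i}.x_i - w_r.x_i >= 1 - xi_i for every pose r of a different activity.
    """
    regularizer = 0.5 * sum(dot(w, w) for w in W)
    slacks = []
    for x_i, c_i in zip(X, c):
        own_score = dot(W[c_i], x_i)
        xi = 0.0
        for r, w_r in enumerate(W):
            if y[r] != y[c_i]:
                xi = max(xi, 1.0 - (own_score - dot(w_r, x_i)))
        slacks.append(xi)
    return regularizer + beta * sum(slacks), slacks


# Hypothetical setup: poses 0 and 1 belong to activity 0, pose 2 to activity 1.
W = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
y = [0, 0, 1]
X = [[2.0, 0.0], [0.0, 0.5]]
c = [0, 1]
objective, slacks = max_margin_objective(W, X, c, y, beta=1.0)
```

Note that poses of the same activity never compete with each other in the constraint, which is what lets one activity own several pose models.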
Learning Results
[Figure: learned models for Cricket defensive shot, Cricket bowling, Croquet shot, Tennis serve, Volleyball smash, Tennis forehand]
Model Inference
Input: image I and the learned models.
• Head detection, torso detection, tennis racket detection.
• Layout of the object and body parts.
• Compositional Inference [Chen et al, 2007]
Output, one result per learned model: $(A_1, H_1, O_1^*, \{P_{1,n}^*\}_n), \ldots, (A_K, H_K, O_K^*, \{P_{K,n}^*\}_n)$
Dataset and Experiment Setup
Sport data set: 6 classes [Gupta et al, 2009]
Cricket defensive shot, Cricket bowling, Croquet shot, Tennis forehand, Tennis serve, Volleyball smash
180 training (supervised with object and part locations) & 120 testing images
Tasks:
• Object detection;
• Pose estimation;
• Activity classification.
Object Detection Results
[Figure: precision-recall curves (x-axis: Recall, y-axis: Precision, both 0 to 1) for Cricket bat, Cricket ball, Croquet mallet, Tennis racket, and Volleyball, comparing Our Method, Pedestrian as context, and Scanning window detector; the valid region is marked. Cited: Andriluka et al, 2009; Dalal & Triggs, 2006]
[Figure: example detections from Sliding window, Pedestrian context, and Our method, for the Small object and Background clutter cases]
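For readers unfamiliar with how precision-recall curves like these are produced, the following is a minimal sketch of the standard computation from a ranked list of detections. The slides do not say how a detection is judged correct, so the `is_correct` flags are taken as given, and the function name and sample detections are hypothetical.

```python
def precision_recall(scored_detections, num_positives):
    """Return (recall, precision) points from detections ranked by score.

    scored_detections: list of (score, is_correct) pairs.
    num_positives: number of ground-truth objects in the test set.
    """
    ranked = sorted(scored_detections, key=lambda d: d[0], reverse=True)
    true_pos = 0
    false_pos = 0
    curve = []
    for _, is_correct in ranked:
        if is_correct:
            true_pos += 1
        else:
            false_pos += 1
        recall = true_pos / num_positives
        precision = true_pos / (true_pos + false_pos)
        curve.append((recall, precision))
    return curve


# Hypothetical detections: (detector score, matched a ground-truth object?)
detections = [(0.9, True), (0.3, False), (0.7, True), (0.8, False)]
curve = precision_recall(detections, num_positives=3)
```

Each prefix of the ranked list corresponds to one detector threshold, so sweeping down the list traces the curve from high precision toward high recall.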
Human Pose Estimation Results
Method                  Torso  Upper Leg  Lower Leg  Upper Arm  Lower Arm  Head
Ramanan, 2006           .52    .22  .22   .21  .28   .24  .28   .17  .14   .42
Andriluka et al, 2009   .50    .31  .30   .31  .27   .18  .19   .11  .11   .45
Our full model          .66    .43  .39   .44  .34   .44  .40   .27  .29   .58
One pose per class      .63    .40  .36   .41  .31   .38  .35   .21  .23   .52
[Figure: Andriluka et al, 2009 vs. our estimation result, under the Tennis serve model and the Volleyball smash model]
[Figure: four further estimation results]
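The table can be summarized with a quick calculation. The sketch below copies the ten per-part scores for each method and computes an unweighted mean per row, then checks whether the full model leads in every column. The unweighted mean is an illustrative summary chosen here, not a figure reported in the slides, and the variable and function names are hypothetical.

```python
# Scores in table order: torso, upper leg (2 columns), lower leg (2 columns),
# upper arm (2 columns), lower arm (2 columns), head.
SCORES = {
    "Ramanan, 2006": [.52, .22, .22, .21, .28, .24, .28, .17, .14, .42],
    "Andriluka et al, 2009": [.50, .31, .30, .31, .27, .18, .19, .11, .11, .45],
    "Our full model": [.66, .43, .39, .44, .34, .44, .40, .27, .29, .58],
    "One pose per class": [.63, .40, .36, .41, .31, .38, .35, .21, .23, .52],
}


def mean_score(row):
    """Unweighted mean over the ten body-part columns."""
    return sum(row) / len(row)


def leads_every_column(method, scores):
    """True if `method` has the strictly highest score in every column."""
    target = scores[method]
    others = [row for name, row in scores.items() if name != method]
    return all(
        all(target[k] > row[k] for row in others) for k in range(len(target))
    )


MEANS = {name: mean_score(row) for name, row in SCORES.items()}
FULL_MODEL_LEADS = leads_every_column("Our full model", SCORES)
```

Comparing the "Our full model" and "One pose per class" rows isolates the effect of allowing more than one pose per activity.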
Activity Classification Results
Classification accuracy:
  Our model: 83.3%
  Gupta et al, 2009: 78.9%
  Bag-of-Words (SIFT+SVM): 52.5%
Annotations: “No scene information”; “Scene is critical!!”
[Figure: Cricket shot and Tennis forehand examples compared across Bag-of-words SIFT+SVM, Gupta et al, 2009, and Our model]
Conclusion
Human-Object Interaction: Grouplet representation vs. Mutual context model
Next Steps
• Pose estimation & Object detection on PPMI images.
• Modeling multiple objects and humans.

Más contenido relacionado

Similar a Recognizing Human-Object Interactions in Still Images by Modeling the Mutual Context of Objects and Human Poses

CVPR2010: modeling mutual context of object and human pose in human-object in...
CVPR2010: modeling mutual context of object and human pose in human-object in...CVPR2010: modeling mutual context of object and human pose in human-object in...
CVPR2010: modeling mutual context of object and human pose in human-object in...zukun
 
NIPS2009: Understand Visual Scenes - Part 2
NIPS2009: Understand Visual Scenes - Part 2NIPS2009: Understand Visual Scenes - Part 2
NIPS2009: Understand Visual Scenes - Part 2zukun
 
A Brief History of Object Detection / Tommi Kerola
A Brief History of Object Detection / Tommi KerolaA Brief History of Object Detection / Tommi Kerola
A Brief History of Object Detection / Tommi KerolaPreferred Networks
 
CVPR2010: Context-aware saliency detection
CVPR2010: Context-aware saliency detectionCVPR2010: Context-aware saliency detection
CVPR2010: Context-aware saliency detectionzukun
 
What knowledge bases know (and what they don't)
What knowledge bases know (and what they don't)What knowledge bases know (and what they don't)
What knowledge bases know (and what they don't)srazniewski
 
Xin Yao: "What can evolutionary computation do for you?"
Xin Yao: "What can evolutionary computation do for you?"Xin Yao: "What can evolutionary computation do for you?"
Xin Yao: "What can evolutionary computation do for you?"ieee_cis_cyprus
 
Iccv2009 recognition and learning object categories p2 c03 - objects and an...
Iccv2009 recognition and learning object categories   p2 c03 - objects and an...Iccv2009 recognition and learning object categories   p2 c03 - objects and an...
Iccv2009 recognition and learning object categories p2 c03 - objects and an...zukun
 
Visutl Causal Feature Learning
Visutl Causal Feature LearningVisutl Causal Feature Learning
Visutl Causal Feature LearningKojin Oshiba
 
Introduction to query rewriting optimisation with dependencies
Introduction to query rewriting optimisation with dependenciesIntroduction to query rewriting optimisation with dependencies
Introduction to query rewriting optimisation with dependenciesMariano Rodriguez-Muro
 
Ordinal Common-sense Inference
Ordinal Common-sense InferenceOrdinal Common-sense Inference
Ordinal Common-sense InferenceNaoki Otani
 
NIPS2009: Understand Visual Scenes - Part 1
NIPS2009: Understand Visual Scenes - Part 1NIPS2009: Understand Visual Scenes - Part 1
NIPS2009: Understand Visual Scenes - Part 1zukun
 
The Rise of Approximate Ontology Reasoning: Is It Mainstream Yet? --- Revisit...
The Rise of Approximate Ontology Reasoning: Is It Mainstream Yet? --- Revisit...The Rise of Approximate Ontology Reasoning: Is It Mainstream Yet? --- Revisit...
The Rise of Approximate Ontology Reasoning: Is It Mainstream Yet? --- Revisit...Jeff Z. Pan
 
Interactive Animation And Modeling By Drawing - Pedagogical Applications In M...
Interactive Animation And Modeling By Drawing - Pedagogical Applications In M...Interactive Animation And Modeling By Drawing - Pedagogical Applications In M...
Interactive Animation And Modeling By Drawing - Pedagogical Applications In M...David Bourguignon
 
Deep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word EmbeddingsDeep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word EmbeddingsRoelof Pieters
 
Introduction to Topological Data Analysis
Introduction to Topological Data AnalysisIntroduction to Topological Data Analysis
Introduction to Topological Data AnalysisMason Porter
 
Sampling based motion planning method and shallow survey
Sampling based motion planning method and shallow surveySampling based motion planning method and shallow survey
Sampling based motion planning method and shallow surveyssuser165ef9
 
Information Visualisation (Multimedia 2009 course)
Information Visualisation (Multimedia 2009 course)Information Visualisation (Multimedia 2009 course)
Information Visualisation (Multimedia 2009 course)Joris Klerkx
 
Modes of Play: A Frame Analytic Account of Video Gaming
Modes of Play: A Frame Analytic Account of Video GamingModes of Play: A Frame Analytic Account of Video Gaming
Modes of Play: A Frame Analytic Account of Video GamingSebastian Deterding
 

Similar a Recognizing Human-Object Interactions in Still Images by Modeling the Mutual Context of Objects and Human Poses (20)

CVPR2010: modeling mutual context of object and human pose in human-object in...
CVPR2010: modeling mutual context of object and human pose in human-object in...CVPR2010: modeling mutual context of object and human pose in human-object in...
CVPR2010: modeling mutual context of object and human pose in human-object in...
 
NIPS2009: Understand Visual Scenes - Part 2
NIPS2009: Understand Visual Scenes - Part 2NIPS2009: Understand Visual Scenes - Part 2
NIPS2009: Understand Visual Scenes - Part 2
 
A Brief History of Object Detection / Tommi Kerola
A Brief History of Object Detection / Tommi KerolaA Brief History of Object Detection / Tommi Kerola
A Brief History of Object Detection / Tommi Kerola
 
CVPR2010: Context-aware saliency detection
CVPR2010: Context-aware saliency detectionCVPR2010: Context-aware saliency detection
CVPR2010: Context-aware saliency detection
 
What knowledge bases know (and what they don't)
What knowledge bases know (and what they don't)What knowledge bases know (and what they don't)
What knowledge bases know (and what they don't)
 
2019 Triangle Machine Learning Day - Machine Learning for 3D Imaging - Sayan ...
2019 Triangle Machine Learning Day - Machine Learning for 3D Imaging - Sayan ...2019 Triangle Machine Learning Day - Machine Learning for 3D Imaging - Sayan ...
2019 Triangle Machine Learning Day - Machine Learning for 3D Imaging - Sayan ...
 
Xin Yao: "What can evolutionary computation do for you?"
Xin Yao: "What can evolutionary computation do for you?"Xin Yao: "What can evolutionary computation do for you?"
Xin Yao: "What can evolutionary computation do for you?"
 
Iccv2009 recognition and learning object categories p2 c03 - objects and an...
Iccv2009 recognition and learning object categories   p2 c03 - objects and an...Iccv2009 recognition and learning object categories   p2 c03 - objects and an...
Iccv2009 recognition and learning object categories p2 c03 - objects and an...
 
Visutl Causal Feature Learning
Visutl Causal Feature LearningVisutl Causal Feature Learning
Visutl Causal Feature Learning
 
Introduction to query rewriting optimisation with dependencies
Introduction to query rewriting optimisation with dependenciesIntroduction to query rewriting optimisation with dependencies
Introduction to query rewriting optimisation with dependencies
 
Ordinal Common-sense Inference
Ordinal Common-sense InferenceOrdinal Common-sense Inference
Ordinal Common-sense Inference
 
NIPS2009: Understand Visual Scenes - Part 1
NIPS2009: Understand Visual Scenes - Part 1NIPS2009: Understand Visual Scenes - Part 1
NIPS2009: Understand Visual Scenes - Part 1
 
The Rise of Approximate Ontology Reasoning: Is It Mainstream Yet? --- Revisit...
The Rise of Approximate Ontology Reasoning: Is It Mainstream Yet? --- Revisit...The Rise of Approximate Ontology Reasoning: Is It Mainstream Yet? --- Revisit...
The Rise of Approximate Ontology Reasoning: Is It Mainstream Yet? --- Revisit...
 
Interactive Animation And Modeling By Drawing - Pedagogical Applications In M...
Interactive Animation And Modeling By Drawing - Pedagogical Applications In M...Interactive Animation And Modeling By Drawing - Pedagogical Applications In M...
Interactive Animation And Modeling By Drawing - Pedagogical Applications In M...
 
Deep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word EmbeddingsDeep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word Embeddings
 
Introduction to Topological Data Analysis
Introduction to Topological Data AnalysisIntroduction to Topological Data Analysis
Introduction to Topological Data Analysis
 
Sampling based motion planning method and shallow survey
Sampling based motion planning method and shallow surveySampling based motion planning method and shallow survey
Sampling based motion planning method and shallow survey
 
Iciap 2
Iciap 2Iciap 2
Iciap 2
 
Information Visualisation (Multimedia 2009 course)
Information Visualisation (Multimedia 2009 course)Information Visualisation (Multimedia 2009 course)
Information Visualisation (Multimedia 2009 course)
 
Modes of Play: A Frame Analytic Account of Video Gaming
Modes of Play: A Frame Analytic Account of Video GamingModes of Play: A Frame Analytic Account of Video Gaming
Modes of Play: A Frame Analytic Account of Video Gaming
 

Más de أحلام انصارى (18)

Security issues in cloud database
Security  issues  in cloud   database Security  issues  in cloud   database
Security issues in cloud database
 
Html5 offers 5 times better ways to hijack the website
Html5 offers 5 times better ways to hijack the website Html5 offers 5 times better ways to hijack the website
Html5 offers 5 times better ways to hijack the website
 
Honey pot in cloud computing
Honey pot in cloud computingHoney pot in cloud computing
Honey pot in cloud computing
 
grid authentication
grid authenticationgrid authentication
grid authentication
 
Security As A Service In Cloud(SECaaS)
Security As A Service In Cloud(SECaaS)Security As A Service In Cloud(SECaaS)
Security As A Service In Cloud(SECaaS)
 
Soa
SoaSoa
Soa
 
Rbac
RbacRbac
Rbac
 
Password craking techniques
Password craking techniques Password craking techniques
Password craking techniques
 
Operating system vulnerability and control
Operating system vulnerability and control Operating system vulnerability and control
Operating system vulnerability and control
 
Network ssecurity toolkit
Network ssecurity toolkitNetwork ssecurity toolkit
Network ssecurity toolkit
 
Image forgery and security
Image forgery and securityImage forgery and security
Image forgery and security
 
Image based authentication
Image based authenticationImage based authentication
Image based authentication
 
Dmz
Dmz Dmz
Dmz
 
Cryptography
Cryptography Cryptography
Cryptography
 
Website attack n defacement n its control measures
Website attack n defacement n its control measures Website attack n defacement n its control measures
Website attack n defacement n its control measures
 
public cloud security via ids
public cloud security via idspublic cloud security via ids
public cloud security via ids
 
Grid computing by ahlam ansari
Grid computing by  ahlam ansariGrid computing by  ahlam ansari
Grid computing by ahlam ansari
 
Analysis of Algorithm
Analysis of AlgorithmAnalysis of Algorithm
Analysis of Algorithm
 

Último

Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin ClassesCeline George
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
Role Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptxRole Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptxNikitaBankoti2
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfPoh-Sun Goh
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxAreebaZafar22
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.christianmathematics
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxVishalSingh1417
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701bronxfugly43
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhikauryashika82
 
Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfChris Hunter
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsMebane Rash
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfAyushMahapatra5
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxRamakrishna Reddy Bijjam
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxVishalSingh1417
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxnegromaestrong
 

Último (20)

Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Role Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptxRole Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptx
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.

Recognizing Human-Object Interactions in Still Images by Modeling the Mutual Context of Objects and Human Poses

  • 1. Recognizing Human-Object Interactions in Still Images by Modeling the Mutual Context of Objects and Human Poses. Presented by Arwa Chittalwala, Irfan Shaikh, Heena Patel.
  • 2. Human-Object Interaction. Applications: robots interact with objects; automatic sports commentary (“Kobe is dunking the ball.”); medical care.
  • 3. Human-Object Interaction. Image labels: Playing saxophone vs. Playing bassoon vs. Playing saxophone. Grouplet is a generic feature for structured objects, or interactions of groups of objects (previous talk: Grouplet). Caltech101 results: Berg & Malik, 2005: 48%; Grauman & Darrell, 2005: 59%; Gehler & Nowozin, 2009: 77%; OURS: 62%. HOI activity: Tennis Forehand. Holistic image based classification vs. detailed understanding and reasoning.
  • 4. Human-Object Interaction. Holistic image based classification vs. detailed understanding and reasoning: • Human pose estimation (Torso, Head).
  • 5. Human-Object Interaction. Holistic image based classification vs. detailed understanding and reasoning: • Human pose estimation • Object detection (Tennis racket).
  • 6. Human-Object Interaction. Holistic image based classification vs. detailed understanding and reasoning: • Human pose estimation (Torso, Head) • Object detection (Tennis racket). HOI activity: Tennis Forehand.
  • 7. Outline: • Background and Intuition • Mutual Context of Object and Human Pose (Model Representation; Model Learning; Model Inference) • Experiments • Conclusion
  • 8. Outline: • Background and Intuition • Mutual Context of Object and Human Pose (Model Representation; Model Learning; Model Inference) • Experiments • Conclusion
  • 9. Human pose estimation & Object detection. Human pose estimation is challenging: difficult part appearance; self-occlusion; image regions that look like a body part. References: Felzenszwalb & Huttenlocher, 2005; Ren et al, 2005; Ramanan, 2006; Ferrari et al, 2008; Yang & Mori, 2008; Andriluka et al, 2009; Eichner & Ferrari, 2009.
  • 10. Human pose estimation & Object detection. Human pose estimation is challenging. References: Felzenszwalb & Huttenlocher, 2005; Ren et al, 2005; Ramanan, 2006; Ferrari et al, 2008; Yang & Mori, 2008; Andriluka et al, 2009; Eichner & Ferrari, 2009.
  • 11. Human pose estimation & Object detection. Facilitate: pose estimation becomes easier given that the object is detected.
  • 12. Human pose estimation & Object detection. Object detection is challenging: objects are small, low-resolution, or partially occluded; image regions are similar to the detection target. References: Viola & Jones, 2001; Lampert et al, 2008; Divvala et al, 2009; Vedaldi et al, 2009.
  • 13. Human pose estimation & Object detection. Object detection is challenging. References: Viola & Jones, 2001; Lampert et al, 2008; Divvala et al, 2009; Vedaldi et al, 2009.
  • 14. Human pose estimation & Object detection. Facilitate: object detection becomes easier given that the pose is estimated.
  • 15. Human pose estimation & Object detection: Mutual Context.
  • 16. Context in Computer Vision. Previous work: use context cues to facilitate object detection. Context is helpful, but detection with context outperforms detection without context only moderately (~3-4%). References: Hoiem et al, 2006; Rabinovich et al, 2007; Oliva & Torralba, 2007; Heitz & Koller, 2008; Desai et al, 2009; Divvala et al, 2009; Murphy et al, 2003; Shotton et al, 2006; Harzallah et al, 2009; Li, Socher & Fei-Fei, 2009; Marszalek et al, 2009; Bao & Savarese, 2010. Also listed: Viola & Jones, 2001; Lampert et al, 2008.
  • 17. Context in Computer Vision. Previous work: use context cues to facilitate object detection. Context is helpful, but detection with context outperforms detection without context only moderately (~3-4%). References: Hoiem et al, 2006; Rabinovich et al, 2007; Oliva & Torralba, 2007; Heitz & Koller, 2008; Desai et al, 2009; Divvala et al, 2009; Murphy et al, 2003; Shotton et al, 2006; Harzallah et al, 2009; Li, Socher & Fei-Fei, 2009; Marszalek et al, 2009; Bao & Savarese, 2010. Our approach: two challenging tasks serve as mutual context of each other (figure panels: with mutual context vs. without context).
  • 18. Outline: • Background and Intuition • Mutual Context of Object and Human Pose (Model Representation; Model Learning; Model Inference) • Experiments • Conclusion
  • 19. Mutual Context Model Representation. Graph nodes: A, O, H, P1, P2, …, PN and image evidence fO, f1, f2, …, fN. A: Activity (Croquet shot, Volleyball smash, Tennis forehand). O: Object (Croquet mallet, Volleyball, Tennis racket). H: Human pose; intra-class variations mean there is more than one H for each A; H is unobserved during training. P: Body parts, with $l_P$: location; $\theta_P$: orientation; $s_P$: scale. f: Image evidence, shape context [Belongie et al, 2002].
  • 20. Mutual Context Model Representation. Markov Random Field: $\Psi = \sum_{e \in E} w_e \psi_e$, where $\psi_e$ is the clique potential and $w_e$ is the clique weight. • $\psi_e(A, O)$, $\psi_e(A, H)$, $\psi_e(O, H)$: Frequency of co-occurrence between A, O, and H.
  • 21. Mutual Context Model Representation. Markov Random Field: $\Psi = \sum_{e \in E} w_e \psi_e$, where $\psi_e$ is the clique potential and $w_e$ is the clique weight. • $\psi_e(A, O)$, $\psi_e(A, H)$, $\psi_e(O, H)$: Frequency of co-occurrence between A, O, and H. • $\psi_e(O, P_n)$, $\psi_e(H, P_n)$, $\psi_e(P_m, P_n)$: Spatial relationship among object and body parts, built from three terms: location, $\mathrm{bin}(l_O - l_{P_n})$; orientation, $\mathrm{bin}(\theta_O - \theta_{P_n})$; and size, a term in $s_O$ and $s_{P_n}$.
  • 22. Mutual Context Model Representation. Markov Random Field: $\Psi = \sum_{e \in E} w_e \psi_e$, where $\psi_e$ is the clique potential and $w_e$ is the clique weight. • $\psi_e(A, O)$, $\psi_e(A, H)$, $\psi_e(O, H)$: Frequency of co-occurrence between A, O, and H. • $\psi_e(O, P_n)$, $\psi_e(H, P_n)$, $\psi_e(P_m, P_n)$: Spatial relationship among object and body parts, built from three terms: location, $\mathrm{bin}(l_O - l_{P_n})$; orientation, $\mathrm{bin}(\theta_O - \theta_{P_n})$; and size, a term in $s_O$ and $s_{P_n}$. • Learn structural connectivity among the body parts and the object (obtained by structure learning).
  • 23. Mutual Context Model Representation. Markov Random Field: $\Psi = \sum_{e \in E} w_e \psi_e$, where $\psi_e$ is the clique potential and $w_e$ is the clique weight. • $\psi_e(A, O)$, $\psi_e(A, H)$, $\psi_e(O, H)$: Frequency of co-occurrence between A, O, and H. • $\psi_e(O, P_n)$, $\psi_e(H, P_n)$, $\psi_e(P_m, P_n)$: Spatial relationship among object and body parts, built from three terms: location, $\mathrm{bin}(l_O - l_{P_n})$; orientation, $\mathrm{bin}(\theta_O - \theta_{P_n})$; and size, a term in $s_O$ and $s_{P_n}$. • Learn structural connectivity among the body parts and the object. • $\psi_e(O, f_O)$ and $\psi_e(P_n, f_{P_n})$: Discriminative part detection scores, using shape context + AdaBoost [Andriluka et al, 2009] [Belongie et al, 2002] [Viola & Jones, 2001].
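
The Markov random field score described on these slides is a weighted sum of clique potentials over the edges of the graph. The following is a minimal sketch of that sum only. The function name `mrf_score`, the dictionary layout, and all numeric values are hypothetical and chosen for illustration; they are not taken from the talk.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]


def mrf_score(potentials: Dict[Edge, float], weights: Dict[Edge, float]) -> float:
    """Return the weighted sum of clique potentials over the edge set."""
    return sum(weights[edge] * value for edge, value in potentials.items())


# Made-up potential values for the three co-occurrence edges.
potentials = {("A", "O"): 0.8, ("A", "H"): 0.5, ("O", "H"): 0.25}
# Made-up clique weights, one per edge.
weights = {("A", "O"): 1.0, ("A", "H"): 2.0, ("O", "H"): 4.0}

score = mrf_score(potentials, weights)
print(score)
```

In a fuller model the same sum would also run over the spatial and detection-score edges; only the edge set changes, not the form of the score.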
  • 24. Outline: • Background and Intuition • Mutual Context of Object and Human Pose (Model Representation; Model Learning; Model Inference) • Experiments • Conclusion
  • 25. Model Learning. Model: $\Psi = \sum_{e \in E} w_e \psi_e$. Input: images of cricket shot and cricket bowling. Goals: hidden human poses.
  • 26. Model Learning. Model: $\Psi = \sum_{e \in E} w_e \psi_e$. Input: images of cricket shot and cricket bowling. Goals: hidden human poses; structural connectivity.
  • 27. Model Learning. Model: $\Psi = \sum_{e \in E} w_e \psi_e$. Input: images of cricket shot and cricket bowling. Goals: hidden human poses; structural connectivity; potential parameters; potential weights.
  • 28. Model Learning. Model: $\Psi = \sum_{e \in E} w_e \psi_e$. Input: images of cricket shot and cricket bowling. Goals: hidden human poses; structural connectivity; potential parameters; potential weights. These correspond to hidden variables, structure learning, and parameter estimation.
  • 29. Model Learning. Model: $\Psi = \sum_{e \in E} w_e \psi_e$. Goals: hidden human poses; structural connectivity; potential parameters; potential weights. Approach: illustrated on croquet shot.
  • 30. Model Learning. Model: $\Psi = \sum_{e \in E} w_e \psi_e$. Goals: hidden human poses; structural connectivity; potential parameters; potential weights. Approach (structural connectivity): hill-climbing on $\max_E \left[ \sum_{e \in E} w_e \psi_e - \frac{|E|^2}{2\sigma^2} \right]$, where the first term is the joint density of the model and the second is a Gaussian prior on the edge number.
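
The structure-learning slide names hill-climbing with a Gaussian prior on the edge number. Below is a minimal sketch of that idea under strong simplifying assumptions: each candidate edge has a fixed, made-up score standing in for its contribution to the model's joint density, and the prior is written as a penalty of |E|² / (2σ²). The names `objective`, `hill_climb`, `edge_score`, and `sigma`, and all numbers, are hypothetical.

```python
from itertools import combinations


def objective(edges, edge_score, sigma):
    """Fit term (sum of per-edge scores) minus a Gaussian penalty on edge count."""
    fit = sum(edge_score[e] for e in edges)
    penalty = (len(edges) ** 2) / (2.0 * sigma ** 2)
    return fit - penalty


def hill_climb(candidates, edge_score, sigma):
    """Greedily add or remove single edges while the objective improves."""
    edges = set()
    best = objective(edges, edge_score, sigma)
    improved = True
    while improved:
        improved = False
        for e in candidates:
            trial = edges ^ {e}  # toggle: add the edge if absent, remove if present
            value = objective(trial, edge_score, sigma)
            if value > best + 1e-12:
                edges, best, improved = trial, value, True
    return edges, best


nodes = ["O", "P1", "P2"]
candidates = list(combinations(nodes, 2))
# Made-up per-edge scores (assumption: a stand-in for the joint-density gain).
edge_score = {("O", "P1"): 2.0, ("O", "P2"): 1.0, ("P1", "P2"): 0.1}

learned_edges, learned_value = hill_climb(candidates, edge_score, sigma=1.0)
print(learned_edges, learned_value)
```

With a small `sigma` the penalty grows quickly, so only the strongest edge survives; a larger `sigma` tolerates denser graphs.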
  • 31. Model Learning. Model: $\Psi = \sum_{e \in E} w_e \psi_e$. Goals: hidden human poses; structural connectivity; potential parameters; potential weights. Approach (potential parameters): • Maximum likelihood for $\psi_e(A, O)$, $\psi_e(A, H)$, $\psi_e(O, H)$, $\psi_e(O, P_n)$, $\psi_e(H, P_n)$, $\psi_e(P_m, P_n)$. • Standard AdaBoost for $\psi_e(O, f_O)$ and $\psi_e(P_n, f_{P_n})$.
  • 32. Model Learning. Model: $\Psi = \sum_{e \in E} w_e \psi_e$. Goals: hidden human poses; structural connectivity; potential parameters; potential weights. Approach (potential weights): max-margin learning, $\min_{\mathbf{w}, \xi} \frac{1}{2} \sum_r \|\mathbf{w}_r\|_2^2 + \beta \sum_i \xi_i$ s.t. $\forall i, r$ where $y(r) \neq y(c_i)$: $\mathbf{w}_{c_i} \cdot \mathbf{x}_i - \mathbf{w}_r \cdot \mathbf{x}_i \geq 1 - \xi_i$; $\forall i$: $\xi_i \geq 0$. Notations: • $\mathbf{x}_i$: Potential values of the i-th image. • $\mathbf{w}_r$: Potential weights of the r-th pose. • $y(r)$: Activity of the r-th pose. • $\xi_i$: A slack variable for the i-th image.
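
The max-margin slide asks the weights of an image's own pose to outscore the weights of every pose from a different activity by a margin, with a slack variable per image absorbing violations. A minimal sketch of the slack computation for one image follows. It assumes that `c` is the index of the pose assigned to the image and that the margin is 1; the function names, the toy weights, and the pose-to-activity table are all hypothetical.

```python
def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def slack(x, c, weights, activity_of_pose):
    """Smallest non-negative slack that satisfies the margin constraints for one image.

    x: potential values of the image
    c: index of the pose assigned to the image (assumption)
    weights: one weight vector per pose
    activity_of_pose: activity label of each pose
    """
    own = dot(weights[c], x)
    violations = [
        1.0 - (own - dot(weights[r], x))
        for r in range(len(weights))
        if activity_of_pose[r] != activity_of_pose[c]
    ]
    return max([0.0] + violations)


# Three poses: poses 0 and 1 belong to activity 0, pose 2 to activity 1 (made up).
weights = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
activity_of_pose = [0, 0, 1]

tight = slack([1.0, 0.5], 0, weights, activity_of_pose)  # margin violated
loose = slack([2.0, 0.0], 0, weights, activity_of_pose)  # margin satisfied
print(tight, loose)
```

Poses of the same activity are skipped in the comparison, which matches the constraint being stated only for poses whose activity differs from that of the image's own pose.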
  • 35. Outline: • Background and Intuition • Mutual Context of Object and Human Pose (Model Representation; Model Learning; Model Inference) • Experiments • Conclusion
  • 37. Model Inference. Input: image $I$ and the learned models. Head detection, torso detection, and tennis racket detection give the layout of the object and body parts. Compositional Inference [Chen et al, 2007]. Result for the first model: $(A_1, H_1, O_1^*, P_{1,n}^*)$.
  • 38. Model Inference. Input: image $I$ and the learned models. Output: $(A_1, H_1, O_1^*, P_{1,n}^*)$, …, $(A_K, H_K, O_K^*, P_{K,n}^*)$.
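
The inference slides show one output per learned model. One plausible way to turn those outputs into a single prediction, which is an assumption here and not stated on the slides, is to keep the model with the highest score. The sketch below uses hypothetical names (`ModelOutput`, `pick_best`) and made-up scores.

```python
from typing import List, NamedTuple


class ModelOutput(NamedTuple):
    activity: str  # A_k: activity of the k-th model
    pose_id: int   # H_k: pose index of the k-th model
    score: float   # model score after fitting object and parts (made up here)


def pick_best(outputs: List[ModelOutput]) -> ModelOutput:
    """Return the output with the highest score (assumed selection rule)."""
    return max(outputs, key=lambda o: o.score)


outputs = [
    ModelOutput("tennis forehand", 0, 1.7),
    ModelOutput("tennis serve", 1, 2.4),
    ModelOutput("volleyball smash", 2, 0.9),
]

best = pick_best(outputs)
print(best.activity, best.pose_id)
```

The winning entry carries both the activity label and the pose index, so activity classification and pose selection come from the same decision.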
  • 39. Outline: • Background and Intuition • Mutual Context of Object and Human Pose (Model Representation; Model Learning; Model Inference) • Experiments • Conclusion
  • 40. Dataset and Experiment Setup. Sport data set [Gupta et al, 2009], 6 classes: Cricket defensive shot, Cricket bowling, Croquet shot, Tennis forehand, Tennis serve, Volleyball smash. 180 training images (supervised with object and part locations) & 120 testing images. Tasks: • Object detection; • Pose estimation; • Activity classification.
  • 41. Dataset and Experiment Setup. Sport data set [Gupta et al, 2009], 6 classes: Cricket defensive shot, Cricket bowling, Croquet shot, Tennis forehand, Tennis serve, Volleyball smash. 180 training images (supervised with object and part locations) & 120 testing images. Tasks: • Object detection; • Pose estimation; • Activity classification.
  • 42. Object Detection Results. [Figure: precision-recall curves (axes: Recall, Precision) for Cricket bat, Croquet mallet, Tennis racket, Volleyball, and Cricket ball, with the valid region marked. Methods compared: Our Method, Sliding window, Pedestrian context. Citations on slide: Andriluka et al, 2009; Dalal & Triggs, 2006.]
  • 43. Object Detection Results. [Figure: precision-recall curves (axes: Recall, Precision) for Volleyball and Cricket ball, comparing Our Method, Pedestrian as context, and Scanning window detector. Example detections for Sliding window, Pedestrian context, and Our method, annotated "Small object" and "Background clutter".]
  • 44. Dataset and Experiment Setup. Sport data set [Gupta et al, 2009], 6 classes: Cricket defensive shot, Cricket bowling, Croquet shot, Tennis forehand, Tennis serve, Volleyball smash. 180 training & 120 testing images. Tasks: • Object detection; • Pose estimation; • Activity classification.
  • 45. Human Pose Estimation Results.
      Method                  Torso  Upper Leg  Lower Leg  Upper Arm  Lower Arm  Head
      Ramanan, 2006           .52    .22  .22   .21  .28   .24  .28   .17  .14   .42
      Andriluka et al, 2009   .50    .31  .30   .31  .27   .18  .19   .11  .11   .45
      Our full model          .66    .43  .39   .44  .34   .44  .40   .27  .29   .58
  • 46. Human Pose Estimation Results.
      Method                  Torso  Upper Leg  Lower Leg  Upper Arm  Lower Arm  Head
      Ramanan, 2006           .52    .22  .22   .21  .28   .24  .28   .17  .14   .42
      Andriluka et al, 2009   .50    .31  .30   .31  .27   .18  .19   .11  .11   .45
      Our full model          .66    .43  .39   .44  .34   .44  .40   .27  .29   .58
    [Figure: Andriluka et al, 2009 vs. our estimation result, for the tennis serve model and the volleyball smash model.]
  • 47. Human Pose Estimation Results.
      Method                  Torso  Upper Leg  Lower Leg  Upper Arm  Lower Arm  Head
      Ramanan, 2006           .52    .22  .22   .21  .28   .24  .28   .17  .14   .42
      Andriluka et al, 2009   .50    .31  .30   .31  .27   .18  .19   .11  .11   .45
      Our full model          .66    .43  .39   .44  .34   .44  .40   .27  .29   .58
      One pose per class      .63    .40  .36   .41  .31   .38  .35   .21  .23   .52
    [Figure: four example estimation results.]
  • 48. Dataset and Experiment Setup. Sport data set [Gupta et al, 2009], 6 classes: Cricket defensive shot, Cricket bowling, Croquet shot, Tennis forehand, Tennis serve, Volleyball smash. 180 training & 120 testing images. Tasks: • Object detection; • Pose estimation; • Activity classification.
  • 49. Activity Classification Results. Classification accuracy: Our model: 83.3%; Gupta et al, 2009: 78.9%; Bag-of-words (SIFT+SVM): 52.5%. Slide annotations: "No scene information"; "Scene is critical!!". Example classes shown: Cricket shot, Tennis forehand.
  • 50. Conclusion. Human-Object Interaction: Grouplet representation vs. Mutual context model. Next Steps: • Pose estimation & Object detection on PPMI images. • Modeling multiple objects and humans.