Human action recognition using spatio-temporal features Nikhil Sawant (2007MCS2899) Guide : Dr. K.K. Biswas
Human activity recognition [Figure: tasks arranged along two axes, Higher resolution and Longer time scale; labels: Pose Estimation, Action Recognition, Action Classification, Tracking, Activity Recognition. Images courtesy: Y. Ke, Fathi and Mori, Bobick and Davis, Schuldt et al., Leibe et al., Vaswani et al.]
Why use action recognition?
Goals
Goals (continued)
Existing approaches
Tracking interest points: tracking five crucial points (head, two hands, two feet), which are mostly present at the local maxima on the plot of geodesic distance. Images courtesy: P. Correra
Tracking interest points (continued)
Flow-based approaches
Shape-based approaches. Images courtesy: M. Blank
Our approach
Optical flow and motion features
Target Localization [Figure panels: Original video; Silhouette; Original video with ROI marked]
Motion estimation
Noise removal [Figure panels: Noisy optical flows; After noise removal]
Organizing optical flow
Organizing optical flow (Local oriented Histogram)
Organizing optical flow (Local oriented Histogram) [Figure labels: C (0,0), d_1, d_2, O_1, O_2, O_e]
Organizing optical flow (Weighted Averaging)
Organizing optical flows
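The slides name a local oriented histogram as one way to organize optical flow, but the details did not survive. The following is only a minimal sketch of a generic orientation histogram for the flow vectors of one grid cell. The function name, the number of bins and the magnitude weighting are all assumptions made for illustration and are not taken from the presentation.

```python
import math

def orientation_histogram(flows, n_bins=8):
    """Magnitude-weighted histogram of flow directions for one grid cell.

    flows  : iterable of (u, v) optical flow vectors (hypothetical input format)
    n_bins : number of direction bins covering 0..2*pi (assumed value)
    """
    hist = [0.0] * n_bins
    bin_width = 2 * math.pi / n_bins
    for u, v in flows:
        magnitude = math.hypot(u, v)
        if magnitude == 0.0:
            continue  # a zero vector has no direction
        angle = math.atan2(v, u) % (2 * math.pi)  # map to [0, 2*pi)
        index = int(angle / bin_width) % n_bins
        hist[index] += magnitude
    total = sum(hist)
    # Normalise so that cells with different amounts of motion are comparable
    return [h / total for h in hist] if total > 0 else hist
```

Concatenating such per-cell histograms over a grid laid on the region of interest would give one fixed-length vector per frame, which is the kind of input a boosted classifier expects.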
Formation of motion descriptor
Learning with AdaBoost: the strong classifier is a weighted combination of weak classifiers, each evaluated on the feature vector.
Classification example (taken from Antonio Torralba, MIT): Each data point has a class label $y_t \in \{+1, -1\}$ and a weight $w_t = 1$. Weak learners come from the family of lines; a line $h$ with $p(\text{error}) = 0.5$ is at chance.
Classification example: This one seems to be the best. It is a "weak classifier": it performs slightly better than chance.
Classification example: We update the weights, $w_t \leftarrow w_t \exp\{-y_t H_t\}$. This sets a new problem for which the previous weak classifier performs at chance again.
Classification example: The strong (non-linear) classifier is built as the combination of all the weak (linear) classifiers $f_1, f_2, f_3, f_4$.
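The boosting loop walked through in the classification example can be sketched in a few lines. This is a minimal sketch of standard discrete AdaBoost, not the implementation used in the thesis. It assumes axis-aligned threshold classifiers (decision stumps) as weak learners, which are a restricted subset of the "family of lines" on the slides, and every function name here is hypothetical.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, dim, threshold, sign) of the best stump."""
    best = None
    for dim in range(len(X[0])):
        for thr in sorted({x[dim] for x in X}):
            for sign in (+1, -1):
                pred = [sign if x[dim] > thr else -sign for x in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, dim, thr, sign)
    return best

def stump_predict(stump, x):
    _, dim, thr, sign = stump
    return sign if x[dim] > thr else -sign

def adaboost(X, y, rounds=10):
    """Train a list of (alpha, stump) pairs; labels y must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Misclassified points gain weight, correctly classified points lose it
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    """Strong classifier: sign of the weighted vote of all weak classifiers."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1
```

As a usage example, a one-dimensional interval (label +1 for values 3 to 6 among the integers 0 to 9, otherwise -1) cannot be separated by any single stump, yet `adaboost(X, y, rounds=3)` classifies every training point correctly, which illustrates how a non-linear strong classifier emerges from linear weak ones.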
Our Dataset
ACTION         SUBJECTS   VIDEOS
Walking        8          34
Running        8          20
Flying         5          25
Waving         5          25
Pick up        6          24
Stand up       6          48
Sitting down   6          24
Our Dataset (Tennis actions)
ACTION     SUBJECTS   VIDEOS
Forehand   3          11
Backhand   3          10
Service    2          9
Training and Testing Dataset
ACTION         TRAINING   TESTING
Walking        1184       1710
Running        183        335
Flying         182        373
Waving         198        317
Pick up        111        160
Stand up       128        187
Sitting down   230        282
Classification result (framewise)
Columns: Walking, Running, Flying, Waving, Pick up, Sit down, Stand up, Error
Walking: 1644, 46, 0, 17, 1, 2; error 3.86%
Running: 35, 295, 3, 2; error 11.94%
Flying: 1, 2, 349, 11, 9, 1; error 6.43%
Waving: 11, 8, 269, 29; error 15.14%
Pick up: 8, 7, 1, 120, 23, 1; error 25%
Sit down: 1, 1, 26, 179; error 14.97%
Stand up: 23, 282; error 8.15%
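The per-class error in the framewise table appears to be the fraction of a class's test frames that were assigned to any other class. The short sketch below checks that reading against the Walking and Running rows. The helper name is hypothetical, and this interpretation of the Error column is an assumption inferred from the numbers, not stated on the slide.

```python
def row_error(correct, misclassified):
    """Fraction of frames of one class that were assigned to other classes."""
    wrong = sum(misclassified)
    return wrong / (correct + wrong)

# Walking row: 1644 frames correct, the rest spread over other classes
print(f"{row_error(1644, [46, 0, 17, 1, 2]):.2%}")  # prints 3.86%
# Running row: 295 frames correct
print(f"{row_error(295, [35, 3, 2]):.2%}")          # prints 11.94%
```

The Walking row also sums to 1710, which matches the number of walking test frames listed in the training and testing table.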
Classification results (clipwise)
Columns: Walking, Running, Waving1, Waving2, Bending, Sit-down, Stand-up, Error
Walking: 10; error 0.0%
Running: 10; error 0.0%
Waving1: 9, 1; error 10.0%
Waving2: 10; error 0.0%
Bending: 9, 1; error 10.0%
Sit-down: 10; error 0.0%
Stand-up: 1, 9; error 10.0%
Action classification
Classification results (Tennis events)
Columns: Forehand, Backhand, Service, Error
Forehand: 54, 7, 11; error 21.95%
Backhand: 11, 53; error 10.75%
Service: 8, 49; error 14.04%
Event Detection [Figure: current frame f, with the previous n frames (f-1, f-2, ..., f-n) and the next n frames (f+1, f+2, ..., f+n)]
Event Detection [Figure panels: Without using prediction logic; With prediction logic]
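The event-detection figure shows a current frame f together with its previous n and next n frames. One plausible reading is that the label of frame f is corrected using its neighbours. The sketch below implements that idea as a sliding-window majority vote. The voting rule and the function name are assumptions for illustration; the presentation's actual prediction logic is not recoverable from the slides.

```python
from collections import Counter

def smooth_labels(labels, n=2):
    """Replace each frame label by the majority label in a window of +/- n frames."""
    smoothed = []
    for f in range(len(labels)):
        window = labels[max(0, f - n): f + n + 1]  # window is clipped at the ends
        smoothed.append(Counter(window).most_common(1)[0][0])
    return smoothed

# A single misclassified frame inside a walking sequence is voted away
frames = ["walk"] * 4 + ["run"] + ["walk"] * 4
print(smooth_labels(frames, n=2))
```

A larger n removes longer runs of spurious labels, at the cost of blurring the boundaries between genuine events.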
Weizmann Dataset
ACTION   SUBJECTS   VIDEOS
Bend     9          9
Jack     9          9
Jump     9          9
Pjump    9          9
Run      9          10
Side     9          9
Skip     9          10
Walk     9          10
Wave1    9          9
Wave2    9          9
Standard Dataset (Weizmann Dataset) [Figure panels: Walk, Side, Skip, Wave1, Wave2, Bend, Run, Jack, Jump, Pjump]
Confusion matrix (framewise)
Columns: Bend, Jack, Jump, Pjump, Run, Side, Skip, Walk, Wave1, Wave2
Bend: 271, 1, 1, 20, 3, 30, 11
Jack: 18, 368, 8, 48, 3, 2, 3, 9, 16
Jump: 9, 3, 157, 8, 2, 26, 19, 7
Pjump: 36, 26, 237, 22, 6
Run: 4, 2, 5, 158, 3, 50, 6, 1, 2
Side: 11, 9, 77, 1, 1, 84, 3, 58, 2, 1
Skip: 3, 9, 76, 43, 5, 109, 24, 1, 7
Walk: 2, 5, 16, 2, 13, 5, 395
Wave1: 47, 2, 12, 238, 27
Wave2: 30, 6, 1, 4, 1, 55, 269
Weizmann dataset
Use of MV + Shape Info (SI)
Use of MV + Differential SI
Confusion matrix (framewise)
Columns: Bend, Jack, Jump, Pjump, Run, Side, Skip, Walk, Wave1, Wave2
Bend: 326, 7, 2, 2
Jack: 6, 418, 39, 1, 3, 8
Jump: 18, 1, 189, 1, 5, 4, 13
Pjump: 11, 55, 243, 6, 1, 11
Run: 2, 2, 173, 2, 45, 7
Side: 8, 30, 11, 1, 152, 12, 33
Skip: 1, 20, 32, 83, 4, 121, 13, 1, 2
Walk: 1, 1, 2, 1, 1, 432
Wave1: 43, 1, 10, 10, 232, 30
Wave2: 13, 25, 328
Spatio-temporal features [Figure labels: TSPAN, TLEN]
Spatio-temporal descriptor
Event classification (clipwise)
Columns: Bend, Jack, Jump, Pjump, Run, Side, Skip, Walk, Wave1, Wave2, Error
Bend: 9; error 0.0%
Jack: 9; error 0.0%
Jump: 9; error 0.0%
Pjump: 9; error 0.0%
Run: 9, 1; error 10.0%
Side: 9; error 0.0%
Skip: 10; error 0.0%
Walk: 10; error 0.0%
Wave1: 8, 1; error 11.1%
Wave2: 9; error 0.0%
Action recognition in cluttered background
Cluttered environment
Training
Training data: drinking
Training data: bending
Template length
Single template formation [Figure: a 15-frame sequence (frames 1 to 15) reduced to 12 frames by dropping frames 2, 7 and 12, then renumbered 1 to 12]
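The frame numbers on the single-template slide suggest that clips of different lengths are brought to a common template length by dropping evenly spaced frames (15 frames reduced to 12). The following is a minimal sketch of one way to do that using uniform index sampling. The function name and the rounding rule are assumptions, and this rule drops a different, though similarly spaced, set of frames than the one shown on the slide.

```python
def resample_indices(n_frames, target_len):
    """Pick target_len frame indices spread evenly over 0 .. n_frames - 1."""
    if target_len == 1:
        return [0]
    step = (n_frames - 1) / (target_len - 1)
    return [round(i * step) for i in range(target_len)]

# Reduce a 15-frame clip to a 12-frame template (0-based indices)
print(resample_indices(15, 12))
# prints [0, 1, 3, 4, 5, 6, 8, 9, 10, 11, 13, 14]
```

The first and last frames are always kept, so the start and end of the action stay aligned across templates.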
Optical flow and AdaBoost
Testing [Figure labels: Height, Width, Length; axes: t, x, y]
Testing (continued) [Figure labels: Height, Width, Length]
Confidence matrix
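The testing slides label a template by Height, Width and Length inside a video volume with axes x, y and t, followed by a confidence matrix. One plausible reading is a sliding-window search: the template is placed at every position in the volume and a classifier confidence is recorded for each placement. The sketch below assumes exactly that. The function names, the stride parameter and the `score_fn` callback are hypothetical, and the real scoring function would be the trained classifier, which is not reproduced here.

```python
def sliding_confidence(video_shape, template_shape, score_fn, stride=1):
    """Score every placement of a (length, height, width) template in a video.

    video_shape    : (frames, rows, cols) of the video volume
    template_shape : (length, height, width) of the template
    score_fn       : callable (t, y, x) -> confidence for that placement
    """
    n_frames, rows, cols = video_shape
    length, height, width = template_shape
    confidence = {}
    for t in range(0, n_frames - length + 1, stride):
        for y in range(0, rows - height + 1, stride):
            for x in range(0, cols - width + 1, stride):
                confidence[(t, y, x)] = score_fn(t, y, x)
    return confidence

def best_detection(confidence):
    """Return ((t, y, x), score) of the most confident placement."""
    return max(confidence.items(), key=lambda item: item[1])
```

Thresholding the recorded confidences instead of taking only the maximum would allow several occurrences of the action to be reported in one video.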
Results
