Action Recognition (Thesis presentation)

Human action recognition using spatio-temporal features
Nikhil Sawant (2007MCS2899)
Guide: Dr. K.K. Biswas

Human activity recognition
[Figure: axes "Higher resolution" and "Longer time scale"; labels: Pose Estimation, Action Recognition, Action Classification, Tracking, Activity Recognition]
Courtesy: Y. Ke, Fathi and Mori, Bobick and Davis, Schuldt et al., Leibe et al., Vaswani et al.

Why use action recognition?

Goals….

Goals….

Existing Approaches

Tracking interest points
Tracking five crucial points: the head, two hands, and two feet. These mostly lie at the local maxima of the geodesic distance plot.
Images courtesy: P. Correra

Tracking interest points

Flow based approaches

Shape based Approaches
Images courtesy: M. Blank

Our Approach

Optical flow and motion features

Target Localization
[Figure panels: Original video; Silhouette; Original video with ROI marked]

Motion estimation

Noise removal
[Figure panels: Noisy optical flows; After noise removal]

Organizing optical flow

Organizing optical flow (Local oriented Histogram)

Organizing optical flow (Local oriented Histogram)
[Figure labels: C (0,0), d_1, d_2, O_1, O_2, O_e]
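A minimal sketch of an orientation histogram over optical-flow vectors may help make the idea of a local oriented histogram concrete. The slide's own bullets were not preserved, so the details here are assumptions: the function name `orientation_histogram`, the number of bins, and the choice to weight each vote by flow magnitude are all hypothetical, and the code handles a single local cell rather than the full layout shown in the figure.

```python
import math

def orientation_histogram(flows, n_bins=8):
    """Histogram of flow directions for one local cell.

    flows: iterable of (dx, dy) optical-flow vectors.
    n_bins and the magnitude weighting are assumptions, not values
    taken from the slides.
    """
    hist = [0.0] * n_bins
    bin_width = 2 * math.pi / n_bins
    for dx, dy in flows:
        magnitude = math.hypot(dx, dy)
        if magnitude == 0.0:
            continue  # a zero vector has no direction to vote for
        angle = math.atan2(dy, dx) % (2 * math.pi)  # map to [0, 2*pi)
        hist[int(angle / bin_width) % n_bins] += magnitude
    return hist
```

Concatenating such histograms over all cells of the region of interest would give one fixed-length vector per frame, which is the kind of input a boosted classifier expects.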

Organizing optical flow (Weighted Averaging)

Formation of motion descriptor

Learning with AdaBoost
[Equation labels: Strong classifier; Weak classifier; Weight; Feature vector]
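The four labels on this slide match the usual form of the AdaBoost strong classifier. The equation itself was not preserved, so the version below is the standard textbook form rather than a copy of the slide. The symbols are assumptions: $x$ is the feature vector, $h_m$ the $m$-th weak classifier, $\alpha_m$ its weight, and $M$ the number of boosting rounds. The index $m$ is used so that $t$ stays free for indexing data points.

```latex
H(x) = \operatorname{sign}\left( \sum_{m=1}^{M} \alpha_m \, h_m(x) \right)
```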

Classification Example (taken from Antonio Torralba @MIT)
Weak learners come from the family of lines. A line h with p(error) = 0.5 is at chance.
Each data point has a class label $y_t \in \{+1, -1\}$ and a weight $w_t = 1$.

Classification Example
This one seems to be the best. This is a "weak classifier": it performs slightly better than chance.
Each data point has a class label $y_t \in \{+1, -1\}$ and a weight $w_t = 1$.

Classification Example
We set a new problem for which the previous weak classifier performs at chance again.
Each data point has a class label $y_t \in \{+1, -1\}$.
We update the weights: $w_t \leftarrow w_t \exp\{-y_t H_t\}$

Classification Example
The strong (non-linear) classifier is built as the combination of all the weak (linear) classifiers.
[Figure labels: f_1, f_2, f_3, f_4]
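The boosting loop walked through in this example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation used in the thesis: it uses discrete AdaBoost with axis-aligned threshold stumps instead of arbitrary lines, normalises the weights after each round, and the names `train_adaboost` and `predict` are hypothetical.

```python
import math

def train_adaboost(points, labels, n_rounds=10):
    """Discrete AdaBoost with axis-aligned threshold stumps (an assumption;
    the slides use weak learners from the family of lines).

    points: list of feature tuples; labels: list of +1 / -1.
    Returns a list of (alpha, feature, threshold, polarity) weak classifiers.
    """
    n = len(points)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        best = None
        # Exhaustively pick the stump with the lowest weighted error.
        for feature in range(len(points[0])):
            for threshold in sorted({p[feature] for p in points}):
                for polarity in (1, -1):
                    preds = [polarity if p[feature] >= threshold else -polarity
                             for p in points]
                    err = sum(w for w, y, h in zip(weights, labels, preds) if h != y)
                    if best is None or err < best[0]:
                        best = (err, feature, threshold, polarity, preds)
        err, feature, threshold, polarity, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, feature, threshold, polarity))
        # Reweight: misclassified points gain weight, so the stump just
        # chosen is back at chance on the reweighted problem.
        weights = [w * math.exp(-alpha * y * h)
                   for w, y, h in zip(weights, labels, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, point):
    """Sign of the weighted vote of all weak classifiers."""
    score = sum(alpha * (polarity if point[feature] >= threshold else -polarity)
                for alpha, feature, threshold, polarity in ensemble)
    return 1 if score >= 0 else -1
```

On a toy one-dimensional problem where the positive class is an interval, no single stump separates the classes, but three rounds are enough: `train_adaboost([(0,), (1,), (2,), (3,), (4,), (5,)], [-1, -1, 1, 1, -1, -1], n_rounds=3)` yields an ensemble that labels all six training points correctly.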

Our Dataset
ACTION        SUBJECTS  VIDEOS
Walking       8         34
Running       8         20
Flying        5         25
Waving        5         25
Pick up       6         24
Stand up      6         48
Sitting down  6         24

Our Dataset (Tennis actions)
ACTION    SUBJECTS  VIDEOS
Forehand  3         11
Backhand  3         10
Service   2         9

Training and Testing Dataset
ACTION        TRAINING  TESTING
Walking       1184      1710
Running       183       335
Flying        182       373
Waving        198       317
Pick up       111       160
Stand up      128       187
Sitting down  230       282

Classification result (framewise)
Columns: Walking, Running, Flying, Waving, Pick up, Sit down, Stand up, Error
Each row lists its non-empty cells from left to right, then the error rate.
Walking   1644 46 0 17 1 2    3.86%
Running   35 295 3 2          11.94%
Flying    1 2 349 11 9 1      6.43%
Waving    11 8 269 29         15.14%
Pick up   8 7 1 120 23 1      25%
Sit down  1 1 26 179          14.97%
Stand up  23 282              8.15%
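The error column can be checked against the testing counts in the Training and Testing Dataset table. The sketch below does this for the first five actions, assuming that the largest entry in each row is the number of correctly classified frames and that the TESTING column gives the row total. The helper name `error_rate` is hypothetical.

```python
# Correct (diagonal) counts from the framewise matrix and per-action
# testing totals from the Training and Testing Dataset table.
correct = {"Walking": 1644, "Running": 295, "Flying": 349,
           "Waving": 269, "Pick up": 120}
total = {"Walking": 1710, "Running": 335, "Flying": 373,
         "Waving": 317, "Pick up": 160}

def error_rate(n_correct, n_total):
    """Percentage of test frames assigned to the wrong class."""
    return 100.0 * (n_total - n_correct) / n_total

errors = {action: round(error_rate(correct[action], total[action]), 2)
          for action in correct}
```

For these five actions the computed values agree with the reported error column (3.86, 11.94, 6.43, 15.14 and 25.0 percent).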

Classification results (clipwise)
Columns: Walking, Running, Waving1, Waving2, Bending, Sit-down, Stand-up, Error
Each row lists its non-empty cells from left to right, then the error rate.
Walking   10     0.0%
Running   10     0.0%
Waving1   9 1    10.0%
Waving2   10     0.0%
Bending   9 1    10.0%
Sit-down  10     0.0%
Stand-up  1 9    10.0%

Classification results (Tennis events)
Columns: Forehand, Backhand, Service, Error
Each row lists its non-empty cells from left to right, then the error rate.
Forehand  54 7 11  21.95%
Backhand  11 53    10.75%
Service   8 49     14.04%

Event Detection
[Figure: current frame f, with the previous n frames (f-1, f-2, f-3, f-4, ..., f-n) and the next n frames (f+1, f+2, f+3, f+4, ..., f+n)]

Event Detection
[Figure panels: Without using prediction logic; With prediction logic]
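One simple way to use the previous n and next n frames is a majority vote over the window around each frame. The slides do not preserve how the prediction logic actually works, so the sketch below is only an assumed stand-in: the function name `smooth_labels`, the default window size, and the voting rule are all hypothetical.

```python
from collections import Counter

def smooth_labels(frame_labels, n=2):
    """Relabel each frame f by majority vote over frames f-n .. f+n.

    Majority voting is an assumption; the original prediction logic
    is not described in the surviving slide text.
    """
    smoothed = []
    for f in range(len(frame_labels)):
        # The window is clipped at the start and end of the clip.
        window = frame_labels[max(0, f - n): f + n + 1]
        # On a tie, the label seen first in the window wins.
        smoothed.append(Counter(window).most_common(1)[0][0])
    return smoothed
```

With this rule an isolated misclassified frame inside a run of consistent labels is overwritten by its neighbours, while a genuine change of action that lasts longer than the window is kept.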

Weizmann Dataset
ACTION  SUBJECTS  VIDEOS
Bend    9         9
Jack    9         9
Jump    9         9
Pjump   9         9
Run     9         10
Side    9         9
Skip    9         10
Walk    9         10
Wave1   9         9
Wave2   9         9

Standard Dataset (Weizmann Dataset)
[Figure panels: Walk, Side, Skip, Wave1, Wave2, Bend, Run, Jack, Jump, Pjump]

Confusion matrix (framewise)
Columns: Bend, Jack, Jump, Pjump, Run, Side, Skip, Walk, Wave1, Wave2
Each row lists its non-empty cells from left to right.
Bend   271 1 1 20 3 30 11
Jack   18 368 8 48 3 2 3 9 16
Jump   9 3 157 8 2 26 19 7
Pjump  36 26 237 22 6
Run    4 2 5 158 3 50 6 1 2
Side   11 9 77 1 1 84 3 58 2 1
Skip   3 9 76 43 5 109 24 1 7
Walk   2 5 16 2 13 5 395
Wave1  47 2 12 238 27
Wave2  30 6 1 4 1 55 269

Weizmann dataset

Use of MV + Shape Info (SI)

Use of MV + Differential SI

Confusion matrix (framewise)
Columns: Bend, Jack, Jump, Pjump, Run, Side, Skip, Walk, Wave1, Wave2
Each row lists its non-empty cells from left to right.
Bend   326 7 2 2
Jack   6 418 39 1 3 8
Jump   18 1 189 1 5 4 13
Pjump  11 55 243 6 1 11
Run    2 2 173 2 45 7
Side   8 30 11 1 152 12 33
Skip   1 20 32 83 4 121 13 1 2
Walk   1 1 2 1 1 432
Wave1  43 1 10 10 232 30
Wave2  13 25 328

Spatio-temporal features
[Figure labels: TSPAN, TLEN]

Spatio-temporal descriptor

Event classification (clipwise)
Columns: Bend, Jack, Jump, Pjump, Run, Side, Skip, Walk, Wave1, Wave2, Error
Each row lists its non-empty cells from left to right, then the error rate.
Bend   9     0.0%
Jack   9     0.0%
Jump   9     0.0%
Pjump  9     0.0%
Run    9 1   10.0%
Side   9     0.0%
Skip   10    0.0%
Walk   10    0.0%
Wave1  8 1   11.1%
Wave2  9     0.0%

Action recognition in cluttered background

Cluttered environment

Training

Template length

Single template formation
[Figure: numbered sequences 1 to 15; 1 to 15; 1 3 4 5 6 8 9 10 11 13 14 15; 1 to 12]

Optical flow and Adaboost

Testing
[Figure labels: Height, Width, Length]

Testing
[Figure labels: Height, Width, Length]
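The Height, Width and Length labels suggest that a template-sized window is moved through the video volume at test time. The slide text that would confirm this was not preserved, so the following is only a sketch under that assumption. The function name `sliding_windows`, the (length, height, width) ordering, and the stride parameter are hypothetical.

```python
def sliding_windows(video_shape, template_shape, stride=(1, 1, 1)):
    """Yield the (t, y, x) origin of every template-sized window in a video.

    video_shape and template_shape are (length, height, width), i.e.
    frames, rows and columns. Exhaustive scanning is an assumption about
    how the test-time search is organised.
    """
    (T, H, W), (tl, th, tw) = video_shape, template_shape
    st, sy, sx = stride
    for t in range(0, T - tl + 1, st):
        for y in range(0, H - th + 1, sy):
            for x in range(0, W - tw + 1, sx):
                yield t, y, x
```

Each origin identifies one candidate volume; scoring every candidate with the trained classifier would give one confidence value per position in the video.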

Confidence matrix

Key References
