Human Action Recognition in Videos Employing 2DPCA on 2DHOOF and Radon Transform
Presented in Partial Fulfillment of the Requirements
of the Degree of Master of Science in the School
of Communication and Information Technology

Fadwa Fawzy Fouad
Supervisor: Dr. Moataz M. Abdelwahab
Agenda

• Introduction
• Quick overview
• 2DHOOF/2DPCA Contour Based Optical Flow Algorithm
• Human Gesture Recognition Employing Radon Transform/2DPCA
Introduction

• Importance & Applications
• Action vs. Activity
• Challenges & characteristics of the domain
Importance & Applications

Human action/activity recognition is one of the most promising applications of computer vision. Interest in this topic is motivated by the promise of many applications, including:
• character animation for games and movies
• advanced intelligent user interfaces
• biomechanical analysis of actions for sports and medicine
• automatic surveillance
Action vs. Activity

Action
• Single person
• Short time duration
• Simple motion pattern

Activity
• Complex sequence of actions
• Single/multiple person(s)
• Long time duration
Challenges and characteristics of the domain

The difficulty of the recognition process is associated with multiple sources of variation:
• Inter- and intra-class variations
• Environmental variations and capturing conditions
• Temporal variations
• Intra-class variations (variations within a single class)
The variations in the performance of a certain action due to anthropometric differences between individuals. For example, running movements can differ in speed and stride length.

• Inter-class variations (variations between different classes)
Overlap between different action classes due to the similarity in how the actions are performed.
• Environmental variations
Distractions originating from the actor's surroundings, including dynamic or cluttered environments, illumination variation, and body occlusion.

• Capturing conditions
Depend on the method used to capture the scene, whether single/multiple static/dynamic camera(s) systems.

• Temporal variations
Include the changes in performance rate from one person to another, as well as changes in the recording rate (frames/sec).
Agenda

• Introduction
• Quick overview
• 2DHOOF/2DPCA Contour Based Optical Flow Algorithm
• Human Gesture Recognition Employing Radon Transform/2DPCA
Overview

The main structure of the action recognition system

The structure of the action recognition system is typically hierarchical:
Capture the input video → Human detection & segmentation → Extraction of the action descriptors → Action classification
Capture the input video

For a single camera, the scene is captured from only one viewpoint, so it can't provide enough information about the performed action in the case of a poor viewpoint. Besides, it can't handle the occlusion problem. (Example single-camera views: Video 1–Video 4.)
Multi-camera systems can capture the same scene from different poses, so they provide sufficient information that can alleviate the occlusion problem. (Example multi-camera views: Camera 0–Camera 3.)
The Kinect depth camera can also be utilized to capture the performed actions. The device has an RGB camera, a depth sensor, and a multi-array microphone. It provides full-body 3D motion capture, facial recognition, and voice recognition capabilities. Furthermore, the depth information can be used for segmentation. (Figure: RGB and depth information from the Kinect depth camera.)
Human detection & segmentation

This is the first step of the full process of human sequence evaluation. Techniques can be divided into:
• Background Subtraction techniques
• Motion Based techniques
• Appearance Based techniques
• Depth Based Segmentation
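As a minimal illustration of the first family only (not the segmentation methods used in this work, which are described later), the sketch below extracts a moving blob with OpenCV's MOG2 background subtractor; the file name and parameters are hypothetical:

```python
import cv2
import numpy as np

cap = cv2.VideoCapture("action.avi")   # hypothetical input video
subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=False)
kernel = np.ones((3, 3), np.uint8)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)                         # 255 = moving pixel
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)  # remove speckle noise
cap.release()
```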
Extraction of the action descriptors

Input videos consist of massive amounts of information in the form of spatio-temporal pixel intensity variations, but most of this information is not directly relevant to the task of understanding and identifying the activity occurring in the video.
In this work we used non-parametric approaches, in which a set of features is extracted per video frame; these features are then accumulated and matched against stored templates.
Example: Motion Energy Image (MEI) & Motion History Image (MHI)
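A minimal NumPy sketch of these two templates, assuming simple frame differencing as the per-frame motion mask (the threshold and history length are illustrative choices, not this work's exact settings):

```python
import numpy as np

def mei_mhi(frames, thresh=30):
    """Motion Energy Image and Motion History Image from grayscale frames."""
    tau = len(frames) - 1                 # history length (illustrative)
    mhi = np.zeros(frames[0].shape, np.float32)
    for t in range(1, len(frames)):
        # binary motion mask from frame differencing (an assumption)
        diff = np.abs(frames[t].astype(np.int16) - frames[t - 1].astype(np.int16))
        moving = diff > thresh
        mhi = np.where(moving, tau, np.maximum(mhi - 1, 0))  # refresh or decay
    mei = (mhi > 0).astype(np.uint8)      # MEI: where any motion occurred
    return mei, mhi / tau                 # MHI normalized to [0, 1]
```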
Action classification

Once the extracted features are available for an input video, human action recognition becomes a classification problem.

Dimensionality reduction is a common step before the actual classification and is discussed first.

Dimensionality reduction
Image representations are often high-dimensional, which makes the matching task computationally more expensive. The representation might also contain noisy features. This problem triggered the idea of obtaining a more compact, robust feature representation by reducing the image representation to a lower-dimensional space.
Example: One/Two-Dimensional Principal Component Analysis (1DPCA/2DPCA)
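A minimal sketch of 2DPCA: the dominant eigenvectors are computed from an image covariance matrix built directly from the 2D feature matrices, and each matrix is projected onto them (the number of kept vectors k is illustrative):

```python
import numpy as np

def fit_2dpca(matrices, k):
    """Return the W x k dominant eigenvectors of the image covariance matrix."""
    A = np.stack(matrices).astype(np.float64)     # N feature matrices, H x W each
    mean = A.mean(axis=0)
    G = sum((X - mean).T @ (X - mean) for X in A) / len(A)  # W x W covariance
    vals, vecs = np.linalg.eigh(G)                # eigenvalues in ascending order
    return vecs[:, ::-1][:, :k]                   # top-k eigenvectors

# Projection: an H x W matrix becomes a compact H x k feature matrix.
# features = matrix @ V
```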
Nearest neighbor classification
k-Nearest neighbor (k-NN) classifiers use the distance between the features of an observed sequence and those in a training set. The most common label among the k closest training sequences is chosen as the classification.
NN classification can be performed either at the frame level or for whole video sequences. In the latter case, issues with different frame lengths need to be resolved.
In our work we used 1-NN with Euclidean distance to classify the tested actions.
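A minimal sketch of this 1-NN decision, using the Frobenius (elementwise Euclidean) norm between feature matrices:

```python
import numpy as np

def classify_1nn(test_feat, train_feats, train_labels):
    """Return the label of the training feature closest to test_feat."""
    dists = [np.linalg.norm(test_feat - f) for f in train_feats]  # Euclidean
    return train_labels[int(np.argmin(dists))]
```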
Agenda

• Introduction
• Quick overview
• 2DHOOF/2DPCA Contour Based Optical Flow Algorithm
• Human Gesture Recognition Employing Radon Transform/2DPCA
2DHOOF/2DPCA Contour Based Optical Flow Algorithm

• Dense vs. Sparse OF
• Alignment issues with OF
• The Calculation of the 2D Histogram of Optical Flow (2DHOOF)
• Overall System Description
• Experimental Results
Dense vs. Sparse OF

In practice, dense OF is not the best choice for obtaining the OF. Besides its high computational complexity, it is not accurate for homogeneous moving objects (the aperture problem).
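A minimal sketch of the alternative, sparse OF computed only at contour points with pyramidal Lucas-Kanade (the contour points come from the segmentation step described later):

```python
import cv2
import numpy as np

def contour_sparse_of(prev_gray, next_gray, contour_pts):
    """Track contour points between two frames; return points and flow."""
    p0 = contour_pts.reshape(-1, 1, 2).astype(np.float32)
    p1, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, p0, None)
    good = status.ravel() == 1                   # keep successfully tracked points
    flow = (p1 - p0).reshape(-1, 2)[good]        # (dx, dy) per tracked point
    return p0.reshape(-1, 2)[good], flow
```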
Alignment issues with OF

We had two choices for the best order of actor alignment:
• Align the actor, then calculate the OF
• Calculate the OF, then align it

(Figure: jumping and transition effects in the Running action.)
(Figures: calculating the OF then aligning it vs. aligning the actor then calculating the OF.)
The Calculation of the 2D Histogram of Optical Flow (2DHOOF)

The calculated OF is accumulated into histogram layers of size W/m x H/m x n.
(Figures: an example of obtaining the n-layer 2DHOOF for any two successive frames, and the accumulated 2DHOOF that represents the whole video.)
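A minimal sketch of one plausible reading of this construction, assuming an (H/m x W/m) spatial grid with n orientation layers and magnitude-weighted votes (the grid size, layer count, and weighting are assumptions):

```python
import numpy as np

def hoof_2d(points, flow, frame_shape, m=8, n=4):
    """Accumulate sparse flow vectors into an (H/m x W/m x n) histogram."""
    H, W = frame_shape
    hist = np.zeros((H // m, W // m, n))
    ang = (np.arctan2(flow[:, 1], flow[:, 0]) + np.pi) / (2 * np.pi)  # in [0, 1]
    layer = np.minimum((ang * n).astype(int), n - 1)   # quantized orientation
    mag = np.linalg.norm(flow, axis=1)
    for (x, y), l, w in zip(points.astype(int), layer, mag):
        r, c = min(y // m, H // m - 1), min(x // m, W // m - 1)
        hist[r, c, l] += w            # magnitude-weighted vote (an assumption)
    return hist

# Summing the per-frame histograms over all frames gives the accumulated
# 2DHOOF descriptor for the whole video.
```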
1DHOOF vs. 2DHOOF

(Figure: confusion between the Wave and Bend actions when using 1DHOOF.)
Overall System Description

Training Mode: Segmentation & Contour Extraction → Sparse OF → 2DHOOF → 2DPCA (extract the dominant vectors) → Store extracted features

Testing Mode: Segmentation & Contour Extraction → Sparse OF → 2DHOOF → Projection on the dominant vectors → Classification and Voting Scheme
Training Mode

Segmentation & Contour Extraction → Sparse OF → 2DHOOF → 2DPCA (extract the dominant vectors) → Store extracted features
Segmentation & Contour Extraction (Method 1)
• Geodesic segmentation, where
xi : stroke pixels (black)
x : other pixels (white)
I : image intensity

Pipeline: Input Video Frame → Face Detection → Initial Stroke → Blob Extraction → Final Contour
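The geodesic formulation itself is not reproduced here; as a rough sketch under the definitions above, the geodesic distance from the stroke pixels can be computed with a Dijkstra-style pass whose step cost is the intensity difference, and the blob obtained by thresholding that distance (the cost and threshold are illustrative):

```python
import heapq
import numpy as np

def geodesic_distance(I, seeds):
    """Geodesic distance from seed (stroke) pixels over image intensity I."""
    h, w = I.shape
    dist = np.full((h, w), np.inf)
    pq = []
    for y, x in seeds:                        # stroke pixels xi
        dist[y, x] = 0.0
        heapq.heappush(pq, (0.0, y, x))
    while pq:
        d, y, x = heapq.heappop(pq)
        if d > dist[y, x]:
            continue                          # stale queue entry
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                nd = d + abs(float(I[ny, nx]) - float(I[y, x]))  # intensity cost
                if nd < dist[ny, nx]:
                    dist[ny, nx] = nd
                    heapq.heappush(pq, (nd, ny, nx))
    return dist  # thresholding this map yields the actor's blob
```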
Segmentation & Contour Extraction (Method 2)
• Contour extraction from the magnitude of the dense OF
An edge pixel is identified by specific criteria based on its (3 x 3) neighboring pixels.
(Figures: applying the edge criteria on the magnitude of the dense OF; steps of contour extraction from dense OF.)
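The exact 3 x 3 edge criteria are not given here; the sketch below substitutes a standard edge detector applied to the dense-flow magnitude to illustrate the idea (the Farneback parameters and Canny thresholds are illustrative choices):

```python
import cv2
import numpy as np

def contour_from_dense_of(prev_gray, next_gray):
    """Edge map of the dense optical-flow magnitude between two frames."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag = np.linalg.norm(flow, axis=2)                     # per-pixel magnitude
    mag8 = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    return cv2.Canny(mag8, 50, 150)   # stand-in for the slide's 3x3 criteria
```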
Training Mode

Segmentation & Contour Extraction → Sparse OF → 2DHOOF → 2DPCA (extract the dominant vectors) → Store extracted features
2DHOOF-2DPCA Features Extraction

(Figure: the 2DHOOF of the training videos is projected to obtain the final features.)
Training Mode

Segmentation & Contour Extraction → Sparse OF → 2DHOOF → 2DPCA (extract the dominant vectors) → Store extracted features
Testing Mode

Segmentation & Contour Extraction → Sparse OF → 2DHOOF → Projection on the dominant vectors → Classification and Voting Scheme
Projection on the dominant vectors

Classification

The final decision is based on the minimum distance value D.
Experimental Results

Two experiments were conducted to evaluate the performance of the proposed algorithm.
• For the first experiment, the Weizmann dataset was used to measure the performance of low-resolution single-camera operation.
• For the second experiment, the IXMAS multi-view dataset was used to evaluate the performance of the parallel camera structure.
Both experiments were conducted using the Leave-One-Actor-Out (LOAO) technique to be consistent with the most recent algorithms.
Both datasets provide RGB frames and the actors' silhouettes.
Weizmann dataset

The Weizmann dataset consists of 90 low-resolution video sequences showing 9 different actors, each performing 10 natural actions: walk, run, jump forward, gallop sideways, bend, wave with one hand (wave1), wave with two hands (wave2), jump in place (Pjump), jump-jack, and skip.
(Sample frames: Bend, Run, Jump, Jump-jack, Gallop.)
The confusion matrix for this experiment (2DHOOF/2DPCA) shows that the average recognition accuracy is 97.78%, and eight actions were recognized with 100% accuracy.

On the other hand, using 1DHOOF with 1DPCA decreases the accuracy to 63.34% because of the large confusion between actions (as discussed before).
Comparison with the most recent algorithms:

• Recognition Accuracy

Method                 | Accuracy
Previous Contribution  | 98.89%
Our Algorithm          | 97.79%
Shah et al.            | 95.57%
Yang et al.            | 92.8%
Yuan et al.            | 92.22%

• Average Testing Time

Method                 | Average Runtime
Our Algorithm          | 66.11 msec
Previous Contribution  | 113.00 msec
Shah et al.            | 18.65 sec
Blank et al.           | 30 sec
Samples from the calculated contour OF: Walk, Skip, P-jump
IXMAS Dataset

The proposed parallel structure algorithm was applied to the IXMAS multi-view dataset. Each camera is considered an independent system, and a voting scheme is then carried out between the four cameras to obtain the final decision.
This dataset consists of 5 cameras capturing the scene and 12 actors, each performing 13 natural actions 3 times, with the actors free to change their orientation in each scenario.
The actions: check watch, cross arms, scratch head, sit down, get up, turn around, walk, wave, punch, kick, and pick up and throw.
(Figure: Camera 0–Camera 3 each feed our algorithm independently; the voting scheme over the four per-camera decisions produces the final decision.)
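A minimal sketch of such a voting scheme, assuming a simple majority over the four independent per-camera decisions (the tie-breaking rule is not specified here):

```python
from collections import Counter

def camera_vote(per_camera_labels):
    """Final decision = most common label among the cameras' classifications."""
    return Counter(per_camera_labels).most_common(1)[0][0]

# e.g., camera_vote(["kick", "kick", "punch", "kick"]) -> "kick"
```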
Example on the IXMAS multi-camera dataset. Action: Pick up and Throw. (Views: Camera 0–Camera 3.)
The confusion matrix for the IXMAS dataset shows that the average accuracy is 87.12%, where SH = Scratch head, CW = Check watch, CA = Cross arms, SD = Sit down, GU = Get up, TA = Turn around, PU = Pick up.
Method                 | Actors # | Cam(0) % | Cam(1) % | Cam(2) % | Cam(3) % | Overall Vote %
Proposed Algorithm     | 12       | 97.29    | 79.04    | 72.47    | 78.53    | 87.12
Previous Contribution  | 12       | 78.9     | 78.61    | 80.93    | 77.38    | 84.59
Weinland et al.        | 10       | 65.04    | 70.00    | 54.30    | 66.00    | 81.30
Srivastava et al.      | 10       | N/A      | N/A      | N/A      | N/A      | 81.40
Shah et al.            | 12       | 72.00    | 53.00    | 68.00    | 63.00    | 78.00

Comparison with the best reported accuracies shows that we achieved the highest accuracy, with an enhancement of 3%.

(In the original table, bold indicates the best performance; N/A = not available in published reports.)
Samples from the calculated contour OF: Walk, Sit down, Kick
Published Paper

F. Fawzy, M. Abdelwahab, and W. Mikhael, "2DHOOF-2DPCA Contour Based Optical Flow Algorithm for Human Activity Recognition," IEEE International Midwest Symposium on Circuits and Systems (MWSCAS 2013), Ohio, USA.
Agenda

• Introduction
• Quick overview
• 2DHOOF/2DPCA Contour Based Optical Flow Algorithm
• Human Gesture Recognition Employing Radon Transform/2DPCA
Human Gesture Recognition Employing Radon Transform/2DPCA

• Radon Transform (RT)
• Overall system description
Radon Transform

The RT computes projections of an image matrix along specified directions. A projection of a two-dimensional function f(x,y) is a set of line integrals along parallel paths, or beams.
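In the standard form, the projection at angle θ is R(ρ, θ) = ∬ f(x, y) δ(ρ − x cos θ − y sin θ) dx dy. A minimal sketch of computing such projections with scikit-image (the toy image and angle set are illustrative):

```python
import numpy as np
from skimage.transform import radon

image = np.zeros((128, 128))
image[32:96, 48:80] = 1.0            # toy binary "silhouette"
theta = np.arange(0, 180)            # projection angles in degrees
sinogram = radon(image, theta=theta) # one line-integral projection per angle
```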
Overall system description

The proposed system is designed and tested for gesture recognition and can be extended to regular action recognition.
The algorithm has two modes:
• Training Mode
• Testing Mode
Both have a pre-processing step before feature extraction.
Training Mode

Pre-processing Step:
1) Input videos
The One Shot Learning ChaLearn Gesture Dataset was used for this experiment. It captures a single user facing a fixed Kinect™ camera and interacting with a computer by performing gestures.
Videos are represented by RGB and depth images.
Each actor has from 8 to 15 different gestures (the vocabulary) for training, and 47 input videos, each containing from 1 to 5 gestures, for testing.
We applied our algorithm to a subset of this dataset consisting of 37 different actors.
The dataset can be divided into two main groups: standing actors and sitting actors. In this experiment we used a subset of the standing actor group, in which actors use their whole body to perform the gesture and make significant motion that can be captured by the MEI and MHI.
(Sample frames: standing actors vs. sitting actors.)
Also, we used only the depth videos as input. Depth information makes the segmentation task easier than RGB or grayscale videos, especially when the actor's clothes have the same color as the background, or the background is textured.
Training Mode

Pre-processing Step:
2) Segmentation & Blob extraction
We used the Basic Global Thresholding algorithm to extract the actor's blob from the depth frame.
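A minimal sketch of basic global thresholding (the classic iterative scheme: split the histogram at T, set T to the mean of the two class means, repeat until it stabilizes; the stopping tolerance is illustrative):

```python
import numpy as np

def basic_global_threshold(img, eps=0.5):
    """Iteratively estimate a single global threshold for img."""
    T = img.mean()                                # initial guess
    while True:
        low, high = img[img <= T], img[img > T]
        T_new = 0.5 * (low.mean() + high.mean())  # mean of the two class means
        if abs(T_new - T) < eps:
            return T_new
        T = T_new

# e.g., blob = depth <= basic_global_threshold(depth)
# assuming the actor is the object nearest to the camera
```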
In some cases the resultant blob contains extra objects. This noise results from objects that were at the same depth as the actor. (Cases 1–3.)
In this situation we perform a noise elimination step. (Cases 1–3 after noise elimination.)
Training Mode

Alignment using the RT of the First Frame

• Vertical alignment using the projection on the y-axis (90° from the RT)
• Horizontal alignment using the projection on the x-axis (0° from the RT)
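A minimal sketch of this alignment, using the fact that the 0° and 90° Radon projections of a binary blob are simply its column and row sums; centering on the projections' centroid is an assumed implementation detail:

```python
import numpy as np

def align_blob(mask):
    """Center a binary blob using its axis projections (0°/90° RT)."""
    rows = mask.sum(axis=1)          # 90° projection (onto the y-axis)
    cols = mask.sum(axis=0)          # 0° projection (onto the x-axis)
    cy = int(round((rows * np.arange(len(rows))).sum() / rows.sum()))
    cx = int(round((cols * np.arange(len(cols))).sum() / cols.sum()))
    h, w = mask.shape
    shifted = np.roll(mask, h // 2 - cy, axis=0)   # vertical alignment
    return np.roll(shifted, w // 2 - cx, axis=1)   # horizontal alignment
```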
Training Mode

Calculate the MEI and MHI

(Figure: MEI and MHI computed for the whole body and for the body parts.)
Training Mode

Get the Radon Transform of the MEI and MHI

Basically, the difference between the RT of the whole body and the RT of the body parts is the white portion in the center, representing the projection of the actor's body.
Testing Mode

Video Chopping

As mentioned, the testing videos may contain from 1 to 5 different gestures per video, so we need to separate them into one gesture per video before testing. We do that in two main steps:
1. Calculate the plot that represents the moving area per frame.
2. Apply the local minima criteria on this plot.
We are searching for a frame i that satisfies the following conditions (see the sketch below):
a) The number of frames before this frame i is greater than or equal to the Frame Threshold.
b) The amount of decrease in the area at frame i is greater than 50% of the Peak value.
c) The area at frames i-1 and i+1 is greater than the area at frame i, to ensure that i is a local minimum between two peaks.
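A minimal sketch of this chopping rule, assuming the "decrease in area" is measured from the curve's peak and that the frame threshold counts frames since the last cut (both assumptions):

```python
import numpy as np

def chop_points(area, frame_thresh=10, drop_ratio=0.5):
    """Cut frames between gestures: deep local minima of the moving-area curve."""
    cuts, last = [], 0
    peak = float(area.max())
    for i in range(1, len(area) - 1):
        enough_frames = i - last >= frame_thresh            # condition (a)
        deep = (peak - area[i]) > drop_ratio * peak         # condition (b)
        local_min = area[i - 1] > area[i] < area[i + 1]     # condition (c)
        if enough_frames and deep and local_min:
            cuts.append(i)
            last = i
    return cuts   # split the video at these frame indices
```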
(Figures: examples of good and bad chopping results.)
Experimental Results

We conducted four One Shot Learning (OSL) experiments:
• Experiments I and II use the Radon Transform of the MEI/MHI, with 2DPCA and with direct correlation, respectively.
• Experiments III and IV use the MEI/MHI directly, with 2DPCA and with direct correlation, respectively.
Recognition accuracy (%) of the four experiments:

Features | Experiment | Whole Body MEI | Whole Body MHI | Body Parts MEI | Body Parts MHI
RT       | I          | 71             | 69             | 82             | 81.5
RT       | II         | 70             | 70             | 81.7           | 81.6
MEI/MHI  | III        | 70             | 68             | 82             | 81.7
MEI/MHI  | IV         | 71.24          | 68.7           | 83.33          | 82.9

(The Body Parts templates give the better accuracy.)
Comparison between using RT and using MEI/MHI directly without RT:

Features | % Maintained Energy (2DPCA) | Storage Requirements
RT       | 99%                         | 72 Mbytes
MEI/MHI  | 88%                         | 102 Mbytes
Human Action Recognition in Videos Employing 2DPCA on 2DHOOF and Radon Transform

Más contenido relacionado

La actualidad más candente

ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...
ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...
ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...cscpconf
 
Flow Trajectory Approach for Human Action Recognition
Flow Trajectory Approach for Human Action RecognitionFlow Trajectory Approach for Human Action Recognition
Flow Trajectory Approach for Human Action RecognitionIRJET Journal
 
motion and feature based person tracking in survillance videos
motion and feature based person tracking in survillance videosmotion and feature based person tracking in survillance videos
motion and feature based person tracking in survillance videosshiva kumar cheruku
 
Moving object detection using background subtraction algorithm using simulink
Moving object detection using background subtraction algorithm using simulinkMoving object detection using background subtraction algorithm using simulink
Moving object detection using background subtraction algorithm using simulinkeSAT Publishing House
 
Talk 2011-buet-perception-event
Talk 2011-buet-perception-eventTalk 2011-buet-perception-event
Talk 2011-buet-perception-eventMahfuzul Haque
 
Camera calibration technique
Camera calibration techniqueCamera calibration technique
Camera calibration techniqueKrzysztof Wegner
 
Canny Edge Detection Algorithm on FPGA
Canny Edge Detection Algorithm on FPGA Canny Edge Detection Algorithm on FPGA
Canny Edge Detection Algorithm on FPGA IOSR Journals
 
【ITSC2015】Fine-grained Walking Activity Recognition via Driving Recorder Dataset
【ITSC2015】Fine-grained Walking Activity Recognition via Driving Recorder Dataset【ITSC2015】Fine-grained Walking Activity Recognition via Driving Recorder Dataset
【ITSC2015】Fine-grained Walking Activity Recognition via Driving Recorder DatasetHirokatsu Kataoka
 
Comparison of Some Motion Detection Methods in cases of Single and Multiple M...
Comparison of Some Motion Detection Methods in cases of Single and Multiple M...Comparison of Some Motion Detection Methods in cases of Single and Multiple M...
Comparison of Some Motion Detection Methods in cases of Single and Multiple M...CSCJournals
 
Imu fusion algorithm for pose estimation (mCube invited talk) 2018 1003-1
Imu fusion algorithm for pose estimation (mCube invited talk) 2018 1003-1Imu fusion algorithm for pose estimation (mCube invited talk) 2018 1003-1
Imu fusion algorithm for pose estimation (mCube invited talk) 2018 1003-1James D.B. Wang, PhD
 
[G4]image deblurring, seeing the invisible
[G4]image deblurring, seeing the invisible[G4]image deblurring, seeing the invisible
[G4]image deblurring, seeing the invisibleNAVER D2
 
Survey on Image Integration of Misaligned Images
Survey on Image Integration of Misaligned ImagesSurvey on Image Integration of Misaligned Images
Survey on Image Integration of Misaligned ImagesIRJET Journal
 
Overview Of Video Object Tracking System
Overview Of Video Object Tracking SystemOverview Of Video Object Tracking System
Overview Of Video Object Tracking SystemEditor IJMTER
 
Survey on optical flow estimation with DL
Survey on optical flow estimation with DLSurvey on optical flow estimation with DL
Survey on optical flow estimation with DLLeapMind Inc
 
Automatic identification of animal using visual and motion saliency
Automatic identification of animal using visual and motion saliencyAutomatic identification of animal using visual and motion saliency
Automatic identification of animal using visual and motion saliencyeSAT Publishing House
 

La actualidad más candente (20)

Moving object detection
Moving object detectionMoving object detection
Moving object detection
 
ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...
ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...
ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...
 
Flow Trajectory Approach for Human Action Recognition
Flow Trajectory Approach for Human Action RecognitionFlow Trajectory Approach for Human Action Recognition
Flow Trajectory Approach for Human Action Recognition
 
Background subtraction
Background subtractionBackground subtraction
Background subtraction
 
motion and feature based person tracking in survillance videos
motion and feature based person tracking in survillance videosmotion and feature based person tracking in survillance videos
motion and feature based person tracking in survillance videos
 
Moving object detection using background subtraction algorithm using simulink
Moving object detection using background subtraction algorithm using simulinkMoving object detection using background subtraction algorithm using simulink
Moving object detection using background subtraction algorithm using simulink
 
Talk 2011-buet-perception-event
Talk 2011-buet-perception-eventTalk 2011-buet-perception-event
Talk 2011-buet-perception-event
 
Background Subtraction Algorithm for Moving Object Detection Using Denoising ...
Background Subtraction Algorithm for Moving Object Detection Using Denoising ...Background Subtraction Algorithm for Moving Object Detection Using Denoising ...
Background Subtraction Algorithm for Moving Object Detection Using Denoising ...
 
Camera calibration technique
Camera calibration techniqueCamera calibration technique
Camera calibration technique
 
Background subtraction
Background subtractionBackground subtraction
Background subtraction
 
Canny Edge Detection Algorithm on FPGA
Canny Edge Detection Algorithm on FPGA Canny Edge Detection Algorithm on FPGA
Canny Edge Detection Algorithm on FPGA
 
【ITSC2015】Fine-grained Walking Activity Recognition via Driving Recorder Dataset
【ITSC2015】Fine-grained Walking Activity Recognition via Driving Recorder Dataset【ITSC2015】Fine-grained Walking Activity Recognition via Driving Recorder Dataset
【ITSC2015】Fine-grained Walking Activity Recognition via Driving Recorder Dataset
 
Comparison of Some Motion Detection Methods in cases of Single and Multiple M...
Comparison of Some Motion Detection Methods in cases of Single and Multiple M...Comparison of Some Motion Detection Methods in cases of Single and Multiple M...
Comparison of Some Motion Detection Methods in cases of Single and Multiple M...
 
Imu fusion algorithm for pose estimation (mCube invited talk) 2018 1003-1
Imu fusion algorithm for pose estimation (mCube invited talk) 2018 1003-1Imu fusion algorithm for pose estimation (mCube invited talk) 2018 1003-1
Imu fusion algorithm for pose estimation (mCube invited talk) 2018 1003-1
 
[G4]image deblurring, seeing the invisible
[G4]image deblurring, seeing the invisible[G4]image deblurring, seeing the invisible
[G4]image deblurring, seeing the invisible
 
Survey on Image Integration of Misaligned Images
Survey on Image Integration of Misaligned ImagesSurvey on Image Integration of Misaligned Images
Survey on Image Integration of Misaligned Images
 
Overview Of Video Object Tracking System
Overview Of Video Object Tracking SystemOverview Of Video Object Tracking System
Overview Of Video Object Tracking System
 
Background Subtraction Based on Phase and Distance Transform Under Sudden Ill...
Background Subtraction Based on Phase and Distance Transform Under Sudden Ill...Background Subtraction Based on Phase and Distance Transform Under Sudden Ill...
Background Subtraction Based on Phase and Distance Transform Under Sudden Ill...
 
Survey on optical flow estimation with DL
Survey on optical flow estimation with DLSurvey on optical flow estimation with DL
Survey on optical flow estimation with DL
 
Automatic identification of animal using visual and motion saliency
Automatic identification of animal using visual and motion saliencyAutomatic identification of animal using visual and motion saliency
Automatic identification of animal using visual and motion saliency
 

Destacado

Palm leaf character recognition using radon transform
Palm leaf character recognition using radon transformPalm leaf character recognition using radon transform
Palm leaf character recognition using radon transformವಿ ಸುಲೇಖಾ
 
Semantic human activity detection in videos
Semantic human activity detection in videosSemantic human activity detection in videos
Semantic human activity detection in videosHirantha Pradeep
 
Rule-based Real-Time Activity Recognition in a Smart Home Environment
Rule-based Real-Time Activity Recognition in a Smart Home EnvironmentRule-based Real-Time Activity Recognition in a Smart Home Environment
Rule-based Real-Time Activity Recognition in a Smart Home EnvironmentGeorge Baryannis
 
Introduction to Action Recognition in Python by Bertrand Nouvel, Jonathan Kel...
Introduction to Action Recognition in Python by Bertrand Nouvel, Jonathan Kel...Introduction to Action Recognition in Python by Bertrand Nouvel, Jonathan Kel...
Introduction to Action Recognition in Python by Bertrand Nouvel, Jonathan Kel...PyData
 
Radon Transform - image analysis
Radon Transform - image analysisRadon Transform - image analysis
Radon Transform - image analysisVanya Valindria
 
Planning, prioritising and efficiency: a Time Management Workshop
Planning, prioritising and efficiency: a Time Management WorkshopPlanning, prioritising and efficiency: a Time Management Workshop
Planning, prioritising and efficiency: a Time Management WorkshopImprovement Skills Consulting Ltd.
 
Inria - White Paper - Artificial intelligence, current challenges and Inria's...
Inria - White Paper - Artificial intelligence, current challenges and Inria's...Inria - White Paper - Artificial intelligence, current challenges and Inria's...
Inria - White Paper - Artificial intelligence, current challenges and Inria's...Inria
 

Destacado (7)

Palm leaf character recognition using radon transform
Palm leaf character recognition using radon transformPalm leaf character recognition using radon transform
Palm leaf character recognition using radon transform
 
Semantic human activity detection in videos
Semantic human activity detection in videosSemantic human activity detection in videos
Semantic human activity detection in videos
 
Rule-based Real-Time Activity Recognition in a Smart Home Environment
Rule-based Real-Time Activity Recognition in a Smart Home EnvironmentRule-based Real-Time Activity Recognition in a Smart Home Environment
Rule-based Real-Time Activity Recognition in a Smart Home Environment
 
Introduction to Action Recognition in Python by Bertrand Nouvel, Jonathan Kel...
Introduction to Action Recognition in Python by Bertrand Nouvel, Jonathan Kel...Introduction to Action Recognition in Python by Bertrand Nouvel, Jonathan Kel...
Introduction to Action Recognition in Python by Bertrand Nouvel, Jonathan Kel...
 
Radon Transform - image analysis
Radon Transform - image analysisRadon Transform - image analysis
Radon Transform - image analysis
 
Planning, prioritising and efficiency: a Time Management Workshop
Planning, prioritising and efficiency: a Time Management WorkshopPlanning, prioritising and efficiency: a Time Management Workshop
Planning, prioritising and efficiency: a Time Management Workshop
 
Inria - White Paper - Artificial intelligence, current challenges and Inria's...
Inria - White Paper - Artificial intelligence, current challenges and Inria's...Inria - White Paper - Artificial intelligence, current challenges and Inria's...
Inria - White Paper - Artificial intelligence, current challenges and Inria's...
 

Similar a Human Action Recognition in Videos Employing 2DPCA on 2DHOOF and Radon Transform

Action_recognition-topic.pptx
Action_recognition-topic.pptxAction_recognition-topic.pptx
Action_recognition-topic.pptxcomputerscience98
 
Recognition and tracking moving objects using moving camera in complex scenes
Recognition and tracking moving objects using moving camera in complex scenesRecognition and tracking moving objects using moving camera in complex scenes
Recognition and tracking moving objects using moving camera in complex scenesIJCSEA Journal
 
Gait analysis report
Gait analysis reportGait analysis report
Gait analysis reportconoranthony
 
Object Tracking with Instance Matching and Online Learning
Object Tracking with Instance Matching and Online LearningObject Tracking with Instance Matching and Online Learning
Object Tracking with Instance Matching and Online LearningJui-Hsin (Larry) Lai
 
Generating a time shrunk lecture video by event
Generating a time shrunk lecture video by eventGenerating a time shrunk lecture video by event
Generating a time shrunk lecture video by eventYara Ali
 
The International Journal of Engineering and Science (The IJES)
The International Journal of Engineering and Science (The IJES)The International Journal of Engineering and Science (The IJES)
The International Journal of Engineering and Science (The IJES)theijes
 
IRJET- Human Fall Detection using Co-Saliency-Enhanced Deep Recurrent Convolu...
IRJET- Human Fall Detection using Co-Saliency-Enhanced Deep Recurrent Convolu...IRJET- Human Fall Detection using Co-Saliency-Enhanced Deep Recurrent Convolu...
IRJET- Human Fall Detection using Co-Saliency-Enhanced Deep Recurrent Convolu...IRJET Journal
 
Human Motion Detection in Video Surveillance using Computer Vision Technique
Human Motion Detection in Video Surveillance using Computer Vision TechniqueHuman Motion Detection in Video Surveillance using Computer Vision Technique
Human Motion Detection in Video Surveillance using Computer Vision TechniqueIRJET Journal
 
Survey Paper for Different Video Stabilization Techniques
Survey Paper for Different Video Stabilization TechniquesSurvey Paper for Different Video Stabilization Techniques
Survey Paper for Different Video Stabilization TechniquesIRJET Journal
 
Event recognition image & video segmentation
Event recognition image & video segmentationEvent recognition image & video segmentation
Event recognition image & video segmentationeSAT Journals
 
Robust techniques for background subtraction in urban
Robust techniques for background subtraction in urbanRobust techniques for background subtraction in urban
Robust techniques for background subtraction in urbantaylor_1313
 
Video Manifold Feature Extraction Based on ISOMAP
Video Manifold Feature Extraction Based on ISOMAPVideo Manifold Feature Extraction Based on ISOMAP
Video Manifold Feature Extraction Based on ISOMAPinventionjournals
 
Video Classification: Human Action Recognition on HMDB-51 dataset
Video Classification: Human Action Recognition on HMDB-51 datasetVideo Classification: Human Action Recognition on HMDB-51 dataset
Video Classification: Human Action Recognition on HMDB-51 datasetGiorgio Carbone
 
SENSITIVITY OF A VIDEO SURVEILLANCE SYSTEM BASED ON MOTION DETECTION
SENSITIVITY OF A VIDEO SURVEILLANCE SYSTEM BASED ON MOTION DETECTIONSENSITIVITY OF A VIDEO SURVEILLANCE SYSTEM BASED ON MOTION DETECTION
SENSITIVITY OF A VIDEO SURVEILLANCE SYSTEM BASED ON MOTION DETECTIONsipij
 
Review of Pose Recognition Systems
Review of Pose Recognition SystemsReview of Pose Recognition Systems
Review of Pose Recognition Systemsvivatechijri
 
TARGET DETECTION AND CLASSIFICATION PERFORMANCE ENHANCEMENT USING SUPERRESOLU...
TARGET DETECTION AND CLASSIFICATION PERFORMANCE ENHANCEMENT USING SUPERRESOLU...TARGET DETECTION AND CLASSIFICATION PERFORMANCE ENHANCEMENT USING SUPERRESOLU...
TARGET DETECTION AND CLASSIFICATION PERFORMANCE ENHANCEMENT USING SUPERRESOLU...sipij
 
TARGET DETECTION AND CLASSIFICATION PERFORMANCE ENHANCEMENT USING SUPERRESOLU...
TARGET DETECTION AND CLASSIFICATION PERFORMANCE ENHANCEMENT USING SUPERRESOLU...TARGET DETECTION AND CLASSIFICATION PERFORMANCE ENHANCEMENT USING SUPERRESOLU...
TARGET DETECTION AND CLASSIFICATION PERFORMANCE ENHANCEMENT USING SUPERRESOLU...sipij
 
Human Action Recognition Using Deep Learning
Human Action Recognition Using Deep LearningHuman Action Recognition Using Deep Learning
Human Action Recognition Using Deep LearningIRJET Journal
 
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...Francisco (Paco) Florez-Revuelta
 
Video Denoising using Transform Domain Method
Video Denoising using Transform Domain MethodVideo Denoising using Transform Domain Method
Video Denoising using Transform Domain MethodIRJET Journal
 

Similar a Human Action Recognition in Videos Employing 2DPCA on 2DHOOF and Radon Transform (20)

Action_recognition-topic.pptx
Action_recognition-topic.pptxAction_recognition-topic.pptx
Action_recognition-topic.pptx
 
Recognition and tracking moving objects using moving camera in complex scenes
Recognition and tracking moving objects using moving camera in complex scenesRecognition and tracking moving objects using moving camera in complex scenes
Recognition and tracking moving objects using moving camera in complex scenes
 
Gait analysis report
Gait analysis reportGait analysis report
Gait analysis report
 
Object Tracking with Instance Matching and Online Learning
Object Tracking with Instance Matching and Online LearningObject Tracking with Instance Matching and Online Learning
Object Tracking with Instance Matching and Online Learning
 
Generating a time shrunk lecture video by event
Generating a time shrunk lecture video by eventGenerating a time shrunk lecture video by event
Generating a time shrunk lecture video by event
 
The International Journal of Engineering and Science (The IJES)
The International Journal of Engineering and Science (The IJES)The International Journal of Engineering and Science (The IJES)
The International Journal of Engineering and Science (The IJES)
 
IRJET- Human Fall Detection using Co-Saliency-Enhanced Deep Recurrent Convolu...
IRJET- Human Fall Detection using Co-Saliency-Enhanced Deep Recurrent Convolu...IRJET- Human Fall Detection using Co-Saliency-Enhanced Deep Recurrent Convolu...
IRJET- Human Fall Detection using Co-Saliency-Enhanced Deep Recurrent Convolu...
 
Human Motion Detection in Video Surveillance using Computer Vision Technique
Human Motion Detection in Video Surveillance using Computer Vision TechniqueHuman Motion Detection in Video Surveillance using Computer Vision Technique
Human Motion Detection in Video Surveillance using Computer Vision Technique
 
Survey Paper for Different Video Stabilization Techniques
Survey Paper for Different Video Stabilization TechniquesSurvey Paper for Different Video Stabilization Techniques
Survey Paper for Different Video Stabilization Techniques
 
Event recognition image & video segmentation
Event recognition image & video segmentationEvent recognition image & video segmentation
Event recognition image & video segmentation
 
Robust techniques for background subtraction in urban
Robust techniques for background subtraction in urbanRobust techniques for background subtraction in urban
Robust techniques for background subtraction in urban
 
Video Manifold Feature Extraction Based on ISOMAP
Video Manifold Feature Extraction Based on ISOMAPVideo Manifold Feature Extraction Based on ISOMAP
Video Manifold Feature Extraction Based on ISOMAP
 
Video Classification: Human Action Recognition on HMDB-51 dataset
Video Classification: Human Action Recognition on HMDB-51 datasetVideo Classification: Human Action Recognition on HMDB-51 dataset
Video Classification: Human Action Recognition on HMDB-51 dataset
 
SENSITIVITY OF A VIDEO SURVEILLANCE SYSTEM BASED ON MOTION DETECTION
SENSITIVITY OF A VIDEO SURVEILLANCE SYSTEM BASED ON MOTION DETECTIONSENSITIVITY OF A VIDEO SURVEILLANCE SYSTEM BASED ON MOTION DETECTION
SENSITIVITY OF A VIDEO SURVEILLANCE SYSTEM BASED ON MOTION DETECTION
 
Review of Pose Recognition Systems
Review of Pose Recognition SystemsReview of Pose Recognition Systems
Review of Pose Recognition Systems
 
TARGET DETECTION AND CLASSIFICATION PERFORMANCE ENHANCEMENT USING SUPERRESOLU...
TARGET DETECTION AND CLASSIFICATION PERFORMANCE ENHANCEMENT USING SUPERRESOLU...TARGET DETECTION AND CLASSIFICATION PERFORMANCE ENHANCEMENT USING SUPERRESOLU...
TARGET DETECTION AND CLASSIFICATION PERFORMANCE ENHANCEMENT USING SUPERRESOLU...
 
TARGET DETECTION AND CLASSIFICATION PERFORMANCE ENHANCEMENT USING SUPERRESOLU...
TARGET DETECTION AND CLASSIFICATION PERFORMANCE ENHANCEMENT USING SUPERRESOLU...TARGET DETECTION AND CLASSIFICATION PERFORMANCE ENHANCEMENT USING SUPERRESOLU...
TARGET DETECTION AND CLASSIFICATION PERFORMANCE ENHANCEMENT USING SUPERRESOLU...
 
Human Action Recognition Using Deep Learning
Human Action Recognition Using Deep LearningHuman Action Recognition Using Deep Learning
Human Action Recognition Using Deep Learning
 
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...
 
Video Denoising using Transform Domain Method
Video Denoising using Transform Domain MethodVideo Denoising using Transform Domain Method
Video Denoising using Transform Domain Method
 

Último

call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYKayeClaireEstoconing
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxCarlos105
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxCulture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxPoojaSen20
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Seán Kennedy
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management SystemChristalin Nelson
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...Postal Advocate Inc.
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
Science 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptxScience 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptxMaryGraceBautista27
 

Último (20)

call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxCulture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
Science 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptxScience 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptx
 

Human Action Recognition in Videos Employing 2DPCA on 2DHOOF and Radon Transform

  • 1. Presented in Partial Fullment of the Requirements of the Degree of Masters of Science in the School of Communication and Information Technology Fadwa Fawzy Fouad Supervisor: Dr. Moataz M.Abdelwahab
  • 2. Agenda   Introduction  Quick overview  2DHOOF/2DPCA Contour Based Optical Flow Algorithm  Human Gesture Recognition Employing Radon Transform/2DPCA
  • 3. Introduction  • Importance & Applications • Action V.S. Activity • Challenges & characteristics of the domain
  • 4. Importance &Applications  Human actionactivity recognition is one of the most promising applications of computer vision. The interest of this topic is motivated by the promise of many applications include • character animation for games and movies • advanced intelligent user interfaces • biomechanical analysis of actions for sports and medicine • automatic surveillance
  • 5. Action V.S. Activity  Action Activity  Single person  Complex sequence of actions  Single/ multiple person(s)  Short time duration  Long time duration  Simple motion pattern
  • 6. Challenges and characteristics of the domain  The difficulty of the recognition process is associated with multiple variation sources  Inter- and intra-class variations  Environmental Variations and Capturing conditions  Temporal variations
  • 7. • Inter-class variations (variations within single class) The variations in the performance of certain action due to anthropometric differences between individuals. For example, running movements can differ in speed and stride length. • Intra-class variations (variations within different classes) Overlap between different action classes due to the similarity in actions performance.
  • 8. • Environmental variations Destructions originate from the actor’s surroundings include dynamic or cluttered environments, illumination variation, Body occlusion • Capturing conditions Depend on the method used to capture the scene, wither singlemultiple static/dynamic camera(s) systems. • Temporal variations Includes the changes in the performance rate from one person to another. Also, the changes in the recording rate (frame/sec).
  • 9. Agenda   Introduction  Quick overview  2DHOOF/2DPCA Contour Based Optical Flow Algorithm  Human Gesture Recognition Employing Radon Transform/2DPCA
  • 10. Overview  The main structure of action recognition system
  • 11. The main structure of action recognition system  The structure of the action recognition system is typically hierarchical. Action classification Extraction of the action descriptors Human detection & segmentation Capture the input video
  • 12. Capture the input video For single camera, the scene is captured from only one viewpoint, so it can't provide enough information about the action performed in case of poor viewpoint. Besides, it can't handle the occlusion problem. Video 1 Video 3 Video 2 Video 4
  • 13. Multi-camera systems can capture the same view from different poses., so they provide sufficient information that can alleviate the occlusion problem. Camera 0 Camera 2 Camera 1 Camera 3
  • 14. The new technology of Kinect depth camera can be utilized to capture the performed actions. The device has: RGB camera, depth sensor and multi-array microphone. It provides full-body 3D motion capture, facial recognition and voice recognition capabilities. Furthermore, depth information can be used for segmentation. RGB information Kinect depth camera Depth information
  • 15. Human detection & segmentation It’s the first step of the full process of human sequence evaluation. Techniques can be divided into : • Background Subtraction techniques • Motion Based techniques • Appearance Based techniques • Depth Based Segmentation
  • 16. Extraction of the action descriptors Input videos consist of massive amounts of information in the form of spatiotemporal pixel intensity variations. But most of this information is not directly relevant to the task of understanding and identifying the activity occurring in the video. In this work we used Non-Parametric approaches in which a set of features are extracted per video frame, then these features are accumulated and matched to stored templates. Example: Motion Energy Image & Motion History Image
  • 17. Action classification When the extracted features are available for an input video, human action recognition becomes a classification problem. Dimensionality reduction is a common step before the actual classification and is discussed first. Dimensionality reduction Image representations are often high-dimensional. This makes matching task computationally more expensive. Also, the representation might contain noisy features. This problem trigged the idea of obtaining a more compact, robust feature representation by reducing the space of the image representation into a lower dimensional space. Example: OneTwo Dimension(s) Principal component analysis (PCA)
  • 18. Nearest neighbor classification k-Nearest neighbor (NN) classifiers use the distance between the features of an observed sequence and those in a training set. The most common label among the k closest training sequences is chosen as the classification. NN classification can be either performed at the frame level, or for the whole video sequences. In the latter case, issues with different frame lengths need to be resolved. In our work we used 1-NN with Euclidean distance to classify the tested actions. is class is class
  • 19. Agenda   Introduction  Quick overview  2DHOOF/2DPCA Contour Based Optical Flow Algorithm  Human Gesture Recognition Employing Radon Transform/2DPCA
  • 20. 2DHOOF/2DPCA Contour Based Optical Flow Algorithm  • • • • • Dense V.S. Sparse OF Alignment issues with OF The Calculation of 2D Histogram of Optical Flow(2DHOOF) Overall System Description Experimental Results
  • 21. Dense V.S. Sparse OF  In practice, dense OF is not the best choice to get the OF. Besides it’s high computation complexity, it is not accurate for homogenous moving objects (aperture problem).
  • 22. Alignment issues with OF  We had two choices to decide the best order for actor alignment:  Align actor then calculate OF  Calculate OF then Align it
  • 23. Jumping & Transition effects in Running action
  • 24. Calculate OF then Align OF Align actor then calculate OF
  • 25. The Calculation of 2D Histogram of Optical Flow(2DHOOF)  Calculated OF Histogram layers W/m x H/m x n
  • 26. An example to obtain the n-layers 2DHOOF for any two successive frames
  • 27. Accumulated 2D-HOOF that represents the whole video
  • 29. Confusion between Wave and Bend actions when using 1DHOOF Wave Bend
  • 30. Overall System Description  Segmentation & Contour Extraction Extract the dominant vectors Store extracted features Training Mode Sparse OF Testing Mode Segmentation & Contour Extraction 2DHOOF Projection on the dominant vectors Sparse OF 2DPCA Classification and Voting Scheme 2DHOOF
  • 31. Training Mode  Segmentation & Contour Extraction Sparse OF Extract the dominant vectors 2DHOOF Store extracted features 2DPCA
  • 32. Segmentation & Contour Extraction (Method 1) • Geodesic segmentation Where xi : stroke pixels (black) x : other pixels (white) I : image intensity Input Video Frame Face Detection Initial Stroke Blob Extraction Final Contour
  • 33. Segmentation & Contour Extraction (Method 2) • Contour extraction from Magnitude dense OF Edge pixel has specific criteria based on it's (3 x 3) neighbor pixels.
  • 34. Applying edgy criteria on the magnitude of the dense OF
  • 35. Steps of contour extraction from dense OF
  • 36. Training Mode  Segmentation & Contour Extraction Sparse OF Extract the dominant vectors 2DHOOF Store extracted features 2DPCA
  • 37. 2DHOOF-2DPCA Features Extraction 2DHOOF of Training Videos Final Features Projection
  • 38. Training Mode  Segmentation & Contour Extraction Sparse OF Extract the dominant vectors 2DHOOF Store extracted features 2DPCA
  • 39. Testing Mode  Segmentation & Contour Extraction Projection on the dominant vectors Sparse OF Classification and Voting Scheme 2DHOOF
  • 40. Projection on the dominant vectors
  • 42. Experimental Results  Two experiments were conducted to evaluate the performance of the proposed algorithm. • For the first experiment Weizmann dataset was used to measure the performance of the low resolution single camera operation. • For the second Experiment IXMAS multi-view dataset was used to evaluate the performance of the parallel camera structure. The two experiments was conducted using the Leave-One-Actor-Out (LOAO) technique to be consistent with the most recent algorithms. Both datasets provide RGB frames and the actor ‘s silhouettes.
  • 43. Weizmann dataset  The Weizmann dataset consists of 90 low-resolution video sequences showing 9 different actors, each performing 10 natural actions such as walk, run, jump forward, gallop sideways, bend, wave with one hand (wave1), wave with two hands (wave2), jump in place (Pjump), jump-jack, and skip. Bend Run Jump Jump-jack Gallop
  • 44. The confusion matrix for this experiment shows that the average recognition accuracy is 97.78%, and eight actions were 100% accurate. 2DHOOF / 2DPCA
  • 45. On the other hand, using 1DHOOF with 1DPCA decreases the accuracy to 63.34% because of the large confusion between actions (as discussed before). 1DHOOF / 1DPCA
  • 46. Comparison with the most recent algorithms: • Recognition Accuracy Method Accuracy Previous Contribution 98.89% Our Algorithm 97.79% Shah et al. 95.57% Yang et al. 92.8% Yuan et al. 92.22% • Average Testing Time Method Average Runtime Our Algorithm 66.11 msec Previous Contribution 113.00 msec Shah et al. 18.65 sec Blank et al. 30 sec
  • 47. Samples from the calculated contour OF Walk Skip P-jump
  • 48. IXMAS Dataset  The proposed parallel structure algorithm was applied on the IXMAS multi-view dataset. Each camera is considered as an independent system, then a voting scheme was carried out between the four cameras to obtain the final decision. This dataset consists of 5 cameras capturing the scene, 12 actors, each performing Our 13 natural actions 3 times in which the actors are free to change their orientation Camera0 Algorithm for each scenario. Our Voting Scheme Camera1 The actions: check Algorithm arms, scratch head, sit down, get up, turn watch, cross Final around, walk, wave, punch, kick, and pick up and throw. Camera2 Our Algorithm Camera3 Our Algorithm Decision
  • 49. Example on IXMAS multi-camera dataset. Action: Pick up and Throw Camera 0 Camera 2 Camera 1 Camera 3
  • 50. Confusion matrix for the IXMAS dataset shows that the average accuracy is 87.12%, where SH=Scratch head, CW=Check watch, CA=Cross arms, SD=Sit down, GU=Get up, TA=Turn around, PU=Pick up.
  • 51. Comparison with the best reported accuracies shows that we achieved the highest accuracy, with an enhancement of about 3%:
    Method: Actors #, Cam(0) %, Cam(1) %, Cam(2) %, Cam(3) %, Overall Vote %
    Proposed Algorithm: 12, 97.29, 79.04, 72.47, 78.53, 87.12
    Previous Contribution: 12, 78.9, 78.61, 80.93, 77.38, 84.59
    Weinland et al.: 10, 65.04, 70.00, 54.30, 66.00, 81.30
    Srivastava et al.: 10, N/A, N/A, N/A, N/A, 81.40
    Shah et al.: 12, 72.00, 53.00, 68.00, 63.00, 78.00
    (N/A = not available in published reports)
  • 52. Samples from the calculated contour OF Walk Sit down Kick
  • 53. Published Paper  F. Fawzy, M. Abdelwahab, and W. Mikhael, "2DHOOF-2DPCA Contour Based Optical Flow Algorithm for Human Activity Recognition," IEEE International Midwest Symposium on Circuits and Systems (MWSCAS 2013), Ohio, USA, 2013.
  • 54. Agenda   Introduction  Quick overview  2DHOOF/2DPCA Contour Based Optical Flow Algorithm  Human Gesture Recognition Employing Radon Transform/2DPCA
  • 55. Human Gesture Recognition Employing Radon Transform/2DPCA  • Radon Transform (RT) • Overall system description
  • 56. Radon Transform  The RT computes projections of an image matrix along specified directions. A projection of a two-dimensional function f(x,y) is a set of line integrals along parallel paths, or beams.
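For reference, the RT described above can be written in its standard line-integral form (a textbook definition, not taken from the slides):

```latex
R(\theta,\rho) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}
  f(x,y)\,\delta(x\cos\theta + y\sin\theta - \rho)\,dx\,dy
```

Each value R(θ, ρ) sums f(x, y) along the beam that lies at angle θ and signed distance ρ from the origin.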
  • 57. (figure: illustration of Radon Transform projections)
  • 58. Overall system description  The proposed system is designed and tested for gesture recognition and can be extended to regular action recognition. The algorithm has two modes: • Training Mode • Testing Mode. Both have a pre-processing step before feature extraction.
  • 60. Pre-processing Step: 1) Input videos The One Shot Learning ChaLearn Gesture Dataset was used for this experiment. In this dataset, a single user facing a fixed Kinect™ camera and interacting with a computer by performing gestures was captured. Videos are represented by RGB and depth images. Each actor has from 8 to 15 different gestures (the vocabulary) for training, and 47 input videos, each containing from 1 to 5 gestures, for testing. We applied our algorithm on a subset of this dataset consisting of 37 different actors.
  • 61. The dataset can be divided into two main groups: standing actors and sitting actors. In this experiment we used a subset of the standing-actor group, in which actors use their whole body to perform the gesture and make significant motion that can be captured by the MEI and MHI. Standing actors / Sitting actors
  • 62. Also, we used only the depth videos as input. Depth information makes the segmentation task easier than using RGB or grayscale videos, especially when the actor's clothes have the same color as the background, or the background is textured.
  • 64. Pre-processing Step: 2) Segmentation & Blob extraction We used the Basic Global Thresholding algorithm in order to extract the actor's blob.
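A minimal sketch of this step, assuming (as the editor's notes suggest) an initial threshold T that is refined over several iterations; the initialization and stopping tolerance below are illustrative, not taken from the thesis:

```python
import numpy as np

def basic_global_threshold(depth, eps=0.5, max_iter=100):
    """Iterative basic global thresholding (Gonzalez & Woods style)."""
    t = depth.mean()                        # illustrative initial T
    for _ in range(max_iter):
        g1 = depth[depth > t]               # group above T
        g2 = depth[depth <= t]              # group below or at T
        t_new = 0.5 * (g1.mean() + g2.mean())
        if abs(t_new - t) < eps:            # converged
            break
        t = t_new
    # Assumption: the actor stands closer to the Kinect than the
    # background, so the blob is the low-depth group.
    return depth <= t
```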
  • 65. (figure: the actor's blob extracted after several thresholding iterations)
  • 66. In some cases the resultant blob contains extra objects. This noise comes from objects that were at the same depth as the actor. Case 1 / Case 2 / Case 3
  • 67. In this situation we perform a noise elimination step Case 1 Case 2 Case 3
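A sketch of the elimination criterion described in the editor's notes (keep only the object with the maximum area); the helper below is illustrative and assumes a binary mask:

```python
import numpy as np
from scipy import ndimage

def keep_largest_blob(mask):
    """Keep only the largest connected component of a binary mask."""
    labels, n = ndimage.label(mask)
    if n == 0:
        return mask                      # nothing to clean
    areas = np.bincount(labels.ravel())  # area of each labeled object
    areas[0] = 0                         # label 0 is the background
    return labels == areas.argmax()
```

This covers Case 1 directly; Cases 2 and 3 would first need the morphological separation mentioned on the slide.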
  • 69. Alignment using RT of the First Frame  • Vertical alignment using the projection on the y-axis (the 90° projection from the RT)
  • 70. • Horizontal alignment using the projection on the x-axis (the 0° projection from the RT)
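A hedged sketch of both alignment steps: for a binary blob the 0° and 90° Radon projections reduce to column and row sums, so plain sums are used here. The centering rules follow the notes (projection-rectangle center for vertical, maximum projection for horizontal), and all names are illustrative:

```python
import numpy as np

def alignment_shift(mask):
    """Return (dy, dx) that centers the actor in the first frame."""
    h, w = mask.shape
    rows = mask.sum(axis=1)          # 90-degree projection (onto y-axis)
    cols = mask.sum(axis=0)          # 0-degree projection (onto x-axis)

    # Vertical: center of the projection's support -> frame y-center.
    ys = np.flatnonzero(rows)
    dy = h // 2 - (ys[0] + ys[-1]) // 2

    # Horizontal: the maximum projection marks the body's center
    # line -> shift it to the frame x-center.
    dx = w // 2 - int(cols.argmax())
    return dy, dx

# The same (dy, dx) is then applied to every frame of the video,
# e.g. np.roll(frame, shift=(dy, dx), axis=(0, 1)).
```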
  • 72. Calculate the MEI and MHI  Whole Body: MEI, MHI / Body Parts: MEI, MHI
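A minimal MEI/MHI sketch in the spirit of Bobick and Davis (the editor's notes describe the same behavior: MEI marks where motion occurred, MHI encodes its history with brighter values for newer motion); the frame-difference threshold and temporal window here are illustrative:

```python
import numpy as np

def mei_mhi(frames, tau=None, diff_thresh=30):
    """Compute (MEI, MHI) from a list of grayscale/depth frames."""
    tau = tau or len(frames) - 1
    mhi = np.zeros(frames[0].shape, dtype=float)
    for prev, cur in zip(frames, frames[1:]):
        moving = np.abs(cur.astype(int) - prev.astype(int)) > diff_thresh
        # New motion jumps to tau (brightest); old motion decays by 1.
        mhi = np.where(moving, tau, np.maximum(mhi - 1, 0))
    mei = mhi > 0                      # where any recent motion occurred
    return mei, mhi / tau              # MHI normalized to [0, 1]
```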
  • 74. Get Radon Transform for MEI and MHI 
  • 75. Basically, the difference between the RT of the whole body and the RT of the body parts is the white portion in the center, which represents the projection of the actor's body.
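Computing the sinogram itself can be sketched with scikit-image (an illustration; the thesis implementation may differ in sampling and normalization). With circle=False the output height is the frame's diagonal length and the width is 180, matching the dimensions stated in the notes:

```python
import numpy as np
from skimage.transform import radon

def rt_of_template(image):
    """Radon transform of an MEI or MHI image over 0-179 degrees."""
    theta = np.arange(180)                     # projection angles
    return radon(image.astype(float), theta=theta, circle=False)
```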
  • 78. Video Chopping  As mentioned, the testing videos may contain from 1 to 5 different gestures per video. In this case we need to separate these gestures into one gesture per video before testing. We do that in two main steps: 1. Calculate the plot that represents the moving area/frame. 2. Apply the local-minima criteria to this plot.
  • 79. 1. Calculate the plot that represents the moving area/frame
  • 80. 2. Apply the local-minima criteria We search for a frame i that satisfies the following conditions: a) The number of frames before this i is greater than or equal to the Frame Threshold. b) The amount of decrease in the area at i is greater than 50% of the Peak value. c) The area at i-1 and at i+1 is greater than the area at i, to ensure that i is a local minimum between two peaks.
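A hedged sketch of these three conditions; the names frame_threshold and peak_fraction are illustrative, as the slides only state a "Frame Threshold" and "50% of Peak value":

```python
import numpy as np

def find_cut_frames(area, frame_threshold=15, peak_fraction=0.5):
    """Return frame indices at which to chop a multi-gesture video."""
    cuts, last_cut = [], 0
    peak = float(area.max())
    for i in range(1, len(area) - 1):
        enough_frames = (i - last_cut) >= frame_threshold       # (a)
        deep_enough = (peak - area[i]) > peak_fraction * peak   # (b)
        local_min = area[i - 1] > area[i] < area[i + 1]         # (c)
        if enough_frames and deep_enough and local_min:
            cuts.append(i)
            last_cut = i
    return cuts
```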
  • 83. Experimental Results  We did four One Shot Learning (OSL) experiments: Experiments I and II use the Radon Transform as the action descriptor, and Experiments III and IV use the MEI/MHI directly; in each pair, one experiment classifies with 2DPCA and the other with direct correlation.
  • 84. Recognition accuracy of the four experiments (%):
    Experiment: Whole Body MEI, Whole Body MHI, Body Parts MEI, Body Parts MHI
    I (RT): 71, 69, 82, 81.5
    II (RT): 70, 70, 81.7, 81.6
    III (MEI/MHI): 70, 68, 82, 81.7
    IV (MEI/MHI): 71.24, 68.7, 83.33, 82.9
  Comparison between using RT and using MEI/MHI directly without RT (2DPCA features):
    Features: % Maintained Energy, Storage Requirements
    RT: 99%, 72 MBytes
    MEI/MHI: 88%, 102 MBytes

Editor's notes

  1. First, the introduction. It covers 3 main points: importance and applications of this field, the difference between action and activity, and finally the challenges and characteristics of the domain.
  2. The differences between Action and Activity are that …
  3. Inter-class variations are variations within a single class, because action performance can differ from one actor to another. Intra-class variations are variations between two or more different classes, due to the similarity in action performance. For better recognition results we need less inter-class variation and more intra-class variation.
  4. Environmental variations are distractions originating from the actor's surroundings. Capturing conditions depend on the method used to capture the scene, including the use of single/multiple moving or static cameras. Temporal variations include the changes in performance rate from one actor to another, and changes in the recording rate.
  5. The structure of the action recognition system is typically hierarchical. It starts by capturing the input video and extracting the actor's body from it, followed by feature extraction and finally action classification.
  6. As shown here, the first 3 videos are captured from a good viewpoint, so we can gain enough information about the actions. But the 4th video is captured from a poor viewpoint, from which the actor's body hides the action details.
  7. Add gif images
  8. ***Human detection is the task of finding the presence and the position of human beings in images/videos.*** We briefly describe a few popular human segmentation techniques:
  9. Human detection is the task of finding the presence and the position of human beings in images/videos
  10. MEI: represents the locations where motion has occurred in the image sequence. MHI: represents the history of this motion with different gray levels (newer motion is brighter).
  11. As shown here, this jumping actor has non-textured clothes, so the dense OF will have some inaccurate results, because for the body pixels in the current frame the new location in the next frame cannot be determined; only the edge points can accurately describe the actor's motion. So we used the sparse OF of the actor's contour, because it is less computationally expensive than the dense OF and can accurately describe the motion without excessive processing.
  12. For results consistency, we used an alignment step before feature extraction, and we found that the order of this step affects the results significantly.
  13. Actions like running can be represented by their jumping and transition effects
  14. If we align the actor and then calculate the OF, these effects vanish and only the legs' motion is captured; any other motion is due to poor alignment. On the other hand, if we calculate the OF and then align it, the transition and jumping effects are captured in the calculated OF. We can see this conclusion from the OF of the head pixels. So we chose to calculate the OF and then align it.
  15. After obtaining the OF for each two successive frames of the input video, we used it to calculate the new features, the n-layer 2DHOOF. The calculated OF of size W x H is divided into blocks, each of size m x m. For each block, a 1DHOOF with n bins representing the different ranges of angles is obtained. Each bin of the 1DHOOF then contributes to one layer of the 2DHOOF at the location corresponding to the block location, so the size of the final 2DHOOF is W/m x H/m x n.
  16. For example, if we divide the calculated OF into blocks each of size (W/2 x H/2), the final 2DHOOF layers have a size of (2 x 2).
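A sketch of this construction for a dense flow field (the thesis uses sparse OF on the contour points, so treat this as an illustration; block size m and bin count n are parameters, and W and H are assumed divisible by m):

```python
import numpy as np

def two_d_hoof(flow, m=8, n=8):
    """Build an (H/m, W/m, n) 2DHOOF from an (H, W, 2) flow field."""
    fx, fy = flow[..., 0], flow[..., 1]
    ang = np.arctan2(fy, fx)                 # orientation in [-pi, pi]
    mag = np.hypot(fx, fy)
    h, w = ang.shape
    hoof = np.zeros((h // m, w // m, n))
    bins = np.linspace(-np.pi, np.pi, n + 1)
    for bi in range(h // m):
        for bj in range(w // m):
            blk = (slice(bi * m, (bi + 1) * m),
                   slice(bj * m, (bj + 1) * m))
            # n-bin histogram of orientations for the moving pixels
            # of this block; each bin fills one layer at (bi, bj).
            hist, _ = np.histogram(ang[blk][mag[blk] > 0], bins=bins)
            hoof[bi, bj] = hist
    return hoof
```

Per-frame histograms would then be accumulated layer-wise and normalized over the whole video, as the next note describes.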
  17. After calculating the 2DHOOF for each 2 successive frames of the input video, these histograms are layer-wise accumulated and normalized to obtain the total 2DHOOF for the whole video. These features are independent of the actor's scale and tolerant to contour imperfections. Furthermore, they are independent of the start of the action, as the multi-layer 2DHOOFs per frame are finally accumulated and normalized regardless of their temporal order.
  18. The main advantage of the 2DHOOF is that it maintains the spatial relation between the moving parts, compared to the 1DHOOF, which is concerned only with the dominant motion wherever it occurs.
  19. As shown here, the bend and wave actions have the same motion directions. The main difference is the spatial location of this motion. Since the 1DHOOF doesn't maintain the spatial locations of the motion, it cannot be used to discriminate between these actions, as they use the same range of angles.
  20. Our system is divided into two modes: the training mode, in which the training features are extracted and stored, and the testing mode, in which the dominant features are obtained for the tested video and then compared to the stored training features to get the final decision.
  21. The first step in the training mode is actor segmentation and contour extraction.
  22. We tried two different methods for contour extraction. The first method is geodesic segmentation. The idea of this method is to draw an initial stroke on the actor's body and try to expand it to cover all other pixels that are near and have low intensity variation compared to the stroke pixels. These two conditions are met by measuring the geodesic distance between the initial stroke pixels and the other pixels. We used face detection to draw this initial stroke automatically.
  23. The accuracy of this method is highly dependent on the initial stroke.
  24. The second method uses the magnitude of the dense OF. An edge pixel satisfies a specific criterion based on its (3 x 3) neighbor pixels. As shown here, the black dot represents the edge pixel and the ones represent the neighbor pixels that have nonzero magnitude, so this criterion can simply be described by the summation of these ones. We found that an edge point has a summation value from 3 to 6.
  25. As shown here, we applied this edge criterion to each pixel to extract the edge pixels.
  26. The main steps of this method are: calculate the magnitude of the dense OF, then find the edge pixels using the edge criterion, and finally apply a simple threshold to remove the noise (see the sketch below).
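A sketch of notes 24-26; the noise threshold mag_thresh is illustrative, while the 3-to-6 neighbor range is the one stated above:

```python
import numpy as np
from scipy import ndimage

def contour_from_flow_magnitude(mag, mag_thresh=0.5):
    """Extract contour pixels from a dense-OF magnitude image."""
    nonzero = (mag > mag_thresh).astype(int)   # simple noise threshold
    kernel = np.ones((3, 3), dtype=int)
    kernel[1, 1] = 0                           # count the 8 neighbors only
    neighbors = ndimage.convolve(nonzero, kernel, mode='constant')
    # Edge criterion: a moving pixel with 3 to 6 moving neighbors.
    return nonzero.astype(bool) & (neighbors >= 3) & (neighbors <= 6)
```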
  27. The second step in the training mode is extracting the dominant features. After calculating the OF and the 2DHOOF, we used 2DPCA to extract the dominant features.
  28. For each range of angles in the training 2DHOOFs, we calculate the mean and then the covariance matrix, and then obtain the dominant vectors that correspond to the maximum eigenvalues. The histograms are then projected on the dominant vectors to extract the final features.
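A sketch of standard 2DPCA (Yang et al., 2004) applied to one angle layer; the number of dominant vectors d is illustrative:

```python
import numpy as np

def twod_pca_fit(samples, d=5):
    """Fit 2DPCA on an (M, H, W) stack of one 2DHOOF layer."""
    mean = samples.mean(axis=0)
    centered = samples - mean
    # Image scatter matrix: average of (A_i - mean)^T (A_i - mean).
    g = sum(a.T @ a for a in centered) / len(samples)
    vals, vecs = np.linalg.eigh(g)        # eigenvalues in ascending order
    return mean, vecs[:, -d:]             # the d dominant eigenvectors

def twod_pca_project(sample, x):
    """Project one (H, W) layer onto the dominant vectors."""
    return sample @ x                      # -> (H, d) feature matrix
```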
  29. These features are stored to be used in the testing mode.
  30. The 2DHOOF of the tested video is projected on the dominant vectors to obtain the final features.
  31. These features are then matched against the stored features using the 1NN classifier with Euclidean distance. The final decision is based on the minimum distance.
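A minimal 1NN sketch over the projected feature matrices, using the Frobenius norm as the matrix form of the Euclidean distance; per-camera decisions can then be combined by the voting scheme:

```python
import numpy as np

def classify_1nn(test_feat, train_feats, train_labels):
    """Return the label of the nearest stored feature matrix."""
    dists = [np.linalg.norm(test_feat - f) for f in train_feats]
    return train_labels[int(np.argmin(dists))]
```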
  32. We compared our algorithm with the most recent algorithms in terms of recognition accuracy and average testing time. The achieved accuracy is comparable with the highest reported accuracy, obtained in our previous contribution. This excellent accuracy was achieved in spite of the imperfect and noisy contours, which makes this method independent of how perfect the extracted contours are. Also, our algorithm has the best testing time, which promotes it for real-time applications.
  33. Add gifs
  34. This accuracy was achieved in spite of the presence of shadows and imperfections in the extracted contours.
  35. As shown here, we chose an initial T and started the segmentation algorithm. After a number of iterations we can segment the actor.
  36. We have 3 cases. Case 1: the noise and the actor are not connected. Case 2: the noise and the actor are connected but can be separated using simple morphological operations. Case 3: the noise and the actor are connected but can't be separated.
  37. By calculating the area of each object and keeping only the object with the maximum area.
  38. The segmented actor can be aligned using the 0° and 90° projections from the RT. For vertical alignment we used the 90° projection information. We specified the projection rectangle on the y-axis and aligned its center (red line) to the y-center (purple line) of the frame, as shown in the figure. As the gestures don't include whole-body motion (e.g. walk, run, ...), we can use only the RT of the first frame and shift all the video frames by the same distance.
  39. For horizontal alignment we used the 0° projection information. As shown in the figure, the maximum projection value on the x-axis represents the center line of the actor's body. The distance between the maximum projection (red line) and the x-center of the frame (purple line) is the amount needed to align the whole actor's body at the x-center.
  40. We have two types of MEI/MHI. The first is the whole-body MEI/MHI, and the second includes only the moving body parts. For gestures that include hand motion in front of the body area, the MEI/MHI of the whole body fails to capture this motion and hides it behind the actor's body; that's why we calculate the body-parts version. This makes the MEI/MHI of the moving parts more reliable and accurate than the MEI/MHI of the whole body.
  41. RT of the MEI/MHI is the projection of the image information over a range of angles from 0 to 180 degrees. The resultant RT has a height of DL and a width of 180.
  42. After obtaining RT we applied 2DPCA to extract the final features and store them.
  43. Testing mode is very similar to the training mode except for the video chopping step.
  44. 1. Take the first frame as a starting-position reference for each new gesture. 2. Perform frame differencing between the first frame and each frame in the video to get the moving parts. 3. Calculate the area of the moving body parts by summing the number of white pixels per frame, and then plot it. From the plot we can see that the area decreases when the actor is about to finish the gesture and returns to the starting position to start a new one.
  45. After obtaining the area plot, we apply the local-minima criteria on it: 1) to prevent cutting the video in the middle of a gesture; 2) to ensure that the actor is returning to the starting position; 3) …
  46. We applied this method on videos containing from 2 to 5 gestures.
  47. In some cases the actor doesn't return to the starting position between gestures, and that's why the algorithm merges two successive gestures into one. When this happens we discard this video from the training.
  48. We did 4 OSL experiments: the first 2 experiments use RT as the action descriptor, and the second 2 experiments use MEI/MHI as the action descriptor. For each case we used 2 different classification methods: 2DPCA with 1NN, and direct correlation.
  49. As shown in this table, there is almost no difference in accuracy between using the RT of the MEI/MHI and using the MEI/MHI directly, but RT needs less storage for the calculated features: RT reduces the storage requirements by about 30% compared to the MEI/MHI features, as shown in the 2nd table. In addition, RT maintains about 99% of the features' energy, compared to MEI/MHI, which maintains 88%. In all experiments the accuracy of the Body Parts version is much better than the Whole Body version, because of the motion occlusion that makes different gestures appear as if they were similar.
  50. For example, these two gestures have very similar whole-body MEI/MHI, although they are totally different.