The amount and complexity of digital media being generated, stored, transmitted, analysed and accessed have increased exponentially as a result of advances in computer and Web technologies. Much of this information combines digital images, video, audio, graphics and textual data. Large-scale online video repositories enable users to share material creatively with a wide audience. Consequently, there is an increasing interest in associating media items with free-text annotations, ranging from simple titles to detailed descriptions of the video content. In an effort to reduce the complexity of the annotation task, this talk will outline some of the techniques developed for indexing large-scale multimedia repositories by exploiting the multi-modality of the information space. One such approach combines semantic expansion and visual analysis to predict user tags for online videos. The framework is designed to exploit visual features using biologically inspired algorithms, together with associated textual metadata that is semantically expanded using complementary textual resources. The experimental results indicate the usefulness of the proposed approach for analysing large-scale media items.
Multimodal Analysis for Bridging Semantic Gap with Biologically Inspired Algorithm
1. Multimodal Analysis for Bridging
Semantic Gap with Biologically
Inspired Algorithms
Dr. Krishna Chandramouli
Media Engineering and Analytics Research Group
VIT University
2. Overview
Who we are!!
Media and Internet
Information Access
Subjective vs Objective Indexing
The Semantic Gap
Evolving Strategies
Social Media Analysis
MediaEval 2013 Participation
Conclusion
Q & A
04/07/2014 Uni. of Siegen
6. Media and internet
In March 2013, Flickr had a
total of 87 million registered
members and more than 3.5
million new images uploaded
daily.
There are currently almost 90
billion photos total on Facebook.
This means we are, by far, the
largest photos site on the Internet.
8. Information Access
Traditional ordering of images is achieved through categorization
of information into logical structures
Creation of albums
Categorizing through date/time
Clustering through location
Image based search engines are gaining popularity with the
increase in power of indexing schemes
14. Indexing subjective or objective
How can images be named uniquely to make them distinguishable?
What names can be used to search images?
How many names are needed to make the images unique?
Will all humans use the same names to identify the images?
15. Indexing subjective or objective
Humans are culturally influenced
Terms contain different meanings across boundaries and cultures
Therefore, any tag/word assigned to an image will be considered
subjective
Objective signatures for images are generated from the
characteristics of the images
The beginning of MPEG-7 standardisation activities.
18. The Semantic Gap
The semantic gap characterizes the difference between two
descriptions of an object by different linguistic representations, for
instance languages or symbols.
In computer science, the concept is relevant whenever ordinary
human activities, observations, and tasks are transferred into a
computational representation
21. Evolving strategies
Image Classification; Visual Classifier; Knowledge Assisted Analysis; Image Retrieval
and User Relevance Feedback; Multi-Concept Space Search and Retrieval
22. Evolving strategies
The problem of image classification and clustering has been the
subject of active research for the last decade, mainly owing to
the exponential growth of digital content.
The efficiency of the clustering and classification algorithms can
be attributed to the efficiency of the machine learning
approaches.
To improve the performance of machine learning algorithms,
different optimisation techniques have been employed, such as
Genetic Algorithms.
23. Evolving strategies
Recent developments in applied and heuristic optimisation
techniques have been strongly influenced and inspired by natural
and biological systems.
Algorithms developed from such observations include:
Ant Colony Optimisation (ACO) - based on the ability of an ant colony to
find the shortest path between the nest and a food source compared to an
individual ant.
Artificial Immune System (AIS) - typically exploits the immune system's
characteristics of learning and memory to solve a problem.
Particle Swarm Optimisation (PSO) - inspired by the social behaviour of a
flock of birds.
24. Evolving strategies
In the study of the "Semantic Gap", machine learning algorithms are
the building blocks for the bottom-up approach.
Some of the applications of efficient machine learning algorithms
are:
Automatic Content Annotation
Knowledge Extraction
Content Retrieval
In the top-down approach, an ontology provides a partial
understanding of human semantics.
26. Visual classifier
In an effort to transform the social interaction of different species into a
computer simulation, Kennedy and Eberhart developed an optimisation
technique named Particle Swarm Optimisation.
In theory, the universal behaviour of individuals is summarised in terms of
Evaluate, Compare and Imitate principles.
27. Visual classifier
Evaluate: The tendency to evaluate stimuli – to rate them as
positive or negative, attractive or repulsive is perhaps the most
ubiquitous behavioural characteristic of living organisms.
Compare: In almost every aspect of life, humans tend to compare
themselves with others.
Imitate: Human imitation comprises taking the perspective of the
other person, not only imitating a behaviour but also realising its
purpose and executing the behaviour when it is appropriate.
28. Visual classifier
Equations governing the motion of particles in PSO.
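$$v_{id}(t+1) = v_{id}(t) + c_1\,\big(\mathrm{pbest}_{i}(t) - x_{id}(t)\big) + c_2\,\big(\mathrm{gbest}_{d}(t) - x_{id}(t)\big)$$
$$x_{id}(t+1) = x_{id}(t) + v_{id}(t+1)$$
where
$v_{id}(t)$ represents the velocity of the particle
$\mathrm{pbest}_{i}(t)$ represents the personal best solution of particle $i$
$\mathrm{gbest}_{d}(t)$ represents the global best solution for the swarm
$x_{id}(t)$ represents the position of the particle
$c_1, c_2$ are the parameters governing the cognitive and social values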
29. Visual classifier
Pseudo code for the algorithm
Step 1: Random Initialization of Particles
Step 2: Function Evaluation
Step 3: Computation of personal best and global best
Step 4: Velocity update
Step 5: Position update
Step 6: Loop to step 2 until the stopping criterion is met
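The six steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the talk: the toy objective `sphere`, the swarm size, the search bounds, the iteration budget used as stopping criterion, the inertia weight `w`, and the random factors `r1` and `r2` are all assumptions borrowed from common PSO practice and are not given on the slides.

```python
import random


def sphere(x):
    """Toy objective (assumed for this sketch): sum of squares, minimum at the origin."""
    return sum(xi * xi for xi in x)


def pso(objective, dim=2, n_particles=30, n_iter=200,
        bounds=(-5.0, 5.0), w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective`; all default parameter values are illustrative assumptions."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Step 1: random initialisation of positions, zero initial velocities
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    # Steps 2-3: evaluate the swarm and record personal and global bests
    pbest = [p[:] for p in x]
    pbest_val = [objective(p) for p in x]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iter):  # Step 6: loop until the iteration budget is used up
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Step 4: velocity update (cognitive and social terms)
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                # Step 5: position update
                x[i][d] += v[i][d]
            val = objective(x[i])  # Step 2: function evaluation
            if val < pbest_val[i]:  # Step 3: update personal and global bests
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = x[i][:], val
    return gbest, gbest_val


best_pos, best_val = pso(sphere)
print(best_pos, best_val)
```

Setting `w=1.0` and replacing `r1` and `r2` with the constant 1 gives a velocity update without inertia or random factors; the values chosen here are only meant to keep the toy run stable.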
30. Visual classifier
Self Organising Map
[Figure: self-organising map. [X] denotes the input feature vector; Class 1 nodes are shown in red and untrained nodes in black. The winner node is selected based on the L2 norm.]
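Winner-node selection by L2 norm can be illustrated with a short Python sketch. The weight vectors, the input vector and the learning rate below are made-up values, and the update moves only the winner (a full SOM would also update the winner's neighbourhood), so this is an assumption-laden illustration rather than the configuration used in the talk.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def winner_node(weights, x):
    """Index of the node whose weight vector is closest to the input x."""
    return min(range(len(weights)), key=lambda j: l2_distance(weights[j], x))


def som_update(weights, x, learning_rate=0.5):
    """Move the winner node towards x; neighbourhood update omitted for brevity."""
    j = winner_node(weights, x)
    weights[j] = [w + learning_rate * (xi - w) for w, xi in zip(weights[j], x)]
    return j


# Hypothetical 3-node map with 2-dimensional weights and one input vector [X]
weights = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
x = [0.9, 0.8]
j = som_update(weights, x)
print(j, weights[j])
```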
33. Visual classifier
The elementary principle of “Chaos” is introduced to model the behaviour of
particle motion.
The theoretical discussion on Chaotic – PSO includes the notion of “wind
speed” and “wind direction” modelling the biological atmosphere for
position update of the particles.
The wind speed and therefore the position update equation are presented
by:
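$$v_{w}(t+1) = v_{w}(t) + v_{op} \cdot rand() + v_{su} \cdot rand()$$
$$x_{id}(t+1) = x_{id}(t) + v_{id}(t+1) + v_{w}(t+1)$$
where
$v_{w}$ is the wind speed
$v_{op} \cdot rand()$ is the opposing effect of the atmosphere
$v_{su} \cdot rand()$ is the supporting effect of the atmosphere
$v_{id}$ is the velocity of the particle
$x_{id}$ is the position of the particle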
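The wind-speed and position updates described on this slide can be sketched in Python as below. The function name `chaotic_position_update`, the per-dimension treatment of the wind speed, and the example values of `v_op` (taken as negative, so that it opposes motion) and `v_su` are assumptions made for illustration; the slides do not specify them.

```python
import random


def chaotic_position_update(x, v, v_w, v_op, v_su, rng=random):
    """One position update with a wind-speed term.

    x, v : position and already-updated velocity of one particle
    v_w  : current wind speed, one value per dimension (assumed layout)
    v_op : opposing wind factor, v_su : supporting wind factor (assumed values)
    """
    # Wind speed update: previous wind plus random opposing and supporting effects
    new_v_w = [w + v_op * rng.random() + v_su * rng.random() for w in v_w]
    # Position update: previous position plus particle velocity plus wind speed
    new_x = [xi + vi + wi for xi, vi, wi in zip(x, v, new_v_w)]
    return new_x, new_v_w


rng = random.Random(1)
new_x, new_v_w = chaotic_position_update(
    x=[0.0, 0.0], v=[1.0, 1.0], v_w=[0.0, 0.0], v_op=-0.1, v_su=0.1, rng=rng)
print(new_x, new_v_w)
```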
37. Knowledge Assisted Framework
Experimental Dataset
A set of 500 Images, belonging to the general category of
vacation images was assembled.
The content was mainly obtained from Flickr online photo
management and sharing application and includes images
that depict cityscape, seaside, mountain and landscape
locations.
Every image was manually annotated, i.e. after the
segmentation algorithm was applied, a single concept was
associated with each resulting image segment.
40. Knowledge Assisted Framework
From the results it can be seen that the combined use of PSO
optimisation technique with SOM results in better classification
accuracy compared to using the latter alone.
It can be noted that the performance of the PSO classifier is better
than the performance of the SVM and GA classifiers, since SVMs
need large amounts of training data to discriminate accurately
between image classes.
44. User Relevance Feedback
The database used in the experiment is generated from the Corel
dataset and consists of seven concepts, namely building, cloud,
car, elephant, grass, lion and tiger.
The test set has been modelled for seven concepts with a variety
of background elements and overlapping concepts, hence
making the test set complex.
48. Multi-concept search space
• High-level queries
“A tiger resting in the forest
and guarding his territory”
• Mid-level features (context
independent)
“Tiger”, “Grass”, “Rock”,
“Water”,……
49. Multi-concept search space
• Mid-level features:
In a constrained environment with a limited number of mid-level features,
the performance of the classification algorithm has been found to be satisfactory
• High-level queries:
Open to subjective interpretation of the concepts and also may involve
more than one mid-level feature
Main objective:
• In this multi-concept framework, users are encouraged to construct high
level queries based on their preferences
51. Multi-concept search space
• SVM Classifier
• SVM Light toolbox was used to generate semantic labels
• CLD+EHD
• Multi-feature classifier (MF)
• Employs a mixture of 7 visual features.
• The visual features are merged using Multi-Objective Learning (MOL)
52. Multi-concept search space
Pre-processing stage: mid-level feature concept detection
Query formulation: users construct a high-level semantic information
space
61. Multi-concept search space
High-level query | Mid-level concepts | Value
User 1
Landscape | water, grass | 0.58
Modern city | building, cloud | 0.8
Wild life | lion, tiger, elephant | 0.59
Rural garden | flower, water, grass | 0.9
User 2
Landscape | water | 0.23
Modern city | building | 0.71
Wild life | lion, rock, grass, tiger, elephant | 0.87
Rural garden | flower | 0.28
User 3
Landscape | water, grass, cloud, car, elephant | 0.59
Modern city | cloud, building, car | 0.91
Wild life | lion, tiger, grass, elephant, rock | 0.82
Rural garden | flower, water, grass | 0.88
63. Social Media Analysis
Social media is the interaction among people in which they create, share or
exchange information and ideas in virtual communities and networks.
Andreas Kaplan and Michael Haenlein define social media as "a group of
Internet-based applications that build on the ideological and technological
foundations of Web 2.0".
Social media allows for the creation and exchange of user-generated
content.
Social media differ from traditional or industrial media in many ways,
including quality, reach, frequency, usability, immediacy, and permanence.
64. Social Media Analysis
• Images are often accompanied by free-text annotations, which
can be used as complementary information for content-based
classification
• The challenge is to extract entities from text and classify them into
an arbitrary set of classes
Example image annotations: "Plansarsko lake"; "Shepherd in Bucegi National Park"
69. Social Media Analysis
Content-based analysis (KAA) is
restricted to classes for which the
classifier has been trained
For text-based analysis (SCM/THD),
the classes have to be exhaustive -
all entities are classified
Mapping from SCM/THD to KAA:
Perform intersection between the
individual classifier results
Select the concept occupying the largest area
on the image
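One way to read the intersection and largest-area steps is sketched below in Python. The function name `fuse`, the input layout (a list of region labels with pixel areas from the content-based classifier, and a set of text-derived concepts already mapped to the visual vocabulary) and the example values are all hypothetical; the slides do not describe the actual data structures.

```python
from collections import defaultdict


def fuse(region_labels, text_classes):
    """Pick the concept supported by both analyses that covers the largest area.

    region_labels : list of (concept, area_in_pixels) from the visual classifier
    text_classes  : set of concepts derived from the text annotation
    Returns the selected concept, or None if the two results do not intersect.
    """
    area = defaultdict(int)
    for concept, pixels in region_labels:
        if concept in text_classes:  # intersection of the two classifier results
            area[concept] += pixels
    if not area:
        return None
    # Select the concept occupying the largest total area on the image
    return max(area, key=area.get)


regions = [("water", 4000), ("sky", 6000), ("mountain", 3000), ("water", 2500)]
print(fuse(regions, {"water", "mountain"}))  # prints "water"
```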
72. VIT @ MediaEval 2013
17/07/14
Geographical coordinates are an important
component and indicator of where an event has
happened.
The event clusters are analysed through the weighted
occurrence of tags among the distribution of media
annotations.
73. VIT @ MediaEval 2013
The system computes the similarity between the synsets representing
the tags and each of the categories.
We use the Lin similarity measure to evaluate the semantic similarity
between a synset and a category.
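For reference, the Lin measure scores two concepts by the information content (IC) of their lowest common subsumer relative to their own IC: sim(a, b) = 2 * IC(lcs) / (IC(a) + IC(b)), with IC(c) = -log p(c). The Python sketch below shows only this formula; the probabilities are made-up values, and in practice they would come from corpus counts over a taxonomy such as WordNet, which is an assumption about the setup rather than something stated on the slides.

```python
import math


def information_content(p):
    """Information content of a concept with corpus probability p (0 < p <= 1)."""
    return -math.log(p)


def lin_similarity(p_a, p_b, p_lcs):
    """Lin similarity from the probabilities of two concepts and of their
    lowest common subsumer (LCS) in the taxonomy."""
    ic_a = information_content(p_a)
    ic_b = information_content(p_b)
    ic_lcs = information_content(p_lcs)
    if ic_a + ic_b == 0:
        return 0.0  # degenerate case (both concepts at the root); convention assumed here
    return 2.0 * ic_lcs / (ic_a + ic_b)


# Hypothetical probabilities for a tag synset, a category synset and their LCS
print(lin_similarity(p_a=0.001, p_b=0.002, p_lcs=0.01))
```

The score lies between 0 and 1, and a synset compared with itself scores 1.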
75. VIT @ MediaEval 2013
Dividing the globe into grids with a maximum of 10,000 images
per grid: starting from an initial grid that spans the entire
globe, grids are recursively subdivided into smaller ones once the
threshold is reached.
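The recursive subdivision can be sketched in Python as a quadtree over latitude and longitude. Splitting each cell into four equal quadrants, the `max_depth` guard and the function name `subdivide` are assumptions; the slide states only the 10,000-image threshold and the recursive refinement from a global grid.

```python
def subdivide(points, bounds=(-90.0, 90.0, -180.0, 180.0),
              max_points=10_000, max_depth=20):
    """Recursively split a lat/lon cell until it holds at most max_points.

    points : list of (lat, lon) tuples
    bounds : (lat_min, lat_max, lon_min, lon_max) of the current cell
    Returns a list of (bounds, points) pairs for the non-empty leaf cells.
    """
    lat_min, lat_max, lon_min, lon_max = bounds
    # Stop when the cell is small enough; max_depth guards against
    # endless splitting when many images share the same coordinates.
    if len(points) <= max_points or max_depth == 0:
        return [(bounds, points)]
    lat_mid = (lat_min + lat_max) / 2.0
    lon_mid = (lon_min + lon_max) / 2.0
    quadrants = {}
    for lat, lon in points:
        key = (lat >= lat_mid, lon >= lon_mid)  # (north half?, east half?)
        quadrants.setdefault(key, []).append((lat, lon))
    cells = []
    for (north, east), pts in quadrants.items():
        child = (lat_mid if north else lat_min, lat_max if north else lat_mid,
                 lon_mid if east else lon_min, lon_max if east else lon_mid)
        cells.extend(subdivide(pts, child, max_points, max_depth - 1))
    return cells


# Tiny demonstration with a threshold of 2 instead of 10,000
demo = [(48.8, 2.3), (50.9, 8.0), (51.5, -0.1), (-33.9, 151.2), (12.9, 79.1)]
for cell_bounds, cell_points in subdivide(demo, max_points=2):
    print(cell_bounds, len(cell_points))
```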
[Chart: Series1 values of 0.74, 3.9, 15.24, 26.3 and 30.14 at x-axis points 1, 10, 100, 500 and 1000; y-axis from 0 to 35.]
77. Conclusion
Automatic concept detection within images is a challenging and as yet
unsolved research problem.
Impressive improvements have been achieved, although most of the
proposed systems rely on training data that has been manually, and thus
reliably, labeled, an expensive and laborious endeavor that cannot easily
scale.
Current research in domain adaptation focuses on a scenario where
(a) the prior domain (source) consists of one or at most two databases,
(b) the labels in the source and the target domain are the same, and
(c) the amount of annotated training data for the target domain is limited.