The document provides an overview of the teaching and research activities of the ELIS Multimedia Lab at Ghent University and Ghent University Global Campus. It discusses various research projects involving multimedia data analysis using techniques such as deep learning, neural networks, and computer vision. Specific projects summarized include terrain classification using convolutional neural networks, video content understanding using 3D CNNs and LSTMs, video event detection using reservoir computing networks, Twitter micropost modeling using word2vec and feedforward neural networks, humor detection on Twitter, and multimodal condition monitoring of wind turbines.
Ghent University Multimedia Lab Research and Teaching Activities
1. ELIS – Multimedia Lab
Ghent University and GUGC-K:
Overview of Teaching and Research Activities
Research Seminar
KAIST, 18 August 2015
Wesley De Neve
@wmdeneve
Ghent University – iMinds & KAIST
2. 2
• Teaching activities
- Ghent University Global Campus
- Ghent University Home Campus
• Research activities
- Ghent University Home Campus
- Ghent University Global Campus
Outline
5. 5
Ghent University, Belgium
Rector: Prof. Anne De Paepe
Vice-rector: Prof. Freddy Mortier
Ghent University Global Campus, Korea
Campus President: Prof. Jozef Vercruysse
Campus Vice-president: Dr. Thomas Buerman
8. 8
Incheon Global Campus (IGC)
University of Utah
George Mason University
Ghent University
SUNY at Stony Brook
University of Nevada
10. 10
• Programs: Molecular Biotechnology, Food Technology, Environmental Technology
• Degrees: Bachelor, Master, PhD
• Double Accreditation
• Resident and Flying Faculty
• Ghent University Degree
• Quality Control
• Ghent University Appointment
• Integrated Research Plan
11. 11
• Research-focused program
• Practical exercises in laboratories
• Graduation project
• Double accreditation
- NVAO: January – August 2013
- MoE: March – November 2013
• Ghent University degree
• Company internships
• One semester in Belgium
17. 17
• Course content
- management, analysis, and visualization of large-scale datasets
• Lecture on the art of (deep)
machine learning
• Hands-on session
- word2vec for natural
language processing (NLP)
- Apache Spark
Teaching Activities
Big Data Science
(Spring term)
19. 19
TERRAIN CLASSIFICATION FOR
HYPERSPECTRAL IMAGES
Viktor Slavkovikj
20. 20
• Hyperspectral images
- each pixel contains hundreds of measurements of the
electromagnetic spectrum
- often captured through remote sensing
• e.g., through a camera mounted on an airplane
• Problem: how to do terrain classification?
- e.g., corn, wheat, and woods
Problem Statement
21. 21
Architecture: Convolutional Neural Network
• input layer: 9 pixels and their spectral bands
• convolutional layers (3): filter sizes 9x16, 1x16, and 1x16 (in the order listed on the slide)
• fully connected layers (2): 800 hidden units each (hyperbolic tangent)
• output layer: one out of 16 terrain classes
• implementation: by means of Python and Lasagne, a lightweight library to quickly build and train neural networks in Theano
23. 23
• Data augmentation through the addition of Gaussian noise
- minor impact
- similar observation for max-pooling, ReLUs, and DropOut
• Classification results on par with the state-of-the-art
- overall accuracy between 80% and 95%
Experimental Results
Indian Pines: test results, overall accuracy (%)
                 5% training data   10% training data   20% training data
Non-augmented    85.46 ± 1.73       92.76 ± 0.93        96.54 ± 0.47
Augmented        86.54 ± 0.30       92.70 ± 1.00        96.58 ± 0.55
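The Gaussian-noise augmentation mentioned on this slide could look roughly like the sketch below. It is a minimal NumPy illustration, not the lab's Lasagne code: the function name `augment_with_gaussian_noise`, the noise level `sigma`, the number of noisy copies, and the toy input with 200 spectral bands are all assumptions made for the example.

```python
import numpy as np


def augment_with_gaussian_noise(samples, labels, copies=2, sigma=0.01, seed=0):
    """Return the original samples plus `copies` noisy versions of each one.

    samples: array of shape (n_samples, n_pixels, n_bands)
    labels:  array of shape (n_samples,)
    sigma:   noise standard deviation (hypothetical value, not from the slides)
    """
    rng = np.random.default_rng(seed)
    augmented = [samples]
    for _ in range(copies):
        noise = rng.normal(loc=0.0, scale=sigma, size=samples.shape)
        augmented.append(samples + noise)
    # Labels are repeated in the same order as the concatenated sample blocks.
    return np.concatenate(augmented), np.tile(labels, copies + 1)


# Toy data: 10 samples of 9 pixels with 200 bands each, 16 terrain classes.
rng = np.random.default_rng(1)
x = rng.random((10, 9, 200))
y = rng.integers(0, 16, size=10)
x_aug, y_aug = augment_with_gaussian_noise(x, y)
print(x_aug.shape, y_aug.shape)
```

The first block of the augmented set is the untouched original data, so the non-augmented and augmented settings can be compared on the same samples.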
25. 25
Goals
• Representation Learning using Neural Networks
• Spatial & Temporal Feature Construction
• Generation of Fine-grained Descriptions
• Focus on Video Content Understanding
- objects, actions, & scenes
26. 26
Techniques
Main focus is on neural network techniques that are able to capture temporal behaviour
• 3-D Convolutional Neural Networks: “Convolve over spatial (2D) and/or temporal domain (3D) to acquire knowledge of input”
• Recurrent Long Short-Term Memory Networks: “Process sequence of inputs and acquire knowledge based on memory cells”
• Recurrent Reservoir Computing Networks: “Randomly assigned weights in the reservoir, combined with a readout layer using linear regression”
baseline video features: IDTF, AlexNet (ImageNet), C3D (FAIR)
implementation: Theano, Caffe, and Lasagne
27. 27
Data
Focus on
• UCF101: action recognition dataset, ‘realistic action videos’, well-known and widely used
• Crawled Vine videos: social and mobile content, noisy and short-form data
28. 28
First Exemplary Approach
video frames f1, f2, …, fn
→ Convolutional Neural Network (per frame): Representation f1, Representation f2, …, Representation fn
→ Long Short-Term Memory Network: Video Representation
→ Classification
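The flow of this first approach (per-frame representations, then an LSTM, then classification) can be sketched in plain NumPy. This is only an illustration under stated assumptions: the per-frame CNN features are replaced by random vectors, all weights are random and untrained, the sizes (16 frames, 128-D features, 64 hidden units) are hypothetical, and taking the last hidden state as the video representation is one common choice rather than something the slides specify. Only the 101 output classes follow from UCF101.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def lstm_video_representation(frame_features, params):
    """Run a single-layer LSTM over per-frame features; return the last hidden state."""
    hidden_size = params["b"].shape[0] // 4
    h = np.zeros(hidden_size)
    c = np.zeros(hidden_size)
    for x in frame_features:
        z = params["W"] @ x + params["U"] @ h + params["b"]
        i, f, o, g = np.split(z, 4)  # input, forget, output gates and candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h


rng = np.random.default_rng(0)
n_frames, feat_dim, hidden, n_classes = 16, 128, 64, 101  # 101 classes as in UCF101

# Stand-in for the CNN outputs: one random feature vector per frame.
frame_features = rng.normal(size=(n_frames, feat_dim))
params = {
    "W": rng.normal(scale=0.1, size=(4 * hidden, feat_dim)),
    "U": rng.normal(scale=0.1, size=(4 * hidden, hidden)),
    "b": np.zeros(4 * hidden),
}
W_out = rng.normal(scale=0.1, size=(n_classes, hidden))

video_repr = lstm_video_representation(frame_features, params)
class_probs = softmax(W_out @ video_repr)
print(video_repr.shape, class_probs.shape)
```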
29. 29
Second Exemplary Approach
raw frames f1, f2, …, fn → Convolutional Neural Network
motion flows m1, m2, …, mk → Convolutional Neural Network
→ Fusion → Video Representation → Classification
30. 30
RESERVOIR COMPUTING FOR
VIDEO EVENT DETECTION
Azarakhsh Jalalvand
31. 31
• Goal
- detect the status of a door: open, closed, half-open
- use of a simple, efficient, and effective system
• Approach
- use of a fixed low-resolution camera (30×30 pixels)
• privacy reasons: people are not recognizable
• low bandwidth needed to stream the data
- use of Reservoir Computing Networks (RCNs)
• good in modeling temporal information (cf. speech)
• good in dealing with noisy data
Video Event Detection (1/2)
32. 32
• Implemented solution: small neural network of 200 nodes
- fast training
• reservoir: random assignment of connection weights
• readout layer: gradient descent for linear regression
- real-time response
- robust against noise
• low-light conditions and people appearing in the frame
Video Event Detection (2/2)
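A reservoir computing network of the size described here (200 nodes, 30×30-pixel input, three door states) can be sketched as follows. This is a hedged illustration, not the presented system: the weight ranges, the spectral radius of 0.9, and the random stand-in data are assumptions, and the readout is fitted with closed-form ridge regression instead of the gradient descent mentioned on the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
n_inputs, n_reservoir, n_classes = 30 * 30, 200, 3  # sizes taken from the slides

# Reservoir: randomly assigned, fixed weights (never trained).
W_in = rng.uniform(-0.1, 0.1, size=(n_reservoir, n_inputs))
W_res = rng.uniform(-1.0, 1.0, size=(n_reservoir, n_reservoir))
W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))  # spectral radius 0.9 (assumed)


def reservoir_states(frames):
    """Map a sequence of flattened frames to a sequence of reservoir states."""
    h = np.zeros(n_reservoir)
    states = []
    for x in frames:
        h = np.tanh(W_in @ x + W_res @ h)
        states.append(h)
    return np.array(states)


def train_readout(states, targets, ridge=1e-2):
    """Fit the linear readout with ridge regression (closed form)."""
    S = np.hstack([states, np.ones((len(states), 1))])  # append a bias column
    return np.linalg.solve(S.T @ S + ridge * np.eye(S.shape[1]), S.T @ targets)


def predict(states, W_out):
    S = np.hstack([states, np.ones((len(states), 1))])
    return np.argmax(S @ W_out, axis=1)


# Stand-in data: random frames with random labels (0 = open, 1 = closed, 2 = half-open).
frames = rng.random((500, n_inputs))
labels = rng.integers(0, n_classes, size=500)

states = reservoir_states(frames)
W_out = train_readout(states, np.eye(n_classes)[labels])
predictions = predict(states, W_out)
print(states.shape, W_out.shape, predictions.shape)
```

Only the readout weights `W_out` are learned, which is why training is fast: the cost is a single linear solve over the collected reservoir states.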
36. 36
Problem statement
Current Natural Language Processing (NLP) research focuses
on “clean” text: news articles, Wikipedia articles…
What about noisy, short-form, and unstructured microposts?
Lack of correct spelling, a lot of slang
Lack of context
Lack of consistent grammar rules (~structure)
37. 37
A simple, general but effective
neural network architecture (1)
Use Google’s word2vec (=simplified neural network) to generate
good feature representations for words (=unsupervised learning)
Feed word representations to another neural network (NN) for any
classification task (=supervised learning)
Tweet → Feature representation → Machine learning: classification → Label
• Feature representation: learn word2vec word representations once in advance
• Machine learning: train a new NN for any NLP task
38. 38
A simple, general but effective
neural network architecture (2)
Window = 3: W(t-1), W(t), W(t+1)
→ Lookup: one N-dim vector per word
→ Concatenate (3N-dim)
→ Feed-forward neural network
→ Label(W(t))
Tweet → Feature representation → Machine learning: classification → Label
Example: “Im going from Seoul to Daejeon. #KTX”, with the window “from Seoul to”
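The lookup, concatenation, and feed-forward steps of this architecture can be sketched in NumPy. The sketch rests on several assumptions: the lookup table holds random vectors instead of trained word2vec embeddings, the tokenization of the example tweet is hand-made, zero vectors pad the window at the tweet boundaries, and the network weights, tanh hidden layer, and three output labels are hypothetical. N = 400 matches the embedding size shown on the later slides.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 400  # embedding size
tokens = ["Im", "going", "from", "Seoul", "to", "Daejeon", "#KTX"]

# Stand-in lookup table: random vectors instead of trained word2vec embeddings.
lookup = {word: rng.normal(size=N) for word in tokens}
padding = np.zeros(N)  # used when the window runs past the tweet (assumption)


def window_features(tokens, t):
    """Concatenate the vectors of W(t-1), W(t), W(t+1) into one 3N-dim vector."""
    vectors = []
    for pos in (t - 1, t, t + 1):
        if 0 <= pos < len(tokens):
            vectors.append(lookup[tokens[pos]])
        else:
            vectors.append(padding)
    return np.concatenate(vectors)


def feed_forward(x, W1, b1, W2, b2):
    """One hidden layer followed by a softmax over the labels."""
    hidden = np.tanh(W1 @ x + b1)
    logits = W2 @ hidden + b2
    e = np.exp(logits - logits.max())
    return e / e.sum()


n_hidden, n_labels = 400, 3  # hypothetical label set, e.g. location / company / other
W1 = rng.normal(scale=0.01, size=(n_hidden, 3 * N))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.01, size=(n_labels, n_hidden))
b2 = np.zeros(n_labels)

features = window_features(tokens, tokens.index("Seoul"))  # window "from Seoul to"
probs = feed_forward(features, W1, b1, W2, b2)
print(features.shape, probs.shape)
```

Because the lookup table is shared, switching to another tagging task only requires training new feed-forward weights on the same 3N-dimensional inputs.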
39. 39
Word2vec: automatically learning good features
Model trained on 400 million tweets having 5 billion words
2-D projection of a 400-D space of the top 1000 words used on Twitter
40. 40
Part-of-Speech tagging: is it a verb, noun or article?
Input window: “Im going from” (annotation on the slide: slang)
→ Lookup: 400D, 400D, 400D
→ FFNN: 400 hidden nodes
→ Verb
NIPS Workshop on Modern Machine Learning Methods and Natural Language Processing
41. 41
Named Entity Recognition:
is it a location, company or TV show (1)?
Input window: “from Seoul to”
→ Lookup: 400D, 400D, 400D
→ FFNN: 400 hidden nodes
→ Location
The same word representations
The same network, but with different weights
42. 42
Named Entity Recognition:
is it a location, company or TV show (2)?
Used both “standard” features and word representations
Only using word representations
ACL 2015 Workshop on Noisy User-generated Text
43. 43
Next Steps
Replace word2vec word representations with character
representations
Use Convolutional Neural Networks as pattern filters, to prevent a
huge increase in vocabulary size (e.g., a convolutional filter should be
able to map “the” and “da” onto the same pattern)
Combine character representations to form word representations
that can be classified
45. 45
• Observation
- lots of humor on Twitter
• Question
- can we automatically detect
humorous tweets?
• Motivation
- humor is engaging (ads!)
- creation of intelligent agents
with social & emotional skills
Humor Detection on Twitter
46. 46
• Different kinds of humor
- sarcastic humor
- black humor
- self-deprecating humor
- satire
- parody
• Personal context
• Multimodal tweets
• Language usage
Why Humor Detection on Twitter Is Challenging
47. 47
• Binary classification problem: humorous or non-humorous
• Collection of tweets in English
- tweets containing #lol, #rofl, #lmao, #funny, #hilarious, …
- dataset of 373,498 tweets
• 50/50 humorous and non-humorous
• Features
- word2vec
• Classification technique
- feed-forward neural network with ReLUs
Approach (1/2)
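One way to read the data collection step is as hashtag-based labeling: a tweet counts as humorous when it carries one of the listed hashtags. The sketch below illustrates that reading only. The function name `label_and_clean`, the whitespace tokenization, the example tweets, and the removal of the humor hashtags from the text (so that a classifier cannot simply memorize them) are assumptions, not details given on the slide.

```python
# Hashtags named on the slide; the slide's list continues with "…".
HUMOR_HASHTAGS = {"#lol", "#rofl", "#lmao", "#funny", "#hilarious"}


def label_and_clean(tweet):
    """Return (text_without_humor_hashtags, is_humorous) for one tweet."""
    tokens = tweet.split()
    is_humorous = any(tok.lower() in HUMOR_HASHTAGS for tok in tokens)
    kept = [tok for tok in tokens if tok.lower() not in HUMOR_HASHTAGS]
    return " ".join(kept), is_humorous


# Hypothetical example tweets.
examples = [
    "My cat just unplugged my alarm clock #funny",
    "Train from Seoul to Daejeon leaves at 9 #KTX",
]
for text in examples:
    print(label_and_clean(text))
```

A hashtag followed directly by punctuation (for example `#lol!`) is not matched by this simple whitespace split; a real pipeline would need a proper tweet tokenizer.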
49. 49
Classification accuracy: 81.07%
Preliminary Results
Humorous Tweets
You know you're at a Croatian jam whn your uncle forces
you to take shots .....
I've finally learned how to play spades
Watermelon inside of a watermelon!! My fav vine!
Some boys will wear dark sunglasses in Church, then be
blaming God later when they end up as Welders
It's so weird to thing that over in the other side of the
country there are people going to sleep while I'm getting
up
Got a new TV set for downstairs and my dad said "I bet I
can do this in 15 minutes" and almost 1 hour later it's
nearly finished
#RapLikeLilWayne I walk while I sleep. Call that Sleep
walkin!!!! #whaddup
50. 50
• Collect more training data by making use of Reddit
• Experimentation with recurrent neural network techniques
• Multimodal word/concept vector representations,
integrating both textual and visual information
Next Steps
52. 52
Healthy wind turbine Broken wind turbine
• Multi-sensor monitoring of bearings to detect faults early on
- infrared imaging, vibration data, and temperature data
• Classification
- white box models: random decision forests and SVMs
- black box models: CNNs
Condition Monitoring: Failure Prevention
54. 54
• Infrared imaging analysis
- handcrafted features + SVM: accuracy of 88.25%
• Vibration data analysis
- handcrafted features + RDF: accuracy of 87.25%
- CNN: accuracy of 91.77%
• Ongoing research: ensembling
- creation of a multimodal system using early and/or late fusion
Some Observations
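The difference between early and late fusion can be illustrated with a small NumPy sketch. The function names, the equal fusion weights, and the probability values are hypothetical; the slides only state that a multimodal system using early and/or late fusion is ongoing work.

```python
import numpy as np


def early_fusion(infrared_features, vibration_features):
    """Early fusion: concatenate per-sample feature vectors before classification."""
    return np.concatenate([infrared_features, vibration_features], axis=1)


def late_fusion(prob_infrared, prob_vibration, weights=(0.5, 0.5)):
    """Late fusion: weighted average of class probabilities from separate models."""
    fused = weights[0] * prob_infrared + weights[1] * prob_vibration
    return fused / fused.sum(axis=1, keepdims=True)


# Hypothetical classifier outputs for 2 samples and 2 classes (healthy, faulty).
p_ir = np.array([[0.8, 0.2], [0.4, 0.6]])
p_vib = np.array([[0.6, 0.4], [0.1, 0.9]])

fused = late_fusion(p_ir, p_vib)
print(fused)
print(fused.argmax(axis=1))
```

Early fusion needs a single classifier trained on the combined features, whereas late fusion keeps the per-sensor models (for example the SVM on infrared features and the CNN on vibration data) and only merges their outputs.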
56. 56
• Challenge: data handling
- DNA sequencing is outrunning
DNA storage, transmission, and
analysis
• Research question
- how about compressing DNA by making use of video coding
tools in order to alleviate storage, transmission, and analysis
problems?
Problem Statement
57. 57
• Modular and extensible
- thanks to the use of the pipes and filters design pattern
• Block-based compression
- allows selecting the best coding tool per block (adaptivity)
- enables random access, streaming, and parallel processing
Codec Architecture (1/2)
[Figure: pipeline of an input filter, an encoding filter, and an output filter connected by pipes, with a statistics component]
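The pipes-and-filters structure with per-block tool selection can be sketched with Python generators, where each generator is a filter and chaining them plays the role of the pipes. This is an illustration only: the standard-library compressors `zlib`, `bz2`, and `lzma` stand in for the codec's actual coding tools, which the slides do not name, and the block size and toy sequence are hypothetical.

```python
import bz2
import lzma
import zlib

# Stand-in coding tools from the standard library (not the codec's real tools).
CODING_TOOLS = {
    0: (zlib.compress, zlib.decompress),
    1: (bz2.compress, bz2.decompress),
    2: (lzma.compress, lzma.decompress),
}


def input_filter(sequence, block_size):
    """Split the sequence into fixed-size blocks."""
    for start in range(0, len(sequence), block_size):
        yield sequence[start:start + block_size]


def encoding_filter(blocks):
    """Encode each block with whichever tool gives the smallest output."""
    for block in blocks:
        candidates = ((tool_id, enc(block)) for tool_id, (enc, _) in CODING_TOOLS.items())
        yield min(candidates, key=lambda item: len(item[1]))


def output_filter(encoded_blocks):
    """Collect (tool_id, payload) pairs; each block stays independently decodable."""
    return list(encoded_blocks)


def decode_block(tool_id, payload):
    return CODING_TOOLS[tool_id][1](payload)


dna = b"ACGTTGCA" * 2000 + b"N" * 500  # toy sequence
stream = output_filter(encoding_filter(input_filter(dna, block_size=4096)))
restored = b"".join(decode_block(tool_id, payload) for tool_id, payload in stream)
print(len(dna), sum(len(payload) for _, payload in stream), restored == dna)
```

Because every block records its own tool and is decoded on its own, a single block such as `stream[2]` can be decoded without touching the others, which is what makes random access, streaming, and parallel processing possible.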
58. 58
Codec Architecture (2/2)
[Figure: trade-off between efficiency, effectiveness, and functionality, comparing the proposed solution with the state of the art (SOTA)]
allowing for a flexible trade-off between
efficiency, effectiveness, and functionality
has always been a major design goal
59. 59
• Effectiveness: compression of the human Y chromosome
• Efficiency
- < 3 minutes: 4.30 MB
- 10 minutes: 4.21 MB
- 7 hours: 3.75 MB
Experimental Results
Format                                         File size (MB)
No compression (FASTA)                         18.70
Binary                                          7.01
Huffman                                         5.16
Proposed framework (December 2014)              4.26
Proposed framework with CABAC (August 2015)     3.75
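The file sizes in the table can be turned into compression ratios and approximate bits per symbol with a few lines of arithmetic. The sketch assumes that the FASTA file stores one byte per symbol and ignores header and newline overhead, which is an approximation and not a figure from the slides.

```python
# File sizes in MB, copied from the table.
sizes_mb = {
    "No compression (FASTA)": 18.70,
    "Binary": 7.01,
    "Huffman": 5.16,
    "Proposed framework (December 2014)": 4.26,
    "Proposed framework with CABAC (August 2015)": 3.75,
}

baseline = sizes_mb["No compression (FASTA)"]
summary = {}
for name, size in sizes_mb.items():
    ratio = baseline / size
    bits_per_symbol = 8.0 * size / baseline  # assumes 1 byte per symbol in FASTA
    summary[name] = (ratio, bits_per_symbol)
    print(f"{name:45s} ratio {ratio:5.2f}   about {bits_per_symbol:4.2f} bits/symbol")
```

Under this assumption the binary format works out to roughly 3 bits per symbol, which would be consistent with a fixed-length 3-bit code, and the CABAC-based version reaches a ratio of just under 5 relative to FASTA.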
60. 60
• Compression
- support for the protein alphabet
- performance optimizations (I/O, GPU)
• Privacy protection and streaming
- encryption
• Compressed-domain manipulation
- only download and decode that part of the compressed
genome that belongs to a particular gene (region-of-interest)
• DCC + MPEG standardization
Future Activities
62. 62
Deep Learning for Biotech Data
Deep machine
learning
Multimedia
data
Biotech
data
Songdo
Ghent
important: unique (specialized) use cases and corresponding data sets,
given the current speed of change in the field of deep learning
64. 64
• What?
- commonly used in chemistry to create a fingerprint by
which molecules can be identified
• Applications
- medical diagnosis and food analysis, among others
Use Case 2: Raman Spectroscopy (1/2)
65. 65
• Challenges?
- data: vast amounts of data
- device: different devices,
different characteristics
- noise: environment, side effects
- composite materials:
overlapping signals
• Goal
- noise-robust automatic Raman spectrum identification
using signal processing and machine learning techniques
Use Case 2: Raman Spectroscopy (2/2)
66. 66
iMinds & ETRI
R&D collaboration in the field of IoT,
Big Data, and network communication (5G)
joint international research labs
(in Songdo?)
67. 67
• Memoranda of Understanding (MoU)
• Joint research projects (legal status GUGC-K?)
• Joint doctoral degrees
• Visits of master’s and Ph.D. students in Spring 2016?
- GPU cluster in Songdo (for deep CNNs, among others)
• 4 Xeon CPUs
• 8 Titan Black GPUs with 96 GB of memory
• 128 GB of system memory
• 2 TB SSD + 16 TB of storage capacity
• 3200 Watt of power consumption
Further Ideas