Ghent University Multimedia Lab Research and Teaching Activities

ELIS – Multimedia Lab
Ghent University and GUGC-K:
Overview of Teaching and Research Activities
Research Seminar
KAIST, 18 August 2015
Wesley De Neve
@wmdeneve
Ghent University – iMinds & KAIST

2
• Teaching activities
- Ghent University Global Campus
- Ghent University Home Campus
• Research activities
Outline

3
Outline

4
WHO?

5
Ghent University, Belgium
Rector: Prof. Anne De Paepe
Vice-rector: Prof. Freddy Mortier
Ghent University Global Campus, Korea
Campus President: Prof. Jozef Vercruysse
Campus Vice-president: Dr. Thomas Buerman

6
WHERE?

8
Incheon Global Campus (IGC)
University of Utah
George Mason University
Ghent University
SUNY at Stony Brook
University of Nevada

9
Bachelor Master PhD
Environmental
Technology
Molecular
Biotechnology
Food
Technology

10
Molecular
Biotechnology
Food
Technology
Bachelor Master
PhD
Double Accreditation
Resident and
Flying Faculty
Ghent University Degree
Quality Control
Ghent University Appointment
Integrated Research Plan
Environmental
Technology

11
Research-focused program
Practical exercises
in laboratories
Graduation project
Double accreditation
NVAO
January - August 2013
MoE
March – November 2013
Ghent University degree
Company internships
One semester in Belgium

14
English
Biology
Mathematics
Inorganic chemistry
Organic chemistry
Informatics
Physics
Biochemistry
Molecular biology
Genetics
Statistics
Economics
Marketing
Modeling
Simulation
Process engineering
Legislation
Process technology
Entrepreneurship
Intellectual property
Project management
Process control

15
Teaching Activities
Informatics 1
(Fall term – 5 credits)
Informatics 2
(Spring term – 5 credits)

16
Outline

17
• Course content
- management, analysis,
and visualization of large-
scale datasets
• Lecture on the art of (deep)
machine learning
• Hands-on session
- word2vec for natural
language processing (NLP)
- Apache Spark
Teaching Activities
Big Data Science
(Spring term)

18
Outline

19
TERRAIN CLASSIFICATION FOR
HYPERSPECTRAL IMAGES
Viktor Slavkovikj

20
• Hyperspectral images
- each pixel contains hundreds of measurements of the
electromagnetic spectrum
- often captured through remote sensing
• e.g., through a camera mounted on an airplane
• Problem: how to do terrain classification?
- e.g., corn, wheat, and woods
Problem Statement

21
Architecture Convolutional Neural Network
input layer
convolutional layer
convolutional layer
convolutional layer
fully connected layer
fully connected layer
output layer
output: one out of
16 terrain classes
800 hidden units
(hyperbolic tangent)
800 hidden units
(hyperbolic tangent)
filter size: 9x16
filter size: 1x16
filter size: 1x16
input: 9 pixels and
their spectral bands
implementation: by means of Python and Lasagne, a lightweight library to quickly
build and train neural networks in Theano
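
To make the layer sizes on this slide easier to follow, here is a minimal shape-tracing sketch in plain Python. It is not the Lasagne implementation mentioned above. It assumes stride-1 convolutions without padding, applies the three filter sizes in the order listed (9x16, then 1x16, then 1x16), and uses a hypothetical band count of 220, which the slide does not state.

```python
def conv_valid_shape(in_shape, filter_shape):
    """Output (height, width) of a stride-1 convolution without padding."""
    height, width = in_shape
    f_height, f_width = filter_shape
    if f_height > height or f_width > width:
        raise ValueError("filter does not fit into the input")
    return (height - f_height + 1, width - f_width + 1)


def trace_shapes(n_pixels, n_bands, filter_shapes):
    """Return the feature-map shape after the input and after each conv layer."""
    shapes = [(n_pixels, n_bands)]
    for filter_shape in filter_shapes:
        shapes.append(conv_valid_shape(shapes[-1], filter_shape))
    return shapes


# Filter sizes from the slide; the order of application is an assumption.
FILTER_SHAPES = [(9, 16), (1, 16), (1, 16)]
N_BANDS = 220  # hypothetical band count, not stated on the slide

shapes = trace_shapes(9, N_BANDS, FILTER_SHAPES)
for layer_index, shape in enumerate(shapes):
    print(layer_index, shape)
```

Under these assumptions, the 9x16 filter collapses the 9-pixel neighbourhood into a single row, and each 1x16 filter then shortens the spectral axis by 15 positions before the fully connected layers take over.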

22
Debugging the CNN
Learned filters (𝑥-axis: wavelength, 𝑦-axis: response)

23
• Data augmentation through the addition of Gaussian noise
- minor impact
- similar observation for max-pooling, ReLUs, and DropOut
• Classification results on par with the state-of-the-art
- overall accuracy between 80% and 95%
Experimental Results
Indian Pines test results: overall accuracy (%)
                 5% training data   10% training data   20% training data
Non-augmented    85.46 ± 1.73       92.76 ± 0.93        96.54 ± 0.47
Augmented        86.54 ± 0.30       92.70 ± 1.00        96.58 ± 0.55
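
The Gaussian-noise augmentation mentioned on this slide could look roughly like the sketch below. The noise level `sigma`, the number of copies, and the function name are assumptions for illustration only, since the slide does not give the settings that were used.

```python
import random


def augment_with_gaussian_noise(spectrum, sigma=0.01, copies=2, seed=None):
    """Return `copies` noisy variants of one spectrum (a list of band values)."""
    rng = random.Random(seed)
    return [[value + rng.gauss(0.0, sigma) for value in spectrum]
            for _ in range(copies)]


# Hypothetical 5-band spectrum, only used to show the call.
original = [0.10, 0.35, 0.80, 0.42, 0.15]
augmented = augment_with_gaussian_noise(original, sigma=0.01, copies=3, seed=0)
for variant in augmented:
    print([round(v, 3) for v in variant])
```

Each noisy copy keeps the label of its source pixel, so the training set grows without additional annotation effort.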

24
VIDEO CONTENT UNDERSTANDING
Baptist Vandersmissen

25
Goals
Representation
Learning using
Neural Networks
Spatial &
Temporal Feature
Construction
Generation of
Fine-grained
Descriptions
Focus on Video
Content
Understanding
objects, actions,
& scenes

26
Techniques
Main focus is on neural network techniques
that are able to capture temporal behaviour
3-D Convolutional
Neural Networks
Recurrent Long
Short-Term
Memory Networks
“Convolve over spatial
(2D) and/or temporal
domain (3D) to acquire
knowledge of input”
“Process sequence of
inputs and acquire
knowledge based on
memory cells”
Recurrent Reservoir
Computing
Networks
“Randomly assigned
weights in the reservoir,
combined with a
readout layer using
linear regression”
baseline video features: IDTF, AlexNet (ImageNet), C3D (FAIR)
implementation: Theano, Caffe, and Lasagne

27
Data
Focus on
UCF101
- Action recognition dataset
- ‘Realistic action videos’
- Well-known and widely used
Crawled Vine videos
- Social and mobile content
- Noisy and short-form data

28
First Exemplary Approach
video frames f1, f2, …, fn
→ Convolutional Neural Network
→ Representation f1, Representation f2, …, Representation fn
→ Long Short-Term Memory Network
→ Video Representation
→ Classification

29
Second Exemplary Approach
raw frames f1, f2, …, fn → Convolutional Neural Network
motion flows m1, m2, …, mk → Convolutional Neural Network
both streams → Fusion → Video Representation → Classification
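
One possible reading of the fusion step in this second approach is sketched below: the per-frame and per-flow feature vectors are each averaged over time and then concatenated into one video representation. The slide does not say how fusion is done, so mean pooling and concatenation are assumptions, and the feature values are made up.

```python
def mean_pool(vectors):
    """Average a list of equally sized feature vectors over time."""
    count = len(vectors)
    dim = len(vectors[0])
    return [sum(vector[i] for vector in vectors) / count for i in range(dim)]


def fuse(frame_features, flow_features):
    """Concatenate the pooled appearance and motion features."""
    return mean_pool(frame_features) + mean_pool(flow_features)


# Hypothetical CNN outputs: 2 raw frames (2-D each) and 3 motion flows (1-D each).
frame_features = [[1.0, 3.0], [3.0, 5.0]]
flow_features = [[2.0], [4.0], [6.0]]
video_representation = fuse(frame_features, flow_features)
print(video_representation)
```

The resulting vector would then be passed to the classifier. Note that the number of frames n and the number of flows k may differ, which pooling over time handles naturally.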

30
RESERVOIR COMPUTING FOR
VIDEO EVENT DETECTION
Azarakhsh Jalalvand

31
• Goal
- detect the status of a door: open, closed, half-open
- use of a simple, efficient, and effective system
• Approach
- use of a fixed low-resolution camera (30×30 pixels)
• privacy reasons: people are not recognizable
• low bandwidth needed to stream the data
- use of Reservoir Computing Networks (RCNs)
• good in modeling temporal information (cf. speech)
• good in dealing with noisy data
Video Event Detection (1/2)

32
• Implemented solution: small neural network of 200 nodes
- fast training
• reservoir: random assignment of connection weights
• readout layer: gradient descent for linear regression
- real-time response
- robust against noise
• low-light conditions & people appearing in the scene
Video Event Detection (2/2)
Reservoir
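
The two training ingredients named on this slide (randomly assigned reservoir weights and a readout layer fitted by gradient descent for linear regression) can be illustrated with the toy sketch below. It is not the door-detection system: it uses 20 nodes instead of 200, a made-up one-dimensional noisy signal instead of 30×30 video frames, and hypothetical values for the weight scaling, learning rate, and number of epochs.

```python
import math
import random


def build_reservoir(n_nodes, scale=0.9, seed=0):
    """Random, untrained input and recurrent weights."""
    rng = random.Random(seed)
    w_in = [rng.uniform(-1.0, 1.0) for _ in range(n_nodes)]
    # Dividing by n_nodes keeps each row's absolute sum at or below `scale`,
    # which keeps the recurrent dynamics stable.
    w_res = [[rng.uniform(-1.0, 1.0) * scale / n_nodes for _ in range(n_nodes)]
             for _ in range(n_nodes)]
    return w_in, w_res


def run_reservoir(inputs, w_in, w_res):
    """Collect the reservoir state (plus a bias term) for every time step."""
    n_nodes = len(w_in)
    state = [0.0] * n_nodes
    states = []
    for u in inputs:
        state = [math.tanh(w_in[i] * u
                           + sum(w_res[i][j] * state[j] for j in range(n_nodes)))
                 for i in range(n_nodes)]
        states.append(state + [1.0])
    return states


def predict(features, readout):
    return sum(f * r for f, r in zip(features, readout))


def mean_squared_error(states, targets, readout):
    return sum((predict(s, readout) - t) ** 2
               for s, t in zip(states, targets)) / len(states)


def train_readout(states, targets, lr=0.02, epochs=200):
    """Full-batch gradient descent on the squared error of a linear readout."""
    readout = [0.0] * len(states[0])
    for _ in range(epochs):
        grad = [0.0] * len(readout)
        for s, t in zip(states, targets):
            err = predict(s, readout) - t
            for k, f in enumerate(s):
                grad[k] += 2.0 * err * f / len(states)
        readout = [r - lr * g for r, g in zip(readout, grad)]
    return readout


# Toy data: a status that toggles every 25 steps, observed with noise.
rng = random.Random(1)
targets = [1.0 if (t // 25) % 2 else 0.0 for t in range(200)]
inputs = [y + rng.gauss(0.0, 0.2) for y in targets]

w_in, w_res = build_reservoir(n_nodes=20)
states = run_reservoir(inputs, w_in, w_res)
mse_before = mean_squared_error(states, targets, [0.0] * len(states[0]))
readout = train_readout(states, targets)
mse_after = mean_squared_error(states, targets, readout)
print(mse_before, mse_after)
```

Only the readout weights are trained, which is why training is fast: the reservoir itself is never updated.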

33
Demo

34
• Reservoir computing for visual content analysis
- handwritten digit recognition (MNIST)
- house number detection and recognition
Next Steps

35
TWITTER MICROPOST MODELING
Frederic Godin

36
Problem statement
Current Natural Language Processing (NLP) research focuses
on “clean” text: news articles, Wikipedia articles…
What about noisy, short-form, and unstructured microposts?
Lack of correct spelling, a lot of slang
Lack of context
Lack of consistent grammar rules (~structure)

37
A simple, general but effective
neural network architecture (1)
Use Google’s word2vec (=simplified neural network) to generate
good feature representations for words (=unsupervised learning)
Feed word representations to another neural network (NN) for any
classification task (=supervised learning)
Tweet → Feature representation → Machine learning: classification → Label
Feature representation: learn word2vec word representations once in advance
Machine learning: train a new NN for any NLP task

38
A simple, general but effective
neural network architecture (2)
Tweet → Feature representation → Machine learning: classification → Label
Example tweet: Im going from Seoul to Daejeon. #KTX
Window = 3: W(t-1) = from, W(t) = Seoul, W(t+1) = to
Lookup: each word in the window is mapped to an N-dim vector
Concatenate (3N-dim)
Feed forward neural network
Label(W(t))
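
The lookup-and-concatenate step for a window of three words can be sketched as follows. The tiny 2-dimensional embedding table stands in for real word2vec vectors, and the use of zero vectors for padding and unknown words is an assumption, as the slide does not say how those cases are handled.

```python
def window_features(tokens, embeddings, position, dim, window=3):
    """Concatenate the embeddings of the words around `position`."""
    half = window // 2
    features = []
    for offset in range(-half, half + 1):
        index = position + offset
        if 0 <= index < len(tokens):
            # Unknown words fall back to a zero vector (assumption).
            vector = embeddings.get(tokens[index].lower(), [0.0] * dim)
        else:
            # Positions outside the tweet are zero-padded (assumption).
            vector = [0.0] * dim
        features.extend(vector)
    return features


# Hypothetical 2-D embeddings; real word2vec vectors would be N-dimensional.
embeddings = {"from": [1.0, 2.0], "seoul": [3.0, 4.0], "to": [5.0, 6.0]}
tokens = "Im going from Seoul to Daejeon. #KTX".split()

center = tokens.index("Seoul")
features = window_features(tokens, embeddings, center, dim=2)
print(features)
```

The resulting 3N-dimensional vector is what the feed-forward network would receive in order to predict the label of the center word W(t).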

39
Word2vec: automatically learning good features
Model trained on 400 million tweets having 5 billion words
2-D projection of a 400-D space of the top 1000 words used on Twitter

40
Part-of-Speech tagging: is it a verb, noun or article?
Input window: Im, going, from
Lookup: 400D, 400D, 400D
FFNN: 400 hidden nodes
Output: Verb
Figure annotation: slang
NIPS Workshop on Modern Machine Learning Methods and Natural Language Processing

41
Named Entity Recognition:
is it a location, company or TV show (1)?
Input window: from, Seoul, to
Lookup: 400D, 400D, 400D
FFNN: 400 hidden nodes
Output: Location
The same word representations
The same network, but with different weights

42
Named Entity Recognition:
is it a location, company or TV show (2)?
Used both “standard” features and word representations
Only using word representations
ACL 2015 Workshop on Noisy User-generated Text

43
Next Steps
Replace word2vec word representations with character
representations
Use Convolutional Neural Networks as pattern filters, to prevent a
huge increase in vocabulary size (e.g., a convolutional filter should be
able to map "the" and "da" onto the same pattern)
Combine character representations to form word representations
that can be classified

44
HUMOR DETECTION ON TWITTER
Abhineshwar Tomar

45
• Observation
- lots of humor on Twitter
• Question
- can we automatically detect
humorous tweets?
• Motivation
- humor is engaging (ads!)
- creation of intelligent agents
with social & emotional skills
Humor Detection on Twitter

46
• Different kinds of humor
- sarcastic humor
- black humor
- self-deprecating humor
- satire
- parody
• Personal context
• Multimodal tweets
• Language usage
Why Humor Detection on Twitter Is Challenging

47
• Binary classification problem: humorous or non-humorous
• Collection of tweets in English
- tweets containing #lol, #rofl, #lmao, #funny, #hilarious, …
- dataset of 373,498 tweets
• 50/50 humorous and non-humorous
• Features
- word2vec
• Classification technique
- feed-forward neural network with ReLUs
Approach (1/2)
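
A hashtag-based labeling step in the spirit of this slide might look like the sketch below. It only uses the five hashtags that are spelled out above (the slide's list continues with "…", which is left open here). Removing the labeling hashtags from the text before training is an assumption, added so that a classifier cannot simply memorize them.

```python
import re

# Only the hashtags named on the slide; the full list is not given.
HUMOR_HASHTAGS = {"#lol", "#rofl", "#lmao", "#funny", "#hilarious"}
HASHTAG_PATTERN = re.compile(r"#\w+")


def label_tweet(tweet):
    """Label a tweet as humorous if it carries one of the humor hashtags."""
    tags = {tag.lower() for tag in HASHTAG_PATTERN.findall(tweet)}
    return "humorous" if tags & HUMOR_HASHTAGS else "non-humorous"


def strip_label_hashtags(tweet):
    """Remove the labeling hashtags but keep all other hashtags."""
    cleaned = HASHTAG_PATTERN.sub(
        lambda match: "" if match.group(0).lower() in HUMOR_HASHTAGS else match.group(0),
        tweet,
    )
    return " ".join(cleaned.split())


examples = [
    "Watermelon inside of a watermelon!! #LOL",
    "Im going from Seoul to Daejeon. #KTX",
]
for tweet in examples:
    print(label_tweet(tweet), "|", strip_label_hashtags(tweet))
```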

48
Approach (2/2)
300-D
tweet
vector
Google’s
word2vec
Humorous/
non-
humorous
Feed-forward
neural network
300-D input layer
400-D hidden ReLU layer
2-D output layer
Tweet
Please kill Jar Jar Binks
please
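
The forward pass of the network on this slide (300-D input, 400-D hidden ReLU layer, 2-D output) can be sketched in plain Python as shown below. The weights are random and untrained, the word vectors are random stand-ins for Google's word2vec vectors, and averaging the word vectors to obtain the 300-D tweet vector is an assumption, since the slide does not say how the tweet vector is built.

```python
import math
import random


def tweet_vector(tokens, embeddings, dim):
    """Average the vectors of all known words (assumed pooling strategy)."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    bound = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-bound, bound) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(z):
    peak = max(z)
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, hidden_layer, output_layer):
    hidden = [max(0.0, h) for h in dense(x, hidden_layer)]  # ReLU
    return softmax(dense(hidden, output_layer))


rng = random.Random(0)
DIM = 300
tokens = "please kill jar jar binks".split()
# Random stand-ins for pretrained word2vec vectors.
embeddings = {t: [rng.gauss(0.0, 1.0) for _ in range(DIM)]
              for t in sorted(set(tokens))}
hidden_layer = init_layer(DIM, 400, rng)
output_layer = init_layer(400, 2, rng)

probs = forward(tweet_vector(tokens, embeddings, DIM), hidden_layer, output_layer)
print(dict(zip(["humorous", "non-humorous"], probs)))
```

With untrained weights the two probabilities carry no meaning. The sketch only shows how the dimensions on the slide fit together.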

49
Classification accuracy: 81.07%
Preliminary Results
Humorous Tweets
You know you're at a Croatian jam whn your uncle forces
you to take shots .....
I've finally learned how to play spades
Watermelon inside of a watermelon!! My fav vine!
Some boys will wear dark sunglasses in Church, then be
blaming God later when they end up as Welders
It's so weird to thing that over in the other side of the
country there are people going to sleep while I'm getting
up
Got a new TV set for downstairs and my dad said "I bet I
can do this in 15 minutes" and almost 1 hour later it's
nearly finished
#RapLikeLilWayne I walk while I sleep. Call that Sleep
walkin!!!! #whaddup

50
• Collect more training data by making use of Reddit
• Experimentation with recurrent neural network techniques
• Multimodal word/concept vector representations,
integrating both textual and visual information
Next Steps

51
MULTIMODAL CONDITION
MONITORING FOR WIND TURBINES
Olivier Janssens

52
Healthy wind turbine Broken wind turbine
• Multi-sensor monitoring of bearings to detect faults early on
- infrared imaging, vibration data, and temperature data
• Classification
- white box models: random decision forests and SVMs
- black box models: CNNs
Condition Monitoring: Failure Prevention

53
Condition Monitoring: Smearing Fault Detection

54
• Infrared imaging analysis
- handcrafted features + SVM: accuracy of 88.25%
• Vibration data analysis
- handcrafted features + RDF: accuracy of 87.25%
- CNN: accuracy of 91.77%
• Ongoing research: ensembling
- creation of a multimodal system using early and/or late fusion
Some Observations
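
As a rough illustration of the late-fusion option mentioned under ongoing research, the sketch below averages the class probabilities produced by one classifier per modality. The class names, the probability values, and the equal weighting are all hypothetical. Early fusion would instead combine the features of the modalities before a single classifier is trained.

```python
def late_fusion(probabilities_by_modality, weights=None):
    """Weighted average of per-modality class probabilities."""
    names = list(probabilities_by_modality)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    n_classes = len(probabilities_by_modality[names[0]])
    return [
        sum(weights[name] * probabilities_by_modality[name][c] for name in names) / total
        for c in range(n_classes)
    ]


CLASSES = ["healthy", "faulty"]  # hypothetical class names
scores = {
    "infrared": [0.8, 0.2],   # e.g. output of an image-based classifier
    "vibration": [0.4, 0.6],  # e.g. output of a vibration-based classifier
}
fused = late_fusion(scores)
decision = CLASSES[fused.index(max(fused))]
print(fused, decision)
```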

55
GENOMIC DATA COMPRESSION
Tom Paridaens

56
• Challenge: data handling
- DNA sequencing is outrunning
DNA storage, transmission, and
analysis
• Research question
- how about compressing DNA by making use of video coding
tools in order to alleviate storage, transmission, and analysis
problems?
Problem Statement

57
• Modular and extensible
- thanks to the use of the pipes and filters design pattern
• Block-based compression
- allows selecting the best coding tool per block (adaptivity)
- enables random access, streaming, and parallel processing
Codec Architecture (1/2)
Figure: input filter, encoding filter, and output filter connected by pipes, with a statistics component
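
The combination of pipes and filters with per-block tool selection can be sketched with Python generators, as below. The general-purpose compressors from the standard library (zlib, bz2, lzma) are stand-ins chosen for illustration and are not the coding tools of the proposed codec. The block size and the toy sequence are likewise assumptions.

```python
import bz2
import lzma
import zlib

# Stand-in coding tools as (encode, decode) pairs; tool 0 stores the block raw.
CODING_TOOLS = {
    0: (lambda block: block, lambda payload: payload),
    1: (zlib.compress, zlib.decompress),
    2: (bz2.compress, bz2.decompress),
    3: (lzma.compress, lzma.decompress),
}


def input_filter(data, block_size):
    """Split the input into fixed-size blocks."""
    for start in range(0, len(data), block_size):
        yield data[start:start + block_size]


def encoding_filter(blocks):
    """Encode each block with every tool and keep the smallest result."""
    for block in blocks:
        candidates = ((tool_id, encode(block))
                      for tool_id, (encode, _) in CODING_TOOLS.items())
        yield min(candidates, key=lambda candidate: len(candidate[1]))


def output_filter(encoded_blocks):
    """Collect (tool_id, payload) pairs; a real codec would write a bitstream."""
    return list(encoded_blocks)


def decode_block(encoded_block):
    tool_id, payload = encoded_block
    return CODING_TOOLS[tool_id][1](payload)


BLOCK_SIZE = 4096
sequence = b"ACGT" * 2000 + b"GATTACA" * 500 + bytes(range(256))  # toy input

# The generators play the role of filters; chaining them plays the role of pipes.
encoded = output_filter(encoding_filter(input_filter(sequence, BLOCK_SIZE)))

for tool_id, payload in encoded:
    print(tool_id, len(payload))

# Random access: decode only the second block.
second_block = decode_block(encoded[1])
```

Because every block records which tool encoded it, blocks can be decoded independently, which is what enables random access, streaming, and parallel processing.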

58
Codec Architecture (2/2)
Efficiency
Functionality
Effectiveness
Proposed
solution
SOTA
allowing for a flexible trade-off between
efficiency, effectiveness, and functionality
has always been a major design goal

59
• Effectiveness: compression of the human Y chromosome
• Efficiency
- < 3 minutes: 4.30 MB
- 10 minutes: 4.21 MB
- 7 hours: 3.75 MB
Experimental Results
Format                                        File size (MB)
No compression (FASTA)                        18.70
Binary                                        7.01
Huffman                                       5.16
Proposed framework (December 2014)            4.26
Proposed framework with CABAC (August 2015)   3.75
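
The file sizes in this table can be turned into compression ratios and approximate bits per symbol with a few lines of arithmetic. The bits-per-symbol figure assumes that the uncompressed FASTA file spends about one byte per symbol and ignores headers and line breaks. That is an assumption made here, not something the slide states.

```python
# File sizes in MB, copied from the table on this slide.
SIZES_MB = {
    "No compression (FASTA)": 18.70,
    "Binary": 7.01,
    "Huffman": 5.16,
    "Proposed framework (December 2014)": 4.26,
    "Proposed framework with CABAC (August 2015)": 3.75,
}
FASTA_MB = SIZES_MB["No compression (FASTA)"]


def compression_ratio(size_mb):
    """How many times smaller the file is than the FASTA original."""
    return FASTA_MB / size_mb


def bits_per_symbol(size_mb):
    """Approximate bits per symbol, assuming one byte per symbol in FASTA."""
    return 8.0 * size_mb / FASTA_MB


for name, size_mb in SIZES_MB.items():
    print(f"{name}: ratio {compression_ratio(size_mb):.2f}, "
          f"{bits_per_symbol(size_mb):.2f} bits/symbol")
```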

60
• Compression
- support for the protein alphabet
- performance optimizations (I/O, GPU)
• Privacy protection and streaming
- encryption
• Compressed-domain manipulation
- only download and decode that part of the compressed
genome that belongs to a particular gene (region-of-interest)
• DCC + MPEG standardization
Future Activities
Past
Future

61
Outline

62
Deep Learning for Biotech Data
Deep machine
learning
Multimedia
data
Biotech
data
Songdo
Ghent
important: unique (specialized) use cases and corresponding data sets,
given the current speed of change in the field of deep learning

63
Use Case 1: Quantification of Parasite Movement

64
• What?
- commonly used in chemistry to create a fingerprint by
which molecules can be identified
• Applications
- medical diagnosis and food analysis, among others
Use Case 2: Raman Spectroscopy (1/2)

65
• Challenges?
- data: vast amounts of data
- device: different devices,
different characteristics
- noise: environment, side effects
- composite materials:
overlapping signals
• Goal
- noise-robust automatic Raman spectrum identification
using signal processing and machine learning techniques
Use Case 2: Raman Spectroscopy (2/2)
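
To give a feel for what noise-robust spectrum identification could involve, here is a deliberately simple baseline: smooth the measured spectrum with a moving average and pick the reference spectrum with the highest cosine similarity. This is not the method pursued in the project (the slide only names signal processing and machine learning in general). The synthetic single-peak spectra and the material names are hypothetical.

```python
import math
import random


def moving_average(signal, width=5):
    """Smooth a signal; windows are truncated at the edges."""
    half = width // 2
    smoothed = []
    for i in range(len(signal)):
        window = signal[max(0, i - half):i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(spectrum, references, width=5):
    """Return the best-matching reference name and all similarity scores."""
    smoothed = moving_average(spectrum, width)
    scores = {name: cosine_similarity(smoothed, moving_average(ref, width))
              for name, ref in references.items()}
    return max(scores, key=scores.get), scores


def gaussian_peak(center, length=100, spread=3.0):
    """Synthetic spectrum with a single peak (toy data)."""
    return [math.exp(-((i - center) ** 2) / (2 * spread ** 2))
            for i in range(length)]


# Hypothetical reference library with two materials.
references = {"material_a": gaussian_peak(20), "material_b": gaussian_peak(60)}

rng = random.Random(0)
noisy = [value + rng.gauss(0.0, 0.05) for value in references["material_a"]]
best, scores = identify(noisy, references)
print(best, scores)
```

Such a baseline is unlikely to cope with the challenges listed on this slide, such as device differences and overlapping signals from composite materials, which is where learned models become attractive.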

66
iMinds & ETRI
R&D collaboration in the field of IoT,
Big Data, and network communication (5G)
joint international research labs
(in Songdo?)

67
• Memoranda of Understanding (MoU)
• Joint research projects (legal status GUGC-K?)
• Joint doctoral degrees
• Visits of master’s and Ph.D. students in Spring 2016?
- GPU cluster in Songdo (for deep CNNs, among others)
• 4 Xeon CPUs
• 8 Titan Black GPUs with 96 GB of memory
• 128 GB of system memory
• 2 TB SSD + 16 TB of storage capacity
• 3200 Watt of power consumption
Further Ideas
