It is all about AI
Mark Liao
Institute of Information Science
Academia Sinica, Taiwan
(TAAI 2016)
Contents of this talk
• Automatic Concert Video Mashup
• Spatio-Temporal Learning of Basketball
Offensive Strategies
Automatic Concert Video
Mashup
What is concert video mashup?
• A concert video mashup process takes all the videos captured from
different locations in a concert hall and converts them into a
complete, non-overlapping, seamless, high-quality result.
Why concert video mashup?
• To give people who could not attend the live concert a second
chance to enjoy the performance at similar quality.
Many problems to be solved!
• The videos are captured without coordination, so gaps and
redundancy are always present.
• The order in which to watch the videos is often unclear.
• The videos are captured with handheld devices, so their
visual and audio quality cannot be guaranteed.
Issues that need to be addressed
• Viewing order
• Visual quality optimization
• Seamless soundtrack connection
• No redundancy
• No missing video segments
• Mashup results that follow the rules defined by the
language of film
Potential Issues: Viewing order (1/5)
• Three video clips captured from three different angles and
distances: clips 1 and 2 partially overlap, clip 3 is independent.
Potential Issues: Multiple audio sequence
alignment (2/5)
Case 1: partially overlapped
Case 2: no overlap
Potential Issues (3/5)
• Among three videos that are coherent in time but shot from
three different locations, which one should be chosen?
-- Follow the rules of the language of film!
Medium Shot | Long Shot | Extreme Long Shot
Potential Issues (4/5)
• Among several qualified video clips shot from the same
distance, which one should be chosen?
-- Visual quality? Audio quality?
Extreme Long Shot | Extreme Long Shot
Potential Issues (5/5)
• How can the emotion, ideas, and artistry of a music director
be brought into a concert video mashup process? Can a CNN
learn facial emotion?
Previous Efforts
• The research area closest to "automatic video mashup" is
"summarization of multi-view videos".
• The objective of the latter is to produce a reduced set of
abstracted videos or key-frame sequences that represent the
most prominent parts of the input videos.
Literature related to video mashup (1/3)
• [Shrestha et al.] formulate video mashup as an
optimization problem
- pros – optimizes visual quality and
diversity constraints
- cons – does not take into account the
professional view of a visual storytelling director
P. Shrestha et al., Automatic mashup generation from multiple-camera
concert recordings, ACM MM, 2010.
Literature related to video mashup (2/3)
• [Wu et al.] use pre-defined rules to solve
the frequent-shot-change problem
- pros – solves part of the shot-change
problem
- cons – does not involve a visual storytelling
director to guide the video mashup process
Wu et al., MoVieUp: Automatic mobile video mashup, IEEE TCSVT,
2015.
Literature related to video mashup (3/3)
• [Saini et al.] introduce visual storytelling rules by dividing
the audience seats into six shooting locations and then
calculating statistics of shot transitions and shot lengths from
professionally edited videos
- pros – a good start, introducing the views of
professional experts
- cons – the shot types are defined by the authors themselves, not by
the rules of the language of film
Saini et al., MoViMash: Online mobile video mashup, ACM MM, 2012.
Introduction
• An experienced movie director frequently uses
camera-work practices in visual storytelling.
[Figure: song structure timeline: Intro, Verse, Chorus, Bridge, ...]
Introduction
• Applications
– Mashup
– Emotion (music video)
Introduction
• According to the language of film [3], shot size
is one of the basics of filmmaking.
Long Shot | Close-Up
Introduction
• Definitions of six types of shots [3].
Introduction
• According to the definitions in the language of film [3], a
concert video contains eight types of camera shots.
Musical Instrument Shot (MIS) | Audience Shot (ADS)
Introduction
• Two images from an official concert video of the song “93
Million Miles” by Jason Mraz, live in Hong Kong, 2012.
System Framework for Video Mashup
Shot Classification based on
EW-Deep-CCM
• Error-Weighted Deep Cross-Correlation Model
Object Representation (VGG-Net)
• Object representation using a 16-layer VGG-Net
• We extract features from the output layer and the two
fully-connected layers as the object representations; the feature
dimensions are 1000-D, 4096-D, and 4096-D, respectively.
Object Representations (1/2)
ImageNet1000
object representation
Object Representations (2/2)
Literature related to fusion strategies
• Early fusion
– Pros:
Takes advantage of combining various feature cues
– Cons:
The high-dimensional feature set can easily suffer from
data sparseness and strains computational
resources.
Literature related to fusion strategies
• Late fusion
– Pros:
Does not increase the dimensionality
Makes it possible to interpret the performance of the different classifiers and gain insight
into the role of multiple modalities during emotional expression
– Cons:
The assumption of conditional independence among multiple
modalities is inappropriate.
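The two fusion strategies can be contrasted with a minimal sketch. It assumes toy inputs: the feature dimensions (1000-D, 4096-D, 4096-D) come from the VGG-Net slide, while the function names, the posterior values, and the equal weights are illustrative assumptions and are not part of EW-Deep-CCM.

```python
def early_fusion(features):
    """Early fusion: concatenate the per-layer feature vectors into one vector."""
    fused = []
    for vec in features:
        fused.extend(vec)
    return fused


def late_fusion(posteriors, weights):
    """Late fusion: weighted average of the per-classifier posterior distributions."""
    n_classes = len(posteriors[0])
    total = sum(weights)
    return [
        sum(w * p[c] for w, p in zip(weights, posteriors)) / total
        for c in range(n_classes)
    ]


# Dimensions from the slides: output layer 1000-D, two fully-connected layers 4096-D each.
dims = [1000, 4096, 4096]
features = [[0.0] * d for d in dims]
fused = early_fusion(features)  # the dimensionality grows to 1000 + 4096 + 4096

# Hypothetical posteriors of three classifiers over three shot types.
posteriors = [[0.7, 0.2, 0.1], [0.5, 0.4, 0.1], [0.2, 0.6, 0.2]]
combined = late_fusion(posteriors, weights=[1.0, 1.0, 1.0])
best_class = max(range(len(combined)), key=lambda c: combined[c])
```

Early fusion pays for its richer input with a 9192-D vector, while late fusion keeps each classifier small but treats the classifiers as independent.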
Shot Classification based on
EW-Deep-CCM
• A novel fusion strategy, the Error-Weighted Deep Cross-Correlation
Model (EW-Deep-CCM), is proposed to effectively combine the extracted
multilayer object representations.
Experimental Results
• Comparison of shot type classification (other
methods)
• EW-Deep-CCM achieves a detection rate of only 83%
• The remaining 17% error, i.e., roughly a 1/6 error rate,
causes frequent shot changes
A 17% error rate causes too many
shot changes
Conditional Random Field (CRF)-based
Approach
• 1st trial: fixed 30-frame window (not a systematic way to
smooth the results)
• 2nd trial: Recurrent Neural Network (RNN)
-- Problem: an RNN needs pre-segmented data to give its best results,
but the generated shot type classification results are not well
segmented
• 3rd trial: Conditional Random Field (CRF)
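The first trial can be illustrated with a minimal sketch of fixed-window smoothing. It assumes the smoothing is a majority vote over a centered window, which the slides do not specify. The label string and the window of 5 are toy values chosen for readability (the slides use 30 frames).

```python
from collections import Counter


def smooth_labels(labels, window=30):
    """Replace each frame label by the majority label in a centered window.

    The window spans window // 2 frames on each side of the current frame.
    """
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        lo = max(0, i - half)
        hi = min(len(labels), i + half + 1)
        counts = Counter(labels[lo:hi])
        smoothed.append(counts.most_common(1)[0][0])
    return smoothed


def count_shot_changes(labels):
    """Number of positions where the label differs from the previous frame."""
    return sum(1 for a, b in zip(labels, labels[1:]) if a != b)


# Hypothetical per-frame shot labels (C, L, M) with a few isolated errors.
noisy = list("CCCCLCCCCCCCMCCCLLLLLLCLLLLL")
smoothed = smooth_labels(noisy, window=5)
changes_before = count_shot_changes(noisy)
changes_after = count_shot_changes(smoothed)
```

The vote removes isolated errors, but the window length is a hand-picked constant, which is why the slide calls this approach unsystematic.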
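Our Method – Coherent-Net
• Shot Type Refinement (CRF)
Our Method – Coherent-Net Framework
• Shot Type Refinement (CRF) models $P(\mathbf{w}\mid\mathbf{w}')$; EW-Deep-CCM provides $P(\mathbf{w}'\mid\mathbf{O})$; together they give $P(\mathbf{w}\mid\mathbf{O})$:
$$
P(\mathbf{w}\mid\mathbf{O}) = \sum_{\mathbf{w}'} P(\mathbf{w},\mathbf{w}'\mid\mathbf{O})
\approx P(\mathbf{w}\mid\mathbf{w}')\cdot P(\mathbf{w}'\mid\mathbf{O})
\approx \underbrace{P(\mathbf{w}\mid\mathbf{w}')}_{\text{CRF}}\cdot\underbrace{\prod_{n=1}^{N} P(w'_n\mid o_n)}_{\text{EW-Deep-CCM}}
$$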
EW-Deep-CCM
[Equation: $P(w'\mid o)$ is approximated by a triple sum over $i = 1,\ldots,C$, $j = 1,\ldots,D$, $k = 1,\ldots,K$ of a product of three kinds of terms computed from the output-layer ("out") and fully-connected-layer ("fc") features: likelihoods (DNN posterior probabilities), cross-correlation terms, and empirical weights.]
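Shot Type Refinement (CRF)
$$
\mathbf{w}' = (w'_1, \ldots, w'_{t-1}, w'_t), \qquad \mathbf{w} = (w_1, w_2, w_3, \ldots, w_{t-1}, w_t)
$$
$$
P(\mathbf{w}\mid\mathbf{w}') = \frac{1}{Z(\mathbf{w}')}\exp\Big(\sum_j F_j(\mathbf{w},\mathbf{w}')\Big), \qquad
Z(\mathbf{w}') = \sum_{\mathbf{w}}\exp\Big(\sum_j F_j(\mathbf{w},\mathbf{w}')\Big)
$$
$$
P(\mathbf{w}\mid\mathbf{w}') \propto \frac{1}{Z(\mathbf{w}')}\exp\Big(\sum_t\sum_j \lambda_j\, t_j(w_{t-1}, w_t, \mathbf{w}') + \sum_t\sum_j \mu_j\, s_j(w_t, \mathbf{w}')\Big)
$$
$$
\propto \frac{1}{Z(\mathbf{w}')}\exp\Big(\sum_t\sum_{m,n\in S}\lambda_{mn}\,\mathbf{1}_{\{w_t=m\}}\mathbf{1}_{\{w_{t-1}=n\}} + \sum_t\sum_{m\in S}\sum_{o\in O}\mu_{om}\,\mathbf{1}_{\{w_t=m\}}\mathbf{1}_{\{w'_t=o\}}\Big)
$$
State transition (pairwise potential):
$$
t_j(w_{t-1}, w_t, \mathbf{w}') = \begin{cases} 1 & \text{when } w_{t-1}=n \text{ and } w_t=m \\ 0 & \text{otherwise} \end{cases}
$$
State-observation pair (unary potential):
$$
s_j(w_t, \mathbf{w}') = \begin{cases} 1 & \text{when } w'_t=o \text{ and } w_t=m \\ 0 & \text{otherwise} \end{cases}
$$
Example: CLCCCC → CCCCCC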
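A linear-chain CRF of this form can be decoded with the Viterbi algorithm. The sketch below is a minimal illustration under stated assumptions: the weight tables `lam` (state transition, keyed by previous and current label) and `mu` (state-observation pair, keyed by observed and refined label) hold hand-picked hypothetical values, whereas in the talk they would be learned. The slides do not say which decoder is used.

```python
def viterbi(observed, states, lam, mu):
    """Return the label sequence w that maximizes
    sum_t lam[(w_{t-1}, w_t)] + sum_t mu[(w'_t, w_t)]
    for the observed (noisy) label sequence w'."""
    # best[s]: best score of any partial path that ends in state s
    best = {s: mu[(observed[0], s)] for s in states}
    back = []
    for obs in observed[1:]:
        prev_best = best
        best = {}
        pointers = {}
        for s in states:
            p = max(states, key=lambda q: prev_best[q] + lam[(q, s)])
            best[s] = prev_best[p] + lam[(p, s)] + mu[(obs, s)]
            pointers[s] = p
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda s: best[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


states = ["C", "L"]
# Hypothetical weights: reward staying in the same shot type and
# reward agreeing with the label predicted by the classifier.
lam = {(p, s): (1.0 if p == s else -1.0) for p in states for s in states}
mu = {(o, s): (1.5 if o == s else 0.0) for o in states for s in states}

refined = "".join(viterbi(list("CLCCCC"), states, lam, mu))
kept = "".join(viterbi(list("CCCLLL"), states, lam, mu))
```

With these weights the isolated "L" in CLCCCC is overruled by its neighbors, while the sustained change in CCCLLL is kept.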
Experiments – Official Demo 1
• The song “Skyfall” by Adele, performed at the Oscars 2013
Experiments – Official Demo 2
• The song “When I Was Your Man” by Bruno Mars,
performed at BBC Radio 1's Big Weekend 2013
System Framework for Video Mashup
Problem & Goal
• A concert video mashup process needs to align the videos
taken by different audience members onto a common timeline.
Literature Review
• Audio fingerprinting
• Problems
– Originally designed for audio identification rather than
time alignment.
– Easily causes audio signal distortion
• Zhu et al. treat audio identification as an image
matching problem (significant performance improvement).
• B. Zhu et al., “A novel audio fingerprinting method robust to time
scale modification and pitch shifting,” ACM MM, 2010.
Our Method
• We modified Zhu’s method to address the multiple
audio sequence alignment problem.
– Auditory image (spectrogram) construction:
1-D audio signal (waveform) → short-time Fourier transform →
2-D auditory image (time-frequency representation, i.e., spectrogram)
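The waveform-to-spectrogram step can be sketched as a naive short-time Fourier transform. This is an illustrative implementation, not the one used in the talk. The frame length, hop size, Hann window, and the 1 kHz test tone sampled at 8 kHz are all assumptions.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=64, hop=32):
    """Magnitude spectrogram: one row per frame, one column per frequency bin."""
    # A Hann window reduces spectral leakage at the frame boundaries.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(n_bins):
            # Direct DFT of the windowed frame at frequency bin k.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


sample_rate = 8000   # assumed sampling rate in Hz
tone_hz = 1000.0     # assumed test tone
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(512)]
spec = stft_magnitude(signal)
# Bin k corresponds to k * sample_rate / frame_len Hz, i.e. 125 Hz per bin here.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

Treating `spec` as a 2-D image is what allows image features such as SIFT to be applied to audio.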
Our Method
– Audio Sequence Alignment
(1) Boundary candidate selection (based on SIFT alignment)
- where a is a SIFT feature in audio sequence A, b is the closest
feature to a in B, and b' is the second-closest feature to a in B.
$$
BC = \begin{cases} \text{Yes}, & \text{if } D(a,b) < c \cdot D(a,b') \\ \text{No}, & \text{otherwise} \end{cases}
$$
BC: boundary candidate
D(.): Euclidean distance
c: a constant (c=0.7)
(In the figure, yellow lines are boundary candidates.)
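The selection rule above is a nearest-neighbor ratio test and can be sketched as follows. The 2-D feature vectors are toy stand-ins for SIFT descriptors, and the function names are illustrative assumptions. Only the rule D(a, b) < c · D(a, b') with c = 0.7 comes from the slide.

```python
import math


def euclidean(u, v):
    """Euclidean distance D(u, v) between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def boundary_candidates(features_a, features_b, c=0.7):
    """Return (index_in_A, index_in_B) pairs that pass D(a, b) < c * D(a, b')."""
    candidates = []
    for i, a in enumerate(features_a):
        ranked = sorted(range(len(features_b)),
                        key=lambda j: euclidean(a, features_b[j]))
        if len(ranked) < 2:
            continue  # the test needs a closest and a second-closest feature
        b, b_prime = ranked[0], ranked[1]
        if euclidean(a, features_b[b]) < c * euclidean(a, features_b[b_prime]):
            candidates.append((i, b))
    return candidates


# Toy descriptors: the first feature of A has one clear match in B,
# the second is equally close to two features of B and is rejected.
features_a = [[0.0, 0.0], [5.0, 5.0]]
features_b = [[0.1, 0.0], [3.0, 0.0], [5.0, 4.0], [5.0, 6.0]]
matches = boundary_candidates(features_a, features_b)
```

The test keeps a match only when the best candidate is clearly better than the runner-up, which filters out ambiguous matches in repetitive music.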
Our Method
– Audio Sequence Alignment
(2) Boundary candidate refinement.
- A window distortion measure (WDM) is defined to refine each
boundary candidate.
Our Method
– Audio Sequence Alignment
(3) Final boundary decision.
- The alignment result is determined by the refined boundary
candidate with the minimum window distortion.
DEMO 1
• “I’m Yours” by Jason Mraz live at Singapore 2012
– with context search (Aligned in 49.8001 s)
Timeline: 00:00:00 to 00:00:49.8001
Recording #4
Recording #5
+0.4334 s
DEMO 2
• “All I Ask” by Adele live at Birmingham Genting Arena 2016
– with context search (Aligned in 53.2169 s)
Timeline: 00:00:00 to 00:00:53.2169
Recording #1
Recording #2
+0.5502 s
Demo - Multiple Audio Sequence Alignment Result
Timeline: 00:00:00 | 00:00:52.4893 | 04:00:2277
00:00:52.7667 | 03:58:8667
Audience #1
Audience #2
Audience #3
Learning Professional Recording Skill
• Initial prob.
• Duration (frames/shot)
• Shot transition (prob.)
• Shot Type Refinement (CRF), Coherent-Net
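The three statistics listed above (initial probability, shot duration, shot transition probability) could be estimated from frame-level shot labels of professionally edited videos roughly as follows. This is a minimal sketch: the label sequences are hypothetical, and plain relative frequencies are an assumption, since the slides do not describe the estimator.

```python
from collections import Counter, defaultdict


def to_shots(frame_labels):
    """Collapse a frame-level label sequence into (shot_type, n_frames) runs."""
    shots = []
    for label in frame_labels:
        if shots and shots[-1][0] == label:
            shots[-1][1] += 1
        else:
            shots.append([label, 1])
    return [(label, n) for label, n in shots]


def learn_statistics(videos):
    """Estimate initial probabilities, mean durations, and transition probabilities.

    Each video is a non-empty list of per-frame shot labels.
    """
    initial = Counter()
    durations = defaultdict(list)
    transitions = defaultdict(Counter)
    for frame_labels in videos:
        shots = to_shots(frame_labels)
        initial[shots[0][0]] += 1
        for label, n in shots:
            durations[label].append(n)
        for (a, _), (b, _) in zip(shots, shots[1:]):
            transitions[a][b] += 1
    init_prob = {s: c / sum(initial.values()) for s, c in initial.items()}
    mean_duration = {s: sum(d) / len(d) for s, d in durations.items()}
    trans_prob = {a: {b: c / sum(row.values()) for b, c in row.items()}
                  for a, row in transitions.items()}
    return init_prob, mean_duration, trans_prob


# Hypothetical labels: L = long shot, M = medium shot, C = close-up.
videos = [list("LLLLMMCCCCLL"), list("MMMMCCLLLL")]
init_prob, mean_duration, trans_prob = learn_statistics(videos)
```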
System Framework for Video Mashup
Demo - Mashup Result
mr#1
mr#2
mr#3
Spatio-Temporal Learning
of Basketball Offensive
Strategies
Motivations
• To develop an automatic tactics analysis
tool for coaches, players, and the general
public.
• To develop a new technique that can
compete with existing tools, such as
SportVU, at a much lower price.
Methodology Adopted
• Analyze group behavior directly from the
court view of an NBA broadcast video
• Detect and track each offensive player,
compute their trajectories, and map these
trajectories from the court view to the tactic board
for analysis
Motivation (1)
Motivation (2)
Motivation (3)
• Unknown Offense Video Clip
90% → Screen Cut
10% → Princeton
• 6 cameras above the court
• No close-up view
→ Unable to see the details of plays
SportVU videos | Broadcast videos
Tracked data | Tracked data
SportVU system | Our tracking system
?
How to extract features from an offense video clip?
• Automatic player detection
• Automatic player tracking
• Mapping of the extracted trajectories from the basketball
court to the tactic board
Step 2: Derive correct player trajectories on the
panorama court (3/3)
Step 3: Map trajectories from the panorama court to the
tactic board
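One common way to map points between two views of a planar court is a 3×3 homography. The slides do not state which mapping is used, so the sketch below is an assumption, and the matrix `H` is a made-up example (scale by 0.5, shift by (10, 20)) rather than a calibrated one.

```python
def apply_homography(H, point):
    """Map one (x, y) point through a 3x3 homography given as nested lists."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    # Divide by the homogeneous coordinate to return to 2-D.
    return (xh / w, yh / w)


def map_trajectory(H, trajectory):
    """Map every point of a player trajectory to tactic-board coordinates."""
    return [apply_homography(H, p) for p in trajectory]


# Hypothetical homography: scale by 0.5, then translate by (10, 20).
H = [[0.5, 0.0, 10.0],
     [0.0, 0.5, 20.0],
     [0.0, 0.0, 1.0]]
court_trajectory = [(100.0, 40.0), (120.0, 60.0)]
board_trajectory = map_trajectory(H, court_trajectory)
```

In practice `H` would be estimated from at least four point correspondences between the panorama court and the tactic board, for example court corners and line intersections.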
What’s next?
– Tactics analysis based on the
spatiotemporal trajectories
of the 5 offensive players
A Two-Stage Unsupervised Clustering
for Tactic Analysis
• Stage 1: unsupervised clustering of all available
tactics based on their mutual distances
• Stage 2: unsupervised clustering within each group of tactics
that fell into the same Stage-1 cluster (to
separate the role of each offensive player)
What techniques are needed?
• A spatiotemporal model that can describe the
group behavior of the 5 offensive players
• Automatic clustering of group behaviors
(screen-cut, Princeton, wing-wheel, etc.)
• A representation of each group behavior
• An appropriate metric for the distance
between two arbitrary tactics
Trajectory Set Representation
S: the spatiotemporal matrix;
$P_{ij} = (x_{ij}, y_{ij})$: 2D coordinate of the j-th player in the i-th frame;
$V_j = [P_{1j}\ P_{2j}\ \ldots\ P_{Lj}]^T$;
$S = [V_1\ V_2\ V_3\ V_4\ V_5\ (V_6)]$;
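As a data structure, S can be held as an L × 5 grid of points in which row i is a frame and column j is a player. The sketch below uses illustrative function names and made-up coordinates.

```python
def build_trajectory_matrix(tracks):
    """Build S from per-player tracks, where tracks[j][i] is P_ij = (x, y).

    Returns a list of L rows (frames); row i holds one point per player.
    """
    length = len(tracks[0])
    if any(len(t) != length for t in tracks):
        raise ValueError("all trajectories must cover the same L frames")
    return [[tracks[j][i] for j in range(len(tracks))] for i in range(length)]


def column(S, j):
    """Return V_j, the trajectory of player j, as a list of L points."""
    return [row[j] for row in S]


# Hypothetical tracks: 5 players, L = 3 frames.
tracks = [[(float(i + j), float(2 * j)) for i in range(3)] for j in range(5)]
S = build_trajectory_matrix(tracks)
```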
Distance Measure of Trajectory Set
• Problems
• Different time durations between 2 clips
• Ordering of column vectors
Trajectory Set Distance Matrix
$S_1 = [V_1\ V_2\ V_3\ V_4\ V_5]$
$S_2 = [U_1\ U_2\ U_3\ U_4\ U_5]$
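Both problems listed above (different clip durations, arbitrary column order) can be handled in a simple way, sketched below. This is not necessarily the authors' method. It assumes linear resampling to a common length, mean point-to-point Euclidean distance between two trajectories, and a brute-force search over the 5! column orderings.

```python
import itertools
import math


def resample(traj, n):
    """Linearly interpolate a trajectory to exactly n points (n >= 2)."""
    if len(traj) == 1:
        return [traj[0]] * n
    out = []
    for k in range(n):
        pos = k * (len(traj) - 1) / (n - 1)
        i = min(int(pos), len(traj) - 2)
        frac = pos - i
        (x0, y0), (x1, y1) = traj[i], traj[i + 1]
        out.append((x0 + frac * (x1 - x0), y0 + frac * (y1 - y0)))
    return out


def trajectory_distance(v, u, n=20):
    """Mean Euclidean distance between two trajectories after resampling."""
    a, b = resample(v, n), resample(u, n)
    return sum(math.dist(p, q) for p, q in zip(a, b)) / n


def distance_matrix(s1, s2):
    """Entry [a][b] is the distance between V_a of S1 and U_b of S2."""
    return [[trajectory_distance(v, u) for u in s2] for v in s1]


def set_distance(s1, s2):
    """Smallest total distance over all one-to-one column matchings."""
    d = distance_matrix(s1, s2)
    return min(sum(d[i][p[i]] for i in range(len(s1)))
               for p in itertools.permutations(range(len(s2))))


def line(start, end, n):
    """Helper for the example: n evenly spaced points from start to end."""
    return [(start[0] + (end[0] - start[0]) * k / (n - 1),
             start[1] + (end[1] - start[1]) * k / (n - 1)) for k in range(n)]


ends = [((0, 0), (10, 0)), ((0, 5), (10, 5)), ((0, 10), (5, 10)),
        ((2, 2), (8, 8)), ((9, 1), (1, 9))]
s1 = [line(a, b, 30) for a, b in ends]             # clip of 30 frames
s2 = [line(a, b, 45) for a, b in reversed(ends)]   # same play, 45 frames, columns reversed
d = distance_matrix(s1, s2)
total = set_distance(s1, s2)
```

The same play recorded at a different length and with shuffled player columns yields a set distance near zero, even though the unmatched entry `d[0][0]` is large.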
Clustering by Dominant Set
M. Pavan and M. Pelillo, Dominant sets and pairwise clustering, IEEE TPAMI, 2007.
Tactic 1
Tactic 2
Tactic 3
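Dominant sets are commonly extracted with replicator dynamics and then peeled off one at a time. The sketch below follows that scheme on a made-up similarity matrix. The cutoff, the tolerance, and the similarity values are assumptions. The matrix must be symmetric and non-negative with a zero diagonal.

```python
def dominant_set(similarity, iters=1000, tol=1e-9, cutoff=1e-4):
    """Indices of one dominant set, found by replicator dynamics
    x_i <- x_i * (A x)_i / (x^T A x), started from the uniform vector."""
    n = len(similarity)
    x = [1.0 / n] * n
    for _ in range(iters):
        ax = [sum(similarity[i][j] * x[j] for j in range(n)) for i in range(n)]
        xax = sum(x[i] * ax[i] for i in range(n))
        if xax == 0:
            break
        new_x = [x[i] * ax[i] / xax for i in range(n)]
        change = sum(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if change < tol:
            break
    # Members are the items that keep non-negligible weight.
    return [i for i in range(n) if x[i] > cutoff]


def cluster_all(similarity):
    """Peel off dominant sets one at a time until every item is assigned."""
    remaining = list(range(len(similarity)))
    clusters = []
    while remaining:
        sub = [[similarity[i][j] for j in remaining] for i in remaining]
        members = dominant_set(sub)
        if not members:
            break
        clusters.append([remaining[m] for m in members])
        remaining = [r for k, r in enumerate(remaining) if k not in members]
    return clusters


# Hypothetical similarities: tactics 0-2 resemble each other, as do tactics 3-4.
groups = [0, 0, 0, 1, 1]
within = {0: 0.9, 1: 0.8}
sim = [[0.0 if i == j else (within[groups[i]] if groups[i] == groups[j] else 0.1)
        for j in range(5)] for i in range(5)]
clusters = cluster_all(sim)
```

A similarity matrix could be derived from trajectory-set distances, for example with a decreasing function of the distance; that conversion is not shown here and is not specified in the slides.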
Second stage: how to model an offense strategy?
• 8 different trajectory sets of right hawk, each consisting of
5 trajectories generated by the 5 offensive players
Clustering by Trajectory Distance
• Based on the distances between trajectories, each group of
tactics can be separated into five groups of trajectories, each corresponding to
a role (an offensive player)
Hawk
Wing Wheel
Princeton
Temporal Alignment
For each role, we model it with its velocities along the x- and y-directions
and use dynamic time warping (DTW) to solve the temporal alignment
problem.
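DTW on per-axis velocities can be sketched with the classic dynamic-programming recurrence. Summing the x and y costs into one role distance is an assumption, and the example trajectories are made up to show the same movement performed at two different speeds.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two 1-D sequences with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def velocities(trajectory):
    """Per-frame velocity along x and y from a list of (x, y) positions."""
    vx = [q[0] - p[0] for p, q in zip(trajectory, trajectory[1:])]
    vy = [q[1] - p[1] for p, q in zip(trajectory, trajectory[1:])]
    return vx, vy


def role_distance(traj_a, traj_b):
    """Sum of the DTW costs of the x- and y-velocity sequences."""
    ax, ay = velocities(traj_a)
    bx, by = velocities(traj_b)
    return dtw_distance(ax, bx) + dtw_distance(ay, by)


# Same velocity pattern (slow, then fast) over different numbers of frames.
slow_then_fast = [(0, 0), (1, 0), (2, 0), (3, 0), (5, 0), (7, 0), (9, 0)]
shorter_clip = [(0, 0), (1, 0), (2, 0), (4, 0), (6, 0)]
# Opposite pattern (fast, then slow).
fast_then_slow = [(0, 0), (2, 0), (4, 0), (5, 0), (6, 0)]

same = role_distance(slow_then_fast, shorter_clip)
different = role_distance(slow_then_fast, fast_then_slow)
```

DTW absorbs the difference in duration, so only the difference in velocity pattern contributes to the distance.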
The Built Model
Demo: Classification
Hawk template
Demo: Classification
Princeton template
Demo: Classification
Wing wheel template
Thank you very much for
listening

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Último (20)

A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

TAAI 2016 Keynote Talk: It is all about AI

  • 1. It is all about AI. Mark Liao, Institute of Information Science, Academia Sinica, Taiwan (TAAI 2016)
  • 2. Contents of this talk • Automatic Concert Video Mashup • Spatio-Temporal Learning of Basketball Offensive Strategies
  • 3. Automatic Concert Video Mashup. Mark Liao, Institute of Information Science, Academia Sinica, Taiwan
  • 4. What is concert video mashup? • A concert video mashup process takes all videos captured from different locations in a concert hall and converts them into a complete, non-overlapping, seamless, and high-quality result.
  • 5. Why concert video mashup? • To give people who could not attend a live concert a second chance to enjoy the performance at similar quality.
  • 6. Many problems to be solved! • The videos are captured with no coordination, so incompleteness or redundancy always occurs. • The order in which to watch these videos often causes confusion. • The videos are captured with handheld devices, so their visual and audio quality cannot be guaranteed.
  • 7. Issues that need to be addressed • The order to watch • Visual quality optimization • Seamless soundtrack connection • No redundancy • No missing video segments • Mashup results follow the rules defined by the language of film
  • 8. Potential Issues: The order to watch (1/5) • Three video clips captured from 3 different angles and distances; clips 1 and 2 partially overlap, clip 3 is independent
  • 9. Potential Issues: Multiple audio sequence alignment (2/5) • Case 1: partially overlapped • Case 2: no overlap
  • 10. Potential Issues (3/5) • Among three videos coherent in time (3 different locations), which one should be chosen? Follow the rules of the language of film! Shot types shown: Medium Shot, Long Shot, Extreme Long Shot
  • 11. Potential Issues (4/5) • Among several qualified video clips at the same distance, which one should be chosen? By visual quality? By audio quality? Shot types shown: Extreme Long Shot, Extreme Long Shot
  • 12. Potential Issues (5/5) • How can the emotion, ideas, and art of a music director be brought into a concert video mashup process? Can a CNN learn facial emotion?
  • 13. Previous Effort • The research area closest to "automatic video mashup" is "summarization of multi-view videos". • The objective of the latter is to produce a reduced set of abstracted videos or a key-frame sequence that represents the most prominent parts of the input videos.
  • 14. Literature related to video mashup (1/3) • [Shrestha et al.] formulate video mashup as an optimization problem. - Pros: optimizes visual quality and diversity constraints. - Cons: does not take into account the professional view of a visual storytelling director. P. Shrestha et al., "Automatic mashup generation from multiple-camera concert recordings," ACM MM, 2010.
  • 15. Literature related to video mashup (2/3) • [Wu et al.] use some pre-defined rules to solve the frequent-shot-change problem. - Pros: can solve part of the shot change problem. - Cons: does not involve a visual storytelling director to instruct the video mashup process. Wu et al., "MoVieUp: Automatic mobile video mashup," IEEE TCSVT, 2015.
  • 16. Literature related to video mashup (3/3) • [Saini et al.] introduce visual storytelling rules by dividing the audience seats into six shooting locations and then calculating statistics of shot transition and length from professionally edited videos. - Pros: a good start, introducing the views of professional experts. - Cons: the shot types are defined by the authors themselves, not by the rules of the language of film. Saini et al., "MoViMash: Online mobile video mashup," ACM MM, 2012.
  • 17. Introduction • An experienced movie director frequently uses camera work practices in visual storytelling. Song structure shown: Intro, Verse, Verse, Chorus, Chorus, Bridge, Bridge, ...
  • 19. Introduction • According to the language of film [3], shot size is one of the basics of filmmaking. Examples shown: Long Shot, Close-Up
  • 20. Introduction • The definition of six types of shots [3].
  • 21. Introduction • Following the definitions in the language of film [3], a concert video contains eight types of camera shots. Examples shown: Musical Instrument Shot (MIS), Audience Shot (ADS)
  • 22. Introduction • Two images from an official concert video of the song “93 Million Miles” by Jason Mraz, live in Hong Kong, 2012.
  • 23. System Framework for Video Mashup
  • 24. Shot Classification based on EW-Deep-CCM • Error-Weighted Deep Cross-Correlation Model
  • 25. Object Representation (VGG-Net) • Objects are represented using a 16-layer VGG-Net. • We extract features from the output layer and the two fully-connected layers as the object representations; the feature dimensions are 1000-D, 4096-D, and 4096-D, respectively.
  • 28. Literature related to Fusion Strategy • Early fusion – Pros: takes advantage of combining various feature cues. – Cons: the high-dimensional feature set may easily suffer from data sparseness and strain computational resources.
  • 29. Literature related to Fusion Strategy • Late fusion – Pros: does not increase the dimensionality; makes it possible to interpret the performance of different classifiers and gain insight into the role of multiple modalities during emotional expression. – Cons: the assumption of conditional independence among multiple modalities is inappropriate.
  • 30. Shot Classification based on EW-Deep-CCM • A novel fusion strategy named Error Weighted Deep Cross-Correlation Model (EW-Deep-CCM) is proposed to effectively combine the extracted multilayer object representations.
  • 31. Experimental Results • Comparison of shot type classification (other methods)
  • 32. • EW-Deep-CCM achieves only an 83% detection rate. • A 17% error remains, i.e., roughly a 1/6 error rate, which causes frequent shot changes.
  • 33. A 17% error rate causes too many shot changes
  • 34. Conditional Random Field-based (CRF) Approach • 1st trial: 30-frame fixed window size (not a systematic way to smooth the results) • 2nd trial: Recurrent Neural Network (RNN). Problem: an RNN needs pre-segmented data to derive the best results, but the generated shot type classification results are not well segmented. • 3rd trial: Conditional Random Field (CRF)
  • 35. Our Method – Coherent-Net • Shot Type Refinement (CRF)
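  • 36. Our Method – Coherent-Net Framework • Shot Type Refinement (CRF)
      $$P(\mathbf{w} \mid \mathbf{O}) = \sum_{\mathbf{w}'} P(\mathbf{w}, \mathbf{w}' \mid \mathbf{O}) \approx P(\mathbf{w} \mid \mathbf{w}') \cdot P(\mathbf{w}' \mid \mathbf{O}) \approx P(\mathbf{w} \mid \mathbf{w}') \cdot \prod_{n=1}^{N} P(w'_n \mid o_n)$$
      Here $P(\mathbf{w} \mid \mathbf{w}')$ is modeled by the CRF and $P(\mathbf{w}' \mid \mathbf{O})$ by EW-Deep-CCM.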
  • 37. Shot Type Refinement (CRF): the EW-Deep-CCM term $P(\mathbf{w}' \mid \mathbf{O})$ • The per-frame posterior $P(w' \mid o)$ is approximated by a triple sum over indices $i$, $j$, and $k$ (with upper limits $C$, $D$, and $K$) of products of three kinds of factors computed from the output-layer (out) and fully-connected-layer (fc) features: likelihoods (DNN posterior probabilities), cross-correlation terms, and empirical weights.
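  • 38. Shot Type Refinement (CRF) • $\mathbf{w}' = w'_1, \ldots, w'_{t-1}, w'_t$ and $\mathbf{w} = w_1, w_2, w_3, \ldots, w_{t-1}, w_t$
      $$P(\mathbf{w} \mid \mathbf{w}') = \frac{1}{Z(\mathbf{w}')} \exp\Big( \sum_j F_j(\mathbf{w}, \mathbf{w}') \Big), \qquad Z(\mathbf{w}') = \sum_{\mathbf{w}} \exp\Big( \sum_j F_j(\mathbf{w}, \mathbf{w}') \Big)$$
      $$P(\mathbf{w} \mid \mathbf{w}') \propto \frac{1}{Z(\mathbf{w}')} \exp\Big( \sum_t \sum_j \lambda_j\, t_j(w_{t-1}, w_t, \mathbf{w}') + \sum_t \sum_j \mu_j\, s_j(w_t, \mathbf{w}') \Big)$$
      $$\propto \frac{1}{Z(\mathbf{w}')} \exp\Big( \sum_t \sum_{m,n \in S} \lambda_{mn}\, \mathbb{1}_{\{w_t = m\}} \mathbb{1}_{\{w_{t-1} = n\}} + \sum_t \sum_{m \in S} \sum_{o \in O} \mu_{om}\, \mathbb{1}_{\{w_t = m\}} \mathbb{1}_{\{w'_t = o\}} \Big)$$
      State transition (pairwise potential): $t_j(w_{t-1}, w_t, \mathbf{w}') = 1$ when $w_{t-1} = n$ and $w_t = m$, and 0 otherwise.
      State-observation pair (unary potential): $s_j(w_t, \mathbf{w}') = 1$ when $w'_t = o$ and $w_t = m$, and 0 otherwise.
      Example label sequences shown: CLCCCC, CCCCCC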
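      As a rough illustration of how a linear-chain model with transition weights (λ) and state-observation weights (μ) can smooth a noisy label sequence, the sketch below decodes the highest-scoring sequence with the Viterbi algorithm. It is not the talk's implementation: the function name `viterbi_refine`, the two-label alphabet (C = 0, L = 1), and the hand-set weight values are all assumptions for illustration, whereas a real CRF would learn the weights from data.

```python
import numpy as np

def viterbi_refine(observed, lam, mu):
    """Return the label sequence w maximizing the linear-chain score.

    observed : list of int, the noisy per-frame labels w'_t
    lam[n, m]: weight for moving from state n at t-1 to state m at t
               (the slide writes this as lambda_{mn}; index order here is assumed)
    mu[o, m] : weight for pairing observed label o with state m
    """
    T, S = len(observed), lam.shape[0]
    score = np.empty((T, S))
    back = np.zeros((T, S), dtype=int)
    score[0] = mu[observed[0]]
    for t in range(1, T):
        cand = score[t - 1][:, None] + lam      # cand[n, m]: best score ending in n, then n -> m
        back[t] = cand.argmax(axis=0)           # best previous state for each current state
        score[t] = cand.max(axis=0) + mu[observed[t]]
    path = [int(score[-1].argmax())]
    for t in range(T - 1, 0, -1):               # follow back-pointers from the last frame
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Hypothetical weights: staying in the same shot type is rewarded,
# and agreeing with the observed label is rewarded.
lam = np.array([[1.0, -1.0], [-1.0, 1.0]])
mu = np.array([[1.0, 0.0], [0.0, 1.0]])
observed = [0, 1, 0, 0, 0, 0]                   # "CLCCCC" with C = 0, L = 1
refined = viterbi_refine(observed, lam, mu)
```

      With these illustrative weights the isolated L is overruled by its neighbours, which is the kind of smoothing the refinement step is meant to provide.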
  • 39. Experiments – Official Demo 1 • The song “Skyfall” by Adele, performed at the Oscars, 2013
  • 40. Experiments – Official Demo 2 • The song “When I Was Your Man” by Bruno Mars, performed at BBC Radio 1's Big Weekend, 2013
  • 41. System Framework for Video Mashup
  • 42. Problem & Goal • A concert video mashup process needs to align the videos taken by different audience members onto a common timeline.
  • 43. Literature Review • Audio fingerprinting • Problems – Originally designed for audio identification rather than time alignment. – Easily causes audio signal distortion. • Zhu et al. treat audio identification as an image matching problem (significant performance improvement). B. Zhu et al., "A novel audio fingerprinting method robust to time scale modification and pitch shifting," ACM MM, 2010.
  • 44. Our Method • We modified Zhu's method to address the multiple audio sequence alignment problem. – Auditory image (spectrogram) construction: 1-D audio signal (waveform) → short-time Fourier transform → 2-D auditory image (time-frequency representation, i.e., spectrogram)
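      The waveform-to-spectrogram step can be sketched with a plain NumPy short-time Fourier transform. This is a minimal sketch under stated assumptions: the frame length, hop size, Hann window, and the function name `spectrogram` are illustrative choices, not parameters given in the talk.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram via a short-time Fourier transform.

    Returns an array of shape (frame_len // 2 + 1, n_frames):
    rows are frequency bins, columns are time frames.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1)).T

# Example: one second of a 1 kHz tone sampled at 8 kHz (assumed values).
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 1000 * t)
spec = spectrogram(tone)
```

      The resulting 2-D array can be treated as an image, which is what allows image features such as SIFT to be applied to audio.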
  • 45. Our Method – Audio Sequence Alignment (1) Boundary candidate selection (based on SIFT alignment), where $a$ is a SIFT feature in audio sequence A, $b$ is the feature in B closest to $a$, and $b'$ is the second-closest feature to $a$ in B:
      $$BC = \begin{cases} \text{Yes}, & \text{if } D(a, b) < c \cdot D(a, b') \\ \text{No}, & \text{otherwise} \end{cases}$$
      BC: boundary candidate; $D(\cdot)$: Euclidean distance; $c$: a constant ($c = 0.7$). Yellow lines in the figure are boundary candidates.
  • 46. Our Method – Audio Sequence Alignment (2) Boundary candidate refinement. A window distortion measure (WDM) is defined for refining each boundary candidate.
  • 47. Our Method – Audio Sequence Alignment (3) Final boundary decision. The alignment result is determined by the refined boundary candidate with the minimum window distortion.
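      Step (1) above is a nearest-neighbour ratio test, which can be sketched as follows. The feature arrays here are small made-up vectors standing in for SIFT descriptors, and the function name `boundary_candidates` is hypothetical; only the rule D(a, b) < c · D(a, b') with c = 0.7 comes from the slides.

```python
import numpy as np

def boundary_candidates(feats_a, feats_b, c=0.7):
    """Keep (index in A, index in B) pairs that pass the ratio test.

    feats_a, feats_b: arrays of shape (n_a, d) and (n_b, d), with n_b >= 2.
    A pair is kept when the nearest feature in B is closer than
    c times the distance to the second-nearest feature in B.
    """
    feats_a = np.asarray(feats_a, dtype=float)
    feats_b = np.asarray(feats_b, dtype=float)
    # Pairwise Euclidean distances, shape (n_a, n_b).
    dist = np.linalg.norm(feats_a[:, None, :] - feats_b[None, :, :], axis=2)
    order = np.argsort(dist, axis=1)
    nearest, second = order[:, 0], order[:, 1]
    rows = np.arange(len(feats_a))
    keep = dist[rows, nearest] < c * dist[rows, second]
    return [(int(i), int(nearest[i])) for i in rows[keep]]

# Toy descriptors (assumed values): the first feature of A has one clear
# match in B, the second has two equally close matches and is rejected.
feats_a = [[0.0, 0.0], [5.0, 5.0]]
feats_b = [[0.0, 0.1], [10.0, 10.0], [5.0, 4.0], [5.0, 6.0]]
candidates = boundary_candidates(feats_a, feats_b)
```

      Rejecting ambiguous matches this way keeps only distinctive correspondences, which matters because later steps pick the final boundary from these candidates.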
  • 48. Demo 1 • “I’m Yours” by Jason Mraz, live in Singapore, 2012, with context search (aligned in 49.8001 s) • Timeline: 00:00:00 to 00:00:49.8001; Recording #4, Recording #5, +0.4334 s
  • 49. Demo 2 • “All I Ask” by Adele, live at Birmingham Genting Arena, 2016, with context search (aligned in 53.2169 s) • Timeline: 00:00:00 to 00:00:53.2169; Recording #1, Recording #2, +0.5502 s
  • 50. Demo - Multiple Audio Sequence Alignment Result • Timeline labels: 00:00:00, 00:00:52.4893, 04:00:2277, 00:00:52.7667, 03:58:8667; Audience #1, Audience #2, Audience #3
  • 51. Learning Professional Recording Skill • Initial Prob. • Duration (frames/shot) • Shot Transition (prob.) • Shot Type Refinement (CRF) • Coherent-Net
  • 52. System Framework for Video Mashup
  • 53. Demo - Mashup Result • mr#1, mr#2, mr#3
  • 55. Motivations • To develop an automatic tactics analysis tool for coaches, players, and the general public. • To develop a new technique that can compete with existing tools, such as SportVU, at a much lower price.
  • 56. Methodology Adopted • Analyze group behavior directly from the court view of an NBA broadcast video. • Detect and track each offense player, calculate their trajectories, and map these trajectories from the court view to a tactic board for analysis.
  • 59. Motivation (3) • Unknown Offense Video Clip 90% → Screen Cut 10% → Princeton
  • 60. • 6 cameras above the court • No close-up view → Unable to see the details of plays
  • 61. • SportVU videos → SportVU system → Tracked data • Broadcast videos → Our tracking system → Tracked data ?
  • 62. Extracting features from an offense video clip? • Automatic player detection • Automatic player tracking • Map extracted trajectories from basketball court to tactic board
  • 63. Step 2: Derive correct player trajectories on panorama court (3/3)
  • 64. Step 3: Map trajectories from panorama court to tactic board
  • 65. What’s next? • Tactics analysis based on spatiotemporal trajectories of 5 offense players
  • 66. A Two-Stage Unsupervised Clustering for Tactic Analysis • Stage 1: Unsupervised clustering of all available tactics based on their mutual distances • Stage 2: Unsupervised clustering of all tactics assigned to the same cluster in Stage 1 (tries to separate the role of each offense player)
  • 67. What techniques are needed? • A spatiotemporal model that can describe the group behavior of 5 offense players • Automatic clustering of group behaviors (screen-cut, Princeton, wing-wheel, etc.) • Representation of each group behavior • An appropriate metric to calculate the distance between two arbitrary tactics
  • 68. Trajectory Set Representation • $S$: the spatiotemporal matrix • $P_{ij} = (x_{ij}, y_{ij})$: 2-D coordinate of the $j$-th player in the $i$-th frame • $V_j = [P_{1j}\ P_{2j}\ \ldots\ P_{Lj}]^T$ • $S = [V_1\ V_2\ V_3\ V_4\ V_5\ (V_6)]$
  • 69. Distance Measure of Trajectory Set • Problems: • Different time durations between 2 clips • Ordering of column vectors
  • 70. Trajectory Set Distance Matrix • $S_1 = [V_1\ V_2\ V_3\ V_4\ V_5]$, $S_2 = [U_1\ U_2\ U_3\ U_4\ U_5]$
  • 71. Clustering by Dominant Set • Massimiliano Pavan and Marcello Pelillo, "Dominant Sets and Pairwise Clustering," PAMI, 2007. • Clusters shown: Tactic1, Tactic2, Tactic3
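      Dominant-set clustering is commonly computed with discrete replicator dynamics on a pairwise affinity matrix. The sketch below extracts a single dominant set in that way; the function name `dominant_set`, the support threshold, and the toy affinity values are assumptions, and the talk does not say how trajectory-set distances are converted into affinities.

```python
import numpy as np

def dominant_set(affinity, n_iter=1000, tol=1e-10, support_eps=1e-4):
    """Extract one dominant set from a symmetric, non-negative affinity matrix.

    Runs replicator dynamics x <- x * (A x) / (x^T A x) from the uniform
    distribution and returns the indices whose weight stays above support_eps.
    """
    a = np.asarray(affinity, dtype=float).copy()
    np.fill_diagonal(a, 0.0)                 # no self-similarity
    x = np.full(len(a), 1.0 / len(a))
    for _ in range(n_iter):
        ax = a @ x
        denom = x @ ax
        if denom <= 0:
            break
        new_x = x * ax / denom
        converged = np.abs(new_x - x).sum() < tol
        x = new_x
        if converged:
            break
    return np.flatnonzero(x > support_eps), x

# Toy affinities (assumed): items 0-2 are mutually similar, items 3-4 form
# a weaker pair, and similarity across the two groups is low.
affinity = np.array([
    [0.00, 0.90, 0.90, 0.05, 0.05],
    [0.90, 0.00, 0.90, 0.05, 0.05],
    [0.90, 0.90, 0.00, 0.05, 0.05],
    [0.05, 0.05, 0.05, 0.00, 0.80],
    [0.05, 0.05, 0.05, 0.80, 0.00],
])
members, weights = dominant_set(affinity)
```

      Further clusters are usually obtained by removing the extracted members and running the same procedure on the remaining items.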
  • 72. Second stage: how to model an offense strategy? • 8 different trajectory sets of right hawk, each consisting of 5 trajectories generated by 5 offense players
  • 73. Clustering by Trajectory Distance • Based on the distance between trajectories, one can separate each group of tactics into five groups of trajectories, each corresponding to a role (an offense player). Tactics shown: Hawk, Wing Wheel, Princeton
  • 74. Temporal Alignment • For each role, we use the velocities along the x- and y-directions, respectively, to model it (using DTW to solve the alignment problem).
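      Dynamic time warping (DTW) compares two sequences of different lengths by allowing frames to be repeated during alignment. The sketch below is a textbook DTW on velocity sequences. Treating the x- and y-velocities jointly as 2-D vectors, and the function name `dtw_distance`, are assumptions made for brevity; the talk may align each axis separately.

```python
import numpy as np

def dtw_distance(a, b):
    """DTW distance between two sequences of 2-D velocity vectors.

    a: array of shape (n, 2); b: array of shape (m, 2).
    The local cost is the Euclidean distance between velocity vectors.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = np.linalg.norm(a[i - 1] - b[j - 1])
            # Extend the cheapest of: repeat a frame of b, repeat a frame of a, or advance both.
            cost[i, j] = local + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

# Velocities can be obtained from positions with np.diff(trajectory, axis=0).
# Two toy players make the same move (right, then up) with different timing.
vel_a = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
vel_b = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
same_move = dtw_distance(vel_a, vel_b)
```

      Because the warping absorbs the timing difference, the two toy sequences end up at distance zero even though they differ frame by frame.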
  • 78. Demo - Classification • Wing wheel template
  • 79. Thank you very much for listening