應用媒體內容分析於摘要性音樂影片之製作
Applying Media Content Analysis to the
Production of Musical Videos as Summarization
2004/02/16
Student: Chen-Hsiu Huang
Advisor: Prof. Ja-Ling Wu
Outline
 Problem Formulation
 Current Solutions
 Our Goal
 Gory Details
 Performance Evaluation
 What’s Next?
 Questions and Discussion
Problem Formulation
 Digital video capture devices such as DV camcorders have become
more affordable for end users.
 Shooting video is fun, but editing it is frustrating.
 There is still a tremendous barrier between amateurs (home
users) and powerful video editing software.
 As a result, people leave their precious shots in piles of DV
tapes, unedited and unmanaged.
 According to a survey on DVworld*, the number of times users
review a video in the following days depends on its length:
Video length      Review times
>= 1 hr           1 or 0
30 min ~ 1 hr     2 ~ 3
15 ~ 30 min       5 ~ 10
5 ~ 15 min        >= 10
<= 5 min          You take it out and watch it whenever you think of it!
 Video clips of no more than 5 minutes are best for human
concentration.
*http://www.DVworld.com.tw/
 People are impatient with videos that have no scenario or
voice-over, especially those with no music.
 Improved soundtrack quality improves perceived video
image quality.
 Synchronizing video and audio segments enhances the
perception of both.
 One study at MIT showed that listeners judge an identical
video image to be of higher quality when it is accompanied by
higher-fidelity audio.
Facts about Musical Video
 Home videos can be roughly classified by their nature:
Causal        Shots within the video are causal; changing the order of
              shots may confuse the viewer.
Non-causal    Shots are not causal; it is OK to re-order the video shots.
Recreational  Videos are used to express a kind of emotion or
              enjoyment.
Memorial      Videos of events such as a wedding or graduation ceremony
              are memorial, and each shot should be preserved properly.
 Four profiles are proposed to deal with videos of different
nature.
Current Solutions
 A consumer product called “muvee autoProducer” has
been announced to ease the burden of professional video
editing.
 Its application scenario is quite simple:
Pick your video -> Choose your favorite music -> Select profiles to apply -> Produce a quality musical video
Our Goal
 Although there are commercial products on the market, only
a few related academic publications exist.
 Jonathan Foote, Matthew D. Cooper, Andreas Girgensohn,
"Creating music videos using automatic media analysis," ACM
Multimedia 2002: 553-560
 Content-analysis technologies have been developed for years;
can we adopt them to help the automatic creation of
musical videos?
 Goal: to achieve comparable or better quality in a similar
application scenario, using the content-analysis technologies
developed in the multimedia domain.
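Proposed Framework
[Figure: block diagram of the proposed framework. Inputs: input video, input music. Video side: shot change and scene change detection, with features human face, flash light, motion strength, color variance, camera operation, ... Audio side: audio segment cutting, with features volume, ZCR, brightness, bandwidth, ... Alignment: scene selection, key shot selection, audio rhythm and video motion/color synchronization. Output: output video.]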
Audio Analysis
 We should cut the input audio into several clips according
to its audio features.
 Frame-level features
 Volume: defined as the MSR of audio samples
 ZCR: the number of times that the audio waveform crosses the
zero axis in each frame.
 Spectral features
 Brightness: the centroid of frequency spectrum
 Bandwidth: the standard deviation of frequency spectrum
 Generally, the distribution curve of brightness is almost the same as the ZCR
curve, so here we use only the ZCR feature.
 Bandwidth is an important audio feature, but we cannot easily tell
what it physically means in music when the bandwidth reaches
a high or low value.
 Furthermore, the relation between musical perception and bandwidth
values is neither clear nor regular.
[Figure: brightness, ZCR, volume, and bandwidth curves]
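The two frame-level features that are actually used later (volume and ZCR) can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function name `frame_features`, the non-overlapping framing, and the reading of volume as the root mean square of the samples are all assumptions.

```python
import math


def frame_features(samples, frame_size):
    """Return (volumes, zcrs), one value per non-overlapping frame.

    Assumptions: frames do not overlap, and "volume" is taken as the
    root mean square of the samples in the frame.
    """
    volumes, zcrs = [], []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        # Volume: root mean square of the frame's samples.
        volumes.append(math.sqrt(sum(s * s for s in frame) / frame_size))
        # ZCR: number of sign changes between neighbouring samples.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcrs.append(crossings)
    return volumes, zcrs


volumes, zcrs = frame_features([1, -1, 1, -1, 2, 2, 2, 2], frame_size=4)
```

The first frame alternates in sign, so it has a high ZCR and low volume; the second frame is louder but never crosses zero.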
Audio Segmentation
 First we cut the input audio into clips where the volume
changes dramatically.
 Within each clip, we define a burst of ZCR as an “attack”,
which may be a beat of the bass drum or the singer’s voice.
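$$A_{cut}(i) = \begin{cases} 1, & F_{cut}(i) > VCut\_th \\ 0, & \text{otherwise} \end{cases}$$
$$F_{cut}(i) = \mathrm{abs}\left( \sum_{j=i}^{i+w} \frac{v_j}{w} - \sum_{j=i-w+1}^{i} \frac{v_j}{w} \right)$$
$$VCut\_th = \max(v_i) / 10$$
$$A_{attack}(i) = \begin{cases} 1, & F_{attack}(i) > 2 \times \mathrm{std}(z_i) \\ 0, & \text{otherwise} \end{cases}$$
[Equation: F_attack(i), a windowed measure built from abs(z_i) and v_i / w over the windows i-w+1 .. i and i .. i+w; its exact form is not recoverable from the extracted text]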
 The dramatic volume change defines the audio clip
boundary, while the burst of ZCR (attack) in each clip
defines the granular sub-segment within it.
[Figure: clip boundaries, with attacks as sub-clip separators]
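The clip-boundary test can be sketched as below: compare the mean volume just before frame i with the mean volume just after it, and flag a cut when the difference exceeds one tenth of the maximum volume. This is a sketch under assumptions: the function name `cut_flags` is hypothetical, the two windows are taken as equal-length (w frames each) so that a constant volume never triggers a cut, and frames too close to either end are simply left unflagged.

```python
def cut_flags(volume, w):
    """Return a 0/1 list marking frames where the volume changes sharply.

    volume: per-frame volume values v_i
    w:      window length in frames (assumed equal on both sides)
    """
    threshold = max(volume) / 10  # VCut_th = max(v_i) / 10
    flags = [0] * len(volume)
    # Only evaluate frames that have a full window on both sides.
    for i in range(w - 1, len(volume) - w):
        before = sum(volume[i - w + 1:i + 1]) / w  # v_{i-w+1} .. v_i
        after = sum(volume[i + 1:i + w + 1]) / w   # v_{i+1} .. v_{i+w}
        if abs(after - before) > threshold:
            flags[i] = 1
    return flags


flags = cut_flags([1, 1, 1, 1, 5, 5, 5, 5], w=2)
```

A run of adjacent flags marks one transition; in practice they would be merged into a single boundary.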
 Here we define the dynamic of each clip as:
$$A_{dynamic}(i) = \frac{\sum_j z_j}{\mathrm{len}(i)}$$
[Figure: example using the song 綠光 ("Green Light") by Stefanie Sun (孫燕姿)]
 The dynamic feature serves as a good
reference later for video/audio synchronization.
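Read as "total ZCR of the clip divided by the clip length", the dynamic feature is just an average. A minimal sketch, assuming len(i) is the number of frames in clip i and that the clip is given as a half-open frame range (both assumptions, as is the name `clip_dynamic`):

```python
def clip_dynamic(zcr, start, end):
    """Average ZCR over frames start .. end-1 of one clip."""
    length = end - start
    if length <= 0:
        raise ValueError("clip must contain at least one frame")
    return sum(zcr[start:end]) / length


zcr = [2, 4, 6, 8]
slow = clip_dynamic(zcr, 0, 2)  # lower value: calmer clip
fast = clip_dynamic(zcr, 2, 4)  # higher value: more dynamic clip
```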
Video Analysis
 First we apply shot change detection to segment the
video into shots.
 Here we combine the pixel MAD (mean absolute difference) and pixel
histogram methods to perform the shot change detection.
$$V_{shot}(i) = \begin{cases} 1, & S_{MAD}(i) = 1 \text{ and } S_{HIST}(i) = 1 \\ 0, & \text{otherwise} \end{cases}$$
                    Dhist < Thhist    Dhist > Thhist
Dcolor < Thcolor    nothing
Dcolor > Thcolor    unsuitable!       shot change!
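The decision table can be sketched on tiny grayscale frames stored as flat lists of 0..255 values. Everything named here is an assumption for illustration: Dcolor is taken to be the pixel MAD, Dhist a normalized histogram difference, the threshold values are arbitrary, and the empty table cell (small Dcolor, large Dhist) is treated as "nothing".

```python
def pixel_mad(a, b):
    """Mean absolute difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def hist_diff(a, b, bins=16, levels=256):
    """Sum of absolute histogram bin differences, divided by pixel count."""
    step = levels // bins
    ha, hb = [0] * bins, [0] * bins
    for x in a:
        ha[x // step] += 1
    for y in b:
        hb[y // step] += 1
    return sum(abs(p - q) for p, q in zip(ha, hb)) / len(a)


def classify(prev, cur, th_color=30.0, th_hist=0.5):
    """Apply the two-threshold table to a pair of consecutive frames."""
    d_color = pixel_mad(prev, cur)
    d_hist = hist_diff(prev, cur)
    if d_color > th_color and d_hist > th_hist:
        return "shot change"
    if d_color > th_color:
        return "unsuitable"  # pixels moved a lot, histogram barely changed
    return "nothing"  # assumption: covers the empty table cell as well


dark, bright = [10] * 64, [200] * 64
result = classify(dark, bright)
```

The "unsuitable" case is the interesting one: a frame whose pixels all changed while the overall color distribution stayed the same, as with fast camera shake.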
 Flashlight detection
 A flashlight event would otherwise be detected as a shot change.
 When a shot change is found, check whether:
$$\frac{LMean(i)}{LMean(i-1)} \ge Flash\_th \;\&\&\; \frac{LMean(i)}{LMean(i+1)} \ge Flash\_th$$
 If so, it is a flashlight event and should not be treated as a shot
change.
 Sub-shot segmentation
 Here we use the MPEG-7 ColorLayout descriptor to measure the
similarity between frames.
 The first frame of each shot is selected as the basis, and each
subsequent frame is compared with the basis. If
$$D(i) = \sum_{k=1}^{i} \mathrm{dist}(F_k, F_0), \quad D(i) \ge SubScene\_Th$$
 then we say that a sub-shot occurs at frame i.
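Both checks on this slide are simple enough to sketch. The flash test assumes that frame i must be brighter than both neighbours by a ratio of at least the threshold; the sub-shot test accumulates the distance of each frame to the basis frame. The names, the default threshold, the caller-supplied `dist` function (standing in for an MPEG-7 ColorLayout distance), and the choice to restart from a new basis after each sub-shot are assumptions.

```python
def is_flash(lmean, i, th_flash=1.5):
    """True if mean luminance at frame i spikes relative to both neighbours."""
    if i <= 0 or i >= len(lmean) - 1:
        return False
    prev, cur, nxt = lmean[i - 1], lmean[i], lmean[i + 1]
    if prev <= 0 or nxt <= 0:
        return False
    return cur / prev >= th_flash and cur / nxt >= th_flash


def sub_shot_starts(frames, dist, th_sub):
    """Return frame indices where the accumulated distance reaches th_sub."""
    starts = []
    basis, acc = frames[0], 0.0
    for i in range(1, len(frames)):
        acc += dist(frames[i], basis)
        if acc >= th_sub:
            starts.append(i)
            basis, acc = frames[i], 0.0  # assumption: restart at new basis
    return starts


lmean = [50, 52, 120, 51, 50]
flash_at_2 = is_flash(lmean, 2)
starts = sub_shot_starts([0, 1, 1, 2, 10, 10, 11], lambda a, b: abs(a - b), 5)
```

In the second call the "frames" are plain numbers and the distance is their absolute difference, purely to keep the example self-contained.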
Camera Operation
 Camera operations such as pan or zoom are widely used in amateur
home videos. Detecting these camera operations helps capture the
video taker’s intention.
 Our camera operation detection is based on the motion vectors
in the P-frames of the MPEG video.
Pan: $1 \le \frac{\sum_i |v_i|}{\left| \sum_i v_i \right|} \le 3$    Zoom: $\frac{\sum_i |v_i|}{\left| \sum_i v_i \right|} > 3$
 This method is simple and efficient, yet it performs well at
detecting camera operations.
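The pan/zoom test can be sketched from a frame's motion vectors. The sketch assumes the ratio on the slide is the sum of vector magnitudes divided by the magnitude of the vector sum: in a pan the vectors point the same way and the ratio stays near 1, while in a zoom they point radially and largely cancel, so the ratio grows. That reading, the function name, and the extra "static" label for frames with no motion are assumptions.

```python
import math


def camera_operation(vectors):
    """Classify one P-frame from its motion vectors [(dx, dy), ...]."""
    sum_x = sum(dx for dx, _ in vectors)
    sum_y = sum(dy for _, dy in vectors)
    net = math.hypot(sum_x, sum_y)                           # |sum of v_i|
    total = sum(math.hypot(dx, dy) for dx, dy in vectors)    # sum of |v_i|
    if total == 0:
        return "static"  # assumption: no motion at all
    if net == 0 or total / net > 3:
        return "zoom"
    return "pan"


pan = camera_operation([(2, 0)] * 4)
zoom = camera_operation([(1, 0), (-1, 0), (0, 1), (0, -1)])
```

Random, incoherent motion also produces a large ratio, so a real detector would likely need an additional check before labelling a frame as a zoom.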
Video Features
 Frame-level features
 The presence of human faces.
 Use OpenCV library as face detection module.
 Motion intensity
 Flashlight detection
 Mean and standard deviation of luminance plane
 (Dcolor(i) > Thcolor && Dhist(i) < Thhist) defines the unsuitable frames
 Shot-level features
 Number and types of camera operations in each shot.
 Number of faces and flashlight events in each shot.
 The accumulated distance between each frame and the first frame
can be used to describe the shot’s homogeneity.
Importance Measure
 Frame-level score function:
$$Score = \alpha \times (R_{face} + E_{flash}) + \beta \times (R_{motion} \times S_a + Camera\_op) + \gamma \times \left( \frac{Mean - 130}{256} + Std \right)$$
$$R_{face} = \frac{Area_{face}}{W \times H}, \quad E_{flash} \in \{0, 1\}, \quad R_{motion} = \frac{Motion_i}{\max(Motion)}, \quad Camera\_op \in \{0, 1, 2\}$$
$$\alpha = 0.5, \quad \beta = 0.3, \quad \gamma = 0.2$$
 S_a is a scaling coefficient set according to the
synchronized audio clip’s feature.
 The face and flashlight events have the highest weighting.
 Camera operations and higher motion intensity reflect the video
taker’s intention, so such frames are more important.
 Frames with higher luminance and larger standard deviation are more
suitable.
 The penalty for unsuitable frames will be discussed later.
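As a sketch, the frame-level score is a weighted sum of three groups: face and flash, motion and camera operation, and luminance statistics. The grouping below follows one reading of the slide's formula, in which the luminance term is (Mean - 130) / 256 + Std; that reading, the parameter names, and the assumption that the caller passes `std` already scaled to a comparable range are not confirmed by the source.

```python
def frame_score(face_area, width, height, flash, motion, max_motion,
                camera_op, mean, std, s_a=1.0,
                alpha=0.5, beta=0.3, gamma=0.2):
    """Weighted frame importance score.

    flash is 0 or 1, camera_op is 0, 1, or 2, and s_a is the audio
    scaling coefficient (1.0 here means "no scaling").
    """
    r_face = face_area / (width * height)
    r_motion = motion / max_motion if max_motion else 0.0
    return (alpha * (r_face + flash)
            + beta * (r_motion * s_a + camera_op)
            + gamma * ((mean - 130) / 256 + std))


score = frame_score(face_area=50, width=10, height=10, flash=1,
                    motion=0, max_motion=1, camera_op=0,
                    mean=130, std=0)
```

With a face covering half the frame and a flash, and no other contribution, only the first group is non-zero.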
 The shot-level importance is motivated by the following observations:
 Shots with larger motion intensity deserve a longer duration.
 The presence of faces attracts viewers.
 Shots of higher heterogeneity can take a longer playing time.
 Shots with more camera operations are more important.
 Of course, shots that are longer in the original are more important.
 Shot-level importance:
$$IMP = Len \times \left( \frac{Num_{face}}{Len} + \frac{\sum Camera\_op}{Len} \right) \times \left( \frac{\sum Motion}{Len} \right) \times \left( \frac{\sum Diff}{Len} \right)$$
 The shot-level importance function is used in the medium profile to
reassign each shot’s length according to its importance.
 Static shots get shorter, while dynamic shots can get longer.
 This gives better results after editing.
 “muvee autoProducer” does not reassign each shot’s length!
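The shot-level importance can be sketched as the shot length multiplied by three per-frame averages: faces plus camera operations, motion, and frame difference. The product form follows one reading of the slide's formula; the function and parameter names are assumptions.

```python
def shot_importance(length, num_faces, camera_ops, motions, diffs):
    """Importance of one shot.

    length:     shot length in frames
    num_faces:  number of frames (or detections) with a face
    camera_ops, motions, diffs: per-frame values that are summed
    """
    face_and_camera = num_faces / length + sum(camera_ops) / length
    motion = sum(motions) / length
    diff = sum(diffs) / length
    return length * face_and_camera * motion * diff


imp = shot_importance(10, 5, [1] * 5, [2.0] * 10, [0.5] * 10)
```

Because the terms are multiplied, a shot with neither faces nor camera operations scores zero under this reading, so a real system would probably need a floor value.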
Example 1
Trip to Leofoo Village (六福村之旅) (31:55)
Music: SHE / 美麗新世界 ("Beautiful New World")
Length: 4:25
Profile: Sequential Medium
Proposed Profiles
 Profiles let users customize their videos according
to the content’s nature and their own preferences in an easy way.
 As noted earlier, home videos fall into four types:
 Causal, Non-causal, Recreational, Memorial
 For causal or non-causal videos, we use the sequential or
non-sequential parameter.
 For memorial or recreational videos, we use the rhythmic or medium
parameter.
 In rhythmic, the music tempo/rhythm is better preserved, while some shots
of the video will be dropped.
 In medium, the match with the music tempo/rhythm is not as clear as in
rhythmic, but most of the shots are guaranteed to be shown. The medium
parameter preserves the most of the original video.
 Thus we have four profiles:
 Sequential Rhythmic, Sequential Medium
 Non-Sequential Rhythmic, Non-Sequential Medium
            Sequential                         Non-Sequential
Rhythmic    Time sequence of shots will be     With the rhythmic parameter, but
            preserved, with the rhythmic       the original order of shots will
            parameter                          be changed.
Medium      Time sequence of shots will be     With the medium parameter, but
            preserved, with the medium         the original order of shots will
            parameter                          be changed.
Rhythmic vs. Medium
Rhythmic:
 The video is segmented according to the audio clips and sub-clips.
 After projecting each audio segment onto the video time-line, we search
within that video range for the video segment with the highest score
and the same length as the audio segment.
 Finally, all the selected segments are concatenated.
[Figure: video track and audio track, rhythmic alignment]
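The search step in the rhythmic case amounts to finding, inside a range of per-frame scores, the contiguous window of a given length with the highest total score. A minimal sliding-window sketch (the function name and the tie-breaking rule of keeping the earliest window are assumptions):

```python
def best_window(scores, length):
    """Start index of the run of `length` frames with the highest total score."""
    if length <= 0 or length > len(scores):
        raise ValueError("window length must fit inside the score range")
    window = sum(scores[:length])
    best_sum, best_start = window, 0
    for start in range(1, len(scores) - length + 1):
        # Slide by one frame: add the new frame, drop the old one.
        window += scores[start + length - 1] - scores[start - 1]
        if window > best_sum:
            best_sum, best_start = window, start
    return best_start


start = best_window([1, 2, 9, 9, 1, 0], length=2)
```

Here `scores` would be the frame-level scores of the video range that one audio segment projects onto, and `length` the audio segment's length in frames.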
Medium:
 Each shot is reassigned a new length according to its shot
importance; shots may become longer or shorter in proportion to the
total length.
 After projection onto the video space, the length budget is calculated
according to the reduction rate; the budget is then allocated to each inner
shot according to its length.
 If an allocated shot length is too short (< 30 frames), its budget is
transferred to nearby shots.
[Figure: video track and audio track, medium alignment]
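The budget allocation in the medium case can be sketched as a proportional split followed by a pass that hands the budget of too-short shots to a neighbour. The slide does not say which neighbour receives the budget, so the rule below (next shot if there is one, otherwise the nearest earlier shot that is still kept) is an assumption, as are the names.

```python
def allocate_budget(weights, budget, min_frames=30):
    """Split `budget` frames across shots in proportion to `weights`.

    Shots that would receive fewer than `min_frames` frames are dropped
    and their share is transferred to a nearby shot.
    """
    total = sum(weights)
    alloc = [budget * w / total for w in weights]
    for i in range(len(alloc)):
        if not 0 < alloc[i] < min_frames:
            continue
        if i + 1 < len(alloc):
            target = i + 1
        else:
            kept = [j for j in range(i) if alloc[j] > 0]
            if not kept:
                continue  # nothing to transfer to; keep the short shot
            target = kept[-1]
        alloc[target] += alloc[i]
        alloc[i] = 0.0
    return alloc


alloc = allocate_budget([100, 10, 100, 90], budget=300)
```

The weights can be shot importances or shot lengths; the total allocated length always equals the budget.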
 However, there are some issues:
 A fast tempo/rhythm audio clip may be aligned to a static video
shot, which is annoying for the viewer.
 A slow audio clip may be aligned to a dynamic video shot.
 We apply an audio scaling coefficient in the synchronization stage.
The weight of a video shot’s motion intensity is decreased when it is
aligned with a slow audio clip, and nearly preserved when it is synchronized
with a fast audio clip.
 Another issue arises when the media lengths differ:
[Figure: video track and audio track of different lengths]
 This is unavoidable when the sequential policy is enforced.
 For some video sources, the order of shots is not so important, and
re-ordering shots will not degrade the original.
 If we allow the input video shots to be re-ordered, things may get better:
[Figure: video track and audio track after permutation of the shots]
 It sounds simple and intuitive, but developing an efficient algorithm
to find such a permutation is not an easy problem.
 Furthermore, a single “best” solution may not exist, and the optimal solution
may not be unique.
Non-Sequential Permutation
 So we developed a randomized algorithm to find a “not-bad” solution
within a predictable computation time.
 First, randomly permute the video shots.
 Then compute Ravc, the “audio-to-video coverage”, on the corresponding
time-line for each shot.
[Figure: video and audio tracks showing shots with Ravc = 1, Ravc = 2, and Ravc = 3]
 Then we calculate the average Ravc; each permutation has its own average Ravc.
 After many iterations, we keep the permutation with the minimal Ravc. In
theory we can approach the optimal solution efficiently and predictably; the
result depends only on how many iterations we perform.
 For example, with 10000 iterations per run:
Permutation                               Minimal Ravc
7 5 8 11 3 14 13 1 2 0 9 6 12 4 10        1.455571
11 14 2 10 1 3 9 6 4 0 12 13 7 8 5        1.482213
9 7 13 1 14 6 2 10 8 0 11 4 12 3 5        1.508536
7 3 5 11 12 8 0 13 1 2 14 10 6 4 9        1.425809
13 5 2 10 3 12 7 11 0 14 9 6 8 4 1        1.453530
 We can get a better solution with more iterations, but experiments show
that 10000 iterations are quite enough and are not a
burden on our computing power (it is actually really fast).
 Because of the randomness, each synchronization result is different.
But as discussed before, it is normal to have many solutions.
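The randomized search can be sketched as follows. The slides do not define Ravc precisely, so this sketch assumes that the Ravc of a shot is the number of audio clips its time span overlaps once the shots are laid end to end, and that a lower average is better. The function names, the seeding, and the data layout are hypothetical.

```python
import random


def average_ravc(order, shot_lengths, audio_bounds):
    """Mean number of audio clips each shot overlaps for a given shot order."""
    total, t = 0, 0
    for shot in order:
        start, end = t, t + shot_lengths[shot]
        # Count audio clips [a, b) that intersect the shot span [start, end).
        total += sum(1 for a, b in audio_bounds if a < end and start < b)
        t = end
    return total / len(order)


def search_permutation(shot_lengths, audio_bounds, iterations=10000, seed=None):
    """Try random shot orders and keep the one with the lowest average Ravc."""
    rng = random.Random(seed)
    order = list(range(len(shot_lengths)))
    best_order = order[:]
    best = average_ravc(order, shot_lengths, audio_bounds)
    for _ in range(iterations):
        rng.shuffle(order)
        r = average_ravc(order, shot_lengths, audio_bounds)
        if r < best:
            best_order, best = order[:], r
    return best_order, best


shots = [3, 5, 2, 5, 5]                          # shot lengths
audio = [(0, 5), (5, 10), (10, 15), (15, 20)]    # audio clip spans
best_order, best = search_permutation(shots, audio, iterations=2000, seed=0)
```

Each iteration is independent, so the running time grows linearly with the iteration count, which matches the "predictable computation time" remark above.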
Example 2
Jitterbug (吉魯巴) (19:08)
Music: 製造浪漫 ("Making Romance")
Length: 4:25
Profile: Sequential Medium and
Non-Sequential Medium
Performance Evaluation
 Development environment:
 AMD Duron 1.2 GHz with 386 MB RAM
 Analysis complexity:
 For video, analysis takes about 1.2 to 1.3 times the original video duration.
 For audio, about 2 minutes for a 5-minute clip; if spectral
analysis is performed, 4-5 minutes are needed.
 The audio/video analysis results are saved as description files, so the analysis
is required only once.
 The synchronization can be regarded as having O(n) complexity.
 Analysis usually requires less than 20 MB of RAM (depending on
the number of shots in the video).
 The synchronization result is saved as an AviSynth script. We then use
VirtualDub to encode the produced musical video.
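Writing the synchronization result as an AviSynth script could look like the sketch below, which emits one `Trim` per selected segment, joins them with `++`, and dubs the music over the result with `AudioDub`. The file names, the variable names inside the script, and the segment list are made up for illustration; the actual script layout used in this work is not shown in the slides.

```python
def avisynth_script(video_path, music_path, segments):
    """Build AviSynth script text from (first_frame, last_frame) segments.

    Frame numbers are inclusive, as in AviSynth's Trim. Note that
    Trim(x, 0) means "to the end of the clip", so a last_frame of 0
    should be avoided here.
    """
    lines = [f'video = AviSource("{video_path}", audio=false)']
    trims = [f"video.Trim({first}, {last})" for first, last in segments]
    lines.append("cut = " + " ++ ".join(trims))
    lines.append(f'music = WavSource("{music_path}")')
    lines.append("AudioDub(cut, music)")
    return "\n".join(lines) + "\n"


script = avisynth_script("trip.avi", "song.wav", [(0, 89), (300, 449)])
```

The resulting text can be saved with an `.avs` extension and opened in VirtualDub like an ordinary video file.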
Sample Videos
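Trip to Leofoo Village (六福村之旅) (31:55)
Honey harvesting in Wulai (烏來採蜂蜜) (60:34)
Sentosa Underwater World (聖淘沙海底世界) (17:59)
littleco concert (littleco 演唱會) (20:22)
Jitterbug (吉魯巴) (19:08)
Wedding ceremony (結婚典禮) (43:42)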
What’s Next?
 How should the evaluation experiment be designed?
 The subjective test should not over-burden the viewer.
 Add shot transition effects, such as dissolve, fade
in, and fade out?
 I’ve tried, but it is not as easy as I thought.
 The automatic approach may not always produce a
satisfactory result, and the experience is highly subjective
and differs from person to person.
 Semi-automatic is probably the best compromise. The automatic
result serves only as a pre-processing basis and a labor-saving tool.
 But a video editing tool is hard to develop, and I doubt whether it’s
necessary to develop one from scratch for the purpose of a thesis.
Questions and Discussion
 Any comments are welcome.
 Acknowledgment:
 Special thanks to Mr. 劉嘉倫 for his videos and suggestions.
 Thanks to the friends at DVworld who provided lots of ideas and
comments.
 Thanks to Chih-Hao Shen for his dancing video.

08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 

Recently uploaded (20)

Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 

Applying Media Content Analysis to the Production of Musical Videos as Summarization

  • 1. Applying Media Content Analysis to the Production of Musical Videos as Summarization (應用媒體內容分析於摘要性音樂影片之製作). 2004/02/16. Student: Chen-Hsiu Huang. Advisor: Prof. Ja-Ling Wu.
  • 2. Outline
      - Problem Formulation
      - Current Solutions
      - Our Goal
      - Gory Details
      - Performance Evaluation
      - What's Next?
      - Questions and Discussion
  • 3. Problem Formulation
      - Digital video capture devices such as DV camcorders have become affordable for end users.
      - Shooting video is fun, but editing it is frustrating.
      - There is still a tremendous barrier between amateurs (home users) and powerful video editing software.
      - In the end, people leave their precious shots in piles of DV tapes, unedited and unmanaged.
  • 4. According to a survey on DVworld*, video length relates to how many times users review a video afterwards:
      Video length     | Review times
      >= 1 hr          | 1 or 0
      30 min ~ 1 hr    | 2 ~ 3
      15 ~ 30 min      | 5 ~ 10
      5 ~ 15 min       | >= 10
      <= 5 min         | You take it out and watch it whenever you think of it!
      - Video clips of no more than 5 minutes are best for human concentration.
      * http://www.DVworld.com.tw/
  • 5. Facts about Musical Video
      - People are impatient with videos that have no scenario or voice-over, especially those with no music.
      - Improved soundtrack quality improves perceived video image quality.
      - Synchronizing video and audio segments enhances the perception of both.
      - One study at MIT showed that listeners judge an identical video image to be of higher quality when it is accompanied by higher-fidelity audio.
  • 6. Home videos can be roughly classified by their nature:
      Type          | Description
      Causal        | Shots within the video are causal; changing the order of shots may confuse the viewer.
      Non-causal    | Shots are not causal; it is OK to re-order the video shots.
      Recreational  | The video is meant to convey a kind of emotion or enjoyment.
      Memorial      | Videos of events such as weddings or graduation celebrations; they are memorial and each shot should be preserved properly.
      - Four profiles are proposed to deal with videos of different nature.
  • 7. Current Solutions
      - A consumer product called “muvee autoProducer” has been announced to ease the burden of professional video editing.
      - Its application scenario is quite simple:
          - Pick your video
          - Choose your favorite music
          - Produce a quality musical video
          - Select profiles to apply
  • 8. Our Goal
      - Although there are commercial products on the market, only a few related academic publications exist.
      - Jonathan Foote, Matthew D. Cooper, Andreas Girgensohn, "Creating music videos using automatic media analysis," ACM Multimedia 2002: 553-560
      - Content-analysis technologies have been developed for years; can we adopt them to help the automatic creation of musical videos?
      - Goal: to achieve comparable or better quality in a similar application scenario, using the content-analysis technologies developed in the multimedia domain.
  • 9. Proposed Framework (diagram)
      - Inputs: input video, input music
      - Video analysis: shot change, scene change; features: human face, flashlight, motion strength, color variance, camera operation, ...
      - Audio analysis: audio segment cutting; features: volume, ZCR, brightness, bandwidth, ...
      - Alignment: scene selection, key shot selection, audio rhythm and video motion/color synchronization
      - Output: output video
  • 10. Audio Analysis
      - We cut the input audio into several clips according to its audio features.
      - Frame-level features
          - Volume: defined as the MSR of the audio samples
          - ZCR: the number of times the audio waveform crosses the zero axis in each frame
      - Spectral features
          - Brightness: the centroid of the frequency spectrum
          - Bandwidth: the standard deviation of the frequency spectrum
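The four features listed on this slide can be sketched per frame as follows. This is a minimal illustration with NumPy, not the author's implementation: the function name `frame_features` is hypothetical, "MSR" is read here as the root-mean-square amplitude (an assumption), and the spectral features are computed with the magnitude spectrum as weights.

```python
import numpy as np

def frame_features(frame, sample_rate):
    """Return (volume, zcr, brightness, bandwidth) for one audio frame."""
    frame = np.asarray(frame, dtype=float)
    # Volume: root-mean-square amplitude (assumed reading of "MSR").
    volume = float(np.sqrt(np.mean(frame ** 2)))
    # ZCR: number of sign changes between consecutive samples.
    signs = np.signbit(frame)
    zcr = int(np.count_nonzero(signs[1:] != signs[:-1]))
    # Magnitude spectrum and the centre frequency of each bin.
    spectrum = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    total = spectrum.sum()
    if total == 0:
        return volume, zcr, 0.0, 0.0
    # Brightness: spectral centroid; bandwidth: spread around the centroid.
    brightness = float((freqs * spectrum).sum() / total)
    bandwidth = float(np.sqrt((((freqs - brightness) ** 2) * spectrum).sum() / total))
    return volume, zcr, brightness, bandwidth
```

For a pure tone the brightness lands on the tone frequency and the bandwidth is close to zero, which is a quick sanity check for the spectral part.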
  • 11. Choice of audio features
      - In general the brightness distribution curve is almost the same as the ZCR curve, so we use only the ZCR feature here.
      - Bandwidth is an important audio feature, but it is hard to say what a high or low bandwidth value physically means in music.
      - Furthermore, the relation between musical perception and bandwidth values is neither clear nor regular.
      (Plots: Brightness, ZCR, Volume, Bandwidth)
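  • 12. Audio Segmentation
      - First we cut the input audio into clips wherever the volume changes dramatically:
          $A_{cut}(i) = \begin{cases} 1, & F_{cut}(i) > VCut_{th} \\ 0, & \text{otherwise} \end{cases}$
          $F_{cut}(i) = \mathrm{abs}\left( \sum_{j=i-w+1}^{i} \frac{v_j}{w} - \sum_{j=i}^{i+w} \frac{v_j}{w} \right)$
          $VCut_{th} = \max(v_i) / 10$
      - Within each clip, we define a burst of ZCR as an “attack”, which may be a bass drum beat or the singer's voice:
          $A_{attack}(i) = \begin{cases} 1, & F_{attack}(i) > 2 \times \mathrm{std}(z_i) \\ 0, & \text{otherwise} \end{cases}$
          $F_{attack}(i) = \mathrm{abs}\left( z_i - \sum_{j=i-w+1}^{i} \frac{v_j}{w} \right) + \mathrm{abs}\left( z_i - \sum_{j=i}^{i+w} \frac{v_j}{w} \right)$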
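The two detectors on this slide can be sketched as below. This is a hedged illustration only: the helper names (`window_means`, `cut_flags`, `attack_flags`) are hypothetical, the exact window bounds are an assumption, and the attack detector takes its windowed means over the ZCR sequence itself, which is also an assumption about the slide's formula.

```python
import numpy as np

def window_means(x, w):
    """Mean of the w values ending at i (past) and of the w values after i (future)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    past = np.array([x[max(0, i - w + 1): i + 1].mean() for i in range(n)])
    # At the last index there is no future window, so fall back to x[i].
    future = np.array([x[i + 1: i + 1 + w].mean() if i + 1 < n else x[i]
                       for i in range(n)])
    return past, future

def cut_flags(volume, w):
    """A_cut(i): 1 where the windowed volume jump exceeds max(volume) / 10."""
    volume = np.asarray(volume, dtype=float)
    past, future = window_means(volume, w)
    f_cut = np.abs(past - future)
    return (f_cut > volume.max() / 10.0).astype(int)

def attack_flags(zcr, w):
    """A_attack(i): 1 where the ZCR departs from both neighbouring windows
    by more than twice the standard deviation of the ZCR sequence."""
    zcr = np.asarray(zcr, dtype=float)
    past, future = window_means(zcr, w)
    f_attack = np.abs(zcr - past) + np.abs(zcr - future)
    return (f_attack > 2.0 * zcr.std()).astype(int)
```

A step in the volume sequence raises `cut_flags` around the step, while an isolated ZCR spike raises `attack_flags` only at the spike itself.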
  • 13. Clips, sub-clips, and dynamics
      - A dramatic volume change defines an audio clip boundary, while the bursts of ZCR (attacks) in each clip define the granular sub-segments within it.
      (Figure: clip boundaries, with attacks as sub-clip separators. Example audio: 孫燕姿, 綠光)
      - We define the dynamic of each clip as:
          $A_{dynamic}(i) = \frac{\sum_j z_j}{len(i)}$
      - The dynamic feature serves later as a good reference for video/audio synchronization.
  • 14. Video Analysis
      - First we apply shot change detection to segment the video into shots.
      - We combine the pixel MAD method and the pixel histogram method to perform shot change detection:
          $V_{shot}(i) = \begin{cases} 1, & S_{MAD}(i) = 1 \text{ and } S_{HIST}(i) = 1 \\ 0, & \text{otherwise} \end{cases}$
      - Decision table:
                            | Dhist < Thhist | Dhist > Thhist
          Dcolor < Thcolor  | nothing        | nothing
          Dcolor > Thcolor  | unsuitable!    | shot change!
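The decision table on this slide can be sketched for a pair of consecutive grayscale frames as follows. This is an illustration under assumptions: `classify_transition` is a hypothetical name, Dcolor is taken to be the mean absolute pixel difference, Dhist a normalized histogram difference, and both threshold values are placeholders rather than the author's settings.

```python
import numpy as np

def classify_transition(prev, curr, th_color=30.0, th_hist=0.5, bins=16):
    """Classify the transition between two 8-bit grayscale frames."""
    prev = np.asarray(prev, dtype=float)
    curr = np.asarray(curr, dtype=float)
    # Dcolor: mean absolute difference (MAD) between co-located pixels.
    d_color = np.mean(np.abs(curr - prev))
    # Dhist: histogram difference, normalized to the range [0, 1].
    h_prev, _ = np.histogram(prev, bins=bins, range=(0, 256))
    h_curr, _ = np.histogram(curr, bins=bins, range=(0, 256))
    d_hist = np.abs(h_curr - h_prev).sum() / (2.0 * prev.size)
    if d_color <= th_color:
        return "nothing"
    # Large pixel change with a stable histogram marks an unsuitable frame;
    # a change in both measures marks a shot change.
    return "shot change" if d_hist > th_hist else "unsuitable"
```

Requiring both measures to fire keeps fast motion inside a shot (pixels change, histogram does not) from being reported as a cut.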
  • 15. Flashlight detection and sub-shot segmentation
      - Flashlight detection
          - A flashlight event would otherwise be detected as a shot change.
          - When a shot change is found, check whether:
              $\frac{Mean(L_i)}{Mean(L_{i-1})} \ge Flash_{th} \;\&\&\; \frac{Mean(L_i)}{Mean(L_{i+1})} \ge Flash_{th}$
          - If so, it is a flashlight event and should not be treated as a shot change.
      - Sub-shot segmentation
          - We use the MPEG-7 ColorLayout descriptor to measure the similarity between frames.
          - The first frame of each shot is selected as the basis, and each subsequent frame is compared with it. If
              $D(i) = \sum_{k=1}^{i} dist(F_0, F_k) \ge SubScene_{Th}$
            then we say that a sub-shot occurs at frame i.
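Both checks on this slide are small enough to sketch directly. The code below is an assumption-laden illustration: `is_flash` and `sub_shot_starts` are hypothetical names, the threshold value is a placeholder, the distance function is passed in by the caller (the slide uses an MPEG-7 ColorLayout distance, which is not implemented here), and resetting the basis frame after each sub-shot is an assumption.

```python
def is_flash(mean_luma, i, flash_th=1.5):
    """True if frame i is brighter than both neighbours by a factor of flash_th."""
    if i <= 0 or i >= len(mean_luma) - 1:
        return False
    prev, cur, nxt = mean_luma[i - 1], mean_luma[i], mean_luma[i + 1]
    if prev <= 0 or nxt <= 0:
        return False
    return cur / prev >= flash_th and cur / nxt >= flash_th

def sub_shot_starts(frames, dist, sub_scene_th):
    """Indices where the accumulated distance to the basis frame reaches the threshold."""
    starts = []
    if not frames:
        return starts
    basis, accumulated = frames[0], 0.0
    for i in range(1, len(frames)):
        accumulated += dist(basis, frames[i])
        if accumulated >= sub_scene_th:
            starts.append(i)
            # Assumed behaviour: the new sub-shot's first frame becomes the basis.
            basis, accumulated = frames[i], 0.0
    return starts
```

A single bright frame between two normal ones passes `is_flash`, whereas a lasting brightness change does not, so real cuts to a brighter scene are still kept as shot changes.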
  • 16. Camera Operation
      - Camera operations such as pan and zoom are widely used in amateur home videos. Detecting these operations helps capture the video taker's intention.
      - Our camera operation detection is based on the motion vectors in the P-frames of the MPEG video.
          Pan:  $1 \le \frac{\sum_i |v_i|}{\left| \sum_i v_i \right|} \le 3$
          Zoom: $\frac{\sum_i |v_i|}{\left| \sum_i v_i \right|} > 3$
      - This method is simple and efficient, yet it performs well at detecting camera operations.
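The pan/zoom test can be sketched from a list of motion vectors as follows. This is a hedged reading of the slide: the ratio is taken as the sum of vector magnitudes over the magnitude of the vector sum (an assumption about the formula), `classify_camera_operation` is a hypothetical name, and the extra "static" label for frames with almost no motion is an addition for the sketch.

```python
import numpy as np

def classify_camera_operation(vectors, min_motion=1e-6):
    """Classify one P-frame from its motion vectors, given as (dx, dy) pairs."""
    v = np.asarray(vectors, dtype=float).reshape(-1, 2)
    sum_of_norms = np.linalg.norm(v, axis=1).sum()
    norm_of_sum = np.linalg.norm(v.sum(axis=0))
    if sum_of_norms < min_motion:
        return "static"
    if norm_of_sum < min_motion:
        # Vectors cancel out completely, so the ratio is unbounded.
        return "zoom"
    ratio = sum_of_norms / norm_of_sum
    # Aligned vectors give a ratio near 1 (pan); diverging vectors give a large ratio (zoom).
    return "pan" if ratio <= 3 else "zoom"
```

By the triangle inequality the ratio is never below 1, which matches the lower bound in the pan condition.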
  • 17. Video Features
      - Frame-level features
          - The presence of human faces (the OpenCV library is used as the face detection module)
          - Motion intensity
          - Flashlight detection
          - Mean and standard deviation of the luminance plane
          - (Dcolor(i) > Thcolor && Dhist(i) < Thhist) defines the unsuitable frames
      - Shot-level features
          - Number and types of camera operations in each shot
          - Number of faces and flashlight events in each shot
          - The accumulated distance between each frame and the first frame, which describes the shot's homogeneity
  • 18. Importance Measure
      - Frame-level score function:
          $Score = \alpha \times (R_{face} + E_{flash}) + \beta \times (S_a \times R_{motion} + Camera_{op}) + \gamma \times \left( \frac{Mean - 130 + Std}{256} \right)$
          $R_{face} = \frac{Area_{face}}{W \times H}, \quad E_{flash} \in \{0, 1\}, \quad R_{motion} = \frac{Motion_i}{\max(Motion)}, \quad Camera_{op} \in \{0, 1, 2\}$
          $\alpha = 0.5, \quad \beta = 0.3, \quad \gamma = 0.2$
          $S_a$: a scaling coefficient derived from the features of the synchronized audio clip
      - Face and flashlight events have the highest weighting.
      - Camera operations and higher motion intensity reflect the video taker's intention, so such frames are more important.
      - Frames with higher luminance and larger standard deviation are more suitable.
      - The penalty for unsuitable frames will be discussed later.
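A frame score along these lines can be sketched as below. The grouping of terms is an assumed reconstruction of the slide's formula (in particular, the luminance term is read as (Mean - 130 + Std) / 256), and the function and parameter names are hypothetical.

```python
def frame_score(face_area, width, height, flash, motion, max_motion,
                camera_op, luma_mean, luma_std, audio_scale=1.0,
                alpha=0.5, beta=0.3, gamma=0.2):
    """Weighted frame importance: face/flash, motion/camera operation, luminance."""
    # R_face: fraction of the frame covered by detected faces.
    r_face = face_area / float(width * height)
    # E_flash: 1 when a flashlight event is present, else 0.
    e_flash = 1.0 if flash else 0.0
    # R_motion: motion intensity normalized by the maximum over the video.
    r_motion = motion / max_motion if max_motion > 0 else 0.0
    # Luminance term (assumed grouping): brighter, higher-contrast frames score higher.
    luma_term = (luma_mean - 130.0 + luma_std) / 256.0
    return (alpha * (r_face + e_flash)
            + beta * (audio_scale * r_motion + camera_op)
            + gamma * luma_term)
```

Here `audio_scale` plays the role of the audio scaling coefficient: a value below 1 lowers the weight of motion when the frame is aligned with a slow audio clip.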
  • 19. Shot-level importance
      - The shot-level importance is motivated by the following observations:
          - Shots with larger motion intensity can take a longer duration.
          - The presence of faces attracts the viewer.
          - Shots of higher heterogeneity can take a longer playing time.
          - Shots with more camera operations are more important.
          - Of course, shots that are longer in the original are more important.
      - Shot-level importance:
          $IMP = Len \times \left( \frac{Num_{face}}{Len} + \frac{\sum Camera_{op}}{Len} \right) \times \left( \frac{\sum Motion}{Len} \right) \times \left( \frac{\sum Diff}{Len} \right)$
      - The shot-level importance function is used in the medium profile to reassign each shot's length according to its importance.
          - Static shots get less time, while dynamic shots can take longer.
          - This gives better results after editing.
          - “muvee autoProducer” does not reassign each shot's length!
  • 20. Example 1
      - Video: 六福村之旅 (Trip to Leofoo Village) (31:55)
      - Music: SHE / 美麗新世界
      - Length: 4:25
      - Profile: Sequential Medium
  • 21. Proposed Profiles
      - Profiles let users customize their videos easily, according to the nature of the content and their own preference.
      - As stated earlier, home videos fall into four types: causal, non-causal, recreational, and memorial.
      - For causal or non-causal videos, we use the sequential or non-sequential parameter.
      - For memorial or recreational videos, the rhythmic or medium parameter was developed.
          - In rhythmic, the music tempo/rhythm is better preserved, while some shots of the video will be dropped.
          - In medium, the match with the music tempo/rhythm is not as clear as in rhythmic, but most of the shots are guaranteed to be shown. The medium parameter preserves the most of the original video.
  • 22. Thus we have four profiles: Sequential Rhythmic, Sequential Medium, Non-Sequential Rhythmic, and Non-Sequential Medium.
                | Sequential                                                         | Non-Sequential
      Rhythmic  | The time sequence of shots is preserved, with the rhythmic parameter. | The rhythmic parameter is used, but the original order of shots is changed.
      Medium    | The time sequence of shots is preserved, with the medium parameter.   | The medium parameter is used, but the original order of shots is changed.
  • 23. Rhythmic vs. Medium (rhythmic)
      - The video is segmented according to the audio clips and sub-clips.
      - After projecting each audio segment onto the video timeline, we search the corresponding video range for the video segment with the highest score and the same length as the audio segment.
      - Finally, all selected segments are concatenated.
      (Diagram: video track aligned against audio track)
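The search step for the rhythmic parameter can be sketched with prefix sums over per-frame scores. This is only an illustration: `best_window` is a hypothetical name, and taking the total frame score of a window as the segment score is an assumption.

```python
import numpy as np

def best_window(scores, start, end, length):
    """Start index of the `length`-frame window inside [start, end) with the highest total score."""
    scores = np.asarray(scores, dtype=float)
    if length <= 0 or end - start < length:
        raise ValueError("video range is shorter than the audio segment")
    # prefix[k] holds the sum of scores[0:k], so any window sum is one subtraction.
    prefix = np.concatenate(([0.0], np.cumsum(scores)))
    starts = np.arange(start, end - length + 1)
    totals = prefix[starts + length] - prefix[starts]
    # On ties, argmax keeps the earliest window.
    return int(starts[np.argmax(totals)])
```

Calling this once per audio segment, with `[start, end)` set to the projected video range and `length` set to the audio segment length in frames, yields the segments to concatenate.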
  • 24. Rhythmic vs. Medium (medium)
      - Each shot is reassigned a new length according to its shot importance; shots may become longer or shorter in proportion to the total length.
      - After projection onto the video space, the length budget is calculated according to the reduction rate and then allocated to each inner shot according to its length.
      - If an allocated shot length is too short (< 30 frames), its budget is transferred to nearby shots.
      (Diagram: video track aligned against audio track)
  • 25. Synchronization issues
      - However, there are some issues:
          - A fast tempo/rhythm audio clip may be aligned to a static video shot, which is annoying for the viewer.
          - A slow audio clip may be aligned to a dynamic video shot.
      - To address this, we apply an audio scaling coefficient in the synchronization stage. The weight of a video shot's motion intensity is decreased when the shot is aligned with a slow audio clip, and nearly preserved when it is synchronized with a fast audio clip.
      - Another issue arises when the media lengths differ.
      (Diagram: video track aligned against audio track)
      - This is unavoidable when the sequential policy is enforced.
  • 26. Re-ordering shots
      - For some video sources the order of shots is not so important, and re-ordering the shots will not degrade the original.
      - If we allow the input video shots to be re-ordered, things may get better.
      (Diagram: video track permuted against audio track)
      - It sounds simple and intuitive, but developing an efficient algorithm to find such a permutation is not an easy problem.
      - Furthermore, a single “best” solution may not exist, and more than one permutation may be optimal.
  • 27. Non-Sequential Permutation
      - So we developed a randomized algorithm to find a “not-bad” solution within a predictable computation time.
      - First, randomly permute the video shots.
      - Then compute Ravc, the “audio-to-video coverage”, on the corresponding timeline for each shot.
      (Diagram: shots against audio clips with Ravc = 1, Ravc = 2, Ravc = 3)
      - Then calculate the average Ravc; each permutation has its own average Ravc.
      - After many iterations, keep the permutation with the minimal Ravc. In theory we can approach the optimal solution efficiently and predictably, depending only on how many iterations we perform.
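The random search can be sketched as follows. The slide does not define Ravc precisely, so this sketch assumes that a shot's Ravc is the number of audio clips its interval overlaps on the shared timeline; the function names and that definition are assumptions, not the author's implementation.

```python
import random
from bisect import bisect_left, bisect_right
from itertools import accumulate

def average_ravc(shot_lengths, order, audio_bounds):
    """Average number of audio clips overlapped by each shot, for a given shot order.

    `audio_bounds` holds the sorted interior clip boundaries on the timeline.
    """
    t, total = 0, 0
    for idx in order:
        start, end = t, t + shot_lengths[idx]
        # Every boundary strictly inside (start, end) adds one more overlapped clip.
        inside = bisect_left(audio_bounds, end) - bisect_right(audio_bounds, start)
        total += 1 + max(0, inside)
        t = end
    return total / len(order)

def search_permutation(shot_lengths, audio_lengths, iterations=10000, seed=None):
    """Keep the random shot order with the smallest average Ravc."""
    rng = random.Random(seed)
    bounds = list(accumulate(audio_lengths))[:-1]
    order = list(range(len(shot_lengths)))
    best_order, best = order[:], average_ravc(shot_lengths, order, bounds)
    for _ in range(iterations):
        rng.shuffle(order)
        ravc = average_ravc(shot_lengths, order, bounds)
        if ravc < best:
            best_order, best = order[:], ravc
    return best_order, best
```

The running time grows linearly with the number of iterations, which is what makes the computation time predictable; different seeds may return different orders with the same or similar Ravc.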
  • 28. For example, with 10000 iterations performed:
      Permutation                             | Minimal Ravc
      7 5 8 11 3 14 13 1 2 0 9 6 12 4 10      | 1.455571
      11 14 2 10 1 3 9 6 4 0 12 13 7 8 5      | 1.482213
      9 7 13 1 14 6 2 10 8 0 11 4 12 3 5      | 1.508536
      7 3 5 11 12 8 0 13 1 2 14 10 6 4 9      | 1.425809
      13 5 2 10 3 12 7 11 0 14 9 6 8 4 1      | 1.453530
      - More iterations can give a better solution, but experiments show that 10000 iterations are quite enough and are not a burden on our computation power (it is actually really fast).
      - Because the algorithm is random, each synchronization result will be different. As discussed before, it is normal to have many solutions.
  • 29. Example 2
      - Video: 吉魯巴 (Jitterbug) (19:08)
      - Music: 製造浪漫
      - Length: 4:25
      - Profile: Sequential Medium and Non-Sequential Medium
  • 30. Performance Evaluation
      - Development environment: AMD Duron 1.2 GHz with 386 MB RAM
      - Analysis complexity:
          - For video, about 1.2~1.3:1 relative to the original video duration.
          - For audio, about 2 minutes for a 5-minute clip; if the spectral analysis is performed, 4-5 minutes are needed.
          - The audio/video analysis results are saved as description files, so the analysis is required only once.
      - The synchronization can be regarded as having O(n) complexity.
      - During analysis, usually less than 20 MB of RAM is required (depending on how many shots the video contains).
      - The synchronization result is saved as an AviSynth script. We then use VirtualDub to encode the produced musical video.
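  • 31. Sample Videos
      - 六福村之旅 (Trip to Leofoo Village) (31:55)
      - 烏來採蜂蜜 (Honey Harvesting in Wulai) (60:34)
      - 聖淘沙海底世界 (Sentosa Underwater World) (17:59)
      - littleco 演唱會 (littleco concert) (20:22)
      - 吉魯巴 (Jitterbug) (19:08)
      - 結婚典禮 (Wedding Ceremony) (43:42)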
  • 32. What’s Next?
      - How should the experimental evaluation be designed?
          - The subjective test should not over-burden the viewer.
      - Adding shot transition effects, such as dissolve, fade in, and fade out?
          - I have tried, but it is not as easy as I thought.
      - The automatic approach may not always produce a satisfactory result, and the experience is highly subjective and differs from person to person.
          - Semi-automatic editing is probably the best compromise. The automatic result serves only as a pre-processing basis and a labor-saving tool.
          - But a video editing tool is hard to develop, and I doubt whether it is necessary to develop one from scratch for the purpose of a thesis.
  • 33. Questions and Discussion
      - Any comments are welcome.
      - Acknowledgment:
          - Special thanks to Mr. 劉嘉倫 for his videos and suggestions.
          - Thanks to the friends at DVworld who provided lots of ideas and comments.
          - Thanks to Chih-Hao Shen for his dancing video.