SlideShare a Scribd company logo
1 of 4
Download to read offline
K. Shuma Roshini et al Int. Journal of Engineering Research and Application
ISSN : 2248-9622, Vol. 3, Issue 6, Nov-Dec 2013, pp.338-341

RESEARCH ARTICLE

www.ijera.com

OPEN ACCESS

Implementation of Diamond Search Algorithm Using Parallel
Processing Architecture
K. Shuma Roshini1, M. Tejaswi2,
1
2

M.Tech in Embedded Systems, Gudlavalleru Engineering College, Gudlavalleru, A.P.INDIA.
Assistant Professor in ECE Department, Gudlavalleru Engineering College, Gudlavalleru, A.P.INDIA.

Abstract
In video communication whole content of video cannot be stored without processing. So there is a need to
compress the video before transmission and storage this process is called as video compression. Video
compression plays an important role with regard to real-time scouting/video conferencing applications.
Regarding the entire motion based video compression process, movement estimation is the most
computationally expensive and time consuming process. Motion estimation is the key element in video
compression. The Motion Estimation is a process which determines motion between two or more frames and
finds best possible macro block. There are several algorithms on block matching to name a few, Full Search
Motion estimation [FS], Three Step Search Motion Estimation [TSS], New Three Step Search Motion
Estimation [NTSS], Four Step Search Motion Estimation [FSS], Diamond Search Motion Estimation
[DS].Instead of trying to further reduce computational complexity of these algorithms it is better to implement
these algorithms on parallel processing architecture. In this paper Diamond Search Algorithm is implementation
on CPU and GPU.
Key Words: Motion Estimation, video compression, DS, GPU,CUDA.

I.

INTRODUCTION

A raw video requires 2 to 3 GB of memory
for one minute depending on resolution of the frame,
it is impractical to store and transfer this much
amount of video, for this purpose video compression
is used. By using the technique of motion estimation
video can be encoded, motion estimation is the process
which determines motion between two or more frames
and it is used to find best possible macro block. Video
sequences consists of high level of redundancy
between consecutive frame it means changes between
frames are very less. In temporal redundancy the
reduction of redundancy involves encoding of a first
reference frame and the current frame, while the
current frame encodes only the difference from the
reference frame so this require a lot of computational
complexity. In order to reduce computational
complexity of ME [1] algorithms, a number of Block
Matching Algorithms (BMA)[2] came into existences.
At some point huge Computational complexity for all
the algorithms is going to increase. So, Diamond
Search algorithm is implemented on GPU to reduce
compression time of a video.
A Graphics Processing Unit (GPU)[3] is a
electronic circuit designed to rapidly manipulate and
alter memory to accelerate the creation of images in a
frame buffer intended for output to a display. The
exhaustive motion search is implemented on GPU,
instead of other fast ME algorithms, because of its
regular memory access pattern. Although other fast
ME algorithm can certainly be implemented, need an
additional layer of texture to specify the target
www.ijera.com

searching position and this results in random memory
access pattern and dependent texture read. The
repercussion of these are quite significant in modern
graphics architecture.
Motion Estimation
Mainly in video editing motion estimation [4]
is a type of video compression scheme. The motion
estimation process is done by the coder to find the
motion vector pointing to the best prediction macro
block in a reference frame. For compression
redundancy between adjacent frames can be exploited
where a frame is selected as a reference and
subsequent frames are predicted from the reference
using motion estimation. The motion estimation
process analyzes previous or future frames to identify
blocks that have not changed, and motion vectors are
stored in place of blocks.

Figure 1: Comparing two Frames
Figure 1 shows an example of a frame with 2
stick figures and a tree. The second half of this figure
is an example of a possible next frame, where panning
has resulted in the tree moving down and to the right,
338 | P a g e
K. Shuma Roshini et al Int. Journal of Engineering Research and Application
ISSN : 2248-9622, Vol. 3, Issue 6, Nov-Dec 2013, pp.338-341
and the figures have moved farther to the right because
of their own movement outside of the panning. The
problem for motion estimation to solve is how to
adequately represent the changes, or differences,
between these two video frames.
It will probably worth mentioning that
although motion estimation is also employed in many
other disciplines an example would be computer
vision, target tracking, and industrial monitoring, the
techniques developed particularly for image coding a
variety of in certain respects. The intention of image
compression would be to lessen the total transmission
bit rate for reconstructing images for the receiver.
Hence, the motion information should occupy simply
a small amount of the transmission bandwidth
additional onto the picture contents information. So
long as the motion parameters obtain can effectively
reduce the total bit rate, these parameters need not
function as true motion parameters. Besides, the
reconstructed images for the receiving end are
sometimes distorted. Therefore, in case the
reconstructed images are employed for estimating
motion information, an extremely strong noise
component must not be ignored.
The motion estimation problem, involves
two related sub-problems: i) identify the moving
object boundaries, so-called motion segmentation, and
ii) estimate the motion parameters of each one moving
object, so-called motion estimation in strict sense.
Within our use, a moving object is basically a team of
contiguous pels that share the same variety of motion
parameters. This definition does not necessarily equal
the ordinary meaning of object. For instance, within
the videophone scene, the still background may
include wall, bookshelf, decorations, etc. Given that
these merchandise are stationary (sharing the same
zero motion vector), they could be considered as one
single object in the case of motion estimation.
An appealing point in studying the motion
estimation techniques for standards is the fact that the
current video standards specify for only the decoders.
Assuming motion displacement information (motion
vectors) are generated through encoder and transmitted
to the decoder, the decoder only performs the motion
compensation
operation—patch
the
to-bereconstructed picture making use of the known
(already decoded) picture(s). The encoder, which
performs the motion estimation operation, is not really
explicitly specified among the standards. Hence, it is
more than possible use different motion estimation
techniques to produce standards-compatible motion
information. At the decoder, excluding the resources
of coded pictures which can be used for motion
compensation, fundamentally the same block-based
motion compensation operation is designed by most of
the popular video standards, H.261, H.263, MPEG-1,
and MPEG-2[5][6].

II.

LITERATURE SURVEY

As motion estimation happens to be the most
www.ijera.com

www.ijera.com

compute intensive operate in an H.264 encoder,
extensive researches most certainly been completed to
accelerate the motion estimation operate in GPU.
Since full search algorithm has more parallelism
compared to any search algorithm, it's usually used to
parallelize motion estimation in GPU [7]. But can
often obtain huge performance gain in observation to
the JM encoder, it is not clear if their result is yikes
sufficient to be applied practically. In the other search
algorithms, motion vector prediction (MVP) is made
to increase the compression ratio. In MVP, the initial
motion vector (MV) of the current macro block (MB)
is predicted beginning with the MVs of one's neighbor
macro blocks, assuming that they have been already
encoded in a sequential raster order. And so the
motion vector associated with a block cannot be
predicted if MB encoding is performed in parallel.
Since MVP limits parallelism, several approaches are
proposed to extend the parallelism. If simply don't
MVP, it can be observed that the coding efficiency is
significantly degraded, resulting in a larger output
size.
Kung has proposed a block-based ME
algorithm which uses 4x4 blocks as a substitute for
16x16 blocks. For being frame is separated into finer
blocks, the total number of blocks that could be
executed in parallel is increased. Since dynamic
behavior like early termination introduces poor
performance in GPU resulting from load imbalance,
they proposed a fresh search algorithm based on three
step search algorithm which has no early termination
technique. Schwalb et al. have proposed a different
approach. They ignored MVP to completely exploit
macro block (MB) level parallelism. Instead, they use
other predictors provided by Forward Dominant
Vector Selection (FDVS) and Split and Merge
techniques. In FDVS, it predicts the motion vector of n
1th frame by reusing the motion vectors of n previous
frames. In Split and Merge, it exploits correlation of
movement vectors between variable block sizes. Their
result separated they will be able to achieve
competitive quality and coding efficiency in
comparison with UMHexagons Search algorithm even
if they use Diamond Search algorithm [8]. These
techniques can easily be applicable to our approach
that if use multiple reference frames. Currently, only
consider one reference frame. Search algorithm that
proposed is the same as our algorithm in making use
of a diamond pattern for search points. While their
algorithm considers only a 3x3 square pattern, Here
assume holistic diamond pattern (nxn) and of course
the value of n that optimizes the performance within a
target GPU is about through measurements. Within
this particular approach, work with early termination
techniques and prevent the load imbalance by
carefully thinking about the underlying GPU
architecture. To fully exploit MB level parallelism,
here do not consider MVP that poses dependencies
among the sequence of MBs encoding. To bypass the
lack of MVP, Propose a new search algorithm fitted to
the GPU architecture, where more computations are
339 | P a g e
K. Shuma Roshini et al Int. Journal of Engineering Research and Application
ISSN : 2248-9622, Vol. 3, Issue 6, Nov-Dec 2013, pp.338-341
performed yet in along-side maintain as much coding
efficiency as MVP would supply.

III.

PROPOSED SYSTEM

Compute Unified Device Architecture
(CUDA) CUDA[9][10]
serves as a parallel
programming framework for utilizing GPU for general
computing. Typical GPUs consist of hundreds of
processing cores very effective at achieving immense
parallel computing performance.Today all of
NVIDIA's GPUs are CUDA GPUs. CUDA is not
computer architecture in the sense of a definition of an
instruction set and a set of architectural registers;
binaries compiled for one CUDA GPU do not
necessarily run on all CUDA GPUs. More specifically,
NVIDIA defines different CUDA compute capabilities
to describe the features supported by CUDA hardware.
The first CUDA GPUs had compute capability 1.0. In
2011 NVIDIA released GPUs with compute capability
2.1, which is known as Fermi" architecture.
CUDA is based on the SIMD (Single
Instruction Multiple Data) architecture and suited to
exploiting various stages of data parallelism. A CUDA
GPU consists of multiple so-called streaming
multiprocessors (SMs).The threads executing a GPU
program, a so-called kernel, are grouped in blocks.
Threads belonging to one block all run on the same
multiprocessor but one multiprocessor can run
multiple blocks concurrently. Blocks are further
divided into groups of 32 threads called warps; the
threads belonging to one warp are executed in lock
step, i.e., they are synchronized. CUDA is extension to
C/C++ languages and allows programmers to write
parallel programs for GPU.
Graphic Processing Unit is a computer chip
that performs rapid mathematical calculations,
primarily for rendering purpose. NVIDIA’s GT
GeForce 610 GPU has been chosen for implementing
diamond search algorithm. This GPU consist of 48
processor cores where total work is allocated among
all these processors. GPU is connected in a CPU using
PCI slot of DDR3 memory type. It is also provided
with boost clock where GPU can run at higher speed
depending upon number of input factors that are used
to determine whether it is a good idea to run at higher
clock or not. Windows visual studio 2010 is used
which is a Integrated development Environment (IDE)
provides language services for all programming
languages with code editor and debugger.
For
debugging purpose NVCC compiler is used which
separate source codes and device code. NVCC
generates both instructions for host and GPU, as well
as instructions to send data back and forward between
them. Device functions are processed by NVIDIA
compiler and Host functions are processed by host
compiler.
Specifications of GT GeForce 610
GPU Engine Specs:
• 48CUDA Cores
www.ijera.com

www.ijera.com

• 810Base Clock
• 1620Boost Clock
• 6.5Texture Fill Rate (billion/sec)
Memory Specs:
• 1.8 Gbps Memory Clock
• 1024MBStandard Memory Configuration
• DDR3Memory Interface
• 64-bitMemory Interface Width
• 14.4Memory Bandwidth (GB/sec)
Feature Support:
• 4.2OpenGL
• PCI Express 2.0Bus Support
• Certified for Windows 7
• DirectX 11, CUDA, PhysXSupported
Technologies
Diamond Search Motion Estimation
The diamond search algorithm[8][11] has two
search patterns first pattern consist of large diamond
search pattern which consists of nine points from
which eight points surrounds the center one to form a
diamond shape which is called as Large Diamond
Search Point (LDSP). Second pattern consist of five
search points which forms smaller diamond and it is
called as Small Diamond Search Point (SDSP).

In searching procedure of the DS algorithm,
first LDSP is searched repeatedly until minimum
block distortion (MBD) occurs at the center point. If
MBD occurs at the center point then search pattern is
switched from LDSP to SDSP and search completes.
Points yield from this search represents the motion
vectors of best matching block.
Summary of DS algorithm:
Step 1: Initially LDSP is centered at the origin of the
search window, and the 9 checking points of LDSP are
tested. If MBD is located at the center then go to
step3; otherwise step2.
Step 2: MBD point found in the previous search step
is re-positioned as the center point to form a new
LDSP. If the new MBD point obtained is located at the
center position, go to step3; otherwise, recursively
repeat this step.
Step3: Now switch the search pattern from LDSP to
SDSP. The MBD point found in this step is the final
solution of the motion vectors which points to the best
matching block.

340 | P a g e
K. Shuma Roshini et al Int. Journal of Engineering Research and Application
ISSN : 2248-9622, Vol. 3, Issue 6, Nov-Dec 2013, pp.338-341

www.ijera.com

I am glad to express my deep sense of
gratitude to Dr. M. Kamaraju, HOD of the ECE
Department and entire staff of ECE Department,
Gudlavalleru Engineering College, Gudlavalleru for
their great support and encouragement in completion
of this project.

REFERENCES
[1]
Figure 3: Light color dots represents the LDSP in this
case lets us assume that center point contains least
MBD then switch pattern shifts from LDSP to SDSP,
thick color dots represents SDSP center point is final
MBD which is best possible motion vector.

IV.

[2]

RESULTS

GPU experimental results are compared with
existing results which are taken from [12] in C
environments with different configurations.
Table 1: Comparison of run times on CPU and
GPU

[3]

[4]

[5]

[6]

Simulation experimental video is given to
GPU for implementing Diamond Search Algorithm for
encoding purpose. In the above table run time of
different videos are compared with run time of CPU.
Encoding process of video occur faster in GPU. Table
2 shows by how much factor video can be compressed
a video using this algorithm.

[7]

[8]

Table 2: Compression Factor of Videos
[9]

[10]

[11]

V.

CONCLUSION

In this paper Diamond Search Algorithm is
implemented on parallel processing architecture for
NVIDIA GPU. After observing all the communication
overhead between host and device this proposed
algorithm takes less run time compared to that of CPU
with a 4x speed by only parallelizing motion estimation
block in the encoding side.

VI.
www.ijera.com

[12]

Ahamdi, M. M. Azadfar “implementation of
fast motion estimation algorithms & comparsion
with full search method in H.264”, IJCSNS
international journal of computer science &
Network security, vol 8, No.3, pp139-143,
March 2008
S. Immanuel Alex Pandian, Dr.G.josemin
BalaBecky Alma George, / A Study on Block
Matching Algorithms for Motion Estimation
International Journal on Computer Science and
Engineering (IJCSE) ISSN : 0975-3397, Vol. 3
No. 1 Jan 2011.
John Nickolls, William j.Dally, “ The GPU
Computing Era,” IEEE Annals of the History of
Computing Publication(2010), pp.56-69, ISBN:
0272 -1732.
R. Srinivasan and K. R. Rao, “predictive coding
based on efficient motion estimation,” IEEE
Trans. Commun., vol. COMM-33,pp.888-896,
Aug. 1985.
K.R.Rao and j.j Hwang, techniques and
standards for image, video and audio coding.
Englewood Cliffs, NJ:prentice Hall,1996
“MPEG-4 video verification model, ver. 14.0,”
ISO/IECJTC1/SC29/WG11/N29232, Oct. 1999.
Zhou Jing, Jiao Liang bao, Cao Xuehong, “
implementation of parallel full search algorithm
for motion estimation on multi core
Processors,” (ICNIT) pp 31-35, ISBN 978-14577-0266-2, 2011
D.v.manjunatha
and
Dr.sainarayanan
“Comparsion and implementation of fast block
matching motion estimation algorithms for
video compression”,ISSN:0975- 5462, Vol.3
No.10 October 2011 .
W.Chen, H.Hang, “H.264/AVC motion
estimation implementation on compute Unified
Device Architecture(CUDA)”, ICME 2008.
Youngsub ko, youngmin yi and soonhoi ha, “an
9efficient parallel motion estimation algorithm
and x264 parallelization in CUDA”, ISBN: 9781-4577-0620-2,page.no 91-98, IEEE,(2011).
S.Zhu, K.ma, “ A New Diamond search
Algorithm for Fast Block-matching Motion
Estimation”, IEEE TRANSACTIONS ON
IMAGE PROCESSING, VOL.9, NO.2,
Feb.2000.
D .rukmanidevi, p. rangarajan and j.raja paul,
“A Novel search algorithm for variable block
size motion estimation of H.264/AVC” ISSN:
1450-216X, vol 81 no 2(2012), pp. 160-167.

ACKNOWLEDGMENT
341 | P a g e

More Related Content

What's hot

Hybrid compression based stationary wavelet transforms
Hybrid compression based stationary wavelet transformsHybrid compression based stationary wavelet transforms
Hybrid compression based stationary wavelet transforms
Omar Ghazi
 
ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...
ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...
ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...
cscpconf
 
A novel approach for satellite imagery storage by classifying the non duplica...
A novel approach for satellite imagery storage by classifying the non duplica...A novel approach for satellite imagery storage by classifying the non duplica...
A novel approach for satellite imagery storage by classifying the non duplica...
IAEME Publication
 
A novel approach for satellite imagery storage by classify
A novel approach for satellite imagery storage by classifyA novel approach for satellite imagery storage by classify
A novel approach for satellite imagery storage by classify
iaemedu
 

What's hot (18)

40120140503006
4012014050300640120140503006
40120140503006
 
Passive techniques for detection of tampering in images by Surbhi Arora and S...
Passive techniques for detection of tampering in images by Surbhi Arora and S...Passive techniques for detection of tampering in images by Surbhi Arora and S...
Passive techniques for detection of tampering in images by Surbhi Arora and S...
 
Survey paper on image compression techniques
Survey paper on image compression techniquesSurvey paper on image compression techniques
Survey paper on image compression techniques
 
A REGULARIZED ROBUST SUPER-RESOLUTION APPROACH FORALIASED IMAGES AND LOW RESO...
A REGULARIZED ROBUST SUPER-RESOLUTION APPROACH FORALIASED IMAGES AND LOW RESO...A REGULARIZED ROBUST SUPER-RESOLUTION APPROACH FORALIASED IMAGES AND LOW RESO...
A REGULARIZED ROBUST SUPER-RESOLUTION APPROACH FORALIASED IMAGES AND LOW RESO...
 
Hybrid compression based stationary wavelet transforms
Hybrid compression based stationary wavelet transformsHybrid compression based stationary wavelet transforms
Hybrid compression based stationary wavelet transforms
 
ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...
ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...
ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...
 
NEW IMPROVED 2D SVD BASED ALGORITHM FOR VIDEO CODING
NEW IMPROVED 2D SVD BASED ALGORITHM FOR VIDEO CODINGNEW IMPROVED 2D SVD BASED ALGORITHM FOR VIDEO CODING
NEW IMPROVED 2D SVD BASED ALGORITHM FOR VIDEO CODING
 
Multiresolution SVD based Image Fusion
Multiresolution SVD based Image FusionMultiresolution SVD based Image Fusion
Multiresolution SVD based Image Fusion
 
538 207-219
538 207-219538 207-219
538 207-219
 
DCT based Steganographic Evaluation parameter analysis in Frequency domain by...
DCT based Steganographic Evaluation parameter analysis in Frequency domain by...DCT based Steganographic Evaluation parameter analysis in Frequency domain by...
DCT based Steganographic Evaluation parameter analysis in Frequency domain by...
 
IRJET- Comparison and Simulation based Analysis of an Optimized Block Mat...
IRJET-  	  Comparison and Simulation based Analysis of an Optimized Block Mat...IRJET-  	  Comparison and Simulation based Analysis of an Optimized Block Mat...
IRJET- Comparison and Simulation based Analysis of an Optimized Block Mat...
 
M017427985
M017427985M017427985
M017427985
 
Ijnsa050207
Ijnsa050207Ijnsa050207
Ijnsa050207
 
Content Based Image Retrieval Using 2-D Discrete Wavelet Transform
Content Based Image Retrieval Using 2-D Discrete Wavelet TransformContent Based Image Retrieval Using 2-D Discrete Wavelet Transform
Content Based Image Retrieval Using 2-D Discrete Wavelet Transform
 
A novel approach for satellite imagery storage by classifying the non duplica...
A novel approach for satellite imagery storage by classifying the non duplica...A novel approach for satellite imagery storage by classifying the non duplica...
A novel approach for satellite imagery storage by classifying the non duplica...
 
A novel approach for satellite imagery storage by classify
A novel approach for satellite imagery storage by classifyA novel approach for satellite imagery storage by classify
A novel approach for satellite imagery storage by classify
 
Motion Detection and Clustering Using PCA and NN in Color Image Sequence
Motion Detection and Clustering Using PCA and NN in Color Image SequenceMotion Detection and Clustering Using PCA and NN in Color Image Sequence
Motion Detection and Clustering Using PCA and NN in Color Image Sequence
 
O017429398
O017429398O017429398
O017429398
 

Viewers also liked

Dichos y refranes
Dichos y refranesDichos y refranes
Dichos y refranes
mpmg3
 
Johan,Cesar,Leonardo,Brian
Johan,Cesar,Leonardo,BrianJohan,Cesar,Leonardo,Brian
Johan,Cesar,Leonardo,Brian
cesar
 
Diapositivas De Tecnologia Mary 9e
Diapositivas De Tecnologia Mary 9eDiapositivas De Tecnologia Mary 9e
Diapositivas De Tecnologia Mary 9e
guestaaf8e1
 
Paradigama De La Propiedad Intelectual
Paradigama De La Propiedad IntelectualParadigama De La Propiedad Intelectual
Paradigama De La Propiedad Intelectual
Jorge Hurtado
 
Isu isu-pendidikan-di-malaysia-121016052843-phpapp02
Isu isu-pendidikan-di-malaysia-121016052843-phpapp02Isu isu-pendidikan-di-malaysia-121016052843-phpapp02
Isu isu-pendidikan-di-malaysia-121016052843-phpapp02
Tan Hooi
 

Viewers also liked (20)

Bx36449453
Bx36449453Bx36449453
Bx36449453
 
Dy4301752755
Dy4301752755Dy4301752755
Dy4301752755
 
Pablo , pantoja
Pablo , pantojaPablo , pantoja
Pablo , pantoja
 
O canto do uirapuru
O canto do uirapuruO canto do uirapuru
O canto do uirapuru
 
Dichos y refranes
Dichos y refranesDichos y refranes
Dichos y refranes
 
Johan,Cesar,Leonardo,Brian
Johan,Cesar,Leonardo,BrianJohan,Cesar,Leonardo,Brian
Johan,Cesar,Leonardo,Brian
 
Diapositivas De Tecnologia Mary 9e
Diapositivas De Tecnologia Mary 9eDiapositivas De Tecnologia Mary 9e
Diapositivas De Tecnologia Mary 9e
 
Atmosfera
AtmosferaAtmosfera
Atmosfera
 
Adriana vicente tese de mestrado
Adriana vicente   tese de mestradoAdriana vicente   tese de mestrado
Adriana vicente tese de mestrado
 
Paradigama De La Propiedad Intelectual
Paradigama De La Propiedad IntelectualParadigama De La Propiedad Intelectual
Paradigama De La Propiedad Intelectual
 
Isu isu-pendidikan-di-malaysia-121016052843-phpapp02
Isu isu-pendidikan-di-malaysia-121016052843-phpapp02Isu isu-pendidikan-di-malaysia-121016052843-phpapp02
Isu isu-pendidikan-di-malaysia-121016052843-phpapp02
 
сказ о том как группа «карусель»
сказ о том как группа «карусель»сказ о том как группа «карусель»
сказ о том как группа «карусель»
 
Upacara kehamilan masyarakat sulawesi tengah
Upacara kehamilan masyarakat sulawesi tengahUpacara kehamilan masyarakat sulawesi tengah
Upacara kehamilan masyarakat sulawesi tengah
 
Cu Isus am numai amintiri frumoase!
Cu Isus am numai amintiri frumoase!Cu Isus am numai amintiri frumoase!
Cu Isus am numai amintiri frumoase!
 
Εικονόλεξο: Το Καρότο και η Αγκινάρα
Εικονόλεξο: Το Καρότο και η ΑγκινάραΕικονόλεξο: Το Καρότο και η Αγκινάρα
Εικονόλεξο: Το Καρότο και η Αγκινάρα
 
Fyrirlestur um LinkedIn hjá VR 5.3.2015
Fyrirlestur um LinkedIn hjá VR 5.3.2015Fyrirlestur um LinkedIn hjá VR 5.3.2015
Fyrirlestur um LinkedIn hjá VR 5.3.2015
 
Simon Menner photographer website report
Simon Menner photographer website reportSimon Menner photographer website report
Simon Menner photographer website report
 
Terapi Cairan pada Anak
Terapi Cairan pada AnakTerapi Cairan pada Anak
Terapi Cairan pada Anak
 
Pe altarul revolutiei , autor Corneliu Leu
Pe altarul revolutiei , autor Corneliu LeuPe altarul revolutiei , autor Corneliu Leu
Pe altarul revolutiei , autor Corneliu Leu
 
Presentasi daur oksigen
Presentasi daur oksigenPresentasi daur oksigen
Presentasi daur oksigen
 

Similar to Be36338341

absorption, Cu2+ : glass, emission, excitation, XRD
absorption, Cu2+ : glass, emission, excitation, XRDabsorption, Cu2+ : glass, emission, excitation, XRD
absorption, Cu2+ : glass, emission, excitation, XRD
IJERA Editor
 
Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...
Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...
Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...
Koteswar Rao Jerripothula
 
Video indexing using shot boundary detection approach and search tracks
Video indexing using shot boundary detection approach and search tracksVideo indexing using shot boundary detection approach and search tracks
Video indexing using shot boundary detection approach and search tracks
IAEME Publication
 
Propose shot boundary detection methods by using visual hybrid features
Propose shot boundary detection methods by using visual  hybrid featuresPropose shot boundary detection methods by using visual  hybrid features
Propose shot boundary detection methods by using visual hybrid features
IJECEIAES
 

Similar to Be36338341 (20)

A Study on Algorithms for Block Motion Estimation in Video Coding
A Study on Algorithms for Block Motion Estimation in Video CodingA Study on Algorithms for Block Motion Estimation in Video Coding
A Study on Algorithms for Block Motion Estimation in Video Coding
 
Optimization of Macro Block Size for Adaptive Rood Pattern Search Block Match...
Optimization of Macro Block Size for Adaptive Rood Pattern Search Block Match...Optimization of Macro Block Size for Adaptive Rood Pattern Search Block Match...
Optimization of Macro Block Size for Adaptive Rood Pattern Search Block Match...
 
An Effective Implementation of Configurable Motion Estimation Architecture fo...
An Effective Implementation of Configurable Motion Estimation Architecture fo...An Effective Implementation of Configurable Motion Estimation Architecture fo...
An Effective Implementation of Configurable Motion Estimation Architecture fo...
 
A Hardware Model to Measure Motion Estimation with Bit Plane Matching Algorithm
A Hardware Model to Measure Motion Estimation with Bit Plane Matching AlgorithmA Hardware Model to Measure Motion Estimation with Bit Plane Matching Algorithm
A Hardware Model to Measure Motion Estimation with Bit Plane Matching Algorithm
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
absorption, Cu2+ : glass, emission, excitation, XRD
absorption, Cu2+ : glass, emission, excitation, XRDabsorption, Cu2+ : glass, emission, excitation, XRD
absorption, Cu2+ : glass, emission, excitation, XRD
 
Improved Error Detection and Data Recovery Architecture for Motion Estimation...
Improved Error Detection and Data Recovery Architecture for Motion Estimation...Improved Error Detection and Data Recovery Architecture for Motion Estimation...
Improved Error Detection and Data Recovery Architecture for Motion Estimation...
 
Gg3311121115
Gg3311121115Gg3311121115
Gg3311121115
 
Fast Motion Estimation for Quad-Tree Based Video Coder Using Normalized Cross...
Fast Motion Estimation for Quad-Tree Based Video Coder Using Normalized Cross...Fast Motion Estimation for Quad-Tree Based Video Coder Using Normalized Cross...
Fast Motion Estimation for Quad-Tree Based Video Coder Using Normalized Cross...
 
557 480-486
557 480-486557 480-486
557 480-486
 
High Performance Architecture for Full Search Block matching Algorithm
High Performance Architecture for Full Search Block matching AlgorithmHigh Performance Architecture for Full Search Block matching Algorithm
High Performance Architecture for Full Search Block matching Algorithm
 
Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...
Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...
Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...
 
SVM Based Saliency Map Technique for Reducing Time Complexity in HEVC
SVM Based Saliency Map Technique for Reducing Time Complexity in HEVCSVM Based Saliency Map Technique for Reducing Time Complexity in HEVC
SVM Based Saliency Map Technique for Reducing Time Complexity in HEVC
 
Video indexing using shot boundary detection approach and search tracks
Video indexing using shot boundary detection approach and search tracksVideo indexing using shot boundary detection approach and search tracks
Video indexing using shot boundary detection approach and search tracks
 
Motion detection in compressed video using macroblock classification
Motion detection in compressed video using macroblock classificationMotion detection in compressed video using macroblock classification
Motion detection in compressed video using macroblock classification
 
Embedded Implementations of Real Time Video Stabilization Mechanisms A Compre...
Embedded Implementations of Real Time Video Stabilization Mechanisms A Compre...Embedded Implementations of Real Time Video Stabilization Mechanisms A Compre...
Embedded Implementations of Real Time Video Stabilization Mechanisms A Compre...
 
Effective Compression of Digital Video
Effective Compression of Digital VideoEffective Compression of Digital Video
Effective Compression of Digital Video
 
Propose shot boundary detection methods by using visual hybrid features
Propose shot boundary detection methods by using visual  hybrid featuresPropose shot boundary detection methods by using visual  hybrid features
Propose shot boundary detection methods by using visual hybrid features
 
A Study of Image Compression Methods
A Study of Image Compression MethodsA Study of Image Compression Methods
A Study of Image Compression Methods
 
Key Frame Extraction for Salient Activity Recognition
Key Frame Extraction for Salient Activity RecognitionKey Frame Extraction for Salient Activity Recognition
Key Frame Extraction for Salient Activity Recognition
 

Recently uploaded

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Recently uploaded (20)

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 

Be36338341

  • 1. K. Shuma Roshini et al Int. Journal of Engineering Research and Application ISSN : 2248-9622, Vol. 3, Issue 6, Nov-Dec 2013, pp.338-341 RESEARCH ARTICLE www.ijera.com OPEN ACCESS Implementation of Diamond Search Algorithm Using Parallel Processing Architecture K. Shuma Roshini1, M. Tejaswi2, 1 2 M.Tech in Embedded Systems, Gudlavalleru Engineering College, Gudlavalleru, A.P.INDIA. Assistant Professor in ECE Department, Gudlavalleru Engineering College, Gudlavalleru, A.P.INDIA. Abstract In video communication whole content of video cannot be stored without processing. So there is a need to compress the video before transmission and storage this process is called as video compression. Video compression plays an important role with regard to real-time scouting/video conferencing applications. Regarding the entire motion based video compression process, movement estimation is the most computationally expensive and time consuming process. Motion estimation is the key element in video compression. The Motion Estimation is a process which determines motion between two or more frames and finds best possible macro block. There are several algorithms on block matching to name a few, Full Search Motion estimation [FS], Three Step Search Motion Estimation [TSS], New Three Step Search Motion Estimation [NTSS], Four Step Search Motion Estimation [FSS], Diamond Search Motion Estimation [DS].Instead of trying to further reduce computational complexity of these algorithms it is better to implement these algorithms on parallel processing architecture. In this paper Diamond Search Algorithm is implementation on CPU and GPU. Key Words: Motion Estimation, video compression, DS, GPU,CUDA. I. INTRODUCTION A raw video requires 2 to 3 GB of memory for one minute depending on resolution of the frame, it is impractical to store and transfer this much amount of video, for this purpose video compression is used. By using the technique of motion estimation video can be encoded, motion estimation is the process which determines motion between two or more frames and it is used to find best possible macro block. Video sequences consists of high level of redundancy between consecutive frame it means changes between frames are very less. In temporal redundancy the reduction of redundancy involves encoding of a first reference frame and the current frame, while the current frame encodes only the difference from the reference frame so this require a lot of computational complexity. In order to reduce computational complexity of ME [1] algorithms, a number of Block Matching Algorithms (BMA)[2] came into existences. At some point huge Computational complexity for all the algorithms is going to increase. So, Diamond Search algorithm is implemented on GPU to reduce compression time of a video. A Graphics Processing Unit (GPU)[3] is a electronic circuit designed to rapidly manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display. The exhaustive motion search is implemented on GPU, instead of other fast ME algorithms, because of its regular memory access pattern. Although other fast ME algorithm can certainly be implemented, need an additional layer of texture to specify the target www.ijera.com searching position and this results in random memory access pattern and dependent texture read. The repercussion of these are quite significant in modern graphics architecture. Motion Estimation Mainly in video editing motion estimation [4] is a type of video compression scheme. The motion estimation process is done by the coder to find the motion vector pointing to the best prediction macro block in a reference frame. For compression redundancy between adjacent frames can be exploited where a frame is selected as a reference and subsequent frames are predicted from the reference using motion estimation. The motion estimation process analyzes previous or future frames to identify blocks that have not changed, and motion vectors are stored in place of blocks. Figure 1: Comparing two Frames Figure 1 shows an example of a frame with 2 stick figures and a tree. The second half of this figure is an example of a possible next frame, where panning has resulted in the tree moving down and to the right, 338 | P a g e
  • 2. K. Shuma Roshini et al Int. Journal of Engineering Research and Application ISSN : 2248-9622, Vol. 3, Issue 6, Nov-Dec 2013, pp.338-341 and the figures have moved farther to the right because of their own movement outside of the panning. The problem for motion estimation to solve is how to adequately represent the changes, or differences, between these two video frames. It will probably worth mentioning that although motion estimation is also employed in many other disciplines an example would be computer vision, target tracking, and industrial monitoring, the techniques developed particularly for image coding a variety of in certain respects. The intention of image compression would be to lessen the total transmission bit rate for reconstructing images for the receiver. Hence, the motion information should occupy simply a small amount of the transmission bandwidth additional onto the picture contents information. So long as the motion parameters obtain can effectively reduce the total bit rate, these parameters need not function as true motion parameters. Besides, the reconstructed images for the receiving end are sometimes distorted. Therefore, in case the reconstructed images are employed for estimating motion information, an extremely strong noise component must not be ignored. The motion estimation problem, involves two related sub-problems: i) identify the moving object boundaries, so-called motion segmentation, and ii) estimate the motion parameters of each one moving object, so-called motion estimation in strict sense. Within our use, a moving object is basically a team of contiguous pels that share the same variety of motion parameters. This definition does not necessarily equal the ordinary meaning of object. For instance, within the videophone scene, the still background may include wall, bookshelf, decorations, etc. Given that these merchandise are stationary (sharing the same zero motion vector), they could be considered as one single object in the case of motion estimation. An appealing point in studying the motion estimation techniques for standards is the fact that the current video standards specify for only the decoders. Assuming motion displacement information (motion vectors) are generated through encoder and transmitted to the decoder, the decoder only performs the motion compensation operation—patch the to-bereconstructed picture making use of the known (already decoded) picture(s). The encoder, which performs the motion estimation operation, is not really explicitly specified among the standards. Hence, it is more than possible use different motion estimation techniques to produce standards-compatible motion information. At the decoder, excluding the resources of coded pictures which can be used for motion compensation, fundamentally the same block-based motion compensation operation is designed by most of the popular video standards, H.261, H.263, MPEG-1, and MPEG-2[5][6]. II. LITERATURE SURVEY As motion estimation happens to be the most www.ijera.com www.ijera.com compute intensive operate in an H.264 encoder, extensive researches most certainly been completed to accelerate the motion estimation operate in GPU. Since full search algorithm has more parallelism compared to any search algorithm, it's usually used to parallelize motion estimation in GPU [7]. But can often obtain huge performance gain in observation to the JM encoder, it is not clear if their result is yikes sufficient to be applied practically. In the other search algorithms, motion vector prediction (MVP) is made to increase the compression ratio. In MVP, the initial motion vector (MV) of the current macro block (MB) is predicted beginning with the MVs of one's neighbor macro blocks, assuming that they have been already encoded in a sequential raster order. And so the motion vector associated with a block cannot be predicted if MB encoding is performed in parallel. Since MVP limits parallelism, several approaches are proposed to extend the parallelism. If simply don't MVP, it can be observed that the coding efficiency is significantly degraded, resulting in a larger output size. Kung has proposed a block-based ME algorithm which uses 4x4 blocks as a substitute for 16x16 blocks. For being frame is separated into finer blocks, the total number of blocks that could be executed in parallel is increased. Since dynamic behavior like early termination introduces poor performance in GPU resulting from load imbalance, they proposed a fresh search algorithm based on three step search algorithm which has no early termination technique. Schwalb et al. have proposed a different approach. They ignored MVP to completely exploit macro block (MB) level parallelism. Instead, they use other predictors provided by Forward Dominant Vector Selection (FDVS) and Split and Merge techniques. In FDVS, it predicts the motion vector of n 1th frame by reusing the motion vectors of n previous frames. In Split and Merge, it exploits correlation of movement vectors between variable block sizes. Their result separated they will be able to achieve competitive quality and coding efficiency in comparison with UMHexagons Search algorithm even if they use Diamond Search algorithm [8]. These techniques can easily be applicable to our approach that if use multiple reference frames. Currently, only consider one reference frame. Search algorithm that proposed is the same as our algorithm in making use of a diamond pattern for search points. While their algorithm considers only a 3x3 square pattern, Here assume holistic diamond pattern (nxn) and of course the value of n that optimizes the performance within a target GPU is about through measurements. Within this particular approach, work with early termination techniques and prevent the load imbalance by carefully thinking about the underlying GPU architecture. To fully exploit MB level parallelism, here do not consider MVP that poses dependencies among the sequence of MBs encoding. To bypass the lack of MVP, Propose a new search algorithm fitted to the GPU architecture, where more computations are 339 | P a g e
  • 3. K. Shuma Roshini et al Int. Journal of Engineering Research and Application ISSN : 2248-9622, Vol. 3, Issue 6, Nov-Dec 2013, pp.338-341 performed yet in along-side maintain as much coding efficiency as MVP would supply. III. PROPOSED SYSTEM Compute Unified Device Architecture (CUDA) CUDA[9][10] serves as a parallel programming framework for utilizing GPU for general computing. Typical GPUs consist of hundreds of processing cores very effective at achieving immense parallel computing performance.Today all of NVIDIA's GPUs are CUDA GPUs. CUDA is not computer architecture in the sense of a definition of an instruction set and a set of architectural registers; binaries compiled for one CUDA GPU do not necessarily run on all CUDA GPUs. More specifically, NVIDIA defines different CUDA compute capabilities to describe the features supported by CUDA hardware. The first CUDA GPUs had compute capability 1.0. In 2011 NVIDIA released GPUs with compute capability 2.1, which is known as Fermi" architecture. CUDA is based on the SIMD (Single Instruction Multiple Data) architecture and suited to exploiting various stages of data parallelism. A CUDA GPU consists of multiple so-called streaming multiprocessors (SMs).The threads executing a GPU program, a so-called kernel, are grouped in blocks. Threads belonging to one block all run on the same multiprocessor but one multiprocessor can run multiple blocks concurrently. Blocks are further divided into groups of 32 threads called warps; the threads belonging to one warp are executed in lock step, i.e., they are synchronized. CUDA is extension to C/C++ languages and allows programmers to write parallel programs for GPU. Graphic Processing Unit is a computer chip that performs rapid mathematical calculations, primarily for rendering purpose. NVIDIA’s GT GeForce 610 GPU has been chosen for implementing diamond search algorithm. This GPU consist of 48 processor cores where total work is allocated among all these processors. GPU is connected in a CPU using PCI slot of DDR3 memory type. It is also provided with boost clock where GPU can run at higher speed depending upon number of input factors that are used to determine whether it is a good idea to run at higher clock or not. Windows visual studio 2010 is used which is a Integrated development Environment (IDE) provides language services for all programming languages with code editor and debugger. For debugging purpose NVCC compiler is used which separate source codes and device code. NVCC generates both instructions for host and GPU, as well as instructions to send data back and forward between them. Device functions are processed by NVIDIA compiler and Host functions are processed by host compiler. Specifications of GT GeForce 610 GPU Engine Specs: • 48CUDA Cores www.ijera.com www.ijera.com • 810Base Clock • 1620Boost Clock • 6.5Texture Fill Rate (billion/sec) Memory Specs: • 1.8 Gbps Memory Clock • 1024MBStandard Memory Configuration • DDR3Memory Interface • 64-bitMemory Interface Width • 14.4Memory Bandwidth (GB/sec) Feature Support: • 4.2OpenGL • PCI Express 2.0Bus Support • Certified for Windows 7 • DirectX 11, CUDA, PhysXSupported Technologies Diamond Search Motion Estimation The diamond search algorithm[8][11] has two search patterns first pattern consist of large diamond search pattern which consists of nine points from which eight points surrounds the center one to form a diamond shape which is called as Large Diamond Search Point (LDSP). Second pattern consist of five search points which forms smaller diamond and it is called as Small Diamond Search Point (SDSP). In searching procedure of the DS algorithm, first LDSP is searched repeatedly until minimum block distortion (MBD) occurs at the center point. If MBD occurs at the center point then search pattern is switched from LDSP to SDSP and search completes. Points yield from this search represents the motion vectors of best matching block. Summary of DS algorithm: Step 1: Initially LDSP is centered at the origin of the search window, and the 9 checking points of LDSP are tested. If MBD is located at the center then go to step3; otherwise step2. Step 2: MBD point found in the previous search step is re-positioned as the center point to form a new LDSP. If the new MBD point obtained is located at the center position, go to step3; otherwise, recursively repeat this step. Step3: Now switch the search pattern from LDSP to SDSP. The MBD point found in this step is the final solution of the motion vectors which points to the best matching block. 340 | P a g e
  • 4. K. Shuma Roshini et al Int. Journal of Engineering Research and Application ISSN : 2248-9622, Vol. 3, Issue 6, Nov-Dec 2013, pp.338-341 www.ijera.com I am glad to express my deep sense of gratitude to Dr. M. Kamaraju, HOD of the ECE Department and entire staff of ECE Department, Gudlavalleru Engineering College, Gudlavalleru for their great support and encouragement in completion of this project. REFERENCES [1] Figure 3: Light color dots represents the LDSP in this case lets us assume that center point contains least MBD then switch pattern shifts from LDSP to SDSP, thick color dots represents SDSP center point is final MBD which is best possible motion vector. IV. [2] RESULTS GPU experimental results are compared with existing results which are taken from [12] in C environments with different configurations. Table 1: Comparison of run times on CPU and GPU [3] [4] [5] [6] Simulation experimental video is given to GPU for implementing Diamond Search Algorithm for encoding purpose. In the above table run time of different videos are compared with run time of CPU. Encoding process of video occur faster in GPU. Table 2 shows by how much factor video can be compressed a video using this algorithm. [7] [8] Table 2: Compression Factor of Videos [9] [10] [11] V. CONCLUSION In this paper Diamond Search Algorithm is implemented on parallel processing architecture for NVIDIA GPU. After observing all the communication overhead between host and device this proposed algorithm takes less run time compared to that of CPU with a 4x speed by only parallelizing motion estimation block in the encoding side. VI. www.ijera.com [12] Ahamdi, M. M. Azadfar “implementation of fast motion estimation algorithms & comparsion with full search method in H.264”, IJCSNS international journal of computer science & Network security, vol 8, No.3, pp139-143, March 2008 S. Immanuel Alex Pandian, Dr.G.josemin BalaBecky Alma George, / A Study on Block Matching Algorithms for Motion Estimation International Journal on Computer Science and Engineering (IJCSE) ISSN : 0975-3397, Vol. 3 No. 1 Jan 2011. John Nickolls, William j.Dally, “ The GPU Computing Era,” IEEE Annals of the History of Computing Publication(2010), pp.56-69, ISBN: 0272 -1732. R. Srinivasan and K. R. Rao, “predictive coding based on efficient motion estimation,” IEEE Trans. Commun., vol. COMM-33,pp.888-896, Aug. 1985. K.R.Rao and j.j Hwang, techniques and standards for image, video and audio coding. Englewood Cliffs, NJ:prentice Hall,1996 “MPEG-4 video verification model, ver. 14.0,” ISO/IECJTC1/SC29/WG11/N29232, Oct. 1999. Zhou Jing, Jiao Liang bao, Cao Xuehong, “ implementation of parallel full search algorithm for motion estimation on multi core Processors,” (ICNIT) pp 31-35, ISBN 978-14577-0266-2, 2011 D.v.manjunatha and Dr.sainarayanan “Comparsion and implementation of fast block matching motion estimation algorithms for video compression”,ISSN:0975- 5462, Vol.3 No.10 October 2011 . W.Chen, H.Hang, “H.264/AVC motion estimation implementation on compute Unified Device Architecture(CUDA)”, ICME 2008. Youngsub ko, youngmin yi and soonhoi ha, “an 9efficient parallel motion estimation algorithm and x264 parallelization in CUDA”, ISBN: 9781-4577-0620-2,page.no 91-98, IEEE,(2011). S.Zhu, K.ma, “ A New Diamond search Algorithm for Fast Block-matching Motion Estimation”, IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL.9, NO.2, Feb.2000. D .rukmanidevi, p. rangarajan and j.raja paul, “A Novel search algorithm for variable block size motion estimation of H.264/AVC” ISSN: 1450-216X, vol 81 no 2(2012), pp. 160-167. ACKNOWLEDGMENT 341 | P a g e