Peer to Peer (P2P) architecture to solve the mentioned problems. Due to the inherent characteristics of P2P and the Hadoop file system, the proposed architecture can naturally solve the problems mentioned above. In this paper we present the architecture, its components and operation flow, and discuss the implementation issues. We believe that the architecture can significantly improve scalability, reliability, robustness, efficiency, and cost.
The remainder of the paper is organized as follows. Section 2 surveys related work, especially on Hadoop, including its architecture and HDFS, and some work related to surveillance systems. Section 3 describes the proposed architecture, including Video Recording and Video Monitoring, and explains the operation flow. Section 4 describes implementation issues. Finally, we give brief concluding remarks and future work in Section 5.
II. BACKGROUND AND RELATED WORK
Hadoop
Apache Hadoop [3][4] is a framework that allows distributed processing of large data sets across clusters of computers using a simple programming model. Hadoop is built from two major parts: MapReduce and the Hadoop Distributed File System (HDFS) [5].
HDFS provides global access to files in the cluster and is implemented by two kinds of nodes: the NameNode and the DataNode. The NameNode manages all metadata of the file system, while DataNodes store the actual data. HDFS uses commodity hardware to distribute the system load and is characterized by fault tolerance, scalability, and expandability. The NameNode is responsible for maintaining the HDFS directory tree and is a centralized service. It executes the basic namespace operations, such as renaming, opening, and closing files, and also maintains the mapping from each block to the DataNodes that hold it. The DataNodes access the file data directly and serve read and write requests from clients.
HDFS is designed to store and maintain large files. Each file is split into one or more blocks, and those blocks are stored on a set of DataNodes [6]. In other words, each file is stored as a sequence of blocks, all of the same size except possibly the last one. For fault tolerance, each block of a file is replicated; by default, two copies of each block are stored on different DataNodes in the same rack and a third is stored on a DataNode in a different rack. The NameNode decides the location of the replicas; the placement is chosen to improve reliability and performance.
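To make the block-splitting and rack-aware placement concrete, the following minimal Python sketch splits a file into fixed-size blocks and assigns each block three replica locations. The helper names, block size, and two-rack cluster are our own illustration of the policy described above, not HDFS source code.

import random

BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB, a common HDFS block-size default

def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Yield (block_index, block_length); only the last block may be shorter."""
    index, remaining = 0, file_size
    while remaining > 0:
        length = min(block_size, remaining)
        yield index, length
        remaining -= length
        index += 1

def place_replicas(data_nodes_by_rack, local_rack):
    """Pick three DataNodes for one block: two in the local rack and one in
    another rack, following the simplified policy described in the text."""
    same_rack = random.sample(data_nodes_by_rack[local_rack], 2)
    other_racks = [r for r in data_nodes_by_rack if r != local_rack]
    remote_rack = random.choice(other_racks)
    remote = random.choice(data_nodes_by_rack[remote_rack])
    return same_rack + [remote]

# Example with a hypothetical 200 MB file on a two-rack cluster.
cluster = {"rack1": ["dn1", "dn2", "dn3"], "rack2": ["dn4", "dn5"]}
for index, length in split_into_blocks(200 * 1024 * 1024):
    print(index, length, place_replicas(cluster, "rack1"))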
Surveillance System
In [14], the authors presented a layered generic organizational suite for the development of a Sensor Management Framework (SMF) based on the service-oriented architecture. The sensor management system is studied from a layered perspective, and the functional tasks carried out by the SMF are divided into three categories: sensor management, network management, and system management.
In [15], the authors proposed the design of a large-scale video surveillance system client based on P2P streaming. Given the particularities of video surveillance, such as high churn, the heterogeneity of user access networks, and delay tolerance, the authors constructed the end-hosts into a mesh application-layer topology and adopted a pull-push mode to deliver the data. They also designed the architecture of PPClient, gave details of its components, and showed its feasibility through simulation and a real system test. PPClient overcomes the traditional server/client mode, which needs a large amount of infrastructure to support tens of thousands of users. By using P2P streaming technology, users act as consumers and at the same time provide service to other users, which enhances the system's scalability.
In [10], the authors proposed an overlay network architecture using application-layer multicast with a load-balancing scheme to effectively provide a ubiquitous video surveillance service. The approach can provide Internet users with scalable and stable surveillance video streaming; their simulations demonstrate that optimizing load balancing via both average bandwidth and lifetime outperforms the other optimization criteria in control-message overhead, service disruption, and tree depth.
In this paper, we apply the data placement concept of the Hadoop file system to provide fault-tolerant and efficient video access, and we apply P2P technology to improve scalability, reliability, and robustness while reducing server cost. Integrating the Hadoop concept with P2P technology can therefore solve many issues of surveillance systems.
III. SYSTEM DESIGN
This section describes the design philosophy of the proposed architecture and explains how it can solve the problems mentioned in Section 1.
System Architecture
We exploit the Hadoop concept to design our system. The proposed system has two kinds of nodes. One is the Directory Node (DN), which is responsible for managing all FEs but does not keep the video data itself. The whole system can have multiple DNs; to simplify the architecture and its explanation, we assume the system has only one DN and that it is deployed in the administration department. The other is the Peer Node (PN), which is responsible for storing the video data
using P2P technology. In the Hadoop concept, a piece of data generally has three replicas. Therefore, each FE's video data in our architecture is stored on three PNs: one is called the Primary PN (P-PN), which is the node the FE is attached to, and the other two are called Secondary PNs (S-PNs), which are selected by the DN using a distributed hash table (DHT), as shown in Figure 2. In other words, the video data of the FE on the P-PN is replicated to two S-PNs that are selected and managed by the DN. The three nodes naturally form a replication group (RG).
The concept of an RG is illustrated in Figure 2. The video data of a PN (the P-PN) is replicated to two other PNs (the S-PNs), which are selected by the DN using P2P technology.
Figure 2: Replication Group
Components and Functionality
As is well known, a surveillance system can be divided into two operation modes: Video Recording and Video Monitoring. We describe the components and their functionality according to these two modes.
Video Recording is responsible for handling video data storage. The architecture is shown in Figure 3.
• Directory Node (DN): This node provides the centralized directory services. It contains the following components: an Authentication Module (AM), a Replica Manager (RM), a Replica Scheduler (RS), and a DN Database holding the directory of the whole system. In the DN, the RM manages each PN and, through the RS, selects proper replicas for each PN's FE. Every PN can communicate directly with the RM of the DN. The AM is responsible for authenticating any PN as well as assigning authentication keys to both the P-PN and its associated S-PNs. The RS searches for and selects two proper S-PNs for a P-PN using P2P technology. The DN Database keeps the information of every peer node.
• Primary Peer Node (P-PN): The Primary Peer Node gets video data directly from its FE. Therefore, it consists of an FE (camera or CCD), persistent storage, and three extra components: a Video Dispatcher, a Replica Manager, and an Authentication Module. The Video Dispatcher is responsible for transmitting video data to, and storing it within, its RG. In other words, the Video Dispatcher stores the captured video in local storage and delivers it to the two replicas (S-PNs); the operation is similar to the write operation of the Hadoop file system. The RM communicates with the RM of the DN and with other PNs for authentication and for obtaining the information associated with the S-PNs; for example, it gets the locations of the S-PNs from the DN and tries to build communication channels with them. The AM stores all keys obtained from the DN and the private information of its associated RGs.
Figure 3: System Architecture
• Secondary Peer Node (S-PN): Undoubtedly, an S-PN of one RG is also the P-PN of another RG. Therefore, all components and their functionality are identical to those of the P-PN, except for the operation flows, which are explained in the next subsection. A structural sketch of these nodes follows this list.
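As a structural summary of the components above, the following Python sketch shows one possible way to model the DN and a PN; the class and field names are our own shorthand for the modules described in the list, not an actual implementation.

from dataclasses import dataclass, field

@dataclass
class DirectoryNode:
    """Centralized directory service: holds the DN Database and AM state."""
    pn_directory: dict = field(default_factory=dict)   # DN Database: node ID -> PN info
    issued_keys: dict = field(default_factory=dict)    # AM: keys handed out to RGs

@dataclass
class PeerNode:
    """A PN is the P-PN of its own FE and may also be an S-PN of other RGs."""
    node_id: str
    local_storage: list = field(default_factory=list)  # persistent storage for video blocks
    replica_groups: dict = field(default_factory=dict) # GID -> node IDs of the RG
    auth_keys: dict = field(default_factory=dict)       # keys obtained from the DN (AM)

    def dispatch(self, video_block, s_pn_ids):
        """Video Dispatcher: keep a local copy and produce one message per S-PN."""
        self.local_storage.append(video_block)
        return [(s_pn, video_block) for s_pn in s_pn_ids]

# Example: a P-PN keeps a block locally and forwards it to its two S-PNs.
p_pn = PeerNode(node_id="pn-17")
print(p_pn.dispatch(b"frame-data", ["pn-08", "pn-21"]))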
This design has the following advantages. First, scalability is improved because the data center is no longer a bottleneck: all video data of the system are stored across the PNs, and the bandwidth required of the data center is significantly reduced. Second, reliability is enhanced by the replication mechanism used in the design (every piece of video data is stored in an RG: one P-PN and two S-PNs). When the P-PN or a replicating S-PN fails, the DN can select another PN to take over for the failed node. Third, the cost of the data center can also be reduced; although this increases the cost of an individual PN, we consider that trivial compared with an expensive data center. Besides, the robustness of the system can be guaranteed because a single PN failure does not affect the service of the system.
The other operation mode, Video Monitoring, is designed for clients to retrieve video data. The operation is initiated by a client, namely Client Monitoring (CM), which is assumed to be a special PN with the same components: Video Dispatcher, Replica Manager, and Authentication Module. These components can run on a desktop PC. When the CM wants to retrieve a desired video, it can retrieve it from all replicas in parallel, in a P2P-like manner as in [12]. Such a design retrieves video more efficiently than from a centralized data center.
In summary, the proposed architecture can naturally solve the problems mentioned in Section 1.
Operation Flow
This section presents the operation flow for video recording and monitoring. Video recording is presented first; the details are shown in Figure 4.
In general, an activated PN needs to register itself with the DN for future management. The registration message is sent by its RM to the RM of the DN. The message includes some private data about the node, such as the node ID, address, hardware information, and authentication key, and it is stored in the AM of the DN. When the registration is completed, the DN replies to the PN. The PN can then send a request to the DN to find suitable S-PNs for delivering replicated video data.
Figure 4: Operation Flow of Video Recording
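A minimal sketch of this registration exchange, assuming a simple dictionary message format; the field and function names mirror the private data listed above but are otherwise our own.

import uuid

def build_registration_message(node_id, address, hardware_info, auth_key):
    """PN side: the RM packs the node's private data for the DN."""
    return {
        "type": "REGISTER",
        "node_id": node_id,
        "address": address,
        "hardware": hardware_info,
        "auth_key": auth_key,
    }

def handle_registration(dn_database, message):
    """DN side: store the node data (kept by the AM in the design) and reply."""
    dn_database[message["node_id"]] = message
    return {"type": "REGISTER_ACK", "node_id": message["node_id"]}

# Example: a newly activated PN registers, then may request suitable S-PNs.
dn_db = {}
request = build_registration_message("pn-17", "10.0.0.17:9000", {"cpu": "ARM"}, uuid.uuid4().hex)
print(handle_registration(dn_db, request))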
The DN checks the states of all PNs to find ones available for replicating video data. The peer node state contains the peer node's replica state, its bandwidth, its connection stability, and its peer node group. If enough available PNs can be found, the DN selects suitable S-PNs using a DHT function. The DN then generates an authentication key and sends it to the chosen S-PNs, to be used for this request and for future request messages from the P-PN (the PN that initiated the request).
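The exact DHT function is not fixed here; as one possibility, the sketch below picks the two available PNs whose hashed IDs are closest to the P-PN's ID under a Kademlia-style XOR metric (cf. [13]) and issues a fresh authentication key. All names in the sketch are assumptions.

import hashlib
import secrets

def node_key(node_id):
    """Hash a node ID into a 160-bit integer key."""
    return int(hashlib.sha1(node_id.encode()).hexdigest(), 16)

def select_s_pns_dht(p_pn_id, available_pns, count=2):
    """Pick the 'count' available PNs closest to the P-PN under the XOR metric."""
    target = node_key(p_pn_id)
    candidates = [pn for pn in available_pns if pn != p_pn_id]
    candidates.sort(key=lambda pn: node_key(pn) ^ target)
    return candidates[:count]

def issue_auth_key():
    """Generate the authentication key sent to the chosen S-PNs and the P-PN."""
    return secrets.token_hex(16)

# Example: the DN selects two S-PNs for P-PN "pn-17" and issues a key.
available = ["pn-03", "pn-08", "pn-21", "pn-42"]
print(select_s_pns_dht("pn-17", available), issue_auth_key())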
Next, the DN replies to the P-PN with the node information of the S-PNs, so that the P-PN can create communication channels to the S-PNs using this information and the authentication key. Finally, the P-PN can store the video data in local storage and deliver it to the S-PNs.
In the design, we hope that the client can be any device, located anywhere. Therefore, a client needs to obtain an authentication key for each video monitoring request. The following describes the steps of video monitoring; the details are shown in Figure 5.
Figure 5: Operation Flow of Video Monitoring
When a client wants to access his or her own video data, the client sends a request with an account and password to the DN. The DN checks the request's legality by consulting the AM. If the client is authorized, the DN first replies to the client and then sends an authentication key for accessing the video data from the RG. In the meantime, the RM of the DN obtains the authentication key from the AM together with the states of all PNs in the RG. After the client receives the authentication key, it can use it to create communication channels to all nodes of the RG. If the request is authorized by all nodes, the client can get the desired video data from all nodes (three replicas) of the RG in parallel. Obviously, this design can improve the performance of video access.
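A minimal sketch of this monitoring handshake follows, with invented message formats and a stand-in RG node class; in a real deployment these calls would cross the network between the client, the DN, and the PNs of the RG.

import secrets

def authorize_client(dn_accounts, account, password):
    """DN side: verify credentials via the AM and return a key if the client is legal."""
    if dn_accounts.get(account) != password:
        return None
    return secrets.token_hex(16)

class RGNodeStub:
    """Stand-in for a PN of the RG that accepts any non-empty key (illustration only)."""
    def __init__(self, name):
        self.name = name
    def accepts(self, auth_key):
        return bool(auth_key)

def open_rg_channels(rg_nodes, auth_key):
    """Client side: present the key to every node of the RG and keep accepted channels."""
    return [node.name for node in rg_nodes if node.accepts(auth_key)]

# Example flow: authenticate at the DN, then open channels to the three replicas.
key = authorize_client({"alice": "secret"}, "alice", "secret")
print(open_rg_channels([RGNodeStub("p-pn"), RGNodeStub("s-pn1"), RGNodeStub("s-pn2")], key))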
Of course, the client could request equally sized pieces of video data from each replica. However, due to the bandwidth variance among PNs, each replica's bandwidth may differ. Therefore, we can apply an adaptive algorithm [12] to dynamically adjust the video data size requested from each replica according to criteria such as bandwidth, connection stability, distance, and the PN's computing capability.
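One simple way to realize such an adaptive split, sketched below, is to divide the requested byte range among the three replicas in proportion to their measured bandwidth; the function name and the proportional rule are our own illustration, not the exact algorithm of [12].

def allocate_ranges(total_bytes, replica_bandwidths):
    """Split [0, total_bytes) into per-replica byte ranges proportional to bandwidth."""
    total_bw = sum(replica_bandwidths.values())
    ranges, offset = {}, 0
    replicas = list(replica_bandwidths)
    for i, replica in enumerate(replicas):
        if i == len(replicas) - 1:
            share = total_bytes - offset                    # last replica takes the remainder
        else:
            share = total_bytes * replica_bandwidths[replica] // total_bw
        ranges[replica] = (offset, offset + share)
        offset += share
    return ranges

# Example: a 90 MB request split across a P-PN and two S-PNs with uneven bandwidth.
print(allocate_ranges(90 * 1024 * 1024, {"p-pn": 20, "s-pn1": 10, "s-pn2": 5}))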
IV. IMPLEMENTATION ISSUES
Based on the design, we discuss some implementation issues in this section. In general, a PN can be implemented as an embedded system running an embedded Linux OS.
Peer Node State
Here, we present the information and states maintained inside a PN, which are used for PN selection and video data access. The peer node state comprises the unique ID, the peer node group, the bandwidth, the peer node's replica state, the authentication key, and the authorized state, as follows (a data structure sketch follows the list):
• Unique ID (UID): Every PN has a unique ID in the system, and these IDs are recorded in the database of the DN.
• Group ID (GID) of Peer Node: This is the ID of an RG, which identifies the replication group. Peer Nodes that hold the same replica are grouped together and therefore share the same GID. If a PN participates in N replication groups, it maintains N GIDs, one per group.
• Bandwidth: Each peer node may have different bandwidth in different situations. It is used to calculate the size of video data to transmit when a client asks for its video.
• Peer Node's Replica State: This state indicates whether the PN is available for replicating another PN's video. If "Yes", the PN has room to do so.
• Authorized State: The peer node stores the authorized key received from the DN. Each key is mapped to a PN for further access, and the relationship with that peer node is also recorded. The DN checks this state to assess the stability of the connection between replicas of the same group.
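The sketch below gathers these fields into a single Python structure; the field names follow the list above, while the concrete types and defaults are assumptions.

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class PeerNodeState:
    uid: str                                               # Unique ID recorded in the DN database
    group_ids: List[str] = field(default_factory=list)     # one GID per replication group
    bandwidth_kbps: int = 0                                 # used to size transfers to clients
    replica_available: bool = True                          # "Yes" means room to host another replica
    auth_keys: Dict[str, str] = field(default_factory=dict) # authorized keys, keyed by peer UID

# Example: a PN that already participates in two replication groups.
state = PeerNodeState(uid="pn-17", group_ids=["rg-a", "rg-b"], bandwidth_kbps=8000)
print(state.replica_available, len(state.group_ids))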
The PN Scheduler in Video Recording
We utilize the same replication scheme as the Hadoop-like file system to store video data. When a PN registers itself with the DN, it is grouped together with other PNs (replicas) using our PN scheduling algorithm, which provides a lookup service based on the peer node's storage space state and bandwidth. In Hadoop, a file has three replicas; accordingly, we use this algorithm to select two S-PNs for a given P-PN.
Figure 6: PN scheduling algorithm
The algorithm of the Replica Scheduler is shown in Figure 6. The algorithm first executes unFullPNList() to obtain a PNList (Peer Node List) containing the registered PNs that are available for replicating other nodes' video data. It then sorts the list with ReplicaStateSort() according to the storage space of each PN; entries with the same storage space are sorted again with BandwidthSort() according to the bandwidth of each PN. In order to balance the load, we first try to pick an S-PN from the head of the PNList; if a candidate PN cannot serve the replication, we try the next peer node in the list.
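Since Figure 6 is not reproduced here, the following Python sketch restates the described steps (unFullPNList, ReplicaStateSort, BandwidthSort) over an assumed data layout; the sort directions are also assumptions, so this should be read as a paraphrase of the algorithm rather than its exact code.

def un_full_pn_list(all_pns):
    """Keep only registered PNs that still have room to host another replica."""
    return [pn for pn in all_pns if pn["replica_available"]]

def sort_candidates(pn_list):
    """Sort by free storage space, breaking ties by bandwidth (largest first is assumed)."""
    return sorted(pn_list, key=lambda pn: (pn["free_space"], pn["bandwidth"]), reverse=True)

def schedule_s_pns(all_pns, p_pn_id, count=2):
    """Walk the sorted candidate list and take the first 'count' usable S-PNs."""
    chosen = []
    for pn in sort_candidates(un_full_pn_list(all_pns)):
        if pn["uid"] != p_pn_id:
            chosen.append(pn["uid"])
        if len(chosen) == count:
            break
    return chosen

# Example: choose two S-PNs for P-PN "pn-1" from a small candidate pool.
pool = [
    {"uid": "pn-1", "replica_available": True, "free_space": 500, "bandwidth": 10},
    {"uid": "pn-2", "replica_available": True, "free_space": 800, "bandwidth": 20},
    {"uid": "pn-3", "replica_available": False, "free_space": 900, "bandwidth": 30},
    {"uid": "pn-4", "replica_available": True, "free_space": 800, "bandwidth": 5},
]
print(schedule_s_pns(pool, "pn-1"))   # -> ['pn-2', 'pn-4']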
TABLE 1: SYSTEM ANALYSIS
Different system designs have different advantages. We compare SMF [14], PPClient [15], UVS [10], and our P2PCloud in Table 1. In SMF it is hard to join a new sensor node into the system, but its robustness is strong. PPClient has higher scalability, but the video of each Media Distribution Subsystem is not replicated to guard against errors. UVS rotates the UVSMON tree when a new node joins or a node breaks, but such frequent revision of the system is not good for robustness. SMF and UVS use centralized storage, whereas PPClient and P2PCloud use distributed storage. P2PCloud has higher scalability, fault tolerance, and robustness, but lower bandwidth.
V. CONCLUSION AND FUTURE WORK
In this paper, we have proposed an architecture for video surveillance service by integrating P2P and Hadoop-like file system technology. P2P is used to connect the PNs and to store video data in replicas; it improves scalability, cost, and efficiency, while the Hadoop concept improves reliability and efficiency. We presented the system design, the components and their functionality, the operation flows, and implementation issues. According to our explanation and analysis, it is obvious that such a design can improve several issues, such as scalability, reliability, robustness, efficiency, and cost.
In the future, we plan to implement the system on various embedded platforms and to tune and evaluate its performance.
ACKNOWLEDGEMENT
This work was supported by the National Science Council of the Republic of China under Grant No. NSC 100-2221-E-305-013.
REFERENCES
[1] H. Dias, J. Rocha, P. Silva, C. Leao, and L.P. Reis, “Distributed
Surveillance System”, Proc. of the Portuguese Conf. on Artificial
Intelligence, Covilha, Portugal, 5-8 Dec. 2005
[2] X. Cao, Z. Wang, R. Hu, and J. Chen, “Distributed Video Surveillance System Based on Overlay Network,” in Proc. IEEE Future Generation Communication and Networking (FGCN’07), pp. 368-373, 6-8 Dec. 2007.
[3] Hadoop. http://hadoop.apache.org/, 2012.
[4] D. Borthakur. “The hadoop distributed file system: Architecture and
design”, Hadoop Project Website, 2007.
[5] HDFS (hadoop distributed file system) architecture.
http://hadoop.apache.org/common/docs/r1.0.2/#HDFS, 2012.
[6] J. Shafer, S. Rixner, and A. L. Cox. “The Hadoop Distributed
Filesystem: Balancing Portability and Performance”, In Proceedings
of the 2010 IEEE International Symposium on Performance
Analysis of Systems and Software (ISPASS’10), pp. 122–133, 2010.
[7] Michael Bramberger, Andreas Doblander, Arnold Maier, Bernhard
Rinner, Helmut Schwabach, “Distributed Embedded Smart
Cameras for Surveillance Applications”, Computer, vol. 39, no. 2,
pp. 68-75, Feb. 2006.
[8] L.F. Marcenaro et al.,“Distributed Architectures and Logical-Task
Decomposition in Multimedia Surveillance Systems,” Proc. IEEE,
pp.1419-1440, Oct. 2001.
[9] Y. Hongyun, H. Ruiming, and C. Jun, “Design and Implementation of Large-scale Distributed Video Surveillance System,” the Third International Conference on Computer Science & Education (ICCSE'2008), Kaifeng, China, May 20, 2008.
[10] Chia-Hui Wang, Haw-Yun Shin, Wu-Hsiao Hsu. “Load-sharing
overlay network design for ubiquitous video surveillance services”,
International Conference on Ultra Modern Telecommunications &
Workshops, 12-14 Oct. 2009, pp. 1-7.
[11] Whitman, D. “The need and capability of a Surveillance Data
Distribution System”, Integrated Communications, Navigation and
Surveillance Conference, 2009. ICNS '09, pp. 1 – 6, 13-15 May
2009.
[12] Yue-Shan Chang, Guo-Jie Zou, and Ching-lung Chang, “RARS: A Resource-Aware Replica Selection and co-allocation scheme for Mobile Grid,” International Journal of Ad Hoc and Ubiquitous Computing, Vol. 6, No. 2, 2010, pp. 99-113.
[13] P. Maymounkov and D. Mazieres, “Kademlia: A peer-to-peer
information system based on the xor metric”, In Proceedings of
IPTPS02, Cambridge, USA, Mar. 2002.
[14] A.R. Hilal, A. Khamis, and O. Basir, “A Service-Oriented Architecture Suite for Sensor Management in Distributed Surveillance Systems,” 2011 International Conference on Computer and Management (CAMAN), 2011, pp. 1 – 6.
[15] Xun Zhu, Hongtao Deng, Zheng Chen, and Hongyun Yang, “Design of Large-Scale Video Surveillance System Based on P2P Streaming,” 2011 3rd International Workshop on Intelligent Systems and Applications (ISA), 28-29 May 2011, pp. 1 – 4.