The document discusses the development of a NoSQL query processing system for wireless ad-hoc and sensor networks. It begins by reviewing existing SQL-based query processing systems like TinyDB and TikiriDB and noting their limitations for wireless sensor networks that lack consistent connectivity. The main objective is described as transforming the relational database model to a NoSQL model for better performance and scalability. The design of the NoSQL query processing system is then outlined, including components like NoSQL queries, a lexical analyzer and parser, query processor, data packets, mesh routing, and using the Redis NoSQL database architecture. Implementation details are also provided about generating NoSQL grammars, implementing data packets, and executing queries on sensor motes.
NoSQL Query Processing System for Wireless Ad-hoc and Sensor Networks
Here is an example of a TinyDB query:

SELECT COUNT(*) FROM sensors AS s, recentLight AS rl WHERE rl.nodeid = s.nodeid AND s.light < rl.light SAMPLE PERIOD 10s; [2]

B. TikiriDB
TikiriDB [3] is another well-known database abstraction for WASNs. It is the database abstraction layer for the Contiki operating system. Compared with TinyDB, TikiriDB supports shared WASNs.
TikiriDB also provides an SQL query interface called TikiriSQL to query the sensor network [3]. It is very similar to a conventional query language, apart from additional syntax that accommodates the sensor network environment.

SELECT temp, humid FROM sensors SAMPLE PERIOD 2 FOR 10; [3]

This query returns the node id, humidity level and temperature level at 2-second intervals for a duration of 10 seconds from all the available sensor nodes in the sensor network. The results are appended to the table as they arrive at the user, so the resulting table expands dynamically over time.

1) Client with TikiriSQL Library: The client-side functionality of TikiriDB is included in the TikiriSQL library and is used by a user program. It provides functions for the user program to issue SQL queries, parses the queries and sends them to the Serial Forwarder (SF). The TikiriSQL library returns to the user program the data received from the SF. Its main tasks are: 1) Accept queries from the user program, 2) Parse the query and put it into a manageable format, 3) If there are any syntactic or semantic errors, return warnings to the user.
The possible semantic errors are SELECT queries with undefined field names and EVENT queries with undefined event names. All the available field names and event names are kept in an XML configuration file. 1) If there are no errors, send the newly formatted query to the serial forwarder, 2) Return the query ID received from the SF to the user program, 3) This query ID can be used to issue a STOP query to stop an executing query identified by the query ID, 4) Put the data received from the SF into data structures and make it available to the user for manipulation.

C. Cougar
Cougar is another well-known approach to in-network query processing in sensor networks. It provides a platform for testing query processing techniques over ad-hoc sensor networks. Cougar has a three-tier architecture. It consists of,
• A query proxy
• Front-end components
• A graphical user interface
Cougar is designed for in-network query processing. In-network processing reduces energy consumption and increases the lifetime of a sensor network significantly compared to traditional centralized data extraction and analysis. Thus one of the main roles of the query proxy when processing user queries is to perform in-network processing [4].

D. NoSQL
NoSQL means Not Only SQL. The concept of NoSQL dates from 1998. NoSQL databases differ from Relational Database Management Systems (RDBMS). RDBMS are designed to guarantee ACID properties, whereas NoSQL databases do not guarantee ACID properties; they are designed primarily for performance and scalability. NoSQL databases are normally suitable for large sets of data.
Working with a large set of data in a table-based database system requires a lot of resources to store such massive data, and the operations are time-consuming. With NoSQL databases, handling massive amounts of data is much easier, and performance is very fast compared with RDBMS. The only limitation of NoSQL is the memory and the processing speed. NoSQL database systems use key-value pairs to store data, so if you want to keep your data in a persistent state and have access to it, this would be an ideal database system.
Currently there are several NoSQL database management systems available. Facebook's Cassandra, LinkedIn's Project Voldemort, Google's BigTable and Amazon's Dynamo are some of them. Chordless, CouchDB, Db4o, GT.M, HBase, Hypertable, MemcacheDB, Mnesia, MongoDB and Redis are some popular open source NoSQL projects as well.

III. DESIGN OF NoSQL DATABASE ABSTRACTION
In this section we discuss the design of the NoSQL database abstraction for WSN, from the higher-level architecture to the detailed design.

A. Architecture of NoSQL Database Abstraction
Figure 1 illustrates the overall design architecture of this research, including all the main components, which are discussed in detail in subsections 1) to 5).
As displayed in Figure 1, the NoSQL database abstraction consists of eight main components which contribute directly to the database abstraction. These eight main components are: front-end NoSQL query, lexical analyzer and parser, query processor, data packet, serial forwarder plug-in, mesh routing protocol, executing query in sensor motes and finally the Redis NoSQL database.

Fig. 1: Higher level design architecture

1) NoSQL Query: NoSQL queries play a major role in this database abstraction. This is a novel approach for sensor network database abstractions, because existing database abstractions use traditional SQL queries for querying sensor networks. We designed NoSQL query syntaxes for querying sensor networks. Most of these queries are similar to RedisDB NoSQL queries, because we adopt the RedisDB architecture for our abstraction. The designed NoSQL queries are,
1st & 2nd September 2011, The International Conference on Advances in ICT for Emerging Regions - ICTer2011
T.A.M.C. Thantriwatte and C.I. Keppetiyagama
• Select Query: Appropriate keyword followed by relevant key
• Join Query: Appropriate keyword followed by relevant key followed by valid set condition for key
• Range Query: Appropriate keyword followed by relevant key followed by valid range condition
• Ranking Data: Appropriate keyword followed by key and relevant member name
• Get the key of members: Appropriate keyword followed by relevant key
The NoSQL query interface is linked with a lexer. The lexer syntactically analyses the NoSQL query according to the defined rules. The following subsection 2) introduces the functionality of the NoSQL lexer and the parser.

2) Lexical Analyser and Parser: The input NoSQL query is passed to the lexical analyser, which reads the query characters from the input stream and tokenizes them. These tokens are identified using the predefined NoSQL lexer rules in the grammar file, which are implemented using regular expressions.
The main function of the lexer is to generate a stream of tokens from the NoSQL query and pass it to a parser for syntax analysis. It also ignores whitespace and comments.
Parsing is the second stage of NoSQL grammar validation according to the predefined NoSQL grammar rules. The parser checks the syntax and semantics of the NoSQL query and generates the parse tree. If the parser finds syntactic or semantic errors in the query, it produces an error message. Finally, the parser produces a C code file according to the NoSQL query, which is passed to the query processor.

3) Query Processor: The query processor processes the C file generated by the parser and distinguishes the parts of the query such as query type, relevant keys and other query conditions. According to these query parts it generates the query ID, query message header and query payload. After that, the processed query is considered a data packet, which is ready to be routed and executed in sensor motes.

4) Mesh Routing: In the wireless mesh network concept, communication is done in ad-hoc mode, also called peer-to-peer. Nodes in a mesh network should be able to discover each and every node and broadcast messages to all their neighbors.
In mesh networks we used a hybrid routing protocol to pass our queries from the base station to destinations. Hybrid protocols consist of both proactive and reactive routing protocols. Initially routing starts in proactive mode and then moves to reactive flooding in the network. In this research we used the inbuilt mesh routing protocol in the Rime stack of the Contiki [7] operating system for network routing.

5) Redis Architecture: RedisDB is an open source, advanced key-value store [5]. It is often referred to as a data structure server since keys can contain Strings, Hashes, Lists, Sets and Sorted-sets. The Redis protocol consists of a network layer where clients connect to port 6379. In the Redis server there are different kinds of replies according to client requests. They are error reply, integer reply, bulk replies, multi-bulk replies and nil elements in multi-bulk replies.
The Redis key space is divided into 4096 hash slots. To achieve that, different nodes hold a subset of the hash slots. All the nodes are connected to each other and the functionality of every node is equivalent.

IV. IMPLEMENTATION
This section discusses the implementation of the NoSQL database abstraction for wireless ad-hoc and sensor networks in more detail. The focus is on how the NoSQL queries are executed in actual sensor motes and how results are sent back to the base station. The following subsections A, B, C and D describe the implementation procedure of our NoSQL database abstraction for the Contiki operating system.

A. NoSQL Grammar Implementation
We used the ANTLR tool to define our NoSQL grammar definitions. First we defined the NoSQL grammars for our sensor network, and then the ANTLR tool generated the appropriate lexical analyser and parser. Using ANTLR we can easily test and generate the parse tree of our NoSQL queries. The implemented NoSQL queries are:
• GET temp SET temp 2 FOR 100;
• GET humid SET temp 2 FOR 100;
• ZRANK temp 2.0;
• GET humid SET humid < 45;
• ZRANK light 5.0;

B. Data Packet Implementation
We implemented the data packet according to the parsed NoSQL query from the lexical analyzer and parser. The size of the data packet is 128 bits and it is divided into the following categories.

Fig. 2: Structure of data packet (field widths shown in the figure: 8 bits, 8 bits, 8 bits, 8 bits, 16 bits, 32 bits)

• No of fields: This specifies which fields we want to sense from the sensor mote. According to the query, the relevant fields are set in the data packet.
• No of expressions: This indicates the number of expressions that appear in the SET clause. For example, in the query GET temp SET temp > 10 2 FOR 10; the expression "temp > 10" goes to the No of expressions section of the data packet.
• Input buffer ID: This defines the data source for the query. If this is zero, direct data from sensors is used for the query. A non-zero positive integer represents the input buffer identification number. Usually, the input buffer is a storage medium such as an SD card.
• Output buffer ID: This defines the location where the results of a query are stored. If this is zero, results are sent to the node that issued the query. A non-zero positive integer represents the output buffer identification number. Usually, the output buffer is a storage medium such as an SD card.
• Epoch duration: This defines the time duration between two executions of the query. Its value should be greater than zero. The duration is specified in seconds. Epoch duration is set in the SET clause of the query.
• Number of epochs: This defines the number of query executions. Zero causes the query to execute an infinite number of times.
• Fields: A field consists of 16 bits. It consists of a unique id, result flag and operator. 3 bits are not used in the field section.
• Expressions: An expression consists of 32 bits. Every field has a relevant expression. It consists of data which is mapped to the field id and operator.

C. Implementation of Serial Forwarder
We did not implement the serial forwarder. The existing TikiriDB [3] 0.2 release contains a serial forwarder which we can plug directly into our NoSQL database abstraction. So we used that serial forwarder plug-in to send data packets to the network.
Once data packets arrive at the serial forwarder, it assigns a unique query ID to each data packet. It also stores the query ID and client ID mapping in a table located in its memory. After that it sends the query to the network with the query ID and the ID of the serial forwarder. When data arrives from the sensor network, the serial forwarder looks up the relevant client ID in its table and sends the relevant data to the base stations. The serial forwarder is implemented in C++, and the serial forwarder COOJA [9] plug-in is written in Java.
Routing is another important part of sensor networks. Here we used the inbuilt mesh routing protocol in the Rime stack of the Contiki [7] operating system for routing.

D. Implementation of RedisDB Plug-in
We implemented the Redis backend in an iterative manner. RedisDB supports different data structures such as Strings, Hashes, Lists, Sets and Sorted-sets. First we implemented our backend data structure using Strings and evaluated the performance of our NoSQL database abstraction. After that we implemented it using the other data structures as well and evaluated them. These evaluations are described in Section V-C. According to those evaluations, we finally implemented the backend with Sorted-sets.
The Sorted-set implementation of our NoSQL database abstraction basically works with two values called key and member. In this implementation we mapped the sensing field of a query to the key and the sensing values to members. With these key and member values we can use the following NoSQL queries in our database abstraction.
• ZADD key score member: Add a member to a Sorted-set, or update its score if it already exists.
• ZCARD key: Get the number of members in a Sorted-set.
• ZCOUNT key min max: Count the members in a Sorted-set with scores within the given values.
• ZINCRBY key increment member: Increment the score of a member in a Sorted-set.
• ZINTERSTORE destination numkeys key [key ...]: Intersect multiple Sorted-sets and store the resulting Sorted-set in a new key.

V. RESULTS
In this section, we evaluate the implementation of the NoSQL query processing system for wireless ad-hoc networks on top of the Contiki operating system.

A. Evaluation platform
We used two platforms to test and evaluate our system. One is the COOJA [9] simulation platform and the other is a hardware platform. We often used the COOJA simulation platform to develop and test our system, because testing with real sensor motes is a somewhat tricky part of working with sensor networks and most of the system is platform independent. After successful testing with the COOJA simulation platform, we tested our system with real sensor motes as well.

B. Performance Analysis
In the performance analysis we analyzed the query execution time of the TikiriDB [3] database abstraction and our newly designed NoSQL database abstraction. We used the following queries to evaluate the execution time of both database abstractions. For the TikiriDB database abstraction,
• SELECT temp FROM sensors SAMPLE PERIOD 2 FOR 10;
• SELECT humid FROM sensors SAMPLE PERIOD 2 FOR 10;
For the NoSQL database abstraction,
• GET temp SET temp 2 FOR 10;
• GET humid SET humid 2 FOR 10;
We obtained the execution time of the queries by changing the sample period of both SQL and NoSQL queries. The following figure shows the query execution time of the TikiriDB database abstraction and the NoSQL database abstraction with respect to sample periods. The X and Y axes represent sample period variation in seconds and response time in milliseconds respectively.

Fig. 3: NoSQL against SQL query execution time (series: NoSQL, SQL; x-axis: 2 FOR 10, 2 FOR 100, 2 FOR 1000, 2 FOR 10000, 2 FOR 100000; y-axis: 0 to 350000)

According to Figure 3, for shorter time periods both database abstractions show the same performance, but as the time period increases, the performance of the NoSQL database abstraction gets better. That means the execution time of a NoSQL query in sensor networks is less than the execution time of an SQL query in sensor networks. The reason for the performance bottleneck in the SQL database abstraction is the processing time of the SQL query. According to the above results we can conclude that processing NoSQL queries is much more efficient than processing SQL
queries in sensor networks. This saves energy in the sensor motes.

C. Runtime Analysis
In Section IV-D we mentioned that we used different data structures as the backend of our NoSQL database abstraction. We tested the backend data structures against the time complexities of NoSQL queries. We obtained the following results.
• SET query analysis
TABLE I (table body not recoverable; surviving entry: O(log(N)))
• GET query analysis
TABLE II (table body not recoverable; surviving entry: O(log(N)))

From the above tables it can be observed that Sorted-sets show the best time complexity. After analysing Sorted-sets further for different NoSQL query operations, we got the following results.

TABLE III
SORTED-SET ANALYSIS [10]
Query                 Time Complexity
ZADD                  O(log(N))
ZCARD                 O(1)
ZCOUNT                O(log(N)+M)
ZINCRBY               O(log(N))
ZINTERSTORE           O(N*K)+O(M*log(M))
ZRANGE                O(log(N)+M)
ZRANGEBYSCORE         O(log(N)+M)
ZRANK                 O(log(N))
ZREM                  O(log(N))
ZREMRANGEBYRANK       O(log(N)+M)
ZREMRANGEBYSCORE      O(log(N)+M)
ZREVRANGE             O(log(N)+M)
ZREVRANGEBYSCORE      O(log(N)+M)
ZREVRANK              O(log(N))
ZSCORE                O(1)
ZUNIONSTORE           O(N)+O(M log(N))

According to Table III, the time complexities of NoSQL queries under the Sorted-set backend data structure give better performance than Strings, Hashes, Lists and Sets. Also, the problems that arose with Strings, Hashes, Lists and Sets can be solved using Sorted-sets. From the above comparison results, we concluded that the Sorted-set backend data structure is the suitable approach for the NoSQL database abstraction for wireless ad-hoc networks.

VI. CONCLUSION
In this paper we first reviewed the existing database abstractions in WSN. Then we discussed the issues related to existing database abstractions in wireless ad-hoc networks. Then we discussed the importance of using NoSQL as the underlying database abstraction for ad-hoc networks. According to the experimental results we conclude that NoSQL queries perform better when using longer sample periods. This shows higher scalability for the networks. Secondly we evaluated the query performance with regard to the time complexity of different data structures, and it showed that using Sorted-sets we can achieve the best results.

VII. FUTURE WORK
This research has evolved from the relational database model to the NoSQL database model. Therefore query optimization is not as good as in the relational model. Query optimization related to this research is to be done in the near future. In addition, optimized routing protocols and security layers can also be incorporated into this model.

ACKNOWLEDGMENTS
The authors would like to express their sincere thanks to the people in the SCORE lab of the University of Colombo School of Computing. We would also like to thank Primal Wijesekara from the University of British Columbia, K.C. Hewage from UCSC, Nayanagith M. Laxman from UCSC and A.P. Sayakkara from UCSC for their support.

REFERENCES
[1] C. Weyer, V. Turau, A. Lagemann, and J. Nolte, "Programming wireless sensor networks in a self-stabilizing style," Sensor Technologies and Applications, International Conference on, vol. 0, pp. 610-616, 2009.
[2] S. R. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong, "TinyDB: an acquisitional query processing system for sensor networks," ACM Transactions on Database Systems, vol. 30, no. 1, pp. 122-173, Mar. 2005. [Online]. Available: http://portal.acm.org/citation.cfm?doid=1061318.1061322
[3] N. M. Laxaman, M. D. J. S. Goonathillake, and K. D. Zoysa, "TikiriDB: Shared Wireless Sensor Network Database for Multi-User Data Access," CSSL 2010.
[4] Y. Yao, "The cougar approach to in-network query processing in sensor networks," ACM SIGMOD Record, vol. 31, no. 3, p. 9, Sep. 2002.
[5] M. Seeger, "Key-Value stores: a practical overview," Media, pp. 1-21, 2009.
[6] G. Decandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels, "Dynamo: Amazon's Highly Available Key-value Store," October, pp. 205-220, 2007.
[7] A. Dunkels, B. Gronvall, and T. Voigt, "Contiki - a lightweight and flexible operating system for tiny networked sensors," 29th Annual IEEE International Conference on Local Computer Networks, pp. 455-462. [Online]. Available: http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1367266
[8] P. Levis, S. Madden, J. Polastre, R. Szewczyk, K. Whitehouse, A. Woo, D. Gay, J. Hill, M. Welsh, E. Brewer et al., "TinyOS: An operating system for wireless sensor networks," Ambient Intelligence, vol. 35, 2005.
[9] F. Osterlind, A. Dunkels, J. Eriksson, N. Finne, and T. Voigt, "Cross-level sensor network simulation with COOJA," Proceedings. 2006 31st IEEE Conference on Local Computer Networks, pp. 641-648, Nov. 2006. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=4116633
[10] Command reference - Redis. http://redis.io/commands; 2011. [Online; accessed: 10 Jun 2011].
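
As an illustration of the Sorted-set commands used in Section IV-D and analysed in Table III, the following is a minimal Python sketch of the semantics of ZADD, ZCARD, ZCOUNT, ZRANK and ZINCRBY for a single key. It is not the authors' backend and not Redis code: the class name `SortedSet`, the method signatures and the member labels are hypothetical. It also does not reproduce the complexities of Table III, since a plain Python list gives O(log N) search through `bisect` but O(N) insertion, whereas Redis, to our knowledge, pairs a skip list with a hash table.

```python
import bisect


class SortedSet:
    """Toy sorted set for one key; members are ordered by (score, member)."""

    def __init__(self):
        self._scores = {}  # member -> score, gives O(1) score lookup
        self._order = []   # list of (score, member) tuples, kept sorted

    def zadd(self, score, member):
        """Add a member, or update its score if it already exists."""
        if member in self._scores:
            old = (self._scores[member], member)
            # Binary search finds the old entry; list.pop itself is O(N).
            self._order.pop(bisect.bisect_left(self._order, old))
        self._scores[member] = score
        bisect.insort(self._order, (score, member))

    def zcard(self):
        """Number of members in the set."""
        return len(self._order)

    def zcount(self, lo, hi):
        """Count members whose score lies in the closed range [lo, hi]."""
        # A 1-tuple (x,) sorts before every (x, member) pair.
        left = bisect.bisect_left(self._order, (lo,))
        right = bisect.bisect_left(self._order, (hi,))
        while right < len(self._order) and self._order[right][0] == hi:
            right += 1  # include members whose score equals hi
        return right - left

    def zrank(self, member):
        """Zero-based rank of a member by ascending score, or None."""
        if member not in self._scores:
            return None
        return bisect.bisect_left(self._order, (self._scores[member], member))

    def zincrby(self, increment, member):
        """Increment a member's score (missing members start at 0)."""
        new_score = self._scores.get(member, 0) + increment
        self.zadd(new_score, member)
        return new_score


# Hypothetical usage: one set for the key "temp", scores are readings.
temp = SortedSet()
temp.zadd(21.5, "node1")
temp.zadd(19.0, "node2")
temp.zadd(25.0, "node3")
```

In this sketch the sensing field (for example temp) would select which `SortedSet` instance to use, which mirrors the key-to-member mapping described in Section IV-D only loosely.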
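
The 128-bit data packet of Section IV-B can be sketched with Python's `struct` module. The paper gives the total size, the 16-bit Fields section and the 32-bit Expressions section; the remaining widths below are assumptions chosen so that the total comes to 128 bits (8 bits for each of the four leading values, 16 bits for the epoch duration, 32 bits for the number of epochs). The field order, the byte order and the names `PACKET_FORMAT`, `pack_query` and `unpack_query` are likewise hypothetical and are not taken from the authors' implementation.

```python
import struct

# Assumed layout, big-endian with no padding: 4 x 8 + 16 + 32 + 16 + 32 = 128 bits.
PACKET_FORMAT = ">BBBBHIHI"
FIELD_NAMES = (
    "no_of_fields",       # 8 bits (assumed width)
    "no_of_expressions",  # 8 bits (assumed width)
    "input_buffer_id",    # 8 bits (assumed width); 0 = read sensors directly
    "output_buffer_id",   # 8 bits (assumed width); 0 = reply to issuing node
    "epoch_duration",     # 16 bits (assumed width); seconds, must be > 0
    "number_of_epochs",   # 32 bits (assumed width); 0 = run indefinitely
    "fields",             # 16 bits, as stated in the paper
    "expressions",        # 32 bits, as stated in the paper
)


def pack_query(**values):
    """Pack the eight packet sections into 16 bytes."""
    if values["epoch_duration"] <= 0:
        raise ValueError("epoch duration must be greater than zero")
    return struct.pack(PACKET_FORMAT, *(values[name] for name in FIELD_NAMES))


def unpack_query(data):
    """Unpack 16 bytes back into a dict keyed by section name."""
    return dict(zip(FIELD_NAMES, struct.unpack(PACKET_FORMAT, data)))


# Hypothetical packet for a query such as: GET temp SET temp 2 FOR 10;
packet = pack_query(
    no_of_fields=1,
    no_of_expressions=0,
    input_buffer_id=0,
    output_buffer_id=0,
    epoch_duration=2,
    number_of_epochs=10,
    fields=0x0001,
    expressions=0,
)
```

A fixed-width layout like this keeps parsing on a mote to simple offset reads, which is presumably why the paper fixes the packet size, although the actual bit assignment in Fig. 2 may differ from the one assumed here.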