SlideShare una empresa de Scribd logo
1 de 5
Descargar para leer sin conexión
K.Manju et al Int. Journal of Engineering Research and Applications
ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.552-556

RESEARCH ARTICLE

www.ijera.com

OPEN ACCESS

Assessment of Efficacy among String Quest
K.Manju1, R. Brindha2, V. Lathika3, Mr. A. M. Ravishankkar4, Mr. T.
Yoganandh5
1,2,3

UG Student-Department of CSE Jay Shriram group of institutions, Avinashipalayam, India
Assistant Professor- Department of CSE Jay Shriram group of institutions, Avinashipalayam, India

4,5

ABSTRACT
In a large spatial database the work mainly deals with the approximate string search. Particularly we examine
the query range append with the string similarity search in both Euclidean space and Road network. We dub this
query the spatial approximate string (SAS) query. We propose an approximate solution in Euclidean space, the
D-tree which embeds min-wise signatures into an R-tree. The min-wise signature for an index node u keeps a
clear representation of the union of q-grams from the sub tree of u. We find the pruning functionality of such
signatures based on the set resemblance between the query string and the q-grams from the sub tree of index
nodes. We discuss about the estimation of selectivity of a SAS query in Euclidean space, for which we present a
novel adaptive algorithm to find the balanced partitions using both the spatial and string information stored in
the tree. In Road networks, we propose a novel exact method, RSASSOL, which significantly performs the
baseline algorithm. The RSASSOL method partitions the road network, adaptively searches relevant sub graphs,
and prunes candidate points using both the string matching index and the spatial reference nodes. The efficiency
and effectiveness of our approaches is by the extensive experiments on large real data sets.
Keywords- Approximate string search, range query, road network and spatial databases

I. INTRODUCTION
Data mining also known as Knowledge
Discovery or Knowledge Discovery in Database
(KDD), is the process of extracting or mining the
data from large amount of databases. Text mining is
one of the applications of data mining which mainly
involves the process of extracting interesting
information and knowledge from unstructured text
documents. Keyword search over a large amount of
data is an important operation in a wide range of
domains. In spatial databases, wherekeyword search
becomes a fundamental building block for an
increasing number of real-world applications, and
proposed the IR2-Tree. A main limitationof the IR2Tree is that it onlysupports exact keyword search.
Approximate string search is necessarywhen users
have a fuzzysearchcondition, or a spelling error when
submitting the query, or the strings in the database
contain some degree of uncertainty or error. In the
context of spatial search could be combined with any
type of spatial queries. In this work, we focus on
range queries and dub such queries as Spatial
Approximate String (SAS) queries. We denote SAS
queries in Euclidean space as (ESAS) queries.
Similarly, it extends SAS queries to road networks
(referred as RSAS queries).In ESAS, this motivates
the need for string matching. A critical component of
record matching involves determining whether two
strings are similar or not: Two strings are considered
matches if their corresponding (string) attributes are
www.ijera.com

similar. String similarity is typically measured via a
similarity function that, given a pair of strings returns
a number between 0 and 1 a higher value indicating a
greater degree of similarity with the value 1
corresponding to equality. This function is used to
perform a similarity join between two input relations
that returns pairs of strings whose similarity is above
an input threshold.

II. RELATED WORK
1.

INCORPORATING FORMAL STRING
INDEX TRANSFORMATION IN RECORD
MATCHING
In this paper, the author considered the
problem of record matching for user-defined string
transformation as input. The similarity between two
strings is defined by transformations coupled with an
underlying function of similarity. To lookup an input
record against a table of records, where we make this
approach with effectiveness by a fuzzy match
operation. Here we have an additional table of
transformation as input. The cognizant of
transformations is nothing but the improvement in the
quality of record matching and the efficient retrieval
based on our index structure. The characteristic of
this scenario is the most input sub-strings do not
match with any member of the dictionary, sowe
developed a compact filter which efficiently filters
out a large number of sub-strings that cannot match
with any member of dictionary. For membership, the
552 | P a g e
K.Manju et al Int. Journal of Engineering Research and Applications
ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.552-556
sub-strings which pass out filter are then verified by
checking it. We demonstrate real datasets that
approach significantly outperforms both current best
exact methods as well as probabilistic methods, that
may not identify a small percentage of matching substrings.
2.

RETRIEVING TOP-K PRESTIGE-BASED
RELEVANT SPATIAL WEB OBJECTS
In this paper, the author handles the problem of
retrieving web documents which is relevant to query
of a keyword within a pre-specified spatial region.
There are two stages in query processing. In the First
stage, indexing is for the filtration of web document.
In the second stage, employed the another index.
(e.g.., R-tree). To integrate the R-tree with signature
files here a hybrid index structure is proposed. For
the purpose of pruning the search space at a query
time, both spatial information and text information is
utilized by enables the hybrid index structure.
However, this proposal is limited by its use of
signature files (e.g.., the number of false matches is
linear in the collection size and there is no sensible
way of using signature files for handling ranking
queries). To process a new type of query the
combination of R*-tree and bitmap indexing is
developed by hybrid index structure is called l-closest
keyword query. To enable the efficient processing of
the location-aware top-k ranking query, it utilizes
both location and text information to prune the search
space which integrates the R-tree and inverted files
for the IR-tree in hybrid index structure.
3.

SELECTIVITY
ESTIMATION
IN
SPATIAL DATABASES
In this paper, the author proposed a several
new techniques for spatial selectivity estimation.
These techniques are based on the spatial indices,
binary space partitioning, and the novel notion of
spatial skew. In database the critical component of
query processing is selectivity estimation. For the
spatial selectivity estimation there will be a very little
work in providing accurate and efficient techniques,
despite the increasing popularity of spatial databases.
In this domain the relational techniques do not
perform well because the spatial data defers from the
relational data. From the previously known
techniques,the author can able to show that : (a)
Sampling and parametric techniques which work well
in the relational one-dimensional world do not work
well for spatial data. (b) A BSP based partitioning
that we call Min-skew outperforms the other

www.ijera.com

www.ijera.com

techniques over a broad range of query workloads
and datasets.

III. EXISTING SYSTEM
Approximate string search is necessary
when the users have a fuzzy condition search or
spelling error when submitting the query. By
completely ignoring the spatial component of a
query, we evaluate only the string predicate by
matching the index which is built as a string in both
ESAS and RSAS query to produce a direct solution.
The string solution which contains a point is not
satisfied by the spatial predicate has been pruned in
post processing step after all similar strings has been
retrieved.
1) The string solution suffers the same scalability
and performance issues as the spatial solution.
2) In existing spatial databases additionally we
answer for SAS query to enable the efficient
processing of standard spatial queries which is a
spatial-oriented solution.

IV. PROPOSED SYSTEM
In our proposed system, we divide a roads
network G={V,E} into t edge-disjoint sub graphs
G1,G2,……,Gt, where t is a user parameter, and for
each sub graph build one string index. From V as
reference nodes, we also select a small subset VR of
nodes: they are used to prune candidate points/nodes
whose distance to query point q are out of query
range r. In our RSAS query framework it consists of
five steps.
1) We find all the sub graphs which intersect with
the query range.
2) To retrieve the points we use the filtration tree
of the sub graphs whose string are potentially
similar to the query string.
3) We performing the calculation of lower and
upper bounds of their distance to the query
point, using VR to prune away some of the
candidate points.
4) Between the query string and the strings of
candidates the exact edit distance is performing
to prune away some further candidate points.
After this step, the string predicate has been
fully explored.
5) We do the checking process of their exact edit
distance to the query point for the remaining
candidate points to return those with distance
within
r.

553 | P a g e
K.Manju et al Int. Journal of Engineering Research and Applications
ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.552-556

A Query

Find all sub
graphs

Filtering Phase

Candidate Points
Pruning Phase

Filter Tree

By calculating
the lower and
upper bounds

Retrieve points
from strings

Query results

www.ijera.com

Using the
exact edit
distance

Check the exact edit distance of remaining candidate points

Figure 1.SYSTEM ARCHITECTURE
In RSASSOL algorithm, we find all the sub
graphs that intersect with the query range. Here we
employee the Dijkstra‟s algorithm, that is starting
from the query point q to traverse nodes in G. we
analyze that sub graphs for further explorations,
whenever this traversal needs the first node of new
sub graph. When we reach the boundary of query
range automatically the algorithm terminates.To find
the points from Gi that may share similar strings to
the query strings, examine theeach sub graph Gi, we
use the approximate string search over Gi‟s filter tree
as the next pruning step. Then using the spatial
predicate, we prune the candidate points by
computing lower and upper bounds on their distance
to q using VR, in a similar way to the ALT algorithm.
Given a candidate point p on an age w=(mi,mj), the
shortest path from p to a reference node mr must pass
through either mi or mj.
Network distance
d(p,mr)=min(d(p,mi)+d(mi,mr),d(p,mj)+d(mj,.mr))
where

www.ijera.com

d(mi,mr),d(mj,mr) are available from RDISTi and
RDISTj respectively,
d(p,mi) is the distance offset of p to mi which is
available in the adjacency list and the points file of
mi,
d(p,mj)=NDIST(mi,mj)-d(p,mi)
where
NDIST(mi,mj) is available in the adjacency list of mi.
We compute d(p,mr) on the fly rather than
explicitly storing the distance between a point and a
reference node since the number of points much
larger than the number nodes in G. given d(p,mr) and
d(q,mr) for every mrЄVR, we then obtain the distance
lower and upper bounds between p and q using the
triangle inequality. Besides the batch verification, we
support one-at-a-time-verification, which implement
as follows. Verification model consists of two
phases: first, the building phase, second, querying
phase. In building phase, we create λ( r ) tuples { id,
r, hash_sig, wt} for each dictionary string r and each
signature generated by r. Hash code of signature is
denoted as hash_sig and wt is the weight of the string
r.

554 | P a g e
K.Manju et al Int. Journal of Engineering Research and Applications
ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.552-556

www.ijera.com

DatabaseStrings
Building phase
In memory filter

Input
String

On disk indexed table

Indexed
StringString
Filtering component

Exact
Verification component

Querying phase

Figure 2. OVERVIEW OF THE FRAMEWORK
RSASSOL ALGORITHM
1. Find the set X of ids from all the sub graphs
2. Set B = ß ,Bc= ß
3. For each sub graph id j є X do
4. Find all points ids in Ki whose associated strings
α ‟ may satisfy є(α „ ,α) ≤ r using filter tree
5. for every point Pi є Bc do
6. Calculate b+ (pi , q) and b – (pi ,q) as discussed
7 .if b+ (pi ,q) ≤ r then
8. if є (α i ,α) ≤ r then move pi from Bc to B

9.

10.
11.
12.
13.
14.

else delete pi from Bc else
if b - (pi ,q) > r then
delete pi from Bc
for every point pi є Bc do
if є(α i ,α) > r then
delete pi from Bc
Use the MPALT algorithm to find all points p „s
in Bc
Return B

String(id,str) Compare(id1,id2)

Signature(id,sign)

Output(id1,id2)

Figure 3.SS-JOIN IMPLEMENTATION
In this section, we perform the String
Similarity joins using edit distance, which is one of
the most common distance functions for string. SSJoins are closely related to set-containment joins,
which has been the main part of several previous
works. Generally, similarity joins are closely related
to proximity search. The goal is to retrieve thelookup
closest object.
APPROXIMATE MEMBERSHIP CHECKING
Input: Z,ϒ S =<t1 ,t2......>
,
1. Build the filter f (Z,ϒ
)
2. Index Z for verification
3. for (Start= 1 to lZ l – L + 1)
4. for (length =1 to L)
5. m ← < t start,tstart+1,tstart +length-1>
6. if(f. prune(m)==true) continue
www.ijera.com

7. if(Э r ε Z, S. t Similarity(r, m)≥ delta)
8. Output m

V. EXPERIMENTAL SETUP
The system is developed using Java and is used
in the system development. MYSQL is used as a
back end for this system development. Input to this
project is the user‟s string which is related to the
keyword which already in database.
1) The process of entering the string to collect the
exact information about the string.
2) The particular string makes a compare with
related strings which is already stored in the
database by using the ids.
3) Then the id shortlisted the string which is
related to the user string.

555 | P a g e
K.Manju et al Int. Journal of Engineering Research and Applications
ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.552-556
At last by applying the Dijkstra‟s algorithm
user can get the exact information for
particular string.
The attention is finally paid to extract
exact information from the database based on
comparison between the strings.
4)

the
the
[2]
the
the

VI. CONCLUSION
In this paper, we use the edit distance for the
string predicate as the similarity measurement. And
also address the problem of query selectivity
estimation for the queries in Euclidean space. In the
selectivityestimation for the query range on road
networks where proposed. Even though, they can
only able to estimate the number of loads and edges
in the range. Future work includes examines the
queries of spatial approximate sub-string, designing
methods that are more user friendly and solving the
selective estimation problem for RSAS queries.

REFERENCES
[1]

S. Acharya, V. Poosala, and S. Ramaswamy,
“Selectivity
Estimation
in
Spatial

www.ijera.com

[3]

[4]

[5]

[6]

www.ijera.com

Databases”, Proc. ACM SIGMOD Int‟l
Conf Management of Data, 1999.
S. Alsubaiee, A. Behm, and C. Li,
“Supporting Location-Based ApproximateKeyword Queries”, Proc. SIGSPATIAL 18 th
Int‟l Conf. Advances in Geographic
Information Systems (GIS), 2010.
A. Arasu, S. Chaudhuri, K. Ganjam, and R.
Kaushik,
“Incorporating
String
Transformations in Record Matching”, Proc.
ACM SIGMOD Int‟l Conf. Management of
Data, 2008.
A. Arasu, V. Ganti, and R. Kaushik,
“Efficient Exact Set-Similarity Joins”, Proc.
32nd Int‟l Conf. Very Large Data Bases
(VLDB), 2006.
N. Beckmann, H. P. Kriegel, R. Schneider,
and B. Seeger, “The R*-Tree: an Efficient
and Robust Access Method for points and
Rectangles”, Proc. ACM SIGMOD Int‟l
Conf. Management of Data, 1990.
X. Cao, G. Cong, and C.S. Jensen,
“Retrieving Top-k Prestige-Based Relevant
Spatial Web Objects”, Proc. VLDB
Endowment,
vol.
3,
2010.

556 | P a g e

Más contenido relacionado

La actualidad más candente

International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER) International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER) ijceronline
 
JPJ1422 Fast Nearest Neighbour Search With Keywords
JPJ1422   Fast Nearest Neighbour Search With KeywordsJPJ1422   Fast Nearest Neighbour Search With Keywords
JPJ1422 Fast Nearest Neighbour Search With Keywordschennaijp
 
Designing of Semantic Nearest Neighbor Search: Survey
Designing of Semantic Nearest Neighbor Search: SurveyDesigning of Semantic Nearest Neighbor Search: Survey
Designing of Semantic Nearest Neighbor Search: SurveyEditor IJCATR
 
Fast nearest neighbor search with keywords
Fast nearest neighbor search with keywordsFast nearest neighbor search with keywords
Fast nearest neighbor search with keywordsIEEEFINALYEARPROJECTS
 
Basic Local Alignment Search Tool (BLAST)
Basic Local Alignment Search Tool (BLAST)Basic Local Alignment Search Tool (BLAST)
Basic Local Alignment Search Tool (BLAST)Asiri Wijesinghe
 
Fast nearest neighbor search with keywords
Fast nearest neighbor search with keywords Fast nearest neighbor search with keywords
Fast nearest neighbor search with keywords Papitha Velumani
 
Automated building of taxonomies for search engines
Automated building of taxonomies for search enginesAutomated building of taxonomies for search engines
Automated building of taxonomies for search enginesBoris Galitsky
 
Keyphrase Extraction using Neighborhood Knowledge
Keyphrase Extraction using Neighborhood KnowledgeKeyphrase Extraction using Neighborhood Knowledge
Keyphrase Extraction using Neighborhood KnowledgeIJMTST Journal
 
Range Query on Big Data Based on Map Reduce
Range Query on Big Data Based on Map ReduceRange Query on Big Data Based on Map Reduce
Range Query on Big Data Based on Map ReduceIJMER
 
Query Distributed RDF Graphs: The Effects of Partitioning Paper
Query Distributed RDF Graphs: The Effects of Partitioning PaperQuery Distributed RDF Graphs: The Effects of Partitioning Paper
Query Distributed RDF Graphs: The Effects of Partitioning PaperDBOnto
 
Fast nearest neighbor search with keywords
Fast nearest neighbor search with keywordsFast nearest neighbor search with keywords
Fast nearest neighbor search with keywordsLeMeniz Infotech
 
5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information Retrieval5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information RetrievalBhaskar Mitra
 
Full-Text Retrieval in Unstructured P2P Networks using Bloom Cast Efficiently
Full-Text Retrieval in Unstructured P2P Networks using Bloom Cast EfficientlyFull-Text Retrieval in Unstructured P2P Networks using Bloom Cast Efficiently
Full-Text Retrieval in Unstructured P2P Networks using Bloom Cast Efficientlyijsrd.com
 
Dual Embedding Space Model (DESM)
Dual Embedding Space Model (DESM)Dual Embedding Space Model (DESM)
Dual Embedding Space Model (DESM)Bhaskar Mitra
 
11. Hashing - Data Structures using C++ by Varsha Patil
11. Hashing - Data Structures using C++ by Varsha Patil11. Hashing - Data Structures using C++ by Varsha Patil
11. Hashing - Data Structures using C++ by Varsha Patilwidespreadpromotion
 

La actualidad más candente (20)

International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER) International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
 
JPJ1422 Fast Nearest Neighbour Search With Keywords
JPJ1422   Fast Nearest Neighbour Search With KeywordsJPJ1422   Fast Nearest Neighbour Search With Keywords
JPJ1422 Fast Nearest Neighbour Search With Keywords
 
Designing of Semantic Nearest Neighbor Search: Survey
Designing of Semantic Nearest Neighbor Search: SurveyDesigning of Semantic Nearest Neighbor Search: Survey
Designing of Semantic Nearest Neighbor Search: Survey
 
Fast nearest neighbor search with keywords
Fast nearest neighbor search with keywordsFast nearest neighbor search with keywords
Fast nearest neighbor search with keywords
 
Basic Local Alignment Search Tool (BLAST)
Basic Local Alignment Search Tool (BLAST)Basic Local Alignment Search Tool (BLAST)
Basic Local Alignment Search Tool (BLAST)
 
Fast nearest neighbor search with keywords
Fast nearest neighbor search with keywords Fast nearest neighbor search with keywords
Fast nearest neighbor search with keywords
 
Ir 02
Ir   02Ir   02
Ir 02
 
Ir 03
Ir   03Ir   03
Ir 03
 
Automated building of taxonomies for search engines
Automated building of taxonomies for search enginesAutomated building of taxonomies for search engines
Automated building of taxonomies for search engines
 
Keyphrase Extraction using Neighborhood Knowledge
Keyphrase Extraction using Neighborhood KnowledgeKeyphrase Extraction using Neighborhood Knowledge
Keyphrase Extraction using Neighborhood Knowledge
 
Spatial approximate string search
Spatial approximate string searchSpatial approximate string search
Spatial approximate string search
 
Range Query on Big Data Based on Map Reduce
Range Query on Big Data Based on Map ReduceRange Query on Big Data Based on Map Reduce
Range Query on Big Data Based on Map Reduce
 
Query Distributed RDF Graphs: The Effects of Partitioning Paper
Query Distributed RDF Graphs: The Effects of Partitioning PaperQuery Distributed RDF Graphs: The Effects of Partitioning Paper
Query Distributed RDF Graphs: The Effects of Partitioning Paper
 
Fast nearest neighbor search with keywords
Fast nearest neighbor search with keywordsFast nearest neighbor search with keywords
Fast nearest neighbor search with keywords
 
5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information Retrieval5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information Retrieval
 
Full-Text Retrieval in Unstructured P2P Networks using Bloom Cast Efficiently
Full-Text Retrieval in Unstructured P2P Networks using Bloom Cast EfficientlyFull-Text Retrieval in Unstructured P2P Networks using Bloom Cast Efficiently
Full-Text Retrieval in Unstructured P2P Networks using Bloom Cast Efficiently
 
Sub1579
Sub1579Sub1579
Sub1579
 
P33077080
P33077080P33077080
P33077080
 
Dual Embedding Space Model (DESM)
Dual Embedding Space Model (DESM)Dual Embedding Space Model (DESM)
Dual Embedding Space Model (DESM)
 
11. Hashing - Data Structures using C++ by Varsha Patil
11. Hashing - Data Structures using C++ by Varsha Patil11. Hashing - Data Structures using C++ by Varsha Patil
11. Hashing - Data Structures using C++ by Varsha Patil
 

Destacado (20)

Cd4201522527
Cd4201522527Cd4201522527
Cd4201522527
 
Ck4201578592
Ck4201578592Ck4201578592
Ck4201578592
 
B41021318
B41021318B41021318
B41021318
 
Ka3617341739
Ka3617341739Ka3617341739
Ka3617341739
 
Fe35907912
Fe35907912Fe35907912
Fe35907912
 
Ez35875879
Ez35875879Ez35875879
Ez35875879
 
Dw35701704
Dw35701704Dw35701704
Dw35701704
 
Kl3518001806
Kl3518001806Kl3518001806
Kl3518001806
 
Curso de capacitación resumen tarea
Curso de capacitación resumen tareaCurso de capacitación resumen tarea
Curso de capacitación resumen tarea
 
Empresa
EmpresaEmpresa
Empresa
 
Anlisisliterarioediporey 120721200707-phpapp01
Anlisisliterarioediporey 120721200707-phpapp01Anlisisliterarioediporey 120721200707-phpapp01
Anlisisliterarioediporey 120721200707-phpapp01
 
Suisse
SuisseSuisse
Suisse
 
Presentación1
Presentación1Presentación1
Presentación1
 
Valentina moncada 6 2
Valentina moncada 6 2Valentina moncada 6 2
Valentina moncada 6 2
 
1406 le28 lexico contextual 2 2015
1406 le28 lexico contextual 2 20151406 le28 lexico contextual 2 2015
1406 le28 lexico contextual 2 2015
 
Trabajo barras
Trabajo barrasTrabajo barras
Trabajo barras
 
Álbum Digital - Jenny Chinchay Castillo
Álbum Digital - Jenny Chinchay CastilloÁlbum Digital - Jenny Chinchay Castillo
Álbum Digital - Jenny Chinchay Castillo
 
10
1010
10
 
Cartilla politicas cuarto periodo
Cartilla politicas  cuarto periodo Cartilla politicas  cuarto periodo
Cartilla politicas cuarto periodo
 
3
33
3
 

Similar a Cg4201552556

JAVA 2013 IEEE CLOUDCOMPUTING PROJECT Spatial approximate string search
JAVA 2013 IEEE CLOUDCOMPUTING PROJECT Spatial approximate string searchJAVA 2013 IEEE CLOUDCOMPUTING PROJECT Spatial approximate string search
JAVA 2013 IEEE CLOUDCOMPUTING PROJECT Spatial approximate string searchIEEEGLOBALSOFTTECHNOLOGIES
 
JAVA 2013 IEEE NETWORKSECURITY PROJECT Spatial approximate string search
JAVA 2013 IEEE NETWORKSECURITY PROJECT Spatial approximate string searchJAVA 2013 IEEE NETWORKSECURITY PROJECT Spatial approximate string search
JAVA 2013 IEEE NETWORKSECURITY PROJECT Spatial approximate string searchIEEEGLOBALSOFTTECHNOLOGIES
 
ROMAN URDU OPINION MINING SYSTEM (RUOMIS)
ROMAN URDU OPINION MINING SYSTEM (RUOMIS) ROMAN URDU OPINION MINING SYSTEM (RUOMIS)
ROMAN URDU OPINION MINING SYSTEM (RUOMIS) cseij
 
Hybrid approach for generating non overlapped substring using genetic algorithm
Hybrid approach for generating non overlapped substring using genetic algorithmHybrid approach for generating non overlapped substring using genetic algorithm
Hybrid approach for generating non overlapped substring using genetic algorithmeSAT Publishing House
 
Approaches for Keyword Query Routing
Approaches for Keyword Query RoutingApproaches for Keyword Query Routing
Approaches for Keyword Query RoutingIJERA Editor
 
Using Page Size for Controlling Duplicate Query Results in Semantic Web
Using Page Size for Controlling Duplicate Query Results in Semantic WebUsing Page Size for Controlling Duplicate Query Results in Semantic Web
Using Page Size for Controlling Duplicate Query Results in Semantic WebIJwest
 
QUALITY-AWARE SUBGRAPH MATCHING OVER INCONSISTENT PROBABILISTIC GRAPH DATABASES
QUALITY-AWARE SUBGRAPH MATCHING OVER INCONSISTENT PROBABILISTIC GRAPH DATABASESQUALITY-AWARE SUBGRAPH MATCHING OVER INCONSISTENT PROBABILISTIC GRAPH DATABASES
QUALITY-AWARE SUBGRAPH MATCHING OVER INCONSISTENT PROBABILISTIC GRAPH DATABASESNexgen Technology
 
Keyword Query Routing
Keyword Query RoutingKeyword Query Routing
Keyword Query RoutingSWAMI06
 
A survey on location based serach using spatial inverted index method
A survey on location based serach using spatial inverted index methodA survey on location based serach using spatial inverted index method
A survey on location based serach using spatial inverted index methodeSAT Journals
 
SPATIAL R-TREE INDEX BASED ON GRID DIVISION FOR QUERY PROCESSING
SPATIAL R-TREE INDEX BASED ON GRID DIVISION FOR QUERY PROCESSINGSPATIAL R-TREE INDEX BASED ON GRID DIVISION FOR QUERY PROCESSING
SPATIAL R-TREE INDEX BASED ON GRID DIVISION FOR QUERY PROCESSINGijdms
 
final_copy_camera_ready_paper (7)
final_copy_camera_ready_paper (7)final_copy_camera_ready_paper (7)
final_copy_camera_ready_paper (7)Ankit Rathi
 
Vchunk join an efficient algorithm for edit similarity joins
Vchunk join an efficient algorithm for edit similarity joinsVchunk join an efficient algorithm for edit similarity joins
Vchunk join an efficient algorithm for edit similarity joinsVijay Koushik
 
AN ALGORITHM FOR OPTIMIZED SEARCHING USING NON-OVERLAPPING ITERATIVE NEIGHBOR...
AN ALGORITHM FOR OPTIMIZED SEARCHING USING NON-OVERLAPPING ITERATIVE NEIGHBOR...AN ALGORITHM FOR OPTIMIZED SEARCHING USING NON-OVERLAPPING ITERATIVE NEIGHBOR...
AN ALGORITHM FOR OPTIMIZED SEARCHING USING NON-OVERLAPPING ITERATIVE NEIGHBOR...IJCSEA Journal
 
JPJ1423 Keyword Query Routing
JPJ1423   Keyword Query RoutingJPJ1423   Keyword Query Routing
JPJ1423 Keyword Query Routingchennaijp
 

Similar a Cg4201552556 (20)

Spatial approximate string search
Spatial approximate string searchSpatial approximate string search
Spatial approximate string search
 
JAVA 2013 IEEE CLOUDCOMPUTING PROJECT Spatial approximate string search
JAVA 2013 IEEE CLOUDCOMPUTING PROJECT Spatial approximate string searchJAVA 2013 IEEE CLOUDCOMPUTING PROJECT Spatial approximate string search
JAVA 2013 IEEE CLOUDCOMPUTING PROJECT Spatial approximate string search
 
JAVA 2013 IEEE NETWORKSECURITY PROJECT Spatial approximate string search
JAVA 2013 IEEE NETWORKSECURITY PROJECT Spatial approximate string searchJAVA 2013 IEEE NETWORKSECURITY PROJECT Spatial approximate string search
JAVA 2013 IEEE NETWORKSECURITY PROJECT Spatial approximate string search
 
Spatial approximate string search
Spatial approximate string searchSpatial approximate string search
Spatial approximate string search
 
ROMAN URDU OPINION MINING SYSTEM (RUOMIS)
ROMAN URDU OPINION MINING SYSTEM (RUOMIS) ROMAN URDU OPINION MINING SYSTEM (RUOMIS)
ROMAN URDU OPINION MINING SYSTEM (RUOMIS)
 
Hybrid approach for generating non overlapped substring using genetic algorithm
Hybrid approach for generating non overlapped substring using genetic algorithmHybrid approach for generating non overlapped substring using genetic algorithm
Hybrid approach for generating non overlapped substring using genetic algorithm
 
At33264269
At33264269At33264269
At33264269
 
At33264269
At33264269At33264269
At33264269
 
Approaches for Keyword Query Routing
Approaches for Keyword Query RoutingApproaches for Keyword Query Routing
Approaches for Keyword Query Routing
 
Using Page Size for Controlling Duplicate Query Results in Semantic Web
Using Page Size for Controlling Duplicate Query Results in Semantic WebUsing Page Size for Controlling Duplicate Query Results in Semantic Web
Using Page Size for Controlling Duplicate Query Results in Semantic Web
 
QUALITY-AWARE SUBGRAPH MATCHING OVER INCONSISTENT PROBABILISTIC GRAPH DATABASES
QUALITY-AWARE SUBGRAPH MATCHING OVER INCONSISTENT PROBABILISTIC GRAPH DATABASESQUALITY-AWARE SUBGRAPH MATCHING OVER INCONSISTENT PROBABILISTIC GRAPH DATABASES
QUALITY-AWARE SUBGRAPH MATCHING OVER INCONSISTENT PROBABILISTIC GRAPH DATABASES
 
Keyword Query Routing
Keyword Query RoutingKeyword Query Routing
Keyword Query Routing
 
A survey on location based serach using spatial inverted index method
A survey on location based serach using spatial inverted index methodA survey on location based serach using spatial inverted index method
A survey on location based serach using spatial inverted index method
 
SPATIAL R-TREE INDEX BASED ON GRID DIVISION FOR QUERY PROCESSING
SPATIAL R-TREE INDEX BASED ON GRID DIVISION FOR QUERY PROCESSINGSPATIAL R-TREE INDEX BASED ON GRID DIVISION FOR QUERY PROCESSING
SPATIAL R-TREE INDEX BASED ON GRID DIVISION FOR QUERY PROCESSING
 
Ceis 1
Ceis 1Ceis 1
Ceis 1
 
final_copy_camera_ready_paper (7)
final_copy_camera_ready_paper (7)final_copy_camera_ready_paper (7)
final_copy_camera_ready_paper (7)
 
Az31349353
Az31349353Az31349353
Az31349353
 
Vchunk join an efficient algorithm for edit similarity joins
Vchunk join an efficient algorithm for edit similarity joinsVchunk join an efficient algorithm for edit similarity joins
Vchunk join an efficient algorithm for edit similarity joins
 
AN ALGORITHM FOR OPTIMIZED SEARCHING USING NON-OVERLAPPING ITERATIVE NEIGHBOR...
AN ALGORITHM FOR OPTIMIZED SEARCHING USING NON-OVERLAPPING ITERATIVE NEIGHBOR...AN ALGORITHM FOR OPTIMIZED SEARCHING USING NON-OVERLAPPING ITERATIVE NEIGHBOR...
AN ALGORITHM FOR OPTIMIZED SEARCHING USING NON-OVERLAPPING ITERATIVE NEIGHBOR...
 
JPJ1423 Keyword Query Routing
JPJ1423   Keyword Query RoutingJPJ1423   Keyword Query Routing
JPJ1423 Keyword Query Routing
 

Último

Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 

Último (20)

Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 

Cg4201552556

  • 1. K.Manju et al Int. Journal of Engineering Research and Applications ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.552-556 RESEARCH ARTICLE www.ijera.com OPEN ACCESS Assessment of Efficacy among String Quest K.Manju1, R. Brindha2, V. Lathika3, Mr. A. M. Ravishankkar4, Mr. T. Yoganandh5 1,2,3 UG Student-Department of CSE Jay Shriram group of institutions, Avinashipalayam, India Assistant Professor- Department of CSE Jay Shriram group of institutions, Avinashipalayam, India 4,5 ABSTRACT In a large spatial database the work mainly deals with the approximate string search. Particularly we examine the query range append with the string similarity search in both Euclidean space and Road network. We dub this query the spatial approximate string (SAS) query. We propose an approximate solution in Euclidean space, the D-tree which embeds min-wise signatures into an R-tree. The min-wise signature for an index node u keeps a clear representation of the union of q-grams from the sub tree of u. We find the pruning functionality of such signatures based on the set resemblance between the query string and the q-grams from the sub tree of index nodes. We discuss about the estimation of selectivity of a SAS query in Euclidean space, for which we present a novel adaptive algorithm to find the balanced partitions using both the spatial and string information stored in the tree. In Road networks, we propose a novel exact method, RSASSOL, which significantly performs the baseline algorithm. The RSASSOL method partitions the road network, adaptively searches relevant sub graphs, and prunes candidate points using both the string matching index and the spatial reference nodes. The efficiency and effectiveness of our approaches is by the extensive experiments on large real data sets. Keywords- Approximate string search, range query, road network and spatial databases I. INTRODUCTION Data mining also known as Knowledge Discovery or Knowledge Discovery in Database (KDD), is the process of extracting or mining the data from large amount of databases. Text mining is one of the applications of data mining which mainly involves the process of extracting interesting information and knowledge from unstructured text documents. Keyword search over a large amount of data is an important operation in a wide range of domains. In spatial databases, wherekeyword search becomes a fundamental building block for an increasing number of real-world applications, and proposed the IR2-Tree. A main limitationof the IR2Tree is that it onlysupports exact keyword search. Approximate string search is necessarywhen users have a fuzzysearchcondition, or a spelling error when submitting the query, or the strings in the database contain some degree of uncertainty or error. In the context of spatial search could be combined with any type of spatial queries. In this work, we focus on range queries and dub such queries as Spatial Approximate String (SAS) queries. We denote SAS queries in Euclidean space as (ESAS) queries. Similarly, it extends SAS queries to road networks (referred as RSAS queries).In ESAS, this motivates the need for string matching. A critical component of record matching involves determining whether two strings are similar or not: Two strings are considered matches if their corresponding (string) attributes are www.ijera.com similar. String similarity is typically measured via a similarity function that, given a pair of strings returns a number between 0 and 1 a higher value indicating a greater degree of similarity with the value 1 corresponding to equality. This function is used to perform a similarity join between two input relations that returns pairs of strings whose similarity is above an input threshold. II. RELATED WORK 1. INCORPORATING FORMAL STRING INDEX TRANSFORMATION IN RECORD MATCHING In this paper, the author considered the problem of record matching for user-defined string transformation as input. The similarity between two strings is defined by transformations coupled with an underlying function of similarity. To lookup an input record against a table of records, where we make this approach with effectiveness by a fuzzy match operation. Here we have an additional table of transformation as input. The cognizant of transformations is nothing but the improvement in the quality of record matching and the efficient retrieval based on our index structure. The characteristic of this scenario is the most input sub-strings do not match with any member of the dictionary, sowe developed a compact filter which efficiently filters out a large number of sub-strings that cannot match with any member of dictionary. For membership, the 552 | P a g e
  • 2. K.Manju et al Int. Journal of Engineering Research and Applications ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.552-556 sub-strings which pass out filter are then verified by checking it. We demonstrate real datasets that approach significantly outperforms both current best exact methods as well as probabilistic methods, that may not identify a small percentage of matching substrings. 2. RETRIEVING TOP-K PRESTIGE-BASED RELEVANT SPATIAL WEB OBJECTS In this paper, the author handles the problem of retrieving web documents which is relevant to query of a keyword within a pre-specified spatial region. There are two stages in query processing. In the First stage, indexing is for the filtration of web document. In the second stage, employed the another index. (e.g.., R-tree). To integrate the R-tree with signature files here a hybrid index structure is proposed. For the purpose of pruning the search space at a query time, both spatial information and text information is utilized by enables the hybrid index structure. However, this proposal is limited by its use of signature files (e.g.., the number of false matches is linear in the collection size and there is no sensible way of using signature files for handling ranking queries). To process a new type of query the combination of R*-tree and bitmap indexing is developed by hybrid index structure is called l-closest keyword query. To enable the efficient processing of the location-aware top-k ranking query, it utilizes both location and text information to prune the search space which integrates the R-tree and inverted files for the IR-tree in hybrid index structure. 3. SELECTIVITY ESTIMATION IN SPATIAL DATABASES In this paper, the author proposed a several new techniques for spatial selectivity estimation. These techniques are based on the spatial indices, binary space partitioning, and the novel notion of spatial skew. In database the critical component of query processing is selectivity estimation. For the spatial selectivity estimation there will be a very little work in providing accurate and efficient techniques, despite the increasing popularity of spatial databases. In this domain the relational techniques do not perform well because the spatial data defers from the relational data. From the previously known techniques,the author can able to show that : (a) Sampling and parametric techniques which work well in the relational one-dimensional world do not work well for spatial data. (b) A BSP based partitioning that we call Min-skew outperforms the other www.ijera.com www.ijera.com techniques over a broad range of query workloads and datasets. III. EXISTING SYSTEM Approximate string search is necessary when the users have a fuzzy condition search or spelling error when submitting the query. By completely ignoring the spatial component of a query, we evaluate only the string predicate by matching the index which is built as a string in both ESAS and RSAS query to produce a direct solution. The string solution which contains a point is not satisfied by the spatial predicate has been pruned in post processing step after all similar strings has been retrieved. 1) The string solution suffers the same scalability and performance issues as the spatial solution. 2) In existing spatial databases additionally we answer for SAS query to enable the efficient processing of standard spatial queries which is a spatial-oriented solution. IV. PROPOSED SYSTEM In our proposed system, we divide a roads network G={V,E} into t edge-disjoint sub graphs G1,G2,……,Gt, where t is a user parameter, and for each sub graph build one string index. From V as reference nodes, we also select a small subset VR of nodes: they are used to prune candidate points/nodes whose distance to query point q are out of query range r. In our RSAS query framework it consists of five steps. 1) We find all the sub graphs which intersect with the query range. 2) To retrieve the points we use the filtration tree of the sub graphs whose string are potentially similar to the query string. 3) We performing the calculation of lower and upper bounds of their distance to the query point, using VR to prune away some of the candidate points. 4) Between the query string and the strings of candidates the exact edit distance is performing to prune away some further candidate points. After this step, the string predicate has been fully explored. 5) We do the checking process of their exact edit distance to the query point for the remaining candidate points to return those with distance within r. 553 | P a g e
  • 3. K.Manju et al Int. Journal of Engineering Research and Applications ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.552-556 A Query Find all sub graphs Filtering Phase Candidate Points Pruning Phase Filter Tree By calculating the lower and upper bounds Retrieve points from strings Query results www.ijera.com Using the exact edit distance Check the exact edit distance of remaining candidate points Figure 1.SYSTEM ARCHITECTURE In RSASSOL algorithm, we find all the sub graphs that intersect with the query range. Here we employee the Dijkstra‟s algorithm, that is starting from the query point q to traverse nodes in G. we analyze that sub graphs for further explorations, whenever this traversal needs the first node of new sub graph. When we reach the boundary of query range automatically the algorithm terminates.To find the points from Gi that may share similar strings to the query strings, examine theeach sub graph Gi, we use the approximate string search over Gi‟s filter tree as the next pruning step. Then using the spatial predicate, we prune the candidate points by computing lower and upper bounds on their distance to q using VR, in a similar way to the ALT algorithm. Given a candidate point p on an age w=(mi,mj), the shortest path from p to a reference node mr must pass through either mi or mj. Network distance d(p,mr)=min(d(p,mi)+d(mi,mr),d(p,mj)+d(mj,.mr)) where www.ijera.com d(mi,mr),d(mj,mr) are available from RDISTi and RDISTj respectively, d(p,mi) is the distance offset of p to mi which is available in the adjacency list and the points file of mi, d(p,mj)=NDIST(mi,mj)-d(p,mi) where NDIST(mi,mj) is available in the adjacency list of mi. We compute d(p,mr) on the fly rather than explicitly storing the distance between a point and a reference node since the number of points much larger than the number nodes in G. given d(p,mr) and d(q,mr) for every mrЄVR, we then obtain the distance lower and upper bounds between p and q using the triangle inequality. Besides the batch verification, we support one-at-a-time-verification, which implement as follows. Verification model consists of two phases: first, the building phase, second, querying phase. In building phase, we create λ( r ) tuples { id, r, hash_sig, wt} for each dictionary string r and each signature generated by r. Hash code of signature is denoted as hash_sig and wt is the weight of the string r. 554 | P a g e
  • 4. K.Manju et al Int. Journal of Engineering Research and Applications ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.552-556 www.ijera.com DatabaseStrings Building phase In memory filter Input String On disk indexed table Indexed StringString Filtering component Exact Verification component Querying phase Figure 2. OVERVIEW OF THE FRAMEWORK RSASSOL ALGORITHM 1. Find the set X of ids from all the sub graphs 2. Set B = ß ,Bc= ß 3. For each sub graph id j є X do 4. Find all points ids in Ki whose associated strings α ‟ may satisfy є(α „ ,α) ≤ r using filter tree 5. for every point Pi є Bc do 6. Calculate b+ (pi , q) and b – (pi ,q) as discussed 7 .if b+ (pi ,q) ≤ r then 8. if є (α i ,α) ≤ r then move pi from Bc to B 9. 10. 11. 12. 13. 14. else delete pi from Bc else if b - (pi ,q) > r then delete pi from Bc for every point pi є Bc do if є(α i ,α) > r then delete pi from Bc Use the MPALT algorithm to find all points p „s in Bc Return B String(id,str) Compare(id1,id2) Signature(id,sign) Output(id1,id2) Figure 3.SS-JOIN IMPLEMENTATION In this section, we perform the String Similarity joins using edit distance, which is one of the most common distance functions for string. SSJoins are closely related to set-containment joins, which has been the main part of several previous works. Generally, similarity joins are closely related to proximity search. The goal is to retrieve thelookup closest object. APPROXIMATE MEMBERSHIP CHECKING Input: Z,ϒ S =<t1 ,t2......> , 1. Build the filter f (Z,ϒ ) 2. Index Z for verification 3. for (Start= 1 to lZ l – L + 1) 4. for (length =1 to L) 5. m ← < t start,tstart+1,tstart +length-1> 6. if(f. prune(m)==true) continue www.ijera.com 7. if(Э r ε Z, S. t Similarity(r, m)≥ delta) 8. Output m V. EXPERIMENTAL SETUP The system is developed using Java and is used in the system development. MYSQL is used as a back end for this system development. Input to this project is the user‟s string which is related to the keyword which already in database. 1) The process of entering the string to collect the exact information about the string. 2) The particular string makes a compare with related strings which is already stored in the database by using the ids. 3) Then the id shortlisted the string which is related to the user string. 555 | P a g e
  • 5. K.Manju et al Int. Journal of Engineering Research and Applications ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.552-556 At last by applying the Dijkstra‟s algorithm user can get the exact information for particular string. The attention is finally paid to extract exact information from the database based on comparison between the strings. 4) the the [2] the the VI. CONCLUSION In this paper, we use the edit distance for the string predicate as the similarity measurement. And also address the problem of query selectivity estimation for the queries in Euclidean space. In the selectivityestimation for the query range on road networks where proposed. Even though, they can only able to estimate the number of loads and edges in the range. Future work includes examines the queries of spatial approximate sub-string, designing methods that are more user friendly and solving the selective estimation problem for RSAS queries. REFERENCES [1] S. Acharya, V. Poosala, and S. Ramaswamy, “Selectivity Estimation in Spatial www.ijera.com [3] [4] [5] [6] www.ijera.com Databases”, Proc. ACM SIGMOD Int‟l Conf Management of Data, 1999. S. Alsubaiee, A. Behm, and C. Li, “Supporting Location-Based ApproximateKeyword Queries”, Proc. SIGSPATIAL 18 th Int‟l Conf. Advances in Geographic Information Systems (GIS), 2010. A. Arasu, S. Chaudhuri, K. Ganjam, and R. Kaushik, “Incorporating String Transformations in Record Matching”, Proc. ACM SIGMOD Int‟l Conf. Management of Data, 2008. A. Arasu, V. Ganti, and R. Kaushik, “Efficient Exact Set-Similarity Joins”, Proc. 32nd Int‟l Conf. Very Large Data Bases (VLDB), 2006. N. Beckmann, H. P. Kriegel, R. Schneider, and B. Seeger, “The R*-Tree: an Efficient and Robust Access Method for points and Rectangles”, Proc. ACM SIGMOD Int‟l Conf. Management of Data, 1990. X. Cao, G. Cong, and C.S. Jensen, “Retrieving Top-k Prestige-Based Relevant Spatial Web Objects”, Proc. VLDB Endowment, vol. 3, 2010. 556 | P a g e