An adaptive algorithm for detection of duplicate records

An Adaptive Algorithm for Detection of Duplicate Records. Presented by: Rama kanta Behera (IT200127207). Under the guidance of: Miss Ipsita Mishra

INTRODUCTION

OBJECTIVES

PREVALENT METHODS

OUTLINE OF THE PROPOSED SOLUTION The central idea behind the present algorithm is based on the fundamental property of primality of numbers.
Fig: Hashing. f(x) maps the record set into the integer number space (I).
Fig: Extended hashing into prime space. f(x) maps the record set to integer numbers (I), and g(x) maps those integers to prime numbers (P).

Fig: The complete algorithm. Records r1, r2, …, rn are mapped by f(x) to integers I1, I2, …, In, which are mapped by g(x) to primes P1, P2, …, Pn. Their product is Pprior: P1 * P2 * … * Pn = Pprior.
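The mapping in the figures can be sketched in Python as follows. This is only an illustration under stated assumptions: the slides do not specify f(x) or g(x), so here a hypothetical f assigns each distinct record the next unused integer, and a hypothetical g returns the i-th prime by trial division. The helper names `nth_prime` and `index_of` and the sample records are invented for the example.

```python
from math import prod


def nth_prime(n):
    """Return the n-th prime (1-indexed) by trial division; fine for small n."""
    count, candidate = 0, 1
    while count < n:
        candidate += 1
        # candidate is prime if no d in [2, sqrt(candidate)] divides it
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return candidate


def f(record, index_of):
    """Hypothetical f(x): map each distinct record to a distinct integer."""
    return index_of.setdefault(record, len(index_of) + 1)


def g(i):
    """Hypothetical g(x): map integer i to the i-th prime."""
    return nth_prime(i)


records = ["alice", "bob", "carol"]   # sample data, not from the slides
index_of = {}
primes = [g(f(r, index_of)) for r in records]
p_prior = prod(primes)                # P1 * P2 * ... * Pn
```

Because every distinct record contributes a distinct prime, a record's prime divides `p_prior` exactly when that record is already part of the product. The lookup table used for f is only a stand-in that keeps the example collision-free; a real hash function would need the same one-to-one property.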

REALIZATION OF THE ALGORITHM

STEPS OF THE ALGORITHM
Step 1: Each new record is hashed, giving a hash value (Hnew) that is unique for each distinct record.
Step 2: Hnew is mapped to its corresponding unique prime (Pnew).
Step 3: Pprior is divided by Pnew. If Pnew divides Pprior exactly, the record corresponding to Pnew is a duplicate, since its prime is already a factor of Pprior. Otherwise, the record is distinct.
Step 4: If the record is distinct, Pprior is multiplied by Pnew and the result is stored back in Pprior. Updating Pprior in this way is what makes the algorithm adaptive.
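The four steps can be sketched as a small Python class. This is a minimal sketch rather than the presenter's implementation: the class name `PrimeProductDetector`, its methods, and the choice of hash (a hypothetical lookup table that hands out consecutive integers) are assumptions made for illustration, and Pprior is assumed to start at 1 before any record is seen.

```python
class PrimeProductDetector:
    """Illustrative duplicate detector based on a running product of primes."""

    def __init__(self):
        self.p_prior = 1        # assumed initial value: the empty product
        self._index_of = {}     # hypothetical hash table: record -> integer
        self._primes = []       # primes found so far, in increasing order

    def _hash(self, record):
        # Step 1: Hnew, unique per distinct record (assumed hash function).
        return self._index_of.setdefault(record, len(self._index_of) + 1)

    def _prime_for(self, h_new):
        # Step 2: map Hnew to the Hnew-th prime, extending the list on demand.
        candidate = self._primes[-1] if self._primes else 1
        while len(self._primes) < h_new:
            candidate += 1
            # all smaller primes are already in the list, so this test is exact
            if all(candidate % p for p in self._primes):
                self._primes.append(candidate)
        return self._primes[h_new - 1]

    def check_and_add(self, record):
        """Return True if record is a duplicate; otherwise absorb it."""
        p_new = self._prime_for(self._hash(record))
        # Step 3: exact divisibility means the prime is already a factor.
        if self.p_prior % p_new == 0:
            return True
        # Step 4: fold the new prime into Pprior (the adaptive update).
        self.p_prior *= p_new
        return False


detector = PrimeProductDetector()
results = [detector.check_and_add(r) for r in ["a", "b", "a", "c", "b"]]
```

Python integers have arbitrary precision, so `p_prior` does not overflow here, but it gains a factor for every distinct record, which makes each division more expensive as the record set grows.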

IMPLEMENTATIONS There are three important implementation details that need to be discussed.

CONCLUSION
