An Adaptive Algorithm for Detection of Duplicate Records Presented By: Rama kanta Behera  IT200127207 Under the guidance of : Miss Ipsita Mishra
INTRODUCTION
OBJECTIVES
PREVALENT METHODS
OUTLINE OF THE PROPOSED SOLUTION
The central idea behind the present algorithm is based on the fundamental property of primality of numbers.
Fig: Hashing. f(x) maps the record set into the integer number space I.
Fig: Extended hashing into prime space. f(x) maps the record set to integer numbers I, and g(x) maps those integers to prime numbers P.
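The two mappings f(x) and g(x) can be sketched as follows. This is only an illustration under stated assumptions: the slides do not say which hash function or which integer-to-prime mapping is used, so the SHA-256 digest reduced modulo a hypothetical `TABLE_SIZE`, and the lookup of the i-th prime in a precomputed table, are choices made here for the example. Reducing the hash to a bounded range means two different records can land on the same integer, whereas the slides assume a unique hash value per distinct record.

```python
import hashlib


def sieve_primes(count):
    """Return the first `count` primes using a sieve of Eratosthenes."""
    if count <= 0:
        return []
    limit = 16
    while True:
        is_prime = [True] * (limit + 1)
        is_prime[0:2] = [False, False]
        for n in range(2, int(limit ** 0.5) + 1):
            if is_prime[n]:
                for m in range(n * n, limit + 1, n):
                    is_prime[m] = False
        primes = [n for n, flag in enumerate(is_prime) if flag]
        if len(primes) >= count:
            return primes[:count]
        limit *= 2  # not enough primes yet, widen the sieve and retry


TABLE_SIZE = 1024                   # assumption: size of the integer space I
PRIMES = sieve_primes(TABLE_SIZE)   # assumption: g(i) is the i-th prime


def f(record: str) -> int:
    """Hash a record into the integer space I (assumed: SHA-256 mod TABLE_SIZE)."""
    digest = hashlib.sha256(record.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % TABLE_SIZE


def g(i: int) -> int:
    """Map an integer in I to its own prime in P."""
    return PRIMES[i]
```

Because g is a one-to-one table lookup, distinct integers always receive distinct primes; any loss of uniqueness in this sketch comes from f alone.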
Fig: The complete algorithm. Records r1, r2, …, rn are mapped by f(x) to integers I1, I2, …, In and then by g(x) to primes P1, P2, …, Pn. The product P1 * P2 * … * Pn is stored as Pprior.
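A small worked example may help show why a single product can stand in for the whole set of records seen so far. The primes below are hypothetical values chosen for illustration, not taken from the slides. Since every integer has exactly one prime factorization, a prime divides the product only if it is one of the factors that went into it.

```python
from math import prod

# Hypothetical primes P1..P4 assigned to four distinct records.
primes = [2, 3, 5, 7]

# Pprior is the product of all primes seen so far: 2 * 3 * 5 * 7 = 210.
p_prior = prod(primes)


def already_seen(p_new: int, p_prior: int) -> bool:
    """A prime is already in the product exactly when it divides it."""
    return p_prior % p_new == 0


print(already_seen(5, p_prior))    # True: 5 is a factor of 210
print(already_seen(11, p_prior))   # False: 11 is not a factor of 210
```

Note that the product grows with every distinct record. Python integers have arbitrary precision, so the sketch keeps working, but a fixed-width integer type would overflow after a handful of primes.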
REALIZATION OF THE ALGORITHM
STEPS OF THE ALGORITHM
Step 1: Each new record is hashed, giving a hash value (Hnew) that is unique to each distinct record.
Step 2: Hnew is mapped to its corresponding unique prime (Pnew).
Step 3: Pprior is divided by Pnew. If Pnew divides Pprior exactly, the record corresponding to Pnew is a duplicate and is already represented in Pprior. Otherwise, it is a distinct record.
Step 4: If the record is distinct, Pprior is multiplied by Pnew and the result is stored back in Pprior. Updating Pprior in this way is what makes the algorithm adaptive.
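The four steps can be sketched as one small class. This is a minimal illustration, not the presenter's implementation: the class name `DuplicateDetector`, the method `check_and_add`, the SHA-256 hash reduced modulo a hypothetical `table_size`, and the choice of the i-th prime as the mapping are all assumptions made for this example. With a bounded table, two different records can hash to the same integer and be reported as duplicates, whereas the slides assume a unique hash per distinct record.

```python
import hashlib


class DuplicateDetector:
    """Sketch of the prime-product duplicate check (names are hypothetical)."""

    def __init__(self, table_size: int = 256):
        self.table_size = table_size          # assumed size of the hash range
        self.primes = self._first_primes(table_size)
        self.p_prior = 1                      # empty product: nothing seen yet

    @staticmethod
    def _first_primes(count: int) -> list:
        """Collect the first `count` primes by trial division."""
        primes = []
        candidate = 2
        while len(primes) < count:
            if all(candidate % p for p in primes if p * p <= candidate):
                primes.append(candidate)
            candidate += 1
        return primes

    def _prime_for(self, record: str) -> int:
        # Step 1: hash the record to an integer Hnew (assumed hash choice).
        digest = hashlib.sha256(record.encode("utf-8")).digest()
        h_new = int.from_bytes(digest, "big") % self.table_size
        # Step 2: map Hnew to its prime Pnew.
        return self.primes[h_new]

    def check_and_add(self, record: str) -> bool:
        """Return True if the record is a duplicate; otherwise remember it."""
        p_new = self._prime_for(record)
        # Step 3: Pnew divides Pprior exactly only if it was multiplied in before.
        if self.p_prior % p_new == 0:
            return True
        # Step 4: fold the new prime into Pprior so later checks see it.
        self.p_prior *= p_new
        return False


detector = DuplicateDetector()
print(detector.check_and_add("alice,30"))   # False: first time this record is seen
print(detector.check_and_add("alice,30"))   # True: its prime now divides Pprior
```

The only state kept between records is the single integer `p_prior`, which is what Step 4 updates.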
Fig: Flowchart
IMPLEMENTATIONS
There are three important implementation details that need to be discussed.
CONCLUSION
THANK YOU !!!
