Dave Moore
david.moore@elastic.co
Haystack: The Search Relevancy Conference (11 April 2018)
Real-Time Entity Resolution
With Elasticsearch
1
Disambiguation
"Named Entity Recognition": single attributes in unstructured text
vs.
"Entity Resolution": multiple attributes in structured data
Person
Field        Value
Name         Alice Jones
DOB          1984-01-01
Street       123 Main St
Credit Card  4040 0000 2020 8080
Phone        202-555-1234
Phone 202-555-1234
2
What is entity resolution?
Health Care
Patient ID
We need to identify
and their medical
many hand-written
Mixing up records puts
at risk of injury or
Sales & Marketing
Customer Intel
We have reps managing many sources of info on leads and customers. Our view of the buyer is fragmented and that makes us less effective. We're losing pipeline.
Security & Compliance
Fraud
We need to track a person or device that is hiding its tracks. Connecting the dots is a laborious process and we can't keep up with our incident backlog.
Military, IC, Law
Surveillance
We need to track a person or device that is hiding its identity. Our timely success is critical to public safety and national security.
Privacy Compliance
GDPR
We must find and manage all PII to respond to inquiries. Failure to comply risks fines of €20 million or 4% annual turnover.
IT
MDM
MDM is a slow and bureaucratic process. We can solve our own data quality problems faster and better. And we still need query time entity resolution.
3
Examples
4
Why is identity hard to track?
Ali Jones
123 W Main Street
ABC Wigdets
4040 0000 2020 8008
+1 (202) 555 1234
5
1. Identity is Vague
Allie Jones
123 Main St
ABC Widgets, Inc.
4040 0000 2020 8080
202-555-1234
Icons by icons8
Ali Jones
123 W Main Street
ABC Wigdets
4040 0000 2020 8008
+1 (202) 555 1234
Alison Jones-Smith
555 Brooad Street
XYZ Tech
3030 5500 9999 0000
2025559867
6
2. Identity Changes
Allie Jones
123 Main St
ABC Widgets, Inc.
4040 0000 2020 8080
202-555-1234
Allison Smith
555 Broad St
XYZ Technology Corp.
3030 5050 9999 0000
202-555-9876
Ali Jones
123 W Main Street
ABC Wigdets
4040 0000 2020 8008
+1 (202) 555 1234
Alison Jones-Smith
555 Brooad Street
XYZ Tech
3030 5500 9999 0000
2025559867
7
3. Identity is Messy
Allie Jones
123 Main St
ABC Widgets, Inc.
4040 0000 2020 8080
202-555-1234
Allison Smith
555 Broad St
XYZ Technology Corp.
3030 5050 9999 0000
202-555-9876
8
4. Identity is Diverse
Ali Jones
123 W Main Street
ABC Wigdets
4040 0000 2020 8008
+1 (202) 555 1234
Alison Jones-Smith
555 Brooad Street
XYZ Tech
3030 5500 9999 0000
2025559867
Allie Jones
123 Main St
ABC Widgets, Inc.
4040 0000 2020 8080
202-555-1234
Allison Smith
555 Broad St
XYZ Technology Corp.
3030 5050 9999 0000
202-555-9876
???
???
???
???
9
Entity Resolution
connects the dots despite these challenges
Allie Jones 123 Main St ABC Widgets, Inc. 4040 0000 2020 8080 202-555-1234
Allie Jones 123 Main Street ABC Widgets 4040 0000 2020 8080 202.555.1234
Ali Jones 123 W Main Street ABC Wigdets 4040 0000 2020 8008 +1 (202) 555 1234
Allie Jones 132 W Main Street ABC Widgets 4040 0000 2020 8080 202 555 1234
Allie Smith 123 Main St ABC Widgets, Inc. 4040 0000 2020 8080 202-555-1234
Allie Smith 123 Main Street ABC Widgets 4040 0000 2020 8080 202.555.1234
Ali Smith 123 W Main Street ABC Wigdets 4040 0000 2020 8008 +1 (202) 555 1234
Allie Smith 555 Broad St ABC Widgets, Inc 4040 0000 2020 8080 202-555-1234
Allie Smith 555 Broad Street XYZ Tech Corp 3030 5050 9999 0000 202.555.1234
Allie Smith 555 Broad Street XYZ Technology Corp 3030 5050 9999 0000 202-555-9876
10
Comparison to Search
Search Resolution
name:"Allie Jones" AND street:"123 Main St" name:"Allie Jones" AND street:"123 Main St"
Allie Jones 123 Main St ABC Widgets, Inc. 4040 0000 2020 8080 202-555-1234
Allie Jones 123 Main Street ABC Widgets 4040 0000 2020 8080 202.555.1234
Ali Jones 123 W Main Street ABC Wigdets 4040 0000 2020 8008 +1 (202) 555 1234
Ali Jones 132 Mane Street ABC Widgets 4024 0071 4970 1227 888-555-5555
Aly Jonas 113 Main Street Acme Corp. 4716 1035 4536 4671 610-555-5555
Allie Jones 132 W Main Street ABC Widgets 4040 0000 2020 8080 202-555-9876
Al Jones 132 E Main St Mom & Pop, LLC 3772 733741 52501 1-610-555-0000
Aly Jones 113 Main St, #102 Acme Corp. 4716 1035 4536 4671 610-555-5555
Ali Jones 132 Mane Street ABC Widgets 4024 0071 4970 1227 888-555-1234
Aly Jonas 113 Main Street Acme Corp. 4781 9105 0533 4481 610-555-2345
Allie Johns 132 W Main Street ABC Widgets 4088 0110 2044 8180 202-555-3456
Elle Jeon 132 E Main St Mom & Pop, LLC 3502 730741 52203 1-610-555-4567
Elle Jones 113 Main St, #102 Acme Corp. 4716 1035 4536 4671 610-555-5678
Eli Jones 132 Mane Street ABC Widgets 4224 0065 4800 1337 888-555-6789
Eli Joans 113 Main Street Acme Corp. 4206 1035 4536 4081 610-555-7890
Allie Jeans 132 N Mean Street ABC Widgets 4240 0101 02020 8888 202-555-8901
Search: The search engine ranks results once. True hits are mixed with noise.
Resolution: The search engine filters results recursively. True hits are isolated and transitively linked.
11
Real-Time
12
Batch vs. Real-Time
Batch
How is it used? Resolve all entities in advance (partitioning, pairwise scoring, connected components)
How long does it take? Docs + (Docs/Partitions)^2 + Components^2 (hours for billions of documents)
When is it necessary? Population or network analysis
Most solutions have a real-time phase, sometimes applied after batch resolution.
Real-Time
How is it used? Resolve one entity on query (recursive Boolean query)
How long does it take? Indices * Attributes * Hops (milliseconds for a handful of each)
When is it necessary? Individual analysis
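To make the two cost expressions concrete, here is a rough sketch that plugs numbers into them, reading the trailing 2s in the batch expression as exponents. The workload sizes are illustrative assumptions, not figures from the talk, and the results are unitless counts of work rather than timings.

```python
def batch_cost(docs: int, partitions: int, components: int) -> int:
    """Docs + (Docs/Partitions)^2 + Components^2, as written on the slide."""
    return docs + (docs // partitions) ** 2 + components ** 2


def realtime_cost(indices: int, attributes: int, hops: int) -> int:
    """Indices * Attributes * Hops, as written on the slide."""
    return indices * attributes * hops


# Hypothetical workload sizes, chosen only for illustration.
batch = batch_cost(docs=1_000_000_000, partitions=10_000, components=1_000)
realtime = realtime_cost(indices=3, attributes=5, hops=4)

print(f"batch:     {batch:,} units of work")
print(f"real-time: {realtime:,} units of work")
```

The gap of many orders of magnitude is the point: batch resolution pays for every document up front, while a query-time resolution only pays for the handful of indices, attributes, and hops that one entity touches.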
Robust matching
• Token normalization
• Phonetic matching
• Fuzzy transpositions
• Boolean logic filtering
• Fine-tune search parameters
13
Real-Time
Why Elasticsearch
Suited for operations
• Horizontal scaling
• Real-time response rates
• Flexible index mappings
14
Approach
• Fast – Get results in real-time. From milliseconds to low seconds.
• Generic – Resolve any type of entity. People, companies, locations, sessions, etc.
• Transitive – Resolve over multiple hops of matches. Capture changing identities.
• Multi-source – Resolve over multiple indices with disparate mappings.
• Accommodating – Operate on data as it exists. Avoid transforming and reindexing data.
• Logical – Logic is easier to read, troubleshoot, and optimize than statistics.
• 100% Elasticsearch – Operate within existing search infrastructure.
Goals
15
Approach
1. Entity modeling – What is the entity? What are its attributes?
2. Analyzers – How are you indexing each attribute?
3. Matchers – What is the query logic for each attribute?
4. Resolvers – What combinations of matching attributes imply a resolution?
5. Metadata maps – Which matchers apply to which indexed fields?
6. Recursive queries – How to repeat the queries until completion?
Steps
An open source Elasticsearch plugin
for real-time entity resolution
16
zentity.io
17
POST _zentity/resolution/person
{
  "attributes": {
    "name": "Alice Jones",
    "dob": "1984-01-01",
    "phone": [ "555-123-4567", "555-987-6543" ]
  }
}
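As a minimal sketch, the request above could be built from Python with the standard library alone. The endpoint path and body shape are copied from the slide; the host and port (`localhost:9200`) and the helper name `build_resolution_request` are assumptions, and the plugin's current API may differ from what was shown in 2018.

```python
import json
import urllib.request

# Assumption: Elasticsearch with the zentity plugin is reachable here.
ES_URL = "http://localhost:9200"


def build_resolution_request(entity_type: str, attributes: dict) -> urllib.request.Request:
    """Build (but do not send) a resolution request for one entity."""
    body = json.dumps({"attributes": attributes}).encode("utf-8")
    return urllib.request.Request(
        url=f"{ES_URL}/_zentity/resolution/{entity_type}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_resolution_request(
    "person",
    {
        "name": "Alice Jones",
        "dob": "1984-01-01",
        "phone": ["555-123-4567", "555-987-6543"],
    },
)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```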
zentity.io
18
Demos
19
Demos
Customer intelligence
Gather everything we know about a customer.
Web traffic sessionization
Track a bot that cycles through IP addresses, cookies, and user agent signatures.
Fraud detection
Determine if a health care provider was blacklisted under a different name.
Dave Moore
email: david.moore@elastic.co
zentity: zentity.io
Contact
@elastic
www.elastic.co
Extra Content
22
Approach
23
Step 1. Entity Modeling
Name the entity type.
Person
Define its attributes. Study them in your data sets.
Attribute                Uniqueness  Consistency  Presence
Name – First Name        Moderate    Moderate     Extreme
Name – Last Name         Moderate    Moderate     High
Address – Street         High        Low          Moderate
Address – City           Low         Moderate     Moderate
Address – Province       Low         High         High
Address – Postal Code    Low         High         High
Address – Country        Low         High         High
Date of Birth            Moderate    High         Moderate
Phone Number             Moderate    Moderate     Moderate
Email Address            High        Extreme      Moderate
IP Address               High        Extreme      Low
Credit Card Number       Extreme     Extreme      Low
Social Security Number   Extreme     High         None
24
This model is independent from your indices.
You can reuse and extend this model as you add or amend indices.
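One way to picture a model that is independent from the indices is a small structure that holds only the entity type and attribute names, with no index or field names in it. This is a sketch under that assumption; the class name `EntityModel` and the snake_case attribute names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EntityModel:
    """An entity type and its attribute names; no index or field names."""
    entity_type: str
    attributes: List[str] = field(default_factory=list)

    def extend(self, *names: str) -> None:
        """Add attributes that are not already in the model."""
        for name in names:
            if name not in self.attributes:
                self.attributes.append(name)


person = EntityModel("person", ["first_name", "last_name", "dob", "phone", "email"])

# A new data source exposes IP addresses: the model grows, nothing is reindexed.
person.extend("ip_address", "email")  # "email" is already present and is skipped
print(person.attributes)
```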
Name – First Name
Name – Last Name
Address – Street
Address – City
Address – Province
Address – Postal Code
Address – Country
Date of Birth
Phone Number
Email Address
IP Address
Credit Card Number
Social Security Number
Phonetic
"Alice Jones" => ["ALAC","JAN"]
Standard
"Alice Jones" => ["ALICE","JONES"]
25
Step 2. Analyzers
Take the attributes. Define their analyzers. Put them in your index mappings.
{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "phonetic": {
            "type": "phonetic",
            "encoder": "nysiis"
          }
        },
        "analyzer": {
          "phonetic": {
            "filter": [
              "icu_normalizer",
              "icu_folding",
              "phonetic"
            ],
            "tokenizer": "standard"
          }
        }
      }
    }
  }
}
{
  "mappings": {
    "_doc": {
      "properties": {
        "first_name": {
          "type": "text",
          "fields": {
            "phonetic": {
              "type": "text",
              "analyzer": "phonetic"
            }
          }
        }
      }
    }
  }
}
Person
Analyzers are powerful. But they must be defined prior to indexing.
Give careful thought to your analyzers to avoid having to reindex data.
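Because the analyzers have to be in place before indexing, it can help to generate the index body for every attribute in one go. The sketch below reproduces the settings and mapping from the slide for a list of field names; the function name `phonetic_index_body` is hypothetical, and the `_doc` level follows the slide's mapping, which later Elasticsearch versions no longer use.

```python
import json


def phonetic_index_body(fields: list) -> dict:
    """Return index settings and mappings with a phonetic sub-field per field."""
    settings = {
        "index": {
            "analysis": {
                "filter": {
                    "phonetic": {"type": "phonetic", "encoder": "nysiis"}
                },
                "analyzer": {
                    "phonetic": {
                        "filter": ["icu_normalizer", "icu_folding", "phonetic"],
                        "tokenizer": "standard",
                    }
                },
            }
        }
    }
    # Each field keeps its standard analysis and gains a ".phonetic" sub-field.
    properties = {
        name: {
            "type": "text",
            "fields": {"phonetic": {"type": "text", "analyzer": "phonetic"}},
        }
        for name in fields
    }
    return {"settings": settings, "mappings": {"_doc": {"properties": properties}}}


body = phonetic_index_body(["first_name", "last_name"])
print(json.dumps(body["mappings"], indent=2))
```

The `phonetic` token filter comes from the analysis-phonetic plugin, and `icu_normalizer` and `icu_folding` come from the analysis-icu plugin, so both plugins need to be installed before an index with these settings can be created.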
Phonetic
{
  "match": {
    "{{ field }}": {
      "query": "{{ value }}",
      "fuzziness": 0
    }
  }
}
Standard
{
  "match": {
    "{{ field }}": {
      "query": "{{ value }}",
      "fuzziness": 2
    }
  }
}
27
Step 3. Matchers
Take the attributes.
Name – First Name
Name – Last Name
Address – Street
Address – City
Address – Province
Address – Postal Code
Address – Country
Date of Birth
Phone Number
Email Address
IP Address
Credit Card Number
Social Security Number
Define their Boolean query logic. Use templates for variables.
Person
{{ field }} – The field of an index.
{{ value }} – The value of an attribute.
We will replace these at query time.
Understand that each matcher will be combined
into one large Boolean query.
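A minimal sketch of that substitution and combination step follows. The two templates are the ones from the slide; the helper names `render` and `combine`, the example field names, and the choice of a `bool` query with a `filter` clause are assumptions about how the pieces might be joined, not the plugin's actual implementation.

```python
import json

MATCHERS = {
    "phonetic": {"match": {"{{ field }}": {"query": "{{ value }}", "fuzziness": 0}}},
    "standard": {"match": {"{{ field }}": {"query": "{{ value }}", "fuzziness": 2}}},
}


def render(template, field: str, value: str):
    """Recursively substitute the two placeholders in keys and string values."""
    if isinstance(template, dict):
        return {
            render(key, field, value): render(item, field, value)
            for key, item in template.items()
        }
    if isinstance(template, str):
        return template.replace("{{ field }}", field).replace("{{ value }}", value)
    return template  # numbers such as fuzziness pass through unchanged


def combine(clauses: list) -> dict:
    """Combine rendered matcher clauses so that every clause must match."""
    return {"bool": {"filter": clauses}}


query = combine([
    render(MATCHERS["phonetic"], "first_name.phonetic", "Allie"),
    render(MATCHERS["standard"], "last_name", "Jones"),
])
print(json.dumps(query, indent=2))
```

Substituting into the parsed structure rather than into a JSON string avoids escaping problems when a value contains quotes.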
29
Step 4. Resolvers
Take the attributes.
Name – First Name
Name – Last Name
Address – Street
Address – City
Address – Province
Address – Postal Code
Address – Country
Date of Birth
Phone Number
Email Address
IP Address
Credit Card Number
Social Security Number
Determine which combinations of matching attributes imply a resolution.
[ Name – First, Name – Last, Address – Street, Address – City, Address – Province ]
[ Name – First, Name – Last, Address – Street, Address – Postal Code ]
[ Name – First, Name – Last, Date of Birth, Address – City, Address – Province ]
[ Name – First, Name – Last, Date of Birth, Address – Postal Code ]
[ Name – First, Name – Last, Phone Number ]
[ Name – First, Name – Last, Email Address ]
[ Name – First, Name – Last, IP Address ]
[ Name – First, Name – Last, Credit Card Number ]
[ Name – First, Name – Last, Social Security Number ]
[ Email Address, Phone Number ]
[ Email Address, IP Address ]
[ Email Address, Credit Card Number ]
[ IP Address, Credit Card Number ]
Person
Avoid resolving on a single attribute such as Social Security Number.
Corroboration among multiple attributes helps prevent snowballs.
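The resolver check itself can be sketched as a subset test: a document resolves only if the attributes it matched on cover at least one full resolver. The attribute names and the `resolves` helper below are hypothetical, and only a few of the resolvers from the slide are included.

```python
# Each resolver is a set of attributes that must all match together.
RESOLVERS = [
    {"first_name", "last_name", "phone"},
    {"first_name", "last_name", "email"},
    {"email", "phone"},
    {"email", "ip_address"},
]


def resolves(matched: set) -> bool:
    """True if the matched attributes cover every attribute of some resolver."""
    return any(resolver <= matched for resolver in RESOLVERS)


print(resolves({"first_name", "last_name", "phone"}))  # True
print(resolves({"ssn"}))                               # False: a single attribute
print(resolves({"first_name", "email"}))               # False: no resolver is covered
```

Because no resolver here consists of a single attribute, one shared value alone never links two documents, which is the corroboration the slide asks for.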
31
Step 5. Metadata Maps
Take the attributes.
Name – First Name
Name – Last Name
Address – Street
Address – City
Address – Province
Address – Postal Code
Address – Country
Date of Birth
Phone Number
Email Address
IP Address
Credit Card Number
Social Security Number
Map them to the fields of the relevant indices.
users.first_name
users.last_name
users.phone
users.email
customers.fname
customers.lname
customers.tel
customers.email
customers.cc
customers.zip
Person
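A metadata map of this kind can be sketched as a nested dictionary from index to attribute to field. The index and field names are the ones listed above; the attribute keys and the `fields_for` helper are hypothetical.

```python
# index name -> {attribute name -> field name in that index}
METADATA_MAP = {
    "users": {
        "first_name": "first_name",
        "last_name": "last_name",
        "phone": "phone",
        "email": "email",
    },
    "customers": {
        "first_name": "fname",
        "last_name": "lname",
        "phone": "tel",
        "email": "email",
        "credit_card": "cc",
        "postal_code": "zip",
    },
}


def fields_for(attribute: str) -> dict:
    """Return the field that holds an attribute in each index that has it."""
    return {
        index: fields[attribute]
        for index, fields in METADATA_MAP.items()
        if attribute in fields
    }


print(fields_for("phone"))        # {'users': 'phone', 'customers': 'tel'}
print(fields_for("credit_card"))  # {'customers': 'cc'}
```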
32
Step 6. Recursive Queries
With each query, new inputs might be found in different attributes.
Use the metadata map and your resolvers to determine if you can
create new queries for the new inputs.
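The loop can be sketched in memory, with exact matching on a short list of records standing in for the Elasticsearch queries. The records borrow names, phone numbers, and card numbers from the earlier slides; the attribute names, the three resolvers, and the `resolve` function are assumptions made for illustration.

```python
RECORDS = [
    {"id": 1, "name": "Allie Jones", "phone": "202-555-1234", "card": "4040 0000 2020 8080"},
    {"id": 2, "name": "Allie Smith", "phone": "202-555-1234", "card": "4040 0000 2020 8080"},
    {"id": 3, "name": "Allie Smith", "phone": "202-555-1234", "card": "3030 5050 9999 0000"},
    {"id": 4, "name": "Allie Smith", "phone": "202-555-9876", "card": "3030 5050 9999 0000"},
    {"id": 5, "name": "Aly Jonas", "phone": "610-555-5555", "card": "4716 1035 4536 4671"},
]
RESOLVERS = [{"name", "phone"}, {"name", "card"}, {"phone", "card"}]


def resolve(inputs: dict, records: list, resolvers: list, max_hops: int = 10) -> list:
    """Repeat the match until a hop finds nothing new; return ids in order found."""
    known = {attr: set(values) for attr, values in inputs.items()}
    resolved = []
    for _ in range(max_hops):
        hits = []
        for record in records:
            if record["id"] in resolved:
                continue
            # Attributes of this record whose value is already known.
            matched = {
                attr for attr, value in record.items()
                if attr != "id" and value in known.get(attr, set())
            }
            if any(resolver <= matched for resolver in resolvers):
                hits.append(record)
        if not hits:
            break  # no new documents, so no new inputs: resolution is complete
        for record in hits:
            resolved.append(record["id"])
            for attr, value in record.items():
                if attr != "id":
                    known.setdefault(attr, set()).add(value)  # new inputs for the next hop
    return resolved


found = resolve({"name": ["Allie Jones"], "phone": ["202-555-1234"]}, RECORDS, RESOLVERS)
print(found)  # [1, 2, 3, 4]
```

Each hop finds one more record here: the card number from record 1 links record 2, whose new name links record 3, whose new card links record 4. Record 5 shares no corroborated pair of attributes and is never reached.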

More Related Content

What's hot

Building Data Quality pipelines with Apache Spark and Delta Lake
Building Data Quality pipelines with Apache Spark and Delta LakeBuilding Data Quality pipelines with Apache Spark and Delta Lake
Building Data Quality pipelines with Apache Spark and Delta LakeDatabricks
 
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Denodo
 
AWS에서 빅데이터 프로젝트 시작하기 - 이종화 솔루션즈 아키텍트, AWS
AWS에서 빅데이터 프로젝트 시작하기 - 이종화 솔루션즈 아키텍트, AWSAWS에서 빅데이터 프로젝트 시작하기 - 이종화 솔루션즈 아키텍트, AWS
AWS에서 빅데이터 프로젝트 시작하기 - 이종화 솔루션즈 아키텍트, AWSAmazon Web Services Korea
 
May 2023 CIAOPS Need to Know Webinar
May 2023 CIAOPS Need to Know WebinarMay 2023 CIAOPS Need to Know Webinar
May 2023 CIAOPS Need to Know WebinarRobert Crane
 
Building Data Quality Audit Framework using Delta Lake at Cerner
Building Data Quality Audit Framework using Delta Lake at CernerBuilding Data Quality Audit Framework using Delta Lake at Cerner
Building Data Quality Audit Framework using Delta Lake at CernerDatabricks
 
Microsoft Azure Storage Basics
Microsoft Azure Storage BasicsMicrosoft Azure Storage Basics
Microsoft Azure Storage BasicsSai Kishore Naidu
 
Data saturday Oslo Azure Purview Erwin de Kreuk
Data saturday Oslo Azure Purview Erwin de KreukData saturday Oslo Azure Purview Erwin de Kreuk
Data saturday Oslo Azure Purview Erwin de KreukErwin de Kreuk
 
Microsoft Information Protection demystified Albert Hoitingh
Microsoft Information Protection demystified Albert HoitinghMicrosoft Information Protection demystified Albert Hoitingh
Microsoft Information Protection demystified Albert HoitinghAlbert Hoitingh
 
오픈소스 WAS를 위한 클러스터 솔루션 - OPENMARU Cluster
오픈소스 WAS를 위한 클러스터 솔루션 - OPENMARU Cluster오픈소스 WAS를 위한 클러스터 솔루션 - OPENMARU Cluster
오픈소스 WAS를 위한 클러스터 솔루션 - OPENMARU ClusterOpennaru, inc.
 
Complete open source IAM solution
Complete open source IAM solutionComplete open source IAM solution
Complete open source IAM solutionRadovan Semancik
 
Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....
Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....
Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....Kai Wähner
 
SC-900 Capabilities of Microsoft Security Solutions
SC-900 Capabilities of Microsoft Security SolutionsSC-900 Capabilities of Microsoft Security Solutions
SC-900 Capabilities of Microsoft Security SolutionsFredBrandonAuthorMCP
 
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Hortonworks
 
ZERO TRUST ARCHITECTURE - DIGITAL TRUST FRAMEWORK
ZERO TRUST ARCHITECTURE - DIGITAL TRUST FRAMEWORKZERO TRUST ARCHITECTURE - DIGITAL TRUST FRAMEWORK
ZERO TRUST ARCHITECTURE - DIGITAL TRUST FRAMEWORKMaganathin Veeraragaloo
 
Microsoft Information Protection: Your Security and Compliance Framework
Microsoft Information Protection: Your Security and Compliance FrameworkMicrosoft Information Protection: Your Security and Compliance Framework
Microsoft Information Protection: Your Security and Compliance FrameworkAlistair Pugin
 
DBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsDBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsScyllaDB
 
Microsoft 365 and Microsoft Cloud App Security
Microsoft 365 and Microsoft Cloud App SecurityMicrosoft 365 and Microsoft Cloud App Security
Microsoft 365 and Microsoft Cloud App SecurityAlbert Hoitingh
 
The Elastic Stack as a SIEM
The Elastic Stack as a SIEMThe Elastic Stack as a SIEM
The Elastic Stack as a SIEMJohn Hubbard
 
Machine Learning and AI
Machine Learning and AIMachine Learning and AI
Machine Learning and AIJames Serra
 

What's hot (20)

Azure IoT Hub
Azure IoT HubAzure IoT Hub
Azure IoT Hub
 
Building Data Quality pipelines with Apache Spark and Delta Lake
Building Data Quality pipelines with Apache Spark and Delta LakeBuilding Data Quality pipelines with Apache Spark and Delta Lake
Building Data Quality pipelines with Apache Spark and Delta Lake
 
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)
 
AWS에서 빅데이터 프로젝트 시작하기 - 이종화 솔루션즈 아키텍트, AWS
AWS에서 빅데이터 프로젝트 시작하기 - 이종화 솔루션즈 아키텍트, AWSAWS에서 빅데이터 프로젝트 시작하기 - 이종화 솔루션즈 아키텍트, AWS
AWS에서 빅데이터 프로젝트 시작하기 - 이종화 솔루션즈 아키텍트, AWS
 
May 2023 CIAOPS Need to Know Webinar
May 2023 CIAOPS Need to Know WebinarMay 2023 CIAOPS Need to Know Webinar
May 2023 CIAOPS Need to Know Webinar
 
Building Data Quality Audit Framework using Delta Lake at Cerner
Building Data Quality Audit Framework using Delta Lake at CernerBuilding Data Quality Audit Framework using Delta Lake at Cerner
Building Data Quality Audit Framework using Delta Lake at Cerner
 
Microsoft Azure Storage Basics
Microsoft Azure Storage BasicsMicrosoft Azure Storage Basics
Microsoft Azure Storage Basics
 
Data saturday Oslo Azure Purview Erwin de Kreuk
Data saturday Oslo Azure Purview Erwin de KreukData saturday Oslo Azure Purview Erwin de Kreuk
Data saturday Oslo Azure Purview Erwin de Kreuk
 
Microsoft Information Protection demystified Albert Hoitingh
Microsoft Information Protection demystified Albert HoitinghMicrosoft Information Protection demystified Albert Hoitingh
Microsoft Information Protection demystified Albert Hoitingh
 
오픈소스 WAS를 위한 클러스터 솔루션 - OPENMARU Cluster
오픈소스 WAS를 위한 클러스터 솔루션 - OPENMARU Cluster오픈소스 WAS를 위한 클러스터 솔루션 - OPENMARU Cluster
오픈소스 WAS를 위한 클러스터 솔루션 - OPENMARU Cluster
 
Complete open source IAM solution
Complete open source IAM solutionComplete open source IAM solution
Complete open source IAM solution
 
Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....
Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....
Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....
 
SC-900 Capabilities of Microsoft Security Solutions
SC-900 Capabilities of Microsoft Security SolutionsSC-900 Capabilities of Microsoft Security Solutions
SC-900 Capabilities of Microsoft Security Solutions
 
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
 
ZERO TRUST ARCHITECTURE - DIGITAL TRUST FRAMEWORK
ZERO TRUST ARCHITECTURE - DIGITAL TRUST FRAMEWORKZERO TRUST ARCHITECTURE - DIGITAL TRUST FRAMEWORK
ZERO TRUST ARCHITECTURE - DIGITAL TRUST FRAMEWORK
 
Microsoft Information Protection: Your Security and Compliance Framework
Microsoft Information Protection: Your Security and Compliance FrameworkMicrosoft Information Protection: Your Security and Compliance Framework
Microsoft Information Protection: Your Security and Compliance Framework
 
DBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsDBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & Tradeoffs
 
Microsoft 365 and Microsoft Cloud App Security
Microsoft 365 and Microsoft Cloud App SecurityMicrosoft 365 and Microsoft Cloud App Security
Microsoft 365 and Microsoft Cloud App Security
 
The Elastic Stack as a SIEM
The Elastic Stack as a SIEMThe Elastic Stack as a SIEM
The Elastic Stack as a SIEM
 
Machine Learning and AI
Machine Learning and AIMachine Learning and AI
Machine Learning and AI
 

Similar to Real-Time Entity Resolution with Elasticsearch - Haystack 2018

In:Confidence 2019 - Balancing the conflicting objectives of data access and ...
In:Confidence 2019 - Balancing the conflicting objectives of data access and ...In:Confidence 2019 - Balancing the conflicting objectives of data access and ...
In:Confidence 2019 - Balancing the conflicting objectives of data access and ...Privitar
 
Privacy solutions decode2021_jon_oliver
Privacy solutions decode2021_jon_oliverPrivacy solutions decode2021_jon_oliver
Privacy solutions decode2021_jon_oliverJonathanOliver26
 
Global AI Bootcamp Singapore - Keynote
Global AI Bootcamp Singapore - KeynoteGlobal AI Bootcamp Singapore - Keynote
Global AI Bootcamp Singapore - KeynoteAlex Smith
 
How We Did It: The Case of the Credit Card Breach
How We Did It: The Case of the Credit Card BreachHow We Did It: The Case of the Credit Card Breach
How We Did It: The Case of the Credit Card BreachTeradata
 
Introduction of Artificial Intelligence
Introduction of Artificial IntelligenceIntroduction of Artificial Intelligence
Introduction of Artificial IntelligenceAkhileshwar Nirala
 
Mastering Location Data – a new paradigm in network analytics
Mastering Location Data – a new paradigm in network analyticsMastering Location Data – a new paradigm in network analytics
Mastering Location Data – a new paradigm in network analyticsPrecisely
 
Trusting AI with important decisions
Trusting AI with important decisionsTrusting AI with important decisions
Trusting AI with important decisionsLouis Dorard
 
The Domains of Identity & Self-Sovereign Identity MyData 2018
The Domains of Identity & Self-Sovereign Identity MyData 2018The Domains of Identity & Self-Sovereign Identity MyData 2018
The Domains of Identity & Self-Sovereign Identity MyData 2018Kaliya "Identity Woman" Young
 
All Clearances or Cyber Virtual Job Fair Handbook June 3, 2020, Huntsville
All Clearances or Cyber Virtual Job Fair Handbook June 3, 2020, HuntsvilleAll Clearances or Cyber Virtual Job Fair Handbook June 3, 2020, Huntsville
All Clearances or Cyber Virtual Job Fair Handbook June 3, 2020, HuntsvilleClearedJobs.Net
 
Huntsville All Clearances or Cyber Virtual Job Fair Handbook June 3
Huntsville All Clearances or Cyber Virtual Job Fair Handbook June 3Huntsville All Clearances or Cyber Virtual Job Fair Handbook June 3
Huntsville All Clearances or Cyber Virtual Job Fair Handbook June 3ClearedJobs.Net
 
What You Need to Know About Robotic Process Automation: How It Works & Real-W...
What You Need to Know About Robotic Process Automation: How It Works & Real-W...What You Need to Know About Robotic Process Automation: How It Works & Real-W...
What You Need to Know About Robotic Process Automation: How It Works & Real-W...Captricity
 
Database Design Disasters
Database Design DisastersDatabase Design Disasters
Database Design DisastersRichie Rump
 
Cybersecurity for Marketing
Cybersecurity for Marketing Cybersecurity for Marketing
Cybersecurity for Marketing Alert Logic
 
Self-Sovereign Identity: Lightening Talk at RightsCon
Self-Sovereign Identity: Lightening Talk at RightsCon Self-Sovereign Identity: Lightening Talk at RightsCon
Self-Sovereign Identity: Lightening Talk at RightsCon Kaliya "Identity Woman" Young
 
Curated Proof Markets & Token-Curated Identities in Ocean Protocol
Curated Proof Markets & Token-Curated Identities in Ocean ProtocolCurated Proof Markets & Token-Curated Identities in Ocean Protocol
Curated Proof Markets & Token-Curated Identities in Ocean ProtocolTrent McConaghy
 
TECHNOLOGY: Solution to our woos not Politicians & INTERNET of THINGS in Nuts...
TECHNOLOGY: Solution to our woos not Politicians & INTERNET of THINGS in Nuts...TECHNOLOGY: Solution to our woos not Politicians & INTERNET of THINGS in Nuts...
TECHNOLOGY: Solution to our woos not Politicians & INTERNET of THINGS in Nuts...Ravi Chandra
 
Snowflake Data Governance
Snowflake Data GovernanceSnowflake Data Governance
Snowflake Data Governancessuser538b022
 
Synthetic Data for Big Data Privacy
Synthetic Data for Big Data PrivacySynthetic Data for Big Data Privacy
Synthetic Data for Big Data PrivacyMOSTLY AI
 
How the US Military does Risk Management is a little different wha.docx
How the US Military does Risk Management is a little different wha.docxHow the US Military does Risk Management is a little different wha.docx
How the US Military does Risk Management is a little different wha.docxwellesleyterresa
 
Internet of Things Primer
Internet of Things PrimerInternet of Things Primer
Internet of Things PrimerStephen Bates
 

Similar to Real-Time Entity Resolution with Elasticsearch - Haystack 2018 (20)

In:Confidence 2019 - Balancing the conflicting objectives of data access and ...
In:Confidence 2019 - Balancing the conflicting objectives of data access and ...In:Confidence 2019 - Balancing the conflicting objectives of data access and ...
In:Confidence 2019 - Balancing the conflicting objectives of data access and ...
 
Privacy solutions decode2021_jon_oliver
Privacy solutions decode2021_jon_oliverPrivacy solutions decode2021_jon_oliver
Privacy solutions decode2021_jon_oliver
 
Global AI Bootcamp Singapore - Keynote
Global AI Bootcamp Singapore - KeynoteGlobal AI Bootcamp Singapore - Keynote
Global AI Bootcamp Singapore - Keynote
 
How We Did It: The Case of the Credit Card Breach
How We Did It: The Case of the Credit Card BreachHow We Did It: The Case of the Credit Card Breach
How We Did It: The Case of the Credit Card Breach
 
Introduction of Artificial Intelligence
Introduction of Artificial IntelligenceIntroduction of Artificial Intelligence
Introduction of Artificial Intelligence
 
Mastering Location Data – a new paradigm in network analytics
Mastering Location Data – a new paradigm in network analyticsMastering Location Data – a new paradigm in network analytics
Mastering Location Data – a new paradigm in network analytics
 
Trusting AI with important decisions
Trusting AI with important decisionsTrusting AI with important decisions
Trusting AI with important decisions
 
The Domains of Identity & Self-Sovereign Identity MyData 2018
The Domains of Identity & Self-Sovereign Identity MyData 2018The Domains of Identity & Self-Sovereign Identity MyData 2018
The Domains of Identity & Self-Sovereign Identity MyData 2018
 
All Clearances or Cyber Virtual Job Fair Handbook June 3, 2020, Huntsville
All Clearances or Cyber Virtual Job Fair Handbook June 3, 2020, HuntsvilleAll Clearances or Cyber Virtual Job Fair Handbook June 3, 2020, Huntsville
All Clearances or Cyber Virtual Job Fair Handbook June 3, 2020, Huntsville
 
Huntsville All Clearances or Cyber Virtual Job Fair Handbook June 3
Huntsville All Clearances or Cyber Virtual Job Fair Handbook June 3Huntsville All Clearances or Cyber Virtual Job Fair Handbook June 3
Huntsville All Clearances or Cyber Virtual Job Fair Handbook June 3
 
What You Need to Know About Robotic Process Automation: How It Works & Real-W...
What You Need to Know About Robotic Process Automation: How It Works & Real-W...What You Need to Know About Robotic Process Automation: How It Works & Real-W...
What You Need to Know About Robotic Process Automation: How It Works & Real-W...
 
Database Design Disasters
Database Design DisastersDatabase Design Disasters
Database Design Disasters
 
Cybersecurity for Marketing
Cybersecurity for Marketing Cybersecurity for Marketing
Cybersecurity for Marketing
 
Self-Sovereign Identity: Lightening Talk at RightsCon
Self-Sovereign Identity: Lightening Talk at RightsCon Self-Sovereign Identity: Lightening Talk at RightsCon
Self-Sovereign Identity: Lightening Talk at RightsCon
 
Curated Proof Markets & Token-Curated Identities in Ocean Protocol
Curated Proof Markets & Token-Curated Identities in Ocean ProtocolCurated Proof Markets & Token-Curated Identities in Ocean Protocol
Curated Proof Markets & Token-Curated Identities in Ocean Protocol
 
TECHNOLOGY: Solution to our woos not Politicians & INTERNET of THINGS in Nuts...
TECHNOLOGY: Solution to our woos not Politicians & INTERNET of THINGS in Nuts...TECHNOLOGY: Solution to our woos not Politicians & INTERNET of THINGS in Nuts...
TECHNOLOGY: Solution to our woos not Politicians & INTERNET of THINGS in Nuts...
 
Snowflake Data Governance
Snowflake Data GovernanceSnowflake Data Governance
Snowflake Data Governance
 
Synthetic Data for Big Data Privacy
Synthetic Data for Big Data PrivacySynthetic Data for Big Data Privacy
Synthetic Data for Big Data Privacy
 
How the US Military does Risk Management is a little different wha.docx
How the US Military does Risk Management is a little different wha.docxHow the US Military does Risk Management is a little different wha.docx
How the US Military does Risk Management is a little different wha.docx
 
Internet of Things Primer
Internet of Things PrimerInternet of Things Primer
Internet of Things Primer
 

Real-Time Entity Resolution with Elasticsearch - Haystack 2018

  • 1. Dave Moore david.moore@elastic.co Haystack: The Search Relevancy Conference (11 April 2018) Real-Time Entity Resolution With Elasticsearch
  • 2. Disambiguation: "Named Entity Recognition" (entity as single attributes in unstructured text) vs. "Entity Resolution" (entity as multiple attributes in structured data). Example entity: Person. Field / Value: Name = Alice Jones; DOB = 1984-01-01; Street = 123 Main St; Credit Card = 4040 0000 2020 8080; Phone = 202-555-1234
  • 3. What is entity resolution?
  • 4. Health Care Patient ID We need to identify and their medical many hand-written Mixing up records puts at risk of injury or Sales & Marketing Customer Intel We have reps managing many sources of info on leads and customers. Our view of the buyer is fragmented and that makes us less effective. We're losing pipeline. Security & Compliance Fraud We need to track a person or device that is hiding its tracks. Connecting the dots is a laborious process and we can't keep up with our incident backlog. Military, IC, Law Surveillance We need to track a person or device that is hiding its identity. Our timely success is critical to public safety and national security. Privacy Compliance GDPR We must find and manage all PII to respond to inquiries. Failure to comply risks fines of €20 million or 4% annual turnover. IT MDM MDM is a slow and bureaucratic process. We can solve our own data quality problems faster and better. And we still need query time entity resolution. 3 Examples
  • 5. Why is identity hard to track?
  • 6. 1. Identity is Vague. Records: [Allie Jones / 123 Main St / ABC Widgets, Inc. / 4040 0000 2020 8080 / 202-555-1234] [Ali Jones / 123 W Main Street / ABC Wigdets / 4040 0000 2020 8008 / +1 (202) 555 1234] Icons by icons8
  • 7. 2. Identity Changes. Records: [Allie Jones / 123 Main St / ABC Widgets, Inc. / 4040 0000 2020 8080 / 202-555-1234] [Ali Jones / 123 W Main Street / ABC Wigdets / 4040 0000 2020 8008 / +1 (202) 555 1234] [Allison Smith / 555 Broad St / XYZ Technology Corp. / 3030 5050 9999 0000 / 202-555-9876] [Alison Jones-Smith / 555 Brooad Street / XYZ Tech / 3030 5500 9999 0000 / 2025559867] Icons by icons8
  • 8. 3. Identity is Messy. Records: [Allie Jones / 123 Main St / ABC Widgets, Inc. / 4040 0000 2020 8080 / 202-555-1234] [Ali Jones / 123 W Main Street / ABC Wigdets / 4040 0000 2020 8008 / +1 (202) 555 1234] [Allison Smith / 555 Broad St / XYZ Technology Corp. / 3030 5050 9999 0000 / 202-555-9876] [Alison Jones-Smith / 555 Brooad Street / XYZ Tech / 3030 5500 9999 0000 / 2025559867] Icons by icons8
  • 9. 4. Identity is Diverse. Records: [Allie Jones / 123 Main St / ABC Widgets, Inc. / 4040 0000 2020 8080 / 202-555-1234] [Ali Jones / 123 W Main Street / ABC Wigdets / 4040 0000 2020 8008 / +1 (202) 555 1234] [Allison Smith / 555 Broad St / XYZ Technology Corp. / 3030 5050 9999 0000 / 202-555-9876] [Alison Jones-Smith / 555 Brooad Street / XYZ Tech / 3030 5500 9999 0000 / 2025559867] [??? / ??? / ??? / ???] Icons by icons8
  • 10. Entity Resolution connects the dots despite these challenges
  • 11. Allie Jones 123 Main St ABC Widgets, Inc. 4040 0000 2020 8080 202-555-1234 Allie Jones 123 Main Street ABC Widgets 4040 0000 2020 8080 202.555.1234 Ali Jones 123 W Main Street ABC Wigdets 4040 0000 2020 8008 +1 (202) 555 1234 Allie Jones 132 W Main Street ABC Widgets 4040 0000 2020 8080 202 555 1234 Allie Smith 123 Main St ABC Widgets, Inc. 4040 0000 2020 8080 202-555-1234 Allie Smith 123 Main Street ABC Widgets 4040 0000 2020 8080 202.555.1234 Ali Smith 123 W Main Street ABC Wigdets 4040 0000 2020 8008 +1 (202) 555 1234 Allie Smith 555 Broad St ABC Widgets, Inc 4040 0000 2020 8080 202-555-1234 Allie Smith 555 Broad Street XYZ Tech Corp 3030 5050 9999 0000 202.555.1234 Allie Smith 555 Broad Street XYZ Technology Corp 3030 5050 9999 0000 202-555-9876 10 Comparison to Search Search Resolution name:"Allie Jones" AND street:"123 Main St" name:"Allie Jones" AND street:"123 Main St" Allie Jones 123 Main St ABC Widgets, Inc. 4040 0000 2020 8080 202-555-1234 Allie Jones 123 Main Street ABC Widgets 4040 0000 2020 8080 202.555.1234 Ali Jones 123 W Main Street ABC Wigdets 4040 0000 2020 8008 +1 (202) 555 1234 Ali Jones 132 Mane Street ABC Widgets 4024 0071 4970 1227 888-555-5555 Aly Jonas 113 Main Street Acme Corp. 4716 1035 4536 4671 610-555-5555 Allie Jones 132 W Main Street ABC Widgets 4040 0000 2020 8080 202-555-9876 Al Jones 132 E Main St Mom & Pop, LLC 3772 733741 52501 1-610-555-0000 Aly Jones 113 Main St, #102 Acme Corp. 4716 1035 4536 4671 610-555-5555 Ali Jones 132 Mane Street ABC Widgets 4024 0071 4970 1227 888-555-1234 Aly Jonas 113 Main Street Acme Corp. 4781 9105 0533 4481 610-555-2345 Allie Johns 132 W Main Street ABC Widgets 4088 0110 2044 8180 202-555-3456 Elle Jeon 132 E Main St Mom & Pop, LLC 3502 730741 52203 1-610-555-4567 Elle Jones 113 Main St, #102 Acme Corp. 4716 1035 4536 4671 610-555-5678 Eli Jones 132 Mane Street ABC Widgets 4224 0065 4800 1337 888-555-6789 Eli Joans 113 Main Street Acme Corp. 
4206 1035 4536 4081 610-555-7890 Allie Jeans 132 N Mean Street ABC Widgets 4240 0101 02020 8888 202-555-8901 Search: The search engine ranks results once. True hits are mixed with noise. Resolution: The search engine filters results recursively. True hits are isolated and transitively linked.
  • 13. Batch vs. Real-Time. Batch: How is it used? Resolve all entities in advance (partitioning, pairwise scoring, connected components). How long does it take? Docs + (Docs/Partitions)^2 + Components^2 (hours for billions of documents). When is it necessary? Population or network analysis. Real-Time: How is it used? Resolve one entity on query (recursive Boolean query). How long does it take? Indices * Attributes * Hops (milliseconds for a handful of each). When is it necessary? Individual analysis. Most solutions have a real-time phase, sometimes applied after batch resolution.
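To get a feel for the two cost expressions on this slide, here is a rough Python sketch that plugs numbers into them. It assumes the trailing "2" in the flattened slide text is an exponent, and the input sizes are made-up illustrations. The results are unitless operation counts, not timings.

```python
def batch_cost(docs: int, partitions: int, components: int) -> float:
    """Docs + (Docs/Partitions)^2 + Components^2, as read from the slide."""
    return docs + (docs / partitions) ** 2 + components ** 2


def real_time_cost(indices: int, attributes: int, hops: int) -> int:
    """Indices * Attributes * Hops, as written on the slide."""
    return indices * attributes * hops


# Hypothetical sizes, chosen only to show the difference in scale.
batch_ops = batch_cost(docs=1_000_000, partitions=1_000, components=500)
real_time_ops = real_time_cost(indices=3, attributes=5, hops=4)

print(f"batch: {batch_ops:,.0f} operations")
print(f"real-time: {real_time_ops} queries")
```

The quadratic terms dominate the batch estimate as partitions or components grow, while the real-time estimate stays small as long as each factor is "a handful".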
  • 14. Real-Time: Why Elasticsearch. Robust matching: • Token normalization • Phonetic matching • Fuzzy transpositions • Boolean logic filtering • Fine-tune search parameters. Suited for operations: • Horizontal scaling • Real-time response rates • Flexible index mappings
  • 15. Approach: Goals • Fast – Get results in real time, from milliseconds to low seconds. • Generic – Resolve any type of entity: people, companies, locations, sessions, etc. • Transitive – Resolve over multiple hops of matches. Capture changing identities. • Multi-source – Resolve over multiple indices with disparate mappings. • Accommodating – Operate on data as it exists. Avoid transforming and reindexing data. • Logical – Logic is easier to read, troubleshoot, and optimize than statistics. • 100% Elasticsearch – Operate within existing search infrastructure.
  • 16. Approach: Steps 1. Entity modeling – What is the entity? What are its attributes? 2. Analyzers – How are you indexing each attribute? 3. Matchers – What is the query logic for each attribute? 4. Resolvers – What combinations of matching attributes imply a resolution? 5. Metadata maps – Which matchers apply to which indexed fields? 6. Recursive queries – How do you repeat the queries until completion?
  • 17. zentity.io: An open-source Elasticsearch plugin for real-time entity resolution
  • 18. POST _zentity/resolution/person { "attributes": { "name": "Alice Jones", "dob": "1984-01-01", "phone": [ "555-123-4567", "555-987-6543" ] } } (zentity.io)
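As a minimal sketch, the request on this slide could be prepared from Python with only the standard library. The endpoint path and body come from the slide. The base URL `http://localhost:9200` and the helper name `build_resolution_request` are assumptions for illustration, and the response format is not shown because the slide does not describe it.

```python
import json
import urllib.request


def build_resolution_request(base_url: str, entity_type: str, attributes: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST to _zentity/resolution/<entity_type>."""
    url = f"{base_url.rstrip('/')}/_zentity/resolution/{entity_type}"
    body = json.dumps({"attributes": attributes}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical local cluster with the zentity plugin installed.
request = build_resolution_request(
    "http://localhost:9200",
    "person",
    {
        "name": "Alice Jones",
        "dob": "1984-01-01",
        "phone": ["555-123-4567", "555-987-6543"],
    },
)
print(request.get_method(), request.full_url)

# To actually send it against a running cluster:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```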
  • 20. Demos. Customer intelligence: Gather everything we know about a customer. Web traffic sessionization: Track a bot that cycles through IP addresses, cookies, and user agent signatures. Fraud detection: Determine if a health care provider was blacklisted under a different name.
  • 24. Step 1. Entity Modeling. Name the entity type: Person. Define its attributes. Study them in your data sets. Attribute (Uniqueness / Consistency / Presence): Name – First Name (Moderate / Moderate / Extreme); Name – Last Name (Moderate / Moderate / High); Address – Street (High / Low / Moderate); Address – City (Low / Moderate / Moderate); Address – Province (Low / High / High); Address – Postal Code (Low / High / High); Address – Country (Low / High / High); Date of Birth (Moderate / High / Moderate); Phone Number (Moderate / Moderate / Moderate); Email Address (High / Extreme / Moderate); IP Address (High / Extreme / Low); Credit Card Number (Extreme / Extreme / Low); Social Security Number (Extreme / High / None). Icons by icons8
  • 25. Step 1. Entity Modeling (continued). This model is independent from your indices. You can reuse and extend this model as you add or amend indices.
  • 26. Step 2. Analyzers. Take the attributes. Define their analyzers. Put them in your index mappings. Person attributes: Name – First Name, Name – Last Name, Address – Street, Address – City, Address – Province, Address – Postal Code, Address – Country, Date of Birth, Phone Number, Email Address, IP Address, Credit Card Number, Social Security Number. Phonetic: "Alice Jones" => ["ALAC","JAN"]. Standard: "Alice Jones" => ["ALICE","JONES"]. Index settings: { "settings": { "index": { "analysis": { "filter": { "phonetic": { "type": "phonetic", "encoder": "nysiis" } }, "analyzer": { "phonetic": { "filter": [ "icu_normalizer", "icu_folding", "phonetic" ], "tokenizer": "standard" } } } } } } Index mappings: { "mappings": { "_doc": { "properties": { "first_name": { "type": "text", "fields": { "phonetic": { "type": "text", "analyzer": "phonetic" } } } } } } } Icons by icons8
  • 27. Step 2. Analyzers (continued). Analyzers are powerful. But they must be defined prior to indexing. Give careful thought to your analyzers to avoid having to reindex data.
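The settings and mappings from the Step 2 slides can be assembled into a single index-creation body. The sketch below builds that body as Python dictionaries and prints it as JSON. Note two assumptions: the `_doc` level under `mappings` follows the slide (2018-era Elasticsearch), while newer versions omit mapping types, and the `phonetic`, `icu_normalizer`, and `icu_folding` filters require the analysis-phonetic and analysis-icu plugins.

```python
import json

# Analysis settings copied from the slide: a NYSIIS phonetic filter
# wrapped in a custom analyzer named "phonetic".
index_settings = {
    "settings": {
        "index": {
            "analysis": {
                "filter": {
                    "phonetic": {"type": "phonetic", "encoder": "nysiis"}
                },
                "analyzer": {
                    "phonetic": {
                        "filter": ["icu_normalizer", "icu_folding", "phonetic"],
                        "tokenizer": "standard",
                    }
                },
            }
        }
    }
}

# Mapping copied from the slide: first_name is indexed with the default
# analyzer and again as first_name.phonetic with the phonetic analyzer.
index_mappings = {
    "mappings": {
        "_doc": {
            "properties": {
                "first_name": {
                    "type": "text",
                    "fields": {
                        "phonetic": {"type": "text", "analyzer": "phonetic"}
                    },
                }
            }
        }
    }
}

# One body that could be sent when creating the index.
create_index_body = {**index_settings, **index_mappings}
print(json.dumps(create_index_body, indent=2))
```

Indexing the same attribute under several sub-fields up front is what lets the matchers in Step 3 choose between exact, fuzzy, and phonetic comparison without reindexing.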
  • 28. Step 3. Matchers. Take the attributes. Define their Boolean query logic. Use templates for variables. Person attributes: Name – First Name, Name – Last Name, Address – Street, Address – City, Address – Province, Address – Postal Code, Address – Country, Date of Birth, Phone Number, Email Address, IP Address, Credit Card Number, Social Security Number. {{ field }} – The field of an index. {{ value }} – The value of an attribute. We will replace these at query time. Phonetic: { "match": { "{{ field }}": { "query": "{{ value }}", "fuzziness": 0 } } } Standard: { "match": { "{{ field }}": { "query": "{{ value }}", "fuzziness": 2 } } } Icons by icons8
  • 29. Step 3. Matchers (continued). Understand that each matcher will be combined into one large Boolean query.
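The matcher templates above can be filled in at query time by substituting `{{ field }}` and `{{ value }}`. The following is a minimal sketch of that substitution, not the plugin's own implementation. The function name `render_matcher` and the example field `first_name.phonetic` are assumptions for illustration.

```python
# Matcher templates copied from the slide.
MATCHERS = {
    "phonetic": {"match": {"{{ field }}": {"query": "{{ value }}", "fuzziness": 0}}},
    "standard": {"match": {"{{ field }}": {"query": "{{ value }}", "fuzziness": 2}}},
}


def render_matcher(name: str, field: str, value: str) -> dict:
    """Return a copy of the named template with both variables replaced."""

    def substitute(node):
        # Walk the template and replace placeholders in keys and string values.
        if isinstance(node, dict):
            return {substitute(key): substitute(val) for key, val in node.items()}
        if isinstance(node, str):
            return node.replace("{{ field }}", field).replace("{{ value }}", value)
        return node  # numbers such as fuzziness are left untouched

    return substitute(MATCHERS[name])


clause = render_matcher("phonetic", "first_name.phonetic", "Alice")
print(clause)
```

Substituting inside the parsed structure, rather than in a JSON string, avoids escaping problems when a value contains quotes.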
  • 30. Step 4. Resolvers. Take the attributes. Determine which combinations of matching attributes imply a resolution. Person attributes: Name – First Name, Name – Last Name, Address – Street, Address – City, Address – Province, Address – Postal Code, Address – Country, Date of Birth, Phone Number, Email Address, IP Address, Credit Card Number, Social Security Number. Resolvers: [ Name – First, Name – Last, Address – Street, Address – City, Address – State ] [ Name – First, Name – Last, Address – Street, Address – Postal Code ] [ Name – First, Name – Last, Date of Birth, Address – City, Address – State ] [ Name – First, Name – Last, Date of Birth, Address – Postal Code ] [ Name – First, Name – Last, Phone Number ] [ Name – First, Name – Last, Email Address ] [ Name – First, Name – Last, IP Address ] [ Name – First, Name – Last, Credit Card Number ] [ Name – First, Name – Last, Social Security Number ] [ Email Address, Phone Number ] [ Email Address, IP Address ] [ Email Address, Credit Card Number ] [ IP Address, Credit Card Number ] Icons by icons8
  • 31. Step 4. Resolvers (continued). Avoid resolving on a single attribute such as Social Security Number. Corroboration among multiple attributes helps prevent snowballs (runaway chains of false matches).
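One way to read the resolver list is as a set of attribute combinations, where a document resolves if it matches every attribute in at least one combination. The sketch below encodes the thirteen resolvers from the slide as Python sets. The snake_case attribute names and the function `is_resolved` are illustrative assumptions.

```python
# Each resolver is a set of attributes that must all match.
# Combinations are taken from the slide; the short names are hypothetical.
RESOLVERS = [
    {"first_name", "last_name", "street", "city", "state"},
    {"first_name", "last_name", "street", "postal_code"},
    {"first_name", "last_name", "dob", "city", "state"},
    {"first_name", "last_name", "dob", "postal_code"},
    {"first_name", "last_name", "phone"},
    {"first_name", "last_name", "email"},
    {"first_name", "last_name", "ip"},
    {"first_name", "last_name", "credit_card"},
    {"first_name", "last_name", "ssn"},
    {"email", "phone"},
    {"email", "ip"},
    {"email", "credit_card"},
    {"ip", "credit_card"},
]


def is_resolved(matched_attributes) -> bool:
    """True if the matched attributes cover at least one full resolver."""
    matched = set(matched_attributes)
    return any(resolver <= matched for resolver in RESOLVERS)


print(is_resolved({"first_name", "last_name", "phone"}))  # covers a resolver
print(is_resolved({"ssn"}))                               # single attribute only
```

Because no resolver contains a single attribute, a lone match on Social Security Number is never enough, which is the corroboration rule the slide recommends.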
  • 32. Step 5. Metadata Maps. Take the attributes. Map them to the fields of the relevant indices. Person attributes: Name – First Name, Name – Last Name, Address – Street, Address – City, Address – Province, Address – Postal Code, Address – Country, Date of Birth, Phone Number, Email Address, IP Address, Credit Card Number, Social Security Number. Example fields: users.first_name, users.last_name, users.phone, users.email, customers:fname, customers:lname, customers:tel, customers:email, customers:cc, customers:zip. Icons by icons8
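A metadata map can be pictured as a lookup from each index to the field that holds each attribute. The sketch below uses the `users` and `customers` field names from the slide. Which attribute each field belongs to is inferred from its name, and the attribute keys and the function `fields_for` are assumptions for illustration.

```python
# index name -> {attribute name -> field name in that index}
METADATA_MAP = {
    "users": {
        "first_name": "first_name",
        "last_name": "last_name",
        "phone": "phone",
        "email": "email",
    },
    "customers": {
        "first_name": "fname",
        "last_name": "lname",
        "phone": "tel",
        "email": "email",
        "credit_card": "cc",
        "postal_code": "zip",
    },
}


def fields_for(attribute: str) -> dict:
    """Return {index: field} for every index that stores the attribute."""
    return {
        index: fields[attribute]
        for index, fields in METADATA_MAP.items()
        if attribute in fields
    }


print(fields_for("phone"))        # present in both indices under different names
print(fields_for("credit_card"))  # present only in customers
```

Keeping this map separate from the entity model is what allows one model to be reused across indices with disparate mappings.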
  • 33. Step 6. Recursive Queries. With each query, new inputs might be found in different attributes. Use the metadata map and your resolvers to determine if you can create new queries for the new inputs.
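The recursive step can be sketched as a loop that keeps querying until no new documents resolve. The toy version below makes several simplifying assumptions: it searches an in-memory list instead of Elasticsearch, uses exact equality instead of analyzers and matchers, and uses three hypothetical resolvers. It is meant only to show how newly found attribute values feed the next hop.

```python
from typing import Dict, List, Set

# Toy records loosely based on the slides; field names are hypothetical.
DOCS: List[Dict[str, str]] = [
    {"name": "Allie Jones", "street": "123 Main St", "phone": "202-555-1234"},
    {"name": "Allie Jones", "street": "555 Broad St", "phone": "202-555-1234"},
    {"name": "Allie Smith", "street": "555 Broad St", "phone": "202-555-1234"},
    {"name": "Allie Smith", "street": "555 Broad St", "phone": "202-555-9876"},
    {"name": "Aly Jonas", "street": "113 Main Street", "phone": "610-555-5555"},
]

# Hypothetical resolvers: any two matching attributes imply a resolution.
RESOLVERS: List[Set[str]] = [{"name", "street"}, {"name", "phone"}, {"street", "phone"}]


def resolve(docs, attributes, resolvers, max_hops=10):
    """Return indices of documents transitively linked to the input attributes."""
    known = {attr: set(values) for attr, values in attributes.items()}
    resolved = set()
    for _ in range(max_hops):
        # One "query": find unresolved documents that satisfy any resolver.
        hits = []
        for i, doc in enumerate(docs):
            if i in resolved:
                continue
            matched = {a for a, v in doc.items() if v in known.get(a, set())}
            if any(resolver <= matched for resolver in resolvers):
                hits.append(i)
        if not hits:
            break  # no new documents, so there are no new inputs to query
        # Feed the attribute values of the new hits into the next hop.
        for i in hits:
            resolved.add(i)
            for attr, value in docs[i].items():
                known.setdefault(attr, set()).add(value)
    return sorted(resolved)


result = resolve(DOCS, {"name": ["Allie Jones"], "street": ["123 Main St"]}, RESOLVERS)
print(result)  # [0, 1, 2, 3]
```

Starting from one name and one street, the loop picks up a phone number on the first hop, a new street on the second, a new name on the third, and a new phone number on the fourth, while the unrelated last record is never reached.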