Digital Enterprise Research Institute

www.deri.ie

Discovering Semantic Equivalence of People behind Online Profiles
Keith Cortis, Simon Scerri, Ismael Rivera, Siegfried Handschuh
REsource Discovery (RED), Workshop at ESWC 2012
27th May 2012
Copyright 2011 Digital Enterprise Research Institute. All rights reserved.

Enabling Networked Knowledge
Motivation
Current situation:
• Personal data is unnecessarily duplicated over different platforms
• No possibility to merge or port such data
• Separate handling of this data

Figure: Social Networking Sites as Walled Gardens – David Simonds

Problem Specification

• No common standards exist for modelling profile data in online accounts
• Personal data (known contacts and presence information) is dynamic and continuously changing

Objectives

Aim: User represented through one digital identity
Main Challenge: Discovery of semantic equivalence between contacts described in online profiles
Proposal: Use a comprehensive ontology framework for handling online profile data

di.me Ontology Framework

Related Work Comparison
Existing profile linking approaches are based on:
o User’s friends
o Specific Inverse Functional Properties (e.g. email address)
o Syntactic matching of all profile attributes
o Semantic relatedness between text, depending on Knowledge Bases (KB) such as Wikipedia

Our Approach: Similarity measure based on user’s Personal Information Model (PIM)
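
To make one of the existing approaches listed above concrete, the sketch below links profiles that share a value for an inverse functional property, here an email address. The profile dictionaries, the `email` key and the function name `link_by_ifp` are hypothetical and chosen for illustration only; this shows the IFP-based technique from related work, not the PIM-based similarity measure proposed in the talk.

```python
from collections import defaultdict


def link_by_ifp(profiles, ifp="email"):
    """Group profile ids that share the same value for an inverse functional property."""
    groups = defaultdict(list)
    for profile in profiles:
        value = profile.get(ifp)
        if value:  # profiles without the property cannot be linked this way
            groups[value.strip().lower()].append(profile["id"])
    # Only values shared by two or more profiles indicate a link.
    return [ids for ids in groups.values() if len(ids) > 1]


# Illustrative data only; service names and ids are made up.
profiles = [
    {"id": "serviceA:1", "name": "J. Doe", "email": "jdoe@example.org"},
    {"id": "serviceB:7", "name": "John Doe", "email": "JDoe@example.org"},
    {"id": "serviceB:9", "name": "John Doe"},  # no email: stays unlinked
]
links = link_by_ifp(profiles)
```

The third profile shares a name with the second but has no email address, so it stays unlinked. Cases like this are why matching on a single property is limiting.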

Approach (1)

Figure: approach pipeline
A. User Profile Data
B. Ontology Mapping
C. Matching Attributes
D. Value Matching
   Syntactic Matching:
   1. Direct String Matching
   2. Indirect String Matching (Linguistic Analysis)
   3. Semantic Search Extension
   4. Ontology-enhanced Attribute Weighting
Result: Online Profile Resolution
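
The value matching and attribute weighting steps can be sketched roughly as follows. This is a minimal illustration, not the authors' metric: the slides do not give a formula, so the weights, the threshold, the attribute names and the use of `difflib.SequenceMatcher` for direct string matching are all assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical attribute weights and decision threshold (not from the slides).
WEIGHTS = {"name": 0.5, "city": 0.2, "email": 0.3}
THRESHOLD = 0.8


def value_similarity(a, b):
    """Direct string matching: similarity ratio in [0, 1] of two normalised strings."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def profile_similarity(p, q, weights=WEIGHTS):
    """Weighted average of value similarities over attributes present in both profiles."""
    shared = [attr for attr in weights if attr in p and attr in q]
    if not shared:
        return 0.0
    total = sum(weights[attr] for attr in shared)
    score = sum(weights[attr] * value_similarity(p[attr], q[attr]) for attr in shared)
    return score / total


def same_person(p, q, threshold=THRESHOLD):
    """Profile resolution decision: do the two profiles likely describe one person?"""
    return profile_similarity(p, q) >= threshold


a = {"name": "Keith Cortis", "city": "Galway", "email": "keith.cortis@deri.org"}
b = {"name": "keith cortis", "city": "Galway"}
c = {"name": "Simon Scerri", "city": "Galway"}
```

Under these assumed weights, `same_person(a, b)` holds because both shared attributes match after normalisation, while `a` and `c` share only the city and fall below the threshold.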

Approach (3)



• Identity-related online profile information: NEPOMUK Contact Ontology (NCO)
• Presence and online post data for the user: LivePost Ontology (DLPO)

• Account Ontology (DAO): for modelling service account representations

Figure: ontology diagram relating DLPO, DAO and NCO. Terms shown, grouped by ontology:
o DLPO: LivePost, MultimediaPost, PresencePost, WebDocumentPost, Message
o DAO: Account, Credentials; source, hasCredentials, userID, password (xsd:string), hasCustomAttribute, nao:externalIdentifier, rdfs:label
o NCO: Contact, PersonContact, OrganizationContact, ContactGroup, EmailAddress, PostalAddress, PhoneNumber, IMAccount, Name; representative, hasName, hasEmailAddress, hasPostalAddress, hasPhoneNumber, hasIMAccount, hasLocation (geo:Point), belongsToGroup, photo, key, sound (nie:DataObject), foafUrl, websiteUrl, blogUrl (rdfs:Resource)
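
As a rough illustration of the ontology mapping step, the sketch below renames service-specific profile fields to property names that appear on this slide (`hasName`, `hasEmailAddress`, `hasPhoneNumber`, `blogUrl`, `source`). The service names, their field names and the function `lift_profile` are hypothetical; a real mapping would produce RDF rather than plain dictionaries.

```python
# Hypothetical service-specific field names mapped to property names from the
# ontology slide; the services and their fields are illustrative only.
FIELD_MAPPINGS = {
    "serviceA": {"full_name": "hasName", "mail": "hasEmailAddress", "phone": "hasPhoneNumber"},
    "serviceB": {"displayName": "hasName", "emailAddress": "hasEmailAddress", "blog": "blogUrl"},
}


def lift_profile(service, raw):
    """Rename known fields to the shared vocabulary and record the source account."""
    mapping = FIELD_MAPPINGS[service]
    lifted = {"source": service}
    for field, value in raw.items():
        if field in mapping:  # unmapped fields are dropped in this sketch
            lifted[mapping[field]] = value
    return lifted


a = lift_profile("serviceA", {"full_name": "John Doe", "mail": "jdoe@example.org", "age": 30})
b = lift_profile("serviceB", {"displayName": "John Doe", "blog": "http://example.org/blog"})

# Attributes that both lifted profiles carry and that can therefore be compared.
comparable = sorted((set(a) & set(b)) - {"source"})
```

Once both profiles use the same property names, finding the attributes to compare reduces to a set intersection.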

Implementation



• Transformation
• Linguistic Analysis: ANNIE Information Extraction System, Large KB Gazetteer Lookup

Example: “DERI, Lower Dangan, Galway, Ireland” is decomposed into PIM entities:
o Organisation
o Street
o City
o Country
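
The decomposition in this example can be approximated with a toy gazetteer lookup. This sketch does not use ANNIE or GATE; the `GAZETTEER` entries, the fallback to `Street` for unknown parts and the function name `decompose` are assumptions made purely for illustration.

```python
# Toy gazetteer: a stand-in for a knowledge-base lookup (entries are illustrative).
GAZETTEER = {
    "deri": "Organisation",
    "galway": "City",
    "ireland": "Country",
}


def decompose(value, default_type="Street"):
    """Split a comma-separated value and label each part via gazetteer lookup.

    Parts not found in the gazetteer fall back to default_type, which is a
    simplifying assumption of this sketch.
    """
    parts = [part.strip() for part in value.split(",") if part.strip()]
    return [(GAZETTEER.get(part.lower(), default_type), part) for part in parts]


entities = decompose("DERI, Lower Dangan, Galway, Ireland")
```

Each typed part can then be compared against the matching entity type in another profile instead of comparing the whole address string.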

Final Objective

Summary



Objectives
o Aggregated profile data is lifted onto a unique PIM representation and integrated in a super profile
o Determination of semantic equivalence between contacts described in online profiles

Future Work
o Integration of further online accounts
o Semantic extension to the syntactic-based profile attribute matching
o Definition of a metric
o Analysis of online posts from multiple accounts
o Evaluation of artefact

Thank you for your attention
keith.cortis@deri.org
Enabling Networked Knowledge

Más contenido relacionado

La actualidad más candente

Data Sharing and the Polar Information Commons
Data Sharing and the Polar Information CommonsData Sharing and the Polar Information Commons
Data Sharing and the Polar Information CommonsKaitlin Thaney
 
CISO's Guide to Securing SharePoint
CISO's Guide to Securing SharePointCISO's Guide to Securing SharePoint
CISO's Guide to Securing SharePointImperva
 
Reading Group 2013 (DERI NUIG)
Reading Group 2013 (DERI NUIG)Reading Group 2013 (DERI NUIG)
Reading Group 2013 (DERI NUIG)Bianca Pereira
 
Theodore Zahariadis (Synelixis Solutions): Fundamental Limitation of Current ...
Theodore Zahariadis (Synelixis Solutions): Fundamental Limitation of Current ...Theodore Zahariadis (Synelixis Solutions): Fundamental Limitation of Current ...
Theodore Zahariadis (Synelixis Solutions): Fundamental Limitation of Current ...FIA2010
 
Linked data for Enterprise Data Integration
Linked data for Enterprise Data IntegrationLinked data for Enterprise Data Integration
Linked data for Enterprise Data IntegrationSören Auer
 
Phishing Attacks: Trends, Detection Systems and Computer Vision as a Promisin...
Phishing Attacks: Trends, Detection Systems and Computer Vision as a Promisin...Phishing Attacks: Trends, Detection Systems and Computer Vision as a Promisin...
Phishing Attacks: Trends, Detection Systems and Computer Vision as a Promisin...Selman Bozkır
 
Digital Object Identifier (DOI): Introduction and Applications
Digital Object Identifier (DOI): Introduction and Applications Digital Object Identifier (DOI): Introduction and Applications
Digital Object Identifier (DOI): Introduction and Applications Nader Ale Ebrahim
 

La actualidad más candente (10)

Convolutional Neural Networks
Convolutional Neural Networks Convolutional Neural Networks
Convolutional Neural Networks
 
Data Sharing and the Polar Information Commons
Data Sharing and the Polar Information CommonsData Sharing and the Polar Information Commons
Data Sharing and the Polar Information Commons
 
CISO's Guide to Securing SharePoint
CISO's Guide to Securing SharePointCISO's Guide to Securing SharePoint
CISO's Guide to Securing SharePoint
 
Reading Group 2013 (DERI NUIG)
Reading Group 2013 (DERI NUIG)Reading Group 2013 (DERI NUIG)
Reading Group 2013 (DERI NUIG)
 
Web Mining
Web MiningWeb Mining
Web Mining
 
Theodore Zahariadis (Synelixis Solutions): Fundamental Limitation of Current ...
Theodore Zahariadis (Synelixis Solutions): Fundamental Limitation of Current ...Theodore Zahariadis (Synelixis Solutions): Fundamental Limitation of Current ...
Theodore Zahariadis (Synelixis Solutions): Fundamental Limitation of Current ...
 
Linked data for Enterprise Data Integration
Linked data for Enterprise Data IntegrationLinked data for Enterprise Data Integration
Linked data for Enterprise Data Integration
 
Conference Report Final 11.18
Conference Report Final 11.18Conference Report Final 11.18
Conference Report Final 11.18
 
Phishing Attacks: Trends, Detection Systems and Computer Vision as a Promisin...
Phishing Attacks: Trends, Detection Systems and Computer Vision as a Promisin...Phishing Attacks: Trends, Detection Systems and Computer Vision as a Promisin...
Phishing Attacks: Trends, Detection Systems and Computer Vision as a Promisin...
 
Digital Object Identifier (DOI): Introduction and Applications
Digital Object Identifier (DOI): Introduction and Applications Digital Object Identifier (DOI): Introduction and Applications
Digital Object Identifier (DOI): Introduction and Applications
 

Destacado

Affinity Driven Social Network
Affinity Driven Social NetworkAffinity Driven Social Network
Affinity Driven Social Networkepokh
 
Whitepaper: Extract value from Facebook Data - Happiest Minds
Whitepaper: Extract value from Facebook Data - Happiest MindsWhitepaper: Extract value from Facebook Data - Happiest Minds
Whitepaper: Extract value from Facebook Data - Happiest MindsHappiest Minds Technologies
 
An Ontology-based Technique for Online Profile Resolution
An Ontology-based Technique for Online Profile ResolutionAn Ontology-based Technique for Online Profile Resolution
An Ontology-based Technique for Online Profile Resolutionkcortis
 
Introduction to Facebook Graph API and OAuth 2
Introduction to Facebook Graph API and OAuth 2Introduction to Facebook Graph API and OAuth 2
Introduction to Facebook Graph API and OAuth 2Thai Pangsakulyanont
 
Facebook Open Graph API
Facebook Open Graph APIFacebook Open Graph API
Facebook Open Graph APIColin Smillie
 
Tweets Classification using Naive Bayes and SVM
Tweets Classification using Naive Bayes and SVMTweets Classification using Naive Bayes and SVM
Tweets Classification using Naive Bayes and SVMTrilok Sharma
 
Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier Dev Sahu
 
2013-1 Machine Learning Lecture 03 - Naïve Bayes Classifiers
2013-1 Machine Learning Lecture 03 - Naïve Bayes Classifiers2013-1 Machine Learning Lecture 03 - Naïve Bayes Classifiers
2013-1 Machine Learning Lecture 03 - Naïve Bayes ClassifiersDongseo University
 
Data mining in social network
Data mining in social networkData mining in social network
Data mining in social networkakash_mishra
 
Twitter text mining using sas
Twitter text mining using sasTwitter text mining using sas
Twitter text mining using sasAnalyst
 

Destacado (13)

Affinity Driven Social Network
Affinity Driven Social NetworkAffinity Driven Social Network
Affinity Driven Social Network
 
Whitepaper: Extract value from Facebook Data - Happiest Minds
Whitepaper: Extract value from Facebook Data - Happiest MindsWhitepaper: Extract value from Facebook Data - Happiest Minds
Whitepaper: Extract value from Facebook Data - Happiest Minds
 
Timilar ppt
Timilar pptTimilar ppt
Timilar ppt
 
An Ontology-based Technique for Online Profile Resolution
An Ontology-based Technique for Online Profile ResolutionAn Ontology-based Technique for Online Profile Resolution
An Ontology-based Technique for Online Profile Resolution
 
PointLoyalty
PointLoyaltyPointLoyalty
PointLoyalty
 
Introduction to Facebook Graph API and OAuth 2
Introduction to Facebook Graph API and OAuth 2Introduction to Facebook Graph API and OAuth 2
Introduction to Facebook Graph API and OAuth 2
 
Facebook Open Graph API
Facebook Open Graph APIFacebook Open Graph API
Facebook Open Graph API
 
Facebook data analysis using r
Facebook data analysis using rFacebook data analysis using r
Facebook data analysis using r
 
Tweets Classification using Naive Bayes and SVM
Tweets Classification using Naive Bayes and SVMTweets Classification using Naive Bayes and SVM
Tweets Classification using Naive Bayes and SVM
 
Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier
 
2013-1 Machine Learning Lecture 03 - Naïve Bayes Classifiers
2013-1 Machine Learning Lecture 03 - Naïve Bayes Classifiers2013-1 Machine Learning Lecture 03 - Naïve Bayes Classifiers
2013-1 Machine Learning Lecture 03 - Naïve Bayes Classifiers
 
Data mining in social network
Data mining in social networkData mining in social network
Data mining in social network
 
Twitter text mining using sas
Twitter text mining using sasTwitter text mining using sas
Twitter text mining using sas
 

Similar a Discovering Semantic Equivalence of People behind Online Profiles (RED 2012 - ESWC 2012)

Querying Heterogeneous Datasets on the Linked Data Web
Querying Heterogeneous Datasets on the Linked Data WebQuerying Heterogeneous Datasets on the Linked Data Web
Querying Heterogeneous Datasets on the Linked Data WebEdward Curry
 
Spivack Blogtalk 2008
Spivack Blogtalk 2008Spivack Blogtalk 2008
Spivack Blogtalk 2008Blogtalk 2008
 
Nova Spivack - Semantic Web Talk
Nova Spivack - Semantic Web TalkNova Spivack - Semantic Web Talk
Nova Spivack - Semantic Web Talksyawal
 
Digital Renaissance - Alfresco EMEA Partner Day
Digital Renaissance - Alfresco EMEA Partner DayDigital Renaissance - Alfresco EMEA Partner Day
Digital Renaissance - Alfresco EMEA Partner DayJohn Newton
 
Self-Sovereign Identity: Lightening Talk at RightsCon
Self-Sovereign Identity: Lightening Talk at RightsCon Self-Sovereign Identity: Lightening Talk at RightsCon
Self-Sovereign Identity: Lightening Talk at RightsCon Kaliya "Identity Woman" Young
 
Alitora Innovation Networks
Alitora Innovation NetworksAlitora Innovation Networks
Alitora Innovation Networksalitora
 
Bg linkedin bigdata_martinschultz_symposium_yale_oct2012
Bg linkedin bigdata_martinschultz_symposium_yale_oct2012Bg linkedin bigdata_martinschultz_symposium_yale_oct2012
Bg linkedin bigdata_martinschultz_symposium_yale_oct2012Bhaskar Ghosh
 
Plays Well With Others
Plays Well With OthersPlays Well With Others
Plays Well With Othersbrianoberkirch
 
A distributional structured semantic space for querying rdf graph data
A distributional structured semantic space for querying rdf graph dataA distributional structured semantic space for querying rdf graph data
A distributional structured semantic space for querying rdf graph dataAndre Freitas
 
Next generation linked in talent search
Next generation linked in talent searchNext generation linked in talent search
Next generation linked in talent searchRyan Wu
 
VoID: Metadata for RDF Datasets
VoID: Metadata for RDF DatasetsVoID: Metadata for RDF Datasets
VoID: Metadata for RDF DatasetsRichard Cyganiak
 
Explaining The Semantic Web
Explaining The Semantic WebExplaining The Semantic Web
Explaining The Semantic WebAditya Tuli
 
Introduction to question answering for linked data & big data
Introduction to question answering for linked data & big dataIntroduction to question answering for linked data & big data
Introduction to question answering for linked data & big dataAndre Freitas
 
Multi-Source Provenance-Aware User Interest Profiling on the Social Semantic Web
Multi-Source Provenance-Aware User Interest Profiling on the Social Semantic WebMulti-Source Provenance-Aware User Interest Profiling on the Social Semantic Web
Multi-Source Provenance-Aware User Interest Profiling on the Social Semantic WebFabrizio Orlandi
 
Who am I? Blogtalk 2010 presentation
Who am I? Blogtalk 2010 presentation Who am I? Blogtalk 2010 presentation
Who am I? Blogtalk 2010 presentation coniecto
 

Similar a Discovering Semantic Equivalence of People behind Online Profiles (RED 2012 - ESWC 2012) (20)

Querying Heterogeneous Datasets on the Linked Data Web
Querying Heterogeneous Datasets on the Linked Data WebQuerying Heterogeneous Datasets on the Linked Data Web
Querying Heterogeneous Datasets on the Linked Data Web
 
Spivack Blogtalk 2008
Spivack Blogtalk 2008Spivack Blogtalk 2008
Spivack Blogtalk 2008
 
What is SDMX-RDF?
What is SDMX-RDF?What is SDMX-RDF?
What is SDMX-RDF?
 
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
 
Nova Spivack - Semantic Web Talk
Nova Spivack - Semantic Web TalkNova Spivack - Semantic Web Talk
Nova Spivack - Semantic Web Talk
 
Digital Renaissance - Alfresco EMEA Partner Day
Digital Renaissance - Alfresco EMEA Partner DayDigital Renaissance - Alfresco EMEA Partner Day
Digital Renaissance - Alfresco EMEA Partner Day
 
Self-Sovereign Identity: Lightening Talk at RightsCon
Self-Sovereign Identity: Lightening Talk at RightsCon Self-Sovereign Identity: Lightening Talk at RightsCon
Self-Sovereign Identity: Lightening Talk at RightsCon
 
Alitora Innovation Networks
Alitora Innovation NetworksAlitora Innovation Networks
Alitora Innovation Networks
 
Bg linkedin bigdata_martinschultz_symposium_yale_oct2012
Bg linkedin bigdata_martinschultz_symposium_yale_oct2012Bg linkedin bigdata_martinschultz_symposium_yale_oct2012
Bg linkedin bigdata_martinschultz_symposium_yale_oct2012
 
Plays Well With Others
Plays Well With OthersPlays Well With Others
Plays Well With Others
 
How to Publish Open Data
How to Publish Open DataHow to Publish Open Data
How to Publish Open Data
 
A distributional structured semantic space for querying rdf graph data
A distributional structured semantic space for querying rdf graph dataA distributional structured semantic space for querying rdf graph data
A distributional structured semantic space for querying rdf graph data
 
Next generation linked in talent search
Next generation linked in talent searchNext generation linked in talent search
Next generation linked in talent search
 
Identity Talk at Net Squared 2008
Identity Talk at Net Squared 2008Identity Talk at Net Squared 2008
Identity Talk at Net Squared 2008
 
VoID: Metadata for RDF Datasets
VoID: Metadata for RDF DatasetsVoID: Metadata for RDF Datasets
VoID: Metadata for RDF Datasets
 
Explaining The Semantic Web
Explaining The Semantic WebExplaining The Semantic Web
Explaining The Semantic Web
 
Introduction to question answering for linked data & big data
Introduction to question answering for linked data & big dataIntroduction to question answering for linked data & big data
Introduction to question answering for linked data & big data
 
Multi-Source Provenance-Aware User Interest Profiling on the Social Semantic Web
Multi-Source Provenance-Aware User Interest Profiling on the Social Semantic WebMulti-Source Provenance-Aware User Interest Profiling on the Social Semantic Web
Multi-Source Provenance-Aware User Interest Profiling on the Social Semantic Web
 
JeromeDL Tutorial
JeromeDL TutorialJeromeDL Tutorial
JeromeDL Tutorial
 
Who am I? Blogtalk 2010 presentation
Who am I? Blogtalk 2010 presentation Who am I? Blogtalk 2010 presentation
Who am I? Blogtalk 2010 presentation
 

Último

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 

Último (20)

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 

Discovering Semantic Equivalence of People behind Online Profiles (RED 2012 - ESWC 2012)

  • 1. Digital Enterprise Research Institute www.deri.ie Discovering Semantic Equivalence of People behind Online Profiles Keith Cortis, Simon Scerri, Ismael Rivera, Siegfried Handschuh REsource Discovery (RED), Workshop at ESWC 2012 27th May 2012 Copyright 2011 Digital Enterprise Research Institute. All rights reserved. Enabling Networked Knowledge
  • 2. Motivation. Current situation: personal data is unnecessarily duplicated over different platforms; there is no possibility to merge or port such data; this data is handled separately. Image: "Social Networking Sites as Walled Gardens" (David Simonds).
  • 3. Problem Specification. No common standards exist for modelling profile data in online accounts. Personal data (known contacts and presence information) is dynamic and continuously changing.
  • 4. Objectives. Aim: user represented through one digital identity. Main challenge: discovery of semantic equivalence between contacts described in online profiles. Proposal: use a comprehensive ontology framework for handling online profile data.
  • 5. di.me Ontology Framework
  • 6. Related Work Comparison. Existing profile linking approaches are based on: the user's friends; specific Inverse Functional Properties (e.g. email address); syntactic matching of all profile attributes; semantic relatedness between text, depending on Knowledge Bases (KB) such as Wikipedia. Our approach: a similarity measure based on the user's Personal Information Model (PIM).
  • 7. Approach (1). Pipeline diagram: (A) User Profile Data; (B) Ontology Mapping; (C) Matching Attributes, with steps 1 Linguistic Analysis, 2 Syntactic Matching (Value Matching, Direct String Matching, Indirect String Matching), 3 Semantic Search Extension, 4 Ontology-enhanced Attribute Weighting; (D) Online Profile Resolution.
  • 8. Approach (2). Pipeline diagram repeated from slide 7.
  • 9. Approach (3). Pipeline diagram repeated from slide 7.
  • 10. Approach (3). Identity-related online profile information: NCO. Presence and online post data for the user: DLPO.
  • 11. Approach (3). Account Ontology (DAO): for modelling service account representations. Diagram labels: DLPO (LivePost, MultimediaPost, PresencePost, WebDocumentPost, Message); DAO (Account, Credentials, source, hasCredentials, userID, password, hasCustomAttribute, nao:externalIdentifier, rdfs:label, xsd:string); NCO (Contact, PersonContact, OrganizationContact, ContactGroup, EmailAddress, PostalAddress, PhoneNumber, IMAccount, Name, representative, photo, key, sound, foafUrl, websiteUrl, blogUrl, hasEmailAddress, belongsToGroup, hasPostalAddress, hasPhoneNumber, hasLocation, hasIMAccount, hasName); external types rdfs:Resource, nie:DataObject, geo:Point.
  • 12. Approach (4). Pipeline diagram repeated from slide 7.
  • 13. Approach (4). Pipeline diagram repeated from slide 7.
  • 14. Approach (4). Pipeline diagram repeated from slide 7.
  • 15. Approach (4). Pipeline diagram repeated from slide 7.
  • 16. Approach (4). Pipeline diagram repeated from slide 7.
  • 17. Approach (5). Pipeline diagram repeated from slide 7.
  • 18. Implementation. Transformation. Linguistic Analysis: ANNIE Information Extraction System with Large KB Gazetteer lookup against the PIM. Example: "DERI, Lower Dangan, Galway, Ireland" is annotated as Organisation, Street, City, Country.
  • 19. Final Objective
  • 20. Summary. Objectives: aggregated profile data is lifted onto a unique PIM representation and integrated in a super profile; semantic extension to the syntactic-based profile attribute matching; determination of semantic equivalence between contacts described in online profiles. Future Work: integration of further online accounts; definition of a metric; analysis of online posts from multiple accounts; evaluation of artefact. Thank you for your attention. keith.cortis@deri.org
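
The gazetteer lookup example on slide 18 can be illustrated with a minimal Python sketch. This is not ANNIE or the Large KB Gazetteer: the `GAZETTEER` dictionary and the `annotate_address` function are hypothetical stand-ins, and the entries are assumed to have been loaded from the user's PIM beforehand.

```python
# Toy gazetteer lookup: a hypothetical stand-in for the ANNIE / Large KB
# Gazetteer step described in the slides, not the real GATE API.

# Assumed PIM-derived entries: lower-cased surface form -> entity type.
GAZETTEER = {
    "deri": "Organisation",
    "lower dangan": "Street",
    "galway": "City",
    "ireland": "Country",
}


def annotate_address(address: str) -> list[tuple[str, str]]:
    """Split a comma-separated address and tag each part by gazetteer lookup."""
    annotations = []
    for part in address.split(","):
        token = part.strip()
        if not token:
            continue
        # Parts that are missing from the gazetteer are tagged "Unknown".
        annotations.append((token, GAZETTEER.get(token.lower(), "Unknown")))
    return annotations


if __name__ == "__main__":
    for token, entity_type in annotate_address("DERI, Lower Dangan, Galway, Ireland"):
        print(f"{token}: {entity_type}")
```

A real gazetteer would also have to handle abbreviations and multi-word spans that are not comma-delimited; the comma split here is only a simplification for the slide's example string.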

Editor's notes

  1. Users are currently required to create and separately manage duplicated personal data in numerous, heterogeneous online account services. Walled garden: handling data separately builds a wall around connections and personal data, as reflected in the image; the affected concerns are portability, identity, linkability and privacy. Personal data in these accounts ranges from static identity-related information to more dynamic information, as well as physical and online presence.
  2. The focus of the study is not a straightforward task: (1) no common standards exist for modelling profile data in online accounts, so retrieving and integrating federated, heterogeneous personal data is hard from the outset; (2) some personal data (known contacts and presence information) is dynamic, so dealing with a user's multiple digital identities can become complex.
  3. Aim: enable the user to create, aggregate and merge multiple online profiles in one digital identity. One digital identity through the Digital.Me userware: i) a single access point to the user's personal information sphere; ii) refers to personal data on a user's multiple devices such as laptops, tablets and smartphones. (After challenge): online profiles, their attributes and shared posts. Focus: integration of multiple user online profiles (e.g. health, bank, government, social related), although our current focus is on social networks. Proposal: a comprehensive ontology framework, which serves as a standard format for handling static and dynamic profile data (a set of re-used, extended and new vocabularies).
  4. Pyramid of the OSCAF ontologies, adopted by the di.me framework (reused, extended, new). The PIM representation uses these ontologies and is based on PIMO, NCO and DLPO. For the problem in question (multiple identity integration), the following are of particular relevance. NCO: modelling profile attributes. PIMO: modelling the user's interests and who knows whom; it glues together knowledge represented by all the other domain upper-level ontologies (NCO and PIMO are both established). LivePost Ontology: modelling online posts (just one of a number of new ontologies being engineered). Other target domains: user presence (DPO), context (DCON), history (DUHO), rules (DRMO), devices (DDO), accounts (DAO). In di.me a number of established ontologies have been brought together to offer a representation solution tailored to the project's objectives (reused, extended, new).
  5. IFP: a property which uniquely identifies a user. Linking based on IFPs alone is shallow, since users can create multiple accounts within the same social network with different email addresses. Personal Information Model (PIM): an instance of the PIMO ontology; it is the main KB for semantic matching, complemented by knowledge from external KBs. The PIM is initially populated with any personal information integrated from a particular online account or crawled from a device. If there is no match for a particular entity, a new instance is created (there will be one user profile initially). Advantage of the PIM: it contains information of direct interest to the user, so it is more relevant to the user than an external KB and bound to yield more accurate results. Remote KBs such as DBpedia, or any other dataset that is part of the LOD cloud, will be accessed to determine any possible semantic relationship if no data exists in the PIM.
  6. The online profile matching approach involves four successive processes, as outlined in the image presented.
  7. Retrieve the user's profile information available through the service account APIs. Information targeted: the user's own identity-related information, online posts and contacts' information. All crawled information is aggregated into what we refer to as the user's 'super profile'.
  8. Mapping of the attributes of each represented online profile to the equivalent attributes of the super profile. Through the use of ontologies and RDF (the main data representation), the mapping we pursue considers both syntactic and semantic similarities between online profile data.
  9. Identity-related online profile information is stored as an instance of the NCO ontology, which represents information related to a particular contact. Presence and online post data for the user is stored as instances of DLPO, which represents personal presence information that is popularly shared in online accounts, e.g. status messages, check-ins, etc.
  10. Contacts (NCO instances) and LivePosts (DLPO instances) are linked to instances of accounts (dao:Account) that refer to a particular account, e.g. di.me, LinkedIn, Facebook, Twitter.
  11. Matching the user profile attributes: we consider the data at both a semantic and a syntactic level. It involves four successive processes, as outlined in (C).
  12. 1. Linguistic Analysis: performed on profile attributes that may contain complex or unstructured information, such as a postal address, unlike those with an atomic value (person's name, phone number). It is required for discovering further knowledge from a particular value. Hyperlink resolution is also applied if the profile does not contain enough information.
  13. 2. Syntactic Matching. Value Matching: for attributes of a non-string literal type (e.g. date of birth or geographical position), since these have a strict, predefined structure. Direct String Matching: for attributes of type 'string', if their ontology type (e.g. name, address) is either known beforehand or discovered through NER. Indirect String Matching: applied over all PIM instances, regardless of their type, if the attribute's entity type remains unknown even after NER is performed. String matching metric: Monge and Elkan, used to compare user profile attribute values found online with attributes stored in the PIM KB.
  14. 3. Semantic Search Extension: used to find whether two attributes are semantically related, given that they do not match syntactically. The user's PIM is the main KB used, whilst remote KBs, e.g. DBpedia or any other dataset in the LOD cloud, will also be used to determine any possible semantic relationship if the required data is not found within the PIM.
  15. 4. Ontology-enhanced Attribute Weighting: an appropriate metric is required for weighting the attributes which were syntactically and/or semantically matched.
  16. Based on the ontology attribute weighting metric, we establish a threshold which determines semantic equivalence between a user's online profile and their personal identity, which is already known and represented at the PIM level. Given that two profiles are semantically equivalent, the user can be prompted to merge profile information that is known over multiple online accounts. Integration of semantically equivalent personal information across distributed sources will create a unique user representation in the PIM.
  17. XSPARQL: the transformation of the XML social data into our RDF representation (Turtle) is declaratively expressed in an XSPARQL query. JSONLib: used to translate JSON into XML. ANNIE: contains several main processing resources for common NLP tasks, such as a tokeniser, sentence splitter, POS tagger, gazetteer, finite state transducer, orthomatcher and coreference resolver; it provides pre-defined gazetteers for common entity types (e.g. locations, organizations), which we extended with acronyms or abbreviations where necessary. Large KB Gazetteer: used to make use of the information stored within the user's PIM, since it can be populated dynamically by loading any ontology from RDF data.
  18. The user's Personal Information Model (PIM) glues together personal information from different sources, in this case from an online account (OnlineAccountX) and the user's super profile (Digital.MeAccount). Attributes of the user's online profiles are mapped to their corresponding properties within the di.me ontology framework. Five identity-related profile attributes are mapped within NCO (affiliation, organization, phone number, person name, postal address). For example, the label of the organization within the nco:org property, i.e. 'Digital Enterprise Research Institute', is matched against other organization instances within the PIM. The super profile instance 'DERI' is one example of other PIM instances having the same type. Presence-related profile information, available in the form of a complex type 'livepost', is composed of… For example, "Having a beer with Anna @ESWC12 in Iraklion" yields Status, Checkin and Event posts as the result of linguistic analysis on the online post. Semantic search example: the user's address in the super profile, listed as 'Iraklion', is related to a pimo:City instance, 'Heraklion'; the user's address in the online profile, 'GR', is related to a pimo:Country instance, 'Greece'. The two addresses do not match syntactically but are semantically related: through the PIM KB, the system knows that the city and country instances related to the two addresses are linked through the 'locatedWithin' property (partial semantic search). Advantage of using ontologies: resources can be linked at the semantic level, rather than the syntactic or format level. The pimo:groundingOccurrence property relates an 'abstract' but unique subject to one or more of its occurrences. The upper part of the figure is the T-Box (the ontological classes and attributes); the lower part is the A-Box (examples of how the ontologies can be used in practice). Straight lines between the A-Box and the T-Box denote an instance-of relationship.
  19. Integration of further online service accounts into our current system, e.g. health (RunKeeper), bank, government and social-related accounts (Foursquare, Dropbox, Flickr). Metric: takes into account all the resulting weighted matches which were syntactically and/or semantically matched or partially matched. Threshold: determines whether two or more online profiles refer to the same person. Evaluation: performed on 3 levels: i) syntactic matching, ii) semantic matching, and iii) a combination of
  20. Overall di.me objective: integrating all personal data in a personal information sphere through a single, user-controlled point of access: the di.me userware. Our part in di.me: WP3; objectives and tasks are mentioned in the slide.
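
The syntactic matching, attribute weighting and threshold steps described in the notes can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the inner token similarity uses `difflib.SequenceMatcher` as a stand-in, and `WEIGHTS`, `THRESHOLD`, `profile_score` and `is_equivalent` are hypothetical, since the slides list the definition of the actual metric as future work.

```python
from difflib import SequenceMatcher


def token_similarity(a: str, b: str) -> float:
    """Inner similarity in [0, 1]; SequenceMatcher is an assumed stand-in."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def monge_elkan(left: str, right: str) -> float:
    """Monge-Elkan scheme: average, over the tokens of `left`, of the best
    inner similarity against any token of `right`."""
    left_tokens, right_tokens = left.split(), right.split()
    if not left_tokens or not right_tokens:
        return 0.0
    best = [max(token_similarity(a, b) for b in right_tokens) for a in left_tokens]
    return sum(best) / len(best)


# Hypothetical attribute weights and cut-off, chosen only for illustration.
WEIGHTS = {"name": 0.5, "email": 0.3, "city": 0.2}
THRESHOLD = 0.8


def profile_score(online: dict[str, str], pim: dict[str, str]) -> float:
    """Weighted mean of per-attribute similarities over shared attributes."""
    shared = [key for key in WEIGHTS if key in online and key in pim]
    total = sum(WEIGHTS[key] for key in shared)
    if total == 0:
        return 0.0
    weighted = sum(WEIGHTS[key] * monge_elkan(online[key], pim[key]) for key in shared)
    return weighted / total


def is_equivalent(online: dict[str, str], pim: dict[str, str]) -> bool:
    """Suggest a merge when the weighted score reaches the threshold."""
    return profile_score(online, pim) >= THRESHOLD


if __name__ == "__main__":
    online_profile = {"name": "Cortis Keith", "city": "Galway"}
    pim_profile = {"name": "Keith Cortis", "city": "Galway"}
    print(is_equivalent(online_profile, pim_profile))
```

Note that this form of Monge-Elkan is asymmetric (it averages over the tokens of the first argument), and the sketch covers only syntactic matches; a semantic match found through the PIM would have to contribute its own weighted term.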