How to Reveal Hidden Relationships in Data and Risk Analytics
Ontotext Webinar, 28 Mar 2016
Presentation Outline
• Discovery and analytics case
• Data integration and FIBO mapping
• Discovery and analytics examples
• Future work
Apr 2016 · Hidden Relationships in Data and Risk Analytics
Relation Discovery Case
• Find suspicious relationships like:
− Company in USA controls
− Another company in USA
− Through a company in an off-shore zone
• Show news relevant to them
What Does It Take to Make It Work?
• A database of locations with sub-region information
• A database of companies and control relations
• Definitions of the semantics of the relevant relationships (using FIBO)
– sub-region and control are transitive relationships
– located-in is transitive over sub-region
• A definition of suspicious relationships:
CONSTRUCT { ?orgA my:suspiciousLink ?orgB } WHERE {
    ?orgA ptop:locatedIn ?x ; fibo:controls ?y .
    ?y fibo:controls ?orgB ; ptop:locatedIn ?z .
    ?orgB ptop:locatedIn ?x .
    ?z a ptop:OffshoreZone .
}
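A minimal Python sketch of the pattern that the CONSTRUCT query above matches, run on toy data. All company names, locations and function names here are hypothetical, and the transitive closure is computed naively; in a triplestore these inferences would normally come from the reasoner rather than from application code.

```python
from itertools import product

# Hypothetical toy data: sub-region links, direct locations, direct control.
SUB_REGION = {("Delaware", "USA"), ("Texas", "USA"), ("Grand Cayman", "Cayman Islands")}
OFFSHORE_ZONES = {"Cayman Islands"}
LOCATED_IN = {("AcmeCorp", "Delaware"), ("ShellCo", "Grand Cayman"), ("BetaInc", "Texas")}
CONTROLS = {("AcmeCorp", "ShellCo"), ("ShellCo", "BetaInc")}


def transitive_closure(pairs):
    """Naive transitive closure of a binary relation given as (a, b) pairs."""
    closure = set(pairs)
    while True:
        new = {(a, d) for (a, b), (c, d) in product(closure, closure) if b == c}
        if new <= closure:
            return closure
        closure |= new


def suspicious_links(located_in, sub_region, controls, offshore):
    """Pairs (orgA, orgB) sharing a location where orgA controls orgB via an off-shore company."""
    regions = transitive_closure(sub_region)
    # located-in is transitive over sub-region: propagate each location upwards.
    located = set(located_in) | {(org, big) for org, small in located_in
                                 for s, big in regions if s == small}
    # control is treated as transitive, as stated on the slide.
    ctrl = transitive_closure(controls)

    def places(org):
        return {x for o, x in located if o == org}

    links = set()
    for (a, y), (y2, b) in product(ctrl, ctrl):
        if y != y2:
            continue
        if places(y) & offshore and places(a) & places(b):
            links.add((a, b))
    return links


print(suspicious_links(LOCATED_IN, SUB_REGION, CONTROLS, OFFSHORE_ZONES))
```

The intermediate company only has to be located somewhere inside an off-shore zone, which is why the location facts are expanded with the sub-region closure before matching.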
The Web of Linked Data in 2007
[Figure: diagram of the Web of Linked Data in 2007. Callouts: structured database version of Wikipedia; database of all locations on Earth; product reviews; semantic synonym dictionary.]
Note: Each bubble represents a dataset. Arrows represent mappings across datasets, e.g. dbpedia:Paris owl:sameAs geo:2988507
The Web of Data is Gaining Mass (2011)
The Web of Linked Data is Gaining Mass
• 2013 stats: 2 289 public datasets
− http://stats.lod2.eu/
• Growing exponentially
− see the dotted trend line
• Structured markup
− Schema.org; semantic SEO
• Enables better semantic tagging!
− As there are more concepts and richer descriptions to refer to
Linked Data datasets per year (chart data):
Year      2007  2008  2009  2010  2011  2012  2013
Datasets    27    43    89   162   295   822  2,289
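The "growing exponentially" claim can be checked against the chart figures with a log-linear fit. The sketch below is an illustration only: the function name is made up, and the fit is an ordinary least-squares line through ln(count), which may differ from how the dotted trend line on the slide was produced.

```python
import math

# Dataset counts per year, as listed on the slide's chart.
DATASETS = {2007: 27, 2008: 43, 2009: 89, 2010: 162, 2011: 295, 2012: 822, 2013: 2289}


def annual_growth_factor(counts):
    """Fit ln(count) = a + b * year by least squares and return exp(b)."""
    years = sorted(counts)
    logs = [math.log(counts[y]) for y in years]
    mean_x = sum(years) / len(years)
    mean_y = sum(logs) / len(logs)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
             / sum((x - mean_x) ** 2 for x in years))
    return math.exp(slope)


factor = annual_growth_factor(DATASETS)
print(f"fitted growth: x{factor:.2f} per year")
```

On these seven data points the fitted factor comes out at roughly a doubling per year.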
Data Integration and Loading
• DBpedia (the English version only) 496M statements
• Geonames (all geographic features on Earth) 150M statements
− owl:sameAs links between DBpedia and Geonames 471K statements
• Company registry data (GLEI) 3M statements
• News metadata (from NOW) 128M statements
• Total size: 986M statements
− 667M explicit statements + 318M inferred statements
− RDFRank and geo-spatial indices enabled to allow for ranking and efficient geo-region constraints
Global Legal Entity Identifier (GLEI) data
• Global Markets Entity Identifier (GMEI) Utility data
− The GMEI utility is DTCC's legal entity identifier solution, offered in collaboration with SWIFT
− We downloaded a data dump from https://www.gmeiutility.org/
• RDF-ized company records
− Fields: LEI#, legal name, ultimate parent, registered country
− 3M explicit statements for 211 thousand organizations
▪ For comparison, there are 490 000 organizations in DBpedia, and D&B covers more than 200 million
− 10 821 ultimate-parent relationships and 1 632 ultimate parents
− About 2 800 organizations from the GLEI dump mapped to DBpedia
GLEI Company Data Sample: ABN-AMRO
lei:businessRegistry "Kamer van Koophandel"^^xsd:string
lei:businessRegistryNumber "34334259"^^xsd:string
lei:duplicateReference data:549300T5O0D0T4V2ZB28
lei:entityStatus "ACTIVE"^^xsd:string
lei:headquartersCity "Amsterdam"^^xsd:string
lei:headquartersState "Noord-Holland"^^xsd:string
lei:legalForm "NAAMLOZE VENNOOTSCHAP"^^xsd:string
lei:legalName "ABN AMRO Bank N.V."^^xsd:string
lei:lei "BFXS5XCH7N0Y05NIXW11"^^xsd:string
lei:registeredCity "Amsterdam"^^xsd:string
lei:registeredCountry "NL"^^xsd:string
lei:registeredPostCode "1082 PP"^^xsd:string
lei:registeredState "Noord-Holland"^^xsd:string
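As an illustration of how a flat company record could be turned into statements like the ones above, here is a small Python sketch. The namespace URIs, the function names and the choice of `data:` plus the LEI code as the subject are assumptions; the slides show only the `lei:` and `data:` prefixes, not the actual conversion.

```python
# Hypothetical namespace URIs; the slides show only the prefixes lei: and data:.
PREFIXES = {
    "lei": "http://example.org/lei/",
    "data": "http://example.org/data/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def escape(value):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def record_to_turtle(lei_code, record):
    """Serialize one flat company record as a Turtle description keyed by its LEI."""
    lines = [f"@prefix {p}: <{uri}> ." for p, uri in PREFIXES.items()]
    body = [f'    lei:{field} "{escape(value)}"^^xsd:string'
            for field, value in record.items()]
    lines.append(f"data:{lei_code}")
    lines.append(" ;\n".join(body) + " .")
    return "\n".join(lines)


# Three of the fields from the ABN AMRO sample on the slide.
sample = {
    "legalName": "ABN AMRO Bank N.V.",
    "lei": "BFXS5XCH7N0Y05NIXW11",
    "registeredCountry": "NL",
}
print(record_to_turtle("BFXS5XCH7N0Y05NIXW11", sample))
```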
Global Legal Entity Identifier (GLEI) data
Rank  Ultimate parent                   Children  Country
1     The Goldman Sachs Group, Inc.     1 851     US
2     United Technologies Corporation   427       US
3     Honeywell International Inc.      341       US
4     Morgan Stanley                    228       US
5     Cargill, Incorporated             217       US
6     1832 Asset Management L.P.        202       CA
7     Aegon N.V.                        174       NL
8     Union Bancaire Privée, UBP SA     138       CH
9     Citigroup Inc.                    135       US
10    State Street Corporation          128       US

Rank  Country              Companies
1     dbr:United_States    103 548
2     dbr:Canada           17 425
3     dbr:Luxembourg       13 984
4     dbr:Sweden           7 934
5     dbr:United_Kingdom   7 421
6     dbr:Belgium          6 868
7     dbr:Ireland          4 762
8     dbr:Australia        4 385
9     dbr:Germany          3 039
10    dbr:Netherlands      2 561
Quick news-analytics case
• Our Dynamic Semantic Publishing platform already offers linking of text with big open data graphs
• One can navigate from text to concepts, get trends, related entities and news
• Try it at http://now.ontotext.com
Technology: Semantic Content Enrichment
News Metadata
• Metadata from Ontotext’s Dynamic Semantic Publishing platform
− Automatically generated as part of the NOW.ontotext.com semantic news showcase
• News stream from Google since Feb 2015, about 10k news articles per month
− ~70 tags (annotations) per news article
• Tags link text mentions of concepts to the knowledge graph
− Technically these are URIs for entities (people, organizations, locations, etc.) and key phrases
Category                 Count
International            52 074
Science and Technology   23 201
Sports                   20 714
Business                 15 155
Lifestyle                11 684
Total                    122 828

Mentions / entity type   Count
Keyphrase                2 589 676
Organization             1 276 441
Location                 1 260 972
Person                   1 248 784
Work                     309 093
Event                    258 388
RelationPersonRole       236 638
Species                  180 946
Class Hierarchy Map (by number of instances)
Left: The big picture
Right: dbo:Agent class (2.7M organizations and persons)
Loading FIBO
• FIBO = Financial Industry Business Ontology
• We loaded FIBO Foundations and BE in GraphDB
− About 55 RDF files from the “foundations-14-11-30” and “business-entities-15-02-23” packages
• Reasoning switched to OWL 2 RL
− Loading takes 3-4 seconds
• Number of explicit statements: 5 433
• Number of total statements: 20 646
− Of which inferred and materialized: 15 213
FIBO Class Hierarchy
Explore properties related to a class
Mapping FIBO to DBpedia
• We mapped FIBO to the DBpedia Ontology
− Minimalistic approach: we mapped only as much as we needed
dbo:Organization rdfs:subClassOf fibo-fnd-org-fm:FormalOrganization .
dbo:Company rdfs:subClassOf fibo-be-le-cb:Corporation .
dbo:Person rdfs:subClassOf fibo-fnd-aap-ppl:Person .
dbo:subsidiary rdfs:subPropertyOf fibo-fnd-rel-rel:controls .
• Methodological notes
− Note that fibo-fnd-rel-rel:controls is not transitive
− We mapped more specific DBpedia primitives to more general FIBO ones, so that the data becomes “visible” through FIBO
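To show why these four axioms are enough to make DBpedia data "visible" through FIBO, here is a minimal Python sketch of the two RDFS entailment rules involved (subclass and subproperty). The instance triples are hypothetical, and the loop is a toy stand-in for what the repository's reasoner materializes; it is not a description of GraphDB's implementation.

```python
# The four mapping axioms from the slide, as (subject, predicate, object) triples.
MAPPING = {
    ("dbo:Organization", "rdfs:subClassOf", "fibo-fnd-org-fm:FormalOrganization"),
    ("dbo:Company", "rdfs:subClassOf", "fibo-be-le-cb:Corporation"),
    ("dbo:Person", "rdfs:subClassOf", "fibo-fnd-aap-ppl:Person"),
    ("dbo:subsidiary", "rdfs:subPropertyOf", "fibo-fnd-rel-rel:controls"),
}

# Hypothetical instance data expressed in DBpedia vocabulary.
DATA = {
    ("dbr:Volkswagen_Group", "rdf:type", "dbo:Company"),
    ("dbr:Volkswagen_Group", "dbo:subsidiary", "dbr:Audi"),
}


def infer(triples):
    """Apply the RDFS subclass and subproperty rules until nothing new is derived."""
    closed = set(triples)
    while True:
        sub_class = {(s, o) for s, p, o in closed if p == "rdfs:subClassOf"}
        sub_prop = {(s, o) for s, p, o in closed if p == "rdfs:subPropertyOf"}
        new = set()
        for s, p, o in closed:
            if p == "rdf:type":
                # An instance of a subclass is also an instance of the superclass.
                new |= {(s, "rdf:type", sup) for sub, sup in sub_class if sub == o}
            # A statement with a subproperty also holds for the superproperty.
            new |= {(s, sup, o) for sub, sup in sub_prop if sub == p}
        if new <= closed:
            return closed
        closed |= new


inferred = infer(MAPPING | DATA) - MAPPING - DATA
for triple in sorted(inferred):
    print(triple)
```

A query written purely in FIBO terms (for example, asking for everything `fibo-fnd-rel-rel:controls` relates) then matches the derived triples without mentioning any DBpedia property.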
See open data through the FIBO lens
Semantic Press-Clipping
• We can trace references to a specific company in the news
− This is fairly standard, but we can also handle syntactic variations in names, because state-of-the-art Named Entity Recognition technology is used
− More importantly, we correctly determine whether a given mention of “Paris” refers to Paris (the capital of France), Paris in Texas, Paris Hilton or Paris (the Greek hero)
• We can trace and consolidate references to subsidiaries
• We have a comprehensive industry classification
− The one from DBpedia, refined to accommodate identifier variations and specialization (e.g. a company classified as dbr:Bank will also be considered classified as dbr:FinancialServices)
Mentions of related entities
select distinct ?news ?title ?date ?rel_entity
from onto:disable-sameAs
where {
    BIND( dbr:Volkswagen_Group as ?entity )
    # the company itself plus every entity it controls
    { ?entity fibo-fnd-rel-rel:controls ?rel_entity }
    UNION
    { BIND(?entity as ?rel_entity) }
    # news items that mention any of these entities
    ?news pub-old:containsMention / pub-old:hasInstance / pub:exactMatch ?rel_entity .
    ?news pub-old:creationDate ?date ; pub-old:title ?title .
    # restrict to news created in April 2015
    FILTER ( (?date > "2015-04-01T00:02:00Z"^^xsd:dateTime)
          && (?date < "2015-05-01T00:02:00Z"^^xsd:dateTime) )
}
Industry distribution
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX ff-map: <http://factforge.net/ff2016-mapping/>
select distinct ?top_industry (count(?company) as ?companies)
where {
    ?company dbo:industry ?industry .
    ?industrySum ff-map:industryVariant ?industry ;
                 ff-map:industryCenter ?top_industry .
} group by ?top_industry order by desc(?companies)
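The idea behind the query above, rolling industry variants up to one top-level industry before counting, can be sketched in Python as follows. The variant sets and company assignments are hypothetical examples, not the actual content of the ff-map mapping, and unlike `count(?company)` in the query this sketch counts each company once per top-level industry.

```python
from collections import defaultdict

# Hypothetical variant sets: top-level ("center") industry -> accepted variants.
INDUSTRY_VARIANTS = {
    "dbr:FinancialServices": {"dbr:FinancialServices", "dbr:Financial_services", "dbr:Bank"},
    "dbr:Automotive": {"dbr:Automotive", "dbr:Automotive_industry"},
}

# Hypothetical dbo:industry assignments as (company, industry) pairs.
COMPANY_INDUSTRY = [
    ("dbr:Goldman_Sachs", "dbr:Bank"),
    ("dbr:Goldman_Sachs", "dbr:Financial_services"),
    ("dbr:Nasdaq,_Inc.", "dbr:FinancialServices"),
    ("dbr:Volkswagen_Group", "dbr:Automotive_industry"),
]


def industry_distribution(company_industry, variants):
    """Number of distinct companies per top-level industry, largest first."""
    centers_of = defaultdict(set)
    for center, variant_set in variants.items():
        for variant in variant_set:
            centers_of[variant].add(center)
    companies = defaultdict(set)
    for company, industry in company_industry:
        for center in centers_of.get(industry, ()):
            companies[center].add(company)
    return sorted(((c, len(s)) for c, s in companies.items()),
                  key=lambda item: (-item[1], item[0]))


print(industry_distribution(COMPANY_INDUSTRY, INDUSTRY_VARIANTS))
```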
Most popular companies per industry
select distinct ?pub_entity ?label (count(?news) as ?news_count)
where {
    ?news pub-old:containsMention / pub-old:hasInstance ?pub_entity .
    ?pub_entity pub:exactMatch ?entity ; pub:preferredLabel ?label .
    ?entity dbo:industry ?industry .
    dbr:Automotive ff-map:industryVariant ?industry .
} group by ?pub_entity ?label order by desc(?news_count)
Most popular companies, including children
select distinct ?parent (count(?news) as ?news_count)
where {
    { select distinct ?parent ?entity {
        BIND(dbr:Software as ?industry)
        ?industry ff-map:industryVariant ?industryVar .
        ?parent dbo:industry ?industryVar .
        ?parent a dbo:Company .
        FILTER NOT EXISTS { ?parent dbo:parent / dbo:industry / ff-map:industryVariant ?industry }
        { ?entity dbo:parent ?parent . } UNION
        { BIND(?parent as ?entity) }
    } }
    ?news pub-old:containsMention / pub-old:hasInstance ?pub_entity .
    ?pub_entity pub:exactMatch ?entity .
    ?news pub-old:creationDate ?date .
} group by ?parent order by desc(?news_count)
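A simplified Python sketch of the roll-up performed by the query above: mentions of a child company are credited to its direct parent. The parent links and mentions are hypothetical toy data, the industry filter is left out, and only one level of `dbo:parent` is followed, as in the query.

```python
from collections import Counter

# Hypothetical dbo:parent links (child -> parent).
PARENT = {"dbr:Audi": "dbr:Volkswagen_Group"}

# Hypothetical (news, mentioned entity) pairs.
MENTIONS = [
    ("news1", "dbr:Audi"),
    ("news1", "dbr:Volkswagen_Group"),
    ("news2", "dbr:Audi"),
    ("news3", "dbr:Tesla_Motors"),
]


def news_counts_with_children(mentions, parent):
    """Count (news, entity) mention pairs per company, crediting children to their parent."""
    counts = Counter()
    for news, entity in set(mentions):
        counts[parent.get(entity, entity)] += 1
    return counts


ranking = news_counts_with_children(MENTIONS, PARENT).most_common()
print(ranking)
```

Because it counts mention pairs rather than distinct news items, a news item that mentions both a parent and its child contributes twice to the parent, which mirrors `count(?news)` in the query.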
News Popularity Ranking: Automotive
Rank  Company                     News #  |  Rank  Company incl. mentions of controlled  News #
1     General Motors              2722    |  1     General Motors                        4620
2     Tesla Motors                2346    |  2     Volkswagen Group                      3999
3     Volkswagen                  2299    |  3     Fiat Chrysler Automobiles             2658
4     Ford Motor Company          1934    |  4     Tesla Motors                          2370
5     Toyota                      1325    |  5     Ford Motor Company                    2125
6     Chevrolet                   1264    |  6     Toyota                                1656
7     Chrysler                    1054    |  7     Renault-Nissan Alliance               1332
8     Fiat Chrysler Automobiles   1011    |  8     Honda                                 864
9     Audi AG                     972     |  9     BMW                                   715
10    Honda                       717     |  10    Takata Corporation                    547
News Popularity: Finance
Rank  Company            News #  |  Rank  Company incl. mentions of controlled  News #
1     Bloomberg L.P.     3203    |  1     Intra Bank                            261667
2     Goldman Sachs      1992    |  2     Hinduja Bank (Switzerland)            49731
3     JP Morgan Chase    1712    |  3     China Merchants Bank                  38288
4     Wells Fargo        1688    |  4     Alphabet Inc.                         22601
5     Citigroup          1557    |  5     Capital Group Companies               4076
6     HSBC Holdings      1546    |  6     Bloomberg L.P.                        3611
7     Deutsche Bank      1414    |  7     Exor                                  2704
8     Bank of America    1335    |  8     Nasdaq, Inc.                          2082
9     Barclays           1260    |  9     JP Morgan Chase                       1972
10    UBS                694     |  10    Sentinel Capital Partners             1053
Note: Including investment funds, stock exchanges, agencies, etc.
News Popularity: Banking
Rank  Company            News #  |  Rank  Company incl. mentions of controlled  News #
1     Goldman Sachs      996     |  1     China Merchants Bank *                38288
2     JP Morgan Chase    856     |  2     JP Morgan Chase                       1972
3     HSBC Holdings      773     |  3     Goldman Sachs                         1030
4     Deutsche Bank      707     |  4     HSBC                                  966
5     Barclays           630     |  5     Bank of America                       771
6     Citigroup          519     |  6     Deutsche Bank                         742
7     Bank of America    445     |  7     Barclays                              681
8     Wells Fargo        422     |  8     Citigroup                             630
9     UBS                347     |  9     Wells Fargo                           428
10    Chase              126     |  10    UBS                                   347
Note: Including investment funds, stock exchanges, agencies, etc.
Regional exposure of a company
select distinct ?country (count(*) as ?count)
from onto:disable-sameAs
where {
    { select distinct ?related_entity {
        BIND ( dbr:Toyota as ?entity )
        { ?related_entity ff-map:agentRelation ?entity . } UNION
        { BIND(?entity as ?related_entity) }
    } }
    ?news pub-old:containsMention / pub-old:hasInstance / pub:exactMatch ?related_entity .
    ?news pub:country ?country .
} group by ?country order by desc(?count)
Regional exposure – normalized
select distinct ?country (count(*) as ?count) (?count / ?country_score as ?score)
from onto:disable-sameAs
where {
    { select distinct ?related_entity {
        BIND ( dbr:BP as ?entity )
        { ?related_entity ff-map:agentRelation ?entity . } UNION
        { BIND(?entity as ?related_entity) }
    } }
    ?news pub-old:containsMention / pub-old:hasInstance / pub:exactMatch ?related_entity .
    ?news pub:country ?country .
    ?country ff-map:countryPopularityScore ?country_score .
} group by ?country ?country_score having (?count > 20) order by desc(?score)
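The normalization step in the query above divides each country's mention count by a country popularity score, so that countries which dominate the news overall do not automatically top the list. A minimal Python sketch with made-up numbers (the country codes, counts and scores are all hypothetical, as is the function name):

```python
# Hypothetical news counts per country for one company and its related entities.
COUNTRY_COUNTS = {"US": 500, "UK": 120, "AZ": 30, "FR": 10}

# Hypothetical popularity scores, standing in for ff-map:countryPopularityScore.
COUNTRY_POPULARITY = {"US": 1000.0, "UK": 100.0, "AZ": 5.0, "FR": 50.0}


def normalized_exposure(counts, popularity, min_count=20):
    """Return (country, count, score) rows with score = count / popularity, best first."""
    rows = [
        (country, n, n / popularity[country])
        for country, n in counts.items()
        if n > min_count and country in popularity  # mirrors HAVING (?count > 20)
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


for country, n, score in normalized_exposure(COUNTRY_COUNTS, COUNTRY_POPULARITY):
    print(country, n, round(score, 2))
```

The minimum-count threshold keeps countries with only a handful of mentions from producing inflated scores.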
Relationships discovery examples
• Companies that control other companies across countries
• Companies that control other companies in the same country through a company in another country
• Companies that control other companies in the same country through a company in an off-shore zone
Analytics with relations extracted from text
Subject                          Object                                        Count
dbr:Chrysler                     dbr:Fiat_Chrysler_Automobiles                 455
dbr:NASA                         dbr:Goddard_Space_Flight_Center               69
dbr:Time_Warner_Cable            dbr:Comcast                                   44
dbr:National_Football_League     dbr:New_England_Patriots                      40
dbr:DirecTV                      dbr:AT&T                                      33
dbr:Alcatel-Lucent               dbr:Nokia                                     31
dbr:AOL                          dbr:Verizon_Communications                    30
dbr:University_of_Pennsylvania   dbr:Perelman_School_of_Medicine_at_... UPEN   29
dbr:Time_Warner_Cable            dbr:Charter_Communications                    27
dbr:Continental_Airlines         dbr:United_Airlines                           26
Note: relation types "RelationOrganizationAffiliatedWithOrganization", "RelationAcquisition", "RelationMerger"
Future Work
• Comprehensive mapping of LEI data
• Experiments on Ultimate Parent discovery
• Partnership with commercial data providers
• Organizations, related in the news, but not in other datasets
• Organizations, co-occurring in the news, but not in other datasets
• Construct a profile of related entities for an organization
Wrap up
• We allow Open Data to be accessed via FIBO
− It took just a few days to clean up DBpedia’s industry classifications and control relationships
• Integrating more data sources is easy (e.g. GLEI)
− We can integrate proprietary and 3rd party data within days or weeks
• We can perform analytics on metadata
− Regional exposure, popularity of entities, relation extraction
• All integrated in proven products and solutions
− GraphDB triplestore, OpenPolicy, Dynamic Semantic Publishing platform
Thank you!
Experience the technology with NOW: Semantic News Portal
http://now.ontotext.com
Start using GraphDB and text-mining with S4 in the cloud
http://s4.ontotext.com
Learn more at our website or simply get in touch
info@ontotext.com, @ontotext
Efficient Practices for Large Scale Text Mining Process
 
Best Practices for Large Scale Text Mining Processing
Best Practices for Large Scale Text Mining ProcessingBest Practices for Large Scale Text Mining Processing
Best Practices for Large Scale Text Mining Processing
 
Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage
Build Narratives, Connect Artifacts: Linked Open Data for Cultural HeritageBuild Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage
Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage
 
Semantic Data Normalization For Efficient Clinical Trial Research
Semantic Data Normalization For Efficient Clinical Trial ResearchSemantic Data Normalization For Efficient Clinical Trial Research
Semantic Data Normalization For Efficient Clinical Trial Research
 
Gaining Advantage in e-Learning with Semantic Adaptive Technology
Gaining Advantage in e-Learning with Semantic Adaptive TechnologyGaining Advantage in e-Learning with Semantic Adaptive Technology
Gaining Advantage in e-Learning with Semantic Adaptive Technology
 

Recently uploaded

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Recently uploaded (20)

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

How to Reveal Hidden Relationships in Data and Risk Analytics

  • 1. How to Reveal Hidden Relationships in Data and Risk Analytics Ontotext Webinar, 28 Mar 2016
  • 2. Presentation Outline • Discovery and analytics case • Data integration and FIBO mapping • Discovery and analytics examples • Future work
  • 3. Relation Discovery Case • Find suspicious relationships, such as: a company in the USA controls another company in the USA through a company in an off-shore zone • Show news relevant to them
  • 4. What Does It Take to Make It Work? • Database of locations with sub-region info • Database of companies and control relations • Define the semantics of the relevant relationships (using FIBO) – sub-region and control are transitive relationships – located-in is transitive over sub-region • Define suspicious relationships:
      CONSTRUCT { ?orgA my:suspiciousLink ?orgB } WHERE {
          ?orgA ptop:locatedIn ?x ; fibo:controls ?y .
          ?y fibo:controls ?orgB ; ptop:locatedIn ?z .
          ?orgB ptop:locatedIn ?x .
          ?z a ptop:OffshoreZone .
      }
  • 5. Presentation Outline • Discovery and analytics case • Data integration and FIBO mapping • Discovery and analytics examples • Future work
  • 6. The Web of Linked Data in 2007 • Figure: diagram of linked datasets; the labelled bubbles include a structured database version of Wikipedia, a database of all locations on Earth, product reviews and a semantic synonym dictionary • Note: Each bubble represents a dataset. Arrows represent mappings across datasets, e.g. dbpedia:Paris owl:sameAs geo:2988507
  • 7. The Web of Linked Data is Gaining Mass
  • 8. The Web of Data is Gaining Mass (2011)
  • 9. The Web of Linked Data is Gaining Mass • 2013 stats: 2 289 public datasets − http://stats.lod2.eu/ • Growing exponentially − see the dotted trend line • Structured markup − Schema.org; semantic SEO • Enables better semantic tagging! − As there are more concepts and richer descriptions to refer to • Chart (Linked Data datasets per year): 2007: 27; 2008: 43; 2009: 89; 2010: 162; 2011: 295; 2012: 822; 2013: 2,289
  • 10. Data Integration and Loading • DBpedia (the English version only): 496M statements • Geonames (all geographic features on Earth): 150M statements − owl:sameAs links between DBpedia and Geonames: 471K statements • Company registry data (GLEI): 3M statements • News metadata (from NOW): 128M statements • Total size: 986M statements − 667M explicit statements + 318M inferred statements − RDFRank and geo-spatial indices enabled to allow for ranking and efficient geo-region constraints
  • 11. Global Legal Entity Identifier (GLEI) data • Global Markets Entity Identifier (GMEI) Utility data − The GMEI utility is DTCC's legal entity identifier solution, offered in collaboration with SWIFT − We downloaded a data dump from https://www.gmeiutility.org/ • RDF-ized company records − Fields: LEI#, legal name, ultimate parent, registered country − 3M explicit statements for 211 thousand organizations ▪ For comparison, there are 490 000 organizations in DBpedia, and D&B covers more than 200 million − 10,821 ultimate parent relationships and 1,632 ultimate parents − About 2 800 organizations from the GLEI dump mapped to DBpedia
  • 12. GLEI Company Data Sample: ABN-AMRO
      lei:businessRegistry        "Kamer van Koophandel"^^xsd:string
      lei:businessRegistryNumber  "34334259"^^xsd:string
      lei:duplicateReference      data:549300T5O0D0T4V2ZB28
      lei:entityStatus            "ACTIVE"^^xsd:string
      lei:headquartersCity        "Amsterdam"^^xsd:string
      lei:headquartersState       "Noord-Holland"^^xsd:string
      lei:legalForm               "NAAMLOZE VENNOOTSCHAP"^^xsd:string
      lei:legalName               "ABN AMRO Bank N.V."^^xsd:string
      lei:lei                     "BFXS5XCH7N0Y05NIXW11"^^xsd:string
      lei:registeredCity          "Amsterdam"^^xsd:string
      lei:registeredCountry       "NL"^^xsd:string
      lei:registeredPostCode      "1082 PP"^^xsd:string
      lei:registeredState         "Noord-Holland"^^xsd:string
  • 13. Global Legal Entity Identifier (GLEI) data
      Top ultimate parents by number of children:
      Rank  Ultimate parent                   Children  Country
      1     The Goldman Sachs Group, Inc.     1 851     US
      2     United Technologies Corporation   427       US
      3     Honeywell International Inc.      341       US
      4     Morgan Stanley                    228       US
      5     Cargill, Incorporated             217       US
      6     1832 Asset Management L.P.        202       CA
      7     Aegon N.V.                        174       NL
      8     Union Bancaire Privée, UBP SA     138       CH
      9     Citigroup Inc.                    135       US
      10    State Street Corporation          128       US
      Companies per country:
      Rank  Country              Companies
      1     dbr:United_States    103 548
      2     dbr:Canada           17 425
      3     dbr:Luxembourg       13 984
      4     dbr:Sweden           7 934
      5     dbr:United_Kingdom   7 421
      6     dbr:Belgium          6 868
      7     dbr:Ireland          4 762
      8     dbr:Australia        4 385
      9     dbr:Germany          3 039
      10    dbr:Netherlands      2 561
  • 14. Quick news-analytics case • Our Dynamic Semantic Publishing platform already links text with big open data graphs • One can navigate from text to concepts and get trends, related entities and news • Try it at http://now.ontotext.com
  • 15. Technology: Semantic Content Enrichment (slide footer: Dec 2015, Technology, Clients & Use Cases, Market)
  • 16. News Metadata • Metadata from Ontotext’s Dynamic Semantic Publishing platform − Automatically generated as part of the NOW.ontotext.com semantic news showcase • News stream from Google since Feb 2015, about 10k news articles per month − ~70 tags (annotations) per news article • Tags link text mentions of concepts to the knowledge graph − Technically these are URIs for entities (people, organizations, locations, etc.) and key phrases
  • 17. News Metadata
      News articles per category:
      Category                 Count
      International            52 074
      Science and Technology   23 201
      Sports                   20 714
      Business                 15 155
      Lifestyle                11 684
      Total                    122 828
      Mentions per entity type:
      Entity type              Count
      Keyphrase                2 589 676
      Organization             1 276 441
      Location                 1 260 972
      Person                   1 248 784
      Work                     309 093
      Event                    258 388
      RelationPersonRole       236 638
      Species                  180 946
  • 18. Class Hierarchy Map (by number of instances) • Left: The big picture • Right: dbo:Agent class (2.7M organizations and persons)
  • 19. Loading FIBO • FIBO = Financial Industry Business Ontology • We loaded FIBO Foundations and BE in GraphDB − About 55 RDF files from the “foundations-14-11-30” and “business-entities-15-02-23” packages • Reasoning switched to OWL 2 RL − Loading takes 3-4 seconds • Number of explicit statements: 5 433 • Number of total statements: 20 646 − Of which inferred and materialized: 15 213
  • 20. FIBO Class Hierarchy
  • 21. Explore properties related to a class
  • 22. Mapping FIBO to DBpedia • We mapped FIBO to the DBpedia Ontology − Minimalistic approach: we mapped only as much as we needed
      dbo:Organization rdfs:subClassOf fibo-fnd-org-fm:FormalOrganization .
      dbo:Company rdfs:subClassOf fibo-be-le-cb:Corporation .
      dbo:Person rdfs:subClassOf fibo-fnd-aap-ppl:Person .
      dbo:subsidiary rdfs:subPropertyOf fibo-fnd-rel-rel:controls .
    • Methodological notes − Note that fibo-fnd-rel-rel:controls is not transitive − We mapped the more specific DBpedia primitives to the more general FIBO ones, so that the data becomes “visible” through FIBO
  • 23. See open data through the FIBO lens
  • 24. Presentation Outline • Discovery and analytics case • Data integration and FIBO mapping • Discovery and analytics examples • Future work
  • 25. Semantic Press-Clipping • We can trace references to a specific company in the news − This is fairly standard, but we can also handle syntactic variations in names, because state-of-the-art Named Entity Recognition technology is used − More importantly, we correctly determine whether a given mention of “Paris” refers to Paris (the capital of France), Paris in Texas, Paris Hilton or Paris (the Greek hero) • We can trace and consolidate references to subsidiary (daughter) companies • We have a comprehensive industry classification − The one from DBpedia, but refined to accommodate identifier variations and specialization (e.g. a company classified as dbr:Bank will also be considered classified as dbr:FinancialServices)
  • 26. Mentions of related entities
      select distinct ?news ?title ?date ?rel_entity
      from onto:disable-sameAs
      where {
          BIND( dbr:Volkswagen_Group as ?entity )
          { ?entity fibo-fnd-rel-rel:controls ?rel_entity }
          UNION
          { BIND(?entity as ?rel_entity) }
          ?news pub-old:containsMention / pub-old:hasInstance / pub:exactMatch ?rel_entity .
          ?news pub-old:creationDate ?date ;
                pub-old:title ?title .
          FILTER ( (?date > "2015-04-01T00:02:00Z"^^xsd:dateTime) &&
                   (?date < "2015-05-01T00:02:00Z"^^xsd:dateTime) )
      }
  • 27. Industry distribution
      PREFIX dbo: <http://dbpedia.org/ontology/>
      PREFIX ff-map: <http://factforge.net/ff2016-mapping/>
      select distinct ?top_industry (count(?company) as ?companies)
      where {
          ?company dbo:industry ?industry .
          ?industrySum ff-map:industryVariant ?industry ;
                       ff-map:industryCenter ?top_industry .
      }
      group by ?top_industry
      order by desc(?companies)
  • 28. Most popular companies per industry
      select distinct ?pub_entity ?label (count(?news) as ?news_count)
      where {
          ?news pub-old:containsMention / pub-old:hasInstance ?pub_entity .
          ?pub_entity pub:exactMatch ?entity ;
                      pub:preferredLabel ?label .
          ?entity dbo:industry ?industry .
          dbr:Automotive ff-map:industryVariant ?industry .
      }
      group by ?pub_entity ?label
      order by desc(?news_count)
  • 29. Most popular companies, including children
      select distinct ?parent (count(?news) as ?news_count)
      where {
          {
              select distinct ?parent ?entity {
                  BIND(dbr:Software as ?industry)
                  ?industry ff-map:industryVariant ?industryVar .
                  ?parent dbo:industry ?industryVar .
                  ?parent a dbo:Company .
                  FILTER NOT EXISTS { ?parent dbo:parent / dbo:industry / ff-map:industryVariant ?industry }
                  { ?entity dbo:parent ?parent . }
                  UNION
                  { BIND(?parent as ?entity) }
              }
          }
          ?news pub-old:containsMention / pub-old:hasInstance ?pub_entity .
          ?pub_entity pub:exactMatch ?entity .
          ?news pub-old:creationDate ?date .
      }
      group by ?parent
      order by desc(?news_count)
  • 30. News Popularity Ranking: Automotive
      Rank  Company                    News #   Company incl. mentions of controlled   News #
      1     General Motors             2722     General Motors                         4620
      2     Tesla Motors               2346     Volkswagen Group                       3999
      3     Volkswagen                 2299     Fiat Chrysler Automobiles              2658
      4     Ford Motor Company         1934     Tesla Motors                           2370
      5     Toyota                     1325     Ford Motor Company                     2125
      6     Chevrolet                  1264     Toyota                                 1656
      7     Chrysler                   1054     Renault-Nissan Alliance                1332
      8     Fiat Chrysler Automobiles  1011     Honda                                  864
      9     Audi AG                    972      BMW                                    715
      10    Honda                      717      Takata Corporation                     547
  • 31. News Popularity: Finance
      Rank  Company           News #   Company incl. mentions of controlled   News #
      1     Bloomberg L.P.    3203     Intra Bank                             261667
      2     Goldman Sachs     1992     Hinduja Bank (Switzerland)             49731
      3     JP Morgan Chase   1712     China Merchants Bank                   38288
      4     Wells Fargo       1688     Alphabet Inc.                          22601
      5     Citigroup         1557     Capital Group Companies                4076
      6     HSBC Holdings     1546     Bloomberg L.P.                         3611
      7     Deutsche Bank     1414     Exor                                   2704
      8     Bank of America   1335     Nasdaq, Inc.                           2082
      9     Barclays          1260     JP Morgan Chase                        1972
      10    UBS               694      Sentinel Capital Partners              1053
      Note: Including investment funds, stock exchanges, agencies, etc.
  • 32. News Popularity: Banking
      Rank  Company           News #   Company incl. mentions of controlled   News #
      1     Goldman Sachs     996      China Merchants Bank *                 38288
      2     JP Morgan Chase   856      JP Morgan Chase                        1972
      3     HSBC Holdings     773      Goldman Sachs                          1030
      4     Deutsche Bank     707      HSBC                                   966
      5     Barclays          630      Bank of America                        771
      6     Citigroup         519      Deutsche Bank                          742
      7     Bank of America   445      Barclays                               681
      8     Wells Fargo       422      Citigroup                              630
      9     UBS               347      Wells Fargo                            428
      10    Chase             126      UBS                                    347
      Note: Including investment funds, stock exchanges, agencies, etc.
  • 33. Regional exposition of a company
      select distinct ?country (count(*) as ?count)
      from onto:disable-sameAs
      where {
          {
              select distinct ?related_entity {
                  BIND ( dbr:Toyota as ?entity )
                  { ?related_entity ff-map:agentRelation ?entity . }
                  UNION
                  { BIND(?entity as ?related_entity) }
              }
          }
          ?news pub-old:containsMention / pub-old:hasInstance / pub:exactMatch ?related_entity .
          ?news pub:country ?country .
      }
      group by ?country
      order by desc(?count)
  • 34. Regional exposition – normalized
      select distinct ?country (count(*) as ?count) (?count / ?country_score as ?score)
      from onto:disable-sameAs
      where {
          {
              select distinct ?related_entity {
                  BIND ( dbr:BP as ?entity )
                  { ?related_entity ff-map:agentRelation ?entity . }
                  UNION
                  { BIND(?entity as ?related_entity) }
              }
          }
          ?news pub-old:containsMention / pub-old:hasInstance / pub:exactMatch ?related_entity .
          ?news pub:country ?country .
          ?country ff-map:countryPopularityScore ?country_score .
      }
      group by ?country ?country_score
      having (?count > 20)
      order by desc(?score)
  • 35. Relationships discovery examples • Companies that control other companies across countries • Companies that control other companies in the same country through a company in another country • Companies that control other companies in the same country through a company in an off-shore zone
  • 36. Presentation Outline • Discovery and analytics case • Data integration and FIBO mapping • Discovery and analytics examples • Future work
  • 37. Analytics with relations extracted from text
      Subject                          Object                                             Count
      dbr:Chrysler                     dbr:Fiat_Chrysler_Automobiles                      455
      dbr:NASA                         dbr:Goddard_Space_Flight_Center                    69
      dbr:Time_Warner_Cable            dbr:Comcast                                        44
      dbr:National_Football_League     dbr:New_England_Patriots                           40
      dbr:DirecTV                      dbr:AT&T                                           33
      dbr:Alcatel-Lucent               dbr:Nokia                                          31
      dbr:AOL                          dbr:Verizon_Communications                         30
      dbr:University_of_Pennsylvania   dbr:Perelman_School_of_Medicine_at_... UPEN        29
      dbr:Time_Warner_Cable            dbr:Charter_Communications                         27
      dbr:Continental_Airlines         dbr:United_Airlines                                26
      Note: relation types "RelationOrganizationAffiliatedWithOrganization", "RelationAcquisition", "RelationMerger"
  • 38. Future Work • Comprehensive mapping of LEI data • Experiments on Ultimate Parent discovery • Partnership with commercial data providers • Organizations related in the news, but not in other datasets • Organizations co-occurring in the news, but not in other datasets • Construct a profile of related entities for an organization
  • 39. Wrap up • We allow Open Data to be accessed via FIBO − It took just a few days to clean up DBpedia’s industry classifications and control relationships • Integrating more data sources is easy (e.g. GLEI) − We can integrate proprietary and 3rd party data within days or weeks • We can perform analytics on metadata − Regional exposition, popularity of entities, relation extraction • All integrated in proven products and solutions − GraphDB triplestore, OpenPolicy, Dynamic Semantic Publishing platform
  • 40. Thank you! • Experience the technology with NOW: Semantic News Portal − http://now.ontotext.com • Start using GraphDB and text-mining with S4 in the cloud − http://s4.ontotext.com • Learn more at our website or simply get in touch − info@ontotext.com, @ontotext
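
For readers without a triplestore at hand, the suspicious-relationship pattern from slide 4 can be approximated in a few lines of plain Python. This is only a minimal sketch under stated assumptions: the company names, regions and the helper functions `regions_of` and `suspicious_links` are all hypothetical and do not come from the webinar datasets. It treats located-in as transitive over sub-region, as slide 4 describes, and follows exactly two direct control hops, as the CONSTRUCT query does.

```python
# Hypothetical toy data; none of these names come from the webinar datasets.
SUB_REGION = {            # region -> parent region
    "Delaware": "USA",
    "Texas": "USA",
    "Grand Cayman": "Cayman Islands",
}
OFFSHORE_ZONES = {"Cayman Islands"}
LOCATED_IN = {            # company -> most specific known location
    "AlphaCorp": "Delaware",
    "ShellCo": "Grand Cayman",
    "BetaCorp": "Texas",
    "GammaCorp": "Texas",
}
CONTROLS = {              # direct control edges only (controller, controlled)
    ("AlphaCorp", "ShellCo"),
    ("ShellCo", "BetaCorp"),
    ("AlphaCorp", "GammaCorp"),
}


def regions_of(company):
    """All regions a company is located in, following sub-region links upward."""
    regions = set()
    region = LOCATED_IN.get(company)
    while region is not None and region not in regions:
        regions.add(region)
        region = SUB_REGION.get(region)
    return regions


def suspicious_links():
    """Pairs (org_a, org_b) sharing a region and linked via an offshore company."""
    links = set()
    for org_a, middle in CONTROLS:
        # The intermediate company must sit in an offshore zone.
        if not regions_of(middle) & OFFSHORE_ZONES:
            continue
        for controller, org_b in CONTROLS:
            if controller != middle:
                continue
            # Both end points must share at least one region (e.g. the same country).
            if regions_of(org_a) & regions_of(org_b):
                links.add((org_a, org_b))
    return links


print(sorted(suspicious_links()))
```

In a reasoning-enabled triplestore the transitive location facts would be materialized by inference at load time rather than recomputed per lookup as `regions_of` does here; the sketch only mirrors the shape of the graph pattern.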