#AICRECSYS
ADVANsse
Advances in social semantic enterprise
HTTP://ADVANSSE.DERI.IE/
MACIEJ DABROWSKI, BENJAMIN HEITMANN, CONOR HAYES, KEITH GRIFFIN
10TH JULY 2013
About me
MACIEJ DABROWSKI
maciej.dabrowski@deri.org
[Figure: profile graph with edge labels name, contact, graduated, researcherAt, lecturerAt, worksWith, co-PI]
Overview
THIS TALK
RESEARCH
INDUSTRY
1.  WHY?
2.  WHAT?
3.  HOW?
4.  TECHNICAL DECISIONS
5.  LESSONS LEARNED
Why? What? How?
technical considerations
lessons learned
Various information domains
preferences
recommendations
implicit
connections
User profile
TRAVEL
FOOD
SPORTS
POLITICS
??
Use Case: Enterprise Social Web
Enterprise social web
ENTERPRISE INFORMATION SPACE
MARKETING
DEVELOPMENT
R & D
ANDREW
BOB
CECILIA
DANNY
Limited information flow
MARKETING
DEVELOPMENT
R & D
"GREAT TOOL!"
"MEETING IBM"
"TALK BY DERI"
ANDREW
BOB
CECILIA
DANNY
ENTERPRISE INFORMATION SPACE
Disconnected Social Networks
?	
  
ANDREW
BOB
CECILIA
DANNY
MARKETING
DEVELOPMENT
R & D
Distributed Social Platforms
?	
  
MARKETING
DEVELOPMENT
R & D
Problem 1: information overload and discovery
Problem 2: data level issues
DISTRIBUTION
MULTIPLE DOMAINS AND TYPES OF ENTITIES
PEOPLE
INTERESTS
CONTENT
Requirements - personalization
USE BACKGROUND KNOWLEDGE
ALLOW CROSS-DOMAIN MULTI-SOURCE PERSONALIZATION
EXPLOIT SOCIAL GRAPH
ALLOW REAL-TIME APPLICATIONS
Requirements - data
DATA LEVEL:
•  FLEXIBLE
•  COMPACT
•  ENABLE CRUD
•  GRAPH?
TRANSPORT PROTOCOL:
•  RELIABLE
•  EFFICIENT
•  PUBSUB?
What?
A PLATFORM BASED ON OPEN STANDARDS
THAT IS EASILY PLUGGABLE INTO EXISTING
INFRASTRUCTURES AND THAT EXPLOITS
LEGACY INFORMATION, SOCIAL GRAPH
AND INTEREST GRAPH TO PROVIDE A
PERSONALIZED INFORMATION
“DASHBOARD” IN NEAR REAL-TIME.
use cases
HOW? A look inside
Step 1: Exploit distributed (social) graphs
http://www.insidefacebook.com/wp-content/uploads/2013/06/shutterstock_107108318.jpg
Step 2: Exploit interest graphs
BENEFITS OF USING INTEREST GRAPHS:
1.  FLEXIBLE SOURCE OF BACKGROUND KNOWLEDGE
2.  ANY DATASET CAN BE “PLUGGED-IN” IF NEEDED
3.  CROSS-DOMAIN RECOMMENDATIONS
4.  VERY GOOD AT DISCOVERING INTERESTING RECOMMENDATIONS
OUR APPROACH: SPREADING ACTIVATION
Interest graphs
[Figure: example interest graph.
Nodes: DERI, Maciej, Maurice, Blog, Post2, "Emerging Technology", http://dbpedia.org/resource/Data_analytics, http://dbpedia.org/resource/Emerging_technologies.
Edge labels: works at, sioc:creator_of, sioc:topic, interest, recommended interest, owl:sameAs.
Layers: Social Software Entities; Additional Profile Knowledge; External Background Knowledge (DBPedia + domain datasets).]
Expanded User Profile (EUP): includes both original and recommended interests
Our Approach
	

	

A PLATFORM FOR SOCIAL NETWORKS:
§  ENTERPRISE FOCUS: PEOPLE, COMMUNITIES, INFORMATION
§  EFFICIENCY USING XMPP PUBSUB AND SPARQL 1.1 UPDATE
§  EXPLOIT INTEREST GRAPH AND VARIOUS DATA SOURCES
TO PROVIDE PERSONALIZATION THROUGH SOPHISTICATED
NEAR REAL-TIME RECOMMENDATIONS
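To make the SPARQL 1.1 Update part of the approach more concrete, here is a minimal Python sketch of how a change on a social platform might be serialised as an INSERT DATA request before being published over XMPP PubSub. The function names, the example post IRI and the idea of sending the update string as the message payload are assumptions for illustration only; the slides do not show the actual payload format, and literal escaping is simplified.

```python
# Minimal sketch: build a SPARQL 1.1 Update (INSERT DATA) string from triples.
# term() and insert_data() are hypothetical helper names, not part of ADVANSSE.

def term(value: str) -> str:
    """Render http(s) IRIs in angle brackets, anything else as a plain literal."""
    if value.startswith(("http://", "https://")):
        return f"<{value}>"
    # Simplified escaping: only backslashes and double quotes are handled.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def insert_data(triples):
    """Return an INSERT DATA request containing one line per triple."""
    lines = [f"  {term(s)} {term(p)} {term(o)} ." for s, p, o in triples]
    return "INSERT DATA {\n" + "\n".join(lines) + "\n}"


if __name__ == "__main__":
    update = insert_data([
        (
            "http://example.org/post/2",  # hypothetical post IRI
            "http://rdfs.org/sioc/ns#topic",
            "http://dbpedia.org/resource/Emerging_technologies",
        ),
    ])
    print(update)
```

A subscriber receiving such a string could apply it to its own RDF store, which is one way to keep distributed stores in sync without polling.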
Demonstrator
EASY TO INTEGRATE WITH CISCO INFRASTRUCTURE
OPEN STANDARDS (XMPP, SPARQL 1.1 UPDATE)
SCALABLE RECOMMENDATIONS BASED ON SOCIAL
GRAPH WITH OVER 10M ENTITIES AND 40M EDGES,
COMPUTED IN UNDER 1 SECOND (0.2S ON AVERAGE).
MORE DETAILS: HTTP://ADVANSSE.DERI.IE/
demonstrator
Prototype stats
SOCIAL NETWORK GRAPH:
•  100S OF USERS
•  100S OF POSTS
•  500+ TAGS
•  2000+ ENTITIES
•  15000+ EDGES
Saffron.deri.ie
BACKGROUND KNOWLEDGE GRAPH:
•  11M ENTITIES
•  40M EDGES
CROSS-DOMAIN GRAPH:
•  3956 RESEARCH ARTICLES
•  LANGUAGE CONFERENCES
Why? What? How?
technical considerations
lessons learned
Technical considerations
ALGORITHM:
•  SEMANTIC NETWORK
•  LARGE DATASET
•  ITERATIVE GRAPH ALGORITHM
•  STATEFUL NODES
•  EMBEDDING OF DOMAIN LOGIC
Technical considerations
NON-NATIVE IMPORT OF RDF
STARTUP TIME WITH DBPEDIA
•  12 MIN TO LOAD ON 24 CORES, 96GB RAM
PARALLEL PROCESSING OF ACTIVATIONS
•  STATE FOR EACH USER AT EACH NODE
SCALABILITY ISSUES
LACK OF GLOBAL ALGORITHM CONTROL
IMMATURE CODE BASE, LACK OF DOCUMENTATION
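The non-native RDF import listed above implies a pre-processing step: according to the editor's notes, Giraph expects each graph node to be described on exactly one line, whereas N-Triples spreads a node's statements over many lines. The following Python sketch shows one possible way to merge them. The tab-separated output layout and the function name are assumptions chosen for illustration and are not Giraph's actual input format.

```python
# Minimal sketch: merge N-Triples statements so each subject occupies one line.
# The output layout (subject, then tab-separated "predicate object" pairs) is
# hypothetical; a real import would need a matching custom input reader.
from collections import defaultdict


def merge_by_subject(ntriples_lines):
    """Group statements by subject, keeping first-seen subject order."""
    edges = defaultdict(list)
    for line in ntriples_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Subject and predicate never contain whitespace; the object may
        # (for example a literal with spaces), so split at most twice.
        subject, predicate, rest = line.split(None, 2)
        obj = rest.rstrip()
        if obj.endswith("."):
            obj = obj[:-1].rstrip()  # drop the statement terminator
        edges[subject].append((predicate, obj))
    return [
        subject + "\t" + "\t".join(f"{p} {o}" for p, o in pairs)
        for subject, pairs in edges.items()
    ]


if __name__ == "__main__":
    sample = [
        "<http://example.org/a> <http://example.org/p> <http://example.org/b> .",
        "<http://example.org/b> <http://example.org/p> <http://example.org/c> .",
        '<http://example.org/a> <http://example.org/q> "x y" .',
    ]
    for merged_line in merge_by_subject(sample):
        print(merged_line)
```

Holding all edges in memory is fine for a small sample; for a DBpedia-sized dump, sorting the file by subject first and streaming over it would be the more realistic variant.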
Technical considerations
NATIVE SUPPORT FOR RDF
DBPEDIA (5.46GB) COMPRESSED TO 436MB
LOW MEMORY REQUIREMENTS
LOW STARTUP TIME (90S)
FAST QUERY ACCESS (< 1 MS)
Server design
[Architecture diagram. Building blocks: XMPP, spreading activation, HDT.
ADVANSSE connected social platform: XMPP client (Ignite Smack); web application (Tomcat + Servlet); RDF store (Jena Fuseki).
XMPP server: Ignite OpenFire.
ADVANSSE server: XMPP client (Ignite Smack); R/W RDF store (Jena Fuseki); personalisation component; recommendation algorithm; fast, R/O RDF store (HDT); link resolver; RDF store (Jena Fuseki).
Connections are labelled XMPP, SPARQL, Java API and file import.]
configuration
•  DISTANCE CONSTRAINT DISABLED
•  FANOUT CONSTRAINT ENABLED
•  10 TARGET ACTIVATIONS
•  ACTIVATION THRESHOLD 0.5
•  INITIAL ACTIVATION 4.0
•  MAXIMUM OUT EDGES 500
•  A MAXIMUM OF 10 WAVES AND 1 PHASE
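To illustrate how these settings interact, here is a minimal Python sketch of a generic spreading activation loop that uses the configuration values above as defaults. It is a simplified textbook-style variant, not the ADVANSSE implementation: the function name, the adjacency-list graph, the rule that each node fires once, and the equal split of activation over out-edges are all assumptions. There is no distance decay (distance constraint disabled), and a single call corresponds to one phase.

```python
# Minimal sketch of spreading activation with the slide's settings as defaults.
# spread(), the graph layout and the firing rules are illustrative assumptions.

def spread(graph, seeds, is_target, initial=4.0, threshold=0.5,
           max_out_edges=500, target_count=10, max_waves=10):
    """Return up to target_count (node, activation) pairs, strongest first."""
    activation = {seed: initial for seed in seeds}
    fired = set()            # nodes that have already passed activation on
    frontier = list(seeds)
    found = []               # activated target nodes, in discovery order
    for _wave in range(max_waves):
        received = {}
        for node in frontier:
            fired.add(node)
            out = graph.get(node, [])
            # Fan-out constraint: hubs with too many out-edges do not spread,
            # and the activation of other nodes is split over their out-edges.
            if not out or len(out) > max_out_edges:
                continue
            share = activation[node] / len(out)
            for neighbour in out:
                received[neighbour] = received.get(neighbour, 0.0) + share
        frontier = []
        for node, value in received.items():
            activation[node] = activation.get(node, 0.0) + value
            if node in fired or activation[node] < threshold:
                continue
            frontier.append(node)
            if is_target(node):
                found.append(node)
        if len(found) >= target_count or not frontier:
            break
    ranked = sorted(found, key=lambda n: -activation[n])
    return [(n, activation[n]) for n in ranked[:target_count]]


if __name__ == "__main__":
    # Tiny hypothetical graph: a user with two interests linking to topics.
    graph = {"user": ["a", "b"], "a": ["c", "d"], "b": ["d"]}
    print(spread(graph, ["user"], lambda n: n in {"c", "d"}))
```

In this toy graph, topic "d" is reached through both interests and therefore ranks above "c", which is the effect that lets shared background knowledge surface cross-domain recommendations.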
stats
DATASET:
•  371 USERS
•  6 INTERESTS ON AVERAGE
•  DEGREE 2-5, UP TO 51
200MS AVERAGE EXECUTION
85% COVERAGE
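The coverage figure can be cross-checked against the editor's notes, which mention 56 users without enough recommendations. The short Python sketch below assumes that this note refers to the same 371-user dataset and that coverage means the share of users who did receive enough recommendations; neither assumption is stated explicitly in the slides.

```python
# Minimal sketch: coverage as the share of users with enough recommendations.
# The pairing of 371 users with the 56 uncovered users is an assumption.

def coverage(total_users: int, users_without_enough: int) -> float:
    """Fraction of users that received enough recommendations."""
    return (total_users - users_without_enough) / total_users


if __name__ == "__main__":
    print(f"{coverage(371, 56):.1%}")
```

Under these assumptions the result rounds to the 85% reported on the slide.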
The value
	

SOCIAL CAPITAL IN ENTERPRISE SOCIAL NETWORKS IS NOT FULLY EXPLOITED.
ENTERPRISE SOCIAL PLATFORMS ARE DISTRIBUTED AND INCLUDE VARIOUS SOURCES OF INFORMATION.
VALUABLE INFORMATION IN AN ORGANIZATION IS NOT DISCOVERED BY THE RELEVANT EMPLOYEES.

DISCOVER AND CONNECT WITH RELEVANT PEOPLE IN THE ORGANIZATION.
AGGREGATE INFORMATION FROM VARIOUS DISTRIBUTED SOCIAL PLATFORMS USING OPEN STANDARDS.
PROVIDE NEAR REAL-TIME PERSONALIZATION BASED ON LARGE, DYNAMIC GRAPH DATA.
Why? What? How?
technical considerations
lessons learned
Lessons learned
•  GREATER RELEVANCE TO REAL PROBLEMS
•  CLEARER REQUIREMENTS (AND MORE)
•  ACCESS TO ACTUAL USAGE DATA (REAL USERS)
•  PATENTS VS. PUBLISHING
•  PROTOTYPE INTEGRATION CONSUMES RESOURCES
•  MORE FOCUS ON FEATURE DEVELOPMENT
•  LESS EXPLORATION AND HYPOTHESIS TESTING
major considerations
ACCESS TO INDUSTRY DATA
INTEGRATION WITH THE PRODUCT?
https://www.keytrac.net/assets/industry-social-networks.jpg
http://www.autointhenews.com/wp-content/uploads/2010/05/volvo-s60-crash-video-image.jpg
Summary
PROBLEM
§  INFORMATION OVERLOAD AND INEFFICIENT INFORMATION DISCOVERY IN DISTRIBUTED ENTERPRISE SOCIAL NETWORKS
SOLUTION
§  RECOMMENDER SYSTEM THAT EXPLOITS SOCIAL GRAPH
§  UTILIZE INTEREST GRAPH AND LEGACY INFORMATION
§  NEAR REAL-TIME PERSONALIZATION
TECHNOLOGY
§  OPEN SOURCE COMPONENT FOR RDF DATA AGGREGATION USING XMPP AND SPARQL 1.1 UPDATE
§  PERSONALIZATION COMPONENT BASED ON SPREADING ACTIVATION APPLICABLE TO MULTI-SOURCE, CROSS-DOMAIN DATA
ENORMOUS VALUE IN INDUSTRY-ACADEMIA COLLABORATIONS
CONTACT: MACIEJ.DABROWSKI@DERI.ORG
@MACDAB

Near real-time recommendations in enterprise social networks

Editor's notes

  1. Broader picture: users expect personalised experiences. Preferences are distributed and cover many domains. New site: no user profile, no recommendations. Goal: use any user information for recommendations from any target domain or data set. Domain -
  2. This problem is also visible in enterprises, where a single company spans different domains, social platforms, etc.
  3. SHOW DEMO
  4. Non-native import of RDF: the Giraph paradigm for reading input data is incompatible with reading N-Triples files, as it assumes each node of the graph is described on exactly one line of the input data. Lines must be merged in a pre-processing step. In addition, Giraph does not support true multi-graphs.
  5. Out of the 56 users without enough recommendations, 64% had only 2 out-links and 20% had only 3 out-links (
  6. ----- Meeting Notes (01/05/2013 14:15) -----netvibes