Microblogging: A Semantic Web and Distributed Approach

Alexandre Passant (1), Tuukka Hastrup (2), Uldis Bojārs (2), John Breslin (2)
(1) LaLIC, Université Paris-Sorbonne
(2) Digital Enterprise Research Institute, National University of Ireland, Galway

Scripting For the Semantic Web (SFSW2008)
Tenerife, Spain, 2008-06-02
www.deri.ie
Copyright 2008 Digital Enterprise Research Institute. All rights reserved.

Microblogging overview

• Sweet spot between blogging and instant messaging
• Short status notification updates
– Share your real life with others!

Why and how does it work?

• A ubiquitous network of communication
– Various communication channels: Web, phone messages, e-mail
– Simple approach for publishing data, following and replying
• A fluid network for information exchange in real-time

• Services
– Online platforms: Twitter, Jaiku, Pownce ...
– Plug-ins for existing services: Prologue for WordPress

• Microblogging in organisations?
– Corporate microblogging: real-time Q&A
– Extends the Enterprise 2.0 vision
• Internal Signals (SLATES)

Issue #1: Data ownership and portability

• A centralised approach
– Need to register with yet another social platform
– “Social Network Fatigue”
– Users of different services cannot communicate
– Would you use a webmail service that only lets you send mail to
people using the same provider?
• Users do not own the content they publish
– It belongs to a proprietary and closed service
– What if it closes? How do I move my data between services?
– Would you sign up for a webmail service that provides neither POP
nor SMTP?
• Users do not own their social network
– And cannot reuse existing ones: invite, again, again and again ...
• Yet, Twitter provides XFN export of people you follow

Issue #2: Meta-data

• Lack of unified, machine-readable meta-data
– Unified queries over a set of services?
• All microblog content posted ten days ago?
– APIs?
• For each service, a new API must be learnt

• Extract machine-readable meta-data from Twitter
– Merge RSS feeds with the XML export available for each update
– Map the resulting data to Semantic Web vocabularies
• Dublin Core, SIOC ...
– Use Sindice / SWSE to guess URIs of people
• From a user name to a FOAF URI (as in SWAML)

– A complex process, limited to the latest updates (RSS-based)
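
The mapping step above can be sketched as follows. This is a minimal illustration, not the authors' extraction pipeline: the sample feed, the `rss_to_properties` helper and the choice of `sioc:content` / `dct:created` as target properties are assumptions made for the example, and the real Twitter feed layout may differ.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Made-up feed for illustration; not an actual Twitter export.
SAMPLE_RSS = """<rss version="2.0"><channel>
  <item>
    <title>alice: Arrived in Tenerife</title>
    <link>http://example.org/alice/statuses/1</link>
    <pubDate>Mon, 02 Jun 2008 09:30:00 +0000</pubDate>
  </item>
</channel></rss>"""


def rss_to_properties(rss_text):
    """Map each RSS item to a dict keyed by vocabulary property names."""
    posts = []
    for item in ET.fromstring(rss_text).iter("item"):
        # RSS dates use the RFC 822 format; convert them to ISO 8601.
        created = parsedate_to_datetime(item.findtext("pubDate"))
        posts.append({
            "uri": item.findtext("link"),
            "sioc:content": item.findtext("title"),
            "dct:created": created.isoformat(),
        })
    return posts


posts = rss_to_properties(SAMPLE_RSS)
```

Because the input is an RSS feed, a sketch like this only ever sees the latest updates, which is the limitation noted above.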

Issue #3: Content semantics

• Lack of semantics in status updates
– Updates dealing with programming languages?
– What happens in my neighbourhood?
• Want to extend meta-data
– Locations the post talks about
• Hash tags? They lead to the same issues as tagging
– Ambiguity
• #paris? #swig?
– Heterogeneity
• #semweb, #websemantique
– Lack of organisation
• How to relate #rdfa and #semanticweb?
• Which tags to follow if I’m interested in the Semantic Web?

Our approach to microblogging

• Goal: To provide an open and flexible alternative to
current microblogging systems
– Distributed, open, user-controlled, reusable, scalable, based on
standards

• Means: The Semantic Web!
– SIOC and FOAF as the main vocabularies
– Semantics for both meta-data and status content
– Linked Data principles

• Proof of concept: SMOB
– Open-source software for distributed microblogging
– An ecosystem of distributed publishers and aggregators

A common model for meta-data

• Modelling users (physical persons) with FOAF
– Friend Of A Friend
– Ability to reuse one’s personal profile created from an external
application (LiveJournal, Flickr exporter ...)
– Interlinking various profile URIs on the Web using Linked Data
principles

• Modelling accounts and data with SIOC
– Semantically-Interlinked Online Communities
– Linking an existing FOAF profile to an online account, instead of
creating yet another disconnected one
– Extended with Microblog and MicroblogPost classes
• Subclasses of Container and Item
– Use other SIOC / DC properties to model the data
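
As a rough illustration of this model, the sketch below serialises one status update as Turtle. The property choices (`sioct:MicroblogPost`, `foaf:maker`, `sioc:content`, `dct:created`) follow the SPARQL query shown later in these slides; the `post_to_turtle` helper and the example.org URIs are hypothetical, and SMOB itself builds its RDF with the SIOC PHP API rather than with code like this.

```python
from datetime import datetime, timezone

PREFIXES = """\
@prefix sioc: <http://rdfs.org/sioc/ns#> .
@prefix sioct: <http://rdfs.org/sioc/types#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
"""


def post_to_turtle(post_uri, maker_uri, content, created):
    """Serialise one status update as Turtle (hypothetical helper)."""
    # Escape characters that are not allowed in a short Turtle string.
    escaped = (content.replace("\\", "\\\\")
                      .replace('"', '\\"')
                      .replace("\n", "\\n"))
    return (
        f"<{post_uri}> a sioct:MicroblogPost ;\n"
        f"    foaf:maker <{maker_uri}> ;\n"
        f'    sioc:content "{escaped}" ;\n'
        f'    dct:created "{created.isoformat()}"^^xsd:dateTime .\n'
    )


turtle = PREFIXES + "\n" + post_to_turtle(
    "http://example.org/smob/post/1",      # illustrative post URI
    "http://example.org/foaf.rdf#me",      # an existing FOAF profile URI
    "Presenting SMOB at SFSW2008",
    datetime(2008, 6, 2, 10, 0, tzinfo=timezone.utc),
)
```

The point of the sketch is that the post links to an existing FOAF URI through `foaf:maker` instead of defining a new, disconnected user record.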

FOAF + SIOC: Semantics for data portability

Post example with the Tabulator

Modelling content of status updates

• URIs instead of hash tags
– Uniform description of resources (DBpedia ...)
– Modelled using sioc:topic between the content and the URI
• Microblogging enters the Linked Data Web!
– Need to find a user-friendly way to bridge this gap

• Prefixed hash tags
– #dbp:Eiffel_Tower - Simple DBpedia mapping
• http://dbpedia.org/resource/Eiffel_Tower
– #geo:Paris,France - Using geonames.org webservice
• Querying the service to retrieve location URI

• Can be used in lookup services such as Sindice
– New ways to discover content
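
The prefixed hash tags above can be recognised with a regular expression. The following is a minimal sketch under stated assumptions: the pattern, the `interpret_tags` name and the decision to leave `#geo:` tags unresolved are illustrative, since resolving a place name requires a call to the geonames.org web service, which is not performed here.

```python
import re
from urllib.parse import quote

# Matches "#dbp:Something" or "#geo:Something"; the value runs up to the
# next whitespace or "#" (so trailing punctuation would be included).
TAG_PATTERN = re.compile(r"#(dbp|geo):([^\s#]+)")


def interpret_tags(status):
    """Return (prefix, value, uri) tuples for each prefixed hash tag.

    The URI is None for #geo: tags, which would need a geonames.org lookup.
    """
    results = []
    for prefix, value in TAG_PATTERN.findall(status):
        if prefix == "dbp":
            # Simple mapping: append the tag value to the DBpedia namespace.
            uri = "http://dbpedia.org/resource/" + quote(value, safe="_(),")
        else:
            uri = None
        results.append((prefix, value, uri))
    return results


tags = interpret_tags("Visiting #dbp:Eiffel_Tower in #geo:Paris,France today")
```

Each resolved URI would then be attached to the post with sioc:topic, as described above.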

A distributed architecture

• Vision: Open, distributed
– Follow the spirit of the Web architecture
– A network of publishing services and aggregation servers
interacting with each other
– A microblogging ecosystem
– New providers or aggregators can be added at any time,
anywhere on the network
– Provide standards, methods and open-source tools rather than a
closed proprietary approach

Architecture overview

Data ownership

• A publisher stores its content locally, then provides it to
aggregators, which cache it in a triple store
– Data belongs to the user
– If an aggregator closes, data is still there
– Available in RDF: Mashable, browsable, linkable ...
– Can be combined with other Social Media Contributions modeled
using SIOC
• Retrieve all blog posts and microblogging updates of the last week

• Focusing on ideas from “A bill of rights for the Social
Web”
– http://opensocialweb.org/2007/09/05/bill-of-rights/
– Ownership, Control, Freedom
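
The combined query mentioned above (all blog posts and microblogging updates of the last week) could look roughly like the sketch below. It assumes that both kinds of content carry `sioc:content` and `dct:created`, and that the store can compare `xsd:dateTime` values in a FILTER; neither point is confirmed by these slides, and `recent_items_query` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def recent_items_query(now, days=7):
    """Build a SPARQL query for items created in the last `days` days.

    `now` must be a timezone-aware datetime.
    """
    cutoff = (now.astimezone(timezone.utc) - timedelta(days=days))
    cutoff_text = cutoff.strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"""PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?item ?date ?content
WHERE {{
  ?item sioc:content ?content ;
        dct:created ?date .
  FILTER (xsd:dateTime(?date) >= "{cutoff_text}"^^xsd:dateTime)
}} ORDER BY DESC(?date)"""


query = recent_items_query(datetime(2008, 6, 2, 12, 0, tzinfo=timezone.utc))
```

No type constraint is used, so any SIOC item with content and a creation date matches, whether it came from a blog or a microblog.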

SMOB: A prototype for semantic microblogging

• SMOB
– http://smob.sioc-project.org
– Open-source client and server software to demonstrate
principles of our approach
– Early stage of development
• First prototype in a day and very few lines of PHP
– Still a prototype, with some challenges remaining:
• Scalability
• SPARQL query complexity on the server side
• Authentication
• A public SMOB aggregator and anonymous publishing
client deployed
– 3 weeks, 10 users, 90 posts

Publishing content with SMOB

• Reusing your FOAF profile
– Creating RDF data using the SIOC PHP API
• Publishing to various aggregators
– Twitter integration: promote the Semantic Web by using it for your tweets!

Browsing local content

• Lists the latest updates, with embedded RDFa

Storing aggregated content in SMOB server

• Aggregators receive pings and cache the RDF
documents in real-time

• Hash tag interpretation with regular expressions
– geonames.org wrapper for #geo: tags
– DBpedia links for #dbp: tags

• Based on the ARC2 API for storage / queries and Exhibit
for the browsing interface
– SPARUL “LOAD” pattern to get data
– SPARQL to format data to Exhibit JSON
– Exhibit for faceted browsing
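
The ping-then-load step can be sketched as follows. This only builds the SPARUL statement that an aggregator would hand to its store after a ping; the `load_statement` helper and its validation rules are assumptions for illustration, and the actual SMOB server does this in PHP through ARC2.

```python
from urllib.parse import urlparse

# Characters that may not appear inside a SPARQL IRI reference.
FORBIDDEN = '<>"{}|^`\\ '


def load_statement(document_url):
    """Build a SPARUL LOAD statement for a pinged RDF document."""
    parts = urlparse(document_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("expected an absolute http(s) URL")
    if any(ch in document_url for ch in FORBIDDEN):
        raise ValueError("URL contains characters not allowed in an IRI")
    return f"LOAD <{document_url}>"


statement = load_statement("http://example.org/smob/data/1.rdf")
```

Validating the URL before interpolating it matters here because the ping comes from a remote publisher and ends up inside a query string.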

SPARQL query example

• Retrieve the latest updates from the server (duplicates removed in PHP)

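PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX sioct: <http://rdfs.org/sioc/types#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dct: <http://purl.org/dc/terms/>

SELECT ?post ?date ?content ?maker ?name ?depiction
WHERE {
  ?post rdf:type sioct:MicroblogPost ;
        foaf:maker ?maker ;
        sioc:content ?content ;
        dct:created ?date .
  ?maker foaf:name ?name .
  # The maker's picture may be given as foaf:img or foaf:depiction
  { ?maker foaf:img ?depiction } UNION
  { ?maker foaf:depiction ?depiction }
} ORDER BY DESC(?date) LIMIT 20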
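
A maker whose profile has both foaf:img and foaf:depiction matches both branches of the UNION, so the same post can come back more than once; this is presumably why the results are made unique after the query. A minimal sketch of that post-processing, written in Python rather than the PHP used by SMOB, and assuming each result row is a dict keyed by variable name:

```python
def unique_posts(rows):
    """Keep the first row seen for each ?post value, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        if row["post"] not in seen:
            seen.add(row["post"])
            unique.append(row)
    return unique


rows = [
    {"post": "http://example.org/post/2", "depiction": "http://example.org/a.png"},
    {"post": "http://example.org/post/2", "depiction": "http://example.org/b.png"},
    {"post": "http://example.org/post/1", "depiction": "http://example.org/a.png"},
]
latest = unique_posts(rows)
```

Since LIMIT 20 applies before this step, fewer than 20 distinct posts may remain once duplicates are dropped.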

Faceted browsing with geolocation

Security, privacy, authentication

• We currently limit access to publishing, aggregation and
content viewing by HTTP authentication and API keys
– IP-based authentication using .htaccess
– Global API key for a microblogging aggregator

• All updates are public on the client side

• TODO
– Authentication schemes (OAuth, OpenID)
– Private updates and private communities

Future work

• More meta-data
– Process hash tags before publishing RDF
• Linked Data from the client-side
• Tags / URIs relationships with MOAT
– @replies, linked to FOAF URIs
• Other issues
– Scalability, authentication, timezones
• Intelligent aggregators
– Browse the SIOC-o-sphere to find relevant updates
– Based on their content:
• A music aggregator, retrieving only data that links to URIs of music bands
• Deployment within organisations
– Corporate Microblogging in SIOC-based companies

Thank you!

• Contacts
– http://smob.sioc-project.org
– #smob IRC channel on Freenode
– sioc-dev on Google Groups

• SDoW2008
– Social Data on the Web workshop @ ISWC2008
