2014 ESWC Tutorial Handson 2: Transforming Twitter Data into SIOC, glimpse
1. Social Web: Where are the Semantics?
ESWC 2014
Miriam Fernández, Victor Rodríguez,
Andrés García-Silva, Oscar Corcho
Ontology Engineering Group, UPM, Spain
Knowledge Media Institute, The Open University
2. We have already learned what SIOC is
ESWC 2014 Social Web: Where are the Semantics? 2
3. But just to make it a bit clearer…
• Open it with your favourite ontology editor
4. Let's try to model some Twitter data with SIOC
• Let's recall what sort of data Twitter gives us…
– Information about the post
– Information about the user
5. Let's try to model some Twitter data with SIOC
• Let's recall what sort of data Twitter gives us…
– Entities, including hashtags, user mentions and urls
6. Hands on: Using SIOC to model Twitter Data
[Diagram: SIOC model of Twitter data. Labels recovered from the figure, grouped by the resource they appear to describe:]
• Post (sioct:MicroblogPost, identified by the tweet URL)
– sioc:content: tweet text
– dcterms:created: tweet creation time
– sioc:has_container / sioc:container_of: sioct:Microblog
– sioc:has_creator / sioc:creator_of: sioc:UserAccount
– sioc:topic: sioct:Tag, with sioc:name holding the extracted hashtag
– sioc:links_to: extracted link
– geo:long, geo:lat: tweet longitude and latitude
• User account (sioc:UserAccount)
– sioc:name: screen name
– dcterms:title: user name
– dcterms:created: account creation time
– sioc:note: account description
– sioc:avatar: avatar URL
– other values shown: user Twitter homepage, user ID
• Site and containers
– sioc:has_space / sioc:space_of: sioc:Site (Twitter homepage)
– sioc:Container: Twitter list ID
• Other labels shown in the figure: sioc:reply_of / sioc:has_reply, sioc:mentions, sioc:follows, sioc:subscriber_of / sioc:has_subscriber, sioc:isPartOf / sioc:hasPart, sioc:has_owner / sioc:owner_of, sioc:forwarded_by, sioc:addressed_to, sioc:about, gn:Feature, geo:Point, geo:location
• How to model retweets?
• How to distinguish replies from retweets?
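A few of the mappings in the diagram (sioct:MicroblogPost, sioc:content, dcterms:created, sioc:has_creator, sioc:UserAccount, sioc:name, sioc:topic, sioct:Tag) can be sketched in plain Java as below. This is only an illustration, not the code from the tutorial's Gist: it writes N-Triples by hand instead of using Jena so that it has no dependencies, and the `Tweet` holder class, the blank-node naming and the example URLs are assumptions made for the sketch. It leaves the two questions about retweets and replies open.

```java
import java.util.List;

public class SiocTweetSketch {
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
    static final String SIOC = "http://rdfs.org/sioc/ns#";
    static final String SIOCT = "http://rdfs.org/sioc/types#";
    static final String DCTERMS = "http://purl.org/dc/terms/";

    // Hypothetical holder for the few tweet fields used in this sketch.
    public static final class Tweet {
        final String url, text, created, userUrl, screenName;
        final List<String> hashtags;

        public Tweet(String url, String text, String created,
                     String userUrl, String screenName, List<String> hashtags) {
            this.url = url;
            this.text = text;
            this.created = created;
            this.userUrl = userUrl;
            this.screenName = screenName;
            this.hashtags = hashtags;
        }
    }

    static String iri(String value) {
        return "<" + value + ">";
    }

    // Escapes backslashes, quotes and line breaks for an N-Triples literal.
    static String literal(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"")
                .replace("\n", "\\n").replace("\r", "\\r") + "\"";
    }

    static String triple(String s, String p, String o) {
        return s + " " + p + " " + o + " .\n";
    }

    public static String toNTriples(Tweet t) {
        String post = iri(t.url);
        String user = iri(t.userUrl);
        StringBuilder sb = new StringBuilder();
        // The post itself: type, text, creation time, creator.
        sb.append(triple(post, iri(RDF_TYPE), iri(SIOCT + "MicroblogPost")));
        sb.append(triple(post, iri(SIOC + "content"), literal(t.text)));
        sb.append(triple(post, iri(DCTERMS + "created"), literal(t.created)));
        sb.append(triple(post, iri(SIOC + "has_creator"), user));
        // The user account and its screen name.
        sb.append(triple(user, iri(RDF_TYPE), iri(SIOC + "UserAccount")));
        sb.append(triple(user, iri(SIOC + "name"), literal(t.screenName)));
        // One blank node per hashtag (an assumption; tags could also get URIs).
        for (int i = 0; i < t.hashtags.size(); i++) {
            String tag = "_:tag" + i;
            sb.append(triple(post, iri(SIOC + "topic"), tag));
            sb.append(triple(tag, iri(RDF_TYPE), iri(SIOCT + "Tag")));
            sb.append(triple(tag, iri(SIOC + "name"), literal(t.hashtags.get(i))));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Example values are made up for illustration.
        Tweet t = new Tweet("http://example.org/status/1", "Modelling tweets with SIOC",
                "2014-05-25T10:00:00Z", "http://example.org/alice", "alice",
                List.of("eswc2014"));
        System.out.print(toNTriples(t));
    }
}
```

In a real pipeline the same triples would more naturally be added to a Jena model, which also takes care of serialization and datatypes.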
7. Try the code!
• https://gist.github.com/miriamfs/e1738c7e17ce4a479dbe
This Gist contains three main files:
– SIOCTWitterParser.java: contains the code you need to parse
Twitter data and transform it into SIOC format
– pom.xml: contains the dependencies. If you prefer not to use a Maven
project, just go to Jena and JSON-lib and download the corresponding
libraries
– siocTwitterParser.properties: this is the properties file that you need to
set up, including
• siocOntologyFolder. This is the local folder on your computer where you
store the SIOC ontology
• jsonInputFile. This is an example of a Twitter JSON file. Note that you
can also connect to Twitter directly, download data and transform it! :)
• rdfOutputFile. This is the output file containing the SIOC-transformed Twitter
data
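As a rough sketch of how the three properties could be read and checked before running the parser, the snippet below uses only `java.util.Properties`. The key names come from the slide; the class name, the validation step and the example paths are assumptions, and the actual parser in the Gist may load its configuration differently.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

public class ParserConfigSketch {
    // Key names as listed on the slide.
    static final String[] REQUIRED_KEYS = {
        "siocOntologyFolder", "jsonInputFile", "rdfOutputFile"
    };

    // Loads the properties and fails early if a required key is missing or blank.
    public static Properties load(Reader in) throws IOException {
        Properties props = new Properties();
        props.load(in);
        for (String key : REQUIRED_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                throw new IllegalArgumentException("Missing property: " + key);
            }
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical paths; replace them with locations on your own machine.
        String example =
              "siocOntologyFolder=/path/to/sioc\n"
            + "jsonInputFile=/path/to/tweets.json\n"
            + "rdfOutputFile=/path/to/output.rdf\n";
        Properties props = load(new StringReader(example));
        for (String key : REQUIRED_KEYS) {
            System.out.println(key + " = " + props.getProperty(key));
        }
    }
}
```

To read the real file, pass a `java.io.FileReader` for siocTwitterParser.properties to `load` instead of the `StringReader`.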