Integrating and Interpreting Social Data from Heterogeneous Sources
Matthew Rowe, Organisations, Information and Knowledge Group, University of Sheffield
Suvodeep Mazumdar, Department of Information Studies, University of Sheffield

Outline
- Information overload
  - Increase in social data publication
- Interlinking social data
  - Metadata Generation
  - Integrating Social Data
- Application: Interpreting Social Data
  - Cumbrian Floods Use Case
  - Interacting with Social Data
- Conclusions

Information Overload
- Masses of social data are published every day
  - E.g. 50 million tweets (600 per second), http://blog.twitter.com
  - 22 million Facebook users in the UK, http://www.clickymedia.co.uk/2009/10/uk-facebook-user-statistics-october-2009/
- Too much information to deal with!
- Social data is multi-faceted:
  - Provenance
  - Topic
  - Geo
- Trend services (e.g. trendistic, blogpulse):
  - Focus on majority consensus
  - Need to listen in to a specific topic
  - Concentrate on a single source/platform
  - Do not consider geo facet

Interlinking Social Data
- Consider multi-faceted nature of social data:
  - Allows fine-grained analysis
  - Show geo-localised social data
  - Relevant past social data
- Solution: Interlink social data from heterogeneous sources
  - Use semantics!
  - Consistent data interpretation

Metadata Generation
- Web 2.0 platforms return data using:
  - Proprietary formats; heterogeneous data schemas
- Need to link data together from disparate sources
- A social data fragment = a single piece of social data
  - E.g. a tweet, an image, a video
- Lift each social data fragment to RDF:
  - Create an instance of sioc:Post and itr:LocalizedResource
    - Assign it a URI
  - Assign the content to the instance (topic)
    - Use hashtags of the microblog
  - Create an instance of gml:Geometry (geo)
    - Capture geo facet
  - Assign timestamp of fragment creation (provenance)
    - Using dcterms:created
  - Assign the fragment to its owner (provenance)
    - Create foaf:Person instance

Metadata Generation: example platform responses

Flickr photo response:

<photo id="949406913" media="photo">
  <owner nsid="54948696@N00"/>
  <title>DSC00171.JPG</title>
  <description></description>
  <dates posted="1205398307" taken="2009-01-09 09:16:31" lastupdate="1257421561" />
  <tags>
    <tag id="24539622-2330113101-400" author="54948696@N00" raw="arctic">arctic</tag>
    <tag id="24539622-2330113101-401" author="54948696@N00" raw="monkeys">monkeys</tag>
  </tags>
  <location latitude="53.4813" longitude="-2.2392" place_id="R8vDw_abBpSzUA">
    <locality place_id="R8vDw_abBpSzUA" woeid="27872">Manchester</locality>
    <region place_id="pn4MsiGbBZlXeplyXg" woeid="24554868">England</region>
    <country place_id="DevLebebApj4RVbtaQ" woeid="23424975">United Kingdom</country>
  </location>
</photo>

Twitter status response:

<status>
  <created_at>Sun Feb 28 12:22:47 +0000 2010</created_at>
  <id>9774519667</id>
  <text>Writing up our Geovation work for #lupas2010.</text>
  <truncated>false</truncated>
  <in_reply_to_status_id></in_reply_to_status_id>
  <in_reply_to_user_id></in_reply_to_user_id>
  <favorited>false</favorited>
  <in_reply_to_screen_name></in_reply_to_screen_name>
  <geo xmlns:georss="http://www.georss.org/georss">
    <georss:point>53.3833,-1.4722</georss:point>
  </geo>
</status>

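As a rough illustration of why the two responses need a common representation, the sketch below reads trimmed copies of both XML payloads into one dictionary shape carrying the topic, geo and provenance facets. The function names (`from_flickr`, `from_twitter`), the dictionary keys and the choice of Python's standard `xml.etree.ElementTree` parser are assumptions made for this example; the slides do not say how the authors parsed the responses.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed copies of the two responses shown on the slide.
FLICKR_XML = """<photo id="949406913" media="photo">
  <owner nsid="54948696@N00"/>
  <title>DSC00171.JPG</title>
  <dates posted="1205398307" taken="2009-01-09 09:16:31"/>
  <tags>
    <tag raw="arctic">arctic</tag>
    <tag raw="monkeys">monkeys</tag>
  </tags>
  <location latitude="53.4813" longitude="-2.2392"/>
</photo>"""

TWITTER_XML = """<status>
  <created_at>Sun Feb 28 12:22:47 +0000 2010</created_at>
  <id>9774519667</id>
  <text>Writing up our Geovation work for #lupas2010.</text>
  <geo xmlns:georss="http://www.georss.org/georss">
    <georss:point>53.3833,-1.4722</georss:point>
  </geo>
</status>"""

GEORSS = "{http://www.georss.org/georss}"


def from_flickr(xml_text):
    """Map a Flickr photo element onto the shared fragment shape (hypothetical)."""
    photo = ET.fromstring(xml_text)
    loc = photo.find("location")
    return {
        "source": "flickr",
        "id": photo.get("id"),
        "content": photo.findtext("title"),
        "topics": [tag.text for tag in photo.iter("tag")],   # topic facet
        "created": photo.find("dates").get("taken"),          # provenance facet
        "geo": (float(loc.get("latitude")), float(loc.get("longitude"))),
    }


def from_twitter(xml_text):
    """Map a Twitter status element onto the same shape (hypothetical)."""
    status = ET.fromstring(xml_text)
    text = status.findtext("text")
    lat, lon = status.findtext(f"geo/{GEORSS}point").split(",")
    return {
        "source": "twitter",
        "id": status.findtext("id"),
        "content": text,
        "topics": re.findall(r"#(\w+)", text),                # hashtags as topics
        "created": status.findtext("created_at"),
        "geo": (float(lat), float(lon)),
    }


fragments = [from_flickr(FLICKR_XML), from_twitter(TWITTER_XML)]
for fragment in fragments:
    print(fragment["source"], fragment["topics"], fragment["geo"])
```

Both records end up with the same keys, which is the property the later RDF lifting step relies on. The timestamps are left in each platform's own format here.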
Metadata Generation: lifting a tweet to RDF

Twitter status response:

<status>
  <created_at>Sun Feb 28 12:22:47 +0000 2010</created_at>
  <id>9774519667</id>
  <text>Writing up our Geovation work for #lupas2010.</text>
  <truncated>false</truncated>
  <in_reply_to_status_id></in_reply_to_status_id>
  <in_reply_to_user_id></in_reply_to_user_id>
  <favorited>false</favorited>
  <in_reply_to_screen_name></in_reply_to_screen_name>
  <geo xmlns:georss="http://www.georss.org/georss">
    <georss:point>53.3833,-1.4722</georss:point>
  </geo>
</status>

Resulting RDF:

<http://twitter.com/mattroweshow>
    rdf:type foaf:Person ;
    rdf:type itr:LocalizedResource ;
    foaf:name "Matthew Rowe" ;
    foaf:homepage <http://www.dcs.shef.ac.uk/~mrowe> .

<http://twitter.com/mattroweshow/9774519667>
    rdf:type sioc:Post ;
    rdf:type itr:LocalizedResource ;
    sioc:content "Writing up our Geovation work for #lupas2010." ;
    dcterms:subject "lupas2010" ;
    dcterms:created "2010-2-28 12:22:47.0" ;
    sioc:hasCreator <http://twitter.com/mattroweshow> ;
    itr:has_Localization _:a2 .

_:a2
    rdf:type gml:Geometry ;
    gml:pos "53.3833,-1.4722" .

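The lifting steps can be sketched as a small serialiser that writes the post description in Turtle. This is a minimal sketch using plain string formatting rather than an RDF library. `turtle_literal` and `fragment_to_turtle` are hypothetical names, and the anonymous blank node (`[ ... ]`) is used in place of the labelled `_:a2` so that no unique label has to be generated per fragment. The `itr` and `gml` namespace IRIs are the ones given in the SPARQL queries of this deck.

```python
PREFIXES = "\n".join([
    "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .",
    "@prefix sioc: <http://rdfs.org/sioc/ns#> .",
    "@prefix dcterms: <http://purl.org/dc/terms/> .",
    "@prefix gml: <http://www.opengis.net/gml/> .",
    "@prefix itr: <http://www.dcs.shef.ac.uk/~gregoire/interaction/ns#> .",
])


def turtle_literal(value):
    """Quote a string as a Turtle literal, escaping backslash, quote and newline."""
    escaped = (value.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n"))
    return '"' + escaped + '"'


def fragment_to_turtle(uri, creator_uri, content, topics, created, pos):
    """Describe one social data fragment using the property names from the slides."""
    lines = [
        f"<{uri}>",
        "    rdf:type sioc:Post ;",
        "    rdf:type itr:LocalizedResource ;",
        f"    sioc:content {turtle_literal(content)} ;",
    ]
    # One dcterms:subject statement per hashtag (topic facet).
    lines += [f"    dcterms:subject {turtle_literal(t)} ;" for t in topics]
    lines += [
        f"    dcterms:created {turtle_literal(created)} ;",   # provenance facet
        f"    sioc:hasCreator <{creator_uri}> ;",             # provenance facet
        "    itr:has_Localization [",                         # geo facet
        "        rdf:type gml:Geometry ;",
        f"        gml:pos {turtle_literal(pos)}",
        "    ] .",
    ]
    return "\n".join(lines)


ttl = fragment_to_turtle(
    uri="http://twitter.com/mattroweshow/9774519667",
    creator_uri="http://twitter.com/mattroweshow",
    content="Writing up our Geovation work for #lupas2010.",
    topics=["lupas2010"],
    created="2010-2-28 12:22:47.0",
    pos="53.3833,-1.4722",
)
print(PREFIXES + "\n\n" + ttl)
```

The sketch does not validate or escape the URIs themselves, so it is only suitable for inputs that are already well-formed.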
Integrated Social Data
- Triplify social data from multiple platforms
  - Flickr XML response -> RDF
  - Picasa XML response -> RDF
- Use common semantics
- Can perform SPARQL queries

PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?item
WHERE {
    ?item dcterms:subject "iranelections" .
    ?item dcterms:created ?date
}
ORDER BY DESC(?date)

PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX itr: <http://www.dcs.shef.ac.uk/~gregoire/interaction/ns#>
PREFIX gml: <http://www.opengis.net/gml/>
SELECT DISTINCT ?post ?tag
WHERE {
    ?post dcterms:subject ?tag .
    ?post itr:has_Localization ?geo .
    ?geo gml:pos "53.4813,-2.2392"
}

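To make the semantics of the two queries concrete, the following sketch evaluates equivalent lookups over a handful of in-memory triples, with no triple store involved. The triples, the `ex:` identifiers and the timestamps are made-up illustrative data, and `items_by_topic` and `tags_at` are hypothetical helper names. A real deployment would run the SPARQL queries against an RDF store instead.

```python
SUBJECT = "dcterms:subject"
CREATED = "dcterms:created"
HAS_LOC = "itr:has_Localization"
POS = "gml:pos"

# Made-up sample data: (subject, predicate, object) tuples.
triples = [
    ("ex:tweet1", SUBJECT, "iranelections"),
    ("ex:tweet1", CREATED, "2009-06-13 10:00:00"),
    ("ex:tweet2", SUBJECT, "iranelections"),
    ("ex:tweet2", CREATED, "2009-06-20 18:30:00"),
    ("ex:photo1", SUBJECT, "arctic"),
    ("ex:photo1", SUBJECT, "monkeys"),
    ("ex:photo1", HAS_LOC, "_:g1"),
    ("_:g1", POS, "53.4813,-2.2392"),
]


def objects(subject, predicate):
    """All objects of statements with the given subject and predicate."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]


def items_by_topic(topic):
    """First query: items tagged with `topic`, most recent first.

    String comparison orders the dates correctly only because the sample
    timestamps are zero-padded.
    """
    hits = [(date, s)
            for (s, p, o) in triples if p == SUBJECT and o == topic
            for date in objects(s, CREATED)]
    return [s for date, s in sorted(hits, reverse=True)]


def tags_at(pos):
    """Second query: distinct (post, tag) pairs whose geometry is at `pos`."""
    geos = {s for (s, p, o) in triples if p == POS and o == pos}
    posts = {s for (s, p, o) in triples if p == HAS_LOC and o in geos}
    return sorted({(s, o) for (s, p, o) in triples
                   if p == SUBJECT and s in posts})


print(items_by_topic("iranelections"))
print(tags_at("53.4813,-2.2392"))
```

Because the geo lookup is an exact string match on `gml:pos`, as in the second query, fragments a few metres apart will not be grouped together. A timestamp such as "2010-2-28 12:22:47.0" would also need normalising to a zero-padded form before date ordering behaves as expected.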
Interpreting Social Data
- Cumbrian Use Case
  - UK region suffered worst floods in centuries
  - Observe the effects in social data
    - Rise in publication
    - Fine-grained geocoded social data
- Dataset:
  - Microblogs from 200 Cumbrian Twitter users
    - Published during 2009
    - 3513 microblogs
    - Produced 475,043 triples
  - Images from Flickr taken in Cumbria
    - 6663 images
    - Produced 182,304 triples

Interacting with Social Data
- Built a visualisation application to analyse social data fragments
  - http://www.dcs.shef.ac.uk/~suvodeep/ViziSocial
- Filter by date
  - Lower slider
- Fine-grained focus
  - Zoom in
- Tag cloud
  - Shows fragment topics
  - Window controls tag cloud topics
- Markers contain number of fragments

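The interactions listed above (date slider, tag cloud, counted markers) can be approximated with a few lines of data handling. The sketch below is not the ViziSocial implementation. The record layout, the sample fragments and the function names (`in_window`, `tag_cloud`, `marker_counts`) are all assumptions chosen to illustrate the idea.

```python
from collections import Counter
from datetime import date

# Made-up sample fragments; coordinates and tags are purely illustrative.
fragments = [
    {"date": date(2009, 11, 19), "geo": (54.66, -3.36), "topics": ["floods", "bridge"]},
    {"date": date(2009, 11, 20), "geo": (54.66, -3.36), "topics": ["floods"]},
    {"date": date(2009, 11, 20), "geo": (54.60, -3.13), "topics": ["floods", "rain"]},
    {"date": date(2009, 6, 1), "geo": (54.60, -3.13), "topics": ["walking"]},
]


def in_window(frags, start, end):
    """Date slider: keep fragments created within [start, end]."""
    return [f for f in frags if start <= f["date"] <= end]


def tag_cloud(frags):
    """Tag cloud: how often each topic occurs among the visible fragments."""
    return Counter(topic for f in frags for topic in f["topics"])


def marker_counts(frags):
    """Map markers: number of visible fragments at each coordinate pair."""
    return Counter(f["geo"] for f in frags)


window = in_window(fragments, date(2009, 11, 18), date(2009, 11, 22))
cloud = tag_cloud(window)
markers = marker_counts(window)
print(cloud.most_common())
print(markers)
```

Narrowing the window changes both the tag cloud and the marker counts, which mirrors the way the slider drives the other views in the application described above.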
Conclusions
- Consistent interpretation of social data
  - Across heterogeneous sources
- Application
  - Allows analyses of social data
    - To fine-grained detail
  - Utilises multiple facets of social data
    - Requires metadata
  - Issue of scalability
- Future Work
  - Adapting to real time data acquisition
    - Focussing on South Yorkshire region at present
  - Assess scalability issue

Questions?
Twitter: @mattroweshow
Web: http://www.dcs.shef.ac.uk/~mrowe
Email: m.rowe@dcs.shef.ac.uk

Más contenido relacionado

La actualidad más candente (12)

10/12/11 Boston Area SharePoint Users Group Meeting
10/12/11 Boston Area SharePoint Users Group Meeting10/12/11 Boston Area SharePoint Users Group Meeting
10/12/11 Boston Area SharePoint Users Group Meeting
 
3/9/11 Boston Area SharePoint Users Group Meeting
3/9/11 Boston Area SharePoint Users Group Meeting3/9/11 Boston Area SharePoint Users Group Meeting
3/9/11 Boston Area SharePoint Users Group Meeting
 
Search on Mobile - Mobile Copenhagen 2012
Search on Mobile - Mobile Copenhagen 2012Search on Mobile - Mobile Copenhagen 2012
Search on Mobile - Mobile Copenhagen 2012
 
Boston Area SharePoint User Group 10/21/10 Meeting
Boston Area SharePoint User Group 10/21/10 MeetingBoston Area SharePoint User Group 10/21/10 Meeting
Boston Area SharePoint User Group 10/21/10 Meeting
 
7/14/10 Boston Area SharePoint Users Group Meeting
7/14/10 Boston Area SharePoint Users Group Meeting7/14/10 Boston Area SharePoint Users Group Meeting
7/14/10 Boston Area SharePoint Users Group Meeting
 
8/11/10 Boston Area SharePoint Users Group meeting
8/11/10 Boston Area SharePoint Users Group meeting8/11/10 Boston Area SharePoint Users Group meeting
8/11/10 Boston Area SharePoint Users Group meeting
 
Boston Area SharePoint Users Group January 11th, 2012 Meeting
Boston Area SharePoint Users Group January 11th, 2012 MeetingBoston Area SharePoint Users Group January 11th, 2012 Meeting
Boston Area SharePoint Users Group January 11th, 2012 Meeting
 
January 9th, 2013 BASPUG Meeting
January 9th, 2013 BASPUG MeetingJanuary 9th, 2013 BASPUG Meeting
January 9th, 2013 BASPUG Meeting
 
Online policy primer google - al black
Online policy primer   google - al blackOnline policy primer   google - al black
Online policy primer google - al black
 
Web As A Platform
Web As A PlatformWeb As A Platform
Web As A Platform
 
Google Search Policy Primer
Google Search Policy PrimerGoogle Search Policy Primer
Google Search Policy Primer
 
BASPUG 8/13/13 Meeting
BASPUG 8/13/13 MeetingBASPUG 8/13/13 Meeting
BASPUG 8/13/13 Meeting
 

Similar a Integrating and Interpreting Social Data from Heterogeneous Sources

technical fluency
technical fluencytechnical fluency
technical fluencyjudell
 
Agile Descriptions
Agile DescriptionsAgile Descriptions
Agile DescriptionsTony Hammond
 
Social Media Release Xml
Social Media Release XmlSocial Media Release Xml
Social Media Release XmlEcordia
 
"RDFa - what, why and how?" by Mike Hewett and Shamod Lacoul
"RDFa - what, why and how?" by Mike Hewett and Shamod Lacoul"RDFa - what, why and how?" by Mike Hewett and Shamod Lacoul
"RDFa - what, why and how?" by Mike Hewett and Shamod LacoulShamod Lacoul
 
Living in the Cloud: Hosting Data & Apps Using the Google Infrastructure
Living in the Cloud: Hosting Data & Apps Using the Google InfrastructureLiving in the Cloud: Hosting Data & Apps Using the Google Infrastructure
Living in the Cloud: Hosting Data & Apps Using the Google Infrastructureguest517f2f
 
Living in the Cloud: Hosting Data & Apps Using the Google Infrastructure
Living in the Cloud: Hosting Data & Apps Using the Google InfrastructureLiving in the Cloud: Hosting Data & Apps Using the Google Infrastructure
Living in the Cloud: Hosting Data & Apps Using the Google InfrastructurePamela Fox
 
Social Semantic Web on Facebook Open Graph protocol and Twitter Annotations
Social Semantic Web on Facebook Open Graph protocol and Twitter AnnotationsSocial Semantic Web on Facebook Open Graph protocol and Twitter Annotations
Social Semantic Web on Facebook Open Graph protocol and Twitter AnnotationsMyungjin Lee
 
Linked Data and Search: Thomas Steiner (Google Inc, Germany)
Linked Data and Search:  Thomas Steiner (Google Inc, Germany)Linked Data and Search:  Thomas Steiner (Google Inc, Germany)
Linked Data and Search: Thomas Steiner (Google Inc, Germany)FIA2010
 
Living in the Cloud: Hosting Data & Apps Using the Google Infrastructure
Living in the Cloud: Hosting Data & Apps Using the Google InfrastructureLiving in the Cloud: Hosting Data & Apps Using the Google Infrastructure
Living in the Cloud: Hosting Data & Apps Using the Google Infrastructureguest517f2f
 
MicroWSMO editor - Bachelor's thesis presentation
MicroWSMO editor - Bachelor's thesis presentationMicroWSMO editor - Bachelor's thesis presentation
MicroWSMO editor - Bachelor's thesis presentationSimone Spaccarotella
 
Creating Linked Data 2/5 Semtech2011
Creating Linked Data 2/5 Semtech2011Creating Linked Data 2/5 Semtech2011
Creating Linked Data 2/5 Semtech2011Juan Sequeda
 
OpenSocial - GTUG Stockholm Meeting Oct 1 2009
OpenSocial - GTUG Stockholm Meeting Oct 1 2009OpenSocial - GTUG Stockholm Meeting Oct 1 2009
OpenSocial - GTUG Stockholm Meeting Oct 1 2009Jacob Gyllenstierna
 
Struts2 course chapter 2: installation and configuration
Struts2 course chapter 2: installation and configurationStruts2 course chapter 2: installation and configuration
Struts2 course chapter 2: installation and configurationJavaEE Trainers
 
Illuminated Hacks -- Where 2.0 101 Tutorial
Illuminated Hacks -- Where 2.0 101 TutorialIlluminated Hacks -- Where 2.0 101 Tutorial
Illuminated Hacks -- Where 2.0 101 Tutorialmikel_maron
 
Searching the Now
Searching the NowSearching the Now
Searching the Nowlucasjosh
 

Similar a Integrating and Interpreting Social Data from Heterogeneous Sources (20)

technical fluency
technical fluencytechnical fluency
technical fluency
 
Agile Descriptions
Agile DescriptionsAgile Descriptions
Agile Descriptions
 
Social Media Release Xml
Social Media Release XmlSocial Media Release Xml
Social Media Release Xml
 
"RDFa - what, why and how?" by Mike Hewett and Shamod Lacoul
"RDFa - what, why and how?" by Mike Hewett and Shamod Lacoul"RDFa - what, why and how?" by Mike Hewett and Shamod Lacoul
"RDFa - what, why and how?" by Mike Hewett and Shamod Lacoul
 
Living in the Cloud: Hosting Data & Apps Using the Google Infrastructure
Living in the Cloud: Hosting Data & Apps Using the Google InfrastructureLiving in the Cloud: Hosting Data & Apps Using the Google Infrastructure
Living in the Cloud: Hosting Data & Apps Using the Google Infrastructure
 
Living in the Cloud: Hosting Data & Apps Using the Google Infrastructure
Living in the Cloud: Hosting Data & Apps Using the Google InfrastructureLiving in the Cloud: Hosting Data & Apps Using the Google Infrastructure
Living in the Cloud: Hosting Data & Apps Using the Google Infrastructure
 
Social Semantic Web on Facebook Open Graph protocol and Twitter Annotations
Social Semantic Web on Facebook Open Graph protocol and Twitter AnnotationsSocial Semantic Web on Facebook Open Graph protocol and Twitter Annotations
Social Semantic Web on Facebook Open Graph protocol and Twitter Annotations
 
Linked Data and Search: Thomas Steiner (Google Inc, Germany)
Linked Data and Search:  Thomas Steiner (Google Inc, Germany)Linked Data and Search:  Thomas Steiner (Google Inc, Germany)
Linked Data and Search: Thomas Steiner (Google Inc, Germany)
 
Embedded Metadata working group
Embedded Metadata working groupEmbedded Metadata working group
Embedded Metadata working group
 
Living in the Cloud: Hosting Data & Apps Using the Google Infrastructure
Living in the Cloud: Hosting Data & Apps Using the Google InfrastructureLiving in the Cloud: Hosting Data & Apps Using the Google Infrastructure
Living in the Cloud: Hosting Data & Apps Using the Google Infrastructure
 
MicroWSMO editor - Bachelor's thesis presentation
MicroWSMO editor - Bachelor's thesis presentationMicroWSMO editor - Bachelor's thesis presentation
MicroWSMO editor - Bachelor's thesis presentation
 
Creating Linked Data 2/5 Semtech2011
Creating Linked Data 2/5 Semtech2011Creating Linked Data 2/5 Semtech2011
Creating Linked Data 2/5 Semtech2011
 
Jabber Bot
Jabber BotJabber Bot
Jabber Bot
 
OpenSocial - GTUG Stockholm Meeting Oct 1 2009
OpenSocial - GTUG Stockholm Meeting Oct 1 2009OpenSocial - GTUG Stockholm Meeting Oct 1 2009
OpenSocial - GTUG Stockholm Meeting Oct 1 2009
 
Struts2 course chapter 2: installation and configuration
Struts2 course chapter 2: installation and configurationStruts2 course chapter 2: installation and configuration
Struts2 course chapter 2: installation and configuration
 
CurrentCost
CurrentCostCurrentCost
CurrentCost
 
Illuminated Hacks -- Where 2.0 101 Tutorial
Illuminated Hacks -- Where 2.0 101 TutorialIlluminated Hacks -- Where 2.0 101 Tutorial
Illuminated Hacks -- Where 2.0 101 Tutorial
 
Jquery mobile
Jquery mobileJquery mobile
Jquery mobile
 
Biblio2.0
Biblio2.0Biblio2.0
Biblio2.0
 
Searching the Now
Searching the NowSearching the Now
Searching the Now
 

Más de Matthew Rowe

Social Computing Research with Apache Spark
Social Computing Research with Apache SparkSocial Computing Research with Apache Spark
Social Computing Research with Apache SparkMatthew Rowe
 
Predicting Online Community Churners using Gaussian Sequences
Predicting Online Community Churners using Gaussian SequencesPredicting Online Community Churners using Gaussian Sequences
Predicting Online Community Churners using Gaussian SequencesMatthew Rowe
 
Transferring Semantic Categories with Vertex Kernels: Recommendations with Se...
Transferring Semantic Categories with Vertex Kernels: Recommendations with Se...Transferring Semantic Categories with Vertex Kernels: Recommendations with Se...
Transferring Semantic Categories with Vertex Kernels: Recommendations with Se...Matthew Rowe
 
SemanticSVD++: Incorporating Semantic Taste Evolution for Predicting Ratings
SemanticSVD++: Incorporating Semantic Taste Evolution for Predicting RatingsSemanticSVD++: Incorporating Semantic Taste Evolution for Predicting Ratings
SemanticSVD++: Incorporating Semantic Taste Evolution for Predicting Ratings Matthew Rowe
 
The Semantic Evolution of Online Communities
The Semantic Evolution of Online CommunitiesThe Semantic Evolution of Online Communities
The Semantic Evolution of Online CommunitiesMatthew Rowe
 
From Mining to Understanding: The Evolution of Social Web Users
From Mining to Understanding: The Evolution of Social Web UsersFrom Mining to Understanding: The Evolution of Social Web Users
From Mining to Understanding: The Evolution of Social Web UsersMatthew Rowe
 
Mining User Lifecycles from Online Community Platforms and their Application ...
Mining User Lifecycles from Online Community Platforms and their Application ...Mining User Lifecycles from Online Community Platforms and their Application ...
Mining User Lifecycles from Online Community Platforms and their Application ...Matthew Rowe
 
From User Needs to Community Health: Mining User Behaviour to Analyse Online ...
From User Needs to Community Health: Mining User Behaviour to Analyse Online ...From User Needs to Community Health: Mining User Behaviour to Analyse Online ...
From User Needs to Community Health: Mining User Behaviour to Analyse Online ...Matthew Rowe
 
Changing with Time: Modelling and Detecting User Lifecycle Periods in Online ...
Changing with Time: Modelling and Detecting User Lifecycle Periods in Online ...Changing with Time: Modelling and Detecting User Lifecycle Periods in Online ...
Changing with Time: Modelling and Detecting User Lifecycle Periods in Online ...Matthew Rowe
 
Identity: Physical, Cyber, Future
Identity: Physical, Cyber, FutureIdentity: Physical, Cyber, Future
Identity: Physical, Cyber, FutureMatthew Rowe
 
Measuring the Topical Specificity of Online Communities
Measuring the Topical Specificity of Online CommunitiesMeasuring the Topical Specificity of Online Communities
Measuring the Topical Specificity of Online CommunitiesMatthew Rowe
 
Who will follow whom? Exploiting Semantics for Link Prediction in Attention-I...
Who will follow whom? Exploiting Semantics for Link Prediction in Attention-I...Who will follow whom? Exploiting Semantics for Link Prediction in Attention-I...
Who will follow whom? Exploiting Semantics for Link Prediction in Attention-I...Matthew Rowe
 
Attention Economics in Social Web Systems
Attention Economics in Social Web SystemsAttention Economics in Social Web Systems
Attention Economics in Social Web SystemsMatthew Rowe
 
What makes communities tick? Community health analysis using role compositions
What makes communities tick? Community health analysis using role compositionsWhat makes communities tick? Community health analysis using role compositions
What makes communities tick? Community health analysis using role compositionsMatthew Rowe
 
Existing Research and Future Research Agenda
Existing Research and Future Research AgendaExisting Research and Future Research Agenda
Existing Research and Future Research AgendaMatthew Rowe
 
Tutorial: Social Semantics
Tutorial: Social SemanticsTutorial: Social Semantics
Tutorial: Social SemanticsMatthew Rowe
 
Modelling and Analysis of User Behaviour in Online Communities
Modelling and Analysis of User Behaviour in Online CommunitiesModelling and Analysis of User Behaviour in Online Communities
Modelling and Analysis of User Behaviour in Online CommunitiesMatthew Rowe
 
Using Behaviour Analysis to Detect Cultural Aspects in Social Web Systems
Using Behaviour Analysis to Detect Cultural Aspects in Social Web SystemsUsing Behaviour Analysis to Detect Cultural Aspects in Social Web Systems
Using Behaviour Analysis to Detect Cultural Aspects in Social Web SystemsMatthew Rowe
 
Anticipating Discussion Activity on Community Forums
Anticipating Discussion Activity on Community ForumsAnticipating Discussion Activity on Community Forums
Anticipating Discussion Activity on Community ForumsMatthew Rowe
 
Semantic Technologies: Representing Semantic Data
Semantic Technologies: Representing Semantic DataSemantic Technologies: Representing Semantic Data
Semantic Technologies: Representing Semantic DataMatthew Rowe
 

Más de Matthew Rowe (20)

Social Computing Research with Apache Spark
Social Computing Research with Apache SparkSocial Computing Research with Apache Spark
Social Computing Research with Apache Spark
 
Predicting Online Community Churners using Gaussian Sequences
Predicting Online Community Churners using Gaussian SequencesPredicting Online Community Churners using Gaussian Sequences
Predicting Online Community Churners using Gaussian Sequences
 
Transferring Semantic Categories with Vertex Kernels: Recommendations with Se...
Transferring Semantic Categories with Vertex Kernels: Recommendations with Se...Transferring Semantic Categories with Vertex Kernels: Recommendations with Se...
Transferring Semantic Categories with Vertex Kernels: Recommendations with Se...
 
SemanticSVD++: Incorporating Semantic Taste Evolution for Predicting Ratings
SemanticSVD++: Incorporating Semantic Taste Evolution for Predicting RatingsSemanticSVD++: Incorporating Semantic Taste Evolution for Predicting Ratings
SemanticSVD++: Incorporating Semantic Taste Evolution for Predicting Ratings
 
The Semantic Evolution of Online Communities
The Semantic Evolution of Online CommunitiesThe Semantic Evolution of Online Communities
The Semantic Evolution of Online Communities
 
From Mining to Understanding: The Evolution of Social Web Users
From Mining to Understanding: The Evolution of Social Web UsersFrom Mining to Understanding: The Evolution of Social Web Users
From Mining to Understanding: The Evolution of Social Web Users
 
Mining User Lifecycles from Online Community Platforms and their Application ...
Mining User Lifecycles from Online Community Platforms and their Application ...Mining User Lifecycles from Online Community Platforms and their Application ...
Mining User Lifecycles from Online Community Platforms and their Application ...
 
From User Needs to Community Health: Mining User Behaviour to Analyse Online ...
From User Needs to Community Health: Mining User Behaviour to Analyse Online ...From User Needs to Community Health: Mining User Behaviour to Analyse Online ...
From User Needs to Community Health: Mining User Behaviour to Analyse Online ...
 
Changing with Time: Modelling and Detecting User Lifecycle Periods in Online ...
Changing with Time: Modelling and Detecting User Lifecycle Periods in Online ...Changing with Time: Modelling and Detecting User Lifecycle Periods in Online ...
Changing with Time: Modelling and Detecting User Lifecycle Periods in Online ...
 
Identity: Physical, Cyber, Future
Identity: Physical, Cyber, FutureIdentity: Physical, Cyber, Future
Identity: Physical, Cyber, Future
 



Integrating and Interpreting Social Data from Heterogeneous Sources

  • 1. Integrating and Interpreting Social Data from Heterogeneous Sources
      • Matthew Rowe, Organisations, Information and Knowledge Group, University of Sheffield
      • Suvodeep Mazumdar, Department of Information Studies, University of Sheffield
  • 2. Outline
      • Information overload
      • Increase in social data publication
      • Interlinking social data
      • Metadata Generation
      • Integrating Social Data
      • Application: Interpreting Social Data
      • Cumbrian Floods Use Case
      • Interacting with Social Data
      • Conclusions
  • 3. Information Overload
      • Masses of social data are published every day
          • E.g. 50 million tweets (600 per second): http://blog.twitter.com
          • 22 million Facebook users in the UK: http://www.clickymedia.co.uk/2009/10/uk-facebook-user-statistics-october-2009/
      • Too much information to deal with!
      • Social data is multi-faceted: provenance, topic, geo
      • Trend services (e.g. trendistic, blogpulse):
          • Focus on majority consensus
          • Need to listen in to a specific topic
          • Concentrate on a single source/platform
          • Do not consider geo facet
  • 4.
  • 5.
  • 6. Interlinking Social Data
      • Consider the multi-faceted nature of social data:
          • Allows fine-grained analysis
          • Show geo-localised social data
          • Relevant past social data
      • Solution: interlink social data from heterogeneous sources
          • Use semantics!
          • Consistent data interpretation
  • 7. Metadata Generation
      • Web 2.0 platforms return data using proprietary formats and heterogeneous data schemas
      • Need to link data together from disparate sources
      • A social data fragment = a single piece of social data (e.g. a tweet, an image, a video)
      • Lift each social data fragment to RDF:
          • Create an instance of sioc:Post and itr:LocalizedResource, and assign it a URI
          • Assign the content to the instance (topic), using the hashtags of the microblog
          • Create an instance of gml:Geometry (geo) to capture the geo facet
          • Assign the timestamp of fragment creation (provenance), using dc:created
          • Assign the fragment to its owner (provenance) by creating a foaf:Person instance
  • 8. Metadata Generation
      Flickr response for a photo:
      <photo id="949406913" media="photo">
        <owner nsid="54948696@N00"/>
        <title>DSC00171.JPG</title>
        <description></description>
        <dates posted="1205398307" taken="2009-01-09 09:16:31" lastupdate="1257421561" />
        <tags>
          <tag id="24539622-2330113101-400" author="54948696@N00" raw="arctic">arctic</tag>
          <tag id="24539622-2330113101-401" author="54948696@N00" raw="monkeys">monkeys</tag>
        </tags>
        <location latitude="53.4813" longitude="-2.2392" place_id="R8vDw_abBpSzUA">
          <locality place_id="R8vDw_abBpSzUA" woeid="27872">Manchester</locality>
          <region place_id="pn4MsiGbBZlXeplyXg" woeid="24554868">England</region>
          <country place_id="DevLebebApj4RVbtaQ" woeid="23424975">United Kingdom</country>
        </location>
      </photo>
      Twitter response for a status:
      <status>
        <created_at>Sun Feb 28 12:22:47 +0000 2010</created_at>
        <id>9774519667</id>
        <text>Writing up our Geovation work for #lupas2010.</text>
        <truncated>false</truncated>
        <in_reply_to_status_id></in_reply_to_status_id>
        <in_reply_to_user_id></in_reply_to_user_id>
        <favorited>false</favorited>
        <in_reply_to_screen_name></in_reply_to_screen_name>
        <geo xmlns:georss="http://www.georss.org/georss">
          <georss:point>53.3833,-1.4722</georss:point>
        </geo>
      </status>
  • 10. Metadata Generation
      <http://twitter.com/mattroweshow/9774519667>
          rdf:type sioc:Post ;
          rdf:type itr:LocalizedResource ;
  • 11. Metadata Generation
      <http://twitter.com/mattroweshow/9774519667>
          rdf:type sioc:Post ;
          rdf:type itr:LocalizedResource ;
          sioc:content "Writing up our Geovation work for #lupas2010." ;
          dcterms:subject "lupas2010" ;
  • 12. Metadata Generation
      <http://twitter.com/mattroweshow/9774519667>
          rdf:type sioc:Post ;
          rdf:type itr:LocalizedResource ;
          sioc:content "Writing up our Geovation work for #lupas2010." ;
          dcterms:subject "lupas2010" ;
          itr:has_Localization _:a2 .
      _:a2
          rdf:type gml:Geometry ;
          gml:pos "53.3833,-1.4722" .
  • 13. Metadata Generation
      <http://twitter.com/mattroweshow/9774519667>
          rdf:type sioc:Post ;
          rdf:type itr:LocalizedResource ;
          sioc:content "Writing up our Geovation work for #lupas2010." ;
          dcterms:subject "lupas2010" ;
          dcterms:created "2010-2-28 12:22:47.0" ;
          itr:has_Localization _:a2 .
      _:a2
          rdf:type gml:Geometry ;
          gml:pos "53.3833,-1.4722" .
  • 14. Metadata Generation
      <http://twitter.com/mattroweshow>
          rdf:type foaf:Person ;
          rdf:type itr:LocalizedResource ;
          foaf:name "Matthew Rowe" ;
          foaf:homepage <http://www.dcs.shef.ac.uk/~mrowe> .
      <http://twitter.com/mattroweshow/9774519667>
          rdf:type sioc:Post ;
          rdf:type itr:LocalizedResource ;
          sioc:content "Writing up our Geovation work for #lupas2010." ;
          dcterms:subject "lupas2010" ;
          dcterms:created "2010-2-28 12:22:47.0" ;
          sioc:hasCreator <http://twitter.com/mattroweshow> ;
          itr:has_Localization _:a2 .
      _:a2
          rdf:type gml:Geometry ;
          gml:pos "53.3833,-1.4722" .
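The lifting steps listed on the Metadata Generation slides can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the authors' implementation: the function names `lift_status` and `literal` are hypothetical, the screen name is passed in because the `<status>` excerpt on the slides does not carry it, and the timestamp is written in ISO 8601 rather than the format shown on the slides.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

GEORSS = "{http://www.georss.org/georss}"

# Trimmed copy of the <status> fragment shown on the slides.
TWEET_XML = """<status>
  <created_at>Sun Feb 28 12:22:47 +0000 2010</created_at>
  <id>9774519667</id>
  <text>Writing up our Geovation work for #lupas2010.</text>
  <geo xmlns:georss="http://www.georss.org/georss">
    <georss:point>53.3833,-1.4722</georss:point>
  </geo>
</status>"""


def literal(value):
    """Quote a string as a plain literal (hypothetical helper)."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def lift_status(xml_text, screen_name):
    """Lift one <status> fragment to a list of (subject, predicate, object) triples.

    screen_name is an assumption: it must come from elsewhere in the API response.
    """
    status = ET.fromstring(xml_text)
    owner = f"<http://twitter.com/{screen_name}>"
    post = f"<http://twitter.com/{screen_name}/{status.findtext('id')}>"
    text = status.findtext("text", default="")

    # Create the sioc:Post / itr:LocalizedResource instance and attach its content.
    triples = [
        (post, "rdf:type", "sioc:Post"),
        (post, "rdf:type", "itr:LocalizedResource"),
        (post, "sioc:content", literal(text)),
    ]

    # Topic facet: one dcterms:subject per hashtag.
    for word in text.split():
        if word.startswith("#"):
            tag = word[1:].strip(".,;:!?")
            if tag:
                triples.append((post, "dcterms:subject", literal(tag)))

    # Provenance facet: creation time and owner.
    created_raw = status.findtext("created_at")
    if created_raw:
        created = datetime.strptime(created_raw, "%a %b %d %H:%M:%S %z %Y")
        triples.append((post, "dcterms:created", literal(created.isoformat())))
    triples.append((owner, "rdf:type", "foaf:Person"))
    triples.append((post, "sioc:hasCreator", owner))

    # Geo facet: a blank node typed as gml:Geometry.
    point = status.find(f"geo/{GEORSS}point")
    if point is not None and point.text:
        geo = f"_:geo{status.findtext('id')}"
        triples.append((post, "itr:has_Localization", geo))
        triples.append((geo, "rdf:type", "gml:Geometry"))
        triples.append((geo, "gml:pos", literal(point.text.strip())))
    return triples


if __name__ == "__main__":
    for s, p, o in lift_status(TWEET_XML, "mattroweshow"):
        print(s, p, o, ".")
```

Keeping each facet in its own block mirrors the slide's step list, so a platform that lacks one facet (for example, a tweet without a `<geo>` element) simply produces fewer triples.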
  • 15. Integrated Social Data
      • Triplify social data from multiple platforms:
          • Flickr XML response -> RDF
          • Picasa XML response -> RDF
      • Use common semantics
      • Can perform SPARQL queries
      Fragments tagged "iranelections", most recent first:
      PREFIX dcterms: <http://purl.org/dc/terms/>
      SELECT ?item
      WHERE {
        ?item dcterms:subject "iranelections" .
        ?item dcterms:created ?date
      }
      ORDER BY DESC(?date)
      Fragments, with their tags, at a given position:
      PREFIX dcterms: <http://purl.org/dc/terms/>
      PREFIX itr: <http://www.dcs.shef.ac.uk/~gregoire/interaction/ns#>
      PREFIX gml: <http://www.opengis.net/gml/>
      SELECT DISTINCT ?post ?tag
      WHERE {
        ?post dcterms:subject ?tag .
        ?post itr:has_Localization ?geo .
        ?geo gml:pos "53.4813,-2.2392"
      }
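To illustrate why common semantics matter, the sketch below lifts the Flickr photo from the earlier slide into the same vocabulary and then answers the position query in plain Python over a merged list of triples. It is an assumption-laden stand-in, not a SPARQL engine: `lift_photo` and `posts_at` are hypothetical names, the photo URI pattern is assumed, and the single tweet triple set is copied from the slides.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the Flickr <photo> fragment shown on the slides.
FLICKR_XML = """<photo id="949406913" media="photo">
  <owner nsid="54948696@N00"/>
  <tags>
    <tag raw="arctic">arctic</tag>
    <tag raw="monkeys">monkeys</tag>
  </tags>
  <location latitude="53.4813" longitude="-2.2392"/>
</photo>"""

# Triples for the tweet on the slides, written out by hand.
TWEET = "<http://twitter.com/mattroweshow/9774519667>"
TWEET_TRIPLES = [
    (TWEET, "dcterms:subject", '"lupas2010"'),
    (TWEET, "itr:has_Localization", "_:a2"),
    ("_:a2", "gml:pos", '"53.3833,-1.4722"'),
]


def lift_photo(xml_text):
    """Lift a Flickr <photo> to triples using the same terms as the tweets."""
    photo = ET.fromstring(xml_text)
    owner_id = photo.find("owner").get("nsid")
    # Assumed URI pattern for the photo resource.
    uri = f"<http://www.flickr.com/photos/{owner_id}/{photo.get('id')}>"
    triples = [
        (uri, "rdf:type", "sioc:Post"),
        (uri, "rdf:type", "itr:LocalizedResource"),
    ]
    for tag in photo.iter("tag"):
        triples.append((uri, "dcterms:subject", f'"{tag.text}"'))
    loc = photo.find("location")
    if loc is not None:
        geo = f"_:geo{photo.get('id')}"
        pos = f'"{loc.get("latitude")},{loc.get("longitude")}"'
        triples.append((uri, "itr:has_Localization", geo))
        triples.append((geo, "rdf:type", "gml:Geometry"))
        triples.append((geo, "gml:pos", pos))
    return triples


def posts_at(triples, pos):
    """Plain-Python equivalent of the position query: distinct (post, tag) pairs."""
    geos = {s for s, p, o in triples if p == "gml:pos" and o == f'"{pos}"'}
    located = {s for s, p, o in triples
               if p == "itr:has_Localization" and o in geos}
    return sorted({(s, o) for s, p, o in triples
                   if p == "dcterms:subject" and s in located})


if __name__ == "__main__":
    merged = TWEET_TRIPLES + lift_photo(FLICKR_XML)
    for post, tag in posts_at(merged, "53.4813,-2.2392"):
        print(post, tag)
```

Because both platforms end up with `dcterms:subject` and `gml:pos`, the query code never needs to know which source a fragment came from.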
  • 16. Interpreting Social Data
      • Cumbrian use case: a UK region suffered its worst floods in centuries
      • Observe the effects in social data:
          • Rise in publication
          • Fine-grained geocoded social data
      • Dataset:
          • Microblogs from 200 Cumbrian Twitter users, published during 2009: 3513 microblogs, produced 475,043 triples
          • Images from Flickr taken in Cumbria: 6663 images, produced 182,304 triples
  • 17. Interacting with Social Data
      • Built a visualisation application to analyse social data fragments: http://www.dcs.shef.ac.uk/~suvodeep/ViziSocial
      • Filter by date: lower slider
      • Fine-grained focus: zoom in
      • Tag cloud: shows fragment topics; the window controls the tag cloud topics
      • Markers contain the number of fragments
  • 18. Conclusions
      • Consistent interpretation of social data across heterogeneous sources
      • Application:
          • Allows analyses of social data to fine-grained detail
          • Utilises multiple facets of social data
          • Requires metadata
          • Issue of scalability
      • Future work:
          • Adapting to real-time data acquisition
          • Focussing on the South Yorkshire region at present
          • Assess the scalability issue
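The interaction described on this slide (a date slider plus a map window that together determine the tag cloud) can be approximated as a filter followed by a tag count. The sketch below is only an illustration of that idea, not the ViziSocial code: the `Fragment` class, the `tag_cloud` function, and the sample fragments with their coordinates and tags are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class Fragment:
    """A lifted social data fragment reduced to the facets the view needs."""
    created: date
    lat: float
    lon: float
    tags: tuple


def tag_cloud(fragments, start, end, south, west, north, east):
    """Count tags of fragments inside the date range and the map window."""
    counts = Counter()
    for f in fragments:
        in_dates = start <= f.created <= end
        in_window = south <= f.lat <= north and west <= f.lon <= east
        if in_dates and in_window:
            counts.update(f.tags)
    return counts


# Hypothetical sample data, for illustration only.
SAMPLE = [
    Fragment(date(2009, 11, 20), 54.66, -3.36, ("floods", "cockermouth")),
    Fragment(date(2009, 11, 21), 54.64, -3.55, ("floods", "workington")),
    Fragment(date(2009, 6, 1), 54.66, -3.36, ("summer",)),
    Fragment(date(2009, 11, 20), 53.38, -1.47, ("sheffield",)),
]

if __name__ == "__main__":
    cloud = tag_cloud(SAMPLE, date(2009, 11, 1), date(2009, 11, 30),
                      south=54.0, west=-4.0, north=55.5, east=-2.0)
    for tag, n in cloud.most_common():
        print(tag, n)
```

The number returned by `sum(1 for ...)` over the same filter would give the per-marker fragment count mentioned on the slide; it is omitted here to keep the sketch short.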
  • 19. Twitter: @mattroweshow Web: http://www.dcs.shef.ac.uk/~mrowe Email: m.rowe@dcs.shef.ac.uk Questions?

Editor's notes

  1. Trend services: Trendistic covers only Twitter; Blogpulse covers the blogosphere.
  2. Web 2.0 platforms provide data in proprietary formats (XML according to bespoke schemas); lift it to RDF using consistent semantics.