BotNetBM

          A Benchmark for Social Networks


                                       CWI
                            Project Meeting@Innsbruck
                              Feb 28 - Mar 04, 2011




Wednesday, March 02, 2011
Motivation
     —   Highly linked data

     —   No (good) benchmark yet for social
          networks




BotNetBM
     —   A benchmark for social networks

     —   Simulates an RDF OLTP backend

     —   Simulates random activities of a large number of users

     —   Simulates on-site “analyst” ➠ weekly
          “analytic report”

     —   One parameter: scale (number of user accounts at the
          start of the benchmark)


BotNetBM Queries
     —   SPARQL 1.1 + SPARUL

     —   User Actions

          ◦ Interactive queries (80%)

          ◦ Update transactions (20%)

     —   Measurement: number of successful clicks per minute

          ◦ Update transactions must commit; penalty if they take > 3 sec.

          ◦ Interactive queries must respond in < 3 sec.

     —   Analytic queries (must finish within simulated weekend)
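
The click-rate measurement above can be sketched as follows. This is only an illustration under stated assumptions: the `Click` record and `successful_clicks_per_minute` are hypothetical names, and because the slides do not say how the penalty is computed, a slow or uncommitted click is simply not counted here.

```python
from dataclasses import dataclass

RESPONSE_LIMIT_SEC = 3.0  # the 3-second limit named on the slide


@dataclass
class Click:
    kind: str               # "query" (interactive) or "update" (transaction)
    elapsed_sec: float      # observed response time
    committed: bool = True  # only meaningful for updates


def is_successful(click):
    """Assumed success rule: queries answer in under 3 sec,
    updates commit and take no more than 3 sec."""
    if click.kind == "query":
        return click.elapsed_sec < RESPONSE_LIMIT_SEC
    return click.committed and click.elapsed_sec <= RESPONSE_LIMIT_SEC


def successful_clicks_per_minute(clicks, run_minutes):
    """Count successful clicks and normalize by the run length in minutes."""
    ok = sum(1 for c in clicks if is_successful(c))
    return ok / run_minutes


clicks = [
    Click("query", 1.0),
    Click("query", 3.5),                    # too slow, not counted
    Click("update", 2.0),
    Click("update", 1.0, committed=False),  # failed to commit, not counted
]
print(successful_clicks_per_minute(clicks, run_minutes=2))
```

A real driver would presumably apply a graded penalty instead of dropping slow clicks; that choice is left open here because the slides do not specify it.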

Limitations
     —   Data generator: too uniform, not realistic for social networks

          ◦ 10 operations / user / simulated day

          ◦ all users are equally active

          ◦ some queries have no “meaningful” relation to each other

          ◦ read/write contention unrealistically frequent
          ◦ ...

     —   Query mix:

          ◦ Does not exploit SPARQL 1.1 advanced features
          ◦ No link to other RDF datasets

     —   Queries do not run on the open-source edition of Virtuoso Server

Our Goals
     —   Exploit SPARQL 1.1 features in queries

          ◦ “Property Path Expressions”

     —   Add links to well-known RDF datasets to the queries

          ◦ DBpedia

     —   Use real-life analysis info (e.g., Twitter)

          ◦ redesign data generator

          ◦ distribution of interactive/update queries

     —   Use real-life social network data

          ◦ Twitter, Facebook, Orkut, MySpace, ...

     —   Migration to MonetDB

Done
     —   Loaded into the Virtuoso Server (commercial ed.)

     —   Design of new query mix

     —   Twitter datasets

          ◦ http://infochimps.com/collections/twitter-census

          ◦ http://an.kaist.ac.kr/traces/WWW2010.html

          ◦ http://snap.stanford.edu/data/twitter7.html

          ◦ http://twitter.mpi-sws.org/

     —   Analysis information

          ◦ “The Man Your Man Could Smell Like: Twitter Analytics Report”

          ◦ “Characterizing user behavior in online social networks”

          ◦ “User Interactions in Social Networks and their Implications”


Interactive & Analytic Queries

     Q1 - Q8: Information of Profiles & Friends
     1.   Find all users whose first names contain a particular string, e.g., “Minh”.

     2.   Return the names of people who study at the same school and are the same age as a user. These
          people may be the user's classmates.

     3.   Find people who studied at the same school and are connected to you by a path of friend relationships.
          (Use a SPARQL 1.1 “Property Path Expression” with an arbitrary-length path.)

     4.   Find all friends who like an action movie starring Tom Cruise. (Use the information from DBpedia
          for the movie and the actor Tom Cruise.)

     5.   Find all people living in a specific location, e.g., Amsterdam, who can be reached from a user in at most
          3 friend-relationship steps.

     6.   Show all your friends who live in Europe. (Use the information from DBpedia. For example,
          Amsterdam is a city in Europe, London is a city in Europe.)

     7.   Find the top-10 suggested friends for a user: people who are not currently your friends but are friends
          of many of your friends. (Get all friends of your friends and order them by the number of people in your
          friends list who are connected to them.)

     8.   Return all users who have not joined a specific group but have more than 5 friends who joined it.
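
The friend-suggestion logic of query 7 can be sketched in plain Python to show the intended result. This is a reference illustration, not the benchmark's SPARQL query: the function name `suggest_friends` and the in-memory adjacency dictionary are assumptions made for this example, and friendship is assumed to be symmetric.

```python
from collections import Counter


def suggest_friends(friends, user, k=10):
    """Rank non-friends by how many of the user's friends know them.

    friends: dict mapping each user to the set of their friends
             (hypothetical in-memory stand-in for the RDF graph).
    """
    mine = friends.get(user, set())
    counts = Counter()
    for friend in mine:
        for candidate in friends.get(friend, set()):
            # Skip the user themselves and people who are already friends.
            if candidate != user and candidate not in mine:
                counts[candidate] += 1
    # Most mutual friends first; ties broken by name for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


graph = {
    "a": {"b", "c"},
    "b": {"a", "d", "e"},
    "c": {"a", "d"},
    "d": {"b", "c"},
    "e": {"b"},
}
print(suggest_friends(graph, "a"))
```

In this toy graph, "d" is known by two of "a"'s friends and "e" by one, so "d" ranks first.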



Interactive & Analytic Queries

     Q9 - Q14: Posts or Tweets
     9.   Show the 10 latest posts/tweets from your friends or their friends. (Order by posting time.)

     10. Show active posts/tweets: the 10 most recently commented posts/tweets from your friends. (Order
         by the timestamp of the last comment on each post.)

     11. Return the top-10 most interesting posts from your friends. First order by the number of
         “likes” (or, in Twitter, the number of “re-tweets”) on the posts from your friends, then
         order by the number of comments.

     12. Return all posts about an event (e.g., unrest in Tunisia) from the last 10 days. (Based on the
         hash tags if they are available. If no tag appears in the post, check whether the content
         of the post contains the search terms for the event.)

     13. Show all posts about a specific location, e.g., Egypt, from the last 10 days. (Use the information
         from DBpedia to identify the location of the post. For example, Cairo is the capital of Egypt,
         Tahrir Square is in Cairo.)

     14. Find the number of inactive users: all users whose accounts were activated at least 30 days ago but who have
         never posted, plus all users who have not posted for 60 days.
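
The inactivity rule of query 14 can be made concrete with a small sketch. The record layout (join date plus last post date, or `None` if the user never posted) and the name `count_inactive` are assumptions for illustration; the 30-day and 60-day thresholds come from the query text and are treated as inclusive here.

```python
from datetime import date


def count_inactive(users, today):
    """Count users matching either inactivity condition of query 14.

    users: dict mapping a user id to (join_date, last_post_date or None).
    """
    inactive = 0
    for joined, last_post in users.values():
        if last_post is None:
            # Never posted: inactive once the account is at least 30 days old.
            if (today - joined).days >= 30:
                inactive += 1
        elif (today - last_post).days >= 60:
            # Has posted before, but not within the last 60 days.
            inactive += 1
    return inactive


today = date(2011, 3, 2)
users = {
    "u1": (date(2011, 1, 1), None),               # 60 days old, no posts
    "u2": (date(2011, 2, 20), None),              # too new to count
    "u3": (date(2010, 1, 1), date(2010, 12, 1)),  # silent for 91 days
    "u4": (date(2010, 1, 1), date(2011, 2, 28)),  # posted recently
}
print(count_inactive(users, today))
```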



Interactive & Analytic Queries

     Q15 - Q17: Hash tags
     15. Show all photos posted by my friends in which I was tagged.

     16. Find the top-10 friends or friends of friends who share your
         interests. (Based on the similarity between the tags in
         your posts and the tags in their posts.)

     17. What are the current hottest events/problems? (Get the hash tags
         from posts and order them by the number of appearances in the
         last 10 days.)
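
The hash-tag ranking of query 17 can be sketched as follows. The post representation (timestamp plus raw text), the regular expression used to find tags, and the case-insensitive counting are all assumptions of this example rather than part of the benchmark definition.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

HASHTAG = re.compile(r"#(\w+)")  # assumed tag syntax: '#' followed by word characters


def hottest_tags(posts, now, days=10, k=10):
    """Return the k most frequent hash tags in posts from the last `days` days.

    posts: iterable of (timestamp, text) pairs.
    """
    cutoff = now - timedelta(days=days)
    counts = Counter()
    for timestamp, text in posts:
        if timestamp >= cutoff:
            # Lower-case so that '#Egypt' and '#egypt' count as the same tag.
            counts.update(tag.lower() for tag in HASHTAG.findall(text))
    return counts.most_common(k)


now = datetime(2011, 3, 2)
posts = [
    (datetime(2011, 3, 1), "#Egypt protests near #Cairo"),
    (datetime(2011, 2, 28), "news from #egypt"),
    (datetime(2011, 1, 1), "#old #old #old"),  # outside the 10-day window
]
print(hottest_tags(posts, now, k=2))
```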




Interactive & Analytic Queries

     Q18 - Q19: other information
     18. Which area is the most active? (Order by the total number of
         posts in each location in the last 5 days.)

     19. Return the top-10 locations with the fastest growth in the
         number of users. (Count the people who joined more than 10
         days ago and those who joined during the last 10 days, then
         compute the growth rate.)
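
Query 19 does not spell out the rate formula. One plausible reading, sketched below, divides the number of users who joined in the last 10 days by the number who joined earlier, per location. The formula, the function name `fastest_growing`, and the decision to skip locations with no earlier users are assumptions of this example.

```python
from collections import defaultdict
from datetime import date, timedelta


def fastest_growing(users, today, window_days=10, k=10):
    """Rank locations by (recent joiners) / (earlier joiners).

    users: iterable of (location, join_date) pairs.
    """
    cutoff = today - timedelta(days=window_days)
    before = defaultdict(int)
    recent = defaultdict(int)
    for location, joined in users:
        if joined < cutoff:
            before[location] += 1
        else:
            recent[location] += 1
    rates = {}
    for location, new_users in recent.items():
        old_users = before.get(location, 0)
        if old_users:  # assumed: skip locations without a baseline
            rates[location] = new_users / old_users
    # Highest rate first; ties broken by location name.
    return sorted(rates.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


today = date(2011, 3, 2)
users = [
    ("Amsterdam", date(2011, 1, 1)),
    ("Amsterdam", date(2011, 1, 5)),
    ("Amsterdam", date(2011, 2, 25)),
    ("Innsbruck", date(2011, 1, 1)),
    ("Innsbruck", date(2011, 2, 21)),
    ("Innsbruck", date(2011, 3, 1)),
    ("Cairo", date(2011, 3, 1)),  # no earlier users, so no rate is computed
]
print(fastest_growing(users, today))
```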




SPARQL/Update Queries
     1. Update user profile

     2. Posts/Tweets:

           2.1. Add a post (Popularity: high)

           2.2. Remove a post (Popularity: low)

           2.3. Add tags for your friends

           2.4. Add/Remove a comment

     3. Friends

           3.1. Add a friend (Popularity: high)

           3.2. Remove a friend (Popularity: low)

SPARQL/Update Queries
     4. Group, Event

           4.1. Join/Leave a group/event

           4.2. Add/Delete post in the group/event

     5. Photos

           5.1. Add/Delete a photo

           5.2. Add/Remove tags in the photo

           5.3. Add/Remove a comment
           5.4. Remove tags of me from all my friends' pictures

Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 

BotNetBenchmark - A Benchmark for Social Network

  • 1. BotNetBM: A Benchmark for Social Network
     CWI, Project Meeting@Innsbruck, Feb 28 - Mar 04, 2011 (Wednesday, March 02, 2011)
  • 2. Motivation
     —   Highly linked data
     —   No (good) benchmark yet for social networks
  • 3. BotNetBM
     —   A benchmark for social networks
     —   Simulates an RDF OLTP backend
     —   Simulates random activities of a large number of users
     —   Simulates an on-site “analyst” ➠ weekly “analytic report”
     —   One parameter: scale (number of user accounts at the start of the benchmark)
  • 4. BotNetBM Queries
     —   SPARQL 1.1 + SPARUL
     —   User Actions
          ◦ Interactive queries (80%)
          ◦ Update transactions (20%)
     —   Measurement: successful #clicks/min.
          ◦ Transactions commit, penalty for > 3 sec.
          ◦ Interactive queries response time < 3 sec.
     —   Analytic queries (must finish within simulated weekend)
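
The measurement on slide 4 can be illustrated with a minimal Python sketch. The slide gives only the 3-second threshold and the "successful clicks per minute" unit. The `Click` record, the `slow_update_weight` penalty factor of 0.5, and the treatment of exactly 3 seconds are assumptions made for illustration, not part of the benchmark definition.

```python
from dataclasses import dataclass

LIMIT_SEC = 3.0  # threshold stated on the slide


@dataclass
class Click:
    kind: str               # "query" (interactive) or "update" (transaction)
    seconds: float          # observed response time
    committed: bool = True  # only meaningful for updates


def successful_clicks_per_min(clicks, elapsed_min, slow_update_weight=0.5):
    """Score a run; slow_update_weight is a hypothetical penalty factor."""
    score = 0.0
    for c in clicks:
        if c.kind == "query":
            # Interactive queries only count when they answer in under 3 s.
            if c.seconds < LIMIT_SEC:
                score += 1.0
        elif c.kind == "update" and c.committed:
            # Committed transactions count, but slow ones are penalized.
            score += 1.0 if c.seconds <= LIMIT_SEC else slow_update_weight
    return score / elapsed_min


run = [
    Click("query", 0.4),
    Click("query", 3.5),                    # too slow: not counted
    Click("update", 1.2),
    Click("update", 4.0),                   # committed but slow: penalized
    Click("update", 0.9, committed=False),  # aborted: not counted
]
print(successful_clicks_per_min(run, elapsed_min=1.0))  # 2.5
```

A real driver would also have to generate the 80% / 20% mix of interactive queries and updates; that part is left out here.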
  • 5. Limitations
     —   Data generator: too uniform, not realistic for social networks
          ◦ 10 operations / user / simulated day
          ◦ all users are equally active
          ◦ some queries have no “meaningful” relation to each other
          ◦ read/write contention unrealistically frequent
          ◦ ...
     —   Query mix:
          ◦ Does not exploit SPARQL 1.1 advanced features
          ◦ No link to other RDF datasets
     —   Queries do not run with the open source ed. of Virtuoso Server
  • 6. Our Goals
     —   Exploit SPARQL 1.1 features in queries
          ◦ “Property Path Expressions”
     —   Add links to well-known RDF data sets into the queries
          ◦ DBpedia
     —   Use real-life analysis info (e.g., Twitter)
          ◦ redesign data generator
          ◦ distribution of interactive/update queries
     —   Use real-life social network data
          ◦ Twitter, Facebook, orkut, MySpace, ...
     —   Migration to MonetDB
  • 7. Done
     —   Loaded into the Virtuoso Server (commercial ed.)
     —   Design of new query mix
     —   Twitter datasets
          ◦ http://infochimps.com/collections/twitter-census
          ◦ http://an.kaist.ac.kr/traces/WWW2010.html
          ◦ http://snap.stanford.edu/data/twitter7.html
          ◦ http://twitter.mpi-sws.org/
     —   Analysis information
          ◦ “The Man Your Man Could Smell Like: Twitter Analytics Report”
          ◦ “Characterizing user behavior in online social networks”
          ◦ “User Interactions in Social Networks and their Implications”
  • 8. Interactive & Analytic Queries
     Q1 - Q8: Information of Profiles & Friends
     1. Find all users whose first names contain a particular string, e.g., “Minh”.
     2. Return the names of people who study at the same school and are the same age as a user. These people may be the user's classmates.
     3. Find people who studied at the same school as you and are connected to you by a path of friend relationships. (Uses a SPARQL 1.1 “Property Path Expression” with an arbitrary-length path.)
     4. Find all friends who like an action movie starring Tom Cruise. (Uses DBpedia information about the movie and the actor Tom Cruise.)
     5. Find all people living in a specific location, e.g., Amsterdam, who can be reached from a user in at most 3 steps of the friend relationship.
     6. Show all of your friends who live in Europe. (Uses DBpedia information. For example, Amsterdam is a city in Europe, and London is a city in Europe.)
     7. Find the top-10 suggested friends for a user: people who are not currently your friends but are friends of many of your friends. (Get all friends of your friends and order them by the number of people in your friends list who are connected to them.)
     8. Return all users who have not joined a specific group but have more than 5 friends who joined it.
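
The intended results of Q5 (bounded friend path) and Q7 (friend suggestions) can be sketched in plain Python over a toy graph. This is not the benchmark's SPARQL, only an illustration of the semantics described above. The `FRIENDS` and `CITY` data, the user names, and the function names are all made up, and friendship is assumed to be symmetric.

```python
from collections import Counter, deque

# Toy undirected friendship graph; all names are invented for illustration.
FRIENDS = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "dan", "eve"},
    "cat": {"ann", "dan"},
    "dan": {"bob", "cat", "fay"},
    "eve": {"bob"},
    "fay": {"dan", "gus"},
    "gus": {"fay"},
}
CITY = {"bob": "Amsterdam", "dan": "Amsterdam", "fay": "London", "gus": "Amsterdam"}


def within_steps(start, max_steps):
    """Users reachable from start via 1..max_steps friend edges (Q5-style)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        user = queue.popleft()
        if dist[user] == max_steps:
            continue  # do not expand beyond the step limit
        for friend in FRIENDS[user]:
            if friend not in dist:
                dist[friend] = dist[user] + 1
                queue.append(friend)
    del dist[start]
    return set(dist)


def suggest_friends(user, k=10):
    """Non-friends ranked by their number of mutual friends (Q7-style)."""
    counts = Counter()
    for friend in FRIENDS[user]:
        for candidate in FRIENDS[friend]:
            if candidate != user and candidate not in FRIENDS[user]:
                counts[candidate] += 1
    # Highest mutual-friend count first; ties broken by name for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


nearby = sorted(u for u in within_steps("ann", 3) if CITY.get(u) == "Amsterdam")
print(nearby)                  # ['bob', 'dan']
print(suggest_friends("ann"))  # [('dan', 2), ('eve', 1)]
```

In the example, `gus` lives in Amsterdam but is four steps away from `ann`, so the 3-step limit excludes him.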
  • 9. Interactive & Analytic Queries
     Q9 - Q14: Posts or Tweets
     9. Show the 10 latest posts/tweets from your friends or their friends. (Order by posting time.)
     10. Show active posts/tweets: the 10 most recently commented posts/tweets from your friends. (Order by the timestamp of the last comment on each post.)
     11. Return the top-10 most interesting posts from your friends: order first by the number of “likes” (or, in Twitter, the number of “re-tweets”), then by the number of comments.
     12. Return all posts about an event (e.g., unrest in Tunisia) from the last 10 days. (Based on hash tags if they are available. If a post has no tags, check whether its content contains the search terms for the event.)
     13. Show all posts about a specific location, e.g., Egypt, from the last 10 days. (Uses DBpedia information to identify the location of the post. For example, Cairo is the capital of Egypt, and Tahrir Square is in Cairo.)
     14. Find the number of inactive users: users whose accounts were activated at least 30 days ago but who have never posted, or users who have not posted for 60 days.
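
The two conditions in Q14 can be made concrete with a short Python sketch. The dictionaries, user names, and dates are invented. Treating "at least 30 days" and "for 60 days" as inclusive bounds is an assumption, since the slide does not say how the boundaries are handled.

```python
from datetime import date, timedelta


def inactive_users(joined, last_post, today):
    """joined: user -> signup date; last_post: user -> latest post date (absent if none)."""
    result = set()
    for user, signup in joined.items():
        latest = last_post.get(user)
        if latest is None:
            # Never posted: inactive once the account is at least 30 days old.
            if today - signup >= timedelta(days=30):
                result.add(user)
        elif today - latest >= timedelta(days=60):
            # Has posted before, but not during the last 60 days.
            result.add(user)
    return result


today = date(2011, 3, 2)
joined = {
    "ann": date(2011, 1, 1),   # never posted, account 60 days old: inactive
    "bob": date(2011, 2, 20),  # never posted, account 10 days old: not counted
    "cat": date(2010, 6, 1),   # last post 91 days ago: inactive
    "dan": date(2010, 6, 1),   # posted 5 days ago: active
}
last_post = {"cat": date(2010, 12, 1), "dan": date(2011, 2, 25)}
print(len(inactive_users(joined, last_post, today)))  # 2
```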
  • 10. Interactive & Analytic Queries
     Q15 - Q17: Hash tags
     15. Show all photos posted by my friends in which I was tagged.
     16. Find the top-10 friends or friends of friends who share common interests with you. (Based on the similarity between the tags in your posts and the tags in their posts.)
     17. What are the current hottest events/problems? (Get the hash tags from posts and order them by the number of appearances in the last 10 days.)
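
Q17 amounts to counting hash tags over a sliding window. The Python sketch below shows one way to read it. The sample posts, the `HASHTAG` pattern, the case-insensitive matching, and the inclusive window boundary are assumptions, not details given on the slide.

```python
import re
from collections import Counter
from datetime import date, timedelta

HASHTAG = re.compile(r"#(\w+)")  # assumed tag syntax: '#' followed by word characters


def hottest_tags(posts, today, days=10, k=3):
    """posts: iterable of (posting date, text). Returns the k most frequent recent tags."""
    cutoff = today - timedelta(days=days)
    counts = Counter()
    for posted_on, text in posts:
        if posted_on >= cutoff:
            # Fold case so '#Egypt' and '#egypt' count as the same tag.
            counts.update(tag.lower() for tag in HASHTAG.findall(text))
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


posts = [
    (date(2011, 3, 1), "Crowds in Tahrir square #Egypt #Cairo"),
    (date(2011, 2, 28), "Updates from #egypt"),
    (date(2011, 2, 25), "News on #Tunisia"),
    (date(2011, 1, 15), "Old post about #Tunisia #Egypt"),  # outside the window
]
print(hottest_tags(posts, today=date(2011, 3, 2)))
# [('egypt', 2), ('cairo', 1), ('tunisia', 1)]
```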
  • 11. Interactive & Analytic Queries
     Q18 - Q19: Other information
     18. Which area is the most active? (Order by the total number of posts in each location in the last 5 days.)
     19. Return the top-10 locations with the fastest growth in the number of users. (Count the people who joined more than 10 days ago and those who joined during the last 10 days, then compute the growth rate.)
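
The growth rate in Q19 is not defined on the slide. The Python sketch below assumes it is the number of users who joined in the last 10 days divided by the number who joined earlier, and it skips locations with no earlier users because that ratio is undefined there. The sample data and names are invented.

```python
from collections import Counter
from datetime import date, timedelta


def fastest_growing(users, today, days=10, k=10):
    """users: iterable of (location, join date). Assumed growth rate = new / existing."""
    cutoff = today - timedelta(days=days)
    before, recent = Counter(), Counter()
    for location, joined in users:
        (recent if joined > cutoff else before)[location] += 1
    rates = {
        loc: recent[loc] / before[loc]
        for loc in recent
        if before[loc] > 0  # rate is undefined without earlier users
    }
    return sorted(rates.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


users = [
    ("Amsterdam", date(2010, 12, 1)),
    ("Amsterdam", date(2010, 12, 5)),
    ("Amsterdam", date(2011, 2, 28)),
    ("Cairo", date(2010, 11, 1)),
    ("Cairo", date(2011, 2, 25)),
    ("Cairo", date(2011, 3, 1)),
    ("Innsbruck", date(2011, 3, 1)),  # no earlier users: skipped
]
print(fastest_growing(users, today=date(2011, 3, 2)))
# [('Cairo', 2.0), ('Amsterdam', 0.5)]
```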
  • 12. SPARQL/Update Queries
     1. Update user profile
     2. Posts/Tweets
          2.1. Add a post (Popularity: high)
          2.2. Remove a post (Popularity: low)
          2.3. Add tags for your friends
          2.4. Add/Remove a comment
     3. Friends
          3.1. Add a friend (Popularity: high)
          3.2. Remove a friend (Popularity: low)
  • 13. SPARQL/Update Queries
     4. Group, Event
          4.1. Join/Leave a group/event
          4.2. Add/Delete a post in the group/event
     5. Photos
          5.1. Add/Delete a photo
          5.2. Add/Remove tags in the photo
          5.3. Add/Remove a comment
          5.4. Remove the tags of me from all pictures of my friends