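[Figure: example social network around Guillaume (Gérard, Fabien, Mylène, Michel, Yvonne) with father, sister, mother and colleague relationships. Degree of Guillaume restricted to family relationships: d<family>(guillaume) = 3. Relationship types shown: knows, parent, sibling, mother, father, brother, sister, colleague.]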
[Table: SNA indices covered: qualified component, qualified degree, qualified in-degree, qualified diameter, number of geodesics between from and to, number of geodesics between from and to going through b, closeness centrality, betweenness centrality.]
[Figure: SemSNA annotation example on a network of Guillaume, Gérard, Fabien, Mylène, Michel, Yvonne, Ivan, Peter and Philippe with sister, mother, father, colleague and supervisor relationships. Guillaume hasSNAConcept a Degree with isDefinedForProperty colleague, hasCentralityDistance 2 and hasValue 4.]
Since its birth, the web has provided many ways for people to interact, revealing real social network structures. Social networks have been extracted from the hyperlink structure of home pages, the co-occurrence of names, and synchronous and asynchronous communications.
The social network effect of the web has been amplified by the deployment of a social media landscape where “expressing tools allow users to express themselves, discuss and aggregate their social life”, “sharing tools allow users to publish and share content”, and “networking tools allow users to search, connect and interact with each other”. Social platforms such as Facebook, Orkut and Hi5 are at the center of this landscape because they let us host and aggregate these different social applications. You can publish and share your del.icio.us bookmarks, your RSS streams or your microblog posts via the Facebook news feed, thanks to dedicated Facebook applications. This integration of various means of publishing and socializing lets us quickly share, recommend and propagate information to our social network, trigger reactions, and finally enrich it. Collaborative applications now capture more and more aspects of physical social networks and human interactions in a decentralized way. Such rich and diffuse data cannot be represented using only raw graphs, as in classical SNA algorithms, without some loss of knowledge.
Metrics help us understand the global structure of the network. The density indicates the cohesion of the network. Community detection helps us understand the distribution of actors and activities in the network by detecting groups of densely connected actors. The community structure influences the way information is shared and the way actors behave.
Centrality highlights the most important actors of the network, and Freeman proposed three definitions. Degree centrality considers the nodes with the highest degrees (number of adjacent edges). It highlights local popularity in the network: actors that influence their neighbourhood. Closeness centrality is based on the average length of the paths (number of edges) linking a node to the others, and reveals the capacity of a node to be reached and to reach other actors. Betweenness centrality focuses on the capacity of a node to be an intermediary between any two other nodes. A network is highly dependent on actors with high betweenness centrality because of their position as intermediaries and brokers in the information flow.
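The three centralities can be illustrated with a minimal Python sketch on a small undirected graph. The graph, the node names and the function names are hypothetical and serve only as an illustration. Closeness is taken here as the inverse of the average shortest-path length, so that a higher value means a more central node; this normalization is an assumption of the sketch. Betweenness counts, for every pair of other nodes, the share of geodesics going through the node.

```python
from collections import deque

# Hypothetical undirected graph given as adjacency lists.
GRAPH = {
    "a": ["b"],
    "b": ["a", "c", "d"],
    "c": ["b", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}

def bfs(graph, source):
    """Return shortest-path lengths and geodesic counts from source."""
    dist = {source: 0}
    sigma = {source: 1}  # number of geodesics from source to each node
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma

def degree_centrality(graph):
    """Number of adjacent edges of each node."""
    return {n: len(neighbours) for n, neighbours in graph.items()}

def closeness_centrality(graph):
    """Inverse of the average shortest-path length to reachable nodes."""
    result = {}
    for n in graph:
        dist, _ = bfs(graph, n)
        others = [d for m, d in dist.items() if m != n]
        result[n] = len(others) / sum(others) if others else 0.0
    return result

def betweenness_centrality(graph):
    """Sum over unordered pairs (s, t) of the share of geodesics through b."""
    info = {n: bfs(graph, n) for n in graph}
    nodes = list(graph)
    score = {n: 0.0 for n in nodes}
    for i, s in enumerate(nodes):
        dist_s, sigma_s = info[s]
        for t in nodes[i + 1:]:
            if t not in dist_s:
                continue  # s and t are not connected
            for b in nodes:
                if b == s or b == t or b not in dist_s:
                    continue
                dist_b, sigma_b = info[b]
                # b lies on a geodesic from s to t when the lengths add up.
                if t in dist_b and dist_s[b] + dist_b[t] == dist_s[t]:
                    score[b] += sigma_s[b] * sigma_b[t] / sigma_s[t]
    return score

degree = degree_centrality(GRAPH)
closeness = closeness_centrality(GRAPH)
betweenness = betweenness_centrality(GRAPH)
print(degree, closeness, betweenness)
```

In this toy graph, nodes b and d have both the highest degree and the highest betweenness, since every path between the two ends of the graph goes through them.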
Several ontologies exist for representing online social networks. Social data can be seen as a twofold structure: data that describe people and the social network structure, and data that describe the content produced by network members. FOAF is used for describing people, their profiles, their relationships and their online accounts. The properties defined in the RELATIONSHIP ontology specialize the “knows” property of FOAF to type relationships in a social network more precisely (familial, friendship or professional relationships). The primitives of the SIOC ontology specialize “OnlineAccount” and “holdsAccount” from FOAF in order to model the interactions and resources manipulated by users of social web applications; SIOC defines concepts such as posts, replies or user groups. The SKOS ontology offers a way to organize manipulated concepts with lightweight semantic properties (e.g. narrower, broader, related) and to link them to SIOC descriptions with the property "isSubjectOf".
RDF enables us to make assertions and to describe resources with triples. These triples form a directed typed graph that is well suited to representing social data produced on different sites. Distributed identities, activities and relationships are represented with a uniform graph structure in RDF. Moreover, both nodes and relationships can be richly typed with classes and properties of ontologies described in RDFS and OWL, adding a semantic dimension to the social graph. SPARQL is the standard query language for these richly typed and oriented graphs. Consequently, it is a well-suited tool for analyzing social data represented with semantic web languages.
Researchers have applied classical SNA methods to the graphs of acquaintance and interest networks formed respectively by the properties "foaf:knows" and "foaf:interest". In order to apply existing tools, they extract simple untyped graphs from the richer RDF descriptions of FOAF profiles (each corresponding to one relationship, “knows” or “interest”). A lot of knowledge is lost in this transformation, and this knowledge could be used to parameterize social network indicators, filter their sources and customize their results.
Global queries are mostly based on result aggregation and path computation, which are missing from the standard SPARQL definition. The Corese search engine provides such features: result grouping, aggregating functions such as count(), sum() or avg(), and path retrieval.
The group by clause groups results that have the same values for the specified variables. An aggregating function such as count() can then be applied to each group of SPARQL results. These features will be added to SPARQL 2.0.
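The effect of grouping and counting can be emulated in a few lines of Python. This is only a sketch of the idea, not Corese syntax: the triples, the helper name group_count and its parameters are hypothetical. Grouping the foaf:knows triples by subject and counting gives an out-degree; grouping by object gives an in-degree.

```python
from collections import Counter

# Hypothetical triples (subject, property, object).
TRIPLES = [
    ("guillaume", "foaf:knows", "fabien"),
    ("guillaume", "foaf:knows", "michel"),
    ("guillaume", "foaf:knows", "yvonne"),
    ("fabien", "foaf:knows", "michel"),
    ("michel", "foaf:knows", "guillaume"),
    ("fabien", "rel:worksWith", "michel"),
]

def group_count(triples, prop, key="subject"):
    """Group the triples of one property by subject or object and count each group."""
    index = 0 if key == "subject" else 2
    return Counter(t[index] for t in triples if t[1] == prop)

out_degree = group_count(TRIPLES, "foaf:knows", key="subject")
in_degree = group_count(TRIPLES, "foaf:knows", key="object")

# An aggregate over the groups, in the spirit of avg().
average_out_degree = sum(out_degree.values()) / len(out_degree)

print(out_degree)
print(in_degree)
print(average_out_degree)
```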
A syntactic convention in Corese enables path extraction. A regular expression is used instead of the property variable to specify that a path is searched for and to describe its characteristics. Sub-properties of the properties in the regular expression are taken into account, unless specified otherwise. The regular expression operators are: / (sequence), | (or), * (0 or more), ? (optional), ! (not). We can bind the path to a variable specified after the regular expression. Path characteristics are defined by adding options before the regular expression: 'i' to allow inverse properties, 's' to retrieve one shortest path, 'sa' to retrieve all shortest paths. This example retrieves a path between two resources ?x and ?y starting with zero or more foaf:knows properties and ending with the rel:worksWith property; the path length must be less than or equal to 4. Depending on timing, path retrieval is a candidate for inclusion in SPARQL 2.0.
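The kind of path described here (zero or more foaf:knows steps followed by one rel:worksWith step, with a bounded length) can be sketched in Python with a breadth-first search. This is a hypothetical emulation rather than the Corese implementation: the triples and the function name are illustrative, sub-properties are ignored for brevity, and the function returns one shortest matching path per end resource.

```python
from collections import deque

# Hypothetical triples (subject, property, object).
TRIPLES = [
    ("guillaume", "foaf:knows", "fabien"),
    ("fabien", "foaf:knows", "michel"),
    ("michel", "rel:worksWith", "yvonne"),
    ("guillaume", "rel:worksWith", "gerard"),
    ("yvonne", "foaf:knows", "ivan"),
    ("ivan", "rel:worksWith", "peter"),
]

def knows_then_works_with(triples, start, max_length=4):
    """Map each end ?y to one shortest path start -knows*-> ?z -worksWith-> ?y."""
    outgoing = {}
    for s, p, o in triples:
        outgoing.setdefault(s, []).append((p, o))
    found = {}
    seen = {start}
    queue = deque([(start, [start])])
    while queue:
        node, path = queue.popleft()
        if len(path) - 1 >= max_length:
            continue  # one more edge would exceed the length bound
        for prop, target in outgoing.get(node, []):
            if prop == "rel:worksWith" and target not in found:
                # The final step of the pattern: the path ends here.
                found[target] = path + [target]
            elif prop == "foaf:knows" and target not in seen:
                # Extend the foaf:knows prefix of the pattern.
                seen.add(target)
                queue.append((target, path + [target]))
    return found

paths = knows_then_works_with(TRIPLES, "guillaume", max_length=4)
short_paths = knows_then_works_with(TRIPLES, "guillaume", max_length=2)
print(paths)
```

Note that peter is not returned: the only route to him goes through a rel:worksWith step in the middle of the path, which the pattern does not allow.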
The closeness centrality of a node is based on the average length of the paths linking it to the other nodes.
SemSNA is an ontology of Social Network Analysis that makes it possible to annotate social data with strategic positions and structural indices. The main class SNAConcept is the super class of all SNA concepts. The property isDefinedForProperty indicates for which relationship, i.e. which subnetwork, an instance of the SNA concept is defined. An SNA concept is attached to a social resource with the property hasSNAConcept. The class SNAIndice describes valued concepts such as centrality, and the associated value is set with the property hasValue. The ontology models strategic positions, based on Freeman's definition of centrality, and different definitions of groups, with useful indices to characterize their properties.
In this social network, Guillaume has both family and professional relationships. The degree of Guillaume for the relationship colleague, a superProperty of supervisor, considering a neighbourhood at distance 2, is 4.
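A parameterized degree of this kind, and its annotation with SemSNA properties, can be sketched in Python as follows. The edge list is hypothetical and is not the network of the figure; it was simply chosen so that the result is also 4. The sketch assumes that edges are followed in both directions and that the degree at distance n counts the distinct actors reachable within n steps. The semsna: prefix and the blank node identifier are likewise assumptions.

```python
from collections import deque

# supervisor is a sub-property of colleague (from the text above).
SUB_PROPERTY_OF = {"supervisor": "colleague"}

# Hypothetical typed edges (subject, property, object).
EDGES = [
    ("guillaume", "colleague", "fabien"),
    ("guillaume", "supervisor", "michel"),
    ("fabien", "colleague", "ivan"),
    ("michel", "supervisor", "peter"),
    ("peter", "colleague", "philippe"),
    ("gerard", "father", "guillaume"),
    ("mylene", "sister", "guillaume"),
    ("yvonne", "mother", "fabien"),
]

def specializes(prop, target):
    """True if prop is target or one of its (transitive) sub-properties."""
    while prop is not None:
        if prop == target:
            return True
        prop = SUB_PROPERTY_OF.get(prop)
    return False

def qualified_degree(edges, actor, prop, distance):
    """Count actors within `distance` steps using prop or its sub-properties."""
    neighbours = {}
    for s, p, o in edges:
        if specializes(p, prop):
            neighbours.setdefault(s, set()).add(o)
            neighbours.setdefault(o, set()).add(s)
    dist = {actor: 0}
    queue = deque([actor])
    while queue:
        node = queue.popleft()
        if dist[node] == distance:
            continue  # do not look beyond the requested neighbourhood
        for nxt in neighbours.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return len(dist) - 1  # exclude the actor itself

def annotate_degree(actor, prop, distance, value, node_id="_:degree1"):
    """Build SemSNA-style annotation triples (prefix and node id are assumed)."""
    return [
        (actor, "semsna:hasSNAConcept", node_id),
        (node_id, "rdf:type", "semsna:Degree"),
        (node_id, "semsna:isDefinedForProperty", prop),
        (node_id, "semsna:hasCentralityDistance", distance),
        (node_id, "semsna:hasValue", value),
    ]

value = qualified_degree(EDGES, "guillaume", "colleague", 2)
annotation = annotate_degree("guillaume", "colleague", 2, value)
print(value)
for triple in annotation:
    print(triple)
```

Restricting the same query to supervisor alone, or to distance 1, gives a smaller degree, which is the point of parameterizing the index by a property and a distance.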
Ipernity.com, the social network we analyzed, offers users several options for building their social network and sharing multimedia content. Every user can share media, create a blog and a personal profile page, and comment on others' shared resources. To build the social network, users can specify the type of relationship they have with others: friend, family, or simple contact (like a favorite you follow). Relationships are not symmetric: Fabien can declare a relationship with Michel, but Michel can declare a different type of relationship with Fabien or not have him in his contact list at all.
Corese has an extension that enables us to nest SQL queries within SPARQL queries. This is done by means of the sql() function, which returns a sequence of results for each variable in the SQL select clause. In Corese we can then combine a construct clause and a select clause to generate RDF data.
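The general idea of turning relational rows into RDF-like triples can be sketched in Python with an in-memory SQLite database. This does not use Corese or its sql() function: the table layout, the column names and the ex: property names are all hypothetical, and only the three relationship types come from the text.

```python
import sqlite3

# Hypothetical mapping from a relational "kind" column to RDF properties.
KIND_TO_PROPERTY = {
    "friend": "ex:friendOf",
    "family": "ex:family",
    "favorite": "ex:favorite",
}

def build_database():
    """Create a small in-memory table of declared relationships."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE relation (src TEXT, dst TEXT, kind TEXT)")
    conn.executemany(
        "INSERT INTO relation VALUES (?, ?, ?)",
        [
            ("fabien", "michel", "friend"),
            ("michel", "fabien", "family"),  # relationships are not symmetric
            ("fabien", "yvonne", "favorite"),
        ],
    )
    return conn

def rows_to_triples(conn):
    """Select the rows and construct one triple per row."""
    query = "SELECT src, dst, kind FROM relation ORDER BY src, dst"
    return [
        (f"ex:{src}", KIND_TO_PROPERTY[kind], f"ex:{dst}")
        for src, dst, kind in conn.execute(query)
    ]

connection = build_database()
triples = rows_to_triples(connection)
connection.close()
for triple in triples:
    print(triple)
```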
We extended FOAF, SIOC and SIOC types in order to import social data from ipernity.com, in particular to model interactions such as messages or visits to resources. We introduced the class Interaction to differentiate declared relationships (like family or friendOf) from active relationships.
We tested our algorithms and queries on a dual-processor quad-core machine at 3.2 GHz with 32.0 GB of main memory. We analyzed the three types of relations separately (favorite, friend and family) and also used polymorphic queries to analyze them as a whole through their super property, foaf:knows. We also analyzed the interactions produced by exchanges of private messages between users, as well as those produced by someone commenting on someone else's documents. This table shows some performance figures for computing components, degrees and shortest paths. Queries exploiting only grouping and aggregating features (component, degree) are efficient and can be computed on large-scale data. Path computation is time and space consuming. When too many paths could be retrieved, we limit queries to a maximum number of graph projections or a maximum path length. In some cases, such as betweenness centralities, approximations are sufficient to highlight strategic actors.