Davai Predictive User Modeling
Introduction
The web is rapidly changing from a source of information to be consumed into
a place where people produce, consume, share, and interact with information. This
is the social web. The need to interact socially has led to the success of sites like
Facebook, an egocentric social network used to stay in touch with friends and
family, and Twitter, a micro-blogging site used to broadcast news, ideas, and
opinions to followers.
However, the change is deeper if one considers the communication patterns enabled
by ubiquitous connectivity and mobile devices. We are moving away from a web of
clicks to a web of online activities and interactions that people perform in a
conversation style:
Users of the ‘old’ web would leave behind an anonymous trail of clicks on hyperlinks
that marks their content consumption patterns and implicit interests. This
click-stream model has been successfully exploited by search engines to create a
database of intent that assigns an inferred intent to each search a user performs.
The search model relies on the click because very little metadata other than the
click is available for inferring intent.
On the social web, people engage in more complex activities in the context of their
network of online friends. Modeling user activity by clicks or queries, as on today’s
web, doesn’t capture the rich interaction between people that is possible moving
forward. Davai’s behavior modeling is therefore based on an activity-stream model.
The social web with its conversation style is inherently participatory and
communication is shared. People are interested in what their friends and family are
doing, what they think and their opinions. Every activity is therefore in some
context newsworthy. In fact, social connectivity and preference are in themselves
strong indicators of affinity, interests and therefore ‘intent’. This inference of
intent is made even stronger by the interactions on the social web.
Social network services strive to make participation implicit by turning a user’s
interaction with the service into content, i.e. many activities are recorded
automatically. Changing a profile, clicking on a like button, or purchasing a product
might all be recorded automatically for friends to see.
Many of today’s egocentric social networks started as community-focused online
places for like-minded people to interact in privacy. However, the trend is
towards open social networks with a significant portion of the interaction being
public:
People want to meet new people and find new information as part of their social
interaction, which requires sharing of a basic set of personal information,
Social network sites feel a need to monetize their membership by making
profile information accessible to businesses, which in turn requires the
sites’ users to relax their privacy expectations,
Social networks, which historically have been walled gardens, are pushed to
open up to external content (e.g. Facebook apps) or to expose their social graph
to external sites (e.g. Facebook Connect), and
Users accept a minimum of sharing of personal information if value-added,
context-aware online services are offered in return for the social graph and
interaction patterns, as long as these are generalized and the platform
shields or anonymizes the users from service providers. An example is sharing
location information on mobile devices to obtain valuable location-based
services (as on Yelp or Foursquare).
At Davai we are aware of these changing trends. The newfound social web paradigm
that generates online user activity is a rich context for predictive user modeling.
Modeling online users generates valuable insights into user intent, which can be
used to provide services to businesses and consumers.
The core strength of the Davai platform and services is technology that observes
the activity stream, profile information, and social graph published on the social
web and builds models from them to predict user demographics, behavior, and
interests.
The key areas of investment are predictive user models in support of:
Online direct/interactive marketing on social networks, such as lead generation,
personalized sales promotions, or customer relationship management,
A new kind of interactive, user-generated content, which we call social
objects, and
Context-aware services, such as personal assistants, especially on mobile devices.
All services we envision are permission-based, i.e. the user opts into the services
and in turn receives personalized commercial offers, online services or direct
monetary incentives.
Mining the User Activity Stream of the Social Web
An ActivityStream, also called a Live Stream or simply a Stream, is a feed of
activities performed by an actor on one or more web sites. Many social networking
sites have started to publish activity streams of their users.
The activity in ActivityStreams is a description of an action that was performed (the
verb) at some instant in time by someone or something (the actor) against some
kind of person, place, or thing (the object).
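As a minimal sketch, such an activity can be held in a small record type. The class name, field names, and the sample values below are assumptions made for illustration; they are not taken from any particular service or from the Davai platform.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Activity:
    """One entry of an activity stream: who did what to which object, and when."""
    actor: str           # someone or something performing the action
    verb: str            # the action, e.g. "like", "share", "purchase"
    obj: str             # the person, place, or thing acted upon
    published: datetime  # the instant the action was performed


# Hypothetical example: a user likes a place.
activity = Activity(
    actor="user:alice",
    verb="like",
    obj="place:corner-cafe",
    published=datetime(2010, 6, 1, 12, 30, tzinfo=timezone.utc),
)

# A stream is then a time-ordered sequence of such records.
stream = sorted([activity], key=lambda a: a.published)
```

Keeping the record immutable (frozen) makes it safe to share the same activity between several analyses of the stream.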
There are many different social network services and each has its own set of
activities, actors and objects. These formats have to be standardized into a
canonical representation of actions before any kind of analysis or mining activity
can be performed.
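The standardization step could look roughly like the sketch below: one small normalizer per source, each mapping that source's raw event onto a shared canonical tuple. The two services, their field names, and the verb mapping are invented for illustration; real services publish different and richer formats.

```python
from typing import Any, Callable, Dict, NamedTuple


class CanonicalActivity(NamedTuple):
    actor: str
    verb: str
    obj: str
    timestamp: int  # seconds since the Unix epoch


def from_service_a(raw: Dict[str, Any]) -> CanonicalActivity:
    # Hypothetical service A: string timestamps in seconds, capitalized verbs.
    return CanonicalActivity(raw["user"], raw["action"].lower(), raw["target"], int(raw["ts"]))


def from_service_b(raw: Dict[str, Any]) -> CanonicalActivity:
    # Hypothetical service B: millisecond timestamps, and a like is called "fav".
    verb = {"fav": "like"}.get(raw["type"], raw["type"])
    return CanonicalActivity(raw["who"], verb, raw["what"], raw["time_ms"] // 1000)


NORMALIZERS: Dict[str, Callable[[Dict[str, Any]], CanonicalActivity]] = {
    "service_a": from_service_a,
    "service_b": from_service_b,
}


def normalize(source: str, raw: Dict[str, Any]) -> CanonicalActivity:
    """Dispatch a raw event to the normalizer registered for its source."""
    return NORMALIZERS[source](raw)


a = normalize("service_a", {"user": "alice", "action": "Like", "target": "cafe", "ts": "1275395400"})
b = normalize("service_b", {"who": "alice", "type": "fav", "what": "cafe", "time_ms": 1275395400000})
```

Here the same real-world action reported by two differently shaped sources ends up as the same canonical record, which is what allows downstream mining to ignore where an event came from.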
Once the data is standardized, correlations can be mined from it. The
challenge is to perform this in a real-time stream environment. Traditional data
mining algorithms require a fixed vocabulary and a fixed set of objects for their
calculation, an assumption that cannot hold for real-time streams.
To address the need for real-time, stream-based data mining, incremental
algorithms have to be used. The data set to be analyzed is typically constrained by
a sliding window that moves over the stream and controls which events are
considered and which are not. In addition, approximating the algorithms with
heuristics is necessary to meet real-time needs.
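A minimal sketch of the sliding-window idea, assuming events arrive in time order and that the statistic of interest is a simple count of (verb, object) pairs: each new event updates the counts incrementally, and expired events have their contribution undone instead of the whole window being recomputed. The class name, the tuple layout, and the 60-second width are illustrative assumptions.

```python
from collections import Counter, deque
from typing import Deque, Tuple

Event = Tuple[float, str, str, str]  # (timestamp, actor, verb, object)


class SlidingWindowCounts:
    """Incrementally maintained (verb, object) counts over the last `width` seconds."""

    def __init__(self, width: float) -> None:
        self.width = width
        self.events: Deque[Event] = deque()
        self.counts: Counter = Counter()

    def add(self, event: Event) -> None:
        """Admit one event (assumed to arrive in time order) and evict expired ones."""
        timestamp, _actor, verb, obj = event
        self.events.append(event)
        self.counts[(verb, obj)] += 1
        # Evict events that fell out of the window, undoing their contribution.
        while self.events and self.events[0][0] <= timestamp - self.width:
            _, _, old_verb, old_obj = self.events.popleft()
            key = (old_verb, old_obj)
            self.counts[key] -= 1
            if self.counts[key] == 0:
                del self.counts[key]  # the vocabulary shrinks as objects expire


window = SlidingWindowCounts(width=60.0)
window.add((0.0, "alice", "like", "cafe"))
window.add((30.0, "bob", "like", "cafe"))
window.add((70.0, "carol", "share", "article"))
```

Because keys are created and deleted as events enter and leave the window, no fixed vocabulary or fixed set of objects has to be known in advance.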
The following figure summarizes the high-level approach of Davai:
Davai analyzes online communication of users on social networks. Conversations
center on social objects, which are people, places and things we talk about.
Locations are the real-world locations where the conversations occur.
Communication on the social web manifests itself in online activity streams, i.e.
actor, verb, and object triplets over time. These constitute the observable
variables for which a predictive user model, a set of hidden states and state
transitions, has to be generated.
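One common way to formalize observable variables driven by hidden states and state transitions is a hidden Markov model. The text does not name a specific model, so the sketch below is only an illustration under that assumption: it runs the normalized forward recursion over a sequence of observed verbs to estimate the current hidden state. The states, verbs, and all probabilities are invented, and the observation sequence is assumed to be non-empty.

```python
from typing import Dict, List

# All states, verbs, and probabilities below are invented for illustration.
STATES = ["browsing", "buying"]
START = {"browsing": 0.8, "buying": 0.2}
TRANSITION = {
    "browsing": {"browsing": 0.7, "buying": 0.3},
    "buying": {"browsing": 0.4, "buying": 0.6},
}
EMISSION = {
    "browsing": {"like": 0.6, "share": 0.3, "purchase": 0.1},
    "buying": {"like": 0.2, "share": 0.1, "purchase": 0.7},
}


def normalized(weights: Dict[str, float]) -> Dict[str, float]:
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights.values())
    return {state: weight / total for state, weight in weights.items()}


def forward_filter(observations: List[str]) -> Dict[str, float]:
    """Return P(hidden state | observations so far) for a non-empty sequence."""
    belief = normalized({s: START[s] * EMISSION[s][observations[0]] for s in STATES})
    for verb in observations[1:]:
        # Predict: push the current belief through the state transitions.
        predicted = {s: sum(belief[r] * TRANSITION[r][s] for r in STATES) for s in STATES}
        # Update: weight each state by how well it explains the observed verb.
        belief = normalized({s: predicted[s] * EMISSION[s][verb] for s in STATES})
    return belief


belief = forward_filter(["like", "share", "purchase"])
```

In practice the parameters would be learned from data; they are hard-coded here only to keep the example self-contained.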
The predictive user models in Davai are generated through a process of statistical
machine-learning procedures. The models allow assigning users to specific classes
based on their online behavior. Classes can indicate topical and commercial
interests, responsiveness to special marketing campaigns, etc.