4. Influence in Social Networks
ο We live in communities and interact with friends, families,
and even strangers
ο This forms social networks
ο In social interactions, people may influence each other
5. Influence Diffusion & Viral Marketing
[Figure: word-of-mouth effect. The message "iPhone 5 is great" spreads from person to person through the network. Source: Wei Chen's KDD'10 slides]
6. Social Network as Directed Graph
ο Nodes: Individuals in the network
ο Edges: Links/relationships between individuals
ο Edge weight on $(u, v)$: influence weight $w_{u,v}$
[Figure: example directed graph with an influence weight on each edge]
7. Linear Threshold Model - Definition
ο Each node $v$ chooses an activation threshold $\theta_v$
uniformly at random from [0,1]
ο Time unfolds in discrete steps 0, 1, 2, …
ο At step 0, a set $S$ of seeds is activated
ο At any step $t$, activate node $v$ if
$\sum_{\text{active in-neighbor } u} w_{u,v} \ge \theta_v$
ο The diffusion stops when no more nodes can be activated
ο Influence spread of $S$: the expected number of active
nodes by the end of the diffusion, when targeting $S$ initially
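The definition above can be sketched as a small simulation. This is a minimal illustration, not the authors' code: the names `simulate_lt` and `estimate_spread`, the dictionary layout `in_weights[v][u] = w_{u,v}`, and the use of plain Monte Carlo averaging to approximate the expected spread are all assumptions made for the example.

```python
import random


def simulate_lt(in_weights, seeds, rng):
    """Run one LT diffusion and return the final set of active nodes.

    in_weights[v] maps each in-neighbor u of v to the weight w_{u,v}.
    Every node must appear as a key, even if it has no in-neighbors.
    """
    # Each node draws its threshold uniformly at random from [0, 1).
    thresholds = {v: rng.random() for v in in_weights}
    active = set(seeds)
    changed = True
    # The model is progressive, so sweeping until nothing changes reaches
    # the same final active set as the step-by-step process.
    while changed:
        changed = False
        for v, nbrs in in_weights.items():
            if v in active:
                continue
            total = sum(w for u, w in nbrs.items() if u in active)
            if total >= thresholds[v]:
                active.add(v)
                changed = True
    return active


def estimate_spread(in_weights, seeds, runs=1000, seed=0):
    """Monte Carlo estimate of the influence spread of `seeds`."""
    rng = random.Random(seed)
    total = sum(len(simulate_lt(in_weights, seeds, rng)) for _ in range(runs))
    return total / runs
```

For example, on the chain `{"a": {}, "b": {"a": 1.0}, "c": {"b": 1.0}}` with seed set `{"a"}`, every run activates all three nodes, since each threshold is below the incoming weight of 1.0.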
8. Linear Threshold Model - Example
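[Figure: step-by-step run of the LT model from seed v on a small weighted graph. Legend: inactive node, active node, threshold, total influence, edge weights. The diffusion stops once no remaining node reaches its threshold. Source: David Kempe's slides]
Influence spread of {v} is 4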
9. Influence Maximization
Problem: Select k individuals such that, by activating them, the influence spread is maximized.
Input: A directed graph representing a social network, with influence weights on edges
Output: A seed set of k individuals
ο The problem is NP-hard, and computing the exact influence spread is #P-hard
11. Influence vs. Product Adoption
ο Classical models do not fully capture monetary aspects of
product adoptions
ο Being influenced != Being willing to purchase
ο HP TouchPad: significant price drop in 2011 ($499 → $99)
ο Worldwide market share of cellphones (as of July 2011):
1. Nokia
2. Samsung (boo…)
3. LG (…)
4. Apple iPhone
ο iPhone: More expensive in hardware and monthly rate plans
(ask Rogers, Telus, or Bell…)
12. Product Adoption
ο Product adoption is a two-stage process (Kalish 1985)
ο 1st stage: Awareness
ο Get exposed to the product
ο Become familiar with its features
ο 2nd stage: Actual adoption
ο Only if valuation outweighs price
ο Awareness is modeled as being propagated through word-of-mouth: captured by classical models
ο On the other hand, the 2nd stage is not captured by classical models
13. Valuations for Products
ο One's valuation for a product = the maximum amount of
money one is willing to pay
ο People do not want to reveal valuations for trust and privacy
reasons (Kleinberg & Leighton, FOCS'03)
ο IPV (Independent Private Value) assumption: The
valuation of each person is drawn independently at random
from a certain distribution (Shoham & Leyton-Brown 2009)
ο Price-taker assumption: Users respond myopically to
price, comparing it only with own valuation
14. Our Contribution
ο Incorporate monetary aspects into the modeling of the
diffusion process of product adoption
ο Price & user valuations
ο Seeding costs
ο LT → LT with user valuations (LT-V)
ο Profit maximization (ProMax) under LT-V
ο Price-Aware GrEedy algorithm (PAGE)
16. Linear Threshold Model with Valuations (LT-V)
ο Three node states: Inactive, Influenced, and Adopting
ο Inactive → Influenced: same as in LT
ο Influenced → Adopting: only if the valuation is at least the quoted price
ο Only adopting nodes will propagate influence to inactive neighbors
ο Model is progressive (see figure)
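The three-state dynamics above can be sketched as follows. This is an illustrative sketch only: the function name `simulate_ltv`, the data layout, and the choice that a targeted seed counts as influenced and adopts only if its own valuation covers its quoted price are assumptions made for the example.

```python
import random


def simulate_ltv(in_weights, seeds, prices, valuations, rng):
    """Run one LT-V diffusion; return (influenced, adopting) node sets.

    in_weights[v] maps each in-neighbor u of v to the weight w_{u,v}.
    prices[v] is the price quoted to v; valuations[v] is v's sampled valuation.
    """
    thresholds = {v: rng.random() for v in in_weights}
    # Assumption: seeds are exposed to the product by being targeted.
    influenced = set(seeds)
    adopting = {v for v in seeds if valuations[v] >= prices[v]}
    changed = True
    while changed:
        changed = False
        for v, nbrs in in_weights.items():
            if v in influenced:
                continue  # progressive model: states never revert
            # Only adopting in-neighbors contribute influence.
            total = sum(w for u, w in nbrs.items() if u in adopting)
            if total >= thresholds[v]:
                influenced.add(v)
                if valuations[v] >= prices[v]:
                    adopting.add(v)
                changed = True
    return influenced, adopting
```

On a chain a → b → c with all weights 1.0, a common price of 0.5, and valuations 0.9, 0.2, 0.9, seeding a leaves b influenced but not adopting, so c is never reached. A classical LT run on the same graph would activate all three nodes.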
17. More about LT-V
ο Our LT-V model captures the two-stage product adoption
process in (Kalish 1985)
ο Only adopting nodes propagate influence: actual adopters can
access experience-based features of the product
ο Usability, e.g., is it easy to shoot night scenes using a Nikon D600?
ο Durability, e.g., how long can the iPhone 5's battery last on LTE?
ο Still quite abstract, with room for extensions and refinements
(more to come later)
18. ProMax: Notations
ο $\mathbf{p} = (p_1, p_2, \ldots, p_{|V|})$: the vector of quoted prices, one per node
ο $S$: the seed set
ο $\pi: 2^V \times [0,1]^{|V|} \to \mathbb{R}$: the profit function
ο $\pi(S, \mathbf{p})$: the expected profit earned by targeting $S$ and setting
prices $\mathbf{p}$
19. ProMax Problem Definition
Problem: Select a set $S$ of seeds and determine a vector $\mathbf{p}$ of quoted prices such that $\pi(S, \mathbf{p})$ is maximized under the LT-V model.
Input: A directed graph representing a social network, with influence weights on edges
Output: A seed set $S$ and a price vector $\mathbf{p}$
20. ProMax vs. InfMax
ο Differences from InfMax under LT
ο Propagation models are different & have distinct properties
ο InfMax only requires a "binary decision" on nodes, while ProMax
also requires setting prices
21. A Restricted Special Case
ο Simplifying assumptions:
ο Valuation distributions degenerate to a single point:
$v_i = p, \ \forall u_i \in V$
ο Seeds get the item for free (price = 0)
ο The optimal price vector is then no longer a question
ο Restricted ProMax: Find an optimal seed set $S$ to
maximize $\pi(S) = p \cdot h_L(S) - c_a \cdot |S|$
ο $h_L(S)$: expected #adopting nodes under LT-V
ο $c_a$: acquisition cost (seeding expenses)
22. A Restricted Special Case
ο $\pi(S)$ is non-monotone, but submodular in $S$
ο No need to preset #seeds to pick (the number k in InfMax)
ο Simple greedy cannot be applied to get approx. guarantees
ο Theorem: The restricted ProMax problem is NP-hard.
ο Reduction from the Minimum Vertex Cover problem
ο Aside on maximizing non-monotone submodular functions:
Local search approximation algorithms (Feige, Mirrokni, &
Vondrák, FOCS'07)
ο Nice, but time complexity too high
ο They assumed an oracle for evaluating the function
23. Unbudgeted Greedy (U-Greedy)
ο Simply grow the seed set π by selecting the node with the
largest marginal increase in profit, and stop when no nodes
can provide positive marginal gain.
ο Theorem (Quality guarantee of U-Greedy)
$\pi(S_g) \ge \left(1 - \frac{1}{e}\right) \pi(S^*) - \Theta\left(\max\{|S^*|, |S_g|\}\right)$
ο $S_g$: seed set by U-Greedy
ο $S^*$: optimal seed set
ο Proof: some algebra… omitted…
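The U-Greedy loop described above can be sketched generically against a profit oracle. This is a minimal sketch under stated assumptions: `u_greedy` and `make_toy_profit` are hypothetical names, and the toy oracle (price times the number of covered items, minus a per-seed cost) is only a deterministic stand-in. Under LT-V the oracle would have to be an estimate, for instance by Monte Carlo simulation, because the exact profit is #P-hard to compute.

```python
def u_greedy(nodes, profit):
    """Grow a seed set greedily; stop when no node has positive marginal gain.

    nodes:  list of candidate nodes (order breaks ties).
    profit: callable mapping a set of seeds to a number.
    """
    seeds = set()
    current = profit(seeds)
    while True:
        best_node, best_gain = None, 0.0
        for v in nodes:
            if v in seeds:
                continue
            gain = profit(seeds | {v}) - current
            if gain > best_gain:
                best_node, best_gain = v, gain
        if best_node is None:
            break  # no remaining node improves the profit
        seeds.add(best_node)
        current += best_gain
    return seeds


def make_toy_profit(reach, price, cost):
    """Hypothetical oracle: price * |items covered by seeds| - cost * |seeds|."""
    def profit(seeds):
        covered = set().union(*(reach[v] for v in seeds))
        return price * len(covered) - cost * len(seeds)
    return profit
```

Unlike greedy for InfMax, there is no budget k here: the number of seeds is determined by when the marginal profit stops being positive.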
24. Properties of LT-V (in general)
ο For an arbitrary vector of valuation samples $\mathbf{v} = (v_1, v_2, \ldots, v_{|V|})$, given an instance of the LT-V model, for
any fixed vector $\mathbf{p}$ of prices, the profit function $\pi(S, \mathbf{p})$ is
submodular in $S$.
ο It is #P-hard to compute the exact value of $\pi(S, \mathbf{p})$, given
any $S$ and $\mathbf{p}$.
26. ProMax Algorithm: All-OMP
ο Given the distribution function (CDF) $F_i$ of $v_i$, the
Optimal Myopic Price (OMP) is
$p_i^m = \arg\max_{p \in [0,1]} \; p \cdot (1 - F_i(p))$
ο First baseline, All-OMP: Offer OMP to all nodes, and select
seeds using U-Greedy.
ο Ensures max. profit earned solely from a single influenced
node
ο Ignores network structures and "profit potential" (from
influence) of seeds
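A small sketch of computing the OMP numerically, assuming it is the price in [0,1] that maximizes the expected revenue p · (1 − F_i(p)) from a single influenced node. The function name `optimal_myopic_price` and the grid-search approach are choices made for this illustration only.

```python
def optimal_myopic_price(cdf, grid_size=1000):
    """Return the grid price in [0, 1] maximizing p * (1 - cdf(p)).

    cdf: callable giving the valuation CDF F_i at a price p.
    """
    best_price, best_revenue = 0.0, 0.0
    for k in range(grid_size + 1):
        p = k / grid_size
        # A node buys at price p with probability 1 - F_i(p).
        revenue = p * (1.0 - cdf(p))
        if revenue > best_revenue:
            best_price, best_revenue = p, revenue
    return best_price
```

For valuations uniform on [0,1], `optimal_myopic_price(lambda p: p)` returns 0.5, the maximizer of p(1 − p).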
27. ProMax Algorithm: Free-For-Seeds (FFS)
ο Seeds receive the product for free
ο Non-seeds are charged OMP
ο Ensures that all seeds will adopt & propagate influence
ο Trade-off: immediate profit from seeds vs. profit potential
of seeds (through influence)
ο All-OMP favors the former, good for low-influence networks
ο FFS favors the latter, good for high-influence networks
ο Can we achieve a more balanced heuristic?
28. Price-Aware GrEedy (PAGE) Algorithm
ο The key question: Given a partial seed set, how to determine
the price $p_i$ for the next seed candidate $u_i$?
ο Consider the marginal profit that $u_i$ brings
ο This is a function of $p_i$
ο So, let's find the $p_i$ that maximizes it
29. PAGE details
ο Offering $p_i$ to $u_i$ leads to 2 possible worlds:
ο $u_i$ accepts, w.p. $1 - F_i(p_i)$.
ο $u_i$ does not accept, w.p. $F_i(p_i)$.
ο Re-write the marginal profit in terms of $Y_1$ and $Y_0$
ο $Y_1$: expected profit earned from other nodes, if $u_i$ accepts
ο $Y_0$: expected profit earned from other nodes, otherwise
ο Finding the optimal $p_i$ depends on the specific form of $F_i$
ο We study two cases:
ο Normal distribution
ο Uniform distribution
30. Normal Distribution
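ο CDF (mean $\mu$, standard deviation $\sigma$): $F_i(x) = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{x - \mu}{\sigma\sqrt{2}}\right)\right]$,
where erf(x) is the error function
ο No analytical solution can be found, since erf(x) has no
closed-form expression, and thus neither does the marginal profit as a function of $p_i$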
ο Numerical method: the golden section search algorithm
31. Aside: Golden Section Search
ο Finds the extremum of a function by iteratively narrowing
the interval inside which the extremum is known to exist
ο The function must be unimodal and continuous over the initial interval
ο Terminates when the length of the interval is smaller than a pre-defined number (say, $10^{-6}$)
ο No need to take derivatives (which Newton's method needs)
ο Performs only one new function evaluation in each step
ο Has a constant reduction factor for the size of the interval
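The properties listed above can be seen in a short sketch of golden section search for a maximum. The objective `expected_gain` is a hypothetical example: it assumes the marginal profit takes the two-world form (1 − F(p))(p + Y1) + F(p)·Y0 with a normal valuation CDF, and its parameter values are made up for illustration.

```python
import math

INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0  # about 0.618: the constant reduction factor


def golden_section_max(f, lo, hi, tol=1e-6):
    """Approximate the maximizer of a unimodal, continuous f on [lo, hi]."""
    x1 = hi - INV_PHI * (hi - lo)
    x2 = lo + INV_PHI * (hi - lo)
    f1, f2 = f(x1), f(x2)
    while hi - lo > tol:
        if f1 < f2:
            # Maximum lies in [x1, hi]; the old x2 becomes the new x1.
            lo = x1
            x1, f1 = x2, f2
            x2 = lo + INV_PHI * (hi - lo)
            f2 = f(x2)  # the only new evaluation in this step
        else:
            # Maximum lies in [lo, x2]; the old x1 becomes the new x2.
            hi = x2
            x2, f2 = x1, f1
            x1 = hi - INV_PHI * (hi - lo)
            f1 = f(x1)  # the only new evaluation in this step
    return (lo + hi) / 2.0


def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def expected_gain(p, y1=0.3, y0=0.1, mu=0.5, sigma=0.2):
    """Hypothetical objective: accept w.p. 1 - F(p), earning p + y1; else y0."""
    accept = 1.0 - normal_cdf(p, mu, sigma)
    return accept * (p + y1) + (1.0 - accept) * y0
```

Calling `golden_section_max(expected_gain, 0.0, 1.0)` narrows the unit interval below 1e-6 in roughly 30 steps, since each step shrinks the interval by the same factor of about 0.618.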
32. Uniform Distribution
ο CDF:
ο So,
ο Easily, we solve for optimal $p_i$:
ο If the solution is < 0 or > 1, clamp it to 0 or 1 (respectively)
ο N.B., this solution framework can be applied to any
valuation distribution. The actual solution may be analytical or
numerical, depending on the distribution itself.
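A worked sketch of the uniform case, under two assumptions that are not stated explicitly in the text: valuations are uniform on [0,1], so F_i(p) = p, and the part of the marginal profit that depends on p_i has the two-world form (1 − F_i(p_i))(p_i + Y1) + F_i(p_i)·Y0. Substituting F_i(p) = p gives a concave quadratic in p_i whose derivative 1 − 2p_i − Y1 + Y0 vanishes at p_i = (1 + Y0 − Y1)/2. The function name `page_price_uniform` is hypothetical.

```python
def page_price_uniform(y1, y0):
    """Price maximizing (1 - p) * (p + y1) + p * y0 over p in [0, 1].

    y1: expected profit from other nodes if the candidate accepts.
    y0: expected profit from other nodes otherwise.
    """
    # Stationary point of the concave quadratic in p.
    p = (1.0 + y0 - y1) / 2.0
    # Clamp to the valid price range.
    return min(1.0, max(0.0, p))
```

Under these assumptions, a candidate whose adoption adds a lot of downstream profit (Y1 much larger than Y0) is quoted a price near 0, which matches the intuition behind Free-For-Seeds, while a candidate with no downstream effect (Y1 = Y0) is quoted 0.5, the myopic optimum for the uniform distribution.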
34. Network Datasets
ο Epinions
ο A who-trusts-whom network from the customer reviews site
Epinions.com
ο Flixster
ο A friendship network from social movie site Flixster.com
ο NetHEPT
ο A co-authorship network from the arxiv.org High Energy Physics
Theory section.
36. Influence Weights in Datasets
ο Weighted Distribution (WD): $w_{u,v} = A_{u,v} / N_v$
ο $A_{u,v}$: #actions $u$ and $v$ both have performed (if data is time-stamped, then $u$ should perform earlier)
ο $N_v$: total #actions $v$ has performed
ο Trivalency (TV): $w_{u,v}$ is chosen uniformly at random from
{0.001, 0.01, 0.1}.
ο All weights shall be normalized to ensure $\forall v, \ \sum_u w_{u,v} \le 1$
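The WD weights and the normalization step can be sketched as below. This is a simplified illustration: the function names and the input layout (a dict from user to the set of action ids they performed) are assumptions, and timestamps are ignored, so the "u should perform earlier" condition is not enforced here.

```python
from collections import defaultdict


def weighted_distribution(actions, edges):
    """Compute w_{u,v} = A_{u,v} / N_v for each directed edge (u, v).

    actions: dict mapping a user to the set of action ids they performed.
    edges:   iterable of (u, v) pairs.
    """
    weights = {}
    for u, v in edges:
        n_v = len(actions.get(v, set()))
        if n_v == 0:
            weights[(u, v)] = 0.0  # v performed no actions: no evidence of influence
        else:
            common = len(actions.get(u, set()) & actions[v])
            weights[(u, v)] = common / n_v
    return weights


def normalize_incoming(weights):
    """Scale incoming weights so that, for every v, they sum to at most 1."""
    totals = defaultdict(float)
    for (_, v), w in weights.items():
        totals[v] += w
    return {
        (u, v): (w / totals[v] if totals[v] > 1.0 else w)
        for (u, v), w in weights.items()
    }
```

Normalization matters for the LT model because a node whose incoming weights sum to more than 1 could not be given a meaningful threshold in [0,1].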
39. Estimating Valuation Distributions
ο Besides ratings (1 to 5 stars), users may optionally provide
the price they paid
ο At the end of the same review:
40. Estimating Valuation Distributions
ο We obtain all reviews for Canon EOS 300D, 350D, and 400D
DSLR cameras
ο Sequential releases in 3 years → approximately the same
monetary values
ο Remove reviews without price: 276 samples remain
ο View rating as utility
utility = valuation − price paid
ο Thus, our estimation is
44. Experimental Results: Running Time
ο PAGE is more efficient
ο Leveraging lazy-forwarding more effectively
ο Extra overhead for computing prices is small (the golden section search
algorithm converges in fewer than 40 iterations)
46. Conclusions
ο Extended LT model to incorporate price and valuations &
distinguish product adoption from social influence
ο Studied the properties of the extended model
ο Proposed the profit maximization (ProMax) problem & an effective
algorithm to solve it
47. Discussions & Future Work
ο Make similar extensions to other influence propagation
models: IC, LT-C, or even the general threshold model
ο Develop fast heuristics to give more efficient & scalable
algorithms
ο Consider incorporating other elements into the modeling of
product adoption
ο People's spontaneous interest in a product (natural early
adopters)
ο Valuation may change over time for some people
ο Valuations may exhibit externalities
48. Related Work
ο Influence maximization (too many…)
ο Revenue/profit maximization in social networks
ο Hartline et al., WWW'08: Influence-and-Exploit
ο Arthur et al., WINE'09
ο Chen et al., WINE'11
ο Bloch & Quérou, working paper, 2011
ο Influence vs. product adoption
ο Bhagat, Goyal, & Lakshmanan, WSDM'12: LT model with
Colors