SlideShare una empresa de Scribd logo
1 de 24
A Novel Target Marketing
Approach based on
Influence Maximization
Motivation
• “Businesses on Facebook and Twitter are reaching only 2% of
their fans and only 0.07% of follower actually interact with
their post.” – Forrester Study, Nov. 17, 2014
• Local business owner need to target market people nearby, to
increase footfall
• Traditional methods of marketing like leafleting are inefficient
• “82% people check review online before spending money on
product/service” – Nielsen Study, July 1, 2013
• Local businesses can use online review websites like Yelp,
Zomato to target customers effectively.
Problem Statement
• “To develop a novel approach for Identification of influential customers for target
marketing through Influence Maximization.”
Objectives
Fig. 1
Influence Maximization
• It is problem to find K vertices in the graph such that under the diffusion model, the expected
number of vertices influenced by the K vertices (referred to as influence spread) is the largest
possible
• The Independent Cascade (IC) model is the simplest diffusion model. If j is a neighbor of i
then the probability of j being activated by i is:
Eq. 5
i j
pij
wij
Existing Work
• Kemp et al. were first to study the optimization problem of influence maximization
• Proved it to be a NP-hard problem, gave a time inefficient Greedy algorithm
• GeneralGreedy repeats k rounds: in the ith round, select a node v that provides the largest
increase in influence spread
• In each round influence spread is calculated by Monte-Carlo simulations.
Cont’d
• Chen et al. developed NewGreedyIC, an improved Greedy algorithm
• NewGreedyIC also runs Monte-Carlo simulations, but in each iteration it generates a random
graph G’ by randomly removing edges from the existing graph G. This makes the size of graph
in that iteration smaller and hence is faster than GeneralGreedy method
Cont’d
• Chen et al. also proposed a more efficient DegreeDiscount method
• DegreeDiscount method doesn’t run Monte-Carlo simulations, it uses degree discount
heuristics where it is assumed that the spread increases with the degree of nodes.
• It gives discount in the degree of a node by one if any of its neighbors have already been
selected in the set of active nodes.
• It is 6 time faster than NewGreedyIC. It gives influence spread slightly lower than
NewGreedyIC.
[link]
A
3
5
6
A
2
4
5
Inspiration from existing work
• DegreeDiscount method eliminates need for Monte-Carlo simulations by using degree
heuristic.
• This reduces running time compared to NewGreedy by manifold.
Research Gap
• DegreeDiscount doesn’t take into account
the overlapping part of spread of two
influential nodes
• Due to which the total influence spread will
be lesser than sum of their individual
influence spread
• Our novel algorithm adds that node as kth
node which maximize difference between
spread of already selected k-1 nodes and
that of k nodes after addition
• C-A has more difference in spread than B-A.
A
B
C
A
B
C
Our
Approach:
ANIM
(A Novel Influence
Maximization approach)
Fig. 5
dbANIM
Yelp Dataset Description
Name Attributes
User {user_id, name, review_count, average_stars, friends, yelping_since, elite}
Business {business_id, name, review_count, stars, address, latitude, longitude, categories}
Review {user_id, business_id, text, stars, votes}
Users 252,898
User-user edges(friendship) 955,999
Businesses 42,153
Reviews 1,125,458
Data and Preprocessing
• The semi-structured data obtained from Yelp is stored in a Document Oriented database.
• Preprocessing is done to clean the data.
• Social network is formed from users who have reviewed similar nearby businesses.
• Users are represented as nodes in the network, and two nodes are joined by an edge only if
they are friends.
Edge weight calculation in network
• The weight of an edge between two users X and Y is calculated by the formula:
• w1 is the normalized count of mutual friends between X and Y
where nx and ny are the list of friends of user X and user Y.
• w2 signifies the similarity in opinion of user X and user Y
where and
• Xpos is the set of businesses that X rated positively; Xneg is the set of businesses that X rated
negatively.
• We have considered a rating of 3 or below as negative review, and 4 or above as positive
review. [old]
Eq. 9
Eq. 7
Eq. 8
Propagation probability calculation in network
• Propagation probability of an edge going from u to v was calculated by:
• Strength of an edge between u and v is the average of influence of u and v
• Where
• For popularity we used two attributes of the user, reviewCount and averageStars
• The clustering value is defined as the closeness of a node to a cluster of highly interconnected
nodes.
• C(v) is clustering value of a node given by:
• Quartiles were used for normalization.[link]
Eq. 17
Eq. 16
Eq. 15
Eq. 10
Eq. 11
wsANIM
Fig. 6
Our novel approach: spreadHeuristicIC Algorithm
• Proposed algorithm is a greedy algorithm.
• It iteratively finds a node and add it to the set S of top-K influential nodes.
• While adding kth node to set S, it finds the node that maximize the difference between
spread of already selected k-1 nodes and spread of set S after adding that kth node.
A
B
C
Pseudo code for the algorithm
Complexity Analysis
• The algorithm take O(V) steps in line 3 and line 4 take O(T) time, where T is the time to
compute the coverage of a node in the graph G, and it takes O(IE) time (where I is the number
of simulations for the Independent Cascade model, and E is the number of edges in graph G).
• From lines 7-9, complexity of each line is O(VlgV) when we use sorting for union operation.
• So, overall complexity of the algorithm is O(K(VIE + VlgV)).
Experiments and Results
• We have conducted experiments for our algorithm and various other algorithms (i.e.- Degree
Discount algorithm, Single Discount algorithm, Degree Discount algorithm, General Greedy
algorithm etc.) on Yelp’s network.
• We find that the Spread Heuristic based algorithm has more influence spread compared to
the other algorithms. The ranking based on influence spread comes out to be:
spreadHeuristicIC > newGreedyIC > degreeDiscountIC > random
Cont’d
Influence spread for G with n=1617, E=2058
0
50
100
150
200
250
300
0 10 20 30 40 50 60 70 80
InfluenceSpread
K
degreeDiscountIC degreeDiscountIC2 degreeDiscountStar
degreeHeuristic degreeHeuristic2 singleDiscount
highestDegree newGreedyIC randomHeuristic
spreedHeuristic
Influence spread for G with n=4292, E=8147
0
50
100
150
200
250
0 10 20 30 40 50 60 70 80
InfluenceSpread
K
degreeDiscountIC degreeDiscountIC2 degreeDiscountStar
degreeHeuristic degreeHeuristic2 singleDiscount
highestDegree newGreedyIC randomHeuristic
spreadHeuristic
Fig. 9Fig. 7
Cont’d
Run time for G with n=1617, E=2058 Run time for G with n=4292, E=8147
-10
0
10
20
30
40
50
60
70
80
0 10 20 30 40 50 60 70 80
RunningTime(sec)
K
degreeDiscountIC degreeDiscountIC2 degreeDiscountStar
degreeHeuristic degreeHeuristic2 singleDiscount
highestDegree newGreedyIC randomHeuristic
spreadHeuristic
0
50
100
150
200
250
300
350
0 10 20 30 40 50 60 70 80
RunningTime(sec)
K
degreeDiscountIC degreeDiscountIC2 degreeDiscountStar
degreeHeuristic degreeHeuristic2 singleDiscount
highestDegree newGreedyIC randomHeuristic
spreedHeuristic
Fig. 10Fig. 8
Conclusion
• With respect to initial aims and objectives of this project, the final outcome is fairly
successful.
• After series of experiments, we concluded that our algorithm outperforms existing influence
maximization algorithms.
• We developed a dashboard for the businesses to visualize the influential users and their
spread among the people nearby.
References
[1] M.E.J. Newman, M. Girvan, Finding and evaluating community structure in networks, Phys. Rev. E 69 (2) (2004) 026113.
[2] Blondel, Vincent D., et al. "Fast unfolding of communities in large networks. "Journal of Statistical Mechanics: Theory and Experiment 2008.10 (2008):
P10008.
[3]. J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and N. S. Glance. Cost-effective outbreak detection in networks. In Proceedings of the 13th
ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 420–429, 2007.
[4] “Yelp Dataset,” https://www.yelp.com/dataset challenge/dataset.
[5]. D. Kempe, J. Kleinberg, E. Tardos. Maximizing the Spread of Influence through a Social Network. Proc. 9th ACM SIGKDD Intl. Conf. on Knowledge Discovery
and Data Mining, 2003.
[6]. M. Richardson, P. Domingos. Mining Knowledge-Sharing Sites for Viral Marketing. Eighth Intl. Conf. on Knowledge Discovery and Data Mining, 2002.
[7] J. Goldenberg, B. Libai, E. Muller. Talk of the Network: A Complex Systems Look at the Underlying Process of Word-of-Mouth. Marketing Letters 12:3(2001),
211-223
[8] M. Granovetter, Threshold models of collective behavior, the American Journal of sociology, vol. 83, no. 6, pp.1420-1443, May 1978
[9] Chen, Wei, Yajun Wang, and Siyu Yang. "Efficient influence maximization in social networks." Proceedings of the 15th ACM SIGKDD international conference
on Knowledge discovery and data mining. ACM, 2009.
[10] Kempe, David, Jon Kleinberg, and Éva Tardos. "Maximizing the spread of influence through a social network." Proceedings of the ninth ACM SIGKDD
international conference on Knowledge discovery and data mining. ACM, 2003.
[11] Wang, Yu, et al. "Community-based greedy algorithm for mining top-k influential nodes in mobile social networks." Proceedings of the 16th ACM SIGKDD
international conference on Knowledge discovery and data mining. ACM, 2010.
[12] Saito, Kazumi, Ryohei Nakano, and Masahiro Kimura. "Prediction of information diffusion probabilities for independent cascade model." Knowledge-Based
Intelligent Information and Engineering Systems. Springer Berlin Heidelberg, 2008.
[13] Newman, Mark EJ. "Analysis of weighted networks." Physical Review E 70.5 (2004): 056131.

Más contenido relacionado

Similar a A Novel Target Marketing Approach based on Influence Maximization

cs224w-79-final
cs224w-79-finalcs224w-79-final
cs224w-79-finalDarren Koh
 
Graph Analysis Beyond Linear Algebra
Graph Analysis Beyond Linear AlgebraGraph Analysis Beyond Linear Algebra
Graph Analysis Beyond Linear AlgebraJason Riedy
 
Ripple Algorithm to Evaluate the Importance of Network Nodes
Ripple Algorithm to Evaluate the Importance of Network NodesRipple Algorithm to Evaluate the Importance of Network Nodes
Ripple Algorithm to Evaluate the Importance of Network Nodesrahulmonikasharma
 
Introduction to machine learning and model building using linear regression
Introduction to machine learning and model building using linear regressionIntroduction to machine learning and model building using linear regression
Introduction to machine learning and model building using linear regressionGirish Gore
 
A network pruning based approach for subset specific influential detection
A network pruning based approach for subset specific influential detectionA network pruning based approach for subset specific influential detection
A network pruning based approach for subset specific influential detectionArun Kalyanasundaram
 
Least Cost Influence in Multiplex Social Networks
Least Cost Influence in Multiplex Social NetworksLeast Cost Influence in Multiplex Social Networks
Least Cost Influence in Multiplex Social NetworksNatasha Mandal
 
(141205) Masters_Thesis_Defense_Sundong_Kim
(141205) Masters_Thesis_Defense_Sundong_Kim(141205) Masters_Thesis_Defense_Sundong_Kim
(141205) Masters_Thesis_Defense_Sundong_KimSundong Kim
 
Social Network Analysis with Spark
Social Network Analysis with SparkSocial Network Analysis with Spark
Social Network Analysis with SparkGhulam Imaduddin
 
MIS637_Final_Project_Rahul_Bhatia
MIS637_Final_Project_Rahul_BhatiaMIS637_Final_Project_Rahul_Bhatia
MIS637_Final_Project_Rahul_BhatiaRahul Bhatia
 
RESOLVING MULTI-PARTY PRIVACY CONFLICTS IN SOCIAL MEDIA
RESOLVING MULTI-PARTY PRIVACY CONFLICTS IN SOCIAL MEDIARESOLVING MULTI-PARTY PRIVACY CONFLICTS IN SOCIAL MEDIA
RESOLVING MULTI-PARTY PRIVACY CONFLICTS IN SOCIAL MEDIANexgen Technology
 
Relational machine-learning
Relational machine-learningRelational machine-learning
Relational machine-learningBhushan Kotnis
 
From econophysicsto networks to data science: Estonian network of payments
From econophysicsto networks to data science: Estonian network of paymentsFrom econophysicsto networks to data science: Estonian network of payments
From econophysicsto networks to data science: Estonian network of paymentsEesti Pank
 
Fractional step discriminant pruning
Fractional step discriminant pruningFractional step discriminant pruning
Fractional step discriminant pruningVasileiosMezaris
 
Creating Community at WeWork through Graph Embeddings with node2vec - Karry Lu
Creating Community at WeWork through Graph Embeddings with node2vec - Karry LuCreating Community at WeWork through Graph Embeddings with node2vec - Karry Lu
Creating Community at WeWork through Graph Embeddings with node2vec - Karry LuRising Media Ltd.
 

Similar a A Novel Target Marketing Approach based on Influence Maximization (20)

cs224w-79-final
cs224w-79-finalcs224w-79-final
cs224w-79-final
 
Big Data Challenges and Solutions
Big Data Challenges and SolutionsBig Data Challenges and Solutions
Big Data Challenges and Solutions
 
Cluster Analysis for Dummies
Cluster Analysis for DummiesCluster Analysis for Dummies
Cluster Analysis for Dummies
 
Graph Analysis Beyond Linear Algebra
Graph Analysis Beyond Linear AlgebraGraph Analysis Beyond Linear Algebra
Graph Analysis Beyond Linear Algebra
 
Ripple Algorithm to Evaluate the Importance of Network Nodes
Ripple Algorithm to Evaluate the Importance of Network NodesRipple Algorithm to Evaluate the Importance of Network Nodes
Ripple Algorithm to Evaluate the Importance of Network Nodes
 
Introduction to machine learning and model building using linear regression
Introduction to machine learning and model building using linear regressionIntroduction to machine learning and model building using linear regression
Introduction to machine learning and model building using linear regression
 
A network pruning based approach for subset specific influential detection
A network pruning based approach for subset specific influential detectionA network pruning based approach for subset specific influential detection
A network pruning based approach for subset specific influential detection
 
Massivegraph telecom ppt
Massivegraph telecom pptMassivegraph telecom ppt
Massivegraph telecom ppt
 
Least Cost Influence in Multiplex Social Networks
Least Cost Influence in Multiplex Social NetworksLeast Cost Influence in Multiplex Social Networks
Least Cost Influence in Multiplex Social Networks
 
Complexity metrics and models
Complexity metrics and modelsComplexity metrics and models
Complexity metrics and models
 
Complexity metrics and models
Complexity metrics and modelsComplexity metrics and models
Complexity metrics and models
 
(141205) Masters_Thesis_Defense_Sundong_Kim
(141205) Masters_Thesis_Defense_Sundong_Kim(141205) Masters_Thesis_Defense_Sundong_Kim
(141205) Masters_Thesis_Defense_Sundong_Kim
 
Social Network Analysis with Spark
Social Network Analysis with SparkSocial Network Analysis with Spark
Social Network Analysis with Spark
 
MIS637_Final_Project_Rahul_Bhatia
MIS637_Final_Project_Rahul_BhatiaMIS637_Final_Project_Rahul_Bhatia
MIS637_Final_Project_Rahul_Bhatia
 
50120130406039
5012013040603950120130406039
50120130406039
 
RESOLVING MULTI-PARTY PRIVACY CONFLICTS IN SOCIAL MEDIA
RESOLVING MULTI-PARTY PRIVACY CONFLICTS IN SOCIAL MEDIARESOLVING MULTI-PARTY PRIVACY CONFLICTS IN SOCIAL MEDIA
RESOLVING MULTI-PARTY PRIVACY CONFLICTS IN SOCIAL MEDIA
 
Relational machine-learning
Relational machine-learningRelational machine-learning
Relational machine-learning
 
From econophysicsto networks to data science: Estonian network of payments
From econophysicsto networks to data science: Estonian network of paymentsFrom econophysicsto networks to data science: Estonian network of payments
From econophysicsto networks to data science: Estonian network of payments
 
Fractional step discriminant pruning
Fractional step discriminant pruningFractional step discriminant pruning
Fractional step discriminant pruning
 
Creating Community at WeWork through Graph Embeddings with node2vec - Karry Lu
Creating Community at WeWork through Graph Embeddings with node2vec - Karry LuCreating Community at WeWork through Graph Embeddings with node2vec - Karry Lu
Creating Community at WeWork through Graph Embeddings with node2vec - Karry Lu
 

A Novel Target Marketing Approach based on Influence Maximization

  • 1. A Novel Target Marketing Approach based on Influence Maximization
  • 2. Motivation • “Businesses on Facebook and Twitter are reaching only 2% of their fans and only 0.07% of follower actually interact with their post.” – Forrester Study, Nov. 17, 2014 • Local business owner need to target market people nearby, to increase footfall • Traditional methods of marketing like leafleting are inefficient • “82% people check review online before spending money on product/service” – Nielsen Study, July 1, 2013 • Local businesses can use online review websites like Yelp, Zomato to target customers effectively.
  • 3. Problem Statement • “To develop a novel approach for Identification of influential customers for target marketing through Influence Maximization.” Objectives Fig. 1
  • 4. Influence Maximization • It is problem to find K vertices in the graph such that under the diffusion model, the expected number of vertices influenced by the K vertices (referred to as influence spread) is the largest possible • The Independent Cascade (IC) model is the simplest diffusion model. If j is a neighbor of i then the probability of j being activated by i is: Eq. 5 i j pij wij
  • 5. Existing Work • Kemp et al. were first to study the optimization problem of influence maximization • Proved it to be a NP-hard problem, gave a time inefficient Greedy algorithm • GeneralGreedy repeats k rounds: in the ith round, select a node v that provides the largest increase in influence spread • In each round influence spread is calculated by Monte-Carlo simulations.
  • 6. Cont’d • Chen et al. developed NewGreedyIC, an improved Greedy algorithm • NewGreedyIC also runs Monte-Carlo simulations, but in each iteration it generates a random graph G’ by randomly removing edges from the existing graph G. This makes the size of graph in that iteration smaller and hence is faster than GeneralGreedy method
  • 7. Cont’d • Chen et al. also proposed a more efficient DegreeDiscount method • DegreeDiscount method doesn’t run Monte-Carlo simulations, it uses degree discount heuristics where it is assumed that the spread increases with the degree of nodes. • It gives discount in the degree of a node by one if any of its neighbors have already been selected in the set of active nodes. • It is 6 time faster than NewGreedyIC. It gives influence spread slightly lower than NewGreedyIC. [link] A 3 5 6 A 2 4 5
  • 8. Inspiration from existing work • DegreeDiscount method eliminates need for Monte-Carlo simulations by using degree heuristic. • This reduces running time compared to NewGreedy by manifold.
  • 9. Research Gap • DegreeDiscount doesn’t take into account the overlapping part of spread of two influential nodes • Due to which the total influence spread will be lesser than sum of their individual influence spread • Our novel algorithm adds that node as kth node which maximize difference between spread of already selected k-1 nodes and that of k nodes after addition • C-A has more difference in spread than B-A. A B C A B C
  • 12. Yelp Dataset Description Name Attributes User {user_id, name, review_count, average_stars, friends, yelping_since, elite} Business {business_id, name, review_count, stars, address, latitude, longitude, categories} Review {user_id, business_id, text, stars, votes} Users 252,898 User-user edges(friendship) 955,999 Businesses 42,153 Reviews 1,125,458
  • 13. Data and Preprocessing • The semi-structured data obtained from Yelp is stored in a Document Oriented database. • Preprocessing is done to clean the data. • Social network is formed from users who have reviewed similar nearby businesses. • Users are represented as nodes in the network, and two nodes are joined by an edge only if they are friends.
  • 14. Edge weight calculation in network • The weight of an edge between two users X and Y is calculated by the formula: • w1 is the normalized count of mutual friends between X and Y where nx and ny are the list of friends of user X and user Y. • w2 signifies the similarity in opinion of user X and user Y where and • Xpos is the set of businesses that X rated positively; Xneg is the set of businesses that X rated negatively. • We have considered a rating of 3 or below as negative review, and 4 or above as positive review. [old] Eq. 9 Eq. 7 Eq. 8
  • 15. Propagation probability calculation in network • Propagation probability of an edge going from u to v was calculated by: • Strength of an edge between u and v is the average of influence of u and v • Where • For popularity we used two attributes of the user, reviewCount and averageStars • The clustering value is defined as the closeness of a node to a cluster of highly interconnected nodes. • C(v) is clustering value of a node given by: • Quartiles were used for normalization.[link] Eq. 17 Eq. 16 Eq. 15 Eq. 10 Eq. 11
  • 17. Our novel approach: spreadHeuristicIC Algorithm • Proposed algorithm is a greedy algorithm. • It iteratively finds a node and add it to the set S of top-K influential nodes. • While adding kth node to set S, it finds the node that maximize the difference between spread of already selected k-1 nodes and spread of set S after adding that kth node. A B C
  • 18. Pseudo code for the algorithm
  • 19. Complexity Analysis • The algorithm take O(V) steps in line 3 and line 4 take O(T) time, where T is the time to compute the coverage of a node in the graph G, and it takes O(IE) time (where I is the number of simulations for the Independent Cascade model, and E is the number of edges in graph G). • From lines 7-9, complexity of each line is O(VlgV) when we use sorting for union operation. • So, overall complexity of the algorithm is O(K(VIE + VlgV)).
  • 20. Experiments and Results • We have conducted experiments for our algorithm and various other algorithms (i.e.- Degree Discount algorithm, Single Discount algorithm, Degree Discount algorithm, General Greedy algorithm etc.) on Yelp’s network. • We find that the Spread Heuristic based algorithm has more influence spread compared to the other algorithms. The ranking based on influence spread comes out to be: spreadHeuristicIC > newGreedyIC > degreeDiscountIC > random
  • 21. Cont’d Influence spread for G with n=1617, E=2058 0 50 100 150 200 250 300 0 10 20 30 40 50 60 70 80 InfluenceSpread K degreeDiscountIC degreeDiscountIC2 degreeDiscountStar degreeHeuristic degreeHeuristic2 singleDiscount highestDegree newGreedyIC randomHeuristic spreedHeuristic Influence spread for G with n=4292, E=8147 0 50 100 150 200 250 0 10 20 30 40 50 60 70 80 InfluenceSpread K degreeDiscountIC degreeDiscountIC2 degreeDiscountStar degreeHeuristic degreeHeuristic2 singleDiscount highestDegree newGreedyIC randomHeuristic spreadHeuristic Fig. 9Fig. 7
  • 22. Cont’d Run time for G with n=1617, E=2058 Run time for G with n=4292, E=8147 -10 0 10 20 30 40 50 60 70 80 0 10 20 30 40 50 60 70 80 RunningTime(sec) K degreeDiscountIC degreeDiscountIC2 degreeDiscountStar degreeHeuristic degreeHeuristic2 singleDiscount highestDegree newGreedyIC randomHeuristic spreadHeuristic 0 50 100 150 200 250 300 350 0 10 20 30 40 50 60 70 80 RunningTime(sec) K degreeDiscountIC degreeDiscountIC2 degreeDiscountStar degreeHeuristic degreeHeuristic2 singleDiscount highestDegree newGreedyIC randomHeuristic spreedHeuristic Fig. 10Fig. 8
  • 23. Conclusion • With respect to initial aims and objectives of this project, the final outcome is fairly successful. • After series of experiments, we concluded that our algorithm outperforms existing influence maximization algorithms. • We developed a dashboard for the businesses to visualize the influential users and their spread among the people nearby.
  • 24. References [1] M.E.J. Newman, M. Girvan, Finding and evaluating community structure in networks, Phys. Rev. E 69 (2) (2004) 026113. [2] Blondel, Vincent D., et al. "Fast unfolding of communities in large networks. "Journal of Statistical Mechanics: Theory and Experiment 2008.10 (2008): P10008. [3]. J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and N. S. Glance. Cost-effective outbreak detection in networks. In Proceedings of the 13th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 420–429, 2007. [4] “Yelp Dataset,” https://www.yelp.com/dataset challenge/dataset. [5]. D. Kempe, J. Kleinberg, E. Tardos. Maximizing the Spread of Influence through a Social Network. Proc. 9th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2003. [6]. M. Richardson, P. Domingos. Mining Knowledge-Sharing Sites for Viral Marketing. Eighth Intl. Conf. on Knowledge Discovery and Data Mining, 2002. [7] J. Goldenberg, B. Libai, E. Muller. Talk of the Network: A Complex Systems Look at the Underlying Process of Word-of-Mouth. Marketing Letters 12:3(2001), 211-223 [8] M. Granovetter, Threshold models of collective behavior, the American Journal of sociology, vol. 83, no. 6, pp.1420-1443, May 1978 [9] Chen, Wei, Yajun Wang, and Siyu Yang. "Efficient influence maximization in social networks." Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2009. [10] Kempe, David, Jon Kleinberg, and Éva Tardos. "Maximizing the spread of influence through a social network." Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2003. [11] Wang, Yu, et al. "Community-based greedy algorithm for mining top-k influential nodes in mobile social networks." Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2010. [12] Saito, Kazumi, Ryohei Nakano, and Masahiro Kimura. "Prediction of information diffusion probabilities for independent cascade model." Knowledge-Based Intelligent Information and Engineering Systems. Springer Berlin Heidelberg, 2008. [13] Newman, Mark EJ. "Analysis of weighted networks." Physical Review E 70.5 (2004): 056131.