Video sharing platforms are among the most popular and engaging platforms on the Internet today. Despite the increasing levels of user activity on these video platforms, current research on digital platforms has largely focused on social media and networking websites like Facebook and Twitter. We depart from previous work that has focused primarily on user demand (i.e., activity of viewers), and instead focus our attention on supply-side activity on the platform (i.e., activity of video uploaders). We perform a large-scale empirical study by leveraging longitudinal video upload data from a major online video platform, demonstrating (i) heterogeneity of video types (e.g., presence of popular vs. niche genres), and (ii) inherent seasonality effects associated with video uploads. Through our analyses, we uncover a set of informative genre clusters and estimate a self-exciting Hawkes point-process model on each of these clusters, to fully specify and estimate the video upload process.
Additionally, we disentangle potential factors that govern user engagement and determine the video upload rates, which supplements our analysis with additional explanatory power. Our results show that, using a parsimonious and relatively simple point-process model, we obtain a high model fit and predict video upload volumes with higher accuracy than a number of competing models. The findings from this study can help platform owners better understand how their supply-side users engage with their site over time. We also offer a robust method for media upload prediction that is likely to generalize across media platforms that exhibit similar temporal and genre-level heterogeneity.
22. • Hawkes model → flexibility to characterize the relationships between
two events
• We can infer the strength of the ties between two events by
examining the intensity function
• A current event can potentially be triggered by any of the historical events.
• Probabilistic measure of whether the j-th event is triggered by the i-th event:
Prerequisite:
Inferring Correlations between events
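The idea of reading tie strength off the intensity function can be sketched as follows. This is a minimal illustration, not the model specification used in the study: it assumes an exponential triggering kernel `alpha * beta * exp(-beta * dt)` and a constant background rate `mu`, and the function name `triggering_probabilities` is hypothetical. Under those assumptions, the probability that event j was triggered by an earlier event i is that event's kernel contribution divided by the total intensity at `t_j`.

```python
import math

def triggering_probabilities(times, mu, alpha, beta):
    """Return a matrix p where p[j][i] (i < j) is the probability that
    event j was triggered by event i, and p[j][j] is the probability
    that event j came from the background rate mu.

    Assumes an exponential kernel; this choice is illustrative only.
    """
    n = len(times)
    p = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Kernel contribution of each earlier event i to the intensity at t_j.
        contrib = [alpha * beta * math.exp(-beta * (times[j] - times[i]))
                   for i in range(j)]
        intensity = mu + sum(contrib)  # lambda(t_j)
        for i in range(j):
            p[j][i] = contrib[i] / intensity
        p[j][j] = mu / intensity
    return p

# Hypothetical event times and parameters, chosen only for illustration.
times = [0.0, 0.5, 0.6, 3.0]
probs = triggering_probabilities(times, mu=0.2, alpha=0.5, beta=1.0)
```

Each row of `probs` sums to one, so it can be read as a distribution over the possible causes of that event. With a decaying kernel, more recent events receive a larger share.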
24. • Popularity of cluster is important
• Perceived popularity of content lends a sense of validation
• Users more likely to upload if they perceive it will be well-received
• Different standards of quantifying popularity per user:
• different notions of baseline popularity thresholds
• Past uploads that are more popular than the user’s own popularity average
could positively impact the uploader’s decision
Contributing Factors:
Popularity Effect
Is it more popular
than me?
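One way to picture the per-user popularity baseline is the sketch below. It is an assumption-laden illustration rather than the covariate used in the study: popularity is taken to be a view count, and the helper `popularity_signal` is hypothetical. It returns the share of recent uploads in a cluster that beat the user's own average.

```python
def popularity_signal(recent_views, own_history):
    """Fraction of recent uploads whose popularity exceeds the user's own
    average popularity. Returns 0.0 when either input is empty."""
    if not recent_views or not own_history:
        return 0.0
    baseline = sum(own_history) / len(own_history)  # user-specific threshold
    return sum(1 for v in recent_views if v > baseline) / len(recent_views)

# Hypothetical view counts for recent uploads in one cluster.
recent = [100, 500, 50, 900]
signal_small_uploader = popularity_signal(recent, own_history=[80, 120])
signal_large_uploader = popularity_signal(recent, own_history=[1000, 800])
```

The same set of recent uploads yields a different signal for each user, which is the point of the slide: the threshold for "more popular than me" depends on who is asking.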
28. 1. Model Selection
• Evaluate goodness
of fit
• AIC Score:
• Penalizes models with a higher number of parameters to prevent overfitting
• Plot: AIC(Hawkes) –
AIC (Poisson)
• Differences in AIC scores are negative for most clusters:
• Hawkes process scored a lower value
• Lower AIC score ⇒ better model fit
Takeaway: Hawkes model provides better fit for most clusters!
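The AIC comparison can be sketched as below, using the standard definition AIC = 2k - 2 log L. The sketch assumes a homogeneous Poisson baseline with one parameter (`mu`) and a Hawkes process with an exponential kernel and three parameters (`mu`, `alpha`, `beta`); the kernel choice, the function names, and the example values are illustrative assumptions, and the parameters here are fixed by hand rather than fitted by maximum likelihood.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood

def poisson_loglik(times, T, mu):
    """Log-likelihood of a homogeneous Poisson process on [0, T]."""
    return len(times) * math.log(mu) - mu * T

def hawkes_loglik(times, T, mu, alpha, beta):
    """Log-likelihood of a Hawkes process on [0, T] with intensity
    mu + sum_i alpha * beta * exp(-beta * (t - t_i)) over past events."""
    loglik = 0.0
    for j, t in enumerate(times):
        lam = mu + sum(alpha * beta * math.exp(-beta * (t - s))
                       for s in times[:j])
        loglik += math.log(lam)
    # Integral of the intensity over [0, T] (the compensator).
    compensator = mu * T + sum(alpha * (1.0 - math.exp(-beta * (T - s)))
                               for s in times)
    return loglik - compensator

# Hypothetical upload times (in days) and hand-picked parameters.
times = [0.2, 0.3, 0.35, 4.0, 4.1, 4.15, 9.0]
T = 10.0
aic_poisson = aic(poisson_loglik(times, T, mu=0.7), n_params=1)
aic_hawkes = aic(hawkes_loglik(times, T, mu=0.3, alpha=0.6, beta=5.0), n_params=3)
aic_difference = aic_hawkes - aic_poisson  # negative favors Hawkes
```

In practice both models would first be fitted to each cluster, and the difference would be computed from the maximized log-likelihoods.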
30. • The lower the prediction error, the better
the model
• Impact of clustering:
• Cluster-specific model performs
better
• Benefits over time-series models:
• ARIMA performs worse
• Not generic enough to incorporate
variations
• Hawkes – better model for
temporal variations
2. Predicting Video Uploads
Takeaway: Hawkes model better predicts video upload counts!
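A comparison of prediction errors can be sketched as follows. The slide does not name the error metric, so root-mean-square error is an assumption here, as are the function names and all numbers. The forecast uses a known property of a stationary Hawkes process: with branching ratio `alpha` below one, the long-run mean event rate is `mu / (1 - alpha)`.

```python
import math

def expected_count(mu, alpha, window):
    """Expected number of events in a window for a stationary Hawkes
    process with background rate mu and branching ratio alpha < 1."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("branching ratio must lie in [0, 1)")
    return mu / (1.0 - alpha) * window

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(actual))

# Hypothetical daily upload counts for one cluster (not real data).
actual = [12, 9, 11, 10]
hawkes_pred = [expected_count(mu=5.0, alpha=0.5, window=1.0)] * len(actual)
background_only_pred = [5.0] * len(actual)  # ignores self-excitation

hawkes_error = rmse(hawkes_pred, actual)
background_error = rmse(background_only_pred, actual)
```

In this made-up example, the forecast that accounts for self-excitation lands closer to the observed counts than the background-only forecast. A real evaluation would use held-out data and fitted parameters for each cluster.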