The impact of search ads on organic search traffic

The impact of search ads on organic search traffic using nonparametric statistics and time series analysis with R.

The impact of search ads on organic search traffic
A nonparametric statistical analysis based on a small-size time series sample

Alexandros Papageorgiou
Advanced Business Data Analysis
National College of Ireland

Abstract: This study examines the impact of paid search engine advertising on organic search engine traffic. In particular, it is concerned with ways of analysing the impact that pausing search advertising can have on organic traffic. The main objective is to develop a methodology that can be applied to individual websites to help determine whether paid search clicks substitute traffic that would have reached the website anyway. The study is based on a small time series sample of organic traffic from an e-commerce website, which includes a one-week experimental period during which search ads were disabled. A methodology for approaching the problem was developed and nonparametric statistical techniques were employed. The results for this particular experiment suggest that pausing search ads does lead to an increase in organic search engine traffic. A confidence interval for the change is also provided. The methodology can be employed for different websites; however, the final results can vary depending on their specific characteristics.

I. INTRODUCTION

A. Background

Search engine traffic, in its two forms, paid and organic, has been a rapidly evolving marketing channel for digital properties. For many online businesses it is already the top generator of incoming traffic. Its importance is greater still once the high relevance of this type of traffic and its high propensity for conversion are taken into account.

There is an ongoing debate within the digital advertising industry regarding the effect of the symbiosis between paid and organic search engine traffic. Typical questions include: What happens if a website is the top organic result for a given keyword? Does it make sense to advertise in that case?
What would the repercussions be if its rank was third, fifth or 100th?

Companies spend considerable amounts on search advertising in expectation of positive economic results; however, the possibility of traffic “cannibalisation” is a hidden cost that is hard to quantify and integrate into the cost/benefit equation.

Due to the importance of advertising for revenue generation, pausing advertising for prolonged periods as an experiment is undesirable from a business point of view. It can result in lost sales or in valuable customer traffic reaching competitors' properties. The challenge, therefore, is to develop a method of approximating the negative impact while minimising the exposure of a company to these risks.

B. Related Work

A number of research studies have addressed this question at a macro level. These studies covered a large number of websites across several industries and involved disabling search ads for a specific period of time. The collective results provide a general conclusion, by industry, regarding the impact of ads on organic search traffic. In particular, the researchers suggest that for most industries paid search traffic is almost entirely incremental to organic traffic [1].

A follow-up study reported that the final outcome can vary based on the organic ranking of a website. The higher the website ranks organically, the higher the likelihood that, even in the absence of an ad, users would find and click through to the site [2]. It was also highlighted that, while these findings provide guidance on overall trends, there was a lot of variability between different advertisers and different search terms. The authors encouraged advertisers to design their own experiments.

Individual websites have particular characteristics related to their industry or the degree of diversification of the product they offer. Additionally, search rankings can vary greatly from one page or one section of a site to another.
It is therefore not ideal to use those particular studies to determine the precise effect that search ads can have on a given website.

C. Research statement

The objective of this study is to design a general framework that enables individual online businesses with different attributes to make inferences about the impact of paid search advertising on organic traffic without the need to design complex and costly longitudinal studies. The method provides the tools for a digital company to establish whether a change has taken place and, if so, to estimate the range of the change in organic traffic using suitable confidence intervals.
II. METHODS

A. The dataset

The study was based on organic traffic data from an e-commerce website. The data covered three full weeks during which both organic and paid search traffic reached the site, and one experimental week during which the ads were completely paused. The website in question receives both paid and organic traffic from multiple search engines; however, this study focused on organic data originating from Google domains, including country-level ones.

In general, paid traffic visits to the website are a fraction of organic traffic visits. Paid traffic is, however, much more targeted to the desired audiences.

Care was taken to ensure that no other factors (beyond the absence of ads) that could alter normal organic traffic patterns were present before or during the experiment, e.g. website upgrades, server downtime or Google search algorithm updates.

The data was collected via Google Analytics and its API. The 28 data points refer to total organic users by date. Descriptive statistics for the data are presented in Table 1.

Table 1. Descriptive statistics

The data represent a time series, which is illustrated in Figure 1. The effect of the weekly seasonal component in the data is evident.

Figure 1. The data represented as a 4-week time series

The boxplot in Figure 2 provides further evidence of this cyclicality. In particular, the first days of the week, starting from Monday, exhibit stronger numbers of organic users. There is then a gradual decline leading to the weekend, during which user numbers reach their lowest point.

Figure 2. Boxplots of traffic by day of week

B. Data pre-processing

This known cyclicality is not atypical for e-commerce websites. It presents several challenges for the methodologies that can be employed for the data analysis. In particular, given that the data are autocorrelated, normality cannot be assumed and the tests that rely on it cannot be applied.
In order to perform a statistical test, some adjustments need to be considered. As a first step, the seasonality was removed to enable day-to-day comparison on an equal basis. To accomplish this, the time series was decomposed into its seasonal, trend and irregular factors [3], as illustrated in Figure 3. Subsequently, the seasonal component was extracted and every data point in the dataset was divided by it (division is used because the composition of the time series is multiplicative). The adjusted time series was used for the following steps of the analysis.

Figure 3. Time series decomposed into its 4 main components

C. Statistical Plans

A histogram of the adjusted time series is illustrated in Figure 4. The data set is relatively small: there are fewer than 30 data points and there is not enough evidence that the data follow a normal distribution.
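The seasonal adjustment described under Data pre-processing can be sketched in Python as follows. This is a minimal illustration using only the standard library, not the study's own code: the function names `seasonal_indices` and `seasonally_adjust` and the traffic figures are hypothetical, and the trend is assumed to be estimated with a simple centred 7-day moving average, as in a classical multiplicative decomposition.

```python
from statistics import mean


def seasonal_indices(series, period=7):
    """Estimate multiplicative seasonal indices (classical decomposition)."""
    half = period // 2
    ratios = [[] for _ in range(period)]
    # Trend estimate: centred moving average (valid for an odd period).
    for t in range(half, len(series) - half):
        trend = mean(series[t - half:t + half + 1])
        # Ratio of observation to trend isolates seasonal * irregular.
        ratios[t % period].append(series[t] / trend)
    raw = [mean(r) for r in ratios]
    # Normalise so that the indices average to 1 over one period.
    scale = mean(raw)
    return [r / scale for r in raw]


def seasonally_adjust(series, period=7):
    """Divide each observation by the index of its position in the period."""
    indices = seasonal_indices(series, period)
    return [x / indices[t % period] for t, x in enumerate(series)]


# Hypothetical data: 4 weeks at a constant level of 1000 users,
# strongest on the first day of the week and weakest on the last.
pattern = [1.3, 1.2, 1.1, 1.0, 0.9, 0.8, 0.7]
series = [1000 * pattern[t % 7] for t in range(28)]
adjusted = seasonally_adjust(series)
```

On this synthetic input the adjusted series is flat, which is the property that makes day-to-day comparison possible. With real data, the irregular component and any shift during the experimental week remain in the adjusted values.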
Figure 4. Histogram of seasonally adjusted time series

A quantile-quantile plot is also illustrated in Figure 5. In general, the pattern in both graphs suggests a bimodal distribution, which can be an early sign that the experimental week exhibited different behaviour.

Figure 5. Quantile-quantile plot for the adjusted time series

In addition to the previous observations, the sample sizes are very small (especially for the days of the experiment). In this context standard parametric assumptions are not met, and using parametric methods to test for a difference between the first three weeks and the experimental week would therefore likely lead to inaccurate conclusions.

To examine the hypothesis that organic traffic increased when the ads were paused, the nonparametric Mann-Whitney U test was used instead. This test is typically employed to examine whether two independent samples of observations are drawn from the same or identical distributions. An additional reason for employing this test is that the two samples under consideration need not contain the same number of observations [4].

Another nonparametric technique, the bootstrap, was used to provide a confidence interval based on repeated resampling with replacement from the original data [5].

III. RESULTS

A. Mann-Whitney U test for the distributions

The null hypothesis of the Mann-Whitney U test stated that there is no difference in the location of the distributions of organic traffic users between the two conditions: when search ads are active and when they are not. The alternative hypothesis was that organic user traffic grows when the search ads are not active. The alpha value used was 0.05, a value commonly used in statistical practice.

Table 2. Output of the Mann-Whitney U test

The basic assumptions of the Mann-Whitney U test were that the samples are independent of each other and that they are random samples from their populations.
The former assumption is met since seasonal components were removed. Likewise, the latter assumption is met if we consider the samples as representative of their underlying populations. This assumption has to be made because the cost of the experiment allows for only a limited number of days without search ads, so there is no real opportunity for sampling. Further assumptions regarding the shape of the distributions and their variances were not tested due to the small size of the data sets, particularly the limited number of experiment days.

The test statistic value was 132 and the associated p-value of a one-tailed Mann-Whitney U test for the location of the distributions was 0.00047, as illustrated in Table 2. This indicates that, under a true null hypothesis, the probability of observing a difference between the two distribution locations that is this extreme or more is about two orders of magnitude below the 5% threshold. Based on these observations, it was concluded that there is indeed a statistically significant increase in organic traffic when search advertising is paused.

B. The Bootstrap for the Confidence Intervals

The next question to address is the range of the possible change. To address this question the bootstrap method was selected. It allows the generation of confidence intervals and the testing of statistical hypotheses without having to assume a specific underlying theoretical distribution [6]. It was therefore employed to construct a suitable confidence interval around the difference in the medians of the two samples. The median was preferred due to the small number of data points for the experiment dates.

Using the bootstrap's resampling with replacement technique, the difference in the medians between the two groups was recorded for each of the 10,000 iterations and a 95% confidence interval was subsequently constructed.
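The two nonparametric steps above can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the study's code: the function names `mann_whitney_greater` and `bootstrap_median_diff` and the traffic figures are hypothetical, the exact p-value calculation assumes no tied observations, and the interval is a plain percentile interval rather than the BCa interval reported in the study.

```python
import random
from functools import lru_cache
from math import comb
from statistics import median


def mann_whitney_greater(x, y):
    """One-sided exact Mann-Whitney U test (H1: x tends to exceed y).

    Assumes there are no tied values between or within the samples.
    """
    # U counts the (x, y) pairs in which the x value is the larger one.
    u = sum(1 for a in x for b in y if a > b)

    @lru_cache(maxsize=None)
    def count(n1, n2, k):
        # Number of orderings of n1 x's and n2 y's that give U == k.
        if k < 0:
            return 0
        if n1 == 0 or n2 == 0:
            return 1 if k == 0 else 0
        # The largest value is either an x (adds n2 to U) or a y (adds 0).
        return count(n1 - 1, n2, k - n2) + count(n1, n2 - 1, k)

    n1, n2 = len(x), len(y)
    tail = sum(count(n1, n2, k) for k in range(u, n1 * n2 + 1))
    return u, tail / comb(n1 + n2, n1)


def bootstrap_median_diff(x, y, iterations=10_000, level=0.95, seed=1):
    """Percentile bootstrap interval for median(x) - median(y)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = sorted(
        median(rng.choices(x, k=len(x))) - median(rng.choices(y, k=len(y)))
        for _ in range(iterations)
    )
    alpha = (1 - level) / 2
    return diffs[int(alpha * iterations)], diffs[int((1 - alpha) * iterations) - 1]


# Hypothetical seasonally adjusted daily users: 21 days with ads, 7 without.
with_ads = [1000 + 10 * i for i in range(21)]
without_ads = [3000 + 10 * i for i in range(7)]

u, p_value = mann_whitney_greater(without_ads, with_ads)
low, high = bootstrap_median_diff(without_ads, with_ads)
```

In this synthetic example every day without ads exceeds every day with ads, so U takes its maximum value of 7 × 21 = 147 and the p-value is the smallest one attainable for these sample sizes. Real data would normally show some overlap between the two groups.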
The Bias Corrected and Accelerated (BCa) confidence interval for the difference in medians was (2160, 3060), which suggests that the number of additional users reaching the website organically each day, in the absence of search ads, is not insignificant.

IV. DISCUSSION

A. Conclusions

The methodology described above can be applied in an experimental setting, enabling advertisers to evaluate the impact of search advertising on organic traffic using suitable nonparametric statistical methods. A key feature of this methodology is that it requires search ads to be paused for only seven days.

For the specific website under study, it was found that pausing the ad campaigns had a positive impact on the number of organic users visiting the website. A 95% confidence interval was built to provide a better understanding of the possible range of variation in the difference.

This methodology can be applied to any website, but naturally the results are likely to vary based on particular website characteristics.

B. Future Work

In the present study the number of users was the primary metric examined. However, it might be more meaningful from a business point of view to examine instead differences in organic search revenue or in organic search users who complete a transaction. An explicit ROAS (Return on Advertising Spend) analysis in the light of the experiment results would be the final verdict as to whether, and to what extent, search advertising is beneficial for each advertiser.

As a consequence of the natural variation of traffic, it is always possible that events beyond the experiment design play a role in changing traffic patterns, often without being easy to identify and evaluate appropriately. One approach that addresses this concern could be based on the concept of geographically structured randomised experiments.

Additionally, not all sections of a website are impacted in the same way by the presence or absence of ads.
In fact, it is likely that different pages can have very different organic search rankings. It would therefore be valuable to apply the present or alternative methods of analysis to distinct sets of pages on a website and report separately for each set in order to achieve more focused results.

V. REFERENCES

[1] D. X. Chan, Y. Yuan, J. Koehler, and D. Kumar, “Incremental Clicks: The Impact of Search Advertising,” 2011.
[2] D. Chan, D. Kumar, S. Ma, and J. Koehler, “Impact Of Ranking Of Organic Search Results On The Incrementality Of Search Ads,” 2012.
[3] A. Coghlan, “A Little Book of R For Time Series,” Release 02, 2014.
[4] “Mann-Whitney U-test / Mann-Whitney-Wilcoxon.” [Online]. Available: whitney-u-test. [Accessed: 03-Aug-2016].
[5] E. S. Banjanovic and J. W. Osborne, “Confidence Intervals for Effect Sizes: Applying Bootstrap Resampling,” Pract. Assess. Res. Eval., vol. 21, no. 5, p. 2, 2016.
[6] R. Kabacoff, R in Action: Data Analysis and Graphics with R, 2nd ed. Shelter Island: Manning Publications, 2015.