2. Key Considerations
- Sample size versus response rate: plan for the number of usable data points you will actually obtain
- Attrition: repeated measures, panel designs, and diary studies all lose participants over time
- Statistical power: the ability to draw inferences from the sample obtained
- Margin of error: matters to the extent that the resulting statistics must be projectable to the larger population
3. May 15-17, 2008 Internet Data Collection Methods (Day 2-3) Response Rate Reminder
[Figure: response rates for academic surveys, 1975 to 1995; vertical axis runs from 40% to 70%]
4. Hope for the best / Plan for the worst
- Try to achieve an 80% response rate
- Hope to achieve a 50% response rate
- Plan ahead for a 30% response rate
- At a 30% response rate, you need to sample 1000 people to obtain a sample of 300
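The arithmetic behind this slide can be sketched in a few lines of Python. The function name `invites_needed` is hypothetical, introduced only for illustration; the three rates are the ones listed on the slide.

```python
import math

def invites_needed(target_completes, response_rate):
    """Number of people to sample so that the expected number of
    completed responses reaches target_completes."""
    # round() guards against floating-point noise before rounding up
    return math.ceil(round(target_completes / response_rate, 9))

# The slide's three scenarios for a target of 300 usable responses
for label, rate in [("try for", 0.80), ("hope for", 0.50), ("plan for", 0.30)]:
    print(f"{label} {rate:.0%}: sample {invites_needed(300, rate)} people")
```

Planning against the 30% scenario means the target is still met if the more optimistic rates do not materialize.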
5. Bad Data
Unproctored, anonymous self-report instruments generally have a higher percentage of:
- Unusual outliers
- Missing data
- Carelessly entered data
- Intentionally sabotaged data
Another aspect of dealing with nonresponse is to anticipate, prepare for, and deal with item-level data losses
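One way to estimate item-level data loss from a pilot sample is to screen each respondent's answers before analysis. The sketch below is illustrative only: the function names, the 20% missing-data cutoff, and the "same answer to every item" rule for careless responding are assumptions, not rules from the slides.

```python
def is_usable(responses, max_missing_frac=0.2):
    """responses: one respondent's item answers; None marks a skipped item.
    The 0.2 cutoff is a hypothetical threshold chosen for illustration."""
    if not responses:
        return False
    answered = [r for r in responses if r is not None]
    missing_frac = 1 - len(answered) / len(responses)
    if missing_frac > max_missing_frac:
        return False
    # Crude careless-responding check: identical answer to every item
    if len(answered) >= 5 and len(set(answered)) == 1:
        return False
    return True

def item_loss_rate(all_responses):
    """Fraction of returned questionnaires screened out as unusable."""
    usable = sum(is_usable(r) for r in all_responses)
    return 1 - usable / len(all_responses)

pilot = [
    [1, 2, 3, 4, 5, 4],        # usable
    [3, 3, 3, 3, 3, 3],        # straight-lined
    [1, None, None, 4, 5, 2],  # too much missing data
    [2, 4, 4, 5, 1, 3],        # usable
]
print(item_loss_rate(pilot))  # 0.5
```

A loss rate estimated this way can then be used to inflate the planned sample size.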
7. The Best Articles on Statistical Power
- Cohen, J. (1992). "A power primer." Psychological Bulletin 112(1): 155-159.
- Cohen, J. (1992). "Statistical power analysis." Current Directions in Psychological Science: 98-101.
- Kraemer, H. and S. Thiemann (1987). How many subjects?: Statistical power analysis in research. Sage Publications, Inc.
8. Sample Size "Guesstimates" (With apologies to Jacob Cohen)
9. Estimating Effect Size (Also with apologies to Jacob Cohen)
- Mean differences, calibrated in standard deviations (Cohen's d): Large = .8+, Medium = .5, Small = .2
- Multiple regression, calibrated in Cohen's f-squared, which is R-squared / (1 - R-squared): Large = .35+, Medium = .15, Small = .02
- Chi-square, calibrated in the difference between null and alternate population proportions: Large = .50, Medium = .30, Small = .10
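To show how an effect-size benchmark turns into a sample size, here is a minimal sketch for the mean-difference case. It uses the common normal approximation for a two-group comparison, n per group = 2 * ((z_alpha + z_power) / d)^2, which is an assumption of this sketch rather than the method behind Cohen's tables; exact t-based calculations come out slightly higher for small samples. The function name `n_per_group` and the defaults (two-tailed alpha = .05, power = .80) are illustrative choices.

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate participants needed per group to detect a standardized
    mean difference d with a two-tailed test (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = z.inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

for label, d in [("large", 0.8), ("medium", 0.5), ("small", 0.2)]:
    print(f"{label} effect (d = {d}): about {n_per_group(d)} per group")
# large: 25, medium: 63, small: 393
```

The required sample grows with the inverse square of the effect size, which is why a small effect needs roughly sixteen times the sample of a large one.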
10. Margin of Error
- Generally represents only sampling error; other sources of error will often make the margin much larger
- Assumes a large population, with no more than 5% drawn into the sample
- Margin of error is half the width of a confidence interval
- Straightforward calculation for a CI around a mean or a mean difference: generally about 1.96 standard errors (for 95% confidence)
- CI around a proportion/percentage is more complex: SE = sqrt(p(1 - p) / n); use 1.96 times this SE; works fine for even splits; can be a little funky for extreme proportions
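The proportion case can be sketched as follows, assuming the usual standard error for a sample proportion, sqrt(p(1 - p) / n), and the 1.96 multiplier for 95% confidence. The function names are hypothetical. As the slide notes, this approximation is less trustworthy for extreme proportions, and it ignores non-sampling error.

```python
import math

Z_95 = 1.96  # multiplier for a 95% confidence interval

def margin_of_error(p, n, z=Z_95):
    """Half-width of the CI around a sample proportion p based on n cases."""
    return z * math.sqrt(p * (1 - p) / n)

def n_for_margin(e, p=0.5, z=Z_95):
    """Cases needed for a margin of error e; p = 0.5 is the worst case."""
    return math.ceil(z ** 2 * p * (1 - p) / e ** 2)

print(f"{margin_of_error(0.5, 300):.3f}")  # 0.057, i.e. about +/- 5.7 points
print(n_for_margin(0.05))                  # 385
print(n_for_margin(0.03))                  # 1068
```

With the 300 usable responses from the earlier planning example, an even split would carry a margin of roughly plus or minus 5.7 percentage points.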
11. Margin of Error Calculators
- http://www.raosoft.com/samplesize.html : Trades off sample size and margin of error
- http://www.surveysystem.com/sscalc.htm : Explains terminology
- http://faculty.vassar.edu/lowry/polls/calcs.html : Various tools for assessing poll data
- http://glass.ed.asu.edu/stats/analysis/rci.html : Confidence intervals for correlations
- http://www.stat.tamu.edu/~jhardin/applets/signed/case11.html : Java-based applet
12. An Overall Sampling Plan
- Estimate the expected effect size for the most important tests you plan to conduct
- For inferential testing, use power estimation tools to plan sample size
- For projectability, use margin of error tools to plan sample size
- Take into account item-level data loss due to bad data
- Take attrition into account for longitudinal designs
- Take overall response rate into account for all types of designs
- Determine overall initial sample size based on all of the factors listed above
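The steps in this plan can be combined into a single calculation. This is a sketch under stated assumptions: the three loss sources are treated as independent, so the retained fractions simply multiply, and the function name, parameter names, and example rates are all hypothetical.

```python
import math

def initial_sample_size(n_power, n_margin, response_rate,
                        attrition_rate=0.0, item_loss_rate=0.0):
    """Number of people to invite so the final usable sample satisfies
    both the power requirement and the margin-of-error requirement."""
    n_required = max(n_power, n_margin)   # satisfy the stricter goal
    usable_fraction = (response_rate
                       * (1 - attrition_rate)
                       * (1 - item_loss_rate))
    # round() guards against floating-point noise before rounding up
    return math.ceil(round(n_required / usable_fraction, 9))

# Illustrative inputs: 128 cases needed for power, 385 for margin of error,
# 30% response rate, 20% attrition, 10% item-level loss
print(initial_sample_size(128, 385, 0.30, 0.20, 0.10))  # 1783
```

For a cross-sectional design, leaving `attrition_rate` at zero reduces this to the response-rate and bad-data adjustments alone.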