ARG Defends their "Ballot Lead Calculator"
Yesterday, I emailed the polling firm ARG to let them know that their Ballot Lead Calculator was in error. Today they replied, arguing that they're right and I'm wrong. Here's their email; I'll reply in my next post.
Thanks for your e-mail.
I believe you are confused about the calculation of the confidence interval of the difference when you write:
The confidence intervals overlap, so this is a "statistical tie."* To say this another way, the confidence interval for the difference is 8, so Kerry is ahead by 6 +/- 8: there's a 95% chance that he's somewhere between 14 points ahead and 2 points behind. The margin of error for Kerry's lead (+/- 8), is double that for his percentage of the vote (+/- 4).
The actual confidence interval for the difference is produced using the formula in the calculator. It is not correct to create the confidence interval for the difference simply by doubling the margin of error for the proportion, as you (and the site you point to) suggest, because you have to take into consideration the sampling error for both sample estimates (whether or not they are from the same sample or are from different samples - the calculator assumes same samples).
There is no difference in the formula for a point in time versus a gain over time as you imply. You may be confusing continuous and discrete measures as relating to time.
Also, it is incorrect when you write that "there's a 95% chance that he's somewhere between 14 points ahead and 2 points behind" because the population proportion is fixed and either the candidate is between 14 points ahead and 2 points behind or the candidate is not - there are no probabilities involved (that's why it is called a confidence interval and not a probability interval). This point is confusing to most people.
Without conducting a census, there is no way to determine the true extent of the sampling error for any sample. If, for example, the candidate is really 16 points ahead, your statement that "there's a 95% chance that he's somewhere between 14 points ahead and 2 points behind" is incorrect - there is no chance the candidate is between 14 ahead and 2 behind. When you write that the responses in a single poll "aren't independent" is incorrect for true random samples. Sampling error accounts for a sample that has more Kerry (or Bush) supporters than the actual population and it has nothing to do with "negative correlation." Again, the problem is there is no way to tell without a census. And that's another reason why it is not safe just to double the margin of error to determine the differences.
I hope this is helpful in showing you that the calculator does not get it wrong.
American Research Group, Inc.
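
For readers who want to check the arithmetic themselves, here is a minimal Python sketch of the calculations under dispute. The sample size (600, which gives roughly a +/- 4 margin of error) and the 49/43 split are hypothetical values chosen only to match a 6-point lead; neither number comes from the email. The formulas are the standard textbook ones for a simple random sample, and I am not claiming that any of them is what ARG's calculator actually uses.

```python
import math

Z_95 = 1.96  # normal critical value for a 95% confidence level


def moe_proportion(p, n, z=Z_95):
    """Margin of error for a single proportion p from a sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)


def moe_lead_same_sample(p1, p2, n, z=Z_95):
    """Margin of error for the lead p1 - p2 when both shares come from the
    SAME sample, so a respondent counted for one candidate cannot be
    counted for the other (multinomial variance of the difference)."""
    variance = (p1 + p2 - (p1 - p2) ** 2) / n
    return z * math.sqrt(variance)


def moe_lead_independent(p1, p2, n1, n2, z=Z_95):
    """Margin of error for p1 - p2 when the two shares come from two
    separate, independent samples of sizes n1 and n2."""
    variance = p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2
    return z * math.sqrt(variance)


# Hypothetical poll: assumed values, not taken from the email.
n = 600
kerry, bush = 0.49, 0.43

single = moe_proportion(0.5, n)                      # the quoted "+/- 4"
doubled = 2 * single                                 # the "double it" rule
same = moe_lead_same_sample(kerry, bush, n)          # one poll, two shares
separate = moe_lead_independent(kerry, bush, n, n)   # two separate polls

for label, value in [
    ("single proportion", single),
    ("doubled", doubled),
    ("lead, same sample", same),
    ("lead, independent samples", separate),
]:
    print(f"{label}: +/- {100 * value:.1f} points")
```

With these assumed inputs, the same-sample figure comes out a little below the doubled figure, and the independent-samples figure comes out well below both, so the choice of formula matters to whether a 6-point lead falls inside the margin of error.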