Understand Statistical Confidence
The A/B test report uses unique tested visitors and goals to calculate the conversion rate, improvement, and statistical confidence for each variation. The screenshot below shows a report in which 97% confidence has been reached.
Note: In this report a winner has been selected because the confidence is above 97% and each variation has more than 10 conversions.
- Variation: this column reports the name of the variation for a particular row;
- Conversion Rate: this column shows the percentage of visitors that turned into conversions as well as the error interval;
- Improvement: this column reports the percentage change of the variation compared to the Control;
- Confidence: this column reports the statistical significance, that is, how confident you can be that the conversion rate of the experiment variation really differs from that of the control/original variation (confidence must reach at least 97% before a variation is marked as a winner). The gray/green dots in that column indicate:
- 1 green dot for 75%-85% confidence
- 2 green dots for 85%-95% confidence
- 3 green dots for 95%-96% confidence
- 4 green dots for 96%-97% confidence
- 5 green dots for 97% and above
- Conversions / Visitors: this column reports the number of conversions received and the number of visitors that saw the specific variation.
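The dot scale in the Confidence column can be sketched as a small lookup. This is only an illustration: the function name `confidence_dots` is hypothetical, and because the listed ranges share their endpoints, treating each lower bound as inclusive is an assumption.

```python
def confidence_dots(confidence: float) -> int:
    """Return the number of green dots (0 to 5) for a confidence given as a fraction.

    Thresholds come from the list above. Treating each lower bound as
    inclusive (for example, exactly 85% gives 2 dots) is an assumption.
    """
    thresholds = [0.75, 0.85, 0.95, 0.96, 0.97]
    # Count how many thresholds the confidence has reached.
    return sum(confidence >= t for t in thresholds)


if __name__ == "__main__":
    for value in (0.50, 0.80, 0.90, 0.955, 0.965, 0.98):
        print(f"{value:.1%} -> {confidence_dots(value)} green dot(s)")
```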
We at Convert.com use a two-tailed Z-test at a significance level of .05 (95% confidence). Because the normal distribution is symmetric, that leaves .025 in each tail. The level can be changed between .05 (95%) and .01 (99%); see the update at the end of this article.
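To make the two-tailed split concrete, the critical Z value for a given significance level can be looked up from the standard normal distribution. The sketch below uses Python's standard-library `statistics.NormalDist`; the function name `two_tailed_critical_z` is illustrative and not part of Convert's product.

```python
from statistics import NormalDist


def two_tailed_critical_z(alpha: float) -> float:
    """Critical Z value for a two-tailed test at significance level alpha."""
    # Half of alpha sits in each tail, so look up the (1 - alpha / 2) quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)


if __name__ == "__main__":
    for alpha in (0.05, 0.01):
        z = two_tailed_critical_z(alpha)
        print(f"alpha={alpha}: reject when |Z| > {z:.3f}")
```

At .05 the critical value is about 1.96, and at .01 it is about 2.58.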
The values used in the report are calculated as noted below.
Conversion Rate and Conversion Rate Change for Variations
For each variation the following is calculated:
Conversion rate (events are unique visitors): p = conversions / visitors
The percentage change of the conversion rate between the experiment variation and the original/control variation: improvement = (p_variation - p_control) / p_control
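These two values can be sketched in Python as follows, assuming the usual definitions: conversions divided by unique visitors, and relative change measured against the control. The function names and sample numbers are illustrative only.

```python
def conversion_rate(conversions: int, visitors: int) -> float:
    """Fraction of unique visitors who converted."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return conversions / visitors


def improvement(variation_rate: float, control_rate: float) -> float:
    """Relative change of the variation's conversion rate versus the control."""
    if control_rate == 0:
        raise ValueError("control rate must be non-zero")
    return (variation_rate - control_rate) / control_rate


if __name__ == "__main__":
    # Made-up sample data: 100 of 2000 visitors converted on the control,
    # 130 of 2000 on the variation.
    control = conversion_rate(100, 2000)
    variation = conversion_rate(130, 2000)
    print(f"control={control:.2%} variation={variation:.2%} "
          f"improvement={improvement(variation, control):.1%}")
```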
A statistical method is used to calculate a confidence interval around the conversion rate of each variation. The standard error (for 1 standard deviation) is calculated using the Wald method for a binomial distribution: SE = sqrt(p × (1 - p) / n), where p is the conversion rate and n is the number of visitors.
This is one of the simplest formulas for the standard error. It assumes that the binomial distribution can be approximated with a normal distribution (because of the central limit theorem); see http://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval. The sample distribution can be approximated with a normal distribution when there are more than 10 conversions on the specific goal.
To determine the confidence interval for the conversion rate, multiply the standard error by the 95th percentile of a standard normal distribution (a constant value of approximately 1.65).
Because 5% is left in each tail, this results in 90% confidence that the conversion rate, p, is in the range p ± 1.65 × SE.
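A minimal sketch of the interval calculation, assuming the standard Wald standard error sqrt(p(1 - p) / n) and the 1.65 multiplier mentioned above. The function names and sample numbers are illustrative.

```python
import math


def wald_standard_error(conversions: int, visitors: int) -> float:
    """Wald standard error of a binomial proportion: sqrt(p * (1 - p) / n)."""
    p = conversions / visitors
    return math.sqrt(p * (1 - p) / visitors)


def conversion_rate_interval(conversions: int, visitors: int, z: float = 1.65):
    """Return (low, high) bounds p - z * SE and p + z * SE.

    z = 1.65 is the constant quoted in the text (roughly 90% two-sided).
    """
    p = conversions / visitors
    margin = z * wald_standard_error(conversions, visitors)
    return p - margin, p + margin


if __name__ == "__main__":
    # Made-up sample: 100 conversions out of 2000 visitors.
    low, high = conversion_rate_interval(100, 2000)
    print(f"conversion rate 5.00%, interval {low:.2%} to {high:.2%}")
```

The Wald interval is known to behave poorly for very small samples or rates near 0% or 100%, which is consistent with the minimum of 10 conversions required above.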
To determine whether the results are significant (that is, that the difference between the conversion rates of the variations is not due to random variation), a ZScore is calculated as follows:
The ZScore is the number of standard deviations between the control and test variation mean values, as described at http://en.wikipedia.org/wiki/Zscore. Using a standard normal distribution, 95% significance is reached when the view event count is greater than 1000 and one of the following criteria is met:
Probability(ZScore) > 95%
Probability(ZScore) < 5%
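The significance check can be sketched as below. Two assumptions are made here and are not confirmed by the text: the ZScore is taken to be the common unpooled form (p_variation - p_control) / sqrt(SE_control² + SE_variation²), and Probability(ZScore) is taken to be the standard normal cumulative probability. The function names and the `view_events` parameter are illustrative.

```python
import math
from statistics import NormalDist


def z_score(control_conv: int, control_vis: int,
            variation_conv: int, variation_vis: int) -> float:
    """Unpooled two-proportion Z score (assumed form, see lead-in)."""
    p_c = control_conv / control_vis
    p_v = variation_conv / variation_vis
    # Squared Wald standard errors of each variation.
    se_c_sq = p_c * (1 - p_c) / control_vis
    se_v_sq = p_v * (1 - p_v) / variation_vis
    return (p_v - p_c) / math.sqrt(se_c_sq + se_v_sq)


def is_significant(z: float, view_events: int,
                   threshold: float = 0.95, min_views: int = 1000) -> bool:
    """Apply the criteria listed above: enough views, and the cumulative
    probability of Z above the threshold or below 1 - threshold."""
    probability = NormalDist().cdf(z)
    return view_events > min_views and (
        probability > threshold or probability < 1 - threshold
    )


if __name__ == "__main__":
    # Made-up sample: control 100/2000, variation 130/2000.
    z = z_score(100, 2000, 130, 2000)
    print(f"Z = {z:.3f}, significant: {is_significant(z, view_events=2000)}")
```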
The chance to be different (displayed on the report) is derived from the Probability(ZScore) value where:
- If Probability(ZScore) <= 0.5 then
Chance to be different = 1 - Probability(ZScore)
- If Probability(ZScore) > 0.5 then
Chance to be different = Probability(ZScore)
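The rule above folds Probability(ZScore) so that the reported value is never below 50%, whichever direction the variation moved. A minimal sketch, assuming Probability(ZScore) is the standard normal cumulative probability; the function name is illustrative.

```python
from statistics import NormalDist


def chance_to_be_different(z: float) -> float:
    """Fold the cumulative probability of Z into the range 0.5 to 1.0."""
    probability = NormalDist().cdf(z)
    # A probability at or below 0.5 is mirrored to the other side.
    return 1 - probability if probability <= 0.5 else probability


if __name__ == "__main__":
    for z in (-2.0, 0.0, 2.0):
        print(f"Z = {z:+.1f} -> {chance_to_be_different(z):.2%}")
```

A positive and a negative ZScore of the same size give the same result, so the value reflects how strong the difference is rather than its direction.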
Update: August 18, 2014
In Experiment Summary you can now set the experiment confidence level used for reports, sampling and automations to any value between 95% and 99% (see screenshot below).