Did your variant really win, or is it just noise? Enter the visitors and conversions for each version and this calculator returns the conversion rates, the relative uplift, and whether the difference is statistically significant — using a two-proportion z-test.
Statistical significance (a p-value below 0.05, the conventional 95% confidence threshold) means a difference this large would be unlikely if your variants truly performed the same. It does not mean the test is done: you still need enough sample size and a full business cycle (often two weeks) to avoid being fooled by early swings. If the result isn't significant yet, keep the test running until it reaches the sample size you planned; stopping early is one of the most common A/B testing mistakes.
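It tells you how surprising your result would be if the variants actually performed the same. The p-value is the chance of seeing a difference at least this large from random noise alone, so a p-value below 0.05 means a gap this big would turn up less than 5% of the time by chance. That is the conventional 95% confidence threshold; it is not a 95% probability that the variant is truly better.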
It depends on your baseline rate and the uplift you want to detect: the smaller the effect, the more traffic you need, and halving the uplift you want to detect roughly quadruples the required sample. As a rule of thumb, don't call a test before a few hundred conversions per variant and a full business cycle.
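To make the dependence on baseline rate and uplift concrete, here is a minimal Python sketch of a common textbook sample-size approximation for comparing two proportions. It is an illustration, not this calculator's own code: the function name, the 80% power default, and the 5% significance level are assumptions chosen for the example.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, relative_uplift, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant for a two-tailed two-proportion test.

    baseline_rate   -- control conversion rate, e.g. 0.05 for 5%
    relative_uplift -- smallest relative lift worth detecting, e.g. 0.20 for +20%
    alpha, power    -- assumed defaults (5% significance, 80% power)
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_uplift)
    p_bar = (p1 + p2) / 2

    # Critical values of the standard normal distribution.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)

    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


if __name__ == "__main__":
    for uplift in (0.20, 0.10):
        n = sample_size_per_variant(0.05, uplift)
        print(f"+{uplift:.0%} relative uplift: about {n} visitors per variant")
```

Under these assumptions, a 5% baseline and a 20% relative uplift call for roughly 8,000 visitors per variant, and a 10% uplift needs close to four times that.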
Better not. 'Peeking' and stopping at the first significant moment inflates false positives. Decide your sample size up front and let the test run its course.
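If you want to see the peeking effect for yourself, the small Python simulation below is one way to do it. It runs A/A tests (both variants share the same true rate, so every "win" is a false positive) and compares checking at every interim look against checking once at the end. All names, the 10% true rate, the ten looks, and the traffic numbers are hypothetical values picked for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist


def p_value(n_a, c_a, n_b, c_b):
    """Two-tailed p-value from a pooled two-proportion z-test."""
    pooled = (c_a + c_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation yet, so nothing to detect
    z = (c_b / n_b - c_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate_false_positives(true_rate=0.10, visitors_per_variant=1000,
                             looks=10, runs=400, alpha=0.05, seed=1):
    """Return (false-positive rate with peeking, rate with one final check)."""
    rng = random.Random(seed)
    step = visitors_per_variant // looks
    peeking_hits = 0
    final_hits = 0
    for _ in range(runs):
        c_a = c_b = 0
        flagged_at_any_look = False
        p = 1.0
        for look in range(1, looks + 1):
            # Both variants convert at the same true rate: there is no real winner.
            c_a += sum(rng.random() < true_rate for _ in range(step))
            c_b += sum(rng.random() < true_rate for _ in range(step))
            p = p_value(look * step, c_a, look * step, c_b)
            if p < alpha:
                flagged_at_any_look = True
        peeking_hits += flagged_at_any_look
        final_hits += p < alpha  # p now holds the result of the last look
    return peeking_hits / runs, final_hits / runs


if __name__ == "__main__":
    peeking, final_only = simulate_false_positives()
    print(f"False positives when stopping at any significant look: {peeking:.1%}")
    print(f"False positives when checking once at the end: {final_only:.1%}")
```

Because the final look is also one of the interim looks, the peeking rate can never come out lower than the single-check rate; in practice it tends to land well above the nominal 5%.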
A two-proportion z-test (two-tailed), the standard approach for comparing two conversion rates. For multivariate tests or sequential analysis, use a dedicated platform like VWO.
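For readers who want to check the numbers by hand, here is a minimal Python sketch of a two-tailed two-proportion z-test with a pooled standard error. It follows the standard textbook formula and is not necessarily identical to this calculator's implementation; the function and field names are made up for the example.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(visitors_a, conversions_a, visitors_b, conversions_b):
    """Compare two conversion rates with a pooled, two-tailed z-test."""
    if visitors_a <= 0 or visitors_b <= 0:
        raise ValueError("each variant needs at least one visitor")

    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b

    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    se = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    if se == 0:
        raise ValueError("no variation in the data; the test is undefined")

    z = (rate_b - rate_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed

    # Relative uplift of B over A; undefined when A has no conversions.
    uplift = (rate_b - rate_a) / rate_a if rate_a > 0 else None

    return {
        "rate_a": rate_a,
        "rate_b": rate_b,
        "relative_uplift": uplift,
        "z": z,
        "p_value": p_value,
        "significant_at_95": p_value < 0.05,
    }


if __name__ == "__main__":
    result = two_proportion_z_test(10_000, 500, 10_000, 570)
    for name, value in result.items():
        print(f"{name}: {value}")
```

With 10,000 visitors per variant and 500 versus 570 conversions, this gives a z-score of about 2.2 and a p-value of about 0.028, so the 14% relative uplift clears the 0.05 threshold.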
This tool is free and runs entirely in your browser. The link above is an affiliate link: we may earn a commission if you sign up, at no extra cost to you, and it never changes our honest take.