
A/B Test Sample Size Calculator

Drag the sliders. Watch the bar fill. The number you need is on the right.

Controls · v1.0

Baseline conversion rate: 4.0% (slider range 0.1%–50.0%)
Today's rate for the metric you're testing.

Minimum detectable effect: +10% relative (slider range +1% to +100%)
Treatment target: 4.40%.

Statistical power: 80% (slider range 50–99)
Probability of catching a real effect when one exists. 80% is convention.

Significance level: 95% (slider range 80–99.9)
Two-sided. 95% is convention. Caps the false-positive rate at 5% per test.

Daily traffic per variant: 2,000 (slider range 50–50,000)
What each variant gets per day after splitting.

Output

Sample per variant: 39,473 (plotted on a log scale, 100–10M)

Total (both): 78,946

Days to call it: 20

Detect a lift from 4.00% to 4.40% at 95% significance and 80% power.

two-proportion z-test · two-sided

Assumptions

Three notes from someone who's run this calculation in anger hundreds of times.

80% power and 95% significance are the numbers you see in every textbook because somebody picked them in 1925 and we never argued back. They're fine starting points. They are not the right answer for every test you run.

If a wrong call costs you a quarter of revenue, push significance to 99% and accept the longer test. If you're testing copy on a button and the worst-case downside is shrugging and reverting, 90% is plenty. The defaults exist so you don't have to think. Most senior experimenters think anyway.

The calculator above doesn't know which of those situations you're in. You do.

Questions people ask

How does this calculator decide how big my A/B test should be?

It uses the standard two-proportion z-test formula. Given a baseline conversion rate, the smallest effect you want to detect, the statistical power you want, and the significance level you're willing to accept, it solves for the sample size per variant that satisfies all four. The result is what you need before you peek, not after.
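For the curious, here is a minimal Python sketch of that calculation, using the unpooled-variance form of the two-proportion formula. The function name and signature are mine, not the calculator's actual code, but it reproduces the 39,473 per-variant figure from the example above.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_variant(baseline, relative_mde, power=0.80, alpha=0.05):
    """Two-sided two-proportion z-test sample size, one variant."""
    p1 = baseline
    p2 = baseline * (1 + relative_mde)             # treatment target
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 at 95% significance
    z_beta = NormalDist().inv_cdf(power)           # 0.84 at 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # unpooled Bernoulli variances
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

print(sample_size_per_variant(0.04, 0.10))  # 39473 -- matches the output above
```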

What's the difference between relative and absolute minimum detectable effect (MDE)?

Absolute is in conversion-rate points. If your baseline is 4% and your absolute MDE is 1%, you're trying to detect a lift to 5%. Relative is a percentage of the baseline. A 25% relative MDE on a 4% baseline is also a lift to 5%. Relative MDE is usually the more honest framing — it tells you how much extra revenue you can pay for.
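The same conversion, as a few lines of Python with the example values from this answer:

```python
baseline = 0.04
absolute_mde = 0.01                     # in conversion-rate points
relative_mde = absolute_mde / baseline  # 0.25, i.e. a +25% relative lift
target = baseline * (1 + relative_mde)  # 0.05 -- same target either way
```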

Why is 80% statistical power the default?

Eighty percent is the convention because it's a workable balance between false negatives and how long you have to wait. At 80% power you'll miss roughly one in five real wins. Bump it to 90% if missing a real winner is more expensive than running the test a bit longer.
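Plugging the worked example (4.00% baseline, +10% relative MDE) into the sample_size_per_variant sketch from the first answer shows what that insurance costs:

```python
# Assumes sample_size_per_variant() from the sketch in the first answer.
print(sample_size_per_variant(0.04, 0.10, power=0.80))  # 39473 per variant
print(sample_size_per_variant(0.04, 0.10, power=0.90))  # 52842 -- about 34% more
```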

Why is 95% significance the default?

Ninety-five percent is the convention because it caps the false-positive rate at 5% per test. If you run a lot of tests in parallel or peek at results, that effective rate climbs fast. Pushing significance to 99% buys you a quieter dashboard at the cost of larger samples.
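The climb is just compounding. A quick sketch, assuming the tests are independent:

```python
alpha = 0.05
for k in (1, 5, 10, 20):
    family_rate = 1 - (1 - alpha) ** k  # chance of at least one false positive
    print(k, round(family_rate, 3))     # 1: 0.05, 5: 0.226, 10: 0.401, 20: 0.642
```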

Can I trust the days-to-significance estimate?

Treat it as a floor, not a forecast. The math assumes evenly distributed traffic, no day-of-week effects, no holidays, and no segment skew. In practice, plan for at least one full business cycle (typically two weeks) even if the calculator says you can call the test sooner.
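The floor itself is simple division. With the example's 39,473 per variant and 2,000 daily visitors per variant:

```python
from math import ceil

n_per_variant = 39_473     # sample size from the output above
daily_per_variant = 2_000  # what each variant gets per day
print(ceil(n_per_variant / daily_per_variant))  # 20 -- the "20 days" in the output
```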

What sample size should I use if my baseline conversion rate is very low?

Low baselines need disproportionately large samples because the variance of a Bernoulli outcome doesn't shrink the way the rate does. If your baseline is 1% and you want to detect a 10% relative lift, expect well over 150,000 users per variant at the default 95%/80%. If you don't have that volume, test a higher-funnel metric where the baseline is bigger.
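Running the sample_size_per_variant sketch from the first answer at a 1% baseline makes the point:

```python
# Assumes sample_size_per_variant() from the sketch in the first answer.
print(sample_size_per_variant(0.01, 0.10))  # 163092 per variant
print(sample_size_per_variant(0.04, 0.10))  # 39473 -- 4x the baseline, ~4x fewer users
```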