Bayesian Calculator for UX Research

Compare two variants on a binary outcome — task success, click-through, theme prevalence — and get answers like "A is 96% likely to be better" instead of "p = 0.18 — inconclusive."

Bayesian results are easier to act on, easier to explain, and honest about uncertainty when samples are small.

Group A

Group B

Prior

Custom…: Only use this when you have real prior data or expert estimates. Don't have one? Build one with the Intuition-based Priors Calculator.

Default: A uniform prior. Every rate from 0% to 100% is treated as equally plausible before any data is seen.

Estimated rates for each group (posterior distributions)

These curves show your updated belief about each group's true rate after combining the data with the prior. Tighter, taller curves mean more certainty. The faint dashed lines show the prior for comparison.
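The update described above can be sketched in a few lines. This is a minimal illustration, not the calculator's actual code: it assumes a conjugate Beta prior on each group's rate, with the default uniform prior written as Beta(1, 1). The function names and the example counts (18 successes in 25 trials) are hypothetical.

```python
def beta_posterior(successes, trials, prior_a=1.0, prior_b=1.0):
    """Return the (a, b) parameters of the Beta posterior for a binary rate.

    Assumes a Beta(prior_a, prior_b) prior; Beta(1, 1) is the uniform prior.
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    failures = trials - successes
    # Conjugate update: add successes to a, failures to b.
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Posterior mean, one possible choice of point estimate."""
    return a / (a + b)


# Hypothetical data: 18 of 25 participants completed the task.
a, b = beta_posterior(successes=18, trials=25)
print(a, b, round(beta_mean(a, b), 3))
```

More data adds more to `a` and `b`, which is what makes the curve taller and tighter.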

Probability of Direction (which group is more likely better?)

The curve shows the posterior distribution of the difference pA − pB. The shaded areas on each side of zero are the Probability of Direction: how likely each group is to be the better one.
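One way to approximate those shaded areas is by simulation: draw many plausible rates for each group from its posterior and count how often A's draw exceeds B's. The sketch below assumes Beta posteriors built from a uniform prior and uses only Python's standard library; the function name, the seed, and the example counts are illustrative assumptions, and the calculator may compute this differently.

```python
import random


def probability_of_direction(succ_a, n_a, succ_b, n_b,
                             draws=100_000, seed=0,
                             prior_a=1.0, prior_b=1.0):
    """Estimate P(pA > pB) and P(pB > pA) by Monte Carlo sampling."""
    rng = random.Random(seed)
    a_wins = 0
    for _ in range(draws):
        # One plausible "true rate" per group, drawn from its Beta posterior.
        p_a = rng.betavariate(prior_a + succ_a, prior_b + n_a - succ_a)
        p_b = rng.betavariate(prior_a + succ_b, prior_b + n_b - succ_b)
        if p_a > p_b:
            a_wins += 1
    share_a = a_wins / draws
    return share_a, 1.0 - share_a


# Hypothetical data: A succeeded 18/25 times, B succeeded 12/25 times.
pd_a, pd_b = probability_of_direction(18, 25, 12, 25)
print(f"A better: {pd_a:.1%}, B better: {pd_b:.1%}")
```

The two shares sum to 1, matching the two shaded areas on either side of zero.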

Credible intervals

A credible interval is a range that contains the true value with a stated probability, given your data and prior. A 95% credible interval can be read literally: "there's a 95% probability the true value falls in this range." That is the reading most people incorrectly give to a frequentist confidence interval. Wider intervals signal more uncertainty, usually from smaller samples.

Group Observed Best estimate Lower Upper
A
B
A − B

Showing the 95% interval. Adjust the level above to widen or tighten.
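The Lower and Upper columns can be approximated from posterior draws. The sketch below computes an equal-tailed interval, which cuts the same share off each end of the distribution; that choice is an assumption, since the page does not say whether the calculator uses equal-tailed or highest-density intervals. The helper name and the example counts are hypothetical.

```python
import random


def credible_interval(samples, level=0.95):
    """Equal-tailed credible interval from a list of posterior draws."""
    if not 0 < level < 1:
        raise ValueError("level must be between 0 and 1")
    ordered = sorted(samples)
    tail = (1.0 - level) / 2.0
    last = len(ordered) - 1
    # Pick the draws sitting at the lower and upper tail cut-offs.
    return ordered[round(tail * last)], ordered[round((1.0 - tail) * last)]


# Hypothetical data with a uniform Beta(1, 1) prior: A 18/25, B 12/25.
rng = random.Random(0)
draws_a = [rng.betavariate(1 + 18, 1 + 7) for _ in range(50_000)]
draws_b = [rng.betavariate(1 + 12, 1 + 13) for _ in range(50_000)]
diff = [a - b for a, b in zip(draws_a, draws_b)]

for label, draws in (("A", draws_a), ("B", draws_b), ("A - B", diff)):
    lower, upper = credible_interval(draws, level=0.95)
    print(f"{label}: {lower:.3f} to {upper:.3f}")
```

Raising `level` moves both cut-offs outward, so the interval widens; lowering it tightens the interval.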

Region of Practical Equivalence (ROPE)

The ROPE is a band around zero difference that you'd consider "not meaningful." Statistical evidence and practical importance aren't the same: a real difference of 0.3 percentage points (pp) may not be worth acting on. By committing to a ROPE up front, you turn "is there a difference?" into the more useful question, "is the difference big enough to matter?" The bar below shows what fraction of the posterior falls below, inside, and above the ROPE.

± percentage points

The verdict above uses a strict 95% threshold so you can act with high confidence. Adjust the ROPE to match the smallest difference that would actually change your decision.
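The three-way split and a threshold-based verdict can be sketched as follows. This is an illustration under stated assumptions: the input is a list of posterior draws of the difference pA − pB, the ROPE is given as a half-width in percentage points, a higher rate counts as better, and the verdict wording is hypothetical rather than the calculator's own.

```python
def rope_shares(diff_draws, rope_pp=2.0):
    """Share of posterior draws of (pA - pB) below, inside, and above the ROPE.

    rope_pp is the half-width of the ROPE in percentage points.
    """
    half_width = rope_pp / 100.0
    n = len(diff_draws)
    below = sum(d < -half_width for d in diff_draws) / n
    above = sum(d > half_width for d in diff_draws) / n
    inside = 1.0 - below - above
    return below, inside, above


def verdict(below, inside, above, threshold=0.95):
    """Hypothetical verdict rule: commit only when one share passes the threshold."""
    if above >= threshold:
        return "A is meaningfully better"
    if below >= threshold:
        return "B is meaningfully better"
    if inside >= threshold:
        return "Practically equivalent"
    return "Not enough evidence yet"


# Tiny hand-made example: four draws of the difference, ROPE of ±2 pp.
shares = rope_shares([-0.05, 0.0, 0.01, 0.05], rope_pp=2.0)
print(shares, verdict(*shares))
```

A wider ROPE shifts mass into the "inside" share, so a meaningful difference becomes harder to declare and practical equivalence becomes easier to declare.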

Compare to frequentist test

Fisher's exact test p-value:

A frequentist test for the same data. We surface this for comparison only — the Bayesian results above are the primary answer.
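For readers who want to reproduce the comparison, here is a minimal standard-library sketch of a two-sided Fisher's exact test on a 2×2 table. It uses the common definition that sums the probabilities of all tables no more likely than the observed one; whether the calculator uses this exact two-sided rule is an assumption, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(succ_a, n_a, succ_b, n_b):
    """Two-sided Fisher's exact test p-value for two groups' success counts."""
    total_succ = succ_a + succ_b
    denom = comb(n_a + n_b, total_succ)

    def table_prob(k):
        # Hypergeometric probability that group A has exactly k successes,
        # holding both group sizes and the total success count fixed.
        return comb(n_a, k) * comb(n_b, total_succ - k) / denom

    observed = table_prob(succ_a)
    k_min = max(0, total_succ - n_b)
    k_max = min(n_a, total_succ)
    # Sum every table that is as likely as, or less likely than, the observed one.
    # The small tolerance guards against floating-point ties.
    p_value = sum(
        table_prob(k)
        for k in range(k_min, k_max + 1)
        if table_prob(k) <= observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical data: A succeeded 18/25 times, B succeeded 12/25 times.
print(round(fisher_exact_two_sided(18, 25, 12, 25), 4))
```

The p-value answers a different question from the Probability of Direction: it is the chance of seeing data at least this lopsided if the two groups truly had the same rate.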

Posterior parameters