Fieller Intervals

Organizations can choose Fieller Intervals as the method for calculating confidence intervals for the relative change between the test and control groups.

The Delta Method is an approximation for the variance of a ratio between two variables that is then used to establish a confidence interval, while Fieller Intervals are an exact solution for the confidence interval.

In most cases though, Fieller Interval results are very similar to results from the Delta Method. Since Fieller Intervals are more accurate, we recommend that you opt into using this methodology!

Calculation

1 Determine if a Fieller Interval is Well-Defined

Before applying Fieller's Theorem, we need to check that the denominator of the relative lift metric, $\overline{X_C}$, is significantly distinct from 0.

We do this by calculating the parameter $g$:

$$g = \frac{Z_{\alpha/2}^2 \cdot var(X_C)}{(n_C-1) \cdot \overline{X_C}^2}$$

Where:
- $Z_{\alpha/2}$ is the critical value associated with the desired confidence level
- $var(X_C)$ is the variance of the control group metric values
- $n_C$ is the number of units in the control group
- $\overline{X_C}$ is the mean of the control group metric values

When g < 1, the control mean is significantly different from 0, and we can use Fieller intervals.
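The check above can be sketched in a few lines of Python. This is a minimal illustration, not Statsig's implementation: the function names `fieller_g` and `is_well_defined` are hypothetical, the variance is passed in as a precomputed number, and the critical value comes from the standard library's `NormalDist`.

```python
from statistics import NormalDist


def fieller_g(mean_c: float, var_c: float, n_c: int, alpha: float = 0.05) -> float:
    """Return g = Z^2 * var(X_C) / ((n_C - 1) * mean(X_C)^2), as defined above."""
    if mean_c == 0:
        # A control mean of exactly zero can never be distinguished from zero.
        return float("inf")
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value Z_{alpha/2}
    return z**2 * var_c / ((n_c - 1) * mean_c**2)


def is_well_defined(g: float) -> bool:
    """The interval is treated as well-defined only when g < 1."""
    return g < 1


# Illustrative numbers only: control mean 10, variance 25, 1001 units.
g = fieller_g(mean_c=10.0, var_c=25.0, n_c=1001)
print(g, is_well_defined(g))
```

With a large control group and a mean far from zero, `g` is tiny; it grows as the control mean approaches zero or the sample shrinks.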

2A Apply Fieller Interval Formula

Since the control and test group results are independent of each other, covariance terms in Fieller's Theorem can be dropped.

$$CI(\% \Delta \overline{X}) = \frac{1}{1-g} \left( \frac{\overline{X_T}}{\overline{X_C}} - 1 \pm \frac{Z_{\alpha/2}}{\sqrt{n_C} \cdot \overline{X_C}} \sqrt{(1-g) \cdot \frac{var(X_T)}{n_T(n_T-1)} + \frac{\overline{X_T} \, var(X_C)}{\overline{X_C} \, n_C (n_C-1)}} \right)$$
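For intuition, here is a minimal Python sketch of the textbook form of Fieller's theorem with the covariance term set to zero. It is not a term-by-term transcription of the expression above and not Statsig's implementation. It assumes the caller supplies the estimated variances of the two sample means directly (the hypothetical parameters `se2_t` and `se2_c`), so any sample-size scaling happens before the call. The function name `fieller_relative_ci` is also hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def fieller_relative_ci(mean_t, mean_c, se2_t, se2_c, alpha=0.05):
    """Textbook Fieller interval for (mean_t / mean_c - 1), independent groups.

    se2_t, se2_c: estimated variances of the test and control sample means
    (assumed inputs for this sketch). Returns (low, high), or None when the
    control mean is not distinguishable from zero (g >= 1).
    """
    if mean_c == 0:
        return None
    z = NormalDist().inv_cdf(1 - alpha / 2)
    g = z**2 * se2_c / mean_c**2
    if g >= 1:
        return None  # the interval is unbounded
    ratio = mean_t / mean_c
    # Half-width of the ratio interval before the 1 / (1 - g) rescaling.
    half = (z / abs(mean_c)) * sqrt((1 - g) * se2_t + ratio**2 * se2_c)
    low = (ratio - half) / (1 - g) - 1   # subtract 1 to turn a ratio into a lift
    high = (ratio + half) / (1 - g) - 1
    return low, high


# Illustrative numbers only.
print(fieller_relative_ci(mean_t=10.5, mean_c=10.0, se2_t=0.04, se2_c=0.04))
```

When `g` is close to zero, the `1 / (1 - g)` factor is close to 1 and the result is nearly the same as a Delta Method interval, which matches the observation that the two methods usually agree.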

2B Edge Case: Control Mean not Statistically Distinct from Zero

In rare cases (less than 5% of observed metric comparisons on Statsig), $g \geq 1$, which means that the control group's mean is not statistically distinguishable from 0.

When $\overline{X_C}$ is not statistically different from zero, the denominator of our relative lift calculation is unstable. This means that the confidence interval for the percent difference between test and control is unbounded.
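One way to see why the interval becomes unbounded is through the textbook statement of Fieller's theorem with zero covariance. The symbols $v_T$ and $v_C$ are shorthand introduced only for this sketch: they stand for the estimated variances of the test and control means, so that $g = Z_{\alpha/2}^2 \, v_C / \overline{X_C}^2$. The interval is the set of ratios $\rho$ that are consistent with the data:

$$(\overline{X_T} - \rho \, \overline{X_C})^2 \leq Z_{\alpha/2}^2 \, (v_T + \rho^2 v_C)$$

Expanding and collecting powers of $\rho$ gives a quadratic inequality:

$$\overline{X_C}^2 (1-g) \, \rho^2 - 2 \, \overline{X_T} \, \overline{X_C} \, \rho + \left( \overline{X_T}^2 - Z_{\alpha/2}^2 \, v_T \right) \leq 0$$

When $g < 1$, the $\rho^2$ coefficient is positive, the parabola opens upward, and the solutions form a finite interval between the two roots. When $g \geq 1$, that coefficient is zero or negative, so the solution set is no longer a finite interval.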

When this happens, we surface the relative lift observed during the experiment.

$$\% \Delta \overline{X} = \frac{\overline{X_T}-\overline{X_C}}{\overline{X_C}}$$

Enabling on Statsig

You can choose which relative confidence interval methodology to use in your Experimentation Settings at the Organization level. Changing this setting only affects experiments created after the change.

[Image: Experimentation Settings]

In many cases, the results will be effectively the same as using the Delta Method, but especially if you’re running experiments with small sample sizes or noisy denominators, Fieller Intervals are more reliable. Thus, we'd strongly recommend using Fieller Intervals.

In the experiment scorecard, Fieller Intervals will look like this:

[Image: Fieller Intervals in the experiment scorecard]