Fieller Intervals
Organizations can choose Fieller Intervals as the method for calculating confidence intervals on the relative change between the test and control groups.
The Delta Method is an approximation for the variance of a ratio between two variables that is then used to establish a confidence interval, while Fieller Intervals are an exact solution for the confidence interval.
In most cases, though, Fieller Interval results are very similar to Delta Method results. Because Fieller Intervals are more accurate, we recommend opting into this methodology.
Calculation
1 Determine if a Fieller Interval is Well-Defined
Before applying Fieller's Theorem, we need to check that the denominator of the relative lift metric (the control mean) is significantly different from 0.
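We do this by calculating the parameter g:
$$g = \frac{z_{\alpha/2}^2 \cdot \mathrm{var}(X_C)}{n_C \cdot \overline{X}_C^2}$$
Where:
$z_{\alpha/2}$ is the critical value associated with the desired confidence level
$\mathrm{var}(X_C)$ is the variance of the control group metric values
$n_C$ is the number of units in the control group
$\overline{X}_C$ is the mean of the control group metric values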
When g < 1, the control mean is significantly different from 0, and we can use Fieller intervals.
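As a minimal sketch of this check, the Python below computes g from raw control values. It assumes the standard Fieller definition, g = z² · var(X_C) / (n_C · mean(X_C)²), with a two-sided normal critical value. The function name `fieller_g`, the use of the sample variance, and the sample data are hypothetical choices for illustration, not Statsig's implementation.

```python
import math
from statistics import NormalDist, mean, variance


def fieller_g(control_values, confidence=0.95):
    """Return g = z^2 * var(X_C) / (n_C * mean(X_C)^2) for the control group."""
    n_c = len(control_values)
    mean_c = mean(control_values)
    if mean_c == 0:
        # The denominator is exactly zero, so g is unbounded.
        return math.inf
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Assumption of this sketch: sample variance with an n - 1 denominator.
    return z ** 2 * variance(control_values) / (n_c * mean_c ** 2)


# Hypothetical control-group metric values, well away from zero.
control = [10.2, 9.8, 11.5, 10.9, 9.4, 10.1, 10.7, 9.9]
g = fieller_g(control)
print(g < 1)  # True means the Fieller interval is well-defined
```

Equivalently, g is the squared ratio of the critical value times the standard error of the control mean to the control mean itself, so g < 1 says the control mean is more than z standard errors away from 0.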
2A Apply Fieller Interval Formula
Since the control and test group results are independent of each other, covariance terms in Fieller's Theorem can be dropped.
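The sketch below shows what the resulting calculation might look like. It assumes the textbook form of Fieller's Theorem with the covariance set to zero, shifted by 1 to express the ratio of means as a relative lift. The function name `fieller_relative_lift_ci`, its parameters, and the summary statistics are hypothetical, and the exact form Statsig uses may differ.

```python
import math
from statistics import NormalDist


def fieller_relative_lift_ci(mean_t, var_t, n_t, mean_c, var_c, n_c, confidence=0.95):
    """Fieller interval for mean_t / mean_c - 1 with independent groups.

    Returns (lower, upper), or None when g >= 1 and the interval is unbounded.
    """
    if mean_c == 0:
        return None
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    g = z ** 2 * var_c / (n_c * mean_c ** 2)
    if g >= 1:
        return None
    ratio = mean_t / mean_c
    # Variances of the two group means.
    v_t = var_t / n_t
    v_c = var_c / n_c
    # No covariance term appears because the groups are independent.
    half_width = (z / abs(mean_c)) * math.sqrt((1 - g) * v_t + ratio ** 2 * v_c)
    lower = (ratio - half_width) / (1 - g) - 1
    upper = (ratio + half_width) / (1 - g) - 1
    return lower, upper


# Hypothetical summary statistics: observed lift of +5%.
lower, upper = fieller_relative_lift_ci(
    mean_t=10.5, var_t=4.0, n_t=1000,
    mean_c=10.0, var_c=4.0, n_c=1000,
)
print(f"{lower:+.2%} to {upper:+.2%}")  # roughly +3.2% to +6.8%
```

With g this small (about 0.00015), the 1 / (1 - g) factor is close to 1, which is why the result is nearly identical to a Delta Method interval in this example.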
2B Edge Case: Control Mean not Statistically Distinct from Zero
In rare cases (less than 5% of observed metric comparisons on Statsig), g ≥ 1, which means that the control group's mean is not statistically distinguishable from 0.
When the control mean is not statistically different from zero, the denominator of our relative lift calculation is unstable. This means that the confidence interval for the percent difference between test and control is unbounded.
When this happens, we surface the relative lift observed during the experiment.
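One way to picture this fallback is the sketch below, which returns the observed lift in every case and attaches an interval only when g < 1. It assumes the standard zero-covariance Fieller formula. The function `relative_lift_summary`, its dictionary keys, and the input numbers are hypothetical and are not Statsig's API.

```python
import math
from statistics import NormalDist


def relative_lift_summary(mean_t, var_t, n_t, mean_c, var_c, n_c, confidence=0.95):
    """Return the observed lift, g, and a Fieller interval when one is well-defined."""
    if mean_c == 0:
        raise ValueError("relative lift is undefined when the control mean is exactly 0")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    g = z ** 2 * var_c / (n_c * mean_c ** 2)
    ratio = mean_t / mean_c
    lift = ratio - 1
    if g >= 1:
        # Control mean is not distinguishable from 0: report the point estimate only.
        return {"lift": lift, "g": g, "ci": None}
    half_width = (z / abs(mean_c)) * math.sqrt(
        (1 - g) * var_t / n_t + ratio ** 2 * var_c / n_c
    )
    ci = ((ratio - half_width) / (1 - g) - 1, (ratio + half_width) / (1 - g) - 1)
    return {"lift": lift, "g": g, "ci": ci}


# Hypothetical noisy metric whose control mean sits close to zero.
noisy = relative_lift_summary(
    mean_t=0.30, var_t=25.0, n_t=200,
    mean_c=0.20, var_c=25.0, n_c=200,
)
print(noisy["ci"])  # None, because g is about 12 here
```

In this example the standard error of the control mean (about 0.35) is larger than the mean itself (0.20), so the denominator could plausibly be zero or negative and no finite interval for the ratio exists.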
Enabling on Statsig
You can choose which relative confidence interval methodology to use in your Experimentation Settings at the Organization level. Changing this setting only impacts experiments created after the change.
In many cases, the results will be effectively the same as with the Delta Method. Fieller Intervals are more reliable, however, when experiments have small sample sizes or noisy denominators, so we strongly recommend using them.
In the experiment scorecard, Fieller Intervals will look like this:
[Image: Fieller Intervals in the experiment scorecard]