Variance Outlier Test: 4. Unbalanced designs

The approaches described below are conceptually the simplest, but they require more calculation than necessary. If you plan to apply the G test routinely, see § 3.6 of the manuscript.

4.1 One-sided upper limit G test

For each data set j, calculate the ratio G_j according to § 2.1. For each data set j, also calculate the upper limit critical value G_UL at the desired significance level α according to § 2.3, and check if G_j ≤ G_UL.

If G_j ≤ G_UL for all data sets, there is no reason to assume that one of the data sets has an exceptionally large variance value in comparison with the other data sets.
If G_j > G_UL for one data set, label the corresponding variance value as "exceptionally large", remove the value from the variance data, and repeat the test on the remaining variance values.
If G_j > G_UL for several data sets, lower the significance level of the test until only one data set still shows G_j > G_UL. Label the corresponding variance value as "exceptionally large", remove the value from the variance data, and repeat the test on the remaining variance values at the lowered significance level. Continue until you have identified and removed all deviant data sets at the lowered significance level. Then rerun the test on the remaining variance data while gradually raising the significance level until it is back at its initial value.
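The first step of this check can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the ratio G_j is taken from the spreadsheet formulas in § 4.4 (degrees of freedom times squared standard deviation, divided by the sum of that quantity over all data sets), and the upper limits are placeholders that you would replace with the values calculated according to § 2.3.

```python
def g_ratios(sizes, sds):
    """Return (nu, G): nu[j] = n_j - 1 and G[j] = nu_j*s_j^2 / sum_i(nu_i*s_i^2)."""
    nu = [n - 1 for n in sizes]
    ss = [v * s ** 2 for v, s in zip(nu, sds)]
    total = sum(ss)
    return nu, [x / total for x in ss]


def exceeds_upper(g, g_ul):
    """Indices j with G_j > G_UL,j. Each data set has its own limit,
    because the degrees of freedom differ in an unbalanced design."""
    return [j for j, (gj, limit) in enumerate(zip(g, g_ul)) if gj > limit]


sizes = [3, 5, 4]           # hypothetical numbers of observations n_j
sds = [1.0, 1.0, 3.0]       # hypothetical standard deviations s_j
nu, g = g_ratios(sizes, sds)
g_ul = [0.70, 0.85, 0.75]   # placeholder limits, NOT calculated per § 2.3
flagged = exceeds_upper(g, g_ul)
print(flagged)
```

An empty list corresponds to the first case above, a single index to the second, and several indices to the third.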

4.2 One-sided lower limit G test

For each data set j, calculate the ratio G_j according to § 2.1. For each data set j, also calculate the lower limit critical value G_LL at the desired significance level α according to § 2.3, and check if G_j ≥ G_LL.

If G_j ≥ G_LL for all data sets, there is no reason to assume that one of the data sets has an exceptionally small variance value in comparison with the other data sets.
If G_j < G_LL for one data set, label the corresponding variance value as "exceptionally small", remove the value from the variance data, and repeat the test on the remaining variance values.
If G_j < G_LL for several data sets, lower the significance level of the test until only one data set still shows G_j < G_LL. Label the corresponding variance value as "exceptionally small", remove the value from the variance data, and repeat the test on the remaining variance values at the lowered significance level. Continue until you have identified and removed all deviant data sets at the lowered significance level. Then rerun the test on the remaining variance data while gradually raising the significance level until it is back at its initial value.

4.3 Two-sided G test

For each data set j, calculate the ratio G_j according to § 2.1. For each data set j, also calculate the upper limit critical value G_UL and the lower limit critical value G_LL at the desired significance level α according to § 2.3, and check if G_LL ≤ G_j ≤ G_UL.
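For reference, the quantities that the spreadsheet in § 4.4 evaluates can be written out as below. This is a transcription of those cell formulas, not a quotation of § 2.1 or § 2.3. It assumes that column B holds the number of observations n_j, column C the standard deviation s_j, and cell G3 the significance level α. L is the number of data sets, and F_{p; ν1, ν2} denotes the value that an F-distributed variable exceeds with probability p, which is the convention of the spreadsheet function FINV.

```latex
\[
G_j = \frac{\nu_j s_j^2}{\sum_{i=1}^{L} \nu_i s_i^2},
\qquad \nu_j = n_j - 1,
\qquad \nu_{\mathrm{tot}} = \sum_{i=1}^{L} \nu_i
\]
\[
G_{LL,j} = \left[ 1 + \frac{\nu_{\mathrm{tot}}/\nu_j - 1}
  {F_{1-\alpha/(2L);\;\nu_j,\;\nu_{\mathrm{tot}}-\nu_j}} \right]^{-1},
\qquad
G_{UL,j} = \left[ 1 + \frac{\nu_{\mathrm{tot}}/\nu_j - 1}
  {F_{\alpha/(2L);\;\nu_j,\;\nu_{\mathrm{tot}}-\nu_j}} \right]^{-1}
\]
```

Because ν_j differs between data sets in an unbalanced design, each data set gets its own pair of limits.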

If G_LL ≤ G_j ≤ G_UL for all data sets, there is no reason to assume that one of the data sets has a deviant variance value in comparison with the other data sets.
If G_LL ≤ G_j ≤ G_UL is not met for one particular data set, label the corresponding variance value as "deviant", remove the value from the variance data, and repeat the test on the remaining variance values.
If G_LL ≤ G_j ≤ G_UL is not met for several data sets, lower the significance level of the test until only one data set still fails G_LL ≤ G_j ≤ G_UL. Label the corresponding variance value as "deviant", remove the value from the variance data, and repeat the test on the remaining variance values at the lowered significance level. Continue until you have identified and removed all deviant data sets at the lowered significance level. Then rerun the test on the remaining variance data while gradually raising the significance level until it is back at its initial value.
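The three cases above can be combined into one loop. The sketch below is one possible reading of the procedure, not the manuscript's own algorithm. It makes these assumptions: `limits` is a hypothetical callback that returns (G_LL, G_UL) for a data set; the lowered significance level is found by bisection; the level is raised again by a factor `grow`; and the limits narrow as the significance level increases.

```python
def two_sided_g_test(nu, ss, alpha, limits, grow=2.0, max_bisect=60):
    """Iterative two-sided G test (sketch).

    nu[j]  -- degrees of freedom of data set j
    ss[j]  -- nu[j] * s_j**2
    limits -- hypothetical callback: limits(nu_j, nu_tot, n_sets, a) -> (G_LL, G_UL)
    Returns (removed, remaining) as lists of data set indices.
    """
    active = list(range(len(nu)))
    removed = []

    def failing(a):
        # Data sets among the active ones that violate G_LL <= G_j <= G_UL at level a.
        nu_tot = sum(nu[j] for j in active)
        ss_tot = sum(ss[j] for j in active)
        out = []
        for j in active:
            g_ll, g_ul = limits(nu[j], nu_tot, len(active), a)
            if not g_ll <= ss[j] / ss_tot <= g_ul:
                out.append(j)
        return out

    a = alpha
    while len(active) > 1:
        bad = failing(a)
        if len(bad) > 1:
            # Several failures: search for a lower level at which exactly one remains.
            lo, hi = 0.0, a
            for _ in range(max_bisect):
                mid = (lo + hi) / 2
                bad = failing(mid)
                if len(bad) == 1:
                    a = mid
                    break
                if bad:
                    hi = mid
                else:
                    lo = mid
            else:
                break  # no level separates the failing sets (a tie); stop here
        if len(bad) == 1:
            removed.append(bad[0])      # label as "deviant" and remove
            active.remove(bad[0])
        elif a < alpha:
            a = min(alpha, a * grow)    # nothing fails: raise the level again
        else:
            break                       # nothing fails at the initial level
    return removed, active
```

The order of `removed` is the order in which the data sets were labelled "deviant". Choosing bisection and a fixed growth factor is a convenience of this sketch; any scheme that lowers and raises the level gradually fits the description above.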

4.4 Example case of two-sided G test

The data considered here are identical to those discussed in § 7 of the manuscript.

Cell contents:

D5: "=B5-1"

E5: "=D5*C5^2"

G5: "=1/(1+(D13/D5-1)/FINV(1-G3/2/COUNT(C5:C12),D5,D13-D5))"

H5: "=E5/E13"

I5: "=1/(1+(D13/D5-1)/FINV(G3/2/COUNT(C5:C12),D5,D13-D5))"

(...)

D12: "=B12-1"

E12: "=D12*C12^2"

G12: "=1/(1+(D13/D12-1)/FINV(1-G3/2/COUNT(C5:C12),D12,D13-D12))"

H12: "=E12/E13"

I12: "=1/(1+(D13/D12-1)/FINV(G3/2/COUNT(C5:C12),D12,D13-D12))"

D13: "=SUM(D5:D12)"

E13: "=SUM(E5:E12)"
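For readers who prefer a script to a spreadsheet, the cell formulas above can be approximated in plain Python as follows. This is a sketch under several assumptions: column B is taken to hold the number of observations, column C the standard deviation, and G3 the significance level α; all function names are hypothetical. To avoid a statistics library, it uses the identity that 1/(1+(ν2/ν1)/F) turns a quantile of the F distribution with (ν1, ν2) degrees of freedom into the same quantile of a Beta(ν1/2, ν2/2) distribution. The beta quantile is found by bisection on a continued-fraction evaluation of the incomplete beta function.

```python
import math


def _betacf(a, b, x):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, 500):
        even = m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m))
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1))
        for num in (even, odd):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            if abs(c) < tiny:
                c = tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < 1e-14:
            break
    return h


def beta_cdf(x, a, b):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def beta_ppf(p, a, b):
    """Quantile of Beta(a, b) by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if beta_cdf(mid, a, b) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def g_limits(nu_j, nu_tot, n_sets, alpha):
    """(G_LL, G_UL) as in columns G and I, via the beta-quantile identity."""
    a, b = nu_j / 2, (nu_tot - nu_j) / 2
    p = alpha / (2 * n_sets)          # G3/2/COUNT(C5:C12)
    return beta_ppf(p, a, b), beta_ppf(1 - p, a, b)


def g_table(sizes, sds, alpha):
    """One (G_LL, G_j, G_UL) tuple per data set, mirroring columns G, H and I."""
    nu = [n - 1 for n in sizes]                    # column D
    ss = [v * s ** 2 for v, s in zip(nu, sds)]     # column E
    nu_tot, ss_tot = sum(nu), sum(ss)              # cells D13 and E13
    return [g_limits(v, nu_tot, len(nu), alpha)[:1] + (x / ss_tot,)
            + g_limits(v, nu_tot, len(nu), alpha)[1:] for v, x in zip(nu, ss)]
```

A data set passes the two-sided test when the middle value of its tuple lies between the outer two. The results should agree with the spreadsheet only to within the accuracy of the bisection and of the spreadsheet's own FINV implementation.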


Saturday, November 6, 2010
