Calculation Procedure for the Randomization F Test for Regression Coefficients

 

Given: n paired X,Y data values. These are the observed (or real) data.

 

Step 1:        Generate a random number data set of n paired X’,Y’ values that is ‘modeled’ after the observed sample data (X,Y), so that the X’,Y’ data have the same distributional properties as X,Y.
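
Step 1 does not prescribe how the random data are produced. As a minimal sketch, one way to get X’,Y’ values with exactly the same marginal distributions as X,Y is to shuffle each observed column independently. This shuffling approach, the function name make_surrogate and the sample numbers are assumptions made for illustration, not part of the procedure as stated.

```python
import numpy as np

def make_surrogate(x, y, rng):
    """Return (X', Y') with the same marginal distributions as (x, y).

    Assumption: independent shuffling of each column stands in for the
    'modeling' described in Step 1. It keeps every observed value but
    randomizes which X is paired with which Y.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return rng.permutation(x), rng.permutation(y)

# Illustrative (made-up) observed data, n = 6
rng = np.random.default_rng(0)
x_obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y_obs = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.2])

x_sur, y_sur = make_surrogate(x_obs, y_obs, rng)
print(x_sur, y_sur)
```

Other choices, such as drawing from a distribution fitted to the observed values, would also fit the description in Step 1.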

 

Step 2:        Fit the 1st, 2nd and 3rd degree polynomials to the X’,Y’ data exactly as done in the Parametric F Test. This includes finding the ‘Sum of Squares due to Regression’ for the linear, quadratic and cubic terms (SSRL, SSRQ and SSRC), the ‘Mean Sum of Squares due to Deviations’ for the 3rd degree polynomial (MSD3), and the F statistics for each term (FL, FQ and FC).

 

Note that FL/s, FQ/s and FC/s (the suffix ‘/s’ stands for ‘spurious’) are spurious F values because they are calculated from a random number data set (X’,Y’) rather than from the observed data.
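
The calculation in Step 2 can be sketched as follows. The sketch assumes the usual sequential (extra sum of squares) decomposition: each term's sum of squares is the drop in residual sum of squares when the polynomial degree is raised by one, each term has 1 degree of freedom, and every F statistic is divided by MSD3 with n - 4 degrees of freedom. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def sequential_f_stats(x, y):
    """F statistics for the linear, quadratic and cubic terms.

    Assumption: SSRL, SSRQ and SSRC are the successive reductions in the
    residual sum of squares for degrees 1, 2 and 3, and each F value is
    the term's sum of squares divided by MSD3. Requires n > 4.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)

    # Residual sum of squares for polynomial degrees 0 (mean only) to 3
    sse = []
    for degree in range(4):
        coeffs = np.polyfit(x, y, degree)
        resid = y - np.polyval(coeffs, x)
        sse.append(float(resid @ resid))

    ssr_l = sse[0] - sse[1]   # SSRL
    ssr_q = sse[1] - sse[2]   # SSRQ
    ssr_c = sse[2] - sse[3]   # SSRC
    msd3 = sse[3] / (n - 4)   # MSD3

    return {"FL": ssr_l / msd3, "FQ": ssr_q / msd3, "FC": ssr_c / msd3}

# Synthetic example: a straight line plus noise
rng = np.random.default_rng(0)
x_demo = np.arange(10, dtype=float)
y_demo = 3.0 + 2.0 * x_demo + rng.normal(0.0, 0.5, size=10)
f_demo = sequential_f_stats(x_demo, y_demo)
print(f_demo)
```

The same function applies unchanged to a random X’,Y’ data set (giving spurious F values) and to the observed X,Y data.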

 

Step 3:        Repeat Steps 1 and 2 a large number of times (q), generating a new X’,Y’ data set each time (q should be at least 100). This gives q values of each spurious F statistic.

 

Step 4:        Plot the empirical probability distribution (for example, a histogram) of the q spurious values of each F statistic: one plot each for FL/s, FQ/s and FC/s.
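
Steps 3 and 4 might look like the sketch below. It assumes shuffled copies of the observed data serve as the random data sets and that the F statistics are sequential ones divided by MSD3; both are illustrative assumptions, as are the function names and the synthetic data. To avoid depending on a plotting library, the histogram is computed as bin counts rather than drawn.

```python
import numpy as np

def term_f_stats(x, y):
    """Return [FL, FQ, FC], assuming sequential sums of squares over MSD3."""
    n = len(y)
    sse = []
    for degree in range(4):
        resid = y - np.polyval(np.polyfit(x, y, degree), x)
        sse.append(float(resid @ resid))
    msd3 = sse[3] / (n - 4)
    return np.array([(sse[k] - sse[k + 1]) / msd3 for k in range(3)])

def spurious_f_values(x, y, q, rng):
    """Repeat Steps 1 and 2 q times; row i holds FL/s, FQ/s, FC/s of run i."""
    out = np.empty((q, 3))
    for i in range(q):
        # Assumption: independent shuffles act as the random data set
        out[i] = term_f_stats(rng.permutation(x), rng.permutation(y))
    return out

# Synthetic 'observed' data for illustration only
rng = np.random.default_rng(1)
x_obs = np.arange(10, dtype=float)
y_obs = 3.0 + 2.0 * x_obs + rng.normal(0.0, 1.0, size=10)

f_spur = spurious_f_values(x_obs, y_obs, q=200, rng=rng)

# Empirical distribution of each spurious F statistic as histogram counts
for name, column in zip(("FL/s", "FQ/s", "FC/s"), f_spur.T):
    counts, edges = np.histogram(column, bins=10)
    print(name, counts)
```

The counts and bin edges can be passed to any plotting tool to produce the three plots.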

 

Step 5:        Calculate the F statistics for the linear, quadratic and cubic terms for the observed X,Y data as is normally done in the Parametric F Test (FL/Obs, FQ/Obs and FC/Obs).

 

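Step 6:        For each F statistic, find the percent of the q spurious F values (Fk/s) that are larger than the single observed F value (Fk/Obs), where k stands for L, Q or C. This percentage is the significance level of the kth regression term: the smaller it is, the less likely it is that an F value as large as the observed one would arise from random data alone.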
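
Steps 5 and 6 reduce to a comparison between the observed F values and the spurious ones. The sketch below runs the whole procedure under stated assumptions: shuffled copies of the observed data serve as the random data sets, and the F statistics are sequential ones divided by MSD3. The function names, default q and synthetic data are hypothetical.

```python
import numpy as np

def term_f_stats(x, y):
    """Return [FL, FQ, FC], assuming sequential sums of squares over MSD3."""
    n = len(y)
    sse = []
    for degree in range(4):
        resid = y - np.polyval(np.polyfit(x, y, degree), x)
        sse.append(float(resid @ resid))
    msd3 = sse[3] / (n - 4)
    return np.array([(sse[k] - sse[k + 1]) / msd3 for k in range(3)])

def randomization_f_test(x, y, q=1000, seed=0):
    """Return observed F values and the percent of spurious F values above them."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Step 5: F statistics for the observed data
    f_obs = term_f_stats(x, y)

    # Steps 1 to 3: q spurious F values per term (shuffling is an assumption)
    f_spur = np.array([
        term_f_stats(rng.permutation(x), rng.permutation(y))
        for _ in range(q)
    ])

    # Step 6: percent of spurious values strictly larger than the observed one
    percent_larger = 100.0 * np.mean(f_spur > f_obs, axis=0)
    return f_obs, percent_larger

# Synthetic data with a strong linear trend, for illustration only
data_rng = np.random.default_rng(42)
x_obs = np.arange(12, dtype=float)
y_obs = 3.0 + 2.0 * x_obs + data_rng.normal(0.0, 1.0, size=12)

f_obs, percent = randomization_f_test(x_obs, y_obs, q=500, seed=7)
for name, f_val, pct in zip(("linear", "quadratic", "cubic"), f_obs, percent):
    print(f"{name}: F/Obs = {f_val:.2f}, percent of spurious F larger = {pct:.1f}")
```

With q runs, the percentage can only be resolved in steps of 100/q, which is one reason to choose q well above the minimum of 100.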