 IU Trident Indiana University

UITS Research Technologies

Cyberinfrastructure enabling research and creative activities
7. Random Effect Models


The random effects model examines how group and/or time affect error variances. This model is appropriate for n individuals who were drawn randomly from a large population. This chapter focuses on the feasible generalized least squares (FGLS) with variance component estimation methods from Baltagi and Chang (1994), Fuller and Battese (1974), and Wansbeek and Kapteyn (1989).

7.1 The One-way Random Group Effect Model

When the omega matrix is not known, you have to estimate theta using the SSEs of the between effect model (.0317) and the fixed effect model (.2926).

The variance component of error is .00361263 = .292622872/(6*15-6-3)
The variance component of group is .01559712 =.031675926/(6-4) - .00361263/15

Thus, the theta estimate is .87668488 = 1 - sqrt(.00361263/(15*.01559712 + .00361263))
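The arithmetic above can be cross-checked outside the three packages. The following is a minimal Python sketch, not part of the original analysis; the variable names are illustrative, and k counts the three regressors without the intercept, so that n - k - 1 equals the (6-4) used above.

```python
import math

n, T, k = 6, 15, 3            # groups, time periods, regressors (intercept excluded)
sse_within = 0.292622872      # SSE of the fixed effect model
sse_between = 0.031675926     # SSE of the between effect model

# Variance component of error: SSE_within / (nT - n - k)
sigma2_e = sse_within / (n * T - n - k)

# Variance component of group: SSE_between / (n - k - 1) - sigma2_e / T
sigma2_u = sse_between / (n - k - 1) - sigma2_e / T

# theta = 1 - sqrt(sigma2_e / (T * sigma2_u + sigma2_e))
theta = 1 - math.sqrt(sigma2_e / (T * sigma2_u + sigma2_e))

print(sigma2_e, sigma2_u, theta)
```

The three printed values should agree with the figures in the text up to rounding.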

Now, transform the dependent and independent variables including the intercept.

. gen rg_cost = cost - .87668488*gm_cost // transform variables
. gen rg_output = output - .87668488*gm_output
. gen rg_fuel = fuel - .87668488*gm_fuel
. gen rg_load = load - .87668488*gm_load
. gen rg_int = 1 - .87668488 // for the intercept

Finally, run the OLS with the transformed variables. Do not forget to suppress the intercept. This is the groupwise heteroscedastic regression model (Greene 2003).

. regress rg_cost rg_int rg_output rg_fuel rg_load, noc

      Source |       SS       df       MS              Number of obs =      90
-------------+------------------------------           F(  4,    86) =19642.72
       Model |  284.670313     4  71.1675783           Prob > F      =  0.0000
    Residual |  .311586777    86  .003623102           R-squared     =  0.9989
-------------+------------------------------           Adj R-squared =  0.9989
       Total |    284.9819    90  3.16646556           Root MSE      =  .06019
 
------------------------------------------------------------------------------
     rg_cost |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      rg_int |   9.627911   .2101638    45.81   0.000     9.210119     10.0457
   rg_output |   .9066808   .0256249    35.38   0.000     .8557401    .9576215
     rg_fuel |   .4227784   .0140248    30.15   0.000      .394898    .4506587
     rg_load |    -1.0645   .2000703    -5.32   0.000    -1.462226   -.6667731
------------------------------------------------------------------------------
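The transformation behind the rg_ variables subtracts theta times the group mean from each observation. The sketch below illustrates that step in Python under stated assumptions: the function name quasi_demean and the six data values are hypothetical and are not the airline data.

```python
from collections import defaultdict

def quasi_demean(values, groups, theta):
    """Return values[i] - theta * (mean of values within the group of i)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - theta * sums[g] / counts[g] for v, g in zip(values, groups)]

# Hypothetical toy data: two groups with three observations each.
cost = [1.0, 2.0, 3.0, 10.0, 12.0, 14.0]
airline = [1, 1, 1, 2, 2, 2]
theta = 0.87668488

rg_cost = quasi_demean(cost, airline, theta)
# The intercept column of ones becomes the constant 1 - theta.
rg_int = quasi_demean([1.0] * len(cost), airline, theta)
print(rg_cost)
print(rg_int)
```

With theta = 0 the data are unchanged (pooled OLS), and with theta = 1 the transformation reduces to the within (fixed effect) deviations from group means.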

7.2 Estimations in SAS, Stata, and LIMDEP

The SAS TSCSREG and PANEL procedures have the /RANONE option to fit the one-way random effect model. These procedures by default use the Fuller and Battese (1974) estimation method, which produces slightly different estimates from FGLS.

PROC TSCSREG DATA=masil.airline;
   ID airline year;
   MODEL cost = output fuel load /RANONE;
RUN;

                                     The TSCSREG Procedure
 
Dependent Variable: cost
 
                                       Model Description
 
                              Estimation Method             RanOne
                              Number of Cross Sections           6
                              Time Series Length                15
 
 
                                         Fit Statistics
 
                       SSE              0.3090    DFE                  86
                       MSE              0.0036    Root MSE         0.0599
                       R-Square         0.9923
 
 
                                 Variance Component Estimates
 
                       Variance Component for Cross Sections    0.018198
                       Variance Component for Error             0.003613
 
 
                                        Hausman Test for
                                         Random Effects
 
                                       DF    m Value    Pr > m
 
                                        3       0.92    0.8209
 
 
                                      Parameter Estimates
 
                                                 Standard
               Variable        DF    Estimate       Error    t Value    Pr > |t|
 
               Intercept        1       9.637      0.2132      45.21      <.0001
               output           1    0.908024      0.0260      34.91      <.0001
               fuel             1    0.422199      0.0141      29.95      <.0001
               load             1    -1.06469      0.1995      -5.34      <.0001

The PANEL procedure has the /VCOMP=WK option for the Wansbeek and Kapteyn (1989) method, which is close to groupwise heteroscedastic regression. The BP option of the MODEL statement, not available in the TSCSREG procedure, conducts the Breusch-Pagan LM test for random effects. Note that the two procedures estimate the same variance component for error (.0036) but different variance components for groups (.0182 versus .0160).

PROC PANEL DATA=masil.airline;
   ID airline year;
   MODEL cost = output fuel load /RANONE BP VCOMP=WK;
RUN;

                                      The PANEL Procedure
                       Wansbeek and Kapteyn Variance Components (RanOne)
 
Dependent Variable: cost
 
                                       Model Description
 
                              Estimation Method             RanOne
                              Number of Cross Sections           6
                              Time Series Length                15
 
 
                                         Fit Statistics
 
                       SSE              0.3111    DFE                  86
                       MSE              0.0036    Root MSE         0.0601
                       R-Square         0.9923
 
 
                                 Variance Component Estimates
 
                       Variance Component for Cross Sections    0.016015
                       Variance Component for Error             0.003613
 
 
                                        Hausman Test for
                                         Random Effects
 
                                       DF    m Value    Pr > m
 
                                        2       1.63    0.4429
 
 
                                 Breusch Pagan Test for Random
                                       Effects (One Way)
 
                                       DF    m Value    Pr > m
 
                                        1     334.85    <.0001
 
 
                                      Parameter Estimates
 
                                                 Standard
               Variable        DF    Estimate       Error    t Value    Pr > |t|
 
               Intercept        1    9.629513      0.2107      45.71      <.0001
               output           1    0.906918      0.0257      35.30      <.0001
               fuel             1    0.422676      0.0140      30.11      <.0001
               load             1    -1.06452      0.2000      -5.32      <.0001

The Stata .xtreg command has the re option to produce FGLS estimates. The .iis command specifies the panel identification variable, such as a grouping or cross-section variable that is used in the i() option.

. iis airline

. xtreg cost output fuel load, re i(airline) theta

Random-effects GLS regression                   Number of obs      =        90
Group variable (i): airline                     Number of groups   =         6
 
R-sq:  within  = 0.9925                         Obs per group: min =        15
       between = 0.9856                                        avg =      15.0
       overall = 0.9876                                        max =        15
 
Random effects u_i ~ Gaussian                   Wald chi2(3)       =  11091.33
corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000
theta              = .87668503
 
------------------------------------------------------------------------------
        cost |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      output |   .9066805    .025625    35.38   0.000     .8564565    .9569045
        fuel |   .4227784   .0140248    30.15   0.000     .3952904    .4502665
        load |  -1.064499   .2000703    -5.32   0.000    -1.456629    -.672368
       _cons |   9.627909    .210164    45.81   0.000     9.215995    10.03982
-------------+----------------------------------------------------------------
     sigma_u |  .12488859
     sigma_e |  .06010514
         rho |  .81193816   (fraction of variance due to u_i)
------------------------------------------------------------------------------

The theta option reports the estimated theta (.8767). The sigma_u and sigma_e are the square roots of the variance components for groups and errors (.0156=.1249^2 and .0036=.0601^2).
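The relationship among sigma_u, sigma_e, rho, and theta in the Stata output can be verified with a few lines of Python. This is only a sketch for checking the reported numbers; the variable names are illustrative.

```python
import math

T = 15                    # observations per group
sigma_u = 0.12488859      # reported by xtreg, re
sigma_e = 0.06010514

sigma2_u = sigma_u ** 2   # variance component for groups
sigma2_e = sigma_e ** 2   # variance component for error

# rho is the fraction of the total variance due to u_i
rho = sigma2_u / (sigma2_u + sigma2_e)

# theta implied by the two variance components
theta = 1 - math.sqrt(sigma2_e / (T * sigma2_u + sigma2_e))

print(sigma2_u, sigma2_e, rho, theta)
```

The computed rho and theta should match the reported .81193816 and .87668503 up to rounding.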

In LIMDEP, you have to specify the Panel$ and Het$ subcommands for the groupwise heteroscedastic model. Note that LIMDEP presents the pooled OLS regression and the least squares dummy variable (LSDV) model as well.

--> REGRESS;Lhs=COST;Rhs=ONE,OUTPUT,FUEL,LOAD;Panel;Str=AIRLINE;Het=AIRLINE$

+-----------------------------------------------------------------------+
| OLS Without Group Dummy Variables                                     |
| Ordinary    least squares regression    Weighting variable = none     |
| Dep. var. = COST     Mean=   13.36560933    , S.D.=   1.131971444     |
| Model size: Observations =      90, Parameters =   4, Deg.Fr.=     86 |
| Residuals:  Sum of squares= 1.335449522    , Std.Dev.=         .12461 |
| Fit:        R-squared=  .988290, Adjusted R-squared =          .98788 |
| Model test: F[  3,     86] = 2419.33,    Prob value =          .00000 |
| Diagnostic: Log-L =     61.7699, Restricted(b=0) Log-L =    -138.3581 |
|             LogAmemiyaPrCrt.=   -4.122, Akaike Info. Crt.=     -1.284 |
| Panel Data Analysis of COST       [ONE way]                           |
|           Unconditional ANOVA (No regressors)                         |
| Source      Variation        Deg. Free.     Mean Square               |
| Between       74.6799                5.         14.9360               |
| Residual      39.3611               84.         .468584               |
| Total         114.041               89.         1.28136               |
+-----------------------------------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |t-ratio |P[|T|>t] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
 OUTPUT       .8827386341   .13254552E-01   66.599   .0000    -1.1743092
 FUEL         .4539777119   .20304240E-01   22.359   .0000     12.770359
 LOAD        -1.627507797       .34530293   -4.713   .0000     .56046016
 Constant     9.516912231       .22924522   41.514   .0000
 (Note: E+nn or E-nn means multiply by 10 to + or -nn power.)
 
+-----------------------------------------------------------------------+
| Least Squares with Group Dummy Variables                              |
| Ordinary    least squares regression    Weighting variable = none     |
| Dep. var. = COST     Mean=   13.36560933    , S.D.=   1.131971444     |
| Model size: Observations =      90, Parameters =   9, Deg.Fr.=     81 |
| Residuals:  Sum of squares= .2926207777    , Std.Dev.=         .06010 |
| Fit:        R-squared=  .997434, Adjusted R-squared =          .99718 |
| Model test: F[  8,     81] = 3935.82,    Prob value =          .00000 |
| Diagnostic: Log-L =    130.0865, Restricted(b=0) Log-L =    -138.3581 |
|             LogAmemiyaPrCrt.=   -5.528, Akaike Info. Crt.=     -2.691 |
| Estd. Autocorrelation of e(i,t)     .573531                           |
| White/Hetero. corrected covariance matrix used.                       |
+-----------------------------------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |t-ratio |P[|T|>t] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
 OUTPUT       .9192881432   .19105357E-01   48.117   .0000    -1.1743092
 FUEL         .4174910457   .13532534E-01   30.851   .0000     12.770359
 LOAD        -1.070395015       .21662097   -4.941   .0000     .56046016
 (Note: E+nn or E-nn means multiply by 10 to + or -nn power.)
 
+------------------------------------------------------------------------+
|                Test Statistics for the Classical Model                 |
|                                                                        |
|        Model            Log-Likelihood    Sum of Squares    R-squared  |
| (1)  Constant term only     -138.35814   .1140409821D+03     .0000000  |
| (2)  Group effects only      -90.48804   .3936109461D+02     .6548513  |
| (3)  X - variables only       61.76991   .1335449522D+01     .9882897  |
| (4)  X and group effects     130.08647   .2926207777D+00     .9974341  |
|                                                                        |
|                                Hypothesis Tests                        |
|               Likelihood Ratio Test                F Tests             |
|          Chi-squared   d.f.  Prob.         F    num. denom. Prob value |
| (2) vs (1)    95.740      5     .00000    31.875    5    84     .00000 |
| (3) vs (1)   400.256      3     .00000  2419.329    3    86     .00000 |
| (4) vs (1)   536.889      8     .00000  3935.818    8    81     .00000 |
| (4) vs (2)   441.149      3     .00000  3604.832    3    81     .00000 |
| (4) vs (3)   136.633      5     .00000    57.733    5    81     .00000 |
+------------------------------------------------------------------------+
Error:   425: REGR;PANEL. Could not invert VC matrix for Hausman test.
 
            +--------------------------------------------------+
            | Random Effects Model: v(i,t) = e(i,t) + u(i)     |
            | Estimates:  Var[e]              =   .361260D-02  |
            |             Var[u]              =   .119159D-01  |
            |             Corr[v(i,t),v(i,s)] =   .767356      |
            | Lagrange Multiplier Test vs. Model (3) =  334.85 |
            | ( 1 df, prob value =  .000000)                   |
            | (High values of LM favor FEM/REM over CR model.) |
            | Fixed vs. Random Effects (Hausman)     =     .00 |
            | ( 3 df, prob value = 1.000000)                   |
            | (High (low) values of H favor FEM (REM).)        |
            | Reestimated using GLS coefficients:              |
            | Estimates:  Var[e]              =   .362491D-02  |
            |             Var[u]              =   .392309D-01  |
            | Var[e] above is an average. Groupwise            |
            | heteroscedasticity model was estimated.          |
            |             Sum of Squares          .147779D+01  |
            +--------------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
 OUTPUT       .9041238041   .24615477E-01   36.730   .0000    -1.1743092
 FUEL         .4238986905   .13746498E-01   30.837   .0000     12.770359
 LOAD        -1.064558659       .19933132   -5.341   .0000     .56046016
 Constant     9.610634379       .20277404   47.396   .0000
 (Note: E+nn or E-nn means multiply by 10 to + or -nn power.)

Like the SAS TSCSREG and PANEL procedures, LIMDEP estimates a slightly different variance component for groups (.0119), thus producing different parameter estimates. In addition, LIMDEP could not invert the covariance matrix for the Hausman test in this example (error 425), so the reported Hausman statistic of .00 is not meaningful.
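One way to see why the packages report different coefficients is to compute the theta implied by each pair of variance components shown in the outputs above. The following Python sketch does this; it assumes the same theta formula used in section 7.1, takes the LIMDEP values from the first set of estimates (before re-estimation), and uses illustrative labels.

```python
import math

T = 15  # time periods per airline

def implied_theta(sigma2_u, sigma2_e, periods=T):
    """theta = 1 - sqrt(sigma2_e / (periods * sigma2_u + sigma2_e))."""
    return 1 - math.sqrt(sigma2_e / (periods * sigma2_u + sigma2_e))

# (variance component for groups, variance component for error)
components = {
    "SAS TSCSREG": (0.018198, 0.003613),
    "SAS PANEL VCOMP=WK": (0.016015, 0.003613),
    "Stata xtreg, re": (0.12488859 ** 2, 0.06010514 ** 2),
    "LIMDEP": (0.0119159, 0.0036126),
}

thetas = {name: implied_theta(u, e) for name, (u, e) in components.items()}
for name, value in thetas.items():
    print(name, round(value, 4))
```

A larger group variance component gives a theta closer to 1, which moves the estimates toward the fixed effect (within) estimates.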

7.3 The One-way Random Time Effect Model

Let us compute theta estimate using the SSEs of the between effect model (.0056) and the fixed effect model (1.0882).

The variance component for error is .01511375 = 1.08819022/(15*6-15-3)
The variance component for time is -.00201072 =.005590631/(15-4)- .01511375/6

The theta estimate is -1.226263 = 1 - sqrt(.01511375/(6*(-.00201072) + .01511375))
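The same computation can be repeated for the time effect, with the roles of groups and periods switched. The Python sketch below is a cross-check under that assumption; the variable names are illustrative.

```python
import math

n, T, k = 6, 15, 3            # groups, time periods, regressors (intercept excluded)
sse_within = 1.08819022       # SSE of the fixed time effect model
sse_between = 0.005590631     # SSE of the between (time) effect model

# Variance component for error: SSE_within / (Tn - T - k)
sigma2_e = sse_within / (T * n - T - k)

# Variance component for time: SSE_between / (T - k - 1) - sigma2_e / n
sigma2_t = sse_between / (T - k - 1) - sigma2_e / n

# theta = 1 - sqrt(sigma2_e / (n * sigma2_t + sigma2_e))
theta = 1 - math.sqrt(sigma2_e / (n * sigma2_t + sigma2_e))

print(sigma2_e, sigma2_t, theta)
if sigma2_t < 0:
    print("Warning: the variance component for time is negative.")
```

Because the variance component for time is negative, theta falls outside the interval from 0 to 1.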

. gen rt_cost = cost - (-1.226263)*tm_cost // transform variables
. gen rt_output = output - (-1.226263)*tm_output
. gen rt_fuel = fuel - (-1.226263)*tm_fuel
. gen rt_load = load - (-1.226263)*tm_load
. gen rt_int = 1 - (-1.226263) // for the intercept

. regress rt_cost rt_int rt_output rt_fuel rt_load, noc

      Source |       SS       df       MS              Number of obs =      90
-------------+------------------------------           F(  4,    86) =       .
       Model |  79944.1804     4  19986.0451           Prob > F      =  0.0000
    Residual |  1.79271995    86  .020845581           R-squared     =  1.0000
-------------+------------------------------           Adj R-squared =  1.0000
       Total |  79945.9732    90  888.288591           Root MSE      =  .14438
 
------------------------------------------------------------------------------
     rt_cost |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      rt_int |   9.516098   .1489281    63.90   0.000     9.220038    9.812157
   rt_output |   .8883838   .0143338    61.98   0.000     .8598891    .9168785
     rt_fuel |   .4392731   .0129051    34.04   0.000     .4136186    .4649277
     rt_load |  -1.279176   .2482869    -5.15   0.000    -1.772754   -.7855982
------------------------------------------------------------------------------

However, the negative variance component for time is implausible, since a variance cannot be negative. The rest of this section lists the procedures and commands for the one-way random time effect model without their output.

In SAS, use the TSCSREG or PANEL procedure with the /RANONE option.

PROC SORT DATA=masil.airline;
   BY year airline;

PROC TSCSREG DATA=masil.airline;
   ID year airline;
   MODEL cost = output fuel load /RANONE;
RUN;

PROC PANEL DATA=masil.airline;
   ID year airline;
   MODEL cost = output fuel load /RANONE BP;
RUN;

In Stata, you have to switch the grouping and time variables using the .tsset command.

. tsset year airline

       panel variable:  year, 1 to 15
        time variable:  airline, 1 to 6

. xtreg cost output fuel load, re i(year) theta

In LIMDEP, you need to use the Period$ and Random$ subcommands.

REGRESS;Lhs=COST;Rhs=ONE,OUTPUT,FUEL,LOAD;Panel;Pds=15;Het=YEAR$


7.4 The Two-way Random Effect Model in SAS

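The random group and time effect model is formulated as $y_{it} = \alpha + X_{it}'\beta + u_i + \gamma_t + \varepsilon_{it}$, where $u_i$ is the random group effect and $\gamma_t$ is the random time effect. Let us first estimate the two-way FGLS using the SAS PANEL procedure with the /RANTWO option. The BP2 option conducts the Breusch-Pagan LM test for the two-way random effect model.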

PROC PANEL DATA=masil.airline;
   ID airline year;
   MODEL cost = output fuel load /RANTWO BP2;
RUN;

                                      The PANEL Procedure
                        Fuller and Battese Variance Components (RanTwo)
 
Dependent Variable: cost
 
                                       Model Description
 
                              Estimation Method             RanTwo
                              Number of Cross Sections           6
                              Time Series Length                15
 
 
                                         Fit Statistics
 
                       SSE              0.2322    DFE                  86
                       MSE              0.0027    Root MSE         0.0520
                       R-Square         0.9829
 
 
                                 Variance Component Estimates
 
                       Variance Component for Cross Sections    0.017439
                       Variance Component for Time Series       0.001081
                       Variance Component for Error              0.00264
 
 
                                        Hausman Test for
                                         Random Effects
 
                                       DF    m Value    Pr > m
 
                                        3       6.93    0.0741
 
 
                                 Breusch Pagan Test for Random
                                       Effects (Two Way)
 
                                       DF    m Value    Pr > m
 
                                        2     336.40    <.0001
 
 
                                      Parameter Estimates
 
                                                 Standard
               Variable        DF    Estimate       Error    t Value    Pr > |t|
 
               Intercept        1    9.362677      0.2440      38.38      <.0001
               output           1    0.866448      0.0255      33.98      <.0001
               fuel             1    0.436163      0.0172      25.41      <.0001
               load             1    -0.98053      0.2235      -4.39      <.0001

Similarly, you may run the TSCSREG procedure with the /RANTWO option.

PROC TSCSREG DATA=masil.airline;
   ID airline year;
   MODEL cost = output fuel load /RANTWO;
RUN;

7.5 Testing Random Effect Models

The Breusch-Pagan Lagrange multiplier (LM) test is designed to test random effects. The null hypothesis of the one-way random group effect model is that the variance of the group effects is zero. If the null hypothesis is not rejected, the pooled regression model is appropriate. The sum of squared residuals (e'e) of the pooled OLS is 1.33544153, and the sum of squared group means of those residuals (e'e bar) is .0665147.

LM is 334.8496 = 6*15/(2*(15-1)) * [15^2*.0665147/1.33544153 - 1]^2 with p<.0001.
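The LM statistic can be recomputed from the two sums of squares given above. The following Python sketch assumes the standard Breusch-Pagan form for the group effect, LM = nT/(2(T-1)) * [T^2 * (e'e bar)/(e'e) - 1]^2, and obtains the p-value of a chi-squared variable with one degree of freedom from the complementary error function; the variable names are illustrative.

```python
import math

n, T = 6, 15
ee = 1.33544153        # e'e: sum of squared pooled OLS residuals
ee_bar = 0.0665147     # sum of squared group means of the pooled OLS residuals

# Breusch-Pagan LM statistic for the one-way random group effect
lm = n * T / (2 * (T - 1)) * (T ** 2 * ee_bar / ee - 1) ** 2

# For one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(lm / 2))

print(lm, p_value)
```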

With the large chi-squared, we reject the null hypothesis in favor of the random group effect model. The SAS PANEL procedure with the /BP option and the LIMDEP Panel$ and Het$ subcommands report the LM statistic. In Stata, run the .xttest0 command right after estimating the one-way random effect model.

. quietly xtreg cost output fuel load, re i(airline)

. xttest0

Breusch and Pagan Lagrangian multiplier test for random effects:
 
        cost[airline,t] = Xb + u[airline] + e[airline,t]
 
        Estimated results:
                         |       Var     sd = sqrt(Var)
                ---------+-----------------------------
                    cost |   1.281358       1.131971
                       e |   .0036126       .0601051
                       u |   .0155972       .1248886
 
        Test:   Var(u) = 0
                              chi2(1) =   334.85
                          Prob > chi2 =     0.0000

The null hypothesis of the one-way random time effect is that variance components for time are zero. The following LM test uses Baltagi's formula. The small chi-squared of 1.5472 does not reject the null hypothesis at the .01 level.

LM is 1.5472 with p = .2135.

. quietly xtreg cost output fuel load, re i(year)

. xttest0

Breusch and Pagan Lagrangian multiplier test for random effects:
 
        cost[year,t] = Xb + u[year] + e[year,t]
 
        Estimated results:
                         |       Var     sd = sqrt(Var)
                ---------+-----------------------------
                    cost |   1.281358       1.131971
                       e |   .0151138        .122938
                       u |          0              0
 
        Test:   Var(u) = 0
                              chi2(1) =     1.55
                          Prob > chi2 =     0.2135

The two way random effects model has the null hypothesis that variance components for groups and time are all zero. The LM statistic with two degrees of freedom is 336.3968 = 334.8496 + 1.5472 (p<.0001).
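The p-values for the time effect and two-way tests follow from the chi-squared distribution with one and two degrees of freedom. The Python sketch below uses only closed-form tail probabilities for these two cases and takes the LM statistics from the text; the variable names are illustrative.

```python
import math

lm_group = 334.8496    # one-way random group effect
lm_time = 1.5472       # one-way random time effect

# One degree of freedom: P(chi2 > x) = erfc(sqrt(x / 2))
p_time = math.erfc(math.sqrt(lm_time / 2))

# The two-way statistic is the sum of the two one-way statistics
lm_two_way = lm_group + lm_time

# Two degrees of freedom: P(chi2 > x) = exp(-x / 2)
p_two_way = math.exp(-lm_two_way / 2)

print(p_time, lm_two_way, p_two_way)
```

The value of p_time should be close to the .2135 reported by .xttest0.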

7.6 Fixed Effects versus Random Effects

How do we compare a fixed effect model and its counterpart random effect model? The Hausman specification test examines if the individual effects are uncorrelated with the other regressors in the model. Since computation is complicated, let us conduct the test in Stata.

. tsset airline year

       panel variable:  airline, 1 to 6
        time variable:  year, 1 to 15

. quietly xtreg cost output fuel load, fe

. estimates store fixed_group

. quietly xtreg cost output fuel load, re

. hausman fixed_group .

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |   fix_group        .          Difference          S.E.
-------------+----------------------------------------------------------------
      output |    .9192846     .9066805        .0126041        .0153877
        fuel |    .4174918     .4227784       -.0052867        .0058583
        load |   -1.070396    -1.064499       -.0058974        .0255088
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg
 
    Test:  Ho:  difference in coefficients not systematic
 
                  chi2(3) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =        2.12
                Prob>chi2 =      0.5469
                (V_b-V_B is not positive definite)

The Hausman statistic of 2.12 differs from the PANEL procedure's 1.63 and Greene (2003)'s 4.16 because SAS, Stata, and LIMDEP use different estimation methods and thus produce slightly different parameter estimates. None of these tests, however, rejects the null hypothesis, which favors the random effect model.
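The quadratic form behind the Hausman statistic can be sketched in Python. The output above reports only the square roots of the diagonal of V_b-V_B, so this sketch assumes, purely for illustration, that the off-diagonal elements are zero; the resulting value is therefore not the statistic Stata reports. The function names are hypothetical.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def hausman(b_fe, b_re, v_diff):
    """Return (b_fe - b_re)' [V_fe - V_re]^(-1) (b_fe - b_re)."""
    d = [f - r for f, r in zip(b_fe, b_re)]
    return sum(di * xi for di, xi in zip(d, solve(v_diff, d)))

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-squared variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

b_fe = [0.9192846, 0.4174918, -1.070396]   # fixed effect coefficients (b)
b_re = [0.9066805, 0.4227784, -1.064499]   # random effect coefficients (B)
se = [0.0153877, 0.0058583, 0.0255088]     # sqrt(diag(V_b-V_B))

# Assumption: off-diagonal covariances are not reported and are set to zero here.
v_diag = [[se[i] ** 2 if i == j else 0.0 for j in range(3)] for i in range(3)]

stat_diag_only = hausman(b_fe, b_re, v_diag)
p_reported = chi2_sf_3df(2.12)   # p-value for the statistic Stata reports
print(stat_diag_only, p_reported)
```

The diagonal-only value differs from the reported 2.12, which shows that the covariances among the coefficient differences matter for the test.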

7.7 Summary

Table 7 summarizes random effect estimations in SAS, Stata, and LIMDEP. The SAS PANEL procedure is highly recommended.

Table 7 Comparison of the Random Effect Model in SAS, Stata, LIMDEP*

