R Markdown File to Augment Bliese’s (2016) Tutorial

Multilevel Modeling – Bliese’s Steps

The following R Markdown file is meant to augment Bliese’s (2016) multilevel package tutorial

Prior to starting the actual steps to multilevel modeling, let’s preload the packages

#install.packages("multilevel")
library(multilevel)
## Loading required package: nlme
## Loading required package: MASS

Now, let’s call the data that is preloaded in R (thanks to Bliese)

data(bh1996)

Although we’re not to Bliese’s Step 1 yet (p. 52), his step 1.5 is actually Aguinis et al.’s (2013) Step 1, so let’s start there

Null.gls<-gls(WBEING~1,data=bh1996,
              control=list(opt="optim"),na.action = na.omit) #Adding na.action to handle missing data 

Here, we’re checking to see if there’s any variation among individual’s response to well being (i.e., intercept only)

STEP ONE: Examine the ICC for the Outcome Variable

Null.Model<-lme(WBEING~1,random=~1|GRP,data=bh1996,
                control=list(opt="optim"),na.action = na.omit) #Adding na.action to handle missing data

Well being is the outcome variable. The intercept is allowed to vary by group. Thus, we’re modeling the intercept plus random between- and within-group error terms. AKA the unconditional means model.

Option 1: Manually Calculate ICC(1)

VarCorr(Null.Model)
## GRP = pdLogChol(1) 
##             Variance   StdDev   
## (Intercept) 0.03580079 0.1892110
## Residual    0.78949727 0.8885366

ICC(1) = Intercept Variance / (Intercept Variance + Residual Variance)

0.03580079/(0.03580079+0.78949727)
## [1] 0.04337922

Option 2: Let R Calculate ICC(1) by Creating an aov model

tmod<-aov(WBEING~as.factor(GRP),data=bh1996) # Creates a model comparing well-being among groups
ICC1(tmod)
## [1] 0.04336905

Option 3: Let R Calculate ICC(1) and ICC(2) Using psychometric package

# ICCs of Mixed Effects Models 
#install.packages("psychometric")
library(psychometric)

ICC1.lme(WBEING, GRP, bh1996) # Format is (DV, grouping variable, data-frame)
## [1] 0.0433792
ICC2.lme(WBEING, GRP, bh1996, weighted = FALSE)
## [1] 0.7335211
  • As you can see, all ICC(1) calculations are very close. Roughly 4% of the variance in individual well being can be “explained” by the group.
  • Remember, ICC(1) can be interpretted as the amount of variance “explained” by the group membership (i.e., the higher level entity)
  • ICC(2) is how reliably you can say that the group means differ from each other. >.70 is a rule of thumb cut-off.

Compare Models to See if Including the Random Intercept (i.e., different group means) Improves Model Fit

anova(Null.gls, Null.Model)
##            Model df      AIC      BIC    logLik   Test  L.Ratio p-value
## Null.gls       1  2 19540.17 19553.98 -9768.084                        
## Null.Model     2  3 19353.34 19374.06 -9673.669 1 vs 2 188.8303  <.0001

The anova function simplifies the -2 logLiklihood comparison that uses Chi-square distribution, based on degrees of freedom, to test significance The results of this comparison suggest that including the random intercept significantly improves the model fit. Thus, we can move forward to the next step in the multilevel process.

STEP TWO: Explain Level 1 and 2 Intercept Variance

Now, we’re adding in level-1 (HRS) and level-2 (G.HRS) predictor variables. *Note: G.HRS is the group mean for individual-level hours worked.

Model.1<-lme(WBEING~HRS+G.HRS,random=~1|GRP,data=bh1996,
             control=list(opt="optim"),na.action = na.omit)
summary(Model.1)
## Linear mixed-effects model fit by REML
##  Data: bh1996 
##        AIC      BIC   logLik
##   19222.28 19256.81 -9606.14
## 
## Random effects:
##  Formula: ~1 | GRP
##         (Intercept)  Residual
## StdDev:     0.11639 0.8832353
## 
## Fixed effects: WBEING ~ HRS + G.HRS 
##                 Value  Std.Error   DF   t-value p-value
## (Intercept)  4.740829 0.21368746 7282 22.185808       0
## HRS         -0.046461 0.00488798 7282 -9.505056       0
## G.HRS       -0.126926 0.01940357   97 -6.541368       0
##  Correlation: 
##       (Intr) HRS   
## HRS    0.000       
## G.HRS -0.965 -0.252
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -3.35320562 -0.65024982  0.03760797  0.71319835  2.70917777 
## 
## Number of Observations: 7382
## Number of Groups: 99

Checking to see how much variance is explained at each level.

VarCorr(Model.1)
## GRP = pdLogChol(1) 
##             Variance   StdDev   
## (Intercept) 0.01354663 0.1163900
## Residual    0.78010466 0.8832353
  • Variance Explained = 1 - (Var with Predictor/Var without Predictor)
WithinVarExp=1-(.780/.789)
WithinVarExp
## [1] 0.01140684
BetweenVarExp=1-(.014/.036)
BetweenVarExp
## [1] 0.6111111

“…the variance estimates should be treated with some degree of caution because they are partially dependent upon how one specifies the models” (Bliese, 2016, p. 57).

STEP THREE: Examine and Predict Slope Variance

Now, we’re trying to explain the third source of variation, slope variation

library(lattice)
xyplot(WBEING~LEAD|as.factor(GRP),data=bh1996[1:1582,], # We're looking at the first 1582 rows to see the first 25 groups
       type=c("p","g","r"),col="dark blue",col.line="black", # "p" = points, "g" = grid, "r" = regression line
       xlab="Leadership Consideration",
       ylab="Well-Being") 

  • Based on the plot, it looks like the slopes vary among the groups, but we have to test this.

Before we allow the WBEING~LEAD slope to vary among groups, we should first build a model with a fixed slope & random intercepts.

  • Note: I’m rearranging Bliese’s order because it makes sense to start with the less complicated models and work to more complicated models. Thus, Model.2a comes before Model.2 in my version, but the models match Bliese’s (2016).
Model.2a<-lme(WBEING~HRS+LEAD+G.HRS,random=~1|GRP, data=bh1996, # We're allowing the intercept to vary among groups
              control=list(opt="optim"),na.action=na.omit)
summary(Model.2a)
## Linear mixed-effects model fit by REML
##  Data: bh1996 
##        AIC      BIC    logLik
##   17862.68 17904.12 -8925.341
## 
## Random effects:
##  Formula: ~1 | GRP
##         (Intercept)  Residual
## StdDev:   0.1190809 0.8043905
## 
## Fixed effects: WBEING ~ HRS + LEAD + G.HRS 
##                  Value  Std.Error   DF  t-value p-value
## (Intercept)  2.5592587 0.21580452 7281 11.85915   0e+00
## HRS         -0.0282785 0.00447625 7281 -6.31745   0e+00
## LEAD         0.4956385 0.01277894 7281 38.78559   0e+00
## G.HRS       -0.0790096 0.01888137   97 -4.18453   1e-04
##  Correlation: 
##       (Intr) HRS    LEAD  
## HRS   -0.027              
## LEAD  -0.260  0.105       
## G.HRS -0.950 -0.228  0.065
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -3.94871064 -0.65589603  0.03691456  0.70114760  3.63347646 
## 
## Number of Observations: 7382
## Number of Groups: 99

Now, we can add another layer of complication by allowing the WBEING~LEAD slopes to vary among groups

Model.2<-lme(WBEING~HRS+LEAD+G.HRS,random=~LEAD|GRP, data=bh1996, # We're allowing the WBEING~LEAD slope to vary by group
             control=list(opt="optim"),na.action=na.omit)
summary(Model.2)
## Linear mixed-effects model fit by REML
##  Data: bh1996 
##        AIC      BIC   logLik
##   17838.58 17893.83 -8911.29
## 
## Random effects:
##  Formula: ~LEAD | GRP
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev    Corr  
## (Intercept) 0.3794891 (Intr)
## LEAD        0.1021935 -0.97 
## Residual    0.8008079       
## 
## Fixed effects: WBEING ~ HRS + LEAD + G.HRS 
##                  Value  Std.Error   DF   t-value p-value
## (Intercept)  2.4631348 0.20832607 7281 11.823459   0e+00
## HRS         -0.0284776 0.00446795 7281 -6.373764   0e+00
## LEAD         0.4946550 0.01680846 7281 29.428928   0e+00
## G.HRS       -0.0705047 0.01789284   97 -3.940387   2e-04
##  Correlation: 
##       (Intr) HRS    LEAD  
## HRS   -0.022              
## LEAD  -0.338  0.074       
## G.HRS -0.929 -0.245  0.064
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -3.87093730 -0.65503152  0.04205164  0.69693131  3.95467766 
## 
## Number of Observations: 7382
## Number of Groups: 99

Now, we compare the two models to see if allowing the slopes to vary among groups improves the model fit

anova(Model.2a,Model.2)
##          Model df      AIC      BIC    logLik   Test  L.Ratio p-value
## Model.2a     1  6 17862.68 17904.12 -8925.341                        
## Model.2      2  8 17838.58 17893.83 -8911.290 1 vs 2 28.10254  <.0001

*The results suggest allowing the WBEING~LEAD slopes to vary among groups provides a significantly better fit

Now, we add a cross-level interaction between level-1 LEAD and level-2 G.HRS

Final.Model<-lme(WBEING~HRS+LEAD+G.HRS+LEAD:G.HRS, random=~LEAD|GRP,data=bh1996,
                 control=list(opt="optim"),na.action=na.omit)
summary(Final.Model)
## Linear mixed-effects model fit by REML
##  Data: bh1996 
##        AIC      BIC    logLik
##   17843.94 17906.09 -8912.968
## 
## Random effects:
##  Formula: ~LEAD | GRP
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev     Corr  
## (Intercept) 0.36229909 (Intr)
## LEAD        0.09770136 -0.965
## Residual    0.80087761       
## 
## Fixed effects: WBEING ~ HRS + LEAD + G.HRS + LEAD:G.HRS 
##                 Value Std.Error   DF   t-value p-value
## (Intercept)  3.653755 0.7261065 7280  5.031983  0.0000
## HRS         -0.028559 0.0044683 7280 -6.391465  0.0000
## LEAD         0.125570 0.2173923 7280  0.577621  0.5635
## G.HRS       -0.174924 0.0635883   97 -2.750884  0.0071
## LEAD:G.HRS   0.032460 0.0190634 7280  1.702740  0.0887
##  Correlation: 
##            (Intr) HRS    LEAD   G.HRS 
## HRS        -0.018                     
## LEAD       -0.962  0.017              
## G.HRS      -0.994 -0.058  0.958       
## LEAD:G.HRS  0.958 -0.012 -0.997 -0.959
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -3.83632991 -0.66059970  0.04112998  0.69533044  3.94797744 
## 
## Number of Observations: 7382
## Number of Groups: 99

Now, we compare the random slope models with and without the cross-level interaction

anova(Model.2,Final.Model)
## Warning in anova.lme(Model.2, Final.Model): fitted objects with different fixed
## effects. REML comparisons are not meaningful.
##             Model df      AIC      BIC    logLik   Test  L.Ratio p-value
## Model.2         1  8 17838.58 17893.83 -8911.290                        
## Final.Model     2  9 17843.94 17906.09 -8912.968 1 vs 2 3.356567  0.0669

NOTE: Even though this final model is not significantly better at the .05 level and Final.Model’s interaction is not significant at the .05 level, we proceed for illustrative purposes as if these were both significant.

Calculate Psuedo-R-squared for lme Models

#install.packages("MuMIn")
library(MuMIn)
Model.1rs <- r.squaredGLMM(Model.1) #pseudo R-squared
## Warning: 'r.squaredGLMM' now calculates a revised statistic. See the help page.
Model.1rs
##             R2m        R2c
## [1,] 0.03855485 0.05496551
Model.2rs <- r.squaredGLMM(Model.2) #pseudo R-squared
Model.2rs
##            R2m       R2c
## [1,] 0.2030021 0.2271794
Final.Modelrs <- r.squaredGLMM(Final.Model) #pseudo R-squared 
Final.Modelrs
##            R2m       R2c
## [1,] 0.2053143 0.2283785

PLOTTING CROSS-LEVEL INTERACTIONS

STEP 1: Calculate High and Low Values of Both Variables, Based on 1sd Above and Below the Mean

mean(bh1996$G.HRS) 
## [1] 11.2987
sd(bh1996$G.HRS) 
## [1] 0.8608297
# Low = 11.30 - .86 = 10.44
# High = 11.30 + .86 = 12.16
mean(bh1996$LEAD) 
## [1] 2.890665
sd(bh1996$LEAD)
## [1] 0.7719381
# Low = 2.89 - .77 = 2.12
# High = 2.89 + .77 = 3.66

STEP 2: Create a Temporary Dataset (TDAT) with High and Low Values for the Interaction Variables and Mean Values for Non-Interactive Variables

TDAT<-data.frame(HRS=c(11.2987,11.2987,11.2987,11.2987),
                 LEAD=c(2.12,2.12,3.66,3.66),
                 G.HRS=c(10.44,12.16,10.44,12.16),
                 GRP=c(1,1,1,1))
predict(Final.Model,TDAT,level=0)
## [1] 2.489508 2.307001 3.204766 3.108239
## attr(,"label")
## [1] "Predicted values"

STEP 3: Plot the Interaction Using interaction.plot

TDAT$WBEING<-predict(Final.Model,TDAT,level=0)
with(TDAT,interaction.plot(LEAD,G.HRS,WBEING))

REFERENCES

  • Aguinis, H., Gottfredson, R. K., & Culpepper, S. A. (2013). Best-practice recommendations for estimating cross-level interaction effects using multilevel modeling. Journal of Management (Vol. 39). https://doi.org/10.1177/0149206313478188

  • Bliese, P. D. (2016). Multilevel modeling in R (2.6): A brief introduction to R, the multilevel package and the nlme package. https://doi.org/10.1177/0269216316671280