Mixed-effects modeling with crossed random effects in MLwiN

annotated logfile of example analyses

The mixed-effects analyses shown here were done with the program MLwiN (version 2.02; dated June 2005) running under Microsoft Windows 2000 Professional (SP4).

In the log below, user input commands are indicated by blue and bolder type, like this.

Program output is indicated by black monospace type, like this.

Comments and annotations are given in italic sans-serif type, with indented left margin, like this.

Data preparation

Start the program MLwiN. Open a command-line input window, by choosing Data Manipulation > Command interface. The input commands below were entered in this Commands window. The program will also open a separate Output window. Unfortunately, the contents of neither window can be saved!

We start with reading data into the work sheet, from a disk file named x24.txt. This can only be done through the GUI, choose File > Ascii Input. In the Columns field, enter c1-c4 to store the file contents in the first four columns of the worksheet. Use the Browse button to find the input data file.

names c1 "subj" c2 "item" c3 "cond" c4 "resp"

Rename columns to sensible variable names.

Sorting: This program requires that data are sorted in hierarchical order, i.e. by higher-level units as primary key, lower-level units as secondary key, etc. This was already done in the present data file. If necessary, data can be sorted in MLwiN ...

calc c10=100*"subj"+"item"

... by calculating an auxiliary variable from the sort key columns...

sort c10 c1-c4 c10 c10 c1-c4 c10

... and then sorting all columns using the auxiliary variable.

Sort, using key in c10, the input columns in c1-c4 and in c10. Put the sorted key (back) in c10, and put the sorted output columns (back) in c1-c4 and in c10.

This program does not add constant variances into the model by default; these have to be specified by the user. An auxiliary column containing the value unity (1) everywhere is needed for this purpose.

put 864 1 c5

name c5 "const"

dummy "cond" c11-c13

In order to analyze the fixed-only or 'cell means' model, we need dummy variables for each of the 3 values or levels of the cond factor. The resulting dummy variables are stored in columns 11 to 13.

names c11 "cond1" c12 "cond2" c13 "cond3"

We can inspect the data with the command names (without arguments) or print. Data inspection is perhaps easier through the Names window in the GUI. Open this window by choosing Data Manipulation > Names. The Names window should resemble the one here.

Model preparation

resp "resp"
expl 1 "const"

The independent (EXPLanatory) and dependent (RESPonse) variables must be specified. The command to include or exclude EXPLanatory variables works as a toggle. The first argument 1 forces inclusion of the (constant intercept) predictor in column 5 into the model. By default, explanatory variables are included in the fixed part, but not in the random part.

iden 1 "item" 2 "subj"

The hierarchical structure must be specified. Level-1 units are the items, within subjects, at the lowest level --- levels are counted from the lowest upward, as we do in our papers. Level-2 units are the participants or subjects, at a higher level.

The model specified so far is a basic multilevel model with the random effect of item nested under that of subj.

iden 3 "const"

Crossed effects are created by adding a virtual third level, consisting of a single unit (here identified by value 1).

mlre "subj" "item" c20

This single level however is different for each item within subj, so we need to create unique identifiers for items within subjects. The new identifiers are stored in column 20.

setx "const" 3 c20 c201-c236 c250
rcon c250

These two commands realize the actual cross-classification of two random effects. The 36 identifiers (for the 36 items within subjects) in c20 are used to build an additional constraint, at the now highest level 3, that the coefficients for each item (identified in c20, dummies in c201-c236) at that highest level are identical. In other words, items-within-subjects are constrained to be identical across subjects if they have the same identifier.

Empty model

fpart 1 "const"

The fixed part of the model must be specified. Here the constant intercept is specified as the only predictor included in the fixed part (1, use 0 to remove). Beause explanatory variables are included in the fixed part by default, this step is currently vacuous.

setv 1 "const"
setv 2 "const"

The random part of the model must be specified. A constant variance at both levels is specified here, by choosing the CONST variable in column 5 as the only predictor for the variance at each level.
This is the empty model (8) mentioned in the paper.

start

Start the iterative estimation of the coefficients in the current model. The program is set here to iterate until a certain convergence criterion is achieved (BATCH ON). The command BATCH OFF allows inspection of estimates after each iteration.

random

Show the estimates for the random part of the current (empty) model (8).

LEV.  PARAMETER       (NCONV)    ESTIMATE    S. ERROR(U)  PREV. ESTIM     CORR.
-------------------------------------------------------------------------------
 3    C201     /C201     ( 1)      0.2566        0.06661       0.2567         1
 3    C202     /C202     ( 1)      0.2566        0.06661       0.2567         1
 3    C203     /C203     ( 1)      0.2566        0.06661       0.2567         1
 3    C204     /C204     ( 1)      0.2566        0.06661       0.2567         1
. . .
 3    C233     /C233     ( 1)      0.2566        0.06661       0.2567         1
 3    C234     /C234     ( 1)      0.2566        0.06661       0.2567         1
 3    C235     /C235     ( 1)      0.2566        0.06661       0.2567         1
 3    C236     /C236     ( 1)      0.2566        0.06661       0.2567         1
-------------------------------------------------------------------------------
 2    const    /const    ( 1)      0.2883        0.0886        0.2881         1
-------------------------------------------------------------------------------
 1    const    /const    ( 1)      0.5403        0.02693       0.5403
-------------------------------------------------------------------------------

The between-item variance is constrained to be identical for each item. Note that there is an amount of residual variance, at the lowest level. This is the deviation of each observation from the score predicted by its subject-and-item combination. All variance estimates in the random part are reported with their standard error.

fixed

Show the estimates for the fixed part of the current (empty) model (8).

  PARAMETER            ESTIMATE     S. ERROR(U)   PREV. ESTIMATE
const                    0.0235       0.1406             0.0235

The grand mean, or intercept, or constant, is 0.0235.
Does this deviate significantly from the expected value of zero? This may be tested with the Wald statistic, 0.0235/0.1406=0.167 which does not exceed the critical value of Z*=1.96 (at α=.05). Hence H0 (intercept equals zero) should not be rejected.

Compute and show the likelihood ratio of the current model.

-2*log(lh) is      2080.69

Fixed-only model, with homogeneity assumption

The next, intermediate model (not specified in the manuscript but nevertheless reported in Table 2), has the fixed effect of cond in its fixed part. Instead of listing only the intercept (in the fixed part of the regression formula, as in model (8) above, the new model adds the fixed effect of cond and suppresses the intercept.

expl 1 c11-c13

Add (1) the three dummies in c11-c13, for the three levels of factor cond, as explanatory variables. The dummies are included in the fixed part by default.

fpart 0 "const"

Remove (0) the const (intercept) predictor from the fixed part. The variable remains in the model as an explanatory variable in the random part.

note : no changes in random part

NOTE adds comment into the output window and log file.

start

fixed

Show the estimates for the fixed part of the current model.

  PARAMETER            ESTIMATE     S. ERROR(U)   PREV. ESTIMATE
cond1                   0.2161       0.1449            0.2161
cond2                   0.04569      0.1449            0.04569
cond3                  -0.1913       0.1449           -0.1913

The estimated condition effects are close to the values of {0.2, 0.0, -0.2} used in the simulation. Because the condition effect is not specified in the random part, this model assumes homoschedasticity and sphericity, much like a RM-ANOVA does. Hence the standard errors are the same for all three fixed estimates.

random

Show the estimates for the random part of the current model.

LEV.  PARAMETER       (NCONV)    ESTIMATE    S. ERROR(U)  PREV. ESTIM     CORR.
-------------------------------------------------------------------------------
 3    C201     /C201     ( 1)      0.2579        0.06661        0.258         1
 3    C202     /C202     ( 1)      0.2579        0.06661        0.258         1
 3    C203     /C203     ( 1)      0.2579        0.06661        0.258         1
 3    C204     /C204     ( 1)      0.2579        0.06661        0.258         1
. . .
 3    C233     /C233     ( 1)      0.2579        0.06661        0.258         1
 3    C234     /C234     ( 1)      0.2579        0.06661        0.258         1
 3    C235     /C235     ( 1)      0.2579        0.06661        0.258         1
 3    C236     /C236     ( 1)      0.2579        0.06661        0.258         1
-------------------------------------------------------------------------------
 2    const    /const    ( 1)      0.2891        0.08861       0.2889         1
-------------------------------------------------------------------------------
 1    const    /const    ( 1)      0.5103        0.02544       0.5103
-------------------------------------------------------------------------------

Compute and show the likelihood ratio of the current (empty) model.

-2*log(lh) is      2034.79

In order to evaluate the coefficients in the fixed part of the model, we have to set up appropriate contrast weights. There are 3 coefficients in the fixed part of this model. For each contrast, 3 weights are specified for the 3 coefficients, followed by the expected value of the contrast under H0 (i.e. zero). The sequence of 3 weights plus expected value is given for two pairwise comparisons A-B and C-B, because the central condition is a baseline in the present fictitious study. All weights for all contrasts in the fixed part are put in a single auxiliary variable (in column 350).

put 8 0 c350
edit 1 c350 1
edit 2 c350 -1
edit 6 c350 -1
edit 7 c350 1

Column 350 is first filled with zeros, some of which are later edited to meaningful weights.

name c350 "f.contrasts"

ftest "f.contrasts"

The FTEST command performs the actual testing in the fixed part using the weights in column 350.

   CONTRASTS
cond1                 1.00    0.00
cond2                -1.00   -1.00
cond3                 0.00    1.00
result                0.17   -0.24
chi square ( 1 df)    8.19   15.84
+/-95% c.i.(sep.)     0.12    0.12
+/-95% c.i.(sim.)     0.17    0.17
chi sq for simultaneous contrasts (3 df) =  47.23

Each contrast is tested by evaluating the amount of variance associated with that contrast, using a chi-square test statistic. The chi-square statistic is also reported for the joint contrasts, i.e., for the main effect of condition. Note that the reported degrees of freedom for the joint or simultaneous contrasts are not correct; this should be the number of contrasts that make up the whole test. Here we should evaluate chi-square=47.23 with 2 (not 3) d.f.

cpro 47.23 2

 5.5480e-011

Calculate the Chi-square PROBability for the two contrasts together, with chi-square = 47.23 and df=2, p=<<.001. There are significant differences in response among the three treatment conditions.

Use the GUI to save the current, intermediate model. This cannot be done from the Commands window, nor from inside a macro, so we have to choose the menu options File > Save worksheet As... m8v2.ws. In general it is recommended to save each model in a separate file.

Full model (9), without homogeneity assumption

sete 2 c11 c11 c12 c12 c13 c13

Add the three dummies in c11, c12 and c13 to the random part, at the participant level (level 2). The command SETE (set element) adds specific elements of the particular variance-covariance matrix. This command adds only the three variances (on the diagonal) and not the covariances (off the diagonal).

clrv 2 "const"

Remove the const (intercept) predictor from the random part at level 2 (participants). The variable may remain in the model as an explanatory variable elsewhere.

start

This full model (9) now has the effect of cond in its fixed part, and in the random part at the participant level. Hence this model does not require homogeneity of variance (homoschedasticity). (Because there are no covariances in the model, sphericity is still assumed.)

random

Show the estimates for the random part of the current full model (9).

LEV.  PARAMETER       (NCONV)    ESTIMATE    S. ERROR(U)  PREV. ESTIM     CORR.
-------------------------------------------------------------------------------
 3    C201     /C201     ( 1)      0.2119        0.05625       0.2114         1
 3    C202     /C202     ( 1)      0.2119        0.05625       0.2114         1
 3    C203     /C203     ( 1)      0.2119        0.05625       0.2114         1
 3    C204     /C204     ( 1)      0.2119        0.05625       0.2114         1
. . .
 3    C233     /C233     ( 1)      0.2119        0.05625       0.2114         1
 3    C234     /C234     ( 1)      0.2119        0.05625       0.2114         1
 3    C235     /C235     ( 1)      0.2119        0.05625       0.2114         1
 3    C236     /C236     ( 1)      0.2119        0.05625       0.2114         1
-------------------------------------------------------------------------------
 2    cond1    /cond1    ( 1)      0.3064        0.103         0.3078         1
 2    cond2    /cond2    ( 2)      0.267         0.09088       0.266          1
 2    cond3    /cond3    ( 1)      0.4004        0.13          0.4014         1
-------------------------------------------------------------------------------
 1    const    /const    ( 2)      0.4945        0.02538       0.4945
-------------------------------------------------------------------------------
9956904 spaces left on worksheet

Between-subject variances (level 2) may be different in the three treatment conditions. The different estimates of between-speaker variance in the random part lead to different standard errors for the three treatment conditions in the fixed part.

fixed

Show the estimates for the fixed part of the current full model (9).

  PARAMETER            ESTIMATE     S. ERROR(U)   PREV. ESTIMATE
cond1                    0.2161       0.1427             0.2161
cond2                   0.04569       0.1369            0.04569
cond3                   -0.1913       0.1558            -0.1913

-2*log(lh) is       2082.2

ftest "f.contrasts"

Use command FTEST again to test the joint contrasts in the fixed part of the model, as specified by the weights in column 350.

  CONTRASTS
cond1                 1.00    0.00
cond2                -1.00   -1.00
cond3                 0.00    1.00
result                0.17   -0.24
chi square ( 1 df)    1.06    1.80
+/-95% c.i.(sep.)     0.32    0.35
+/-95% c.i.(sim.)     0.46    0.49
chi sq for simultaneous contrasts (3 df) =   5.05

cpro 5.05 2

0.080058

Calculate the Chi-square PROBability for the 2 contrasts together. The resulting probability shows that the main effect of condition is not significant, if homogeneity of variance is no longer assumed in the full model (9).

Use the GUI to save the current full model (9).

Full model, without homogeneity assumption, extended

In the data set under analysis, within-participant residual variances too are different in the three treatment conditions (see Note 1). This heteroschedasticity at the lowest level can also be modeled in MLwiN.

sete 1 c11 c11 c12 c12 c13 c13

As above, add the three dummies in columns 11, 12, 13 to the random part, at the residual level (level 1). The command SETE (set element) adds specific elements of the particular variance-covariance matrix to the model. The present command adds only the three variances (on the diagonal) and not the covariances (off the diagonal).

clrv 1 "const"

Remove the const (intercept) predictor from the random part at level 1 (observations). The variable may remain in the model as an explanatory variable elsewhere.

start

This extended model now has the effect of cond in its fixed part, and in the random part at the participant level, and at the residual level. Hence this model does not assume homogeneity of variance (homoschedasticity), for neither of these two random effects.

random

Show the estimates for the random part of the current full model (9).

LEV.  PARAMETER       (NCONV)    ESTIMATE    S. ERROR(U)  PREV. ESTIM     CORR.
-------------------------------------------------------------------------------
 3    C201     /C201     ( 1)      0.2111        0.05587       0.2107         1
 3    C202     /C202     ( 1)      0.2111        0.05587       0.2107         1
 3    C203     /C203     ( 1)      0.2111        0.05587       0.2107         1
 3    C204     /C204     ( 1)      0.2111        0.05587       0.2107         1
. . .
 3    C233     /C233     ( 1)      0.2111        0.05587       0.2107         1
 3    C234     /C234     ( 1)      0.2111        0.05587       0.2107         1
 3    C235     /C235     ( 1)      0.2111        0.05587       0.2107         1
 3    C236     /C236     ( 1)      0.2111        0.05587       0.2107         1
-------------------------------------------------------------------------------
 2    cond1    /cond1    ( 1)      0.2974         0.1032        0.299         1
 2    cond2    /cond2    ( 2)       0.266        0.09084       0.2648         1
 2    cond3    /cond3    ( 1)      0.4107         0.1301       0.4118         1
-------------------------------------------------------------------------------
 1    cond1    /cond1    ( 2)       0.605         0.0541       0.6049
 1    cond2    /cond2    ( 3)      0.5058        0.04546       0.5059
 1    cond3    /cond3    ( 2)      0.3724        0.03377       0.3723
-------------------------------------------------------------------------------
9956537 spaces left on worksheet

Within-participant residuals (level 1) are indeed quite different in the three treatment conditions, as expected from the simulation parameters (see Note 1). Inspection of their standard errors suggests that these variance estimates at the lowest level are likely to differ among conditions. A more formal evaluation is reported below.

fixed

Show the estimates for the fixed part of the current full model (9).

  PARAMETER            ESTIMATE     S. ERROR(U)   PREV. ESTIMATE
cond1                    0.2161       0.1427             0.2161
cond2                   0.04569       0.1368            0.04569
cond3                   -0.1913       0.1558            -0.1913

-2*log(lh) is      2067.61

The lower log-likelihood indicates that the current model fits considerably better than the previous model (9). As a rough indication, we compute the difference in log-likelihood (2082.2-2067.61=14.59), and evaluate this with a chi-square statistic, with the difference in number of model parameters (+3-1=2) as d.f., yielding p<.001.

ftest "f.contrasts"

Use command FTEST again to test the joint contrasts in the fixed part of the model, as specified by the weights in column 350.

CONTRASTS
cond1                 1.00    0.00
cond2                -1.00   -1.00
cond3                 0.00    1.00
result                0.17   -0.24
chi square ( 1 df)    1.06    1.80
+/-95% c.i.(sep.)     0.32    0.35
+/-95% c.i.(sim.)     0.46    0.49
chi sq for simultaneous contrasts (3 df) =   5.05

Testing the main effect of the cond factor yields the same non-significant outcome as before.

In order to evaluate the coefficients in the random part of the model, at level 1, we proceed just as we did for testing the fixed effect of cond, i.e., we have to set up appropriate contrast weights. There are 36+3+3 coefficients in the random part of the current model. For each contrast, 42 weights are specified for the 42 coefficients, followed by the expected value of the contrast under H0 (i.e. zero). The sequence of 42 weights plus expected value is given for two pairwise comparisons A-B and C-B, because the central condition B is regarded as a baseline in the present fictitious study.

put 39 0 c351

To construct the clumsy contrast vector we need, it's easier to set up an auxiliary vector containing 36+3=39 zeroes, corresponding to the random coefficients at levels 3 (items) and 2 (participants) which are irrelevant for the comparisons at level 1.

join c351 1 -1 0 0 c351 0 -1 1 0 c352

Construct a new column or vector, by appending the auxiliary vector (which contains the 39 zeroes for the estimates at levels 3 and 2), then the weights for the A-B constrasts (1 -1 0), then the expected value (0) for this contrast under H0, then again the auxiliary vector, then the weights for the C-B contrast (0 -1 1), then the expected value (0) for this latter contrast under H0. Store the resulting vector in column 352.

rtest c352

The RTEST command performs the actual testing in the random part using the weights in column 352. (This command corresponds to the FTEST command used above for testing contrasts in the fixed part.)

CONTRASTS
C201    /C201    :?   0.00    0.00
C202    /C202    :?   0.00    0.00
C203    /C203    :?   0.00    0.00
. . .
C234    /C234    :?   0.00    0.00
C235    /C235    :?   0.00    0.00
C236    /C236    :?   0.00    0.00
cond1   /cond1   :>   0.00    0.00
cond2   /cond2   :>   0.00    0.00
cond3   /cond3   :>   0.00    0.00
cond1   /cond1   :=   1.00    0.00
cond2   /cond2   :=  -1.00   -1.00
cond3   /cond3   :=   0.00    1.00
result                0.10   -0.13
chi square ( 1 df)    1.95    5.47
+/-95% c.i.(sep.)     0.14    0.11
+/-95% c.i.(sim.)     0.00    0.00
chi sq for simultaneous contrasts (42 df) =  14.72

Each contrast is tested by evaluating the amount of variance associated with that contrast, using a chi-square test statistic. The chi-square statistic is also reported for the joint contrasts, i.e., for the main effect of condition on the residual variances at the lowest level. Again, the degrees of freedom should not be 42 but it should be the number of contrasts (i.e., 2 d.f.).

cpro 14.72 2

 0.00063620

Calculate the Chi-square PROBability for the two contrasts together, with chi-square = 14.72 and df=2, p<.001. The residual variances are indeed significantly different among the three treatment conditions. In other words, the cond factor also modulates the within-participant (residual) variance. A participant responds less consistently (with larger variance) in condition 3 than in the baseline condition 2, irrespective of the items presented in those conditions. This insight may lead to new research questions and hypotheses.

Use the GUI to save the current model.

As a further exercise, one might add the covariances at level 2 (participants) into the analysis model, in accordance with the simulation model in Note 1. This can also be done with the SETE command. This will result in a model that does not assume sphericity (at level 2), unlike the current model. The log-likelihood in the resulting model will be significantly lower than in the current one. This significant change in log-likelihood indicates that the sphericity assumption is indeed violated in the current data set (as specified by the simulation parameters in Note 1).

Finally, to leave the program MLwiN, choose File > Exit from the GUI.

Full model, without homogeneity assumption, extended (2)

In the analyses above we have inspected whether the between-participant variances are different in the three treatment conditions. In a similar fashion we can also model three different between-item variances for the three conditions, in order to investigate whether variances between items are different under the three treatment conditions.
To this end, it's easiest to discard the whole level 3 (which consists of a single constant unit, containing item information) and replace it by a new level 3 (which consists of three units, one for each condition). For the sake of clarity, the between-participant and residual variances are modeled by a single term in this example.

We start by changing the memory settings of the active worksheet, because a larger amount of memory is needed. This will erase the current worksheet. Hence we save the worksheet, inititialize, and then retrieve the worksheet.

save d:\x24mx.ws

init 4 1000 400 300 20

retr d:\x24mx.ws

Then we clear the whole model and its estimates ...

clear

erase c50-c400

... and then re-define the fixed-only model again.

resp "resp"

iden 2 "subj" 1 "item"

expl 1 c11-c13

Add the three dummies for the three conditions to the fixed part.

expl 1 "const"

fpart 0 "const"

setv 1 "const"

setv 2 "const"

batch 1

mlre "subj" "item" c20

Just to be sure, we re-create unique identifiers for items within subjects. The new identifiers are stored in column 20.

iden 3 "const"

setx "cond1" "cond2" "cond3" 3 c20 c101-c136 c201-c236 c301-c336 c50

These three commands realize the actual cross-classification of two random effects. The 3 dummies and the 36 identifiers (for the 36 items within subjects) in c20 are used to build three additional constraints, at the new highest level 3. The constraints state that the variance and covariance coefficients for each item (identified in c20) are identical — within each level of cond. In other words, items-within-subjects are constrained to be identical across subjects, but within each condition, if they have the same identifier. The dummies for the constraints for condition 1 are in c101-c136, those for condition 2 are in c201-c236, and those for condition 3 are in c301-c336. All constraints for the random part are stored in column c50.

rcon c50

The random constraints in c50 are to be used.

start

Note that estimation of the coefficients of this model fails, because of insufficient memory. The help information for the SETX command explains the high memory requirements for cross-classified models.
One solution might be to investigate the model with fewer items. This can be done by selecting only the items 1 to 9 (out of 36 items), from the "item" column c2, taking along the data in c1-c4 and the dummies in c11-c13, and storing the result in these same columns.

choose 1 9 c2 c1-c4 c11-c13 c2 c1-c4 c11-c13

Because the number of items determines the number of dummy columns for the random part at level 3, the whole model has to be re-built once more, just like above.

clear

...

setx "cond1" "cond2" "cond3" 3 c20 c101-c109 c201-c209 c301-c309 c50

rcon c50

Finally, the model converges and produces the following results, for the first 9 items only.

We see that the variance between items is larger in condition 2 than in the other conditions, although this difference is not significant (command RTEST, see above, here yields χ²=3.228, df=2).

2008.06.27 HQ