Multi-Level Modeling with MLn

annotated logfile of example analyses

This logfile accompanies the paper "On multi-level modeling of data from repeated measures designs: A tutorial" by Hugo Quené and Huub van den Bergh (2004).

The multilevel analyses reported here were done with the program MLn for multilevel analysis (version 1.0a). This is a command-line version of the program MLn, with output suitable for offline inspection as presented here. The program has since been modernized into a version for Microsoft Windows, MLwiN (version 1.10). In the latter program, the user can open a command-line window and input the commands below; program output is written to a separate window.

In the log below, user input commands are indicated by blue and bolder type, like this.

Program output is indicated by black monospace type, like this.

Comments and annotations are given in italic sans-serif type, with indented left margin, like this.

MLN - Software for N-level analysis.

data set-up

All commands can be abbreviated to their first 4 characters.
The command HELP gives a list of all possible commands. The command HELP with another command as parameter, e.g. "HELP DINPUT", yields a short explanation of that command.

dinp c1-c4

250000 spaces left on worksheet
Type file name
->

f:\qb108.dat

read Disk INPut, and store data from file into columns 1 to 4 of worksheet. The listing of the Disk INPut file is abbreviated here.

1.7074	1	1	1
-.55131	1	2	1
-.75213	1	3	1
...
1.7583	12	7	3
-.18328	12	8	3
-.72555	12	9	3

names c1 'score' c2 'subj' c3 'trial' c4 'treatm'

Rename columns to sensible variable names.
NB: This program requires that data are sorted in hierarchical order, e.g. first by subject and within subjects by trial. This was already done in the data file. Data can be sorted in MLn by CALCulating an auxiliary variable from the sort key variables, and then SORTing all columns using the auxiliary variable.
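
The auxiliary sort variable mentioned above can be illustrated outside MLn. The following Python sketch is only an illustration of the idea, not MLn syntax; the multiplier of 100 is a hypothetical choice, and the four rows are copied from the data listing above and deliberately shuffled.

```python
# Rows as (score, subj, trial, treatm), copied from the listing above
# and put in the wrong order on purpose.
rows = [
    (-0.72555, 12, 9, 3),
    (-0.55131, 1, 2, 1),
    (1.7583, 12, 7, 3),
    (1.7074, 1, 1, 1),
]

def sort_key(row, multiplier=100):
    """Auxiliary variable: subject code times a multiplier, plus trial code.

    The multiplier (hypothetical value 100) only has to exceed the
    largest trial code, so that subjects never overlap.
    """
    _score, subj, trial, _treatm = row
    return subj * multiplier + trial

# Sorting all columns by the auxiliary variable gives hierarchical order:
# first by subject, and within subjects by trial.
sorted_rows = sorted(rows, key=sort_key)
```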

note construct CONST vector, filled with one's only

NOTE adds a comment to the logfile.
This program does not add constant variances into the model by default; these have to be specified by the user. An auxiliary column containing the value unity (1) everywhere is needed for this purpose. This auxiliary column is CALCulated here from an already existing column.

calc c5=c1/c1
name c5 'const'

names

The NAMEs command without columns as argument returns summary information about the columns in the worksheet, as an aid to check data integrity.

      Name              n  missing     min             max
   1 SCORE            108      0       -4.7446        5.7573
   2 SUBJ             108      0        1.0000        12.000
   3 TRIAL            108      0        1.0000        9.0000
   4 TREATM           108      0        1.0000        3.0000
   5 CONST            108      0        1.0000        1.0000
   6 C6                 0
   7 C7                 0
...
  19 C19                0
  20 C20                0

empty model

resp c1
expl 1 c5

The independent (EXPLanatory) and dependent (RESPonse) variables must be specified. The RESPonse variable is in column 1. The command to include or exclude EXPLanatory variables works as a toggle. The first parameter "1" forces inclusion of the constant variable in column 5 into the model.

iden 1 c3 2 c2

The hierarchical structure must be specified. Level-1 units (trials or occasions, lowest level) are IDENtified in column 3. Level-2 units (subjects, highest level) are IDENtified in column 2.

fpart 1 c5

The fixed part of the model must be specified. Here the grand mean is specified as the only predictor, by choosing the CONST variable in column 5 as the only predictor for the scores. The command to include or exclude explanatory variables in the Fixed PART also works as a toggle.

setv 1 c5
setv 2 c5

The random part of the model must be specified. A constant variance at both levels is specified here, by choosing the CONST variable in column 5 as the only predictor for the variance at each level.

sett

Show SETTings of the current model: specifications of the dependent and independent variables, hierarchical structure, etc.

EXPLanatory variables in       CONST
FPARameters                    CONST
RESPonse variable in           SCORE
FSDErrors : uncorrected                 RSDErrors : uncorrected
MAXIterations  20   TOLErance     2     METHod is RIGLS    BATCh is ON
IDENtifying codes : 1-TRIAL, 2-SUBJ

LEVEL 2 RPM
         CONST
CONST    1
LEVEL 1 RPM
         CONST
CONST    1

The output of the SETTings command lists the explanatory and response variables, and gives the variance-covariance matrix at each level specified by the IDENtify command. Here the matrices only contain the CONST column, yielding constant variances and no covariances at both levels.
This is the empty model mentioned in the paper.

start

START the iterative estimation of the coefficients in the current model. The program is set here to iterate until a certain convergence criterion is achieved (BATCH ON). The command BATCH OFF allows inspection of estimates after each iteration.

Convergence achieved

Only a single iteration was needed to obtain good estimates for this simple model.

fixed

  PARAMETER            ESTIMATE     S. ERROR(U)   PREV. ESTIMATE
CONST                    0.2371       0.4401             0.2371

The command FIXED shows the estimates for the fixed part of the model, together with the standard error of that estimate (and the estimate in the previous iteration). These are given in units of the dependent variable. The single constant estimate corresponds with the grand mean, corrected or unbiased for the hierarchical random effects in the model.

random

LEV.  PARAMETER       (NCONV)    ESTIMATE    S. ERROR(U)  PREV. ESTIM     CORR.
-------------------------------------------------------------------------------
 2    CONST    /CONST    ( 1)       2.056         0.9437        2.042         1
-------------------------------------------------------------------------------
 1    CONST    /CONST    ( 2)       2.408         0.3475        2.408

The command RANDOM shows the estimates for the random part of the model, together with the standard error of each estimate (and the estimate in the previous iteration). These are given in variance units. The constant estimate at level 2 corresponds with the between-subjects variance, and the constant estimate at level 1 with the pooled within-subjects variance, both corrected or unbiased for the hierarchical random effects in the model.

aver c1

The AVERage command shows the raw, uncorrected average of the dependent variable in the first column. Note that the standard error of the mean in this single-level, disaggregated model is considerably lower than the standard error of the mean as estimated in the fixed part of the two-level model (listed above), as explained in the paper.

Count   =            108
Average =         0.23708
S.D.    =          2.0764
S.E.M.  =         0.19980
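
The difference between the two standard errors can be checked by hand. The Python sketch below assumes a balanced design (12 subjects with 9 trials each, as the NAMES output suggests) and uses the rounded estimates printed above, so it only approximates the program output; it is not the algorithm used by MLn.

```python
import math

n_obs = 108        # level-1 units (trials), from the NAMES output
n_subj = 12        # level-2 units (subjects)
var_subj = 2.056   # level-2 variance, from RANDOM
var_trial = 2.408  # level-1 variance, from RANDOM
sd_raw = 2.0764    # S.D. as reported by AVER

# Single-level view: all 108 scores are treated as independent.
sem_single = sd_raw / math.sqrt(n_obs)          # about 0.1998

# Two-level view (balanced design assumed): the subject means vary as well,
# and there are only 12 of them.
se_two_level = math.sqrt(var_subj / n_subj + var_trial / n_obs)   # about 0.44
```

Most of the two-level standard error comes from the term var_subj / n_subj, which the single-level average ignores.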

249114 spaces left on worksheet
-2*log(lh) is      426.334

The value of -2*log(lh), i.e. minus twice the log-likelihood, indicates the amount of "stress" or "deviance" between the model and the actual data. Smaller values generally indicate a better fit for the model (except for proportion data).

cell means model

note construct dummies for treatment conditions
dummy c4 c11-c13

In order to analyze the cell means model mentioned in the paper, we need dummy variables for each of the 3 levels of the Treatment factor in column 4. The resulting dummy variables are stored in columns 11 to 13.

full set of 3 dummies created
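
What the DUMMY command produces can be sketched in Python as follows. The function name make_dummies is hypothetical, and the nine treatment codes for one subject are an assumed layout (three consecutive trials per treatment), inferred from the abbreviated data listing above.

```python
def make_dummies(codes):
    """Return one 0/1 indicator column per distinct code, keyed by code."""
    levels = sorted(set(codes))
    return {level: [1 if c == level else 0 for c in codes] for level in levels}

# Treatment codes for the nine trials of one subject (assumed layout).
treatm = [1, 1, 1, 2, 2, 2, 3, 3, 3]

dummies = make_dummies(treatm)
tr_a, tr_b, tr_c = dummies[1], dummies[2], dummies[3]

# Every trial belongs to exactly one treatment, so the three dummies
# add up to the constant column.
row_sums = [a + b + c for a, b, c in zip(tr_a, tr_b, tr_c)]
```

Because the three dummies add up to the constant column, the constant has to be left out of the fixed part when all three dummies are included.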

names c11 'TrA' c12 'TrB' c13 'TrC'
expl 1 c11-c13
fpart 0 'const'

Include the new dummy columns as EXPLanatory variables in the new model. Default behavior is to include these explanatory variables in the fixed part but not in the random part, unless specified explicitly. Exclude the column named 'const' (column 5) from the Fixed PART of the new model.

sett

EXPLanatory variables in       TRA      TRB      TRC      CONST
FPARameters                    TRA      TRB      TRC
RESPonse variable in           SCORE
FSDErrors : uncorrected                 RSDErrors : uncorrected
MAXIterations  20   TOLErance     2     METHod is RIGLS    BATCh is ON
IDENtifying codes : 1-TRIAL, 2-SUBJ

LEVEL 2 RPM
         CONST
CONST    1
LEVEL 1 RPM
         CONST
CONST    1

start

Convergence achieved

fixed

  PARAMETER            ESTIMATE     S. ERROR(U)   PREV. ESTIMATE
TRA                     -0.1798       0.4807            -0.1798
TRB                     -0.2194       0.4807            -0.2194
TRC                        1.11       0.4807               1.11

These coefficients are the estimated cell means for the treatment conditions.

random

LEV.  PARAMETER       (NCONV)    ESTIMATE    S. ERROR(U)  PREV. ESTIM     CORR.
-------------------------------------------------------------------------------
 2    CONST    /CONST    ( 1)       2.099         0.9435        2.085         1
-------------------------------------------------------------------------------
 1    CONST    /CONST    ( 1)        2.02         0.2916         2.02

These coefficients are the estimated variances at both levels. Note that the within-subject variance at level 1 is less than in the empty model, because within-subject differences can now in part be attributed to treatment conditions.

input c41
1 -1 0 0 1 0 -1 0 0 1 -1 0

ftest c41

In order to test pairwise comparisons among the coefficients in the fixed part of the model, we have to set up appropriate contrast weights. There are 3 coefficients in the fixed part of this model. For each contrast, 3 weights are specified for the 3 coefficients, followed by the expected value of the contrast under H0 (i.e. zero). The sequence of 3 weights plus expected value is given for three pairwise comparisons A-B, A-C and B-C. All weights for all contrasts in the fixed part are INPUT in a single auxiliary variable (in column 41), so that they are tested simultaneously. Data INPUT into column 41 is terminated by an empty line. The FTEST command performs the actual testing in the fixed part using the weights in column 41.

CONTRASTS
TRA                   1.00    1.00    0.00
TRB                  -1.00    0.00    1.00
TRC                   0.00   -1.00   -1.00
result                0.04   -1.29   -1.33
chi square ( 1 df)    0.01   14.83   15.76
+/-95% c.i.(sep.)     0.66    0.66    0.66
+/-95% c.i.(sim.)     0.94    0.94    0.94

 chi sq for simultaneous contrasts(3 df) =  20.40

As explained in the paper, each contrast is tested by evaluating the amount of variance associated with that contrast, using a chi-square test statistic.
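
The "result" row of the FTEST output can be reproduced by hand from the contrast weights and the fixed estimates. The Python sketch below is only an illustration; the helper names are hypothetical, and the estimates are the rounded values printed by FIXED.

```python
# The flat vector as typed after "input c41": for each contrast,
# 3 weights followed by the value expected under H0.
c41 = [1, -1, 0, 0,   1, 0, -1, 0,   0, 1, -1, 0]
estimates = [-0.1798, -0.2194, 1.11]   # TRA, TRB, TRC from FIXED

def split_contrasts(values, n_coef):
    """Cut the flat vector into (weights, expected value) pairs."""
    step = n_coef + 1
    return [(values[i:i + n_coef], values[i + n_coef])
            for i in range(0, len(values), step)]

def contrast_value(weights, coefs):
    """Weighted sum of the coefficients."""
    return sum(w * b for w, b in zip(weights, coefs))

contrasts = split_contrasts(c41, 3)
results = [round(contrast_value(w, estimates), 2) for w, _h0 in contrasts]
# results is [0.04, -1.29, -1.33], as in the "result" row above
```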

cprob 14.8 1

 0.00011954

Calculate the Chi-square PROBability for one of the contrasts, with chi-square = 14.8 and df=1, p=.00012. Treatment C differs significantly from treatments A and B, which are not significantly different themselves.
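
For one degree of freedom the chi-square probability can be verified without MLn, because the test statistic is then a squared standard normal deviate. The Python sketch below uses only the standard library; the function name is hypothetical and the formula holds for df=1 only.

```python
import math

def chi2_upper_tail_df1(x):
    """Upper-tail probability of a chi-square variate with 1 df.

    With 1 df the statistic is a squared standard normal deviate,
    so P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))

p = chi2_upper_tail_df1(14.8)   # about 0.00012, as reported by CPROB
```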

249107 spaces left on worksheet
-2*log(lh) is        407.5

The log-likelihood (deviance) for the cell means model is lower than for the empty model (see above), indicating a better fit for the cell means model.

full model

expl 0 c5
expl 1 c11-c13

Make sure that the CONST variable is excluded as an EXPLanatory variable from the model (toggle off, constant vector in column 5). Make sure that the dummy variables TrA, TrB and TrC are included in the model (toggle on, dummies in columns 11 to 13). The latter command automatically includes these dummies in the fixed part of the model.

setv 1 c11-c13
setv 2 c11-c13

The random part of the model must be adjusted. The dummy variables in columns 11 to 13 are specified as predictors for the variance at level 1 and level 2. The default behavior of the program is to include not only the variances, on the diagonal of the variance-covariance matrix, but also the covariances, off the diagonal. This can be verified by inspecting the current SETTings for the multilevel model.

sett

EXPLanatory variables in       TRA      TRB      TRC
FPARameters                    TRA      TRB      TRC
RESPonse variable in           SCORE
FSDErrors : uncorrected                 RSDErrors : uncorrected
MAXIterations  20   TOLErance     2     METHod is RIGLS    BATCh is ON
IDENtifying codes : 1-TRIAL, 2-SUBJ

LEVEL 2 RPM
         TRA      TRB      TRC
TRA      1
TRB      1        1
TRC      1        1        1
LEVEL 1 RPM
         TRA      TRB      TRC
TRA      1
TRB      1        1
TRC      1        1        1

note Remove covariances between trials and treatments

Since treatments were not crossed with trials in this experimental design, there are no covariances between treatments at level-1 (trials). The covariance terms between treatments at the lowest level (trials) must therefore be excluded.

clre 1 c12 c13
clre 1 c11 c13
clre 1 c11 c12

CLeaR a single Element in the variance-covariance matrix for level 1, viz the covariance between columns 12 and 13 (TrB and TrC). The other two commands work in similar fashion to remove or clear the other covariance elements.

sett

EXPLanatory variables in       TRA      TRB      TRC
FPARameters                    TRA      TRB      TRC
RESPonse variable in           SCORE
FSDErrors : uncorrected                 RSDErrors : uncorrected
MAXIterations  20   TOLErance     2     METHod is RIGLS    BATCh is ON
IDENtifying codes : 1-TRIAL, 2-SUBJ

LEVEL 2 RPM
         TRA      TRB      TRC
TRA      1
TRB      1        1
TRC      1        1        1
LEVEL 1 RPM
         TRA      TRB      TRC
TRA      1
TRB      0        1
TRC      0        0        1

This is the correct model, with a full variance-covariance matrix at level-2 (between subjects) and variances only at level-1 (trials or occasions). Let's go!

start

Convergence achieved

fixed

  PARAMETER            ESTIMATE     S. ERROR(U)   PREV. ESTIMATE
TRA                     -0.1798       0.4739            -0.1798
TRB                     -0.2194       0.5184            -0.2194
TRC                        1.11       0.6383               1.11

The fixed estimates are the same as in the cell means model above.

random

LEV.  PARAMETER       (NCONV)    ESTIMATE    S. ERROR(U)  PREV. ESTIM     CORR.
-------------------------------------------------------------------------------
 2    TRA      /TRA      ( 1)       2.411          1.097        2.396         1
 2    TRB      /TRA      ( 1)      0.9261         0.8869       0.9202      0.35
 2    TRB      /TRB      ( 1)       2.903          1.312        2.885         1
 2    TRC      /TRA      ( 1)         3.3          1.408        3.279     0.989
 2    TRC      /TRB      ( 1)       0.826          1.164       0.8208     0.226
 2    TRC      /TRC      ( 1)       4.617          1.986        4.588         1
-------------------------------------------------------------------------------
 1    TRA      /TRA      ( 2)      0.8513         0.2458       0.8513
 1    TRB      /TRB      ( 2)      0.9635         0.2781       0.9635
 1    TRC      /TRC      ( 2)      0.8186         0.2363       0.8186

The random estimates show interesting properties, especially at level-2 (between subjects), as explained in the paper. First, between-speaker variances differ widely among treatments (2.4, 2.9 and 4.6), although not significantly so (see below). Second, covariances among treatments are not constant among pairs of treatments. Hence there is no compound symmetry and no sphericity. Variance within subjects (level-1) seems to be homogeneous across treatments.

input c42
1 0 -1 0 0 0 0 0 0 0 1 0 0 0 0 -1 0 0 0 0 0 0 1 0 0 -1 0 0 0 0

rtest c42

The different between-speaker variances among treatments can be evaluated by means of pairwise comparisons, similar to pairwise comparisons in the fixed part of the cell means model (see above). We have to set up appropriate contrast weights. There are 9 coefficients in the random part of this model (6 at level-2 and 3 at level-1). For each contrast, 9 weights are specified for the 9 coefficients, again followed by the expected value of the contrast under H0 (i.e. zero). The sequence of 9 weights plus expected value is given for three pairwise comparisons A-B, A-C and B-C. All weights for all contrasts in the random part are INPUT in a single auxiliary variable (in column 42), so that they are tested simultaneously. Data INPUT into column 42 is terminated by an empty line. The RTEST command performs the actual testing in the random part using the weights in column 42.

CONTRASTS
TRA     /TRA     :>   1.00    1.00    0.00
TRB     /TRA     :>   0.00    0.00    0.00
TRB     /TRB     :>  -1.00    0.00    1.00
TRC     /TRA     :>   0.00    0.00    0.00
TRC     /TRB     :>   0.00    0.00    0.00
TRC     /TRC     :>   0.00   -1.00   -1.00
TRA     /TRA     :=   0.00    0.00    0.00
TRB     /TRB     :=   0.00    0.00    0.00
TRC     /TRC     :=   0.00    0.00    0.00
result               -0.49   -2.21   -1.71
chi square ( 1 df)    0.09    3.11    0.54
+/-95% c.i.(sep.)     3.19    2.45    4.57
+/-95% c.i.(sim.)     6.69    5.14    9.59

 chi sq for simultaneous contrasts(9 df) =   3.88
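
Typing 30 weights by hand is error-prone. The Python sketch below shows one way to build the weights from the order of the random parameters and to check the "result" row; it is an illustration only, the helper name is hypothetical, and the estimates are the rounded values printed by RANDOM.

```python
# Random parameters in the order listed by RANDOM: level 2 first, then level 1.
params = [
    (2, "TRA", "TRA"), (2, "TRB", "TRA"), (2, "TRB", "TRB"),
    (2, "TRC", "TRA"), (2, "TRC", "TRB"), (2, "TRC", "TRC"),
    (1, "TRA", "TRA"), (1, "TRB", "TRB"), (1, "TRC", "TRC"),
]
estimates = [2.411, 0.9261, 2.903, 3.3, 0.826, 4.617, 0.8513, 0.9635, 0.8186]

def variance_weights(plus, minus):
    """Nine weights comparing the level-2 variances of two treatments."""
    weights = [0] * len(params)
    weights[params.index((2, plus, plus))] = 1
    weights[params.index((2, minus, minus))] = -1
    return weights

pairs = [("TRA", "TRB"), ("TRA", "TRC"), ("TRB", "TRC")]

# Flat vector for INPUT: nine weights plus the H0 value (zero) per contrast.
c42 = []
for plus, minus in pairs:
    c42 += variance_weights(plus, minus) + [0]
c42_text = " ".join(str(v) for v in c42)

# Differences between the level-2 variances, as in the "result" row.
results = [round(sum(w * e for w, e in zip(variance_weights(a, b), estimates)), 2)
           for a, b in pairs]
# results is [-0.49, -2.21, -1.71]
```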

cpro 3.11 1

 0.077813

Calculate the Chi-square PROBability for this contrast, with chi-square = 3.11 and df=1, p=.078. Although the between-speaker variance under treatment A is considerably smaller than under treatment C (2.4 vs 4.6), this difference is not significant. Nevertheless the assumption of homoscedasticity is probably violated in these data.

calc b1 = 3.3^2 / (2.411*4.617)

 0.97830

note this is correlation in subj means between treatmts A and C

The covariance of 3.3 can be regarded as an unstandardized correlation in subject means between treatments A and C. The CALC command above divides the squared covariance by the product of the two variances, which yields the squared correlation, r^2=0.978. Its square root, r=0.989, is the standardized correlation. This value of 0.989 is also given in the output for the random part, in the last column (CORR.) of the output table (above).
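
The relation between the covariances, the CALC result and the CORR. column can be checked with a few lines of Python. This is a sketch using the rounded level-2 estimates from the RANDOM output; the function name is hypothetical.

```python
import math

# Level-2 estimates from RANDOM: variances and covariances.
var = {"TRA": 2.411, "TRB": 2.903, "TRC": 4.617}
cov = {("TRB", "TRA"): 0.9261, ("TRC", "TRA"): 3.3, ("TRC", "TRB"): 0.826}

def correlation(a, b):
    """Covariance divided by the product of the two standard deviations."""
    return cov[(a, b)] / math.sqrt(var[a] * var[b])

r_ca = correlation("TRC", "TRA")   # about 0.989, the CORR. value
r_squared_ca = r_ca ** 2           # about 0.978, the value returned by CALC
```

The same function gives about 0.35 for TRB/TRA and about 0.226 for TRC/TRB, in agreement with the CORR. column.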

ftest c41

The full model still has only 3 coefficients in its fixed part. For the pairwise comparisons we can re-use the contrast weights that were set up above for the cell means model, and stored in column 41. Again, each contrast is tested by evaluating the amount of variance associated with that contrast, using a chi-square test statistic.

CONTRASTS
TRA                   1.00    1.00    0.00
TRB                  -1.00    0.00    1.00
TRC                   0.00   -1.00   -1.00
result                0.04   -1.29   -1.33
chi square ( 1 df)    0.00   20.28    3.28
+/-95% c.i.(sep.)     1.14    0.56    1.44
+/-95% c.i.(sim.)     1.63    0.80    2.05

 chi sq for simultaneous contrasts(3 df) =  22.91

cprob 3.28 1

 0.070129

The third contrast, between treatments B-C, yields a chi-square value of 3.28 with df=1, with a Chi-square PROBability of p=.070, which is not significant at alpha=.05. Adding random coefficients has led to larger standard errors for the cell means estimates in the fixed part, as compared to the cell means model. In turn, this has led to more conservative testing of the pairwise comparisons: the third contrast was significant in the cell means model but not in the full model.
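
Why the third contrast (TRB minus TRC) loses significance can be made concrete with a back-of-envelope calculation. The Python sketch below assumes a balanced design with 12 subjects and 36 observations per treatment (3 trials per subject per treatment, as the data listing suggests), and uses the rounded estimates printed above. It is a hand check that approximates the FTEST output, not the algorithm used by MLn.

```python
import math

N_SUBJ = 12       # subjects
N_PER_CELL = 36   # observations per treatment (12 subjects x 3 trials, assumed)

def contrast_se(var2_a, var2_b, cov2_ab, var1_a, var1_b):
    """Standard error of the difference between two cell means (balanced design)."""
    between = (var2_a + var2_b - 2.0 * cov2_ab) / N_SUBJ
    within = (var1_a + var1_b) / N_PER_CELL
    return math.sqrt(between + within)

diff_bc = -0.2194 - 1.11   # TRB minus TRC, the same in both models

# Cell means model: a single level-2 variance shared by all treatments,
# so the between-subjects part cancels from the contrast.
se_cell = contrast_se(2.099, 2.099, 2.099, 2.02, 2.02)

# Full model: separate level-2 variances and a small B-C covariance,
# so the between-subjects part no longer cancels.
se_full = contrast_se(2.903, 4.617, 0.826, 0.9635, 0.8186)

chi2_cell = (diff_bc / se_cell) ** 2   # close to the 15.76 reported earlier
chi2_full = (diff_bc / se_full) ** 2   # close to the 3.28 reported here
```

The contrast itself is unchanged; only its standard error grows, from about 0.33 to about 0.73, because subject means under treatments B and C are only weakly correlated.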

249058 spaces left on worksheet
-2*log(lh) is      355.759

The log-likelihood (deviance) for the current full model is lower than for the empty model or for the cell means model (see above), indicating a better fit for the current model.

wrap-up

save

Type file name
->

f:\qb108.ws

249028 spaces left on worksheet

SAVE the current worksheet for later use. The worksheet contains the data in columns 1 to 4, auxiliary variables, and estimated coefficients in columns 96 to 99.

stop

STOP the current session, and terminate the program MLn.

2004.02.20 HQ