The mixed-effects analyses shown here were done with the program R (version 2.4.1), with the function lmer
in package lme4
(Bates, 2005). It is assumed that you have some basic knowledge of R (or S or S-Plus). For further assistance, you may consult my tutorial on Statistics with R and S-Plus, or other help information at the R project website www.r-project.org.
Program output is indicated by black monospace type, like this.
R version 2.4.1 (2006-12-18) Copyright (C) 2006 The R Foundation for Statistical Computing ISBN 3-900051-07-0 R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. Loading required package: utils [Previously saved workspace restored]
xyz
is available by typing help(xyz) at the R prompt, or by browsing the Help menu option. The help pages also contain highly informative examples that may be copy-pasted at the R prompt. R>
getwd()[1]
in the output tells us that the following element (string) is the first (and here, the only) element in the list of strings.dir
argument, can also be used to change the current working directory.[1] "E:/Hugo/multilevel"
R>
x24 <- read.table(file="x24.txt",header=F)read.table
reads a data sheet or matrix from the specified file. The file specification is a character string enclosed in double quotes. The file is searched in the default working directory, see above. The function returns a data object of class data.frame
. Optionally, the first line of the ASCII data file specified as argument, may contain the names of the columns (variables), in which case we should add the argument header=TRUE
. Since our data file doesn't have such a first line, this option is switched off by specifying header=FALSE
(which is also the default value for this argument). Boolean values TRUE and FALSE may be abbreviated to their first character. x24
by means of the assignment operator <-
. Without such assignment, the data are read from file, the contents of the resulting data frame is displayed on the screen, and the result is then lost. Warning: Assignments are carried out without checking whether the object already exists. In other words, if some object named x24
already exists, it will be overwritten by this assignment operation. Make sure that you do not lose valuable information by inadvertedly overwriting it. R>
head(x24)
V1 V2 V3 V4 1 1 1 1 0.72383 2 1 2 1 0.06709 3 1 3 1 1.19462 4 1 4 1 1.72527 5 1 5 1 0.52362 6 1 6 1 1.09539
R>
dimnames(x24)[[2]] <- c("subj","item","cond","resp")
x24
has two dimensions, viz. rows and columns. We'd prefer to assign helpful names to the columns, i.e. to the names of the elements in dimension number two, to replace the names V1, V2, V3, V4
. This is done by constructing a list of four strings, and assigning this to the names of the second dimension of object x24
.R>
# cond is coerced from numerical into factorR>
x24$cond <- as.factor(x24$cond)
x24
are numeric. In regression, this means that the continuous numerical values would be used, instead of the discrete categorical values. Replace variable cond
in data frame x24
by its own values but coerced (forced) into a special class named factor
, for discrete factors. R>
require(lme4)
lme4
(Bates, 2005) is required for further analysis. If the package is not yet loaded, then do so. Any packages that are required by lme4
are also loaded. Loading required package: lme4 Loading required package: Matrix Loading required package: lattice [1] TRUE
R>
# empty model (8)R>
m8 <- lmer(resp~1+(1|subj)+(1|item),data=x24,method="ML")
lmer
. Study the help information for this function. The first argument is the regression formula. This formula contains the random components in parentheses; the residual is not specified. The sequence (1|subj)
corresponds to u0(j0); it specifies an intercept for the effect of subj
that is the same for all conditions. Likewise, the sequence (1|item)
corresponds to v0(0k). The two random effects, of subj
ects and of item
s, are crossed.data
argument contains the data frame object to use, and the method
argument specifies Maximum Likelihood as the estimation method (Pinheiro & Bates, 2000). The function returns an object containing the model fit and residuals; this object of class lmer
is assigned to a new object named m8
.R>
summary(m8)
m8
.Linear mixed-effects model fit by maximum likelihood Formula: resp ~ 1 + (1 | subj) + (1 | item) Data: x24 AIC BIC logLik MLdeviance REMLdeviance 2087 2101 -1040 2081 2083 Random effects: Groups Name Variance Std.Dev. item (Intercept) 0.25664 0.50659 ## v_0(0k) subj (Intercept) 0.28828 0.53692 ## u_0(j0) Residual 0.54027 0.73503 ## e_i(jk) number of obs: 864, groups: item, 36; subj, 24 Fixed effects: Estimate Std. Error t value (Intercept) 0.0235 0.1406 0.1671 ## gamma_0(00)
R>
m8v2 <- lmer(resp~-1+cond+(1|subj)+(1|item),data=x24,method="ML")
-1
) and adds the fixed effect of cond
. Because the condition effect is not given in the random part, this model assumes homoschedasticity and sphericity, much like a RM-ANOVA does. Hence the standard errors are the same for all three fixed estimates.R>
summary(m8v2)
Linear mixed-effects model fit by maximum likelihood Formula: resp ~ -1 + cond + (1 | subj) + (1 | item) Data: x24 AIC BIC logLik MLdeviance REMLdeviance 2045 2069 -1017 2035 2045 Random effects: Groups Name Variance Std.Dev. item (Intercept) 0.25789 0.50783 ## v_0(0k) subj (Intercept) 0.28913 0.53771 ## u_0(j0) Residual 0.51033 0.71437 ## e_i(jk) number of obs: 864, groups: item, 36; subj, 24 Fixed effects: Estimate Std. Error t value cond1 0.21606 0.14485 1.4916 ## gamma_H cond2 0.04569 0.14485 0.3154 ## gamma_N cond3 -0.19127 0.14485 -1.3204 ## gamma_L
R>
anova(m8,m8v2)
cond
, by comparing their likelihood ratios using function anova
with the two models as two parameters. This method of evaluating fixed effects may be too liberal (Pinheiro & Bates, 2000), especially if there are few observations and many model parameters. Data: x24 Models: m8: resp ~ 1 + (1 | subj) + (1 | item) m8v2: resp ~ -1 + cond + (1 | subj) + (1 | item) Df AIC BIC logLik Chisq Chi Df Pr(>Chisq) m8 3 2086.7 2101.0 -1040.3 m8v2 5 2044.8 2068.6 -1017.4 45.901 2 1.079e-10 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
R>
# full model (9)R>
m9 <- lmer(resp~-1+cond+(-1+cond|subj)+(1|item),data=x24,method="ML")
cond
in its fixed part, and in the random part at the participant level. Hence this model does not require homoschedasticity nor sphericity. R>
summary(m9)
Linear mixed-effects model fit by maximum likelihood Formula: resp ~ -1 + cond + (-1 + cond | subj) + (1 | item) Data: x24 AIC BIC logLik MLdeviance REMLdeviance 2047 2095 -1014 2027 2036 Random effects: Groups Name Variance Std.Dev. Corr item (Intercept) 0.25344 0.50342 ## v_0(0k) subj cond1 0.25067 0.50067 ## u_H 0(j0) cond2 0.30652 0.55364 0.922 ## u_N 0(j0) cond3 0.35956 0.59964 0.898 0.960 ## u_L 0(j0) Residual 0.49429 0.70306 ## e_i(jk) number of obs: 864, groups: item, 36; subj, 24 Fixed effects: Estimate Std. Error t value cond1 0.21606 0.13857 1.5593 ## gamma_H cond2 0.04569 0.14672 0.3114 ## gamma_N cond3 -0.19127 0.15407 -1.2414 ## gamma_L
lmer
. Speaker's deviances uH are correlated with uN, with r=.92. The different estimates of between-speaker variance in the random part lead to different standard errors for the three treatment conditions in the fixed part.
In the simulated data, the residual variances at the lowest level were not homogenenous either (cf. Note 1 in the manuscript), but this heteroschedasticity at the lowest level cannot be captured here.