Today, we deal with selected applications of a very powerful notion, general linear hypothesis tests, a framework that allows us to answer all questions from all sets (and others) regardless of whether we’re looking at means or slopes. To approach this framework (and we will mostly discuss its simplest kinds of application), we use the duration data …
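rm(list=ls(all.names=TRUE))
pkgs2load <- c("car", "effects", "emmeans", "magrittr", "multcomp")
suppressMessages(sapply(pkgs2load, library,
   character.only=TRUE, logical.return=TRUE, quietly=TRUE)); rm(pkgs2load)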
car effects emmeans magrittr multcomp
TRUE TRUE TRUE TRUE TRUE
d <- read.delim(                 # summarize d, the result of loading
   file="_input/wdurations.csv", # this file
   stringsAsFactors=TRUE)        # change categorical variables into factors
d$DURATION_l <- log2(d$DURATION)   # as before
d$SURPRISAL_s <- sqrt(d$SURPRISAL) # this time no centering
summary(d)
CASE CONV SPEAKER SEX DURATION
Min. : 1 C08 : 873 S15 : 478 female:3203 Min. : 14.0
1st Qu.:1596 C05 : 831 S10 : 455 male :3180 1st Qu.: 110.0
Median :3192 C06 : 800 S11 : 414 Median : 162.0
Mean :3192 C07 : 772 S14 : 410 Mean : 194.5
3rd Qu.:4788 C04 : 737 S08 : 401 3rd Qu.: 245.0
Max. :6383 C09 : 557 S16 : 395 Max. :1405.0
(Other):1813 (Other):3830
POSINTURN LENGTH FREQ SURPRISAL
Min. : 1.000 Min. : 1.000 Min. : 0.00 Min. : 0.00
1st Qu.: 3.000 1st Qu.: 2.000 1st Qu.:12.50 1st Qu.: 4.00
Median : 6.000 Median : 3.000 Median :15.00 Median : 6.00
Mean : 5.524 Mean : 2.768 Mean :14.04 Mean : 6.17
3rd Qu.: 8.000 3rd Qu.: 3.000 3rd Qu.:17.00 3rd Qu.: 8.00
Max. :10.000 Max. :14.000 Max. :18.00 Max. :21.00
CLASS VOWELS DURATION_l SURPRISAL_s
closed :4021 one :3698 Min. : 3.807 Min. :0.000
number : 125 two+:2685 1st Qu.: 6.781 1st Qu.:2.000
open :2086 Median : 7.340 Median :2.449
propname: 110 Mean : 7.348 Mean :2.371
unclear : 41 3rd Qu.: 7.937 3rd Qu.:2.828
Max. :10.456 Max. :4.583
… and simplify m_01 a little for didactic purposes:
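summary(m_01 <- lm(DURATION_l ~ 1 + VOWELS*CLASS, data=d))$coef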
Let’s begin with a very simple (and admittedly quite artificial) example. Imagine you had m_01 in R’s working memory but only knew the function coef but not the function summary. In other words, you could do this …
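coef(m_01)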
… but you aren’t able to get the p-value for the 2nd coefficient (for VOWELStwo+), i.e. the p-value of 5.3036714e-05 which tells you that that coefficient is significantly different from 0; put differently, from the coef output you can’t see whether the predicted mean for DURATION_l when CLASS is closed and VOWELS is one is significantly different from when CLASS is closed and VOWELS is two+.
For this, and a ton of other, more interesting questions, the function multcomp::glht can help. For how we will use it now, it requires two arguments:¹
a model object (here, m_01) and
a 1-row matrix that contains as many numbers as the model has coefficients (here, length(coef(m_01)) = 10).
You can consider this 1-row matrix to be the equivalent of one row of the model’s model matrix (as retrieved with model.matrix). As a reminder, the model frame of a model is all the rows and columns from the original data that are used in the model and the model matrix of a model is a matrix that
has as many rows as the model frame;
has as many columns as the model has coefficients …
head(mf <- model.frame(m_01)) # the model frame of m_01
DURATION_l VOWELS CLASS
1 6.714246 one closed
2 6.475733 one closed
3 6.285402 one closed
4 6.643856 one closed
5 8.523562 two+ open
6 7.971544 two+ closed
head(mm <- model.matrix(m_01)) # the model matrix of m_01
… because the prediction the model makes for each case of the model frame is the summed product of the row of the model matrix and the model’s coefficients. Let’s look at case 5:
mf[5,] # row 5 of the model frame
##   DURATION_l VOWELS CLASS
## 5   8.523562   two+  open
mm[5,] # row 5 of the model matrix
##              (Intercept)               VOWELStwo+              CLASSnumber
##                        1                        1                        0
##                CLASSopen            CLASSpropname             CLASSunclear
##                        1                        0                        0
##   VOWELStwo+:CLASSnumber     VOWELStwo+:CLASSopen VOWELStwo+:CLASSpropname
##                        0                        1                        0
##  VOWELStwo+:CLASSunclear
##                        0
sum(              # the summed
   "*"(           # product of
      mm[5,],     # the row of the model matrix &
      coef(m_01)  # the model's coefficients
))
## [1] 7.950388
predict(m_01)[5] %>% unname
## [1] 7.950388
What glht does is this:
it also multiplies each coefficient of the model (i.e., the values of coef(m_01)) with the corresponding number in the 1-row version of the model matrix you create;
it also sums up all those products; but then
with the function summary, we can then test whether that sum is significantly different from some value (default: 0);
with the function confint, we can compute the confidence interval of that sum.
Here is this procedure shown in a stepwise fashion: We generate a vector I’m calling mm_index with 10 values and we do that such that, when each of the 10 values of mm_index is multiplied with the corresponding value of coef(m_01), the sum of these 10 products is the second coefficient, for which we want to know whether it’s significantly different from 0. Here’s the vector mm_index, which I put into a data frame just to easily represent how it lines up with the coefficients:
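mm_index <- c(0,1,0,0,0,0,0,0,0,0)
(temp <- data.frame(
   COEF=coef(m_01),
   INDX=mm_index))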
Steps 1 and 2 of a glht application are then this:
(temp$COEF * temp$INDX) %>% # step 1
   sum                      # step 2
[1] -0.1072129
And you can see we did this right, because, trivially, this is indeed the value of coef(m_01)[2]. To do this with glht, we create a glht object, which I just call qwe:
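# step 1 & step 2 w/ glht
(qwe <- glht(
   model=m_01,
   linfct=matrix(mm_index, nrow=1)))

We again see the desired value: glht did what we said it did. And now we can use qwe as an argument to summary and confint to get a significance test of whether that value of -0.1072 is significantly different from 0 as well as the 95% confidence interval of that value:

summary(qwe) # try: summary(qwe)$test[3:6] %>% unlist
confint(qwe) # try: confint(qwe)$confint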
Ideally, you’d immediately write a function like glht.summy that takes as input a glht object and then returns both the relevant summary and confint contents:
glht.summy(qwe)
estimate lower upper se t/z p
-0.10721 -0.15918 -0.05525 0.02651 -4.04458 0.00005
Of course, given everything we know already – we know summary and confint – these particular results are boring: we could have gotten them like this:
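c(round(summary(m_01)$coef[2,], 5), round(confint(m_01)[2,], 5))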
Estimate Std. Error t value Pr(>|t|) 2.5 % 97.5 %
-0.10721 0.02651 -4.04458 0.00005 -0.15918 -0.05525
But the point is to abstract away from this specific example because, with what we’ve just learned, we can now give glht any 1-row matrix codifying any specific combination of predictor values (!), and it will always do that above kind of pairwise multiplication, summing up, and testing of how that sum differs from 0. That means we can answer all questions of all question sets (even question set 3) from above with just glht, and that’s what we’ll practice now.
1.2 Practical applications
Quick reminder of the 9 scenarios:
Table 1: 9 scenarios relevant for post-hoc exploration of models

|  | a: what are the means/slopes? | b: are they * diff from 0? | c: are they * diff from each other? |
|---|---|---|---|
| 1: means for (combos of) levels of cat preds | scenario 1a | scenario 1b | scenario 1c |
| 2: means for values of numeric preds | scenario 2a | scenario 2b | scenario 2c |
| 3: slopes for (combos of) levels of cat preds | scenario 3a | scenario 3b | scenario 3c |
1.2.1 Scenarios 1a & 1b
Scenario 1a meant to ask the following question: What is each of the 10 predicted means of DURATION_l for when you cross the 2 levels of VOWELS and the 5 levels of CLASS? And scenario 1b meant to ask whether each of the 10 means computed above is significantly different from 0 (and what the exact p-value is). We discussed last week that one of the simplest ways to get the information for scenario 1a is with nd:
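nd <- expand.grid(
   CLASS=levels(d$CLASS),
   VOWELS=levels(d$VOWELS))
(nd <- cbind(nd,
   predict(m_01, newdata=nd, interval="confidence")))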
CLASS VOWELS fit lwr upr
1 closed one 7.017088 6.991226 7.042950
2 number one 7.953662 7.809975 8.097349
3 open one 7.852672 7.785243 7.920102
4 propname one 8.529168 8.383230 8.675106
5 unclear one 7.848304 7.607870 8.088738
6 closed two+ 6.909875 6.864804 6.954947
7 number two+ 7.905112 7.631366 8.178858
8 open two+ 7.950388 7.915274 7.985501
9 propname two+ 8.491001 8.123732 8.858270
10 unclear two+ 7.793538 7.212834 8.374241
data.frame(effect("VOWELS:CLASS", m_01))
VOWELS CLASS fit se lower upper
1 one closed 7.017088 0.01319279 6.991226 7.042950
2 two+ closed 6.909875 0.02299163 6.864804 6.954947
3 one number 7.953662 0.07329702 7.809975 8.097349
4 two+ number 7.905112 0.13964245 7.631366 8.178858
5 one open 7.852672 0.03439690 7.785243 7.920102
6 two+ open 7.950388 0.01791204 7.915274 7.985501
7 one propname 8.529168 0.07444535 8.383230 8.675106
8 two+ propname 8.491001 0.18735001 8.123732 8.858270
9 one unclear 7.848304 0.12264937 7.607870 8.088738
10 two+ unclear 7.793538 0.29622638 7.212834 8.374241
data.frame(emmeans(m_01, specs=~ VOWELS|CLASS))
VOWELS CLASS emmean SE df lower.CL upper.CL
1 one closed 7.017088 0.01319279 6373 6.991226 7.042950
2 two+ closed 6.909875 0.02299163 6373 6.864804 6.954947
3 one number 7.953662 0.07329702 6373 7.809975 8.097349
4 two+ number 7.905112 0.13964245 6373 7.631366 8.178858
5 one open 7.852672 0.03439690 6373 7.785243 7.920102
6 two+ open 7.950388 0.01791204 6373 7.915274 7.985501
7 one propname 8.529168 0.07444535 6373 8.383230 8.675106
8 two+ propname 8.491001 0.18735001 6373 8.123732 8.858270
9 one unclear 7.848304 0.12264937 6373 7.607870 8.088738
10 two+ unclear 7.793538 0.29622638 6373 7.212834 8.374241
But then we said we can’t really answer question set 1b without stupid releveling – well, now we can, even if it will at first be painful (but powerful!). But now you will finally realize why I always placed so much emphasis on your understanding what the coefficients mean and how we use them to compute predictions.
The intercept – when CLASS is closed and VOWELS is one – is very easy:
summary(m_01)$coef[1,]
Estimate Std. Error t value Pr(>|t|)
7.01708812 0.01319279 531.88808627 0.00000000
nd[1,]
CLASS VOWELS fit lwr upr
1 closed one 7.017088 6.991226 7.04295
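summary(glht(model=m_01, linfct=matrix(c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0), nrow=1)))

The predictions for the other levels of CLASS when VOWELS is one work the same way: we just also put a 1 at the position of the relevant CLASS coefficient. For instance, the prediction for when CLASS is unclear and VOWELS is one requires that we add to the intercept the coefficient for CLASSunclear:

summary(glht(model=m_01, linfct=matrix(c(1, 0, 0, 0, 0, 1, 0, 0, 0, 0), nrow=1)))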
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + VOWELS * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 7.8483 0.1226 63.99 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
As you can see, the 1s in each (model) matrix we give glht sit at the row indices of the required coefficients in the model summary. But now it gets a bit more complex because we turn to VOWELS being two+, which means we need to consider more coefficients.
The prediction for when CLASS is closed and VOWELS is two+ requires that we add to the intercept the coefficient for VOWELStwo+:
summary(m_01)$coef[c(1,2),]
Estimate Std. Error t value Pr(>|t|)
(Intercept) 7.0170881 0.01319279 531.888086 0.000000e+00
VOWELStwo+ -0.1072129 0.02650782 -4.044576 5.303671e-05
nd[6,]
CLASS VOWELS fit lwr upr
6 closed two+ 6.909875 6.864804 6.954947
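summary(glht(model=m_01, linfct=matrix(c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0), nrow=1)))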
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + VOWELS * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 6.90988 0.02299 300.5 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
The prediction for when CLASS is number and VOWELS is two+ requires that we add to the intercept the coefficient for CLASSnumber, the coefficient for VOWELStwo+, and the one for their interaction:
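summary(glht(model=m_01, linfct=matrix(c(1, 1, 1, 0, 0, 0, 1, 0, 0, 0), nrow=1)))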
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + VOWELS * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 7.9051 0.1396 56.61 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
The prediction for when CLASS is open and VOWELS is two+ requires that we add to the intercept the coefficient for CLASSopen, the coefficient for VOWELStwo+, and the one for their interaction:
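summary(glht(model=m_01, linfct=matrix(c(1, 1, 0, 1, 0, 0, 0, 1, 0, 0), nrow=1)))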
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + VOWELS * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 7.95039 0.01791 443.9 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
The prediction for when CLASS is propname and VOWELS is two+ requires that we add to the intercept the coefficient for CLASSpropname, the coefficient for VOWELStwo+, and the one for their interaction:
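summary(glht(model=m_01, linfct=matrix(c(1, 1, 0, 0, 1, 0, 0, 0, 1, 0), nrow=1)))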
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + VOWELS * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 8.4910 0.1874 45.32 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
The prediction for when CLASS is unclear and VOWELS is two+ requires that we add to the intercept the coefficient for CLASSunclear, the coefficient for VOWELStwo+, and the one for their interaction:
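summary(glht(model=m_01, linfct=matrix(c(1, 1, 0, 0, 0, 1, 0, 0, 0, 1), nrow=1)))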
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + VOWELS * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 7.7935 0.2962 26.31 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
Yes, it’s cumbersome, but if you know what the coefficients mean, it’s also not that bad and this approach answers all questions from scenarios 1a and 1b, which will become especially interesting when we turn to slopes, i.e. scenarios 3a and 3b.
1.2.2 Scenario 1c
Scenario 1c meant to ask whether (some? all?) differences between the 10 means are significant. Last week we used pairs(emmeans(...)), and this is probably the best way to proceed, but to really drive home the point of how powerful glht is, let’s also look at glht-based comparisons between means. Imagine you were interested in answering the following question: Is the prediction for when VOWELS is two+ and CLASS is propname significantly different from when VOWELS is two+ and CLASS is unclear? With emmeans, this is what we’d do …
nd[9,"fit"] - nd[10,"fit"]qwe <-pairs(emmeans( # compute pairwise comparisons of meansobject=m_01, # for the model m_01specs=~ CLASS | VOWELS), # of CLASS within each level of VOWELSadjust="none") # do not adjust for multiple comparisonsqwe[20,]
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + VOWELS * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.6975 0.3505 1.99 0.0466 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
You can see that this is the right result by comparing this output to the above result from emmeans and to the difference in predictions from nd. Again, yes, it seems cumbersome, but you have to remember that, since you define the (model) matrices, any comparison you’re interested in is suddenly at your fingertips. As mentioned above, this will become especially interesting when we turn to slopes, i.e. scenarios 3a and 3b.
1.2.3 Scenarios 2a & 2b
Scenario 2a meant to ask the following question: what are the 5 predicted means of DURATION_l for each of the levels of CLASS when SURPRISAL_s is controlled for in some way? And scenario 2b meant to ask whether each of the 5 means is significantly different from 0 (and what the exact p-value is).
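(A reminder: m_02 is the corresponding model with the numeric predictor; as the Fit lines of the glht outputs below show, it was fit as m_02 <- lm(DURATION_l ~ 1 + SURPRISAL_s * CLASS, data=d).)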
To address scenario 2a, we previously did this …
nd <- expand.grid(
   SURPRISAL_s=c(0, 1, mean(d$SURPRISAL_s)), # main values of SURPRISAL_s
   CLASS=levels(d$CLASS)) %>%                # all levels of CLASS
   "["(,2:1)                                 # reorder columns
(nd <- cbind(nd, predict(m_02, newdata=nd, interval="confidence")))
… and then we looked at the predictions when SURPRISAL_s is 0. But then we said we can’t really address scenario 2b without stupid releveling – well, now we can, even if it is still painful (yet powerful). But to use the capabilities of glht some more, let’s actually use it to create/check the predictions that emmeans would make by default, because those are not for when SURPRISAL_s is 0, but for when SURPRISAL_s is at its mean:
So, let’s quickly compute a 1-element vector / variable with the mean of SURPRISAL_s (which is already part of nd):
(mean_surp <- mean(d$SURPRISAL_s))
[1] 2.370868
The prediction for when CLASS is closed and SURPRISAL_s is its mean requires that we add to the intercept – the prediction when CLASS is closed and SURPRISAL_s is 0 – mean_surp times the coefficient/slope for SURPRISAL_s:
summary(m_02)$coef[c(1,2),]
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.1697983 0.03707286 166.42359 0.000000e+00
SURPRISAL_s 0.3926787 0.01696794 23.14239 7.685868e-114
nd[3,]
CLASS SURPRISAL_s fit lwr upr
3 closed 2.370868 7.100787 7.077645 7.12393
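summary(glht(model=m_02, # following the recipe above; returns the 7.1008 from nd[3,]
   linfct=matrix(c(1, mean_surp, 0, 0, 0, 0, 0, 0, 0, 0), nrow=1)))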
The prediction for when CLASS is number and SURPRISAL_s is its mean requires that, to the intercept, we add mean_surp times the coefficient/slope for SURPRISAL_s, the coefficient for CLASSnumber, and mean_surp times the coefficient/slope for their interaction:
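summary(glht(model=m_02,
   linfct=matrix(c(1, mean_surp, 1, 0, 0, 0, mean_surp, 0, 0, 0), nrow=1)))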
The prediction for when CLASS is open and SURPRISAL_s is its mean requires that, to the intercept, we add mean_surp times the coefficient/slope for SURPRISAL_s, the coefficient for CLASSopen, and mean_surp times the coefficient/slope for their interaction:
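summary(glht(model=m_02,
   linfct=matrix(c(1, mean_surp, 0, 1, 0, 0, 0, mean_surp, 0, 0), nrow=1)))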
The prediction for when CLASS is propname and SURPRISAL_s is its mean requires that, to the intercept, we add mean_surp times the coefficient/slope for SURPRISAL_s, the coefficient for CLASSpropname, and mean_surp times the coefficient/slope for their interaction:
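summary(glht(model=m_02,
   linfct=matrix(c(1, mean_surp, 0, 0, 1, 0, 0, 0, mean_surp, 0), nrow=1)))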
The prediction for when CLASS is unclear and SURPRISAL_s is its mean requires that, to the intercept, we add mean_surp times the coefficient/slope for SURPRISAL_s, the coefficient for CLASSunclear, and mean_surp times the coefficient/slope for their interaction:
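summary(glht(model=m_02,
   linfct=matrix(c(1, mean_surp, 0, 0, 0, 1, 0, 0, 0, mean_surp), nrow=1)))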
1.2.4 Scenario 2c
Scenario 2c meant to ask whether (some? all?) differences between the 5 means are significant. Last week we used pairs(emmeans(...)), and this is probably the best way to proceed, but to really drive home the point of how powerful glht is, let’s also look at glht-based comparisons between means. Imagine you were interested in answering the following question: Is the prediction for when SURPRISAL_s is its mean and CLASS is propname significantly different from when SURPRISAL_s is its mean and CLASS is unclear? With emmeans, this is what we’d do (because emmeans does us the favor of setting SURPRISAL_s to its mean for this):
nd[12,"fit"] - nd[15,"fit"]qwe <-pairs(emmeans( # compute pairwise comparisons of meansobject=m_02, # for the model m_01specs=~ CLASS | SURPRISAL_s), # of CLASS within each level of VOWELSadjust="none") # do not adjust for multiple comparisonsqwe[10,]
But now we use glht. As before I show the super stepwise way where we define two (model) matrices, one for each of the two situations that you want to compare, then you subtract one from the other, and then you give that difference (model) matrix to glht. And, thankfully, we already generated the relevant matrices above; I just copy and paste from above and assign proper names to them:
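SUR.is.mean.CLA.is.prop <- c(1, mean_surp, 0, 0, 1, 0, 0, 0, mean_surp, 0)
SUR.is.mean.CLA.is.uncl <- c(1, mean_surp, 0, 0, 0, 1, 0, 0, 0, mean_surp)
summary(glht(m_02, linfct=matrix(
   SUR.is.mean.CLA.is.prop - SUR.is.mean.CLA.is.uncl, nrow=1)))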
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + SURPRISAL_s * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.4945 0.2015 2.455 0.0141 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
You can see that this is the right result by comparing this output to the result from emmeans and the one from nd above. Again, yes, it seems cumbersome, but note that you can use the glht logic for arbitrary values of SURPRISAL_s, not just 0, 1, or its mean: If any prior work reported a certain significant effect for a certain value of SURPRISAL_s, you can just plug that value into the above code to see whether that value also yields a significant difference in your data. But again, this will become most interesting when we turn to slopes, which we do now.
1.2.5 Scenarios 3a & 3b
Scenario 3a meant to ask the following question: what are the 5 predicted slopes of SURPRISAL_s for each of the levels of CLASS? And scenario 3b meant to ask whether each of these 5 slopes is significantly different from 0 (and what the exact p-value is).
To address scenario 3a last week, the shortest thing we did was this:
# tapply(nd$fit,           # apply a grouping of the predictions
#    list(nd$CLASS,        # by CLASS
#         nd$SURPRISAL_s), # and SURPRISAL_s
#    c) %>%                # just list the values,
#    "["(,-3) %>%          # suppress the 3rd column
#    apply(1, diff)        # compute the differences
# or
# nd.split <- split(nd, nd$SURPRISAL_s)
# nd.split$`1`$fit - nd.split$`0`$fit
coef(m_02)[2] + c(0, coef(m_02)[7:10])
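And scenario 3b – testing each of these slopes against 0 – works with the same glht logic; for instance, for the slope of SURPRISAL_s when CLASS is number:

summary(glht(m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 1, 0, 0, 0), nrow=1)))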
1.2.6 Scenario 3c
Scenario 3c meant to ask whether (some? all?) differences between the 5 slopes of SURPRISAL_s for the different levels of CLASS are significant. Last week we used emmeans::emtrends for this, which is probably the best way to proceed:
pairs(emtrends(        # compute pairwise comparisons of trends/slopes
   object=m_02,        # for the model m_02
   specs="CLASS",      # compute comparisons over this predictor
   var="SURPRISAL_s"), # namely all slopes of this predictor
   adjust="none")      # do not adjust for multiple comparisons
contrast estimate SE df t.ratio p.value
closed - number 0.3263 0.1290 6373 2.533 0.0113
closed - open 0.0188 0.0285 6373 0.659 0.5099
closed - propname 0.3391 0.1090 6373 3.105 0.0019
closed - unclear 0.7174 0.2590 6373 2.765 0.0057
number - open -0.3075 0.1300 6373 -2.371 0.0178
number - propname 0.0129 0.1670 6373 0.077 0.9387
number - unclear 0.3912 0.2890 6373 1.355 0.1754
open - propname 0.3204 0.1100 6373 2.905 0.0037
open - unclear 0.6987 0.2600 6373 2.688 0.0072
propname - unclear 0.3783 0.2800 6373 1.349 0.1774
Now with glht … Imagine you were interested in whether the slope of SURPRISAL_s when CLASS is open is significantly different from when CLASS is unclear, but you wanted to solve this with glht. You will not be surprised to see that I demonstrate this in the same granular way as before: we define two (model) matrices, one for each of the two situations we want to compare, then subtract one from the other, and then give that difference (model) matrix to glht. And, thankfully, we already generated the relevant vectors above; I just copy and paste from above and assign proper names to them:
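slope.when.SUR.is.0.CLA.is.open <- c(0, 1, 0, 0, 0, 0, 0, 1, 0, 0) # slope of SURPRISAL_s, open
slope.when.SUR.is.0.CLA.is.uncl <- c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1) # slope of SURPRISAL_s, unclear
summary(glht(m_02, linfct=matrix(
   slope.when.SUR.is.0.CLA.is.open - slope.when.SUR.is.0.CLA.is.uncl, nrow=1)))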
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + SURPRISAL_s * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.6987 0.2599 2.688 0.0072 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
1.3 What does this mean for our regression summaries?
The above has at least some implications for how we might want to report the results of regression models. We saw how the summary table of m_02 provides maximally half of all the info one might want to provide with regard to the interaction SURPRISAL_s:CLASS:
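summary(m_02)$coef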
Given everything we’ve learned by now, we can do much better in two ways: First, we can represent the summary table in a format that makes it easier to digest for readers who may not always remember all your variables and, for factors, their reference levels, plus we can and probably should add confidence intervals. Thus, this is better:
|  | Estimate | 95%-CI | se | t | p2-tailed |
|---|---|---|---|---|---|
| Intercept (SURP=0 + CLASS=closed) | 6.1698 | [6.097, 6.242] | 0.0371 | 166.4236 | <0.0001 |
| SURP0→1 | 0.3927 | [0.359, 0.426] | 0.0170 | 23.1424 | <0.0001 |
| CLASSclosed→numb | 1.5894 | [0.882, 2.297] | 0.3608 | 4.4049 | <0.0001 |
| CLASSclosed→open | 0.7045 | [0.556, 0.853] | 0.0759 | 9.2811 | <0.0001 |
| CLASSclosed→propn | 2.1682 | [1.419, 2.917] | 0.3820 | 5.6761 | <0.0001 |
| CLASSclosed→uncl | 2.5707 | [1.147, 3.995] | 0.7265 | 3.5385 | 0.0004 |
| SURP0→1 + CLASSclosed→numb | -0.3263 | [-0.579, -0.074] | 0.1288 | -2.5333 | 0.0113 |
| SURP0→1 + CLASSclosed→open | -0.0188 | [-0.075, 0.037] | 0.0285 | -0.6590 | 0.5099 |
| SURP0→1 + CLASSclosed→propn | -0.3391 | [-0.553, -0.125] | 0.1092 | -3.1051 | 0.0019 |
| SURP0→1 + CLASSclosed→uncl | -0.7174 | [-1.226, -0.209] | 0.2595 | -2.7652 | 0.0057 |
But then, even if you provide effects plots, which I recommend always doing, it might be nice – (especially) for generalized linear models – to provide
the predicted means for the levels of CLASS (probably while SURPRISAL_s is at its mean);
the predicted slopes of SURPRISAL_s for each level of CLASS;
and – don’t forget those! – their confidence intervals.
Here are those results. First, we fix SURPRISAL_s to its mean and then compute the predictions for each level of CLASS:
| SURP=2.37 | CLASS: closed | CLASS: number | CLASS: open | CLASS: propname | CLASS: unclear |
|---|---|---|---|---|---|
| pred. DURA_l | 7.1008 | 7.9167 | 7.7608 | 8.4650 | 7.9705 |
| 95% CI | [7.0776, 7.1239] | [7.7604, 8.0729] | [7.7251, 7.7964] | [8.1993, 8.7307] | [7.6783, 8.2627] |
Then, we compute the slope of SURPRISAL_s for each level of CLASS:
|  | CLASS: closed | CLASS: number | CLASS: open | CLASS: propname | CLASS: unclear |
|---|---|---|---|---|---|
| slope of SURP | 0.3927 | 0.0664 | 0.3739 | 0.0536 | -0.3248 |
| 95% CI | [0.3594, 0.4259] | [-0.1839, 0.3167] | [0.3291, 0.4187] | [-0.1579, 0.2651] | [-0.8323, 0.1828] |
Tip
Something for you to check out: multcomp::glht actually also allows you to test specific hypotheses in a way that looks more like English; for instance, you can do a (now two-tailed) test of SURPRISAL_s being different from 0.425 like this: summary(glht(m_02, "SURPRISAL_s = 0.425")). Another function that is useful for stuff like this is marginaleffects::hypotheses.
1.4 What else is possible?
1.4.1 Orthogonal contrasts w/ glht
We can also define orthogonal contrasts with glht, because glht can really do anything. Let’s get the data set up again:
d$CLASS_orth <- d$CLASS
#                            clos  numb  open  prop  uncl
clo.op_vs_num.prop.unc <- c(+0.6, -0.4, +0.6, -0.4, -0.4)
clo_vs_op              <- c(+0.5,  0  , -0.5,  0  ,  0  )
num.prop_vs_uncl       <- c( 0  , -1/3,  0  , -1/3, +2/3)
num_vs_prop            <- c( 0  , +0.5,  0  , -0.5,  0  )
# & then we put those into a matrix like the above table & assign them as contrasts:
contrasts(d$CLASS_orth) <-   # make the following the contrasts of CLASS_orth
   cbind(                    # the matrix resulting from binding together as columns
      clo.op_vs_num.prop.unc, # all
      clo_vs_op,              # four
      num.prop_vs_uncl,       # contrast
      num_vs_prop)            # vectors
And let’s fit the two models:
summary(m_03_treatm <- lm(DURATION_l ~ 1 + CLASS     , data=d))$coef
summary(m_03_orthog <- lm(DURATION_l ~ 1 + CLASS_orth, data=d))$coef
1.4.1.1 The first orthogonal contrast
So how do we get the first contrast closed/open vs. number/propname/unclear with glht from m_03_treatm? Like this: we create a glht 1-row matrix that combines the predictions for when CLASS is closed or open into their unweighted mean, does the same for when CLASS is number, propname, or unclear, and subtracts one from the other:
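pred.closed   <- c(1, 0, 0, 0, 0) # the pred.* names here are just mnemonic
pred.number   <- c(1, 1, 0, 0, 0)
pred.open     <- c(1, 0, 1, 0, 0)
pred.propname <- c(1, 0, 0, 1, 0)
pred.unclear  <- c(1, 0, 0, 0, 1)
summary(glht(m_03_treatm, linfct=matrix(
   (pred.closed + pred.open)/2 -                   # unweighted mean of closed & open, minus
   (pred.number + pred.propname + pred.unclear)/3, # unweighted mean of the other three
   nrow=1)))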
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 -0.64244 0.05029 -12.77 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
And you can see, that’s right:
summary(m_03_orthog)$coef[2,]
Estimate Std. Error t value Pr(>|t|)
-6.424391e-01 5.029463e-02 -1.277351e+01 6.512723e-37
1.4.1.2 The second orthogonal contrast
So how do we get the second contrast closed vs. open (with glht from m_03_treatm)?
Actually, that’s a trick question: we don’t have to use glht because it’s already there (just with a different sign, but that doesn’t matter for the significance test):
summary(m_03_orthog)$coef[3,]
Estimate Std. Error t value Pr(>|t|)
-0.93901065 0.01960634 -47.89320883 0.00000000
summary(m_03_treatm)$coef[3,]
Estimate Std. Error t value Pr(>|t|)
0.93901065 0.01960634 47.89320883 0.00000000
So, don’t forget the old stuff because you’re so excited about the new stuff …
1.4.1.3 The third orthogonal contrast
So how do we get the third contrast number/propname vs. unclear with glht from m_03_treatm? Like this: We first create a glht 1-row matrix that combines the predictions for when CLASS is number or propname:
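num.prop <- (pred.number + pred.propname)/2 # unweighted mean of number & propname
summary(glht(m_03_treatm, linfct=matrix(num.prop - pred.unclear, nrow=1)))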
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.3933 0.1230 3.197 0.0014 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
And you can see, that’s right:
summary(m_03_orthog)$coef[4,]
Estimate Std. Error t value Pr(>|t|)
-0.393280161 0.123017647 -3.196941003 0.001395713
1.4.1.4 The fourth orthogonal contrast
And the last one, number vs. propname: We first create one glht 1-row matrix that computes the prediction for when CLASS is number, then another for when CLASS is propname, and then we compute the difference:
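summary(glht(m_03_treatm, linfct=matrix(pred.number - pred.propname, nrow=1)))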
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 -0.58079 0.09499 -6.114 1.03e-09 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
And you can see, that’s right:
summary(m_03_orthog)$coef[5,]
Estimate Std. Error t value Pr(>|t|)
-5.807882e-01 9.499252e-02 -6.114042e+00 1.028827e-09
So this is why I discussed glht so much: it’s very very powerful for all sorts of comparisons you can think of, in fact even for things you might not have considered yet, given how we proceeded – read on to find out.
1.4.2 Testing against specific values inside your model
We saw earlier – when we discussed the 1c/2c/3c scenarios – how we can use glht to compare values of our model coefficients directly against each other. We can also extend that a bit to ask more specific questions. For example, here are the coefficients of our ‘normal’ treatment contrast model again:
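summary(m_03_treatm)$coef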
Imagine that, based on previous studies, you had hypothesized earlier that the difference between closed-class words (the intercept) and proper names (i.e., coef(m_03_treatm)[4]) should be twice as big as the difference between closed-class words and open-class words (i.e., coef(m_03_treatm)[3]). We can check this kind of hypothesis with glht as well and I’m doing it super stepwise again:
closed_vs_open     <- matrix(c(0,0,1,0,0), nrow=1)
closed_vs_propname <- matrix(c(0,0,0,1,0), nrow=1)
summary(glht(m_03_treatm,
   "-"(                      # is the difference between
      closed_vs_propname,    # closed vs. proper names
      2 * closed_vs_open)))  # twice as big as closed vs. open
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 -0.34459 0.07709 -4.47 7.97e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
The result is significant – what does that mean? It means that
the difference between closed-class words and proper names on the one hand and
two times the difference between closed- and open-class words on the other
is significantly different from 0, meaning the ratio of
the difference between closed-class words and proper names on the one hand and
the difference between closed- and open-class words on the other
is significantly different from 2. What is the ratio? This:
coef(m_03_treatm)[4] / coef(m_03_treatm)[3]
CLASSpropname
1.633029
…, meaning, for the hypothesis to work out as expected, either the difference between closed- and open-class words is too big, or the difference between closed-class words and proper names is too small (or both), something we were not able to test easily before.
1.4.3 Testing against specific values outside your model
One final really nice thing we can do is this: So far, we always ‘only’ tested whether a certain coefficient is significantly different from 0 or whether the sum of some painstakingly-defined linear combinations of coefficients is significantly different from 0. But we can also test whether a certain coefficient or the sum of some painstakingly-defined linear combinations of coefficients is different from some other value.
Let’s first check out a very simple (monofactorial) model for that:
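summary(m_04 <- lm(DURATION_l ~ 1 + SURPRISAL_s, data=d))$coef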
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.921625 0.03118183 189.90626 0
SURPRISAL_s 0.601603 0.01255319 47.92431 0
Imagine a previous study reported that the slope of SURPRISAL_s was 0.625. However, you actually expected this slope to be lower than that, which is why you were happy to see that, in your data, the slope of SURPRISAL_s is indeed smaller: it’s not 0.625, it’s 0.601603. But is that difference significant? How can we address this?
There are three ways one might consider. The first is very straightforward, but doesn’t always help; we can try and use the confidence interval (and here we would of course not even need glht for that):
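confint(m_04)[2,] # the (2-tailed) 95%-CI of the slope of SURPRISAL_s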
This confidence interval of our slope does include 0.625 – but why is this wrong?
We actually want to look at this in a one-tailed way – remember, we expected the slope of SURPRISAL_s for closed class words to be less than 0.625. For this, we could of course adjust the level of the confidence interval:
confint(glht(                # show the confint of a glht, namely
   m_04,                     # for the sum of the product of the coeffs of m_04
   matrix(c(0,1), nrow=1)),  # and these values
   level=0.9)                # cut off 5% on each side, not 2.5%
Now, the confidence interval of our smaller estimate does not include the previous value of 0.625: given our expectation, our estimate of 0.601603 is indeed significantly smaller than that of the previous study. But how do we get a proper p-value? Here’s the second way: We use summary(glht(...)) again, but add two arguments to it:
the argument rhs, whose default value is 0, which is why all our tests so far were set up to test against whether something is different from 0; now we set that to the value we want to test against, 0.625;
the well-known argument alternative whose default value is, as it is in general, two.sided, but now we set it to less because we expect the coefficient to be less than the rhs we are testing against:
summary(glht(               # show the summary of a glht, namely
   m_04,                    # check whether the sum of the coeffs of m_04
   matrix(c(0,1), nrow=1),  # multiplied by this
   rhs=0.625,               # is significantly different from 0.625, namely
   alternative="less"))     # by being smaller
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + SURPRISAL_s, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(<t)
1 >= 0.625 0.60160 0.01255 -1.864 0.0312 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
What this does is this:
(summary(m_04)$coef[2,"Estimate"] - 0.625) /
   summary(m_04)$coef[2,"Std. Error"] # t-score of glht
[1] -1.863831
pt(q=-1.863831, df=m_04$df.residual, lower.tail=TRUE) # 1-tailed p-value of glht
[1] 0.03119567
Now we have all we want:² the one-tailed 95%-confidence interval of 0.601603 does not overlap with 0.625 and we have a proper p-value as well; we can report that, as expected, the slope of SURPRISAL_s is indeed significantly smaller than the value of 0.625 reported in the previous study.
A third way that can also work involves model comparison. We have a model m_04 where the slope of SURPRISAL_s is 0.601603, but we can also create a model m_05, in which the slope of SURPRISAL_s is fixed to 0.625 (using offset), and then we can use anova for one of our usual model comparisons:
m_05 <- lm(DURATION_l ~ 1 +
   offset(0.625*SURPRISAL_s), # here we fix the slope of SURPRISAL_s
   data=d)
anova(m_04, m_05, test="F")
Analysis of Variance Table
Model 1: DURATION_l ~ 1 + SURPRISAL_s
Model 2: DURATION_l ~ 1 + offset(0.625 * SURPRISAL_s)
Res.Df RSS Df Sum of Sq F Pr(>F)
1 6381 3524.5
2 6382 3526.4 -1 -1.9187 3.4739 0.06239 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
While this is a two-tailed comparison, these are the same results as from the summary(glht(...)):
the p-value here is twice glht’s p-value;
the F-score here is glht’s t-value squared.
The exact same logic applies to more complex models; let’s look at m_02 again:
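summary(m_02)$coef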
Imagine a previous study reported that the slope of SURPRISAL_s for numbers was 0.3 (and significantly different from 0). If you actually expected to find in your data that this slope
would be significantly lower than 0.3, and in fact
would not be significantly different from 0,
how could you explore and test this?
# question 1:
summary(glht(              # show the summary of a glht, namely
   m_02,                   # check whether the sum of the coeffs of m_02
   matrix(c(0,1,0,0,0,0,1,0,0,0), nrow=1), # multiplied by this
   rhs=0.3,                # is significantly different from 0.3, namely
   alternative="less"))    # by being smaller
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + SURPRISAL_s * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(<t)
1 >= 0.3 0.06642 0.12766 -1.83 0.0337 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
# question 2:
confint(glht(              # show the confint of a glht, namely
   m_02,                   # for the sum of the product of the coeffs of m_02
   matrix(c(0,1,0,0,0,0,1,0,0,0), nrow=1)), # and these values
   level=0.9)              # cut off 5% on each side, not 2.5%
summary(glht(              # show the summary of a glht, namely
   m_02,                   # check whether the sum of the coeffs of m_02
   matrix(c(0,1,0,0,0,0,1,0,0,0), nrow=1), # multiplied by this
   alternative="less"))    # by being smaller
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + SURPRISAL_s * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(<t)
1 >= 0 0.06642 0.12766 0.52 0.699
(Adjusted p values reported -- single-step method)
Tip
See Faraway (2015, Section 3.2) for some more discussion.
1.4.4 Model matrices from effects
Ok, after all this painful manual defining of ‘model matrices’ to give to glht, you will be glad to hear there is one bit of relief I can offer. Remember the above 2c case where we wanted to find out whether the following two predictions are significantly different from each other:
when SURPRISAL_s is at its mean and CLASS is propname;
when SURPRISAL_s is at its mean and CLASS is unclear.
Above, we did this and I think we can all agree that defining SUR.is.mean.CLA.is.prop and SUR.is.mean.CLA.is.uncl is less than fun:
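summary(glht(m_02, linfct=matrix(
   SUR.is.mean.CLA.is.prop - SUR.is.mean.CLA.is.uncl, nrow=1)))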
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + SURPRISAL_s * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.4945 0.2015 2.455 0.0141 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
Here’s the good news, which comes in two steps that will simplify things considerably. Remember that we already defined nd above as a data frame with all combinations of potentially interesting predictor values: all levels of categorical predictors and c(0, 1, mean) for all numeric predictors; I am reducing it here to just the two columns with the predictor combinations:
(nd <- nd[,1:2])
CLASS SURPRISAL_s
1 closed 0.000000
2 closed 1.000000
3 closed 2.370868
4 number 0.000000
5 number 1.000000
6 number 2.370868
7 open 0.000000
8 open 1.000000
9 open 2.370868
10 propname 0.000000
11 propname 1.000000
12 propname 2.370868
13 unclear 0.000000
14 unclear 1.000000
15 unclear 2.370868
Step 1 of the simplification is realizing that we can use a version of nd as the argument xlevels to effect. Remember how we always define our effects object?
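Like this, only that we now also hand over the relevant predictor values (a sketch; the object name sur_cla is just a label):

sur_cla <- effect(
   "SURPRISAL_s:CLASS", m_02,
   xlevels=list(SURPRISAL_s=unique(nd$SURPRISAL_s))) # the SURPRISAL_s values from nd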
That means, once we give effect an nd object, it can do the work of defining model matrix vectors for us. If we’re interested in the prediction when SURPRISAL_s is at its mean and CLASS is propname, we don’t need to manually develop this anymore like this …
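# a sketch, assuming the rows of the effect object's model matrix line up
# with nd (check sur_cla$x): row 12 is propname @ mean, row 15 is unclear @ mean
summary(glht(m_02, linfct=
   sur_cla$model.matrix[12,,drop=FALSE] -
   sur_cla$model.matrix[15,,drop=FALSE]))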
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + SURPRISAL_s * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.4945 0.2015 2.455 0.0141 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
In other words, all we need to do is give effect the combinations of things we might be interested in and then pick the right row(s) from its model matrix.
This also works for slopes. Remember the above 3c case where we wanted to find out whether the slope of SURPRISAL_s when CLASS is open is significantly different from when CLASS is unclear? Above, we did this, and I think we can all agree that defining slope.when.SUR.is.0.CLA.is.open and slope.when.SUR.is.0.CLA.is.uncl was kinda awful:
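# the same sketch for slopes: within each CLASS, the slope is the difference
# between the rows for SURPRISAL_s=1 and SURPRISAL_s=0 (rows as in nd):
slope.open <- sur_cla$model.matrix[8,]  - sur_cla$model.matrix[7,]  # open
slope.uncl <- sur_cla$model.matrix[14,] - sur_cla$model.matrix[13,] # unclear
summary(glht(m_02, linfct=matrix(slope.open - slope.uncl, nrow=1)))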
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + SURPRISAL_s * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.6987 0.2599 2.688 0.0072 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
I hope you see how big a simplification this is; it’s a good enough reason to always define an nd object and consider effects for your follow-up analyses.
Interpretation: The more surprising a word is, the longer its duration, but that effect is significantly stronger for words with just one vowel than for words with two+ vowels. But here are some questions for you:
what is the slope of SURPRISAL_sc for closed-class words with one vowel? (question type 3a)
what is the slope of SURPRISAL_sc for closed-class words with two+ vowels? (question type 3a)
is the difference between the two significant? What is its p-value? What is its confidence interval? (question type 3c)
For the slope of SURPRISAL_sc for closed-class words with one vowel, we can use nd:
nd <- expand.grid(
   SURPRISAL_sc=0:1,            # main values of SURPRISAL_sc
   CLASS=levels(d$CLASS),       # all levels of CLASS
   VOWELS=levels(d$VOWELS)) %>% # all levels of VOWELS
   "["(,3:1)                    # reorder columns
nd <- cbind(nd, PREDS=predict(
   m_final, newdata=nd, interval="confidence"))
colnames(nd)[4:6] <- c("fit", "lwr", "upr")
# 1: slope when CLASS is "closed" & when VOWEL is "one"
# this is a scenario 3a question
nd$fit[2] - nd$fit[1]
[1] 0.4403235
coef(m_final)[2]
SURPRISAL_sc
0.4403235
But we can now also solve this with glht.
# 1: slope when CLASS is "closed" & when VOWEL is "one"
# this is a scenario 3a question
CLASS.is.clos.N.SURPR.is.0.N.VOW.is.1 <- c(1,0,0,0,0,0,0,0,0,0,0,0,0)
CLASS.is.clos.N.SURPR.is.1.N.VOW.is.1 <- c(1,1,0,0,0,0,0,0,0,0,0,0,0)
("-"(
   CLASS.is.clos.N.SURPR.is.1.N.VOW.is.1,
   CLASS.is.clos.N.SURPR.is.0.N.VOW.is.1)) %>%
   matrix(nrow=1) %>%
   glht(m_final, .) %>% summary
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + (SURPRISAL_sc + VOWELS + CLASS)^2,
data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.44032 0.01884 23.37 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
For the slope of SURPRISAL_sc for closed-class words with two+ vowels, we can also use nd:
# 2: slope when CLASS is "closed" & when VOWEL is "two+"
# this is a scenario 3a question
nd$fit[10] - nd$fit[9]
[1] 0.2599488
coef(m_final)[2] + coef(m_final)[7]
SURPRISAL_sc
0.2599488
But we can now also solve this with glht.
# 2: slope when CLASS is "closed" & when VOWEL is "two+"
# this is a scenario 3a question
CLASS.is.clos.N.SURPR.is.0.N.VOW.is.2p <- c(1,0,1,0,0,0,0,0,0,0,0,0,0)
CLASS.is.clos.N.SURPR.is.1.N.VOW.is.2p <- c(1,1,1,0,0,0,1,0,0,0,0,0,0)
("-"(
   CLASS.is.clos.N.SURPR.is.1.N.VOW.is.2p,
   CLASS.is.clos.N.SURPR.is.0.N.VOW.is.2p)) %>%
   matrix(nrow=1) %>%
   glht(m_final, .) %>% summary
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + (SURPRISAL_sc + VOWELS + CLASS)^2,
data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.25995 0.02855 9.105 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
Finally we ask: Is the difference between the two significant? What is its p-value? What is its confidence interval? Above we did this, …
# 3: difference between the two: p and confint
# this is a scenario 3c question
summary(m_final)$coefficients[7,]
Estimate Std. Error t value Pr(>|t|)
-1.803747e-01 3.130768e-02 -5.761354e+00 8.736560e-09
pairs(emtrends(            # compute pairwise comparisons of trends/slopes
   object=m_final,         # for the model m_final
   specs=~ VOWELS | CLASS, # compute comparisons over this
   var="SURPRISAL_sc"),    # namely all slopes of this predictor
   adjust="none")[1,]      # do not adjust for multiple comparisons
contrast CLASS estimate SE df t.ratio p.value
one - (two+) closed 0.18 0.0313 6329 5.761 <0.0001
With glht, we can also use something that, if done stepwise, is quite a monstrosity:
# 3: difference between the two: p and confint
# this is a scenario 3c question
"-"(    # compute the difference between 2 slopes, namely the
   "-"( # slope when CLASS is "closed" & when VOWEL is "two+"
      CLASS.is.clos.N.SURPR.is.1.N.VOW.is.2p,
      CLASS.is.clos.N.SURPR.is.0.N.VOW.is.2p),
   "-"( # slope when CLASS is "closed" & when VOWEL is "one"
      CLASS.is.clos.N.SURPR.is.1.N.VOW.is.1,
      CLASS.is.clos.N.SURPR.is.0.N.VOW.is.1)) %>%
   # put that into a difference (model) matrix:
   matrix(nrow=1) -> diff_matrix
glht(m_final, diff_matrix) %>% summary
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + (SURPRISAL_sc + VOWELS + CLASS)^2,
data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 -0.18037 0.03131 -5.761 8.74e-09 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
Interpretation: The more surprising a word is, the longer its duration is, but that effect is not the same across all classes of words. Specifically and controlling for other predictors,
the slope of SURPRISAL_sc seems to be higher for open-class words than for closed-class words;
the slope of SURPRISAL_sc seems to be very flat for numbers and proper names.
But here are the questions for you we asked in week 1:
what is the slope of SURPRISAL_sc for closed-class words with one vowel? (question type 3a)
what is the slope of SURPRISAL_sc for open-class words with one vowel? (question type 3a)
is the difference between the two significant? What is its p-value? What is its confidence interval? (question type 3c)
what is the slope of SURPRISAL_sc for numbers with one vowel? (question type 3a)
what is the slope of SURPRISAL_sc for proper names with one vowel? (question type 3a)
is the difference between the two significant? What is its p-value? What is its confidence interval? (question type 3c)
are these two slopes significantly different from 0? what are the p-values? (question type 3b)
The first one we already dealt with, so let’s now tackle the rest.
# 2: slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "open"
# this is a scenario 3a question
nd$fit[6] - nd$fit[5]
[1] 0.5151296
coef(m_final)[2] + coef(m_final)[9]
SURPRISAL_sc
0.5151296
# now with glht:
CLASS.is.open.N.SURPR.is.0.N.VOW.is.1 <- c(1,0,0,0,1,0,0,0,0,0,0,0,0)
CLASS.is.open.N.SURPR.is.1.N.VOW.is.1 <- c(1,1,0,0,1,0,0,0,1,0,0,0,0)
("-"(
   CLASS.is.open.N.SURPR.is.1.N.VOW.is.1,
   CLASS.is.open.N.SURPR.is.0.N.VOW.is.1)) %>%
   matrix(nrow=1) %>%
   glht(m_final, .) %>% summary
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + (SURPRISAL_sc + VOWELS + CLASS)^2,
data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.51513 0.03369 15.29 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
# 3: difference between the two: p and confint
# this is a scenario 3c question
"-"(    # compute the difference between 2 slopes, namely the
   "-"( # slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "open"
      CLASS.is.open.N.SURPR.is.1.N.VOW.is.1,
      CLASS.is.open.N.SURPR.is.0.N.VOW.is.1),
   "-"( # slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "closed"
      CLASS.is.clos.N.SURPR.is.1.N.VOW.is.1,
      CLASS.is.clos.N.SURPR.is.0.N.VOW.is.1)) %>%
   # put that into a difference (model) matrix:
   matrix(nrow=1) -> diff_matrix
glht(m_final, diff_matrix) %>% summary
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + (SURPRISAL_sc + VOWELS + CLASS)^2,
data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.07481 0.03283 2.278 0.0227 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
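For #4 and #5 we need the analogous vectors for numbers and proper names (we will need them again for #6 below); the vectors here are reconstructed to follow the same coefficient order as above, and we then test, e.g., the slope for proper names against 0:

# 4 & 5: slopes of SURPRISAL_sc when VOWEL is "one" & CLASS is "number"/"propname"
# these are scenario 3a/3b questions; vectors reconstructed to match the order above
CLASS.is.numb.N.SURPR.is.0.N.VOW.is.1 <- c(1,0,0,1,0,0,0,0,0,0,0,0,0)
CLASS.is.numb.N.SURPR.is.1.N.VOW.is.1 <- c(1,1,0,1,0,0,0,1,0,0,0,0,0)
CLASS.is.prop.N.SURPR.is.0.N.VOW.is.1 <- c(1,0,0,0,0,1,0,0,0,0,0,0,0)
CLASS.is.prop.N.SURPR.is.1.N.VOW.is.1 <- c(1,1,0,0,0,1,0,0,0,1,0,0,0)
("-"(
   CLASS.is.prop.N.SURPR.is.1.N.VOW.is.1,
   CLASS.is.prop.N.SURPR.is.0.N.VOW.is.1)) %>%
   matrix(nrow=1) %>%
   glht(m_final, .) %>% summary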
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + (SURPRISAL_sc + VOWELS + CLASS)^2,
data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.07743 0.10760 0.72 0.472
(Adjusted p values reported -- single-step method)
# 6: this is a scenario 3c question
"-"(    # compute the difference between 2 slopes, namely the
   "-"( # slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "number"
      CLASS.is.numb.N.SURPR.is.1.N.VOW.is.1,
      CLASS.is.numb.N.SURPR.is.0.N.VOW.is.1),
   "-"( # slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "propname"
      CLASS.is.prop.N.SURPR.is.1.N.VOW.is.1,
      CLASS.is.prop.N.SURPR.is.0.N.VOW.is.1)) %>%
   matrix(nrow=1) -> diff_matrix
glht(m_final, diff_matrix) %>% summary
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + (SURPRISAL_sc + VOWELS + CLASS)^2,
data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.03833 0.16664 0.23 0.818
(Adjusted p values reported -- single-step method)
# 7: this is a scenario 3b question
# slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "number": see 4 above
# slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "propname": see 5 above
1.5.2 A final nice example
For the final example, we will briefly load a different model for a completely different data set. This data set was actually analyzed with a mixed-effects model, but since we haven’t dealt with those yet, we will make this a fixed-effects model only. In this example, the response variable is GENITIVE (of vs. s) and the model we’re discussing here has the following predictors:
LANG: the L1 of the speaker: english vs. chinese vs. german;
POSSORANIM: whether the possessor of the genitive is animate or inanimate (because the former prefer s-genitives much more than the latter);
POSSORNUMBER: whether the possessor of the genitive is singular, plural or irregplural (because regular plurals disprefer s-genitives much more than the others);
LENGTHNPDIFF: a variable quantifying the length difference between possessor and possessed: positive values should go with s-genitives, negative values should go with of-genitives.
From the output of the model m, you can see that the interaction LANG:POSSORANIM is significant, and the plot shows the effects predictions for it:
Clearly, for all speaker L1s, animate possessors lead to significantly more s-genitives than inanimate possessors (we wouldn’t even need emmeans for that: the confidence intervals between a and i within each level of L1 never overlap). Just as clearly, the Chinese learners behave differently from the native speakers and the German learners because, for them, the difference between animate and inanimate possessors is much smaller than for the native speakers and the German learners. Here are questions for you to answer (with glht!):
is the prediction for inanimate possessors for the Chinese learners significantly higher than that for the native speakers?
is the prediction for inanimate possessors for the Chinese learners significantly higher than that for the German learners?
Is the difference between a and i for the German learners significantly different from the difference between a and i for the Chinese learners?
To make things easier for you, do your calculations only for when LENGTHNPDIFF is 0 and when POSSORNUMBER is singular; those will numerically be somewhat different from the above effects estimate, but conceptually similar (because I doubt you want to spend time on making these comparisons for the plot above, for which LENGTHNPDIFF would have to be at its mean and POSSORNUMBER at its proportional representation ;-) – but if you want to do this with the effects object’s model matrix, be my guest).
¹ Two other arguments of glht, which we are currently using at their default settings, will be tweaked later.
² I may misunderstand things (!) but I think the output of summary(glht(...)) gets the row names of the output wrong: I think it should be “1 <= 0.625”.
Source Code
---title: "Ling 204: session 03"author: - name: "[Stefan Th. Gries](https://www.stgries.info)" affiliation: - UC Santa Barbara - JLU Giessen orcid: 0000-0002-6497-3958date: "2026-01-21 12:34:56 PDT"date-format: "DD MMM YYYY HH-mm-ss"editor: sourceformat: html: page-layout: full code-fold: false code-link: true code-copy: true code-tools: true code-line-numbers: true code-overflow: scroll number-sections: true smooth-scroll: true toc: true toc-depth: 4 number-depth: 4 toc-location: left monofont: lucida console tbl-cap-location: top fig-cap-location: bottom fig-width: 4 fig-height: 4 fig-format: png fig-dpi: 300 fig-align: center embed-resources: true link-external-newwindow: trueexecute: cache: false echo: true eval: true warning: false---# Session 03: Coefficient comparisons 2 {#sec-s03}## Theoretical introductionToday, we deal with selected applications of a very powerful notion, **general linear hypothesis tests**, a framework that allows us to answer *all questions from all sets* (and others) regardless of whether we're looking at means (of/with) means or slopes. To approach this framework (and we will mostly discuss its simplest kinds of application), we use the duration data ...```{r}rm(list=ls(all.names=TRUE))pkgs2load <-c("car", "effects", "emmeans", "magrittr", "multcomp")suppressMessages(sapply(pkgs2load, library,character.only=TRUE, logical.return=TRUE, quietly=TRUE)); rm(pkgs2load)d <-read.delim( # summarize d, the result of loadingfile="_input/wdurations.csv", # this filestringsAsFactors=TRUE) # change categorical variables into factorsd$DURATION_l <-log2(d$DURATION) # as befored$SURPRISAL_s <-sqrt(d$SURPRISAL) # this time no centeringsummary(d)```... and simplify `m_01` a little for didactic simplicity:```{r}summary(m_01 <-lm(DURATION_l ~1+ VOWELS*CLASS, data=d))$coef```Let's begin with a very simple (and admittedly quite artificial) example. Imagine you had `m_01` in R's working memory but only knew the function `coef` but not the function `summary`. In other words, you could do this ...```{r}coef(m_01)```... but you aren't able to get the *p*-value for the 2nd coefficient (for `VOWELStwo+`), i.e. the *p*-value of `r summary(m_01)$coef[2,4]` which tells you that that coefficient is significantly different from 0; put differently, from the `coef` output you can't see whether the predicted mean for `DURATION_l` when `CLASS` is *closed* and `VOWELS` is *one* is significantly different from when `CLASS` is *closed* and `VOWELS` is *two+*.For this, and a ton of other, more interesting questions, the function `multcomp::glht` can help. For how we will use it now, it requires two arguments:^[Two other arguments of `glht`, which we are currently using at their default settings, will be tweaked later.]1. a model object (here, `m_01`) and2. a 1-row matrix that contains as many numbers as the model has coefficients (here, `length(coef(m_01))` =`r length(coef(m_01))`).You can consider this 1-row matrix to be the equivalent of one row of the model's model matrix (as retrieved with `model.matrix`). As a reminder, the **model frame** of a model is all the rows and columns from the original data that are used in the model and the **model matrix** of a model is a matrix that* has as many rows as the model frame;* has as many columns as the model has coefficients ...```{r}head(mf <-model.frame(m_01)) # the model frame of m_01head(mm <-model.matrix(m_01)) # the model matrix of m_01```.. 
because the prediction the model makes for each case of the model frame is the summed product of the row of the model matrix and the model's coefficients. Let's look at case 5:```{r}#| collapse: truemf[5,] # row 5 of the model framemm[5,] # row 5 of the model matrixsum( # the summed"*"( # product of mm[5,], # the row of the model matrix &coef(m_01) # the model's coefficients))predict(m_01)[5] %>% unname```What `glht` does is,1. it also multiplies each coefficient of the model (i.e., the values of `coef(m_01)`) with the corresponding number in the 1-row version of the model matrix you create ;2. it also sums up all those products; but then3. with the function `summary`, we can then test whether that sum is significantly different from some value (default: 0);4. with the function `confint`, we can compute the confidence interval of that sum.Here is this procedure shown in a stepwise fashion: We generate a vector I'm calling `mm_index` with 10 values and we do that such that, when each of the 10 values of `mm_index` is multiplied with the corresponding value of `coef(m_01)`, the sum of these 10 products is the second coefficient, for which we want to know whether it's significantly different from 0. Here's the vector `mm_index`, which I put into a data frame just to easily represent how it lines up with the coefficients:```{r}mm_index <-c(0,1,0,0,0,0,0,0,0,0)(temp <-data.frame(COEF=coef(m_01),INDX=mm_index))```Steps 1 and 2 of a `glht` applications are then this:```{r}(temp$COEF * temp$INDX) %>%# step 1sum # step 2```And you can see we did this right, because, trivially, this is indeed the value of `coef(m_01)[2]`. To do this with `glht`, we create a `glht` object, which I just call `qwe`:```{r}# step 1 & step 2 w/ glht(qwe <-glht(model=m_01,linfct=matrix(mm_index, nrow=1)))```We again see the desired value: `glht` did what we said it did. And now we can use `qwe` as an argument to `summary` and `confint` to get* a significance test of whether that value of `r round(sum(temp$COEF * temp$INDX), 4)` is significantly different from 0;* the 95% confidence interval of that value of `r round(sum(temp$COEF * temp$INDX), 4)`:```{r}summary(qwe) # try: summary(qwe)$test[3:6] %>% unlistconfint(qwe) # try: confint(qwe)$confint```Ideally, you'd immediately write a function like `glht.summy` that takes as input a `glht` object and then returns both the relevant `summary` and `confint` contents:```{r}#| echo: falseglht.summy <-function (a.glht.object, roundy=5) {setNames(round(c(confint(qwe)$confint, unlist(summary(qwe)$test[4:6])), roundy),c("estimate", "lower", "upper", "se", "t/z", "p"))}``````{r}glht.summy(qwe)```Of course, given everything we know already -- we know `summary` and `confint` -- these particular results are boring, we could have gotten them like this:```{r}c(round(summary(m_01)$coef[2,], 5), round(confint(m_01)[2,], 5))```But the point is to abstract away from this specific example because, with what we've just learned, we can now give to `glht` *any* 1-row matrix codifying *any* specific combination of predictor values (!), and it will always do that above kind of pairwise multiplication, summing up, and testing that sum for how it differs from 0. That means, we can answer *all* questions of *all* questions sets (even question set 3) from above with just `glht`, and that's what we'll practice now.## Practical applicationsQuick reminder of the 9 scenarios:| | **a**: what are the means/slopes? | **b**: are they * diff from 0? | **c**: are they * diff from each other? 
|------|------|------|------|
| **1**: means for (combos of) levels of cat preds | scenario 1a | scenario 1b | scenario 1c |
| **2**: means for values of numeric preds | scenario 2a | scenario 2b | scenario 2c |
| **3**: slopes for (combos of) levels of cat preds | scenario 3a | scenario 3b | scenario 3c |

: 9 scenarios relevant for post-hoc exploration of models {#tbl-threeTIMESthree}

### Scenarios 1a & 1b

Scenario 1a meant to ask the following question: What is each of the 10 predicted means of `DURATION_l` for when you cross the 2 levels of `VOWELS` and the 5 levels of `CLASS`? And scenario 1b meant to ask whether each of the 10 means computed above is significantly different from 0 (and what the exact *p*-value is). We discussed last week that one of the simplest ways to get the information for scenario 1a is with `nd`:

```{r}
nd <- expand.grid(
   CLASS=levels(d$CLASS),
   VOWELS=levels(d$VOWELS))
(nd <- cbind(nd,
   predict(m_01, newdata=nd, interval="confidence")))
data.frame(effect("VOWELS:CLASS", m_01))
data.frame(emmeans(m_01, specs=~ VOWELS|CLASS))
```

But then we said we can't really answer question set 1b without stupid releveling -- well, now we can, even if it will at first be painful (but powerful!). But now you will finally realize why I always placed so much emphasis on you understanding what the coefficients mean and how we use them to compute predictions.

The intercept -- when `CLASS` is *closed* and `VOWELS` is *one* -- is very easy:

```{r}
summary(m_01)$coef[1,]
nd[1,]
summary(glht(model=m_01, linfct=matrix(c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0), nrow=1)))
```

The prediction for when `CLASS` is *number* and `VOWELS` is *one* requires that we add to the intercept the coefficient for `CLASSnumber`:

```{r}
summary(m_01)$coef[c(1,3),]
nd[2,]
summary(glht(model=m_01, linfct=matrix(c(1, 0, 1, 0, 0, 0, 0, 0, 0, 0), nrow=1)))
```

The prediction for when `CLASS` is *open* and `VOWELS` is *one* requires that we add to the intercept the coefficient for `CLASSopen`:

```{r}
summary(m_01)$coef[c(1,4),]
nd[3,]
summary(glht(model=m_01, linfct=matrix(c(1, 0, 0, 1, 0, 0, 0, 0, 0, 0), nrow=1)))
```

The prediction for when `CLASS` is *propname* and `VOWELS` is *one* requires that we add to the intercept the coefficient for `CLASSpropname`:

```{r}
summary(m_01)$coef[c(1,5),]
nd[4,]
summary(glht(model=m_01, linfct=matrix(c(1, 0, 0, 0, 1, 0, 0, 0, 0, 0), nrow=1)))
```

The prediction for when `CLASS` is *unclear* and `VOWELS` is *one* requires that we add to the intercept the coefficient for `CLASSunclear`:

```{r}
summary(m_01)$coef[c(1,6),]
nd[5,]
summary(glht(model=m_01, linfct=matrix(c(1, 0, 0, 0, 0, 1, 0, 0, 0, 0), nrow=1)))
```

As you can see, the positions of the 1s in each (model) matrix we give `glht` correspond to the row indices of the required coefficients in the model summary.
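By the way, since those 1s simply sit at the positions of the required coefficients, nothing stops you from letting R build such vectors from the coefficients' names. Here is a minimal sketch of that idea -- note that `make_lincomb` is just a little helper I am making up on the spot, not a function from `multcomp`:

```{r}
make_lincomb <- function (a.model, wanted) { # a made-up convenience helper
   cf.names <- names(coef(a.model)) # all coefficient names of the model
   matrix(as.numeric(cf.names %in% wanted), # 1 for each wanted coefficient, 0 otherwise
      nrow=1)
}
# the same test as the last one above (CLASS is unclear, VOWELS is one):
summary(glht(m_01, linfct=make_lincomb(m_01, c("(Intercept)", "CLASSunclear"))))
```

For the *two+* cases below, you would simply include the names of the required interaction coefficients in `wanted` as well.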
But now it gets a bit more complex because we turn to `VOWELS` being *two+*, which means we need to consider more coefficients.

The prediction for when `CLASS` is *closed* and `VOWELS` is *two+* requires that we add to the intercept the coefficient for `VOWELStwo+`:

```{r}
summary(m_01)$coef[c(1,2),]
nd[6,]
summary(glht(model=m_01, linfct=matrix(c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0), nrow=1)))
```

The prediction for when `CLASS` is *number* and `VOWELS` is *two+* requires that we add to the intercept the coefficient for `CLASSnumber`, the coefficient for `VOWELStwo+`, and the one for their interaction:

```{r}
summary(m_01)$coef[c(1,2,3,7),]
nd[7,]
summary(glht(model=m_01, linfct=matrix(c(1, 1, 1, 0, 0, 0, 1, 0, 0, 0), nrow=1)))
```

The prediction for when `CLASS` is *open* and `VOWELS` is *two+* requires that we add to the intercept the coefficient for `CLASSopen`, the coefficient for `VOWELStwo+`, and the one for their interaction:

```{r}
summary(m_01)$coef[c(1,2,4,8),]
nd[8,]
summary(glht(model=m_01, linfct=matrix(c(1, 1, 0, 1, 0, 0, 0, 1, 0, 0), nrow=1)))
```

The prediction for when `CLASS` is *propname* and `VOWELS` is *two+* requires that we add to the intercept the coefficient for `CLASSpropname`, the coefficient for `VOWELStwo+`, and the one for their interaction:

```{r}
summary(m_01)$coef[c(1,2,5,9),]
nd[9,]
summary(glht(model=m_01, linfct=matrix(c(1, 1, 0, 0, 1, 0, 0, 0, 1, 0), nrow=1)))
```

The prediction for when `CLASS` is *unclear* and `VOWELS` is *two+* requires that we add to the intercept the coefficient for `CLASSunclear`, the coefficient for `VOWELStwo+`, and the one for their interaction:

```{r}
summary(m_01)$coef[c(1,2,6,10),]
nd[10,]
summary(glht(model=m_01, linfct=matrix(c(1, 1, 0, 0, 0, 1, 0, 0, 0, 1), nrow=1)))
```

Yes, it's cumbersome, but if you know what the coefficients mean, it's also not *that* bad, and this approach answers all questions from scenarios 1a and 1b, which will become especially interesting when we turn to slopes, i.e. scenarios 3a and 3b.

### Scenario 1c

Scenario 1c meant to ask whether (some? all?) differences between the 10 means are significant. Last week we used `pairs(emmeans(...))`, and that is probably the best way to proceed, but to really drive home the point of how powerful `glht` is, let's also look at `glht`-based comparisons between means. Imagine you were interested in answering the following question: Is the prediction for when `VOWELS` is *two+* and `CLASS` is *propname* significantly different from when `VOWELS` is *two+* and `CLASS` is *unclear*? With `emmeans`, this is what we'd do ...

```{r}
#| results: hold
nd[9,"fit"] - nd[10,"fit"]
qwe <- pairs(emmeans( # compute pairwise comparisons of means
   object=m_01, # for the model m_01
   specs=~ CLASS | VOWELS), # of CLASS within each level of VOWELS
   adjust="none") # do not adjust for multiple comparisons
qwe[20,]
```

... but how could we do this with `glht`? One way to do this is *very* stepwise, but that probably also makes it the easiest to comprehend. You

1. define two (model) matrices, one for each of the two situations that you want to compare;
2. subtract one from the other;
3. give that difference (model) matrix to `glht`.

And, thankfully, we already generated the relevant matrices above; I just copy and paste from above and assign proper names to them:

```{r}
# step 1:
VOW.is.twoplus.CLA.is.prop <- c(1, 1, 0, 0, 1, 0, 0, 0, 1, 0)
VOW.is.twoplus.CLA.is.uncl <- c(1, 1, 0, 0, 0, 1, 0, 0, 0, 1)
# step 2:
(diff.betw.them <- VOW.is.twoplus.CLA.is.prop - VOW.is.twoplus.CLA.is.uncl)
# step 3:
summary(glht(model=m_01, linfct=matrix(diff.betw.them, nrow=1)))
```

You can see that this is the right result by comparing this output to the above result from `emmeans` and to the difference in predictions from `nd`. Again, yes, it seems cumbersome, but you have to remember that, since you define the (model) matrices, any comparison you're interested in is suddenly at your fingertips. As mentioned above, this will become especially interesting when we turn to slopes, i.e. scenarios 3a and 3b.

### Scenarios 2a & 2b

For scenarios 2a and 2b we turn to a model `m_02`:

```{r}
summary(m_02 <- lm(DURATION_l ~ 1 + SURPRISAL_s*CLASS, data=d))$coef
```

Scenario 2a meant to ask the following question: what are the 5 predicted means of `DURATION_l` for each of the levels of `CLASS` and when `SURPRISAL_s` is controlled for in some way? And scenario 2b meant to ask whether each of the 5 means is significantly different from 0 (and what the exact *p*-value is).

To address scenario 2a, we previously did this ...

```{r}
nd <- expand.grid(
   SURPRISAL_s=c(0, 1, mean(d$SURPRISAL_s)), # main values of SURPRISAL_s
   CLASS=levels(d$CLASS)) %>% # all levels of CLASS
   "["(,2:1) # reorder columns
(nd <- cbind(nd, predict(m_02, newdata=nd, interval="confidence")))
```

... and then we looked at the predictions when `SURPRISAL_s` is 0. But then we said we can't really address scenario 2b without stupid releveling -- well, now we can, even if it is still painful (yet powerful).
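One aside before we continue -- a minimal sketch of a `glht` feature we won't rely on below: the matrix you give to `linfct` may contain *several* rows, i.e. several hypotheses at once, for instance two of the predictions for when `SURPRISAL_s` is 0 (the first and fourth row of `nd`). Just be aware that, with more than one row, `summary` by default adjusts the *p*-values for multiple testing, which `test=adjusted("none")` switches off:

```{r}
preds.when.SURP.is.0 <- rbind( # 2 rows = 2 hypotheses at once
   "closed, SURPRISAL_s=0"=c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0),
   "number, SURPRISAL_s=0"=c(1, 0, 1, 0, 0, 0, 0, 0, 0, 0))
summary(glht(model=m_02, linfct=preds.when.SURP.is.0),
   test=adjusted("none")) # try without test=... to see the adjustment
```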
But to use the capabilities of `glht` some more, let's actually use it to create/check the predictions that `emmeans` would make by default, because those are not for when `SURPRISAL_s` is 0, but for when `SURPRISAL_s` is at its mean:

```{r}
emmeans(m_02, ~CLASS | SURPRISAL_s)
```

So, let's quickly compute a 1-element vector / variable with the mean of `SURPRISAL_s` (which is already part of `nd`):

```{r}
(mean_surp <- mean(d$SURPRISAL_s))
```

The prediction for when `CLASS` is *closed* and `SURPRISAL_s` is its mean requires that we add to the intercept -- the prediction when `CLASS` is *closed* and `SURPRISAL_s` is 0 -- `mean_surp` times the coefficient/slope for `SURPRISAL_s`:

```{r}
summary(m_02)$coef[c(1,2),]
nd[3,]
glht(model=m_02, linfct=matrix(c(1, mean_surp, 0, 0, 0, 0, 0, 0, 0, 0), nrow=1)) %>% summary
glht(model=m_02, linfct=matrix(c(1, mean_surp, 0, 0, 0, 0, 0, 0, 0, 0), nrow=1)) %>% confint
```

The prediction for when `CLASS` is *number* and `SURPRISAL_s` is its mean requires that, to the intercept, we add `mean_surp` times the coefficient/slope for `SURPRISAL_s`, the coefficient for `CLASSnumber`, and `mean_surp` times the coefficient/slope for their interaction:

```{r}
summary(m_02)$coef[c(1,2,3,7),]
nd[6,]
qwe <- glht(model=m_02, linfct=matrix(c(1, mean_surp, 1, 0, 0, 0, mean_surp, 0, 0, 0), nrow=1))
qwe %>% summary
qwe %>% confint
```

The prediction for when `CLASS` is *open* and `SURPRISAL_s` is its mean requires that, to the intercept, we add `mean_surp` times the coefficient/slope for `SURPRISAL_s`, the coefficient for `CLASSopen`, and `mean_surp` times the coefficient/slope for their interaction:

```{r}
summary(m_02)$coef[c(1,2,4,8),]
nd[9,]
qwe <- glht(model=m_02, linfct=matrix(c(1, mean_surp, 0, 1, 0, 0, 0, mean_surp, 0, 0), nrow=1))
qwe %>% summary
qwe %>% confint
```

The prediction for when `CLASS` is *propname* and `SURPRISAL_s` is its mean requires that, to the intercept, we add `mean_surp` times the coefficient/slope for `SURPRISAL_s`, the coefficient for `CLASSpropname`, and `mean_surp` times the coefficient/slope for their interaction:

```{r}
summary(m_02)$coef[c(1,2,5,9),]
nd[12,]
qwe <- glht(model=m_02, linfct=matrix(c(1, mean_surp, 0, 0, 1, 0, 0, 0, mean_surp, 0), nrow=1))
qwe %>% summary
qwe %>% confint
```

The prediction for when `CLASS` is *unclear* and `SURPRISAL_s` is its mean requires that, to the intercept, we add `mean_surp` times the coefficient/slope for `SURPRISAL_s`, the coefficient for `CLASSunclear`, and `mean_surp` times the coefficient/slope for their interaction:

```{r}
summary(m_02)$coef[c(1,2,6,10),]
nd[15,]
qwe <- glht(model=m_02, linfct=matrix(c(1, mean_surp, 0, 0, 0, 1, 0, 0, 0, mean_surp), nrow=1))
qwe %>% summary
qwe %>% confint
```

Cumbersome, but powerful/versatile!

### Scenario 2c

Scenario 2c meant to ask whether (some? all?) differences between the 5 means are significant. Last week we used `pairs(emmeans(...))`, and that is probably the best way to proceed, but to really drive home the point of how powerful `glht` is, let's also look at `glht`-based comparisons between means. Imagine you were interested in answering the following question: Is the prediction for when `SURPRISAL_s` is its mean and `CLASS` is *propname* significantly different from when `SURPRISAL_s` is its mean and `CLASS` is *unclear*?
With `emmeans`, this is what we'd do (because `emmeans` does us the favor of setting `SURPRISAL_s` to its mean for this):

```{r}
#| results: hold
nd[12,"fit"] - nd[15,"fit"]
qwe <- pairs(emmeans( # compute pairwise comparisons of means
   object=m_02, # for the model m_02
   specs=~ CLASS | SURPRISAL_s), # of CLASS, with SURPRISAL_s at its mean
   adjust="none") # do not adjust for multiple comparisons
qwe[10,]
```

But now we use `glht`. As before, I show the super stepwise way: we define two (model) matrices, one for each of the two situations we want to compare, subtract one from the other, and give that difference (model) matrix to `glht`. And, thankfully, we already generated the relevant matrices above; I just copy and paste from above and assign proper names to them:

```{r}
SUR.is.mean.CLA.is.prop <- c(1, mean_surp, 0, 0, 1, 0, 0, 0, mean_surp, 0)
SUR.is.mean.CLA.is.uncl <- c(1, mean_surp, 0, 0, 0, 1, 0, 0, 0 , mean_surp)
(diff.betw.them <- SUR.is.mean.CLA.is.prop - SUR.is.mean.CLA.is.uncl)
summary(glht(model=m_02, linfct=matrix(diff.betw.them, nrow=1)))
```

You can see that this is the right result by comparing this output to the result from `emmeans` and the one from `nd` above. Again, yes, it seems cumbersome, but note that you can use the `glht` logic for arbitrary values of `SURPRISAL_s`, not just 0, 1, or its mean: if any prior work reported a certain significant effect for a certain value of `SURPRISAL_s`, you can just plug that value into the above code to see whether that value also yields a significant difference in your data. But again, this will become most interesting when we turn to slopes, which we do now.

### Scenarios 3a & 3b

Scenario 3a meant to ask the following question: what are the 5 predicted slopes of `SURPRISAL_s` for each of the levels of `CLASS`? And scenario 3b meant to ask whether each of these 5 slopes is significantly different from 0 (and what the exact *p*-value is).

To address scenario 3a last week, the shortest thing we did was this:

```{r}
# tapply(nd$fit, # apply a grouping of the predictions
#    list(nd$CLASS, # by CLASS
#    nd$SURPRISAL_s), # and SURPRISAL_s
#    c) %>% # just list the values,
#    "["(,-3) %>% # suppress the 3rd column
#    apply(1, diff) # compute the differences
# or
# nd.split <- split(nd, nd$SURPRISAL_s)
# nd.split$`1`$fit - nd.split$`0`$fit
coef(m_02)[2] + c(0, coef(m_02)[7:10])
```

But then we said we can't really address scenario 3b without stupid releveling -- well, we can *now*!
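Before we go through the levels one by one, here is the underlying logic in a minimal sketch (we will use the same trick again in the exercises): a slope is just the difference between two predictions that are one unit apart on `SURPRISAL_s`, so we can subtract two prediction vectors and give the difference to `glht`:

```{r}
# a sketch: the slope of SURPRISAL_s for CLASS=closed as the difference
# between the predictions at SURPRISAL_s=1 and at SURPRISAL_s=0
pred.at.SURP.1 <- c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0) # CLASS=closed, SURPRISAL_s=1
pred.at.SURP.0 <- c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0) # CLASS=closed, SURPRISAL_s=0
summary(glht(model=m_02, linfct=matrix(pred.at.SURP.1 - pred.at.SURP.0, nrow=1)))
```

That difference is of course just `c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0)`, which is what we will write directly in what follows.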
When `CLASS` is *closed*, it's super straightforward:

```{r}
summary(m_02)$coef[2,]
nd[2,"fit"] - nd[1,"fit"]
glht(model=m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0), nrow=1)) %>% summary
glht(model=m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0), nrow=1)) %>% confint
```

When `CLASS` is *number*, we need to consider the interaction term in the 7th coefficient (`SURPRISAL_s:CLASSnumber`) as well:

```{r}
summary(m_02)$coef[c(2,7),]
nd[5,"fit"] - nd[4,"fit"]
qwe <- glht(model=m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 1, 0, 0, 0), nrow=1))
qwe %>% summary
qwe %>% confint
```

When `CLASS` is *open*, we need to consider the interaction term in the 8th coefficient (`SURPRISAL_s:CLASSopen`) as well:

```{r}
summary(m_02)$coef[c(2,8),]
nd[8,"fit"] - nd[7,"fit"]
qwe <- glht(model=m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 0, 1, 0, 0), nrow=1))
qwe %>% summary
qwe %>% confint
```

When `CLASS` is *propname*, we need to consider the interaction term in the 9th coefficient (`SURPRISAL_s:CLASSpropname`) as well:

```{r}
summary(m_02)$coef[c(2,9),]
nd[11,"fit"] - nd[10,"fit"]
qwe <- glht(model=m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 0, 0, 1, 0), nrow=1))
qwe %>% summary
qwe %>% confint
```

When `CLASS` is *unclear*, we need to consider the interaction term in the 10th coefficient (`SURPRISAL_s:CLASSunclear`) as well:

```{r}
summary(m_02)$coef[c(2,10),]
nd[14,"fit"] - nd[13,"fit"]
qwe <- glht(model=m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1), nrow=1))
qwe %>% summary
qwe %>% confint
```

### Scenario 3c

Scenario 3c meant to ask whether (some? all?) differences between the 5 slopes of `SURPRISAL_s` for the different levels of `CLASS` are significant. Last week we used `emmeans::emtrends` for this, which is probably the best way to proceed:

```{r}
pairs(emtrends( # compute pairwise comparisons of trends/slopes
   object=m_02, # for the model m_02
   specs="CLASS", # compute comparisons over this predictor
   var="SURPRISAL_s"), # namely all slopes of this predictor
   adjust="none") # do not adjust for multiple comparisons
```

Now with `glht` ... Imagine you were interested in whether the slope of `SURPRISAL_s` when `CLASS` is *open* is significantly different from when `CLASS` is *unclear*, but you wanted to solve this with `glht`. You will not be surprised to see that I demonstrate this in the same granular way as before: we define two (model) matrices, one for each of the two situations we want to compare, subtract one from the other, and give that difference (model) matrix to `glht`. And, thankfully, we already generated the relevant matrices above; I just copy and paste from above and assign proper names to them:

```{r}
slope.when.SUR.is.0.CLA.is.open <- c(0, 1, 0, 0, 0, 0, 0, 1, 0, 0)
slope.when.SUR.is.0.CLA.is.uncl <- c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1)
(diff.betw.them <- slope.when.SUR.is.0.CLA.is.open - slope.when.SUR.is.0.CLA.is.uncl)
summary(glht(model=m_02, linfct=matrix(diff.betw.them, nrow=1)))
```

## What does this mean for our regression summaries?

The above has at least some implications for how we might want to report the results of regression models.
We saw how the summary table of `m_02` provides maximally half of all the info one might want to provide with regard to the interaction `SURPRISAL_s:CLASS`:

```{r}
summary(m_02)$coef %>% round(4)
```

Given everything we've learned by now, we can do much better in two ways: First, we can represent the summary table in a format that makes it easier to digest for readers who may not always remember all your variables and, for factors, their reference levels, plus we can and probably should add confidence intervals. Thus, this is better:

| | Estimate | 95%-CI | *se* | *t* | *p*~2-tailed~ |
|:----|:----|:----|:----|:----|:----|
| Intercept (SURP=0 + CLASS=*closed*) | 6.1698 | [6.097, 6.242] | 0.0371 | 166.4236 | <0.0001 |
| SURP~0→1~ | 0.3927 | [0.359, 0.426] | 0.0170 | 23.1424 | <0.0001 |
| CLASS~*closed*→*numb*~ | 1.5894 | [0.882, 2.297] | 0.3608 | 4.4049 | <0.0001 |
| CLASS~*closed*→*open*~ | 0.7045 | [0.556, 0.853] | 0.0759 | 9.2811 | <0.0001 |
| CLASS~*closed*→*propn*~ | 2.1682 | [1.419, 2.917] | 0.3820 | 5.6761 | <0.0001 |
| CLASS~*closed*→*uncl*~ | 2.5707 | [1.147, 3.995] | 0.7265 | 3.5385 | 0.0004 |
| SURP~0→1~ + CLASS~*closed*→*numb*~ | -0.3263 | [-0.579, -0.074] | 0.1288 | -2.5333 | 0.0113 |
| SURP~0→1~ + CLASS~*closed*→*open*~ | -0.0188 | [-0.075, 0.037] | 0.0285 | -0.6590 | 0.5099 |
| SURP~0→1~ + CLASS~*closed*→*propn*~ | -0.3391 | [-0.553, -0.125] | 0.1092 | -3.1051 | 0.0019 |
| SURP~0→1~ + CLASS~*closed*→*uncl*~ | -0.7174 | [-1.226, -0.209] | 0.2595 | -2.7652 | 0.0057 |

But then, even if you provide effects plots, which I recommend you always do, it might be nice -- (especially) for generalized linear models -- to provide

* the predicted means for the levels of `CLASS` (probably while `SURPRISAL_s` is at its mean);
* the predicted slopes of `SURPRISAL_s` for each level of `CLASS`;
* and -- don't forget those! -- their confidence intervals.

Here're those results. First, we fix `SURPRISAL_s` to its mean and then compute the predictions for each level of `CLASS`:

| `SURP`=2.37 | `CLASS`: *closed* | `CLASS`: *number* | `CLASS`: *open* | `CLASS`: *propname* | `CLASS`: *unclear* |
|----|----|----|----|----|----|
| pred. `DURA_l` | 7.1008 | 7.9167 | 7.7608 | 8.4650 | 7.9705 |
| 95% CI | [7.0776, 7.1239] | [7.7604, 8.0729] | [7.7251, 7.7964] | [8.1993, 8.7307] | [7.6783, 8.2627] |

Then, we compute the slope of `SURPRISAL_s` for each level of `CLASS`:

| | `CLASS`: *closed* | `CLASS`: *number* | `CLASS`: *open* | `CLASS`: *propname* | `CLASS`: *unclear* |
|----|----|----|----|----|----|
| slope of `SURP` | 0.3927 | 0.0664 | 0.3739 | 0.0536 | -0.3248 |
| 95% CI | [0.3594, 0.4259] | [-0.1839, 0.3167] | [0.3291, 0.4187] | [-0.1579, 0.2651] | [-0.8323, 0.1828] |

```{r}
#| echo: false
#| eval: false
nd[nd$SURPRISAL_s>1,]
confint(glht(model=m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0), nrow=1)))$confint
confint(glht(model=m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 1, 0, 0, 0), nrow=1)))$confint
confint(glht(model=m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 0, 1, 0, 0), nrow=1)))$confint
confint(glht(model=m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 0, 0, 1, 0), nrow=1)))$confint
confint(glht(model=m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1), nrow=1)))$confint
```

:::{.callout-tip collapse="true"}
Something for you to check out: `multcomp::glht` actually also allows you to test specific hypotheses in a way that looks more like English; for instance, you can do a (now two-tailed) test of `SURPRISAL_s` being different from 0.425 like this: `summary(glht(m_02, "SURPRISAL_s = 0.425"))`. Another function that is useful for stuff like this is `marginaleffects::hypotheses`.
:::

## What else is possible?

### Orthogonal contrasts w/ `glht`

We can also recover orthogonal contrasts with `glht`, because `glht` can really do anything. Let's get the data set up again:

```{r}
d$CLASS_orth <- d$CLASS
#                            clos  numb  open  prop  uncl
clo.op_vs_num.prop.unc <- c(+0.6, -0.4, +0.6, -0.4, -0.4)
clo_vs_op              <- c(+0.5,  0  , -0.5,  0  ,  0  )
num.prop_vs_uncl       <- c( 0  , -1/3,  0  , -1/3, +2/3)
num_vs_prop            <- c( 0  , +0.5,  0  , -0.5,  0  )
# & then we put those into a matrix like the above table & assign them as contrasts:
contrasts(d$CLASS_orth) <- # make the following the contrasts of CLASS_orth
   cbind( # the matrix resulting from binding together as columns
      clo.op_vs_num.prop.unc, # all
      clo_vs_op, # four
      num.prop_vs_uncl, # contrast
      num_vs_prop) # vectors
```

And let's fit the two models:

```{r}
summary(m_03_treatm <- lm(DURATION_l ~ 1 + CLASS     , data=d))$coef
summary(m_03_orthog <- lm(DURATION_l ~ 1 + CLASS_orth, data=d))$coef
```

#### The first orthogonal contrast

So how do we get the first contrast *closed*/*open* vs. *number*/*propname*/*unclear* with `glht` from `m_03_treatm`?
Like this: We first create a `glht` 1-row matrix that combines the predictions for when `CLASS` is *closed* or *open* by computing the unweighted mean of the two vectors/matrices:

```{r}
#| results: hold
CLA.is.clos <- c(1, 0, 0, 0, 0)
CLA.is.open <- c(1, 0, 1, 0, 0)
(CLA.is.clos.or.open <- (CLA.is.clos + CLA.is.open) / 2)
# same as:
#        CLA.is.clos      CLA.is.open
# rbind(c(1, 0, 0, 0, 0), c(1, 0, 1, 0, 0)) %>% apply(2, mean)
```

Then we do the same for when `CLASS` is *number* or *propname* or *unclear*:

```{r}
#| results: hold
CLA.is.numb <- c(1, 1, 0, 0, 0)
CLA.is.prop <- c(1, 0, 0, 1, 0)
CLA.is.uncl <- c(1, 0, 0, 0, 1)
(CLA.is.numb.or.prop.or.uncl <- (CLA.is.numb + CLA.is.prop + CLA.is.uncl) / 3)
# same as:
#        CLA.is.numb      CLA.is.prop      CLA.is.uncl
# rbind(c(1, 1, 0, 0, 0), c(1, 0, 0, 1, 0), c(1, 0, 0, 0, 1)) %>% apply(2, mean)
```

And then we compute the summary of the `glht` for the difference between the two:

```{r}
diff.betw.them <- CLA.is.clos.or.open - CLA.is.numb.or.prop.or.uncl
summary(glht(model=m_03_treatm, linfct=matrix(diff.betw.them, nrow=1)))
```

And you can see, that's right:

```{r}
summary(m_03_orthog)$coef[2,]
```

#### The second orthogonal contrast

So how do we get the second contrast *closed* vs. *open* (with `glht` from `m_03_treatm`)?

Actually, that's a trick question: we don't have to use `glht` because it's already there (just with a different sign, but that doesn't matter for the significance test):

```{r}
summary(m_03_orthog)$coef[3,]
summary(m_03_treatm)$coef[3,]
```

So, don't forget the old stuff because you're so excited about the new stuff ...

#### The third orthogonal contrast

So how do we get the third contrast *number*/*propname* vs. *unclear* with `glht` from `m_03_treatm`? Like this: We first create a `glht` 1-row matrix that combines the predictions for when `CLASS` is *number* or *propname*:

```{r}
CLA.is.numb <- c(1, 1, 0, 0, 0)
CLA.is.prop <- c(1, 0, 0, 1, 0)
(CLA.is.num.or.prop <- (CLA.is.numb + CLA.is.prop) / 2)
```

Then we do the same for when `CLASS` is *unclear*, but since this is just one level, we don't need a mean:

```{r}
CLA.is.uncl <- c(1, 0, 0, 0, 1)
```

And then we compute the summary of the `glht` for the difference between the two:

```{r}
diff.betw.them <- CLA.is.num.or.prop - CLA.is.uncl
summary(glht(model=m_03_treatm, linfct=matrix(diff.betw.them, nrow=1)))
```

And you can see, that's right:

```{r}
summary(m_03_orthog)$coef[4,]
```

#### The fourth orthogonal contrast

And the last one, *number* vs. *propname*: We first create one `glht` 1-row matrix that computes the predictions for when `CLASS` is *number*, then another for when `CLASS` is *propname*, and then we compute the difference:

```{r}
CLA.is.numb <- c(1, 1, 0, 0, 0)
CLA.is.prop <- c(1, 0, 0, 1, 0)
diff.betw.them <- CLA.is.numb - CLA.is.prop
summary(glht(model=m_03_treatm, linfct=matrix(diff.betw.them, nrow=1)))
```

And you can see, that's right:

```{r}
summary(m_03_orthog)$coef[5,]
```

So this is why I discussed `glht` so much: it's very, very powerful for *all* sorts of comparisons you can think of, in fact even for things you might not have considered yet, given how we proceeded -- read on to find out.

### Testing against specific values inside your model

We discussed earlier -- when we discussed the 1c/2c/3c scenarios -- how we can use `glht` to compare values of our model coefficients directly against each other. We can also extend that a bit to ask more specific questions.
For example, here are the coefficients of our 'normal' treatment contrast model again:

```{r}
summary(m_03_treatm)$coef
```

Imagine that, based on previous studies, you had hypothesized earlier that the difference between closed-class words (the intercept) and proper names (i.e., `coef(m_03_treatm)[4]`) should be twice as big as the difference between closed-class words and open-class words (i.e., `coef(m_03_treatm)[3]`). We can check this kind of hypothesis with `glht` as well and I'm doing it super stepwise again:

```{r}
closed_vs_open     <- matrix(c(0,0,1,0,0), nrow=1)
closed_vs_propname <- matrix(c(0,0,0,1,0), nrow=1)
summary(glht(m_03_treatm,
   "-"( # is the difference between
      closed_vs_propname, # closed vs. proper names
      2 * closed_vs_open))) # twice as big as closed vs. open
```

```{r}
#| echo: false
#| eval: false
marginaleffects::hypotheses(m_03_treatm, "b4 = 2 * b3")
```

The result is significant -- what does that mean? It means that the difference between

* the difference between closed-class words and proper names on the one hand and
* two times the difference between closed- and open-class words on the other

is significantly different from 0, meaning the ratio of

* the difference between closed-class words and proper names on the one hand and
* the difference between closed- and open-class words on the other

is significantly different from 2. What is the ratio? This:

```{r}
coef(m_03_treatm)[4] / coef(m_03_treatm)[3]
```

..., meaning, for the hypothesis to work out as expected, either the difference between closed- and open-class words is too big, or the difference between closed-class words and proper names is too small (or both), something we were not able to test easily before.

### Testing against specific values outside your model

One final really nice thing we can do is this: So far, we always 'only' tested whether a certain coefficient, or some painstakingly defined linear combination of coefficients, is significantly different from 0. But we can also test whether a certain coefficient or such a linear combination is different from some other value.

Let's first check out a very simple (monofactorial) model for that:

```{r}
summary(m_04 <- lm(DURATION_l ~ 1 + SURPRISAL_s, data=d))$coef
```

Imagine a previous study reported that the slope of `SURPRISAL_s` was 0.625. However, you actually expected this slope to be lower than that, which is why you were happy to see that, in your data, the slope of `SURPRISAL_s` is indeed smaller: it's not 0.625, it's `r coef(m_04)[2]`. But is that difference significant? How can we address this?

There are three ways one might consider. The first is very straightforward, but doesn't always help; we can try and use the confidence interval (and here we would of course not even need `glht` for that):

```{r}
Confint(m_04)
confint(glht( # show the confint of a glht, namely
   m_04, # for the sum of the product of the coeffs of m_04
   matrix(c(0,1), nrow=1))) # and these values
```

This confidence interval of our slope does include 0.625 -- but why is this the wrong way to look at it? We actually want to look at this in a one-tailed way -- remember, we expected the slope of `SURPRISAL_s` to be less than 0.625.
For this, we could of course adjust the level of the confidence interval:

```{r}
Confint(m_04, level=0.9)
confint(glht( # show the confint of a glht, namely
   m_04, # for the sum of the product of the coeffs of m_04
   matrix(c(0,1), nrow=1)), # and these values
   level=0.9) # cut off 5% on each side, not 2.5%
```

Now, the confidence interval of our smaller estimate does not include the previous value of 0.625: given our expectation, our estimate of `r coef(m_04)[2]` *is* indeed significantly smaller than that of the previous study. But how do we get a proper *p*-value? Here's the second way: We use `summary(glht(...))` again, but add two arguments to it:

* the argument `rhs`, whose default value is 0, which is why all our tests so far were set up to test whether something is different from 0; now we set that to the value we want to test against, 0.625;
* the well-known argument `alternative`, whose default value is, as it is in general, *two.sided*, but now we set it to *less* because we expect the coefficient to be *less* than the `rhs` we are testing against:

```{r}
summary(glht( # show the summary of a glht, namely
   m_04, # check whether the sum of the coeffs of m_04
   matrix(c(0,1), nrow=1), # multiplied by this
   rhs=0.625, # is significantly different from 0.625, namely
   alternative="less")) # by being smaller
```

```{r}
#| echo: false
#| eval: false
marginaleffects::hypotheses(m_04, "b2=0.625") # but this is 2-tailed!
summary(glht( # show the summary of a glht, namely
   m_04, # check whether the sum of the coeffs of m_04
   matrix(c(0,1), nrow=1), # multiplied by this
   rhs=0.625)) # is significantly different from 0.625, namely
```

What this does is this:

```{r}
(summary(m_04)$coef[2,"Estimate"]-0.625) /
   summary(m_04)$coef[2,"Std. Error"] # t-score of glht
pt(q=-1.863831, df=m_04$df.residual, lower.tail=TRUE) # 1-tailed p-value of glht
```

Now we have all we want:^[I may misunderstand things (!) but I think the output of `summary(glht(...))` gets the row names of the output wrong: I think it should be "1 <= 0.625".] the one-tailed 95%-confidence interval of `r coef(m_04)[2]` does not overlap with 0.625 and we have a proper *p*-value as well; we can report that, as expected, the slope of `SURPRISAL_s` is indeed significantly smaller than the value of 0.625 reported in the previous study.

A third way that can also work involves model comparison. We have a model `m_04` where the slope of `SURPRISAL_s` is `r coef(m_04)[2]`, but we can also create a model `m_05`, in which the slope of `SURPRISAL_s` is fixed to 0.625 (using `offset`), and then we can use `anova` for one of our usual model comparisons:

```{r}
m_05 <- lm(DURATION_l ~ 1 +
   offset(0.625*SURPRISAL_s), # here we fix the slope of SURPRISAL_s
   data=d)
anova(m_04, m_05, test="F")
```

While this is a two-tailed comparison, these are the same results as from the `summary(glht(...))`:

* the *p*-value here is twice `glht`'s *p*-value;
* the *F*-score here is `glht`'s *t*-value squared.

The exact same logic applies to more complex models; let's look at `m_02` again:

```{r}
summary(m_02)$coef
```

Imagine a previous study reported that the slope of `SURPRISAL_s` for numbers was 0.3 (and significantly different from 0). If you actually expected to find in your data that this slope

1. would be significantly lower than 0.3, in fact
2. would not be significantly different from 0,

how could you explore and test this?

```{r}
# question 1:
summary(glht( # show the summary of a glht, namely
   m_02, # check whether the sum of the coeffs of m_02
   matrix(c(0,1,0,0,0,0,1,0,0,0), nrow=1), # multiplied by this
   rhs=0.3, # is significantly different from 0.3, namely
   alternative="less")) # by being smaller
# question 2:
confint(glht( # show the confint of a glht, namely
   m_02, # for the sum of the product of the coeffs of m_02
   matrix(c(0,1,0,0,0,0,1,0,0,0), nrow=1)), # and these values
   level=0.9) # cut off 5% on each side, not 2.5%
summary(glht( # show the summary of a glht, namely
   m_02, # check whether the sum of the coeffs of m_02
   matrix(c(0,1,0,0,0,0,1,0,0,0), nrow=1), # multiplied by this
   alternative="less")) # by being smaller
```

:::{.callout-tip collapse="true"}
See Faraway (2015, Section 3.2) for some more discussion.
:::

### Model matrices from `effects`

Ok, after all this painful manual defining of 'model matrices' to give to `glht`, you will be glad to hear there is one bit of relief I can offer. Remember the above 2c case where we wanted to find out whether the following two predictions are significantly different from each other:

* when `SURPRISAL_s` is at its mean and `CLASS` is *propname*;
* when `SURPRISAL_s` is at its mean and `CLASS` is *unclear*.

Above, we did this and I think we can all agree that defining `SUR.is.mean.CLA.is.prop` and `SUR.is.mean.CLA.is.uncl` is less than fun:

```{r}
SUR.is.mean.CLA.is.prop <- c(1, mean_surp, 0, 0, 1, 0, 0, 0, mean_surp, 0)
SUR.is.mean.CLA.is.uncl <- c(1, mean_surp, 0, 0, 0, 1, 0, 0, 0 , mean_surp)
(diff.betw.them <- SUR.is.mean.CLA.is.prop - SUR.is.mean.CLA.is.uncl)
summary(glht(model=m_02, linfct=matrix(diff.betw.them, nrow=1)))
```

Here's the good news, which comes in two steps that will simplify things considerably. Remember that we already defined `nd` above as a data frame with all combinations of potentially interesting predictor values: all levels of categorical predictors and `c(0, 1, mean)` for all numeric predictors; I am reducing it here to just the two columns with the predictor combinations:

```{r}
(nd <- nd[,1:2])
```

Step 1 of the simplification is realizing that we can use a version of `nd` as the argument `xlevels` to `effect`. Remember how we always define our `effects` object?

```{r}
#| eval: false
sucl_d <- data.frame(
   sucl <- effect("SURPRISAL_s:CLASS", m_02))
```

We can now do this to get effects predictions and confidence intervals for everything, which can already be a huge advantage:

```{r}
(sucl_d <- data.frame(
   sucl <- effect("SURPRISAL_s:CLASS", m_02,
      xlevels=lapply(nd, unique))))
```

But the other advantage is that the `effects` object actually contains a model matrix:

```{r}
(sucl_mm <- sucl$model.matrix)
```

That means, once we give `effect` an `nd` object, it can do the work of defining model matrix vectors for us. If we're interested in the prediction when `SURPRISAL_s` is at its mean and `CLASS` is *propname*, we don't need to manually develop this anymore like this ...

```{r}
#| eval: false
SUR.is.mean.CLA.is.prop <- c(1, mean_surp, 0, 0, 1, 0, 0, 0, mean_surp, 0)
```

... no, we just pick from what `effect` computed for us:

```{r}
SUR.is.mean.CLA.is.prop <- sucl_mm[12,]
```

Same for when `SURPRISAL_s` is at its mean and `CLASS` is *unclear*: no more this ...

```{r}
#| eval: false
SUR.is.mean.CLA.is.uncl <- c(1, mean_surp, 0, 0, 0, 1, 0, 0, 0, mean_surp)
```

...
instead, we just do this:

```{r}
SUR.is.mean.CLA.is.uncl <- sucl_mm[15,]
```

And then we can do the usual:

```{r}
(diff.betw.them <- SUR.is.mean.CLA.is.prop - SUR.is.mean.CLA.is.uncl)
summary(glht(model=m_02, linfct=matrix(diff.betw.them, nrow=1)))
```

In other words, all we need to do is give `effect` the combinations of things we might be interested in and then pick the right row(s) from its model matrix.

This also works for slopes. Remember the above 3c case where we wanted to find out whether the slope of `SURPRISAL_s` when `CLASS` is *open* is significantly different from when `CLASS` is *unclear*. Above, we did this and I think we can all agree that defining `slope.when.SUR.is.0.CLA.is.open` and `slope.when.SUR.is.0.CLA.is.uncl` was kinda awful:

```{r}
#| eval: false
slope.when.SUR.is.0.CLA.is.open <- c(0, 1, 0, 0, 0, 0, 0, 1, 0, 0)
slope.when.SUR.is.0.CLA.is.uncl <- c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1)
(diff.betw.them <- slope.when.SUR.is.0.CLA.is.open - slope.when.SUR.is.0.CLA.is.uncl)
```

But now we just pick things out from `sucl_mm`:

```{r}
slope.when.SUR.is.0.CLA.is.open <- sucl_mm[ 8,] - sucl_mm[ 7,]
slope.when.SUR.is.0.CLA.is.uncl <- sucl_mm[14,] - sucl_mm[13,]
(diff.betw.them <- slope.when.SUR.is.0.CLA.is.open - slope.when.SUR.is.0.CLA.is.uncl)
summary(glht(model=m_02, linfct=matrix(diff.betw.them, nrow=1)))
```

I hope you see how big a simplification this is; it's a good enough reason to always define an `nd` object and consider `effects` for your follow-up analyses.

## Exercises

### `m_final` of the lm from week 1

Let's fit a model `m_final`:

```{r}
d <- droplevels(d[d$CLASS!="unclear",])
d$SURPRISAL_sc <- sqrt(d$SURPRISAL) - mean(sqrt(d$SURPRISAL))
summary(m_final <- lm(
   DURATION_l ~ 1 + (SURPRISAL_sc + VOWELS + CLASS)^2,
   data=d))$coef
```

#### Interaction 1: `SURPRISAL_sc:VOWELS`

Interpretation: The more surprising a word is, the longer its duration, but that effect is significantly stronger for words with just one vowel than for words with two+ vowels. But here are some questions for you:

1. what is the slope of `SURPRISAL_sc` for closed-class words with one vowel? (question type 3a)
2. what is the slope of `SURPRISAL_sc` for closed-class words with two+ vowels? (question type 3a)
3. is the difference between the two significant? What is its *p*-value? What is its confidence interval? (question type 3c)

For the slope of `SURPRISAL_sc` for closed-class words with one vowel, we can use `nd`:

```{r}
nd <- expand.grid(
   SURPRISAL_sc=0:1, # main values of SURPRISAL_sc
   CLASS=levels(d$CLASS), # all levels of CLASS
   VOWELS=levels(d$VOWELS)) %>% # all levels of VOWELS
   "["(,3:1) # reorder columns
nd <- cbind(nd, PREDS=predict(
   m_final, newdata=nd, interval="confidence"))
colnames(nd)[4:6] <- c("fit", "lwr", "upr")
# 1: slope when CLASS is "closed" & when VOWEL is "one"
# this is a scenario 3a question
nd$fit[2] - nd$fit[1]
coef(m_final)[2]
```

But we can now also solve this with `glht`.

```{r}
# 1: slope when CLASS is "closed" & when VOWEL is "one"
# this is a scenario 3a question
CLASS.is.clos.N.SURPR.is.0.N.VOW.is.1 <- c(1,0,0,0,0,0,0,0,0,0,0,0,0)
CLASS.is.clos.N.SURPR.is.1.N.VOW.is.1 <- c(1,1,0,0,0,0,0,0,0,0,0,0,0)
("-"(
   CLASS.is.clos.N.SURPR.is.1.N.VOW.is.1,
   CLASS.is.clos.N.SURPR.is.0.N.VOW.is.1)) %>%
   matrix(nrow=1) %>%
   glht(m_final, .) %>% summary
```
For the slope of `SURPRISAL_sc` for closed-class words with two+ vowels, we can also use `nd`:

```{r}
# 2: slope when CLASS is "closed" & when VOWEL is "two+"
# this is a scenario 3a question
nd$fit[10] - nd$fit[9]
coef(m_final)[2] + coef(m_final)[7]
```

But we can now also solve this with `glht`.

```{r}
# 2: slope when CLASS is "closed" & when VOWEL is "two+"
# this is a scenario 3a question
CLASS.is.clos.N.SURPR.is.0.N.VOW.is.2p <- c(1,0,1,0,0,0,0,0,0,0,0,0,0)
CLASS.is.clos.N.SURPR.is.1.N.VOW.is.2p <- c(1,1,1,0,0,0,1,0,0,0,0,0,0)
("-"(
   CLASS.is.clos.N.SURPR.is.1.N.VOW.is.2p,
   CLASS.is.clos.N.SURPR.is.0.N.VOW.is.2p)) %>%
   matrix(nrow=1) %>%
   glht(m_final, .) %>% summary
```

Finally we ask: Is the difference between the two significant? What is its *p*-value? What is its confidence interval? Above we did this, ...

```{r}
# 3: difference between the two: p and confint
# this is a scenario 3c question
summary(m_final)$coefficients[7,]
Confint(m_final)[7,]
pairs(emtrends( # compute pairwise comparisons of trends/slopes
   object=m_final, # for the model m_final
   specs=~ VOWELS | CLASS, # compute comparisons over this
   var="SURPRISAL_sc"), # namely all slopes of this predictor
   adjust="none")[1,] # do not adjust for multiple comparisons
```

With `glht`, we can also use something that, if done stepwise, is quite a monstrosity:

```{r}
# 3: difference between the two: p and confint
# this is a scenario 3c question
"-"( # compute the difference between 2 slopes, namely the
   "-"( # slope when CLASS is "closed" & when VOWEL is "two+"
      CLASS.is.clos.N.SURPR.is.1.N.VOW.is.2p,
      CLASS.is.clos.N.SURPR.is.0.N.VOW.is.2p),
   "-"( # slope when CLASS is "closed" & when VOWEL is "one"
      CLASS.is.clos.N.SURPR.is.1.N.VOW.is.1,
      CLASS.is.clos.N.SURPR.is.0.N.VOW.is.1)) %>%
   # put that into a difference (model) matrix:
   matrix(nrow=1) -> diff_matrix
glht(m_final, diff_matrix) %>% summary
glht(m_final, diff_matrix) %>% confint
```

#### Interaction 2: `SURPRISAL_sc:CLASS`

Interpretation: The more surprising a word is, the longer its duration is, but that effect is not the same across all classes of words. Specifically and controlling for other predictors,

* the slope of `SURPRISAL_sc` seems to be higher for open-class words than for closed-class words;
* the slope of `SURPRISAL_sc` seems to be very flat for numbers and proper names.

But here are the questions for you we asked in week 1:

1. what is the slope of `SURPRISAL_sc` for closed-class words with one vowel? (question type 3a)
2. what is the slope of `SURPRISAL_sc` for open-class words with one vowel? (question type 3a)
3. is the difference between the two significant? What is its *p*-value? What is its confidence interval? (question type 3c)
4. what is the slope of `SURPRISAL_sc` for numbers with one vowel? (question type 3a)
5. what is the slope of `SURPRISAL_sc` for proper names with one vowel? (question type 3a)
6. is the difference between the two significant? What is its *p*-value? What is its confidence interval? (question type 3c)
7. are these two slopes significantly different from 0? what are the *p*-values? (question type 3b)
The first one we already dealt with above, so let's now tackle the rest.

```{r}
# 2: slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "open"
# this is a scenario 3a question
nd$fit[6] - nd$fit[5]
coef(m_final)[2] + coef(m_final)[9]
# now with glht:
CLASS.is.open.N.SURPR.is.0.N.VOW.is.1 <- c(1,0,0,0,1,0,0,0,0,0,0,0,0)
CLASS.is.open.N.SURPR.is.1.N.VOW.is.1 <- c(1,1,0,0,1,0,0,0,1,0,0,0,0)
("-"(
   CLASS.is.open.N.SURPR.is.1.N.VOW.is.1,
   CLASS.is.open.N.SURPR.is.0.N.VOW.is.1)) %>%
   matrix(nrow=1) %>%
   glht(m_final, .) %>% summary
# 3: difference between the two: p and confint
# this is a scenario 3c question
"-"( # compute the difference between 2 slopes, namely the
   "-"( # slope when CLASS is "open" & when VOWEL is "one"
      CLASS.is.open.N.SURPR.is.1.N.VOW.is.1,
      CLASS.is.open.N.SURPR.is.0.N.VOW.is.1),
   "-"( # slope when CLASS is "closed" & when VOWEL is "one"
      CLASS.is.clos.N.SURPR.is.1.N.VOW.is.1,
      CLASS.is.clos.N.SURPR.is.0.N.VOW.is.1)) %>%
   # put that into a difference (model) matrix:
   matrix(nrow=1) -> diff_matrix
glht(m_final, diff_matrix) %>% summary
glht(m_final, diff_matrix) %>% confint
# 4: slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "number"
# this is a scenario 3a question
nd$fit[4] - nd$fit[3]
coef(m_final)[2] + coef(m_final)[8]
# now with glht:
CLASS.is.numb.N.SURPR.is.0.N.VOW.is.1 <- c(1,0,0,1,0,0,0,0,0,0,0,0,0)
CLASS.is.numb.N.SURPR.is.1.N.VOW.is.1 <- c(1,1,0,1,0,0,0,1,0,0,0,0,0)
("-"(
   CLASS.is.numb.N.SURPR.is.1.N.VOW.is.1,
   CLASS.is.numb.N.SURPR.is.0.N.VOW.is.1)) %>%
   matrix(nrow=1) %>%
   glht(m_final, .) %>% summary
# 5/7: slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "propname"
# this is a scenario 3a question
nd$fit[8] - nd$fit[7]
coef(m_final)[2] + coef(m_final)[10]
CLASS.is.prop.N.SURPR.is.0.N.VOW.is.1 <- c(1,0,0,0,0,1,0,0,0,0,0,0,0)
CLASS.is.prop.N.SURPR.is.1.N.VOW.is.1 <- c(1,1,0,0,0,1,0,0,0,1,0,0,0)
("-"(
   CLASS.is.prop.N.SURPR.is.1.N.VOW.is.1,
   CLASS.is.prop.N.SURPR.is.0.N.VOW.is.1)) %>%
   matrix(nrow=1) %>%
   glht(m_final, .) %>% summary
# 6: this is a scenario 3c question
"-"( # compute the difference between 2 slopes, namely the
   "-"( # slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "number"
      CLASS.is.numb.N.SURPR.is.1.N.VOW.is.1,
      CLASS.is.numb.N.SURPR.is.0.N.VOW.is.1),
   "-"( # slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "propname"
      CLASS.is.prop.N.SURPR.is.1.N.VOW.is.1,
      CLASS.is.prop.N.SURPR.is.0.N.VOW.is.1)) %>%
   matrix(nrow=1) -> diff_matrix
glht(m_final, diff_matrix) %>% summary
glht(m_final, diff_matrix) %>% confint
# 7: this is a scenario 3b question
# slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "number": see 4 above
# slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "propname": see 5 above
```

### A final nice example

For the final example, we will briefly load a different model on a completely different data set. This data set was actually analyzed with a mixed-effects model, but since we haven't dealt with those yet, we will make this a fixed-effects model only. In this example, the response variable is `GENITIVE` (*of* vs. *s*) and the model we're discussing here has the following predictors:

* `LANG`: the L1 of the speaker: *english* vs. *chinese* vs. *german*;
* `POSSORANIM`: whether the possessor of the genitive is *animate* or *inanimate* (because the former prefer *s*-genitives much more than the latter);
* `POSSORNUMBER`: whether the possessor of the genitive is *singular*, *plural* or *irregplural* (because regular plurals disprefer *s*-genitives much more than the others);
* `LENGTHNPDIFF`: a variable quantifying the length difference between possessor and possessed: positive values should go with *s*-genitives, negative values should go with *of*-genitives.

From the output of the model `m`, you can see that the interaction `LANG:POSSORANIM` is significant, and the plot shows the effects predictions for it:

```{r}
rm(list=ls(all.names=TRUE))
summary(d <- readRDS("_input/2024_STG_AgainstLevel3_ICAMEJournal_data.RDS"))
summary(m <- readRDS("_input/2024_STG_AgainstLevel3_ICAMEJournal_glm.RDS"))$coefficients %>% round(5)
drop1(m, test="Chisq")
lapo_d <- data.frame(
   lapo <- effect("LANG:POSSORANIM", m))
lapo_d$fit.lo <- qlogis(lapo_d$fit)
plot(type="n", axes=FALSE,
   xlab="L1 of the speakers"        , xlim=c(0.9, 3.1), x=0,
   ylab="Pred. prob. of s-genitives", ylim=c(0  , 0.5), y=0); grid()
axis(2); axis(1, at=1:3, labels=c("English", "Chinese", "German")) # levels(d$LANG)
arrows(rep(1:3,2), lapo_d$lower,
   rep(1:3,2), lapo_d$upper,
   angle=90, code=3)
lines(1:3, lapo_d$fit[c(1,2,3)], type="b", pch=16); text(2.75, 0.3 , "anim")
lines(1:3, lapo_d$fit[c(4,5,6)], type="b", pch=16); text(1.25, 0.05, "inanim")
```

Clearly, for all speaker L1s, animate possessors lead to significantly more *s*-genitives than inanimate possessors (we wouldn't even need `emmeans` for that: the confidence intervals between *a* and *i* within each level of `LANG` never overlap). Just as clearly, the Chinese learners behave differently from the native speakers and the German learners because, for them, the difference between animate and inanimate possessors is much smaller. Here are questions for you to answer (with `glht`!):

1. is the prediction for inanimate possessors for the Chinese learners significantly higher than that for the native speakers?
2. is the prediction for inanimate possessors for the Chinese learners significantly higher than that for the German learners?
3.
Is the difference between *a* and *i* for the German learners significantly different from the difference between *a* and *i* for the Chinese learners?To make things easier for you, do your calculations only for when `LENGTHNPDIFF` is 0 and when `POSSORNUMBER` is *singular*; those will numerically be somewhat different from the above effects estimate, but conceptually similar (because I doubt you want to spend time on making these comparisons for the plot above, for which `LENGTHNPDIFF` would have to be at its mean and `POSSORNUMBER` would have to be its proportional representation ;-) But if you want to do this with the effects object's model matrix, be my guest).```{r}nd <-expand.grid(POSSORNUMBER="singular",LENGTHNPDIFF=0,LANG=levels(d$LANG),POSSORANIM=levels(d$POSSORANIM))nd$fit.lo <-predict(m, newdata=nd)``````{r}#| echo: false#| eval: falseinan.N.engl <-matrix(c(1,0,0,1,0,0,0,0,0), nrow=1)inan.N.chin <-matrix(c(1,1,0,1,0,0,0,1,0), nrow=1)inan.N.germ <-matrix(c(1,0,1,1,0,0,0,0,1), nrow=1)# 1.summary(glht(m, inan.N.chin-inan.N.engl, alternative="greater"))# 2.summary(glht(m, inan.N.chin-inan.N.germ, alternative="greater"))``````{r}#| echo: false#| eval: falseanim.N.engl <-matrix(c(1,0,0,0,0,0,0,0,0), nrow=1)anim.N.chin <-matrix(c(1,1,0,0,0,0,0,0,0), nrow=1)anim.N.germ <-matrix(c(1,0,1,0,0,0,0,0,0), nrow=1)# which means nd is this:glht.retriever(glht(m, anim.N.engl))glht.retriever(glht(m, anim.N.chin))glht.retriever(glht(m, anim.N.germ))glht.retriever(glht(m, inan.N.engl))glht.retriever(glht(m, inan.N.chin))glht.retriever(glht(m, inan.N.germ))# 3.summary(glht(m, "-"( anim.N.germ - inan.N.germ, anim.N.chin - inan.N.chin)))```# Session info```{r}sessionInfo()```