Today, we deal with selected applications of a very powerful notion, general linear hypothesis tests, a framework that allows us to answer all questions from all sets (and others) regardless of whether we’re looking at means or slopes. To approach this framework (and we will mostly discuss its simplest kinds of application), we use the duration data …
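rm(list=ls(all.names=TRUE))
pkgs2load <- c("car", "effects", "emmeans", "magrittr", "multcomp")
suppressMessages(sapply(pkgs2load, library,
   character.only=TRUE, logical.return=TRUE, quietly=TRUE)); rm(pkgs2load)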
car effects emmeans magrittr multcomp
TRUE TRUE TRUE TRUE TRUE
d <- read.delim(                 # summarize d, the result of loading
   file="_input/wdurations.csv", # this file
   stringsAsFactors=TRUE)        # change categorical variables into factors
d$DURATION_l <- log2(d$DURATION)   # as before
d$SURPRISAL_s <- sqrt(d$SURPRISAL) # this time no centering
summary(d)
CASE CONV SPEAKER SEX DURATION
Min. : 1 C08 : 873 S15 : 478 female:3203 Min. : 14.0
1st Qu.:1596 C05 : 831 S10 : 455 male :3180 1st Qu.: 110.0
Median :3192 C06 : 800 S11 : 414 Median : 162.0
Mean :3192 C07 : 772 S14 : 410 Mean : 194.5
3rd Qu.:4788 C04 : 737 S08 : 401 3rd Qu.: 245.0
Max. :6383 C09 : 557 S16 : 395 Max. :1405.0
(Other):1813 (Other):3830
POSINTURN LENGTH FREQ SURPRISAL
Min. : 1.000 Min. : 1.000 Min. : 0.00 Min. : 0.00
1st Qu.: 3.000 1st Qu.: 2.000 1st Qu.:12.50 1st Qu.: 4.00
Median : 6.000 Median : 3.000 Median :15.00 Median : 6.00
Mean : 5.524 Mean : 2.768 Mean :14.04 Mean : 6.17
3rd Qu.: 8.000 3rd Qu.: 3.000 3rd Qu.:17.00 3rd Qu.: 8.00
Max. :10.000 Max. :14.000 Max. :18.00 Max. :21.00
CLASS VOWELS DURATION_l SURPRISAL_s
closed :4021 one :3698 Min. : 3.807 Min. :0.000
number : 125 two+:2685 1st Qu.: 6.781 1st Qu.:2.000
open :2086 Median : 7.340 Median :2.449
propname: 110 Mean : 7.348 Mean :2.371
unclear : 41 3rd Qu.: 7.937 3rd Qu.:2.828
Max. :10.456 Max. :4.583
… and simplify m_01 a little for didactic purposes:
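summary(m_01 <- lm(DURATION_l ~ 1 + VOWELS*CLASS, data=d))$coef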
Let’s begin with a very simple (and admittedly quite artificial) example. Imagine you had m_01 in R’s working memory but only knew the function coef but not the function summary. In other words, you could do this …
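coef(m_01)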
… but you aren’t able to get the p-value for the 2nd coefficient (for VOWELStwo+), i.e. the p-value of 5.3036714e-05 which tells you that that coefficient is significantly different from 0; put differently, from the coef output you can’t see whether the predicted mean for DURATION_l when CLASS is closed and VOWELS is one is significantly different from when CLASS is closed and VOWELS is two+.
For this, and a ton of other, more interesting questions, the function multcomp::glht can help. For how we will use it now, it requires two arguments:¹
a model object (here, m_01) and
a 1-row matrix that contains as many numbers as the model has coefficients (here, length(coef(m_01)) = 10).
You can consider this 1-row matrix to be the equivalent of one row of the model’s model matrix (as retrieved with model.matrix). As a reminder, the model frame of a model is all the rows and columns from the original data that are used in the model and the model matrix of a model is a matrix that
has as many rows as the model frame;
has as many columns as the model has coefficients …
head(mf <- model.frame(m_01)) # the model frame of m_01
DURATION_l VOWELS CLASS
1 6.714246 one closed
2 6.475733 one closed
3 6.285402 one closed
4 6.643856 one closed
5 8.523562 two+ open
6 7.971544 two+ closed
head(mm <- model.matrix(m_01)) # the model matrix of m_01
… because the prediction the model makes for each case of the model frame is the summed product of the row of the model matrix and the model’s coefficients. Let’s look at case 5:
mf[5,] # row 5 of the model frame
##   DURATION_l VOWELS CLASS
## 5   8.523562   two+  open
mm[5,] # row 5 of the model matrix
##              (Intercept)               VOWELStwo+              CLASSnumber
##                        1                        1                        0
##                CLASSopen            CLASSpropname             CLASSunclear
##                        1                        0                        0
##   VOWELStwo+:CLASSnumber     VOWELStwo+:CLASSopen VOWELStwo+:CLASSpropname
##                        0                        1                        0
##  VOWELStwo+:CLASSunclear
##                        0
sum(              # the summed
   "*"(           # product of
      mm[5,],     # the row of the model matrix &
      coef(m_01)  # the model's coefficients
))
## [1] 7.950388
predict(m_01)[5] %>% unname
## [1] 7.950388
What glht does is this:
it also multiplies each coefficient of the model (i.e., the values of coef(m_01)) with the corresponding number in the 1-row version of the model matrix you create;
it also sums up all those products; but then
with the function summary, we can then test whether that sum is significantly different from some value (default: 0);
with the function confint, we can compute the confidence interval of that sum.
Here is this procedure shown in a stepwise fashion: We generate a vector I’m calling mm_index with 10 values and we do that such that, when each of the 10 values of mm_index is multiplied with the corresponding value of coef(m_01), the sum of these 10 products is the second coefficient, for which we want to know whether it’s significantly different from 0. Here’s the vector mm_index, which I put into a data frame just to easily represent how it lines up with the coefficients:
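mm_index <- c(0,1,0,0,0,0,0,0,0,0)
(temp <- data.frame(
   COEF=coef(m_01),
   INDX=mm_index))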
Steps 1 and 2 of a glht application are then this:
(temp$COEF * temp$INDX) %>% # step 1
   sum                      # step 2
[1] -0.1072129
And you can see we did this right, because, trivially, this is indeed the value of coef(m_01)[2]. To do this with glht, we create a glht object, which I just call qwe:
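# step 1 & step 2 w/ glht
(qwe <- glht(
   model=m_01,
   linfct=matrix(mm_index, nrow=1)))

We again see the desired value: glht did what we said it did. And now we can use qwe as an argument to summary and confint to get a significance test of whether that value of -0.1072 is significantly different from 0 as well as the 95% confidence interval of that value:

summary(qwe) # try: summary(qwe)$test[3:6] %>% unlist
confint(qwe) # try: confint(qwe)$confint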
Ideally, you’d immediately write a function like glht.summy that takes as input a glht object and then returns both the relevant summary and confint contents:
glht.summy(qwe)
estimate lower upper se t/z p
-0.10721 -0.15918 -0.05525 0.02651 -4.04458 0.00005
Of course, given everything we know already – we know summary and confint – these particular results are boring: we could have gotten them like this:
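c(round(summary(m_01)$coef[2,], 5), round(confint(m_01)[2,], 5))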
Estimate Std. Error t value Pr(>|t|) 2.5 % 97.5 %
-0.10721 0.02651 -4.04458 0.00005 -0.15918 -0.05525
But the point is to abstract away from this specific example because, with what we’ve just learned, we can now give glht any 1-row matrix codifying any specific combination of predictor values (!), and it will always do that above kind of pairwise multiplication, summing up, and testing of how that sum differs from 0. That means we can answer all questions of all question sets (even question set 3) from above with just glht, and that’s what we’ll practice now.
1.2 Practical applications
Quick reminder of the 9 scenarios:
Table 1: 9 scenarios relevant for post-hoc exploration of models

|  | a: what are the means/slopes? | b: are they * diff from 0? | c: are they * diff from each other? |
|---|---|---|---|
| 1: means for (combos of) levels of cat preds | scenario 1a | scenario 1b | scenario 1c |
| 2: means for values of numeric preds | scenario 2a | scenario 2b | scenario 2c |
| 3: slopes for (combos of) levels of cat preds | scenario 3a | scenario 3b | scenario 3c |
1.2.1 Scenarios 1a & 1b
Scenario 1a meant to ask the following question: What is each of the 10 predicted means of DURATION_l for when you cross the 2 levels of VOWELS and the 5 levels of CLASS? And scenario 1b meant to ask whether each of the 10 means computed above is significantly different from 0 (and what the exact p-value is). We discussed last week that one of the simplest ways to get the information for scenario 1a is with nd:
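nd <- expand.grid(
   CLASS=levels(d$CLASS),
   VOWELS=levels(d$VOWELS))
(nd <- cbind(nd,
   predict(m_01, newdata=nd, interval="confidence")))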
CLASS VOWELS fit lwr upr
1 closed one 7.017088 6.991226 7.042950
2 number one 7.953662 7.809975 8.097349
3 open one 7.852672 7.785243 7.920102
4 propname one 8.529168 8.383230 8.675106
5 unclear one 7.848304 7.607870 8.088738
6 closed two+ 6.909875 6.864804 6.954947
7 number two+ 7.905112 7.631366 8.178858
8 open two+ 7.950388 7.915274 7.985501
9 propname two+ 8.491001 8.123732 8.858270
10 unclear two+ 7.793538 7.212834 8.374241
data.frame(effect("VOWELS:CLASS", m_01))
VOWELS CLASS fit se lower upper
1 one closed 7.017088 0.01319279 6.991226 7.042950
2 two+ closed 6.909875 0.02299163 6.864804 6.954947
3 one number 7.953662 0.07329702 7.809975 8.097349
4 two+ number 7.905112 0.13964245 7.631366 8.178858
5 one open 7.852672 0.03439690 7.785243 7.920102
6 two+ open 7.950388 0.01791204 7.915274 7.985501
7 one propname 8.529168 0.07444535 8.383230 8.675106
8 two+ propname 8.491001 0.18735001 8.123732 8.858270
9 one unclear 7.848304 0.12264937 7.607870 8.088738
10 two+ unclear 7.793538 0.29622638 7.212834 8.374241
data.frame(emmeans(m_01, specs=~ VOWELS|CLASS))
VOWELS CLASS emmean SE df lower.CL upper.CL
1 one closed 7.017088 0.01319279 6373 6.991226 7.042950
2 two+ closed 6.909875 0.02299163 6373 6.864804 6.954947
3 one number 7.953662 0.07329702 6373 7.809975 8.097349
4 two+ number 7.905112 0.13964245 6373 7.631366 8.178858
5 one open 7.852672 0.03439690 6373 7.785243 7.920102
6 two+ open 7.950388 0.01791204 6373 7.915274 7.985501
7 one propname 8.529168 0.07444535 6373 8.383230 8.675106
8 two+ propname 8.491001 0.18735001 6373 8.123732 8.858270
9 one unclear 7.848304 0.12264937 6373 7.607870 8.088738
10 two+ unclear 7.793538 0.29622638 6373 7.212834 8.374241
But then we said we can’t really answer question set 1b without stupid releveling – well, now we can, even if it will at first be painful (but powerful!). But now you will finally realize why I always placed so much emphasis on your understanding what the coefficients mean and how we use them to compute predictions.
The intercept – when CLASS is closed and VOWELS is one – is very easy:
summary(m_01)$coef[1,]
Estimate Std. Error t value Pr(>|t|)
7.01708812 0.01319279 531.88808627 0.00000000
nd[1,]
CLASS VOWELS fit lwr upr
1 closed one 7.017088 6.991226 7.04295
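summary(glht(model=m_01, linfct=matrix(c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0), nrow=1)))

The predictions for the other levels of CLASS when VOWELS is one work the same way: we just also put a 1 at the position of the relevant CLASS coefficient. For instance, the prediction for when CLASS is unclear and VOWELS is one requires that we add to the intercept the coefficient for CLASSunclear:

summary(glht(model=m_01, linfct=matrix(c(1, 0, 0, 0, 0, 1, 0, 0, 0, 0), nrow=1)))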
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + VOWELS * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 7.8483 0.1226 63.99 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
As you can see, the 1s in each (model) matrix we give glht sit at the row indices of the required coefficients in the model summary. But now it gets a bit more complex because we turn to VOWELS being two+, which means we need to consider more coefficients.
The prediction for when CLASS is closed and VOWELS is two+ requires that we add to the intercept the coefficient for VOWELStwo+:
summary(m_01)$coef[c(1,2),]
Estimate Std. Error t value Pr(>|t|)
(Intercept) 7.0170881 0.01319279 531.888086 0.000000e+00
VOWELStwo+ -0.1072129 0.02650782 -4.044576 5.303671e-05
nd[6,]
CLASS VOWELS fit lwr upr
6 closed two+ 6.909875 6.864804 6.954947
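summary(glht(model=m_01, linfct=matrix(c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0), nrow=1)))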
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + VOWELS * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 6.90988 0.02299 300.5 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
The prediction for when CLASS is number and VOWELS is two+ requires that we add to the intercept the coefficient for CLASSnumber, the coefficient for VOWELStwo+, and the one for their interaction:
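summary(glht(model=m_01, linfct=matrix(c(1, 1, 1, 0, 0, 0, 1, 0, 0, 0), nrow=1)))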
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + VOWELS * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 7.9051 0.1396 56.61 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
The prediction for when CLASS is open and VOWELS is two+ requires that we add to the intercept the coefficient for CLASSopen, the coefficient for VOWELStwo+, and the one for their interaction:
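summary(glht(model=m_01, linfct=matrix(c(1, 1, 0, 1, 0, 0, 0, 1, 0, 0), nrow=1)))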
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + VOWELS * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 7.95039 0.01791 443.9 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
The prediction for when CLASS is propname and VOWELS is two+ requires that we add to the intercept the coefficient for CLASSpropname, the coefficient for VOWELStwo+, and the one for their interaction:
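summary(glht(model=m_01, linfct=matrix(c(1, 1, 0, 0, 1, 0, 0, 0, 1, 0), nrow=1)))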
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + VOWELS * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 8.4910 0.1874 45.32 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
The prediction for when CLASS is unclear and VOWELS is two+ requires that we add to the intercept the coefficient for CLASSunclear, the coefficient for VOWELStwo+, and the one for their interaction:
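summary(glht(model=m_01, linfct=matrix(c(1, 1, 0, 0, 0, 1, 0, 0, 0, 1), nrow=1)))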
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + VOWELS * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 7.7935 0.2962 26.31 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
Yes, it’s cumbersome, but if you know what the coefficients mean, it’s also not that bad and this approach answers all questions from scenarios 1a and 1b, which will become especially interesting when we turn to slopes, i.e. scenarios 3a and 3b.
1.2.2 Scenario 1c
Scenario 1c meant to ask whether (some? all?) differences between the 10 means are significant. Last week we used pairs(emmeans(...)), and this is probably the best way to proceed, but to really drive home the point of how powerful glht is, let’s also look at glht-based comparisons between means. Imagine you were interested in answering the following question: Is the prediction for when VOWELS is two+ and CLASS is propname significantly different from when VOWELS is two+ and CLASS is unclear? With emmeans, this is what we’d do …
nd[9,"fit"] - nd[10,"fit"]qwe <-pairs(emmeans( # compute pairwise comparisons of meansobject=m_01, # for the model m_01specs=~ CLASS | VOWELS), # of CLASS within each level of VOWELSadjust="none") # do not adjust for multiple comparisonsqwe[20,]
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + VOWELS * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.6975 0.3505 1.99 0.0466 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
You can see that this is the right result by comparing this output to the above result from emmeans and to the difference in predictions from nd. Again, yes, it seems cumbersome, but you have to remember that, since you define the (model) matrices, any comparison you’re interested in is suddenly at your fingertips. As mentioned above, this will become especially interesting when we turn to slopes, i.e. scenarios 3a and 3b.
1.2.3 Scenarios 2a & 2b
Scenario 2a meant to ask the following question: what are the 5 predicted means of DURATION_l for each of the levels of CLASS when SURPRISAL_s is controlled for in some way? And scenario 2b meant to ask whether each of the 5 means is significantly different from 0 (and what the exact p-value is).
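(A reminder: m_02 is the corresponding model with the numeric predictor; as the Fit lines of the glht outputs below show, it was fit as m_02 <- lm(DURATION_l ~ 1 + SURPRISAL_s * CLASS, data=d).)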
To address scenario 2a, we previously did this …
nd <- expand.grid(
   SURPRISAL_s=c(0, 1, mean(d$SURPRISAL_s)), # main values of SURPRISAL_s
   CLASS=levels(d$CLASS)) %>%                # all levels of CLASS
   "["(,2:1)                                 # reorder columns
(nd <- cbind(nd, predict(m_02, newdata=nd, interval="confidence")))
… and then we looked at the predictions when SURPRISAL_s is 0. But then we said we can’t really address scenario 2b without stupid releveling – well, now we can, even if it is still painful (yet powerful). But to use the capabilities of glht some more, let’s actually use it to create/check the predictions that emmeans would make by default, because those are not for when SURPRISAL_s is 0, but for when SURPRISAL_s is at its mean:
So, let’s quickly compute a 1-element vector / variable with the mean of SURPRISAL_s (which is already part of nd):
(mean_surp <- mean(d$SURPRISAL_s))
[1] 2.370868
The prediction for when CLASS is closed and SURPRISAL_s is its mean requires that we add to the intercept – the prediction when CLASS is closed and SURPRISAL_s is 0 – mean_surp times the coefficient/slope for SURPRISAL_s:
summary(m_02)$coef[c(1,2),]
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.1697983 0.03707286 166.42359 0.000000e+00
SURPRISAL_s 0.3926787 0.01696794 23.14239 7.685868e-114
nd[3,]
CLASS SURPRISAL_s fit lwr upr
3 closed 2.370868 7.100787 7.077645 7.12393
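summary(glht(model=m_02, # following the recipe above; returns the 7.1008 from nd[3,]
   linfct=matrix(c(1, mean_surp, 0, 0, 0, 0, 0, 0, 0, 0), nrow=1)))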
The prediction for when CLASS is number and SURPRISAL_s is its mean requires that, to the intercept, we add mean_surp times the coefficient/slope for SURPRISAL_s, the coefficient for CLASSnumber, and mean_surp times the coefficient/slope for their interaction:
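summary(glht(model=m_02,
   linfct=matrix(c(1, mean_surp, 1, 0, 0, 0, mean_surp, 0, 0, 0), nrow=1)))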
The prediction for when CLASS is open and SURPRISAL_s is its mean requires that, to the intercept, we add mean_surp times the coefficient/slope for SURPRISAL_s, the coefficient for CLASSopen, and mean_surp times the coefficient/slope for their interaction:
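summary(glht(model=m_02,
   linfct=matrix(c(1, mean_surp, 0, 1, 0, 0, 0, mean_surp, 0, 0), nrow=1)))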
The prediction for when CLASS is propname and SURPRISAL_s is its mean requires that, to the intercept, we add mean_surp times the coefficient/slope for SURPRISAL_s, the coefficient for CLASSpropname, and mean_surp times the coefficient/slope for their interaction:
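summary(glht(model=m_02,
   linfct=matrix(c(1, mean_surp, 0, 0, 1, 0, 0, 0, mean_surp, 0), nrow=1)))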
The prediction for when CLASS is unclear and SURPRISAL_s is its mean requires that, to the intercept, we add mean_surp times the coefficient/slope for SURPRISAL_s, the coefficient for CLASSunclear, and mean_surp times the coefficient/slope for their interaction:
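summary(glht(model=m_02,
   linfct=matrix(c(1, mean_surp, 0, 0, 0, 1, 0, 0, 0, mean_surp), nrow=1)))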
1.2.4 Scenario 2c
Scenario 2c meant to ask whether (some? all?) differences between the 5 means are significant. Last week we used pairs(emmeans(...)), and this is probably the best way to proceed, but to really drive home the point of how powerful glht is, let’s also look at glht-based comparisons between means. Imagine you were interested in answering the following question: Is the prediction for when SURPRISAL_s is its mean and CLASS is propname significantly different from when SURPRISAL_s is its mean and CLASS is unclear? With emmeans, this is what we’d do (because emmeans does us the favor of setting SURPRISAL_s to its mean for this):
nd[12,"fit"] - nd[15,"fit"]qwe <-pairs(emmeans( # compute pairwise comparisons of meansobject=m_02, # for the model m_01specs=~ CLASS | SURPRISAL_s), # of CLASS within each level of VOWELSadjust="none") # do not adjust for multiple comparisonsqwe[10,]
But now we use glht. As before I show the super stepwise way where we define two (model) matrices, one for each of the two situations that you want to compare, then you subtract one from the other, and then you give that difference (model) matrix to glht. And, thankfully, we already generated the relevant matrices above; I just copy and paste from above and assign proper names to them:
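SUR.is.mean.CLA.is.prop <- c(1, mean_surp, 0, 0, 1, 0, 0, 0, mean_surp, 0)
SUR.is.mean.CLA.is.uncl <- c(1, mean_surp, 0, 0, 0, 1, 0, 0, 0, mean_surp)
summary(glht(m_02, linfct=matrix(
   SUR.is.mean.CLA.is.prop - SUR.is.mean.CLA.is.uncl, nrow=1)))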
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + SURPRISAL_s * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.4945 0.2015 2.455 0.0141 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
You can see that this is the right result by comparing this output to the result from emmeans and the one from nd above. Again, yes, it seems cumbersome, but note that you can use the glht logic for arbitrary values of SURPRISAL_s, not just 0, 1, or its mean: If any prior work reported a certain significant effect for a certain value of SURPRISAL_s, you can just plug that value into the above code to see whether that value also yields a significant difference in your data. But again, this will become most interesting when we turn to slopes, which we do now.
1.2.5 Scenarios 3a & 3b
Scenario 3a meant to ask the following question: what are the 5 predicted slopes of SURPRISAL_s for each of the levels of CLASS? And scenario 3b meant to ask whether each of these 5 slopes is significantly different from 0 (and what the exact p-value is).
To address scenario 3a last week, the shortest thing we did was this:
# tapply(nd$fit,           # apply a grouping of the predictions
#    list(nd$CLASS,        # by CLASS
#         nd$SURPRISAL_s), # and SURPRISAL_s
#    c) %>%                # just list the values,
#    "["(,-3) %>%          # suppress the 3rd column
#    apply(1, diff)        # compute the differences
# or
# nd.split <- split(nd, nd$SURPRISAL_s)
# nd.split$`1`$fit - nd.split$`0`$fit
coef(m_02)[2] + c(0, coef(m_02)[7:10])
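And scenario 3b – testing each of these slopes against 0 – works with the same glht logic; for instance, for the slope of SURPRISAL_s when CLASS is number:

summary(glht(m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 1, 0, 0, 0), nrow=1)))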
1.2.6 Scenario 3c
Scenario 3c meant to ask whether (some? all?) differences between the 5 slopes of SURPRISAL_s for the different levels of CLASS are significant. Last week we used emmeans::emtrends for this, which is probably the best way to proceed:
pairs(emtrends(        # compute pairwise comparisons of trends/slopes
   object=m_02,        # for the model m_02
   specs="CLASS",      # compute comparisons over this predictor
   var="SURPRISAL_s"), # namely all slopes of this predictor
   adjust="none")      # do not adjust for multiple comparisons
contrast estimate SE df t.ratio p.value
closed - number 0.3263 0.1290 6373 2.533 0.0113
closed - open 0.0188 0.0285 6373 0.659 0.5099
closed - propname 0.3391 0.1090 6373 3.105 0.0019
closed - unclear 0.7174 0.2590 6373 2.765 0.0057
number - open -0.3075 0.1300 6373 -2.371 0.0178
number - propname 0.0129 0.1670 6373 0.077 0.9387
number - unclear 0.3912 0.2890 6373 1.355 0.1754
open - propname 0.3204 0.1100 6373 2.905 0.0037
open - unclear 0.6987 0.2600 6373 2.688 0.0072
propname - unclear 0.3783 0.2800 6373 1.349 0.1774
Now with glht … Imagine you were interested in whether the slope of SURPRISAL_s when CLASS is open is significantly different from when CLASS is unclear, but you wanted to solve this with glht. You will not be surprised to see that I demonstrate this in the same granular way as before: we define two (model) matrices, one for each of the two situations we want to compare, then subtract one from the other, and then give that difference (model) matrix to glht. And, thankfully, we already generated the relevant vectors above; I just copy and paste from above and assign proper names to them:
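slope.when.SUR.is.0.CLA.is.open <- c(0, 1, 0, 0, 0, 0, 0, 1, 0, 0) # slope of SURPRISAL_s, open
slope.when.SUR.is.0.CLA.is.uncl <- c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1) # slope of SURPRISAL_s, unclear
summary(glht(m_02, linfct=matrix(
   slope.when.SUR.is.0.CLA.is.open - slope.when.SUR.is.0.CLA.is.uncl, nrow=1)))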
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + SURPRISAL_s * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.6987 0.2599 2.688 0.0072 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
1.3 What does this mean for our regression summaries?
The above has at least some implications for how we might want to report the results of regression models. We saw how the summary table of m_02 provides maximally half of all the info one might want to provide with regard to the interaction SURPRISAL_s:CLASS:
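summary(m_02)$coef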
Given everything we’ve learned by now, we can do much better in two ways: First, we can represent the summary table in a format that makes it easier to digest for readers who may not always remember all your variables and, for factors, their reference levels, plus we can and probably should add confidence intervals. Thus, this is better:
|  | Estimate | 95%-CI | se | t | p2-tailed |
|---|---|---|---|---|---|
| Intercept (SURP=0 + CLASS=closed) | 6.1698 | [6.097, 6.242] | 0.0371 | 166.4236 | <0.0001 |
| SURP0→1 | 0.3927 | [0.359, 0.426] | 0.0170 | 23.1424 | <0.0001 |
| CLASSclosed→numb | 1.5894 | [0.882, 2.297] | 0.3608 | 4.4049 | <0.0001 |
| CLASSclosed→open | 0.7045 | [0.556, 0.853] | 0.0759 | 9.2811 | <0.0001 |
| CLASSclosed→propn | 2.1682 | [1.419, 2.917] | 0.3820 | 5.6761 | <0.0001 |
| CLASSclosed→uncl | 2.5707 | [1.147, 3.995] | 0.7265 | 3.5385 | 0.0004 |
| SURP0→1 + CLASSclosed→numb | -0.3263 | [-0.579, -0.074] | 0.1288 | -2.5333 | 0.0113 |
| SURP0→1 + CLASSclosed→open | -0.0188 | [-0.075, 0.037] | 0.0285 | -0.6590 | 0.5099 |
| SURP0→1 + CLASSclosed→propn | -0.3391 | [-0.553, -0.125] | 0.1092 | -3.1051 | 0.0019 |
| SURP0→1 + CLASSclosed→uncl | -0.7174 | [-1.226, -0.209] | 0.2595 | -2.7652 | 0.0057 |
But then, even if you provide effects plots, which I recommend always doing, it might be nice – (especially) for generalized linear models – to provide
the predicted means for the levels of CLASS (probably while SURPRISAL_s is at its mean);
the predicted slopes of SURPRISAL_s for each level of CLASS;
and – don’t forget those! – their confidence intervals.
Here are those results. First, we fix SURPRISAL_s to its mean and then compute the predictions for each level of CLASS:
| SURP=2.37 | CLASS: closed | CLASS: number | CLASS: open | CLASS: propname | CLASS: unclear |
|---|---|---|---|---|---|
| pred. DURA_l | 7.1008 | 7.9167 | 7.7608 | 8.4650 | 7.9705 |
| 95% CI | [7.0776, 7.1239] | [7.7604, 8.0729] | [7.7251, 7.7964] | [8.1993, 8.7307] | [7.6783, 8.2627] |
Then, we compute the slope of SURPRISAL_s for each level of CLASS:
|  | CLASS: closed | CLASS: number | CLASS: open | CLASS: propname | CLASS: unclear |
|---|---|---|---|---|---|
| slope of SURP | 0.3927 | 0.0664 | 0.3739 | 0.0536 | -0.3248 |
| 95% CI | [0.3594, 0.4259] | [-0.1839, 0.3167] | [0.3291, 0.4187] | [-0.1579, 0.2651] | [-0.8323, 0.1828] |
Tip
Something for you to check out: multcomp::glht actually also allows you to test specific hypotheses in a way that looks more like English; for instance, you can do a (now two-tailed) test of SURPRISAL_s being different from 0.425 like this: summary(glht(m_02, "SURPRISAL_s = 0.425")). Another function that is useful for stuff like this is marginaleffects::hypotheses.
1.4 What else is possible?
1.4.1 Orthogonal contrasts w/ glht
We can also define orthogonal contrasts with glht, because glht can really do anything. Let’s get the data set up again:
d$CLASS_orth <- d$CLASS
#                            clos  numb  open  prop  uncl
clo.op_vs_num.prop.unc <- c(+0.6, -0.4, +0.6, -0.4, -0.4)
clo_vs_op              <- c(+0.5,  0  , -0.5,  0  ,  0  )
num.prop_vs_uncl       <- c( 0  , -1/3,  0  , -1/3, +2/3)
num_vs_prop            <- c( 0  , +0.5,  0  , -0.5,  0  )
# & then we put those into a matrix like the above table & assign them as contrasts:
contrasts(d$CLASS_orth) <-   # make the following the contrasts of CLASS_orth
   cbind(                    # the matrix resulting from binding together as columns
      clo.op_vs_num.prop.unc, # all
      clo_vs_op,              # four
      num.prop_vs_uncl,       # contrast
      num_vs_prop)            # vectors
And let’s fit the two models:
summary(m_03_treatm <- lm(DURATION_l ~ 1 + CLASS     , data=d))$coef
summary(m_03_orthog <- lm(DURATION_l ~ 1 + CLASS_orth, data=d))$coef
1.4.1.1 The first orthogonal contrast
So how do we get the first contrast closed/open vs. number/propname/unclear with glht from m_03_treatm? Like this: we create a glht 1-row matrix that combines the predictions for when CLASS is closed or open into their unweighted mean, does the same for when CLASS is number, propname, or unclear, and subtracts one from the other:
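pred.closed   <- c(1, 0, 0, 0, 0) # the pred.* names here are just mnemonic
pred.number   <- c(1, 1, 0, 0, 0)
pred.open     <- c(1, 0, 1, 0, 0)
pred.propname <- c(1, 0, 0, 1, 0)
pred.unclear  <- c(1, 0, 0, 0, 1)
summary(glht(m_03_treatm, linfct=matrix(
   (pred.closed + pred.open)/2 -                   # unweighted mean of closed & open, minus
   (pred.number + pred.propname + pred.unclear)/3, # unweighted mean of the other three
   nrow=1)))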
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 -0.64244 0.05029 -12.77 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
And you can see, that’s right:
summary(m_03_orthog)$coef[2,]
Estimate Std. Error t value Pr(>|t|)
-6.424391e-01 5.029463e-02 -1.277351e+01 6.512723e-37
1.4.1.2 The second orthogonal contrast
So how do we get the second contrast closed vs. open (with glht from m_03_treatm)?
Actually, that’s a trick question: we don’t have to use glht because it’s already there (just with a different sign, but that doesn’t matter for the significance test):
summary(m_03_orthog)$coef[3,]
Estimate Std. Error t value Pr(>|t|)
-0.93901065 0.01960634 -47.89320883 0.00000000
summary(m_03_treatm)$coef[3,]
Estimate Std. Error t value Pr(>|t|)
0.93901065 0.01960634 47.89320883 0.00000000
So, don’t forget the old stuff because you’re so excited about the new stuff …
1.4.1.3 The third orthogonal contrast
So how do we get the third contrast number/propname vs. unclear with glht from m_03_treatm? Like this: We first create a glht 1-row matrix that combines the predictions for when CLASS is number or propname:
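num.prop <- (pred.number + pred.propname)/2 # unweighted mean of number & propname
summary(glht(m_03_treatm, linfct=matrix(num.prop - pred.unclear, nrow=1)))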
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.3933 0.1230 3.197 0.0014 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
And you can see, that’s right:
summary(m_03_orthog)$coef[4,]
Estimate Std. Error t value Pr(>|t|)
-0.393280161 0.123017647 -3.196941003 0.001395713
1.4.1.4 The fourth orthogonal contrast
And the last one, number vs. propname: We first create one glht 1-row matrix that computes the prediction for when CLASS is number, then another for when CLASS is propname, and then we compute the difference:
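summary(glht(m_03_treatm, linfct=matrix(pred.number - pred.propname, nrow=1)))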
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 -0.58079 0.09499 -6.114 1.03e-09 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
And you can see, that’s right:
summary(m_03_orthog)$coef[5,]
Estimate Std. Error t value Pr(>|t|)
-5.807882e-01 9.499252e-02 -6.114042e+00 1.028827e-09
So this is why I discussed glht so much: it’s very very powerful for all sorts of comparisons you can think of, in fact even for things you might not have considered yet, given how we proceeded – read on to find out.
1.4.2 Testing against specific values inside your model
We saw earlier – when we discussed the 1c/2c/3c scenarios – how we can use glht to compare values of our model coefficients directly against each other. We can also extend that a bit to ask more specific questions. For example, here are the coefficients of our ‘normal’ treatment contrast model again:
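summary(m_03_treatm)$coef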
Imagine that, based on previous studies, you had hypothesized earlier that the difference between closed-class words (the intercept) and proper names (i.e., coef(m_03_treatm)[4]) should be twice as big as the difference between closed-class words and open-class words (i.e., coef(m_03_treatm)[3]). We can check this kind of hypothesis with glht as well and I’m doing it super stepwise again:
closed_vs_open     <- matrix(c(0,0,1,0,0), nrow=1)
closed_vs_propname <- matrix(c(0,0,0,1,0), nrow=1)
summary(glht(m_03_treatm,
   "-"(                      # is the difference between
      closed_vs_propname,    # closed vs. proper names
      2 * closed_vs_open)))  # twice as big as closed vs. open
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 -0.34459 0.07709 -4.47 7.97e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
The result is significant – what does that mean? It means that
the difference between closed-class words and proper names on the one hand and
two times the difference between closed- and open-class words on the other
is significantly different from 0, meaning the ratio of
the difference between closed-class words and proper names on the one hand and
the difference between closed- and open-class words on the other
is significantly different from 2. What is the ratio? This:
coef(m_03_treatm)[4] / coef(m_03_treatm)[3]
CLASSpropname
1.633029
…, meaning, for the hypothesis to work out as expected, either the difference between closed- and open-class words is too big, or the difference between closed-class words and proper names is too small (or both), something we were not able to test easily before.
1.4.3 Testing against specific values outside your model
One final really nice thing we can do is this: So far, we always ‘only’ tested whether a certain coefficient is significantly different from 0 or whether the sum of some painstakingly-defined linear combinations of coefficients is significantly different from 0. But we can also test whether a certain coefficient or the sum of some painstakingly-defined linear combinations of coefficients is different from some other value.
Let’s first check out a very simple (monofactorial) model for that:
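summary(m_04 <- lm(DURATION_l ~ 1 + SURPRISAL_s, data=d))$coef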
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.921625 0.03118183 189.90626 0
SURPRISAL_s 0.601603 0.01255319 47.92431 0
Imagine a previous study reported that the slope of SURPRISAL_s was 0.625. However, you actually expected this slope to be lower than that, which is why you were happy to see that, in your data, the slope of SURPRISAL_s is indeed smaller: it’s not 0.625, it’s 0.601603. But is that difference significant? How can we address this?
There are three ways one might consider. The first is very straightforward, but doesn’t always help; we can try and use the confidence interval (and here we would of course not even need glht for that):
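confint(m_04)[2,] # the (2-tailed) 95%-CI of the slope of SURPRISAL_s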
This confidence interval of our slope does include 0.625 – but why is this wrong?
We actually want to look at this in a one-tailed way – remember, we expected the slope of SURPRISAL_s for closed class words to be less than 0.625. For this, we could of course adjust the level of the confidence interval:
confint(glht(                # show the confint of a glht, namely
   m_04,                     # for the sum of the product of the coeffs of m_04
   matrix(c(0,1), nrow=1)),  # and these values
   level=0.9)                # cut off 5% on each side, not 2.5%
Now, the confidence interval of our smaller estimate does not include the previous value of 0.625: given our expectation, our estimate of 0.601603 is indeed significantly smaller than that of the previous study. But how do we get a proper p-value? Here’s the second way: We use summary(glht(...)) again, but add two arguments to it:
the argument rhs, whose default value is 0, which is why all our tests so far were set up to test against whether something is different from 0; now we set that to the value we want to test against, 0.625;
the well-known argument alternative whose default value is, as it is in general, two.sided, but now we set it to less because we expect the coefficient to be less than the rhs we are testing against:
summary(glht(               # show the summary of a glht, namely
   m_04,                    # check whether the sum of the coeffs of m_04
   matrix(c(0,1), nrow=1),  # multiplied by this
   rhs=0.625,               # is significantly different from 0.625, namely
   alternative="less"))     # by being smaller
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + SURPRISAL_s, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(<t)
1 >= 0.625 0.60160 0.01255 -1.864 0.0312 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
What this does is this:
(summary(m_04)$coef[2,"Estimate"] - 0.625) /
   summary(m_04)$coef[2,"Std. Error"] # t-score of glht
[1] -1.863831
pt(q=-1.863831, df=m_04$df.residual, lower.tail=TRUE) # 1-tailed p-value of glht
[1] 0.03119567
Now we have all we want:² the one-tailed 95%-confidence interval of 0.601603 does not overlap with 0.625 and we have a proper p-value as well; we can report that, as expected, the slope of SURPRISAL_s is indeed significantly smaller than the value of 0.625 reported in the previous study.
A third way that can also work involves model comparison. We have a model m_04 where the slope of SURPRISAL_s is 0.601603, but we can also create a model m_05, in which the slope of SURPRISAL_s is fixed to 0.625 (using offset), and then we can use anova for one of our usual model comparisons:
m_05 <- lm(DURATION_l ~ 1 +
   offset(0.625*SURPRISAL_s), # here we fix the slope of SURPRISAL_s
   data=d)
anova(m_04, m_05, test="F")
Analysis of Variance Table
Model 1: DURATION_l ~ 1 + SURPRISAL_s
Model 2: DURATION_l ~ 1 + offset(0.625 * SURPRISAL_s)
Res.Df RSS Df Sum of Sq F Pr(>F)
1 6381 3524.5
2 6382 3526.4 -1 -1.9187 3.4739 0.06239 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
While this is a two-tailed comparison, these are the same results as from the summary(glht(...)):
the p-value here is twice glht’s p-value;
the F-score here is glht’s t-value squared.
The exact same logic applies to more complex models; let’s look at m_02 again:
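summary(m_02)$coef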
Imagine a previous study reported that the slope of SURPRISAL_s for numbers was 0.3 (and significantly different from 0). If you actually expected to find in your data that this slope
would be significantly lower than 0.3, and in fact
would not be significantly different from 0,
how could you explore and test this?
# question 1:
summary(glht(              # show the summary of a glht, namely
   m_02,                   # check whether the sum of the coeffs of m_02
   matrix(c(0,1,0,0,0,0,1,0,0,0), nrow=1), # multiplied by this
   rhs=0.3,                # is significantly different from 0.3, namely
   alternative="less"))    # by being smaller
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + SURPRISAL_s * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(<t)
1 >= 0.3 0.06642 0.12766 -1.83 0.0337 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
# question 2:
confint(glht(              # show the confint of a glht, namely
   m_02,                   # for the sum of the product of the coeffs of m_02
   matrix(c(0,1,0,0,0,0,1,0,0,0), nrow=1)), # and these values
   level=0.9)              # cut off 5% on each side, not 2.5%
summary(glht(              # show the summary of a glht, namely
   m_02,                   # check whether the sum of the coeffs of m_02
   matrix(c(0,1,0,0,0,0,1,0,0,0), nrow=1), # multiplied by this
   alternative="less"))    # by being smaller
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + SURPRISAL_s * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(<t)
1 >= 0 0.06642 0.12766 0.52 0.699
(Adjusted p values reported -- single-step method)
Tip
See Faraway (2015, Section 3.2) for some more discussion.
1.4.4 Model matrices from effects
Ok, after all this painful manual defining of ‘model matrices’ to give to glht, you will be glad to hear there is one bit of relief I can offer. Remember the above 2c case where we wanted to find out whether the following two predictions are significantly different from each other:
when SURPRISAL_s is at its mean and CLASS is propname;
when SURPRISAL_s is at its mean and CLASS is unclear.
Above, we did this and I think we can all agree that defining SUR.is.mean.CLA.is.prop and SUR.is.mean.CLA.is.uncl is less than fun:
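summary(glht(m_02, linfct=matrix(
   SUR.is.mean.CLA.is.prop - SUR.is.mean.CLA.is.uncl, nrow=1)))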
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + SURPRISAL_s * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.4945 0.2015 2.455 0.0141 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
Here’s the good news, which comes in two steps that will simplify things considerably. Remember that we already defined nd above as a data frame with all combinations of potentially interesting predictor values: all levels of categorical predictors and c(0, 1, mean) for all numeric predictors; I am reducing it here to just the two columns with the predictor combinations:
(nd <- nd[,1:2])
CLASS SURPRISAL_s
1 closed 0.000000
2 closed 1.000000
3 closed 2.370868
4 number 0.000000
5 number 1.000000
6 number 2.370868
7 open 0.000000
8 open 1.000000
9 open 2.370868
10 propname 0.000000
11 propname 1.000000
12 propname 2.370868
13 unclear 0.000000
14 unclear 1.000000
15 unclear 2.370868
Step 1 of the simplification is realizing that we can use a version of nd as the argument xlevels to effect. Remember how we always define our effects object?
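Like this, only that we now also hand over the relevant predictor values (a sketch; the object name sur_cla is just a label):

sur_cla <- effect(
   "SURPRISAL_s:CLASS", m_02,
   xlevels=list(SURPRISAL_s=unique(nd$SURPRISAL_s))) # the SURPRISAL_s values from nd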
That means, once we give effect an nd object, it can do the work of defining model matrix vectors for us. If we’re interested in the prediction when SURPRISAL_s is at its mean and CLASS is propname, we don’t need to manually develop this anymore like this …
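# a sketch, assuming the rows of the effect object's model matrix line up
# with nd (check sur_cla$x): row 12 is propname @ mean, row 15 is unclear @ mean
summary(glht(m_02, linfct=
   sur_cla$model.matrix[12,,drop=FALSE] -
   sur_cla$model.matrix[15,,drop=FALSE]))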
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + SURPRISAL_s * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.4945 0.2015 2.455 0.0141 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
In other words, all we need to do is give effect the combinations of things we might be interested in and then pick the right row(s) from its model matrix.
This also works for slopes. Remember the above 3c case where we wanted to find out whether the slope of SURPRISAL_s when CLASS is open is significantly different from when CLASS is unclear? Above, we did this, and I think we can all agree that defining slope.when.SUR.is.0.CLA.is.open and slope.when.SUR.is.0.CLA.is.uncl was kinda awful:
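# the same sketch for slopes: within each CLASS, the slope is the difference
# between the rows for SURPRISAL_s=1 and SURPRISAL_s=0 (rows as in nd):
slope.open <- sur_cla$model.matrix[8,]  - sur_cla$model.matrix[7,]  # open
slope.uncl <- sur_cla$model.matrix[14,] - sur_cla$model.matrix[13,] # unclear
summary(glht(m_02, linfct=matrix(slope.open - slope.uncl, nrow=1)))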
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + SURPRISAL_s * CLASS, data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.6987 0.2599 2.688 0.0072 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
I hope you see how big a simplification this is; it’s a good enough reason to always define an nd object and consider effects for your follow-up analyses.
Interpretation: The more surprising a word is, the longer its duration, but that effect is significantly stronger for words with just one vowel than for words with two+ vowels. But here are some questions for you:
what is the slope of SURPRISAL_sc for closed-class words with one vowel? (question type 3a)
what is the slope of SURPRISAL_sc for closed-class words with two+ vowels? (question type 3a)
is the difference between the two significant? What is its p-value? What is its confidence interval? (question type 3c)
For the slope of SURPRISAL_sc for closed-class words with one vowel, we can use nd:
nd <- expand.grid(
   SURPRISAL_sc=0:1,            # main values of SURPRISAL_sc
   CLASS=levels(d$CLASS),       # all levels of CLASS
   VOWELS=levels(d$VOWELS)) %>% # all levels of VOWELS
   "["(,3:1)                    # reorder columns
nd <- cbind(nd, PREDS=predict(
   m_final, newdata=nd, interval="confidence"))
colnames(nd)[4:6] <- c("fit", "lwr", "upr")
# 1: slope when CLASS is "closed" & when VOWEL is "one"
# this is a scenario 3a question
nd$fit[2] - nd$fit[1]
[1] 0.4403235
coef(m_final)[2]
SURPRISAL_sc
0.4403235
But we can now also solve this with glht.
# 1: slope when CLASS is "closed" & when VOWEL is "one"
# this is a scenario 3a question
CLASS.is.clos.N.SURPR.is.0.N.VOW.is.1 <- c(1,0,0,0,0,0,0,0,0,0,0,0,0)
CLASS.is.clos.N.SURPR.is.1.N.VOW.is.1 <- c(1,1,0,0,0,0,0,0,0,0,0,0,0)
("-"(
   CLASS.is.clos.N.SURPR.is.1.N.VOW.is.1,
   CLASS.is.clos.N.SURPR.is.0.N.VOW.is.1)) %>%
   matrix(nrow=1) %>%
   glht(m_final, .) %>% summary
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + (SURPRISAL_sc + VOWELS + CLASS)^2,
data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.44032 0.01884 23.37 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
For the slope of SURPRISAL_sc for closed-class words with two+ vowels, we can also use nd:
# 2: slope when CLASS is "closed" & when VOWEL is "two+"
# this is a scenario 3a question
nd$fit[10] - nd$fit[9]
[1] 0.2599488
coef(m_final)[2] + coef(m_final)[7]
SURPRISAL_sc
0.2599488
But we can now also solve this with glht.
# 2: slope when CLASS is "closed" & when VOWEL is "two+"
# this is a scenario 3a question
CLASS.is.clos.N.SURPR.is.0.N.VOW.is.2p <- c(1,0,1,0,0,0,0,0,0,0,0,0,0)
CLASS.is.clos.N.SURPR.is.1.N.VOW.is.2p <- c(1,1,1,0,0,0,1,0,0,0,0,0,0)
("-"(
   CLASS.is.clos.N.SURPR.is.1.N.VOW.is.2p,
   CLASS.is.clos.N.SURPR.is.0.N.VOW.is.2p)) %>%
   matrix(nrow=1) %>%
   glht(m_final, .) %>% summary
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + (SURPRISAL_sc + VOWELS + CLASS)^2,
data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.25995 0.02855 9.105 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
Finally we ask: Is the difference between the two significant? What is its p-value? What is its confidence interval? Above we did this, …
# 3: difference between the two: p and confint
# this is a scenario 3c question
summary(m_final)$coefficients[7,]
Estimate Std. Error t value Pr(>|t|)
-1.803747e-01 3.130768e-02 -5.761354e+00 8.736560e-09
pairs(emtrends(            # compute pairwise comparisons of trends/slopes
   object=m_final,         # for the model m_final
   specs=~ VOWELS | CLASS, # compute comparisons over this
   var="SURPRISAL_sc"),    # namely all slopes of this predictor
   adjust="none")[1,]      # do not adjust for multiple comparisons
contrast CLASS estimate SE df t.ratio p.value
one - (two+) closed 0.18 0.0313 6329 5.761 <0.0001
With glht, we can also use something that, if done stepwise, is quite a monstrosity:
# 3: difference between the two: p and confint
# this is a scenario 3c question
"-"(    # compute the difference between 2 slopes, namely the
   "-"( # slope when CLASS is "closed" & when VOWEL is "two+"
      CLASS.is.clos.N.SURPR.is.1.N.VOW.is.2p,
      CLASS.is.clos.N.SURPR.is.0.N.VOW.is.2p),
   "-"( # slope when CLASS is "closed" & when VOWEL is "one"
      CLASS.is.clos.N.SURPR.is.1.N.VOW.is.1,
      CLASS.is.clos.N.SURPR.is.0.N.VOW.is.1)) %>%
   # put that into a difference (model) matrix:
   matrix(nrow=1) -> diff_matrix
glht(m_final, diff_matrix) %>% summary
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + (SURPRISAL_sc + VOWELS + CLASS)^2,
data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 -0.18037 0.03131 -5.761 8.74e-09 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
Interpretation: The more surprising a word is, the longer its duration is, but that effect is not the same across all classes of words. Specifically and controlling for other predictors,
the slope of SURPRISAL_sc seems to be higher for open-class words than for closed-class words;
the slope of SURPRISAL_sc seems to be very flat for numbers and proper names.
But here are the questions for you we asked in week 1:
what is the slope of SURPRISAL_sc for closed-class words with one vowel? (question type 3a)
what is the slope of SURPRISAL_sc for open-class words with one vowel? (question type 3a)
is the difference between the two significant? What is its p-value? What is its confidence interval? (question type 3c)
what is the slope of SURPRISAL_sc for numbers with one vowel? (question type 3a)
what is the slope of SURPRISAL_sc for proper names with one vowel? (question type 3a)
is the difference between the two significant? What is its p-value? What is its confidence interval? (question type 3c)
are these two slopes significantly different from 0? what are the p-values? (question type 3b)
The first one we already dealt with, so let’s now tackle the rest.
# 2: slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "open"
# this is a scenario 3a question
nd$fit[6] - nd$fit[5]
[1] 0.5151296
coef(m_final)[2] + coef(m_final)[9]
SURPRISAL_sc
0.5151296
# now with glht:
CLASS.is.open.N.SURPR.is.0.N.VOW.is.1 <- c(1,0,0,0,1,0,0,0,0,0,0,0,0)
CLASS.is.open.N.SURPR.is.1.N.VOW.is.1 <- c(1,1,0,0,1,0,0,0,1,0,0,0,0)
("-"(
   CLASS.is.open.N.SURPR.is.1.N.VOW.is.1,
   CLASS.is.open.N.SURPR.is.0.N.VOW.is.1)) %>%
   matrix(nrow=1) %>%
   glht(m_final, .) %>% summary
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + (SURPRISAL_sc + VOWELS + CLASS)^2,
data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.51513 0.03369 15.29 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
# 3: difference between the two: p and confint
# this is a scenario 3c question
"-"(    # compute the difference between 2 slopes, namely the
   "-"( # slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "open"
      CLASS.is.open.N.SURPR.is.1.N.VOW.is.1,
      CLASS.is.open.N.SURPR.is.0.N.VOW.is.1),
   "-"( # slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "closed"
      CLASS.is.clos.N.SURPR.is.1.N.VOW.is.1,
      CLASS.is.clos.N.SURPR.is.0.N.VOW.is.1)) %>%
   # put that into a difference (model) matrix:
   matrix(nrow=1) -> diff_matrix
glht(m_final, diff_matrix) %>% summary
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + (SURPRISAL_sc + VOWELS + CLASS)^2,
data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.07481 0.03283 2.278 0.0227 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
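For #4 and #5 we need the analogous vectors for numbers and proper names (we will need them again for #6 below); the vectors here are reconstructed to follow the same coefficient order as above, and we then test, e.g., the slope for proper names against 0:

# 4 & 5: slopes of SURPRISAL_sc when VOWEL is "one" & CLASS is "number"/"propname"
# these are scenario 3a/3b questions; vectors reconstructed to match the order above
CLASS.is.numb.N.SURPR.is.0.N.VOW.is.1 <- c(1,0,0,1,0,0,0,0,0,0,0,0,0)
CLASS.is.numb.N.SURPR.is.1.N.VOW.is.1 <- c(1,1,0,1,0,0,0,1,0,0,0,0,0)
CLASS.is.prop.N.SURPR.is.0.N.VOW.is.1 <- c(1,0,0,0,0,1,0,0,0,0,0,0,0)
CLASS.is.prop.N.SURPR.is.1.N.VOW.is.1 <- c(1,1,0,0,0,1,0,0,0,1,0,0,0)
("-"(
   CLASS.is.prop.N.SURPR.is.1.N.VOW.is.1,
   CLASS.is.prop.N.SURPR.is.0.N.VOW.is.1)) %>%
   matrix(nrow=1) %>%
   glht(m_final, .) %>% summary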
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + (SURPRISAL_sc + VOWELS + CLASS)^2,
data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.07743 0.10760 0.72 0.472
(Adjusted p values reported -- single-step method)
# 6: this is a scenario 3c question
"-"(    # compute the difference between 2 slopes, namely the
   "-"( # slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "number"
      CLASS.is.numb.N.SURPR.is.1.N.VOW.is.1,
      CLASS.is.numb.N.SURPR.is.0.N.VOW.is.1),
   "-"( # slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "propname"
      CLASS.is.prop.N.SURPR.is.1.N.VOW.is.1,
      CLASS.is.prop.N.SURPR.is.0.N.VOW.is.1)) %>%
   matrix(nrow=1) -> diff_matrix
glht(m_final, diff_matrix) %>% summary
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = DURATION_l ~ 1 + (SURPRISAL_sc + VOWELS + CLASS)^2,
data = d)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 == 0 0.03833 0.16664 0.23 0.818
(Adjusted p values reported -- single-step method)
# 7: this is a scenario 3b question
# slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "number": see 4 above
# slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "propname": see 5 above
1.5.2 A final nice example
For the final example, we will briefly load a different model for a completely different data set. This data set was actually analyzed with a mixed-effects model, but since we haven’t dealt with those yet, we will make this a fixed-effects model only. In this example, the response variable is GENITIVE (of vs. s) and the model we’re discussing here has the following predictors:
LANG: the L1 of the speaker: english vs. chinese vs. german;
POSSORANIM: whether the possessor of the genitive is animate or inanimate (because the former prefer s-genitives much more than the latter);
POSSORNUMBER: whether the possessor of the genitive is singular, plural or irregplural (because regular plurals disprefer s-genitives much more than the others);
LENGTHNPDIFF: a variable quantifying the length difference between possessor and possessed: positive values should go with s-genitives, negative values should go with of-genitives.
From the output of the model m, you can see that the interaction LANG:POSSORANIM is significant, and the plot shows the effects predictions for it:
Clearly, for all speaker L1s, animate possessors lead to significantly more s-genitives than inanimate possessors (we wouldn’t even need emmeans for that: the confidence intervals between a and i within each level of L1 never overlap). Just as clearly, the Chinese learners behave differently from the native speakers and the German learners because, for them, the difference between animate and inanimate possessors is much smaller than for the native speakers and the German learners. Here are questions for you to answer (with glht!):
is the prediction for inanimate possessors for the Chinese learners significantly higher than that for the native speakers?
is the prediction for inanimate possessors for the Chinese learners significantly higher than that for the German learners?
Is the difference between a and i for the German learners significantly different from the difference between a and i for the Chinese learners?
To make things easier for you, do your calculations only for when LENGTHNPDIFF is 0 and when POSSORNUMBER is singular; those will numerically be somewhat different from the above effects estimate, but conceptually similar (because I doubt you want to spend time on making these comparisons for the plot above, for which LENGTHNPDIFF would have to be at its mean and POSSORNUMBER at its proportional representation ;-) – but if you want to do this with the effects object’s model matrix, be my guest).
¹ Two other arguments of glht, which we are currently using at their default settings, will be tweaked later.
² I may misunderstand things (!) but I think the output of summary(glht(...)) gets the row names of the output wrong: I think it should be “1 <= 0.625”.
Source Code
---title: "Ling 204: session 03"author: - name: "[Stefan Th. Gries](https://www.stgries.info)" affiliation: - UC Santa Barbara - JLU Giessen orcid: 0000-0002-6497-3958date: "2026-01-21 12:34:56 PDT"date-format: "DD MMM YYYY HH-mm-ss"editor: sourceformat: html: page-layout: full code-fold: false code-link: true code-copy: true code-tools: true code-line-numbers: true code-overflow: scroll number-sections: true smooth-scroll: true toc: true toc-depth: 4 number-depth: 4 toc-location: left monofont: lucida console tbl-cap-location: top fig-cap-location: bottom fig-width: 4 fig-height: 4 fig-format: png fig-dpi: 300 fig-align: center embed-resources: true link-external-newwindow: trueexecute: cache: false echo: true eval: true warning: false---# Session 03: Coefficient comparisons 2 {#sec-s03}## Theoretical introductionToday, we deal with selected applications of a very powerful notion, **general linear hypothesis tests**, a framework that allows us to answer *all questions from all sets* (and others) regardless of whether we're looking at means (of/with) means or slopes. To approach this framework (and we will mostly discuss its simplest kinds of application), we use the duration data ...```{r}rm(list=ls(all.names=TRUE))pkgs2load <-c("car", "effects", "emmeans", "magrittr", "multcomp")suppressMessages(sapply(pkgs2load, library,character.only=TRUE, logical.return=TRUE, quietly=TRUE)); rm(pkgs2load)d <-read.delim( # summarize d, the result of loadingfile="_input/wdurations.csv", # this filestringsAsFactors=TRUE) # change categorical variables into factorsd$DURATION_l <-log2(d$DURATION) # as befored$SURPRISAL_s <-sqrt(d$SURPRISAL) # this time no centeringsummary(d)```... and simplify `m_01` a little for didactic simplicity:```{r}summary(m_01 <-lm(DURATION_l ~1+ VOWELS*CLASS, data=d))$coef```Let's begin with a very simple (and admittedly quite artificial) example. Imagine you had `m_01` in R's working memory but only knew the function `coef` but not the function `summary`. In other words, you could do this ...```{r}coef(m_01)```... but you aren't able to get the *p*-value for the 2nd coefficient (for `VOWELStwo+`), i.e. the *p*-value of `r summary(m_01)$coef[2,4]` which tells you that that coefficient is significantly different from 0; put differently, from the `coef` output you can't see whether the predicted mean for `DURATION_l` when `CLASS` is *closed* and `VOWELS` is *one* is significantly different from when `CLASS` is *closed* and `VOWELS` is *two+*.For this, and a ton of other, more interesting questions, the function `multcomp::glht` can help. For how we will use it now, it requires two arguments:^[Two other arguments of `glht`, which we are currently using at their default settings, will be tweaked later.]1. a model object (here, `m_01`) and2. a 1-row matrix that contains as many numbers as the model has coefficients (here, `length(coef(m_01))` =`r length(coef(m_01))`).You can consider this 1-row matrix to be the equivalent of one row of the model's model matrix (as retrieved with `model.matrix`). As a reminder, the **model frame** of a model is all the rows and columns from the original data that are used in the model and the **model matrix** of a model is a matrix that* has as many rows as the model frame;* has as many columns as the model has coefficients ...```{r}head(mf <-model.frame(m_01)) # the model frame of m_01head(mm <-model.matrix(m_01)) # the model matrix of m_01```.. 
because the prediction the model makes for each case of the model frame is the summed product of the row of the model matrix and the model's coefficients. Let's look at case 5:```{r}#| collapse: truemf[5,] # row 5 of the model framemm[5,] # row 5 of the model matrixsum( # the summed"*"( # product of mm[5,], # the row of the model matrix &coef(m_01) # the model's coefficients))predict(m_01)[5] %>% unname```What `glht` does is,1. it also multiplies each coefficient of the model (i.e., the values of `coef(m_01)`) with the corresponding number in the 1-row version of the model matrix you create ;2. it also sums up all those products; but then3. with the function `summary`, we can then test whether that sum is significantly different from some value (default: 0);4. with the function `confint`, we can compute the confidence interval of that sum.Here is this procedure shown in a stepwise fashion: We generate a vector I'm calling `mm_index` with 10 values and we do that such that, when each of the 10 values of `mm_index` is multiplied with the corresponding value of `coef(m_01)`, the sum of these 10 products is the second coefficient, for which we want to know whether it's significantly different from 0. Here's the vector `mm_index`, which I put into a data frame just to easily represent how it lines up with the coefficients:```{r}mm_index <-c(0,1,0,0,0,0,0,0,0,0)(temp <-data.frame(COEF=coef(m_01),INDX=mm_index))```Steps 1 and 2 of a `glht` applications are then this:```{r}(temp$COEF * temp$INDX) %>%# step 1sum # step 2```And you can see we did this right, because, trivially, this is indeed the value of `coef(m_01)[2]`. To do this with `glht`, we create a `glht` object, which I just call `qwe`:```{r}# step 1 & step 2 w/ glht(qwe <-glht(model=m_01,linfct=matrix(mm_index, nrow=1)))```We again see the desired value: `glht` did what we said it did. And now we can use `qwe` as an argument to `summary` and `confint` to get* a significance test of whether that value of `r round(sum(temp$COEF * temp$INDX), 4)` is significantly different from 0;* the 95% confidence interval of that value of `r round(sum(temp$COEF * temp$INDX), 4)`:```{r}summary(qwe) # try: summary(qwe)$test[3:6] %>% unlistconfint(qwe) # try: confint(qwe)$confint```Ideally, you'd immediately write a function like `glht.summy` that takes as input a `glht` object and then returns both the relevant `summary` and `confint` contents:```{r}#| echo: falseglht.summy <-function (a.glht.object, roundy=5) {setNames(round(c(confint(qwe)$confint, unlist(summary(qwe)$test[4:6])), roundy),c("estimate", "lower", "upper", "se", "t/z", "p"))}``````{r}glht.summy(qwe)```Of course, given everything we know already -- we know `summary` and `confint` -- these particular results are boring, we could have gotten them like this:```{r}c(round(summary(m_01)$coef[2,], 5), round(confint(m_01)[2,], 5))```But the point is to abstract away from this specific example because, with what we've just learned, we can now give to `glht` *any* 1-row matrix codifying *any* specific combination of predictor values (!), and it will always do that above kind of pairwise multiplication, summing up, and testing that sum for how it differs from 0. That means, we can answer *all* questions of *all* questions sets (even question set 3) from above with just `glht`, and that's what we'll practice now.## Practical applicationsQuick reminder of the 9 scenarios:| | **a**: what are the means/slopes? | **b**: are they * diff from 0? | **c**: are they * diff from each other? 
|------|------|------|------|
| **1**: means for (combos of) levels of cat preds | scenario 1a | scenario 1b | scenario 1c |
| **2**: means for values of numeric preds | scenario 2a | scenario 2b | scenario 2c |
| **3**: slopes for (combos of) levels of cat preds | scenario 3a | scenario 3b | scenario 3c |

: 9 scenarios relevant for post-hoc exploration of models {#tbl-threeTIMESthree}

### Scenarios 1a & 1b

Scenario 1a meant to ask the following question: What is each of the 10 predicted means of `DURATION_l` for when you cross the 2 levels of `VOWELS` and the 5 levels of `CLASS`? And scenario 1b meant to ask whether each of the 10 means computed above is significantly different from 0 (and what the exact *p*-value is). We discussed last week that one of the simplest ways to get the information for scenario 1a is with `nd`:

```{r}
nd <- expand.grid(
   CLASS=levels(d$CLASS),
   VOWELS=levels(d$VOWELS))
(nd <- cbind(nd,
   predict(m_01, newdata=nd, interval="confidence")))
data.frame(effect("VOWELS:CLASS", m_01))
data.frame(emmeans(m_01, specs=~ VOWELS|CLASS))
```

But then we said we can't really answer question set 1b without stupid releveling -- well, now we can, even if it will at first be painful (but powerful!). But now you will finally realize why I always placed so much emphasis on you understanding what the coefficients mean and how we use them to compute predictions.

The intercept -- when `CLASS` is *closed* and `VOWELS` is *one* -- is very easy:

```{r}
summary(m_01)$coef[1,]
nd[1,]
summary(glht(model=m_01, linfct=matrix(c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0), nrow=1)))
```

The prediction for when `CLASS` is *number* and `VOWELS` is *one* requires that we add to the intercept the coefficient for `CLASSnumber`:

```{r}
summary(m_01)$coef[c(1,3),]
nd[2,]
summary(glht(model=m_01, linfct=matrix(c(1, 0, 1, 0, 0, 0, 0, 0, 0, 0), nrow=1)))
```

The prediction for when `CLASS` is *open* and `VOWELS` is *one* requires that we add to the intercept the coefficient for `CLASSopen`:

```{r}
summary(m_01)$coef[c(1,4),]
nd[3,]
summary(glht(model=m_01, linfct=matrix(c(1, 0, 0, 1, 0, 0, 0, 0, 0, 0), nrow=1)))
```

The prediction for when `CLASS` is *propname* and `VOWELS` is *one* requires that we add to the intercept the coefficient for `CLASSpropname`:

```{r}
summary(m_01)$coef[c(1,5),]
nd[4,]
summary(glht(model=m_01, linfct=matrix(c(1, 0, 0, 0, 1, 0, 0, 0, 0, 0), nrow=1)))
```

The prediction for when `CLASS` is *unclear* and `VOWELS` is *one* requires that we add to the intercept the coefficient for `CLASSunclear`:

```{r}
summary(m_01)$coef[c(1,6),]
nd[5,]
summary(glht(model=m_01, linfct=matrix(c(1, 0, 0, 0, 0, 1, 0, 0, 0, 0), nrow=1)))
```

As you can see, the positions of the 1s in each (model) matrix we give `glht` correspond to the row indices of the required coefficients in the model summary.
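By the way, since those 1s simply sit at the positions of the required coefficients, nothing stops you from letting R build such vectors from the coefficients' names. Here is a minimal sketch of that idea -- note that `make_lincomb` is just a little helper I am making up on the spot, not a function from `multcomp`:

```{r}
make_lincomb <- function (a.model, wanted) { # a made-up convenience helper
   cf.names <- names(coef(a.model)) # all coefficient names of the model
   matrix(as.numeric(cf.names %in% wanted), # 1 for each wanted coefficient, 0 otherwise
      nrow=1)
}
# the same test as the last one above (CLASS is unclear, VOWELS is one):
summary(glht(m_01, linfct=make_lincomb(m_01, c("(Intercept)", "CLASSunclear"))))
```

For the *two+* cases below, you would simply include the names of the required interaction coefficients in `wanted` as well.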
But now it gets a bit more complex because we turn to `VOWELS` being *two+*, which means we need to consider more coefficients.

The prediction for when `CLASS` is *closed* and `VOWELS` is *two+* requires that we add to the intercept the coefficient for `VOWELStwo+`:

```{r}
summary(m_01)$coef[c(1,2),]
nd[6,]
summary(glht(model=m_01, linfct=matrix(c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0), nrow=1)))
```

The prediction for when `CLASS` is *number* and `VOWELS` is *two+* requires that we add to the intercept the coefficient for `CLASSnumber`, the coefficient for `VOWELStwo+`, and the one for their interaction:

```{r}
summary(m_01)$coef[c(1,2,3,7),]
nd[7,]
summary(glht(model=m_01, linfct=matrix(c(1, 1, 1, 0, 0, 0, 1, 0, 0, 0), nrow=1)))
```

The prediction for when `CLASS` is *open* and `VOWELS` is *two+* requires that we add to the intercept the coefficient for `CLASSopen`, the coefficient for `VOWELStwo+`, and the one for their interaction:

```{r}
summary(m_01)$coef[c(1,2,4,8),]
nd[8,]
summary(glht(model=m_01, linfct=matrix(c(1, 1, 0, 1, 0, 0, 0, 1, 0, 0), nrow=1)))
```

The prediction for when `CLASS` is *propname* and `VOWELS` is *two+* requires that we add to the intercept the coefficient for `CLASSpropname`, the coefficient for `VOWELStwo+`, and the one for their interaction:

```{r}
summary(m_01)$coef[c(1,2,5,9),]
nd[9,]
summary(glht(model=m_01, linfct=matrix(c(1, 1, 0, 0, 1, 0, 0, 0, 1, 0), nrow=1)))
```

The prediction for when `CLASS` is *unclear* and `VOWELS` is *two+* requires that we add to the intercept the coefficient for `CLASSunclear`, the coefficient for `VOWELStwo+`, and the one for their interaction:

```{r}
summary(m_01)$coef[c(1,2,6,10),]
nd[10,]
summary(glht(model=m_01, linfct=matrix(c(1, 1, 0, 0, 0, 1, 0, 0, 0, 1), nrow=1)))
```

Yes, it's cumbersome, but if you know what the coefficients mean, it's also not *that* bad, and this approach answers all questions from scenarios 1a and 1b, which will become especially interesting when we turn to slopes, i.e. scenarios 3a and 3b.

### Scenario 1c

Scenario 1c meant to ask whether (some? all?) differences between the 10 means are significant. Last week we used `pairs(emmeans(...))`, and that is probably the best way to proceed, but to really drive home the point of how powerful `glht` is, let's also look at `glht`-based comparisons between means. Imagine you were interested in answering the following question: Is the prediction for when `VOWELS` is *two+* and `CLASS` is *propname* significantly different from when `VOWELS` is *two+* and `CLASS` is *unclear*? With `emmeans`, this is what we'd do ...

```{r}
#| results: hold
nd[9,"fit"] - nd[10,"fit"]
qwe <- pairs(emmeans( # compute pairwise comparisons of means
   object=m_01, # for the model m_01
   specs=~ CLASS | VOWELS), # of CLASS within each level of VOWELS
   adjust="none") # do not adjust for multiple comparisons
qwe[20,]
```

... but how could we do this with `glht`? One way to do this is *very* stepwise, but that probably also makes it the easiest to comprehend. You

1. define two (model) matrices, one for each of the two situations that you want to compare;
2. subtract one from the other;
3. give that difference (model) matrix to `glht`.

And, thankfully, we already generated the relevant matrices above; I just copy and paste from above and assign proper names to them:

```{r}
# step 1:
VOW.is.twoplus.CLA.is.prop <- c(1, 1, 0, 0, 1, 0, 0, 0, 1, 0)
VOW.is.twoplus.CLA.is.uncl <- c(1, 1, 0, 0, 0, 1, 0, 0, 0, 1)
# step 2:
(diff.betw.them <- VOW.is.twoplus.CLA.is.prop - VOW.is.twoplus.CLA.is.uncl)
# step 3:
summary(glht(model=m_01, linfct=matrix(diff.betw.them, nrow=1)))
```

You can see that this is the right result by comparing this output to the above result from `emmeans` and to the difference in predictions from `nd`. Again, yes, it seems cumbersome, but you have to remember that, since you define the (model) matrices, any comparison you're interested in is suddenly at your fingertips. As mentioned above, this will become especially interesting when we turn to slopes, i.e. scenarios 3a and 3b.

### Scenarios 2a & 2b

For scenarios 2a and 2b we turn to a model `m_02`:

```{r}
summary(m_02 <- lm(DURATION_l ~ 1 + SURPRISAL_s*CLASS, data=d))$coef
```

Scenario 2a meant to ask the following question: what are the 5 predicted means of `DURATION_l` for each of the levels of `CLASS` and when `SURPRISAL_s` is controlled for in some way? And scenario 2b meant to ask whether each of the 5 means is significantly different from 0 (and what the exact *p*-value is).

To address scenario 2a, we previously did this ...

```{r}
nd <- expand.grid(
   SURPRISAL_s=c(0, 1, mean(d$SURPRISAL_s)), # main values of SURPRISAL_s
   CLASS=levels(d$CLASS)) %>% # all levels of CLASS
   "["(,2:1) # reorder columns
(nd <- cbind(nd, predict(m_02, newdata=nd, interval="confidence")))
```

... and then we looked at the predictions when `SURPRISAL_s` is 0. But then we said we can't really address scenario 2b without stupid releveling -- well, now we can, even if it is still painful (yet powerful).
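One aside before we continue -- a minimal sketch of a `glht` feature we won't rely on below: the matrix you give to `linfct` may contain *several* rows, i.e. several hypotheses at once, for instance two of the predictions for when `SURPRISAL_s` is 0 (the first and fourth row of `nd`). Just be aware that, with more than one row, `summary` by default adjusts the *p*-values for multiple testing, which `test=adjusted("none")` switches off:

```{r}
preds.when.SURP.is.0 <- rbind( # 2 rows = 2 hypotheses at once
   "closed, SURPRISAL_s=0"=c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0),
   "number, SURPRISAL_s=0"=c(1, 0, 1, 0, 0, 0, 0, 0, 0, 0))
summary(glht(model=m_02, linfct=preds.when.SURP.is.0),
   test=adjusted("none")) # try without test=... to see the adjustment
```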
But to use the capabilities of `glht` some more, let's actually use it to create/check the predictions that `emmeans` would make by default, because those are not for when `SURPRISAL_s` is 0, but for when `SURPRISAL_s` is at its mean:

```{r}
emmeans(m_02, ~CLASS | SURPRISAL_s)
```

So, let's quickly compute a 1-element vector / variable with the mean of `SURPRISAL_s` (which is already part of `nd`):

```{r}
(mean_surp <- mean(d$SURPRISAL_s))
```

The prediction for when `CLASS` is *closed* and `SURPRISAL_s` is its mean requires that we add to the intercept -- the prediction when `CLASS` is *closed* and `SURPRISAL_s` is 0 -- `mean_surp` times the coefficient/slope for `SURPRISAL_s`:

```{r}
summary(m_02)$coef[c(1,2),]
nd[3,]
glht(model=m_02, linfct=matrix(c(1, mean_surp, 0, 0, 0, 0, 0, 0, 0, 0), nrow=1)) %>% summary
glht(model=m_02, linfct=matrix(c(1, mean_surp, 0, 0, 0, 0, 0, 0, 0, 0), nrow=1)) %>% confint
```

The prediction for when `CLASS` is *number* and `SURPRISAL_s` is its mean requires that, to the intercept, we add `mean_surp` times the coefficient/slope for `SURPRISAL_s`, the coefficient for `CLASSnumber`, and `mean_surp` times the coefficient/slope for their interaction:

```{r}
summary(m_02)$coef[c(1,2,3,7),]
nd[6,]
qwe <- glht(model=m_02, linfct=matrix(c(1, mean_surp, 1, 0, 0, 0, mean_surp, 0, 0, 0), nrow=1))
qwe %>% summary
qwe %>% confint
```

The prediction for when `CLASS` is *open* and `SURPRISAL_s` is its mean requires that, to the intercept, we add `mean_surp` times the coefficient/slope for `SURPRISAL_s`, the coefficient for `CLASSopen`, and `mean_surp` times the coefficient/slope for their interaction:

```{r}
summary(m_02)$coef[c(1,2,4,8),]
nd[9,]
qwe <- glht(model=m_02, linfct=matrix(c(1, mean_surp, 0, 1, 0, 0, 0, mean_surp, 0, 0), nrow=1))
qwe %>% summary
qwe %>% confint
```

The prediction for when `CLASS` is *propname* and `SURPRISAL_s` is its mean requires that, to the intercept, we add `mean_surp` times the coefficient/slope for `SURPRISAL_s`, the coefficient for `CLASSpropname`, and `mean_surp` times the coefficient/slope for their interaction:

```{r}
summary(m_02)$coef[c(1,2,5,9),]
nd[12,]
qwe <- glht(model=m_02, linfct=matrix(c(1, mean_surp, 0, 0, 1, 0, 0, 0, mean_surp, 0), nrow=1))
qwe %>% summary
qwe %>% confint
```

The prediction for when `CLASS` is *unclear* and `SURPRISAL_s` is its mean requires that, to the intercept, we add `mean_surp` times the coefficient/slope for `SURPRISAL_s`, the coefficient for `CLASSunclear`, and `mean_surp` times the coefficient/slope for their interaction:

```{r}
summary(m_02)$coef[c(1,2,6,10),]
nd[15,]
qwe <- glht(model=m_02, linfct=matrix(c(1, mean_surp, 0, 0, 0, 1, 0, 0, 0, mean_surp), nrow=1))
qwe %>% summary
qwe %>% confint
```

Cumbersome, but powerful/versatile!

### Scenario 2c

Scenario 2c meant to ask whether (some? all?) differences between the 5 means are significant. Last week we used `pairs(emmeans(...))`, and that is probably the best way to proceed, but to really drive home the point of how powerful `glht` is, let's also look at `glht`-based comparisons between means. Imagine you were interested in answering the following question: Is the prediction for when `SURPRISAL_s` is its mean and `CLASS` is *propname* significantly different from when `SURPRISAL_s` is its mean and `CLASS` is *unclear*?
With `emmeans`, this is what we'd do (because `emmeans` does us the favor of setting `SURPRISAL_s` to its mean for this):

```{r}
#| results: hold
nd[12,"fit"] - nd[15,"fit"]
qwe <- pairs(emmeans( # compute pairwise comparisons of means
   object=m_02, # for the model m_02
   specs=~ CLASS | SURPRISAL_s), # of CLASS, with SURPRISAL_s at its mean
   adjust="none") # do not adjust for multiple comparisons
qwe[10,]
```

But now we use `glht`. As before, I show the super stepwise way: we define two (model) matrices, one for each of the two situations we want to compare, subtract one from the other, and give that difference (model) matrix to `glht`. And, thankfully, we already generated the relevant matrices above; I just copy and paste from above and assign proper names to them:

```{r}
SUR.is.mean.CLA.is.prop <- c(1, mean_surp, 0, 0, 1, 0, 0, 0, mean_surp, 0)
SUR.is.mean.CLA.is.uncl <- c(1, mean_surp, 0, 0, 0, 1, 0, 0, 0 , mean_surp)
(diff.betw.them <- SUR.is.mean.CLA.is.prop - SUR.is.mean.CLA.is.uncl)
summary(glht(model=m_02, linfct=matrix(diff.betw.them, nrow=1)))
```

You can see that this is the right result by comparing this output to the result from `emmeans` and the one from `nd` above. Again, yes, it seems cumbersome, but note that you can use the `glht` logic for arbitrary values of `SURPRISAL_s`, not just 0, 1, or its mean: if any prior work reported a certain significant effect for a certain value of `SURPRISAL_s`, you can just plug that value into the above code to see whether that value also yields a significant difference in your data. But again, this will become most interesting when we turn to slopes, which we do now.

### Scenarios 3a & 3b

Scenario 3a meant to ask the following question: what are the 5 predicted slopes of `SURPRISAL_s` for each of the levels of `CLASS`? And scenario 3b meant to ask whether each of these 5 slopes is significantly different from 0 (and what the exact *p*-value is).

To address scenario 3a last week, the shortest thing we did was this:

```{r}
# tapply(nd$fit, # apply a grouping of the predictions
#    list(nd$CLASS, # by CLASS
#    nd$SURPRISAL_s), # and SURPRISAL_s
#    c) %>% # just list the values,
#    "["(,-3) %>% # suppress the 3rd column
#    apply(1, diff) # compute the differences
# or
# nd.split <- split(nd, nd$SURPRISAL_s)
# nd.split$`1`$fit - nd.split$`0`$fit
coef(m_02)[2] + c(0, coef(m_02)[7:10])
```

But then we said we can't really address scenario 3b without stupid releveling -- well, we can *now*!
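Before we go through the levels one by one, here is the underlying logic in a minimal sketch (we will use the same trick again in the exercises): a slope is just the difference between two predictions that are one unit apart on `SURPRISAL_s`, so we can subtract two prediction vectors and give the difference to `glht`:

```{r}
# a sketch: the slope of SURPRISAL_s for CLASS=closed as the difference
# between the predictions at SURPRISAL_s=1 and at SURPRISAL_s=0
pred.at.SURP.1 <- c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0) # CLASS=closed, SURPRISAL_s=1
pred.at.SURP.0 <- c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0) # CLASS=closed, SURPRISAL_s=0
summary(glht(model=m_02, linfct=matrix(pred.at.SURP.1 - pred.at.SURP.0, nrow=1)))
```

That difference is of course just `c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0)`, which is what we will write directly in what follows.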
When `CLASS` is *closed*, it's super straightforward:

```{r}
summary(m_02)$coef[2,]
nd[2,"fit"] - nd[1,"fit"]
glht(model=m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0), nrow=1)) %>% summary
glht(model=m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0), nrow=1)) %>% confint
```

When `CLASS` is *number*, we need to consider the interaction term in the 7th coefficient (`SURPRISAL_s:CLASSnumber`) as well:

```{r}
summary(m_02)$coef[c(2,7),]
nd[5,"fit"] - nd[4,"fit"]
qwe <- glht(model=m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 1, 0, 0, 0), nrow=1))
qwe %>% summary
qwe %>% confint
```

When `CLASS` is *open*, we need to consider the interaction term in the 8th coefficient (`SURPRISAL_s:CLASSopen`) as well:

```{r}
summary(m_02)$coef[c(2,8),]
nd[8,"fit"] - nd[7,"fit"]
qwe <- glht(model=m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 0, 1, 0, 0), nrow=1))
qwe %>% summary
qwe %>% confint
```

When `CLASS` is *propname*, we need to consider the interaction term in the 9th coefficient (`SURPRISAL_s:CLASSpropname`) as well:

```{r}
summary(m_02)$coef[c(2,9),]
nd[11,"fit"] - nd[10,"fit"]
qwe <- glht(model=m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 0, 0, 1, 0), nrow=1))
qwe %>% summary
qwe %>% confint
```

When `CLASS` is *unclear*, we need to consider the interaction term in the 10th coefficient (`SURPRISAL_s:CLASSunclear`) as well:

```{r}
summary(m_02)$coef[c(2,10),]
nd[14,"fit"] - nd[13,"fit"]
qwe <- glht(model=m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1), nrow=1))
qwe %>% summary
qwe %>% confint
```

### Scenario 3c

Scenario 3c meant to ask whether (some? all?) differences between the 5 slopes of `SURPRISAL_s` for the different levels of `CLASS` are significant. Last week we used `emmeans::emtrends` for this, which is probably the best way to proceed:

```{r}
pairs(emtrends( # compute pairwise comparisons of trends/slopes
   object=m_02, # for the model m_02
   specs="CLASS", # compute comparisons over this predictor
   var="SURPRISAL_s"), # namely all slopes of this predictor
   adjust="none") # do not adjust for multiple comparisons
```

Now with `glht` ... Imagine you were interested in whether the slope of `SURPRISAL_s` when `CLASS` is *open* is significantly different from when `CLASS` is *unclear*, but you wanted to solve this with `glht`. You will not be surprised to see that I demonstrate this in the same granular way as before: we define two (model) matrices, one for each of the two situations we want to compare, subtract one from the other, and give that difference (model) matrix to `glht`. And, thankfully, we already generated the relevant matrices above; I just copy and paste from above and assign proper names to them:

```{r}
slope.when.SUR.is.0.CLA.is.open <- c(0, 1, 0, 0, 0, 0, 0, 1, 0, 0)
slope.when.SUR.is.0.CLA.is.uncl <- c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1)
(diff.betw.them <- slope.when.SUR.is.0.CLA.is.open - slope.when.SUR.is.0.CLA.is.uncl)
summary(glht(model=m_02, linfct=matrix(diff.betw.them, nrow=1)))
```

## What does this mean for our regression summaries?

The above has at least some implications for how we might want to report the results of regression models.
We saw how the summary table of `m_02` provides maximally half of all the info one might want to provide with regard to the interaction `SURPRISAL_s:CLASS`:

```{r}
summary(m_02)$coef %>% round(4)
```

Given everything we've learned by now, we can do much better in two ways: First, we can represent the summary table in a format that makes it easier to digest for readers who may not always remember all your variables and, for factors, their reference levels, plus we can and probably should add confidence intervals. Thus, this is better:

| | Estimate | 95%-CI | *se* | *t* | *p*~2-tailed~ |
|:----|:----|:----|:----|:----|:----|
| Intercept (SURP=0 + CLASS=*closed*) | 6.1698 | [6.097, 6.242] | 0.0371 | 166.4236 | <0.0001 |
| SURP~0→1~ | 0.3927 | [0.359, 0.426] | 0.0170 | 23.1424 | <0.0001 |
| CLASS~*closed*→*numb*~ | 1.5894 | [0.882, 2.297] | 0.3608 | 4.4049 | <0.0001 |
| CLASS~*closed*→*open*~ | 0.7045 | [0.556, 0.853] | 0.0759 | 9.2811 | <0.0001 |
| CLASS~*closed*→*propn*~ | 2.1682 | [1.419, 2.917] | 0.3820 | 5.6761 | <0.0001 |
| CLASS~*closed*→*uncl*~ | 2.5707 | [1.147, 3.995] | 0.7265 | 3.5385 | 0.0004 |
| SURP~0→1~ + CLASS~*closed*→*numb*~ | -0.3263 | [-0.579, -0.074] | 0.1288 | -2.5333 | 0.0113 |
| SURP~0→1~ + CLASS~*closed*→*open*~ | -0.0188 | [-0.075, 0.037] | 0.0285 | -0.6590 | 0.5099 |
| SURP~0→1~ + CLASS~*closed*→*propn*~ | -0.3391 | [-0.553, -0.125] | 0.1092 | -3.1051 | 0.0019 |
| SURP~0→1~ + CLASS~*closed*→*uncl*~ | -0.7174 | [-1.226, -0.209] | 0.2595 | -2.7652 | 0.0057 |

But then, even if you provide effects plots, which I recommend you always do, it might be nice -- (especially) for generalized linear models -- to provide

* the predicted means for the levels of `CLASS` (probably while `SURPRISAL_s` is at its mean);
* the predicted slopes of `SURPRISAL_s` for each level of `CLASS`;
* and -- don't forget those! -- their confidence intervals.

Here're those results. First, we fix `SURPRISAL_s` to its mean and then compute the predictions for each level of `CLASS`:

| `SURP`=2.37 | `CLASS`: *closed* | `CLASS`: *number* | `CLASS`: *open* | `CLASS`: *propname* | `CLASS`: *unclear* |
|----|----|----|----|----|----|
| pred. `DURA_l` | 7.1008 | 7.9167 | 7.7608 | 8.4650 | 7.9705 |
| 95% CI | [7.0776, 7.1239] | [7.7604, 8.0729] | [7.7251, 7.7964] | [8.1993, 8.7307] | [7.6783, 8.2627] |

Then, we compute the slope of `SURPRISAL_s` for each level of `CLASS`:

| | `CLASS`: *closed* | `CLASS`: *number* | `CLASS`: *open* | `CLASS`: *propname* | `CLASS`: *unclear* |
|----|----|----|----|----|----|
| slope of `SURP` | 0.3927 | 0.0664 | 0.3739 | 0.0536 | -0.3248 |
| 95% CI | [0.3594, 0.4259] | [-0.1839, 0.3167] | [0.3291, 0.4187] | [-0.1579, 0.2651] | [-0.8323, 0.1828] |

```{r}
#| echo: false
#| eval: false
nd[nd$SURPRISAL_s>1,]
confint(glht(model=m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0), nrow=1)))$confint
confint(glht(model=m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 1, 0, 0, 0), nrow=1)))$confint
confint(glht(model=m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 0, 1, 0, 0), nrow=1)))$confint
confint(glht(model=m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 0, 0, 1, 0), nrow=1)))$confint
confint(glht(model=m_02, linfct=matrix(c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1), nrow=1)))$confint
```

:::{.callout-tip collapse="true"}
Something for you to check out: `multcomp::glht` actually also allows you to test specific hypotheses in a way that looks more like English; for instance, you can do a (now two-tailed) test of `SURPRISAL_s` being different from 0.425 like this: `summary(glht(m_02, "SURPRISAL_s = 0.425"))`. Another function that is useful for stuff like this is `marginaleffects::hypotheses`.
:::

## What else is possible?

### Orthogonal contrasts w/ `glht`

We can also recover orthogonal contrasts with `glht`, because `glht` can really do anything. Let's get the data set up again:

```{r}
d$CLASS_orth <- d$CLASS
#                            clos  numb  open  prop  uncl
clo.op_vs_num.prop.unc <- c(+0.6, -0.4, +0.6, -0.4, -0.4)
clo_vs_op              <- c(+0.5,  0  , -0.5,  0  ,  0  )
num.prop_vs_uncl       <- c( 0  , -1/3,  0  , -1/3, +2/3)
num_vs_prop            <- c( 0  , +0.5,  0  , -0.5,  0  )
# & then we put those into a matrix like the above table & assign them as contrasts:
contrasts(d$CLASS_orth) <- # make the following the contrasts of CLASS_orth
   cbind( # the matrix resulting from binding together as columns
      clo.op_vs_num.prop.unc, # all
      clo_vs_op, # four
      num.prop_vs_uncl, # contrast
      num_vs_prop) # vectors
```

And let's fit the two models:

```{r}
summary(m_03_treatm <- lm(DURATION_l ~ 1 + CLASS     , data=d))$coef
summary(m_03_orthog <- lm(DURATION_l ~ 1 + CLASS_orth, data=d))$coef
```

#### The first orthogonal contrast

So how do we get the first contrast *closed*/*open* vs. *number*/*propname*/*unclear* with `glht` from `m_03_treatm`?
Like this: We first create a `glht` 1-row matrix that combines the predictions for when `CLASS` is *closed* or *open* by computing the unweighted mean of the two vectors/matrices:

```{r}
#| results: hold
CLA.is.clos <- c(1, 0, 0, 0, 0)
CLA.is.open <- c(1, 0, 1, 0, 0)
(CLA.is.clos.or.open <- (CLA.is.clos + CLA.is.open) / 2)
# same as:
#        CLA.is.clos      CLA.is.open
# rbind(c(1, 0, 0, 0, 0), c(1, 0, 1, 0, 0)) %>% apply(2, mean)
```

Then we do the same for when `CLASS` is *number* or *propname* or *unclear*:

```{r}
#| results: hold
CLA.is.numb <- c(1, 1, 0, 0, 0)
CLA.is.prop <- c(1, 0, 0, 1, 0)
CLA.is.uncl <- c(1, 0, 0, 0, 1)
(CLA.is.numb.or.prop.or.uncl <- (CLA.is.numb + CLA.is.prop + CLA.is.uncl) / 3)
# same as:
#        CLA.is.numb      CLA.is.prop      CLA.is.uncl
# rbind(c(1, 1, 0, 0, 0), c(1, 0, 0, 1, 0), c(1, 0, 0, 0, 1)) %>% apply(2, mean)
```

And then we compute the summary of the `glht` for the difference between the two:

```{r}
diff.betw.them <- CLA.is.clos.or.open - CLA.is.numb.or.prop.or.uncl
summary(glht(model=m_03_treatm, linfct=matrix(diff.betw.them, nrow=1)))
```

And you can see, that's right:

```{r}
summary(m_03_orthog)$coef[2,]
```

#### The second orthogonal contrast

So how do we get the second contrast *closed* vs. *open* (with `glht` from `m_03_treatm`)?

Actually, that's a trick question: we don't have to use `glht` because it's already there (just with a different sign, but that doesn't matter for the significance test):

```{r}
summary(m_03_orthog)$coef[3,]
summary(m_03_treatm)$coef[3,]
```

So, don't forget the old stuff because you're so excited about the new stuff ...

#### The third orthogonal contrast

So how do we get the third contrast *number*/*propname* vs. *unclear* with `glht` from `m_03_treatm`? Like this: We first create a `glht` 1-row matrix that combines the predictions for when `CLASS` is *number* or *propname*:

```{r}
CLA.is.numb <- c(1, 1, 0, 0, 0)
CLA.is.prop <- c(1, 0, 0, 1, 0)
(CLA.is.num.or.prop <- (CLA.is.numb + CLA.is.prop) / 2)
```

Then we do the same for when `CLASS` is *unclear*, but since this is just one level, we don't need a mean:

```{r}
CLA.is.uncl <- c(1, 0, 0, 0, 1)
```

And then we compute the summary of the `glht` for the difference between the two:

```{r}
diff.betw.them <- CLA.is.num.or.prop - CLA.is.uncl
summary(glht(model=m_03_treatm, linfct=matrix(diff.betw.them, nrow=1)))
```

And you can see, that's right:

```{r}
summary(m_03_orthog)$coef[4,]
```

#### The fourth orthogonal contrast

And the last one, *number* vs. *propname*: We first create one `glht` 1-row matrix that computes the predictions for when `CLASS` is *number*, then another for when `CLASS` is *propname*, and then we compute the difference:

```{r}
CLA.is.numb <- c(1, 1, 0, 0, 0)
CLA.is.prop <- c(1, 0, 0, 1, 0)
diff.betw.them <- CLA.is.numb - CLA.is.prop
summary(glht(model=m_03_treatm, linfct=matrix(diff.betw.them, nrow=1)))
```

And you can see, that's right:

```{r}
summary(m_03_orthog)$coef[5,]
```

So this is why I discussed `glht` so much: it's very, very powerful for *all* sorts of comparisons you can think of, in fact even for things you might not have considered yet, given how we proceeded -- read on to find out.

### Testing against specific values inside your model

We discussed earlier -- when we discussed the 1c/2c/3c scenarios -- how we can use `glht` to compare values of our model coefficients directly against each other. We can also extend that a bit to ask more specific questions.
For example, here are the coefficients of our 'normal' treatment contrast model again:

```{r}
summary(m_03_treatm)$coef
```

Imagine that, based on previous studies, you had hypothesized earlier that the difference between closed-class words (the intercept) and proper names (i.e., `coef(m_03_treatm)[4]`) should be twice as big as the difference between closed-class words and open-class words (i.e., `coef(m_03_treatm)[3]`). We can check this kind of hypothesis with `glht` as well and I'm doing it super stepwise again:

```{r}
closed_vs_open     <- matrix(c(0,0,1,0,0), nrow=1)
closed_vs_propname <- matrix(c(0,0,0,1,0), nrow=1)
summary(glht(m_03_treatm,
   "-"( # is the difference between
      closed_vs_propname, # closed vs. proper names
      2 * closed_vs_open))) # twice as big as closed vs. open
```

```{r}
#| echo: false
#| eval: false
marginaleffects::hypotheses(m_03_treatm, "b4 = 2 * b3")
```

The result is significant -- what does that mean? It means that the difference between

* the difference between closed-class words and proper names on the one hand and
* two times the difference between closed- and open-class words on the other

is significantly different from 0, meaning the ratio of

* the difference between closed-class words and proper names on the one hand and
* the difference between closed- and open-class words on the other

is significantly different from 2. What is the ratio? This:

```{r}
coef(m_03_treatm)[4] / coef(m_03_treatm)[3]
```

..., meaning, for the hypothesis to work out as expected, either the difference between closed- and open-class words is too big, or the difference between closed-class words and proper names is too small (or both), something we were not able to test easily before.

### Testing against specific values outside your model

One final really nice thing we can do is this: So far, we always 'only' tested whether a certain coefficient, or some painstakingly defined linear combination of coefficients, is significantly different from 0. But we can also test whether a certain coefficient or such a linear combination is different from some other value.

Let's first check out a very simple (monofactorial) model for that:

```{r}
summary(m_04 <- lm(DURATION_l ~ 1 + SURPRISAL_s, data=d))$coef
```

Imagine a previous study reported that the slope of `SURPRISAL_s` was 0.625. However, you actually expected this slope to be lower than that, which is why you were happy to see that, in your data, the slope of `SURPRISAL_s` is indeed smaller: it's not 0.625, it's `r coef(m_04)[2]`. But is that difference significant? How can we address this?

There are three ways one might consider. The first is very straightforward, but doesn't always help; we can try and use the confidence interval (and here we would of course not even need `glht` for that):

```{r}
Confint(m_04)
confint(glht( # show the confint of a glht, namely
   m_04, # for the sum of the product of the coeffs of m_04
   matrix(c(0,1), nrow=1))) # and these values
```

This confidence interval of our slope does include 0.625 -- but why is this the wrong way to look at it? We actually want to look at this in a one-tailed way -- remember, we expected the slope of `SURPRISAL_s` to be less than 0.625.
For this, we could of course adjust the level of the confidence interval:

```{r}
Confint(m_04, level=0.9)
confint(glht( # show the confint of a glht, namely
   m_04, # for the sum of the product of the coeffs of m_04
   matrix(c(0,1), nrow=1)), # and these values
   level=0.9) # cut off 5% on each side, not 2.5%
```

Now, the confidence interval of our smaller estimate does not include the previous value of 0.625: given our expectation, our estimate of `r coef(m_04)[2]` *is* indeed significantly smaller than that of the previous study. But how do we get a proper *p*-value? Here's the second way: We use `summary(glht(...))` again, but add two arguments to it:

* the argument `rhs`, whose default value is 0, which is why all our tests so far were set up to test whether something is different from 0; now we set that to the value we want to test against, 0.625;
* the well-known argument `alternative`, whose default value is, as it is in general, *two.sided*, but now we set it to *less* because we expect the coefficient to be *less* than the `rhs` we are testing against:

```{r}
summary(glht( # show the summary of a glht, namely
   m_04, # check whether the sum of the coeffs of m_04
   matrix(c(0,1), nrow=1), # multiplied by this
   rhs=0.625, # is significantly different from 0.625, namely
   alternative="less")) # by being smaller
```

```{r}
#| echo: false
#| eval: false
marginaleffects::hypotheses(m_04, "b2=0.625") # but this is 2-tailed!
summary(glht( # show the summary of a glht, namely
   m_04, # check whether the sum of the coeffs of m_04
   matrix(c(0,1), nrow=1), # multiplied by this
   rhs=0.625)) # is significantly different from 0.625, namely
```

What this does is this:

```{r}
(summary(m_04)$coef[2,"Estimate"]-0.625) /
   summary(m_04)$coef[2,"Std. Error"] # t-score of glht
pt(q=-1.863831, df=m_04$df.residual, lower.tail=TRUE) # 1-tailed p-value of glht
```

Now we have all we want:^[I may misunderstand things (!) but I think the output of `summary(glht(...))` gets the row names of the output wrong: I think it should be "1 <= 0.625".] the one-tailed 95%-confidence interval of `r coef(m_04)[2]` does not overlap with 0.625 and we have a proper *p*-value as well; we can report that, as expected, the slope of `SURPRISAL_s` is indeed significantly smaller than the value of 0.625 reported in the previous study.

A third way that can also work involves model comparison. We have a model `m_04` where the slope of `SURPRISAL_s` is `r coef(m_04)[2]`, but we can also create a model `m_05`, in which the slope of `SURPRISAL_s` is fixed to 0.625 (using `offset`), and then we can use `anova` for one of our usual model comparisons:

```{r}
m_05 <- lm(DURATION_l ~ 1 +
   offset(0.625*SURPRISAL_s), # here we fix the slope of SURPRISAL_s
   data=d)
anova(m_04, m_05, test="F")
```

While this is a two-tailed comparison, these are the same results as from the `summary(glht(...))`:

* the *p*-value here is twice `glht`'s *p*-value;
* the *F*-score here is `glht`'s *t*-value squared.

The exact same logic applies to more complex models; let's look at `m_02` again:

```{r}
summary(m_02)$coef
```

Imagine a previous study reported that the slope of `SURPRISAL_s` for numbers was 0.3 (and significantly different from 0). If you actually expected to find in your data that this slope

1. would be significantly lower than 0.3, in fact
2. would not be significantly different from 0,

how could you explore and test this?

```{r}
# question 1:
summary(glht( # show the summary of a glht, namely
   m_02, # check whether the sum of the coeffs of m_02
   matrix(c(0,1,0,0,0,0,1,0,0,0), nrow=1), # multiplied by this
   rhs=0.3, # is significantly different from 0.3, namely
   alternative="less")) # by being smaller
# question 2:
confint(glht( # show the confint of a glht, namely
   m_02, # for the sum of the product of the coeffs of m_02
   matrix(c(0,1,0,0,0,0,1,0,0,0), nrow=1)), # and these values
   level=0.9) # cut off 5% on each side, not 2.5%
summary(glht( # show the summary of a glht, namely
   m_02, # check whether the sum of the coeffs of m_02
   matrix(c(0,1,0,0,0,0,1,0,0,0), nrow=1), # multiplied by this
   alternative="less")) # by being smaller
```

:::{.callout-tip collapse="true"}
See Faraway (2015, Section 3.2) for some more discussion.
:::

### Model matrices from `effects`

Ok, after all this painful manual defining of 'model matrices' to give to `glht`, you will be glad to hear there is one bit of relief I can offer. Remember the above 2c case where we wanted to find out whether the following two predictions are significantly different from each other:

* when `SURPRISAL_s` is at its mean and `CLASS` is *propname*;
* when `SURPRISAL_s` is at its mean and `CLASS` is *unclear*.

Above, we did this and I think we can all agree that defining `SUR.is.mean.CLA.is.prop` and `SUR.is.mean.CLA.is.uncl` is less than fun:

```{r}
SUR.is.mean.CLA.is.prop <- c(1, mean_surp, 0, 0, 1, 0, 0, 0, mean_surp, 0)
SUR.is.mean.CLA.is.uncl <- c(1, mean_surp, 0, 0, 0, 1, 0, 0, 0 , mean_surp)
(diff.betw.them <- SUR.is.mean.CLA.is.prop - SUR.is.mean.CLA.is.uncl)
summary(glht(model=m_02, linfct=matrix(diff.betw.them, nrow=1)))
```

Here's the good news, which comes in two steps that will simplify things considerably. Remember that we already defined `nd` above as a data frame with all combinations of potentially interesting predictor values: all levels of categorical predictors and `c(0, 1, mean)` for all numeric predictors; I am reducing it here to just the two columns with the predictor combinations:

```{r}
(nd <- nd[,1:2])
```

Step 1 of the simplification is realizing that we can use a version of `nd` as the argument `xlevels` to `effect`. Remember how we always define our `effects` object?

```{r}
#| eval: false
sucl_d <- data.frame(
   sucl <- effect("SURPRISAL_s:CLASS", m_02))
```

We can now do this to get effects predictions and confidence intervals for everything, which can already be a huge advantage:

```{r}
(sucl_d <- data.frame(
   sucl <- effect("SURPRISAL_s:CLASS", m_02,
      xlevels=lapply(nd, unique))))
```

But the other advantage is that the `effects` object actually contains a model matrix:

```{r}
(sucl_mm <- sucl$model.matrix)
```

That means, once we give `effect` an `nd` object, it can do the work of defining model matrix vectors for us. If we're interested in the prediction when `SURPRISAL_s` is at its mean and `CLASS` is *propname*, we don't need to manually develop this anymore like this ...

```{r}
#| eval: false
SUR.is.mean.CLA.is.prop <- c(1, mean_surp, 0, 0, 1, 0, 0, 0, mean_surp, 0)
```

... no, we just pick from what `effect` computed for us:

```{r}
SUR.is.mean.CLA.is.prop <- sucl_mm[12,]
```

Same for when `SURPRISAL_s` is at its mean and `CLASS` is *unclear*: no more this ...

```{r}
#| eval: false
SUR.is.mean.CLA.is.uncl <- c(1, mean_surp, 0, 0, 0, 1, 0, 0, 0, mean_surp)
```

...
instead, we just do this:

```{r}
SUR.is.mean.CLA.is.uncl <- sucl_mm[15,]
```

And then we can do the usual:

```{r}
(diff.betw.them <- SUR.is.mean.CLA.is.prop - SUR.is.mean.CLA.is.uncl)
summary(glht(model=m_02, linfct=matrix(diff.betw.them, nrow=1)))
```

In other words, all we need to do is give `effect` the combinations of things we might be interested in and then pick the right row(s) from its model matrix.

This also works for slopes. Remember the above 3c case where we wanted to find out whether the slope of `SURPRISAL_s` when `CLASS` is *open* is significantly different from when `CLASS` is *unclear*. Above, we did this and I think we can all agree that defining `slope.when.SUR.is.0.CLA.is.open` and `slope.when.SUR.is.0.CLA.is.uncl` was kinda awful:

```{r}
#| eval: false
slope.when.SUR.is.0.CLA.is.open <- c(0, 1, 0, 0, 0, 0, 0, 1, 0, 0)
slope.when.SUR.is.0.CLA.is.uncl <- c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1)
(diff.betw.them <- slope.when.SUR.is.0.CLA.is.open - slope.when.SUR.is.0.CLA.is.uncl)
```

But now we just pick things out from `sucl_mm`:

```{r}
slope.when.SUR.is.0.CLA.is.open <- sucl_mm[ 8,] - sucl_mm[ 7,]
slope.when.SUR.is.0.CLA.is.uncl <- sucl_mm[14,] - sucl_mm[13,]
(diff.betw.them <- slope.when.SUR.is.0.CLA.is.open - slope.when.SUR.is.0.CLA.is.uncl)
summary(glht(model=m_02, linfct=matrix(diff.betw.them, nrow=1)))
```

I hope you see how big a simplification this is; it's a good enough reason to always define an `nd` object and consider `effects` for your follow-up analyses.

## Exercises

### `m_final` of the lm from week 1

Let's fit a model `m_final`:

```{r}
d <- droplevels(d[d$CLASS!="unclear",])
d$SURPRISAL_sc <- sqrt(d$SURPRISAL) - mean(sqrt(d$SURPRISAL))
summary(m_final <- lm(
   DURATION_l ~ 1 + (SURPRISAL_sc + VOWELS + CLASS)^2,
   data=d))$coef
```

#### Interaction 1: `SURPRISAL_sc:VOWELS`

Interpretation: The more surprising a word is, the longer its duration, but that effect is significantly stronger for words with just one vowel than for words with two+ vowels. But here are some questions for you:

1. what is the slope of `SURPRISAL_sc` for closed-class words with one vowel? (question type 3a)
2. what is the slope of `SURPRISAL_sc` for closed-class words with two+ vowels? (question type 3a)
3. is the difference between the two significant? What is its *p*-value? What is its confidence interval? (question type 3c)

For the slope of `SURPRISAL_sc` for closed-class words with one vowel, we can use `nd`:

```{r}
nd <- expand.grid(
   SURPRISAL_sc=0:1, # main values of SURPRISAL_sc
   CLASS=levels(d$CLASS), # all levels of CLASS
   VOWELS=levels(d$VOWELS)) %>% # all levels of VOWELS
   "["(,3:1) # reorder columns
nd <- cbind(nd, PREDS=predict(
   m_final, newdata=nd, interval="confidence"))
colnames(nd)[4:6] <- c("fit", "lwr", "upr")
# 1: slope when CLASS is "closed" & when VOWEL is "one"
# this is a scenario 3a question
nd$fit[2] - nd$fit[1]
coef(m_final)[2]
```

But we can now also solve this with `glht`.

```{r}
# 1: slope when CLASS is "closed" & when VOWEL is "one"
# this is a scenario 3a question
CLASS.is.clos.N.SURPR.is.0.N.VOW.is.1 <- c(1,0,0,0,0,0,0,0,0,0,0,0,0)
CLASS.is.clos.N.SURPR.is.1.N.VOW.is.1 <- c(1,1,0,0,0,0,0,0,0,0,0,0,0)
("-"(
   CLASS.is.clos.N.SURPR.is.1.N.VOW.is.1,
   CLASS.is.clos.N.SURPR.is.0.N.VOW.is.1)) %>%
   matrix(nrow=1) %>%
   glht(m_final, .) %>% summary
```
For the slope of `SURPRISAL_sc` for closed-class words with two+ vowels, we can also use `nd`:

```{r}
# 2: slope when CLASS is "closed" & when VOWEL is "two+"
# this is a scenario 3a question
nd$fit[10] - nd$fit[9]
coef(m_final)[2] + coef(m_final)[7]
```

But we can now also solve this with `glht`.

```{r}
# 2: slope when CLASS is "closed" & when VOWEL is "two+"
# this is a scenario 3a question
CLASS.is.clos.N.SURPR.is.0.N.VOW.is.2p <- c(1,0,1,0,0,0,0,0,0,0,0,0,0)
CLASS.is.clos.N.SURPR.is.1.N.VOW.is.2p <- c(1,1,1,0,0,0,1,0,0,0,0,0,0)
("-"(
   CLASS.is.clos.N.SURPR.is.1.N.VOW.is.2p,
   CLASS.is.clos.N.SURPR.is.0.N.VOW.is.2p)) %>%
   matrix(nrow=1) %>%
   glht(m_final, .) %>% summary
```

Finally we ask: Is the difference between the two significant? What is its *p*-value? What is its confidence interval? Above we did this, ...

```{r}
# 3: difference between the two: p and confint
# this is a scenario 3c question
summary(m_final)$coefficients[7,]
Confint(m_final)[7,]
pairs(emtrends( # compute pairwise comparisons of trends/slopes
   object=m_final, # for the model m_final
   specs=~ VOWELS | CLASS, # compute comparisons over this
   var="SURPRISAL_sc"), # namely all slopes of this predictor
   adjust="none")[1,] # do not adjust for multiple comparisons
```

With `glht`, we can also use something that, if done stepwise, is quite a monstrosity:

```{r}
# 3: difference between the two: p and confint
# this is a scenario 3c question
"-"( # compute the difference between 2 slopes, namely the
   "-"( # slope when CLASS is "closed" & when VOWEL is "two+"
      CLASS.is.clos.N.SURPR.is.1.N.VOW.is.2p,
      CLASS.is.clos.N.SURPR.is.0.N.VOW.is.2p),
   "-"( # slope when CLASS is "closed" & when VOWEL is "one"
      CLASS.is.clos.N.SURPR.is.1.N.VOW.is.1,
      CLASS.is.clos.N.SURPR.is.0.N.VOW.is.1)) %>%
   # put that into a difference (model) matrix:
   matrix(nrow=1) -> diff_matrix
glht(m_final, diff_matrix) %>% summary
glht(m_final, diff_matrix) %>% confint
```

#### Interaction 2: `SURPRISAL_sc:CLASS`

Interpretation: The more surprising a word is, the longer its duration is, but that effect is not the same across all classes of words. Specifically and controlling for other predictors,

* the slope of `SURPRISAL_sc` seems to be higher for open-class words than for closed-class words;
* the slope of `SURPRISAL_sc` seems to be very flat for numbers and proper names.

But here are the questions for you we asked in week 1:

1. what is the slope of `SURPRISAL_sc` for closed-class words with one vowel? (question type 3a)
2. what is the slope of `SURPRISAL_sc` for open-class words with one vowel? (question type 3a)
3. is the difference between the two significant? What is its *p*-value? What is its confidence interval? (question type 3c)
4. what is the slope of `SURPRISAL_sc` for numbers with one vowel? (question type 3a)
5. what is the slope of `SURPRISAL_sc` for proper names with one vowel? (question type 3a)
6. is the difference between the two significant? What is its *p*-value? What is its confidence interval? (question type 3c)
7. are these two slopes significantly different from 0? what are the *p*-values? (question type 3b)
The first one we already dealt with above, so let's now tackle the rest.

```{r}
# 2: slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "open"
# this is a scenario 3a question
nd$fit[6] - nd$fit[5]
coef(m_final)[2] + coef(m_final)[9]
# now with glht:
CLASS.is.open.N.SURPR.is.0.N.VOW.is.1 <- c(1,0,0,0,1,0,0,0,0,0,0,0,0)
CLASS.is.open.N.SURPR.is.1.N.VOW.is.1 <- c(1,1,0,0,1,0,0,0,1,0,0,0,0)
("-"(
   CLASS.is.open.N.SURPR.is.1.N.VOW.is.1,
   CLASS.is.open.N.SURPR.is.0.N.VOW.is.1)) %>%
   matrix(nrow=1) %>%
   glht(m_final, .) %>% summary
# 3: difference between the two: p and confint
# this is a scenario 3c question
"-"( # compute the difference between 2 slopes, namely the
   "-"( # slope when CLASS is "open" & when VOWEL is "one"
      CLASS.is.open.N.SURPR.is.1.N.VOW.is.1,
      CLASS.is.open.N.SURPR.is.0.N.VOW.is.1),
   "-"( # slope when CLASS is "closed" & when VOWEL is "one"
      CLASS.is.clos.N.SURPR.is.1.N.VOW.is.1,
      CLASS.is.clos.N.SURPR.is.0.N.VOW.is.1)) %>%
   # put that into a difference (model) matrix:
   matrix(nrow=1) -> diff_matrix
glht(m_final, diff_matrix) %>% summary
glht(m_final, diff_matrix) %>% confint
# 4: slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "number"
# this is a scenario 3a question
nd$fit[4] - nd$fit[3]
coef(m_final)[2] + coef(m_final)[8]
# now with glht:
CLASS.is.numb.N.SURPR.is.0.N.VOW.is.1 <- c(1,0,0,1,0,0,0,0,0,0,0,0,0)
CLASS.is.numb.N.SURPR.is.1.N.VOW.is.1 <- c(1,1,0,1,0,0,0,1,0,0,0,0,0)
("-"(
   CLASS.is.numb.N.SURPR.is.1.N.VOW.is.1,
   CLASS.is.numb.N.SURPR.is.0.N.VOW.is.1)) %>%
   matrix(nrow=1) %>%
   glht(m_final, .) %>% summary
# 5/7: slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "propname"
# this is a scenario 3a question
nd$fit[8] - nd$fit[7]
coef(m_final)[2] + coef(m_final)[10]
CLASS.is.prop.N.SURPR.is.0.N.VOW.is.1 <- c(1,0,0,0,0,1,0,0,0,0,0,0,0)
CLASS.is.prop.N.SURPR.is.1.N.VOW.is.1 <- c(1,1,0,0,0,1,0,0,0,1,0,0,0)
("-"(
   CLASS.is.prop.N.SURPR.is.1.N.VOW.is.1,
   CLASS.is.prop.N.SURPR.is.0.N.VOW.is.1)) %>%
   matrix(nrow=1) %>%
   glht(m_final, .) %>% summary
# 6: this is a scenario 3c question
"-"( # compute the difference between 2 slopes, namely the
   "-"( # slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "number"
      CLASS.is.numb.N.SURPR.is.1.N.VOW.is.1,
      CLASS.is.numb.N.SURPR.is.0.N.VOW.is.1),
   "-"( # slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "propname"
      CLASS.is.prop.N.SURPR.is.1.N.VOW.is.1,
      CLASS.is.prop.N.SURPR.is.0.N.VOW.is.1)) %>%
   matrix(nrow=1) -> diff_matrix
glht(m_final, diff_matrix) %>% summary
glht(m_final, diff_matrix) %>% confint
# 7: this is a scenario 3b question
# slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "number": see 4 above
# slope of SURPRISAL_sc when VOWEL is "one" & when CLASS is "propname": see 5 above
```

### A final nice example

For the final example, we will briefly load a different model on a completely different data set. This data set was actually analyzed with a mixed-effects model, but since we haven't dealt with those yet, we will make this a fixed-effects model only. In this example, the response variable is `GENITIVE` (*of* vs. *s*) and the model we're discussing here has the following predictors:

* `LANG`: the L1 of the speaker: *english* vs. *chinese* vs. *german*;
* `POSSORANIM`: whether the possessor of the genitive is *animate* or *inanimate* (because the former prefer *s*-genitives much more than the latter);
* `POSSORNUMBER`: whether the possessor of the genitive is *singular*, *plural* or *irregplural* (because regular plurals disprefer *s*-genitives much more than the others);
* `LENGTHNPDIFF`: a variable quantifying the length difference between possessor and possessed: positive values should go with *s*-genitives, negative values should go with *of*-genitives.

From the output of the model `m`, you can see that the interaction `LANG:POSSORANIM` is significant, and the plot shows the effects predictions for it:

```{r}
rm(list=ls(all.names=TRUE))
summary(d <- readRDS("_input/2024_STG_AgainstLevel3_ICAMEJournal_data.RDS"))
summary(m <- readRDS("_input/2024_STG_AgainstLevel3_ICAMEJournal_glm.RDS"))$coefficients %>% round(5)
drop1(m, test="Chisq")
lapo_d <- data.frame(
   lapo <- effect("LANG:POSSORANIM", m))
lapo_d$fit.lo <- qlogis(lapo_d$fit)
plot(type="n", axes=FALSE,
   xlab="L1 of the speakers"        , xlim=c(0.9, 3.1), x=0,
   ylab="Pred. prob. of s-genitives", ylim=c(0  , 0.5), y=0); grid()
axis(2); axis(1, at=1:3, labels=c("English", "Chinese", "German")) # levels(d$LANG)
arrows(rep(1:3,2), lapo_d$lower,
   rep(1:3,2), lapo_d$upper,
   angle=90, code=3)
lines(1:3, lapo_d$fit[c(1,2,3)], type="b", pch=16); text(2.75, 0.3 , "anim")
lines(1:3, lapo_d$fit[c(4,5,6)], type="b", pch=16); text(1.25, 0.05, "inanim")
```

Clearly, for all speaker L1s, animate possessors lead to significantly more *s*-genitives than inanimate possessors (we wouldn't even need `emmeans` for that: the confidence intervals between *a* and *i* within each level of `LANG` never overlap). Just as clearly, the Chinese learners behave differently from the native speakers and the German learners because, for them, the difference between animate and inanimate possessors is much smaller. Here are questions for you to answer (with `glht`!):

1. is the prediction for inanimate possessors for the Chinese learners significantly higher than that for the native speakers?
2. is the prediction for inanimate possessors for the Chinese learners significantly higher than that for the German learners?
3.
Is the difference between *a* and *i* for the German learners significantly different from the difference between *a* and *i* for the Chinese learners?To make things easier for you, do your calculations only for when `LENGTHNPDIFF` is 0 and when `POSSORNUMBER` is *singular*; those will numerically be somewhat different from the above effects estimate, but conceptually similar (because I doubt you want to spend time on making these comparisons for the plot above, for which `LENGTHNPDIFF` would have to be at its mean and `POSSORNUMBER` would have to be its proportional representation ;-) But if you want to do this with the effects object's model matrix, be my guest).```{r}nd <-expand.grid(POSSORNUMBER="singular",LENGTHNPDIFF=0,LANG=levels(d$LANG),POSSORANIM=levels(d$POSSORANIM))nd$fit.lo <-predict(m, newdata=nd)``````{r}#| echo: false#| eval: falseinan.N.engl <-matrix(c(1,0,0,1,0,0,0,0,0), nrow=1)inan.N.chin <-matrix(c(1,1,0,1,0,0,0,1,0), nrow=1)inan.N.germ <-matrix(c(1,0,1,1,0,0,0,0,1), nrow=1)# 1.summary(glht(m, inan.N.chin-inan.N.engl, alternative="greater"))# 2.summary(glht(m, inan.N.chin-inan.N.germ, alternative="greater"))``````{r}#| echo: false#| eval: falseanim.N.engl <-matrix(c(1,0,0,0,0,0,0,0,0), nrow=1)anim.N.chin <-matrix(c(1,1,0,0,0,0,0,0,0), nrow=1)anim.N.germ <-matrix(c(1,0,1,0,0,0,0,0,0), nrow=1)# which means nd is this:glht.retriever(glht(m, anim.N.engl))glht.retriever(glht(m, anim.N.chin))glht.retriever(glht(m, anim.N.germ))glht.retriever(glht(m, inan.N.engl))glht.retriever(glht(m, inan.N.chin))glht.retriever(glht(m, inan.N.germ))# 3.summary(glht(m, "-"( anim.N.germ - inan.N.germ, anim.N.chin - inan.N.chin)))```# Session info```{r}sessionInfo()```