The “sociolinguistic” part of the title might suggest that I’ll be talking about:
Different ways of saying the same thing.
Language use is not deterministic, i.e. even in narrowly defined linguistic and social contexts, there is structured optionality. (Weinreich, Labov & Herzog, 1968, and so on)
The data in this talk is drawn from the Philadelphia Neighborhood Corpus.
| speakers | transcribed audio | words | stressed vowels | date of birth range |
(Wieling et al., forthcoming)
Maybe the observed pattern is due to an age-linked pattern of use: As speakers get older, they use UH more.
gamm4(UM ~ t2(dob, year, bs = "tp"), random = ~ (1|speaker))
Whatever it is, it’s playing out as an inter-generational shift.
If Clark & Fox Tree (2002) are correct, and UM and UH signal different lengths of upcoming delay,
what would it mean for UM to be used more often?
This isn’t a shift in equivalent alternatives, but a shift in meanings being expressed.
It seems like UM and UH are trading off in frequency, but what exactly is trading off?
Following Clark & Fox Tree (2002), I’ll be treating the duration of following silence as the message space.
A change of usage within a stable communicative context will have a different quantitative profile from a change in the messages being signaled with a stable usage system.
## Data: pause_for_mod
## Models:
## dur_mod4: log2dur ~ 1 + (1 | idstring)
## dur_mod3: log2dur ~ sex + (1 | idstring)
## dur_mod2: log2dur ~ decade + sex + (1 | idstring)
## dur_mod1: log2dur ~ decade * sex + (1 | idstring)
##          Df   AIC   BIC logLik deviance  Chisq Chi Df Pr(>Chisq)
## dur_mod4  3 70358 70382 -35176    70352
## dur_mod3  4 70354 70385 -35173    70346 6.3190      1    0.01194 *
## dur_mod2  5 70354 70393 -35172    70344 2.1684      1    0.14087
## dur_mod1  6 70355 70402 -35172    70343 0.7363      1    0.39084
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
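For anyone who wants to check how to read the dur_mod comparison, here is a minimal sketch that recomputes the p-values and AICs from the printed numbers. It assumes the usual likelihood ratio test setup (each Chisq value referred to a chi-square distribution on the listed degrees of freedom) and AIC = deviance + 2 × number of parameters. The vector names are mine, not part of the original analysis, and the inputs are the rounded values from the table, so agreement is only approximate.

```r
# Values copied from the dur_mod comparison table (rounded as printed).
n_par    <- c(dur_mod4 = 3, dur_mod3 = 4, dur_mod2 = 5, dur_mod1 = 6)
deviance <- c(dur_mod4 = 70352, dur_mod3 = 70346, dur_mod2 = 70344, dur_mod1 = 70343)
chisq    <- c(dur_mod3 = 6.3190, dur_mod2 = 2.1684, dur_mod1 = 0.7363)

# Each test compares a model with the one listed just before it,
# so the degrees of freedom are the successive differences in parameter count.
chi_df <- unname(diff(n_par))

# Upper-tail chi-square probability for each likelihood ratio statistic.
p_value <- pchisq(chisq, df = chi_df, lower.tail = FALSE)

# AIC recomputed from the deviance and the parameter count.
aic <- deviance + 2 * n_par

print(round(p_value, 5))
print(aic)
```

Only the step from the intercept-only model to the model with sex clears the conventional 0.05 threshold; adding decade, or its interaction with sex, does not.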
There is equivocal evidence compatible with the hypothesis that the messages speakers are sending, or their speech planning difficulties, are changing over time.
The model with the full dob \(\times\) word \(\times\) gender interaction has the lowest AIC, and is favored by the likelihood ratio tests.
The model without any date of birth predictor has the lowest BIC.
## Data: um_for_mod
## Models:
## um_mod3: log2dur ~ word * sex + (1 | idstring)
## um_mod2: log2dur ~ decade * sex + word * sex + (1 | idstring)
## um_mod1: log2dur ~ decade * word * sex + (1 | idstring)
##         Df   AIC   BIC logLik deviance   Chisq Chi Df Pr(>Chisq)
## um_mod3  6 69918 69965 -34953    69906
## um_mod2  8 69910 69972 -34947    69894 12.4935      2   0.001937 **
## um_mod1 10 69905 69983 -34943    69885  8.5948      2   0.013604 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
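Why AIC and BIC point in opposite directions for the um_mod models can be seen from the penalties alone. The sketch below assumes the standard definitions (AIC adds 2 per parameter to the deviance, BIC adds log(n) per parameter) and backs out the implied BIC cost per parameter from the rounded table values. The data frame and column names are hypothetical and are not part of the original analysis.

```r
# AIC, BIC and parameter counts copied from the um_mod comparison table.
models <- data.frame(
  model = c("um_mod3", "um_mod2", "um_mod1"),
  n_par = c(6, 8, 10),
  aic   = c(69918, 69910, 69905),
  bic   = c(69965, 69972, 69983)
)

# BIC - AIC = n_par * (log(n) - 2), so this recovers an approximate log(n),
# i.e. what BIC charges for each extra parameter (AIC charges 2).
models$bic_cost_per_par <- (models$bic - models$aic) / models$n_par + 2

# Model preferred by each criterion.
best_aic <- models$model[which.min(models$aic)]
best_bic <- models$model[which.min(models$bic)]

print(models)
print(c(AIC = best_aic, BIC = best_bic))
```

With this much data, BIC charges nearly five times as much per parameter as AIC does, so the improvement in fit from the date of birth terms is enough to satisfy AIC but not BIC.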
Under the changing messages hypothesis, the predictive power of the duration of following silences should be stable.
Under the changing usage hypothesis, its predictive power should decrease as speakers start using “um” more often irrespective of the context.
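To make the contrast between these two predictions concrete, here is a small simulation sketch. Everything in it is invented for illustration: the cohort sizes, the intercepts, the slopes, and the choice to measure "predictive power" as the slope of a logistic regression of UM (versus UH) on the log2 duration of the following silence. It is not the analysis run on the PNC data.

```r
# Simulated illustration only: all parameter values below are made up.
set.seed(1)

simulate_cohort <- function(n, mean_log2dur, intercept, slope) {
  # log2 duration of the silence following the filled pause
  log2dur <- rnorm(n, mean = mean_log2dur, sd = 1)
  # probability that the filled pause is UM rather than UH
  p_um <- plogis(intercept + slope * log2dur)
  data.frame(log2dur = log2dur, um = rbinom(n, size = 1, prob = p_um))
}

duration_slope <- function(d) {
  # how strongly following silence predicts the choice of UM
  fit <- glm(um ~ log2dur, family = binomial, data = d)
  unname(coef(fit)["log2dur"])
}

n <- 5000
cohorts <- list(
  # Changing messages: same mapping from silence to word, longer silences later.
  messages_early = simulate_cohort(n, mean_log2dur = -1, intercept = 0, slope = 1),
  messages_late  = simulate_cohort(n, mean_log2dur =  0, intercept = 0, slope = 1),
  # Changing usage: same silences, but UM spreads irrespective of context.
  usage_early    = simulate_cohort(n, mean_log2dur = -1, intercept = 0, slope = 1),
  usage_late     = simulate_cohort(n, mean_log2dur = -1, intercept = 2, slope = 0.2)
)

slopes  <- sapply(cohorts, duration_slope)
um_rate <- sapply(cohorts, function(d) mean(d$um))

print(round(rbind(um_rate, slopes), 2))
```

In both simulated scenarios the overall rate of UM goes up, but only under changing usage does the slope on following silence flatten out, which is the quantitative signature to look for.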
The frequency with which speakers use UM or UH as filled pauses is a non-trivial, but arbitrary, aspect of their knowledge of their language, and it can change.
Speakers must be able to track frequencies of UM vs UH, despite the fact that they can’t accurately report when they’ve heard them (Lickley 1995; Lickley & Bard 1996).
The “meaning” of UM and UH may function similarly to other sociolinguistic variables.
While others have noted that a following UM or UH can condition the long form of a preceding “the”, [ði:], systematically absent from the PNC are any examples of it triggering a preceding “an”.
Perhaps pursuing further the points of contact between these two areas of study will be enlightening to both of us.