Is it really, um, revealing?

Josef Fruehwald

February 2, 2015

Intro

Two parts of this talk

  1. Increasing “um”: a language change in progress
  2. How informative is it?

The Data

UhUm Package

The raw data from the Philadelphia Neighborhood Corpus is available here:

  library(devtools)
  install_github("jofrhwld/UhUm")

UhUm Package

  library(UhUm)
  head(um_PNC, 3)
##    idstring word start_time end_time vowel_start vowel_end nasal_start
## 1 PH00-1-1-   UH      24.39    24.69       24.39     24.69          NA
## 2 PH00-1-1-   UH      34.96    35.24       34.96     35.24          NA
## 3 PH00-1-1-   UM      37.90    38.27       37.90     38.12       38.12
##   nasal_end next_seg next_seg_start next_seg_end chunk_start chunk_end
## 1        NA        S          24.69        24.87       24.39     25.29
## 2        NA        F          35.24        35.35       34.96     37.11
## 3     38.27       sp          38.27        38.39       37.90     38.80
##   nwords sex year age ethnicity schooling transcribed total nvowels
## 1   6551   m 2000  21       i/r        14        2811  2814    3078
## 2   6551   m 2000  21       i/r        14        2811  2814    3078
## 3   6551   m 2000  21       i/r        14        2811  2814    3078

um_PNC

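  library(dplyr)  # group_by(), summarise(), %>%
  library(tidyr)  # spread()

  # Count tokens of each filled-pause type by speaker sex
  um_PNC %>%
    group_by(word, sex) %>%
    summarise(n = n()) %>%
    ungroup() %>%
    spread(sex, n)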
## Source: local data frame [5 x 3]
## 
##     word    f    m
## 1 AND_UH  904 1176
## 2 AND_UM  314  153
## 3     UH 7523 9520
## 4     UM 4132 1792
## 5  UM_UH    7    2
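
The counts above are easier to compare as proportions. The following is a minimal sketch, not part of the original analysis: it retypes the counts from the table by hand rather than reading them from `um_PNC`, and the object names `counts`, `plain`, and `um_share` are illustrative assumptions. It computes the share of UM among plain UH and UM tokens for each sex, leaving out the AND_UH, AND_UM, and UM_UH rows.

```r
# Counts copied by hand from the table above (f = female, m = male).
counts <- data.frame(
  word = c("AND_UH", "AND_UM", "UH", "UM", "UM_UH"),
  f = c(904, 314, 7523, 4132, 7),
  m = c(1176, 153, 9520, 1792, 2)
)

# Keep only the plain UH and UM rows.
plain <- counts[counts$word %in% c("UH", "UM"), ]

# Share of UM among plain UH + UM tokens, separately for each sex.
um_share <- c(
  f = plain$f[plain$word == "UM"] / sum(plain$f),
  m = plain$m[plain$word == "UM"] / sum(plain$m)
)

print(round(um_share, 3))
```

On these counts the UM share comes out higher for female speakers than for male speakers. Because the raw counts pool all tokens, the figure reflects how much each speaker talked as well as which form they preferred.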

Transcription

From the FAVE transcription guidelines: