Categorical Predictors

Author

Josef Fruehwald

Published

March 9, 2023

library(tidyverse)
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.4.0     ✔ purrr   1.0.1
✔ tibble  3.1.8     ✔ dplyr   1.1.0
✔ tidyr   1.3.0     ✔ stringr 1.5.0
✔ readr   2.1.3     ✔ forcats 0.5.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
um_uh <- read_tsv("https://bit.ly/3JdeSbx") 
Rows: 26060 Columns: 14
── Column specification ────────────────────────────────────────────────────────
Delimiter: "\t"
chr  (3): word, next_seg, idstring
dbl (11): start_time, end_time, vowel_start, vowel_end, nasal_start, nasal_e...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
um_uh
# A tibble: 26,060 × 14
   word  start…¹ end_t…² vowel…³ vowel…⁴ nasal…⁵ nasal…⁶ next_…⁷ next_…⁸ next_…⁹
   <chr>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl> <chr>     <dbl>   <dbl>
 1 UH       24.4    24.7    24.4    24.7    NA      NA   S          24.7    24.9
 2 UH       35.0    35.2    35.0    35.2    NA      NA   F          35.2    35.4
 3 UM       37.9    38.3    37.9    38.1    38.1    38.3 sp         38.3    38.4
 4 UH       44.5    44.7    44.5    44.7    NA      NA   DH         44.7    44.7
 5 UH       57.6    57.8    57.6    57.8    NA      NA   AY1        57.8    57.9
 6 UH       62.3    62.5    62.3    62.5    NA      NA   sp         62.5    63.0
 7 UH       73.9    74.2    73.9    74.2    NA      NA   sp         74.2    75.0
 8 UH       75.1    75.4    75.1    75.4    NA      NA   sp         75.4    75.7
 9 UM       81.6    82.0    81.6    81.8    81.8    82.0 sp         82.0    84.0
10 UH       92.6    92.9    92.6    92.9    NA      NA   sp         92.9    93.4
# … with 26,050 more rows, 4 more variables: chunk_start <dbl>,
#   chunk_end <dbl>, nwords <dbl>, idstring <chr>, and abbreviated variable
#   names ¹​start_time, ²​end_time, ³​vowel_start, ⁴​vowel_end, ⁵​nasal_start,
#   ⁶​nasal_end, ⁷​next_seg, ⁸​next_seg_start, ⁹​next_seg_end
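um_uh |> 
  mutate(
    # duration of the vowel portion of each token
    vowel_dur = vowel_end - vowel_start,
    # two-level categorical predictor: is the next segment a pause ("sp")?
    # rows where next_seg is NA fall through to the default, "no pause"
    fol_pause = case_when(
      next_seg == "sp" ~ "pause", 
      .default = "no pause"
    )
  ) |> 
  # keep only the speaker id, the word, the new predictor, and the outcome
  select(idstring, word, fol_pause, vowel_dur) -> 
  pause_data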
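
To check how the `case_when()` recoding above behaves, here is a minimal sketch on a made-up toy tibble. The `toy`, `toy_coded`, and `toy_counts` names and the five `next_seg` values are hypothetical and are not taken from the `um_uh` data; the sketch assumes dplyr 1.1.0 or later, which is the version that provides the `.default` argument.

```r
library(dplyr)

# Hypothetical toy data: five following segments, one of them missing
toy <- tibble(
  next_seg = c("sp", "S", NA, "sp", "AY1")
)

# Same recoding as in the pipeline above
toy_coded <- toy |>
  mutate(
    fol_pause = case_when(
      next_seg == "sp" ~ "pause",
      .default = "no pause"
    )
  )

# Tabulate how many rows fall into each level of the predictor
toy_counts <- toy_coded |>
  count(fol_pause)

toy_counts
```

In this toy example the `NA` row ends up as "no pause", because a missing condition never matches and so falls through to `.default`. Whether that is the right treatment for missing `next_seg` values in the real data is a judgment call.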

Reuse

CC-BY-SA 4.0