Skip to contents

The goal of tidynorm is to provide convenient and tidy functions to normalize vowel formant data.

Installation

You can install the development version of tidynorm like so:

## if you need to install `remotes`
# install.packages("remotes")
remotes::install_github("jofrhwld/tidynorm")

Example

Vowel formant frequencies are heavily influenced by vocal tract length differences between speakers. Equivalent vowels between speakers can have dramatically different frequency locations.

Plotting Options
options(
  ggplot2.discrete.colour = c(
    lapply(
      1:6, 
      \(x) c("#4477AA", "#EE6677", "#228833", 
             "#CCBB44", "#66CCEE", "#AA3377")[1:x]
    )
  ),
  ggplot2.discrete.fill = c(
    lapply(
      1:6, 
      \(x) c("#4477AA", "#EE6677", "#228833", 
             "#CCBB44", "#66CCEE", "#AA3377")[1:x]
    )
  )  
)

theme_set(
  theme_minimal(
    base_size = 16
  )
)
Plotting Code
ggplot(
  speaker_data,
  aes(
    F2, F1,
    color = speaker
  )
)+
  ggdensity::stat_hdr(
    probs = c(0.95, 0.8, 0.5),
    alpha = 1,
    fill = NA,
    linewidth = 1
  )+
  scale_x_reverse()+
  scale_y_reverse()+
  coord_fixed()+
  labs(
    title = "unnormalized"
  )

The goal of tidynorm is to provide tidyverse-friendly and familiar functions that will allow you to quickly normalize vowel formant data. There are a number of built in functions based on conventional normalization methods.

speaker_data |> 
  norm_nearey(
    F1:F3,
    .by = speaker,
    .names = "{.formant}_nearey"
  ) ->
  speaker_normalized
#> Normalization info
#> • normalized `F1`, `F2`, and `F3`
#> • normalized values in `F1_nearey`, `F2_nearey`, and `F3_nearey`
#> • grouped by `speaker`
#> • formant extrinsic
Plotting Code
speaker_normalized |> 
  ggplot(
    aes(
      F2_nearey, F1_nearey,
      color = speaker
    )
  )+
  ggdensity::stat_hdr(
    probs = c(0.95, 0.8, 0.5),
    alpha = 1,
    fill = NA,
    linewidth = 1
  )+
  scale_x_reverse()+
  scale_y_reverse()+
  coord_fixed() +
  labs(
    title = "Nearey Normalized"
  )

There is also a tidynorm::norm_generic() function to allow you to define your own bespoke normalization methods. For example, a “robust Nearey” normalization method using the median, instead of the mean, could be done like so.

speaker_rnearey <- speaker_data |> 
  norm_generic(
    F1:F3,
    .by = speaker,
    .by_formant = FALSE,
    .pre_trans = log,
    .L = median(.formant, na.rm = T),
    .names = "{.formant}_rnearey"
  )
#> Normalization info
#> • normalized `F1`, `F2`, and `F3`
#> • normalized values in `F1_rnearey`, `F2_rnearey`, and `F3_rnearey`
#> • grouped by `speaker`
#> • formant extrinsic
Plotting Code
speaker_rnearey |> 
 ggplot(
    aes(
      F2_rnearey, F1_rnearey,
      color = speaker
    )
  )+
  ggdensity::stat_hdr(
    probs = c(0.95, 0.8, 0.5),
    alpha = 1,
    fill = NA,
    linewidth = 1
  )+
  scale_x_reverse()+
  scale_y_reverse()+
  coord_fixed() +
  labs(
    title = "Robust Nearey Normalized"
  )