In addition to the generic normalization functions in tidynorm (norm_generic(), norm_track_generic(), and norm_dct_generic()), there are a number of convenience functions for a few established normalization methods.
Lobanov (Lobanov 1971)
tidynorm functions: norm_lobanov(), norm_track_lobanov(), norm_dct_lobanov()
Lobanov normalization z-scores each formant. If F_{ij} is the j^{th} token of the i^{th} formant, and \hat{F}_{ij} is its normalized value, then
\hat{F}_{ij} = \frac{F_{ij} - L_i}{S_i}
Where L_i is the mean across the i^{th} formant:
L_i = \frac{1}{N}\sum_{j=1}^N F_{ij}
And S_i is the standard deviation across the i^{th} formant.
S_i = \sqrt{\frac{\sum_j(F_{ij}-L_i)^2}{N-1}}
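To make the formula concrete, the same z-scoring can be written out by hand with dplyr. The toy speaker_data tibble below is a made-up stand-in with assumed column names, not the vignette's actual dataset:

```r
library(dplyr)

# toy stand-in data, not the vignette's speaker_data
speaker_data <- tibble(
  speaker = rep(c("s1", "s2"), each = 3),
  F1 = c(300, 500, 700, 350, 550, 750)
)

# Lobanov: z-score each formant within each speaker
by_hand <- speaker_data |>
  mutate(
    F1_z = (F1 - mean(F1, na.rm = TRUE)) / sd(F1, na.rm = TRUE),
    .by = speaker
  )
```

Within each speaker, F1_z then has mean 0 and standard deviation 1 by construction.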
Using the Lobanov normalization functions
On points
point_norm <- speaker_data |>
norm_lobanov(
F1:F3,
.by = speaker
)
#> Normalization info
#> • normalized with `tidynorm::norm_lobanov()`
#> • normalized `F1`, `F2`, and `F3`
#> • normalized values in `F1_z`, `F2_z`, and `F3_z`
#> • grouped by `speaker`
#> • within formant: TRUE
#> • (.formant - mean(.formant, na.rm = T))/(sd(.formant, na.rm = T))
On tracks
track_norm <- speaker_tracks |>
norm_track_lobanov(
F1:F3,
.by = speaker,
.token_id_col = id,
.time_col = t
)
#> Normalization info
#> • normalized with `tidynorm::norm_track_lobanov()`
#> • normalized `F1`, `F2`, and `F3`
#> • normalized values in `F1_z`, `F2_z`, and `F3_z`
#> • token id column: `id`
#> • time column: `t`
#> • grouped by `speaker`
#> • within formant: TRUE
#> • (.formant - mean(.formant, na.rm = T))/sd(.formant, na.rm = T)
On DCT Coefficients
dct_norm <- speaker_tracks |>
reframe_with_dct(
F1:F3,
.by = speaker,
.token_id_col = id,
.time_col = t
) |>
norm_dct_lobanov(
F1:F3,
.by = speaker,
.token_id_col = id,
.param_col = .param
)
#> Normalization info
#> • normalized with `tidynorm::norm_dct_lobanov()`
#> • normalized `F1`, `F2`, and `F3`
#> • normalized values in `F1_z`, `F2_z`, and `F3_z`
#> • token id column: `id`
#> • DCT parameter column: `.param`
#> • grouped by `speaker`
#> • within formant: TRUE
#> • (.formant - mean(.formant, na.rm = T))/sd(.formant, na.rm = T)
Nearey Normalization (Nearey 1978)
tidynorm functions: norm_nearey(), norm_track_nearey(), norm_dct_nearey()
Nearey Normalization first log transforms formant values, then subtracts the grand mean across all formants. If F_{ij} is the j^{th} token of the i^{th} formant, and \hat{F}_{ij} is its normalized value, then
\hat{F}_{ij} = \log(F_{ij}) - L
L = \frac{1}{MN}\sum_{i = 1}^M\sum_{j=1}^N \log(F_{ij})
Because the grand mean is taken across all formants, it is important to report whether just F1 and F2 were used, or F1, F2, and F3.
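A hand-rolled sketch of the same computation with dplyr, using a made-up two-formant toy dataset (assumed names, not the vignette's data), shows where the grand log-mean comes from:

```r
library(dplyr)

# toy stand-in data
speaker_data <- tibble(
  speaker = rep(c("s1", "s2"), each = 2),
  F1 = c(300, 700, 350, 750),
  F2 = c(1000, 2200, 1100, 2300)
)

# Nearey: log transform, then subtract the grand log-mean
# taken across *all* formants for the speaker
by_hand <- speaker_data |>
  mutate(
    L = mean(c(log(F1), log(F2)), na.rm = TRUE),
    F1_lm = log(F1) - L,
    F2_lm = log(F2) - L,
    .by = speaker
  )
```

Note that L pools F1 and F2 together, which is why the set of formants used matters for comparability.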
Using the Nearey normalization functions
On points
point_norm <- speaker_data |>
norm_nearey(
F1:F3,
.by = speaker
)
#> Normalization info
#> • normalized with `tidynorm::norm_nearey()`
#> • normalized `F1`, `F2`, and `F3`
#> • normalized values in `F1_lm`, `F2_lm`, and `F3_lm`
#> • grouped by `speaker`
#> • within formant: FALSE
#> • (.formant - mean(.formant, na.rm = T))/(1)
On tracks
track_norm <- speaker_tracks |>
norm_track_nearey(
F1:F3,
.by = speaker,
.token_id_col = id,
.time_col = t
)
#> Normalization info
#> • normalized with `tidynorm::norm_track_nearey()`
#> • normalized `F1`, `F2`, and `F3`
#> • normalized values in `F1_lm`, `F2_lm`, and `F3_lm`
#> • token id column: `id`
#> • time column: `t`
#> • grouped by `speaker`
#> • within formant: FALSE
#> • (.formant - mean(.formant, na.rm = T))/(1/sqrt(2))
On DCT Coefficients
dct_norm <- speaker_tracks |>
mutate(across(F1:F3, log)) |>
reframe_with_dct(
F1:F3,
.by = speaker,
.token_id_col = id,
.time_col = t
) |>
norm_dct_nearey(
F1:F3,
.by = speaker,
.token_id_col = id,
.param_col = .param
)
#> Normalization info
#> • normalized with `tidynorm::norm_dct_nearey()`
#> • normalized `F1`, `F2`, and `F3`
#> • normalized values in `F1_lm`, `F2_lm`, and `F3_lm`
#> • token id column: `id`
#> • DCT parameter column: `.param`
#> • grouped by `speaker`
#> • within formant: FALSE
#> • (.formant - mean(.formant, na.rm = T))/(1/sqrt(2))
Delta F (Johnson 2020)
tidynorm functions: norm_deltaF(), norm_track_deltaF(), norm_dct_deltaF()
The \Delta F normalization method is based on the average of formant spacing. If F_{ij} is the j^{th} token of the i^{th} formant, and \hat{F}_{ij} is its normalized value, then
\hat{F}_{ij} = \frac{F_{ij}}{S}
S = \frac{1}{MN} \sum_{i=1}^M\sum_{j=1}^N \frac{F_{ij}}{i-0.5}
Because this method takes a weighted average across all formants, it is important to report whether just F1 and F2 were used, or F1, F2, and F3.
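With three formants, the weights 1/(i - 0.5) work out to dividing F1, F2, and F3 by 0.5, 1.5, and 2.5 before averaging. A dplyr sketch on made-up toy data (assumed names) makes this explicit:

```r
library(dplyr)

# toy stand-in data
speaker_data <- tibble(
  speaker = rep("s1", 2),
  F1 = c(300, 700),
  F2 = c(1000, 2200),
  F3 = c(2500, 2800)
)

# Delta F: scale by the average formant spacing, dividing the
# i-th formant by (i - 0.5) before pooling
by_hand <- speaker_data |>
  mutate(
    S = mean(c(F1 / 0.5, F2 / 1.5, F3 / 2.5), na.rm = TRUE),
    F1_df = F1 / S,
    F2_df = F2 / S,
    F3_df = F3 / S,
    .by = speaker
  )
```

Since S is in Hz, the normalized values are dimensionless multiples of the speaker's average formant spacing.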
Using the Delta F normalization functions
On points
point_norm <- speaker_data |>
norm_deltaF(
F1:F3,
.by = speaker
)
#> Normalization info
#> • normalized with `tidynorm::norm_deltaF()`
#> • normalized `F1`, `F2`, and `F3`
#> • normalized values in `F1_df`, `F2_df`, and `F3_df`
#> • grouped by `speaker`
#> • within formant: FALSE
#> • (.formant - 0)/(mean(.formant/(.formant_num - 0.5), na.rm = T))
On tracks
track_norm <- speaker_tracks |>
norm_track_deltaF(
F1:F3,
.by = speaker,
.token_id_col = id,
.time_col = t
)
#> Normalization info
#> • normalized with `tidynorm::norm_track_deltaF()`
#> • normalized `F1`, `F2`, and `F3`
#> • normalized values in `F1_df`, `F2_df`, and `F3_df`
#> • token id column: `id`
#> • time column: `t`
#> • grouped by `speaker`
#> • within formant: FALSE
#> • (.formant - 0)/mean(.formant/(.formant_num - 0.5), na.rm = T)
On DCT coefficients
dct_norm <- speaker_tracks |>
reframe_with_dct(
F1:F3,
.by = speaker,
.token_id_col = id,
.time_col = t
) |>
norm_dct_deltaF(
F1:F3,
.by = speaker,
.token_id_col = id,
.param_col = .param
)
#> Normalization info
#> • normalized with `tidynorm::norm_dct_deltaF()`
#> • normalized `F1`, `F2`, and `F3`
#> • normalized values in `F1_df`, `F2_df`, and `F3_df`
#> • token id column: `id`
#> • DCT parameter column: `.param`
#> • grouped by `speaker`
#> • within formant: FALSE
#> • (.formant - 0)/mean(.formant/(.formant_num - 0.5), na.rm = T)
Watt & Fabricius (Watt and Fabricius 2002)
tidynorm functions: norm_wattfab(), norm_track_wattfab(), norm_dct_wattfab()
The Watt & Fabricius method attempts to center vowel spaces on their “center of gravity”. The original Watt & Fabricius method involved calculating average F1 and F2 values for point vowels. In tidynorm, a modified version has been implemented that just uses the average over F1 and F2 as the centers of gravity. If F_{ij} is the j^{th} token of the i^{th} formant, and \hat{F}_{ij} is its normalized value, then
\hat{F}_{ij} = \frac{F_{ij}}{S_i}
Where S_i is the mean across the i^{th} formant.
S_i = \frac{1}{N} \sum_{j = 1}^N F_{ij}
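The modified version reduces to dividing each formant by its speaker-level mean. A dplyr sketch on made-up toy data (assumed names):

```r
library(dplyr)

# toy stand-in data
speaker_data <- tibble(
  speaker = rep(c("s1", "s2"), each = 2),
  F1 = c(300, 700, 350, 750),
  F2 = c(1000, 2200, 1100, 2300)
)

# modified Watt & Fabricius: divide each formant by its
# speaker-level mean (the "center of gravity")
by_hand <- speaker_data |>
  mutate(
    F1_wf = F1 / mean(F1, na.rm = TRUE),
    F2_wf = F2 / mean(F2, na.rm = TRUE),
    .by = speaker
  )
```

Each normalized formant then averages to 1 within a speaker, so values read as proportions of that speaker's center of gravity.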
Using the Watt & Fabricius normalization functions
On points
point_norm <- speaker_data |>
norm_wattfab(
F1:F3,
.by = speaker
)
#> Normalization info
#> • normalized with `tidynorm::norm_wattfab()`
#> • normalized `F1`, `F2`, and `F3`
#> • normalized values in `F1_wf`, `F2_wf`, and `F3_wf`
#> • grouped by `speaker`
#> • within formant: TRUE
#> • (.formant - 0)/(mean(.formant, na.rm = T))
On tracks
track_norm <- speaker_tracks |>
norm_track_wattfab(
F1:F3,
.by = speaker,
.token_id_col = id,
.time_col = t
)
#> Normalization info
#> • normalized with `tidynorm::norm_track_wattfab()`
#> • normalized `F1`, `F2`, and `F3`
#> • normalized values in `F1_wf`, `F2_wf`, and `F3_wf`
#> • token id column: `id`
#> • time column: `t`
#> • grouped by `speaker`
#> • within formant: TRUE
#> • (.formant - 0)/mean(.formant, na.rm = T)
On DCT coefficients
dct_norm <- speaker_tracks |>
reframe_with_dct(
F1:F3,
.by = speaker,
.token_id_col = id,
.time_col = t
) |>
norm_dct_wattfab(
F1:F3,
.by = speaker,
.token_id_col = id,
.param_col = .param
)
#> Normalization info
#> • normalized with `tidynorm::norm_dct_wattfab()`
#> • normalized `F1`, `F2`, and `F3`
#> • normalized values in `F1_wf`, `F2_wf`, and `F3_wf`
#> • token id column: `id`
#> • DCT parameter column: `.param`
#> • grouped by `speaker`
#> • within formant: TRUE
#> • (.formant - 0)/mean(.formant, na.rm = T)
Bark Difference (Syrdal and Gopal 1986)
tidynorm functions: norm_barkz(), norm_track_barkz(), norm_dct_barkz()
The bark difference metric tries to normalize vowels on the basis of individual tokens. First, formant data is converted to bark (see hz_to_bark()), then F3 is subtracted from F1 and F2. If F_{ij} is the j^{th} token of the i^{th} formant, and \hat{F}_{ij} is its normalized value, then
\hat{F}_{ij} = \text{bark}(F_{ij}) - L_j
L_j = \text{bark}(F_{3j})
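Because the reference value bark(F3) comes from the same token, no speaker grouping is needed. A hand-rolled sketch on made-up toy data follows; the bark() helper here uses the Traunmüller (1990) conversion formula as a stand-in, and tidynorm's hz_to_bark() may differ in detail:

```r
library(dplyr)

# Traunmüller (1990) Hz-to-bark conversion, used here as a
# stand-in for tidynorm::hz_to_bark()
bark <- function(hz) 26.81 * hz / (1960 + hz) - 0.53

# toy stand-in data
speaker_data <- tibble(
  F1 = c(300, 700),
  F2 = c(1000, 2200),
  F3 = c(2500, 2800)
)

# bark difference: subtract bark(F3) token by token
by_hand <- speaker_data |>
  mutate(
    F1_bz = bark(F1) - bark(F3),
    F2_bz = bark(F2) - bark(F3)
  )
```

Since F3 is the highest of the three formants, the normalized F1 and F2 values are negative, with more negative meaning farther below F3 on the bark scale.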
Using the Bark Difference normalization functions
On points
point_norm <- speaker_data |>
norm_barkz(
F1:F3,
.by = speaker
)
#> Normalization info
#> • normalized with `tidynorm::norm_barkz()`
#> • normalized `F1`, `F2`, and `F3`
#> • normalized values in `F1_bz`, `F2_bz`, and `F3_bz`
#> • grouped by `speaker`
#> • within formant: FALSE
#> • within token: TRUE
#> • (.formant - .formant[3])/(1)
On tracks
track_norm <- speaker_tracks |>
norm_track_barkz(
F1:F3,
.by = speaker,
.token_id_col = id,
.time_col = t
)
#> Normalization info
#> • normalized with `tidynorm::norm_track_barkz()`
#> • normalized `F1`, `F2`, and `F3`
#> • normalized values in `F1_bz`, `F2_bz`, and `F3_bz`
#> • token id column: `id`
#> • time column: `t`
#> • grouped by `speaker`
#> • within formant: FALSE
#> • within token: TRUE
#> • (.formant - .formant[3])/(1/sqrt(2))
On DCT Coefficients
dct_norm <- speaker_tracks |>
mutate(
across(F1:F3, hz_to_bark)
) |>
reframe_with_dct(
F1:F3,
.by = speaker,
.token_id_col = id,
.time_col = t
) |>
norm_dct_barkz(
F1:F3,
.by = speaker,
.token_id_col = id,
.param_col = .param
)
#> Normalization info
#> • normalized with `tidynorm::norm_dct_barkz()`
#> • normalized `F1`, `F2`, and `F3`
#> • normalized values in `F1_bz`, `F2_bz`, and `F3_bz`
#> • token id column: `id`
#> • DCT parameter column: `.param`
#> • grouped by `speaker`
#> • within formant: FALSE
#> • within token: TRUE
#> • (.formant - .formant[3])/(1/sqrt(2))