The eVoweluate survey is a research project focusing on the reliability and reproducibility of manual vowel measurements. Its goals are twofold.

  1. Quantitatively evaluate the quality of automated vowel analysis, specifically that produced by the FAVE suite.
  2. Determine the rate of inter-annotator agreement amongst experts studying vowels.

To do this, it will recruit researchers experienced in vowel measurement to measure the same 180 vowels extracted from the Philadelphia Neighborhood Corpus. The more people who participate, the better we can all understand the process of vowel measurement and its potential for automation.


We are hoping to recruit participants with experience doing formant analysis of vowels. To participate, you will need to use Praat. We estimate that the survey will take between an hour and an hour and a half in total, but you can stop the survey (even quit Praat) and restart at any time.

You can begin participating simply by downloading the eVoweluate package, a compressed zip file. This file contains

  • A README file, providing more information about running the survey.
  • Directories containing wav files, which are snippets of sociolinguistic interviews extracted from the Philadelphia Neighborhood Corpus, and Praat TextGrids.
  • An empty directory called Results, which will store the results of the survey.
  • A Praat script called eVoweluate.Praat, which will run the actual experiment.
  • A simple tab-delimited text file listing the survey items, which is used by eVoweluate.Praat.

Upon completing the survey, you should compress the Results directory into a zip file, and e-mail it to Josef Fruehwald at . No information about your participation or your survey responses will be recorded until you submit the compressed Results file. We maintain no record of how many times eVoweluate has been downloaded, and this webpage has no web analytics installed. The consent form can be accessed by running the eVoweluate Praat script.
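
If you would rather not compress the Results directory by hand, the step can be scripted. The following is a minimal sketch in Python, not part of the eVoweluate package: the function name `zip_results` and its `package_dir` argument are assumptions made for illustration, and it assumes only that the unpacked package contains the Results directory described above.

```python
import shutil
from pathlib import Path


def zip_results(package_dir):
    """Compress <package_dir>/Results into <package_dir>/Results.zip.

    `package_dir` is the folder where the eVoweluate package was unpacked
    (a hypothetical location chosen by the participant).
    """
    package_dir = Path(package_dir).resolve()
    results = package_dir / "Results"
    if not results.is_dir():
        raise FileNotFoundError(f"No Results directory found in {package_dir}")

    # make_archive appends ".zip" to base_name and returns the archive path.
    # root_dir/base_dir keep the "Results/" prefix inside the archive.
    archive = shutil.make_archive(
        base_name=str(package_dir / "Results"),
        format="zip",
        root_dir=str(package_dir),
        base_dir="Results",
    )
    return Path(archive)
```

Calling `zip_results` with the path to your unpacked package returns the path of the new Results.zip, which is the file to attach to your e-mail.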


The FAVE (Forced Alignment and Vowel Extraction) suite has increased the volume of vowel measurement data linguists can extract, and does so in a replicable way. The quality of the data FAVE returns appears qualitatively to be high. The goal of this study is to quantify that quality.

We need to use a survey-based method to quantify FAVE's quality. One of the motivations for developing FAVE in the first place is that manual formant analysis is time-consuming, simply due to the number of decisions a researcher must make with every token, including

  • the time point within the vowel at which to make a measurement,
  • adjustments to the LPC algorithm in order to account for noise, including
    • the desired frequency range of the analysis,
    • the number of poles.

In many cases, these decisions must be revised from token to token. The consequence is that the formant estimates for any given vowel token may vary from coder to coder depending on the decisions they made for at least these three parameters. Thus, there is no "ground truth" or "gold standard" against which to compare FAVE output. Instead, FAVE will be compared against the collective measurements of experts.

Moreover, in order to demonstrate the usefulness of FAVE, we need only establish that it agrees as closely with experts as experts agree with each other. This introduces the second goal of the study: determining the rate of agreement on vowel formant estimation between experts. It is currently unknown how high or low this rate is. Determining it will be crucial for evaluating FAVE, and may also be crucial for comparing results from different researchers or research groups.
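
The logic of this comparison can be sketched in a few lines of Python. This is only an illustration, not the study's actual analysis: the function names, the choice of mean absolute difference in Hz as the agreement measure, and the toy F1 values are all assumptions made for the example.

```python
from itertools import combinations
from statistics import mean


def mean_abs_diff(a, b):
    """Mean absolute difference (Hz) between two measurement lists
    covering the same tokens in the same order."""
    if len(a) != len(b):
        raise ValueError("measurement lists must cover the same tokens")
    return mean(abs(x - y) for x, y in zip(a, b))


def expert_agreement(experts):
    """Average of mean_abs_diff over every pair of experts."""
    return mean(
        mean_abs_diff(experts[p], experts[q])
        for p, q in combinations(sorted(experts), 2)
    )


def tool_agreement(tool, experts):
    """Average of mean_abs_diff between the automated measurements
    and each expert in turn."""
    return mean(mean_abs_diff(tool, m) for m in experts.values())


# Hypothetical F1 measurements (Hz) of the same three tokens.
experts = {
    "coder_a": [500, 610, 720],
    "coder_b": [510, 600, 700],
    "coder_c": [490, 620, 710],
}
automated = [505, 605, 715]

print("expert vs. expert:", expert_agreement(experts))
print("automated vs. expert:", tool_agreement(automated, experts))
```

With these made-up numbers, the automated measurements differ from the experts by less than the experts differ from one another, which is the kind of outcome that would count as evidence for FAVE's usefulness under the criterion above.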

More Info

  • Josef Fruehwald:
    • Contact:
    • Academic Website
  • FAVE - Forced Alignment and Vowel Extraction: Website
  • FAVE toolkit on GitHub
  • FAVE Users' Group