
Generated from my Zotero publications.


aligned-textgrid: Lightweight access to structured phonetic data

Josef Fruehwald, Christian Brickhouse
Poster presented at SCiL 2024

aligned-textgrid: Lightweight access to structured phonetic data

Josef Fruehwald, Christian Brickhouse
Proceedings of the Society for Computation in Linguistics (SCiL) 2024
The goal of aligned-textgrid is to provide lightweight, scriptable access to the structured data produced by forced aligners. The library is written in Python and is currently available on the Python Package Index.



Josef Fruehwald, Santiago Barreda
This is a Python implementation of the FastTrack method.


Josef Fruehwald, Christian Brickhouse
The aligned-textgrid package provides a Python interface for representing and operating on TextGrids produced by forced aligners like FAVE or the Montreal Forced Aligner. Classes provided by aligned-textgrid represent hierarchical and precedence relationships among data stored in TextGrid formats, allowing for simplified and more accessible analysis of aligned speech data.


Josef Fruehwald
The idea behind fave-recode is that no matter how much you may adjust the dictionary of a forced-aligner, you may still want to make programmatic changes to the output.

densityarea: Polygons of Bivariate Density Distributions

Josef Fruehwald
Areas of Bivariate Density Distributions


Frequency and morphological complexity in variation

Ruaridh Purse, Josef Fruehwald, Meredith Tamminga
Glossa: a journal of general linguistics
Broad interest in probabilistic aspects of language has reignited debates about a potential delineation between the shape of an abstract grammar and patterns of language in use. A central topic in this debate is the relationship between measures capturing aspects of language use, such as word frequency, and patterns of variation. While it has become common practice to attend to frequency measures in studies of linguistic variation, fundamental questions about exactly what linguistic unit’s frequency it is appropriate to measure in each case, and what this implies about the representations or processing mechanisms at play, remain underexplored. In the present study, we compare how three frequency measures account for variance in Coronal Stop Deletion (CSD) based on large-scale corpus data from Philadelphia English: whole-word frequency, stem frequency, and conditional (whole-word/stem) frequency. While there is an effect of all three measures on CSD outcomes in monomorphemes, the effect of conditional frequency is by far the most robust. Furthermore, only conditional frequency has an effect on CSD rates in -ed suffixed words. Thus, we suggest that frequency effects in CSD are best interpreted in terms of stem-conditional predictability of a suffix or word-edge. These results lend support to the importance of asking these fundamental questions about usage measures, and suggest that contemporary approaches to frequency should take morphological complexity into account.
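As a rough illustration of the three measures compared in this abstract, the sketch below computes whole-word, stem, and conditional (whole-word/stem) frequency from toy counts. The function name and the counts are hypothetical, and the study's actual corpus counts and any transformation applied to them (such as log scaling) are not reproduced here.

```python
def frequency_measures(whole_word_count, stem_count):
    """Return the three frequency measures for one word form.

    whole_word_count: occurrences of the exact word form (hypothetical input)
    stem_count: occurrences of all forms sharing the stem (hypothetical input)
    """
    if stem_count <= 0:
        raise ValueError("stem_count must be positive")
    if whole_word_count > stem_count:
        raise ValueError("a word form cannot be more frequent than its stem")
    return {
        "whole_word": whole_word_count,
        "stem": stem_count,
        # Conditional frequency: share of the stem's tokens that are this form.
        "conditional": whole_word_count / stem_count,
    }


# Toy example: a form that makes up a quarter of its stem's tokens.
measures = frequency_measures(whole_word_count=25, stem_count=100)
print(measures["conditional"])
```

On this reading, a high conditional value means the word form is highly predictable given its stem.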

The study of variation

Josef Fruehwald
The Oxford History of Phonology
The study and formalization of intra-speaker variation within variationist sociolinguistics has followed a largely parallel history with generative phonology, always borrowing heavily from the generative theories of the day. More recently, structured probabilistic variation has become enshrined as a fact-to-be-explained by any theory of human sound systems in more mainstream phonology. This chapter outlines this parallel history of variation study from its origins in dialectology, the evolution of modern variationist sociolinguistics, and the development of more contemporary variation-focused phonological theory, as well as critiques that have been posed over this history. The chapter reviews in considerable detail how the original notion of ‘variable rule’ was elaborated and complexified, and how variation is treated in constraint-based approaches. It concludes with a look towards the future of variation study that is incorporating more insights from psycholinguistics.

FAVE (Forced Alignment and Vowel Extraction) Program Suite v2.0.0

Ingrid Rosenfelder, Josef Fruehwald, Christian Brickhouse, Keelan Evanini, Scott Seyfarth, Kyle Gorman, Hilary Prichard, Jiahong Yuan


Crosslinguistic perceptions of /s/ among English, French, and German listeners

Zac Boyd, Josef Fruehwald, Lauren Hall-Lew
Language Variation and Change
This study reports the results of a crosslinguistic matched guise test examining /s/ and pitch variation in judgments of sexual orientation and nonnormative masculinity among English, French, and German listeners. Listeners responded to /s/ and pitch manipulations in native and other language stimuli (English, French, German, and Estonian). All listener groups rate higher pitch guises as more gay- and effeminate-sounding than lower pitch guises. However, only English listeners hear [s+] guises as more gay- and effeminate-sounding than [s] or [s−] guises for all stimuli languages. French and German listeners do not hear [s+] guises as more gay- or effeminate-sounding in any stimulus language, despite this feature’s presence in native speech production. English listener results show evidence of indexical transfer, when indexical knowledge is applied to the perception of unknown languages. French and German listener results show how the enregistered status of /s/ variation affects perception, despite crosslinguistic similarities in production.


syllabifyr: v0.1.1

Josef Fruehwald
The goal of `syllabifyr` is to provide tidy syllabification of phonetic transcriptions. So far, only CMU dict transcriptions are supported.

Toward “English” Phonetics: Variability in the Pre-consonantal Voicing Effect Across English Dialects and Speakers

James Tanner, Morgan Sonderegger, Jane Stuart-Smith, Josef Fruehwald
Frontiers in Artificial Intelligence
Recent advances in access to spoken-language corpora and development of speech processing tools have made possible the performance of “large-scale” phonetic and sociolinguistic research. This study illustrates the usefulness of such a large-scale approach (using data from multiple corpora across a range of English dialects, collected and analyzed with the SPADE project) to examine how the pre-consonantal Voicing Effect (longer vowels before voiced than voiceless obstruents, in e.g., bead vs. beat) is realized in spontaneous speech, and varies across dialects and individual speakers. Compared with previous reports of controlled laboratory speech, the Voicing Effect was found to be substantially smaller in spontaneous speech, but still influenced by the expected range of phonetic factors. Dialects of English differed substantially from each other in the size of the Voicing Effect, whilst individual speakers varied little relative to their particular dialect. This study demonstrates the value of large-scale phonetic research as a means of developing our understanding of the structure of speech variability, and illustrates how large-scale studies, such as those carried out within SPADE, can be applied to other questions in phonetic and sociolinguistic research.


Using the Tolerance Principle to predict phonological change

Betsy Sneller, Josef Fruehwald, Charles Yang
Language Variation and Change
Language acquisition is a well-established avenue for language change (Labov, 2007). Given the theoretical importance of language acquisition to language change, it is all the more important to formulate clear theories of transmission-based change. In this paper, we provide a simulation method designed to test the plausibility of different possible transmission-based changes, using the Tolerance Principle (Yang, 2016) to determine precise points at which different possible changes may become plausible for children acquiring language. We apply this method to a case study of a complex change currently in progress: the allophonic restructuring of /æ/ in Philadelphia English. Using this model, we are able to evaluate several competing explanations of the ongoing change and determine that the allophonic restructuring of /æ/ in Philadelphia English is most likely the result of children acquiring language from mixed dialect input, consisting of approximately 40% input from speakers with a nasal /æ/ split. We show that applying our simulation to a phonological change allows us to make precise quantitative predictions about the progress of this change. Moreover, it forces us to reassess intuitively plausible hypotheses about language change, such as grammatical simplification, in a quantitative and independently motivated framework of acquisition.
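For readers unfamiliar with the Tolerance Principle, the sketch below shows its core threshold as formulated in Yang (2016): a rule over N relevant items is taken to be productive when its exceptions number at most N / ln N. The function names and the example counts are hypothetical, and this is not the simulation method used in the paper.

```python
import math


def tolerance_threshold(n_items):
    """Maximum number of exceptions a rule over n_items can tolerate (N / ln N)."""
    if n_items < 2:
        raise ValueError("the threshold is only defined here for n_items >= 2")
    return n_items / math.log(n_items)


def is_productive(n_items, n_exceptions):
    """True if the exceptions do not exceed the tolerance threshold."""
    return n_exceptions <= tolerance_threshold(n_items)


# Toy example: a rule covering 100 items tolerates roughly 21.7 exceptions.
print(round(tolerance_threshold(100), 2))
print(is_productive(100, 21), is_productive(100, 22))
```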

Is phonetic target uniformity phonologically, or sociolinguistically grounded?

Josef Fruehwald
Proceedings of the 19th International Congress of Phonetic Sciences, Melbourne, Australia 2019
In this paper, I investigate to what degree phonetic uniformity in diachronic vowel shifts can be accounted for in terms of a shared phonetic implementation rule of phonological features [6, 10], versus a shared social evaluation of the phonetic realizations [19]. I take a particular focus on the parallel fronting and subsequent retraction of the GOOSE, GOAT and MOUTH vowels, as well as the raising of the preconsonantal FACE and pre-voiceless PRICE vowels in Philadelphia, drawing data from the Philadelphia Neighborhood Corpus [15]. Using generalized additive models [21] I fit models for these vowels accounting for gender, date of birth, educational attainment, and vowel duration using tensor product smooths. Looking at the correlation of the by-speaker random intercepts, back vowel fronting appears to be highly correlated, thus likely phonologically grounded, while FACE and PRICE raising is not, thus likely socially grounded.

Age vectors vs. axes of intraspeaker variation in vowel formants measured automatically from several English speech corpora.

Jeff Mielke, Erik R Thomas, Josef Fruehwald, Michael McAuliffe, Morgan Sonderegger, Jane Stuart-Smith, Robin Dodsworth
Proceedings of the 19th International Congress of Phonetic Sciences, Melbourne, Australia 2019
To test the hypothesis that intraspeaker variation in vowel formants is related to the direction of diachronic change, we compare the direction of change in apparent time with the axis of intraspeaker variation in F1 and F2 for vowel phonemes in several corpora of North American and Scottish English. These vowels were measured automatically with a scheme (tested on hand-measured vowels) that considers the frequency, bandwidth, and amplitude of the first three formants in reference to a prototype. In the corpus data, we find that the axis of intraspeaker variation is typically aligned vertically, presumably corresponding to the degree of jaw opening for individual tokens, but for the North American GOOSE vowel, the axis of intraspeaker variation is aligned with the (horizontal) axis of diachronic change for this vowel across North America. This may help to explain why fronting and unrounding of high back vowels are common shifts across languages.


Response to Berkson, Davis, & Strickler, ‘What does incipient /ay/-raising look like?’

Josef Fruehwald
Berkson, Davis, and Strickler (2017) provide an invaluable report on incipient /ay/-raising in Fort Wayne, Indiana. Their data suggest that /ay/-raising conditioned strictly by phonetic voicelessness is a possible early stage in the development of /ay/-raising. This raises a particularly vexing question of why /ay/-raising has gone on to be conditioned by phonological voicing in all North American varieties for which its interaction with /t, d/ flapping has been examined. It suggests that the process of phonologization reorganizes the distribution of phonetic variants, rather than simply discretizing phonetic precursors.

Generations, lifespans, and the zeitgeist

Josef Fruehwald
Language Variation and Change
This paper is equal parts methodological recommendation and empirical investigation of the time dimensions of linguistic change. It is increasingly common in the sociolinguistic literature for researchers to utilize speech data that was collected over the course of many decades. These kinds of datasets contain three different time dimensions that researchers can utilize to investigate language change: (i) the speakers' dates of birth, (ii) the speakers' ages at the time of the recording, and (iii) the date of the recording. Proper investigation of all three time dimensions is crucial for a theoretical understanding of the dynamics of language change. I recommend utilizing two-dimensional tensor product smooths, fit over speakers' date of birth and the year of the recording, to analyze the contribution of these three time dimensions to linguistic changes. I apply this method to five language changes, based on data drawn from the Philadelphia Neighborhood Corpus. I find relatively weak evidence for lifespan effects in these changes, robust generational effects, and in one case, evidence of a zeitgeist effect.


The early influence of phonology on a phonetic change

Josef Fruehwald
The conventional wisdom regarding the diachronic process whereby phonetic phenomena become phonologized appears to be the ‘error accumulation’ model, so called by Baker, Archangeli, and Mielke (2011). Under this model, biases in the phonetic context result in production or perception errors, which are misapprehended by listeners as target productions, and over time accumulate into new target productions. In this article, I explore the predictions of the hypocorrection model for one phonetic change (prevoiceless /ay/-raising) in detail. I argue that properties of the phonetic context underpredict and mischaracterize the contextual conditioning on this phonetic change. Rather, it appears that categorical, phonological conditioning is present from the very onset of this change.

Filled Pause Choice as a Sociolinguistic Variable

Josef Fruehwald
U. Penn Working Papers in Linguistics
In this paper, I argue that filled pause selection (um/uh) is a sociolinguistic variable, conditioned by both internal and external factors. There appears to be a language change in progress towards selecting um more often than uh. In all respects, the (UHM) variable appears to pattern quantitatively just like all other sociolinguistic variables which have been examined, even though the locus of (UHM) variation would seem to be firmly in the speech planning domain. Combined with the quantitative systematicity of sociolinguistic variables across the full range of linguistic modules, I argue that the locus of variation may not be in the grammar, but rather constitutes a separate domain of knowledge, perhaps what Preston (2004) called the “sociocultural selection device.”

Variation and Change in the Use of Hesitation Markers in Germanic Languages

Martijn Wieling, Jack Grieve, Gosse Bouma, Josef Fruehwald, John Coleman, M. Liberman
Language Dynamics and Change
In this study, we investigate crosslinguistic patterns in the alternation between um, a hesitation marker consisting of a neutral vowel followed by a final labial nasal, and uh, a hesitation marker consisting of a neutral vowel in an open syllable. Based on a quantitative analysis of a range of spoken and written corpora, we identify clear and consistent patterns of change in the use of these forms in various Germanic languages (English, Dutch, German, Norwegian, Danish, Faroese) and dialects (American English, British English), with the use of um increasing over time relative to the use of uh. We also find that this pattern of change is generally led by women and more educated speakers. Finally, we propose a series of possible explanations for this surprising change in hesitation marker usage that is currently taking place across Germanic languages.


I’m done my homework—Case assignment in a stative passive

Josef Fruehwald, Neil Myler
Linguistic Variation
We present an analysis of an understudied construction found in Philadelphian and Canadian English, and also in certain Vermont varieties. In this construction, the participle of certain verbs can appear along with a form of the verb be and a DP complement, producing strings like I’m done my homework, I’m finished my fries, and (in Vermont) I’m started the project. We show that the participle in the construction is an adjectival passive, not a perfect construction. We further argue that the internal argument DP in the construction is receiving Case from the adjectival head a, similar to what happens in all English dialects with the adjective worth, and that the internal argument is interpreted via a mechanism of complement coercion. The microparametric variation we find across English dialects with respect to the availability of this construction is accounted for by variation in the selectional restrictions on the a head.

FAVE (Forced Alignment and Vowel Extraction) 1.2.2

Ingrid Rosenfelder, Josef Fruehwald, Keelan Evanini, Scott Seyfarth, Kyle Gorman, Hilary Prichard, Jiahong Yuan


The Phonological Influence on Phonetic Change

Josef Fruehwald
This dissertation addresses the broad question about how phonology and phonetics are interrelated, specifically how phonetic language changes, which gradually alter the phonetics of speech sounds, affect the phonological system of the language, and vice versa. Some questions I address are: (i) What aspects of speakers’ knowledge of their language are changing during a phonetic change? (ii) What is the relative timing of a phonetic change and phonological reanalysis? (iii) Can a modular feed-forward model of phonology and phonetics account for the observed patterns of phonetic change? (iv) What are the consequences of my results for theories of phonology, phonetics, and language acquisition? (v) What unique insight into the answers to these questions can the study of language change in progress give us over other methodologies? To address these questions, I drew data from the Philadelphia Neighborhood Corpus [PNC] (Labov and Rosenfelder, 2011), a collection of sociolinguistic interviews carried out between 1973 and 2013. Using the PNC data, I utilized a number of different statistical modeling techniques to evaluate models of phonetic change and phonologization, including standard mixed effects regression modeling in R (Bates, 2006), and hierarchical Bayesian modeling via Hamiltonian Monte Carlo in Stan (Stan Development Team, 2012). My results are challenging to the conventional wisdom that phonologization is a late-stage reanalysis of phonetic coarticulatory and perceptual effects (e.g. Ohala, 1981). Rather, it appears that phonologization occurs simultaneously with the onset of phonetic changes. I arrive at this conclusion by examining the rate of change of contextual vowel variants, and by investigating mismatches between which variants are expected to change on phonetic grounds versus phonological grounds.
In my analysis, not only can a modular feed-forward model of phonology and phonetics account for observed patterns of phonetic change, but it must be appealed to in some cases. These results revise some of the facts to be explained by diachronic phonology, and I suggest the question to be pursued ought to be how phonological innovations happen when there are relatively small phonetic precursors.

Phonological Rule Change: The Constant Rate Effect

Josef Fruehwald, Jonathan Gress Wright, Joel Wallenberg
The proceedings of the North-Eastern Linguistic Society (NELS)
The detailed quantitative study of language change, as found in studies such as Labov (1994) and Kroch (1989), has raised two central questions for linguistic theory. The first is an issue in the theory of language change itself, namely: do changes in different components of the grammar progress in the same way? The second question addresses the relationship between the study of change and the development of synchronic linguistic theory: can quantitative, diachronic data help to choose between alternative analyses of synchronic facts? This paper addresses both of these questions with the case study of the loss of word-final stop fortition (frequently termed "devoicing") in the history of German, and concludes that the answer to both questions above is "yes".

One hundred years of sound change in Philadelphia: Linear Incrementation, Reversal, and Reanalysis

William Labov, Ingrid Rosenfelder, Josef Fruehwald
The study of sound change in progress in Philadelphia has been facilitated by the application of forced alignment and automatic vowel measurement to a large corpus of neighborhood studies, including 379 speakers with dates of birth from 1888 to 1991. Two of the sound changes active in the 1970s show a linear pattern of incrementation in succeeding decades. The fronting of back upgliding vowels /aw/ and /ow/ shows a reversal in the direction of change, beginning with those born after 1940. The study also finds a general withdrawal from two salient features of local phonology, tense /æh/ and /oh/, led by those with higher education. Younger speakers with higher education have also reorganized the traditional Philadelphia tense/lax split of short-a to form a nasal system with tensing before all and only nasal consonants. The development of the Philadelphia vowel system can be understood in the geographic context of neighboring dialects. Features in common with North and North Midland dialects have accelerated in use while features in common with South Midland and Southern dialects have been reversed in favor of Northern patterns. The microevolution of a linguistic system can be seen here as subject to phonological generalizations but driven by social evaluation as features rise in level of salience for members of the speech community.


Redevelopment of a Morphological Class

Josef Fruehwald
Penn Working Papers in Linguistics
Coronal stop deletion (or "TD Deletion") is the paradigm sociolinguistic variable. It was first described in African American English (Labov et al., 1968) as a rule whereby word final /Ct/ and /Cd/ clusters simplify by deleting the coronal stop. It has since been found in many dialects and varieties of English. Aside from the very regular phonological and phonetic factors which condition whether TD Deletion applies, morphological structure also appears to have an effect. The three morphological categories of primary interest are (i) monomorphemes, (ii) regular past tense verbs and (iii) semiweak past tense verbs. In almost every dialect studied, the order of morphological classes from least favoring deletion to most favoring deletion is as given in (1). (1) monomorphemes > semiweak > regular past tense. In this paper, I will be focusing on the difference between semiweak and regular past tense. I will pursue a revised version of the analysis in Guy & Boyd (1990), casting it in terms of Competing Grammars and Distributed Morphology. Specifically, I will propose that the rate of phonological TD Deletion is the same for the regular past and the semiweak. What leads to higher TD Absence in the semiweak verbs is variable morphological absence of /t/, i.e., there is a competing morphological analysis where the past tense of keep is simply "kep", instead of "kept".


Cross-derivational feeding is epiphenomenal

Josef Fruehwald, Kyle Gorman
Studies in the Linguistic Sciences: Illinois Working Papers
Baković (2005) proposes that patterns of sufficiently-similar segment avoidance are the result of interacting agreement and antigemination constraints, a pattern known as cross-derivational feeding (CDF). The bleeding interactions between epenthesis and assimilation which prevent adjacent sufficiently-similar segments in English are shown to follow, however, from extragrammatical considerations. Several case studies provide evidence against the major predictions of CDF.

FAVE (Forced Alignment and Vowel Extraction) Program Suite.

Ingrid Rosenfelder, Josef Fruehwald, Keelan Evanini, Jiahong Yuan


The Spread of Raising: Opacity, Lexicalization, and Diffusion

Josef Fruehwald
Penn Working Papers in Linguistics
The centralization of the low upgliding diphthong (typically called Canadian Raising, here just Raising) is frequently cited as an example of phonological opacity. Conditioned by a following voiceless segment, Raising continues to apply when an underlying unstressed /t/ is flapped on the surface. Dialects which have both Raising and Flapping, then, maintain the distinction between "writer" and "rider" in the quality of the vowel, rather than the voicing of the stop. Exceptions to the simplest formulation of Raising have been reported on in the past. Underapplication of Raising in pre-voiceless environments can possibly be accounted for by prosodic structure (Chambers, 1973, 1989; Jensen, 2000; Vance, 1987). However, a few reports from the Inland North (Vance, 1987; Dailey-O'Cain, 1997) and Canada (Hall, 2005) suggest that the regularity of Raising's conditioning has deteriorated, allowing raised nuclei before underlyingly voiced segments. The distribution of these raised variants is unpredictable within a speaker's phonology, but stable for given words, suggesting that Raising has lexicalized, and is undergoing diffusion to new environments. This paper focuses on the phonological status of Raising in Philadelphia. Raising was identified as an incipient sound change in progress in the LCV study of the 1970s, and has been revisited for study in connection with its masculine association (Labov, 2001; Conn, 2005; Wagner, 2007). After examining data from 12 boys, ages 14 through 19, it appears that Raising has lexicalized here as well. [^y] frequently appears before underlyingly voiced stops, as well as before nasals, but not in a phonologically predictable manner. Certain words seem to be selected for consistent overapplication, however. "Spider" and "cider" are lexical items with raised nuclei for which there is broad agreement between speakers.
However, there are also a number of lexical items which show more interspeaker variation, such as "tiny", produced variably as [tayni] or [t^yni]. Importantly, across all of the data, the effect of the lexical item on overapplication of Raising is stronger and more significant than the effect of surrounding phonological environment.


The Spread of Raising: Opacity, lexicalization, and diffusion

Josef Fruehwald
College Undergraduate Research Electronic Journal
Canadian Raising is typically described as the centralization of the nucleus of /ay/ before voiceless segments. However, some recent studies in areas affected by Raising have shown that the current conditioning factors are not as regular as reported previously (Vance, 1987; Dailey-O’Cain, 1997; Hall, 2005). This paper explores the status of Raising in Philadelphia. Examining data from 12 boys, ages 14 to 19, it appears that Raising has lexicalized here as well. While Raising occurs before a number of voiced stops and nasals, the words which experience Raising most regularly suggest that it has spread due to its opaque applications.