Philadelphia Language in Motion

--Josef Fruehwald

At NWAV 40 at Georgetown University, Bill Labov's plenary address featured this motion diagram of sound changes in Philadelphia.

Data: pred2 • Chart ID: MotionChart_2011-07-18-10-30-34
R version 2.11.1 (2010-05-31)

Labov highlighted the extreme raising of /ay/ before voiceless segments (labeled ay0 in the graph) and of /ey/; the initially stable, then lowering, /aeh/ and /oh/; and finally the reversal of /aw/ raising among speakers born after 1960.


This graph, by necessity, does not represent raw data. At some point, I may produce one based on mean values per date of birth, but as of now, a graph like that still contains too much noise to be very useful. Instead, F1 and F2 for each vowel are smoothed over date of birth using local regression (loess() in R). My own brief experimentation with other smoothing techniques (like cubic regression splines) didn't produce results qualitatively different from the loess() smooths. However, if anyone can think of a more principled or interesting smoothing method for the purpose of this diagram, please let me know.
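To make the smoothing step concrete, here is a minimal sketch of fitting one loess() smooth per vowel and per formant, then predicting a value for every year of birth. It is not the script behind the chart: the data are simulated, and the names vowels, smooth_vowel, and loess_preds, along with the default loess() span, are assumptions for illustration only.

```r
# Simulated stand-in for per-token vowel measurements (NOT the PNC data).
set.seed(1)
n <- 400
vowels <- data.frame(
  vowel = rep(c("ay0", "aw"), each = n / 2),
  dob   = sample(1890:1990, n, replace = TRUE)
)
# Hypothetical trend: F1 falls with date of birth for ay0 and stays flat for aw.
vowels$F1 <- ifelse(vowels$vowel == "ay0",
                    800 - 2 * (vowels$dob - 1890),
                    750) + rnorm(n, sd = 40)
vowels$F2 <- 1500 + rnorm(n, sd = 60)

# Fit one loess smooth per formant and predict on every year of birth
# within the observed range (loess does not extrapolate by default).
smooth_vowel <- function(d, years = seq(min(d$dob), max(d$dob))) {
  grid <- data.frame(dob = years)
  data.frame(
    vowel = d$vowel[1],
    dob   = years,
    F1    = predict(loess(F1 ~ dob, data = d), newdata = grid),
    F2    = predict(loess(F2 ~ dob, data = d), newdata = grid)
  )
}

# One block of smoothed rows per vowel, stacked into a single data frame.
loess_preds <- do.call(rbind, lapply(split(vowels, vowels$vowel), smooth_vowel))
```

The result has one row per vowel per year of birth, which is the shape a motion chart needs: an identifier, a time variable, and the two plotted dimensions.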

As you can see from the date on the motion graph above, it was created in July 2011. Development of the Philadelphia Neighborhood Corpus has continued since then, and will continue. So in this space, I will maintain a version of the motion graph based on the most up-to-date data.

Data: loess.preds.c • Chart ID: MotionChartID19ea9fc3
R version 2.13.2 (2011-09-30) • googleVis-0.2.10

The Code

This is the R code involved in going from raw measurements to the motion chart.
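As a rough illustration of the final charting step only, and not the original script, the sketch below passes a small table of smoothed values to gvisMotionChart() from googleVis. The data frame smoothed and its values are made up. The xvar and yvar arguments are taken from more recent googleVis releases and may not match version 0.2.10. Motion charts depend on Flash, so the chart may no longer render in current browsers, and the call is guarded in case the package or function is unavailable.

```r
# Hypothetical smoothed values: one row per vowel per year of birth.
smoothed <- data.frame(
  vowel = rep(c("ay0", "aw"), each = 3),
  dob   = rep(c(1900, 1950, 1990), times = 2),
  F1    = c(780, 700, 620, 760, 720, 740),
  F2    = c(1450, 1500, 1560, 1700, 1850, 1800)
)

# Only build the chart if googleVis and gvisMotionChart() are available.
has_motion <- requireNamespace("googleVis", quietly = TRUE) &&
  exists("gvisMotionChart", envir = asNamespace("googleVis"))

chart <- NULL
if (has_motion) {
  # idvar identifies each bubble; timevar drives the animation.
  chart <- googleVis::gvisMotionChart(
    smoothed,
    idvar   = "vowel",
    timevar = "dob",
    xvar    = "F2",
    yvar    = "F1"
  )
  # plot(chart) would open the generated HTML page in a browser.
}
```

Each combination of idvar and timevar has to be unique, which is why the smoothed predictions are arranged as one row per vowel per date of birth.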


The original development on these motion graphs was done by Hans Rosling's Gapminder organization, whose Trendalyzer software was acquired by Google in 2007. Google continued its development as the motion chart in its visualization tools. The googleVis R package, which I used, provides an R interface to those charts.

The Gapminder motion graphs were first publicized in a 2006 TED talk, titled on the TED website "Hans Rosling shows the best stats you've ever seen", providing clear motivation for everyone else to also produce the best stats you've ever seen.

Language change data, and vowel shifts in particular, are especially well visualized in a motion diagram. Take, for instance, this figure from Principles of Linguistic Change, Volume 2 (Labov, 2001), which is essentially a static version of the diagrams above.

Future Directions

While the Google Motion diagrams are an excellent off-the-shelf visualization product, I'm not completely satisfied with them. They are rather inflexible in their aesthetics, and in the ways in which data can be presented.

Some desiderata I have for the ultimate visualization of vowel shifts in Philadelphia are:

  • Fully customizable aesthetics.
  • Interactive playback.
  • Online & interactive definition of subsets (e.g. faceting by Sex, Education, Ethnicity etc.).
  • Flexible and clear inclusion of more dimensions (e.g. glide target).

Achieving these goals will probably involve programming my own visualization in something like Processing ... which will happen at some unspecified date in the future.

References & Resources

  • Philadelphia Neighborhood Corpus of LING560 Studies, 1972-2010, with support of NSF contract 921643 to W. Labov. [PNC]
  • I. Rosenfelder, J. Fruehwald, K. Evanini and J. Yuan. 2011. FAVE Program Suite [Forced alignment and vowel extraction].
  • googleVis Package for R (CRAN, Google Code)