|Title:||Phonological Variation and Lexical Frequency|
|Authors:||Andries W Coetzee|
|Comment:||'Short version' will appear in the proceedings of NELS 38.|
|Abstract:||Lexical usage frequency is known to influence the application rate of some variable processes. Specifically, variable lenition processes typically affect frequent lexical items more often than infrequent lexical items. For instance, variable t/d-deletion in English is more likely to apply to a frequent word (just) than to an infrequent word (bust). Existing grammatical models of phonological variation do well at accounting for the influence of grammatical factors on variation, but these models cannot account for the contribution of non-grammatical factors such as lexical frequency. In this paper, I propose a model of phonological variation that can simultaneously account for the influence of both these factors on variation. I assume an Optimality Theoretic grammar with lexically indexed faithfulness constraints. Variation arises as a result of variable lexical indexation: a single lexical item can be assigned to different lexical classes on different evaluation occasions, and will hence not always be evaluated by the same faithfulness constraints. Each lexical item is associated with a probability distribution function that determines the likelihood of it being assigned to each of the lexical classes. The shape of a lexical item's distribution function is determined by its usage frequency, so that frequency influences the likelihood of the lexical item being assigned to specific lexical classes, and hence the likelihood of it being evaluated by specific faithfulness constraints. I apply the proposed model to variable t/d-deletion in English, and show that it succeeds in accounting for the way in which usage frequency influences this process.
In two appendices, I show how the model can be implemented by interpreting the lexical distribution functions as instantiations of the beta distribution (Evans et al. 2000). This implementation of the model is then used to determine the expected t/d-deletion rates in a corpus of American English. I also use this implementation to model the acquisition path of a variable process, showing that the model predicts more variation during earlier acquisitional stages.
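The idea of variable lexical indexation driven by a beta distribution can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the implementation given in the paper's appendices: the function names, the mapping from frequency to shape parameters, the two-class setup with a single cutoff, and the frequency values are all hypothetical and chosen purely for illustration.

```python
import math
import random


def shape_parameters(frequency_per_million, reference=100.0, concentration=4.0):
    """Hypothetical mapping from usage frequency to Beta(alpha, beta) shapes.

    Assumption: the mean of the distribution rises with log frequency, so
    frequent words have their probability mass shifted toward 1.
    """
    log_ratio = math.log(frequency_per_million / reference)
    mean = 1.0 / (1.0 + math.exp(-log_ratio))
    return concentration * mean, concentration * (1.0 - mean)


def deletion_rate(frequency_per_million, cutoff=0.5, trials=20000, seed=0):
    """Estimate how often a word is evaluated by the low-ranked faithfulness class.

    Assumption: there are two lexical classes. On each evaluation occasion a
    value is drawn from the word's beta distribution; a draw above `cutoff`
    assigns the word to the class whose faithfulness constraint is ranked
    low enough for t/d-deletion to apply.
    """
    rng = random.Random(seed)
    alpha, beta = shape_parameters(frequency_per_million)
    deleted = 0
    for _ in range(trials):
        if rng.betavariate(alpha, beta) > cutoff:
            deleted += 1
    return deleted / trials


# Made-up frequencies standing in for a frequent and an infrequent word.
frequent_rate = deletion_rate(1000.0)
infrequent_rate = deletion_rate(10.0)
print(f"frequent word:   {frequent_rate:.3f}")
print(f"infrequent word: {infrequent_rate:.3f}")
```

Under these assumptions the simulated deletion rate is higher for the frequent word than for the infrequent one, which is the qualitative pattern the abstract describes. The actual parameter estimates and class structure would have to come from the paper itself.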