R O A

858-0806 
A Maximum Entropy Model of Phonotactics and Phonotactic Learning
Authors 
Bruce Hayes, UCLA <bhayes@humnet.ucla.edu>
Colin Wilson <colin@humnet.ucla.edu>
Comment 
Ms., Department of Linguistics, UCLA
Length 
55 pp.
Files 
 PDF 407kb
Abstract 


The study of phonotactics (e.g., the ability of English speakers to distinguish possible words like 'blick' from impossible words like *'bnick') is a central topic in phonology. We propose a theory of phonotactic grammars and a learning algorithm that constructs such grammars from positive evidence.

Our grammars consist of constraints that are assigned numerical weights according to the principle of maximum entropy. Possible words are assessed by these grammars based on the weighted sum of their constraint violations. The learning algorithm is robust against errors in the training data and yields grammars that can capture both categorical and gradient phonotactic patterns. The algorithm is not provided with any constraints in advance, but uses its own resources to form constraints and weight them. A baseline model, in which Universal Grammar is reduced to a feature set and an SPE-style constraint format, suffices to learn many phonotactic phenomena. In order to learn nonlocal phenomena such as stress and vowel harmony, it is necessary to augment the model with autosegmental tiers and metrical grids. Our results thus offer novel, learning-theoretic support for such representations.

We apply the model to English syllable onsets, Shona vowel harmony, quantity-insensitive stress typology, and the full phonotactics of Wargamay, showing that the learned grammars capture the distributional generalizations of these languages and accurately predict experimental findings.
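
The scoring scheme described in the abstract (a word is assessed by the weighted sum of its constraint violations) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the two constraints, their weights, and the function names are hypothetical and are not taken from the manuscript. The sketch also assumes the usual maximum entropy form, in which a word's value is the exponential of its negated penalty, and it normalizes over a small candidate list only for brevity.

```python
import math

VOWELS = set("aeiou")

# Hypothetical constraints: each maps a word (spelled as a string)
# to the number of times the word violates it.
def star_initial_bn(word):
    # Illustrative constraint against a word-initial "bn" cluster.
    return 1 if word.startswith("bn") else 0

def star_initial_cluster(word):
    # Illustrative constraint against any word-initial two-consonant cluster.
    if len(word) >= 2 and word[0] not in VOWELS and word[1] not in VOWELS:
        return 1
    return 0

# Hypothetical weights; a real grammar would learn these from data.
WEIGHTED_CONSTRAINTS = [
    (star_initial_bn, 5.0),
    (star_initial_cluster, 0.5),
]

def penalty(word):
    # Weighted sum of constraint violations.
    return sum(weight * constraint(word)
               for constraint, weight in WEIGHTED_CONSTRAINTS)

def maxent_value(word):
    # Assumed maxent form: higher penalty gives an exponentially lower value.
    return math.exp(-penalty(word))

def probabilities(candidates):
    # Normalize over a finite candidate list (a simplification for this sketch).
    values = {word: maxent_value(word) for word in candidates}
    total = sum(values.values())
    return {word: value / total for word, value in values.items()}

if __name__ == "__main__":
    for word, p in probabilities(["lick", "blick", "bnick"]).items():
        print(f"{word}: penalty={penalty(word):.1f}, p={p:.4f}")
```

Under these made-up weights, 'blick' violates only the general cluster constraint, while 'bnick' violates both, so 'bnick' receives a much lower probability. Gradient well-formedness falls out of the same mechanism: small weights produce mild dispreference rather than outright exclusion.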
Keywords 
 phonotactics, learning
Area 
 Phonology, Learnability
Type 
 Manuscript