Title:Using phonotactics to learn phonological alternations
Authors:Bruce Tesar, Alan Prince
Abstract:A well-known formal challenge in the learning of linguistic systems is interdependence. The problem posed to learners by the interdependence of a phonological mapping and a corresponding lexicon is nontrivial. If the final consonant of a root (preceded by a vowel) is voiceless in isolation, but voiced when followed by a vowel-initial suffix, is the alternation due to the syllable-final devoicing of an underlyingly voiced consonant, or due to the intervocalic voicing of an underlyingly voiceless consonant? The selection of the underlying form depends on the selected phonological mapping, and vice versa.
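
The ambiguity can be made concrete with a small sketch. The forms `rat` and `rada`, the segment inventory, and both rule functions are invented for illustration and are not taken from the paper; the point is only that two different (underlying form, mapping) pairs can yield the same surface data.

```python
# Hypothetical toy inventory: voiceless stops and their voiced counterparts.
VOWELS = set("aeiou")
VOICE = {"p": "b", "t": "d", "k": "g"}
DEVOICE = {voiced: voiceless for voiceless, voiced in VOICE.items()}


def final_devoicing(form):
    """Devoice a word-final voiced stop (a stand-in for syllable-final devoicing)."""
    if form and form[-1] in DEVOICE:
        return form[:-1] + DEVOICE[form[-1]]
    return form


def intervocalic_voicing(form):
    """Voice a voiceless stop that stands between two vowels."""
    out = list(form)
    for i in range(1, len(form) - 1):
        if form[i] in VOICE and form[i - 1] in VOWELS and form[i + 1] in VOWELS:
            out[i] = VOICE[form[i]]
    return "".join(out)


def surface_forms(root, suffix, mapping):
    """Surface realizations of the bare root and of root + suffix."""
    return mapping(root), mapping(root + suffix)


# Hypothesis A: underlying /rad/ with final devoicing.
hypothesis_a = surface_forms("rad", "a", final_devoicing)
# Hypothesis B: underlying /rat/ with intervocalic voicing.
hypothesis_b = surface_forms("rat", "a", intervocalic_voicing)
```

Under these assumptions both hypotheses produce the same pair of surface forms, so the surface data alone do not decide between them.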

Many phonological alternations are motivated by the enforcement of phonotactic restrictions. This linguistic observation leads to a learning proposal in which the learner deals with the interdependence of mapping and lexicon by getting a phonotactic-based initial estimate of the mapping, without reference to alternations or more abstract underlying forms. This phonotactic mapping can then be used as the starting point in a process of further refinement of both underlying forms and mappings.

The learner can get an initial estimate of the mapping of the language via early phonotactic learning, during which each word is treated as if it were a monomorphemic whole. This can be accomplished by an algorithm called Biased Constraint Demotion (BCD). BCD learns an Optimality-theoretic constraint ranking in response to the phonological surface forms of words. In particular, it attempts to learn the most restrictive such ranking. The resulting mapping enforces every phonotactic restriction that is attested in the data and that the constraint system can express.
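
As a rough illustration of how a restrictiveness bias can shape a learned ranking, here is a much simplified sketch of constraint demotion with a markedness-first preference. It is not the BCD algorithm itself: the function name, the constraint names, and the data encoding are hypothetical, and the full algorithm's procedure for choosing among faithfulness constraints is not reproduced (this sketch ranks all available faithfulness constraints together when no markedness constraint is available).

```python
def biased_demotion_sketch(constraints, pairs):
    """Build a stratified ranking, highest stratum first.

    constraints: dict mapping constraint name -> "M" (markedness) or "F" (faithfulness)
    pairs: winner-loser pairs, each a dict mapping constraint name -> "W"
           (prefers the winner) or "L" (prefers the loser); absent = no preference
    """
    remaining = dict(constraints)
    unexplained = list(pairs)
    strata = []
    while remaining:
        # A constraint can be ranked now if it prefers no loser in any unexplained pair.
        rankable = [c for c in remaining
                    if all(p.get(c) != "L" for p in unexplained)]
        if not rankable:
            raise ValueError("no ranking is consistent with the data")
        # Bias: rank markedness constraints whenever any are available,
        # and fall back to faithfulness only when none are.
        markedness = [c for c in rankable if remaining[c] == "M"]
        stratum = sorted(markedness) if markedness else sorted(rankable)
        strata.append(stratum)
        for c in stratum:
            del remaining[c]
        # A pair is explained once some ranked constraint prefers its winner.
        unexplained = [p for p in unexplained
                       if not any(p.get(c) == "W" for c in stratum)]
    return strata


# Hypothetical toy system: voiced obstruents are attested in onsets only.
toy_constraints = {"NoVoicedCoda": "M", "NoVoicedObs": "M", "IdentVoice": "F"}
# The attested form [da] must beat the devoiced competitor [ta].
toy_pairs = [{"IdentVoice": "W", "NoVoicedObs": "L"}]
toy_ranking = biased_demotion_sketch(toy_constraints, toy_pairs)
```

In this toy case the sketch places `NoVoicedCoda` above `IdentVoice`, so coda voicing is ruled out even though no data point forced that ranking; an unbiased procedure would leave the two constraints in the same stratum.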

Later, the learner becomes "morphologically aware" and can detect that a morpheme alternates. It can then gain insight into the correct underlying form for the morpheme by trying, for each word containing the morpheme, underlying forms that differ in the values of precisely those features that differ across the surface realizations of the morpheme. In this way, the learner can determine which surface variants result from general phonotactic restrictions. It can then base its stored underlying form on the surface variants that are not fully accounted for by phonotactics (i.e., that depend on the underlying specification).
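
The candidate-generation step described above can be sketched as follows. The representation (a morpheme as a list of segments, each a dict of feature values), the function names, and the example morpheme are assumptions made for illustration; the sketch also assumes that all surface variants have the same number of segments and the same feature sets.

```python
from itertools import product


def alternating_features(variants):
    """Return (segment index, feature, observed values) for each feature
    whose value differs across the surface variants of a morpheme."""
    slots = []
    for i, segment in enumerate(variants[0]):
        for feature in segment:
            values = {variant[i][feature] for variant in variants}
            if len(values) > 1:
                slots.append((i, feature, sorted(values)))
    return slots


def candidate_underlying_forms(variants):
    """Enumerate underlying forms that vary only in the alternating features."""
    slots = alternating_features(variants)
    candidates = []
    for combo in product(*(values for _, _, values in slots)):
        form = [dict(segment) for segment in variants[0]]  # copy the shared material
        for (i, feature, _), value in zip(slots, combo):
            form[i][feature] = value
        candidates.append(form)
    return candidates


# Hypothetical morpheme whose final consonant surfaces as voiceless or voiced.
variant_voiceless = [{"seg": "r"}, {"seg": "a"}, {"place": "coronal", "voice": "-"}]
variant_voiced = [{"seg": "r"}, {"seg": "a"}, {"place": "coronal", "voice": "+"}]
toy_candidates = candidate_underlying_forms([variant_voiceless, variant_voiced])
```

Each candidate would then be checked against the phonotactic mapping for every word containing the morpheme; that evaluation step is not shown here.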
Type:Paper/tech report
Article:Version 1