[Author Login]
Title:Learning underlying forms by searching restricted lexical subspaces
Authors:Nazarre Merchant, Bruce Tesar
Comment:To appear in CLS 41.
Abstract:Two intertwined tasks that face a learner are learning a lexicon and learning a constraint ranking. Given a set of surface forms, it is not possible in general to determine what ranking produces the given forms without knowing what the underlying forms are and similarly the underlying forms are not discernable without information about the ranking that produced the given surface forms (Albright and Hayes 2002; Hale and Reiss 1997; Tesar and Smolensky 2000; Tesar et al. 2003). Exhaustive searching of the possible underlying forms and constraint rankings is untenable. There simply are too many possible combinations of lexica and constraint rankings; the search space is too large. Because of this the learner must adopt some strategy to search the space non-exhaustively.

In this paper we propose an algorithm whereby the learner attends to contrast pairs. Contrast pairs are pairs of surface forms that contrast in some surface feature and differ in exactly one morpheme, as described in Tesar 2004. Because these pairs contrast it follows that they must differ underlyingly in some featural specification; the surface contrast could not obtain if the pair had identical underlying specifications. Furthermore, because they differ in only one morpheme the difference in underlying specification of the words must be a consequence of different underlying specifications for the differing morphemes.

The two English words 'cats' and 'cads' form a contrast pair. Here the roots 'cat' and 'cad' are in the morphological environment plural. Because the plural morpheme has one underlying specification the surface contrast must come about from differing underlying specifications of the morphemes 'cat' and 'cad'.

(1) kæt + s ~ kæd + z

In this algorithm, for each contrast pair, the learner tests all possible underlying forms for consistency. The learner then sets the underlying values of those features that have the same value for all consistent underlying forms. The learner processes contrast pairs in order by the number of undetermined features. This way the learner processes pairs with the fewest undetermined features (the ones requiring the least computational effort) first.

There are two main ideas captured in the algorithm presented here. First, relevant information about the lexicon can be extracted by only considering contrast pairs. At no time does the learner consider more than two surface forms simultaneously, nor does it consider single forms in isolation, as does the Surgery Learning Algorithm (henceforth SLA) (Tesar et al 2003). By using contrast pairs, the algorithm can be shown to generalize the capabilities of the CA procedure, being able to set all features that the CA procedure would set and others as well.

The second main idea is that processing with contrast pairs permits a meaningful reduction in the computational effort required of the learner, relative to a simultaneous evaluation of all the data. For each contrast pair, the learner is only considering combinations of values for the unset features of the morphemes in that pair. At no time does the learner consider all combinations of unset features of all morphemes simultaneously. It is in this way that the learner avoids the combinatorial nightmare of exhaustive search.

The results reported here support the claim that using contrast pairs in learning provides a good trade-off between information and computational effort. Contrast pairs provide enough information about underlying forms to allow the learner to ultimately learn the correct grammar without requiring excessive computational effort to process.
Type:Paper/tech report
Article:Version 1