Title: A Robbins-Monro type learning algorithm for an entropy maximizing version of stochastic Optimality Theory
Abstract: The object of the present work is the analysis of the convergence behaviour of a learning algorithm for grammars belonging to a special version -- the maximum entropy version -- of stochastic Optimality Theory. Like its deterministic predecessor, Optimality Theory as introduced by Prince and Smolensky, stochastic Optimality Theory is a theory of universal grammar in the sense of generative linguistics.
We give formal definitions of the basic notions of stochastic Optimality Theory and introduce the learning problem as it appears in this context. A popular version of stochastic Optimality Theory, introduced by Boersma, is briefly discussed. The maximum entropy version of stochastic Optimality Theory is then derived in great generality from fundamental principles of information theory. The main part of this work is dedicated to the analysis of a learning algorithm proposed by Jaeger (2003) for maximum entropy grammars. We show in which sense and under what conditions the algorithm converges. Connections with well-known procedures and classical results are made explicit.
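To give a concrete sense of the kind of procedure analysed here, the following is a minimal, purely illustrative sketch of a Robbins-Monro style stochastic update for a maximum entropy grammar. It is not the paper's algorithm itself: the toy constraint violation profiles, the candidate set, and the step-size schedule are all assumptions made for the example. A candidate's probability is proportional to exp(-Σ_i w_i·v_i), and each update samples a candidate from the current grammar and nudges the weights toward the observed winner, with step sizes η_t satisfying the classical Robbins-Monro conditions (Σ η_t = ∞, Σ η_t² < ∞).

```python
import math
import random

def probs(weights, candidates):
    # MaxEnt distribution: P(c) proportional to exp(-sum_i w_i * v_i(c)),
    # where v_i(c) is the number of violations of constraint i by c.
    scores = [math.exp(-sum(w * v for w, v in zip(weights, c)))
              for c in candidates]
    z = sum(scores)
    return [s / z for s in scores]

def sga_step(weights, candidates, observed_idx, eta):
    # One stochastic update (Robbins-Monro flavour): sample a candidate
    # from the current grammar, then shift weights by the difference
    # between the sample's and the observed winner's violation profiles.
    p = probs(weights, candidates)
    r, acc, sampled = random.random(), 0.0, len(candidates) - 1
    for i, pi in enumerate(p):
        acc += pi
        if r <= acc:
            sampled = i
            break
    return [w + eta * (candidates[sampled][k] - candidates[observed_idx][k])
            for k, w in enumerate(weights)]

random.seed(0)
# Hypothetical toy data: two candidates, two constraints;
# candidate 0 is always the observed winner.
candidates = [(0.0, 1.0), (1.0, 0.0)]
weights = [0.0, 0.0]
for t in range(1, 2001):
    eta = 1.0 / t  # step sizes with sum eta = inf, sum eta^2 < inf
    weights = sga_step(weights, candidates, observed_idx=0, eta=eta)

print(probs(weights, candidates))  # probability mass shifts toward candidate 0
```

Under the sampled update, the expected change of the weights is the gradient of the log-likelihood of the observed winner, which is what makes the procedure a stochastic approximation scheme in the sense of Robbins and Monro.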