|Abstract:||To some, the relation between bidirectional optimality theory and game theory seems obvious: strong bidirectional optimality corresponds to Nash equilibrium in a strategic game (Dekker and van Rooij 2000). But in the domain of pragmatics this formally sound parallel is conceptually inadequate: the sequence of utterance and its interpretation cannot be modelled reasonably as a strategic game, because this would mean that speakers choose formulations independently of a meaning that they want to express, and that hearers choose an interpretation irrespective of an utterance that they have observed. Clearly, the sequence of utterance and interpretation requires a dynamic game model. One such model, and one that is widely studied and of manageable complexity, is a signaling game. This paper is therefore concerned with an epistemic interpretation of bidirectional optimality, both strong and weak, in terms of beliefs and strategies of players in a signaling game. In particular, I suggest that strong optimality may be regarded as a process of internal self-monitoring and that weak optimality corresponds to an iterated process of such selfmonitoring. This latter process can be derived by assuming that agents act rationally to (possibly partial) beliefs in a self-monitoring opponent.