Abstract


The central problem of theoretical linguistics is to find a class of representations that is large enough to account for the observed variety of natural languages, while still being restricted enough that those representations can be learned. While many formalisms satisfy the first of these criteria, learnability turns out to be extremely hard to achieve. It therefore seems reasonable to follow a research strategy that puts learnability first.


We argue that the important problems of learnability are not the information-theoretic ones that lead to the famous negative results of Gold and others: reasonable assumptions about probabilistic data compensate for the absence of negative examples. The important problems concern, rather, the computational complexity of constructing a hypothesis from the available data. In general, the structure of the representations should be based on the structure of the language itself. This "empiricist" property reduces the indeterminacy of the representation and makes learnability possible. We argue that distributional learning (DL) offers several possible solutions: DL methods explicitly or implicitly exploit the relation between a substring and the contexts in which it occurs in a language.
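As a concrete illustration (a minimal sketch of our own; the function and toy sample are not from the cited work), the distribution of a substring in a finite sample of sentences can be approximated by collecting every (left, right) context in which it occurs:

    def contexts(sample, sub):
        # Collect the (left, right) contexts of the word tuple `sub`
        # over a finite sample of sentences (each a tuple of words).
        result = set()
        for sentence in sample:
            n, k = len(sentence), len(sub)
            for i in range(n - k + 1):
                if sentence[i:i + k] == sub:
                    result.add((sentence[:i], sentence[i + k:]))
        return result

    sample = [("the", "cat", "sleeps"), ("the", "dog", "sleeps")]
    contexts(sample, ("cat",))  # {(("the",), ("sleeps",))}
    contexts(sample, ("dog",))  # the same set: "cat" and "dog" share
                                # all their observed contexts

Strings with identical context sets are distributionally interchangeable; it is this relation that DL algorithms exploit.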


We discuss the foundational theoretical issues of DL in grammatical inference. Recent results have established that large classes of context-free and context-sensitive languages can be efficiently learned using provably correct algorithms based on DL. The most elementary result is the learnability of substitutable languages (Clark and Eyraud, 2005); subsequent results, which assume the existence of a membership oracle, show that much larger classes can be learned using either context-free grammars or more powerful context-sensitive formalisms such as Binary Feature Grammars (Clark et al., 2008). Finally, we discuss the class of Distributional Lattice Grammars (Clark, 2009), which are efficiently learnable and define a class of languages that seems a plausible fit to the class of natural languages. An important open question is whether the structural descriptions defined by these representations are rich enough to support semantic interpretation.
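To make the lattice construction concrete, the following is a minimal sketch of the Galois connection between sets of strings and sets of contexts that underlies these representations; the function names, the finite candidate sets, and the membership oracle in_language are illustrative assumptions, not the actual construction of Clark (2009):

    def polar_contexts(strings, candidate_contexts, in_language):
        # S -> S': the contexts (l, r) that accept every string in S,
        # queried through the membership oracle.
        return {(l, r) for (l, r) in candidate_contexts
                if all(in_language(l + w + r) for w in strings)}

    def polar_strings(contexts, candidate_strings, in_language):
        # C -> C': the strings w accepted by every context in C.
        return {w for w in candidate_strings
                if all(in_language(l + w + r) for (l, r) in contexts)}

Here strings and the left/right parts of contexts are tuples of words, so l + w + r is tuple concatenation. A pair <S, C> with S = polar_strings(C, ...) and C = polar_contexts(S, ...) is closed; such closed pairs ("concepts"), ordered by inclusion, form a complete lattice, which is the structure the grammar operates over.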

On Distributional Learning

Alexander Clark

Royal Holloway, University of London