Abstracts
Considerations of learnability have shaped research programs in both theoretical linguistics and cognitive science. In theoretical linguistics, results such as Gold's (1967) proof that no sufficiently rich family of languages can be identified in the limit from positive examples alone have contributed to the emergence of a highly restrictive notion of Universal Grammar, according to which the task of the language learner amounts to choosing from a finite set of candidate languages. Frameworks that embody this notion of learning include Principles and Parameters (Chomsky 1981) and Optimality Theory (Prince and Smolensky 1993). In cognitive science, evidence that humans can take advantage of statistical regularities in their environment, as in the experiments of Saffran et al. (1996), has motivated complex, task-specific statistical learners. These learners often involve machinery that goes well beyond what seems to be needed to account for adult language. They also tend to be geared toward solving specific problems, in many cases problems that have little overlap with the tasks suggested by work on linguistic competence, and they do not scale up to more complex tasks such as learning syntactic structure or semantic entries.
This talk investigates the possibility that the seemingly conflicting effects that learnability-related considerations have had in linguistics and in cognitive science can be reconciled by approaching the problem of learning from the perspective of the assumptions required to account for linguistic competence in adults. I will argue that two abilities, representing grammars in memory and using grammars to parse an input, together support a learning mechanism along the lines of Minimum Description Length (MDL). I will present a learner that is based directly on this idea and will discuss its ability to handle data generated by certain probabilistic context-free grammars, as well as segmentation data of the kind discussed by Saffran et al. (1996). Significantly, the learner has been able to discover the underlying structure of the data in the segmentation task as a by-product of its general search for the best grammar for the data, without any knowledge of the task of segmentation or of notions such as lexicon or word.
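To make the MDL idea concrete, the following is a minimal sketch of a two-part description-length score, not the learner presented in the talk. It assumes a very simple encoding (a lexicon spelled out symbol by symbol, and data encoded with a Shannon code over relative token frequencies), and it only scores candidate analyses that are supplied by hand; it performs no search. The function name, the four nonsense words, and the encoding scheme are all illustrative assumptions.

```python
import math
import random
from collections import Counter

def description_length(tokens, alphabet_size):
    """Two-part MDL score in bits: cost of the lexicon plus cost of the data given it.

    Assumed encoding (illustrative only): each lexicon entry is spelled out
    symbol by symbol with one end-of-entry marker, and each token in the data
    is encoded with a code length of -log2 of its relative frequency.
    """
    counts = Counter(tokens)
    total = len(tokens)
    bits_per_symbol = math.log2(alphabet_size + 1)  # +1 for the end marker
    lexicon_bits = sum((len(entry) + 1) * bits_per_symbol for entry in counts)
    data_bits = -sum(c * math.log2(c / total) for c in counts.values())
    return lexicon_bits + data_bits

# Hypothetical nonsense words, concatenated into an unsegmented stream.
words = ["tupiro", "golabu", "bidaku", "padoti"]
rng = random.Random(0)
word_tokens = [rng.choice(words) for _ in range(200)]
stream = "".join(word_tokens)
alphabet_size = len(set(stream))

# Candidate analyses of the same stream, from finest to coarsest.
candidates = {
    "letters": list(stream),
    "syllables": [stream[i:i + 2] for i in range(0, len(stream), 2)],
    "words": word_tokens,
    "unsegmented": [stream],
}
scores = {name: description_length(toks, alphabet_size)
          for name, toks in candidates.items()}

for name, bits in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name:12s} {bits:10.1f} bits")
best = min(scores, key=scores.get)
```

Under these assumptions the word-level analysis receives the shortest total description: finer analyses have a small lexicon but an expensive encoding of the data, while treating the whole stream as a single entry makes the data free to encode but the lexicon very large. The score itself makes no reference to segmentation or to words; it only trades grammar size against data size.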
Learning and Linguistic Competence
Roni Katzir
Cornell University