Our lab

The Cornell Computational Linguistics Lab is a research and educational lab in the Department of Linguistics and the Department of Computing and Information Science. It is a venue for lab sessions for classes, computational dissertation research by graduate students, undergraduate research projects, and grant research.

The lab collaborates with a large group at Cornell, including faculty and students in Cognitive Science, Computer Science, Psychology, and Information Science. The Department of Computing and Information Science provides system administration support for the lab, and some computational work is done on hardware at the Department of Computer Science and the Center for Advanced Computing.


Graduate Students

Undergraduate Research Assistants

Lab Alumni


Students and faculty are currently working on diverse projects in computational phonetics, phonology, syntax, and semantics.

Finite-state phonology

Mats Rooth • Simone Harmath-de Lemos • Shohini Bhattasali • Anna Choi

In this project we train a finite-state model to detect prosodic cues in a speech corpus. We are specifically interested in detecting stress cues in Brazilian Portuguese and Bengali and in finding empirical evidence for current theoretical views.

Recent publications (2021 and 2020)

Here are some selected publications from recent work by faculty and graduate students:

Recent Courses:

If you are interested in computational linguistics, these classes are a great way to get started in this area:

LING 4424: Computational Linguistics

Introduction to computational linguistics. Possible topics include syntactic parsing using functional programming, logic-based computational semantics, and finite-state modeling of phonology and phonetics.

LING 4434: Computational Linguistics 2

This course introduces techniques to probe for linguistic representations in neural network models of language. It is centered on discussion of current research papers and on student research projects.

LING 4429/6429: Grammar Formalisms

This course introduces different ways of "formalizing" linguistic analyses, with examples from natural language syntax. Students learn to identify recurrent themes in generative grammar, seeing how alternative conceptualizations lead to different analytical trade-offs. Using distinctions such as rule vs. constraint, transformational vs. monostratal, and violable vs. inviolable, students emerge better able to assess others' work in a variety of formalisms, and better able to deploy formalism in their own analyses.

LING 4485/6485: Topics in Computational Linguistics

Current topics in computational linguistics. Recent topics include computational models for Optimality Theory and finite-state models.

LING 2264: Language, Mind, and Brain

An introduction to neurolinguistics, this course surveys topics such as aphasia, hemispheric lateralization and speech comprehension as they are studied via neuroimaging, intracranial recording and other methods. A key focus is the relationship between these data, linguistic theories, and more general conceptions of the mind.


Access to Cornell's G2 Computing Cluster
More than 80 language corpora in 60+ languages (e.g. news text, dialogue corpora, and television transcripts)


English 97, BankBaseline, and PF Linear Expansion: Please contact Mats Rooth.

The Cornell Conditional Probability Calculator (CCPC): Please contact ccpc@cornell.edu.

DeepParse 2.2, DepPrint 1.1, NegraToConfig: Please contact Marisa Boston.

Computational lexicon of Modern Greek annotated with POS and lemma, Newspaper corpus of Modern Greek: Please contact Effie Georgala.

Useful links

Linguistics department
Natural Language Processing group
Cognitive Science program
Association for Computational Linguistics
Cornell Linguistics Circle