Our lab

The Cornell Computational Linguistics Lab is a research and educational lab in the Department of Linguistics and in Computing and Information Science. It is a venue for lab sessions for classes, computational dissertation research by graduate students, undergraduate research projects, and grant-funded research.

The lab collaborates with a large group at Cornell, including faculty and students in Cognitive Science, Computer Science, Psychology, and Information Science. The Department of Computing and Information Science provides system administration support for the lab, and some computational work is done on hardware at the Department of Computer Science and the Center for Advanced Computing.

Faculty

Graduate Students

Undergraduate Research Assistants

Lab Alumni

Projects

Students and faculty are currently working on diverse projects in computational phonetics, phonology, syntax, and semantics.

Finite-state phonology

Mats Rooth • Simone Harmath-de Lemos • Shohini Bhattasali • Anna Choi

In this project we train a finite-state model to detect prosodic cues in a speech corpus. We are specifically interested in detecting stress cues in Brazilian Portuguese and Bengali and in evaluating current theoretical views of stress against this empirical evidence.
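
As a rough illustration of the general approach (a minimal sketch, not the lab's actual pipeline), a sequence model over frame-level acoustic features can be fit and then used to label regions of an utterance. Here a two-state Gaussian HMM from the hmmlearn package stands in for the finite-state model, and the per-frame feature extraction (e.g. MFCCs) is assumed to have happened upstream; the resulting state labels would then be compared against hand-annotated stress marks.

    # Minimal sketch: label frames as stressed vs. unstressed with a 2-state Gaussian HMM.
    # Assumes numpy and hmmlearn; per-frame feature extraction happens upstream.
    import numpy as np
    from hmmlearn import hmm

    def train_stress_model(feature_sequences, n_states=2):
        """Fit an HMM over per-utterance feature matrices (frames x feature dims)."""
        X = np.vstack(feature_sequences)                   # stack all utterances
        lengths = [len(seq) for seq in feature_sequences]  # one length per utterance
        model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=50)
        model.fit(X, lengths)
        return model

    def label_frames(model, features):
        """Return the most likely state sequence (Viterbi path) for one utterance."""
        return model.predict(features)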

Xenophobia and dog whistle detection in social media

Kaelyn Lamp • Marten van Schijndel

In this collaboration with the Cornell Xenophobia Meter Project, we study the linguistic properties of social media dog whistles to better identify extremist trends before they gain traction.

Models of code-switching

Marten van Schijndel • Debasmita Bhattacharya • Vinh Nguyen • Andrew Xu

In this work, we study bilingual code-switching (that is, the use of two languages interchangeably within a single utterance). We are particularly interested in how information flows across code-switch boundaries: how information from a span in Language 1 can influence production and comprehension of spans in Language 2. We are also studying which properties influence code-switching and whether switches occur mainly to ease production for the speaker or mainly to ease comprehension for the listener.
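
One common way to quantify this kind of cross-boundary information flow (a sketch under generic assumptions, not necessarily the method used in this project) is to compute per-token surprisal from a language model and compare values just before and just after a switch point. The example below uses the Hugging Face transformers API; the model name is a placeholder, and a multilingual causal language model would be substituted in practice.

    # Sketch: per-token surprisal from a causal LM, e.g. around code-switch points.
    # Assumes the transformers and torch packages; "gpt2" is a stand-in model name.
    import math
    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    def token_surprisals(sentence):
        """Return (token, surprisal in bits) pairs for every token after the first."""
        ids = tokenizer(sentence, return_tensors="pt").input_ids
        with torch.no_grad():
            log_probs = torch.log_softmax(model(ids).logits, dim=-1)
        pairs = []
        for i in range(1, ids.shape[1]):
            logp = log_probs[0, i - 1, ids[0, i]].item()   # log P(token_i | preceding tokens)
            pairs.append((tokenizer.decode(int(ids[0, i])), -logp / math.log(2)))
        return pairs

    # e.g. inspect surprisal at the tokens just after a Spanish-English switch point:
    print(token_surprisals("I told her que no venga"))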

Representation sharing in neural networks

Marten van Schijndel • Forrest Davis • Debasmita Bhattacharya • William Timkey

Much work has gone into studying which linguistic surface patterns are captured by neural networks. In this work we ask how various surface patterns are grouped into larger linguistic abstractions within the networks, and how those abstractions interact. Is each instance of a linguistic phenomenon, such as filler-gap dependencies, related to other instances of that phenomenon (i.e., do models encode a filler-gap abstraction), or is each contextual occurrence encoded as a separate phenomenon?
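
One generic way to probe this question (illustrative only, not the lab's specific method) is to train a lightweight classifier on a model's hidden states for one construction type and test whether it transfers to another: good transfer is consistent with a shared abstraction, while poor transfer suggests separately encoded instances. The arrays and labels below are assumed to have been extracted from a network beforehand.

    # Sketch: probe transfer as a proxy for shared abstractions.
    # hidden_states_A / hidden_states_B are (n_examples, hidden_dim) arrays taken from a
    # neural LM for two filler-gap construction types; labels mark gap presence.
    from sklearn.linear_model import LogisticRegression

    def probe_transfer(hidden_states_A, labels_A, hidden_states_B, labels_B):
        """Train a linear probe on construction A and evaluate it on construction B."""
        probe = LogisticRegression(max_iter=1000)
        probe.fit(hidden_states_A, labels_A)
        in_domain = probe.score(hidden_states_A, labels_A)
        transfer = probe.score(hidden_states_B, labels_B)
        return in_domain, transfer   # similar scores suggest a shared encoding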

Summarization as Linguistic Compression

Marten van Schijndel • Fangcong Yin • William Timkey

In this work, we conceptualize summarization as a linguistic compression task. We study how different levels of linguistic information are compressed during summarization and whether automatic summarization models learn similar compression functions. We also study how each aspect of linguistic compression is correlated with various measures of summary quality according to human raters.
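
Two very crude compression measures illustrate the framing (a sketch only; the project's actual measures operate over richer linguistic representations): how much shorter the summary is than its source, and how much of the source vocabulary it retains.

    # Sketch: simple compression measures comparing a source text to its summary.
    def length_compression(source, summary):
        """Ratio of summary length to source length in whitespace tokens."""
        return len(summary.split()) / len(source.split())

    def vocabulary_retention(source, summary):
        """Fraction of the source's word types that survive into the summary."""
        src_types = set(w.lower() for w in source.split())
        sum_types = set(w.lower() for w in summary.split())
        return len(src_types & sum_types) / len(src_types)

    source = "The committee met on Tuesday and voted to approve the new budget after a long debate."
    summary = "The committee approved the new budget."
    print(length_compression(source, summary), vocabulary_retention(source, summary))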

Mitigating bias in automatic speech recognition

Anna (Seo Gyeong) Choi

Automatic speech recognition (ASR) systems have seen remarkable improvements in recent years, but this success does not extend to all speaker groups. Recent research shows that performance disparities exist across different types of speech, including accented, dialectal, and impaired speech, and even cross-linguistically. To ensure that these improvements lead to inclusive speech technologies, we audit commercial speech-to-text services provided by Google, Microsoft, and Amazon, as well as toolkits such as OpenAI and Rev AI, and compare their transcription results across different speech groups. We then explore whether the crucial problem stems from the acoustic model, the language model, or the training data itself, with the goal of achieving algorithmic fairness.
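
A minimal version of such an audit (assuming group labels and transcripts are already in hand; the data below is invented for illustration) compares word error rate across speaker groups, for example with the jiwer package.

    # Sketch: compare word error rate (WER) across speaker groups with jiwer.
    # The record format and group names are illustrative, not the actual audit data.
    from collections import defaultdict
    import jiwer

    def wer_by_group(records):
        """records: iterable of (group, reference_transcript, asr_hypothesis) triples."""
        refs, hyps = defaultdict(list), defaultdict(list)
        for group, ref, hyp in records:
            refs[group].append(ref)
            hyps[group].append(hyp)
        return {g: jiwer.wer(refs[g], hyps[g]) for g in refs}

    records = [
        ("L1_English", "she sells sea shells", "she sells sea shells"),
        ("L2_English", "she sells sea shells", "she sell seashells"),
    ]
    print(wer_by_group(records))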

Fairness in speech recognition through ethical speech data collection

Anna (Seo Gyeong) Choi

Speech datasets are crucial for training speech language technologies (SLT); however, a lack of diversity in the underlying training data can lead to serious limitations in building equitable and robust SLT products, especially along the dimensions of language, accent, dialect, variety, and speech impairment, and in the intersection of speech features with socioeconomic and demographic features. Furthermore, there is often a lack of oversight of the underlying training data, which is commonly built from massive web crawls and/or publicly available speech, with regard to the ethics of such data collection. To encourage standardized documentation of such speech data components, we introduce an augmented datasheet for speech datasets that both practitioners and users can refer to, in a constructive cyclic manner, when building speech datasets.
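
For illustration only, a datasheet of this kind might record fields like the following; these field names are examples, not the contents of the augmented datasheet proposed in this project.

    # Illustrative skeleton of the kinds of fields a speech datasheet might document.
    # The field names are examples, not the proposed augmented datasheet itself.
    speech_datasheet = {
        "motivation": "Why the dataset was created and by whom",
        "speaker_demographics": ["language", "accent/dialect", "age", "gender", "speech impairment"],
        "collection_method": "e.g. web crawl, studio recording, crowd-sourcing",
        "consent_and_licensing": "How speakers consented and under what license",
        "annotation": "Transcription conventions and quality control",
        "intended_uses": "Tasks the data is (and is not) appropriate for",
        "maintenance": "Who updates the datasheet as the dataset evolves",
    }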

Vowel Duration as a predictor of primary stress placement in Brazilian Portuguese

Simone Harmath-de Lemos

I work with speech corpora to investigate how compressed models of the speech signal (specifically MFCCs) can be used to support research in phonetics and phonology. I am currently working on a project that seeks to understand whether vowel duration (as generated by forced aligners) is a better predictor of primary stress placement in Brazilian Portuguese (BP) than spectral and energy features (as represented by MFCCs). The project also aims to compare pretonic, stressed, and posttonic vowels in BP, both syntagmatically and paradigmatically. In parallel, I am using the same method to look at possible spectral, energy, and durational differences between vowels in contexts where differences in syntactic attachment in a pair of words are said to trigger stress shift in BP. In a third thread of the work, I am investigating whether different combinations of bilingual acoustic models have an impact on the forced alignment of a small speech corpus of Bororo (Bororoan, Central Brazil).
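
As a sketch of the feature extraction step (the interval format here is an invented stand-in for forced-aligner output), vowel durations and mean MFCC vectors can be pulled from a recording with librosa.

    # Sketch: pull duration and mean MFCCs for aligned vowel intervals.
    # Uses librosa; intervals are (label, start_s, end_s) triples standing in for
    # the output of a forced aligner.
    import librosa
    import numpy as np

    def vowel_features(wav_path, intervals, n_mfcc=13):
        """Return (label, duration, mean MFCC vector) for each vowel interval."""
        y, sr = librosa.load(wav_path, sr=None)
        rows = []
        for label, start, end in intervals:
            segment = y[int(start * sr):int(end * sr)]
            mfcc = librosa.feature.mfcc(y=segment, sr=sr, n_mfcc=n_mfcc)
            rows.append((label, end - start, mfcc.mean(axis=1)))
        return rows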

Recent publications (2023, 2022, 2021, and 2020)

Here are some selected publications from recent work by faculty and graduate students:

Recent Courses

If you are interested in computational linguistics, these classes are a great way to get started in this area:

LING 2264: Language, Mind, and Brain

An introduction to neurolinguistics, this course surveys topics such as aphasia, hemispheric lateralization and speech comprehension as they are studied via neuroimaging, intracranial recording and other methods. A key focus is the relationship between these data, linguistic theories, and more general conceptions of the mind. Appropriate for students from any major.

LING 4424: Computational Linguistics I

Computational models of natural languages. Topics are drawn from: tree syntax and context free grammar, finite state generative morpho-phonology, feature structure grammars, logical semantics, tabular parsing, Hidden Markov models, categorial and minimalist grammars, text corpora, information-theoretic sentence processing, discourse relations, and pronominal coreference.

LING 4434: Computational Linguistics II

An in-depth exploration of modern computational linguistic techniques; a continuation of LING 4424: Computational Linguistics I. Whereas LING 4424 covers foundational techniques in symbolic computational modeling, this course will cover a wider range of applications and introduce neural network methods. We will survey neural network techniques that are widely used in computational linguistics and natural language processing, along with techniques for probing the linguistic information and language processing strategies encoded in computational models. We will examine ways of mapping this linguistic information both to linguistic theory and to measures of human processing (e.g., neuroimaging data and human behavioral responses).

LING 4485/6485: Topics in Computational Linguistics

Current topics in computational linguistics. Recent topics include computational models for Optimality Theory and finite state models.

Resources

Access to Cornell's G2 Computing Cluster
More than 870 language corpora in 60+ languages (e.g., news text, dialogue corpora, television transcripts)

Useful Downloads (GitHub and Zenodo repositories)

Kaldi Utilities

Kaldi-alignments-matlab: Read, display, and play Kaldi phone alignments in Matlab

A Truly Cleaned and Filtered Subset of The Pile corpus

The Pudding repo contains code that creates a truly cleaned and filtered subset of the 800GB Pile corpus, parsed into CoNLL-U format.
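
For example, the CoNLL-U files can be streamed with the conllu package (the file name below is a placeholder):

    # Sketch: stream sentences from a CoNLL-U file with the conllu package.
    from conllu import parse_incr

    with open("pile_subset.conllu", encoding="utf-8") as f:
        for sentence in parse_incr(f):
            tokens = [token["form"] for token in sentence]
            pos_tags = [token["upos"] for token in sentence]
            print(list(zip(tokens, pos_tags)))
            break  # just show the first sentence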

Tools for LSTM (long short-term memory) models

LSTM (long short-term memory) toolkit that can estimate incremental processing difficulty

Left-corner parsing toolkit that can estimate incremental processing difficulty

125 pre-trained English LSTM models: this repository contains the 125 LSTM models analyzed in van Schijndel, Mueller, and Linzen (2019), "Quantity doesn't buy quality syntax with neural language models".

Useful links

Cornell Department of Linguistics
Cornell Natural Language Processing group
Cornell Cognitive Science program
Association for Computational Linguistics
Cornell Linguistics Circle