Yale linguists present at ACL

August 19, 2019

Professors Claire Bowern and Bob Frank; grad student Yiding Hao; undergraduates Noah Amsel, Yongjie Lin, and Yi Chern Tan; and undergrad alum Will Merrill, now a postbac at the Allen Institute for Artificial Intelligence, traveled to Florence, Italy, to present at various workshops of the annual meeting of the Association for Computational Linguistics (ACL). In total, Yale linguists presented six posters and two invited talks, including two posters featuring research conducted by the Computational Linguistics at Yale (CLAY) lab during the 2018–2019 academic year.

Natural language processing (NLP) is an interdisciplinary field that seeks to allow computers to understand human language. Recent advances in machine learning have enabled significant progress in machine translation, sentiment analysis, natural language understanding, and other NLP problems. However, while modern systems are much more effective than their traditional predecessors, they are also extraordinarily complex, and their behavior is not well understood. For the past two years, the CLAY lab has focused on whether NLP systems truly encode information about natural language grammar or primarily rely on spurious statistical correlations.

In their paper “Open Sesame: Getting inside BERT’s Linguistic Knowledge,” Yongjie, Yi Chern, and Bob examine Bidirectional Encoder Representations from Transformers (BERT), a technique in which a computer attempts to learn information about a language such as English by processing large amounts of text. The three authors want to know whether what BERT learns includes information about sentence structure. They find that when BERT reads a sentence, it first identifies each word in the sentence and then gradually groups those words into complex phrases. When performing tasks such as subject–verb agreement, BERT is often sensitive to features such as person and number, though it is also sometimes distracted by irrelevant cues in the sentence. The study concludes that BERT captures some aspects of English grammar, even if it does not use this information perfectly.

BERT learns everything it knows on its own, directly from text. But what happens if the computer is explicitly given hints about what sentence structure looks like? “Finding Hierarchical Structure in Neural Stacks,” by Will, Noah, Yiding, and Bob, along with undergraduate CLAY members Lenny Khazan and Simon Mendelsohn, addresses this question by studying Neural Stacks. Like BERT, Neural Stacks attempt to learn information about language by reading and processing text. Unlike BERT, however, Neural Stacks are specially designed to process sentences with complex structure using a mechanism known as a stack. A previous CLAY study suggested that Neural Stacks have difficulty figuring out how to use the stack to model structure. However, this year’s study showed that Neural Stacks can use the stack to keep track of phrases being formed throughout the sentence. For example, when the computer sees a preposition, it uses the stack to record the fact that it expects the next word to be the object of that preposition. Based on that observation, the authors are able to visualize the various kinds of phrase structures learned by the computer.

Apart from CLAY projects, Yale attendees also presented their own research at the conference. Will presented his BA thesis work on the expressive power of sequential neural networks at the Deep Learning and Formal Languages (Delfol) workshop, where Bob gave an invited talk. Bob presented work with Will and undergrad Gigi Stark on modeling historical varieties of English at the Computational Approaches to Historical Linguistic Change (LChange) workshop, where Claire gave an invited talk. Yiding presented two posters at the SIGMORPHON workshop, featuring joint work with grad student Samuel Andersson on unbounded stress as well as joint work with former Yale postdoc Dustin Bowers, now of the University of Arizona, on rhythmic syncope. The Yale linguists were joined by grad students Tao Yu and Alex Fabbri of the Department of Computer Science, who presented work from Professor Drago Radev’s Language, Information, and Learning at Yale (LILY) lab, as well as undergrad alumni Jungo Kasai and Tom McCoy, now grad students at the University of Washington and Johns Hopkins University, respectively.

The main conference of the ACL was held in Florence, Italy, from July 28 to July 31, and the workshops followed on August 1 and 2. The workshop proceedings, including the full programs and papers, are available on the ACL Anthology (BlackBoxNLP, Delfol, LChange, SIGMORPHON).