Workshop on Insights from Negative Results in NLP


Albuquerque, New Mexico, May 2025
(co-located with NAACL)

Program

Insights 2021 will be a hybrid workshop. The poster sessions will be conducted online, in gather.town. The other sessions will have a mixture of virtual and on-site speakers and attendees.

Please see the workshop page on Underline for the gather.town and Zoom links.

All times are local time (Punta Cana, GMT-4).

8:45–9:00 Opening remarks

9:00–10:00 Invited talk: Bonnie Webber (University of Edinburgh) [Video]

The Reviewers and the Reviewed: Institutional Memory and Institutional Incentives

Everyone has their own stories about unfair reviewers, misguided reviewers, and reviewers who just don’t seem to get it. A Workshop on “Insights from Negative Results” then seems just the place to reflect on who reviews are for, what purpose they serve for authors and reviewers, and what may be gained or lost from recent changes to conference reviewing.

10:00–11:15 Poster session 1

  • Two Heads are Better than One? Verification of Ensemble Effect in Neural Machine Translation [PDF] [Video]
    Chanjun Park, Sungjin Park, Seolhwa Lee, Taesun Whang and Heuiseok Lim
  • The Highs and Lows of Simple Lexical Domain Adaptation Approaches for Neural Machine Translation [PDF] [Video]
    Nikolay Bogoychev and Pinzhen Chen
  • Backtranslation in Neural Morphological Inflection [PDF] [Video]
    Ling Liu and Mans Hulden
  • Does Commonsense help in detecting Sarcasm? [PDF] [Video]
    Somnath Basu Roy Chowdhury and Snigdha Chaturvedi
  • Finetuning Pretrained Transformers into Variational Autoencoders [PDF] [Video]
    Seongmin Park and Jihwa Lee
  • Zero-Shot Cross-Lingual Transfer is a Hard Baseline to Beat in German Fine-Grained Entity Typing [PDF] [Video]
    Sabine Weber and Mark Steedman
  • Comparing Euclidean and Hyperbolic Embeddings on the WordNet Nouns Hypernymy Graph [PDF] [Video]
    Sameer Bansal and Adrian Benton
  • When does Further Pre-training MLM Help? An Empirical Study on Task-Oriented Dialog Pre-training [PDF] [Video]
    Qi Zhu, Yuxian Gu, Lingxiao Luo, Bing Li, Cheng Li, Wei Peng, Minlie Huang and Xiaoyan Zhu
  • On the Difficulty of Segmenting Words with Attention [PDF] [Video]
    Ramon Sanabria, Hao Tang and Sharon Goldwater
  • Learning Data Augmentation Schedules for Natural Language Processing [PDF] [Video]
    Daphné Chopard, Matthias S. Treder and Irena Spasić
  • Investigating the Effect of Natural Language Explanations on Out-of-Distribution Generalization in Fewshot NLI [PDF] [Video]
    Yangqiaoyu Zhou and Chenhao Tan
  • Active Learning for Argument Strength Estimation [PDF] [Video]
    Nataliia Kees, Michael Fromm, Evgeniy Faerman and Thomas Seidl
  • Temporal Adaptation of BERT and Performance on Downstream Document Classification: Insights from Social Media [PDF] [Video]
    Paul Röttger and Janet B. Pierrehumbert
  • Does Vision-and-Language Pretraining Improve Lexical Grounding? [PDF] [Video]
    Tian Yun, Chen Sun and Ellie Pavlick
  • Recurrent Attention for the Transformer [PDF] [Video]
    Jan Rosendahl, Christian Herold, Frithjof Petrick and Hermann Ney

11:15–11:30 Social break / coffee time

11:30–12:30 Invited talk: Zachary Lipton (Carnegie Mellon University) [Video Part 1] [Video Part 2]

When Cute Stories Belie a Messy Reality

This talk will survey a broad set of negative results from our research over the last few years at ACMI lab, spanning (i) Reading Comprehension; (ii) Active Learning; (iii) Pretraining; (iv) Model Interpretation; (v) Knowledge Distillation; (vi) Domain-Adversarial Neural Networks; and (vii) Adversarial Question Answering. Some should have been seen from a mile away, while others were genuinely surprising. The talk will touch upon some common themes, including (a) the blind spots that emerge when storytelling runs ahead of (or away from) the scientific process; (b) what constitutes a negative result worth publishing; and (c) positive outcomes from negative findings.

12:30–13:00 Thematic session 1 (oral presentations)

  • Two Heads are Better than One? Verification of Ensemble Effect in Neural Machine Translation [PDF] [Video]
    Chanjun Park, Sungjin Park, Seolhwa Lee, Taesun Whang and Heuiseok Lim
  • The Highs and Lows of Simple Lexical Domain Adaptation Approaches for Neural Machine Translation [PDF] [Video]
    Nikolay Bogoychev and Pinzhen Chen
  • Backtranslation in Neural Morphological Inflection [PDF] [Video]
    Ling Liu and Mans Hulden

13:00–14:00 Lunch break

14:00–15:00 Invited talk: Rachael Tatman (Rasa) [Video]

Chatbots can be good: What we learn from unhappy users

It’s no secret that chatbots have a bad reputation: no one enjoys a cyclical, frustrating conversation when all you need is a quick answer to an urgent question. But chatbots can, in fact, be good. Having bad conversations can help us get there before they’re ever deployed. This talk will draw on both academic and industry knowledge to discuss problems like:

  • What do users’ reactions to unsuccessful systems tell us about what successful systems should look like?
  • Are we evaluating the right things… or the easy-to-measure things?
  • Do we really have to look at user data? If so, when and how often?
  • When, if ever, should we retire old methods?

15:00–16:15 Poster session 2

  • Corrected CBOW Performs as well as Skip-gram [PDF] [Video]
    Ozan İrsoy, Adrian Benton and Karl Stratos
  • An Investigation into the Contribution of Locally Aggregated Descriptors to Figurative Language Identification [PDF] [Video]
    Sina Mahdipour Saravani, Ritwik Banerjee and Indrakshi Ray
  • Blindness to Modality Helps Entailment Graph Mining [PDF] [Video]
    Liane Guillou, Sander Bijl de Vroe, Mark Johnson and Mark Steedman
  • Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics [PDF] [Video]
    Prajjwal Bhargava, Aleksandr Drozd and Anna Rogers
  • Challenging the Semi-Supervised VAE Framework for Text Classification [PDF] [Video]
    Ghazi Felhi, Joseph Le Roux and Djamé Seddah
  • Investigating Numeracy of a Text-to-Text Transfer Model [PDF] [Video]
    Kuntal Kumar Pal and Chitta Baral
  • Do We Know What We Don’t Know? Studying Unanswerable Questions beyond SQuAD 2.0 [PDF] [Video]
    Elior Sulem, Jamaal Hay and Dan Roth

16:15–16:30 Social break / coffee time

16:30–17:00 Thematic session 2 (oral presentations)

  • BERT Cannot Align Characters [PDF] [Video]
    Antonis Maronikolakis, Philipp Dufter and Hinrich Schütze
  • Are BERTs Sensitive to Native Interference in L2 Production? [PDF] [Video]
    Zixin Tang, Prasenjit Mitra and David Reitter
  • Is BERT a Cross-Disciplinary Knowledge Learner? A Surprising Finding of Pre-trained Models’ Transferability [PDF] [Video]
    Wei-Tsung Kao and Hung-yi Lee

17:00–18:00 Invited talk: Noah Smith (University of Washington / Allen Institute for AI) [Video]

What Makes a Result Negative?

In this talk, I’ll discuss the frame of “negative results” that is used to describe outcomes in the research process, specifically in modern natural language processing. I’ll link this frame to some assumptions that I think have been mostly harmful to research and researchers. I’ll argue for a few first principles that can help us to design research projects in such a way that useful new information is likely to emerge, no matter what the experiments show. Unfortunately, I can’t offer a foolproof method for avoiding “negative results,” but I do hope to move our field’s discourse to be better aligned with its broader goals, and offer some reminders about the incredible variety of ways to contribute to those goals. Though this talk won’t spend much time highlighting the research findings of my mentees and collaborators, and the views expressed should only be taken as my own (not theirs), the worldview I discuss has developed through interactions with them, for which I am grateful.

18:00–18:15 Closing remarks