A taxonomy and review of generalization research in NLP

Dieuwke Hupkes*, Mario Giulianelli*, Verna Dankers*, Mikel Artetxe, Yanai Elazar, Tiago Pimentel, Christos Christodoulopoulos, Karim Lasri, Naomi Saphra, Arabella Sinclair, Dennis Ulmer, Florian Schottmann, Khuyagbaatar Batsuren, Kaiser Sun, Koustuv Sinha, Leila Khalatbari, Maria Ryskina, Rita Frieske, Ryan Cotterell, Zhijing Jin

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

12 Citations (Scopus)
2 Downloads (Pure)

Abstract

The ability to generalize well is one of the primary desiderata for models of natural language processing (NLP), but what ‘good generalization’ entails and how it should be evaluated are not well understood. In this Analysis we present a taxonomy for characterizing and understanding generalization research in NLP. The proposed taxonomy is based on an extensive literature review and contains five axes along which generalization studies can differ: their main motivation, the type of generalization they aim to solve, the type of data shift they consider, the source by which this data shift originated, and the locus of the shift within the NLP modelling pipeline. We use our taxonomy to classify over 700 experiments, and we use the results to present an in-depth analysis that maps out the current state of generalization research in NLP and to make recommendations for which areas deserve attention in the future.
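As a rough illustration of how the five axes could be used to annotate and tally experiments, the sketch below models one annotated experiment as a small record and counts experiments per value on a chosen axis. The class name, field names, helper function and all example values are hypothetical placeholders introduced here for illustration; they are not the authors' annotation schema or the column names of the released data.

```python
from collections import Counter
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ExperimentAnnotation:
    """One experiment, annotated along the five axes named in the abstract.

    Field names are assumptions made for this sketch.
    """
    motivation: str            # main motivation of the study
    generalization_type: str   # type of generalization it aims to solve
    shift_type: str            # type of data shift considered
    shift_source: str          # source by which the data shift originated
    shift_locus: str           # locus of the shift in the modelling pipeline


def count_by_axis(annotations, axis):
    """Count how many experiments take each value on one axis."""
    valid = {f.name for f in fields(ExperimentAnnotation)}
    if axis not in valid:
        raise ValueError(f"unknown axis: {axis!r}")
    return Counter(getattr(a, axis) for a in annotations)


# Placeholder values for illustration only; not taken from the survey data.
examples = [
    ExperimentAnnotation("motivation-A", "type-X", "shift-1", "source-1", "locus-1"),
    ExperimentAnnotation("motivation-A", "type-Y", "shift-1", "source-2", "locus-1"),
    ExperimentAnnotation("motivation-B", "type-X", "shift-2", "source-1", "locus-2"),
]

motivation_counts = count_by_axis(examples, "motivation")
```

Keeping each axis as a separate field means a tally such as `count_by_axis(examples, "shift_locus")` can be produced for any axis without changing the record type.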

Original language: English
Pages (from-to): 1161-1174
Number of pages: 14
Journal: Nature Machine Intelligence
Volume: 5
Issue number: 10
Early online date: 19 Oct 2023
DOI: https://doi.org/10.1038/s42256-023-00729-y
Publication status: Published - 19 Oct 2023

Bibliographical note

Funding Information:
We thank A. Williams, A. Joulin, E. Bruni, L. Weber, R. Kirk and S. Riedel for providing feedback on the various stages of this paper, and G. Marcus for providing detailed feedback on the final draft. We also thank the reviewers of our work for providing useful comments. We thank E. Hupkes for making the app that allows searching through references, and we thank D. Haziza and E. Takmaz for other contributions to the website. M.G. was supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 819455). V.D. was supported by the UKRI Centre for Doctoral Training in Natural Language Processing, funded by the UKRI (grant no. EP/S022481/1) and the University of Edinburgh. N.S. was supported by the Hyundai Motor Company (under the project Uncertainty in Neural Sequence Modeling) and the Samsung Advanced Institute of Technology (under the project Next Generation Deep Learning: From Pattern Recognition to AI).

Publisher Copyright:
© 2023, The Author(s).

Data Availability Statement

The full annotated list of articles included in our survey is available through the GenBench website (https://genbench.org/references), where articles can be filtered through a dedicated search tool. This is an evolving survey: we encourage authors to submit new work and to request annotation corrections through our contributions page (https://genbench.org/contribute). The exact list used at the time of writing can be retrieved from https://github.com/GenBench/GenBench.github.io/blob/cea0bd6bd8af6f2d0f096c8f81185b1dfc9303b5/taxonomy_clean.tsv. We also release interactive tools to visualize the results of our survey at https://genbench.org/visualisation. Source data are provided with this paper.
Source data:
https://static-content.springer.com/esm/art%3A10.1038%2Fs42256-023-00729-y/MediaObjects/42256_2023_729_MOESM3_ESM.csv
