TY - UNPB
T1 - State-of-the-art generalisation research in NLP
T2 - a taxonomy and review
AU - Hupkes, Dieuwke
AU - Giulianelli, Mario
AU - Dankers, Verna
AU - Artetxe, Mikel
AU - Elazar, Yanai
AU - Pimentel, Tiago
AU - Christodoulopoulos, Christos
AU - Lasri, Karim
AU - Saphra, Naomi
AU - Sinclair, Arabella
AU - Ulmer, Dennis
AU - Schottmann, Florian
AU - Batsuren, Khuyagbaatar
AU - Sun, Kaiser
AU - Sinha, Koustuv
AU - Khalatbari, Leila
AU - Ryskina, Maria
AU - Frieske, Rita
AU - Cotterell, Ryan
AU - Jin, Zhijing
N1 - We thank Adina Williams, Armand Joulin, Elia Bruni, Lucas Weber, Robert Kirk and Sebastian Riedel for providing us feedback on various stages of this draft, and Gary Marcus for providing detailed feedback on the final draft of this paper. We thank Elte Hupkes for making the app that allows searching through references, and we thank Daniel Haziza and Ece Takmaz for other contributions to the website.
PY - 2023/1/9
Y1 - 2023/1/9
AB - The ability to generalise well is one of the primary desiderata of natural language processing (NLP). Yet, what 'good generalisation' entails and how it should be evaluated is not well understood, nor are there any common standards to evaluate it. In this paper, we aim to lay the groundwork to improve both of these issues. We present a taxonomy for characterising and understanding generalisation research in NLP, we use that taxonomy to present a comprehensive map of published generalisation studies, and we make recommendations for which areas might deserve attention in the future. Our taxonomy is based on an extensive literature review of generalisation research, and contains five axes along which studies can differ: their main motivation, the type of generalisation they aim to solve, the type of data shift they consider, the source by which this data shift is obtained, and the locus of the shift within the modelling pipeline. We use our taxonomy to classify over 400 previous papers that test generalisation, for a total of more than 600 individual experiments. Considering the results of this review, we present an in-depth analysis of the current state of generalisation research in NLP, and make recommendations for the future. Along with this paper, we release a webpage where the results of our review can be dynamically explored, and which we intend to update as new NLP generalisation studies are published. With this work, we aim to make steps towards making state-of-the-art generalisation testing the new status quo in NLP.
U2 - 10.48550/arXiv.2210.03050
DO - 10.48550/arXiv.2210.03050
M3 - Preprint
BT - State-of-the-art generalisation research in NLP
PB - ArXiv
ER -