Abstract
To be trusted and perceived as natural and coherent, conversational systems must adapt to the language of their users. While personalized dialogue is a promising direction, controlling generation for fine-grained language features remains a challenge in this approach. A recent line of research has shown the effectiveness of leveraging pre-trained language models to adapt to a text's topic or sentiment. In this study, we build on these approaches and focus on a higher-level dimension of language variation: speakers' age. We frame the task as dialogue response generation, and test methods based on bag-of-words (BoW) and neural discriminators (Disc) to condition the output of GPT-2 and DialoGPT without altering the parameters of the language models. We show that Disc models achieve a higher degree of detectable control than BoW models based on automatic evaluation. In contrast, humans can partially detect age differences in BoW but not Disc responses. Since humans judge BoW responses to be better than Disc ones, simple controllable methods appear to offer a better tradeoff between adaptation and language quality. Our work confirms the challenges of adapting to higher-level dimensions of language variation. Moreover, it highlights the need to evaluate natural language generation thoroughly.
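To illustrate the core idea of BoW-style conditioning without touching the language model's parameters, the toy sketch below reweights a model's next-token distribution toward an attribute word list at decoding time. This is a deliberately minimal illustration, not the paper's implementation: the vocabulary, logits, and "younger speaker" word bag are hypothetical, and the fixed additive bias stands in for more sophisticated attribute-model scoring.

```python
import math

def bow_biased_sample(logits, vocab, bag, bias=3.0):
    """Boost the logits of bag-of-words tokens, then pick the argmax.

    Toy illustration of BoW-conditioned decoding: the language model's
    parameters are untouched; only its output distribution is reweighted
    toward the attribute word list.
    """
    biased = [
        logit + (bias if token in bag else 0.0)
        for token, logit in zip(vocab, logits)
    ]
    # Softmax over the biased logits (for inspection; greedy decoding
    # only needs the argmax).
    m = max(biased)
    probs = [math.exp(x - m) for x in biased]
    z = sum(probs)
    probs = [p / z for p in probs]
    best = max(range(len(vocab)), key=lambda i: biased[i])
    return vocab[best], probs

# Hypothetical next-token logits from a language model.
vocab = ["the", "awesome", "literally", "indeed", "perhaps"]
logits = [2.0, 1.0, 0.5, 1.2, 0.8]
bag = {"literally", "awesome"}  # toy "younger speaker" word bag
token, probs = bow_biased_sample(logits, vocab, bag)
```

With the bias applied, a bag word overtakes the otherwise most likely token, which is the basic mechanism a BoW attribute model exploits during generation.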
Original language | English |
---|---|
Title of host publication | Proceedings of the 2nd Workshop on Natural Language Generation, Evaluation, and Metrics (GEM) |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 172-188 |
Number of pages | 17 |
ISBN (Electronic) | 9781959429128 |
DOIs | |
Publication status | Published - 8 Dec 2022 |
Event | 2nd Workshop on Natural Language Generation, Evaluation, and Metrics, GEM 2022, as part of EMNLP 2022 - Abu Dhabi, United Arab Emirates. Duration: 7 Dec 2022 → 7 Dec 2022 |
Publication series
Name | ACL Anthology |
---|---|
Publisher | Association for Computational Linguistics |
Conference
Conference | 2nd Workshop on Natural Language Generation, Evaluation, and Metrics, GEM 2022, as part of EMNLP 2022 |
---|---|
Country/Territory | United Arab Emirates |
City | Abu Dhabi |
Period | 7/12/22 → 7/12/22 |
Bibliographical note
Funding Information: We would like to thank the four anonymous GEM reviewers for their valuable feedback and the participants of our crowdsourcing experiments. The work received funding from the University of Amsterdam’s Research Priority Area Human(e) AI and from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 819455).