AnaLog: Testing Analytical and Deductive Logic Learnability in Language Models

Samuel Ryb, Mario Giulianelli, Arabella Sinclair, Raquel Fernández

Research output: Chapter in Book/Report/Conference proceeding › Published conference contribution

1 Citation (Scopus)

Abstract

We investigate the extent to which pre-trained language models acquire analytical and deductive logical reasoning capabilities as a side effect of learning word prediction. We present AnaLog, a natural language inference task designed to probe models for these capabilities, controlling for different invalid heuristics the models may adopt instead of learning the desired generalisations. We test four language models on AnaLog, finding that they have all learned, to a different extent, to encode information that is predictive of entailment beyond shallow heuristics such as lexical overlap and grammaticality. We closely analyse the best performing language model and show that while it performs more consistently than other language models across logical connectives and reasoning domains, it is still sensitive to lexical and syntactic variations in the realisation of logical statements.
Original language: English
Title of host publication: Proceedings of the 11th Joint Conference on Lexical and Computational Semantics
Publisher: Association for Computational Linguistics
Pages: 55-68
Number of pages: 14
ISBN (Electronic): 978-1-955917-98-8
DOIs
Publication status: Published - 14 Jul 2022
Event: The 11th Joint Conference on Lexical and Computational Semantics - Seattle, United States
Duration: 14 Jul 2022 - 15 Jul 2022
https://sites.google.com/view/starsem2022/

Conference

Conference: The 11th Joint Conference on Lexical and Computational Semantics
Abbreviated title: *SEM 2022
Country/Territory: United States
City: Seattle
Period: 14/07/22 - 15/07/22
Internet address: https://sites.google.com/view/starsem2022/

Bibliographical note

Acknowledgements
We would like to thank the anonymous ARR and *SEM 2022 reviewers for their feedback and suggestions, as well as Ece Takmaz for her comments. Samuel Ryb and Arabella Sinclair worked on this project while affiliated with the University of Amsterdam. The project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 819455).

1 The dataset is available at https://github.com/dmg-illc/analog
