Choosing Words in Computer-Generated Weather Forecasts

Ehud Baruch Reiter; Gowri Somayajulu Sripada; James Ritchie Wallace Hunter; J. Yu; I. Davy

doi:10.1016/j.artint.2005.06.006

Computing Science

Research output: Contribution to journal › Article › peer-review

213 Citations (Scopus)

Abstract

One of the main challenges in automatically generating textual weather forecasts is choosing appropriate English words to communicate numeric weather data. A corpus-based analysis of how humans write forecasts showed that there were major differences in how individual writers performed this task, that is, in how they translated data into words. These differences included both different preferences between potential near-synonyms that could be used to express information, and also differences in the meanings that individual writers associated with specific words. Because we thought these differences could confuse readers, we built our SUMTIME-MOUSAM weather-forecast generator to use consistent data-to-word rules, which avoided words which were only used by a few people, and words which were interpreted differently by different people. An evaluation by forecast users suggested that they preferred SUMTIME-MOUSAM's texts to human-generated texts, in part because of better word choice; this may be the first time that an evaluation has shown that NLG texts are better than human-authored texts. (c) 2005 Elsevier B.V. All rights reserved.

Original language: English
Pages (from-to): 137-169
Number of pages: 32
Journal: Artificial Intelligence
Volume: 167
DOIs: https://doi.org/10.1016/j.artint.2005.06.006
Publication status: Published - Sep 2005

Keywords

natural language processing
natural language generation
language and the word
information presentation
weather forecasts
lexical choice
idiolect
SEMANTICS

Impacts

Data2Text
Ehud Reiter (Coordinator), Gowri Sripada (Coordinator), Jim Hunter (Coordinator) & Ross John Turner (Coordinator)
Impact
Mainstream communication of big data using natural language generation (NLG)
Ehud Reiter (Coordinator) & Gowri Sripada (Coordinator)
Impact: Economic and/or Commercial
Promoting the Public Understanding of Artificial Intelligence
Graeme Ritchie (Coordinator), Kees Jacobus Van Deemter (Coordinator), Ehud Reiter (Coordinator) & Gowri Sripada (Coordinator)
Impact

Cite this

@article{8c15dbe7c0f746fa8649091903ee57b7,

title = "Choosing Words in Computer-Generated Weather Forecasts",

abstract = "One of the main challenges in automatically generating textual weather forecasts is choosing appropriate English words to communicate numeric weather data. A corpus-based analysis of how humans write forecasts showed that there were major differences in how individual writers performed this task, that is, in how they translated data into words. These differences included both different preferences between potential near-synonyms that could be used to express information, and also differences in the meanings that individual writers associated with specific words. Because we thought these differences could confuse readers, we built our SUMTIME-MOUSAM weather-forecast generator to use consistent data-to-word rules, which avoided words which were only used by a few people, and words which were interpreted differently by different people. An evaluation by forecast users suggested that they preferred SUMTIME-MOUSAM's texts to human-generated texts, in part because of better word choice; this may be the first time that an evaluation has shown that NLG texts are better than human-authored texts. (c) 2005 Elsevier B.V. All rights reserved.",

keywords = "natural language processing, natural language generation, language and the word, information presentation, weather forecasts, lexical choice, idiolect, SEMANTICS",

author = "Reiter, {Ehud Baruch} and Sripada, {Gowri Somayajulu} and Hunter, {James Ritchie Wallace} and Yu, J. and Davy, I.",

year = "2005",

month = sep,

doi = "10.1016/j.artint.2005.06.006",

language = "English",

volume = "167",

pages = "137--169",

journal = "Artificial Intelligence",

issn = "0004-3702",

publisher = "Elsevier",

}

TY - JOUR

T1 - Choosing Words in Computer-Generated Weather Forecasts

AU - Reiter, Ehud Baruch

AU - Sripada, Gowri Somayajulu

AU - Hunter, James Ritchie Wallace

AU - Yu, J.

AU - Davy, I.

PY - 2005/9

Y1 - 2005/9

N2 - One of the main challenges in automatically generating textual weather forecasts is choosing appropriate English words to communicate numeric weather data. A corpus-based analysis of how humans write forecasts showed that there were major differences in how individual writers performed this task, that is, in how they translated data into words. These differences included both different preferences between potential near-synonyms that could be used to express information, and also differences in the meanings that individual writers associated with specific words. Because we thought these differences could confuse readers, we built our SUMTIME-MOUSAM weather-forecast generator to use consistent data-to-word rules, which avoided words which were only used by a few people, and words which were interpreted differently by different people. An evaluation by forecast users suggested that they preferred SUMTIME-MOUSAM's texts to human-generated texts, in part because of better word choice; this may be the first time that an evaluation has shown that NLG texts are better than human-authored texts. (c) 2005 Elsevier B.V. All rights reserved.

AB - One of the main challenges in automatically generating textual weather forecasts is choosing appropriate English words to communicate numeric weather data. A corpus-based analysis of how humans write forecasts showed that there were major differences in how individual writers performed this task, that is, in how they translated data into words. These differences included both different preferences between potential near-synonyms that could be used to express information, and also differences in the meanings that individual writers associated with specific words. Because we thought these differences could confuse readers, we built our SUMTIME-MOUSAM weather-forecast generator to use consistent data-to-word rules, which avoided words which were only used by a few people, and words which were interpreted differently by different people. An evaluation by forecast users suggested that they preferred SUMTIME-MOUSAM's texts to human-generated texts, in part because of better word choice; this may be the first time that an evaluation has shown that NLG texts are better than human-authored texts. (c) 2005 Elsevier B.V. All rights reserved.

KW - natural language processing

KW - natural language generation

KW - language and the word

KW - information presentation

KW - weather forecasts

KW - lexical choice

KW - idiolect

KW - SEMANTICS

U2 - 10.1016/j.artint.2005.06.006

DO - 10.1016/j.artint.2005.06.006

M3 - Article

SN - 0004-3702

VL - 167

SP - 137

EP - 169

JO - Artificial Intelligence

JF - Artificial Intelligence

ER -
