Abstract
Referring expression generation has recently been the subject of the first Shared Task Challenge in NLG. In this paper, we analyse the systems that participated in the Challenge in terms of their algorithmic properties, comparing new techniques to classic ones, based on results from a new human task-performance experiment and from the intrinsic measures that were used in the Challenge. We also consider the relationship between different evaluation methods, showing that extrinsic task-performance experiments and intrinsic evaluation methods yield results that are not significantly correlated. We argue that this highlights the importance of including extrinsic evaluation methods in comparative NLG evaluations.
| Original language | English |
|---|---|
| Title of host publication | INLG 2008 |
| Subtitle of host publication | Proceedings of the Fifth International Natural Language Generation Conference |
| Publisher | Association for Computational Linguistics |
| Pages | 50-58 |
| Number of pages | 9 |
| DOIs | |
| Publication status | Published - 12 Jun 2008 |
| Event | 5th International Natural Language Generation Conference, INLG 2008 - Salt Fork, OH, United States |
| Duration | 12 Jun 2008 → 14 Jun 2008 |
Conference
| Conference | 5th International Natural Language Generation Conference, INLG 2008 |
|---|---|
| Country/Territory | United States |
| City | Salt Fork, OH |
| Period | 12/06/08 → 14/06/08 |