This paper investigates the problem of content determination in natural language generation (NLG), using as an example the problem of determining what to say when asked "What is an A?", where A is a concept defined in an OWL ontology. It is shown that a naive approach to this problem, which simply presents a set of the stated axioms, will often inadvertently violate maxims of cooperative conversation. What is required instead is a kind of inference that generates logical conclusions of the axioms that are suitable for natural language presentation: natural language directed inference (NLDI). Although NLDI, in this case a kind of non-standard inference in description logics, is hard to formalise in general, for this problem we isolate a significant subproblem: that of enumerating subsumers of A that are suitable for natural language presentation. For this subproblem, which on the face of it appears intractable, we show how factors relevant to natural language presentation enable an optimised solution that is realistic in practice.
The paper contributes to the increasingly important practical problem of explaining concepts in an ontology. It also takes a first step towards the development of domain-independent principles for content determination. (c) 2008 Elsevier B.V. All rights reserved.
- natural language generation
- non-standard reasoning
- content determination