Generation

 

May 20, 2002

Simple approaches

"Canned text" and "template filling" are appropriate for some situations, and may even be psychologically realistic up to a point. However, these approaches have limitations: they are relatively inflexible, and all content has to be anticipated ahead of time.

They also tend to work best where the output is brief; "multisentential" output requires attention to text structure, anaphora, and so on.
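As a minimal sketch of template filling, the following uses Python's standard `string.Template`. The weather template and its slot names (`city`, `high`) are invented for illustration. Note how the inflexibility shows up: any slot that was not anticipated ahead of time raises an error.

```python
from string import Template

# Hypothetical template; the wording and slot names are illustrative only.
WEATHER = Template("The high in $city today will be $high degrees.")

def fill(template: Template, slots: dict) -> str:
    """Fill every slot of the template; raises KeyError if a slot is missing."""
    return template.substitute(slots)

report = fill(WEATHER, {"city": "Boulder", "high": 72})
print(report)
```

Everything outside the two slots is fixed text, which is why the approach suits short, predictable output.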

A "default" architecture

· The planner handles content selection and discourse structure. Its inputs are a "communicative goal", a "knowledge base" and possibly other components; its output is a "discourse specification".
· The realizer handles construction selection, referring expressions, lexical selection, etc.; its output is natural language sentences.
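The two-stage pipeline can be sketched as follows. This is a toy illustration, not any particular system: the `Fact` type, the tiny knowledge base, and the idea that the communicative goal is just a topic string are all assumptions made to keep the example small.

```python
from typing import NamedTuple

class Fact(NamedTuple):
    subject: str
    verb: str   # already inflected, to keep the sketch small
    obj: str

def plan(goal_topic: str, kb: list[Fact]) -> list[Fact]:
    """Planner: select content relevant to the goal and fix its order.
    The returned list plays the role of the discourse specification."""
    return [f for f in kb if f.subject == goal_topic]

def realize(spec: list[Fact]) -> list[str]:
    """Realizer: map each message in the specification to a sentence."""
    return [f"{f.subject.capitalize()} {f.verb} {f.obj}." for f in spec]

# Hypothetical knowledge base.
kb = [
    Fact("the printer", "needs", "new toner"),
    Fact("the scanner", "is", "offline"),
    Fact("the printer", "is", "in room 12"),
]
sentences = realize(plan("the printer", kb))
print(sentences)
```

Here the realizer sees only the discourse specification, which is exactly the strict modularity this architecture assumes.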

The line between planning and realization is blurry, and different systems draw it in different places. It's not even apparent that this modular approach is correct. In particular, the "realizer" may need access to the same information sources as the "planner". Consider:

· Pronominalization: this is part of generating referring expressions, but the choice between a pronoun and a full NP may depend on the user model and/or the inference engine.
· Nominalization: whether to realize a proposition as a clause or an NP is a discourse planning issue. But in English, not all verbs have a morphological nominalization. The availability of syntactic and/or lexical resources in the realizer may thus impact planning decisions.
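The nominalization case can be sketched as a planning decision that has to consult the realizer's lexicon. The lexicon fragment and the function name `choose_form` are hypothetical; the point is only that the planner's preference is honoured when the lexical resource exists.

```python
# Hypothetical fragment of the realizer's lexicon: verb -> nominalization.
NOMINALIZATIONS = {"destroy": "destruction", "arrive": "arrival"}

def choose_form(verb: str, prefer_np: bool) -> str:
    """Return "np" or "clause" for realizing a proposition headed by `verb`.
    The planner may prefer an NP, but only gets one if the lexicon
    lists a nominalization for the verb; otherwise it falls back to a clause."""
    if prefer_np and verb in NOMINALIZATIONS:
        return "np"
    return "clause"

print(choose_form("destroy", prefer_np=True))  # lexicon has an entry
print(choose_form("seem", prefer_np=True))     # no entry in this toy lexicon
```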

Systemic Functional Grammar (Halliday)

The implementation of SFG described in this section is that of Nigel, a large (the largest?) text generation grammar, developed at USC/ISI starting in the early 80s.

A grammar is a decision tree. Every branch specifies the functional conditions for the choice, and the surface consequences of the choice. For instance, an interrogative is used when the speaker is requesting information, and it results in subject-verb inversion.
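A single choice point of this kind can be sketched in a few lines. The function names and the `speech_function` labels are assumptions for illustration and are not taken from Nigel; the sketch only shows a functional condition selecting a feature, and that feature having a surface consequence.

```python
def choose_mood(speech_function: str) -> str:
    """Functional condition: requesting information selects interrogative."""
    if speech_function == "request_information":
        return "interrogative"
    return "declarative"

def order_clause(mood: str, subject: str, finite: str, rest: str) -> str:
    """Surface consequence: interrogative puts the finite verb before the subject."""
    if mood == "interrogative":
        return f"{finite.capitalize()} {subject} {rest}?"
    return f"{subject.capitalize()} {finite} {rest}."

question = order_clause(choose_mood("request_information"), "the train", "is", "leaving")
statement = order_clause(choose_mood("give_information"), "the train", "is", "leaving")
print(question)
print(statement)
```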

Halliday groups all choices into one of the following three "metafunctions":

· Interpersonal (realized e.g. by "mood", i.e. declarative/interrogative/imperative)
· Ideational (realized e.g. by "transitivity", i.e. argument structure)
· Textual (realized e.g. by "theme", i.e. information flow)

A systemic grammar is formalized as a "system network" (p. 771).

Sentence structure is formalized in a "box diagram" (p. 769). Note that this resembles a tree diagram in some respects, but in fact there may be completely different constituency on the different layers. Does it allow discontinuous constituents? (Yes, in some versions.)

Note however that all three metafunctions can permeate the entire grammar, and in some versions a particular form can realize aspects of various metafunctions.

Different approaches are possible to the problem of providing the grammar information needed to make a decision.

· Let the "planning" component provide a specification, and provide defaults for where the information is missing. The SPL ("Sentence Planning Language") statement on p. 772 exemplifies this approach.

· Let the grammar provide its own planning. Attach "experts" to the choice points to search the environment for the relevant information. For instance, provide the "theme" by examining the discourse history. This approach is less modular and requires a knowledge-rich environment.
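The contrast between the two approaches can be sketched as follows. The dictionaries are a stand-in for a specification and are not SPL syntax; the default values and the "most recently mentioned entity" heuristic used by the theme expert are assumptions made for the example.

```python
# Approach 1: the planner supplies a partial specification; defaults fill gaps.
DEFAULTS = {"mood": "declarative", "tense": "present"}

def from_specification(spec: dict) -> dict:
    """Merge a partial specification over the grammar's defaults."""
    return {**DEFAULTS, **spec}

# Approach 2: an "expert" attached to the theme choice point searches
# the environment (here, the discourse history) for the answer itself.
def theme_expert(discourse_history: list[str], candidates: list[str]) -> str:
    """Pick the most recently mentioned candidate; else the first candidate."""
    for entity in reversed(discourse_history):
        if entity in candidates:
            return entity
    return candidates[0]

features = from_specification({"mood": "interrogative"})
theme = theme_expert(["John", "the book"], ["Mary", "the book"])
print(features, theme)
```

The first function needs nothing but its input, while the second only works if a discourse history is available, which is the modularity trade-off noted above.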

In its current incarnation as part of the "Komet-Penman Multilingual Development Environment" (KPML), Penman (the generation system built around the Nigel grammar) takes SPL specifications as input. SPL, in conjunction with the "upper model" and a deduction system called KL-ONE, contains all the information necessary to traverse the system network.

Functional unification grammar (Martin Kay)

FUG specifies the grammar with a unification (feature) structure of the same kind we saw for parsing, except that choices are marked with ALT and curly brackets.

This can implement any theory of grammar you like; the earliest version for generation pretty much implemented a Systemic system network.

Input to the grammar takes the form of a "Functional Description" (FD), which is another feature structure. This then unifies with the grammar to fill in additional realization information.
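The unification step can be sketched with nested dictionaries standing in for feature structures. This is a simplified illustration, not Kay's formalism in full: alternatives are stored as a list under an `"ALT"` key, the first branch that unifies wins (no backtracking across several ALTs), and the toy grammar, its `pattern` feature, and the linearization step are assumptions.

```python
ALT = "ALT"  # key whose value is a list of alternative sub-grammars

def unify(fd, grammar):
    """Unify a functional description with a grammar.
    Returns the enriched FD, or None if unification fails."""
    if isinstance(fd, dict) and isinstance(grammar, dict):
        result = dict(fd)
        for key, value in grammar.items():
            if key == ALT:
                continue
            if key in result:
                merged = unify(result[key], value)
                if merged is None:
                    return None
                result[key] = merged
            else:
                result[key] = value  # grammar fills in missing information
        for branch in grammar.get(ALT, [None]):
            if branch is None:       # no alternatives at this level
                return result
            merged = unify(result, branch)
            if merged is not None:
                return merged
        return None                  # every alternative failed
    return fd if fd == grammar else None  # atomic values must match

# Toy grammar with one choice point.
GRAMMAR = {
    "cat": "s",
    ALT: [
        {"mood": "declarative", "pattern": ["subject", "verb", "object"]},
        {"mood": "interrogative", "pattern": ["aux", "subject", "verb", "object"]},
    ],
}

fd = {"cat": "s", "mood": "interrogative",
      "aux": "do", "subject": "you", "verb": "see", "object": "it"}
enriched = unify(fd, GRAMMAR)
words = " ".join(enriched[slot] for slot in enriched["pattern"])
print(words)
```

The input FD says nothing about word order; the ordering pattern is the "additional realization information" contributed by the grammar.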

Discourse Planning

Discourse plans may be:

· generated from a schema based on the structure of the input content (for instance, a series of time-ordered steps, or moves in a tic-tac-toe game)

· derived from a goal-oriented rhetorical planner, which works as follows:

1. Post a goal.
2. The goal fires an "operator" (if the constraints on the operator are satisfied).
3. The operator constructs a portion of an RST (Rhetorical Structure Theory) tree (nucleus + satellite).
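The goal/operator cycle of the rhetorical planner can be sketched as a small recursive expansion. The operator, goal names, and knowledge-base flag below are invented for illustration; a real planner would have many operators and richer constraints.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class RSTNode:
    relation: str
    nucleus: object    # RSTNode or a primitive goal (str)
    satellite: object

@dataclass
class Operator:
    effect: str                         # the goal this operator achieves
    constraint: Callable[[dict], bool]  # must hold for the operator to fire
    relation: str
    nucleus_goal: str
    satellite_goal: str

def expand(goal: str, kb: dict, operators: list):
    """Post a goal; fire the first applicable operator; build nucleus + satellite.
    Goals with no applicable operator are left as leaves (plain strings)."""
    for op in operators:
        if op.effect == goal and op.constraint(kb):
            return RSTNode(op.relation,
                           expand(op.nucleus_goal, kb, operators),
                           expand(op.satellite_goal, kb, operators))
    return goal

# Hypothetical operator: motivate a recommendation by mentioning a risk.
OPERATORS = [
    Operator("get-user-to-save", lambda kb: kb.get("unsaved_changes", False),
             "motivation", "recommend-save", "inform-data-loss-risk"),
]

tree = expand("get-user-to-save", {"unsaved_changes": True}, OPERATORS)
print(tree)
```

When the constraint fails, the goal is returned unexpanded, which corresponds to the operator not firing.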