In any planning problem, we have a set of predicates which describe the state of the system, and a set of actions which affect the state of the system (making some predicates true or false). Some actions may be compound actions: they represent sequences of simpler actions. Actions also have preconditions (predicates which must be true in order for the action to apply) and effects (predicates which become true or false when the action is performed).
We can represent a plan by a tree, in which a goal dominates an action which achieves that goal, and an action dominates its precondition goals and (if it is a compound action) its constituent actions.
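The tree representation can be made concrete with a small sketch. This is only an illustration under stated assumptions: the action names, predicates, and the tiny LIBRARY below are invented for the example (loosely based on a train trip), and the builder simply takes the first action whose effects include the goal.

```python
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    preconditions: tuple = ()   # predicates that must hold before the action applies
    effects: tuple = ()         # predicates made true by the action
    steps: tuple = ()           # names of constituent actions (compound actions only)

# Hypothetical action library, invented for illustration.
LIBRARY = {
    a.name: a
    for a in [
        Action("take_train_trip", preconditions=("has_ticket",),
               effects=("in_rochester",), steps=("board_train", "ride_train")),
        Action("buy_ticket", preconditions=("at_station",), effects=("has_ticket",)),
        Action("go_to_station", effects=("at_station",)),
        Action("board_train", effects=("on_train",)),
        Action("ride_train"),
    ]
}

def goal_tree(goal):
    """A goal node dominates the first action found that achieves the goal."""
    for action in LIBRARY.values():
        if goal in action.effects:
            return {"goal": goal, "action": action_tree(action)}
    return {"goal": goal, "action": None}   # no known action: leaf goal

def action_tree(action):
    """An action node dominates its precondition goals and constituent actions."""
    return {
        "name": action.name,
        "subgoals": [goal_tree(p) for p in action.preconditions],
        "steps": [action_tree(LIBRARY[s]) for s in action.steps],
    }

tree = goal_tree("in_rochester")
```

Here the root goal in_rochester dominates take_train_trip, which in turn dominates the precondition goal has_ticket and the constituent actions board_train and ride_train.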
Given a discourse, we ideally seek to create a plan (tree) in which the root (initial goal) is either an explicitly stated goal or is known to be a "plausible goal", and in which each sentence of the discourse can be tied to some action in the goal tree. (In reality we will normally not be able to connect all assertions in the discourse to the plan, but will prefer analyses which have the maximal connection to a plan.)
Trains domain (Allen, p. 484):
"Jack needed to be in Rochester by noon. He bought a ticket at the station."
"Sue bought a ticket to Rochester. She boarded the train at 4PM."
Equipment failure reports.
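The preference for analyses with maximal connection to a plan can be sketched roughly as follows. The sketch assumes that each sentence has already been mapped to an action name (the hard part, not shown), and the tables ACHIEVED_BY, PRECONDITIONS, and STEPS are hypothetical stand-ins for a real plan library.

```python
# Hypothetical plan library: goal -> achieving action, action -> precondition
# goals, and compound action -> constituent actions.
ACHIEVED_BY = {
    "in_rochester": "take_train_trip",
    "has_ticket": "buy_ticket",
    "at_station": "go_to_station",
    "has_book": "buy_book",
    "at_bookstore": "go_to_bookstore",
}
PRECONDITIONS = {
    "take_train_trip": ["has_ticket"],
    "buy_ticket": ["at_station"],
    "buy_book": ["at_bookstore"],
}
STEPS = {"take_train_trip": ["board_train", "ride_train"]}

def plan_actions(goal):
    """All actions appearing anywhere in the plan tree rooted at this goal."""
    action = ACHIEVED_BY.get(goal)
    return set() if action is None else action_closure(action)

def action_closure(action):
    found = {action}
    for subgoal in PRECONDITIONS.get(action, []):
        found |= plan_actions(subgoal)
    for step in STEPS.get(action, []):
        found |= action_closure(step)
    return found

def best_goal(candidate_goals, observed_actions):
    """Prefer the candidate goal whose plan connects to the most observations."""
    observed = set(observed_actions)
    return max(candidate_goals, key=lambda g: len(plan_actions(g) & observed))

# "Sue bought a ticket to Rochester. She boarded the train at 4PM."
observed = ["buy_ticket", "board_train"]
chosen = best_goal(["has_book", "in_rochester"], observed)
```

With these assumed tables, both observed actions fall inside the plan for in_rochester and neither falls inside the plan for has_book, so in_rochester is the preferred root goal.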
(The other simple organization is pure "system initiative", where the system asks the questions and only accepts direct answers … a fancy menu system.)
For mixed-initiative systems, we need to maintain some explicit representation of dialog goals in a goal stack. There will be both system (task) goals and user goals.
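A goal stack of this kind might be sketched as below. The Goal and GoalStack names and the policy shown (a user question is pushed on top of the pending system goal and popped once answered) are assumptions made for illustration, not a description of any particular system.

```python
from dataclasses import dataclass

@dataclass
class Goal:
    owner: str     # "system" (task goal) or "user"
    content: str   # what the goal is about

class GoalStack:
    def __init__(self):
        self._goals = []

    def push(self, goal):
        self._goals.append(goal)

    def pop(self):
        return self._goals.pop()

    def top(self):
        """The goal currently being pursued, or None if the stack is empty."""
        return self._goals[-1] if self._goals else None

stack = GoalStack()
stack.push(Goal("system", "get destination"))      # system wants information
stack.push(Goal("user", "ask about train times"))  # user takes the initiative
current = stack.top()      # the user goal is handled first
stack.pop()                # user goal satisfied
resumed = stack.top()      # dialog returns to the pending system goal
```

Keeping both kinds of goal on one stack lets the dialog return to the interrupted system goal once the user's goal has been dealt with.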
For example, in an information gathering task, the system will have goals corresponding to the information it needs to gather (e.g., slots to fill in a form). If the top goal is such a system goal, the system will ask a question (to fill one slot in the form). The input will be analyzed with respect to this top goal:
Simple dialog systems process questions at "face value". This can lead to "stonewalling" behavior such as (from Waltz):
For example, in the stonewalling above, the literal interpretation does not correspond to a plausible goal, so a cooperative system would seek a more likely goal (actually getting the summaries) and respond to that.
If you ask a train agent "When does the train to Jamaica leave?" and
he answers "3:15, track 27", it’s because he inferred that you wanted to
get on that train and therefore needed to know where as well as when it
leaves.
Such interpretation, however, requires a very deep modeling of the
domain, which is feasible only for very narrow domains. More
practical are "cue-based" systems which explicitly encode rules or
features for identifying the intended communicative act (for example,
that "Can you X" is a request for the system to do X).
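A cue-based classifier of this sort could look roughly like the following sketch. The rule set and the act labels ("request", "wh_question", "statement") are invented for the example; a practical system would need many more cues, or features learned from data.

```python
import re

# Hypothetical surface cues, checked in order; the first match wins.
CUE_RULES = [
    (re.compile(r"^can you\b", re.IGNORECASE), "request"),
    (re.compile(r"^(when|where|what|who|which|how)\b", re.IGNORECASE), "wh_question"),
]

def classify(utterance):
    """Return the communicative act suggested by the first matching cue."""
    text = utterance.strip()
    for pattern, act in CUE_RULES:
        if pattern.search(text):
            return act
    return "statement"   # default when no cue fires

acts = [
    classify("Can you list the summaries?"),
    classify("When does the train to Jamaica leave?"),
    classify("Jack bought a ticket."),
]
```

Note that "Can you X" is labeled a request rather than a yes/no question about ability, which is exactly the kind of non-literal reading the cue is meant to capture.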