A computational treatment of the comparative

Candidate: Friedman, Carol


This thesis develops a computational treatment of the comparative in English that is general, efficient, and relatively easy to implement, without unduly complicating the natural language processing system. The treatment was implemented using the Proteus Question Answering System, which translates natural language questions into database queries. The comparative is a particularly difficult language structure to process, and at present only a few natural language systems handle it, and then only in limited ways. Yet the comparative is an essential component of language that occurs frequently in discourse. It is difficult to process because it corresponds to a remarkably diverse range of syntactic forms, such as coordinate and subordinate conjunctions and relative clauses, which are themselves complex and often contain missing elements. Semantically, the comparative is cross-categorical: adjectives, quantifiers, and adverbs can all carry the comparative feature. The semantics of the comparative must therefore be consistent with that of the different linguistic categories while retaining its own unique characteristics. The computational approach of this thesis is based on a language model that contains functionally independent syntactic, semantic, and pragmatic components. Although the comparative relates to all of these components, the syntactic component is the one mainly affected. The syntactic stage of processing analyzes and regularizes the comparative structures. The analysis process uses existing mechanisms that handle structures similar to the comparative. The regularization process transforms all the different comparative structures into one standard form consisting of a comparative operator and two complete clauses. This process consists of two phases: the first uses a compositional approach based on Montague-style translation rules.
The subsequent phase uses specialized procedures to complete the regularization process by expanding the comparative, filling in missing elements, and providing the appropriate quantified terms associated with the compared elements. After the comparative is regularized, the remaining stages of processing are largely unaffected. Each clause of the comparative is processed with the usual procedures, and only minor modifications are required specifically for the comparative.
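
The standard form described above (a comparative operator applied to two complete clauses, with missing elements filled in) can be illustrated with a minimal sketch. This is not the representation used in the thesis or in the Proteus system: the `Clause` and `Comparative` types, the `regularize` function, and the degree-variable names are all hypothetical, chosen only to show how an elided predicate in the second clause might be recovered from the first.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Clause:
    """Hypothetical clause record; not the thesis's actual representation."""
    subject: str
    predicate: Optional[str]  # None marks an element missing from the surface form
    degree_var: str           # assumed name for the quantified degree term


@dataclass(frozen=True)
class Comparative:
    """Standard form: one comparative operator over two complete clauses."""
    operator: str
    main: Clause
    than: Clause


def regularize(operator: str, main: Clause, than: Clause) -> Comparative:
    """Fill a missing predicate in the than-clause from the main clause."""
    if than.predicate is None:
        than = replace(than, predicate=main.predicate)
    return Comparative(operator, main, than)


# "John is taller than Mary": the than-clause omits the predicate "tall".
main_clause = Clause(subject="John", predicate="tall", degree_var="d1")
than_clause = Clause(subject="Mary", predicate=None, degree_var="d2")
result = regularize("more", main_clause, than_clause)
```

In this sketch both clauses of `result` are complete, so each could be passed to later processing stages unchanged, which mirrors the claim that the remaining stages need only minor modifications.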