G22.2590 - Natural Language Processing - Spring 2010     Prof. Grishman

Lecture 3 Outline

February 4, 2010

Role of Syntax Analysis

Basic Syntactic Structures of English (J&M 12.3)

Comparison with other Languages

Context-free grammar (J&M 12.2)

A small context-free English grammar

sentence := np vp;
np := n | art n | art adj n;
vp := v | v np;

Including auxiliaries

vp := v | v np | v vp;

Including PPs

sentence := np vp;
np := ngroup | ngroup pp;
ngroup := n | art n | art adj n;
vp := v | v np | v vp | v np pp;
pp := p np;


Parsing as search (J&M 13.1)

Top-down recognizer / parser (J&M 13.1.1)
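
As a rough illustration of top-down recognition, the Python sketch below starts from the symbol sentence and expands productions depth-first, backtracking over the alternatives. It is only a sketch under some assumptions that are not from the lecture: the dictionary encoding of the grammar, the hypothetical function names ends and recognize, and the simplification that each word arrives with exactly one part-of-speech tag. It uses the grammar including PPs from the outline; this works without further precautions only because that grammar has no left-recursive productions.

```python
# Grammar from the outline (version including auxiliaries and PPs).
# Symbols that are not keys of GRAMMAR (n, art, adj, v, p) are parts of speech.
GRAMMAR = {
    "sentence": [["np", "vp"]],
    "np": [["ngroup"], ["ngroup", "pp"]],
    "ngroup": [["n"], ["art", "n"], ["art", "adj", "n"]],
    "vp": [["v"], ["v", "np"], ["v", "vp"], ["v", "np", "pp"]],
    "pp": [["p", "np"]],
}


def ends(symbols, tags, pos):
    """Yield each position where `symbols` can end when matched from `pos`."""
    if not symbols:
        yield pos
        return
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:
        # Non-terminal: try every production for it, then match the rest.
        for rhs in GRAMMAR[first]:
            for mid in ends(rhs, tags, pos):
                yield from ends(rest, tags, mid)
    elif pos < len(tags) and tags[pos] == first:
        # Part of speech: it must equal the tag of the next word.
        yield from ends(rest, tags, pos + 1)


def recognize(tags):
    """Return True if the tag sequence can be derived from `sentence`."""
    return any(end == len(tags) for end in ends(["sentence"], tags, 0))


if __name__ == "__main__":
    print(recognize(["art", "n", "v", "art", "adj", "n"]))
    print(recognize(["v", "art", "n"]))
```

A recognizer only answers yes or no; turning it into a parser would require recording which productions were used along each successful path.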

Bottom-up (immediate-constituent) parser (Grishman 2.4.2, J&M 13.1.2)

Uses tree nodes with components root (a non-terminal grammar symbol),
start and end (token numbers), and
constituents (a vector of parse tree nodes)
For i = 1, …, number of words in sentence:
    Create a node with root = part of speech of word i, start = i, end = i+1
    (if the word has several parts of speech, create one node for each P.O.S.).
    Put this node on list todo.
While todo is not empty:
    Remove node n from todo.
    If there exists a production A --> a1 a2 … aj such that root(n) = aj
    and there exist nodes n1 … nj-1 such that root(nk) = ak and end(nk) = start(nk+1) (k = 1, …, j-1, taking nj = n),
    then create a new node with root = A, start = start(n1), end = end(n), constituents = n1 … nj-1 n, and add it to todo.
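
The bottom-up procedure above can be sketched in Python as follows. This is a minimal sketch, not the lecture's implementation. The names Node, match_left and parse are hypothetical, and the input format (one list of possible parts of speech per word) is an assumption. One further assumption is worth flagging: the outline does not say in what order nodes leave todo, so this sketch empties todo after each word is added. That way every node ending at or before a given position already exists when a node starting at that position is processed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    root: str                 # grammar symbol (or part of speech for a word)
    start: int                # token number where the node begins
    end: int                  # token number just past the node
    constituents: tuple = ()  # child nodes; empty for word-level nodes


# Grammar from the outline (version including auxiliaries and PPs).
GRAMMAR = [
    ("sentence", ("np", "vp")),
    ("np", ("ngroup",)), ("np", ("ngroup", "pp")),
    ("ngroup", ("n",)), ("ngroup", ("art", "n")), ("ngroup", ("art", "adj", "n")),
    ("vp", ("v",)), ("vp", ("v", "np")), ("vp", ("v", "vp")), ("vp", ("v", "np", "pp")),
    ("pp", ("p", "np")),
]


def match_left(symbols, end, nodes):
    """Yield tuples of adjacent nodes whose roots spell `symbols`, the last ending at `end`."""
    if not symbols:
        yield ()
        return
    for m in nodes:
        if m.root == symbols[-1] and m.end == end:
            for left in match_left(symbols[:-1], m.start, nodes):
                yield left + (m,)


def parse(tagged_words):
    """tagged_words: one list of possible parts of speech per word.
    Returns every node built, including partial constituents."""
    nodes, todo = [], []
    for i, tags in enumerate(tagged_words, start=1):
        for tag in tags:                      # one node per part of speech
            todo.append(Node(tag, i, i + 1))
        while todo:                           # assumption: drain todo per word
            n = todo.pop()
            nodes.append(n)
            for lhs, rhs in GRAMMAR:
                if rhs[-1] != n.root:         # n must match the last symbol aj
                    continue
                for left in match_left(rhs[:-1], n.start, nodes):
                    kids = left + (n,)
                    todo.append(Node(lhs, kids[0].start, n.end, kids))
    return nodes


if __name__ == "__main__":
    words = [["art"], ["n"], ["v"], ["art"], ["n"], ["p"], ["art"], ["n"]]
    full = [x for x in parse(words)
            if x.root == "sentence" and x.start == 1 and x.end == len(words) + 1]
    print(len(full))
```

For the tag sequence art n v art n p art n, the sketch builds two sentence nodes spanning the whole input, one with the PP attached to the verb phrase (vp := v np pp) and one with it attached to the object noun phrase (np := ngroup pp).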