Naomi Sager, Research Professor, Department of Computer Science, Courant Institute of Mathematical Sciences, New York University • 251 Mercer Street • New York, NY 10012 • +1 212.998-3097 (voice) • +1 212.995-4123 (fax) • sager@cs.nyu.edu
Naomi Sager was born in Chicago, IL in 1927. From 1942 to 1946 she attended the newly established Four Year College at the University of Chicago, receiving the degree of Bachelor of Philosophy in 1946. In 1953 Sager obtained a B.S. in Electrical Engineering from Columbia University, and from 1953 to 1958 she worked as an electronics engineer in the Biophysics Department of the Sloan-Kettering Institute for Cancer Research in New York City. One project there was to develop an instrument for continuous blood pressure measurement and control for patients in hemorrhagic shock ["Servomechanism for the Regulation of Blood Pressure," Naomi Sager, John H. Waite, J. William Poppell, and William S. Howland, Review of Scientific Instruments, 28 (1957)].
Sager began work on the computer processing of language as a member of the team that developed the first English language parsing program, which ran on the Univac 1 at the University of Pennsylvania in 1959 [Transformations and Discourse Analysis Papers 15-19, Dept. of Linguistics, U. of Pa. 1959, and Z.S. Harris, String Analysis of Sentence Structure, Mouton & Co. The Hague, 1962]. Sager's section of the program treated syntactic ambiguity (more than one possible analysis at points in the sentence), cf. TDAP 17 on the list of TDAP publications. The structures imposed by the 1959 program proved unwieldy for this task, and in 1960 Sager developed an algorithm and a form of string grammar in which the treatment of ambiguity was an integral part (A Procedure for Left-to-Right Analysis of Sentence Structure [TDAP 27]). This work became the basis of the thesis for which she was awarded a Ph.D. in Linguistics from the University of Pennsylvania in 1968, and it served as the basis for the founding of the Linguistic String Project at New York University in 1965.
The computer string grammar of English at the core of the LSP parsing program was published in 1981: N. Sager, Natural Language Information Processing: A Computer Grammar of English and Its Applications (Addison-Wesley, Reading, MA) [LSP 34]. The grammar was modified to handle the specialized language of clinical documents, and the English lexicon used by the parser was augmented with medical semantic attributes, see Medical Language Processing: Computer Management of Narrative Data, with Friedman, C., Lyman, M.S., MD and members of the LSP, 1987 (Addison-Wesley, Reading, MA) [LSP 65]. The resulting Medical Language Processor (MLP) is documented at the LSP website.
Sager taught courses in Natural Language Processing and maintained the Linguistic String Project (LSP) at New York University from 1965 until her retirement in 1995. She resides in New York and for part of each year in Paris. She was one of the translators from French to English of the autobiography of Ngo Van, a Vietnamese revolutionary who, while working in a factory in Paris, became an engineer, a published scholar and author of numerous works [Ngo Van, In the Crossfire: Adventures of a Vietnamese Revolutionary, AK Press, Oakland CA, 2010].
In a 1967 article [LSP 1] Sager laid out the basis for language computation and described the first two implementations of the LSP parser and string grammar. The grammar was specified in two components: a set of formal rewriting rules written in Backus Normal Form (BNF) that provided the structure of the output parse tree, and a set of procedures, called restrictions, that operated on the parse tree to enforce detailed grammatical constraints [LSP 5]. An English-like programming language for expressing restrictions, the Restriction Language RL, was developed [LSP 12]. Applications of the LSP system drew upon Sublanguage Grammar, an extension of linguistic methods whereby the constraints on word combinations special to a subject matter are formalized into quasi-grammatical rules [LSP 11]. It was further shown that parsed documents could be mapped into sublanguage labeled structures, called information formats, on which information retrieval procedures could operate [LSP 28]. The MLP concentrated on the sublanguage of clinical reporting, X-ray reports, hospital discharge summaries, and the like, demonstrating an automated application of health care criteria to information formatted narrative medical reports [LSP 30]. The fully implemented form of the MLP string grammar and some of the initial applications were published in Sager's 1981 book [LSP 34]. The collective work of the LSP team on medical records was summarized in a 1987 volume [LSP 65]. A general overview of methods and results was presented by Sager at the New York Academy of Sciences in 1990 [LSP 78]. The ways in which contributions to Linguistics by Zellig Harris were utilized in the development of the LSP system were described in the symposium dedicated to his work [LSP 91]. The LSP medical language processor was converted to French in collaborative research with the Informatics group of the Cantonal Hospital of Geneva, Switzerland, under the direction of Jean-Raoul Scherrer [LSP 77][LSP 76].
The Linguistic String Project (LSP) at New York University was one of the earliest research and development projects in computer processing of natural (i.e. human) language. It was initiated by Naomi Sager at NYU in 1965 with a grant from the Office of Science Information Services (OSIS) of the National Science Foundation. The OSIS at that time was seeking means to provide scientists rapid access to information in the expanding technical literature. Computer analysis of language that would facilitate pinpointed search and retrieval of requested information was one avenue they were pursuing.
The LSP approach was to begin with a parsing program to obtain the syntactic relations among sentence words, the basic structure of language-borne information. This entailed the implementation of a parsing algorithm (top-down, left-to-right with calls on linguistic test procedures) as first described by Sager in 1960, "A Procedure for Left to Right Analysis of Sentence Structure," Report 27 of the series Transformations and Discourse Analysis Papers (TDAP) published by the Dept. of Linguistics, University of Pennsylvania.
A 1967 article, "Syntactic Analysis of Natural Language" (Advances in Computers 8:153-188, Academic Press, NY) [LSP 1], laid out the basis for language computation and described the first two implementations of the LSP parser and string grammar. The grammar was specified in two components: a set of formal rewriting rules written in Backus Normal Form (BNF) that provided the structure of the output parse tree, and a set of procedures, called restrictions, that operated on the parse tree to enforce detailed grammatical constraints, see "A Two-Stage BNF Specification of Natural Language," Journal of Cybernetics 2-3 (1972): 39-50 [LSP 5]. An English-like programming language for expressing restrictions, the Restriction Language RL, was developed, see "The Restriction Language for Computer Grammars of Natural Language," with Grishman, R., Communications of the ACM 18:390-400 [LSP 12]. The computer grammar of English that formed an integral part of the system was published in 1981, see Natural Language Information Processing: A Computer Grammar of English and Its Applications, Addison-Wesley, Reading, MA [LSP 34].
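The two-component design described above (rewriting rules that build the tree, plus restrictions that accept or reject it) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the three-rule grammar, the five-word lexicon, and the names `GRAMMAR`, `RESTRICTIONS`, `number_agreement`, and `analyses` are hypothetical and are not taken from the LSP grammar or the Restriction Language.

```python
# Component 1: BNF-style rewriting rules (toy grammar, NOT the LSP grammar).
# Each nonterminal maps to a list of alternatives (sequences of symbols).
GRAMMAR = {
    "SENTENCE": [["SUBJECT", "VERB"]],
    "SUBJECT": [["DET", "NOUN"], ["NOUN"]],
}

# Hypothetical lexicon: word -> (category, attributes).
LEXICON = {
    "the": ("DET", {}),
    "patient": ("NOUN", {"number": "singular"}),
    "patients": ("NOUN", {"number": "plural"}),
    "improves": ("VERB", {"number": "singular"}),
    "improve": ("VERB", {"number": "plural"}),
}


def leaves(tree):
    """Return the (category, word) leaves of a parse tree, left to right."""
    symbol, body = tree
    if isinstance(body, str):
        return [(symbol, body)]
    found = []
    for child in body:
        found.extend(leaves(child))
    return found


def number_agreement(tree):
    """Restriction: the noun and the verb must agree in number."""
    numbers = {
        LEXICON[word][1]["number"]
        for category, word in leaves(tree)
        if category in ("NOUN", "VERB")
    }
    return len(numbers) <= 1


# Component 2: restrictions, run on a subtree as soon as it is built.
RESTRICTIONS = {"SENTENCE": [number_agreement]}


def parse(symbol, words, pos):
    """Top-down, left-to-right: yield (tree, next_pos) for each analysis."""
    if symbol not in GRAMMAR:  # terminal category: match one word
        if pos < len(words) and LEXICON.get(words[pos], (None,))[0] == symbol:
            yield (symbol, words[pos]), pos + 1
        return
    for alternative in GRAMMAR[symbol]:
        for children, end in expand(alternative, words, pos):
            tree = (symbol, children)
            if all(check(tree) for check in RESTRICTIONS.get(symbol, [])):
                yield tree, end


def expand(symbols, words, pos):
    """Yield (children, next_pos) for a sequence of symbols."""
    if not symbols:
        yield [], pos
        return
    for first, middle in parse(symbols[0], words, pos):
        for rest, end in expand(symbols[1:], words, middle):
            yield [first] + rest, end


def analyses(sentence):
    """Return every parse tree that spans the whole sentence."""
    words = sentence.lower().split()
    return [tree for tree, end in parse("SENTENCE", words, 0) if end == len(words)]
```

In this sketch `analyses("the patient improves")` yields one tree, while `analyses("the patients improves")` yields none: the rewriting rules alone would accept both, and the restriction rejects the second. Because `parse` is a generator over all alternatives, a sentence with more than one valid analysis would return several trees, which is one simple way to expose ambiguity.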
Applications of the LSP system drew upon Sublanguage Grammar, an extension of linguistic methods whereby the constraints on word combinations special to a subject matter are formalized into quasi-grammatical rules, see "Sublanguage grammars in science information processing," Journal of the American Society for Information Science 26 (1975): 10-16 [LSP 11]. It was further shown that parsed documents could be mapped into sublanguage labeled structures, called information formats, on which information retrieval procedures could operate, see "Natural language information formatting: the automatic conversion of texts to a structured data base," in Advances in Computers 17 (M.C. Yovits, ed.) 89-162 (1978), Academic Press, NY [LSP 28]. The LSP concentrated on the sublanguage of clinical reporting, X-ray reports, hospital discharge summaries, and the like, demonstrating an automated application of health care criteria to information formatted narrative medical reports, see Hirschman, L. et al. "Automatic application of health care criteria to narrative patient records," Proceedings of the Third Annual Symposium on Computer Applications in Medical Care (R.A. Dunn, ed.), 105-113, IEEE, NY [LSP 30]. The collective work of the LSP team on medical records was summarized in a 1987 volume, cf. Medical Language Processing: Computer Management of Narrative Data, Sager, N., Friedman, C., Lyman, M.S., MD, and LSP members, Addison-Wesley, MA [LSP 65]. The LSP Medical Language Processor, including the medically specialized English grammar and dictionary, is available on the Linguistic String Project website.
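The idea of an information format, a labeled structure that retrieval procedures can query, can be sketched roughly as follows. This is only an illustration under loud assumptions: the column names, the word classes, and the functions `to_format` and `positive_findings` are hypothetical, and a plain word-class lookup stands in for the parse-tree mapping the LSP system actually performed.

```python
# Hypothetical sublanguage word classes for a toy radiology vocabulary.
WORD_CLASS = {
    "chest": "BODY_PART",
    "lung": "BODY_PART",
    "x-ray": "TEST",
    "infiltrate": "FINDING",
    "effusion": "FINDING",
    "no": "NEGATION",
}

# Hypothetical format columns; real information formats are far richer.
FORMAT_COLUMNS = ("TEST", "BODY_PART", "NEGATION", "FINDING")


def to_format(sentence):
    """Place each recognized word into the column named by its word class."""
    row = {column: [] for column in FORMAT_COLUMNS}
    for word in sentence.lower().replace(".", "").split():
        column = WORD_CLASS.get(word)
        if column is not None:
            row[column].append(word)
    return row


def positive_findings(rows, finding):
    """Retrieval over formatted rows: indexes where `finding` is asserted."""
    return [
        index
        for index, row in enumerate(rows)
        if finding in row["FINDING"] and not row["NEGATION"]
    ]


rows = [
    to_format("Chest x-ray shows infiltrate."),
    to_format("Chest x-ray shows no effusion."),
]
```

Once the narrative is in rows like these, a question such as "which reports assert an infiltrate?" becomes a lookup on the `FINDING` and `NEGATION` columns (`positive_findings(rows, "infiltrate")`) rather than a search over free text.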
The LSP MLP was converted to French in collaborative research with the Informatics group of the Cantonal Hospital of Geneva, Switzerland, see Nhàn, N.T., et al. "A medical language processor for two Indo-European languages," Proceedings of the 13th Annual Symposium on Computer Applications in Medical Care (SCAMC13), L.C. Kingsland, ed., IEEE Computer Society Press, Washington D.C., 554-558 [LSP 77], and Borst, F. et al. "Analyse automatique de comptes rendus d'hospitalisation," Informatique et Santé, Informatique et Gestion des Unités de Soins, Comptes Rendus du Colloque AIM-IF, Paris, 1989, Degoulet, P., et al., rédacteurs, Paris, Springer-Verlag, 246-256 [LSP 76]. Subsequently, an XML hierarchy of medical knowledge tags was added to the system along with an online viewer by which clinicians could see highlighted portions of documents pertaining to particular patient problems or therapies, demonstrated as part of Sager's keynote address at the Second International Conference on the Clinical Document Architecture, October 20-22, 2004 at Acapulco, Mexico.
A general overview of methods and results of the LSP was presented by Sager at the New York Academy of Sciences in 1990, see "Computer analysis of sublanguage information structures," Annals of the NY Academy of Sciences, 683: 161-179 [LSP 78]. The ways in which contributions to Linguistics by Zellig Harris were utilized in the development of the LSP system were described by Sager and Ngô Thanh Nhàn in the symposium dedicated to Harris's work, see "The computability of strings, transformations, and sublanguage," in The Legacy of Zellig Harris, ed. by Nevin, B. et al., John Benjamins Publishing Co., Amsterdam, Vol. 2, Chapter 4, 79-120 [LSP 91].