Roman Yangarber

 

715 Broadway, 7th floor, Rm 1015
New York, NY 10003
tel: +1 (212) 998-3264
fax: +1 (212) 995-4123
 
roman@cs.nyu.edu

www.cs.nyu.edu/roman

 

EDUCATION

Ph.D. Computer Science 

2000

Courant Institute of Mathematical Sciences
New York University

Thesis: "Scenario Customization for Information Extraction"
M.S. Computer Science 

1993

New York University
B.A. Mathematics and Computer Science (minor in Music)

1987

New York University

 

PROFESSIONAL EXPERIENCE

Research Assistant Professor

2003—Present

Assistant Research Scientist

1998—2003

Courant Institute of Mathematical Sciences
New York University

Areas of research: Information Extraction; unsupervised learning and automatic acquisition of domain-specific knowledge; customization of knowledge bases for new topics and domains. Government-sponsored projects, including: TIDES—cross-lingual information extraction and retrieval; ACE—Automatic Content Extraction.

 
Graduate Research Assistant

1993—1998

Department of Computer Science
New York University

Member of the Proteus Project in Natural Language Processing. Customization for adaptive Information Extraction. Algorithms for text alignment in Machine Translation. Participated in the DARPA-sponsored Seventh Message Understanding Conference (MUC-7).
Previously: design of Griffin, a functional programming language for rapid prototyping; a project in textual document tailoring.

 
Assistant Research Scientist/Systems Programmer

1987—1992

Center for Neural Science
New York University

Multiprocessing systems for concurrent signal generation, real-time response collection, and data analysis for visual image processing and psychophysical applications. Computational modeling of visual perception.

 

TEACHING EXPERIENCE

Visiting Lecturer

May 2003

KIT Graduate School
University of Helsinki, Finland

Course title: "Unsupervised Learning in Language Technology"
Visiting Lecturer

Nov—Dec 2000

Department of General Linguistics
University of Helsinki, Finland

Course title: "Quasi-Unsupervised Learning for Natural Language Processing"
Lecturer

1991—1995

Brandon Systems Corporation
New York/New Jersey

Intensive 1- to 2-week full-time courses in C/C++, UNIX programming, Visual Basic, and software design.
Instructor

1987—1991

Structured Techniques/Delft Consulting, Inc.
New York

Multi-platform programming seminars for industry professionals: C/C++ and UNIX.

 

HONORS

The Janet Fabri Award: for Best Doctoral Dissertation in Computer Science in the year 2000, New York University
(award shared with Suren Talla)

Graduate Scholarship in Computer Science, Courant Institute of Mathematical Sciences, 1993–1998

 

REVIEW COMMITTEES

 

PUBLICATIONS


Roman Yangarber. (2004) "User-Oriented Evaluation in Information Extraction." Workshop on User-Oriented Evaluation of Knowledge Discovery Systems, at the 4th International Conference on Language Resources and Evaluation (LREC 2004), Lisbon, Portugal

Winston Lin, Roman Yangarber and Ralph Grishman. (2003) "Bootstrapped Learning of Semantic Classes from Positive and Negative Examples." Workshop on The Continuum from Labeled to Unlabeled Data in Machine Learning and Data Mining, at the 20th International Conference on Machine Learning (ICML 2003), Washington, D.C.

Roman Yangarber. (2003) "Counter-Training in Discovery of Semantic Patterns." In proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL 2003), Sapporo, Japan

Ralph Grishman, Silja Huttunen, and Roman Yangarber. (2003) "Information Extraction for Enhanced Access to Disease Outbreak Reports." In Journal of Biomedical Informatics, C. Friedman, ed., 35(4), pp. 236-246

Roman Yangarber, Winston Lin and Ralph Grishman. (2002) "Unsupervised Learning of Generalized Names." In proceedings of the 19th International Conference on Computational Linguistics (COLING 2002), Taipei, Taiwan

Roman Yangarber. (2002) "Acquisition of Domain Knowledge." Invited chapter in Information Extraction in the Web Era (M.T. Pazienza, ed.), Lecture Notes in Artificial Intelligence, Vol. 2700, Springer-Verlag, Heidelberg, pp. 1-28; a collection of contributions at the 3rd Summer Convention on Information Extraction (SCIE 2002), Frascati, Italy

Silja Huttunen, Roman Yangarber and Ralph Grishman. (2002) "Complexity of Event Structure in IE Scenarios." In proceedings of the 19th International Conference on Computational Linguistics (COLING 2002), Taipei, Taiwan

Silja Huttunen, Roman Yangarber and Ralph Grishman. (2002) "Diversity of Scenarios in Information Extraction." In proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002), Las Palmas de Gran Canaria, Spain

Ralph Grishman, Silja Huttunen and Roman Yangarber. (2002) "Real-Time Event Extraction for Infectious Disease Outbreaks." In proceedings of Human Language Technology Conference (HLT 2002), San Diego, CA

Roman Yangarber, Ralph Grishman, Pasi Tapanainen and Silja Huttunen. (2000) "Automatic Acquisition of Domain Knowledge for Information Extraction." In proceedings of the 18th International Conference on Computational Linguistics (COLING 2000), Saarbrücken, Germany, pp. 940-946

Roman Yangarber and Ralph Grishman. (2000) "Machine Learning of Extraction Patterns from Un-annotated Corpora." In proceedings of the 14th European Conference on Artificial Intelligence, Workshop on Machine Learning for Information Extraction (ECAI 2000), Berlin, Germany

Roman Yangarber and Ralph Grishman. (2000) "Extraction Pattern Discovery through Corpus Analysis." In proceedings of the Second International Conference on Language Resources and Evaluation, Workshop "Information Extraction meets Corpus Linguistics" (LREC 2000), Athens, Greece, pp. 31-35

Roman Yangarber, Ralph Grishman, Pasi Tapanainen and Silja Huttunen. (2000) "Unsupervised Discovery of Scenario-Level Patterns for Information Extraction." In proceedings of the Sixth Conference on Applied Natural Language Processing (ANLP/NAACL 2000), Seattle, WA, pp. 282-289

Ralph Grishman and Roman Yangarber. (2000) "Issues in Corpus-Trained Information Extraction." In proceedings of the International Symposium: Toward the Realization of Spontaneous Speech Engineering, Tokyo, Japan

Roman Yangarber and Ralph Grishman. (1998) "Transforming Examples into Patterns for Information Extraction." In proceedings of TIPSTER Text Program Phase III, Morgan Kaufmann, Baltimore, MD

Chikashi Nobata, Satoshi Sekine and Roman Yangarber. (1998) "Japanese IE System and Customization Tool." In proceedings of TIPSTER Text Program Phase III, Morgan Kaufmann, Baltimore, MD

Adam Meyers, Roman Yangarber, Ralph Grishman, Catherine Macleod and Antonio Moreno-Sandoval. (1998) "Deriving Transfer Rules from Dominance-Preserving Alignments." In proceedings of the 17th International Conference on Computational Linguistics and the 36th Meeting of the Association for Computational Linguistics (COLING/ACL 1998), Montreal, Canada

Adam Meyers, Catherine Macleod, Roman Yangarber, Ralph Grishman, Leslie Barrett and Ruth Reeves. (1998) "Using NOMLEX to Produce Nominalization Patterns for Information Extraction." In proceedings of Workshop on Computational Treatment of Nominals (COLING/ACL 1998), Montreal, Canada

Roman Yangarber and Ralph Grishman. (1998) "NYU: Description of the Proteus/PET System as Used for MUC-7 ST." In proceedings of the Seventh Message Understanding Conference (MUC-7), Fairfax, VA

Roman Yangarber and Ralph Grishman. (1997) "Customization of Information Extraction Systems." In proceedings of International Workshop on Lexically-Driven Information Extraction [Invited talk], Frascati, Italy

Adam Meyers, Roman Yangarber and Ralph Grishman. (1996) "Alignment of Shared Forests for Bilingual Corpora." In proceedings of the 16th International Conference on Computational Linguistics (COLING 1996), Copenhagen, Denmark, pp. 460-465

Other Presentations

Roman Yangarber and Ralph Grishman. (1997) "Rapid Customization of Information Extraction Systems." Symposium on Advanced Information Processing and Analysis (AIPA 1997), Washington, D.C.

 

LANGUAGES

 

PERSONAL