FASTUS: A Cascaded Finite-State Transducer for Extracting Information from Natural-Language Text
Jerry R. Hobbs, Douglas Appelt, John Bear, David Israel, Megumi Kameyama, Mark Stickel, and Mabry Tyson (Artificial Intelligence Center, SRI International, Menlo Park, California)

TL;DR
FASTUS is an efficient five-stage cascaded finite-state system that extracts structured information from natural-language text for database entry and other applications.
Contribution
It introduces a cascaded finite-state automaton approach that decomposes language processing into five stages, supporting both domain-independent syntactic processing and domain-dependent semantic processing.
Findings
Successfully used in various applications
Efficient and effective information extraction
Handles complex language structures
Abstract
FASTUS is a system for extracting information from natural language text for entry into a database and for other applications. It works essentially as a cascaded, nondeterministic finite-state automaton. There are five stages in the operation of FASTUS. In Stage 1, names and other fixed form expressions are recognized. In Stage 2, basic noun groups, verb groups, and prepositions and some other particles are recognized. In Stage 3, certain complex noun groups and verb groups are constructed. Patterns for events of interest are identified in Stage 4 and corresponding "event structures" are built. In Stage 5, distinct event structures that describe the same event are identified and merged, and these are used in generating database entries. This decomposition of language processing enables the system to do exactly the right amount of domain-independent syntax, so that domain-dependent…
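The five stages described in the abstract can be illustrated with a toy cascade. The following is a minimal Python sketch, not the FASTUS implementation: the lexicon, the company names, the single joint-venture pattern, and all function names (`stage1_names` through `stage5_merge`, `extract`) are hypothetical, and regular expressions over one-character tags stand in for the system's finite-state transducers.

```python
import re

# Hypothetical toy lexicon and name list; none of this comes from the paper.
NAMES = {"Acme Corp", "Bolt Inc"}
LEXICON = {"the": "d", "a": "d", "joint": "a", "new": "a", "venture": "n",
           "president": "n", "has": "x", "formed": "v", "with": "p", "of": "p"}

def stage1_names(text):
    """Stage 1: recognize names and other fixed-form expressions."""
    pattern = "|".join(re.escape(n) for n in sorted(NAMES, key=len, reverse=True))
    tokens = []
    for piece in re.split(f"({pattern})", text):
        if piece in NAMES:
            tokens.append(("NAME", piece))
        else:
            tokens.extend(("WORD", w) for w in re.findall(r"\w+", piece))
    return tokens

def stage2_basic_groups(tokens):
    """Stage 2: basic noun groups (NG), verb groups (VG), prepositions (P)."""
    # One tag character per token, so regex offsets are token offsets.
    tags = "".join("N" if kind == "NAME" else LEXICON.get(word.lower(), "o")
                   for kind, word in tokens)
    groups = []
    for m in re.finditer(r"(?P<NG>d?a*n|N)|(?P<VG>x*v)|(?P<P>p)|(?P<O>.)", tags):
        words = " ".join(w for _, w in tokens[m.start():m.end()])
        groups.append((m.lastgroup, words))
    return groups

def stage3_complex_groups(groups):
    """Stage 3: build complex groups; here only 'NG of NG' is attached."""
    out = []
    for kind, text in groups:
        if (kind == "NG" and len(out) >= 2
                and out[-1] == ("P", "of") and out[-2][0] == "NG"):
            out.pop()                      # drop the preposition
            _, head = out.pop()            # take the head noun group
            out.append(("NG", f"{head} of {text}"))
        else:
            out.append((kind, text))
    return out

def stage4_events(groups):
    """Stage 4: match one event pattern: NG 'formed' NG(venture) 'with' NG."""
    events = []
    for i in range(len(groups) - 4):
        (k1, a), (k2, v), (k3, o), (k4, p), (k5, b) = groups[i:i + 5]
        if ((k1, k2, k3, k4, k5) == ("NG", "VG", "NG", "P", "NG")
                and v.split()[-1] == "formed"
                and o.endswith("venture") and p == "with"):
            events.append({"type": "joint venture", "partners": {a, b}})
    return events

def stage5_merge(events):
    """Stage 5: merge event structures that appear to describe the same event."""
    merged = []
    for ev in events:
        for m in merged:
            if m["type"] == ev["type"] and m["partners"] & ev["partners"]:
                m["partners"] |= ev["partners"]
                break
        else:
            merged.append({"type": ev["type"], "partners": set(ev["partners"])})
    return merged

def extract(sentences):
    """Run Stages 1 to 4 per sentence, then merge across sentences."""
    events = []
    for s in sentences:
        groups = stage3_complex_groups(stage2_basic_groups(stage1_names(s)))
        events += stage4_events(groups)
    return stage5_merge(events)
```

Calling `extract` on two sentences that report the same joint venture from each partner's side yields a single merged event structure. Each stage consumes only the output of the previous one, which is the property that lets the early stages stay domain-independent while the event pattern in Stage 4 carries the domain knowledge.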
Taxonomy
Topics: Natural Language Processing Techniques · semigroups and automata theory · Logic, programming, and type systems
