Abstract

There is a need for an automated system that can read family history books and other historical texts and extract as many genealogical facts as possible from them. Embley and others have applied traditional information extraction techniques to this problem in a system called OntoES, with a reasonable amount of success. In parallel, much linguistic theory has been developed in recent decades, and Lonsdale and others have built computational embodiments of some of these theories using Soar. In this thesis we introduce a system called OntoSoar, which combines the Link Grammar Parser, using a grammar customized for family history texts, with an innovative semantic analyzer inspired by construction grammars. OntoSoar extracts genealogical facts from family history books and uses them to populate a conceptual model compatible with OntoES. The system produces good results on the texts tested so far and shows promise of doing even better with further development.

Degree

MA

College and Department

Humanities; Linguistics and English Language

Rights

http://lib.byu.edu/about/copyright/

Date Submitted

2014-06-24

Document Type

Thesis

Handle

http://hdl.lib.byu.edu/1877/etd7096

Keywords

information extraction, genealogy, linguistic theory, cognitive semantics, construction grammar, cognitive architectures

Included in

Linguistics Commons
