There is a need for an automated system that can read family history books and other historical texts and extract as many genealogical facts from them as possible. Embley and others have applied traditional information extraction techniques to this problem in a system called OntoES, with a reasonable amount of success. In parallel, much linguistic theory has been developed in the past decades, and Lonsdale and others have built computational embodiments of some of these theories using Soar. In this thesis we introduce a system called OntoSoar, which combines the Link Grammar Parser, using a grammar customized for family history texts, with an innovative semantic analyzer inspired by construction grammars. OntoSoar extracts genealogical facts from family history books and uses them to populate a conceptual model compatible with OntoES. The system produces good results on the texts tested so far and shows promise of doing even better with further development.
College and Department
Humanities; Linguistics and English Language
BYU ScholarsArchive Citation
Lindes, Peter, "OntoSoar: Using Language to Find Genealogy Facts" (2014). Theses and Dissertations. 4133.
Keywords
information extraction, genealogy, linguistic theory, cognitive semantics, construction grammar, cognitive architectures