Family history research on the web is increasing in popularity, and many competing genealogical websites host large amounts of data-rich, unstructured, primary genealogical records. Even after these records are made machine-readable, however, making them easily searchable remains labor-intensive for humans. What is needed are computer tools that can automatically produce indices and databases from these records, identify individuals and events, determine relationships, and put families together. We propose a possible solution: specialized ontologies, built specifically for extracting information from primary genealogical records, with expert logic and rules to infer genealogical facts and assemble relationship links between persons with respect to the genealogical events in their lives. The deliverables of this solution are extraction ontologies that can extract from parish or town records, annotated versions of original documents, data files of individuals and events, and rules to infer family relationships from stored data. The solution also supports querying over the rules and data files and obtaining query-result justification that links back to the primary genealogical records. An evaluation of the prototype shows that the extraction achieves excellent recall and precision and that the inferred facts are correct.
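To make the idea of inferring family relationships from stored data more concrete, the following is a minimal sketch in plain Python. It is not the SWRL rule set or data format used in the thesis. The `BirthRecord` type, the `infer_siblings` function, the person names, and the source labels are all hypothetical and exist only for illustration. The sketch assumes that extracted records already carry normalized names, so that matching on exact strings is enough. Real records would need proper identity resolution.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class BirthRecord:
    """One extracted birth or christening entry (hypothetical schema)."""
    child: str
    father: str
    mother: str
    source: str  # hypothetical pointer back to the primary record


def infer_siblings(records):
    """Infer sibling pairs: two different children with the same father and mother.

    Returns a dict mapping a sorted (child, child) pair to the pair of
    source labels that justify the inference, in the same order.
    """
    inferred = {}
    for a, b in combinations(records, 2):
        if a.child != b.child and a.father == b.father and a.mother == b.mother:
            # Order the two records by child name so the key is canonical.
            first, second = sorted((a, b), key=lambda r: r.child)
            inferred[(first.child, second.child)] = (first.source, second.source)
    return inferred


# Made-up example data, not taken from any real parish register.
records = [
    BirthRecord("Anna Olsen", "Ole Olsen", "Kari Hansdatter", "register p.12"),
    BirthRecord("Peder Olsen", "Ole Olsen", "Kari Hansdatter", "register p.15"),
    BirthRecord("Marie Berg", "Nils Berg", "Ingrid Larsdatter", "register p.16"),
]

siblings = infer_siblings(records)
for pair, sources in siblings.items():
    print(pair, "justified by", sources)
```

Keeping the source label alongside each inferred fact mirrors the query-result justification described in the abstract: every derived relationship can be traced back to the records it came from.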
College and Department
Physical and Mathematical Sciences; Computer Science
BYU ScholarsArchive Citation
Woodbury, Charla Jean, "Automatic Extraction From and Reasoning About Genealogical Records: A Prototype" (2010). Theses and Dissertations. 2335.
Keywords
information extraction, ontology, SWRL rule, family history, genealogy