Faculty Publications

Enabling Search for Facts and Implied Facts in Historical Documents

Deryle W. Lonsdale, Brigham Young University
David W. Embley, Brigham Young University
Spencer Machado, Brigham Young University
Thomas Packer, Brigham Young University
Joseph Park, Brigham Young University
Andrew J. Zitzelberger, Brigham Young University - Provo
Stephen W. Liddle, Brigham Young University
Nathan Tate, Brigham Young University

Keywords

fact extraction, implied facts, historical documents

Abstract

Building a database of facts extracted from historical documents to enable database-like query and search would reduce the tedium of gleaning facts of interest from historical documents. We propose a solution in which historical documents themselves constitute the stored database. In our solution, we use information-extraction techniques to produce a conceptualized external annotation of facts found in each document, and we superimpose the conceptualization over the document collection. The annotation process populates the conceptualization producing a repository of extracted facts, and a reasoner obtains inferred facts from these extracted facts. Our query interface accepts free-form queries and converts them to formal queries over the extracted and inferred facts. Displayed results include, in addition to standard query results, images of original documents with results highlighted along with reasoning chains for inferred facts grounded in these highlighted facts. Along with giving the implementation status of our proof-of-concept prototype, we present results for extraction accuracy and efficiency and point to current and future work needed to enable a practical solution for the envisioned historical-document database.

Original Publication Citation

Enabling Search for Facts and Implied Facts in Historical Documents. International Workshop on Historical Document Imaging and Processing (HIP 2011), Beijing, China, 16-17 September 2011. [co-authors: D.W. Embley, S.W. Liddle, S. Machado, T. Packer, J. Park, N. Tate, and A. Zitzelberger].

BYU ScholarsArchive Citation

Lonsdale, Deryle W.; Embley, David W.; Machado, Spencer; Packer, Thomas; Park, Joseph; Zitzelberger, Andrew J.; Liddle, Stephen W.; and Tate, Nathan, "Enabling Search for Facts and Implied Facts in Historical Documents" (2011). Faculty Publications. 6814.
https://scholarsarchive.byu.edu/facpub/6814

Document Type

Other

Publication Date

2011

Publisher

Association for Computing Machinery

Language

English

College

Humanities

Department

Linguistics and English Language

University Standing at Time of Publication

Associate Professor

Copyright Use Information

https://lib.byu.edu/about/copyright/
