Keywords
Document Reading, Information Extraction, Conceptual Modeling, Ontology Conceptualization, Extraction Ontology
Abstract
Ontological document reading is defined as automatically and appropriately populating a conceptual model representing an ontological conceptualization of some fragment of the real world. Appropriately populating the conceptualization involves not only extracting the information with respect to the declared object and relationship sets of the conceptual model but also checking the extracted information for real-world constraint violations, standardizing the data, and inferring the unwritten information that a document author intended to convey. Appropriately populating an ontology may, in addition, require adjustments to the ontology itself. This approach to document reading is presented in terms of an effort to build a system to extract the genealogical information in family history books. The status of the reading system is reported. Also explained is how the generated results can be imported into and thus contribute to the construction of a large repository of world-wide family interrelationships. The reading system’s potential use for constructing similar knowledge repositories in other domains is foreshadowed.
Original Publication Citation
David W. Embley, Stephen W. Liddle, Deryle W. Lonsdale, and Scott N. Woodfield (2018). Ontological Document Reading: An Experience Report. Enterprise Modelling and Information Systems Architectures - International Journal of Conceptual Modeling: Special Issue on Conceptual Modelling in Honour of Heinrich C. Mayr, Vol. 13 Num. 2, pp. 133-181. Gesellschaft für Informatik e.V. (The German Informatics Society)
BYU ScholarsArchive Citation
Lonsdale, Deryle W.; Embley, David W.; Liddle, Stephen W.; and Woodfield, Scott N., "Ontological Document Reading: An Experience Report" (2018). Faculty Publications. 6883.
https://scholarsarchive.byu.edu/facpub/6883
Document Type
Peer-Reviewed Article
Publication Date
2018
Publisher
Enterprise Modelling and Information Systems Architectures
Language
English
College
Humanities
Department
Linguistics