Document Reading, Information Extraction, Conceptual Modeling, Ontology Conceptualization, Extraction Ontology


Ontological document reading is defined as automatically and appropriately populating a conceptual model representing an ontological conceptualization of some fragment of the real world. Appropriately populating the conceptualization involves not only extracting the information with respect to the declared object and relationship sets of the conceptual model but also checking the extracted information for real-world constraint violations, standardizing the data, and inferring the unwritten information that a document author intended to convey. Appropriately populating an ontology may, in addition, require adjustments to the ontology itself. This approach to document reading is presented in terms of an effort to build a system to extract the genealogical information in family history books. The status of the reading system is reported. Also explained is how the generated results can be imported into and thus contribute to the construction of a large repository of world-wide family interrelationships. The reading system’s potential use for constructing similar knowledge repositories in other domains is foreshadowed.

Original Publication Citation

David W. Embley, Stephen W. Liddle, Deryle W. Lonsdale, and Scott N. Woodfield (2018). Ontological Document Reading: An Experience Report. Enterprise Modelling and Information Systems Architectures - International Journal of Conceptual Modeling: Special Issue on Conceptual Modelling in Honour of Heinrich C. Mayr, Vol. 13 Num. 2, pp. 133-181. Gesellschaft für Informatik e.V. (The German Informatics Society)

Document Type

Peer-Reviewed Article

Publication Date



Enterprise Modelling and Information Systems Architectures







University Standing at Time of Publication

Associate Professor

Included in

Linguistics Commons