Abstract
Entity extraction is an important step in document understanding. Fine-grained entities can be extracted more accurately by combining Named Entity Recognition (NER) and Relation Extraction (RE) models. This work proposes a cascading model that performs both NER and RE. The model uses relations between entities to infer context-dependent fine-grained named entities in text corpora. The RE module runs independently of the NER module, which reduces the accumulation of errors across sequential steps. This approach improves the fine-grained NER F1-score over the existing state of the art from 0.4753 to 0.8563 on our data, albeit in a strictly limited domain. This result opens the way to further applications in historical document processing, including automated searching of historical documents such as those used in economics research and family history.
Degree
MS
College and Department
Physical and Mathematical Sciences; Computer Science
Rights
https://lib.byu.edu/about/copyright/
BYU ScholarsArchive Citation
Segrera, Daniel, "Hierarchical Joint Entity Recognition and Relation Extraction of Contextual Entities in Family History Records" (2023). Theses and Dissertations. 10265.
https://scholarsarchive.byu.edu/etd/10265
Date Submitted
2023-03-08
Document Type
Thesis
Handle
http://hdl.lib.byu.edu/1877/etd13103
Keywords
natural language processing, named entity recognition, relation extraction, information extraction
Language
English