Abstract

Several large-scale projects, including FamilySearch, Ancestry, and BALSAC (University of Quebec), have gathered vast amounts of genealogical data, ranging from millions to billions of individuals. To study the structure of this data, we propose a model that generates a genealogical network based on real-world genealogical data using two key features: (i) the geodesic distance between the members of a couple prior to union and (ii) the number of children per couple. The distribution of the distance to a couple's nearest common ancestor in an observed community captures the global scale at which biological cycles form in the underlying genealogical network. Similarly, the number of children per couple captures the local structure given by the degree distribution in the genealogical network. Constructing imitation data that approximates a real-world network's structure and growth rate is desirable for use in generalizable machine learning models. This model, which we refer to as the Target Model, provides a foundation for further work in predicting family network growth and structure.
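As an illustration of the two features named in the abstract, the following is a minimal sketch and not the thesis's implementation. It assumes a genealogy stored as a child-to-parents mapping and treats parent-child links as undirected edges. The toy data and the function names (`build_graph`, `geodesic_distance`, `children_per_couple`) are hypothetical.

```python
from collections import Counter, deque

# Hypothetical toy genealogy: child -> (parent, parent). Names are illustrative only.
PARENTS = {
    "c": ("a", "b"),
    "d": ("a", "b"),
    "g": ("c", "e"),
    "h": ("d", "f"),
}


def build_graph(parents):
    """Build an undirected graph with one edge per parent-child link."""
    graph = {}
    for child, pair in parents.items():
        for parent in pair:
            graph.setdefault(child, set()).add(parent)
            graph.setdefault(parent, set()).add(child)
    return graph


def geodesic_distance(graph, source, target):
    """Return the shortest-path length between two people, or None if unconnected."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def children_per_couple(parents):
    """Count children for each couple, keyed by an unordered pair of parents."""
    return Counter(frozenset(pair) for pair in parents.values())


graph = build_graph(PARENTS)
cousin_distance = geodesic_distance(graph, "g", "h")  # g and h share grandparents a and b
child_counts = children_per_couple(PARENTS)
```

To measure the distance prior to union, as the abstract describes, the graph would need to be built without the couple's own children, since a shared child would otherwise connect the two partners by a path of length 2.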

Degree

MS

College and Department

Physical and Mathematical Sciences; Mathematics

Rights

https://lib.byu.edu/about/copyright/

Date Submitted

2023-06-12

Document Type

Thesis

Handle

http://hdl.lib.byu.edu/1877/etd12823

Keywords

genealogical networks, distance to union, target model

Language

English
