Abstract
Facial action unit (AU) recognition is used in a variety of downstream applications, including sign language translation, dementia detection, pain detection, and other high-impact tasks that require understanding facial motion. Although AU recognition is a well-established field, several issues remain unresolved. First, overall performance on the benchmark datasets, as measured by the average F1 score, needs to improve. Second, performance on every individual AU, not just the average F1 score, needs to increase. Third, the field has recently shifted its attention toward ensuring that AU recognition models generalize beyond the distributions represented in their training datasets. To understand how a related field addresses this last issue, I reviewed the face recognition literature and how its researchers handle the racial imbalance in available face recognition data. Because face recognition across race and AU recognition share many similarities, the findings from this literature review laid the groundwork for the remainder of this dissertation. To improve overall performance, I designed a neural network architecture for AU recognition that achieves the highest average F1 score on both benchmark datasets. Next, to move beyond the average F1 score and achieve high performance across all AUs, not just one or a few, I introduced a meta-learner for an ensemble and released a synthetic dataset. Finally, to generalize the model beyond the distributions of the AU recognition datasets, I used face reenactment and a cross-corpus loss, achieving the highest generalization of any open-source model. To aid future research, the code and model for each contribution are publicly available. Overall, this dissertation presents a generalized AU recognition model that performs well in applications it was not directly trained on.
Degree
PhD
College and Department
Ira A. Fulton College of Engineering; Electrical and Computer Engineering
Rights
https://lib.byu.edu/about/copyright/
BYU ScholarsArchive Citation
Sumsion, Andrew W., "Generalizing Networks to Address Data Imbalance for Facial Action Unit Recognition" (2026). Theses and Dissertations. 11160.
https://scholarsarchive.byu.edu/etd/11160
Date Submitted
2026-03-17
Document Type
Dissertation
Permanent Link
https://arks.lib.byu.edu/ark:/34234/q20b99c8e8
Keywords
facial action unit recognition, domain generalization, face reenactment, multi-label classification, facial expression analysis, meta-learner, mixture of experts
Language
English