Abstract

Facial action unit (AU) recognition supports a variety of downstream applications, including sign language translation, dementia detection, pain detection, and other high-impact tasks that require understanding facial motion. Although AU recognition is a well-established field, several issues remain unresolved. First, overall performance on the benchmark datasets, as measured by the average F1 score, still needs to improve. Second, performance must increase across all individual AUs, not just on the average F1 score. Third, the field has recently shifted its attention toward ensuring that AU recognition models generalize beyond the distributions represented in their training datasets. To better understand how a related field addresses this generalization issue, I reviewed the face recognition literature, focusing on how researchers address the imbalanced distribution of races in available face recognition data. Because face recognition across race and AU recognition share many similarities, the findings from this literature review laid the groundwork for the remainder of this dissertation. To raise overall performance, I designed a neural network architecture for AU recognition that achieves the highest average F1 score on both benchmark datasets. Next, to move beyond the average F1 score and achieve high performance across all AUs, not just one or a few, I introduced a meta-learner for an ensemble and released a synthetic dataset. Finally, to generalize the model beyond the distributions of the AU recognition datasets, I used face reenactment and a cross-corpus loss, achieving the highest generalization of any open-source model. To aid future research, the code and model for each contribution are publicly available. Overall, this dissertation presents a generalized AU recognition model that performs well in applications it was not directly trained on.

Degree

PhD

College and Department

Ira A. Fulton College of Engineering; Electrical and Computer Engineering

Rights

https://lib.byu.edu/about/copyright/

Date Submitted

2026-03-17

Document Type

Dissertation

Keywords

facial action unit recognition, domain generalization, face reenactment, multi-label classification, facial expression analysis, meta-learner, mixture of experts

Language

English

Included in

Engineering Commons
