Abstract
Automatic readability assessment aims to predict how difficult a text is for readers to understand. The concept of readability has been widely studied across languages because of its significant impact on reading comprehension, leading to the development of a variety of methods and approaches. In this study, I addressed two key challenges in readability assessment. First, I examined the performance of statistical classifiers and transformer-based models, including CAMeLBERT (Inoue et al., 2021), with a focus on their ability to generalize under domain, genre, and stylistic shifts. Second, I investigated the interpretability of transformer-based models on the readability task, as such models are often criticized for their black-box nature.
To address these challenges, I built a new corpus for automatic Arabic readability assessment consisting of 82,512 samples and expanded it with the DARES corpus (El-Haj et al., 2024), which includes 10,755 samples. I also proposed hybrid approaches that combine transformer-based representations with rich handcrafted linguistic features, as well as statistical models that integrate embeddings and linguistic features. I trained the models on a large, high-quality dataset constructed from Jordanian and Saudi curricula and evaluated them on a separate benchmark corpus, BAREC (Elmadani, Habash, & Taha-Thomure, 2025), across multiple domains. To improve interpretability, I employed probing experiments to analyze what linguistic information is captured by the CLS representation in BERT-based models.
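A minimal sketch of such a layer-wise probe is shown below, assuming texts and integer readability labels are already available; the CAMeLBERT checkpoint name, the toy sentences, and the logistic-regression probe are illustrative assumptions, not the thesis code.

```python
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

# Assumed CAMeLBERT checkpoint; the thesis may use a different variant.
MODEL_NAME = "CAMeL-Lab/bert-base-arabic-camelbert-mix"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()

def cls_per_layer(texts):
    """Collect the [CLS] vector of every layer for each text."""
    per_layer = {}
    with torch.no_grad():
        for text in texts:
            enc = tokenizer(text, return_tensors="pt",
                            truncation=True, max_length=512)
            hidden = model(**enc).hidden_states  # embeddings + one tensor per layer
            for i, h in enumerate(hidden):
                per_layer.setdefault(i, []).append(h[0, 0].numpy())  # [CLS] position
    return {i: np.stack(vecs) for i, vecs in per_layer.items()}

# Toy placeholder data; the real probe uses the corpus and its level labels.
texts = ["ذهب الولد إلى المدرسة.", "قرأ الطفل قصة قصيرة.",
         "يشكل التفاعل بين العوامل الاقتصادية والاجتماعية ظاهرة معقدة.",
         "تتطلب دراسة الظواهر اللغوية منهجية صارمة ومتعددة المستويات."]
labels = [0, 0, 1, 1]

for layer, X in cls_per_layer(texts).items():
    probe = LogisticRegression(max_iter=1000).fit(X, labels)
    # High probe accuracy at a layer suggests that layer encodes the probed property.
    print(f"layer {layer}: train accuracy {probe.score(X, labels):.2f}")
```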
In addition, I utilized the Captum framework to compute attribution scores for input tokens and align them with outputs from CAMeL Tools to identify the most influential linguistic features. Furthermore, I conducted an ablation study to assess the contribution of different features and model components in both in-domain and cross-domain settings.
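As an illustration of the attribution step, the sketch below applies Captum's LayerIntegratedGradients to a CAMeLBERT sequence classifier; the checkpoint name, the five readability levels, and the example sentence are assumptions, and in practice the per-token scores would be aggregated per word and joined with CAMeL Tools analyses.

```python
import torch
from captum.attr import LayerIntegratedGradients
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed checkpoint; in practice this would be a fine-tuned readability model.
MODEL_NAME = "CAMeL-Lab/bert-base-arabic-camelbert-mix"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=5)
model.eval()

def forward_logits(input_ids, attention_mask):
    return model(input_ids=input_ids, attention_mask=attention_mask).logits

# Attribute predictions back to the input embedding layer.
lig = LayerIntegratedGradients(forward_logits, model.bert.embeddings)

text = "ذهب الولد إلى المدرسة"
enc = tokenizer(text, return_tensors="pt")
baseline = torch.full_like(enc["input_ids"], tokenizer.pad_token_id)

pred = forward_logits(enc["input_ids"], enc["attention_mask"]).argmax(dim=-1)
attributions = lig.attribute(
    inputs=enc["input_ids"],
    baselines=baseline,
    additional_forward_args=(enc["attention_mask"],),
    target=pred.item(),
)
# One score per token: sum over the embedding dimension, then normalize.
scores = attributions.sum(dim=-1).squeeze(0)
scores = scores / torch.norm(scores)
tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
for tok, s in zip(tokens, scores):
    print(f"{tok}\t{s.item():+.3f}")
```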
I found that transformer-based models perform well in in-domain settings but degrade significantly under cross-domain conditions. This suggests that these models rely heavily on domain-specific semantic and topical cues, which do not generalize across domains. In contrast, statistical classifiers demonstrate relatively stronger cross-domain generalization, likely because they rely more on stable linguistic features than on topic-dependent signals. Moreover, an XGBoost classifier trained on linguistic features outperforms the hybrid and transformer-based approaches in cross-domain evaluation.
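For context, the feature-based baseline can be as simple as the following sketch; the feature matrix, the five readability levels, and the hyperparameters are placeholders, not the tuned thesis configuration.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.random((200, 20))          # 200 texts x 20 handcrafted linguistic features (dummy)
y = rng.integers(0, 5, size=200)   # 5 readability levels (assumed)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = XGBClassifier(n_estimators=300, max_depth=6, learning_rate=0.1,
                    objective="multi:softprob", eval_metric="mlogloss")
clf.fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
# clf.feature_importances_ supports ablation-style analysis of individual features.
```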
The interpretability analysis further reveals that CAMeLBERT captures many of the handcrafted linguistic features implicitly across its layers: lower layers emphasize lexical cues, middle layers capture grammatical features, and higher layers focus on semantic and task-specific information. Attribution analysis with Captum shows that nominal groups are among the most important features for readability prediction, followed by adjectives and verbs. Among morphological features, definiteness (marked by al-, the Arabic definite article "ال"), singular number, and surface-form features are particularly influential.
Degree
MA
College and Department
Humanities; Linguistics
Rights
https://lib.byu.edu/about/copyright/
BYU ScholarsArchive Citation
Alzu'bi, Sarah, "Robust and Interpretable Cross-Domain Arabic Readability Prediction Using Hybrid Modeling: Evidence on the Limitations of Transformer-Based Classifiers" (2026). Theses and Dissertations. 11272.
https://scholarsarchive.byu.edu/etd/11272
Date Submitted
2026-04-22
Document Type
Thesis
Permanent Link
https://arks.lib.byu.edu/ark:/34234/q2ceb10361
Keywords
readability, automatic readability assessment, Arabic language, cross-domain evaluation, interpretability, CAMeLBERT, Captum
Language
English