Abstract

This thesis investigates the impact of fine-tuning the LLaMA 33B language model on partisan news datasets, revealing negligible changes in expressed opinions and underscoring the enduring influence of pretraining data on model viewpoints. Nine models were trained on nine distinct news datasets spanning three topics and two ideologies; across all of them, demographic representation remained consistent, predominantly favoring liberal, college-educated, high-income, and non-religious demographics. Interestingly, partisan news fine-tuning produced a depolarizing effect, suggesting that intense exposure to topic-specific information may lead to depolarization regardless of ideological alignment. Despite exposure to contrasting viewpoints, LLaMA 33B maintained its common sense reasoning ability, showing minimal variance on evaluation metrics such as HellaSwag accuracy, ARC accuracy, and TruthfulQA MC1 and MC2. These results may indicate robustness in common sense reasoning, or alternatively a deficiency in synthesizing diverse contextual information. Ultimately, this thesis demonstrates the resilience of high-performing language models like LLaMA 33B against targeted ideological bias: they retain their functionality and reasoning ability even when subjected to highly partisan information environments.

Degree

MS

College and Department

Physical and Mathematical Sciences; Computer Science

Rights

https://lib.byu.edu/about/copyright/

Date Submitted

2023-08-16

Document Type

Thesis

Handle

http://hdl.lib.byu.edu/1877/etd12897

Keywords

large language models, political news, demographic alignment, common sense reasoning, information synthesis

Language

English
