Abstract
This thesis investigates the impact of finetuning the LLaMA 33B language model on partisan news datasets, revealing negligible changes in model opinions and underscoring the enduring influence of pretraining data. The study trained nine models on nine distinct news datasets spanning three topics and two ideologies, and found that demographic representation remained consistent, predominantly favoring liberal, college-educated, high-income, and non-religious groups. Interestingly, partisan news finetuning produced a depolarizing effect, suggesting that intense exposure to topic-specific information may lead to depolarization regardless of ideological alignment. Despite exposure to contrasting viewpoints, LLaMA 33B maintained its common sense reasoning ability, showing minimal variance on evaluation metrics such as HellaSwag accuracy, ARC accuracy, and TruthfulQA MC1 and MC2. These results may indicate robustness in common sense reasoning or a deficiency in synthesizing diverse contextual information. Ultimately, this thesis demonstrates the resilience of high-performing language models like LLaMA 33B against targeted ideological bias, showing that they retain their functionality and reasoning ability even when subjected to highly partisan information environments.
Degree
MS
College and Department
Physical and Mathematical Sciences; Computer Science
Rights
https://lib.byu.edu/about/copyright/
BYU ScholarsArchive Citation
Shaw, Alexander Glenn, "The Influence of Political Media on Large Language Models: Impacts on Information Synthesis, Reasoning, and Demographic Representation" (2023). Theses and Dissertations. 10059.
https://scholarsarchive.byu.edu/etd/10059
Date Submitted
2023-08-16
Document Type
Thesis
Handle
http://hdl.lib.byu.edu/1877/etd12897
Keywords
large language models, political news, demographic alignment, common sense reasoning, information synthesis
Language
English