Abstract

For the end user, Large Language Models (LLMs) are programs that convert natural language inputs into natural language outputs. In popular usage, this tends to take the form of conversation: a user asks a question, provides information, or gives instructions, and the LLM (hopefully) replies in the manner we would expect of an informed and compliant person. While convenient and intuitive for users, this conversational format encourages the misconception that LLMs learn from conversations, when they do not. This work presents the benefits and practicality of a language model paradigm that meets this user expectation, that is, one that learns from each prompt via real-time fine-tuning. I evaluate the time efficiency, effectiveness, and fluency retention of real-time trained models using standard F1 and MMLU benchmarks. This work finds that real-time model instruction from natural inputs is feasible, though subject to volatility.

Degree

College and Department

Computational, Mathematical, and Physical Sciences; Computer Science

Rights

https://lib.byu.edu/about/copyright/