Abstract

This thesis presents the training of an end-to-end autoencoder model built on the transformer architecture, with an encoder that encodes sentences into fixed-length latent vectors and a decoder that reconstructs the sentences from these latent representations. Encoding sentences into and decoding them from this latent space is central to the model design. This method allows new sentences to be generated by traversing the Euclidean latent space, which makes vector arithmetic on sentences possible. Machines excel at dealing with concrete numbers and calculations but lack an innate infrastructure for understanding abstract concepts such as natural language. For a machine to process language, scaffolding must be provided that makes the abstract concrete. The main objective of this research is to provide such scaffolding so that machines can process human language in an intuitive manner.
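
The abstract describes the architecture only at a high level. Below is a minimal sketch of the general idea in PyTorch: a transformer encoder compresses a sentence to a fixed-length latent vector, a transformer decoder reconstructs it, and points along a line between two latent vectors can be decoded to traverse the space. Everything here is an illustrative assumption rather than the thesis' actual design: the framework choice, the mean-pooled bottleneck, the names SentenceAutoencoder and greedy_decode, and all dimensions.

```python
# Minimal sketch of a transformer sentence autoencoder (all choices are assumptions).
import torch
import torch.nn as nn

class SentenceAutoencoder(nn.Module):
    def __init__(self, vocab_size, d_model=256, nhead=4, num_layers=2, latent_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers)
        self.to_latent = nn.Linear(d_model, latent_dim)    # fixed-length bottleneck
        self.from_latent = nn.Linear(latent_dim, d_model)
        dec_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers)
        self.out = nn.Linear(d_model, vocab_size)

    def encode(self, tokens):
        h = self.encoder(self.embed(tokens))       # (batch, seq, d_model)
        return self.to_latent(h.mean(dim=1))       # mean-pool -> (batch, latent_dim)

    def decode(self, z, tokens):
        memory = self.from_latent(z).unsqueeze(1)  # latent vector as one-step memory
        h = self.decoder(self.embed(tokens), memory)
        return self.out(h)                         # logits over the vocabulary

def greedy_decode(model, z, bos_id=1, eos_id=2, max_len=20):
    """Hypothetical greedy decoding from a latent vector z of shape (1, latent_dim)."""
    tokens = torch.tensor([[bos_id]])
    for _ in range(max_len):
        next_id = model.decode(z, tokens)[:, -1].argmax(dim=-1, keepdim=True)
        tokens = torch.cat([tokens, next_id], dim=1)
        if next_id.item() == eos_id:
            break
    return tokens

# Latent-space traversal: decode points on the straight line between two
# sentence vectors; vector arithmetic such as z_a - z_b + z_c works the same way.
model = SentenceAutoencoder(vocab_size=1000)
z_a = model.encode(torch.randint(0, 1000, (1, 8)))  # stand-in token ids
z_b = model.encode(torch.randint(0, 1000, (1, 8)))
for t in torch.linspace(0, 1, steps=5):
    print(greedy_decode(model, (1 - t) * z_a + t * z_b))
```

An untrained model will of course decode noise; the sketch only shows how a fixed-length bottleneck makes interpolation and arithmetic between sentence vectors mechanically straightforward.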

Degree

MS

College and Department

Physical and Mathematical Sciences; Mathematics

Rights

https://lib.byu.edu/about/copyright/

Date Submitted

2023-04-07

Document Type

Thesis

Handle

http://hdl.lib.byu.edu/1877/etd13138

Keywords

machine learning, deep learning, natural language processing, language modeling, autoencoder, transformer, attention, Jacobian, matrix calculus

Language

English
