Keywords
transformer neural network architecture, long short-term memory, model predictive control
Abstract
Transformer neural networks have revolutionized natural language processing by effectively addressing the vanishing gradient problem. This study applies Transformer models to time-series forecasting and customizes them for simultaneous multistep-ahead prediction in surrogate model predictive control (MPC). The proposed method improves control performance and computational efficiency relative to LSTM-based MPC and to one-step-ahead prediction models built with either LSTM or Transformer networks. The study introduces three key contributions: (1) a new MPC system based on a Transformer time-series architecture, (2) a training method that enables multistep-ahead prediction for time-series machine learning models, and (3) validation of the improved computation time of multistep-ahead Transformer MPC compared to one-step-ahead LSTM networks. Case studies demonstrate a fifteen-fold improvement in computational speed over one-step-ahead LSTM, although the improvement varies with MPC factors such as the lookback window and prediction horizon.
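The sketch below illustrates, in broad strokes, what a simultaneous multistep-ahead Transformer surrogate could look like: a Transformer encoder summarizes a lookback window of past process data, and a single output head emits the entire prediction horizon in one forward pass rather than iterating one step at a time. This is a minimal illustration, not the authors' implementation; the class name MultistepTransformer, the layer sizes, and the lookback/horizon values are assumptions chosen for brevity.

import torch
import torch.nn as nn

class MultistepTransformer(nn.Module):
    """Illustrative surrogate: encode a lookback window, predict the full horizon at once."""
    def __init__(self, n_features, n_outputs, lookback, horizon,
                 d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(n_features, d_model)               # project inputs to model width
        self.pos = nn.Parameter(torch.zeros(lookback, d_model))   # learned positional encoding
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=128,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        # One linear head produces all horizon steps simultaneously (no recursive one-step loop).
        self.head = nn.Linear(lookback * d_model, horizon * n_outputs)
        self.horizon, self.n_outputs = horizon, n_outputs

    def forward(self, x):                        # x: (batch, lookback, n_features)
        h = self.encoder(self.embed(x) + self.pos)
        y = self.head(h.flatten(start_dim=1))    # whole horizon in one pass
        return y.view(-1, self.horizon, self.n_outputs)

# Example (assumed dimensions): 10-step lookback of 3 measured variables,
# 8-step-ahead prediction of 1 controlled variable.
model = MultistepTransformer(n_features=3, n_outputs=1, lookback=10, horizon=8)
y_hat = model(torch.randn(16, 10, 3))            # -> (16, 8, 1)

Emitting the horizon in a single pass is what gives the computational advantage over a one-step-ahead model, which must be called repeatedly inside the MPC optimization loop.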
Original Publication Citation
Junho Park, Mohammad Reza Babaei, Samuel Arce Munoz, Ashwin N. Venkat, John D. Hedengren, Simultaneous multistep transformer architecture for model predictive control, Computers & Chemical Engineering, Volume 178, 2023, 108396, ISSN 0098-1354, https://doi.org/10.1016/j.compchemeng.2023.108396.
BYU ScholarsArchive Citation
Park, Junho; Babaei, Mohammad Reza; Munoz, Samuel Arce; Venkat, Ashwin N.; and Hedengren, John, "Simultaneous Multistep Transformer Architecture for Model Predictive Control" (2023). Faculty Publications. 8240.
https://scholarsarchive.byu.edu/facpub/8240
Document Type
Peer-Reviewed Article
Publication Date
2023-08-22
Publisher
Computers & Chemical Engineering
Language
English
College
Ira A. Fulton College of Engineering
Department
Chemical Engineering
Copyright Status
© 2023 Elsevier Ltd. All rights reserved. This is the preprint version of this article. The definitive version can be found at https://doi.org/10.1016/j.compchemeng.2023.108396.
Copyright Use Information
https://lib.byu.edu/about/copyright/