Keywords

transformer neural network architecture, long short-term memory, model predictive control

Abstract

Transformer neural networks have revolutionized natural language processing by effectively addressing the vanishing gradient problem. This study focuses on applying Transformer models to time-series forecasting and customizing them for a simultaneous multistep-ahead prediction model in surrogate model predictive control (MPC). The proposed method shows improved control performance and computational efficiency compared to LSTM-based MPC and to one-step-ahead prediction models built on either LSTM or Transformer networks. The study introduces three key contributions: (1) a new MPC system based on a Transformer time-series architecture, (2) a training method enabling multistep-ahead prediction for time-series machine learning models, and (3) validation of the improved computational performance of multistep-ahead Transformer MPC compared to one-step-ahead LSTM MPC. Case studies demonstrate a fifteen-fold improvement in computational speed over one-step-ahead LSTM, although the improvement varies with MPC factors such as the lookback window and prediction horizon.
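
The abstract's central idea is that the surrogate model emits the entire prediction horizon in a single forward pass rather than rolling out one step at a time. The following is a minimal sketch of that idea, not the authors' code: a PyTorch Transformer encoder maps a lookback window of past measurements to all horizon steps at once. The class name, layer sizes, and choice of PyTorch are assumptions made for illustration.

    # Minimal sketch (assumed names and sizes, not the published implementation):
    # a Transformer encoder that maps a lookback window to the full prediction
    # horizon in one forward pass ("simultaneous multistep-ahead" prediction).
    import torch
    import torch.nn as nn

    class MultistepTransformer(nn.Module):
        def __init__(self, n_features, n_outputs, lookback, horizon,
                     d_model=64, n_heads=4, n_layers=2):
            super().__init__()
            self.horizon = horizon
            self.n_outputs = n_outputs
            self.embed = nn.Linear(n_features, d_model)                  # project features to model width
            self.pos = nn.Parameter(torch.zeros(1, lookback, d_model))   # learned positional encoding
            layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, n_layers)
            # One linear head emits every step of the horizon at once (no recursive rollout)
            self.head = nn.Linear(d_model, horizon * n_outputs)

        def forward(self, x):                  # x: (batch, lookback, n_features)
            h = self.encoder(self.embed(x) + self.pos)
            y = self.head(h[:, -1, :])         # summarize with the last encoded position
            return y.view(-1, self.horizon, self.n_outputs)  # (batch, horizon, n_outputs)

    # Usage: with a 30-step lookback and a 10-step horizon, one forward pass
    # returns the whole trajectory the MPC optimizer needs.
    model = MultistepTransformer(n_features=3, n_outputs=1, lookback=30, horizon=10)
    y_hat = model(torch.randn(8, 30, 3))       # -> shape (8, 10, 1)

Because the optimizer receives the whole horizon from a single evaluation instead of a sequential one-step rollout, each MPC iteration requires far fewer model calls, which is the source of the speedup reported in the abstract.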

Original Publication Citation

Junho Park, Mohammad Reza Babaei, Samuel Arce Munoz, Ashwin N. Venkat, John D. Hedengren, Simultaneous multistep transformer architecture for model predictive control, Computers & Chemical Engineering, Volume 178, 2023, 108396, ISSN 0098-1354, https://doi.org/10.1016/j.compchemeng.2023.108396.

Document Type

Peer-Reviewed Article

Publication Date

2023-08-22

Publisher

Computers & Chemical Engineering

Language

English

College

Ira A. Fulton College of Engineering

Department

Chemical Engineering

University Standing at Time of Publication

Full Professor
