Monitoring and diagnosis of gas turbines are critical issues in the field of equipment maintenance. Traditional diagnosis methods are built on physical models. However, the complexity and degradation of a gas turbine limit both the comprehensiveness and the accuracy of these physical models, which makes the diagnosis less effective. Data-driven models are therefore introduced to supplement and revise the physical models.
Benefiting from the rapid development of machine learning, neural networks have improved greatly and are now widely used in many areas of data mining. Three neural networks, the Multilayer Perceptron, the Convolutional Neural Network and the Long Short-Term Memory network, are applied to establish the data-driven models. Training time and prediction accuracy are the two most important factors in judging their effectiveness.
Active real-time training, in which training and prediction take place simultaneously, is applied as the main modelling method for an on-line diagnosis system. Three periods are defined along the timeline: the data preparation period, the model establishing period and the stable prediction period. From the three neural networks above, the most effective data-driven models for the last two periods are tested and selected in order to ensure a high level of accuracy.
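The three-period scheme can be illustrated with a minimal sketch. It is not the system described here: the one-variable linear model stands in for the neural networks, and the names `period_for`, `OnlineLinearModel`, `run_stream` and `make_stream`, the period boundaries (20 and 200 samples) and the synthetic data are all assumptions made for illustration. The sketch also assumes that training continues during the stable prediction period.

```python
def period_for(n_seen, prep_size=20, establish_size=200):
    """Map the number of samples seen so far to a period (thresholds are hypothetical)."""
    if n_seen < prep_size:
        return "data_preparation"
    if n_seen < establish_size:
        return "model_establishing"
    return "stable_prediction"


class OnlineLinearModel:
    """Stand-in for a neural network: y = w * x + b, updated one sample at a time."""

    def __init__(self, lr=0.5):
        self.w, self.b, self.lr = 0.0, 0.0, lr

    def predict(self, x):
        return self.w * x + self.b

    def update(self, x, y):
        err = self.predict(x) - y
        self.w -= self.lr * err * x
        self.b -= self.lr * err


def run_stream(stream, prep_size=20, establish_size=200):
    model, buffer, log = OnlineLinearModel(), [], []
    for n_seen, (x, y) in enumerate(stream):
        period = period_for(n_seen, prep_size, establish_size)
        if period == "data_preparation":
            buffer.append((x, y))      # only collect data, no prediction yet
            prediction = None
        else:
            for bx, by in buffer:      # first use of the model: learn the buffered data
                model.update(bx, by)
            buffer.clear()
            prediction = model.predict(x)  # predict before the true value is used
            model.update(x, y)             # then train on the same sample
        log.append((period, prediction, y))
    return model, log


def make_stream(n):
    """Synthetic, noise-free signal y = 0.5 * x + 0.2 (illustrative data only)."""
    for i in range(n):
        x = (i % 10) / 10 - 0.45
        yield x, 0.5 * x + 0.2


model, log = run_stream(make_stream(600))
print(log[-1])
```

Each sample is predicted before its true value is used for training, so the logged predictions reflect what an on-line system would have reported at that moment.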
When a high level of accuracy is demanded, neural networks typically need considerable computing time and memory during the learning process. To avoid prediction delay and keep a rapid response to incoming faults, distributed training on a computer cluster with one master and two workers is designed and applied in this system. Two types of data parallelism are realized on the cluster through Apache Spark and Linux shell scripts. A comparison of the two types with each other and with the local training mode shows that distributing the data first and averaging the parameters at the end achieves the better outcome, combining high accuracy with low training time.
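The idea of distributing the data first and averaging the parameters at the end can be sketched as follows. This is a single-process illustration under stated assumptions, not the cluster implementation: the two "workers" run one after the other, Apache Spark is not used, a one-variable linear model replaces the neural networks, and the helper names (`split_data`, `train_linear`, `average_parameters`) and the synthetic data are hypothetical.

```python
import random


def split_data(data, n_workers):
    """Distribute the samples round-robin, one shard per worker."""
    return [data[i::n_workers] for i in range(n_workers)]


def train_linear(shard, epochs=300, lr=0.5):
    """Fit y = w * x + b on one shard with full-batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(shard)
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in shard) / n
        grad_b = sum((w * x + b - y) for x, y in shard) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def average_parameters(params):
    """Average each parameter across workers once training has finished."""
    n = len(params)
    return tuple(sum(p[i] for p in params) / n for i in range(len(params[0])))


# Synthetic data around y = 2x + 1 with small noise (illustrative only).
random.seed(0)
data = [(i / 50 - 1, 2.0 * (i / 50 - 1) + 1.0 + random.gauss(0, 0.01))
        for i in range(100)]

shards = split_data(data, n_workers=2)                  # step 1: distribute the data
worker_params = [train_linear(s) for s in shards]       # step 2: train independently
w_avg, b_avg = average_parameters(worker_params)        # step 3: average at the end
print(w_avg, b_avg)
```

Because the workers exchange nothing until the final averaging step, communication cost is limited to one transfer of data and one transfer of parameters per worker. For non-linear models such as neural networks, averaging independently trained parameters is not guaranteed to match the result of training on the full data set.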