Applying Design Knowledge and Machine Learning to SCADA Data for Classification of Wind Turbine Operating Regimes
Abstract—Wind turbines operate under non-stationary dynamic loads, to which they constantly adapt by regulating the orientation of the blades and the rotor as well as the generator torque, resulting in characteristic responses (i.e., operating regimes) over a range of operating conditions. We propose a method to classify the operating regimes from coarse-resolution data recorded by the turbine supervisory controller (i.e., data from the SCADA system). It relies on design knowledge and on algorithms for dimensionality reduction and classification. High-resolution acceleration measurements from a custom structural state monitoring system and a data set of several channels from the SCADA system are used for validation. Estimation of the level of damage accumulated on structural components based on the classification of operating regimes is shown as an application.

Index Terms—Supervised classification, unsupervised classification, data clustering, data dimensionality reduction, vibration data, structural health monitoring.

I. INTRODUCTION

Estimating the accumulation of damage on wind turbine support structures in an online manner can be useful to improve maintenance and ensure their structural integrity while maximising system availability. Since wind turbines operate in different regimes according to the wind conditions and their control settings, it is expected that the damage suffered by structural components will be different under different operating regimes. A natural step is to map operating regimes to damage indexes. Condition Monitoring (CM) and Structural Health Monitoring (SHM) systems can provide data to pursue this task.

Data from Supervisory Control and Data Acquisition (SCADA) systems is readily available, since every wind power plant has a SCADA system which permanently interfaces to the hardware of control units. SCADA systems offer information on operational and environmental variables, but they do not offer any information on the structural condition of components such as the tower, foundation and blades. SHM systems can fill this gap; these are mostly custom systems deployed to measure structural response. At the current time, these are only deployed for a limited testing period for prototyping purposes, and do not comprise a standard system like SCADA. An SHM system can consist of accelerometers, strain gauges, and force and displacement measurements.

To gain knowledge of the operating regimes, we propose to tap into coarse-resolution measurements recorded by the SCADA unit and combine them with acceleration measurements from a custom SHM system. This data-driven method combines generic knowledge of how a wind turbine operates (i.e., design knowledge) and algorithms for dimensionality reduction and clustering. Several works exist that use data from the SCADA system for CM and SHM [1–7], fault detection [8–12], and performance assessment [13, 14]. Some of these applied signal processing techniques and learning algorithms such as Artificial Neural Networks (ANN). Häckell et al. [5] applied classification algorithms to data from environmental operating conditions to then normalize SHM data for damage detection purposes. Hu et al. and Bogoevska et al. [1, 7] coupled information from the SCADA system with carefully situated structural sensing monitoring systems for assessing the structural condition under different environmental conditions. However, neither a schema for detecting the operating regime nor a schema for exploiting design knowledge as a means for classifying operating regimes is proposed in these works. In contrast, in the fields of machine learning, artificial intelligence and data mining, several works apply expert knowledge to complement statistical methods. Here, we strive to tap into the coarse-resolution (i.e., 10-min values) data from the SCADA unit because it is accessible. This historical data is typically used for performance assessment of wind farms. Lower-level, higher-resolution data from the control unit, which can also be accessed through the SCADA system, is by contrast hard to handle. Even for the different stakeholders that operate and own wind turbines, it can be burdensome to access and handle. Therefore, this approach might prove to be applicable to current operation and maintenance practices.

Even though it can be advantageous to identify operating regimes, it is also very challenging because support struc-
min time series, once every hour) with tri-axial accelerometers placed on the north, south and east inner faces of the tower at different heights (i.e., 60 m, 70 m, 80 m, and 100 m). Fig. 1 shows the orientation of the axes of the sensors rigged at the tower top, located to the north (y_N, z_N) and east (y_E, z_E). A histogram of peaks found in the power spectrum of these vibration records, as shown in Fig. 2, gives an impression of the different structural responses observed and the predominant vibration modes.

The different responses are mainly associated with changes in the mean wind speed and wind direction, as well as turbulence

Fig. 1. Layout of vibration sensors at tower top (i.e., 100 m from the ground). Two tri-axial vibration sensors are located on the north and east faces. The main wind direction of the site is θ_u ≈ 250°.

The data set from the SCADA system comprises 10-min averages of a large set of sensors and measurements which directly relate to the state of the wind turbine. To provide an overview of the 190 channels, we classified them into five categories: (1) Environmental (e.g., wind speed, ambient temperature); (2) Turbine dynamics (e.g., acceleration, rotor speed, oil pressure); (3) Electrical systems (all power-related measures); (4) Sub-component condition (e.g., temperature); (5) Counters, alarms, and logs (e.g., turbine-on, service time, generator time-on). The first four data classes consist of average, standard deviation and extreme values. Data class 5 includes counters of the number of seconds during which the turbine was in a given state. Fig. 4 shows time series representing each category. The relation between wind conditions and turbine controls is illustrated in
Authorized licensed use limited to: UNIVERSIDADE DE SAO PAULO. Downloaded on July 20,2022 at 02:20:48 UTC from IEEE Xplore. Restrictions apply.
Fig. 5, where P ramps up as u increases. The design rules that we introduce later stem from the relation between θ and ω_r because u tends to have larger measurement uncertainties.

formulation [16], given A_{m×n} find a subspace with dimension k < n where the variance of the projected data is maximized. When the number of features n and the number of samples m are sufficiently large, the most efficient algorithms, such as [17], resort to constructing a singular value decomposition that approximates A:

$$A \approx U \Sigma V^{T} \qquad (1)$$
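As a minimal sketch of how the decomposition in (1) yields the lower-dimensional projection, the block below centres a data matrix, computes a full SVD with NumPy, and keeps the first k right singular vectors. The function name `pca_svd`, the synthetic matrix and the choice k = 1 are illustrative assumptions. The randomized algorithm of [17] is not reproduced here; a plain SVD is swapped in for clarity.

```python
import numpy as np


def pca_svd(A, k):
    """Project the m x n matrix A onto its first k principal components."""
    A = np.asarray(A, dtype=float)
    centered = A - A.mean(axis=0)              # PCA works on zero-mean features
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    components = Vt[:k]                        # directions of largest variance
    scores = centered @ components.T           # coordinates in the k-dim subspace
    explained = s**2 / np.sum(s**2)            # variance fraction per component
    return scores, components, explained[:k]


# Synthetic stand-in for a SCADA feature matrix: three strongly correlated channels.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
A = np.column_stack([t, 2.0 * t, -t]) + 0.01 * rng.normal(size=(200, 3))

scores, components, explained = pca_svd(A, k=1)
```

Because the three synthetic channels are almost perfectly correlated, a single component carries nearly all of the variance, which is the situation in which reducing the dimension before clustering loses little information.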
When the objective is to find a set of classes (i.e., a partition C = {c_1, ..., c_k}) in a data set without prior knowledge of the number of classes or of the class c_k to which each sample x_i belongs, clustering algorithms such as k-means, DBSCAN and the HC family can be applied.

Once initialized with a number of clusters and their centroids, k-means iterates between assigning each x_i to a cluster and adjusting the centroid of each cluster c_k to minimize a distance metric. Several equivalent formulations of the objective function exist [19]; a common aim is to minimize the sum of cluster variances (4).
$$C = \arg\min_{C} \sum_{c_k \in C} 2|c_k| \sum_{j=1 \ldots d} \; \sum_{x_i \in c_k} (x_{ij} - \mu_{ij})^{2} \qquad (4)$$
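The two alternating k-means steps described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the implementation used in the study: the function names, the toy points and the initial centroids are assumptions, and the reported quantity is the unweighted within-cluster sum of squared deviations.

```python
def kmeans(points, centroids, max_iter=100):
    """Lloyd's iteration: assign samples, then update centroids."""
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid (squared Euclidean distance).
        new_labels = [
            min(range(len(centroids)),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def within_cluster_ss(points, labels, centroids):
    """Sum over samples and dimensions of squared deviations from the centroid."""
    return sum(
        (x - m) ** 2
        for p, lab in zip(points, labels)
        for x, m in zip(p, centroids[lab])
    )


# Toy data: two well-separated groups in a two-dimensional feature space.
points = [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
ss = within_cluster_ss(points, labels, centroids)
```

The initial centroids are passed in explicitly, mirroring the statement that k-means starts from a given number of clusters and their centroids; in practice the result can depend on that initialization.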
are calculated using an Euler-Bernoulli beam fixed at one end, with a thin-walled circular cross section for which shear deformation does not play as large a role as bending.

D. State estimation

State estimation can be performed with different resolutions in time and space, from monitoring the instantaneous condition to estimating life-cycle performance. For fatigue, one can estimate damage from rainflow counting of force or stress histories combined with damage accumulation models. Here, damage equivalent loads and damage at the tower base flange are calculated. The accumulated damage is then estimated according to Palmgren-Miner's linear damage accumulation rule (5), where the number of cycles n_i at each stress amplitude S_i is counted with the rainflow-counting algorithm. The corresponding cycles-to-failure N_i(S_i) are found from the material S-N curve.

$$D = \sum_{i=1}^{k} \frac{n_i}{N_i} \qquad (5)$$

IV. ANALYSIS AND RESULTS

To validate the proposed framework, we analyze the vibration data according to operating conditions, derive rules to define operating regimes, test supervised and unsupervised classification, and then illustrate the estimation of fatigue life with the long-term coarse-resolution data.

TABLE I
CLASSIFICATION BASED ON RULES FOR LOWER AND UPPER BOUNDS OF PITCH ANGLE (θ_LB, θ_UB) AND GENERATOR ROTOR SPEED (ω_r,LB, ω_r,UB).

Regime                θ_LB     θ_UB    ω_r,LB    ω_r,UB
Parked [green]        73.0     93.0      -8.0     300.0
Idling [blue]         21.5     26.0      -8.0     355.0
Start [orange]         1.0     21.5      -8.0    1235.0
P_low-mid [brown]     -3.0      1.0     920.0    1550.0
P_lim [red]           -3.0      1.0    1550.0    1865.0
P_max [purple]         1.0     30.0    1500.0    1865.0

Table I shows a set of rules on the ω_g-θ (i.e., design variables) plane that are illustrated in Fig. 7. The corresponding clusters are shown on the power curve and the nacelle acceleration in Fig. 8. We can observe larger fore-aft (ẍ_fa) and side-to-side (ẍ_ss) vibrations as the wind speed increases. The ẍ_ss vibrations seem larger and are more scattered than ẍ_fa, particularly at higher wind speeds where the fore-aft motion has aerodynamic damping. These clusters are used as a reference to compare different clustering alternatives.
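The bounds of Table I translate directly into a lookup on the pitch angle and generator rotor speed. The sketch below is one possible reading of the table, under two assumptions that the text does not state: intervals are treated as half-open (lower bound included, upper bound excluded), and the first matching rule wins. The function name `classify_regime` and the sample points are illustrative; points matching no rule are labeled as outliers, as in the caption of Fig. 7.

```python
# Bounds copied from Table I: (regime, theta_lb, theta_ub, omega_lb, omega_ub).
RULES = [
    ("Parked",    73.0, 93.0,   -8.0,  300.0),
    ("Idling",    21.5, 26.0,   -8.0,  355.0),
    ("Start",      1.0, 21.5,   -8.0, 1235.0),
    ("P_low-mid", -3.0,  1.0,  920.0, 1550.0),
    ("P_lim",     -3.0,  1.0, 1550.0, 1865.0),
    ("P_max",      1.0, 30.0, 1500.0, 1865.0),
]


def classify_regime(theta, omega_r):
    """Return the first regime whose pitch and speed bounds contain the sample."""
    for name, t_lb, t_ub, w_lb, w_ub in RULES:
        # Assumption: half-open intervals, so shared boundaries belong to one rule only.
        if t_lb <= theta < t_ub and w_lb <= omega_r < w_ub:
            return name
    return "outlier"


# Illustrative (pitch angle, generator rotor speed) pairs, not measured data.
samples = [(85.0, 0.0), (0.2, 1200.0), (0.2, 1700.0), (12.0, 1800.0), (50.0, 900.0)]
labels = [classify_regime(theta, omega) for theta, omega in samples]
```

Labels produced this way can serve as the reference partition against which supervised or unsupervised alternatives are compared.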
A. Classification of operating modes and correlation to measured response
The response measured with vibration data is linked to the operating regime observed as follows:
1) Filter and downsample vibration data. From the SCADA system data, discard variables with counters and match time stamps to those of the vibration data.
2) Set rules to define operating regimes based on generic design knowledge and the observed behavior. Verify consistency of the rules on the feature space and the power curve plot.
3) Reduce the dimension of the SCADA system data with PCA, and evaluate how many components represent the data well enough.
4) Apply clustering on the lower-dimensional subspace, and verify the classification on the feature space.
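The time-stamp matching and counter removal of step 1 can be sketched with the standard library alone. The sketch assumes that each SCADA row is stamped with the start of its 10-min interval, which the text does not specify; the channel names (`pitch`, `rotor_speed`, `turbine_on_s`) and the dates are hypothetical placeholders.

```python
from datetime import datetime

COUNTER_CHANNELS = {"turbine_on_s"}  # hypothetical name for a counter channel


def interval_start(timestamp, minutes=10):
    """Floor a timestamp to the start of its averaging interval."""
    return timestamp.replace(
        minute=timestamp.minute - timestamp.minute % minutes,
        second=0,
        microsecond=0,
    )


def match_scada(vibration_starts, scada_rows):
    """Pair each vibration record with the SCADA row of the same interval."""
    by_time = {row["time"]: row for row in scada_rows}
    matched = []
    for start in vibration_starts:
        row = by_time.get(interval_start(start))
        if row is None:
            continue  # no SCADA record for this interval
        # Keep measurement channels only; drop the time stamp and the counters.
        features = {
            name: value
            for name, value in row.items()
            if name != "time" and name not in COUNTER_CHANNELS
        }
        matched.append((start, features))
    return matched


# Synthetic rows and record start times, for illustration only.
scada_rows = [
    {"time": datetime(2000, 1, 1, 12, 0), "pitch": 0.5, "rotor_speed": 1600.0, "turbine_on_s": 600},
    {"time": datetime(2000, 1, 1, 12, 10), "pitch": 0.7, "rotor_speed": 1650.0, "turbine_on_s": 600},
]
vibration_starts = [datetime(2000, 1, 1, 12, 0, 3), datetime(2000, 1, 1, 13, 0, 2)]
matched = match_scada(vibration_starts, scada_rows)
```

Vibration records without a matching SCADA interval are skipped rather than interpolated, so every retained pair refers to the same 10-min window.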
In this process, several heuristic rules based on knowledge of the wind turbine operation are tested, and it is observed that the modes of vibration measured with the accelerometers (illustrated in Figs. 2 and 3) show a good relation to different subsets of SCADA system variables. The magnitude and the peak location on the PSD of the time series correlate with operating regimes at increasing u. The vibration data show higher-magnitude vibrations at higher u, up to the power limitation region where, although the steady-state aerodynamic thrust decreases, fluctuations of the acceleration remain high. The location of these peaks is linked to the passing frequencies of the blades and to the modes of vibration of the tower. Table I lists these rules.

Fig. 7. Clusters based on rules from Table I. The plot also shows the minimum and maximum ω_r. Data points outside the defined clusters are assigned as outliers.

B. Supervised and unsupervised classification
A few classification workflows are defined based on a supervised algorithm (nearest neighbors) and an unsupervised clustering algorithm (k-means), which are applied to a training data set corresponding to the month of March. Table II summarizes the main inputs, settings and a score of the model fit. The main settings of the clustering algorithms are defined heuristically, by performing a few trials to achieve a good score without over
TABLE II
SETTINGS AND FIT OF SUPERVISED AND UNSUPERVISED CLASSIFICATION.
Fig. 10. Mean per operating regime of damage equivalent load S_o, normalized by the value in regime P_max. Accelerations from sensors conditional on wind direction are used to estimate fore-aft and side-to-side loads.

from the classification of the complete 5-year data set, Fig. 11 shows the evolution of the estimated S_o and u for a period of 3 months.

Fig. 11. Wind and damage equivalent loads from October to December of 2015.

D. Summary
• Most of the time the tower experiences stress cycles much smaller than the endurance limit of the steel. Operating regimes at higher wind speed tend to show larger damage relative to those at lower wind speeds.
• Fore-aft motion yields the highest damage to the turbine, although the side-to-side motion can show large peak values of acceleration because there is practically no aerodynamic damping in that direction.
• Larger vibrations, with larger damage values, correlate with operating regimes at higher wind speeds. However, once rated power is reached, further classification and accounting for the change in steady-state thrust is necessary to accurately describe the damage tendency in that range.
• Abnormal vibration episodes, during which the turbine experiences higher than normal vibrations in very short intervals, are not easily detected using 10-min values from the SCADA system. Using higher-resolution data and extending the vibration data set would improve statistical accuracy.
• More detailed analyses should include estimation of forces from accelerometer measurements using advanced estimation methods and use of higher-fidelity numerical models. Regarding classification, a formal evaluation of clustering alternatives (i.e., algorithm settings, multi-stage classification) is necessary to improve performance.
• The design variables used for classification are a better indication of the operating regime than environmental variables, which hold larger uncertainty.

V. CONCLUSIONS

Wind turbine support structures show different characteristic dynamic responses depending on the wind conditions and the control settings. These operating regimes correlate well with the measured response of the structure. We propose an approach where operating regimes are classified based on generic design knowledge and clustering methods. Aspects of this approach that are explored include reduction of the dimensionality of the data set and classification with supervised and unsupervised methods. Principal component analysis reduces the dimensionality of the data. The subsequent unsupervised classification can be realized in the lower-dimensional space. A potential application is demonstrated by estimating the fatigue state of a structural component from several years of SCADA system data. The estimated states can be utilized to calibrate models of operation and maintenance of wind turbine structures. Further research will involve the application of the devised framework at the wind farm level, where the operating conditions and responses might be highly diversified amongst turbines.

ACKNOWLEDGEMENT

The authors would like to gratefully acknowledge the support of the European Research Council via the ERC Starting Grant WINDMIL (ERC-2015-StG #679843) on the topic of Smart Monitoring, Inspection and Life-Cycle Assessment of Wind Turbines. REpower is acknowledged for facilitating data and access to a wind turbine for instrumentation.

REFERENCES

[1] W.-H. Hu, S. Thöns, R. G. Rohrmann, S. Said, and W. Rücker, “Vibration-based structural health monitoring of a wind turbine system part II: Environmental/operational effects on dynamic properties,” Engineering Structures, vol. 89, pp. 273–290, 2015.
[2] M. Spiridonakos, Y. Ou, E. Chatzi, and U. Reiter, “Wind turbines structural identification framework for the representation of both short- and long-term variability,” in SHMII 2015 - 7th International Conference on Structural Health Monitoring of Intelligent Infrastructure, A. De Stefano, Ed., 2015.
[3] N. Noppe, A. Iliopoulos, W. Weijtjens, and C. Devriendt, “Full load estimation of an offshore wind turbine based on SCADA and accelerometer data,” in Journal of Physics: Conference Series, vol. 753, no. 7. IOP Publishing, 2016, p. 072025.
[4] J. Helsen, G. De Sitter, and P. J. Jordaens, “Long-term monitoring of wind farms using big data approach,”
in Big Data Computing Service and Applications (BigDataService), 2016 IEEE Second International Conference on. IEEE, 2016, pp. 265–268.
[5] M. W. Häckell, R. Rolfes, M. B. Kane, and J. P. Lynch, “Three-tier modular structural health monitoring framework using environmental and operational condition clustering for data normalization: Validation on an operational wind turbine system,” Proceedings of the IEEE, vol. 104, no. 8, pp. 1632–1646, 2016.
[6] P. Sun, J. Li, C. Wang, and Y. Yan, “Condition assessment for wind turbines with doubly fed induction generators based on SCADA data,” Journal of Electrical Engineering & Technology, vol. 12, no. 2, pp. 689–700, 2017.
[7] S. Bogoevska, M. Spiridonakos, E. Chatzi, E. Dumova-Jovanoska, and R. Höffer, “A data-driven diagnostic framework for wind turbine structures: A holistic approach,” Sensors, vol. 17, no. 4, p. 720, 2017.
[8] A. Zaher, S. McArthur, D. Infield, and Y. Patel, “Online wind turbine fault detection through automated SCADA data analysis,” Wind Energy, vol. 12, no. 6, pp. 574–593, 2009.
[9] K. Kim, G. Parthasarathy, O. Uluyol, W. Foslien, S. Sheng, and P. Fleming, “Use of SCADA data for failure detection in wind turbines,” in ASME 2011 5th International Conference on Energy Sustainability. American Society of Mechanical Engineers, 2011, pp. 2071–2079.
[10] M. Schlechtingen and I. F. Santos, “Comparative analysis of neural network and regression based condition monitoring approaches for wind turbine fault detection,” Mechanical Systems and Signal Processing, vol. 25, no. 5, pp. 1849–1875, 2011.
[11] I. Al-Tubi, H. Long, P. Tavner, B. Shaw, and J. Zhang, “Probabilistic analysis of gear flank micro-pitting risk in wind turbine gearbox using supervisory control and data acquisition data,” IET Renewable Power Generation, vol. 9, no. 6, pp. 610–617, 2015.
[12] Y. Qiu, Y. Feng, J. Sun, W. Zhang, and D. Infield, “Applying thermophysics for wind turbine drivetrain fault diagnosis using SCADA data,” IET Renewable Power Generation, vol. 10, no. 5, pp. 661–668, 2016.
[13] E. Papatheou, N. Dervilis, A. E. Maguire, I. Antoniadou, and K. Worden, “A performance monitoring approach for the novel Lillgrund offshore wind farm,” IEEE Transactions on Industrial Electronics, vol. 62, no. 10, pp. 6636–6644, 2015.
[14] N. Mittelmeier, T. Blodau, and M. Kühn, “Monitoring offshore wind farm power performance with SCADA data and an advanced wake model,” Wind Energy Science, vol. 2, no. 1, pp. 175–187, 2017. [Online]. Available: http://www.wind-energ-sci.net/2/175/2017/
[15] S. Wold, K. Esbensen, and P. Geladi, “Principal component analysis,” Chemometrics and Intelligent Laboratory Systems, vol. 2, no. 1-3, pp. 37–52, 1987.
[16] C. M. Bishop, “Pattern recognition,” Machine Learning, vol. 128, pp. 1–58, 2006.
[17] N. Halko, P.-G. Martinsson, Y. Shkolnisky, and M. Tygert, “An algorithm for the principal component analysis of large data sets,” SIAM Journal on Scientific Computing, vol. 33, no. 5, pp. 2580–2594, 2011.
[18] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, “Scikit-learn: Machine learning in Python,” Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011.
[19] H.-P. Kriegel, E. Schubert, and A. Zimek, “The (black) art of runtime evaluation: Are we comparing algorithms or implementations?” Knowledge and Information Systems, pp. 1–38, 2016.