Local Anomaly Detection by Application of Regression Analysis on PMU Data
Rafferty, M., Brogan, P., Hastings, J., Laverty, D., Liu, X., & Khan, R. (2018). Local Anomaly Detection by
Application of Regression Analysis on PMU Data. In Proceedings of Power and Energy Society General Meeting
(PESGM), 2018 (IEEE Power & Energy Society General Meeting: Proceedings). Institute of Electrical and
Electronics Engineers Inc.
Published in:
Proceedings of Power and Energy Society General Meeting (PESGM), 2018
Document Version:
Peer reviewed version
Publisher rights
© 2018 IEEE.
Abstract—PMU data has the potential to provide a wealth of information on power system operation, health, faults and anomalies. PMUs tend to provide tens of measurements per second, so automated anomaly detection is required, especially for use in real-time or near-real-time applications by power system operators. This paper demonstrates a method of detecting local anomalies in PMU data using multiple linear regression. A window of near-time data is employed to generate a regression function that predicts the live data as it arrives. If the error between the observed and predicted values exceeds a threshold, an exception is noted. The threshold is dynamically updated based on the error in the regression function, allowing the method to work equally well on data of varying regularity. This anomaly detection method is not tuned to particular events and should detect novel occurrences. The method is evaluated on two numerical case studies: a genuine power system event and a man-in-the-middle (MITM) cyber attack. Real data was collected from PMUs placed on the Irish power system.

Index Terms—Anomaly detection, PMU, machine learning, linear regression, cyber security

I. INTRODUCTION

Many methods exist for event detection and these are often employed in digital fault recorders (DFR). A typical method is to set simple thresholds for frequency, RoCoF, sequence components, harmonics or magnitude variations. Care must be taken when choosing thresholds, as the region between missing events and near-constant triggering can be small. Typical thresholds vary not only between power systems but also between sites, meaning each placement may need to be separately evaluated and monitored, which prevents large-scale rapid deployment. The method demonstrated in this paper is adaptive to the local environment, both spatially and temporally, as the event threshold is updated based on the data recorded in the previous seconds.

As well as legitimate power system events, other anomalies can appear in the synchrophasor information due to 'bad data'. This can happen in a number of ways. For example, a missing GPS signal can cause deviation in phase-angle estimation due to a misaligned phase-locked loop. Information on GPS lock is usually sent in the PMU data stream, but unless this is processed by the application, the error can persist through the phasor estimation stage. Hardware or software related data acquisition errors are also possible due to hardware failure, or where mis-configured analog-to-digital converter (ADC) modules or software processing (pre-phasor estimation) lead to errors in the output of the phasor-estimation section of the PMU.

Another, more sinister possibility is that the PMU data has been deliberately modified in transit. The most prevalent standard in use today for synchrophasor transport is IEEE C37.118.2 (2005 or 2011) [1]. This PMU data transport protocol contains no security mechanisms at all [2] and is highly susceptible to 'in-transit' manipulation. Considering the historic use of PMU data for post-fault analysis, this may not seem like much of an issue. However, PMU data has the potential to become widely used for real-time analysis and as an input to advanced control systems, therefore this type of attack vector requires consideration and system design.

Whether it is a legitimate event, an unlikely data-acquisition-level fault, or an even less likely targeted data-stream manipulation, the method presented in this paper detects anomalous activity that may cause inappropriate PMU response. It will allow anomalous data to be quickly 'flagged' for further analysis.

To achieve high accuracy in the detection of anomalies in PMU data, this paper has (1) employed Multiple Linear Regression Analysis for the prediction of a power system variable based on previous measurements and other recorded variables in the power system; (2) developed a metric for the threshold of detecting anomalies in PMU data based on the difference between observed and predicted values; (3) implemented a sliding window approach to the methodology to dynamically update thresholds for anomaly detection using current, relevant power system parameters; (4) validated the proposed method through two numerical case studies of anomalous data collected from the Irish power system: a genuine power system event and a MITM attack on the system frequency measurement in PMU data.

II. MACHINE LEARNING FOR ANOMALY DETECTION

Machine learning [3] is described as the construction of methods which automatically improve with experience. These can be split into two main types, supervised and unsupervised. Supervised learning is when the training data consists of examples of inputs and corresponding targets (typical problems are classification and regression), whilst unsupervised learning is when the training data consists of inputs only, with targets unknown but worked out by the algorithm (clustering is a typical problem). A typical machine learning problem consists of two main phases: learning and prediction.
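The learning and prediction phases can be illustrated with a minimal sketch. It assumes an ordinary least-squares line fit as the supervised learner; the function names `learn` and `predict` and the toy data are hypothetical and are not taken from the paper.

```python
def learn(xs, ys):
    """Learning phase: least-squares intercept b0 and slope b1 for y = b0 + b1*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def predict(model, x):
    """Prediction phase: apply the learned coefficients to a new input."""
    b0, b1 = model
    return b0 + b1 * x


# Toy training data (assumed): targets follow y = 1 + 2x exactly.
model = learn([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
next_value = predict(model, 4.0)
```

Here the training pairs play the role of inputs and corresponding targets, and `predict` is only called once the coefficients have been learned.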
[...] that is used to determine the relationship between two (simple [...] of regression analysis is to determine the value of parameters for a function that gives the best-fit line through a set of observations (learning phase), thus allowing the prediction of one variable based on the other recorded variables and the regression model (prediction phase). In this section the two most basic and commonly used regression types are presented: simple and multiple linear regression.

A.1. Simple Linear Regression

In this case the model is bivariate, and the relationship between one independent variable (predictor), $x$, and a dependent variable (response), $Y$, is given by:

$$Y = \beta_0 + \beta_1 x + \varepsilon \tag{1}$$

$\beta_0$ and $\beta_1$ represent the model coefficients, with $\beta_0$ representing the intercept and $\beta_1$ the coefficient of $x$ (or the parameter of the slope), and $\varepsilon$ the error in the model. A regression model can then be determined by learning the values of $\beta_0$ and $\beta_1$ from equation 1 for a training dataset of $x$ and $y$ values. Using the regression model, a new $Y$ sample can be predicted using the corresponding $x$ value for the sample and equation 1.

A.2. Multiple Linear Regression

Multiple predictors are used in the model to build the relationship between predictors and response variables. Denoting a set of samples in a dataset at the $i$-th sample instant as $z_i \in \Re^{1 \times n}$, a data matrix $Z \in \Re^{i \times n}$ can be constructed with each row representing a sample, where $i$ is the number of samples and $n$ the number of variables in the matrix $Z$. If a single variable is selected as a response ($y$) and the other variables selected as predictors ($x_1, x_2, x_3 \dots x_{n-1}$), then for the $i$-th sample the regression equation is given as:

$$y_i = \beta_0 + \beta_1 x_{i,1} + \beta_2 x_{i,2} + \beta_3 x_{i,3} + \dots + \beta_{n-1} x_{i,n-1} + \varepsilon_i \tag{2}$$

where, similarly as before, the $\beta$ values represent the model coefficients, with $\beta_0$ representing the intercept, and $\beta_1 \dots \beta_{n-1}$ representing the coefficient of each element of $X$, where $X = x_{i,1} \dots x_{i,n-1}$ respectively, and $\varepsilon_i$ the error. Again, once the values of $\beta$ have been calculated for a training dataset of $X$ and $y$ values, a regression model can then be used for the prediction of $y$ values given $X$.

[...]
• Rate of change of frequency (RoCoF)
• Rate of change of voltage (RoCoV)
Response
• Rate of change of phase (RoCoP)

After the initial regression model has been constructed, it can be used to predict the next response value from the predictor values.

B.1. Sliding Window Regression

An adaptive regression technique was required to allow the model to be re-trained, as power system conditions are continuously changing. The methodology incorporates a sliding window [5] approach. The model is continuously retrained as new data is received. Important parameters for optimization are window size and how often model retraining occurs.

B.2. Anomaly Detection - 68–95–99.7 rule

In the literature, the method of detecting anomalous data by calculating standard deviation from the mean has been extensively researched [6], and is known as the 68-95-99.7 rule. In normally distributed datasets the amount of data that lies within 1 standard deviation of the mean is 68%, within 2 standard deviations is 95% and within 3 standard deviations is 99.7%.

The standard deviation is calculated from the RMS error between the observed and predicted data within the sliding window, and is expressed as:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}} \tag{3}$$

where $y_i$ and $\hat{y}_i$ are the observed and predicted values for the $i$-th observation and $n$ the number of samples.

The choice of the number of standard deviations from the mean of RMS error was investigated, and findings are presented in Fig 4. This figure displays an initial regression model for normal power system operating data, with the top plot showing the observed and predicted RoCoP values whilst the bottom shows the difference and also a number of standard [...]
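The sliding-window regression of Section B.1 and the RMS-error threshold of Section B.2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`fit_linear`, `predict`, `detect`), the window length of 50 samples, retraining at every sample, and the use of NumPy's least-squares solver are all choices made here for illustration. The multiplier `k = 3` follows the 3 × RMS error threshold stated in the conclusions. The sketch also does not hold the thresholds constant during an event, which the paper describes doing.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares estimate of [b0, b1, ..., b_{n-1}] in equation (2)."""
    A = np.column_stack([np.ones(len(X)), X])  # intercept column for b0
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta


def predict(beta, X):
    """Predicted response for each row of predictors X."""
    return beta[0] + X @ beta[1:]


def detect(X, y, window=50, k=3.0):
    """Flag sample t when |y[t] - y_hat[t]| exceeds k times the RMS error
    (equation (3)) of a model trained on the preceding `window` samples."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    flags = np.zeros(len(y), dtype=bool)
    for t in range(window, len(y)):
        Xw, yw = X[t - window:t], y[t - window:t]
        beta = fit_linear(Xw, yw)                 # retrain on the window
        rmse = np.sqrt(np.mean((yw - predict(beta, Xw)) ** 2))
        error = y[t] - predict(beta, X[t:t + 1])[0]
        flags[t] = abs(error) > k * rmse          # dynamic threshold
    return flags
```

In use, the columns of `X` would hold the predictor variables and `y` the response (RoCoP in the paper). Because the threshold is recomputed from each window, it widens on noisy data and tightens on regular data without site-specific tuning.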
Fig. 3: Overview of MITM attack on PMU Data

[Plot: RoCoP, Actual vs. Predict (top); Difference (bottom)]

[...] effective it is in the prediction of itself. This is illustrated [...]

[...] typical line trip event that was recorded on the PMU situated at Queen's. A plot of two of the chosen predictor variables (frequency and voltage) is shown in Fig 5 (a). It can be seen from this figure that up until t ≈ 25 seconds both system frequency and voltage vary (as expected) around their nominal values. At t ≈ 25 seconds the system frequency decreases sharply from 49.98 Hz to 49.84 Hz over 0.1 seconds. The frequency then returns to nominal again before experiencing an additional fall to 49.89 Hz over 0.9 seconds, before rising again and returning to nominal frequency around t = 42 seconds. Similar to frequency, voltage also experiences a sudden drop from 251.3 to 250.9 volts in 0.1 seconds, before a further voltage drop to 250.3 volts is registered at the PMU, agreeing with the frequency plot previously analyzed.

Monitoring results from the observed and predicted RoCoP are displayed in Fig 5 (b), which shows that pre-event there are small differences between observed and predicted RoCoP values; however, these fall below the set threshold for anomaly detection in the PMU data. Agreeing with the variable plots in Fig 5, the method detects anomalous data when thresholds are violated at t ≈ 25 seconds, when the difference changes from 0.1 to -0.9. It can also be observed from the monitoring results in Fig 5 (b) that whilst the event is occurring, anomaly thresholds are held constant so as to avoid contamination of the model by using event data in its construction. This allows a proper return to nominal value for the variables to be calculated at t = 42 seconds, again agreeing with the variable plot in Fig 5. It should be noted that in the voltage variable plot the measurement from t = 50 - 60 seconds is much lower than some of the event voltage; however, as can be seen in the result plot, this does not cause the triggering of any anomaly.

Fig. 5: Case 1: (a) frequency and voltage plots for line trip event, and (b) comparison of observed and predicted RoCoP values (top plot), and anomaly detection results showing the difference between observed and predicted values and thresholds for detection (bottom plot)

2) Case 2 - MITM Attack on Frequency Measurement: The modification of system frequency was implemented using the MAC-spoofing MITM technique described in Section III. The attacker script is first set to the desired ramp and test period. The attacker PC is then plugged into the same physical network as the sending (or receiving) PMU; this would emulate a physical security breach in a substation or where the data is received. Once the script is executed, it detects PMU data streams (using the common TCP port used for PMU data and the known footprint, or signature, of C37.118.2 data). The system then initiates a MITM attack through spoofing operations and places itself as a 'pathway' in the data stream. Once the path is established, the attacker can then initiate the manipulation of the data, taking the incoming data points, one at a time, and modifying them by the correct amount to replicate attack parameters. The modified packet is then sent to the intended recipient, where there are no indications that any data modification has taken place. This attack action can be applied to any of the system variables and over an indiscriminate amount of time.

Displayed in Fig 6 (a) is a MITM attack that targeted live frequency data being reported by a PMU. In this case the frequency was increased from 49.5 Hz to 55 Hz over a period of 5 seconds. This type of frequency ramp could be interpreted as an islanding event (when a distributed generator continues to energize local loads after isolation from the main system), where generation is higher than load. If distributed generation assets were using this data for islanding protection, then they might disconnect, or the system operator might intentionally isolate that part of the network.
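For offline testing of a detector, a frequency ramp with the Case 2 parameters can be imposed on a recorded or synthetic frequency series. The sketch below is a hypothetical test-signal generator and does not reproduce the attacker script used in the paper: the function name `apply_ramp`, the reporting rate of 50 frames per second, and the choice to hold the offset after the ramp ends are all assumptions.

```python
def apply_ramp(freq, fs, start_s, duration_s, delta_hz):
    """Return a copy of `freq` with a linear ramp of `delta_hz` added,
    starting at `start_s` seconds and lasting `duration_s` seconds.
    The full offset is held after the ramp ends (an assumption)."""
    length = int(round(duration_s * fs))
    if length <= 0:
        raise ValueError("duration_s * fs must give at least one sample")
    start = int(round(start_s * fs))
    out = list(freq)
    for i in range(start, len(out)):
        fraction = min((i - start) / length, 1.0)  # 0 at start, 1 at ramp end
        out[i] += delta_hz * fraction
    return out


# Synthetic flat series at 49.5 Hz, 20 s at an assumed 50 frames per second,
# ramped by 5.5 Hz (49.5 Hz to 55 Hz) over 5 s starting at t = 10 s.
baseline = [49.5] * 1000
ramped = apply_ramp(baseline, fs=50, start_s=10, duration_s=5, delta_hz=5.5)
```

Feeding such a series to the detector alongside the unmodified predictors gives a repeatable way to check that a ramp of this size is flagged, without touching a live data stream.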
Fig. 6: Case 2: (a) frequency and voltage plots for frequency intrusion with insert of pre-anomaly system frequency also shown, and (b) anomaly detection results for frequency intrusion with inserts showing pre-anomaly monitoring

Anomaly detection monitoring results are presented in Fig 6 (b). The insert plots show results pre-anomaly, and it can be seen that there is a small difference between observed and predicted RoCoP values (top plot); as expected, these fall between the RMS error thresholds previously selected during the initialization phase of the methodology (bottom plot). It can be observed from the lower plot in Fig 6 (b) that at t = 50 seconds the difference between observed and predicted RoCoP values begins to exceed the set RMS error threshold, and once a genuine anomaly has been detected in the PMU data, thresholds are held again. At t = 63 seconds the methodology detects the end of the anomaly, which corresponds with the system frequency plot in Fig 6 (a). Once the anomalous data has been cleared from the system, anomaly thresholds begin to update again and the difference between observed and predicted RoCoP falls between calculated thresholds.

IV. CONCLUSIONS AND FUTURE WORK

In this paper a methodology for the detection of local power system anomalies is proposed. The multiple linear regression method employs frequency, phase angle and voltage magnitude, collected from PMUs located across the Irish power system. Two case studies are presented as preliminary evaluations of the methodology; they demonstrate the potential for real-time applications and historical analysis. Setting a threshold of 3 × the nominal RMS error was demonstrated as a reliable method of detecting anomalies in the PMU data.

The method proposed has several advantages including self [...] 'flagging mechanism' has applications for real-time PMU data that could be applied at the central processing level or at remote locations. Flagging is used to highlight anomalies in data streams, both anticipated and unanticipated, before the relevant data is forwarded for more computationally intensive event categorization. The flagging mechanism could also be applied as a method of highlighting suspicious data. For example, [...] conditions for anomaly detection and clearing are at present ad hoc, such as using a 0.5 second time delay before triggering a genuine anomaly. In practice, this may be slow and in some cases genuine anomalies may occur for less than this delay. While this delay is an arbitrary value, future investigation will determine a more optimal solution. At present this regression method has only been applied to RoCoP; the next step will be to apply it to other system variables. The method described can easily be applied to frequency, phase angle, voltage and their derivatives simultaneously; this may also give insight into the nature of the anomalies.

Future work will look at some enhancements to improve and further validate the methodology presented. Firstly, an investigation into optimal window size and frequency of regression model retraining will be carried out. Secondly, the simultaneous regression technique on multiple system variables will be carried out to identify whether it can yield more robust detection of anomalies, anomaly categorization and identification of MITM attacks. Finally, the results from multiple locations will be combined to develop this from a local to a wide-area detection method.

ACKNOWLEDGMENT

The research is supported by a British Council Newton Institutional Links Programme grant with Helwan University, Egypt.

REFERENCES

[1] "IEEE standard for synchrophasor data transfer for power systems," Dec. 28 2011, IEEE C37.118.2-2011 (Revision of IEEE Std C37.118-2005), doi: 10.1109/IEEESTD.2011.6111222.
[2] R. Khan, K. McLaughlin, D. Laverty, and S. Sezer, "Analysis of IEEE C37.118 and IEC 61850-90-5 synchrophasor communication frameworks," in Power and Energy Society General Meeting (PESGM), 2016. IEEE, 2016, pp. 1–5.
[3] T. M. Mitchell, Machine Learning. WCB/McGraw-Hill, 1997.
[4] S. Chatterjee and A. S. Hadi, Regression Analysis by Example. John Wiley & Sons, 2015.
[5] R. Akerkar, Big Data Computing. CRC Press, 2013.
[6] D. Ruan, G. Chen, E. E. Kerre, and G. Wets, Intelligent Data Mining: Techniques and Applications. Springer Science & Business Media, 2005, vol. 5.
[7] D. M. Laverty, R. J. Best, P. Brogan, I. Al Khatib, L. Vanfretti, and D. J. Morrow, "The OpenPMU platform for open-source phasor measurements," IEEE Trans. on Instrumentation and Measurement, vol. 62, no. 4, pp. 701–709, 2013.
[8] X. Zhao, D. M. Laverty, A. McKernan, D. J. Morrow, K. McLaughlin, and S. Sezer, "GPS-disciplined analog-to-digital converter for phasor measurement applications," IEEE Transactions on Instrumentation and Measurement, 2017.