Improving Reliability of IEEE-1588 in Substation Automation Based on Clock Drift Prediction

Yin Xiao

Abstract
An electric substation is a node in the power grid network. It serves the purpose of transmitting and distributing electric energy from power sources to consumers. An electric substation is made of primary equipment and secondary equipment. The secondary equipment aims at protecting and controlling the primary one by sensing and analyzing various data. One prerequisite for performing efficient protection functions is to have synchronized data provided by the various devices. The IEEE-1588 protocol is one promising way to handle the synchronization requirements of tomorrow's substation automation. However, one of the remaining issues is its lack of reliability in case of the loss of the GPS signal (e.g., due to atmospheric disturbances or failure of the GPS antenna), which would lead to the de-synchronization of the devices inside a substation or between different substations. The assignment of this master thesis project, commissioned by ABB CRC in Baden, is to investigate different clock drift prediction techniques which can handle the loss of the GPS signal, the loss of the GPS antenna receiver, or the loss of the grand master device, thereby keeping the substation automation synchronized without the GPS signal. Various linear and nonlinear models of time series prediction are explored in Matlab; five main approaches based on arithmetic average, weighted average and delay coordinate embedding are eventually chosen and developed in combination with an existing open source implementation of IEEE-1588, PTPd. The five approaches were evaluated and have shown good results. Evaluation experiments run in our laboratory identify the most suitable technique for each type of GPS signal loss duration. On one hand, an arithmetic average based prediction technique can easily reach an accuracy of less than 10 microseconds for a prediction duration of a couple of seconds at a minimal computing cost. On the other hand, a time series-based prediction technique can provide an accuracy of 76 microseconds over a period of 48 hours, but at a much higher computing power cost.

Keywords: IEEE-1588, Precision Time Protocol, Reliability, Substation Automation, Time Series, Prediction.
Supervisor (Handledare): Jean-Charles Tournier. Subject reviewer (Ämnesgranskare): Bengt Jonsson. Examiner (Examinator): Anders Jansson. IT 08 039. Printed by: Reprocentralen ITC.
Contents

1 Introduction
1.1 Consideration and Background
1.2 Problem Statement
1.3 Aims and Objectives
1.4 Dissertation Organization
2 State of the Art of Precision Clock Prediction
2.3 Time Series Prediction
2.3.1 Characterization of Time-Series
2.3.2 Time-Series Analysis Techniques
3.2 Arithmetic average and weighted average approaches
3.3 The Procedure of Time Series Prediction
3.4 Delay Coordinate Embedding
4.2 Overview of PTPd and the Prediction Extension
4.3 Software Architecture
5.2 Overview of Experiments Settings
5.3 Results and Evaluation
5.4 Comparison of Fast Forecasting Methods
5.5 Conclusion
6 Conclusion and Future Work
6.1 Conclusion
6.2 Future Work
Appendix
A.1 Data Dictionary and Pseudo Code
A.2 Guideline for Users
List of Figures

1.1 Representation of a substation from a IEEE1588 point of view
1.2 Configuration of a substation after the failure of the master clock
1.3 State machine of the prediction mechanism
3.1 The Procedure of Time Series Signals Prediction
4.1 Structure of PTPd with its build-on
4.2 Message paths in the PTPd system
4.3 System diagram of clock servo
5.1 Samples of history Drift Data
5.2 The actual and predicted offsets time series using arithmetic average, weighted average and direct extrapolation
5.3 The actual and predicted offsets time series using AR model and log transformation
5.4 Evolution of the criteria for the five techniques over a varying horizon prediction
5.5 Evolution of the criteria for the five techniques over a varying horizon prediction
5.6 Evolution of Processing time for the five techniques over a varying horizon prediction
List of Tables

2.1 Classification of Time Series
2.2 Classification of Time Series Analysis Techniques
2.3 Properties of Different Models
1
Introduction
1.1 Consideration and Background
An electric substation [1] is a node in the power grid network. It serves the purpose of transmitting and distributing electric energy from power sources to consumers, such as households or industrial plants. An electric substation is made of primary equipment (switchgears, breakers, transformers) and secondary equipment (sensors, merging units, intelligent electronic devices). The secondary equipment aims at protecting and controlling the primary one by sensing and analyzing various data. One prerequisite for performing efficient protection functions is to have synchronized data provided by the various devices. Depending on the considered function, the synchronization is either local, i.e., the devices of one substation are synchronized (e.g., busbar protection function), or global, i.e., the devices of two different substations are synchronized (e.g., line differential protection function). Moreover, the synchronization requirement ranges from 10 μs to 100 μs depending on the considered protection function.
One promising way to handle the synchronization requirements in tomorrow's substation automation is to use the IEEE 1588 Precision Time Protocol [2]. It enables precise synchronization of clocks in measurement and control systems implemented with technologies such as network communication, local computing, and distributed objects. The protocol is based on the master-slave paradigm in order to evaluate the relative offset and drift of each connected slave. One interesting feature of the protocol is its self-configurability, which allows dynamically adding or removing any participating device (either master or slave) by electing the best available clock at runtime. From a time synchronization point of view, a typical substation automation system has the following architecture (figure 1.1). A GPS signal is received by a network device, called the grand master clock, inside the SA. The grand master clock transmits the GPS time using the IEEE 1588 protocol to the devices (mostly IEDs, but also station PCs, gateways, transient fault recorders) connected to the network over TCP/IP.
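The master-slave offset evaluation mentioned above relies on the timestamped Sync and Delay_Req message exchange defined by IEEE 1588. A minimal sketch of the standard delay request-response computation follows; the function names are illustrative and not taken from PTPd:

```c
#include <assert.h>

/* Illustrative sketch of the IEEE 1588 delay request-response math.
 * t1: master sends Sync, t2: slave receives Sync,
 * t3: slave sends Delay_Req, t4: master receives Delay_Req.
 * Assumes a symmetric network path. */
double ptp_offset_from_master(double t1, double t2, double t3, double t4)
{
    /* slave clock minus master clock */
    return ((t2 - t1) - (t4 - t3)) / 2.0;
}

double ptp_mean_path_delay(double t1, double t2, double t3, double t4)
{
    return ((t2 - t1) + (t4 - t3)) / 2.0;
}
```

The slave corrects its clock by the computed offset; the drift is then obtained from the evolution of this offset over successive synchronization rounds.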
In the outlined architecture, the GPS signal and its receiving part represent a single point of failure, since the loss of the GPS means that the master clock will be running on its own and naturally drift away from the GPS clock. As an example, atmospheric disturbances or thunderstorms hitting the GPS antenna imply the loss of the correct time, and hence result in a desynchronization of geographically distant SAs, which makes, for example, differential protection functions inoperable if the time offset across the SAs becomes too large.
After a finite time the SA system will be back to the normal configuration.
change of the devices' time base, which in turn may lead to a malfunctioning of the protection algorithms depending on time-tagged data snapshots. Although version 2 of the IEEE 1588 standard introduces the concept of redundant masters to improve the reliability of the protocol, the loss of the GPS signal due to atmospheric disturbances or the loss of the GPS antenna are not addressed. Such scenarios, which are common in the case of an electric substation, must be handled by keeping the substation synchronized to the GPS time, or at least as close to it as possible, to avoid a hard resynchronization of the devices' clocks and therefore a full re-initialization of the protection functions when the GPS signal is back.
Figure 1.2: Configuration of a substation after the failure of the master clock.

A clock drift prediction mechanism has been explored in this thesis project. The basic idea is to record on each device, or possible transient master, its offset and drift history during the normal configuration. Once a transient master is
elected, it then propagates its local time, corrected with information gained from the offset and drift history, in order to minimize the drift from the GPS clock. In the context of this approach, two different configurations are considered:

1. Normal configuration, where the GPS signal is received and propagated to the SA system.

2. Faulty configuration, where the GPS signal is missing, the grand master clock computer is down (temporary or permanent failure), or its network cable is unplugged.
During the normal configuration, where the GPS signal is received and propagated to the SA system, each connected device, i.e., the IEDs and the grand master clock, records its offset over several hours or days. Once a new device is elected as transient master in the faulty configuration, it performs the normal tasks accomplished by an IEEE1588 master. The difference is that every time it has to get its own current time, it will estimate its offset to the GPS clock and then correct the time which is sent to the slaves. The offset estimation is made out of the offset history; this can be done by, for example but not limited to, either computing an average offset or identifying patterns in the history data. The amount of recorded data for the offset history, as well as the method used for the offset prediction, is a trade-off between processing power, available resources and required time accuracy. In figure (1.3), each node can be either in a slave or a transient master state. In the slave state, the history is simply stored, while in the transient master state, an estimated offset is calculated and then applied to the estimated real time.
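The two simplest estimation strategies mentioned above, an arithmetic average and a recency-weighted average of the stored offset history, can be sketched as follows. This is a rough illustration assuming a plain array of recorded offsets; the function names and the linear weighting scheme are assumptions, not the exact PTPd extension code:

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic average of the last n recorded offsets (e.g., in nanoseconds). */
double predict_offset_avg(const double *history, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += history[i];
    return (n > 0) ? sum / (double)n : 0.0;
}

/* Weighted average: more recent samples (higher indices) weigh more. */
double predict_offset_weighted(const double *history, size_t n)
{
    double sum = 0.0, wsum = 0.0;
    for (size_t i = 0; i < n; i++) {
        double w = (double)(i + 1);   /* linear weights 1..n */
        sum += w * history[i];
        wsum += w;
    }
    return (wsum > 0.0) ? sum / wsum : 0.0;
}
```

The transient master would add the returned estimate to its local time before propagating it to the slaves.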
Chapter 4 provides the corresponding design descriptions of the prediction extension to an existing open source IEEE 1588 implementation, PTPd [3]. Chapter 5 evaluates the five different techniques with respect to their applicability in substation automation, and presents a quantitative comparison of those techniques in terms of processing power and prediction accuracy. Chapter 6 rounds off the dissertation by summarizing the improvements made and by offering suggestions for future work. Appendix A shows the pseudo code of the implementation and the guideline for users.
2
State of the Art of Precision Clock Prediction
2.1 Introduction
In the present chapter, various precision clock prediction related techniques and algorithms will be introduced. First, a classical approach to the modeling of clocks is outlined. Then, the definition of a time series and its properties are introduced, and the evolution of clock prediction is presented. Finally, an overview and classification of time series, as well as the most commonly used time series analysis techniques, are provided.
A clock's drift rate depends on the surrounding temperature, humidity, air pressure, and other environmental variables. Thus the same clock can have different clock drift rates on different occasions. A clock can be evaluated precisely only by comparison to other clocks. Therefore, evaluation of a clock actually refers to the measurement of the difference between two clocks. Normally, people conceptualize some of the laws of physics with time as the independent variable. However, in order to estimate the difference between two clocks, it is crucial to have those laws inverted so that time is the dependent variable. Reference [4] proposes a precise physical model for clocks and oscillators based on the fact that time, as people now generate it, is dependent upon defined origins, a defined resonance in the cesium atom, interrogating electronics, induced bias, timescale algorithms, and random perturbations from the ideal. In general, the reasons why a clock deviates from others fall into two categories. The first is systematic deviation, i.e., frequency offset and time offset, which are often environmentally induced; the second is random deviation, which is usually not thought to be deterministic. Hence, ideally, a characterized clock model is one potential approach to exhaustively explore the relationship between the clock and GPS time. Nevertheless, [5] also demonstrates that it is complex to define such a model representing the quartz quality of a clock based on the influencing parameters. Moreover, this approach would by nature impose strict equipment requirements, e.g., multiple sensors would be required in order to sense the humidity and temperature conditions. Such an approach cannot be foreseen in the context of today's electric substation automation. Characterizing the time-domain signal becomes another option, since it is viable for the device to store the history drift values as a continuous or discrete data set before the GPS signal is lost.
There are, obviously, numerous reasons to record data streams of time deviation between a clock and some primary reference. Among these are the wish to gain a better understanding of the underlying context of the drift generating mechanism from the history drift record, the prediction of future deviations based on past measurements of drift, or the optimal control of the system.
Reference [4] describes a general systematic model for clocks and oscillators, which is defined by the following expression:

Δt = a0 + a1 · t + U(t)    (2.1)

where Δt is the time offset measured by comparison with the reference clock, t the time elapsed after the last synchronization, a0 the synchronization error or clock bias, a1 the clock drift, that is, the relative frequency offset of the clock compared with the reference clock, and U(t) the random error, which depends on the noise spectrum of the clock. As discussed before, the random error U(t) consists of various influencing parameters which are often environmentally induced and cannot be measured, since the deployment of sensors is not viable. It is therefore consequential to concentrate on the drift part, a1, i.e., the time series of history drift data.
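Under equation (2.1), once a0 and a1 have been estimated from the recorded history, a transient master can extrapolate its offset at any elapsed time t, ignoring the unmeasurable random term U(t). A minimal illustrative sketch (the function name is an assumption, not from the thesis code):

```c
#include <assert.h>

/* Sketch of equation (2.1): predicted time offset after the last
 * synchronization. a0 (clock bias) and a1 (relative frequency offset,
 * i.e., drift) are assumed to have been estimated from the offset
 * history; the random term U(t) is unknown and omitted here. */
double clock_model_offset(double a0, double a1, double t)
{
    /* t: time elapsed since the last synchronization, in seconds */
    return a0 + a1 * t;
}
```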
2.3.1 Characterization of Time-Series
Reference [7] provides a useful classification of different time series, and [8] enhances the classification by adding several detailed classes. In table (2.1) it is repeated with a few abatements.

Table 2.1: Classification of Time Series

1  Continuous        / Discrete
2  Natural           / Synthetic
3  Deterministic     / Stochastic
4  Stationary        / Non-stationary
5  Linear            / Non-linear
6  Short             / Long
7  Single Recording  / Multiple Recordings
The system from which the time series is drawn can be either discrete or continuous. A discrete time series is one where the set of times at which observations are made is a discrete set. Continuous time series are obtained by recording observations continuously over some time interval. In academia, time series are quite often synthesized from simulation experiments. Techniques that are capable of dealing with such synthetic time series are not necessarily also well suited to deal with natural (measured) time series. Hence it is important to record whether a time series is natural or synthetic. If a time series can be exactly predicted from past knowledge, it is termed deterministic, i.e., there exists a scalar function f which can express the dependence between two quantities, one of which is given (the independent variables, e.g., the previous observations) and the other produced (the dependent variable, i.e., the estimation). Otherwise it is termed statistical, where past knowledge can only indicate the probabilistic structure of future behaviour. A statistical series can be considered as a single realisation of some stochastic process. A stochastic process is a family of random variables defined on a probability space. Much of the modern theory of time series, especially the statistical techniques, relies on stationary processes; if a time series is non-stationary, such techniques might be unsuitable for dealing with it. For this reason time-series analysis often requires one to transform a non-stationary series into a stationary one so as to use this theory; for instance, regular differencing and seasonal differencing might be used for removing trend and seasonality, respectively. From an intuitive point of view, a time series is said to be stationary if there is no systematic change in mean (no trend), if there is no systematic change in variance, and if strictly periodic variations have been removed [9].

The system from which the time series is drawn may be either linear or non-linear. A linear system is a mathematical model of a system based on the use of a linear operator; linear systems typically exhibit features and properties that are much simpler than the nonlinear case. Non-linear processes are typically exploited by some of the computationally intensive techniques. Time series can be either short or long. A short time series may be caused by a transitory event, such as a surgery, that is of limited duration. Although it is possible to increase the sampling rate, thereby making the time series longer, this may not help; e.g., it is pointless to make the time series longer by shortening the synchronization round. There exists a natural sampling rate for each time series that is related to the natural frequencies (eigenfrequencies) of the system from which the time series is sampled. Oversampling increases the length of the time series, but not its information contents. Forecasting techniques that are based on models suffer from data deprivation in the presence of short or oversampled time series. Simple extrapolation techniques may work best in this case, at least for single time series. Multiple correlated time series may still be better predicted using model based approaches. A time series may consist of a single recording, or of multiple recordings. Multiple recordings lead to multiple uncorrelated trajectories representing different patterns of the same phenomenon. However, from an intuitive point of view, it seems difficult to evaluate whether it is the same phenomenon as observed before by only judging the records of the evolution of a single variable, the drift value. Finally, time series may be documented or blind. These terms relate to the amount of knowledge available about the system from which the time series was drawn. Obviously, such knowledge can be exploited exhaustively by prediction approaches, especially deductive ones, to infer a model structure that matches that of the underlying system.
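The regular differencing transform mentioned above, used to remove a linear trend so that stationary-series techniques become applicable, can be sketched as follows (an illustrative helper, not from the thesis code):

```c
#include <assert.h>
#include <stddef.h>

/* Regular (first) differencing: writes the n-1 successive differences
 * x[i] - x[i-1] into out. A series with a linear trend becomes a
 * series with constant mean after this transform. */
void difference(const double *x, size_t n, double *out)
{
    for (size_t i = 1; i < n; i++)
        out[i - 1] = x[i] - x[i - 1];
}
```

Seasonal differencing works the same way with a lag equal to the season length instead of 1.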
2.3.2 Time-Series Analysis Techniques
Unlike the analyses of random samples of observations that are discussed in the context of most other statistics, the analysis of time series is based on the assumption that successive values in the data file represent consecutive measurements taken at equally spaced time intervals. It accounts for the fact that data points taken over time may have an internal structure (such as autocorrelation, trend or seasonal variation) that should be accounted for. Time series analysis concerns itself with the investigation of single or multiple observations of measurement data streams taken from a system under observation, i.e., extracting data that is maximally informative for the purpose of constructing a system model. It is a characteristic property of time series that they never contain complete information about the system being observed, and in particular, that the excitations that are imposed on the system are not under the observer's control and are, in many cases, unknown to them. The desire to predict the future and understand the past drives the search for laws that explain the behavior of observed phenomena; examples range from the volatility of financial markets to the irregularity of a heartbeat. If there are known underlying deterministic equations, in principle they can be solved to forecast the outcome of an experiment based on knowledge of the initial conditions. To make a forecast when the equations are not known, one must find both the rules governing system evolution and the actual state of the system. The relationship between observed phenomena of a system, or the knowledge of its properties, is termed the model of the system. Therefore, the aim of modeling is to find a description that accurately captures features of the long-term behavior of the system, while the aim of forecasting (predicting) is to accurately predict the short-term evolution of the system. Generally, there are two potential roads leading to
predictions:

1. directly extrapolating the series through a global fit in the time domain.

2. creating a model that explains the relationships between observations made in the past, and then fitting the model in a simulation to make predictions of the future.
Evolution of Time Series Prediction

In the early days, prediction methods were simple techniques that do not rely on first training a system model, but directly use the available data to make predictions about the future, namely extrapolation; previously made predictions have no effect on further predictions. Direct extrapolation is inherently unsafe since it does not provide means to estimate the quality of the predictions made [8], i.e., to estimate the error associated with any prediction. Modeling is a better approach; it enables operations on input variables, thereby allowing users to formulate various scenarios and observe the consequences that might result when implementing any one of those scenarios. Historically, the first modern time series approaches can be traced back to [10] [11]: a linear model which allows correlating different observations with each other, thereby improving the potential of making good predictions. It has three particularly desirable features:

1. it can be used to represent both stationary and non-stationary processes.

2. it is straightforward to implement.

3. it requires reasonable computation time.
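Direct extrapolation through a global fit can be illustrated by fitting a least-squares line to the last n samples and reading the line off one step ahead. This is a hypothetical sketch of the general idea; the extrapolation variant used later in the thesis may differ:

```c
#include <assert.h>
#include <stddef.h>

/* Fit y = b0 + b1*i to y[0..n-1] by ordinary least squares over the
 * index i, then evaluate the line at index n to get a one-step-ahead
 * forecast. Requires n >= 2. */
double extrapolate_linear(const double *y, size_t n)
{
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (size_t i = 0; i < n; i++) {
        sx  += (double)i;
        sy  += y[i];
        sxx += (double)i * (double)i;
        sxy += (double)i * y[i];
    }
    double denom = (double)n * sxx - sx * sx;   /* nonzero for n >= 2 */
    double b1 = ((double)n * sxy - sx * sy) / denom;
    double b0 = (sy - b1 * sx) / (double)n;
    return b0 + b1 * (double)n;                 /* one-step-ahead forecast */
}
```

Note that the fit itself gives no error estimate for the forecast, which is exactly the weakness of direct extrapolation discussed above.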
The penalty for this convenience is that such models may be entirely inappropriate for dealing with systems showing non-linear behavior, since they always assume the systems from which the time series has been measured to be linear and to operate under stationary conditions [12]. In order to have time series analysis widely used in systems with nonlinear characteristics, non-linear models were proposed in [13]. Moreover, in order to be able to deal with time series that exhibit non-stationary characteristics, pre-filtering methods were developed in [14] [15] that remove the effects induced by the trend or seasonality of non-stationary time series. However, the choice between linear and nonlinear models is a trade-off in terms of accuracy and processing overhead; moreover, the latter need more training data (observations) and are less well behaved. In the case of the short synchronization round of IEEE1588, non-linear predictors combined with the other tasks running on a node may not have enough time to proceed. Other important contributions were the construction and identification of state-space models [16] and the introduction of learning techniques for model identification [7]. The use of learning techniques constitutes an important trend in modern prediction approaches. They make fewer structural assumptions about the system from which the variables were generated, and are therefore more generally applicable to a wider class of prediction problems. The most widely used among these is the neural network, which has been successfully applied to financial markets prediction [17] [18] and wind power forecasting [19] [20], among others. Although previous comparisons of neural networks and linear predictors have shown that neural networks can sometimes give better results [21] [22], they also require additional procedures (e.g., genetic computing) and a much more computationally expensive training cost.
There exists a vast literature on methods for analyzing time series. There have even been competitions organized at a worldwide level in order to advance the state of the art of methodologies for time-series analysis and prediction [23]. Time-series analysis has been applied to many different application areas, such as the prediction of financial markets (economical models), or the monitoring of physiological signals stemming from a patient during surgery (biomedical systems). In engineering, time-series analysis is of interest in the contexts of instrumentation and filtering of signals, and the design of predictive controllers. In particular, time-series analysis has also been used by many researchers for precision clock prediction. When the underlying dynamical system is not fully understood, observed regularities in the time series can provide insights into the inner workings of the system. Given a time series from the past, the aim of a prediction algorithm is to forecast the expected future of the time series by drawing analogies from previous behaviour, and thus, indirectly, forecast the future behaviour of the system.
Classification of Time-Series Analysis Techniques

A first coarse classification can be made by distinguishing between prediction and simulation approaches, i.e., techniques that operate on the time series directly versus techniques that create a model in advance and then operate on that model. Yet, this classification is not truly crisp. Even simple extrapolation techniques usually identify parameters of a polynomial or regression function. Whether this polynomial or regression function is called a model is simply a question of taste. A second classification can be made by distinguishing between deductive and inductive modeling techniques. Again, this classification is not strictly crisp either. In fact, there are no strictly deductive approaches to time-series analysis, even in the case of an extremely well documented time series [8]. For instance, in the
DVS algorithm described in [7], a deductive modeling approach would make use of the knowledge provided about the system to conclude that the output signal is governed by linear equations, a set of k very simple and well understood linear equations that are furthermore autonomous. The modeler would thus only need to identify the parameters of the equations, such that they match the observed output patterns optimally (least squares method). The technique is almost purely deductive, except for the identification of the k nearest neighbors, which can be considered an inductive process. In general, the more structural assumptions are being made about the system from which the time series is drawn, the more the approach must be considered deductive. Among the various available primarily inductive modeling approaches, the following two, Artificial Neural Networks (ANN) [24] and NARMA [25], are the most classical ones. They can better deal with non-stationary behavior, since ANNs and NARMA do not exploit stationarity explicitly, although training the weights of the neurons may become more problematic in the case of non-stationary time series. As described above, they are not viable approaches in the context of substation automation. Based on the previous discussion, the choice of deductive techniques in this work can be narrowed down to the choice of linear processes. Time series analysis produced by linear predictors is based on the fact that a finite-dimensional linear system produces a signal characterized by a finite number of frequencies. There exist highly successful methods of time series prediction based on the exploitation of this fact in the time domain as well as the frequency domain, such as Auto Regressive (AR), Moving Average (MA) and ARMA. Based upon Allan's general clock model, linear techniques have been employed by most investigators [26] for the prediction of precision clocks.

The moving average (MA) method works by calculating the average of a small set of past data, in which each data subset average is calculated; it describes the time series as a moving average of white noise. The MA method operates in an open loop without feedback; it can only transform an input that is applied to it, and it is limited to stationary processes, i.e., those processes that do not have trend or periodic fluctuations and do have an unvarying variance over time. Some feedback is necessary for generating the internal dynamics of a system; this leads to another model, namely the autoregressive one. One of the first linear autoregressive (AR) models was proposed by [27] for the study of sunspots. This method can be used to represent non-stationary processes as well as stationary processes. His model predicted the next value as a weighted sum of previous observations of the series: the value of the series at the current time is a function of the previous values added to a random variation term, which can represent either a controlled input to the system or noise, depending on the application. The objective of training an autoregressive model consists of setting each weight to an optimized value, i.e., one that gives the best prediction in the sense that the model minimizes the squared error within the model class. However, compared with MA models, AR models give limited freedom in describing the properties of the disturbance terms. Furthermore, processes may be a combination of a moving average and an autoregressive process, which requires the ARMA model. The advantages of ARMA include requiring reasonable computation time and providing a tool for analysis, forecasting, and control. One restriction of the ARMA model is that it was not designed for time series with asymmetry, or for data with sudden bursts of large amplitude at irregular times [28]. In addition, ARMA is used under the assumption that the underlying system is linear.
The AR method is useful because it only requires knowledge of the system's output values and can be used for both stationary and non-stationary time series.
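The AR idea can be illustrated in its simplest order-1 form, where the single weight minimizing the squared one-step error is estimated from the mean-removed history. This is an illustrative sketch, not the model actually fitted in the later chapters (which would typically use a higher order, solved via the Yule-Walker equations):

```c
#include <assert.h>
#include <stddef.h>

/* AR(1) one-step-ahead forecast: estimate the weight phi by least
 * squares on the mean-removed series, then predict
 * x[n] ~ mean + phi * (x[n-1] - mean). Requires n >= 2. */
double predict_ar1(const double *x, size_t n)
{
    double mean = 0.0, num = 0.0, den = 0.0;
    for (size_t i = 0; i < n; i++)
        mean += x[i];
    mean /= (double)n;
    for (size_t i = 1; i < n; i++) {
        num += (x[i] - mean) * (x[i - 1] - mean);
        den += (x[i - 1] - mean) * (x[i - 1] - mean);
    }
    double phi = (den != 0.0) ? num / den : 0.0;   /* least-squares weight */
    return mean + phi * (x[n - 1] - mean);
}
```

An AR(p) predictor generalizes this to a weighted sum of the last p observations, with the weights obtained by solving a p-dimensional linear system.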
In conclusion, the most important advantages of AR models are their ability to generate decent models of time series in a fairly automated manner, their relatively low computational overhead, their flexibility to fit both stationary and non-stationary systems, their firm mathematical basis [29], and finally the little knowledge they require about the system. Table (2.2) presents an overview of the pros and cons of the classifications proposed in this chapter. An Auto-Regressive model, together with an arithmetic average approach and a weighted average approach, shall be used in the subsequent chapters of this thesis, although only AR shall be discussed in any great detail. For the reasons discussed above, it makes sense to use these relatively less tricky approaches, given that the history data set of drift values taken in sequence is not informative enough for other approaches. The strict requirement on processing cost is the other important consideration.
Table 2.2: Classification of Time Series Analysis Techniques (type of time series: Continuous, Discrete, Deterministic, Stochastic, Stationary, Non-stationary, Linear, Non-linear; type of modeling method: AR, ARMA, ANNs; ratings: ** ** ** * ** * ** ** for AR, ** ** ** ** ** for ARMA, ** ** ** ** ** ** ** for ANNs)
Table 2.3: Properties of Different Models (the modeling methods AR, ARMA, and ANNs rated on wide usability, coding complexity, input information, and computation cost)
The Allan variance is a method invented by David W. Allan for measuring the frequency stability in a variety of precision clocks and oscillators.
New predictors with improved accuracy came out [32] [33]; these developments were mainly focused on noise modeling, since optimum precise-clock prediction is essentially equivalent to optimum precise-clock noise prediction. Again, the penalty for these potential improvements is a heavier processing load. On the other hand, some researchers strove toward less computation overhead in order to meet the requirements of potential real-time applications. [34] described a fast, easily coded numerical algorithm (competitive with conventional k-nearest-neighbor approaches) for convenient initial analysis of time series data. The algorithm proposed by [26] was for the generation of time scales, where the best predictor is chosen by comparing the values of various standard deviations in real time. It is simpler and faster than iterating to create linear models in real time using the least squares method. However, in that case, a table consisting of a set of predictors, together with their performance calculated under different scenarios, has to be stored by the device; this means a set of experiments has to be run in advance on every substation, or on every node that may perform such operations.
2.5 Conclusion
This chapter provided an introduction to the nature and importance of time series analysis techniques. It defined what a time series is and what its basic properties are, and it offered an overview and a classification of the different types of time series. The most common linear and neural network models were outlined. It finally suggested the commonly used deductive modeling approaches for precision clock prediction. This thesis shall mainly deal with the linear autoregressive model; the proposed methods are all based on sequences observed from a single source used as training data set, and normally the training data set is a continuous sequence of drifts recorded naturally before the loss of the GPS signal. There also exist two other naive techniques, namely the arithmetic average method and the weighted average method, which do not belong to the modeling techniques. However, within this work (especially on short-term time horizons), they will surprisingly prove useful. This fact will be extensively discussed and exploited in subsequent chapters of this thesis.
3
Fast Forecasting of Time-Series Data
3.1 Introduction
Already in Chapter 2, it was mentioned that one of the primary goals of this work is to find a suitable way to keep devices within one substation, or geographically distant substations, synchronized in case of the loss of a reference clock (e.g., the GPS signal). In this chapter, the application of three different approaches (an arithmetic average, a weighted average, and the Delay Coordinate Embedding algorithm) to clock drift prediction is presented. In this thesis, these predictors strictly operate on naturally recorded uni-variate time series, i.e., the arithmetic average and weighted average ones directly extrapolate the past drifts, while the Delay Coordinate Embedding approach analyzes observed patterns of measured drifts and predicts the future behavior on the basis of its own past, without ever identifying the system state under which these signals were generated. These methods fit the realistic situation well: they lack the large number of free parameters of methods such as Artificial Neural Networks, which lead to enhanced time requirements and perhaps stricter input requirements. The arithmetic average predictor is given by:

x_t = \frac{1}{t-1} \sum_{i=1}^{t-1} x_{t-i} \qquad (3.1)
It is not surprising that the arithmetic average calculation performs well if the clock drift is stationary, especially in short-term predictions. The most important advantage of averaging is the negligible processing cost. It is worth noting that, due to the low processing overhead, each node of a substation can run its own arithmetic average prediction. Such a scenario allows each node to handle the loss of one or two consecutive IEEE 1588 synchronization messages due to networking problems. The weighted average approach is similar to the arithmetic average one, but it only takes into account the drift values whose probability is greater than a given threshold value over the last n rounds. For instance, select a group of the most frequent values (i.e., partially filter out the possible noise), then perform the weighted average over that group:
x_t = \frac{\sum_{i=1}^{t-1} \omega_{t-i}\, x_{t-i}}{\sum_{i=1}^{t-1} \omega_{t-i}}, \qquad \omega_{t-i} \ge \theta \qquad (3.2)

where \omega_{t-i} represents the probability of element x_{t-i} during the last n observations, and \theta is the given threshold value of the filter.
3.3
The problem of time series prediction now becomes a problem of system identification. The unknown system to be identified is the function f(·) whose inputs are the past values of the time series. While observing a system there is a need for a concept that defines how its variables relate to each other. As described in Chapter 2, the relationship between observations of a system, or the knowledge of its properties, is termed the time series model; hence, Equation (3.1) can be called a model of the system. The search for the most suitable model for a system is guided by an assessment criterion of the goodness of a model. In the prediction of time series, the assessment of the goodness of a model is based upon the prediction error of the specific model. Moreover, in this work, the processing cost and the number of free input parameters required are two further criteria of equal importance.
After the most suitable model of a system has been determined, it has to be validated. The validation step in the system identification procedure is very important because, in the model identification step, the most suitable model was chosen among the predefined candidate models. This step certifies that the model obtained describes the true system. Usually, a different set of data than the one used during the identification of the model, the validation set, is used during this step.
x_{i+T} \approx f(x_i, x_{i-\tau}, ..., x_{i-(m-1)\tau}) \qquad (3.4)

The integers T and m define the following quantities:
1. T: lead time or look-ahead time (i.e., prediction time into the future). 2. m: embedding dimension or resolution (i.e., the degree to which information about individual sequence elements is preserved). Furthermore, the m past values are combined in the delay vector X_i, in which the elements are time-delayed data values from the time series. The spacing of the
samples of the delay line is equal, i.e., X_i := (x_i, x_{i-\tau}, ..., x_{i-(m-1)\tau}), where \tau is the lag time between samples; it will be set to 1 (i.e., 2 seconds) here, since the collected time series is already sampled fairly coarsely. The basic idea of the delay-coordinate-embedding approach is the prediction of future values given the current delay coordinate vector. In this method the time series (the history drift sequence) is broken up into equal-sized chunks¹ which can indicate the internal state of the system, i.e., reflect the evolution of the system. Whenever a prediction needs to be made, the latest chunk² and the k chunks nearest to it in the chunk space are retrieved. By observing how the time series developed in each of these cases, its expected course under the current circumstances can be predicted. This involves several issues:
1. Decide the size of the chunks. 2. Decide the number k of nearest neighbors to be used. 3. Decide the method of interpolation for combining the results from the k nearest subsets found.
In our experiment, the size of the chunks was chosen on an ad hoc basis, i.e., a series of experiments was applied to the same data set while varying the embedding dimension systematically. The choice of the number of nearest neighbors is also based on practical considerations: if k is too small, the prediction is sensitive to noisy points; otherwise, neighborhoods may include points which are not that close to the latest
¹ A time series can be broken up into several chunks like (x_t, x_{t-1}, ..., x_{t-i}), (x_{t-1}, x_{t-2}, ..., x_{t-i-1}), ..., (x_{i+1}, x_i, ..., x_1), forming the so-called chunk space.
² To predict x_t, the chunk which has x_{t-1} as the last element is the latest chunk, e.g., (x_{t-1}, x_{t-2}, ..., x_{t-m}).
chunk. The Euclidean distance was used for identifying neighborhoods. For solving the third problem, the k nearest neighbors X_{i1}, X_{i2}, ..., X_{ik} of the current state vector X_i are found, and the time series values immediately succeeding these k instances (denoted p_1, ..., p_k) are used to estimate the value after X_i. One method is to perform an arithmetic average over p_1, ..., p_k; another, more sophisticated one is to utilize an Auto-Regressive model:
x_t = \sum_{i=1}^{m} \alpha_i x_{t-i} + \varepsilon_t \qquad (3.5)

which uses the least squares method to find the linear function f: \mathbb{R}^m \to \mathbb{R} that gives the best prediction for x_{i+1} in the sense that it minimizes the squared error within the model class. On average, the Gaussian white noise \varepsilon_t is assumed to be small relative to x_t; therefore, x_t can be estimated by:

x_t \approx \hat{x}_t = \sum_{i=1}^{m} \alpha_i x_{t-i} \qquad (3.6)
One particular implementation of the delay-coordinate-embedding approach is given by [16]. Here we describe it with some modifications³ compliant with the scope of this thesis work. 1. Data pre-processing. A function that maps the entire time series to a new set of replacement values such that each old value can be identified with one of the new values, i.e., using a natural logarithm transformation. 2. Divide the time series into two parts: (a) a training set x_1, ..., x_{N_f} used to estimate the coefficients of each model.
³ The algorithm presented here differs from [16] in that we update the training data after each prediction.
(b) a test set x_{N_f+1}, ..., x_{N_f+N_t} used to evaluate the model. N_f denotes the number of points in the training set, N_t the number of points in the test set. 3. Choose m, k (outer loops).
4. Choose an input signal X_i (i.e., a test delay vector) for a forecasting task (i > N_f - 1).
5. Compute the distances d_{ij} of the test vector X_i from the training vectors X_j (for all j such that (m-1) < j < i-1).
6. Sort the distances and select the k nearest training vectors X_j^{(1)} through X_j^{(k)}.

7. Fit a linear model to these neighbors:

x^{(l)}_{j+1} = \sum_{i=1}^{m} \alpha_i\, x^{(l)}_{j-i+1}, \qquad l = 1, ..., k \qquad (3.7)
In the time series context, this is an Auto-Regressive model of order m fitted to the k nearest neighbors of the test point; i.e., there are k equations. j^{(l)} denotes those times in the training set where the dynamics are most similar to the test point. 8. Use the fitted model from step (7) to compute a one-step-ahead forecast \hat{x}_t(k) starting from the test vector, and compute its error:
e_i(k) = \hat{x}_i - x_i \qquad (3.8)
9. Repeat steps (4) through (8) as (i+1) runs through the test data, but replace x_i by the latest prediction \hat{x}_i, i.e., instead of x_{i+1}, x_i, ..., x_{i-(m-2)}, now \hat{x}_{i+1}, x_i, ..., x_{i-(m-2)} is the input used to fit the model for the next estimate. The final offset and the possible maximum offset can be computed as:
\Delta_m(k) = \sum_{i=1}^{N_t} e_i(k) \qquad (3.9)

\Delta_{max}(k) = \max_t \left| \sum_{i=1}^{t} e_i(k) \right|, \qquad t \in [1, N_t] \qquad (3.10)
The delay-coordinate-embedding algorithm as presented here uses a local linear approximation in step 7. The idea of systematically varying the embedding dimension and the neighborhood size is not restricted to local linear models. Also, the model can be made considerably more detailed by introducing a variety of noise terms, as is typically done in non-stationary modeling [28]. Furthermore, in step 9, to get an estimate for x_{t+1}, there are two obvious choices. The direct prediction method means that the original method is applied to x_{t-i}, ..., x_{t-1} to predict two time units ahead. In contrast, iterated prediction means applying the method to x_{t-i}, ..., x_{t-1}, \hat{x}_t to predict one unit ahead. The direct prediction only uses real observations for the forecast, whereas the iterated prediction also uses previously made forecasts as if they were real observations. Much discussion has ensued over which choice is superior. The reliability of direct prediction is suspect because it is forced to predict farther ahead. On the other hand, iterated prediction uses \hat{x}_t, which is possibly corrupted data. However, [35] argue that iterated prediction is superior.
3.4.1 Log Transformation
A natural logarithm transformation can be applied in step (1) of the delay coordinate embedding algorithm, prior to creating the linear predictor based on the local approximation technique [36]. It is a data pre-processing concept which attempts to stabilize the variance of the time series. It is usually tried when it is not clear what kind of variance variability the time series has [37]. The logarithm transformation may reduce the effects introduced by noise (spikes); the detailed reasons why it is suitable for this research are given below: 1. It is believed that the reason why it performed so well is that it tended to give the same output as a linear predictor when the drifts were about constant. This is because the log predictor had approximately the same weights as the linear predictor, and averaging in log space is the same as averaging in linear space if the points being averaged are close together. 2. On the other hand, in the event of spikes (as occurred in a small part of the data set) the log predictor gave much better prediction results. This may reflect the geometric averaging process being better than the arithmetic averaging process following a spike. To be more specific, the second reason can be explained by presenting the output pattern of the linear regression
x_t = \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + \cdots + \alpha_m x_{t-m} \qquad (3.11)

which can be considered an arithmetic average if the elements of the coefficient vector have equal values, i.e., \alpha_1 = \alpha_2 = \cdots = \alpha_m:
x_t = \frac{1}{m}(x_{t-1} + x_{t-2} + \cdots + x_{t-m}) \qquad (3.12)
By carrying out the natural logarithm transformation in advance, the output pattern would be transformed to
\log \hat{x}_t = \alpha_1 \log x_{t-1} + \alpha_2 \log x_{t-2} + \cdots + \alpha_m \log x_{t-m} \qquad (3.13)

which, with equal coefficients, can be written in short as

\log \hat{x}_t = \frac{1}{m} \sum_{i=1}^{m} \log x_{t-i} \qquad (3.14)

or, equivalently,

\hat{x}_t = \left( \prod_{i=1}^{m} x_{t-i} \right)^{1/m} \qquad (3.15)
which can be considered as a geometric averaging. Therefore, in case there exist spikes⁴ within the training data, the estimate produced by the geometric averaging process will be closer to the expected value of the sequence than the arithmetic one. For instance, given a time series

X_i = (9, 11, 50, 10) \qquad (3.16)

the arithmetic average gives

\frac{9 + 11 + 50 + 10}{4} = 20 \qquad (3.17)

while the geometric average gives

\sqrt[4]{9 \cdot 11 \cdot 50 \cdot 10} \approx 14.915 \qquad (3.18)

⁴ Spikes here refer to history points which do not occur frequently within a certain period.
Generally, the probability that the succeeding value of the time series lies in the range from 9 to 11 should be relatively high, since the third element, 50, is considered a spike of this time series; therefore, the logarithm transformation works well.
3.4.2 Auto-Regressive
The linear autoregressive model is one of the models that has been used in step 7 of the delay-coordinate-embedding procedure. It is a classical approach to combine the results from the k nearest neighbors found. The general form of the AR model is given by the linear equation:
x_t = \sum_{i=1}^{m} \alpha_i x_{t-i} + \varepsilon_t \qquad (3.19)

x_t = \hat{x}_t + \varepsilon_t \qquad (3.20)

where the current value of the time series is expressed as a weighted sum of past values plus the white-noise term \varepsilon_t. Thus, x_t can be considered to be regressed on the m previous values of x(\cdot). If on average \varepsilon_t \approx 0, the white noise can be neglected. To be more specific, calculating \hat{x}_t is therefore a matter of evaluating the coefficients \alpha_1, ..., \alpha_m. In the context of clock drift prediction, the previous linear equation is written as a matrix calculation as follows:
\hat{x}_t = \begin{pmatrix} x_{t-1} & x_{t-2} & \cdots & x_{t-m} \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_m \end{pmatrix}

The coefficients are then evaluated from the following matrix equation:

\begin{pmatrix} x_{t-1} \\ x_{t-2} \\ \vdots \\ x_{t-k} \end{pmatrix} = \begin{pmatrix} x_{t-1-1} & x_{t-2-1} & \cdots & x_{t-m-1} \\ x_{t-2-1} & x_{t-3-1} & \cdots & x_{t-(m+1)-1} \\ \vdots & \vdots & \ddots & \vdots \\ x_{t-k-1} & x_{t-(k+1)-1} & \cdots & x_{t-(m+k-1)-1} \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_m \end{pmatrix}
or in short:
A\alpha = b \qquad (3.21)
If m = k, the matrix equation represents a set of k linear equations in m unknowns that can be solved uniquely using any technique suitable for solving linear systems of equations. If m < k, it represents an overdetermined set of linear equations in m unknowns that can be solved approximately in a least squares sense, which can be interpreted as a method of fitting data. The best fit in the least-squares sense is that instance of the model for which the sum of squared deviations from a given set of data has its least value, a deviation being the difference between an observed value and the output given by the model. The linear regression can be achieved by multiplying both sides of the equation by the transposed matrix A^T:
A^T A \alpha = A^T b \qquad (3.22)

and hence

\alpha = (A^T A)^{-1} A^T b \qquad (3.23)
is an approximate solution of the set of equations, where (A^T A)^{-1} A^T is a pseudoinverse of the matrix A. Once the coefficient vector \alpha has been found, future estimates can be obtained recursively using the equation:

\hat{x}_t = \sum_{i=1}^{m} \alpha_i x_{t-i} \qquad (3.24)
This concludes the straightforward description of the method. It is of interest to discuss the stability of an AR model. If the uni-variate time series is stationary, it seems reasonable to expect that recursive predictions produce a stationary forecast as well. However, there is no guarantee that the least squares approach to determining the parameter values of the AR model will satisfy the stability requirement, i.e., recursive predictions using Eq. (3.24) may grow beyond all bounds. One promising approach to ensure the stability of the AR model is to require that:
\sum_{i=1}^{m} \alpha_i = 1.0 \qquad (3.25)
by making use of the autocorrelation function. However, this causes a computationally intensive training process, since the expected values and variances of the time series need to be calculated over the vector X_{t-1}.
3.6 Conclusion
In this chapter we defined the basic properties of time series. The procedure of time series signal prediction was presented, methods for measuring the prediction were outlined, and the criteria for model validation were stated. Subsequently, five efficient prediction approaches, especially the delay coordinate embedding one integrated with the AR model, were introduced. In particular, a data pre-processing method was described. Two strategies for achieving multi-step prediction, i.e., with a prediction horizon greater than one step, were analyzed.
It was shown that the weighted average predictor filters out what it considers to be noise, a feature that may sometimes be quite useful, but that might also be a nuisance, because the user has little control over what the predictor considers noise and what it considers signal. Moreover, the choice of the threshold value is always ad hoc. It was also shown that the delay-coordinate-embedding predictor highly depends on the choice of the embedding dimension m and the number of nearest neighbors k, which are both determined empirically. Finally, the stability of the AR model was discussed, and a relevant solution to this potential issue in long-term prediction, the autocorrelation function, was mentioned.
4
Implementation
4.1 Introduction
This chapter describes the software design for the time series prediction extension on top of the existing IEEE 1588 protocol stack, the PTP daemon. It mainly provides the architecture of the extension built upon regular PTPd, gives an overview of the prediction mechanism, and finally describes five extension libraries along with their associated subroutines. The primary objective of this experimental extension is to be able to keep the various clocks within an IEEE 1588 network approximately synchronized in case of loss of GPS or disconnection. The design goals of this extension are real-time prediction, high-resolution (up to 48 hours) drift data simulation, and flexible usage of the prediction approaches. In particular, it is required to meet the following targets: 1. Near-realistic simulation of the loss of the GPS signal. 2. Real-time prediction. In order to achieve fast forecasting, this extension is required to perform prediction of the clock offset efficiently using various techniques and models, e.g., arithmetic average, weighted average, and linear system models. 3. Efficient data management. The high-resolution clock drift simulation will produce a huge data throughput, since it requires retrieve, search, interpolate, and sort operations. 4. Error evaluation. Once the simulation terminates, three types of deviations will be calculated: the final error, i.e., the deviation from the GPS clock by the end of the simulation; the maximum error, i.e., the maximum deviation from the GPS clock during the simulation; and the actual existing error, which means the final error without carrying out predictions.
4. Threshold value, which serves as a simple filter when it comes to the weighted average approach. (Threshold) 5. Resolution, i.e., the degree to which information about individual sequence elements is preserved. (Emdims) 6. Ignored subset, i.e., the length of the initial non-stationary part which should be excluded from the training set because of the overhead caused by system initialization. (Ign_til)
Moreover, two extra inputs are necessary for the time-series based approaches:
1. Depth, i.e., how far back memory goes. (Backtrack) 2. Number of k nearest neighbors. (K_NN )
Figure 4.1 shows an overview of the major components of the extended PTPd as well as their interactions.
The main protocol state machine is implemented as a forever loop with cases for each state. It calls the BMC² after start-up to return the proper state, master or slave, based on the reported clock statistics. After the connection has been initialized, the message packer gathers data into, or extracts data from, the message package, together with the time stamps obtained at kernel level. The clock servo computes the offset-from-master from the master-to-slave and slave-to-master delays and sends it to the database; once the connection fails, the protocol state machine switches to the prediction configuration and launches the predictor, which retrieves the useful history offsets from the database, based on its algorithm, and calculates the estimate. The estimate is continuously sent back to the clock servo as well as to the database. The error counter is implemented as a subroutine of the protocol engine associated with the prediction mechanism; it counts the errors based on the estimates and the actual offsets stored in the database.
1. Module for controlling (starting, terminating) the simulation, included as an add-on of the protocol engine. 2. Procedures for high-resolution clock drift data prediction, involving data retrieval, predictor construction, sorting, and data normalization techniques, etc. 3. Module that computes the prediction error.
² The best master clock (BMC) algorithm is used to select the most stable and accurate clock.
Figures 4.2 and 4.3 present the message send and receive paths in a typical system running PTPd, the workflow of prediction within the extended components, and the basic PTPd functions which form the synchronization mechanism. The updated offset data is continuously sent from the clock servo to the data center (arr, arrFake), while the predictors take in the relevant data from the data center and send predictions back afterwards. The error counter is called once the simulation is done. This efficient prediction requires the predictors and the modules, or components, of the PTPd stack to work together smoothly and reliably.
Figure 4.2: Message paths in the PTPd system
The prediction models continuously take in relevant clock drift data from the data center and subsequently use it to form the predictors, so that the offset prediction is carried out efficiently. Meanwhile, the large amount of processing data is managed in an efficient way to facilitate easy and reliable access. For example, every individual prediction causes many data retrievals during the procedure
Figure 4.3: System diagram of the clock servo
of model training; the database was implemented using arrays, from which data retrieval is rapid since the entries of an array can be accessed directly. Furthermore, this extension (the prediction procedures), its interface to the PTPd stack, and its input modules are fully compliant with the PTPd components, as well as with the variables passed between PTPd components, while the output data variables and formats follow the PTPd specification. Finally, the original PTPd stack has been slightly modified in order to perform a hard re-synchronization every 2 seconds. The following is the list of functional components:
1. Predictors. Arithmetic average predictor. Weighted average predictor. Direct extrapolation: delay-coordinate-embedding integrated with the average approach. Linear predictor: delay-coordinate-embedding integrated with the Auto-Regressive model.
Log transformation: a natural logarithm transformation of the drift data is carried out only when it comes to the log transformation technique, which is actually a pre-processing technique. 2. Data sorting: using the Quicksort algorithm. 3. Data retrieval: the Euclidean distance is utilized to determine the k nearest neighbors. 4. Data storing: two arrays are implemented to store the updated actual offsets and the prediction results respectively, and the prediction results are sent back. 5. Error calculation: calculate the deviations by comparing the two arrays. Theoretically, the extension can be configured to run all these models simultaneously.
which is different from the Wikipedia version³. The advantages of this optimized one over the Wikipedia code are: swap variables: in any one pivot round, the Wikipedia version can pass items through a swap variable many times; the optimized one passes only one item (the pivot) through a swap variable, and only once per round. Multiple moves: in any one pivot round, the Wikipedia version can move the same item more than once in the list; the optimized one never moves an item to a new position in the list more than once per round. 2. Data retrieval, eud(). To identify the nearest neighbors, the Euclidean distance is used for calculating the similarity between two samples. 3. Solver of linear equation systems, gauss(). Gauss elimination⁴ combined with a maximal column pivoting strategy is used for solving the linear equation system which consists of the k nearest neighbors. 4. Arithmetic average, Average.c. The arithmetic average approach performs the average aver() over the last n offsets stored in the array, i.e., it fits the first estimate by averaging over the last n observations and iterates to get the next estimates by shifting the previous n observations. 5. Weighted average, Prob.c. The weighted average approach is similar to the arithmetic average one; it is
³ http://en.wikipedia.org/wiki/Quicksort
⁴ http://en.wikipedia.org/wiki/Gauss_elimination
the same as calculating the weighted average over the previous drift values whose probability of occurrence is greater than a given threshold value over the last n rounds. The quickSort() function is called in order to filter out useless samples. 6. Delay coordinate embedding (direct extrapolation), DvsAver.c. Averages the results from the k nearest neighbors, which are found by comparing the Euclidean distances eud(). 7. Delay coordinate embedding (Auto-Regressive), Dvs.c. This is actually called the linear regression technique, which performs prediction by solving the k linear equations using the least-squares method. Like direct extrapolation, it calls the functions eud() and quickSort() to decide the k nearest neighbors to be used. 8. Log transformation, Logtrans.c. A natural logarithm transformation of the stored drift data is carried out prior to modeling the linear predictor described above. 9. Error, ErrorCount.c. The final error and the error without prediction can be calculated by summing the relevant subsets of the arrays arr and arrFake, respectively, i.e., the final error equals the summation of all elements of the array which stores the estimates, and of the one which stores the actual errors, respectively, while the possible maximum error is equal to or greater than the final error. 10. Protocol engine add-ons, protocol.c. Instead of a forever loop, in order to terminate the implementation as soon as the GPS signal is back, protocol.c has been slightly modified. Moreover, two arrays which store the updated drift data have been added as
two extra variables passed between the other PTPd functions, e.g., doState(), updateClock(), etc. 11. Clock servo add-ons, servo.c. The servo has been modified to call a predictor as long as the connection fails, and it continually updates arrFake once a prediction has been made. 12. Other code. The interface header file interface.h contains function prototypes, structure and data type declarations, as well as constant declarations.
4.5 Conclusion
In this chapter, the scope and purpose of the implementation were introduced; subsequently, a block diagram provided an overview of the regular system architecture of PTPd associated with the prediction extension, which mainly consists of the forecasting libraries and the error counter. Two diagrams showed the interaction between PTPd and the add-ons as well as the message paths through the whole extended system. The workflow of how an estimate is produced was also demonstrated. Finally, the software data structures and the detailed implementation of each component and its subroutines, e.g., sorting and linear equation system solving, were described. The data dictionary, the pseudo code, and the user guide are all provided in the appendix.
5
Experimental Evaluation and Analysis
5.1 Introduction
As already presented, in this thesis work accuracy and efficiency are the two important criteria for the evaluation of a prediction technique. In the present chapter, these criteria are evaluated for each of the prediction techniques by applying them to the collected drift sequences. Then, their performance for different disconnection scenarios is quantitatively interpreted through a set of plots. Finally, three classes of GPS disconnection are identified based on the test results. In order to evaluate the accuracy and the efficiency of the arithmetic average, weighted average and time series based prediction techniques, an IEEE 1588 network has been set up. The network consists of an independent local area network connecting four different commercial desktops through an Ethernet switch. Each device contains a clock of a different quality, but all clocks have a standard deviation in the range of 50 µs/s. One of the desktops acts as a grand-master clock connected to a GPS receiver. Each node runs Linux and the open source
implementation of IEEE 1588, PTPd. The PTPd stack has been slightly modified in order to perform a hard re-synchronization, i.e., a call to the setClock() system call, at every synchronization round (2 seconds) in order to eliminate the influence of the servo clock tick-rate control. In this work, the final error of prediction is required to range from 10 µs to 100 µs, depending on the considered protection functions of substation automation. In addition to PTPd, the master clock runs an extension library performing the prediction. Each library implements a different prediction technique, i.e., an arithmetic average approach, a weighted average one, or a time series approach. The experiments consist of running PTPd under a constant indoor temperature (around 33 °C in July) for 24 hours (43200 observations) in a normal configuration, i.e., with the GPS signal connected, and then disconnecting the GPS signal for a duration ranging from 4 seconds to 48 hours. The experiment has been run for each prediction technique on the same PTP slave in order to have comparable results. The classification of time series has already been outlined in Chapter 2; the training data here are just copies of history sequences without any synthetic insertions, deletions or permutations. They are natural ones, since the original sequences were assumed to be more convincing in reflecting the instantaneous system state when other variables (humidity, temperature, pressure) are not available. Furthermore, during each simulation session, only one single data stream from a single source observed by a single sensor was investigated, since multiple recordings were considered less accurate than single recordings in terms of linear models [38]. For validation purposes of the different models, after the training data had been stored, the prediction sequences were collected simultaneously, in addition to the real drift sequences, which can be utilized as the validation data set.
A successful prediction of a time series depends on the characteristics of the time series. It is to be expected that a stationary process can be predicted better, and over a longer time horizon, than a non-stationary process. Likewise, time series that exhibit a more regular, more deterministic behavior should be more easily predictable than time series that exhibit a more stochastic behavior. It is also important to recognize that the time horizon of a meaningful prediction will, at least in this work (ranging from 4 seconds to 48 hours), usually be limited, and may in fact be rather short.
[Figure: time series of the clock drift, y-axis: Drift (Nanoseconds).]
This section evaluates the five different prediction techniques. While the arithmetic average and weighted average predictions are extremely straightforward to implement, the time series prediction needs to be detailed. Before applying these techniques to the data sets of history drifts, a few general remarks about the experimental settings are in order. An implementation of a time series prediction has to choose values for the embedding dimension m, the number of nearest neighbours k, and the size of the history data used to make a prediction (depth). From the point of view of predictive accuracy, the performance varies irregularly with the embedding dimension and the number of nearest neighbours, so this thesis treats their choice as ad hoc. Even though it is impossible to choose optimal values (i.e., higher is not necessarily better), the choice is a trade-off between the quality of the prediction and the required computing time. In our experiments the history data grows from 43200 elements, i.e., the number of synchronization rounds during 24 hours (24·60·60/2), to 129599 elements. Since the clock is considered a stationary process, for efficiency only the most recent 200 history subsets or samples are explored, from which the training data or nearest neighbours are retrieved. Finally, the choice of values for m and k is empirical; from the different experiments, the optimal values seem to be m = 8 and k = 20. One of the evaluation aspects, accuracy, is defined by the final offset and the maximum possible offset during the prediction horizon. The accuracy is therefore defined by:
\varepsilon(N_t) = \sum_{i=1}^{N_t} (\hat{x}_i - x_i) \qquad (5.1)

and hence the maximal offset reached during the horizon,

\max_{t \in [1, N_t]} \left| \sum_{i=1}^{t} (\hat{x}_i - x_i) \right| \qquad (5.2)

where N_t represents the time horizon of the prediction, \hat{x}_i the prediction result and x_i the real drift.
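The two criteria can be computed in a single pass over the recorded sequences; the following is a minimal sketch, assuming predicted and measured drifts are stored as double arrays (the helper name is illustrative, not from the PTPd extension).

```c
#include <math.h>
#include <stddef.h>

/* Final offset (Eq. 5.1) and maximal offset (Eq. 5.2) over a prediction
 * horizon of n synchronization rounds.
 * pred[i] is the predicted drift, real[i] the measured (real) drift. */
static void offset_criteria(const double *pred, const double *real, size_t n,
                            double *final_offset, double *max_offset)
{
    double cum = 0.0, max = 0.0;
    for (size_t i = 0; i < n; i++) {
        cum += pred[i] - real[i];  /* running sum of per-round errors */
        if (fabs(cum) > max)
            max = fabs(cum);       /* worst de-synchronization so far */
    }
    *final_offset = cum;
    *max_offset = max;
}
```

Note that per-round errors can cancel in the running sum, which is why the maximal offset (5.2) is reported separately from the final offset (5.1).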
Figure 5.2: The actual and predicted offsets time series using arithmetic average, weighted average and direct extrapolation
Figure 5.3: The actual and predicted offsets time series using the AR model and log transformation
Final offset - The first aspect of the evaluation is the comparison of the final offset after prediction horizons ranging from 4 seconds to 48 hours, i.e., 172800 seconds. Figure 5.4 shows the evolution of this criterion for the five prediction techniques over a varying prediction horizon. Almost all five prediction techniques perform well, i.e., with an offset of less than 8 μsec, for a horizon of up to 10 seconds. The weighted average method and the autoregressive model were both 5 to 10 times more accurate than the arithmetic average one, which can only stay within a 1.5 msec offset for a horizon of up to 5 minutes. It can be noticed that the direct extrapolation has a performance similar to the autoregressive technique, from which it can be inferred that the trend of the predicted time series mainly relies on the number k of nearest neighbours. For longer horizons, e.g., 24 or 48 hours, only the time series prediction integrated with a logarithm transformation is able to keep the offset below 500 μsec. Moreover, it is worth noting that even after only 4 seconds of disconnection, the offset is already more than 200 μsec if no prediction is performed.
Maximal offset - The second aspect of the evaluation deals with the maximal offset reached during the prediction horizon. Figure 5.5 shows the evolution of this criterion for the five prediction techniques over a varying prediction horizon. In contrast to the previous evaluation, the autoregressive model based approaches always perform better than the three other techniques. However, when the horizon is shorter than a couple of dozen minutes, the difference between the weighted average and the time series predictions, and especially between the two autoregressive based methods, is small, and both stay synchronized to within 10 μsec. When the horizon is in the range of dozens of seconds, the arithmetic average technique and the direct extrapolation method can still give acceptable results, since the offset is no bigger than 10 μsec. Finally, it is interesting to note that in the case of the autoregressive predictor the maximal offset grows almost linearly with the horizon, while for the three other techniques it tends to grow exponentially.

Processing time - The last aspect of the evaluation deals with the processing time required to compute the prediction. The processing times reported in Figure 5.6 show a similar behavior for each prediction technique, since they tend to follow a linear progression. However, the main observation is that the curves are inverted compared to the two previous figures. The arithmetic average technique is by far the cheapest way to perform a prediction, while the autoregressive based ones are the most expensive. On average, the arithmetic average technique requires 0.05 μsec for each synchronization round, the weighted average and direct extrapolation ones need around 210 μsec, and the two autoregressive based approaches require 315 μsec and 350 μsec respectively.
Figure 5.4: Evolution of the criteria for the five techniques over a varying prediction horizon
Figure 5.5: Evolution of the criteria for the five techniques over a varying prediction horizon
Figure 5.6: Evolution of the processing time for the five techniques over a varying prediction horizon
The evaluation results lead to three classes of disconnection scenarios:

1. In the case of a short disconnection, i.e., a few seconds, every node of the substation can run its own arithmetic average prediction. This scenario allows each node to handle the loss of one or two IEEE 1588 synchronization messages in a row due to networking issues.

2. In the case of a long disconnection, i.e., from one hour to two days due to, for example, the loss of the GPS antenna, a time series prediction is the only possible way to keep the substation synchronized. Even though after two days of prediction the offset is in the range of 75 μsec and some protection functions will have to be turned off, their resynchronization can be done smoothly (i.e., instead of a full re-initialization), and protection functions with looser synchronization requirements can keep running. However, since this kind of prediction is time consuming, it has to be implemented on the grand master node of the substation.

3. In the case of a medium disconnection, i.e., a couple of minutes due to, for example, small maintenance operations such as a network cable replacement, a weighted average prediction is the best trade-off in terms of offset and processing power. However, since this kind of prediction cannot take place on every node of a substation because of its processing requirements, some specific nodes have to be identified. Such a scenario leads to the notion of synchronization groups identified in [39], where the prediction runs on each group leader node.
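These three classes suggest a simple switch-over policy for choosing a predictor. The sketch below is hypothetical: the thresholds are rough readings of the short/medium/long boundaries above, and the enum names are placeholders rather than identifiers from the PTPd extension.

```c
typedef enum { PRED_ARITH, PRED_WEIGHTED, PRED_DCE } predictor_t;

/* Pick a predictor from the expected disconnection duration in seconds,
 * following the short / medium / long classification above.
 * Thresholds are illustrative assumptions, not taken from the thesis code. */
static predictor_t pick_predictor(unsigned long outage_s)
{
    if (outage_s <= 10)        /* short: a few missed sync messages      */
        return PRED_ARITH;     /* cheap enough to run on every node      */
    if (outage_s <= 3600)      /* medium: minutes of maintenance work    */
        return PRED_WEIGHTED;  /* run on synchronization-group leaders   */
    return PRED_DCE;           /* long: hours to days, grand master only */
}
```

In practice the disconnection duration is not known in advance, so such a policy would switch predictors as the outage grows rather than select one up front.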
5.5 Conclusion
This chapter first described the experimental settings as well as the properties of the offset time series which were used for training and validation of the later simulations. It then reported the prediction results obtained with five different methods for the offset time series; three criteria, based on the final prediction error, the maximal possible error and the processing cost, were briefly evaluated and commented on. Finally, the discussion was narrowed down to potential applications in substation automation. Three classes of disconnection scenarios, i.e., short, medium and long term disconnections, were identified from the results of the evaluation.
6
Conclusion and Future Work
6.1 Conclusion
This master thesis presented an overview of the work carried out for the project titled Improving Reliability of IEEE 1588 Protocol in Electric Substation Automation based on Clock Drift Prediction. It relates to the field of time synchronization in substation automation (SA) using the IEEE 1588 protocol. The aim of the project is to handle the so far un-addressed loss-of-GPS issues (which are common scenarios in an electric substation) by keeping the substation synchronized to GPS time, or at least as close to it as possible, so that the substation can avoid a hard resynchronization of the clock devices, and therefore a full re-initialization of the protection functions, when the GPS signal comes back. In terms of the reliability of electric substation automation, one pre-requisite to perform efficient protection functions is to have synchronized data provided by the various devices; the synchronization requirement ranges from 10 μsec to 100 μsec depending on the considered protection functions. Another problem that naturally arises is the extra processing cost of carrying out these predictions, which the methods implemented in this project attempt to keep low.

First, the various time prediction approaches for precision clocks were introduced. These methods were classified into three categories: physical clock modeling, time series prediction and direct extrapolation. The currently used prediction methods for precision clocks were investigated, and the reasons for confining this thesis to the last two categories of approaches were presented. Next, the procedure for time series prediction was outlined. The steps identified in this procedure were: collection of data, formation of a set of candidate models, selection of a criterion of model fitness, model identification and finally model validation. The two simple approaches, arithmetic average and weighted average, were presented. Then, a detailed description of the delay coordinate embedding (DCE) algorithm was provided. Finally, methods for data pre-processing as well as for measuring the prediction accuracy (validation) were also presented. Besides the two simpler arithmetic average and weighted average methods (direct extrapolation), the more intricate ones, based on a linear autoregressive (AR) model, were implemented for this thesis project.

In order to simulate the application scenarios of an IEEE 1588 network, the implementation part of this thesis work carried out five different prediction techniques which were built upon the open source implementation of IEEE 1588, PTPd. The implemented methods were tested on the clock offset time series collected from IEEE 1588 networks. The results of these methods, presented in Section 5.3, were analyzed and compared with each other, and three classes of GPS disconnection were subsequently identified. In general, the arithmetic average method has negligible processing overhead and the autoregressive based methods have the best prediction accuracy.
Although the results obtained by the arithmetic average method for short-term disconnections were good, its performance showed that it cannot satisfy the synchronization requirements when it comes to long-term disconnections. The arithmetic average method can therefore be considered a potential approach to handle short-term GPS disconnections. In contrast, the autoregressive based algorithm showed that it is able to provide a satisfying accuracy over a period of 48 hours (long-term GPS disconnection), but at a much higher computing cost. In the case of a medium disconnection, the weighted average method is the best trade-off in terms of accuracy and processing power. The ability of the AR model based algorithm (DCE) was best seen with the integration of a logarithm transformation in the initial phase of the prediction. Another feature of the DCE algorithm is that its error exhibits a linear increase with the time horizon, while that of the others grows exponentially.

In conclusion, it can be said that the delay coordinate embedding algorithm gave very encouraging results and has the potential for further improvements. The results show that the DCE algorithm integrated with a logarithm transformation can provide an accuracy of better than 80 microseconds over 48 hours of prediction, but at a high computing cost, while an arithmetic average based prediction technique can achieve an accuracy of better than 10 microseconds over a couple of seconds at an extremely low cost. These results demonstrate that clock drift prediction can be used in the context of electrical substation automation to improve the overall protocol reliability in SA without modifying the protocol itself. Moreover, compared to the redundancy concept for the master node, the same level of reliability is achieved at no additional hardware cost.
6.2 Future Work
The results presented in this thesis are promising; however, they need to be consolidated by running the prediction techniques on clocks of various qualities and under different environmental conditions. Moreover, an FPGA based implementation of the different prediction techniques is needed in order to off-load the main processor and therefore smoothly switch over between the different prediction techniques as the prediction window grows. Reliability for synchronization protocols has mainly been tackled from an architectural point of view [39][40], e.g., relying on redundant time servers and multiple network paths; such approaches can easily be combined with the idea proposed in this thesis work.
A
Appendix
A.1 Data Dictionary and Pseudo Code
1. Average.c
   Function Name - average
   Parameters
   (a) *arrFake: Pointer to array
   (b) arrLen: int
   pseudo code

   /* average(arrFake, length) */
   average(arrFake, length) {
       while i <= length do
           sum(arrFake(i));   // sum up previous observations
       end;
       return (sum/length);
   }

2. Prob.c
   Function Name - proba
   Parameters
   (a) *arrFake: Pointer to array
   (b) arrLen: int
   pseudo code

   /* proba(arrFake, length) */
   proba(arrFake, length) {
       while i <= length do
           if probability(arrFake(i)) > THRESHOLD {
               sum  = sum  + arrFake(i) * times(arrFake(i)) / length;
               prob = prob + times(arrFake(i));
           }
       end;
       return sum/prob;
   }

3. dvsAver.c
   Function Name - dvsAver
   Parameters
   (a) *arrFake: Pointer to array
   (b) arrLen: int
   pseudo code

   /* dvsAver(arrFake, length) */
   dvsAver(arrFake, length) {
       while i <= length do
           subset[i] = locate(arrFake);
           eud(testVector, subset[i], length);   /* Euclidean distance */
       end;
       while i <= K do
           /* sum up the K elements succeeding the K instances found */
           sum(subset[m]);
       end;
       return sum/K;
   }
4. DVS.c
   Function Name - dvs
   Parameters
   (a) *arrFake: Pointer to array
   (b) arrLen: int
   pseudo code

   /* dvs(arrFake, length) */
   dvs(arrFake, length) {
       testVector = locate(arrFake);   // locate the latest subsets
       while i <= length do
           subset[i] = locate(arrFake);
           eud(testVector, subset[i], length);
       end;
       return Cartesian(coefficient, testVector);
   }

5. logtrans.c
   Function Name: logtrans
   Parameters:
   (a) *arrFake: Pointer to array
   (b) arrLen: int
   pseudo code

   /* logtrans(arrFake, length) */
   logtrans(arrFake, length) {
       testVector = locate(arrFake);   // locate the testing subset
       while i <= length do
           subset[i] = locate(arrFake);
           eud(testVector, subset[i], length);
       end;
   }
6. errorCount.c
   Function Name: errorCount
   Parameters:
   (a) *arr: Pointer to array
   (b) *arrFake: Pointer to array
   pseudo code

   /* errorCount(arr, arrFake) */
   errorCount(arr, arrFake) {
       for (i = DisconnectionTime; i <= DisconnectionDuration; i++) {
           originalDev = sum(arr[i]);
           finalError  = sum(arr[i] - arrFake[i]);
       }
   }
wrapped under the include directory. To run and test the different prediction techniques, one can customize the input parameters (e.g., time disconnected, time re-connected) that control the simulation by editing interface.h, where all the constants are defined. To use a different predictor, one can modify the clock servo (servo.c) by changing the library it calls, for instance:
/* loss of GPS signal */
else {
    ...
    prediction = dvs(arrFake, *counter);   // calls dvs predictor
    arrFake[(*counter)-1] = prediction;    // update the training data
    ...
}
Bibliography
[1] K. Brand et al. (2003). Substation Automation Handbook. Utility Automation Consulting Lohmann.
[2] IEEE Instrumentation and Measurement Society. (2002). 1588 - IEEE Standard for a Precision Clock Synchronization Protocol for Networked Measurement and Control Systems. IEEE, Tech. Rep.
[3] http://ptpd.sourceforge.net/. Checked on May 2008.
[4] Allan, D. W. (1987). Time and frequency (time-domain) estimation and prediction of precision clocks and oscillators. IEEE Trans. on Ultrasonics, Ferroelectrics, and Frequency Control UFFC-34, pp. 647-654.
[5] D. Allan et al. (1992). Precision Oscillators: Dependence of Frequency on Temperature, Humidity and Pressure. In Proceedings of the IEEE Frequency Control Symposium.
[6] Brockwell, Peter J. and Davis, Richard A. (1987). Time Series: Theory and Methods. New York: Springer-Verlag Inc.
[7] Weigend, A. S. and N. Gershenfeld (Eds.). (1994). Time Series Prediction: Forecasting the Future and Understanding the Past. Reading, MA: Addison-Wesley.
[8] Josefina Lopez Herrera. (1999). Time Series Prediction Using Inductive Reasoning Techniques. Instituto de Organizacion y Control de Sistemas Industriales, Ph.D. Dissertation.
[9] Chatfield, C. (1989). The Analysis of Time Series. London: Chapman and Hall.
[10] Kolmogorov, A. (1941). Interpolation und Extrapolation von stationären zufälligen Folgen [Interpolation and extrapolation of stationary random sequences]. Bull. Acad. Nauk. U.S.S.R., Ser. Math. 5: 3-14.
[11] Wiener, N. (1949). The Extrapolation, Interpolation and Smoothing of Stationary Time Series with Engineering Applications. New York: Wiley.
[12] Priestley, M. (1981). Spectral Analysis and Time Series. London: Academic Press.
[13] Volterra, V. (1959). Theory of Functionals and of Integral and Integro-Differential Equations. New York: Dover.
[14] Brockwell, Peter J. and Davis, Richard A. (1996). Introduction to Time Series and Forecasting. New York: Springer-Verlag.
[15] Box, G. E. P. and F. M. Jenkins. (1994). Time Series Analysis: Forecasting and Control. Englewood Cliffs, NJ: Prentice Hall.
[16] Casdagli, M. and S. Eubank (Eds.). (1992). Nonlinear Modeling and Forecasting. Addison-Wesley.
[17] Back, B., Laitinen, T. and Sere, K. (1996). Neural networks and genetic algorithms for bankruptcy predictions. Expert Systems with Applications, Vol. 11, pp. 407-403.
[18] Leshno, M. and Spector, Y. (1996). Neural network prediction analysis: the bankruptcy case. Neurocomputing, Vol. 10, pp. 125-147.
[19] Pinson, P. and Kariniotakis, G. N. (2003). Wind power forecasting using fuzzy neural networks enhanced with on-line prediction risk assessment. IEEE Bologna PowerTech Conference.
[20] Li, S. (2003). Wind power prediction using recurrent multilayer perceptron neural networks. Power Engineering Society General Meeting, IEEE, Vol. 4, pp. 13-17.
[21] Tang, Z., de Almeida, C., Fishwick, P. A. (1991). Time series forecasting using Neural Networks vs. Box-Jenkins Methodology. Simulation, 57:5, pp. 303-310.
[22] Sharda, R., Patil, R. (1990). Neural Networks as Forecasting Experts: an Empirical Test. International Joint Conference on Neural Networks, Vol. 1, pp. 491-494, Washington, D.C.
[23] Makridakis, S. and M. Hibon. (1979). Accuracy of forecasting: An empirical investigation. J. Roy. Stat. Soc. A 142, 97-145.
[24] Kosko, B. (1991). Neural Networks for Signal Processing. Englewood Cliffs, NJ: Prentice Hall.
[25] Connor, J., L. E. Atlas, and D. R. Martin. (1992). Recurrent networks and NARMA modeling. In Advances in Neural Information Processing Systems, Vol. 4, pp. 301-308.
[26] Lepek, A. (1996). Clock Prediction and Cross-Sigma. European Frequency Time Forum.
[27] Yule, G. (1927). On a method of investigating periodicity in disturbed series with special reference to Wolfer's sunspot numbers. Phil. Trans. Roy. Soc. London A 226, 267-298.
[28] Tong, H. (1990). Non-linear Time Series: A Dynamical System Approach. Oxford University Press.
[29] M. Sauer, J. A. Yorke, and M. Casdagli. (1991). Embedology. J. Stat. Phys., pp. 597-616.
[30] F. Vernotte, J. Delporte, M. Brunet, and T. Tournier. (2001). Uncertainties of drift coefficients and extrapolation errors: Application to clock error prediction. Metrologia, Vol. 38, No. 4.
[31] Busca, G., Wang, Q. (2003). Time prediction accuracy for a space clock. Metrologia, Berlin, Vol. 40, pp. S265-S269.
[32] Zhu, S. (1997). Optimum precise-clock prediction and its applications. Frequency Control Symposium, 1997, pp. 412-417.
[33] Greenhall, Charles A. (2005). Optimal prediction of clocks from finite data. International Conference on Finite Power Series and Algebraic Combinations.
[34] Pineda, F. J. and J. C. Sommerer. (1993). Estimating generalized dimensions and choosing time delays: A fast algorithm. See Weigend and Gershenfeld (1994: 367-385).
[35] J. Farmer and J. Sidorowich. (1987). Predicting chaotic time series. Phys. Rev. Lett., Vol. 59(8), pp. 845-848.
[36] Goodman, T. M. and Ambrose, B. E. Time Series Prediction of Telephone Traffic Occupancy using Neural Networks. California Institute of Technology.
[37] J. Sanchez. (2005). Making a Time Series Stationary. Lecture: Introduction to Time Series, Department of Statistics, UCLA.
[38] M. Casdagli, A. Weigend. (1993). Exploring the Continuum Between Deterministic and Stochastic Modeling. In A. S. Weigend and N. A. Gershenfeld (Eds.), Time Series Prediction: Forecasting the Future and Understanding the Past, Reading, MA, pp. 347-366. Addison-Wesley.
[39] David L. Mills. (2006). Computer Network Time Synchronization: The Network Time Protocol. Taylor and Francis CRC Press.
[40] S. Meier. (2007). IEEE 1588 applied in the environment of high availability LANs. International IEEE Symposium on Precision Clock Synchronization for Measurement, Control and Communication, 2007.
[41] Graupe, Daniel. (1997). Principles of Artificial Neural Networks. New Jersey: World Scientific Publishing Co. Pte. Ltd.