Parallel Computing Based Distribution Network - Reliability Evaluation Technology Research
Abstract: With the development of the social economy and the expansion of distribution network scale, the demand for power supply reliability is gradually increasing. However, traditional reliability analysis of distribution network power supply is better suited to simplified small-scale systems and can hardly meet the reliability calculation task of large-scale distribution networks in the data age. In this paper, an analysis method based on the Spark parallel computing platform is proposed, in which the system reliability analysis relies on Map-Reduce. First, the network topology data are hash-mapped to determine the connections of every element. Then the minimum cut-sets of each load point are determined by depth-first search using Map-Reduce, and the reliability indexes are calculated according to the series-parallel relationships of the components. On this basis, the reliability indexes of the system are corrected by considering the influence of standby power supplies and distribution automation equipment in the network. Finally, the reliability evaluation of a 10 kV distribution network in a certain area is implemented on the Spark platform, and the calculation speed and accuracy of the algorithm are verified.

Keywords: Spark, Map-Reduce, reliability of distribution network.

I. INTRODUCTION

With the rapid development of the distribution network (DN) in China, it has become one of the most complex systems in the city. Considering the huge investment in the DN and the importance of the economic entities it supports, its reliability evaluation is an essential component of power system analysis [1-3].

One main difference between reliability evaluation of the power system and that of the distribution network lies in the calculation scale [4]. The number of distribution lines and appliances in a DN system is considerable. In a typical Chinese city, the number of feeder lines can exceed 5000 or even 10000. In applications with strict real-time requirements, the required evaluation speed is a challenge for traditional computation platforms and algorithms [5-6]. In recent research, parallel computing technologies have been used in reliability evaluation. In [7], an IEEE reliability test system is studied using a parallel genetic algorithm with multiple processors. The results show that this method can improve the computational efficiency of power system reliability evaluation. It is shown in [8] that state space pruning with a parallel correction technique can improve the convergence speed of reliability computation. In addition, Map-Reduce is a typical parallel programming model that can process massive data quickly, shields low-level implementation details and reduces the difficulty of parallel programming [9].

This paper focuses on introducing a newly developed parallel computing platform, Apache Spark, into DN reliability evaluation [10]. The use of the minimum cut-sets algorithm in system reliability analysis is introduced, and the algorithm is improved to fit the Spark computing platform. Based on the Spark computing platform and the minimum cut-sets algorithm, fast and accurate reliability calculation of the distribution network is realized.

II. SPARK BASED PARALLEL COMPUTING PLATFORM

A. Apache Spark based distributed parallel computing technology

Parallel computing is defined in contrast to traditional serial computing, and it can be implemented with multithreading or with a computing cluster to achieve better performance. In this paper, we use computing cluster technology to realize parallel computing based reliability analysis of distribution networks, for three reasons:

1) Computing clusters are easy to extend. If more computing power is needed, more nodes can be added to the cluster.

2) Powerful computing platforms are available for computing clusters. Thanks to the rapid development of big data technologies, foundations and companies have released their own parallel computing platforms, which integrate storage, task management and data structures. Most of them are open source.

3) Cloud computing reduces the cost of a computing cluster. Many internet companies, such as Amazon and Alibaba, provide cloud computing services that can be applied for and managed online. Meanwhile, the configuration of the clusters can be adjusted in real time according to the performance requirement. Compared with purchasing expensive multi-core CPUs, renting a cloud computing environment is cheaper and more convenient.

The MapReduce computing model is the most widely used parallel programming model, and it can be easily adapted to driver-worker distributed parallel computing platforms. Since Map operations and Reduce operations are independent, the scheduling and synchronization of worker nodes are easy. Programmers do not need to know the details of task management on different nodes; they only need to split the algorithm into a Map step and a Reduce step.

However, traditional Hadoop MapReduce parallel computing is based on an open-loop data flow model. Every data processing step involves reading and writing files on hard drives. Meanwhile, different worker nodes cannot communicate during the computing process. This characteristic prevents efficient iterative computation, which is commonly required in power system analysis, including distribution network reliability evaluation.

These problems can be solved by the in-memory computing and Resilient Distributed Datasets (RDD) provided by the Apache Spark computing platform. An RDD is a fault-tolerant, parallel data structure that allows users to keep data in memory or on hard drives and to control data partitioning. Spark also provides a set of operations on RDD based data structures, which can be classified into transformations, such as map, flatMap, filter, join, groupBy and reduceByKey, and actions, such as reduce, collect and count. The RDD abstraction makes Spark suitable for efficient in-memory MapReduce operations. For these reasons, Spark is selected as the basic parallel computing platform in our work.

B. Design of the parallel computing platform

The practical parallel computing platform for power system reliability evaluation includes three components: data storage, computing framework and interface/development environment. Considering the requirements of analyzing large-scale distribution networks, the platform needs to:

1) Store and read large volumes of hybrid data, including historical data, real-time data, structured data and unstructured data.

2) Provide high-performance computing that accelerates algorithms with a distributed parallel structure and allows programs to be implemented generally, efficiently and flexibly.

3) Offer a friendly development environment that assists prototype testing, algorithm inspection and application deployment.
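The split of an algorithm into a Map step and a Reduce step, as described in Section II-A, can be illustrated with a minimal pure-Python sketch. It does not use Spark itself: `reduce_by_key` is a hypothetical stand-in that mimics the grouping behaviour of Spark's `reduceByKey`, and the component records are illustrative values rather than data from this paper.

```python
from collections import defaultdict
from functools import reduce


def map_step(record):
    """Map: turn one component record into a (key, value) pair.

    record = (load point, failure rate per year, repair time in hours).
    The value carries the failure rate and the annual outage hours.
    """
    load_point, lam, r_hours = record
    return load_point, (lam, lam * r_hours)


def add_pairs(a, b):
    """Reduce: combine two values that share the same key."""
    return a[0] + b[0], a[1] + b[1]


def reduce_by_key(pairs, combine):
    """Hypothetical local stand-in for Spark's reduceByKey."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: reduce(combine, values) for key, values in groups.items()}


# Illustrative records: series components feeding two load points.
records = [("L1", 0.04, 8.0), ("L1", 0.01, 100.0), ("L2", 0.04, 8.0)]
totals = reduce_by_key(map(map_step, records), add_pairs)
```

Because `map_step` looks at one record at a time and `add_pairs` is associative, both steps can be distributed over worker nodes without the programmer managing the tasks, which is the property the text relies on.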
[Figure: architecture of the parallel computing platform. Interactive user interface (Jupyter Notebook); Jupyter kernels (Python, Scala, R, Java, C++, ...); Ipyparallel (IPython kernel); Apache Spark with Spark Streaming, GraphX (graph), Spark SQL and MLlib (machine learning); data storage (historical data storage, real-time data storage).]

sets, is more often used in distribution network reliability evaluation [11-12].

[Figure: example network with nodes P0 to P6.]

[Figure: Map-Reduce flow of the reliability computation. Input: read topology and reliability parameters and determine the source and backup supply. The input splits are passed to Map tasks 1 to n as <L, topo> pairs; each Map task runs a depth-first search from the load node to the source and the backup supply. After the data are merged, the Reduce tasks calculate the failure rate and outage time of each load node according to the cut-sets and output <Li, λi, Ui>.]
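The Map stage outlined above, a depth-first search from each load node towards the source followed by cut-set extraction, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the five-line network, the node names and the helpers `build_adjacency`, `simple_paths` and `minimal_cut_sets` are all hypothetical, cut-sets are derived from the enumerated paths, and only cut-sets up to second order are searched.

```python
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency list: node -> [(neighbour, component name)]."""
    adj = {}
    for a, b, comp in edges:
        adj.setdefault(a, []).append((b, comp))
        adj.setdefault(b, []).append((a, comp))
    return adj


def simple_paths(adj, load, sources):
    """Depth-first search for every simple path from a load node to a source.

    Each path is returned as the set of components (lines) it uses.
    """
    paths = []

    def dfs(node, visited, comps):
        if node in sources:
            paths.append(frozenset(comps))
            return
        for nxt, comp in adj.get(node, []):
            if nxt not in visited:
                dfs(nxt, visited | {nxt}, comps + [comp])

    dfs(load, {load}, [])
    return paths


def minimal_cut_sets(paths, max_order=2):
    """A cut-set intersects every path; keep only the minimal ones."""
    comps = sorted(set().union(*paths))
    cuts = []
    for order in range(1, max_order + 1):
        for cand in combinations(comps, order):
            s = set(cand)
            if any(c <= s for c in cuts):
                continue  # contains a smaller cut-set, so not minimal
            if all(s & p for p in paths):
                cuts.append(frozenset(s))
    return cuts


# Hypothetical network: source S, junctions A and B, load points L1 and L2.
edges = [("S", "A", "line1"), ("A", "L1", "line2"), ("A", "B", "line3"),
         ("B", "L1", "line4"), ("B", "L2", "line5")]
adj = build_adjacency(edges)

# "Map" over load points: each one is processed independently of the others.
cuts = dict(map(lambda lp: (lp, minimal_cut_sets(simple_paths(adj, lp, {"S"}))),
                ["L1", "L2"]))
```

Since each load point is handled without reference to the others, the final `map` call is the part that a Spark job would distribute across worker nodes.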
4) Parallel computing on reliability of load points and calculating reliability indexes

[Figure: flow chart in two stages. Main supply: read the components of the minimum cut-set, judge the connection relation between components and cut-sets, handle the elements in a cut-set with the parallel-system formula and the cut-sets with the series formula, giving λ'L1 and U'L1. Backup supply correction: read the minimum cut-set and the backup minimum cut-set, use difference and intersection to calculate the reliability index of the backup-supply components that are not in the minimum cut-set, correct λ'L1 and U'L1 to obtain λL1 and UL1, and output min(λL1, UL1) to select the best backup.]
Fig. 4. Algorithm of reliability computing of load point

The algorithm for computing load-point reliability is shown in Fig. 4. First, the reliability calculation under the main power supply is carried out. The probabilities of every main-power-supply cut-set of each load point form a parameter matrix. The series formula and the parallel formula are applied according to the connection between components and cut-sets. Thus, the failure rate $\lambda'_{L1}$ and interruption duration $U'_{L1}$ can be calculated with the following equations:

$$\begin{cases} \lambda_{ci} = \prod_i \lambda_i r_i \left( \sum_i 1/r_i \right) \\ U_{ci} = \prod_i \lambda_i r_i \end{cases} \tag{3}$$

$$\begin{cases} \lambda'_{L1} = \sum_i \lambda_{ci} \\ U'_{L1} = \sum_i \lambda_{ci} r_{ci} \end{cases} \tag{4}$$

The elements in a cut-set are handled by the parallel-system formula (3), and the series formula (4) is used to deal with the cut-sets.

The reliability indexes include the system average interruption frequency index (SAIFI), the system average interruption duration index (SAIDI) and the average service availability index (ASAI), which are defined in (5)-(7), written here in their standard form:

$$\mathrm{SAIFI} = \frac{\sum_i \lambda_{Li} N_i}{\sum_i N_i} \tag{5}$$

$$\mathrm{SAIDI} = \frac{\sum_i U_{Li} N_i}{\sum_i N_i} \tag{6}$$

$$\mathrm{ASAI} = 1 - \frac{\sum_i U_{Li} N_i}{8760 \cdot \sum_i N_i} \tag{7}$$

where $\lambda_{Li}$ and $U_{Li}$ are the failure rate and annual interruption duration of load point $i$, and $N_i$ is the number of customers it serves.

These indexes are computed as the mathematical expectation of the failure rate and interruption duration over all load points. This process is carried out in parallel as a reduce action in Spark.

IV. EXPERIMENT AND ANALYSIS

The experiments are implemented on Ubuntu Server 16.04 with Spark 2.1.0, Python 3.5, Pandas 0.19.2 and pandapower 1.2. The hardware configuration is an Intel i7-6700K CPU (4.0-4.2 GHz) with 32 GB of DDR4-2133 memory. The IEEE RBTS Bus 4 33 kV model used in [17] is used to verify the parallel computing technology, as shown in Fig. 5. The reliability parameters are listed in Table II.

TABLE II. RELIABILITY PARAMETERS
Equipment          Line      Circuit Breaker   Transformer
λ (times/year)     0.04      0.00514           0.01
r (hours/time)     8         8                 100
λ' (times/year)    0.0143    0.2055            0.1444
r' (hours/time)    2.18      8                 10
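As a worked illustration of the parallel formula (3), the series formula (4) and the system indexes (5)-(7), the sketch below uses the λ and r values of the line and the transformer from Table II. Several things are assumptions rather than statements from the paper: repair times are converted from hours to years before the product in (3) is formed, the indexes use the customer-weighted textbook forms of SAIFI, SAIDI and ASAI, and the cut-sets and customer counts are made up for the example.

```python
HOURS_PER_YEAR = 8760.0


def parallel_cut_set(components):
    """Parallel formula (3) for one cut-set.

    components: list of (failure rate per year, repair time in hours).
    Returns (cut-set failure rate per year, outage hours per year).
    Assumption: r is converted to years so the units are consistent.
    """
    product = 1.0
    inverse_sum = 0.0
    for lam, r_hours in components:
        r_years = r_hours / HOURS_PER_YEAR
        product *= lam * r_years
        inverse_sum += 1.0 / r_years
    lam_c = product * inverse_sum
    u_c = product * HOURS_PER_YEAR
    return lam_c, u_c


def series_load_point(cut_sets):
    """Series formula (4): sum the contributions of all cut-sets."""
    results = [parallel_cut_set(c) for c in cut_sets]
    return sum(l for l, _ in results), sum(u for _, u in results)


def system_indexes(load_points):
    """Indexes (5)-(7); load_points: list of (lam, U in hours/year, customers)."""
    customers = sum(n for _, _, n in load_points)
    saifi = sum(lam * n for lam, _, n in load_points) / customers
    saidi = sum(u * n for _, u, n in load_points) / customers
    asai = 1.0 - saidi / HOURS_PER_YEAR
    return saifi, saidi, asai


LINE = (0.04, 8.0)            # lambda and r of a line, Table II
TRANSFORMER = (0.01, 100.0)   # lambda and r of a transformer, Table II

# Hypothetical cut-sets: one first-order and one second-order cut-set.
lp1 = series_load_point([[LINE]])
lp2 = series_load_point([[LINE], [LINE, TRANSFORMER]])

# Hypothetical customer counts of 100 and 50.
saifi, saidi, asai = system_indexes([(*lp1, 100), (*lp2, 50)])
```

For a first-order cut-set, (3) reduces to the component's own λ and λ·r, and for two components it reduces to the familiar λ1·λ2·(r1 + r2). As a cross-check on (7), the SAIDI of 0.2658 h/year reported in Table III gives 1 - 0.2658/8760 ≈ 0.999970, which matches the reported ASAI.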
Fig. 5. System network structure

TABLE III. RELIABILITY EVALUATION RESULTS
Reliability indices   SAIFI (times/year)   SAIDI (h/year)   ASAI
Value                 0.1266               0.2658           0.999970

The reliability evaluation result is listed in Table III and is identical to the result in [17]. The parallel computing performance is evaluated with a model of the distribution network of Taizhou, China. The model contains 207 nodes, 203 lines, 45 switches/breakers, 29 load points and 5 backup power supplies. The performance of traditional serial computing is compared with that of parallel computing. The serial computation takes 4.116 s, while the parallel computation takes 2.75 s, a reduction of about 33% in run time (a speedup of about 1.5 times). This result verifies the efficiency of the proposed parallel computing based reliability evaluation. The performance can be further enhanced when more nodes participate in the cluster.

V. CONCLUSION

In this paper, an Apache Spark based parallel computing platform is designed for distribution network reliability evaluation. The evaluation algorithm contains four steps and is designed to run in parallel with RDD technology. The experimental result indicates that the proposed parallel computing method can enhance the efficiency and speed of distribution network reliability evaluation.

REFERENCES

[1] Alan J. McBride, Andrew R. McGee. "Assessing smart grid security," Bell Labs Technical Journal, vol. 17, pp. 87-103, 2012.
[2] Andrej Schreiner, Gerd Balzer, Armin Precht. "Risk sensitivity of failure rate and maintenance expenditure," 2011 IEEE 11th International Conference on Probabilistic Methods Applied to Power System, pp. 137-142, 2010.
[3] R.M. Vitorino, L.P. Neves, H.M. Jorge. "Network reconfiguration to improve reliability and efficiency in distribution system," 2009 IEEE Bucharest PowerTech, pp. 1-7, 2009.
[4] Qian Xie, Haozhong Cheng, Yi Zhang, et al. "Active distribution network planning based on active management," 2014 China International Conference on Electricity Distribution, pp. 1261-1265, 2014.
[5] Baek, Joonsang, et al. "A Secure Cloud Computing Based Framework for Big Data Information Management of SmartGrid," IEEE Transactions on Cloud Computing, vol. 3, pp. 233-244, 2015.
[6] Song, Yaqi, G. Zhou, and Y. Zhu. "Present Status and Challenges of Big Data Processing in SmartGrid," Power System Technology, vol. 37, pp. 927-935, 2013.
[7] Lingfeng Wang, Chanan Singh. "Multi-deme parallel genetic algorithm in reliability analysis of composite power systems," 2009 IEEE Bucharest PowerTech, pp. 1-6, 2009.
[8] Robert C. Green, Lingfeng Wang, Mansoor Alam, et al. "Intelligent and parallel state space pruning for power system reliability analysis using MPI on a multicore platform," ISGT 2011, pp. 1-8, 2011.
[9] Dean J, Ghemawat S. "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, pp. 107-113, 2008.
[10] Liu, Keyan, et al. "Big Data Application Requirements and Scenario Analysis in Smart Distribution Network," Zhongguo Dianji Gongcheng Xuebao/Proceedings of the Chinese Society of Electrical Engineering, vol. 35, pp. 287-293, 2015.
[11] Zefang Zhou, Zhean Gong, Bo Zeng, et al. "Reliability analysis of distribution system based on the minimum cut-set method," 2012 International Conference on Quality, Reliability, Risk, Maintenance, and Safety Engineering, pp. 112-116, 2012.
[12] Vijay Venu Vadlamudi, Oddbjørn Gjerde, Gerd Kjølle. "Impact of protection system reliability on power system reliability: A new minimal cutset approach," 2014 International Conference on Probabilistic Methods Applied to Power Systems (PMAPS), pp. 1-6, 2014.
[13] https://pandapower.readthedocs.io/en/v1.3.0/elements.html#
[14] Tang H, Gulbeden A, Zhou JY, et al. "A self-organizing storage cluster for parallel data-intensive applications," IEEE Computer Society, pp. 52-63, 2004.
[15] Jane C C, Lin J S, Yuan J. "Reliability evaluation of a limited-flow network in terms of minimal cutsets," IEEE Transactions on Reliability, vol. 42, pp. 354-361, 1993.
[16] R. Billinton, R. Allan. Reliability Evaluation of Engineering Systems: Concepts and Techniques (second edition). New York and London: Plenum Press, 1992.
[17] R.N. Allan, R. Billinton, I. Sjarief, et al. "A reliability test system for educational purposes-basic distribution system data and results," IEEE Transactions on Power Systems, vol. 6, pp. 813-820, 1991.