Ahmad Taher Azar
Sundarapandian Vaidyanathan Editors
Computational Intelligence Applications in Modeling and Control
Studies in Computational Intelligence
Volume 575
Series editor
Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland
e-mail: [email protected]
Editors
Ahmad Taher Azar
Faculty of Computers and Information
Benha University
Benha, Egypt

Sundarapandian Vaidyanathan
Research and Development Centre
Vel Tech University
Chennai, India
Preface
this book after a rigorous review process in the broad areas of Control Systems,
Power Electronics, Computer Science, Information Technology, modeling, and
engineering applications. Special importance is given to chapters offering practical
solutions and novel methods for recent research problems in the main areas of this
book, viz., Control Systems, Modeling, Computer Science, IT and engineering
applications.
Intelligent control methods can be broadly divided into the following areas:
• Neural network control
• Fuzzy logic control
• Neuro-fuzzy control
• Genetic control
• Expert Systems
• Bayesian control
• Intelligent agents
This book discusses trends and applications of computational intelligence in
modeling and control systems engineering.
This book makes a modest attempt to cover the framework of computational
intelligence and its applications in a single volume. It is not only a valuable
title on the publishing market, but also a successful synthesis of computational
intelligence techniques in the world literature. Several multidisciplinary
applications in Control, Engineering, and Information Technology, where CI has
excellent potential for use, are discussed in this book.
Book Features
• The book chapters deal with recent research problems in the areas of intelligent
control, computer science, information technology, and engineering.
• The chapters contain a good literature survey with a long list of references.
• The chapters are well-written with a good exposition of the research problem,
methodology, and block diagrams.
• The chapters are lucidly illustrated with numerical examples and simulations.
• The chapters discuss details of engineering applications and the future research
areas.
Audience
This book is primarily meant for researchers from academia and industry who are
working in the research areas of Computer Science, Information Technology,
Engineering, and Control Engineering. The book can also be used at the graduate or
advanced undergraduate level as a textbook or major reference for courses such as
intelligent control, mathematical modeling, computational science, numerical sim-
ulation, applied artificial intelligence, fuzzy logic control, and many others.
Acknowledgments
As the editors, we hope that the chapters in this well-structured book will stimulate
further research in computational intelligence and control systems and promote their
use in real-world applications.
We hope sincerely that this book, covering so many different topics, will be very
useful for all readers.
We would like to thank all the reviewers for their diligence in reviewing the
chapters.
Special thanks go to Springer, especially the book's editorial team.
R. Vashist (&)
Faculty of Computer Science, Shri Mata Vaishno Devi University,
Katra, Jammu and Kashmir, India
e-mail: [email protected]
A. Vashishtha
Faculty of Management, Shri Mata Vaishno Devi University,
Katra, Jammu and Kashmir, India
e-mail: [email protected]
ranking generated by the CAMEL model is verified using lower and upper
approximation. This chapter demonstrates the accuracy of the ranks generated by
the CAMEL model, and decision rules are generated by the rough set method for
the CAMEL model. Further, the most important attribute of the CAMEL model is
identified as the risk-adjusted capital ratio (CRAR) under the capital adequacy
attribute, and the results generated by rough set theory confirm the accuracy of
the ranks generated by the CAMEL model for various Indian public-sector banks.
Keywords: CAMEL model · Rough set · Rules · Reduct · Core · Ranking of banks
1 Introduction
2 Related Work
In response to the financial crises of recent decades, there is a growing volume
of literature devoted to analyzing the sources and effects of financial crises,
and to predicting them and suggesting remedies for their prevention. Researchers
have taken a keen interest in testing and improving the accuracy of early warning
systems for the timely detection and prevention of crises.
Wheelock and Wilson [29] examine the factors that are believed to be relevant to
predicting bank failure. The analysis is carried out in terms of competing-risks hazard
models. It is concluded that the more efficiently a bank operates, the less likely it is
to fail. The possibility of bank failure is higher in respect of banks having lower
capitalization, higher ratios of loans to assets, poor quality of loan portfolios and
lower earnings.
Estrella and Park [9] examine the effectiveness of three capital ratios (the first
based on leverage, the second on gross revenues, and the third on risk-weighted
assets) for predicting bank failure. The study is based on 1988–1993 data pertaining
to U.S. banks. The results show that over one- or two-year time horizons the simple
leverage and gross revenue ratios perform as well as the more complex risk-weighted
ratio, while over longer periods the simple ratios, being less costly to implement,
remain useful supplementary indicators of capital adequacy.
Canbas et al. [4] demonstrate that an Integrated Early Warning System (IEWS)
can be employed to predict bank failure more accurately. The IEWS is conceived in
terms of Discriminant Analysis (DA), Logit/Probit regression, and Principal
Component Analysis (PCA). In order to test the predictive power of the IEWS, the
authors use the data for 40 privately owned Turkish commercial banks. The results
show that the IEWS has more predictive power relative to the other techniques.
Kolari et al. [13] and Lanine and Rudi [14] are other notable studies that have
attempted to develop an early warning system based on Logit and the Trait
Recognition method. The authors test the predictive ability of the two models in terms of
their prediction accuracy. The Trait Recognition model is found to outperform the
Logit model.
Tung et al. [28] explain and predict financial distress of banks using Generic
Self-organizing Fuzzy Neural Network (GenSoFNN) based on the compositional
rule of inference (CRI). The study is based on a population of 3,635 US banks
observed over a 21-year period, 1980–2000. The authors found the performance
of their bank failure classification and EWS encouraging.
Mannasoo and Mayes [16] employ the logit technique with a five-component
CAMELS model, together with structural and macroeconomic factors, to demonstrate
that besides bank-specific factors, macroeconomic conditions and institutional
frameworks were also crucial determinants of bank distress in East European
countries over the period 1996–2003.
Demirguc-Kunt et al. [7] analyze structural and other changes in the banking
sector following a bank crisis. The authors observe that individuals and companies
take away their funds from inefficient, weaker banks and invest them in stronger
banks.
3 Methodology
set for generating rules and for finding the reduct and core. The accuracy of the
ranking generated by the CAMEL model is verified using lower and upper
approximation.
When ranks are assigned simply on the basis of an array of financial indices arranged
in ascending/descending order (the Unclassified Rank Assignment Approach), some
absurd outcomes may follow. Two such possibilities are discussed here. A major
problem with rank assignment occurs when the ranks fail to represent the relativity
among magnitudes of financial indices on which they are based. Consider the ranks,
assigned to various banks on the basis of rates-of-return (on equity) achieved by
them, as specified in Table 1 (col. 2.1). In this case, improvements in rate-of-return
positions are not proportionally reflected in the corresponding ranks. For instance, a
difference of 1.83 in rates-of-return between the bank at Sr. No. 8 (having an
11.76 % rate-of-return) and the bank at Sr. No. 7 (having a 9.93 % rate-of-return),
which amounts to an increase of 18.43 %, induces only a one-step improvement in
rank, from 10 to 9. In contrast, the difference between the bank at Sr. No. 9 (having a
17.65 % rate-of-return) and the bank at Sr. No. (having a 15.91 % rate-of-return) is
even less, 1.74 (which amounts to an increase of 10.94 %). Yet it leads to a much
greater improvement in ranks, that is, a three-step improvement from rank 7 to
rank 4. Other rank figures may also reveal similar anomalies. Apparently, this is an
absurd outcome.
For accuracy of ranking, it is also necessary that changes over time in the financial
performance of banks are duly reflected in the ranks assigned to them. Suppose in
due course there is improvement in the earnings position of one bank and
deterioration in that of another. If the ranks assigned to these banks are still
found to be unchanged, we may reasonably infer that the ranking methodology does not
provide for accuracy. It is also possible that in one case a relatively small increase in
earnings of a bank may lead to improvement in its rank position, whereas a rela-
tively substantial decline in the earnings of another bank may leave the rank
unchanged. This is another instance of an absurd outcome. For illustration, consider
the behavior of earnings of banks during two time-periods, t = 1 and t = 2 as
depicted in Table 1.
For ensuring a fair degree of rank accuracy, we must have an approach that pro-
vides for a more or less systematic relationship between the behavior of ranks and
financial indices on which the ranks are based. This may be expected under the
Classified Rank Assignment Approach. Computation of ranks under this approach
involves the following steps. Consider, for instance, col. 2.1 of Table 2:
1. There are in all 10 (N) indices relating to return on equity. Take the difference
(D) between the maximum and the minimum indices, and divide it by 9 (that is
N − 1) in order to classify the given indices under 10 class intervals each having
the same class width, with the minimum and maximum indices falling at the midpoint
of the 1st and the 10th class intervals, respectively. In the present case, we have
D = 21.03 − 9.93 = 11.1. Dividing D by 9 we have 11.1/9 = 1.233 as the width
of each of the 10 class intervals.
2. In order to determine the first class interval, divide 1.233 by 2. We have 1.233/
2 = 0.616 which is the difference between the mid value and lower/upper class
limits. Accordingly, for the 1st class interval the lower class limit is 9.31 (that is,
9.93 − 0.62) and the upper class limit is 10.54 (that is, 9.31 + 1.23). Limit values
for the rest of the class intervals can be obtained by adding 1.23 to the lower
limit (that is, the upper limit of the previous class interval) each time.
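To make these two steps concrete, the following short Python sketch (our
illustration, not part of the chapter; the function name craa_ranks is
hypothetical) computes CRAA ranks from a list of financial indices:

def craa_ranks(indices):
    """Classified Rank Assignment Approach (CRAA), following the two steps
    above: build N equal-width class intervals whose first and last midpoints
    coincide with the minimum and maximum indices, then rank each index by
    the interval it falls into (rank 1 = top interval)."""
    n = len(indices)
    width = (max(indices) - min(indices)) / (n - 1)   # D / (N - 1)
    lower = min(indices) - width / 2                  # lower limit of the 1st interval
    ranks = []
    for x in indices:
        k = int((x - lower) // width)                 # 0-based interval number
        k = min(k, n - 1)                             # keep the maximum inside the last interval
        ranks.append(n - k)                           # higher interval -> better (smaller) rank
    return ranks

# With N = 10 and indices ranging from 9.93 to 21.03, width = 11.1/9 = 1.233,
# so 9.93 falls in the 1st interval (rank 10) and 21.03 in the 10th (rank 1).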
Class interval: 9.31–10.54  10.54–11.77  11.77–13.00  13.00–14.23  14.23–15.46  15.46–16.69  16.69–17.92  17.92–19.15  19.15–20.38  20.38–21.68
Rank:           10          9            8            7            6            5            4            3            2            1
A considerable variety of choice is often seen in the literature as regards the choice
of CAMEL components and the financial ratios or indices within each component.
There is no uniform approach as to how many and which components/indices
should be included. The choice is mainly specific to the objective of the study and
investigator's critical judgment. The objective of the present study is to rank public
sector banks in India on the basis of their financial soundness particularly in relation
to the concerns of debt holders. The choice of financial indices has been motivated
mainly by this consideration. We have considered the ten largest Indian public sector
banks on the basis of their deposit-base over a five-year period, 2008–2009 through
2012–2013 so as to ensure that these banks constitute more or less a homogeneous
group in terms of ownership structure and regulatory and administrative control
over their policies. For ranking of these banks, we have employed in all eleven
financial indices relating to capital adequacy, asset quality, management efficiency,
earnings and liquidity. Each of these financial indices has been specified by a
quantity which represents the mean value over the five-year period, 2008–2009
through 2012–2013. The rankings of banks obtained under the two approaches,
Unclassified Rank Assignment Approach (URAA) and Classified Rank Assignment
Approach (CRAA), are specified in Table 3 through Table 8. For accuracy of ranks
it is important to ensure that financial indices included under various components
are relevant and basic to the purpose in question and no financial parameter is
included which is redundant. To comply with this requirement, ranks representing
the overall financial soundness of various banks were regressed on the eleven
indices to identify whether any of them needed to be excluded. This exercise revealed that
Net NPA ratio (in relation to asset quality) and business per employee (in relation to
management efficiency) fell under the category of ‘excluded variables’. Accord-
ingly, these indices were dropped while computing ranks.
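The chapter does not give the exact regression procedure, but the screen it
describes can be sketched as follows: a financial index is "excluded" when it
adds no independent information to the regression of overall ranks on the
indices, i.e. when its column is numerically a linear combination of the columns
already kept. The function below is our hypothetical illustration of that idea,
not the authors' actual code.

import numpy as np

def excluded_variables(X, names, tol=1e-8):
    """X has one row per bank and one column per financial index
    (e.g. 10 banks x 11 indices); names labels the columns. A column is
    flagged as an 'excluded variable' if it is numerically a linear
    combination of the columns retained so far."""
    kept, excluded = [], []
    for j, name in enumerate(names):
        trial = kept + [j]
        if np.linalg.matrix_rank(X[:, trial], tol=tol) == len(trial):
            kept.append(j)            # adds independent information
        else:
            excluded.append(name)     # redundant for the regression
    return excluded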
Capital Adequacy: Bank’s capital or equity provides protection to the depositors
against its possible financial distress. A bank having a shrinking capital base relative
to its assets is vulnerable to potential financial distress. This is the main reason that in
the literature on analysis of financial distress of banks, this component appears as a
critical factor practically in all empirical research. Bank equity ratio is defined in two
broad contexts: risk-weighted capital ratio and non-risk-weighted capital ratio. In
some of the studies the latter ratio is employed. It is argued that the risk-weighted
ratios are open to manipulation: a bank, while adjusting its assets for the associated
risk, may be tempted to apply weightings to different categories of risky assets that
help it conceal the real position with regard to its financial fragility. This
possibility is recognized by a number of studies such as [5, 15]. In view of this
possibility studies such as [22, 27] employ non-risk-weighted capital ratios. The
Indian banking sector is under the stringent regulatory regime of the RBI, and
possibilities of such manipulations are believed to be practically non-existent. Accordingly, we
have employed the risk-adjusted capital ratio, CRAR. It represents bank’s qualifying
capital as a proportion of risk adjusted (or weighted) assets. The RBI has set the
minimum capital adequacy ratio at 9 % for all banks to ensure that banks do not
expand their business without having adequate capital. A ratio below the minimum
indicates that the bank is not adequately capitalized to expand its operations. As a
measure of bank’s capital adequacy, we have supplemented the CRAR by including
another ratio, namely, D/E ratio (deposit to equity ratio). The capital adequacy of
different banks and their rank position based on the same are specified in Table 3.
Asset Quality: The inferior quality of bank assets accentuates its vulnerability to
financial distress. Losses resulting from such assets eat into the capital base of the
bank and become one of the main causes of bank failure. Since the predominant
business of a commercial bank is lending, loan quality is an important factor in the
context of its financial soundness. A relatively dependable measure of overall loan
quality of a bank is the Net NPA ratio which represents non-performing assets as a
proportion of its total loans (advances). According to the RBI, NPAs are those
assets for which interest is overdue for more than 90 days (or 3 months). For
assessing asset quality besides the Net NPA ratio we have included another ratio, as
well, that is, the ratio of government and other approved securities to total assets as
Table 3 Capital adequacy

Name of bank          CRAR (%)  Rank (URAA/CRAA)  D/E    Rank (URAA/CRAA)  Average rank (URAA/CRAA)  Capital adequacy rank R(C) (URAA/CRAA)
State Bank of India   13.28     4 / 5             12.79  1 / 1             2.5 / 3                   2 / 2
Bank of Baroda        14.18     1 / 1             14.85  2 / 4             1.5 / 2.5                 1 / 1
Punjab National Bank  13.19     5 / 5             16.18  7 / 5             6 / 5                     4 / 4
Bank of India         12.22     9 / 10            15.73  6 / 5             7.5 / 7.5                 6 / 8
these are believed to be nearly risk-free assets. The greater the proportion of these
securities, the greater the provision for depositors' concerns. The asset quality of
different banks and their rank positions based on the same are specified in Table 4.
Management Efficiency: The success of any institution depends largely on
competence and performance of its management. The more vigilant and capable the
management, the greater the bank’s financial soundness. There is no doubt about
these assertions. But the problem is with regard to identifying and measuring the
effect of management on bank’s financial performance. There is no direct and
definite measure in this regard. Management influences can be gauged in many
ways, for instance, asset quality or return on assets, business expansion and optimal
utilization of bank’s skills and material resources. In the present study we have
attempted to capture the influence of management quality in terms of three indices,
namely, return on advances (adjusted for cost of funds), business per employee and
profit per employee. Table 5 presents relevant information in this regard together
with the ranks assigned to various banks on this count.
Earnings: Earnings and profits are important for strengthening capital base of a
bank and preventing any financial difficulty. There should be not only high profits
but also sustainable profits. In empirical research (for instance, [2, 3, 22, 27]) a
wide variety of indicators of earnings have been employed, but the relatively more
popular measures are rate-of-return on assets (ROA) and rate-of-return on equity
(ROE). We have used these indicators to represent earnings and profits of the banks
(Table 6).
Liquidity: An essential condition for preserving the confidence of depositors in the
financial ability of a bank is that it must have sufficient liquidity for
discharging its short-term obligations and meeting unexpected withdrawals. If it
does not, the loss of confidence in the financial soundness of the bank can have a
catastrophic effect and may even cumulate into financial distress of the
bank [19]. In order to capture the liquidity position of a bank, we have employed
two ratios, namely, ratio of liquid assets to demand deposits (LA/DD) and ratio of
government and other approved securities to demand deposits (G. Sec./DD). The
relevant statistics and results are specified in Table 7.
We have pooled the information for each bank as regards its relative position in
respect of the five CAMEL indicators, so as to assign to it an overall rank
representing its financial soundness relative to that of the other banks. The overall
rank is represented by the simple mean value of the ranks assigned to the various
indices included under the five broad financial indicators. These indicators do not
necessarily carry the same significance as regards the financial soundness of a bank;
significance may vary depending on the context in which these are viewed.
Table 4 Asset quality

Name of bank          Net NPA ratio  Rank (URAA/CRAA)   G.Sec./TA (%)  Rank (URAA/CRAA)  Average rank (URAA/CRAA)  Asset quality rank R(A) (URAA/CRAA)
State Bank of India   1.81           Excluded variable  20.26          7 / 6             8.5 / 6                   7 / 6
Bank of Baroda        0.56           Excluded variable  17.67          10 / 10           5.5 / 10                  4 / 10
Punjab National Bank  1.08           Excluded variable  22.03          5 / 3             4.0 / 3                   2 / 3
Bank of India         1.24           Excluded variable  19.73          9 / 7             7.0 / 7                   6 / 7
Canara Bank           1.38           Excluded variable  23.39          3 / 1             5.0 / 1                   3 / 1
Table 7 Liquidity

Name of bank           LA/DD (%)  Rank (URAA/CRAA)  G.Sec./DD (%)  Rank (URAA/CRAA)  Average rank (URAA/CRAA)  Liquidity rank R(L) (URAA/CRAA)
State Bank of India    23.66      8 / 8             58.14          10 / 10           9 / 9                     7 / 10
Bank of Baroda         40.53      1 / 1             73.72          8 / 7             4.5 / 4.0                 4 / 2
Punjab National Bank   19.23      10 / 10           68.85          9 / 8             9.5 / 9.0                 8 / 10
Bank of India          28.17      4 / 6             88.94          3 / 3             3.5 / 4.5                 2 / 3
Canara Bank            28.48      3 / 6             101.84         1 / 1             2 / 3.5                   1 / 1
Union Bank of India    22.18      9 / 9             76.86          7 / 6             8 / 7.5                   6 / 7
Central Bank of India  24.78      6 / 8             80.69          5 / 5             5.5 / 6.5                 5 / 6
Indian Overseas Bank   26.47      5 / 7             96.61          2 / 2             3.5 / 4.5                 2 / 3
Syndicate Bank         30.51      2 / 5             78.90          6 / 6             4 / 5.5                   3 / 4
Allahabad Bank         23.87      7 / 8             83.00          4 / 5             5.5 / 6.5                 5 / 6

LA/DD denotes the ratio of liquid assets to demand deposits.
Since its inception in 1982, rough set theory has been extensively used as an
effective data mining and knowledge discovery technique in numerous applications
in the finance, investment and banking fields. Rough set theory is a way of
representing and reasoning about imprecise and uncertain information in data [21].
The theory basically revolves around the concept of indiscernibility, which means
the inability to distinguish between objects that are similar under the given
information. Rough set theory deals with the approximation of sets constructed
from empirical data. It is most helpful when trying to discover decision rules and
important features, and to minimize conditional attributes. There are four
important concepts to discuss when talking about rough set theory: information
systems, indiscernibility, reduction of attributes and rule generation.
In rough set theory, information systems are used to represent knowledge.
The notion of an information system presented here is described in Pawlak [20].
Suppose we are given two finite, non-empty sets U and A, where U is the universe
and A is a set of attributes. With every attribute a ∈ A we associate a set Va of
its values, called the domain of a. The pair S = (U, A) is called a database or
information system. Any subset B of A determines a binary relation I(B) on U,
called an indiscernibility relation, defined as (x, y) ∈ I(B) if and only if
a(x) = a(y) for every a ∈ B, where a(x) denotes the value of attribute a for
element x. The indiscernibility relation will be used next to define two basic
operations in rough set theory:
• The set of all objects which can be classified with certainty as members of
X with respect to R is called the R-lower approximation of X, denoted by

R̲(X) = {x ∈ U : R(x) ⊆ X}
Table 8 Comparative analysis of ranks under the two approaches
The ranks obtained by the Unclassified Rank Assignment Approach (URAA) and the
Classified Rank Assignment Approach (CRAA) have been set in bold so that they can
be compared side by side.
• The set of all objects which can be classified only as possible members of X with
respect to R is called the R-upper approximation of X, denoted by

R̄(X) = {x ∈ U : R(x) ∩ X ≠ ∅}
The set of all objects which can be decisively classified neither as members of
X nor as members of −X with respect to R is called the boundary region of X with
respect to R, denoted by RN_R(X), i.e.

RN_R(X) = R̄(X) − R̲(X)
A set X is called crisp (exact) with respect to R if and only if the boundary region
of X is empty.
A set X is called rough (inexact) with respect to R if and only if the boundary
region of X is nonempty.
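These definitions translate directly into code. The following Python sketch is
ours and purely illustrative; value(x, a) is a user-supplied lookup into the
attribute-value table, and X is an ordinary Python set of objects.

from itertools import groupby

def indiscernibility_classes(U, B, value):
    """Equivalence classes of I(B): x and y are indiscernible iff
    value(x, a) == value(y, a) for every attribute a in B."""
    key = lambda x: tuple(value(x, a) for a in B)
    return [set(g) for _, g in groupby(sorted(U, key=key), key=key)]

def approximations(U, B, value, X):
    """Return the B-lower and B-upper approximations of X ⊆ U."""
    lower, upper = set(), set()
    for cls in indiscernibility_classes(U, B, value):
        if cls <= X:              # class entirely inside X: certain members
            lower |= cls
        if cls & X:               # class overlapping X: possible members
            upper |= cls
    return lower, upper           # boundary region = upper - lower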
Let C, D ⊆ A be sets of condition and decision attributes, respectively. We will
say that C′ ⊆ C is a D-reduct (a reduct with respect to D) of C if C′ is a minimal
subset of C such that γ(C, D) = γ(C′, D), where γ denotes the degree of dependency
between condition and decision attributes.
Now we define the notion of the core of attributes. Let B be a subset of A. The
core of B is the set of all indispensable attributes of B. The following important
property connects the notions of core and reducts:

Core(B) = ∩ Red(B),

where Red(B) is the set of all reducts of B.
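Using the approximation sketch above, reducts and the core can be computed by
brute force for the small attribute sets considered in this chapter (note that
ROSE2 finds reducts heuristically, so its output need not coincide with an
exhaustive enumeration). The following is our illustrative sketch, with γ
implemented as the standard degree of dependency.

from itertools import combinations

def dependency(U, C, D, value):
    """gamma(C, D): share of objects whose C-class falls wholly inside
    one class of the decision partition induced by D."""
    d_classes = indiscernibility_classes(U, D, value)
    pos = set()
    for cls in indiscernibility_classes(U, C, value):
        if any(cls <= d for d in d_classes):
            pos |= cls
    return len(pos) / len(U)

def reducts_and_core(U, C, D, value):
    """All D-reducts of C (minimal subsets preserving gamma), and the
    core as their intersection: Core = ∩ Red."""
    full = dependency(U, C, D, value)
    reducts = []
    for r in range(1, len(C) + 1):
        for sub in combinations(C, r):
            if any(set(red) <= set(sub) for red in reducts):
                continue          # a proper subset is already a reduct: not minimal
            if dependency(U, list(sub), D, value) == full:
                reducts.append(sub)
    core = set.intersection(*(set(r) for r in reducts)) if reducts else set()
    return reducts, core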
Table 8 is divided into two parts, one for the Unclassified Rank Assignment
Approach and the other for the Classified Rank Assignment Approach. Both parts
are given as input to the ROSE2 software for rough set analysis.
For the classified CAMEL approach there are no attributes in the core set, which
means that there is no indispensable attribute: any single attribute in such an
information system can be deleted without altering the equivalence-class structure.
Two reducts are found by heuristic methods:
Core = {}
Reduct1 = {CRAR, PPE}
Reduct2 = {D/E, PPE}
The classification accuracy of the ranks, as given by the lower and upper
approximations for the classified CAMEL rank approach, is shown in Fig. 1.
Fig. 1 Lower and upper approximation for classified rank assignment approach
For the Unclassified Rank Assignment Approach there are no elements in the core
set, and there are two reducts:
Core = {}
Reduct1 = {CRAR}
Reduct2 = {GTA}
The classification accuracy of the ranks, as given by the lower and upper
approximations for the Unclassified Rank Assignment Approach, is shown in Fig. 2.
The rules generated for the Unclassified Rank Assignment Approach are:
# LEM2
# C:\Program Files\ROSE2\examples\unclassified camel.isf
# objects = 10
# attributes = 10
# decision = URAA
# classes = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
# Wed Apr 02 11:43:22 2014
# 0s
The objective of the present study is to rank public sector banks in India on the basis of
their financial soundness particularly in relation to the concerns of debt holders. The
choice of financial indices has been motivated mainly by this consideration.
The ten largest Indian public sector banks on the basis of their deposit-base over a five-
year period, 2008–2009 through 2012–2013 have been considered so as to ensure
that these banks constitute more or less a homogeneous group in terms of owner-
ship structure and regulatory and administrative control over their policies. For
ranking of these banks, we have employed in all eleven financial indices relating to
capital adequacy, asset quality, management efficiency, earnings and liquidity. Each
of these financial indices has been specified by a quantity which represents the
mean value over the five-year period, 2008–2009 through 2012–2013. The rankings
of banks obtained under the two approaches, Unclassified Rank Assignment
Approach (URAA) and Classified Rank Assignment Approach (CRAA), are
specified. For accuracy of ranks it is important to ensure that financial indices
included under various components are relevant and basic to the purpose in
question and no financial parameter is included which is redundant. To comply with
this requirement, ranks representing the overall financial soundness of various banks
were regressed on the eleven indices to identify whether any of them needed to be
excluded. This exercise revealed that Net NPA ratio (in relation to asset quality) and
business per employee (in relation to management efficiency) fell under the cate-
gory of ‘excluded variables’. Accordingly, these indices were dropped while
computing ranks. The various public sector banks are ranked from 1 to 10 under the
classified and unclassified rank assignment approaches, and the ranks obtained by
the two approaches are almost the same. The ranks obtained by the two approaches
are given as input to the rough set analysis for checking the accuracy of the
ranks by lower and upper approximation. Since for both approaches there is no
element in the core (the core is empty), there is no indispensable attribute: any
single attribute in such an information system can be deleted without altering
the equivalence-class structure. In such cases, there is no essential or necessary
attribute which is required for the class structure to be represented. There are
attributes in the information systems (attribute-value table) which are more
important to the knowledge represented in the equivalence class structure than other
attributes. Often, there is a subset of attributes which can, by itself, fully charac-
terize the knowledge in the database; such an attribute set is called a reduct.
A reduct can be thought of as a sufficient set of features—sufficient, that is, to
represent the category structure. For both approaches we have the risk-adjusted
capital ratio (CRAR), under the capital adequacy attribute, as one of the elements
of the reduct. This means that CRAR alone is sufficient to define the ranking of
different banks. On the basis of the values of CRAR, the ranks of different banks
can be generated, as shown in the rules generated for the unclassified rank
assignment approach. The ranks of different banks as generated by the Classified
and Unclassified Rank approaches are 100 % accurate, as shown by the lower and
upper approximations in Figs. 1 and 2. On the scale
of financial soundness, the Canara Bank stands at the top with an overall rank of 1,
followed by the Bank of Baroda with rank of 2 whereas Central Bank of India is at
the bottom with rank 10.
7 Conclusion
Banking supervision has assumed enormous significance after the 2008 global
financial meltdown, which caused widespread distress and chaos among the various
constituents of the financial system across the globe. Subsequently, an overhaul of
banking supervision and regulatory practices took place, which put special emphasis
on supervisory tools like the CAMEL model, with certain reformulations. Another
parallel but related development in banking supervision is the Basel-III norms,
framed by the Basel Committee on Banking Supervision, a global forum of central
bankers of major countries, for effective banking supervision. Basel-III also puts
maximum focus on the risk-based capital of banks, which accords with the findings
of our analysis, as the Capital to Risk-weighted Assets Ratio (CRAR) appears as the
single most important element in the reduct of the RST analysis. This research has
applied the RST approach to the CAMEL model, and future research may use RST
analysis for the Basel-III norms as well. It will be an interesting exercise for
researchers to investigate the authenticity and effectiveness of the CAMEL model
in combination with the Basel-III norms.
References
1. Ahn, B.S., Cho, S.S., Kim, C.Y.: The integrated methodology of rough set theory and artificial
neural network for business failure prediction. Expert Syst. Appl. 18(2), 65–74 (2000)
2. Arena, M.: Bank failures and bank fundamentals: A comparative analysis of Latin America
and East Asia during the nineties using bank-level data. J. Bank. Finan. 32(2), 299–310 (2008)
3. Avkiran, N.K., Cai, L.C.: Predicting Bank Financial Distress Prior to Crises, Working Paper.
The University of Queensland, Australia (2012)
4. Canbas, S., Cabuk, A., Kilic, S.B.: Prediction of commercial bank failure via multivariate
statistical analysis of financial structures: The Turkish case. Eur. J. Oper. Res. 166(2), 528–546
(2005)
5. Das, S., Sy, A.N.R.: How Risky Are Banks’ Risk Weighted Assets? Evidence from the
Financial Crisis, IMF Working Paper, 12/36 (2012)
6. Daubie, M., Leveck, P., Meskens, N.: A comparison of the rough sets and recursive
partitioning induction approaches: An application to commercial loans. Int. Trans. Oper. Res.
9, 681–694 (2002)
7. Demirguc-Kunt, A., Detragiache, E., Gupta, P.: Inside the crisis: An empirical analysis of
banking systems in distress. J. Int. Money Finan. 25(5), 702–718 (2006)
8. Demyanyk, Y., Hasan, I.: Financial crises and bank failures: A review of prediction method.
OMEGA 38(5), 315–324 (2010)
9. Estrella, A., Park, S.: Capital ratios as predictors of bank failure. Econ. Policy Rev. 6(2),
33–52 (2000)
10. Greco, S., Matarazzo, B., Slowinski, R.: Rough sets theory for multicriteria decision analysis.
Eur. J. Oper. Res. 129, 1–47 (2001)
11. Hassanien, A.Q., Zamoon, S., Hassanien, A.E., Abrahm, A.: Rough set generating prediction
rules for stock price movement. In: Computer Modeling and Simulation, EMS ′08. Second
UKSIM European Symposium, pp. 111–116 (2008)
12. Khoza, M., Marwala, T.: A rough set theory based predictive model for stock prices. In:
Proceeding of IEEE 12th International Symposium on Computational Intelligence and
Informatics, pp. 57–62. Budapest (2011)
13. Kolari, J., Glennon, D., Shin, H., Caputo, M.: Predicting large US commercial bank failures.
J. Econ. Bus. 54(4), 361–387 (2002)
14. Lanine, G., Rudi, V.V.: Failure predictions in the Russian bank sector with logit and trait
recognition models. Expert Syst. Appl. 30(3), 463–478 (2006)
15. Le Lesle, V., Avramova, S.: Revisiting Risk-Weighted Assets, IMF Working Paper, 12/90
(2012)
16. Mannasoo, K., Mayes, D.G.: Investigating the Early Signals of Banking Sector Vulnerabilities
in Central and East European Emerging Markets, Working Paper of Eesti Pank, p. 8 (2005)
17. Mariathasan, M., Merrouche, O.: The Manipulation of Basel Risk-Weights. Evidence from
2007–2010. University of Oxford, Department of Economics, Discussion Paper, p. 621 (2012)
18. Nursel, S.R., Fahri, U., Bahadtin, R.: Predicting bankruptcies using rough set approach: The
case of Turkish bank. In: Proceeding of American Conference on Applied Mathematics (Math
′08), Harvard, Massachusetts, USA, 24–26 Mar 2008
19. Ooghe, H., Prijcker S.D.: Failure Processes and Causes of Company Bankruptcy: A Typology,
Working Paper, Steunpunt OOI (2006)
20. Pawlak, Z.: Rough set approach to knowledge-based decision support. Eur. J. Oper. Res. 99,
48–57 (1997)
21. Pawlak, Z.: Rough sets. Int. J. Comput. Int. Sci. 11(3), 341–356 (1982)
22. Poghosyan, T., Cihák, M.: Distress in European Banks: An Analysis Based on a New Dataset.
IMF Working Paper, 09/9 (2009)
23. Prasad, K.V.N., Ravinder, G.: A camel model analysis of nationalized banks in India. Int.
J. Trade Commer. 1(1), 23–33 (2012)
24. Reyes, S.M., Maria, J.V.: Modeling credit risk: An application of the rough set methodology.
Int. J. Bank. Finan. 10(1), 34–56 (2013)
25. Rodriguez, M., Díaz, F.: La teoría de los rough sets y la predicción del fracaso empresarial.
Diseño de un modelo para pymes, Revista de la Asociación Española de Contabilidad y
Administración de Empresas 74, 36–39 (2005)
26. Segovia, M.J., Gil, J.A., Vilar, L., Heras, A.J.: La metodología rough set frente al análisis
discriminante en la predicción de insolvencia en empresas aseguradoras. Anales del Instituto
de Actuarios Españoles 9 (2003)
27. Tatom, J., Houston, R.: Predicting Failure in the Commercial Banking Industry. Networks
Financial Institute at Indiana State University. Working Paper, p. 27 (2011)
28. Tung, W.L., Quek, C., Cheng, P.: Genso-Ews: A novel neural-fuzzy based early warning
system for predicting bank failures. Neural Netw. 17(4), 567–587 (2004)
29. Wheelock, D.C., Wilson, P.W.: Why do banks disappear? The determinants of U.S. bank
failures and acquisitions. Rev. Econ. Stat. 82(1), 127–138 (2000)
30. Xu, J.N., Xi, B.: AHP-ANN based credit risk assessment for commercial banks. J. Harbin
Univ. Sci. Technol. 6, 94–98 (2002)
31. Yu, G.A., Xu, H.B.: Design and implementation of an expert system of loan risk evaluation.
Comput. Eng. Sci. 10, 104–106 (2004)
Towards Intelligent Distributed Computing: Cell-Oriented Computing
1 Introduction
traits (genes) and a defence system (cell membrane). All cells in the body are
connected to a giant computer called Intelligence that controls their tasks. The
human intelligence works like a super-computer. Indeed, the human cell network is
millions of times larger than the communication networks of the whole Web. Each
cell has a great capacity to receive and transmit information to every cell in the body;
each remembers the past for several generations, stores all the impressions of past
and present human lives in its data banks and also evaluates and records possibilities
for the future. It has an internal defence system to face intruders when an external
attack occurs.
This chapter is arranged as follows: Sect. 2 discusses previous work on intel-
ligent distributed computing, Sect. 3 introduces the Cell computing methodology,
Sect. 4 discusses some definitions relating to the proposed model, Sect. 5 shows the
Cell components, Sect. 6 describes the strategy of Cell-oriented computing, Sect. 7
discusses the characteristics of the proposed computing type and finally, Sect. 8
summarizes the over-arching ideas of the chapter.
2 Background
hand, Suwanapong et al. [4] propose the Intelligent Web Service (IWS) system as a
declarative approach to the construction of semantic Web applications. IWS utilizes
a uniform representation of ontology axioms, ontology definitions and instances, as
well as application constraints and rules in machine-processable form. In order to
improve single service functions and meet complex business needs, Li et al. [5]
introduced composite semantic Web services based on an agent. To discover better
Web service, Rajendran and Balasubramanie [6] proposed an agent-based archi-
tecture that respected QoS constraints. To improve Web service composition, Sun
et al. [7] proposed a context-aware Web service composition framework that was
agent-based. Their framework brings context awareness and agent-based technol-
ogy into the execution of Web service composition, thus improving the quality of
service composition, while at the same time providing a more suitable service
composition to users. Another approach towards better Web service composition
was made by Tong et al. [8], who proposed a formal service agent model,
DPAWSC, which integrates Web service and software agent technologies into one
cohesive entity. DPAWSC is based on the distributed decision making of the
autonomous service agents and addresses the distributed nature of Web service
composition. From a different perspective, Yang [9] proposed a cloud information
agent system with Web service techniques, one of the relevant results of which is
the energy-saving multi-agent system.
Cloud computing has recently emerged as a compelling paradigm for managing and
delivering services over the Internet. It has rapidly modified the information
technology scene and eventually turned the goals of utility computing into a reality [15].
In order to provide distributed IT resources and services to users based on context-
aware information, Jang et al. designed a context model based on the ontology of
mobile cloud computing [16]. In the same area of smart cloud research, Haase et al.
[17] discussed intelligent information management in enterprise clouds and intro-
duced eCloudManager ontology in order to describe concepts and relationships in
enterprise cloud management. In an equivalent manner, the work of Block et al.
[18] establishes an alignment between ontologies in a cloud computing architecture.
However, this work did not rely on reasoning among the distributed ontologies. By
contrast, a distributed reasoning architecture, DRAGO, has been designed, based on
local semantics [19, 20]. It uses a distributed description logics outline [21] to
represent multiple semantically connected ontologies. Unlike DRAGO, the model
introduced in Schlicht and Stuckenschmidt [22, 23] creates a distributed, compre-
hensive and terminating algorithm that demonstrates consistency of logical termi-
nologies and promises that the overall semantics will be preserved.
The aim of grid computing is to enable coordinated resource sharing and problem
solving in dynamic, multi-institutional virtual organizations [24]. Shi et al.
proposed an intelligent grid computing architecture for transient stability
constraints that presents a potential evaluation of future smart grids. In their
architecture, a model of
generalized computing nodes with an ‘able person should do more work’ feature is
introduced and installed to make full use of each node [25]. GridStat has been
introduced as a middleware layer capable of keeping pace with the data collection
capabilities of the equipment present in the power grid [26]. Liang and Rodrigues
[27] proposed a service-oriented middleware for smart grids. Their solution is
capable of tackling issues related to heterogeneous services, which are most
common in the smart grid domain.
3 Cell Theory
Cell theory is the modular representation of human cell characteristics from the
perspective of computer science. It is a flexible and scalable virtual processing
unit that handles complex distributed computing smartly through organized and
accurate decisions. A cell is a software object that:
• Is sited within a command/execution environment;
• Holds the following compulsory properties:
– Collaborative: works in groups to finish a job;
– Inheritance: serves clients according to their environmental profile if there is
no specification in their requests;
– Shares business processes: each cell business process represents a group of
business processes of components with the same goal. However, every cell is
open for collaboration with all other cells and can maintain the best process
quality via dynamic changes in process nodes. Thus, the cell has great
processing power since all cells’ business processes can be shared by one
cell to serve the client;
– Uniqueness: each cell deals with a specific type of job;
– Reactive: cell senses modification in the environment and acts in accordance
with those changes;
– Autonomous: has control over its own actions;
– Optimal: keeps to best functional and non-functional requirements;
– Federative: each cell has its own information resources;
– Self-error covering: monitors changes in the computing environment and
applies improvements when errors are detected;
– Dynamic decision making: applies decision alteration based on the change of
context;
– Learning: acclimatizes in accordance with previous experience. A minimal
sketch of such a cell object is given below.
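Here is a minimal Python rendering of the property list above as an abstract
interface; the class and method names are our own illustration, not part of the
COA specification.

from abc import ABC, abstractmethod

class Cell(ABC):
    """Abstract sketch of a cell: unique in goal, common in structure."""

    def __init__(self, goal):
        self.goal = goal            # Uniqueness: one specific type of job

    @abstractmethod
    def sense(self, environment):
        """Reactive: detect modifications in the environment."""

    @abstractmethod
    def act(self, change):
        """Autonomous / dynamic decision making: alter decisions as context changes."""

    @abstractmethod
    def collaborate(self, cells, job):
        """Collaborative: work in a group of cells to finish a job."""

    @abstractmethod
    def share_process(self, goal):
        """Shared business processes: expose this cell's process for reuse."""

    @abstractmethod
    def learn(self, experience):
        """Learning: acclimatize in accordance with previous experience."""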
In order to introduce the proposed cell theory, we discuss a new software design
style: the cell architecture and then show how cell computing works. For simplicity,
we identify the infrastructure of the Cell-Oriented Architecture (COA) and the
functionality of Cell-Oriented Computing (COC) model with definability in the
mathematical model.
The architecture is designed to achieve smart Web goals and overcome the
limitations of existing Web infrastructures. The cell architecture presented here is
device, network and provider independent. This means that COA works across
most computing machines and ensures a novel methodology of computing.
COA is designed to cater to smart Web requirements and ultimately aims to achieve
an ambient, intelligent Web environment. Cells in COA are internally secured,
sustain autonomic analysis of communications and are able to support the mech-
anism of collaborations through the following requirements:
[R1] Management and Communication: to establish local and remote sessions, the
underlying infrastructure provides the ability to find any other cells in the
network and then to establish a session with that cell.
[R2] Context-based Security: to enable secure interactions in the communication
spaces among all connected participants.
[R3] Analysis: supporting analysis of data exchange among cells, plus encom-
passing the interior analysis of cell process infrastructure.
[R4] Validation: to verify cell components and ensure consistent process combi-
nations among cells.
[R5] Output Calculation: to evaluate the suitable output results with less cost and
minimal use of resources.
[R6] Trait Maintenance: to avoid and deal spontaneously with all sources of
weakness in cells’ communications.
To realize these goals, we developed a complete command-execute architecture,
designed from the ground up to work over existing Web standards and traditional
networks. COA makes it possible to merge the material and digital worlds by
incorporating physical and computing entities into smart spaces. Put simply, it
facilitates the steps to achieving a pervasive form of computing. Figure 1 outlines
the components of this COA, its functionality and the operation of the underlying
protocols.
COA is composed of three main components: Cell Commander, Executive Cell
and Cell Feeding Source. Cell theory is introduced to provide intelligence in dis-
tributed computing; however, it combines client/server and peer-to-peer models at
once. This is a client/server representation because we have a client component (the
Fig. 1 Components of the COA: a client-side Commander Cell communicates with
provider-side Executive Cells over HTTPS/XML, and Cell Feeding Sources supply
the Executive Cells
The Commander Cell represents the client side in COA and is the main requester of
an output. This section discusses the structure of cells from the client side (Fig. 2).
Command Cell Manager (CCM): the client cell's 'head', responsible for any
external collaboration with the Executive Cells. It receives a client request as a list of
Towards Intelligent Distributed Computing … 35
Fig. 2 Commander Cell components: CCM, ISS, PQM, CPD, LPA, CPM and PC
four components: proposed cell input, interval of output of Executive Cell result,
proposed Cell process’s general design (if available) and the required cell process
quality. Some of these components can be inherited from the client cell’s envi-
ronment. The Command Cell Manager monitors the context profile of the Com-
mander Cell via the profile manager. It also manages the access to the client cell by
specified rules of internal security.
Internal Security System (ISS): protection software responsible for issuing
tickets that allow Executive Cells to access the Command Cell Manager. It relies
mainly on analysis of the outer cell's context profile to ascertain whether that
cell can collaborate with the client cell.
Process Quality Manager (PQM): software used by the Commander Cell to
select the required quality of the cell process. For example, the client may need to
specify some qualities such as performance, cost, response time, etc. If there is no
selection of specific qualities, these qualities are inherited from the environment’s
qualities (as an employee may inherit a quality from his company).
Cell Process Designer (CPD): a graphical design interface that is used to build a
general cell process flow graph or to select an option from the available process
graphs. If there is no graph design or selection, the Executive Cell has the right to
pick a suitable gene based on the commander profile.
Logic Process Analyser (LPA): after designing a general proposition for the
executive gene design via the process designer, the job of the logic process analyser
is to transform the graph design into a logical command to be sent to the executive
side.
Context Profile Manager (CPM): this tool is responsible for collecting infor-
mation about the Commander Cell profile, such as place, type of machine, user
properties, etc. Since the commander profile is dynamic, several users may use the
same Commander Cell; the profile information is instantaneously provided when
needed.
Profile Core (PC): a special database that stores information about the Commander
Cell profile and allows the Executive Cell to tell whether several users are
utilizing the same Commander Cell.
36 A. Karawash et al.
Fig. 3 Cell provider components, including the Management and Control Centre
(MCC), Testing and Validation System (TVS), Traits Maintenance System (TMS),
Process Analyser Core (PAC), Output Fabrication Centre (OFC), Cell Profile
Manager (CPM), Cell Federation System (CFS), Cell Broker (CB), QoG Repository
(QR), Governance Unit (GU), Process Analysis Repository (PAR), Gene Core Manager
(GCM), Gene Mediator (GM), Gene Meta-Data Manager (GMM), Gene Repository (GR),
Backup and Recovery Control (BRC), Process Archiving (PA) and Archive Set (AS)
ECB QR AS
The cell provider represents the supplier side in the COA, which is responsible for
building suitable outputs for client invocation. This section discusses the structure
of the cell provider shown in Fig. 3.
Management and Control Centre (MCC): smart software that works like an agent and
is considered the brain of the COA; it orchestrates the whole computing
infrastructure. It is composed of a virtual processing unit that controls all the
internal and external connections, so Executive Cells are supported and managed
according to well-defined cell-level agreements. It monitors every
connection among cells and prepares all decisions, such as update requirement,
communication logics, maintenance facilities, access control management, reposi-
tory stores and backups, etc. The COA management and control centre has stable
jobs inside the cell provider. However, it cannot respond to an external job from
other cells without security permission from the internal security system. Since one
of the main principles of cell theory is availability, the management and control
centre is replicated in order that collaboration can be carried out to serve cells. Each
cell uses its Decision System to communicate with the COA management centre.
Testing and Validation System (TVS): the cell testing and validation system
describes the testing of cells during the process composition phase of the Executive
Cell. This will ensure that new or altered cells are fit for purpose (utility) and fit for
use (warranty). Process validation is a vital point within cell theory and has often
been the unseen underlying cause of what were in the past seen as inefficient cell
management processes. If cells are not tested and validated sufficiently, then their
introduction into the operational environment will bring problems such as loops,
deadlocks, errors, etc. In a previous book chapter [28] we have discussed a new
model of how to validate the business processes of Web service; the concepts of the
same validation method can be used to validate the cell business process (Gene).
The goal of cell validation and testing is to ensure that the delivery of
activities adds value in an agreed and expected manner.
Cell Traits Maintenance System (TMS): the challenge is to make cell tech-
nology work in a way that meets customer expectations of quality, such as avail-
ability, reliability, etc., while still offering Executive Cells the flexibility needed to
adapt quickly to changes. Qualities of genes are stored in a QoG repository and the
maintenance system has permission to access and monitor these qualities. QoG can
be considered a combination of QoS with a set of Web services if the source of the
cell is a Web service provider. QoG parameters are increasingly important as cell
networks become interconnected and larger numbers of operators and providers
interact to deliver business processes to Executive Cells.
Process Analyser Core (PAC): since a cell process map can be composed of a
set of other components’ business processes, there should be a method for selecting
the best direction for the cell map. In addition to the context of environment
dependency, cell theory uses a deep quality of service analysis to define a best
process. This type of process map analysis is summarized by building a quality of
process data warehouse to monitor changes in process map nodes. Every process
component invokes a set of subcomponents, similar to sub services in a service
model, in which all these subcomponents are categorized in groups according to
goals. The process analyser core applies analysis to these subcomponents and
communicates with the cell broker to achieve the best map of the Executive Cell
process. In addition to analysing Executive Cell process, the process analyser core
also analyses and maps the invocations from the Commander Cells. This type of
dual analysis results in an organized store of collaboration data without the need to
re-analyse connections and without major data problems.
Output Fabrication Centre (OFC): depending on the specific output goal,
options may be available for executive cells to communicate with the output fab-
rication centre. This centre provides more control over the building of the executive
cell process to serve the client cell. Based on the results of the process analyser core
and the consequences of the test and validation system, executive cells, specifically
their output builder systems, collaborate with the output fabrication centre to return
a suitable output to the commander cell.
Cell Profile Manager (CPM): traditional styles of client/server communication
suffer from a weakness: the dominance of the provider. Indeed, a server can
request information about client profiles for security purposes, but power is
limited in the converse direction. In cell theory, every cell must have a profile
to contact other cells. The cell profile manager works to build suitable profiles
for Executive Cells to help in constructing a trusted cell instruction tunnel.
Cell Federation System (CFS): the system coordinates sharing and exchange of
information which is organized by the cells, describing common structure and
Cell Broker (CB): analytical software that monitors changes in cell processes and
evaluates the quality of processes according to their modifications. The
evaluation of quality of process is similar to that of quality of service in the
service model. However, the new step can be summarized as the building of a data
warehouse for quality of process that permits advanced online process analysis.
QoG Repository (QR): a data warehouse for the quality of cell process. It
collects up-to-date information about process properties, such as performance,
reliability, cost, response time, etc. This repository has an OLAP feature that
supports online process analysis.
COA Governance Unit (GU): the COA governance unit is a component of
overall IT governance and as such administers controls when it comes to policy,
process and metadata management.
Process Analysis Repository (PAR): a data warehouse of all cells’ process
connections. It stores information about cell processes in the shape of a network
graph, in which every sub unit of a process represents a node. The collected data
summarizes analytical measures such as centrality.
Gene Core Manager (GCM): software responsible for gene storage, backups and
archiving. It receives updates about business processes from sources and alters
the gene ontology, backs up genes when errors occur and archives unused genes.
Gene Mediator (GM): communication between the gene core manager and the sources
of business processes may be complex, so the GM defines an object that
encapsulates how a set of objects interact. With the gene mediator, communication
between cells and their sources is encapsulated by a mediator object. Business
process sources and cells do not communicate directly, but instead communicate
through the mediation level, ensuring a consistent mapping of different business
process types onto the gene infrastructure.
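This is the classic mediator design pattern. A minimal Python sketch (our
illustration with hypothetical names; the chapter gives no implementation) shows
how heterogeneous business-process formats can be mapped onto genes without
sources and cells ever talking to each other directly:

class GeneMediator:
    """Encapsulates all interaction between business-process sources and
    the gene core manager; callers never need to know which converter
    handles which source format."""

    def __init__(self):
        self._converters = {}                 # source format -> converter

    def register(self, fmt, converter):
        self._converters[fmt] = converter

    def to_gene(self, fmt, process):
        # Consistent mapping of different process types onto genes.
        return self._converters[fmt](process)

mediator = GeneMediator()
mediator.register("BPEL", lambda p: {"gene": "owl-s-ontology", "source": p})
gene = mediator.to_gene("BPEL", "<process>...</process>")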
Gene Meta-Data Manager (GMM): genes are complex components that are
difficult to analyse, so for analysis and validation purposes the gene meta-data
manager retrieves gene meta-data from the gene repository and supplies gene core
data through this process.
Gene Repository (GR): ontologies are used as the data model throughout the
gene repository, meaning that all resource descriptions, as well as all data inter-
changed during executive cell usage, are based on ontologies. Ontologies have been
identified as the central enabling technology for the Semantic Web. The general use
of ontologies allows semantically-enhanced information processing as well as
support for interoperability. To facilitate the analysis of the gene map, meta-data
about each gene is also stored in the gene repository.
Backup and Recovery Control (BRC): the different strategies and actions involved
in protecting cell repositories against data loss and in reconstructing the
database after any such loss.
Process Archiving (PA): the archiving process helps to remove the cell process
instances which have been completed and are no longer required by the business.
All cell process instances which are marked for archiving will be taken out and
archived to a location as configured by the administrator.
Fig. 4 Cell feeding source components: Resource Code (RC), Source Mediator (SM)
and Gene Store (GS)
The job of the process archiving component includes archiving the process-, task-
and business-log-related content from the archive database.
Archive Set (AS): a database for unused genes that is accessed and managed by
the process archiving component.
A cell source can be any kind of code that can be reused and that follows specific
composition rules. Generally, the first sources of cells are Web service business
processes (such as BPEL and OWL-S) or reusable code (Java, C#, etc.). This section
discusses the structure of the sources that feed Executive Cells (Fig. 4).
Resource Code (RC): a store of cell sources, such as business processes or
reusable code. If the cell source is a Web service provider, then its business process
may be BPEL, OWL-S, or another. Further, the cell source may be a reusable
programming code for a combination of objects (in Java, C#, etc.).
Source Mediator (SM): transformer software that maps the process of a cell’s
source into a gene. The mediator’s job is similar to that of the BPEL parser in a
Web service provider, which maps BPEL code into a WSDL code. In COA, every
source business process is converted into OWL-S ontology. However, the obtained
OWL-S ontology has a special property: the extension of OWL-S’ business
process.
Gene Store (GS): a store that is composed by mapping the source business
process. It holds an abstract of a source process in the shape of an ontology, organized
in a structure compatible with the cell's job.
Definition 1 Let $W(P, Q, T)$ be a finite nonempty set that represents the Web infrastructure, where $P = \{p_1, p_2, \ldots, p_n\}$ represents the set of feeding sources of Web applications, $Q = \{q_1, q_2, \ldots, q_m\}$ represents the set of consumers of Web sources and $T = \{t_1, t_2, \ldots, t_k\}$ represents the set of tools that are used by Web providers to serve Web customers, where $n, m, k \in \mathbb{N}$.
Definition 2 Let set $J = \{\bigcup_{m}^{z} j_m \mid j_m \text{ is a specific goal and } j_m \neq j_n\}$ and set $S = \{\bigcup_{m}^{z} s_m \mid s_m \text{ is the structure of a component}\}$.
As with most things in the business world, the size and scope of a business plan
depend on specific practice. A specific practice is the description of an activity that
is considered important in achieving the associated specific goal. Set $J$ represents a
group of components, each of which supports a specific computing goal based on a
particular practice, while the structure of the studied components is denoted by set $S$.
Proposition 1 A set $r = \{\bigcup_{i}^{n} r_i \mid r_i \text{ denotes a cell}\} \subset T$ is a finite and ordered set such that $J_{r_i} \cap J_{r_j} = \emptyset$ and $S_{r_i} = S_{r_j}$, where $i, j, n \in \mathbb{N}$.
In all other computing models, different components may perform similar jobs.
For example, two classes, in the object-oriented model, can utilize similar inputs
and return the same type of output but using different coding structures. Further-
more, in the discovery phase of service-oriented computing, service consumers
receive a set of services that do the same job before selecting which one of them to
invoke. The main advantage of Web service theory is the possibility of creating
value-added services by combining existing ones. Indeed, the variety involved in
serving Web customers is useful in that it gives each of them several alternative
choices. However, this direction in computing failed since service customers found
themselves facing a complex service selection process. One of the main properties
of cell methodology is the avoidance of the ‘service selection’ problem. The cell is
developed to provide highly focused functionality for solving specific computing
problems. Every cell has its own functionality and goal to serve, so one cannot find
two different cells which support the same type of job. However, all cells are similar
in base and structure: they can sense, act, process data and communicate. That is to
say, regarding cell structure there is only one component to deal with, while in
function there are several internal components, each with a different computing
method and resource.
Definition 3 Let $\varphi$ be a property that expresses the collaboration relation such that $a \,\varphi\, b$, where $a, b \in r$.

$\mu : \mathrm{DFS} \to \mathrm{DS}, \quad x \mapsto \mu(x) \qquad (F.1)$

$\pi : \mathrm{OBS} \to \mathrm{DS}, \quad x \mapsto \pi(x) \qquad (F.2)$

$\rho : \mathrm{TMS} \to \mathrm{OBS}, \quad x \mapsto \rho(x) \qquad (F.3)$

$\sigma : \mathrm{PAS} \to \mathrm{OBS}, \quad x \mapsto \sigma(x) \qquad (F.4)$

$\gamma : \mathrm{PVS} \to \mathrm{OBS}, \quad x \mapsto \gamma(x) \qquad (F.5)$
$\delta : \mathrm{GSS} \to \mathrm{TMS}, \quad x \mapsto \delta(x) \qquad (F.6)$

$\varepsilon : \mathrm{GSS} \to \mathrm{PAS}, \quad x \mapsto \varepsilon(x) \qquad (F.7)$

$\theta : \mathrm{GSS} \to \mathrm{PVS}, \quad x \mapsto \theta(x) \qquad (F.8)$
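Taken together, the mappings (F.1)–(F.8) describe the internal dataflow of an executive cell: gene data leaves the GSS for the maintenance, analysis and validation subsystems, whose outputs converge on the OBS and finally reach the DS. The sketch below is one illustrative reading of that flow (the graph encoding and the reachability check are not notation from the chapter):

```python
# Internal dataflow of an executive cell, read off the mappings F.1-F.8.
dataflow = {
    "DFS": ["DS"],                 # mu      (F.1)
    "OBS": ["DS"],                 # pi      (F.2)
    "TMS": ["OBS"],                # rho     (F.3)
    "PAS": ["OBS"],                # sigma   (F.4)
    "PVS": ["OBS"],                # gamma   (F.5)
    "GSS": ["TMS", "PAS", "PVS"],  # delta, epsilon, theta (F.6-F.8)
}

def reaches(src, dst, graph=dataflow, seen=None):
    """True if information can flow from subsystem `src` to `dst`."""
    if src == dst:
        return True
    seen = (seen or set()) | {src}
    return any(reaches(n, dst, graph, seen)
               for n in graph.get(src, ()) if n not in seen)

assert reaches("GSS", "DS")  # gene data ultimately reaches the decision system
```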
Theorem If q denotes a commander cell request and x denotes a cell gene, then:
The proposed executive cell in cell theory is composed of (Fig. 5): decision system
(DS), gene store system (GSS), trait maintenance system (TMS), output builder
system (OBS), process validation system (PVS), process analyser system (PAS),
defence system (DFS) and gene storage.
Fig. 5 Components of executive cell: DS, GSS, TMS, OBS, PVS, PAS and DFS
The decision system is the brain of the Executive cell in COC. It is controlled by the
management and control centre and is responsible for taking decisions and directing
the other components of the cell. Cell inputs are received by the DS, which studies the
client request and emits suitable outputs. Cell computing is characterized by two
levels of collaboration that are managed through DSs. The first collaboration level
is expressed by internal cooperation among cell subsystems, while the second level
of collaboration is applied among cells to build a complete answer for cell cus-
tomers. In the case of a customer request, the DS asks the defence system to verify
the customer identity and request before starting the answer process. If the customer
request is safe, DS sends the input to the OBS and waits for the answer. Sometimes,
one cell is not sufficient to serve a customer. In this case, the DS asks for collab-
oration from other cells to produce an answer.
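The request-handling flow just described can be condensed into a few lines of Python; `defence`, `obs` and `collaborate` are hypothetical callables standing in for the DFS, the OBS and inter-cell collaboration:

```python
def handle_request(request, defence, obs, collaborate):
    """Sketch of the DS request flow; the three callables are placeholders."""
    # 1. Ask the defence system to verify customer identity and request.
    if not defence(request):
        return None          # unsafe request: no answer process is started
    # 2. If safe, forward the input to the output builder system and wait.
    answer = obs(request)
    # 3. If one cell is not sufficient to serve the customer, ask other
    #    cells for collaboration to build a complete answer.
    if answer is None:
        answer = collaborate(request)
    return answer
```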
Cell computing aims to correct the problems of the service model. One of the main
service-oriented computing problems is security. Security weakness was less of a
danger in the case of Web services, but currently most cloud services are public and
store sensitive data, so any security fault may be fatal for some institutions. As a
way of obtaining strict computing resource protection, COC introduces internal cell
protection. As is well known, there are two main steps to protecting the Web: the
first is network protection via several encryption methods, while the second is
server resource protection via user tokens and
security tools. The proposed COC security technique ensures protection against any
internal or external unauthorized access to a cell. In addition to network and system
protection, the cell defence system aims to introduce a double verification method.
This is a hidden type of cell protection that verifies, on one side, if a customer has
the right to invoke a cell, while it also checks, on the other side, if a customer’s
machine is capable of receiving an output from such a cell. COC aims to make the
distributed Web application as secure as possible.
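A minimal sketch of the double verification idea, assuming both checks reduce to boolean predicates (the chapter leaves their implementation open):

```python
def double_verify(customer, machine_profile, cell, can_invoke, can_receive):
    """Hidden double verification: check the caller AND the caller's machine.

    can_invoke(customer, cell)         -- may this customer invoke the cell?
    can_receive(machine_profile, cell) -- can the machine accept the output?
    Both predicates are assumed; the chapter leaves their internals open.
    """
    return can_invoke(customer, cell) and can_receive(machine_profile, cell)
```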
There are several combinations of processes that return the same results in a dis-
tributed application. Some of these applications are Web services that are divided
into a set of groups, such that in each group all the applications can do the same
jobs. The problem for service theory is summed up by the question of how to select
the best service from an ocean of services doing similar jobs. COC has indeed found a
solution to the service selection problem. Simply put, why not transform all the
business processes into a new structure to be used by a novel model like COC? In
order to obtain a successful COC model, we need to build a suitable business
process (gene) for each cell. The first step in building cell genes is to transform the
service business processes and their combinations into a graph (or map) of abstract
business processes. The obtained graph retains no concrete details of any individual
service business process. For example, if several services perform a division job, then
all of their abstract business processes are linked to a division node of the gene
graph. Each cell uses a specific part of the obtained abstract graph and is known as a
cell business process or gene. The gene store system’s job is to store the genes and
classify them, shaped by logical rules in a database to be easily used by cell
subsystems.
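The gene graph and its job-based classification can be sketched with a plain adjacency structure; identifiers such as `serviceA.bpel#divide` are invented for illustration:

```python
from collections import defaultdict

gene_graph = defaultdict(set)   # job node -> abstract business processes

def add_abstract_process(job, process_id):
    # Several services performing the same job (e.g. division) are all
    # linked to the single node of the gene graph representing that job.
    gene_graph[job].add(process_id)

add_abstract_process("division", "serviceA.bpel#divide")
add_abstract_process("division", "serviceB.owls#div")

# A cell's gene is the part of the abstract graph serving its specific job:
division_gene = gene_graph["division"]
```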
Cells in COC are considered as intelligent modular applications that can be published,
located and invoked across the Web. They are intended to give the client the
best results by composing their distributed business processes dynamically and
automatically based on specific rules. Based on the service model, companies only
implement their core business and outsource other application services over the
Internet. However, no single Web service can satisfy the functionality required by
the user; thus, companies try to combine services together in order to fulfil the
request. Indeed, companies face a major problem: Web service composition is still a
highly complex task and it is already beyond human capability to deal with the
whole process manually. Therefore, building composite Web services with an
automated or semi-automated tool is critical. As a solution to the service compo-
sition problem, cell theory proposes a cell that is capable of achieving an automated
composition of its business process. In sum, after PAS, PVS and TMS have analysed,
validated and ensured the good characteristics of the business process choices to be
used by a cell, OBS selects and executes the best process plan based on the user's
request. The role of OBS is to apply a dynamic and autonomic composition of the
selected business processes of the collaborating cells.
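A sketch of the OBS selection step, assuming candidate plans arrive already analysed and validated by PAS, PVS and TMS, scored against the request upstream and annotated with a quality value (the annotation format is an assumption):

```python
def select_and_compose(candidate_plans):
    """Pick the best validated process plan for the user's request.

    `candidate_plans` is assumed to be a list of dicts such as
    {"plan": ..., "valid": True, "score": 0.92}, produced upstream by
    PAS, PVS and TMS; a higher score means better qualities (QoG).
    """
    valid = [p for p in candidate_plans if p["valid"]]
    if not valid:
        return None                      # trigger collaboration instead
    best = max(valid, key=lambda p: p["score"])
    return best["plan"]                  # the plan OBS will execute
```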
Cell computing allows sharing of the process to reach a solution. This way of
computing results, indirectly, in a shared resources environment similar to that of
grid computing. Recursively, a client cell has access to all other executive cells as
they are running on one machine. The cell network is organized, secure, reliable,
scalable and dynamic. Cell computing strategy, as shown in Fig. 6, is based on five
main layers of computation: command layer, management layer, collaboration
layer, analysis layer and feeding layer.
Command Layer: The command layer consists of solutions designed to make
use of the smart selection of cells that can provide a specific service. It makes up the
initial step of the exchange in cell architecture. An important role of the command
layer is to allow for clear separation between available solutions and a logical
methodology in producing a solution based on the client’s command. The tradi-
tional Web service methodology gives clients the right to select one of the pre-
designed Web applications that will process their solution depending on several
qualities and a complex selection process. However, cell methodology has
improved the process by making clients give commands and creating the applica-
tion according to these commands. This approach enables a slew of new applica-
tions to be created that make use of the COA’s cooperative capabilities, without
requiring in-depth knowledge of application processes, communication protocols,
coding schemes or session management procedures; all these are handled by the
upper layers of the cell strategy. This enables modular interfaces to incorporate new
services via a set of commands that specify inputs, output intervals,
QoG requirements and the user profile.
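As a data structure, such a command might be sketched as follows; all field names are illustrative assumptions rather than the chapter's schema:

```python
from dataclasses import dataclass, field

@dataclass
class CellCommand:
    inputs: dict                  # the data the client supplies
    output_interval: tuple        # acceptable range for the result
    qog: dict = field(default_factory=dict)  # e.g. {"response_time": "<2s"}
    user_profile: str = "default" # environmental profile of the commander

cmd = CellCommand(inputs={"x": 10, "y": 2},
                  output_interval=(0, 100),
                  qog={"reliability": ">=0.99"})
```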
8 Discussion
The most dominant paradigms in distributed computing are the Web service and the
software agent. Inserting intelligence into these paradigms is critical, since both
depend on a non-autonomous, registry-driven architecture (SOA) that requires
third-party mediation (i.e., the service registry). In addition to this, the Web service
and agent paradigms suffer from negative complexity and migration effects,
respectively. The complexity, in general,
comes from the following sources. First, the number of services accessible over the
Web has increased radically during recent years and a huge Web service repository
to be searched is anticipated. Second, Web services can be formed and updated
during normal processing; thus the composition system needs to detect this
updating at runtime and make decisions based on the up-to-date information [32].
Third, Web services can be developed by different organizations, which use dif-
ferent conceptual models to describe the services; however, there is no unique
business process to define and evaluate Web services. Therefore, building com-
posite Web services with an automated or semi-automated tool is critical.
The migration of processes and the control of that migration, together with their
effects on communication and security, were problems for mobile agents. Indeed, the
first problem of the Web is security; giving an application the ability to move
among distributed systems and to choose where to execute may cause critical
problems. Agent methodology has several advantages; however, it can undermine
human control if it is not bounded by rules and limits.
How can we benefit from the wide use of service-oriented architecture in
building intelligent architecture? How can we avoid the complex selection process
of the Web service model? How can we achieve dynamic business process com-
position despite the variety of companies providing different types of service pro-
cessing? How can we use the intelligence of multi-agent systems as a control mode
from the client side? How can we reach the best non-functional properties of
processes in an autonomic manner? How can we avoid the security weaknesses
resulting from mobile agent communications? How can we prevent damage to a
service caused by internal and sub-service failures? Why not separate software processes
based on their purpose? How might we arrange procedures of distributed computing
in a way that evades big data analysis problems resulting from random connections
among distributed systems? How can globally consistent solutions be generated
through the dynamic interaction of distributed intelligent entities that only have
local information? How can heterogeneous entities share capabilities to carry out
collaborative tasks? How can distributed intelligent systems learn, improving their
performance over time, or recognize and manage faults or anomalous situations?
Why not use dynamic online analysis centres that monitor the on-the-fly qualities of
distributed software processes? How should we validate the processes of distributed
software at the design phase? How should we accomplish the internal protection of
distributed components based on the dual context-profile of both consumer and
solution provider?
8.1 Autonomy
The cell approach proposes that the problem space should be decomposed into
multiple autonomous components that can act and interact in a flexible way to
achieve a processing goal. Cells are autonomous in the sense that they have total
control over their encapsulated state and are capable of taking decisions about what
to do based on this state, without third-party intervention.
8.2 Inheritance
The commander cell inherits the profile property from its environment (company,
university, etc.). However, the executer cell can serve commanders according to
their environmental profile (selection of suitable qualities of a process) or by special
interference from the commander’s side to specify more precisely the general
design of a process and its qualities. This inheritance property in COA is similar to
inheritance among generations of human beings. For example, babies inherit traits
from their parents, such that their cells combine traits from the father and the mother;
however, the parents can ask a doctor for a specific trait in the baby different from their
own traits (blue eyes, brown hair, etc.). In this case, they have given more specifications
to the cell in order to select suitable genes.
8.4 Availability
8.5 Collaboration
8.6 Performance
based on basic properties such as response time and code complexity. Furthermore,
an increase in communication acquaintance can be a guide to an improvement in
performance as it enables cells to communicate in a more efficient manner.
8.7 Federation
Cells are independent in their jobs and goals. However, all distributed processes
that do the same type of job are connected to a specific executive cell. Thus each
executive cell is federated with respect to the commander cell's request. The cell map
can be considered as a set of federated components that are capable of collaborating
to achieve a solution.
There are two types of errors that can be handled by cell computing: structural and
resource errors. The cell process is based on a combination of codes that are
fabricated by different computational sides. These combinations may fail because of
coding or system errors and fall into deadlock. The process validation system's job is
to monitor changes in a process and to recover errors if detected. Resource errors are
described as failures in providing a service from the computational resource. The
solution to these types of error is to connect spare procedures in each cell process to
achieve the same quality of job from different sources.
8.9 Interoperability
Cell interoperability comes from the ability to communicate with different feeding
sources and transform their business processes into cell business processes. For
example, in spite of differences among business processes, such as BPEL and
OWL-S, every provider of service is seen as a source of genes and as useful in cell
computing. Based on cell interoperability, all procedures and applications used by
service providers can be unified under a unique type of process computing, the cell
gene, with respect to cell provider.
9 Conclusion
With the extensive deployment of distributed systems, the management and inte-
gration of these systems have become challenging problems, especially after smart
procedures are implemented. Researchers build new technologies to deal with these
problems. Trends of the future Web require inserting intelligence into the distributed
computing model; thus, the goal of future research is intelligent distributed
computing. Here, the research introduces cell computing theory to cover
distributed system problems through an intelligent method of processing. Cell
theory is the implementation of human cells’ functions in a distributed computa-
tional environment. The cell is an intelligent, organized, secure and dynamic
component that serves a specific type of job. Cell methodology divides the task
between two types of components, the commander and the executer. The com-
mander is a light cell that represents the client and can communicate smartly with its
distributed environment to request solutions. The executive cell works as a smart
supplier that depends on wide collaborations to fabricate a solution. Cell strategy is
based on high-level communication among Cells, a permanent analysing process
among collaborating components and context-based security among collaborating
cells.
Acknowledgments This work has been supported by the University of Quebec at Chicoutimi and
the Lebanese University (AZM Association).
References
1. Brain, M.: How cells work. HowStuffWorks, A Discovery Company (2013). http://science.howstuffworks.com/life/cellular-microscopic/cell.htm
2. Petrenko, A.I.: Service-oriented computing (SOC) in engineering design. In: Third
International Conference on High Performance Computing HPC-UA (2013)
3. Feier, C., Polleres, A., Dumitru, R., Domingue, J., Stollberg, M., Fensel, D.: Towards
intelligent web services: the web service modeling ontology (WSMO). In: 2005 International
Conference on Intelligent Computing (ICIC’05), Hefei, 23–26 Aug 2005
4. Suwanapong, S., Anutariya, C., Wuwongse, V.: An intelligent web service system. In: Engineering Information Systems in the Internet Context, IFIP—The International Federation for Information Processing, vol. 103, pp. 177–201 (2002)
5. Li, C., Zhu, Z., Li, Q., Yao, X.: Study on semantic web service automatic combination
technology based on agent. In: Lecture Notes in Electrical Engineering, vol. 227, pp. 187–194.
Springer, Berlin (2012)
6. Rajendran, T., Balasubramanie, P.: An optimal agent-based architecture for dynamic web
service discovery with QoS. In: International Conference on Computing Communication and
Networking Technologies (ICCCNT) (2010)
7. Sun, W., Zhang, X., Yuan, Y., Han, T.: Context-aware web service composition framework based on agent. In: 2013 International Conference on Information Technology and Applications (ITA) (2013)
8. Tong, H., Cao, J., Zhang, S., Li, M.: A distributed algorithm for web service composition
based on service agent model. IEEE Trans. Parallel Distrib. Syst. 22, 2008–2021 (2011)
9. Yang, S.Y.: A novel cloud information agent system with web service techniques: example of
an energy-saving multi-agent system. Expert Syst. Appl. 40, 1758–1785 (2013)
10. Maryam, M., Varnamkasti, M.M.: A secure communication in mobile agent system. Int.
J. Eng. Trends Technol. (IJETT) 6(4), 186–188 (2013)
11. Liu, C.H., Chen, J.J.: Role-based mobile agent for group task collaboration in pervasive
environment. In Second International Conference, SUComS 2011, vol. 223, pp. 234–240
(2011)
12. Rogoza, W., Zabłocki, M.: Grid computing and cloud computing in scope of JADE and OWL
based semantic agents—a survey, Westpomeranian Technological University in Szczecin
(2014). doi:10.12915/pe.2014.02.25
13. Elammari, M., Issa, Z.: Using model driven architecture to develop multi-agent systems. Int.
Arab J. Inf. Technol. 10(4) (2013)
14. Brazier, F.M.T., Jonker, C.M., Treur, J.: Principles of component-based design of intelligent
agents. Data Knowl. Eng. 41, 1–27 (2002)
15. Shawish, A., Salama, M.: Cloud computing: paradigms and technologies. Stud. in Comput.
Intell. 495(2014), 39–67 (2014)
16. Jang, C., Choi, E.: Context model based on ontology in mobile cloud computing. Commun.
Comput. Inf. Sci. 199, 146–151 (2011)
17. Haase, P., Tobias, M., Schmidt, M.: Semantic technologies for enterprise cloud management.
In Proceedings of the 9th International Semantic Web Conference (2010)
18. Block, J., Lenk, A., Carsten, D.: Ontology alignment in the cloud. In Proceedings of ontology
matching workshop (2010)
19. Ghidini, C., Giunchiglia, F.: Local model semantics, or contextual reasoning = locality +
compatibility. Artif. Intell. 127(2), 221–259 (2001)
20. Serafini, L., Tamilin, A.: DRAGO: distributed reasoning architecture for the semantic web. In:
Proceedings of the Second European Conference on the Semantic Web: Research and
Applications (2005)
21. Borgida, A., Serafini, L.: Distributed description logics: assimilating information from peer
sources. J. Data Semant. 2003, 153–184 (2003)
22. Schlicht, A., Stuckenschmidt, H.: Distributed resolution for ALC. In: Proceedings of the 21th
International Workshop on Description Logics (2008)
23. Schlicht, A., Stuckenschmidt, H.: Peer-peer reasoning for interlinked ontologies. Int.
J. Semant. Comput. (2010)
24. Kahanwal, B., Singh, T.P.: The distributed computing paradigms: P2P, grid, cluster, cloud,
and jungle. Int. J. Latest Res. Sci. 1(2), 183–187 (2012). http://www.mnkjournals.com/ijlrst.
htm
25. Shi, L., Shen, L., Ni, Y., Bazargan, M.: Implementation of an intelligent grid computing architecture for transient stability constrained TTC evaluation. J. Electr. Eng. Technol. 8(1), 20–30 (2013)
26. Gjermundrod, H., Bakken, D.E., Hauser, C.H., Bose, A.: GridStat: a flexible QoS-managed
data dissemination framework for the Power Grid. IEEE Trans. Power Deliv. 24, 136–143
(2009)
27. Liang, Z., Rodrigues, J.J.P.C.: Service-oriented middleware for smart grid: principle,
infrastructure, and application. IEEE Commun. Mag. 2013(51), 84–89 (2013)
28. Karawash, A., Mcheick, H., Dbouk, M.: Intelligent web based on mathematic theory, case study: service composition validation via distributed compiler and graph theory. In: Studies in Computational Intelligence (SCI). Springer (2013)
29. Aviv, R.: Mechanisms of Internet-based collaborations: a network analysis approach. Learning
in Technological Era, 15–25 (2006). Retrieved from http://telem-pub.openu.ac.il/users/chais/
2006/04/pdf/d-chaisaviv.pdf
30. Karawash, A., Mcheick, H., Dbouk, M.: Simultaneous analysis of multiple big data networks: mapping graphs into a data model. In: Studies in Computational Intelligence (SCI). Springer (2014)
31. Karawash, A., Mcheick, H., Dbouk, M.: Quality-of-service data warehouse for the selection of cloud services: a recent trend. In: Studies in Computational Intelligence (SCI). Springer (2014)
32. Portchelvi, V., Venkatesan, V.P., Shanmugasundaram, G.: Achieving web services
composition—a survey. Sci. Acad. Publ. 2(5), 195–202 (2012)
Application of Genetic Algorithms
for the Estimation of Ultrasonic
Parameters
Abstract In this chapter, the use of the genetic algorithm (GA) is investigated in the
field of estimating ultrasonic (US) propagation parameters. Recent works are then
surveyed, showing an ever-wider employment of GA in different applications of US.
A GA is specifically used to estimate the propagation parameters of US waves in
polycrystalline and composite materials for different applications. The objective
function of the estimation is the minimization of a rational difference error between
the estimated and measured transfer functions of US-wave propagation. The US
propagation parameters may be the phase velocity and attenuation. Based on tentative
experiments, we demonstrate how the evolution operators and parameters of a GA
can be chosen for modeling US propagation. The GA-based estimation is applied,
in a test experiment, on steel alloy and aluminum specimens with different grain sizes.
Comparative results of that experiment are presented for different evolution operators
with respect to estimation errors and complexity. The results prove the effectiveness
of GA in estimating parameters for US propagation.
Keywords Genetic algorithm (GA) · Inverse problem · Ultrasonic (US) non-destructive testing (NDT) · Transfer function (TF) · Parameter estimation · Materials characterization
1 Introduction
Ultrasound waves are used, in practice, for nondestructive testing (NDT) and
evaluation (NDE) of materials. In this area, the evaluation is achieved through
estimating material parameters which are related to such wave propagation. The
estimation of ultrasonic (US) propagation parameters; phase velocity and attenuation,
is an important task for many applications. US waves which have been transmitted
through a material sample can be measured in the form of discrete time-series.
Analysis of acquired time-series in both time and frequency domains allows acoustic
and, hence, mechanical parameters of such a sample to be estimated [8, 9], like wave
velocity, attenuation or density [5]. The complexity of the transfer function of
US wave requires an efficient estimation technique for identifying these parameters.
The estimation of US propagation parameters was studied in different works as in
[12, 15, 19].
The parametric modeling of the propagation transfer function (T.F.) is an
appropriate approach when the material is either dispersive or exhibits frequency-
dependent attenuation. The T.F. is obtained through a through-transmission
experiment in which an US wave is transmitted through a test specimen. T.F.
spectrum can be, then, expressed as a rational function in terms of the propagation
velocity and attenuation. In general, the model parameters may be estimated by
minimizing the error between the modeled spectrum and the measured one.
However, most of the traditional optimization methods have many drawbacks when
applied to multi-extremal nonlinear functions [30]. Well-developed techniques
such as least-square, instrumental variable and maximum likelihood exist for
parameters estimation of models. However, these techniques often fail in searching
for the global optimum if the search space is not differentiable or continuous in the
parameters [27]. Gradient-based methods may offer a sufficiently good approach.
Nevertheless, in these cases if the measurements are noisy, such methods possibly
will fail [28]. For these reasons, genetic algorithm (GA) may help in avoiding such
a failure.
GAs are a subclass of evolutionary computational methods which emulate nature
in evolution. A GA is a parallel optimization method which searches through the
space of system parameters. A GA applies operators inspired by the mechanism of
natural selection to a population of binary strings encoding the parameters space.
GAs are considered as global optimizers which avoid the convergence to weak local
solutions [30]. Over many generations, natural populations of the estimated parameters
evolve according to the principles of natural selection and the survival of the
fittest, including the concepts of crossover and mutation. The GAs give fast and
accurate estimates irrespective of the complexity of the objective function. At each
generation, it judges the objective function for different areas of the parameters
space, and then directs the search towards the optimum region. By working with a
population of solutions the algorithm can in effect search many local solutions and
thereby increases the likelihood of finding the global one. It differs from other
optimization techniques in that no differentiation is incurred through the algorithm
and accordingly the searching space continuity is not a condition. These features
enable GA to iterate several times guiding the search towards the global solution
[15]. Recently, the evolution learning has been applied on identifying US data
[5, 9, 22, 29].
2 Genetic Algorithms
[Mutation] With a mutation probability, apply the mutation operator to the new
generation at the selected position in the chromosome.
4. [Accepting] Place the new children in the new population.
5. [Replace] Use the newly generated population for a further run of the algorithm.
6. [Test] If the end condition is satisfied, stop and return the best solution in the
current population.
[Loop] Go to step 2
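A minimal, generic implementation of this outline might look as follows; the one-max fitness and all numeric settings are toy assumptions, not values from the chapter:

```python
import random

def run_ga(fitness, n_bits=16, pop_size=30, p_mut=0.02, generations=50):
    # [Start] random initial population of binary chromosomes
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):                         # [Test]/[Loop]
        ranked = sorted(pop, key=fitness, reverse=True)  # [Fitness]
        new_pop = ranked[:2]                             # elitism (a common extra)
        while len(new_pop) < pop_size:
            a, b = random.sample(ranked[:pop_size // 2], 2)  # [Selection]
            cut = random.randrange(1, n_bits)                # [Crossover]
            child = a[:cut] + b[cut:]
            child = [g ^ 1 if random.random() < p_mut else g
                     for g in child]                         # [Mutation]
            new_pop.append(child)                            # [Accepting]
        pop = new_pop                                        # [Replace]
    return max(pop, key=fitness)

best = run_ga(fitness=sum)   # toy objective: maximise the number of 1-bits
```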
2.2 Chromosome
The chromosome is composed of the parameters which represent a solution for the
studied problem. The most used way of encoding is a binary string. The chromo-
some then could look like this:
Chromosome 1 1101100100110110
Chromosome 2 1101111000011110
There are many other ways of encoding. This depends mainly on the solved
problem. For example, one can encode directly integer or real numbers, sometimes
it is useful to encode some permutations and so on.
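For real-valued problems such as the parameter estimation discussed later, each fixed-width bit field of a chromosome is usually decoded back to a real number; a small sketch of one common scheme (the linear scaling is an assumption, not prescribed by the text):

```python
def decode(bits, lo, hi):
    """Map a binary string such as '1101100100110110' to a real in [lo, hi]."""
    n = int(bits, 2)
    return lo + (hi - lo) * n / (2 ** len(bits) - 1)

chromosome = "1101100100110110"       # Chromosome 1 from the example above
value = decode(chromosome, 0.0, 1.0)  # -> about 0.849
```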
During the reproductive phase of the GA, individuals are selected from the population
and recombined, producing offspring which will comprise the next
generation. Parents are selected randomly from the population using a scheme which
favors the more fit individuals. Good individuals will probably be selected several
times in a generation; poor ones may be discarded. Having selected two parents,
their chromosomes are recombined, typically using the mechanisms of crossover and
mutation.
Crossover takes two individuals, and cuts their chromosome strings at some
randomly chosen position, to produce two “head” segments and two “tails”. The
segments are then swapped over to produce two new full length chromosomes.
Crossover can then look like this (· marks the crossover point):
Chromosome 1 11011·00100110110
Chromosome 2 11001·11000011110
New generation 1 11011·11000011110
New generation 2 11001·00100110110
Mutation is applied to each individual after crossover. It randomly alters each gene
with a small probability (typically 0.001–0.02). This is to prevent falling all solu-
tions in population into a local optimum of solved problem. The mutation is
important for rapidly exploring the search space. Mutation provides a small amount
of random search and helps to ensure that no point in the space has a zero prob-
ability of being examined [3].
For binary encoding, we can switch a few randomly chosen bits from 1 to 0 or
from 0 to 1. Mutation can be described through the following example:
Original chromosome 1101111000011110
Mutated chromosome 1100111000011110
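Both operators take only a few lines; a sketch using the same string representation as the examples above (the 0.02 rate echoes the typical probability mentioned earlier):

```python
import random

def crossover(parent1, parent2):
    """Single-point crossover: swap tails at a random cut position."""
    cut = random.randrange(1, len(parent1))
    return parent1[:cut] + parent2[cut:], parent2[:cut] + parent1[cut:]

def mutate(chromosome, rate=0.02):
    """Flip each bit independently with a small probability."""
    return "".join("10"[int(bit)] if random.random() < rate else bit
                   for bit in chromosome)

child1, child2 = crossover("1101100100110110", "1101111000011110")
mutant = mutate(child1)
```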
Some studies have clarified that much larger mutation rates decreasing over the
course of evolution are often helpful with respect to the convergence reliability and
velocity of GA [1]. A larger rate can speed up the search in early searching phases
and then a finer examination follows using a smaller rate. Some works excluded the
mutation operator from the GA used to segment a multi-component image
[23].
3 Ultrasonic Waves
The term “ultrasonic” applies to sound with frequencies above audible sound,
nominally any frequency over 20 kHz. Frequencies used for medical
diagnostic ultrasound scans extend to 10 MHz and beyond. The range of
20–100 kHz is commonly used for communication and navigation by bats, dolphins,
and some other species.
US is based on the vibration in materials which is generally referred to as
acoustics. All material substances are composed of atoms, which may be forced into
vibrational motion about their equilibrium positions. Many different patterns of
vibrational motion exist at the atomic level; however, most are irrelevant to
acoustics and US testing. Acoustics is focused on particles that contain many atoms
that move in harmony to produce a mechanical wave. When a material is not
stressed in tension or compression beyond its elastic limit, its individual particles
perform elastic oscillations. When the particles of a medium are displaced from
their equilibrium positions, internal restoration forces arise. These elastic restoring
forces between particles, combined with inertia of the particles, lead to the oscil-
latory motions of the medium. These mechanisms make solid materials good
conductors of sound waves.
The interaction of sound waves with a material is stronger the smaller
the wavelength, which should be of the order of the internal dimensions between atoms;
this means the higher the frequency of the wave, the stronger the interaction.
about 0.5 and 25 MHz is used in NDT and NDE [14]. US NDT is one of the most
frequently used methods of testing for internal flaws, meaning that many volume
tests are possible with the more economical and risk-free US test method. In cases
where the highest safety requirements are demanded (e.g. nuclear power plants,
aerospace industry), US methods are useful.
US waves are emitted from a transducer. A US transducer contains a thin disk
made of a crystalline material with piezoelectric properties, such as quartz. When
alternating voltage is applied to piezoelectric materials, they begin to vibrate, using
the electrical energy to create movement. When the mechanical sound energy
comes back to the transducer, it is converted into electrical energy. Just as the
piezoelectric crystal converted electrical energy into sound energy, it can also do
the reverse.
In solids, sound waves can propagate in four principal modes that are based on
the way the particles oscillate. Sound can propagate as longitudinal, shear, surface
and in thin materials as plate waves. Longitudinal and shear waves are the two
modes of propagation most widely used in US testing. Sound waves, then,
travel at different speeds in different materials. This is because the mass of the
atomic particles and the spring constants are different for different materials. The
mass of the particles is related to the density of the material, and the spring constant
is related to the elastic constants of a material. This relation may take a number of
different forms depending on the type of wave (longitudinal or shear) and which of
the elastic constants are used. In isotropic materials, the elastic constants are the
same for all directions within the material. However, most materials are anisotropic
and the elastic constants differ with each direction. For example, in a piece of rolled
aluminum plate, the grains are elongated in one direction and compressed in the
others, and the elastic constants for the longitudinal direction differ slightly from
those for the transverse or short transverse directions.
When sound travels through a medium, its intensity diminishes with distance. In
idealized materials, sound pressure (signal amplitude) is reduced due to the
spreading of the wave. In natural materials, however, the sound amplitude is further
weakened due to the scattering and absorption. Scattering is the reflection of the
sound in directions other than its original direction of propagation. Absorption is
the conversion of the sound energy to other forms of energy. The combined effect
of scattering and absorption is called attenuation [16].
Basically, US waves are impinged from an US transducer into an object and the
returning or transmitted waves are analyzed. If an impurity or a crack is present, the
sound will bounce off of them and be seen in the returned signal. US measurements
can be used to determine the thickness of materials and determine the location of a
discontinuity within a part or structure. This is done by accurately measuring the
time required for a US pulse to travel through the material and reflect from the back
surface or the discontinuity.
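For instance, since the pulse travels to the back wall and returns, the thickness follows from half the round-trip time of flight; a tiny illustration with assumed values (5,900 m/s is a typical longitudinal velocity in steel, not a figure from this chapter):

```python
v = 5900.0      # assumed longitudinal velocity in steel, m/s (typical value)
t = 6.78e-6     # measured round-trip time of the back-wall echo, s (example)
thickness = v * t / 2                # the pulse crosses the wall twice
print(f"{thickness * 1e3:.1f} mm")   # -> 20.0 mm
```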
An earlier work in [9] explained the use of GA for estimating acoustic properties
from the T.F. of US transmission. A comparative approach is followed for choosing
the efficient evolution operators and parameters of GA. GA has been, recently, used
to estimate the propagation parameters of US waves in polycrystalline materials.
The objective function of the estimation is the minimization of a rational difference
error between the estimated and measured transfer functions of US propagation
model. The US propagation parameters are, customarily, the phase velocity and
attenuation. A frequency-dependent form for the attenuation is considered during
estimation. The proposed approach is applied on steel alloy and Aluminum spe-
cimens with different grain sizes. Comparative results are presented on different
evolution operators for less estimation errors and search time. A recent study
recommended the best selection operator, crossover technique and a mutation rate.
The results are also compared with other previous works which adopt traditional
methods for similar US problems [8, 18].
The mechanical properties of composite materials under loading conditions may
be estimated by some conventional techniques like tensile and compressive tests.
Such tests are destructive in nature and provide only a few elastic constants and are
difficult to perform on thin structures [29]. US techniques possibly help in these
aspects over the conventional techniques. The thickness of samples should be in the
same order as the wavelength in the medium, otherwise or in case of multilayer
samples, overlapping can occur and direct measurement of parameters is no longer
possible [5]. A 1-D wave propagation model helps in obtaining an estimate of
the spectrum of the signal transmitted through a multilayer sample. In [5], US
characterization of porous silicon applied GA to estimate the acoustic properties of
composites with different thicknesses. A one-dimensional model describing wave
propagation through a water immersed sample is used in order to compute trans-
mission spectra. Optimum values of thickness, longitudinal wave velocity and
density are obtained through searching parameters space using GA. A validation of
the method is performed using aluminum (Al) plates with two different thicknesses
as references. The experiment, is, then applied for obtaining the parameters for a
bulk silicon wafer and a porous silicon layer etched on silicon wafer. A good
agreement between retrieved values and theoretical ones is observed even in the
case where US signals overlap.
In general, GA introduces a good alternative for solving inverse problems of
estimating T.F. parameters. This is the case especially in finding the solution of
complicated inverse problems, such as those resulting from the modeling and
characterization of complex nonlinear systems, such as in particular materials with
nonlinear elastic behavior. In [6] inverse problem solution of US T.F. is discussed
highlighting the difficulties inherent in the application of traditional analytical–
numerical techniques, and illustrating how GAs may in part alleviate such problems.
Another inverse estimation based on GA is presented in Sun et al. [26] to
determine the elastic properties of a functionally graded material (FGM) plate from
lamb wave phase velocity data. The properties of the FGM plate are estimated by
minimizing the standard deviations between the actual and calculated phase velo-
cities of lamb waves. This GA based work proves reliable determination of the
FGM parameters with a deviation that can be controlled below 5 %.
Another type of US data is collected from the self-emission of different materials
under stress, called acoustic emission (AE). In Sibil et al. [24], a GA achieves
superiority over the k-means algorithm since it allows better clustering of AE
data even on complex data sets.
Meanwhile, the goniometry based US through transmission data can be collected
by using wide aperture low cost polyvinylidene fluoride (PVDF) based receiver for
composite materials. Knowledge of the elastic moduli of polymer-matrix-based
structural composite materials is necessary to characterize their strength after
fabrication and while in service. The use of a hybrid inversion technique which combines a GA-based
evolutionary approach, critical angle information and the use of stiffness invariants
is implemented for determination of stiffness values from experimental transmission
spectra measurements. Promising results for unidirectional and a multi-layered
composite laminate are presented in [21].
In Luo et al. [17] US speckle technique is used for measuring the strain of a
material. In this method, displacement measurements of an inner surface for
underwater object are correlated. The GA searches, after adjusting genetic para-
meters, for the maximum value among the whole distribution of correlation coef-
ficients efficiently. The results obtained with different algorithms, including the
adaptive genetic, coarse-fine interpolation and hill-climbing searching algorithms,
were compared with each other in Luo et al. [17]. It was clear that the adaptive GA
(AGA) outperformed other methods in computational time and accuracy. Addi-
tionally, there was a good agreement of the measured strain with the corresponding
simulation results from finite element analysis. Considering this performance and
the penetration of ultrasound, this study recommends the US speckle measurement
technique based on AGA for measuring strain of a material.
Another work reports a GA-based reconstruction procedure in Vishnuvardhan
et al. [29] to determine the elastic constants of an orthotropic plate from US pro-
pagation velocity. Phase velocity measurements are carried out using US back-
reflection technique on laminated unidirectional graphite–epoxy and quasi-isotropic
graphite–epoxy fiber reinforced composite plates. The GA-based estimation using
data obtained from multiple planes were evaluated and it was sufficient for the
computation of seven elastic constants.
Reconstruction from US measurements using GA is, also, considered in Kodali
et al. [13] for detecting material inclusion. Simulation results of the ultrasound
computerized tomography (CT) are obtained with enhanced GA for detecting
material inclusion. Multiple types of inclusions are detected in the test specimen to
be reconstructed in this work. In addition to being able to identify inclusion of
different materials, both the shape and location of the inclusion could be recon-
structed. The preliminary results for a simple configuration are found to be better
than previously reported ones. The results are, also, found to be consistent and
satisfactory for a wide range of sizes and geometries of inclusion.
Fig. 1 Through-transmission setup: transmitting and receiving transducers TX and TY with a test specimen of thickness d2 immersed in a coupling medium (e.g. water)
TX and TY are, respectively, the transmitter and receiving transducers. X(t) is the
input US pulse and Y(t) is the variation of output with time t after passing through
the specimen and making multiple reflection and transmission through it. According
to the wave propagation model which describes the propagation of the US waves
through the tested material, we can write [20]
$$H^{np}(\omega) = H_m(\omega) \cdot H_{cal}(\omega) \qquad (1)$$
where
$$H_m(\omega) = \frac{(1 - R_{12}^2)\, e^{jK_w d_2}\, e^{-jK_M d_2}}{1 - R_{12}^2\, e^{-2jK_M d_2}} \qquad (2)$$
where $R_{12}$ is the reflection coefficient at the interface between the specimen and the coupling medium, while the material wave number $K_M$ is written as [16]
$$K_M = \frac{\omega}{V} - i\alpha(\omega),$$
where $V$ is the phase velocity of US propagation inside the tested material and $\alpha$ is the attenuation factor. $K_w$ is the wave number of the water and $d_2$ is the thickness of the used specimen, as shown in Fig. 1.
For metals, it is more practical for the attenuation to be formulated depending on
frequency as follows [16]
$$\alpha(f) = a_1 f + a_2 f^4 \qquad (3)$$
$$V(f) = \frac{1}{V_0^{-1} + a_0 - \frac{a_1}{\pi^2}\ln(f) - \frac{a_2}{3\pi^2} f^3} \qquad (4)$$
where $V_0$ is the dispersionless phase velocity at the center frequency of the used transducer, here $f_0 = 2$ MHz,
$$a_0 = \frac{a_1}{\pi^2}\ln(f_0) + \frac{a_2}{3\pi^2} f_0^3.$$
V and α were expressed as defined above and the set of parameters to be estimated
are a1, a2, and V0.
The application of this formulation to the through transmission experiment
causes no restrictions on the behavior of the attenuation factor and the phase
velocity with frequency. Every term in this formula has a physical interpretation for
the impact of either scattering or hysteresis.
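To make the forward model concrete, the sketch below evaluates Eqs. (2)–(4) numerically. The water wave number is assumed to be $K_w = \omega/V_w$ with $V_w \approx 1480$ m/s, the sample arguments are placeholders, and the frequency unit convention must match the units of $a_1$ and $a_2$ used in the tables below; none of these numeric choices come from the chapter:

```python
import numpy as np

def alpha(f, a1, a2):
    """Attenuation factor of Eq. (3); f in the units matching a1, a2."""
    return a1 * f + a2 * f**4

def velocity(f, V0, a1, a2, f0=2e6):
    """Dispersive phase velocity of Eq. (4), with a0 as defined above."""
    a0 = a1 / np.pi**2 * np.log(f0) + a2 / (3 * np.pi**2) * f0**3
    return 1.0 / (1.0 / V0 + a0
                  - a1 / np.pi**2 * np.log(f)
                  - a2 / (3 * np.pi**2) * f**3)

def Hm(f, V0, a1, a2, d2, R12, Vw=1480.0):
    """Through-transmission T.F. of Eq. (2); Kw = w/Vw is an assumption."""
    w = 2 * np.pi * f
    KM = w / velocity(f, V0, a1, a2) - 1j * alpha(f, a1, a2)
    Kw = w / Vw
    num = (1 - R12**2) * np.exp(1j * Kw * d2) * np.exp(-1j * KM * d2)
    return num / (1 - R12**2 * np.exp(-2j * KM * d2))
```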
The time domain experimental data is processed and converted to the frequency
domain by using the Fourier transform. The measured T.F. $H_m^{np}(\omega)$ is obtained using
the through-transmission experiment. This T.F. is compared with a modeled one
using the form of $H(\omega)$ in (1). The simulated $H_m(\omega)$ is obtained by assuming a set of
parameters describing the attenuation factor and the phase velocity as defined in (3)
and (4). An error might be defined as the sum of the squared differences between
the spectrum coefficients of the measured T.F. $H_m^{np}(\omega)$ and the modeled one $H(\omega)$ as
in Eq. (5). The squared difference error is then taken as a percentage of the total
energy of the measured spectrum.
$$e = \frac{\sum_{i=1}^{p} \left[ H_m^{np}(\omega_i) - H(\omega_i) \right]^2}{\sum_{i=1}^{p} \left[ H_m^{np}(\omega_i) \right]^2} \times 100\,\% \qquad (5)$$
The minimization of the error in (5) is searched using GA over the space of three
parameters: $V_0$, $a_1$ and $a_2$. This process cannot be considered an unconstrained
search, since the estimated velocity and attenuation have natural mean
values. In other words, the natural mean values of the estimated parameters should
not be exceeded. Accordingly, such constraints are included and the estimated
parameter becomes just a perturbation $x_e$ around the known mean value $x_o$, as
$x = x_o + x_e$, where $x$ is the parameter included in the T.F.
According to the above discussion, the following GA is used to deal with the US
parameter estimation problem:
1. The individual chromosome is composed of cascaded propagation parameters in
binary format as described in Sect. 3.
2. A starting population has been assumed using a normalized random distribution
with a size as shown in Table 1.
3. Fitness function is evaluated for each individual and a fitness array can be
obtained using Eq. (5).
4. Tournament selection can be applied to the population with a tournament size t
[25].
5. Reproduction operators are then applied as discussed in Sect. 3.
6. New population is resulted as a new generation.
7. Iterations return to step 4 until the error limit is achieved or the maximum
number of iterations is exceeded.
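A compact sketch of this loop follows. It is real-coded for brevity (the chapter itself encodes parameters in binary), and the population size, tournament size and mutation rate are placeholders rather than the values of Table 1:

```python
import random
import numpy as np

def estimate(error, x0, spread, pop_size=40, t=3, p_mut=0.1, gens=60):
    """Constrained GA search: each parameter stays within x0 +/- spread.

    error  -- objective of Eq. (5) comparing measured and modeled T.F.
    x0     -- natural mean values of (V0, a1, a2); spread bounds the
              perturbation xe so that x = x0 + xe never leaves its range.
    """
    x0, spread = np.asarray(x0, float), np.asarray(spread, float)
    dim = len(x0)
    pop = [x0 + np.random.uniform(-1, 1, dim) * spread for _ in range(pop_size)]
    for _ in range(gens):
        new_pop = []
        for _ in range(pop_size):
            p1 = min(random.sample(pop, t), key=error)  # tournament selection
            p2 = min(random.sample(pop, t), key=error)
            w = np.random.rand(dim)
            child = w * p1 + (1 - w) * p2               # arithmetic crossover
            if random.random() < p_mut:                 # mutation: fresh point
                child = x0 + np.random.uniform(-1, 1, dim) * spread
            new_pop.append(np.clip(child, x0 - spread, x0 + spread))
        pop = new_pop
    return min(pop, key=error)
```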
The nominal value of the mutation probability is in the order of 0.02, however, it
may be better to use higher values as indicated in some studies [9]. These studies
have clarified that much larger mutation rates decreasing over the course of evo-
lution are often helpful with respect to the convergence reliability and velocity of a
GA [1]. This value helps to avoid stagnation, speeds up the task and reduces
the processing time to a few generations toward the optimum parameters.
6 Illustrative Results
specimens contain 0.26 % Carbon. Samples of the used Steel (A1 to A4) were
annealed for different periods of time at 900 °C in order to obtain different Ferrite
grain sizes [18]. The annealing has been followed by grinding and polishing to
remove any surface oxide. The grain sizes were measured by the linear intercept
method [18]. Each recorded value is an average of 100 measurements.
The objective function defined in (5) is then minimized using a typical GA with
tournament selection and tentative parameters as listed in Table 1. The attenuation
factor and the phase velocity are evaluated by using the set of estimated parameters
a1, a2 and Vo.
The obtained results are listed in Tables 2, 3, 4, 5, 6 and 7 along with the grain size
and the minimization error e which has been calculated using Eq. (5). The minimum
Table 2 The estimated parameters for a steel specimen of grain size = 14.4 μm

Crossover operator | V0 (m/s) | a1×10^5 (m^-1 Hz^-1) | a2×10^14 (m^-1 Hz^-4) | e (%) | # Iterations
Initial | 4,000 | 0.1 | 0.013 | – | –
Single-point | 5,714.45 | 1.08493 | 2.03256 | 1.617 | 22
Uniform | 5,714.45 | 1.156 | 0.0843 | 1.634 | 15
Table 3 The estimated parameters for Steel-A1 specimen of grain size = 17.8 μm

Crossover operator | V0 (m/s) | a1×10^5 (m^-1 Hz^-1) | a2×10^14 (m^-1 Hz^-4) | e (%) | # Iterations
Initial | 4,000 | 0.1 | 0.013 | – | –
Single-point | 5,682.15 | 1.8 | 0.59 | 0.783 | 7
Uniform | 5,683.95 | 1.8486 | 0.277 | 0.781 | 19
Table 4 The estimated parameters for Steel-A2 specimen of grain size = 21.2 μm

Crossover operator | V0 (m/s) | a1×10^5 (m^-1 Hz^-1) | a2×10^14 (m^-1 Hz^-4) | e (%) | # Iterations
Initial | 4,000 | 0.1 | 0.013 | – | –
Single-point | 5,560.65 | 1.94 | 1.17 | 0.799 | 11
Uniform | 5,568.64 | 1.925 | 0.1917 | 0.768 | 19
Table 7 The estimated parameters for Steel-A3 specimen for different mutation rates and uniform crossover

Mutation rate | V0 (m/s) | a1×10^5 (m^-1 Hz^-1) | a2×10^14 (m^-1 Hz^-4) | e (%) | # Iterations
0.001 | 5,329.64 | 2.05500 | 3.242 | 1.590 | 18
0.020 | 5,331.00 | 2.27987 | 3.200 | 1.554 | 22
0.100 | 5,335.94 | 2.25437 | 2.440 | 1.501 | 17
0.120 | 5,329.94 | 2.15917 | 0.707 | 1.527 | 20
0.150 | 5,330.34 | 2.19337 | 0.291 | 1.523 | 19
0.250 | 5,334.04 | 2.15600 | 0.185 | 1.517 | 6
is comparable with that in [16] when the assumed cubic grain-size dependence is
considered. These results show that, as the grain size increases, the attenuation factor
as a whole increases, and the growth of the scattering coefficient causes the
attenuation factor to deviate from linearity.
7 Conclusion
Acknowledgment The author would like to deeply thank Dr. A. A. Hassanien, who first taught the
concept of NDT at Cairo University, for his contribution to the different works leading to this work.
References
1. Back, T., Hammel, U., Schwefel, H.: Evolutionary computation: Comments on the history and
current state. Evol. Comput. IEEE Trans. 1(1), 3–17 (1997)
2. Beasley, D., Martin, R., Bull, D.: An overview of genetic algorithms: part 1. Fundamentals.
Univ. Comput. 15(2), 58–69 (1993)
3. Beasley, D., Martin, R., Bull, D.: An overview of genetic algorithms: part 2. Research topics.
Univ. Comput. 15(4), 170–181 (1993)
4. Blickle, T., Thiele, L.: A comparison of selection schemes used in genetic algorithms. TIK-
Report, no. 11, Swiss Federal Institute of Technology, (ETCH), Switzerland, Dec 1995
5. Bustillo, J., Fortineau, J., Gautier, G., Lethiecq, M.: Ultrasonic characterization of porous
silicon using a genetic algorithm to solve the inverse problem. NDT & E Int. 62, 93–98 (2014)
6. Delsanto, P.P.: Universality of nonclassical nonlinearity. In: Delsanto, S., Griffa, S., Morra, L.
(eds.) Inverse Problems and Genetic Algorithms, pp. 349–366. Springer, New York (2006)
7. Elangovan, S., Anand, K., Prakasan, K.: Parametric optimization of ultrasonic metal welding
using response surface methodology and genetic algorithm. Int. J. Adv. Manufact. Technol. 63
(5–8), 561–572 (2012)
8. Hassanien, A., Hesham, M., Nour El-Din, A.M.: Grain-size effect on the attenuation and
dispersion of ultrasonic waves. J. Eng. Appl. Sci. 46(3), 401–411 (1999)
9. Hesham, M.: Efficient evolution operators for estimating ultrasonic propagation parameters
using genetic algorithms. Ain Shams Eng. J. 36(2), 517–532 (2001)
10. Hesham, M., Hassanien, A.: A genetic algorithm for polycrystalline material identification
using ultrasonics, Egypt. J. Phys. 31(2), pp. 149–161 (2000)
11. http://www.obitko.com/tutorials/genetic-algorithms/
12. Kinra, V., Dayal, V.: A new technique for ultrasonic-nondestructive evaluation of thin
specimens. Exp. Mech. 28(3), 288–297 (1988)
13. Kodali, S.P., Bandaru, S., Deb, K., Munshi, P., Kishore, N.N.: Applicability of genetic algorithms to reconstruction of projected data from ultrasonic tomography. In: Ryan, C., Keijzer, M. (eds.) GECCO, pp. 1705–1706. ACM
14. Krautkrämer, J., Krautkrämer, H.: Ultrasonic Testing of Materials. Springer, Berlin (1969)
15. Kristinsson, K., Dumont, G.A.: System identification and control using genetic algorithms.
Syst. Man Cybern. IEEE Trans. 22(5), 1033–1046 (1992)
16. Kuttruff, H.: Ultrasonics Fundamentals and Applications. Elsevier Applied Science, London
(1991)
17. Luo, Z., Zhu, H., Chu, J., Shen, L., Hu, L.: Strain measurement by ultrasonic speckle
technique based on adaptive genetic algorithm. J. Strain Anal. Eng. Des. 48(7), 446–456
(2013)
18. Nour-El-Din, A.M.: Attenuation and dispersion of ultrasonic waves in metals. M.Sc. thesis,
Faculty of Engineering, Cairo University, May 1997
19. O’Donnell, M., Jaynes, E., Miller, J.: Kramers-Kronig relationship between ultrasonic
attenuation and phase velocity. J. Acoust. Soc. Am. 69(3), 696–701 (1981)
20. Peirlinckx, L., Pintelon, R., Van Biesen, L.: Identification of parametric models for ultrasonic
wave propagation in the presence of absorption and dispersion. IEEE Trans. Ultrason.
Ferroelectr. Freq. Control 40(4), 302–312 (1993)
21. Puthillath, P., Krishnamurthy, C., Balasubramaniam, K.: Hybrid inversion of elastic moduli of
composite plates from ultrasonic transmission spectra using PVDF plane wave sensor.
Compos. B Eng. 41(1), 8–16 (2010)
22. Ramuhalli, P., Polikar, R., Udpa, L., Udpa, S.S.: Fuzzy ARTMAP network with evolutionary
learning, Acoustics, Speech, and Signal Processing. In: Proceedings of IEEE International
Conference on ICASSP‘00, 6, pp. 3466–3469 (2000)
23. Rosenberger, C., Chehdi, K.: Genetic fusion: application to multi-components image
segmentation, Acoustics, Speech, and Signal Processing. In: Proceedings of IEEE
International Conference on ICASSP‘00, 4, pp. 2223–2226 (2000)
24. Sibil, A., Godin, N., R’Mili, M., Maillet, E., Fantozzi, G.: Optimization of acoustic emission
data clustering by a genetic algorithm method. J. Nondestr. Eval. 31(2), 169–180 (2012)
25. Spears, W.M.: Adapting crossover in a genetic algorithm. Navy Center for Applied Research in Artificial Intelligence (NCARAI), Naval Research Laboratory, Washington, DC 20375-5000 (1992)
26. Sun, K., Hong, K., Yuan, L., Shen, Z., Ni, X.: Inversion of functional graded materials elastic
properties from ultrasonic lamb wave phase velocity data using genetic algorithm. J. Nondestr.
Eval. 33, 34–42 (2013)
27. Tavakolpour, A.R., Mat Darus, I.Z., Tokhi, O., Mailah, M.: Genetic algorithm-based
identification of transfer function parameters for a rectangular flexible plate system. Eng. Appl.
Artif. Intell. 23(8), 1388–1397 (2010)
28. Toledo, A.R., Fernández, A.R., Anthony, D.K.: A comparison of GA objective functions for estimating internal properties of piezoelectric transducers used in medical echo-graphic imaging. In: Pan American Health Care Exchanges (PAHCE 2010), pp. 185–190, 15–19 Mar 2010
29. Vishnuvardhan, J., Krishnamurthy, C., Balasubramaniam, K.: Genetic algorithm
reconstruction of orthotropic composite plate elastic constants from a single non-symmetric
plane ultrasonic velocity data. Compos. B Eng. 38(2), 216–227 (2007)
30. Weile, D.S., Michielssen, E.: Genetic algorithm optimization applied to electromagnetics: a
review. Antennas Propag. IEEE Trans. 45(3), 343–353 (1997)
A Hybridized Approach for Prioritizing
Software Requirements Based on K-Means
and Evolutionary Algorithms
1 Introduction
analytic hierarchy process (AHP) is the most prominently used technique. However,
this technique suffers from bad scalability. This is due to the fact that AHP executes
ranking by considering criteria that are defined through an assessment of
the relative priorities between pairs of requirements. This becomes impracticable as
the number of requirements increases. It also does not support requirements evolution
or rank updates, but it does provide efficient and reliable results [4, 17]. Also, most
techniques suffer from rank reversals. This term refers to the inability of a technique
to update rank status of ordered requirements whenever a requirement is added or
deleted from the list. Prominent techniques that suffer from this limitation are case
base ranking [25]; interactive genetic algorithm prioritization technique [31];
Binary search tree [17]; cost value approach [17] and EVOLVE [11]. Furthermore,
existing techniques are prone to computational errors [27] probably due to lack of
robust algorithms. [17] conducted some researches where certain prioritization
techniques were empirically evaluated. From their research, they reported that, most
of the prioritization techniques apart from AHP and bubble sorts produce unreliable
or misleading results while AHP and bubble sorts were also time consuming. The
authors submitted that; techniques like hierarchy AHP, spanning tree, binary search
tree, priority groups produce unreliable results and are difficult to implement. [4]
were also of the opinion that, techniques like requirement triage, value intelligent
prioritization and fuzzy logic based techniques are also error prone due to their
reliance on experts and are time consuming too. Planning game has a better vari-
ance of numerical computation but suffer from rank reversals problem. Wieger’s
method and requirement triage are relatively acceptable and adoptable by practi-
tioners but these techniques do not support rank updates in the event of require-
ments evolution as well. The value of a requirement is expressed as its relative
importance with respect to the other requirements in the set.
In summary, the limitations of existing prioritization techniques can be described
as follows:
2.1.1 Scalability: Techniques like AHP, pairwise comparison and bubble sort suffer from
scalability problems because requirements are compared pairwise, giving n(n−1)/2
comparisons [17]. For example, when the number of requirements in a list is doubled,
other techniques only require double the effort or time for prioritization, while AHP,
pairwise comparison and bubble sort require four times the effort or time. This is poor
scalability.
2.1.2 Computational complexity: Most of the existing prioritization techniques are
actually time consuming in the real world [4, 17]. Furthermore, [1] executed a
comprehensive experimental evaluation of five different prioritization techniques, namely
AHP, binary search tree, planning game, $100 (cumulative voting) and a new method
which combines planning game and AHP (PgcAHP), to determine their ease of use,
accuracy and scalability. The author went as far as determining the average time taken to
prioritize 13 requirements across 14 stakeholders with these techniques. At the end of the
experiment, it was observed that planning game was the fastest while AHP
EA are not allowed to lose any customer, i.e. the offspring must contain the same number
of customers as the parent; otherwise, the parent is stored in the requirement pool instead
of the offspring.
4 Proposed Approach
independent. In order to derive the synthetic utility values, we first exploited the factor
analysis technique to extract the attributes that possess common functionalities. This
caters for requirement dependency challenges during the prioritization process. The
attributes with the same functionalities are considered to be mutually dependent.
Therefore, before relative weights are assigned to the requirements by relevant
stakeholders, attention should be paid to requirement dependency issues in order to avoid
redundant results. However, when requirements evolve, it becomes necessary to add to or
delete from a set. The algorithm should also be able to detect this situation and update the
rank status of ordered requirements instantly. This is known as rank reversal. It is formally
expressed as: (1) failures of the type 1 → 5 or 5 → 1; (2) failures of the type 1 → ϕ or
5 → ϕ (where ϕ = the null string), called deletions; and (3) failures of the type ϕ → 1 or
ϕ → 5, called insertions. A weight metric w on two requirements (X, Y) is defined as the
smallest number of edit operations (deletions, insertions and updates) needed to enhance
the prioritization process. Three types of rank update operations on X → Y are defined: a
change operation (X ≠ ϕ and Y ≠ ϕ), a delete operation (Y = ϕ) and an insert operation
(X = ϕ). The weights of all the requirements can be computed by a weight function w. An
arbitrary weight function w is obtained by computing all the assigned non-negative real
numbers w(X, Y) on each requirement. On the other hand, where there is mutual
independence between attributes and the measurement is an additive case, we can utilize
the additive aggregate method to compute the synthetic utility values for all the attributes
in the entire set of requirements. As can be seen in Algorithm 1, differential evolution
starts with the generation of a random population (line 1) through the assignment of a
random coefficient to each attribute of the individual (requirement). The population
consists of a certain number of solutions (that is, a parameter to be configured). Each
individual (requirement) is represented by a vector of weighting factors provided by the
stakeholders. After the generation of the population, the fitness of each individual is
assigned to each solution using the Pearson correlation. This correlation, corr(X, Y), is
calculated with the scores provided by stakeholders for every pair of requirements of the
RALIC dataset [22].
The closer the value of the correlation is to either extreme value, the stronger the
correlation between the requirements and the higher the accuracy of the prioritized
results. On the contrary, if the result tends toward 0, the requirements are largely
uncorrelated, which indicates a poor quality prioritized solution.
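Since the fitness assignment hinges on this Pearson correlation, a minimal Python sketch of such a fitness function follows; the array shapes and the names (stakeholder_scores, candidate) are illustrative assumptions, not the authors' implementation.

import numpy as np

def pearson_fitness(candidate, stakeholder_scores):
    # Average Pearson correlation between a candidate weight vector and each
    # stakeholder's score vector; values near +/-1 signal strong correlation,
    # values near 0 a poorly correlated (low quality) solution.
    corrs = [np.corrcoef(candidate, s)[0, 1] for s in stakeholder_scores]
    return float(np.mean(corrs))

scores = np.array([[4, 2, 5, 1, 3],     # toy example: 3 stakeholders, 5 requirements
                   [5, 1, 4, 2, 3],
                   [3, 2, 5, 1, 4]], dtype=float)
candidate = np.array([4.5, 1.5, 5.0, 1.0, 3.0])
print(pearson_fitness(candidate, scores))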
Algorithm 1: Pseudo-code for the DE/K-means algorithm
1. generate random centroids from K clusters (population)
2. assign the weights of each attribute to the cluster with the closest centroid
3. update the centroids by calculating the mean values of objects within clusters: Fitness(population)
4. while (stop condition not reached) do
5.   for (each individual of the population)
6.     selectIndividuals(xTarget, xBest, xInd1, xInd2)
7.     xMut ← diffMutation(xBest, F, xInd1, xInd2)
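To make the truncated listing concrete, below is a minimal, self-contained Python sketch of a DE/K-means hybrid in the spirit of Algorithm 1, assuming DE/best/1 mutation with binomial crossover and a plain k-means seeding step; the parameter values (F, CR, population size) are illustrative assumptions rather than the authors' settings.

import numpy as np

rng = np.random.default_rng(0)

def kmeans(data, k, iters=20):
    # Plain k-means: random initial centroids, assign points to the closest
    # centroid, then recompute each centroid as the mean of its cluster.
    centroids = data[rng.choice(len(data), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = data[labels == j].mean(axis=0)
    return centroids, labels

def de_optimize(fitness, dim, pop_size=20, F=0.8, CR=0.9, gens=100):
    # Differential evolution over weight vectors (one coefficient per attribute).
    pop = rng.random((pop_size, dim))
    fit = np.array([fitness(x) for x in pop])
    for _ in range(gens):
        best = pop[fit.argmax()]
        for i in range(pop_size):
            r1, r2 = pop[rng.choice(pop_size, 2, replace=False)]
            mutant = best + F * (r1 - r2)        # DE/best/1 mutation
            cross = rng.random(dim) < CR         # binomial crossover mask
            trial = np.where(cross, mutant, pop[i])
            f = fitness(trial)
            if f > fit[i]:                       # greedy selection
                pop[i], fit[i] = trial, f
    return pop[fit.argmax()], fit.max()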
$$\frac{dN(s)}{dw} = f(N) - \kappa(c)\,c(w)\,N, \qquad N(0) = N_0 \qquad (3)$$
where $f(N)$ is a real-valued function which models the increase in the number of
requirements, $c(w)$ is the weight of requirements based on pre-defined criteria, and
$\kappa(c)$ is a quantity representing the efficacy of the ranked weights. The rank
criterion $c(w)$ in (3) is the only variable directly controllable by the stakeholders.
Therefore, the
problem of requirements prioritization can be regarded as the problem of planning
for software releases based on the relative weights of requirements. The optimal
weights of requirements are in the form of a discrete ordered program with
N requirements given at weights $w_1, w_2, \ldots, w_n$. Each requirement is assessed by
s stakeholders characterized by their defined criteria $C_{ij}$, $i = 1, \ldots, n$;
$j = 1, \ldots, d$, in the set. These criteria can be varied within the boundaries specified
by the constraints in Eq. (3). The conflicting nature of these constraints and the intention
to develop a model-independent approach for prioritizing requirements make the
utilization of computational optimization techniques very viable.
All the experiments were executed under the same environment: a machine with an Intel
Pentium 2.10 GHz processor and 500 GB of storage. Since we are dealing with a stochastic
algorithm, we have carried out 50 independent runs for each experiment. Results
provided in the following subsections are average results of these 50 executions.
Arithmetic mean and standard deviation were used to statistically measure the
performance of the proposed system.
maximum value. From each generation of individuals, the one or few with the highest
fitness values are picked out and inserted into the result set.
The weights of requirements are computed based on their frequencies, and a mean score
is obtained to determine the final rank. Figure 1 depicts the fitness function for the mean
weights of the dataset across 76 stakeholders. This is achieved by counting the numbers of
requirements, where the DE simply adds their sums and apportions precise values across
requirements to determine their relative importance.
The results displayed in Table 1 show the summary statistics of 50 experimental runs. For
10 requirements, the total number of attributes was 262 and the size of each cluster varied
from 1 to 50, while the mean and standard deviation of each cluster spanned 1–30 and
15–30, respectively.
Also, Table 2 shows the results provided by each cluster that represents the 10
requirements during the course of running the algorithm on the data set. It displays the
sum of weights, within-class variances, minimum distance to the centroid, average
distance to the centroid and maximum distance to the centroid. Table 3 shows the
distances between the class centroids for the 10 requirements across the total number of
attributes, while Table 4 depicts the analysis of each iteration.
Analysis of multiple runs of this experiment showed exciting results as well. Using 500
trials, it was discovered that the algorithm classified requirements correctly, where the
determinants (W) for each variable were computed based on the stakeholders' weights.
The sum of weights and variance for each requirement set were also calculated.
The learning process consists of finding the weight vector that allows the choice of
requirements. The fitness value of each requirement can be measured on the basis of the
weight vectors, based on the pre-defined criteria used to calculate the actual ranking. The
disagreements between ranked requirements must be minimized as much as possible. We
can also consider the disagreements on a larger number of top vectors, and the obtained
similarity measure can be used to enhance the agreement index. The fitness value will
then be a weighted sum of these two similarity measures.
Definition 5.1 Let X be a measurable requirement that is endowed with attributes of
σ-functionalities, where N is the set of all subsets of X. A learning process g defined on
the measurable space $(X, N)$ is a set function $g : N \to [0, 1]$ which satisfies the
following property:
$$X, Y \in N,\; X \subseteq Y \Rightarrow g(X) \le g(Y) \qquad (5)$$
In the case where $g(X \cup Y) \ge \max\{g(X), g(Y)\}$, the learning function g attempts
to determine the total number of requirements being prioritized, and if
$g(X \cap Y) \le \min\{g(X), g(Y)\}$, the learning function attempts to compute the
relative weights of requirements provided by the relevant stakeholders.
Definition 5.2 Let $h = \sum_{i=1}^{n} X_i \cdot 1_{X_i}$ be a simple function, where
$1_{X_i}$ is the attribute (indicator) function of the requirements $X_i \in N$,
$i = 1, \ldots, n$, and the $X_i$ are pairwise disjoint. If $M(X_i)$ is the measure of the
weights between all the attributes contained in $X_i$, then the integral of $h$ is given as:
$$\int h \, dM = \sum_{i=1}^{n} M(X_i) \cdot X_i \qquad (7)$$
for $0 \le k \le 1$.
In practical application of the learning process, the number of clusters, which represents
the number of requirements, must be determined first. The attributes that describe each
requirement are the data elements that are to be ranked. Therefore, before relative weights
are assigned to requirements by stakeholders, attention should be paid to requirement
dependency issues in order to avoid redundant results. Prioritizing software requirements
is ultimately driven by relative perceptions, which inform the relative scores provided by
the stakeholders to initiate the ranking process.
Prioritizing requirements is an important activity in software development [31, 32]. When
customer expectations are high, delivery time is short and resources are limited, the
proposed software must be able to provide the desired functionality as early as possible.
Many projects are challenged by the fact that not all the requirements can be implemented
because of limited time and resource constraints. This means that it has to be decided
which of the requirements can be dropped from the next release. Information about
priorities is needed, not just to ignore the least important requirements but also to help the
project manager resolve conflicts, plan for staged deliveries, and make the necessary
trade-offs. A software system's acceptability level is frequently determined by how well
the developed system has met or satisfied the specified user's or stakeholder's
requirements. Hence, eliciting and prioritizing the appropriate requirements and
scheduling the right releases with the correct functionalities are essential ingredients for
developing good quality software systems. The MATLAB function for k-means
clustering, idx = kmeans(data, k), which partitions the points in the n-by-p data matrix
data into k clusters, was employed. This iterative partitioning minimizes the sum, over all
clusters, of the within-cluster sums of point-to-cluster-centroid distances. Rows of data
correspond to attributes while columns correspond to requirements. kmeans returns an
n-by-1 vector idx containing the cluster index of each attribute, which is utilized in
calculating the fitness function to determine the ranks.
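For readers working outside MATLAB, a rough Python equivalent of this clustering step could use scikit-learn's KMeans as a stand-in; the data sizes below mirror the 262 attributes and 10 requirements discussed above, and the rank aggregation is a simplified illustration, not the chapter's exact fitness computation.

import numpy as np
from sklearn.cluster import KMeans

data = np.random.rand(262, 10)      # rows = attributes, columns = requirements
k = 10                              # one cluster per requirement
idx = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(data)

# One simple aggregation: mean attribute weight per cluster, then rank clusters
cluster_means = np.array([data[idx == j].mean() for j in range(k)])
ranks = np.argsort(-cluster_means)  # highest mean weight ranked first
print(ranks)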
Further analysis was performed using a two-way analysis of variance (ANOVA). On the
overall dataset, we found significant correlations between the ranked requirements. The
ANOVA results showed a significant effect on Rate P and Rank P, with minimized
disagreement rates (p-value = 0.088 and 0.083, respectively).
The requirements are randomly generated as a population, while the fitness value is
calculated, which gives rise to the best and mean fitnesses of the requirement generations
that were subjected to a stopping criterion during the iteration process. The best fitness
stopping criterion option was chosen during the simulation process.
The requirement generations significantly increased while the mean and best values were
obtained for all the requirements, which aids the computation of the final ranks for all the
requirements. Also, the best, worst and mean scores of requirements were computed. In
the context of requirement prioritization, the best scores can stand for the most valued
requirements while the worst scores can stand for the requirements ranked lowest by the
stakeholders. The mean scores are the scores used to determine the relative importance of
requirements across all the stakeholders for the software development project. Therefore,
mutation should be performed with respect to the weights of the attributes describing each
requirement set.
50 independent runs were conducted for each experiment. The intra-cluster distances
obtained by the clustering algorithms on the test dataset are summarized in Table 5. The
results contain the sum of weights and within-class variance of the requirements. The
sums of weights are considered as the prioritized results. The iterations depicted in
Table 6 represent the number of times that the clustering algorithm calculated the fitness
function to reach the (near) optimal solution. The fitness is the average correlation over all
the lists of attribute weights, and it depends on the number of iterations needed to reach
the optimal solution. As seen from Table 6, the proposed algorithm produced the highest
quality solutions in terms of the determinant as well as the initial and final within-class
variances. Moreover, the standard deviation of solutions found by the proposed algorithm
is small, which means that the proposed algorithm could find a near optimal solution in
most of the runs. In other words, the results confirm that the proposed algorithm is viable
and robust. In terms of the number of function evaluations, the k-means algorithm needs
the fewest evaluations compared to other algorithms.
Table 6 (continued)

Repetition  Iteration  Initial within-class variance  Final within-class variance  Determinant (W)
40          3          1,120.536                      26.388                       35,745,608.991
41          3          1,131.305                      26.587                       37,270,986.568
42          3          1,125.849                      579.831                      128,941,063,793.784
43          3          1,100.849                      522.269                      176,077,098,304.402
44          3          1,121.924                      25.182                       36,102,923.262
45          3          1,116.445                      25.120                       36,854,622.383
46          3          1,116.223                      522.626                      228,664,613,438.719
47          3          1,116.968                      36.729                       55,171,737.251
48          3          1,127.872                      20.857                       29,381,923.033
49          3          1,130.837                      20.633                       27,643,374.936
50          3          1,104.055                      587.943                      187,806,412,544.351
6 Conclusion
Acknowledgment The authors appreciate the efforts of the Ministry of Science, Technology and
Innovation Malaysia (MOSTI) under Vot 4S062 and Universiti Teknologi Malaysia (UTM) for
supporting this research.
References
20. Kobayashi, M., Maekawa, M.: Need-based requirements change management. In: Proceeding
of Eighth Annual IEEE International Conference and Workshop on the Engineering of
Computer Based Systems, pp. 171–178 (2001)
21. Krishna, K., Narasimha, M.: Genetic K-means algorithm. IEEE Trans. Syst. Man Cyber. Part
B (Cyber.) 29(3), 433–439 (1999)
22. Lim, S.L., Finkelstein, A.: StakeRare: using social networks and collaborative filtering for
large-scale requirements elicitation. IEEE Trans. Softw. Eng. 38(3), 707–735 (2012)
23. Maulik, U., Bandyopadhyay, S.: Genetic algorithm-based clustering technique. Pattern
Recogn. 33(9), 1455–1465 (2000)
24. Moisiadis, F.: The fundamentals of prioritizing requirements. In: Proceedings of Systems
Engineering Test and Evaluation Conference (SETE 2002) (2002)
25. Perini, A., Susi, A., Avesani, P.: A Machine Learning Approach to Software Requirements
Prioritization. IEEE Trans. Software Eng. 39(4), 445–460 (2013)
26. Qin, A.K., Suganthan, P.N.: Kernel neural gas algorithms with application to cluster analysis.
In: Proceedings-International Conference on Pattern Recognition (2004)
27. Ramzan, M., Jaffar, A., Shahid, A.: Value based intelligent requirement prioritization (VIRP):
expert driven fuzzy logic based prioritization technique. Int. J. Innovative Comput. 7(3),
1017–1038 (2011)
28. Selim, S., Alsultan, K.: A simulated annealing algorithm for the clustering problem. Pattern
Recogn. 24(10), 1003–1008 (1991)
29. Sung, C.S., Jin, H.W.: A tabu-search-based heuristic for clustering. Pattern Recogn. 33(5),
849–858 (2000)
30. Storn, R., Price, K.: Differential Evolution—A Simple and Efficient Adaptive Scheme for
Global Optimization over Continuous Spaces, TR-95-012. Int. Comput. Sci. Inst., Berkeley
(1995)
31. Tonella, P., Susi, A., Palma, F.: Interactive requirements prioritization using a genetic
algorithm. Inf. Softw. Technol. 55(1), 173–187 (2013)
32. Thakurta, R.: A framework for prioritization of quality requirements for inclusion in a software
project. Softw. Qual. J. 21, 573–597 (2012)
One-Hour Ahead Electric Load
Forecasting Using Neuro-fuzzy System
in a Parallel Approach
1 Introduction
A. Laouafi (&)
Department of Electrical Engineering, University of 20 August 1955—Skikda, Skikda,
Algeria
e-mail: Laouafi[email protected]
M. Mordjaoui
Department of Electrical Engineering, LRPCSI Laboratory, University of 20 August 1955—
Skikda, Skikda, Algeria
e-mail: [email protected]
D. Dib
Department of Electrical Engineering, University of Tebessa, Tebessa, Algeria
e-mail: [email protected]
The common approach is to analyse time series data of load consumption and
temperature in order to model and explain the series [30]. The intuition underlying
time-series processes is that the future behavior of variables is related to their past values,
both actual and predicted, with some adaptation/adjustment built in to take care of how
past realizations deviated from those expected. Temporal forecasting can be broadly
divided into four types:
• Very short term (from a few minutes to 1 h).
• Short term (from 1 h to a week).
• Medium term (from a week to a year).
• Long term (from a year to several years).
Long term prediction is normally used for planning the growth of the generation capacity.
This long term forecasting is used to decide whether to build new lines and sub-stations or
to upgrade the existing systems. Medium-term load forecasting is used to meet the load
requirements at the height of the winter or the summer season, and may require a load
forecast to be made a few days to a few weeks (or months) in advance.
In STLF, the forecast calculates the estimated load for each hour of the day, the
daily peak load and the daily/weekly energy generation. Many operations like real
time generation control, security analysis, spinning reserve allocation, energy
interchanges with other utilities, and energy transactions planning are done based
on STLF.
Economic and reliable operation of an electric utility depends to a significant extent on the
accuracy of the load forecast. The load dispatcher at the main dispatch center must
anticipate the load pattern well in advance so as to have sufficient generation to meet the
customer requirements. Overestimation may cause the startup of too many generating
units and lead to an unnecessary increase in the reserve and the operating costs.
Underestimation of the load forecasts results in failure to provide the required spinning
and standby reserve and stability to the system, which may lead to the collapse of the
power system network [1]. Load forecast errors can yield suboptimal unit commitment
decisions. Hence, correct forecasting of the load is an essential element in power system
operation.
In a deregulated, competitive power market, utilities tend to maintain their
generation reserve close to the minimum required by an independent system
operator. This creates a need for an accurate instantaneous-load forecast for the next
several minutes. Accurate forecasts, referred to as very short-term load forecasts, ease the
problem of generation and load management to a great extent. These
forecasts, integrated with the information about scheduled wheeling transactions,
transmission availability, generation cost, spot market energy pricing, and spinning
reserve requirements imposed by an independent system operator, are used to
determine the best strategy for the utility resources. Very short-term load fore-
casting has become of much greater importance in today’s deregulated power
industry [5, 36].
A wide variety of techniques has been studied in the literature on short-term load
forecasting [20]; examples include time series analysis (ARMA, ARIMA, ARMAX, etc.)
[4, 19], regression approaches [34], the exponential smoothing technique [44], artificial
neural network methods [35], and hybrid approaches based on evolutionary algorithms
[12].
The nature of the electrical load forecasting problem is well suited to the technology of
artificial neural networks (ANN), as they can model complex non-linear relationships
through a learning process involving historical data trends. Therefore, several studies in
recent years have examined the application of ANNs to short-term load forecasting [26].
Recently, hybrid neuro-fuzzy models have received considerable attention from
researchers in the field of short-term load forecasting [29, 30, 33]. The neuro-fuzzy
approach attempts to exploit the merits of both neural-network and fuzzy-logic-based
modeling techniques. For example, fuzzy models are based on fuzzy IF-THEN rules and
are, to a certain degree, transparent to interpretation and analysis, whereas the
neural-network based black-box model has a unique learning ability [32]. While building
a FIS, the fuzzy sets, fuzzy operators, and the knowledge base are required to be specified.
To implement an ANN for a specific application, the architecture and learning algorithm
must be selected. The drawbacks of these approaches appear complementary, and
consequently it is natural to consider implementing an integrated system combining the
neuro-fuzzy concepts [41].
Nevertheless, very short-term load demand forecasting methods based on the neuro-fuzzy
approach are not so numerous [9, 10]. This lack has motivated us to devote this paper to
the development and implementation of adaptive neuro-fuzzy inference system models
for VSTLF.
The paper is organized as follows. Section 2 summarizes very short-term load forecasting
methods. Section 3 is devoted to the description of the ANFIS architecture. Section 4
describes the proposed estimation methods. Section 5 provides and explains forecasting
results. Finally, Sect. 6 concludes the paper.
Very short-term load forecasting (VSTLF) predicts the load in an electric power system
1 h into the future in steps of a few minutes, in a moving window manner. Depending on
the electric utility, the data used in VSTLF could be on a minute-by-minute basis [27, 43],
at 5-min intervals [11, 16, 17, 40], in 15-min steps [6, 31], or at half-hourly intervals
[24, 25].
Methods for very short-term load forecasting are limited. Existing methods include time
series analysis, exponential smoothing, neural networks (NN), fuzzy logic, adaptive
neuro-fuzzy inference systems, Kalman filtering, and support vector regression. Usually,
weather conditions are ignored in very short-term load forecasting because of the large
time constant of load as a function of weather. The representative methods will be briefly
reviewed in this section.
Time series models are based on the assumption that the data have an internal structure,
such as autocorrelation, trend or seasonal variation. The forecasting methods detect and
explore such a structure. Time series models include:
• Autoregressive Model (AR)
• Moving Average Model (MA)
• Autoregressive Moving Average Model (ARMA)
• Autoregressive Integrated Moving Average Model (ARIMA)
• Autoregressive Moving Average Model with Exogenous Inputs (ARMAX)
• Autoregressive Integrated Moving Average with Explanatory Variable (ARIMAX)
However, the most popular time series models used in VSTLF are the autoregressive
model [27] and the autoregressive integrated moving average model [25].
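As a concrete illustration, a minimal AR forecaster of this kind could be sketched as follows, using statsmodels' AutoReg on a toy series; the lag order and the data are assumptions for illustration only, not taken from the cited studies.

import numpy as np
from statsmodels.tsa.ar_model import AutoReg

load = np.sin(np.linspace(0, 20, 500)) + 0.05 * np.random.randn(500)  # toy load series
model = AutoReg(load, lags=4).fit()    # AR(4): y_t = c + sum(phi_i * y_{t-i}) + e_t
forecast = model.predict(start=len(load), end=len(load) + 3)  # next 4 steps ahead
print(forecast)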
Neural networks (NN) assume a functional relationship between load and affecting
factors, and estimate the functional coefficients by using historical data. There are many
types of neural networks, including the multilayer perceptron network (MLP),
self-organizing networks and Hopfield's recurrent network [27]. Based on learning
strategies, neural network methods for load forecasting can be classified into two groups.
The first is supervised neural networks, which adjust their weights according to the error
between predicted and desired output. The second are methods based on unsupervised
learning algorithms. Generally, methods based on a supervised learning algorithm, like
the feed-forward multilayer perceptron, are used.
Although the MLP is a classical model, it is still the most favored ANN architecture in
forecasting applications. The structure of the MLP consists of an input layer, a hidden
layer, and output nodes connected in a feed-forward fashion via multiplicative weights.
Inputs are multiplied by connection weights and passed on to the neurons in the hidden
layer nodes. The neurons in the hidden and output layer nodes have a transfer function.
The inputs to the hidden layer are passed through a transfer function to produce output.
The ANN learns from experience and is trained with backpropagation and a supervised
learning algorithm. The proper selection of training data improves the efficiency of the
ANN [8].
Most neural network methods for VSTLF use inputs such as the time index, the load of
the previous hour, the load of yesterday and of the previous week at the same hour, and a
weekday index relative to the target hour [5, 15, 39]. Chen and York [7] have presented a
neural network based very short-term load prediction. Results indicated that, under
normal situations, minutely load values forecasted by the NN-based VSTLP for the next
15 min are provided with good accuracy on the whole as well as for the worst cases.
The Kalman filtering (KF) algorithm is a robust tracking algorithm that has long been
applied to many engineering fields, such as radar tracking. In load forecasting, it is
introduced to estimate the optimal load forecast parameters and to overcome the unknown
disturbance in the linear part of the system during load prediction [48].
Very short-term load prediction in [45] was done using slow and fast Kalman
estimators and an hourly forecaster. The Kalman model parameters are determined
by matching the frequency response of the estimator to the load residuals. The
methodology was applied to load data taken from the portion of the western North
American power system operated by the BPA.
Guan et al. [18] have presented a method of wavelet neural networks trained by
hybrid Kalman filters to produce very short-term forecasting with prediction
interval estimates online. Testing results demonstrate the effectiveness of hybrid
Kalman filters for capturing different features of load components, and the accuracy
of the overall variance estimate derived based on a data set from ISO New England.
The support vector machines (SVM) method, which was proposed by Vapnik [46], is
used to solve pattern recognition problems by determining a hyperplane that separates
positive and negative examples through optimization of the separation margin between
them [32]. Later, in 1998, Vapnik extended the SVM method to deal with function fitting
problems, which forms the support vector regression (SVR) method [47]. SVR produces
a decision boundary that can be expressed in terms of a few support vectors and can be
used with kernel functions to create complex nonlinear decision boundaries. Similarly to
linear regression, SVR tries to find a function that best fits the training data.
Setiawan et al. [38] have presented a new approach for very short-term electricity load
demand forecasting using SVR. Support vector regression was applied to predict the load
demand every 5 min based on historical data from the Australian electricity operator
NEMMCO for 2006–2008. The results showed that SVR is a very promising prediction
model, outperforming the back propagation neural network (BPNN) prediction
algorithms, which are widely used by both industry forecasters and researchers.
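A minimal sketch of an SVR one-step-ahead load forecaster in the spirit of [38] is given below, using scikit-learn; the lag window, kernel and parameter values are illustrative assumptions, not those of the cited study.

import numpy as np
from sklearn.svm import SVR

load = np.sin(np.linspace(0, 30, 600)) + 0.05 * np.random.randn(600)  # toy 5-min series
lags = 12
X = np.array([load[i - lags:i] for i in range(lags, len(load))])  # lagged input windows
y = load[lags:]                                                   # next-step targets

svr = SVR(kernel="rbf", C=10.0, epsilon=0.01).fit(X[:-1], y[:-1])
print(svr.predict(X[-1:]))   # one-step-ahead prediction from the latest window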
The hybrid neuro-fuzzy approach is a way to create a fuzzy model from data by
some kind of learning method that is motivated by learning algorithms used in
neural networks. This considerably reduces development time and cost while
improving the accuracy of the resulting fuzzy model. Thus, neuro-fuzzy systems are
basically adaptive fuzzy systems developed by exploiting the similarities between
fuzzy systems and certain forms of neural networks, which fall in the class of
generalized local methods. Therefore, the performance of a neuro-fuzzy system can
also be represented by a set of humanly understandable rules or by a combination of
localized basis functions associated with local models, making them an ideal
framework to perform nonlinear predictive modeling. However, there are some
ways to mix neural networks and fuzzy logic. Consequently, three main categories
characterize these technologies: fuzzy neural networks, neural fuzzy systems and
fuzzy-neural hybrid systems [2, 3]. In the last approach, both neural networks and
fuzzy logic are used independently, becoming, in this sense, a hybrid system.
An adaptive Neuro-Fuzzy inference system is a cross between an artificial neural
network and a fuzzy inference system. An artificial neural network is designed to
mimic the characteristics of the human brain and consists of a collection of artificial
neurons. Adaptive Neuro-Fuzzy Inference System (ANFIS) is one of the most
successful schemes which combine the benefits of these two powerful paradigms
into a single capsule [21]. An ANFIS works by applying neural learning rules to
identify and tune the parameters and structure of a Fuzzy Inference System (FIS).
There are several features of the ANFIS which enable it to achieve great success in
a wide range of scientific applications. The attractive features of an ANFIS include:
easy to implement, fast and accurate learning, strong generalization abilities,
excellent explanation facilities through fuzzy rules, and easy to incorporate both
linguistic and numeric knowledge for problem solving [22]. According to the
neuro-fuzzy approach, a neural network is proposed to implement the fuzzy system,
so that structure and parameter identification of the fuzzy rule base are accom-
plished by defining, adapting and optimizing the topology and the parameters of the
corresponding neuro-fuzzy network, based only on the available data. The network
can be regarded both as an adaptive fuzzy inference system with the capability of
learning fuzzy rules from data, and as a connectionist architecture provided with
linguistic meaning [2, 3].
Fig. 2 A two input first order Sugeno fuzzy model with two rules
$f_i$ are the outputs within the fuzzy region specified by the fuzzy rule, and
$\{a_i, b_i, c_i\}$ are the design parameters that are determined during the training
process. Some layers of the ANFIS have the same number of nodes, and nodes in the
same layer have similar functions. Layer 1: Every node i in this layer is an adaptive node.
The outputs of layer 1 are the fuzzy membership grades of the inputs, given by the
membership functions $\mu_{A_i}(x_1)$ and $\mu_{B_i}(x_2)$. In other words, $O_{1,i}$
is the membership function of $A_i$, and it specifies the degree to which the given input
satisfies the quantifier $A_i$. $\mu_{A_i}(x_1)$ and $\mu_{B_i}(x_2)$ can adopt any fuzzy
membership function; however, the most commonly used are bell-shaped and Gaussian
membership functions. For example, if the bell-shaped membership function is employed,
$\mu_{A_i}(x_1)$ is given by:
$$\mu_{A_i}(x_1) = \frac{1}{1 + \left| \dfrac{x_1 - c_i}{a_i} \right|^{2 b_i}} \qquad (5)$$
where $a_i$, $b_i$ and $c_i$ are the parameters of the membership function, governing the
bell shape accordingly. Layer 2: Every node in this layer is a circle node labeled Π, which
multiplies the incoming signals and sends the product out. The fuzzy operators are applied
in this layer to compute the rule antecedent part [30]. The output of the nodes in this layer
is the firing strength $w_i = \mu_{A_i}(x_1)\,\mu_{B_i}(x_2)$, $i = 1, 2$.
Layer 3: The fuzzy rule base is normalized in the third hidden layer. Every node in this
layer is a circle node labeled N. The ith node calculates the ratio of the ith rule's firing
strength to the sum of all rules' firing strengths:
$$v_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2 \qquad (7)$$
Layer 4: Every node i in this layer is a square node with the node function
$O_{4,i} = v_i f_i = v_i (a_i x_1 + b_i x_2 + c_i)$.
Layer 5: Finally, layer five, consisting of a circle node labeled Σ, is the summation of all
incoming signals. Hence, the overall output of the model is given by:
$$O_{5,i} = \sum_{i=1}^{2} v_i f_i = \frac{\sum_{i=1}^{2} w_i f_i}{\sum_{i=1}^{2} w_i} \qquad (9)$$
$$f = v_1 f_1 + v_2 f_2 \qquad (11)$$
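The forward pass of Eqs. (5)–(11) can be illustrated numerically. The following sketch hard-codes a two-rule, two-input first-order Sugeno model; all parameter values are illustrative assumptions.

import numpy as np

def bell(x, a, b, c):
    # Generalised bell membership function, Eq. (5)
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

def anfis_forward(x1, x2, prem, cons):
    # prem: (a, b, c) per input per rule; cons: (a_i, b_i, c_i) per rule
    w = np.array([bell(x1, *prem[i][0]) * bell(x2, *prem[i][1])  # layer 2: product
                  for i in range(2)])
    v = w / w.sum()                                              # layer 3, Eq. (7)
    f = np.array([ai * x1 + bi * x2 + ci for ai, bi, ci in cons])  # rule consequents
    return float(np.dot(v, f))                                   # layer 5, Eq. (11)

prem = [[(1.0, 2.0, 0.0), (1.0, 2.0, 0.0)],   # illustrative premise parameters
        [(1.0, 2.0, 1.0), (1.0, 2.0, 1.0)]]
cons = [(0.5, 0.2, 0.1), (-0.3, 0.8, 0.0)]    # illustrative consequent parameters
print(anfis_forward(0.4, 0.7, prem, cons))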
$$A\Theta = y \qquad (14)$$
where $\Theta$ contains the unknown parameters in S2. This is a linear least-squares
problem, and the solution for $\Theta$ which minimizes $\|A\Theta - y\|$ is the
least-squares estimator:
$$\Theta^* = \left(A^T A\right)^{-1} A^T y \qquad (15)$$
We can also use a recursive least-squares estimator in the case of on-line training. For the
backward path (see Fig. 2), the error signals propagate backward. The premise parameters
are updated by a descent method, through minimizing the overall quadratic cost function:
$$J(\Theta) = \frac{1}{2} \sum_{k=1}^{N} \left[ y(k) - \hat{y}(k, \Theta) \right]^2 \qquad (16)$$
in a recursive manner with respect to $\Theta$ (S2). The update of the parameters in the
ith node in the Lth layer can be written as:
$$\hat{\Theta}_i^L(k) = \hat{\Theta}_i^L(k-1) + \eta \, \frac{\partial^+ E(k)}{\partial \hat{\Theta}_i^L(k)} \qquad (17)$$
$$\frac{\partial^+ E}{\partial \hat{\Theta}_i^L} = e_{L,i} \, \frac{\partial \hat{z}_{L,i}}{\partial \hat{\Theta}_i^L} \qquad (18)$$
with $\hat{z}_{L,i}$ being the node's output and $e_{L,i}$ the backpropagated error signal.
Figure 3 presents the ANFIS activities in each pass. As discussed earlier, the consequent
parameters thus identified are optimal under the condition that the premise parameters are
fixed.
The flow chart of the ANFIS training methodology is shown in Fig. 4. Usually, the
modeling process starts by obtaining a data set (input-output data pairs) and dividing it
into training and checking data sets. The training data constitute pairs of input and output
vectors. In order to make the data suitable for the training stage, they are normalized and
used as the inputs and outputs to train the ANFIS. Once both training and checking data
have been presented to the ANFIS, the FIS is selected to have the parameters associated
with the minimum checking data model error. The stopping criterion of the ANFIS is the
testing error falling below the tolerance limit defined at the beginning of the training
stage, or a constraint on the number of learning iterations.
4 Proposed Methods
Variations in electrical load are, among other things, time-of-day dependent, introducing
a dilemma for the forecaster: whether to partition the data and use a separate model for
each specified time of the day (the parallel approach), or use a single model (the
sequential approach) [14].
In this work, the electrical load time series is separated into autonomous points. A set of
independent points means that the load at each quarter-hour of the day is independent
from the load at any other quarter-hour. These are called parallel series. We propose three
alternatives. The first is to split the French quarter-hourly load data into 96 parallel series,
where each series is composed of the loads consumed at a specified time of a distinctive
day (Saturday, Sunday, etc.). In the second, the parallel series contain the loads from all
previous days consumed at a specified quarter-hour. In the third, the parallel load series
are classified into three categories: Saturdays, Sundays and workdays.
Data classification needs some knowledge, such as the identification of the first day in the
historical load data (Saturday, Sunday, …), the number of days in each month, and the
number of days available in the historical load data. By means of simple If-Then
statements, the parallel load series for each class can be extracted, as sketched below.
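A minimal sketch of this extraction, assuming the quarter-hourly records are stored chronologically in a flat array and the series starts on a known weekday (the names and sizes are illustrative, not the chapter's code):

import numpy as np

load = np.random.rand(96 * 328)           # ~328 days of quarter-hourly records
days = load.reshape(-1, 96)               # one row per day, one column per quarter-hour

q = 75                                     # e.g. the quarter-hour ending 19:00
series_all_days = days[:, q]              # 2nd classification: all previous days

first_weekday = 6                          # data starts on a Sunday (Mon=0 ... Sun=6)
weekday = (first_weekday + np.arange(days.shape[0])) % 7
series_saturdays = days[weekday == 5, q]  # 1st/3rd classification subsets
series_sundays = days[weekday == 6, q]
series_workdays = days[(weekday != 5) & (weekday != 6), q]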
Step 5: The input selected in the previous step is then used to generate and train a
Sugeno FIS with two fuzzy rules, two sigmoid membership functions and twenty
epochs.
Step 6: At last, the original input related to the selected input is used to predict the
load y(i).
Step 7: The two previous steps are then repeated in order to predict the desired loads
y(i + 1), y(i + 2) and y(i + 3).
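The following sketch illustrates Steps 5–7 for a single parallel series. Since no standard ANFIS routine is assumed here, a simple linear fit stands in for the two-rule Sugeno FIS, and the correlation-based input selection is an illustrative reading of the exhaustive search described above, not the authors' exact procedure.

import numpy as np

def one_step(series, max_lag=7):
    # Pick the lag whose values correlate most strongly with the target,
    # fit a small model on that lag, and predict the next point of the series.
    best_lag, best_r = 1, -np.inf
    for lag in range(1, max_lag + 1):
        r = abs(np.corrcoef(series[:-lag], series[lag:])[0, 1])
        if r > best_r:
            best_lag, best_r = lag, r          # input selection by correlation
    x, y = series[:-best_lag], series[best_lag:]
    a, b = np.polyfit(x, y, 1)                 # linear stand-in for the Sugeno FIS
    return a * series[-best_lag] + b

# One parallel series per quarter-hour: predicting the next four quarter-hours
# (four separate series/models) yields the 1-h-ahead forecast.
series = np.cumsum(np.random.randn(300))       # toy parallel series
print(one_step(series))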
We propose three ANFIS models in this paper:
• Method 1: the electrical load series in this method are obtained by implementing the
first classification.
• Method 2: the electrical load series in this method are obtained by implementing the
second classification.
• Method 3: the electrical load series in this method are obtained by implementing the
third classification.
All three methods are applied to the French real-time load data. These data consist of
quarter-hourly recordings ranging from Sunday 07 April 2013 until Friday 28 February
2014, where the last month is used for one-hour-ahead forecasting. The data used are
represented in Fig. 5.
The graphical user interface developed for all three methods is represented by
Fig. 6. The essential function of this tool is to ensure, at any quarter-hour selected
Fig. 5 Quarter-hourly French electric load time series from Sunday 07 April 2013 to 28
February 2014
Fig. 6 Developed forecasting tool for 1-h-ahead electric load forecasting using ANFIS
by the user, a 1-h-ahead demand prediction. In addition, when actual values of the load
are available, the performance of the forecasted loads can be verified using different error
measurement criteria. Moreover, by dividing the original data into 96 separate load series,
the tool has the advantage of reducing the number of data points that must be taken into
consideration before predicting the load at the specified hour, thereby reducing
computational time.
To evaluate and compare the performance of the newly proposed methods, forecasts are
carried out over the month of February 2014. For each day in the selected month, the first
steps in the 1-h-ahead prediction are 00:15, 01:15, 02:15, … until 23:15. Forecasts by
Method 3 in the field "11:15 p.m. to 00:00" are based on the first classification.
The results of the three methods are represented in Figs. 7, 8 and 9. As shown in these
figures, the proposed ANFIS models have successfully predicted the load over the month
of February 2014, and there is almost no difference between the predicted and real load.
To evaluate the performance of the developed models, we have used the APE (absolute
percentage error), MAPE (mean absolute percentage error) and RMSE (root mean square
error) criteria. Evaluation results are summarized in Table 1.
$$APE = \frac{|\hat{y}_t - y_t|}{y_t} \times 100 \qquad (19)$$
Fig. 7 One-hour ahead forecasted load versus real load for method 1
Fig. 8 One-hour ahead forecasted load versus real load for method 2
Fig. 9 One-hour ahead forecasted load versus real load for method 3
$$MAPE = \frac{1}{n} \sum_{t=1}^{n} \frac{|\hat{y}_t - y_t|}{y_t} \times 100 \qquad (20)$$
$$RMSE = \sqrt{\frac{1}{n} \sum_{t=1}^{n} \left(\hat{y}_t - y_t\right)^2} \qquad (21)$$
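These criteria translate directly into code. The toy values below reproduce the Method 1 figures for the first hour of 1 February 2014 in Table 2 (MAPE ≈ 0.2322, RMSE ≈ 205.12); only the function names are our own.

import numpy as np

def ape(y_hat, y):
    return 100.0 * np.abs(y_hat - y) / y              # Eq. (19), per point

def mape(y_hat, y):
    return float(np.mean(ape(y_hat, y)))              # Eq. (20)

def rmse(y_hat, y):
    return float(np.sqrt(np.mean((y_hat - y) ** 2)))  # Eq. (21)

y     = np.array([68270.0, 66626.0, 65407.0, 64252.0])  # real loads (MW)
y_hat = np.array([68264.0, 66653.0, 65074.0, 64490.0])  # predicted loads (MW)
print(mape(y_hat, y), rmse(y_hat, y))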
As shown in Table 1, all three methods achieve high accuracy in 1-h-ahead load
forecasting. We can perceive that the accuracy of the second method decreases on free
days compared to working days; this can be justified by the fact that this model uses the
loads from a specific quarter-hour of all previous days in the historical load data. Here, we
should note that the load on Saturday and Sunday is very low compared to working days.
For example, if we would like to predict the load at a specified quarter-hour of a Saturday
using the second method, then the latest values of the parallel load series contain loads
from the previous nearest Monday until the previous day (a Friday). These values affect
the prediction because they are high compared to the desired load on Saturday. This can
be clearly observed in the first method, which is based on an intraday classification where
the parallel series contains the load consumed at a specified quarter-hour of a typical day
(Monday, Tuesday, …): the accuracy of prediction on Saturdays and Sundays is not
different from that on other days.
However, what is impressive is that the number of data points that must be taken into
account in the second method is seven times higher than that used in the first method,
while the obtained results clearly show that the first method is more accurate than the
second. This demonstrates that, in addition to the selection of an appropriate forecasting
technique, classifying the historical data to extract useful knowledge and patterns from a
large database also affects the forecasting accuracy. Moreover, by classifying the data in
the third method into three clusters (Saturdays, Sundays and working days), the accuracy
is increased, and it is superior to that of the first and second methods.
Figure 10 represents the distribution of the maximum percentage error for the three
methods. We can perceive that the proposed ANFIS methods have failed to predict the
peak consumption around 19:00 with high accuracy, which creates a real need in this field
to propose a separate model for predicting the peak consumption, or to train the ANFIS
with more than one input. However, as shown in Fig. 11, for the third proposed method,
56 % of the forecasted loads have an APE under 0.5, and an APE under one was achieved
for about 80 % of cases. Likewise, as demonstrated by Figs. 12 and 13, the first and
second methods also provide good accuracy most of the time.
In addition to a robust model that assures very high accuracy, the time required by the
forecasting procedure plays an important role in real-time electric load forecasting.
Table 2 shows in detail, for all three methods, prediction results for four different hours.
Results were obtained using Windows 7 64-bit and MATLAB R2013a on a
Table 1 One-hour-ahead forecasting accuracy of the three proposed methods over the month of
February 2014. Weekly values appear on the first day of each week (Saturday) and monthly
values on the first row.

Method 1
Date             MAPE (day)  MAPE (week)  MAPE (month)  RMSE (day)  RMSE (week)  RMSE (month)
Sat 01 Feb 2014  0.6117      0.7193       0.7432        614.3959    769.0680     794.52
Sun 02 Feb 2014  0.9875                                 860.3865
Mon 03 Feb 2014  0.7229                                 798.3312
Tue 04 Feb 2014  0.6374                                 761.9828
Wed 05 Feb 2014  0.7088                                 770.9580
Thu 06 Feb 2014  0.6188                                 704.2705
Fri 07 Feb 2014  0.7481                                 845.0963
Sat 08 Feb 2014  0.5808      0.6460                     601.0464    733.7181
Sun 09 Feb 2014  0.7268                                 783.3347
Mon 10 Feb 2014  0.7171                                 779.3326
Tue 11 Feb 2014  0.7159                                 798.9643
Wed 12 Feb 2014  0.6148                                 687.2521
Thu 13 Feb 2014  0.6003                                 802.7999
Fri 14 Feb 2014  0.5661                                 656.5204
Sat 15 Feb 2014  0.6898      0.7493                     715.4786    833.9910
Sun 16 Feb 2014  0.7259                                 747.6218
Mon 17 Feb 2014  0.9867                                 1076.7
Tue 18 Feb 2014  0.8307                                 977.6372
Wed 19 Feb 2014  0.6776                                 758.8888
Thu 20 Feb 2014  0.6761                                 748.8020
Fri 21 Feb 2014  0.6580                                 739.0911
Sat 22 Feb 2014  0.7332      0.8583                     701.0428    836.5079
Sun 23 Feb 2014  1.0665                                 949.4138
Mon 24 Feb 2014  0.9972                                 991.8403
Tue 25 Feb 2014  0.8315                                 787.2473
Wed 26 Feb 2014  0.7654                                 726.2326
Thu 27 Feb 2014  0.7744                                 790.2337
Fri 28 Feb 2014  0.8396                                 866.0148

Method 2
Date             MAPE (day)  MAPE (week)  MAPE (month)  RMSE (day)  RMSE (week)  RMSE (month)
Sat 01 Feb 2014  1.0218      0.7877       0.8404        878.28      817.96       862.22
Sun 02 Feb 2014  1.2399                                 1102.7
Mon 03 Feb 2014  0.6262                                 717.08
Tue 04 Feb 2014  0.6081                                 681.86
Wed 05 Feb 2014  0.6640                                 774.39
Thu 06 Feb 2014  0.5958                                 699.53
Fri 07 Feb 2014  0.7583                                 792.49
Sat 08 Feb 2014  0.8091      0.7372                     656.40      779.67
Sun 09 Feb 2014  1.1295                                 1052.8
Mon 10 Feb 2014  0.7075                                 769.91
Tue 11 Feb 2014  0.6553                                 721.39
Wed 12 Feb 2014  0.6556                                 720.37
Thu 13 Feb 2014  0.6210                                 831.08
Fri 14 Feb 2014  0.5823                                 627.08
Sat 15 Feb 2014  0.9259      0.8790                     783.74      891.84
Sun 16 Feb 2014  1.4201                                 1291.7
Mon 17 Feb 2014  0.9973                                 997.71
Tue 18 Feb 2014  0.7036                                 795.45
Wed 19 Feb 2014  0.6995                                 750.08
Thu 20 Feb 2014  0.7456                                 765.08
Fri 21 Feb 2014  0.6607                                 713.34
Sat 22 Feb 2014  0.9180      0.9576                     821.48      949.38
Sun 23 Feb 2014  1.7065                                 1465.7
Mon 24 Feb 2014  1.1547                                 1037.8
Tue 25 Feb 2014  0.7563                                 710.35
Wed 26 Feb 2014  0.7103                                 793.33
Thu 27 Feb 2014  0.7671                                 819.81
Fri 28 Feb 2014  0.6902                                 776.58
Table 1 (continued)

Method 3
Date             MAPE (day)  MAPE (week)  MAPE (month)  RMSE (day)  RMSE (week)  RMSE (month)
Sat 01 Feb 2014  0.6117      0.6846       0.6966        614.39      733.23       748.10
Sun 02 Feb 2014  0.9875                                 860.38
Mon 03 Feb 2014  0.5412                                 690.52
Tue 04 Feb 2014  0.6077                                 695.24
Wed 05 Feb 2014  0.7280                                 804.35
Thu 06 Feb 2014  0.6035                                 694.97
Fri 07 Feb 2014  0.7123                                 745.31
Sat 08 Feb 2014  0.5808      0.6157                     601.04      708.35
Sun 09 Feb 2014  0.7268                                 783.33
Mon 10 Feb 2014  0.6004                                 690.21
Tue 11 Feb 2014  0.6115                                 721.01
Wed 12 Feb 2014  0.6277                                 711.93
Thu 13 Feb 2014  0.5954                                 811.52
Fri 14 Feb 2014  0.5672                                 612.99
Sat 15 Feb 2014  0.6898      0.6879                     715.47      757.18
Sun 16 Feb 2014  0.7259                                 747.62
Mon 17 Feb 2014  0.8994                                 1010.3
Tue 18 Feb 2014  0.6518                                 767.93
Wed 19 Feb 2014  0.5989                                 685.11
Thu 20 Feb 2014  0.6183                                 664.92
Fri 21 Feb 2014  0.6313                                 648.38
Sat 22 Feb 2014  0.7332      0.7982                     701.04      791.16
Sun 23 Feb 2014  1.0665                                 949.41
Mon 24 Feb 2014  0.8746                                 872.60
Tue 25 Feb 2014  0.6531                                 619.95
Wed 26 Feb 2014  0.7518                                 758.67
Thu 27 Feb 2014  0.7600                                 814.72
Fri 28 Feb 2014  0.7484                                 776.89
laptop with 4 GB of RAM, an Intel i3 380M processor and a 5,400-rpm hard drive. The
results confirm the superior accuracy of the third proposed method. In addition, the time
needed for the forecasting procedure is less than two seconds. This time includes the
exhaustive search performed to select the most appropriate input for training the ANFIS,
and the four ANFIS runs corresponding to each quarter-hour load series.
Moreover, the accuracy decreases when the proposed methods are used to forecast the
peak consumption at 19:00. For example, the MAPE rises from 0.23 % at the beginning
hour of 1 February 2014 to 1.38 % in the field 18:15–19:00 of the last day of February
2014. This is more clearly perceived in Figs. 8, 9, 10 and 12, where the maximum APEs
(between 3 and 8 % in the first and third methods, and between 3 and 10 % in the second
method) occur around 18:15–19:00. This decrease can be justified by the
non-consideration of weather conditions in the proposed methods. As we know, changing
weather conditions represent the major source of variation in peak load forecasting, and
the inclusion of temperature has a significant effect due to the fact that in winter heating
systems are used especially in the evening around 19:00, whilst in summer air
conditioning appliances are used particularly around 13:00. Other weather factors include
relative humidity, wind speed and nebulosity. Therefore, numerous papers are devoted to
electricity peak demand forecasting [13, 42]. However, since weather variables tend to
change in a smooth fashion, weather conditions are ignored in very short-term load
forecasting
Table 2 Forecasting results for four different hours with all three methods

Method 1
Time          Real load (MW)  Predicted load (MW)  APE (%)  MAPE (%)  RMSE (MW)  Elapsed time (s)
01 Feb 00:15  68,270          68,264               0.0087   0.2322    205.12     1.2480
       00:30  66,626          66,653               0.0405
       00:45  65,407          65,074               0.5091
       01:00  64,252          64,490               0.3704
10 Feb 06:15  62,739          62,532               0.3299   0.7238    508.82     1.0920
       06:30  64,621          63,912               1.097
       06:45  65,548          65,953               0.6178
       07:00  67,153          67,724               0.8502
19 Feb 12:15  70,632          70,782               0.2123   1.5312    1435.65    1.1380
       12:30  70,245          70,124               0.1722

Method 2
Time          Real load (MW)  Predicted load (MW)  APE (%)  MAPE (%)  RMSE (MW)  Elapsed time (s)
01 Feb 00:15  68,270          68,597               0.4789   0.3141    262.74     1.4670
       00:30  66,626          66,593               0.0495
       00:45  65,407          65,343               0.0978
       01:00  64,252          64,657               0.6303
10 Feb 06:15  62,739          62,062               1.0790   1.9104    1292.83    1.4510
       06:30  64,621          63,127               2.3119
       06:45  65,548          64,219               2.0275
       07:00  67,153          65,660               2.2232
19 Feb 12:15  70,632          71,275               0.9103   1.0796    767.08     1.3730
       12:30  70,245          70,964               1.0235
Method 3
Time          Real load (MW)  Predicted load (MW)  APE (%)  MAPE (%)  RMSE (MW)  Elapsed time (s)
01 Feb 00:15  68,270          68,264               0.0087   0.2322    205.12     1.2480
       00:30  66,626          66,653               0.0405
       00:45  65,407          65,074               0.5091
       01:00  64,252          64,490               0.3704
10 Feb 06:15  62,739          62,404               0.5339   0.57799   457.483    1.4660
       06:30  64,621          63,816               1.2457
       06:45  65,548          65,285               0.4012
       07:00  67,153          67,065               0.1310
19 Feb 12:15  70,632          70,987               0.5026   0.6617    487.69     1.2950
       12:30  70,245          70,640               0.5623
       12:45  70,960          71,354               0.5552
       13:00  69,827          70,544               1.0268
28 Feb 18:15  68,734          68,793               0.0858   1.3793    1366.3     1.2640
       18:30  70,121          69,872               0.3551
       18:45  72,242          70,866               1.9047
       19:00  74,000          71,653               3.1716
and they could be captured in the demand series itself. It would therefore be more
appropriate to propose a separate model for predicting the peak consumption.
6 Conclusion
In this paper, three new models based on the use of the adaptive neuro-fuzzy inference
system technique on parallel data were developed to forecast the French real-time
quarter-hourly load on a 1-h-ahead basis. The best ANFIS technique found was the third,
which classifies the parallel load series into three categories. We have observed that the
proposed ANFIS methods sometimes failed to predict the peak consumption around
19:00, which creates a real need in this field to propose a separate model for predicting
the peak consumption, or to train the ANFIS with more than one input. However, for the
third method, 56 % of the forecasted loads have an APE under 0.5, and an APE under one
was achieved for about 80 % of cases. Therefore, with the exception of the peak
consumption, the third proposed method can be successfully applied to 1-h-ahead electric
load prediction in real time.
References
1. Amit, J., Srinivas, E., Rasmimayee, R.: Short term load forecasting using fuzzy adaptive
inference and similarity. World Congress on Nature and Biologically Inspired Computing,
pp. 1743–1748. NaBIC, Coimbatore India (2009)
2. Azar, A.T.: Adaptive neuro-fuzzy systems. In: Azar, A.T (ed.), Fuzzy Systems. InTech,
Vienna, Austria, (2010a) ISBN 978-953-7619-92-3
3. Azar, A.T.: Fuzzy Systems. IN-TECH, Vienna, Austria (2010). ISBN 978-953-7619-92-3
4. Box, G.E.P., Jenkins, J.M.: Time Series Analysis: Forecasting and Control. Holden-Day, San
Francisco (1976)
5. Charytoniuk, W., Chen, M.S.: Very short-term load forecasting using artificial neural
networks. IEEE Trans. Power Syst. 15(1), 263–268 (2000)
6. Cheah, P.H., Gooi, H.B., Soo, F.L.: Quarter-hour-ahead load forecasting for microgrid energy
management system. In: IEEE Trondheim PowerTech, Trondheim, 19–23 June 2011, pp. 1–6
(2011)
7. Chen, D., York, M.: Neural network based very short term load prediction, In: IEEE PES
Transmission & Distribution Conference and Exposition, Chicago IL, 21–24 April 2008,
pp. 1–9 (2008)
8. Daneshi, H., Daneshi, A.: Real time load forecast in power system. In: Third International
Conference on Electric Utility Deregulation and Restructuring and Power Technologies,
DRPT2008, Nanjuing China, 6–9 April 2008 pp. 689–695 (2008)
9. de Andrade, L.C.M., da Silva, I.N.: Very short-term load forecasting based on ARIMA model
and intelligent systems. In: 15th International Conference on Intelligent System Applications
to Power Systems, ISAP ‘09, 8-12 Nov, Curitiba, pp. 1–6 (2009)
10. de Andrade, L.C.M., da Silva, I.N.: Very short-term load forecasting using a hybrid neuro-
fuzzy approach. In: Eleventh Brazilian Symposium on Neural Networks (SBRN), 23–28 Oct,
Sao Paulo, pp. 115–120 (2010a)
11. de Andrade, L.C.M., da Silva, I.N.: Using intelligent system approach for very short-term load
forecasting purposes. In: IEEE International Energy Conference, 18–22 Dec. 2010, Manama,
pp. 694–699 (2010b)
12. El-Telbany, M.: Short-term forecasting of Jordanian electricity demand using particle swarm
optimization. Electr. Power Syst. Res. 78(3), 425–433 (2008)
13. Fan, S., Mao, C., Chen, L.: Peak load forecasting using the self-organizing map. In: Advances
in Neural Networks—ISNN 2005, Springer Berlin Heidelberg. New York, Part III,
pp. 640–647 (2005)
14. Fay, D., Ringwood, J.V., Condon, M., Kelly, M.: 24-h electrical load data—a sequential or
partitioned time series? Neurocomputing 55(3–4), 469–498 (2003)
15. Guan, C., Luh, P.B., Coolbeth, M.A., Zhao, Y., Michel, L.D., Chen, Y., Manville, C. J.,
Friedland, P.B., Rourke, S.J.: Very short-term load forecasting: multilevel wavelet neural
networks with data pre-filtering. In Proceeding of: Power and Energy Society General
Meeting, 2009. Calgary, pp. 1–8 (2009)
16. Guan, C., Luh, P.B., Michel, L.D., Coolbeth, M.A., Friedland, P.B.: Hybrid Kalman
algorithms for very short-term load forecasting and confidence interval estimation. In: IEEE
Power and Energy Society General Meeting, 25–29 July 2010, Minneapolis, MN, pp 1–8
(2010)
17. Guan, C., Luh, P.B., Michel, L.D., Wang, Y., Friedland, P.B.: Very short-term load
forecasting: wavelet neural networks with data pre-filtering. IEEE Trans. Power Syst. 28(1),
30–41 (2013)
18. Guan, C., Luh, P.B., Michel, L.D., Chi, Z.: Hybrid Kalman filters for very short-term load
forecasting and prediction interval estimation. IEEE Trans. Power Syst. 28(4), 3806–3817
(2013)
19. Hagan, M.T., Behr, S.M.: The time series approach to short term load forecasting. IEEE Trans.
Power Syst. 2(3), 785–791 (1987)
20. Hesham, K.: Electric load forecasting: Literature survey and classification of methods. Int.
J. Syst. Sci. 33(1), 23–34 (2002)
21. Jang, J.S.R.: ANFIS: Adaptive network based fuzzy inference system. IEEE Trans. Syst.,
Man, Cybern. 23(3), 665–685 (1993)
22. Jang, J.-S.R., Sun, C.-T.: Neuro-fuzzy modeling and control. Proc. IEEE 83(3), 378–406
(1995)
23. Jang, J.S.R., Sun, C.T., Mizutani, E.: Neuro-Fuzzy and Soft Computing: A Computational
Approach to Learning and Machine Intelligence, pp. 353–360. Prentice-Hall, Englewood
Cliffs, NJ (1997)
24. Khan, G.M., Zafari, F., Mahmud, S.A.: Very short term load forecasting using cartesian
genetic programming evolved recurrent neural networks (CGPRNN). In: 2013 12th
International Conference on Machine Learning and Applications (ICMLA), 4–7 Dec. 2013,
Miami, FL, USA, vol 2, pp. 152–155 (2013)
25. Kotillová, A.: Very Short-Term Load Forecasting Using Exponential Smoothing and ARIMA
Models. J. Inf., Control Manage. Syst. 9(2), 85–92 (2011)
26. Kumar, M.: Short-term load forecasting using artificial neural network techniques. Thesis for
Master of Science degree in Electrical Engineering. India, Rourkela, National Institute of
Technology (2009)
27. Liu, K., Subbarayan, S., Shoults, R.R., Manry, M.T., Kwan, C., Lewis, F.L., Naccarino, J.:
Comparison of very short-term load forecasting techniques. IEEE Trans. Power Syst. 11(2),
877–882 (1996)
28. Mordjaoui, M., Chabane, M., Boudjema, B., Daira, R.: Qualitative ferromagnetic hysteresis
Modeling. J. Comput. Sci. 3(6), 399–405 (2007)
29. Mordjaoui, M., Boudjema, B., Bouabaz, M., Daira, R.: Short-term electric load forecasting
using neuro-fuzzy modeling for nonlinear system identification. In: Proceeding of the 3rd
Conference on Nonlinear Science and Complexity, Jul. 28–31, Ankara, Turkey, link: (2010)
http://nsc10.cankaya.edu.tr/proceedings/, paper ID_64
30. Mordjaoui, M., Boudjema, B.: Forecasting and modeling electricity demand using anfis
predictor. J. Math. Statis. 7(4), 275–281 (2011)
31. Neusser, L., Canha, L.N., Abaide, A., Finger, M.: Very short-term load forecast for demand
side management in absence of historical data. In: International Conference on Renewable
Energies and Power Energies and Power Quality (ICREPQ’12), 28–30th March, Santiago de
Compostela (Spain), (2012) link: http://www.icrepq.com/icrepq%2712/479-neusser.pdf
32. Palit A.K., Popovic D.: Computational intelligence in time series forecasting: theory and
engineering applications. In: Series: Advance in Industrial Control Springer, New York, Inc.
Secaucus, NJ, USA (2005)
33. Palit, A.K., Anheier, W., Popovic, D.: Electrical load forecasting using a neural-fuzzy
approach. In: Natural Intelligence for Scheduling, Planning and Packing Problems, Studies in
Computational Intelligence, Springer, Berlin Heidelberg, volume 250, pp. 145–173 (2009)
34. Papalexopoulos, A.D., Hesterberg, T.C.: A regression-based approach to short-term system
load forecasting. IEEE Trans. Power Syst. 5(4), 1535–1547 (1990)
35. Park, D.C., El-Sharkawi, M.A., Marks II, R.J., Atlas, L.E., Damborg, M.J.: Electric load
forecasting using an artificial neural network. IEEE Trans. Power Syst. 6(2), 442–449 (1991)
36. Qingle, P., Min, Z.: Very short-term load forecasting based on neural network and rough set.
In: 2010 International Conference on Intelligent Computation Technology and Automation,
11-12 May 2010, Changsha, volume 3, pp. 1132–1135
37. Ramiro, S.: Very short-term load forecasting using exponential smoothing. In: Engineering
Universe for Scientific Research and Management, volume 3(5) (2011) Link: http://www.
eusrm.com/PublishedPaper/3Vol/Issue5/20112011eusrm03051037.pdf
38. Setiawan, A., Koprinska, I., Agelidis, V.G.: Very short-term electricity load demand
forecasting using support vector regression. In: Proceedings of International Joint Conference
on Neural Networks, 14–19 June 2009, Atlanta, Georgia, pp. 2888–289 (2009)
39. Shamsollahi, P., Cheung, K.W., Quan, C., Germain, E.H.: A neural network based very short
term load forecaster for the interim ISO New England electricity market system. In: 22nd
IEEE Power Engineering Society International Conference on PICA. Innovative Computing
for Power—Electric Energy Meets the Market, 20–24 May 2001, Sydney, NSW, pp. 217–222
(2001)
40. Shankar, R., Chatterjee, K., Chatterjee, T.K.: A very short-term load forecasting using Kalman
filter for load frequency control with economic load dispatch. J. Eng. Sci. Technol., Rev. 5(1),
97–103 (2012)
41. Sumathi, S., Surekha, P.: Computational intelligence paradigms theory and applications using
MATLAB. Taylor and Francis Group, LLC (2010)
42. Sigauke, C., Chikobvu, D.: Daily peak electricity load forecasting in South Africa using a
multivariate non-parametric regression approach. ORiON: J. ORSSA, 26(2), pp. 97–111
(2010)
43. Taylor, J.W.: An evaluation of methods for very short-term load forecasting using minute-by-
minute British data. Int. J. Forecast. 24(4), 645–658 (2008)
44. Taylor, J.W.: Short-term load forecasting with exponentially weighted methods. IEEE Trans.
Power Syst. 27, 458–464 (2012)
45. Trudnowski, D.J., McReynolds, W.L., Johnson, J.M.: Real-time very short-term load
prediction for power-system automatic generation control. IEE Trans. Control Syst. Technol. 9
(2), 254–260 (2001)
46. Vapnik, V.: The nature of statistic learning theory. Springer-Verlag New York, Inc., New
York, NY, USA (1995)
47. Vapnik, V.: Statistical learning theory. Wiley, Inc., New York (1998)
48. Zhao, F., Su, H.: Short-term load forecasting using Kalman filter and Elman neural network.
In: 2nd IEEE Conference on Industrial Electronics and Applications, 23–25 May 2007,
Harbin, pp. 1043–1047 (2007)
A Computational Intelligence
Optimization Algorithm Based
on the Behavior of the Social-Spider
Abstract Classical optimization methods often face great difficulties while dealing
with several engineering applications. Under such conditions, the use of compu-
tational intelligence approaches has been recently extended to address challenging
real-world optimization problems. On the other hand, the interesting and exotic collective behavior of social insects has fascinated and attracted researchers for many years. The collaborative swarming behavior observed in these groups provides survival advantages, where insect aggregations of relatively simple and
“unintelligent” individuals can accomplish very complex tasks using only limited
local information and simple rules of behavior. Swarm intelligence, as a compu-
tational intelligence paradigm, models the collective behavior in swarms of insects
or animals. Several algorithms arising from such models have been proposed to
solve a wide range of complex optimization problems. In this chapter, a novel
swarm algorithm called the Social Spider Optimization (SSO) is proposed for
solving optimization tasks. The SSO algorithm is based on the simulation of
cooperative behavior of social-spiders. In the proposed algorithm, individuals
emulate a group of spiders which interact with each other based on the biological laws
of the cooperative colony. The algorithm considers two different search agents
(spiders): males and females. Depending on gender, each individual is conducted by
a set of different evolutionary operators which mimic different cooperative
behaviors that are typically found in the colony. In order to illustrate the proficiency
and robustness of the proposed approach, it is compared to other well-known
evolutionary methods. The comparison examines several standard benchmark
functions that are commonly considered within the literature of evolutionary
algorithms. The outcome shows a high performance of the proposed method for
searching a global optimum with several benchmark functions.
Keywords Swarm algorithms · Global optimization · Bio-inspired algorithms · Computational intelligence · Evolutionary algorithms · Metaheuristics
1 Introduction
In particular, insect colonies and animal groups provide a rich set of metaphors for designing swarm optimization algorithms. Such cooperative entities are complex systems composed of individuals with different cooperative tasks, where each member tends to reproduce specialized behaviors depending on its gender [4]. However, most swarm algorithms model individuals as unisex entities that perform virtually the same behavior. Under such circumstances, algorithms waste the possibility of adding new and selective operators as a result of considering individuals with different characteristics such as sex, task-responsibility, etc. These operators could incorporate computational mechanisms to improve several important algorithm characteristics, including population diversity and searching capacities.
Although PSO and ABC are the most popular swarm algorithms for solving complex optimization problems, they present serious flaws such as premature convergence and difficulty to overcome local minima [35, 36]. The cause of such problems is associated with the operators that modify individual positions. In such algorithms, during their evolution, the position of each agent for the next iteration is updated yielding an attraction towards the position of the best particle seen so-far (in the case of PSO) or towards other randomly chosen individuals (in the case of ABC). As the algorithm evolves, those behaviors cause the entire population to concentrate around the best particle or to diverge without control, which favors premature convergence or damages the exploration–exploitation balance [3, 37].
The interesting and exotic collective behavior of social insects has fascinated and attracted researchers for many years. The collaborative swarming behavior observed in these groups provides survival advantages, where insect aggregations of relatively simple and “unintelligent” individuals can accomplish very complex tasks using only limited local information and simple rules of behavior [11]. Social-
spiders are a representative example of social insects [22]. A social-spider is a
spider species whose members maintain a set of complex cooperative behaviors
[32]. Whereas most spiders are solitary and even aggressive toward other members
of their own species, social-spiders show a tendency to live in groups, forming
long-lasting aggregations often referred to as colonies [1]. In a social-spider colony,
each member, depending on its gender, executes a variety of tasks such as preda-
tion, mating, web design, and social interaction [1, 6]. The web is an important
part of the colony because it is not only used as a common environment for all
members, but also as a communication channel among them [23]. Therefore,
important information (such as trapped preys or mating possibilities) is transmitted
by small vibrations through the web. Such information, considered as a local
knowledge, is employed by each member to conduct its own cooperative behavior,
influencing simultaneously the social regulation of the colony.
In this paper, a novel swarm algorithm, called the Social Spider Optimization (SSO), is proposed for solving optimization tasks. The SSO algorithm is based on the simulation of the cooperative behavior of social-spiders. In the proposed algorithm, individuals emulate a group of spiders which interact with each other based on the biological laws of the cooperative colony. The algorithm considers two different search agents (spiders): males and females. Depending on gender, each individual is conducted by a set of different evolutionary operators which mimic the different cooperative behaviors typically found in the colony.
2 Biological Fundamentals
Social insect societies are complex cooperative systems that self-organize within a
set of constraints. Cooperative groups are better at manipulating and exploiting their
environment, defending resources and brood, and allowing task specialization
among group members [15, 25]. A social insect colony functions as an integrated
unit that not only possesses the ability to operate in a distributed manner, but also to
undertake enormous construction of global projects [14]. It is important to
acknowledge that global order in social insects can arise as a result of internal
interactions among members.
A few species of spiders have been documented exhibiting a degree of social
behavior [22]. The behavior of spiders can be generalized into two basic forms:
solitary spiders and social spiders [1]. This classification is made based on the level
of cooperative behavior that they exhibit [6]. On one side, solitary spiders create and maintain their own web while living in scarce contact with other individuals of the same species. In contrast, social spiders form colonies that remain together over a
communal web with close spatial relationship to other group members [23].
A social spider colony is composed of two fundamental components: its
members and the communal web. Members are divided into two different catego-
ries: males and females. An interesting characteristic of social-spiders is the highly
female-biased population. Some studies suggest that the number of male spiders
barely reaches 30 % of the total colony members [1, 2]. In the colony, each member, depending on its gender, cooperates in different activities such as building
and maintaining the communal web, prey capturing, mating and social contact (Yip
2008). Interactions among members are either direct or indirect [29]. Direct inter-
actions imply body contact or the exchange of fluids such as mating. For indirect
interactions, the communal web is used as a “medium of communication” which
A Computational Intelligence Optimization Algorithm … 127
conveys important information that is available to each colony member [23]. This
information encoded as small vibrations is a critical aspect for the collective
coordination among members (Yip 2008). Vibrations are employed by the colony
members to decode several messages such as the size of the trapped preys, char-
acteristics of the neighboring members, etc. The intensity of such vibrations depends on the weight and distance of the spiders that have produced them.
In spite of the complexity, all the cooperative global patterns in the colony level
are generated as a result of internal interactions among colony members [12]. Such
internal interactions involve a set of simple behavioral rules followed by each spider
in the colony. Behavioral rules are divided into two different classes: social inter-
action (cooperative behavior) and mating [30].
As a social insect, spiders perform cooperative interaction with other colony
members. The way in which this behavior takes place depends on the spider gender.
Female spiders, which show a major tendency to socialize, present an attraction or dislike towards others, irrespective of gender [1]. For a particular female spider, such
attraction or dislike is commonly developed over other spiders according to their
vibrations which are emitted over the communal web and represent strong colony
members (Yip 2008). Since the vibrations depend on the weight and distance of the
members which provoke them, stronger vibrations are produced either by big spiders
or neighboring members [23]. The bigger a spider is, the better it is considered as a
colony member. The final decision of attraction or dislike over a determined member
is taken according to an internal state which is influenced by several factors such as
reproduction cycle, curiosity and other random phenomena (Yip 2008).
Unlike female spiders, the behavior of male members is reproductive-oriented [26]. Male spiders recognize themselves as a subgroup of alpha males
which dominate the colony resources. Therefore, the male population is divided
into two classes: dominant and non-dominant male spiders [26]. Dominant male
spiders have better fitness characteristics (normally size) in comparison to non-
dominant. In a typical behavior, dominant males are attracted to the closest female
spider in the communal web. In contrast, non-dominant male spiders tend to con-
centrate upon the center of the male population as a strategy to take advantage of
the resources wasted by dominant males [33].
Mating is an important operation that not only assures the colony survival, but also
allows the information exchange among members. Mating in a social-spider colony
is performed by dominant males and female members [16]. Under such circum-
stances, when a dominant male spider locates one or more female members within a
specific range, it mates with all the females in order to produce offspring [8].
3 The SSO Algorithm

In this paper, the operational principles from the social-spider colony have been used as guidelines for developing a new swarm optimization algorithm. The SSO assumes that the entire search space is a communal web, where all the social-spiders interact with each other. In the proposed approach, each solution within the search
space represents a spider position in the communal web. Every spider receives a
weight according to the fitness value of the solution that is symbolized by the
social-spider. The algorithm models two different search agents (spiders): males and
females. Depending on gender, each individual is conducted by a set of different
evolutionary operators which mimic different cooperative behaviors that are com-
monly assumed within the colony.
An interesting characteristic of social-spiders is the highly female-biased pop-
ulations. In order to emulate this fact, the algorithm starts by defining the number of
female and male spiders that will be characterized as individuals in the search
space. The number of females Nf is randomly selected within the range of 65–90 %
of the entire population N. Therefore, N_f is calculated by the following equation:

N_f = floor[(0.9 − rand · 0.25) · N]    (1)

where rand is a random number between [0,1] whereas floor(·) maps a real number to an integer number. The number of male spiders N_m is computed as the complement between N and N_f. It is calculated as follows:

N_m = N − N_f    (2)
In the biological metaphor, the spider size is the characteristic that evaluates the
individual capacity to perform better over its assigned tasks. In the proposed
approach, every individual (spider) receives a weight wi which represents the solu-
tion quality that corresponds to the spider i (irrespective of gender) of the population
S. In order to calculate the weight of every spider, the following equation is used:

w_i = (J(s_i) − worst_S) / (best_S − worst_S)    (3)

where J(s_i) is the fitness value obtained by the evaluation of the spider position s_i with regard to the objective function J(·). The values worst_S and best_S are defined as follows (considering a maximization problem):

best_S = max_{k∈{1,2,...,N}} J(s_k),   worst_S = min_{k∈{1,2,...,N}} J(s_k)    (4)
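As an illustration (not part of the original chapter), the weight assignment of Eqs. (3)–(4) can be sketched in a few lines of NumPy; the function name weights is a hypothetical choice:

    import numpy as np

    def weights(J):
        # Spider weights from raw fitness values (Eq. 3, maximization).
        # Assumes at least two distinct fitness values, so best != worst.
        best, worst = J.max(), J.min()
        return (J - worst) / (best - worst)

    # Example: the best spider receives weight 1.0, the worst 0.0.
    print(weights(np.array([2.0, 5.0, 3.5])))   # -> [0.  1.  0.5]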
The vibrations perceived by a spider i as a result of the information transmitted by a member j are modeled in terms of the weight of j and the distance between both spiders:

Vib_{i,j} = w_j · e^{−d_{i,j}^2}    (5)

where d_{i,j} is the Euclidian distance between the spiders i and j, such that d_{i,j} = ||s_i − s_j||.
Although it is virtually possible to compute perceived-vibrations by considering
any pair of individuals, three special relationships are considered within the SSO
approach:
1. Vibrations Vibc_i are perceived by the individual i (s_i) as a result of the information transmitted by the member c (s_c), who is an individual that has two important characteristics: it is the nearest member to i and possesses a higher weight in comparison to i (w_c > w_i):

Vibc_i = w_c · e^{−d_{i,c}^2}    (6)

2. Vibrations Vibb_i are perceived by the individual i (s_i) as a result of the information transmitted by the member b (s_b), with b being the individual holding the best weight (best fitness value) of the entire population S:

Vibb_i = w_b · e^{−d_{i,b}^2}    (7)
3. Vibrations Vibf_i are perceived by the individual i (s_i) as a result of the information transmitted by the member f (s_f), with f being the nearest female individual to i:
Vibf_i = w_f · e^{−d_{i,f}^2}    (8)
Figure 1 shows the configuration of each special relationship: (a) Vibc_i, (b) Vibb_i and (c) Vibf_i.
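A minimal sketch of Eqs. (5)–(8) in NumPy (the helper name vibration is an assumption, not from the chapter): the same kernel serves Vibc_i, Vibb_i and Vibf_i, only the transmitting member j changes.

    import numpy as np

    def vibration(w_j, s_i, s_j):
        # Vib_{i,j} = w_j * exp(-d_{i,j}^2), d the Euclidean distance (Eq. 5).
        d = np.linalg.norm(s_i - s_j)
        return w_j * np.exp(-d ** 2)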
Like other evolutionary algorithms, the SSO is an iterative process whose first step is to randomly initialize the entire population (female and male). The algorithm begins by initializing the set S of N spider positions. Each spider position, f_i or m_i, is an n-dimensional vector containing the parameter values to be optimized. Such values are randomly and uniformly distributed between the pre-specified lower initial parameter bound p_j^low and the upper initial parameter bound p_j^high, just as described by the following expressions:

f_{i,j}^0 = p_j^low + rand(0,1) · (p_j^high − p_j^low),   i = 1, 2, ..., N_f;  j = 1, 2, ..., n
m_{k,j}^0 = p_j^low + rand(0,1) · (p_j^high − p_j^low),   k = 1, 2, ..., N_m;  j = 1, 2, ..., n    (9)
where j, i and k are the parameter and individual indexes respectively, whereas zero signals the initial population. The function rand(0,1) generates a random number between 0 and 1. Hence, f_{i,j} is the j-th parameter of the i-th female spider position.
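A sketch of this initialization step (Eqs. 1, 2 and 9), assuming NumPy and hypothetical names; the bounds are given as arrays p_low and p_high of length n:

    import numpy as np

    def init_population(N, n, p_low, p_high, rng=np.random.default_rng()):
        # N_f females followed by N_m = N - N_f males, uniform in the bounds.
        N_f = int(np.floor((0.9 - rng.random() * 0.25) * N))   # Eq. 1
        S = p_low + rng.random((N, n)) * (p_high - p_low)      # Eq. 9
        return S, N_f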
In the female cooperative operator, a female spider f_i moves by attraction to or repulsion from other members of the colony, the choice being controlled by a probability factor PF:

f_i^{k+1} = f_i^k + α·Vibc_i·(s_c − f_i^k) + β·Vibb_i·(s_b − f_i^k) + δ·(rand − 1/2)   with probability PF
f_i^{k+1} = f_i^k − α·Vibc_i·(s_c − f_i^k) − β·Vibb_i·(s_b − f_i^k) + δ·(rand − 1/2)   with probability 1 − PF    (10)
where α, β, δ and rand are random numbers between [0,1], whereas k represents the iteration number. The individuals s_c and s_b represent the nearest member to i that holds a higher weight and the best individual of the entire population S, respectively.
Under this operation, each particle presents a movement which combines the past position holding the attraction or repulsion vector over the local best element s_c and the global best individual s_b seen so-far. This particular type of interaction avoids the quick concentration of particles at only one point and encourages each particle to search around the local candidate region within its neighborhood (s_c), rather than interacting with a particle (s_b) in a distant region of the domain. The use of
this scheme has two advantages. First, it prevents the particles from moving
towards the global best position, making the algorithm less susceptible to premature
convergence. Second, it encourages particles to explore their own neighborhood
thoroughly before converging towards the global best position. Therefore, it pro-
vides the algorithm with global search ability and enhances the exploitative
behavior of the proposed approach.
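The female operator of Eq. (10) could be sketched as follows; this is a non-authoritative NumPy reading of the equation, where S stacks female positions first and then male positions, and w holds the weights of Eq. (3):

    import numpy as np

    def female_move(i, S, w, PF=0.7, rng=np.random.default_rng()):
        f = S[i]
        d = np.linalg.norm(S - f, axis=1)
        higher = w > w[i]                       # candidates for s_c
        c = np.where(higher)[0][np.argmin(d[higher])] if higher.any() else i
        b = np.argmax(w)                        # s_b: best member of S
        vib_c = w[c] * np.exp(-d[c] ** 2)       # Eq. 6
        vib_b = w[b] * np.exp(-d[b] ** 2)       # Eq. 7
        alpha, beta, delta = rng.random(3)
        step = alpha * vib_c * (S[c] - f) + beta * vib_b * (S[b] - f)
        sign = 1.0 if rng.random() < PF else -1.0   # attraction vs. repulsion
        return f + sign * step + delta * (rng.random(f.size) - 0.5)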
According to the biological behavior described above, male spiders are divided into dominant (D) and non-dominant (ND) members: a male whose weight is above the median male weight w_{N_f+m} is considered dominant and is attracted to the nearest female, whereas a non-dominant male concentrates upon the weighted mean of the male population. The male movement is modeled as follows:

m_i^{k+1} = m_i^k + α·Vibf_i·(s_f − m_i^k) + δ·(rand − 1/2)                                          if w_{N_f+i} > w_{N_f+m}
m_i^{k+1} = m_i^k + α·((Σ_{h=1}^{N_m} m_h^k · w_{N_f+h}) / (Σ_{h=1}^{N_m} w_{N_f+h}) − m_i^k)       if w_{N_f+i} ≤ w_{N_f+m}    (11)

where the individual s_f represents the nearest female individual to the male member i, whereas (Σ_{h=1}^{N_m} m_h^k · w_{N_f+h}) / (Σ_{h=1}^{N_m} w_{N_f+h}) corresponds to the weighted mean of the male population M.
By using this operator, two different behaviors are produced. First, the set D of
particles is attracted to others in order to provoke mating. Such behavior allows
incorporating diversity into the population. Second, the set ND of particles is
attracted to the weighted mean of the male population M. This fact is used to
partially control the search process according to the average performance of a sub-
group of the population. Such a mechanism acts as a filter which prevents very good or extremely bad individuals from unduly influencing the search process.
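A corresponding sketch of the male operator of Eq. (11), under the same assumptions as before (a global index i >= N_f selects a male):

    import numpy as np

    def male_move(i, S, w, N_f, rng=np.random.default_rng()):
        m = S[i]
        w_med = np.median(w[N_f:])               # median male weight w_{N_f+m}
        alpha, delta = rng.random(2)
        if w[i] > w_med:                         # dominant: towards nearest female
            d_f = np.linalg.norm(S[:N_f] - m, axis=1)
            f = np.argmin(d_f)
            vib_f = w[f] * np.exp(-d_f[f] ** 2)  # Eq. 8
            return m + alpha * vib_f * (S[f] - m) + delta * (rng.random(m.size) - 0.5)
        mean_m = np.average(S[N_f:], axis=0, weights=w[N_f:])  # weighted male mean
        return m + alpha * (mean_m - m)          # non-dominant: towards the mean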
Mating is performed between a dominant male m_g and the set E^g of female members located inside its range of influence r (the radius of mating, defined in Step 2 of the algorithm summary below). When E^g is not empty, a new spider s_new is generated: each decision variable of s_new takes the value of one member of the set T^g = E^g ∪ {m_g}, chosen through a roulette mechanism with probability

Ps_i = w_i / Σ_{j∈T^g} w_j,   where i ∈ T^g.
Once the new spider s_new is formed, it is compared to the worst spider s_wo of the colony, according to their weight values (where w_wo = min_{l∈{1,2,...,N}}(w_l)). If the new spider is better than the worst spider, the worst spider is replaced by the new one. Otherwise, the new spider is discarded and the population does not suffer changes. In case of replacement, the new spider assumes the gender and index of the replaced spider. This assures that the entire population S maintains the original ratio between female and male members.

Fig. 2 Example of the mating operation: a optimization problem, b initial configuration before mating and c configuration after the mating operation
In order to demonstrate the mating operation, Fig. 2a illustrates a simple optimization problem. As an example, a population S of eight different 2-dimensional members (N = 8) is assumed, with five females (N_f = 5) and three males (N_m = 3). Figure 2b shows the initial configuration of the proposed example, with three different female members f_2 (s_2), f_3 (s_3) and f_4 (s_4) constituting the set E_2, which is located inside the influence range r of a dominant male m_2 (s_7). Then, the new candidate spider s_new is generated from the elements f_2, f_3, f_4 and m_2, which constitute the set T_2. Therefore, the value of the first decision variable s_{new,1} for the new spider is chosen by means of the roulette mechanism, considering the values already existing in the set {f_{2,1}, f_{3,1}, f_{4,1}, m_{2,1}}. The value of the second decision variable s_{new,2} is chosen in the same manner. Table 1 shows the data for constructing the new spider through the roulette method. Once the new spider s_new is formed, its weight w_new is calculated. As s_new is better than the worst member f_1 present in the population S, f_1 is replaced by s_new. Therefore, s_new assumes the same gender and index as f_1. Figure 2c shows the configuration of S after the mating process.
A Computational Intelligence Optimization Algorithm … 135
Table 1 Data for constructing the new spider s_new through the roulette method

Spider      Position        w_i    Ps_i (roulette)
s_1 (f_1)   (−1.9, 0.3)     0.00   –
s_2 (f_2)   (1.4, 1.1)      0.57   0.22
s_3 (f_3)   (1.5, 0.2)      0.42   0.16
s_4 (f_4)   (0.4, 1.0)      1.00   0.39
s_5 (f_5)   (1.0, −1.5)     0.78   –
s_6 (m_1)   (−1.3, −1.9)    0.28   –
s_7 (m_2)   (0.9, 0.7)      0.57   0.22
s_8 (m_3)   (0.8, −2.6)     0.42   –
Under this operation, newly generated particles locally exploit the search space inside the mating range in order to find better individuals.
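The roulette assignment used in this mating example could be sketched as follows (hypothetical helper, not from the chapter); members stacks the positions of T^g as rows and w_T holds their weights:

    import numpy as np

    def mate(members, w_T, rng=np.random.default_rng()):
        # Each decision variable of s_new inherits the value of one member of
        # T^g, drawn with probability Ps_i = w_i / sum_j w_j (roulette method).
        p = w_T / w_T.sum()
        picks = rng.choice(len(members), size=members.shape[1], p=p)
        return members[picks, np.arange(members.shape[1])]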
The complete computational procedure of SSO can be summarized as follows:

Step 1: Considering N as the total number of n-dimensional colony members, define the number of male N_m and female N_f spiders in the entire population S:
    N_f = floor[(0.9 − rand · 0.25) · N]  and  N_m = N − N_f,
where rand is a random number between [0,1] whereas floor(·) maps a real number to an integer number.

Step 2: Initialize randomly the female (F = {f_1, f_2, ..., f_{N_f}}) and male (M = {m_1, m_2, ..., m_{N_m}}) members, where S = {s_1 = f_1, s_2 = f_2, ..., s_{N_f} = f_{N_f}, s_{N_f+1} = m_1, s_{N_f+2} = m_2, ..., s_N = m_{N_m}}, and calculate the radius of mating:
    r = (Σ_{j=1}^{n} (p_j^high − p_j^low)) / (2n)
    for (i = 1; i < N_f + 1; i++)
        for (j = 1; j < n + 1; j++)
            f_{i,j}^0 = p_j^low + rand(0,1) · (p_j^high − p_j^low)
        end for
    end for
    for (k = 1; k < N_m + 1; k++)
        for (j = 1; j < n + 1; j++)
            m_{k,j}^0 = p_j^low + rand(0,1) · (p_j^high − p_j^low)
        end for
    end for

Step 3: Calculate the weight of every spider of S (Sect. 3.1):
    for (i = 1; i < N + 1; i++)
        w_i = (J(s_i) − worst_S) / (best_S − worst_S)
        where best_S = max_{k∈{1,2,...,N}} J(s_k) and worst_S = min_{k∈{1,2,...,N}} J(s_k)
    end for

Step 4: Move female spiders according to the female cooperative operator (Sect. 3.4):
    for (i = 1; i < N_f + 1; i++)
        Calculate Vibc_i and Vibb_i (Sect. 3.2)
        if (r_m < PF), where r_m ∈ rand(0,1)
            f_i^{k+1} = f_i^k + α·Vibc_i·(s_c − f_i^k) + β·Vibb_i·(s_b − f_i^k) + δ·(rand − 1/2)
        else
            f_i^{k+1} = f_i^k − α·Vibc_i·(s_c − f_i^k) − β·Vibb_i·(s_b − f_i^k) + δ·(rand − 1/2)
        end if
    end for

Step 5: Move the male spiders according to the male cooperative operator (Sect. 3.4):
    Find the median male individual (w_{N_f+m}) from M.
    for (i = 1; i < N_m + 1; i++)
        Calculate Vibf_i (Sect. 3.2)
        if (w_{N_f+i} > w_{N_f+m})
            m_i^{k+1} = m_i^k + α·Vibf_i·(s_f − m_i^k) + δ·(rand − 1/2)
        else
            m_i^{k+1} = m_i^k + α·((Σ_{h=1}^{N_m} m_h^k · w_{N_f+h}) / (Σ_{h=1}^{N_m} w_{N_f+h}) − m_i^k)
        end if
    end for

Step 6: Perform the mating operation (Sect. 3.5):
    for (i = 1; i < N_m + 1; i++)
        if (m_i ∈ D)
            Find E_i
            if (E_i is not empty)
                Form s_new using the roulette method
                if (w_new > w_wo)
                    s_wo = s_new
                end if
            end if
        end if
    end for

Step 7: If the stop criterion is met, the process is finished; otherwise, go back to Step 3.
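Putting the sketches above together, a minimal and deliberately simplified SSO driver might look as follows; the mating step and bound handling are omitted for brevity, and maximization of the objective J is assumed:

    import numpy as np

    def sso(J, n, p_low, p_high, N=50, iters=1000, rng=np.random.default_rng()):
        S, N_f = init_population(N, n, p_low, p_high, rng)        # Steps 1-2
        for _ in range(iters):
            fit = np.array([J(s) for s in S])
            w = weights(fit)                                      # Step 3
            for i in range(N_f):                                  # Step 4
                S[i] = female_move(i, S, w, rng=rng)
            for i in range(N_f, N):                               # Step 5
                S[i] = male_move(i, S, w, N_f, rng=rng)
            # Step 6 (mating of dominant males via mate(...)) omitted here
        return S[np.argmax([J(s) for s in S])]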
Evolutionary algorithms (EA) have been widely employed for solving complex
optimization problems. These methods are found to be more powerful than conven-
tional methods based on formal logics or mathematical programming [40]. In an EA
algorithm, search agents have to decide whether to explore unknown search positions
or to exploit already tested positions in order to improve their solution quality. Pure
exploration degrades the precision of the evolutionary process but increases its
capacity to find new potential solutions. On the other hand, pure exploitation allows
refining existent solutions but adversely drives the process to local optimal solutions.
Therefore, the ability of an EA to find a global optimal solution depends on its capacity to find a good balance between the exploitation of found-so-far elements and the exploration of the search space [7]. So far, the exploration–exploitation dilemma
has been an unsolved issue within the framework of evolutionary algorithms.
Most EAs define individuals with the same property, performing virtually the same
behavior. Under these circumstances, algorithms waste the possibility to add new
and selective operators as a result of considering individuals with different char-
acteristics. These operators could incorporate computational mechanisms to
improve several important algorithm characteristics such as population diversity or
searching capacities.
On the other hand, PSO and ABC are the most popular swarm algorithms for
solving complex optimization problems. However, they present serious flaws such
as premature convergence and difficulty to overcome local minima [35, 36]. Such
problems arise from operators that modify individual positions. In such algorithms,
the position of each agent in the next iteration is updated yielding an attraction
towards the position of the best particle seen so-far (in case of PSO) or any other
randomly chosen individual (in case of ABC). Such behaviors produce that the
entire population concentrates around the best particle or diverges without control
as the algorithm evolves, either favoring the premature convergence or damaging
the exploration-exploitation balance [3, 37].
Unlike other EAs, in SSO each individual is modeled considering its gender. This allows incorporating computational mechanisms to avoid critical flaws
such as premature convergence and incorrect exploration-exploitation balance
commonly present in both, the PSO and the ABC algorithm. From an optimization
point of view, the use of the social-spider behavior as a metaphor introduces
interesting concepts in EA: the fact of dividing the entire population into different
search-agent categories and the employment of specialized operators that are
applied selectively to each of them. By using this framework, it is possible to
improve the balance between exploitation and exploration, yet preserving the same
population, i.e. individuals who have achieved efficient exploration (female spiders)
and individuals that verify extensive exploitation (male spiders). Furthermore, the
social-spider behavior mechanism introduces an interesting computational scheme
with three important particularities: first, individuals are separately processed according to their characteristics; second, operators share the same communication mechanism (the communal web); the overall scheme is illustrated in Fig. 3.

Fig. 3 Computational scheme of SSO: initialization, the female and male cooperative operators and the mating operator, linked by a common communication mechanism
4 Experimental Results
A comprehensive set of 19 functions, which have been collected from Refs. [18, 9,
21, 24, 31, 34, 39], has been used to test the performance of the proposed approach.
Table 4 in the Appendix A presents the benchmark functions used in our experi-
mental study. In the table, n indicates the function dimension, f(x*) the optimum value of the function, x* the optimum position and S the search space (a subset of R^n).
A detailed description of each function is given in the Appendix A.
We have applied the SSO algorithm to the 19 functions, and the results have been compared to those produced by the Particle Swarm Optimization (PSO) method [20] and the Artificial Bee Colony (ABC) algorithm [17].
Table 2 Minimization results of benchmark functions of Table 4 with n = 30. Maximum number of iterations = 1,000

                   SSO          ABC          PSO
f_1(x)   AB   1.96E−03    2.90E−03    1.00E+03
         MB   2.81E−03    1.50E−03    2.08E−09
         SD   9.96E−04    1.44E−03    3.05E+03
f_2(x)   AB   1.37E−02    1.35E−01    5.17E+01
         MB   1.34E−02    1.05E−01    5.00E+01
         SD   3.11E−03    8.01E−02    2.02E+01
f_3(x)   AB   4.27E−02    1.13E+00    8.63E+04
         MB   3.49E−02    6.11E−01    8.00E+04
         SD   3.11E−02    1.57E+00    5.56E+04
f_4(x)   AB   5.40E−02    5.82E+01    1.47E+01
         MB   5.43E−02    5.92E+01    1.51E+01
         SD   1.01E−02    7.02E+00    3.13E+00
f_5(x)   AB   1.14E+02    1.38E+02    3.34E+04
         MB   5.86E+01    1.32E+02    4.03E+02
         SD   3.90E+01    1.55E+02    4.38E+04
f_6(x)   AB   2.68E−03    4.06E−03    1.00E+03
         MB   2.68E−03    3.74E−03    1.66E−09
         SD   6.05E−04    2.98E−03    3.06E+03
f_7(x)   AB   1.20E+01    1.21E+01    1.50E+01
         MB   1.20E+01    1.23E+01    1.37E+01
         SD   5.76E−01    9.00E−01    4.75E+00
f_8(x)   AB   2.14E+00    3.60E+00    3.12E+04
         MB   3.64E+00    8.04E−01    2.08E+02
         SD   1.26E+00    3.54E+00    5.74E+04
f_9(x)   AB   6.92E−05    1.44E−04    2.47E+00
         MB   6.80E−05    8.09E−05    9.09E−01
         SD   4.02E−05    1.69E−04    3.27E+00
f_10(x)  AB   4.44E−04    1.10E−01    6.93E+02
         MB   4.05E−04    4.97E−02    5.50E+02
         SD   2.90E−04    1.98E−01    6.48E+02
f_11(x)  AB   6.81E+01    3.12E+02    4.11E+02
         MB   6.12E+01    3.13E+02    4.31E+02
         SD   3.00E+01    4.31E+01    1.56E+02
f_12(x)  AB   5.39E−05    1.18E−04    4.27E+07
         MB   5.40E−05    1.05E−04    1.04E−01
         SD   1.84E−05    8.88E−05    9.70E+07
f_13(x)  AB   1.76E−03    1.87E−03    5.74E−01
         MB   1.12E−03    1.69E−03    1.08E−05
         SD   6.75E−04    1.47E−03    2.36E+00
f_14(x)  AB   −9.36E+02   −9.69E+02   −9.63E+02
         MB   −9.36E+02   −9.60E+02   −9.92E+02
         SD   1.61E+01    6.55E+01    6.66E+01
f_15(x)  AB   8.59E+00    2.64E+01    1.35E+02
         MB   8.78E+00    2.24E+01    1.36E+02
         SD   1.11E+00    1.06E+01    3.73E+01
f_16(x)  AB   1.36E−02    6.53E−01    1.14E+01
         MB   1.39E−02    6.39E−01    1.43E+01
         SD   2.36E−03    3.09E−01    8.86E+00
f_17(x)  AB   3.29E−03    5.22E−02    1.20E+01
         MB   3.21E−03    4.60E−02    1.35E−02
         SD   5.49E−04    3.42E−02    3.12E+01
f_18(x)  AB   1.87E+00    2.13E+00    1.26E+03
         MB   1.61E+00    2.14E+00    5.67E+02
         SD   1.20E+00    1.22E+00    1.12E+03
f_19(x)  AB   2.74E−01    4.14E+00    1.53E+00
         MB   3.00E−01    4.10E+00    5.50E−01
         SD   5.17E−02    4.69E−01    2.94E+00
These are considered as
the most popular swarm algorithms for many optimization applications. In all
comparisons, the population has been set to 50 individuals. The maximum iteration
number for all functions has been set to 1,000. This stop criterion has been selected to maintain compatibility with similar works reported in the literature [42].
The parameter setting for each algorithm in the comparison is described as
follows:
1. PSO: The parameters are set to c1 ¼ 2 and c2 ¼ 2; besides, the weight factor
decreases linearly from 0.9 to 0.2 [20].
2. ABC: The algorithm has been implemented using the guidelines provided by its
own reference [17], using the parameter limit = 100.
3. SSO: The parameter PF, determined experimentally, has been set to 0.7 and is kept for all experiments in this section.
The experiment compares the SSO to other algorithms such as PSO and ABC.
The results for 30 runs are reported in Table 2 considering the following perfor-
mance indexes: the Average Best-so-far (AB) solution, the Median Best-so-far
(MB) and the Standard Deviation (SD) of best-so-far solution. The best outcome for
each function is boldfaced. According to this table, SSO delivers better results than
PSO and ABC for all functions. In particular, the test highlights the largest differences in performance, which are directly related to a better trade-off between exploration and exploitation.
Fig. 4 Evolution curves for PSO, ABC and the proposed algorithm, considering as examples the functions a f_1, b f_3, c f_5, d f_10, e f_15 and f f_19 from the experimental set
Figure 4 presents the evolution curves for PSO, ABC and the proposed algo-
rithm, considering as examples the functions f_1, f_3, f_5, f_10, f_15 and f_19 from the experimental set. Among them, the rate of convergence of SSO is the fastest, finding the best solution in fewer than 400 iterations on average, while the other two algorithms need many more iterations. A non-parametric statistical significance proof known as the Wilcoxon rank-sum test for independent samples [10, 38] has been conducted over the “average best-so-far” (AB) data of Table 2, at a 5 %
significance level. Table 3 reports the p-values produced by Wilcoxon’s test for the
pair-wise comparison of the “average best so-far” of two groups. Such groups are
constituted by SSO versus PSO and SSO versus ABC. As a null hypothesis, it is
assumed that there is no significant difference between mean values of the two
algorithms. The alternative hypothesis considers a significant difference between
the “average best-so-far” values of both approaches. All p-values reported in
Table 3 are less than 0.05 (5 % significance level) which is a strong evidence
against the null hypothesis. Therefore, such evidence indicates that SSO results are
statistically significant and it has not occurred by coincidence (i.e. due to common
noise contained in the process).
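For reference, such a pair-wise Wilcoxon rank-sum comparison can be reproduced with SciPy; the two arrays below are placeholder samples standing in for the 30 best-so-far values per algorithm:

    import numpy as np
    from scipy.stats import ranksums

    rng = np.random.default_rng(0)
    ab_sso = rng.normal(2e-3, 1e-3, 30)    # hypothetical SSO "AB" sample
    ab_pso = rng.normal(1e+3, 3e+2, 30)    # hypothetical PSO "AB" sample

    stat, p_value = ranksums(ab_sso, ab_pso)
    print(p_value < 0.05)   # True -> reject the null hypothesis at the 5 % level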
5 Conclusions
In this paper, a novel swarm algorithm called the Social Spider Optimization (SSO)
has been proposed for solving optimization tasks. The SSO algorithm is based on
the simulation of the cooperative behavior of social-spiders, whose individuals emulate a group of spiders which interact with each other based on the biological laws of a cooperative colony. The algorithm considers two different search agents
(spiders): male and female. Depending on gender, each individual is conducted by a
set of different evolutionary operators which mimic different cooperative behaviors
within the colony.
Appendix A

See Table 4.
Table 4 Test functions used in the experimental study

F4:  f_4(x) = 418.9829·n + Σ_{i=1}^{n} x_i·sin(√|x_i|)
     S = [−100, 100]^n;  n = 30;  x* = (0, ..., 0);  f(x*) = 0

Rosenbrock:  f_5(x) = Σ_{i=1}^{n−1} [100(x_{i+1} − x_i^2)^2 + (x_i − 1)^2]
     S = [−30, 30]^n;  n = 30;  x* = (1, ..., 1);  f(x*) = 0

Step:  f_6(x) = Σ_{i=1}^{n} (⌊x_i + 0.5⌋)^2
     S = [−100, 100]^n;  n = 30;  x* = (0, ..., 0);  f(x*) = 0

Quartic:  f_7(x) = Σ_{i=1}^{n} i·x_i^4 + random(0, 1)
     S = [−1.28, 1.28]^n;  n = 30;  x* = (0, ..., 0);  f(x*) = 0

Dixon and Price:  f_8(x) = (x_1 − 1)^2 + Σ_{i=2}^{n} i·(2x_i^2 − x_{i−1})^2
     S = [−10, 10]^n;  n = 30;  x* = (0, ..., 0);  f(x*) = 0

Levy:  f_9(x) = 0.1{sin^2(3πx_1) + Σ_{i=1}^{n−1} (x_i − 1)^2 [1 + sin^2(3πx_i + 1)] + (x_n − 1)^2 [1 + sin^2(2πx_n)]} + Σ_{i=1}^{n} u(x_i, 5, 100, 4),
     where u(x_i, a, k, m) = k(x_i − a)^m if x_i > a;  0 if −a ≤ x_i ≤ a;  k(−x_i − a)^m if x_i < −a
     S = [−10, 10]^n;  n = 30;  x* = (1, ..., 1);  f(x*) = 0

Sum of squares:  f_10(x) = Σ_{i=1}^{n} i·x_i^2
     S = [−10, 10]^n;  n = 30;  x* = (0, ..., 0);  f(x*) = 0

Zakharov:  f_11(x) = Σ_{i=1}^{n} x_i^2 + (Σ_{i=1}^{n} 0.5·i·x_i)^2 + (Σ_{i=1}^{n} 0.5·i·x_i)^4
     S = [−5, 10]^n;  n = 30;  x* = (0, ..., 0);  f(x*) = 0

Penalized:  f_12(x) = (π/n){10 sin^2(πy_1) + Σ_{i=1}^{n−1} (y_i − 1)^2 [1 + 10 sin^2(πy_{i+1})] + (y_n − 1)^2} + Σ_{i=1}^{n} u(x_i, 10, 100, 4),
     where y_i = 1 + (x_i + 1)/4 and u(x_i, a, k, m) is as defined for the Levy function
     S = [−50, 50]^n;  n = 30;  x* = (0, ..., 0);  f(x*) = 0

Penalized 2:  f_13(x) = 0.1{sin^2(3πx_1) + Σ_{i=1}^{n−1} (x_i − 1)^2 [1 + sin^2(3πx_i + 1)] + (x_n − 1)^2 [1 + sin^2(2πx_n)]} + Σ_{i=1}^{n} u(x_i, 5, 100, 4)
     S = [−50, 50]^n;  n = 30;  x* = (0, ..., 0);  f(x*) = 0

Schwefel:  f_14(x) = Σ_{i=1}^{n} x_i·sin(√|x_i|)
     S = [−500, 500]^n;  n = 30;  x* = (420, ..., 420);  f(x*) = −418.9829·n

Rastrigin:  f_15(x) = Σ_{i=1}^{n} [x_i^2 − 10 cos(2πx_i) + 10]
     S = [−5.12, 5.12]^n;  n = 30;  x* = (0, ..., 0);  f(x*) = 0

Ackley:  f_16(x) = −20 exp(−0.2 √((1/n) Σ_{i=1}^{n} x_i^2)) − exp((1/n) Σ_{i=1}^{n} cos(2πx_i)) + 20 + e
     S = [−32, 32]^n;  n = 30;  x* = (0, ..., 0);  f(x*) = 0

Griewank:  f_17(x) = (1/4000) Σ_{i=1}^{n} x_i^2 − Π_{i=1}^{n} cos(x_i/√i) + 1
     S = [−600, 600]^n;  n = 30;  x* = (0, ..., 0);  f(x*) = 0

Powell:  f_18(x) = Σ_{i=1}^{n/4} [(x_{4i−3} + 10x_{4i−2})^2 + 5(x_{4i−1} − x_{4i})^2 + (x_{4i−2} − 2x_{4i−1})^4 + 10(x_{4i−3} − x_{4i})^4]
     S = [−4, 5]^n;  n = 30;  x* = (0, ..., 0);  f(x*) = 0

Salomon:  f_19(x) = −cos(2π √(Σ_{i=1}^{n} x_i^2)) + 0.1 √(Σ_{i=1}^{n} x_i^2) + 1
     S = [−100, 100]^n;  n = 30;  x* = (0, ..., 0);  f(x*) = 0
References
1. Aviles, L.: Sex-ratio bias and possible group selection in the social spider Anelosimus
eximius. Am. Nat. 128(1), 1–12 (1986)
2. Avilés, L.: Causes and consequences of cooperation and permanent-sociality in spiders. In:
Choe, B.C. (ed.) The Evolution of Social Behavior in Insects and Arachnids, pp. 476–498.
Cambridge University Press, Cambridge (1997)
3. Banharnsakun, A., Achalakul, T., Sirinaovakul, B.: The best-so-far selection in artificial bee
colony algorithm. Appl. Soft Comput. 11, 2888–2901 (2011)
4. Bonabeau, E.: Social insect colonies as complex adaptive systems. Ecosystems 1, 437–443
(1998)
5. Bonabeau, E., Dorigo, M., Theraulaz, G.: Swarm Intelligence: From Natural to Artificial
Systems. Oxford University Press, New York (1999)
6. Burgess, J.W.: Social spacing strategies in spiders. In: Rovner, P.N. (ed.) Spider
communication: mechanisms and ecological significance, pp. 317–351. Princeton University
Press, Princeton (1982)
7. Chen, D.B., Zhao, C.X.: Particle swarm optimization with adaptive population size and its
application. Appl. Soft Comput. 9(1), 39–48 (2009)
8. Damian, O., Andrade, M., Kasumovic, M.: Dynamic population structure and the evolution of
spider mating systems. Adv. Insect Physiol. 41, 65–114 (2011)
9. Duan, X., Wang, G.G., Kang, X., Niu, Q., Naterer, G., Peng, Q.: Performance study of mode-
pursuing sampling method. Eng. Optim. 41(1) (2009)
10. Garcia, S., Molina, D., Lozano, M., Herrera, F.: A study on the use of non-parametric tests for
analyzing the evolutionary algorithms’ behaviour: a case study on the CEC’2005 special
session on real parameter optimization. J. Heurist. (2008). doi:10.1007/s10732-008-9080-4
11. Gordon, D.: The organization of work in social insect colonies. Complexity 8(1), 43–46
(2003)
12. Gove, R., Hayworth, M., Chhetri, M., Rueppell, O.: Division of labour and social insect
colony performance in relation to task and mating number under two alternative response
threshold models. Insectes Soc. 56(3), 19–331 (2009)
13. Hossein, A., Hossein-Alavi, A.: Krill herd: a new bio-inspired optimization algorithm.
Commun. Nonlinear Sci. Numer. Simul. 17, 4831–4845 (2012)
14. Hölldobler, B., Wilson, E.O.: The Ants. Harvard University Press (1990). ISBN 0-674-04075-9
15. Hölldobler, B., Wilson, E.O.: Journey to the Ants: A Story of Scientific Exploration (1994). ISBN 0-674-48525-4
16. Jones, T., Riechert, S.: Patterns of reproductive success associated with social structure and
microclimate in a spider system. Anim. Behav. 76(6), 2011–2019 (2008)
17. Karaboga, D.: An idea based on honey bee swarm for numerical optimization. Technical
Report-TR06. Engineering Faculty, Computer Engineering Department, Erciyes University
(2005)
18. Karaboga, D, Akay, B.: A comparative study of artificial bee colony algorithm. Appl. Math.
Comput. 214(1), 108–132 (2009). ISSN 0096-3003
19. Kassabalidis, I., El-Sharkawi, M.A., Marks, R.J., Arabshahi, P., Gray, A.A.: Swarm
intelligence for routing in communication networks. Global Telecommunications
Conference, GLOBECOM’01, 6, IEEE, pp. 3613–3617 (2001)
20. Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proceedings of the 1995 IEEE
International Conference on Neural Networks, vol. 4, pp. 1942–1948, December 1995
21. Krishnanand, K.R., Nayak, S.K., Panigrahi, B.K., Rout, P.K.: Comparative study of five bio-
inspired evolutionary optimization techniques. In: Nature & Biologically Inspired Computing,
NaBIC, World Congress on, pp.1231–1236 (2009)
22. Lubin, T.B.: The evolution of sociality in spiders. In: Brockmann, H.J. (ed.) Advances in the
Study of Behavior, vol. 37, pp. 83–145. Academic Press, Burlington (2007)
23. Maxence, S.: Social organization of the colonial spider Leucauge sp. in the Neotropics:
vertical stratification within colonies. J. Arachnol. 38, 446–451 (2010)
24. Mezura-Montes, E., Velázquez-Reyes, J., Coello Coello, C.A. : A comparative study of
differential evolution variants for global optimization. In: Proceedings of the 8th Annual
Conference on Genetic and Evolutionary Computation (GECCO '06). ACM, New York, NY,
USA, pp. 485–492 (2006)
25. Oster, G., Wilson, E.: Caste and ecology in the social insects. Princeton University Press,
Princeton (1978)
26. Pasquet, A.: Cooperation and prey capture efficiency in a social spider, Anelosimus eximius
(Araneae, Theridiidae). Ethology 90, 121–133 (1991)
27. Passino, K.M.: Biomimicry of bacterial foraging for distributed optimization and control.
IEEE Control Syst. Mag. 22(3), 52–67 (2002)
28. Rajabioun, R.: Cuckoo optimization algorithm. Appl. Soft Comput. 11, 5508–5518 (2011)
29. Rayor, E.C.: Do social spiders cooperate in predator defense and foraging without a web?
Behav. Ecol. Sociobiol. 65(10), 1935–1945 (2011)
30. Rypstra, A.: Prey size, prey perishability and group foraging in a social spider. Oecologia 86
(1), 25–30 (1991)
31. Storn, R., Price, K.: Differential evolution—a simple and efficient heuristic for global optimization over continuous spaces. J. Glob. Optim. 11(4), 341–359 (1995)
32. Uetz, G.W.: Colonial web-building spiders: balancing the costs and benefits of group-living.
In: Choe, E.J., Crespi, B. (eds.) The Evolution of Social Behavior in Insects and Arachnids,
pp. 458–475. Cambridge University Press, Cambridge (1997)
33. Ulbrich, K., Henschel, J.: Intraspecific competition in a social spider. Ecol. Model. 115(2–3),
243–251 (1999)
34. Vesterstrom, J., Thomsen, R.: A comparative study of differential evolution, particle swarm
optimization, and evolutionary algorithms on numerical benchmark problems. In:
Evolutionary Computation, 2004. CEC2004. Congress on 19–23 June, vol. 2,
pp. 1980–1987 (2004)
35. Wan-Li, X., Mei-Qing, A.: An efficient and robust artificial bee colony algorithm for
numerical optimization. Comput. Oper. Res. 40, 1256–1265 (2013)
36. Wang, Y., Li, B., Weise, T., Wang, J., Yuan, B., Tian, Q.: Self-adaptive learning based
particle swarm optimization. Inf. Sci. 181(20), 4515–4538 (2011)
37. Wang, H., Sun, H., Li, C., Rahnamayan, S., Jeng-shyang, P.: Diversity enhanced particle
swarm optimization with neighborhood. Inf. Sci. 223, 119–135 (2013)
38. Wilcoxon, F.: Individual comparisons by ranking methods. Biometrics 1, 80–83 (1945)
39. Yang, E., Barton, N.H., Arslan, T., Erdogan, A.T.: A novel shifting balance theory-based
approach to optimization of an energy-constrained modulation scheme for wireless sensor
networks. In: Proceedings of the IEEE Congress on Evolutionary Computation, CEC 2008,
June 1–6, 2008, Hong Kong, China, pp. 2749–2756. IEEE (2008)
40. Yang, X.: Nature-Inspired Metaheuristic Algorithms. Luniver Press, Beckington (2008)
41. Yang, X.S.: Engineering Optimization: An Introduction with Metaheuristic Applications.
Wiley, Hoboken (2010)
42. Ying, J., Ke-Cun, Z., Shao-Jian, Q.: A deterministic global optimization algorithm. Appl.
Math. Comput. 185(1), 382–387 (2007)
Black Hole Algorithm and Its Applications
1 Introduction
Simulated Annealing [9] is inspired by the annealing technique used by metallurgists to obtain a "well ordered" solid state of minimal energy (while avoiding the "metastable" structures characteristic of the local minima of energy). The Ant Colony Optimization (ACO) algorithm is a metaheuristic technique for solving problems, motivated by the social behavior of ants in finding shortest paths: real ants walk randomly until they find food and return to their nest while depositing pheromone on the ground in order to mark their preferred path and attract other ants to follow [6, 20, 21]. Particle Swarm Optimization (PSO) was introduced by James Kennedy and Russell Eberhart as a global optimization technique in 1995; it uses the metaphor of the flocking behavior of birds to solve optimization problems [22]. The firefly algorithm is a population-based metaheuristic algorithm that has become an increasingly popular tool of swarm intelligence, applied in almost all areas of optimization as well as in science and engineering practice. Fireflies have their flashing light, which serves two fundamental functions: (1) to attract mating partners and (2) to warn potential predators. The flashing lights also comply with physical rules: the light intensity of a source (I) decreases as the distance (r) increases according to the term I ∝ 1/r². This phenomenon inspired [23] to develop the firefly algorithm [23–25]. The bat-inspired algorithm is a metaheuristic optimization algorithm invented by Yang et al. [26–28], based on the echolocation behavior of microbats with varying pulse rates of emission and loudness; the honey bee algorithm [29] follows a similar bio-inspired spirit. Such algorithms are progressively analyzed, deployed and improved by researchers in many different research fields [3, 5, 27, 28, 30–32] and are used to solve different optimization problems. However, there is no single algorithm that achieves the best solution for all optimization problems; numerous algorithms give a better solution for some particular problems than others. Hence, searching for new heuristic optimization algorithms is an open problem [29], and it requires continued exploration of new metaheuristic algorithms for solving hard problems.
Recently, a metaheuristic approach known as the black hole heuristic has been developed for solving hard or complex optimization and data clustering problems, which are NP-hard. The black hole heuristic algorithm is inspired by the black hole phenomenon, and the black hole algorithm (BH) starts with an initial population of candidate solutions to an optimization problem and an objective function that is calculated for them, similar to other population-based algorithms.
The word "heuristic" is Greek and means "to know", "to find", "to discover" or "to guide an investigation" by a trial-and-error methodology [33]. Specifically,
metaheuristic algorithms are the master strategies that modify and update the solutions produced by other heuristics, which are normally generated in the quest for local optima. These nature-inspired algorithms are becoming popular and powerful in solving optimization problems. The prefix "meta", Greek for "upper level" or "beyond", indicates a higher-level methodology, and metaheuristics generally perform better than simple heuristic approaches. Metaheuristic algorithms form a conceptual set of heuristic approaches used to find the optimal solution of a combinatorial optimization problem. The term "metaheuristic" was introduced by F. Glover. In addition, metaheuristic algorithms use a certain trade-off of randomization and local search for finding optimal and near-optimal solutions. Local search is a general methodology for finding high-quality solutions to complex or hard combinatorial optimization problems in a reasonable amount of time; it is basically an iterative search approach that diversifies over neighboring solutions, trying to enhance the current solution by local changes [23, 35].
The diversification phase ensures that the algorithm explores the search space more efficiently and thoroughly and helps to generate diverse solutions. On the other hand, too much diversification increases the probability of finding the true global optimum but often slows the search process, with a much lower rate of convergence.
Simulated annealing (SA) mimics the thermodynamic process of cooling molten metal to reach the minimum free-energy state. It works with a single point and, at each iteration, builds a new point according to the Boltzmann probability distribution [9, 38]. Artificial immune systems (AIS) simulate biological immune systems [7]. The ant colony optimization (ACO) field studies models derived from the observation of real ants' behavior and uses these models as a source of inspiration for the design of innovative and novel approaches to the solution of optimization and distributed control problems; the main idea is that the self-organizing principles which allow the highly coordinated behavior of ants can be exploited to coordinate transport. The bacterial foraging algorithm (BFA) comes from the search and optimal foraging of bacteria, and particle swarm optimization (PSO) simulates the behavior of flocks of birds and schools of fish searching for the best solution in both the local and the global search space [5, 22, 38]. Based on their bio-inspired characteristics, various algorithms are described below (Table 1).
Unlike exact algorithmic methodologies (which are guaranteed to find the optimal solution and to prove its optimality for every finite-size instance of a combinatorial optimization problem within an instance-dependent run time), metaheuristic algorithms aim to find a good solution to a given hard problem in a reasonable amount of time. The applications of metaheuristics fall into a large number of areas; some of them are as follows:
• Engineering design, topological optimization, structural optimization in electronics and VLSI design, aerodynamics-based structural design.
• Fluid dynamics, the telecommunication field, automotive and robotics design, and robotic roadmap-planning optimization.
• Data mining and machine learning: data mining in bioinformatics and computational biology.
• System modeling, simulation and identification in chemistry, physics and biology.
• Image and signal processing: feature extraction from data and feature selection with the help of metaheuristic approaches.
• Planning in routing problems, robotic planning, scheduling and production problems, logistics and transportation, supply chain management, and the environment.
In the eighteenth century, John Michell and Pierre-Simon de Laplace first raised the idea of black holes. Based on Newton's law, they conceived of a star becoming invisible to the human eye, although during that period it was not recognized as a black hole. In 1967, John Wheeler, the American physicist, first named the phenomenon of mass collapsing as a black hole [42]. A black hole forms in space when a star of massive size collapses. The gravitational power of the black hole is so strong that even light cannot escape from it.
Table 1 (continued)

S.No.  Metaheuristic algorithm     Description of metaheuristic algorithm
12.    Black hole (BH) algorithm   It is inspired by the black hole phenomenon. The basic idea of a black hole is simply a region of space that has so much mass concentrated in it that there is no way for a nearby object to escape its gravitational pull. Anything falling into a black hole, including light, is forever gone from our universe
The gravity of such a body is so strong because matter has been squeezed into a tiny space; anything that crosses the boundary of the black hole will be consumed by it and vanish, and nothing can get away from its enormous power. The sphere-shaped boundary of a black hole in space is known as the event horizon. The radius of the event horizon is termed the Schwarzschild radius. At this radius, the escape speed is equal to the speed of light, and once light passes through, even it cannot escape. Nothing can escape from within the event horizon because nothing can go faster than light. The Schwarzschild radius (R) is calculated by R = 2GM/c², where G is the gravitational constant (6.67 × 10⁻¹¹ N·(m/kg)²), M is the mass of the black hole, and c is the speed of light. If a star moves close to the event horizon or crosses the Schwarzschild radius, it will be absorbed into the black hole and permanently disappear. The existence of black holes can be discerned by their effect on the objects surrounding them [43, 44].
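As a quick illustration of the formula (not from the chapter), the Schwarzschild radius of a black hole of one solar mass works out to roughly 3 km:

    G = 6.67e-11        # gravitational constant, N*(m/kg)^2
    c = 3.0e8           # speed of light, m/s
    M = 1.989e30        # one solar mass, kg

    R = 2 * G * M / c**2
    print(round(R))     # ~ 2948 m, i.e. about 3 km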
A black hole is a region of space-time (x, y, t) whose gravitational field is so strong and powerful that nothing can escape from it. The theory of general relativity predicts that a sufficiently compact mass will deform space-time to form a black hole.
Around a black hole, there is a mathematically defined surface called an event horizon
that marks the point of no return. If anything moves close to the event horizon or
crosses the Schwarzschild radius, it will be absorbed into the black hole and per-
manently disappear. The existence of black holes can be discerned by its effect over
the objects surrounding it [45]. The hole is called black because it absorbs all the light
that hits the horizon, reflecting nothing, just like a perfect black body in thermody-
namics [46, 47]. A black hole has only three independent physical properties: Black
hole’s mass (M), charge (Q) and angular momentum (J). A charged black hole repels
other like charges, just like any other charged object in a given space. The simplest black holes have mass but neither electric charge nor angular momentum [48, 49].
The basic idea of a black hole is simply a region of space that has so much mass
concentrated in it that there is no way for a nearby object to escape its gravitational
pull. Anything falling into a black hole, including light, is forever gone from our
universe.
Black Hole: In the black hole algorithm, the best candidate among all the candidates at each iteration is selected as the black hole.
Stars: All the other candidates form the normal stars. The black hole is not created at random; it is one of the actual candidates of the population.
Movement: Then, all the candidates are moved towards the black hole based on their current location and a random number.
1. The black hole algorithm starts with an initial population of candidate solutions to an optimization problem and an objective function that is calculated for them.
2. At each iteration of the black hole algorithm, the best candidate is selected to be the black hole and the rest form the normal stars. After the initialization process, the black hole starts pulling stars around it.
3. If a star gets too close to the black hole it will be swallowed by the black hole
and is gone forever. In such a case, a new star (candidate solution) is randomly
generated and placed in the search space and starts a new search.
1. Initial population: P(x) = {x1^t, x2^t, x3^t, …, xn^t}, a randomly generated population of candidate solutions (the stars), is placed in the search space of some problem or function.
2. Find the total fitness of the population:

   fi = Σ_{i=1}^{pop_size} eval(p(t))   (1)

3. fBH = Σ_{i=1}^{pop_size} eval(p(t))
where fBH and fi are the fitness values of the black hole and the ith star in the initialized population. The population is evaluated, and the best candidate (from the remaining stars) in the population, i.e. the one with the best fitness value, is selected to be the black hole, and the remaining candidates form the normal stars. The black hole has the capability to
absorb the stars that surround it. After initializing the first black hole and stars, the
black hole starts absorbing the stars around it and all the stars start moving towards
the black hole.
The absorption of stars by the black hole is formulated as follows:

   Xi(t+1) = Xi(t) + rand × (XBH − Xi(t)),  i = 1, 2, …, N   (3)

where Xi(t) and Xi(t+1) are the locations of the ith star at iterations t and (t + 1), respectively, XBH is the location of the black hole in the search space, and rand is a random number in the interval [0, 1]. N is the number of stars (candidate
solutions). While moving towards the black hole, a star may reach a location with
lower cost than the black hole. In such a case, the black hole moves to the location of
that star and vice versa. Then the black hole algorithm will continue with the black
hole in the new location and then stars start moving towards this new location.
In the black hole algorithm, the possibility of crossing the event horizon of the black hole while stars move towards it is used to gather more optimal data points from the search space of the problem. Every star (candidate solution) that crosses the event horizon of the black hole is sucked in by it. Every time a candidate (star) dies, i.e. it is sucked in by the black hole, another candidate solution (star) is generated and distributed randomly over the search space of the defined problem and begins a new search in the solution space. This is done to keep the number of candidate solutions constant. The next iteration takes place after all the stars have been moved. The radius of the event horizon (R) in the black hole algorithm is calculated using the following equation:
   R = fBH / Σ_{i=1}^{N} fi   (4)
where fBH and fi are the fitness values of the black hole and the ith star, and N is the number of stars (candidate solutions). When the distance between a candidate solution and the black hole (best candidate) is less than R, that candidate is collapsed, and a new candidate is created and distributed randomly in the search space.
1. Initialize a randomly generated population of stars (candidate solutions) in the search space.
2. Evaluate the fitness of the population:

   fi = Σ_{i=1}^{pop_size} eval(p(t)),  fBH = Σ_{i=1}^{pop_size} eval(p(t))
3. Select the best star, the one with the best fitness value, as the black hole.
4. Change the location of each star according to Eq. (3).
5. If a star reaches a location with lower cost than the black hole, exchange their locations.
6. If a star crosses the event horizon of the black hole:
7. Calculate the event horizon radius (R):

   R = fBH / Σ_{i=1}^{N} fi

8. When the distance between a candidate solution and the black hole (best candidate) is less than R, that candidate is collapsed, and a new candidate is created and distributed randomly in the search space.
9. Replace it with a new star in a random location in the search space.
10. else break.
11. If a termination criterion (a maximum number of iterations or a sufficiently good fitness) is met, exit the loop.
While applying the black hole algorithm to data clustering, a candidate solution corresponds to a one-dimensional (1-D) array. Every candidate solution is considered as k initial cluster centers, and each individual unit in the array is a dimension of a cluster center. Figure 1 illustrates a candidate solution for a problem with three clusters where all the data objects have four features.
Fig. 1 Learning problems: dots correspond to points without any labels. Points with labels are denoted by plus signs, asterisks, and crosses. In (c), the must-link and cannot-link constraints are denoted by solid and dashed lines, respectively [50]. a Supervised. b Partially labelled. c Partially constrained. d Unsupervised
Given N objects, assign each object to one of K clusters so as to minimize the sum of squared Euclidean distances between each object and the center of the cluster to which it is allocated:

   F(O, Z) = Σ_{i=1}^{N} Σ_{j=1}^{K} Wij ||Oi − Zj||^2

where ||Oi − Zj|| is the Euclidean distance between a data object Oi and the cluster center Zj. N and K are the number of data objects and the number of clusters, respectively. Wij is the association weight of data object Oi with cluster j:

   Wij = 1 if object i is assigned to cluster j; 0 if object i is not assigned to cluster j.
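A direct translation of F(O, Z) into code is shown below; it is a sketch that assumes each object is assigned to its nearest center, which is the assignment minimizing F for fixed centers (so Wij = 1 only for that cluster):

def clustering_sse(objects, centers):
    # F(O, Z): each object O_i is charged the squared Euclidean distance to
    # its assigned (here: nearest) center Z_j, i.e. W_ij = 1 for that j only.
    total = 0.0
    for o in objects:
        total += min(sum((oi - zi) ** 2 for oi, zi in zip(o, z)) for z in centers)
    return total

# Toy data: two obvious clusters around (0, 0) and (5, 5).
objects = [(0.1, 0.0), (-0.2, 0.1), (5.0, 4.9), (5.2, 5.1)]
print(clustering_sse(objects, [(0.0, 0.0), (5.0, 5.0)]))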
The goal of data clustering also known as cluster analysis is to discover the natural
grouping of a set of patterns, points or objects. An operational definition of clus-
tering can be stated as follows: Given a representation of n objects, find K groups
based on a measure of similarity such that the similarities between objects in the
same group are high while the similarities between objects in different groups are
low.
Fig. 2 Diversity of clusters. The seven clusters in (a) (denoted by seven different colors in (b)) differ in shape, size, and density. Although these clusters are apparent to a data analyst, none of the available clustering algorithms can detect all these clusters (Source: [50]). a Input data. b Desired clustering
Figure 2 demonstrates that clusters may differ in terms of their shape, size, and
density. The presence of noise in the data makes the detection of the clusters even more difficult, and an ideal cluster can be defined as a set of points that is compact and isolated. While humans are excellent cluster seekers in two and possibly three dimensions, we need automatic algorithms for high-dimensional data. It is this challenge, along with the unknown number of clusters for the given data, that has resulted in the thousands of clustering algorithms that have been published and that continue to appear. An example of clustering is shown in Fig. 2. In pattern recognition, data analysis is concerned with predictive modeling: given some training data, the goal is to predict the behavior of the unseen test data. This task is also referred to as learning.
It is often expensive and time consuming to collect a huge amount of labeled data {(xn, yn)}. One learning approach to deal with such issues is to exploit semi-supervised learning. The main aim is to learn a classification model from both labeled data {(xn, yn) ∈ X × Y}, n ≤ N, and unlabeled data {xj}, j = N + 1, …, N + M, where N ≪ M. Clustering is a more difficult and challenging problem than classification.
Semi-supervised Learning
Clustering algorithms can be broadly divided into two categories: (1) hierarchical clustering and (2) partitional clustering.
Hierarchical Algorithm
Partitioned Algorithm
The K-means algorithm finds a partition such that the squared error between the empirical mean of a cluster and the points in the cluster is minimized. Let μk be the mean of cluster Ck. The squared error between μk and the points in cluster Ck is defined as

   J(Ck) = Σ_{xi ∈ Ck} (xi − μk)^2

The goal of K-means is to minimize the sum of the squared error over all K clusters,

   J(C) = Σ_{k=1}^{K} Σ_{xi ∈ Ck} (xi − μk)^2

Minimizing this objective function is known to be an NP-hard problem (even for K = 2). Thus K-means, which is a greedy algorithm, can only converge to a local minimum, even though a recent study has shown that, with a large probability, K-means can converge to the global optimum when clusters are well separated. K-means starts with an initial partition with K clusters and assigns patterns to clusters so as to reduce the squared error. Since the squared error always decreases with an increase in the number of clusters K (with J(C) = 0 when K = n), it can be minimized only for a fixed number of clusters [64]. Recently, efficient hybrid evolutionary and bio-inspired metaheuristic methods combined with K-means have been used to overcome the local-optimum problems in clustering [72, 73]; a sketch of the basic K-means iteration is given below.
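The sketch below shows the basic K-means (Lloyd) iteration referred to above; random seeding from the data, a fixed iteration cap, and the empty-cluster guard are implementation assumptions:

import random

def kmeans(points, k, iters=100):
    # Lloyd's algorithm: alternate assignment and mean-update steps, which
    # monotonically reduces J(C) until a local minimum is reached.
    centers = random.sample(points, k)                 # seed from the data
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                               # assignment step
            j = min(range(k),
                    key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            clusters[j].append(p)
        new_centers = [                                # update step
            tuple(sum(xs) / len(cl) for xs in zip(*cl)) if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
        if new_centers == centers:                     # converged
            break
        centers = new_centers
    return centers

print(kmeans([(0, 0), (0, 1), (5, 5), (5, 6), (10, 0), (10, 1)], k=3))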
Fig. 3 Illustration of the K-means algorithm. a Two-dimensional input data with three clusters; b three seed points selected as cluster centers and initial assignment of the data points to clusters; c updated intermediate cluster labels
Fig. 4 a and b Intermediate iterations updating cluster labels and their centers, and final clustering obtained by K-means
Among all the classical clustering algorithms, the K-means clustering algorithm is the best known due to its simplicity and efficiency. It suffers from two problems: it needs the number of clusters before starting, i.e. the number of clusters must be known a priori; in addition, its performance strongly depends on the initial centroids, so it may get stuck in locally optimal solutions and its convergence rate is affected [56]. In order to overcome the shortcomings of K-means, many heuristic approaches have been applied in the last two decades.
Recently, a major challenging subject facing the electric power system operator is how to optimally manage the power generating units over a scheduling horizon of one day, considering all of the practical equality, inequality, and dynamic constraints. These system constraints comprise the load plus transmission-losses balance, valve-point effects, prohibited operating zones, multi-fuel options, line flow constraints, operating reserve, and minimum on/off times. There is no readily available optimization method for the short-term thermal generation scheduling (STGS) problem: it has a high-dimensional, highly constrained, non-convex, non-smooth, and non-linear nature and needs an efficient algorithm to be solved. For this purpose, a new optimization approach, known as the gradient-based modified teaching–learning-based optimization combined with black hole (MTLBO–BH) algorithm, has been proposed to seek the optimum operational cost [82–84].
6 Discussion
Metaheuristic algorithms play a major role in solving complex problems in different applications in science, computer vision, computer science, data analysis, data clustering and mining, clustering analysis, industrial forecasting of weather, medical and biological research, economics, and different multi-disciplinary engineering research fields. In addition, metaheuristics are useful in computer vision, image processing, machine learning, and pattern recognition, where they can be deployed for finding the optimal set of discriminant values in the form of eigenvectors (for face recognition, fingerprints, and other biometric characteristics) and for incorporating these values for identification purposes. In biometrics and computer vision, face recognition has always been a major challenge for machine learning and pattern recognition researchers.
Introducing intelligence in machines to identify humans from their face images (stored in a template database) requires handling variations due to illumination conditions, pose, facial expression, scale, disguise, etc., and hence becomes a complex task in computer vision. Face recognition represents a classification problem for human recognition. Such classification problems can be solved by designing Radial Basis Function neural networks with metaheuristic approaches (like the firefly, particle swarm optimization, and black hole algorithms). These algorithms can be used at the match-score level in biometrics and can select the most discriminant set of optimal features for face identification and classification.
Recently, the black hole methodology has played a major role in modeling and simulating natural phenomena for solving complex problems. The motivation for this new heuristic optimization algorithm is the black hole phenomenon. Further, it has a simple structure, it is easy to implement, and it is free from the parameter tuning issues that affect algorithms such as the genetic algorithm. The black hole algorithm can be applied to solve the clustering problem and can be run on different benchmark datasets. In future research, the algorithm can also be utilized in many different areas of application. In addition, the application of BH in combination with other algorithms may be effective. Metaheuristics support managers in decision-making with robust tools that provide high-quality solutions to important applications in business, engineering, economics, and science in reasonable time horizons.
We conclude that the new black hole algorithm is population-based, like particle swarm optimization, the firefly algorithm, the genetic algorithm, the BAT algorithm, and other evolutionary methods. It is free from parameter tuning issues, unlike the genetic algorithm and others, and it does not suffer from the premature convergence problem. This implies that the black hole algorithm is potentially more powerful in solving NP-hard problems (e.g. the data clustering problem), which is to be investigated further in future studies. A further improvement on the convergence of the algorithm is to vary the randomization parameter so that it decreases gradually as the optima are approached. In wireless sensor networks, density of deployment, scale, and constraints in battery and storage
References
1. Tan, X., Bhanu, B.: Fingerprint matching by genetic algorithms. Pattern Recogn. 39, 465–477
(2006)
2. Karakuzu, C.: Fuzzy controller training using particle swarm optimization for nonlinear
system control. ISA Trans. 47(2), 229–239 (2008)
3. Rajabioun, R.: Cuckoo optimization algorithm. Elsevier Appl. Soft Comput. 11, 5508–5518
(2011)
4. Tsai Hsing, C., Lin, Yong-H: Modification of the fish swarm algorithm with particle swarm
optimization formulation and communication behavior. Appl. Soft Comput. Elsevier 1,
5367–5374 (2011)
5. Baojiang, Z., Shiyong, L.: Ant colony optimization algorithm and its application to neuro-fuzzy controller design. J. Syst. Eng. Electron. 18, 603–610 (2007)
6. Dorigo, M., Maniezzo, V., Colorni, A.: The ant system: optimization by a colony of
cooperating agents. IEEE Trans. Syst. Man Cybern. Part B 26(1), 29–41 (1996)
7. Farmer, J.D., et al.: The immune system, adaptation and machine learning. Phys. D Nonlinear
Phenom. Elsevier 22(1–3), 187–204 (1986)
8. Kim, D.H., Abraham, A., Cho, J.H.: A hybrid genetic algorithm and bacterial foraging
approach for global optimization. Inf. Sci. 177, 3918–3937 (2007)
9. Kirkpatrick, S., Gelatto, C.D., Vecchi, M.P.: Optimization by simulated annealing. Science
220, 671–680 (1983)
10. Tang, K.S., Man, K.F., Kwong, S., He, Q.: Genetic algorithms and their applications. IEEE
Sig. Process. Mag. 3(6), 22–37 (1996)
11. Du, Weilin, Li, B.: Multi-strategy ensemble particle swarm optimization for dynamic
optimization. Inf. Sci. 178, 3096–3109 (2008)
12. Yao, X., Liu, Y., Lin, G.: Evolutionary programming made faster. IEEE Trans. Evol. Comput.
3, 82–102 (1999)
13. Liu, Y., Yi, Z., Wu, H., Ye, M., Chen, K.: A tabu search approach for the minimum sum-of-
squares clustering problem. Inf. Sci. 178(12), 2680–2704 (2008)
14. Kim, T.H., Maruta, I., Sugie, T.: Robust PID controller tuning based on the constrained
particle swarm optimization. J. Autom. Sciencedirect 44(4), 1104–1110 (2008)
15. Cordon, O., Santamarı, S., Damas, J.: A fast and accurate approach for 3D image registration
using the scatter search evolutionary algorithm. Pattern Recogn. Lett. 27, 1191–1200 (2006)
16. Yang, X.S.: Firefly algorithms for multimodal optimization, In: Proceeding of Stochastic
Algorithms: Foundations and Applications (SAGA), 2009 (2009)
17. Kalinlia, A., Karabogab, N.: Artificial immune algorithm for IIR filter design. Eng. Appl.
Artif. Intell. 18, 919–929 (2005)
18. Lin, Y.L., Chang, W.D., Hsieh, J.G.: A particle swarm optimization approach to nonlinear
rational filter modeling. Expert Syst. Appl. 34, 1194–1199 (2008)
19. Holland, J.H.: Adaptation in Natural and Artificial Systems. University of Michigan Press,
Ann Arbor (1975)
20. Jackson, D.E., Ratnieks, F.L.W.: Communication in ants. Curr. Biol. 16, R570–R574 (2006)
21. Goss, S., Aron, S., Deneubourg, J.L., Pasteels, J.M.: Self-organized shortcuts in the Argentine
ant. Naturwissenschaften 76, 579–581 (1989)
22. Kennedy, J., Eberhart, R.C.: Particle swarm optimization. Proc. IEEE Int. Conf. Neural
Networks 4, 1942–1948 (1995)
23. Yang, X.S.: Nature-Inspired Metaheuristic Algorithms. Luniver Press (2010)
24. Tarasewich, P., McMullen, P.R.: Swarm intelligence: power in numbers. Commun. ACM 45, 62–67 (2002)
25. Senthilnath, J., Omkar, S.N., Mani, V.: Clustering using firefly algorithm: performance study.
Swarm Evol. Comput. 1(3), 164–171 (2011)
26. Yang, X.S.: Firefly algorithm. Engineering Optimization, pp. 221–230 (2010)
27. Yang, X.S.: Bat algorithm for multi-objective optimization. Int. J. Bio-inspired Comput. 3(5),
267–274 (2011)
28. Tripathi, P.K., Bandyopadhyay, S., Pal, S.K.: Multi-objective particle swarm optimization
with time variant inertia and acceleration coefficients. Inf. Sci. 177, 5033–5049 (2007)
29. Karaboga, D.: An idea based on honey bee swarm for numerical optimization. Technical
Report TR06,Erciyes University (2005)
30. Ellabib, I., Calamari, P., Basir, O.: Exchange strategies for multiple ant colony system. Inf.
Sci. 177, 1248–1264 (2007)
31. Hamzaçebi, C.: Improving genetic algorithms performance by local search for continuous
function optimization. Appl. Math. Comput. 96(1), 309–317 (2008)
32. Lozano, M., Herrera, F., Cano, J.R.: Replacement strategies to preserve useful diversity in
steady-state genetic algorithms. Inf. Sci. 178, 4421–4433 (2008)
33. Lazar, A.: Heuristic knowledge discovery for archaeological data using genetic algorithms and
rough sets, Heuristic and Optimization for Knowledge Discovery, IGI Global, pp. 263–278
(2014)
34. Russell, S.J., Norvig, P.: Artificial Intelligence a Modern Approach. Prentice Hall, Upper
Saddle River (2010). 1132
35. Glover, F., Laguna, M.: Tabu Search. Kluwer Academic Publishers (1997). ISBN: 079239965X
36. Christian, B., Roli, A.: Metaheuristics in combinatorial optimization: Overview and
conceptual comparison. ACM Comput. Surveys (CSUR) 35(3), 268–308 (2003)
37. Gazi, V., Passino, K.M.: Stability analysis of social foraging swarms. IEEE Trans. Syst. Man
Cybern. Part B 34(1), 539–557 (2008)
38. Deb, K.: Optimization for Engineering Design: Algorithms and Examples, Computer-Aided
Design. PHI Learning Pvt. Ltd., New Delhi (2009)
39. Rashedi, E.: Gravitational Search Algorithm. M.Sc. Thesis, Shahid Bahonar University of
Kerman, Kerman (2007)
40. Shah-Hosseini, H.: The intelligent water drops algorithm: a nature-inspired swarm-based
optimization algorithm. Int. J. Bio-inspired Comput. 1(1), 71–79 (2009)
41. Dos Santos, C.L., et al.: A multiobjective firefly approach using beta probability. IEEE Trans. Magn. 49(5), 2085–2088 (2013)
42. Talbi, E.G.: Metaheuristics: from design to implementation, vol. 74, p. 500. Wiley, London
(2009)
43. Giacconi, R., Kaper, L., Heuvel, E., Woudt, P.: Black hole research past and future. In: Black
Holes in Binaries and Galactic Nuclei: Diagnostics. Demography and Formation, pp. 3–15.
Springer, Berlin, Heidelberg (2001)
44. Pickover, C.: Black Holes: A Traveler’s Guide. Wiley, London (1998)
45. Frolov, V.P., Novikov, I.D.: Phys. Rev. D. 42, 1057 (1990)
46. Schutz, B. F.: Gravity from the Ground Up. Cambridge University Press, Cambridge. ISBN
0-521-45506-5 (2003)
47. Davies, P.C.W.: Thermodynamics of Black Holes. Reports on Progress in Physics, Rep. Prog.
Phys. vol. 41 Printed in Great Britain (1978)
48. Heusler, M.: Stationary black holes: uniqueness and beyond. Living Rev. Relativity 1(1998),
6 (1998)
49. Nemati, M., Momeni, H., Bazrkar, N.: Binary black holes algorithm. Int. J. Comput. Appl. 79
(6), 36–42 (2013)
50. Jain, A.K.: Data clustering: 50 years beyond K-means. Pattern Recogn. Lett. 31(8), 651–666
(2010)
51. Akay, B., Karaboga, D.: A modified artificial bee colony algorithm for real-parameter
optimization. Inf. Sci. 192, 120–142 (2012)
52. El-Abd, M.: Performance assessment of foraging algorithms vs. evolutionary algorithms. Inf.
Sci. 182, 243–263 (2012)
53. Ghosh, S., Das, S., Roy, S., Islam, M.S.K., Suganthan, P.N.: A differential covariance matrix
adaptation evolutionary algorithm for real parameter optimization. Inf. Sci. 182, 199–219
(2012)
54. Fox, B., Xiang, W., Lee, H.: Industrial applications of the ant colony optimization algorithm.
Int. J. Adv. Manuf. Technol. 31, 805–814 (2007)
55. Geem, Z., Cisty, M.: Application of the harmony search optimization in irrigation. Recent
Advances in Harmony Search Algorithm’, pp. 123–134. Springer, Berlin (2010)
56. Selim, S.Z., Ismail, M.A.: K-means-type algorithms: a generalized convergence theorem and
characterization of local optimality pattern analysis and machine intelligence. IEEE Trans.
PAMI 6, 81–87 (1984)
57. Wang, J., Peng, H., Shi, P.: An optimal image watermarking approach based on a multi-
objective genetic algorithm. Inf. Sci. 181, 5501–5514 (2011)
58. Picard, D., Revel, A., Cord, M.: An application of swarm intelligence to distributed image
retrieval. Inf. Sci. 192, 71–81 (2012)
59. Chaturvedi, D.: Applications of genetic algorithms to load forecasting problem. Springer,
Berlin, pp. 383–402 (2008) (Journal of Soft Computing)
60. Christmas, J., Keedwell, E., Frayling, T.M., Perry, J.R.B.: Ant colony optimization to identify
genetic variant association with type 2 diabetes. Inf. Sci. 181, 1609–1622 (2011)
61. Guo, Y.W., Li, W.D., Mileham, A.R., Owen, G.W.: Applications of particle swarm
optimization in integrated process planning and scheduling. Robot. Comput.-Integr. Manuf.
Elsevier 25(2), 280–288 (2009)
62. Rana, S., Jasola, S., Kumar, R.: A review on particle swarm optimization algorithms and their
applications to data clustering. Artif. Intell. Rev. 35, 211–222 (2011)
63. Yeh, W.C.: Novel swarm optimization for mining classification rules on thyroid gland data.
Inf. Sci. 197, 65–76 (2012)
64. Zhang, Y., Gong, D.W., Ding, Z.: A bare-bones multi-objective particle swarm optimization
algorithm for environmental/economic dispatch. Inf. Sci. 192, 213–227 (2012)
65. Marinakis, Y., Marinaki, M., Dounias, G.: Honey bees mating optimization algorithm for the
Euclidean traveling salesman problem. Inf. Sci. 181, 4684–4698 (2011)
66. Anderberg, M.R.: Cluster analysis for application. Academic Press, New York (1973)
67. Hartigan, J.A.: Clustering Algorithms. Wiley, New York (1975)
68. Valizadegan, H., Jin, R., Jain, A.K.: Semi-supervised boosting for multi-class classification.
19th European Conference on Machine Learning (ECM), pp. 15–19 (2008)
69. Chris, D., Xiaofeng, He: Cluster merging and splitting in hierarchical clustering algorithms.
Proc. IEEE ICDM 2002, 1–8 (2002)
70. Leung, Y., Zhang, J., Xu, Z.: Clustering by scale-space filtering. IEEE Trans. Pattern Anal.
Mach. Intell. 22, 1396–1410 (2000)
71. Révész, P.: On a problem of Steinhaus. Acta Math. Acad. Scientiarum Hung. 16(3–4), 311–331 (1965)
72. Niknam, T., et al.: An efficient hybrid evolutionary optimization algorithm based on PSO and
SA for clustering. J. Zhejiang Univ. Sci. A 10(4), 512–519 (2009)
73. Niknam, T., Amiri, B.: An efficient hybrid approach based on PSO, ACO and k-means for
cluster analysis. Appl. Soft Comput. 10(1), 183–197 (2011)
74. Ding, C., He, X.: K-means clustering via principal component analysis. Proceedings of the
21th international conference on Machine learning, pp. 29 (2004)
75. Uddin, M.F., Youssef, A.M.: Cryptanalysis of simple substitution ciphers using particle swarm
optimization. IEEE Congress on Evolutionary Computation, pp. 677–680 (2006)
76. Clerc, M., Kennedy, J.: The particle swarm-explosion, stability, and convergence in a
multidimensional complex space. IEEE Trans. Evol. Comput. 6, 58–73 (2002)
77. Danziger, M., Amaral Henriques, M.A.: Computational intelligence applied on cryptology: a
brief review. Latin America Transactions IEEE (Revista IEEE America Latina) 10(3),
1798–1810 (2012)
78. Chee, Y., Xu, D.: Chaotic encryption using discrete-time synchronous chaos. Phys. Lett.
A 348(3–6), 284–292 (2006)
79. Hussein, R.M., Ahmed, H.S., El-Wahed, W.: New encryption schema based on swarm
intelligence chaotic map. Proceedings of 7th International Conference on Informatics and
Systems (INFOS), pp. 1–7 (2010)
80. Chen, G., Mao, Y.: A symmetric image encryption scheme based on 3D chaotic cat maps.
Chaos Solutions Fractals 21, 749–761 (2004)
81. Hongbo, Liu: Chaotic dynamic characteristics in swarm intelligence. Appl. Soft Comput. 7,
1019–1026 (2007)
82. Azizipanah-Abarghooeea, R., et al.: Short-term scheduling of thermal power systems using
hybrid gradient based modified teaching–learning optimizer with black hole algorithm.
Electric Power Syst. Res. Elsevier 108, 16–34 (2014)
83. Bard, J.F.: Short-term scheduling of thermal-electric generators using Lagrangian relaxation.
Oper. Res. 36(5), 756–766 (1988)
84. Yu, I.K., Song, Y.H.: A novel short-term generation scheduling technique of thermal units
using ant colony search algorithms. Int. J. Electr. Power Energy Syst. 23, 471–479 (2001)
Genetic Algorithm Based Multiobjective
Bilevel Programming for Optimal Real
and Reactive Power Dispatch Under
Uncertainty
Papun Biswas
P. Biswas (&)
Department of Electrical Engineering, JIS College of Engineering,
West Bengal University of Technology, West Bengal, India
e-mail: [email protected]
1 Introduction

The thermal power operation and management problems [36] are actually optimization problems with a multiplicity of objectives and various system constraints. The most important optimization problem in power system operation and planning is real power optimization, i.e. the economic-dispatch problem (EDP). It is to be noted that more than 75 % of the power plants throughout the world are thermal plants, where fossil fuel is used as the source for power generation. In most thermal power plants, coal is used as the main electric power generation source. But generation of electric power by burning coal produces various harmful pollutants, such as oxides of carbon, oxides of nitrogen, and oxides of sulphur. These byproducts affect not only humans but all living beings in this world. So, the economic-dispatch problem of an electric power plant is actually a combined optimization problem in which the real-power generation cost and the environmental emission produced while generating power have to be optimized simultaneously, subject to several operational constraints.
Actually, the practical loads in an electrical system may have resistance, inductance, and capacitance, or their combinations. Most of the loads in an electrical power system are reactive (inductive or capacitive) in nature. Due to the presence of reactive loads, reactive power will also always be present in the system. Figure 1 represents the voltage and current waveforms for an inductive load.
Figure 2 diagrammatically represents the real (P) and reactive (Q) power. The vector sum of real and reactive power is known as the apparent power (S). Electrical system designers have to calculate the various parameters based on the apparent power, which is greater than the real power.
P = real power
Q = Reactive Power
S = Apparent Power
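As a small worked example of this relationship (the P and Q values are assumed for illustration), the apparent power is S = √(P² + Q²) and the power factor is P/S:

import math

P = 80.0   # real power (assumed value, e.g. MW)
Q = 60.0   # reactive power (assumed value, e.g. MVAr)

S = math.hypot(P, Q)   # apparent power: |S| = sqrt(P^2 + Q^2)
pf = P / S             # power factor cos(phi), the share of S doing real work

print(S, pf)           # -> 100.0 apparent power, 0.8 power factor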
Now, reactive power is very essential for the flow of active power in the circuit. So, reactive power dispatch (RPD) is also a very important objective in power system operation and planning. The main objective of reactive power planning is to maintain a proper voltage profile and to satisfy the operational constraints at all the load buses.
Reactive power, measured in Volt-Amps-Reactive (VAR), is one of the major concerns of a modern energy management system, and its optimization has a significant influence on the economic operation of power systems.
The purpose of optimal reactive power dispatch (ORPD) is to minimize the real
power loss of the system and improve voltage profiles by satisfying load demand
and operational constraints. In power systems, ORPD problem is actually an
optimization problem with multiplicity of objectives. Here the reactive power
dispatch problem involves best utilization of the existing generator bus voltage
magnitudes, transformer tap setting and the output of reactive power sources so as
to optimize the loss and to enhance the voltage stability of the system. Generally,
the load bus voltages can be maintained within their permissible limits by adjusting
transformer taps, generator voltages, and switchable VAR sources. Also, the system
losses can be minimized via redistribution of reactive power in the system.
Now, in power system operation and planning the simultaneous combined
optimization of the above two problems (EDP and RPD) is very essential for proper
planning of the system design and operation.
In most practical optimization problems with a multiplicity of objectives, the decision parameters are inexact in nature. This inexactness is due to the inherent imprecisions in the parameters themselves as well as the imprecise nature of human judgment during the setting of the parameter values.
To cope with the above situations and to overcome the shortcomings of the
classical approaches, the concept of membership functions in fuzzy sets theory
(FST) [69] has appeared as a robust tool for solving the optimization problems.
Also, the concept of fuzzy logic [5, 6], which deals with approximate reasoning rather than fixed and exact reasoning, has been extended to handle the concept of partial truth, where the truth value may range between completely true and completely false.
In this situation, Fuzzy programming (FP) approach [73] based on FST can be
applied for achieving the solution of the real-life optimization problem. The con-
ventional FP approaches discussed by Zimmerman [74] have been further extended
by Shih and Lee [57], Sakawa and Nishizaki [55, 56], and others to solve hierar-
chical decision problems from the view point of making a balance of decision
powers in the decision making environment.
But the main drawback of the conventional FP approach is that the solution may be rejected repeatedly owing to dissatisfaction with it. Due to this repeated rejection and recalculation, the computational load and computing time increase, and the decision frequently does not come close to the highest membership value (unity), i.e. the optimal solution.
To overcome the above difficulty in solving FP problems with a multiplicity of objectives, fuzzy goal programming (FGP), an extension of conventional goal programming (GP) [35, 52] based on the goal satisficing philosophy expounded by Simon [58], was introduced by Pal and Moitra [47] to the field of conventional FP for making decisions that achieve multiple fuzzy goals efficiently in the decision making environment.
In the FGP approach, achievement of the fuzzy goals to their aspired levels, in terms of each achieving the highest membership value (unity), is considered. In the FGP model formulation, the membership functions of the defined fuzzy goals are transformed into flexible membership goals by assigning the highest membership value (unity) as the aspiration level and introducing under- and over-deviational variables to each of them, in a manner analogous to the conventional GP approach. Here, in the goal achievement function, only the under-deviational variables of the membership goals are minimized, which can easily be justified from the characteristics of the membership functions defined for the fuzzy goals.
Now, in the context of solving such optimization problems, although conventional multiobjective decision making (MODM) methods have been studied extensively in the past, it has to be realized that the objectives of the proposed problem are inherently incommensurable owing to the two opposing interests of economic power generation and emission reduction for protection of the environment from pollution, because the emission amount depends on the quality of the fossil fuel used in thermal plants.
To overcome the above situation, bilevel programming (BLP) formulation of the
problem in a hierarchical decision [37] structure can be reasonably taken into
account to make an appropriate decision in the decision making environment.
The concept of hierarchical decision problem was introduced by Burton [10] for
solving decentralized planning problems. The concept of BLP was first introduced
by Candler and Townsley [12]. Bilevel programming problem (BLPP) is a special
case of Multilevel programming problem (MLPP) of a hierarchical decision system.
In a BLPP, two decision makers (DMs) are located at two different hierarchical levels, each independently controlling one set of decision variables, and with different and perhaps conflicting objectives.
In a hierarchical decision process, the lower-level DM (the follower) executes
his/her decision powers, after the decision of the higher-level DM (the leader).
Although the leader independently optimizes its own benefits, the decision may be
affected by the reactions of the follower. As a consequence, decision deadlock
arises frequently and the problem of distribution of proper decision power is
encountered in most of the practical decision situations. In MOBLP, there will be
more than one objective in both the hierarchical levels.
Further, to overcome the computational difficulty arising from the nonlinear and competitive nature of the objectives, genetic algorithms (GAs) [25], based on natural selection and natural genetics and initially introduced by Holland [31, 32], have emerged as global solution search tools for solving complex real-world problems. The deep study
on GA based solution methods made in the past century has been well documented
by Michalewicz [44] and Deb [16] among others. The GA based solution methods
[23, 72] to fuzzy multiobjective optimization problems have been well documented
by Sakawa [54].
The GA based solution approach to BLPPs in crisp decision environment was
first studied by Mathieu et al. [43]. Thereafter, the computational aspects of using
GAs to fuzzily described hierarchal decision problems have been investigated [29,
46, 54, 55] in the past. The potential use of GAs to quadratic BLPP model has also
been studied by Pal and Chakraborti [49] in the recent past.
2 Related Works
Demand of electric power has increased in an alarming way in the recent years
owing to rapid growth of human development index across the countries in modern
world. Here it is to be mentioned that the main source of electric energy supply is
thermal power plants, where fossil-fuel is used as resource for power generation.
The thermal power system planning and operation problems are actually optimi-
zation problems with various system constraints in the environment of power
generation and dispatch on the basis of needs in society.
The general mathematical programming model for optimal power generation
was introduced by Dommel and Tinney [19]. The first mathematical model for
optimal control of reactive power flow was introduced by Peschon et al. [50].
Thereafter, various mathematical models for reactive power optimization have been
developed [22, 53]; Lee and Yang [40]; Quintana and Santos-Nieto [17, 51]. The
study on environmental power dispatch models developed from 1960s to 1970s was
surveyed by Happ [28]. Thereafter, different classical optimization models devel-
oped in the past century for environmental-economic power generation (EEPG)
problems have been surveyed [14, 42, 62] in the past.
Now, in the context of thermal power plant operations, it is worth mentioning that coal used as fossil fuel to generate power produces atmospheric emissions, namely carbon oxides (COx), sulfur oxides (SOx), and oxides of nitrogen (NOx), which are the major and harmful gaseous pollutants. Pollution affects not only humans, but also other species, including plants, on the earth.
The constructive optimization model for minimization of thermal power plant
emissions was first introduced by Gent and Lament [24]. Thereafter, the field was
explored by Sullivan and Hackett [61] among other active researchers in the area of
study.
Now, consideration of both the aspects of economic power generation and
reduction of emissions in a framework of mathematical programming was initially
studied by Zahavi and Eisenberg [70], and thereafter optimization models for EEPG
problems were investigated [11, 63] in the past.
During the 1990s, emission control problems were seriously considered, and different strategic optimization approaches were developed in view of the 1990 Clean Air Act Amendments by active researchers in the field, as well documented [20, 30, 60] in the literature. Different mathematical models for reactive power optimization were also developed during the 1990s by eminent researchers [27, 41, 67]; Bansilal and Parthasarathy [7, 40] in this field. Thereafter,
different mathematical programming approaches for real and reactive power opti-
mizations have been presented [9, 15, 21, 26, 33, 34, 59]; Abou et al. [3] and widely
circulated in the literature. Here, it is to be mentioned that in most of the previous
approaches the inherent multiobjective decision making problems are solved by
transforming them into single objective optimization problems. As a result, decision
deadlock often arises there concerning simultaneous optimization of both the
objectives.
To overcome the above difficulty, GP, a robust and flexible tool for multiobjective decision analysis based on the satisficing philosophy (coined by the Nobel laureate Simon [58]), has been studied [45] to obtain goal-oriented solutions of economic-emission power dispatch problems.
During the last decade, different multiobjective optimization methods for EEPG
problems have been studied [1, 4, 64] by considering the Clean Air Act
Amendment.
Traditional stochastic programming (SP) approaches to EEPG problems were studied [18, 68] in the past. The FP approach to EEPG problems has been discussed [8, 66]. But extensive study in this area is still at an early stage.
To solve thermal power planning problems, consideration of both the aspects of
real and reactive power generation and dispatch in the framework of a mathematical
programming model was initially studied by Jolissaint, Arvanitidis and Luenberger
[38], and thereafter optimization models for combined real and reactive power
management problems were investigated [39, 41, 65] in the past. But, the deep
study in this area is at an early stage.
GA for solving the large-scale economic dispatch was first studied by Chen and
Chang [13]. Then, several soft computing approaches to EEPG problems have also
been studied [2, 9, 26] by the active researchers in this field.
Now, it is to be observed that the objectives of power system operation and control are in high conflict with each other. In essence, optimization of the objectives in a hierarchical decision structure therefore has to be considered in the decision environment.
3 Problem Description
Let there be N generators present in the system, and let Pgi, Vi, and Ti be the decision variables for generated power, generator voltages, and transformer tap settings in the system. Then, let PD be the total demand of power (in p.u.), TL the total transmission loss (in p.u.), PL the real power loss in the system, and VD the load-bus voltage deviation associated with the system.
The objectives and constraints of the proposed P-Q management problem are
discussed as follows.
The total fuel-cost ($/h) function associated with the generation of power from all generators of the system can be expressed as:

   FC = Σ_{i=1}^{N} (ai + bi Pgi + ci Pgi^2)   (1)
In a thermal power plant operational system, various types of pollutants are discharged into the earth's environment due to the burning of coal for generation of power. The total emission (ton/h) can be expressed as:

   E = Σ_{i=1}^{N} [10^−2 (αi + βi Pgi + γi Pgi^2) + ζi exp(λi Pgi)]   (2)
The real power loss in the system can be expressed as:

   PL = Σ_{k=1}^{nl} gk [Vi^2 + Vj^2 − 2 Vi Vj cos(δi − δj)]   (3)

where nl is the number of transmission lines, gk is the conductance of the kth line, Vi and Vj are the voltage magnitudes, and δi and δj are the voltage phase angles at the end buses i and j of the kth line, respectively.
The improvement of the voltage profile amounts to minimizing the load-bus voltage deviation (VD) from 1.0 per unit:

   VD = Σ_{i ∈ NL} |Vi − 1.0|   (4)
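For illustration, the three closed-form objectives above translate directly into code; the coefficient values below are placeholders (the actual test-system data appear later, in Tables 1–5), and the helper names are hypothetical:

import math

# Placeholder coefficients (assumed; actual data are in Tables 1-5): one per generator.
a   = [10.0, 10.0, 20.0]          # fuel-cost a_i  ($/h)
b   = [200.0, 150.0, 180.0]       # fuel-cost b_i  ($/p.u. h)
c   = [100.0, 120.0, 40.0]        # fuel-cost c_i  ($/p.u.^2 h)
alp = [4.091, 2.543, 4.258]       # emission alpha_i
bet = [-5.554, -6.047, -5.094]    # emission beta_i
gam = [6.490, 5.638, 4.586]       # emission gamma_i
zet = [2.0e-4, 5.0e-4, 1.0e-6]    # emission zeta_i
lam = [2.857, 3.333, 8.0]         # emission lambda_i

def fuel_cost(Pg):
    # Eq. (1): FC = sum_i (a_i + b_i Pg_i + c_i Pg_i^2), in $/h.
    return sum(ai + bi * p + ci * p * p for ai, bi, ci, p in zip(a, b, c, Pg))

def emission(Pg):
    # Eq. (2): 1e-2 times the quadratic part plus the exponential term, in ton/h.
    return sum(1e-2 * (al + be * p + ga * p * p) + ze * math.exp(la * p)
               for al, be, ga, ze, la, p in zip(alp, bet, gam, zet, lam, Pg))

def voltage_deviation(V_load):
    # Eq. (4): VD = sum over load buses of |V_i - 1.0| (p.u.).
    return sum(abs(v - 1.0) for v in V_load)

Pg = [0.5, 0.6, 0.8]   # a trial generation schedule, p.u.
print(fuel_cost(Pg), emission(Pg), voltage_deviation([0.98, 1.02, 1.01]))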
The system constraints which are commonly involved with the problem are defined
as follows.
The total power generated must cover the total demand (PD) and the total transmission loss (TL) inherent to a thermal power generation system. The total power balance constraint can be written as:

   Σ_{i=1}^{N} Pgi − (PD + TL) = 0   (5)
where the transmission loss is expressed through Kron's loss formula:

   TL = Σ_{i=1}^{N} Σ_{j=1}^{N} Pgi Bij Pgj + Σ_{i=1}^{N} B0i Pgi + B00   (6)

where Bij, B0i, and B00 are called Kron's loss coefficients or B-coefficients [66] associated with the power transmission network.
The real power balance constraint is as follows:

   Pgi − Pdi − Vi Σ_{j=1}^{NB} Vj [Gij cos(δi − δj) + Bij sin(δi − δj)] = 0   (7)

and the reactive power balance constraint is:

   Qgi − Qdi − Vi Σ_{j=1}^{NB} Vj [Gij sin(δi − δj) + Bij cos(δi − δj)] = 0   (8)
where i = 1, 2, …, NB; ‘NB’ is the number of buses; Pgi and Qgi are the i-th
generator real and reactive power respectively; Pdi and Qdi are the i-th load real and
reactive power respectively; Gij and Bij are the transfer conductance and suscep-
tance between bus i and bus j respectively.
In an electric power generation and dispatch system, the constraints on the generators can be written as:

   Pgi^min ≤ Pgi ≤ Pgi^max,
   Qgi^min ≤ Qgi ≤ Qgi^max,   (9)
   Vgi^min ≤ Vgi ≤ Vgi^max,  i = 1, 2, …, N

where Pgi, Qgi, and Vgi are the active power, reactive power, and generator bus voltage, respectively, and N is the number of generators in the system.
The transformer tap-setting constraints are:

   Ti^min ≤ Ti ≤ Ti^max,  i = 1, …, NT   (10)
Now, MOBLP formulation of the proposed problem for minimizing the objec-
tive functions is presented in the following Sect. 4.
In a BLP model formulation, the vector of decision variables is divided into two distinct vectors, which are assigned separately to the DMs for individual control.
Let D be the vector of decision variables in a thermal power supply system.
Then, let DL and DF be the vectors of decision variables controlled independently
by the leader and follower, respectively, in the decision situation, where L and F
stand for leader and follower, respectively.
Then, the BLP model of the problem appears as [49]:

Find D(DL, DF) so as to:

   Minimize_{DL} FC = Σ_{i=1}^{N} (ai + bi Pgi + ci Pgi^2)

   Minimize_{DL} PL = Σ_{k=1}^{nl} gk [Vi^2 + Vj^2 − 2 Vi Vj cos(δi − δj)]

(Leader's problem)

and, for given DL, DF solves:

   Minimize_{DF} E = Σ_{i=1}^{N} [10^−2 (αi + βi Pgi + γi Pgi^2) + ζi exp(λi Pgi)]

   Minimize_{DF} VD = Σ_{i ∈ NL} |Vi − 1.0|

(Follower's problem)   (13)
In the literature of GAs, there is a variety of schemes [25, 44] for generating a new population with the use of different operators: selection, crossover, and mutation.
In the present GA scheme, binary representation of each candidate solution is
considered in the genetic search process. The initial population (the initial feasible
solution individuals) is generated randomly. The fitness of each feasible solution
individual is then evaluated with the view to optimize an objective function in the
decision making context.
The basic steps of the GA scheme with the core functions adopted in the solution
search process are presented in the following algorithmic steps.
where Zk represents the objective function of the kth-level DM, and the subscript v is used to indicate the fitness value of the vth chromosome, v = 1, 2, …, pop_size.
The best chromosome with the largest fitness value at each generation is determined as E* = max{eval(Ev) : v = 1, 2, …, pop_size}.
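For concreteness, a minimal binary-coded GA skeleton of the scheme just described is sketched below, with tournament selection, one-point crossover, and bit-flip mutation; all parameter values and the toy fitness function are assumptions for illustration and do not reproduce the GAOT settings used later in the chapter.

import random

# Assumed parameters: population size, chromosome length, crossover/mutation rates.
POP, BITS, PC, PM, GENS = 50, 20, 0.8, 0.02, 100

def decode(bits, lo=0.05, hi=1.2):
    # Map a binary chromosome to a real decision variable in [lo, hi].
    return lo + int("".join(map(str, bits)), 2) * (hi - lo) / (2 ** BITS - 1)

def fitness(bits):
    # Toy objective: maximizing this fitness minimizes (x - 0.7)^2.
    x = decode(bits)
    return -(x - 0.7) ** 2

pop = [[random.randint(0, 1) for _ in range(BITS)] for _ in range(POP)]
for _ in range(GENS):
    # Selection: binary tournament on the fitness values.
    parents = [max(random.sample(pop, 2), key=fitness) for _ in range(POP)]
    # One-point crossover with probability PC.
    offspring = []
    for p1, p2 in zip(parents[::2], parents[1::2]):
        if random.random() < PC:
            cut = random.randrange(1, BITS)
            offspring += [p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]]
        else:
            offspring += [p1[:], p2[:]]
    # Bit-flip mutation with probability PM per gene.
    pop = [[1 - g if random.random() < PM else g for g in ch] for ch in offspring]

best = max(pop, key=fitness)   # best chromosome of the final generation
print(decode(best), fitness(best))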
In the power generation decision context, it is assumed that the objectives at both levels are motivated to cooperate with each other, and each DM optimizes his/her benefit while paying attention to the benefit of the other. Here, since the leader is in the leading position in making his/her own decision, relaxation of the leader's decision is essentially needed so that the follower can make a reasonable decision and optimize the objective functions to a certain level of satisfaction. Therefore, relaxation of the individual optimal values of both of the leader's objectives, as well as of the decision vector DL controlled by the leader, up to certain tolerance levels, needs to be considered to make a reasonable balance of the execution of the decision powers of the DMs.
To cope with the above situation, a fuzzy version of the problem in (13) would be an effective one in the decision environment.
The fuzzy description of the problem is presented in the following section.
In a fuzzy decision situation, the objective functions are transformed into fuzzy goals by means of assigning an imprecise aspiration level to each of them.
In the sequel of making the decision, since the individual minimum values of the objectives are always acceptable to each DM, the independent best solutions of the leader and the follower are determined first as (DL^lb, DF^lb; FC^lb, PL^lb) and (DL^fb, DF^fb; E^fb, VD^fb), respectively, by using the GA scheme, where lb and fb stand for leader's best and follower's best, respectively.
Then, the fuzzy goals of the leader and the follower can be successively defined as:

   FC ≲ FC^lb and PL ≲ PL^lb
   E ≲ E^fb and VD ≲ VD^fb   (14)

where '≲' refers to the fuzziness of an aspiration level and is to be understood as 'essentially less than' [73].
Again, since the maximum values of the objectives, when calculated in isolation by the DMs, would be the most dissatisfactory ones, the worst solutions of the leader and the follower can be obtained by using the same GA scheme as (DL^lw, DF^lw; FC^lw, PL^lw) and (DL^fw, DF^fw; E^fw, VD^fw), respectively, where lw and fw stand for leader's worst and follower's worst, respectively.
Then, FC^lw, PL^lw, E^fw, and VD^fw would be the upper-tolerance limits of achieving the corresponding fuzzy goal levels defined in (14). Similarly, the fuzzy goal on the leader's decision vector is defined as:

   DL ≲ DL^lb   (15)
In the fuzzy decision situation, it may be noted that the increase in the values of the fuzzily described goals defined by the goal vector in (15) would never be more than the corresponding upper bounds of the power generation capacity ranges defined in (9).
Let DL^t (DL^t < DL^max) be the vector of upper-tolerance limits of achieving the goal levels of the vector of fuzzy goals defined in (15).
Now, the fuzzy goals are to be characterized by the respective membership
functions for measuring their degree of achievements in a fuzzy decision
environment.
The membership function representation of the fuzzy objective goal of the fuel cost function under the control of the leader appears as:

   μFC[FC] = 1,                              if FC ≤ FC^lb
             (FC^lw − FC)/(FC^lw − FC^lb),   if FC^lb < FC ≤ FC^lw   (16)
             0,                              if FC > FC^lw

where (FC^lw − FC^lb) is the tolerance range for achievement of the fuzzy goal defined in (14).
Similarly, the membership function of the fuzzy power-loss goal under the control of the leader is:

   μPL[PL] = 1,                              if PL ≤ PL^lb
             (PL^lw − PL)/(PL^lw − PL^lb),   if PL^lb < PL ≤ PL^lw   (17)
             0,                              if PL > PL^lw

where (PL^lw − PL^lb) is the tolerance range for achievement of the fuzzy goal defined in (14).
Similarly, the membership function representations of the fuzzy objective goals of the emission and voltage profile improvement functions under the control of the follower successively appear as:

   μE[E] = 1,                          if E ≤ E^fb
           (E^fw − E)/(E^fw − E^fb),   if E^fb < E ≤ E^fw   (18)
           0,                          if E > E^fw

where (E^fw − E^fb) is the tolerance range for achievement of the fuzzy goal defined in (14).
   μVD[VD] = 1,                              if VD ≤ VD^fb
             (VD^fw − VD)/(VD^fw − VD^fb),   if VD^fb < VD ≤ VD^fw   (19)
             0,                              if VD > VD^fw

where (VD^fw − VD^fb) is the tolerance range for achievement of the fuzzy goal defined in (14).
The membership function of the fuzzy decision vector DL of the leader appears as:

   μDL[DL] = 1,                              if DL ≤ DL^lb
             (DL^t − DL)/(DL^t − DL^lb),     if DL^lb < DL ≤ DL^t   (20)
             0,                              if DL > DL^t

where (DL^t − DL^lb) is the vector of tolerance ranges for achievement of the fuzzy decision variables associated with DL defined in (15).
Note 1: μ[·] represents a membership function.
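All of the membership functions (16)–(20) share the same linear shape, so a single helper covers them; the sketch below (hypothetical helper name) evaluates such a membership value from a goal's best and worst levels:

def linear_membership(value, best, worst):
    # Linear membership of a minimization-type fuzzy goal (cf. Eqs. 16-20):
    # 1 at or below the best level, 0 at or above the worst, linear in between.
    if value <= best:
        return 1.0
    if value >= worst:
        return 0.0
    return (worst - value) / (worst - best)

# Example with the fuel-cost bounds reported later in the case example.
print(linear_membership(650.0, best=595.9804, worst=705.2694))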
Now, minsum FGP formulation of the proposed problem is presented in the
following section.
Find D(DL, DF) so as to:

   Minimize Z = Σ_{k=1}^{4} wk^− dk^− + w5^− d5^−

and satisfy:

   μFC: (FC^lw − FC)/(FC^lw − FC^lb) + d1^− − d1^+ = 1,
   μPL: (PL^lw − PL)/(PL^lw − PL^lb) + d2^− − d2^+ = 1,
   μE:  (E^fw − E)/(E^fw − E^fb) + d3^− − d3^+ = 1,   (21)
   μVD: (VD^fw − VD)/(VD^fw − VD^fb) + d4^− − d4^+ = 1,
   μDL: (DL^t − DL)/(DL^t − DL^lb) + d5^− − d5^+ = I

where dk^−, dk^+ ≥ 0 (k = 1, …, 4) represent the under- and over-deviational variables, respectively, associated with the respective membership goals, and d5^−, d5^+ ≥ 0 represent the vectors of under- and over-deviational variables, respectively, associated with the membership goals defined for the vector of decision variables in DL. I is a column vector with all elements equal to 1, whose dimension depends on the dimension of DL. Z represents the goal achievement function; wk^− > 0 (k = 1, 2, 3, 4) denote the relative numerical weights of importance of achieving the aspired goal levels, and w5^− > 0 is the vector of numerical weights associated with d5^−; they are determined as the inverses of the tolerance ranges [48] for achievement of the goal levels in the decision making situation.
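A small sketch of how the goal achievement function Z of model (21) is evaluated for a trial solution is given below; the membership values are placeholders, while the tolerance ranges are taken from the case example reported later in the chapter:

# Evaluating the minsum goal achievement function Z for one trial solution.
goals = [
    (0.82, 705.2694 - 595.9804),   # fuel-cost goal: (mu_k, tolerance range)
    (0.75, 6.47 - 4.55),           # power-loss goal
    (0.68, 0.2533 - 0.1952),       # emission goal
    (0.90, 1.95 - 0.09),           # voltage-deviation goal
]

Z = 0.0
for mu, tol in goals:
    w = 1.0 / tol                  # weight w_k = inverse of the tolerance range [48]
    d_under = max(0.0, 1.0 - mu)   # under-deviation d_k from the aspired level 1
    Z += w * d_under               # only under-deviations enter Z

print("goal achievement function Z =", round(Z, 4))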
Now, the effective use of the minsum FGP model in (21) is demonstrated via a
case example presented in the next section.
The standard IEEE 30-bus, 6-generator test system [1] is considered to illustrate the potential use of the approach.
The single-line diagram of the IEEE 30-bus test system is shown in Fig. 4.
The system shown in Fig. 4 has 6 generators and 41 lines, and the total system demand of the 21 load buses is 2.834 p.u. The detailed data are given in Tables 1, 2, 3, 4 and 5.
The B-coefficients [66] are presented as follows:

   B = [  0.1382  −0.0299   0.0044  −0.0022  −0.0010  −0.0008
         −0.0299   0.0487  −0.0025   0.0004   0.0016   0.0041
          0.0044  −0.0025   0.0182  −0.0070  −0.0066  −0.0066
         −0.0022   0.0004  −0.0070   0.0137   0.0050   0.0033
         −0.0010   0.0016  −0.0066   0.0050   0.0109   0.0005
         −0.0008   0.0041  −0.0066   0.0033   0.0005   0.0244 ]
The leader's fuel-cost objective is:

   Minimize FC = Σ_{i=1}^{6} (ai + bi Pgi + ci Pgi^2)

(leader's objective 1)

subject to

   Pg1 + Pg2 + Pg3 + Pg4 + Pg5 + Pg6 − (2.834 + TL) = 0
Now, employing the proposed GA scheme, the individual best and worst solutions of the leader's objectives are determined.
The computer program is developed in MATLAB, and GAOT (Genetic Algorithm Optimization Toolbox) under MATLAB R2010a is used for the computations. The execution is made on an Intel Pentium IV with 2.66 GHz clock pulse and 4 GB RAM.
The following GA parameter values are used during the execution of the problem at its different stages; the parameter values used in the genetic algorithm solution are given in Table 6.
The individual best and worst solutions of the fuel-cost function in the leader's objectives are obtained as follows (Table 7).
The individual best and worst solutions of the real-power loss function are obtained subject to:

   Pgi − Pdi − Vi Σ_{j=1}^{NB} Vj [Gij cos(δi − δj) + Bij sin(δi − δj)] = 0

   Qgi − Qdi − Vi Σ_{j=1}^{NB} Vj [Gij sin(δi − δj) + Bij cos(δi − δj)] = 0

where i = 1, 2, …, 30; Pgi and Qgi are the ith generator real and reactive power, respectively; Pdi and Qdi are the ith load real and reactive power, respectively; Gij and Bij are the transfer conductance and susceptance between bus i and bus j, respectively.
   Pgi^min ≤ Pgi ≤ Pgi^max,  Qgi^min ≤ Qgi ≤ Qgi^max
(Generator constraints)

   Ti^min ≤ Ti ≤ Ti^max,  i = 11, 12, 15, 36
(Transformer constraints)

   Qci^min ≤ Qci ≤ Qci^max,  i = 11, 12, 15, 36
(Security constraints)

Now, employing the proposed GA scheme, the individual best and worst solutions of the objective are determined (Table 8).
Similarly, the individual best and worst solutions of the emission function in the follower's problem can be obtained from:

   Minimize E = Σ_{i=1}^{6} [10^−2 (αi + βi Pgi + γi Pgi^2) + ζi exp(λi Pgi)]

(follower's objective 1)

subject to the power balance and system constraints stated above, and

   0.05 ≤ Pg1 ≤ 0.50,  0.05 ≤ Pg2 ≤ 0.60,
   0.05 ≤ Pg3 ≤ 1.00,  0.05 ≤ Pg4 ≤ 1.20,
   0.05 ≤ Pg5 ≤ 1.00,  0.05 ≤ Pg6 ≤ 0.60

Now, employing the proposed GA scheme, the individual best and worst solutions of the objective are determined.
The individual best and worst solutions of the voltage profile improvement function in the follower's problem are obtained as:

Find {Vk, Tl, (Qc)m; k = 1, 2, 5, 8, 11, 13; l = 11, 12, 15, 36; m = 10, 12, 15, 17, 20, 21, 24, 29} so as to:

   Minimize VD = Σ |Vi − 1.0|,  i = 2, 3, 4, 5, 7, 8, 10, 12, 14, 15, 16, 17, 18, 19, 20, 21, 23, 24, 26, 29, 30

(follower's objective 2)

subject to
   Pgi − Pdi − Vi Σ_{j=1}^{NB} Vj [Gij cos(δi − δj) + Bij sin(δi − δj)] = 0

   Qgi − Qdi − Vi Σ_{j=1}^{NB} Vj [Gij sin(δi − δj) + Bij cos(δi − δj)] = 0
where i = 1, 2, …, 30; Pgi and Qgi are the ith generator real and reactive power, respectively; Pdi and Qdi are the ith load real and reactive power, respectively; Gij and Bij are the transfer conductance and susceptance between bus i and bus j, respectively.
   Pgi^min ≤ Pgi ≤ Pgi^max,  Qgi^min ≤ Qgi ≤ Qgi^max,
   Vgi^min ≤ Vgi ≤ Vgi^max,  i = 1, 2, …, 6
(Generator constraints)

   Ti^min ≤ Ti ≤ Ti^max,  i = 11, 12, 15, 36
(Transformer constraints)

   Qci^min ≤ Qci ≤ Qci^max,  i = 11, 12, 15, 36
(Security constraints)

   VLi^min ≤ VLi ≤ VLi^max,  i = 2, 3, 4, 5, 7, 8, 10, 12, 14, 15, 16, 17, 18, 19, 20, 21, 23, 24, 26, 29, 30
and satisfy:

   μFC: (705.2694 − Σ_{i=1}^{N} (ai + bi Pgi + ci Pgi^2)) / (705.2694 − 595.9804) + d1^− − d1^+ = 1

   μPL: (6.47 − Σ_{k=1}^{41} gk [Vi^2 + Vj^2 − 2 Vi Vj cos(δi − δj)]) / (6.47 − 4.55) + d2^− − d2^+ = 1

   μE: (0.2533 − Σ_{i=1}^{N} [10^−2 (αi + βi Pgi + γi Pgi^2) + ζi exp(λi Pgi)]) / (0.2533 − 0.1952) + d3^− − d3^+ = 1

   μVD: (1.95 − VD) / (1.95 − 0.09) + d4^− − d4^+ = 1

where VD = Σ_{i ∈ NL} |Vi − 1.0|, i = 2, 3, 4, 5, 7, 8, 10, 12, 14, 15, 16, 17, 18, 19, 20, 21, 23, 24, 26, 29 and 30.
   μPg1: (0.25 − Pg1) / (0.25 − 0.1220) + d5^− − d5^+ = 1
   μPg2: (0.35 − Pg2) / (0.35 − 0.2863) + d6^− − d6^+ = 1
   μPg3: (0.9321 − Pg3) / (0.9321 − 0.5823) + d7^− − d7^+ = 1
   μPg4: (0.9901 − Pg4) / (0.9901 − 0.6254) + d8^− − d8^+ = 1   (22)
   μPg5: (0.8 − Pg5) / (0.8 − 0.5236) + d9^− − d9^+ = 1
   μPg6: (0.45 − Pg6) / (0.45 − 0.3518) + d10^− − d10^+ = 1

   di^−, di^+ ≥ 0, i = 1, 2, …, 10.
The results show that a satisfactory decision is achieved here from the viewpoint of balancing the decision powers of the DMs on the basis of the order of hierarchy adopted in the decision making situation (Fig. 5).
8 Conclusions
In this chapter, a GA-based FGP approach for modeling and solving the optimal real and reactive power flow management problem in the framework of MOBLP in a hierarchical decision structure is presented.
The main advantage of the proposed approach is that the BLP formulation of the problem within the framework of a multiobjective decision making model makes it possible to take individual decisions regarding optimization of the objectives on the basis of the hierarchy assigned to them.
Again, the computational load and approximation error inherent to conventional linearization approaches can be avoided here with the use of the GA-based solution method.
In the framework of the proposed model, consideration of other objectives and environmental constraints may be taken into account, and the possible aspects of formulating MLPPs within a hierarchical decision structure for power plant operations may be a problem of future study.
Further, sensitivity analysis with the assignment of objectives to DMs, along with the control of decision variables at different hierarchical levels on the basis of needs in the decision horizon, may be an open problem in future.
Finally, it is hoped that the solution approach presented here may lead to future research for proper planning of electric power generation and dispatch.
A Monitoring-Maintenance Approach
Based on Fuzzy Petri Nets
in Manufacturing Systems with Time
Constraints
Abstract Maintenance and its integration with control and monitoring systems enable the improvement of systems functioning, regarding availability, efficiency, productivity and quality. This paper proposes a monitoring-maintenance approach based on fuzzy Petri nets (PNs) for manufacturing job-shops with time constraints. In such systems, operation times are included between a minimum and a maximum value. In this context, we propose a new fuzzy Petri net called Fuzzy Petri Net for Maintenance (FPNM). This tool is able to identify and select maintenance activities of a discrete event system with time constraints, using a temporal fuzzy approach. The maintenance module consists of P-time PNs and a fault tree. The first is used to model the normal behaviour of the system through the temporal spectrum of the marking. The second model corresponds to diagnosis activities. Finally, to illustrate the effectiveness and accuracy of the proposed maintenance approach, two industrial examples are depicted.
1 Introduction
The demands for products with higher quality and competitive prices have led to the development of complex manufacturing systems. A consequence is that the number of failures tends to increase, as well as the time required to locate and repair them. The occurrence of failures during nominal operation can deeply modify the
From the modelling point of view, P-TPNs were introduced in 1996 in order to model Dynamic Discrete Event Systems (DDES) including sojourn time constraints.
Definition 1 [14] The formal definition of a P-TPN is given by a pair ⟨R, I⟩ where:
• R is a marked Petri net,
• I : P → Q⁺ × (Q⁺ ∪ {+∞}), where ISᵢ = [aᵢ, bᵢ] defines the static interval of the staying time of a mark in the place pᵢ belonging to the set of places P (Q⁺ is the set of positive rational numbers). A mark in the place pᵢ is taken into account in transition validation when it has stayed in pᵢ at least a duration aᵢ and no longer than bᵢ. After the duration bᵢ, the token will be dead.
In manufacturing job-shops with time constraints, each operation is associated with a time interval [aᵢ, bᵢ] (u.t.: unit time). Its lower bound indicates the minimum time needed to execute the operation, and its upper bound sets the maximum time not to be exceeded in order to avoid deterioration of product quality. Consequently, P-TPNs have the capability of modelling time intervals and of deducing a set of scenarios when time constraints are violated.
The production is subject to many uncertainties arising from the processes, the
operators or the variations of quality of the products. A production is seldom
perfectly repetitive, due to uncertainties on the process. However, a regular pro-
duction is required in order to maintain product quality.
Authors who have treated uncertainties have mainly studied two kinds of disturbances: disturbances on the equipment, in particular machine breakdowns, and disturbances concerning work, in particular changes in the operational durations [24]. For all these reasons, a possibility function representing the uncertainty over the effective residence time (qᵢ) of a token in a place pᵢ is proposed. This function makes it possible to highlight zones of certainty for an operational duration and helps the human agent (or supervisor) in charge of detecting failures and deciding reconfiguration/repair actions [18].
In order to quantify the set of possible sojourn times of the token in the place pᵢ, a fuzzy set A representing the uncertainty on the effective sojourn time qᵢ of the token in the place pᵢ is proposed (Fig. 1).
This quantification allows us to define a measure of the possibility with which the sojourn time qᵢ is verified. These results (Fig. 1) make it possible to highlight zones of certainty for operation durations; a high possibility value for the effective sojourn time indicates normal behaviour of the monitored system, whereas a low value implies the possible detection of a failure symptom (behavioural deviation).
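A minimal sketch of such a possibility function is given below, assuming the trapezoidal shape of Fig. 1; the breakpoints used in the example ([12, 20] with certainty zone [13, 17] for place p2 of the packaging unit) are taken from Fig. 5, and the function name is illustrative.

```python
# Possibility of the effective sojourn time q of a token in a place with
# static interval [a, b] and certainty zone [q_min, q_max] (Fig. 1 shape):
# 0 outside [a, b] (uncompleted operation / dead token), 1 on the certainty
# zone (correct product), linear in the two degraded-production zones.

def sojourn_possibility(q, a, q_min, q_max, b):
    if q < a or q > b:
        return 0.0                         # uncompleted operation or dead token
    if q < q_min:
        return (q - a) / (q_min - a)       # degraded production (early zone)
    if q <= q_max:
        return 1.0                         # correct product
    return (b - q) / (b - q_max)           # degraded production (late zone)

# Place p2 of the packaging machine: IS2 = [12, 20], certainty zone [13, 17].
for q in (12.5, 15, 19, 21):
    print(q, round(sojourn_possibility(q, 12, 13, 17, 20), 3))
```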
Based on the fuzzy model of Fig. 1, all system scenarios are developed. The scenarios consider all possible deviations, which can occur due to failures of the production equipment.

(Fig. 1: fuzzy set over the effective sojourn time qᵢ: below aᵢ the operation is uncompleted; between aᵢ and qᵢ min, and between qᵢ max and bᵢ, the production is degraded; between qᵢ min and qᵢ max the product is correct; beyond bᵢ a problem of the production equipment is indicated.)
The control objects can be connected to the P-TPN model of the manufacturing system and can be applied to check and validate the adequateness and correctness of all operations introduced in the system. Two general types of control objects can be utilized for this purpose:
Watch-Dog Objects
These control objects are able to restrict the maximum and the minimum time periods allowed for the execution of particular operations. If the time restrictions are violated, the system is capable of generating an immediate reaction (similar to the alarm cases).
Time-Out Objects
These control objects represent a particular kind of watch-dog that restricts only the maximal time periods allowed for a particular operation, and they react in the same way as the watch-dogs.
In the P-TPN, watch-dog models can be connected to the places marking the beginning and the end of every operation that needs strict control of its execution time period. Thus, in manufacturing systems with time constraints, any detection of a constraint violation (possible defect) can be modelled by such specific mechanisms, named watch-dogs. Detection then takes place at an early error condition, i.e. within the acceptable deviation and before a failure occurs; this study therefore employs the uncertainty of the sojourn time in order to perform early failure detection.
Definition 1 [11] A fault tree FT is a directed acyclic graph defined by the tuple {Eᵢ, Gᵢ, Dᵢ, TOPᵢ}. The union of the sets Gᵢ (logical gates) and Eᵢ (events) represents the nodes of the graph; Dᵢ is a set of directed edges, each of which can only connect an event to the input of a logical gate, or the output of a logical gate to an event. A top event TOPᵢ is an event of the fault tree FTᵢ that is not the input of any logic gate, i.e. there are no edges that come out of the top event. The nodes of a fault tree are connected through logical gates. In this paper we consider only static fault trees, i.e. fault trees in which the time variable does not appear; therefore, only the AND and the OR gates will be treated.
Definition 2 [11] Let ANDᵢ be an AND gate with n inputs INₖANDᵢ, 1 ≤ k ≤ n, and output OUTANDᵢ. Let Pin(k, i) be the probability associated with the input INₖANDᵢ and POUT ANDᵢ be the probability associated with the output of ANDᵢ. If the inputs to the AND gate are mutually independent, the probability associated with the output can be calculated as follows:

$$P_{OUT}^{AND_i} = \prod_{k=1}^{n} P_{in}(k, i) \qquad (1)$$
Definition 3 [11] Let ORᵢ be an OR gate with n inputs INₖORᵢ, 1 ≤ k ≤ n, and output OUTORᵢ. Let Pin(k, i) be the probability associated with the input INₖORᵢ and POUT ORᵢ be the probability associated with the output of ORᵢ. If the inputs to the OR gate are all mutually independent, the output can be calculated as follows:

$$P_{OUT}^{OR_i} = 1 - \prod_{k=1}^{n} \left(1 - P_{in}(k, i)\right) \qquad (2)$$
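A direct reading of Eqs. (1) and (2) in code, on an illustrative tree with the OR/AND structure of the examples treated later in this chapter (the event probabilities themselves are made up):

```python
# Point-probability propagation through static fault-tree gates, assuming
# mutually independent inputs, as in Eqs. (1) and (2).
from functools import reduce

def p_and(probs):
    """Eq. (1): output probability of an AND gate."""
    return reduce(lambda acc, p: acc * p, probs, 1.0)

def p_or(probs):
    """Eq. (2): output probability of an OR gate."""
    return 1.0 - reduce(lambda acc, p: acc * (1.0 - p), probs, 1.0)

# Illustrative tree: top = OR(ds1, ds2), ds1 = OR(e1, e2), ds2 = AND(e3, e4).
ds1 = p_or([1.0e-3, 5.0e-4])
ds2 = p_and([4.0e-3, 3.0e-3])
print(p_or([ds1, ds2]))
```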
For each α-level of the fuzzy number which represents a probability, the model
is run to determine the minimum and maximum possible values of the output. This
information is then directly used to construct the corresponding membership
function of the output.
$$\tilde{Z} = \tilde{X} + \tilde{Y} \;\Rightarrow\; [Z_L^{(\alpha)}, Z_R^{(\alpha)}] = [X_L^{(\alpha)} + Y_L^{(\alpha)},\; X_R^{(\alpha)} + Y_R^{(\alpha)}] \qquad (3)$$

$$\tilde{Z} = \tilde{X} \cdot \tilde{Y} \;\Rightarrow\; [Z_L^{(\alpha)}, Z_R^{(\alpha)}] \qquad (4)$$

with:

$$Z_L^{(\alpha)} = \min\left(X_L^{(\alpha)} Y_L^{(\alpha)},\; X_R^{(\alpha)} Y_L^{(\alpha)},\; X_L^{(\alpha)} Y_R^{(\alpha)},\; X_R^{(\alpha)} Y_R^{(\alpha)}\right)$$

$$Z_R^{(\alpha)} = \max\left(X_L^{(\alpha)} Y_L^{(\alpha)},\; X_R^{(\alpha)} Y_L^{(\alpha)},\; X_L^{(\alpha)} Y_R^{(\alpha)},\; X_R^{(\alpha)} Y_R^{(\alpha)}\right)$$
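A sketch of this α-cut arithmetic follows: for each α-level, a fuzzy probability is an interval [L, R], Eqs. (3) and (4) give the interval sum and product, and running a gate formula such as Eq. (2) level by level reconstructs the output membership function, as described above. The triangular parameters below are illustrative.

```python
# Alpha-cut interval arithmetic (Eqs. (3)-(4)) and a fuzzy OR gate built on it.

def add_cut(x, y):
    """Eq. (3): alpha-cut of X + Y."""
    return (x[0] + y[0], x[1] + y[1])

def mul_cut(x, y):
    """Eq. (4): alpha-cut of X * Y (min/max over the four corner products)."""
    p = [x[0] * y[0], x[1] * y[0], x[0] * y[1], x[1] * y[1]]
    return (min(p), max(p))

def or_cut(x, y):
    """Eq. (2) on one alpha-cut: 1 - (1 - X)(1 - Y), intervals in [0, 1]."""
    prod = mul_cut((1.0 - x[1], 1.0 - x[0]), (1.0 - y[1], 1.0 - y[0]))
    return (1.0 - prod[1], 1.0 - prod[0])

def cut(a, m, b, alpha):
    """Alpha-cut of a triangular fuzzy number (a, m, b)."""
    return (a + alpha * (m - a), b - alpha * (b - m))

for alpha in (0.0, 0.5, 1.0):   # min/max output values per alpha level
    print(alpha, or_cut(cut(1e-3, 1.5e-3, 2e-3, alpha),
                        cut(1e-5, 1.3e-5, 2e-5, alpha)))
```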
Reliability and life are two major elements of maintenance tasks. Reliability theory can also assist maintenance engineers in judging the operational status of equipment and in developing safe response measures to prevent accidents during shutdown procedures [15]. Thus, FTA provides a basis for further maintenance of manufacturing systems with time constraints, thereby enabling engineers to conduct additional tests to determine a proper reliability distribution model for further analysis and application.
According to the failure records of well-documented manufacturing systems, maintenance tasks are generally categorized as adjustment, repair, and replacement [15]. While failure distribution characteristics are analyzed using fault tree analysis, failure distribution modes and Weibull distributions are incorporated into a system reliability model, which is then tested and analyzed to establish a proper reliability distribution model.
In a manufacturing system, failure will occur if the degradation level exceeds the permissible value. Therefore, maintenance is defined as a strategy to maintain available or operational conditions of a facility using all possible methods and means, or to restore its functions after trouble and failures [22].
Much development work has been undertaken in the maintenance field. Recovery tools have been researched [15] and their application to failure prevention is well reviewed. The proposed recovery tool is inspired by the research of Minca [21].
To model the recovery functions, a fuzzy PN model able to integrate the uncertainty on the sojourn time (qᵢ) and the fuzzy probabilities of the monitored system (Pᵢ), related to a base of fuzzy logic rules, is defined (Fig. 3).
The fuzzy Petri net for maintenance (FPNM) is defined as the n-tuple ⟨P, T, Pr, Q, RA, F, Ψ, Ω, Δ, M0⟩ with:
P = pₓ ∪ p_y the finite set of input places pₓ and output places p_y;
T = {t₁, t₂, …, tₙ} a collection of transitions; a transition tᵢ is specialized in inference/aggregation operations of logic rules;
Pr = ∪_{e=1}^{z} Pr_e the finite set of the input variables "fuzzy probability";
Q = ∪_{f=1}^{r} q_f the subsets of input variables "sojourn time";
RA = ∪_{g=1}^{s} ra_g the subsets of output variables "recovery action";
Pr (resp. Q) and RA are subsets of variables that appear respectively in the antecedence and in the consequence of the fuzzy rules F_w;
F = ∪_{w=1}^{a} F_w, with F_w : Pr ∪ Q → RA, the set of fuzzy logic rules;
Ψ = (Ψ11, Ψ12, …, Ψz1, Ψz2, …, Ψze) the finite set of membership functions defined on the universe [0, 1] of the input variables "fuzzy probability" Pr = (Pr1, Pr2, …, Prz); "e" represents the number of input variables Pr;
Ω = (Ω11, Ω12, …, Ωf1, Ωf2, …, Ωfr) the finite set of membership functions defined on the universe [0, 1] of the second input variable "sojourn time"; "r" is the number of input variables "sojourn time";
Δ = (Δ11, Δ12, …, Δg1, Δg2, …, Δgs) the finite set of membership functions defined on the universe [0, 1] of the output variable "recovery action"; "s" is the number of output variables;
M0 the initial marking of the input places pᵢ ∈ Pₓ.
Each input or output place of the FPNM is associated with a fuzzy description. For the input places, we describe the marking variable of the place, whereas for the output places we describe the recovery action.
In the FPNM, each base of logic rules F represents the fuzzy implications describing the knowledge base of the expert. Each implication follows the "if-then" model and represents the logical dependence of the variables Pr (resp. Q and RA) associated with the fuzzy sets Ψ (resp. Ω and Δ).
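A minimal data-structure sketch of such a rule base follows; the linguistic terms used are illustrative placeholders, not the authors' actual rule set.

```python
# Each FPNM rule Fw maps linguistic terms of the inputs Pr ("fuzzy
# probability", memberships in Psi) and Q ("sojourn time", memberships in
# Omega) to a term of the output RA ("recovery action", memberships in Delta).
from dataclasses import dataclass

@dataclass
class FuzzyRule:
    prob_term: str      # antecedent term of Pr
    sojourn_term: str   # antecedent term of Q
    action_term: str    # consequent term of RA

rule_base = [
    FuzzyRule("high", "late", "corrective"),
    FuzzyRule("high", "normal", "preventive"),
    FuzzyRule("low", "normal", "predictive"),
]
```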
The proposed FPNM is considered as an adaptive technique dedicated to the recovery of manufacturing systems with time constraints. This model has a double interface, one with the modelling model (based on P-time PNs) and the second one with the diagnosis model (fuzzy fault tree). To demonstrate the effectiveness of the proposed methodology, we present two realistic maintenance examples.
5 Illustrative Examples
For simplicity, we disregard the nature of the precise operations performed in the packaging unit; we therefore represent a simplified model of the unit.
Figure 4 shows a milk packaging unit: to pack the products (bottles of 1,000 ml), bottles are placed on the conveyor T1 to supply the packaging machine (M), where they are wrapped by welding in groups of 6. The finished products are deposited on the output conveyor towards the stock of finished products SA.
Figure 5 shows a P-time Petri net (G) modelling the packaging machine. Three fuzzy sets, representing the uncertainty on the effective sojourn time of the token in the places p1, p2 and p8, are proposed (Fig. 5). The obtained membership functions are used to study the maintenance of the machine (M). As the sojourn times in places do not have the same functional signification when they are included in the sequential process of a product as when they are associated with a free resource, a decomposition of the Petri net model into two sets is made using [16] (Fig. 5), where:
• RU is the set of places representing the used machines,
• TransC is the set of places representing the loaded transport resources.
(Fig. 5: P-time Petri net model G of the packaging machine, where a group of 6 bottles is wrapped into a package of 6 bottles; the places p2, p3 and p4 carry the static intervals IS2 = IS3 = IS4 = [12, 20] with expected sojourn times q2e = q3e = q4e = 15, and the membership function μ(q2) over [12, 20] has its certainty zone between 13 and 17.)
(Fig. 6: fault tree of the packaging machine: the top event is the output of an OR gate over the intermediate events ds1 (an OR gate) and ds2 (an AND gate), with basic events including failed fingers and a failing sealing bar.)
The fault tree analysis (FTA) is based on fuzzy set theory [2, 3, 4], so we can allocate a degree of uncertainty to each value of the failure probability. Thus, according to Eqs. (1) and (2), the fuzzy probability of a system failure (top event occurrence) is determined from the fuzzy probabilities of component failures.
The parameter ai is the lower bound, the parameter mi is the modal value, and the parameter bi is the upper bound for each fuzzy probability of component failure. These parameters are given in Table 1.
Figure 7 provides the representation of the computed fuzzy probabilities associated with the failures F, ds1 and ds2. The fuzzy failure probability of the top event (F) is given below:
(Fig. 7: membership functions of the computed fuzzy failure probabilities; the cursor values in the plots indicate modal values (membership 1) of about 2.911 × 10⁻³, 1.5 × 10⁻³ and 1.322 × 10⁻⁵ for F, ds1 and ds2 respectively.)
Fig. 8 Three-dimensional trapezoidal membership function [RA = f(q2, PrF)]
– Each rule uses the operator "AND" in the premise; since it is an AND operation, the minimum criterion is used (Mamdani inference method), and the fuzzy outputs corresponding to these rules are represented in Fig. 8.
– Next we perform defuzzification to convert the fuzzy outputs to a single number (crisp output). Various defuzzification methods were explored; the best one for this particular application is the centre of area (COA) defuzzifier. According to the COA method, the weighted strengths of each output membership function are multiplied by their respective output membership function centre points and summed. Finally, this area is divided by the sum of the weighted membership function strengths, and the result is taken as the crisp output.
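The chain just described can be sketched as follows; the membership shapes, rule terms and input readings are illustrative assumptions rather than the authors' tuning, and the defuzzifier shown is the discrete centroid variant of COA.

```python
# Mamdani min-inference over two rules, max aggregation, then centre-of-area
# defuzzification on a discretized output universe [0, 1].

def tri(x, a, m, b):
    """Triangular membership function with support [a, b] and mode m."""
    if x <= a or x >= b:
        return 0.0
    return (x - a) / (m - a) if x <= m else (b - x) / (b - m)

# Fuzzified inputs for one scenario (illustrative shapes and readings).
mu_pr_high = tri(0.003, 0.0015, 0.003, 0.0045)   # fuzzy probability is "high"
mu_q_late = tri(19.0, 17.0, 20.0, 23.0)          # sojourn time q2 is "late"

def preventive(y):                                # output term "preventive"
    return tri(y, 0.0, 0.25, 0.5)

def corrective(y):                                # output term "corrective"
    return tri(y, 0.5, 0.75, 1.0)

w1 = min(mu_pr_high, mu_q_late)        # rule 1: high AND late -> corrective
w2 = min(1.0 - mu_pr_high, mu_q_late)  # rule 2: not high AND late -> preventive

ys = [i / 200.0 for i in range(201)]
agg = [max(min(w1, corrective(y)), min(w2, preventive(y))) for y in ys]

# COA: centroid of the aggregated output membership.
coa = sum(y * m for y, m in zip(ys, agg)) / sum(agg)
print(round(coa, 3))   # ~0.75 here, i.e. in the "corrective" region
```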
In practice there are two fuzzy outputs to defuzzify (corrective and preventive maintenance). Analysing the data, it is noted that the appropriate recovery technique is corrective maintenance.
In the processing station (Fig. 9), workpieces are tested and processed on a rotary indexing table. The rotary indexing table is driven by a DC motor [10]. On the rotary indexing table, the workpieces are tested and drilled in two parallel processes. A solenoid actuator with an inductive sensor checks that the workpieces are inserted in the correct position. During drilling, the workpiece is clamped by a solenoid actuator. Finished workpieces are passed on via the electrical ejector (Fig. 9). The processing station consists of [10]:
• Rotary indexing table module
• Testing module
• Drilling module
• Clamping module
(Fig. 9: processing station, showing the testing module with its testing solenoid, the drilling module with its drilling motor, the electrical ejector and the DC gear motor.)
Figure 10 shows a P-time Petri net (G) modelling the production unit. The obtained G is used to study the maintenance of the processing unit.
The full set of operation time intervals in the studied unit is summarized in Table 2 (u.t.: unit time).
The purpose of the monitoring task is to detect, localise, and identify problems that occur in the system. These problems can be physical (a piece of equipment is down, a cable is cut) or logical (a station is rebooting, a logical connection is down, …).
The considered approach uses the additional information provided by the knowledge of the effective sojourn time and allows a failure symptom to be detected when a constraint is violated.
Let us suppose that we want to monitor the drilling platform. In this module, a clamping device clamps the workpiece. Once the drilling is completed, the drilling machine is stopped, moved to its upper stop, and the clamping device is retracted (Fig. 9).
According to the P-TPN of Fig. 10, the minimum time granted to the drilling operation is 5 u.t., whereas the maximum time is 11 u.t. (IS5 = [5, 11]; q5e = 7).
(Fig. 10: P-time Petri net model of the processing unit, with places P1 to P8 and transitions T1 to T8; the static intervals and expected sojourn times are IS1 = [7, 11], q1e = 10; IS2 = [2, 5], q2e = 3; IS3 = [2, 5], q3e = 4; IS4 = [1, 3], q4e = 2; IS6 = [3, 5], q6e = 4; IS7 = [1, 3], q7e = 2; IS8 = [0, +∞], q8e = 10; the transition delays are d3 = 0 and d4 = d5 = d6 = d7 = d8 = 1.)
Suppose that the drilling duration is 13 u.t. (indicated by the effective sojourn time q5 = 13 u.t., with q5 ∉ [a5, b5]). This excessive sojourn time (Fig. 11) implies that:
• there is a technical failure of the production tool (clamping device, drilling machine, inductive sensor, …) requiring a maintenance action to be generated,
• the quality of the manufactured product is incorrect, since q5 ∉ [a5, b5].
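In code, the watch-dog check for this scenario reduces to a short sketch (the helper name is illustrative):

```python
# Classify the effective sojourn time q against the static interval [a, b].
def check_sojourn(q, a, b):
    if q < a:
        return "symptom: operation completed too early (q < a)"
    if q > b:
        return "constraint violated: failure symptom, maintenance required"
    return "normal behaviour"

print(check_sojourn(13, 5, 11))   # drilling: IS5 = [5, 11], q5 = 13 -> violation
```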
(Fig. 11: membership function μ(q5); a sojourn time beyond the static interval indicates an incorrect piece and a failure of the production tool.)
To establish the causality of the failures in the sub-systems that can affect the system status, a fault tree (Fig. 12) was constructed, and the processing unit failure was defined as its top event (F0). This diagnostic tree comprises 16 basic events.
The calculation of the probability allows us to determine the critical components of the tree and improve system reliability. In addition, this probability guides us in locating the basic events that contribute to the vagueness of the top event failure rate, so that this imprecision can be reduced effectively by feedback on the vagueness of the concerned elementary events.
(Fig. 12: fault tree of the processing unit with top event F0 as the output of an OR gate; the basic events include the capacitive proximity sensor (f), the electrical gear motor (g) and the electrical clamping device (h), the motor being decomposed through OR and AND gates into a coupling default (l), motor vibration (m), a winding failure (o) and a brake lock (p).)
(Fig. 13: α-level representation of the fuzzy probability of occurrence of the top event F0, with the support roughly spanning 0 to 0.25.)
The parameters ai, mi and bi are given in Table 3. We choose trapezoidal shapes because of their mathematical simplicity. Figure 13 gives the fuzzy probability of the top event occurrence (PF0). Analysing the data, it is noted that the most critical events in the fault tree are g and i, associated respectively with the defaults dg and di (greater probability values). Consequently, we can deduce the most critical components for system failure; in fact, a small variation in the critical component
configuration causes a relatively greater change in the estimate of the top event
failure probability.
Based on probabilistic measures, the proposed maintenance model is able to evaluate the relative influence of component reliability on the reliability of the system and to provide useful information about the maintenance strategy of these elements. The FPNM model is able to trigger one or more preventive or corrective actions. Thus, the failure and repair process is capable of indicating when a failure occurs (or is about to occur), so that repair can be performed before such a failure causes damage or capital investment loss.
(Fig. 14: FPNM recovery model: the rule base Fw takes the fuzzy probability PrF0 and the sojourn time q5 (with membership sets Ω q5.1, Ω q5.2, Ω q5.3) as inputs and produces the recovery actions RA1 and RA2, whose output membership sets Δ2.Prv, Δ2.C and Δ2.Prd cover preventive, corrective and predictive maintenance.)
It is then necessary to "repair the defective material" and eliminate the fault effects in order to bring the system back to its regular operation status.
When maintenance is required (corrective, preventive or predictive), the FPNM model inhibits all pre-set regular operating conditions at the modelling and diagnosis level. At this point the maintenance module takes up control of the process. The maintenance task is, therefore, synchronized with the modelling and diagnosis model.
6 Conclusion
In this paper, we have proposed a fuzzy Petri net for maintenance, able to analyze the monitoring and recovery tasks of manufacturing systems with time constraints. The new recovery approach is based on the study of the effective sojourn time of the token in places and on the evaluation of the failure probability of fault tree events.
Our study makes the assumption that the supervised system is modeled by P-time Petri nets. The paper proposes an adaptive technique dedicated to the maintenance of manufacturing systems with time constraints. This model has a double interface, one with the modelling model of the system and the second one with the behavioural model (diagnosis).
At the occurrence of a dysfunction in the milk packaging machine, it is important to react in real time to maintain productivity and to ensure the safety of the system. It has been shown that the knowledge of the effective sojourn time of the token makes a significant contribution to this type of problem, since it makes the supervision more efficient through early detection of a time constraint violation. This is quite useful for the maintenance task.
We have developed and used a fuzzy probabilistic approach to evaluate the failure probability of the top event when there is uncertainty about the component failure probabilities. This approach is based on the use of fuzzy probabilities.
To illustrate the efficiency of the maintenance approach, we have applied it to a
packaging process. The proposed Petri net approach can achieve early failure
detection and isolation for fault diagnosis. These capabilities can be very useful for
health monitoring and preventive maintenance of a system.
It would be interesting, as further research, to incorporate the issues of maintenance and repair strategies into the fuzzy probabilistic approach in order to compute a modified maintenance cost. This last problem needs a specific approach, because of the production loss which occurs when maximum time constraints are no longer fulfilled.
Based on the topology of the two workshops, it can be claimed that the proposed fuzzy Petri net for maintenance allows various maintenance policies to be applied: corrective, preventive and predictive.
References
1. Adamyan, A., He, D.: Analysis of sequential failures for assessment of reliability and safety of
manufacturing systems. Reliab. Eng. Syst. Saf. 76(3), 227–236 (2002)
2. Azar, A.T.: Fuzzy Systems. IN-TECH, Vienna (2010). ISBN 978-953-7619-92-3
3. Azar, A.T.: Adaptive Neuro-Fuzzy Systems. In: A.T Azar (ed) Fuzzy Systems. IN-TECH,
Vienna, Austria (2010). ISBN 978-953-7619-92-3
4. Azar, A.T.: Overview of type-2 fuzzy logic systems. Int. J. Fuzzy Syst. Appl. 2(4), 1–28
(2012)
5. Buckley, J.J., Eslami, E.: Fuzzy Markov chains: uncertain probabilities. MathWare Soft
Comput. 9(1), 33–41 (2008)
6. Buckley, J.J.: Fuzzy Probabilities: New Approach and Application, vol. 115, p. 17. Springer,
Berlin (2005)
7. Cassady, C.R., Pohl, E.A., Murdock, W.P.: Selective maintenance modeling for industrial
systems. J. Qual. Maintenance Eng. 7(2), 104–117 (2001)
8. Celen, M., Djurdjanovic, D.: Operation-dependent maintenance scheduling in flexible
manufacturing systems. CIRP J. Manufact. Sci. Technol. 5(4), 296–308 (2012)
9. Dunyak, J., Saad, I.W., Wunsch, D.: A theory of independent fuzzy probability for system
reliability. IEEE Trans. Fuzzy Syst. 7(2), 286–294 (1999)
10. Festo: Mechatronics and factory automation, pp. 245–246. MPS Station (2014)
11. Fovino, I.N., Masera, M., De Cian, A.: Integrating cyber attacks within fault trees. J. Reliab.
Eng. Syst. Saf. 94(9), 1394–1402 (2009)
12. Hua, Z., Tao, Z.: Dynamic job-shop scheduling with urgent orders based on Petri net and
GASA. In: IEEE Chinese Conference of Control and Decision (CCDC’09), pp. 2446–2451
(2009)
13. Khalid, M.N.A., Yusof, U.K., Sabudin, M.: Solving flexible manufacturing system distributed
scheduling problem subject to maintenance using harmony search algorithm. In: 4th IEEE
Conference on Data Mining and Optimization (DMO), pp. 73–79 (2012)
14. Khansa, W., Denat, J.P., Collart-Dutilleul, S.: P-Time Petri nets for manufacturing systems. In:
IEEE Workshop on Discrete Event Systems (WODES’96), pp. 94–102. Edinburgh (1996)
15. Kao Chang, C., Liang Hsiang, C.: Using generalized stochastic Petri nets for preventive
maintenance optimization in automated manufacturing systems. J. Qual. 18(2), 117–129
(2011)
16. Long, J., Descotes-Genon, B.: Flow optimization method for control synthesis of flexible
manufacturing systems modeled by controlled timed Petri nets. In: IEEE International
Conference on Robotics and Automation, pp. 598–603. USA (1993)
17. Mentes, A., Helvacioglu, I.: An application of fuzzy fault tree analysis for spread mooring
systems. J. Ocean Eng. 38(2), 285–294 (2011)
18. Mhalla, A., Jenheni, O., Collart Dutilleul, S., Benrejeb, M.: Contribution to the monitoring of
manufacturing systems with time constraints: application to a surface treatment line. In: 14th
International Conference of Sciences and Techniques of Automatic and Computer
Engineering, pp. 243–250. Sousse (2013)
19. Mhalla, A., Collart Dutilleul S., Benrejeb, M.: Monitoring of packaging machine using
synchronized fuzzy Petri nets. In: Management and Control of Production and Logistics,
pp. 337–343. Brazil (2013)
20. Mhalla, A., Collart Dutilleul, S., Craye, E., Benrejeb, M.: Estimation of failure probability of
milk manufacturing unit by fuzzy fault tree analysis. J. Intell. Fuzzy Syst. 26, 741–750 (2014)
21. Minca, E., Racoceanu, D., Zerhouni, N.: Monitoring systems modeling and analysis using
fuzzy Petri nets. Stud. Inform. Control 11(4), 331–338 (2002)
22. Rocha Loures, E., Busetti de Paula, M.A., Portela Santos, E.A.: A control-monitoring-
maintenance framework based on Petri net with objects in flexible manufacturing system. In:
3rd International Conference on Production Research, pp. 3–6. Brazil (2006)
23. Sallak, M., Simon, C., Aubry, J.F.: Evaluating safety integrity level in presence of uncertainty.
In: 4th International Conference on Safety Reliability, p. 5. Krakow (2006)
24. Sitayeb, F.B.: Contribution à l’étude de la performance et de la robustesse des
ordonnancements conjoint production/ maintenance : cas du Flowshop. Thèse de Doctorat,
Université de Franche Comté, pp. 88–90 (2005)
25. Tanaka, H., Fan, L.T., Lai, F.S., Toguchi, K.: Fault tree analysis by fuzzy probability. IEEE
Trans. Reliab. 32(5), 453–457 (1983)
26. Yang, S.K., Liu, T.S.: A Petri net approach to early failure detection and isolation for
preventive maintenance. Int. J. Qual. Reliab. Eng. 14(5), 319–330 (1998)
27. Zhang, W.W., Su, Q.X., Liu, P.Y.: Study of equipment virtual disassembly Petri net modeling
for virtual maintenance. Advances in Computer, Communication, Control and Automation,
pp. 361–367. Springer, Berlin (2012)
28. Zhang, T., Andrews, J., Guo, B.: A simulated Petri-net and genetic algorithm based approach
for maintenance scheduling for a railway system. In: Advances in Risk and Reliability
Technology Symposium. 20th AR2TS, pp. 86–87. Nottingham (2013)
Box and Jenkins Nonlinear System
Modelling Using RBF Neural Networks
Designed by NSGAII
Abstract In this work, we use a radial basis function neural network for modeling nonlinear systems. Generally, the main problem with artificial neural networks is to find a suitable structure; the choice of the architecture of an artificial neural network for a given problem has long been an open issue. Developments show that it is often possible to find an architecture of artificial neural network that greatly improves the results obtained with conventional methods. We propose in this work a method based on the Non-dominated Sorting Genetic Algorithm II (NSGAII) to determine the best parameters of a radial basis function neural network. NSGAII should provide the best connection weights between the hidden layer and output layer, find the parameters of the radial function of the neurons in the hidden layer and the optimal number of neurons in the hidden layer, and thus ensure the necessary learning. Two functions are optimized by NSGAII: the number of neurons in the hidden layer of the radial basis function neural network, and the error, which is the difference between the desired output and the output of the radial basis function neural network. This method is applied to modeling the Box and Jenkins system. The obtained results are very satisfactory.
Keywords NSGAII · Radial basis function (RBF) neural networks · Optimization · Modelling · Nonlinear system · Box and Jenkins system
K. Lamamra (&)
Department of Electrical Engineering, University of Oum El Bouaghi, Oum El Bouaghi,
Algeria
e-mail: [email protected]
K. Lamamra
Laboratory of Mastering of Renewable Energies, University of Bejaia, Bejaia, Algeria
K. Belarbi S. Boukhtini
University of Constantine, Constantine, Algeria
e-mail: [email protected]
S. Boukhtini
e-mail: [email protected]
1 Introduction
The first phase of modelling is to bring together the knowledge which we have
about the process behaviour, from experiments and/or theoretical analysis of
physical phenomena. This knowledge leads to several model assumptions. Each of
these dynamic models realizes nonlinear functions between its control variables,
state, and output. In the case where these functions are unknown, a black box model
is used [56]. If some functions can be fixed from the physical analysis, then we talk
about knowledge model [21, 56].
The second phase is to select the best model. This phase is the identification; this
involves estimating the parameters of competing models. The estimation of the
model parameters is performed by minimizing a cost function determined from the
difference between the measured process outputs and the predicted values (pre-
diction error).
The quality of this estimate depends on the richness of the learning sequences and the effectiveness of the algorithm used. After the identification of all hypothesis
models, we use the hypothesis corresponding to the best obtained predictor; the
final model validation is performed according to the performance of its intended use
[6, 20, 58].
There are several modelling tools, among them artificial neural networks. Several studies currently use artificial neural networks in the field of modelling. For example, Grasso et al. [31] proposed “a new neural architecture able to
accomplish the identification task. It is based on a relatively new neural algorithm,
the multi-valued neural network with complex weights. The main idea is to use a set
of measurements or simulations made on the system, taken at different values of
geometrical parameters and at different frequencies, to train a multilayer architec-
ture with multi-valued neurons, able to estimate the electrical parameters of the
lumped model”.
Badkar et al. [7] presented a study in which “the Laser transformation hardening of commercially pure titanium, nearer to ASTM grade 3 of chemical composition, was investigated using continuous wave 2 kW, Nd: YAG laser. The effect of laser
process variables such as laser power, scanning speed, and focused position was
investigated using response surface methodology and artificial neural network
keeping argon gas flow rate of 10 lpm as fixed input parameter”. They described in
their work, “the comparison of the heat input (HI) and ultimate tensile strength (σ)
(simply called as tensile strength) predictive models based on artificial neural
network and response surface methodology. The performance of the developed
artificial neural network models were compared with the second-order RSM
mathematical models of HI and σ. There was good agreement between the
experimental and simulated values of response surface methodology and artificial
neural network”.
Among the advantages of a neural network are its ability to adapt to the conditions imposed by any environment, and the ease of changing its parameters (weights, number of neurons, etc.) depending on the behaviour of its environment. Neural networks are used to model and control linear and nonlinear dynamic systems where conventional methods fail [18, 41].
Research in the field of neural networks focuses on the architectures by which neurons are combined and the methodologies by which the weights of the interconnections are calculated or adjusted.
Currently researchers are divided into two groups. The first is made up of biologists, physicists and psychologists; this group is trying to develop a neural model able to mimic, with a given accuracy, the behaviour of the brain. The second group consists of engineers who are concerned with how artificial neurons can be interconnected to form networks with powerful computing capabilities. Studies of neural networks are expanding, and their use is still growing rapidly [57, 65].
Usually a learning algorithm is associated with an artificial neural network to modify the processing performed in order to achieve a given task. For an artificial neural network, learning can be seen as the problem of updating the weights of the connections within the network in order to succeed at the requested task [8, 40].
Generally, the learning of neural networks can be done in two ways. In supervised learning, we have a set of examples (input-output pairs) and the network must learn to give the correct output for new inputs. In reinforcement learning, we have inputs describing a situation and receive a punishment (or error) if the output given is not adequate [11, 16, 44].
In supervised learning, the identification of the parameters of a neural network is often performed by back-propagation algorithms, based on minimizing the training error and on the chain rule [12, 66]. This algorithm is a gradient descent on a differentiable error function.
This algorithm shows several disadvantages, such as slow convergence, sensitivity to local minima and the difficulty of adjusting the learning parameters (the number of neurons in the hidden layers, the learning step, etc.). In some networks Hebbian learning is used, where the synaptic weights are adjusted during a learning phase through Hebb's rule, which leads to a formula expressing the weights in terms of the recognized patterns [4, 28, 51].
Several approaches have been proposed to improve the back-propagation method, such as modifying the learning step, decentralizing the learning step, quasi-Newton algorithms [37], genetic algorithms [59], etc.
In this work we propose the use of the Non-dominated Sorting Genetic Algorithm II (NSGA II) to construct a model using an RBF neural network with an optimal structure. In this approach NSGA II is used to optimize the number of neurons in the hidden layer of the neural network, find the best connection weights between the hidden layer and the output layer, find the parameters of the radial functions of the hidden layer neurons, and ensure the learning of the neural network. RBF networks are universal approximators: the approximation error can be made arbitrarily small on a compact region [15, 29, 33, 49].
This paper is organized as follows: in the second section we briefly recall the basic principles of neural networks; in the third section we present the NSGA II algorithm, its operating principle and its application in our method; in the fourth section we present the learning of radial basis function neural networks by NSGAII; and finally we present the simulation results of the developed method in the fifth section.
The radial basis function neural network is a two-layer network in which the hidden layer performs a nonlinear transformation, usually a Gaussian function, to map the input space (Fig. 1); the output layer then combines the outputs of the intermediate layer linearly as the outputs of the whole network. Such networks may be considered as linearly parameterized networks.
The RBFs are Gaussian functions of the form:

$$f_i(x) = e^{-\frac{d(x)}{\sigma_i^2}}$$
with:
• d(x) the Euclidean distance, given by d(x) = ‖x − cᵢ‖;
• cᵢ the centers of the Gaussian functions;
• σᵢ the widths of the Gaussian functions.
Fig. 1 General structure of a RBF neural network. Example of RBF neural network with two
neurons in the input layer, four neurons in the hidden layer and one neuron in the output layer
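As a quick illustration of this structure, the sketch below implements a forward pass of the network of Fig. 1 with the Gaussian unit fᵢ(x) defined above; the centers, widths and weights used are placeholders (in this chapter they are precisely the parameters that NSGAII must find).

```python
# Forward pass of a two-input / four-hidden-neuron / one-output RBF network.
import math

def rbf_forward(x, centers, sigmas, weights):
    # Hidden layer: f_i(x) = exp(-d(x) / sigma_i^2), d = Euclidean distance.
    hidden = [math.exp(-math.dist(x, c) / s ** 2)
              for c, s in zip(centers, sigmas)]
    # Output layer: linear combination of the hidden outputs.
    return sum(w * h for w, h in zip(weights, hidden))

centers = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]   # placeholder values
sigmas = [0.5, 0.5, 0.5, 0.5]
weights = [0.2, -0.1, 0.4, 0.3]
print(rbf_forward((0.5, 0.5), centers, sigmas, weights))
```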
RBF networks have the disadvantage of requiring good coverage of the input space by the radial basis functions. RBF centers are determined with reference to the distribution of the input data, but without reference to the prediction task; as a result, representational resources may be wasted on areas of the input space that are irrelevant to the learning task. In this work, it is up to the NSGAII algorithm to find the best RBF centers.
Currently, RBF neural networks are used in many works for different tasks, for example in network traffic identification, where “a method of network traffic identification based on RBF neural network is proposed by analysis of the current status of the network environment. The public data set and the real-time traffic are used for a combination of supervised learning” [62].
Gutiérrez et al. [34] used RBF neural networks in a hybrid multi-logistic regression methodology: “the process for obtaining the
coefficients is carried out in three steps. First, an evolutionary programming (EP)
algorithm is applied, in order to produce an RBF neural network with a reduced
number of RBF transformations and the simplest structure possible. Then, the initial
attribute space (or, as commonly known as in logistic regression literature, the
covariate space) is transformed by adding the nonlinear transformations of the input
variables given by the RBFs of the best individual in the final generation. Finally, a
maximum likelihood optimization method determines the coefficients associated
with a multilogistic regression model built in this augmented covariate space”.
In the work of Pendharkar [48], radial basis function neural networks are used
for classification problems, “a hybrid radial basis function network-data envelop-
ment analysis (RBFN-DEA) neural network is proposed, the procedure uses the
radial basis function to map low dimensional input data from input space R to a
high dimensional R+ feature space where DEA can be used to learn the classifi-
cation function”.
Sheikhan et al. [54, 55] presented an RBF-based active queue management (AQM) controller: “RBF as a nonlinear controller is suitable as an AQM
scheme to control congestion in transmission control protocol (TCP) communica-
tion networks since it has nonlinear behaviour. Particle swarm optimization algo-
rithm is also employed to derive RBF output weights such that the integrated-
absolute error is minimized. Furthermore, in order to improve the robustness of
RBF controller, an error-integral term is added to RBF equation. The output
weights and the coefficient of the integral error term in the latter controller are also
optimized by Particle swarm optimization algorithm”. In another work, Sheikhan et al. [54, 55] address “Lorenz hyper-chaos synchronization and its application to improve the
security of communication systems. Two methods are proposed to synchronize the
general forms of hyper-chaotic systems, and their performance in securing com-
munication application is verified. The first method uses a standard RBF neural
controller. Particle swarm optimization algorithm is used to derive and optimize the
parameters of the RBF controller. In the second method, with the aim of increasing
the robustness of the RBF controller, an error integral term is added to the equations
of RBF neural network”.
Box and Jenkins Nonlinear System Modelling … 235
In the work of Xia et al. [60] “an energy-based controller is incorporated with
RBF neural network compensation, which is used to swing up the pendubot and
raise it to its uppermost unstable equilibrium position”.
Radial basis function neural networks are also used in adaptive control, Ja-
farnejadsani et al. [38] proposed “an adaptive control based on radial-basis-function
neural network for different operation modes of variable-speed variable-pitch wind
turbines including torque control at speeds lower than rated wind speeds, pitch
control at higher wind speeds and smooth transition between these two modes The
adaptive neural network control approximates the nonlinear dynamics of the wind
turbine based on input/output measurements and ensures smooth tracking of the
optimal tip-speed-ratio at different wind speeds. The robust neural network weight
updating rules are obtained using Lyapunov stability analysis”.
Radial basis function neural networks are also used in the identification of nonlinear systems. In the work of Chai and Qiao [13], RBF neural networks are used to
“model the non linear system when the system runs without a fault, after some input
and output data of the system are obtained, the center of the hidden nodes are
chosen using clustering technology. Assuming that the system noise and approxi-
mation error are unknown but bounded, the output weights of RBF neural network
model of the system are determined by a linear-in-parameter set membership
estimation. An interval containing the actual output of the system running without a
fault can be predicted based on the result of the estimation. If the measured output is
out of the predicted interval, it can be determined that a fault has occurred”.
Dos Santos Coelho et al. [52] used a “Radial Basis Function neural network
with training combining the Gustafson-Kessel clustering method and a modified
differential evolution in order to perform the swimmer velocity profile identifica-
tion. The main idea is to obtain the dynamic of the velocity profile and to use it to
improve the athletes’ swim style. To achieve good performance with differential
evolution algorithm, the tuning of control parameters is essential as its performance
is sensitive to the choice of the mutation and crossover settings”.
In this work the authors combine the two strategies described above, proposing “a modified differential evolution algorithm based on the association of a sinusoidal
signal and chaotic sequences generated by logistic map for the mutation factor
tuning. By using data collected from breaststroke and crawl swim style of an elite
female swimmer; the validity and the accuracy of the RBF neural network model
have been tested by simulations”.
RBF neural networks are also used in optimization: the work of Mukhopadhyay et al. [46] introduces a “Discrete Hilbert Transform (DHT)-Neural Model which provides better result than the ARMA-Neural Model. A signal and its DHT produce the same Energy Spectrum. Based on this concept DHT is used for Wind
Speed forecasting purpose. Thereafter the RBF neural network is used on this to
forecast wind power”.
Chen et al. [14] proposed “a novel online modelling algorithm for nonlinear and
non-stationary systems using a radial basis function neural network with a fixed
number of hidden nodes. Each of the RBF basis functions has a tunable center
vector and an adjustable diagonal covariance matrix. A multi-innovation recursive
least square (MRLS) algorithm is applied to update the weights of RBF online,
while the modelling performance is monitored. When the modelling residual of the
RBF network becomes large in spite of the weight adaptation, a node identified as
insignificant is replaced with a new node, for which the tunable center vector and
diagonal covariance matrix are optimized using the quantum particle swarm opti-
mization (QPSO) algorithm”.
Learning consists in determining the weights so that the output of the neural network is as close as possible to the target. The main problem is how to build the neural network: how many hidden layers and how many units (or neurons) per hidden layer are required to achieve a good approximation, since a wrong choice can lead to poor network performance [23].
The first attempts to solve the problem of determining the architecture were to test several networks with different architectures until the desired performance was achieved [27].
In recent years, many studies have been devoted to developing methods for optimizing the architecture of neural networks. The main algorithms that have been proposed can be classified into three families:
1. Pruning algorithms: detect and remove the weights or units that contribute little to the network performance [45].
2. Ascending or constructive algorithms: start from an approximate solution to the problem with a simple network and add, if necessary, units or hidden layers to improve network performance [26].
3. Direct algorithms: define a suitable architecture and perform learning, or perform both operations simultaneously, as genetic algorithms do [3].
The multi-objective genetic algorithm that we used in this work is NSGAII (Non-dominated Sorting Genetic Algorithm II), introduced and enhanced by Deb and Goel [19].
It is one of the most used and most cited algorithms in the literature [53]. It is widely used by many authors, not only in the context of multi-objective optimization, but also for comparison with other algorithms; it is considered as a benchmark by several researchers. For example, Hashmi et al. [35] used this
algorithm in “a negotiation Web service that would be used by both the consumer
and provider Web services for conducting negotiations for dependent QoS
parameters”.
Min et al. [43] used it in a “multi-objective history matching model to predict the
individual performance”.
In the work of Gossard et al. [30] NSGAII is “coupled with an artificial neural
network to optimize the equivalent thermo-physical properties of the external walls
(thermal conductivity kwall and volumetric specific heat (ρc) wall) of a building in
order to improve its thermal efficiency”.
Adham et al. [1] used NSGAII “as an optimization technique in combination
with a multi-objective general optimization scheme with the thermal resistance
model as an analysis, for a potential improvement in the overall performance of a
rectangular micro-channel heat sink using a new gaseous coolant namely ammonia
gas”.
In the work of Prasad and Singru [50] NSGA-II is used to “select the optimum
design of turbo-alternator (TA), a real-life TA used in an industry is considered”.
Domínguez et al. [22] proposed “a high-performance architecture for the NSGA-II
using parallel computing, for evaluation functions and genetic operators. In the
proposed architecture, the Mishra Fast Algorithm for finding the Non Dominated
Set was used; it’s proposed a modification in the sorting process for the NSGA-II
that improves the distribution of the solutions in the Pareto front”.
NSGAII is an algorithm establishing the dominance relationships between
individuals and providing a fast sorting method of chromosomes [19]. This algo-
rithm uses a measure of crowding around individuals to ensure diversity in the
population. The principle of this algorithm is shown in Fig. 2.
At the beginning, an initial population is randomly generated, and then it
undergoes a sorting using the concept of non-domination. Each solution is assigned
a strength or rank equal to the level of non-dominance (1 for best, 2 for the next
level, etc…). The reproduction step consists of a tournament for the selection of
parents.
When two individuals are chosen randomly in the population, the tournament is based on a comparison of the constrained domination of the two individuals. For a given generation t, we create Rt = Pt ∪ Qt, where Qt is the children population of the previous population Pt (generated from the parents through the crossover and mutation operators); Rt includes the individuals of Pt, which ensures the elitist nature of the NSGAII algorithm. The population Rt contains 2N individuals (it is composed of N parents and N children). Then Rt undergoes a sorting using the
concept of Pareto non-dominance. Individuals are grouped into non-dominated fronts such that F1 contains the individuals of rank 1, F2 those of rank 2, etc.
The next objective is to reduce the 2N individuals of Rt to a population Pt+1 of size N. If the size of F1 is less than N, then all F1 individuals are retained. The same holds for the other fronts as long as the number of individuals retained does not exceed the size N.
If we take the example of Fig. 2, the fronts F1 and F2 are fully retained, but keeping the front F3 would exceed the size N of the population Pt+1. It is then necessary to select which F3 individuals to keep. For this purpose, NSGAII involves a mechanism for preserving the diversity of the population, based on the evaluation of the density of individuals around each solution through a procedure for calculating the "proximity distance" (crowding distance).
A low value of the proximity distance for an individual is an individual “well
surrounded”. It then proceeds to a descending sorting according to this distance
proximity to retain individuals F3 front and eliminate individuals from the densest
areas. This way we complete the population Pt+1. Individuals with extreme values
for the criteria are preserved by this mechanism, thereby maintaining the external
terminals of the Pareto front.
At the end of this phase, the population Pt+1 is created. A new population Qt+1 is
then generated by reproduction from Pt+1, and the procedure described above is
iterated until the stopping criteria set by the user are satisfied.
Overall, NSGAII maintains elitism and diversity without additional parameters,
using an algorithm that is attractive in its simplicity and has a minimum of
parameters.
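The environmental selection step described above can be summarized in a short sketch. The following Python fragment is a minimal illustration (not the authors' implementation) of the crowding-distance computation and the truncation of the overflowing front, assuming both objectives are minimized and that a front is given as an array of objective values; all function names are illustrative.

```python
import numpy as np

def crowding_distance(front):
    """Crowding distance for a non-dominated front.
    front: (n, m) array, one row per individual, one column per objective."""
    front = np.asarray(front, dtype=float)
    n, m = front.shape
    dist = np.zeros(n)
    for j in range(m):
        order = np.argsort(front[:, j])
        span = front[order[-1], j] - front[order[0], j]
        dist[order[0]] = dist[order[-1]] = np.inf  # preserve extreme points
        if span > 0 and n > 2:
            # distance between each interior point's two neighbours along objective j
            dist[order[1:-1]] += (front[order[2:], j]
                                  - front[order[:-2], j]) / span
    return dist

def truncate_last_front(front, k):
    """Indices of the k individuals of the overflowing front to keep:
    those with the largest crowding distance (the least crowded)."""
    return np.argsort(-crowding_distance(front))[:k]
```

For example, if F1 and F2 already fill N − k slots of Pt+1, a call such as `truncate_last_front(F3_objectives, k)` selects which F3 individuals complete the population while keeping the extremes of the front.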
The NSGA II is used to optimize the structure and parameters of the RBF NN. The
following two objective functions are chosen:
• The first function to be optimized (f1) is the number of neurons in the hidden
layer of the RBF NN.
• The second function (f2) is the quadratic error, i.e. the difference between the
desired output of the RBF NN and its actual output.
NSGAII must find the best number of neurons in the hidden layer (Nn),
provide the best connection weights between the neurons of the hidden layer and the
output layer, and find the parameters of the radial functions of the hidden layer
neurons. In this work we use radial functions of Gaussian form, so NSGA II must
find the best centers (Ci) and the best widths sigma (σi) for these functions.
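To make the role of these parameters concrete, the following fragment is a minimal sketch (not the chapter's code) of the forward pass of such a Gaussian RBF network with a single linear output neuron; for generality it assumes one center vector per hidden unit, and all names are illustrative.

```python
import numpy as np

def rbf_output(x, centers, sigmas, weights):
    """Gaussian RBF network with one linear output neuron.
    x: (d,) input vector; centers: (Nn, d); sigmas, weights: (Nn,)."""
    # hidden activations: phi_i = exp(-||x - c_i||^2 / (2 * sigma_i^2))
    d2 = np.sum((np.asarray(centers, dtype=float) - x) ** 2, axis=1)
    phi = np.exp(-d2 / (2.0 * np.asarray(sigmas, dtype=float) ** 2))
    return float(np.dot(weights, phi))  # y_r = sum_i Z_i * phi_i
```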
The chromosome then contains the number of neurons in the hidden layer,
Gaussian functions centers and widths of the hidden layer neurons, and the weights
of connections between the hidden layer and the output layer.
The chromosome contains the following parameters:
[ Nn | C1 C2 … CNn | σ1 σ2 … σNn | Z1 Z2 … ZNn ]
where:
Nn: the number of neurons in the hidden layer
C1 C2 … CNn: the Gaussian function centers of the hidden layer neurons
σ1 σ2 … σNn: the widths of the Gaussian functions of the hidden layer neurons
Z1 Z2 … ZNn: the connection weights between the neurons of the hidden layer
and the neuron of the output layer
The length of the chromosome (Lch) depends only on the number of neurons in
the hidden layer (Nn) and the number of neurons of the output layer (Nns) because
the inputs are fixed to two.
The general expression of the length of the chromosome (Lch), as given in [2] and
consistent with the worked examples below, is Lch = 1 + Nn (2 + Nns): one allele
for Nn, plus Nn centers, Nn widths, and Nn × Nns connection weights.
For example, for a neural network with one output and two neurons in the
hidden layer, the length of the chromosome is Lch = 7: the number of neurons Nn
(one allele), the Gaussian function centers of the hidden layer neurons C1, C2 (two
alleles), the widths σ1, σ2 of the Gaussian functions of the hidden layer neurons
(two alleles), and the connection weights Z1, Z2 between the neurons of the hidden
layer and the output layer neuron (two alleles).
The population size Tm (population matrix) is given by [3]:
For example, if the maximum number of neurons in the hidden layer is equal to
20, then the length of the population chromosomes is equal to 61. In this case, if
the number of neurons in chromosome i is equal to 6 (Nni = 6), it will be organized
as follows.
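A small sketch (illustrative names, assuming the allele order listed above: Nn, then centers, widths, and weights) shows how such a chromosome is sized and decoded; the two worked examples from the text serve as sanity checks.

```python
def chromosome_length(nn, nns=1):
    """Lch = 1 (allele for Nn) + nn centers + nn widths + nn*nns weights."""
    return 1 + nn * (2 + nns)

def decode(chrom, nns=1):
    """Split a flat chromosome [Nn | C1..CNn | s1..sNn | Z1..Z(Nn*nns)]."""
    nn = int(chrom[0])
    centers = chrom[1:1 + nn]
    sigmas  = chrom[1 + nn:1 + 2 * nn]
    weights = chrom[1 + 2 * nn:1 + nn * (2 + nns)]
    return nn, centers, sigmas, weights

assert chromosome_length(2) == 7    # one-output network, 2 hidden neurons
assert chromosome_length(20) == 61  # maximum of 20 hidden neurons
```

For the chromosome i with Nni = 6, `decode` would read 1 + 6 + 6 + 6 = 19 active alleles out of the 61.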
5 Results of Simulation
This method is applied to modelling the BOX and JENKINS system, which is a
time series. This process is a gas-fired boiler with the input the gas at the inlet and
the output the concentration of released CO2.
A time series is a sequence of observations on a variable measured at successive points in
time or over successive periods of time. The measurements may be taken every hour, day,
week, month, or year, or at any other regular interval. The pattern of the data is an important
factor in understanding how the time series has behaved in the past. If such behaviour can
be expected to continue in the future, we can use the past pattern to guide us in selecting an
appropriate forecasting method.
To identify the underlying pattern in the data, a useful first step is to construct a time series
plot. A time series plot is a graphical presentation of the relationship between time and the
time series variable; time is on the horizontal axis and the time series values are shown on
the vertical axis. Let us review some of the common types of data patterns that can be
identified when examining a time series plot [2].
The Box and Jenkins data consist of 296 measurements of input and
output [10, 25].
The RBF neural network has two inputs, one output and a hidden layer. The
number of neurons in the hidden layer Nn is determined by the NSGA II as well as
the centers and the widths of the Gaussian functions, and the weights of connection
between the hidden layer and the output layer.
The NSGA II simultaneously optimizes Nn and the cumulative quadratic error ec
given by:
$$e_c = \sum_{i=1}^{N} e^2(i), \qquad \text{with } e(i) = Y_d(i) - Y_r(i)$$
where:
e_c: the cumulative error
e(i): the instantaneous error
N: the length of the simulation sequence (number of data points, N = 296)
Y_d(i): the desired output
Y_r(i): the real output (output of the RBF NN model)
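As a sketch (with hypothetical names, assuming the desired and model outputs are stored as arrays), the cumulative error used as objective f2 can be computed as:

```python
import numpy as np

def cumulative_error(yd, yr):
    """e_c = sum_{i=1}^{N} (Y_d(i) - Y_r(i))^2 over the N = 296 samples."""
    e = np.asarray(yd, dtype=float) - np.asarray(yr, dtype=float)
    return float(np.sum(e ** 2))
```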
Figure 7 represents the desired training output, i.e. the desired training
concentration of CO2 released at the output of the boiler.
Figure 8 shows the RBF neural network model output during the training phase.
Figure 9 represents the desired output and the RBF neural network model output
during the training phase, and the training error is shown in Fig. 10.
Fig. 9 Desired output and RBF neural network model output during training phase
By analyzing these figures, we can observe that the output of the RBF neural
network model follows the desired output very closely during the training phase,
with a training error of 0.2043.
In the training error figure (Fig. 10), the most significant peak appears at the
10th iteration.
Figure 11 represents the desired validation concentration of CO2 released at the
output of the boiler (the desired validation output).
Figure 12 shows the RBF neural network model output during the validation phase.
Figure 13 represents the desired output and the RBF neural network model
output during the validation phase, and the validation error is shown in Fig. 14.
In these figures we can also observe that the output of the RBF neural network
model follows the desired output during the validation phase, with a validation
error of 0.3216.
This error is larger than the training error, which can be explained by the fact
that the number of training data chosen is lower than the number of validation data.
Fig. 13 Desired output and RBF neural network model output during validation phase
In the validation error figure (Fig. 14), the most significant peaks appear at the
37th and 163rd iterations.
The global desired output, i.e. the global desired concentration of CO2 released
at the output of the boiler, is shown in Fig. 15.
Figure 16 represents the global RBF neural network model output, and Fig. 17
shows the concentration of CO2 released at the output of the boiler (desired output
yd) together with the output of the RBF neural model (yr).
Figure 18 represents the global error, and finally the Pareto front “global
cumulated error as a function of the number of neurons” is shown in Fig. 19.
Fig. 17 The concentration of CO2 released from the output of the boiler (desired output yd) and
the output of the RBF neural model (yr)
Based on these results, we can conclude that the multi-objective genetic algorithm
NSGAII produced a good structure for the radial basis function neural network
model, with a good number of neurons in the hidden layer and good connection
weights between the hidden layer and the output layer; it also found the best
parameters of the radial functions of the hidden layer neurons. The radial basis
function neural network model output is very close to the desired output, with a
training error of 0.2043, a validation error of 0.3216, and a global error of 0.5259.
Fig. 19 Pareto front “global cumulated error function of the number of neurons”
• The number of neurons of the hidden layer of the radial basis function neural
network.
• The quadratic error between the desired output of the RBF neural network and
its actual output.
• The number of regressors at the input of the RBF neural network.
6 Conclusion
Neural networks are increasingly used and applied in various fields, mainly in the
problems of modelling of complex systems. This is due to their simplicity and their
universal approximation properties and the ability of information parallel treatment.
These properties make that these networks are well used for modeling and con-
trolling linear and nonlinear dynamics systems, where conventional methods fail.
The most difficult problem to solve for neural networks is to obtain the best and
right architecture. The networks established in most of practical applications are
built with an experimental way.
This difficulty can be highlighted by a number of issues, such as the number of
hidden layers to be used in a multilayer network, the optimal number of neurons in
each layer and the initial values of connection weights during the learning phase …
etc. A bad choice can lead to poor performances of the corresponding network.
In this work we present a technique to solve the problems mentioned above by
treating them as a multi-criteria optimization problem. We considered the design of
an RBF neural network using multi-objective genetic algorithms of the NSGAII
type, optimizing two objective functions simultaneously: the first is the cumulative
quadratic error, i.e. the difference between the desired signal and the RBF neural
network model output signal, and the second is the number of neurons in the
hidden layer. The NSGA II chromosome therefore contains the number of neurons
in the hidden layer, the Gaussian function centers and widths of the hidden layer
neurons, and the weights of the connections between the hidden layer and the
output layer. At the end of the off-line evolution of this algorithm, we obtain a set
of RBF models forming the final Pareto front, which includes all admissible results
satisfying the predefined criteria.
This optimization technique is applied to the modelling of a nonlinear system,
the Box and Jenkins process: a gas-fired boiler whose input is the gas flow at the
inlet and whose output is the concentration of released CO2. The results show that
using NSGAII to optimize the RBF neural network provides a good model, and
these results are very satisfying.
At the end of its evolution, the NSGAII algorithm converges to a set of
solutions (the Pareto front) respecting the optimization criteria. Since it is generally
difficult to choose one solution from this set, we propose as future work the use of a
selection method such as a multi-criteria decision analysis approach.
References
15. Chen, T., Chen, H.: Approximation capability to functions of several variables, nonlinear
functionals, and operators by radial basis function neural networks. IEEE Trans. Neural Netw.
6(4), 904–910 (1995). doi:10.1109/72.392252
16. Cherkassky, V., Friedman, J.H., Wechsler, H.: From statistics to neural networks: theory and
pattern recognition applications. Springer Publishing Company, Incorporated (2012)
17. Cook, D.F., Ragsdale, C.T., Major, R.L.: Combining a neural network with a genetic
algorithm for process parameter optimization. Eng. Appl. Artif. Intell. 13(4), 391–396 (2000).
doi:10.1016/S0952-1976(00)00021-X
18. Dahl, G.E., Sainath, T.N. and Hinton, G.E.: Improving deep neural networks for LVCSR
using rectified linear units and dropout. In: 2013 IEEE International Conference on Acoustics,
Speech and Signal Processing (ICASSP), pp. 8609–8613. IEEE (2013)
19. Deb, K., Goel, T.: Controlled Elitist non-dominated sorting genetic algorithms for better
convergence. In: Zitzler, E., Thiele, L., Deb, K., Coello, C.A.C., Corne, D. (eds.) Evolutionary
Multi-Criterion Optimization. Lecture Notes in Computer Science, pp. 67–81. Springer, Berlin
(2001)
20. Deichmueller, M., Denkena, B., de Payrebrune, K.M., Kröger, M., Wiedemann, S., Schröder,
A., Carstensen, C. (2013) Modeling of process machine interactions in tool grinding. In:
Process Machine Interactions, pp. 143–176. Springer, Berlin
21. Deutschmann, O.: Modeling and Simulation of Heterogeneous Catalytic Reactions: From the
Molecular Process to the Technical System. Wiley, New York (2013)
22. Domínguez, J., Montiel-Ross, O., Sepúlveda, R.: High-performance architecture for the
modified NSGA-II. In: Melin, P., Castillo, O. (eds.) Soft Computing Applications in
Optimization, Control, and Recognition. Studies in Fuzziness and Soft Computing, vol. 294,
pp. 321–341. Springer, Berlin Heidelberg (2013)
23. Urbani, D., Marcos, S., Thiria, S.: Statistical methods for selecting neural architectures:
application to the design of models of dynamic processes. PhD thesis, University Pierre and
Marie Curie (1995)
24. Furtuna, R., Curteanu, S., Leon, F.: Multi-objective optimization of a stacked neural network
using an evolutionary hyper-heuristic. Appl. Soft Comput. 12(1), 133–144 (2012). doi:10.
1016/j.asoc.2011.09.001
25. Box, G.E.P., Jenkins, G.M.: Time series analysis: forecasting and control, p. 575. Holden-Day,
San Francisco (1976)
26. Giles, C.L., Chen, D., Sun, G.-Z., Chen, H.-H., Lee, Y.-C., Goudreau, M.W.: Constructive
learning of recurrent neural networks: limitations of recurrent cascade correlation and a simple
solution. IEEE Trans. Neural Netw. 6(4), 829–836 (1995). doi:10.1109/72.392247
27. Giles, C.L., Miller, C.B., Chen, D., Chen, H.H., Sun, G.Z., Lee, Y.C.: Learning and extracting
finite state automata with second-order recurrent neural networks. Neural Comput. 4(3),
393–405 (1992). doi:10.1162/neco.1992.4.3.393
28. Gilson, M., Py, J.S., Brault, J.-J., Sawan, M.: Training recurrent pulsed networks by genetic
and Taboo methods. In: Canadian Conference on Electrical and Computer Engineering, 2003,
IEEE CCECE 2003, vol. 3, pp. 1857–1860 (2003). doi:10.1109/CCECE.2003.1226273
29. Girosi, F., Poggio, T.: Networks and the best approximation property. Biol. Cybern. 63(3),
169–176 (1990). doi:10.1007/BF00195855
30. Gossard, D., Lartigue, B., Thellier, F.: Multi-objective optimization of a building envelope for
thermal performance using genetic algorithms and artificial neural network. Energy Build. 67,
253–260 (2013). doi:10.1016/j.enbuild.2013.08.026
31. Grasso, F., Luchetta, A., Manetti, S., Piccirilli, M.C.: System identification and modelling
based on a double modified multi-valued neural network. Analog Integr. Circ. Sig. Process 78
(1), 165–176 (2014). doi:10.1007/s10470-013-0211-y
32. Guerra, F.A., dos Coelho, L.S.: Multi-step ahead nonlinear identification of Lorenz’s chaotic
system using radial basis neural network with learning by clustering and particle swarm
optimization. Chaos, Solitons Fractals 35(5), 967–979 (2008). doi:10.1016/j.chaos.2006.05.
077
33. Gupta, M.M., Rao, D.H.: Neuro-control systems: theory and applications. IEEE, New York
(1993)
34. Gutiérrez, P.A., Hervas-Martinez, C., Martínez-Estudillo, F.J.: Logistic regression by means of
evolutionary radial basis function neural networks. IEEE Trans. Neural Netw. 22(2), 246–263
(2011). doi:10.1109/TNN.2010.2093537
35. Hashmi, K., Alhosban, A., Najmi, E., Malik, Z., Rezgui, A.: Automated Web service
quality component negotiation using NSGA-2. In: 2013 ACS International Conference on
Computer Systems and Applications (AICCSA), pp. 1–6 (2013). doi:10.1109/AICCSA.2013.
6616502
36. Haykin, S., Widrow, B.: Least-Mean-Square Adaptive Filters. Wiley, New York (2003)
37. Jacek M.Z.: Introduction to Artificial Neural Systems. Jaico Publishing House, Mumbai
(1992)
38. Jafarnejadsani, H., Pieper, J., Ehlers, J.: Adaptive control of a variable-speed variable-pitch
wind turbine using radial-basis function neural network. IEEE Trans. Control Syst. Technol.
21(6), 2264–2272 (2013). doi:10.1109/TCST.2012.2237518
39. Lamamra, K., Belarbi, K., Bosche, J., Hajjaji, A.E.L.: A neural network controller optimised
with multi objective genetic algorithms for a laboratory anti-lock braking system. Sci.
Technol. J. Constantine 1 Univ 35 (2012)
40. Kasabov, N., Dhoble, K., Nuntalid, N., Indiveri, G.: Dynamic evolving spiking neural
networks for on-line spatio-and spectro-temporal pattern recognition. Neural Networks 41,
188–201 (2013)
41. Levine, D.S., Aparicio I.V.M.: Neural networks for knowledge representation and inference.
Psychology Press, Rouledge (2013)
42. Mallot, H.A.: Artificial neural networks. In: Computational Neuroscience, Springer Series in
Bio-/Neuroinformatics, vol. 2, pp. 83–112. Springer International Publishing, Berlin (2013)
43. Min, B.H., Park, C., Jang, I.S., Lee, H.Y., Chung, S.H., Kang, J.M.: Multi-objective history
matching allowing for scale-difference and the interwell complication. doi:10.3997/2214-
4609.20130172
44. Gordon, M.B.: Dynamics of complex systems and applications to SHS: models, concepts,
methods. Leibniz-IMAG Laboratory, Grenoble (2004)
45. Morse, J.N.: Reducing the size of the nondominated set: pruning by clustering. Comput. Oper.
Res. 7(1–2), 55–66 (1980). doi:10.1016/0305-0548(80)90014-3
46. Mukhopadhyay, S., Panigrahi, P.K., Mitra, A., Bhattacharya, P., Sarkar, M., Das, P.:
Optimized DHT-RBF model as replacement of ARMA-RBF model for wind power
forecasting. In: 2013 International Conference on Emerging Trends in Computing,
Communication and Nanotechnology (ICE-CCN), pp. 415–419. doi:10.1109/ICE-CCN.
2013.6528534 (2013)
47. Nikdel, N., Nikdel, P., Badamchizadeh, M.A., Hassanzadeh, I.: Using neural network model
predictive control for controlling shape memory alloy-based manipulator. IEEE Trans. Industr.
Electron. 61(3), 1394–1401 (2014). doi:10.1109/TIE.2013.2258292
48. Pendharkar, P.C.: A hybrid radial basis function and data envelopment analysis neural network
for classification. Comput. Oper. Res. 38(1), 256–266 (2011). doi:10.1016/j.cor.2010.05.001.
(Project Management and Scheduling)
49. Poggio, T., Girosi, F.: Networks for approximation and learning. Proc. IEEE 78(9),
1481–1497 (1990). doi:10.1109/5.58326
50. Prasad, K.V.R.B., Singru, P.M.: Optimum design of turbo-alternator using modified NSGA-II
algorithm. In: Bansal, J.C., Singh, P., Deep, K., Pant, M., Nagar, A. (eds.) Proceedings of
Seventh International Conference on Bio-Inspired Computing: Theories and Applications
(BIC-TA 2012), Advances in Intelligent Systems and Computing, vol. 202, pp. 253–264.
Springer, India (2013)
51. Roberto, B., Ubaldo, C., Stefano, M., Roberto, I., Elisa, S., Paolo, M.: Graybox and adaptative
dynamic neural network identification models to infer the steady state efficiency of solar
thermal collectors starting from the transient condition. Sol. Energy 84(6), 1027–1046 (2010)
52. Dos Santos Coelho, L., Ferreira da Cruz, L., Zanetti Freire, R.: Swim velocity profile
identification by using a modified differential evolution method associated with RBF neural
network. In: 2013 Third International Conference on Innovative Computing Technology
(INTECH), pp. 389–395. doi:10.1109/INTECH.2013.6653721 (2013)
53. Deb, K., Agrawal, S., Pratap, A., Meyarivan, T.: A fast elitist non-dominated sorting genetic
algorithm for multi-objective optimization: NSGA-II. In: Schoenauer, M., Deb, K., Rudolph,
G., Yao, X., Lutton, E., Merelo, J.J., Schwefel, H.-P. (eds.) Parallel Problem Solving from
Nature PPSN VI. Lecture Notes in Computer Science. Springer, Berlin (2000)
54. Sheikhan, M., Shahnazi, R., Garoucy, S.: Hyperchaos synchronization using PSO-optimized
RBF-based controllers to improve security of communication systems. Neural Comput. Appl.
22(5), 835–846 (2013). doi:10.1007/s00521-011-0774-4
55. Sheikhan, M., Shahnazi, R., Hemmati, E.: Adaptive active queue management controller for
TCP communication networks using PSO-RBF models. Neural Comput. Appl. 22(5),
933–945 (2013). doi:10.1007/s00521-011-0786-0
56. Syed, A.A., Pittner, A., Rethmeier, M., De, A.: Modeling of gas metal arc welding process
using an analytically determined volumetric heat source. ISIJ Int. 53(4), 698–703 (2013)
57. Tang, Y., Wong, W.K.: Distributed synchronization of coupled neural networks via randomly
occurring control. IEEE Trans. Neural Netw. Learn. Syst. 24(3), 435–447 (2013)
58. Teixidor, D., Grzenda, M., Bustillo, A., Ciurana, J.: Modeling pulsed laser micromachining of
micro geometries using machine-learning techniques. J. Intell. Manuf. 1–14 (2013). doi:10.
1007/s10845-013-0835-x
59. Whitley, D., Starkweather, T., Bogart, C.: Genetic algorithms and neural networks: optimizing
connections and connectivity. Parallel Comput. 14(3), 347–361 (1990). doi:10.1016/0167-
8191(90)90086-O
60. Xia, D., Wang, L., Chai, T.: Neural-network-friction compensation-based energy swing-up
control of pendubot. IEEE Trans. Industr. Electron. 61(3), 1411–1423 (2014). doi:10.1109/
TIE.2013.2262747
61. Xiao, Z., Liang, S., Wang, J., Chen, P., Yin, X., Zhang, L., Song, J.: Use of general regression
neural networks for generating the GLASS leaf area index product from time-series MODIS
surface reflectance. IEEE Trans. Geosci. Remote Sens. 52(1), 209–223 (2014). doi:10.1109/
TGRS.2013.2237780
62. Xu, Y., Zheng, J.: Identification of network traffic based on radial basis function neural
network. In: Chen, R. (ed.) Intelligent Computing and Information Science. Communications
in Computer and Information Science, vol. 134, pp. 173–179. Springer, Berlin (2011)
63. Yu, H., Xie, T., Paszczynski, S., Wilamowski, B.M.: Advantages of radial basis function
networks for dynamic system design. IEEE Trans. Industr. Electron. 58(12), 5438–5450
(2011). doi:10.1109/TIE.2011.2164773
64. Yuan, J., Yu, S.: Privacy preserving back-propagation neural network learning made practical
with cloud computing. IEEE Trans. Parallel Distrib. Syst. 25(1), 212–221 (2014). doi:10.1109/
TPDS.2013.18
65. Zhang, H., Yang, F., Liu, X., Zhang, Q.: Stability analysis for neural networks with time-
varying delay based on quadratic convex combination. IEEE Trans. Neural Netw. Learn. Syst.
24(4), 513–521 (2013)
66. Zhi, C., Guo, L.H., Zhang, M.Y., Shi, Y.: Research on dynamic subspace divided BP neural
network identification method of color space transform model. Adv. Mater. Res. 174, 97–100
(2011)
Back-Propagation Neural Network for Gender Determination in Forensic Anthropology
I. Afrianty (&)
Faculty of Science and Technology, UIN Suska Riau, Pekanbaru 28124, Indonesia
e-mail: [email protected]
D. Nasien · H. Haron
Faculty of Computing, Universiti Teknologi Malaysia (UTM), 81310 Skudai, Johor,
Malaysia
e-mail: [email protected]
H. Haron
e-mail: [email protected]
M.R.A. Kadir
Faculty of Bioscience and Medical Engineering, Universiti Teknologi Malaysia (UTM),
81310 Skudai, Johor, Malaysia
e-mail: afi[email protected]
using DFA only obtained accuracy as high as 87 %. Hence, it can be concluded that
BPNN provides higher classification accuracy than DFA for gender determination
in forensic anthropology.
Keywords Forensic anthropology · Gender determination · Sacrum bones ·
Back-propagation neural network
1 Introduction
In the past few years, many issues have been raised concerning the many bone
fragments or skeletal remains found but not identified with various conditions such
as burns, dismemberment, dry bone, or simply not intact [2, 40]. These conditions
cannot provide further information as to whether the fragments are derived from
human or non-human remains. Usually the cases related to research into human
skeletons from ancient times. However, lately the findings of human skeletons is in
the criminal context. Several criminal cases have included the discovery of bodies
and human skeletons without viable means of identification. These problems are
usually handled by forensic anthropologist or researchers in forensic anthropology
field. Forensic anthropologists often bemoan the major issues in forensic anthro-
pology namely the process of identifying the bones or skeletal remains of the
unknown. These cases have always been a challenge for forensic anthropologist in
order to recognize and identify the skeletal remains. The skeleton or skeletal
remains that have been found should be identified and established the cause and
manner of death.
Identification of skeletal remains (as in Fig. 1) is a mainstay of forensic
anthropology. The ability of the forensic anthropologist to undertake an analysis is
fundamentally determined by how well the skeletal remains are preserved and how
much of their biological profile can be uncovered [44]. Forensic anthropology is
one of the fastest growing medico-legal disciplines, whose main goal is to identify
the biological profile of unknown skeletal remains, including the cause and time of
death [19]. As a discipline, forensic anthropology must see its responsibilities
through from the scene to the courtroom. It is a relatively young subfield within
biological anthropology focused on biological profile identification, with the task
of building detailed biological profiles of human skeletal remains still open [6].
The most important component of the identification of the biological profile of
an individual is gender determination, which is the first essential step in the
positive identification of skeletal remains. Knowledge of the gender of an
unknown set of remains is essential to make a more accurate estimation of age
[24]. Without an accurate determination of gender, there can be no accurate
determination or estimation of age at death. Thus, gender determination is a
necessary step in the identification process, which also includes the “Big Four”
parameters.
To assist in determining gender, tools or classification techniques are used that can
provide more accurate information about the biological profile of an individual.
When the skeleton is complete and available for identification, a more accurate
result in gender determination is obtained [10, 45]. In previous studies, many
researchers have used linear approaches such as Discriminant Function Analysis
(DFA). DFA (or DA) is a method used to find a set of axes that possess the greatest
possible ability to discriminate between two or more groups [15]. In previous
studies, researchers have used many parts of the skeleton for identification, and one
of the skeletal elements used as an indicator of gender is the pelvic bones. Using
the same data as previous studies, this paper applies another classification
technique, from the non-linear approach, that applies an Artificial Neural Network
model, namely the Back-Propagation Neural Network (BPNN).
BPNN is a classical domain-dependent technique for supervised training [4].
The purpose of this paper is to determine gender with a BPNN from a part of the
pelvic bones, namely the sacrum bones, and to compare the classification accuracy
of the results obtained by BPNN with the previous technique (DFA). This paper is
structured as follows: Sect. 2 reviews and discusses the related work; Sect. 3 gives
an overview of the proposed Neural Network technique for gender determination,
namely the BPNN technique; Sect. 4 describes the research methodology and
explains the key functions of the proposed technique; Sect. 5 is the discussion,
which explains data acquisition in the gender determination process and shows the
results obtained; finally, a conclusion is provided in Sect. 6.
2 Related Work
Anthropology is the study about the human biological, cultural and linguistic
conditions. Anthropology is divided into two main branches, namely cultural and
physical [26]. Forensic anthropology is closely related with physical or biological
anthropology that work through identification of skeletal remains. Forensic
anthropology is disciplines of physical or biological anthropology that the fastest
growing [19, 26]. The main objective of forensic anthropology is to identify skeletal
remains and thus generate a biological profile of the individual. Following the
biological profile, anthropologists or researchers will endeavor to provide a per-
sonal identification of the remains based on evidence, any distinguishing charac-
teristics the individual may display, and determine whether remains derived are
human or non-human. The biological profile includes gender, age, race (ancestry),
and stature, also known as the ‘‘Big Four’’ parameters of forensic anthropology [29,
35]. The first step for positive identification when dismembered or decomposed
bodies are recovered is gender determination [28, 50]. Identification then proceeds
toward the determination of age, race, and stature. In other words, gender deter-
mination is necessary to identify age, ancestry, and stature.
In previous research, identification of skeletal remains has been done by
fingerprint, anthropological, or dental analysis, DNA analysis in the laboratory, or
radiological examination [55]. The most popular method for identification of
gender is DNA analysis. In some cases, however, where the bones are burned,
dismembered, or very dry, DNA analysis fails because suitable DNA cannot be
extracted under these conditions and is not recoverable from remains in all
circumstances [6, 49]. Thus, protein analysis or study of the microscopic structure
of the fragment may be useful and, at times, the only applicable method [6]. DNA
analysis has been developed to provide accurate gender determination; never-
theless, it cannot replace anthropological analysis, because it cannot provide data
on some of the important parameters of the biological profile [7]. Moreover, a
thorough anthropological analysis is conducted to obtain a more reliable charac-
terization of the individual, providing more data to confirm identity [7]. Therefore,
due to the drawbacks of DNA analysis, forensic anthropology has been developed
in order to improve previous identification methods for profiling unknown
remains, particularly in gender determination. Forensic anthropology assists in
creating a biological profile, including the determination of gender, age, race, and
stature, also known as the “Big Four” parameters of forensic anthropology [29].
The contribution of forensic anthropology is often important during the investi-
gation and the interpretation of decomposing human remains [26].
Gender is built on biological sex; it is the very process of creating a dichotomy
by effacing similarity and elaborating on difference, and it is thus related to
biology. In general, gender determination is an important part of the forensic
process, and it is more reliable when the skeleton is complete and in good
condition. The purpose of gender determination is to identify human skeletal
remains in order to know the difference between male and female.
The review in [16] distinguished three periods in the development of the history
of forensic anthropology: pre-1939, 1939–1972, and post-1972, while [53]
distinguished four periods of forensic anthropology development, based on
milestones in publications and research:
1. Early 18th century to the last quarter of the 19th century
2. From 1878 until 1939
3. From post-World War II until the last quarter of the 20th century
4. From 1972 until the present day, because many researchers are still working on
it (post-1972).
These papers discussed identification from skeletal remains and from the parts of
bones that were found, including research involving estimates of gender, age,
ancestry, and stature. Based on these reviews, the development of forensic
anthropology can be summarized in three periods: the 18th century, the 19th
century, and the 20th century.
Before the late 18th century, skeletal analysis within a forensic context was
mostly an applied area of anatomy, so anatomists and physicians could use general
knowledge, the few techniques that existed in textbooks, and their experience [54].
In the 19th century, anthropologists were increasingly in demand by medico-legal
establishments to render aid in cases of skeletonized remains (e.g. for the identi-
fication of human remains from World War II and the Korean War) [35]. Before
1939, the principal contributors to the methodology studied human skeletal
variation using collections of bodies of known age, heredity, gender, and
morbidity [16]. In the late 19th century and into the 20th century, forensic
anthropology developed into its modern period, marked by the establishment of
the “Physical Anthropology Section of the American Academy of Forensic
Sciences” in 1971, and anthropologists have applied modern techniques to solving
cases [13, 35].
Recent developments place forensic anthropology in a wider criminal investi-
gation context; researchers are even asked to aid in the identification of living
individuals [26]. Forensic anthropology covers a variety of topics and issues, and
anthropologists are no longer limited to research involving the estimation of
gender, age, ancestry, and stature. Worldwide, there seem to be considerable
differences in many aspects of education, training, professional status, research
activities, and job opportunities [26]. However, research is continuously being
improved and developed with the involvement of a variety of supporting
techniques [16].
identification and recognition and interpretation of evidence of foul play [54]. The
biological profile includes gender, age, race (ancestry), and stature, also known as
the ‘‘Big Four’’ parameters of forensic anthropology [29, 35].
1. Gender determination
Gender determination is the classification of an individual as male or female
[12]. Gender determination based on skeletal features plays a crucial role in
legal medicine and forensic anthropology [18]. Many researchers have carried
out gender determination using several parts of the body, e.g. [14, 23, 34].
2. Age determination
Knowledge of the gender of an unknown set of remains is essential to make a
more accurate estimation of age [24]. Age can be estimated from the
progression of skeletal maturity.
3. Race (ancestry) determination
Ancestry determination is an important component of a skeletal profile that is
very difficult to estimate [35]. Ancestry estimates are based on two features,
namely shape or morphology and the metric analysis of various elements of the
skeleton.
4. Stature determination
Determination of stature is an important aspect of establishing identity. Stature
standards are based on two major methods, namely the anatomical and the
mathematical (regression) method [1, 35]. The anatomical method requires the
presence of a complete or near-complete skeleton and thus provides the best
approximation of stature; however, its main drawbacks are that it requires a
complete skeleton and the addition of correction factors to compensate for soft
tissues [1]. The mathematical method, by contrast, is used for analyzing an
incomplete skeleton, such as a single bone or body part, in which the lengths of
target bones are regressed upon stature [35]. This method uses regression
equations or multiplication factors based on the correlation of individual bone
measurements with living statures. Its disadvantage is that its predictive ability
is less accurate because of the wide variability in population body proportions
[1, 35].
2.3 Materials
Pelvic bones consist of a pair of hipbones, sacrum, and coccyx. The pelvic is
another element of the skeleton that exhibits sexual dimorphism [35]. It is con-
sidered the most sexually dimorphic skeletal element because the female pelvic
must accommodate the relatively large head of an infant during childbirth. Hence,
the female pelvic is typically wider in every dimension than the male pelvic [51].
Sacrum bones are a part of the pelvic bones that are related to reproduction and
Fig. 3 The six variables for sacrum bones by metric measurement [14]
fertility. The sacrum could well be thought to share significant qualities with the
reproductive organs, and even to transport material from the brain to those organs.
Hence, sacrum bones can be up to 100 % accurate as an indicator of sex if all of the
required data are complete. Several researchers have conducted studies of the
pelvic bone in gender determination, e.g. [14, 15, 56].
The data used in this paper comprise 91 sacrum bones, 34 female and 57 male,
derived from the analysis of previous research, namely [14]. The collected data
were then measured using metric measurement (see Fig. 3).
There are six measurements of the sacrum bones that are used as indicators or
variables in determining gender: real height, anterior length, anterior superior
breadth, mid-ventral breadth, anterior-posterior diameter of the base, and maximum
transverse diameter of the base. The variables for the measurement of the sacrum
bones, with their respective codes, can be seen in Table 1.
From the review of previous studies on forensic anthropology, it can be sum-
marized that the process of gender determination is traditionally divided into two
steps. The first is the measurement of the collected data, and the second is
classification, here performed with a Back-Propagation
Neural Network (BPNN). Unlike DFA, ANN does not require distributional
assumptions of the variables and is able to model all types of non-linear functions
between input and output of a model [9].
3 Proposed Technique
In this paper, proposed Artificial Neural Network (ANN) technique specific BPNN
for gender determination. ANN is a characteristic in biological Neural Network
(NN) in which it contains an information processing system modeled on the
structure of the dynamic process that are composed of simple elements operating in
parallel [37]. ANN consists of a number of neurons that are linked with the neurons
in the human brain. ANN can be classified into feed forward and recurrent, depend
on their connectivity. The ability of an ANN to predict outcomes accurately depend
on the selection of proper weights during the training. Training or learning is the
relationship between inputs and target. The rule of the learning defined as a pro-
cedure of a network aims to adjust weights and biases [5]. Its learning rule uses the
most rapid descent to continuously adjust the weights and thresholds of neural
network through back propagation, so that the sum of squared error of network is
minimum [20]. Three learning of neural network methods are supervised, unsu-
pervised learning, and reinforced learning [30]. In supervised learning, the network
is provided with inputs and desired outputs or target values. In unsupervised
learning, on the other hand, the weights and biases are modified in response to
network inputs only. The performance of the models is measured using Mean
Squared Error (MSE). MSE is the average of the squares of the difference between
each output and the desired output, given by equation (Eq. 1) below:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{y_d}\left(y_{ij} - \hat{y}_{ij}\right)^{2} \qquad (1)$$
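As a sketch (with hypothetical names, assuming targets and outputs are stored row-wise per training pattern), Eq. (1) can be computed as:

```python
import numpy as np

def mse(T, Y):
    """Eq. (1): (1/n) * sum_i sum_j (T_ij - Y_ij)^2,
    with rows = training patterns and columns = output neurons."""
    T = np.atleast_2d(np.asarray(T, dtype=float))
    Y = np.atleast_2d(np.asarray(Y, dtype=float))
    return float(np.sum((T - Y) ** 2) / T.shape[0])
```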
An ANN is trained using the back propagation algorithm, in which the errors of
the hidden layer units are determined by back-propagating (BP) the errors of the
output layer units [25]. Two signals are involved in a BPNN: the input signal and
the error signal. The input signal is presented to the input layer and propagated
forward until the output layer produces its output. The error signal, in contrast,
arises from the difference from the desired output: the error is calculated and then
propagated backwards from the output layer to the input layer. The Back
Propagation Neural Network (BPNN) is currently the most widely used algorithm
for supervised learning with multilayer feedforward networks [37]. It works by
measuring the output error, calculating the gradient of this error, and adjusting the
ANN weights (and biases) in the descending gradient direction. Hence, BPNN is a
gradient-descent local search procedure (expected to stagnate in local optima in
complex landscapes) [4]. Owing to its good robustness and fault tolerance, BPNN
is widely used in optimization and function approximation.
3. [V] represents the weights of the synapses connecting input neurons and hidden
neurons, and [W] represents the weights of the synapses connecting hidden
neurons and output neurons. Initialize the weights to small random values,
usually in [−1, 1]. For general problems, λ can be assumed to be 1 and the
threshold values can be taken as 0.
A common choice is to initialize the weights in the range −0.5 to +0.5 for a
large database, and −0.35 to +0.35 for a small one.
4. Present one set of inputs and outputs from the training data as the pattern to the
input layer. Using a linear activation function, the output of the input layer is
evaluated as
$$\{O\}_I = \{I\}_I \qquad (3)$$
(both of size l × 1).
6. Let the hidden layer units evaluate their output using the sigmoidal function
$$O_{Hi} = \frac{1}{1 + e^{-I_{Hi}}} \qquad (5)$$
(size m × 1). The input to the output layer units is then evaluated as
$$\{I\}_O = [W]^{T}\{O\}_H \qquad (6)$$
with dimensions (n × 1) = (n × m)(m × 1).
8. Let the output layer units evaluate their output using the sigmoidal function
$$O_{Oj} = \frac{1}{1 + e^{-I_{Oj}}} \qquad (7)$$
9. Calculate the error, i.e. the difference between the network output and the
desired output, for the ith training set as
$$E_p = \sqrt{\frac{\sum_{j}\left(T_j - O_{Oj}\right)^{2}}{n}} \qquad (8)$$
10. Find the output-layer delta
$$d_k = \left(T_k - O_{Ok}\right)O_{Ok}\left(1 - O_{Ok}\right) \qquad (9)$$
(size n × 1).
11. Find
$$[Y] = \{O\}_H \langle d \rangle \qquad (10)$$
with dimensions (m × n) = (m × 1)(1 × n).
12. Find the weight change of the output layer, with learning rate η and momentum α,
$$[\Delta W]^{t+1} = \alpha\,[\Delta W]^{t} + \eta\,[Y] \qquad (11)$$
13. Find
$$\{e\} = [W]\{d\} \qquad (12)$$
with dimensions (m × 1) = (m × n)(n × 1).
14. Find the hidden-layer delta
$$d_i^{*} = e_i\,O_{Hi}\left(1 - O_{Hi}\right) \qquad (13)$$
(size m × 1).
15. Find
$$[X] = \{O\}_I \langle d^{*} \rangle = \{I\}_I \langle d^{*} \rangle \qquad (14)$$
16. Find the weight change of the input layer,
$$[\Delta V]^{t+1} = \alpha\,[\Delta V]^{t} + \eta\,[X] \qquad (15)$$
17. Update the weights: [V] ← [V] + [ΔV]^{t+1} and [W] ← [W] + [ΔW]^{t+1}.
18. Repeat steps 4–16 until the convergence error is less than the tolerance value.
The neural network methodology is well known for its ability to generalize, its
massively parallel processing power, and its high nonlinearity, making it well
suited for gender estimation [36].
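Putting the steps above together, the following Python fragment is a minimal sketch (an illustration under the reconstructed equations above, not the authors' MATLAB code) of per-pattern BPNN training with learning rate lr and momentum mc, as used in the experiments of Sect. 5 with the [6; 12; 2] structure and one-hot targets (1 for female, 0 for male); all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_bpnn(X, T, hidden=12, lr=0.5, mc=0.9, epochs=1000):
    """X: (p, l) inputs; T: (p, n) targets in [0, 1] (e.g. one-hot gender)."""
    p, l = X.shape
    n = T.shape[1]
    V = rng.uniform(-0.5, 0.5, (l, hidden))   # input -> hidden weights
    W = rng.uniform(-0.5, 0.5, (hidden, n))   # hidden -> output weights
    dV = np.zeros_like(V)
    dW = np.zeros_like(W)
    for _ in range(epochs):
        for x, t in zip(X, T):
            o_i = x                              # linear input layer, Eq. (3)
            o_h = sigmoid(V.T @ o_i)             # hidden activations, Eq. (5)
            o_o = sigmoid(W.T @ o_h)             # output activations, Eq. (7)
            d = (t - o_o) * o_o * (1 - o_o)      # output delta, Eq. (9)
            e = W @ d                            # back-propagated error, Eq. (12)
            d_star = e * o_h * (1 - o_h)         # hidden delta, Eq. (13)
            dW = mc * dW + lr * np.outer(o_h, d)       # Eqs. (10)-(11)
            dV = mc * dV + lr * np.outer(o_i, d_star)  # Eqs. (14)-(15)
            W = W + dW
            V = V + dV
    return V, W

def predict(x, V, W):
    """Forward pass only; returns the output activations for one pattern."""
    return sigmoid(W.T @ sigmoid(V.T @ x))
```

For the sacrum data, `train_bpnn` would be called with a (91, 6) matrix of normalized measurements and a (91, 2) target matrix; a stopping test on E_p (Eq. 8) could replace the fixed epoch count.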
4 Research Methodology
This section describes the research methodology, explaining the activities and the
detailed phases of the gender determination process using classification tech-
niques. The research framework is composed of five main phases, each of which
addresses one of the problems in this research, as follows:
1. The first phase is problem identification and specification.
This phase includes the problem background, forensic anthropology (specifi-
cally the identification of skeletal remains), gender determination, sacrum bones,
and the objective and scope of this research.
Identification of the problem statement began with a literature review on issues
related to gender determination based on sacrum bones in forensic anthropology.
There are three major issues related to the problem, namely forensic anthropology,
gender determination, and the development of a classification technique. After the
literature review, the problem to be solved by this research is defined, so that the
objective can be determined to answer the problem statement. The process of
problem identification is done by referring to the previous literature in published
papers and journals. To constrain the work, the scope is defined in accordance with
the predetermined objective; the scope of this research is limited to explaining the
process of BPNN in gender determination based on six measurements as variables
of the sacrum bones.
2. The second phase is data definition and collection.
Data definition means defining the type of data used, deciding the source of the
data, and categorizing the data for testing and training. The data used are sample
data of sacrum bones. Data collection means building the input data from the
original source and gathering it into a compilation of inputs with the information
relevant to validating and analyzing the algorithm. For the purpose of validating
and analyzing the algorithm, a function is created in the Matrix Laboratory
(MATLAB) program. The collected data are analyzed using metric measurement.
The data type and sources used in this paper are the sacrum bones dataset in table
form, obtained from the analysis of previous research.
3. The third phase is explaining the classification technique.
The classification technique used in the process of gender determination is the
Back Propagation Neural Network (BPNN). This phase starts from the variables
linked with the data in the gender determination process, which are used as the
nodes of the input layer of the BPNN. The BPNN then learns according to the
structure and parameters that have been decided, such as the number of neurons in
the hidden layer, the learning rate (lr), the momentum (mc), and the activation
function.
4. The fourth phase is evaluation of the result.
This is specifically an evaluation of the performance accuracy of the classifi-
cation technique based on the sample data and the result analysis produced in the
classification phase. The performance of the models is measured using the mean
squared error (MSE). The result obtained by the algorithm is recorded in table
form and then copied into Excel for graph plotting. Based on the table and graph, a
manual analysis by observation of the algorithm output is conducted to draw a
conclusion on the best technique for gender determination.
Based on the experiments with the classification technique, the result is the
recognition accuracy of the algorithm when analyzing the input data. The
classification accuracy is recorded manually and copied into an Excel table. After
each testing session, the graph is plotted using standard functions such as Plot
Graph in Excel. Manual graph analysis is then done, so that the best classifier for
gender determination can be found based on the highest accuracy. A detailed view
of the research framework of this paper is given in Fig. 6.
5. The last (fifth) phase is implementation.
The implementation includes a discussion of the tools required in the fourth
phase. The requirements to develop the integrated system are categorized into two
parts: hardware and software. The hardware used in this paper is a Compaq
Presario CQ41 with an Intel® Core™ i3 processor, running Windows 8 32-bit
with 2 GB of RAM. The software required for data analysis is the Matrix
Laboratory (MATLAB) program; MATLAB 2012a is used as the platform to
write the code for the BPNN.
5 Discussion
earlier ones in order to update the current input-hidden layer weights and hidden-
output layer weights. By updating these weights, the network learns to reach the
target; the target is 1 for female and 0 for male. In the algorithm, the error is
calculated at the output and new weight values are computed in each layer until
the error is minimized to a sufficiently small value. The ANN performance is
measured using the MSE and the total prediction accuracy of the network on the
tested data; training is best when the ANN achieves the lowest MSE value.
Table 4 Experimental results of BPNN for training and testing [6; 12; 2]

Performance of training
lr \ mc               0.1      0.5      0.9
0.1  Accuracy (%)   97.900   98.870   98.500
     MSE             0.012    0.010    0.011
0.5  Accuracy (%)   97.740   98.549   99.030
     MSE             0.012    0.011    0.010
0.9  Accuracy (%)   98.390   98.549   98.390
     MSE             0.014    0.010    0.010

Performance of testing
lr \ mc               0.1      0.5      0.9
0.1  Accuracy (%)   96.790   96.927   97.200
     MSE             1.120    1.952    1.356
0.5  Accuracy (%)   96.805   97.044   97.379
     MSE             0.954    1.515    1.660
0.9  Accuracy (%)   96.720   96.855   96.850
     MSE             1.683    1.704    1.547
Table 5 Comparison of the advantages and disadvantages of DFA and ANN

Discriminant function analysis (DFA)
Advantages:
1. DFA reduces subjective judgment as well as the level of expertise and
experience needed for the determination of sex
2. DFA techniques are simple, quick, and accurate for gender determination,
although always population specific
Disadvantages:
1. DFA gave slightly poorer classification results than the neural network in the
case of sex determination from the upper femur [9]
2. Regarding this research, the previous study obtained an accuracy of 87 %,
which is lower than BPNN

Artificial neural network (ANN)
Advantages:
1. The neural network is a powerful classification technique and may improve the
accuracy rate of gender determination models
2. It applies to problems where the relationships may be quite dynamic or
non-linear, unlike DFA
3. A neural network using many variables gives the best overall results, achieving
the highest rate of correctly classified individuals
4. A neural network using such variables correctly classified 92.1 % of male
femurs and 94.7 % of female femurs [9]
5. Regarding this research, the BPNN model provides an accuracy of 99.030 % for
training and 97.379 % for testing, indicating that BPNN gives higher accuracy
than DFA
Disadvantages:
1. The neural network architecture is different from the architecture of
microprocessors
2. It requires high processing time for large neural networks
3. The neural network needs training to operate
In the learning process of the BPNN, the experiment was repeated 10 times and
the results are outlined in Tables 3 and 4.
Table 3 shows that, over the 10 runs, the best training and testing performance
was obtained with lr = 0.9 and mc = 0.5; the average accuracy obtained is
97.740 % for training and 96.422 % for testing. The accuracy obtained by the
[6; 6; 2] structure of the BPNN is plotted in Fig. 9.
The results for the [6; 12; 2] structure can be seen in Table 4, which indicates
that each combination of lr and mc yields different results in both training and
testing. The experiment was again repeated 10 times. The highest accuracy was
found at lr = 0.5 and mc = 0.9, namely 99.030 % and 97.379 % training and
testing classification rates, respectively. The accuracy obtained by the [6; 12; 2]
structure of the BPNN is plotted in Fig. 10.
6 Conclusion
Acknowledgment This research has been supported by Malaysian Ministry of Education under
Fundamental Research Grant Scheme with UTM Vote Number RJ130000.7828.4F115.
References
1. Ahmed, A.A.: Estimation of stature from the upper limb measurements of Sudanese adults.
Forensic Sci. Int. 228, 178.e171–e178 (2013). doi:10.1016/j.forsciint.2013.03.008
2. Akhlaghi, M., Sheikhazadi, A., Ebrahimnia, A., Hedayati, M., Nazparvar, B., Saberi Anary, S.
H.: The value of radius bone in prediction of sex and height in the Iranian population.
J. Forensic Leg. Med. 19(4), 219–222 (2012). doi:10.1016/j.jflm.2011.12.030
3. Akhlaghi, M., Sheikhazadi, A., Naghsh, A., Dorvashi, G.: Identification of sex in Iranian
population using patella dimensions (Research Support, Non-U.S. Gov’t). J. Forensic Leg.
Med. 17(3), 150–155 (2010). doi:10.1016/j.jflm.2009.11.005
4. Alba, E., Chicano, J.F.: Training neural networks with GA hybrid algorithms. Genetic and
Evolutionary Computation—GECCO 2004, pp. 852–863. Springer, Heidelberg (2004)
5. Beale, M.H., Hagan, M.T., Demuth, H.B.: Neural Network Toolbox™ User's Guide, March
2012
6. Cattaneo, C.: Forensic anthropology: developments of a classical discipline in the new
millennium (review). Forensic Sci. Int. 165(2–3), 185–193 (2007). doi:10.1016/j.forsciint.
2006.05.018
7. Cunha, E., Pinheiro, J., Nuno Vieira, D.: Identification in forensic anthropology: its relation to
genetics. Int. Congr. Ser. 1288, 807–809 (2006). doi:10.1016/j.ics.2005.12.068
8. Dixit, S.G., Kakar, S., Agarwal, S., Choudhry, R.: Sexing of human hip bones of Indian origin
by discriminant function analysis. J. Forensic Leg. Med. 14(7), 429–435 (2007). doi:10.1016/
j.jflm.2007.03.009
9. du Jardin, P., Ponsaille, J., Alunni-Perret, V., Quatrehomme, G.: A comparison between neural
network and other metric methods to determine sex from the upper femur in a modern French
population (comparative study). Forensic Sci. Int. 192(1–3), 127 e121–126 (2009). doi:10.
1016/j.forsciint.2009.07.014
10. Duric, M., Rakocevic, Z., Donic, D.: The reliability of sex determination of skeletons from
forensic context in the Balkans. Forensic Sci. Int. 147(2–3), 159–164 (2005). doi:10.1016/j.
forsciint.2004.09.111
11. Editorial: Global forensic anthropology in the 21st century. Forensic Sci. Int. 117, 1–6 (2001)
12. El Morsi, D.A., Al Hawary, A.A.: Sex determination by the length of metacarpals and
phalanges: X-ray study on Egyptian population. J. Forensic Leg. Med. 20(1), 6–13 (2013).
doi:10.1016/j.jflm.2012.04.020
13. Eshak, G.A., Ahmed, H.M., Abdel Gawad, E.A.: Gender determination from hand bones
length and volume using multidetector computed tomography: a study in Egyptian people.
J. Forensic Leg. Med. 18(6), 246–252 (2011). doi:10.1016/j.jflm.2011.04.005
14. Gomez-Valdes, J.A., Torres Ramirez, G., Baez Molgado, S., Herrera Sain-Leu, P., Castrejon
Caballero, J.L., Sanchez-Mejorada, G.: Discriminant function analysis for sex assessment in
pelvic girdle bones: sample from the contemporary Mexican population. J. Forensic Sci. 56(2),
297–301 (2011). doi:10.1111/j.1556-4029.2010.01663.x
15. Gonzalez, P.N., Bernal, V., Perez, S.I.: Geometric morphometric approach to sex estimation of
human pelvis (research support, Non-U.S. Gov’t). Forensic Sci. Int. 189(1–3), 68–74 (2009).
doi:10.1016/j.forsciint.2009.04.012
16. Grisbaum, G.A., Ubelaker, D.H.: An Analysis of Forensic Anthropology Cases Submitted to
the Smithsonian Institution by the Federal Bureau of Investigation from 1962 to 1994.
Smithsonian Institution Press, Washington, DC (2001)
17. Guyomarc’h, P., Bruzek, J.: Accuracy and reliability in sex determination from skulls: a
comparison of Fordisc(R) 3.0 and the discriminant function analysis (comparative study).
Forensic Sci. Int. 208(1–3), 180 e181–186 (2011). doi:10.1016/j.forsciint.2011.03.011
18. Hsiao, T.H., Tsai, S.M., Chou, S.T., Pan, J.Y., Tseng, Y.C., Chang, H.P., Chen, H.S.: Sex
determination using discriminant function analysis in children and adolescents: a lateral
cephalometric study (research support, Non-U.S. Gov’t validation studies). Int J Legal Med.
124(2), 155–160 (2010). doi:10.1007/s00414-009-0412-1
19. Iscan, M.Y., Olivera, H.E.S.: Forensic anthropology in Latin America. Forensic Sci. Int. 109,
15–30 (2000)
20. Jianguo, Z., Gang, Q.: Application of BP neural network forecast model based on principal
component analysis in railway freight forecasting. In: Paper presented at the international
conference on computer science and service system (2012)
21. Kanchan, T., Krishan, K.: Anthropometry of hand in sex determination of dismembered
remains—a review of literature (review). J. Forensic Leg. Med. 18(1), 14–17 (2011). doi:10.
1016/j.jflm.2010.11.013
22. Kemkes-Grottenthaler, A.: Sex determination by discriminant analysis: an evaluation of the
reliability of patella measurements. Forensic Sci. Int. 147(2–3), 129–133 (2005). doi:10.1016/
j.forsciint.2004.09.075
23. Kim, D.I., Kim, Y.S., Lee, U.Y., Han, S.H.: Sex determination from calcaneus in Korean
using discriminant analysis. Forensic Sci. Int. 228(1–3), 177 e171–177 (2013). doi:10.1016/j.
forsciint.2013.03.012
24. Koçak, A., Özgür Aktas, E., Ertürk, S., Aktas, S., Yemisçigil, A.: Sex determination from the
sternal end of the rib by osteometric analysis. Leg. Med. 5(2), 100–104 (2003). doi:10.1016/
s1344-6223(03)00045-2
25. Kottaimalai, R., Rajasekaran, M.P., Selvam, V., Kannapiran, B.: EEG signal classification
using principal component analysis with neural network in brain computer interface
applications. In: Paper presented at the international conference on emerging trends in
computing, communication and nanotechnology (2013)
26. Kranioti, E., Paine, R.: Forensic anthropology in Europe: an assessment of current status and
application. J. Anthropol. Sci. 89, 71–92 (2011). doi:10.4436/jass.89002
27. Kranioti, E.F., Bastir, M., Sanchez-Meseguer, A., Rosas, A.: A geometric-morphometric study
of the Cretan humerus for sex identification (research support, Non-U.S. Gov’t). Forensic Sci.
Int. 189(1–3), 111 e111–118 (2009). doi:10.1016/j.forsciint.2009.04.013
28. Kranioti, E.F., Michalodimitrakis, M.: Sexual dimorphism of the humerus in contemporary
Cretans: a population-specific study and a review of the literature (review). J. Forensic Sci. 54
(5), 996–1000 (2009). doi:10.1111/j.1556-4029.2009.01103.x
29. Krishan, K., Sharma, A.: Estimation of stature from dimensions of hands and feet in a North
Indian population. J. Forensic Leg. Med. 14, 327–332 (2007). doi:10.1016/j.jcfm.2006.10.008
30. Kumaravel, G., Kumar, C.: A Novel Bats Echolocation System Based Back Propagation
Algorithm for Feed Forward Neural Network. Springer, Heidelberg (2012)
31. Lee, T.-L.: Back-propagation neural network for the prediction of the short-term storm surge
in Taichung harbor, Taiwan. Eng. Appl. Artif. Intell. 21(1), 63–72 (2008). doi:10.1016/j.
engappai.2007.03.002
32. Li, X.-f., Zhang, P.: A research on value of individual human capital of high-tech enterprises
based on the BP neural network algorithm. In: The 19th International Conference on Industrial
Engineering and Engineering Management, pp. 71–78. Springer, Berlin (2013)
33. Liang, L., Wu, D.: An application of pattern recognition on scoring Chinese corporations
financial conditions based on backpropagation neural network. J. Comput. Oper. Res. 32,
1115–1129 (2005)
34. Lin, C., Jiao, B., Liu, S., Guan, F., Chung, N.E., Han, S.H., Lee, U.Y.: Sex determination from
the mandibular ramus flexure of Koreans by discrimination function analysis using three-
dimensional mandible models (research support, Non-U.S. Gov’t). Forensic Sci. Int. 236, 191
e191–e196 (2014). doi:10.1016/j.forsciint.2013.12.015
35. Love, J. C., Hamilton, M.D.: Introduction to Forensic Anthropology, pp. 509–537 (2011).
doi:10.1007/978-1-60761-872-0_19
36. Mahfouz, M., Badawi, A., Merkl, B., Fatah, E.E., Pritchard, E., Kesler, K., Jantz, L.: Patella
sex determination by 3D statistical shape models and nonlinear classifiers. Forensic Sci. Int.
173(2–3), 161–170 (2007). doi:10.1016/j.forsciint.2007.02.024
37. Mandal, S.R., Raju, D.H.: Ocean wave parameters estimation using Backpropagation Neural
Networks. Mar. Struct. 18, 301–318 (2005). doi:10.1016/j.marstruc.2005.09.002
38. Mastrangelo, P., De Luca, S., Aleman, I., Botella, M.C.: Sex assessment from the carpals
bones: discriminant function analysis in a 20th century Spanish sample (historical article).
Forensic Sci. Int. 206(1–3), 216 e211–210 (2011). doi:10.1016/j.forsciint.2011.01.007
39. Mastrangelo, P., De Luca, S., Sanchez-Mejorada, G.: Sex assessment from carpals bones:
discriminant function analysis in a contemporary Mexican sample. Forensic Sci. Int. 209(1–3),
196 e191–115 (2011). doi:10.1016/j.forsciint.2011.04.019
40. Mostafa, E.M., El-Elemi, A.H., El-Beblawy, M.A., Dawood, A.E.-W.A.: Adult sex
identification using digital radiographs of the proximal epiphysis of the femur at Suez
Canal University Hospital in Ismailia, Egypt. Egypt. J. Forensic Sci. 2(3), 81–88 (2012).
doi:10.1016/j.ejfs.2012.03.001
41. Mountrakis, C., Eliopoulos, C., Koilias, C.G., Manolis, S.K.: Sex determination using
metatarsal osteometrics from the Athens collection (research support, Non-U.S. Gov’t).
Forensic Sci. Int. 200(1–3), 178 e171–177 (2010). doi:10.1016/j.forsciint.2010.03.041
42. Nagalakshmi, S., Kamaraj, N.: On-line evaluation of loadability limit for pool model with
TCSC using back propagation neural network. Int. J. Electr. Power Energy Syst. 47, 52–60
(2013). doi:10.1016/j.ijepes.2012.10.051
43. Nagaoka, T., Hirata, K.: Reliability of metric determination of sex based on long-bone
circumferences: perspectives from Yuigahama-minami, Japan (research support, Non-U.S.
Gov’t). Anat. Sci. Int. 84(1–2), 7–16 (2009). doi:10.1007/s12565-008-0003-0
44. Nicholas, G., Hollowell, J.: World archaeological Congress research handbooks in
archaeology. In: Blau, S., Ubelaker, D.H. (eds.) Handbook of Forensic Anthropology and
Archaeology. Left Coast Press Inc, California (2009)
45. Ogawa, Y., Imaizumi, K., Miyasaka, S., Yoshino, M.: Discriminant functions for sex
estimation of modern Japanese skulls. J. Forensic Leg. Med. 20(4), 234–238 (2012). doi:10.
1016/j.jflm.2012.09.023
46. Papaioannou, V.A., Kranioti, E.F., Joveneaux, P., Nathena, D., Michalodimitrakis, M.: Sexual
dimorphism of the scapula and the clavicle in a contemporary Greek population: applications
in forensic identification. Forensic Sci. Int. 217(1–3), 231 e231–237 (2012). doi:10.1016/j.
forsciint.2011.11.010
47. Raghavendra Babu, Y.P., Kanchan, T., Attiku, Y., Dixit, P.N., Kotian, M.S.: Sex estimation
from foramen magnum dimensions in an Indian population. J. Forensic Leg. Med. 19(3),
162–167 (2012). doi:10.1016/j.jflm.2011.12.019
48. Rajasekaran, S., Vijayalakshmi, G.A.: Neural Networks, Fuzzy Logic, Genetic Algorithms,
Synthesis and Applications. Prentice-Hall of India, New Delhi (2007)
49. Ramsthaler, F., Kreutz, K., Verhoff, M.A.: Accuracy of metric sex analysis of skeletal remains
using Fordisc based on a recent skull collection (comparative study). Int. J. Legal Med. 121(6),
477–482 (2007). doi:10.1007/s00414-007-0199-x
50. Slaus, M., Bedic, Z., Strinovic, D., Petrovecki, V.: Sex determination by discriminant function
analysis of the tibia for contemporary Croats (research support, Non-U.S. Gov’t). Forensic Sci.
Int. 226(1–3), 302 e301–304 (2013). doi:10.1016/j.forsciint.2013.01.025
51. Steadman, D., Andersen, S.A.: Personal identification: theory and applications the case study
approach, pp. 12–15 (2008)
52. Tan, M., He, G., Nie, F., Zhang, L., Hu, L.: Optimization of ultrafiltration membrane
fabrication using backpropagation neural network and genetic algorithm. J. Taiwan Inst.
Chem. Eng. (2013). doi:10.1016/j.jtice.2013.04.004
53. Thompson, T.J.U.: Recent advances in the study of burned bone and their implications for
forensic anthropology. Forensic Sci. Int. 146, S203–S205 (2004). doi:10.1016/j.forsciint.2004.
09.063
54. Ubelaker, D.H.: Chap. 1: Forensic anthropology. Humana Press Inc, Totowa (1989)
55. Uthman, A.T., Al-Rawi, N.H., Al-Naaimi, A.S., Tawfeeq, A.S., Suhail, E.H.: Evaluation of
frontal sinus and skull measurements using spiral CT scanning: an aid in unknown person
identification. Forensic Sci. Int. 197(1–3), 124 e121–127 (2010). doi:10.1016/j.forsciint.2009.
12.064
56. Zech, W.D., Hatch, G., Siegenthaler, L., Thali, M.J., Losch, S.: Sex determination from os
sacrum by postmortem CT. Forensic Sci. Int. 221(1–3), 39–43 (2012). doi:10.1016/j.forsciint.
2012.03.022
Neural Network Approach to Fault
Location for High Speed Protective
Relaying of Transmission Lines
Abstract Fault location and distance protection in transmission lines are essential
smart grid technologies that ensure the reliability of the power system and the
continuity of service. The objective of this chapter is to present an accurate
algorithm for estimating the fault location in Extra High Voltage (EHV) transmission
lines using Artificial Neural Networks (ANNs) for high-speed protection. The
development of this algorithm is based on faulted transmission line models. The
proposed fault protection (fault detection/classification and location) uses only
the three-phase current signals measured at one end of the line. The proposed
technique uses five ANNs and consists of two steps: fault detection/classification
and fault location. For fault detection/classification, one ANN is used to identify
the fault type; this procedure uses the fundamental components of pre-fault and
post-fault samples of the three-phase currents and the zero-sequence current. For
fault location, four ANNs are used to estimate the exact fault location on the
transmission line, using the pre-fault and post-fault magnitudes of the three-phase
currents. The ANNs are trained with data covering a wide variety of fault conditions
and are used for fault classification and fault location on the transmission line.
The proposed fault detection/classification and location approaches are tested under
different fault conditions, such as different fault locations, different fault
resistances, and different fault inception angles, via digital simulation in MATLAB
in order to verify the performance of the proposed methods. The ANN-based fault
classifier and locator give high accuracy in all tests under different fault
conditions. The simulation results show that the proposed ANN-based scheme can be
used for on-line fault protection in transmission lines.
1 Introduction
transmission line, the next step is to identify the fault type among the different
categories based on the phases that are faulted. The third step is then designed to
estimate the distance to the fault on the transmission line.
Accurate fault location is required by operators and utility staff to expedite
service restoration and the repair of the faulty line, thereby improving reliability
and reducing outage time, operating costs, and customer complaints. Fault location
is still the subject of rapid development: research efforts are focused on efficient
fault location algorithms intended for application to increasingly complex networks.
Fault location is a process that locates the fault on a transmission line with the
highest possible accuracy. Fault locators generally constitute supplementary
protection equipment, which applies fault location algorithms to estimate the exact
fault position, i.e., the distance to the fault. When the transmission line consists
of more than one section (a multi-terminal line), the faulted section has to be
identified first, and then the fault on this section has to be located.
Fault location algorithms can be implemented in:
• microprocessor-based relays;
• digital fault recorders (DFRs);
• stand-alone fault locators;
• post-fault analysis programs.
Transmission line fault location techniques can be classified into three main
categories: techniques based on travelling waves [5, 22, 24, 27, 54, 55, 59, 67],
techniques utilizing the higher-frequency components (harmonics) of currents and
voltages [19, 43, 65], and techniques utilizing the fundamental-frequency voltages
and currents measured at the terminals of a line [25, 31, 51]. The techniques in
these categories can be further divided into two subcategories: techniques that use
measurements from one terminal of the transmission line and techniques that use
measurements taken from both terminals. Two-terminal techniques [21, 41] are
generally more accurate than those using data from only one terminal. However, in
many transmission lines a communication channel between the line terminals is not
available, which makes it necessary to use data from one line terminal only.
Fault location algorithms using one-terminal data (voltages and currents) need to
make some simplifying assumptions for fast calculation of the exact fault location.
Nevertheless, fault detection/classification and location techniques using
one-terminal data remain attractive for researchers, and various such techniques
have been developed in the literature. Transmission line protection is based on the
estimation of the fundamental power-frequency components: Barros and Drake [17] and
Girgis and Brown [30] used the Kalman filter, the Discrete Fourier Transform [23],
Walsh functions [33], etc., to estimate the phasor quantities. Nevertheless, these
techniques cannot adapt dynamically to the system operating conditions and require
long computation times.
There is a need to develop algorithms that can adapt dynamically to the system
operating conditions, such as changes in the system configuration, the source
impedances, and the fault conditions (fault resistance, fault inception angle,
fault position).
In this context, various fault detection, classification, and location approaches
for transmission lines have been developed. These approaches are based on artificial
intelligence tools such as fuzzy logic [20, 28, 46, 50, 67], neuro-fuzzy systems
[37, 38, 54, 64], fuzzy logic-wavelet based systems [47, 69], and Artificial Neural
Networks [4, 6, 31, 35, 41, 46, 49, 62, 66].
The goal of this chapter is to develop and integrate a new and accurate ANN-based
fault detection/classification and location scheme for high-speed protective relays
in EHV transmission lines, offering improved performance over conventional methods.
Single-end fault detection/classification and location algorithms are proposed for
on-line application using artificial neural networks (ANNs) for all ten fault types
in transmission lines. Throughout the study, a 400 kV transmission line of 100 km
length has been chosen as a representative system. Pre-fault and post-fault samples
of the three-phase currents and the zero-sequence current are used to train the
ANNs in order to classify and accurately locate faults on the transmission line.
The remainder of the chapter is organized as follows: Sect. 2 reviews existing fault
detection/classification and location methods for transmission lines. The power
system used for training and testing the proposed ANN-based fault detection/
classification and location scheme is given in Sect. 3. A description of the
artificial neural networks and the learning algorithm used in this work is presented
in Sect. 4. Section 5 describes the proposed algorithms for fault detection/
classification and location using the single ANN approach and the modular ANN
approach, respectively. Test performances of the proposed fault protection scheme
are given in Sect. 6. Finally, Sect. 7 presents a comparative study between the
proposed scheme and related works.
Since the 1960s, significant research on fault location and protection of power
transmission lines has been carried out, motivated by the fact that more than half
of all faults occur on overhead lines. Rockefeller (1969) proposed a protection
scheme for the electrical network based on digital relays. In recent years, new
digital relay systems have been developed [58, 59] and field-tested by electric
companies. The main reasons favoring the fast development of digital relays are the
following:
• reduced price of digital equipment;
• high reliability, made possible by monitoring of the power grid and relay
self-diagnostics;
• better performance, owing to the practicality of implementing various relay
functions and combining them into complex operational features.
The fault location algorithms could be more accurate if more information about the
transmission line were available. Thus, if communication channels are available,
techniques that use measurements at the two line ends can be applied to locate the
exact fault position. These techniques are more precise than distance relaying
protection algorithms, which are affected by insufficient transmission line modeling
and by parameter uncertainty due to the aging of lines.
In the 1980s, techniques using synchronized measurement technology appeared as a
promising prospect for real-time protection of transmission lines. With the Global
Positioning System (GPS), digital measurements at different line terminals can be
performed synchronously [1, 8, 18, 55]. Phasor Measurement Units (PMUs) are the
most frequently used synchronized measurement devices for system protection; their
measurements are synchronized relative to a GPS clock, so fault location algorithms
using PMUs are more accurate than methods based on unsynchronized measurements
[40, 43, 45, 60, 63].
Nevertheless, synchronized measurement technology presents several drawbacks, such
as its high cost and the need for a communication channel between the line
terminals, which is not available in the majority of lines. Therefore, fault
diagnosis techniques using one-terminal data remain more attractive for researchers
(Fig. 1).
Fig. 1 Schematic diagram of two-end synchronized fault location using GPS synchronization
Fault location techniques that use measurements taken from both line terminals are
expensive, since they require synchronization and telecommunication equipment. This
approach is not generalized and is only used on high-voltage lines and
direct-current lines. The single-ended technique utilizes measurements of the
three-phase currents and voltages from one line terminal (Fig. 3). It has major
advantages: no communication means are needed, and it is simple to implement. Fault
location based on data from only one end is therefore the most commonly used.
Single-ended fault location algorithms estimate the distance to the fault using the
fundamental components of the three-phase voltages and currents acquired at a
particular end of the transmission line by the protection relay through CTs and VTs.
Different fault location techniques using one-terminal data have been developed in
the literature, based on the estimation of the fundamental power-frequency
components using the Kalman filter, the Discrete Fourier Transform, Walsh functions,
etc. Nevertheless, as noted above, these techniques cannot adapt dynamically to the
system operating conditions and require long computation times.
Today, fuzzy logic and artificial neural networks represent an area of intensive
research in different applications: system identification, control systems,
biomedical applications, signal processing, and fault diagnosis. Fuzzy logic, a
mathematical tool based on fuzzy set theory [10, 11, 14, 51], has rapidly become
one of the most successful technologies for developing sophisticated control
systems.
Recently, the combination of fuzzy logic and neural networks has been studied for
real applications [9, 12, 13, 15, 37, 38]. With this combination, neuro-fuzzy
systems exploit the advantages of both approaches. These advantages make
neuro-fuzzy systems a powerful tool that can be applied in different disciplines,
such as system identification, control systems, signal processing, load forecasting
in power systems, and protection systems.
The principal objective of this chapter is to explore the capacity of artificial
neural networks to identify fault types and to estimate fault locations in
transmission lines for high-speed protective relaying.
ANN-based protection relays have been developed as an alternative to conventional
methods, since they give very promising results with regard to precision and
operating time. Different techniques describing the application of neural networks
to fault classification and location have been published.
The application of ANNs to fault location in transmission lines was proposed by
Tahar [63]. This approach consists of two parts: in the first part the fault is
detected, and in the second part the fault position is calculated; however, the
fault type and the response time of this approach are not indicated. RBF (radial
basis function) neural-network-based fault classification and location algorithms
were proposed by Joorabian et al. [41]. The maximum error of the fault location
algorithm is 0.5 %; nevertheless, the fault detection and the response time are not
indicated. Banu and Suja [16] developed a new fault location scheme for transmission
lines using a single ANN based on the Levenberg-Marquardt optimization technique;
the fault detection and the fault type are not indicated, and the error of the
algorithm is kept below 0.65 %. Wavelet- and neural-network-based fault
classification and location were developed by Aritra et al. [7]. The reported method
classifies all ten fault types and estimates the fault location simultaneously, with
a maximum fault-location error of 3.25 %; the response time is not indicated.
Gaganpreet et al. [29] and Hassan and Zuyi [33] proposed neural network approaches
for fault detection and fault location in transmission lines; however, these
approaches detect only faults occurring in the first zone of the line, namely 80 %
of the transmission line length. A neural network approach for fault classification
is also presented in [7]; it can classify line-to-ground (L-G), line-to-line (L-L),
and three-line (L-L-L) faults for a particular fault distance, although
line-to-line-to-ground (L-L-G) faults are not considered. An alternative approach to
fault classification and location is presented by Yilmaz [66]; the error of the
algorithm is kept below 3 %. In Ref. [4], fault distance and direction estimation
based on ANNs for the protection of doubly fed transmission lines is proposed, but
the fault type is not indicated; the operating time of this approach is 1.5 cycles.
Jiang et al. [39] use a fault-location module for fault diagnosis, which
incorporates a two-stage adaptive structure neural network; the fault detection,
classification, and location algorithms are presented with an average fault location
error of 0.5 %. The results clearly show that this approach leads to a reliable
location for all fault types within 1.28 cycles after fault occurrence.
In this section, the architecture of the proposed fault location algorithm is
developed and presented.
The proposed technique consists of two modules: fault detection/classification and
fault location. This strategy can analyze faults occurring between two buses.
Specifically, the first step of the proposed scheme is to detect and classify the
fault type on the transmission line in real time. If no fault is detected, the
remaining modules are not activated. On the other hand, if the fault
detection/classification module captures the signature of a fault, it activates the
fault location module.
The inputs of the protective relay are principally the three-phase voltages and
currents at the relay location. These signals at the line end (the relay site on the
transmission line) are acquired by the relay via current transformers (CTs) and
voltage transformers (VTs).
Pre-processing the three-phase current and voltage signals measured at one end of
the transmission line can significantly reduce the size and the training time of the
neural network. The ANN-based fault protection (ANN-based fault detector/classifier
and ANN-based fault locator) uses the fundamental magnitudes of pre-fault and
post-fault samples of the three-phase currents.
Most digital protection relays use the fundamental frequency of a sampled signal;
the Fast Fourier Transform is the most common method to estimate the magnitude and
the phase of the fundamental component of each signal (current and voltage). The
schematic diagram of the proposed fault classifier and locator is depicted in
Fig. 5.
• Anti-aliasing filter: the anti-aliasing filter removes unwanted frequencies from
the sampled waveform. A simple second-order low-pass Butterworth filter with a
cut-off frequency of 400 Hz is integrated.
• Sampling rate: the three-phase current and voltage signals are sampled at 1 kHz,
a rate compatible with the sampling rates currently used by digital relays.
• Discrete Fourier Transform: a full-cycle Discrete Fourier Transform (DFT) is used
to calculate the magnitudes of the fundamental components of the three-phase
currents and voltages after fault appearance.
• Normalization (±1): the input signal samples are normalized to the input range
(±1); the whole chain is sketched below.
This work presents a new scheme for fault protection in transmission lines. The
proposed scheme employs artificial neural networks for fault detection/
classification and fault location. In this respect, the main goal of the next
section is to develop the principal functions of the ANNs used in this work.
4.1.1 Presentation
In this chapter, a multi-layer feed-forward neural network (FFNN) is used, trained
with a supervised learning algorithm called back-propagation. The multi-layer
neural network consists of three types of layers: an input layer, an output layer,
and one or more hidden layers. Each layer consists of a predefined number of
neurons. We recall that a neural network is a collection of neurons interconnected
by synaptic weights and biases. The inputs are connected to the first hidden layer;
each hidden layer is connected to the next hidden layer, and the last hidden layer
is connected to the output layer (Fig. 6).
A mathematical neuron model has a much simpler structure than a biological neuron
[37]. A neuron j can be described mathematically by the following equation:

$$a_j = \sigma\left( w_0 + \sum_{i=1}^{P} w_{ij} x_i \right) \qquad (1)$$

where:
σ is the transfer (activation) function of neuron j;
{x_i}, i = 1, …, P, are the input signals of neuron j;
{w_ij} are the weight coefficients of the connections between the inputs and
neuron j;
w_0 is the bias of neuron j.

For a neuron j in the hidden layer, Eq. (1) becomes:

$$a_j = \sigma_{hidden}\left( w_0 + \sum_{i=1}^{P} w_{ij}^{hidden} x_i \right) \qquad (2)$$

where:
$w_{ij}^{hidden}$ is the connection weight between neuron j in the hidden layer and
the i-th neuron of the input layer;
w_0 is the bias of neuron j;
σ_hidden is the activation function of the hidden layer.

The values of the vector [a] of the hidden layer are transferred to the output
layer through the connection weights between the hidden layer and the output
layer, which determines the output vector [b] = (b_1, b_2, …, b_k, …, b_R). The
output a_k of neuron k (in the output layer) is obtained from the hidden-layer
outputs a_j as follows:

$$a_k = \sigma_{out}\left( w_0 + \sum_{j=1}^{R} w_{jk}^{out} a_j \right) \qquad (3)$$

where:
$w_{jk}^{out}$ is the connection weight between neuron k in the output layer and
the j-th neuron of the hidden layer;
σ_out is the activation function of the output layer.

The error at the output layer between the output a_k and its desired value
a_{k-desired} is minimized through the mean square error at the output layer,
defined as:

$$Error = \frac{1}{2} \sum_{k=1}^{Q} \left( a_{k\text{-}desired} - a_k \right)^2 \qquad (4)$$
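To make the notation concrete, here is a minimal NumPy sketch of Eqs. (1)–(4): a
forward pass through one tanh ("tansig") hidden layer and a linear output layer,
followed by the mean square error of Eq. (4). The dimensions match the classifier
described later (16-30-4), but the random weights are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, W_hid, b_hid, W_out, b_out):
    a = np.tanh(b_hid + W_hid @ x)   # Eq. (2): hidden activations
    y = b_out + W_out @ a            # Eq. (3): linear output layer
    return y, a

def mse(desired, actual):
    return 0.5 * np.sum((desired - actual) ** 2)   # Eq. (4)

P, H, Q = 16, 30, 4                  # input, hidden, output sizes
W_hid, b_hid = 0.1 * rng.normal(size=(H, P)), np.zeros(H)
W_out, b_out = 0.1 * rng.normal(size=(Q, H)), np.zeros(Q)
y, _ = forward(rng.normal(size=P), W_hid, b_hid, W_out, b_out)
```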
The training data set of an ANN should contain the necessary information to
generalize the problem. In this work, different combinations of various fault
conditions were considered, and training patterns were generated by simulating
different fault situations on the studied power system. Fault conditions such as
fault resistance, fault location, and fault inception angle were varied to obtain
training patterns covering a wide range of power system conditions.
The design process of the ANNs used in the fault detector/classifier FClassifier
and the fault locator FLocator is detailed in the following steps:
Step 1. Prepare a database from all simulations.
Step 2. Assemble and pre-process the training data for the ANNs.
Step 3. Run the training process.
Step 4. Test the performances.
Step 5. Select the ANN giving the best performance and store the trained network.
Step 6. Application.
The outputs are termed R, S, T, and G, representing the three phases and ground.
Any one of the outputs R, S, or T approaching 1 indicates a fault on that phase,
and G approaching 1 indicates a fault involving ground (Table 2).
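A hypothetical decoding of these four outputs into a fault label might look as
follows (the 0.5 threshold is an assumption, not taken from the chapter):

```python
def decode_fault_type(outputs, threshold=0.5):
    # outputs: [R, S, T, G] activations produced by the classifier ANN
    r, s, t, g = (o > threshold for o in outputs)
    phases = "".join(p for p, on in zip("RST", (r, s, t)) if on)
    if not phases:
        return "no fault"
    return phases + ("-G" if g else "")

print(decode_fault_type([0.96, 0.02, 0.91, 0.88]))  # -> "RT-G"
```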
Once the fault is detected and classified, the relevant ANNs for fault location are
activated. The inputs of these networks are the magnitudes of the three-phase
currents, and the output is the normalized distance of the fault point from the
sending end of the transmission line.
The proposed neural locator in our study case is designed to indicate the fault
location on the transmission line. The fault locator is activated when a fault is
detected and classified by the fault detector and the fault classifier,
respectively. The exact location of such a fault is given by identifying the power
system state directly from the instantaneous current and voltage data.
The overall algorithm of the proposed ANN-based fault locator is detailed in Fig. 7.
The single-ANN approach to fault location presents several disadvantages, such as
wide training sets, long training times, and architectural complexity, which affect
the accuracy of the fault location algorithms [4, 7, 63]. Thus, it was decided to
develop a new algorithm based on a modular ANN approach, which presents several
advantages (simplicity, smaller training sets, shorter training times, and higher
accuracy) compared with the single-ANN approach. The proposed fault locator
consists of four independent ANNs, one for each fault type (ANNLG, ANNLLG, ANNLL,
and ANNLLL); each fault type is handled by one trained neural network. Finally, the
outputs of the ANNs are used to realize the fault location task, as sketched below.
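A minimal sketch of this dispatch logic, with placeholder functions standing in for
the four trained networks (all names here are hypothetical):

```python
def _stub_locator(features):
    # Placeholder for a trained network; a real one returns the distance in km
    return 0.0

LOCATORS = {"L-G": _stub_locator, "L-L-G": _stub_locator,
            "L-L": _stub_locator, "L-L-L": _stub_locator}

def locate_fault(fault_type, current_features):
    # Route the current-based features to the network trained for this fault type
    return LOCATORS[fault_type](current_features)
```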
The principal factors in determining the adequate size and architecture of an
artificial neural network are the numbers of inputs and outputs it must have;
sufficient input data to characterize the problem must be ensured. The signals
recorded at one line terminal are used for the fault location task.
Various works [4, 7, 63, 66] used the magnitudes of the fundamental components
(50 Hz) of both the three-phase currents and voltages measured at the relay
location to estimate the exact fault location, which leads to complex ANN
architectures dedicated to this task, long training times, and slow learning.
To reduce the ANN sizes used for fault location and to improve its performance,
only the magnitudes of the fundamental components of the three-phase currents IR,
IS, and IT are used here. This makes it possible to solve the aforementioned
problems with reduced ANN architectures, high accuracy, and fast training. For this
reason, in our study case, the neural fault locator processes only the magnitudes
of the three-phase currents.
The inputs of the ANNs are thus the magnitudes of the fundamental components
(50 Hz) of the three-phase currents, and the output of the modular ANN-based fault
locator is a real number indicating the fault location distance in km.
Before the current signals enter the neural network, a scaling technique is
applied, which is of great importance in reducing the computing execution time.
For this purpose, we adopt a scaling technique in which the magnitudes of the
fundamental components of the three-phase currents I_i(k) during the fault
(post-fault) are divided by the pre-fault fundamental components of the
corresponding phase I_iPF(k), with i ∈ {R, S, T}. We denote by X_Flocator the input
vector of the neural fault locator and by Y_Flocator its output:

$$X_{Flocator} = \left[ \frac{I_R(k)}{I_{RPF}(k)}, \; \frac{I_S(k)}{I_{SPF}(k)}, \; \frac{I_T(k)}{I_{TPF}(k)} \right], \qquad Y_{Flocator} = L_F \qquad (6)$$
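As a small sketch of Eq. (6) (the magnitude values below are invented for
illustration):

```python
import numpy as np

def locator_inputs(i_post, i_pre):
    # Eq. (6): post-fault fundamental magnitudes divided by the pre-fault
    # magnitudes of the corresponding phases, in the order R, S, T
    return np.asarray(i_post, dtype=float) / np.asarray(i_pre, dtype=float)

x_flocator = locator_inputs([1250.0, 310.0, 305.0], [300.0, 300.0, 300.0])
# A ratio well above 1 on phase R is typical of a fault involving that phase.
```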
It is extremely important to train the ANNs well and to test them correctly; we
used the error back-propagation training algorithm (BPNN). The ANNs undergo
training with various patterns corresponding to different fault types and different
fault conditions, such as various fault locations Lf, fault resistances Rf, and
fault inception angles FIA. During training, different structures (different
numbers of neurons in the hidden layer) with various parameters, such as the
training rate and the transfer functions, are evaluated to determine the optimal
network structure producing good training and the best results. In order to obtain
a wide training process for effective performance of the suggested fault locator,
each of the ten fault types was simulated at various locations on the considered
transmission line; fault conditions such as fault resistance and fault inception
angle were also varied to include several fault scenarios possible in real time.
Table 3 contains the parameter values used to generate the training data sets and
test patterns for the ANNs of the fault classifier and locator.
Each fault type was simulated under various fault conditions (different fault
locations Lf, fault resistances Rf, and fault inception angles FIA), as shown in
Table 1. The total number of simulated faults is 10 (fault locations) × 3 (fault
resistances) × 10 (fault types) × 3 (fault inception angles) = 900, for fault
classification as well as for fault location; the enumeration is sketched below.
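The enumeration of these 900 scenarios can be written as below; the actual
parameter grids are those of Table 3, so the values here are placeholder
assumptions:

```python
from itertools import product

locations = range(5, 100, 10)             # 10 fault locations (km) - assumed grid
resistances = [0, 50, 100]                # 3 fault resistances (ohm) - assumed grid
angles = [0, 90, 270]                     # 3 inception angles (deg) - assumed grid
fault_types = ["R-G", "S-G", "T-G",       # the ten standard shunt fault types
               "R-S-G", "S-T-G", "R-T-G",
               "R-S", "S-T", "R-T", "R-S-T"]

scenarios = list(product(locations, resistances, fault_types, angles))
print(len(scenarios))  # 10 * 3 * 10 * 3 = 900 simulated faults
```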
The structure of the fault classifier and locator corresponds to the number of
layers and the numbers of neurons in the input, hidden, and output layers. After a
series of tests and modifications of the ANN architecture, the best performance was
obtained using a three-layer network (Fig. 3). The number of neurons in the input
layer corresponds to the number of ANN input variables; the number of neurons in
the hidden layer was determined after a series of tests.
The ANN fault classifier consists of 16 input neurons (four samples of each signal:
IR, IS, IT, I0), 30 neurons in the hidden layer (selected after a series of
trials), and four output neurons dedicated to indicating the fault type on the
transmission line. Consequently, the ANN structure of the adopted fault classifier
is 16-30-4.
The architectures of the ANN-based fault distance locators are shown in Table 4.
The number of epochs required for training varies from 188 to 300 to reduce the
mean square error below 3.88e−5 (Figs. 8 and 9).
The final determination of the neural network requires choosing the appropriate
transfer functions in the hidden and output layers. After analyzing the various
possible combinations of the commonly used transfer functions ("logsig", "tansig",
and "purelin"), the hyperbolic tangent sigmoid function "tansig" was adopted in the
hidden layer and the purely linear transfer function "purelin" in the output layer;
a single training update for this structure is sketched below.
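For illustration, one plain gradient-descent back-propagation step for this
tansig/purelin structure is given below, minimizing the error of Eq. (4). The
chapter does not state the exact training variant or learning rate, so both are
assumptions here:

```python
import numpy as np

def train_step(x, target, W_hid, b_hid, W_out, b_out, lr=0.01):
    a = np.tanh(b_hid + W_hid @ x)          # tansig hidden layer
    y = b_out + W_out @ a                   # purelin output layer
    d_out = y - target                      # dE/dy for the E of Eq. (4)
    d_hid = (W_out.T @ d_out) * (1 - a**2)  # back-propagate through tanh
    W_out -= lr * np.outer(d_out, a); b_out -= lr * d_out
    W_hid -= lr * np.outer(d_hid, x); b_hid -= lr * d_hid
    return 0.5 * np.sum((target - y) ** 2)  # current pattern error
```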
6 Performance Results
The effectiveness of the new fault detection/classification and location scheme was
tested under various fault conditions: different fault locations Lf, different
fault resistances Rf, and different fault inception angles FIA for each fault type.
The training and testing data were generated from the single-line diagram of a
100 km, 400 kV transmission line shown in Fig. 4. This system was simulated in
MATLAB, yielding the three-phase currents and the zero-sequence current for the
pre-fault and post-fault periods. These data were used for training and testing the
neural detector/classifier and the neural locator with the MATLAB Neural Network
Toolbox.
Once the ANN training procedure is entirely carried out, all networks of the fault
locator are tested under fault scenarios that were not presented during the
training process. All fault types, with different fault resistances Rf, inception
angles FIA, and fault locations Lf on the transmission line, are simulated in order
to evaluate the performance of the proposed fault location scheme.
The criterion for evaluating the performance of the proposed neural fault locator
is the percentage location error, computed with respect to the total line length:

$$Error(\%) = \frac{\left| L_{estimated} - L_{actual} \right|}{L_{total}} \times 100 \qquad (7)$$
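In code, this criterion is a one-liner; the check below reproduces the error
reported later for the R-G test case at 86 km on the 100 km line:

```python
def location_error_pct(estimated_km, actual_km, line_length_km=100.0):
    # Percentage error of Eq. (7), relative to the total line length
    return abs(estimated_km - actual_km) / line_length_km * 100.0

print(round(location_error_pct(85.8962, 86.0), 4))  # 0.1038
```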
Some test results for single-phase-to-ground and double-phase-to-ground faults
under different fault conditions are presented in Tables 6 and 7.
Table 6 Fault conditions and percentage errors for L-G and L-L-G faults

| Fault location (km) | Fault inception angle (°) | Fault resistance (Ω) | L-G: ANN locator output (km) | L-G: error (%) | L-L-G: ANN locator output (km) | L-L-G: error (%) |
|----|-----|-----|---------|--------|---------|--------|
| 08 | 325 | 26  | 08.0137 | 0.0137 | 08.0911 | 0.0911 |
| 12 | 125 | 130 | 12.0522 | 0.0522 | 12.1091 | 0.1091 |
| 24 | 145 | 2.6 | 23.8829 | 0.1171 | 23.8009 | 0.1991 |
| 32 | 275 | 100 | 32.2281 | 0.2281 | 32.1990 | 0.1990 |
| 48 | 05  | 65  | 48.1094 | 0.1094 | 48.2119 | 0.2119 |
| 53 | 25  | 95  | 53.0998 | 0.0998 | 52.6918 | 0.3082 |
| 62 | 215 | 22  | 62.3787 | 0.2787 | 62.2790 | 0.2210 |
| 77 | 10  | 13  | 77.1210 | 0.1210 | 77.3117 | 0.3117 |
| 88 | 40  | 44  | 87.8917 | 0.1083 | 88.3071 | 0.3071 |
| 95 | 135 | 35  | 95.3321 | 0.3321 | 95.4566 | 0.4566 |
Table 7 Fault conditions and percentage errors for L-L and L-L-L faults

| Fault location (km) | Fault inception angle (°) | Fault resistance (Ω) | L-L: ANN locator output (km) | L-L: error (%) | L-L-L: ANN locator output (km) | L-L-L: error (%) |
|----|-----|-----|---------|--------|---------|--------|
| 08 | 325 | 26  | 07.9751 | 0.0249 | 08.0983 | 0.0983 |
| 12 | 125 | 130 | 12.0908 | 0.0908 | 11.8299 | 0.1701 |
| 24 | 145 | 2.6 | 24.2079 | 0.2079 | 24.2229 | 0.2229 |
| 32 | 275 | 100 | 32.3009 | 0.3009 | 32.2007 | 0.2007 |
| 48 | 05  | 65  | 48.3291 | 0.3291 | 48.1399 | 0.1399 |
| 53 | 25  | 95  | 53.1812 | 0.1812 | 53.1678 | 0.1678 |
| 62 | 215 | 22  | 62.4097 | 0.4903 | 61.6911 | 0.3089 |
| 77 | 10  | 13  | 77.3088 | 0.3088 | 77.4018 | 0.4018 |
| 88 | 40  | 44  | 87.7191 | 0.2809 | 88.5091 | 0.5091 |
| 95 | 135 | 35  | 95.2791 | 0.2971 | 94.7117 | 0.2883 |
The simulation conditions and the percentage errors for double-phase and
three-phase faults are presented in Tables 6, 7, and 8.
The percentage errors for the 10 test patterns for each fault type
(phase-to-ground L-G, double-phase-to-ground L-L-G, double-phase L-L, and
three-phase L-L-L) are shown in Figs. 10, 11, 12 and 13.
The minimum, maximum, and average error percentages of the proposed fault locator
are given in Table 8. The average error of the algorithm was 0.1460 % for the
single-phase-to-ground fault, 0.2512 % for the two-phase fault, 0.2415 % for the
two-phase-to-ground fault, and 0.2512 % for the three-phase fault.
Fig. 10 Estimated fault location and percentage error during testing of L-G faults
Fig. 11 Estimated fault location and percentage error during testing of L-L-G faults
Fig. 12 Estimated fault location and percentage error during testing of L-L faults
Fig. 13 Estimated fault location and percentage error during testing of L-L-L faults
The simulation results in these figures and tables prove the capacity of the fault
locator to produce a correct answer in all test simulations. Moreover, the
stability of the ANN outputs under normal steady state and under fault situations,
and the fast convergence of the output variables to the desired values in the
presence of a fault, confirm that the proposed fault locator algorithm is
effective.
The response time Tr of the fault locator is the difference between the time Te at
which the location estimate is produced and the fault inception time Tf:

$$T_r = T_e - T_f \qquad (8)$$
• Generalization capabilities.
The best ANN-based fault locator is selected when the response time is minimal. The
only means of validating the performance of the neural network is extensive
testing: after the training process, the ANN-based fault locator is extensively
tested using fault scenarios never used during training.
In our study case, the ANN-based fault locator is trained to output 110 km for a
no-fault situation or for a fault outside the protected segment of the line. For
faults occurring on the line segment, the ANN is trained to output the estimated
fault location.
In order to evaluate the response time of the proposed neural fault locator, we
simulated various fault scenarios with different fault conditions.
First scenario: single-phase-to-ground fault under severe fault conditions. To
study the effect of severe fault conditions (fault resistance Rf, fault inception
angle FIA, and fault location Lf), a single-phase-to-ground fault (R-G) was
simulated with Rf = 92 Ω, Lf = 86 km, and FIA = 135°, corresponding to a fault
appearance time of 71 s (Fig. 14).
The response of the ANN-based fault locator is depicted in Fig. 15. It can be seen
that the ANN output is 85.8962 km, which implies an error of 0.1038 %. The fault
occurs at time Tf = 71 s and is located by the adopted neural locator at time
Te = 71.02 s, which gives a response time Tr of 20 ms.
On the other hand, we studied the case where a fault occurs near the source end
(side S), where the relays are installed. A three-phase fault ("R-S-T") was
simulated at Lf = 11 km from the source (S) end, with Rf = 0 Ω, occurring at time
Tf = 71 s.
Test results of the proposed ANN-based fault locator under these conditions are
shown in Fig. 16. It can be seen that one cycle after the inception of the fault
(71 s), that is, at 71.022 s, the estimated location is Lf = 11.0967 km against the
actual fault distance of 11 km, which implies a fast response time of about
Tr = 22 ms and an error of 0.0967 %.
The ANN output is almost constant around the real fault location. It is thus clear
that the proposed ANN-based fault locator can precisely estimate the exact fault
location in an Extra High Voltage (EHV) transmission line. Further, the operating
time of the proposed algorithm is about one cycle from the inception of the fault
(Fig. 17).
The salient features of some existing artificial-neural-network-based fault
detection/classification and location schemes and those of the proposed algorithms
are presented in this section. The proposed fault protection scheme (fault
detection/classification and fault location in Extra High Voltage (EHV)
transmission lines) is evaluated and compared with some former works. The adopted
fault protection scheme has several advantages:
• Only current inputs are required for fault detection/classification and location.
• A wider range of fault conditions (fault resistance Rf, fault location Lf, and
fault inception angle FIA) is covered.
• The response is faster than that of existing schemes.
Table 9 compares the proposed fault location algorithm with some recently published
methods based on other tools, such as the hybrid Wavelet-Prony method and the
Wavelet Transform. The use of only single-ended current measurements is one of the
salient advantages of the proposed method. The proposed algorithm exhibits better
performance than the algorithms presented by Mohammad and Javad [47] and Majid
et al. [48], which use single-ended current as well as voltage measurements.
Indeed, the fault location algorithm proposed in this chapter leads to more
accurate fault locating.
The proposed neural fault detector/classifier and locator are also compared with
some published algorithms that use ANNs to identify faults and estimate their
location in transmission lines (Table 10). In this context, the proposed algorithm
and the algorithms proposed by Tahar [63], Jiang et al. [39], Anamika and Thoke
[4], and Yilmaz [66] are similar in the sense that all are neural-network-based
schemes requiring the measurement signals at one terminal of the transmission
line, at the relay location. The proposed algorithm, however, is applicable over a
wider variation of fault conditions: whereas the other methods are valid for fault
resistance variations Rf up to 50 Ω, fault inception angles FIA up to 270°, and
fault locations up to 90 % of the line length, the proposed algorithm is valid for
fault resistance variations Rf up to 100 Ω, fault inception angles FIA up to 360°,
and fault location variations Lf up to 95 % of the line length.
Further, the proposed fault location scheme is simple compared with other schemes,
because it requires only the computation of ratios of pre-fault and post-fault
current samples to identify the fault type, together with a filtering processor
(one-cycle DFT) for extracting the magnitudes of the fundamental frequency
components (50 Hz) needed to estimate the fault location.
We also compare the proposed algorithm with the other works with respect to the
response time, as shown in Fig. 18. The response time of the proposed
detection/classification and location scheme is about one cycle from the inception
of the fault, which is comparable to a conventional distance relay.
8 Conclusion
An accurate fault location scheme based on artificial neural networks has been
presented for fast protection of Extra High Voltage (EHV) transmission lines. The
algorithm consists of two stages: fault detection/classification and fault
location. For fault detection/classification, the single-ANN approach is used; it
relies on pre-fault and post-fault samples of the three-phase currents and the
zero-sequence current. For fault location, the modular ANN approach is used; it
relies on the magnitudes of the fundamental components of the three-phase currents.
Simulation studies carried out under a wide variety of fault conditions (fault
locations, fault resistances, and fault inception angles) confirm the accuracy of
the proposed scheme and its suitability for on-line fault protection of
transmission lines.
References
17. Barros, J., Drake, J.M.: Real time fault detection and classification in power systems using
microprocessors. IEE Proc. Gener. Transm. Distrib. 141(3), 315–322 (1994)
18. Bo, Z.Q., Weller, G., Lomas, T., Redfern, M.A.: Positional protection of transmission systems
using global positioning system. IEEE Trans. Power Delivery 15(4), 1163–1167 (2000)
19. Borghetti, A., Bosetti, M., Silvestro, D.M., Nucci, C.A., Paolone, M.: Continuous-wavelet
transform for fault location in distribution power networks: definition of mother wavelets
inferred from fault originated transients. IEEE Trans. Power Syst. 23(2), 380–388 (2008)
20. Carlo, C., Kaveh, R.: Fuzzy-logic-based high accurate fault classification of single and double-
circuit power transmission lines. In: The 2012 IEEE Symposium on Power Electronics,
Electrical Drives, Automation and Motion (SPEEDAM), 20–22 June 2011, pp. 883–889.
Sorrento (2012). doi:10.1109/SPEEDAM.2012.6264636
21. Chun, W., Qing, Q.J., Xin, B.L., Chun, X.D.: Fault location using synchronized sequence
measurements. Electr. Power Energy Syst. 30(2), 134–139 (2008)
22. Desikachar, K.V., Singh, L.P.: Digital travelling–wave protection of transmission lines. Electr.
Power Syst. Res. 7(1), 19–28 (1984)
23. D’Amore, D., Ferrero, A.: A simplified algorithm for digital distance protection based on
fourier techniques. IEEE Trans. Power Delivery 4(1), 157–164 (1989)
24. Dong, X., Kong, W., Cui, T.: Fault classification and faulted-phase selection based on the
initial current travelling wave. IEEE Trans. Power Delivery 24(2), 552–559 (2009)
25. Eriksson, L., Saha, M.M., Rockefeller, G.D.: An accurate fault locator with compensation for
apparent reactance in the fault resistance resulting from remote-end infeed. IEEE Trans. Power
Apparatus Syst. 104(2), 424–435 (1985)
26. Ekici, J.S.: Support vector machines for classification and locating faults on transmission lines.
Appl. Softw. Comput. 12(6), 1650–1658 (2012)
27. Ernesto, V.M.: A travelling wave distance protection using principle component analysis. Int.
J. Electr. Power Energy Syst. 25(6), 471–479 (2003)
28. Ferrero, S., Sangiovanni, Zapitteli, E.: Fuzzy-set approach to type-fault identification in digital
relaying. IEEE Trans. Power Delivery 10(1), 169–175 (1995)
29. Gaganpreet, C.M., Sachdev, S., Ramakrishna, G.: Artificial neural network applications for
power system protection. In: The 2005 IEEE Canadian Conference on Electrical and
Computer Engineering (CCECE), 1–4 May 2005, pp. 1954-1957. Saskatoon (2005). doi:10.
1109/CCECE.2005.1557365
30. Girgis, A.A., Brown, R.G.: Adaptive Kalman filtering in computer relaying: fault classification
using voltage models. IEEE Power Eng. Rev. 5(5), 44–45 (1985)
31. Gracia, J., Mazón, A.J., Zamora, I.: Best ANN structures for fault location in single and
double-circuit transmission lines. IEEE Trans. Power Delivery 20(4), 2389–2395 (2005)
32. Gohokar, V.N., Khedkar, M.K.: Faults locations in automated distribution system. Electr.
Power Syst. Res. 75(1), 51–55 (2005)
33. Hassan, K.Z., Zuyi, L.: An ANN based approach to improve the distance relaying algorithm.
Turkish J. Electr. Eng. Comput. Sci. 14(2), 345–354 (2006)
34. Héctor, J.A.F., Ismael, D.V., Ernesto, V.M.: Fourier and Walsh digital filtering algorithms for
digital distance protection. IEEE Trans. Power Syst. 11(1), 457–462 (1996)
35. Huseyin, E.: Fault diagnosis system for series compensated transmission line based on wavelet
transform and adaptive neuro-fuzzy inference system. Measurement 46(1), 393–401 (2013)
36. Izykowski, J., Rosolowski, E., Balcerek, P., Fulczyk, M., Saha, M.: Fault location on double-
circuit series-compensated lines using two-end unsynchronized measurements. IEEE Trans.
Power Delivery 26(4), 2072–2080 (2011)
37. Jang, J.S.R.: ANFIS: adaptive- network-based fuzzy inference system. IEEE Trans. Syst. Man
Cybern. 23(3), 665–684 (1993)
38. Javad, S., Hamid, A.: A new and accurate fault location algorithm for combined transmission
lines using adaptive network-based fuzzy inference system. Electr. Power Syst. Res. 79(11),
1538–1545 (2009)
39. Jiang, J.A., Chuang, C.L., Wang, Y.C., Hung, C.H., Wang, J.Y., Lee, C.H., et al.: A hybrid
framework for fault detection, classification, and location—Part I: concept, structure, and
methodology. IEEE Trans. Power Delivery 26(3), 1988–1998 (2011)
40. Jiang, J.A., Yang, J.Z., Lin, Y.H., Liu, C.W., Ma, J.C.: An adaptive PMU based fault
detection/location technique for transmission lines Part-I; theory and algorithms. IEEE Trans.
Power Delivery 15(2), 486–493 (2000)
41. Joorabian, S.M.A., Taleghani, A.S.L., Aggarwal, R.K.: Accurate fault locator for EHV
transmission lines based on radial basis function neural network. Electr. Power Syst. Res. 71
(3), 195–202 (2004)
42. Kola, V.B., Manoj, T., Asheesh, K.S.: Recent techniques used in transmission line protection:
a review. Int. J. Eng. Sci. Technol. 3(3), 1–8 (2011)
43. Lin, Y.H., Liu, C.W., Yu, C.S.: A new fault locator for three-terminal transmission lines using
two-terminal synchronized voltage and current phasors. IEEE Trans. Power Delivery 17(2),
452–459 (2002)
44. Magnago, F.H., Abur, A.: Fault location using wavelets. IEEE Trans. Power Delivery 13(4),
1475–1480 (1998)
45. Mahamedi, B., Zhu, J.G.: Unsynchronized fault location based on the negative-sequence
voltage magnitude for double-circuit transmission lines. IEEE Trans. Power Delivery 99, 1
(2014)
46. Mahanty, R.N., Gupta, P.B.D.: A fuzzy logic based fault classification approach using
current samples only. Electr. Power Syst. Res. 77(5–6), 501–507 (2007)
47. Mohammad, F., Javad, S.: Transmission line fault location using hybrid wavelet-Prony
method and relief algorithm. Int. J. Electr. Power Energy Syst. 61, 127–136 (2014)
48. Majid, J., Abul, K., Ansari, A.Q., Rizwan, M.: Generalized neural network and wavelet
transform based approach for fault location estimation of a transmission line. Appl. Softw.
Comput. 19, 322–332 (2014)
49. Moez, B.H., Houda, J., Souad, C.: Fault detection and classification approaches in
transmission lines using artificial neural networks. In: The 2014 IEEE Mediterranean
Electrotechnical Conference (MELECON), 13–16 April 2014, pp. 520-524. Beirut (2014) (In
press)
50. Moez, B.H., Houda, J., Souad, C.: A new and accurate fault classification algorithm for
transmission lines using fuzzy logic system. Wulfenia J. 20(3), 336–349 (2013)
51. Moez, B.H., Houda, J., Souad, C., Sahbi, M.: Voltage and frequency stabilization of electrical
networks by using load shedding strategy based on fuzzy logic controllers. Int. Rev. Electr.
Eng. 7(5), 5694–5704 (2012)
52. Moez, B.H., Sahbi, M., Souad, C., Houda, J., Rabeh, A.: Preventive and curative strategies
based on fuzzy logic for voltage stabilization of an electrical network. Int. Rev. Model. Simul.
4(6), 3201–3207 (2011)
53. Mora, F.J., Melendez, J., Carrillo, C.G.: Comparison of impedance based fault location
methods for power distribution systems. Electr. Power Syst. Res. 78(4), 657–666 (2008)
54. Nan, Z., Kezunovic, M.: Coordinating fuzzy ART neural networks to improve transmission
line fault detection and classification. In: The 2005 IEEE Power Engineering Society General
Meeting Conference (PES), 12–16 June 2005, pp. 734–740 (2005). doi:10.1109/PES.2005.
1489373
55. Pei, Y.L., Tzu, C.L., Chih, W.L.: An intranet-based transmission grid fault location platform
using synchronized IED data for the Taiwan power system. In: The 2013 IEEE Innovative
Smart Grid Technologies (ISGT), 24–27 Feb 2013, pp. 1–6. Washington (2013). doi:10.1109/
ISGT.2013.6497796
56. Rockefeller, G.D.: High speed distance relaying using a digital computer, II. Test results. IEEE
Trans. Power Appar. Syst. 91(3), 1244–1258 (1972)
57. Shehab-Eldin, E.H., Mclaren, P.G.: Travelling wave distance protection-problem areas and
solutions. IEEE Trans. Power Delivery 3(3), 894–902 (1988)
58. Man, B.J., Morrison, I.F.: Digital calculation of impedance for transmission line protection.
IEEE Trans. Power Appar. Syst. 90(1), 270–279 (1971)
59. Man, B.J., Morrison, I.F.: Relaying a three phase transmission line with a digital computer.
IEEE Trans. Power Appar. Syst. 90(2), 742–750 (1971)
60. Soon, R.N., Sang, H.K., Seon, J.A., Joon, H.C.: Single line-to-ground fault location based on
unsynchronized phasors in automated ungrounded distribution systems. Electr. Power Sys.
Res. 86, 151–157 (2012)
61. Spoor, D., Zhu, J.G.: Improved single-ended traveling-wave fault-location algorithm based on
experience with conventional substation transducers. IEEE Trans. Power Delivery 21(3),
1714–1720 (2006)
62. Tabatabaei, A., Mosavi, M.R., Farajiparvar, P.A.: A travelling wave fault location technique
for three-terminal lines based on wavelet analysis and recurrent neural network using GPS
timing. In: The 2013 IEEE Smart Grid Conference (SGC), 17–18 Dec 2013, pp. 268–272.
Tehran (2013). doi:10.1109/SGC.2013.6733830
63. Tahar, B.: Fault location in EHV transmission lines using artificial neural networks. Int.
J. Appl. Math. Comput. Sci. 14(1), 69–78 (2004)
64. Vasilic, S., Kezunovic, M.: Fuzzy ART neural network algorithm for classifying the power
system faults. IEEE Trans. Power Delivery 20(2), 1306–1314 (2005)
65. Xu, Z.Y., Jiao, S.H., Ran, L., Du, Z.Q.: An online fault-locating scheme for EHV/UHV
transmission lines. IET Gener. Transm. Distrib. 2(6), 789–799 (2008)
66. Yilmaz, A.: An alternative approach to fault location on power distribution feeders with
embedded remote-end power generation using artificial neural networks. Electr. Eng. 94(3),
125–134 (2012)
67. Youssef, O.A.S.: A novel fuzzy logic based phase selection technique for power system
relaying. Electr. Power Syst. Res. 68(3), 175–184 (2004)
68. Zhao, W., Song, Y.H., Chen, W.R.: Improved GPS traveling wave fault locator for power
cables by using wavelet analysis. Int. J. Electr. Power Energy Syst. 23(5), 403–411 (2001)
69. Zhengyou, H., Ling, F., Sheng, L., Zhiqian, B.: Fault detection and classification in EHV
transmission line based on wavelet singular entropy. IEEE Trans. Power Delivery 25(4),
2156–2163 (2010)
70. Zhu, Y.: Fault location scheme for a multi-terminal transmission line based on current
traveling wave. Int. J. Electr. Power Energy Syst. 53, 367–374 (2013)
A New Approach for Flexible Queries
Using Fuzzy Ontologies
A. Aloui (&)
Ecole Nationale d’Ingenieurs de Tunis, LR-SITI, Tunis, Tunisia
e-mail: [email protected]
A. Grissa
Ecole Nationale d’Ingenieurs de Tunis, LIPAH, FST, Tunis, Tunisia
e-mail: [email protected]
1 Introduction
The diversity of Database (DB) applications has shown the limits of Relational
Database Management Systems (RDBMS), in particular in the querying field [11]. The
traditional querying of a Relational DB (RDB) is qualified as "Boolean querying":
with SQL, for example, a query returns a result or nothing at all [47]. This
querying poses a problem for certain applications. First of all, the user must know
all the details of the schema and the data in the database to express his
preferences, or he has to use imprecise linguistic terms such as "moderate" or
"means" to better characterize the sought-after data.
The aim of flexible database querying is to extend this binary behaviour by
introducing preferences into the query criteria [40]. Thus, an element returned by
a query will be "more or less" relevant according to the user's preferences.
Generally, the proposed approaches treat flexible querying in the case of RDBs but
not in the case of large DBs. This work focuses on flexible querying in large DBs.
For this purpose, we suggest the use of ontologies to improve the performance of
information retrieval.
In fact, recent research has shown that adopting a formal ontology to describe heterogeneous data sources has many benefits. It provides not only a uniform and flexible approach to integrate and describe such sources, but it can also support the final user in querying them, improving the usability of the integrated system. Unfortunately, many deficiencies still exist in ontologies. On the one hand, it is difficult to determine the granularity of an ontology. On the other hand, the depth of concept expression of an ontology is still not sufficient [6]. Thus fuzzy ontology is introduced to solve the above problems. The application of formal concept analysis and concept lattice theory in ontology building and mapping not only makes the building automatic, but also makes the newly generated ontology more formalized. Combining formal concept analysis with ontology should therefore be a better way to express and process the knowledge. The proposed method supports the user's task of formulating a request in a specific domain. In fact, the ontology defines a vocabulary which is often richer than the logical schema of the underlying data and usually closer to the user's own vocabulary. The ontology can be effectively exploited by the user in order to formulate a query that best captures his information need. Consequently, the user is constantly guided and assisted in this task, because the intelligence is dynamically driven by reasoning over the ontology.
This new approach helps the user in choosing what is most appropriate for his information need, restricting the possible choices to those which are more relevant and meaningful in a given context by considering only some parts of the ontology. For those reasons, the user is free to explore the ontology without the worry of making a wrong choice at some point, and can thus concentrate on expressing his need. Besides, queries can be specified through a refinement process consisting in the iteration of a few basic operations. The user first specifies an initial request. Then, before constructing the ontology, we use fuzzy logic techniques [3] and Formal Concept Analysis [52] to classify the data, which refines away or deletes some of the unused information; thus the number of concepts constructing the ontology is always smaller than the number of objects to which the classification algorithm [29] is applied, since the application of FCA considerably reduces the complexity. The refinement iterates until the resulting query satisfies the need of the user, changing the level of granularity in the process of evaluating the ontology and applying the clustering operation. So, the interrogation will necessarily focus on clusters.
Related work and the proposed approach are given in Sects. 3 and 4. Section 6 details the step of extracting a flexible query from the resulting ontology. Section 7 evaluates the proposed approach. Section 8 summarizes the chapter, enumerates the advantages and concludes with an outlook on future work.
2 Basic Concepts
In this section, we present the basic concepts of flexible querying, ontologies and
Formal Concept Analysis (FCA).
2.2 Ontologies
Ontologies are content theories about the classes of individuals, properties of indi-
viduals, and relations between individuals that are possible in a specified field of
knowledge [18]. They define the terms for describing our knowledge about the
domain. An ontology of a domain is beneficial in establishing a common (controlled)
vocabulary when describing a domain of interest. This is important for unification and
sharing of knowledge about a domain and its connection with other domains.
In reality, there is no common formal definition of what an ontology is. All the same, most approaches share a few core items, such as concepts, a hierarchical IS-A relation, and further relations. For the sake of generality, we do not discuss more specific features like constraints, functions, or axioms in this chapter; instead we formalize the core in the following way:
Definition 1 A (core) ontology is a tuple O = (C, is_a, R, σ) where
• C is a set whose elements are called concepts
• is_a is a partial order on C (i.e., a binary relation is_a ⊆ C × C which is reflexive, transitive, and antisymmetric)
• R is a set whose elements are called relation names (or relations for short)
• σ: R → C+ is a function which assigns to each relation name its arity
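Purely as an illustration, Definition 1 can be transcribed into a minimal data structure; the following Python sketch and its toy instance are ours, not the chapter's:

from dataclasses import dataclass

@dataclass
class CoreOntology:
    """Core ontology O = (C, is_a, R, sigma) from Definition 1."""
    concepts: set    # C: set of concepts
    is_a: set        # partial order on C, as a set of (sub, super) pairs
    relations: set   # R: relation names
    sigma: dict      # sigma: R -> tuple of concepts (the relation's signature)

# Hypothetical toy instance
o = CoreOntology(
    concepts={"Employee", "Person", "Salary"},
    is_a={("Employee", "Person")},            # Employee is_a Person
    relations={"earns"},
    sigma={"earns": ("Employee", "Salary")},  # binary relation signature
)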
In recent years, several languages have been developed to describe ontologies. As examples, we can cite the Resource Description Framework (RDF) [17, 34], the Ontology Web Language (OWL) [7] and extensions of the OWL language like OWL 2
[37] or Fuzzy OWL [9]. The Web Ontology Language (OWL) is a family of
knowledge representation languages for authoring ontologies and Description
Logics (DL) are a family of knowledge representation languages which can be used
to represent the terminological knowledge of an application domain in a structured
and formally well-understood way. Today, description logic has become a cornerstone of the Semantic Web through its use in the design of ontologies. However, OWL DL becomes less suitable in domains in which the concepts to be represented do not have precise definitions. In our case, this scenario is, unfortunately, the rule rather than the exception. To handle
this problem, the use of fuzzy ontology offers a solution. Classical ontology lan-
guages are not appropriate to deal with imprecision or vagueness in knowledge.
Therefore, DL for the semantic web can be enhanced by various approaches to
handle probabilistic or possibilistic uncertainty and vagueness. Although fuzzy
logic was introduced already in the 1960’s [54], the research on fuzzy ontologies
was almost non-existent before 2000, so we can claim that this is a fairly new
research field with a great potential. This is even more surprising considering that
Pena (1984) reasoned already in the 1980’s why the use of fuzzy logic as the basis
for ontology building would be beneficial and solve many problems pertaining to
classical ontologies. He proposes “to reject the maximality rule, according to which
only altogether true sentences are true, and embracing instead the rule of
endorsement, which means that whatever is more or less true is true”. Among the
advantages of fuzzy ontology he mentions:
• Positing fuzzy predicates usually simplifies our theories in most scientific fields.
• Fuzzy predicates are much more plausible, and give us a much more attractive
and cohesive worldview, than their crisp counterpart.
• Degree-talk and comparative constructions.
Also, the number of environments and tools for building ontologies has grown exponentially. These tools provide support for the ontology development process and for the subsequent ontology usage. Among these tools, we can mention the most relevant: Ontolingua [28], WebOnto [25], WebODE [1], Protégé-2000 [39], OntoEdit [48] and OilEd [7].
In this work, we propose to use the Fuzzy OWL 2 language itself to automatically generate scripts from fuzzy ontologies. More precisely, we use Protégé 4.2 as an OWL 2 editor for fuzzy ontology representation.
Conceptual scaling theory is the key part of Formal Concept Analysis (FCA). It allows one to introduce the given data and to embed much more general scales than the usual chains and direct products of chains; the given data can be embedded in the direct products of the concept lattices of these scales. FCA starts with the notion of a formal context specifying which objects have which attributes; thus a formal context may be viewed as a binary relation between the object set and the attribute set with the values 0 and 1.
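As an illustration of this view, a small Python sketch of a toy formal context and the two FCA derivation operators (the data is invented):

context = {
    "o1": {"young", "low_salary"},
    "o2": {"young", "medium_salary"},
    "o3": {"adult", "high_salary"},
}

def common_attributes(objects):
    # A': attributes shared by all the given objects
    objects = list(objects)
    if not objects:
        return set.union(*context.values())  # by convention, all attributes
    attrs = set(context[objects[0]])
    for o in objects[1:]:
        attrs &= context[o]
    return attrs

def common_objects(attrs):
    # B': objects possessing all the given attributes
    return {o for o, a in context.items() if set(attrs) <= a}

# A formal concept is a pair (A, B) with A' = B and B' = A
A = common_objects({"young"})
print(A, common_attributes(A))   # {'o1', 'o2'} {'young'}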
In Tran et al. [50], an ordered lattice extension theory has been proposed: Fuzzy Formal Concept Analysis (FFCA), in which uncertainty information is directly represented by membership values in the range [0,1]. This number is equal to the similarity, which is defined as follows:
Definition 2 The similarity of a fuzzy formal concept C₁ = (φ(A₁), B₁) and its subconcept C₂ = (φ(A₂), B₂) is defined as:

S(C₁, C₂) = |φ(A₁) ∩ φ(A₂)| / |φ(A₁) ∪ φ(A₂)|

where ∩ and ∪ refer to the intersection and union operators on fuzzy sets, respectively.
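A minimal Python sketch of Definition 2, assuming fuzzy sets represented as dictionaries of membership degrees, min/max as the fuzzy intersection/union, and sigma-count cardinality (these representation choices are assumptions, not fixed by the chapter):

def fuzzy_cardinality(fs):
    """Sigma-count cardinality of a fuzzy set given as {element: degree}."""
    return sum(fs.values())

def similarity(phi_a1, phi_a2):
    """S(C1, C2) = |phi(A1) ∩ phi(A2)| / |phi(A1) ∪ phi(A2)| (Definition 2)."""
    elems = set(phi_a1) | set(phi_a2)
    inter = {e: min(phi_a1.get(e, 0.0), phi_a2.get(e, 0.0)) for e in elems}
    union = {e: max(phi_a1.get(e, 0.0), phi_a2.get(e, 0.0)) for e in elems}
    return fuzzy_cardinality(inter) / fuzzy_cardinality(union)

print(similarity({"o1": 0.8, "o2": 0.4}, {"o1": 0.5}))   # 0.5 / 1.2 ≈ 0.42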
In Sassi et al. [45], we showed that FFCA is also very powerful for interpreting the results of fuzzy clustering and for optimizing flexible queries.
3 Related Work
Many researchers in the field of data mining have tried to find an efficient way to respond to the user query. We study in this section the most important approaches that generate information from data.
• The rules generated from these data are generally redundant.
• These algorithms generate a very large number of rules, often thousands, which the human brain cannot assimilate.
• Generally, the goal of extracting a set of rules is to help the user to give semantics to the data and to optimize information retrieval. This fundamental constraint is not taken into account by these approaches.
To resolve these problems, we propose:
• A new approach for ontology generation using conceptual clustering, fuzzy logic, and FFCA. We propose to define rules (meta-rules) between classes resulting from a preliminary classification of the data. Indeed, by classifying the data we construct homogeneous groups of data having the same properties, so defining rules between clusters implies that all the data elements belonging to those clusters will necessarily depend on these same rules. Thus, the number of generated rules is smaller, since the knowledge is extracted from the clusters, whose number is relatively low compared to the initial data elements.
• A new algorithm to support database flexible querying using the knowledge generated in the first step. This approach allows the end-user to easily exploit all the generated knowledge.
In this section, we present the architecture of the Fuzzy Ontology of Data Mining
(FODM) approach and the process of fuzzy ontology construction.
Our FODM approach takes the database records and provides the corresponding
fuzzy ontology. Figure 1 shows our proposed FODM approach. We suggest defining the ontology between the classes resulting from a preliminary classification of the data. The FODM approach is organized in the two following main steps: the Data Organization step and the Fuzzy Ontology Generation step.
In this part, we provide the theoretical foundations of the proposed approach, based
on the following properties:
Property 1
• The number of clusters generated by a classification algorithm is always lower
than the number of starting objects.
• All objects belonging to the same cluster have the same properties. These characteristics can easily be deduced knowing the center of the cluster and the distance to it.
• The size of the lattice modeling the properties of the clusters is lower than the size of the lattice modeling the properties of the objects.
• The management of the lattice modeling the properties of the clusters is more efficient than the management of the lattice modeling the properties of the objects.
C1 ⇒ C2 (CR) ⟺ …

C1, C2 ⇒ C3 (CR) ⟺ …
The validity of these two properties comes from the fact that all objects belonging to the same cluster necessarily satisfy the same attributes as their cluster.
This step gives a certain number of clusters for each attribute. Each tuple has values in the interval [0,1] representing its membership degrees. Linguistic labels, which are fuzzy partitions, will be assigned to the attributes. This step consists of generating the TAHs and the MTAH for the relievable attributes. It is very important in the fuzzy ontology generation process because it allows us to define and interpret the distribution of objects in the various concepts.
Example Let us consider a relational database table, presented in Table 1, containing the AGE and SALARY of employees.
Table 2 presents the results of fuzzy clustering applied to the Age and Salary attributes. For the Salary attribute, fuzzy clustering generates three clusters (C1, C2 and C3). For the Age attribute, two clusters have been generated (C4 and C5).
The minimal (resp. maximal) value of each cluster corresponds to the lower (resp. upper) bound of the interval of its values. Each cluster of a partition is labelled with a linguistic label provided by the user or a domain expert.
Table 4 presents the correspondence between the linguistic labels and their designations for the attributes Salary and Age.
The fuzzy concept lattices of the fuzzy context presented in Table 5, noted TAHs, are given by the line diagrams presented in Figs. 2 and 3.
This very simple sorting procedure gives us, for each many-valued attribute, the distribution of the objects in the line diagram of the chosen fuzzy scale. Figure 4 shows the fuzzy nested lattice constructed from Figs. 2 and 3.
Table 4 Correspondence of the linguistic labels and their designations

Attribute  Linguistic label  Designation
Salary     Low               C1
Salary     Medium            C2
Salary     High              C3
Age        Young             C4
Age        Adult             C5
This step consists of constructing the fuzzy ontology. It aims to deduce the Fuzzy Cluster Lattice corresponding to the MTAH lattice generated in the first step, and then to generate the ontology extent and intent classes, the ontology hierarchical classes, the ontology relational classes and, finally, the fuzzy ontology.
Definition (Fuzzy Clusters Lattice) A Fuzzy Clusters Lattice (FCL) of a Fuzzy Formal Concept Lattice is a fuzzy concept lattice in which each equivalence class (i.e., a node of the lattice) contains only the intentional description (intent) of the associated fuzzy formal concept. This lattice will be used to build the core of the ontology.
Definition (Level of FCL) A level i of an FCL is the set of nodes of the FCL with cardinality equal to i.
Definition (Concept Hierarchy) A concept hierarchy is a poset (partially ordered set) (H, <), where H is a finite set of concepts and < is a partial order on H.
We make in this case a certain abstraction over the list of the objects and their degrees of membership in the clusters. The nodes of the FCL are clusters ordered by the inclusion relation. As shown in Fig. 5, we obtain a more reduced lattice, simpler to traverse and to store. Figure 6 illustrates the hierarchical relations constructed from the conceptual clusters given in Fig. 5. Each concept in the concept hierarchy is represented by the set of its attributes.
The supremum and infimum of the lattice are considered respectively as “Thing”
and “Nothing” concepts.
The next step constructs the fuzzy ontology from a fuzzy context, using the concept hierarchy created by fuzzy conceptual clustering. This is possible because both FCA and ontologies support formal definitions of concepts. Thus, we define the fuzzy ontology as follows:
The next step provides the means for transforming the concept-lattice-based ontology expression into association rules. This process produces a logical expression of the ontology lattice and specifies the intended semantics of the descriptions in first-order logic. Once the ontology is defined, we can model the resulting rules deduced from our fuzzy ontology using the Protégé 4.2 software, as below (Fig. 9):
In order to define non-taxonomic relationships the following groups of rules are
defined:
• Properties of concepts: for the definition of properties of concepts, the following predicate can be used: has_property(ConceptName, PropertyName).
• Inheritance of properties: inheritance of properties can be represented by the following rule: has_property(C1, X) ← is_a(C1, C2), has_property(C2, X).
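As an illustration of these two rule groups, a minimal Python sketch of direct properties plus inheritance along the is_a hierarchy (all facts and names are hypothetical):

# Hypothetical facts: is_a(C1, C2) pairs and direct has_property assignments
is_a = {("C1", "C2"), ("C2", "C3")}
direct_properties = {"C2": {"surface"}, "C3": {"price"}}

def has_property(concept):
    """has_property(C1, X) <- is_a(C1, C2), has_property(C2, X):
    collect own properties plus those inherited from all ancestors."""
    props = set(direct_properties.get(concept, set()))
    for sub, sup in is_a:
        if sub == concept:
            props |= has_property(sup)   # recurse up the (acyclic) hierarchy
    return props

print(has_property("C1"))   # {'surface', 'price'}: inherited from C2 and C3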
A flexible and cooperative database querying approach within the fuzzy ontology framework has been proposed. This approach takes into account the semantic dependencies between the query and the search criteria to determine whether or not the query is realizable. The idea is thus to change the level of granularity and apply the clustering operation, so that the interrogation necessarily focuses on clusters.
The next step presents our flexible query algorithm, which uses the ontology generated in the second step. Let R be the user query. The pseudo-code of the flexible query generation algorithm is given below.
Note that:
• Concept_Query(R, ΦB) is a procedure that determines the concept ΦB of R.
• Extract(R, i) is a procedure that determines the answers to the request using backward chaining. This procedure calls upon all the rules closely related to the request of level ≤ i.
Applying this algorithm, the generated knowledge is in the form of rules. We obtain a query concept Φ = (ΦA, ΦB).
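The pseudo-code itself is not reproduced above; purely as an illustration of the backward-chaining idea described in the notes, here is a self-contained sketch in which the rule encoding and all the concrete names are hypothetical:

# Hypothetical rule base: each rule is (level, body, head) over cluster labels
rules = [
    (1, {"C2"}, "C1"),          # C2 => C1, a level-1 rule
    (2, {"C1", "C4"}, "C5"),    # C1, C4 => C5, a level-2 rule
]

def extract(goal, facts, max_level):
    """Extract(R, i)-style sketch: prove the goal by backward chaining,
    using only rules of level <= max_level."""
    if goal in facts:
        return True
    for level, body, head in rules:
        if head == goal and level <= max_level:
            if all(extract(b, facts, max_level) for b in body):
                return True
    return False

print(extract("C5", {"C2", "C4"}, max_level=2))   # True: C2=>C1, then C1,C4=>C5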
Example To better explain this step, we consider a relational database table describing apartment announces. The query is as follows:
Q:  Select refAn, price, surface
    From Announce, Appartment
    Where price = 105        (A1)
      and surface = 75       (A2)
      and city = 'Paris'                                (1)
In this query, the user wishes his preferences to be considered in descending order: price, then surface. In other words, the returned data must be ordered and presented to the user according to these preferences. Without this flexibility, the user must refine his search keys until obtaining satisfaction, since he does not have precise knowledge of the data he consults. Among the criteria of the query Φ, only the A1 and A2 criteria correspond to relievable attributes.
Initially, we determine from the DB the tuples satisfying the non-relievable criteria (A3, A4, A5), the result of the following query:
Q:  Select refAn, price, surface
    From Announce, Appartment
    Where city = 'Paris'                    (A3)
      and place = '16eme arrondissement'    (A4)        (2)
If the query criteria are in contradiction with their dependences extracted from the database, the query is said to be unrealisable.
Proposition Let Q be a query having the concept Φ = (ΦA, ΦB). The query is unrealisable if and only if there is no data source in ΦA which divides any metadata of the set ΦA.
Table 8 Characteristics of datasets

Data Set    Nb of objects  Size of objects  Nb of items
C20d10K     10,000         20               386
T25i10d10K  1,000          25               1,000
Mushrooms   8,416          23               128
Car         1,728          7                26
Achat       28             5                5
The performance of the proposed algorithm for discovering fuzzy queries can be measured in order to evaluate the generated ontology. To do this, we compare two approaches using four datasets well known in the knowledge discovery field (Table 8).
The first approach does not apply the clustering concept; the second uses formal concepts for structuring and building the ontology-based classification with FCA, as adopted by "ClusterFCA". ClusterFCA is a Java platform developed by our team. It includes a classification module containing algorithms for binary and fuzzy clustering. It also includes an FCA module for the construction of simple and nested lattices.
In this chart, we show the number of rules resulting from these datasets: Mushrooms (8,416 objects), C20d10K (10,000 objects), Car (1,728 objects) and Achat (28 objects).
The existing algorithms don't take into account any semantics of the data. Researchers have focused on reducing the set of rules, by proposing the concept of metadata, or on methods for visualizing these rules. Our main contribution resides in extracting the ontology from datasets by using FCA and transforming it into a rule language in order to model the expression of the user's preferences and generate the relevant answers. Thus, we prove in Fig. 11 that with FCA we minimize the space complexity of the resulting lattice. The combination of the two concepts, FCA and ontology, models a certain abstraction of the data that is fundamental in the case of an enormous number of objects, because the defined ontology is deduced from the clusters, not from the initial objects. The flexible query approach provides the following contributions compared to similar approaches:
• The automatic generation of TAHs and the MTAH from relievable attributes.
• The retrieval of relevant data sources for a given query.
• The detection of the query's unrealisability.
• The ordering of the results.
Different advantages are granted by the proposed approach. This approach is:
• More reliable compared to the classic one (without clustering). In the examples, the number of classes generated by applying our new approach is less than the number of classes of the input ontology. The decrease …
8 Conclusion
This chapter focuses on how future information retrieval from large datasets might look, and on how the interaction between the sources of information, interpreted with a unique power of intelligence, can yield accurate and real-time results providing the best possible answer to the user query.
The main idea is to make the search more informative and to give it intelligence in order to retrieve user-oriented results. The model defined for this approach is FODM-FQ. It consists of four steps. The first organizes the database records in homogeneous clusters having common properties, in order to deduce the data's semantics; this step consists of the generation of the TAHs and the MTAH for the relievable attributes. The second step, called Discovering Knowledge, deduces the Fuzzy Cluster Lattice corresponding to the MTAH lattice generated in the first step. Then, in the third step, the FCL is mapped to an OWL ontology design. From this ontology, the rules modeling the knowledge (the Set of Fuzzy Association Rules on the attributes, SFR) are extracted. We prove that the discovered rules do not contain any redundant rule. The fourth step ensures database flexible querying using the generated ontology.
An example of an ontology was simulated using Protégé and the results were analyzed. The keywords entered by the user were given priority and, based on that, we discarded certain resulting queries using the FCA methodology, which makes the search more compact and effective. The future scope of this method is, first, to integrate the current approach into large domains, resulting in an expansion of the knowledge base. Secondly, an intelligent distributed ontology query processing method will be proposed to deal with the growth of the data size and of the number of distributed queries which access the common part of the resources, while successfully meeting user preferences.
References
1. Arpirez, J., Corcho, O., Fernandez-Lopez, M., Gomez-Perez, M.: WebODE: a workbench for
ontological engineering. In: First International Conference on Knowledge Capture
(K-CAP01), Victoria, pp. 613 (2001)
2. Azar, A.T.: Fuzzy Systems. IN-TECH, Vienna (2010). ISBN 978-953-7619-92-3
3. Azar, A.T.: Overview of type-2 fuzzy logic systems. Int. J. Fuzzy Syst. Appl. (IJFSA) 2(4),
1–28 (2012)
4. Azar, A.T.: Adaptive neuro-fuzzy systems. In: Azar, A.T. (ed.) Fuzzy Systems. IN-TECH,
Vienna, ISBN 978-953-7619-92-3 (2010a)
5. Baer, P.G.D., Kapetanios, E., Keuser, S.: A semantics based interactive query formulation technique. In: Second International Workshop on User Interfaces to Data Intensive Systems, Zurich, pp. 43–49 (2001)
6. Bao-xiang, X., Zhang, Y.: Research on the development of information system modeling
theory. J. Intell. 29(5), 70–74 (2010)
7. Bechhofer, S., Horrocks, I., Goble, C., Stevens, R.: OilEd-a reason-able ontology editor for the
semantic web. In: Joint German-Austrian Conference on Artificial Intelligence (KI01), Vienne,
pp. 396–408 (2001)
8. Berners-Lee, T.: Weaving the Web. HarperCollins, New York, ISBN 006-251-5861 (1999)
9. Bobillo, F., Straccia, U.: Fuzzy description logics with general t-norms and datatypes. Fuzzy
Sets Syst. 160(23), 3382–3402 (2009)
10. Borzsonyi, S., Kossmann, D., Stocker, K.: The skyline operator. In: International Conference
on Data Engineering (ICDE), Heidelberg (2001)
11. Bosc, P., Galibourg, M., Hamon, G.: Fuzzy querying with SQL: extensions and implementation aspects. Fuzzy Sets Syst. 28(3), 333–349 (1988)
12. Bosc, P., Liétard, L.: Aggregates computed over fuzzy sets and their integration into SQLf. Int. J. Uncertain. Fuzziness Knowl.-Based Syst. 16(6), 761–792 (2008)
13. Bosc, P., Lietard, L., Pivert, O.: Databases and flexibility: gradual queries. Technique et Science Informatiques (TSI) 17(3), 355–378 (1998)
14. Bosc, P., Pivert, O.: SQLf: a relational database language for fuzzy querying. IEEE Trans. Fuzzy Syst. 3(1), 1–17 (1995)
15. Bosc, P., Pivert, O.: À propos de requêtes à préférences et diviseur stratifié. In: 28ème Congrès INFORSID, France, pp. 311–326 (2010)
16. Carpineto, C., De Mori, R., Romano, G., Bigi, B.: An information theoretic approach to
automatic query expansion. ACM Trans. Inf. Syst. 19(1), 1–27 (2001)
17. Carroll, J., Klyne, G.: RDF concepts and abstract syntax. Recommendation, W3C (2004),
http://w3c.org/TR/rdf-concepts
18. Chandrasekaran, B., Josephson, J., Benjamin, V.: What are ontologies, and why do we need
them? IEEE Intell. Syst. 14(1), 20–26 (1999)
19. Chang, C.: Decision support in an imperfect world. In: Trends and Applications on
Automating Intelligent Behavior-Applications and Frontiers, Denmark, p. 25 (1983)
20. Chomicki, J.: Preference formulas in relational queries. ACM Trans. Database Syst. 28(4),
427–466 (2003)
21. Chu, W., Yang, H., Minock, M., Chow, G., Larson, C.: CoBase-a scalable and extensible
cooperative information system. J. Intell. Inf. Syst. 6(2–3), 223–259 (1996)
22. Clerkin, P., Cunningham, P., Hayes, C.: Ontology discovery for the semantic web using
hierarchical clustering. In: European Conference Machine Learning (ECML) and European
Conference Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD-
2001) (2001)
23. Decker, S., Melnik, S., Van Harmelen, F., Fensel, D., Klein, M., Broekstra, J., Erdmann, M.,
Horrocks, I.: The semantic web-the roles of XML and RDF. IEEE Internet Comput. 5(4),
63–74 (2000)
24. Ding, Y., Foo, S.: Ontology research and development: Part 1 a review of ontology generation.
J. Inf. Sci. 28(2), 123–136 (2002)
25. Domingue, J., Motta, E.: Knowledge modeling in webonto and ocml. http://kmi.open.ac.uk/
projects/ocml (1999)
26. Domshlak, C., Hoos, H., Boutilier, C., Brafman, R., Poole, D.: Cp-nets : a tool for representing
and reasoning with conditional ceteris paribus preference statements. J. Artif. Intell. Res. 21,
135–191 (2004)
27. Dubois, D., Prade, H.: Bipolarity in flexible querying. In: Proceedings of the 5th International
Conference on Flexible Query Answering Systems (FQAS) 02, London, UK, pp. 174–182
(2002)
28. Farquhar, A., Fikes, R., Rice, J.: The ontolingua server: a tool for collaborative ontology
construction. In: The 10th Knowledge Aqcuisition for Knowledge-Based Systems (KAW96),
Canada, pp. 174–182 (1996)
29. Ganter, B., Wille, R.: Formal Concept Analysis, Mathematical Foundations, vol. 1640.
Springer, Heidelberg (1999)
30. Grissa Touzi, A., Sassi, M., Ounelli, H.: An innovative contribution to flexible query through
the fusion of conceptual clustering, fuzzy logic, and formal concept analysis. Int. J. Comput.
Appl. 16(4), 220–233 (2009)
31. Kapetanios, E., Baer, D., Glaus, B., Groenewoud, P.: MDDQL-Stat: data querying and
analysis through integration of intentional and extensional semantics. Proceedings of the 16th
International Conference on Scientific and Statistical Database Management (SSDBM 2004),
Switzerland, pp. 353 (2004)
32. Kiessling, W.: Data querying and analysis through integration of intentional and extensional
semantics. In: Foundations of Preferences in Database Systems. Very Large Data Base
(VLDB) Endowment Inc, pp. 311–322 (2002)
33. Lacroix, M., Lavency, P.: Preferences-putting more knowledge into queries. In: Proceedings
of the 13th International Conference on Very Large Data Bases, University of Vienna, Austria,
pp. 217–225 (1987)
34. Lassila, O., Swick, R.: Resource description framework (RDF) model and syntax specification.
Recommendation, W3C (1999)
35. Lietard, L., Rocacher, D.: On the definition of extended norms and co-norms to aggregate
fuzzy bipolar conditions. In: The European Society of Fuzzy Logic and Technology
Conference, pp. 513–518 (2009)
36. Mena, E., Illarramendi, A., Kashyap, V., Sheth, A.: OBSERVER: an approach for query
processing in global information systems based on interoperation across pre-existing
ontologies. J. Distrib. Parallel Databases 8(2), 223–271 (2000)
37. Motik, B., Grau, B.C., Horrocks, I., Wu, Z., Fokoue, A., Lutz, C.: OWL 2 web ontology
language: profiles. Recommendation, W3C. http://www.w3.org/TR/owl2-profiles/ (2009)
38. Motro, A.: VAGUE-A user interface to relational databases that permits vague queries. ACM
Trans. Office Inf. Syst. 6(3), 187–214 (1988)
39. Noy, N.F., Fergerson, R.W., Musen, M.A.: The knowledge model of Protégé-2000: combining interoperability and flexibility. In: 12th International Conference on Knowledge Engineering and Knowledge Management (EKAW00), Juan-les-Pins, France, pp. 17–32 (2000)
40. Ounalli, H., Belhadj, R.: Interrogation flexible et coopérative d'une BD par abstraction conceptuelle hiérarchique. In: INFORSID, Biarritz, France, pp. 41–56 (2004)
41. Paton, N.W., Stevens, R., Baker, P., Goble, C.A., Bechhofer, S., Brass, A.: Query Processing
in the TAMBIS Bioinformatics Source Integration System. Proceedings of the IEEE
International Conference on Scientific and Statistical Databases (SSDBM), 138–147 (1999)
42. Pivert, O.: Contribution à l'interrogation flexible de bases de données: expression et évaluation de requêtes floues. PhD thesis (1991)
43. Quan Thanh, T., Hui, S.C., Fong, A., Cao, T.H.: Automatic fuzzy ontology generation for
semantic web. IEEE Trans. Knowl. Data Eng. 18(6), 842–856 (2006)
44. Rabitti, F., Savino, P.: Retrieval of multimedia documents by imprecise query specification.
In: Advances in Database Technology-EDBT90, pp. 203–218. Springer, Berlin (1990)
45. Sassi, M., Grissa Touzi, A., Ounelli, H.: Clustering quality evaluation based on fuzzy FCA. In:
18th International Conference on Database and Expert Systems Applications, (DEXA07),
Regensburg, Germany, pp. 62–72. LNCS (2007)
46. Soergel, D.: Some remarks on information languages, their analysis and comparison. Inf.
Storage Retrieval 3(4), 219–291 (1967)
47. Spoerri, A.: InfoCrystal: a visual tool for information retrieval management. In: Second
International Conference on Information and Knowledge Management, Washington,
pp. 11–20 (1993)
48. Sure, Y., Erdmann, M., Angele, J., Staab, S., Studer, R., Wenke, D.: OntoEdit: collaborative
ontology engineering for the semantic web. In: First International Semantic Web Conference
(ISWC02) of Lecture Notes in Computer Science, Chia, Sardaigne, Italie, vol. 2342,
pp. 221–235 (2002)
49. Tahani, V.: A conceptual framework for fuzzy query processing: a step toward very intelligent
database systems. Inf. Process. Manage. 13(5), 289–303 (1977)
50. Tran, T., Wang, H., Rudolph, S., Cimiano, P.: Top-k exploration of query graph candidates for
efficient keyword search on rdf. In: IEEE Computer Society (ed.) Proceedings of the 2009
IEEE International Conference on Data Engineering, pp. 405–416. IEEE Computer Society
(2009)
51. Uri, K., Jianjun, Z.: Fuzzy clustering principles, methods and examples, vol. 17(3), p. 13.
Technical Report, Technical University of Denmark, IKS, Denmark, (1998)
52. Wille, R.: Restructuring lattice theory: an approach based on hierarchies of concepts. In: Rival,
I. (ed.) Ordered Sets, vol. 83. Springer, Berlin (1982)
53. Wuermli, O., Wrobel, A., Joller, J.: Data mining for ontology building: semantic web overview. PhD thesis, Nanyang Technological University (2003)
54. Zadeh, L.: Fuzzy sets. Inf. Control 8(3), 338–353 (1965)
An Efficient Multi Level Thresholding
Method for Image Segmentation Based
on the Hybridization of Modified PSO
and Otsu’s Method
F. Hamdaoui (&)
Laboratory of EμE, Faculty of Sciences of Monastir (FSM),
University of Monastir, Monastir, Tunisia
e-mail: [email protected]
A. Sakly
Industrial Systems Study and Renewable Energy (ESIER),
National Engineering School of Monastir (ENIM), Electrical Department,
University of Monastir, Monastir, Tunisia
e-mail: [email protected]
A. Mtibaa
Laboratory of EμE, Faculty of Sciences of Monastir (FSM),
National Engineering School of Monastir (ENIM), Electrical Department,
University of Monastir, Monastir, Tunisia
e-mail: [email protected]
1 Introduction
Genetic algorithms (GAs) are meta-heuristic search methods belonging to the class of evolutionary algorithms (EAs). They are inspired by the analogy between the optimization process and the evolution of organisms. A GA is used to search for global or optimal solutions when no deterministic method exists or when the deterministic method is computationally complex. The GA is a population-based algorithm that was proposed by John Holland in 1975 [38], and later developed by Goldberg in 1989 [39], Holland in 1992 [40], Man et al. in 1996 [41], and Schmitt and Petrowski in 2001 [42, 43]. This technique has since been used in many fields, such as image segmentation, which has been transformed into an optimization problem as in [44–47].
Multithresholding amounts to finding more than one threshold, that is, several solutions. Each solution is represented as a chromosome, each chromosome is constructed from genes, and the solutions generated per iteration are called the population. The size of the population is the number of solutions per iteration. Let n be the population size of randomly generated individuals. The genetic algorithm starts with n random solutions. The best member solutions are then selected to generate new solutions, so the best generated solutions are added to the next iteration while all bad solutions are rejected. The selection of the best solutions in each generation is based on the fitness evaluation value of every individual in the population, so as to form a new population. The stopping criterion is determined by the number of generations that have been produced, or by a satisfactory fitness value reached for the population. Generally, the genetic algorithm is based on four steps: population initialization, evaluation of fitness, reproduction and termination criterion (Table 1).
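Purely as an illustration of these four steps, a generic GA skeleton for multilevel thresholding (the fitness function is left abstract and every parameter value is an arbitrary assumption):

import random

def genetic_thresholds(fitness, n_thresholds, pop_size=30, generations=100):
    """Generic GA: init population -> evaluate fitness -> reproduce -> terminate.
    A chromosome is a sorted list of threshold values (genes) in [0, 255]."""
    pop = [sorted(random.sample(range(256), n_thresholds)) for _ in range(pop_size)]
    for _ in range(generations):                      # termination: fixed generations
        pop.sort(key=fitness, reverse=True)           # evaluation of fitness
        parents = pop[: pop_size // 2]                # selection: keep best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randint(1, n_thresholds - 1) if n_thresholds > 1 else 0
            child = sorted(a[:cut] + b[cut:])         # one-point crossover
            if random.random() < 0.1:                 # mutation
                child[random.randrange(n_thresholds)] = random.randrange(256)
                child.sort()
            children.append(child)
        pop = parents + children                      # reproduction
    return max(pop, key=fitness)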
The ant colony algorithms are a family of meta-heuristics inspired from nature and using swarm intelligence. The behavior of ants in their search for food has been studied and applied to solve complex optimization problems. Simply put, ants initially move randomly. Once food has been found, they return to their colony while laying down along the way a chemical substance called pheromone [48]. Other ants that encounter the same path have a high probability of stopping their random movements and following the path marked by the substance: this has been called the phenomenon of stigmergic optimization [49]. After some searching, there will be several paths that lead to the food. The shortest path will be covered without the ants necessarily having a global vision of the path [50, 51]: this phenomenon is based on positive feedback. Therefore, the long paths ultimately disappear and, finally, all the ants follow the shortest path.
Dorigo [52] and Colorni [53] were the first to try to implement an ACO inspired by this analogy to solve the problem of searching for an optimal path in a graph. Since then, several problems have been addressed, drawing on various aspects of ant behavior. Multithresholding is among the areas in which the ACO algorithm has been implemented to obtain the optimal thresholds in the field of image segmentation [54–57].
Table 2 below gives the overall description of the ACO algorithm:
The Artificial Bee Colony algorithm was first proposed by Karaboga [58] in 2005 for numerical optimization problems, based on the intelligent foraging behavior of honey bee swarms. Later, further improvements were carried out for the ABC algorithm by Karaboga and Basturk in 2006 and 2008 [59, 60]. The colony in the ABC model consists of three groups of bees: employed bees, onlookers and scouts [61].
A bee which has discovered a source of food to exploit belongs to the employed bees group. The second group of bees, called onlookers, are those waiting in the hive for information about the sources of food from the employed bees. The third group, the scout bees, is the set of bees which randomly search for food sources around the hive. After exploiting a source of food, a bee belonging to the employed bees group returns to the hive and shares information about the nectar amount produced by the food source with the other bees. The employed bee starts dancing in the dance area of the hive. Communication among bees related to the quality of food sources takes place in this dancing area. The dance is called a waggle dance, and it is made to share information with a probability proportional to the profitability of the food source: the more profitable the source is, the longer the dancing duration. An onlooker on the dance floor watches numerous dances and chooses to employ herself at the most profitable source. After watching several dances, an onlooker bee chooses a source of food and becomes an employed bee. In a similar way, a scout is called employed when it finds a source of food. After completely exploiting a source of food, all the employed bees abandon it and change into onlookers or scouts [62, 63]. A typical algorithm for the ABC technique is given in Table 3.
The ABC algorithm finds optimal solutions for optimization problems. Many researchers use this algorithm to determine the threshold values for the multilevel thresholding problem [64–68].
Step 4: After a defined number of memeplex evolution stages, all the frogs of the memeplexes are collected and sorted in descending order of their fitness again. Step 2 then divides the frogs into different memeplexes again, and step 3 is performed.
Step 5: If a predefined solution or a fixed iteration number is reached, the algorithm stops.
Table 4 shows the proposed algorithm based on the SFLA technique.
The SFLA algorithm has recently been used for determining the optimal thresholds in the field of image segmentation, specifically for the identification of bi-level [75, 76] and multi-level thresholding [77–82].
3 Proposed Approach
Multilevel thresholding segments images into several distinct regions. Using this process, it is possible to determine more than one threshold value for a given gray-level image and to segment it into several brightness regions, which correspond to one background and several objects. Let us consider a gray-level image that contains N pixels distributed between objects and background. The multilevel threshold selection can be considered as the problem of finding a set T(l), l = 1, 2, …, L of threshold values, where L is the number of threshold levels. As a result of thresholding, the original image is transformed into an image with L + 1 levels. If T(l), l = 1, 2, …, L are the threshold values with T(1) < T(2) < … < T(L), and f(x, y) is the image function which gives the gray-level value of the pixel with coordinates (x, y), then the resultant image F(x, y) is defined as:
F(x, y) = 0,  if f(x, y) ≤ T(1)
          1,  if T(1) < f(x, y) ≤ T(2)
          ⋮
          L,  if f(x, y) > T(L)                          (1)
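Assuming NumPy, the mapping of Eq. (1) can be sketched in a single call (the threshold values below are arbitrary):

import numpy as np

def apply_thresholds(image, thresholds):
    """Map gray levels to region labels 0..L per Eq. (1):
    label k is assigned when T(k) < f(x, y) <= T(k+1)."""
    return np.digitize(image, bins=np.sort(thresholds), right=True)

img = np.array([[10, 100], [180, 250]], dtype=np.uint8)
print(apply_thresholds(img, [85, 170]))   # [[0 1] [2 2]]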
φ = max σ²(T)  subject to  T(1) < T(2) < ⋯ < T(L)        (3)
And:
σ₁² = (1/T) Σ_{i=0}^{T−1} (h(i) − μ₁)²,   σ₂² = (1/(256 − T)) Σ_{i=T}^{255} (h(i) − μ₂)²     (5)

μ₁ = (1/T) Σ_{i=0}^{T−1} h(i),   μ₂ = (1/(256 − T)) Σ_{i=T}^{255} h(i)     (6)

P₁ = (1/(W·H)) Σ_{i=0}^{T−1} h(i),   P₂ = (1/(W·H)) Σ_{i=T}^{255} h(i)     (7)
where h is the histogram of the image and W and H are respectively the width and height of the image.
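For illustration, Eqs. (5)–(7) can be computed directly from a 256-bin histogram; a minimal NumPy sketch assuming 0 < T < 256 (the function name is ours):

import numpy as np

def class_statistics(h, T, W, H):
    """Eqs. (5)-(7): per-class variances, means and proportions
    from a 256-bin histogram h, for a candidate threshold 0 < T < 256."""
    mu1, mu2 = h[:T].mean(), h[T:].mean()        # Eq. (6)
    var1 = ((h[:T] - mu1) ** 2).mean()           # Eq. (5)
    var2 = ((h[T:] - mu2) ** 2).mean()
    P1 = h[:T].sum() / (W * H)                   # Eq. (7)
    P2 = h[T:].sum() / (W * H)
    return var1, var2, P1, P2

# h = np.bincount(img.ravel(), minlength=256) for an image img of size W x H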
The major drawback of this approach is the computational effort, which becomes much larger as the number of threshold levels increases. In the last decade, biologically inspired methods have been used as computationally efficient alternatives to analytical methods for solving such optimization problems [83, 84].
where x_im is the ith position of the particle in the swarm; v_im is the velocity of this particle; p_im is the best previous position of the ith particle; p_gm is the best position of any particle in the swarm; 1 ≤ m ≤ M, with M the dimension of the search space; rand1() and rand2() are two independent random numbers with uniform distribution in the range (0, 1); c1 and c2 are two positive acceleration constants, called the cognitive and social parameters respectively; and w is the inertia weight, used to control the balance between exploration and exploitation of the search space.
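The update equations themselves are not reproduced in the text above; in terms of the quantities just listed, the canonical inertia-weight PSO update takes the following form. This is a sketch of the standard formulation, not necessarily the authors' exact variant:

import random

def pso_update(x, v, p_best, g_best, w=0.7, c1=1.5, c2=1.5):
    """One canonical PSO step per dimension m:
    v_im = w*v_im + c1*rand1()*(p_im - x_im) + c2*rand2()*(p_gm - x_im)
    x_im = x_im + v_im"""
    for m in range(len(x)):
        v[m] = (w * v[m]
                + c1 * random.random() * (p_best[m] - x[m])
                + c2 * random.random() * (g_best[m] - x[m]))
        x[m] += v[m]
    return x, v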
The PSO algorithm, given in the diagram of Fig. 1 and described in the above paragraph, is briefly detailed in Table 5.
The PSO approach is based on memory and on the social interaction among individuals. In the general case, the fitness function allows determining the best position for a particle i, making it move from its current position (x_i, t) to the next one (x_i, t + 1). The moving process depends on three stages:
SP_i = Σ_{j=0}^{p} P_j        (13)

N(i) = (255 · SP_i) / 2^{p−1}        (14)
After that, four parameters lowsum, lownum, highsum and highnum are initialized to zero. Then, each pixel of the original image with intensity L(i) is compared with N(i), as explained in the pseudo-code given in Table 6.
The computation of the new fitness function depends on the final image after segmentation. For each particle i, after comparing all pixels with N(i), two coefficients u1 and u2 are calculated according to Eqs. (15) and (16). Finally, the fitness function of the particle i is calculated using Eq. (17).
u₁ = lowsum / lownum        (15)

u₂ = highsum / highnum        (16)
This new fitness function increases the probability of using more positions. It thus guarantees the best speed of convergence to the sought threshold value, as demonstrated by the experimental results below. The proposed MMPSO algorithm is shown in Table 7.
4 Experimental Results
The CPU processing time is the amount of time for which a metaheuristic method was used to process and find the best threshold value for each region. It is measured in seconds. For several numbers of thresholds (2, 3, 4, and 5) we calculated the CPU processing time for both the Particle Swarm Optimization (PSO) method and Otsu's method [22]. Firstly, we start by giving the experimental results for the average CPU process time of the MMPSO, PSO and Otsu's methods.
Table 6 Pseudo-code for computing the coefficients u1 and u2

    N(i) = 255 · SP_i / 2^(p−1)
end
lowsum = 0; lownum = 0; highsum = 0; highnum = 0
for i from 1 to M
    if L(i) < N(i) then
        lowsum = lowsum + L(i)
        lownum = lownum + 1
    else
        highsum = highsum + L(i)
        highnum = highnum + 1
    end
end
if lownum = 0 then
    u1 = 0
else
    u1 = lowsum / lownum
end
if highnum = 0 then
    u2 = 0
else
    u2 = highsum / highnum
end
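A runnable transcription of the pseudo-code of Table 6 is sketched below; since Eq. (17), which combines u1 and u2 into the final fitness, is not shown above, the sketch stops at the two coefficients:

def u_coefficients(L, N_i):
    """Class averages u1, u2 of Eqs. (15)-(16) for one particle's level N(i)."""
    lowsum = lownum = highsum = highnum = 0
    for intensity in L:                 # L: iterable of pixel intensities
        if intensity < N_i:
            lowsum += intensity
            lownum += 1
        else:
            highsum += intensity
            highnum += 1
    u1 = lowsum / lownum if lownum else 0
    u2 = highsum / highnum if highnum else 0
    return u1, u2

print(u_coefficients([10, 20, 200, 220], N_i=128))   # (15.0, 210.0)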
The MMPSO is the hybridization of PSO and Otsu's method, so Otsu's method is evaluated in order to compare the performance of the MMPSO with the basic Otsu's method without improvement. PSO is a bio-inspired stochastic algorithm and, like all evolutionary algorithms, is based on random initialization, so the results are not identical in each run. For this reason, the MMPSO and PSO algorithms are executed 20 times. The average CPU process time values are reported in Table 9.
… equal to 5 × 10⁻¹ ms for both methods; finally, when the number of threshold values to be determined is equal to 5, the CPU process time for the MEABCT method increases to 7 × 10⁻¹ ms, unlike the MMPSO method, for which it is only 6 × 10⁻¹ ms.
The aim of the proposed algorithm is to determine the best threshold values for the optimization problem. Since all the evolutionary methods considered, among them MMPSO, PSO, GA and MEABCT, are stochastic, the results are not completely the same in each run and for each number of thresholds. For this reason, different levels of image segmentation are applied and reported in Tables 11 and 12.
As with the CPU processing time, we begin in this section by giving the results for the MMPSO, PSO and Otsu's methods, to establish the performance of the MMPSO method compared with the basic PSO and Otsu methods. Secondly, we compare our MMPSO method with other metaheuristic methods, such as GA and MEABCT.
Table 11 shows the selected thresholds for the three test images. It is clear that the thresholds selected by the MMPSO algorithm are very close (for all of the 2, 3, 4 and 5 threshold problems) to the ones of the PSO algorithm; nevertheless, there are significant differences in the selected thresholds with regard to Otsu's method. This result reveals that the multithresholding results depend heavily on the objective function that is selected.
Table 12 shows the selected thresholds derived by the MMPSO, GA and MEABCT algorithms for 3, 4, 5 and 6 different levels. We find that the selected thresholds of the MMPSO algorithm are equivalent to the optimal thresholds derived by the GA and MEABCT methods when the number of thresholds is less than 4.
However, when the number of thresholds exceeds 4, the thresholds of the MMPSO algorithm are different from and better than those of the other methods. Also, for the Map image, which is more complex, the thresholds of the MMPSO are the most significant and the best. Thus, the proposed MMPSO algorithm is suitable for more complex image analysis.
Figure 3 gives the results of different segmented images with various threshold levels. The qualitative results given below show that images with a higher level of segmentation have more details than the others. Along the same lines, Tables 11 and 12 show that, for the 6-level threshold, the MMPSO method is the one that always has the best threshold values with regard to the Otsu, PSO, GA and MEABCT methods.
The objective function value is the main idea of our work, since we introduce a new fitness function for the basic PSO algorithm. It is therefore necessary to compare the results given by our MMPSO method to Otsu's method and to the other metaheuristic algorithms in order to determine the best one.
We give in Tables 13 and 14 the standard deviations of the fitness values provided by the MMPSO, PSO, Otsu's, GA and MEABCT methods to evaluate the stability of all the algorithms. For this, we use the following index, given in Eq. (18) below [90]:
STD = √( Σ_{i=1}^{N} (r_i − μ)² / N )        (18)
where STD is the standard deviation, r_i is the best fitness value of the ith run of the algorithm, μ is the average value of r, and N is the number of repetitions of each method (here N = 20). Note that all objective function values are calculated for 2, 3, 4 and 5 thresholds.
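Equivalently, Eq. (18) is the population standard deviation; a quick NumPy check (the run values below are hypothetical):

import numpy as np

runs = np.array([0.912, 0.915, 0.910, 0.913])   # hypothetical best-fitness values
std = np.sqrt(np.sum((runs - runs.mean()) ** 2) / runs.size)   # Eq. (18)
print(np.isclose(std, np.std(runs)))   # True: Eq. (18) is the population std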
Table 13 presents the standard deviation of the fitness values for the MMPSO, PSO and Otsu's methods. A lower STD means a higher stability of the fitness used by the algorithm. The MMPSO and PSO methods have very close STD values, with a slight advantage in favor of the MMPSO method over the PSO method. Both of these methods have better values than the Otsu method. Thereby, the fitness value of the MMPSO method yields a more stable objective function.
362 F. Hamdaoui et al.
We can easily note that the difference between all the algorithms is very small. Also, we remark that our method is the best with reference to the values given in Table 13. In other terms, as the number of thresholds increases, the difference becomes more noteworthy, and the MMPSO leads with a higher fitness value than the other methods.
5 Conclusion
In this paper, we have proposed a method called MMPSO, inspired from the Particle Swarm Optimization algorithm and based on a new fitness function and Otsu's method for multilevel thresholding. This method is able to determine optimal threshold values for complex gray-level images. For this purpose, a new fitness function is developed to ensure the best threshold values in less CPU process time and with the best stability, reflected by the best STD values. Experimental results are demonstrated by computing optimal threshold values at 4 different levels (3, 4, 5 and 6 levels) for three different benchmark images. The results indicate that the MMPSO is more efficient than the basic PSO, Otsu's method, and the GA and MEABCT methods. In particular, this method is better when the level of segmentation increases and the image contains more details.
Moreover, due to the low computational complexity and the high stability of the MMPSO algorithm, it will be applied to classify MRI medical images. The segmentation results are promising, and they encourage further research on applying the MMPSO algorithm to complex and real-time MRI image segmentation problems.
References
1. Melouah, A.: A novel region growing segmentation algorithm for mass extraction in
mammograms. Model. Approaches Algorithms Adv. Comput. Appl. Stud. Comput. Intel. 488,
95–104 (2013)
2. Chakraborty, J., Mukhopadhyay, S., Singla, V., Khandelwal, N., Rangayyan, R.M.: Detection
of masses in mammograms using region growing controlled by multilevel thresholding. In:
The 25th International Symposium on Computer-Based Medical Systems (CBMS), Rome,
pp. 1–6, 20–22 June 2012. doi: 10.1109/CBMS.2012.6266308
3. Dragon, R., Ostermann, J., Van Gool, L.: Robust realtime motion-split-and-merge for motion
segmentation. In: The 2013 35th German Conference on Computer Science, GCPR.
Saarbrücken, Germany, pp. 425–434, 3–6 Sept 2013. doi:10.1007/978-3-642-40602-7_45
4. Chaudhuri, D., Agrawal, A.: Split-and-merge procedure for image segmentation using
bimodality detection approach. Defence Sci. J. 60(3), 290–301 (2010)
5. Cao, X., Ding, W., Hu, S., Su, L.: Image segmentation based on edge growth. In: Proceedings
of the 2012 International Conference on Information Technology and Software Engineering,
pp. 541–548 (2013). doi:10.1007/978-3-642-34531-9_57
6. Sharif, M., Raza, M., Mohsin, S.: Face recognition using edge information and DCT. Sindh
Univ. Res. J. (Sci. Ser.) 43(2), 209–214 (2011)
7. Baakek, T., Chikh Mohamed, A.: Interactive image segmentation based on graph cuts and
automatic multilevel thresholding for brain images. J. Med. Imaging Health Inform. 4(1),
36–42 (2014)
8. Martin-Rodriguez, F.: New tools for gray level histogram analysis, applications in segmentation.
In: 10th International Conference in Image analysis and recognition, ICIAR, Póvoa do Varzim-
Portugal, pp. 326–335, 26–28 June 2013. doi:10.1007/978-3-642-39094-4_37
9. Qifang, L., Zhe, O., Xin, C., Yongquan, Z.: A multilevel threshold image segmentation
algorithm based on glowworm swarm optimization. J. Comput. Inf. Syst. 10(4), 1621–1628
(2014)
10. Kulkarni, R.V., Venayagamoorthy, G.K.: Bio-inspired algorithms for autonomous deployment
and localization of sensor nodes. IEEE Trans. Syst. Man Cybern. 40(6), 663–675 (2010)
11. Hamdaoui, F., Ladgham, A., Sakly, A., Mtibaa, A.: A new images segmentation method based
on modified PSO algorithm. Int. J. Imaging Syst. Technol. 23(3), 265–271 (2013)
12. Ladgham, A., Hamdaoui, F., Sakly, A., Mtibaa, A.: Fast MR brain image segmentation based on modified shuffled frog leaping algorithm. Signal Image Video Process. (2013). doi:10.1007/s11760-013-0546-y
13. Sun, H.J., Deng, T.Q., Jiao, Y.Y.: Remote sensing image segmentation based on rough
entropy. In: 4th International Conference in Advances in Swarm Intelligence ICSI,
pp. 11–419, 12–15 June 2013. doi:10.1007/978-3-642-38715-9_49
14. Sarkar, S., Sen, N., Kundu, A., Das, S., Chaudhuri, S.S.: A differential evolutionary multilevel
segmentation of near infra-red images using Renyi’s entropy. In: International Conference on
Frontiers of Intelligent Computing: Theory and Applications FICTA, pp. 699–706, (2013).
doi:10.1007/978-3-642-35314-7_79
15. Daisne, J.F., Sibomana, M., Bol, A., Doumont, T., Lonneux, M., Grégoire, V.: Tri-dimensional
automatic segmentation of PET volumes based on measured source-to-background ratios:
influence of reconstruction algorithm. Radiother. Oncol. 69(3), 247–250 (2003)
16. Huang, D.Y., Lin, T.W., Hu, W.C.: Automatic multilevel thresholding based on two-stage
Otsu’s method with cluster determination by valley estimation. Int. J. Innovative Comput. Inf.
Control 7(10), 5631–5644 (2011)
17. Ningning, Z., Tingting, Y., Shaobai, Z.: An improved FCM medical image segmentation
algorithm based on MMTD. Comput. Math. Methods. Med. (2014). http://dx.doi.org/10.1155/
2014/690349
18. Yasmin, M., Mohsin, S., Sharif, M., Raza, M., Masood, S.: Brain image analysis: a survey.
World Appl. Sci. J. 19(10), 1484–1494 (2012)
19. Raza, M., Sharif, M., Yasmin, M., Masood, S., Mohsin, S.: Brain image representation and
rendering: a survey. Res. J. Appl. Sci. Eng. Technol. 4(18), 3274–3282 (2012)
20. Al-azawi, M.: Image thresholding using histogram fuzzy approximation. Int. J. Comput. Appl.
83(9), 36–40 (2013)
21. Nakib, A., Roman, S., Oulhadj, H., Siarry, P.: Fast brain MRI segmentation based on two-
dimensional survival exponential entropy and particle swarm optimization. In: International
Conference of the IEEE EMBS. Lyon, France, pp. 5563–5566, 23–26 Aug 2007. doi:10.1109/
IEMBS.2007.4353607
22. Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 9(1), 62–66 (1979)
23. Yao, C., Chen, H.J.: Automated retinal blood vessels segmentation based on simplified PCNN
and fast 2D-Otsu algorithm. J. Cent. S. Univ. Technol. 16(4), 640–646 (2009)
24. Huang, D.Y., Wang, C.H.: Optimal multi-level thresholding using a two-stage Otsu
optimization approach. Pattern Recogn. Lett. 30(3), 275–284 (2009)
25. Wu, B.F., Chen, Y.L., Chiu, C.C.: Recursive algorithms for image segmentation based on a
discriminant criterion. Int. J. Sig. Process. 1, 55–60 (2004)
26. Hammouche, K., Diaf, M., Siarry, P.: A comparative study of various meta-heuristic
techniques applied to the multilevel thresholding problem. Eng. Appl. Artif. Intell. 23(5),
676–688 (2010)
27. Hammouche, K., Diaf, M., Siarry, P.: A multilevel automatic thresholding method based on a
genetic algorithm for a fast image segmentation. Comput. Vis. Image Underst. 109(2),
163–175 (2008)
28. Tao, W.B., Tian, J.W., Liu, J.: Image segmentation by three-level thresholding based on
maximum fuzzy entropy and genetic algorithm. Pattern Recogn. Lett. 24(16), 3069–3078
(2003)
29. Yang, Z., Pu, Z., Qi, Z.: Relative entropy multilevel thresholding method based on genetic optimization. In: The 2003 IEEE International Conference on Neural Networks and Signal Processing, Nanjing, pp. 583–586, 14–17 Dec 2003. doi:10.1109/ICNNSP.2003.1279340
30. Hancer, E., Ozturk, C., Karaboga, D.: Artificial bee colony based image clustering method. In:
IEEE International Congress on Evolutionary Computation, Brisbane, QLD, pp. 1–5, 10–15
June 2012. doi:10.1109/CEC.2012.6252919
31. Zhang, Y., Wu, L.: Optimal multi-level thresholding based on maximum Tsallis entropy via an
artificial bee colony approach. Entropy 13(4), 841–859 (2011)
32. Geng, R.: Color image segmentation based on self-organizing maps, advances in key
engineering materials. Adv. Mater. Res. 214, 693–698 (2011)
33. Bhandari, A.K., Singh, V.K., Kumar, A., Singh, G.K.: Cuckoo search algorithm and wind
driven optimization based study of satellite image segmentation for multilevel thresholding
using Kapur’s entropy. Expert Syst. Appl. 41(7), 3538–3560 (2014)
34. Gao, H., Kwong, S., Yang, J., Cao, J.: Particle swarm optimization based on intermediate
disturbance strategy algorithm and its application in multi-threshold image segmentation. Inf.
Sci. 250(20), 82–112 (2013)
35. Ghamisi, P., Couceiro, M.S., Benediktsson, J.A., Ferreira, N.M.F.: An efficient method for
segmentation of images based on fractional calculus and natural selection. Expert Syst. Appl.
39(16), 12407–12417 (2012)
36. Tillett, J., Rao, T.M., Sahin, F., Rao, R., Brockport, S.: Darwinian particle swarm
optimization. In: The 2nd Indian International Conference on Artificial Intelligence,
pp. 1474–1487 (2005)
37. Couceiro, M.S., Ferreira, N.M.F., Machado, J.A.T.: In fractional order Darwinian particle
swarm optimization. In FSS’11, Symposium on Fractional Signals and Systems, Coimbra,
Portugal, pp. 2382–2394, 4–5 Nov 2011. doi:10.1109/TGRS.2013.2260552
38. Holland, J.H.: Adaptation in Natural and Artificial Systems. The University of Michigan Press,
Ann Arbor (1975)
39. Goldberg, D.E.: Algorithmes Génétiques: Exploration, optimisation et apprentissage
automatique, Edition Wesley (1989)
40. Holland, J.H.: Genetic algorithms, pour la science. Ed. Sci. Am. 179, 44–50 (1992)
41. Man, K.F., Tang, K.S., Kwong, S.: Genetic algorithms: concepts and applications. IEEE
Trans. Industr. Electron. 43(5), 519–534 (1996)
42. Schmitt, L.M.: Fundamental study: theory of genetic algorithms. Theoret. Comput. Sci. 259
(1–2), 1–61 (2001)
43. Petrowski, A.: Une introduction à l’optimisation par algorithmes génétiques, (2001). http://
www-inf.int-evry.fr/*ap/EC-tutoriel/Tutoriel.html
44. Phulpagar, B.D., Kulkarni, S.S.: Image segmentation using genetic algorithm for four gray
classes. In: IEEE International Conference on Energy, Automation and Signal, 28–30 Dec
2011. Bhubaneswar, Odisha, pp. 1-4. doi:10.1109/ICEAS.2011.6147093
45. Phulpagar, B.D., Bichkar, R.S.: Segmentation of noisy binary images containing circular and
elliptical objects using genetic algorithms. IJCA 66(22), 1–7 (2013)
46. Janc, K., Tarasiuk, J., Bonnet, A.S., Lipinski, P.: Genetic algorithms as a useful tool for
trabecular and cortical bone segmentation. Comput. Methods Programs Biomed. 111(1),
72–83 (2013). doi:10.1016/j.cmpb.2013.03.012
An Efficient Multi Level Thresholding Method … 365
47. Manikandan, S., Ramar, K., Willjuice, I.M., Srinivasagan, K.G.: Multilevel thresholding for
segmentation of medical brain images using real coded genetic algorithm. Measurement 47,
558–568 (2014)
48. Dorigo, M., Gambardella, L.M.: Guest editorial special on ant colony optimization. IEEE
Trans. Evol. Comput 6(4), 317–319 (2002)
49. Ajith, A., Crina, G., Vitorino, R.: Stigmergic Optimization. Stud. Comput. Intel. 31, 1–299
(2006)
50. Beckers, R., Deneubourg, J.L., Goss, S.: Trails and U-turns in the selection of a path by the
Ant Lasius Niger. J. Theor. Biol. 159(4), 397–415 (1992)
51. Goss, S., Aron, S., Deneubourg, J.L., Pasteels, J.M.: Self-organized shortcuts in the argentine
ant. Naturwissenchaften 76(12), 579–581 (1989)
52. Dorigo, M., Maniezzo, V., Colorni, V.: Ant system: optimization by a colony of cooperating
agents. IEEE Trans. Syst. Man Cybern. B Cybern. 26(1), 29–41 (1996)
53. Colorni, A., Dorigo, M., Maniezzo, V.: Distributed optimization by ant colonies. In: The First
European Conference on Artificial Life. MIT Press, Paris, France, pp. 134–142, (1991)
54. Mousa, A.A., El-Desoky, I.M.: Stability of Pareto optimal allocation of land reclamation by
multistage decision-based multipheromone ant colony optimization. Swarm Evol. Comput. 13,
13–21 (2013)
55. Liang, Y.C., Yin, Y.C.: Optimal multilevel thresholding using a hybrid ant colony system.
J. Chin. Inst. Ind. Eng. 28(1), 20–33 (2011)
56. Ma, L., Wang, K., Zhang, D.: A universal texture segmentation and representation scheme
based on ant colony optimization for iris image processing. Comput. Math. Appl. 11(12),
1862–1866 (2009)
57. Tao, W., Jin, H., Liu, L.: Object segmentation using ant colony optimization algorithm and
fuzzy entropy. Pattern Recogn. Lett. 28(7), 788–796 (2007)
58. Karaboga, D.: An idea based on honey bee swarm for numerical optimization. Technical
Report TR06, Computer Engineering Department, Erciyes University, Turkey (2005)
59. Basturk, B., Karaboga, D.: An artificial bee colony (abc) algorithm for numeric function
optimization. In: IEEE Swarm Intelligence Symposium, Indianapolis, Indiana, USA, May
2006
60. Karaboga, D., Basturk, B.: On the performance of artificial bee colony (ABC) algorithm. Appl.
Soft Comput. 8(1), 687–697 (2008)
61. Karaboga, D., Basturk, B.: Artificial bee colony (ABC) optimization algorithm for solving
constrained optimization problems. In: Foundations of Fuzzy Logic and Soft Computing.
Lecture Notes in Computer Science, vol. 45(29), pp. 789–798 (2007)
62. Hadidi, A., Azad, S.K., Azad, S.K.: Structural optimization using artificial bee colony
algorithm. In: The second International Conference on Engineering Optimization. Lisbon,
Portugal, 6–9 Sept 2010
63. Tereshko, V., Loengarov, A.: Collective decision-making in honeybee foraging dynamics.
Comput. Inf. Syst. J. 9(3), 1–7 (2005)
64. Horng, M.H.: Multilevel minimum cross entropy thresholding using artificial bee colony
algorithm. Telkomnika 11(9), 5229–5236 (2013)
65. Akay, B.: A study on particle swarm optimization and artificial bee colony algorithms for
multilevel thresholding. Appl. Soft Comput. 13(6), 3066–3091 (2013)
66. Charansiriphaisan, K., Chiewchanwattana, S., Sunat, K.: A comparative study of improved
artificial bee colony algorithms applied to multilevel image thresholding. Math. Prob. Eng.,
1–17 (2013). http://dx.doi.org/10.1155/2013/927591
67. Cao, Y.F., Xiao, Y.H., Yu, W.Y., Chen, Y.C.: Multi-level threshold image segmentation based
on PSNR using artificial bee colony algorithm. Res. J. Appl. Sci. Eng. Technol. 4(2), 104–107
(2012)
366 F. Hamdaoui et al.
68. Horng, M.H., Jiang, T.W: Multilevel image thresholding selection using the artificial bee colony
algorithm. In: International Conference on Artificial Intelligence and Computational Intelligence,
Sanya, China, pp. 318–325, 23–24 Oct 2010. doi:10.1007/978-3-642-16527-6_40
69. Eusuff, M.M., Lansey, K.E.: Optimization of water distribution network design using the
shuffled frog leaping algorithm. J. Water Resour. Plan. Manag. 129(3), 210–225 (2003)
70. Duan, Q.Y., Gupta, V.K., Sorooshian, S.: Shuffled complex evolution approach for effective
and efficient global minimization. J. Optim. Theory Appl 76(3), 502–521 (1993)
71. Fang, C., Chang, L.: An effective shuffled frog-leaping algorithm for resource-constrained
project scheduling problem. Comput. Oper. Res. 39(5), 890–901 (2012)
72. Narimani, M.R.: A new modified shuffle frog leaping algorithm for non-smooth economic
dispatch. World Appl. Sci. J. 12(6), 803–814 (2011)
73. Wang, N., Li, X., Chen, X.H.: Fast three-dimensional Otsu thresholding with shuffled frog-
leaping algorithm. Pattern Recognit. Lett. Meta-heuristic Intel. Based Image Process. 31(13),
1809–1815 (2010)
74. Liong, S.Y., Atiquzzaman, M.: Optimal design of water distribution network using shuffled
complex evolution. J. Inst. Eng. 44(1), 93–107 (2004)
75. Gu, Y.J., Jia, Z.H., Qin, X.Z., Yang, J., Pang, S.N.: Image segmentation algorithm based on
shuffled frog-leaping with FCM. Commun. Technol. 2, 042 (2011)
76. Yang, C.S., Chuang, L.Y., Ke, C.H.: A combination of shuffled frog-leaping algorithm and
genetic algorithm for gene selection. J. Adv. Comput. Intell. Intell. Inf. 12(3), 218–226 (2008)
77. Horng, M.H.: Multilevel image threshold selection based on the shuffled frog-leaping
algorithm. J. Chem. Pharm. Res. 5(9), 599–605 (2013)
78. Ouadfel, S., Meshoul, S.: A fully adaptive and hybrid method for image segmentation using
multilevel thresholding. Int. J. Image Graph. Sig. Process. (IJIGSP) 5(1), 46–57 (2013)
79. Horng, M.H.: Multilevel image thresholding by using the shuffled frog-leaping optimization
algorithm. In: 15th North-East Asia Symposium on Nano Information Technology and
Reliability (NASNIT), Macao, pp. 144–149, 24–26 Oct 2011. doi:10.1109/NASNIT.2011.
6111137
80. Jiehong, K., Ma, M.: Image Thresholding Segmentation Based on Frog Leaping Algorithm
and Ostu Method. Yunnan University (Natural Science Edition), pp. 634–640 (2012)
81. Liu, J., Li, Z., Hu, X., Chen, Y.: Multiobjective optimization shuffled frog-leaping
biclustering. In: IEEE International Conference on Bioinformatics and Biomedicine
Workshops, Atlanta, pp. 151–156, 12–15 Nov 2011. doi:10.1109/BIBMW.2011.6112368
82. Bhaduri, A., Bhaduri, A.: Color image segmentation using clonal selection-based shuffled frog
leaping algorithm. In: International Conference on Advances in Recent Technologies in
Communication and Computing, ARTCom ‘09. Kottayam, Kerala, pp. 517–520, 27–28 Oct
2009. doi:10.1109/ARTCom.2009.115
83. Couceiro, M.S., Luz, J.M.A., Figueiredo, C.M., Ferreira, N.M.F., Dias, G.: Parameter
estimation for a mathematical model of the golf putting. In WACI’10, Workshop Applications
of Computational Intelligence ISEC-IPC, Coimbra, Portugal, pp. 1–8, 2 Dec 2010 (2010a)
84. Couceiro, M.S., Ferreira, N.M.F., Machado, J.A.T.: Application of fractional algoritms in the
control of a robotic bird. J. Commun. Nonlinear Sci. Numer. Simul. (Special Issue) 15(4),
895–910 (2010b)
85. Eberhart, R.C., Kennedy, J.: A new optimizer using particle swarm theory. In: 6th Symposium
on Micro Machine and Human Science, Nagoya, pp. 39–43, 4–6 Oct 1995. doi:10.1109/MHS.
1995.494215
86. Kennedy, J., Eberhart, R. C. (1995). Particle swarm optimization. In IEEE International
Conference Neural Network, 27 Nov–01 Dec 1995, Perth WA, pp. 1942–1948 (2005). doi:10.
1109/ICNN.1995.488968
87. Jiang, M., Luo, Y.P., Yang, S.Y.: Stochastic convergence analysis and parameter selection of
the standard particle swarm optimization algorithm. Inf. Process. Lett. 102(1), 8–16 (2007)
An Efficient Multi Level Thresholding Method … 367
88. Fan, J., Han, M., Wang, J.: Single point iterative weighted fuzzy C-means clustering algorithm
for remote sensing image segmentation. Pattern Recogn. 42, 2527–2540 (2009)
89. Horng, M.H.: Multilevel thresholding selection based on the artificial bee colony algorithm for
image segmentation. Expert Syst. Appl. 38(11), 13785–13791 (2011)
90. Ghamisi, P., Couceiro, M.S., Benediktsson, J.A., Ferreira, M.F.N.: An efficient method for
segmentation of images based on fractional calculus and natural selection. Expert Syst. Appl.
39(16), 12407–12417 (2012)
IK-FA, a New Heuristic Inverse
Kinematics Solver Using Firefly Algorithm
Abstract In this paper, a heuristic method based on the Firefly Algorithm is proposed for
inverse kinematics problems in articulated robotics. The proposal is called IK-FA.
Solving inverse kinematics, IK, consists in finding a set of joint positions allowing a
specific point of the system to achieve a target position. In IK-FA, the firefly
positions are assumed to be possible solutions for the joints' elementary motions. For a
robotic system with a known forward kinematic model, the IK fireflies are used to
generate iteratively a set of joint motions; then the forward kinematic model of the system
is used to compute the relative Cartesian positions of a specific end-segment, and to
compare them to the needed target position. This is a heuristic approach for solving
inverse kinematics without computing the inverse model. IK-FA tends to minimize
the distance to a target position; the fitness function is established as the
distance between the obtained forward positions and the desired one, and is subject to
minimization. In this paper IK-FA is tested over a 3-link articulated planar system;
the evaluation is based on a statistical analysis of the convergence and the solution
quality over 100 tests. The impact of key FA parameters is also investigated, with a
focus on the number of fireflies, the maximum iteration number and the
(α, β, γ, δ) parameters. For a given set of valuable
parameters, the heuristic converges to a static fitness value within a fixed maximum
number of iterations. IK-FA has a fair convergence time: for the tested configuration,
the average was about 2.3394 × 10−3 seconds with a position error fitness around
N. Rokbani (&)
High Institute of Applied Sciences and Technology, University of Sousse,
Cité Taffala (Ibn Khaldoun), Sousse 4003, Tunisia
e-mail: [email protected]
N. Rokbani A.M. Alimi
REGIM-Lab.: REsearch Groups in Intelligent Machines, University of Sfax, ENIS,
BP 1173, Sfax 3038, Tunisia
e-mail: [email protected]
A. Casals
Institute for Bioengineering of Catalonia and Universitat Politècnica de Catalunya.
BarcelonaTech., Barcelona, Spain
e-mail: [email protected]
3.116 × 10−8 for 100 tests. The algorithm also showed evidence of robustness over the
target position, since for all conducted tests with a random target position IK-FA
achieved a solution with a position error lower than or equal to 5.4722 × 10−9.
Keywords Robotics · Inverse kinematics · Heuristics · Computational kinematics ·
Swarm intelligence
1 Introduction
In humanoid robotics, legs, arms and fingers are typical articulated systems with
a similar kinematic chain, while their dynamics are different. Inverse kinematics is needed
at all levels: for leg motion planning in walking, and for arm and finger motions in
handling and grasping [2, 36]. For these reasons it is easier to plan motion in the
Cartesian frame, since it is the physical frame of the environment; an inverse
transformation is then needed to compute the required displacements in the joint frames
prior to control [24, 27]. Such a technique is also used in 3D character animation and
gaming applications. In humanoid robotics and 3D humanoid simulations, human
motion analysis has been a key source of inspiration for walking gaits, arm
motions, grasping [37, 39], body postures and biped walking [1, 3]. At this level
also, inverse kinematics is needed to generate skeleton joint motions that fit
a robotic design while satisfying a human-like motion in the Cartesian frame [16],
from a set of marked human motion primitives. Such inverse solvers should satisfy
real-time constraints and should not suffer from any singularity, which is not the
case for classical IK solvers. Classical techniques consist in finding an approximation
of the inverse kinematics of a system when the analytical expression of the
inverse is complex and difficult to compute; this class of methods includes the
pseudo-inverse methods, the Jacobian transpose, the quasi-Newton and the damped
least squares methods. Classical inverse methods are time consuming, essentially in
systems with a high number of DOF [5, 7].
In inverse kinematics based on PSO [25], or on GA such as [4, 12, 46], a
stochastic search is performed using a population in GA or a set of individuals in
PSO, ABC and ACO; each population member or individual is a possible solution of the
IK problem, here a set of joint positions. Any possible solution is ranked using a
fitness function, and the best is returned as the solution of the problem for a given
input, here a target position. The common aspect of these methods is that they try
to solve inverse kinematics by evolving iteratively a set of solutions using a limited
set of operations that mimic a natural process: swarm behavior and its social
organization in the case of PSO, ABC or ACO [9, 10, 14], while GA tends to solve a
problem using natural evolution mechanisms [21]. The design of the optimality
criterion is what makes a heuristic render an acceptable result. In IK, a trivial
objective could be used: minimizing the distance to the target position.
For applications such as humanoid gait generation, it is possible to add
constraints to obtain a solution which better fits this specific class of robotic
systems. In real-world applications, the solutions of an articulated arm or an artificial
leg should respect the mechanical design of the system. Heuristic IK methods can be
adapted to any kind of robotic system and do not need to be trained.
On the other hand, neural network [1] and neuro-fuzzy techniques tend to do the
same but after a training process, where a neural or neuro-fuzzy network is trained
on a set of joint motions and their corresponding Cartesian solutions [13, 32, 44, 45].
The training process has a direct impact on the quality of the obtained IK solver; for
these intelligent IK solvers, designing a good training set is essential. Neuro-fuzzy
solutions have the advantage of being interpretable when compared to neural-network IK solutions;
they are accurate but suffer from computing time, and could not be used in real-time
applications [22, 37].
The remainder of this paper is organized as follows. Section 2 reviews the
key issues of kinematics modeling with a focus on inverse kinematics challenges;
this section also reviews the concept of heuristic solvers with a focus on CCD, which is
a reference heuristic for IK. Section 3 starts with a review of the FA heuristic; then a
new heuristic approach for inverse kinematics based on the Firefly Algorithm, FA,
is proposed, called IK-FA. The IK-FA proposal is detailed for unconstrained
and constrained inverse kinematics problems. In Sect. 4 a set of simulation-based
experiments is detailed; the key aspects of IK-FA were subject to investigation
over a classical 3-link articulated system, and the investigations concerned the
impact of IK-FA parameters on convergence and performances. Finally the paper
ends with discussions, conclusions and perspectives.
2 Inverse Kinematics
In robotics two aspects are important, kinematics and dynamics [36, 37]. Kinematics
deals with how the motions of a mechanism are related to the relative
positions of the end effectors of the system in accordance with a reference frame [5,
37]; the motion is studied regardless of what produced it. In robotics, kinematics
analyses are needed to plan a robot motion with respect to the workspace geometric
configuration while satisfying the angular and geometric constraints that the system
could be subject to [24]. Kinematics is forward or inverse: in forward kinematics the
mechanism motions are known while the end-effector positions need to be computed;
in inverse kinematics the end-effector positions are known and the joint
motions involved to achieve them need to be computed [5, 35].
Assume that $X = (X_1, X_2, \ldots, X_l)$ is the position of an articulated body of (n) elements
subject to a set of elementary rotations and/or translations $q = (\theta_1, \theta_2, \ldots, \theta_n)$,
so that the forward kinematics is expressed by Eq. (1):

$$X = f(q) \qquad (1)$$

$$f(q) = \prod_{i=1}^{n} {}^{i-1}T_i \qquad (2)$$
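To make Eqs. (1) and (2) concrete, the following minimal sketch composes planar homogeneous transforms to obtain the end-segment position; it is an illustration only, and the function names are not from the chapter.

```python
import numpy as np

def link_transform(theta, length):
    """Homogeneous transform of one planar revolute link: a rotation
    by theta followed by a translation of `length` along the rotated
    x-axis (one factor of the product in Eq. (2))."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, length * c],
                     [s,  c, length * s],
                     [0.0, 0.0, 1.0]])

def forward_kinematics(q, lengths):
    """Chain the per-link transforms as in Eq. (2) and return the
    Cartesian end-segment position X = f(q) of Eq. (1)."""
    T = np.eye(3)
    for theta, length in zip(q, lengths):
        T = T @ link_transform(theta, length)
    return T[:2, 2]  # (x, y) of the end segment
```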
Inverse kinematics consists in finding a possible and feasible joint motion solution
allowing a robotic system, typically articulated, to achieve a pre-defined position,
called the target position. For an articulated system such as that of Fig. 1, let us denote
the joint rotations needed to produce the motion by $q = (\theta_1, \ldots, \theta_n)$; the robot position
in a Cartesian reference frame, $X = (x_1, \ldots, x_l)$, is obtained as the output of the
forward kinematics function f(q) of the system, as in (1). If we assume that the
forward kinematics is expressed using a mathematical function f(), inverse kinematics,
IK, could simply be the inverse of that function. However, considering the nature of f(),
its inverse is not necessarily defined; retrieving the inverse kinematics function depends
on the invertibility of the forward form, and the generic formulation of Eq. (3) is in most
cases difficult to compute, so some approximations are needed to retrieve the IK
models [5, 34, 35].
$$q = f^{-1}(X) \qquad (3)$$
The main problem in IK is the existence of $f^{-1}(\cdot)$; in most cases an analytic
expression of the IK function is difficult to obtain. Several computational methods
have been proposed to tackle this problem. The best-known approach is based on the
Jacobian J(q) of the forward kinematics function, introduced in Eq. (4).
Fig. 1 Simplified representation of an articulated system composed of 3 links and 3 revolute joints in the (x, y) frame, showing the distance from X3 to the target position
Using (1) and (4) the linear velocity could be expressed according to the angular
velocities of the joints angular positions by (5).
$$\frac{dX}{dt} = J(q)\,\frac{dq}{dt} \qquad (5)$$
For very limited motions, the previous equation could be expressed using a
differential form instead of derivatives; this leads to an expression of the elementary
position displacement for a given elementary joint displacement, as in (6):

$$dX = J(q)\,dq \qquad (6)$$
The inverse form of this equation, expressed in (7), allows computing the
small joint position changes for a given small relative variation of the
end-effector position. Here the problem of IK is simply transformed into computing
or finding the inverse of the Jacobian of the forward kinematics function.
Around this concept, several methods have been proposed, all belonging to the
same class of IK solutions, the Jacobian-based IK. Note that if the dimension of the
Cartesian position vector differs from the dimension of the joint angular
rotation vector q(), the J(q) matrix is rectangular and simply not invertible; it
could also suffer from singularities. For systems with a high number of DOF, the analytical
solution of inverse kinematics is difficult to express [5]. The Jacobian transpose
method uses the transpose of J(q) instead of its inverse. The pseudo-inverse method
replaces the Jacobian by its pseudo-inverse, which, while not the exact inverse of J, is
still a good approximation. The main challenge with Jacobian inverse methods is
how to compute the Jacobian, directly or iteratively [5, 7].
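As an illustration of this Jacobian-based family, the sketch below performs one damped-least-squares update using a finite-difference Jacobian; the function names and the damping value are assumptions made for the example, not the chapter's implementation.

```python
import numpy as np

def numeric_jacobian(fk, q, eps=1e-6):
    """Finite-difference approximation of J(q), useful when an
    analytic Jacobian is unavailable."""
    x0 = fk(q)
    J = np.zeros((x0.size, q.size))
    for k in range(q.size):
        dq = np.zeros_like(q)
        dq[k] = eps
        J[:, k] = (fk(q + dq) - x0) / eps
    return J

def dls_step(fk, q, x_target, damping=0.05):
    """One damped-least-squares update:
    q <- q + J^T (J J^T + lambda^2 I)^{-1} (x_target - f(q)).
    The damping term keeps the step finite near singularities."""
    J = numeric_jacobian(fk, q)
    dx = x_target - fk(q)
    A = J @ J.T + (damping ** 2) * np.eye(J.shape[0])
    return q + J.T @ np.linalg.solve(A, dx)
```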
For a given target position and knowing the current position, the classical method to
solve inverse kinematics consists in retrieving a q() using the inverse kinematics
function, as in Eq. (3). This solution exists only if the inverse kinematics function is
defined, while in most cases this function is hard to obtain and suffers
from singularities [5].
A heuristic solution to this problem consists in using a heuristic search
method to find iteratively a set of q() by reducing the position error, as in Eq. (8),
which is the distance of the end-segment to the needed target position, as shown in
Fig. 1. In the case of PSO, particle swarm optimization, the heuristic guides the
search using a set of particles, each one being a potential solution of the problem [10,
11]. The quality of a solution is evaluated using a fitness function, which naturally
quantifies the error of the obtained target versus the needed one.
$$e = \lVert x_t - f(q_i) \rVert \qquad (8)$$
In the inverse kinematics solver using PSO, IK-PSO, the fitness function is the
distance of the end-effector, or of a specific point of the system, to the target [25, 26,
28]; its general expression for a dimension (d) is given by (9):

$$\text{fitness}(i) = \sum_{i=1}^{d} (x_i - x_t)^2 = \sum_{i=1}^{d} \left( f(q)_i - x_t \right)^2 \qquad (9)$$

where (d) denotes the dimension; in the case of Fig. 1, d = 2 and the fitness
function is expressed by (10).
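In code, the fitness of Eq. (9) for the planar case reduces to the squared Euclidean distance between the forward position and the target; a minimal sketch, where fk stands for any forward model such as the one sketched earlier:

```python
import numpy as np

def ik_fitness(q, fk, x_target):
    """Squared distance of the forward position f(q) to the target,
    as in Eq. (9); the quantity is subject to minimization."""
    diff = fk(np.asarray(q, dtype=float)) - np.asarray(x_target, dtype=float)
    return float(diff @ diff)
```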
In this section we first give a brief overview of the essentials of the Firefly Algorithm,
FA, metaheuristic; then a detailed description of how FA is used to solve inverse
kinematics is given.
$$x_i = x_i + \beta\,(x_j - x_i) + \alpha\,\epsilon \qquad (11)$$
where $x_i$, $x_j$ are respectively the current positions of fireflies (i) and (j). It is
important to note here that the position update of any firefly is adjusted with regard
to its own current position and also to the current positions of all swarm individuals.
This is the main difference between the FA algorithm and PSO, where an individual
position depends on a couple of specific particles, the local best and the global best.
An individual of a FA swarm is moved towards any other with a higher brightness;
this displacement is moderated by the attractiveness coefficient β and a random
displacement (αε). The neighborhood is composed of the fireflies within the perception
field of the individual. The firefly with a lower brightness is moved towards the one
with a higher brightness [43]; a simplified pseudo-code of the FA algorithm is given in
Fig. 2.
In Eq. (11) the term β represents the attractiveness coefficient, which also depends
on the distance separating firefly (i) from firefly (j); it is expressed as in
Eq. (12):

$$\beta = \beta_0\, e^{-\gamma r_{ij}^2} \qquad (12)$$
The final term of Eq. (11) can be seen as a step size with a moderation
parameter α, where ε can be drawn randomly from a Gaussian distribution. In
FA, the brightness of a firefly, I(x), can be used as the fitness function of the
problem to optimize, as expressed in (13).
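A minimal sketch of the firefly move of Eqs. (11) and (12) could look as follows; drawing ε from a Gaussian distribution is one common choice, as noted above, and the function name is illustrative.

```python
import numpy as np

def move_firefly(x_i, x_j, alpha, beta0, gamma, rng):
    """Move firefly i toward the brighter firefly j (Eqs. (11)-(12)):
    the attractiveness beta decays with the squared distance r_ij^2,
    and alpha scales a random Gaussian step."""
    r2 = float(np.sum((x_j - x_i) ** 2))
    beta = beta0 * np.exp(-gamma * r2)
    return x_i + beta * (x_j - x_i) + alpha * rng.normal(size=x_i.shape)
```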
The brightness of an individual (i) is also subject to a natural loss when observed
from the position of an individual (j); this loss is expressed as in Eq. (14), where (r)
is the distance of firefly (j) to firefly (i) and γ the absorption coefficient.
The Firefly Algorithm is an iterative computing heuristic, and Eq. (15) refers to
the position of firefly (j) at iteration (i). The Cartesian position of the end-segment,
obtained with the joint solution of firefly (j) at iteration (i), is expressed as in (16),
where f() stands for the forward kinematics function of the system. A trivial fitness
function for the system could be expressed by the square of the Euclidean distance of
the target point to the end-segment position of the system, see Eq. (17).
$$I_j^i = \frac{1}{1 + c\,(x_t - x_j^i)^2} \qquad (18)$$
where (j) is the firefly identifier and (i) the iteration counter; the brightness is
designed so that it reaches its maximum, I = 1, as the target point is achieved. Note
that the brightness is related to the distance of a firefly to the target position, while the
firefly itself is an angular position. As in PSO, a stop is observed when the
maximum number of iterations is reached or when the fitness function is satisfied.
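Putting the pieces together, the following sketch outlines an unconstrained IK-FA loop under stated assumptions: it is not the authors' code, and δ is interpreted here as a per-iteration reduction factor on the random step α, a common FA convention that the excerpt does not define explicitly.

```python
import numpy as np

def ik_fa(fk, x_target, n_joints, n_fireflies=20, max_iter=5000,
          alpha=0.02, beta0=0.02, gamma=0.8, delta=0.997,
          tol=1e-16, seed=0):
    """Unconstrained IK-FA sketch: each firefly is a joint vector q,
    ranked by the squared distance of f(q) to the target (a brighter
    firefly has a lower error). Stops at max_iter or once the best
    fitness falls below tol."""
    rng = np.random.default_rng(seed)
    swarm = rng.uniform(-np.pi, np.pi, size=(n_fireflies, n_joints))
    fit = np.array([float(np.sum((fk(q) - x_target) ** 2)) for q in swarm])
    for _ in range(max_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if fit[j] < fit[i]:  # firefly j is brighter, move i toward it
                    r2 = float(np.sum((swarm[j] - swarm[i]) ** 2))
                    beta = beta0 * np.exp(-gamma * r2)
                    swarm[i] = (swarm[i] + beta * (swarm[j] - swarm[i])
                                + alpha * rng.normal(size=n_joints))
                    fit[i] = float(np.sum((fk(swarm[i]) - x_target) ** 2))
        alpha *= delta  # assumed role of delta: anneal the random step
        if fit.min() <= tol:
            break
    best = int(fit.argmin())
    return swarm[best], float(fit[best])
```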
Constraints could also be expressed in the Cartesian space, so that a specific (Xi)
position lies within a specific convex hull. In this case the constraint is similar to J3,
given by Eq. (21). C(x) is the convex hull of the solution space in the case of
Eq. (21); it is the convex hull excluded from the solution space in the case of Eq. (22).
In all cases, the faulty firefly is replaced by a random one respecting the needed
constraints. The pseudo-code of the constrained IK-FA is presented in Fig. 4.

For the pseudo-code of Fig. 4, the fitness function is subject to minimization; as
defined in (18), the brightness is maximal when the fitness approaches zero, meaning
that the best firefly, the brightest one, is the one with a fitness as close to zero as
possible. Note that it is also possible to code IK-FA as a minimization of the brightness
$I_j^i$ of firefly (j) at iteration (i); in that case the FA procedure should be slightly
adjusted so that the firefly with the higher brightness is moved toward the one with a
lower brightness, since a lower brightness then indicates a better solution. This
modification should be made on line (8) of the pseudo-code of Fig. 4.
4 Experimental Results
The first test bench is a generic articulated system composed of 3 links and 3
revolute joints, similar to what appears in Fig. 1; it represents a 3-DOF articulated
system that could be used as a planar model of a leg or of an arm. In the case of a leg,
the links (l1), (l2) and (l3) represent respectively the thigh, the tibia and the
foot. To apply IK-FA, we first write the forward kinematics of that system, as in
the system of equations given by (25).
$$\begin{cases}
x_1 = l_1 \cos(\theta_1) \\
y_1 = l_1 \sin(\theta_1) \\
x_2 = l_1 \cos(\theta_1) + l_2 \cos(\theta_1 + \theta_2) \\
y_2 = l_1 \sin(\theta_1) + l_2 \sin(\theta_1 + \theta_2) \\
x_3 = l_1 \cos(\theta_1) + l_2 \cos(\theta_1 + \theta_2) + l_3 \cos(\theta_1 + \theta_2 + \theta_3) \\
y_3 = l_1 \sin(\theta_1) + l_2 \sin(\theta_1 + \theta_2) + l_3 \sin(\theta_1 + \theta_2 + \theta_3)
\end{cases} \qquad (25)$$
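A direct transcription of the end-segment equations of (25) might look as follows; the default lengths are the normalized (l1, l2, l3) = (0.5, 0.3, 0.2) used in the experiments below.

```python
import numpy as np

def fk_3link(q, lengths=(0.5, 0.3, 0.2)):
    """End-segment position (x3, y3) of the planar 3-link arm, a
    direct transcription of the last two lines of Eq. (25)."""
    t1, t2, t3 = q
    l1, l2, l3 = lengths
    x3 = l1 * np.cos(t1) + l2 * np.cos(t1 + t2) + l3 * np.cos(t1 + t2 + t3)
    y3 = l1 * np.sin(t1) + l2 * np.sin(t1 + t2) + l3 * np.sin(t1 + t2 + t3)
    return np.array([x3, y3])
```

With the earlier ik_fa sketch, a call such as ik_fa(fk_3link, np.array([0.700, -0.500]), 3) would reproduce the kind of run described in this section, under the stated assumptions.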
For IK-FA, the inverse kinematics problem, relative to a given target position
$X_t = (x_t, y_t)$ for the terminal end-segment position, consists in finding joint angles
$(\theta_1, \theta_2, \theta_3)$ that minimize the distance of $(x_3, y_3)$ to the target.
All tests were performed with the target position (0.700, −0.500). The impact of
the parameters is estimated based on the mean results obtained over 100 tests for each
variant. A simulation of the 3-link articulated system is also produced for the best
solution, as in Fig. 5a.

The pair of measures used to evaluate the performances are the
fitness function and the computing time, which is related to the number of iterations
needed to converge. The fitness function used here is the square of the distance
error; this choice allowed, for some tests, fixing the position error by
controlling its square instead of computing the square root.
All test results were visualized using a Cartesian frame as in Fig. 5a, which
shows the best solution found by the end of the processing. We also systematically
plot the evolution of the fitness function, in order to see whether a convergence
behavior is observed and to evaluate the precision of the obtained solutions. A typical
plot of the fitness function for a solution appears in Fig. 5b, where the fitness is
plotted iteratively.

For general conclusions, the mean and the standard deviation over a set of 100
tests are used to evaluate the impact of the IK-FA parameters on the results. The mean
is the average of the fitness function computed for a given test configuration using
the distribution fitting tool of Matlab. This tool also allows plotting an approximation
of the probability density using a normal distribution.
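For readers without Matlab, an equivalent normal fit can be obtained with scipy; the sample values below are synthetic stand-ins, not the chapter's data.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for the per-run best fitness values of 100 runs;
# in the chapter these come from actual IK-FA executions and Matlab's
# distribution fitting tool.
rng = np.random.default_rng(1)
fitness_runs = np.abs(rng.normal(loc=1.5e-17, scale=4e-18, size=100))

mu, sigma = stats.norm.fit(fitness_runs)  # mean and std of the normal fit
print(f"fitted mean = {mu:.3e}, fitted std = {sigma:.3e}")
```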
Fig. 5 IK-FA Typical solution. a 3 links arm with a target position (0.700, −0.500), b fitness
function evolution for (α = 0.02, β = 0.02, γ = 0.8, δ = 0.997) and 20 fireflies.
Performance analyses are based on the evaluation of the fitness function of the
obtained solutions as well as on the convergence time. The test protocol consists of
statistical results over 100 tests. All tests were performed with the same target
position (0.700, −0.500); a dedicated section addresses the effect of the target position
for a specific set of good IK-FA parameters. Discussions are conducted on the effect
of each parameter.
384 N. Rokbani et al.
A typical set of solutions of the 3-link system appears in Fig. 6a for ten tests. The
target position is tagged with a red cross; the relative link sizes are respectively
(l1 = 0.5, l2 = 0.3, l3 = 0.2), obtained by dividing the real length of each link
by the length of the articulated system when all links are aligned. The target
position is fixed to (0.700, −0.500). Note that IK-FA returns the best solution found
by the end of its processing; the simulation results of Fig. 6 correspond to 10
results obtained from 10 different executions of the solver with the same set of
parameters. Result evaluations are based on the fitness function, see Fig. 6b.
IK-FA was first tested using the FA parameter set (α = 0.2, β = 0.2, γ = 0.8,
δ = 0.9); with this parameter set, a convergence attitude is observed with a fitness
mean of about 1.14735 × 10−3, this average being observed over a statistical test of
100 runs of IK-FA. This number of tests is necessary to measure the quality of the
provided solutions. The analysis of the evolution of the fitness function over several
runs shows that in all cases a solution is provided within 100 iterations for this
parameter set, see Fig. 6b. Some results have a high quality, with a fitness of about
1 × 10−18, meaning that the distance error is about 1 × 10−9; meanwhile such
solutions are far from the mean performance, since the worst result obtained for
this test is a distance error of about 10−1.

What could also be underlined here is the fast convergence time, since all results
were reported in less than 100 iterations. Meanwhile, we could not yet speak of a
stable inverse kinematics solver, due to the large range of variation of the fitness,
which is the square of the position error. This first test confirms the possible
convergence of IK-FA to high-quality solutions, even if solutions with a fitness under
1 × 10−16 were only 31 out of 100 tests. Solutions with a fitness less than 1 × 10−6
were 59 out of 100. Weak fitness solutions, equal to or higher than 1 × 10−5, were 41
out of 100 tests. This first investigation confirmed that it is possible to
achieve convergence using IK-FA; meanwhile, deeper investigations are needed to
define a good set of parameters.
Fig. 6 IK-FA solver for a 3 links system. a possible solutions for a target position (0.700,
−0.500), b evaluation of the fitness functions using (α = 0.2, β = 0.2, γ = 0.8, δ = 0.9) and 10
fireflies.
Fig. 7 Evolution of the fitness function for 10 tests, (α = 0.02, β = 0.02, γ = 0.8, δ = 0.997) and 10
fireflies
A first investigation of the impact of the maximum iteration number, for the
parameter set (α = 0.02, β = 0.02, γ = 0.8, δ = 0.997), 10 fireflies, and a target position
(0.7, −0.5), was done over 500, 1,000, 2,000, 3,000, 5,000 and 10,000 iterations. This
experiment showed that the best fitness value decreases as the maximum iteration
number increases; the best fitness for 10,000 iterations is about 1.27102 × 10−18,
and the worst result, at iteration 10,000, is 1 × 10−17.

The fitness values observed around 500, 1,000 and 2,000 iterations were decreasing but
did not show a static fitness value; this was only achieved for 5,000 iterations, and
clearly confirmed by the test with 10,000 iterations, see Fig. 8.
Using the distribution fitting tool of Matlab, the fitness is approximated by a
normal distribution with a mean of about 1.50 × 10−17. For 10,000 iterations,
a static convergence behavior is observed around 4,500 (4,489.44)
iterations. These results are confirmed by 100 tests, see Fig. 9. This experiment
confirms that a valuable balance consists in fixing the maximum iteration number to
5,000. The only conclusion that can be drawn at this level is that for this specific
set of parameters, IK-FA convergence is ensured with a fitness around 5 × 10−17
or lower within a maximum of 5,000 iterations, see Fig. 9b.
Fig. 8 Impact of the iteration number on the IK-FA convergence; Investigation for several
maximum iteration numbers ranging from 500 to 10,000
The swarm size is the number of individuals composing the swarm. The number of
fireflies is an important parameter in any swarm-based heuristic; it
has a direct impact on the processing time and also on the quality of the solutions.
In swarm-based techniques such as PSO or GA, population sizes of 10 to 60 are
commonly used [8, 38]. The investigation of the effect of the FA swarm size
conducted here is specific to the IK-FA algorithm.
The number of fireflies was varied from 10 to 60 for a fixed target point (0.700,
−0.500), the parameter set (α = 0.02, β = 0.02, γ = 0.8, δ = 0.997) and a fixed
maximum iteration number of 5,000. For each swarm size, the tests were
repeated 100 times prior to any interpretation. The fitness functions were then
subjected to a statistical investigation using the distribution fitting tool of the Matlab
Statistics Toolbox [19]. Interpretations are based on the mean and standard deviation
values of the fitness over the tests. The fitness corresponding to a given swarm
size is approximated by a normal probability density function, PDF, and the mean
is used to compare the impact of the swarm size, see Fig. 10.

For all swarm sizes ranging from 10 to 60, the fitness mean ranges respectively
from 1.27 × 10−17 to 1.79 × 10−18, as in Table 1, allowing to conclude that as the
swarm size increases, the fitness decreases; the position error, which is the square
root of the fitness, decreases accordingly, and the obtained solutions are more precise.
For a swarm size of 60 individuals, the probability to obtain a result with a
fitness of 1.5 × 10−18 is 99.8 %. For a swarm size of 10 fireflies, the mean of the
normal distribution used to approximate the results is 1.27148 × 10−17, with a
variance of 1.38555 × 10−34, and the probability to obtain a result with a fitness of
1 × 10−16 is 100 %, which could be considered as a proof of convergence of the IK-FA
algorithm. Note also that results for 50 fireflies are very close to those of 60 fireflies,
see Fig. 7, where the yellow distribution represents the results for 50 fireflies; its
mean is 2.145 × 10−18 with a variance of 3.660 × 10−36. Results for 40, 30 and 20
fireflies are also close, with respective fitness means of 3.2146 × 10−18,
4.1216 × 10−18 and 5.4093 × 10−18. The results are summarized in Table 1.

Fig. 9 Impact of the iteration number on the IK-FA convergence. a evolution of the fitness
function over 100 tests, b approximation of the fitness density of probability by a normal
distribution
Globally, we can deduce that for a swarm size of 40–60 the fitness mean ranges
from 3.21 × 10−18 to 1.79 × 10−18, with a variance ranging from
8.92 × 10−36 to 2.86 × 10−36; for a swarm size of 10 fireflies the fitness is about ten
times higher, with a mean of 1.27 × 10−17 and a variance of 1.34 × 10−34.
A comparison based on the normalized distributions of the fitness functions
over 100 tests showed that as the swarm size increased, the variance of the fitness
functions decreased; the best results are obtained with 60 fireflies, see Fig. 8, while
results with 50 and 40 fireflies are very close, see Fig. 10. Given the impact
of the swarm size on the processing time, a good balance between fitness, swarm size and
processing time is the next investigation issue.
The next investigation concerns the impact of the swarm size on the computing time;
the results reported in Table 2 concern the time needed for a fixed maximum iteration
number of 5,000. Table 3 reports the impact of the population size on the
computing time for a given position error: the IK-FA stop condition is modified so
that processing ends when the position error is less than or equal to 10−6, meaning that
the fitness function is less than or equal to 10−12. If the position error is not achieved,
the algorithm stops at its maximum iteration number, fixed to 5,000 as in the previous test.
The time values presented in Table 2 are average times observed over 100 tests.
Crossing the impact of the population size with the computing
time, it appears that a swarm size of 20 individuals is a valuable choice, since it
achieves a fitness of 5.48 × 10−18 in a computing time relatively
close to what can be obtained with a limited swarm size of 10 individuals. This choice
is confirmed when the stop condition is modified so that the swarm stops once it
achieves a desired fitness, by means of the error; details of this experiment appear
in Table 3.
In order to check the robustness of the results over the target position, 100 tests
were performed with a randomly generated target position at each attempt. The test
configuration parameters are (α = 0.02, β = 0.02, γ = 0.8, δ = 0.997), a swarm size
of 20 individuals and a maximum iteration number of 5,000. The random target
positions are generated within a circle of radius 1, as in Fig. 11a. The fitness of
each solution is returned and subjected to a statistical analysis using the distribution
fitting Matlab tool.
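A small sketch for drawing such random targets follows; the chapter only states that targets lie within a circle of radius 1, so the uniform-in-area sampling law is an assumption of the example.

```python
import numpy as np

def random_target(rng, radius=1.0):
    """Draw a random target inside a circle of the given radius.
    Taking the square root of the radial draw makes the samples
    uniform in area (assumed here; the chapter does not specify
    the sampling law)."""
    r = radius * np.sqrt(rng.uniform())
    phi = rng.uniform(0.0, 2.0 * np.pi)
    return np.array([r * np.cos(phi), r * np.sin(phi)])
```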
Statistical analysis showed that the probability to obtain a solution with a fitness
lower than 3 × 10−17 is 100 %; this means that for any random target position, IK-FA
will generate a solution with a fitness lower than 3 × 10−17. We can
conclude that for any target position within the definition space of the system,
here a circle of radius 1, an inverse kinematics solution exists, and
we are also sure at 100 % that this solution has a position error of at most 5.4722 × 10−9,
as in Fig. 11b.
After investigating the key aspects of IK-FA, a comparative test with the CCD inverse
kinematics method was conducted for a 3-link articulated system with three revolute
joints, as in Fig. 1. The test was conducted under the same conditions using IK-FA and
the CCD method. CCD was selected because it is a reference method for inverse
kinematics solvers and is assumed to be a real-time IK method; real-time methods are
time stressed and have a wide range of applications [37]. For this test, both algorithms
were asked to solve the inverse kinematics problem for the target point xt
(0.700, −0.500), for a given position error.

Results showed that for both configurations IK-FA is faster than CCD; these
results were established for position errors of 10−4 and 10−8, meaning that the IK-FA
fitnesses were respectively 10−8 and 10−16. Tests were conducted using the unconstrained
IK-FA variant. Results showed that IK-FA clearly rendered solutions in a
limited time compared to CCD, see Table 4 for details.
In this paper a new heuristic method for inverse kinematics based on the Firefly
Algorithm, IK-FA, was proposed. It is based on the firefly algorithm and the forward
kinematics model of a robotic system. The method is proposed for a human-like
articulated system, HLAS, while it could be generalized to any kind of robotic system. The
paper focused on the IK-FA convergence capacities as well as on the impact of the FA
parameters on the quality of the solutions. A set of good parameters for IK-FA was
also established.
As conclusions: IK-FA is a valuable solver for inverse kinematics, while
parameter fitting is still a challenging problem. For a given set of parameters, the
heuristic converges to a static fitness value within a fixed maximum number of
iterations, in this work about 4,500 for (α = 0.02, β = 0.02, γ = 0.8, δ = 0.997). IK-FA
has a fair convergence time: for the tested configurations, the average was about
2.3394 × 10−3 s with a position error around 3.116 × 10−8 over 100 tests. The
algorithm also showed evidence of robustness over the target position, since for all
conducted tests with randomly generated target positions, IK-FA achieved a
solution with a position error lower than or equal to 5.4722 × 10−9.
The investigation of the impact of the swarm size showed that whatever the
swarm size, from 10 to 60, IK-FA converges. Meanwhile, it has been established
in this work that as the swarm size increases, the variance of the obtained
solutions decreases. This means that the probability of finding a solution closer to
the mean is higher. When the swarm size increases, the computing time increases as well.
A balance between swarm size and computing time needs to be defined; in this
work 20 FA individuals is an interesting choice.
Further developments are needed to investigate in depth the impact of FA variants
on IK-FA. The implementation of IK-FA as the inverse kinematics solver of a
robotic system such as in [29, 31] should be introduced soon.
In this paper IK-FA was introduced as a new heuristic inverse kinematics solver
for constrained and unconstrained problems. The experimental investigations were
limited to the unconstrained variant, with an application to an articulated system
composed of 3 links and 3 revolute joints. The impact of constraints on performances
and computing time is under development.
Acknowledgment The authors would like to acknowledge the financial support of this work by
grants from General Direction of Scientific Research (DGRST), Tunisia, under the ARUB program.
References
1. Ammar, B., Chouikhi, N., Alimi, A.M., Chérif, F., Rezzoug, N., Gorce, P.: Learning to walk
using a recurrent neural network with time delay. In: Artificial Neural Networks and Machine
Learning–ICANN, pp. 511–518. Springer, Heidelberg (2013)
2. Asfour, T., Dillmann, R.: Human-like motion of a humanoid robot arm based on a closed-form
solution of the inverse kinematics problem. In: Intelligent Robots and Systems (IROS 2003),
vol. 2, pp. 1407–1412 (2003)
3. Azevedo, C., Andreff, N., Arias, S.: BIPedal walking: from gait design to experimental
analysis. Mechatronics 14(6), 639–665 (2004)
4. Buckley, K.A., Simon H., Brian C.H.T.: Solution of inverse kinematics problems of a highly
kinematically redundant manipulator using genetic algorithms. IET, pp. 264–269 (1997)
5. Buss, S.R.: Introduction to inverse kinematics with jacobian transpose, pseudoinverse and
damped least squares methods. IEEE J. Robot. Autom. 17 (2004)
6. Çavdar, T., Mohammad, M., Milani, R.A.: A new heuristic approach for inverse kinematics of
robot arms. Adv. Sci. Lett. 19(1), 329–333 (2013)
7. Chiaverini, S., Siciliano, B., Egeland, O.: Review of the damped least-squares inverse
kinematics with experiments on an industrial robot manipulator. IEEE Trans. Control Syst.
Technol. 2(2), 123–134 (1994)
8. De Jong, K.A., Spears, W.M.: An analysis of the interacting roles of population size and
crossover in genetic algorithms. In: Parallel Problem Solving from Nature, pp. 38–47.
Springer, Heidelberg (1991)
9. Dorigo, M., Birattari, M., Stutzle, T.: Ant colony optimization. IEEE Comput. Intell. Mag. 1
(4), 28–39 (2006)
10. Eberhart, R.C., Kennedy, J.: A new optimizer using particle swarm theory. In: Proceedings of the
Sixth International Symposium on Micro Machine and Human Science, vol. 1, pp. 39–43 (1995)
11. Eberhart, R.C., Shi, Y.: Particle swarm optimization: developments, applications and
resources. In: Proceedings of the 2001 Congress on Evolutionary Computation, vol. 1,
pp. 81–86 (2001)
12. Edison, E., Shima, T.: Integrated task assignment and path optimization for cooperating
uninhabited aerial vehicles using genetic algorithms. Comput. Oper. Res. 38(1), 340–356
(2011)
13. Juang, J.G.: Fuzzy neural network approaches for robotic gait synthesis. IEEE Trans. Syst.
Man Cybern. B Cybern. 30(4), 594–601 (2000)
14. Karaboga, D., Gorkemli, B., Ozturk, C., Karaboga, N.: A comprehensive survey: artificial bee
colony (ABC) algorithm and applications. Artif. Intell. Rev. 1–37 (2012)
15. Kuffner, J., Nishiwaki, K., Kagami, S., Inaba, M., Inoue, H.: Motion planning for humanoid
robots. In: Robotics Research, pp. 365–374. Springer, Heidelberg (2005)
16. Kulpa, R., Multon, F.: Fast inverse kinematics and kinetics solver for human-like figures. In:
Proceedings of Humanoids, pp. 38–43 (2005)
17. Lander, J.: Making kine more flexible. Game Developer Mag. 1, 15–22 (1998)
18. Łukasik, S., Żak, S.: Firefly algorithm for continuous constrained optimization tasks.
Computational Collective Intelligence. Semantic Web, Social Networks and Multiagent
Systems, pp. 97–106. Springer, Heidelberg (2009)
19. MATLAB Statistics Toolbox User’s Guide (2014). The MathWorks Inc. http://www.
mathworks.com/help/pdf_doc/stats/stats.pdf
20. Mohamad, M.M., Taylor, N.K., Dunnigan, M.W.: Articulated robot motion planning using ant
colony optimisation. In: 3rd International IEEE Conference on Intelligent Systems,
pp. 690–695 (2006)
21. Pant, M., Gupta, H., Narayan, G.: Genetic algorithms: a review. In: National conference on
frontiers in applied and computational mathematics (FACM-2005), Allied Publishers, p. 225,
04–05 Mar 2005
22. Pérez-Rodríguez, R., Marcano-Cedeño, A., Costa, Ú., Solana, J., Cáceres, C., Opisso, E.,
Gómez, E.J.: Inverse kinematics of a 6 DoF human upper limb using ANFIS and ANN for
anticipatory actuation in ADL-based physical neurorehabilitation. Expert Syst. Appl. 39(10),
9612–9622 (2012)
23. Pham, D.T., Castellani, M., Le Thi, H.A.: Nature-inspired intelligent optimisation using
the bees algorithm. In: Transactions on Computational Intelligence XIII, pp. 38–69. Springer,
Heidelberg (2014)
24. Pollard, N.S., Hodgins, J.K., Riley, M.J., Atkeson, C.G.: Adapting human motion for the
control of a humanoid robot. In Proceedings of IEEE International Conference on Robotics
and Automation, ICRA’02, vol. 2, pp. 1390–1397 (2002)
25. Rokbani, N., Alimi, A.M.: Inverse kinematics using particle swarm optimization, a statistical
analysis. Procedia Eng. 64, 1602–1611 (2013)
26. Rokbani, N., Alimi, A.M.: IK-PSO, PSO inverse kinematics solver with application to biped
gait generation. Int. J. Comput. Appl. 58(22), 33–39 (2012)
27. Rokbani, N., Alimi, A.M., Cherif, B.A.: Architectural proposal for an intelligent humanoid. In:
Procedings of IEEE Conference on Automation and Logistics (2007)
28. Rokbani, N., Benbousaada, E., Ammar, B., Alimi, A.M.: Biped robot control using particle
swarm optimization. In: IEEE International Conference Systems on Man and Cybernetics
(SMC), pp. 506–512 (2010)
29. Rokbani, N., Boussada, E.B., Cherif, B.A., Alimi, A.M.: From gaits to ROBOT, A Hybrid
methodology for A biped Walker. Mobile Robotics: Solutions and Challenges. In: Proceedings
of Clawar, vol. 12, pp. 685–692 (2009)
30. Rokbani, N., Cherif B.A., Alimi, A.M.: Toward intelligent biped-humanoids gaits generation.
In: Choi, B. (eds.) Humanoids. Chap 14, InTech (2009)
31. Rokbani, N., Zaidi, A., Alimi, A.M.: Prototyping a biped robot using an educational robotics
kit. In: IEEE International Conference on Education and E-learning Innovations. Sousse,
Tunisia (2012)
32. Rutkowski, L., Przybyl, A., Cpalka, K.: Novel online speed profile generation for industrial
machine tool based on flexible neuro-fuzzy approximation. IEEE Trans. Industr. Electron. 59
(2), 1238–1247 (2012)
33. Schmidt, V., Müller, B., Pott, A.: Solving the forward kinematics of cable-driven parallel
robots with neural networks and interval arithmetic. In: Computational Kinematics,
pp. 103–110. Springer, Netherlands (2014)
34. Tchoń, K., Jakubiak, J.: Endogenous configuration space approach to mobile manipulators: a
derivation and performance assessment of Jacobian inverse kinematics algorithms. Int.
J. Control 76(14), 1387–1419 (2003)
35. Tchon, K., Jakubiak, J.: Jacobian inverse kinematics. In: Advances in Robot Kinematics:
Mechanisms and Motion, p. 465 (2006)
36. Tevatia, G., Schaal, S.: Inverse kinematics for humanoid robots. In: Proceedings of IEEE
International Conference on Robotics and Automation, (ICRA’00), pp. 294–299 (2000)
37. Tolani, D., Goswami, A., Badler, N.I.: Real-time inverse kinematics techniques for
anthropomorphic limbs. Graph. Models 62(5), 353–388 (2000)
38. Van den Bergh, F., Engelbrecht, A.P.: Effects of swarm size on cooperative particle swarm
optimizers (2001)
39. Wang, R.Y., Popović, J.: Real-time hand-tracking with a color glove. In: ACM Transactions
on Graphics (TOG), ACM, vol. 28, No. 3, p. 63 (2009)
40. Xu, Q., Li, Y.: Error analysis and optimal design of a class of translational parallel kinematic
machine using particle swarm optimization. Robotica 27(1), 67–78 (2009)
41. Yang, X.S.: Firefly algorithm, Lévy flights and global optimization. In: Research and
Development in Intelligent Systems XXVI, pp. 209–218. Springer, London (2010)
42. Yang, X.S.: Firefly algorithm, stochastic test functions and design optimisation. Int. J. Bio-
Inspired Comput. 2(2), 78–84 (2010)
43. Yang, X.S.: Firefly algorithms for multimodal optimization. In: Stochastic algorithms:
foundations and applications, pp. 169–178, Springer, Heidelberg (2009)
44. Zaidi, A., Rokbani, N., Alimi, A.M.: A hierarchical fuzzy controller for a biped robot. In:
Proceedings of ICBR 2013. Sousse, Tunisia (2013)
45. Zaidi, A., Rokbani, N., Alimi, A.M.: Neuro-Fuzzy gait generator for a biped robot. J. Electron.
Syst. 2(2), 48–54 (2012)
46. Zhang, X., Nelson, C.A.: Multiple-criteria kinematic optimization for the design of spherical
serial mechanisms using genetic algorithms. J. Mech. Des. 133(1) (2011)
Computer Aided Intelligent Breast
Cancer Detection: Second Opinion
for Radiologists—A Prospective Study
Abstract Breast cancer is the common form of cancer and leading cause of
mortality among women, especially in developed countries. In western countries
about 53–92 % of the population has this disease. As with any form of cancer, early
detection and diagnosis of breast cancer can increase the survival rate. Mammog-
raphy is the current diagnostic method for early detection of breast cancer. Breast
parenchymal patterns are not stable between patients, between left and right breasts,
and even within the same breast from year to year in the same patient. Breast cancer
has a varied appearance on mammograms, from the obvious spiculated masses, to
very subtle asymmetries noted on only one view, to faint calcifications seen only
with full digital resolution or a magnifying glass. The large volume of cases
requiring interpretation in many practices is also daunting, given the number of
women in the population for whom yearly screening mammography is recom-
mended. It seems obvious that this difficult task could likely be made less error
prone with the help of computer algorithms. Computer-aided detection (CAD)
systems have been shown to be capable of reducing false-negative rates in the
detection of breast cancer by highlighting suspicious masses and microcalcifications
on mammograms. These systems aid the radiologist as a ‘second opinion’ in
detecting cancers and the final decision is taken by the radiologist. A supervised
machine learning algorithm, the Differential Evolution Optimized Wavelet Neural
Network (DEOWNN), is investigated for the detection of abnormalities in
mammograms. Differential Evolution (DE) is a population-based optimization algorithm
based on the principle of natural evolution, which optimizes real parameters and
real valued functions. By utilizing the DE algorithm, the parameters of the Wavelet
Neural Network (WNN) are optimized. To increase the detection accuracy a feature
extraction methodology is used to extract the texture based features of the abnormal
J. Dheeba (&)
Department of Computer Science and Engineering, Noorul Islam University,
Kumaracoil, Tamil Nadu, India
e-mail: [email protected]
N. Albert Singh
BSNL, Nagercoil, Tamil Nadu, India
e-mail: [email protected]
Keywords Breast cancer · Mammograms · Differential evolution · Wavelet neural
network
1 Introduction
Cancer is an abnormal growth of cells in the body. Normally, cell growth in the
body is regulated: old cells are replaced by new cells and the old cells die. But in
certain conditions a cell divides in an abnormal way, producing more cells just
like it and forming a tumor. Cancer cells often travel to other parts of the body,
where they begin to grow and form new tumors that replace normal tissue. This
process is called metastasis. It happens when the cancer cells get into the blood-
stream or lymph vessels of the body.
The estimated number of new cases each year is expected to rise from 10 million
in 2002 to 15 million by 2025, with 60 % of those cases occurring in developing
countries. Not all tumors are cancerous. Tumors that aren’t cancer are called
benign. Benign tumors can cause problems—they can grow very large and press on
healthy organs and tissues. But they cannot grow into (invade) other tissues. These
tumors are almost never life threatening. Malignant tumors are cancerous. Left
unchecked, malignant cells eventually can spread beyond the original tumor to
other parts of the body.
Among all cancers, breast cancer remains a leading cause of cancer deaths
among women in many parts of the world. Worldwide, breast cancer comprises
22.9 % of all cancers in women [1]. Breast cancer starts in
the tissues of the breast that produce milk and in the cell lining of the small milk
ducts. Breast cancer may be invasive, in which cancer cells spread from the
milk duct or lobule to other tissues in the breast, or noninvasive, in which the
cancer cells remain within the ductal system. Breast cancer is a malignant tumor
that starts in the cells of the breast. A malignant tumor is a group of cancer cells that
can grow into (invade) surrounding tissues or spread to distant areas of the body.
The disease occurs almost entirely in women, but men can get it, too [2].
To understand breast cancer, it helps to have some basic knowledge about the
normal structure of the breast. The female breast is made up mainly of lobules
(milk-producing glands), ducts (tiny tubes that carry the milk from the lobules to the
nipple), and stroma (fatty tissue and connective tissue surrounding the ducts and
lobules, blood vessels, and lymphatic vessels). Usually breast cancer either begins
in the cells of the lobules, which are the milk-producing glands, or the ducts, the
passages that drain milk from the lobules to the nipple.
Over time, cancer cells can invade nearby healthy breast tissue and make their
way into the underarm lymph nodes, small organs that filter out foreign substances
in the body. If cancer cells get into the lymph nodes, they then have a pathway into
other parts of the body [3]. The more lymph nodes with breast cancer cells, the
more likely it is that the cancer may be found in other organs as well. Because of
this, finding cancer in one or more lymph nodes often affects the treatment plan.
Benign breast lumps include fibrocystic changes, cysts, fibroadenomas,
infections and trauma.
Fibrocystic Changes
Fibrosis is the formation of scar-like (fibrous) tissue. Fibrocystic changes are any
lumpiness, thickening or swelling in a woman’s breast.
Cysts
Cysts are fluid filled lumps that can range from very tiny to about the size of an egg.
Fibroadenomas
Benign breast tumors such as fibroadenomas are abnormal growths, but they are not cancerous and do not spread outside the breast to other organs. They are not life threatening. A fibroadenoma is a solid, round, rubbery lump that moves under the skin when touched and occurs mostly in younger women.
Infections and Trauma
Infections and trauma can appear as red lumpiness in the breast skin due to bruises.
Breast cancers can be noninvasive or invasive.
Noninvasive breast cancer. Noninvasive (in situ) breast cancer is a type of cancer that does not invade nearby cells. Instead, it stays in the part of the breast where it forms. One type of noninvasive cancer, called ductal carcinoma in situ (DCIS), is considered a precancerous lesion. This means that if it were left in the body, DCIS could eventually develop into an invasive cancer.
Invasive breast cancer. This type of breast cancer invades the nearby tissues of the breast and spreads to other parts of the body. The cancer cells can then travel to other parts of the body, such as the lymph nodes. If the breast cancer is stage I, II, III or IV, then it is an invasive breast cancer.
The cause of breast cancer is not fully understood, and there is no immediate hope of prevention. Advances in surgery, radiotherapy, chemotherapy, and hormone therapy have achieved only small increases in survival. One reason for this is that effective treatment is related to the stage at which the disease is detected and treated.
Early detection will not prevent breast cancer, but it can help find it when the
likelihood of successful treatment is greatest. In general, the earlier the detection
and treatment the better the chance of survival. Early breast cancer usually does not
cause symptoms. Prognosis and survival rate varies greatly depending on cancer
type, staging and treatment.
Breast cancer screening includes tests to detect breast cancer at an early stage,
before a woman discovers a lump. The chance of dying from breast cancer has
declined by about a third over the past few decades. Screening refers to tests and
exams used to find a disease, like cancer, in people who do not have any symptoms.
The goal of screening exams, such as mammograms, is to find cancers before they
start to cause symptoms. Breast cancers that are found because they can be felt tend
to be larger and are more likely to have already spread beyond the breast. In
contrast, breast cancers found during screening exams are more likely to be small
and still confined to the breast. The size of a breast cancer and how far it has spread
are important factors in predicting the prognosis for a woman with this disease.
Most doctors feel that early detection tests for breast cancer save many thou-
sands of lives each year, and that many more lives could be saved if even more
women and their health care providers took advantage of these tests. Following the
American Cancer Society’s guidelines for the early detection of breast cancer
improves the chances that breast cancer can be diagnosed at an early stage and
treated successfully.
Mammography
Mammography plays a major role in early detection of breast cancers, detecting about 75 % of cancers at least a year before they can be felt. It is estimated that 48 million mammograms are performed each year in the US. Mammography is a special type of X-ray imaging used to create detailed images of the breast. An illustration of a digital mammogram image is shown in Fig. 1; the arrow marks the possible abnormality in the mammogram image.
Breast abnormalities that can indicate breast cancer are masses, calcifications, and architectural distortion. Two main types of feature need to be recognized in mammograms: soft-tissue masses of the order of 1 cm in diameter, usually with only very subtle differences in density from the surrounding normal structures, and small (of the order of 0.1 mm) irregularly shaped microcalcifications, which can be associated with malignant disease.
A mass is defined as a space occupying lesion seen in at least two different
projections. If a potential mass is seen in only a single projection it should be called
‘Asymmetry’ or ‘Asymmetric Density’ until its three-dimensionality is confirmed.
Masses vary in density, margin and shape [5]. Benign
masses generally cause no skin change and are smooth, soft to firm, and mobile,
with well-defined margins. Diffuse, symmetric thickening, which is common in the
upper outer quadrants, may indicate fibro-cystic changes. Malignant masses gen-
erally are hard, immobile, and fixed to surrounding skin and soft tissue, with poorly
defined or irregular margins. However, mobile or nonfixed masses can be
cancerous.
2 Related Works
A great deal of research on CAD systems for breast cancer and on intelligent techniques for improving classification accuracy has been conducted in the last few decades [2, 18, 19]. Different studies have demonstrated that Computer Aided Detection (CAD) of breast cancer can improve the detection rate by 4.7 to 19.5 % compared to radiologists alone. Regarding classification of abnormalities in mammograms, a number of techniques have been presented that use machine learning approaches to classify samples as normal or abnormal.
Anna et al. [20] investigated multi-scale texture properties of the tissue surrounding microcalcifications (MCs) for breast cancer diagnosis using a probabilistic neural network. Azar and El-Said [21] used a probabilistic neural network for breast
cancer classification. Gray-level first-order statistics, gray-level co-occurrence matrix features, and Laws texture energy measures were extracted from the original images. The classifying power of these features was analyzed using a probabilistic neural network, achieving an overall accuracy of 90 % when using Laws texture features in the classification of 85 mammogram images.
Mudigonda et al. [22] proposed a segmentation method for finding suspected mass regions in mammograms. Li et al. [23] presented a statistical-model-supported approach for enhanced segmentation and extraction of suspicious mass areas from mammographic images. Gao et al. [24] used a preprocessing technique to improve mass detection. Mudigonda et al. [25] used gradient- and texture-based features to detect malignant masses in mammograms. Suliga et al. [26] proposed a Markov random field (MRF) based technique that is suitable for performing clustering in an environment described by limited data. Grim et al. [27] demonstrated a preprocessing model based on local statistical texture for screening mammograms. Preprocessing is done to emphasize diagnostically important details of suspicious regions in mammograms, and a log-likelihood image is computed to carry the information needed to identify malignant tumor locations.
Yu and Huang [19] used a wavelet filter to detect all the suspicious regions using the mean pixel value. Heine et al. [28] proposed a multiresolution statistical method for identifying clinically normal tissue in digitized mammograms. Kupinski and Giger [29] presented a radial-gradient-index-based algorithm and a probabilistic algorithm for detecting lesions in digital mammograms. Nakayama et al. [30] proposed a microcalcification detection mechanism based on shape features. Cheng et al. [31] proposed a novel approach for detecting microcalcification clusters of arbitrary shape in mammograms of breasts with various densities. Wang and Karayiannis [32] presented an approach based on wavelet features for the detection of microcalcifications in mammograms.
Eltoukhy et al. [33] proposed constructing and evaluating a supervised classifier for mammograms using multiscale curvelet transform coefficients. Verma et al. [34] investigated a novel soft-cluster-based direct learning classifier which creates soft clusters within a class and learns using direct calculation of weights. Teo et al. [35] showed that the early-time backscatter response obtained using an ultra-wideband radar system has the potential for lesion classification: a rough lesion with multiple spicules has more significant scattering points than a lesion with a compact shape. Peng et al. [36] presented a novel algorithm for the detection of microcalcifications using stochastic resonance (SR) noise. Tsui et al. [2] proposed a novel method of 2-D analysis that describes the contour using the B-mode image and the scatterer properties using the Nakagami image, which may provide useful clues for classifying benign and malignant tumors.
Among existing CAD techniques, the main problem of developing an acceptable
CAD system is inconsistent and low classification accuracy. In order to improve the
training process and accuracy, this chapter investigates novel intelligent classifiers
that use texture information as input to classify the normal and abnormal tissues in
mammograms. Moreover, the intelligent machine learning classifiers are optimized
using heuristic algorithms for finding appropriate hidden neurons, learning rate and
momentum constant during the training process.
3 Database Description
4 Methods
Fig. 2 Block diagram of the DEOWNN based breast cancer detection system
double reading in screening mammography for increasing the detection rate [38,
39]. A computer system used for detecting breast cancer can serve as a second reader to increase the sensitivity rate.
The screen-film mammograms are digitized, and the digital mammogram image is then analyzed by the CAD system, which marks areas of concern such as suspicious calcifications, masses, architectural distortions and bilateral asymmetry. The CAD system evaluates the tissue textures in the mammogram and highlights the potential suspicious regions. This allows the radiologists to pay particular attention to those regions and make the final conclusion regarding the pathology.
The proposed CAD system is based on a pattern recognition system which
intelligently identifies the abnormal regions. CAD schemes using digital image
processing techniques have the goal of improving the detection performance.
Typically CAD systems are designed to provide a “second opinion” to aid rather
than replacing the radiologist. Figure 2 shows the proposed approach for detection
of abnormality in mammograms. The general approach of CAD for breast cancer
detection in mammograms involves three stages (a structural code sketch follows the list):
1. Preprocessing.
2. Feature Extraction.
3. Classification.
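As a structural illustration of these three stages, the following minimal Python sketch wires them together. Every helper here is a toy stand-in for illustration, not the authors' implementation.

```python
# Minimal structural sketch of the three-stage CAD pipeline above.
# All helpers are toy stand-ins, not the authors' implementation.
import numpy as np

def preprocess(mammogram: np.ndarray) -> np.ndarray:
    """Stage 1: normalize intensities of the region of interest (toy version)."""
    roi = mammogram.astype(float)
    return (roi - roi.min()) / (np.ptp(roi) + 1e-9)

def extract_features(roi: np.ndarray) -> np.ndarray:
    """Stage 2: texture features (here, trivial first-order statistics)."""
    return np.array([roi.mean(), roi.std(), roi.max()])

def classify(features: np.ndarray, threshold: float = 0.25) -> bool:
    """Stage 3: flag suspicious tissue (toy threshold rule on local variance)."""
    return bool(features[1] > threshold)

suspicious = classify(extract_features(preprocess(np.random.rand(64, 64))))
```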
Mammographic image analysis using a CAD system is an extremely challenging task. First, since CAD systems are computer-directed, there is a need for a flawless system. Second, the large variability in the appearance of abnormalities makes this a very difficult image analysis task. Finally, abnormalities are often occluded or hidden in dense breast tissue, which makes detection difficult. Hence, there is a need for analysing the texture features of the mammogram and for an intelligent classifier to classify the abnormalities (malignant tissues) in the mammograms.
4.1 Pre-processing
area. The mass intensity distribution of the ROI image is shown in Fig. 3b. The
mass regions are visualized using the 3-D intensity distribution in Fig. 3c, where they appear as high-intensity peaks.
the full size input. In pattern recognition, "relevant" information about an object is extracted via experiments, and these measurements (features) are used to classify the object [41].
Features such as shape, texture, color, etc. are used to describe the content of the image [42]. In most cases, medical images carry less information than color images: they are usually low-resolution, high-noise images that are difficult to analyze automatically for feature extraction. Medical images acquired with different devices, even using the same modality, may have significantly varying properties. Moreover, because color and intensity are not as important in medical images as in photographs, texture analysis becomes crucial in medical imaging. Texture refers to visual patterns which have properties of homogeneity and cannot result from the presence of only a single color or intensity [43]. Texture perception plays an important role in the human visual system of recognition and interpretation.
Texture contains important information that is used by humans for the inter-
pretation and the analysis of many types of images. Texture refers to the spatial
interrelationships and arrangement of the basic elements of an image. Visually,
these spatial interrelationships and arrangements of the image pixels are seen as
variations in the intensity patterns or gray tones. Therefore, texture features have to
be derived from the gray tones of the image. Texture has been one of the most important characteristics used to classify and recognize objects and to find similarities between images in databases. Such texture-analysis methods have been widely used in pattern recognition. The
employment of texture features in medical imaging, especially in mammograms, has proved to be valuable. Texture features of an image to be classified are often
used as inputs to a CAD system for discriminating between normal and abnormal
tissues. The goal of texture classification then is to produce a classification map of
the input image where each uniform textured region is identified with the texture
class it belongs to.
The textural properties computed are closely related to the application domain to
be used and becomes a vital property in medical imaging. Sutton and Hall [44]
discuss the classification of pulmonary disease using texture features. Harms et al.
[45] used image texture features to diagnose leukemic malignancy in samples of
stained blood cells. Insana et al. [46] used textural features in ultrasound images to
estimate tissue scattering parameters.
The three main approaches of pattern recognition for feature extraction and classification, based on the type of features, are: (1) the statistical approach, (2) the syntactic or structural approach, and (3) the spectral approach. In the statistical approach, a pattern/texture is defined by a set of statistically extracted features represented as a vector in a multidimensional feature space. The statistical features can be based on first-order, second-order, or higher-order statistics of the gray levels of an image. In the syntactic approach, texture is defined by texture primitives, which are spatially organized according to placement rules to generate the complete pattern. In syntactic pattern recognition, a formal analogy is drawn between the structure of a pattern and the syntax of a language. In the spectral approach, textures are defined by spatial frequencies and are evaluated by the autocorrelation function of a texture.
Feature-based methods characterize a texture as a homogeneous distribution of feature values such as gray-level co-occurrence matrix (GLCM), Laws texture energy (LAWS) and Gabor (GABOR) features. GLCM was introduced by Haralick [47]; a co-occurrence matrix describes how often one gray level appears in a specified spatial relationship to another gray level. The parameters used for constructing a GLCM are d and θ, where d is the distance between the two gray levels along a given direction θ. Haralick et al. [48] applied such textural features to image classification.
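To make the construction concrete, the following is a minimal NumPy sketch of a GLCM for distance d along the horizontal direction (θ = 0). It is an illustration only, assuming the image is first quantized to a small number of gray levels; libraries such as scikit-image provide production implementations.

```python
# Minimal NumPy sketch of a gray-level co-occurrence matrix for displacement
# d along direction theta = 0 (horizontal neighbour). Illustrative only.
import numpy as np

def glcm_horizontal(img: np.ndarray, d: int = 1, levels: int = 8) -> np.ndarray:
    # Quantize to a small number of gray levels to keep the matrix compact.
    q = (img.astype(float) / (img.max() + 1e-9) * (levels - 1)).astype(int)
    glcm = np.zeros((levels, levels), dtype=float)
    # Count pairs (pixel, neighbour d steps to the right).
    for a, b in zip(q[:, :-d].ravel(), q[:, d:].ravel()):
        glcm[a, b] += 1
    return glcm / glcm.sum()   # normalize to joint probabilities

P = glcm_horizontal(np.random.randint(0, 256, (32, 32)))
```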
Typically, texture measures such as the mean, variance, etc. are concatenated into a single feature vector, which is fed to a classifier to perform classification. In this way, much of the important information contained in the whole distribution of the feature values might be lost. MC clusters usually appear as a few pixels with brighter intensity embedded in a textured background of breast tissue [37]. By effectively extracting the texture information within any ROI of the mammogram, regions with and without MCs can be differentiated. Laws Texture Energy Measures (LTEM) have proven to be a successful method for highlighting high-energy points in the image [49]. Anna et al. [20] suggest that LTEM provides the best features for analyzing tissue texture for breast cancer diagnosis; with the basic feature set, the accuracy achieved using LTEM is 90 %.
The texture energy measures developed by Kenneth Ivan Laws at the University
of Southern California have been used for many diverse applications [49, 50].
These texture features are used to extract Laws texture energy measures from the
ROI containing abnormality and normal tissue patterns. These measures are com-
puted by first applying small convolution kernels to the ROI and then performing a
windowing operation.
A set of nine 5 × 5 convolution masks is used to compute texture energy, which
is then represented by a vector of nine numbers for each pixel of the image being
analyzed. The 2-D convolution kernels for texture discrimination are generated
from the following set of 1-D convolution kernels of length five. The texture
descriptions used are level, edge, spot, wave and ripple.
L5 = [ 1  4  6  4  1 ]   (Level)
E5 = [ −1 −2  0  2  1 ]  (Edge)
S5 = [ −1  0  2  0 −1 ]  (Spot)
W5 = [ −1  2  0 −2  1 ]  (Wave)
R5 = [ 1 −4  6 −4  1 ]   (Ripple)
From the above 1-D convolution kernels, 25 different two-dimensional convolution kernels are generated by convolving a vertical 1-D kernel with a horizontal 1-D kernel. An example of generating a 2-D mask from the 1-D kernels is given below.
For example, the E5L5 mask is the outer product of E5 (as a column vector) with L5 (as a row vector):

$$E5L5 = \begin{bmatrix} -1 \\ -2 \\ 0 \\ 2 \\ 1 \end{bmatrix} \begin{bmatrix} 1 & 4 & 6 & 4 & 1 \end{bmatrix} = \begin{bmatrix} -1 & -4 & -6 & -4 & -1 \\ -2 & -8 & -12 & -8 & -2 \\ 0 & 0 & 0 & 0 & 0 \\ 2 & 8 & 12 & 8 & 2 \\ 1 & 4 & 6 & 4 & 1 \end{bmatrix}$$
The following steps describe how the texture energy measures are computed for each pixel in the ROI of a mammogram image.
Step 1: Apply the two dimensional mask to the preprocessed image i.e. the ROI to
get F(i, j), where F(i, j) is a set of 25 N × M features.
Step 2: To generate the LTEM at the pixel, a non-linear filter is applied to F(i, j).
The local neighbourhood of each pixel is taken and the absolute values of
the neighbourhood pixels are summed together. A 15 × 15 square matrix is
taken for doing this operation to smooth over the gaps between the texture
edges and other micro-features. The non-linear filter applied is

$$E(x, y) = \sum_{j=-7}^{7} \sum_{i=-7}^{7} \left| F(x+i, y+j) \right| \qquad (2)$$
By applying the above Eq. (2), energy features per pixel are obtained, and the TEM images are formed from them.
Step 3: The texture features obtained from Step 2 are normalized to zero mean.
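The pipeline of Steps 1–3 can be sketched compactly in Python. The function below is a hypothetical helper, not the authors' code: it builds the 25 masks as outer products of the 1-D kernels, filters the ROI, and applies the 15 × 15 absolute-sum window of Eq. (2).

```python
# Sketch of Laws texture energy extraction (Steps 1-3); illustrative only.
import numpy as np
from scipy.ndimage import convolve

KERNELS_1D = {
    "L5": np.array([ 1.,  4., 6.,  4.,  1.]),
    "E5": np.array([-1., -2., 0.,  2.,  1.]),
    "S5": np.array([-1.,  0., 2.,  0., -1.]),
    "W5": np.array([-1.,  2., 0., -2.,  1.]),
    "R5": np.array([ 1., -4., 6., -4.,  1.]),
}

def laws_energy_maps(roi: np.ndarray) -> dict:
    """Return one zero-mean texture-energy image per 2-D mask (25 in total)."""
    maps = {}
    for vname, v in KERNELS_1D.items():
        for hname, h in KERNELS_1D.items():
            mask = np.outer(v, h)                        # e.g. E5L5 = E5^T * L5
            f = convolve(roi.astype(float), mask)        # Step 1: filter the ROI
            e = convolve(np.abs(f), np.ones((15, 15)))   # Step 2: Eq. (2) window
            maps[vname + hname] = e - e.mean()           # Step 3: zero-mean
    return maps
```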
4.3 Classification
classification accuracy of the WNN classifier, DE is used to tune the initial network parameters. Differential Evolution (DE) is a stochastic, population-based optimization algorithm introduced by Storn and Price [54].
5 Preliminaries
Wavelet Neural Networks (WNN) are an efficient model for nonlinear pattern recognition [55, 56]. The wavelet transformation technique is used for obtaining information from signals that are aperiodic, noisy, intermittent or transient. Among many artificial intelligence methods, the feedforward ANN is the most widely used statistical tool designed to diagnose pathological images, especially of cancers and precancers. Thus, this study strengthens the foundation of ANN in CAD applications by combining wavelet multiscale theory with neural networks to obtain a novel high-performance network, the Wavelet Neural Network. These networks are a powerful tool for approximation and deal effectively with high-dimensional modeling problems.
A Wavelet Neural Network was first introduced by Zhang and Benveniste [56] as a class of feedforward networks composed of wavelets. The discrete wavelet transform is used for analysing and synthesising the feedforward neural network. The wavelet network uses wavelet activation functions and preserves the universal approximation property.
A WNN is a feedforward neural network with an input layer, a hidden layer and an output layer. The hidden layer normally comprises wavelet activation functions, and the output layer comprises linear activation functions. The output layer of the WNN represents the weighted sum of the hidden layer units, i.e. the wavelet basis functions. The backpropagation learning algorithm is used to update the network weights and to minimize the standard Mean Square Error (MSE) of the network's approximation after network construction.
The wavelet transform maps a signal into components of different frequencies, allowing each component to be studied separately. The basic idea of the wavelet transform is mapping the signal from one basis to another. Wavelets depend on two variables, scale (or frequency) and time, and involve two functions, the scaling function and the mother wavelet. Wavelets are powerful signal analysis tools that can approximately realize time-frequency analysis using a mother wavelet, which has a square window in the time-frequency space. The two well-known types are the Continuous Wavelet Transform (CWT), which deals with functions defined over the whole real axis, and the Discrete Wavelet Transform (DWT), which operates over a range of integers (t = 0, 1, …, N − 1), where N is the number of values in the time series.
A wavelet is a real- or complex-valued function ψ(·) satisfying the following two conditions:

1. $\int_{-\infty}^{\infty} \psi(u)\, du = 0$

2. $\int_{-\infty}^{\infty} \psi^{2}(u)\, du = 1$

The dilated and translated versions of the mother wavelet are

$$\psi_{k,t}(u) = \frac{1}{\sqrt{k}}\, \psi\!\left(\frac{u - t}{k}\right) \qquad (3)$$

where k > 0, t is finite, and $\lVert \psi_{k,t} \rVert = \lVert \psi \rVert$ for all k, t.
The CWT uses a large number of dilations and translations of the mother wavelet, and hence contains a lot of redundancy. The Discrete Wavelet Transform operates on a discretely sampled function or time series x(·), with time t = 0, 1, …, N − 1 taken to be finite. The dilations, denoted by k, are of the form $2^{j-1}$, j = 1, 2, 3, …, and the translation values are sampled at $2^{j}$ intervals when analysing within a dilation of $2^{j-1}$. The DWT samples at discrete times and scales to reduce redundancy. The DWT is a system of two filters: the wavelet filter and the scaling filter. The wavelet filter is a high-pass filter, and the scaling filter is a low-pass filter.
The input signal X(z) is split by two filters H0(z) and H1(z) into a low pass
component X0 and a high pass component X1, both of which are decimated (down-
sampled) by 2:1. To reconstruct the signal, a pair of reconstruction filters G0(z) and G1(z) is used, and the filters are usually designed so that the output signal Y(z) is identical to the input X(z). A Haar wavelet is the simplest type of wavelet. In
discrete form, Haar wavelets are related to a mathematical operation called the Haar
transform. The Haar transform serves as a prototype for all other wavelet trans-
forms. Like all wavelet transforms, the Haar transform decomposes a discrete signal
into two subsignals of half its length. One subsignal is a running average or trend;
the other subsignal is a running difference or fluctuation. A major problem in the
development of wavelets during the 1980s was the search for scaling functions that
are compactly supported, orthogonal, and continuous. These scaling functions were
first constructed by Daubechies [57] that created great excitement in the wavelet
research world.
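To make the trend/fluctuation split concrete, a single Haar analysis step can be sketched as follows. This is an illustration assuming an even-length signal, not code from the chapter.

```python
# One level of the discrete Haar transform (sketch): the signal is split into
# a running average ("trend") and a running difference ("fluctuation"),
# each half the original length.
import numpy as np

def haar_step(x: np.ndarray):
    x = x.astype(float)
    even, odd = x[0::2], x[1::2]
    trend = (even + odd) / np.sqrt(2)        # low-pass / scaling output
    fluctuation = (even - odd) / np.sqrt(2)  # high-pass / wavelet output
    return trend, fluctuation

a, d = haar_step(np.array([4, 6, 10, 12, 8, 6, 5, 5]))
# Perfect reconstruction: even = (a + d)/sqrt(2), odd = (a - d)/sqrt(2).
```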
The Daubechies wavelet transforms are defined in the same way as the Haar wavelet transform, by computing running averages and differences via scalar products with scaling signals and wavelets; the only difference lies in how these scaling signals and wavelets are defined. The Daubechies
wavelets are a family of orthogonal wavelets defining a discrete wavelet trans-
form and characterized by a maximal number of vanishing moments for some
given support. This wavelet type has balanced frequency responses but non-linear
phase responses. Daubechies wavelets use overlapping windows, so the high fre-
quency coefficient spectrum reflects all high frequency changes. Therefore
Daubechies wavelets are useful in compression and noise removal of audio signal
processing. Daubechies 4-tap wavelet has been chosen for this implementation.
Daubechies [57] discovered a class of wavelets which are characterised by orthonormal basis functions; that is, the mother wavelet is orthonormal to each function obtained by shifting it by multiples of $2^{j}$ and dilating it by a factor of $2^{j}$ (where j ∈ Z). The Daubechies wavelet 'db4' is a four-term member of this class. The four scaling function coefficients, which solve the corresponding simultaneous equations for N = 4, are:
$$h_0 = \frac{1 + \sqrt{3}}{4\sqrt{2}}, \qquad h_1 = \frac{3 + \sqrt{3}}{4\sqrt{2}}, \qquad h_2 = \frac{3 - \sqrt{3}}{4\sqrt{2}}, \qquad h_3 = \frac{1 - \sqrt{3}}{4\sqrt{2}}$$
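As a quick numerical sanity check (a sketch, not part of the chapter), these four coefficients satisfy the standard conditions for an orthonormal scaling filter, $\sum_k h_k = \sqrt{2}$ and $\sum_k h_k^2 = 1$:

```python
# Numerical check of the db4 scaling coefficients (sketch).
import math

s3, s2 = math.sqrt(3), math.sqrt(2)
h = [(1 + s3) / (4 * s2), (3 + s3) / (4 * s2),
     (3 - s3) / (4 * s2), (1 - s3) / (4 * s2)]

assert abs(sum(h) - s2) < 1e-12                  # sum(h_k) = sqrt(2)
assert abs(sum(c * c for c in h) - 1) < 1e-12    # sum(h_k^2) = 1
```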
The scaling function for the db4 wavelet can be built up recursively from these coefficients. The wavelet function can be built from the coefficients $g_k$, which are found using the relation $g_k = (-1)^k h_{4-k-1}$. Wavelet Neural Networks combine the
theory of wavelets and neural networks into one. A wavelet neural network gen-
erally consists of a feed-forward neural network, with one hidden layer, whose
activation functions are drawn from an orthonormal wavelet family. The structure
of a wavelet neural network is very similar to that of a (1 + 1/2) layer neural
network. That is, a feed-forward neural network, taking one or more inputs, with
one hidden layer and whose output layer consists of one or more linear combiners
or summers as shown in Fig. 4. The hidden layer consists of neurons, whose
activation functions are drawn from a wavelet basis. These wavelet neurons are
usually referred to as wavelons.
Fig. 4 Structure of a wavelet neural network: inputs x1, …, xd feed a hidden layer of wavelons ψ1, …, ψj, whose outputs are combined linearly into the outputs y1, …, yj
The WNN consists of three layers: input layer, hidden layer and output layer. All the units in each layer are fully connected to the nodes in the next layer. The input layer receives the input variable $X = [x_1, x_2, \ldots, x_d]^T$ and sends it to the hidden layer. The nodes in this layer are given as the product of the jth multi-dimensional wavelet with N input dimensions:

$$\psi_j(x) = \prod_{i=1}^{N} f_{k,t}(x_i) \qquad (4)$$

where $f_{k,t}$ is the activation function of the hidden layer, and k and t are the dilations and translations, respectively.
The products of the hidden layer are then propagated to the output layer, where the output of the WNN is the linear combination of the weighted sum of the hidden layer, represented in the form

$$y_j = \sum_{i} w_{ij}\, \psi_i(x) + b_j \qquad (5)$$

where $b_j$ is the bias of node j between the hidden layer and the output layer, and $\psi_i(x)$ is taken as a Daubechies mother wavelet. The error is calculated by finding the difference between the target output ($d_j$) and the actual output ($y_j$). This error is then used to change the weights in such a way that the error gets smaller, and the process is repeated until the error is minimal. Owing to the localized wavelet activation functions in the hidden layer of the WNN, the connection weights associated with the hidden nodes can be viewed as local piecewise constant models, which leads to learning efficiency and structure transparency. The pseudocode of the training algorithm for the WNN is outlined below.
for i = 1, 2, …, n do
begin
    select the number of neurons in the hidden layer with wavelet activation function
    normalize the input (x) and the output (y) neurons
    initialize the dilation (k) and translation (t) parameters
    initialize learning rate = 0.01, momentum constant = 0.9
    initialize the weights to random values
    assign random weights between the inputs (x) and the hidden neurons
    assign random weights between the hidden neurons and the outputs (y)
end
repeat
    for each training pattern (x, y) do
    begin
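As a complementary illustration, here is a minimal sketch of the WNN forward pass of Eqs. (3)–(5) in Python. A Mexican-hat mother wavelet is used here purely for readability (the chapter itself employs the db4 wavelet), and all names are hypothetical.

```python
# Minimal WNN forward-pass sketch following Eqs. (3)-(5); illustrative only.
import numpy as np

def mexican_hat(u):
    """Mother wavelet psi(u); stands in for db4 for readability."""
    return (1.0 - u ** 2) * np.exp(-0.5 * u ** 2)

class WNN:
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.k = rng.uniform(0.5, 2.0, (n_hidden, n_in))   # dilations k > 0
        self.t = rng.uniform(-1.0, 1.0, (n_hidden, n_in))  # translations t
        self.w = rng.normal(0.0, 0.1, (n_out, n_hidden))   # output weights
        self.b = np.zeros(n_out)                           # output biases

    def forward(self, x):
        # Eq. (3): dilated/translated wavelet; Eq. (4): product over inputs.
        h = np.prod(mexican_hat((x - self.t) / self.k) / np.sqrt(self.k), axis=1)
        # Eq. (5): linear combination of the hidden-layer outputs plus bias.
        return self.w @ h + self.b

net = WNN(n_in=25, n_hidden=8, n_out=1)
score = net.forward(np.random.rand(25))   # e.g. 25 Laws texture features
```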
where G is the generation number and N is the population size. The initial population at the 0th generation should cover the entire search space. The lower and upper bounds for each parameter are defined as $X^L = (x_1^L, x_2^L, \ldots, x_D^L)$ and $X^U = (x_1^U, x_2^U, \ldots, x_D^U)$, and $rand_{i,j}[0, 1]$ denotes a uniformly distributed random number lying between 0 and 1. Each of the N parameter vectors undergoes mutation, crossover and selection.
5.2.1 Mutation
The donor vector is generated as $V_{i,G} = X_{r1,G} + F\,(X_{r2,G} - X_{r3,G})$ (the standard DE/rand/1 mutation), where the mutation factor F is a constant from [0, 2] that scales the differential variation $(X_{r2,G} - X_{r3,G})$.
5.2.2 Crossover
5.2.3 Selection
The next step of the algorithm is the selection process, which determines whether the target vector $(X_{i,G})$ or the trial vector $(U_{i,G})$ enters the next generation. The target vector is compared with the trial vector, and the one with the lower fitness value enters generation $G + 1$:
$$X_{i,G+1} = \begin{cases} U_{i,G+1} & \text{if } f(U_{i,G+1}) \le f(X_{i,G}) \\ X_{i,G} & \text{if } f(U_{i,G+1}) > f(X_{i,G}) \end{cases} \qquad i = 1, 2, \ldots, N \qquad (9)$$
where f(·) is the function to be minimized. If the new trial vector yields an equal or lower fitness value, it replaces the corresponding target vector in the next generation; otherwise, the target is retained in the population.
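The mutation, crossover and selection steps above can be sketched as a compact DE/rand/1/bin loop. This is an illustrative implementation, not the authors' code; f is any objective to minimize (for DEOWNN, the MSE defined in Eq. (10) below).

```python
# Minimal DE/rand/1/bin sketch of mutation, crossover and selection.
import numpy as np

def differential_evolution(f, bounds, N=50, F=0.8, CR=0.9, generations=100, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T      # per-parameter bounds
    D = len(lo)
    X = lo + rng.random((N, D)) * (hi - lo)       # population covers search space
    fit = np.array([f(x) for x in X])
    for _ in range(generations):
        for i in range(N):
            # Mutation: donor vector from three distinct random members.
            r1, r2, r3 = rng.choice([j for j in range(N) if j != i], 3, replace=False)
            V = X[r1] + F * (X[r2] - X[r3])
            # Binomial crossover: mix donor and target genes.
            mask = rng.random(D) < CR
            mask[rng.integers(D)] = True          # ensure at least one donor gene
            U = np.clip(np.where(mask, V, X[i]), lo, hi)
            # Selection, Eq. (9): keep the vector with the lower fitness.
            fu = f(U)
            if fu <= fit[i]:
                X[i], fit[i] = U, fu
    return X[np.argmin(fit)], fit.min()

best, best_val = differential_evolution(lambda x: float(np.sum(x**2)), [(-5, 5)] * 3)
```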
In the proposed method, DEOWNN is applied to evolve a fully connected neural network with wavelet activation functions, optimized to the best network architecture by tuning the number of neurons in the hidden layer, the learning rate and the momentum factor. Finding an optimal learning rate avoids major disruption of the direction of learning when a very unusual pair of training patterns is presented. The main advantage of using an optimal momentum factor is to accelerate the convergence of the error backpropagation algorithm. The number of neurons in the input
layer and output layer is fixed based on the problem definition. Let NI represent the size of the input layer and NO the size of the output layer. The numbers of neurons in the input and output layers are fixed and are the same for every configuration in the architecture space. The number of
hidden layers in this problem is restricted to one.

Fig. 5 Flowchart of the DE optimization: set G = 0, apply mutation, crossover and selection, and repeat until the termination criterion is met

The range of the
optimization process is defined by two range arrays $X_{\min} = \{Nh_{\min}, Lr_{\min}, Mc_{\min}\}$ and $X_{\max} = \{Nh_{\max}, Lr_{\max}, Mc_{\max}\}$, where Nh is the number of neurons in the hidden layer, Lr is the learning rate and Mc is the momentum factor. The fitness function sought for optimal training is the Mean Square Error (MSE), formulated as

$$MSE_{DEOWNN} = \sum_{p \in T} \sum_{k=1}^{N_o} \left( t_k^p - y_k^{p,o} \right)^2 \qquad (10)$$
The training patterns were taken from the MIAS database and a total of 2,050
patterns are used for training. Training of DEOWNN is done in such a way that the
desired outputs are assigned a value of 1 for cancer (abnormality) and 0 for non-cancer (normal breast tissue). The optimization of the DEOWNN classifier is performed with the learning rate and the momentum factor varied from 0 to 1 and the
hidden neurons varied from 31 to 200. For this training a maximum of 100 gen-
erations are performed with a population size N = 50 and with 500 training epochs.
The value of the mutation factor F is set to 1.2 and the crossover constant CR is set
to 0.9. During each generation, the best fitness score (minimum MSE) achieved at
the optimum dimension is stored. Using the proposed DEOWNN algorithm, an
optimized WNN is achieved with Nh = 132, Lr = 0.00127 and Mc = 0.9264.
Figure 6 shows the classification results for abnormalities in various mammogram images from the MIAS database. The results demonstrate the strength of the proposed methodology and the usefulness of the DEOWNN classifier in identifying cancerous and non-cancerous regions. Because masses and microcalcifications are the two types of objects that are the best indicators of a possible early stage of breast cancer, identifying the abnormal cells is important for increasing survival.
The DEOWNN classifier achieves a classification accuracy of 96.203 % and
AUC of 0.97843 for MIAS database, which is found to be higher than the other
optimally tuned classifier models. The evaluation results show that the DEOWNN
classifier is capable of achieving a sensitivity of 96.923 % with a specificity of
92.857 % for MIAS.
Fig. 6 Detection results of abnormalities in mammograms from the MIAS database. The top images show the original mammograms of mdb184, mdb025, mdb083 and mdb248, respectively. Detected masses are shown below: a spiculated mass, b circumscribed mass and c asymmetric mass
The DEOWNN methodology has been evaluated on a real clinical database collected from mammogram screening centres. The mammograms in the training subset were found to contain a total of 1,064 patterns, both abnormal and normal. The DEOWNN algorithm was trained with the same parameters used for analyzing the MIAS database. An optimized WNN is achieved with Nh = 114, Lr = 0.00112 and Mc = 0.9275. Testing is done on all 216 real-time clinical images.
The normal breast tissues of women below 40 years of age are much denser, which may be predicted as abnormal mass regions, leading to a high misclassification rate [58]. Different kinds of mammograms encountered in clinical applications were considered, and the experimental results reveal that masses are detected effectively even in very dense breast mammograms in most cases. Masses are characterized by their margins: the mammographic border between the mass and the normal tissue is useful for predicting benign and malignant masses. The margins of a mass are described as circumscribed, obscured, ill-defined or spiculated. Circumscribed masses have a well-defined margin. Figure 7 shows
Fig. 7 Detection results of abnormalities in mammograms. The top images show the original
mammograms. Detected masses are shown below: a circumscribed mass, b obscured mass, c spiculated mass and d ill-defined mass
detection of masses using the DEOWNN classifier and demonstrates the power of Laws texture features in discriminating abnormal and normal tissue patterns. The mammograms illustrated in Fig. 7 are denser, and the margins are obscured and not clearly seen; hence it is difficult for a radiologist to identify a mass in such dense tissue. The most difficult masses to diagnose in mammograms are the ill-defined and spiculated masses, whose shapes are irregular with ill-defined and spiculated margins.
As observed, the proposed DEOWNN scheme has achieved a sensitivity of
93.333 % at specificity level 89.474 % when applied to real clinical database. The
area under the ROC curve is analysed and found to be 0.9573 and a classification
accuracy of 92.405 % is achieved for DEOWNN classifier.
8 Performance Analysis
diagnosed as positive, and the FPR is the fraction of patients actually without the abnormality who are diagnosed as positive. The detection performance is analyzed using the area under the ROC curve (AZ). Several metrics are determined for quantitative evaluation of the intelligent classifiers. The ROC curve is the fundamental tool for diagnostic test evaluation [60]. It has become very popular in biomedical applications, particularly in radiology and imaging, and it is also used in machine learning to assess classifier performance. The true positive rate, also known as sensitivity, is the ratio of the malignant cases correctly classified to the total number of malignant cases in the test set. The false positive rate is the ratio of the number of normal cases incorrectly classified to the total number of normal cases in the test set. Sensitivity measures the proportion of actual positives which are correctly identified when the mammogram contains cancerous tissue. Specificity measures the proportion of negatives which are correctly identified when cancer is not present in the mammogram. The following statistics can be defined:
$$\text{sensitivity} = \frac{TP}{TP + FN}, \qquad \text{specificity} = \frac{TN}{TN + FP}$$

$$TPR = \text{sensitivity}, \qquad FPR = 1 - \text{specificity}$$
The Area under the ROC curve (AUC or AZ) is a measure of how well a
parameter can distinguish between two diagnostic groups (abnormal/normal tis-
sues). AUC can be interpreted as the probability that the test result from a randomly chosen diseased individual is more indicative of disease than that from a randomly chosen non-diseased individual. The overall performance of diagnostic systems has been measured and reported in terms of classification accuracy, which is the percentage of diagnostic decisions that proved to be correct. Figures 8 and 9 show the ROC curves for the classifiers using the MIAS database and the real clinical database, respectively.
The Youden index [61] is computed as $J = \text{sensitivity} + \text{specificity} - 1$.
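The reported metrics can be computed from binary decisions and continuous scores as sketched below; scikit-learn is an assumed dependency here, used only for the AUC.

```python
# Sketch of the evaluation metrics above: sensitivity, specificity and
# Youden's index J from binary decisions, plus AUC from continuous scores.
import numpy as np
from sklearn.metrics import roc_auc_score  # assumed dependency

def evaluate(y_true, y_pred, scores=None):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    sens = tp / (tp + fn)              # true positive rate
    spec = tn / (tn + fp)              # 1 - false positive rate
    out = {"sensitivity": sens, "specificity": spec, "J": sens + spec - 1}
    if scores is not None:
        out["AUC"] = roc_auc_score(y_true, scores)
    return out
```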
wavelet neural network and optimally designs the neural network using the differential evolution algorithm. This superior performance makes DEOWNN suitable for efficiently detecting abnormalities in mammograms. The proposed classifier, designed by applying the DE algorithm to the WNN, was investigated for detecting breast cancer in mammograms, and the results gave better classification accuracy than traditional classifiers. The optimized wavelet neural network accelerates the convergence of the backpropagation algorithm and avoids major disruptions in the direction of learning.
9 Conclusion
References
6. Baker, J.A., Rosen, E.L., Lo, J.Y., Gimenez, E.I., Walsh, R., Soo, M.S.: Computer-aided
detection (CAD) in screening mammography: sensitivity of commercial CAD systems for
detecting architectural distortion. Am. J. Roentgenol. 181, 1083–1088 (2003)
7. Tourassi, G.D., Floyd Jr, C.E.: Performance evaluation of an information-theoretic CAD
scheme for the detection of mammographic architectural distortion. In: Proceedings of SPIE—
The International Society for Optical Engineering, vol. 5370, pp. 59–66 (2004)
8. Lau, T.K., Bischof, W.F.: Automated detection of breast tumors using the asymmetry
approach. Comput. Biomed. Res. 24, 273–295 (1991)
9. Yin, F.F., Giger, M.L., Doi, K., Vyborny, C.J., Schmidt, R.A.: Computerized detection of
masses in digital mammograms: automated alignment of breast images and its effect on
bilateral-subtraction technique. Med. Phys. 21, 445–452 (1994)
10. Chan, H.P., Doi, K., Vyborny, C.J., Schmidt, R.A., Metz, C.E., Lam, K.L., Ogura, T., Wu, Y.,
MacMahon, H.: Improvement in radiologists' detection of clustered microcalcifications on mammograms: the potential of computer-aided diagnosis. Invest. Radiol. 25, 1102–1110 (1990)
11. Cheng, H.D., Cai, X., Chen, X., Hu, L., Lou, X.: Computer-aided detection and classification
of microcalcifications in mammograms: a survey. Pattern Recogn. 36, 2967–2991 (2003)
12. Motakis, E., Ivshina, A.V., Kuznetsov, V.A.: Data-driven approach to predict survival of
cancer patients. IEEE Eng. Med. Biol. Mag. 28, 58–66 (2009)
13. Bird, R.E., Wallace, T.W., Yankaskas, B.C.: Analysis of cancers missed at screening
mammography. Radiology 184, 613–617 (1992)
14. Azar, A.T., El-Metwally, S.M.: Decision tree classifiers for automated medical diagnosis.
Neural Comput. Appl. 23(7–8), 2387–2403 (2013). doi:10.1007/s00521-012-1196-7
15. Azar, A.T.: Statistical analysis for radiologists’ interpretations variability in mammograms.
Int. J. Syst. Biol. Biomed. Technol. (IJSBBT) 1(4), 28–46 (2012)
16. Azar, A.T., El-Said, S.A.: Performance analysis of support vector machines classifiers in
breast cancer mammography recognition. Neural Comput. Appl. 24(5), 1163–1177 (2014).
doi:10.1007/s00521-012-1324-4
17. Moftah, H.M., Azar, A.T., Al-Shammari, E.T., Ghali, N.I., Hassanien, A.E., Shoman, M.:
Adaptive k-means clustering algorithm for MR breast image segmentation. Neural Comput.
Appl. 24(7–8), 1917–1928 (2014). doi:10.1007/s00521-013-1437-4
18. Tiedeu, A., Daul, C., Kentsop, A., Graebling, P., Wolf, D.: Texture-based analysis of clustered
microcalcifications detected on mammograms. Digit. Signal Proc. 22, 124–132 (2012)
19. Yu, S.-N., Huang, Y.-K.: Detection of microcalcifications in digital mammograms using
combined model-based and statistical textural features. Expert Syst. Appl. 37(7), 5461–5469
(2010)
20. Anna, N., Ioannis, S., Spyros, G., Filippos, N., Nikolaos, S., Eleni, A., George, S., Lena, I.:
Breast cancer diagnosis: analyzing texture of tissue surrounding microcalcifications. IEEE
Trans. Inf Technol. Biomed. 12, 731–738 (2008)
21. Azar, A.T., El-Said, S.A.: Probabilistic neural network for breast cancer classification. Neural
Comput. Appl. 23(6), 1737–1751 (2013). doi:10.1007/s00521-012-1134-8
22. Mudigonda, N.R., Rangayyan, R.M., Leo Desautels, J.E.: Detection of breast masses in
mammograms by density slicing and texture flow-field analysis. IEEE Trans. Med. Imaging
20, 1215–1227 (2001)
23. Li, H., Wang, Y., Liu, K.J.R., Lo, S.C.B., Matthew, T.: Computerized radiographic mass
detection part I: lesion site selection by morphological enhancement and contextual
segmentation. IEEE Trans. Med. Imaging 20, 289–301 (2001)
24. Gao, X., Wang, Y., Li, X., Tao, D.: On combining morphological component analysis and
concentric morphology model for mammographic mass detection. IEEE Trans. Inf Technol.
Biomed. 14, 266–273 (2010)
25. Mudigonda, N.R., Rangayyan, R.M., Leo Desautels, J.E.: Gradient and texture analysis for the
classification of mammographic masses. IEEE Trans. Med. Imaging 19(10), 1032–1043
(2000)
Computer Aided Intelligent Breast Cancer Detection … 429
26. Suliga, M., Deklerck, R., Nyssen, E.: Markov random field-based clustering applied to the
segmentation of masses in digital mammograms. Comput. Med. Imaging Graph. 32, 502–512
(2008)
27. Grim, J., Somol, P., Haindl, M., Danes, J.: Computer aided evaluation of screening
mammograms based on local texture model. IEEE Trans. Image Process. 18, 765–773 (2009)
28. Heine, J.J., Deans, S.R., Cullers, D.K., Stauduhar, R., Laurence, P.: Multiresolution statistical
analysis of high-resolution digital mammograms. IEEE Trans. Med. Imaging 16, 503–515
(1997)
29. Kupinski, M.A., Giger, M.L.: Automated seeded lesion segmentation on digital
mammograms. IEEE Trans. Med. Imaging 17(4), 510–517 (1998)
30. Nakayama, R., Uchiyama, Y., Yamamoto, K., Watanabe, R., Namba, K.: Computer aided
diagnosis scheme using a filter bank for detection of microcalcification clusters in
mammograms. IEEE Trans. Biomed. Eng. 53(2), 273–283 (2006)
31. Cheng, H.-D., Lui, Y.M., Freimanis, R.I.: A novel approach to microcalcification detection
using fuzzy logic technique. IEEE Trans. Med. Imaging 17, 442–450 (1998)
32. Wang, T.C., Karayiannis, N.B.: Detection of microcalcifications in digital mammograms using
wavelets. IEEE Trans. Med. Imaging 17, 498–509 (1998)
33. Eltoukhy, M.M., Faye, I., Samir, B.B.: Breast cancer diagnosis in digital mammogram using
multiscale curvelet transform. Comput. Med. Imaging Graph. 34, 269–276 (2010)
34. Verma, B., McLeod, P., Klevansky, A.: Classification of benign and malignant patterns in
digital mammograms for the diagnosis of breast cancer. Expert Syst. Appl. 37, 3344–3351
(2010)
35. Teo, J., Chen, Y., Soh, C.B., Gunawan, E., Low, K.S., Putti, T.C., Wang, S.-C.: Breast lesion
classification using ultra wideband early time breast lesion response. IEEE Trans. Antennas
Propag. 58, 2604–2613 (2010)
36. Peng, R., Hao, C., Varshney, P.K.: Noise-enhanced detection of micro-calcifications in digital
mammograms. IEEE J. Sel. Top. Sign. Process. 3, 62–73 (2009)
37. Suckling, J., Parker, J.: The Mammographic Images Analysis Society Digital Mammogram
Database. In: Proceedings of 2nd International Workshop on Digital Mammography, UK,
pp. 375–378 (1994)
38. Anderson, E.D., Muir, B.B., Walsh, J.S., Kirkpatrick, A.E.: The efficacy of double reading
mammograms in breast screening. Clin. Radiol. 49, 248–251 (1994)
39. Thurfjell, E.L., Lernevall, K.A., Taube, A.A.: Benefit of Independent Double Reading in a
Population based Mammography Screening Program. Radiology 191, 241–244 (1994)
40. Gonzales, R.C., Woods, R.E.: Digital image processing. Prentice Hall, Upper Saddle River, NJ
(2002)
41. Gonzalez, R.C., Woods, R.E., Eddins, S.L.: Digital Image Processing Using MATLAB. Pearson
Education India (2005)
42. Tsai, D.-Y., Kojima, K.: Measurement of texture features of medical images and its
application to computer aided diagnosis in cardiomyopathy. Measurement 37, 284–292 (2005)
43. Ojala, T., Pietikainen, M., Harwood, D.: A comparative study of texture measures with
classification based feature distributions. Pattern Recogn. 29, 51–59 (1996)
44. Sutton, R.N., Hall, E.L.: Texture measures for automatic classification of pulmonary disease.
IEEE Trans. Comput. C-21, 667–676 (1972)
45. Harms, H., Gunzer, U., Aus, H.M.: Combined local color and texture analysis of stained cells.
Comput. Vis. Graphics Image Process. 33, 364–376 (1986)
46. Insana, M.F., Wagner, R.F., Garra, B.S., Brown, D.G., Shawker, T.H.: Analysis of ultrasound
image texture via generalized Rician statistics. Opt. Eng. 25, 743–748 (1986)
47. Haralick, R.M.: Statistical and structural approaches to texture. Proc. IEEE 67, 786–804
(1979)
48. Haralick, R.M., Shanmugam, K., Dinstein, I.: Textural features for image classification. IEEE
Trans. Syst. Man Cybern. SMC-3, 610–621 (1973)
49. Laws, K.J.: Texture energy measures. Proceeding DARPA Image Understanding Workshop,
pp. 47–51 (1979)
430 J. Dheeba and N. Albert Singh
50. Christodoulou, C.I., Pattichis, C.S., Pantziaris, M., Nicolaides, A.: Texture-based classification
of atherosclerotic carotid plaques. IEEE Trans. Med. Imaging 22, 902–912 (2003)
51. Chauhan, N., Ravi, V., Karthik Chandra, D.: Differential evolution trained wavelet neural
network application to bankruptcy prediction in banks. Expert Syst. Appl. 36, 7659–7665
(2009)
52. McCulloch, W.S., Pitts, W.: A logical calculus of the ideas immanent in nervous activity. Bull.
Math. Biophys. 5, 115–133 (1943)
53. Bishop, C.M.: Neural networks for pattern recognition. Oxford University Press, New York
(1995)
54. Storn, R., Price, K.: Differential evolution—a simple and efficient adaptive scheme for global
optimization over continuous spaces. J. Global Optim. 11, 341–359 (1997)
55. Zhang, J., Walter, G.G., Miao, Y., Lee, W.N.W.: Wavelet neural networks for function
learning. IEEE Trans. Signal Process. 43, 1485–1497 (1995)
56. Zhang, Q., Benveniste, A.: Wavelet networks. IEEE Trans. Neural Netw. 3, 889–898 (1992)
57. Daubechies, I.: Time-frequency localization operators: a geometric phase space approach.
IEEE Trans. Inf. Theory 34, 605–612 (1988)
58. Zheng, B., Qian, W., Clarke, L.P.: Digital mammography mixed feature neural network with
spectral entropy decision for detection of microcalcifications. IEEE Trans. Med. Imaging 15,
589–597 (1996)
59. Metz, C.E.: ROC methodology in radiologic imaging. Invest. Radiol. 21(9), 720–733 (1986)
60. Zweig, M.H., Campbell, G.: Receiver operating characteristic (ROC) plots: a fundamental
evaluation tool in clinical medicine. Clin. Chem. 39, 561–577 (1993)
61. Youden, W.J.: An index for rating diagnostic tests. Cancer 3, 32–35 (1950)