
Springer Proceedings in Mathematics & Statistics

Debdas Ghosh · Debasis Giri
Ram N. Mohapatra · Kouichi Sakurai
Ekrem Savas · Tanmoy Som Editors

Mathematics
and
Computing
ICMC 2018, Varanasi, India, January 9–11,
Selected Contributions
Springer Proceedings in Mathematics & Statistics

Volume 253
Springer Proceedings in Mathematics & Statistics
This book series features volumes composed of selected contributions from
workshops and conferences in all areas of current research in mathematics and
statistics, including operations research and optimization. In addition to an overall
evaluation of the interest, scientific quality, and timeliness of each proposal at the
hands of the publisher, individual contributions are all refereed to the high quality
standards of leading journals in the field. Thus, this series provides the research
community with well-edited, authoritative reports on developments in the most
exciting areas of mathematical and statistical research today.

More information about this series at http://www.springer.com/series/10533

Editors

Debdas Ghosh
Department of Mathematical Sciences
Indian Institute of Technology (BHU)
Varanasi, Uttar Pradesh, India

Debasis Giri
Department of Computer Science and Engineering
Haldia Institute of Technology
Haldia, West Bengal, India

Ram N. Mohapatra
Department of Mathematics
University of Central Florida
Orlando, FL, USA

Kouichi Sakurai
Faculty of Information Science and Electrical Engineering
Kyushu University
Fukuoka, Japan

Ekrem Savas
Uşak University
Uşak, Turkey

Tanmoy Som
Department of Mathematical Sciences
Indian Institute of Technology (BHU)
Varanasi, Uttar Pradesh, India

ISSN 2194-1009 ISSN 2194-1017 (electronic)

Springer Proceedings in Mathematics & Statistics
ISBN 978-981-13-2094-1 ISBN 978-981-13-2095-8 (eBook)
https://doi.org/10.1007/978-981-13-2095-8

Library of Congress Control Number: 2018950828

Mathematics Subject Classification (2010): 35-xx, 65-xx, 76-xx, 90-xx, 94-xx

© Springer Nature Singapore Pte Ltd. 2018

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made. The publisher remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Dedicated to
Pandit Madan Mohan Malaviya—The
Founder of Banaras Hindu University
Preface

The Fourth International Conference on Mathematics and Computing (ICMC 2018)
was organized in the Department of Mathematical Sciences, Indian Institute
of Technology (Banaras Hindu University), Varanasi, India, during January 9–11,
2018, under the dynamic leadership of Dr. Debdas Ghosh, with the support of
Prof. R. N. Mohapatra, Prof. D. Giri, Prof. T. Som, Prof. S. Mukhopadhyay,
Prof. S. Das, Dr. A. Banerjee, and the faculty members of the Department of
Mathematical Sciences, IIT (BHU), India. The response to the program was
overwhelming: one hundred and twenty papers from all over the country and abroad
were submitted for presentation and later publication in the proceedings. In
keeping with the norms of the proceedings, each paper went through a strict blind
review by at least two referees in the respective area; only forty-seven papers
were selected for presentation and twenty-nine for inclusion in Springer
Proceedings in Mathematics & Statistics. The papers cover recent work in
cryptography, security, abstract algebra, functional analysis, fluid dynamics,
fuzzy modeling and optimization, etc. ICMC 2018 was attended by several experts
of international repute from India as well as from the USA, the UK, Japan, China,
Finland, etc., who gave high-quality research presentations as invited speakers.
The experts came from IIT Madras, ISI Chennai, University of Central Florida,
Orlando, USA, Kettering University, USA, University of Surrey, UK, Auburn
University, Alabama, USA, Kyushu University, Japan, Tianjin University of Science
and Technology, China, Oracle’s System of Technology, USA, University of Turku,
Finland, Haldia Institute of Technology (HIT), India, Banaras Hindu University
(BHU), India, and IIT (BHU), India. Most of the experts have submitted their
contributions to the proceedings. The Organizing Committee of ICMC 2018 is
truly thankful to all experts and paper presenters for their academic support.
Distinguished Prof. Anthony T. S. Ho, of Tianjin University, the University of
Surrey, and the Wuhan University of Technology, explained the applications of
Benford’s law to multimedia security and forensics. Professor R. N. Mohapatra,
University of Central Florida, explored various aspects of epidemiological models
with mutating pathogens, covering the basic SIR model, the diffusion equation, the
Fisher–Kolmogoroff equation, spatial epidemic models, and his proposed model,
supported with examples. Professor S. R. Chakravarthy of Kettering University
presented different aspects of a non-preemptive stochastic priority queueing model
for two different types of customers with a new threshold. Professor K. Sakurai
of Kyushu University discussed a non-commutative approach using rings for
enhancing the security of cryptosystems. Professor Matti Vuorinen of the
University of Turku gave an insightful talk on the computation of condenser
capacity. Dr. Srinivas Pyda of Oracle’s System of Technology discussed the
mathematics in machine learning. Professor Dr. Parisa Hariri of the University
of Turku explored the extension of the hyperbolic metric of a plane domain to a
subdomain of $\mathbb{R}^n$ ($n \ge 2$), discussed the geometry and topology of
metric balls, compared different hyperbolic type metrics, and gave an application
to the Ptolemy–Alhazen problem. Professor S. Ponnusamy of IIT Madras described
the classical Bohr’s theorem for bounded functions, bounded n-symmetric functions,
half-plane mappings, and half-plane n-symmetric mappings, and added examples
supporting the theory. Prof. Debasis Giri of HIT elaborated on authenticated
encryption of long messages. Prof. Chris Rodger of Auburn University explored
various aspects of graph embeddings and the construction of Hamilton
decompositions of graphs, with examples having several applications.
Professor S. K. Mishra of BHU talked about the properties and relations of strong
pseudomonotone and strong quasimonotone operators. Professor T. Som of IIT
(BHU) contributed a paper on the convergence of generalized Mann type iterates to
a common fixed point, although in the conference program he dealt with soft
relations and fuzzy soft relations with application to decision-making problems.
The submitted contributions of the experts are included in the proceedings. The
organizing committee is truly thankful to all the experts for their valuable
contributions to the conference.
I, on behalf of the organizing committee, gratefully acknowledge the financial
support to the conference by
– Science and Engineering Research Board, India
– Defence Research and Development Organization, India
– Indian Institute of Technology (BHU), India
– Council of Scientific and Industrial Research, India
– SCUBE India.

Varanasi, India Prof. Tanmoy Som

Organizing Secretary
Contents

1 Constructions and Embeddings of Hamilton Decompositions of Families of Graphs
C. A. Rodger
2 On Strong Pseudomonotone and Strong Quasimonotone Maps
Sanjeev Kumar Singh, Avanish Shahi and S. K. Mishra
3 A Dynamic Non-preemptive Priority Queueing Model with Two Types of Customers
Srinivas R. Chakravarthy
4 $I_\theta$-Statistical Convergence of Weight g in Topological Groups
Ekrem Savas
5 On the Integral-Balance Solvability of the Nonlinear Mullins Model
Jordan Hristov
6 Optimal Control of Rigidity Parameter of Elastic Inclusions in Composite Plate with a Crack
Nyurgun Lazarev and Natalia Neustroeva
7 Convergence of Generalized Mann Type of Iterates to Common Fixed Point
T. Som, Amalendu Choudhury, D. R. Sahu and Ajeet Kumar
8 Geometric Degree Reduction of Bézier Curves
Abedallah Rababah and Salisu Ibrahim
9 Cybersecurity: A Survey of Vulnerability Analysis and Attack Graphs
Rachid Ait Maalem Lahcen, Ram Mohapatra and Manish Kumar
10 A Solid Transportation Problem with Additional Constraints Using Gaussian Type-2 Fuzzy Environments
Sharmistha Halder (Jana), Debasis Giri, Barun Das, Goutam Panigrahi, Biswapati Jana and Manoranjan Maiti
11 Complements to Voronovskaya’s Formula
Margareta Heilmann, Fadel Nasaireh and Ioan Raşa
12 Mathematics and Machine Learning
Srinivas Pyda and Srinivas Kareenhalli
13 Numerical Study on the Influence of Diffused Soft Layer in pH Regulated Polyelectrolyte-Coated Nanopore
Subrata Bera, S. Bhattacharyya and H. Ohshima
14 Quadruple Fixed Point Theorem for Partially Ordered Metric Space with Application to Integral Equations
Manjusha P. Gandhi and Anushri A. Aserkar
15 Enhanced Prediction for Piezophilic Protein by Incorporating Reduced Set of Amino Acids Using Fuzzy-Rough Feature Selection Technique Followed by SMOTE
Anoop Kumar Tiwari, Shivam Shreevastava, Karthikeyan Subbiah and Tanmoy Som
16 Effect of Upper and Lower Moving Wall on Mixed Convection of Cu-Water Nanofluid in a Square Enclosure with Non-uniform Heating
S. K. Pal and S. Bhattacharyya
17 On Love Wave Frequency Under the Influence of Linearly Varying Shear Moduli, Initial Stress, and Density of Orthotropic Half-Space
Sumit Kumar Vishwakarma, Tapas Ranjan Panigrahi and Rupinderjit Kaur
18 The Problem of Oblique Scattering by a Thin Vertical Submerged Plate in Deep Water Revisited
B. C. Das, S. De and B. N. Mandal
19 A Note on Necessary Condition for $L^p$ Multipliers with Power Weights
Rajib Haloi
20 On $M/G^{(a,b)}/1/N$ Queue with Batch Size- and Queue Length-Dependent Service
G. K. Gupta and A. Banerjee
21 A Fuzzy Random Continuous (Q, r, L) Inventory Model Involving Controllable Back-order Rate and Variable Lead-Time with Imprecise Chance Constraint
Debjani Chakraborty, Sushil Kumar Bhuiya and Debdas Ghosh
22 Estimation of the Location Parameter of a General Half-Normal Distribution
Lakshmi Kanta Patra, Somesh Kumar and Nitin Gupta
23 Existence of Equilibrium Solution of the Coagulation–Fragmentation Equation with Linear Fragmentation Kernel
Debdulal Ghosh and Jitendra Kumar
24 Explicit Criteria for Stability of Two-Dimensional Fractional Nabla Difference Systems
Jagan Mohan Jonnalagadda
25 Discrete Legendre Collocation Methods for Fredholm–Hammerstein Integral Equations with Weakly Singular Kernel
Bijaya Laxmi Panigrahi
26 Norm Inequalities Involving Upper Bounds for Operators in Orlicz-Taylor Sequence Spaces
Atanu Manna
27 A Study on Fuzzy Triangle and Fuzzy Trigonometric Properties
Debdas Ghosh and Debjani Chakraborty
28 An Extension Asymptotically $\lambda$-Statistical Equivalent Sequences via Ideals
Ekrem Savas and Rabia Savas
29 Fuzzy Goal Programming Approach for Resource Allocation in an NGO Operation
Vinaytosh Mishra, Tanmoy Som, Cherian Samuel and S. K. Sharma
30 Stoichio Simulation of FACSP From Graph Transformations to Differential Equations
J. Philomenal Karoline, P. Helen Chandra, S. M. Saroja Theerdus Kalavathy and A. Mary Imelda Jayaseeli
31 Fully Dynamic Group Signature Scheme with Member Registration and Verifier-Local Revocation
Maharage Nisansala Sevwandi Perera and Takeshi Koshiba
32 Fourier-Based Function Secret Sharing with General Access Structure
Takeshi Koshiba
33 A Uniformly Convergent NIPG Method for a Singularly Perturbed System of Reaction–Diffusion Boundary-Value Problems
Gautam Singh and Srinivasan Natesan
34 On Solving Bimatrix Games with Triangular Fuzzy Payoffs
Subrato Chakravorty and Debdas Ghosh
35 Comparison of Two Methods Based on Daubechies Scale Functions and Legendre Multiwavelets for Approximate Solution of Cauchy-Type Singular Integral Equation on $\mathbb{R}$
Swaraj Paul and B. N. Mandal
Contributors

Rachid Ait Maalem Lahcen Department of Mathematics, University of Central
Florida, Orlando, FL, USA
Anushri A. Aserkar Department of Mathematics, Rajiv Gandhi College of
Engineering and Research, Nagpur, India
A. Banerjee Department of Mathematical Sciences, Indian Institute of Technology
(BHU), Varanasi, Uttar Pradesh, India
Subrata Bera Department of Mathematics, National Institute of Technology
Silchar, Silchar, India
S. Bhattacharyya Department of Mathematics, Indian Institute of Technology
Kharagpur, Kharagpur, West Bengal, India
Sushil Kumar Bhuiya Department of Mathematics, Indian Institute of
Technology Kharagpur, Kharagpur, West Bengal, India
Debjani Chakraborty Department of Mathematics, Indian Institute of
Technology Kharagpur, Kharagpur, West Bengal, India
Srinivas R. Chakravarthy Departments of Industrial and Manufacturing
Engineering and Mathematics, Kettering University, Flint, MI, USA
Subrato Chakravorty Department of Mechanical Engineering, Indian Institute of
Technology (BHU), Varanasi, Uttar Pradesh, India
Amalendu Choudhury Department of Mathematics and Statistics, Haflong
Government College, Haflong, Dima Hasao, Assam, India
B. C. Das Department of Applied Mathematics, Calcutta University, Kolkata,
India
Barun Das Department of Mathematics, Sidho Kanho Birsha University, Purulia,
West Bengal, India
S. De Department of Applied Mathematics, Calcutta University, Kolkata, India
Manjusha P. Gandhi Department of Mathematics, Yeshwantrao Chavan College
of Engineering, Nagpur, India
Debdas Ghosh Department of Mathematical Sciences, Indian Institute of
Technology (BHU), Varanasi, Uttar Pradesh, India
Debdulal Ghosh Department of Mathematics, Indian Institute of Technology
Kharagpur, Kharagpur, West Bengal, India
Debasis Giri Department of Computer Science and Engineering, Haldia Institute
of Technology, Haldia, East Midnapore, India
G. K. Gupta Department of Mathematical Sciences, Indian Institute of
Technology (BHU), Varanasi, Uttar Pradesh, India
Nitin Gupta Indian Institute of Technology Kharagpur, Kharagpur, West Bengal,
India
Sharmistha Halder (Jana) Department of Mathematics, Midnapore College
[Autonomous], Midnapore, India
Rajib Haloi Department of Mathematical Sciences, Tezpur University, Tezpur,
Sonitpur, Assam, India
Margareta Heilmann School of Mathematics and Natural Sciences, University of
Wuppertal, Wuppertal, Germany
P. Helen Chandra Jayaraj Annapackiam College for Women (Autonomous),
Theni, Tamil Nadu, India
Jordan Hristov Department of Chemical Engineering, University of Chemical
Technology and Metallurgy (UCTM), Sofia, Bulgaria
Salisu Ibrahim Department of Mathematics, Northwest University, Kano, Nigeria
Biswapati Jana Department of Computer Science, Vidyasagar University,
Midnapore, West Bengal, India
Jagan Mohan Jonnalagadda Department of Mathematics, Birla Institute of
Technology and Science, Pilani, Hyderabad, Telangana, India
Srinivas Kareenhalli Oracle India, Bengaluru, India
Rupinderjit Kaur Department of Mathematics, Birla Institute of Technology and
Science, Pilani, Hyderabad, India
Takeshi Koshiba Faculty of Education and Integrated Arts and Sciences, Waseda
University, Shinjuku-ku, Tokyo, Japan
Ajeet Kumar Department of Mathematics, Banaras Hindu University, Varanasi,
India
Jitendra Kumar Department of Mathematics, Indian Institute of Technology
Kharagpur, Kharagpur, West Bengal, India
Manish Kumar Department of Mathematics, Birla Institute of Technology and
Science-Pilani, Hyderabad, Telangana, India
Somesh Kumar Indian Institute of Technology Kharagpur, Kharagpur, West
Bengal, India
Nyurgun Lazarev North-Eastern Federal University, Yakutsk, Russia;
Lavrentyev Institute of Hydrodynamics SB RAS, Novosibirsk, Russia
Manoranjan Maiti Department of Applied Mathematics with Oceanology and
Computer Programming, Vidyasagar University, Midnapore, West Bengal, India
B. N. Mandal Physics and Applied Mathematics Unit, Indian Statistical Institute,
Kolkata, India
Atanu Manna Faculty of Mathematics, Indian Institute of Carpet Technology,
Bhadohi, Uttar Pradesh, India
A. Mary Imelda Jayaseeli Jayaraj Annapackiam College for Women
(Autonomous), Theni, Tamil Nadu, India
S. K. Mishra Department of Mathematics, Institute of Science, Banaras Hindu
University, Varanasi, India
Vinaytosh Mishra Indian Institute of Technology (BHU), Varanasi, India
Ram Mohapatra Department of Mathematics, University of Central Florida,
Orlando, FL, USA
Fadel Nasaireh Department of Mathematics, Technical University, Cluj-Napoca,
Romania
Srinivasan Natesan Department of Mathematics, Indian Institute of Technology
Guwahati, Guwahati, India
Natalia Neustroeva North-Eastern Federal University, Yakutsk, Russia
H. Ohshima Faculty of Pharmaceutical Sciences, Tokyo University of Science,
Noda, Chiba, Japan
S. K. Pal Department of Mathematics, Indian Institute of Technology Kharagpur,
Kharagpur, West Bengal, India
Bijaya Laxmi Panigrahi Department of Mathematics, Sambalpur University,
Sambalpur, Odisha, India
Goutam Panigrahi Department of Mathematics, National Institute of Technology,
Durgapur, West Bengal, India
Tapas Ranjan Panigrahi Department of Mathematics, Birla Institute of
Technology and Science, Pilani, Hyderabad, India
Lakshmi Kanta Patra Indian Institute of Information Technology Ranchi,
Ranchi, India
Swaraj Paul Department of Mathematics, Visva Bharati, Santiniketan, West
Bengal, India
Maharage Nisansala Sevwandi Perera Graduate School of Science and
Engineering, Saitama University, Saitama, Japan
J. Philomenal Karoline Jayaraj Annapackiam College for Women (Autonomous),
Theni, Tamil Nadu, India
Srinivas Pyda Oracle America, Redwood Shores, CA, USA
Abedallah Rababah Department of Mathematical Sciences, United Arab
Emirates University, Al Ain, UAE
Ioan Raşa Department of Mathematics, Technical University, Cluj-Napoca,
Romania
C. A. Rodger Department of Mathematics and Statistics, Auburn University,
Auburn, AL, USA
D. R. Sahu Department of Mathematics, Banaras Hindu University, Varanasi,
India
Cherian Samuel Indian Institute of Technology (BHU), Varanasi, India
S. M. Saroja Theerdus Kalavathy Jayaraj Annapackiam College for Women
(Autonomous), Theni, Tamil Nadu, India
Ekrem Savas Department of Mathematics, Usak University, Usak, Turkey
Rabia Savas Department of Mathematics, Sakarya University, Sakarya, Turkey
Avanish Shahi Department of Mathematics, Institute of Science, Banaras Hindu
University, Varanasi, India
S. K. Sharma Indian Institute of Technology (BHU), Varanasi, India
Shivam Shreevastava Department of Mathematical Sciences, Indian Institute of
Technology (BHU), Varanasi, India
Gautam Singh Department of Mathematics, Indian Institute of Technology
Guwahati, Guwahati, India
Sanjeev Kumar Singh Department of Mathematics, Institute of Science, Banaras
Hindu University, Varanasi, India
T. Som Department of Mathematical Sciences, Indian Institute of Technology
(BHU), Varanasi, India
Tanmoy Som Department of Mathematical Sciences, Indian Institute of
Technology (BHU), Varanasi, India
Karthikeyan Subbiah Department of Computer Science, Institute of Science
(BHU), Varanasi, India
Anoop Kumar Tiwari Department of Computer Science, Institute of Science
(BHU), Varanasi, India
Sumit Kumar Vishwakarma Department of Mathematics, Birla Institute of
Technology and Science, Pilani, Hyderabad, India
Chapter 1
Constructions and Embeddings
of Hamilton Decompositions
of Families of Graphs

C. A. Rodger
Department of Mathematics and Statistics, Auburn University,
221 Parker Hall, Auburn, AL 36849-5310, USA

Abstract This paper discusses the use of amalgamations in constructing
Hamilton decompositions of graphs. Edge-colorings that are fair in various
senses are critical to this endeavor, so some discussion of them is also included.
Finally, the power of amalgamations is demonstrated in an overview of results in the
literature that take a given edge-coloring of a graph and extend it to one of a family
of graphs (e.g., a complete graph or a complete multipartite graph) in which each
color class is a Hamilton cycle.

Keywords Hamilton cycles · Amalgamations · Fair edge-colorings · Embeddings

1 Introduction

Colorings of graphs are very useful in a variety of settings, especially scheduling
problems. In such problems, sharing objects (vertices or edges) out evenly in various
ways usually has beneficial effects in the application being considered. For example,
the most basic of these fairness notions is to ensure that the coloring is proper (no
two adjacent objects receive the same color). But other notions also play a vital role.
One could ask for the coloring to be equalized; that is, the number of objects of
each color is within one of the number of objects of each other color. Two examples
illustrate this.
The first example is the scheduling problem where various companies send
representatives to a central location, such as Chicago Airport, where they are to meet
other companies for one-on-one discussions. All representatives are in the same
industry so, while not every pair of companies’ representatives need to meet, there is
a lot of congestion to manage. The aim is to schedule the meetings (each is to last
30 min) to minimize the number of time slots needed to satisfy all needs to meet.
The number of rooms is also an issue, partly due to availability and partly due to
expense. This problem can be modeled by a graph $G$ formed by letting each company
(representative) be represented by a vertex, two vertices being joined if and only if
the corresponding companies need to meet. A proper edge-coloring with $k$ colors
provides a schedule using $k$ time slots: Representatives $i$ and $j$ meet at time slot $k$ if
the edge $\{i, j\}$ is colored $k$. Clearly the fact that the edge-coloring is proper ensures
that each representative is scheduled to meet at most one other representative at each
time. The number of rooms needed is decided by the size of the biggest color class,
and this is minimized if the edge-coloring is also equalized. Results in the literature
come close to immediately answering this problem: $k$ can be any value at least
$\chi'(G)$, which by Vizing’s Theorem is either the maximum degree $\Delta = \Delta(G)$ of $G$
or $\Delta + 1$, and a result by McDiarmid [1] guarantees that if there exists a proper
$k$-edge-coloring, then there exists an equalized proper $k$-edge-coloring. Deciding
whether a schedule with $\Delta$ time slots is possible may be difficult, as this falls
in the class of NP-complete problems; but rather than working hard to save just one
time period, simply using $\Delta + 1$ time slots is often not a problem.
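The meeting-schedule model can be made concrete with a small sketch. It assumes, purely for illustration, that every pair of an even number of companies needs to meet, so the graph is complete; the round-robin (circle) construction used here is a standard textbook coloring and is not taken from the chapter. The function names are hypothetical.

```python
from collections import Counter


def is_proper(edge_color):
    """True if no two edges sharing a vertex receive the same color."""
    seen = set()
    for edge, color in edge_color.items():
        for vertex in edge:
            if (vertex, color) in seen:
                return False
            seen.add((vertex, color))
    return True


def is_equalized(edge_color, k):
    """True if the sizes of the k color classes differ by at most one."""
    sizes = Counter(edge_color.values())
    counts = [sizes.get(c, 0) for c in range(k)]
    return max(counts) - min(counts) <= 1


def round_robin(n):
    """Proper (n - 1)-edge-coloring of the complete graph on n vertices, n even.

    Illustrative assumption: every pair of companies must meet.
    Colors play the role of time slots 0 .. n-2.
    """
    if n < 2 or n % 2:
        raise ValueError("n must be even and at least 2")
    coloring = {}
    for slot in range(n - 1):
        # Vertex n-1 stays fixed; the other vertices rotate modulo n-1.
        coloring[frozenset((n - 1, slot))] = slot
        for d in range(1, n // 2):
            a = (slot + d) % (n - 1)
            b = (slot - d) % (n - 1)
            coloring[frozenset((a, b))] = slot
    return coloring


schedule = round_robin(6)
rooms_needed = max(Counter(schedule.values()).values())
```

Here each time slot is a perfect matching, so the number of rooms equals the size of every color class; for a general graph a proper coloring need not be equalized until it is rebalanced.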
The second example contrasts with the first quite nicely. Various university clubs
are to meet one evening to plan their efforts to help Auburn collect enough food to win
the Auburn-Alabama Food Fight, designed to help the hungry in Alabama. Ideally,
each club would only meet if all its representatives attending that evening are able to
be present at the meeting. Again the plan is to schedule the meetings (each is to last
30 min) in a way that minimizes the number of time slots needed for each club to
meet, having all members present; as before, the number of rooms is also an issue. In
this case, the model is a graph in which each club is represented by a vertex and two
vertices are joined by an edge if the corresponding clubs have a member in common.
So a proper vertex-coloring with $k$ colors provides a schedule using $k$ time slots:
Club $i$ meets at time slot $k$ if vertex $i$ is colored $k$. The fact that the vertex-coloring is
proper ensures that clubs with members in common are scheduled at different times.
Minimizing the number of rooms needed again calls for an equalized vertex-coloring
(often called an equitable vertex-coloring in the literature). Unfortunately, results in
the literature have more trouble with this problem, both in answering how many
colors are needed ($\chi(G)$ is not easily determined) and in deciding whether or
not an equalized vertex-coloring exists. The number of time slots can be any value
at least $\chi(G)$; if it is chosen to be more than $\Delta(G)$, then it is known that the
vertex-coloring can be equalized [2]. Over the past 40 years, other conditions
guaranteeing the existence of equalized vertex-colorings have been found, but much
work remains to understand this property.
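As a rough illustration of the club model, the sketch below builds the conflict graph from membership lists and colors it greedily. The membership data and function names are invented for this example; the greedy rule is a generic heuristic that uses at most one more color than the maximum degree and makes no attempt to reach $\chi(G)$ or to equalize the color classes.

```python
from itertools import combinations


def conflict_graph(clubs):
    """Join two clubs by an edge when they share at least one member."""
    adj = {name: set() for name in clubs}
    for a, b in combinations(clubs, 2):
        if clubs[a] & clubs[b]:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def greedy_vertex_coloring(adj):
    """Give each vertex the smallest time slot unused by its neighbours."""
    color = {}
    # Visiting high-degree vertices first is a common heuristic, not a guarantee.
    for v in sorted(adj, key=lambda x: -len(adj[x])):
        taken = {color[u] for u in adj[v] if u in color}
        slot = 0
        while slot in taken:
            slot += 1
        color[v] = slot
    return color


# Hypothetical membership data, invented for this illustration.
clubs = {
    "chess": {"ann", "bob"},
    "drama": {"bob", "cat"},
    "hiking": {"cat", "dan"},
    "robotics": {"ann", "dan", "eve"},
    "choir": {"fay"},
}
adj = conflict_graph(clubs)
slots = greedy_vertex_coloring(adj)
```

Clubs that receive the same slot have no member in common, so they can meet simultaneously in different rooms.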
Many interesting problems associated with fair colorings of various sorts remain
open and are of practical use. Several more will be introduced later in the paper as
they are needed.
A third practical problem addressed by graph theory is the famous traveling
salesman problem. A salesman has to visit a predetermined set of cities, one by one,
then return home, following a route that minimizes the distance travelled. It is modeled by
a graph, $G$, in which the vertices represent the cities, edges represent various routes to
get from city to city, and all edges are weighted by the distance of the corresponding
route. In the unworldly case where all the edges have weight 1, this problem asks
whether or not $G$ contains a Hamilton cycle (a cycle in $G$ which includes each vertex
in $V(G)$). This too falls into the family of NP-complete problems, so is difficult to
solve, even in this seemingly far simpler situation. Related to the Hamiltonicity of a
graph is a stronger property. A Hamilton decomposition of $G$ is a partition of $E(G)$,
each element of which induces a Hamilton cycle. Since each Hamilton cycle includes
exactly two edges incident with each vertex, clearly $G$ needs to be regular in order
to have a Hamilton decomposition. Around 125 years ago, Walecki proved that the
complete graph $K_n$ has a Hamilton decomposition if and only if $n$ is odd [3].
This too has an interpretation in an applied setting, related to the traveling salesman
problem. In this case, the salesman wants to visit certain important cities on every
trip, but other towns along the way can be visited less often. The Hamilton cycles in a
Hamilton decomposition ensure that each time out the salesman visits the important
cities (the vertices of the graph), and then since each edge is in exactly one Hamilton
cycle, towns along the roads corresponding to the edges will be visited as the road
is traversed.
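The existence half of Walecki's result can be checked computationally for small odd $n$. The sketch below follows one common textbook presentation of the construction (a zigzag Hamilton path on the integers modulo $n-1$, rotated and closed through one extra vertex); the vertex labeling and function names are assumptions of this example, not notation from the chapter.

```python
from itertools import combinations


def walecki_cycles(n):
    """Hamilton cycles of the complete graph on n vertices, n odd.

    Vertices 0 .. n-2 are treated modulo n-1; vertex n-1 is an extra hub.
    """
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be odd and at least 3")
    m = (n - 1) // 2
    hub = n - 1
    # Zigzag path 0, 1, -1, 2, -2, ..., m-1, -(m-1), m.
    zigzag = [0]
    for k in range(1, m):
        zigzag += [k, -k]
    zigzag.append(m)
    # Rotate the path m times and close each copy through the hub.
    return [[hub] + [(v + i) % (2 * m) for v in zigzag] for i in range(m)]


def cycle_edges(cycle):
    """Edges of a cycle given as a vertex list, including the closing edge."""
    size = len(cycle)
    return {frozenset((cycle[i], cycle[(i + 1) % size])) for i in range(size)}


def is_hamilton_decomposition(n, cycles):
    """Each cycle spans all n vertices and the cycles partition the edges."""
    all_edges = {frozenset(e) for e in combinations(range(n), 2)}
    used = set()
    for cycle in cycles:
        edges = cycle_edges(cycle)
        if sorted(cycle) != list(range(n)) or len(edges) != n or used & edges:
            return False
        used |= edges
    return used == all_edges
```

For example, `walecki_cycles(5)` yields two cycles through all five vertices, and `is_hamilton_decomposition(5, walecki_cycles(5))` confirms that together they use each of the ten edges exactly once.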
For each of these three problems, interest eventually turned from complete graphs
to another natural family of graphs, namely the complete multipartite graphs: The
vertices in each such graph are partitioned into $p$ parts, with two vertices being joined
by an edge if and only if they are in different parts. The chromatic index of such graphs
was settled thirty years ago [4], and the value of the chromatic number is obvious,
but finding equalized vertex-colorings is not so straightforward (see [5]). For such
graphs to have a Hamilton decomposition they must be regular, and to be regular
all parts must have the same size. This motivates the following definition:
Let $K(n, p)$ denote the complete multipartite graph with $p$ parts in which each part
contains $n$ vertices. Deciding whether or not $K(n, p)$ has a Hamilton decomposition
was settled by Laskar and Auerbach [6] 40 years ago, showing that it exists if and
only if $n(p - 1)$ is even.
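The parity condition is easy to motivate: every vertex of the balanced complete multipartite graph has degree n(p − 1), and a Hamilton decomposition uses the edges at each vertex two at a time. A minimal sketch, assuming p ≥ 2 and using hypothetical function names, builds the graph and confirms the degree count; it only illustrates why the condition is necessary, while sufficiency is the content of the cited theorem.

```python
from collections import Counter
from itertools import combinations


def complete_multipartite(n, p):
    """Edge set of K(n, p); vertex (i, j) is the j-th vertex of part i."""
    vertices = [(i, j) for i in range(p) for j in range(n)]
    return {
        frozenset((u, v))
        for u, v in combinations(vertices, 2)
        if u[0] != v[0]  # only vertices in different parts are joined
    }


def vertex_degrees(edges):
    """Degree of every vertex that lies on at least one edge."""
    deg = Counter()
    for edge in edges:
        for vertex in edge:
            deg[vertex] += 1
    return deg


def parity_condition(n, p):
    """The condition n(p - 1) even quoted from the literature above."""
    return (n * (p - 1)) % 2 == 0


edges = complete_multipartite(3, 3)
deg = vertex_degrees(edges)
```

When the condition holds, a Hamilton decomposition would consist of n(p − 1)/2 cycles, since each cycle accounts for two edges at every vertex.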
Much more recently, a third family of graphs has drawn wide interest, motivated
by the construction of experimental designs in statistics. A block design with two
association classes (BDTAC) can be described graph theoretically as follows. Let
K (P, λ1 , λ2 ) be the graph in which P is a partition of the vertices, two vertices being
joined by λ1 edges if they are in the same part of P and by λ2 edges if they are in
different parts. The BDTAC is equivalent to a partition of the edges of K (P, λ1 , λ2 ),
each element of which is a copy of K k for some integer k. In the setting of this
paper, the natural question is whether or not there exists a Hamilton decomposition
of K (P, λ1 , λ2 ), so of particular interest is the regular graph K (n, p, λ1 , λ2 ) where
each of the p parts in K (P, λ1 , λ2 ) contains n vertices. This problem was settled
by Bahmanian and Rodger 5 years ago [7]. Their method of proof is the main topic
in Sect. 2. Continuing the theme of fairness in colorings, the amalgamation proof
technique produces a graph H from a given edge-colored graph G, where G is a
homomorphic image of H, such that the edges in H are shared out among the
vertices and among color classes in ways that are fair with respect to several notions
of balance. The connectivity of color classes is also addressed.
In Sect. 3, the embedding of edge-colored graphs into edge-colored copies of
K (n, p, λ1 , λ2 ) is the main focus. This is a great demonstration of the power of
amalgamation proofs, but is also motivated by applications in the following sense.
Scheduling problems often require prerequisite conditions to be built into the final
schedule. For example, when deciding which teachers should teach which classes at
what times, some teachers may not be able to teach early in the morning. Hilton [8]
developed the notion of an outline schedule where times are compressed into a small
number of groups; say early morning, late morning, early afternoon, and last classes.
Similarly, subjects being taught, or classes for the same age students could also form
such groups. Once this outline schedule has been developed, reversing the amalgamation
approach develops the full schedule. This method also allows prerequisites
to be built into about a quarter of the entire schedule. Here, we begin with a given
edge-colored copy of K (n, p1 , λ1 , λ2 ) and embed it in a copy of K (n, p2 , λ1 , λ2 )
such that each color class induces a Hamilton cycle. In view of the third problem
described above, the given copy can be thought of as the given prerequisites in the
final Hamilton decomposition that realizes the schedule of the salesman.

2 Amalgamations and Hamilton Decompositions

In 1984, Hilton [9] made a leap forward in the study of Hamilton decompositions.
He had the idea of starting with a single vertex, say α, incident with n(n − 1)/2
loops, n of each of (n − 1)/2 colors, and attempted to disentangle n vertices from
α, one at a time, to end up with the complete graph $K_n$ in which each color class
was a Hamilton cycle. The proof was inductive, so was especially powerful in that it
allowed one to start midway through the process rather than with a single vertex. All
that was needed was for this midway point to satisfy the conditions described in the
inductive hypotheses, conditions which actually turn out to be necessary anyway.
It is helpful to think of the single vertex as originally containing the n vertices
that eventually appear in the final graph. As each vertex is disentangled from α,
one less vertex is still contained in it, so at the ith step one can naturally define
the amalgamation function $f_i(\alpha) = n - i$ to be the number of vertices still in α.
Inductively, the setup at the ith step is to have: $f_i(\alpha)(f_i(\alpha) - 1)/2$ loops incident
with α; one edge between each pair of disentangled vertices; and $f_i(\alpha)$ edges
between each disentangled vertex and α (one for each vertex still contained in α).
Since we hope to end up with $K_n$, at the
ith step one end of each of $f_i(\alpha) - 1$ loops is detached from α and joined instead to
the new vertex being disentangled from α. Also, from each previously disentangled
vertex, one of the $f_i(\alpha)$ edges joining it to α is detached from α, its new end
becoming the disentangled vertex instead of α. So, with these properties in mind, by
the time the (n − 1)th step is completed, it is easy to see our single vertex with loops
has been transformed into $K_n$.
Advantageously, the method is even more flexible than described so far in that
it is possible to start with a graph G having p vertices, each vertex, v, containing
f (v) = n vertices (or even setups more general than that). If each of the p vertices,
v, has $\lambda_1 f(v)(f(v) - 1)/2 = \lambda_1 n(n - 1)/2$ loops on it, and if between each pair of
the p vertices, say u and v, there are $\lambda_2 f(u) f(v) = \lambda_2 n^2$ edges, then this graph is
the amalgamation (homomorphic image) of H = K (n, p, λ1 , λ2 ): For each of the
p parts of K (n, p, λ1 , λ2 ), amalgamate the n vertices into a single vertex to form
G. Notice that this includes the classical complete multipartite graphs, when λ1 = 0
and λ2 = 1.
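The counts of loops and edges in the amalgamated graph can be checked mechanically. The following is a minimal sketch, not taken from the source: the helper names `build_K` and `amalgamate` and the sample parameter values are assumptions chosen for illustration. It builds K(n, p, λ1, λ2) as an edge multiset, collapses each part to a single vertex, and reads off the resulting multiplicities.

```python
from collections import Counter
from itertools import combinations


def build_K(n, p, lam1, lam2):
    """Edge multiset of K(n, p, lam1, lam2); vertex (i, a) is vertex a of part i."""
    vertices = [(i, a) for i in range(p) for a in range(n)]
    edges = Counter()
    for u, v in combinations(vertices, 2):
        mult = lam1 if u[0] == v[0] else lam2
        if mult:
            edges[(u, v)] = mult
    return edges


def amalgamate(edges, image):
    """Replace every vertex by image(vertex); edges inside one class become loops."""
    out = Counter()
    for (u, v), mult in edges.items():
        a, b = sorted((image(u), image(v)))
        out[(a, b)] += mult
    return out


# Hypothetical sample parameters; each part is amalgamated into one vertex.
n, p, lam1, lam2 = 3, 4, 2, 5
G = amalgamate(build_K(n, p, lam1, lam2), image=lambda vertex: vertex[0])
print(G[(0, 0)], G[(0, 1)])  # 6 45
```

Here `G[(i, i)]` is the number of loops on the amalgamated vertex i and `G[(i, j)]` the number of edges between two amalgamated vertices, which should agree with λ1 n(n − 1)/2 and λ2 n² respectively.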
Of course, the point here is not just to produce K n or K (n, p, λ1 , λ2 ); we are
really trying to produce Hamilton decompositions (or other graph decompositions)
of these graphs. The idea is that if the disentangling process can be achieved, then
it is much easier to form the amalgamated graph with a suitable edge-coloring (an
outline of the final decomposition) than it is to find the final decomposition directly.
So attention also needs to be paid to the color of the edges being selected during
the disentangling process, both the loops incident with α and the edges joining the
previously disentangled vertices to α. It turns out that we now have a lot of control over
the disentanglement. Various results appear in the literature, but the following result
is a good example of what is possible. Proved in more generality by Bahmanian and
Rodger in [7], it ties in nicely with the fairness notions described earlier. Informally, it
says that if D(v) is the set of vertices in H disentangled from v in G, then each vertex
u in D(v) receives its fair share of the edge ends in G incident with v, and each vertex
u in D(v) receives its fair share of the edge ends in G incident with v colored j. That
is, $d_H(u) \in \{\lfloor d_G(v)/n \rfloor, \lceil d_G(v)/n \rceil\}$ and $d_{H(j)}(u) \in \{\lfloor d_{G(j)}(v)/n \rfloor, \lceil d_{G(j)}(v)/n \rceil\}$,
where G(j) is the subgraph of G induced by the edges colored j. In Theorem 1,
ψ plays the role of the amalgamation function, $\ell_G(u)$ is the number of loops in G
incident with u, and $m_G(u, v)$ is the number of edges in G joining u to v.
Theorem 1 [7] Let G be a k-edge-colored graph and let ψ be a function from V(G)
into the positive integers such that for each u ∈ V(G),
(1) ψ(u) = 1 implies $\ell_G(u) = 0$,
(2) $d_{G(j)}(u)/\psi(u)$ is an even integer for all 1 ≤ j ≤ k,
(3) $\binom{\psi(u)}{2}$ divides $\ell_G(u)$,
(4) $\psi(u)\psi(v)$ divides $m_G(u, v)$ for each v ∈ V(G) \ {u}, and
(5) G(j) is connected for 1 ≤ j ≤ k.
Then, there exists a detachment H of G in which each u ∈ V(G) is disentangled
into vertices $u_1, \ldots, u_{\psi(u)}$, such that for all u ∈ V(G):
(i) $m_H(u_i, u_{i'}) = \ell_G(u)/\binom{\psi(u)}{2}$ for all 1 ≤ i < i' ≤ ψ(u) if ψ(u) ≥ 2,
(ii) $m_H(u_i, v_{i'}) = m_G(u, v)/(\psi(u)\psi(v))$ for v ∈ V(G) \ {u}, 1 ≤ i ≤ ψ(u), and 1 ≤ i' ≤ ψ(v),
(iii) $d_{H(j)}(u_i) = d_{G(j)}(u)/\psi(u)$ for 1 ≤ i ≤ ψ(u) and 1 ≤ j ≤ k, and
(iv) each color class H(j) is connected for 1 ≤ j ≤ k.
Condition (2) is critical for proving that connected color classes in G can remain
connected during the disentangling process, thus guaranteeing that condition (iv) is
satisfied by H. Since we aim to have each color class disentangled into a Hamilton cycle,
clearly each vertex v in the amalgamated graph we construct needs to be incident
with 2ψ(v) edges colored j, for each color j, since each of the ψ(v) vertices inside
v needs to be incident with exactly two edges colored j in the disentangled graph.
Not only does this approach give a new proof of Walecki’s [3] result, but it also
lends itself beautifully to other families of graphs than complete graphs.
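For comparison with the amalgamation approach, the classical direct construction usually attributed to Walecki can be written down explicitly. The sketch below is not the amalgamation method of this section and is not taken from the source: the function names, the label `'c'` for the extra vertex, and the zigzag ordering are illustrative assumptions. It rotates a zigzag Hamilton path of $K_{2m}$ and closes each copy through one extra vertex, then checks by brute force that every edge of $K_{2m+1}$ is used exactly once.

```python
from itertools import combinations


def walecki_cycles(m):
    """Return m Hamilton cycles of K_{2m+1} on vertices 0..2m-1 plus the hub 'c'."""
    zigzag = [0]
    for j in range(1, m):
        zigzag += [j, (-j) % (2 * m)]
    zigzag.append(m)  # Hamilton path 0, 1, -1, 2, -2, ..., m of K_{2m}
    cycles = []
    for shift in range(m):
        path = [(v + shift) % (2 * m) for v in zigzag]
        cycles.append(["c"] + path)  # the closing edge joins path[-1] back to 'c'
    return cycles


def cycle_edges(cycle):
    """Edges of a closed walk, as frozensets of endpoints."""
    return [frozenset((cycle[i], cycle[(i + 1) % len(cycle)])) for i in range(len(cycle))]


def is_hamilton_decomposition(cycles, vertices):
    """Check that every cycle is spanning and every edge of the complete graph is used once."""
    all_edges = {frozenset(e) for e in combinations(vertices, 2)}
    used = [e for cyc in cycles for e in cycle_edges(cyc)]
    spanning = all(sorted(map(str, cyc)) == sorted(map(str, vertices)) for cyc in cycles)
    return spanning and len(used) == len(all_edges) and set(used) == all_edges


m = 3
print(is_hamilton_decomposition(walecki_cycles(m), list(range(2 * m)) + ["c"]))
```

The brute-force check is only practical for small m, but it makes the two requirements explicit: each color class is a spanning cycle, and the color classes partition the edge set.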
Theorem 2 [6, 10] There exists a Hamilton decomposition of λK (n, p) if and only
if λn( p − 1) is even.
To see how Theorem 1 is of use in proving Theorem 2, start with p vertices, each
pair joined by $\lambda n^2$ edges. The edges are then colored with λn(p − 1)/2
colors so that each color class is connected and 2n-regular (the details of how the
edge-coloring is accomplished are not included here, but one natural approach is to
add Hamilton cycles of $K_p$, each containing edges of just one color, to complete
most of the task). It is easy to see that this edge-colored graph satisfies conditions
(1–5) of Theorem 1 with ψ(v) = n for all vertices. Now consider the disentangled graph H:
by condition (ii), each pair of vertices in different parts of H is joined by exactly
$\lambda n^2/n^2 = \lambda$ edges, and H has no other edges, so it must be that H = λK(n, p); by conditions (iii–
iv), each color class of H is 2-regular and connected, so is a Hamilton cycle. This
completes the proof.
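The choice of λn(p − 1)/2 colors, each inducing a 2n-regular color class, can be checked by a short degree count. The following verification is a sketch added for illustration and uses only the quantities defined above.

```latex
% Degree of a vertex v of the amalgamated graph G
% (p vertices, each pair joined by \lambda n^2 edges):
\[
  d_G(v) = \lambda n^2 (p-1) = 2n \cdot \frac{\lambda n (p-1)}{2},
\]
% so \lambda n(p-1)/2 color classes, each 2n-regular, account for every edge end at v.
% With \psi(v) = n, condition (2) of Theorem 1 reads
\[
  \frac{d_{G(j)}(v)}{\psi(v)} = \frac{2n}{n} = 2 \quad \text{(an even integer)},
\]
% and conclusion (iii) then gives d_{H(j)}(v_i) = 2 for every detached vertex v_i.
```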
More recently, the existence of Hamilton decompositions of K (n, p, λ1 , λ2 ) was
completely settled in the following theorem.
Theorem 3 [7] Let p > 1, λ1 ≥ 0, and λ2 ≥ 1, with λ1 ≠ λ2, be integers. Then,
there exists a Hamilton decomposition of K (n, p, λ1 , λ2 ) if and only if
(ii) λ1 (n − 1) + λ2 n( p − 1) is even, and
(iii) λ1 ≤ λ2 n( p − 1).
It is hopefully not surprising now that the proof of the sufficiency follows the
above approaches closely, starting with p vertices, each pair joined by
$\lambda_2 n^2$ edges, but this time each vertex is also incident with $\lambda_1 n(n - 1)/2$ loops. The
edges are then colored so that each color class is connected and 2n-regular. Once
this is done, the result follows essentially immediately from Theorem 1.
The proof of the necessity of Theorem 3 is not included here, but it is worth giving
some feel for why condition (iii) is necessary. First note that every Hamilton cycle
in K (n, p, λ1 , λ2 ) must use at least p edges joining vertices in different parts in
order to be connected. So if we allow λ1 to grow while holding all other parameters
constant, we will eventually run out of the edges joining vertices in different parts.
For this reason, an upper bound on λ1 is to be expected.
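The counting behind this remark can be made explicit. The derivation below is a sketch added for illustration, assuming n > 1; it only combines the observation that each Hamilton cycle uses at least p edges between different parts with the total number of such edges.

```latex
% K(n, p, \lambda_1, \lambda_2) is regular of degree
% \lambda_1 (n-1) + \lambda_2 n (p-1), so a Hamilton decomposition consists of
\[
  c = \frac{\lambda_1 (n-1) + \lambda_2 n (p-1)}{2}
\]
% cycles. Each cycle uses at least p edges joining different parts, and there are
% \lambda_2 n^2 \binom{p}{2} such edges in total. Hence
\[
  p \, c \le \lambda_2 n^2 \, \frac{p(p-1)}{2}
  \;\Longrightarrow\;
  \lambda_1 (n-1) + \lambda_2 n (p-1) \le \lambda_2 n^2 (p-1)
  \;\Longrightarrow\;
  \lambda_1 (n-1) \le \lambda_2 n (p-1)(n-1)
  \;\Longrightarrow\;
  \lambda_1 \le \lambda_2 n (p-1).
\]
```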

3 Embeddings of Edge-Colorings into Hamilton Decompositions

The embedding results followed the same line as the construction results described in
Sect. 2: first studied were embeddings of edge-colored graphs into Hamilton decompositions
of $K_n$ (see Theorem 4), then of complete multipartite graphs (see Theorem
5), and then of K (n, p, λ1 , λ2 ) (see Theorems 6 and 7). We now survey these results
in turn.
In the previous section, Hilton’s paper [9] introducing amalgamations as a means
of producing graph decompositions was described. One of the great applications he
developed was the idea of building prerequisites into the final Hamilton decomposition.
In his paper, he proved the following result which completely describes when
it is possible to start with a given edge-coloring of $K_n$ and embed it in a Hamilton
decomposition of $K_m$; that is, add m − n new vertices to the given edge-colored $K_n$,
and edges to form a $K_m$, then color all the added edges so that each color class is
a Hamilton cycle. This was truly an amazing result, since typically the given edge-coloring
would seemingly need to have much postulated structure or symmetry to
make such a result provable. But the amalgamation method is so flexible that he
completely solved the problem with the following result.

Theorem 4 [9] A k-edge-colored $K_n$ (some colors may appear on no edges) can
be embedded into a Hamiltonian decomposition of $K_m$ if and only if
1. m is odd,
2. $k = \lfloor m/2 \rfloor$, and
3. each color class of the given edge-coloring of $K_n$ has at most m − n components,
each of which is a path (isolated vertices are considered to be paths of length 0).

Proof The necessity of these conditions is quite clear: (1–2) follow since in $K_m$ each
vertex is incident with exactly two edges of each color; (3) follows because each
one of the m − n added vertices can be used to connect just two components in each
color class.
Proving the sufficiency clearly demonstrates the power of amalgamations. At first
sight, it is not clear at all how to color all the added edges. But we immediately know
how to color them in the graph formed by taking any solution to the embedding and
amalgamating the added vertices to form a single vertex (in the notation of Sect. 2,
the amalgamated vertex is like α, with f (α) = m − n). The following shows how
to form the amalgamated graph G, even though we do not have a solution (i.e., a
Hamilton decomposition of H = $K_m$) in hand.
1. Join each vertex in $K_n$ to the added vertex α with m − n edges.
2. Color the added edges so that each vertex in $K_n$ has degree 2 in each color class.
(This is possible since then vertices in $K_n$ would have degree 2k = (n − 1) +
(m − n).)
3. Add (m − n)(m − n − 1)/2 loops incident with α.
4. To complete the edge-coloring of G, color the loops so that α is incident with
exactly 2(m − n) edge ends of each color; each loop contributes two edge ends.
(This is possible since condition (3) guarantees the number of loops to be added
is nonnegative and the number of edges of each color added in the second step is
even.)
We can now immediately form a Hamilton decomposition of H from G using Theorem 1
with ψ(u) = 1 for all vertices in $K_n$ and ψ(α) = m − n. To see this, refer to
the various parts of Theorem 1 in turn as follows.
(i) Shows that once the m − n vertices in α are disentangled, the (m − n)(m −
n − 1)/2 loops on α induce a simple graph, which must be $K_{m-n}$.
(ii) Shows that once the m − n vertices in α are disentangled, the m − n edges
joining each vertex u in $K_n$ to α become single edges joining u to each of the
m − n disentangled vertices. So at this stage we know that H is $K_m$.
(iii) Shows that each vertex in H has degree 2 in each color class.
(iv) Shows, together with what was just shown in (iii), that each color class is a
Hamilton cycle.
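The four construction steps in the proof determine the amalgamated graph G up to the choice of which edges receive which colors, and its numerical outline can be computed directly. The sketch below is an illustration under stated assumptions, not the author's procedure: the function name `outline_for_theorem4`, the dictionary encoding of the coloring, and the union-find cycle test are all hypothetical choices. It checks the conditions of Theorem 4 and returns, for each color, how many edges join each vertex of $K_n$ to α and how many loops α carries.

```python
from collections import Counter


def outline_for_theorem4(n, m, coloring):
    """coloring maps each edge (u, v) of K_n, u < v, to a color in range(k).

    Returns (to_alpha, loops): to_alpha[v][j] counts color-j edges from v to
    alpha and loops[j] counts color-j loops on alpha. Returns None if one of
    the conditions of Theorem 4 fails.
    """
    if m % 2 == 0:
        return None
    k = (m - 1) // 2
    deg = [Counter() for _ in range(n)]   # deg[v][j]: degree of v in color j
    size = Counter()                      # size[j]: number of edges colored j
    parent = {(j, v): (j, v) for j in range(k) for v in range(n)}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (u, v), j in coloring.items():
        if not 0 <= j < k:
            return None
        ru, rv = find((j, u)), find((j, v))
        if ru == rv:                      # color class j would contain a cycle
            return None
        parent[ru] = rv
        deg[u][j] += 1
        deg[v][j] += 1
        size[j] += 1
    if any(deg[v][j] > 2 for v in range(n) for j in range(k)):
        return None                       # some component is not a path
    if any(n - size[j] > m - n for j in range(k)):
        return None                       # more than m - n components
    to_alpha = [{j: 2 - deg[v][j] for j in range(k)} for v in range(n)]
    loops = {j: (m - n) - sum(to_alpha[v][j] for v in range(n)) // 2
             for j in range(k)}
    return to_alpha, loops


# Hypothetical example: K_3 with one edge per color, embedded into K_7.
to_alpha, loops = outline_for_theorem4(3, 7, {(0, 1): 0, (0, 2): 1, (1, 2): 2})
print(loops, sum(loops.values()))
```

In any valid instance the loop counts should sum to (m − n)(m − n − 1)/2 and each vertex of $K_n$ should send m − n edges to α, which gives a quick consistency check on steps 1 to 4.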

Hilton and Rodger [10] extended Theorem 4 to the complete multipartite graphs.
They proved the following result as a corollary of a much more general amalgamation
theorem.

Theorem 5 [10] Suppose that 2t ≤ s. Then, a k-edge-coloring of the complete t-partite
graph $K_{a_1, \ldots, a_t}$ can be embedded into a Hamiltonian decomposition of the
complete p-partite graph K (n, p, 0, 1) if and only if
(i) each color class is a set of vertex-disjoint paths,
(ii) $a_i \le n$ for 1 ≤ i ≤ t, and
(iii) p(n − 1) is even.

The proof of Theorem 5, while more complicated, follows the approach outlined
above for proving Theorem 4. In this case, the given t-partite graph is first embedded
greedily into an edge-colored K (n, t, 0, 1) in which each color class is still a set of
vertex-disjoint paths; this can be done since we are assuming that 2t ≤ s. The second
step introduces one new vertex, an amalgamated vertex playing the role of α in the
outline of the proof of Theorem 4 above, but in this case the technique calls for all
vertices within the same part to be disentangled before moving on to vertices from
other parts still contained in α.
The embedding of edge-colored copies of K (n, t, λ1 , λ2 ) into Hamilton decompositions
of K (n, p, λ1 , λ2 ) is really very interesting. Reasonably obvious numerical
conditions are sufficient when p is somewhat larger than t (see Theorem 6), but at this
stage it appears that there are conditions which depend upon the existence of certain
components in a companion bipartite graph to the given edge-colored graph which
are necessary for the embedding to exist (see Theorem 7). This structural property
is reminiscent of the long-standing unsolved embedding problem for partial idempotent
latin squares of order n into idempotent latin squares of order n + t when t is
small: When t ≥ n numerical conditions do prove to be sufficient (see [11–13]), but
for smaller values of t the existence of certain components in a closely related graph
can prevent such an embedding (see [11, 14]).
As in other results mentioned so far, the following is a consequence of a more
general amalgamation result in [15] which requires some postulations that are unlikely
to be necessary in a complete solution to the problem. Nevertheless, the result
is sufficiently general to allow the embedding problem to be solved whenever the
number of parts, r , being added to the given edge-colored copy of K (n, t, λ1 , λ2 )
is large enough. It is always a little worrying when a result is described in terms of
some parameter being sufficiently large. Often that necessary size for the result to
work is really very large. However, the good news in this case is that in fact the lower
bound on r for the result to be applicable is not really so large, as the following result
indicates.

Theorem 6 [15] Let n > 1, λ1 ≥ 0, λ2 ≥ 1, λ1 ≠ λ2, p = t + r and

$$r \ge \frac{\lambda_1 (n - 1) + \lambda_2 n(t - 1)}{\lambda_2 n(n - 1)}. \tag{1}$$

Then, a k-edge-coloring of K (n, t, λ1 , λ2 ) can be embedded into a Hamiltonian
decomposition of K (n, p, λ1 , λ2 ) if and only if

1. $k = (\lambda_1 (n - 1) + \lambda_2 n(p - 1))/2$,
2. $\lambda_1 \le \lambda_2 n(p - 1)$,
3. every component of G(j) is a path (possibly of length 0) for 1 ≤ j ≤ k, and
4. G(j) has at most nr components for 1 ≤ j ≤ k.
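To see how modest the lower bound (1) is, it helps to split the fraction. The rewriting below is a routine simplification, and the numerical values in the second display are illustrative choices, not taken from the source.

```latex
\[
  \frac{\lambda_1 (n-1) + \lambda_2 n (t-1)}{\lambda_2 n (n-1)}
  = \frac{\lambda_1}{\lambda_2 n} + \frac{t-1}{n-1}.
\]
% Illustrative values: n = 3, t = 4, \lambda_1 = 1, \lambda_2 = 2 give
\[
  \frac{1 \cdot 2 + 2 \cdot 3 \cdot 3}{2 \cdot 3 \cdot 2} = \frac{20}{12} = \frac{5}{3},
\]
% so inequality (1) holds for every integer r \ge 2.
```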

In the same paper, using the same general amalgamation theorem, it turns out that
the case where r = 1 (so just one part is being added) can also be completely solved.
So now we need to explore the values of r between 1 and
$(\lambda_1 (n - 1) + \lambda_2 n(t - 1))/(\lambda_2 n(n - 1))$. Starting with the smallest values seems enticing! It turns out that
even just considering the case where r = 2 is particularly challenging. We appear to
enter a different world where the structure can play a deciding role in determining
whether or not the embedding of the k-edge-coloring of G = K (n, t, λ1 , λ2 ) into a
Hamiltonian decomposition of H = K (n, p = t + 2, λ1 , λ2 ) is possible. To see this,
it is best to describe the issue in terms of a related bipartite graph, B. Its vertex set is
of course partitioned into two sets: V(G) and $C = \{c_j \mid 1 \le j \le k\}$. Each v ∈ V(G)
is joined to $c_j$ in B with x edges if and only if $d_{G(j)}(v) = 2 - x$. Recall that in H
each color class is a Hamilton cycle, so each vertex has degree 2 in each color class.
So B is keeping track of how many more edges of each color, j, that v needs
added during the embedding process. Connectivity is also a critical aspect of the
embedding: The added vertices in the r = 2 new parts need to be used to connect
up all the paths in G(j) for each color j (so 1 ≤ j ≤ k). For various reasons, it
seems likely, possibly even necessary, that if $d_B(c_j) \equiv 2 \pmod 4$ then at least one
of the components (paths) in G(j) must have its end vertices in G, say $v_{j,1}$ and $v_{j,2}$,
joined to different new parts in H. Reproducing this during a proof of the sufficiency
is managed by forming $B^*$, a modification of B constructed by disentangling such
$c_j$ into two vertices, one having degree 2 being adjacent to $v_{j,1}$ and $v_{j,2}$. As the
embedding proceeds, choosing the path for each color, j, which determines $v_{j,1}$ and
$v_{j,2}$ seems to be critical, as is described in condition (∗) of Theorem 7 below. Let
the set of pairs $\{\{v_{j,1}, v_{j,2}\} \mid 1 \le j \le k\}$ describe this choice. It is conceivable that condition
(∗) is also a necessary condition. Let $C_2$ denote the set of vertices in C of degree 2
(mod 4).
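The companion graph B and the set $C_2$ are straightforward to compute from a given edge-coloring. The sketch below is illustrative only: the function name `companion_graph` and the encoding of the coloring as a list of triples are assumptions, and the detached graph $B^*$ is not constructed here.

```python
from collections import Counter


def companion_graph(vertices, colored_edges, k):
    """Build B for an edge-colored graph G without loops.

    colored_edges is a list of (u, v, j) with color j in range(k).
    Returns (mult, deg_c, C2): mult[(v, j)] is the number of edges joining v
    to c_j, deg_c[j] is d_B(c_j), and C2 lists the colors j for which
    d_B(c_j) is congruent to 2 modulo 4.
    """
    color_deg = Counter()                 # color_deg[(v, j)] = d_{G(j)}(v)
    for u, v, j in colored_edges:
        color_deg[(u, j)] += 1
        color_deg[(v, j)] += 1
    mult = {(v, j): 2 - color_deg[(v, j)] for v in vertices for j in range(k)}
    if any(x < 0 for x in mult.values()):
        raise ValueError("some vertex has degree above 2 in a color class")
    deg_c = {j: sum(mult[(v, j)] for v in vertices) for j in range(k)}
    C2 = [j for j in range(k) if deg_c[j] % 4 == 2]
    return mult, deg_c, C2


# Hypothetical toy input: four vertices, two colors.
mult, deg_c, C2 = companion_graph(
    [0, 1, 2, 3], [(0, 1, 0), (2, 3, 0), (1, 2, 1)], k=2
)
print(deg_c, C2)  # {0: 4, 1: 6} [1]
```

Since $d_B(c_j)$ equals twice the number of vertices minus twice the number of edges colored j, it is always even, so only the residues 0 and 2 modulo 4 can occur.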

Theorem 7 [16] Let n > 1, λ1 ≥ 0, λ2 ≥ 1 and λ1 ≠ λ2. Suppose we are given a
k-edge-coloring of G = K (n, t, λ1 , λ2 ), and that
(∗) the pairs $\{v_{j,1}, v_{j,2}\}$ can be chosen such that in the detached graph, $B^*$, the number of components
having an odd number of color vertices of degree divisible by 4 is at most $\lambda_2 n^2$.
Then, the k-edge-coloring of G can be embedded into a Hamiltonian decomposition
of K (n, p = t + 2, λ1 , λ2 ) if and only if
(i) Conditions (1–4) of Theorem 6 with r = 2 are satisfied, and
(v) $|C_2| \le 2\lambda_1 \binom{n}{2} + \lambda_2 n^2$.

Apart from amalgamations, there is another aspect of the proof of this result which
is of interest here since a 2-edge-coloring of B ∗ is required that has the colors fairly
divided in two ways. An edge-coloring of a graph is said to be equitable at vertex v if,
for all colors i and j, the number of edges incident with v colored i is within 1 of the
number of edges colored j. An edge-coloring of a graph is said to be evenly equitable
at vertex v if, for all colors i and j, the number of edges incident with v colored i
is even and is within 2 of the number of edges colored j. Hilton [17] proved that
evenly equitable edge-colorings (i.e., evenly equitable at all vertices) exist whenever
all vertices have even degree. Equitable edge-colorings (i.e., equitable at all vertices)
are much more problematic (see [18] for example), but de Werra [19] has shown
that they always exist for bipartite graphs. To prove Theorem 7, it was critical that
these two results of Hilton and de Werra be generalized to require some vertices to
be evenly equitably colored and others to be equitably colored. We end with this
crucial lemma, which is of interest in its own right.

Lemma 1 [16] Let B be a finite even bipartite graph with bipartition {V, C} of its
vertex set. For any subset X ⊆ C, there exists a 2-edge-coloring σ : E(B) → {1, 2}
such that
(i) $d_{B(1)}(v) = d_{B(2)}(v)$ for all v ∈ V,
(ii) $d_{B(1)}(c) = d_{B(2)}(c)$ for all c ∈ X, and
(iii) $|d_{B(1)}(c) - d_{B(2)}(c)| = 2$ for all c ∈ C \ X
if and only if
(iv) |V(D) ∩ (C \ X)| is even for each component D of B.
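The two fairness notions used in the proof of Theorem 7 are easy to test for a given coloring. The sketch below simply encodes the definitions of equitable and evenly equitable at a vertex; the function names and the list-based encoding of a multigraph are assumptions made for illustration, and loops are assumed absent (as in a bipartite graph).

```python
from collections import Counter


def color_counts(edges, coloring, v, colors):
    """Number of edges at v in each color; edges is a list of (u, w) pairs."""
    counts = Counter({c: 0 for c in colors})
    for idx, (u, w) in enumerate(edges):
        if v in (u, w):
            counts[coloring[idx]] += 1
    return counts


def is_equitable_at(edges, coloring, v, colors):
    """Color counts at v pairwise differ by at most 1."""
    counts = color_counts(edges, coloring, v, colors).values()
    return max(counts) - min(counts) <= 1


def is_evenly_equitable_at(edges, coloring, v, colors):
    """Color counts at v are all even and pairwise differ by at most 2."""
    counts = color_counts(edges, coloring, v, colors).values()
    return all(c % 2 == 0 for c in counts) and max(counts) - min(counts) <= 2


# Hypothetical multigraph: v joined to c1 and to c2 by two edges each.
edges = [("v", "c1"), ("v", "c1"), ("v", "c2"), ("v", "c2")]
coloring = [1, 2, 1, 2]   # coloring[i] is the color of edges[i]
print(is_equitable_at(edges, coloring, "v", [1, 2]),
      is_evenly_equitable_at(edges, coloring, "c1", [1, 2]))
```

In this toy coloring the vertex v sees two edges of each color, while c1 sees one edge of each color, so c1 is equitable but not evenly equitable.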

References

1. McDiarmid, C.J.H.: The solution of a timetabling problem. J. Inst. Math. Appl. 9, 23–34 (1972)
2. Hajnal, A., Szemerédi, E.: Proof of a conjecture of P. Erdős. Combinatorial Theory and its Applications, II, North-Holland, Amsterdam, pp. 601–623 (1970)
3. Lucas, E.: Récréations mathématiques, vol. 2, Gauthier-Villars, Paris (1883)
4. Hoffman, D.G., Rodger, C.A.: The chromatic index of complete multipartite graphs. J. Graph Theor. 16, 159–163 (1992)
5. Lam, P., Shiu, W.C., Tong, C.S., Zhang, Z.F.: On the equitable chromatic number of complete n-partite graphs. Discrete Appl. Math. 113, 307–310 (2001)
6. Laskar, R., Auerbach, B.: On decomposition of r-partite graphs into edge-disjoint Hamiltonian circuits. Discrete Math. 14, 265–268 (1976)
7. Bahmanian, M.A., Rodger, C.: Multiply balanced edge colorings of multigraphs. J. Graph Theor. 70, 297–317 (2012)
8. Hilton, A.J.W.: School timetables, studies on graphs and discrete programming. Ann. Discrete Math. 11, 177–188 (1981)
9. Hilton, A.J.: Hamiltonian decompositions of complete graphs. J. Comb. Theor. (B) 36, 125–134 (1984)
10. Hilton, A.J., Rodger, C.A.: Hamiltonian decompositions of complete regular s-partite graphs. Discrete Math. 58, 63–78 (1986)
11. Andersen, L.D., Hilton, A.J.W., Rodger, C.A.: A solution to the embedding problem for partial idempotent Latin squares. J. London Math. Soc. 26, 21–27 (1982)
12. Rodger, C.A.: Embedding incomplete idempotent latin squares, Combinatorial Mathematics X. Lecture Notes in Mathematics (Springer), vol. 1036, pp. 355–366 (1983)
13. Rodger, C.A.: Embedding an incomplete latin square in a latin square with a prescribed diagonal. Discrete Math. 51, 73–89 (1984)
14. Andersen, L.D., Hilton, A.J.W., Rodger, C.A.: Small embeddings of incomplete idempotent Latin squares. Ann. Discrete Math. 17, 19–31 (1983)
15. Bahmanian, M.A., Rodger, C.: Embedding an edge-colored $K(a^{(p)}; \lambda, \mu)$ into a Hamiltonian decomposition of $K(a^{(p+r)}; \lambda, \mu)$. Graphs Comb. 29, 747–755 (2012)
16. Demir, M., Rodger, C.A.: Embedding an Edge-Coloring of $K(n^{r}; \lambda_1, \lambda_2)$ into a Hamiltonian Decomposition of $K(n^{r+2}; \lambda_1, \lambda_2)$, submitted
17. Hilton, A.J.W.: Canonical edge-colourings of locally finite graphs. Combinatorica 2, 37–51 (1982)
18. Hilton, A.J.W., de Werra, D.: A sufficient condition for equitable edge-colourings of simple graphs. Discrete Math. 128, 179–201 (1994)
19. de Werra, D.: Equitable colorations of graphs, Rev. Française Informat. Recherche Opérationnelle 5, Sér. R-3, 3–8 (1971)
Chapter 2
On Strong Pseudomonotone
and Strong Quasimonotone Maps

Sanjeev Kumar Singh, Avanish Shahi and S. K. Mishra

Abstract We introduce strong pseudomonotone and strong quasimonotone maps
of higher order and establish their relationships with strong pseudoconvexity and
strong quasiconvexity of higher order, respectively, which yields first-order characterizations
of strong pseudoconvex and strong quasiconvex functions of higher
order. Moreover, we answer the open problem (converse part of Proposition 6.2) of
Karamardian and Schaible (J. Optim. Theory Appl. 66:37–46, 1990), for even more
generalized functions, namely strongly pseudoconvex functions of higher order.

Keywords Generalized monotone maps · Generalized convexity · First-order conditions

S. K. Singh · A. Shahi · S. K. Mishra (✉)
Department of Mathematics, Institute of Science, Banaras Hindu University,
Varanasi 221005, India

© Springer Nature Singapore Pte Ltd. 2018

1 Introduction

Minty [9] introduced the concept of monotone maps. Karamardian [5] subsequently
discussed strictly monotone and strongly monotone maps. It is well
known that a differentiable function is convex if and only if its gradient map
is monotone (see [2, 10]). Karamardian [5] stated the relationship between strongly
convex functions and strongly monotone maps. In 1976, Karamardian [4] introduced
the concept of pseudomonotone maps, showed that a differentiable pseudoconvex
function (see [3, 8]) is characterized by pseudomonotonicity of its gradient
map, and used monotonicity/pseudomonotonicity in establishing several existence
theorems for complementarity problems. Further, Karamardian and Schaible [6] introduced
strictly pseudomonotone, quasimonotone, strongly monotone, and strongly
pseudomonotone maps and showed that for gradient maps, these generalized monotonicity
properties are related to generalized convexity properties of the underlying
functions.
Lin and Fukushima [7], along with other results for nonlinear programs and mathematical
programs with equilibrium constraints, introduced strong convexity of order
σ and strongly monotone maps of order σ. They showed that the
strong monotonicity of order σ of the gradient map is related to strong convexity
of order σ of the function. Arora et al. [1] introduced strongly pseudoconvex functions
of order σ and their generalizations to characterize solution sets and optimality
conditions for optimization problems. Arora et al. [1] have also introduced strongly
quasiconvex functions of order σ.
It is natural to extend the concept of strongly monotone maps of order σ
due to Lin and Fukushima [7] to strongly pseudomonotone maps of
order σ and strongly quasimonotone maps of order σ, just as Karamardian
and Schaible [6] extended the concept of monotone maps to pseudomonotone maps.
In 1990, Karamardian and Schaible [6] left the converse
of Proposition 6.2 [6] as an open problem. We answer that open question positively for a
more general class of functions, namely strongly pseudoconvex functions of order σ, which also
extends the strongly convex functions of order σ given by Lin and Fukushima [7].

2 Preliminaries

2.1 Pseudoconvexity and Quasiconvexity

Definition 1 [2, 6] A differentiable function f on an open convex subset X of $\mathbb{R}^n$
is pseudoconvex on X if, for every pair of distinct points x, y ∈ X, we have

$$\langle \nabla f(y), x - y \rangle \ge 0 \;\Rightarrow\; f(x) \ge f(y).$$

Definition 2 [2, 6] A function f is quasiconvex on a convex set X of $\mathbb{R}^n$ if, for all
x, y ∈ X, λ ∈ [0, 1],

$$f(x) \le f(y) \;\Rightarrow\; f(\lambda x + (1 - \lambda) y) \le f(y).$$

Proposition 1 [2, 6] A differentiable function f is quasiconvex on an open convex
set X of $\mathbb{R}^n$ if and only if, for every pair of points x, y ∈ X, we have

$$f(x) \le f(y) \;\Rightarrow\; \langle \nabla f(y), x - y \rangle \le 0.$$

Remark 1 [3] Every pseudoconvex function is quasiconvex, but the converse is not
necessarily true.

2.2 Strong Convexity and Strong Monotonicity of Order σ

Definition 3 [7] Let X be a non-empty open and convex subset of $\mathbb{R}^n$. A function
$f : X \to \mathbb{R}$ is said to be a strongly convex function of order σ if ∃ a constant c > 0
such that

$$f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y) - c \lambda (1 - \lambda) \|x - y\|^{\sigma},$$

for any x, y ∈ X and any λ ∈ [0, 1].

Theorem 1 [7] Let X be a non-empty open and convex subset of $\mathbb{R}^n$. A continuously
differentiable function $f : X \to \mathbb{R}$ is strongly convex of order σ on X if and only if
∃ a constant c > 0 such that

$$f(x) - f(y) \ge \langle \nabla f(y), x - y \rangle + c \|x - y\|^{\sigma}, \quad \forall x, y \in X.$$

Remark 2 [6] For σ = 2,

$$f(x) - f(y) \ge \langle \nabla f(y), x - y \rangle + c \|x - y\|^{2}.$$

This function is referred to as a strongly convex function in the ordinary sense.

Definition 4 [7] Let X be a non-empty open and convex subset of $\mathbb{R}^n$. A mapping
$F : X \to \mathbb{R}^n$ is said to be a strongly monotone map of order σ if ∃ a constant β > 0
such that

$$\langle F(x) - F(y), x - y \rangle \ge \beta \|x - y\|^{\sigma}, \quad \forall x, y \in X.$$

Remark 3 [6] For σ = 2,

$$\langle F(x) - F(y), x - y \rangle \ge \beta \|x - y\|^{2}.$$

This map is referred to as a strongly monotone map in the ordinary sense.

Lin and Fukushima [7] established the relation between strongly convex function of
order σ and strongly monotone map of order σ.
Theorem 2 [7] Let X be a non-empty open and convex subset of $\mathbb{R}^n$. A continuously
differentiable function $f : X \to \mathbb{R}$ is strongly convex of order σ if and only if ∇f is
strongly monotone of order σ on X.
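A simple instance may help fix the definitions. The example below is an illustration added here, not taken from the source: it takes $f(x) = \|x\|^2$ on $X = \mathbb{R}^n$ and checks both characterizations for σ = 2.

```latex
% Illustrative example: f(x) = \|x\|^2 on X = \mathbb{R}^n, so \nabla f(x) = 2x.
\[
  f(x) - f(y) - \langle \nabla f(y), x - y \rangle
  = \|x\|^2 - \|y\|^2 - 2\langle y, x - y \rangle
  = \|x\|^2 - 2\langle x, y \rangle + \|y\|^2
  = \|x - y\|^2 ,
\]
% so the inequality of Theorem 1 holds with \sigma = 2 and c = 1. Moreover
\[
  \langle \nabla f(x) - \nabla f(y), x - y \rangle = 2 \|x - y\|^2 ,
\]
% so \nabla f is strongly monotone of order 2 with \beta = 2, in line with Theorem 2.
```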

3 Strongly Pseudoconvexity of Order σ and Strongly Pseudomonotonicity of Order σ

Definition 5 [1] Let X be a non-empty open and convex subset of $\mathbb{R}^n$. A differentiable
function $f : X \to \mathbb{R}$ is said to be strongly pseudoconvex of order σ on X if
∃ α > 0 such that

$$\langle \nabla f(y), x - y \rangle + \alpha \|x - y\|^{\sigma} \ge 0 \;\Rightarrow\; f(x) \ge f(y), \quad \forall x \ne y \in X.$$

We introduce strongly pseudomonotone map of order σ.

Definition 6 Let X be a non-empty open and convex subset of $\mathbb{R}^n$. A map $F : X \to \mathbb{R}^n$
is said to be strongly pseudomonotone of order σ on X if ∃ β > 0 such that

$$\langle F(y), x - y \rangle + \beta \|x - y\|^{\sigma} \ge 0 \;\Rightarrow\; \langle F(x), x - y \rangle \ge 0, \quad \forall x \ne y \in X.$$

We establish the relationship between strong pseudoconvexity and strong pseudomonotonicity
of order σ, which is the natural generalization of the strongly pseudoconvex
function given by Karamardian and Schaible [6]. Karamardian and Schaible
[6] left the converse of their Proposition 6.2 as an open problem; we prove
both the necessary and the sufficient parts for the more general class of strongly
pseudoconvex functions of order σ.

Theorem 3 Let X be a non-empty open and convex subset of $\mathbb{R}^n$. A continuously
differentiable function $f : X \to \mathbb{R}$ is strongly pseudoconvex of order σ if and only
if ∇f is strongly pseudomonotone of order σ on X.

Proof Let f be strongly pseudoconvex of order σ on X. Then ∃ α > 0 such that

$$\langle \nabla f(y), x - y \rangle + \alpha \|x - y\|^{\sigma} \ge 0 \;\Rightarrow\; f(x) \ge f(y), \quad \forall x \ne y \in X. \tag{1}$$

Every strongly pseudoconvex function of order σ is quasiconvex. Therefore, whenever
the hypothesis of (1) holds, so that f(y) ≤ f(x), we have for any λ ∈ (0, 1)

$$f(\lambda x + (1 - \lambda) y) \le f(x). \tag{2}$$

By using Proposition 1 on Eq. (2),

$$\langle \nabla f(x), (\lambda x + (1 - \lambda) y) - x \rangle \le 0,$$

$$\langle \nabla f(x), (1 - \lambda)(y - x) \rangle \le 0,$$

$$\langle \nabla f(x), x - y \rangle \ge 0.$$

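Therefore, we have

$$\langle \nabla f(y), x - y \rangle + \alpha \|x - y\|^{\sigma} \ge 0 \;\Rightarrow\; \langle \nabla f(x), x - y \rangle \ge 0.$$

Thus, ∇f is strongly pseudomonotone of order σ.

Conversely, suppose that ∇f is strongly pseudomonotone of order σ on X. Then
∃ β > 0 such that

$$\langle \nabla f(y), x - y \rangle + \beta \|x - y\|^{\sigma} \ge 0 \;\Rightarrow\; \langle \nabla f(x), x - y \rangle \ge 0, \quad \forall x \ne y \in X.$$

Equivalently,

$$\langle \nabla f(x), x - y \rangle < 0 \;\Rightarrow\; \langle \nabla f(y), x - y \rangle + \beta \|x - y\|^{\sigma} < 0. \tag{3}$$

We want to show that f is strongly pseudoconvex of order σ.

For this, we have to show ∃ α > 0 such that

$$\langle \nabla f(y), x - y \rangle + \alpha \|x - y\|^{\sigma} \ge 0 \;\Rightarrow\; f(x) \ge f(y), \quad \forall x \ne y \in X. \tag{4}$$

Suppose, on the contrary, that

$$f(x) < f(y).$$

By the mean value theorem, ∃ z = λx + (1 − λ)y, for some λ ∈ (0, 1), such that

$$f(x) - f(y) = \langle \nabla f(z), x - y \rangle, \quad \forall x \ne y \in X. \tag{5}$$

Since z − y = λ(x − y), this gives

$$\langle \nabla f(z), x - y \rangle = \frac{1}{\lambda} \langle \nabla f(z), z - y \rangle < 0.$$

From Eq. (3), we obtain

$$\langle \nabla f(z), z - y \rangle < 0 \;\Rightarrow\; \langle \nabla f(y), z - y \rangle + \beta \|z - y\|^{\sigma} < 0,$$

$$\langle \nabla f(z), z - y \rangle < 0 \;\Rightarrow\; \lambda \left[ \langle \nabla f(y), x - y \rangle + \beta \lambda^{\sigma - 1} \|x - y\|^{\sigma} \right] < 0,$$

$$\langle \nabla f(z), z - y \rangle < 0 \;\Rightarrow\; \langle \nabla f(y), x - y \rangle + \beta \lambda^{\sigma - 1} \|x - y\|^{\sigma} < 0,$$

which contradicts that

$$\langle \nabla f(y), x - y \rangle + \alpha \|x - y\|^{\sigma} \ge 0.$$

So, f(x) ≥ f(y) and hence f is strongly pseudoconvex of order σ.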
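The passage from the first to the second implication in the chain obtained from Eq. (3) rests on a scaling identity that the proof leaves implicit. A short worked version, using only the definition of z, is:

```latex
% With z = \lambda x + (1 - \lambda) y we have z - y = \lambda (x - y), so
\[
  \langle \nabla f(y), z - y \rangle + \beta \|z - y\|^{\sigma}
  = \lambda \langle \nabla f(y), x - y \rangle + \beta \lambda^{\sigma} \|x - y\|^{\sigma}
  = \lambda \left[ \langle \nabla f(y), x - y \rangle
      + \beta \lambda^{\sigma - 1} \|x - y\|^{\sigma} \right],
\]
% and dividing by \lambda > 0 preserves the strict inequality.
```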

Remark 4 [1] Every strongly pseudoconvex function of order σ is pseudoconvex,
but the converse is not necessarily true.

Fig. 1 Strongly pseudomonotone map of order σ.

Remark 5 Every strongly monotone map of order σ is strongly pseudomonotone

map of order σ , but the converse is not necessarily true.

Example 1 Let F : R → R, defined by F(x) = 1 − x, x ∈ R.

Here, F is strongly pseudomonotone of order σ but not strongly monotone of
order σ (Fig. 1).

4 Strong Quasiconvexity and Strong Quasimonotonicity of Order σ

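Definition 7 [1] Let $X$ be a non-empty open convex subset of $\mathbb{R}^n$. A differentiable
function $f : X \to \mathbb{R}$ is said to be strongly quasiconvex of order $\sigma$ if there exists $\alpha > 0$
such that

\[ f(x) \le f(y) \;\Rightarrow\; \langle \nabla f(y), x - y \rangle + \alpha \|x - y\|^{\sigma} \le 0, \quad \forall x \ne y \in X. \]

We now introduce the strongly quasimonotone map of order $\sigma$.

Definition 8 Let $X$ be a non-empty open convex subset of $\mathbb{R}^n$. A map $F : X \to \mathbb{R}^n$
is said to be strongly quasimonotone of order $\sigma$ if there exists $\beta > 0$ such that

\[ \langle F(y), x - y \rangle > 0 \;\Rightarrow\; \langle F(x), x - y \rangle \ge \beta \|x - y\|^{\sigma}, \quad \forall x \ne y \in X. \]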


Theorem 4 Let $X$ be a non-empty open convex subset of $\mathbb{R}^n$. A continuously
differentiable function $f : X \to \mathbb{R}$ is strongly quasiconvex of order $\sigma$ if and only if
$\nabla f$ is strongly quasimonotone of order $\sigma$ on $X$.

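Proof Let $f$ be a strongly quasiconvex function of order $\sigma$ on $X$. Then there exists $\alpha > 0$
such that

\[ f(x) \le f(y) \;\Rightarrow\; \langle \nabla f(y), x - y \rangle + \alpha \|x - y\|^{\sigma} \le 0, \quad \forall x \ne y \in X. \tag{6} \]

We have to show that $\nabla f$ is strongly quasimonotone of order $\sigma$ on $X$, that is, that there
exists $\beta > 0$ such that

\[ \langle \nabla f(y), x - y \rangle > 0 \;\Rightarrow\; \langle \nabla f(x), x - y \rangle \ge \beta \|x - y\|^{\sigma}, \quad \forall x \ne y \in X. \]

Since every strongly quasiconvex function of order $\sigma$ is quasiconvex, we have

\[ \langle \nabla f(y), x - y \rangle > 0 \;\Rightarrow\; f(x) > f(y). \tag{7} \]

As $f$ is strongly quasiconvex of order $\sigma$, Eq. (6) with the roles of $x$ and $y$ interchanged gives

\[ f(y) < f(x) \;\Rightarrow\; \langle \nabla f(x), y - x \rangle + \alpha \|y - x\|^{\sigma} \le 0, \]

\[ f(y) < f(x) \;\Rightarrow\; \langle \nabla f(x), y - x \rangle \le -\alpha \|y - x\|^{\sigma}, \]

\[ f(y) < f(x) \;\Rightarrow\; \langle \nabla f(x), x - y \rangle \ge \alpha \|x - y\|^{\sigma}. \]

Therefore,

\[ \langle \nabla f(y), x - y \rangle > 0 \;\Rightarrow\; \langle \nabla f(x), x - y \rangle \ge \alpha \|x - y\|^{\sigma}. \]

So, $\nabla f$ is strongly quasimonotone of order $\sigma$ (with $\beta = \alpha$).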

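Conversely, let $\nabla f$ be strongly quasimonotone of order $\sigma$. Then there exists $\beta > 0$ such
that

\[ \langle \nabla f(y), x - y \rangle > 0 \;\Rightarrow\; \langle \nabla f(x), x - y \rangle \ge \beta \|x - y\|^{\sigma}, \quad \forall x \ne y \in X. \tag{8} \]

We have to prove that $f$ is a strongly quasiconvex function of order $\sigma$.

For this, we have to prove that there exists $\beta > 0$ such that

\[ f(x) \le f(y) \;\Rightarrow\; \langle \nabla f(y), x - y \rangle + \beta \|x - y\|^{\sigma} \le 0, \quad \forall x \ne y \in X. \]

Equivalently,

\[ \langle \nabla f(y), x - y \rangle + \beta \|x - y\|^{\sigma} > 0 \;\Rightarrow\; f(x) > f(y). \tag{9} \]

Suppose, on the contrary, that $f(x) \le f(y)$.

By the mean value theorem, there exists $z = \lambda x + (1 - \lambda) y$, for some $\lambda \in (0, 1)$, such that

\[ f(x) - f(y) = \langle \nabla f(z), x - y \rangle = \frac{1}{\lambda} \langle \nabla f(z), z - y \rangle \le 0 \;\Rightarrow\; \langle \nabla f(z), y - z \rangle > 0, \quad \forall x \ne y \in X. \]

Since $\nabla f$ is strongly quasimonotone of order $\sigma$, Eq. (8) together with $y - z = \lambda (y - x)$ gives

\[ \langle \nabla f(z), y - z \rangle > 0 \;\Rightarrow\; \langle \nabla f(y), y - z \rangle \ge \beta \|y - z\|^{\sigma}, \]

\[ \langle \nabla f(z), y - z \rangle > 0 \;\Rightarrow\; \lambda \langle \nabla f(y), y - x \rangle \ge \beta \lambda^{\sigma} \|y - x\|^{\sigma}, \]

\[ \langle \nabla f(z), y - z \rangle > 0 \;\Rightarrow\; \langle \nabla f(y), x - y \rangle \le -\beta \lambda^{\sigma - 1} \|x - y\|^{\sigma}, \]

\[ \langle \nabla f(z), y - z \rangle > 0 \;\Rightarrow\; \langle \nabla f(y), x - y \rangle + \alpha \|x - y\|^{\sigma} \le 0 \quad (\alpha = \beta \lambda^{\sigma - 1}), \]

which contradicts the left-hand inequality of Statement (9).

Hence, $f(x) > f(y)$ and $f$ is a strongly quasiconvex function of order $\sigma$.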

Remark 6 Every strongly quasiconvex function of order σ is a quasiconvex function,

but the converse is not always true (Fig. 2).

Example 2 Let $f(x) = 1 - x^{3}$ on $X = \mathbb{R}$.
Here, $f$ is quasiconvex but not strongly quasiconvex of order $\sigma$.
Indeed, $f(x) \le f(y) \Rightarrow \langle \nabla f(y), x - y \rangle \le 0$.

Fig. 2 Quasiconvex function


Fig. 3 Strongly quasimonotone map of order σ.

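Therefore, $f$ is a quasiconvex function. On the other hand, if we take $x = \tfrac{1}{2}$, $y = 0$,
then $f(x) \le f(y)$.
But $\langle \nabla f(y), x - y \rangle + \alpha \|x - y\|^{\sigma} \le 0$ reads $-3y^{2}(x - y) + \alpha |x - y|^{\sigma} \le 0$.
At $x = \tfrac{1}{2}$, $y = 0$, this inequality gives $\alpha \left(\tfrac{1}{2}\right)^{\sigma} \le 0$, that is, $\alpha \le 0$.
But $\alpha$ must be positive, so the inequality cannot hold for any admissible $\alpha$.
Hence, $f$ is not strongly quasiconvex of order $\sigma$.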
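The computation in Example 2 can be checked numerically. The following is a minimal sketch, not part of the original paper; the function names and the sample values of α and σ are illustrative assumptions. It evaluates the left-hand side of the strong quasiconvexity inequality at x = 1/2, y = 0 and confirms that it is strictly positive for every positive α tried.

```python
def f(x):
    # Function from Example 2.
    return 1.0 - x ** 3


def grad_f(x):
    # Derivative of f.
    return -3.0 * x ** 2


def strong_quasiconvexity_lhs(x, y, alpha, sigma):
    """Left-hand side <grad f(y), x - y> + alpha * |x - y|**sigma (one-dimensional case)."""
    return grad_f(y) * (x - y) + alpha * abs(x - y) ** sigma


def violates_definition(x, y, alpha, sigma):
    """True if f(x) <= f(y) holds but the defining inequality fails at (x, y)."""
    return f(x) <= f(y) and strong_quasiconvexity_lhs(x, y, alpha, sigma) > 0.0
```

For instance, `violates_definition(0.5, 0.0, alpha, sigma)` is true for any positive `alpha`, because the gradient term vanishes at y = 0 and only the positive term α(1/2)^σ remains.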

Remark 7 Since the quasi-type classes are the largest classes, every strongly pseudomonotone
map of order $\sigma$ is a strongly quasimonotone map of order $\sigma$, but the converse is not
always true.

Example 3 Define $F : X = [-1, 2] \to \mathbb{R}$ by

\[ F(x) = \begin{cases} 0 & \text{for } -1 \le x < 0, \\ x & \text{for } 0 \le x < 1, \\ 2 - x & \text{for } 1 \le x \le 2. \end{cases} \]

Here, $F$ is a strongly quasimonotone map of order $\sigma$ but not a strongly pseudomonotone
map of order $\sigma$ (Fig. 3).

Acknowledgements The first author is financially supported by CSIR-UGC JRF, New Delhi,
India, through Reference no.: 1272/(CSIR-UGC NET DEC.2016). The second author is
financially supported by UGC-BHU Research Fellowship, through sanction letter no:
Ref.No./Math/Res/Sept.2015/2015-16/918.

References

1. Arora, P., Bhatia, G., Gupta, A.: Characterization of the solution sets and sufficient optimality
criteria via higher-order strong convexity. Topics in Nonconvex Optimization, vol. 50, pp.
231–242. Springer Optim Appl, New York (2011)
2. Avriel, M., Diewert, W.E., Schaible, S., Zang, I.: Generalized Concavity. Plenum Publishing
Corporation, New York (1988)
3. Cambini, A., Martein, L.: Generalized convexity and optimization. Lecture notes in Economics
and Mathematical systems, vol. 616. Springer, Berlin (2009)
4. Karamardian, S.: Complementarity problems over cones with monotone and pseudomonotone
maps. J. Optim. Theor. Appl. 18, 445–454 (1976)
5. Karamardian, S.: The nonlinear complementarity problem with applications, Part 2. J. Optim.
Theor. Appl. 4, 167–181 (1969)
6. Karamardian, S., Schaible, S.: Seven kinds of monotone maps. J. Optim. Theor. Appl. 66,
37–46 (1990)
7. Lin, G.H., Fukushima, M.: Some exact penalty results for nonlinear programs and Mathematical
programs with equilibrium constraints. J. Optim. Theor. Appl. 118, 67–80 (2003)
8. Mangasarian, O.L.: Nonlinear programming. Corrected Reprint of the 1969 Original, Classical
Appl Math, Society for Industrial and Applied Mathematics, SIAM, vol. 10. Philadelphia, PA
(1994)
9. Minty, G.J.: On the monotonicity of the gradient of a convex function. Pacific J. Math. 14,
43–47 (1964)
10. Ortega, J.M., Rheinboldt, W.C.: Iterative Solution of Nonlinear Equations in Several
Variables. Academic Press, New York (1970)
Chapter 3
A Dynamic Non-preemptive Priority
Queueing Model with Two Types
of Customers

Srinivas R. Chakravarthy

Abstract In this paper, we study a single-server non-preemptive priority queueing

model with two types of customers. The customers arrive according to two indepen-
dent Poisson processes, and the service times are exponential with possibly different
parameters. While Type 1 customers, who have non-preemptive priority over Type 2
customers, have a finite waiting room, Type 2 customers have no such restriction. A
new dynamic rule based on a predetermined threshold is applied in offering services
to lower-priority customers (when higher-priority customers are present) whenever
the server becomes free. Using matrix-analytic methods, we analyze the model in
steady state and bring out some qualitative and interesting aspects of the model under
study. We also compare our model to the classical two-customer non-preemptive pri-
ority model to show a marked improvement in the quality of service to customers
under the proposed threshold model.

Keywords Queueing · Dynamic non-preemptive priority · Matrix-analytic

method · Algorithmic probability

1 Introduction

Preemptive and non-preemptive queueing models have been studied extensively in

the literature ever since the classical books on this topic appeared (see, e.g., [4, 7, 15]).
Such models have applications in many areas, notably in telecommunications (see,
e.g., [14, 15]). Traditional preemptive and non-preemptive queueing models are such
that higher-priority customers are first attended before lower-priority customers on a
first-come-first-served basis. To avoid excessive delays for lower-priority customers,
several modifications to how the preemptive rules are applied have been introduced
in the literature. Using the notion of preemptive distance (which is defined as, in
the multi-priority queueing model, the difference between the indices of priority

S. R. Chakravarthy (B)
Departments of Industrial and Manufacturing Engineering & Mathematics,
Kettering University, Flint, MI 48504, USA
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2018 23

classes), several models have been studied (see, e.g., [1, 12, 16]). In [6], the authors
employ the number of preemptions as a cutoff point for intervening higher-priority
customers to yield to lower-priority ones. Using discretion rules such as placing a
threshold on the accumulated service effort so as to block further preemptions, a
number of models have been studied (see, e.g., [2, 5, 7]). With the help of a threshold
policy, the authors in [3] introduced preemption policies based on whether (a) a certain
proportion of the service requirement has been met; (b) a certain number of time units
of service has been completed; or (c) the time remaining for the current service is less
than a pre-specified limit.
All of the papers mentioned above analyzed the queueing models under various
assumptions for the arrivals, the services, and the nature of the buffer space (finite or
infinite) and derived several system performance measures. Recently, Kim [8] introduced
a hysteretic-type threshold policy, which depends on the number of customers of one
particular type present in the system, to determine the priority of two types of
customers as well as the rule for switching from one type to the other by preempting
the (lower-priority) customer in service. More specifically, the author in [8] considers
a single-server queue with two types of customers and an (N, n)-preemptive
priority rule, which operates as follows. Whenever the number of Type 1 customers
in the system reaches N, N ≥ 1, while a Type 2 customer is in service,
that customer is preempted and Type 1 customers (who thus gain priority
over Type 2) are served on a first-come-first-served basis. The server returns to
the preempted Type 2 customer and the other Type 2 customers when the number of
Type 1 customers drops to n, 0 ≤ n < N. Under the assumption of Poisson arrivals
and general services, the author shows that this priority discipline enables one
to control (within a certain range) the first and second moments of the queue length
of high-priority customers, and thus the quality of service (QoS) can be improved.
Note that the focus in this model is on the QoS from the
higher-priority (Type 1) customers' point of view (even though they are already
given a higher priority when the upper threshold is reached). Further, the services
for Type 2 customers are resumed only when the number of Type 1 customers hits
the lower threshold upon completion of a Type 1 service.
All the models referenced in the above papers involve preemption in one form or
another, causing a disruption in service for one or more types of customers. Our
paper focuses on a non-preemptive priority queueing model
with a new (dynamic) threshold rule such that the lower-priority customers do not
have to wait excessively long. Note that in the classical non-preemptive priority
queueing model, lower-priority customers get pushed out to accommodate higher-priority
ones and hence have to wait for a longer period of time. Also, our model differs
significantly from the existing models in the literature, including the one considered in [8],
by (a) dynamically clearing lower-priority customers as opposed to focusing only
on one type of customers through the threshold and (b) focusing on the QoS from the
lower-priority customers' point of view as well.

The paper is organized as follows. In Sect. 2, we describe the model under study
in more detail and set up the needed notation for understanding the rest of the paper.
The steady-state analysis of the model is performed in Sect. 3, and the classical non-
preemptive priority queueing model is shown to be the limiting case of the current
model in Sect. 4. The comparison of our model and the corresponding classical non-
preemptive priority queueing model without threshold is carried out in Sect. 5. Some
illustrative examples are presented in Sect. 6, and concluding remarks are outlined
in Sect. 7.

2 Model Description and Notation

Two types of customers, say, Type 1 and Type 2, arrive according to two independent
Poisson processes with rate λ1 and λ2 , respectively, to a single-server system. We
assume that the service times of Type i customers are exponentially distributed with
parameter μi , i = 1, 2. Type 1 customers have a waiting area of a finite capacity of
size, say, K, while Type 2 customers have no limit in the waiting area. Thus, any
arriving Type 1 customers finding the buffer full will be lost. We introduce a new
non-preemptive priority rule to offer services to both types of customers as follows.
There is a threshold, say, N , N ≥ 1, such that upon completion of the current service,
the server either (a) becomes idle due to no customers waiting in the system; or (b)
chooses the customer from the nonempty queue (as only one type of customers is
present at that time); or (c) chooses a Type 1 customer to offer service unless the
number of Type 2 customers waiting in the system is at least N plus the number of
waiting Type 1 customers. That is, the server will offer a service to a Type 2 customer
if, say, there are i Type 1 customers and the number of Type 2 customers is at least
N + i, for 1 ≤ i ≤ K, N ≥ 1.
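The selection rule applied at a service completion can be sketched as a small function. This is an illustrative sketch only; the function name, argument names, and the integer return codes are assumptions introduced here, and the counts refer to customers waiting after the departure.

```python
def next_service(n1_waiting, n2_waiting, N):
    """Decide what the server does after a service completion.

    n1_waiting, n2_waiting: numbers of waiting Type 1 and Type 2 customers.
    N: threshold parameter (N >= 1).
    Returns 0 (server idles), 1 (serve Type 1) or 2 (serve Type 2).
    """
    if N < 1:
        raise ValueError("threshold N must be at least 1")
    if n1_waiting == 0 and n2_waiting == 0:
        return 0  # case (a): nobody is waiting
    if n1_waiting == 0:
        return 2  # case (b): only Type 2 customers are present
    if n2_waiting == 0:
        return 1  # case (b): only Type 1 customers are present
    # case (c): Type 1 has priority unless Type 2 exceeds the dynamic threshold
    return 2 if n2_waiting >= N + n1_waiting else 1
```

With N = 3 and two waiting Type 1 customers, the server picks a Type 2 customer only once at least five Type 2 customers are waiting.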
For use in the sequel, we define a number of auxiliary quantities.
• λ = λ1 + λ2 . This gives the total rate of customers arriving to the system. Note
that some of Type 1 customers may be lost due to their buffer being full. So, λ
may not always be the effective total arrival rate to the system.
• By e, we will denote a column vector (of dimension K + 1) of 1’s.
• By ei , we will denote a unit column vector (of dimension K + 1) with 1 in the ith
position and 0 elsewhere.
• By I an identity matrix (of dimension K + 1).
• Suppose that $a$ is a vector of dimension $K + 1$ whose $j$th element is $a_j$. Then,
we denote by $\Delta(a)$ the diagonal matrix of order $K + 1$ with diagonal elements
$a_j$, $1 \le j \le K + 1$.
• By $\tilde{\Delta}_i$, we denote the diagonal matrix of order $K + 1$ given by $\tilde{\Delta}_i = \Delta\left(\sum_{k=1}^{i} e_k\right)$.
• $\Delta_i$, $1 \le i \le K$, is a square matrix of order $K + 1$ whose nonzero entries are
1 and appear in the $(i + j, i + j - 1)$th positions, for $1 \le j \le K - i + 1$. That is,

\[
\tilde{\Delta}_i = \operatorname{diag}(\underbrace{1, \ldots, 1}_{i}, \underbrace{0, \ldots, 0}_{K+1-i}),
\qquad
\Delta_i = \bordermatrix{
       & 1 & \cdots & i & \cdots & K & K+1 \cr
1      &   &        &   &        &   &     \cr
\vdots &   &        &   &        &   &     \cr
i      &   &        &   &        &   &     \cr
i+1    &   &        & 1 &        &   &     \cr
\vdots &   &        &   & \ddots &   &     \cr
K+1    &   &        &   &        & 1 &     \cr
}.
\tag{1}
\]

[Note: Here and in the sequel, we will use blank space in matrices or vectors to
correspond to the entry being zero unless we need to display 0 for more clarity.]
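• The matrix $F$ of dimension $K + 1$ is defined as

\[
F = \begin{pmatrix}
-\lambda & \lambda_1 & & & \\
 & -\lambda & \lambda_1 & & \\
 & & \ddots & \ddots & \\
 & & & -\lambda & \lambda_1 \\
 & & & & -\lambda_2
\end{pmatrix}.
\tag{2}
\]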

• Should there be a need to display $I$, $e$, or $e_i$ of a dimension other than
$K + 1$, we will do so by writing, say, $I_m$, $e(m)$, or $e_i(m)$ to explicitly identify the
dimension $m$.
• Finally, we will use the notation “$'$” appearing as a superscript on a vector or a matrix
to denote its transpose.

3 The Steady-State Analysis

The steady-state analysis of the model described in Sect. 2 is carried out in this
section. First, we define $N_1(t)$, $N_2(t)$, and $J(t)$, respectively, to be the number of
Type 1 customers in the system, the number of Type 2 customers in the system, and the
status of the server at time $t$. Note that the server can be either idle
($J(t) = 0$), busy serving a Type 1 customer ($J(t) = 1$), or busy serving a Type 2
customer ($J(t) = 2$). The process $\{(N_2(t), N_1(t), J(t)) : t \ge 0\}$ is a continuous-time
Markov chain with state space given by

\[
\Omega = \{(0, 0, 0)\} \cup \{(0, i_1, 1) : 1 \le i_1 \le K + 1\}
\cup \{(i_2, i_1, r) : 2 - r \le i_1 \le K + 2 - r,\ r = 1, 2,\ i_2 \ge 1\}.
\tag{3}
\]

We now define the sets of states along with their meanings as follows.
• $\boldsymbol{*} = \{(0, 0, 0)\}$. This corresponds to the system being idle.
• $\boldsymbol{0} = \{(0, i_1, 1) : 1 \le i_1 \le K + 1\}$. This corresponds to the case when there are $i_1$
Type 1 customers, including the one in service, and no Type 2 customers in the
system.
• $\boldsymbol{i_2} = \{(i_2, i_1, r) : 2 - r \le i_1 \le K + 2 - r,\ r = 1, 2\}$, $i_2 \ge 1$. This set of states
corresponds to the case when there are $i_2$ Type 2 customers and $i_1$ Type 1 customers in the
system, and the server is busy serving a Type $r$ customer. Note that when the server
is busy with a Type 1 customer, the number of such customers can be between 1
and $K + 1$, whereas when the server is busy with a Type 2 customer, the number
of Type 1 customers will be between 0 and $K$.
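The infinitesimal generator of the Markov chain governing the system is of the
form:

\[
Q = \begin{pmatrix}
-\lambda & \lambda_1 e_1' & \lambda_2 h' & & & & & & & & & & & & \\
\mu_1 e_1 & C_1 & C_0 & & & & & & & & & & & & \\
\mu_2 h & \tilde{B}_2 & B_1 & A_0 & & & & & & & & & & & \\
 & & B_2 & B_1 & A_0 & & & & & & & & & & \\
 & & & \ddots & \ddots & \ddots & & & & & & & & & \\
 & & & & B_2 & B_1 & A_0 & & & & & & & & \\
 & & & & & B_2 & E_1 & A_0 & & & & & & & \\
 & & & & & & E_{2,1} & E_2 & A_0 & & & & & & \\
 & & & & & & & E_{3,2} & E_3 & A_0 & & & & & \\
 & & & & & & & & \ddots & \ddots & \ddots & & & & \\
 & & & & & & & & & E_{K-1,K-2} & E_{K-1} & A_0 & & & \\
 & & & & & & & & & & E_{K,K-1} & A_1 & A_0 & & \\
 & & & & & & & & & & & A_2 & A_1 & A_0 & \\
 & & & & & & & & & & & & \ddots & \ddots & \ddots
\end{pmatrix},
\tag{4}
\]

where

\[
h = e_{K+2}(2K + 2), \quad C_0 = \lambda_2 \begin{pmatrix} I & O \end{pmatrix}, \quad
C_1 = F - \mu_1 I + \mu_1 \Delta_1, \quad
\tilde{B}_2 = \mu_2 \begin{pmatrix} O \\ \Delta_1 \end{pmatrix},
\]

\[
B_1 = \begin{pmatrix} C_1 & \mu_1 \tilde{\Delta}_1 \\ O & F - \mu_2 I \end{pmatrix}, \quad
B_2 = \mu_2 \begin{pmatrix} O & O \\ \Delta_1 & \tilde{\Delta}_1 \end{pmatrix},
\tag{5}
\]

\[
E_i = \begin{pmatrix} F - \mu_1 I + \mu_1 \Delta_{i+1} & \mu_1 \tilde{\Delta}_{i+1} \\ O & F - \mu_2 I \end{pmatrix}, \quad
E_{i+1,i} = \mu_2 \begin{pmatrix} O & O \\ \Delta_{i+1} & \tilde{\Delta}_{i+1} \end{pmatrix}, \quad 1 \le i \le K - 1,
\tag{6}
\]

\[
A_1 = \begin{pmatrix} F - \mu_1 I & \mu_1 I \\ O & F - \mu_2 I \end{pmatrix}, \quad
A_2 = \mu_2 \begin{pmatrix} O & O \\ O & I \end{pmatrix}, \quad
A_0 = \lambda_2 I_{2K+2}.
\tag{7}
\]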

It should be pointed out that $C_0$ is of dimension $(K + 1) \times (2K + 2)$; $C_1$ is a square
matrix of dimension $K + 1$; $\tilde{B}_2$ is of dimension $(2K + 2) \times (K + 1)$; and $B_1$, $B_2$, $A_0$, $A_1$, $A_2$,
$E_i$, and $E_{i+1,i}$, $1 \le i \le K - 1$, are all square matrices of dimension $2K + 2$.
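A quick way to sanity-check the block structure of the generator is to assemble the level blocks and confirm that every block row sums to zero. The sketch below is not from the paper and uses plain Python lists. It assumes one particular reading of the blocks: the upper-left corner of each local block is F − μ1 I plus μ1 times a matrix with ones in positions (m + j, m + j − 1), the upper-right corner is μ1 times a diagonal matrix with ones in its first m diagonal positions, and the downward block carries the same two matrices scaled by μ2 in its lower half. All function names are illustrative.

```python
def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]


def identity(n):
    return [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]


def diag_ones(m, K):
    """Diagonal matrix of order K+1 with ones in the first m diagonal positions."""
    return [[1.0 if r == c and r < m else 0.0 for c in range(K + 1)]
            for r in range(K + 1)]


def shift(m, K):
    """Order K+1 matrix with ones at 1-based positions (m+j, m+j-1), 1 <= j <= K-m+1."""
    out = zeros(K + 1, K + 1)
    for j in range(1, K - m + 2):
        out[m + j - 1][m + j - 2] = 1.0
    return out


def f_matrix(lam1, lam2, K):
    """F: -(lam1+lam2) on the diagonal (last entry -lam2), lam1 on the superdiagonal."""
    out = zeros(K + 1, K + 1)
    for r in range(K):
        out[r][r] = -(lam1 + lam2)
        out[r][r + 1] = lam1
    out[K][K] = -lam2
    return out


def lin(*terms):
    """Linear combination sum(coef * matrix) of equally sized matrices."""
    rows, cols = len(terms[0][1]), len(terms[0][1][0])
    return [[sum(coef * mat[r][c] for coef, mat in terms) for c in range(cols)]
            for r in range(rows)]


def block2x2(tl, tr, bl, br):
    return [a + b for a, b in zip(tl, tr)] + [a + b for a, b in zip(bl, br)]


def local_block(m, lam1, lam2, mu1, mu2, K):
    """Assumed reading: m = 1 gives B1, m = i + 1 gives E_i, m = K + 1 gives A1."""
    F, I, O = f_matrix(lam1, lam2, K), identity(K + 1), zeros(K + 1, K + 1)
    top_left = lin((1.0, F), (-mu1, I), (mu1, shift(m, K)))
    return block2x2(top_left, lin((mu1, diag_ones(m, K))), O, lin((1.0, F), (-mu2, I)))


def down_block(m, mu2, K):
    """Assumed reading: m = 1 gives B2, m = i + 1 gives E_{i+1,i}, m = K + 1 gives A2."""
    O = zeros(K + 1, K + 1)
    return block2x2(O, O, lin((mu2, shift(m, K))), lin((mu2, diag_ones(m, K))))


def generator_row_sums(m_down, m_local, lam1, lam2, mu1, mu2, K):
    """Row sums of [down, local, A0] for one block row, with A0 = lam2 * I."""
    down = down_block(m_down, mu2, K)
    local = local_block(m_local, lam1, lam2, mu1, mu2, K)
    return [sum(d) + sum(l) + lam2 for d, l in zip(down, local)]
```

Under this reading, the block rows of the generator pair the indices (1, 1), (m, m + 1) for 1 ≤ m ≤ K, and (K + 1, K + 1); calling `generator_row_sums` on each pair should return values that vanish up to rounding.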

3.1 The Stability Condition

First note that the non-preemptive priority queueing model under study, with the dynamic
priority rule dictated by the threshold parameter $N$, is governed by a Markov process
whose generator [see Eq. (4)] has a modified quasi-birth-and-death (QBD) form. Further,
the matrix $A = A_0 + A_1 + A_2$ is upper triangular and hence is reducible. Thus,
we can adapt Theorem 1.4.1 in [10] to our model and obtain the following theorem.
Theorem 1 The queueing system under study is stable if and only if the following
condition is satisfied.
\[ \lambda_2 < \mu_2. \tag{8} \]

Proof Adapting Theorem 1.4.1 in [10], we see that the system under study is stable
if and only if
\[ \frac{(A_0)_{2K+2,2K+2}}{(A_2)_{2K+2,2K+2}} = \frac{\lambda_2}{\mu_2} < 1. \]

Note: It should be pointed out that the stability condition for the classical two-customer
non-preemptive priority queueing model (i.e., our current model without
the threshold $N$) depends not only on $\lambda_2$ and $\mu_2$ but also on the other parameters, namely
$\lambda_1$, $\mu_1$, and $K$. We will discuss this in more detail in Sect. 4.

3.2 The Steady-State Probability Vector

The steady-state probability vector, $x$, of $Q$ satisfying

\[ x Q = 0, \quad x e = 1, \tag{9} \]

is partitioned into vectors of smaller dimensions as follows.

\[
\begin{aligned}
x &= (x^*, u_0, x_1, x_2, \ldots), \quad x_i = (u_i, v_i), \quad i \ge 1, \\
u_i &= (u_{i,1}, u_{i,2}, \ldots, u_{i,K+1}), \quad i \ge 0, \qquad v_i = (v_{i,0}, v_{i,1}, \ldots, v_{i,K}), \quad i \ge 1.
\end{aligned}
\tag{10}
\]

Under the stability condition given in (8), the steady-state probability vector $x$ is
obtained (see, e.g., [10]) as follows

\[
\begin{aligned}
& -\lambda x^* + \mu_1 u_{0,1} + \mu_2 v_{1,0} = 0, \\
& \lambda_1 x^* e_1' + u_0 C_1 + \mu_2 v_1 \Delta_1 = 0, \\
& \lambda_2 u_0 + u_1 C_1 + \mu_2 v_2 \Delta_1 = 0, \\
& \lambda_2 x^* e_1' + \mu_1 u_1 \tilde{\Delta}_1 + v_1 (F - \mu_2 I) + \mu_2 v_2 \tilde{\Delta}_1 = 0, \\
& \lambda_2 u_{i-1} + u_i C_1 + \mu_2 v_{i+1} \Delta_1 = 0, \\
& \lambda_2 v_{i-1} + \mu_1 u_i \tilde{\Delta}_1 + v_i (F - \mu_2 I) + \mu_2 v_{i+1} \tilde{\Delta}_1 = 0, \quad 2 \le i \le N, \\
& \lambda_2 u_{i-1} + u_i (F - \mu_1 I + \mu_1 \Delta_{i+1-N}) + \mu_2 v_{i+1} \Delta_{i+1-N} = 0, \\
& \lambda_2 v_{i-1} + \mu_1 u_i \tilde{\Delta}_{i+1-N} + v_i (F - \mu_2 I) + \mu_2 v_{i+1} \tilde{\Delta}_{i+1-N} = 0, \quad N + 1 \le i \le N + K - 1, \\
& x_{N+i} = x_{N+K-1} R^{\,i+1-K}, \quad i \ge K,
\end{aligned}
\tag{11}
\]

where the matrix $R$ is the minimal nonnegative solution to the matrix quadratic
equation:
\[ R^2 A_2 + R A_1 + A_0 = 0, \tag{12} \]

and with the normalizing condition

\[ x^* + \sum_{i=0}^{N+K-2} u_i e + \sum_{i=1}^{N+K-2} v_i e + x_{N+K-1} (I - R)^{-1} e = 1. \tag{13} \]

The computation of the steady-state vector, $x$, can be carried out by exploiting the
special structure of the coefficient matrices appearing in (11), and the details are
omitted. Once the steady-state vector, $x$, is obtained, a number of key system
performance measures can be obtained. For our focus in this paper, we will consider a
few such measures. Two of them will be defined here along with their formulas, and
the rest will be presented in appropriate places below. The mean number of Type $i$
customers in the system for the threshold model, denoted by $\mu^{(T)}_{T_i}$, $i = 1, 2$, is given
by

\[ \mu^{(T)}_{T_1} = \sum_{j=1}^{K+1} j \sum_{i=0}^{\infty} u_{i,j} \quad \text{and} \]

\[ \mu^{(T)}_{T_2} = (N + K - 2)\, x_{N+K-1} (I - R)^{-1} e + x_{N+K-1} (I - R)^{-2} e + \sum_{i=1}^{N+K-2} i\, x_i e. \]

3.3 Rate Matrix (R)

Due to the special structure of the matrices $A_0$, $A_1$, and $A_2$, the rate matrix, $R$, is also
upper triangular, a structure that can be exploited in its computation.
While the logarithmic reduction method [9] for computing $R$ is more efficient, in order
to exploit the special structure, especially when $K$ is large, one may want to consider
other well-known methods such as the (block) Gauss–Seidel iterative method. Since
these methods are well documented in the literature, we refer the reader to
references such as [9, 13] for details.
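For illustration only, the minimal nonnegative solution of the quadratic equation for R can also be approximated by classical successive substitution, R ← (A0 + R² A2)(−A1)⁻¹ starting from R = 0. This is neither logarithmic reduction nor block Gauss–Seidel; it is a simpler and slower scheme, shown here as a hedged sketch in plain Python. All function names and the tolerance defaults are assumptions.

```python
def mat_mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
            for r in range(len(a))]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_inv(a):
    """Gauss-Jordan inverse with partial pivoting (adequate for small blocks)."""
    n = len(a)
    aug = [[float(v) for v in row] + [1.0 if r == c else 0.0 for c in range(n)]
           for r, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def rate_matrix(A0, A1, A2, tol=1e-13, max_iter=200000):
    """Successive substitution for R^2 A2 + R A1 + A0 = 0, started at R = 0."""
    n = len(A0)
    neg_A1_inv = mat_inv([[-v for v in row] for row in A1])
    R = [[0.0] * n for _ in range(n)]
    for _ in range(max_iter):
        R_next = mat_mul(mat_add(A0, mat_mul(mat_mul(R, R), A2)), neg_A1_inv)
        change = max(abs(x - y) for rn, ro in zip(R_next, R) for x, y in zip(rn, ro))
        R = R_next
        if change < tol:
            return R
    raise RuntimeError("iteration did not converge")


def residual(R, A0, A1, A2):
    """Largest absolute entry of R^2 A2 + R A1 + A0."""
    M = mat_add(mat_add(mat_mul(mat_mul(R, R), A2), mat_mul(R, A1)), A0)
    return max(abs(v) for row in M for v in row)
```

In the scalar M/M/1 case A0 = λ, A1 = −(λ + μ), A2 = μ, the iteration converges to λ/μ, which gives a simple accuracy check before applying it to larger blocks.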

3.4 Busy Probabilities at Arbitrary Time

The following theorem gives results that are intuitively clear and are useful as
accuracy checks in numerical computation.
Theorem 2 The probabilities that the server is busy with Type 1 and Type 2 customers
are given by

\[ P^{(T)}_{Busy_1} = \frac{\lambda_1 \left(1 - P^{(T)}_{loss}\right)}{\mu_1}, \tag{14} \]

\[ P^{(T)}_{Busy_2} = \frac{\lambda_2}{\mu_2}, \tag{15} \]

where $P^{(T)}_{loss}$ is the probability that a Type 1 customer is lost due to the buffer being
full and is given by

\[ P^{(T)}_{loss} = \sum_{i=0}^{\infty} \left[ u_{i,K+1} + v_{i+1,K} \right]. \tag{16} \]

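Proof First note that

\[ P^{(T)}_{Busy_1} = \sum_{i=0}^{\infty} u_i e, \qquad P^{(T)}_{Busy_2} = \sum_{i=1}^{\infty} v_i e. \tag{17} \]

From the steady-state equations given in (11), one can easily obtain the following
equations.

\[ (\lambda_1 + \mu_1) \sum_{i=0}^{\infty} u_{i,1} = \lambda_1 x^* + \mu_1 \sum_{i=0}^{N} u_{i,2} + \mu_2 \sum_{i=0}^{N+1} v_{i,1}, \tag{18} \]

\[ (\lambda_1 + \mu_1) \sum_{i=0}^{\infty} u_{i,j} = \lambda_1 \sum_{i=0}^{\infty} u_{i,j-1} + \mu_1 \sum_{i=0}^{N+j-1} u_{i,j+1} + \mu_2 \sum_{i=0}^{N+j} v_{i,j}, \quad 2 \le j \le K, \tag{19} \]

\[ \mu_1 \sum_{i=0}^{\infty} u_{i,K+1} = \lambda_1 \sum_{i=0}^{\infty} u_{i,K}, \tag{20} \]

\[ \lambda_1 \left( x^* + \sum_{i=1}^{\infty} v_{i,0} \right) = \mu_1 \sum_{i=0}^{\infty} u_{i,1}, \tag{21} \]

\[ (\lambda_1 + \mu_2) \sum_{i=1}^{\infty} v_{i,j} = \lambda_1 \sum_{i=1}^{\infty} v_{i,j-1} + \mu_1 \sum_{i=N+j}^{\infty} u_{i,j+1} + \mu_2 \sum_{i=N+j+1}^{\infty} v_{i,j}, \quad 1 \le j \le K - 1, \tag{22} \]

\[ \mu_2 \sum_{i=1}^{\infty} v_{i,K} = \lambda_1 \sum_{i=1}^{\infty} v_{i,K-1} + \mu_1 \sum_{i=N+K}^{\infty} u_{i,K+1} + \mu_2 \sum_{i=N+K+1}^{\infty} v_{i,K}. \tag{23} \]

Here, terms with an index $v_{0,j}$ are understood to be zero. From Eqs. (18)–(23), through
some standard algebraic manipulations, it can easily be verified that

\[ \lambda_1 \sum_{i=0}^{\infty} (u_{i,j} + v_{i,j}) = \mu_1 \sum_{i=0}^{\infty} u_{i,j+1}, \quad 1 \le j \le K - 1, \tag{24} \]

\[ \lambda_2 (x^* + u_0 e) = \mu_2 v_1 e, \tag{25} \]

\[ \lambda_2 (u_i e + v_i e) = \mu_2 v_{i+1} e, \quad i \ge 1. \tag{26} \]

The stated result in (14) follows by adding Eqs. (20), (21), and (24). Similarly,
the stated result in (15) is obtained by adding Eqs. (25) and (26).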

3.5 Steady-State Probability at Departure Epoch

In this section, we will derive an expression for the steady-state probability vector
at departure epochs. It should be pointed out that, due to the finite buffer for Type 1
customers, this probability will differ from that at an arbitrary time.
Suppose that $y$ denotes the steady-state probability vector at departure epochs
and that $y$ is partitioned as $y = (y_{0,0}, y_{0,1}, \ldots, y_{0,K}, y_{1,0}, y_{1,1}, \ldots, y_{1,K}, \ldots)$ such
that $y_{i,j}$ gives the steady-state probability that at a departure epoch there are $j$, $0 \le
j \le K$, Type 1 customers and $i$, $i \ge 0$, Type 2 customers in the system. The following
theorem gives an expression for $y_{i,j}$.
Theorem 3 The steady-state probability vector $y$ is such that its components are
given by

\[ y_{i,j} = \frac{1}{\lambda_2 + \lambda_1 (1 - P_{loss})} \left[ \mu_1 u_{i,j+1} + \mu_2 v_{i+1,j} \right], \quad 0 \le j \le K, \ i \ge 0. \tag{27} \]

Proof From the definition of the steady-state probabilities, it is easy to see that

\[ y_{i,j} = c \left[ \mu_1 u_{i,j+1} + \mu_2 v_{i+1,j} \right], \quad 0 \le j \le K, \ i \ge 0, \tag{28} \]

where $c$ is the normalizing constant. The normalizing constant is obtained as follows.

\[
\sum_{i=0}^{\infty} \sum_{j=0}^{K} y_{i,j} = 1
\;\Rightarrow\; c \sum_{i=0}^{\infty} \sum_{j=0}^{K} \left[ \mu_1 u_{i,j+1} + \mu_2 v_{i+1,j} \right] = 1
\;\Rightarrow\; c \sum_{i=0}^{\infty} \left[ \mu_1 u_i e + \mu_2 v_{i+1} e \right] = 1
\;\Rightarrow\; c \left[ \lambda_1 (1 - P_{loss}) + \lambda_2 \right] = 1,
\]

where the last statement follows from Eqs. (14) and (15). Hence, the stated result
follows.
In the sequel, we need the following system performance measure, defined in terms
of the conditional probability at a departure epoch, to see the qualitative impact of the
threshold parameter $N$. The conditional probability, $P^{(T_1>0)}_{Busy_2}$, that there will be at least
one Type 1 customer in the system given that a departure will result in the server
offering a service to a Type 2 customer is given by

\[
P^{(T_1>0)}_{Busy_2} = \frac{\sum_{j=1}^{K} \sum_{i=N+j}^{\infty} y_{i,j}}{\sum_{i=1}^{\infty} y_{i,0} + \sum_{j=1}^{K} \sum_{i=N+j}^{\infty} y_{i,j}},
\tag{29}
\]

which can be obtained in a more computable form using the steady-state probability
vectors, $u$ and $v$. Toward this end, we define

\[ (a, b) = x_{N+K-1} (I - R)^{-1}. \tag{30} \]

The simplified and computationally implementable expression for $P^{(T_1>0)}_{Busy_2}$ is given by

\[
\begin{aligned}
P^{(T_1>0)}_{Busy_2} ={}& \frac{\mu_1}{\lambda_2 + \lambda_1 (1 - P_{loss})} \left[ \sum_{j=1}^{K-2} \sum_{i=N+j}^{N+K-2} u_{i,j+1} + \sum_{j=2}^{K+1} a_j - u_{N+K-1,K+1} \right] \\
&+ \frac{\mu_2}{\lambda_2 + \lambda_1 (1 - P_{loss})} \left[ \sum_{j=1}^{K-2} \sum_{i=N+j}^{N+K-3} v_{i+1,j} + \sum_{j=2}^{K+1} b_j - \sum_{j=K-1}^{K} v_{N+K-1,j} - v_{N+K,K} \right].
\end{aligned}
\tag{31}
\]

Note that the above conditional probability is zero in the classical two-customer
non-preemptive queueing model and hence will indicate the improvement in the
fraction of time the server is paying attention to serving lower-priority customers in
our threshold non-preemptive model.

4 Classical Two-Customer Non-preemptive Priority

Queueing Model

In this section, we will briefly provide the needed details on the classical
non-preemptive priority queueing model with two types of customers (with only
higher-priority customers having a finite waiting room) so as to compare that model with the model
under study here. This is mainly to see the impact of the threshold $N$ on the QoS with
respect to Type 2 customers. Also, it is worth pointing out that if we let $N$ approach
infinity, our model will reduce to the corresponding classical non-preemptive priority
model.
In this case, the state space is the same as for the model with threshold $N$
[see Eq. (3)], and the generator, $\tilde{Q}$, for the corresponding classical non-preemptive
priority queueing model is of the form

\[
\tilde{Q} = \begin{pmatrix}
-\lambda & \lambda_1 e_1' & \lambda_2 h' & & & \\
\mu_1 e_1 & C_1 & C_0 & & & \\
\mu_2 h & \tilde{B}_2 & B_1 & A_0 & & \\
 & & B_2 & B_1 & A_0 & \\
 & & & B_2 & B_1 & A_0 \\
 & & & & \ddots & \ddots & \ddots
\end{pmatrix},
\tag{32}
\]

where the entries appearing in (32) are as given in (5)–(7).

4.1 The Steady-State Analysis—Classical Non-preemptive

Priority Queueing Model

In this section, we will briefly outline the steady-state analysis starting with the
stability condition. In order to derive the stability condition for the classical
non-preemptive priority queueing model, we first need the steady-state probability vector
of $B = A_0 + B_1 + B_2$, where $A_0$ is as given in (7) and $B_1$ and $B_2$ are as given in (5).
Toward this end, let $\pi = (\pi_1, \pi_2)$ be the steady-state probability vector of the
generator $B$. That is, $\pi$ satisfies
\[ \pi B = 0, \quad \pi e = 1. \tag{33} \]

The stability condition for the classical non-preemptive queueing model is given
in the following theorem. Before that, we further partition $\pi_r$, $r = 1, 2$, as $\pi_r =
(\pi_{r,1}, \pi_{r,2}, \ldots, \pi_{r,K+1})$, $r = 1, 2$.
Theorem 4 The classical two-customer non-preemptive priority queueing system
(with only higher-priority customers having a finite waiting room) is stable if and
only if the following condition is satisfied.

\[ \lambda_2 < \mu_2 d, \tag{34} \]

where $d$ is given by

\[
d = \pi_2 e = \left[ 1 + \frac{\lambda_1}{\mu_1} \left\{ 1 - \left( \frac{\lambda_1}{\lambda_1 + \mu_2} \right)^{K} + \sum_{k=1}^{K} \left( \frac{\lambda_1}{\mu_1} \right)^{k} \left[ 1 - \left( \frac{\lambda_1}{\lambda_1 + \mu_2} \right)^{K+1-k} \right] \right\} \right]^{-1}.
\tag{35}
\]

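Proof First note that, due to the special structure of the matrices $A_0$, $B_1$, and $B_2$ [see
Eqs. (5) and (7)], the steady-state equations given in (33) can be rewritten as

\[
\begin{aligned}
\pi_1 \left[ F + \lambda_2 I - \mu_1 I + \mu_1 \Delta_1 \right] + \mu_2 \pi_2 \Delta_1 &= 0, \\
\mu_1 \pi_1 \tilde{\Delta}_1 + \pi_2 \left[ F + \lambda_2 I - \mu_2 I + \mu_2 \tilde{\Delta}_1 \right] &= 0, \\
\pi_1 e + \pi_2 e &= 1.
\end{aligned}
\tag{36}
\]

Note that (see, e.g., [10]) the necessary and sufficient condition for the classical
queue under study in this section to be stable is $\pi A_0 e < \pi B_2 e$, which reduces to

\[ \lambda_2 < \mu_2 \pi_2 e. \tag{37} \]

It can easily be verified from (36) that

\[ \pi_{1,1} = \frac{\lambda_1}{\mu_1} \pi_{2,1}, \tag{38} \]

\[ \pi_{1,j} = \frac{\lambda_1}{\mu_1} \left[ \pi_{1,j-1} + \left( \frac{\lambda_1}{\lambda_1 + \mu_2} \right)^{j-1} \pi_{2,1} \right], \quad 2 \le j \le K, \tag{39} \]

\[ \pi_{1,K+1} = \left( \frac{\lambda_1}{\mu_1} \right)^{2} \left[ \pi_{1,K-1} + \left( \frac{\lambda_1}{\lambda_1 + \mu_2} \right)^{K-1} \pi_{2,1} \right], \tag{40} \]

\[ \pi_{2,j} = \left( \frac{\lambda_1}{\lambda_1 + \mu_2} \right)^{j-1} \pi_{2,1}, \quad 1 \le j \le K, \tag{41} \]

\[ \pi_{2,K+1} = \frac{\lambda_1}{\mu_2} \left( \frac{\lambda_1}{\lambda_1 + \mu_2} \right)^{K-1} \pi_{2,1}. \tag{42} \]

Now adding Eqs. (38) and (39) (over $j$, $1 \le j \le K$), and (40), we get

\[
\pi_1 e = \frac{\lambda_1}{\mu_1} \, \frac{\lambda_1 + \mu_2}{\mu_2} \left\{ 1 - \left( \frac{\lambda_1}{\lambda_1 + \mu_2} \right)^{K} + \sum_{k=1}^{K} \left( \frac{\lambda_1}{\mu_1} \right)^{k} \left[ 1 - \left( \frac{\lambda_1}{\lambda_1 + \mu_2} \right)^{K+1-k} \right] \right\} \pi_{2,1}.
\tag{43}
\]

Adding Eq. (42) to the one obtained by summing (41) over $j$, $1 \le j \le K$, we get

\[ \mu_2 \pi_2 e = (\lambda_1 + \mu_2) \pi_{2,1}. \tag{44} \]

Now the stated result follows immediately from (43) and (44) along with the
normalizing equation given in (36).
Note: (1) One can further simplify the expression for $d$ given in (35) by
considering three cases: (a) $\lambda_1 = \mu_1$; (b) $\mu_1 = \lambda_1 + \mu_2$; and (c) $\mu_1 \ne \lambda_1 + \mu_2$. The
details are omitted.
(2) While the steady-state vector, $\pi$, is explicitly given, it is probably more efficient
to compute it recursively, with $\pi_{2,1}$ computed from the normalizing condition.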
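The two routes to d mentioned above (the closed form and the recursion with normalization) can be compared numerically. The sketch below is illustrative and not taken from the paper. It assumes the closed form reads d = [1 + ρ{1 − a^K + Σ_{k=1}^{K} ρ^k (1 − a^{K+1−k})}]⁻¹ with ρ = λ1/μ1 and a = λ1/(λ1 + μ2), and it uses the recursions for π1 and π2 with π2,1 set to 1 before normalizing. The function names are hypothetical.

```python
def d_closed_form(lam1, mu1, mu2, K):
    """pi_2 e from the assumed closed-form expression."""
    rho = lam1 / mu1
    a = lam1 / (lam1 + mu2)
    inner = 1.0 - a ** K + sum(rho ** k * (1.0 - a ** (K + 1 - k))
                               for k in range(1, K + 1))
    return 1.0 / (1.0 + rho * inner)


def d_recursive(lam1, mu1, mu2, K):
    """pi_2 e from the recursions, starting with an unnormalized pi_{2,1} = 1."""
    rho = lam1 / mu1
    a = lam1 / (lam1 + mu2)
    # pi_{2,j}, 1 <= j <= K, followed by pi_{2,K+1}
    pi2 = [a ** (j - 1) for j in range(1, K + 1)]
    pi2.append((lam1 / mu2) * a ** (K - 1))
    # pi_{1,1}, then pi_{1,j} for 2 <= j <= K
    pi1 = [rho]
    for j in range(2, K + 1):
        pi1.append(rho * (pi1[-1] + a ** (j - 1)))
    # pi_{1,K+1} = rho * pi_{1,K}, which is the last recursion written in one step
    pi1.append(rho * pi1[-1])
    return sum(pi2) / (sum(pi1) + sum(pi2))


def max_stable_type2_rate(lam1, mu1, mu2, K, threshold_model):
    """Supremum of lam2 allowed by the stability condition of either model."""
    return mu2 if threshold_model else mu2 * d_closed_form(lam1, mu1, mu2, K)
```

For λ1 < μ1 and large K, both functions should return values close to 1 − λ1/μ1, and `max_stable_type2_rate` makes the gap between the two stability regions explicit.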
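Under the stability condition given in (34), the steady-state probability vector,
say, $\tilde{x}$, of $\tilde{Q}$, is of modified matrix-geometric form and is obtained as follows. Once again,
we partition the steady-state vector as we did for the threshold model.
That is, we partition $\tilde{x}$ as

\[
\begin{aligned}
\tilde{x} &= (\tilde{x}^*, \tilde{u}_0, \tilde{x}_1, \tilde{x}_2, \ldots), \quad \tilde{x}_i = (\tilde{u}_i, \tilde{v}_i), \quad i \ge 1, \\
\tilde{u}_i &= (\tilde{u}_{i,1}, \tilde{u}_{i,2}, \ldots, \tilde{u}_{i,K+1}), \quad i \ge 0, \qquad \tilde{v}_i = (\tilde{v}_{i,0}, \tilde{v}_{i,1}, \ldots, \tilde{v}_{i,K}), \quad i \ge 1.
\end{aligned}
\]

The steady-state probability vector $\tilde{x}$ is obtained by solving the following system
of equations.

\[
\begin{aligned}
& -\lambda \tilde{x}^* + \mu_1 \tilde{u}_{0,1} + \mu_2 \tilde{v}_{1,0} = 0, \\
& \lambda_1 \tilde{x}^* e_1' + \tilde{u}_0 C_1 + \tilde{x}_1 \tilde{B}_2 = 0, \\
& \lambda_2 \tilde{x}^* e_{K+2}'(2K + 2) + \tilde{u}_0 C_0 + \tilde{x}_1 [B_1 + \tilde{R} B_2] = 0, \\
& \tilde{x}_i = \tilde{x}_1 \tilde{R}^{\,i-1}, \quad i \ge 1,
\end{aligned}
\tag{45}
\]

where the matrix $\tilde{R}$ is the minimal nonnegative solution to the matrix quadratic
equation:
\[ \tilde{R}^2 B_2 + \tilde{R} B_1 + A_0 = 0, \tag{46} \]

and with the normalizing condition

\[ \tilde{x}^* + \tilde{u}_0 e + \tilde{x}_1 (I - \tilde{R})^{-1} e = 1. \tag{47} \]

The computation of $\tilde{x}$ is done similarly to that of $x$ by exploiting the special structure of the
coefficient matrices appearing in (45), and the details are omitted. As earlier, we
display the mean number of Type $i$ customers in the system for the classical model,
denoted by $\mu^{(C)}_{T_i}$, $i = 1, 2$, which is given by

\[ \mu^{(C)}_{T_1} = \sum_{j=1}^{K+1} j \sum_{i=0}^{\infty} \tilde{u}_{i,j} \quad \text{and} \quad \mu^{(C)}_{T_2} = \tilde{x}_1 (I - \tilde{R})^{-2} e. \]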

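The following theorem is very similar to Theorem 2 in that it gives expressions
for the busy probabilities for the classical non-preemptive priority queueing model.

Theorem 5 The probabilities that the server is busy with Type 1 and Type 2 customers
in the case of the classical two-customer non-preemptive priority queueing model
are given by

\[ P^{(C)}_{Busy_1} = \frac{\lambda_1 \left(1 - P^{(C)}_{loss}\right)}{\mu_1}, \tag{48} \]

\[ P^{(C)}_{Busy_2} = \frac{\lambda_2}{\mu_2}, \tag{49} \]

where

\[ P^{(C)}_{loss} = \sum_{i=0}^{\infty} \left[ \tilde{u}_{i,K+1} + \tilde{v}_{i+1,K} \right]. \tag{50} \]

Proof The proof is very similar to that of Theorem 2 once the following equations are
verified from (45).

\[
\begin{aligned}
(\lambda_1 + \mu_1) \sum_{i=0}^{\infty} \tilde{u}_{i,1} &= \lambda_1 \tilde{x}^* + \mu_1 \sum_{i=0}^{\infty} \tilde{u}_{i,2} + \mu_2 \sum_{i=0}^{\infty} \tilde{v}_{i,1}, \\
(\lambda_1 + \mu_1) \sum_{i=0}^{\infty} \tilde{u}_{i,j} &= \lambda_1 \sum_{i=0}^{\infty} \tilde{u}_{i,j-1} + \mu_1 \sum_{i=0}^{\infty} \tilde{u}_{i,j+1} + \mu_2 \sum_{i=0}^{\infty} \tilde{v}_{i,j}, \quad 2 \le j \le K, \\
\mu_1 \sum_{i=0}^{\infty} \tilde{u}_{i,K+1} &= \lambda_1 \sum_{i=0}^{\infty} \tilde{u}_{i,K}, \\
\lambda_1 \left( \tilde{x}^* + \sum_{i=1}^{\infty} \tilde{v}_{i,0} \right) &= \mu_1 \sum_{i=0}^{\infty} \tilde{u}_{i,1}, \\
(\lambda_1 + \mu_2) \sum_{i=1}^{\infty} \tilde{v}_{i,j} &= \lambda_1 \sum_{i=1}^{\infty} \tilde{v}_{i,j-1}, \quad 1 \le j \le K - 1, \\
\mu_2 \sum_{i=1}^{\infty} \tilde{v}_{i,K} &= \lambda_1 \sum_{i=1}^{\infty} \tilde{v}_{i,K-1}.
\end{aligned}
\tag{51}
\]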

5 Comparison of the Two Models

In this section, we will compare the non-preemptive priority queueing model with
the threshold and its corresponding classical model. Toward this end, we first define
the traffic intensities of the two models. Let $\rho_T$ and $\rho_C$ denote, respectively, the
traffic intensity of the two-customer non-preemptive priority queueing model with
threshold and without threshold (i.e., classical). That is,

\[ \rho_T = \frac{\lambda_2}{\mu_2}, \qquad \rho_C = \frac{\lambda_2}{\mu_2 \pi_2 e}, \tag{52} \]

where $\pi_2 e = d$ is as given in (35). Also, note that the above equation implies

\[ \rho_T = \rho_C \, \pi_2 e. \tag{53} \]

1. Looking at the stability conditions (see Theorems 1 and 4, which give them
for the two models), it is clear that the threshold model can accommodate a larger
rate of Type 2 arrivals (all other parameters being fixed) than the classical model.
Only when λ1 → 0, which corresponds to essentially not having any higher-priority
customers in the model, or when μ1 → ∞, which ensures that Type 1 customers are
served almost immediately, does the classical non-preemptive priority model
approach the threshold model's capacity for Type 2 customers without violating
the stability condition.
2. When both λ1 and μ1 are finite and positive, no matter how fast Type 2 customers
are served (i.e., how large μ2 is), the threshold model can handle more Type 2
customers on the average compared to the classical one. This is due to the fact
that π 2 e will always be positive.
3. As long as Type 1 customers are allowed to enter into the system and have a finite
service time, the threshold model can always accept a larger rate of Type 2 cus-
tomers (within the allowable level satisfying the stability condition) as compared
to the corresponding classical one.
4. Under the assumption that λ1 < μ1 (in addition to λ2 < μ2, which is needed for
the stability of the threshold model under study here), it can easily be verified
from (35) that $\pi_2 e \to 1 - \lambda_1/\mu_1$ as K → ∞. This result is intuitive:
when Type 1 customers are admitted without any limit (i.e., they have infinite
buffer space like Type 2 customers), the system's stability requires λ1 < μ1 in
addition to λ2 < μ2.
5. The main purpose of introducing the threshold parameter, N, into the classical
non-preemptive priority queueing model is to reduce the average number of Type
2 customers waiting to be processed in the presence of Type 1 customers. We
will explore this numerically in Sect. 6, owing to the complexity of the expressions
for this measure for the two models.
6. As N → ∞, the threshold model will approach the classical model. It would be
of interest to see an optimal value, say, N ∗ of N , such that the loss probabilities
under the two models are close enough to each other. Again, we will explore this
numerically in Sect. 6. This is mainly due to the complexity of the expressions
for the loss probability.
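
The first four remarks can be illustrated numerically. The following is a minimal Python sketch, not the author's computation. It uses only the two traffic intensities in (52) and the limiting value of π2 e as K → ∞ quoted in item 4; it assumes that stability corresponds to a traffic intensity below one. The function names and the parameter values are illustrative assumptions, and for finite K the quantity π2 e would have to be taken from (35) instead.

```python
def pi2e_limit(lam1: float, mu1: float) -> float:
    """Limiting value of pi_2 e as K -> infinity (item 4); requires lam1 < mu1."""
    if not 0.0 <= lam1 < mu1:
        raise ValueError("the K -> infinity limit needs 0 <= lam1 < mu1")
    return 1.0 - lam1 / mu1


def max_type2_rate(mu2: float, pi2e: float = 1.0) -> float:
    """Supremum of lambda_2 that keeps the traffic intensity in (52) below 1.

    pi2e = 1.0 corresponds to rho_T = lambda_2 / mu_2 (threshold model);
    any other value corresponds to rho_C = lambda_2 / (mu_2 * pi2e).
    """
    return mu2 * pi2e


# Illustrative (assumed) parameter values.
lam1, mu1, mu2 = 1.0, 1.1, 1.0
p = pi2e_limit(lam1, mu1)
print("pi_2 e in the K -> infinity limit:", p)
print("threshold model, bound on lambda_2:", max_type2_rate(mu2))
print("classical model, bound on lambda_2:", max_type2_rate(mu2, p))
```

With λ1 close to μ1 the classical bound becomes a small fraction of the threshold bound, which is the effect described in items 1 to 3.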

6 Numerical Examples

In this section, we present two representative examples to illustrate the impact of
the new type of non-preemptive priority rule. To compare the classical non-preemptive
priority queueing model with the one studied in this paper in a meaningful
way, we need to set the parameters of the model properly, taking into account that the
stability conditions for the two models are different. From Theorems 1 and 4, we note
that the stability condition for the classical model depends on λ1 , μ1 , μ2 , and K,
whereas for the threshold model with N being positive and finite, it depends only on λ2
and μ2 . Furthermore, as is to be expected, the threshold model can handle a larger
load (either through a larger λ1 or a larger λ2 or a combination of both) as compared
to the classical model. This will be explored further in the examples below.
Example 1 The goal of this example is to find the minimum value of the threshold N,
say N∗, such that $|P_{loss}^{(T)} - P_{loss}^{(C)}| < 10^{-3}$ under various scenarios.
Toward this end, we fix λ1 = 1, μ1 = μ2 = 1.1, vary K = 1, 2, 3, 5, 10, 15, 20, 50,
and choose λ2 such that we get a specific value for ρC , which is varied over 0.1
through 0.95. Note that in order to properly carry out the comparison, we need to use
the same λ2 value in the threshold model. Thus, ρT will be much less than ρC (see
(53)) for the same set of values for the other parameters. Note that in this example,
λ2 = ρT since we fixed μ2 = 1.0. In Table 1, we display the values of N∗, and in
Tables 2 and 3, respectively, we display the ratios $\mu_{T_1}^{(T)} / \mu_{T_1}^{(C)}$
and $\mu_{T_2}^{(T)} / \mu_{T_2}^{(C)}$. Note that in
obtaining the mean values for Type 1 and Type 2 customers for the threshold model,
we set N = N∗ so that the comparison of the classical and the threshold models makes
sense.
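
The search for N∗ described in Example 1 can be sketched as a simple linear scan. The Python sketch below is only an illustration under stated assumptions: `minimal_threshold` is a hypothetical helper, and `toy_loss` is a made-up stand-in for the loss probability of the threshold model, which in the paper comes from the steady-state solution and is not reproduced here.

```python
import math


def minimal_threshold(loss_threshold, loss_classical, tol=1e-3, n_max=10_000):
    """Return the smallest N in 1..n_max with |loss_threshold(N) - loss_classical| < tol.

    Returns None if no such N exists within the search range.
    """
    for n in range(1, n_max + 1):
        if abs(loss_threshold(n) - loss_classical) < tol:
            return n
    return None


# Hypothetical stand-in: a loss probability that decays toward the
# classical value as N grows. It is NOT the model of the paper.
P_LOSS_CLASSICAL = 0.30


def toy_loss(n: int) -> float:
    return P_LOSS_CLASSICAL + 0.2 * math.exp(-0.5 * n)


n_star = minimal_threshold(toy_loss, P_LOSS_CLASSICAL)
print("N* for the toy loss function:", n_star)
```

A linear scan is used because nothing in the text guarantees that the difference of the two loss probabilities is monotone in N; if monotonicity were known, a bisection search would be a natural alternative.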
From Table 1, we notice that, as expected, for smaller values of ρC , one needs a
smaller N for all K in the range considered to get the loss probabilities under both
models to differ by no more than 10−3 . However, as ρC becomes larger, one needs a
larger N and the value of N appears to increase with K.
A look at the values in Tables 2 and 3 indicates a significant reduction in the mean
number of Type 2 customers present in the system, while at the same time, the mean
number of Type 1 customers present in the system does not increase appreciably.
While the increase is insignificant when K is up to 20, we see a relatively significant
increase when K is 50. It should be pointed out that one needs to keep in mind
the differing values of N when making specific interpretations. However, general
observations like the one we made here should be adequate to bring out the qualitative
aspects of the threshold model.

Table 1 Optimum N∗ values under various scenarios

ρC K = 1 K = 2 K = 3 K = 5 K = 10 K = 15 K = 20 K = 50
0.1 2 1 1 1 1 1 1 1
0.2 3 3 3 2 2 1 1 1
0.3 4 4 4 4 4 3 2 1
0.4 5 6 6 6 7 7 7 1
0.5 7 8 9 10 12 13 13 1
0.6 10 11 12 14 18 21 23 6
0.7 14 16 18 21 29 34 38 26
0.8 22 25 28 34 47 58 66 64
0.9 41 47 54 65 91 112 130 155
0.95 71 81 92 111 153 188 218 280

Table 2 Ratios of $\mu_{T_1}^{(T)}$ over $\mu_{T_1}^{(C)}$ under various scenarios at N∗
ρC K = 1 K = 2 K = 3 K = 5 K = 10 K = 15 K = 20 K = 50
0.1 1.000 1.001 1.002 1.003 1.004 1.005 1.006 1.009
0.2 1.000 1.000 1.001 1.004 1.009 1.021 1.025 1.038
0.3 1.000 1.001 1.001 1.004 1.010 1.024 1.041 1.089
0.4 1.000 1.001 1.001 1.004 1.010 1.021 1.033 1.166
0.5 1.000 1.001 1.001 1.003 1.009 1.018 1.032 1.275
0.6 1.000 1.001 1.001 1.003 1.009 1.017 1.028 1.319
0.7 1.000 1.001 1.001 1.003 1.008 1.016 1.026 1.249
0.8 1.000 1.001 1.001 1.003 1.008 1.014 1.023 1.196
0.9 1.000 1.001 1.001 1.003 1.007 1.013 1.021 1.155
0.95 1.000 1.001 1.001 1.002 1.007 1.013 1.020 1.138

Table 3 Ratios of $\mu_{T_2}^{(T)}$ over $\mu_{T_2}^{(C)}$ under various scenarios at N∗

ρC K = 1 K = 2 K = 3 K = 5 K = 10 K = 15 K = 20 K = 50
0.1 0.988 0.911 0.905 0.901 0.902 0.903 0.905 0.909
0.2 0.986 0.978 0.970 0.909 0.888 0.809 0.814 0.831
0.3 0.981 0.968 0.957 0.936 0.898 0.836 0.783 0.761
0.4 0.971 0.972 0.958 0.932 0.904 0.871 0.851 0.695
0.5 0.970 0.964 0.961 0.945 0.915 0.889 0.861 0.626
0.6 0.969 0.956 0.946 0.932 0.904 0.885 0.871 0.647
0.7 0.958 0.947 0.939 0.920 0.899 0.875 0.861 0.740
0.8 0.946 0.930 0.918 0.900 0.870 0.853 0.839 0.768
0.9 0.906 0.885 0.873 0.844 0.804 0.778 0.765 0.721
0.95 0.847 0.817 0.798 0.761 0.705 0.674 0.657 0.621
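
To read the ratios in Tables 2 and 3 as percentage changes, a short Python sketch may help. It only converts two table entries (the row ρC = 0.95, column K = 50) into percentages; the helper name `percent_change` is an illustrative assumption.

```python
def percent_change(ratio: float) -> float:
    """Percentage change implied by a threshold-to-classical ratio, to one decimal."""
    return round((ratio - 1.0) * 100.0, 1)


# Entries for rho_C = 0.95 and K = 50, copied from Tables 2 and 3.
type1_ratio = 1.138  # Table 2: mean number of Type 1 customers
type2_ratio = 0.621  # Table 3: mean number of Type 2 customers

print("Type 1 change (%):", percent_change(type1_ratio))
print("Type 2 change (%):", percent_change(type2_ratio))
```

A positive value is an increase relative to the classical model and a negative value is a reduction.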

In summary, this example illustrates the significant advantage in using the type
of threshold introduced here to increase the QoS for lower-priority customers with-
out affecting the higher-priority customers. This is an important observation since
priority queues occur naturally in practice, and with the classical models, the lower-
priority customers get poor QoS.

Example 2 The purpose of this example is to see the impact of N under various
scenarios. That is, we look at the non-preemptive priority queueing model with
threshold under study in this paper and look at the role played by the parameter N .
Toward this end, we fix λ1 = 1, μ1 = 1.1, μ2 = 1, vary K = 1, 2, 5, 10, 15, 20, and
choose λ2 such that we get a specific value for ρT , which is taken to be one of four
values ρT = 0.1, 0.5, 0.9, 0.95. Since we fixed μ2 = 1.0, it is clear (see Eq. (53)) that
λ2 = ρT in this example.

Fig. 1 Selected measures for threshold non-preemptive priority model under various scenarios

In Fig. 1, we display the graphs of four measures, $P_{loss}$, $\mu_{T_1}^{(T)}$, $\mu_{T_2}^{(T)}$,
and $P_{Busy_2}^{(T_1>0)}$, for the threshold model, for selected values of ρT . A brief look
at this figure reveals the following key observations.
• For fixed N and for low traffic intensity, we notice that Ploss appears to decrease
with increasing K. However, in the case of higher traffic intensity, we see such
behavior only for small N. This is to be expected, since a higher traffic intensity
results in more Type 2 customers arriving to the system and hence in their being
served more frequently, which in turn causes Type 1 customers to be lost more often.
Also, notice that in the case of higher traffic intensity, the range of the loss
probability is much narrower (in the current case, it varies from 0.945 to 0.952),
indicating that one can choose N to be small so as to help increase the
quality of service for Type 2 customers.
• For fixed N, we see that $\mu_{T_1}^{(T)}$ appears to increase significantly as K increases;
however, for fixed K, this measure does not appear to increase significantly as N
is increased. This is the case for low as well as high traffic intensity.
• For fixed N, we see that $\mu_{T_2}^{(T)}$ appears to increase significantly as K increases;
similarly, for fixed K, this measure appears to increase significantly as N is increased.
This is the case for low as well as high traffic intensity.
• With respect to the measure $P_{Busy_2}^{(T_1>0)}$, we make some interesting observations.
First, in the high traffic intensity region, the significant role of K is
seen initially as this measure increases and then attains its maximum value. When
the traffic intensity is high, Type 2 customers arrive at a faster rate
(as μ2 is fixed and we vary λ2 to arrive at a specific value for ρT ). Thus, it is not
surprising to see the measure under discussion to be insensitive to N. Secondly,
for low to moderate traffic intensity (the figure shows only a low traffic intensity
value, to limit the number of figures), when N is small, the measure
appears to decrease initially and then increase as K is increased. This is somewhat
counterintuitive.

7 Concluding Remarks

In this paper, we considered a non-preemptive priority queueing system with two
types of customers and introduced a threshold for serving lower-priority customers
in the presence of higher-priority ones, so as to increase the quality of service for
lower-priority customers. By comparing the current model to the classical two-customer
non-preemptive priority queueing model, we showed a marked improvement in the
quality of service with the introduction of the new type of threshold parameter.
Assuming the buffer size to be finite for Type 1 customers and infinite for Type
2 customers, we studied the model as a highly structured QBD process. The model
under study can be generalized in a number of ways. For example, we can model
the arrivals to follow a versatile point process, namely the Markovian arrival process,
take the service times to be of phase type, and also consider a multi-server system. The
results of these and other models will be presented elsewhere.

References

1. Adiri, I., Domb, I.: A single server queueing system working under mixed priority disciplines.
Oper. Res. 30, 97–115 (1982)
2. Avi-Itzhak, B., Brosh, I., Naor, P.: On discretionary priority queueing. Z. Angew. Math. Mech.
6, 235–242 (1964)
3. Cho, Y.Z., Un, C.K.: Analysis of the M/G/1 queue under a combined preemptive/nonpreemptive
priority discipline. IEEE Trans. Commun. 41, 132–141 (1993)
4. Conway, R.W., Maxwell, W., Miller, L.: Theory of Scheduling. Addison-Wesley, Reading, MA
(1967)
5. Drekic, S., Stanford, D.A.: Threshold-based interventions to optimize performance in preemp-
tive priority queues. Queueing Syst. 35, 289–315 (2000)
6. Drekic, S., Stanford, D.A.: Reducing delay in preemptive repeat priority queues. Oper. Res.
49, 145–156 (2000)
7. Jaiswal, N.K.: Priority Queues. Academic Press, USA (1968)
8. Kim, K.: (N , n)-preemptive priority queue. Perform. Eval. 68, 575–585 (2011)
9. Latouche, G., Ramaswami, V.: Introduction to Matrix Analytic Methods in Stochastic Model-
ing. SIAM (1999)
10. Neuts, M.F.: Matrix-Geometric Solutions in Stochastic Models: An Algorithmic Approach.
The Johns Hopkins University Press, Baltimore, MD. [1994 version is Dover Edition] (1981)
11. Neuts, M.F.: Algorithmic Probability: A Collection of Problems. Chapman and Hall, NY (1995)
12. Paterok, M., Ettl, A.: Sojourn time and waiting time distributions for M/G/1 queues with
preemption-distance priorities. Oper. Res. 42, 1146–1161 (1994)
13. Stewart, W.J.: Introduction to the Numerical Solution of Markov Chains. Princeton University
Press, Princeton, NJ (1994)
14. Takagi, H.: Analysis of Polling Systems. MIT, USA (1986)
15. Takagi, H.: Queueing Analysis 1: A Foundation of Performance Evaluation: Vacation and
Priority Systems. North-Holland, Amsterdam (1991)
16. Takagi, H., Kodera, Y.: Analysis of preemptive loss priority queues with preemption distance.
Queueing Syst. 22, 367–381 (1996)
Chapter 4
Iθ -Statistical Convergence
of Weight g in Topological Groups

Ekrem Savas

Abstract In this paper, we introduce and study the concept of I-lacunary statistical
convergence of weight g : [0, ∞) → [0, ∞) where g(xn ) → ∞ for any sequence
(xn ) in [0, ∞) with xn → ∞ in topological groups, and finally, we investigate some
inclusion relations theorems related to I-lacunary statistical convergence.

Keywords Lacunary sequence · Statistical convergence of weight g · Topological groups

1 Introduction

The statistical convergence of a sequence was introduced by Fast [8] and
Schoenberg [23]. Later, the concept of statistical convergence was discussed by
Fridy [9] and Šalát [14]. More details on statistical convergence and on applications of
this concept can be found in Di Maio and Kočinac [13], Das and Savas [5], and Savas
[21, 22].
The notion of statistical convergence is related to the density of subsets of the set
$\mathbb{N}$ of natural numbers. The density of a subset E of $\mathbb{N}$ is defined by

\[ \delta(E) = \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \chi_E(k), \]

provided the limit exists, where $\chi_E$ is the characteristic function of E. It is obvious
that any finite subset of $\mathbb{N}$ has zero natural density and that $\delta(E^c) = 1 - \delta(E)$.
A sequence $x = (x_j)$ is said to be statistically convergent to ξ if for arbitrary
ε > 0, the set $E(\varepsilon) = \{ j \in \mathbb{N} : |x_j - \xi| \ge \varepsilon \}$ has natural density zero (see [9]). In
this case, we write $st\text{-}\lim_j x_j = \xi$, and we denote the set of all statistically convergent
sequences by S.
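
As a concrete illustration of the definition, the following Python sketch evaluates the finite-n fraction whose limit is the natural density of the set E(ε). The example sequence (equal to 1 on perfect squares and 0 elsewhere) and the function names are illustrative assumptions, not taken from the paper; the sequence does not converge in the ordinary sense, yet the fraction of indices where it differs from 0 shrinks as n grows.

```python
import math


def indicator_of_squares(j: int) -> float:
    """Example sequence: x_j = 1 if j is a perfect square, else 0."""
    r = math.isqrt(j)
    return 1.0 if r * r == j else 0.0


def exceedance_fraction(x, xi: float, eps: float, n: int) -> float:
    """Return (1/n) * |{ j <= n : |x(j) - xi| >= eps }|."""
    count = sum(1 for j in range(1, n + 1) if abs(x(j) - xi) >= eps)
    return count / n


for n in (100, 10_000):
    print(n, exceedance_fraction(indicator_of_squares, 0.0, 0.5, n))
```

A finite computation can only suggest the limit; it does not prove that the density is zero.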

E. Savas (B)
Department of Mathematics, Usak University, Usak, Turkey

© Springer Nature Singapore Pte Ltd. 2018

By a lacunary sequence, we mean an increasing sequence $\theta = (k_p)$ of positive
integers such that $k_0 = 0$ and $h_p := k_p - k_{p-1} \to \infty$ as $p \to \infty$. Throughout this
paper, the intervals determined by θ will be denoted by $I_p = (k_{p-1}, k_p]$, and the
ratio $k_p / k_{p-1}$ will be abbreviated by $q_p$.
Also in [10], a new type of convergence called lacunary statistical convergence
was introduced as follows: A sequence $(x_j)$ of real numbers is said to be lacunary
statistically convergent to ξ (or $S_\theta$-convergent to ξ) if for any ε > 0,

\[ \lim_{p \to \infty} \frac{1}{h_p} \bigl| \{ j \in I_p : |x_j - \xi| \ge \varepsilon \} \bigr| = 0. \]
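
The block structure behind this definition can be made concrete with a small Python sketch. The choice $k_p = 2^p$ for p ≥ 1 (with $k_0 = 0$), the example sequence, and all function names are illustrative assumptions; the code simply evaluates the fraction inside the limit on a few intervals $I_p$.

```python
import math


def indicator_of_squares(j: int) -> float:
    """Example sequence: x_j = 1 if j is a perfect square, else 0."""
    r = math.isqrt(j)
    return 1.0 if r * r == j else 0.0


def lacunary_blocks(k):
    """Yield (p, k_{p-1}, k_p, h_p) for a list k with k[0] = 0 < k[1] < k[2] < ..."""
    for p in range(1, len(k)):
        yield p, k[p - 1], k[p], k[p] - k[p - 1]


def block_fraction(x, xi: float, eps: float, lo: int, hi: int) -> float:
    """Return (1/h_p) * |{ j in (lo, hi] : |x(j) - xi| >= eps }| with h_p = hi - lo."""
    count = sum(1 for j in range(lo + 1, hi + 1) if abs(x(j) - xi) >= eps)
    return count / (hi - lo)


# Assumed lacunary sequence: k_0 = 0 and k_p = 2**p for p >= 1.
k = [0] + [2 ** p for p in range(1, 13)]
for p, lo, hi, h in lacunary_blocks(k):
    print(p, (lo, hi), h, block_fraction(indicator_of_squares, 0.0, 0.5, lo, hi))
```

For this choice of θ the ratio $q_p$ equals 2 for every p ≥ 2.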

Gadjiev and Orhan (see [11]) have given the order of statistical convergence of
a sequence, and Colak [4] studied the statistical convergence of order α and
strong p-Cesàro summability of order α.
The (relatively more general) concept of I-convergence was introduced by
Kostyrko et al. [12] in a metric space as a generalized form of the concept of
statistical convergence, and it is based upon the notion of an ideal of the subset
of the set N of positive integers.
More investigations and more applications of ideals can be found in [6, 7, 15–20].
Recently in [21], we introduced the concepts of I-statistical convergence and
I-lacunary statistical convergence in topological groups. Also, Savas [22] extended
the above concepts to I-statistical convergence and I-lacunary statistical conver-
gence of order α, 0 < α ≤ 1 in topological groups.
Quite recently in [1], the idea of natural or asymptotic density has been extended
by taking the natural density of weight g, where $g : \mathbb{N} \to [0, \infty)$ is a function
with $\lim_{n \to \infty} g(n) = \infty$ and $n / g(n) \nrightarrow 0$ as $n \to \infty$.
In a natural way, in this paper we consider new and more general summability
methods, namely I-statistical convergence of weight g and Iθ -statistical convergence
of weight g in topological groups.

2 Definitions and Notations

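In this paper, our study concerns the notion of an ideal, which is given below.
Definition 1 (see [12]). A family $\mathcal{I} \subset 2^{\mathbb{N}}$ is said to be an ideal of $\mathbb{N}$ if the following
conditions hold:
(a) $P, Q \in \mathcal{I}$ implies $P \cup Q \in \mathcal{I}$,
(b) $P \in \mathcal{I}$, $Q \subset P$ implies $Q \in \mathcal{I}$.
Definition 2 (see [12]). A non-empty family $\mathcal{F} \subset 2^{\mathbb{N}}$ is said to be a filter of $\mathbb{N}$ if
the following conditions hold:
(a) $\emptyset \notin \mathcal{F}$,
(b) $P, Q \in \mathcal{F}$ implies $P \cap Q \in \mathcal{F}$,
(c) $P \in \mathcal{F}$, $P \subset Q$ implies $Q \in \mathcal{F}$.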

Definition 3 (see [12]). A proper ideal I is said to be admissible if {n} ∈ I for each
n ∈ N.

Throughout this note, I will stand for a proper admissible ideal of N.

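Definition 4 (see [12]) Let $\mathcal{I} \subset 2^{\mathbb{N}}$ be a proper admissible ideal in $\mathbb{N}$. The sequence
$x = (x_j)$ of elements of $\mathbb{R}$ is said to be $\mathcal{I}$-convergent to ξ if for each ε > 0 the set
$K(\varepsilon) = \{ j \in \mathbb{N} : |x_j - \xi| \ge \varepsilon \} \in \mathcal{I}$.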

Let $g : \mathbb{N} \to [0, \infty)$ be a function with $\lim_{n \to \infty} g(n) = \infty$. The upper density of
weight g was defined in [1] by the formula

\[ \overline{d}_g(K) = \limsup_{n \to \infty} \frac{K(1, n)}{g(n)} \]

for $K \subset \mathbb{N}$, where K(1, n) denotes the cardinality of the set $K \cap [1, n]$.
Then, the family

\[ \mathcal{I}_g = \{ K \subset \mathbb{N} : \overline{d}_g(K) = 0 \} \]

forms an ideal. It has been observed in [1] that $\mathbb{N} \in \mathcal{I}_g$ iff $n / g(n) \to 0$ as $n \to \infty$.
So we additionally assume that $n / g(n) \nrightarrow 0$, so that $\mathbb{N} \notin \mathcal{I}_g$ and $\mathcal{I}_g$ is a proper
admissible ideal of $\mathbb{N}$. The set of all such weight functions g satisfying the above
properties will be denoted by G. Now, we can write the following definition.

Definition 5 A sequence $(x_j)$ of real numbers is said to converge $d_g$-statistically to
ξ if for any given ε > 0, $\overline{d}_g(K(\varepsilon)) = 0$, where K(ε) is the set defined in Definition 4.
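
The role of the weight can be seen by evaluating the ratio K(1, n)/g(n) at a finite n for two weights. In the Python sketch below, the set K of perfect squares, the two weights g(n) = n and g(n) = √n, and the function names are illustrative assumptions; both weights satisfy g(n) → ∞ and n/g(n) ↛ 0.

```python
import math


def is_square(j: int) -> bool:
    r = math.isqrt(j)
    return r * r == j


def weighted_ratio(member, g, n: int) -> float:
    """Return K(1, n) / g(n), where K(1, n) = |K ∩ [1, n]| and `member` tests j in K."""
    count = sum(1 for j in range(1, n + 1) if member(j))
    return count / g(n)


n = 10_000
ratio_linear = weighted_ratio(is_square, lambda m: float(m), n)  # g(n) = n
ratio_sqrt = weighted_ratio(is_square, math.sqrt, n)             # g(n) = sqrt(n)
print("g(n) = n:      ", ratio_linear)
print("g(n) = sqrt(n):", ratio_sqrt)
```

For the squares, the ratio with g(n) = n becomes small while the ratio with g(n) = √n stays near 1, so the same set can be negligible for one weight and not for another. The upper density itself is a limit superior, which a single finite n can only hint at.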

By X, we will denote an abelian topological Hausdorff group, written additively,
which satisfies the first axiom of countability. For a subset R of X, s(R) will denote
the set of all sequences $(x_j)$ such that $x_j$ is in R for j = 1, 2, . . . , and c(X) will denote
the set of all convergent sequences. In [2], a sequence $(x_j)$ in X is said to be
statistically convergent to an element ξ of X if for each neighbourhood U of 0,

\[ \lim_{n \to \infty} \frac{1}{n} \bigl| \{ j \le n : x_j - \xi \notin U \} \bigr| = 0. \]

The set of all statistically convergent sequences in X is denoted by st (X ).

Furthermore, Cakalli [3] considered lacunary statistical convergence in topological
groups as follows: A sequence $(x_j)$ is said to be $S_\theta$-convergent to ξ if for each
neighbourhood U of 0, $\lim_{p \to \infty} (h_p)^{-1} \bigl| \{ j \in I_p : x_j - \xi \notin U \} \bigr| = 0$. In this case, we
define

\[ S_\theta(X) = \Bigl\{ (x_j) : \text{for some } \xi, \ S_\theta\text{-}\lim_{j \to \infty} x_j = \xi \Bigr\}. \]

We now introduce the following definitions:

Definition 6 A sequence $x = (x_j)$ in X is said to be $\mathcal{I}$-statistically convergent of
weight g to ξ, or $S(\mathcal{I})^g$-convergent to ξ, if for each γ > 0 and for each
neighbourhood U of 0,

\[ \Bigl\{ n \in \mathbb{N} : \frac{1}{g(n)} \bigl| \{ j \le n : x_j - \xi \notin U \} \bigr| \ge \gamma \Bigr\} \in \mathcal{I}. \]

In this case, we write $x_j \to \xi \, (S(\mathcal{I})^g)$. The class of all $S(\mathcal{I})^g$-convergent
sequences will be denoted simply by $S(\mathcal{I})^g(X)$.

Remark 1 For $\mathcal{I} = \mathcal{I}_{fin} = \{ B \subseteq \mathbb{N} : B \text{ is a finite subset} \}$, $S(\mathcal{I})^g$-convergence
coincides with statistical convergence of weight g in topological groups. Further, taking
$g(n) = n^{\alpha}$, it reduces to $\mathcal{I}$-statistical convergence of order α in topological groups,
which is studied by Savas [22].

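Definition 7 Let θ be a lacunary sequence. A sequence $x = (x_j)$ in X is said to be
$\mathcal{I}$-lacunary statistically convergent of weight g to ξ, or $S_\theta(\mathcal{I})^g$-convergent to ξ, if for
any γ > 0 and for each neighbourhood U of 0,

\[ \Bigl\{ p \in \mathbb{N} : \frac{1}{g(h_p)} \bigl| \{ j \in I_p : x_j - \xi \notin U \} \bigr| \ge \gamma \Bigr\} \in \mathcal{I}. \]

In this case, we write

\[ S_\theta(\mathcal{I})^g\text{-}\lim_{j \to \infty} x_j = \xi \quad \text{or} \quad x_j \to \xi \, (S_\theta(\mathcal{I})^g) \]

and define

\[ S_\theta(\mathcal{I})^g(X) = \Bigl\{ (x_j) : \text{for some } \xi, \ S_\theta(\mathcal{I})^g\text{-}\lim_{j \to \infty} x_j = \xi \Bigr\} \]

and in particular,

\[ S_\theta(\mathcal{I})^g(X)_0 = \Bigl\{ (x_j) : S_\theta(\mathcal{I})^g\text{-}\lim_{j \to \infty} x_j = 0 \Bigr\}. \]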

Remark 2 For $\mathcal{I} = \mathcal{I}_{fin}$, $S_\theta(\mathcal{I})^g$-convergence reduces to lacunary statistical
convergence of weight g in topological groups, which has not been studied till now.
Further, in the special case $\theta = (2^r)$, Definition 7 reduces to Definition 6.

3 Inclusion Theorems

The following theorem gives inclusion relations.

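Theorem 1 Let $g_1, g_2 \in G$ be such that there exist M > 0 and $j_0 \in \mathbb{N}$ such that
$g_1(n) / g_2(n) \le M$ for all $n \ge j_0$. Then $S(\mathcal{I})^{g_1} \subset S(\mathcal{I})^{g_2}$.

Proof For any neighbourhood U of 0,

\[
\frac{\bigl| \{ j \le n : x_j - \xi \notin U \} \bigr|}{g_2(n)}
= \frac{g_1(n)}{g_2(n)} \cdot \frac{\bigl| \{ j \le n : x_j - \xi \notin U \} \bigr|}{g_1(n)}
\le M \cdot \frac{\bigl| \{ j \le n : x_j - \xi \notin U \} \bigr|}{g_1(n)}
\]

for $n \ge j_0$. Hence for any γ > 0 and for each neighbourhood U of 0,

\[
\Bigl\{ n \in \mathbb{N} : \frac{\bigl| \{ j \le n : x_j - \xi \notin U \} \bigr|}{g_2(n)} \ge \gamma \Bigr\}
\subset \Bigl\{ n \in \mathbb{N} : \frac{\bigl| \{ j \le n : x_j - \xi \notin U \} \bigr|}{g_1(n)} \ge \frac{\gamma}{M} \Bigr\} \cup \{ 1, 2, \ldots, j_0 \}.
\]

If $x_j \to \xi \, (S(\mathcal{I})^{g_1})$, the set on the right-hand side belongs to $\mathcal{I}$, because $\mathcal{I}$ is
admissible. So we have that $S(\mathcal{I})^{g_1} \subset S(\mathcal{I})^{g_2}$.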

Similarly, we can get the following result.

Theorem 2 Let $g_1, g_2 \in G$ be such that there exist M > 0 and $i_0 \in \mathbb{N}$ such that
$g_1(n) / g_2(n) \le M$ for all $n \ge i_0$. Then
(i) $S_\theta(\mathcal{I})^{g_1}(X) \subset S_\theta(\mathcal{I})^{g_2}(X)$.
(ii) In particular, $S_\theta(\mathcal{I})^{g_1}(X) \subset S_\theta(\mathcal{I})(X)$.

We now record two more useful theorems.

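Theorem 3 For any lacunary sequence θ, $\mathcal{I}$-statistical convergence of weight g
implies $\mathcal{I}$-lacunary statistical convergence of weight g if

\[ \liminf_{p} \frac{g(h_p)}{g(k_p)} > 1. \]

Proof Since $\liminf_p g(h_p)/g(k_p) > 1$, we get an H > 1 such that for sufficiently large p
we have

\[ \frac{g(h_p)}{g(k_p)} \ge H. \]

Since $x_j \to \xi \, (S(\mathcal{I})^g)$, for each neighbourhood U of 0 and sufficiently large
p we have

\[
\frac{1}{g(k_p)} \bigl| \{ j \le k_p : x_j - \xi \notin U \} \bigr|
\ge \frac{1}{g(k_p)} \bigl| \{ j \in I_p : x_j - \xi \notin U \} \bigr|
\ge H \cdot \frac{1}{g(h_p)} \bigl| \{ j \in I_p : x_j - \xi \notin U \} \bigr|.
\]

Then for any γ > 0 and for each neighbourhood U of 0 we get

\[
\Bigl\{ p \in \mathbb{N} : \frac{1}{g(h_p)} \bigl| \{ j \in I_p : x_j - \xi \notin U \} \bigr| \ge \gamma \Bigr\}
\subseteq \Bigl\{ p \in \mathbb{N} : \frac{1}{g(k_p)} \bigl| \{ j \le k_p : x_j - \xi \notin U \} \bigr| \ge H \gamma \Bigr\} \in \mathcal{I}.
\]

This shows that $x_j \to \xi \, (S_\theta(\mathcal{I})^g)$.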

For the next theorem, we suppose that the lacunary sequence θ fulfils the condition
that for any set $C \in F(\mathcal{I})$, $\bigcup \{ n : k_{p-1} < n < k_p, \ p \in C \} \in F(\mathcal{I})$.

Theorem 4 For a lacunary sequence θ satisfying the above condition, $\mathcal{I}$-lacunary
statistical convergence of weight g implies $\mathcal{I}$-statistical convergence of weight g
(where g(n) = n), if

\[ \sup_{p} \sum_{i=1}^{p} \frac{g(h_i)}{g(k_{p-1})} = K \ (\text{say}) < \infty, \]

where g is also assumed to be monotonically increasing.

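Proof Assume that $x_j \to \xi \, (S_\theta(\mathcal{I})^g)$. Take any neighbourhood U of 0. For $\gamma, \gamma_1 > 0$
define the sets

\[ P = \Bigl\{ p \in \mathbb{N} : \frac{1}{g(h_p)} \bigl| \{ j \in I_p : x_j - \xi \notin U \} \bigr| < \gamma \Bigr\} \]

and

\[ T = \Bigl\{ n \in \mathbb{N} : \frac{1}{g(n)} \bigl| \{ j \le n : x_j - \xi \notin U \} \bigr| < \gamma_1 \Bigr\}. \]

From our assumption, it follows that $P \in F(\mathcal{I})$, the dual filter of $\mathcal{I}$. Also note that

\[ A_k = \frac{1}{g(h_k)} \bigl| \{ j \in I_k : x_j - \xi \notin U \} \bigr| < \gamma \]

for all $k \in P$. Let $n \in \mathbb{N}$ be such that $k_{p-1} < n < k_p$ for some $p \in P$. Now

\[
\begin{aligned}
\frac{1}{g(n)} \bigl| \{ j \le n : x_j - \xi \notin U \} \bigr|
&\le \frac{1}{g(k_{p-1})} \bigl| \{ j \le k_p : x_j - \xi \notin U \} \bigr| \\
&= \frac{1}{g(k_{p-1})} \bigl| \{ j \in I_1 : x_j - \xi \notin U \} \bigr| + \cdots + \frac{1}{g(k_{p-1})} \bigl| \{ j \in I_p : x_j - \xi \notin U \} \bigr| \\
&= \frac{g(h_1)}{g(k_{p-1})} \cdot \frac{1}{g(h_1)} \bigl| \{ j \in I_1 : x_j - \xi \notin U \} \bigr|
 + \frac{g(h_2)}{g(k_{p-1})} \cdot \frac{1}{g(h_2)} \bigl| \{ j \in I_2 : x_j - \xi \notin U \} \bigr| + \cdots \\
&\quad + \frac{g(h_p)}{g(k_{p-1})} \cdot \frac{1}{g(h_p)} \bigl| \{ j \in I_p : x_j - \xi \notin U \} \bigr| \\
&= \frac{g(h_1)}{g(k_{p-1})} \cdot A_1 + \frac{g(h_2)}{g(k_{p-1})} \cdot A_2 + \cdots + \frac{g(h_p)}{g(k_{p-1})} \cdot A_p.
\end{aligned}
\]

By choosing $\gamma_1 = K \gamma$ and since $\bigcup \{ n : k_{p-1} < n < k_p, \ p \in P \} \subset T$ where
$P \in F(\mathcal{I})$, it follows from our assumption on θ that the set T also belongs to
$F(\mathcal{I})$.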

Corollary 1 Let $\theta = (k_p)$ be a lacunary sequence. Then $S(\mathcal{I})^{\alpha}(X) = S_\theta(\mathcal{I})^{\alpha}(X)$
iff

\[ 1 < \liminf_{p} q_p \le \limsup_{p} q_p < \infty. \]

Finally, we conclude this paper by proving the following theorem.

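Theorem 5 $S(\mathcal{I})^g(X) \subset S_\theta(\mathcal{I})^g(X)$ if $\liminf_{n} \dfrac{g(h_p)}{g(n)} > 0$.

Proof Since $\liminf_{n} g(h_p)/g(n) > 0$, we can find an M > 0 such that for sufficiently large
n we have

\[ \frac{g(h_p)}{g(n)} \ge M. \]

Since $x_j \to \xi \, (S(\mathcal{I})^g)$, for any neighbourhood U of 0 and sufficiently large n,

\[
\frac{1}{g(n)} \bigl| \{ j \le n : x_j - \xi \notin U \} \bigr|
\ge \frac{1}{g(n)} \bigl| \{ j \in I_p : x_j - \xi \notin U \} \bigr|
\ge M \cdot \frac{1}{g(h_p)} \bigl| \{ j \in I_p : x_j - \xi \notin U \} \bigr|.
\]

For γ > 0,

\[
\Bigl\{ n \in \mathbb{N} : \frac{1}{g(h_p)} \bigl| \{ j \in I_p : x_j - \xi \notin U \} \bigr| \ge \gamma \Bigr\}
\subset \Bigl\{ n \in \mathbb{N} : \frac{1}{g(n)} \bigl| \{ j \in I_p : x_j - \xi \notin U \} \bigr| \ge M \gamma \Bigr\}.
\]

Since $\mathcal{I}$ is admissible, the set on the right-hand side belongs to $\mathcal{I}$.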

References

1. Balcerzak, M., Das, P., Filipczak, M., Swaczyna, J.: Generalized kinds of density and the
associated ideals. Acta Math. Hungar. 147(1), 97–115 (2015)
2. Çakalli, H.: On statistical convergence in topological groups. Pure Appl. Math. Sci. 43(1–2),
27–31 (1996)
3. Çakalli, H.: Lacunary statistical convergence in topological groups. Indian J. Pure Appl. Math.
26(2), 113–119 (1995)
4. Colak, R.: Statistical convergence of order α. Modern Methods in Analysis and Its Applications,
121–129. Anamaya Publisher, New Delhi, India (2010)
5. Das, P., Savaş, E.: On I -convergence of nets in locally solid Riesz spaces. Filomat 27(1), 84–89
(2013)
6. Das, P., Savaş, E.: On Iλ -statistical convergence in locally solid Riesz spaces. Math. Slovaca
65(6), 1491–1504 (2015)
7. Das, P., Savaş, E.: On I -statistically pre-Cauchy sequences. Taiwanese J. Math. 18(1), 115–126
(2014)
8. Fast, H.: Sur la convergence statistique. Colloq. Math. 2, 241–244 (1951)
9. Fridy, J.A.: On statistical convergence. Analysis 5, 301–313 (1985)
10. Fridy, J.A., Orhan, C.: Lacunary statistical convergence. Pacific J. Math. 160, 43–51 (1993)
11. Gadjiev, A.D., Orhan, C.: Some approximation theorems via statistical convergence. Rocky
Mt. J. Math. 32(1), 508–520 (2002)
12. Kostyrko, P., Šalát, T., Wilczyński, W.: I -convergence. Real Anal. Exch. 26(2), 669–685
(2000/2001)
13. Maio, G.D., Kocinac, L.D.R.: Statistical convergence in topology. Topology Appl. 156, 28–45
(2008)
14. Šalát, T.: On statistically convergent sequences of real numbers. Math. Slovaca 30, 139–150
(1980)
15. Savaş, E., Das, Pratulananda: A generalized statistical convergence via ideals. Appl. Math.
Lett. 24, 826–830 (2011)
16. Savaş, E.: Δm -strongly summable sequence spaces in 2-normed spaces defined by ideal con-
vergence and an Orlicz function. Appl. Math. Comput. 217, 271–276 (2010)
17. Savaş, E.: A sequence spaces in 2-normed space defined by ideal convergence and an Orlicz
function. Abst. Appl. Anal. 2011, Article ID 741382 (2011)
18. Savaş, E.: On some new sequence spaces in 2-normed spaces using Ideal convergence and an
Orlicz function. J. Ineq. Appl. Article Number: 482392 (2010). https://doi.org/10.1155/2010/482392
19. Savaş, E.: On generalized double statistical convergence via ideals. In: The Fifth Saudi Science
Conference, pp. 16–18 (2012)
20. Savaş, E.: On I -lacunary statistical convergence of order α for sequences of sets. Filomat 29(6),
1223–1229 (2015)

21. Savaş, E.: Iθ -statistically convergent sequences in topological groups. Mat. Bilten 39(2), 19–28
(2015)
22. Savaş, E., Savaş Eren, R.E.: Iθ -statistical convergence of order α in topological groups. Applied
mathematics in Tunisia, 141–148, Springer Proc. Math. Stat., 131, Springer, Cham (2015)
23. Schoenberg, I.J.: The integrability methods. Amer. Math. Monthly 66, 361–375 (1959)
Chapter 5
On the Integral-Balance Solvability
of the Nonlinear Mullins Model

Jordan Hristov

Abstract The integral-balance method is applied to the nonlinear Mullins model of
thermal grooving. A successful integral-balance solution utilizing the
double-integration technique becomes possible after application of the nonlinear
Broadbridge transform. The Broadbridge transform converts the Mullins equation into a
Dirichlet problem for a nonlinear diffusion equation with a Fujita-type nonlinearity
of the diffusion coefficient. The solution is straightforward but needs an additional
optimization procedure determining the unspecified exponent of the generalized assumed
parabolic profile.

Keywords Integral-balance method · Mullins equation · Double-integration method · Approximate solution

1 Introduction

1.1 Mullins Models of Thermal Diffusion Grooving

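The thermal grooving on a metal surface by the mechanism of evaporation–condensation
is modelled by the nonlinear Mullins equation [1–3]

\[
\frac{\partial u}{\partial t} = \frac{D(0)}{1 + (\partial u / \partial x)^2} \frac{\partial^2 u}{\partial x^2}, \quad
\frac{\partial u(0, t)}{\partial x} = m = \text{const.}, \quad
\frac{\partial u(x, t)}{\partial x} \to 0 \ \text{as} \ x \to \infty, \quad u(x, 0) = 0
\tag{1}
\]

with initial conditions

\[ u_x(0, t) = \text{const.}, \quad u(x, 0) = 0, \quad u(\infty, 0) = 0, \quad u_x(\infty, t) = 0 \tag{2} \]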

J. Hristov (B)
Department of Chemical Engineering, University of Chemical Technology
and Metallurgy (UCTM), 1756 Sofia, Bulgaria
© Springer Nature Singapore Pte Ltd. 2018
D. Ghosh et al. (eds.), Mathematics and Computing, Springer Proceedings
in Mathematics & Statistics 253, https://doi.org/10.1007/978-981-13-2095-8_5

and subject to the boundary conditions [3]

\[ u_x(0, t) = m, \quad u(\infty, t) = u_{xx}(0, t) = u_{xxx}(0, t) = 0. \tag{3} \]

The boundary condition at x = 0 corresponds to the physical requirement that
the flux of vapours be equal to zero at the origin of the groove [1–3]. The model
of Mullins considers an axisymmetric groove (about the vertical axis at x = 0),
and due to this symmetry, we may consider only a half-profile for x ≥ 0 (see Fig. 1).
The model (1)–(3) has been solved and analysed by many authors [1, 2, 4–6]. In
addition, when the surface curvature is small (i.e. for $(u_{xx})^2 \ll 1$), a linearization of
the model (1) as a fourth-order parabolic equation [3], accounting mainly for groove
formation by the surface diffusion mechanism, is possible, namely [1, 3]

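\[
\frac{\partial u}{\partial t} = -B \frac{\partial^4 u}{\partial x^4}, \quad
\frac{\partial u(0, t)}{\partial x} = m, \quad
\frac{\partial^3 u(0, t)}{\partial x^3} = 0, \quad 0 < x < \infty, \ t > 0.
\tag{4}
\]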

Here, the apparent diffusion coefficient $B = D_s \gamma \Omega^2 \nu / kT$ is a dimensional group
involving the surface diffusion coefficient $D_s$, the free surface energy
per unit area γ, the molecular volume Ω, and the area ν where the surface diffusion
occurs.
The linear model (4) was recently solved by a new integral-balance technique
named multiple method (MIM) [7] in two recently published articles: for the case of an
integer-order time derivative [7] and in a time-fractional (subdiffusion) version (suggested
in [8]) [9]. In both cases, the solutions reveal strong subdiffusion behaviour of the
process modelled, because the groove surface profile evolves in time proportionally
to $t^{1/4}$. The present work addresses an approximate solution of the complete
model (1) by the integral-balance method [10–16].

Fig. 1 Schematic groove profile (a) with equivalent Dirichlet diffusion example (by inverting the
profile) (b), explaining why the integral-balance method is applied. Adapted from [7] by courtesy of
Thermal Science

1.2 The Motivation for Doing This Study

The main motivation for this study comes from the elegant work of Broadbridge
[1], where the model (1) was solved exactly (see comments in the sequel) and
a more general class of nonlinear Mullins-type models was defined. The present
author already has experience in applying the integral-balance method to nonlinear
diffusion problems [12, 13, 17, 18], as well as in solving the linearized model (4) [7]
and its time-fractional version [9] by MIM. The next challenging task was therefore
the solution of (1) by the integral method, an attempt never made before. The results
of these efforts are presented in this work.

2 The Integral-Balance Method: Necessary Background

The integral-balance method for diffusion models of heat and mass is based on the
concept of a finite penetration depth, evolving in time, and a sharp front of the
solution [10, 11] propagating with a finite speed. There are two principal integration
techniques of the method: the simple integration method known as the heat-balance integral
method (HBIM) of Goodman [10–12, 14–16] and the double-integration method (DIM)
[12–14, 17–20] (see the sequel). The basic rules of these techniques are explained
next.

2.1 Single-Integration Approach

The approach considers (in case of transient diffusion with a constant transport
coefficient a) a single-step integration over the penetration depth δ(t), that is
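$$\int_0^{\delta} \frac{\partial\theta}{\partial t}\,dx = a\int_0^{\delta} \frac{\partial^2\theta}{\partial x^2}\,dx, \qquad \theta(x,t) = 0, \quad t > 0. \tag{5}$$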

Physically, the relationship (5) is a simple mass balance over a diffusion layer of
finite depth δ(t), while mathematically it is the zero moment of the diffusion equation.
It is worth noting that the physically based concept of a finite speed (and finite
penetration depth) of the diffusant in a semi-infinite medium actually replaces the
boundary condition at infinity θ(∞) = 0 with θ(δ) = 0 and ∂θ(δ)/∂x = 0, known
also as Goodman's conditions [10, 11]. This change in the boundary condition forms a
sharp front of the solution δ(t) beyond which the medium is undisturbed. Moreover, it
converts the problem defined initially in a semi-infinite medium into a two-point problem.
After application of the Leibniz rule, we get from (5) the basic relationship of
HBIM
$$\frac{d}{dt}\int_0^{\delta} \theta(x,t)\,dx = -a\,\frac{\partial\theta}{\partial x}(0,t). \tag{6}$$

Replacement of θ(x, t) by an assumed profile $\theta_a$ expressed as a function of the
dimensionless space variable x/δ in (6) results in an ODE for δ(t). The principal
disadvantage of the single-step integration technique is that the gradient on the
right-hand side of (6) must be defined through the assumed profile $\theta_a$.

2.2 Double-Integration Approach

The double-integration method (DIM) in its original version [12–14, 20] employs a
two-step integration procedure: first integration from 0 to x and a second one from
0 to δ (see details in the cited references). Here, we will use a modified version [13,
17, 18] (the first formula of (7)) where, after application of the Leibniz rule, we have
(the second formula of (7))
$$\int_0^{\delta}\!\int_x^{\delta} \frac{\partial\theta(x,t)}{\partial t}\,dx\,dx = a\,\theta(0,t) \;\Longrightarrow\; \frac{d}{dt}\int_0^{\delta}\!\int_x^{\delta} \theta(x,t)\,dx\,dx = a\,\theta(0,t). \tag{7}$$

In (7), the first integration is near the front (from x to δ). The approach expressed by
(7) is general and applicable to both integer-order [12] and time-fractional models
[13, 17–19] (see details in [12–14]).
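
As a hedged numerical illustration of the left-hand side of (7), the following sketch evaluates the double integral for a parabolic profile with unspecified exponent, $(1 - x/\delta)^n$, and compares it with the closed form $\delta^2/((n+1)(n+2))$. The function names, the midpoint quadrature and the step count are assumptions made only for this illustration. The last function gives the depth that follows from (7) when θ(0, t) = 1, δ(0) = 0 and the coefficient a is constant.

```python
import math


def double_integral_profile(n, delta, steps=2000):
    """Midpoint estimate of the double integral of (1 - x/delta)**n.

    The inner integral runs from x to delta and the outer one from 0 to delta,
    as on the left-hand side of (7).
    """
    h = delta / steps
    # Profile values at the cell midpoints x_i = (i + 0.5) * h.
    values = [(1.0 - (i + 0.5) * h / delta) ** n for i in range(steps)]
    total = 0.0
    tail = 0.0  # running integral of the profile over the cells to the right
    for i in reversed(range(steps)):
        # Inner integral from the midpoint of cell i up to delta:
        # the cells already accumulated plus half of the current cell.
        inner = tail + 0.5 * values[i] * h
        total += inner * h
        tail += values[i] * h
    return total


def closed_form(n, delta):
    """Closed form delta**2 / ((n + 1)(n + 2)) of the same double integral."""
    return delta ** 2 / ((n + 1) * (n + 2))


def penetration_depth_linear(a, t, n):
    """Depth obtained from (7) with theta(0, t) = 1 and delta(0) = 0."""
    return math.sqrt(a * t * (n + 1) * (n + 2))


if __name__ == "__main__":
    for n in (1.5, 2.0, 3.0):
        print(n, double_integral_profile(n, 1.0), closed_form(n, 1.0))
```

The agreement of the two columns for several exponents is only a consistency check of the integration step; it says nothing about which exponent n is optimal.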

3 The Integral-Balance Solution

Prior to applying the DIM to (1), a necessary step is a transformation to a more
convenient form, namely a standard nonlinear diffusion equation. Precisely,
the Broadbridge transformation (BT) of the nonlinear part of (1) [1] allows the
integral-balance method to be applied successfully.

3.1 Transformation to Nonlinear Diffusion Problem

Following Broadbridge [1] and with the help of (2) (see also the first BC in (3)), we may
apply the substitution $\Theta = u_x/m$. Consequently, we get

$$u = \int_{\infty}^{x} \Theta(z,t)\,dz. \tag{8}$$

$$\frac{du}{dt} = \frac{d}{dt}\int_{\infty}^{x} \Theta(z,t)\,dz \tag{9}$$

Applying the Leibniz rule in inverse order (to that used in the previous point) and
with the last boundary condition in (2), we may transform (1) as [1]

$$\int_{\infty}^{x} \frac{\partial\Theta}{\partial t}\,dx = D(\Theta)\,\frac{\partial\Theta}{\partial x}. \tag{10}$$

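Now after differentiation with respect to x, we get a more convenient form

$$\frac{\partial\Theta}{\partial t} = \frac{\partial}{\partial x}\left(D(\Theta)\,\frac{\partial\Theta}{\partial x}\right), \qquad \Theta(x,0) = 0 \tag{11}$$

with boundary conditions

$$\Theta(0,t) = \frac{u_x(0,t)}{m} = 1, \qquad \Theta(\infty,t) = 0. \tag{12}$$

Hence, we get a Dirichlet problem with respect to the variable Θ(x, t).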

Further, Broadbridge [1] developed an exact self-similar solution in terms of the
classical similarity variable $\eta_B = (1/2)(x/\sqrt{D_0 t}) = \eta/2$ (η is defined naturally
through the solution developed in this work). We will refer to the Broadbridge
solution when specific points of the solution developed here have to be
commented on. Now, we proceed in a different way, applying the integral-balance
method to (11) and (12).

3.2 Assumed Profile

Prior to applying DIM, we should select the assumed profile $\Theta_a(x/\delta)$. In this work,
a parabolic one with unspecified exponent is used, namely
$$\Theta_a = \Theta_s\left(1 - \frac{x}{\delta}\right)^{n}, \qquad \Theta_s = \Theta(0,t) = 1. \tag{13}$$
The profile (13) obeys all boundary conditions at both ends of the penetration layer
(0 ≤ x ≤ δ) for any value of the exponent n [7, 11–15, 18, 19]. This feature offers
flexibility to optimize the numerical value of the exponent, as will be demonstrated
later in this work. With the boundary conditions (3) and the transform (8), we have

$$\Theta(\delta) = \Theta_x(\delta) = \Theta_{xx}(0,t) = \Theta_{xxx}(0,t) = 0. \tag{14}$$

In accordance with the schematic presentation in Fig. 1a, the penetration depth
δ(t) equals the half-width of the cavity, that is δ(t) = w/2, measured from the bottom
point x = 0 up to the inflection point (because beyond the point of inflection the
surface of the next groove begins). The scheme in Fig. 1b is an inverted profile
corresponding to the Dirichlet diffusion problem. It is included to facilitate
understanding of how the integral-balance method, well known from transient diffusion
and heat conduction problems, can be applied to the Mullins equation.

3.3 The Nonlinear Diffusion Coefficient D(Θ)

In the transformed model (11), the diffusion coefficient in terms of the variable Θ is
transformed as
$$D(u_x) = \frac{D(0)}{1 + (u_x)^2} \;\Longrightarrow\; D(\Theta) = \frac{D(0)}{1 + a\Theta^2}, \qquad a = m^2. \tag{15}$$

This is a Fujita-type nonlinearity [21–23], as was noted by Broadbridge [1], and a
special transformation of D(Θ) is needed prior to application of the integral-balance
method. Denoting $D(0) = D_0$, the right-hand side of (11) can be transformed as

$$D(\Theta)\,\frac{\partial\Theta}{\partial x} = \frac{D_0}{1 + a\Theta^2}\,\frac{\partial\Theta}{\partial x} = D_0\,\frac{\partial}{\partial x}\left[\frac{\arctan(a\Theta)}{\sqrt{a}}\right]. \tag{16}$$

Then, the model (11) takes the form

$$\frac{\partial\Theta}{\partial t} = D_0\,\frac{\partial^2}{\partial x^2}\left[\frac{\arctan(a\Theta)}{\sqrt{a}}\right]. \tag{17}$$

3.4 Penetration Depth

In accordance with the rules of DIM, we have

$$\frac{d}{dt}\int_0^{\delta}\!\int_x^{\delta} \Theta\,dx\,dx = D_0\int_0^{\delta}\!\int_x^{\delta} \frac{\partial^2}{\partial x^2}\left[\frac{\arctan(a\Theta)}{\sqrt{a}}\right]dx\,dx \tag{18}$$

The double integration in (18), with account of the boundary conditions (14), yields

$$\frac{1}{(n+1)(n+2)}\,\frac{d\delta^2}{dt} = D_0\,\frac{\arctan(a)}{\sqrt{a}} = D_0 M(m) \tag{19}$$

$$M(m) = \frac{\arctan(a)}{\sqrt{a}} = \frac{\arctan(m^2)}{m} = \text{const}. \tag{20}$$

The initial condition Θ(x, 0) = 0, which corresponds to the physically based
δ(t = 0) = 0, results in

$$\delta = \sqrt{D_0 t}\,\sqrt{M(m)N}, \qquad N = (n+1)(n+2). \tag{21}$$

Therefore, the penetration depth propagates in accordance with the classical
(Fickian) diffusion law $\sqrt{t}$, which is in agreement with the result of Broadbridge
[1]. In contrast, the solution of the linearized model (4) reveals subdiffusion
scaling because $\delta \equiv t^{1/4}$ [7, 9] (see also [8]). Moreover, we may define an effective
diffusion coefficient $D_m = D_0 M(m)$, and consequently (21) takes a form mimicking
the penetration depth of the linear diffusion problem [10, 11, 15], i.e.
$\delta_m = \sqrt{D_m t}\,\sqrt{(n+1)(n+2)}$, but the explicit effect of the nonlinearity is lost.
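
The square-root-of-time scaling of the penetration depth can be illustrated with a short sketch. It evaluates M(m) exactly as printed in (20) and the depth as printed in (21); the function names and the sample values of D0, m and n are assumptions chosen only for illustration.

```python
import math


def m_factor(m):
    """M(m) evaluated exactly as printed in (20): arctan(m**2) / m."""
    return math.atan(m ** 2) / m


def penetration_depth(d0, t, m, n):
    """Penetration depth (21): sqrt(D0 * t) * sqrt(M(m) * N)."""
    big_n = (n + 1) * (n + 2)
    return math.sqrt(d0 * t) * math.sqrt(m_factor(m) * big_n)


if __name__ == "__main__":
    d0, m, n = 1.0, 0.1, 2.0  # illustrative values only
    for t in (1.0, 4.0, 16.0):
        # Each fourfold increase of t doubles the depth (Fickian scaling).
        print(t, penetration_depth(d0, t, m, n))
```

Because the time enters only through $\sqrt{D_0 t}$, the ratio of the depths at two instants does not depend on m or n.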

3.5 Approximate Profile (Solution)

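With the established relationship about δ(t), the approximate solution of (11) is

$$\Theta_a = \left(1 - \frac{x}{\sqrt{D_0 t}\,\sqrt{M(m)N}}\right)^{n} = \left(1 - \frac{\eta}{\sqrt{M(m)N}}\right)^{n}, \qquad \eta = \frac{x}{\sqrt{D_0 t}} \tag{22}$$

thus defining in a natural way the Boltzmann similarity variable $\eta = x/\sqrt{D_0 t}$.
Now remembering that $\partial u/\partial x = \Theta/m \Longrightarrow \partial u_a/\partial x = \Theta_a/m$, we get

$$\frac{du_a}{dx} = \frac{1}{m}\left(1 - \frac{x}{\sqrt{D_0 t}\,\sqrt{M(m)N}}\right)^{n}. \tag{23}$$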

Integration in (23) from 0 to δ yields

$$u_a(x,t) = \int_0^{\delta} \Theta_a\,dx = -\frac{\sqrt{D_0 t}\,\sqrt{M(m)N}}{m(n+1)}\left(1 - \frac{x}{\delta}\right)^{n+1}\Bigg|_0^{\delta}. \tag{24}$$

Hence, in terms of the original variable u(x, t), the solution for $u_a(x,t)$ is
$$u_a(x,t) = \frac{\sqrt{D_0 t}\,\sqrt{M(m)}}{m}\sqrt{\frac{n+2}{n+1}}\left(1 - \frac{x}{\delta}\right)^{n+1}. \tag{25}$$

For x = 0 in (25), we get $u_a(0,t)$, which is the maximal groove depth (see the sequel),
and the normalized profile following from (25) can be presented as

$$U_a^{*} = \frac{u_a(x,t)}{u_a(0,t)} = \frac{u_a(x,t)\,m}{\sqrt{D_0 t}\,\sqrt{M(m)}}\sqrt{\frac{n+1}{n+2}} = \left(1 - \frac{x}{\delta}\right)^{n+1}. \tag{26}$$

Alternatively, we may scale the groove depth by the natural length scale $\sqrt{D_0 t}$,
namely
$$U_a^{\circ} = \frac{u_a(x,t)}{\sqrt{D_0 t}} = \frac{\sqrt{M(m)}}{m}\sqrt{\frac{n+2}{n+1}}\left(1 - \frac{x}{\delta}\right)^{n+1}. \tag{27}$$

At this point, we have to mention that the approximate MIM solution of (4)
[7, 9] is
$$u_a(x,t)_{MIM} = \frac{m}{n}(Bt)^{1/4}M_4^{1/4}\left(1 - \frac{x}{(Bt)^{1/4}M_4^{1/4}}\right)^{n} = \frac{m}{n}(Bt)^{1/4}M_4^{1/4}\left(1 - \frac{\eta_m}{M_4^{1/4}}\right)^{n}. \tag{28}$$

In (28), $M_4 = \Gamma(n+k+1)/\Gamma(n+1)$ and k is the number of integrations
applied by MIM (in the case of the model (4), we have k = 4). The similarity variable
$\eta_m = x/(Bt)^{1/4}$ is of non-Boltzmann type, and the natural length scale is $(Bt)^{1/4}$ [3,
7, 9]. In this context, it is worth noting that $\eta_m = x/(Bt)^{1/4}$ was used by Mullins [3]
as an ansatz allowing the linearized equation (4) to be transformed into an ODE. Hence,
the linear problem (4) is easily solvable by MIM, but the solution depends on a nonlinear
similarity variable, whereas the nonlinear problem (1) results in a solution
expressed through the Boltzmann variable $\eta = x/\sqrt{D_0 t}$ but needs a nonlinear
transform (the Broadbridge transform, BT) at the beginning.
Following (25), at $\eta = \sqrt{M(m)N}$ (corresponding to x = δ), we have $\Theta_a = 0 \Longrightarrow
u_a = U_a = 0$. The value of m used in the original study of Mullins [3] was
m = 0.1. In this case, $M(m) \approx 0.099$ and $\sqrt{M(m)} \approx 0.316$. Therefore, the groove
half-width is approximately $\delta = w/2 \approx 0.996\sqrt{D_0 t}\,\sqrt{(n+1)(n+2)}$, where n is
still unspecified. The solution of the linearized model (4) in [7] with m = 0.1 and
n = 4.555 (see details in [7, 9]) provides $\delta_4 = w/2 \approx 8.106(Bt)^{1/4}$.
In addition, the condition x = 0 defines the maximum of u(x, t) attained by the
profile at the groove bottom, denoted as the groove depth $G_0(t)$, namely
$$G_0(t) = \frac{\sqrt{D_0 t}\,\sqrt{M(m)}}{m}\sqrt{\frac{n+2}{n+1}}, \qquad G_0^{*}(t) = \frac{u(0,t)}{\sqrt{D_0 t}} = \frac{\sqrt{M(m)}}{m}\sqrt{\frac{n+2}{n+1}} \tag{29}$$
where $G_0^{*}(t)$ is the groove depth normalized by the natural length scale $\sqrt{D_0 t}$.

4 Refinement of the Approximate Solution

4.1 Residual Function

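When the approximate solution is used as an alternative to the exact one, it is natural
that the residual function of the model (1) differs from zero, namely
$$R_u = \left\{\frac{\partial u_a}{\partial t} - \frac{D_0}{1 + \left(\dfrac{\partial u_a}{\partial x}\right)^{2}}\,\frac{\partial^2 u_a}{\partial x^2}\right\} \neq 0 \tag{30}$$

or alternatively, following (16) and (17), in the forms (31) and (32)

$$R_u = \frac{\partial u_a}{\partial t} - \frac{D_0}{1 + a\Theta^2}\,\frac{\partial\Theta_a}{\partial x} \neq 0 \tag{31}$$

$$R_u = \frac{\partial u_a}{\partial t} - D_0\,\frac{\partial^2}{\partial x^2}\left[\frac{\arctan(a\Theta_a)}{\sqrt{a}}\right] \neq 0. \tag{32}$$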

The refinement of the approximate solution simply means a minimization of R
with respect to the exponent n within the range 0 ≤ x ≤ δ, since all other parameters
of the model are initially specified. First, let us see how the residual function
behaves at the boundaries x = 0 and x = δ. For x = 0, we have $\Theta_a = 1$,
while for x → δ we get $\Theta_a \to 0$. Now with the assumed profile (13), we have

$$R_{\Theta 2} = n\left(1 - \frac{x}{\delta}\right)^{n-1}\frac{x}{\delta}\,\frac{1}{\delta}\frac{d\delta}{dt} - D_0\,\frac{\partial^2}{\partial x^2}\left[\frac{1}{\sqrt{a}}\arctan\left(a\left(1 - \frac{x}{\delta}\right)^{n}\right)\right]. \tag{33}$$

For x = 0 in (33), we have $R_{\Theta 2} = 0 - D_0\,\frac{\partial^2}{\partial x^2}\left[\frac{1}{\sqrt{a}}\arctan(a)\right]$ for any value of n. Further,
for x → δ we have directly $R_{\Theta 2} = 0$, also for any value of n. Moreover, the product
(1/δ)(dδ/dt) simply reduces to 1/(2t); that is, the first term in (33) decays in
time. Hence, these tests do not provide the needed information about the exponent
n, and special attention to the approximation of the diffusion term is needed.

4.2 Approximation of the Diffusion Term

Now, the problem at issue is how to approximate the second term of $R_{\Theta 2}$ as a function
of x/δ. It is well known that arctan(y) has a convergent series expansion
$$\arctan(y) = \sum_{j=0}^{\infty}(-1)^{j}\,\frac{y^{2j+1}}{2j+1} = y - \frac{y^3}{3} + \frac{y^5}{5} - \frac{y^7}{7} + \cdots \tag{34}$$
with radius of convergence 1, that is for −1 ≤ y ≤ 1.
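
To give a feeling for how quickly the series (34) converges for small arguments, the sketch below compares its partial sums with the library arctangent. The helper name and the sample arguments are assumptions made for illustration (0.01 and 0.16 correspond to a = m² for m = 0.1 and m = 0.4).

```python
import math


def arctan_partial_sum(y, terms):
    """Sum of the first `terms` terms of the series (34)."""
    return sum((-1) ** j * y ** (2 * j + 1) / (2 * j + 1) for j in range(terms))


if __name__ == "__main__":
    for y in (0.01, 0.16, 0.5, 1.0):
        for terms in (1, 2, 4):
            approx = arctan_partial_sum(y, terms)
            # The truncation error is bounded by the first omitted term
            # because the series is alternating with decreasing terms.
            print(y, terms, approx, abs(approx - math.atan(y)))
```

For arguments well below unity a single term already reproduces the arctangent closely, whereas at y = 1 the partial sums converge slowly.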

In our case $-1 \le \Theta_a \le 1$ and 0 ≤ a ≤ 1, and with $y = a\Theta_a$, it is possible
to obtain a series expansion like (34). However, as a first attempt we will use
the linear approximation arctan(y) ≈ (π/4)y for −1 ≤ y ≤ 1, especially when
y → 1, which mimics the first term in the series (34) and corresponds to x/δ → 0
when $y = a\Theta_a$. This physically corresponds to a groove profile near its origin x = 0,
when m = 1. Since we always have (1 − x/δ) < 1 and $m < 1 \Longrightarrow a = m^2 \ll 1$,
then at least the product $a\Theta_a = a(1 - x/\delta)^n$ is of order of magnitude $10^{-2}$; it is clear
from (34) that beyond the fourth term all following terms will be negligible. Hence,
replacing $y = a\Theta_a$ we get $\arctan(a\Theta_a) \approx \frac{\pi}{4}(a\Theta_a)$, and with the profile (13), this linear
approximation becomes

$$\arctan(a\Theta_a) \approx \frac{\pi}{4}(a\Theta_a) \approx \frac{\pi}{4}\,a\left(1 - \frac{x}{\delta}\right)^{n}. \tag{35}$$
The approximation (35) is valid for n > 0 because from the condition 0 ≤ y ≤ 1 we
should have $(1 - x/\delta)^n < 1$. Therefore, the diffusion term can be approximated as

$$\frac{\partial^2}{\partial x^2}\left[\frac{1}{\sqrt{a}}\arctan\left(a\left(1 - \frac{x}{\delta}\right)^{n}\right)\right] \approx \frac{\pi}{4}\sqrt{a}\,\frac{n(n-1)}{\delta^2}\left(1 - \frac{x}{\delta}\right)^{n-2}. \tag{36}$$

Now after this approximation the residual function can be presented in two forms

$$R_{\Theta 2} \approx n\left(1 - \frac{x}{\delta}\right)^{n-1}\frac{x}{\delta}\,\frac{1}{\delta}\frac{d\delta}{dt} - D_0\,\frac{\pi}{4}\sqrt{a}\,\frac{n(n-1)}{\delta^2}\left(1 - \frac{x}{\delta}\right)^{n-2} \tag{37}$$

$$R_{\Theta 2} \approx \frac{1}{\delta^2}\left[n(1-z)^{n-1}z\,\delta\frac{d\delta}{dt} - D_0\,\frac{\pi}{4}\sqrt{a}\,n(n-1)(1-z)^{n-2}\right]. \tag{38}$$

In (38), the moving boundary domain 0 ≤ x ≤ δ is transformed into one with fixed
boundaries 0 ≤ z = x/δ ≤ 1. The product $\delta\,\frac{d\delta}{dt}$ is time-independent and therefore

$$R_{\Theta 2} \approx \frac{1}{t}\left\{\frac{nz(1-z)^{n-1}[M(m)(n+1)(n+2)] - \frac{\pi}{2}m[n(n-1)(1-z)^{n-2}]}{2[M(m)(n+1)(n+2)]}\right\}. \tag{39}$$
In (39), we have a term (in the curly brackets) which is time-independent, but in
general $R_{\Theta 2}$ decays in time. Now with the new construction of $R_{\Theta 2}$, setting x = 0 we
get $R_{\Theta 2}(x = 0) \approx 0 - \frac{\pi}{2}m(n-1)$, which is zero for n = 1.

4.3 Optimal Exponents with Linear Approximation of arctan(aΘa)

The optimal exponent n can be determined by minimization of the squared error of
approximation defined as $E(n,m,t) = \int_0^1 (R_{\Theta 2})^2\,dz = \frac{1}{t^2}\,e(n,m)$, where e(n, m) is
the result of integration of the time-independent term of (39). The procedure is well
described in [12, 13, 15, 18, 19], and we avoid the lengthy expressions here.
The minimization of e(n, m) for given values of m was performed with Maple.
For two values of m used in the literature, m = 0.1 [3] and m = 0.4 [1], we have
n(m = 0.1) ≈ 0.985 with e(n, m) ≈ 0.0289, and n(m = 0.4) ≈ 0.32 with e(n, m) ≈
0.00511, respectively. These values of the exponents do not obey the requirement
n > 2. However, numerical tests revealed that as m decreases, that is for grooves
with small angles β at the origin (see Fig. 1), the values of the optimal exponents
increase, and vice versa. As examples supporting this statement, the following results
were obtained: n(m = 0.01) ≈ 2.257 with e(n, m) ≈ 0.696, n(m = 0.02) ≈ 1.186
with e(n, m) ≈ 0.0585, and n(m = 0.3) ≈ 0.9336 with e(n, m) ≈ 0.00987.
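Therefore, the first attempt to use the approximation (35) provides reasonable data
only for small values of m, which limits the application of this approach. Nevertheless,
if the number of terms in the series (34) is increased, then we have a rapidly converging
series
$$\arctan(a\Theta_a) \approx a\left(1 - \frac{x}{\delta}\right)^{n} - \frac{a^3}{3}\left(1 - \frac{x}{\delta}\right)^{3n} + \frac{a^5}{5}\left(1 - \frac{x}{\delta}\right)^{5n} + \cdots \tag{40}$$
Then, the approximate diffusion term can be presented as

$$\frac{\partial^2}{\partial x^2}\left[\frac{1}{\sqrt{a}}\arctan\left(a\left(1 - \frac{x}{\delta}\right)^{n}\right)\right] \approx \frac{1}{\sqrt{a}}\left[a\,\frac{n(n-1)}{\delta^2}\left(1 - \frac{x}{\delta}\right)^{n-2} - \frac{a^3}{3}\,\frac{3n(3n-1)}{\delta^2}\left(1 - \frac{x}{\delta}\right)^{3n-2} + \frac{a^5}{5}\,\frac{5n(5n-1)}{\delta^2}\left(1 - \frac{x}{\delta}\right)^{5n-2} + \cdots\right] \tag{41}$$
This approach, however, raises a new problem concerning the reasonable number of terms
in the series (41), which is beyond the scope of this report.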

4.4 Brief Notes

To recapitulate, the results presented are the first attempt to solve the
complete Mullins equation by the integral-balance method, in particular by the
double-integration technique. The crucial points are the nonlinear transform of the
diffusion term, after the initial transformation of Broadbridge (BT), and then the
approximation of the diffusion term in the residual function. The new challenging
problem emerging in the determination of the optimal exponent is the approximation
of $\arctan(a\Theta_a)$ when $\Theta_a$ is the generalized parabolic profile (13), but this calls for
new studies beyond the scope of the present communication.

5 Numerical Simulations

Numerical simulations with the approximate solution $\Theta_a$ (22), actually with the completely
normalized profile (26), are shown in Fig. 2a. All these plots correspond to DIM
solutions with optimal exponents satisfying the condition n > 2 or at least n ≈ 2.0.
The curves reveal the physical adequacy of the solution since larger initial angles
(represented by the value of m) result in wider groove openings and vice versa.
At this stage of the study, these results are, to some extent, qualitative since no
extensive database of values of m is available in the literature. However, the
dimensionless presentation shown in Fig. 2 is general since it uses the similarity
variable $\eta = x/\sqrt{D_0 t}$ as the independent variable.
As mentioned in the previous section, these results are obtained with the
linear approximation of $\arctan(a\Theta_a)$. The idea of using more terms of the approximating
series (34) results in (40) and consequently in (41). Taking into account the small
values of m (order of magnitude $10^{-2}$ to $10^{-3}$) and the fact that $a = m^2$ (order of
magnitude $10^{-4}$ to $10^{-6}$), as well as that (1 − x/δ) ≤ 1 with n > 2, we may
approximately estimate that, for example, the second term in the series (41) with exponent
3n − 1 > 5 would have an order of magnitude of about $10^{-6}$ to $10^{-10}$. Hence, the
linear approximation used in this work is physically reasonable. More terms in (41)
could be taken into account when m takes large values of the order of
unity or larger, that is in modelling grooves with larger openings. This is a good
challenge that needs to be confirmed by modelling and comparison with experimental
data on groove shapes, but it is beyond the scope of this report.
The groove depth evolutions in time presented in Fig. 2b directly reproduce the
linear relationship (29) when the natural length scale $\sqrt{D_0 t}$ is used as the independent
variable: larger values of m result in wider grooves but with slower growth, and vice
versa.

Fig. 2 Numerical simulation with DIM solutions: (a) dimensionless groove profiles at various m
with the similarity variable η as independent variable; (b) groove depth as a function of the natural
length scale $\sqrt{D_0 t}$

Both groove depth and opening (w = 2δ) are results of a Gaussian diffusion
process since $G_0^2 \equiv t$ and $w^2 \equiv t$ (because $\delta^2 \equiv t$), in contrast to the linearized model
(4) where $\delta^2 \equiv t^{1/2}$ (because $\delta \equiv t^{1/4}$) and the process is subdiffusive. Precisely, with
the approximate profile (13), the mean squared displacement characterizing the diffusion
process is $\langle x^2\rangle \equiv \delta^2$ (see [19]): when $\langle x^2\rangle \equiv t^{\gamma}$ with γ = 1, we have a normal Gaussian
process, but when γ < 1 the process is subdiffusive (see [9, 17]). In the context of the physical
process of groove evolution, this should be related to the mechanisms involved: the
evaporation–condensation mechanism [model (1)] is Gaussian, while the surface
diffusion mechanism is subdiffusive [7–9].

6 Conclusion

An application of the integral-balance method to the approximate solution of the nonlinear
Mullins model of thermal grooving has been reported. The double-integration method (DIM)
was successfully applied, but this solution requires two important preliminary steps:
(1) application of the Broadbridge transform
converting the original model (1) into a Dirichlet problem for a nonlinear diffusion
equation with Fujita-type nonlinearity and (2) a nonlinear transform of the diffusion
term, a technique used before [12, 18], allowing application of the assumed parabolic
profile.
The principal point in the refinement of the approximate solution is the
approximation of the nonlinear diffusion term, and this problem strongly depends
on the specific function that should be approximated and twice differentiated with
respect to the space coordinate. The numerical experiments reveal adequate behaviour
of the simulated results, which reasonably model groove shapes.
The solution process developed here raises many questions and interesting problems
that might be solved in future studies, but the main step towards solving the nonlinear
Mullins model by the integral-balance method has already been made.

References

1. Broadbridge, P.: Exact solvability of the Mullins nonlinear diffusion model of groove development. J. Math. Phys. 30, 1648–1651 (1989)
2. Broadbridge, P.: Exact solution of a degenerate fully nonlinear diffusion equation. Z. Angew. Math. Phys. 55, 34–538 (2004)
3. Mullins, W.W.: Theory of thermal grooving. J. Appl. Phys. 28, 333–339 (1957)
4. Kitada, A.: On properties of a classical solution of nonlinear mass transport equation. J. Math. Phys. 27, 1391–1392 (1986)
5. Martin, P.A.: Thermal grooving by surface diffusion: Mullins revisited and extended to multiple grooves. Q. J. Appl. Math. 67, 125–36 (2009)
6. Robertson, W.M.: Grain-boundary grooving by surface diffusion for finite slopes. J. Appl. Phys. 42, 463–467 (1971)
7. Hristov, J.: Multiple integral-balance method: basic idea and an example with Mullins' model of thermal grooving. Therm. Sci. 21, 1555–1560 (2017)
8. Abu Hamed, M., Nepomnyashchy, A.A.: Groove growth by surface subdiffusion. Physica D: Nonlinear Phenom. 298–299, 42–47 (2015)
9. Hristov, J.: Fourth-order fractional diffusion model of thermal grooving: integral approach to approximate closed form solution of the Mullins model. Math. Model Natur. Phenom. 13, 1–6 (2018)
10. Goodman, T.R.: The heat balance integral and its application to problems involving a change of phase. Trans. ASME 80, 335–342 (1958)
11. Hristov, J.: The heat-balance integral method by a parabolic profile with unspecified exponent: analysis and benchmark exercises. Therm. Sci. 13, 27–48 (2009)
12. Hristov, J.: Integral solutions to transient nonlinear heat (mass) diffusion with a power-law diffusivity: a semi-infinite medium with fixed boundary conditions. Heat Mass Transf. 52, 635–655 (2016)
13. Hristov, J.: Double integral-balance method to the fractional subdiffusion equation: approximate solutions, optimization problems to be resolved and numerical simulations. J. Vib. Control 23, 2795–2818 (2017)
14. Mitchell, S.L., Myers, T.G.: Application of standard and refined heat balance integral methods to one-dimensional Stefan problems. SIAM Rev. 52, 57–86 (2010)
15. Myers, J.G.: Optimizing the exponent in the heat balance and refined integral methods. Int. Commun. Heat Mass Transf. 36, 143–147 (2009)
16. Sahu, S.K., Das, P.K., Bhattacharyya, S.: A comprehensive analysis of conduction-controlled rewetting by the heat balance integral method. Int. J. Heat Mass Transf. 49, 4978–4986 (2006)
17. Hristov, J.: Approximate solutions to time-fractional models by integral balance approach, Chapter 5. In: Cattani, C., Srivastava, H.M., Yang, X.-J. (eds.) Fractional Dynamics, pp. 78–109. De Gruyter Open (2015)
18. Hristov, J.: Integral-balance solution to nonlinear subdiffusion equation, Chapter 3. In: Bhalekar, S. (ed.) Frontiers in Fractional Calculus, pp. 57–88. Bentham Science Publishers (2017)
19. Hristov, J.: Subdiffusion model with time-dependent diffusion coefficient: integral-balance solution and analysis. Therm. Sci. 21, 69–80 (2017)
20. Volkov, V.N., Li-Orlov, V.K.: A refinement of the integral method in solving the heat conduction equation. Heat Transf. Sov. Res. 2, 41–47 (1970)
21. Fujita, H.: The exact pattern of a concentration-dependent diffusion in a semi-infinite medium, Part II. Text. Res. J. 22, 823–827 (1952)
22. Fujita, H.: The exact pattern of a concentration-dependent diffusion in a semi-infinite medium, Part 1. Text. Res. J. 22, 757–760 (1952)
23. Fujita, H.: The exact pattern of a concentration-dependent diffusion in a semi-infinite medium, Part III. Text. Res. J. 24, 234–240 (1954)
Chapter 6
Optimal Control of Rigidity Parameter
of Elastic Inclusions in Composite Plate
with a Crack

Nyurgun Lazarev and Natalia Neustroeva

Abstract Equilibrium problems for a family of composite plates with a crack
passing along the boundary of an elastic inclusion are considered. We assume that the
Signorini-type condition for nonpenetration of the opposite crack faces is fulfilled.
It is shown that there exists a solution of the optimal control problem with the cost
functional given with the help of an arbitrary continuous functional in the solution
space.

Keywords Timoshenko plate · Rigid inclusion · Crack · Nonpenetration conditions ·
Variational inequality · Derivative of energy functional · Shape control

1 Introduction

It is well known that the difference between the coefficients of thermal expansion
and the elastic moduli of heterogeneous materials often leads to the initiation of cracks
(delamination) and ruptures at the boundary interface of different materials. In this
regard, it is important to analyze high-level mathematical models of elastic bodies
with delaminated inclusions and to investigate dependence of solutions on the vari-
ation of physical parameters of inclusions. We consider two types of inclusions: For
the first type, we have inclusions which are described by the Timoshenko model, and
the second type of inclusions corresponds to the Kirchhoff–Love model. Optimal
control problem considered in this work consists in finding the best rigidity parame-
ter of an elastic inclusion. The cost functional is defined with the help of an arbitrary
continuous functional in the solution space.
The main difficulty in studying this problem is due to the presence of nonlinear
boundary conditions of inequality type. Since the beginning of the 1990s, crack theory
with nonpenetration conditions at the crack faces has been under active study (see,
e.g., [3–7, 12, 16, 17]). Some of these works are devoted to the investigation of
various nonlinear mathematical models of crack theory. We refer the reader to [7, 15,
19, 21] for results concerning the shape sensitivity analysis of nonlinear problems in
domains with cuts. The fictitious domain and smooth domain methods were proposed
in [1, 2]. Invariant integrals in the framework of nonlinear elasticity problems with
Signorini-type conditions were constructed in [2, 7, 24]. Problems concerning
equilibrium models for elastic bodies with rigid inclusions [7–11, 13, 17, 18, 22, 23]
or elastic inclusions [14] have also been studied. It is worth mentioning that these
problems belong to the class of free boundary value problems.

N. Lazarev (B) · N. Neustroeva
North-Eastern Federal University, Yakutsk 677891, Russia
e-mail: [email protected]
N. Lazarev
Lavrentyev Institute of Hydrodynamics SB RAS, Novosibirsk 630090, Russia
© Springer Nature Singapore Pte Ltd. 2018
D. Ghosh et al. (eds.), Mathematics and Computing, Springer Proceedings
in Mathematics & Statistics 253, https://doi.org/10.1007/978-981-13-2095-8_6

2 Equilibrium Problems

We formulate two types of variational problems. Both types of problems
are formulated with respect to identical geometrical objects. Let us consider a
bounded domain $\Omega \subset \mathbb{R}^2$ with a boundary $\Gamma \in C^{0,1}$. Let a subdomain ω be strictly
contained in Ω, i.e., $\omega \cap \Gamma = \emptyset$, and let the boundary ∂ω be sufficiently smooth.
Assume that ∂ω consists of two disjoint curves $\gamma_c$ and $\partial\omega \setminus \gamma_c$, with $\mathrm{meas}(\partial\omega \setminus \gamma_c) > 0$.
The outward pointing unit normal to ∂ω is denoted by $\nu = (\nu_1, \nu_2)$.
We require that the curve $\gamma_c$ can be extended up to the outer boundary Γ in such
a way that Ω is divided into two subdomains $\Omega_1$, $\Omega_2$ with Lipschitz boundaries.
The latter condition is sufficient for the Korn and Poincaré inequalities to hold in the
domain $\Omega_c = \Omega \setminus \bar{\gamma}_c$ [7].
For simplicity, suppose the plate has a uniform thickness 2h = 2. Let us assign
a three-dimensional Cartesian space {x1 , x2 , z} with the set {Ωc } × {0} ⊂ R3 corre-
sponding to the middle plane of the plate. The curve γc defines a crack (a cut) in
the plate. This means that the cylindrical surface of the crack may be defined by
the relations x = (x1 , x2 ) ∈ γc , −1 ≤ z ≤ 1 where |z| is the distance to the mid-
dle plane. Following our arguments, an elastic inclusion is specified by the set
ω × [−1, 1]; i.e., the boundary of the elastic inclusion is defined by the cylindri-
cal surface ∂ω × [−1, 1]. An unaltered part of the plate corresponds to the domain
Ωc \ ω.
Denote by χ = (W, w) the displacement vector of the mid-surface points (x ∈
Ωc ), by W = (w1 , w2 ) the displacements in the plane {x1 , x2 }, and by w the dis-
placements along the axis z. The angles of rotation of a normal fiber are denoted by
ψ = ψ(x) = (ψ1 , ψ2 ), (x ∈ Ωc ).
In accordance with the direction of the outer normal ν to ∂ω, it is possible to
speak about a positive face ∂ω+ and a negative face ∂ω− of the curve ∂ω. If the trace
of a function v is chosen on the positive (from the side of the domain Ω \ ω) face
∂ω+ , we use the notation v + = v| ∂ω+ , and if it is chosen on the negative face, then
v − = v| ∂ω− . In addition, the jump [v] of the function v on the curve γc can be found
by the formula [v] = v| γc + − v| γc − .

Assume that deformation of the unaltered part (which corresponds to the set
Ωc \ ω) is described by the Timoshenko model. The corresponding formulas for
strains and other mechanical values have the form [20]:

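$$\varepsilon_{ij}(\psi) = \frac{1}{2}\left(\frac{\partial\psi_j}{\partial x_i} + \frac{\partial\psi_i}{\partial x_j}\right), \qquad \varepsilon_{ij}(W) = \frac{1}{2}\left(\frac{\partial w_j}{\partial x_i} + \frac{\partial w_i}{\partial x_j}\right). \tag{1}$$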

The tensors of moments $m(\psi) = \{m_{ij}(\psi)\}$ and stresses $\sigma(W) = \{\sigma_{ij}(W)\}$ are
expressed by the formulas (summation is performed over repeated indices)

$$m_{ij}(\psi) = b_{ijkl}\,\varepsilon_{kl}(\psi), \qquad \sigma_{ij}(W) = 3b_{ijkl}\,\varepsilon_{kl}(W), \qquad i, j, k, l = 1, 2, \tag{2}$$

with nonzero components of the elasticity tensor $B = \{b_{ijkl}\}$ specified by the relations

$$b_{iiii} = D, \quad b_{iijj} = D\kappa, \quad b_{ijij} = b_{ijji} = D(1 - \kappa)/2, \quad i \neq j, \quad i, j = 1, 2, \tag{3}$$
where D and κ are constants: D is the cylindrical rigidity of the plate and κ is the
Poisson ratio, 0 < κ < 1/2. The transverse forces in the Timoshenko-type model
are defined by the expressions

$$q_i(w, \psi) = L(w_{,i} + \psi_i), \quad i = 1, 2, \qquad v_{,i} = \frac{\partial v}{\partial x_i},$$

where L > 0 is a constant coefficient describing the elastic characteristics of the plate
with respect to transverse shear [20].
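
The relations (1)–(3) and the transverse forces can be evaluated pointwise as in the following minimal sketch. The function names, the nested-list data layout and the zero-based indices (0, 1 in place of 1, 2) are assumptions made for illustration; the gradient values would come from whatever discretization is used.

```python
def symmetric_strain(grad):
    """Strain (1): eps[i][j] = (grad[i][j] + grad[j][i]) / 2.

    grad[i][j] is assumed to hold the derivative of component i of the field
    with respect to x_j at a single point.
    """
    return [[0.5 * (grad[i][j] + grad[j][i]) for j in range(2)] for i in range(2)]


def elasticity_tensor(d, kappa):
    """Nonzero components (3) of b_ijkl, with indices 0 and 1."""
    b = [[[[0.0] * 2 for _ in range(2)] for _ in range(2)] for _ in range(2)]
    for i in range(2):
        j = 1 - i  # the other index, so that i != j
        b[i][i][i][i] = d
        b[i][i][j][j] = d * kappa
        b[i][j][i][j] = d * (1.0 - kappa) / 2.0
        b[i][j][j][i] = d * (1.0 - kappa) / 2.0
    return b


def moments(b, eps):
    """Moments (2): m_ij = b_ijkl * eps_kl, summed over k and l."""
    return [[sum(b[i][j][k][l] * eps[k][l] for k in range(2) for l in range(2))
             for j in range(2)] for i in range(2)]


def transverse_forces(shear_l, grad_w, psi):
    """Transverse forces q_i = L * (w_,i + psi_i)."""
    return [shear_l * (grad_w[i] + psi[i]) for i in range(2)]


if __name__ == "__main__":
    eps = symmetric_strain([[1.0, 2.0], [0.0, 3.0]])
    print(moments(elasticity_tensor(2.0, 0.25), eps))
    print(transverse_forces(2.0, [1.0, -1.0], [0.5, 1.0]))
```

The stresses in (2) follow from the same contraction multiplied by 3, so `moments` can be reused for them with the in-plane strain as input.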
Next, we describe the mathematical models corresponding to the elastic inclusion,
which occupies the domain ω. There are two types of inclusions. For the first type,
we have the same relations (1)–(3) with some other constant coefficients D, κ,
$B = \{b_{ijkl}\}$. For the transverse forces, we accept the formulas

$$q_i(w, \psi) = \frac{L}{\lambda}(w_{,i} + \psi_i), \quad i = 1, 2,$$

where L > 0 is a constant value, and λ ∈ (0, 1].

The second type of elastic inclusion is described by the Kirchhoff–Love model,
so that the following relations are fulfilled in the domain ω:

$$m_{ij} = -b_{ijkl}\,w_{,kl}, \qquad \sigma_{ij}(W) = 3b_{ijkl}\,\varepsilon_{kl}(W), \qquad i, j, k, l = 1, 2.$$

As the next step, we formulate the corresponding variational problems. For
the first type of inclusions, we formulate a family of variational problems. In order
to define a potential energy functional, we introduce bilinear forms B(Q, ·, ·), b(Q, ·, ·)
determined by the equalities

$$B(Q, \eta, \bar{\eta}) = \int_Q \left\{\sigma_{ij}(W)\,\varepsilon_{ij}(\bar{W}) + m_{ij}(\psi)\,\varepsilon_{ij}(\bar{\psi})\right\},$$

$$b(Q, \eta, \bar{\eta}) = \int_Q (w_{,i} + \psi_i)(\bar{w}_{,i} + \bar{\psi}_i),$$

where $Q \subset \Omega_c$, $\eta = (W, w, \psi)$, $\bar{\eta} = (\bar{W}, \bar{w}, \bar{\psi})$.

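The potential energy functional of the plate has the following representation [20]:

$$\Pi^{\lambda}(\Omega_c, \eta) = \frac{1}{2}B(\Omega_c, \eta, \eta) + \frac{1}{2}\Lambda(\lambda)\,b(\Omega_c, \eta, \eta) - \int_{\Omega_c} F\eta, \qquad \eta = (W, w, \psi),$$

where the vector $F = (f_1, f_2, f_3, f_4, f_5) \in L^2(\Omega_c)^5$ describes the body forces [20],

$$\Lambda(\lambda) = \begin{cases} L, & x \in \Omega_c \setminus \omega, \\ L/\lambda, & x \in \omega. \end{cases}$$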

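In what follows, we suppose that $f_4 = f_5 = 0$. Introduce the Sobolev spaces

$$H^{1,0}(\Omega_c) = \left\{ v \in H^1(\Omega_c) \mid v = 0 \text{ on } \Gamma \right\}, \qquad H(\Omega_c) = H^{1,0}(\Omega_c)^5.$$

Note that the following inequality holds (with some fixed value λ)

$$B(\Omega_c, \eta, \eta) + \Lambda(\lambda)\,b(\Omega_c, \eta, \eta) \ge c\,\|\eta\|^2_{H(\Omega_c)} \quad \forall \eta \in H(\Omega_c), \tag{4}$$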

where the constant c > 0 is independent of η [16]. This estimate ensures that the
bilinear form B(Ωc , η, η) defines a norm equivalent to the standard norm on H (Ωc ).
The condition of mutual nonpenetration of opposite faces of the crack is given by

[W ]ν ≥ |[ψ]ν| on γc . (5)

The derivation and justification of the condition (5) can be found in [16]. Introduce
the set of admissible functions

K 1 = { η = (W, w, ψ) ∈ H (Ωc ) | [W ]ν ≥ |[ψ]ν| on γc }.
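
The inequality defining $K_1$ can be checked pointwise along the crack as in the following hedged sketch. The function names, the representation of the traces as two-component lists and the optional tolerance are assumptions made for illustration; in practice the traces on the two crack faces would be supplied by a discretization.

```python
def jump(plus, minus):
    """Jump [v] = v+ - v- of a two-component field across the crack."""
    return [p - q for p, q in zip(plus, minus)]


def dot(u, v):
    """Euclidean scalar product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def satisfies_nonpenetration(w_plus, w_minus, psi_plus, psi_minus, nu, tol=0.0):
    """Pointwise check of (5): [W].nu >= |[psi].nu| at one point of the crack."""
    opening = dot(jump(w_plus, w_minus), nu)
    rotation = abs(dot(jump(psi_plus, psi_minus), nu))
    return opening + tol >= rotation


if __name__ == "__main__":
    nu = [1.0, 0.0]  # unit normal at the sample point (illustrative)
    print(satisfies_nonpenetration([0.2, 0.0], [0.0, 0.0], [0.1, 0.0], [0.0, 0.0], nu))
    print(satisfies_nonpenetration([0.05, 0.0], [0.0, 0.0], [0.1, 0.0], [0.0, 0.0], nu))
```

A state with zero jumps (a closed crack) satisfies the condition with equality, so the admissible set always contains the zero function.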

Now, we can formulate a family of the equilibrium problems for the plate with a
crack on the boundary of the elastic inclusion. We fix the parameter λ ∈ (0, 1] and
set the minimization problem
$$\inf_{\eta \in K_1} \Pi^{\lambda}(\Omega_c, \eta). \tag{6}$$

Using the same reasoning as in the paper [16], it is possible to prove the existence of
a unique solution ξ λ to the problem (6). Besides, it can be shown that the problem
(6) is equivalent to the following variational inequality [16]

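$$\xi^{\lambda} \in K_1, \quad B(\Omega_c, \xi^{\lambda}, \eta - \xi^{\lambda}) + \Lambda(\lambda)\, b(\Omega_c, \xi^{\lambda}, \eta - \xi^{\lambda}) \ge \int_{\Omega_c} F(\eta - \xi^{\lambda}) \quad \forall \eta \in K_1. \tag{7}$$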

Next, let us formulate a variational problem for a plate with an inclusion of the second
type, i.e., when the deformation of the elastic inclusion is described by the Kirchhoff–Love
model. We start by introducing the conditions that express the Kirchhoff–Love
hypothesis of straight-line normals:

w,i +ψi = 0 in ω, i = 1, 2.

Therefore, the set of admissible functions K 2 is defined by the following relation

K 2 = { η = (W, w, ψ) ∈ H (Ωc ) | w,i +ψi = 0 in ω, i = 1, 2; [W ]ν ≥ |[ψ]ν| on γc }

For this case, we can represent the potential energy of the plate in the following
form:
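$$\Pi^{K}(\eta) = \frac{1}{2} B(\Omega_c, \eta, \eta) + \frac{L}{2}\, b(\Omega_c \setminus \omega, \eta, \eta) - \int_{\Omega_c} F\eta.$$

The variational setting of the problem is as follows: in the domain $\Omega_c$, find a function $\xi^K \in K_2$ such that

$$\Pi^{K}(\xi^{K}) = \min_{\eta \in K_2} \Pi^{K}(\eta). \tag{8}$$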

As we have to find the solution ξ K in the space H (Ωc ), we assume that the gluing
conditions are satisfied on the interface between the media outside the crack:

[ξ K ] = (0, 0, 0) on ∂ω \ γc .

The convexity, weak lower semicontinuity, and coercivity of the functional $\Pi^K(\eta)$ in
the space $H(\Omega_c)$ can be established in the same way as in [16]. These properties of
$\Pi^K(\eta)$, together with the convexity and closedness of the set $K_2$, guarantee the existence and
uniqueness of the solution $\xi^K = (U^K, u^K, \phi^K)$ of problem (8). Moreover, problem
(8) is equivalent to the following variational inequality:

$$\xi^K = (U^K, u^K, \phi^K) \in K_2, \quad B(\Omega_c, \xi^K, \eta - \xi^K) + L\, b(\Omega_c \setminus \omega, \xi^K, \eta - \xi^K) \ge \int_{\Omega_c} F(\eta - \xi^K) \quad \forall\, \eta = (W, w, \psi) \in K_2. \tag{9}$$

3 Optimal Control Problem

In this section, we prove the main result of the paper, which establishes the existence of
an optimal rigidity parameter $\lambda^* \in [0, 1]$ for the elastic inclusion. Here, the limiting
case $\lambda = 0$ corresponds to an inclusion with infinite shear rigidity, i.e., a Kirchhoff–Love
inclusion. We define the cost functional $J : [0, 1] \to \mathbb{R}$ of the optimal control
problem by the relation

$$J(\lambda) = \begin{cases} G(\xi^{\lambda}), & \lambda \in (0, 1], \\ G(\xi^{K}), & \lambda = 0, \end{cases}$$

where $G : H(\Omega_c) \to \mathbb{R}$ is an arbitrary continuous functional.

As examples of such functionals with a physical meaning, we can give the following.
The functional $G_1(\eta) = \int_{\gamma_c} |[\chi]|$, where $\eta = (W, w, \psi)$ and $\chi = (W, w)$, characterizes
the opening of the crack. The functional $G_2(\eta) = \|\eta - \eta_0\|_{H(\Omega_c)}$ characterizes
the deviation of the displacement vector from a given function $\eta_0$. Consider
the optimal control problem:

$$\text{Find } \lambda^* \in [0, 1] \text{ such that } J(\lambda^*) = \sup_{\lambda \in [0, 1]} J(\lambda). \tag{10}$$

Theorem 1 There exists a solution of the optimal control problem (10).

Proof Consider a maximizing sequence $\lambda_n \in [0, 1]$. Passing to a subsequence if necessary, we may assume that it converges. We can exclude the trivial situation in which $\lambda_n = \hat\lambda$ for all
$n > n_0$. Therefore, we have to deal with the following two cases:

1. $\lambda_n \to \alpha$, $\lambda_n \in (0, 1]$, $\alpha \in (0, 1]$,

2. $\lambda_n \to 0$, $\lambda_n \in (0, 1)$.

We start with the first case. For each fixed $\lambda_n$, there exists a solution $\xi^n = \xi^{\lambda_n}$,
$n = 1, 2, \ldots$, of the variational inequality (7) with $\lambda = \lambda_n$, i.e.,

$$\xi^n \in K_1, \quad B(\Omega_c, \xi^n, \eta - \xi^n) + \Lambda(\lambda_n)\, b(\Omega_c, \xi^n, \eta - \xi^n) \ge \int_{\Omega_c} F(\eta - \xi^n) \quad \forall \eta \in K_1. \tag{11}$$

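By substituting $\eta = 2\xi^{\lambda}$ and $\eta = 0$ into the variational inequality (7), we get

$$B(\Omega_c, \xi^{\lambda}, \xi^{\lambda}) + \Lambda(\lambda)\, b(\Omega_c, \xi^{\lambda}, \xi^{\lambda}) = \int_{\Omega_c} F\xi^{\lambda}. \tag{12}$$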
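
The step from (7) to (12) can be spelled out as follows. This is only a sketch; it assumes what the text uses implicitly, namely that $K_1$ is a convex cone, so that both $\eta = 0$ and $\eta = 2\xi^{\lambda}$ are admissible test functions. The shorthand $a_\lambda$ is introduced here for brevity and is not notation from the paper.

```latex
% Shorthand (ours): a_\lambda(\xi,\eta) := B(\Omega_c,\xi,\eta) + \Lambda(\lambda)\, b(\Omega_c,\xi,\eta).
\begin{align*}
\eta = 2\xi^{\lambda} \text{ in (7)}:\quad
  & a_\lambda(\xi^{\lambda},\xi^{\lambda}) \ \ge\ \int_{\Omega_c} F\xi^{\lambda},\\
\eta = 0 \text{ in (7)}:\quad
  & -a_\lambda(\xi^{\lambda},\xi^{\lambda}) \ \ge\ -\int_{\Omega_c} F\xi^{\lambda}
  \iff a_\lambda(\xi^{\lambda},\xi^{\lambda}) \ \le\ \int_{\Omega_c} F\xi^{\lambda}.
\end{align*}
% Combining the two inequalities gives the equality (12).
```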

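Taking into account the inequality (4) with $\lambda = 1$ and the relation $\Lambda(1) \le \Lambda(\lambda)$ for $\lambda \in (0, 1]$, we can derive from the last equality that

$$c\, \|\xi^{\lambda}\|^2_{H(\Omega_c)} \le B(\Omega_c, \xi^{\lambda}, \xi^{\lambda}) + \Lambda(1)\, b(\Omega_c, \xi^{\lambda}, \xi^{\lambda}) \le B(\Omega_c, \xi^{\lambda}, \xi^{\lambda}) + \Lambda(\lambda)\, b(\Omega_c, \xi^{\lambda}, \xi^{\lambda}) = \int_{\Omega_c} F\xi^{\lambda}.$$

From this, we get the following uniform estimate:

$$\|\xi^{n}\|_{H(\Omega_c)} \le C. \tag{13}$$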

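Choosing a subsequence, if necessary, we can assume that as $n \to \infty$

$$\xi^n \to \tilde\xi \ \text{ weakly in } H(\Omega_c), \qquad \frac{u^n_{,i} + \phi^n_i}{\lambda_n} \to \frac{\tilde u_{,i} + \tilde\phi_i}{\alpha} \ \text{ weakly in } L^2(\omega), \tag{14}$$

$$(\lambda_n)^{-1/2} (u^n_{,i} + \phi^n_i) \to \alpha^{-1/2} (\tilde u_{,i} + \tilde\phi_i) \ \text{ weakly in } L^2(\omega). \tag{15}$$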

Using the strong convergence of U n → Ũ , φ n → φ̃ in H 1 (Ωc )2 as n → ∞, it can

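be easily shown that $\tilde\xi \in K_1$. In view of (14), (15), we pass to the limit as $n \to \infty$
in (11), which yields

$$\tilde\xi \in K_1, \quad B(\Omega_c, \tilde\xi, \eta - \tilde\xi) + \Lambda(\alpha)\, b(\Omega_c, \tilde\xi, \eta - \tilde\xi) \ge \int_{\Omega_c} F(\eta - \tilde\xi) \quad \forall \eta \in K_1.$$

Since $\eta \in K_1$ is arbitrary, the last relation is a variational inequality of the form (7) with $\lambda = \alpha$, and by the uniqueness of its solution, $\tilde\xi = \xi^{\alpha}$. Now, we prove that $\xi^n \to \xi^{\alpha}$ strongly in $H(\Omega_c)$. The weak convergence
$\xi^n \to \xi^{\alpha}$ as $n \to \infty$ implies that

$$\lim_{n \to \infty} \int_{\Omega_c} F\xi^n = \int_{\Omega_c} F\xi^{\alpha}.$$

Consequently, the limit of the left-hand side of (12), written for $\lambda = \lambda_n$, exists and is equal to

$$\lim_{n \to \infty} \big( B(\Omega_c, \xi^n, \xi^n) + \Lambda(\lambda_n)\, b(\Omega_c, \xi^n, \xi^n) \big) = \lim_{n \to \infty} \big( B(\Omega_c, \xi^n, \xi^n) + \Lambda(\alpha)\, b(\Omega_c, \xi^n, \xi^n) \big) = \lim_{n \to \infty} \int_{\Omega_c} F\xi^n = \int_{\Omega_c} F\xi^{\alpha}.$$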

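On the other hand, by (12) with $\lambda = \alpha$, we derive

$$\lim_{n \to \infty} \big( B(\Omega_c, \xi^n, \xi^n) + \Lambda(\alpha)\, b(\Omega_c, \xi^n, \xi^n) \big) = B(\Omega_c, \xi^{\alpha}, \xi^{\alpha}) + \Lambda(\alpha)\, b(\Omega_c, \xi^{\alpha}, \xi^{\alpha}). \tag{16}$$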

We should recall that by the estimation (4), the bilinear form

B(Ωc , ·, ·) + Λ(α)b(Ωc , ·, ·)

determines an equivalent norm in the space H (Ωc ). This fact and the relation (16)
allow us to obtain as n → ∞

$$\|\xi^n\|_{H(\Omega_c)} \to \|\xi^{\alpha}\|_{H(\Omega_c)}.$$

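Next, since $H(\Omega_c)$ is a Hilbert space, this convergence of norms together with the weak convergence $\xi^n \to \xi^{\alpha}$ in
$H(\Omega_c)$ yields the desired strong convergence $\xi^n \to \xi^{\alpha}$ in $H(\Omega_c)$. Thus, by the continuity of $G$, we have the
relations

$$\sup_{\lambda \in [0, 1]} J(\lambda) = \lim_{n \to \infty} J(\lambda_n) = \lim_{n \to \infty} G(\xi^n) = G(\xi^{\alpha}) = J(\alpha),$$

which prove the statement for the first case.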

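Let us consider the second case, in which the maximizing sequence $\lambda_n$
converges to 0. As before, from (13), we can conclude that there is a subsequence
(for which we retain the notation) such that $\xi^n$ converges weakly in $H(\Omega_c)$ to some $\tilde\xi$. Next, we can
represent (12) in the following form

$$B(\Omega_c, \xi^{\lambda}, \xi^{\lambda}) + L\, b(\Omega_c \setminus \omega, \xi^{\lambda}, \xi^{\lambda}) + \frac{L}{\lambda}\, b(\omega, \xi^{\lambda}, \xi^{\lambda}) = \int_{\Omega_c} F\xi^{\lambda}$$

and conclude that

$$\|u^n_{,i} + \phi^n_i\|^2_{L^2(\omega)} \le C\lambda_n, \qquad \tilde u_{,i} + \tilde\phi_i = 0 \ \text{ a.e. in } \omega \tag{17}$$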

with some positive constant C. Therefore, in view of the last equation in (17), we
can obtain that ξ̃ ∈ K 2 . Now, we can substitute some fixed element η ∈ K 2 as the
test function in (11) and pass to the limit as $n \to \infty$. As a result, we arrive at the
relation

$$\tilde\xi \in K_2, \quad B(\Omega_c, \tilde\xi, \eta - \tilde\xi) + L\, b(\Omega_c \setminus \omega, \tilde\xi, \eta - \tilde\xi) \ge \int_{\Omega_c} F(\eta - \tilde\xi) \quad \forall \eta \in K_2.$$

Since the test function $\eta \in K_2$ is arbitrary, the last relation coincides with the variational inequality (9),
and by the uniqueness of its solution, $\tilde\xi = \xi^K$.
In the next step, we prove the strong convergence $\xi^n \to \xi^K$ as $n \to \infty$. To this
end, we rewrite (12) for the parameters $\lambda_n$ as follows:

$$B(\Omega_c, \xi^n, \xi^n) + L\, b(\Omega_c \setminus \omega, \xi^n, \xi^n) + \frac{L}{\lambda_n}\, b(\omega, \xi^n, \xi^n) = \int_{\Omega_c} F\xi^n.$$

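From this, using the weak lower semicontinuity of the bilinear forms $B(\Omega_c, \cdot, \cdot)$,
$b(\Omega_c, \cdot, \cdot)$, we can deduce

$$\limsup_{n \to \infty} \frac{L}{\lambda_n}\, b(\omega, \xi^n, \xi^n) \le \limsup_{n \to \infty} \Big( -B(\Omega_c, \xi^n, \xi^n) - L\, b(\Omega_c \setminus \omega, \xi^n, \xi^n) + \int_{\Omega_c} F\xi^n \Big) \le -B(\Omega_c, \xi^K, \xi^K) - L\, b(\Omega_c \setminus \omega, \xi^K, \xi^K) + \int_{\Omega_c} F\xi^K = 0. \tag{18}$$

The last equality in (18) follows from the identity

$$B(\Omega_c, \xi^K, \xi^K) + L\, b(\Omega_c \setminus \omega, \xi^K, \xi^K) = \int_{\Omega_c} F\xi^K, \tag{19}$$

which can be obtained from the variational inequality (9) by substituting $\eta = 0$ and
$\eta = 2\xi^K$. Therefore, we get

$$\limsup_{n \to \infty} \frac{L}{\lambda_n}\, b(\omega, \xi^n, \xi^n) = \lim_{n \to \infty} L\, b(\omega, \xi^n, \xi^n) = 0.$$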

Consequently, we have

$$\lim_{n \to \infty} \big( B(\Omega_c, \xi^n, \xi^n) + L\, b(\Omega_c \setminus \omega, \xi^n, \xi^n) + L\, b(\omega, \xi^n, \xi^n) \big) = \lim_{n \to \infty} \Big( \int_{\Omega_c} F\xi^n - \frac{L}{\lambda_n}\, b(\omega, \xi^n, \xi^n) + L\, b(\omega, \xi^n, \xi^n) \Big) = \int_{\Omega_c} F\xi^K. \tag{20}$$

Finally, taking into account the identity $b(\omega, \xi^K, \xi^K) = 0$ and relations (19), (20),
we get

$$\lim_{n \to \infty} \big( B(\Omega_c, \xi^n, \xi^n) + L\, b(\Omega_c \setminus \omega, \xi^n, \xi^n) + L\, b(\omega, \xi^n, \xi^n) \big) = \int_{\Omega_c} F\xi^K = B(\Omega_c, \xi^K, \xi^K) + L\, b(\Omega_c \setminus \omega, \xi^K, \xi^K) + L\, b(\omega, \xi^K, \xi^K).$$

This means that we have the convergence of norms

$$\|\xi^n\|_{H(\Omega_c)} \to \|\xi^K\|_{H(\Omega_c)}$$

as $n \to \infty$, which together with the weak convergence $\xi^n \to \xi^K$ in $H(\Omega_c)$ provides
the desired strong convergence $\xi^n \to \xi^K$ as $n \to \infty$ in $H(\Omega_c)$. Finally, we have the
relations

$$\sup_{\lambda \in [0, 1]} J(\lambda) = \lim_{n \to \infty} J(\lambda_n) = \lim_{n \to \infty} G(\xi^n) = G(\xi^K) = J(0).$$

Thus, we have established the existence of a solution of (10) in all possible cases.
The theorem is proved.

References

1. Alekseev, G.V., Khludnev, A.M.: Crack in elastic body crossing the external boundary at zero
angle. Vestnik Q. J. Novosibirsk State Univ. Ser.: Math. Mech. Inform. 9(2), 15–29 (2009)
2. Andersson, L.-E., Khludnev, A.M.: On crack crossing an external boundary. Fictitious domain
method and invariant integrals. Sib. J. Ind. Math. 11(3), 15–29 (2008)
3. Hömberg, D., Khludnev, A.M.: On safe crack shapes in elastic bodies. Eur. J. Mech. A/Solids
21(6), 991–998 (2002)
4. Itou, H., Khludnev, A.M.: On delaminated thin Timoshenko inclusions inside elastic bodies.
Math. Methods Appl. Sci. 39(17), 4980–4993 (2016)
5. Itou, H., Khludnev, A.M., Rudoy, E.M., Tani, A.: Asymptotic behaviour at a tip of a rigid line
inclusion in linearized elasticity. Z. Angew. Math. Mech. 92, 716–730 (2012)
6. Itou, H., Kovtunenko, V.A., Tani, A.: The interface crack with Coulomb friction between two
bonded dissimilar elastic media. Appl. Math. 56(1), 69–97 (2011)
7. Khludnev, A.M.: Elasticity Problems in Nonsmooth Domains. Fizmatlit, Moscow (2010)
8. Khludnev, A.M.: Problem of a crack on the boundary of a rigid inclusion in an elastic plate.
Mech. Solids 45(5), 733–742 (2010)
9. Khludnev, A.M.: Optimal control of crack growth in elastic body with inclusions. Eur. J. Mech.
A/Solids. 29(3), 392–399 (2010)
10. Khludnev, A.M.: Thin rigid inclusions with delaminations in elastic plates. Eur. J. Mech.
A/Solids. 32(1), 69–75 (2012)
11. Khludnev, A.M.: Shape control of thin rigid inclusions and cracks in elastic bodies. Arch. Appl.
Mech. 83(10), 1493–1509 (2013)
12. Khludnev, A.M., Kovtunenko, V.A.: Analysis of Cracks in Solids. WIT Press, Southampton-
Boston (2000)
13. Khludnev, A.M., Negri, M.: Optimal rigid inclusion shapes in elastic bodies with cracks. Z.
Angew. Math. Phys. 64(1), 179–191 (2013)
14. Khludnev, A.M., Popova, T.S.: Junction problem for Euler-Bernoulli and Timoshenko elastic
inclusions in elastic bodies. Q. Appl. Math. 74(4), 705–718 (2016)
15. Kovtunenko, V.A.: Primal-dual methods of shape sensitivity analysis for curvilinear cracks
with nonpenetration. IMA J. Appl. Math. 71, 635–657 (2006)
16. Lazarev, N.P.: An iterative penalty method for a nonlinear problem of equilibrium of a
Timoshenko-type plate with a crack. Num. Anal. Appl. 4(4), 309–318 (2011)
17. Lazarev, N.P.: An equilibrium problem for the Timoshenko-type plate containing a crack on
the boundary of a rigid inclusion. J. Siberian Fed. Univ. Math. Phys. 6(1), 53–62 (2013)
18. Lazarev, N.P.: Optimal control of the thickness of a rigid inclusion in equilibrium problems for
inhomogeneous two-dimensional bodies with a crack. Z. Angew. Math. Mech. 96(4), 509–518
(2016)
19. Lazarev, N.P., Rudoy, E.M.: Shape sensitivity analysis of Timoshenko’s plate with a crack
under the nonpenetration condition. Z. Angew. Math. Mech. 94(9), 730–739 (2014)
20. Pelekh, B.L.: Theory of Shells with Finite Shear Modulus. Nauk. Dumka, Kiev (1973)
21. Rudoy, E.M.: Shape derivative of the energy functional in a problem for a thin rigid inclusion
in an elastic body. Z. Angew. Math. Phys. 66(4), 1923–1937 (2014)
22. Shcherbakov, V.V.: On an optimal control problem for the shape of thin inclusions in elastic
bodies. J. Appl. Ind. Math. 7(3), 435–443 (2013)

23. Shcherbakov, V.V.: Existence of an optimal shape of the thin rigid inclusions in the Kirchhoff-
Love plate. J. Appl. Indust. Math. 8(1), 97–105 (2014)
24. Shcherbakov, V.V.: The Griffith formula and J-integral for elastic bodies with Timoshenko
inclusions. Z. Angew. Math. Mech. 96(11), 1306–1317 (2016)
Chapter 7
Convergence of Generalized Mann Type
of Iterates to Common Fixed Point

T. Som, Amalendu Choudhury, D. R. Sahu and Ajeet Kumar

Abstract The present paper deals with the convergence of two modified Mann type
of iteration schemes for a single and a finite family of mappings to the fixed and com-
mon fixed point, respectively, of a single and a finite family of quasi-nonexpansive
mappings on a uniformly convex Banach space. An example is added in support of
our main result. The results obtained generalize the earlier results of Rhoades (J Math
Anal Appl 56:741–750, [6]), Som et al. (Proc Nat Acad Sci (India) 70(A)(II):185–
189, [8]), and others in turn.

Keywords Quasi-nonexpansive map · Generalized Mann iterates · Convergence · Fixed point

1 Introduction

The present paper generalizes the Mann type of iteration scheme [4] for a single
mapping to two different iteration schemes, the first involving a single map and the
second a finite family of mappings. It then studies the convergence of these iteration
schemes for quasi-nonexpansive self-mappings of a convex subset of a uniformly
convex Banach space to the fixed point and the common fixed point, respectively,
which mainly generalizes a fixed point result of Rhoades [6].

T. Som (B)
Department of Mathematical Sciences, Indian Institute of Technology (BHU),
Varanasi 221005, India
A. Choudhury
Department of Mathematics and Statistics, Haflong Government College,
Haflong, Dima Hasao 788819, Assam, India
D. R. Sahu · A. Kumar
Department of Mathematics, Banaras Hindu University, Varanasi 221005, India

© Springer Nature Singapore Pte Ltd. 2018

Some preliminary definitions and earlier results of other authors noted in [1, 2,
4–7], together with an extension of the Mann type of iteration scheme defined for a
finite sequence of mappings, are the following:

Definition 1 A mapping $T$ from a Banach space $X$ into itself is said to be nonexpansive if $T$ satisfies

$$\|Tx - Ty\| \le \|x - y\| \quad \text{for all } x, y \in X.$$

In the setting of a Banach space, Dotson [1] introduced a new class of mappings,
called quasi-nonexpansive, in the following manner:

Definition 2 [1] Let $X$ be a Banach space, and let $C$ be a convex subset of $X$. A
self-mapping $T$ of $C$ is said to be quasi-nonexpansive provided that $T$ has a fixed point,
say $p$, in $C$ and

$$\|Tx - p\| \le \|x - p\|$$

holds for all $x \in C$.

Definition 3 The modulus of convexity of a Banach space $E$ is the function
$\delta : (0, 2] \to (0, 1]$ defined by

$$\delta(\epsilon) = \inf\big\{ 1 - \tfrac{1}{2}\|x + y\| : x, y \in E,\ \|x\| = \|y\| = 1,\ \|x - y\| \ge \epsilon \big\}.$$

It is well known [4] that if $E$ is uniformly convex, then $\delta$ is strictly increasing,
$\lim_{\epsilon \to 0} \delta(\epsilon) = 0$, and $\delta(2) = 1$. Let $\eta$ be the inverse of $\delta$; then we note that
$\eta(t) < 2$ for $t < 1$.

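Lemma 1 [2] Let $E$ be a uniformly convex Banach space and $B_r$ be the closed ball
in $E$ centered at the origin with radius $r > 0$. If $x_1, x_2, x_3 \in B_r$,

$$\|x_1 - x_2\| \ge \|x_2 - x_3\| \ge d > 0 \quad \text{and} \quad \|x_2\| \ge \Big(1 - \tfrac{1}{2}\,\delta\big(\tfrac{d}{r}\big)\Big) r,$$

then

$$\|x_1 - x_3\| \le \eta\Big(1 - \tfrac{1}{2}\,\delta\big(\tfrac{d}{r}\big)\Big) \|x_1 - x_2\|.$$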

Petryshyn and Williamson [5] proved the following result on the convergence of
iterates of a quasi-nonexpansive mapping.

Theorem 1 Let $C$ be a closed subset of a Banach space $X$ and $T : C \to X$ be a
quasi-nonexpansive mapping. Suppose there exists a point $x_0$ in $C$ such that $x_n =
T^n x_0 \in C$, $n \in \mathbb{N}$. Then, the sequence $\{x_n\}$ converges to a fixed point of $T$ in $C$ if and
only if $\lim_{n \to \infty} D(x_n, F(T)) = 0$, where $F(T)$ is the fixed point set of $T$.

Definition 4 [4] For a self-mapping $T$ of $C$ and $x_0 \in C$, the Mann type of iteration
is defined as

$$x_{n+1} = (1 - t_n) x_n + t_n T x_n, \tag{1}$$

where $t_n \in (\alpha, \beta)$, $0 < \alpha < \beta < 1$, $n = 0, 1, 2, \ldots$
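
The recursion (1) is straightforward to run numerically. The following is a minimal sketch for real-valued maps; the function name `mann_iterates` and the test map $T(x) = x/2$ are illustrative assumptions and do not come from the paper.

```python
from typing import Callable, List, Sequence


def mann_iterates(T: Callable[[float], float], x0: float,
                  t: Sequence[float]) -> List[float]:
    """Return [x_0, ..., x_len(t)] for x_{n+1} = (1 - t_n) x_n + t_n T(x_n)."""
    xs = [x0]
    for tn in t:
        x = xs[-1]
        xs.append((1.0 - tn) * x + tn * T(x))
    return xs


# Illustrative choice (not from the paper): T(x) = x / 2 on C = [0, 1], fixed point 0.
iterates = mann_iterates(lambda x: 0.5 * x, x0=1.0, t=[0.5] * 50)
```

With $t_n = 0.5$ and this particular $T$, each step multiplies the iterate by $0.75$, so the sequence decays geometrically towards the fixed point $0$.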

Theorem 2 [7] Let X be a uniformly convex Banach space, C a closed convex subset
of X , and T a quasi-nonexpansive mapping of C into itself. Let φ : [0, ∞) → [0, ∞)
be a nondecreasing function with φ(0) = 0 and φ(t) > 0 for t ∈ (0, ∞). If T satisfies

$$\|x - Tx\| > \phi\big(D(x, F(T))\big)$$

for all x ∈ C, then for arbitrary x0 ∈ C, the sequence of Mann type of iterates given
in (1) converges to a member of F(T ).
In a strictly convex Banach space, Rhoades [6] proved the following theorem:
Theorem 3 [6] Let X be a strictly convex Banach space and C a closed convex
subset of X. Let T : C → C be continuous, quasi-nonexpansive mapping and T (C)
be a subset of a compact set K of X. Then, the Mann iterates given by (1) converge
strongly to a fixed point of T.
Som et al. [8] generalized the Mann type of iteration (1) in the following
manner:
Definition 5 Let $\{T_k\}_{k=1}^{N}$ be a finite family of self-mappings of a convex subset $C$
of a Banach space $X$, and let $\{t_k\}_{k=1}^{N}$ be a finite sequence in $(0, 1]$. For $x_0 \in C$, we
define the modified Mann type of iteration as

$$\begin{cases}
x_{iN+1} = t_1 x_{iN} + (1 - t_1) T_1 x_{iN}; \\
x_{iN+2} = t_2 x_{iN+1} + (1 - t_2) T_2 x_{iN+1}; \\
\quad \cdots \\
x_{iN+k} = t_k x_{iN+k-1} + (1 - t_k) T_k x_{iN+k-1}; \\
\quad \cdots \\
x_{iN+N} = t_N x_{iN+N-1} + (1 - t_N) T_N x_{iN+N-1};
\end{cases} \tag{2}$$

for $i = 0, 1, 2, \ldots$.
For a finite family $\{T_k\}_{k=1}^{N}$ of quasi-nonexpansive mappings, Som et al. [8] proved
the following result on the convergence of the Mann type of iterates to the common fixed point
of a finite family of mappings.
Theorem 4 [8] Let $C$ be a nonempty convex subset of a uniformly convex Banach
space. Let $\{T_k\}_{k=1}^{N}$ be a finite sequence of quasi-nonexpansive mappings of $C$ into
itself. Let the graph of each $T_k$ be closed and one of $T_k(C)$, $k = 1, 2, \ldots, N$, be
compact. If the family $\{T_k\}_{k=1}^{N}$ has a common fixed point in $C$, then the modified
Mann type of iterates given by (2) converge to the common fixed point of the family.

2 Main Results

As a particular case of Definition 5, we note the following modification of the
Mann type of iteration (1) for a single map:
Definition 6 Let $T$ be a self-mapping of a convex subset $C$ of a Banach space $X$
and $\{t_k\}_{k=1}^{N}$ be a finite sequence in $(0, 1]$. For $x_0 \in C$, we define the modified Mann
type of iteration as

$$\begin{cases}
x_{iN+1} = t_1 x_{iN} + (1 - t_1) T x_{iN}; \\
x_{iN+2} = t_2 x_{iN+1} + (1 - t_2) T x_{iN+1}; \\
\quad \cdots \\
x_{iN+k} = t_k x_{iN+k-1} + (1 - t_k) T x_{iN+k-1}; \\
\quad \cdots \\
x_{iN+N} = t_N x_{iN+N-1} + (1 - t_N) T x_{iN+N-1};
\end{cases} \tag{3}$$

for $i = 0, 1, 2, \ldots$.
Using the iteration scheme (3), we have our first result on convergence to the
fixed point of a single quasi-nonexpansive mapping, which generalizes Theorem 3
in respect of the iteration scheme.
Theorem 5 Let C be a nonempty convex subset of a uniformly convex Banach space.
Let T be a quasi-nonexpansive mapping of C into itself. Let the graph of T be closed
and T (C) be compact. If T has a fixed point in C, then the modified Mann type of
iterates given by (3) converge to a fixed point of T .
Proof The proof is similar to that of Theorem 4 [8], so we omit it.
As the next generalization of the Mann iteration scheme for a single mapping defined
in (1), and also of the generalized Mann iteration scheme for $N$ mappings of Som et
al. [8], we modify it further for a finite family of mappings in a different way and
define it in the following manner.
Definition 7 Let $\{T_k\}_{k=1}^{N+1}$ be a finite family of self-mappings of a convex subset $C$
of a Banach space $X$, and let $\{t_k\}_{k=1}^{N}$ be a finite sequence in $(0, 1]$. For $x_0 \in C$, we
define the modified Mann type of iteration as

$$\begin{cases}
x_{iN+1} = t_1 T_1 x_{iN} + (1 - t_1) T_2 x_{iN}; \\
x_{iN+2} = t_2 T_2 x_{iN+1} + (1 - t_2) T_3 x_{iN+1}; \\
\quad \cdots \\
x_{iN+k} = t_k T_k x_{iN+k-1} + (1 - t_k) T_{k+1} x_{iN+k-1}; \\
\quad \cdots \\
x_{iN+N} = t_N T_N x_{iN+N-1} + (1 - t_N) T_{N+1} x_{iN+N-1};
\end{cases} \tag{4}$$

for $i = 0, 1, 2, \ldots$.
For such a finite family $\{T_k\}_{k=1}^{N+1}$ of quasi-nonexpansive mappings, we have the
following result, which generalizes the previous results of this kind established by other authors.
Theorem 6 Let $C$ be a nonempty convex subset of a uniformly convex Banach space.
Let $\{T_k\}_{k=1}^{N+1}$ be a finite family of quasi-nonexpansive mappings of $C$ into itself. Let
the graph of each $T_k$ be closed and one of $T_k(C)$, $k = 1, 2, \ldots, N + 1$, be compact.
If the family of mappings $\{T_k\}_{k=1}^{N+1}$ has a common fixed point in $C$, then the modified
Mann type of iterates given by (4) converge to the common fixed point of the family.
Proof First, we show that

$$\|T_k x_{iN+k-1} - T_{k+1} x_{iN+k-1}\| \to 0 \quad \text{as } iN \to \infty$$

for each $k = 1, 2, \ldots, N$.
Suppose, to the contrary, that for a given $\epsilon > 0$ there exists a subsequence $\{iN_j\}$ of $\{iN\}$ such
that

$$\|T_k x_{iN_j+k-1} - T_{k+1} x_{iN_j+k-1}\| \ge \epsilon. \tag{5}$$

Let $u \in C$ be a common fixed point of the family of mappings $\{T_k\}_{k=1}^{N+1}$. Then by
quasi-nonexpansiveness of $T_k$ and $T_{k+1}$, we get for each $k = 1, 2, \ldots, N$,

$$\begin{aligned}
\|x_{iN+k} - u\| &= \|(t_k T_k x_{iN+k-1} + (1 - t_k) T_{k+1} x_{iN+k-1}) - (t_k T_k u + (1 - t_k) T_{k+1} u)\| \\
&\le t_k \|T_k x_{iN+k-1} - u\| + (1 - t_k) \|T_{k+1} x_{iN+k-1} - u\| \\
&\le t_k \|x_{iN+k-1} - u\| + (1 - t_k) \|x_{iN+k-1} - u\| \\
&= \|x_{iN+k-1} - u\|.
\end{aligned}$$

Thus, $\{\|x_{iN+k} - u\|\}$ is a nonincreasing sequence of nonnegative reals, and therefore, it
is convergent. From (5), we have

$$\|(T_k x_{iN_j+k-1} - T_k u) - (T_{k+1} x_{iN_j+k-1} - T_{k+1} u)\| \ge \epsilon.$$

Then for this $\epsilon$, by uniform convexity of the space, there exists a $\delta$, $0 < \delta < 1$, such
that

$$\begin{aligned}
\|x_{iN_j+k} - u\| &= \|(t_k T_k x_{iN_j+k-1} + (1 - t_k) T_{k+1} x_{iN_j+k-1}) - (t_k T_k u + (1 - t_k) T_{k+1} u)\| \\
&\le \delta \max\{\|T_k x_{iN_j+k-1} - u\|, \|T_{k+1} x_{iN_j+k-1} - u\|\}.
\end{aligned}$$

Since $T_{k+1}$ is quasi-nonexpansive, we get

$$\|x_{iN_j+k} - u\| \le \delta \|T_{k+1} x_{iN_j+k-1} - u\| \le \delta \|x_{iN_j+k-1} - u\| \le \delta^{j+k} \|x_0 - u\| \to 0 \quad \text{as } j \to \infty.$$

Therefore, $\|x_{iN+k} - u\| \to 0$ as $iN \to \infty$.
Now,

$$\|T_k x_{iN_j+k-1} - T_{k+1} x_{iN_j+k-1}\| \le \|T_k x_{iN_j+k-1} - u\| + \|u - T_{k+1} x_{iN_j+k-1}\| \le 2 \|x_{iN_j+k-1} - u\| \to 0 \quad \text{as } iN_j \to \infty,$$

which contradicts (5). Therefore, for $k = 1, 2, \ldots, N$,

$$\|T_k x_{iN_j+k-1} - T_{k+1} x_{iN_j+k-1}\| \to 0 \quad \text{as } iN \to \infty. \tag{6}$$

Let $T_{k+1}(C)$ be compact. Then by compactness, there is a subsequence $\{T_{k+1} x_{iN_j+k-1}\}$
which is convergent in $C$.
Let

$$\lim_{j \to \infty} T_{k+1} x_{iN_j+k-1} = z \in C.$$

Then from (6), we have $\lim_{j \to \infty} T_k x_{iN_j+k-1} = z$. Now

$$\|z - T_{k+1} z\| \le \|z - T_k x_{iN_j+k-1}\| + \|T_k x_{iN_j+k-1} - T_{k+1} x_{iN_j+k-1}\| + \|T_{k+1} x_{iN_j+k-1} - T_{k+1} z\|. \tag{7}$$

Since $T_{k+1}$ has a closed graph, $\|T_{k+1} x_{iN_j+k-1} - T_{k+1} z\| \to 0$ as $j \to \infty$,
and so the right-hand side of (7) tends to zero as $j \to \infty$. Hence,

$$T_{k+1} z = z \quad \text{for } k = 0, 1, 2, \ldots, N.$$
Thus, $z$ is a common fixed point of $\{T_k\}_{k=1}^{N+1}$, and so the sequence $\{\|x_{iN+k} - z\|\}$ is
nonincreasing. But

$$\begin{aligned}
\|x_{iN_j+k} - z\| &\le \|(t_k T_k x_{iN_j+k-1} + (1 - t_k) T_{k+1} x_{iN_j+k-1}) - T_k x_{iN_j+k-1}\| + \|T_k x_{iN_j+k-1} - z\| \\
&\le (1 - t_k) \|T_{k+1} x_{iN_j+k-1} - T_k x_{iN_j+k-1}\| + \|T_k x_{iN_j+k-1} - z\|,
\end{aligned}$$

which tends to 0 as $j \to \infty$.
That is, the sequence $\{\|x_{iN+k} - z\|\}$ has a subsequence converging to 0, and therefore

$$\|x_{iN+k} - z\| \to 0$$

as $iN \to \infty$, i.e., $x_{iN+k} \to z$ as $iN \to \infty$. This completes the proof of the
theorem.

3 Numerical Example

The following example is in support of Theorem 6.

Example 1 Let $X = \mathbb{R}$ and $C = [0, 1]$. For $i = 1, 2, 3$, define a mapping $T_i : C \to C$ by

$$T_i x = x^{i+1} \quad \text{for all } x \in C.$$

It is clear that each $T_i$ is a nonlinear continuous self-mapping on $C$ with fixed
point $p = 0$ (the point $x = 1$ is also fixed by each $T_i$, so $p = 0$ is the unique fixed point only in $[0, 1)$). Moreover,

$$|T_i x - p| = |x^{i+1} - 0| = |x^i|\,|x - 0| \le |x - p| \quad \text{for all } x \in C,$$

i.e., $T_i$ is a quasi-nonexpansive mapping with respect to $p = 0$. However, $T_1$ is not nonexpansive. Indeed, for
$x = 14/30$ and $y = 17/30$, we have

$$|T_1 x - T_1 y| = |(14/30)^2 - (17/30)^2| \approx 0.103333 > 1/10 = |14/30 - 17/30| = |x - y|.$$

Similarly, $T_2$ is not nonexpansive. Indeed, for $x = 19/30$ and $y = 20/30$, we have

$$|T_2 x - T_2 y| = |(19/30)^3 - (20/30)^3| \approx 0.042259 > 1/30 = |19/30 - 20/30| = |x - y|.$$

Finally, $T_3$ is not nonexpansive. Indeed, for $x = 19/30$ and $y = 20/30$, we have

$$|T_3 x - T_3 y| = |(19/30)^4 - (20/30)^4| \approx 0.036641 > 1/30 = |19/30 - 20/30| = |x - y|.$$
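
The three counterexamples can be checked in exact rational arithmetic. The snippet below is an illustrative sketch; the helper names `T` and `expansion_ratio` are introduced here and are not part of the paper. A ratio $|T_i x - T_i y| / |x - y|$ greater than 1 shows that $T_i$ is not nonexpansive.

```python
from fractions import Fraction


def T(i: int, x: Fraction) -> Fraction:
    """T_i x = x^(i+1) on C = [0, 1]."""
    return x ** (i + 1)


def expansion_ratio(i: int, x: Fraction, y: Fraction) -> Fraction:
    """|T_i x - T_i y| / |x - y|; a value above 1 rules out nonexpansiveness."""
    return abs(T(i, x) - T(i, y)) / abs(x - y)


# Pairs (x, y) taken from Example 1.
pairs = {1: (Fraction(14, 30), Fraction(17, 30)),
         2: (Fraction(19, 30), Fraction(20, 30)),
         3: (Fraction(19, 30), Fraction(20, 30))}
ratios = {i: expansion_ratio(i, x, y) for i, (x, y) in pairs.items()}
```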

Here $N = 2$. Hence, the Mann type of iteration (4) reduces to

$$\begin{cases}
x_{2i+1} = t_1 T_1 x_{2i} + (1 - t_1) T_2 x_{2i}; \\
x_{2i+2} = t_2 T_2 x_{2i+1} + (1 - t_2) T_3 x_{2i+1};
\end{cases} \tag{8}$$

for all $i = 0, 1, 2, \ldots$.
Thus, all the assumptions of Theorem 6 are satisfied. Hence, from Theorem 6, it
follows that the sequence generated by (8) converges to the common fixed point of the
family $\{T_k\}_{k=1}^{N+1}$. Indeed, the sequence $\{x_n\}$ generated by the proposed iterative
scheme converges to 0. For the initial values $x_0 = 0.99$, $0.999$, $0.9999$, $0.999999$,
and $0.999999999$ with $t_1 = t_2 = 0.5$, the convergence of the sequence $\{x_n\}$ is shown
in Fig. 1.
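
The experiment can be reproduced with a few lines of code. The following is a sketch under the assumption that the scheme is (4) with $N = 2$ and $T_k x = x^{k+1}$, using 20 cycles (40 iterations, matching the horizontal range of Fig. 1); the function name `scheme8` is ours.

```python
from typing import List


def scheme8(x0: float, t1: float = 0.5, t2: float = 0.5,
            cycles: int = 20) -> List[float]:
    """Iterates of scheme (4) with N = 2 and T_k x = x^(k+1); two updates per cycle."""
    xs = [x0]
    for _ in range(cycles):
        x = xs[-1]
        x = t1 * x**2 + (1.0 - t1) * x**3   # odd step: t1*T1(x) + (1 - t1)*T2(x)
        xs.append(x)
        x = t2 * x**3 + (1.0 - t2) * x**4   # even step: t2*T2(x) + (1 - t2)*T3(x)
        xs.append(x)
    return xs


for x0 in (0.99, 0.999, 0.9999, 0.999999, 0.999999999):
    xs = scheme8(x0)
    print(f"x0 = {x0}: x_40 = {xs[-1]:.3e}")
```

Starting points closer to 1 stay near 1 for more iterations before the iterates collapse towards 0, which is the qualitative behaviour the figure is meant to show.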

[Figure: iterates $x_n$ versus the number of iterations $n$ ($0 \le n \le 40$) for $x_0 = 0.99$, $0.999$, $0.9999$, $0.999999$, and $0.999999999$]

Fig. 1 Convergence of iterative method (4)

4 Conclusion

Definition 7 gives a more general version of the iteration scheme, involving
$N + 1$ mappings, and Theorems 5 and 6 generalize the earlier results of Petryshyn
and Williamson [5], Senter and Dotson [7], and Rhoades [6], not only in the sense
of the iteration scheme but also in the sense of the mapping, which was assumed to be
continuous in the result of Rhoades [6]. In our case, the mappings considered need
not be continuous.

References

1. Dotson Jr., W.G.: Fixed points of quasi nonexpansive mappings. J. Aust. Math. Soc. 13, 167–
170 (1972)
2. Goebel, K., Kirk, W.A., Shimi, T.N.: A fixed point theorem in uniformly convex spaces. Bol.
Un. Mat. Ital. 7(4), 67–75 (1973)
3. Iseki, K.: On common fixed point of mappings. Bull. Aus. Math. Soc. 10, 75–87 (1974)
4. Mann, W.R.: Mean value methods in iteration. Proc. Amer. Math. Soc. 4, 506–510 (1953)
5. Petryshyn, W.V., Williamson, T.E.: A necessary and sufficient condition for the convergence
of iterates for quasi non-expansive mappings. Bull. Amer. Math. Soc. 78, 1027–1031 (1972)
6. Rhoades, B.E.: Comments on two fixed point iteration methods. J. Math. Anal. Appl. 56,
741–750 (1976)
7. Senter, H.F., Dotson, W.G.: Approximating fixed points of nonexpansive mappings. Proc.
Amer. Math. Soc. 44, 375–379 (1974)
8. Som, T., Das, S.: Convergence of modified Mann type of iterates and fixed point. Proc. Nat.
Acad. Sci. (India) 70(A)(II), 185–189 (2000)
Chapter 8
Geometric Degree Reduction
of Bézier Curves

Abedallah Rababah and Salisu Ibrahim

Abstract We consider the weighted multi-degree reduction of Bézier curves. Since
exact degree reduction is in general not possible, an approximation process is needed
to reduce a given Bézier curve of high degree $n$ to a Bézier curve of lower degree $m$,
$m < n$. The weight function is used to better represent the approximating
curve in the parts that need more detail and where the error is greater than in other parts. The
$L_2$ norm is used in the degree reduction process. Numerical results and comparisons
are supported by examples. The numerical results obtained from the new method
yield minimum approximation error, improve the approximation in some parts of the
curve, and suggest possible applications in science and engineering.

Keywords Bézier curves · Multiple degree reduction · Geometric continuity

1 Introduction

The problem of degree reduction of Bézier curves is to approximate an original Bézier
curve of degree $n$ with another Bézier curve of degree $m$, $m < n$, subject to
boundary conditions and minimum error conditions. Degree reduction is important
in different fields of science, medical physics, network design, engineering, and
industrial applications, and many researchers have proposed solutions to it.
The approach to the problem of degree reduction leads to solving
a nonlinear problem, which requires numerical methods. In 1999, Lutterkort et
al. proved in [1] that degree reduction of Bézier curves in the $L_2$ norm equals best
Euclidean approximation of Bézier points; see also [5]. These results were generalized
to the constrained case by Ahn et al. in [2], and the discrete cases have been studied
in [3]. In 2007, Rababah et al. used in [4] the idea of basis transformation between
the Bernstein and Jacobi bases to achieve multi-degree reduction of Bézier curves. The
existing methods for degree reduction have many issues, including accumulation of
round-off errors, stability issues, complexity, accuracy, loss of conjugacy, the need
to reset the search direction to the steepest descent direction frequently,
ill-conditioned systems, and singularities; the most challenging difficulty
is that the methods are hard to apply and indirect. Rababah and Mann also
presented in [5] linear $G^1$-, $G^2$-, and $G^3$-multiple degree reduction methods for
Bézier curves. The weighted $G^0$ and $G^1$, weighted $G^1$, and weighted $G^2$ multiple
degree reduction methods for Bézier curves are studied by Rababah and Ibrahim in
[6–8], respectively. Woźny and Lewanowicz reduced the degree of Bézier curves using the dual
Bernstein basis in [9]. Owing to new developments in digital technology, the
Bézier curve approach is used in [10] for automated offline signature verification with intrusion
identification. Research on Bézier curves has also extended to medical
image visual appearance improvement using bihistogram Bézier curve contrast
enhancement [11]. In all existing degree reduction methods, the conditions and
free parameters are applied at the end points.

A. Rababah (B)
Department of Mathematical Sciences, United Arab Emirates University, Al Ain, UAE
S. Ibrahim
Department of Mathematics, Northwest University, Kano 3220, Nigeria

© Springer Nature Singapore Pte Ltd. 2018

The main contribution of this paper is to introduce a weight function into the problem
of degree reduction of Bézier curves, so that more weight is given to the center of
the curve. For this purpose, it is appropriate to consider degree reduction with the weight function
$w(t) = 2t(1 - t)$, $t \in [0, 1]$. Compared with existing results, the resulting method offers
better approximation at the center of the curve, minimum error, and simplicity of
design and implementation.

2 Preliminaries

Definition 1 A Bézier curve $P_n(t)$ of degree $n$ is defined algebraically as follows:

$$P_n(t) = \sum_{i=0}^{n} p_i B_i^n(t), \quad 0 \le t \le 1, \tag{1}$$

where

$$B_i^n(t) = \binom{n}{i} (1 - t)^{n-i} t^i, \quad i = 0, 1, \ldots, n,$$

are the Bernstein polynomials of degree $n$, and $p_0, p_1, \ldots, p_n$ are called the Bézier
control points of the Bézier curve.
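
Definition 1 translates directly into code. The following is a minimal sketch for planar curves that evaluates the sum in (1) term by term; the helper names and the cubic control polygon are hypothetical and serve only as an illustration.

```python
from math import comb
from typing import Sequence, Tuple

Point = Tuple[float, float]


def bernstein(n: int, i: int, t: float) -> float:
    """Bernstein polynomial B_i^n(t) = C(n, i) (1 - t)^(n - i) t^i."""
    return comb(n, i) * (1.0 - t) ** (n - i) * t ** i


def bezier_point(ctrl: Sequence[Point], t: float) -> Point:
    """Evaluate P_n(t) = sum_i p_i B_i^n(t) for planar control points."""
    n = len(ctrl) - 1
    x = sum(p[0] * bernstein(n, i, t) for i, p in enumerate(ctrl))
    y = sum(p[1] * bernstein(n, i, t) for i, p in enumerate(ctrl))
    return (x, y)


# Hypothetical cubic control polygon, used only for illustration.
ctrl = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
mid = bezier_point(ctrl, 0.5)
```

The curve interpolates the first and last control points, which is why the end-point conditions in degree reduction are naturally expressed through $p_0$ and $p_n$.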

Multiplication of two Bernstein polynomials with the weight function $2t(1 - t)$
is given by

$$B_i^m(t)\, B_j^n(t)\, 2t(1 - t) = \frac{2 \binom{m}{i} \binom{n}{j}}{\binom{m+n+2}{i+j+1}}\, B_{i+j+1}^{m+n+2}(t).$$

We define the Gram matrix $G_{m,n}$ as the $(m + 1) \times (n + 1)$ matrix whose elements
are given by

$$g_{ij} = \int_0^1 B_i^m(t)\, B_j^n(t)\, 2t(1 - t)\, dt = \frac{2 \binom{m}{i} \binom{n}{j}}{(m + n + 3) \binom{m+n+2}{i+j+1}}, \quad i = 0, \ldots, m, \ j = 0, 1, \ldots, n. \tag{2}$$

The matrix G m,m is real, symmetric, and positive definite [5].
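
The closed form (2) can be cross-checked against a direct evaluation of the integral. The sketch below uses exact rational arithmetic and the Beta integral $\int_0^1 t^a (1 - t)^b\, dt = a!\, b! / (a + b + 1)!$; the function names are assumptions made for this illustration.

```python
from fractions import Fraction
from math import comb, factorial
from typing import List


def gram_entry(m: int, n: int, i: int, j: int) -> Fraction:
    """Closed form (2) for the weighted Gram entry g_ij."""
    return Fraction(2 * comb(m, i) * comb(n, j),
                    (m + n + 3) * comb(m + n + 2, i + j + 1))


def gram_entry_by_integration(m: int, n: int, i: int, j: int) -> Fraction:
    """Same entry computed from the Beta integral of t^a (1 - t)^b."""
    a = i + j + 1                # total power of t, including the weight
    b = (m - i) + (n - j) + 1    # total power of (1 - t), including the weight
    beta = Fraction(factorial(a) * factorial(b), factorial(a + b + 1))
    return 2 * comb(m, i) * comb(n, j) * beta


def gram_matrix(m: int, n: int) -> List[List[Fraction]]:
    """Weighted Gram matrix G_{m,n} with (m + 1) rows and (n + 1) columns."""
    return [[gram_entry(m, n, i, j) for j in range(n + 1)] for i in range(m + 1)]


G = gram_matrix(3, 3)
```

Since the Bernstein polynomials of a fixed degree sum to one, the entries of $G_{m,n}$ add up to $\int_0^1 2t(1 - t)\, dt = 1/3$, which gives a quick sanity check on any implementation.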

Geometric continuity describes the continuity of two curves with some geometric
properties. It is independent of their parametrization and denoted by G k . Geometric
continuity produces additional free parameters; see [5, 12] that are used to minimize
the error.

Definition 2 Bézier curves $P_n$ and $R_m$ are said to be $G^k$-continuous at $t = 0, 1$ if
there exists a strictly increasing parametrization $s(t) : [0, 1] \to [0, 1]$ with $s(0) =
0$, $s(1) = 1$, and

$$R_m^{(i)}(t) = P_n^{(i)}(s(t)), \quad t = 0, 1, \quad i = 0, 1, \ldots, k. \tag{3}$$

3 Degree Reduction of Bézier Curves

Degree reduction can be defined as a method of approximating a given Bézier curve
of degree $n$ by a Bézier curve of degree $m$, $m < n$. Degree reduction is approximative
by nature, since exact degree reduction is ordinarily not possible. In this paper,
our goal is to find a Bézier curve $R_m(t)$ of degree $m$ with control points $\{r_i\}_{i=0}^{m}$ that
approximates a given Bézier curve $P_n(t)$ of degree $n$ with control points $\{p_i\}_{i=0}^{n}$,
where $m < n$. The Bézier curve $R_m$ has to satisfy the following two conditions:

(1) $P_n$ and $R_m$ are $G^k$-continuous at the end points for $k = 0, 1$, and
(2) the $L_2$-error between $P_n$ and $R_m$ is minimum.
We can write the two Bézier curves $P_n(t)$ and $R_m(t)$ in matrix form as

$$P_n(t) = \sum_{i=0}^{n} p_i B_i^n(t) =: B_n P_n, \quad 0 \le t \le 1, \tag{4}$$

$$R_m(t) = \sum_{i=0}^{m} r_i B_i^m(t) =: B_m R_m, \quad 0 \le t \le 1, \tag{5}$$

where Bn , Bm are the row matrices containing the Bernstein polynomials of degree
n, m, respectively, and Pn and Rm are the column matrices containing the Bézier
points of degrees n and m, respectively.
In this paper, we use the weighted $L^2$-norm to measure the distance between the Bézier curves $P_n(t)$ and $R_m(t)$; therefore, the error term becomes

$$\varepsilon = \int_0^1 \|B_n P_n - B_m R_m\|^2 \, 2t(1 - t)\, dt = \int_0^1 \|B_n P_n - B_m^c R_m^c - B_m^f R_m^f\|^2 \, 2t(1 - t)\, dt. \tag{6}$$

The linear system is constructed and solved for each of the conditions of the $G^1$- and $G^2$-degree reductions. The control points of the Bézier curve are expanded into their $x$ and $y$ components. Therefore, the variables of our system of equations are $r_k^x, r_k^y$, $k = 2, \dots, m - 2$, $\delta_0$ and $\delta_1$ for the $G^1$-degree reduction, and $r_k^x, r_k^y$, $k = 3, \dots, m - 3$, $\eta_0$ and $\eta_1$ for the $G^2$-degree reduction.
The unknowns have the following solution form; see [5]:

$$R_m^F = (G_{m,m}^F)^{-1} \left( G_{m,n}^{P} P_n - G_{m,m}^{C} R_m^C \right). \tag{7}$$
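To make the structure of the solution concrete, the following Python sketch treats only the simplest special case, under the assumption that the superscripts in (6) and (7) separate constrained from free control points: if no control points are constrained, minimizing the weighted error leads to the normal equations $G_{m,m} R_m = G_{m,n} P_n$. This is not the $G^1$ or $G^2$ method of the paper, which also fixes end conditions and optimizes the extra parameters. The helper names `gram`, `solve`, and `reduce_degree` are hypothetical.

```python
from fractions import Fraction
from math import comb


def gram(m, n):
    """Weighted Gram matrix G_{m,n} from Eq. (2), with exact entries."""
    return [[Fraction(2 * comb(m, i) * comb(n, j),
                      (m + n + 3) * comb(m + n + 2, i + j + 1))
             for j in range(n + 1)] for i in range(m + 1)]


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination; B may have several columns."""
    size = len(A)
    aug = [list(A[r]) + list(B[r]) for r in range(size)]
    for col in range(size):
        # A is positive definite here, so a nonzero pivot always exists.
        pivot = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(size):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]


def reduce_degree(points, m):
    """Unconstrained weighted least-squares reduction of a 2D curve to degree m."""
    n = len(points) - 1
    P = [[Fraction(c) for c in p] for p in points]
    G_mn = gram(m, n)
    # Right-hand side G_{m,n} P_n, one column per coordinate.
    rhs = [[sum(G_mn[i][j] * P[j][d] for j in range(n + 1)) for d in range(2)]
           for i in range(m + 1)]
    return solve(gram(m, m), rhs)


# A straight line written as a degree-2 curve (degree-elevated by hand).
elevated = [(0, 0), (Fraction(1, 2), Fraction(1, 2)), (1, 1)]
reduced = reduce_degree(elevated, 1)
```

Because the input here is really a degree-1 curve, the reduction recovers it exactly; for a genuinely higher-degree curve the result is only the best approximation in the weighted norm.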

4 Applications

This section provides two examples that support and validate the theoretical results of the discussed methods.

Example 4.1 Given a Bézier curve $P_n(t)$ of degree 12 with control points
P0 = (0.224, 0.213), P1 = (0.248, 0.327), P2 = (0.079, 0.377), P3 = (0.004, 0.497), P4 = (0.544, 0.587),
P5 = (0.068, 0.511), P6 = (0.529, 0.131), P7 = (−0.274, 0.516), P8 = (0.248, 0.531), P9 = (0.194, 0.383),
P10 = (0.202, 0.357), P11 = (0.494, 0.306), P12 = (0.193, 0.141).
This curve is reduced to a Bézier curve $R_m(t)$ of degree 8.
Figure 1 depicts the original curve in solid black and the weighted $G^1$- and $G^2$-degree reductions in dashed green and dashed red, respectively.
Figure 2 depicts the error plots; the long thick blue and dashed orange curves represent the weighted $G^2$- and $G^1$-degree reductions, respectively.

Fig. 1 Curve of degree 12 reduced to degree 8 with weighted $G^1$ (dashed green) and $G^2$ (dashed red), and the original curve (black)

Example 4.2 Given a Bézier curve $P_n(t)$ of degree 15 with control points (see [13])
P0 = (0, 0), P1 = (1.5, −2), P2 = (4.5, −1), P3 = (9, 0), P4 = (4.5, 1.5),
P5 = (2.5, 3), P6 = (0, 5), P7 = (−4, 8.5), P8 = (3, 9.5), P9 = (4.4, 10.5),
P10 = (6, 12), P11 = (8, 11), P12 = (9, 10), P13 = (9.5, 5), P14 = (7, 6),
P15 = (5, 7).
This curve is reduced to a Bézier curve $R_m(t)$ of degree 8.
Figure 3 depicts the original curve in solid black and the weighted $G^1$- and $G^2$-degree reductions in dashed green and dashed red, respectively.
Figure 4 shows the same curves together with their control polygons: the weighted $G^1$- and $G^2$-degree reductions to degree 8 in dashed green and dashed red, and the original curve in black.
Figure 5 depicts the error plots; the long thick blue and dashed orange curves represent the weighted $G^2$- and $G^1$-degree reductions, respectively. Figure 6 shows the figure from the existing method; see [13].

Fig. 2 Error plots (weighted $G^2$ and weighted $G^1$)

Fig. 3 Curve of degree 15 reduced to degree 8 with weighted $G^1$ (dashed green) and $G^2$ (dashed red), and the original curve (black)

Fig. 4 Polygon of degree 15 reduced to degree 8 with weighted $G^1$ (dashed green) and $G^2$ (dashed red), and the original curve (black)

Fig. 5 Error plots (weighted $G^2$ and weighted $G^1$)

Fig. 6 Figure from existing method; see [13]

5 Conclusion

This paper investigates weighted multi-degree reduction of Bézier curves. Explicit formulas for the weighted $G^1$ and $G^2$ methods are used to reduce a given Bézier curve of high degree $n$ to a Bézier curve of lower degree $m$, $m < n$; the computations were carried out with Mathematica 9. Our numerical results show that the weight function helps to improve the approximation in some parts of the curves, and that the new method yields minimum approximation error and has possible applications in science and engineering.

Acknowledgements The authors would like to thank the reviewers for helpful comments.

References

1. Lutterkort, D., Peters, J., Reif, U.: Polynomial degree reduction in the $L^2$-norm equals best Euclidean approximation of Bézier coefficients. Comput. Aided Geom. Des. 16, 607–612 (1999)
2. Ahn, Y., Lee, B.G., Park, Y., Yoo, J.: Constrained polynomial degree reduction in the $L^2$-norm equals best weighted Euclidean approximation of Bézier coefficients. Comput. Aided Geom. Des. 21, 181–191 (2004)
3. Ait-Haddou, R.: Polynomial degree reduction in the discrete $L^2$-norm equals best Euclidean approximation of h-Bézier coefficients. BIT Numer. Math. (2016)
4. Rababah, A., Lee, B.G., Yoo, J.: Multiple degree reduction and elevation of Bézier curves using Jacobi-Bernstein basis transformations. Num. Funct. Anal. Optim. 28(9–10), 1179–1196 (2007)
5. Rababah, A., Mann, S.: Linear methods for $G^1$, $G^2$, and $G^3$-multi-degree reduction of Bézier curves. Comput.-Aided Des. 45(2), 405–414 (2013)
6. Rababah, A., Ibrahim, S.: Weighted $G^1$-multi-degree reduction of Bézier curves. Int. J. Adv. Comput. Sci. Appl. 7(2), 540–545 (2016)
7. Rababah, A., Ibrahim, S.: Weighted degree reduction of Bézier curves with $G^2$-continuity. Int. J. Adv. Appl. Sci. 3(3), 13–18 (2016)
8. Rababah, A., Ibrahim, S.: Weighted $G^0$- and $G^1$-multi-degree reduction of Bézier curves. In: AIP Conference Proceedings 1738, 05, vol. 7, issue 2. p. 0005 (2016). https://doi.org/10.1063/1.4951820
9. Woźny, P., Lewanowicz, S.: Multi-degree reduction of Bézier curves with constraints, using dual Bernstein basis polynomials. Comput. Aided Geom. Des. 26, 566–579 (2009)
10. Vijayaragavan, A., Visumathi, J., Shunmuganathan, K.L.: Cubic Bézier curve approach for automated offline signature verification with intrusion identification. Math. Prob. Eng. 2014(Article ID 928039), 8 pp. (2014)
11. Gan, H.-S., Swee, T.T., Abdul Karim, A.H., Amir Sayuti, K., Abdul Kadir, M.R., Tham, W.-T., Wong, L.-X., Chaudhary, K.T., Ali, J., Yupapin, P.P.: Medical image visual appearance improvement using bihistogram Bezier curve contrast enhancement: data from the osteoarthritis initiative. Sci. World J. 2014(Article ID 294104), 13 pp. (2014)
12. Lu, L., Wang, G.: Optimal multi-degree reduction of Bézier curves with $G^2$-continuity. Comput. Aided Geom. Des. 23, 673–683 (2006)
13. Lu, L.: Sample-based polynomial approximation of rational Bézier curves. J. Comput. Appl. Math. 235, 1557–1563 (2011)
Chapter 9
Cybersecurity: A Survey of Vulnerability
Analysis and Attack Graphs

Rachid Ait Maalem Lahcen, Ram Mohapatra and Manish Kumar

Abstract The network infrastructure is the most critical technical asset of any organization. This network architecture must be useful, efficient, and secure. However, its cybersecurity challenges are immense, as the number of attacks is increasing. Consequently, there is a need for efficient tools to assess the risks, identify the vulnerabilities, and find solutions before attackers exploit them. The challenge remains to integrate vulnerability analysis tools into a holistic process that cyber defenders can use to detect an intrusion and respond quickly. Attack graphs have proven to be of great importance in analyzing security. In this paper, we present a survey of topics related to the field of vulnerability analysis and attack graphs.

Keywords Attack graphs · Cybersecurity · Cyber situational awareness · Vulnerability analysis

1 Introduction

Enterprise networks continue to struggle with maintenance of network performance, availability, and security [1]. For instance, the Identity Theft Resource Center [2] recorded 1339 US data breaches in 2017, exposing more than 174,402,528 confidential records. Cumulatively, between January 1, 2005, and December 27, 2017, the number of breaches was 8190, with 1,057,771,011 exposed records. The Federal Bureau of Investigation's (FBI) Internet Crime Complaint Center [3] receives an average of 280,000 complaints each year, or an average of 800 complaints a day, and in 2016 there was a total loss of $1.33 billion. It is also widely recognized that the time it takes for an organization to realize that it has been successfully attacked is measured in hundreds of days, not in hours. A report on organizations in Europe, the Middle East, and Africa (EMEA) [4] found that it took three times longer to detect a compromise in the region: 469 days versus a global average of 146 days. Many organizations in EMEA were re-compromised within months of an initial breach. Consequently, it is crucial for any institution to analyze the security of its network from every access point. Attackers may have one or several motives, and they are determined to breach the systems. Once they enter through one access point, they will try to penetrate every level of the network. Hence, the defender's effort to protect the systems cannot stop at administrative duties. The defender must possess tools that can analyze the enterprise network to discover vulnerabilities before the attackers do. One of the most effective methods is to search for all possible access points and the various possible attack paths by building attack graphs and simulating the attacks [5]. The scenario graph demonstrates every possible path to break into a network's security [5]. Consequently, a network attack graph depicts all possible penetration scenarios. Attack graphs give an overview of potential scenarios that can lead to an unauthorized intrusion [6]. Zero-day exploits will always be a challenge, since attackers develop exploits for vulnerabilities that have not yet been disclosed. Hence, it is necessary to explore unexpected attacker behavior and not be limited by predefined information [7]. Since we see cyber situational awareness as an important framework in which attack graphs can be implemented, we address it first. Some related and interesting work can be found in [3, 6–16].

R. Ait Maalem Lahcen · R. Mohapatra (✉)
Department of Mathematics, 4000 Central Florida Blvd., Orlando, FL 32816, USA
M. Kumar
Department of Mathematics, Birla Institute of Technology and Science-Pilani, Hyderabad Campus, Hyderabad 500078, Telangana, India
© Springer Nature Singapore Pte Ltd. 2018

2 Cyber Situational Awareness

Cyber situational awareness (CSA) is important for effective cybersecurity analysis and incident response. However, it has not been well studied [11]. Several open-source tools and products were developed to tackle cyber problems, with the US Government being a primary client. However, those tools have not improved the CSA of cybersecurity analysts. Barford et al., in chapter 1 of “Cyber SA: Situational Awareness for Cyber Defense”, discuss that the aspects of situational awareness (SA) consist of [12]:
• Situation perception, which includes recognition and identification of the type of attack and its target.
• Impact assessment, which includes assessment of current and future damage.
• Situation tracking, which is needed to stay aware of how the situation progresses.
• Awareness of intent and threat-hunting techniques.
• Backtracking and forensics to analyze the reasons and methods that caused an attack situation.
• Evaluation of the collected SA information.
• Lessons learned about the current situation and how it will evolve in the future.

Therefore, CSA can be summarized in three major steps [12]:

1. Recognition, which provides the basis for better SA.
2. Comprehension, in which knowledge and data supply context so that the information is meaningful to the specific circumstances.
3. Projection, which is used to make educated and informed assessments about future attacks and to mitigate their threats.
The diagram in Fig. 1 shows how the situation can evolve in a nonlinear way. This is equivalent to sensemaking in [8], which includes learning new areas, solving ill-defined problems, acquiring SA, and participating in knowledge sharing; these steps should lead to deeper understanding. CSA requires time to develop, and one should work on building a model that better prepares for future attacks. Cyber defenders clearly have to deal with uncertainty, as it is not possible for them to be aware of everything running within every computer inside the network. There is also no efficient mechanism to digest the logs, even if every device can be logged. To summarize, one should find answers to these questions in CSA [17]:
• Is there an intrusion?
• Where is the intruder?
• How does the situation evolve?
• What is the impact of the attack on the network?
• How should the damage be assessed?
• What behavior is expected from the attackers?
• What strategies might they take?
• Can we predict future scenarios of the current situation?
• How did the intruder manage to make it happen?
• What was the target or goal of the intruder?

Fig. 1 A nonlinear SA process [17]

Fig. 2 CSA framework [17]

Figure 2 depicts the CSA framework of [17], in which vulnerability analysis is conducted by a topological approach that generates attack graphs by encoding probabilistic knowledge of the attackers' behavior. The authors merge multiple attack types into a compact data structure and define an index structure on top of it to classify multiple alerts and data from sensors. A dependency analysis is performed to generate dependency graphs. An attack scenario graph is then made by joining dependency graphs and attack graphs. Scenario graphs show ways in which an intruder can exploit known vulnerabilities and affect the system. The authors also proposed an algorithm for both detection and prediction, which scaled well with large graphs [17].

3 Attack Graphs

Computer networks may have vulnerabilities that can be exploited in ways that serve the goals of the intruder. Although a successful attack may require multiple steps in various orders, the usual network attack consists of these stages:
1. Reconnaissance, in which attackers gather information about a target to use in the next step. Some of the techniques used are social engineering, physical reconnaissance, and dumpster diving. Reconnaissance can be active or passive, depending on whether or not the attacker interacts with the system.
2. Scanning is the next step, used to discover running services on a target computer or network. It is a development of active reconnaissance, since the attacker engages with the system to learn about its vulnerabilities.
3. Gaining access is the logical next step, achieved by exploiting the identified vulnerabilities.
4. Maintaining access is possible when the intruder plants Trojan software, a packet analyzer, or additional backdoor network access code.
5. Covering tracks, or the hiding stage, in which the intruder tries to cover up the crime. This stage may include cleaning logs, hiding background programs, and installing code to conceal malicious software from legitimate users.
A case example with an attack graph is given by J. Li, X. Ou, and R. Rajagopalan in chapter 4 of [12] and is shown in Fig. 3. In this example, attack paths are found after configuration analysis.
The intruder breaches the web server from a remote location by exploiting the CVE-2002-0392 vulnerability and gains local access on the server (web servers are a critical attack vector; one was used, for example, in the Equifax breach in 2017). The intruder then attempts to alter data on the file server in order to exploit vulnerabilities and gain access to that machine. The intruder installs a Trojan-horse program, waits for a user on a workstation to run it, and thereby gains control of the workstation. Details of this scenario graph can be found in [12]. Although this attack graph, or any other attack graph of similar size, may look simple, it can still involve complicated computations of the likelihood that an attack succeeds. Figure 4 shows a simple example of an attack graph from [16], in which the oval nodes are the exploit nodes and the text nodes are the condition nodes.

Fig. 3 A case example of an attack scenario and attack graph (WebServer p1, NFS protocol p2, WebServer p3, File server p4) [12]

Fig. 4 Simple example of an attack graph [16]

The complexity of attack graph topology creates many shared dependencies. For instance, node $c_{10}$ can be reached by an intruder by exploiting $e_4$ or $e_5$, both of which fully depend upon $c_7$. Hence, the paths through $e_4$ and $e_5$ are not independent. Furthermore, one cannot assume independence in attack graphs and should measure the probability of possible multistep attacks. Yun et al. [16] presented a method for security risk assessment that combines attack graphs and the Common Vulnerability Scoring System (CVSS) in order to address the incorrect probability computation caused by conjoint dependencies among nodes. Briefly, CVSS helps to identify the principal characteristics of a vulnerability and scores its severity. CVSS is formed by three metric groups, stated in [10]:
1. Base, which includes exploitability metrics and impact metrics. These cover the ease of exploiting the vulnerable component and the consequence for the impacted component. They represent vulnerability characteristics that are constant across users and environments.
2. Temporal, which represents the characteristics of a vulnerability that change over time, yet not across user environments.
3. Environmental, which represents the characteristics of a vulnerability that are related to a particular user's environment.
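The dependency issue described above for $c_{10}$, $e_4$, $e_5$, and $c_7$ can be seen in a few lines of Python. The sketch below is not the algorithm of [16]; it only contrasts a naive computation that treats the two paths as independent with an exact enumeration over all outcomes. The three probabilities are hypothetical values picked for the illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical probabilities, chosen only for illustration.
P_C7 = Fraction(1, 2)   # condition c7 is satisfied
P_E4 = Fraction(3, 5)   # exploit e4 succeeds once c7 holds
P_E5 = Fraction(2, 5)   # exploit e5 succeeds once c7 holds


def naive_probability():
    """Treat the paths through e4 and e5 as independent (incorrect here)."""
    path4 = P_C7 * P_E4
    path5 = P_C7 * P_E5
    return 1 - (1 - path4) * (1 - path5)


def exact_probability():
    """Enumerate all outcomes of c7, e4, e5 and sum those that reach c10."""
    total = Fraction(0)
    for c7, e4, e5 in product([True, False], repeat=3):
        weight = ((P_C7 if c7 else 1 - P_C7)
                  * (P_E4 if e4 else 1 - P_E4)
                  * (P_E5 if e5 else 1 - P_E5))
        if c7 and (e4 or e5):   # c10 needs c7 and at least one exploit
            total += weight
    return total


naive = naive_probability()
exact = exact_probability()
```

With these numbers the naive value overestimates the exact one, because the shared precondition $c_7$ is counted once per path. Exact enumeration grows exponentially with the number of nodes, which is one reason approximate schemes are of interest for larger graphs.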
The algorithm in [16] calculates the probability of nodes either exactly or approximately, depending on their depth, a preset number, and a theorem established there. This algorithm addresses the problem of incorrect probability computation. It was evaluated in simulations with 5–20 hosts and showed some effectiveness over HOMER's algorithm [18]. Wang, Du, and Yang presented an automated method that generates and analyzes attack graphs in [13]. They built it using symbolic model checking algorithms and tested it on a small network example. They also tested it on a small operational network using the Network Security Planning Architecture and found a faulty firewall. Shahriari, Ganjisaffar, Jalili, and Habibi modeled networks' topologies, their configurations, and vulnerabilities in [19]. Their framework is similar to MulVAL, which we address later in the paper. They implemented an expert system based on a framework for automatic topological multihost vulnerability analysis, a methodology that explores all attack paths and combats unauthorized access by an attacker. The output of the expert system is accessed by the network administrator from the user interface, which also allows control of the inference engine. The latter processes logical inferences based on the knowledge base, which collects facts and inference rules. The knowledge base component gets its input from the host vulnerability extractor, which takes information from vulnerability databases and host scanners. The expert system performed vulnerability analysis of a network with 1600 hosts in reasonable time (31 s) [19].
Noel and Jajodia applied adjacency matrix clustering to network attack graphs in order to correlate and predict attacks [20]. Reachability across the network is found by self-multiplying the clustered adjacency matrices to find the number of steps to an attack. The reachability analysis summarizes how changing a network configuration can affect the attack graph. The graphical technique matches columns and rows of the clustered adjacency matrix to show multiple-step attacks. This makes it possible to identify impact depending on the number of steps to victim machines and to identify the sources of the attack. The adjacency matrix brings simplicity to their approach, since a single matrix element represents each graph edge. Graph vertices are implicitly represented as matrix rows and columns. The adjacency matrix avoids the typical crowded edge representation of small and large graphs. Their clustering algorithm is advantageous because it scales linearly with network size, is parameter-free, and is completely automatic.
The experiments of Yang et al. [15] show that their hierarchical architecture is suitable for assessing the potential security risks at four levels: network, hosts, services, and vulnerabilities. The vulnerability attack link generated algorithm proposed in their paper could help system administrators mitigate the potential security risks in a computer system. This algorithm is composed of two subalgorithms: (1) the host access link generated algorithm and (2) the vulnerability attack link generated algorithm; details can be found in [15].
Abraham and Nair [7] propose a stochastic approach for security evaluation based on attack graphs, taking CVSS scoring into account. They used MulVAL (developed by Kansas State University) to generate logical attack graphs in polynomial time. A simulation of the absorbing Markov chain is conducted on the attack graph generated for the network. They used a realistic network to analyze and capture security properties and to optimize the application of patches. The proposed model can help harden the system by identifying its critical parts and predicting the total security variation over time [7].
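The idea of finding the number of attack steps by repeatedly multiplying an adjacency matrix, as described above for the approach of [20], can be sketched as follows. This is a plain Boolean matrix-power illustration without the clustering step, not the authors' implementation; the four-host chain and the names `mat_mult_bool` and `min_steps` are hypothetical.

```python
def mat_mult_bool(A, B):
    """Boolean matrix product: entry (i, j) is True if some k links i to j."""
    n = len(A)
    return [[any(A[i][k] and B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def min_steps(adjacency):
    """steps[i][j] is the smallest k with (A^k)[i][j] nonzero, or None."""
    n = len(adjacency)
    A = [[bool(v) for v in row] for row in adjacency]
    steps = [[None] * n for _ in range(n)]
    power = A                      # holds A^k during iteration k
    for k in range(1, n + 1):
        for i in range(n):
            for j in range(n):
                if power[i][j] and steps[i][j] is None:
                    steps[i][j] = k
        power = mat_mult_bool(power, A)
    return steps


# Hypothetical network: 0 = attacker, 1 = web server,
# 2 = file server, 3 = workstation. Entry [i][j] = 1 means
# "an exploit launched from host i can compromise host j".
adjacency = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
steps = min_steps(adjacency)
```

Here `steps[0][3]` gives the number of exploit steps the attacker needs to reach the workstation, and a `None` entry means the target is unreachable from that source.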
Lippmann and Ingols, in 2005, surveyed attack graph papers that focused on three goals [21]:
1. Papers that construct attack graphs to analyze network security.
2. Papers about formal languages, complex or simple, that describe states in attack graphs. Those languages typically define preconditions for a successful intrusion, and postconditions or changes in network state after an intrusion.
3. Papers that describe attack graphs used with intrusion detection systems (IDS) to group alerts.
They found that most algorithms were tested on small networks with fewer than 20 hosts. After 2005, several papers tackled the scalability problem and attempted larger networks, but not yet at the desired scale of enterprise networks with over 10,000 hosts. Nevertheless, research using attack graphs has produced a number of good prototypes, which are summarized in Table 1.
Cauldron is a commercialized TVA that was developed by George Mason University; hence, it applies the concept shown in Fig. 5. In this paper, we limit the survey to TVA, discussed below. FireMon is the commercialized NETSPA, adopted by FireMon, LLC. We also limit discussion to NETSPA. Another commercial toolkit is Skybox View by Skybox Security Inc.; it has polynomial complexity $O(n^3)$.

Table 1 Attack graph toolkits [7]

Toolkit name   Complexity          Open source   Developer
MulVAL         O(n^2) to O(n^3)    Yes           Kansas State University
TVA            O(n^2)              No            George Mason University
NETSPA         O(n log n)          No            MIT
Cauldron       O(n^2)              No            Commercial
FireMon        O(n log n)          No            Commercial

Fig. 5 Topological vulnerability analysis [12]

3.1 Topological Vulnerability Analysis

Jajodia and Noel discuss Topological Vulnerability Analysis (TVA) in [12]. TVA tries to discover the paths through a network that an intruder may follow. Figure 5 shows the concept of the TVA architecture.
Network Capture builds a model of the network; the Vulnerability Database is a comprehensive repository of reported vulnerabilities, with a record of the affected software or hardware; and the Exploit Conditions encode how each vulnerability may be exploited and the consequence of the breach (preconditions and postconditions). All inputs from network capture are used to set up an Environment Model for multistep attack graph simulation. The Graph Engine generates all possible attack path scenarios after analyzing vulnerability dependencies, matching the conditions before and after exploitation. TVA outputs a Visual Analysis of attack graphs and calculates Optimal Counter Measures. TVA attack graphs can support intrusion detection systems. TVA matches the network model against a database of reported vulnerabilities from the examples included in Fig. 5 [12]. Although TVA has some technical challenges, such as entering the exploit information by hand, it can be used to determine safe network configurations with respect to the goal of maximizing available network services. It also has potential applications in identifying possible attack responses and improving intrusion detection systems.
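The role of preconditions and postconditions in chaining exploits can be illustrated with a small forward-chaining sketch in Python. It is a toy model written for this survey text, not TVA's Graph Engine: it assumes that conditions, once gained, are never lost, and every exploit and condition name below is hypothetical.

```python
def reachable_conditions(initial, exploits):
    """Forward-chain: fire every exploit whose preconditions all hold."""
    known = set(initial)
    fired = []
    changed = True
    while changed:
        changed = False
        for name, (pre, post) in exploits.items():
            if name not in fired and set(pre) <= known:
                known |= set(post)      # postconditions become available
                fired.append(name)
                changed = True
    return known, fired


# Hypothetical exploit model: name -> (preconditions, postconditions).
exploits = {
    "web_exploit": ({"attacker_remote", "web_vulnerable"}, {"exec_on_web"}),
    "nfs_write": ({"exec_on_web", "nfs_exported"}, {"write_on_fileserver"}),
    "trojan": ({"write_on_fileserver"}, {"exec_on_workstation"}),
    "db_exploit": ({"exec_on_db_subnet"}, {"exec_on_db"}),
}
initial = {"attacker_remote", "web_vulnerable", "nfs_exported"}

known, fired = reachable_conditions(initial, exploits)
```

In this toy network the first three exploits chain into a multistep attack on the workstation, while `db_exploit` never fires because its precondition is not reachable. Removing a single initial condition (for example, patching the web server) and rerunning the function shows which downstream steps disappear, which is the kind of what-if analysis used to compare network configurations.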

3.2 A Network Security Planning Architecture

A Network Security Planning Architecture (NETSPA) generates, for a user-defined network topology, attack graphs of all potential paths that can be exploited. These graphs and their associated statistics, such as the number of hosts compromised and attacker privilege levels, allow a network administrator to determine likely intrusion paths and extrapolate this data to determine the current and future security of the network given past software vulnerability frequencies. As the attack graphs are displayed in near real time, an administrator can change the network topology slightly, recompute the graphs for the new topology, and compare the graphs produced from different configurations. This allows an administrator to weigh network security against other factors, such as hardware costs and ease of maintenance. Finally, NETSPA imports information from several existing security and network planning tools. Existing network configuration information can be obtained through tools such as nmap, Nessus, and NetViz. Online databases such as ICAT and the Nessus vulnerability plug-ins provide valuable information about attack requirements and effects [22]. Construction of an attack graph requires several pieces of information about the type of attacker, the underlying network topology, the number of attacks available to the attacker, and their types. Figure 6 illustrates these input components.
Only three inputs (the attack model, network and host vulnerabilities, and network topology) are essential to the creation of a useful attack graph. The attack model defines the state transition relation of an attack by stating the requirements and effects of executing it. The network topology limits the physical paths that an attacker can traverse within the network, subject to network connectivity and firewall rules. The host and network vulnerabilities and configurations define the possible set of initial actions that the attacker can take against the network using attacks from the attack model. The other three inputs to the attack graph (attacker profiles, intrusion detection systems, and critical network resources) are not required to generate an attack graph; however, they increase the utility of the constructed graph. The attacker profile defines the starting state of the intruder, as well as the methodology that the intruder uses in choosing the next attack to execute. This enables the administrator to optimize a network's security against novice outside attackers, while accepting the possibility of an insider attack. A list of critical network resources also allows the security administrator to prune the complete attack graph to only those states which are judged critical, such as not allowing attacker access to a central billing database. Finally, the placement and type of the intrusion detection systems allow the graph generator to determine which paths are visible.

Fig. 6 Necessary information to create an attack graph [22]
NETSPA was created to fill a void in existing security software. The primary design goal of NETSPA was to create a system that could automatically compute complete attack graphs for real, user-specified networks. This, in turn, leads to three separate subgoals: allow a user to easily define a network and its resulting configuration, enable quick modeling of realistic actions, and efficiently compute worst-case attack graphs with sufficient meta-information to be readily useful to the user. Secondary to the notion of attack graphs was that of simplicity and information reuse, most notably in the action specifications. The worst-case graphs generated by NETSPA illustrate all possible cyber-attack paths. They do not model physical attacks or human engineering attacks. Graph generation does not take into account the skill or predisposition of the attacker. It also assumes that attempts at “security by obscurity,” such as passing SMTP traffic through the firewall on a non-standard port, fail. In addition, the model of an IDS is assumed to be “best-case.” A host-based IDS always detects an attack launched against it, while a network-based IDS always detects attacks that are visible on the network if it has a signature for the attack.
NETSPA is divided into several modules to achieve its goals; each component and its connectivity are shown in Fig. 7. As seen in the upper left of the figure, the software database is the repository of software information used by NETSPA to make software names consistent. The software database is used by both the action database and the input filters to the network model to create a consistent software naming scheme among network configuration and action definitions. The action database, shown in the middle left of the figure, contains information about every possible action that an attacker can execute against a user-defined network. The creation of this network is aided by the user input filters, shown in the upper right of Fig. 7, which populate the network model with network configuration information. This network model is then used to create an initial network state, which is provided, along with the database of possible actions and the set of existing trust relationships, to the computation engine. The computation engine then creates a worst-case attack graph for the specific set of inputs [22].

Fig. 7 NETSPA component diagram [22]

3.3 Multihost, Multistage, Vulnerability Analysis

The multihost, multistage vulnerability analysis (MulVAL) project was developed at Kansas State University as a research tool to better manage the configuration of an enterprise network. Ou, Govindavajhala, and Appel explain that MulVAL uses Datalog as the modeling language for the elements in the analysis [23]. The inputs to MulVAL's analysis are reported vulnerabilities or advisories, host configuration, network configuration, the network users or principals, and policies such as access levels. Figure 8 shows the MulVAL framework.

Fig. 8 The MulVAL framework [23]

The reasoning engine in MulVAL can handle the network size and perform analysis for thousands of machines. For scalability, MulVAL was tested on up to 2000 hosts. The scanners can execute in parallel on multiple machines. The analysis engine then operates on the data collected from all hosts. The OVAL scanner collects machine configuration information and compares the configuration with formal advisories to check whether vulnerabilities exist on a system. However, when a new advisory arrives, the scanning has to be repeated on each host, which is not the most desirable technique. The OVAL language is an XML-based language for specifying machine configuration tests. MulVAL runs efficiently for networks with thousands of hosts, and it has found security problems in a real network [23]. MulVAL is open source, which gives an advantage to academic researchers.

4 Conclusion

Predicting total security at a given time is still a challenging task, and blocking
sophisticated threats or advanced malware attacks is still less effective [24]. Attack
graph representation approaches have seen several developments since 1996, from the
enumeration approach to the hybrid condition approach, with exploit-oriented and
vulnerability-oriented approaches [25]. Good strides have been made in addressing
scalability and network vulnerability analysis, yet there is still a need to address
complex large enterprise networks and multiple-stage attacks. Such complex networks
demand automatic expert systems that analyze the network topology, show exploitation
scenarios, and rank relevant subgraphs to determine the security measures that need
to be deployed first. In addition, future research should improve the application of
attack graph algorithms by decreasing their complexity. Finally, there is also a need
for research on the design of security systems that better integrate and automate
cyber situational awareness [26–33].

References

1. Filkins, B.: Network Security Infrastructure and Best Practices: A SANS Survey. SANS Insti-
tute, Washington (2017)
2. Identity Theft Resource Center: ITRC Data Breach Report (2017)
3. Smith, S.S.: Internet Crime Report 2016, 29920 (2016)
4. Hall, T., Hau, B., Penrose, M., Bevilacqua, M.: Mandiant M-Trends 2016 EMEA Edition, pp.
1–18 (2016)
5. Liu, Z., Li, S., He, J., Xie, D., Deng, Z.: Complex network security analysis based on attack
graph model. In: 2012 Second International Conference on Instrumentation, Measurement,
Computer, Communication and Control, pp. 183–186 (2012)
6. Sheyner, O., Wing, J.: Tools for generating and analyzing attack graphs. In: 2nd International
Symposium on Formal Methods for Components and Objects (FMCO’03), vol. 3188, pp.
344–371 (2004)
7. Abraham, S., Nair, S.: A Predictive Framework for Cyber Security Analytics Using Attack
Graphs, pp. 1–17 (2015)
8. Pirolli, P., Russell, D.M.: Introduction to this special issue on sensemaking. Hum.-Comput.
Interact. 26, 1–8 (2011)
9. Seuwou, P., Banissi, E., Ubakanma, G., Sharif, M.S., Healey, A.: Actor-network theory
as a framework to analyse technology acceptance model’s external variables: the case of
autonomous vehicles. Commun. Comput. Inf. Sci. 630, 305–320 (2016)
10. Singhal, A., Ou, X.: Security Risk Analysis of Enterprise Networks Using Probabilistic Attack
Graphs, pp. 1–22 (2011). https://doi.org/10.6028/nist.ir.7788
11. Stevens-Adams, S., Carbajal, A., Silva, A., Nauer, K., Anderson, B., Reed, T., Forsythe, C.:
Enhanced Training for Cyber Situational Awareness. Lecture Notes in Computer Science
(Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinfor-
matics) (LNAI), vol. 8027, pp. 90–99 (2013)
12. Jajodia, S., Peng, L., Swarup, V., Wang, C.: Cyber Situational Awareness Testing, vol. 2016,
pp. 209–233. Springer (2016)
13. Wang, C., Du, N., Yang, H.: Generation and analysis of attack graphs. Procedia Eng. 29,
4053–4057 (2012)
14. Wang, X., Liao, Y.: A replication detection scheme for sensor networks. Procedia Eng. 29,
21–26 (2012)
15. Yang, J., Liang, L., Yang, Y. and Zhu, G.: A hierarchical network security risk assessment
method based on vulnerability attack link generated. In: 2012 4th International Symposium on
Information Science and Engineering (ISISE 2012), vol. 1, pp. 113–118 (2012)
16. Ye, Y., Xu, X.S., Qi, Z.C.: A probabilistic computing approach of attack graph-based nodes in
large-scale network. Procedia Environ. Sci. 10, 3–8 (2011)
17. Pino, R.E.: Cybersecurity Systems for Human Cognition Augmentation. Springer, New York
(2014)
18. Homer, J., Ou, X., Schmidt, D.: A sound and practical approach to quantifying security risk in
enterprise networks. Technical Report, pp. 1–15. Kansas State University (2009)
19. Hamid, R.S., Yasser, G., Rasool, J.: Topological analysis of multi-phase attacks using expert
systems. J. Inf. Sci. Eng. 767, 743–767 (2008)
20. Noel, S., Jajodia, S.: Understanding complex network attack graphs through clustered adjacency
matrices. Proceedings-Annual Computer Security Applications Conference (ACSAC) 2005,
160–169 (2005)
21. Lippmann, R.P., Ingols, K.W.: An annotated review of past papers on attack graphs. No. PR-
IA-1 (2005)
22. Artz, M.L.: NetSPA: a network security planning architecture. Netw. Secur. 2001, 1–97 (2002)
23. Ou, X., Govindavajhala, S., Appel, A.W.: MulVAL: a logic-based network security analyzer.
In: Proceedings of the 14th conference on USENIX Security Symposium, vol. 14 (2005)
24. Oltsik, J.: Integrated Network Security Architecture: Threat-Focused Next-generation Firewall.
The Enterprise Strategy Group, Inc. (2014)

25. Mell, P., Harang, R.: Minimizing attack graph data structures. In: ICSEA 2015: Tenth Interna-
tional Conference on Software Engineering Advances. Barcelona (2015)
26. Bacic, E., Froh, M., Henderson, G.: Mulval extensions for dynamic asset protection (2006)
27. Frigault, M., Wang, L.: Measuring network security using bayesian network-based attack
graphs. In: Proceedings—International Computer Software and Applications Conference, pp.
698–703 (2008)
28. Kaynar, K.: A taxonomy for attack graph generation and usage in network security. J. Inf.
Secur. Appl. 29, 27–56 (2016)
29. Long, X., Wu, X.: Motion segmentation based on edge detection. Procedia Eng. 29, 74–78
(2012)
30. Ma, J.C., Wang, Y.J., Sun, J.Y., Chen, S.: A minimum cost of network hardening model based
on attack graphs. Procedia Eng. 15, 3227–3233 (2011)
31. Mourad, A., Soeanu, A., Laverdière, M.A., Debbabi, M.: New aspect-oriented constructs for
security hardening concerns. Comput. Secur. 28, 341–358 (2009)
32. Ou, X., Govindavajhala, S., Appel, A.W: Policy-based multihost multistage vulnerability anal-
ysis (2005)
33. Dimitrios, P., Sarandis, M., Christos, D.: Expanding topological vulnerability analysis to intru-
sion detection through the incident response intelligence system. Inf. Manage. Comput. Secur.
4 (2010)
Chapter 10
A Solid Transportation Problem
with Additional Constraints Using
Gaussian Type-2 Fuzzy Environments

Sharmistha Halder (Jana), Debasis Giri, Barun Das,
Goutam Panigrahi, Biswapati Jana and Manoranjan Maiti

Abstract This paper deals with a nonlinear transportation problem in which one part
of the unit transportation cost varies with the distance from some origin, and the
problem includes an additional impurity restriction. Moreover, the fixed unit
transportation costs are imprecise. In model I, some parameters (i.e. production cost,
transport cost, supply, demand and units of impurity at the demand point) are
considered as Gaussian type-2 fuzzy variables, while in model II the supply and
demand are deterministic. The type-2 fuzzy variables are transformed into type-1
fuzzy variables with the help of the CV-based reduction method. A genetic algorithm
(GA) has been applied to solve the proposed models. Finally, a numerical illustration
is presented to demonstrate the experimental results.

S. Halder (Jana)
Department of Mathematics, Midnapore College [Autonomous],
Midnapore 721101, India
e-mail: [email protected]
D. Giri (B)
Department of Computer Science and Engineering,
Haldia Institute of Technology, Haldia, East Midnapore 721657, India
e-mail: [email protected]
B. Das
Department of Mathematics, Sidho Kanho Birsha University,
Purulia 723104, West Bengal, India
e-mail: [email protected]
G. Panigrahi
Department of Mathematics, National Institute of Technology,
Durgapur 713209, West Bengal, India
e-mail: [email protected]
B. Jana
Department of Computer Science, Vidyasagar University,
Midnapore 721102, West Bengal, India
e-mail: [email protected]
M. Maiti
Department of Applied Mathematics with Oceanology and Computer Programming,
Vidyasagar University, Midnapore 721102, West Bengal, India
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2018

Keywords Nonlinear solid transportation problem · Impurity constraints · Critical
value · Gaussian type-2 fuzzy variables · Genetic algorithm · Reduction method

1 Introduction

The traditional transportation problem is one of the subclasses of the linear
programming problem in which all the constraints are of equality or inequality type.
In its traditional form, the problem minimizes the total cost of transporting a product
that is available at several sources and required at several destinations. The unit
cost (i.e., the cost of transporting one unit from a specific supply point to a specific
demand point), the amounts available at the supply points and the amounts required
at the demand points are the parameters of the transportation problem. In real-life
situations, the transportation problem typically involves nonlinear, noncommensurable,
multiple and conflicting objective functions. This kind of problem is called a
nonlinear multi-objective transportation problem. Several authors apply a distance
function to present a mathematical model of the nonlinear multi-objective
transportation problem. In this paper, we propose a transportation problem with a
single objective function, which becomes nonlinear. After the introduction of fuzzy
set theory by Zadeh [1] in 1965, Zimmermann [2] applied the fuzzy programming
technique with suitable membership functions to solve multi-objective linear
programming problems. The solutions obtained by fuzzy linear programming are
efficient as well.
The standard transportation problem [3] is a well-known optimization problem in
operational research, in which two kinds of constraints (source and destination) are
considered. In real situations, however, other constraints arise, such as the type of
products, the mode of transport and the distance of the path travelled. As a result,
the conventional transportation problems (2D-TPs) with conveyance constraints turn
into solid transportation problems (STPs/3D-TPs). The STP was first proposed
by Schell [4]. As a generalization of the ordinary TP, the STP was presented by
Haley [5] in 1962. In recent years, the STP has received much attention, and many
models and algorithms under both crisp and uncertain environments have been
developed. Many researchers have worked in this area, such as Jimenez et al. [6],
Yang et al. [7], Liu et al. [8] and Kocken et al. [9].
Ordinarily, the distances of the routes between the sources and the destinations are
not considered in TPs, as the distance of a route stays unchanged and does not affect
the minimization of cost/time. In real problems, there may be different routes/paths
for travel between an origin and a destination. Along these paths, the distance between
the source and the destination differs. Per unit transportation costs and fixed charges
along the routes are also different. Hence, the choice of routes plays a major role in
maximizing the profit in a TP. Thus, in a transportation problem, if different routes
are considered besides different vehicles, then the three-dimensional transportation
problem (3D-TP) is transformed into a four-dimensional transportation problem
(4D-TP). In an STP, when more than one item is stored at various origins and
transported to various destinations using different kinds of conveyances, the problem
reduces to a multi-item STP (MISTP). Many researchers such as Ojha et al. [10],
Kundu et al. [11], Giri et al. [12] and others have worked on the MISTP. Similarly,
when more than one item is considered in a 4D-TP, it becomes a multi-item 4D-TP
(MI4D-TP).
Type-2 fuzzy sets are used because of their flexibility and additional degrees of
freedom; a type-2 fuzzy set is treated as three-dimensional. Hence, type-2 fuzzy sets
model uncertain problems more accurately than type-1 fuzzy variables. Dubois and
Prade [13] and Mizumoto and Tanaka [14] investigated the logical operations of
type-2 fuzzy sets. Afterwards, a large body of theoretical research addressed the
properties of type-2 fuzzy variables [6, 13, 15], and their practical applications have
been developed [16, 17].
The present paper mainly investigates the following:

• A computationally effective defuzzification procedure for type-2 fuzzy variables
is introduced.
• Although TPs with type-1 fuzzy parameters have been discussed by many
researchers, transportation problems with type-2 fuzzy variables are formulated
and solved here.
• A chance-constrained programming model with type-2 fuzzy variables is
formulated.

Here, we have presented a profit maximization STP with Gaussian type-2 fuzzy
variables. Several kinds of conveyances are used for the transportation of goods
from sources to destinations. The transportation system is planned from the viewpoint
of a dealer who buys the source amounts at various origins and sells the transported
amounts at different destinations as per the demands there. Purchasing costs and
selling prices differ among origins and destinations. Transportation costs, demands
at destinations, conveyance procurement costs and capacities are also Gaussian
type-2 fuzzy variables.

2 Mathematical Model Formulation

2.1 Notations

The following notations are used.

Index sets

– $i$: index for sources, $i = 1, 2, \ldots, M$.
– $j$: index for destinations, $j = 1, 2, \ldots, N$.

Decision variables

– $w_{ij}$: units transported from the $i$th origin to the $j$th destination.
– $(x_i, y_i)$: position of the $i$th origin.

Objective function

– $z_1$: total transportation cost from the origins to the destinations.

Parameters

– $h_{ij}$: production cost per unit conveyed from the $i$th source to the $j$th destination.
– $c_{ij}$: transportation cost per unit conveyed from the $i$th warehouse to the $j$th market.
– $A_i$: total available supply at each source (or origin) $i$.
– $B_j$: total demand at each destination $j$.
– $(p_j, q_j)$: position of the $j$th destination.
– $d_{ij}$: distance from the $i$th warehouse to the $j$th market.

2.2 Model Formulation

Objective function:
Let us consider a transportation problem with $M$ origins $O_i$ ($i = 1, 2, \ldots, M$)
and $N$ destinations $D_j$ ($j = 1, 2, \ldots, N$), in which the positions $(x_i, y_i)$ of
the origins are to be decided with respect to the positions $(p_j, q_j)$ of the
destinations, together with the units $w_{ij}$ transported from the $i$th origin to the
$j$th destination. The first part of the objective function is the cost associated with
the amount transported, and the second part is associated with the distance from the
origins to the destinations. Hence, the objective function of the nonlinear solid
transportation problem is as follows:

\[
z_1 = \text{Min}\left[\sum_{i=1}^{M}\sum_{j=1}^{N} h_{ij} w_{ij} + \sum_{i=1}^{M}\sum_{j=1}^{N} c_{ij} d_{ij} y_{ij}\right] \tag{1}
\]

where

\[
y_{ij} = \begin{cases} 1, & \text{if } w_{ij} \neq 0;\\ 0, & \text{if } w_{ij} = 0, \end{cases}
\qquad
d_{ij} = \sqrt{(x_i - p_j)^2 + (y_i - q_j)^2}.
\]
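As a rough illustration, objective (1) can be evaluated for a given shipment plan as in the Python sketch below. The function names and the 2 × 2 data are hypothetical and chosen only for easy checking; the sketch also assumes that $d_{ij}$ is the Euclidean distance between $(x_i, y_i)$ and $(p_j, q_j)$.

```python
import math


def distance(origin, destination):
    """Euclidean distance d_ij between an origin (x_i, y_i) and a destination (p_j, q_j)."""
    (x, y), (p, q) = origin, destination
    return math.hypot(x - p, y - q)


def total_cost(w, h, c, origins, destinations):
    """Objective (1): sum of h_ij*w_ij plus c_ij*d_ij*y_ij, where y_ij = 1 iff w_ij != 0."""
    cost = 0.0
    for i, origin in enumerate(origins):
        for j, dest in enumerate(destinations):
            y_ij = 1 if w[i][j] != 0 else 0
            cost += h[i][j] * w[i][j] + c[i][j] * distance(origin, dest) * y_ij
    return cost


# Hypothetical 2 x 2 instance (not the data of the paper).
origins = [(0.0, 0.0), (3.0, 4.0)]
destinations = [(3.0, 4.0), (0.0, 0.0)]
w = [[10, 0], [0, 5]]          # shipped units w_ij
h = [[2.0, 1.0], [1.0, 3.0]]   # production cost per unit h_ij
c = [[1.0, 1.0], [1.0, 2.0]]   # distance-related cost c_ij

print(total_cost(w, h, c, origins, destinations))
```

Only the routes that actually carry goods contribute to the distance term, which is what makes the objective nonlinear in $w_{ij}$.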

Constraints:

For the $i$th origin $O_i$, the total shipment $\sum_{j=1}^{N} w_{ij}$ cannot exceed its
availability $A_i$. That is, we require

\[
\sum_{j=1}^{N} w_{ij} \le A_i, \quad i = 1, 2, \ldots, M. \tag{2}
\]

The total incoming shipment at the $j$th destination, $\sum_{i=1}^{M} w_{ij}$, should satisfy
its requirement or demand. That is, we require

\[
\sum_{i=1}^{M} w_{ij} \ge B_j, \quad j = 1, 2, \ldots, N. \tag{3}
\]

Suppose that one unit of the item at the $i$th supply point contains $f_i$ units of
impurity. The total impurity received at destination $j$ is $\sum_{i=1}^{M} f_i w_{ij}$, and
demand point $j$ cannot receive more than $g_j$ units of impurity. That is, we require

\[
\sum_{i=1}^{M} f_i w_{ij} \le g_j, \quad j = 1, 2, \ldots, N. \tag{4}
\]

Non-negativity constraints on the decision variables: $w_{ij} \ge 0$ for all $i, j$.
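A candidate shipment plan can be checked against constraints (2)–(4) and non-negativity with a few lines of Python. This is only a sketch for crisp data; the function name, the tolerance and the numbers are assumptions made for illustration.

```python
def is_feasible(w, A, B, f, g, tol=1e-9):
    """Check supply (2), demand (3), impurity (4) and non-negativity for crisp data."""
    M, N = len(A), len(B)
    if any(w[i][j] < -tol for i in range(M) for j in range(N)):
        return False
    supply_ok = all(sum(w[i][j] for j in range(N)) <= A[i] + tol for i in range(M))
    demand_ok = all(sum(w[i][j] for i in range(M)) >= B[j] - tol for j in range(N))
    impurity_ok = all(
        sum(f[i] * w[i][j] for i in range(M)) <= g[j] + tol for j in range(N)
    )
    return supply_ok and demand_ok and impurity_ok


# Hypothetical crisp data (not the data of the paper).
A = [30, 30]   # supplies A_i
B = [20, 20]   # demands B_j
f = [2, 1]     # impurity per unit at each source, f_i
g = [45, 50]   # maximum impurity accepted at each destination, g_j

print(is_feasible([[10, 10], [10, 10]], A, B, f, g))  # all four conditions hold
print(is_feasible([[25, 0], [0, 20]], A, B, f, g))    # impurity at destination 1 is 50 > 45
```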

2.3 Defuzzification of Gaussian Type-2 Fuzzy Variables

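\[
\begin{aligned}
\text{Min } & \bar{f}\\
\text{s.t. } & \mathrm{Cr}\Big\{\sum_{i=1}^{M}\sum_{j=1}^{N} \tilde{h}_{ij} w_{ij} + \sum_{i=1}^{M}\sum_{j=1}^{N} \tilde{c}_{ij} d_{ij} y_{ij} \le \bar{f}\Big\} \ge \alpha,\\
& \mathrm{Cr}\Big\{\sum_{j=1}^{N} w_{ij} \le \tilde{A}_i\Big\} \ge \alpha_i, \quad i = 1, 2, \ldots, M,\\
& \mathrm{Cr}\Big\{\sum_{i=1}^{M} w_{ij} \ge \tilde{B}_j\Big\} \ge \beta_j, \quad j = 1, 2, \ldots, N,\\
& \mathrm{Cr}\Big\{\sum_{i=1}^{M} f_i w_{ij} \le \tilde{g}_j\Big\} \ge \gamma_k, \quad j = 1, 2, \ldots, N,\\
& w_{ij} \ge 0, \quad \forall i, j.
\end{aligned}
\tag{5}
\]

Here Min $\bar{f}$ denotes the minimum value that the objective function achieves with
generalized credibility at least $\alpha$ ($0 < \alpha \le 1$), and $\alpha_i, \beta_j, \gamma_k$
($0 < \alpha_i, \beta_j, \gamma_k \le 1$) are the predetermined generalized credibility
satisfaction levels of the origin and destination restrictions, respectively, for all
$i, j, k$.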

Case i:
When α (0, 0.25], then the parametric problem of the model representation (5) as:

Minf¯
N
M
s.t (μh̃ − σh̃ 2 ln(1 + (1 − 4α)θr,h̃ ) − 2 ln 2α)wij
ij ij ij
i=1 j=1

+ (μc̃ − σc̃ 2 ln(1 + (1 − 4α)θr,c̃ ) − 2 ln 2α)yij
ij ij ij

N
and wij ≤ (μÃ − σÃ 2 ln(1 + (1 − 4αi )θr,Ã ) − 2 ln 2αi ), i = 1, 2, 3, . . . .M
i i i
j=1

M
wij ≥ (μB̃ − σB̃ 2 ln(1 + (1 − 4βj )θr,B̃ ) − 2 ln 2βj ), j = 1, 2, 3, . . . N
j j j
i=1

M
fi wij ≤(μg̃ − σg̃ 2 ln(1 + (1 − 4γk )θr,g̃ ) − 2 ln 2γk ), j = 1, 2, 3, . . . N
j j j
i=1

Case ii:
When α (2.5, 0.5], then the parametric problem of the model representation (5) as:

Minf¯
N
M
s.t (μh̃ − σh̃ 2 ln(1 + (4α − 1)θr,h̃ ) − 2 ln(2α + (4α − 1)θ1,h ))wij
ij ij ij ij
i=1 j=1

+ (μc̃ − σc̃ 2 ln(1 + (4α − 1)θr,c̃ ) − 2 ln(2α + (4α − 1)θ1,cij ))yij
ij ij ij

N
and wij ≤ (μÃ − σÃ 2 ln(1 + (4αi − 1)θr,Ã ) − 2 ln(2αi + (4αi − 1)θ1,Ai )), i = 1, 2, 3, . . . M
i i i
j=1

M
wij ≥ (μB̃ − σB̃ 2 ln(1 + (4βj − 1)θr,B̃ ) − 2 ln(2βj + (4βj − 1)θr,Bj )), j = 1, 2, 3, . . . N
j j j
i=1

M
fi wij ≤ (μg̃ − σg̃ 2 ln(1 + (4γk − 1)θr,g̃ ) − 2 ln(2γk + (4γk − 1)θ1,gj )), j = 1, 2, 3, . . . N
j j j
i=1

Case iii:
When α (0.5, 7.5], then the parametric problem of the model representation (5) as:

Minf¯
N
M
s.t (μh̃ij + σh̃ij 2 ln(1 + (3 − 4α)θr,h̃ij ) − 2 ln(2(1 − α) + (3 − 4α)θ1,h̃ij ))wij
i=1 j=1

+ (μc̃ij + σc̃ij 2 ln(1 + (3 − 4α)θr,c̃ij ) − 2 ln(2(1 − α) + (3 − 4α)θ1,c̃ij ))yij

N
and wij ≤ (μÃi + σÃi 2 ln(1 + (3 − 4αi )θr,Ãi ) − 2 ln(2(1 − αi ) + (3 − 4αi )θ1,Ãi )), i = 1, 2, 3, . . . M
j=1

M
K
wij ≥ (μB̃j + σB̃j 2 ln(1 + (3 − 4βj )θr,B̃j ) − 2 ln(2(1 − βj ) + (3 − 4α)θ1,B̃j )), j = 1, 2, 3, . . . N
i=1 k=1

M
fi wij ≤ (μg̃j + σg̃j 2 ln(1 + (3 − 4γk )θr,g̃j ) − 2 ln(2(1 − γk ) + (3 − 4γk )θ1,g̃j )), k = 1, 2, 3, . . . K
i=1
10 A Solid Transportation Problem with Additional Constraints … 119

Case iv:
When α (0.75, 1], then the parametric problem of the model representation (5) as:

Minf¯
N
M
s.t (μh̃ + σh̃ 2 ln(1 + (4α − 3)θr,h̃ ) − 2 ln(2(α − 1))wij
ij ij ij
i=1 j=1

+ (μc̃ + σc̃ 2 ln(1 + (4α − 3)θr,c̃ ) − 2 ln(2(1 − α))yij
ij ij ij

N
and wij ≤ (μÃ + σÃ 2 ln(1 + (4αi − 3)θr,Ã ) − 2 ln(2(αi − 1)), i = 1, 2, 3, . . . M
i i i
j=1

M
wij ≥ (μB̃ + σB̃ 2 ln(1 + (4βj − 3)θr,B̃ ) − 2 ln(2(1 − βj )), j = 1, 2, 3, . . . N
j j j
i=1

M
fi wij ≤ (μg̃ + σg̃ 2 ln(1 + (4γk − 3)θr,g̃ ) − 2 ln(2(γk − 1)), j = 1, 2, 3, . . . N
j j j
i=1
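The four cases share one pattern: each Gaussian type-2 fuzzy parameter is replaced by a crisp coefficient of the form $\mu \mp \sigma\sqrt{\cdot}$ that depends on the credibility level. The Python sketch below transcribes the coefficient pattern used for the cost terms in Cases i to iv; it is an illustration under stated assumptions, not the authors' implementation. The function name is hypothetical, the argument `sigma` is assumed to be the spread that multiplies the square root, and `theta_l`, `theta_r` are assumed to be the left and right spread parameters of the type-2 variable.

```python
import math


def crisp_coefficient(mu, sigma, theta_l, theta_r, alpha):
    """Crisp equivalent of a Gaussian type-2 fuzzy cost at credibility level alpha.

    Follows the piecewise expressions of Cases i-iv for the cost terms.
    alpha = 1 is excluded here because ln(2(1 - alpha)) is undefined there.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if alpha <= 0.25:      # Case i
        s = 2 * math.log(1 + (1 - 4 * alpha) * theta_r) - 2 * math.log(2 * alpha)
        sign = -1.0
    elif alpha <= 0.5:     # Case ii
        s = (2 * math.log(1 + (4 * alpha - 1) * theta_r)
             - 2 * math.log(2 * alpha + (4 * alpha - 1) * theta_l))
        sign = -1.0
    elif alpha <= 0.75:    # Case iii
        s = (2 * math.log(1 + (3 - 4 * alpha) * theta_r)
             - 2 * math.log(2 * (1 - alpha) + (3 - 4 * alpha) * theta_l))
        sign = 1.0
    else:                  # Case iv
        s = 2 * math.log(1 + (4 * alpha - 3) * theta_r) - 2 * math.log(2 * (1 - alpha))
        sign = 1.0
    if s < 0:
        raise ValueError("the expression under the square root is negative")
    return mu + sign * sigma * math.sqrt(s)


# Hypothetical parameter values, used only to show the dependence on alpha.
for alpha in (0.1, 0.25, 0.6, 0.95):
    print(alpha, crisp_coefficient(10.0, 1.0, 0.8, 1.8, alpha))
```

The coefficient lies below $\mu$ for credibility levels up to 0.5 and above $\mu$ beyond 0.5, so a higher required credibility leads to a more conservative cost estimate.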

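2.4 Model 2: Production Cost, Unit Transportation Cost, Impurity at Demand Point are Treated as Gaussian Type-2 Fuzzy Variables and Source, Demands are Crisp

\[
\text{Min } f_1 = \sum_{i=1}^{M}\sum_{j=1}^{N} \tilde{h}_{ij} w_{ij} + \sum_{i=1}^{M}\sum_{j=1}^{N} \tilde{c}_{ij} d_{ij} y_{ij}
\]

where $d_{ij} = \sqrt{(x_i - p_j)^2 + (y_i - q_j)^2}$, subject to

\[
\begin{aligned}
& \sum_{j=1}^{N} w_{ij} \le A_i, \quad i = 1, 2, \ldots, M,\\
& \sum_{i=1}^{M} w_{ij} \ge B_j, \quad j = 1, 2, \ldots, N,\\
& \sum_{i=1}^{M} f_i w_{ij} \le \tilde{g}_j, \quad j = 1, 2, \ldots, N.
\end{aligned}
\]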

Case i:
When α (0, 0.25], then the parametric problem of the model representation (5) as:

N
M
Min TP = (μh̃ − σh̃ 2 ln(1 + (1 − 4α)θr,h̃ ) − 2 ln 2α)wij
ij ij ij
i=1 j=1

+ (μc̃ − σc̃ 2 ln(1 + (1 − 4α)θr,c̃ ) − 2 ln 2α)yij
ij ij ij

s.t (11)−(13) (6)

120 S. Halder (Jana) et al.

Case ii:
When α (2.5, 0.5], then the parametric problem of the model representation (5) as:

N
M
Min TP = (μh̃ − σh̃ 2 ln(1 + (4α − 1)θr,h̃ ) − 2 ln(2α + (4α − 1)θ1,h ))wij
ij ij ij ij
i=1 j=1

+ (μc̃ − σc̃ 2 ln(1 + (4α − 1)θr,c̃ ) − 2 ln(2α + (4α − 1)θ1,cij ))yij
ij ij ij

s.t (11)−(13) (7)

Case iii:
When α (0.5, 7.5], then the parametric problem of the model representation (5) as:

N
M
Min TP = (μh̃ + σh̃ 2 ln(1 + (3 − 4α)θr,h̃ ) − 2 ln(2(1 − α) + (3 − 4α)θ1,h̃ ))wij
ij ij ij ij
i=1 j=1

− (μc̃ + σc̃ 2 ln(1 + (3 − 4α)θr,c̃ ) − 2 ln(2(1 − α) + (3 − 4α)θ1,c̃ ))yij
ij ij ij ij

s.t (11)−(13) (8)

Case iv:
When α (0.75, 1], then the parametric problem of the model representation (5) as:

N
M
Min TP = (μh̃ + σh̃ 2 ln(1 + (4α − 3)θr,h̃ ) − 2 ln(2(α − 1))wij
ij ij ij
i=1 j=1

− (μc̃ + σc̃ 2 ln(1 + (4α − 3)θr,c̃ ) − 2 ln(2(1 − α))yij
ij ij ij

s.t (11)−(13) (9)

3 Solution Procedures

A genetic algorithm (GA) is used to solve the proposed models. A GA is a heuristic
search process for optimization that mimics natural selection. Here, the population
is a set of feasible solutions of the proposed problem. A member of the population
is called a genotype, a chromosome, a string or a permutation. A GA performs three
basic operations: reproduction, crossover and mutation.

3.1 Parameters

The following parameters are used to solve the problem through the GA:
MAXGEN: number of generations (set to 5000)
POPSIZE: size of the population (set to 100)
PXOVER: probability of crossover (set to 0.6)
PMU: probability of mutation (set to 0.2)

3.2 Representation of Chromosome

The proposed models are nonlinear, so a real-number representation is used for
the chromosomes. Binary vectors have been used for many nonlinear real-world
problems, but they were not effective.

3.3 Reproduction

To evaluate the chromosomes, parents are randomly selected. The bounds, the
dependent variables and the independent variables are determined among all
(here 16) variables to initialize the population.

3.4 Crossover

The main operator of the GA is crossover. It exchanges the parents' characteristics
and passes them on to the children. It is carried out in two steps:
(i) Selection for crossover: For each solution of $P^1(T)$, a random number $r$ is
generated from the range [0, 1]. The solution is selected for crossover if $r < p_c$,
where $p_c$ is the crossover probability.
(ii) Crossover process: Crossover is applied to the selected solutions. For each
pair of selected solutions $Y_1, Y_2$, a random number $c$ is taken from the range
[0, 1], and the offspring $Y_1^1$ and $Y_2^1$ are calculated as
$Y_1^1 = cY_1 + (1 - c)Y_2$, $Y_2^1 = cY_2 + (1 - c)Y_1$, where $Y_1^1$ and $Y_2^1$ must
satisfy the constraints of the problem.
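The two crossover steps can be sketched in Python as follows. The function names are hypothetical, and solutions are represented as flat lists of real numbers, which is an assumption about the encoding.

```python
import random


def select_for_crossover(population, pc, rng=random):
    """Step (i): keep each solution for crossover when a random r in [0, 1) satisfies r < pc."""
    return [solution for solution in population if rng.random() < pc]


def arithmetic_crossover(y1, y2, rng=random):
    """Step (ii): offspring c*Y1 + (1 - c)*Y2 and c*Y2 + (1 - c)*Y1 for a random c in [0, 1)."""
    c = rng.random()
    child1 = [c * a + (1 - c) * b for a, b in zip(y1, y2)]
    child2 = [c * b + (1 - c) * a for a, b in zip(y1, y2)]
    return child1, child2


rng = random.Random(0)  # fixed seed so that the run is repeatable
parents = [[1.0, 2.0], [3.0, 6.0]]
print(arithmetic_crossover(parents[0], parents[1], rng))
```

Since both children are convex combinations of the parents, they automatically satisfy any linear constraints, such as (2)–(4), that both parents satisfy; other constraints still have to be checked after crossover.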

3.5 Mutation

Mutation is performed to recover important characteristics that may have been lost.
It is also used to maintain population diversity. It is carried out in two steps:
(i) Selection for mutation: For each solution of $P^1(T)$, a random number $r$ is
generated from the range [0, 1]. The solution is selected for mutation if $r < p_m$,
where $p_m$ is the mutation probability.
(ii) Mutation process: A random integer $r$ is selected within the range [1, K]. The
$r$th component $x_r$ of $X$ is then replaced by a random number, which gives the
mutated solution $X = (x_1, x_2, \ldots, x_K)$.
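A minimal Python sketch of the mutation process is given below. It assumes that the replacement value is drawn uniformly from the admissible range of the chosen component; the `bounds` argument and the function name are hypothetical, since the text does not state how the random number is generated.

```python
import random


def mutate(x, bounds, rng=random):
    """Replace one randomly chosen component of x by a random value within its bounds.

    The text indexes components from 1 to K; Python uses 0 to K - 1.
    """
    r = rng.randrange(len(x))        # position of the component to mutate
    low, high = bounds[r]
    mutated = list(x)                # the parent solution is left unchanged
    mutated[r] = rng.uniform(low, high)
    return mutated


rng = random.Random(1)  # fixed seed so that the run is repeatable
x = [1.0, 2.0, 3.0]
bounds = [(0.0, 10.0)] * 3
print(mutate(x, bounds, rng))
```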

3.6 Evaluation

The evaluation function used to solve this problem is

eval(Vi) = objective function value.

Chromosomes are selected through roulette wheel selection: better chromosomes are
chosen from the population to create the new chromosomes. New, improved
chromosomes are then produced through arithmetic crossover and mutation. The
steps of the proposed algorithm are given below:
Step-1: Begin
t = 0; where t is the iteration number.
Step-2: Population(t) is initialized.
Step-3: Population(t) is evaluated.
Step-4: while (termination condition is not satisfied)
{
t = t + 1
Population(t) is selected from Population(t-1).
Perform crossover on Population(t)
Perform mutation on Population(t)
Evaluate Population(t)
}
Step-5: Optimization result is printed.
Step-6: End.
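The steps above can be sketched as one small Python routine. This is a simplified illustration rather than the authors' code: it minimizes a hypothetical two-variable cost inside box bounds instead of the transportation model, it converts cost to fitness with 1/(1 + cost), which assumes a non-negative cost, it pairs consecutive individuals for crossover, and it keeps the best solution found so far. All function and parameter names are assumptions.

```python
import random


def roulette_select(population, fitness, rng):
    """Pick one individual with probability proportional to its fitness."""
    pick = rng.uniform(0, sum(fitness))
    running = 0.0
    for individual, fit in zip(population, fitness):
        running += fit
        if running >= pick:
            return individual
    return population[-1]  # guard against floating-point round-off


def run_ga(cost, bounds, pop_size=20, max_gen=100, pc=0.6, pm=0.2, seed=0):
    rng = random.Random(seed)
    # Steps 2 and 3: initialize and evaluate the population.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best = min(pop, key=cost)
    for _ in range(max_gen):  # Step 4
        # Selection: lower cost gives higher fitness (assumes cost >= 0).
        fitness = [1.0 / (1.0 + cost(ind)) for ind in pop]
        pop = [list(roulette_select(pop, fitness, rng)) for _ in range(pop_size)]
        # Arithmetic crossover on consecutive pairs.
        for a in range(0, pop_size - 1, 2):
            if rng.random() < pc:
                c = rng.random()
                p1, p2 = pop[a], pop[a + 1]
                pop[a] = [c * u + (1 - c) * v for u, v in zip(p1, p2)]
                pop[a + 1] = [c * v + (1 - c) * u for u, v in zip(p1, p2)]
        # Mutation: replace one component by a random value within its bounds.
        for ind in pop:
            if rng.random() < pm:
                r = rng.randrange(len(ind))
                ind[r] = rng.uniform(*bounds[r])
        # Evaluation: remember the best solution seen so far.
        best = min(pop + [best], key=cost)
    return best  # Step 5


def toy_cost(v):
    """Hypothetical cost with minimum 0 at (1, -2)."""
    return (v[0] - 1.0) ** 2 + (v[1] + 2.0) ** 2


toy_bounds = [(-5.0, 5.0), (-5.0, 5.0)]
solution = run_ga(toy_cost, toy_bounds)
print(solution, toy_cost(solution))
```

For the transportation models, the cost function would be the defuzzified objective, and infeasible offspring would have to be repaired or rejected, which this sketch does not do.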

4 Numerical Experiments

To demonstrate the relevance and utility of the proposed models, a numerical
illustration with three sources, three destinations and three conveyances is
considered. The models described above are solved with the GA as a cost minimization
solid transportation problem (Tables 1, 2, 3 and 4).

Table 1 Gaussian T2 fuzzy unit transportation costs

Product c(11) c(12) c(21) c(22)
(10, 1.0; 0.8, 1.8) (12, 1.2; 0.9, 1.5) (9, 1.0; 1.1, 1.5) (11, 1.0; 1.1, 1.5)

Table 2 Solid transportation problem parameters

i Source j Demand j Max impurity received
1 (30, 1.5; 0.8, 1.0) 1 (23, 2.1; 0.5, 0.8) 1 (28, 2.1; 1.2, 1.6)
2 (31, 1.2; 0.1, 1.0) 2 (21, 1.1; 0.5, 0.8) 2 (35, 2.1; 1.0, 1.6)

Table 3 Value of hij

Product h(11) h(12) h(21) h(22)
(4.5, 1.0; 0.8, 1.8) (3.1, 1.2; 0.9, 1.5) (2.23, 1.0; 1.1, 1.5) (5.23, 1.0; 1.1, 1.5)

Table 4 Value of dij and impurity

Distance Unknown location Impurity
d11 = 3.17, d12 = 0.1 x1 = 5.29, x2 = 4.2 f1 = 7
d21 = 1.66, d22 = 1.64 y1 = 8.1, y2 = 9.2 f2 = 5

5 Discussion

The experiment yields different solutions for different credibility levels α. The
performance of the models is shown through the experimental results in Table 5.
The obtained results demonstrate the applicability and managerial insight of the
proposed scheme. The proposed algorithm is effective in searching for better
solutions, and we obtain Pareto optimal solutions for managerial decisions.

Table 5 Optimum results of the different models

α       Model   Amount    Transportation cost
0.95    1       40.189    202.337
        2       31.421    190.660
0.90    1       42.09     229.418
        2       32.75     192.658
0.85    1       43.189    236.972
        2       34.56     192.880
0.80    1       43.89     241.672
        2       32.63     190.186

Here, we observe that the cost in Model 1 is greater than the cost in Model 2 for
every α considered. The GA was used to obtain these results, which can vary with the
population size, the number of iterations, and the crossover and mutation settings.

6 Comparison with Earlier Work

Few developments have been reported on the STP with cost minimization; most work
has considered profit maximization. Here, we have investigated the problem from
the angle of cost minimization. These two approaches take opposite viewpoints, and
it is hard to compare them. The proposed scheme is a new development on the cost
minimization solid transportation problem in Gaussian type-2 fuzzy environments.
To the best of our knowledge, this is a new investigation in the field of
transportation problems.

7 Conclusions and Future Scope

A new cost minimization STP is considered in which most parameters are Gaussian
type-2 fuzzy variables. Here, these parameters are supply, demand, production cost,
transport cost and impurity at the demand point. The GA has been used to solve the
proposed models and achieves good results. The main contributions are listed
below:

• This is a new attempt at the STP with cost minimization.
• Gaussian type-2 fuzzy variables have been used to obtain a result that is more
precise than with type-1.
• A new concept has been developed using these models. It can be extended with
time minimization, budget constraints, damaged items, price discounts, festival
offers, etc.
• This model can be solved in different environments such as rough, fuzzy rough
and intuitionistic fuzzy environments.

References

1. Zadeh, L.A.: Fuzzy sets. Inf. Control. 8, 338–353 (1965)

2. Zimmermann, H.J.: Fuzzy programming and linear programming with several objective
functions. Fuzzy Sets Syst. 1, 45–55 (1978)
3. Hitchcock, F.L.: The distribution of a product from several sources to numerous localities. J.
Math. Phys. 20(1), 224–230 (1941)
4. Schell, E.: Distribution of a product by several properties. In: Directorate of Management
Analysis. Proceedings of the Second Symposium in Linear Programming, vol. 2 (1955)

5. Haley, K.B.: New methods in mathematical programming-The solid transportation problem.
Oper. Res. 10(4), 448–463 (1962)
6. Jimenez, F., Verdegay, J.L.: Solving fuzzy solid transportation problems by an evolutionary
algorithm based parametric approach. Eur. J. Oper. Res. 117, 485–510 (1999)
7. Yang, L., Liu, L.: Fuzzy fixed charge solid transportation problem and algorithm. Appl. Soft
Comput. 7, 879–889 (2007)
8. Liu, P., Yang, L., Wang, L., Li, S.: A solid transportation problem with type-2 fuzzy variables.
Appl. Soft Comput. 24, 543–558 (2014)
9. Kocken, H.G., Sivri, M.: A simple parametric method to generate all optimal solutions of fuzzy
solid transportation problem. Appl. Math. Model. 40(7–8), 4612–4624 (2016)
10. Ojha, A., Das, B., Mondal, S.K., Maiti, M.: A multi-item transportation problem with fuzzy
tolerance. Appl. Soft Comput. 13(8), 3703–3712 (2013)
11. Kundu, P., Kar, S., Maiti, M.: A fixed charge transportation problem with type-2 fuzzy variables.
Inf. Sci. 255, 170–186 (2014)
12. Giri, P.K., Maiti, M.K., Maiti, M.: Fully fuzzy fixed charge multi-item solid transportation
problem. Appl. Soft Comput. 27, 77–91 (2015)
13. Dubois, D., Prade, H.: Fuzzy Sets and Systems: Theory and Applications. Academic Press,
New York (1980)
14. Mizumoto, M., Tanaka, K.: Fuzzy sets of type-2 under algebraic product and algebraic sum.
Fuzzy Sets Syst. 5(3), 277–280 (1981)
15. Gray, P.: Exact solution of the fixed charge transportation problem. Oper. Res. 19(6), 1529–1538
(1971)
16. Bit, A.K., Biswal, M.P., Alam, S.S.: Fuzzy programming approach to multiobjective solid
transportation problem. Fuzzy Sets Sys. 57(2), 183–194 (1993)
17. Greenfield, S., Chiclana, F., John, R.I., Coupland, S.: The sampling method of defuzzification
for type-2 fuzzy sets: experimental evaluation. Inf. Sci. 189, 77–92 (2012)
Chapter 11
Complements to Voronovskaya’s
Formula

Margareta Heilmann, Fadel Nasaireh and Ioan Raşa

Abstract We generalize some known results concerning Voronovskaya-type
formulas for the composition of two linear operators acting on an arbitrary Banach
space.

Keywords Voronovskaya-type formula · Composition of operators · Bernstein
operator · Inverse of Bernstein operator

MSC (2010): 41A36 · 41A35

1 Introduction

Voronovskaya-type formulas are usually established for sequences of positive linear
operators. They are important tools in approximation theory, used in order to
investigate the rate of convergence and saturation properties. The classical
Voronovskaya formula is related to the Bernstein operators $B_n : C[0, 1] \to C[0, 1]$,

\[
B_n f(x) = \sum_{k=0}^{n} \binom{n}{k} x^k (1 - x)^{n-k} f\left(\frac{k}{n}\right), \quad f \in C[0, 1],\ x \in [0, 1],
\]

M. Heilmann (B)
School of Mathematics and Natural Sciences, University of Wuppertal,
Gaußstraße 20, 42119 Wuppertal, Germany
e-mail: [email protected]
F. Nasaireh · I. Raşa
Department of Mathematics, Technical University,
Str. Memorandumului 28, 400114 Cluj-Napoca, Romania
e-mail: [email protected]
I. Raşa
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2018

and reads as follows:

\[
\lim_{n \to \infty} n \left(B_n f(x) - f(x)\right) = K f(x),
\]

where $K f(x) = \frac{x(1 - x)}{2} f''(x)$, $f \in C^2[0, 1]$, $x \in [0, 1]$, the convergence
being uniform on $[0, 1]$.
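The classical formula can be checked numerically. The short Python sketch below (the function name is hypothetical, and `math.comb` requires Python 3.8 or later) evaluates $n(B_n f(x) - f(x))$ for $f(t) = t^2$. In this case $B_n f(x) = x^2 + x(1 - x)/n$, so the expression equals $x(1 - x) = K f(x)$ for every $n$, up to rounding.

```python
from math import comb


def bernstein(f, n, x):
    """Value of the Bernstein operator B_n f at the point x in [0, 1]."""
    return sum(comb(n, k) * x**k * (1 - x) ** (n - k) * f(k / n) for k in range(n + 1))


def square(t):
    return t * t


x = 0.3
limit = x * (1 - x) / 2 * 2.0  # K f(x) with f''(x) = 2
for n in (10, 50, 200):
    # Each printed value should agree with `limit` up to rounding error.
    print(n, n * (bernstein(square, n, x) - square(x)), limit)
```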
If f has continuous derivatives of higher degree, the above Voronovskaya formula
can be extended; see, e.g., [4] and the references therein.
The investigation of Voronovskaya-type formulas for the composition of two
linear operators Pn and Qn , acting on an arbitrary Banach space X , was initiated in
[4].
The main aim of this article is to generalize the results of [4]. This is done in Sect. 2.
In particular, the case when $P_n Q_n$ acts as the identity on some linear subspace of
$X$ was investigated in [4]. We are concerned with this case in Sects. 3 and 4, where
the operators $B_n$ and $B_n^{-1}$ are considered, acting on polynomials.
In Sects. 3 and 4, we use the eigenstructure of the operators Bn . Similar results
can be obtained for several other operators, for which the eigenstructure is known;
this will be done elsewhere.
A problem is mentioned in Sect. 4.

2 Voronovskaya’s Formula for Composition of Operators

Let X be a Banach space. For a given m ∈ N, consider the linear subspaces Ym ⊆

Ym−1 ⊆ · · · ⊆ Y1 ⊆ Z ⊆ Y0 = X .
Let Pn : X −→ X , Qn : Z −→ X , n ∈ N, be linear operators. Suppose that each
operator Pn is bounded and

\[ \lim_{n\to\infty} P_n x = x, \quad x \in X. \tag{1} \]

For $l = 0, 1, \dots, m$, consider the linear operators $K_l : Y_l \to X$, $L_l : Y_l \to X$. Suppose that

\[ K_0 y = L_0 y = y, \quad y \in Y_0, \tag{2} \]

and for all $0 \le i \le l \le m$,

\[ L_i y \in Y_{l-i}, \quad y \in Y_m. \tag{3} \]

Moreover, assume that for $l = 1, 2, \dots, m$,

\[ \lim_{n\to\infty} \Bigl( n^l (P_n y - y) - \sum_{i=1}^{l-1} n^{l-i} K_i y \Bigr) = K_l y, \quad y \in Y_l, \tag{4} \]

\[ \lim_{n\to\infty} \Bigl( n^l (Q_n y - y) - \sum_{i=1}^{l-1} n^{l-i} L_i y \Bigr) = L_l y, \quad y \in Y_l. \tag{5} \]

Theorem 1 Under the above assumptions,

\[ \lim_{n\to\infty} \Bigl( n^m (P_n Q_n y - y) - \sum_{l=1}^{m-1} n^{m-l} \sum_{i=0}^{l} K_{l-i} L_i y \Bigr) = \sum_{i=0}^{m} K_{m-i} L_i y, \quad y \in Y_m. \tag{6} \]

Proof Let y ∈ Ym . Then,

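\[ n^m (P_n Q_n y - y) - \sum_{l=1}^{m-1} n^{m-l} \sum_{i=0}^{l} K_{l-i} L_i y = P_n L_m y + u_n + v_n + w_n, \tag{7} \]

where

\[ u_n := n^m (P_n y - y) - \sum_{i=1}^{m-1} n^{m-i} K_i y, \]

\[ v_n := \sum_{l=1}^{m-1} \Bigl( n^l (P_n L_{m-l} y - L_{m-l} y) - \sum_{i=1}^{l-1} n^{l-i} K_i L_{m-l} y \Bigr), \]

\[ w_n := P_n \Bigl( n^m (Q_n y - y) - \sum_{i=1}^{m-1} n^{m-i} L_i y - L_m y \Bigr). \]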

According to (1),

\[ \lim_{n\to\infty} P_n L_m y = L_m y. \tag{8} \]

Using (4) with $l = m$, we get

\[ \lim_{n\to\infty} u_n = K_m y. \tag{9} \]

Moreover, (4) yields

\[ \lim_{n\to\infty} v_n = \sum_{l=1}^{m-1} K_l L_{m-l} y. \tag{10} \]

By using (1) and the Banach–Steinhaus theorem, we infer that the sequence $(P_n)_{n\ge 1}$
is bounded; i.e., there exists $M > 0$ such that $\|P_n\| \le M$, $n \ge 1$. Therefore, we have

\[ \|w_n\| \le M \Bigl\| n^m (Q_n y - y) - \sum_{i=1}^{m-1} n^{m-i} L_i y - L_m y \Bigr\|. \]

From (5) with $l = m$, we infer that

\[ \lim_{n\to\infty} w_n = 0. \tag{11} \]

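Now (7), (8), (9), (10), and (11) show that

\[ \begin{aligned}
\lim_{n\to\infty} \Bigl( n^m (P_n Q_n y - y) - \sum_{l=1}^{m-1} n^{m-l} \sum_{i=0}^{l} K_{l-i} L_i y \Bigr)
&= L_m y + K_m y + \sum_{l=1}^{m-1} K_l L_{m-l} y \\
&= \sum_{l=0}^{m} K_l L_{m-l} y,
\end{aligned} \]

and this concludes the proof.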

Corollary 1 Let $y \in Y_m$ such that $P_n Q_n y = y$. Then,

\[ \sum_{i=0}^{l} K_{l-i} L_i y = 0, \quad l = 1, 2, \dots, m. \tag{12} \]

Proof If $P_n Q_n y = y$, (6) yields

\[ \lim_{n\to\infty} \sum_{l=1}^{m-1} n^{m-l} \sum_{i=0}^{l} K_{l-i} L_i y = - \sum_{i=0}^{m} K_{m-i} L_i y. \]

This entails (12).
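As a small worked instance of (12), written out here for illustration and not taken from the paper, recall that $K_0$ and $L_0$ act as the identity by (2). Assuming $m \ge 2$, the cases $l = 1$ and $l = 2$ read:

```latex
K_1 y + L_1 y = 0, \qquad
K_2 y + K_1 L_1 y + L_2 y = 0 .
```

So whenever $P_n Q_n y = y$, the first-order terms cancel ($L_1 y = -K_1 y$), and by linearity of $K_1$ the second relation gives $L_2 y = K_1^2 y - K_2 y$.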

Remark 1 For m ∈ {1, 2, 3}, Theorem 1 and Corollary 1 were proved in [4, Theorem
2.1]. Several examples and applications can be found in [3, 4].

3 The Operator Bn

Let $B_n : P_m \to P_m$, $n \ge m$, be the classical Bernstein operator. It is known that

\[ B_n p = p + \sum_{i=1}^{l} n^{-i} K_i p + o(n^{-l}), \quad l = 1, 2, \dots, m-1, \tag{13} \]

see [1], where the operators $K_i$ are described. On the other hand (see [2, (4.23)]),

\[ B_n p = \sum_{j=0}^{m} \lambda_j^{(n)} p_j^{(n)} \mu_j^{(n)}(p), \quad p \in P_m, \]

where $\lambda_j^{(n)}$ are the eigenvalues of $B_n$, $p_j^{(n)}$ the eigenpolynomials, and $\mu_j^{(n)}$ the dual
functionals. Hence,

\[ p = \sum_{j=0}^{m} p_j^{(n)} \mu_j^{(n)}(p), \tag{14} \]

and

\[ \lambda_0^{(n)} = \lambda_1^{(n)} = 1, \quad \lambda_j^{(n)} = \frac{n!}{(n-j)!\, n^j} = 1 + \sum_{i=1}^{j-1} \frac{s(j, j-i)}{n^i}, \quad j \ge 2, \]

where $s(m, l)$ denote the Stirling numbers of the first kind, defined by
$x(x-1)\cdots(x-m+1) = \sum_{l=0}^{m} s(m,l)\, x^l$. Therefore,

\[ \begin{aligned}
B_n p - p &= \sum_{j=2}^{m} \sum_{i=1}^{j-1} \frac{s(j, j-i)}{n^i}\, p_j^{(n)} \mu_j^{(n)}(p) \\
&= \sum_{i=1}^{m-1} \frac{1}{n^i} \sum_{j=i+1}^{m} s(j, j-i)\, p_j^{(n)} \mu_j^{(n)}(p).
\end{aligned} \]

Hence, for $l = 1, 2, \dots, m-1$,

\[ \begin{aligned}
B_n p = p &+ \sum_{i=1}^{l} \frac{1}{n^i} \sum_{j=i+1}^{m} s(j, j-i)\, p_j^{(n)} \mu_j^{(n)}(p) \\
&+ \sum_{i=l+1}^{m-1} \frac{1}{n^i} \sum_{j=i+1}^{m} s(j, j-i)\, p_j^{(n)} \mu_j^{(n)}(p).
\end{aligned} \tag{15} \]

Denote $K_{ni}\, p := \sum_{j=i+1}^{m} s(j, j-i)\, p_j^{(n)} \mu_j^{(n)}(p)$, and remark that
$\sum_{i=l+1}^{m-1} \frac{1}{n^i} \sum_{j=i+1}^{m} s(j, j-i)\, p_j^{(n)} \mu_j^{(n)}(p) = o(n^{-l})$, since according to [2], the
sequences $(p_j^{(n)})_{n\ge 0}$ and $(\mu_j^{(n)}(p))_{n\ge 0}$ are convergent to $p_j^*$ and $\mu_j^*(p)$, respectively.
Thus, we have
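Theorem 2 For each $p \in P_m$,

\[ B_n p = p + \sum_{i=1}^{l} \frac{1}{n^i} K_{ni}\, p + o(n^{-l}), \quad l = 1, 2, \dots, m-1, \tag{16} \]

where

\[ K_{ni}\, p \longrightarrow \sum_{j=i+1}^{m} s(j, j-i)\, p_j^* \mu_j^*(p) =: \widetilde{K}_i p, \quad i = 1, \dots, l. \tag{17} \]

Moreover, $\widetilde{K}_i = K_i$, $i = 1, \dots, l$.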

Proof (16) and (17) are consequences of (15) and the above remarks. From (13) and
(16), we infer that

\[ \sum_{i=1}^{l} n^{-i} (K_{ni}\, p - K_i p) = o(n^{-l}), \quad l = 1, \dots, m-1. \]

This entails $\lim_{n\to\infty} K_{ni}\, p = K_i p$, i.e., $\widetilde{K}_i = K_i$, $i = 1, \dots, l$.
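The eigenvalue expansion used in this section can be verified with exact rational arithmetic. The sketch below is an illustration only: the helper names `stirling_first`, `eigenvalue_closed`, and `eigenvalue_series` are assumptions, not notation from the paper. It builds the signed Stirling numbers of the first kind from their defining product and compares $\frac{n!}{(n-j)!\,n^j}$ with $1 + \sum_{i=1}^{j-1} s(j, j-i)\, n^{-i}$.

```python
from fractions import Fraction
from math import factorial

def stirling_first(m):
    """Signed Stirling numbers of the first kind s(m, 0), ..., s(m, m):
    the coefficients of x(x-1)...(x-m+1) in increasing powers of x."""
    coeffs = [1]                          # the empty product
    for r in range(m):                    # multiply the polynomial by (x - r)
        nxt = [0] * (len(coeffs) + 1)
        for power, c in enumerate(coeffs):
            nxt[power + 1] += c           # contribution of c * x
            nxt[power] -= r * c           # contribution of c * (-r)
        coeffs = nxt
    return coeffs

def eigenvalue_closed(n, j):
    """lambda_j^(n) = n! / ((n - j)! n^j), valid for 0 <= j <= n."""
    return Fraction(factorial(n), factorial(n - j) * n**j)

def eigenvalue_series(n, j):
    """1 + sum_{i=1}^{j-1} s(j, j - i) / n^i."""
    s = stirling_first(j)
    return 1 + sum(Fraction(s[j - i], n**i) for i in range(1, j))

print(eigenvalue_closed(5, 3), eigenvalue_series(5, 3))
```

Both functions return `Fraction` values, so the comparison is exact and not affected by rounding.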

4 The Operator $B_n^{-1}$

In the spirit of Sect. 3, consider $B_n^{-1} : P_m \to P_m$, $n \ge m$.

From (14), we get

\[ B_n^{-1} p = \sum_{j=0}^{m} \frac{1}{\lambda_j^{(n)}}\, p_j^{(n)} \mu_j^{(n)}(p). \]

We have

\[ \frac{1}{\lambda_0^{(n)}} = \frac{1}{\lambda_1^{(n)}} = 1, \quad \frac{1}{\lambda_j^{(n)}} = 1 + \sum_{i=1}^{j-1} a_{ji}\, \frac{(n-i-1)!}{(n-1)!}, \quad j \ge 2, \tag{18} \]

where the $a_{ji}$ can be written in terms of a forward difference of order $j - i - 1$, i.e.,

\[ a_{ji} = \frac{1}{(j-i-1)!}\, \Delta^{j-i-1} i^{\,j-1}. \tag{19} \]

To prove (19), we consider the Newton form of the interpolation polynomial of
order $j - 1$ for the monomial $x^{j-1}$ with respect to the equidistant knots $j-1, j-2, \dots, 1, 0$,
evaluated at $x = n$. Thus,

\[ n^{j-1} = \sum_{i=0}^{j-1} \Bigl( \prod_{l=i+1}^{j-1} (n - l) \Bigr) \cdot \frac{1}{(j-i-1)!}\, \Delta^{j-i-1} i^{\,j-1}. \]

Multiplying the equation by $\frac{(n-j)!}{(n-1)!}$ leads to

\[ \begin{aligned}
\frac{1}{\lambda_j^{(n)}} = \frac{(n-j)!\, n^j}{n!}
&= \sum_{i=0}^{j-1} \frac{(n-i-1)!}{(n-1)!} \cdot \frac{1}{(j-i-1)!}\, \Delta^{j-i-1} i^{\,j-1} \\
&= 1 + \sum_{i=1}^{j-1} \frac{(n-i-1)!}{(n-1)!} \cdot \frac{1}{(j-i-1)!}\, \Delta^{j-i-1} i^{\,j-1}.
\end{aligned} \]

This proves (19).
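Formulas (18) and (19) can also be checked exactly for small $n$ and $j$. The following sketch is illustrative; the function names are assumptions introduced here. It evaluates the forward difference directly from its binomial expansion and compares both sides of (18).

```python
from fractions import Fraction
from math import comb, factorial

def forward_difference(order, start, power):
    """Delta^order applied to t -> t**power, evaluated at t = start."""
    return sum((-1) ** (order - k) * comb(order, k) * (start + k) ** power
               for k in range(order + 1))

def a(j, i):
    """Coefficient a_ji from (19)."""
    order = j - i - 1
    return Fraction(forward_difference(order, i, j - 1), factorial(order))

def inverse_eigenvalue_closed(n, j):
    """1 / lambda_j^(n) = (n - j)! n^j / n!, valid for 2 <= j <= n."""
    return Fraction(factorial(n - j) * n**j, factorial(n))

def inverse_eigenvalue_series(n, j):
    """Right-hand side of (18)."""
    return 1 + sum(a(j, i) * Fraction(factorial(n - i - 1), factorial(n - 1))
                   for i in range(1, j))

print(inverse_eigenvalue_closed(6, 3), inverse_eigenvalue_series(6, 3))
```

For $j = 2$ the sum has the single term $a_{21} = 1$, which reproduces $1/\lambda_2^{(n)} = 1 + \frac{1}{n-1} = \frac{n}{n-1}$.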

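Consequently,

\[ \begin{aligned}
B_n^{-1} p - p &= \sum_{j=2}^{m} \sum_{i=1}^{j-1} a_{ji}\, \frac{(n-i-1)!}{(n-1)!}\, p_j^{(n)} \mu_j^{(n)}(p) \\
&= \sum_{i=1}^{m-1} \frac{(n-i-1)!}{(n-1)!} \sum_{j=i+1}^{m} a_{ji}\, p_j^{(n)} \mu_j^{(n)}(p).
\end{aligned} \]

Hence, for $l = 1, 2, \dots, m-1$,

\[ \begin{aligned}
B_n^{-1} p = p &+ \sum_{i=1}^{l} \frac{(n-i-1)!}{(n-1)!} \sum_{j=i+1}^{m} a_{ji}\, p_j^{(n)} \mu_j^{(n)}(p) \\
&+ \sum_{i=l+1}^{m-1} \frac{(n-i-1)!}{(n-1)!} \sum_{j=i+1}^{m} a_{ji}\, p_j^{(n)} \mu_j^{(n)}(p).
\end{aligned} \]

Denote $L_{ni}\, p := \sum_{j=i+1}^{m} a_{ji}\, p_j^{(n)} \mu_j^{(n)}(p)$, and remark that
$\sum_{i=l+1}^{m-1} \frac{(n-i-1)!}{(n-1)!} \sum_{j=i+1}^{m} a_{ji}\, p_j^{(n)} \mu_j^{(n)}(p) = o(n^{-l})$. Thus, we have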

Theorem 3 For each $p \in P_m$,

\[ B_n^{-1} p = p + \sum_{i=1}^{l} \frac{(n-i-1)!}{(n-1)!}\, L_{ni}\, p + o(n^{-l}), \quad l = 1, 2, \dots, m-1, \tag{20} \]

where

\[ L_{ni}\, p \longrightarrow \sum_{j=i+1}^{m} a_{ji}\, p_j^* \mu_j^*(p) =: \widetilde{L}_i p, \quad i = 1, \dots, l. \]

Problem 1 Taking into account (16) and (20), is there a relation connecting the
operators $\widetilde{K}_i$ and $\widetilde{L}_i$, similar to (12)?

References

1. Abel, U., Ivan, M.: Asymptotic expansion of the multivariate Bernstein polynomials on a simplex.
Approx. Theory Appl. 16, 85–93 (2000)
2. Cooper, Sh., Waldron, Sh.: The Eigenstructure of the Bernstein operator. J. Approx. Theory 105,
133–165 (2000)
3. Gonska, H., Heilmann, M., Lupaş, A., Raşa, I.: On the composition and decomposition of positive
linear operators III: A non-trivial decomposition of the Bernstein operator. http://arxiv.org/abs/1204.2723
(Aug 30, 2012), pp. 1–28
4. Nasaireh, F., Raşa, I.: Another look at Voronovskaja type formulas, J. Math. Inequal. 12(1),
95–105 (2018)
Chapter 12
Mathematics and Machine Learning

Srinivas Pyda and Srinivas Kareenhalli

Abstract Machine learning is a branch of computer science that gives computers
the ability to make predictions without being explicitly programmed. Machine learning
enables computers to learn as they process more and more data, and so to make
increasingly accurate predictions. Machine learning is becoming all-pervasive in our daily
lives, from speech recognition, medical diagnosis, customized content delivery, and
product recommendations to advertisement placements, to name a few. Knowingly
or unknowingly, there is a very high chance that one has encountered some
form of machine learning several times in one’s daily activities. In cloud data centers,
machine learning presents an opportunity to make systems autonomous and
thus to transform data centers into ones that are less error prone, more secure, self-tuning,
and highly available. Mathematics forms the bedrock of machine learning. This
paper aims at highlighting the concepts in mathematics that are essential for building
machine learning systems. Topics in mathematics like linear algebra, probability
theory and statistics, multivariate calculus, partial derivatives, and algorithmic
optimization are essential to implementing efficient machine learning systems.
This paper will delve into a few of the aforementioned areas to bring out core concepts
necessary for machine learning. Principal component analysis, matrix computation,
and gradient descent algorithms are among the topics covered in this
paper. This paper attempts to give the reader a panoramic view of the mathematical
landscape of machine learning.

Keywords Eigenvalues · Machine learning · Partial differential equations · Linear algebra

S. Pyda (B)
Oracle America, Redwood Shores, CA 94065, USA
e-mail: [email protected]
S. Kareenhalli
Oracle India, Bengaluru, India
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2018 135

1 Introduction

We are seeing a huge explosion in the data that is being generated from online content,
from mobile data like messages, photos, and videos, and from the Internet of
Things (IoT). The online logs generated from Web servers, user histories, operating
system logs, and logs in databases are growing in size by leaps and bounds. This
presents a great opportunity to analyze the data from these sources to save
costs and serve businesses and consumers better. In large enterprises, data gathering
used to be defensive in nature, mainly for ensuring regulatory compliance.
The role of data is changing: it is becoming the center of innovation.
According to a Gartner report [1], the number of connected “things” is estimated
to be 20 billion by the year 2020. These include but are not limited to IoT from
automotive systems, health monitoring devices, and smart meters to name a few.
In data centers, huge amounts of log data are being generated. This data can
be analyzed to improve services, reduce costs, and make services more secure and
available.
Processing this huge explosion of data using static analytics and algorithms will be
both inefficient and impractical. The need of the hour is computers that can make
predictions without being explicitly programmed. This is where machine learning
comes in: it enables systems to automatically detect patterns in the data and make
intelligent predictions. Machine learning algorithms should be capable of learning
and of enhancing the accuracy of their predictions as more and more data are processed. In
cloud data centers, machine learning can be used to make systems autonomous: such systems
can predict usage patterns and allocate resources accordingly, and they can detect intrusions and
take action to restrict damage or data loss. Machine learning can also be used to tune
systems to keep them performing optimally.
This paper will delve into the mathematics behind machine learning and cover
two areas in greater detail. The last section will cover the future of machine learning,
especially in the cloud and database services.

2 Areas of Mathematics in Machine Learning

2.1 Recommender Systems (Supervised Learning)

Supervised learning is a class of machine learning methods where there is a labeled
dataset to learn from. The machine learning algorithms learn from the data and
predict values for a new data item. Some examples of such systems are a housing price
predictor and a stock value predictor.
Suppose there are n different parameters in a dataset, representing the features
of a house (like area of the house, area of the lot, location of the house, number of
bedrooms). Let us denote them as $x_i$ ($i = 1, \dots, n$). Let the outcome be the price of the
house, $y_i$ ($i = 1, \dots, m$), taking on continuous values, where m is the number of data items in the
dataset. The machine learning system can learn from this dataset and predict the price
of a new house on the market based on the features of the house.
A popular machine learning method is the linear regression model. This technique
learns a set of weights for the features in the dataset
and uses them as a regression model. The weights for the features are learnt using an
iterative algorithm called gradient descent.
An initial set of weights is assigned to the features, and the error of the resulting
predictions with respect to the actual output values is computed. The algorithm then
tries to minimize the error by iteratively changing the values of the weights until the
algorithm converges. Convergence is achieved when the error values no longer
change significantly. Let the weights be denoted $\theta_0, \theta_1, \dots, \theta_n$; the
hypothesis is represented as follows:

\[ h(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n \]

The error due to these weights is expressed as a cost function of the weights and
is calculated as

\[ C(\theta) = \frac{1}{2} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right)^2 \]

The goal of the algorithm is to minimize $C(\theta)$ so that the error of prediction is
minimal. This is achieved by adjusting the weights incrementally using the partial
derivative of the cost function with respect to the weights. This is done iteratively
till the cost function is minimized.
This machine learning technique is covered in greater detail in Sect. 3 of this
paper.

2.2 Classifier Systems (Unsupervised Learning)

Unlike in the previous section, where the dataset had an output label associated with it,
there are datasets which do not have any output label associated with them. Building
machine learning algorithms to classify or discover underlying patterns in this kind
of data is called unsupervised learning.
Consider a collection of articles (say Web sites or Web articles). How do we classify
these articles based on “similarity”? For instance, we may want to classify the articles based on
interests like “sports,” “entertainment,” “politics,” “news,” and “art.”
One of the most popular machine learning algorithms for clustering is the k-means
clustering algorithm. This algorithm involves computing the similarity (distance)
between each document and the cluster centers and assigning the document to the
nearest cluster center. To compute the distance or
similarity between documents, the collection of articles/documents first needs to
be converted to a vector notation called the Vector Space Model [2]. A common weighting
is tf-idf. A document is first tokenized into a set of terms, and tf-idf is defined as
follows:
• TF: Term frequency is the frequency of the term in the document. Since
documents can be of different lengths, the term frequency is divided by the number of
terms in the document as a way of normalization.

\[ \mathrm{tf}(t, d) = \frac{\text{frequency of term } t \text{ in the document}}{\text{total terms in the document}} \]

• IDF: Inverse document frequency is the logarithm of the ratio of the total number of
documents to the number of documents containing term t (the 1 added to the
denominator avoids division by zero).

\[ \mathrm{idf}(t) = \ln \frac{\text{Total number of documents}}{1 + \text{number of documents with term } t} \]

\[ \text{tf-idf}(t) = \mathrm{tf}(t, d) * \mathrm{idf}(t) \]

Each document can be represented as a vector of tf-idf(t) weights: the axes are the
terms, and each document is a vector in that space. By length-normalizing each
vector to a unit vector, the weights of documents of varying lengths become
comparable.
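The tf-idf definitions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: documents are already tokenized into lists of terms, the four-document corpus is made up, and the helper names (`tf`, `idf`, `tf_idf_vector`, `normalize`) are hypothetical.

```python
import math
from collections import Counter

def tf(term, doc_tokens):
    """Term frequency: occurrences of term divided by the document length."""
    return Counter(doc_tokens)[term] / len(doc_tokens)

def idf(term, corpus):
    """Inverse document frequency with the 1 + count smoothing used in the text."""
    containing = sum(1 for doc in corpus if term in doc)
    return math.log(len(corpus) / (1 + containing))

def tf_idf_vector(doc_tokens, corpus, vocabulary):
    """One tf-idf weight per vocabulary term (the axes of the vector space)."""
    return [tf(t, doc_tokens) * idf(t, corpus) for t in vocabulary]

def normalize(vec):
    """Scale a vector to unit length so documents of any length are comparable."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec

# Made-up toy corpus of tokenized documents.
docs = [["goal", "goal", "team", "wins"],
        ["election", "vote", "team"],
        ["film", "award", "vote"],
        ["film", "music"]]
vocabulary = sorted({t for d in docs for t in d})
unit_vector = normalize(tf_idf_vector(docs[0], docs, vocabulary))
```

With this smoothing, a term that occurs in almost every document can receive a zero or negative weight, which is a known property of this particular idf variant.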
Cluster centers are initially assigned at random, and data items are assigned to the closest
cluster center using a distance measurement (Euclidean distance or cosine distance).
The cluster centers are then re-evaluated and the data items reassigned. This process is repeated iteratively
till convergence.
Euclidean distance between two vectors is defined as

\[ d(u, v) = \|u - v\| = \sqrt{(u_1 - v_1)^2 + (u_2 - v_2)^2 + \cdots + (u_n - v_n)^2} \]

Cosine similarity between two vectors is defined as

\[ \mathrm{cosine}(u, v) = \frac{\sum_{i=1}^{n} u_i v_i}{\sqrt{\sum_{i=1}^{n} u_i^2}\, \sqrt{\sum_{i=1}^{n} v_i^2}} \]

Randomly initialize k cluster centroids μ_1, μ_2, μ_3, …, μ_k

repeat till convergence {

    for i := 1..m
        c_i := index (from 1 to k) of the cluster
               centroid closest to x_i,
               closeness measured by Euclidean
               distance
    for j := 1..k
        μ_j := mean of the data points
               assigned to cluster j
}
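The loop above can be turned into a short runnable sketch. The version below makes a few assumptions that the text does not specify: centroids are initialized by sampling k distinct data points, an empty cluster keeps its previous centroid, and convergence means that the assignments stop changing. The names `euclidean` and `k_means` are illustrative.

```python
import math
import random

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def k_means(points, k, max_iters=100, seed=0):
    """Return (centroids, assignment) for the given points."""
    rng = random.Random(seed)
    # Assumption: initialize the centroids from k distinct data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignment = None
    for _ in range(max_iters):
        # Assignment step: index of the closest centroid for every point.
        new_assignment = [min(range(k), key=lambda c: euclidean(p, centroids[c]))
                          for p in points]
        if new_assignment == assignment:   # nothing moved: converged
            break
        assignment = new_assignment
        # Update step: each centroid becomes the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:                    # an empty cluster keeps its centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment

points = [(0, 0), (0, 1), (1, 0), (100, 100), (100, 101), (101, 100)]
centroids, assignment = k_means(points, k=2)
```

On well-separated data such as this toy example, the two groups end up in different clusters; in general, k-means only finds a local optimum, and the result depends on the initialization.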

Some more applications that can use k-means clustering algorithm:

• Clustering similar images, e.g., cluster by type of images (flower, ocean, sunset,
clouds, cars, etc.)
• Use clustering to structure Web query results.
• Cluster product categories based on user buying patterns.
• Cluster similar neighborhoods based on real estate, crime, or other criteria for
better forecasting.
• Customer or market segmentation based on geography, demography, price,
lifestyle, etc.
• Cluster population based on medical condition.

2.3 Anomaly Detection

Anomaly detection refers to the identification of items or events that are anomalous
with respect to the other items present in the dataset. Anomaly detection techniques can be
used to detect abnormal brain scans, to distinguish cancerous cells from healthy cells, to detect
fraud and intrusions, and to detect structural and manufacturing defects. Anomaly
detection can be used to provide high availability of IT systems in data centers by
recognizing anomalies and passing them downstream for resolution.
Anomaly detection is being used by databases (Oracle) [3] to detect anomalous
intrusions and detect performance events to provide better availability and automated
performance tuning of systems.
Consider an application in which anomalies are to be detected in
systems in a datacenter that are behaving abnormally. Let us consider n features $x_i$ ($i = 1, \dots, n$) like $x_1$ =
CPU used by a system, $x_2$ = number of disk requests, $x_3$ = memory usage, $x_4$ =
swap usage, $x_5$ = network usage, $x_6$ = device interrupts, etc. Let the dataset contain m data
items. Each feature can be treated as a Gaussian distribution as follows:

\[ x_1 \sim \mathcal{N}(\mu_1, \sigma_1^2),\quad x_2 \sim \mathcal{N}(\mu_2, \sigma_2^2),\quad \dots,\quad x_n \sim \mathcal{N}(\mu_n, \sigma_n^2) \]

where $\mu_i$ is the mean and $\sigma_i^2$ is the variance of feature $x_i$. The probability density of $x_i$ is
given by the Gaussian distribution (normal distribution) as follows:

\[ p(x_i; \mu_i, \sigma_i^2) = \frac{1}{\sqrt{2\pi}\, \sigma_i} \exp\!\left( -\frac{(x_i - \mu_i)^2}{2\sigma_i^2} \right) \]

The curve of this probability distribution is a bell curve. The intuition is that
values near the mean, where the density is high, are likely to be non-anomalous, and as we
move along the axis away from the mean, the probability of a value being
non-anomalous becomes lower (i.e., such values are probably anomalous). On the given dataset,
we model the probability of the features, treated as independent, as follows:

\[ P(x) = p(x_1; \mu_1, \sigma_1^2)\, p(x_2; \mu_2, \sigma_2^2) \cdots p(x_n; \mu_n, \sigma_n^2) = \prod_{j=1}^{n} p(x_j; \mu_j, \sigma_j^2) \]

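If $P(x)$ is less than a small threshold value ε, we flag the data item x as
anomalous. The anomaly detection algorithm [4] can be formalized as follows:
1. Formalize a set of n parameters (features)
2. Estimate $\mu_1, \mu_2, \dots, \mu_n$ and $\sigma_1^2, \sigma_2^2, \dots, \sigma_n^2$ from the m data items as

\[ \mu_j = \frac{1}{m} \sum_{i=1}^{m} x_j^{(i)} \]

\[ \sigma_j^2 = \frac{1}{m} \sum_{i=1}^{m} \left( x_j^{(i)} - \mu_j \right)^2 \]

3. Compute $P(x)$ for a new data item x

\[ P(x) = \prod_{j=1}^{n} p(x_j; \mu_j, \sigma_j^2) = \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi}\, \sigma_j} \exp\!\left( -\frac{(x_j - \mu_j)^2}{2\sigma_j^2} \right) \]

4. If $P(x) < ε$, then x is an anomaly.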
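The four steps above can be sketched as follows. This is a minimal illustration, assuming each data item is a list of n numeric feature values with nonzero variance per feature; the function names and the toy measurements are hypothetical, and the threshold ε would normally be tuned on labeled examples.

```python
import math

def fit_gaussians(data):
    """Step 2: per-feature mean and variance over the m data items."""
    m, n = len(data), len(data[0])
    mu = [sum(row[j] for row in data) / m for j in range(n)]
    var = [sum((row[j] - mu[j]) ** 2 for row in data) / m for j in range(n)]
    return mu, var

def density(x, mu, var):
    """Step 3: product of the univariate Gaussian densities of the features."""
    p = 1.0
    for xj, mj, vj in zip(x, mu, var):
        p *= math.exp(-(xj - mj) ** 2 / (2 * vj)) / math.sqrt(2 * math.pi * vj)
    return p

def is_anomaly(x, mu, var, epsilon):
    """Step 4: flag x when its density falls below the threshold epsilon."""
    return density(x, mu, var) < epsilon

# Made-up measurements: [CPU used, number of disk requests] per system.
data = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0]]
mu, var = fit_gaussians(data)
print(is_anomaly([2.0, 12.0], mu, var, epsilon=1e-3))   # typical item
print(is_anomaly([20.0, 50.0], mu, var, epsilon=1e-3))  # far from the mean
```

Multiplying many small densities can underflow for large n; summing log-densities and comparing against log ε is a common alternative.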

If the features do not display a Gaussian distribution, a transformation function
(like $\log(x)$ or $x^{1/n}$) can be applied such that the transformed feature approximates a
Gaussian distribution.
If some of the features are correlated, then a multivariate Gaussian probability
distribution can be applied, where $P(x)$ is computed as

\[ P(x; \mu, \Sigma) = \frac{1}{(2\pi)^{n/2}\, |\Sigma|^{1/2}} \exp\!\left( -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right) \]

where μ is a vector of length n and Σ is a matrix of dimension n × n, defined as
follows for a dataset:

\[ \mu = \frac{1}{m} \sum_{i=1}^{m} x^{(i)} \]

\[ \Sigma = \frac{1}{m} \sum_{i=1}^{m} \left( x^{(i)} - \mu \right) \left( x^{(i)} - \mu \right)^T \]

If $P(x) < ε$, then the data item x is anomalous. The product of univariate Gaussian
distributions used earlier is a special case of the multivariate Gaussian distribution
in which the off-diagonal elements of Σ are 0.
The multivariate Gaussian distribution is a very useful tool in anomaly detection and is
widely used in machine learning. Anomaly detection is useful when there is a small
number of anomalous items compared to the total number of data items in the dataset.

2.4 Neural Networks

Neural networks [5] are one of the most popular machine learning techniques, used to model a
host of machine learning problems like image recognition and speech recognition.
Neural networks can model nonlinear decision boundaries more efficiently than
linear models like logistic regression. Logistic regression has to use higher-order
polynomial features to model nonlinear decision boundaries, adding to complexity.
Consider the dataset shown in Fig. 1, where the negative and positive examples are
shown for a two-feature dataset. Clearly the decision boundary here is nonlinear, and
modeling it using logistic regression would involve using higher-order polynomials.
This is where neural networks come in handy to model such nonlinear decision
boundaries.
Consider an example where we have a set of images and want to classify whether each
image is that of a street sign. The labeled training set consists of a set of
images with positive classification (images that are those of street signs) and a set
of images with negative classification (images that are not of a street sign).

Fig. 1 An example of
non-linear decision boundary

Fig. 2 An example of a typical L layered neural network

Neural network algorithms try to mimic the behavior of the brain. Neural networks
involve a series of layers through which the hypotheses are modeled. Figure 2 shows
a typical neural network model. This neural network would output a 1 for images
which it thinks are images of street signs and a 0 otherwise. This is a single-class
(binary) classification problem.
A typical neural network consists of an input layer (layer 1), which takes in the
input parameters and feeds into another layer called the intermediate or hidden layer.
The output of the hidden layer can feed into more hidden layers. The final layer
outputs the hypothesis. The nodes in the hidden layers are called hidden nodes or
activation nodes. Each activation node implements an activation function (the sigmoid
function g(z)). For example, in the above neural network, activation
node 1 in layer 2 would take on the value shown below

\[ a_1^{(2)} = g\!\left( w_{11}^{(1)} x_1 + w_{12}^{(1)} x_2 + \cdots + w_{1n}^{(1)} x_n \right) \]

The output of the last (output) layer would be:

\[ h(x) = a_1^{(L+1)} = g\!\left( w_{11}^{(L)} a_1^{(L)} + w_{12}^{(L)} a_2^{(L)} + w_{13}^{(L)} a_3^{(L)} \right) \]

The weights learned are denoted by $w_{ij}^{(l)}$, where, consistent with the formulas above, l is the
layer supplying the inputs, i is the activation node being computed in layer l + 1, and
j is the input from layer l. For example, $w_{13}^{(2)}$ represents the weight of the 3rd
output of layer 2 in computing the 1st activation node of layer 3. The sigmoid function is defined as

\[ g(z) = \frac{1}{1 + e^{-z}} \]

The nature of the sigmoid function is such that g(z) tends to 1 as z → ∞ and
tends to 0 as z → −∞. The output of the sigmoid function is always bounded in
the interval (0, 1).
The intuition behind the neural network is that the output of each layer itself acts as
input to the subsequent layer, which results in modeling a nonlinear decision boundary. It
is not uncommon to have tens or even hundreds of hidden layers.
Neural network algorithms will be discussed in greater detail in the coming
sections.
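The layer-by-layer computation described in this section can be illustrated with a small forward pass. This is a sketch under stated assumptions: each layer is a hypothetical weight matrix whose row i holds the weights used to compute activation node i, there are no bias terms (the formulas in this section have none), and the weights are made-up numbers, not learned values.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function g(z) = 1 / (1 + e^(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def layer_forward(weights, inputs):
    """weights[i][j] multiplies input j when computing activation node i."""
    return [sigmoid(sum(w * a for w, a in zip(row, inputs))) for row in weights]

def forward(network, x):
    """Feed x through every layer; each layer's output is the next one's input."""
    activations = x
    for weights in network:
        activations = layer_forward(weights, activations)
    return activations

# Made-up weights: 2 inputs -> 2 hidden nodes -> 1 output node.
network = [[[1.0, -1.0], [0.5, 0.5]],
           [[1.0, 1.0]]]
output = forward(network, [2.0, 2.0])[0]
```

Because every node applies the sigmoid, the final output always lies strictly between 0 and 1 and can be read as the network's confidence in the positive class.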

2.5 Principal Component Analysis

Principal component analysis (PCA) is a powerful tool in machine learning for
determining the principal component features of a dataset. Many datasets
contain features which are correlated. Using PCA, a large number of highly correlated
features can be reduced to fewer uncorrelated features called principal
components. This is also known as dimensionality reduction.
Apart from helping in visualizing the principal components, PCA helps in reducing
the cost of machine learning algorithms. The computational cost of machine learning
algorithms is dependent on the number of dimensions, so using PCA to reduce
dimensions helps in reducing their computational costs.
Reducing the dataset to its principal components also reduces the size of the dataset,
with minimal loss of the information carried by the data features.
In neural networks, PCA can be used to prioritize dimensions and drop low-variance
dimensions, so that the algorithms converge faster.

2.5.1 PCA Algorithm

Assume a dataset $X_j$ ($j = 1, \dots, m$) with n features and m data items.

The idea behind PCA is to project data points in an n-dimensional space onto a
lower-dimensional space while preserving as much information as possible. Consider
Fig. 3 shown below for a two-dimensional dataset.
PCA aims at orthogonally projecting the data onto a lower-dimensional linear
space such that the projection [6]:
• Minimizes the distance between the points and their projections (sum of the brown
lines)
• Maximizes the variance of the projected data (yellow line).

Fig. 3 Illustrating goals of PCA for a 2-d dataset

The first principal component is along the direction of largest variance. Each subsequent
principal component is orthogonal to the previous principal components and points
in the direction of the largest variance of the remaining data space.
Compute the covariance matrix for the dataset:

\[ \Sigma = \frac{1}{m} \sum_{i=1}^{m} (X_i - \bar{X}) (X_i - \bar{X})^T \]

where $\bar{X} = \frac{1}{m} \sum_{i=1}^{m} X_i$.
The PCA basis vectors are the eigenvectors of the covariance matrix Σ. The
eigenvalues determine the importance of the eigenvectors: the larger the eigenvalue,
the more important the eigenvector. By choosing a subset of the PCA vectors with the
largest eigenvalues, the dimensionality of the dataset can be reduced.
PCA can be used to discard dimensions of less significance, remove noise, and
get a compact description of the data.
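Consider a dataset of m facial images, each of 256 × 256 pixels. Each data item has
N = 64 K dimensions, and the covariance matrix is of N × N dimensions. All N eigenvectors and
eigenvalues can be computed in $O(N^3)$ time, and the first p eigenvectors and eigenvalues
can be computed in $O(pN^2)$ time. For N = 64 K, this is computationally very
intensive. Invariably, however, $m \ll N$, which can be exploited by
using $L = X^T X$ (an m × m matrix) instead of $\Sigma = X X^T$ (an N × N matrix). Here X is the
N × m matrix whose columns are the mean-centered data items, and the constant factor 1/m is dropped.
Let v be an eigenvector of L:

\[ \begin{aligned}
L v &= \lambda v \\
X^T X v &= \lambda v \\
X X^T X v &= X (\lambda v) \\
X X^T X v &= \lambda (X v) \\
\Sigma (X v) &= \lambda (X v)
\end{aligned} \]

So $Xv$ is an eigenvector of Σ. Computing the eigenvectors of L is
much less computationally intensive than computing those of Σ [7].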
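The identity derived above can be checked on a small example. The sketch below is illustrative only: it uses plain power iteration (a technique chosen here for brevity, not one named in the text) to find the leading eigenpair of the small matrix $L = X^T X$ and then verifies that $Xv$ is an eigenvector of $X X^T$ with the same eigenvalue. The 4 × 3 matrix X and all helper names are made up.

```python
import math

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def top_eigenpair(S, iters=500):
    """Leading eigenvalue and unit eigenvector of a symmetric PSD matrix S."""
    v = [1.0 + 0.1 * i for i in range(len(S))]   # arbitrary non-constant start
    for _ in range(iters):
        w = matvec(S, v)
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    lam = sum(a * b for a, b in zip(v, matvec(S, v)))   # Rayleigh quotient
    return lam, v

# Made-up data: N = 4 features (rows), m = 3 mean-centered items (columns).
X = [[2.0, -1.0, -1.0],
     [0.0, 1.0, -1.0],
     [1.0, 1.0, -2.0],
     [-1.0, 0.0, 1.0]]
L = matmul(transpose(X), X)        # small m x m matrix
Sigma = matmul(X, transpose(X))    # large N x N matrix
lam, v = top_eigenpair(L)
u = matvec(X, v)                   # candidate eigenvector of Sigma
residual = max(abs(a - lam * b) for a, b in zip(matvec(Sigma, u), u))
```

Here `residual` measures how far $\Sigma (Xv)$ is from $\lambda (Xv)$; in practice $Xv$ would also be rescaled to unit length before being used as a principal direction.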

3 Details of Two Areas

3.1 Linear/Logistic Regression

Let us consider the application of an intelligent smartphone review system. Users
submit reviews on various smartphones based on their user experience. Let us look
at a machine learning system that classifies these user reviews using a simple linear
classifier.

Fig. 4 A simple linear classifier model with binary prediction

Consider the following reviews:

Review                                                        User experience

“Incredible phone. A great value for money”                   Positive experience
“Sports a very good camera, however battery life is bad”      Mixed experience
“Best smartphone that I have ever bought”                     Very positive experience
“Voice quality is poor, dropping calls is abysmal”            Negative experience

The classifier model, as shown in Fig. 4, would take a user review as input and
predict the rating for the product.
The review dataset (training dataset) can be viewed as a set of n distinct words; the
count of each word, $x_i$ ($i = 1, \dots, n$), is a parameter commonly referred to as a feature.
Let the outcome (recommendation) be represented as $y_i$ ($i = 1, \dots, m$), taking on values
−1 (do not recommend the item) or 1 (recommend the item).
The dataset can be used to train a parameter (coefficient) for each word, as
shown in Table 1 below. Some neutral words will be assigned the coefficient value 0,
since they do not add any sentiment to the review.

Table 1 Coefficients for words in the reviews

Word                  Coefficient
Incredible            3.1
Great                 1.7
Best                  2.1
Good                  1.0
Bad                   −1.1
Poor                  −1.6
Abysmal               −3.4
Phone, the, camera    0.0

Table 2 Table showing a simplistic model with coefficients for two words

Word          Coefficient
Incredible    2.0
Poor          −1.6

Fig. 5 An example of a
linear decision boundary

For each review, a rating rate(x) is assigned as the weighted count of the words in it.
If rate(x) > 0, then y = +1; otherwise y = −1.
Consider the table above depicting a simplistic model where only the following words
have nonzero coefficients (Table 2).
rate(x) = 2.0 * count(“incredible”) − 1.6 * count(“poor”)
The decision boundary can be plotted as shown in Fig. 5 above.
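The weighted-count rating can be sketched directly from the coefficients in Table 1. This is a minimal illustration: the tokenization by a simple regular expression and the function names `rate` and `classify` are assumptions, and words missing from the table are treated as neutral (coefficient 0).

```python
import re

# Coefficients copied from Table 1; all other words count as 0.
COEFFICIENTS = {"incredible": 3.1, "great": 1.7, "best": 2.1, "good": 1.0,
                "bad": -1.1, "poor": -1.6, "abysmal": -3.4}

def rate(review):
    """Weighted count of the words in the review."""
    words = re.findall(r"[a-z]+", review.lower())
    return sum(COEFFICIENTS.get(w, 0.0) for w in words)

def classify(review):
    """Return +1 when rate(x) > 0 and -1 otherwise."""
    return 1 if rate(review) > 0 else -1

print(classify("Incredible phone. A great value for money"))
print(classify("Voice quality is poor, dropping calls is abysmal"))
```

The mixed review about a good camera and bad battery life scores 1.0 − 1.1, slightly below zero, so this simple model labels it −1 even though the reviewer's experience was mixed.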
For linear classifiers with three coefficients, the decision boundary is a plane, and
for more than three coefficients, it is a hyperplane. For other classifiers, the boundary
can take on more complicated shapes. The features could also be other functions of the
review, like tf-idf weights, instead of just the count of terms. The rest of the section describes an
algorithm to train a set of parameters based on the gradient descent algorithm.
Let us start with a hypothesis h that can be used to approximate y as follows:

\[ h(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n \]

$\theta_i$ is called a weight (or coefficient). By introducing an intercept term $x_0 = 1$, the
hypothesis can be represented as

\[ h(x) = \sum_{i=0}^{n} \theta_i x_i \]

In the above application, $x_i$ is the count of the ith word in the review. In vectorized
form, the hypothesis can be rewritten as

\[ h(x) = \begin{bmatrix} \theta_0 & \theta_1 & \dots & \theta_n \end{bmatrix} \cdot \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_n \end{bmatrix} \]

If the parameters are denoted by the vector θ and the features by the vector x, the hypothesis
can be simplified as

\[ h(x) = \theta^T x \]

$\theta^T$ represents the transpose of the vector θ. The goal is to learn the values of the parameters
θ such that h(x) is as close as possible to the outcome in the training set. To measure
the accuracy of the hypothesis with respect to the actual outcomes, a cost function can be defined
as follows:

\[ C(\theta) = \frac{1}{2} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right)^2 \]

The goal of supervised learning is to come up with a set of parameters θ that
minimizes this cost function. The superscript (i) denotes the ith dataset item. This is the
common least squares regression model.

3.1.1 Gradient Descent Algorithm

To achieve the goal of minimizing the cost function $C(\theta)$, we can start with an initial
guess of values for θ and iteratively change the value of θ such that the cost function
is smaller with every iteration, until we converge to a set of parameters that minimizes
$C(\theta)$. Consider the following update to the parameters:

\[ \theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} C(\theta) \tag{1} \]

α is referred to as the step size or learning rate. The algorithm involves updating
all the parameters $\theta_j$ ($j = 0, \dots, n$) iteratively till convergence. In this algorithm, the
parameter values step in the direction of steepest descent with each iteration.
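Let us derive the partial derivative of the cost function with respect to $\theta_j$.

\[ \begin{aligned}
\frac{\partial}{\partial \theta_j} C(\theta)
&= \frac{\partial}{\partial \theta_j} \frac{1}{2} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right)^2 \\
&= \frac{1}{2} \sum_{i=1}^{m} \frac{\partial}{\partial \theta_j} \left( h(x^{(i)}) - y^{(i)} \right)^2 \\
&= \frac{1}{2} \sum_{i=1}^{m} 2 \left( h(x^{(i)}) - y^{(i)} \right) \frac{\partial}{\partial \theta_j} \left( h(x^{(i)}) - y^{(i)} \right) \\
&= \frac{1}{2} (2) \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right) x_j^{(i)} \\
&= - \sum_{i=1}^{m} \left( y^{(i)} - h(x^{(i)}) \right) x_j^{(i)}
\end{aligned} \]

Substituting this in (1), we get the step for $\theta_j$:

\[ \begin{aligned}
\theta_j &:= \theta_j - \alpha \Bigl( - \sum_{i=1}^{m} \left( y^{(i)} - h(x^{(i)}) \right) x_j^{(i)} \Bigr) \\
&= \theta_j + \alpha \sum_{i=1}^{m} \left( y^{(i)} - h(x^{(i)}) \right) x_j^{(i)}
\end{aligned} \]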

The gradient descent algorithm can be written as

while ( C(θ) > threshold ) {

    for every j (simultaneously):
        θ_j := θ_j + α Σ_{i=1..m} ( y^(i) − h(x^(i)) ) x_j^(i)

    Compute C(θ)
}

The updates of all $\theta_j$ are performed simultaneously, using the parameter values from
the previous iteration. The above algorithm is called batch gradient descent. The whole dataset has
to be scanned before the parameters are updated to make progress toward the global
minimum. For large datasets, a variant called stochastic gradient descent [8] is used,
where the parameters are updated for each data item encountered. Stochastic gradient
descent often converges more quickly than batch gradient descent for large datasets.
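A minimal batch gradient descent for the least squares cost can be sketched as follows. Several details are assumptions made for this illustration and are not prescribed by the text: the stopping rule (the cost changing by less than a tolerance), the learning rate, the toy data generated from y = 1 + 2x, and the function names.

```python
def predict(theta, x):
    """h(x) = theta^T x, where x already contains the intercept term x0 = 1."""
    return sum(t * xi for t, xi in zip(theta, x))

def cost(theta, X, y):
    """C(theta) = 1/2 * sum of squared errors."""
    return 0.5 * sum((predict(theta, xi) - yi) ** 2 for xi, yi in zip(X, y))

def batch_gradient_descent(X, y, alpha=0.05, tol=1e-12, max_iters=100000):
    X = [[1.0] + list(row) for row in X]          # prepend intercept x0 = 1
    theta = [0.0] * len(X[0])
    previous = cost(theta, X, y)
    for _ in range(max_iters):
        # Errors use the old theta, so all theta_j are updated simultaneously.
        errors = [yi - predict(theta, xi) for xi, yi in zip(X, y)]
        theta = [t + alpha * sum(e * xi[j] for e, xi in zip(errors, X))
                 for j, t in enumerate(theta)]
        current = cost(theta, X, y)
        if abs(previous - current) < tol:         # cost no longer changes
            break
        previous = current
    return theta

# Made-up data lying exactly on the line y = 1 + 2x.
theta = batch_gradient_descent([[0.0], [1.0], [2.0], [3.0]], [1.0, 3.0, 5.0, 7.0])
print(theta)
```

If α is chosen too large, the updates overshoot and the cost grows instead of shrinking, so the learning rate usually has to be tuned to the scale of the features.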

3.1.2 Logistic Regression

For applications like the intelligent product review system mentioned earlier, we need
the outcome to be a binary value like “recommended” (value 1) or “not recommended”
(value 0). For this class of applications, a continuous output of h(x) as in linear
regression is not intuitive, since $y \in \{0, 1\}$. Consider the following change to the
hypothesis:

\[ h(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}} \]

The function g(z) is the sigmoid function. The graph of the sigmoid function is shown
in Fig. 6.

Fig. 6 Graph of Sigmoid function

The nature of the sigmoid function is such that g(z) tends to 1 as z → ∞ and
tends to 0 as z → −∞. The output of the sigmoid function is always bounded in
the interval (0, 1).
A salient feature of the sigmoid function is that its derivative can be expressed in
terms of itself:

\[ \frac{\partial}{\partial z} g(z) = g(z) \left( 1 - g(z) \right) \]
In the previous section, we have output a rating as +1 or −1 for a review based

on the rate(x) function learned from the dataset. Defining the probability that an
outcome is 1 for given set of features and parameters as

P(y 1|x; θ ) h(x)

P(y 0|x; θ ) 1 − h(x)

In the above example, this can be interpreted as the probability that a review is
positive given a set of words in the review.
Combining the two into one expression, the probability can be rewritten as

\[ P(y \mid x; \theta) = \left( h(x) \right)^{y} \left( 1 - h(x) \right)^{1 - y} \]

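Assuming that the m data items were generated independently, we define the likelihood function L(θ) as
\[ L(\theta) = \prod_{i=1}^{m} p(y^{(i)} \mid x^{(i)}; \theta) = \prod_{i=1}^{m} \left( h(x^{(i)}) \right)^{y^{(i)}} \left( 1 - h(x^{(i)}) \right)^{1 - y^{(i)}} \]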

The goal here is to find the set of parameters that maximizes this likelihood. For ease of derivation, the log of the likelihood function is maximized.
\[ l(\theta) = \log L(\theta) = \sum_{i=1}^{m} \left[ y^{(i)} \log h(x^{(i)}) + \left( 1 - y^{(i)} \right) \log \left( 1 - h(x^{(i)}) \right) \right] \]

Similar to linear regression, to maximize the likelihood function, we use a gradient ascent algorithm. If θ denotes the vector of parameters,
\[ \theta := \theta + \alpha \frac{\partial}{\partial \theta} l(\theta) \]
Simplifying the partial derivative, we arrive at the following step
\[ \theta_j := \theta_j + \alpha \left( y^{(i)} - h(x^{(i)}) \right) x_j^{(i)} \]

Now applying the gradient ascent, we can update the parameters simultaneously,
iteratively till convergence.
On a new data item, we can predict the probability that the review is positive by
using the parameters learnt using the training dataset

\[ P(y = 1 \mid x; \theta) = h(x) = g(\theta^T x) \]
if P(y = 1 | x; θ) > 0.5, outcome = 1
if P(y = 1 | x; θ) < 0.5, outcome = 0

The threshold of 0.5 can be set to a different value based on the application and the dataset. For instance, in an application that detects cancerous cells, the threshold could be set conservatively (lower than 0.5), so that fewer cancerous cells go undetected at the cost of more false alarms.
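The training and prediction steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the updates are summed over all data items (a batch form of the gradient ascent step), and the names `fit_logistic` and `predict`, the step size `alpha`, and the iteration count are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(theta, X, y, eps=1e-12):
    """l(theta) = sum_i [ y_i log h(x_i) + (1 - y_i) log(1 - h(x_i)) ]."""
    h = np.clip(sigmoid(X @ theta), eps, 1.0 - eps)   # clip to keep log finite
    return float(np.sum(y * np.log(h) + (1.0 - y) * np.log(1.0 - h)))

def fit_logistic(X, y, alpha=0.1, iterations=500):
    """Gradient ascent on the log-likelihood, all theta_j updated together."""
    theta = np.zeros(X.shape[1])
    for _ in range(iterations):
        theta = theta + alpha * (X.T @ (y - sigmoid(X @ theta)))
    return theta

def predict(theta, X, threshold=0.5):
    """Outcome 1 where P(y = 1 | x; theta) exceeds the chosen threshold."""
    return (sigmoid(X @ theta) > threshold).astype(int)
```

Changing `threshold` in `predict` moves the operating point: a lower value labels more items as 1, a higher value labels fewer.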

3.2 Neural Networks

In Sect. 2.4, the neural network was introduced as a method to model a nonlinear decision boundary; in this section, we delve into neural networks in greater detail.
In the earlier section, the neural network that was described modeled the data and classified the output into a single class. To model a multiclass classification, the same notion can be extended. Instead of the output y being a single value (0 or 1), the output is a vector whose size is equal to the number of classes in the classification problem as shown in Fig. 7.

Fig. 7 Depiction of a typical multi-class neural network

Consider a set of images where the images are labeled as that of sunrise, oceans, forests, or deserts. The output label for each image will be a vector of size 4. The values of the elements in the vector (0 or 1) will denote which class the image belongs to. The neural network output would look something similar to what is shown below:
The output in the figure above classifies the image as that of an ocean.
Recall that the cost function for logistic regression (previous section) was
\[ l(w) = \sum_{i=1}^{m} \left[ y^{(i)} \log h(x^{(i)}) + \left( 1 - y^{(i)} \right) \log \left( 1 - h(x^{(i)}) \right) \right] \]

For a K-class classification neural network, the cost function can be generalized
as follows:
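\[ l(w) = \sum_{i=1}^{m} \sum_{k=1}^{K} \left[ y_k^{(i)} \log \left( h(x^{(i)}) \right)_k + \left( 1 - y_k^{(i)} \right) \log \left( 1 - \left( h(x^{(i)}) \right)_k \right) \right] \]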

The second summation adds up the logistic regression cost for each node in the output layer. As in logistic regression, we look for the set of weights that optimizes this function; written as a cost to be minimized, it is the negative of l(w). To achieve this, we need to compute the partial derivatives of the cost function with respect to the weights.
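The K-class expression can be evaluated directly from a matrix of network outputs and a matrix of 0/1 labels. The sketch below is illustrative; the function name and the clipping constant `eps` (used only to keep the logarithm finite) are assumptions, not part of the original text.

```python
import numpy as np

def multiclass_log_likelihood(H, Y, eps=1e-12):
    """Sum over data items i and output nodes k of
    y_k log h_k + (1 - y_k) log(1 - h_k).

    H: (m, K) array of network outputs in (0, 1)
    Y: (m, K) array of 0/1 labels
    """
    H = np.clip(H, eps, 1.0 - eps)
    return float(np.sum(Y * np.log(H) + (1.0 - Y) * np.log(1.0 - H)))
```

With K = 1 this reduces to the logistic regression expression; the corresponding cost to minimize is the negative of the returned value.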
Let us define $\delta^{(l)}$ as the error for layer l. For the output layer (l = L) of an L-layered neural network
\[ \delta^{(L)} = a^{(L)} - y \]
which is basically the error for each of the output nodes in the output layer. To get the delta values of the hidden layers, we follow a backpropagation algorithm and derive the delta values from the delta values of the subsequent layer. Values for $\delta^{(L-1)}, \delta^{(L-2)}, \ldots, \delta^{(2)}$ can be calculated by the general formula for layer l:
\[ \delta^{(l)} = \left( (w^{(l)})^T \delta^{(l+1)} \right) .* \; a^{(l)} .* \left( 1 - a^{(l)} \right) \]

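where $a^{(l)}$ is the vector of activations of layer l and .* denotes element-wise multiplication. Recall that for a sigmoid function g(z)
\[ \frac{\partial}{\partial z} g(z) = g(z) \left( 1 - g(z) \right) \]
So the delta value $\delta^{(l)}$ for layer l is
\[ \delta^{(l)} = \left( (w^{(l)})^T \delta^{(l+1)} \right) .* \; g'(z^{(l)}) \]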

Propagating from right to left in a neural network, delta values can be derived for the units in all the layers of the network. The delta values are errors for each unit $a_j^{(l)}$ (activation unit j in layer l) and are derivatives of the cost function:
\[ \delta_j^{(l)} = \frac{\partial}{\partial z_j^{(l)}} \mathrm{cost}(t) \]

The gradient for the hidden layer weights is simply the output error signal back-
propagated to the hidden layer, then weighted by the input to the hidden layer. The
gradient for the weights of layer l is

\[ \delta^{(l+1)} \left( a^{(l)} \right)^T \]

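The backward propagation algorithm [9] can be formalized as follows:

• Initialize $\Delta_{ij}^{(l)} := 0$ for all values of l, i, j.
• Calculate the activations $a^{(l)}$ for layers l = 2, 3, …, L.
• Compute $\delta^{(L)}$ as $a^{(L)} - y$ for layer L.
• Compute $\delta^{(L-1)}, \delta^{(L-2)}, \ldots, \delta^{(2)}$ as
  – $\delta^{(l)} = \left( (w^{(l)})^T \delta^{(l+1)} \right) .* \; a^{(l)} .* \left( 1 - a^{(l)} \right)$
• Compute gradients for layer l as
  – $\Delta^{(l)} := \Delta^{(l)} + \delta^{(l+1)} \left( a^{(l)} \right)^T$
• $D_{ij}^{(l)} := \frac{1}{m} \Delta_{ij}^{(l)}$
• The matrix $D^{(l)}$ holds the partial derivatives of the cost function.
• Update weights $w_{ij}^{(l)} := w_{ij}^{(l)} - D_{ij}^{(l)}$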
In logistic regression, the initial weights can be set to 0 and the weights can be learnt using gradient descent. In neural networks, assigning initial weights of 0 will cause all the nodes in a layer to update to the same values during backpropagation. Typically, the weights in a neural network are therefore initialized to random values.
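The steps of the algorithm, and the effect of zero initialization, can be reproduced with a small NumPy sketch. It is an illustration under stated assumptions rather than the authors' implementation: sigmoid activations in every layer, no bias units, the cost taken as −l(w)/m, and hypothetical function names.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(weights, x):
    """Return the activations of every layer for one input vector x."""
    activations = [x]
    for W in weights:
        activations.append(sigmoid(W @ activations[-1]))
    return activations

def cost(weights, X, Y):
    """Assumed cost: -(1/m) * l(w), summed over items and output nodes."""
    total = 0.0
    for x, y in zip(X, Y):
        a = forward(weights, x)[-1]
        total -= np.sum(y * np.log(a) + (1 - y) * np.log(1 - a))
    return total / X.shape[0]

def backprop(weights, X, Y):
    """Accumulate Delta over all items and return D = Delta / m per layer."""
    Delta = [np.zeros_like(W) for W in weights]
    for x, y in zip(X, Y):
        a = forward(weights, x)
        delta = a[-1] - y                        # error of the output layer
        for l in range(len(weights) - 1, -1, -1):
            Delta[l] += np.outer(delta, a[l])    # delta^(l+1) (a^(l))^T
            if l > 0:                            # propagate to the previous layer
                delta = (weights[l].T @ delta) * a[l] * (1 - a[l])
    return [D / X.shape[0] for D in Delta]
```

Comparing an entry of the returned matrices with a finite-difference derivative of `cost` is a common sanity check. Starting from all-zero weights and repeatedly applying `w := w - D` leaves every hidden unit with identical incoming weights, which is the symmetry problem that random initialization avoids.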

4 Future

Data centers and databases are headed toward autonomous systems that can self-manage, self-tune, and detect and fix adversities. Oracle provides self-managing databases which are autonomous and self-driving.
The cost to acquire, store, and compute data will continue to fall, and the amount of data will continue to grow. The machine learning building blocks are moving to the cloud, where machine learning techniques are hosted. The confluence of these trends will make every organization a data company and every application an intelligent application, moving toward an algorithm economy. In the past, automation was limited to “blue-collar jobs”; we will see a future where automation by “white-collar machines” will be prevalent.

5 Conclusion

With the explosion of data, machine learning algorithms are the need of the hour to dynamically analyze the data. Technological advancements in processor speeds and distributed technology, combined with the explosion of data, have led to a resurgence in machine learning. Mathematics forms the bedrock of machine learning techniques. Machine learning is poised to take an even bigger role in our daily lives.

References

1. Gartner Research report, https://www.gartner.com/newsroom/id/3598917
2. Salton, G., Wong, A., Yang, C.S.: A vector space model for automatic indexing. Commun. ACM 18(11), 613–620 (1975)
3. Oracle Corporation: Oracle Autonomous Database, https://www.oracle.com/database/autonomous-database/feature.html
4. Ng, A.: https://see.stanford.edu/Course/CS229
5. Lippmann, R.: An introduction to computing with neural nets. MIT Lincoln Lab., Lexington, MA, http://ieeexplore.ieee.org/abstract/document/1165576
6. Shlens, J.: A tutorial on Principal Component Analysis. Google Research, Mountain View, CA, https://arxiv.org/pdf/1404.1100.pdf
7. Póczos, B., Singh, A.: PCA. Machine Learning Department, Computer Science Department, Carnegie Mellon University, http://www.cs.cmu.edu/~aarti/Class/10701_Spring14/slides/PCA.pdf
8. Bottou, L.: Large-Scale Machine Learning with Stochastic Gradient Descent. NEC Labs, Princeton, NJ, http://leon.bottou.org/publications/pdf/compstat-2010.pdf
9. Jain, A.K., Mao, J., Mohiuddin, K.M.: Artificial Neural Networks: A Tutorial, ieeexplore.ieee.org/document/485891
Chapter 13
Numerical Study on the Influence
of Diffused Soft Layer in pH-Regulated
Polyelectrolyte-Coated Nanopore

Subrata Bera, S. Bhattacharyya and H. Ohshima

Abstract Electroosmotic flow and its effects are studied numerically in a polyelectrolyte layer-coated cylindrical nanopore. The electrokinetic flow is governed by the Nernst–Planck equation for species distribution, the Brinkman-modified Navier–Stokes equation for fluid flow, and the Poisson equation for the induced electric potential. These nonlinear coupled governing equations for potential distribution, ionic species distribution, and fluid flow are solved through a finite volume method on a staggered grid system in cylindrical coordinates. This study establishes the importance of the bulk ionic concentration, the electrolyte pH, the softness of the polyelectrolyte layer, the nanopore geometry, and the potential of the polyelectrolyte layer and nanopore wall. Three functional groups, succinoglycan, glycine, and proline, are considered in this study. The average electroosmotic flow rate increases with the polyelectrolyte segment for a fixed pH value in the succinoglycan functional group. The axial velocity increases with pH for a fixed polyelectrolyte segment. An increase in the softness parameter decreases the average flow. An increase in pH increases the average flow for different bulk ionic concentrations. The increase of ionic current with pH is more prominent for the negatively charged surface than for the zero-charged surface. The electric body force increases with pH for both the zero-charged nanopore and the negatively charged nanopore.

Keywords Polyelectrolyte layer · Electroosmotic flow · Functional group · Nernst–Planck equation

S. Bera (B)
Department of Mathematics, National Institute of Technology Silchar,
Silchar 788010, India
e-mail: [email protected]
S. Bhattacharyya
Department of Mathematics, Indian Institute of Technology Kharagpur,
Kharagpur 721302, India
H. Ohshima
Faculty of Pharmaceutical Sciences, Tokyo University of Science,
Noda, Chiba 2788510, Japan

© Springer Nature Singapore Pte Ltd. 2018

1 Introduction

Grafting polyelectrolyte layers (PEL) onto nanochannels has emerged as a novel technique for a large number of applications such as flow control, current rectification, ion sensing and manipulation, fabrication of nanofluidic diodes, liquid transport, and many more [1–3]. When an electrolyte comes in contact with a solid wall, a thin layer of nonzero net charge density forms along the wall; it is called the electric double layer (EDL). The EDL thickness is known as the Debye length, and it is of the order of nanometers. Electroosmotic flow (EOF) occurs when an external electric field acts on the net surplus of charged ions in the EDL. Several interesting observations have been made experimentally when the characteristic length is of the order of the EDL thickness. When EOF is modeled with the thin EDL approximation, a slip condition is imposed on the velocity, and the slip velocity is called the Helmholtz–Smoluchowski velocity [4]. In most previous studies on EOF, the ion distribution is assumed to obey the equilibrium Boltzmann distribution, which results in the Poisson–Boltzmann equation for the induced potential. The Nernst–Planck equation for ions, in contrast, accounts for the convection, electromigration, and diffusion of ions. Several authors have studied various aspects of EOF in micro- and nanochannels theoretically as well as experimentally. Conlisk and McFerran [5] developed a mathematical model for EOF and the corresponding numerical solution in a rectangular nanochannel with overlapping EDL in the presence of an applied electric field. The combined effects of EOF and pressure-driven flow on species transport have been studied by Bera and Bhattacharyya [6] by considering the Nernst–Planck model for micro- and nanochannels.
There are many ways of modulating electroosmotic flow. Polymer coatings are very useful to control the EOF rate. The electroosmotic flow in a semicircular cross section was studied by Wang et al. [7] under the Debye–Hückel approximation. The perturbation method was used by Chang et al. [8] to investigate the EOF of an incompressible, viscous, and electrically conducting Newtonian liquid through a microtube with slightly corrugated walls. Rojas et al. [9] theoretically studied the pulsatile electroosmotic flow (PEOF) within a circular microchannel. Liu et al. [10] established an analytical expression for the flow velocity and ionic current for the EOF in a charge-regulated circular channel, focusing on the effect of the types of ions and their concentrations. The electroosmotic flow behavior of a nanopore can be influenced by its physicochemical properties, the applied external electric field, the nature of the liquid medium, and the potential of the boundary surface and nearby PEL. Patwary et al. [11] established that a polyelectrolyte-grafted nanochannel is highly efficient for electrochemomechanical energy conversion. Simple analytic expressions for the electrophoretic mobility of a soft particle, consisting of a hard particle core covered by an ion-penetrable surface layer of polyelectrolyte, were developed by Ohshima [12] for low electric potential. Tessier and Slater [13] numerically investigated EOF using coarse-grained molecular dynamics simulations. Cao and You [14] applied the coarse-grained molecular dynamics simulation method to mixed polymer brush-grafted nanochannels, in which two distinct species of polymers are alternately grafted on the inner surface of the nanochannels. The effect of the PEL charge density on ions and fluid flow was numerically investigated by Bera and Bhattacharyya [15] in a polyelectrolyte-coated nanopore. Ohshima [16] proposed a simple algorithm for the analytic solution of the Poisson–Boltzmann equation in a charged narrow pore and compared it with the exact numerical solution for low-to-moderate values of the nanopore surface potential when the nanopore radius is less than the Debye length.
All of the above studies considered a fixed charge density within the polyelectrolyte layer. However, many bacterial cell surfaces possess acidic and/or basic functional groups. Das [17] established explicit relationships between the surface potential of a charged soft interface and the Donnan potential. Electroosmotic transport with a pH-dependent charge density was studied by Chen and Das [18] in a polyelectrolyte-grafted nanochannel. Tseng et al. [19] theoretically investigated the influence of temperature distribution on a pH-regulated polyelectrolyte layer-coated particle. The present study deals with the electroosmotic flow through a PEL-coated nanopore in which the PEL charge depends on the ionization of the corresponding functional groups. The objective of the present study is to analyze the effects of the bulk ionic concentration, pH value, softness of the PEL, charge density of the PEL, and nanopore surface potential. Most authors have studied linear EOF by considering the Boltzmann distribution for ions. To account for convection, diffusion, and electromigration effects, we use the Nernst–Planck equation. We also consider the full Brinkman model in the Navier–Stokes equations with a body force term. The Poisson equation gives the induced potential distribution in the EDL. The characteristics of this electrokinetic flow are obtained by solving these nonlinear coupled equations through a finite volume method.

2 Mathematical Model

A canonical nanopore of radius a and axial length z is considered in our study, as shown in Fig. 1. The nanopore wall bears a negative potential ζ. A polyelectrolyte layer of thickness d is embedded in the nanopore wall. We have assumed that the polyelectrolyte layer is homogeneously structured and ion-penetrable, with fixed charge density $\rho_{fix}$. We consider that the polyelectrolyte segment distribution can be modeled as a soft step function, and its distribution h(r) is given by (as shown in Fig. 2)
\[ h(r) = \begin{cases} 1 - \exp\left( -\dfrac{r - (a - d)}{\delta} \right), & a - d \le r \le a \\ 0, & 0 \le r < a - d \end{cases} \tag{1} \]

Here δ is assumed to obey δ d, which measures the width of inhomogeneous

distribution of PEL segments near front edge.
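The soft step function of Eq. (1) is straightforward to evaluate. The following sketch is illustrative only; the function name is hypothetical, lengths are assumed to be scaled by the pore radius (a = 1, d = 0.4 as in Fig. 2), and δ = 0 is treated as the sharp-step limit.

```python
import math

def segment_distribution(r, a=1.0, d=0.4, delta=0.1):
    """Soft step function h(r) of Eq. (1) for 0 <= r <= a."""
    if r < 0.0 or r > a:
        raise ValueError("r must lie inside the pore, 0 <= r <= a")
    if r <= a - d:
        return 0.0            # outside the PEL (the formula also gives 0 at r = a - d)
    if delta == 0.0:
        return 1.0            # sharp step: homogeneous PEL
    return 1.0 - math.exp(-(r - (a - d)) / delta)
```

Evaluating the function on a grid of r for several values of `delta` reproduces the qualitative behaviour described for Fig. 2: the smaller δ/d is, the closer the profile is to a sharp step.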
The polyelectrolyte layer (PEL) contains both acidic and basic functional groups, namely AX and B, respectively. For a uniform distribution of functional groups within the polyelectrolyte layer, the volumetric charge density is given as follows

Fig. 1 a Schematic diagram of the diffuse soft layer consisting of a rigid charged core in the pH-regulated polyelectrolyte layer in a canonical nanopore and b soft step function h(r) on the cross section of the cylindrical nanopore

Fig. 2 Spatial distribution of polymer segments h(r) versus r when d = 0.4. The arrow indicates the increasing order of δ/d, which takes the values 0, 0.2, 0.4, 0.6, 0.8 and 1.0

\[ \rho_{fix}(r) = h(r)\, \rho(pH_0, pK_a, pK_b) \tag{2} \]

where the charge density $\rho(pH_0, pK_a, pK_b)$ comes from the functional groups and is given by Tseng et al. [19] as

\[ \rho(pH_0, pK_a, pK_b) = \frac{z_A N_A F}{1 + 10^{pK_a - pH_0} \exp(-e\phi / k_B T)} + \frac{z_B N_B F}{1 + 10^{pH_0 - pK_b} \exp(e\phi / k_B T)} \tag{3} \]
We have taken a binary symmetric electrolyte with valences $z_A = -1$ and $z_B = 1$. Here, the total concentration of acidic functional groups is $N_A$ and that of basic functional groups is $N_B$. Here, $pH_0$ is the bulk pH, with $pK_a = -\log K_a$ and $pK_b = -\log K_b$, where $K_a$ is the ionization constant of the acidic functional groups and $K_b$ is that of the basic functional groups. We denote the bulk pH value of the electrolyte simply by pH in this discussion. The induced potential is φ(r), the Boltzmann constant is $k_B$, the elementary electric charge is e, and the absolute temperature is T.
The electric field E (= ($E_r$, $E_\theta$, $E_z$)) has components along the radial direction r, the cross-radial direction θ, and the axial direction z, along which a constant electric field $E_0$ is applied. The total potential consists of the double layer potential (DLP) and the polarization effects due to the electric field. The electric field is related to the net charge density $\rho_e$ plus the fixed charge density $\rho_{fix}$ in the PEL through Poisson's equation

\[ \nabla \cdot (\epsilon_e \mathbf{E}) = -\epsilon_e \nabla^2 \phi = \rho_e + \rho_{fix} \tag{4} \]

Here, the induced electric potential is φ and the permittivity is $\epsilon_e = \epsilon_0 \epsilon_r$, where $\epsilon_0$ and $\epsilon_r$ are the permittivity of vacuum and the dielectric constant of the solution, respectively. The net charge density is $\rho_e = \sum_i z_i e n_i$, where $z_i$ and $n_i$ are, respectively, the valence and ionic concentration of species i. We have taken a symmetric electrolyte of valence $z_i = \pm 1$ in the present study. We scaled the potential by $\phi_0$ (= $k_B T / e$) and the concentration by the bulk ionic concentration $n_0$. The bulk number density ($n_0$) and the bulk electrolyte concentration (C) are related by $FC = e n_0$. The Poisson equation can be written in non-dimensional form as
\[ \frac{\partial^2 \phi}{\partial z^2} + \frac{1}{r} \frac{\partial}{\partial r} \left( r \frac{\partial \phi}{\partial r} \right) = -\frac{(\kappa a)^2}{2} (g - f) - h(r) Q_{fix} \tag{5} \]

We have scaled the cylindrical coordinates r and z by the nanopore radius a. The Debye–Hückel parameter κ is the inverse of the EDL thickness (λ), where $\lambda = \sqrt{\epsilon_e k_B T / \sum_i (z_i e)^2 n_{i0}}$ and κa = a/λ. The scaled fixed charge density $Q_{fix}(r)$ within the diffused PEL is

\[ Q_{fix}(r) = \frac{z_A Q_A}{1 + 10^{pK_a - pH_0} \exp(-\phi)} + \frac{z_B Q_B}{1 + 10^{pH_0 - pK_b} \exp(\phi)} \tag{6} \]

where the non-dimensional maximum charge density parameters are $Q_j = F N_j a^2 / (\epsilon_e \phi_0)$ (j = A, B) for the acidic and basic functional groups.
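The pH dependence of the scaled fixed charge density in Eq. (6) can be explored with a short function. This is a sketch for illustration only: the function name is hypothetical, the default values $z_A = -1$, $z_B = 1$, $Q_A = Q_B = 10$ are the ones quoted in the text, and the potential φ is simply passed in as a number rather than solved for.

```python
import math

def scaled_fixed_charge(phi, pH0, pKa, pKb, QA=10.0, QB=10.0, zA=-1, zB=1):
    """Scaled fixed charge density of Eq. (6) for a given scaled potential phi."""
    acidic = zA * QA / (1.0 + 10.0 ** (pKa - pH0) * math.exp(-phi))
    basic = zB * QB / (1.0 + 10.0 ** (pH0 - pKb) * math.exp(phi))
    return acidic + basic
```

For example, with the succinoglycan constants quoted later in the chapter (pK_a = 4.58, pK_b = 8.6) and φ = 0, the function returns a positive charge at pH 2 and a negative charge at pH 12, which is the sign change the pH regulation is meant to capture.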
The ion transport is described by the Nernst–Planck equation and is given as

\[ \frac{\partial n_i}{\partial t} + \nabla \cdot \mathbf{N}_i = 0 \tag{7} \]
where $\mathbf{N}_i = -D_i \nabla n_i + n_i \omega_i z_i F \mathbf{E} + n_i \mathbf{q}$ is the net ionic flux of the ith species. Here, F is Faraday's constant, $D_i$ is the diffusivity, and $\omega_i$ is the mobility of the ith ionic species. The velocity field is q = (v, u), with velocity components v and u along the radial r and axial z directions, respectively. The velocity field q is nondimensionalized by the Helmholtz–Smoluchowski velocity $U_{HS}$ (= $\epsilon_e E_0 \phi_0 / \mu$) and time t is nondimensionalized by $a / U_{HS}$. The Reynolds number is $Re = U_{HS} a / \nu$ and the Schmidt number is $Sc = \nu / D_i$. Here, R is the gas constant, and the viscosity μ of the electrolyte is related to the kinematic viscosity by ν = μ/ρ. The Peclet number is $Pe = U_{HS} a / D_i$. We denote by g the cationic concentration and by f the anionic concentration in non-dimensional form. Hence, the non-dimensional equations of ion transport are expressed as follows
2
∂g ∂ g 1 ∂ ∂g ∂(ug) 1 ∂(r vg) ∂g ∂g ∂ψ ∂g ∂ψ
Pe − + r + Pe + + − +
∂t ∂z 2 r ∂r ∂r ∂z r ∂r ∂z ∂r ∂r ∂z ∂z

∂g ∂φ ∂g ∂φ (κa)2
− + + g (g − f ) − hg Q f i x = 0 (8)
∂r ∂r ∂z ∂z 2

2
∂f ∂ f 1 ∂ ∂f ∂(u f ) 1 ∂(r v f ) ∂g ∂ f ∂ψ ∂ f ∂ψ
Pe − + r + Pe + + + +
∂t ∂z 2 r ∂r ∂r ∂z r ∂r ∂z ∂r ∂r ∂z ∂z

∂ f ∂φ ∂ f ∂φ (κa)2
+ + − f (g − f ) + h f Q f i x = 0 (9)
∂r ∂r ∂z ∂z 2

The modified Navier–Stokes equation for electrokinetic flow is

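\[ \rho \left( \frac{\partial \mathbf{q}}{\partial t} + (\mathbf{q} \cdot \nabla) \mathbf{q} \right) = -\nabla p + \mu \nabla^2 \mathbf{q} + \rho_e \mathbf{E} - \mu \lambda_s^2 \mathbf{q} \tag{10} \]

\[ \nabla \cdot \mathbf{q} = 0 \tag{11} \]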

Here, the fluid density and viscosity are given by ρ and μ, respectively, and $\lambda_s(r)$ is the position-dependent softness parameter of the PEL, the reciprocal of the hydrodynamic screening length. For a diffuse polyelectrolyte layer, it can be expressed following Duval and Ohshima [20] as
\[ \lambda_s = \lambda_0 [h(r)]^{1/2} \tag{12} \]

where $\lambda_0$ is its value for a homogeneous distribution of polyelectrolyte segments. Here, the pressure is nondimensionalized by $\mu U_{HS} / a$.
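The scales introduced so far (Debye length, κa, Helmholtz–Smoluchowski velocity, and the Reynolds, Schmidt, and Peclet numbers) can be estimated with a few lines of Python. The sketch below is illustrative: the function names are hypothetical, a symmetric 1:1 electrolyte is assumed, and the default property values (relative permittivity 78.5, T = 298.15 K, μ = 10⁻³ Pa·s, ρ = 10³ kg/m³, D = 2 × 10⁻⁹ m²/s) are typical water-like assumptions that are not taken from the chapter.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, C
K_B = 1.380649e-23            # Boltzmann constant, J/K
N_AVOGADRO = 6.02214076e23    # 1/mol
EPS_0 = 8.8541878128e-12      # vacuum permittivity, F/m

def debye_length(conc_mM, eps_r=78.5, T=298.15):
    """EDL thickness for a symmetric 1:1 electrolyte (1 mM = 1 mol/m^3)."""
    n0 = N_AVOGADRO * conc_mM                 # bulk number density, 1/m^3
    return math.sqrt(EPS_0 * eps_r * K_B * T / (2.0 * E_CHARGE**2 * n0))

def flow_scales(a, E0, eps_r=78.5, T=298.15, mu=1.0e-3, rho=1.0e3, D=2.0e-9):
    """Velocity scale U_HS and the dimensionless groups Re, Sc, Pe."""
    phi0 = K_B * T / E_CHARGE                 # thermal potential, V
    U_HS = EPS_0 * eps_r * E0 * phi0 / mu     # Helmholtz-Smoluchowski velocity
    nu = mu / rho
    return {"phi0": phi0, "U_HS": U_HS,
            "Re": U_HS * a / nu, "Sc": nu / D, "Pe": U_HS * a / D}
```

Under these assumptions, `debye_length(10.0)` is about 3 nm, so a pore of radius 10 nm has κa of order 3, and by construction Pe = Re · Sc.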
The non-dimensional equations for fluid flow are given along axial direction z
and radial direction r respectively as follows

∂u ∂u ∂u ∂ p (κa)2 ∂φ
Re + Re u +v =− − − + (g − f )
∂t ∂z ∂r ∂z 2 ∂z

∂ 2u 1 ∂ ∂u
+ + r − β 2 hu (13)
∂z 2 r ∂r ∂r

∂v ∂v ∂v ∂ p (κa)2 ∂φ
Re + Re u +v =− − (g − f )
∂t ∂z ∂r ∂r 2 ∂r

∂ 2v 1 ∂ ∂v v
+ + r − 2 − β 2 hv (14)
∂z 2 r ∂r ∂r r

\[ \frac{\partial u}{\partial z} + \frac{\partial v}{\partial r} + \frac{v}{r} = 0 \tag{15} \]

Here, the non-dimensional softness parameter is β. It can be expressed through the softness degree of the PEL ($\lambda_0^{-1}$) as $\beta = a / \lambda_0^{-1}$. The softness degree of the PEL, $\lambda_0^{-1}$ (= $\sqrt{\mu / \gamma}$), where γ is the hydrodynamic frictional coefficient, governs the hydrodynamic field inside the nanopore, while the conductance is not affected significantly by the flow field. It ($\lambda_0^{-1}$) has the dimension of length and typically represents the characteristic penetration length of the fluid within the soft structure. Here, we varied the softness degree of the PEL ($\lambda_0^{-1}$) so as to obtain values of β between 1 and 20 [21, 22].
In the computational domain, we have used fully developed boundary condition
in the upstream and downstream boundaries. We have also considered the no-slip
condition along the channel walls. The rigid membrane surface is ion-impenetrable,
i.e., n · Ni = 0 with negative (ζ ) potential on the walls. The axisymmetric condition
is taken along the nanopore axis.

3 Numerical Schemes

The governing nonlinear coupled equations for potential, ion distribution, and fluid flow are solved by the finite volume method on a staggered grid. The discretized form of these equations is obtained by integrating the governing equations over each control volume. Different control volumes are used to integrate different equations. We used the higher-order upwind scheme QUICK (Quadratic Upstream Interpolation for Convective Kinematics [23]) to discretize the convective and electromigration terms in both the ion distribution and Navier–Stokes equations. The discretized governing equations are solved by the pressure correction-based iterative algorithm SIMPLE (Semi-Implicit Method for Pressure-Linked Equations [24]). We have taken a non-uniform grid along the radial direction r but a uniform grid along the axial direction z. To verify grid independence, we performed our computation for three different grid sizes, Grid 1: 400 × 250, Grid 2: 500 × 490, and Grid 3: 600 × 600, for EOF in a cylindrical channel. The non-uniform grid spacing δr lies in the range 0.0025 to 0.01, with δz = 0.0125 for Grid 1 and δz = 0.008 for Grid 3. In Grid 2, we have taken δz = 0.01 and 0.0025 ≤ δr ≤ 0.002, and δt was taken as 0.0001. To validate our numerical scheme, we have compared our computed solution for the axial velocity (u) of an electroosmotic flow (EOF) in a cylindrical channel with Ai et al. [25] and with the analytic solution in Fig. 3. Here, the bulk electrolyte concentration is 10 mM and there is no polyelectrolyte layer. The nanopore charge density is σ = −1 mC/m² (i.e., ζ = −0.019), and the applied electric field is −50 kV/m.

Fig. 3 Comparison of the axial velocity U (μm/s) versus r (nm) of electroosmotic flow in a cylindrical channel without PEL, i.e., h(r) = 0, for the present solution (Grid-1 and Grid-2), Ai et al. [25], and the analytic solution. The bulk concentration is 10 mM in KCl solution. The charge density of the nanopore is σ = −1 mC/m² (i.e., ζ = −0.56) and the applied electric field is −50 kV/m

4 Results and Discussions

In this paper, we focus on the EOF effects in a polyelectrolyte-coated cylindrical nanopore whose wall is charged or uncharged. The nanopore geometry is chosen in accordance with the experimental design of nanofluidic devices, where the nanopore radius is 3–30 nm. A continuum-based model is considered valid to capture the essential physics when the nanopore radius is larger than 3 nm. The thickness of the polyelectrolyte layer (d) is based on biological lipids, for which it typically ranges from 3 to 5 nm. We have considered a polyelectrolyte-coated nanopore with radius a = 10 nm and PEL thickness d = 4 nm. Here, we present the results for various values of pH, bulk ionic concentration, wall potential, thickness of the polyelectrolyte layer, and charge density of the polyelectrolyte layer.
We vary the bulk electrolyte concentration so that the Debye–Hückel parameter ranges from κa ∼ O(1) to κa ≫ 1. In our study, we have taken three functional groups: succinoglycan ($pK_a$ = 4.58, $pK_b$ = 8.6), proline ($pK_a$ = 1.99, $pK_b$ = 10.96; Wu et al. [3]), and glycine ($pK_a$ = 2.35, $pK_b$ = 9.78). We have considered $N_A = N_B$ = 10.23 mM so that the scaled charge density becomes $Q_A = Q_B = 10$.
Figure 4a–c shows the non-dimensional distributions of the ionic species g, f; the induced potential φ; and the axial velocity u, respectively, for different values of the PEL segment δ/d in the succinoglycan functional group. Here, we have considered the softness parameter β = 1, bulk ionic concentration C = 10 mM, and surface potential ζ = −1 with $Q_A = Q_B = 10$. It is clear from Fig. 4 that an increase in the PEL segment increases the net charge density and hence the axial electroosmotic velocity.
The distributions of axial velocity are shown in Fig. 5a for different values of pH in the succinoglycan functional group. We have considered pH values of 2, 4, 6, 8, 10, and 12 for the acidic and basic groups. The pH values are taken both below and above the corresponding $pK_a$ value of the succinoglycan functional group (i.e., $pK_a$ = 4.58). The axial velocity increases with pH for fixed softness parameter β = 1 and PEL segment δ/d = 0.5. Figure 5b shows the distribution of axial velocity for different values of the softness parameter β for the three functional groups succinoglycan, glycine, and proline. The axial velocity decreases as the softness parameter increases for these functional groups.

Fig. 4 Distribution of non-dimensional a ionic species g, f; b induced potential φ and c axial velocity u versus r for different δ/d in the succinoglycan functional group in the PEL. Here, pH = 2.0, β = 1, C = 10 mM, ζ = −1 with $Q_A = Q_B = 10$. Arrows indicate the increasing order of δ/d as 0.1, 0.2, 0.4, 0.6, 0.8 and 1.0

Fig. 5 Distribution of non-dimensional axial velocity u with a different pH when β = 1 in the succinoglycan functional group and b different functional groups for different softness parameter β when pH = 2. Here, δ/d = 0.5, C = 10 mM, ζ = −1 with $Q_A = Q_B = 10$. Arrows indicate the increasing order of pH as 2, 4, 6, 8, 10 and 12
Figure 6a shows the potential distribution for different pH in the succinoglycan functional group for fixed ionic concentration, PEL segment, and softness parameter. We consider pH values below and above the $pK_a$ value of the succinoglycan functional group. Figure 6a shows that for pH values lower than $pK_a$, the potential distribution is positive. For higher values of pH, the factor h(r) tends to unity and the polyelectrolyte behaves like a layer with a constant charge density $Q_{fix}$, which is independent of the bulk pH, and so the potential distribution is negative. When the pH is lower than but close to $pK_a$, a strong dependence of the PEL charge density on pH is observed. Figure 6b presents the non-dimensional distribution of potential for the three functional groups succinoglycan, glycine, and proline for different values of the softness parameter at fixed pH. It is evident from Fig. 6b that for pH lower than $pK_a$, the potential distribution is always positive for the three functional groups. Since the value of $pK_a$ for the succinoglycan functional group is higher than for the other functional groups, its potential is higher than the others, and the reverse happens for high pH.

Fig. 6 Distribution of non-dimensional electric potential φ versus r for a different pH in the succinoglycan functional group and b different functional groups of the PEL. Here, δ/d = 0.5, β = 1, C = 10 mM, ζ = 0 with $Q_A = Q_B = 10$. Arrows indicate the increasing order of pH as 2, 4, 6, 8, 10, 12
The non-dimensional average EOF velocity ($u_m$) in a cross section is defined by
\[ u_m = \frac{\int_s \mathbf{u} \cdot \mathbf{n} \, ds}{\pi a^2} \tag{16} \]
Here, n is the unit outward normal vector on the nanopore cross section and s is the cross-sectional area. The variation of the dimensional average flow $U_m$ with bulk pH is shown in Fig. 7a, b for different lengths of the PEL segment δ/d in the succinoglycan functional group when the nanopore surface potential is ζ = 0 and ζ = −1, respectively. For low pH, the average flow is negative, and it increases with pH when the nanopore surface potential is ζ = 0, as shown in Fig. 7a. Figure 7b, in contrast, shows that the average flow is positive and increases with pH when the nanopore wall is negatively charged. For both ζ = 0 and ζ = −1, the average flow increases with the length of the PEL segment δ/d for low pH values, and the reverse happens for high pH values.
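For an axisymmetric axial velocity profile u(r), the cross-sectional average in Eq. (16) reduces to $u_m = (2 / a^2) \int_0^a u(r)\, r\, dr$. The sketch below evaluates this with the trapezoidal rule on a possibly non-uniform radial grid; it is an illustration only, and the function name and the choice of quadrature are assumptions rather than the authors' procedure.

```python
import numpy as np

def cross_section_average(r, u, a=None):
    """Average of an axisymmetric profile u(r) over a circular cross section.

    r: increasing radial grid from 0 to a (may be non-uniform)
    u: axial velocity sampled on r
    """
    r = np.asarray(r, dtype=float)
    u = np.asarray(u, dtype=float)
    if a is None:
        a = r[-1]
    integrand = u * r
    # trapezoidal rule written out explicitly for a non-uniform grid
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r))
    return 2.0 * integral / a**2
```

As a check, a parabolic profile u = 1 − r² on the unit radius averages to 0.5, and a uniform profile averages to its own value.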
The effects of ion concentration on the average flow are plotted in Fig. 8a, b for the succinoglycan functional group when the softness parameter and the polyelectrolyte layer segment length are fixed. Figure 8a shows that the average flow increases from negative to positive with increasing bulk ionic concentration when the nanopore surface potential is ζ = 0, whereas for ζ = −1 the average flow is always positive and increases with bulk ionic concentration.

Fig. 7 Variation of dimensional average flow $U_m$ with bulk pH for different polyelectrolyte segment δ/d in the succinoglycan functional group. Here, C = 10 mM, β = 1, with $Q_A = Q_B = 10$. Arrows indicate the increasing order of δ/d as 0.1, 0.2, 0.4, 0.6, and 1.0. a ζ = 0 and b ζ = −1

Fig. 8 Variation of dimensional average flow $U_m$ with bulk pH for different ionic concentrations in the succinoglycan functional group. Here, δ/d = 0.5, β = 1 with $Q_A = Q_B = 10$. Arrows indicate the increasing order of C as 1 mM, 10 mM, 100 mM, and 1000 mM. a ζ = 0 and b ζ = −1

The variation of the dimensional average flow with the PEL segment is shown in Fig. 9 for different values of the softness parameter for the succinoglycan, glycine, and proline functional groups. Figure 9a, b shows the average flow when the nanopore surface potential is ζ = 0 and ζ = −1, respectively. An increase in the softness parameter decreases the average flow in all cases.
The current density is defined as follows

\[ \mathbf{j} = \sum_i e z_i \mathbf{N}_i = j_0 \sum_i \left( -z_i \nabla n_i - z_i^2 n_i \nabla \phi + Pe\, \mathbf{q} z_i n_i \right) \tag{17} \]

where $j_0$ (= $D_i e n_0 / a$) is the scale of the electric current density. We define the average current density ($I_z$) along the axial direction z as

\[ I_z = \int_s \mathbf{j} \cdot \mathbf{n} \, ds \tag{18} \]
Fig. 9 Variation of the dimensional average flow Um with polyelectrolyte segment length δ/d for different
values of the softness parameter β. Here, pH = 2 and C = 1000 mM, with Q_A = Q_B = 10. a ζ = 0 and b ζ =
−1. Solid, dashed, and dash-dotted lines represent the succinoglycan, glycine, and proline functional groups,
respectively

Fig. 10 Variation of scaled a average current density Iz and b total electric body force F with bulk
pH for different functional groups. Here, δ/d = 0.5, C = 1 mM, and β = 1, with Q_A = Q_B = 10.
a ζ = 0 and b ζ = −1

We also define the total electrostatic body force (F) over the entire nanopore, which is
given by

$$F = \int_0^l h\, f_z \, dz \quad (19)$$

where the average body force across the cylindrical cross section is defined as
$f_z = \int_0^r \rho_e E \, dr$. The scaled current density Iz and the total electric body force F for different
functional groups are shown in Fig. 10a, b when the nanopore surface potential is ζ = 0
and ζ = −1, respectively.

5 Conclusions

We have studied electroosmotic flow through a polyelectrolyte layer-embedded
nanopore. The governing equations for the electrokinetic flow are
the Nernst–Planck equation for the ion distribution, the Brinkman-modified Navier–Stokes
equation for the flow, and Poisson's equation for the EDL potential. These coupled governing
equations are solved by different algorithms within a finite volume approach. The axial
velocity increases with the polyelectrolyte segment length for a fixed pH value in the
succinoglycan functional group. The axial velocity also increases with pH for a
fixed polyelectrolyte segment length. The average flow rate decreases as the
softness parameter increases. Increasing the pH increases the average flow for
different bulk ionic concentrations. The effect of pH on the ionic current is more prominent
for the negatively charged nanopore than for the zero-potential nanopore. The electric body
force increases with pH for both the zero-charged nanopore and the negatively
charged nanopore.

Acknowledgements One of the authors (S. Bera) wishes to thank the Science and Engineering Research
Board, Department of Science and Technology, Government of India, for financial support through the
project with File No: ECR/2016/000771.

References

1. Squires, A., Hersey, J.S., Grinstaff, M.W., Meller, A.: A nanopore-nanofiber mesh biosensor
to control DNA translocation. J. Am. Chem. Soc. 135, 16304–16307 (2013)
2. Bergen, W.G., Wu, G.: Intestinal nitrogen recycling and utilization in health and disease. J.
Nutr. 139, 821–825 (2009)
3. Wu, G., Bazer, F.W., Burghardt, R.C., Johnson, G.A., Kim, S.W., Knabe, D.A., Li, P., Li,
X., McKnight, J.R., Satterfield, M.C., Spencer, T.E.: Proline and hydroxyprolinemetabolism:
implications for animal and human nutrition. Amino Acids 40, 1053–1063 (2011)
4. Probstein, R.F.: Physicochemical Hydrodynamics: An Introduction, 2nd edn. Wiley Inter-
science, New York (1994)
5. Conlisk, A.T., McFerran, J.: Mass transfer and flow in electrically charged micro-and nanochan-
nels. Anal. Chem. 74, 2139–2150 (2002)
6. Bera, S., Bhattacharyya, S.: On mixed electroosmotic-pressure driven flow and mass transport
in microchannels. Int. J. Eng. Sci. 62, 165–176 (2013)
7. Wang, C.-Y., Liu, Y.-H., Chang, C.C.: Analytical solution of electro-osmotic flow in a semi-
circular microchannel, ?Phys. Fluids 20, 063105–063111 (2008)
8. Chang, L., Jian, Y., Buren, M., Liu, Q., Sunb, Y.: Electroosmotic flow through a microtube
with sinusoidal roughness. J. Mol. Liq. 220, 258–264 (2016)
9. Rojasa, G., Arcosa, J., Peraltaa, M., Méndezb, F., Bautistaa, O.: Pulsatile electroosmotic flow
in a microcapillary with the slip boundary condition, Colloids and Surfaces A: Physicochem.
Eng. Aspects 513, 57–65 (2017)
10. Liu, B.-T., Tseng, S., Hsu, J.-P.: Analytical expressions for the electroosmotic flow in a charge-
regulated circular channel. Electrochem. Commun. 54, 1–5 (2015)
11. Patwary, J., Chen, G., Das, S.: Efficient electrochemomechanical energy conversion in
nanochannels grafted with polyelectrolyte layers with pH-dependent charge density. Microfluid
Nanofluid 20, 37–51 (2016)
168 S. Bera et al.

12. Ohshima, H.: Electrical phenomena of soft particles. A soft step function model. J. Phys. Chem.
A. 116, 6473–6480 (2012)
13. Tessier, F., Slater, G.W.: Modulation of electroosmotic flow strength with end-grafted polymer
chains. Macromolecules 39, 1250–1260 (2006)
14. Cao, Q., You, H.: Electroosmotic flow in mixed polymer brush-grafted nanochannels. Polymers
8, 438–449 (2016)
15. Bera, S., Bhattacharyya, S.: Effect of charge density on electrokinetic ions and fluid flow through
polyelectrolyte coated nanopore. In: ASME-Fluids Engineering Division Summer Meeting,
V01BT10A008-V01BT10A008 (2017). https://doi.org/10.1115/FEDSM2017-69194.
16. Ohshima, H.: A simple algorithm for the calculation of the electric double layer potential
distribution in a charged cylindrical narrow pore. Colloid Polym. Sci. 294, 1871–1875 (2016)
17. Das, S.: Explicit interrelationship between Donnan and surface potentials and explicit quan-
tification of capacitance of charged soft interfaces with pH-dependent charge. Colloids Surf.
A: Physicochem. Eng. Aspects 462, 6974 (2014)
18. Chen, G., Das, S.: Electroosmotic transport in polyelectrolyte-grafted nanochannels with pH-
dependent charge density. J. Appl. Phys. 117, 185304–185313 (2015)
19. Tseng, S., Lin, J.Y., Hsu, J.P.: Theoretical study of temperature influence on the electrophoresis
of a pH-regulated polyelectrolyte. Anal. Chim. Acta. 847, 80–89 (2014)
20. Duval, J.F.L., Ohshima, H.: Electrophoresis of diffuse soft particle. Langmuir 22, 3533–3546
(2006)
21. van Dorp, S., Keyser, U.F., Dekker, N.H., Dekker, C., Lemay, S.G.: Origin of the electrophoretic
force on DNA in solid-state nanopores. Nat. Phys. 5, 347–351 (2009)
22. Yeh, L.-H., Zhang, M., Qian, S., Hsu, J.-P.: Regulating DNA translocation through functional-
ized soft nanopores. Nanoscale 4, 2685–2693 (2012)
23. Leonard, B.P.: A stable and accurate convective modelling procedure based on quadratic
upstream interpolation. Comput. Methods Appl. Mech. Eng. 19, 59–98 (1979)
24. Fletcher, C.A.J.: Computational Techniques for Fluid Dynamics, vol-I & II Springer Ser. Com-
put. Phy. Springer, Heidelberg, New York (1991)
25. Ai, Y., Zhang, M., Joo, S.W.: Cheney. M.A., Qian. S.: Effects of electro osmotic flow on ionic
current rectification in conical nanopores. J. Phys. Chem. C. 114, 3883–3890 (2010)
Chapter 14
Quadruple Fixed Point Theorem
for Partially Ordered Metric Space
with Application to Integral Equations

Manjusha P. Gandhi and Anushri A. Aserkar

Abstract In this paper, two theorems are established. The first theorem proves
the existence of a quadruple fixed point in a partially ordered metric space
for a nonlinear contraction mapping that is (α)-admissible and satisfies the mixed
monotone property. The second result is proved for a non-continuous mapping under
some additional conditions. A suitable example of a nonlinear contraction mapping
validates the result. Moreover, an application to an integral equation is presented.

Keywords Complete metric space · Partially ordered set · Quadruple fixed point ·
Mixed monotone property · (α)-admissible

1 Introduction

The classical Banach contraction principle has been improved and generalized
by many researchers [1–8]. The existence of new fixed point theorems for
contraction mappings in partially ordered metric spaces was considered by Ran et
al. [9], Bhaskar et al. [10], Nieto et al. [11, 12], and Agarwal et al. [13]. Bhaskar
et al. [10] introduced the concept of a coupled fixed point and proved theorems in
partially ordered complete metric spaces. Lakshmikantham et al. [5] proved coupled
coincidence and coupled common fixed point theorems for nonlinear mappings in
partially ordered complete metric spaces. Later, numerous results on coupled fixed
points were obtained [14–19]. Berinde et al. [20] introduced the idea of
a tripled fixed point. Moreover, Samet et al. [21] proposed fixed points of order
N ≥ 3 for the first time. Karapınar [22] established quadruple fixed point theorems
in partially ordered metric spaces. Several researchers [23–26] were motivated by this work and

M. P. Gandhi (B)
Department of Mathematics, Yeshwantrao Chavan College of Engineering, Nagpur, India
e-mail: [email protected]
A. A. Aserkar
Department of Mathematics, Rajiv Gandhi College of Engineering
and Research, Nagpur, India
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2018 169

proved theorems on quadruple fixed points under certain constraints. The present
paper consists of three parts. In the first part, we prove two theorems. The first theorem
establishes the existence of a quadruple fixed point for a mapping that satisfies the mixed monotone
property and is (α)-admissible. The second theorem treats a non-continuous
mapping under some additional conditions. In addition, a suitable example
validates the result. In the last section, the result is applied to establish the existence of a
solution of a nonlinear integral equation. The theory of integral equations has many
applications in the real world; for example, integral equations are often applied
in engineering, mathematical physics, economics, and biology.

2 Preliminaries

2.1 Quadruple Fixed Point: Let X be a nonempty set, and let A : X × X × X × X → X.
An element (x, y, z, w) ∈ X × X × X × X is called a quadruple fixed point of A if

A(x, y, z, w) = x, A(y, z, w, x) = y, A(z, w, x, y) = z, A(w, x, y, z) = w.
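
As a quick numerical illustration of Definition 2.1, the following minimal Python sketch tests the four cyclic equations for a candidate point on the real line. The helper name `is_quadruple_fixed_point`, the tolerance `tol`, and the averaging map `A_demo` are illustrative assumptions and are not taken from the paper.

```python
def is_quadruple_fixed_point(A, point, tol=1e-12):
    """Return True if point = (x, y, z, w) satisfies the four cyclic equations.

    The tolerance `tol` is an assumption made for floating-point comparison.
    """
    x, y, z, w = point
    return (
        abs(A(x, y, z, w) - x) <= tol
        and abs(A(y, z, w, x) - y) <= tol
        and abs(A(z, w, x, y) - z) <= tol
        and abs(A(w, x, y, z) - w) <= tol
    )


def A_demo(x, y, z, w):
    """Hypothetical mapping on the real line: the average of its arguments."""
    return (x + y + z + w) / 4.0


# Every constant tuple is a quadruple fixed point of the averaging map.
print(is_quadruple_fixed_point(A_demo, (2.0, 2.0, 2.0, 2.0)))
# A non-constant tuple is not, since A_demo(1, 2, 3, 4) = 2.5 differs from 1.
print(is_quadruple_fixed_point(A_demo, (1.0, 2.0, 3.0, 4.0)))
```

Note that the four equations use cyclic shifts of the same tuple, so one function of four arguments is enough to state the condition.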

2.2 Mixed Monotone Property: Let (X , ≤) be a partially ordered set, and let
A : X × X × X × X → X be a mapping. We say that A has the mixed monotone
property if A(x, y, z, w) is monotone non-decreasing in x and z and is monotone
non-increasing in y and w, that is, for any x, y, z, w ∈ X .

x1 , x2 ∈ X , x1 ≤ x2 ⇒ A(x1 , y, z, w) ≤ A(x2 , y, z, w)

y1 , y2 ∈ X , y1 ≤ y2 ⇒ A(x, y2 , z, w) ≥ A(x, y1 , z, w)

z1 , z2 ∈ X , z1 ≤ z2 ⇒ A(x, y, z1 , w) ≤ A(x, y, z2 , w)

w1 , w2 ∈ X , w1 ≤ w2 ⇒ A(x, y, z, w2 ) ≥ A(x, y, z, w1 )
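
The four implications of Definition 2.2 can be spot-checked numerically on a finite set of sample values. The sketch below is only evidence on the chosen grid, not a proof; the function name `has_mixed_monotone_property`, the sample grid, and the two test mappings are assumptions introduced for illustration.

```python
from itertools import product


def has_mixed_monotone_property(A, samples):
    """Check Definition 2.2 on a finite grid of real sample values.

    Returns False as soon as one of the four implications fails on the grid.
    """
    ordered_pairs = [(lo, hi) for lo, hi in product(samples, repeat=2) if lo <= hi]
    for x, y, z, w in product(samples, repeat=4):
        for lo, hi in ordered_pairs:
            if A(lo, y, z, w) > A(hi, y, z, w):  # non-decreasing in x
                return False
            if A(x, lo, z, w) < A(x, hi, z, w):  # non-increasing in y
                return False
            if A(x, y, lo, w) > A(x, y, hi, w):  # non-decreasing in z
                return False
            if A(x, y, z, lo) < A(x, y, z, hi):  # non-increasing in w
                return False
    return True


def A_mixed(x, y, z, w):
    """Hypothetical mapping that increases in x, z and decreases in y, w."""
    return x - y + z - w


def A_increasing(x, y, z, w):
    """Hypothetical mapping that increases in every argument."""
    return x + y + z + w


grid = [-2, 0, 1, 3]
print(has_mixed_monotone_property(A_mixed, grid))
print(has_mixed_monotone_property(A_increasing, grid))
```

The first mapping passes on the grid, while the second fails because it increases in its second argument.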

2.3 Let ψ be the family of non-decreasing functions ξ(t) such that $\sum_{n=1}^{\infty} \xi^{n}(t) < \infty$
for all t > 0, where $\xi^{n}$ denotes the n-th iterate of ξ, satisfying (i) ξ(0) = 0, (ii) ξ(t) < t for all t > 0,
and (iii) $\lim_{r \to t^{+}} \xi(r) < t$ for all t > 0.
2.4 (α)-admissible: Let A : X × X × X × X → X and $\alpha : X^4 \times X^4 \to [1, \infty)$ be
two mappings. Then, A is said to be (α)-admissible if

$$\alpha\big((x, y, z, w), (p, q, r, s)\big) \ge 1 \;\Rightarrow\; \alpha\big((A(x, y, z, w), A(y, z, w, x), A(z, w, x, y), A(w, x, y, z)),\ (A(p, q, r, s), A(q, r, s, p), A(r, s, p, q), A(s, p, q, r))\big) \ge 1$$

for all x, y, z, w, p, q, r, s ∈ X.

3 Main Theorem

In this section, we establish two quadruple fixed point theorems for an (α)-admissible
mapping satisfying the mixed monotone property. In the second theorem, the
continuity of the mapping is not assumed.

Theorem 3.1 Let (X, d, ≤) be a partially ordered complete metric space, and let
A : X × X × X × X → X be a mapping having the mixed monotone property on X. Suppose
that there exist ξ ∈ ψ and $\alpha : X^4 \times X^4 \to [1, \infty)$ such that, for x, y, z, w, p, q, r, s ∈ X,
the following holds:

$$\alpha\big((x, y, z, w), (p, q, r, s)\big)\, d\big(A(x, y, z, w), A(p, q, r, s)\big) \le \xi\!\left(\frac{d(x, p) + d(y, q) + d(z, r) + d(w, s)}{4}\right) \quad (1)$$

for all x ≥ p, y ≤ q, z ≥ r, w ≤ s. Also,

(i) A is (α)-admissible,
(ii) there exists $(x_0, y_0, z_0, w_0) \in X^4$ such that
$$\alpha\big((A(x_0, y_0, z_0, w_0), A(y_0, z_0, w_0, x_0), A(z_0, w_0, x_0, y_0), A(w_0, x_0, y_0, z_0)),\ (x_0, y_0, z_0, w_0)\big) \ge 1,$$
(iii) A is continuous.
If there exist $x_0, y_0, z_0, w_0 \in X$ such that
$x_0 \le A(x_0, y_0, z_0, w_0)$, $y_0 \ge A(y_0, z_0, w_0, x_0)$, $z_0 \le A(z_0, w_0, x_0, y_0)$, and
$w_0 \ge A(w_0, x_0, y_0, z_0)$, then A has a quadruple fixed point.

Proof Let $(x_0, y_0, z_0, w_0) \in X^4$ be such that

$$x_0 \le A(x_0, y_0, z_0, w_0) = x_1, \quad y_0 \ge A(y_0, z_0, w_0, x_0) = y_1,$$
$$z_0 \le A(z_0, w_0, x_0, y_0) = z_1, \quad w_0 \ge A(w_0, x_0, y_0, z_0) = w_1.$$

Thus, $x_0 \le x_1$, $y_0 \ge y_1$, $z_0 \le z_1$, $w_0 \ge w_1$.
Next, set $x_2 = A(x_1, y_1, z_1, w_1)$, $y_2 = A(y_1, z_1, w_1, x_1)$, $z_2 = A(z_1, w_1, x_1, y_1)$, and
$w_2 = A(w_1, x_1, y_1, z_1)$.
Since A has the mixed monotone property,

$$x_0 \le x_1 \le x_2, \quad y_0 \ge y_1 \ge y_2, \quad z_0 \le z_1 \le z_2, \quad w_0 \ge w_1 \ge w_2.$$

By continuing this process, we construct sequences $\{x_n\}, \{y_n\}, \{z_n\}, \{w_n\}$ in X
such that $x_{n+1} = A(x_n, y_n, z_n, w_n)$, $y_{n+1} = A(y_n, z_n, w_n, x_n)$, $z_{n+1} = A(z_n, w_n, x_n, y_n)$, and
$w_{n+1} = A(w_n, x_n, y_n, z_n)$.
Since A has the mixed monotone property,

$$x_n \le x_{n+1}, \quad y_n \ge y_{n+1}, \quad z_n \le z_{n+1}, \quad w_n \ge w_{n+1}. \quad (2)$$

If, for some $n \in \mathbb{N}$, $x_n = x_{n+1}$, $y_n = y_{n+1}$, $z_n = z_{n+1}$, and $w_n = w_{n+1}$,
then $(x_n, y_n, z_n, w_n)$ is a quadruple fixed point of A.
Hence, we assume $x_n \ne x_{n+1}$, $y_n \ne y_{n+1}$, $z_n \ne z_{n+1}$, $w_n \ne w_{n+1}$ for every $n \in \mathbb{N}$.
Since A is (α)-admissible,

$$\alpha\big((x_0, y_0, z_0, w_0), (x_1, y_1, z_1, w_1)\big) \ge 1$$
$$\Rightarrow\; \alpha\big((A(x_0, y_0, z_0, w_0), A(y_0, z_0, w_0, x_0), A(z_0, w_0, x_0, y_0), A(w_0, x_0, y_0, z_0)),\ (A(x_1, y_1, z_1, w_1), A(y_1, z_1, w_1, x_1), A(z_1, w_1, x_1, y_1), A(w_1, x_1, y_1, z_1))\big) \ge 1,$$

that is,

$$\alpha\big((x_1, y_1, z_1, w_1), (x_2, y_2, z_2, w_2)\big) \ge 1.$$

Similarly, we may prove that

$$\alpha\big((y_1, z_1, w_1, x_1), (y_2, z_2, w_2, x_2)\big) \ge 1, \quad \alpha\big((z_1, w_1, x_1, y_1), (z_2, w_2, x_2, y_2)\big) \ge 1,$$
$$\alpha\big((w_1, x_1, y_1, z_1), (w_2, x_2, y_2, z_2)\big) \ge 1.$$

Continuing and generalizing, we get

$$\begin{aligned}
\alpha\big((x_n, y_n, z_n, w_n), (x_{n+1}, y_{n+1}, z_{n+1}, w_{n+1})\big) &\ge 1, \\
\alpha\big((y_n, z_n, w_n, x_n), (y_{n+1}, z_{n+1}, w_{n+1}, x_{n+1})\big) &\ge 1, \\
\alpha\big((z_n, w_n, x_n, y_n), (z_{n+1}, w_{n+1}, x_{n+1}, y_{n+1})\big) &\ge 1, \\
\alpha\big((w_n, x_n, y_n, z_n), (w_{n+1}, x_{n+1}, y_{n+1}, z_{n+1})\big) &\ge 1.
\end{aligned} \quad (3)$$

Putting $(x, y, z, w) = (x_n, y_n, z_n, w_n)$ and $(p, q, r, s) = (x_{n-1}, y_{n-1}, z_{n-1}, w_{n-1})$ in (1),
we get

$$\begin{aligned}
d(x_{n+1}, x_n) &= d\big(A(x_n, y_n, z_n, w_n), A(x_{n-1}, y_{n-1}, z_{n-1}, w_{n-1})\big) \\
&\le \alpha\big((x_n, y_n, z_n, w_n), (x_{n-1}, y_{n-1}, z_{n-1}, w_{n-1})\big)\, d\big(A(x_n, y_n, z_n, w_n), A(x_{n-1}, y_{n-1}, z_{n-1}, w_{n-1})\big) \\
&\le \xi\!\left(\frac{d(x_n, x_{n-1}) + d(y_n, y_{n-1}) + d(z_n, z_{n-1}) + d(w_n, w_{n-1})}{4}\right)
\end{aligned} \quad (4)$$

Similarly, we may prove that

$$\begin{aligned}
d(y_{n+1}, y_n) &= d\big(A(y_n, z_n, w_n, x_n), A(y_{n-1}, z_{n-1}, w_{n-1}, x_{n-1})\big) \\
&\le \alpha\big((y_n, z_n, w_n, x_n), (y_{n-1}, z_{n-1}, w_{n-1}, x_{n-1})\big)\, d\big(A(y_n, z_n, w_n, x_n), A(y_{n-1}, z_{n-1}, w_{n-1}, x_{n-1})\big) \\
&\le \xi\!\left(\frac{d(y_n, y_{n-1}) + d(z_n, z_{n-1}) + d(w_n, w_{n-1}) + d(x_n, x_{n-1})}{4}\right)
\end{aligned} \quad (5)$$

$$\begin{aligned}
d(z_{n+1}, z_n) &= d\big(A(z_n, w_n, x_n, y_n), A(z_{n-1}, w_{n-1}, x_{n-1}, y_{n-1})\big) \\
&\le \alpha\big((z_n, w_n, x_n, y_n), (z_{n-1}, w_{n-1}, x_{n-1}, y_{n-1})\big)\, d\big(A(z_n, w_n, x_n, y_n), A(z_{n-1}, w_{n-1}, x_{n-1}, y_{n-1})\big) \\
&\le \xi\!\left(\frac{d(z_n, z_{n-1}) + d(w_n, w_{n-1}) + d(x_n, x_{n-1}) + d(y_n, y_{n-1})}{4}\right)
\end{aligned} \quad (6)$$

$$\begin{aligned}
d(w_{n+1}, w_n) &= d\big(A(w_n, x_n, y_n, z_n), A(w_{n-1}, x_{n-1}, y_{n-1}, z_{n-1})\big) \\
&\le \alpha\big((w_n, x_n, y_n, z_n), (w_{n-1}, x_{n-1}, y_{n-1}, z_{n-1})\big)\, d\big(A(w_n, x_n, y_n, z_n), A(w_{n-1}, x_{n-1}, y_{n-1}, z_{n-1})\big) \\
&\le \xi\!\left(\frac{d(w_n, w_{n-1}) + d(x_n, x_{n-1}) + d(y_n, y_{n-1}) + d(z_n, z_{n-1})}{4}\right)
\end{aligned} \quad (7)$$

Therefore,

$$\max\{d(x_{n+1}, x_n), d(y_{n+1}, y_n), d(z_{n+1}, z_n), d(w_{n+1}, w_n)\} \le \xi\!\left(\frac{d(w_n, w_{n-1}) + d(x_n, x_{n-1}) + d(y_n, y_{n-1}) + d(z_n, z_{n-1})}{4}\right),$$

and, since the average of four numbers does not exceed their maximum,

$$\frac{d(x_{n+1}, x_n) + d(y_{n+1}, y_n) + d(z_{n+1}, z_n) + d(w_{n+1}, w_n)}{4} \le \xi\!\left(\frac{d(w_n, w_{n-1}) + d(x_n, x_{n-1}) + d(y_n, y_{n-1}) + d(z_n, z_{n-1})}{4}\right) \quad (8)$$

Continuing with the same steps, and using that ξ is non-decreasing, we get

$$\frac{d(x_{n+1}, x_n) + d(y_{n+1}, y_n) + d(z_{n+1}, z_n) + d(w_{n+1}, w_n)}{4} \le \xi^{n}\!\left(\frac{d(x_1, x_0) + d(y_1, y_0) + d(z_1, z_0) + d(w_1, w_0)}{4}\right).$$

Since $\sum_{n=1}^{\infty} \xi^{n}(t) < \infty$ for all t > 0, for every ε > 0 there exists $n(\varepsilon) \in \mathbb{N}$ such that

$$\sum_{i \ge n(\varepsilon)} \xi^{i}\!\left(\frac{d(x_1, x_0) + d(y_1, y_0) + d(z_1, z_0) + d(w_1, w_0)}{4}\right) < \frac{\varepsilon}{4}.$$

Let $m, n \in \mathbb{N}$ be such that $m > n \ge n(\varepsilon)$. By the triangle inequality,

$$\begin{aligned}
\frac{d(x_m, x_n) + d(y_m, y_n) + d(z_m, z_n) + d(w_m, w_n)}{4}
&\le \sum_{i=n}^{m-1} \frac{d(x_i, x_{i+1}) + d(y_i, y_{i+1}) + d(z_i, z_{i+1}) + d(w_i, w_{i+1})}{4} \\
&\le \sum_{i=n}^{m-1} \xi^{i}\!\left(\frac{d(x_1, x_0) + d(y_1, y_0) + d(z_1, z_0) + d(w_1, w_0)}{4}\right) < \frac{\varepsilon}{4}.
\end{aligned}$$

Therefore,

$$d(x_m, x_n) + d(y_m, y_n) + d(z_m, z_n) + d(w_m, w_n) < \varepsilon.$$

Now,

$$d(x_m, x_n) \le d(x_m, x_n) + d(y_m, y_n) + d(z_m, z_n) + d(w_m, w_n) < \varepsilon.$$

Similarly,

$$d(y_m, y_n) \le d(x_m, x_n) + d(y_m, y_n) + d(z_m, z_n) + d(w_m, w_n) < \varepsilon,$$
$$d(z_m, z_n) \le d(x_m, x_n) + d(y_m, y_n) + d(z_m, z_n) + d(w_m, w_n) < \varepsilon,$$
$$d(w_m, w_n) \le d(x_m, x_n) + d(y_m, y_n) + d(z_m, z_n) + d(w_m, w_n) < \varepsilon.$$

Hence, $\{x_n\}, \{y_n\}, \{z_n\}, \{w_n\}$ are Cauchy sequences in (X, d).

Since (X, d) is a complete metric space, $\{x_n\}, \{y_n\}, \{z_n\}, \{w_n\}$ must converge in it.
Let x, y, z, w ∈ X be such that

$$\lim_{n \to \infty} x_n = x, \quad \lim_{n \to \infty} y_n = y, \quad \lim_{n \to \infty} z_n = z, \quad \lim_{n \to \infty} w_n = w.$$

A is continuous and

$$x_{n+1} = A(x_n, y_n, z_n, w_n), \quad y_{n+1} = A(y_n, z_n, w_n, x_n), \quad z_{n+1} = A(z_n, w_n, x_n, y_n), \quad w_{n+1} = A(w_n, x_n, y_n, z_n).$$

Taking the limit as n → ∞ on both sides, we get

$$\lim_{n \to \infty} x_{n+1} = \lim_{n \to \infty} A(x_n, y_n, z_n, w_n) \;\Rightarrow\; x = A(x, y, z, w),$$
$$\lim_{n \to \infty} y_{n+1} = \lim_{n \to \infty} A(y_n, z_n, w_n, x_n) \;\Rightarrow\; y = A(y, z, w, x),$$
$$\lim_{n \to \infty} z_{n+1} = \lim_{n \to \infty} A(z_n, w_n, x_n, y_n) \;\Rightarrow\; z = A(z, w, x, y),$$
$$\lim_{n \to \infty} w_{n+1} = \lim_{n \to \infty} A(w_n, x_n, y_n, z_n) \;\Rightarrow\; w = A(w, x, y, z).$$

Thus, A has a quadruple fixed point in (X, d).
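
The iterative scheme used in the proof can be sketched numerically. The minimal Python example below applies it to the mapping of Example 3.3; the starting point, the stopping tolerance `tol`, and the function name `iterate_quadruple` are assumptions chosen for illustration, and the sketch does not check the order hypotheses of Theorem 3.1.

```python
import math


def A(x, y, z, w):
    """Mapping of Example 3.3: (1/16) ln((1+|x|)(1+|y|)(1+|z|)(1+|w|))."""
    return math.log((1 + abs(x)) * (1 + abs(y)) * (1 + abs(z)) * (1 + abs(w))) / 16.0


def iterate_quadruple(A, start, tol=1e-12, max_iter=1000):
    """Iterate x_{n+1} = A(x_n, y_n, z_n, w_n) and its three cyclic analogues.

    Stops when the total change between consecutive iterates drops below tol.
    Returns the last iterate and the number of iterations performed.
    """
    x, y, z, w = start
    for n in range(1, max_iter + 1):
        nx, ny, nz, nw = A(x, y, z, w), A(y, z, w, x), A(z, w, x, y), A(w, x, y, z)
        change = abs(nx - x) + abs(ny - y) + abs(nz - z) + abs(nw - w)
        x, y, z, w = nx, ny, nz, nw
        if change < tol:
            return (x, y, z, w), n
    return (x, y, z, w), max_iter


# The starting point below is an arbitrary assumption.
limit, steps = iterate_quadruple(A, (1.0, 2.0, 3.0, 4.0))
print(limit, steps)
```

For this mapping the iterates shrink toward (0, 0, 0, 0), which agrees with the quadruple fixed point reported in Example 3.3.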

In the next theorem, we omit the continuity of A.

Theorem 3.2 Let (X, d, ≤) be a partially ordered complete metric space. Let
A : X × X × X × X → X be a mapping having the mixed monotone property on X.
Suppose that there exist ξ ∈ ψ and $\alpha : X^4 \times X^4 \to [1, \infty)$ such that, for
x, y, z, w, p, q, r, s ∈ X, the following holds:

$$\alpha\big((x, y, z, w), (p, q, r, s)\big)\, d\big(A(x, y, z, w), A(p, q, r, s)\big) \le \xi\!\left(\frac{d(x, p) + d(y, q) + d(z, r) + d(w, s)}{4}\right)$$

for all x ≥ p, y ≤ q, z ≥ r, w ≤ s. Also,
(i) A is (α)-admissible,
(ii) there exists $(x_0, y_0, z_0, w_0) \in X^4$ such that

$$\alpha\big((x_0, y_0, z_0, w_0),\ (A(x_0, y_0, z_0, w_0), A(y_0, z_0, w_0, x_0), A(z_0, w_0, x_0, y_0), A(w_0, x_0, y_0, z_0))\big) \ge 1,$$

(iii) if $\{x_n\}, \{y_n\}, \{z_n\}, \{w_n\}$ are sequences in X such that

$$\alpha\big((x_n, y_n, z_n, w_n), (x_{n+1}, y_{n+1}, z_{n+1}, w_{n+1})\big) \ge 1, \quad \alpha\big((y_n, z_n, w_n, x_n), (y_{n+1}, z_{n+1}, w_{n+1}, x_{n+1})\big) \ge 1,$$
$$\alpha\big((z_n, w_n, x_n, y_n), (z_{n+1}, w_{n+1}, x_{n+1}, y_{n+1})\big) \ge 1, \quad \alpha\big((w_n, x_n, y_n, z_n), (w_{n+1}, x_{n+1}, y_{n+1}, z_{n+1})\big) \ge 1,$$

and $\lim_{n \to \infty} x_n = x$, $\lim_{n \to \infty} y_n = y$, $\lim_{n \to \infty} z_n = z$, $\lim_{n \to \infty} w_n = w$, then

$$\alpha\big((x_n, y_n, z_n, w_n), (x, y, z, w)\big) \ge 1, \quad \alpha\big((y_n, z_n, w_n, x_n), (y, z, w, x)\big) \ge 1,$$
$$\alpha\big((z_n, w_n, x_n, y_n), (z, w, x, y)\big) \ge 1, \quad \alpha\big((w_n, x_n, y_n, z_n), (w, x, y, z)\big) \ge 1.$$

If there exists $(x_0, y_0, z_0, w_0) \in X^4$ such that

$$x_0 \le A(x_0, y_0, z_0, w_0), \quad y_0 \ge A(y_0, z_0, w_0, x_0), \quad z_0 \le A(z_0, w_0, x_0, y_0), \quad w_0 \ge A(w_0, x_0, y_0, z_0),$$

then A has a quadruple fixed point.

Proof As in the proof of Theorem 3.1, $\{x_n\}, \{y_n\}, \{z_n\}, \{w_n\}$ are Cauchy
sequences in X; therefore, there exist x, y, z, w ∈ X such that

$$\lim_{n \to \infty} x_n = x, \quad \lim_{n \to \infty} y_n = y, \quad \lim_{n \to \infty} z_n = z, \quad \lim_{n \to \infty} w_n = w,$$

and hence, by condition (iii),

$$\alpha\big((x_n, y_n, z_n, w_n), (x, y, z, w)\big) \ge 1, \quad \alpha\big((y_n, z_n, w_n, x_n), (y, z, w, x)\big) \ge 1,$$
$$\alpha\big((z_n, w_n, x_n, y_n), (z, w, x, y)\big) \ge 1, \quad \alpha\big((w_n, x_n, y_n, z_n), (w, x, y, z)\big) \ge 1.$$

Now,

$$\begin{aligned}
d(A(x, y, z, w), x) &\le d\big(A(x, y, z, w), A(x_n, y_n, z_n, w_n)\big) + d(x_{n+1}, x) \\
&\le \alpha\big((x_n, y_n, z_n, w_n), (x, y, z, w)\big)\, d\big(A(x, y, z, w), A(x_n, y_n, z_n, w_n)\big) + d(x_{n+1}, x) \\
&\le \xi\!\left(\frac{d(x_n, x) + d(y_n, y) + d(z_n, z) + d(w_n, w)}{4}\right) + d(x_{n+1}, x) \\
&\le \frac{d(x_n, x) + d(y_n, y) + d(z_n, z) + d(w_n, w)}{4} + d(x_{n+1}, x).
\end{aligned}$$

Similarly,

$$d(A(y, z, w, x), y) \le \frac{d(y_n, y) + d(z_n, z) + d(w_n, w) + d(x_n, x)}{4} + d(y_{n+1}, y),$$
$$d(A(z, w, x, y), z) \le \frac{d(z_n, z) + d(w_n, w) + d(x_n, x) + d(y_n, y)}{4} + d(z_{n+1}, z),$$
$$d(A(w, x, y, z), w) \le \frac{d(w_n, w) + d(x_n, x) + d(y_n, y) + d(z_n, z)}{4} + d(w_{n+1}, w).$$

Taking the limit as n → ∞ on both sides, we get

$$d(A(x, y, z, w), x) = 0 \;\Rightarrow\; A(x, y, z, w) = x.$$

Similarly,

$$d(A(y, z, w, x), y) = 0 \;\Rightarrow\; A(y, z, w, x) = y,$$
$$d(A(z, w, x, y), z) = 0 \;\Rightarrow\; A(z, w, x, y) = z,$$
$$d(A(w, x, y, z), w) = 0 \;\Rightarrow\; A(w, x, y, z) = w. \quad (9)$$

Example 3.3 Let X = R and let d : X × X → R be given by d(x, y) = |x − y|.

Define A : X × X × X × X → R by

$$A(x, y, z, w) = \frac{1}{16} \ln\big((1 + |x|)(1 + |y|)(1 + |z|)(1 + |w|)\big)$$

for all x, y, z, w ∈ X.

Let $\alpha : X^4 \times X^4 \to [1, \infty)$ be such that

$$\alpha\big((x, y, z, w), (p, q, r, s)\big) = \begin{cases} 2 & \text{if } x \ge p,\ y \le q,\ z \ge r,\ w \le s, \\ 0 & \text{otherwise,} \end{cases}$$

and $\xi(t) = \frac{1}{2} \ln(1 + |t|)$.

Then, we get

$$\begin{aligned}
d\big(A(x, y, z, w), A(p, q, r, s)\big)
&= \left| \frac{1}{16} \ln\big((1 + |x|)(1 + |y|)(1 + |z|)(1 + |w|)\big) - \frac{1}{16} \ln\big((1 + |p|)(1 + |q|)(1 + |r|)(1 + |s|)\big) \right| \\
&= \left| \frac{1}{16} \ln \frac{1 + |x|}{1 + |p|} + \frac{1}{16} \ln \frac{1 + |y|}{1 + |q|} + \frac{1}{16} \ln \frac{1 + |z|}{1 + |r|} + \frac{1}{16} \ln \frac{1 + |w|}{1 + |s|} \right| \\
&\le \frac{1}{4} \left( \frac{1}{4} \ln(1 + |x - p|) + \frac{1}{4} \ln(1 + |y - q|) + \frac{1}{4} \ln(1 + |z - r|) + \frac{1}{4} \ln(1 + |w - s|) \right) \\
&\le \frac{1}{4} \ln \frac{4 + |x - p| + |y - q| + |z - r| + |w - s|}{4} \\
&= \frac{1}{4} \ln\!\left( 1 + \frac{|x - p| + |y - q| + |z - r| + |w - s|}{4} \right).
\end{aligned}$$

Therefore,

$$2\, d\big(A(x, y, z, w), A(p, q, r, s)\big) \le \frac{1}{2} \ln\!\left( 1 + \frac{|x - p| + |y - q| + |z - r| + |w - s|}{4} \right),$$

i.e.,

$$\alpha\big((x, y, z, w), (p, q, r, s)\big)\, d\big(A(x, y, z, w), A(p, q, r, s)\big) \le \xi\!\left(\frac{d(x, p) + d(y, q) + d(z, r) + d(w, s)}{4}\right).$$

Thus, all the conditions of Theorem 3.1 are satisfied. Hence, (0, 0, 0, 0) is a quadruple
fixed point of A.
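
The contraction inequality derived in this example can be spot-checked on random points. The sketch below is a numerical sanity check under stated assumptions (sampling range `span`, sample count `trials`, random seed, and a small floating-point slack); it is not a substitute for the derivation above, and the function name `count_violations` is hypothetical.

```python
import math
import random


def A(x, y, z, w):
    """Mapping of Example 3.3."""
    return math.log((1 + abs(x)) * (1 + abs(y)) * (1 + abs(z)) * (1 + abs(w))) / 16.0


def xi(t):
    """Comparison function of Example 3.3: (1/2) ln(1 + |t|)."""
    return 0.5 * math.log(1 + abs(t))


def alpha(u, v):
    """Weight of Example 3.3: 2 under the order conditions, 0 otherwise."""
    (x, y, z, w), (p, q, r, s) = u, v
    return 2.0 if (x >= p and y <= q and z >= r and w <= s) else 0.0


def count_violations(trials=10000, seed=0, span=50.0):
    """Count sampled pairs for which inequality (1) fails (expected: none)."""
    rng = random.Random(seed)
    violations = 0
    for _ in range(trials):
        u = tuple(rng.uniform(-span, span) for _ in range(4))
        v = tuple(rng.uniform(-span, span) for _ in range(4))
        lhs = alpha(u, v) * abs(A(*u) - A(*v))
        rhs = xi(sum(abs(a - b) for a, b in zip(u, v)) / 4.0)
        if lhs > rhs + 1e-12:  # small slack for rounding error
            violations += 1
    return violations


print(count_violations())
```

Only about one pair in sixteen satisfies the order conditions and therefore has weight 2, so a larger `trials` value gives a more informative check.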

4 Application

In this section, we present an application of the quadruple fixed point theorem to
establish the existence of a solution of the following integral equation:

$$u(t) = \int_a^b \theta_1(s, t)\, \{F_1(s, u(s)) + F_2(s, u(s)) + F_3(s, u(s)) + F_4(s, u(s))\}\, ds + h(t), \quad (10)$$

where t ∈ [a, b].
Let X = C([a, b], R) denote the class of R-valued continuous functions on the
interval [a, b], endowed with the metric $d(u, v) = \max_{t \in [a, b]} |u(t) - v(t)|$ for u, v ∈ X.
Define the partial order "≤" on X by: for x, y ∈ X, x ≤ y ⇔ x(t) ≤ y(t) for all t ∈ [a, b].
Then (X, d, ≤) is a partially ordered complete metric space.
We suppose that
(i) $F_1, F_2, F_3, F_4 : [a, b] \times R \to R$ are continuous.
(ii) $\theta_1 : [a, b] \times [a, b] \to R$ is continuous.
(iii) $h : [a, b] \to R$ is continuous.
(iv) The following inequalities hold:

$$0 \le F_1(s, x) - F_1(s, y) \le \lambda\, \xi\!\left(\frac{x - y}{4}\right),$$
$$0 \le F_2(s, y) - F_2(s, x) \le \eta\, \xi\!\left(\frac{x - y}{4}\right),$$
$$0 \le F_3(s, x) - F_3(s, y) \le \delta\, \xi\!\left(\frac{x - y}{4}\right),$$
$$0 \le F_4(s, y) - F_4(s, x) \le \varepsilon\, \xi\!\left(\frac{x - y}{4}\right),$$

for λ, η, δ, ε > 0 and x, y ∈ R with x ≥ y, where ξ : [0, ∞) → [0, ∞) is a non-decreasing
function such that ξ(t) < t and $\lim_{r \to t^{+}} \xi(r) < t$ for all t > 0.
(v) sup(λ, η, δ, ε) = β and $4 \gamma \beta \int_a^b \theta_1(s, t)\, ds \le 1$, where γ > 1.
(vi) There exist functions x, y, z, w : [a, b] → R such that

$$x(t) \le \int_a^b \theta_1(s, t)\, \{F_1(s, x(s)) + F_2(s, y(s)) + F_3(s, z(s)) + F_4(s, w(s))\}\, ds + h(t),$$
$$y(t) \ge \int_a^b \theta_1(s, t)\, \{F_1(s, y(s)) + F_2(s, z(s)) + F_3(s, w(s)) + F_4(s, x(s))\}\, ds + h(t),$$
$$z(t) \le \int_a^b \theta_1(s, t)\, \{F_1(s, z(s)) + F_2(s, w(s)) + F_3(s, x(s)) + F_4(s, y(s))\}\, ds + h(t),$$
$$w(t) \ge \int_a^b \theta_1(s, t)\, \{F_1(s, w(s)) + F_2(s, x(s)) + F_3(s, y(s)) + F_4(s, z(s))\}\, ds + h(t),$$

for all t ∈ [a, b].

Theorem 4.1 Consider the integral equation (10), and suppose that $\theta_1$, h, $F_1$, $F_2$,
$F_3$, $F_4$ satisfy all of the assumptions (i)–(vi). Then the operator associated with equation (10)
has a quadruple fixed point in C([a, b], R).

Proof Consider $A : X^4 \to X$ defined by

$$A(x_1, x_2, x_3, x_4)(t) = \int_a^b \theta_1(s, t)\, \{F_1(s, x_1(s)) + F_2(s, x_2(s)) + F_3(s, x_3(s)) + F_4(s, x_4(s))\}\, ds + h(t) \quad (11)$$

for $x_1, x_2, x_3, x_4 \in X$.
We will prove that A satisfies all the conditions of Theorem 3.1.
First, let us prove that A has the mixed monotone property.
Let $x_1, y_1 \in X$ with $x_1 \le y_1$ and t ∈ [a, b]. Then we have

$$A(y_1, x_2, x_3, x_4)(t) - A(x_1, x_2, x_3, x_4)(t) = \int_a^b \theta_1(s, t)\, \{F_1(s, y_1(s)) - F_1(s, x_1(s))\}\, ds.$$

Since $x_1(t) \le y_1(t)$, assumption (iv) gives

$$F_1(s, y_1(s)) - F_1(s, x_1(s)) \ge 0.$$

Thus,

$$A(y_1, x_2, x_3, x_4)(t) - A(x_1, x_2, x_3, x_4)(t) \ge 0 \;\Rightarrow\; A(x_1, x_2, x_3, x_4)(t) \le A(y_1, x_2, x_3, x_4)(t).$$

Let $x_2, y_2 \in X$ with $x_2 \le y_2$ and t ∈ [a, b]. Then we have

$$A(x_1, x_2, x_3, x_4)(t) - A(x_1, y_2, x_3, x_4)(t) = \int_a^b \theta_1(s, t)\, \{F_2(s, x_2(s)) - F_2(s, y_2(s))\}\, ds.$$

Since $x_2(t) \le y_2(t)$, assumption (iv) gives

$$F_2(s, x_2(s)) - F_2(s, y_2(s)) \ge 0.$$

Thus,

$$A(x_1, x_2, x_3, x_4)(t) - A(x_1, y_2, x_3, x_4)(t) \ge 0 \;\Rightarrow\; A(x_1, y_2, x_3, x_4)(t) \le A(x_1, x_2, x_3, x_4)(t).$$

Similarly, one proves the property for the third and fourth components, i.e.,

$$x_3(t) \le y_3(t) \;\Rightarrow\; A(x_1, x_2, x_3, x_4)(t) \le A(x_1, x_2, y_3, x_4)(t)$$

and

$$x_4(t) \le y_4(t) \;\Rightarrow\; A(x_1, x_2, x_3, y_4)(t) \le A(x_1, x_2, x_3, x_4)(t).$$

Let us now estimate

$$d\big(A(x_1, x_2, x_3, x_4), A(y_1, y_2, y_3, y_4)\big) \quad \text{for } x_1 \le y_1,\ x_2 \ge y_2,\ x_3 \le y_3,\ x_4 \ge y_4.$$

Since A has the mixed monotone property, we get

$$d\big(A(x_1, x_2, x_3, x_4), A(y_1, y_2, y_3, y_4)\big) = \max_{t \in [a, b]} |A(x_1, x_2, x_3, x_4)(t) - A(y_1, y_2, y_3, y_4)(t)| = \max_{t \in [a, b]} \big(A(y_1, y_2, y_3, y_4)(t) - A(x_1, x_2, x_3, x_4)(t)\big).$$

Now, for t ∈ [a, b] and by equation (11),

$$d\big(A(y_1, y_2, y_3, y_4), A(x_1, x_2, x_3, x_4)\big) = \max_{t \in [a, b]} \int_a^b \theta_1(s, t)\, \{(F_1(s, y_1(s)) - F_1(s, x_1(s))) + (F_2(s, y_2(s)) - F_2(s, x_2(s))) + (F_3(s, y_3(s)) - F_3(s, x_3(s))) + (F_4(s, y_4(s)) - F_4(s, x_4(s)))\}\, ds.$$
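Using condition (iv),

$$d\big(A(y_1, y_2, y_3, y_4), A(x_1, x_2, x_3, x_4)\big) \le \max_{t \in [a, b]} \int_a^b \theta_1(s, t) \left\{ \lambda\, \xi\!\left(\frac{y_1 - x_1}{4}\right) + \eta\, \xi\!\left(\frac{x_2 - y_2}{4}\right) + \delta\, \xi\!\left(\frac{y_3 - x_3}{4}\right) + \varepsilon\, \xi\!\left(\frac{x_4 - y_4}{4}\right) \right\} ds.$$

Since β = sup(λ, η, δ, ε),

$$\begin{aligned}
d\big(A(y_1, y_2, y_3, y_4), A(x_1, x_2, x_3, x_4)\big)
&\le \max_{t \in [a, b]} \int_a^b \theta_1(s, t) \left\{ \beta\, \xi\!\left(\frac{y_1 - x_1}{4}\right) + \beta\, \xi\!\left(\frac{x_2 - y_2}{4}\right) + \beta\, \xi\!\left(\frac{y_3 - x_3}{4}\right) + \beta\, \xi\!\left(\frac{x_4 - y_4}{4}\right) \right\} ds \\
&\le \beta \max_{t \in [a, b]} \int_a^b \theta_1(s, t) \left\{ \xi\!\left(\frac{y_1 - x_1}{4}\right) + \xi\!\left(\frac{x_2 - y_2}{4}\right) + \xi\!\left(\frac{y_3 - x_3}{4}\right) + \xi\!\left(\frac{x_4 - y_4}{4}\right) \right\} ds.
\end{aligned}$$

Since ξ is non-decreasing,

$$\xi\!\left(\frac{y_1 - x_1}{4}\right) \le \xi\!\left(\frac{y_1 - x_1}{4} + \frac{x_2 - y_2}{4} + \frac{y_3 - x_3}{4} + \frac{x_4 - y_4}{4}\right),$$

and the same bound holds for the other three terms. Hence,

$$\begin{aligned}
d\big(A(y_1, y_2, y_3, y_4), A(x_1, x_2, x_3, x_4)\big)
&\le 4 \beta \max_{t \in [a, b]} \int_a^b \theta_1(s, t)\, \xi\!\left(\frac{y_1 - x_1}{4} + \frac{x_2 - y_2}{4} + \frac{y_3 - x_3}{4} + \frac{x_4 - y_4}{4}\right) ds \\
&\le 4 \beta \max_{t \in [a, b]} \int_a^b \theta_1(s, t)\, \xi\!\left(\frac{d(y_1, x_1)}{4} + \frac{d(x_2, y_2)}{4} + \frac{d(y_3, x_3)}{4} + \frac{d(x_4, y_4)}{4}\right) ds.
\end{aligned}$$

Therefore,

$$\gamma\, d\big(A(y_1, y_2, y_3, y_4), A(x_1, x_2, x_3, x_4)\big) \le 4 \gamma \beta \max_{t \in [a, b]} \int_a^b \theta_1(s, t)\, \xi\!\left(\frac{d(y_1, x_1) + d(x_2, y_2) + d(y_3, x_3) + d(x_4, y_4)}{4}\right) ds,$$

where

$$\alpha\big((y_1, y_2, y_3, y_4), (x_1, x_2, x_3, x_4)\big) = \gamma \ge 1.$$

By condition (v),

$$\gamma\, d\big(A(y_1, y_2, y_3, y_4), A(x_1, x_2, x_3, x_4)\big) \le \xi\!\left(\frac{d(y_1, x_1) + d(x_2, y_2) + d(y_3, x_3) + d(x_4, y_4)}{4}\right) \quad (12)$$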

Thus, the contraction condition of the theorem is satisfied. Now let (x, y, z, w) be
the functions given by condition (vi). Then

$$x \le A(x, y, z, w), \quad y \ge A(y, z, w, x), \quad z \le A(z, w, x, y), \quad w \ge A(w, x, y, z).$$

So, all the conditions of Theorem 3.1 are satisfied.

Applying Theorem 3.1, we get a point

$$(\bar{x}, \bar{y}, \bar{z}, \bar{w}) \in C([a, b], R) \times C([a, b], R) \times C([a, b], R) \times C([a, b], R)$$

such that

$$\bar{x} = A(\bar{x}, \bar{y}, \bar{z}, \bar{w}), \quad \bar{y} = A(\bar{y}, \bar{z}, \bar{w}, \bar{x}), \quad \bar{z} = A(\bar{z}, \bar{w}, \bar{x}, \bar{y}), \quad \bar{w} = A(\bar{w}, \bar{x}, \bar{y}, \bar{z}).$$
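
An equation of the form (10) can be explored numerically by successive approximation, that is, by repeatedly applying the operator (11) with all four arguments equal. The following Python sketch is only an illustration under stated assumptions: the interval [0, 1], the kernel, the functions `F1` to `F4`, the forcing term `h`, and the trapezoidal discretization are all hypothetical choices, picked so that the discretized iteration is a contraction in the maximum norm. It does not verify hypotheses (iv)–(vi) of the paper.

```python
import math


def solve_by_successive_approximation(a=0.0, b=1.0, n=101, tol=1e-10, max_iter=500):
    """Iterate u <- A(u, u, u, u) on a uniform grid with trapezoidal quadrature."""
    step = (b - a) / (n - 1)
    grid = [a + i * step for i in range(n)]
    weights = [step] * n
    weights[0] = weights[-1] = step / 2.0  # trapezoidal end-point weights

    # Hypothetical data (assumptions, not taken from the paper).
    def theta1(s, t):
        return 1.0 + s * t

    def h(t):
        return t

    def F1(s, u):
        return 0.1 * u                      # non-decreasing in u

    def F2(s, u):
        return -0.1 * math.atan(u)          # non-increasing in u

    def F3(s, u):
        return 0.1 * math.tanh(u)           # non-decreasing in u

    def F4(s, u):
        return -0.1 * u / (1.0 + abs(u))    # non-increasing in u

    def apply_operator(u):
        g = [F1(s, v) + F2(s, v) + F3(s, v) + F4(s, v) for s, v in zip(grid, u)]
        return [
            sum(wj * theta1(sj, t) * gj for wj, sj, gj in zip(weights, grid, g)) + h(t)
            for t in grid
        ]

    u = [0.0] * n
    for iteration in range(1, max_iter + 1):
        new_u = apply_operator(u)
        change = max(abs(p - q) for p, q in zip(new_u, u))
        u = new_u
        if change < tol:
            return grid, u, iteration, apply_operator
    return grid, u, max_iter, apply_operator


grid, u, iterations, operator = solve_by_successive_approximation()
residual = max(abs(p - q) for p, q in zip(u, operator(u)))
print(iterations, residual)
```

The residual printed at the end measures how closely the computed grid function satisfies the discretized equation; it says nothing about the discretization error itself.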

Acknowledgements The authors are thankful to the affiliated college authorities for financial
support given by them.

References

1. Arvanitakis, A.D.: A proof of the generalized Banach contraction conjecture. Proc. Am. Math.
Soc. 131(12), 3647–3656 (2003)
2. Choudhury, B.S., Das, K.P.: A new contraction principle in Menger spaces. Acta Math. Sin.
24(8), 1379–1386 (2008)
3. Boyd, D.W., Wong, J.S.W.: On nonlinear contractions. Proc. Am. Math. Soc. 20, 458–464 (1969)
4. Aydi, H., Vetro, C., Sintunavarat, W., Kumam, P.: Coincidence and fixed points for contractions
and cyclical contractions in partial metric spaces. Fixed Point Theory Appl. 2012, 124 (2012)
5. Lakshmikantham, V., Ciric, Lj.B.: Coupled fixed point theorems for nonlinear contractions in
partially ordered metric spaces. Nonlinear Anal. 70, 4341–4349 (2009)
6. Sintunavarat, W., Kumam, P.: Weak condition for generalized multi-valued (f, α, β)-weak
contraction mappings. Appl. Math. Lett. 24, 460–465 (2011)
7. Sintunavarat, W., Kumam, P.: Common fixed point theorem for cyclic generalized multi-valued
contraction mappings. Appl. Math. Lett. 25(11), 1849–1855 (2012)
8. Sintunavarat, W., Cho, Y.J., Kumam, P.: Common fixed point theorems for c-distance in ordered
cone metric spaces. Comput. Math. Appl. 62, 1969–1978 (2011)
9. Ran, A.C.M., Reurings, M.C.B.: A fixed point theorem in partially ordered sets and some
applications to matrix equations. Proc. Am. Math. Soc. 132, 1435–1443 (2004)
10. Bhaskar, T.G., Lakshmikantham, V.: Fixed point theory in partially ordered metric spaces and
applications. Nonlinear Anal. 65, 1379–1393 (2006)
11. Nieto, J.J., Rodriguez-Lopez, R.: Contractive mapping theorems in partially ordered sets and
applications to ordinary differential equations. Order 22, 223–239 (2005)
12. Nieto, J.J., Lopez, R.R.: Existence and uniqueness of fixed point in partially ordered sets and
applications to ordinary differential equations. Acta Math. Sin. Engl. Ser. 23(12), 2205–2212
(2007)
13. Agarwal, R.P., El-Gebeily, M.A., O'Regan, D.: Generalized contractions in partially ordered
metric spaces. Appl. Anal. 87, 18 (2008)
14. Choudhury, B.S., Metiya, N., Kundu, A.: Coupled coincidence point theorems in ordered metric
spaces. Ann. Univ. Ferrara 57, 1–16 (2011)
15. Karapınar, E.: Couple fixed point on cone metric spaces. Gazi Univ. J. Sci. 24, 51–58 (2011)
16. Karapınar, E.: Coupled fixed point theorems for nonlinear contractions in cone metric spaces.
Comput. Math. Appl. 59, 3656–3668 (2010)
17. Aydi, H.: Some coupled fixed point results on partial metric spaces. Int. J. Math. Math. Sci.
2011, Article ID 647091, 11 pages (2011)
18. Abbas, M., Khan, M.A., Radenovic, S.: Common coupled fixed point theorem in cone metric
space for w-compatible mappings. Appl. Math. Comput. 217, 195–202 (2010)
19. Luong, N.V., Thuan, N.X.: Coupled fixed points in partially ordered metric spaces and
application. Nonlinear Anal. 74, 983–992 (2011)
20. Berinde, V., Borcut, M.: Tripled fixed point theorems for contractive type mappings in partially
ordered metric spaces. Nonlinear Anal. 74, 4889–4897 (2011)
21. Samet, B., Vetro, C.: Coupled fixed point, f-invariant set and fixed point of N-order. Ann. Funct.
Anal. 1(2), 46–56 (2010)
22. Karapınar, E.: Quartet fixed point for nonlinear contraction. http://arxiv.org/abs/1106.5472 (27
Jun 2011)
23. Karapınar, E.: A new quartet fixed point theorem for nonlinear contractions. J. Fixed Point
Theory Appl. 6(2), 119–135 (2011)
24. Karapınar, E.: Quadruple fixed point theorems for weak φ-contractions. ISRN Math. Anal.
2011, Article ID 989423, 15 pages (2011)
25. Karapınar, E., Luong, N.V.: Quadruple fixed point theorems for nonlinear contractions. Comput.
Math. Appl. 64, 1839–1848 (2012)
26. Karapınar, E., Berinde, V.: Quadruple fixed point theorems for nonlinear contractions in partially
ordered metric spaces. Banach J. Math. Anal. 6(1), 74–89 (2012)
Chapter 15
Enhanced Prediction for Piezophilic
Protein by Incorporating Reduced Set
of Amino Acids Using Fuzzy-Rough
Feature Selection Technique Followed
by SMOTE

Anoop Kumar Tiwari, Shivam Shreevastava, Karthikeyan Subbiah
and Tanmoy Som

Abstract In this paper, the learning performance of different machine learning algorithms is investigated by applying the fuzzy-rough feature selection (FRFS) technique to optimally balanced training and testing sets consisting of piezophilic and nonpiezophilic proteins. Using FRFS followed by the Synthetic Minority Over-sampling Technique (SMOTE) at optimal balancing ratios, we obtain the best results with the random forest algorithm: sensitivity of 79.60%, specificity of 74.50%, average accuracy of 77.10%, AUC of 0.841, and MCC of 0.542. The input features are ranked according to their ability to differentiate piezophilic from nonpiezophilic proteins by using the fuzzy-rough attribute evaluator. The results show that the performance of classification algorithms can be improved by using reduced, optimally balanced training and testing sets. These are obtained by selecting the relevant and non-redundant features from the training sets with the FRFS approach and then suitably modifying the class distribution.

Keywords Feature selection · Imbalanced dataset · SMOTE · Fuzzy-rough set · Random forest · SVM

A. K. Tiwari · K. Subbiah
Department of Computer Science, Institute of Science (BHU), Varanasi, India
S. Shreevastava (B) · T. Som
Department of Mathematical Sciences, Indian Institute of Technology (BHU), Varanasi, India
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2018

1 Introduction

Machine learning techniques are effectively implemented to solve a diversity of problems in pattern recognition, data mining, and bioinformatics [1, 28, 34]. Due to the advancement of high-throughput assay systems in modern laboratories, large volume biological datasets are created every day. Data size is enlarging not only in
the form of data instances (tuples) but also in the dimensionality of data attributes (features). This may reduce the average accuracy and efficiency of most machine learning algorithms [5], especially when redundant or irrelevant features are present. Many researchers have outlined this issue in several bioinformatics problems [25]. High-dimensional bioinformatics datasets include proteomics, genomics, clinical trial data, etc. Feature selection (FS) [14, 18] techniques focus on selecting a subset of the original features that performs best for a predetermined goal, often the maximum accuracy on test data. FS removes irrelevant and redundant features and retains the subset of original features that best differentiates among classes. FS approaches are extensively explored because selected features are easier to interpret than extracted features. FS is required in numerous applications, such as object recognition, document classification, computer vision, and disease diagnosis.

Class imbalance is another key issue, which directly affects machine learning algorithms in many prediction problems on bioinformatics datasets. The class imbalance problem [9, 22] is almost ubiquitous in data mining, machine learning, and pattern recognition tasks [3] and has been widely discussed in the literature. Many researchers have found that imbalanced data usually leads to performance loss [35, 36], and that treatments such as cost-sensitive learning, sampling, and ensemble learning can enhance prediction performance [15, 21, 23, 31–33]. A large difference between the numbers of instances (tuples) in the positive and negative classes causes the imbalanced data problem, which biases the classifier.

In this paper, we present a model to improve the prediction of piezophilic and nonpiezophilic groups in a protein dataset by selecting relevant and non-redundant features from optimally balanced training sets using the fuzzy-rough feature selection (FRFS) [10–13] technique. The same features are then selected from the testing sets, which are subsequently optimally balanced using SMOTE [4, 19]. The experiments show that our model performs better than the results reported by Nath et al. [24]. Moreover, we give a schematic representation of the proposed methodology, a ranking of the input features using the fuzzy-rough attribute evaluator technique, and ROC curves [17] for four classifiers on different groups of the testing set.

2 Materials and Methods

2.1 Dataset

We have taken the dataset of Nath et al. [24] to conduct our experiments. It was created as a two-dimensional habitat space-based dataset, organized on the basis of pressure (nonpiezophiles and piezophiles) and temperature (psychrophilic, mesophilic, and thermophilic). It consists of 2464 psychrophilic–piezophilic (PP)/2684 psychrophilic–nonpiezophilic (PNP), 2125 mesophilic–piezophilic (MP)/2566 mesophilic–nonpiezophilic (MNP), 1058 thermophilic–piezophilic (TP-I)/1025 thermophilic–nonpiezophilic (TNP-I), and 1099 thermophilic–piezophilic (TP-II)/1249 thermophilic–nonpiezophilic (TNP-II). These datasets are imbalanced because the ratio of the positive (piezophilic) to the negative (nonpiezophilic) class differs from the ideal ratio (1:1): the positive class is the minority class in PP/PNP, MP/MNP, and TP-II/TNP-II and the majority class in TP-I/TNP-I.
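The class ratios implied by these counts can be checked with a few lines of Python. This is only an illustrative sketch; the dictionary and function names are hypothetical and not part of the original study.

```python
# Class counts (piezophilic, nonpiezophilic) as listed in the text above.
GROUPS = {
    "PP/PNP": (2464, 2684),
    "MP/MNP": (2125, 2566),
    "TP-I/TNP-I": (1058, 1025),
    "TP-II/TNP-II": (1099, 1249),
}


def imbalance_summary(positive, negative):
    """Return (positive:negative ratio, name of the minority class)."""
    ratio = positive / negative
    minority = "piezophilic" if positive < negative else "nonpiezophilic"
    return ratio, minority


for name, (pos, neg) in GROUPS.items():
    ratio, minority = imbalance_summary(pos, neg)
    print(f"{name}: ratio = {ratio:.3f}:1, minority = {minority}")
```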

2.2 Input Features

Nath et al. [24] created separate training and testing datasets from the original dataset. The training sets are optimally balanced, and the testing sets are imbalanced. The input feature vector consists of the amino acid composition, which is the basic feature of any protein sequence and has adequate discriminating capability for the classification of proteins. It is calculated as

$$P_{aa,k} = \frac{Z_{aa,k}}{Z_{res,k}} \times 100 \tag{1}$$

where
aa denotes a specific one of the twenty different amino acid residues,
P_{aa,k} denotes the percentage frequency of the specific amino acid 'aa' in the kth sequence,
Z_{aa,k} denotes the total count of the specific amino acid 'aa' in the kth sequence,
Z_{res,k} denotes the total number of amino acid residues in the kth sequence.
Each sequence is thus described by 20 percentage frequencies, one for each amino acid type.
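As a minimal sketch of Eq. (1), the composition of a single sequence could be computed as follows. The function name, the one-letter alphabet string, and the toy sequence are assumptions made for illustration; the original dataset preparation may differ (for example in how non-standard residues are handled).

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Percentage frequency of each amino acid in one sequence, as in Eq. (1)."""
    sequence = sequence.upper()
    counts = Counter(sequence)
    total = len(sequence)  # Z_res,k; assumes the sequence holds only standard residues
    if total == 0:
        raise ValueError("empty sequence")
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


composition = amino_acid_composition("MKVLAAGIVK")  # made-up toy sequence
print(composition["A"], composition["K"])
```

The resulting dictionary has 20 entries whose values sum to 100 for a sequence made only of standard residues.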

2.3 Classification Protocol

Our experiments are performed independently with four different machine learning algorithms that are widely used on biological datasets for classification and prediction tasks. From our experiments, it can be observed that random forest (RF) [2] and support vector machines trained with sequential minimal optimization (SMO) [27] are the better performing algorithms. Brief descriptions of RF and SMO are given below.
RF: Random forest (proposed by Breiman [2]) is an ensemble learning approach comprising many individual decision trees. Two factors determine the accuracy of a random forest: the correlation between the individual tree classifiers and the strength of each individual tree. Feature randomization is an integral part of random forests. For each individual tree, about 2/3 of the training samples are used for tree construction, and the remaining 1/3 of the samples, called the out-of-bag data, are used for testing it, which gives an internal estimate of the tree's performance [20].
SMO: Support vector machines (SVMs) work on the principle of structural risk minimization from statistical learning theory and are used for supervised learning tasks. An SVM classifies input instances by mapping the Euclidean input instance (tuple) space into a higher-dimensional kernel feature space and building a hyperplane there that separates the two classes. We conducted our experiments using the SMO algorithm [27], which trains an SVM classifier with faster optimization. SVMs have been shown to be robust to noise and can cope with large feature spaces. They have been successfully applied in many biological domains with promising results.
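The 2/3 versus 1/3 split mentioned for random forests follows from bootstrap sampling: drawing n samples with replacement leaves roughly a fraction 1/e (about 37%) of the samples unused. The sketch below illustrates only this sampling step, under the assumption of standard bootstrap sampling; the training-set size and seed are arbitrary, and no tree is actually built.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw a bootstrap sample; return (in-bag indices, out-of-bag indices)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    chosen = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in chosen]
    return in_bag, out_of_bag


rng = random.Random(0)  # fixed seed, chosen arbitrarily
n = 5000                # hypothetical training-set size
in_bag, oob = bootstrap_split(n, rng)
oob_fraction = len(oob) / n
print(f"out-of-bag fraction: {oob_fraction:.3f}")
```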

2.4 Optimal Balancing Protocol

When a real-world dataset is imbalanced in the number of negative and positive class instances, evaluation parameters such as the overall accuracy, for which most machine learning algorithms are optimized, tend to be biased in favour of the majority class [8]. This is not acceptable, as it results in higher specificity and lower sensitivity when predicting the minority class tuples (instances) [16]. To deal with this problem, we have balanced the reduced testing set in terms of an ideal balancing ratio of 1:1 by using the Synthetic Minority Over-sampling Technique (SMOTE) [21, 31, 32]. A brief description of SMOTE is given below.
SMOTE: It is an over-sampling method that produces synthetic samples from the minority class. It is a nearest-neighbor-based approach that proceeds by randomly picking a minority sample and finding its nearest minority class neighbors. It then uses one of these nearest neighbors to interpolate an artificial minority class instance. A SMOTE sample is a linear combination of two similar minority class samples (p and p_k) and is defined by

$$s = p + i \, (p_k - p) \tag{2}$$

where i is a random number between 0 and 1 and p_k is randomly selected among the five minority class nearest neighbors of p. In recent years, SMOTE has been successfully applied to class imbalance problems. In WEKA [7], the default number of nearest neighbors for SMOTE is 5.
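The interpolation in Eq. (2) can be sketched in plain Python as shown below. This is not the WEKA implementation used in the study; the function name, the brute-force Euclidean neighbour search, and the toy minority samples are assumptions made for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Generate synthetic minority samples following Eq. (2).

    minority: list of equal-length numeric feature vectors.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        idx = rng.randrange(len(minority))
        p = minority[idx]
        # k nearest minority neighbours of p by Euclidean distance (brute force)
        others = [q for j, q in enumerate(minority) if j != idx]
        neighbours = sorted(others, key=lambda q: math.dist(p, q))[:k]
        p_k = rng.choice(neighbours)
        i = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + i * (b - a) for a, b in zip(p, p_k)])
    return synthetic


# Made-up two-dimensional minority samples.
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5], [1.8, 2.4], [1.6, 2.1]]
new_points = smote(minority, n_synthetic=4)
```

Because each synthetic point lies on the segment between two existing minority samples, it always stays inside the bounding box of the minority class.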

2.5 Feature Selection Protocol

The existence of identical and overlapping features in bioinformatics datasets makes the classification task difficult. Interclass feature overlap and the existence of similarities lead to vagueness and/or indiscernibility. The rough set concept [26] is applicable to decision making when indiscernibility is present, and vague decisions can be handled by fuzzy set theory [37]. These two theories can be combined to form fuzzy-rough set theory [6], which copes with both kinds of uncertainty, the vagueness handled by fuzzy sets and the indiscernibility handled by rough sets, and is therefore useful for addressing classification problems. In our proposed model, we have applied the FRFS approach [10, 11] to select relevant and non-redundant features in order to enhance the prediction of piezophilic proteins. The FRFS algorithm is given as follows [12, 13]:

Fuzzy-Rough QuickReduct Algorithm (C, D)

C, the set of all conditional attributes;
D, the set of decision attributes.

R ← {}; γ_best = 0; γ_prev = 0
do
    T ← R
    γ_prev = γ_best
    for each x ∈ (C − R)
        if γ_{R∪{x}}(D) > γ_T(D)
            T ← R ∪ {x}
    γ_best = γ_T(D)
    R ← T
until γ_best == γ_prev
return R
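The greedy loop above can be sketched in Python as follows. The pseudocode does not specify how the dependency degree γ is computed, so this sketch makes several assumptions that are not taken from the chapter: a range-normalized similarity per attribute, the minimum as t-norm, crisp decision classes, and a positive-region membership equal to one minus the largest similarity to any object of a different class. The implementation in WEKA used by the authors may differ.

```python
def similarity(a, b, value_range):
    """Fuzzy similarity of two attribute values (assumed measure)."""
    if value_range == 0:
        return 1.0
    return max(0.0, 1.0 - abs(a - b) / value_range)


def dependency(X, y, subset, ranges):
    """Fuzzy-rough dependency degree of the decision on `subset` (crisp labels y)."""
    if not subset:
        return 0.0  # with no attributes every pair of objects is indiscernible
    total = 0.0
    for i, xi in enumerate(X):
        membership = 1.0  # membership of object i in the fuzzy positive region
        for j, xj in enumerate(X):
            if y[i] != y[j]:
                sim = min(similarity(xi[a], xj[a], ranges[a]) for a in subset)
                membership = min(membership, 1.0 - sim)
        total += membership
    return total / len(X)


def quick_reduct(X, y):
    """Greedy forward selection mirroring the pseudocode above."""
    n_features = len(X[0])
    ranges = [max(r[a] for r in X) - min(r[a] for r in X) for a in range(n_features)]
    reduct, gamma_best = set(), 0.0
    while True:
        candidate, gamma_prev = set(reduct), gamma_best
        gamma_candidate = gamma_best
        for a in sorted(set(range(n_features)) - reduct):
            gamma = dependency(X, y, reduct | {a}, ranges)
            if gamma > gamma_candidate:
                candidate, gamma_candidate = reduct | {a}, gamma
        reduct, gamma_best = candidate, gamma_candidate
        if gamma_best == gamma_prev:
            return sorted(reduct), gamma_best


# Toy data: feature 0 separates the classes, feature 1 does not.
X = [[0.0, 0.3], [0.0, 0.7], [1.0, 0.3], [1.0, 0.7]]
y = [0, 0, 1, 1]
selected, gamma = quick_reduct(X, y)
```

On this toy data the loop keeps only feature 0, since adding feature 1 does not increase the dependency degree.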

2.6 Performance Evaluation Metrics

The relative prediction performance of the four machine learning algorithms is assessed with threshold-dependent and threshold-independent parameters. These parameters are determined from the values of the confusion matrix: true positives (TP), the number of correctly predicted piezophilic proteins; false negatives (FN), the number of piezophilic proteins incorrectly predicted as nonpiezophilic; true negatives (TN), the number of correctly predicted nonpiezophilic proteins; and false positives (FP), the number of nonpiezophilic proteins incorrectly predicted as piezophilic.
Sensitivity: This parameter gives the percentage of correctly predicted piezophilic proteins and is given as follows:

$$\text{Sensitivity} = \frac{TP}{TP + FN} \times 100 \tag{3}$$

Specificity: This parameter gives the percentage of correctly predicted nonpiezophilic proteins and is calculated by:

$$\text{Specificity} = \frac{TN}{TN + FP} \times 100 \tag{4}$$

Accuracy: This parameter gives the percentage of correctly predicted piezophilic and nonpiezophilic proteins and is calculated as follows:

$$\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \times 100 \tag{5}$$

AUC: It represents the area under the receiver operating characteristic (ROC) curve [17]. The closer its value is to 1, the better the piezophilic protein predictor; in the worst case its value is 0, and for random ranking its value is 0.5. It is one of the evaluation metrics that are robust to the imbalanced nature of proteomics datasets.
Matthews correlation coefficient (MCC): It is calculated by using the following equation:

$$\text{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{6}$$

It is extensively applied as a performance parameter for binary classification. An MCC value of 1 is considered the best for a piezophilic protein predictor. In this study, the open source Java-based machine learning platform WEKA [7] was used to conduct all the experiments.

3 Result and Discussion

In the current study, we experimented with four different machine learning algorithms, namely support vector machines with sequential minimal optimization (SMO) [27], multilayer perceptron (MLP) [30], rotation forest (ROF) [29], and random forest (RF) [2], on the reduced optimally balanced training and testing sets. We applied FRFS with rank search on the training sets to select suitable features (as recorded in Table 1) and selected the same features from the corresponding testing sets. The reduced testing sets were balanced by using varying degrees of SMOTE. The values of the different performance evaluation metrics for the four classifiers using tenfold cross validation are recorded in Table 2.
From the experiments, it can be observed that, based on the values of the different evaluation parameters, SMO performs better than the other classifiers on the training set and RF is the best performer on the testing set for the differentiation of the psychrophilic–piezophilic and psychrophilic–nonpiezophilic groups. For the discrimination of the mesophilic–piezophilic and mesophilic–nonpiezophilic groups, the values of the evaluation metrics indicate that SMO performs better than the other machine learning algorithms on the training set, while RF is the best performer on the testing set. SMO is the best performing classifier on both training and testing sets for discriminating the thermophilic–piezophilic-I and thermophilic–nonpiezophilic-I groups, while for the thermophilic–piezophilic-II and thermophilic–nonpiezophilic-II groups, SMO is the best predictor on the training set and RF gives the best performance on the testing set.

Table 1 Different training sets dimensions and their reduct sizes based on FRFS

Table 2 Evaluation metrics of different machine learning algorithms

Table 3 Attribute ranking by fuzzy-rough feature selection algorithm
From the entire experiment, we can observe that RF performs best, closely followed by SMO, on the basis of the values of the different evaluation metrics. The flow diagram of the proposed methodology is depicted in Fig. 1.
The existence of redundant features in a dataset affects the generalization ability of the model as well as the training time. We have used the fuzzy-rough attribute evaluator technique to rank the 20 different amino acids based on their discerning ability, and the results are recorded in Table 3.
A suitable way to observe the overall performance of individual machine learning algorithms at different decision thresholds is the well-known receiver operating characteristic (ROC) curve, which gives a visual representation of the performance of different classifiers. The ROC curves for the different machine learning algorithms on the reduced testing sets are given in Figs. 2, 3, 4, and 5. It can be observed that RF and SMO perform better than the other classifiers.
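The area under an ROC curve can be computed without plotting, because it equals the probability that a randomly chosen positive instance receives a higher score than a randomly chosen negative one. The sketch below uses this pairwise formulation; the function name and the scores are hypothetical and are not outputs of the classifiers studied here.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the positive class, 0 for the negative class.
    """
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.2]  # hypothetical classifier scores
labels = [1, 1, 1, 0, 0, 0]
print(auc(scores, labels))
```

The double loop is quadratic in the number of instances; a rank-based version would be preferable for large testing sets.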

4 Conclusion

Many aspects directly influence the performance that classifiers can actually attain. Three key issues among these are the selection of a suitable input feature set, class imbalance, and the selection of an appropriate learning algorithm. Redundant and irrelevant features in biological datasets lead to accuracy loss, and class imbalance, which is commonly observed in biological datasets, causes the classifier to be biased toward majority class tuples (instances). Our experimental results validate that selecting relevant and non-redundant features using the FRFS technique, followed by optimally balancing the ratios in both training and testing datasets, results in higher sensitivity and higher accuracy across various machine learning algorithms. In our experiments, we observed that RF and SMO discriminate piezophilic from nonpiezophilic proteins better as we move up the temperature range from PP to TP, i.e., PP < MP < TP, which is clearly visible in the ROC curves of the testing sets. Finally, the fuzzy-rough attribute evaluator ranking method is applied to rank all the input features according to their contribution toward the discrimination of piezophilic and nonpiezophilic proteins.
In the future, we intend to apply our proposed model to other bioinformatics datasets to enhance the prediction of positive and negative classes. Furthermore, we will apply our proposed model with various search techniques for FRFS. Moreover, more accurate feature selection techniques based on intuitionistic fuzzy-rough set models can be applied.

Fig. 1 Schematic representation of current study

Fig. 2 AUC for four machine learning algorithms on reduced PP/PNP testing set

Fig. 3 AUC for four machine learning algorithms on reduced MP/MNP testing set

Fig. 4 AUC for four machine learning algorithms on reduced TP-I/TNP-I testing set

Fig. 5 AUC for four machine learning algorithms on reduced TP-II/TNP-II testing set

References

1. Baldi, P., Brunak, S.: Bioinformatics: The Machine Learning approach. MIT press (2001)
2. Breiman, L.: Random Forests. Mach. Learn. 45(1), 5–32 (2001)
3. Chawla, N.V.: Data Mining for Imbalanced Datasets: An Overview. Data Mining and Knowledge Discovery Handbook, pp. 875–886. Springer (2009)
4. Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: SMOTE: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)
5. Dash, M., Liu, H.: Feature selection for classification. Intell. Data Anal. 1(1–4), 131–156 (1997)
6. Dubois, D., Prade, H.: Putting Rough Sets and Fuzzy Sets Together. Intelligent Decision Support, pp. 203–232. Springer (1992)
7. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The WEKA data
mining software: an update. ACM SIGKDD Explor. Newslett. 11(1), 10–18 (2009)
8. He, H., Garcia, E.A.: Learning from imbalanced data. IEEE Trans. Knowl. Data Eng. 21(9),
1263–1284 (2009)
9. Japkowicz, N., Stephen, S.: The class imbalance problem: a systematic study. Intell. Data Anal.
6(5), 429–449 (2002)
10. Jensen, R., Shen, Q.: Fuzzy rough attribute reduction with application to web categorization.
Fuzzy Sets Syst. 141(3), 469–485 (2004a)
11. Jensen, R., Shen, Q.: Semantics-preserving dimensionality reduction: rough and fuzzy-rough-based approaches. IEEE Trans. Knowl. Data Eng. 16(12), 1457–1471 (2004b)
12. Jensen, R., Shen, Q.: Fuzzy-rough sets assisted attribute selection. IEEE Trans. Fuzzy Syst.
15(1), 73–89 (2007)
13. Jensen, R., Shen, Q.: Computational Intelligence and Feature Selection: Rough and Fuzzy
Approaches, Vol. 8. Wiley (2008)
14. Langley, P.: Selection of relevant features in machine learning. Paper presented at the Proceedings of the AAAI Fall Symposium on Relevance
15. Lee, P.H.: Resampling methods improve the predictive power of modeling in class-imbalanced datasets. Int. J. Environ. Res. Public Health 11(9), 9776–9789
16. Li, H., Pi, D., Wang, C.: The prediction of protein-protein interaction sites based on RBF classifier improved by SMOTE. Math. Probl. Eng. (2014)
17. Ling, C., Huang, J., Zhang, H.: AUC: a better measure than accuracy in comparing learning algorithms. Adv. Artif. Intell. 991–991 (2003)
18. Liu, H., Motoda, H.: Feature Extraction, Construction and Selection: A Data Mining Perspective, vol. 453. Springer Science and Business Media (1998)
19. Lusa, L.: SMOTE for high-dimensional class-imbalanced data. BMC Bioinform. 14(1), 106 (2013)
20. Nath, A., Chaube, R., Karthikeyan, S.: Discrimination of psychrophilic and mesophilic proteins using random forest algorithm. Paper presented at the 2012 International Conference on Biomedical Engineering and Biotechnology (iCBEB) (2012)
21. Nath, A., Karthikeyan, S.: Enhanced prediction and characterization of CDK inhibitors using
optimal class distribution. Interdisc. Sci. Comput. Life Sci. 9(2), 292–303 (2017)
22. Nath, A., Subbiah, K.: Inferring biological basis about psychrophilicity by interpreting the rules
generated from the correctly classified input instances by a classifier. Comput. Biol. Chem. 53,
198–203 (2014)
23. Nath, A., Subbiah, K.: Maximizing lipocalin prediction through balanced and diversified training set and decision fusion. Comput. Biol. Chem. 59, 101–110 (2015)
24. Nath, A., Subbiah, K.: Insights into the molecular basis of piezophilic adaptation: extraction of piezophilic signatures. J. Theoret. Biol. 390, 117–126 (2016)
25. Okun, O.: Feature Selection and Ensemble Methods for Bioinformatics: Algorithmic Classification and Implementations. Information Science Reference-Imprint of IGI Publishing (2011)
26. Pawlak, Z.: Rough sets. Int. J. Parallel. Program. 11(5), 341–356 (1982)
27. Platt, J.: Sequential minimal optimization: a fast algorithm for training support vector machines (1998)
28. Prompramote, S., Chen, Y., Chen, Y.-P.P.: Machine learning in bioinformatics. In: Chen, Y.-P.P. (ed.) Bioinformatics Technologies, pp. 117–153. Springer, Berlin, Heidelberg (2005)
29. Rodriguez, J.J., Kuncheva, L.I., Alonso, C.J.: Rotation forest: a new classifier ensemble method.
IEEE Trans. Pattern Anal. Mach. Intell. 28(10), 1619–1630 (2006)
30. Ruck, D.W., Rogers, S.K., Kabrisky, M., Oxley, M.E., Suter, B.W.: The multilayer perceptron
as an approximation to a bayes optimal discriminant function. IEEE Trans. Neural Netw. 1(4),
296–298 (1990)
31. Tiwari, A.K., Nath, A., Subbiah, K., Shukla, K.K.: Effect of varying degree of resampling
on prediction accuracy for observed peptide count in protein mass spectrometry data. Paper
presented at the 2015 11th International Conference on Natural Computation (ICNC) (2015)
32. Tiwari, A.K., Nath, A., Subbiah, K., Shukla, K.K.: Enhanced prediction for observed peptide
count in protein mass spectrometry data by optimally balancing the training dataset. Int. J.
Pattern Recogn. Artif. Intell. 1750040 (2017)
33. Vani, K.S., Bhavani, S.D.: SMOTE based protein fold prediction classification. In: Advances
in Computing and Information Technology, pp. 541–550. Springer (2013)
34. Wang, L., Fu, X.: Data Mining with Computational Intelligence. Springer Science and Business
Media (2006)
35. Weiss, G.M., Provost, F.: The effect of class distribution on classifier learning: an empirical
study. Rutgers Univ (2001)
36. Weiss, G.M., Provost, F.: Learning when training data are costly: the effect of class distribution
on tree induction. J. Artif. Intell. Res. 19, 315–354 (2003)
37. Zadeh, L.A.: Fuzzy sets. Inf. Control 8(3), 338–353 (1965)
Chapter 16
Effect of Upper and Lower Moving Wall
on Mixed Convection of Cu-Water
Nanofluid in a Square Enclosure
with Non-uniform Heating

S. K. Pal and S. Bhattacharyya

Abstract Mixed convection of Cu-water nanofluid in a square enclosure with upper and lower moving lid has been investigated numerically. Non-uniform heating is imposed on the left wall, and the right wall is cooled at a constant temperature. Upper and lower walls are taken to be adiabatic. Finite volume-based SIMPLE algorithm has been used to solve the nonlinear equations. Results are presented graphically to describe the effect of nanoparticle volume fraction (0.0 ≤ φ ≤ 0.2), Richardson number (0.1 ≤ Ri ≤ 10.0) and the moving walls (upper and lower) on flow field, thermal field and heat transfer rate at a fixed value of Reynolds number (Re = 100). Results show that heat transfer rate increases remarkably with the addition of nanoparticles. Non-uniform temperature distribution on the left wall affects the thermal field.

Keywords Mixed convection · Non-uniform heating · Heat transfer · Square enclosure

S. K. Pal (B) · S. Bhattacharyya
Department of Mathematics, Indian Institute of Technology, Kharagpur 721302, India
e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2018
D. Ghosh et al. (eds.), Mathematics and Computing, Springer Proceedings in Mathematics & Statistics 253, https://doi.org/10.1007/978-981-13-2095-8_16

1 Introduction

Nanofluid is a colloidal mixture of metallic and nonmetallic nano-sized particles with a base fluid. These nano-sized particles change the thermo-physical properties of the base fluid, and the resulting nanofluid exhibits a substantially larger thermal conductivity than conventional base fluids such as oil, water and ethylene glycol. Nanofluids have a wide range of applications in industries where heat transfer is a prime concern. Choi et al. [1] investigated the potential benefits of copper nanometer-sized particles dispersed in ethylene glycol and concluded that significantly higher thermal conductivity can be achieved using the nanoparticles. Xuan and Li [2] experimentally investigated the heat transfer features of Cu-water nanofluid and concluded that suspended nanoparticles enhance the heat transfer process remarkably.
Over the years, the flow and heat transfer of nanofluid inside a closed enclosure has received considerable attention because of its significant range of application in many industries such as cooling of electronic systems, room ventilation, nuclear reactors, gas production and lubrication. Tiwari and Das [3] numerically studied the behaviour of nanofluid inside a two-sided differentially heated lid-driven enclosure and found that nanoparticles increase the heat transfer rate. Mahmoodi [4] numerically investigated the mixed convection of Al2O3-water nanofluid inside a rectangular enclosure and concluded that the average Nusselt number increases with the increase of nanoparticle volume fraction.
Along with nanoparticles, convection in enclosures with various wall temperature conditions has also been studied by many researchers because of its application in thermal engineering. Basak et al. [5] numerically studied the influence of linearly heated side walls on mixed convection in a square enclosure and reported that multiple circulating cells were observed. A numerical investigation was carried out by Ramakrishna et al. [6] inside a square cavity for various thermal boundary conditions on the bottom and side walls. Sivakumar and Sivasankaran [7] numerically investigated mixed convection in an inclined square cavity with non-uniform temperature distribution on both vertical side walls. Sivasankaran et al. [8] studied the effect of the upper moving wall direction on the mixed convection in an inclined square cavity with sinusoidal heating on the left wall. They used air as the working fluid and concluded that the moving wall's direction has a significant impact on the flow and thermal fields in the cavity.
To the best of our knowledge, no study has investigated the effect of upper and lower lid movement and non-uniform wall temperature on the mixed convection of Cu-water nanofluid in a square enclosure. Hence, the present study deals with the effects on the flow and thermal fields of Cu-water nanofluid caused by the upper and lower wall movement and the non-uniform sidewall temperature distribution.

2 Physical Model

A two-dimensional mixed convection flow of Cu-water nanofluid in a square enclosure of height H has been considered (Fig. 1a and b). The Cartesian coordinate system has its origin at the lower left corner of the square enclosure, with the lower wall along the x*-axis and the left vertical wall along the y*-axis. The gravitational acceleration g acts in the negative y* direction. The top and bottom walls of the cavity are kept insulated, and each of them, in turn, is allowed to move with a velocity U0 in the positive x*-direction (lower lid in Fig. 1a, upper lid in Fig. 1b), which induces shear in the cavity. A non-uniform temperature profile is applied on the left wall of the enclosure, and a constant temperature Tc is maintained at the right wall. The form of the non-uniform temperature profile is expressed as

$$T(y^*) = T_c + (T_{ref} - T_c)\left[1.0 + 0.2\sin\left(\frac{10\pi y^*}{H}\right) + 0.2\sin\left(\frac{2\pi y^*}{H}\right)\right] \tag{1}$$
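In nondimensional form, with θ = (T − Tc)/(Tref − Tc) and y = y*/H, the profile in Eq. (1) becomes θ(y) = 1.0 + 0.2 sin(10πy) + 0.2 sin(2πy). A minimal sketch for evaluating it is given below; the function name and the choice of 81 sample points are illustrative assumptions.

```python
import math


def wall_temperature(y):
    """Nondimensional left-wall temperature theta(y) for 0 <= y <= 1."""
    return (1.0
            + 0.2 * math.sin(10.0 * math.pi * y)
            + 0.2 * math.sin(2.0 * math.pi * y))


# Sample the profile at 81 equally spaced points along the wall.
samples = [wall_temperature(j / 80) for j in range(81)]
print(min(samples), max(samples))
```

The two sine terms reinforce each other at y = 0.25 and y = 0.75, so θ stays between 0.6 and 1.4 along the wall.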
Fig. 1 Schematic diagram of the physical system with the boundary conditions for a lower moving lid and b upper moving lid. c Nondimensional temperature distribution on the left wall (θ on the horizontal axis, y on the vertical axis)

3 Mathematical Formulation

A single-phase model has been adopted for the study. The cavity is filled with Cu-water nanofluid, which is assumed to be Newtonian and incompressible, and the flow is assumed to be laminar. Nanoparticles are assumed to be of uniform size and shape and in thermal equilibrium with the base fluid. The thermo-physical properties of the nanofluid are assumed to be constant except for the density, which varies according to the Boussinesq approximation. Under these assumptions, the continuity, momentum and energy equations for the flow inside the cavity can be expressed in nondimensional form as

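$$\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0 \tag{2}$$

$$\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} = -\frac{\partial p}{\partial x} + \frac{1}{Re}\,\frac{\rho_f}{\rho_{nf}}\,\frac{1}{(1-\phi)^{2.5}}\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right) \tag{3}$$

$$\frac{\partial v}{\partial t} + u\frac{\partial v}{\partial x} + v\frac{\partial v}{\partial y} = -\frac{\partial p}{\partial y} + \frac{1}{Re}\,\frac{\rho_f}{\rho_{nf}}\,\frac{1}{(1-\phi)^{2.5}}\left(\frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2}\right) + Ri\,\frac{\rho_f}{\rho_{nf}}\left(1 - \phi + \phi\,\frac{\rho_p \beta_p}{\rho_f \beta_f}\right)\theta \tag{4}$$

$$\frac{\partial \theta}{\partial t} + u\frac{\partial \theta}{\partial x} + v\frac{\partial \theta}{\partial y} = \frac{k_{nf}}{k_f}\,\frac{(\rho C_p)_f}{(\rho C_p)_{nf}}\,\frac{1}{Re\,Pr}\left(\frac{\partial^2 \theta}{\partial x^2} + \frac{\partial^2 \theta}{\partial y^2}\right) \tag{5}$$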

The dimensionless variables are defined by $x = x^*/H$, $y = y^*/H$, $t = t^* U_0/H$, $\theta = (T - T_c)/(T_{ref} - T_c)$, $u = u^*/U_0$, $v = v^*/U_0$, $p = p^*/(\rho_{nf} U_0^2)$. The dimensionless parameters are the Reynolds number $Re = \rho_f U_0 L/\mu_f$, the Prandtl number $Pr = \nu_f/\alpha_f$ and the Richardson number $Ri = Gr/Re^2$.
The effective density of the nanofluid is given by $\rho_{nf} = (1 - \phi)\rho_f + \phi\rho_p$. The effective heat capacitance of the nanofluid is given by $(\rho c_p)_{nf} = (1 - \phi)(\rho c_p)_f + \phi(\rho c_p)_p$, Xuan and Li [2]. The thermal diffusivity of the nanofluid is expressed as $\alpha_{nf} = k_{nf}/(\rho C_p)_{nf}$, Chamkha and Abu-Nada [9]. Several modified models exist for the dynamic viscosity of nanofluids, but the Brinkman model still gives reasonable results. So the Brinkman model [10] has been adopted for the effective viscosity of the nanofluid, $\mu_{nf} = \mu_f/(1 - \phi)^{2.5}$, where φ is the nanoparticle volume fraction. The Maxwell–Garnett model has been considered to determine the effective thermal conductivity of the nanofluid, given by $\frac{k_{nf}}{k_f} = \frac{k_p + 2k_f - 2\phi(k_f - k_p)}{k_p + 2k_f + \phi(k_f - k_p)}$. The thermo-physical properties of water and copper at room temperature used in this study are given in Table 1.
The boundary conditions are as follows:
u = 1 or u = 0, v = 0, ∂θ/∂y = 0 at the top wall
u = 0 or u = 1, v = 0, ∂θ/∂y = 0 at the bottom wall
u = 0, v = 0, θ = 1.0 + 0.2 sin(10πy) + 0.2 sin(2πy) at the left wall
u = 0, v = 0, θ = 0 at the right wall.

Table 1 Thermo-physical properties of water and copper

Parameter        Water           Copper
c_p (J/kg K)     4179            383
ρ (kg/m³)        997.1           8954
k (W/m K)        0.6             400
β (K⁻¹)          2.1 × 10⁻⁴      1.67 × 10⁻⁵
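With the values of Table 1, the effective-property relations of this section can be evaluated for a given volume fraction. The sketch below is illustrative; the dictionary layout and function name are assumptions, while the formulas are the mixture rule, the Brinkman viscosity model and the Maxwell–Garnett conductivity model stated in the text.

```python
# Thermo-physical properties from Table 1 (water = base fluid, copper = particle).
WATER = {"cp": 4179.0, "rho": 997.1, "k": 0.6, "beta": 2.1e-4}
COPPER = {"cp": 383.0, "rho": 8954.0, "k": 400.0, "beta": 1.67e-5}


def nanofluid_properties(phi, fluid=WATER, particle=COPPER):
    """Effective nanofluid properties and ratios for volume fraction phi."""
    rho_nf = (1.0 - phi) * fluid["rho"] + phi * particle["rho"]
    rho_cp_nf = ((1.0 - phi) * fluid["rho"] * fluid["cp"]
                 + phi * particle["rho"] * particle["cp"])
    mu_ratio = 1.0 / (1.0 - phi) ** 2.5  # Brinkman: mu_nf / mu_f
    kf, kp = fluid["k"], particle["k"]
    k_ratio = ((kp + 2.0 * kf - 2.0 * phi * (kf - kp))
               / (kp + 2.0 * kf + phi * (kf - kp)))  # Maxwell-Garnett: k_nf / k_f
    return {"rho_nf": rho_nf, "rho_cp_nf": rho_cp_nf,
            "mu_ratio": mu_ratio, "k_ratio": k_ratio}


props = nanofluid_properties(0.1)
print(props)
```

For φ = 0 the function returns the base-fluid values, which is a quick sanity check on the formulas.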

3.1 Nusselt Number

The heat transfer rate in terms of the local Nusselt number (Nu) along the left nonhomogeneous hot wall is defined as $Nu = -\frac{k_{nf}}{k_f}\frac{\partial \theta}{\partial x}$.

The average Nusselt number at the left hot wall is calculated as $Nu_{av} = \int_0^1 Nu \, dy$.
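A possible way to evaluate these two quantities from a discrete temperature field is sketched below. The chapter does not state how the wall gradient and the integral are approximated, so the one-sided second-order difference, the trapezoidal rule, and the assumption that temperature values are available at the wall and at the first two interior points are illustrative choices, not the authors' scheme.

```python
import math


def local_nusselt(theta_wall, theta_1, theta_2, dx, k_ratio):
    """Nu = -(k_nf/k_f) * dtheta/dx at the wall (one-sided, second order)."""
    dtheta_dx = (-3.0 * theta_wall + 4.0 * theta_1 - theta_2) / (2.0 * dx)
    return -k_ratio * dtheta_dx


def average_nusselt(nu_values, dy):
    """Trapezoidal approximation of the integral of Nu over the wall."""
    interior = sum(nu_values[1:-1])
    return dy * (0.5 * nu_values[0] + interior + 0.5 * nu_values[-1])


# Synthetic check: a local Nusselt number that oscillates around 1.
n = 81
dy = 1.0 / (n - 1)
nu = [1.0 + math.sin(2.0 * math.pi * j * dy) for j in range(n)]
print(average_nusselt(nu, dy))
```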

4 Numerical Methods

The finite volume method is used to solve the nonlinear governing partial differential equations in their nondimensional form on a staggered grid system. In a staggered grid system, the control volumes are different for each computed variable. The velocity components are evaluated at the mid-points of the cell faces to which they are normal, and all the scalar quantities are stored at the centre of the cell. The equations are integrated over each control volume. The QUICK scheme is used to discretize the convective terms in the momentum and energy equations, and a second-order central difference scheme is used to discretize the diffusive terms. Velocity-pressure coupling is done by the SIMPLE algorithm. A uniform grid distribution is considered along both axes, and the resulting set of discretized equations is solved using a block elimination method. The time step is chosen to be $10^{-4}$, and at each iteration level the pressure field is computed and updated using the SIMPLE algorithm. For any set of input parameters, the iteration process is repeated until the convergence criterion $\max_{i,j}\,|\Phi_{ij}^{k+1} - \Phi_{ij}^{k}| \le 10^{-6}$ is satisfied, where the subscripts $i, j$ denote the cell indices, the superscript $k$ denotes the iteration index and $\Phi$ is the variable being computed.
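The stopping rule can be illustrated in isolation. The sketch below is not the SIMPLE/QUICK solver described above; as a stand-in it applies Jacobi sweeps to the steady conduction problem on a unit square (hot left wall, cold right wall, insulated top and bottom) and stops when the maximum change between successive iterates falls below $10^{-6}$. The function names and the 11 × 11 grid are assumptions made for the example.

```python
import numpy as np


def iterate_until_converged(phi, update, tol=1e-6, max_iter=100_000):
    """Repeat phi <- update(phi) until max |phi_new - phi| <= tol."""
    for k in range(1, max_iter + 1):
        phi_new = update(phi)
        change = np.max(np.abs(phi_new - phi))
        phi = phi_new
        if change <= tol:
            return phi, k
    raise RuntimeError("convergence criterion not met")


def jacobi_sweep(theta):
    """One Jacobi sweep for Laplace's equation; theta[j, i] with j along y."""
    new = theta.copy()
    # Interior cells: average of the four neighbours.
    new[1:-1, 1:-1] = 0.25 * (theta[2:, 1:-1] + theta[:-2, 1:-1]
                              + theta[1:-1, 2:] + theta[1:-1, :-2])
    # Insulated bottom and top walls: zero normal gradient.
    new[0, :] = new[1, :]
    new[-1, :] = new[-2, :]
    # Fixed temperatures on the left (hot) and right (cold) walls.
    new[:, 0] = 1.0
    new[:, -1] = 0.0
    return new


if __name__ == "__main__":
    theta0 = np.zeros((11, 11))
    theta0[:, 0] = 1.0
    theta, iterations = iterate_until_converged(theta0, jacobi_sweep)
    print(iterations, theta[5, :])
```

For this stand-in problem the converged field is close to the linear profile $\theta = 1 - x$. The residual error is larger than the tolerance itself, since the criterion bounds the change per iteration rather than the distance from the exact solution.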
Figure 2a presents the grid independence test for the local Nusselt number on the left hot wall for Richardson number Ri = 1.0, Re = 100 and nanoparticle volume fraction φ = 0.1. Three grids have been considered for the test, and it can be seen that 81 × 81 is optimal for the present study. A finer grid can give more accurate results, but it has been observed that the change in accuracy is less than 1%. Figure 2b shows the validation of the present code. In Fig. 2b, the average Nusselt number along the hot wall has been validated against the calculation of Abu-Nada and Chamkha [11] for inclination angle 0°, Ri = 1.0, Re = 10 and φ = 0.1. It can be seen that the present result shows very good agreement with the result given by Abu-Nada and Chamkha [11].

5 Results and Discussion

Fig. 2 a Grid-independence test (71 × 71, 81 × 81 and 91 × 91 grids) for the local Nusselt number Nu along the left hot wall, plotted against y, for Ri = 1.0, Re = 100 and φ = 0.1; b comparison of the present result for the local Nusselt number Nu along the hot wall, plotted against x, with the calculation of Abu-Nada and Chamkha [11] for mixed convection of nanofluid in a square enclosure of inclination angle 0°, Ri = 1.0, Re = 10 and φ = 0.1

Mixed convection flow of Cu-water nanofluid in a square enclosure with non-uniform heat distribution on a sidewall has been investigated numerically for upper and lower lid movement when 0.0 ≤ φ ≤ 0.2 and 0.1 ≤ Ri ≤ 10.0 at Re = 100. Throughout the study, the Reynolds number is kept fixed at 100 and the Richardson number (Ri) has been varied by varying the Grashof number (Gr) in the range 10³ ≤ Gr ≤ 10⁵. Water is considered as the base fluid, with a Prandtl number of 6.2.
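As a quick check of the quoted ranges, the short worked evaluation below uses only the definition $Ri = Gr/Re^2$ and the fixed value $Re = 100$ to map the Grashof-number limits onto the Richardson-number limits:

```latex
\[
Ri = \frac{Gr}{Re^{2}}, \qquad Re = 100 \;\Rightarrow\; Re^{2} = 10^{4},
\]
\[
Gr = 10^{3} \Rightarrow Ri = 0.1, \qquad
Gr = 10^{4} \Rightarrow Ri = 1.0, \qquad
Gr = 10^{5} \Rightarrow Ri = 10.0 .
\]
```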

5.1 Flow and Temperature Field

Figures 3 and 4 show the variation of streamlines and isotherms, respectively, for mixed convection of Cu-water nanofluid for the movement of the upper lid (first row of Figs. 3 and 4) and the lower lid (second row of Figs. 3 and 4). To show the effect of nanoparticle volume fraction (φ), streamlines and isotherms for pure water (φ = 0.0) and nanofluid (φ = 0.2) have been included in each figure, where solid black lines represent the pure fluid (φ = 0.0) result and dotted red lines represent the nanofluid (φ = 0.2) result (Table 2).
Figure 3a–c shows the effect of upper lid movement on the streamlines at Re = 100 for Ri = 0.1, 1.0, 10.0, respectively. Due to the combined effect of the buoyancy force and the temperature gradient between the hot and cold walls, hot fluid rises from the bottom along the left vertical hot wall and comparatively heavier cold fluid occupies the bottom portion of the cavity along the right cold wall. The motion of the upper lid in the positive x-direction accelerates this movement, and a primary vortex forms which rotates in the clockwise direction. When Ri = 0.1, i.e. less than unity, the shear force dominates the buoyancy force and the nanofluid in the enclosure is primarily driven by the lid velocity. Due to the strong shear force, the core of the primary eddy is near the moving lid. For Ri = 1.0, forced convection and natural convection contribute equally to the flow field. Hence, the size of the primary eddy increases and the core region moves downwards due to the natural convection effect.
Fig. 3 Variation of streamlines (x–y plane) for different Richardson numbers (Ri = 0.1, 1.0 and 10.0, left to right) and nanoparticle volume fractions (φ) at Re = 100 with a–c moving upper lid and d–f moving lower lid. Solid black lines are for pure fluid (φ = 0.0), and dotted red lines are for nanofluid (φ = 0.2)

Fig. 4 Variation of isotherms (x–y plane) for different Richardson numbers (Ri = 0.1, 1.0 and 10.0, left to right) and nanoparticle volume fractions (φ) at Re = 100 with a–c moving upper lid and d–f moving lower lid. Solid black lines are for pure fluid (φ = 0.0) and dotted red lines are for nanofluid (φ = 0.2)
Table 2 Maximum absolute value of stream function (|ψmax|) in the cavity for different lid movement, Ri and volume fraction (φ)

Lid     φ     |ψmax| at Ri = 0.1   |ψmax| at Ri = 1.0   |ψmax| at Ri = 10.0
Upper   0.0   0.100152             0.102985             0.0775992
Upper   0.1   0.104307             0.1056800            0.0831871
Upper   0.2   0.104403             0.1061990            0.0888904
Lower   0.0   0.101411             0.0947755            0.0669556
Lower   0.1   0.103499             0.0983553            0.0692968
Lower   0.2   0.103998             0.1011970            0.0712635

At Ri = 10.0, natural convection plays a dominant role in the flow field. Because of the strong buoyancy force, the primary eddy is elongated horizontally and it occupies the whole cavity. At Ri = 0.1, a secondary vortex also forms at the lower right corner of the cavity, which disappears for higher values of Ri because of the stronger buoyancy force. It can also be seen that the streamline patterns for the pure fluid (φ = 0.0) and the nanofluid (φ = 0.2) are very similar for Ri = 0.1 and Ri = 1.0, while for Ri = 10.0 (Fig. 3c) the streamline patterns differ significantly.
Figure 3d–f shows the streamline pattern when the lower lid is moving in the positive x-direction at Re = 100 for Ri = 0.1, 1.0, 10.0, respectively. For Ri = 0.1 (forced convection-dominated regime), the buoyancy force is overwhelmed by the shear force exerted by the lower moving lid, and a single anticlockwise circulating cell forms in the lower portion of the enclosure. The core of the circulation is displaced towards the lower right corner of the cavity due to the shear effect. At Ri = 1.0 (mixed convection regime), the buoyancy force and the shear force are comparable in magnitude. Due to this combined effect, two circulations form in the enclosure, rotating in opposite directions. The lower eddy circulates in the anticlockwise direction, driven by the shear of the moving wall, while the upper eddy circulates in the clockwise direction due to the buoyancy force and the temperature gradient. Two centres form in the upper eddy at φ = 0.2. This happens because the buoyancy force is weak at φ = 0.2, since the contribution of the buoyancy force decreases as φ increases. But as Ri rises to 10 (i.e. natural convection-dominated regime), the two centres of the upper eddy merge into a single clockwise circulating cell due to the strong buoyancy force. At Ri = 10, the lower cell shrinks and loses its strength while the upper cell becomes larger and gains strength.
Figure 4a–c shows the variation of the isotherms for non-uniform wall temperature at different Richardson numbers (Ri) and nanoparticle volume fractions (φ) when the upper lid of the square enclosure is moving in the positive x-direction at Re = 100. It can be seen that the isotherms are coiled on the hot wall due to the non-uniform distribution of the temperature, and the coiling is denser in the lower portion of the wall than in the upper portion. At Ri = 0.1 (forced convection-dominated regime), the isotherms are clustered along the hot wall. This is due to the steep temperature gradients in the horizontal direction, and it indicates that the heat transfer near the wall is due to conduction. There is no significant heat distribution in the middle portion of the enclosure. When Ri = 1.0 (mixed convection regime), the buoyancy force becomes stronger and the boundary layer vanishes. Heat distribution also increases in the middle of the enclosure due to the mixed convection effect. At Ri = 10.0 (natural convection-dominated regime), the buoyancy force becomes dominant and the isotherms are distributed over the whole cavity. The thermal gradient near the wall is higher for the nanofluid (φ = 0.2) than for the pure fluid (φ = 0.0). This is because of the enhanced thermal conductivity at higher φ.
Figure 4d–f shows the variation of the isotherms when the lower lid of the cavity is moving in the positive x-direction at different Richardson numbers (Ri) and nanoparticle volume fractions (φ) at Re = 100. At Ri = 0.1, the isotherms are clustered along the left vertical wall and at its lower corner. For Ri = 1.0 and 10.0, the isotherms are distributed all over the enclosure. Figures 4a–c and 4d–f illustrate the effect of the upper and lower moving lid on the temperature distribution as well as the effect of the non-uniform temperature distribution on the left vertical wall. The moving lid has a significant impact on the temperature distribution, because it drives the adjacent fluid in its own direction and so shifts the temperature distribution in the same direction.
Figure 5a and b show the variation of the average Nusselt number ($Nu_{av}$) on the hot wall as a function of nanoparticle volume fraction (φ) for different Richardson numbers (Ri) when the upper and the lower lid, respectively, is moving in the positive x-direction. At a fixed Ri, $Nu_{av}$ is a monotonically increasing function of nanoparticle volume fraction (φ). This is because the thermal conductivity of the nanofluid increases with the nanoparticle volume fraction, and hence a larger amount of heat is absorbed and removed from the hot wall by the nanofluid. As a result, $Nu_{av}$ increases. Figure 5a shows that as Ri increases, $Nu_{av}$ also increases. This is because the buoyancy force increases with Ri, which reduces the thickness of the thermal boundary layer and increases the heat transfer. But for the lower moving lid case (Fig. 5b), $Nu_{av}$ decreases as Ri increases from 0.1 to 1.0. This behaviour is illustrated more fully in Fig. 5c. It can be seen that the average Nusselt number ($Nu_{av}$) is a decreasing function of Richardson number (Ri) between 0.1 and 1.0 and an increasing function between 1.0 and 10.0. At Ri = 0.1 the fluid flow is wholly dominated by the shear force, and so is the heat transfer. But as Ri increases from 0.1 to 1.0, the buoyancy force increases, and at Ri = 1.0 the shear force and the buoyancy force become of the same magnitude. These two forces have opposite effects on the fluid flow when the lower lid is moving in the positive x-direction, which can be seen from the lower panel of Fig. 3. At Ri = 1.0, the shear force moves the lower half of the nanofluid in the anticlockwise direction, whereas the buoyancy force moves the upper half of the nanofluid in the clockwise direction. Due to the combined effect of these two oppositely acting forces, heat transfer is minimum at Ri = 1.0. A further increase in Ri makes the buoyancy force stronger, and hence $Nu_{av}$ also increases.
Figure 6 shows the u-velocity and v-velocity profiles for the movement of the upper lid (Fig. 6a and b) and the lower lid (Fig. 6c and d) at x = 0.5 and Ri = 10.0. Figure 6a shows that the u-velocity changes very little up to the mid-height of the enclosure and after that it increases rapidly, while the situation is almost opposite when the lower lid
Fig. 5 Variation of average Nusselt number ($Nu_{av}$) at Re = 100 as a function of nanoparticle volume fraction (φ) at different Ri (0.1, 1.0 and 10.0) for a upper moving lid and b lower moving lid. c Variation of average Nusselt number ($Nu_{av}$) at Re = 100 as a function of Richardson number (Ri) at different nanoparticle volume fraction (φ) for lower moving lid

is moving. Figure 6c shows that the u-velocity increases in the lower portion of the enclosure. From Fig. 6b and d it can be concluded that the v-velocity remains positive in the upper half and negative in the lower half of the enclosure for both cases.

6 Conclusions

A numerical investigation of mixed convection of Cu-water nanofluid in a square enclosure with non-uniform temperature distribution on a side wall is made. Flow fields and thermal fields are illustrated by presenting the streamline and isotherm contour plots. The main findings of this study can be summarized as follows:
1. The heat transfer rate is a strictly increasing function of Richardson number when the upper lid is moving. But for the moving lower wall, the heat transfer rate decreases in the interval 0.1 ≤ Ri ≤ 1.0, with its minimum value at Ri = 1.0, and increases in the interval 1.0 ≤ Ri ≤ 10.0.
2. The heat transfer rate depends on the choice of the moving wall. In the natural convection regime, i.e. at Ri = 10.0, $Nu_{av}$ is higher when the upper lid is moving. But in the forced convection regime, i.e. at Ri = 0.1, $Nu_{av}$ is higher when the lower lid is moving.
3. The non-uniform wall temperature affects the isotherm distribution on the hot wall. But it has a negligible effect on the flow field.
4. The choice of moving lid has a great impact on the flow field and temperature distribution.

Fig. 6 u- and v-velocity profiles (plotted against y) for a–b upper moving lid case and c–d lower moving lid case at x = 0.5 for Ri = 10, Re = 100 and φ = 0.0, 0.1, 0.2

References

1. Eastman, J.A., Choi, S.U.S., Li, S., Yu, W., Thompson, L.J.: Anomalously increased effective
thermal conductivities of ethylene glycol-based nanofluids containing copper nanoparticles.
Appl. Phys. Lett. 78(6), 718–720 (2001)
2. Li, Q., Xuan, Y., Wang, J.: Investigation on convective heat transfer and flow features of nanofluids. J. Heat Transfer 125, 151–155 (2003)
3. Tiwari, R.K., Das, M.K.: Heat transfer augmentation in a two-sided lid-driven differentially
heated square cavity utilizing nanofluids. Int. J. Heat Mass Transfer 50(9), 2002–2018 (2007)
4. Mahmoodi, M.: Mixed convection inside nanofluid filled rectangular enclosures with moving
bottom wall. Therm. Sci. 15(3), 889–903 (2011)
5. Basak, T., Roy, S., Sharma, P.K., Pop, I.: Analysis of mixed convection flows within a square
cavity with linearly heated side wall (s). Int. J. Heat Mass Transfer 52(9), 2224–2242 (2009)
6. Ramakrishna, D., Basak, T., Roy, S., Pop, I.: A complete heatline analysis on mixed convection
within a square cavity: effects of thermal boundary conditions via thermal aspect ratio. Int. J.
Therm. Sci. 57, 98–111 (2012)
7. Sivakumar, V., Sivasankaran, S.: Mixed convection in an inclined lid-driven cavity with non-uniform heating on both sidewalls. J. Appl. Mech. Tech. Phys. 55(4), 634–649
8. Sivasankaran, S., Cheong, H.T., Bhuvaneswari, M., Ganesan, P.: Effect of moving wall direction on mixed convection in an inclined lid-driven square cavity with sinusoidal heating. Numer. Heat Transfer A 69(6), 630–642
9. Chamkha, A.J., Abu-Nada, E.: Mixed convection flow in single- and double-lid driven square cavities filled with water–Al2O3 nanofluid: effect of viscosity models. Eur. J. Mech. B Fluids 36, 82–96 (2012)
10. Brinkman, H.C.: The viscosity of concentrated suspensions and solutions. J. Chem. Phys. 20(4),
571–571 (1952)
11. Abu-Nada, E., Chamkha, A.J.: Mixed convection flow in a lid-driven inclined square enclosure
filled with a nanofluid. Eur. J. Mech. B Fluids 29(6), 472–482 (2010)
Chapter 17
On Love Wave Frequency Under the
Influence of Linearly Varying Shear
Moduli, Initial Stress, and Density
of Orthotropic Half-Space

Sumit Kumar Vishwakarma, Tapas Ranjan Panigrahi and Rupinderjit Kaur
Abstract The present work studies Love wave propagation in an inhomogeneous anisotropic layer superimposed over an inhomogeneous orthotropic half-space under the influence of a rigid boundary plane. The layer exhibits inhomogeneity which varies quadratically with depth, whereas the half-space has inhomogeneity in the shear moduli, density, and initial stress which varies linearly downward. The frequency equation is deduced in closed form. It is found that the dispersion equation is a function of phase velocity, wave number, inhomogeneity parameters, and initial stress. To analyze the result further, numerical simulations and graphical illustrations have been carried out to show the impact of these parameters on the phase velocity of the Love wave. As a special case, the dispersion relation obtained is found to be in good agreement with the standard Love wave equation.

Keywords Love wave · Inhomogeneous · Orthotropic · Anisotropic · Rigid plane

1 Introduction

It is of interest to study Love wave propagation in anisotropic media because the dispersion of seismic waves in anisotropic and orthotropic media is fundamentally different from their dispersion in isotropic media. As the crustal layer and the mantle of the Earth are not homogeneous, it is also of interest to know the dispersion pattern of Love waves in an inhomogeneous medium, as studied by Shearer [13]. The propagation of Love waves is strongly affected by the elastic properties and the characteristics of the medium through which they travel. The Earth's mantle (half-space) contains hard and soft rocks or materials that may exhibit orthotropic properties and porosity. In an orthotropic medium, the thermal or mechanical properties are unique and independent in three mutually perpendicular directions, which makes it an interesting medium. These facts motivated us to investigate Love wave propagation further, so that the effect of linear variation in the rigidity, density, and initial stress can be studied. Destrade [5] studied in detail surface waves in incompressible orthotropic media, whereas Kumar and Rajeev [11] analyzed seismic wave motion to show the effect of voids at the boundary surface of an orthotropic thermoelastic material. Ahmed and Dahab [1] demonstrated the remarkable effect of an orthotropic granular layer on Love wave propagation, while Kumar and Choudhury [10] gave a clear picture of the behavior and response of an orthotropic micropolar elastic medium to various sources.

S. K. Vishwakarma (B) · T. R. Panigrahi · R. Kaur
Department of Mathematics, BITS-Pilani, Hyderabad Campus, Hyderabad 500078, India

© Springer Nature Singapore Pte Ltd. 2018
Many problems in theoretical seismology can be solved by representing the Earth as a layered medium with finite layer thickness and given mechanical properties. An accurate and precise study of the dispersion and generation of elastic waves was made by Chapman [4]. Because of its many applications in geophysics, seismology, and applied mathematics, the propagation of surface seismic waves in the Earth's crust has long been a subject of discussion and investigation. Vishwakarma et al. [14] demonstrated the influence of a rigid boundary on Love wave propagation in an elastic layer with void pores, while Ke et al. [9] made an interesting study of Love wave dispersion under the effect of linearly varying properties in an inhomogeneous fluid-saturated porous layered half-space. In the theoretical study of seismic waves, mathematical expressions provide the bridge between modeling results and field applications. The propagation of elastic/seismic waves through the interior of the Earth is governed by mathematical laws similar to the laws of light waves in optics.
The propagation of surface seismic waves such as Love waves in various inhomogeneous media is important in several branches of engineering and applied science, such as geophysics, seismology, and earth science. Several studies have been carried out to understand the propagation of seismic waves in inhomogeneous media. Theories of Love wave propagation in anisotropic and inhomogeneous media have significant practical importance. They help not only to investigate the internal structure of the Earth and to explore natural resources buried beneath the Earth's surface, but also to study the composition of layers under immense stress owing to different physical causes, e.g., the presence of overlying layers and variations in temperature and gravitational field. The wave disperses when the solid medium near the surface has inhomogeneous elastic properties. Biot [2] developed the incremental deformation theory for a pre-stressed medium. In the same framework, the Earth being a spherical body of finite dimension, the Earth's crust has a remarkable influence on seismic surface waves. This phenomenon motivated us to investigate boundary waves or surface waves, i.e., waves that remain confined to certain surfaces during their propagation. The formulations, solutions, and numerical simulations of many problems related to linear wave propagation for a variety of geomedia may be found in the work of Gupta et al. [7, 8].
However, no attempt has been made to show the influence of an inhomogeneous orthotropic half-space under initial stress on Love wave propagation. Therefore, in the present study, the half-space is taken as an inhomogeneous orthotropic medium with an inhomogeneous anisotropic layer resting over it. The inhomogeneity in the orthotropic mantle varies linearly with depth toward the central core of the Earth. This linear inhomogeneity is taken in the shear moduli, density, and initial stress of the half-space, whereas the layer exhibits a quadratic variation in the directional rigidities (along the horizontal and vertical directions) and in the density. Suitable boundary conditions, under the assumption of a rigid boundary plane, are imposed on the displacements found for the individual media. The frequency equation (dispersion equation) is derived in closed form along with various particular cases. When all the inhomogeneities vanish, the frequency equation reduces to the classical equation of Love waves given by Love [12]. Numerical values of the phase velocity are calculated using the material constants given by Biot [2] from experiments, and the effects of the inhomogeneity parameters associated with the directional rigidities, density, and initial stress are discussed and demonstrated using graphs.

2 Statement of the Problem

The geometry of the problem consists of an inhomogeneous anisotropic crustal layer of finite thickness H resting over an inhomogeneous orthotropic half-space under the influence of linearly varying initial stress. A Cartesian coordinate system is employed with the z-axis directed downwards and the origin at the interface where the crustal layer and the half-space meet, as shown in the 3D diagram of Fig. 1. The upper boundary plane of the layer is kept rigid, so the displacement of the wave vanishes there. The inhomogeneities considered in the layer are as follows:

$$N = N_0(1+az)^2, \quad L = L_0(1+az)^2, \quad \rho = \rho_0(1+az)^2 \tag{1}$$

where $N_0$ and $L_0$ are the values of the directional rigidities along the x and z directions and $\rho_0$ is the density at z = 0; a is the inhomogeneity parameter, with the dimension of inverse length.
The inhomogeneities taken in the orthotropic half-space are

$$Q_1 = Q_1^0(1+\alpha z), \quad Q_3 = Q_3^0(1+\beta z), \quad P = P^0(1+\gamma z), \quad \rho_1 = \rho_1^0(1+\delta z) \tag{2}$$

where $Q_1^0$, $Q_3^0$, $P^0$, and $\rho_1^0$ are the shear moduli, initial stress, and density of the medium at the interface z = 0, and α, β, γ, and δ are the associated inhomogeneity parameters, each with the dimension of inverse length. Variation of rigidity, density, and initial stress with depth inside the Earth affects the propagation of seismic waves to a great extent. The inhomogeneity is caused by variation in rigidity and density. The crust of our planet is composed of various inhomogeneous layers with different geological parameters. As pointed out by Bullen [3], the density inside the Earth varies at different rates in different layers. He approximated the density law inside the Earth as a quadratic polynomial in the depth parameter for depths of 413–984 km. For depths from 984 km to the central core, Bullen approximated the density as a linear function of the depth parameter; based on these theories, we have taken quadratic and linear variations.

Fig. 1 Three-dimensional geometry of the problem

3 Solution

3.1 Finding Displacement in Anisotropic Inhomogeneous Layer

Let $u_1$, $v_1$ and $w_1$ be the displacement components in the x, y, and z directions, respectively. Starting from the general equations of motion and using the conventional Love wave conditions, viz. $u_1 = 0$, $w_1 = 0$ and $v_1 = v_1(x, z, t)$, only the y component remains. Then, the equation of motion in the absence of body forces can be written as (Biot [2])

$$N\frac{\partial^2 v_1}{\partial x^2} + \frac{\partial}{\partial z}\left(L\frac{\partial v_1}{\partial z}\right) = \rho\frac{\partial^2 v_1}{\partial t^2} \tag{3}$$

For wave propagation along the x-direction, we may assume

$$v_1 = V(z)\,e^{ik(x-ct)} \tag{4}$$

Using Eq. (4), Eq. (3) takes the form

$$\frac{d^2V}{dz^2} + \frac{1}{L}\frac{dL}{dz}\frac{dV}{dz} + \frac{k^2}{L}\left(c^2\rho - N\right)V = 0 \tag{5}$$

After putting $V = V_1/\sqrt{L}$ in Eq. (5), we get

$$\frac{d^2V_1}{dz^2} - \frac{1}{2L}\frac{d^2L}{dz^2}V_1 + \frac{1}{4L^2}\left(\frac{dL}{dz}\right)^2 V_1 + \frac{k^2}{L}\left(c^2\rho - N\right)V_1 = 0 \tag{6}$$

Using the inhomogeneity taken in Eq. (1), Eq. (6) changes to

$$\frac{d^2V_1}{dz^2} + m_1^2 V_1 = 0 \tag{7}$$

where

$$m_1^2 = \frac{k^2}{L_0}\left(c^2\rho_0 - N_0\right) \tag{8}$$

with $N_0$, $L_0$ and $\rho_0$ the values of $N$, $L$ and $\rho$ at z = 0.
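The step from Eq. (6) to Eq. (7) can be checked directly. The worked lines below are a sketch of that check, writing $N_0$, $L_0$, $\rho_0$ for the values of $N$, $L$, $\rho$ at z = 0 in Eq. (1) (the subscript 0 is a labeling assumption made here for illustration):

```latex
\[
L = L_0(1+az)^2 \;\Rightarrow\;
\frac{dL}{dz} = 2aL_0(1+az), \qquad \frac{d^2L}{dz^2} = 2a^2L_0 ,
\]
\[
\frac{1}{2L}\frac{d^2L}{dz^2} = \frac{a^2}{(1+az)^2}, \qquad
\frac{1}{4L^2}\left(\frac{dL}{dz}\right)^2 = \frac{a^2}{(1+az)^2},
\]
\[
\frac{c^2\rho - N}{L} = \frac{c^2\rho_0 - N_0}{L_0} .
\]
```

The two derivative terms in Eq. (6) therefore cancel, and the remaining coefficient is independent of z. This is why a quadratic variation with a common factor $(1+az)^2$ in all three quantities leads to a constant $m_1$.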
The solution of Eq. (7) may be written as

$$V_1 = A_1 e^{im_1 z} + B_1 e^{-im_1 z}$$

Thus, from Eq. (4), the displacement in the inhomogeneous anisotropic layer may be taken as

$$v_1 = \frac{A_1 e^{im_1 z} + B_1 e^{-im_1 z}}{\sqrt{L_0}\,(1+az)}\,e^{ik(x-ct)} \tag{9}$$

3.2 Finding Displacement for Inhomogeneous Orthotropic Half-Space

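The half-space taken in the problem is inhomogeneous and orthotropic, under the influence of initial stress P along the x direction as shown in Fig. 1. The system of equations governing the wave motion in the absence of body forces is given by Biot [2]:

$$\left.\begin{aligned}
\frac{\partial\sigma_{11}}{\partial x} + \frac{\partial\sigma_{12}}{\partial y} + \frac{\partial\sigma_{13}}{\partial z} - P\left(\frac{\partial\omega_z}{\partial y} - \frac{\partial\omega_y}{\partial z}\right) &= \rho_1\frac{\partial^2 u_2}{\partial t^2} \\
\frac{\partial\sigma_{21}}{\partial x} + \frac{\partial\sigma_{22}}{\partial y} + \frac{\partial\sigma_{23}}{\partial z} - P\frac{\partial\omega_z}{\partial x} &= \rho_1\frac{\partial^2 v_2}{\partial t^2} \\
\frac{\partial\sigma_{31}}{\partial x} + \frac{\partial\sigma_{32}}{\partial y} + \frac{\partial\sigma_{33}}{\partial z} - P\frac{\partial\omega_y}{\partial x} &= \rho_1\frac{\partial^2 w_2}{\partial t^2}
\end{aligned}\right\} \tag{10}$$

where $u_2$, $v_2$, and $w_2$ are the displacement components, while $\omega_x$, $\omega_y$, and $\omega_z$ are the rotational components along the x, y, and z directions. Here, $\sigma_{ij}$ are the incremental stress components and $\rho_1$ is the density of the orthotropic medium. The relations between the strain and the incremental stress components are

$$\begin{aligned}
\sigma_{11} &= B_{11}e_{11} + B_{12}e_{22} + B_{13}e_{33}, & \sigma_{12} &= 2Q_3 e_{12}, \\
\sigma_{22} &= B_{21}e_{11} + B_{22}e_{22} + B_{23}e_{33}, & \sigma_{23} &= 2Q_1 e_{23}, \\
\sigma_{33} &= B_{31}e_{11} + B_{32}e_{22} + B_{33}e_{33}, & \sigma_{31} &= 2Q_2 e_{31}
\end{aligned} \tag{11}$$

where $B_{ij}$ and $Q_i$ are the incremental normal elastic coefficients and shear moduli, respectively.
Here $e_{ij}$ are the strain components, which are defined by
$e_{ij} = \frac{1}{2}\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}\right)$, where $i, j = 1, 2, 3$.
Now, as per the characteristics of Love wave propagation, $u_2 = 0$, $w_2 = 0$, and $v_2 = v_2(x, z, t)$. With the inhomogeneity taken in Eq. (2), Eq. (10) reduces to

$$Q_3^0(1+\beta z)\frac{\partial^2 v_2}{\partial x^2} + Q_1^0\,\alpha\frac{\partial v_2}{\partial z} + Q_1^0(1+\alpha z)\frac{\partial^2 v_2}{\partial z^2} - \frac{P^0}{2}(1+\gamma z)\frac{\partial^2 v_2}{\partial x^2} = \rho_1^0(1+\delta z)\frac{\partial^2 v_2}{\partial t^2} \tag{12}$$

where $Q_1^0$, $Q_3^0$, $P^0$ and $\rho_1^0$ denote the values at the interface z = 0. We may now use separation of variables, i.e., $v_2 = V_2(z)\,e^{ik(x-ct)}$, where k is the wave number and c is the phase velocity. Eq. (12) may now be written as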
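$$\frac{d^2V_2}{dz^2} + \frac{\alpha}{(1+\alpha z)}\frac{dV_2}{dz} + k^2\left[A_1\frac{(1+\gamma z)}{(1+\alpha z)} + A_2\frac{(1+\delta z)}{(1+\alpha z)} - A_3\frac{(1+\beta z)}{(1+\alpha z)}\right]V_2 = 0 \tag{13}$$

where

$$A_1 = \frac{P^0}{2Q_1^0}, \quad A_2 = \frac{c^2}{c_1^2}, \quad A_3 = \frac{Q_3^0}{Q_1^0}, \quad c_1^2 = \frac{Q_1^0}{\rho_1^0} \tag{14}$$

Now, substituting $V_2 = \dfrac{\psi(z)}{(1+\alpha z)^{1/2}}$ in Eq. (13) to eliminate $\dfrac{dV_2}{dz}$, we get

$$\frac{d^2\psi}{dz^2} + k^2\left[A_1\frac{(1+\gamma z)}{(1+\alpha z)} + A_2\frac{(1+\delta z)}{(1+\alpha z)} - A_3\frac{(1+\beta z)}{(1+\alpha z)} + \frac{1}{4}\left(\frac{\alpha}{k}\right)^2\frac{1}{(1+\alpha z)^2}\right]\psi = 0 \tag{15}$$

Putting $\eta = \dfrac{2k}{\alpha}(1+\alpha z)\left(A_3\dfrac{\beta}{\alpha} - A_1\dfrac{\gamma}{\alpha} - A_2\dfrac{\delta}{\alpha}\right)^{1/2}$, we get

$$\frac{d^2\psi}{d\eta^2} + \left[\frac{1}{4\eta^2} + \frac{R}{\eta} - \frac{1}{4}\right]\psi = 0 \tag{16}$$

where

$$R = \frac{1}{2}\frac{k}{\alpha}\left(A_3\frac{\beta}{\alpha} - A_1\frac{\gamma}{\alpha} - A_2\frac{\delta}{\alpha}\right)^{-1/2}\left[A_1\left(1 - \frac{\gamma}{\alpha}\right) + A_2\left(1 - \frac{\delta}{\alpha}\right) - A_3\left(1 - \frac{\beta}{\alpha}\right)\right]$$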

Equation (16) is the well-known Whittaker equation, whose solution can be written as

$$\psi = A_2 W_{R,0}(\eta) + B_2 W_{-R,0}(-\eta)$$

where $W_{R,0}(\eta)$ is the Whittaker function, and the general expansion of $W_{R,m}(\eta)$ may be written as (Whittaker and Watson [15])

$$W_{R,m}(\eta) = e^{-\eta/2}\,\eta^{R}\left[1 + \frac{m^2 - \left(R - \frac{1}{2}\right)^2}{1!\,\eta} + \frac{\left\{m^2 - \left(R - \frac{1}{2}\right)^2\right\}\left\{m^2 - \left(R - \frac{3}{2}\right)^2\right\}}{2!\,\eta^2} + \dots\right] \tag{17}$$

Thus, the displacement in the inhomogeneous orthotropic half-space becomes

$$v_2 = \frac{A_2 W_{R,0}(\eta) + B_2 W_{-R,0}(-\eta)}{(1+\alpha z)^{1/2}}\,e^{ik(x-ct)} \tag{18}$$

But the displacement vanishes as we go deep toward the center of the Earth, i.e., $v_2 \to 0$ as $z \to \infty$, and therefore the displacement in Eq. (18) reduces to

$$v_2 = \frac{A_2 W_{R,0}(\eta)}{(1+\alpha z)^{1/2}}\,e^{ik(x-ct)} \tag{19}$$

4 Boundary Conditions and Dispersion Equation

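(1) Due to the presence of the rigid boundary plane at z = −H, the displacement vanishes:

$$v_1 = 0 \quad \text{at } z = -H \tag{20a}$$

(2) Displacement being continuous at the interface implies that

$$v_1 = v_2 \quad \text{at } z = 0 \tag{20b}$$

(3) At the contact plane z = 0, the continuity of the stress requires that

$$L\frac{\partial v_1}{\partial z} = Q_1\frac{\partial v_2}{\partial z} \quad \text{at } z = 0 \tag{20c}$$

Using the above boundary conditions one by one, and eliminating the arbitrary constants $A_1$, $B_1$, and $A_2$ for a nontrivial solution, we obtain the following determinant:

$$\begin{vmatrix}
e^{-im_1H} & e^{im_1H} & 0 \\[4pt]
\dfrac{1}{\sqrt{L_0}} & \dfrac{1}{\sqrt{L_0}} & -W_{R,0}(\eta) \\[8pt]
\sqrt{L_0}\,(im_1 - a) & -\sqrt{L_0}\,(im_1 + a) & -Q_1^0\left[\dfrac{\partial W_{R,0}(\eta)}{\partial\eta}\cdot\dfrac{d\eta}{dz}\right]_{z=0}
\end{vmatrix} = 0 \tag{21}$$

where $L_0$ and $Q_1^0$ are the values of $L$ and $Q_1$ at z = 0.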

Expanding the above determinant, we get the following:

$$\cot(m_1H) = \frac{a}{m_1}\left[1 + 2\frac{Q_1^0}{L_0}\frac{k}{a}\left(A_3\frac{\beta}{\alpha} - A_1\frac{\gamma}{\alpha} - A_2\frac{\delta}{\alpha}\right)^{1/2}\left\{-\frac{1}{2} + \frac{R}{\eta} + \frac{(R-0.5)^2}{\eta^2}\left[1 - \frac{(R-0.5)^2}{\eta}\right]^{-1}\right\}\right]$$

Substituting the value of $m_1$ in the above expansion, it reduces to

$$\cot\left(kH\sqrt{\frac{c^2}{c_0^2} - \frac{N_0}{L_0}}\right) = \left[1 + 2\frac{Q_1^0}{L_0}\frac{k}{a}\left(A_3\frac{\beta}{\alpha} - A_1\frac{\gamma}{\alpha} - A_2\frac{\delta}{\alpha}\right)^{1/2}\left\{-\frac{1}{2} + \frac{R}{\eta} + \frac{(R-0.5)^2}{\eta^2}\left[1 - \frac{(R-0.5)^2}{\eta}\right]^{-1}\right\}\right]\frac{a}{k}\left(\frac{c^2}{c_0^2} - \frac{N_0}{L_0}\right)^{-1/2} \tag{22}$$

where $c_0^2 = L_0/\rho_0$.
Equation (22) is the required frequency equation of Love wave propagation in an inhomogeneous anisotropic layer resting over an inhomogeneous orthotropic medium with a rigid boundary plane at the top. We find that Eq. (22) is a function of the dimensionless phase velocity $c^2/c_0^2$ and the dimensionless wave number $kH$, along with the inhomogeneity parameters a, α, β, γ, and δ associated with the rigidities, densities, and initial stress of the media taken into consideration.
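The bracketed factor in Eq. (22) can be read as the logarithmic derivative of the Whittaker function when Eq. (17) is truncated after its linear term with m = 0. The sketch below checks that reading numerically. It is an illustration only: the function names and the sample values of η and R are assumptions, and no special-function library is used.

```python
import math


def whittaker_linear(eta, R):
    """Eq. (17) with m = 0, truncated after the term in 1/eta."""
    return math.exp(-eta / 2.0) * eta ** R * (1.0 - (R - 0.5) ** 2 / eta)


def log_derivative(eta, R):
    """(dW/d eta) / W for the truncated form: the factor in braces in Eq. (22)."""
    s = (R - 0.5) ** 2
    return -0.5 + R / eta + (s / eta ** 2) / (1.0 - s / eta)


if __name__ == "__main__":
    eta, R, h = 5.0, 0.8, 1e-5   # hypothetical sample values
    # Central finite difference of the truncated Whittaker function.
    numeric = ((whittaker_linear(eta + h, R) - whittaker_linear(eta - h, R))
               / (2.0 * h) / whittaker_linear(eta, R))
    print(numeric, log_derivative(eta, R))
```

The three contributions come from $e^{-\eta/2}$, $\eta^R$ and the linear correction term, respectively, which is why the closed form and the finite-difference estimate agree.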
Particular Case:
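Case-I: When there is no inhomogeneity in the layer, i.e., a → 0, Eq. (22) reduces to

\[
\cot\left(kH\sqrt{\frac{c^2}{c_0^2} - \frac{N}{L}}\,\right) = 2\,\frac{Q_1}{L}\left(A_3\frac{\beta}{\alpha} - A_1\frac{\gamma}{\alpha} - A_2\frac{\delta}{\alpha}\right)^{1/2}
\left\{-\frac{1}{2} + \frac{R}{\eta} + \frac{(R-0.5)^2}{\eta^2}\left[1 - \frac{(R-0.5)^2}{\eta}\right]^{-1}\right\}
\left(\frac{c^2}{c_0^2} - \frac{N}{L}\right)^{-1/2}
\]

which is the frequency equation of Love wave in a homogeneous anisotropic layer
over an inhomogeneous orthotropic half-space.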

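Case-II: When the half-space is stress-free, i.e., P → 0, Eq. (22) becomes

\[
\cot\left(kH\sqrt{\frac{c^2}{c_0^2} - \frac{N}{L}}\,\right) = \left[\,1 + 2\,\frac{Q_1}{L}\,\frac{k}{a}\left(A_3\frac{\beta}{\alpha} - A_2\frac{\delta}{\alpha}\right)^{1/2}
\left\{-\frac{1}{2} + \frac{R}{\eta} + \frac{(R-0.5)^2}{\eta^2}\left[1 - \frac{(R-0.5)^2}{\eta}\right]^{-1}\right\}\right]
\frac{a}{k}\left(\frac{c^2}{c_0^2} - \frac{N}{L}\right)^{-1/2},
\]

which is the frequency equation of Love wave in an inhomogeneous anisotropic layer
resting over an inhomogeneous orthotropic half-space with no initial stress.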
Case-III: When N = L, Q1 = Q2, a → 0, α → 0, β → 0, δ → 0 and P → 0, the
frequency Eq. (22) becomes

\[
\cot\left(kH\sqrt{\frac{c^2}{c_0^2} - 1}\,\right) = \frac{Q_1}{L}\,\frac{\sqrt{\dfrac{c^2}{c_1^2} - 1}}{\sqrt{\dfrac{c^2}{c_0^2} - 1}}
\]

which is the standard classical dispersion equation of Love wave given by Love [12]
and therefore validates the solution of the problem discussed.

5 Numerical Computations, Graphs, and Discussion

In order to illustrate the theoretical results obtained in the preceding sections, the data
have been taken from Gubbins [6] to study graphically the impact of inhomogeneity,
rigid boundary, and the various elastic constants on the propagation of Love wave
using the frequency equation obtained in Eq. (22). We will use the asymptotic linear
expansion of Whittaker's function as given in Eq. (17). In all the graphs, the horizontal
axis has been taken as the dimensionless wave number kH while the vertical axis has been
taken as the dimensionless phase velocity c²/c0². Numerical values taken are as follows:

1. Inhomogeneous anisotropic layer: N = 7.34 × 10^10 N/m², L = 5.98 × 10^10 N/m², ρ = 3195 kg/m³
2. Inhomogeneous orthotropic half-space: Q1 = 5.82 × 10^10 N/m², Q3 = 3.99 × 10^10 N/m², ρ1 = 4500 kg/m³
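As a quick orientation on these values, the sketch below computes the reference shear speed c0 from c0² = L/ρ and the ratio N/L that appears under the square root in Eq. (22). The half-space speed c1 is computed under the assumption c1² = Q1/ρ1, which is not stated in this section; variable names are illustrative.

```python
import math

# Values quoted in the text (SI units).
N, L, rho = 7.34e10, 5.98e10, 3195.0   # anisotropic layer
Q1, rho1 = 5.82e10, 4500.0             # orthotropic half-space

c0 = math.sqrt(L / rho)      # c0^2 = L / rho, as defined after Eq. (22)
c1 = math.sqrt(Q1 / rho1)    # assumption: c1^2 = Q1 / rho1
threshold = N / L            # cot argument in Eq. (22) is real when c^2/c0^2 > N/L

print(f"c0 = {c0:.1f} m/s, c1 = {c1:.1f} m/s, N/L = {threshold:.4f}")
```

With these data the argument of the cotangent in Eq. (22) is real only for c²/c0² above roughly 1.23, which helps when choosing the plotting range for the vertical axis.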

Figure 2 reflects the effect of the inhomogeneity parameter a/k associated with the
directional rigidities and density in the anisotropic layer. The value of a/k for curve
no. 1, curve no. 2, curve no. 3, and curve no. 4 has been taken as 0.1, 0.3, 0.5, and 0.7,
respectively, whereas the values of P/(2Q1), α/k, β/k, γ/k, and δ/k are 0.2, 0.1, 0.2, 0.2, and 0.1,
respectively. The following observations and effects are obtained under the above
considered values.
Fig. 2 Variation of dimensionless phase velocity against dimensionless wave number for different
values of a/k when P/(2Q1) = 0.2, α/k = 0.1, β/k = 0.2, γ/k = 0.2, δ/k = 0.1

2a. The phase velocity decreases as the wave number increases for all the values of a/k.
2b. At a particular wave number, as the value of a/k increases from 0.1 to 0.7,
the phase velocity decreases.
2c. Toward low wave number, the curves accumulate, which reveals that the
phase velocity remains unaffected as the inhomogeneity changes.
2d. Toward higher wave number, the phase velocity decreases gradually, whereas it
decreases rapidly for low wave number.
2e. Seeing the pattern of the curves, we can claim that the inhomogeneity present in
the layer bears a remarkable effect on the phase velocity of Love wave.
Figure 3 has been drawn to analyze the bearing of the dimensionless inhomogeneity
parameter α/k on the phase velocity of Love wave. Curve no. 1 has been plotted
for α/k = 0.2, curve no. 2 for α/k = 0.4, curve no. 3 for α/k = 0.6, and curve no. 4 for
α/k = 0.8. The values of P/(2Q1), a/k, β/k, γ/k, and δ/k are 0.2, 0.1, 0.2, 0.2, and 0.1, respectively.
The following results are obtained.
3a. The pattern of curves obtained here is quite similar to the one obtained in Fig. 2.
3b. As the magnitude of α/k increases from 0.2 to 0.8, the phase velocity decreases
at a fixed wave number.
3c. The curves being equally spaced, a periodic effect of the inhomogeneity parameter α/k may
be found throughout the figure.

Figure 4 describes the influence of the inhomogeneity parameter β/k for its increasing
magnitude from 0.1 to 0.4 for curves no. 1–4. The following observations and effects
are found.
Fig. 3 Variation of dimensionless phase velocity against dimensionless wave number for different
values of α/k when P/(2Q1) = 0.2, a/k = 0.1, β/k = 0.2, γ/k = 0.2, δ/k = 0.1

Fig. 4 Variation of dimensionless phase velocity against dimensionless wave number for different
values of β/k when P/(2Q1) = 0.2, a/k = 0.1, α/k = 0.1, γ/k = 0.2, δ/k = 0.1

Fig. 5 Variation of dimensionless phase velocity against dimensionless wave number for different
values of γ/k when P/(2Q1) = 0.2, a/k = 0.1, α/k = 0.1, β/k = 0.2, δ/k = 0.1

4a. Unlike Figs. 2 and 3, the phase velocity increases with an increase in the
inhomogeneity parameter β/k associated with the shear modulus Q3.
4b. The curves become closer as the magnitude of β/k increases.
4c. The impact of the inhomogeneity is more pronounced for its least value.
4d. It can also be said that the phase velocity may attain a constant magnitude as the
inhomogeneity increases further.
Figure 5 illustrates a clear picture of the variation of phase velocity against wave
number when the initial stress in the half-space increases. Curves have been plotted for
γ/k equal to 0.2, 0.4, 0.6, and 0.8 for curve no. 1, curve no. 2, curve no. 3, and curve
no. 4, respectively. The values of the other parameters P/(2Q1), a/k, α/k, β/k, and δ/k have
been taken as 0.2, 0.1, 0.1, 0.2, and 0.1, respectively. We can list the following points about Fig. 5.
5a. The pattern is similar to some extent to the one obtained in Fig. 4.
5b. Here the phase velocity diminishes as the magnitude of the inhomogeneity
parameter linked with initial stress increases.
5c. The phase velocity for curve no. 3 and curve no. 4 is restricted up to kH = 3.5
and kH = 2, respectively, thereby showing a significant effect of inhomogeneity
in the half-space.
In Fig. 6, an attempt has been made to show the influence of the inhomogeneity parameter
δ/k present in the density of the orthotropic half-space. We find that
6a. There is a decrement in the magnitude of phase velocity as the wave number
diminishes for all the values of δ/k.
Fig. 6 Variation of dimensionless phase velocity against dimensionless wave number for different
values of δ/k when P/(2Q1) = 0.2, a/k = 0.1, α/k = 0.1, β/k = 0.2, γ/k = 0.2

Fig. 7 Variation of dimensionless phase velocity against dimensionless wave number for different
values of compressive initial stress P/(2Q1) > 0 when a/k = 0.1, α/k = 0.1, β/k = 0.2, γ/k = 0.0, δ/k = 0.1

6b. At a particular wave number, the phase velocity also decreases for increasing
magnitude of the inhomogeneity parameter in the density of the orthotropic medium.
6c. When the phase velocity is least, the curves appear closer to each other at
high wave number, showing a prominent effect of the inhomogeneity parameter δ/k.
Figure 7 depicts the impact of the initial stress P/(2Q1) when γ/k = 0. It shows the effect of
compressive initial stress P/(2Q1) > 0 on the phase velocity of Love wave propagating
in an inhomogeneous anisotropic layer. It has been observed that as the magnitude
of compressive initial stress becomes larger, the phase velocity decreases, while it
increases as the tensile stress increases.

6 Conclusion

Propagation of Love waves in an inhomogeneous anisotropic layer resting over an
inhomogeneous orthotropic half-space with linearly varying inhomogeneity has been
studied in detail. Solutions in terms of displacement of the wave in the layer and
half-space have been derived separately. We have used the asymptotic linear expansion
of Whittaker's function and obtained the dispersion relation (frequency equation) in
compact form. Numerical investigations have been made on phase velocity against
wave number, and the effect of each of the linearly varying inhomogeneity parameters
associated with the anisotropic layer and orthotropic half-space has been studied
and discussed in detail. We observed that
I. Under the assumed condition, the phase velocity c²/c0² increases with a decrease in
dimensionless wave number.
II. The phase velocity of Love wave decreases as the inhomogeneity parameter a/k
associated with directional rigidity and density of the layer increases.
III. An increasing magnitude of β/k increases the phase velocity, whereas increasing α/k, γ/k, and δ/k
decreases the phase velocity.
IV. At a fixed wave number, an increasing value of compressive initial stress
P/(2Q1) > 0 decreases the velocity, while increasing tensile stress P/(2Q1) < 0 increases it.
V. In the absence of all inhomogeneity and initial stress, the dispersion equation
turns into the classical form of the equation of Love wave, thereby validating
the current work.
The present study gives a theoretical framework for the adopted model, which is
likely to be utilized to collect, investigate, and recognize the propagation pattern
of Love waves in an anisotropic layer over an orthotropic half-space, and which may
further help in accessing the resources buried inside the earth such as oils, gases,
minerals, deposits, and other useful hydrocarbons. Apart from these, the outcomes of
the present study may also be used widely in the design and development of heavy
civil construction projects involving steel structures, disaster-resistant buildings,
bridges, and towers, etc. The study may also be useful in interdepartmental fields
like rock mechanics, soil mechanics, geotechnical engineering, and applied science.

Acknowledgements Authors extend their sincere thanks to SERB-DST, New Delhi, for providing
financial support under Early Career Research Award with Ref. no. ECR/2017/001185. Authors are
also thankful to DST, New Delhi, for providing DST-FIST grant with Ref. no. 337 to Department
of Mathematics, BITS-Pilani, Hyderabad campus. Authors also express their deep sense of respect
and gratitude to honorable reviewers for their constructive suggestions to improve the quality of the
manuscript.

References

1. Ahmed, S.M., Abd-Dahab, S.M.: Propagation of Love waves in an orthotropic Granular layer
under initial stress overlying a semi-infinite Granular medium. J. Vib. Control 16(12), 1845–
1858 (2010)
2. Biot, M.A.: Mechanics of Incremental Deformations. Wiley, New York (1965)
3. Bullen, K.E.: The problem of Earth’s density variation. Bull. Seismological Soc. Am. 30(3),
235–250 (1940)
4. Chapman, C.: Fundamentals of Seismic Wave Propagation. Cambridge University Press, Cam-
bridge (2004)
5. Destrade, M.: Surface waves in orthotropic incompressible materials. J. Acoustical Soc. Am.
110(2), 837–840 (2001)
6. Gubbins, D.: Seismology and Plate Tectonics. Cambridge University Press, Cambridge (1990)
7. Gupta, S., Majhi, D.K., Kundu, S., Vishwakarma, S.K.: Propagation of love waves in non-
homogeneous substratum over initially stressed heterogeneous half-space. Appl. Mathe. Mech.
34(2), 249–258 (2013)
8. Gupta, S., Vishwakarma, S.K., Majhi, D.K., Kundu, S.: Possibility of Love wave propagation
in a porous layer under the effect of linearly varying directional rigidities. Appl. Mathe. Modell.
37, 6652–6660 (2013)
9. Ke, L.L., Wang, Y.S., Zhang, Z.M.: Propagation of love waves in an inhomogeneous fluid
saturated porous layered half-space with linearly varying properties. Soil Dyn. Earthquake
Eng. 26, 574–581 (2006)
10. Kumar, R., Choudhary, S.: Response of orthotropic micropolar elastic medium due to various
sources. Meccanica 38, 349–368 (2003)
11. Kumar, R., Rajeev, K.: Analysis of wave motion at the boundary surface of orthotropic ther-
moelastic material with voids and isotropic elastic half space. J. Eng. Phys. Thermophys. 84(2),
463–478 (2003)
12. Love, A.E.H.: A Treatise on the Mathematical Theory of Elasticity, 4th edn. Dover Publications,
New York (1944)
13. Shearer, P.M.: Introduction to Seismology, 2nd edn. Cambridge University Press, Cambridge
(2009)
14. Vishwakarma, S.K., Gupta, S., Majhi, D.K.: Influence of rigid boundary on the Love wave
propagation in elastic layer with void pores. Acta Mechanica Solida Sinica 25(5), 551–558
(2013)
15. Whittaker, E.T., Watson, G.N.: A Course of Modern Analysis. Cambridge University Press,
Cambridge (1990)
Chapter 18
The Problem of Oblique Scattering
by a Thin Vertical Submerged Plate
in Deep Water Revisited

B. C. Das, S. De and B. N. Mandal

Abstract The problem of oblique scattering by fixed thin vertical plate submerged in
deep water is studied here, assuming linear theory, by employing single-term Galerkin
approximation involving constant as basis multiplied by appropriate weight function
after reducing it to solving a pair of first kind integral equations. Upper and lower
bounds of reflection and transmission coefficients when evaluated numerically are
seen to be very close so that their averages produce fairly accurate numerical estimates
for these coefficients. Numerical estimates for the reflection coefficient are depicted
graphically against the wave number for different values of various parameters. The
numerical results obtained by the present method are found to be in an excellent
agreement with the known results.

Keywords Submerged plate · Linearized theory · Galerkin technique · Constant

as basis · Reflection coefficient

1 Introduction and Mathematical Formulation

of the Problem

There are no explicit solutions for the problems of oblique scattering of water waves
by a thin vertical barrier of various geometrical configurations present in deep water.
However, there exist some approximate methods to solve these problems
in the sense that the reflection and transmission coefficients could be obtained

B. C. Das (B) · S. De
Department of Applied Mathematics, Calcutta University,
92, A.P.C. Road, Kolkata 700009, India
e-mail: [email protected]
S. De
e-mail: [email protected]
B. N. Mandal
Physics and Applied Mathematics Unit, Indian Statistical Institute,
203 B.T Road, Kolkata 700108, India
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2018 225

numerically. Oblique scattering problems involving a partially immersed or
completely submerged thin vertical barrier were studied in [2–4, 6–8] by using various
methods.
The problem of oblique scattering by a thin vertical plate submerged in deep
water can be formulated mathematically as follows. Assuming linear theory and
irrotational motion, let a train of surface water waves represented by the potential
function Re{φ0(x, y) e^{iνz−iσt}} with

\[
\phi_0(x, y) = e^{-Ky + i\mu x}, \tag{1.1}
\]

where μ = K cos α, ν = K sin α, K = σ²/g, g being the acceleration due to gravity and σ the circular
frequency, be incident obliquely at an angle α on a thin vertical plate represented
by x = 0, y ∈ L = (a, b), which is submerged in deep water. Here the y-axis is taken
vertically downwards and the (x, z)-plane denotes the mean free surface. Due to
geometrical symmetry, the resulting motion in water can be described by the velocity
potential Re{φ(x, y) e^{iνz−iσt}}, where φ(x, y) satisfies

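\[
(\nabla^2 - \nu^2)\phi = 0, \quad y \ge 0,\ -\infty < x < \infty, \tag{1.2}
\]
\[
K\phi + \phi_y = 0 \quad \text{on } y = 0, \tag{1.3}
\]
\[
\phi_x = 0 \quad \text{on } x = 0,\ y \in L, \tag{1.4}
\]
\[
\nabla\phi \to 0 \quad \text{as } y \to \infty, \tag{1.5}
\]
\[
r^{1/2}\,\nabla\phi = O(1) \quad \text{as } r = \left(x^2 + (y-c)^2\right)^{1/2} \to 0, \ \text{where } c = a, b, \tag{1.6}
\]

and

\[
\phi(x, y) \to
\begin{cases}
T e^{-Ky + i\mu x} & \text{as } x \to \infty, \\
e^{-Ky + i\mu x} + R e^{-Ky - i\mu x} & \text{as } x \to -\infty,
\end{cases} \tag{1.7}
\]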

where T and R are the transmission and reflection coefficient, respectively, and are
to be determined.
It may be noted that for the case of normal incidence (α = 0), R and T can be
obtained in closed forms. In fact, [1] solved the normal incidence problem using
complex variable theory and obtained the corresponding reflection and transmission
coefficients in closed forms involving some complicated (but computable) integrals.
However, when α ≠ 0, closed form results cannot be obtained. For this case, [9]
reduced it to the solution of an integral equation involving the unknown difference
of potentials across the plate, the integral equation being solved by an expansion
method similar to the expansion of its kernel involving different orders of sin α.
Later, [10, 11] employed one-term Galerkin approximations involving the exact
solutions for normally incident waves to solve the integral equations involving the
unknown difference of potential across the plate and the unknown horizontal velocity
across the gaps above and below the plate and obtained very accurate upper and lower
bounds for the reflection and transmission coefficients for all angles of incidence and

wave numbers. However, the aforesaid exact solutions of integral equations for nor-
mally incident waves are somewhat complicated, and as such, the upper and lower
bounds for the reflection and transmission coefficients involve complicated integrals.
In the present method, single-term Galerkin approximations in solving the integral
equations are employed, but these approximations involve constant multiplied by
appropriate weight functions. This process produces upper and lower bounds involv-
ing simple integrals which are quite easy to evaluate.

2 Method of Solution

A solution for the velocity potential φ(x, y) satisfying (1.2) and the conditions (1.7)
is given by

\[
\phi(x, y) =
\begin{cases}
T\phi_0(x, y) + \displaystyle\int_0^\infty A(k)\,S(k, y)\,e^{-k_1 x}\,dk, & x > 0, \\[8pt]
\phi_0(x, y) + R\phi_0(-x, y) + \displaystyle\int_0^\infty B(k)\,S(k, y)\,e^{k_1 x}\,dk, & x < 0,
\end{cases} \tag{2.1}
\]

where k1 = (k² + ν²)^{1/2} with k1 = k when ν = 0 and

S(k, y) = k cos(ky) − K sin(ky). (2.2)

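Let

\[
f(y) = \frac{\partial\phi}{\partial x}(0, y), \quad 0 < y < \infty, \tag{2.3}
\]

and

\[
g(y) = \phi(+0, y) - \phi(-0, y), \quad 0 < y < \infty, \tag{2.4}
\]

then

\[
f(y) = 0 \quad \text{for } y \in L \tag{2.5}
\]

and

\[
g(y) = 0 \quad \text{for } y \in \bar{L} = (0, \infty) - L. \tag{2.6}
\]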

The unknown constants R, T and the unknown functions A(k) and B(k) are related
to f (y) and g(y) as given by

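\[
T = 1 - R = -\frac{2iK}{\mu}\int_{\bar{L}} f(y)\,e^{-Ky}\,dy, \tag{2.7}
\]
\[
A(k) = -B(k) = -\frac{2}{\pi k_1 (k^2 + K^2)}\int_{\bar{L}} f(y)\,S(k, y)\,dy, \tag{2.8}
\]
\[
R = -K\int_{L} g(y)\,e^{-Ky}\,dy, \tag{2.9}
\]
\[
A(k) = \frac{1}{\pi (k^2 + K^2)}\int_{L} g(y)\,S(k, y)\,dy. \tag{2.10}
\]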

In deriving relations (2.7) and (2.8), the condition (2.5) and in deriving the relations
(2.9) and (2.10) the condition (2.6) have been utilized in the appropriate Havelock
[5] inversion formula.
Use of the condition (1.4) in the form

\[
\frac{\partial\phi}{\partial x}(\pm 0, y) = 0, \quad y \in L
\]

in the representation (2.1) for φ(x, y) produces an integral equation for g(y) as given
by

\[
\int_{L} g(u)\,M(y, u)\,du = \pi i\mu (1 - R)\,e^{-Ky}, \quad y \in L, \tag{2.11}
\]

where

\[
M(y, u) = \lim_{\epsilon \to +0}\int_0^\infty \frac{k_1\,S(k, y)\,S(k, u)}{k^2 + K^2}\,e^{-\epsilon k}\,dk, \tag{2.12}
\]

the exponential term being introduced to ensure convergence of the integral. In the
relation (2.12), we note that M(y, u) is a real and symmetric function of y and u.
Again, as φ(x, y) is continuous across the gap, use of the representation (2.1)
along with the relation (2.8) produces an integral equation for f(y) as given by

\[
\int_{\bar{L}} f(u)\,N(y, u)\,du = -\frac{\pi}{2}\,R\,e^{-Ky}, \quad y \in \bar{L}, \tag{2.13}
\]

where

\[
N(y, u) = \int_0^\infty \frac{S(k, y)\,S(k, u)}{k_1 (k^2 + K^2)}\,dk, \tag{2.14}
\]

so that N(y, u) is also a real and symmetric function of y and u.

Let us write

\[
F(y) = -\frac{2}{\pi R}\,f(y), \quad y \in \bar{L}, \tag{2.15}
\]
\[
G(y) = \frac{1}{\pi i\mu (1 - R)}\,g(y), \quad y \in L, \tag{2.16}
\]

then G(y) and F(y) satisfy the integral equations

\[
\int_{L} G(u)\,M(y, u)\,du = e^{-Ky}, \quad y \in L, \tag{2.17}
\]
\[
\int_{\bar{L}} F(u)\,N(y, u)\,du = e^{-Ky}, \quad y \in \bar{L}. \tag{2.18}
\]

It may be noted that the functions G(y) and F(y) in (2.17) and (2.18), respectively, must
be real.
The relations (2.7) and (2.9) are now recast as

\[
\int_{\bar{L}} F(y)\,e^{-Ky}\,dy = C, \tag{2.19}
\]

and

\[
\int_{L} G(y)\,e^{-Ky}\,dy = \frac{1}{\pi^2 K^2 C}, \tag{2.20}
\]

where

\[
C = \frac{1 - R}{i\pi R}\cos\alpha. \tag{2.21}
\]
It is important to note that C is real.
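Relation (2.21) can be inverted for R once a real value of C is known, and T then follows from T = 1 − R in (2.7). The sketch below does this with complex arithmetic and confirms that any real C gives |R|² + |T|² = 1. The function name and the sample values of C and α are illustrative assumptions.

```python
import math

def coefficients_from_C(C, alpha):
    """Invert C = (1 - R) cos(alpha) / (i*pi*R) for R, then use T = 1 - R."""
    R = 1.0 / (1.0 + 1j * math.pi * C / math.cos(alpha))
    T = 1.0 - R
    return R, T

# Hypothetical sample values: C = 0.3, angle of incidence 45 degrees.
R, T = coefficients_from_C(0.3, math.radians(45.0))
print(abs(R), abs(T), abs(R) ** 2 + abs(T) ** 2)
```

Because C is real, |R| = (1 + π²C² sec²α)^{−1/2} decreases monotonically as C grows, which is why bounds on C translate directly into bounds on |R| and |T|.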

3 Upper and Lower Bounds for C

Following [2], we define an inner product

\[
\langle f, g\rangle = \int_{L} f(y)\,g(y)\,dy. \tag{3.1}
\]

Then, obviously, $\langle f, g\rangle$ is symmetric and linear. Also, the operator M
defined by

\[
(Mg)(y) = \langle M(y, u), g(u)\rangle \tag{3.2}
\]

is linear, self-adjoint, and positive semi-definite. For the solution of the integral
equation (2.17), we choose a single-term Galerkin approximation as given by

\[
G(y) \approx \lambda_1\,g(y), \quad y \in L, \tag{3.3}
\]

where λ1 is an unknown constant and g(y) is to be chosen suitably. Then,

\[
\lambda_1 = \frac{\langle g(y), e^{-Ky}\rangle}{\langle g(y), (Mg)(y)\rangle}. \tag{3.4}
\]

Hence, using the approximate solution (3.3) for G(y) in the relation (2.20), we find

\[
\frac{1}{\pi^2 K^2 C} = \langle G(y), e^{-Ky}\rangle \ge \langle \lambda_1 g(y), e^{-Ky}\rangle, \tag{3.5}
\]

after using the same argument as in [2]. Thus, we find that

\[
C \le A, \tag{3.6}
\]

where A is an upper bound of C and is given by

\[
A = \frac{1}{\pi^2 K^2}\,\frac{\langle g(y), (Mg)(y)\rangle}{\left(\langle g(y), e^{-Ky}\rangle\right)^2}. \tag{3.7}
\]
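The inequality in (3.5) has a simple finite-dimensional analogue: for a symmetric positive definite matrix M and the exact solution G of MG = e, any trial vector g satisfies ⟨G, e⟩ ≥ ⟨g, e⟩² / ⟨g, Mg⟩. The sketch below illustrates this with a 2 × 2 matrix. It is only an illustration of the single-term Galerkin bound, not the argument of [2] itself, and the matrix, vectors, and function names are made up for the example.

```python
def dot(u, v):
    """Discrete analogue of the inner product (3.1)."""
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    """Discrete analogue of the operator in (3.2)."""
    return [dot(row, v) for row in M]

def galerkin_lower_bound(M, e, g):
    """<lambda_1 g, e> with lambda_1 = <g, e> / <g, M g>, as in (3.4)."""
    return dot(g, e) ** 2 / dot(g, matvec(M, g))

M = [[2.0, 1.0], [1.0, 3.0]]   # symmetric positive definite (illustrative)
e = [1.0, 1.0]
G = [0.4, 0.2]                 # exact solution of M G = e
exact = dot(G, e)

for trial in ([1.0, 0.0], [1.0, 1.0], G):
    print(trial, galerkin_lower_bound(M, e, trial), "<=", exact)
```

Equality is reached only when the trial vector is proportional to the exact solution, which is why a well-chosen single basis function can already give tight bounds.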

Again, if we define another inner product by

\[
\langle\langle f, g\rangle\rangle = \int_{\bar{L}} f(y)\,g(y)\,dy \tag{3.8}
\]

and another operator N by

\[
(Nf)(y) = \langle\langle N(y, u), f(u)\rangle\rangle, \tag{3.9}
\]

then it is obvious that $\langle\langle f, g\rangle\rangle$ is linear and symmetric, and the operator N is
linear, self-adjoint, and positive semi-definite.
For the solution of the integral Eq. (2.18), we choose a single-term Galerkin
approximation as

\[
F(y) \approx \lambda_2\,f(y), \quad y \in \bar{L}, \tag{3.10}
\]

where λ2 is an unknown constant and f(y) is to be chosen suitably. Then

\[
\lambda_2 = \frac{\langle\langle f(y), e^{-Ky}\rangle\rangle}{\langle\langle f(y), (Nf)(y)\rangle\rangle}. \tag{3.11}
\]

Hence, using the approximate solution (3.10) for F(y) in the relation (2.19), we find
that

\[
C = \langle\langle F(y), e^{-Ky}\rangle\rangle \ge \langle\langle \lambda_2 f(y), e^{-Ky}\rangle\rangle \tag{3.12}
\]

after using the same argument as in [2]. Thus, we find that

\[
C \ge B, \tag{3.13}
\]

where B is a lower bound of C and is given by

\[
B = \frac{\left(\langle\langle f(y), e^{-Ky}\rangle\rangle\right)^2}{\langle\langle f(y), (Nf)(y)\rangle\rangle}. \tag{3.14}
\]

Hence, for the unknown real constant C, we find

\[
B \le C \le A, \tag{3.15}
\]

where A and B are given by (3.7) and (3.14), respectively. Thus, upper and lower
bounds for |R| and |T| are obtained as

\[
R_1 \le |R| \le R_2, \quad T_1 \le |T| \le T_2, \tag{3.16}
\]

where

\[
R_1 = \frac{1}{\left(1 + \pi^2 A^2 \sec^2\alpha\right)^{1/2}}, \quad
R_2 = \frac{1}{\left(1 + \pi^2 B^2 \sec^2\alpha\right)^{1/2}}, \tag{3.17}
\]
\[
T_1 = \frac{\pi B \sec\alpha}{\left(1 + \pi^2 A^2 \sec^2\alpha\right)^{1/2}}, \quad
T_2 = \frac{\pi A \sec\alpha}{\left(1 + \pi^2 B^2 \sec^2\alpha\right)^{1/2}}. \tag{3.18}
\]
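Once A and B are available, the bounds (3.17) and (3.18) are direct to evaluate. The following sketch implements them as written; the function name and the sample values of A, B, and α are illustrative assumptions, not results from the paper.

```python
import math

def coefficient_bounds(A, B, alpha):
    """Return (R1, R2, T1, T2) from (3.17)-(3.18), given B <= C <= A."""
    sec = 1.0 / math.cos(alpha)
    R1 = 1.0 / math.sqrt(1.0 + (math.pi * A * sec) ** 2)
    R2 = 1.0 / math.sqrt(1.0 + (math.pi * B * sec) ** 2)
    T1 = math.pi * B * sec * R1
    T2 = math.pi * A * sec * R2
    return R1, R2, T1, T2

# Hypothetical bounds on C that differ in the third significant digit.
R1, R2, T1, T2 = coefficient_bounds(A=0.502, B=0.500, alpha=math.radians(15.0))
print(R1, R2, T1, T2)
```

When A and B coincide, the pairs (R1, T1) and (R2, T2) collapse to a single pair that satisfies the energy identity exactly, so the gap between the bounds is a direct measure of the error in the one-term approximation.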

Here L = (a, b) so that $\bar{L}$ = (0, a) + (b, ∞). The function g(y) in (3.3) is chosen as

\[
g(y) = \frac{1}{b}\sqrt{(y - a)(b - y)}, \quad a < y < b. \tag{3.19}
\]

After substituting g(y) in (3.7), A is obtained as

\[
A = \frac{1}{\pi^2 K^2}\,\frac{\displaystyle\int_0^\infty \frac{k_1}{k^2 + K^2}\left[k\,p(a, b, k) - K\,q(a, b, k)\right]^2 dk}{\left(r(a, b, K)\right)^2}, \tag{3.20}
\]

where

\[
p(a, b, k) = \frac{1}{b}\int_a^b \sqrt{(y - a)(b - y)}\,\cos(ky)\,dy,
\]
\[
q(a, b, k) = \frac{1}{b}\int_a^b \sqrt{(y - a)(b - y)}\,\sin(ky)\,dy,
\]
\[
r(a, b, K) = \frac{1}{b}\int_a^b e^{-Ky}\sqrt{(y - a)(b - y)}\,dy.
\]

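Again, we choose f(y) in (3.10) as

\[
f(y) =
\begin{cases}
\sqrt{\dfrac{a}{a - y}}, & 0 < y < a, \\[10pt]
e^{-Ky}\sqrt{\dfrac{b}{y - b}}, & b < y < \infty.
\end{cases} \tag{3.21}
\]

After substituting this f(y) in the expression (3.14), B is obtained as

\[
B = \frac{\left[M(Ka) + N(Kb)\right]^2}{\displaystyle\int_0^\infty \frac{1}{k_1 (k^2 + K^2)}\left[k\,U(a, k) - K\,V(a, k) + k\,W(b, k, K) - K\,X(b, k, K)\right]^2 dk}, \tag{3.22}
\]

where

\[
M(Ka) = \int_0^a e^{-Ky}\sqrt{\frac{a}{a - y}}\,dy, \qquad
N(Kb) = \int_b^\infty e^{-2Ky}\sqrt{\frac{b}{y - b}}\,dy,
\]
\[
U(a, k) = \int_0^a \sqrt{\frac{a}{a - y}}\,\cos(ky)\,dy, \qquad
V(a, k) = \int_0^a \sqrt{\frac{a}{a - y}}\,\sin(ky)\,dy,
\]
\[
W(b, k, K) = \int_b^\infty e^{-Ky}\sqrt{\frac{b}{y - b}}\,\cos(ky)\,dy, \qquad
X(b, k, K) = \int_b^\infty e^{-Ky}\sqrt{\frac{b}{y - b}}\,\sin(ky)\,dy.
\]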

The integrals appearing in (3.20) and (3.22) are simple to evaluate numerically.
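As an illustration of how such an integral might be evaluated, the sketch below computes M(Ka) with Simpson's rule. It assumes that the weight in f(y) on (0, a) is the inverse square root (a/(a − y))^{1/2} suggested by the edge condition (1.6); under that assumption the substitution y = a − s² removes the endpoint singularity, giving M(Ka) = 2√a ∫₀^{√a} e^{−K(a − s²)} ds. The function names and the choice of Simpson's rule are illustrative, not the authors' procedure.

```python
import math

def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(lo + i * h)
    return total * h / 3.0

def M_integral(K, a):
    """Integral of exp(-K y) * sqrt(a / (a - y)) over (0, a), via y = a - s**2."""
    root_a = math.sqrt(a)
    return 2.0 * root_a * simpson(lambda s: math.exp(-K * (a - s * s)), 0.0, root_a)

print(M_integral(0.0, 0.5))   # K = 0 reduces to the elementary value 2a
print(M_integral(1.0, 0.5))
```

For K = 0 the integral equals 2a exactly, which gives a convenient sanity check before the same substitution is applied to the oscillatory integrals U and V.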

4 Discussion of Numerical Results

Numerical estimates for the upper (R2 or T2) and lower (R1 or T1) bounds of the
reflection and transmission coefficients |R| and |T| are obtained for different values
of the various parameters. In Table 1, the lower and upper bounds of |R| for different
values of various parameters are presented. From this table, it is seen that the two
bounds of |R| coincide up to 3–4 decimal places. Similar results are found for the two
bounds for |T|, which are not given here. The average of the upper and lower bounds
of |R| (|T|) thus produces a fairly good numerical estimate for |R| (|T|). The numerical
results obtained by the present method satisfy the energy identity |R|² + |T|² = 1.
This provides a check on the correctness of the results obtained here. Because of
the energy identity, we confine our attention to |R| only. In Fig. 1, |R| is depicted
against the wave number Kb (= σ²b/g) for different values of μ (= a/b) and for normal
incidence (α = 0). From Fig. 1, it is seen that the curve of |R| corresponding to
normal incidence almost coincides with the curve of |R| in Fig. 2 of [1]. This provides
another check on the correctness of the results obtained using the present method.
The geometrical significance of the limiting case μ = 0 is that the plate intersects the free
surface. This indicates that the submerged plate behaves like a partially immersed barrier
in deep water. For each finite μ, |R| first increases to a maximum as Kb increases and
then decreases to zero for further increases of Kb. Thus, for each finite value of
μ, |R| → 0 as Kb → ∞.
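The averaging of bounds described above can be sketched as follows, using three (R1, R2) pairs copied from Table 1. The function name and the dictionary layout are illustrative; the half-width of each interval is reported as a simple error bar.

```python
def estimate_from_bounds(lower, upper):
    """Return the midpoint of [lower, upper] and its half-width."""
    return 0.5 * (lower + upper), 0.5 * (upper - lower)

# (Kb, alpha in degrees) -> (R1, R2), values taken from Table 1.
table_rows = {
    (0.4, 15): (0.017022, 0.017051),
    (0.8, 45): (0.027287, 0.027352),
    (3.0, 15): (0.021992, 0.021999),
}

for (Kb, alpha_deg), (R1, R2) in table_rows.items():
    mid, half_width = estimate_from_bounds(R1, R2)
    print(f"Kb={Kb}, alpha={alpha_deg} deg: |R| = {mid:.6f} +/- {half_width:.6f}")
```

For these rows the half-width is of order 10⁻⁵, consistent with the statement that the two bounds agree to three or four decimal places.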
In Figs. 2 and 3, |R| is plotted against the wavenumber Kb for different incident
angles and for μ = 0.05 and 0.1, respectively. From these figures, it is seen that the
curve of |R| almost coincides with the curve of |R| in Figs. 2 and 3 of [11]. Further,

Table 1 Lower and upper bounds of the reflection coefficient |R| for various values of the parameters
Kb and α, with μ (= a/b) = 0.5

Kb      α = 15°               α = 45°               α = 75°               α = 85°
        R1        R2          R1        R2          R1        R2          R1        R2
0.05    0.000442  0.000442    0.000346  0.000372    0.000118  0.000119    0.000043  0.000046
0.4     0.017022  0.017051    0.012437  0.012473    0.004553  0.004563    0.001508  0.001552
0.8     0.037450  0.037481    0.027287  0.027352    0.009962  0.009981    0.003352  0.003377
1.6     0.045591  0.045599    0.032631  0.032669    0.001170  0.011733    0.003920  0.003991
2.4     0.032430  0.032445    0.022473  0.022680    0.007881  0.00794     0.002663  0.002699
3.0     0.021992  0.021999    0.014773  0.014796    0.005074  0.005290    0.001603  0.001692
Fig. 1 Graph of |R| versus Kb for different values of μ (= a/b); curves shown for μ = 0, 0.01, 0.05, and 0.25

Fig. 2 Graph of |R| versus Kb for different values of α and μ = 0.05; curves shown for α = 15°, 45°, 75°, and 85°

it is also observed that in most of the cases the values in Table 1 coincide up to 3 to 4 decimal places
with Table 1 of [11]. This provides another check on the correctness of the results
obtained using the present method. The reflection coefficient |R| first increases as Kb
increases and then decreases for further increases of Kb in Figs. 2 and 3. Also, in
Fig. 4, the curve for |R| is depicted against α for different Kb and for μ = 0.5. From
Fig. 4 and from Table 1, it is seen that for fixed μ, |R| decreases as α increases from
0° to 90°. This is to be expected, since the incident wave then almost grazes along the plate.
Fig. 3 Graph of |R| versus Kb for different values of α and μ = 0.1; curves shown for α = 15°, 45°, 75°, and 85°

Fig. 4 Graph of |R| versus α for different values of Kb and μ = 0.5; curves shown for Kb = 0.05, 0.4, 0.8, 1.6, 2.4, and 3.0

5 Conclusion

Here, we have used Havelock's expansion of the water wave potential for the problem
of water wave scattering by a submerged plate to reduce the problem to the solution
of a pair of integral equations involving the difference of potentials and the horizontal
component of velocity across the barriers. These integral equations are solved by
using a single-term Galerkin technique involving a constant as basis. Numerical
evaluations of upper and lower bounds for the reflection coefficients are seen to be very
close. Their averages give actual values of reflection coefficients for all practical
purposes. The present method produces numerical results which are in good agreement
with the earlier results obtained by [11].

Acknowledgements The first author acknowledges financial support from UGC, New Delhi. This
work is also partially supported by SERB through the research project no. EMR/2016/005315.

References

1. Evans, D.V.: Diffraction of water waves by a submerged vertical plate. J. Fluid Mech. 40,
433–451 (1970)
2. Evans, D.V., Morris, A.C.N.: The effect of a fixed vertical barrier on oblique incident surface
waves in deep water. J. Inst. Maths. Applics. 9, 198–204 (1972)
3. Faulkner, T.R.: The diffraction of an obliquely incident surface wave by a submerged plane
barrier. ZAMP 17, 699–707 (1965)
4. Faulkner, T.R.: The diffraction of an obliquely incident surface wave by a vertical barrier of
finite depth. Proc. Camb. Phil. Soc. 62, 829–38 (1966)
5. Havelock, T.H.: Forced surface waves on water. Phil. Mag. 8, 569–576 (1929)
6. Jarvis, R.J., Taylor, B.S.: The scattering of surface waves by a vertical plane barrier. Proc.
Camb. Phil. Soc. 66, 417–22 (1969)
7. Mandal, B.N., Goswami, S.K.: A note on the scattering of surface wave obliquely incident on
a submerged fixed vertical barrier. J. Phys. Soci. Jpn. 53(9), 2980–2987 (1984a)
8. Mandal, B.N., Goswami, S.K.: A note on the diffraction of an obliquely incident surface wave
by a partially immersed fixed vertical barrier. App. Sci. Res. 40, 345–353 (1983)
9. Mandal, B.N., Goswami, S.K.: The scattering of an obliquely incident surface wave by a
submerged fixed vertical plate. J. Math. Phys 25, 1780–1783 (1984)
10. Mandal, B.N., Dolai, D.P.: Oblique water wave diffraction by thin vertical barrier in water of
uniform finite depth. App. Ocean. Res. 16, 195–203 (1994)
11. Mandal, B.N., Das, P.: Oblique diffraction of surface waves by a submerged vertical plate. J.
Engng. Math. 30, 459–470 (1996)
Chapter 19
A Note on Necessary Condition for Lp
Multipliers with Power Weights

Rajib Haloi

Abstract In this article, we prove a necessary condition for Lp multipliers, 1 < p ≤

2. The results are obtained by the use of Hausdorff–Young inequality that generalizes
the result available for p = 2.

Keywords Fourier transform · Schwartz functions · Ap Weights

Hausdorff–Young inequality

AMS Subject Classification (2010): 42A38 · 26D15 · 42B10

1 Introduction

Let $L^p(\mathbb{R}, |x|^\alpha dx)$, 1 < p ≤ 2, α ≥ 0, denote the space of all measurable functions f
on R such that

\[
\int_{\mathbb{R}} |f(x)|^p\,|x|^\alpha\,dx < \infty.
\]

We prove a necessary condition for $L^p(\mathbb{R}, |x|^\alpha dx)$, 1 < p ≤ 2, α ≥ 0, multipliers. Let
$S_{0,0}(\mathbb{R})$ be the space of all Schwartz functions whose Fourier transform has compact
support not including the origin. We note that $S_{0,0}(\mathbb{R})$ is dense in $L^p(\mathbb{R}, |x|^\alpha dx)$ (see
[3]). For $f \in S_{0,0}(\mathbb{R})$, the multiplier operator is defined as

\[
T_m(f) = (m\hat{f})^{\vee}, \tag{1.1}
\]

R. Haloi (B)
Department of Mathematical Sciences, Tezpur University,
Napaam, Tezpur, Sonitpur 784028, Assam, India
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2018 237

which has a continuous extension to $L^p(\mathbb{R}, |x|^\alpha dx)$ [2]. Here $\hat{f}$ and $f^{\vee}$ denote the Fourier
transform and the inverse Fourier transform of f, respectively. We begin with the
definition of the multiplier space M(s, λ), which is due to Strichartz [8].
Definition 1.1 Let [λ] denote the greatest integer less than or equal to λ. For λ > 0
and 1 ≤ s ≤ ∞, we define M(s, λ) to be the set of all m with [λ] weak derivatives
on R \ {0} such that B(m, s, λ) < ∞, where

\[
B(m, s, \lambda) = \|m\|_\infty + \sup_{r>0}\, r^{\lambda - 1/s}\left(\int_{r \le |x| \le 2r} |m^{(\lambda)}(x)|^s\,dx\right)^{1/s}
\]

if λ is an integer, and

\[
B(m, s, \lambda) = \|m\|_\infty + \sup_{r>0}\, r^{\lambda - 1/s}\left(\int_{r \le |x| \le 2r}\int_{r \le |y| \le 2r} \frac{|m^{(k)}(x) - m^{(k)}(y)|^s}{|x - y|^{1 + s(\lambda - k)}}\,dy\,dx\right)^{1/s}
\]

if λ is not an integer with k = [λ].

The characterization of the multiplier space for the multipliers defined on
$L^2(\mathbb{R}, |x|^{2\alpha} dx)$ is done by Muckenhoupt et al. [3]. We state the following theorems
due to Muckenhoupt et al. [3].
Theorem 1.1 [3] If α > 1/2, 2α is not an odd integer, and m(x) is in M(2, α),
then

\[
\|(m\hat{f})^{\wedge}\|_{2,2\alpha} \le C\,B(m, 2, \alpha)\,\|f\|_{2,2\alpha}
\]

for every $f \in S_{0,0}(\mathbb{R})$, where C depends only on α.

The sufficient part for α < −1/2 is established by a duality argument. Again, for
0 < |α| < 1/2, the characterization of the multiplier space is established in terms of
Riesz capacity [1]. Then, the characterization for 0 < |α| < 1/2 is used to prove the
remaining sufficiency part in [3].

Theorem 1.2 [3] If α ≥ 0, m(x) is locally integrable on R \ {0} and

\[
\|(m\hat{f})^{\wedge}\|_{2,2\alpha} \le A\,\|f\|_{2,2\alpha}
\]

for all $f \in S_{0,0}(\mathbb{R})$, then m is in M(2, α) and there is a constant C, depending only
on α, such that

\[
B(m, 2, \alpha) \le CA.
\]

Further, Muckenhoupt et al. [5] proved the following sufficient condition that
extends the results in [3] to values of p ≠ 2.
19 A Note on Necessary Condition for Lp Multipliers with Power Weights 239

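Theorem 1.3 [5] Let $1 < p \le 2$ and $\lambda > \frac{1}{p'}$. If $\alpha \in \mathbb{R}$ is such that $-1 < \alpha < -1 + p(\lambda + \frac{1}{2})$
and $(\alpha + 1)/p$ is not an integer, then for $f$ in $S_{0,0}$ and $m \in M(p', \lambda)$, we have
\[
\int_{-\infty}^{\infty} |(m\hat{f})^{\vee}|^{p} |x|^{\alpha}\, dx \le C\, B(m, p', \lambda)^{p} \int_{-\infty}^{\infty} |f(x)|^{p} |x|^{\alpha}\, dx,
\]
where $C$ is a constant independent of $m$ and $f$.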

However, there is no known result on necessary conditions for the multiplier space
on $L^{p}(\mathbb{R}, |x|^{\alpha}dx)$ for $p \ne 2$. We prove the following necessary condition for the
multiplier operator defined in (1.1) in terms of the space $M(p', \lambda)$.
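Theorem 1.4 If $\lambda \ge 0$, $1 < p \le 2$, $m \in L^{1}_{loc}(\mathbb{R} \setminus \{0\})$, and
\[
\int_{-\infty}^{\infty} |(m\hat{f})^{\vee}|^{p} |x|^{p\lambda}\, dx \le A^{p} \int_{-\infty}^{\infty} |f(x)|^{p} |x|^{p\lambda}\, dx \tag{1.2}
\]
for all $f \in S_{0,0}$, then $m \in M(p', \lambda)$, and there exists a constant $C$ depending only on
$p$ and $\lambda$ such that
\[
B(m, p', \lambda) \le CA.
\]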

2 Lemmas

In this section, we prove two important lemmas that are used to prove Theorem 1.4.
The following lemma is analogous to a proposition by Stein [7, Proposition 4, page
139].

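Lemma 2.1 If $f \in L^{p}(\mathbb{R})$ for $1 < p \le 2$, $\frac{1}{p} + \frac{1}{p'} = 1$, and $0 < \alpha < 1$, then
\[
\left( \int_{\mathbb{R}} \int_{\mathbb{R}} \frac{|\hat{f}(x) - \hat{f}(y)|^{p'}}{|x - y|^{1 + p'\alpha}}\, dy\, dx \right)^{1/p'} \le C \left( \int_{\mathbb{R}} |f(x)|^{p} |x|^{p\alpha}\, dx \right)^{1/p}
\]
for some constant $C$ independent of $f$.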

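Proof By a change of variable, we get
\[
\left( \int_{\mathbb{R}} \int_{\mathbb{R}} \frac{|\hat{f}(x) - \hat{f}(y)|^{p'}}{|x - y|^{1 + p'\alpha}}\, dy\, dx \right)^{1/p'} = \left( \int_{\mathbb{R}} |t|^{-(1 + p'\alpha)} \int_{\mathbb{R}} |\hat{f}(y + t) - \hat{f}(y)|^{p'}\, dy\, dt \right)^{1/p'}.
\]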
240 R. Haloi

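Using the Hausdorff–Young inequality, we obtain
\begin{align*}
&\left( \int_{\mathbb{R}} |t|^{-(1 + p'\alpha)} \int_{\mathbb{R}} |\hat{f}(y + t) - \hat{f}(y)|^{p'}\, dy\, dt \right)^{1/p'} \\
&\quad \le \left( \int_{\mathbb{R}} |t|^{-(1 + p'\alpha)} \left( \int_{\mathbb{R}} |f(y)(e^{iyt} - 1)|^{p}\, dy \right)^{p'/p} dt \right)^{1/p'} \\
&\quad = \left( \left( \int_{\mathbb{R}} |t|^{-(1 + p'\alpha)} \left( \int_{\mathbb{R}} |f(y)(e^{iyt} - 1)|^{p}\, dy \right)^{p'/p} dt \right)^{p/p'} \right)^{1/p} \\
&\quad = \left( \left( \int_{\mathbb{R}} \left( \int_{\mathbb{R}} \frac{|f(y)(e^{iyt} - 1)|^{p}}{|t|^{(1 + p'\alpha)p/p'}}\, dy \right)^{p'/p} dt \right)^{p/p'} \right)^{1/p}.
\end{align*}
Again applying Minkowski's integral inequality, we obtain
\begin{align*}
&\left( \int_{\mathbb{R}} |t|^{-(1 + p'\alpha)} \int_{\mathbb{R}} |\hat{f}(y + t) - \hat{f}(y)|^{p'}\, dy\, dt \right)^{1/p'} \\
&\quad \le \left( \int_{\mathbb{R}} \left( \int_{\mathbb{R}} \frac{|f(y)(e^{iyt} - 1)|^{p'}}{|t|^{1 + p'\alpha}}\, dt \right)^{p/p'} dy \right)^{1/p} \\
&\quad = \left( \int_{\mathbb{R}} |f(y)|^{p} \left( \int_{\mathbb{R}} \frac{|e^{iyt} - 1|^{p'}}{|t|^{1 + p'\alpha}}\, dt \right)^{p/p'} dy \right)^{1/p}.
\end{align*}
We note that
\[
\left( \int_{\mathbb{R}} \frac{|e^{iyt} - 1|^{p'}}{|t|^{1 + p'\alpha}}\, dt \right)^{p/p'} = C|y|^{p\alpha}
\]
for $0 < \alpha < 1$, $y \in \mathbb{R}$ and for some constant $C$ [7, page 140]. Thus
\[
\left( \int_{\mathbb{R}} \int_{\mathbb{R}} \frac{|\hat{f}(x) - \hat{f}(y)|^{p'}}{|x - y|^{1 + p'\alpha}}\, dy\, dx \right)^{1/p'} \le C \left( \int_{\mathbb{R}} |f(y)|^{p} |y|^{p\alpha}\, dy \right)^{1/p}.
\]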
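
The scaling identity quoted from [7] in the proof of Lemma 2.1 can be checked directly; the following is a short sketch of that step, with $u$ an auxiliary variable introduced here. For $y \ne 0$, the substitution $u = yt$ gives

\[
\int_{\mathbb{R}} \frac{|e^{iyt} - 1|^{p'}}{|t|^{1 + p'\alpha}}\, dt = |y|^{p'\alpha} \int_{\mathbb{R}} \frac{|e^{iu} - 1|^{p'}}{|u|^{1 + p'\alpha}}\, du .
\]

The last integral is finite because $|e^{iu} - 1| \le \min(|u|, 2)$: near the origin the integrand is at most $|u|^{p'(1 - \alpha) - 1}$, which is integrable since $\alpha < 1$, and at infinity it is at most $2^{p'}|u|^{-1 - p'\alpha}$, which is integrable since $\alpha > 0$. Raising both sides to the power $p/p'$ yields the factor $C|y|^{p\alpha}$.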

Lemma 2.2 If $f \in L^{p}(\mathbb{R})$ for $1 < p \le 2$, $k$ is a nonnegative integer, $\hat{f}$ has a weak
derivative of order $k$ on $\mathbb{R}$ with $\hat{f}^{(k)} \in L^{p'}(\mathbb{R})$, and $k < \alpha < k + 1$, then
\[
\left( \int_{\mathbb{R}} \int_{\mathbb{R}} \frac{|\hat{f}^{(k)}(x) - \hat{f}^{(k)}(y)|^{p'}}{|x - y|^{1 + p'(\alpha - k)}}\, dy\, dx \right)^{1/p'} \le C \left( \int_{\mathbb{R}} |f(x)|^{p} |x|^{p\alpha}\, dx \right)^{1/p},
\]
for some constant $C$, where $\frac{1}{p} + \frac{1}{p'} = 1$.

Proof Using the property $\hat{f}^{(k)}(\xi) = [(-ix)^{k} f]^{\wedge}(\xi)$ of the Fourier transform, valid for
a.e. $\xi$, the proof can be obtained from Lemma 2.1.
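
The reduction to Lemma 2.1 can be sketched as follows, under the assumption (made here for illustration) that $g(x) = (-ix)^{k} f(x)$ belongs to $L^{p}(\mathbb{R})$. Since $\hat{g} = \hat{f}^{(k)}$ and $0 < \alpha - k < 1$, Lemma 2.1 applied to $g$ with exponent $\alpha - k$ gives

\[
\left( \int_{\mathbb{R}} \int_{\mathbb{R}} \frac{|\hat{g}(x) - \hat{g}(y)|^{p'}}{|x - y|^{1 + p'(\alpha - k)}}\, dy\, dx \right)^{1/p'} \le C \left( \int_{\mathbb{R}} |g(x)|^{p} |x|^{p(\alpha - k)}\, dx \right)^{1/p} = C \left( \int_{\mathbb{R}} |f(x)|^{p} |x|^{p\alpha}\, dx \right)^{1/p},
\]

because $|g(x)|^{p} |x|^{p(\alpha - k)} = |f(x)|^{p} |x|^{pk} |x|^{p(\alpha - k)}$.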

3 Proof of the Main Results

In this section, we complete the proof of Theorem 1.4 by proving a sequence of
lemmas. The idea of the proof is based on Muckenhoupt et al. [3].
Lemma 3.1 If we assume (1.2), then there exists a constant $C$ depending only on $p$
and $\lambda$ such that $\|m\|_{\infty} \le CA$.

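Proof We choose $\phi \in C_c^{\infty}(\mathbb{R})$ with $\phi(x) \ge 0$, $\int \phi(x)\,dx = 1$, and
\[
\phi(x) = 1 \ \ \forall\, |x| \le \tfrac{1}{4}, \qquad \phi(x) = 0 \ \ \forall\, |x| \ge \tfrac{1}{2}.
\]
It is given that $m$ is locally integrable on $\mathbb{R} \setminus \{0\}$. Thus, a.e. $x \ne 0$ in $\mathbb{R}$ is a Lebesgue
point for $m$. Let $y \ne 0$ be a Lebesgue point for $m$, and let $r$ be a fixed number such that
$0 < r < |y|/2$. Define
\[
\hat{f}(t) = \frac{1}{r}\, \phi\!\left( \frac{t - y}{r} \right).
\]
Then, $f(t) = e^{iyt} \check{\phi}(rt)$ and $f \in S_{0,0}$. Next, we claim for this $f$ that
\[
|(m\hat{f})^{\vee}(x)| \ge |m(y)|/4 \quad \text{for } |x| \le 1/(4r) \text{ and a.e. } y.
\]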

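Now for a.e. $y$,
\begin{align*}
|(m\hat{f})^{\vee}(x)| &\ge \left| \int_{\mathbb{R}} m(t)\hat{f}(t) e^{ixy}\, dt \right| - \left| \int_{\mathbb{R}} m(t)\hat{f}(t)(e^{ixy} - e^{ixt})\, dt \right| \\
&\ge \left| \int_{\mathbb{R}} m(t)\hat{f}(t)\, dt \right| - \int_{\mathbb{R}} |m(t)|\hat{f}(t)\,|x|\,|y - t|\, dt \\
&\ge \left| \int_{\mathbb{R}} m(t)\hat{f}(t)\, dt \right| - \frac{1}{8} \int_{\mathbb{R}} |m(t)|\hat{f}(t)\, dt \tag{3.1}
\end{align*}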

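if $t$ is such that $\hat{f}(t) \ne 0$, that is, if $|t - y| < r/2$, and if $|x| \le 1/(4r)$. Further, we choose
$\phi \in C_c^{\infty}(\mathbb{R})$ such that
\[
\left| \frac{1}{r} \int_{\mathbb{R}} m(t)\, \phi\!\left( \frac{t - y}{r} \right) dt \right| \ge \frac{|m(y)|}{2}, \tag{3.2}
\]
and
\[
\left| \frac{1}{r} \int_{\mathbb{R}} |m(t)|\, \phi\!\left( \frac{t - y}{r} \right) dt \right| \le 2|m(y)|. \tag{3.3}
\]
Using (3.2) and (3.3) in (3.1), we get
\[
|(m\hat{f})^{\vee}(x)| \ge |m(y)|/4 \tag{3.4}
\]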

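for a.e. $y$, with $|x| \le 1/(4r)$. Integrating over $|x| \le 1/(4r)$, we get from (3.4) that
\begin{align*}
\int_{|x| \le 1/(4r)} |m(y)/4|^{p} |x|^{p\lambda}\, dx &\le \int_{-\infty}^{\infty} |(m\hat{f})^{\vee}|^{p} |x|^{p\lambda}\, dx \\
&\le A^{p} \int_{-\infty}^{\infty} |f(x)|^{p} |x|^{p\lambda}\, dx \\
&= A^{p} \int_{-\infty}^{\infty} |\check{\phi}(rx)|^{p} |x|^{p\lambda}\, dx \\
&= A^{p} D_{p}\, r^{-1 - p\lambda},
\end{align*}
where $D_{p} = \int_{\mathbb{R}} |\check{\phi}(u)|^{p} |u|^{p\lambda}\, du < \infty$ as $\phi \in \mathcal{S}$. Thus, we obtain
\[
|m(y)| \le CA
\]
with $C$ depending only on $p$ and $\lambda$. Thus, the proof follows.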
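
To make the last step of this proof explicit, here is a sketch of the constant, assuming the right-hand side carries the factor $A^{p}$ as in (1.2) and writing $D_{p} = \int_{\mathbb{R}} |\check{\phi}(u)|^{p} |u|^{p\lambda}\, du$. The left-hand integral can be evaluated directly:

\[
\int_{|x| \le 1/(4r)} |x|^{p\lambda}\, dx = \frac{2}{1 + p\lambda}\, (4r)^{-(1 + p\lambda)} .
\]

The powers of $r$ on both sides therefore cancel, and

\[
|m(y)|^{p} \le \frac{(1 + p\lambda)\, 4^{\,1 + p + p\lambda}}{2}\, D_{p}\, A^{p},
\]

which is a bound of the form $|m(y)| \le CA$ with $C$ depending only on $p$, $\lambda$ and the fixed function $\phi$.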

Lemma 3.2 If we assume (1.2), then $m$ has a $k$th weak derivative and $m^{(k)} \in L^{p'}(I)$,
where $I$ is any compact interval in $\mathbb{R}$ not containing 0.

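Proof Let $k = [\lambda]$. For the existence of the weak derivative of $m$, we must show that
there exists $h$ with
\[
\int m(t)\psi^{(k)}(t)\, dt = \int h(t)\psi(t)\, dt
\]
for all $\psi \in C_c^{\infty}(\mathbb{R} \setminus \{0\})$. Choose a sequence $\{f_n\}$ in $S_{0,0}$ such that $\hat{f}_n(\xi) = 1$ for
$1/n \le |\xi| \le n$. Let $\psi \in C_c^{\infty}(\mathbb{R} \setminus \{0\})$ and $n \in \mathbb{N}$ be such that $\mathrm{supp}(\psi)$ is a subset of
$\{1/n \le |x| \le n\}$. Now
\begin{align*}
\int m\psi^{(k)} &= \int m\hat{f}_n \psi^{(k)} \\
&= \int (m\hat{f}_n)^{\vee} (\psi^{(k)})^{\vee} \\
&= \int (m\hat{f}_n)^{\vee} (-it)^{k} \psi^{\vee} \\
&= \int [(m\hat{f}_n)^{\vee} (-it)^{k}]^{\wedge} \psi.
\end{align*}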

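Define
\[
h_n(x) = [(m\hat{f}_n)^{\vee} (-it)^{k}]^{\wedge}(x).
\]
By the Hausdorff–Young inequality
\begin{align*}
\left( \int |h_n(t)|^{p'}\, dt \right)^{1/p'} &= \left( \int |[(m\hat{f}_n)^{\vee} (-ix)^{k}]^{\wedge}(t)|^{p'}\, dt \right)^{1/p'} \\
&\le \left( \int |(m\hat{f}_n)^{\vee}(t)(-it)^{k}|^{p}\, dt \right)^{1/p}. \tag{3.5}
\end{align*}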

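Now
\[
\int |(m\hat{f}_n)^{\vee}(-it)^{k}|^{p}\, dt \le \int_{|t| \ge 1} |(m\hat{f}_n)^{\vee}(-it)^{\lambda}|^{p}\, dt + \int_{|t| \le 1} |(m\hat{f}_n)^{\vee}|^{p}\, dt.
\]
The first term in the last inequality is finite by the hypothesis. The integrand in the
second term is in $\mathcal{S}$, so it follows from inequality (3.5) that $h_n \in L^{p'}$. For $m > n$,
$\hat{f}_m(x) = \hat{f}_n(x)$, so we have
\begin{align*}
0 &= \int_{\mathbb{R}} [m\hat{f}_m - m\hat{f}_n]\psi^{(k)} \\
&= \int_{\mathbb{R}} [(m\hat{f}_m)^{\vee} - (m\hat{f}_n)^{\vee}](-ix)^{k} \psi^{\vee} \\
&= \int_{\mathbb{R}} [h_m - h_n]\psi,
\end{align*}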

which is true for all $\psi \in C_c^{\infty}(\mathbb{R} \setminus \{0\})$ with $\mathrm{supp}(\psi)$ a subset of $\{1/n \le |x| \le n\}$.
This implies that for a.e. $x \in \{x : 1/n \le |x| \le n\}$,
\[
h_m(x) = h_n(x),
\]
and hence, $\{h_n\}$ is a Cauchy sequence in $L^{p'}$. Thus, $\{h_n\}$ has a convergent subsequence
which converges a.e. We again denote this subsequence by $\{h_n\}$. So
\[
h(x) = \lim h_n(x)
\]
is defined for a.e. $x \in \mathbb{R}$ and $h(x) = h_n(x)$ for a.e. $x \in \{x : 1/n \le |x| \le n\}$.

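Now we show that $h$ is the weak derivative of $m$ of order $k$ and $h \in L^{p'}(I)$ for any compact
interval $I$ in $\mathbb{R}$ not containing 0. For $\psi \in C_c^{\infty}(\mathbb{R} \setminus \{0\})$ with $\mathrm{supp}(\psi)$ a subset of
$\{1/n \le |x| \le n\}$, we have
\begin{align*}
\int_{\mathbb{R}} h\psi &= \int_{\mathbb{R}} h_n \psi \\
&= \int_{\mathbb{R}} [(m\hat{f}_n)^{\vee} (-it)^{k}]^{\wedge} \psi \\
&= \int_{\mathbb{R}} m\psi^{(k)}.
\end{align*}
Thus, $m^{(k)}(x) = (-1)^{k} h(x)$. Next, let $I$ be any compact interval not containing 0.
Choose $n \in \mathbb{N}$ such that $I \subseteq \{x : 1/n \le |x| \le n\}$. Then,
\[
\int_{I} |h|^{p'} \le \int_{\{x : 1/n \le |x| \le n\}} |h|^{p'} \le \int_{\mathbb{R}} |h_n|^{p'} < \infty.
\]
This shows that the $k$th order derivative of $m$ is in $L^{p'}$ on any compact interval not
containing the origin.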

Lemma 3.3 If we assume (1.2), then there is a constant $C_1$ depending only on $p$ and
$\lambda$ such that
\[
B(m, p', \lambda) \le C_1 A.
\]

Proof Because of Lemma 3.1, it is enough to show that there exists a constant $C$
independent of $r$ such that
\[
r^{\lambda - 1/p'} \left( \int_{r \le |x| \le 2r} |m^{(\lambda)}(x)|^{p'}\, dx \right)^{1/p'} \le CA
\]
for integer $\lambda$ and
\[
r^{\lambda - 1/p'} \left( \int_{r \le |x| \le 2r} \int_{r \le |y| \le 2r} \frac{|m^{(k)}(x) - m^{(k)}(y)|^{p'}}{|x - y|^{1 + p'(\lambda - k)}}\, dy\, dx \right)^{1/p'} \le CA
\]
for non-integer $\lambda$ with $k = [\lambda]$. We choose $\eta \in C_c^{\infty}$ such that
\[
\eta(x) = 1 \ \ \forall\, |x| \le 3/2, \qquad \eta(x) = 0 \ \ \forall\, |x| \ge 7/4.
\]

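For fixed $r > 0$, define $f$ such that
\[
\hat{f}(x) = \eta\!\left( \frac{x}{r} - 2 \right) + \eta\!\left( \frac{x}{r} + 2 \right)
\]
and so
\[
f(x) = (e^{2irx} + e^{-2irx})\, \check{\eta}(rx).
\]
Then, $\hat{f}(x) = 1$ on $r/2 \le |x| \le 3r$ and $\hat{f}(x) = 0$ on $|x| \le r/4$ and $|x| \ge 4r$. So,
\[
[m(x)\hat{f}(x)]^{(k)} = m^{(k)}(x), \tag{3.6}
\]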

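for a.e. $x \in \{x : r/2 \le |x| \le 3r\}$. We first estimate $B(m, p', \lambda)$ for integer $\lambda$. By the
Hausdorff–Young inequality and the assumption (1.2), we have
\begin{align*}
r^{\lambda - 1/p'} \left( \int_{r \le |x| \le 2r} |m^{(\lambda)}(x)|^{p'}\, dx \right)^{1/p'} &= r^{\lambda - 1/p'} \left( \int_{r \le |x| \le 2r} |(m\hat{f})^{(\lambda)}(x)|^{p'}\, dx \right)^{1/p'} \\
&\le r^{\lambda - 1/p'} \left( \int_{\mathbb{R}} |(m\hat{f})^{\vee}(x)\, |x|^{\lambda}|^{p}\, dx \right)^{1/p} \\
&\le A\, r^{\lambda - 1/p'} \left( \int_{\mathbb{R}} |f(x)|^{p} |x|^{p\lambda}\, dx \right)^{1/p} \\
&\le A D_{p},
\end{align*}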

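where $D_{p} = r^{\lambda - 1/p'} \left( \int_{\mathbb{R}} |f(x)|^{p} |x|^{p\lambda}\, dx \right)^{1/p}$. For non-integer $\lambda$, we use (3.6) and
Lemma 2.2 to obtain
\begin{align*}
&r^{\lambda - 1/p'} \left( \int_{r \le |x| \le 2r} \int_{r \le |y| \le 2r} \frac{|m^{(k)}(x) - m^{(k)}(y)|^{p'}}{|x - y|^{1 + p'(\lambda - k)}}\, dy\, dx \right)^{1/p'} \\
&\quad = r^{\lambda - 1/p'} \left( \int_{r \le |x| \le 2r} \int_{r \le |y| \le 2r} \frac{|(m\hat{f})^{(k)}(x) - (m\hat{f})^{(k)}(y)|^{p'}}{|x - y|^{1 + p'(\lambda - k)}}\, dy\, dx \right)^{1/p'} \\
&\quad \le r^{\lambda - 1/p'} \left( \int_{\mathbb{R}} \int_{\mathbb{R}} \frac{|(m\hat{f})^{(k)}(x) - (m\hat{f})^{(k)}(y)|^{p'}}{|x - y|^{1 + p'(\lambda - k)}}\, dy\, dx \right)^{1/p'} \\
&\quad \le C\, r^{\lambda - 1/p'} \left( \int_{\mathbb{R}} |(m\hat{f})^{\vee}|^{p} |x|^{p\lambda}\, dx \right)^{1/p} \\
&\quad \le AC\, r^{\lambda - 1/p'} \left( \int_{\mathbb{R}} |f(x)|^{p} |x|^{p\lambda}\, dx \right)^{1/p} \\
&\quad = ACD_{p}.
\end{align*}
Here, the second inequality follows from Lemma 2.2 and the third inequality follows
from the hypothesis.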

4 Remark

We note that the condition in Theorem 1.4 cannot be sufficient. We recall the
following result on sufficient conditions established by Muckenhoupt et al. [5] for
$L^{p}(\mathbb{R})$ multipliers with power weights.

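Theorem 4.1 [5] If $1 < p < \infty$, $1 \le s \le \infty$, $\lambda > \max(\frac{1}{s}, |\frac{1}{p} - \frac{1}{2}|)$ or $\lambda = s = 1$,
$m \in M(s, \lambda)$, $\max(-1, -p\lambda, -1 + p(-\lambda + \frac{1}{2})) < \alpha < \min(p\lambda, -1 + p(\lambda + \frac{1}{2}), -1 + p(\lambda + 1 - \frac{1}{s}))$
and $(\alpha + 1)/p$ is not an integer, then for $f \in S_{0,0}$, we have
\[
\int_{-\infty}^{\infty} |(m\hat{f})^{\vee}|^{p} |x|^{\alpha}\, dx \le C\, B(m, s, \lambda)^{p} \int_{-\infty}^{\infty} |f(x)|^{p} |x|^{\alpha}\, dx,
\]
where $C$ is a constant independent of $m$ and $f$.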

Further, Muckenhoupt [6] proved the following necessity conditions on $\alpha$ for $L^{p}$
multipliers with power weights.
Theorem 4.2 [6] If $1 < p < \infty$, $1 \le s \le \infty$, $\lambda \ge \frac{1}{s}$ and assume
\[
\int_{-\infty}^{\infty} |(m\hat{f})^{\vee}|^{p} |x|^{\alpha}\, dx \le C \int_{-\infty}^{\infty} |f(x)|^{p} |x|^{\alpha}\, dx,
\]
for all $m \in M(s, \lambda)$ and $f \in S_{0,0}$, then

(1) $\alpha > -1$,
(2) $\max(-p\lambda, -1 + p(-\lambda + \frac{1}{2})) \le \alpha \le \min(p\lambda, -1 + p(\lambda + \frac{1}{2}), -1 + p(\lambda + 1 - \frac{1}{s}))$,
(3) $(\alpha + 1)/p$ is not an integer.
It is clear that for $1 < p \le 2$ and $\lambda \ge 1/p'$, the result (3) of Theorem 4.2 is not
satisfied. So the only possibility is $0 < \lambda < 1/p'$, which is the $A_p$ range of the weight.
Since the Hilbert transform is bounded in this case, there are non-constant multipliers
[3, page 183]. Thus, we conclude that the condition in Theorem 1.4 cannot be sufficient.
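
The identification of $0 < \lambda < 1/p'$ with the $A_p$ range can be spelled out as follows. This is a short check added here; it relies on the standard fact that a power weight $|x|^{\alpha}$ on $\mathbb{R}$ belongs to $A_p$ exactly when $-1 < \alpha < p - 1$. With the weight $|x|^{p\lambda}$ of Theorem 1.4 and $\lambda > 0$,

\[
p\lambda < p - 1 \iff \lambda < 1 - \frac{1}{p} = \frac{1}{p'} .
\]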

Acknowledgements The author would like to thank Professor Parasar Mohanty and Professor
Sobha Madan for technical discussions. The author acknowledges the financial support
by AICTE-NEQIP, Tezpur University. The author would also like to acknowledge the excellent facilities
of the Indian Institute of Technology Kanpur that were availed during the preparation of the article.

References

1. Dahlberg, B.J.: Regularity properties of Riesz potentials. Indiana Univ. Math. J. 28(2), 257–268
(1979)
2. Grafakos, L.: Classical and Modern Fourier Analysis. Pearson Education, New Jersey (2004)
3. Muckenhoupt, B., Hunt, R., Wheeden, R.: $L^{2}$ multipliers with power weights. Adv. Math. 49,
170–216 (1983)
4. Muckenhoupt, B., Young, W.S.: $L^{p}$ multipliers with weights $|x|^{kp-1}$. Trans. Amer. Math. Soc.
275(2), 623–639 (1983)
5. Muckenhoupt, B., Wheeden, R., Young, W.S.: Sufficiency conditions for $L^{p}$ multipliers with
power weights. Trans. Amer. Math. Soc. 300(2), 433–461 (1987)
6. Muckenhoupt, B.: Necessity conditions for $L^{p}$ multipliers with power weights. Trans. Amer.
Math. Soc. 300(2), 503–520 (1987)
7. Stein, E.M.: Singular Integrals and Differentiability Properties of Functions. Princeton Univer-
sity Press, Princeton (1970)
8. Strichartz, R.S.: Multipliers for spherical harmonic expansions. Trans. Amer. Math. Soc. 167,
115–124 (1972)
Chapter 20
On $M/G^{(a,b)}/1/N$ Queue with Batch
Size- and Queue Length-Dependent
Service

G. K. Gupta and A. Banerjee

Abstract In this paper, we analyze a finite-buffer $M/G/1$ queue in which a single server
offers service in groups/batches according to the 'general bulk service' rule.
The service time is generally distributed and is allowed to
change dynamically depending on the batch size under service and on the queue length just
before the service initiation of the batch under consideration. Using the embedded
Markov chain technique and the supplementary variable technique, we obtain the joint
distribution of the queue length and batch size at various epochs. At the end, we
present several numerical results in the form of self-explanatory tables and graphs to
bring out some interesting features of the model.

Keywords Batch size-dependent queue · Finite buffer · Queue length-dependent queue · Supplementary variable technique · Embedded Markov chain technique · General bulk service rule

1 Introduction

In asynchronous transfer mode (ATM) networks, based on continuous-bit-rate (CBR)
traffic services, packetized voice or video samples are transmitted over the
communication channel. An important issue, called 'congestion', may arise frequently in
packet-switched networks. Generally, a high-rate traffic flow causes congestion in the
system and degrades its performance significantly. Congestion control mechanisms
either prevent congestion before it happens or remove it after it has happened.
A significant amount of the queueing literature focuses on congestion control
mechanisms that regulate service (transmission) rates in communication networks;
see [1, 2] and references therein. In the queueing literature, congestion control is
achieved by controlling either arrival rates or service rates or both.

G. K. Gupta (B) · A. Banerjee
Department of Mathematical Sciences, Indian Institute
of Technology (BHU) Varanasi, Varanasi 221005, Uttar Pradesh, India
e-mail: [email protected]; [email protected]
A. Banerjee
e-mail: [email protected]

Queueing models are useful in many real-world management situations for
optimizing the total system cost while maintaining the quality of service (QoS). Bulk service
queueing systems have wide applications in transportation systems, manufacturing
systems, computer networking systems, and telecommunication systems. Over the past
decades, bulk service queueing systems have received considerable attention from
researchers; see, e.g., [3–5]. It should be noted here that the service time of a batch
containing more customers should be longer than that of a batch containing fewer customers.
Therefore, assuming the service time of a batch to be dependent on the batch size is
more appropriate for describing the congestion control mechanism of a bulk service
queue. The batch size-dependent service queue has recently been studied in [6–11],
etc. However, queue length dependency along with a batch size-dependent service
rate (time) may increase the system's productivity by decreasing the
blocking probability.
In view of the above discussion, we conclude that few studies of bulk service
queues consider service rates that depend on both batch size and queue
length (see [12]). Motivated by the rising interest in batch size-dependent bulk
service queueing models with queue length-dependent service, we devote this paper
to such a model. We consider a finite-buffer $M/G^{(a,b)}/1$ queue, where a single server
serves groups of customers of varying batch size according to the general bulk
service (GBS) rule. The server changes its service times (rates) only at the beginning
of a service, depending on the number of customers taken for service, i.e., the batch
size under service, as well as on the number of customers left in the queue, i.e., the
queue length. We analyze our model using the supplementary variable technique and
the embedded Markov chain technique. The former is used to develop a relation
between the joint distributions of queue content and serving batch size at departure
epoch and arbitrary epoch, and the latter is used to obtain the joint distribution of
queue content and serving batch size at departure epoch.
The rest of this paper is organized as follows. After giving the formal description
of the model in Sect. 2, in Sect. 2.1 we obtain the joint distribution of queue content
and serving batch size at departure epoch by using the embedded Markov chain
technique. Using the supplementary variable technique, we obtain in Sect. 2.2 a relation
between the departure epoch and arbitrary epoch joint distributions of queue content
and serving batch size. Section 3 presents various performance
measures. Several numerical examples are presented in Sect. 4. Some conclusions
are drawn in Sect. 5, followed by the references.

2 Model Description and Steady-State Analysis

We consider a bulk service queue with a single server and a finite buffer. Customers
arrive according to a Poisson process with parameter $\lambda$, and service is
provided in batches according to the "GBS" rule. That is, when the queue length is less than
'a' ($\ge 1$), the server waits until the queue length reaches 'a' and then initiates service for
that group of 'a' customers. If the queue length 'r' satisfies $a \le r \le b$, the entire
group of 'r' customers is served at a time. When the queue length is greater than 'b',
the server takes the first 'b' customers for service and the rest have to wait in the queue.
Further, it is assumed that the queue size is fixed to $N$ ($> b$). The service time of a
batch is considered to be dependent on the size of the batch taken for service as
well as on the queue length (excluding the batch with the server) at the beginning of
the service. Let $S_{n,r}(t)$ ($a \le r \le b$, $0 \le n \le N$) denote the service time distribution
with probability density function (pdf) $s_{n,r}(\cdot)$, Laplace–Stieltjes transform (LST) $s^{*}_{n,r}(\cdot)$ and mean service time $\tilde{s}_{n,r}$
(rate $\mu_{n,r} = 1/\tilde{s}_{n,r}$). Now, define the state space of the system under consideration at
time $t$ as follows.
– $N_q(t)$: the queue length at time $t$.
– $N_s(t)$: the server content at time $t$ when the server is busy.
– $U(t)$: the remaining service time of a batch of customers under service, if any.
The state probabilities, at time $t$, are defined as follows:
– $P_{n,0}(t) \equiv \mathrm{prob.}\{N_q(t) = n, N_s(t) = 0\}$; $0 \le n \le a - 1$,
– $P_{n,r}(x, t)\,dx \equiv \mathrm{prob.}\{N_q(t) = n, N_s(t) = r, x \le U(t) \le x + dx\}$; $0 \le n \le N$,
$a \le r \le b$, $x \ge 0$.
The corresponding steady-state probabilities are defined as follows:
\begin{align*}
\lim_{t \to \infty} P_{n,0}(t) &= P_{n,0}; \quad 0 \le n \le a - 1, \\
\lim_{t \to \infty} P_{n,r}(x, t) &= P_{n,r}(x); \quad 0 \le n \le N,\ a \le r \le b.
\end{align*}
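
The GBS rule described above can be sketched as a small function. This is only an illustration; the function name `gbs_batch` and its return convention are assumptions made here, not notation from the paper.

```python
from typing import Optional, Tuple


def gbs_batch(queue_len: int, a: int, b: int) -> Optional[Tuple[int, int]]:
    """General bulk service rule (illustrative helper, not from the paper).

    Returns (batch size taken for service, queue length left behind),
    or None when fewer than `a` customers wait and the server stays idle.
    """
    if queue_len < a:
        return None                    # server waits until a customers are present
    batch = min(queue_len, b)          # take everyone, but at most b customers
    return batch, queue_len - batch


if __name__ == "__main__":
    for n in (3, 6, 12):
        print(n, gbs_batch(n, a=4, b=9))
```

With $a = 4$ and $b = 9$, a queue of 6 is served entirely, while a queue of 12 leaves 3 customers waiting.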

Our main objective is to obtain the joint distribution of the queue content as well
as server content at departure epoch and arbitrary epoch. In the following sections,
we will proceed to obtain the required distributions by using the embedded Markov
chain technique and supplementary variable technique.

2.1 Joint Probability Distribution at Departure Epoch

This section is devoted to obtaining the steady-state joint probability distribution
of the queue length and the number of customers in the serving batch at departure
epoch. By observing the state of the system at two consecutive batch
departure epochs, we obtain a two-dimensional Markov chain with state space
$\{(n, r) : 0 \le n \le N, a \le r \le b\}$ (see [6]). The corresponding one-step transition
probability matrix (TPM) $P = [p_{i,j}]$ of dimension $(N + 1)(b - a + 1)$, where each
$p_{i,j}$ is a matrix of dimension $(b - a + 1) \times (b - a + 1)$, is given by

\[
P = \begin{array}{c|cccccccc}
 & 0 & 1 & \cdots & N-b-1 & N-b & \cdots & N-1 & N \\ \hline
0 & D_0^{(0,1)} & D_1^{(0,1)} & \cdots & D_{N-b-1}^{(0,1)} & D_{N-b}^{(0,1)} & \cdots & D_{N-1}^{(0,1)} & \bar{D}_N^{(0,1)} \\
\vdots & \vdots & \vdots & & \vdots & \vdots & & \vdots & \vdots \\
a-1 & D_0^{(0,1)} & D_1^{(0,1)} & \cdots & D_{N-b-1}^{(0,1)} & D_{N-b}^{(0,1)} & \cdots & D_{N-1}^{(0,1)} & \bar{D}_N^{(0,1)} \\
a & D_0^{(0,1)} & D_1^{(0,1)} & \cdots & D_{N-b-1}^{(0,1)} & D_{N-b}^{(0,1)} & \cdots & D_{N-1}^{(0,1)} & \bar{D}_N^{(0,1)} \\
a+1 & D_0^{(0,2)} & D_1^{(0,2)} & \cdots & D_{N-b-1}^{(0,2)} & D_{N-b}^{(0,2)} & \cdots & D_{N-1}^{(0,2)} & \bar{D}_N^{(0,2)} \\
\vdots & \vdots & \vdots & & \vdots & \vdots & & \vdots & \vdots \\
b & D_0^{(0,b-a+1)} & D_1^{(0,b-a+1)} & \cdots & D_{N-b-1}^{(0,b-a+1)} & D_{N-b}^{(0,b-a+1)} & \cdots & D_{N-1}^{(0,b-a+1)} & \bar{D}_N^{(0,b-a+1)} \\
b+1 & 0 & D_0^{(1,b-a+1)} & \cdots & D_{N-b-2}^{(1,b-a+1)} & D_{N-b-1}^{(1,b-a+1)} & \cdots & D_{N-2}^{(1,b-a+1)} & \bar{D}_{N-1}^{(1,b-a+1)} \\
\vdots & \vdots & \vdots & & \vdots & \vdots & & \vdots & \vdots \\
N & 0 & 0 & \cdots & 0 & D_0^{(N-b,b-a+1)} & \cdots & D_{b-1}^{(N-b,b-a+1)} & \bar{D}_b^{(N-b,b-a+1)}
\end{array}
\]

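Here, each $0$ and $D_j^{(n,i)}$ in $P$ is a square matrix of dimension $(b - a + 1)$; they are
described as follows:
\begin{align*}
D_j^{(0,i)} &= e_i^{T} \otimes \kappa_j^{(0,i+a-1)}, \quad 1 \le i \le b - a + 1,\ 0 \le j \le N - 1, \\
D_j^{(n,b-a+1)} &= e_{b-a+1}^{T} \otimes \kappa_j^{(n,b)}, \quad 1 \le n \le N - b,\ 0 \le j \le N - n - 1, \\
\bar{D}_N^{(0,i)} &= e_i^{T} \otimes \bar{\kappa}_{N-1}^{(0,i+a-1)}, \quad 1 \le i \le b - a, \\
\bar{D}_j^{(n,b-a+1)} &= e_{b-a+1}^{T} \otimes \bar{\kappa}_{j-1}^{(n,b)}, \quad b \le j \le N,\ 0 \le n \le N - b,\ n + j = N,
\end{align*}
where
– $e_i$ is a column vector of dimension $(b - a + 1)$ with 1 at the $i$th position and 0
elsewhere.
– $\kappa_j^{(n,r)}$ is a column vector of dimension $(b - a + 1)$ consisting of $\xi_j^{(n,r)}$'s, where $\xi_j^{(n,r)}$
is the probability of $j$ arrivals during the service period of a batch of size $r$
($a \le r \le b$) served with service time distribution $S_{n,r}(\cdot)$, and is obtained as
\[
\xi_j^{(n,r)} = \begin{cases}
\displaystyle \int_0^{\infty} \frac{e^{-\lambda t}(\lambda t)^{j}}{j!}\, dS_{0,r}(t), & a \le r \le b - 1, \\[2ex]
\displaystyle \int_0^{\infty} \frac{e^{-\lambda t}(\lambda t)^{j}}{j!}\, dS_{n,b}(t), & r = b,\ 0 \le n \le N - b.
\end{cases}
\]
– $\bar{\kappa}_j^{(n,r)}$ is a column vector of dimension $(b - a + 1)$ consisting of $1 - \sum_{i=0}^{j} \xi_i^{(n,r)}$'s.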
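
For a concrete service distribution the integral defining $\xi_j^{(n,r)}$ has a closed form. The sketch below assumes an Erlang-$k$ service time with mean $1/\mu$ (the numerical section uses $E_4$); in that case the number of Poisson arrivals during a service is negative binomial. The function name `xi_erlang` and its arguments are illustrative choices, not the paper's notation.

```python
from math import comb


def xi_erlang(j: int, lam: float, mu: float, k: int) -> float:
    """P(j Poisson(lam) arrivals during an Erlang-k service with mean 1/mu).

    Assumption: the Erlang-k law has k exponential phases of rate k*mu each,
    so the arrival count is negative binomial with success probability q.
    """
    q = k * mu / (lam + k * mu)
    return comb(j + k - 1, j) * q**k * (1.0 - q) ** j


if __name__ == "__main__":
    # lam = 29 and mu_{0,4} = 9 correspond to the smallest batch in Sect. 4.
    print([round(xi_erlang(j, 29.0, 9.0, 4), 7) for j in range(4)])
```

For $\lambda = 29$, $\mu_{0,4} = 9$ and $k = 4$ this gives $\xi_0 = (36/65)^4 \approx 0.0941$, which agrees with the ratio $p^{+}_{0,4} / \sum_n p^{+}_{n,4} = 0.0361923/0.3846442$ read off Table 1.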

Let us now define the departure epoch joint probability as $p^{+}_{n,r}$
($0 \le n \le N$, $a \le r \le b$), which is the probability that $n$ customers are left in the
queue at the departure epoch of a batch of size $r$. Then, $p^{+}_{n,r}$ can be determined by solving
the system of equations $\pi P = \pi$, where $\pi = (\pi_0^{+}, \pi_1^{+}, \ldots, \pi_N^{+})$ and each $\pi_n^{+}$ ($0 \le n \le N$)
is a row vector of order $(b - a + 1)$, given by $\pi_n^{+} = (p^{+}_{n,a}, p^{+}_{n,a+1}, \ldots, p^{+}_{n,b})$.
Once we obtain the joint probabilities $p^{+}_{n,r}$ ($0 \le n \le N$, $a \le r \le b$), the marginal
distribution of the queue length at departure epoch, represented by $p_n^{+}$, is obtained
as
\[
p_n^{+} = \sum_{r=a}^{b} p^{+}_{n,r}, \quad 0 \le n \le N.
\]
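
The following is a minimal sketch of how $\pi P = \pi$ could be solved numerically. It is not the authors' implementation. It assumes Erlang-$k$ service with mean $1/\mu_{n,r}$ and the service rates of Sect. 4, and it uses the fact that the next transition depends only on the queue length left behind, so the chain is first solved on queue lengths by power iteration and the joint probabilities are recovered afterwards. All function names are hypothetical.

```python
from math import comb


def xi(j, lam, mu, k):
    """P(j Poisson(lam) arrivals during an Erlang-k service with mean 1/mu)."""
    q = k * mu / (lam + k * mu)
    return comb(j + k - 1, j) * q**k * (1.0 - q) ** j


def next_batch(n, a, b, mu):
    """Batch size, queue left waiting and service rate after a departure leaves n."""
    r = min(max(n, a), b)          # wait for a customers if n < a, take at most b
    left = max(n - b, 0)           # customers still queued when service starts
    # Rates assumed as in Sect. 4: mu_{0,r} = (b-r+1)mu, mu_{n,b} = mu + 0.5 n.
    rate = (b - r + 1) * mu if r < b else mu + 0.5 * left
    return r, left, rate


def departure_epoch_joint(a, b, N, lam, mu, k, iters=2000):
    # One-step transition matrix on the queue length seen at departure epochs.
    T = [[0.0] * (N + 1) for _ in range(N + 1)]
    for n in range(N + 1):
        _, left, rate = next_batch(n, a, b, mu)
        for j in range(N - left):
            T[n][left + j] = xi(j, lam, rate, k)
        T[n][N] = 1.0 - sum(T[n][:N])   # arrivals beyond the buffer are lost
    # Power iteration for the stationary row vector p = p T.
    p = [1.0 / (N + 1)] * (N + 1)
    for _ in range(iters):
        p = [sum(p[i] * T[i][m] for i in range(N + 1)) for m in range(N + 1)]
    # Joint probability of leaving m behind at the departure of a batch of size r.
    joint = {(m, r): 0.0 for m in range(N + 1) for r in range(a, b + 1)}
    for n in range(N + 1):
        r = next_batch(n, a, b, mu)[0]
        for m in range(N + 1):
            joint[(m, r)] += p[n] * T[n][m]
    return p, joint


if __name__ == "__main__":
    p, joint = departure_epoch_joint(a=4, b=9, N=25, lam=29.0, mu=1.5, k=4)
    print(round(p[0], 7), round(joint[(0, 4)], 7))
```

Under these assumptions the output can be compared entry by entry with Table 1.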

2.2 Joint Probability Distribution at Arbitrary Epoch

This section is devoted to obtaining the joint distribution of the queue content and server
content at an arbitrary epoch. Toward this end, we first obtain the governing
equations of the system in steady state.
Relating the state of the system at times $t$ and $t + dt$, we find the Kolmogorov
equations of the system, in steady state, as follows.

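\begin{align}
0 &= -\lambda P_{0,0} + \sum_{r=a}^{b} P_{0,r}(0), \tag{1} \\
0 &= -\lambda P_{n,0} + \lambda P_{n-1,0} + \sum_{r=a}^{b} P_{n,r}(0), \quad 1 \le n \le a - 1, \tag{2} \\
-\frac{d}{dx} P_{0,a}(x) &= -\lambda P_{0,a}(x) + \lambda P_{a-1,0}\, s_{0,a}(x) + \sum_{r=a}^{b} P_{a,r}(0)\, s_{0,a}(x), \tag{3} \\
-\frac{d}{dx} P_{0,r}(x) &= -\lambda P_{0,r}(x) + \sum_{k=a}^{b} P_{r,k}(0)\, s_{0,r}(x), \quad a + 1 \le r \le b, \tag{4} \\
-\frac{d}{dx} P_{n,r}(x) &= -\lambda P_{n,r}(x) + \lambda P_{n-1,r}(x), \quad a \le r \le b - 1,\ 1 \le n \le N - 1, \tag{5} \\
-\frac{d}{dx} P_{n,b}(x) &= -\lambda P_{n,b}(x) + \lambda P_{n-1,b}(x) + \sum_{r=a}^{b} P_{n+b,r}(0)\, s_{n,b}(x), \quad 1 \le n \le N - b, \tag{6} \\
-\frac{d}{dx} P_{n,b}(x) &= -\lambda P_{n,b}(x) + \lambda P_{n-1,b}(x), \quad N - b + 1 \le n \le N - 1, \tag{7} \\
-\frac{d}{dx} P_{N,r}(x) &= \lambda P_{N-1,r}(x), \quad a \le r \le b. \tag{8}
\end{align}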
We may note here that the joint probabilities $p^{+}_{n,r}$ and $P_{n,r}(0)$ are proportional to
each other and hence can be written as
\[
p^{+}_{n,r} = \sigma P_{n,r}(0), \quad 0 \le n \le N,\ a \le r \le b, \tag{9}
\]
where $\sigma$ is the proportionality constant and its value is obtained in Lemma 1.

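Lemma 1 The value of the proportionality constant $\sigma$, as appeared in (9), is given by
\[
\sigma^{-1} = \sum_{n=0}^{N} \sum_{r=a}^{b} P_{n,r}(0) = g^{-1} \left( 1 - \sum_{n=0}^{a-1} P_{n,0} \right), \tag{10}
\]
where $g = \tilde{s}_{0,a} \sum_{n=0}^{a} p_n^{+} + \sum_{n=a+1}^{b} p_n^{+} \tilde{s}_{0,n} + \sum_{n=b+1}^{N} p_n^{+} \tilde{s}_{n-b,b}$.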
n=0 n=a+1 n=b+1

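Proof Summing both sides of (9) over the range of $r$ and $n$ and using the result
$\sum_{n=0}^{N} p_n^{+} = 1$, we obtain $\sigma^{-1} = \sum_{n=0}^{N} \sum_{r=a}^{b} P_{n,r}(0)$.

Now multiplying (3)–(8) by $e^{-\theta x}$ and integrating with respect to $x$ from 0 to $\infty$,
we obtain
\begin{align}
(\lambda - \theta) P^{*}_{0,a}(\theta) &= \lambda P_{a-1,0}\, s^{*}_{0,a}(\theta) + \sum_{r=a}^{b} P_{a,r}(0)\, s^{*}_{0,a}(\theta) - P_{0,a}(0), \tag{11} \\
(\lambda - \theta) P^{*}_{0,r}(\theta) &= \sum_{k=a}^{b} P_{r,k}(0)\, s^{*}_{0,r}(\theta) - P_{0,r}(0); \quad a + 1 \le r \le b, \tag{12} \\
(\lambda - \theta) P^{*}_{n,r}(\theta) &= \lambda P^{*}_{n-1,r}(\theta) - P_{n,r}(0); \quad a \le r \le b - 1,\ 1 \le n \le N - 1, \tag{13} \\
(\lambda - \theta) P^{*}_{n,b}(\theta) &= \lambda P^{*}_{n-1,b}(\theta) + \sum_{r=a}^{b} P_{n+b,r}(0)\, s^{*}_{n,b}(\theta) - P_{n,b}(0); \quad 1 \le n \le N - b, \tag{14} \\
(\lambda - \theta) P^{*}_{n,b}(\theta) &= \lambda P^{*}_{n-1,b}(\theta) - P_{n,b}(0); \quad N - b + 1 \le n \le N - 1, \tag{15} \\
-\theta P^{*}_{N,r}(\theta) &= \lambda P^{*}_{N-1,r}(\theta) - P_{N,r}(0); \quad a \le r \le b, \tag{16}
\end{align}
where
\begin{align*}
\int_0^{\infty} e^{-\theta x} P_{n,r}(x)\, dx &= P^{*}_{n,r}(\theta); \quad 0 \le n \le N,\ a \le r \le b,\ \theta \ge 0, \\
\int_0^{\infty} e^{-\theta x} s_{n,r}(x)\, dx &= s^{*}_{n,r}(\theta); \quad 0 \le n \le N,\ a \le r \le b,\ \theta \ge 0, \\
P_{n,r} \equiv P^{*}_{n,r}(0) &= \int_0^{\infty} P_{n,r}(x)\, dx.
\end{align*}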
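
As a worked instance of this transform step, here is a sketch for Eq. (5), assuming $e^{-\theta x} P_{n,r}(x) \to 0$ as $x \to \infty$. Integration by parts gives

\[
\int_0^{\infty} e^{-\theta x} \left( -\frac{d}{dx} P_{n,r}(x) \right) dx = P_{n,r}(0) - \theta P^{*}_{n,r}(\theta),
\]

so the transform of (5) reads $P_{n,r}(0) - \theta P^{*}_{n,r}(\theta) = -\lambda P^{*}_{n,r}(\theta) + \lambda P^{*}_{n-1,r}(\theta)$. Rearranging yields $(\lambda - \theta) P^{*}_{n,r}(\theta) = \lambda P^{*}_{n-1,r}(\theta) - P_{n,r}(0)$, which is (13). The remaining equations follow in the same way, with each density $s_{n,r}(x)$ replaced by its transform $s^{*}_{n,r}(\theta)$.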

From Eqs. (1) and (2), we obtain
\[
\lambda P_{n,0} = \sum_{m=0}^{n} \sum_{r=a}^{b} P_{m,r}(0); \quad 0 \le n \le a - 1. \tag{17}
\]
Now using (17) in (11), we get
\[
(\lambda - \theta) P^{*}_{0,a}(\theta) = \sum_{m=0}^{a} \sum_{r=a}^{b} P_{m,r}(0)\, s^{*}_{0,a}(\theta) - P_{0,a}(0). \tag{18}
\]

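Now summing Eqs. (12)–(16) and (18), we obtain
\begin{align}
\sum_{n=0}^{N} \sum_{r=a}^{b} P^{*}_{n,r}(\theta) &= \frac{1 - s^{*}_{0,a}(\theta)}{\theta} \sum_{n=0}^{a} \sum_{r=a}^{b} P_{n,r}(0) + \sum_{n=a+1}^{b} \frac{1 - s^{*}_{0,n}(\theta)}{\theta} \sum_{r=a}^{b} P_{n,r}(0) \nonumber \\
&\quad + \sum_{n=b+1}^{N} \frac{1 - s^{*}_{n-b,b}(\theta)}{\theta} \sum_{r=a}^{b} P_{n,r}(0). \tag{19}
\end{align}
Taking the limit as $\theta \to 0$ in the above expression, and using L'Hôpital's rule and the
normalization condition $\sum_{n=0}^{a-1} P_{n,0} + \sum_{n=0}^{N} \sum_{r=a}^{b} P_{n,r} = 1$, we obtain
\[
1 - \sum_{n=0}^{a-1} P_{n,0} = \tilde{s}_{0,a} \sum_{n=0}^{a} \sum_{r=a}^{b} P_{n,r}(0) + \sum_{n=a+1}^{b} \tilde{s}_{0,n} \sum_{r=a}^{b} P_{n,r}(0) + \sum_{n=b+1}^{N} \tilde{s}_{n-b,b} \sum_{r=a}^{b} P_{n,r}(0). \tag{20}
\]
After algebraic manipulation of (20), we obtain the desired result (10).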
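The joint probability distribution of queue content and number of customers with
the server at an arbitrary epoch is obtained in Theorem 1.

Theorem 1 The steady-state arbitrary epoch joint probabilities $P_{n,0}$, $P_{n,r}$ are
related with the departure epoch joint probabilities $p_n^{+}$, $p^{+}_{n,r}$ as follows.
\begin{align}
P_{n,0} &= E^{-1} \sum_{i=0}^{n} p_i^{+}, \quad 0 \le n \le a - 1, \tag{21} \\
P_{n,a} &= E^{-1} \left( \sum_{k=0}^{a} p_k^{+} - \sum_{i=0}^{n} p^{+}_{i,a} \right), \quad 0 \le n \le N - 1, \tag{22} \\
P_{n,r} &= E^{-1} \left( p_r^{+} - \sum_{i=0}^{n} p^{+}_{i,r} \right), \quad 0 \le n \le N - 1,\ a + 1 \le r \le b - 1, \tag{23} \\
P_{n,b} &= E^{-1} \left( \sum_{i=b}^{\min(b+n,\, N)} p_i^{+} - \sum_{i=0}^{n} p^{+}_{i,b} \right), \quad 0 \le n \le N - 1, \tag{24}
\end{align}
where $E = \lambda g + \sum_{n=0}^{a-1} (a - n) p_n^{+}$ and $g = \tilde{s}_{0,a} \sum_{n=0}^{a} p_n^{+} + \sum_{n=a+1}^{b} p_n^{+} \tilde{s}_{0,n} + \sum_{n=b+1}^{N} p_n^{+} \tilde{s}_{n-b,b}$.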

Proof The desired results (22)–(24) are obtained by substituting θ = 0 in (11)–(16),

solving recursively, after some algebraic manipulations (as described in [6]).
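
The relations of Theorem 1 can be checked against the published tables. The sketch below copies the marginal $p_n^{+}$ column of Table 1, takes the service rates of Sect. 4 with mean service time $1/\mu_{n,r}$ (an assumption about the parametrization), computes $g$ and $E$, and evaluates a few arbitrary-epoch probabilities for comparison with Table 2. Variable and function names are illustrative.

```python
lam, mu, a, b, N = 29.0, 1.5, 4, 9, 25

# Marginal departure-epoch probabilities p_n^+ (last column of Table 1).
p_plus = [
    0.0437831, 0.0808509, 0.0942818, 0.0894248, 0.0763035, 0.0623207,
    0.0511758, 0.0439255, 0.0401575, 0.0387934, 0.0385975, 0.0385028,
    0.0377915, 0.0361409, 0.0335623, 0.0302829, 0.0287256, 0.0261760,
    0.0223593, 0.0180943, 0.0141279, 0.0108286, 0.0082639, 0.0063459,
    0.0049370, 0.0242465,
]


def mean_service(n):
    """Mean service time of the batch started after a departure leaves n waiting."""
    if n <= a:
        return 1.0 / ((b - a + 1) * mu)      # batch of size a, rate mu_{0,a}
    if n < b:
        return 1.0 / ((b - n + 1) * mu)      # batch of size n, rate mu_{0,n}
    return 1.0 / (mu + 0.5 * (n - b))        # batch of size b, rate mu_{n-b,b}


g = sum(p_plus[n] * mean_service(n) for n in range(N + 1))
E = lam * g + sum((a - n) * p_plus[n] for n in range(a))

# Eq. (21): idle-server probabilities P_{n,0}, n = 0..a-1.
P_idle_states = [sum(p_plus[: n + 1]) / E for n in range(a)]
# Eq. (22) with n = 0, using p^+_{0,4} = 0.0361923 from Table 1.
P_0a = (sum(p_plus[: a + 1]) - 0.0361923) / E
# Eq. (24) with n = 0, using p^+_{0,9} = 0.0000335 from Table 1.
P_0b = (p_plus[b] - 0.0000335) / E

if __name__ == "__main__":
    print(round(E, 5), [round(x, 7) for x in P_idle_states], round(P_0a, 7), round(P_0b, 7))
```

The printed values can be compared with the first rows of Table 2, up to the rounding of the tabulated inputs.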

Evaluation of $P_{N,r}$ ($a \le r \le b$) The probabilities $P_{N,r}$ ($a \le r \le b$) cannot be
determined from the normalizing condition. The procedure for determining those
probabilities from Eq. (16) is described in this section.

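Let us first differentiate (11)–(15) with respect to $\theta$ and set $\theta = 0$, and we obtain
\begin{align}
\lambda P^{*(1)}_{0,a}(0) &= P_{0,a} - \lambda P_{a-1,0}\, \tilde{s}_{0,a} - \tilde{s}_{0,a} \sum_{r=a}^{b} P_{a,r}(0), \tag{25} \\
\lambda P^{*(1)}_{0,n}(0) &= P_{0,n} - \tilde{s}_{0,n} \sum_{k=a}^{b} P_{n,k}(0); \quad a + 1 \le n \le b, \tag{26} \\
\lambda P^{*(1)}_{n,r}(0) &= P_{n,r} + \lambda P^{*(1)}_{n-1,r}(0); \quad a \le r \le b - 1,\ 1 \le n \le N - 1, \tag{27} \\
\lambda P^{*(1)}_{n,b}(0) &= P_{n,b} + \lambda P^{*(1)}_{n-1,b}(0) - \sum_{r=a}^{b} P_{n+b,r}(0)\, \tilde{s}_{n,b}; \quad 1 \le n \le N - b, \tag{28} \\
\lambda P^{*(1)}_{n,b}(0) &= P_{n,b} + \lambda P^{*(1)}_{n-1,b}(0); \quad N - b + 1 \le n \le N - 1, \tag{29}
\end{align}
where $P^{*(1)}_{n,r}(0)$ is the derivative of $P^{*}_{n,r}(\theta)$ with respect to $\theta$ at $\theta = 0$. Solving Eqs.
(25)–(29) recursively and using Lemma 1, we obtain the values of $P^{*(1)}_{n,r}(0)$ ($a \le r \le b$, $0 \le n \le N - 1$)
in known terms.
Now to obtain $P_{N,r}$ ($a \le r \le b$), we differentiate (16) with respect to $\theta$ and set
$\theta = 0$, and obtain
\[
P_{N,r} = -\lambda P^{*(1)}_{N-1,r}(0); \quad a \le r \le b.
\]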
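
The last relation can be verified in one step; the following is a sketch of the differentiation. Since $P_{N,r}(0)$ does not depend on $\theta$, differentiating (16) gives

\[
-P^{*}_{N,r}(\theta) - \theta P^{*(1)}_{N,r}(\theta) = \lambda P^{*(1)}_{N-1,r}(\theta),
\]

and setting $\theta = 0$ with $P^{*}_{N,r}(0) = P_{N,r}$ yields $P_{N,r} = -\lambda P^{*(1)}_{N-1,r}(0)$. The sign is consistent with $P_{N,r} \ge 0$, because $P^{*(1)}_{N-1,r}(0) = -\int_0^{\infty} x P_{N-1,r}(x)\, dx \le 0$.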

Henceforth, we have completely obtained all the joint probability distributions
of queue length and server content. Now, we obtain several marginal distributions,
which will help us to compute the useful performance measures, as follows.
– the distribution of queue content, $p_n^{queue}$ ($0 \le n \le N$), is given by
\[
p_n^{queue} = \begin{cases}
P_{n,0} + \sum_{r=a}^{b} P_{n,r}, & 0 \le n \le a - 1, \\
\sum_{r=a}^{b} P_{n,r}, & a \le n \le N.
\end{cases}
\]
– the distribution of the system content, $p_n^{sys}$ ($0 \le n \le N + b$), is given by
\[
p_n^{sys} = \begin{cases}
P_{n,0}, & 0 \le n \le a - 1, \\
\sum_{r=a}^{\min(b,n)} P_{n-r,r}, & a \le n \le N + a, \\
\sum_{r=n-N}^{b} P_{n-r,r}, & N + a + 1 \le n \le N + b.
\end{cases}
\]
– the distribution of queue length when the server is busy is given by $p_n^{busy} = \sum_{r=a}^{b} P_{n,r}$
($0 \le n \le N$),
– the distribution of the number of customers with the server, $p_r^{ser}$
($r = 0$ and $a \le r \le b$), is given by
\[
p_r^{ser} = \begin{cases}
\sum_{n=0}^{a-1} P_{n,0}, & r = 0, \\
\sum_{n=0}^{N} P_{n,r}, & a \le r \le b.
\end{cases}
\]
– the probability that the server is in idle state is given by $P_{idle} = \sum_{n=0}^{a-1} P_{n,0}$, and the
probability that the server is in busy state is given by $P_{busy} = \sum_{r=a}^{b} p_r^{ser}$.

3 Performance Measure

As all the state probabilities are known, the significant performance measures of the
present model are evaluated as follows:

1. Expected queue length: $L_q = \sum_{n=0}^{N} n\, p_n^{queue}$.
2. Expected system length: $L = \sum_{n=0}^{N+b} n\, p_n^{sys}$.
3. Expected server content: $L_s = \sum_{r=a}^{b} r\, p_r^{ser}$.
4. Expected queue length when server is idle: $L_q^{idle} = \sum_{n=0}^{a-1} n P_{n,0}$.
5. Expected queue length when server is busy: $L_q^{busy} = \sum_{n=0}^{N} n\, p_n^{busy}$.
6. Expected queue length when server is busy with $r$ ($a \le r \le b$) customers: $L_{q,r}^{busy} = \sum_{n=0}^{N} n P_{n,r}$.
7. Probability of blocking: $P_{Block} = p_N^{busy}$.
8. Using Little's law, the expected waiting time of a customer in the queue is given
by $W_q = L_q / \bar{\lambda}$ and the expected waiting time of a customer in the system is given
by $W = L / \bar{\lambda}$, where $\bar{\lambda}$ is the effective arrival rate of the system and is given by
$\bar{\lambda} = \lambda (1 - P_{Block})$.
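
A few of these measures can be recomputed from the tabulated results as a consistency check. The sketch below takes the totals row of Table 2 ($p_r^{ser}$ for $r = 4, \ldots, 9$), the blocking probability $p_{25}^{queue}$, and the reported $L_q$. It then evaluates $L_s$, uses $L = L_q + L_s$ (system content is queue content plus server content), and applies Little's law. Variable names are illustrative.

```python
lam = 29.0

# Totals row of Table 2: p_r^ser for r = 4..9.
p_ser = {4: 0.1852744, 5: 0.0360222, 6: 0.0369753,
         7: 0.0423158, 8: 0.0580289, 9: 0.5373898}
P_block = 0.0214699      # p_25^queue in Table 2
L_q = 6.952619           # expected queue length reported under Table 2

L_s = sum(r * p for r, p in p_ser.items())   # expected server content
L = L_q + L_s                                # expected system length
lam_eff = lam * (1.0 - P_block)              # effective arrival rate
W_q = L_q / lam_eff                          # Little's law, queue
W = L / lam_eff                              # Little's law, system

if __name__ == "__main__":
    print(round(L_s, 6), round(L, 5), round(W, 7), round(W_q, 7))
```

The results can be compared with the values of $L_s$, $L$, $W$ and $W_q$ listed under Table 2.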

4 Numerical Results

This section presents several numerical examples in the form of tables
and graphs to illustrate the analytical results obtained in the previous sections. For
this purpose, we consider an $M/G^{(4,9)}_{n,r}/1/25$ queue with state-dependent service rates
$\mu_{0,r} = (b - r + 1)\mu$ ($a \le r \le b - 1$) and $\mu_{n,b} = \mu + 0.5n$ ($0 \le n \le N - b$), where
$\mu = 1.5$. Tables 1 and 2 present the departure epoch joint probabilities and arbitrary
epoch joint probabilities for the above queueing system with $E_4$ service time
distribution and arrival rate $\lambda = 29.0$. These results are presented here to show the numerical
compatibility of our analytical results. The important performance measures of the
queueing model under consideration are also presented at the bottom of Table 2. We
have also presented here a comparative study in the form of graphs to bring out the
qualitative aspects of our current study. For this purpose, we have considered the
following two cases.

Table 1 Departure epoch joint distribution for $M/E_4^{(4,9)}/1/25$ queue with a = 4, b = 9, N = 25,
λ = 29.0
n $p^{+}_{n,4}$ $p^{+}_{n,5}$ $p^{+}_{n,6}$ $p^{+}_{n,7}$ $p^{+}_{n,8}$ $p^{+}_{n,9}$ $p^{+}_{n}$
1 0.0645893 0.0081906 0.0047096 0.0023322 0.0008337 0.0001954 0.0808509
2 0.0720419 0.0100647 0.0064424 0.0035976 0.0014743 0.0006609 0.0942818
3 0.0642835 0.0098941 0.0070502 0.0044396 0.0020856 0.0016717 0.0894248
4 0.0501906 0.0085106 0.0067509 0.0047938 0.0025816 0.0034760 0.0763035
5 0.0358284 0.0066931 0.0059102 0.0047326 0.0029216 0.0062348 0.0623207
6 0.0239774 0.0049347 0.0048509 0.0043802 0.0030997 0.0099328 0.0511758
7 0.0152823 0.0034651 0.0037918 0.0038610 0.0031321 0.0143932 0.0439255
8 0.0093751 0.0023419 0.0028528 0.0032757 0.0030462 0.0192659 0.0401575
9 0.0055770 0.0015348 0.0020813 0.0026949 0.0028728 0.0240327 0.0387934
10 0.0032347 0.0009807 0.0014804 0.0021616 0.0026416 0.0280985 0.0385975
11 0.0018367 0.0006135 0.0010310 0.0016975 0.0023780 0.0309460 0.0385028
12 0.0010243 0.0003769 0.0007052 0.0013093 0.0021025 0.0322733 0.0377915
13 0.0005625 0.0002280 0.0004749 0.0009943 0.0018303 0.0320510 0.0361409
14 0.0003047 0.0001361 0.0003155 0.0007449 0.0015720 0.0304890 0.0335623
15 0.0001631 0.0000803 0.0002072 0.0005516 0.0013343 0.0279464 0.0302829
16 0.0000864 0.0000469 0.0001346 0.0004041 0.0011207 0.0269328 0.0287256
17 0.0000454 0.0000271 0.0000867 0.0002934 0.0009326 0.0247909 0.0261760
18 0.0000236 0.0000155 0.0000553 0.0002112 0.0007696 0.0212840 0.0223593
19 0.0000122 0.0000088 0.0000350 0.0001509 0.0006303 0.0172570 0.0180943
20 0.0000063 0.0000050 0.0000221 0.0001071 0.0005127 0.0134749 0.0141279
21 0.0000032 0.0000028 0.0000138 0.0000755 0.0004144 0.0103188 0.0108286
22 0.0000016 0.0000016 0.0000086 0.0000529 0.0003331 0.0078661 0.0082639
23 0.0000008 0.0000009 0.0000053 0.0000369 0.0002664 0.0060357 0.0063459
24 0.0000004 0.0000005 0.0000033 0.0000256 0.0002119 0.0046952 0.0049370
25 0.0000004 0.0000006 0.0000051 0.0000559 0.0007647 0.0234197 0.0242465
Total 0.3846442 0.0623207 0.0511758 0.0439255 0.0401575 0.4177764 1.0000000

Case 1. Batch size- as well as queue length-dependent service time, i.e., $\mu_{0,r} = (b - r + 1)\mu$
($a \le r \le b - 1$) and $\mu_{n,b} = \mu + 0.5n$ ($0 \le n \le N - b$), where $\mu = 1.5$.
Case 2. Only batch size-dependent service time, i.e., $\mu_{0,r} = (b - r + 1)\mu$
($a \le r \le b - 1$) and $\mu_{n,b} = 4.150918$ ($0 \le n \le N - b$), where $\mu = 1.5$.
The value $\mu_{n,b} = 4.150918$ ($0 \le n \le N - b$) in Case 2 is chosen so that
the average service time remains the same in Cases 1 and 2.
It must be noticed here that when the server serves a batch of size
$r$ ($a \le r \le b - 1$), the service times in both cases remain unaffected by the
queue length. This is because whenever the server starts the service of
a batch of size $r$ ($a \le r \le b - 1$), the queue length is always zero.

Table 2 Arbitrary epoch joint distribution for $M/E_4^{(4,9)}/1/25$ queue with a = 4, b = 9, N = 25,
λ = 29.0
n $P_{n,0}$ $P_{n,4}$ $P_{n,5}$ $P_{n,6}$ $P_{n,7}$ $P_{n,8}$ $P_{n,9}$ $p_n^{queue}$
1 0.0186311 0.0424335 0.0074690 0.0066244 0.0060763 0.0058343 0.0115347 0.0986032
2 0.0327249 0.0316643 0.0059644 0.0056613 0.0055386 0.0056139 0.0171915 0.1043588
3 0.0460926 0.0220548 0.0044854 0.0046074 0.0048749 0.0053022 0.0225909 0.1100082
4 - 0.0145520 0.0032132 0.0035982 0.0041583 0.0049162 0.0274739 0.0579118
5 - 0.0091961 0.0022126 0.0027147 0.0034508 0.0044795 0.0315589 0.0536128
6 - 0.0056118 0.0014750 0.0019896 0.0027960 0.0040161 0.0346010 0.0504896
7 - 0.0033273 0.0009570 0.0014228 0.0022189 0.0035479 0.0367435 0.0482174
8 - 0.0019259 0.0006069 0.0009963 0.0017292 0.0030926 0.0377765 0.0461274
9 - 0.0010922 0.0003775 0.0006852 0.0013264 0.0026631 0.0375263 0.0436707
10 - 0.0006087 0.0002309 0.0004639 0.0010032 0.0022683 0.0360308 0.0406058
11 - 0.0003341 0.0001392 0.0003098 0.0007495 0.0019128 0.0335167 0.0369621
12 - 0.0001810 0.0000828 0.0002044 0.0005538 0.0015985 0.0303110 0.0329315
13 - 0.0000969 0.0000487 0.0001334 0.0004051 0.0013249 0.0267552 0.0287642
14 - 0.0000513 0.0000284 0.0000862 0.0002938 0.0010899 0.0231461 0.0246958
15 - 0.0000270 0.0000164 0.0000553 0.0002113 0.0008904 0.0197065 0.0209069
16 - 0.0000140 0.0000094 0.0000351 0.0001509 0.0007229 0.0193050 0.0202373
17 - 0.0000073 0.0000053 0.0000222 0.0001070 0.0005835 0.0155991 0.0163244
18 - 0.0000037 0.0000030 0.0000139 0.0000755 0.0004684 0.0124174 0.0129820
19 - 0.0000019 0.0000017 0.0000087 0.0000529 0.0003742 0.0098377 0.0102771
20 - 0.0000010 0.0000009 0.0000054 0.0000369 0.0002976 0.0078234 0.0081652
21 - 0.0000005 0.0000005 0.0000033 0.0000256 0.0002356 0.0062809 0.0065465
22 - 0.0000002 0.0000003 0.0000020 0.0000177 0.0001858 0.0051050 0.0053112
23 - 0.0000001 0.0000002 0.0000012 0.0000122 0.0001460 0.0042028 0.0043625
24 - 0.0000001 0.0000001 0.0000008 0.0000084 0.0001143 0.0035009 0.0036245
25 - 0.0000001 0.0000001 0.0000012 0.0000177 0.0003910 0.0210598 0.0214699
Total 0.1039936 0.1852744 0.0360222 0.0369753 0.0423158 0.0580289 0.5373898 1.0000000
(Column totals, left to right: $P_{idle}$, $p_4^{ser}$, $p_5^{ser}$, $p_6^{ser}$, $p_7^{ser}$, $p_8^{ser}$, $p_9^{ser}$)
$L = 13.69263$, $W = 0.4825193$, $L_q = 6.952619$, $W_q = 0.2450057$, $P_{Block} = 0.02146987$,
$L_s = 6.740011$, $L_q^{idle} = 0.2223588$, $L_q^{busy} = 6.73026$,
$L_{q,4}^{busy} = 0.373122$, $L_{q,5}^{busy} = 0.08705351$, $L_{q,6}^{busy} = 0.1116945$, $L_{q,7}^{busy} = 0.1704021$, $L_{q,8}^{busy} = 0.3493061$, $L_{q,9}^{busy} = 5.638682$
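As a quick sanity check, the sketch below tests whether the measures reported under Table 2 are mutually consistent. It assumes (this is not stated in the text here) that W and Wq follow from Little's law with the effective arrival rate λ(1 − PBlock), and that Lq splits additively into its idle and busy parts. All variable names are illustrative.

```python
# Values reported under Table 2 (lambda = 29.0).
lam = 29.0
L, W = 13.69263, 0.4825193
Lq, Wq = 6.952619, 0.2450057
p_block = 0.02146987
Lq_idle, Lq_busy = 0.2223588, 6.73026
Lq_busy_by_batch = [0.373122, 0.08705351, 0.1116945,
                    0.1704021, 0.3493061, 5.638682]

# Effective arrival rate: only customers that are not blocked enter the system.
lam_eff = lam * (1.0 - p_block)


def close(x, y, tol=1e-5):
    """True when x and y agree up to the printed precision."""
    return abs(x - y) <= tol


checks = {
    "W  = L  / lam_eff": close(L / lam_eff, W),
    "Wq = Lq / lam_eff": close(Lq / lam_eff, Wq),
    "Lq = Lq_idle + Lq_busy": close(Lq_idle + Lq_busy, Lq),
    "Lq_busy = sum over batch sizes": close(sum(Lq_busy_by_batch), Lq_busy),
}

for name, ok in checks.items():
    print(f"{name}: {'ok' if ok else 'MISMATCH'}")
```

Under these assumptions, a mismatch would point to a transcription error in the table rather than to a modelling issue.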

In contrast, when the server starts service with a batch of size b, it finds a queue of n (0 ≤ n ≤ N − b) customers. In this situation the service rate for Case 1, $\mu_{n,b} = \mu + 0.5n$, depends on the queue length, so the mean service time decreases as n grows. For Case 2, the service rate of a batch of size b is taken to be the reciprocal of the average service time of a batch of size b with n customers in the queue as considered in Case 1, i.e., $(N - b + 1) \big/ \sum_{n=0}^{N-b} 1/\mu_{n,b}$. This normalization of the service rate makes the comparison between Cases 1 and 2 meaningful. Therefore, the average service time of a batch of size b is $E(S_b) = 0.240911$ and the average service rate of a batch of size b is $\mu_{n,b} = 4.150918$ for both cases. Figure 1 depicts the impact of the arrival rate λ on various performance measures for Case 1 and Case 2. In particular, Fig. 1a shows that W and $W_q$ decrease as the arrival rate increases for Case 1, whereas they increase for Case 2. This behavior demonstrates the benefit of combining queue-length-dependent service with batch-size-dependent service. Fig. 1b shows that the expected system and queue lengths are much lower for Case 1 than for Case 2. One of the most important performance measures for any queueing model is $P_{Block}$. Figure 1c reveals that congestion control, in terms of a decrease in $P_{Block}$, is achieved more significantly in the present study than in the queueing model considered by Banerjee and Gupta [6].

[Figure 1: three panels plotted against λ (50 to 95), each with curves for Case 1 and Case 2. (a) Effect of λ on W and Wq (waiting time); (b) effect of λ on L and Lq (queue length); (c) effect of λ on PBlock (blocking probability).]

Fig. 1 Impact of λ on some key performance measures for Case 1 and Case 2
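The normalization used for Case 2 can be checked numerically. The sketch below computes the average service time of a full batch under the Case 1 rates $\mu_{n,b} = \mu + 0.5n$ and takes its reciprocal. It assumes that N = 25 and b = 9 from Table 2 carry over to this comparison; the variable names are illustrative.

```python
# Case 1 service rates for a batch of size b with n customers left in the queue.
mu, N, b = 1.5, 25, 9   # N and b are assumed to be those of Table 2
rates = [mu + 0.5 * n for n in range(N - b + 1)]

# Average service time E(S_b): arithmetic mean of the mean service times 1/mu_{n,b}.
mean_service_time = sum(1.0 / m for m in rates) / len(rates)

# Case 2 uses the reciprocal, i.e. the harmonic mean of the Case 1 rates.
normalized_rate = 1.0 / mean_service_time

print(f"E(S_b)        = {mean_service_time:.6f}")
print(f"rate (Case 2) = {normalized_rate:.6f}")
```

With these inputs the two printed numbers should agree with the values 0.240911 and 4.150918 quoted in the text, which supports the reading of the garbled formula as $(N - b + 1)/\sum_{n=0}^{N-b} 1/\mu_{n,b}$.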
Figure 2 shows the behavior of some important performance measures with respect to λ for two service time distributions, deterministic and $E_4$, in Case 1. Figure 2a and b show that the idle probability of the server decreases while the blocking probability increases as the arrival rate increases, as one would expect. Figure 2c shows that the expected server content increases with λ, as expected.

[Figure 2: three panels plotted against λ (30 to 40), each with curves for the deterministic and Erlang-4 service time distributions. (a) Effect of λ on Pidle; (b) effect of λ on PBlock; (c) effect of λ on LS.]

Fig. 2 Impact of λ on Pidle, PBlock and LS for different service time distributions

5 Conclusion

In this paper, we have analyzed a bulk service queue with finite buffer size. The service time, which depends on the size of the batch under service as well as on the workload, measured as the queue length before service initiation, is considered to be generally distributed. We have presented the procedure for obtaining the joint probabilities in steady state at various time epochs. Several numerical examples comparing the present model with the one studied by Banerjee and Gupta [6] are presented to explore the qualitative aspects of the model. The effect of the arrival rate on some important performance measures reveals that congestion control is achieved more effectively in the present study. The model can be extended to correlated arrival processes.

Acknowledgements The authors are truly thankful to the anonymous reviewers and editors for
their valuable comments and suggestions.

References

1. Jain, R.: Congestion control and traffic management in ATM networks: recent advances and a survey. Comput. Netw. Syst. 28(13)
2. Leung, K.K.: Load-dependent service queues with application to congestion control in broadband networks. Perform. Eval. 50(1), 27–40 (2002)
3. Neuts, M.F.: A general class of bulk queues with Poisson input. Ann. Mathe. Statist. 38, 759–770 (1967)
4. Powell, W.B., Humblet, P.: The bulk service queue with a general control strategy: theoretical analysis and a new computational procedure. Oper. Res. 34(2), 267–275 (1986)
5. Neuts, M.F.: Transform-free equations for the stationary waiting time distributions in the queue with Poisson arrivals and bulk services. Ann. Oper. Res. 8(1), 1–26 (1987)
6. Banerjee, A., Gupta, U.C.: Reducing congestion in bulk-service finite-buffer queueing system using batch-size-dependent service. Perform. Eval. 69(1), 53–70 (2012)
7. Banerjee, A., Gupta, U.C., Chakravarthy, S.R.: Analysis of a finite-buffer bulk-service queue under Markovian arrival process with batch-size-dependent service. Comput. Oper. Res. 60, 138–149 (2015)
8. Banerjee, A., Gupta, U.C., Sikdar, K.: Analysis of finite-buffer bulk-arrival bulk-service queue with variable service capacity and batch-size-dependent service: $M^X/G_r^Y/1/N$. Int. J. Mathe. Operational Res. 5(3), 358–386 (2013)
9. Banerjee, A., Sikdar, K., Gupta, U.C.: Computing system length distribution of a finite-buffer bulk-arrival bulk-service queue with variable server capacity. Int. J. Operational Res. 12(3), 294–317 (2011)
10. Bar-Lev, S.K., Blanc, H., Boxma, O., Janssen, G., Perry, D.: Tandem queues with impatient customers for blood screening procedures. Meth. Comput. Appl. Probab. 15(2), 423–451 (2013)
11. Bar-Lev, S.K., Parlar, M., Perry, D., Stadje, W.: Applications of bulk queues to group testing models with incomplete identification. Eur. J. Operational Res. 183, 226–237 (2007)
12. Germs, R., Van Foreest, N.: Loss probabilities for the $M^X/G^Y/1/(K + B)$ bulk queue. Probab. Eng. Inf. Sci. 24(4), 457–471 (2010)
Chapter 21
A Fuzzy Random Continuous (Q, r, L)
Inventory Model Involving Controllable
Back-order Rate and Variable Lead-Time
with Imprecise Chance Constraint

Debjani Chakraborty, Sushil Kumar Bhuiya and Debdas Ghosh

Abstract In this article, we analyze a fuzzy random continuous review inventory system with a mixture of back-orders and lost sales, where the annual demand is treated as a fuzzy random variable. The lead-time is assumed to be a control variable, and the lead-time crashing cost is introduced as a negative exponential function of the lead-time. In a realistic situation, the back-order rate depends on the lead-time: long lead-times lead to longer stock-out periods, and as a result many customers may not be prepared to wait for back-orders. Instead of a constant back-order rate, we introduce the back-order rate as a decision variable, which is a function of the lead-time through the amount of shortage. Moreover, a budgetary constraint is imposed on the model in the form of an imprecise chance constraint to capture the imprecisely defined uncertain information in the budget constraint. We develop a methodology to determine the optimum order quantity, reorder point, lead-time, and back-order rate such that the total cost is minimized in the fuzzy sense. Finally, a numerical example is presented to illustrate the proposed methodology.

Keywords Inventory · Imprecise chance constraint · Fuzzy random variable · Possibilistic mean value

D. Chakraborty (B) · S. K. Bhuiya

Department of Mathematics, Indian Institute of Technology Kharagpur,
Kharagpur 721302, West Bengal, India
e-mail: [email protected]
S. K. Bhuiya
e-mail: [email protected]; [email protected]
D. Ghosh
Department of Mathematical Sciences, Indian Institute
of Technology (BHU) Varanasi, Varanasi 221005, Uttar Pradesh, India
e-mail: [email protected]; [email protected]

© Springer Nature Singapore Pte Ltd. 2018 263

1 Introduction

Inventory control plays a significant role in every production house. The continuous review inventory model is one of the most important and useful problems in industrial applications. In the continuous review inventory system, the occurrence of shortages is a major concern. In most real-life situations, when a shortage arises, back-orders and lost sales happen simultaneously. Thus, an inventory model that includes both back-orders and lost sales is more realistic than models based on only one of the two cases. Montgomery et al. [23] first introduced the inventory model with a mixture of back-orders and lost sales. After the pioneering work of [23], numerous related studies have addressed the problem of a mixture of back-orders and lost sales (see, among others, [16, 22, 31]).
In the earlier literature dealing with inventory systems, the lead-time is commonly considered as a prescribed constant or a stochastic variable. Hence, the lead-time is uncontrollable [26]. However, production management philosophies such as just in time (JIT) show that there are advantages and benefits associated with efforts to control the lead-time. By shortening the lead-time [35], we can decrease the safety stock, minimize the loss due to stock-outs, improve the service level to the customer, and increase the competitive capability in business. Liao and Shyu [21] first introduced the problem of lead-time reduction in a continuous review inventory model, where the order quantity was predetermined and the lead-time was assumed to be a decision variable. Thereafter, several researchers (see, among others, [2, 14, 20, 22, 25, 28–30]) have studied lead-time reduction in different types of inventory system.
In addition to the lead-time, another key aspect of the inventory system is back-orders. Most of the earlier work in the field of inventory control assumes that the back-order rate is constant. However, in a realistic situation, the back-order rate depends on the lead-time. Longer lead-times lead to longer stock-out periods, and as a result many customers may not be willing to wait for back-orders. In other words, the longer the lead-time, the longer the period of shortage; the proportion of customers that can wait goes down, and the back-order rate decreases. The interdependence between the back-order rate and the lead-time was proposed by Ouyang and Chuang [28]. They considered the back-order rate to be a control variable that depends on the length of the lead-time through the amount of shortage. After the work of [28], researchers have been attracted to the problem of controlling the back-order rate, and they have extended inventory control in various directions (see, among others, [20, 22, 33]).
On the other hand, in most real-life business situations, the decision maker has to work under a limited budget. According to Hadley and Whitin [15], the most significant real-world constraint is the budgetary restriction on the amount of capital that can be committed to procuring the items of inventory. Keeping this in mind, many inventory models (see, among others, [1, 18, 24]) have been developed under a budgetary constraint in a stochastic environment.

During the mid-1980s, researchers noticed that fuzziness is also an intrinsic property of key parameters of the inventory system, particularly when the given or obtained data are undetectable, insufficient, or only partially known. Since then, fuzzy set theory has been extensively employed in inventory problems to capture uncertainties in the non-probabilistic sense. Park [32] introduced fuzzy mathematics into the inventory system by developing an economic order quantity (EOQ) model in which trapezoidal fuzzy numbers represented the ordering and holding costs. Gen et al. [13] developed a continuous review inventory model where the values of the parameters of the inventory system are considered to be triangular fuzzy numbers. Ouyang and Yao [27] extended the min-max distribution-free procedure to the fuzzy environment by developing a continuous review inventory model with variable lead-time in which the annual demand was assumed to be a triangular fuzzy number. Tütüncü et al. [36] and Vijayan and Kumaran [37] studied the continuous review inventory model by fuzzifying the cost parameters into fuzzy numbers. Tütüncü et al. obtained the solution using a simulation-based analysis, while Vijayan and Kumaran used an iterative algorithm to derive the optimal solution. Recently, Shekarian et al. [34] presented a comprehensive review of the most relevant works on fuzzy inventory models.
Note that the models mentioned above capture the uncertainty of the parameters of the inventory system by characterizing the corresponding variable as either a fuzzy or a random variable. In a real-life inventory system, fuzziness and randomness often co-occur. Kwakernaak [19] first described the fuzziness and randomness of an event simultaneously. Dutta et al. [12] first incorporated a mixture of fuzziness and randomness into the annual demand and developed a single periodic review inventory model. Chang et al. [5] and Dutta and Chakraborty [11] analyzed and extended the continuous review inventory model to fuzzy random circumstances. Chang et al. [5] treated the lead-time as a fuzzy random variable and the annual expected demand as a fuzzy number. On the other hand, Dutta and Chakraborty [11] considered both the lead-time and the annual demand as discrete fuzzy random variables. Dey and Chakraborty [10] considered the annual demand as a fuzzy random variable in developing a periodic review inventory model. Dey and Chakraborty [9] proposed a methodology for constructing a fuzzy random data set from partially known information. This method was applied to the fuzzy random periodic review model developed by Dey and Chakraborty [10]. Moreover, Dey and Chakraborty [8] also extended the model of [10] by incorporating a negative exponential crashing cost and treating the lead-time as a variable. Kumar and Goswami [17] extended the min-max distribution-free approach to fuzzy random environments by developing a continuous review production–inventory system. Now, with the increased complexity of the inventory problem domain, it is hard to define the budgetary constraint with proper certainty and precision. Chance-constrained programming [6] provides a procedure to construct constraints in the presence of randomness. However, imprecision and randomness may appear together in the information about the restriction. Keeping the issue of vagueness in mind, Chakraborty [4] redefined the chance constraint as the imprecise chance constraint, in which the probability of satisfying the imprecise constraint is considered to be vague in nature and to be imprecisely greater than or equal to a specified probability. Recently, Dey et al. [7] incorporated an imprecise chance constraint into a fuzzy random continuous review inventory model with a mixture of back-orders and lost sales.
An analysis of the literature reveals that there are some studies of the continuous review inventory system that consider fuzziness and randomness simultaneously. However, existing research does not combine controllable lead-time and back-order rate in the mixed fuzzy random framework. Our intention here is to address this research gap for the continuous review inventory model in a fuzzy random environment.
Thus, in this paper, we consider a fuzzy random continuous review (Q, r, L) inventory model with back-orders and lost sales, in which the annual demand is a fuzzy random variable. The lead-time is taken as a decision variable, and the crashing cost is introduced as a negative exponential function of the lead-time. The back-order rate is also a decision variable, which is a function of the lead-time through the amount of shortage. A budgetary constraint is imposed on the model in the form of an imprecise chance constraint. A methodology is developed to determine the optimal values of the decision variables such that the annual cost of the inventory model is minimized in the fuzzy sense. Finally, a numerical example is provided to illustrate the proposed methodology.
The rest of the paper is organized as follows: Sect. 2 presents some basic concepts of fuzzy set theory. In Sect. 3, the development of the proposed methodology is discussed. We present a numerical example to illustrate the methodology in Sect. 4. The paper is summarized in Sect. 5.

2 Preliminaries

In this section, we review some basic concepts of fuzzy set theory that will be used in this paper.

Definition 1 (Triangular fuzzy number [38]). A normalized triangular fuzzy number $\tilde{A} = (a, b, c)$ is a fuzzy subset of the real line $\mathbb{R}$ whose membership function $\mu_{\tilde{A}}(x)$ satisfies the following conditions:
(i) $\mu_{\tilde{A}}(x)$ is a continuous function from $\mathbb{R}$ to the closed interval [0, 1],
(ii) $\mu_{\tilde{A}}(x) = \frac{x-a}{b-a}$ is a strictly increasing function on [a, b],
(iii) $\mu_{\tilde{A}}(x) = 1$ for x = b,
(iv) $\mu_{\tilde{A}}(x) = \frac{c-x}{c-b}$ is a strictly decreasing function on [b, c],
(v) $\mu_{\tilde{A}}(x) = 0$ elsewhere.

Without any loss of generality, all fuzzy quantities are assumed as triangular fuzzy
numbers throughout this paper.

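Definition 2 (α-cut of fuzzy set [38]). Let $\tilde{A}$ be a fuzzy set. The α-cut of the fuzzy set $\tilde{A}$, denoted by $\tilde{A}_\alpha = [A_\alpha^-, A_\alpha^+]$, is defined as follows:

\[
\tilde{A}_\alpha =
\begin{cases}
\{x \in \mathbb{R} : \mu_{\tilde{A}}(x) \ge \alpha\} & \text{if } \alpha \in (0, 1], \\
\operatorname{cl}\{x \in \mathbb{R} : \mu_{\tilde{A}}(x) > 0\} & \text{if } \alpha = 0.
\end{cases}
\tag{1}
\]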

Definition 3 (Fuzzy random variable [19]). Let $(\Omega, B, P)$ be a probability space and $F(\mathbb{R})$ be the set of all fuzzy numbers. A mapping $\tilde{\chi} : \Omega \to F(\mathbb{R})$ is said to be a fuzzy random variable (or FRV for short) if, for all α ∈ [0, 1], the two mappings $\chi_\alpha^- : \Omega \to \mathbb{R}$ and $\chi_\alpha^+ : \Omega \to \mathbb{R}$ are real-valued random variables.

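Definition 4 (Expectation of fuzzy random variable [19]). If $\tilde{X}$ is a fuzzy random variable, then the fuzzy expectation of $\tilde{X}$ is a unique fuzzy number. It is defined by

\[
E(\tilde{X}) = \int_\Omega \tilde{X}\,dP = \left\{ \left[ \int_\Omega X_\alpha^-\,dP,\ \int_\Omega X_\alpha^+\,dP \right] : 0 \le \alpha \le 1 \right\},
\tag{2}
\]

where the α-cut of the fuzzy random variable is $[\tilde{X}]_\alpha = [X_\alpha^-, X_\alpha^+]$ for all α ∈ [0, 1].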

Definition 5 (Possibilistic mean value of a fuzzy number [3]). Let $\tilde{A}$ be a fuzzy number with α-cut $\tilde{A}_\alpha = [A_\alpha^-, A_\alpha^+]$. Then the possibilistic mean value of $\tilde{A}$ is denoted by $M(\tilde{A})$ and defined as

\[
M(\tilde{A}) = \int_0^1 \alpha\,(A_\alpha^- + A_\alpha^+)\,d\alpha.
\tag{3}
\]
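To make Definitions 2 and 5 concrete, the sketch below evaluates the possibilistic mean of a triangular fuzzy number (a, b, c) in two ways: by numerical integration of Eq. (3) over the α-cuts, and by the closed form (a + 4b + c)/6 obtained by carrying out that integral by hand. The function names and the sample number are illustrative, not taken from the paper.

```python
def alpha_cut(tri, alpha):
    """Alpha-cut [A_alpha^-, A_alpha^+] of a triangular fuzzy number (a, b, c)."""
    a, b, c = tri
    return a + alpha * (b - a), c - alpha * (c - b)


def possibilistic_mean(tri, steps=10_000):
    """Midpoint-rule approximation of M(A) = int_0^1 alpha (A^- + A^+) d alpha."""
    width = 1.0 / steps
    total = 0.0
    for k in range(steps):
        alpha = (k + 0.5) * width
        lower, upper = alpha_cut(tri, alpha)
        total += alpha * (lower + upper) * width
    return total


def possibilistic_mean_closed_form(tri):
    """Closed form of the same integral for a triangular fuzzy number."""
    a, b, c = tri
    return (a + 4.0 * b + c) / 6.0


demand = (80.0, 100.0, 130.0)  # hypothetical triangular fuzzy number
print(possibilistic_mean(demand), possibilistic_mean_closed_form(demand))
```

The weights 1/6, 2/3, 1/6 on the lower, modal, and upper values are the same ones that reappear in the defuzzified total cost later in the chapter.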

3 Methodology

3.1 Model and Assumptions

The inventory position is reviewed continuously in the (Q, r) continuous review inventory system. When the stock position falls to the reorder point r, an order of quantity Q is placed. The inventory system is said to be in a stock-out state if the inventory level falls to zero at any point in time. Since back-orders and lost sales occur simultaneously in real-world scenarios, the effects of both are included in the model. The following notation is used to develop the model:

Notations

P fixed ordering cost per order,

h holding cost per unit per year,
π stock-out cost per unit stock-out,

π0 marginal profit per unit,

Q order quantity,
r reorder point,
β fraction of demand back-ordered during the stock-out period, (0 ≤ β ≤ 1),
L lead-time (in years),
R(L) lead-time crashing cost,
D̃(ω) annual demand (ω ∈ Ω where (Ω, B, P) is a probability space),
D̃L (ω) lead-time demand (ω ∈ Ω),
x+ max{0, x}.

In a continuous review inventory system, the safety stock or buffer stock is defined as the difference between the reorder point r and the expected lead-time demand. For all practical purposes, no manufacturer wants to have a negative safety stock. Therefore, a nonnegative safety stock criterion is imposed on the model. To keep the safety stock nonnegative, we require $r \ge M(\tilde{D}_L)$, where $M(\tilde{D}_L)$ denotes the expected lead-time demand in the possibilistic sense, defined by

\[
M(\tilde{D}_L) = \int_0^1 \alpha \left( D_{L,\alpha}^- + D_{L,\alpha}^+ \right) d\alpha.
\tag{4}
\]

In order to incorporate fuzziness and randomness simultaneously [11], the annual demand is assumed to be a discrete fuzzy random variable $\tilde{D}(\omega)$ ($\omega \in \Omega$, where $(\Omega, B, P)$ is a probability space). Suppose that the annual customer demand $\tilde{D}(\omega)$ is of the form $\{(\tilde{D}_1, p_1), (\tilde{D}_2, p_2), \ldots, (\tilde{D}_n, p_n)\}$, where each $\tilde{D}_i$ is a triangular fuzzy number of the form $(\underline{D}_i, D_i, \overline{D}_i)$ with corresponding probability $p_i$, i = 1, 2, ..., n. Moreover, any fluctuation of the annual demand is reflected in the lead-time demand. Thus, instead of being an independent parameter, the lead-time demand is assumed to be connected to the annual demand through the length of the lead-time in the following form [11]:

\[
\tilde{D}_L(\omega) = \tilde{D}(\omega) \times L.
\tag{5}
\]

Since the annual demand $\tilde{D}(\omega)$ is a fuzzy random variable taking the values $\tilde{D}_i = (\underline{D}_i, D_i, \overline{D}_i)$ with associated probabilities $p_i$, i = 1, 2, ..., n, the lead-time demand is also a fuzzy random variable. Thus, the lead-time demand is of the form $\tilde{D}_{L,i} = (\underline{D}_{L,i}, D_{L,i}, \overline{D}_{L,i})$ with associated probability $p_i$, i = 1, 2, ..., n. Hence, the expected lead-time demand can be expressed in triangular form as $E(\tilde{D}_L(\omega)) = \tilde{D}_L = (\underline{D}_L, D_L, \overline{D}_L)$. The annual demand $\tilde{D}(\omega)$ and the lead-time demand $\tilde{D}_L(\omega)$ can be represented by their α-cuts as $[\tilde{D}(\omega)]_\alpha = [D_\alpha^-(\omega), D_\alpha^+(\omega)]$ and $[\tilde{D}_L(\omega)]_\alpha = [D_{L,\alpha}^-(\omega), D_{L,\alpha}^+(\omega)]$, where α ∈ [0, 1]. The α-cut representation of the expected lead-time demand is obtained as follows:

\[
D_{L,\alpha}^-(\omega) = D_\alpha^-(\omega) \times L \quad \text{and} \quad D_{L,\alpha}^+(\omega) = D_\alpha^+(\omega) \times L
\tag{6}
\]

\[
\Rightarrow
\begin{cases}
E\left(D_{L,\alpha}^-(\omega)\right) = \left( \sum_{i=1}^{n} D_{i,\alpha}^-\, p_i \right) \times L, \\
E\left(D_{L,\alpha}^+(\omega)\right) = \left( \sum_{i=1}^{n} D_{i,\alpha}^+\, p_i \right) \times L.
\end{cases}
\tag{7}
\]

In this study, the lead-time is a decision variable, and the lead-time crashing cost is assumed to be a negative exponential function [8] of the lead-time, given by

\[
R(L) = \varepsilon e^{-\delta L},
\tag{8}
\]

where the parameters ε and δ can be estimated from known values of the lead-time crashing cost for a few values of L.
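One simple way to estimate ε and δ from a few known (L, R(L)) pairs is a least-squares fit on the logarithmic scale, since log R(L) = log ε − δL is linear in L. The sketch below is only an illustration of that idea; the paper does not say which estimation method is used, and the data points are hypothetical values generated from ε = 156 and δ = 75.

```python
import math


def fit_crashing_cost(lead_times, costs):
    """Least-squares fit of log R = log(eps) - delta * L; returns (eps, delta)."""
    n = len(lead_times)
    logs = [math.log(c) for c in costs]
    mean_x = sum(lead_times) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in lead_times)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lead_times, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Hypothetical observations: lead-times of 2, 4, 6, 8 weeks expressed in years.
lead_times = [2 / 52, 4 / 52, 6 / 52, 8 / 52]
costs = [156.0 * math.exp(-75.0 * L) for L in lead_times]

eps, delta = fit_crashing_cost(lead_times, costs)
print(f"eps = {eps:.4f}, delta = {delta:.4f}")
```

With noisy real data the fit would not be exact, and fitting on the log scale weights relative rather than absolute errors; a direct nonlinear fit is an alternative if absolute errors matter more.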
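A function of a fuzzy random variable is itself a fuzzy random variable; therefore, the total cost function is also a fuzzy random variable. The fuzzy total cost function is given by

\[
\tilde{C}(Q, r, L) = h\left[\frac{Q}{2} + r - \tilde{D}(\omega)L\right] + \left[h(1 - \beta) + \frac{\tilde{D}(\omega)}{Q}\{\pi + \pi_0(1 - \beta)\}\right] M(\tilde{D}_L - r)^+ + \frac{\tilde{D}(\omega)}{Q}\,(R(L) + P),
\tag{9}
\]

where $M(\tilde{D}_L - r)^+$ denotes the expected shortage in each cycle in the possibilistic sense, defined by

\[
M(\tilde{D}_L - r)^+ = \int_0^1 \alpha \left[ \left((\tilde{D}_L - r)^+\right)_\alpha^- + \left((\tilde{D}_L - r)^+\right)_\alpha^+ \right] d\alpha.
\tag{10}
\]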
As mentioned earlier, in a realistic situation, the back-order rate depends on the lead-time. Long lead-times lead to longer stock-out periods, and as a result many customers may not be willing to wait for back-orders. That is, with a longer lead-time the period of shortage gets longer, and as the shortage increases the proportion of customers that can wait goes down, resulting in an overall decrease of the back-order rate. Therefore, we treat the back-order rate β as a decision variable instead of a constant. During the stock-out period, the back-order rate β is a function of the lead-time through the amount of shortage $M(\tilde{D}_L - r)^+$. A larger expected shortage implies a smaller back-order rate. Thus, we take $\beta = \frac{1}{1 + \alpha M(\tilde{D}_L - r)^+}$, where α is the back-order parameter (0 ≤ α < ∞).

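Hence, the fuzzy total cost function can be written as

\[
\tilde{C}(Q, r, L) = \left[h + \pi_0 \frac{\tilde{D}(\omega)}{Q}\right] \frac{\alpha \left(M(\tilde{D}_L - r)^+\right)^2}{1 + \alpha M(\tilde{D}_L - r)^+} + \pi\,\frac{\tilde{D}(\omega)}{Q}\, M(\tilde{D}_L - r)^+ + h\left[\frac{Q}{2} + r - \tilde{D}(\omega)L\right] + \frac{\tilde{D}(\omega)}{Q}\left(P + \varepsilon e^{-\delta L}\right).
\tag{11}
\]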

In real-life situations, the decision maker has to work under a limited budget. Let the cost of each item and the total available budget be c and C, respectively. Since a quantity Q is purchased whenever an order is placed, the following inequality is required to hold:

\[
cQ \le C.
\tag{12}
\]

The information about the cost of each unit of the item and the total budget available is estimated from past data. Let $\hat{c} \sim N(m_c, \sigma^c)$ and $\hat{C} \sim N(m_C, \sigma^C)$ be normally distributed and independent random variables representing the cost of each unit of the item and the total available budget, respectively. Further, the fulfillment of the budget constraint is an individual, organizational decision. The decision maker also allows some relaxation of the restriction; i.e., the two sides of the constraint may be tied with the vague relationship '$\lesssim$', which is the fuzzified version of '≤'. As explained earlier, the decision maker may be more confident selecting the probability level in linguistic terms. Thus, instead of a crisp probability, a fuzzy probability measure, say around p ∈ [0, 1], is attached such that the constraint is satisfied with no less than this imprecise probability level. Because of this, the budgetary constraint (12) may be written in the form of the imprecise chance constraint as [4]

\[
\operatorname{Prob}\left(\hat{c}Q \lesssim \hat{C}\right) \gtrsim p.
\tag{13}
\]

The goal of the decision maker is to determine the optimal order quantity, reorder point, lead-time, and back-order rate in order to minimize the total cost in the fuzzy sense. Since the total cost function is a fuzzy random variable, the expectation of the total cost function is a unique fuzzy number. Let $M(\tilde{C}(Q, r, L)(\omega))$, or simply M(Q, r, L), be the defuzzified representation of the expectation of the total cost. Mathematically, the problem can be formulated in the following optimization form:

\[
(P_3) \quad
\begin{cases}
\min\limits_{Q, r, L} \; M(Q, r, L) \\
\text{such that } \operatorname{Prob}\left(\hat{c}Q \lesssim \hat{C}\right) \gtrsim p, \\
Q, r, L \ge 0;
\end{cases}
\]

where the value of M(Q, r, L) needs to be determined. Therefore, the following quantities must be found in order to obtain the optimal solution for the decision variables:
(i) The expected lead-time demand and the exact expression for the expected shortage $M(\tilde{D}_L - r)^+$ for a given $r \in [\underline{D}_L, \overline{D}_L]$ in the possibilistic sense;
(ii) The expected value of the total cost function, which is a fuzzy random variable, and its defuzzified representation;
(iii) The crisp equivalent form of the imprecise chance constraint;
(iv) The optimal values of the order quantity $Q^*$, reorder point $r^*$, lead-time $L^*$, and back-order rate $\beta^*$ that minimize the total cost.

3.2 Determination of the Expected Shortage

The expected lead-time demand is $\tilde{D}_L = (\underline{D}_L, D_L, \overline{D}_L)$. In order to maintain a nonnegative safety or buffer stock, the lower bound of the reorder point r is $M(\tilde{D}_L)$. When the expected lead-time demand $\tilde{D}_L$ in a cycle is greater than r, there is a shortage of amount $(\tilde{D}_L - r)$. Since the lead-time demand $\tilde{D}_L$ is a triangular fuzzy number, the upper bound of the reorder point r is $\overline{D}_L$. Thus, to determine the expected amount of shortage in each cycle, two situations arise depending on the position of $r \in [\underline{D}_L, \overline{D}_L]$, subject to the condition that the safety or buffer stock is nonnegative.
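Situation 1. For r lying between $\underline{D}_L$ and $D_L$, we have the α-level set of the lead-time demand as [11]

\[
(\tilde{D}_L)_\alpha =
\begin{cases}
[r, D_{L,\alpha}^+], & \alpha \le L(r), \\
[D_{L,\alpha}^-, D_{L,\alpha}^+], & \alpha > L(r),
\end{cases}
\]

which implies

\[
\left((\tilde{D}_L - r)^+\right)_\alpha =
\begin{cases}
[0, D_{L,\alpha}^+ - r], & \alpha \le L(r), \\
[D_{L,\alpha}^- - r, D_{L,\alpha}^+ - r], & \alpha > L(r).
\end{cases}
\tag{14}
\]

Therefore, the possibilistic mean is obtained as follows:

\[
\begin{aligned}
M(\tilde{D}_L - r)^+ &= \int_0^1 \alpha \left[ \left((\tilde{D}_L - r)^+\right)_\alpha^- + \left((\tilde{D}_L - r)^+\right)_\alpha^+ \right] d\alpha \\
&= \int_0^1 \alpha D_{L,\alpha}^+ \, d\alpha + \int_{L(r)}^1 \alpha D_{L,\alpha}^- \, d\alpha - r\left(1 - 0.5 L^2(r)\right).
\end{aligned}
\tag{15}
\]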

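Situation 2. For r lying between $D_L$ and $\overline{D}_L$, we have the α-level set of the lead-time demand as [11]

\[
(\tilde{D}_L)_\alpha =
\begin{cases}
[r, D_{L,\alpha}^+], & \alpha \le R(r), \\
\emptyset, & \alpha > R(r),
\end{cases}
\]

which implies

\[
\left((\tilde{D}_L - r)^+\right)_\alpha =
\begin{cases}
[0, D_{L,\alpha}^+ - r], & \alpha \le R(r), \\
\emptyset, & \alpha > R(r).
\end{cases}
\tag{16}
\]

Therefore, the possibilistic mean is obtained as follows:

\[
\begin{aligned}
M(\tilde{D}_L - r)^+ &= \int_0^1 \alpha \left[ \left((\tilde{D}_L - r)^+\right)_\alpha^- + \left((\tilde{D}_L - r)^+\right)_\alpha^+ \right] d\alpha \\
&= \int_0^{R(r)} \alpha D_{L,\alpha}^+ \, d\alpha - 0.5\, r R^2(r).
\end{aligned}
\tag{17}
\]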
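The two situations can be cross-checked numerically for a triangular expected lead-time demand. The sketch below evaluates the closed forms of the integrals in Eqs. (15) and (17) and compares them with a direct numerical integration of Eq. (10). It assumes that L(r) and R(r) are the values at r of the left and right branches of the membership function of the lead-time demand (they are not defined in this section), and the numbers used are hypothetical.

```python
def shortage_closed_form(r, lo, mid, hi):
    """Expected shortage for a triangular lead-time demand (lo, mid, hi), lo < mid < hi."""
    if not lo <= r <= hi:
        raise ValueError("r must lie in [lo, hi]")
    if r <= mid:  # Situation 1, Eq. (15)
        left = (r - lo) / (mid - lo)                 # assumed meaning of L(r)
        int_upper = hi / 2 - (hi - mid) / 3          # int_0^1 alpha * D+ d alpha
        int_lower = (lo * (1 - left**2) / 2
                     + (mid - lo) * (1 - left**3) / 3)  # int_{L(r)}^1 alpha * D- d alpha
        return int_upper + int_lower - r * (1 - 0.5 * left**2)
    right = (hi - r) / (hi - mid)                    # assumed meaning of R(r)
    int_upper = hi * right**2 / 2 - (hi - mid) * right**3 / 3
    return int_upper - 0.5 * r * right**2            # Situation 2, Eq. (17)


def shortage_numeric(r, lo, mid, hi, steps=20_000):
    """Midpoint-rule evaluation of Eq. (10) using alpha-cuts of (D_L - r)^+."""
    width = 1.0 / steps
    total = 0.0
    for k in range(steps):
        alpha = (k + 0.5) * width
        lower = max(0.0, lo + alpha * (mid - lo) - r)
        upper = max(0.0, hi - alpha * (hi - mid) - r)
        total += alpha * (lower + upper) * width
    return total


for r in (90.0, 100.0, 110.0):  # hypothetical reorder points
    print(r, shortage_closed_form(r, 80.0, 100.0, 130.0),
          shortage_numeric(r, 80.0, 100.0, 130.0))
```

The agreement of the two columns relies on the term $r(1 - 0.5L^2(r))$ in Eq. (15); at r equal to the modal value both situations give the same shortage, as they should.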

3.3 Defuzzification of the Fuzzy Expected Total Cost Function Using Possibilistic Mean Value

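We have obtained the total cost function in (11), which is given by

\[
\tilde{C}(Q, r, L) = h\left[\frac{Q}{2} + r - \tilde{D}(\omega)L\right] + \left[h + \pi_0 \frac{\tilde{D}(\omega)}{Q}\right] \frac{\alpha \left(M(\tilde{D}_L - r)^+\right)^2}{1 + \alpha M(\tilde{D}_L - r)^+} + \pi\,\frac{\tilde{D}(\omega)}{Q}\, M(\tilde{D}_L - r)^+ + \frac{\tilde{D}(\omega)}{Q}\left(P + \varepsilon e^{-\delta L}\right),
\tag{18}
\]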

where $M(\tilde{D}_L - r)^+$ is given by Eq. (15) or (17) according to the position of the reorder point r in the interval $[\underline{D}_L, \overline{D}_L]$. For computational purposes, we defuzzify the fuzzy expected total cost function using its possibilistic mean value. Let $E(\tilde{C}(\omega))$ be the fuzzy expectation of the total cost function. Then, the possibilistic mean value of the fuzzy expected total cost function is given by

\[
M(Q, r, L) = \int_0^1 \alpha \left[ E(C_\alpha^-(\omega)) + E(C_\alpha^+(\omega)) \right] d\alpha.
\tag{19}
\]

Now, the α-level set of E(C̃(ω)) is then given by


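$[E(\tilde{C}(\omega))]_\alpha = E(C_\alpha(\omega)) = [E(C_\alpha^-(\omega)), E(C_\alpha^+(\omega))]$, α ∈ [0, 1], ω ∈ Ω, where

\[
\begin{aligned}
E(C_\alpha^-(\omega)) &= \sum_{i=1}^{n} \left\{ \frac{D_\alpha^-(\omega)}{Q} \left[ P + \pi M(\tilde{D}_L - r)^+ + \pi_0 \frac{\alpha (M(\tilde{D}_L - r)^+)^2}{1 + \alpha M(\tilde{D}_L - r)^+} + \varepsilon e^{-\delta L} \right] \right. \\
&\qquad \left. + h \left[ \frac{Q}{2} + r - D_\alpha^+(\omega) L + \frac{\alpha (M(\tilde{D}_L - r)^+)^2}{1 + \alpha M(\tilde{D}_L - r)^+} \right] \right\} p_i \\
&= \sum_{i=1}^{n} \left\{ \frac{\underline{D}_i + \alpha (D_i - \underline{D}_i)}{Q} \left[ P + \pi M(\tilde{D}_L - r)^+ + \pi_0 \frac{\alpha (M(\tilde{D}_L - r)^+)^2}{1 + \alpha M(\tilde{D}_L - r)^+} + \varepsilon e^{-\delta L} \right] \right. \\
&\qquad \left. + h \left[ \frac{Q}{2} + r - \{\overline{D}_i - \alpha (\overline{D}_i - D_i)\} L + \frac{\alpha (M(\tilde{D}_L - r)^+)^2}{1 + \alpha M(\tilde{D}_L - r)^+} \right] \right\} p_i
\end{aligned}
\tag{20}
\]

and

\[
\begin{aligned}
E(C_\alpha^+(\omega)) &= \sum_{i=1}^{n} \left\{ \frac{D_\alpha^+(\omega)}{Q} \left[ P + \pi M(\tilde{D}_L - r)^+ + \pi_0 \frac{\alpha (M(\tilde{D}_L - r)^+)^2}{1 + \alpha M(\tilde{D}_L - r)^+} + \varepsilon e^{-\delta L} \right] \right. \\
&\qquad \left. + h \left[ \frac{Q}{2} + r - D_\alpha^-(\omega) L + \frac{\alpha (M(\tilde{D}_L - r)^+)^2}{1 + \alpha M(\tilde{D}_L - r)^+} \right] \right\} p_i \\
&= \sum_{i=1}^{n} \left\{ \frac{\overline{D}_i - \alpha (\overline{D}_i - D_i)}{Q} \left[ P + \pi M(\tilde{D}_L - r)^+ + \pi_0 \frac{\alpha (M(\tilde{D}_L - r)^+)^2}{1 + \alpha M(\tilde{D}_L - r)^+} + \varepsilon e^{-\delta L} \right] \right. \\
&\qquad \left. + h \left[ \frac{Q}{2} + r - \{\underline{D}_i + \alpha (D_i - \underline{D}_i)\} L + \frac{\alpha (M(\tilde{D}_L - r)^+)^2}{1 + \alpha M(\tilde{D}_L - r)^+} \right] \right\} p_i.
\end{aligned}
\tag{21}
\]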

Substituting the values of Eqs. (20) and (21) in (19), we find the possibilistic mean
value of the fuzzy expected total cost function M (Q, r, L), which is given by

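$$\begin{aligned}
M(Q, r, L) ={}& \frac{1}{Q}\left[P + \pi M(\tilde{D}_L - r)^{+} + \pi_0\frac{\alpha\left(M(\tilde{D}_L - r)^{+}\right)^2}{1 + \alpha M(\tilde{D}_L - r)^{+}} + \varepsilon e^{-\delta L}\right]\left[\frac{1}{6}\sum_{i=1}^{n}(\underline{D}_i + \overline{D}_i)p_i + \frac{2}{3}\sum_{i=1}^{n}D_i p_i\right] \\
&+ h\left[\frac{Q}{2} + r - \left(\frac{1}{6}\sum_{i=1}^{n}(\underline{D}_i + \overline{D}_i)p_i + \frac{2}{3}\sum_{i=1}^{n}D_i p_i\right)L + \frac{\alpha\left(M(\tilde{D}_L - r)^{+}\right)^2}{1 + \alpha M(\tilde{D}_L - r)^{+}}\right]
\end{aligned} \tag{22}$$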
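
The demand-dependent factor in (22) comes from integrating the two endpoints of the level set of a triangular demand $(\underline{D}_i, D_i, \overline{D}_i)$; the terms that do not involve the level-set variable pass through unchanged because $\int_0^1 2\lambda\, d\lambda = 1$. A short worked step, writing the level-set variable as $\lambda$ (a renaming introduced here only to keep it apart from the back-order parameter α):

```latex
\begin{aligned}
\int_0^1 \lambda\left[D_{i,\lambda}^{-} + D_{i,\lambda}^{+}\right] d\lambda
 &= \int_0^1 \lambda\left[(\underline{D}_i + \overline{D}_i) + \lambda\,(2D_i - \underline{D}_i - \overline{D}_i)\right] d\lambda \\
 &= \tfrac{1}{2}(\underline{D}_i + \overline{D}_i) + \tfrac{1}{3}(2D_i - \underline{D}_i - \overline{D}_i) \\
 &= \tfrac{1}{6}(\underline{D}_i + \overline{D}_i) + \tfrac{2}{3} D_i .
\end{aligned}
```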

3.4 Crisp Equivalent Form of the Imprecise Chance Constraint

The imprecise chance constraint is as follows:

$\mathrm{Prob}(\hat{c}Q \lesssim \hat{C}) \gtrsim p$, where $\hat{c} \sim N(m_c, \sigma_c)$ and $\hat{C} \sim N(m_C, \sigma_C)$ are normally distributed and independent random variables representing the cost of each unit of the item and the total available budget, respectively, and $\lesssim$, $\gtrsim$ denote the fuzzy versions of the inequalities. Since the constraint cannot be handled in this form, it is transformed to its crisp equivalent form using the concept given in [4].
Suppose $\hat{Z} = \hat{c}Q - \hat{C}$. Then, $\hat{Z}$ follows the normal distribution with mean $m_Z$ and standard deviation $\sigma_Z$, where $m_Z = m_c Q - m_C$ and $\sigma_Z = \left(\sigma_c^2 Q^2 + \sigma_C^2\right)^{1/2}$.

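Resorting to the fuzzy ordering between the left- and right-hand sides of '$\lesssim$' inside the parentheses, $\hat{Z}$ is then replaced by its standard normal variable $(\hat{Z} - m_Z)/\sigma_Z$ as follows:

$$\mathrm{Prob}\left(\frac{\hat{Z} - m_Z}{\sigma_Z} \lesssim \frac{-m_Z}{\sigma_Z}\right) \gtrsim p. \tag{23}$$

Now, for a fuzzy event $(Z \lesssim z)$, the following proposition, as proved by [4], holds:

$$F(z) \le \mathrm{Prob}(Z \lesssim z) \le F(z + \Delta z) \tag{24}$$

where $\Delta z$ is the extent of softness permitted and fixed by the decision maker. Therefore, using the result of (24) in (23), we have

$$\mathrm{Prob}\left(\frac{\hat{Z} - m_Z}{\sigma_Z} \lesssim \frac{-m_Z}{\sigma_Z}\right) \le \phi\left(\frac{-m_{Z'}}{\sigma_{Z'}}\right) \tag{25}$$

where $\hat{Z}' = \hat{c}Q - (\hat{C} + \Delta C) \le \hat{Z}$ and $\phi(\cdot)$ is the distribution function of the standard normal variable. Here, $\Delta C$ (non-random) is the range of tolerance allowed and fixed by the decision maker for the fuzzy event $\hat{c}Q \lesssim \hat{C}$. Hence, we get the following fuzzy relation

$$\phi\left(\frac{-m_{Z'}}{\sigma_{Z'}}\right) \gtrsim p. \tag{26}$$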

Assuming the following linear membership function of the above fuzzy relation, with $\Delta p$ assumed to be the range of tolerance permitted,

$$\mu_{\phi(\cdot)}(p) = \begin{cases} 1 & \text{if } \phi(\cdot) > p \\[4pt] \dfrac{\phi(\cdot) - (p - \Delta p)}{\Delta p} & \text{if } p - \Delta p \le \phi(\cdot) \le p \\[4pt] 0 & \text{otherwise} \end{cases} \tag{27}$$

Hence, the crisp equivalent form of the imprecise chance constraint is given as

$$m_{Z'} + \sigma_{Z'}\, \phi^{-1}(p - \Delta p) \le 0. \tag{28}$$
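
The step from (26) and (27) to (28) can be spelled out as follows. The reading used here, namely that the relation counts as satisfied whenever $\phi(\cdot)$ lies on the sloped or flat part of (27), is an assumption about the intended argument, which the text attributes to [4]:

```latex
\phi\!\left(\frac{-m_{Z'}}{\sigma_{Z'}}\right) \ge p - \Delta p
\;\Longleftrightarrow\;
\frac{-m_{Z'}}{\sigma_{Z'}} \ge \phi^{-1}(p - \Delta p)
\;\Longleftrightarrow\;
m_{Z'} + \sigma_{Z'}\,\phi^{-1}(p - \Delta p) \le 0 .
```

The first equivalence uses the monotonicity of the standard normal distribution function, and the second uses $\sigma_{Z'} > 0$.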

3.5 Optimal Solution

Our main goal is to find the optimal solution. To find the optimal order quantity, reorder point, lead-time, and back-order rate for decision making, the following steps are carried out.
Step (i): Input the values of $P$, $h$, $\pi$, $\pi_0$, $\hat{c}$, $\hat{C}$, $p$, $\alpha$, $\varepsilon$, $\delta$.
Step (ii): Calculate the possibilistic mean value of the fuzzy expected shortage using either (15) or (17) with the condition $0 \le L(r) \le 1$ and $0 \le R(r) \le 1$, respectively.
Step (iii): Determine the safety stock criterion, i.e., $r - M(\tilde{D}_L) \ge 0$.
Step (iv): Calculate the possibilistic mean value of the fuzzy expected total cost from (22).
Step (v): Find the crisp equivalent form of the imprecise chance constraint using (28).
Step (vi): Use Lingo, Lindo, or Mathematica to solve the following minimization problem

$$(P_3)\quad \begin{cases} \min\limits_{Q,\,r,\,L}\ M(Q, r, L) \\ \text{such that } m_{Z'} + \sigma_{Z'}\,\phi^{-1}(p - \Delta p) \le 0 \\ r \ge M(\tilde{D}_L) \\ r \le \overline{D}_L \\ Q, r, L \ge 0. \end{cases}$$

4 Numerical Example

A leather goods company in a city, say X Leather Private Limited, sells a particular type of handbag. The cost of placing an order is assumed to be Rs. 200. The holding cost is Rs. 20 per item per year. The fixed penalty cost for the shortage is Rs. 50, and the cost of lost sales including marginal profit is Rs. 100. Suppose it is estimated that the expense of each handbag is normally distributed with mean Rs. 375 and standard deviation Rs. 5. The total budget available to the company is also normally distributed with mean Rs. 30,000 and standard deviation Rs. 75. The lead-time reduction cost is a negative exponential function of the lead-time, i.e., $R(L) = \varepsilon e^{-\delta L}$ with $\varepsilon = 156$ and $\delta = 114$. The manager of X Private Limited is quite satisfied if the budgetary constraint is met with a probability of 'around 0.8'. The information about annual demand is given in Table 1.

Table 1 Demand information

Demand               Probability
(825, 1130, 1270)    0.25
(775, 977, 1275)     0.22
(1120, 1325, 1450)   0.27
(1240, 1352, 1560)   0.26

The problem is then to determine the optimal order quantity $Q^*$, reorder point $r^*$, lead-time $L^*$, and back-order rate $\beta^*$ in such a way that the expected annual inventory cost incurred is minimum. From the above problem, we have the order cost $P = 200$, the inventory holding cost $h = 20$, the fixed shortage cost $\pi = 50$, the marginal profit $\pi_0 = 100$, the lead-time reduction cost $R(L) = 156e^{-114L}$, the cost of each handbag $\hat{c} \sim N(375, 5)$, the total budget $\hat{C} \sim N(30000, 75)$ and the probability $p$ = 'around 0.8'. Thus, the expected lead-time demand and the possibilistic mean value of the lead-time demand are given by

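$$\tilde{D}_L = (1001.55,\ 1206.71,\ 1373.1)L \quad \text{and} \tag{29}$$

$$M(\tilde{D}_L) = \left[\frac{1}{6}\sum_{i=1}^{n}(\underline{D}_i + \overline{D}_i)p_i + \frac{2}{3}\sum_{i=1}^{n}D_i p_i\right]L = 1200.248L. \tag{30}$$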

Then, the defuzzified fuzzy expected total cost function is obtained as

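$$\begin{aligned}
M(Q, r, L) ={}& 20\left[\frac{Q}{2} + r - 1200.248L + \frac{\alpha\left(M(\tilde{D}_L - r)^{+}\right)^2}{1 + \alpha M(\tilde{D}_L - r)^{+}}\right] \\
&+ \frac{1200.248}{Q}\left[200 + 50M(\tilde{D}_L - r)^{+} + 100\,\frac{\alpha\left(M(\tilde{D}_L - r)^{+}\right)^2}{1 + \alpha M(\tilde{D}_L - r)^{+}} + 156e^{-114L}\right]
\end{aligned} \tag{31}$$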

Thus, mathematically, we need to solve the following optimization problem for determining the optimal solutions:

$$(P_4)\quad \begin{cases} \min\limits_{Q,\,r,\,L}\ M(Q, r, L) \\ \text{such that } 140607.29Q^2 - 22575000Q + 906006016 \ge 0 \\ r \ge 1200.248L \\ r \le 1373.1L \\ \dfrac{r - 1001.55L}{205.16L} \le 1 \\ Q, r, L \ge 0. \end{cases}$$

Table 2 Optimal solutions of optimization problem (P4) for different values of α

α      Q∗         r∗         β∗          L∗ (in years)   R(L)       Total cost
0.0    79.36030   26.52812   1.0000000   0.02198384      12.00000   4467.313
0.5    79.36030   20.88508   0.8064689   0.01730796      20.85363   4641.311
1.0    79.36030   18.99215   0.6961684   0.01573879      24.93852   4730.848
10     79.36030   15.20582   0.2225064   0.01260105      35.66305   5039.961
∞      79.36030   14.84971   0.0000000   0.01230594      36.88325   5158.792

where

$$M(\tilde{D}_L - r)^{+} = (1200.248L - r) + (r - 1001.55L)^2\left(\frac{1.18791\times 10^{-5}\, r}{L^2} - \frac{0.01187542}{L}\right) - \frac{7.91942\times 10^{-6}}{L^2}(r - 1001.55L)^3,$$

$\Delta p = 0.01$ and $\Delta C = 100$. For the different values of α, the optimal solutions are presented in Table 2. The numerical solutions show that as the back-order parameter α increases, the back-order rate decreases, and as the back-order rate decreases, the total cost increases. It is also observed that the lead-time crashing cost increases as the length of the lead-time declines.
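
The budget constraint in (P4) is the squared form of the crisp constraint (28) with the data of this example. The following minimal sketch evaluates the left-hand side of (28) and finds the largest feasible order quantity by bisection. The function names are hypothetical, and the quantile level is left as a parameter: passing 0.8 together with ΔC = 100 gives a value close to the Q∗ ≈ 79.36 reported in every row of Table 2, which is an observation from the arithmetic rather than a statement in the text.

```python
import math
from statistics import NormalDist


def budget_slack(q, m_c, s_c, m_budget, s_budget, delta_c, level):
    """Left-hand side of the crisp constraint (28) for order quantity q."""
    mean_z = m_c * q - (m_budget + delta_c)            # mean of c*Q - (C + delta_C)
    sd_z = math.sqrt((s_c * q) ** 2 + s_budget ** 2)   # independent normal terms
    return mean_z + sd_z * NormalDist().inv_cdf(level)


def largest_feasible_q(m_c, s_c, m_budget, s_budget, delta_c, level, upper=1e4):
    """Bisection for the largest q with budget_slack(q) <= 0.

    Assumes level >= 0.5, so that the slack is increasing in q,
    and that the slack is positive at `upper`.
    """
    lo, hi = 0.0, upper
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if budget_slack(mid, m_c, s_c, m_budget, s_budget, delta_c, level) <= 0.0:
            lo = mid
        else:
            hi = mid
    return lo


# Data of the numerical example; level=0.8 is an assumed reading of 'around 0.8'.
q_max = largest_feasible_q(375.0, 5.0, 30000.0, 75.0, 100.0, level=0.8)
print(round(q_max, 2))
```

Because the cost in (31) is minimized subject to this constraint, a binding budget explains why Q∗ is the same for every value of α in Table 2.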

5 Conclusions

In this paper, we have proposed a fuzzy random continuous review inventory system
with a mixture of back-orders and lost sales. The model is developed under the
consideration that the order quantity, reorder point, back-order rate, and lead-time
are the decision variables. The lead-time is controlled through a negative exponential crashing cost function of the lead-time, and the back-order rate is controlled through a function of the expected amount of shortage, both within the fuzzy random framework. We have considered the annual demand as a fuzzy random variable to
capture the fuzziness and randomness simultaneously. To quantify the imprecise
information, a budgetary constraint has been imposed on the model in the form of an
imprecise chance constraint. We developed a methodology for obtaining the optimum
decision-making variables in such a way that the total annual cost is minimized in the
fuzzy sense. A numerical example has illustrated the proposed methodology. In future
research on this model, it would be interesting to deal with imprecise probabilities.
On the other hand, a possible extension of this model can be achieved by inclusion
of the service-level constraint.

References

1. Abdel-Malek, Layek L., Montanari, R.: An analysis of the multi-product newsboy problem
with a budget constraint. Int. J. Prod. Econ. 97(3), 296–307 (2005)
2. Ben-Daya, M., Raouf, A.: Inventory models involving lead time as decision variable. J. Oper.
Res. Soc. 45(5), 579–582 (1994)

3. Carlsson, C., Fuller, R.: On possibilistic mean value and variance of fuzzy numbers. Fuzzy
Sets Syst. 122(2), 315–326 (2001)
4. Chakraborty, D.: Redefining chance-constrained programming in fuzzy environment. Fuzzy
Sets Syst. 125(3), 327–333 (2002)
5. Chang, H.C., Yao, J.S., Ouyang, L.Y.: Fuzzy mixture inventory model involving fuzzy random
variable lead time demand and fuzzy total demand. Eur. J. Oper. Res. 169(1), 65–80 (2006)
6. Charnes, A., Cooper, W.W.: Chance-constrained programming. Manag. Sci. 6(1), 73–79 (1959)
7. Dey, O., Giri, B.C., Chakraborty, D.: A fuzzy random continuous review inventory model with
a mixture of backorders and lost sales under imprecise chance constraint. Int. J. Oper. Res.
26(1), 34–51 (2016)
8. Dey, O., Chakraborty, D.: A fuzzy random periodic review system with variable lead-time and
negative exponential crashing cost. Appl. Math. Model. 36(12), 6312–6322 (2012)
9. Dey, O., Chakraborty, D.: A fuzzy random periodic review system: a technique for real-life
application. Int. J. Oper. Res. 13(4), 395–405 (2012)
10. Dey, O., Chakraborty, D.: Fuzzy periodic review system with fuzzy random variable demand.
Eur. J. Oper. Res. 198(1), 113–120 (2009)
11. Dutta, P., Chakraborty, D.: Continuous review inventory model in mixed fuzzy and stochastic
environment. Appl. Math. Comput. 188(1), 970–980 (2007)
12. Dutta, P., Chakraborty, D., Roy, A.R.: A single-period inventory model with fuzzy random
variable demand. Math. Comput. Model. 41(8–9), 915–922 (2005)
13. Gen, M., Tsujimura, Y., Zheng, D.: An application of fuzzy set theory to inventory control
models. Comput. Ind. Eng. 33(3), 553–556 (1997)
14. Glock, C.H.: Lead time reduction strategies in a single-vendor-single-buyer integrated inventory model with lot size-dependent lead times and stochastic demand. Int. J. Prod. Econ. 136(1), 37–44 (2012)
15. Hadley, G., Whitin, T.M.: Analysis of Inventory Systems. Prentice-Hall, Englewood Cliffs, NJ
(1963)
16. Kim, O.H., Park, K.S.: (Q, r) inventory model with a mixture of lost sales and weighted back-
orders. J. Oper. Res. Soc. 36, 231–238 (1985)
17. Kumar, R.S., Goswami, A.: A continuous review production-inventory system in fuzzy random
environment: minmax distribution free procedure. Comput. Ind. Eng. 79(1), 65–75 (2015)
18. Kundu, A., Chakrabarti, T.: A multi-product continuous review inventory system in stochastic
environment with budget constraint. Optim. Lett. 6(2), 299–313 (2012)
19. Kwakernaak, H.: Fuzzy random variables—I. Definitions and theorems. Inform. Sci. 15(1),
1–29 (1978)
20. Lee, W.C.: Inventory model involving controllable back-order rate and variable lead time
demand with the mixture of distribution. Appl. Math. Comput. 160(3), 701–717 (2005)
21. Liao, C.J., Shyu, C.H.: An analytical determination of lead time with normal demand. Int. J.
Oper. Prod. Manage. 11(9), 72–78 (1991)
22. Lin, H.J.: A stochastic periodic review inventory model with back-order discounts and ordering
cost dependent on lead time for the mixtures of distributions. Top 23(2), 386–400 (2015)
23. Montgomery, D.C., Bazaraa, M.S., Keswani, A.K.: Inventory models with a mixture of back-
orders and lost sales. Naval Res. Logistics Q. 20(2), 255–263 (1973)
24. Moon, I., Silver, E.A.: The multi-item newsvendor problem with a budget constraint and fixed
ordering costs. J. Oper. Res. Soc. 51(5), 602–608 (2000)
25. Moon, I., Choi, S.: A note on lead time and distributional assumptions in
continuous review inventory models. Comput. Oper. Res. 25(11), 1007–1012 (1998)
26. Naddor, E.: Inventory Systems. Wiley, New York (1966)
27. Ouyang, L.Y., Yao, J.S.: A minimax distribution free procedure for mixed inventory model
involving variable lead time with fuzzy demand. Comput. Oper. Res. 29(5), 471–487 (2002)
28. Ouyang, L.Y., Chuang, B.R.: Mixture inventory model involving variable lead time and con-
trollable back-order rate. Comput. Ind. Eng. 40(4), 339–348 (2001)
29. Ouyang, L.Y., Chang, H.C.: Impact of investing in quality improvement on (Q, r, L) model
involving the imperfect production process. Prod. Plan. Control 11(6), 598–607 (2000)

30. Ouyang, L.Y., Yeh, N.C., Wu, K.S.: Mixture inventory model with back-orders and lost sales
for variable lead time. J. Oper. Res. Soc. 47, 829–832 (1996)
31. Padmanabhan, G., Vrat, P.: Inventory models with a mixture of backorders and lost sales. Int.
J. Syst. Sci. 21(8), 1721–1726 (1990)
32. Park, K.S.: Fuzzy-set theoretic interpretation of economic order quantity. IEEE Trans. Syst.
Man Cybern. 17(6), 1082–1084 (1987)
33. Sarkar, B., Moon, I.: Improved quality, setup cost reduction, and variable backorder costs in
an imperfect production process. Int. J. Prod. Econ. 155, 204–213 (2014)
34. Shekarian, E., Kazemi, N., Rashid, S.H.A., Olugu, E.U.: Fuzzy inventory models: a compre-
hensive review. Appl. Soft Comput. 45(2–3), 260–264 (2017)
35. Tersine, R.J.: Principles of Inventory and Materials Management. Prentice Hall, Englewood
Cliffs, NJ (1994)
36. Tütüncü, G.Y., Aköz, O., Apaydın, A., Petrovic, D.: Continuous review inventory control in
the presence of fuzzy costs. Int. J. Prod. Econ 113(2), 775–784 (2008)
37. Vijayan, T., Kumaran, M.: Inventory models with mixture of back-orders and lost sales under
fuzzy cost. Eur. J. Oper. Res. 189(1), 105–119 (2008)
38. Zimmermann, H.J.: Fuzzy Set Theory and its Applications. Springer Science & Business Media
(2011)
Chapter 22
Estimation of the Location Parameter
of a General Half-Normal Distribution

Lakshmi Kanta Patra, Somesh Kumar and Nitin Gupta

Abstract In this paper, estimation of the location parameter of a half-normal distribution is considered. Some unbiased as well as biased estimators are derived. Admissibility and minimaxity of the Pitman estimator are proved. A complete class of estimators among multiples of the maximum likelihood estimator is obtained. We develop a one-sided asymptotic confidence interval for the location parameter. Numerical comparisons of the percentage risk improvements of various estimators over the maximum likelihood estimator are carried out.

Keywords Half-normal distribution · Generalized Bayes estimator · Pitman estimator · Admissible estimator · Minimax estimator

1 Introduction

If $Z$ is a standard normal random variable, then $Y = |Z|$ follows a standard half-normal distribution. The half-normal distribution is a special case of the folded normal and truncated normal distributions ([12], pp. 156, 170). Also, if $W$ has a chi-square distribution with one degree of freedom, then $Y = \sqrt{W}$ follows a standard half-normal distribution.
Let $X = \eta Y + \xi$; then $X$ follows a general half-normal distribution, and the probability density function of $X$ is given by

$$f_X(x|\xi, \eta) = \frac{1}{\eta}\sqrt{\frac{2}{\pi}}\exp\left\{-\frac{(x - \xi)^2}{2\eta^2}\right\}, \quad x > \xi,\ -\infty < \xi < \infty,\ \eta > 0. \tag{1}$$

The general half-normal distribution is a special case of the generalized gamma distribution and also of the two-parameter chi-square distribution. This distribution was first introduced by Daniel [6] with ξ = 0. He introduced the half-normal plot for interpreting two-level factorial experiments. This distribution is also an important limiting distribution of Azzalini's three-parameter skew-normal class, introduced in [2]. The half-normal distribution has also been used for modeling truncated data. It has applications in several areas such as stochastic frontier modeling [1], sports science and physiology [15, 17]. Unbiased estimators for the location and scale parameters of a general half-normal distribution are proposed by Nogales and Perez [14]. They have numerically shown that the proposed estimators perform better than other existing estimators in the literature. Using the proposed estimators, they derived large sample confidence intervals for the location and scale parameters.

L. K. Patra (✉)
Indian Institute of Information Technology Ranchi, Ranchi, India
e-mail: [email protected]
S. Kumar · N. Gupta
Indian Institute of Technology Kharagpur, Kharagpur 721302, India
© Springer Nature Singapore Pte Ltd. 2018
D. Ghosh et al. (eds.), Mathematics and Computing, Springer Proceedings in Mathematics & Statistics 253, https://doi.org/10.1007/978-981-13-2095-8_22

The data set introduced in [5] consists of percentage body fat measurements
made on 202 elite athletes who trained at the Australian Institute of Sports. The
data on the measurements of male athletes is seen to follow a general half-normal
distribution. Azzalini and Capitanio [3] and Pewsey [15] have shown that for highly
skewed data, the maximum likelihood estimator of a fitted skew-normal distribution
often corresponds to a general half-normal distribution. So, inferential results for a
half-normal distribution have relevance in modeling of the skewed data.
First, we state some results which are already available in the literature. Let
X = (X1 , . . . , Xn ) be a random sample from a general half-normal distribution with
the density (1). Pewsey [16] considered a general half-normal distribution and showed
that the maximum likelihood estimators (MLEs) of the location parameter ξ and the scale parameter η are $\hat{\xi}_{ML} = X_{(1)}$ and $\hat{\eta}_{ML} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(X_i - X_{(1)})^2}$, respectively, where $X_{(1)} = \min\{X_1, \ldots, X_n\}$. These estimators are biased. He also derived large sample
confidence intervals for the parameters. The bias-corrected estimator of the param-
eters of a general half-normal distribution based on maximum likelihood estimators
was considered by Pewsey [17]. He also constructed a bias-corrected confidence
interval for the location parameter. The bias-corrected estimators of η and ξ are
given as

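$$\hat{\eta}_{BC} = \sqrt{\frac{n}{n-1}}\,\hat{\eta}_{ML} \quad \text{and} \quad \hat{\xi}_{BC} = \hat{\xi}_{ML} - \Phi^{-1}\left(\frac{1}{2} + \frac{1}{2n}\right)\hat{\eta}_{BC} \tag{2}$$

respectively, where Φ(.) is the cumulative distribution function of a standard normal distribution. We also denote by φ(.) the probability density function of a standard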
normal distribution. Bayes estimation of the parameters of a general half-normal
distribution is studied by Farsipour and Rasouli [7]. Wiper et al. [19] derived Bayes
estimators for the parameters of a general half-normal distribution with the loca-
tion parameter ξ and scale parameter η. They considered a non-informative prior
f (ξ, τ ) ∝ 1/τ , where ξ ∈ R, τ > 0 and τ = 1/η 2 . For this prior, the joint posterior
distribution of ξ and τ is a right-truncated normal-gamma distribution. Wiper et al.
[19] showed that the marginal distribution of ξ is a truncated t-distribution and the
marginal distribution of τ is a Gaussian-modulated gamma distribution. Finally, they
have numerically compared the bias and root-mean-squared errors of the proposed
estimators using simulation.

To the best of our knowledge, the decision theoretic properties like admissibility
and minimaxity have not been explored for estimating parameters of a half-normal
distribution. In this paper, we first prove a complete class result for estimating the
location parameter when the scale parameter is known. The admissibility and mini-
maxity properties of a generalized Bayes estimator are established. The generalized
Bayes estimator is also shown to be a limit of Bayes rules and is also seen to perform
very well in terms of the mean squared error.
The organization of the paper is as follows. In Sect. 2, estimation of the location parameter ξ is considered when η is known. Some biased and unbiased estimators of ξ are derived, and a complete class result is established. In Sect. 3, an asymptotic confidence interval for ξ is constructed. In Sect. 4, we prove that the Pitman estimator is the same as a generalized Bayes estimator and that it is also a limit of Bayes rules; the minimaxity and admissibility of the Pitman estimator are established. In Sect. 5, a simulation study is carried out to numerically compare the performance of various estimators.

2 Unbiased Estimation and a Complete Class Result

We consider the estimation of the location parameter ξ when the scale parameter η is known. Without loss of generality, we take η = 1. The probability density function and the cumulative distribution function of the random variable $X$ following a half-normal distribution are given by

$$f_X(x|\xi) = \sqrt{\frac{2}{\pi}}\exp\left\{-\frac{(x - \xi)^2}{2}\right\}, \quad x > \xi,\ -\infty < \xi < \infty \tag{3}$$

and

$$F_X(x|\xi) = \begin{cases} 2\Phi(x - \xi) - 1, & \text{if } x > \xi \\ 0, & \text{if } x \le \xi \end{cases}$$

respectively.
Let $X = (X_1, \ldots, X_n)$ be a random sample from this distribution. We consider the problem of estimating ξ with respect to the squared error loss function

$$L(\xi, \delta) = (\xi - \delta)^2. \tag{4}$$

Note that the method of moments estimator (MME) of ξ is $T_1 = \bar{X} - \sqrt{2/\pi}$. This is also unbiased for ξ. Further, the maximum likelihood estimator (MLE) of ξ is $\hat{\xi}_{ML} = X_{(1)}$. The joint density function of $X$ is
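
$$\begin{aligned}
f_X(x|\xi) &= \left(\frac{2}{\pi}\right)^{n/2}\exp\left\{-\sum_{i=1}^{n}(x_i - \xi)^2/2\right\}, \quad x_i > \xi,\ -\infty < \xi < \infty \\
&= \left(\frac{2}{\pi}\right)^{n/2} e^{-\sum_{i=1}^{n} x_i^2/2}\, e^{n\xi\left(\bar{x} - \frac{\xi}{2}\right)} \prod_{i=2}^{n} I_{(x_{(1)},\infty)}(x_{(i)})\, I_{(\xi,\infty)}(x_{(1)}),
\end{aligned} \tag{5}$$

where $x = (x_1, x_2, \ldots, x_n)$ and $I(\cdot)$ is the indicator function. By the factorization theorem, a sufficient statistic for the above family of distributions is $T = (\bar{X}, X_{(1)})$.
Now, we show that $T$ is not complete. For this, we find a function $g(t)$ such that $E_\xi(g(T)) = 0$ for all $\xi \in \mathbb{R}$, but $P_\xi(g(T) = 0) \ne 1$ for some $\xi \in \mathbb{R}$.
The density function of $X_{(1)}$ is

$$f_{X_{(1)}}(y|\xi) = \begin{cases} 2^n\, n\, (\Phi(\xi - y))^{n-1}\,\phi(\xi - y), & \text{if } y > \xi \\ 0, & \text{if } y \le \xi. \end{cases} \tag{6}$$

It is seen that $E(X_{(1)}) = \xi + Q_n$, where

$$Q_n = 2^n\int_{-\infty}^{0}(\Phi(z))^n\, dz. \tag{7}$$

Consequently, $E_\xi(X_{(1)} - Q_n) = \xi$, and so $T_0 = X_{(1)} - Q_n$ is an unbiased estimator of ξ.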
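
The constant $Q_n$ in (7) has no elementary closed form for general $n$, but it is easy to evaluate numerically. The sketch below uses composite Simpson's rule with a truncated lower limit; the function names, the truncation point and the number of steps are choices made for this illustration and are not taken from the paper.

```python
import math


def q_n(n, lower=-12.0, steps=6000):
    """Q_n = 2^n * integral_{-inf}^{0} Phi(z)^n dz from (7).

    Composite Simpson's rule on [lower, 0]; `steps` must be even.
    Uses 2 * Phi(z) = erfc(-z / sqrt(2)), so (2 * Phi(z))^n stays in [0, 1].
    """
    h = -lower / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        z = lower + k * h
        total += weight * math.erfc(-z / math.sqrt(2.0)) ** n
    return total * h / 3.0


def t0(sample):
    """Unbiased estimator T0 = X_(1) - Q_n (scale eta = 1)."""
    return min(sample) - q_n(len(sample))


# For n = 1 the minimum is the observation itself, so Q_1 = sqrt(2/pi).
print(q_n(1), math.sqrt(2.0 / math.pi))
```

The check for n = 1 ties the quadrature back to the mean of the standard half-normal distribution used in the method of moments estimator $T_1$.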
If we take

$$g(T) = X_{(1)} - Q_n - \bar{X} + \sqrt{\frac{2}{\pi}}, \tag{8}$$

then $E_\xi(g(T)) = 0$ for all $\xi \in \mathbb{R}$, but $g(T) \ne 0$ with probability 1. This proves that $T$ is not complete.
Now, we define a new unbiased estimator of ξ as Tα = αT1 + (1 − α)T0 , where
α ∈ R. Note that

Vξ (Tα ) = Eξ (Tα − ξ)2 = Eξ (αT1 + (1 − α)T0 − ξ)2 .

The choice of α which minimizes $V_\xi(T_\alpha)$ is

$$\alpha(n) = \frac{E_\xi(T_0^2 - T_1 T_0)}{E_\xi(T_1 - T_0)^2}.$$

Note that α(n) does not depend on ξ. So, we get the following result.
Lemma 1 The estimator $T_{\alpha(n)}$ is the best estimator in the class of estimators $\{T_\alpha : \alpha \in \mathbb{R}\}$ for estimating ξ with respect to the squared error loss function (4).

Remark 1 The minimizing choice α(n) depends on the sample size n. In Table 1, we report values of α(n) for various choices of n. These values have been evaluated using simulation of half-normal random variables based on 50,000 replications. From the table, we note that α(n) decreases as n increases. It is also seen from the simulated values that α(n) always lies between 0 and 1.

Table 1 Values of α(n)

n      5         10        20        30        40        50        100       200       500
α(n)   0.16670   0.10684   0.06358   0.04651   0.03565   0.02917   0.01581   0.00755   0.003167

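
The simulation described in Remark 1 can be sketched as follows. This is an illustrative Monte Carlo estimate under ξ = 0 and η = 1, not the authors' code: the function names, seed and number of replications are assumptions, and $Q_n$ is evaluated by numerical quadrature.

```python
import math
import random


def q_n(n, lower=-12.0, steps=2000):
    """Q_n from (7) by composite Simpson's rule (steps must be even)."""
    h = -lower / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * math.erfc(-(lower + k * h) / math.sqrt(2.0)) ** n
    return total * h / 3.0


def estimate_alpha(n, reps=20000, seed=0):
    """Monte Carlo estimate of alpha(n) = E(T0^2 - T1*T0) / E(T1 - T0)^2."""
    rng = random.Random(seed)
    qn = q_n(n)
    shift = math.sqrt(2.0 / math.pi)
    num = den = 0.0
    for _ in range(reps):
        xs = [abs(rng.gauss(0.0, 1.0)) for _ in range(n)]  # standard half-normal
        t1 = sum(xs) / n - shift   # method of moments estimator
        t0 = min(xs) - qn          # bias-removed minimum
        num += t0 * t0 - t1 * t0
        den += (t1 - t0) ** 2
    return num / den


alpha_5 = estimate_alpha(5)
print(alpha_5)
```

With enough replications the estimate for n = 5 should fall in the neighbourhood of the first entry of Table 1, up to simulation error.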
Next, we consider a class of estimators of ξ of the form δc = X(1) + c, where c is
a real number. The following lemma follows immediately.

Lemma 2 The unbiased estimator T0 is the best estimator of ξ in the class of esti-
mators {δc = X(1) + c : c ∈ R} for estimating ξ with respect to squared error loss
function (4).

2.1 A Complete Class Result

Definition 1 For estimating θ ∈ Θ, a class of estimators $D$ is said to be complete if, for any estimator $\delta_1$ not in $D$, there exists an estimator $\delta_0 \in D$ such that

$R(\theta, \delta_0) \le R(\theta, \delta_1)$ ∀ θ ∈ Θ and $R(\theta, \delta_0) < R(\theta, \delta_1)$ for at least one θ ∈ Θ.

Here, $R(\cdot, \cdot)$ is the risk function with respect to a given loss function.

Here, we prove a complete class result for the location parameter ξ when ξ > 0. Maximizing the likelihood function over the restricted parameter space ξ > 0, we find that the MLE of ξ is $X_{(1)}$. Consider estimators of the form

$$\delta_c(X) = cX_{(1)} \tag{9}$$

where $c$ is a positive constant. Now

$$E(\delta_c) = c(\xi + Q_n), \quad \text{and} \quad E(\delta_c^2) = c^2(\xi^2 + 2\xi Q_n - R_n),$$

where

$$R_n = 2^{n+1}\int_{-\infty}^{0} z\,(\Phi(z))^n\, dz,$$

and Qn is given by (7). Thus, we get the risk function of δc with respect to the loss
(4) as

R(δc , ξ) = c2 (ξ 2 + 2ξQn − Rn ) − 2ξc(ξ + Qn ) + ξ 2 .

The choice of c which minimizes R(δc , ξ) is

$$c(\xi) = \frac{\xi(\xi + Q_n)}{\xi^2 + 2\xi Q_n - R_n}.$$
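
The minimizing constant follows from setting the derivative of the quadratic risk to zero. A short worked step using only the quantities defined above:

```latex
\frac{\partial R(\delta_c, \xi)}{\partial c}
 = 2c\,(\xi^2 + 2\xi Q_n - R_n) - 2\xi(\xi + Q_n) = 0
\quad\Longrightarrow\quad
c(\xi) = \frac{\xi(\xi + Q_n)}{\xi^2 + 2\xi Q_n - R_n},
\qquad
\frac{\partial^2 R(\delta_c, \xi)}{\partial c^2} = 2\,E\!\left(X_{(1)}^2\right) > 0 .
```

The second derivative is positive because $\xi^2 + 2\xi Q_n - R_n = E(X_{(1)}^2)$, so the stationary point is the minimum.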

Note that

$$\inf_{0 \le \xi < \infty} c(\xi) = 0 \quad \text{and} \quad \sup_{0 \le \xi < \infty} c(\xi) = 1.$$

Since the risk function $R(\delta_c, \xi)$ is convex in $c$ for every ξ, it can be seen that if $c < 0$, then the estimator $\delta_0$ improves upon $\delta_c$; and if $c > 1$, then the estimator $\delta_1$ improves upon $\delta_c$. Further, an estimator $\delta_c$ with $0 \le c \le 1$ cannot be improved upon by any other estimator of the form (9). (This technique was first developed by [4].) This proves the following theorem.
Theorem 1 The class of estimators $\{\delta_c : 0 < c \le 1\}$ forms a complete class for estimating ξ among all estimators of the form (9) when the loss function is the squared error loss function.

3 Asymptotic Confidence Interval for ξ

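In this section, we will construct an asymptotic confidence interval for ξ based on the unbiased estimator $T_\alpha = \alpha T_1 + (1 - \alpha)T_0$ given in Sect. 2. Let Exp(σ) denote an exponential distribution with the scale parameter σ. Pewsey [16] has shown that

$$\frac{X_{(1)} - \xi}{\Phi^{-1}\left(\frac{1}{2} + \frac{1}{2n}\right)}$$

has asymptotically an Exp(1) distribution. Therefore, the limiting distribution of

$$\frac{(1 - \alpha)\left(X_{(1)} - \xi\right)}{\Phi^{-1}\left(\frac{1}{2} + \frac{1}{2n}\right)}$$

is Exp(1 − α).
By the weak law of large numbers, $\bar{X}$ converges in probability to $\xi + \sqrt{2/\pi}$. Hence, $\bar{X} - \xi - \sqrt{2/\pi}$ converges in probability to 0. Now, $Q_n$ goes to 0 and $\Phi^{-1}\left(\frac{1}{2} + \frac{1}{2n}\right)$ goes to $\Phi^{-1}\left(\frac{1}{2}\right)$ as $n$ goes to ∞. Hence by Slutsky's Lemma (see [10]),

$$\frac{\alpha\left(\bar{X} - \xi - \sqrt{2/\pi}\right)}{\Phi^{-1}\left(\frac{1}{2} + \frac{1}{2n}\right)}$$

converges in probability to 0.
Now, we have

$$\frac{T_\alpha - \xi}{\Phi^{-1}\left(\frac{1}{2} + \frac{1}{2n}\right)} = \frac{\alpha T_1 + (1 - \alpha)T_0 - \xi}{\Phi^{-1}\left(\frac{1}{2} + \frac{1}{2n}\right)}.$$

This can be further written as

$$\frac{T_\alpha - \xi}{\Phi^{-1}\left(\frac{1}{2} + \frac{1}{2n}\right)} = \frac{\alpha\left(\bar{X} - \sqrt{2/\pi} - \xi\right)}{\Phi^{-1}\left(\frac{1}{2} + \frac{1}{2n}\right)} + \frac{(1 - \alpha)(X_{(1)} - \xi)}{\Phi^{-1}\left(\frac{1}{2} + \frac{1}{2n}\right)} - \frac{(1 - \alpha)Q_n}{\Phi^{-1}\left(\frac{1}{2} + \frac{1}{2n}\right)}. \tag{10}$$

Clearly, from (10) it follows that $(T_\alpha - \xi)/\Phi^{-1}\left(\frac{1}{2} + \frac{1}{2n}\right)$ converges in distribution to a random variable $Y$ having an Exp(1 − α) distribution.
So, a one-tailed asymptotic 100(1 − γ)% confidence interval for ξ is given by

$$(1 - \alpha)\log(\gamma)\,\Phi^{-1}\left(\frac{1}{2} + \frac{1}{2n}\right) + T_\alpha \le \xi \le X_{(1)}. \tag{11}$$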
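
The interval (11) is straightforward to compute for a given sample. The following is a minimal sketch for η = 1; the function names and the example sample are hypothetical, and $Q_n$ is obtained by numerical quadrature of (7).

```python
import math
from statistics import NormalDist


def q_n(n, lower=-12.0, steps=2000):
    """Q_n from (7) by composite Simpson's rule (steps must be even)."""
    h = -lower / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * math.erfc(-(lower + k * h) / math.sqrt(2.0)) ** n
    return total * h / 3.0


def one_sided_interval(sample, alpha, gamma):
    """Asymptotic 100(1 - gamma)% interval (11) for xi, with eta = 1."""
    n = len(sample)
    x_min = min(sample)
    t1 = sum(sample) / n - math.sqrt(2.0 / math.pi)
    t0 = x_min - q_n(n)
    t_alpha = alpha * t1 + (1.0 - alpha) * t0
    scale = NormalDist().inv_cdf(0.5 + 0.5 / n)
    lower = t_alpha + (1.0 - alpha) * math.log(gamma) * scale  # log(gamma) < 0
    return lower, x_min


sample = [0.3, 1.2, 0.7, 0.1, 2.0, 0.9]  # illustrative data only
low, high = one_sided_interval(sample, alpha=0.0, gamma=0.05)
print(low, high)
```

The upper limit is always the sample minimum, since ξ cannot exceed any observation; only the lower limit depends on α and γ.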

In Tables 2, 3, and 4, the estimated coverage probabilities and widths of the confidence interval (11) of ξ are evaluated based on simulations for 90%, 95%, and 99% confidence levels. The sample size n ranges from 20 to 100. Reported coverage probabilities and widths of confidence intervals are obtained from 50,000 simulations of samples of size n from a general half-normal distribution with the parameters ξ = 0 and η = 1. From the tables, we observe the following:
(i) Coverage probability and width of the proposed confidence interval decrease as α increases.
(ii) Coverage probability of the proposed confidence interval is larger than the coverage probability of the one-sided confidence interval proposed by [19] for all values of α.
(iii) The width of the proposed confidence interval is larger than that of the one-sided confidence interval proposed by [19].
(iv) We propose to use the confidence interval with α chosen as α(n), as it gives very good coverage probability and also has a smaller width compared to α = 0.

Table 2 Estimated coverage probability and width based on Tα (confidence level 90%)

        n = 20              n = 30              n = 50              n = 100
α       Coverage  Width     Coverage  Width     Coverage  Width     Coverage  Width
0.00    0.9701    0.2043    0.9680    0.1367    0.9662    0.0823    0.9648    0.0413
0.01    0.9694    0.2028    0.9672    0.1357    0.9654    0.0817    0.9644    0.0409
0.05    0.9666    0.1974    0.9648    0.1319    0.9627    0.0792    0.9613    0.0398
0.1     0.9625    0.1898    0.9607    0.1271    0.9571    0.0756    0.9535    0.0384
0.2     0.9496    0.1754    0.9460    0.1175    0.9364    0.0707    0.9160    0.0356
0.3     0.9277    0.1610    0.9160    0.1079    0.8885    0.0649    0.8399    0.0327
α(n)    0.9658    0.1951    0.9650    0.1322    0.9643    0.0806    0.9638    0.0408

Table 3 Estimated coverage probability and width based on Tα (confidence level 95%)

        n = 20              n = 30              n = 50              n = 100
α       Coverage  Width     Coverage  Width     Coverage  Width     Coverage  Width
0.00    0.9865    0.2477    0.9860    0.1657    0.9840    0.9968    0.9828    0.0500
0.01    0.9862    0.2458    0.9855    0.1644    0.9836    0.0989    0.9826    0.0495
0.05    0.9845    0.2383    0.9837    0.1594    0.9819    0.0959    0.9809    0.0481
0.1     0.9824    0.2289    0.9809    0.1531    0.9791    0.0922    0.9769    0.0462
0.2     0.9742    0.2102    0.9718    0.1407    0.9670    0.0846    0.9544    0.0425
0.3     0.9596    0.1914    0.9529    0.1282    0.9351    0.0771    0.8954    0.0387
α(n)    0.9841    0.2358    0.9837    0.1598    0.9830    0.9749    0.09825   0.0493

Table 4 Estimated coverage probability and width based on Tα (confidence level 99%)

        n = 20              n = 30              n = 50              n = 100
α       Coverage  Width     Coverage  Width     Coverage  Width     Coverage  Width
0.00    0.9982    0.3486    0.9975    0.2329    0.9972    0.1400    0.9971    0.0701
0.01    0.9981    0.3457    0.9974    0.2310    0.9971    0.13887   0.9965    0.0696
0.05    0.9976    0.3342    0.9971    0.2233    0.9964    0.1343    0.9960    0.0673
0.1     0.9972    0.3198    0.9964    0.2137    0.9958    0.1284    0.9950    0.0643
0.2     0.9949    0.2909    0.9940    0.1944    0.9927    0.1169    0.9895    0.0586
0.3     0.9908    0.2621    0.9882    0.1752    0.9837    0.1054    0.9569    0.0529
α(n)    0.9770    0.3302    0.9972    0.2239    0.9968    0.1366    0.9965    0.0692

4 Bayes Estimation

We obtain a generalized Bayes estimator of ξ with respect to a non-informative prior which is uniform on $\mathbb{R}$. The prior distribution of ξ is $g(\xi) = 1$, $-\infty < \xi < \infty$. The corresponding posterior density of ξ given $x$ is

$$h(\xi|x) = \frac{\sqrt{n}\exp\left(-n(\xi - \bar{x})^2/2\right)}{\sqrt{2\pi}\,\Phi\left(\sqrt{n}(x_{(1)} - \bar{x})\right)}, \quad -\infty < \xi < x_{(1)}.$$

So, the generalized Bayes estimator with respect to the squared error loss function is given by

$$\delta(X) = \bar{X} - \frac{\exp\left\{-n(X_{(1)} - \bar{X})^2/2\right\}}{\sqrt{2n\pi}\,\Phi\left(\sqrt{n}(X_{(1)} - \bar{X})\right)},$$

which is the same as the Pitman estimator. This estimator is biased for ξ.
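
The estimator δ(X) is simple to evaluate from the sample mean and the sample minimum. A minimal sketch for η = 1, with a hypothetical function name and illustrative data:

```python
import math


def pitman_estimate(sample):
    """Generalized Bayes (Pitman) estimate of xi for eta = 1:
    x_bar - phi(b) / (sqrt(n) * Phi(b)), with b = sqrt(n) * (x_(1) - x_bar)."""
    n = len(sample)
    x_bar = sum(sample) / n
    b = math.sqrt(n) * (min(sample) - x_bar)
    pdf = math.exp(-0.5 * b * b) / math.sqrt(2.0 * math.pi)  # phi(b)
    cdf = 0.5 * math.erfc(-b / math.sqrt(2.0))               # Phi(b)
    return x_bar - pdf / (math.sqrt(n) * cdf)


sample = [0.3, 1.2, 0.7, 0.1, 2.0, 0.9]  # illustrative data only
estimate = pitman_estimate(sample)
print(estimate, min(sample))
```

Since δ(X) is the mean of a normal posterior truncated to $(-\infty, x_{(1)})$, the estimate always lies below the sample minimum, and shifting every observation by a constant shifts the estimate by the same constant. For very large $n(x_{(1)} - \bar{x})^2$ the ratio φ(b)/Φ(b) would need a more careful numerical treatment than this sketch provides.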

Definition 2 For estimating the location parameter of a location family of distributions, the best location invariant estimator with respect to the squared error loss function is called the Pitman estimator (see [8], p. 186).
Next, we prove that the Pitman estimator is a limit of Bayes rules. For this, we consider the prior distribution of ξ as a normal distribution with mean 0 and standard deviation τ. After some algebra, we observe that the posterior distribution of ξ given $x$ is a truncated normal distribution and the Bayes estimator is the mean of this distribution. This is obtained as

$$\delta_B = \frac{n\bar{X}}{a^2} - \frac{\exp\left\{-\frac{1}{2}\left(aX_{(1)} - \frac{n\bar{X}}{a}\right)^2\right\}}{a\sqrt{2\pi}\,\Phi\left(aX_{(1)} - \frac{n\bar{X}}{a}\right)}.$$

Further, $a = \sqrt{\dfrac{n\tau^2 + 1}{\tau^2}} \to \sqrt{n}$ as τ → ∞, and so

$$\lim_{\tau \to \infty}\delta_B = \bar{X} - \frac{\exp\left\{-n(X_{(1)} - \bar{X})^2/2\right\}}{\sqrt{2n\pi}\,\Phi\left(\sqrt{n}(X_{(1)} - \bar{X})\right)}$$

which is the same as the Pitman estimator. This shows that the Pitman estimator is a limit of Bayes rules.

4.1 Minimaxity and Admissibility of the Pitman Estimator

In this section, we prove that the Pitman estimator is minimax and admissible.
Theorem 2 The Pitman estimator δ(X) is minimax for estimating ξ with respect to the squared error loss function.

Proof The estimator δ(X) is the best location equivariant estimator. So by Theorem 3.3 of [9], δ(X) is minimax.

Next, we state a theorem given in [18]. Let $X_1, \ldots, X_n$ be independent and identically distributed random variables with the density $f(x - \xi)$, where ξ is unknown but the function $f$ is known. Then, the Pitman estimator of ξ is given as

$$\hat{\xi}(X) = \frac{\int \xi \prod_{i=1}^{n} f(X_i - \xi)\, d\xi}{\int \prod_{i=1}^{n} f(X_i - \xi)\, d\xi}. \tag{12}$$

It is the best location equivariant estimator with respect to the loss function (4).
Theorem 3 If

$$\int \cdots \int \prod_{i=1}^{n} f(x_i)\left[\frac{\int \xi^2 \prod_{i=1}^{n} f(x_i - \xi)\, d\xi}{\int \prod_{i=1}^{n} f(x_i - \xi)\, d\xi} - \left(\frac{\int \xi \prod_{i=1}^{n} f(x_i - \xi)\, d\xi}{\int \prod_{i=1}^{n} f(x_i - \xi)\, d\xi}\right)^2\right]^{3/2} \prod_{i=1}^{n} dx_i < \infty,$$

then $\hat{\xi}(X)$ defined by (12) is admissible with respect to the squared error loss function.

Proof The proof of this theorem is essentially given in [18].

We use the above theorem to prove admissibility of the Pitman estimator for our problem.
Theorem 4 The Pitman estimator δ(X) given by (12) is admissible for estimating ξ with respect to the squared error loss function.

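Proof The Pitman estimator δ(X) is given by

$$\delta(X) = \bar{X} - \frac{\exp\left\{-n(X_{(1)} - \bar{X})^2/2\right\}}{\sqrt{2n\pi}\,\Phi\left(\sqrt{n}(X_{(1)} - \bar{X})\right)} = \bar{X} - \frac{1}{\sqrt{n}}\frac{\phi(b)}{\Phi(b)} = \bar{X} - \frac{1}{\sqrt{n}}\nu(b),$$

where $b = \sqrt{n}(X_{(1)} - \bar{X})$ and we denote $\nu(y) = \dfrac{\phi(y)}{\Phi(y)}$. We can express

$$h(\xi|x) = \sqrt{n}\,\frac{\phi(\beta)}{\Phi(b)}, \quad -\infty < \xi < x_{(1)},$$

where $\beta = \sqrt{n}(\xi - \bar{x})$. Now

$$\begin{aligned}
\int_{-\infty}^{x_{(1)}} \xi^2 h(\xi|x)\, d\xi &= \int_{-\infty}^{b}\left(\frac{\beta}{\sqrt{n}} + \bar{x}\right)^2\frac{\phi(\beta)}{\Phi(b)}\, d\beta \\
&= \frac{1}{n\Phi(b)}\{\Phi(b) - b\phi(b)\} - \frac{2\bar{x}}{\sqrt{n}}\frac{\phi(b)}{\Phi(b)} + \bar{x}^2 \\
&= \frac{1}{n}\{1 - b\nu(b)\} - \frac{2\bar{x}}{\sqrt{n}}\nu(b) + \bar{x}^2.
\end{aligned} \tag{13}$$

Again,

$$\left(\int_{-\infty}^{x_{(1)}} \xi h(\xi|x)\, d\xi\right)^2 = \delta(x)^2 = \left(\bar{x} - \frac{1}{\sqrt{n}}\nu(b)\right)^2 = \bar{x}^2 - \frac{2\bar{x}}{\sqrt{n}}\nu(b) + \frac{1}{n}(\nu(b))^2.$$

Subtracting this expression from (13), we get

$$\frac{1 - b\nu(b)}{n} - \frac{(\nu(b))^2}{n} = \frac{1 - \nu(b)(b + \nu(b))}{n} < \frac{1}{n} - \frac{(\bar{x} - x_{(1)})(b + \nu(b))}{\sqrt{n}} \le \frac{1}{n}.$$

Since $(\bar{x} - x_{(1)}) > 0$ and, by Lemma 1 of [13], $b + \nu(b) > 0$, it follows that

$$\left[\int_{-\infty}^{x_{(1)}} \xi^2 h(\xi|x)\, d\xi - \left(\int_{-\infty}^{x_{(1)}} \xi h(\xi|x)\, d\xi\right)^2\right]^{3/2} < \left(\frac{1}{n}\right)^{3/2}.$$

So by Theorem 3, the Pitman estimator δ(X) is admissible.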

5 Numerical Comparisons

In this section, we numerically compare the percentage risk improvement (PRI) of the estimators T0, ξ̂BC, T1, Tα(n), and δ over the MLE X(1). For η = 1, the estimator ξ̂BC is given as $\hat{\xi}_{BC} = X_{(1)} - \Phi^{-1}\left(\frac{1}{2} + \frac{1}{2n}\right)$. Note that the risk functions of these estimators do not depend on ξ. For the simulation study, we have generated 50,000 random samples of size n from a general half-normal distribution with parameters ξ = 0 and η = 1. For various values of n, we tabulate the PRIs of all the estimators in Table 5. The PRI of an estimator T over the MLE is defined as

$$PRI(T) = \frac{Risk(MLE) - Risk(T)}{Risk(MLE)} \times 100. \tag{14}$$
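The PRI in (14) can be estimated by Monte Carlo along the following lines. This is only a minimal sketch, not the simulation code used for Table 5: it covers the three estimators whose formulas appear in this section (the MLE X(1), ξ̂BC for η = 1, and the Pitman estimator δ), and the function name `simulate_pri`, the number of replications, and the seed are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def phi(y):
    """Standard normal density."""
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)


def Phi(y):
    """Standard normal cdf, written with erfc so the lower tail stays accurate."""
    return 0.5 * math.erfc(-y / math.sqrt(2.0))


def simulate_pri(n, reps=20000, seed=0):
    """Monte Carlo estimate of PRI (14) over the MLE for xi = 0, eta = 1.

    reps and seed are illustrative choices, not values from the paper.
    """
    rng = random.Random(seed)
    shift = NormalDist().inv_cdf(0.5 + 1.0 / (2.0 * n))  # correction in xi_BC
    sq_err = {"mle": 0.0, "bc": 0.0, "pitman": 0.0}
    for _ in range(reps):
        # |Z| with Z ~ N(0, 1) is half-normal with xi = 0 and eta = 1
        x = [abs(rng.gauss(0.0, 1.0)) for _ in range(n)]
        x_min = min(x)
        x_bar = sum(x) / n
        b = math.sqrt(n) * (x_min - x_bar)
        estimates = {
            "mle": x_min,
            "bc": x_min - shift,
            "pitman": x_bar - phi(b) / (Phi(b) * math.sqrt(n)),
        }
        for name, value in estimates.items():
            sq_err[name] += value * value  # true xi is 0
    risk = {name: total / reps for name, total in sq_err.items()}
    return {name: 100.0 * (risk["mle"] - risk[name]) / risk["mle"]
            for name in ("bc", "pitman")}


if __name__ == "__main__":
    print(simulate_pri(10))
```

With more replications the estimates should settle near the corresponding column of Table 5, up to Monte Carlo error.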

The following observations can be made from the tabulated values.

(i) The percentage risk improvement over MLE of the Pitman estimator δ is the
highest among all estimators.
(ii) The PRIs of Tα(n) and δ are approximately the same.

Table 5 Percentage risk improvement of various estimators

(n, α(n))   (5, 0.1667)   (10, 0.10684)   (20, 0.06358)   (30, 0.04651)   (50, 0.02971)   (100, 0.0158)
T0          55.88         53.85           52.24           51.61           50.82           50.44
ξ̂BC         54.09         53.43           52.14           51.56           50.81           50.44
T1          12.00         −46.81          −164.23         −287.52         −505.16         −1088.79
Tα(n)       57.47         55.26           53.23           52.37           51.37           50.70
δ           57.54         55.30           53.27           52.40           51.38           50.71

(iii) The unbiased estimators T0 and Tα(n) perform better than the bias-corrected estimator ξ̂BC.
Thus, we recommend using δ or Tα(n) as an estimator of ξ.

6 Conclusion

In the present article, we have considered the estimation of the location parameter
of a general half-normal distribution with respect to squared error loss function.
We have obtained some unbiased as well as biased estimators. It is proved that the
Pitman estimator is a limit of Bayes rules and also shown that the Pitman estimator
is minimax and admissible. Based on the MLE, we have derived a complete class
of estimators. A one-sided asymptotic confidence interval is also obtained for the
location parameter. A simulation study is carried out to compare the estimators numerically.

References

1. Aigner, D., Lovell, C.K., Schmidt, P.: Formulation and estimation of stochastic frontier pro-
duction function models. J. Econometrics 6(1), 21–37 (1977)
2. Azzalini, A.: A class of distributions which includes the normal ones. Scand. J. Stat. 12(2),
171–178 (1985)
3. Azzalini, A., Capitanio, A.: Statistical applications of the multivariate skew normal distribution.
J. R. Stat. Soc. Ser. B (Stat. Methodol.) 61(3), 579–602 (1999)
4. Brewster, J.F., Zidek, J.: Improving on equivariant estimators. Ann. Stat. 2(1), 21–38 (1974)
5. Cook, R.D., Weisberg, S.: An Introduction to Regression Graphics, vol. 405. Wiley, New York
(2009)
6. Daniel, C.: Use of half-normal plots in interpreting factorial two-level experiments. Techno-
metrics 1(4), 311–341 (1959)
7. Farsipour, N.S., Rasouli, A.: On the Bayes estimation of the general half-normal distribution.
Calcutta Stat. Assoc. Bull. 58(1–2), 37–52 (2006)
8. Ferguson, T.S.: Mathematical Statistics: A Decision Theoretic Approach. Academic Press,
New York (2014)
9. Girshick, M., Savage, L., et al.: Bayes and minimax estimates for quadratic loss functions. In:
Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability,
vol. 1, pp. 53–74 . University of California Press, Berkeley (1951)
10. Gut, A.: Probability: A Graduate Course. Springer Science, New York (2012)
11. Haberle, J.: Strength and failure mechanisms of unidirectional carbon fibre-reinforced plastics
under axial compression. Ph.D. thesis, Imperial College London (1992)
12. Johnson, N.L., Kotz, S., Balakrishnan, N.: Continuous Univariate Distributions, vol. 1. Wiley,
New York (1994)
13. Katz, M.W.: Admissible and minimax estimates of parameters in truncated spaces. Ann. Math.
Stat. 32(1), 136–142 (1961)
14. Nogales, A., Perez, P.: Unbiased estimation for the general half-normal distribution. Commun.
Stat. Theory Methods 44(17), 3658–3667 (2015)
15. Pewsey, A.: Problems of inference for Azzalini’s skewnormal distribution. J. Appl. Stat. 27(7),
859–870 (2000)

16. Pewsey, A.: Large-sample inference for the general half-normal distribution. Commun. Stat.
Theory Methods 31(7), 1045–1054 (2002)
17. Pewsey, A.: Improved likelihood based inference for the general half-normal distribution.
Commun. Stat. Theory Methods 33(2), 197–204 (2004)
18. Stein, C.: The admissibility of Pitman’s estimator of a single location parameter. Ann. Math.
Stat. 30(4), 970–979 (1959)
19. Wiper, M., Girion, F., Pewsey, A.: Objective bayesian inference for the half-normal and half-t
distributions. Commun. Stat. Theory Methods 37(20), 3165–3185 (2008)
Chapter 23
Existence of Equilibrium Solution
of the Coagulation–Fragmentation
Equation with Linear Fragmentation
Kernel

Debdulal Ghosh and Jitendra Kumar

Abstract The existence of an equilibrium solution of a coagulation–fragmentation equation is shown in this article. We study the problem for a linear fragmentation kernel. A numerical example is provided to illustrate the analysis.

Keywords Coagulation–fragmentation equation · Singular kernels · Equilibrium solution

1 Introduction

The aim of this work is to investigate the existence of an equilibrium state of the solution to the continuous coagulation–fragmentation equation (C-F equation) where the reaction rates satisfy certain restrictions. The C-F process is a dynamical system that describes the mechanisms by which clusters can coalesce to form larger particles or fragment into smaller pieces. Many scientific fields apply the C-F process and the pertaining equation, for instance, aerosol science [3], animal grouping in population dynamics [9], red blood cell aggregation in hematology [10], astrophysics [11], colloidal chemistry, and polymer science [12, 13].
The general form of the C-F equation is the following integro-partial differential equation:

$$\begin{aligned}
\frac{\partial c(x,t)}{\partial t} ={}& \frac{1}{2}\int_{0}^{x} K(x-y,y)\, c(x-y,t)\, c(y,t)\, dy - c(x,t)\int_{0}^{\infty} K(x,y)\, c(y,t)\, dy \\
& - \frac{1}{2}\, c(x,t)\int_{0}^{x} F(x-y,y)\, dy + \int_{0}^{\infty} F(x,y)\, c(x+y,t)\, dy,
\end{aligned} \tag{1}$$

with the initial data

$$c(x,0) = c_{0}(x) \ge 0, \quad \text{a.e.} \tag{2}$$

D. Ghosh · J. Kumar
Department of Mathematics, Indian Institute of Technology Kharagpur, Kharagpur 721302, West Bengal, India

© Springer Nature Singapore Pte Ltd. 2018

Equation (1) describes the time evolution of the number density c(x, t) ≥ 0 of particles of size x ≥ 0 at time t ≥ 0. The functions K and F represent the nonnegative coagulation and fragmentation rates, respectively. The first two terms on the right-hand side of (1) represent the birth and death terms, respectively, due to coagulation. The last two terms are, respectively, the death and birth terms due to fragmentation. More details of this equation can be found in [15]. In the literature, Eq. (1) is also known as the population balance equation.

1.1 Literature Survey

An equilibrium solution of the C-F equation arises when the birth and death terms in Eq. (1) are equal. The existence of an equilibrium solution was proved in [2] by means of the Laplace transform. In [1, 2, 14], the equilibrium solutions are of the form exp(−λx). A general equilibrium solution is also given in [4]. For a linear coagulation kernel and a constant fragmentation kernel, the existence of an equilibrium solution and its convergence were proved in [5].
In this research article, we attempt to prove the existence of an equilibrium solution for a linear fragmentation kernel.

1.2 Problem Statement

For the continuous C-F equation (1), the detailed balance condition leads to the
following separate cancelation condition [5]:
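$$\left.\begin{aligned}
\frac{1}{2}\int_{0}^{x} K(x-y,y)\, \bar{c}(x-y)\, \bar{c}(y)\, dy - \frac{1}{2}\,\bar{c}(x)\int_{0}^{x} F(x-y,y)\, dy &= 0, \\
\int_{0}^{\infty} F(x,y)\, \bar{c}(x+y)\, dy - \bar{c}(x)\int_{0}^{\infty} K(x,y)\, \bar{c}(y)\, dy &= 0.
\end{aligned}\right\} \tag{3}$$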

In the present study, the problem under consideration does not assume such a separate cancelation condition. Thus, the existence of an equilibrium solution of the problem does not follow trivially. In this article, a proof of the existence of an equilibrium solution is presented.
The outline of the presented work is as follows. In Sect. 2, the result on the existence and uniqueness of an equilibrium solution for the problem is given. Section 3 provides a numerical illustration of the performed analysis. Finally, Sect. 4 concludes the work by briefly mentioning a future direction.

2 Existence and Uniqueness of an Equilibrium Solution

In the present study, we consider the following forms of coagulation kernel K and fragmentation kernel F for Eq. (1):

$$K(x,y) = x^{-\sigma} y^{-\sigma}, \tag{4}$$

where $\sigma \in \left[0, \tfrac{1}{2}\right]$, and

$$F(x,y) = b\,[1 + (x+y)], \quad \text{with } b > 0. \tag{5}$$

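Let c̄(x) be an equilibrium solution of Eq. (1). Then, from Eq. (1) we obtain

$$\begin{aligned}
& \frac{1}{2}\int_{0}^{x} K(x-y,y)\, \bar{c}(x-y)\, \bar{c}(y)\, dy - \bar{c}(x)\int_{0}^{\infty} K(x,y)\, \bar{c}(y)\, dy \\
& \quad - \frac{1}{2}\,\bar{c}(x)\int_{0}^{x} F(x-y,y)\, dy + \int_{0}^{\infty} F(x,y)\, \bar{c}(x+y)\, dy = 0.
\end{aligned} \tag{6}$$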

Substituting the coagulation and fragmentation kernels under consideration into Eq. (6), the first term of Eq. (6) reduces to

$$\frac{1}{2}\int_{0}^{x} (x-y)^{-\sigma} y^{-\sigma}\, \bar{c}(x-y)\, \bar{c}(y)\, dy = \frac{1}{2}\,[\phi * \phi](x),$$

where $\phi(x) := x^{-\sigma}\bar{c}(x)$, and $\zeta * \vartheta$ represents the following integral

$$(\zeta * \vartheta)(x) = \int_{0}^{x} \zeta(x-t)\, \vartheta(t)\, dt.$$

Denoting

$$N_{-\sigma} = \int_{0}^{\infty} \phi(x)\, dx,$$

the second term of Eq. (6) gives

$$\bar{c}(x)\int_{0}^{\infty} x^{-\sigma} y^{-\sigma}\, \bar{c}(y)\, dy = \phi(x)\, N_{-\sigma}.$$

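The third integral of Eq. (6) yields

$$\frac{1}{2}\,\bar{c}(x)\int_{0}^{x} F(x-y,y)\, dy = \frac{b}{2}\,\bar{c}(x)\int_{0}^{x} (1+x)\, dy = \frac{b}{2}\,\bar{c}(x)\left(x + \frac{x^{2}}{2}\right) = \frac{1}{2}\,\bar{c}(x)\, b\, x + \frac{b}{2}\,\bar{c}(x)\,\frac{x^{2}}{2}.$$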

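Lastly, the fourth integral of Eq. (6) gives

$$\begin{aligned}
\int_{0}^{\infty} F(x,y)\, \bar{c}(x+y)\, dy &= b\int_{0}^{\infty} [1 + (x+y)]\, \bar{c}(x+y)\, dy \\
&= b\int_{x}^{\infty} (1+z)\, \bar{c}(z)\, dz \\
&= b\left[\int_{0}^{\infty} (1+z)\, \bar{c}(z)\, dz - \int_{0}^{x} (1+z)\, \bar{c}(z)\, dz\right] \\
&= bN + bM - b * \bar{c}(x) - b * \rho,
\end{aligned}$$

where $\rho(x) := x\,\bar{c}(x)$, so that N and M stand for $\int_{0}^{\infty} \bar{c}(z)\, dz$ and $\int_{0}^{\infty} z\,\bar{c}(z)\, dz$, respectively, and $b * \bar{c}$ is the convolution of the constant function b with c̄.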

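Therefore, from Eq. (6), we obtain

$$\frac{1}{2}\,[\phi * \phi] - \phi(x)\, N_{-\sigma} - \frac{1}{2}\, b\, x\, \bar{c}(x) - \frac{b}{2}\,\bar{c}(x)\,\frac{x^{2}}{2} + bN + bM - b * \bar{c}(x) - b * \rho = 0.$$

Hence,

$$\bar{c}(x) = \frac{\phi * \phi + 2bN + 2bM - 2b * \bar{c} - 2b * \rho}{2x^{-\sigma} N_{-\sigma} + b\left(x + \frac{x^{2}}{2}\right)}. \tag{7}$$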

The function c̄ is an equilibrium solution to (1).

Denoting the right-hand side of Eq. (7) by A(c̄), we note that
(i) A is an operator from the space of continuous functions C(0, α] into itself, where α is a positive real number, and
(ii) letting c1 and c2 satisfy (7), we have

$$|A(c_{1}) - A(c_{2})| = \frac{x^{\sigma}\,\big|(\phi_{1} * \phi_{1} - \phi_{2} * \phi_{2}) - 2b * (c_{1} - c_{2}) - 2b * (\rho_{1} - \rho_{2})\big|}{2N_{-\sigma} + b\, x^{1+\sigma}\left(1 + \frac{x}{2}\right)}, \tag{8}$$

where

$$\phi_{1}(x) := x^{-\sigma} c_{1}(x), \quad \phi_{2}(x) := x^{-\sigma} c_{2}(x), \quad \rho_{1}(x) := x\, c_{1}(x), \quad \rho_{2}(x) := x\, c_{2}(x).$$

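We consider the first term of the numerator of Eq. (8). We see that

$$|\phi_{1} * \phi_{1} - \phi_{2} * \phi_{2}| \le |\phi_{1} - \phi_{2}| * |\phi_{1} + \phi_{2}| = \int_{0}^{x} y^{-\sigma}(x-y)^{-\sigma}\, |c_{1} - c_{2}|(y)\, |c_{1} + c_{2}|(x-y)\, dy.$$

Therefore, under the supremum norm $\|f\| := \sup_{x \in (0,\alpha]} |f(x)|$, we get

$$\begin{aligned}
\|\phi_{1} * \phi_{1} - \phi_{2} * \phi_{2}\| &= \|(\phi_{1} - \phi_{2}) * (\phi_{1} + \phi_{2})\| \\
&\le \|c_{1} - c_{2}\|\, \|c_{1} + c_{2}\| \sup_{x \in (0,\alpha]} \int_{0}^{x} y^{-\sigma}(x-y)^{-\sigma}\, dy \\
&= \|c_{1} - c_{2}\|\, \|c_{1} + c_{2}\|\, \beta(1-\sigma, 1-\sigma)\, \alpha^{1-2\sigma},
\end{aligned}$$

where β(·, ·) is the well-known beta function.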
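The last equality rests on the identity $\int_{0}^{x} y^{-\sigma}(x-y)^{-\sigma}\, dy = x^{1-2\sigma}\beta(1-\sigma, 1-\sigma)$. The sketch below checks it numerically. It is an illustrative check rather than part of the proof: the substitution $y = x\sin^{2}\theta$ (used to remove the endpoint singularities before applying the midpoint rule) and the names `beta` and `singular_integral` are choices made here.

```python
import math


def beta(a, b):
    """Beta function expressed through the gamma function."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def singular_integral(x, sigma, steps=20000):
    """Midpoint-rule value of the integral of y^(-sigma) (x - y)^(-sigma) over (0, x).

    With y = x sin^2(theta) the integrand becomes
    2 x^(1 - 2 sigma) (sin(theta) cos(theta))^(1 - 2 sigma) on (0, pi/2),
    which is bounded for 0 <= sigma <= 1/2.
    """
    h = (math.pi / 2.0) / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        total += (math.sin(theta) * math.cos(theta)) ** (1.0 - 2.0 * sigma)
    return 2.0 * x ** (1.0 - 2.0 * sigma) * total * h


if __name__ == "__main__":
    x, sigma = 2.0, 0.25
    print(singular_integral(x, sigma))
    print(x ** (1.0 - 2.0 * sigma) * beta(1.0 - sigma, 1.0 - sigma))
```

Since $1 - 2\sigma \ge 0$, the right-hand side is nondecreasing in x, which is why the supremum over (0, α] is attained at x = α.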

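Let $\beta_{0} := \beta(1-\sigma, 1-\sigma)$.
Thence, we have

$$\|A c_{1} - A c_{2}\| \le \frac{\|c_{1} + c_{2}\|\, \beta_{0}\, \alpha^{1-\sigma} + 2b\,\alpha^{1+\sigma}\left(1 + \frac{\alpha}{2}\right)}{2N_{-\sigma}}\, \|c_{1} - c_{2}\|.$$

Thus, the operator A is contractive if

$$\frac{\|c\|\, \beta_{0}\, \alpha^{1-\sigma} + b\,\alpha^{\sigma+1}\left(1 + \frac{\alpha}{2}\right)}{N_{-\sigma}} \le 1,$$

that is, if

$$\|c\| \le \frac{N_{-\sigma} - b\,\alpha^{\sigma+1}\left(1 + \frac{\alpha}{2}\right)}{\beta_{0}\, \alpha^{1-\sigma}} =: R_{\alpha}, \text{ say.} \tag{9}$$

Note that $R_{\alpha} > 0$ under a certain restriction on α.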

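In order to use the contraction mapping theorem, we need to check whether the ball $B(R_{\alpha})$ is invariant.
We observe that

$$\|A c\| \le \frac{\beta_{0}\, \alpha^{1-\sigma}\, \|c\|^{2} + 2b\,(N + M) + 2b\,\alpha^{\sigma+1}\left(1 + \frac{\alpha}{2}\right)\|c\|}{2N_{-\sigma}}.$$

Through the inequality $\|A c\| \le \|c\|$, it is easy to see that the ball $B(R_{\alpha})$ remains invariant if

$$\frac{\beta_{0}\, \alpha^{1-\sigma}\, \|c\|^{2} + 2b\,(N + M) + 2b\,\alpha^{\sigma+1}\left(1 + \frac{\alpha}{2}\right)\|c\|}{2N_{-\sigma}} \le \|c\|,$$

that is, if

$$\beta_{0}\, \alpha^{1-\sigma}\, \|c\|^{2} + 2\|c\|\left[b\,\alpha^{\sigma+1}\left(1 + \frac{\alpha}{2}\right) - N_{-\sigma}\right] + 2b\,(N + M) \le 0.$$

We denote $a_{1} := N_{-\sigma} - b\,\alpha^{\sigma+1}\left(1 + \frac{\alpha}{2}\right)$. Therefore, the relation immediately above gives

$$\|c\| \le \frac{a_{1} + \sqrt{a_{1}^{2} - 2b\,(N + M)\,\beta_{0}\, \alpha^{1-\sigma}}}{\beta_{0}\, \alpha^{1-\sigma}}. \tag{10}$$

The expression under the square root in inequality (10) and the quantity $R_{\alpha}$ in (9) are nonnegative for a range of values of α. We work on this range of α values.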
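To see how the restriction on α enters, the quantities in (9) and (10) can be evaluated for sample inputs. The sketch below is only illustrative: the moment values N, M, $N_{-\sigma}$ and the parameters b, α, σ in the usage lines are hypothetical numbers chosen here, not values from the paper, and the function names are likewise assumptions.

```python
import math


def beta0(sigma):
    """beta(1 - sigma, 1 - sigma) via the gamma function."""
    return math.gamma(1.0 - sigma) ** 2 / math.gamma(2.0 - 2.0 * sigma)


def contraction_radius(n_msigma, b, alpha, sigma):
    """R_alpha from (9); it is positive only when a1 > 0."""
    a1 = n_msigma - b * alpha ** (sigma + 1.0) * (1.0 + alpha / 2.0)
    return a1 / (beta0(sigma) * alpha ** (1.0 - sigma))


def invariance_bound(n_msigma, n, m, b, alpha, sigma):
    """Right-hand side of (10), or None if the square root is not real."""
    a1 = n_msigma - b * alpha ** (sigma + 1.0) * (1.0 + alpha / 2.0)
    lead = beta0(sigma) * alpha ** (1.0 - sigma)
    disc = a1 * a1 - 2.0 * b * (n + m) * lead
    if disc < 0.0:
        return None
    return (a1 + math.sqrt(disc)) / lead


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the formulas.
    print(contraction_radius(2.0, 0.1, 0.5, 0.25))
    print(invariance_bound(2.0, 1.0, 1.0, 0.1, 0.5, 0.25))
```

For inputs where the discriminant is negative the function returns `None`, which corresponds to values of α outside the admissible range mentioned above.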
We are now in a position to prove the following lemma.
Lemma 1 Let α be such that Rα > 0 and the expression under the square root in
(10) is nonnegative. Then, there exists a unique continuous solution to (7) on the
interval (0, α] which lies in the ball B(Rα ).

Proof Existence and uniqueness of a continuous solution c̄ in the ball B(Rα ) follow
from the contraction mapping theorem [6].
We prove the uniqueness of all solutions to (7), not necessarily inside the ball B(Rα ).

Suppose that there exists another solution d̄ to (7). The continuity of d̄ follows from
its integrability and we remark that the operator A maps any integrable function to
a continuous one.
Let us consider the restriction of d̄ to an interval (0, α1 ], α1 < α. Choosing α1 small
enough, we find that the ball B(Rα1 ) contains two solutions c̄ and d̄. Actually, Rα1
tends to ∞ as α1 tends to 0. This result contradicts the uniqueness of the solution of
(7) in the ball B(Rα1 ).

Lemma 2 For all x > 0, there exists a unique continuous solution to Eq. (7).

Proof We consider the operator A as a mapping A : C[α, 2α] → C[α, 2α] and seek a solution c̃(x) of (7) on [α, 2α]. The function c̃(x) evidently satisfies the equality

$$\begin{aligned}
\tilde{c}(x) ={}& \left[\int_{\alpha}^{x} y^{-\sigma}(x-y)^{-\sigma}\, \tilde{c}(y)\, \bar{c}(x-y)\, dy + \frac{1}{2}\int_{x-\alpha}^{\alpha} y^{-\sigma}(x-y)^{-\sigma}\, \bar{c}(y)\, \bar{c}(x-y)\, dy \right. \\
& \left. {} + b\left(N + M - \int_{0}^{\alpha} (1+x)\, \bar{c}(x)\, dx - \int_{\alpha}^{x} (1+y)\, \tilde{c}(y)\, dy\right)\right] \times \frac{1}{x^{-\sigma} N_{-\sigma} + \frac{bx}{2}\left(1 + \frac{x}{2}\right)}.
\end{aligned} \tag{11}$$

Here, the function c̄ is a solution to (7) on (0, α]. Its existence and uniqueness were proved in Lemma 1. By the standard results on integral equations, the linear Volterra equation (11) has a unique continuous solution c̃(x) on the interval [α, 2α].
Put c̄(x) = c̃(x) if α ≤ x ≤ 2α. Obviously, c̄ satisfies (7) for all x ∈ (0, 2α]. Its continuity follows from the proof of Lemma 1. We can now analogously extend the solution obtained to the interval [2α, 4α], and so on. Hence, the result follows.

3 Numerical Results

In this section, we show that, for some initial condition, the time-dependent solution attains an equilibrium state. To obtain the numerical results, we use the finite volume scheme introduced in [7, 8]. In this example, we consider the computational domain $[10^{-9}, 512]$, which is discretized into 20 non-uniform cells $[x_{i-1/2}, x_{i+1/2}]$, i = 1, 2, . . . , 20. The end points of the cells satisfy the relation $x_{i+1/2} = r\, x_{i-1/2}$, where r > 1 is the geometric ratio. The mid-point of each cell is considered to be the cell representative or the pivot. We have used the adaptive Runge–Kutta 4(5) solver in Matlab-R2015 to solve the system of ODEs.
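The geometric grid described above can be generated as in the following sketch. The text does not state the value of r; here it is assumed that r is chosen so that the 20 cells exactly span the computational domain, and the function name `geometric_grid` is illustrative.

```python
def geometric_grid(x_min, x_max, cells):
    """Return (r, edges, pivots) for a geometric grid on [x_min, x_max].

    Assumption: r is fixed by requiring edges[0] = x_min and edges[-1] = x_max.
    """
    r = (x_max / x_min) ** (1.0 / cells)
    # Cell boundaries x_{i+1/2} = r * x_{i-1/2}
    edges = [x_min * r ** i for i in range(cells + 1)]
    # Pivots are the cell mid-points
    pivots = [0.5 * (edges[i] + edges[i + 1]) for i in range(cells)]
    return r, edges, pivots


if __name__ == "__main__":
    r, edges, pivots = geometric_grid(1e-9, 512.0, 20)
    print(r)
    print(pivots[0], pivots[-1])
```

A geometric grid of this kind resolves the many decades of particle size between $10^{-9}$ and 512 with only a few cells.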
To illustrate the existence result, we have taken the coagulation kernel in the form $K(x,y) = (1 + x^{\lambda} + y^{\lambda})(xy)^{-\sigma}$, where 0 ≤ σ ≤ 0.5 and 0 ≤ λ − σ ≤ 1, and the constant fragmentation kernel F(x, y) = 1, with the initial data $c_{0} = (1 + x)^{-2}$. To observe the equilibrium of the system, we plot the numerical number density function along with the moments $M_{2}(t)$, $M_{0}(t)$ and $M_{-\sigma}(t)$. The zeroth moment $M_{0}(t)$ represents the total particle number in the system. Therefore, a constant value of $M_{0}(t)$ after a certain time lapse indicates an equilibrium system, and the constant values of $M_{2}(t)$ and $M_{-\sigma}(t)$ also support this result.

3.1 Example 1

In this example, we consider the problem (1) with the kernels

$$K(x,y) = (1 + x^{0.5} + y^{0.5})(xy)^{-0.5}, \quad F(x,y) = 1,$$

and the initial data $c_{0} = (1 + x)^{-2}$. From Fig. 1, the particle number density c(x, t) shows no change at the three different times t = 1, 3, 5, and from Fig. 2, we can see that all the moments are constant after t = 2. So, we can say that the system has reached equilibrium after t = 2.

Fig. 1 Particle number density (particle density versus dimensionless size of representative, logarithmic axes, at times t = 1, 3, 5)

Fig. 2 Normalized moments M2, M0 and M−σ versus dimensionless time

4 Conclusion

In this study, we have proved the existence of an equilibrium solution for the C-F equation with a class of linear fragmentation kernels and singular coagulation kernels. One numerical example has been shown that illustrates the provided analysis. In order to prove the result, we have used the Banach contraction mapping theorem, a few inequalities related to improper integrals, and the properties of the beta and gamma functions. As a future scope, one can attempt to extend the result to a larger class of fragmentation kernels.

References

1. Aizenman, M., Bak, T.A.: Convergence to equilibrium in a system of reacting polymers. Com-
mun. Math. Phys. 65(3), 203–230 (1979)
2. Barrow, J.D.: Coagulation with fragmentation. J. Phys. A Math. Gen. 14(3), 729 (1981)
3. Drake, R.L.: A general mathematical survey of the coagulation equation. Top. Curr. Aerosol
Res. (Part 2) 3, 201–376 (1972)
4. Dubovskiı̌, P., Galkin, V.A., Stewart, I.W.: Exact solutions for the coagulation-fragmentation
equation. J. Phys. A Math. Gen. 25(18), 4737 (1992)
5. Dubovskiı̌, P., Stewart, I.W.: Trend to equilibrium for the coagulation-fragmentation equation.
Math. Methods Appl. Sci. 19(10), 761–772 (1996)
6. Edwards, R.: Functional analysis: theory and applications, Holt, Rinehart and Winston, New
York, 1965. MR 36, 4308 (1994)
7. Kumar, J., Kaur, G., Tsotsas, E.: An accurate and efficient discrete formulation of aggregation
population balance equation. Kinet. Relat. Models 9(2), 373–391 (2016)
8. Kumar, J., Saha, J., Tsotsas, E.: Development and convergence analysis of a finite volume
scheme for solving breakage equation. SIAM J. Numer. Anal. 53(4), 1672–1689 (2015)
9. Okubo, A.: Dynamical aspects of animal grouping: swarms, schools, flocks, and herds. Adv.
Biophys. 22, 1–94 (1986)
10. Perelson, A.S., Samsel, R.W.: Kinetics of red blood cell aggregation: an example of geometric
polymerization. In: Kinetics of Aggregation and Gelation, pp. 137–144 (1984)
11. Safronov, V.S. Evolution of the protoplanetary cloud and formation of the earth and planets.
In: Safronov, V.S. (ed.) Evolution of the Protoplanetary Cloud and Formation of the Earth and
Planets, vol. 1, 212 p. Translated from Russian. Israel Program for Scientific Translations,
Keter Publishing House, Jerusalem, Israel (1972)
12. Smoluchowski, M.: Drei vortrage uber diffusion. brownsche bewegung und koagulation von
kolloidteilchen. Z. Phys. 17, 557–585 (1916)
13. Smoluchowski, M.: Grundriß der koagulationskinetik kolloider lösungen. Colloid Polym. Sci.
21(3), 98–104 (1917)
14. Stewart, I.W., Dubovskiı̌, P.: Approach to equilibrium for the coagulation-fragmentation equa-
tion via a Lyapunov functional. Math. Methods Appl. Sci. 19(3), 171–185 (1996)
15. Stewart, I.W., Meister, E.: A global existence theorem for the general coagulation-fragmentation
equation with unbounded kernels. Math. Methods Appl. Sci. 11(5), 627–648 (1989)
Chapter 24
Explicit Criteria for Stability
of Two-Dimensional Fractional
Nabla Difference Systems

Jagan Mohan Jonnalagadda

Abstract In this article, we discuss a few stability properties of the Riemann–Liouville (or Caputo)-type linear two-dimensional fractional nabla difference system. For this purpose, we construct the equivalent Volterra difference system of convolution type and analyse its properties using the standard methods applied in the qualitative investigation of Volterra difference systems. Subsequently, we obtain sufficient conditions on stability of the considered fractional nabla difference system. We provide an example to illustrate the applicability of established results.

Keywords Fractional order · Nabla difference · Volterra system · Z-transform · Stability

1 Introduction

Matignon [1] established the following well-known criteria for stability of the linear fractional differential system

$$\left(D^{\alpha} x\right)(t) = A x(t), \quad t > 0, \tag{1}$$

of Riemann–Liouville (or Caputo) type:

Theorem 1 Let 0 < α < 1 and $A \in \mathbb{R}^{k \times k}$. Then, (1) is asymptotically stable if and only if

$$|\arg \lambda| > \frac{\alpha\pi}{2} \tag{2}$$

for all the eigenvalues λ of A.
J. M. Jonnalagadda
Department of Mathematics, Birla Institute of Technology and Science Pilani, Hyderabad 500078, Telangana, India

Later, many other stability results on systems of fractional differential equations have appeared [2]. On the other hand, the stability theory of fractional nabla difference equations is less developed. Recently, [3, 4] obtained the nabla discrete analogue of Theorem 1 as follows:
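Theorem 2 Consider the fractional nabla difference system

$$\left(\nabla^{\alpha}_{\rho(0)} u\right)(t) = A u(t), \quad t \in \mathbb{N}_{1}, \tag{3}$$

of Riemann–Liouville type. Let 0 < α < 1, $A \in \mathbb{R}^{k \times k}$ and det(I − A) ≠ 0. If all the eigenvalues λ of A lie inside the region

$$S_{\alpha} = \left\{ z \in \mathbb{C} : |\arg z| > \frac{\alpha\pi}{2} \ \text{ or } \ |z| > \left(2\cos\frac{\arg z}{\alpha}\right)^{\alpha} \right\}, \tag{4}$$

then (3) is asymptotically stable.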
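For a 2 × 2 matrix, the hypothesis of Theorem 2 can be checked directly from the trace and determinant, as in the sketch below. It is a minimal illustration assuming the reading of the region $S_{\alpha}$ given in (4); the function names are hypothetical, and the sample matrix is the one from Example 1 later in this chapter.

```python
import cmath
import math


def in_region(z, alpha):
    """True if z lies in S_alpha as written in (4)."""
    theta = abs(cmath.phase(z))
    if theta > alpha * math.pi / 2.0:
        return True
    # Here theta / alpha <= pi / 2, so the cosine is nonnegative up to rounding.
    radius = (2.0 * max(0.0, math.cos(theta / alpha))) ** alpha
    return abs(z) > radius


def eigenvalues_2x2(a):
    """Eigenvalues of a 2 x 2 matrix from its trace and determinant."""
    (a11, a12), (a21, a22) = a
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


def theorem2_sufficient(a, alpha):
    """Check det(I - A) != 0 and that both eigenvalues lie in S_alpha."""
    (a11, a12), (a21, a22) = a
    det_i_minus_a = (1.0 - a11) * (1.0 - a22) - a12 * a21
    if det_i_minus_a == 0.0:
        return False
    return all(in_region(lam, alpha) for lam in eigenvalues_2x2(a))


if __name__ == "__main__":
    print(theorem2_sufficient(((-0.75, -1.0), (1.0, -1.0)), 0.5))
```

Since Theorem 2 is stated as a sufficient condition, a `False` result here does not by itself imply instability.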
But, in many applications, one needs explicit criteria on the entries of the matrix
associated with the considered system. In this article, we wish to formulate explicit
stability conditions for two-dimensional Riemann–Liouville type fractional nabla
difference systems.

2 Preliminaries

Throughout this article, we use the following notations, definitions and known results
of discrete calculus [5, 6]: denote the set of all real numbers and complex numbers
by R and C, respectively. For any a ∈ R, define Na = {a, a + 1, a + 2, . . .}. Assume
that empty sums and products are taken to be 0 and 1, respectively.

2.1 Fractional Nabla Calculus

Definition 1 (Gamma Function) For any t ∈ R \ {. . . , −2, −1, 0}, the gamma function is defined by

$$\Gamma(t) = \int_{0}^{\infty} e^{-s} s^{t-1}\, ds, \quad t > 0, \qquad \Gamma(t+1) = t\,\Gamma(t).$$

Definition 2 (Rising Factorial Function) For any t ∈ R \ {. . . , −2, −1, 0} and α ∈ R such that (t + α) ∈ R \ {. . . , −2, −1, 0}, the rising factorial function is defined by

$$t^{\overline{\alpha}} = \frac{\Gamma(t+\alpha)}{\Gamma(t)}, \quad 0^{\overline{\alpha}} = 0.$$

We observe the following properties of rising factorial functions.

Theorem 3 Assume that the following factorial functions are well defined.
1. $t^{\overline{\alpha}}\,(t+\alpha)^{\overline{\beta}} = t^{\overline{\alpha+\beta}}$.
2. If t ≤ r, then $t^{\overline{\alpha}} \le r^{\overline{\alpha}}$.
3. If α < t ≤ r, then $r^{\overline{-\alpha}} \le t^{\overline{-\alpha}}$.
4. $(t+1)^{\alpha-1} \le (t+1)^{\overline{\alpha-1}} \le t^{\alpha-1}$, 0 ≤ α ≤ 1.
5. $(t+b)^{\overline{a-b}} = t^{a-b}\left(1 + O\left(\frac{1}{t}\right)\right)$, |t| → ∞.
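The rising factorial of Definition 2 is straightforward to evaluate with the gamma function, which also gives a quick numerical sanity check of some of the listed properties. The sketch below is illustrative only; the helper name `rising` and the sample arguments are choices made here, and the code assumes arguments for which both gamma values are defined.

```python
import math


def rising(t, alpha):
    """Rising factorial Gamma(t + alpha) / Gamma(t) of Definition 2.

    Assumes t and t + alpha are not non-positive integers.
    """
    return math.gamma(t + alpha) / math.gamma(t)


if __name__ == "__main__":
    t, a, b = 2.5, 0.3, 0.4
    # Property 1: the two printed values should agree
    print(rising(t, a) * rising(t + a, b), rising(t, a + b))
    # Property 4 with t = 3 and alpha = 0.5: the three values should be ordered
    print(4.0 ** -0.5, rising(4.0, -0.5), 3.0 ** -0.5)
```

For large arguments `math.gamma` overflows, so a production version would more likely work with `math.lgamma`; that variant is not shown here.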

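Definition 3 Let $u : \mathbb{N}_{a} \to \mathbb{R}$, α ∈ R such that 0 < α < 1.

1. (Nabla Difference) The first-order backward (nabla) difference of u is defined by

$$\left(\nabla u\right)(t) = u(t) - u(t-1), \quad t \in \mathbb{N}_{a+1}.$$

2. (Fractional Nabla Sum) The αth-order nabla sum of u based at ρ(a) = (a − 1) is given by

$$\left(\nabla^{-\alpha}_{\rho(a)} u\right)(t) = \frac{1}{\Gamma(\alpha)} \sum_{s=a}^{t} (t - \rho(s))^{\overline{\alpha-1}}\, u(s), \quad t \in \mathbb{N}_{a}.$$

3. (R-L Fractional Nabla Difference) The Riemann–Liouville-type αth-order nabla difference of u based at ρ(a) = (a − 1) is given by

$$\left(\nabla^{\alpha}_{\rho(a)} u\right)(t) = \left(\nabla\left(\nabla^{-(1-\alpha)}_{\rho(a)} u\right)\right)(t) = \frac{1}{\Gamma(-\alpha)} \sum_{s=a}^{t} (t - \rho(s))^{\overline{-\alpha-1}}\, u(s), \quad t \in \mathbb{N}_{a}.$$

4. (Caputo Fractional Nabla Difference) The Caputo-type αth-order nabla difference of u based at a is given by

$$\left(\nabla^{\alpha}_{a*} u\right)(t) = \left(\nabla^{-(1-\alpha)}_{a}\left(\nabla u\right)\right)(t) = \left(\nabla^{\alpha}_{a} u\right)(t) - \frac{(t-a)^{\overline{-\alpha}}}{\Gamma(1-\alpha)}\, u(a), \quad t \in \mathbb{N}_{a+1}.$$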
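The two expressions for the Riemann–Liouville nabla difference in item 3 can be compared numerically. The following sketch implements the fractional nabla sum of item 2 and both forms of item 3 for 0 < α < 1; the function names and the test sequence u(s) = (s + 1)² are illustrative assumptions, and empty sums are taken to be 0 as stated at the start of this section.

```python
import math


def rising(t, alpha):
    """Rising factorial Gamma(t + alpha) / Gamma(t)."""
    return math.gamma(t + alpha) / math.gamma(t)


def nabla_sum(u, order, t, a=0):
    """Fractional nabla sum of the given positive order, based at rho(a) = a - 1."""
    if t < a:
        return 0.0  # empty sum
    total = sum(rising(t - s + 1, order - 1.0) * u(s) for s in range(a, t + 1))
    return total / math.gamma(order)


def rl_nabla_difference(u, alpha, t, a=0):
    """Item 3, first form: nabla applied to the (1 - alpha)th-order nabla sum."""
    return nabla_sum(u, 1.0 - alpha, t, a) - nabla_sum(u, 1.0 - alpha, t - 1, a)


def rl_nabla_difference_direct(u, alpha, t, a=0):
    """Item 3, second form: the explicit sum with exponent -alpha - 1."""
    total = sum(rising(t - s + 1, -alpha - 1.0) * u(s) for s in range(a, t + 1))
    return total / math.gamma(-alpha)


if __name__ == "__main__":
    u = lambda s: float((s + 1) ** 2)  # illustrative test sequence
    for t in range(4):
        print(t, rl_nabla_difference(u, 0.5, t), rl_nabla_difference_direct(u, 0.5, t))
```

Both columns printed by the usage lines should coincide up to rounding, which is a convenient check when implementing these operators.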

2.2 Volterra Difference Systems

Consider a linear Volterra difference system of convolution-type

$$u(t+1) = \sum_{j=0}^{t} B(t-j)\, u(j), \tag{5}$$

where $u(t) = \left(u_{1}(t), u_{2}(t), \ldots, u_{k}(t)\right)^{T}$, $u_{i} : \mathbb{N}_{0} \to \mathbb{R}$, 1 ≤ i ≤ k and $B(t) = \left(b_{ij}(t)\right)$, $b_{ij} : \mathbb{N}_{0} \to \mathbb{R}$, 1 ≤ i, j ≤ k, is a k × k matrix valued function defined on $\mathbb{N}_{0}$. We assume that $B(t) \in l^{1}$, i.e.

$$\sum_{j=0}^{\infty} |B(j)| < \infty.$$

Now, we state the standard definitions of stability and asymptotic stability adapted
to the Volterra system (5).
Definition 4 Consider (5) along with the initial condition u(0) = u0 . Then, (5) is
said to be
1. stable, if for any real vector u0 there exists ε > 0 such that the corresponding
solution u(t) of (5) satisfies |u(t)| < ε for all t ∈ N1 .
2. asymptotically stable, if u(t) → 0 as t → ∞ for any real vector u0 .
3. uniformly stable, if for any ε > 0, there exists a δ = δ(ε) > 0 such that if u0 is
any real vector with |u0 | < δ then the corresponding solution u(t) of (5) satisfies
|u(t)| < ε for all t ∈ N1 .
4. uniformly asymptotically stable, if it is uniformly stable and if there exists a η > 0
such that for any ε > 0 there is N = N (ε) ∈ N1 such that if u0 is any real vector
with |u0 | < η then the corresponding solution u(t) of (5) satisfies |u(t)| < ε for
all t ∈ NN .
Definition 5 The Z-transform of a sequence of real numbers $\{v(t)\}_{t \in \mathbb{N}_{0}}$ is defined by

$$\tilde{v}(z) = Z\left[v(t)\right] = \sum_{k=0}^{\infty} v(k)\, z^{-k},$$

where z ∈ C for which the series converges absolutely. The Z-transform of a sequence of vectors $\{u(t)\}_{t \in \mathbb{N}_{0}}$ and a sequence of matrices $\{B(t)\}_{t \in \mathbb{N}_{0}}$ over R are given by $\tilde{u}(z) = Z\left[u(t)\right] = \left(\tilde{u}_{1}(z), \tilde{u}_{2}(z), \ldots, \tilde{u}_{k}(z)\right)^{T}$ and $\tilde{B}(z) = Z\left[B(t)\right] = \left(\tilde{b}_{ij}(z)\right)$, where $\tilde{u}_{i}(z) = Z\left[u_{i}(t)\right]$, 1 ≤ i ≤ k and $\tilde{b}_{ij}(z) = Z\left[b_{ij}(t)\right]$, 1 ≤ i, j ≤ k.
The Z-transform can be used to discuss the stability properties of (5) by analysing the roots of the associated characteristic equation $\det\left(zI - \tilde{B}(z)\right) = 0$, where I is the k × k identity matrix. In this connection, we recall a few important results which will be used to establish the main results of this article.
Theorem 4 A necessary and sufficient condition for uniform asymptotic stability of (5) is $\det\left(zI - \tilde{B}(z)\right) \ne 0$ for all |z| ≥ 1.
An application of the preceding theorem will be introduced next. This will provide explicit criteria for asymptotic stability. Let

$$\beta_{ij} = \sum_{t=0}^{\infty} |b_{ij}(t)|, \quad 1 \le i, j \le k.$$

Theorem 5 The zero solution of (5) is uniformly asymptotically stable if either one of the following conditions holds:
1. $\sum_{j=1}^{k} \beta_{ij} < 1$, for each 1 ≤ i ≤ k.
2. $\sum_{i=1}^{k} \beta_{ij} < 1$, for each 1 ≤ j ≤ k.

The following theorem provides criteria for uniform stability of (5).

Theorem 6 The zero solution of (5) is uniformly stable if

$$\sum_{i=1}^{k} \beta_{ij} \le 1,$$

for each 1 ≤ j ≤ k.

3 Main Results

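In this section, we investigate a few stability properties of the two-dimensional Riemann–Liouville-type fractional nabla difference system

$$\left(\nabla^{\alpha}_{\rho(0)} U\right)(t) = A\, U(t), \quad 0 < \alpha < 1, \quad t \in \mathbb{N}_{1}, \tag{6}$$

where $U = \begin{pmatrix} u_{1} \\ u_{2} \end{pmatrix}$; $u_{1}, u_{2} : \mathbb{N}_{0} \to \mathbb{R}$, $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$; $a_{11}, a_{12}, a_{21}, a_{22} \in \mathbb{R}$. Let T = Trace(A) and D = det(A). We assume the following necessary and sufficient condition for the existence of a unique solution of (6):

$$\det(I - A) \ne 0, \quad \text{i.e. } T - D \ne 1.$$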

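(I) First, we obtain the equivalent Volterra-type difference system of (6). Rewriting (6), for $t \in \mathbb{N}_{1}$, we have

$$\left(\nabla^{\alpha}_{\rho(0)} u_{1}\right)(t) = a_{11} u_{1}(t) + a_{12} u_{2}(t), \tag{7}$$

$$\left(\nabla^{\alpha}_{\rho(0)} u_{2}\right)(t) = a_{21} u_{1}(t) + a_{22} u_{2}(t). \tag{8}$$

Expanding the Riemann–Liouville operator in (7) and (8), for $t \in \mathbb{N}_{1}$, we get

$$\frac{1}{\Gamma(-\alpha)} \sum_{s=0}^{t} (t - \rho(s))^{\overline{-\alpha-1}}\, u_{1}(s) = a_{11} u_{1}(t) + a_{12} u_{2}(t), \tag{9}$$

$$\frac{1}{\Gamma(-\alpha)} \sum_{s=0}^{t} (t - \rho(s))^{\overline{-\alpha-1}}\, u_{2}(s) = a_{21} u_{1}(t) + a_{22} u_{2}(t). \tag{10}$$

Rearranging the terms in (9) and (10), we have

$$(1 - a_{11})\, u_{1}(t) = -\frac{1}{\Gamma(-\alpha)} \sum_{s=0}^{t-1} (t - \rho(s))^{\overline{-\alpha-1}}\, u_{1}(s) + a_{12} u_{2}(t), \quad t \in \mathbb{N}_{1}, \tag{11}$$

$$(1 - a_{22})\, u_{2}(t) = -\frac{1}{\Gamma(-\alpha)} \sum_{s=0}^{t-1} (t - \rho(s))^{\overline{-\alpha-1}}\, u_{2}(s) + a_{21} u_{1}(t), \quad t \in \mathbb{N}_{1}. \tag{12}$$

Replacing t by t + 1 and collecting the unknowns at t + 1 on the left-hand side, we get

$$(1 - a_{11})\, u_{1}(t+1) - a_{12} u_{2}(t+1) = -\frac{1}{\Gamma(-\alpha)} \sum_{s=0}^{t} (t + 2 - s)^{\overline{-\alpha-1}}\, u_{1}(s), \quad t \in \mathbb{N}_{0}, \tag{13}$$

$$-a_{21} u_{1}(t+1) + (1 - a_{22})\, u_{2}(t+1) = -\frac{1}{\Gamma(-\alpha)} \sum_{s=0}^{t} (t + 2 - s)^{\overline{-\alpha-1}}\, u_{2}(s), \quad t \in \mathbb{N}_{0}. \tag{14}$$

The matrix form of (13) and (14) is given by

$$\begin{pmatrix} 1 - a_{11} & -a_{12} \\ -a_{21} & 1 - a_{22} \end{pmatrix} \begin{pmatrix} u_{1}(t+1) \\ u_{2}(t+1) \end{pmatrix} = -\sum_{s=0}^{t} \frac{(t + 2 - s)^{\overline{-\alpha-1}}}{\Gamma(-\alpha)} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} u_{1}(s) \\ u_{2}(s) \end{pmatrix}.$$

Thus,

$$U(t+1) = \sum_{s=0}^{t} B(t-s)\, U(s), \quad t \in \mathbb{N}_{0}, \tag{15}$$

is the equivalent Volterra-type difference system of (6) with

$$\begin{aligned}
B(t) &= -\frac{(t+2)^{\overline{-\alpha-1}}}{\Gamma(-\alpha)} \begin{pmatrix} 1 - a_{11} & -a_{12} \\ -a_{21} & 1 - a_{22} \end{pmatrix}^{-1} \\
&= -\frac{1}{(1 - T + D)}\, \frac{(t+2)^{\overline{-\alpha-1}}}{\Gamma(-\alpha)} \begin{pmatrix} 1 - a_{22} & a_{12} \\ a_{21} & 1 - a_{11} \end{pmatrix}.
\end{aligned} \tag{16}$$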

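(II) Next, we derive the characteristic equation of (15). Taking Z-transforms on both sides of (16), we get

$$\tilde{B}(z) = -\frac{z}{(1 - T + D)} \left[1 - \left(1 - \frac{1}{z}\right)^{\alpha}\right] \begin{pmatrix} 1 - a_{22} & a_{12} \\ a_{21} & 1 - a_{11} \end{pmatrix} \tag{17}$$

for all z ∈ C with |z| ≥ 1. Let

$$S = -\frac{1}{(1 - T + D)} \left[1 - \left(1 - \frac{1}{z}\right)^{\alpha}\right]. \tag{18}$$

Consider

$$\begin{aligned}
\det\left(zI - \tilde{B}(z)\right) &= \begin{vmatrix} z - zS + zSa_{22} & -zSa_{12} \\ -zSa_{21} & z - zS + zSa_{11} \end{vmatrix} \\
&= z^{2}\left[(1 - S)^{2} + S(1 - S)T + S^{2} a_{11} a_{22}\right] - z^{2} S^{2} a_{11} a_{22}.
\end{aligned}$$

Thus, the characteristic equation of (15) becomes

$$z^{2}\left[(1 - S)^{2} + S(1 - S)T\right] = 0. \tag{19}$$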

(III) Finally, we formulate an explicit necessary and sufficient condition for asymptotic stability of the Volterra system (15). Applying Theorem 4, the system (15) is uniformly asymptotically stable if and only if

$$(1 - S)^{2} + S(1 - S)T \ne 0, \tag{20}$$

for all z ∈ C with |z| ≥ 1. Consider

$$(1 - S)^{2} + S(1 - S)T = 0. \tag{21}$$

If T = 1, then D ≠ 0 and the only root of (21) is S = 1, which implies

$$\left(1 - \frac{1}{z}\right)^{\alpha} = 1 + D. \tag{22}$$

We analyse (22) with respect to D. If 1 + D < 0, then (22) has no root $z_{r}$, and hence the condition (20) is satisfied trivially. If 1 + D ≥ 0, then the unique nonzero real root of the characteristic equation (19) is given by

$$z_{r} = \frac{1}{1 - (1 + D)^{\frac{1}{\alpha}}}.$$

To satisfy (20), we require $(1 + D) > 2^{\alpha}$.

Suppose T ≠ 1. Then, the roots of (21) are

$$S = 1 \quad \text{and} \quad S = \frac{1}{1 - T},$$

which imply

$$\left(1 - \frac{1}{z}\right)^{\alpha} = 2 - T + D \tag{23}$$

and

$$\left(1 - \frac{1}{z}\right)^{\alpha} = 2 + \frac{D}{1 - T}, \tag{24}$$

respectively. We analyse (23) and (24) with respect to T and D.

1. If 2 − T + D < 0, then (23) has no root $z_{r}$ and hence the condition (20) is satisfied trivially. If 2 − T + D ≥ 0, then the unique nonzero real root of the characteristic equation (19) is given by

$$z_{r} = \frac{1}{1 - (2 - T + D)^{\frac{1}{\alpha}}}.$$

To satisfy (20), we require $(2 - T + D) > 2^{\alpha}$.

2. If $2 + \frac{D}{1-T} < 0$, then (24) has no root $z_{r}$ and hence the condition (20) is satisfied trivially. If $2 + \frac{D}{1-T} \ge 0$, then the unique nonzero real root of the characteristic equation (19) is given by

$$z_{r} = \frac{1}{1 - \left(2 + \frac{D}{1-T}\right)^{\frac{1}{\alpha}}}.$$

To satisfy (20), we require $2 + \frac{D}{1-T} > 2^{\alpha}$.

Compiling the above results, we provide a necessary and sufficient condition for the asymptotic stability of (15) in the following theorem.

Theorem 7 The system (15) is uniformly asymptotically stable if and only if

$$T = 1, \quad D \in \mathbb{R} \setminus [-1, 2^{\alpha} - 1] \tag{25}$$

or

$$T \ne 1, \quad (D - T) \ \text{and} \ \frac{D}{1 - T} \in \mathbb{R} \setminus [-2, 2^{\alpha} - 2]. \tag{26}$$

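Now, we apply Theorems 5 and 6 to establish explicit criteria for asymptotic stability of (15). Consider

$$\beta_{11} = \sum_{t=0}^{\infty} |b_{11}(t)| = \left|\frac{1 - a_{22}}{1 - T + D}\right| \sum_{t=0}^{\infty} \left|\frac{(t+2)^{\overline{-\alpha-1}}}{\Gamma(-\alpha)}\right| = \left|\frac{1 - a_{22}}{1 - T + D}\right|.$$

Similarly, we get

$$\beta_{12} = \left|\frac{a_{12}}{1 - T + D}\right|, \quad \beta_{21} = \left|\frac{a_{21}}{1 - T + D}\right|, \quad \beta_{22} = \left|\frac{1 - a_{11}}{1 - T + D}\right|.$$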

Theorem 8 The zero solution of (15) is uniformly asymptotically stable if either one
of the following conditions holds:

|a12 | + |1 − a22 |, |1 − a11 | + |a21 | < |1 − T + D| (27)

or
|a21 | + |1 − a22 |, |1 − a11 | + |a12 | < |1 − T + D|. (28)

Theorem 9 The zero solution of (15) is uniformly stable if

|a21 | + |1 − a22 |, |1 − a11 | + |a12 | < |1 − T + D|. (29)

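Finally, we consider the following two-dimensional Caputo-type fractional nabla difference system

$$\left(\nabla^{\alpha}_{0*} U\right)(t) = A\, U(t), \quad 0 < \alpha < 1, \quad t \in \mathbb{N}_{1}. \tag{30}$$

Using Definition 3 in (30), we get

$$\left(\nabla^{\alpha}_{0} U\right)(t) = A\, U(t) + F(t), \quad t \in \mathbb{N}_{1}, \tag{31}$$

where

$$F(t) = \frac{t^{\overline{-\alpha}}}{\Gamma(1 - \alpha)} \begin{pmatrix} u_{1}(0) \\ u_{2}(0) \end{pmatrix}, \quad t \in \mathbb{N}_{0}. \tag{32}$$

Then,

$$U(t+1) = \sum_{s=0}^{t} B(t-s)\, U(s) + G(t), \quad t \in \mathbb{N}_{0}, \tag{33}$$

is the equivalent Volterra-type difference system of (30) with

$$G(t) = \frac{1}{(1 - T + D)}\, \frac{t^{\overline{-\alpha}}}{\Gamma(1 - \alpha)} \begin{pmatrix} 1 - a_{22} & a_{12} \\ a_{21} & 1 - a_{11} \end{pmatrix} \begin{pmatrix} u_{1}(0) \\ u_{2}(0) \end{pmatrix}. \tag{34}$$

Consequently, the characteristic equation of (30) becomes $\det\left(zI - \tilde{B}(z)\right) = 0$, which is the same as (19).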

4 Conclusion

To summarise this article, we reformulate some of its results for the fractional nabla
difference system (6) (or (30)). Theorems 7–9 imply the following assertions.
Corollary 1 The system (6) (or (30)) is uniformly asymptotically stable if and only
if either (25) or (26) holds.

Corollary 2 The zero solution of (6) (or (30)) is uniformly asymptotically stable if
either (27) or (28) holds.

Corollary 3 The zero solution of (6) (or (30)) is uniformly stable if (29) holds.

Example 1 Consider the fractional nabla difference system
\[ \nabla^{0.5}_{\rho(0)} u_1(t) = -0.75\,u_1(t) - u_2(t), \quad t \in \mathbb{N}_1, \tag{35} \]
\[ \nabla^{0.5}_{\rho(0)} u_2(t) = u_1(t) - u_2(t), \quad t \in \mathbb{N}_1. \tag{36} \]

Solution: Here $\alpha = 0.5$, $A = \begin{pmatrix} -0.75 & -1 \\ 1 & -1 \end{pmatrix}$ and $u_0 = \begin{pmatrix} 0.25 \\ 0.75 \end{pmatrix}$. Then, $T = -1.75$, $D = 1.75$, $T - D = -3.50$ and $\frac{D}{T-1} = -0.6363$. Clearly, condition (26) holds. Hence, the system (35)–(36) is uniformly asymptotically stable.
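The arithmetic in Example 1 can be checked with a few lines of Python. The sketch below recomputes the trace $T$ and determinant $D$ of $A$ and also tests the sufficient conditions (27) and (28) of Theorem 8. It assumes that each of those conditions means that both listed sums are smaller than $|1 - T + D|$; the variable names are illustrative only.

```python
# Coefficient matrix of Example 1, entries read off from (35)-(36).
a11, a12 = -0.75, -1.0
a21, a22 = 1.0, -1.0

T = a11 + a22                  # trace of A
D = a11 * a22 - a12 * a21      # determinant of A
print("T =", T, "D =", D, "T - D =", T - D, "D/(T-1) =", D / (T - 1))

# Sufficient conditions (27) and (28), read as "both sums are below the bound"
# (this reading is an assumption about the notation of Theorem 8).
bound = abs(1 - T + D)
cond27 = max(abs(a12) + abs(1 - a22), abs(1 - a11) + abs(a21)) < bound
cond28 = max(abs(a21) + abs(1 - a22), abs(1 - a11) + abs(a12)) < bound
print("(27) holds:", cond27, "(28) holds:", cond28)
```

Under this reading, both explicit conditions are satisfied for this matrix, which agrees with the conclusion drawn from (26).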

References

1. Matignon, D.: Stability results for fractional differential equations with applications to control
processing. In: Computational Engineering in Systems and Application Multiconference, vol.
2, pp. 963–968. IMACS, IEEE-SMC, Lille, France (1996)
2. Li, C.P., Zhang, F.R.: A survey on the stability of fractional differential equations. Eur. Phys.
J. Spec. Top. 193(1), 27–47 (2011)
3. Čermák, J., Győri, I., Nechvátal, L.: Stability regions for linear fractional difference systems
and their discretizations. Appl. Math. Comput. 219, 7012–7022 (2013)
4. Čermák, J., Győri, I., Nechvátal, L.: On explicit stability conditions for a linear fractional
difference system. Fractional Calc. Appl. Anal. 18(3), 651–672 (2015)
5. Elaydi, S.: An Introduction to Difference Equations, 3rd edn. Springer, New York (2005)
6. Goodrich, C., Peterson, A.C.: Discrete Fractional Calculus. Springer International Publishing
(2015). https://doi.org/10.1007/978-3-319-25562-0
7. Agarwal, R.P.: Difference Equations and Inequalities. Marcel Dekker, New York (1992)
8. Podlubny, I.: Fractional Differential Equations. Academic Press, San Diego (1999)
9. Kelly, W.G., Peterson, A.C.: Difference Equations: An Introduction with Applications, 2nd
edn. Academic Press, San Diego (2001)
Chapter 25
Discrete Legendre Collocation Methods
for Fredholm–Hammerstein Integral
Equations with Weakly Singular Kernel

Bijaya Laxmi Panigrahi

Abstract In this paper, we discuss the discrete Legendre collocation methods for Fredholm–Hammerstein integral equations with a weakly singular kernel. Using a sufficiently accurate quadrature rule, we obtain the convergence rates of the discrete Legendre collocation solutions to the exact solution in both the $L^2$ and infinity norms. Numerical examples are presented to validate the theoretical estimates.

Keywords Hammerstein integral equations · Weakly singular kernels · Spectral methods · Collocation methods · Legendre polynomials

1 Introduction

We consider the following Fredholm–Hammerstein integral equation
\[ u(s) - \int_{-1}^{1} k(s,t)\,\psi(t,u(t))\,dt = f(s), \quad -1 \le s \le 1, \tag{1} \]
where $k$, $f$ and $\psi$ are known functions, $u$ is the unknown function to be determined in a Banach space $X$, and the kernel $k(\cdot,\cdot)$ is weakly singular, of the form
\[ k(s,t) = m(s,t)\,g_\alpha(|s-t|), \]
with $m(s,t) \in C([-1,1] \times [-1,1])$ and
\[ g_\alpha(x) = \begin{cases} x^{\alpha-1}, & \text{if } 1/2 < \alpha < 1, \\ \log x, & \text{if } \alpha = 1. \end{cases} \]

B. L. Panigrahi (B)
Department of Mathematics, Sambalpur University,
Sambalpur 768019, Odisha, India

© Springer Nature Singapore Pte Ltd. 2018 315

Problems of the type (1) arise as reformulations of boundary value problems with certain nonlinear boundary conditions.
Many authors have studied numerical methods for nonlinear integral equations with smooth kernels and also with weakly singular kernels [7–11, 13]. The Galerkin, collocation, Petrov–Galerkin, degenerate kernel, and Nyström methods are the commonly used projection methods for finding the numerical solution of Eq. (1). In all the projection methods, the infinite-dimensional space $X$ is approximated by a space of piecewise polynomials. However, to get better accuracy in piecewise polynomial-based projection methods, one has to solve a large system of nonlinear equations, because the partition must contain a large number of subintervals. For this reason, spectral methods have developed rapidly in recent years, and the Legendre spectral methods have been applied to both linear and nonlinear integral equations. The Legendre spectral projection methods for Fredholm–Hammerstein integral equations with smooth kernel have been studied in [4]. An important point is that if $P_n$ denotes either the orthogonal or the interpolatory projection from $X$ into a subspace of global polynomials of degree $\le n$, then $\|P_n\|_\infty$ is unbounded. Nevertheless, in [4], convergence rates similar to those for piecewise polynomial bases were obtained for the approximate solution of Fredholm–Hammerstein integral equations with smooth kernel in both the $L^2$ and infinity norms.
However, the spectral projection methods lead to a nonlinear algebraic system whose coefficients are integrals arising from the inner products and the integral operator $K$. These integrals are almost always evaluated numerically, yet in all the above methods the error due to numerical integration has been ignored. In the discrete methods, the integrals appearing in the nonlinear system of equations are therefore replaced by a numerical quadrature rule. The discrete spectral methods for nonlinear integral equations have been discussed in [5]. However, all of the above methods consider nonlinear integral equations with smooth kernels. Integral equations with weakly singular kernels of algebraic and logarithmic type cover many important applications; problems of this kind arise from potential problems, Dirichlet problems, the description of the hydrodynamic interaction between elements of a polymer chain in solution, mathematical problems of radiative equilibrium, and transport problems.
In this paper, we apply the discrete Legendre spectral collocation methods to solve Fredholm–Hammerstein integral equations with a weakly singular kernel. Our purpose is to obtain convergence rates similar to those obtained with piecewise and global polynomial bases for smooth kernels.
The organization of this paper is as follows. In Sect. 2, we discuss the discrete Legendre collocation methods for Hammerstein integral equations with a weakly singular kernel. In Sect. 3, we derive the convergence rates in both the $L^2$ and infinity norms. In Sect. 4, we illustrate our results with a numerical example. Throughout this paper, $c$ denotes a generic constant.

2 Hammerstein Integral Equations

In this section, we discuss the collocation methods for solving Hammerstein integral equations with weakly singular kernels (1) using Legendre polynomial basis functions.
Let $X = C[-1,1]$ or $L^2[-1,1]$ with norms $\|\cdot\|_\infty$ and $\|\cdot\|_{L^2}$, respectively. Throughout the paper, the following assumptions are made on $f$, $k(\cdot,\cdot)$ and $\psi(\cdot,u(\cdot))$:
(i) $f \in C[-1,1]$.
(ii) For $m(s,t) \in C^r([-1,1] \times [-1,1])$, $r \ge 1$,
\[ \|m\|_\infty = \sup_{s,t \in [-1,1]} |m(s,t)| \le M < \infty, \qquad \|m\|_{r,\infty} = \max_{0 \le i,j \le r,\ t,s \in [-1,1]} \left| \frac{\partial^{i+j}}{\partial s^i \partial t^j} m(s,t) \right|. \]
(iii) For $s, s' \in [-1,1]$, $\| g_\alpha(|s-t|) - g_\alpha(|s'-t|) \|_{L^2} \to 0$ and $\| m_s(\cdot) - m_{s'}(\cdot) \|_{L^2} \to 0$ as $s \to s'$.
(iv) For $1/2 < \alpha < 1$, $\sup_{s \in [-1,1]} \int_{-1}^{1} |g_\alpha(|s-t|)|^2\,dt = M_2 < \infty$.
(v) The nonlinear function $\psi(t,u)$ is bounded and continuous over $[-1,1] \times \mathbb{R}$, and $\psi(t,u)$ is Lipschitz continuous in $u$, i.e., there exists $c_1 > 0$ such that for any $u_1, u_2 \in \mathbb{R}$,
\[ |\psi(t,u_1) - \psi(t,u_2)| \le c_1 |u_1 - u_2|, \quad \forall\, t \in [-1,1]. \]
(vi) The partial derivative $\psi^{(0,1)}(t,u(t))$ of $\psi$ with respect to the second variable exists and is Lipschitz continuous in $u$, i.e., there exists $c_2 > 0$ such that for any $u_1, u_2 \in \mathbb{R}$,
\[ |\psi^{(0,1)}(t,u_1) - \psi^{(0,1)}(t,u_2)| \le c_2 |u_1 - u_2|, \quad \forall\, t \in [-1,1]. \]
This implies $\psi^{(0,1)}(\cdot,\cdot) \in C([-1,1] \times \mathbb{R})$ and $\|\psi^{(0,1)}\|_\infty \le B$.
(vii) We assume that $M$, $M_2$, and $c_1$ satisfy the condition $\sqrt{2}\,M_2 M c_1 < 1$.
Define
\[ z(t) = \psi(t,u(t)), \quad t \in [-1,1]. \tag{2} \]

It is easy to show, by using the chain rule for higher derivatives, that $z \in C^r[-1,1]$, because $\psi(\cdot,\cdot) \in C^r([-1,1] \times \mathbb{R})$ and $u \in C^r[-1,1]$.
Then, the Hammerstein integral equation (1) can be written in the operator form
\[ u = Kz + f, \tag{3} \]
where
\[ Kz(s) = \int_{-1}^{1} k(s,t)\,z(t)\,dt. \tag{4} \]

For our convenience, we consider a nonlinear operator Ψ : X → X defined by

Ψ (u)(t) = ψ(t, u(t)).

Then, Eq. (2) becomes

z = Ψ (Kz + f ). (5)

Let T (u) = Ψ (Ku + f ), u ∈ X, then the Eq. (5) can be written as

T z = z. (6)

Now, we prove the existence and uniqueness of the solution of Eq. (6) in the next theorem.
Theorem 1 Let $X = C[-1,1]$, $f \in X$ and let $g_\alpha(|s-t|)$ satisfy assumption (iv) with $m(\cdot,\cdot) \in C([-1,1] \times [-1,1])$. Let $\psi(t,u(t)) \in C([-1,1] \times \mathbb{R})$ satisfy the Lipschitz condition in the second variable, with $\sqrt{2}\,M_2 M c_1 < 1$. Then, the operator equation $Tz = z$ has a unique solution $z_0 \in X$, i.e., $z_0 = Tz_0$.

Proof Using the Cauchy–Schwarz inequality, we get
\[ \|Kz\|_\infty = \sup_{s \in [-1,1]} |Kz(s)| \le \sup_{t,s \in [-1,1]} |m(s,t)| \sup_{s \in [-1,1]} \int_{-1}^{1} |g_\alpha(|s-t|)\,z(t)|\,dt \le M M_2 \|z\|_{L^2}. \tag{7} \]
Since $f \in C[-1,1]$, it follows that $u = Kz + f \in C[-1,1]$. Let $z_1, z_2 \in C[-1,1]$. Using the Lipschitz continuity of $\psi(\cdot,u(\cdot))$ with Eq. (7), we get
\[ \begin{aligned} \|Tz_1 - Tz_2\|_\infty &= \|\Psi(Kz_1 + f) - \Psi(Kz_2 + f)\|_\infty \\ &\le c_1 \|K(z_1 - z_2)\|_\infty \\ &\le c_1 M M_2 \|z_1 - z_2\|_{L^2} \le \sqrt{2}\,M_2 c_1 M \|z_1 - z_2\|_\infty. \end{aligned} \tag{8} \]
By assumption (vii), $\sqrt{2}\,M_2 M c_1 < 1$, hence $T$ is a contraction mapping on $X$. By the Banach contraction theorem, $T$ has a unique fixed point in $X$. Denote this unique solution by $z_0$. This completes the proof.
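The contraction argument also suggests a constructive procedure: the iteration $z_{k+1} = T z_k$ converges to the fixed point from any starting value. The following is a minimal sketch of that iteration on a scalar toy map, not on the operator $T$ of Eq. (6) itself. The map `toy_T` and the helper name `fixed_point` are hypothetical and chosen only to illustrate the principle.

```python
import math

def fixed_point(T, z0, tol=1e-12, max_iter=200):
    """Iterate z_{k+1} = T(z_k) until two successive iterates differ by < tol."""
    z = z0
    for k in range(1, max_iter + 1):
        z_new = T(z)
        if abs(z_new - z) < tol:
            return z_new, k
        z = z_new
    raise RuntimeError("no convergence within max_iter iterations")

# Hypothetical stand-in for T: a contraction on R with Lipschitz constant 1/2.
toy_T = lambda z: 0.5 * math.cos(z)

z_star, iterations = fixed_point(toy_T, 0.0)
print(z_star, iterations)
```

Because the Lipschitz constant of the toy map is at most 1/2, the error at least halves at every step. In the setting of Theorem 1 the same role is played by the constant $\sqrt{2}\,M_2 M c_1$.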

To describe the Legendre collocation methods for the solution of the Hammerstein integral equation (1), we first approximate the space $X$ by a finite-dimensional space $X_n$. Let $X_n$ be the set of all polynomials of degree not more than $n$. Let $\{\tau_0, \tau_1, \ldots, \tau_n\}$ be the zeros of the Legendre polynomial of degree $n+1$. For $z \in C[-1,1]$, we define the Lagrange interpolation operator $Q_n : X \to X_n$ by
\[ Q_n z(s) = \sum_{i=0}^{n} z(\tau_i) L_i(s), \quad s \in [-1,1], \]
where
\[ L_i(s) = \frac{\pi(s)}{(s - \tau_i)\,\pi'(\tau_i)}, \qquad \pi(s) = (s - \tau_0)(s - \tau_1) \cdots (s - \tau_n). \]
Then, $Q_n : X \to X_n$ satisfies
\[ Q_n u \in X_n, \quad Q_n u(\tau_i) = u(\tau_i), \quad i = 0, 1, \ldots, n, \quad u \in X. \tag{9} \]

We quote the following lemma from [3, 6], which gives the properties of the interpolatory projection operator $Q_n$.
Lemma 1 Let $Q_n : X \to X_n$ be the interpolatory projection operator defined by (9). Then, the following hold:
(i) $\{Q_n : n \in \mathbb{N}\}$ is uniformly bounded in the $L^2$ norm, that is, $\|Q_n u\|_{L^2} \le p \|u\|_\infty$, $u \in C[-1,1]$, where $p$ is a constant independent of $n$.
(ii) For any $u \in C^r[-1,1]$, there exists a constant $c$ independent of $n$ such that
\[ \|Q_n u - u\|_{L^2} \le c\,n^{-r} \|u^{(r)}\|_{L^2}. \]
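The interpolatory projection can be tried out numerically. The sketch below computes the zeros of the Legendre polynomial of degree $n+1$ by Newton's method and evaluates $Q_n z$ in Lagrange form. It is a plain-Python illustration under our own choice of algorithm (Newton iteration with the standard cosine initial guess). The function names are hypothetical and not taken from the paper.

```python
import math

def legendre_pair(m, x):
    """Return (P_m(x), P_{m-1}(x)) using the three-term recurrence (m >= 1)."""
    p_prev, p = 1.0, x                       # P_0 and P_1
    for i in range(1, m):
        p_prev, p = p, ((2 * i + 1) * x * p - i * p_prev) / (i + 1)
    return p, p_prev

def legendre_zeros(m):
    """Zeros of P_m in (-1, 1), refined by Newton's method."""
    zeros = []
    for i in range(m):
        x = math.cos(math.pi * (i + 0.75) / (m + 0.5))   # standard initial guess
        for _ in range(100):
            p, p_prev = legendre_pair(m, x)
            dp = m * (x * p - p_prev) / (x * x - 1.0)    # derivative P_m'(x)
            dx = p / dp
            x -= dx
            if abs(dx) < 1e-15:
                break
        zeros.append(x)
    return sorted(zeros)

def interpolate(z, nodes, s):
    """Evaluate Q_n z(s) = sum_i z(tau_i) L_i(s) with Lagrange basis L_i."""
    total = 0.0
    for i, ti in enumerate(nodes):
        Li = 1.0
        for j, tj in enumerate(nodes):
            if j != i:
                Li *= (s - tj) / (ti - tj)
        total += z(ti) * Li
    return total

n = 4
nodes = legendre_zeros(n + 1)                # tau_0, ..., tau_n
err = abs(interpolate(math.cos, nodes, 0.3) - math.cos(0.3))
print(nodes, err)
```

The product form of $L_i$ used here is algebraically the same as $\pi(s) / ((s - \tau_i)\pi'(\tau_i))$. It is chosen because it avoids the removable singularity at $s = \tau_i$.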

Then, the Legendre collocation method for Eq. (5) seeks an approximate solution $z_n(s) = \sum_{i=0}^{n} \gamma_i L_i(s) \in X_n$ which satisfies the following nonlinear system of equations
\[ \sum_{i=0}^{n} \gamma_i L_i(\tau_j) = \Psi\Big( K\Big( \sum_{i=0}^{n} \gamma_i L_i \Big) + f \Big)(\tau_j), \quad j = 0, 1, \ldots, n. \]
Using the interpolatory projection operator, the above system of nonlinear equations can be written in the following operator form:
\[ z_n = Q_n \Psi(K z_n + f). \tag{10} \]
The corresponding approximate solution $u_n$ of $u$ is given by
\[ u_n = K z_n + f. \]

Using the projection operator $Q_n$, we define $K_n : X \to X$ by
\[ K_n(z)(s) = \int_{-1}^{1} g_\alpha(|s-t|)\,Q_n(m(s,t)z(t))\,dt, \tag{11} \]
which approximates the operator $K$. For $z_n \in X_n$, we have
\[ K_n(z_n)(s) = \sum_{i=0}^{n} w_i^\alpha(s)\,m(s,\tau_i)\,z_n(\tau_i), \]
where $w_i^\alpha(s) = \int_{-1}^{1} L_i(t)\,g_\alpha(|s-t|)\,dt$.
Denote $L_2^{(r)}[-1,1] = \{u : D_s^i u \in L^2[-1,1],\ i = 0, 1, \ldots, r\}$ with the norm
\[ \|u\|_{L^2,r} = \sum_{i=0}^{r} \|D_s^i u\|_{L^2}. \]

In the following theorem, we give an error bound between the integral operator $K$ and the approximate operator $K_n$.
Theorem 2 Let $m(s,t) \in C^{(0,r)}([-1,1] \times [-1,1])$ and $z \in C^r[-1,1]$. Then, there exists a positive constant $c$ such that
\[ \|(K - K_n)z\|_\infty \le c\,n^{-r} \|z\|_{L^2,r}. \tag{12} \]

Proof For fixed $s \in [-1,1]$, denote $b_s(t) = m_s(t)z(t)$, where $m_s(t) = m(s,t)$. From Eqs. (11) and (4), we obtain
\[ |(K - K_n)z(s)| = \left| \int_{-1}^{1} g_\alpha(|s-t|)\,(I - Q_n)(m(s,t)z(t))\,dt \right|. \]
Now by taking the supremum over $s \in [-1,1]$ and using the Cauchy–Schwarz inequality with Lemma 1, we get
\[ \|(K - K_n)z\|_\infty^2 \le M_2 \sup_{s \in [-1,1]} \|(I - Q_n)b_s\|_{L^2}^2 = M_2\,n^{-2r} \sup_{s \in [-1,1]} \int_{-1}^{1} \big| [b_s(t)]^{(r)} \big|^2\,dt. \tag{13} \]
Using the Leibniz rule for differentiating the product of two terms and the Cauchy–Schwarz inequality again, we get
\[ \big( [b_s(t)]^{(r)} \big)^2 = \Big( \sum_{i=0}^{r} C_i^r\, D_t^{r-i} m(s,t)\, D_t^i z(t) \Big)^2 \le \sum_{i=0}^{r} \|D_t^{r-i} m_s\|_\infty^2 (C_i^r)^2 \sum_{i=0}^{r} (D_t^i z)^2(t). \tag{14} \]
Using Eq. (14) in Eq. (13), we obtain
\[ \begin{aligned} \|(K - K_n)z\|_\infty^2 &\le M_2\,n^{-2r} \|m\|_{r,\infty}^2 \sum_{i=0}^{r} (C_i^r)^2 \int_{-1}^{1} \sum_{i=0}^{r} (D_t^i z)^2(t)\,dt \\ &\le M_2\,n^{-2r} \|m\|_{r,\infty}^2 \sum_{i=0}^{r} (C_i^r)^2 \sum_{i=0}^{r} \|D_t^i z\|_{L^2}^2 \\ &\le M_2\,n^{-2r} \|m\|_{r,\infty}^2 \sum_{i=0}^{r} (C_i^r)^2\, \|z\|_{L^2,r}^2. \end{aligned} \]
Thus, we get
\[ \|(K - K_n)z\|_\infty \le \sqrt{M_2}\,n^{-r} \|m\|_{r,\infty} \Big( \sum_{i=0}^{r} (C_i^r)^2 \Big)^{1/2} \|z\|_{L^2,r} \le c\,n^{-r} \|z\|_{L^2,r}. \]
This completes the proof.

Now, by using the approximate discrete operator $K_n$ instead of the integral operator $K$, we obtain
\[ \sum_{i=0}^{n} \xi_i L_i(\tau_j) = \Psi\Big( K_n\Big( \sum_{i=0}^{n} \xi_i L_i \Big) + f \Big)(\tau_j), \quad j = 0, 1, \ldots, n. \tag{15} \]
Then, $\tilde{z}_n(t) = \sum_{j=0}^{n} \xi_j L_j(t)$ is the discrete Legendre collocation approximation of the solution $z$ of Eq. (5).
Using the interpolation operator $Q_n$, the system of nonlinear equations (15) can be written in the following operator form:
\[ \tilde{z}_n = Q_n \Psi(K_n \tilde{z}_n + f). \tag{16} \]
Let $\widetilde{T}_n(u) = Q_n \Psi(K_n u + f)$, $u \in X$; then Eq. (16) can be written as
\[ \tilde{z}_n = \widetilde{T}_n \tilde{z}_n. \tag{17} \]
The corresponding approximate solution $\tilde{u}_n$ of $u$ is defined by $\tilde{u}_n = K_n \tilde{z}_n + f$.

3 Convergence Rates

In this section, we discuss the convergence rates of the approximate solutions to the exact solution of Fredholm–Hammerstein integral equations with weakly singular kernel, in both the $L^2$ and infinity norms. To do this, we quote the following definition and theorem.

Definition 1 [1] Let $X$ be a Banach space and $T, T_n \in B(X)$. Then, $\{T_n\}$ is said to be $\nu$-convergent to $T$ if $\|T_n\| \le c$, $\|(T_n - T)T\| \to 0$, and $\|(T_n - T)T_n\| \to 0$ as $n \to \infty$.

Theorem 3 [2] Let $X$ be a Banach space and $T, T_n \in BL(X)$. If $T_n$ is norm convergent to $T$ or $T_n$ is $\nu$-convergent to $T$, and $(I - T)^{-1}$ exists and is bounded on $X$, then $(I - T_n)^{-1}$ exists and is uniformly bounded on $X$ for sufficiently large $n$.

Theorem 4 Let $K_n$ be the approximate integral operator defined by Eq. (11). Then the set of operators $\{K_n : n = 1, 2, 3, \ldots\}$ is collectively compact.

Proof To prove that $\{K_n : n = 1, 2, 3, \ldots\}$ is collectively compact, we need to show that the set $\bigcup_n K_n(B)$ is relatively compact whenever $B \subset X$ is bounded.
Let $S = \{K_n(z) : z \in B\}$, where $B$ is the closed unit ball in $C[-1,1] \subset L^2[-1,1]$. To prove this, we have to show that $S$ is uniformly bounded and equicontinuous.
We have
\[ K_n(z)(s) = \int_{-1}^{1} g_\alpha(|s-t|)\,Q_n(m(s,t)z(t))\,dt. \]
Now by using the Cauchy–Schwarz inequality and taking the supremum over $s \in [-1,1]$, we obtain
\[ \|K_n(z)\|_{L^2} \le \sqrt{2}\,\|K_n(z)\|_\infty \le \sqrt{2}\,M_2 \|Q_n(m(s,t)z(t))\|_{L^2} \le c\,pM \|z\|_{L^2}. \tag{18} \]
Thus, $K_n$ is uniformly bounded in the $L^2$ norm. Now to show the equicontinuity, for any $s, s' \in [-1,1]$, we obtain
\[ \begin{aligned} |K_n(z)(s) - K_n(z)(s')| &= \left| \int_{-1}^{1} \big[ g_\alpha(|s-t|)\,Q_n(m(s,t)z(t)) - g_\alpha(|s'-t|)\,Q_n(m(s',t)z(t)) \big]\,dt \right| \\ &\le \left| \int_{-1}^{1} \big( g_\alpha(|s-t|) - g_\alpha(|s'-t|) \big)\,Q_n(m(s,t)z(t))\,dt \right| \\ &\quad + \left| \int_{-1}^{1} g_\alpha(|s'-t|)\,Q_n\big( m(s,t)z(t) - m(s',t)z(t) \big)\,dt \right|. \end{aligned} \]
By using the Cauchy–Schwarz inequality, we obtain
\[ \begin{aligned} |K_n(z)(s) - K_n(z)(s')| &\le \Big( \int_{-1}^{1} \big( g_\alpha(|s-t|) - g_\alpha(|s'-t|) \big)^2\,dt \Big)^{1/2} \|Q_n(m(s,t)z(t))\|_{L^2} \\ &\quad + M_2\,p\,\|m(s,t) - m(s',t)\|_{L^2} \|z\|_\infty. \end{aligned} \]
Using assumption (iii) in the above inequality, we get $|K_n(z)(s) - K_n(z)(s')| \to 0$ as $s \to s'$ and $n \to \infty$. Thus, $\{K_n(z)\}$ is equicontinuous on $[-1,1]$. By the Arzelà–Ascoli theorem, we conclude that $\{K_n\}$ is collectively compact. This completes the proof.
We quote the following theorem, which gives conditions under which the solvability of one equation leads to the solvability of another equation.
Theorem 5 [13] Let $\widehat{F}$ and $\widetilde{F}$ be continuous operators over an open set $\Omega$ in a Banach space $X$. Let the equation $x = \widetilde{F}x$ have an isolated solution $\tilde{x}_0 \in \Omega$, and let the following conditions be satisfied.
(a) The operator $\widehat{F}$ is Fréchet differentiable in some neighborhood of the point $\tilde{x}_0$, while the linear operator $I - \widehat{F}'(\tilde{x}_0)$ is continuously invertible.
(b) Suppose that for some $\delta > 0$ and $0 < q < 1$, the following inequalities are valid (the number $\delta$ is assumed to be so small that the sphere $\|x - \tilde{x}_0\| \le \delta$ is contained within $\Omega$):
\[ \sup_{\|x - \tilde{x}_0\| \le \delta} \big\| (I - \widehat{F}'(\tilde{x}_0))^{-1} \big( \widehat{F}'(x) - \widehat{F}'(\tilde{x}_0) \big) \big\| \le q, \tag{19} \]
\[ \alpha = \big\| (I - \widehat{F}'(\tilde{x}_0))^{-1} \big( \widehat{F}(\tilde{x}_0) - \widetilde{F}(\tilde{x}_0) \big) \big\| \le \delta(1 - q). \tag{20} \]
Then, the equation $x = \widehat{F}x$ has a unique solution $\hat{x}_0$ in the sphere $\|x - \tilde{x}_0\| \le \delta$. Moreover, the inequality
\[ \frac{\alpha}{1+q} \le \|\hat{x}_0 - \tilde{x}_0\| \le \frac{\alpha}{1-q} \]
is valid.
Theorem 6 The operators $T$ and $\widetilde{T}_n$ are Fréchet differentiable on $X$, and $\widetilde{T}_n'(z_0)$ is $\nu$-convergent to $T'(z_0)$ in the $L^2$-norm.
Proof With the assumptions on the kernel and the nonlinear function $\psi$, and by using Lemma 4 of [11], we get that the operator $T(z) = \Psi(Kz + f)$ is continuously Fréchet differentiable on $X$. Since $Q_n$ is a linear operator, using [11, 12], it can be proved that $\widetilde{T}_n(z) = Q_n \Psi(K_n z + f)$ is also Fréchet differentiable on $X$. Denote the Fréchet derivatives of $T(z)$ and $\widetilde{T}_n(z)$ at the point $z_0$ by $T'(z_0)$ and $\widetilde{T}_n'(z_0)$, respectively. Then, $T'(z_0) = \Psi'(Kz_0 + f)K$ and $\widetilde{T}_n'(z_0) = Q_n \Psi'(K_n z_0 + f)K_n$.
Now, we need to show that $\widetilde{T}_n'(z_0)$ is $\nu$-convergent to $T'(z_0)$ in the $L^2$-norm. By using Lemma 1 and the estimate (18) with the assumptions, we obtain
\[ \begin{aligned} \|\widetilde{T}_n'(z_0)u\|_{L^2} &= \|Q_n \Psi'(K_n z_0 + f)K_n u\|_{L^2} \\ &\le p\,\|\Psi'(K_n z_0 + f)\|_\infty \|K_n u\|_\infty \\ &\le p \big( \|\Psi'(K_n z_0 + f) - \Psi'(Kz_0 + f)\|_\infty + \|\Psi'(Kz_0 + f)\|_\infty \big) \|u\|_{L^2} \\ &\le c \big( \|(K_n - K)z_0\|_\infty + B \big) \|u\|_{L^2} \le c \big( n^{-r} \|z_0\|_{L^2,r} + B \big) \|u\|_{L^2}. \end{aligned} \]
This shows that $\|\widetilde{T}_n'(z_0)\|_{L^2}$ is uniformly bounded. Next, we consider
\[ \begin{aligned} \big\| \big( \widetilde{T}_n'(z_0) - T'(z_0) \big) u \big\|_{L^2} &= \big\| \big( Q_n \Psi'(K_n z_0 + f)K_n - \Psi'(Kz_0 + f)K \big) u \big\|_{L^2} \\ &\le \big\| \big( Q_n \Psi'(K_n z_0 + f) - Q_n \Psi'(Kz_0 + f) \big) K_n u \big\|_{L^2} \\ &\quad + \big\| \big( Q_n \Psi'(Kz_0 + f)K_n - Q_n \Psi'(Kz_0 + f)K \big) u \big\|_{L^2} \\ &\quad + \big\| \big( Q_n \Psi'(Kz_0 + f)K - \Psi'(Kz_0 + f)K \big) u \big\|_{L^2} \\ &\le \sqrt{2}\,p c_2 \|(K_n - K)z_0\|_\infty \|K_n u\|_\infty + \sqrt{2}\,p B \|(K_n - K)u\|_\infty \\ &\quad + \|(Q_n - I)\Psi'(Kz_0 + f)Ku\|_{L^2}. \end{aligned} \]
By Theorem 2, the first two terms on the right-hand side of the above inequality tend to $0$ as $n \to \infty$. Since $\Psi'(Kz_0 + f)$ is bounded and $K$ is a compact operator, $\Psi'(Kz_0 + f)K$ is also a compact operator. Since $Q_n$ converges pointwise to the identity operator $I$ by Lemma 1 and $\Psi'(Kz_0 + f)K$ is a compact operator, it follows that $\|(Q_n - I)\Psi'(Kz_0 + f)Ku\|_{L^2} \to 0$ as $n \to \infty$. Thus,
\[ \big\| \big( \widetilde{T}_n'(z_0) - T'(z_0) \big) u \big\|_{L^2} \to 0, \quad \text{as } n \to \infty. \]
Let $B$ be the closed unit ball in $C[-1,1]$. Since $T'(z_0) = \Psi'(Kz_0 + f)K$ is a compact operator, $S = \{T'(z_0)x : x \in B\}$ is a relatively compact set in $C[-1,1]$. Then, it follows that
\[ \begin{aligned} \big\| \big( \widetilde{T}_n'(z_0) - T'(z_0) \big) T'(z_0) \big\|_{L^2} &= \sup \big\{ \big\| \big( \widetilde{T}_n'(z_0) - T'(z_0) \big) T'(z_0)u \big\|_{L^2} : u \in B \big\} \\ &= \sup \big\{ \big\| \big( \widetilde{T}_n'(z_0) - T'(z_0) \big) u \big\|_{L^2} : u \in S \big\} \to 0, \quad \text{as } n \to \infty. \end{aligned} \]
Since $Q_n$ is uniformly bounded in the $L^2$ norm, $\Psi'(K_n z_0 + f)$ is bounded and $K_n$ is a compact operator, $\widetilde{T}_n'(z_0) = Q_n \Psi'(K_n z_0 + f)K_n$ is a compact operator. Proceeding in the same way as before, it is easy to show that
\[ \big\| \big( \widetilde{T}_n'(z_0) - T'(z_0) \big) \widetilde{T}_n'(z_0)u \big\|_{L^2} \to 0 \quad \text{as } n \to \infty. \]
This shows that $\widetilde{T}_n'(z_0)$ is $\nu$-convergent to $T'(z_0)$ in the $L^2$-norm. This completes the proof.

Theorem 7 Let $z_0 \in C^r[-1,1]$ be an isolated solution of Eq. (6). Assume that $1$ is not an eigenvalue of the linear operator $T'(z_0)$. Then for sufficiently large $n$, the operators $(I - \widetilde{T}_n'(z_0))$ are invertible on $X$ and there exists a constant $A_1 > 0$ independent of $n$ such that $\|(I - \widetilde{T}_n'(z_0))^{-1}\|_{L^2} \le A_1$.

Proof The proof follows by combining Theorems 3 and 6.

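Theorem 8 Let $Q_n : X \to X_n$ be the interpolatory projection operator defined by (9). Then Eq. (17) has a unique solution $\tilde{z}_n \in B(z_0, \delta) = \{z : \|z - z_0\|_{L^2} < \delta\}$ for some $\delta > 0$ and for sufficiently large $n$. Moreover, there exists a constant $0 < q < 1$, independent of $n$, such that
\[ \frac{\beta_n}{1+q} \le \|\tilde{z}_n - z_0\|_{L^2} \le \frac{\beta_n}{1-q}, \]
where $\beta_n = \|(I - \widetilde{T}_n'(z_0))^{-1}(\widetilde{T}_n(z_0) - T(z_0))\|_{L^2}$.

Proof From Theorem 7, $(I - \widetilde{T}_n'(z_0))^{-1}$ exists and is uniformly bounded in the $L^2$ norm; i.e., there exists $A_1 > 0$ such that $\|(I - \widetilde{T}_n'(z_0))^{-1}\|_{L^2} \le A_1$. Using Theorem 4 with assumption (v), for any $z \in B(z_0, \delta)$ and $u \in C[-1,1]$, we get
\[ \begin{aligned} \big\| \big( \widetilde{T}_n'(z) - \widetilde{T}_n'(z_0) \big) u \big\|_{L^2} &= \big\| \big[ Q_n \Psi'(K_n z_0 + f)K_n - Q_n \Psi'(K_n z + f)K_n \big] u \big\|_{L^2} \\ &= \big\| Q_n \big( \Psi'(K_n z_0 + f)K_n - \Psi'(K_n z + f)K_n \big) u \big\|_{L^2} \\ &\le p\,\big\| \big( \Psi'(K_n z_0 + f) - \Psi'(K_n z + f) \big) K_n u \big\|_\infty \\ &\le c\,\|K_n(z_0 - z)\|_\infty \|K_n u\|_\infty \le c\,\|z - z_0\|_{L^2} \|u\|_{L^2}. \end{aligned} \]
Thus, $\|\widetilde{T}_n'(z) - \widetilde{T}_n'(z_0)\|_{L^2} \le c\delta$. Hence, we obtain
\[ \sup_{\|z - z_0\|_{L^2} \le \delta} \big\| (I - \widetilde{T}_n'(z_0))^{-1} \big( \widetilde{T}_n'(z_0) - \widetilde{T}_n'(z) \big) \big\|_{L^2} \le A_1 c \delta \le q, \]
where $0 < q < 1$. This proves Eq. (19) of Theorem 5. Now by using Theorem 2 with Lemma 1, we obtain
\[ \begin{aligned} \|\widetilde{T}_n(z_0) - T(z_0)\|_{L^2} &= \|Q_n \Psi(K_n z_0 + f) - \Psi(Kz_0 + f)\|_{L^2} \\ &\le \|Q_n [\Psi(K_n z_0 + f) - \Psi(Kz_0 + f)]\|_{L^2} + \|(Q_n - I)\Psi(Kz_0 + f)\|_{L^2} \\ &\le c\,\|(K_n - K)z_0\|_\infty + \|(Q_n - I)z_0\|_{L^2} \\ &\le c\,n^{-r} \|z_0\|_{L^2,r} + n^{-r} \|z_0\|_{L^2,r} \to 0, \quad \text{as } n \to \infty. \end{aligned} \tag{21} \]
Hence,
\[ \beta_n = \|(I - \widetilde{T}_n'(z_0))^{-1}(\widetilde{T}_n(z_0) - T(z_0))\|_{L^2} \le A_1 \|\widetilde{T}_n(z_0) - T(z_0)\|_{L^2} \to 0, \]
as $n \to \infty$. Choose $n$ large enough such that $\beta_n \le \delta(1 - q)$. Then, Eq. (20) of Theorem 5 is satisfied. Thus, by applying Theorem 5, we obtain
\[ \frac{\beta_n}{1+q} \le \|z_0 - \tilde{z}_n\|_{L^2} \le \frac{\beta_n}{1-q}, \tag{22} \]
where $\beta_n = \|(I - \widetilde{T}_n'(z_0))^{-1}(\widetilde{T}_n(z_0) - T(z_0))\|_{L^2}$. Using Eq. (21) with Eq. (22), we obtain
\[ \|z_0 - \tilde{z}_n\|_{L^2} \le \beta_n \le A_1 \|\widetilde{T}_n(z_0) - T(z_0)\|_{L^2} \le c\,n^{-r} \|z_0\|_{L^2,r} + n^{-r} \|z_0\|_{L^2,r}. \tag{23} \]
This completes the proof.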

Theorem 9 Let $z_0$ be the isolated solution of Eq. (6) and $u_0$ be the isolated solution of (3) such that $u_0 = Kz_0 + f$. Let $\tilde{u}_n = K_n \tilde{z}_n + f$ be the discrete Legendre collocation approximation of $u_0$. Then, the following hold:
\[ \|u_0 - \tilde{u}_n\|_{L^2} = O(n^{-r}), \qquad \|u_0 - \tilde{u}_n\|_\infty = O(n^{-r}). \]

Proof Using Theorems 4 and 2, we obtain
\[ \begin{aligned} \|u_0 - \tilde{u}_n\|_{L^2} &= \|Kz_0 + f - (K_n \tilde{z}_n + f)\|_{L^2} \\ &\le \|K_n(z_0 - \tilde{z}_n)\|_{L^2} + \|(K_n - K)z_0\|_{L^2} \\ &\le \sqrt{2}\,\|K_n(z_0 - \tilde{z}_n)\|_\infty + \sqrt{2}\,\|(K_n - K)z_0\|_\infty \\ &\le \sqrt{2}\,c\,\|z_0 - \tilde{z}_n\|_{L^2} + \sqrt{2}\,n^{-r} \|z_0\|_{L^2,r}. \end{aligned} \]
Using the estimate (23), we obtain
\[ \|u_0 - \tilde{u}_n\|_{L^2} = O(n^{-r}). \]
Now for the second estimate, using Theorem 2 with the estimate (23), we obtain
\[ \begin{aligned} \|u_0 - \tilde{u}_n\|_\infty &\le \|K_n(z_0 - \tilde{z}_n)\|_\infty + \|(K_n - K)z_0\|_\infty \\ &\le c\,\|z_0 - \tilde{z}_n\|_{L^2} + c\,n^{-r} \|z_0\|_{L^2,r} \le c\,n^{-r}. \end{aligned} \]
This completes the proof.

Remark 1 From Theorem 9, we observe that the discrete Legendre collocation solution converges to the exact solution with order $O(n^{-r})$ in both the $L^2$ and infinity norms. Thus, for Fredholm–Hammerstein integral equations with weakly singular kernel, the Legendre collocation methods attain convergence rates similar to those of piecewise polynomial-based collocation methods.

4 Numerical Examples

In this section, we present an example to validate the error estimates for the approximate solutions obtained by the Legendre collocation methods in both the $L^2$ and infinity norms. To solve the problem by using the Legendre collocation methods, we first choose the Legendre polynomials as the basis functions of $X_n$, evaluated from the recurrence relation
\[ \phi_0(x) = 1, \quad \phi_1(x) = x, \quad x \in [-1,1], \]
and for $i = 1, 2, \ldots, n-1$,
\[ (i+1)\phi_{i+1}(x) = (2i+1)\,x\,\phi_i(x) - i\,\phi_{i-1}(x), \quad x \in [-1,1]. \]

Example 1 We consider the following integral equation
\[ x(t) - \frac{1}{\sqrt{2}} \int_{-1}^{1} \frac{1}{\sqrt{|s-t|}} \cos\Big( \frac{s+1}{2} + x(s) \Big)\,ds = f(t), \quad t \in [-1,1], \]
where $f(t)$ is selected so that $x(t) = \cos\big( \frac{t+1}{2} \big)$ is the solution.
For different values of $n$, we compute $\tilde{u}_n$ and compare the results with the exact solution $u_0$. The computed errors in the $L^2$ and infinity norms are presented in Table 1.

Table 1 Discrete Legendre collocation method

n    $\|u_0 - \tilde{u}_n\|_{L^2}$    $\|u_0 - \tilde{u}_n\|_\infty$
2    2.457691e−02                     6.874354e−03
3    9.347281e−03                     3.576579e−03
4    3.566732e−03                     9.348632e−04
5    1.008456e−03                     3.569632e−04
6    7.869632e−04                     1.068532e−05
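One way to read Table 1 is to estimate the observed decay exponent between consecutive rows, assuming the error behaves like $c\,n^{-r}$ as in Theorem 9. The short sketch below does this with the tabulated values. The helper `observed_orders` is a hypothetical name, and with only five rows the resulting exponents are rough indications rather than a verification of the rate.

```python
import math

# Rows of Table 1: (n, L2 error, infinity-norm error).
table = [
    (2, 2.457691e-02, 6.874354e-03),
    (3, 9.347281e-03, 3.576579e-03),
    (4, 3.566732e-03, 9.348632e-04),
    (5, 1.008456e-03, 3.569632e-04),
    (6, 7.869632e-04, 1.068532e-05),
]

def observed_orders(ns, errs):
    """Log-log slope between consecutive rows, assuming err ~ c * n**(-r)."""
    return [math.log(errs[i] / errs[i + 1]) / math.log(ns[i + 1] / ns[i])
            for i in range(len(errs) - 1)]

ns = [row[0] for row in table]
orders_l2 = observed_orders(ns, [row[1] for row in table])
orders_inf = observed_orders(ns, [row[2] for row in table])
print("L2 orders:  ", orders_l2)
print("inf orders: ", orders_inf)
```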

References

1. Ahues, M., Largillier, A., Limaye, B.V.: Spectral Computations for Bounded Operators. Chap-
man and Hall/CRC, New York (2001)
2. Atkinson, K.E.: The Numerical Solution of Integral Equations of the Second Kind. Cambridge
University Press, Cambridge, UK (1997)
3. Canuto, C., Hussaini, M.Y., Quarteroni, A., Zang, T.A.: Spectral Methods: Fundamentals in
Single Domains. Springer, Berlin (2006)
4. Das, P., Sahani, M.M., Nelakanti, G., Long, G.: Legendre spectral projection methods for
Fredholm-Hammerstein integral equations. J. Sci. Comput. 68, 213–230 (2016)
5. Das, P., Nelakanti, G., Long, G.: Discrete Legendre spectral projection methods for Fredholm-
Hammerstein integral equations. J. Comp. Appl. Math. 278, 293–305 (2015)
6. Guo, B.: Spectral Methods and their Applications. World Scientific, Singapore (1998)
7. Kaneko, H., Noren, R.D., Padilla, P.A.: Superconvergence of the iterated collocation methods
for Hammerstein equations. J. Comput. Appl. Math. 80(2), 335–349 (1997)
8. Kaneko, H., Xu, Y.: Superconvergence of the iterated Galerkin methods for Hammerstein
equations. SIAM J. Numer. Anal. 33(3), 1048–1064 (1996)

9. Kaneko, H., Noren, R.D., Xu, Y.: Numerical solutions for weakly singular Hammerstein equa-
tions and their superconvergence. J. Integral Equ. Appl. 4(3), 391–407 (1992)
10. Kumar, S.: The numerical solution of Hammerstein equations by a method based on polynomial
collocation. J. Aust. Math. Soc. Ser. B 31(3), 319–329 (1990)
11. Kumar, S.: Superconvergence of a collocation-type method for Hammerstein equations. IMA
J. Numer. Anal. 7(3), 313–325 (1987)
12. Suhubi, E.S.: Functional Analysis. Kluwer Academic Publishers, Dordrecht (2003)
13. Vainikko, G.M.: A perturbed Galerkin method and the general theory of approximate methods
for non-linear equations. USSR Comput. Math. Phys. 7(4), 1–41 (1967)
Chapter 26
Norm Inequalities Involving Upper
Bounds for Operators in Orlicz-Taylor
Sequence Spaces

Atanu Manna

Abstract An Orlicz extension of the results obtained by Talebi (Indag Math (NS) 28(3):629–636, 2017 [1]) is given. Specifically, upper bounds for the operator norm $\|A\|_{l_\varphi, t_\varphi^\alpha}$ are evaluated, where $A$ is either a generalized Hausdorff or a Nörlund matrix, and $l_\varphi$ and $t_\varphi^\alpha$ denote the Orlicz and Orlicz-Taylor sequence spaces, respectively.

Keywords Hausdorff matrix · Nörlund matrix · Taylor matrix · Orlicz function · Luxemburg norm

Mathematics Subject Classification (2010) Primary 26D15, 40G05, 47A30; Secondary 46A45

1 Introduction

An Orlicz function is a map ϕ : (0, ∞) → (0, ∞) which is convex and satisfies

ϕ(0+) = 0. Such a function is strictly increasing and continuous, so it has a unique
inverse ϕ −1 : (0, ∞) → (0, ∞). Usually in the theory of Orlicz spaces, the domain
of Orlicz function is extended to the real line by ϕ(x) = ϕ(|x|) and ϕ(0) = 0 (see
[2] for details).
A supermultiplicative function ϕ : (0, ∞) → (0, ∞) is such that for all positive
u and v, the following holds

ϕ(uv) ≥ ϕ(u)ϕ(v).

An immediate example of a supermultiplicative function is $\varphi(t) = t^p$, $p \ge 1$. We would like to recall another example from [3]. Let $a$, $b$, $p$ be fixed real numbers such that $a < 0$, $b > 0$, and $p > 1$. Let $M_{a,b,p}$ be the function defined on the interval $[0, \frac{1}{b})$ with $M_{a,b,p}(0) = 0$ and

A. Manna (B)
Faculty of Mathematics, Indian Institute of Carpet Technology,
Chauri Road, Bhadohi 221401, Uttar Pradesh, India

\[ M_{a,b,p}(x) = x^p\,|\log(bx)|^a \quad \text{for } x \neq 0. \]
Define an Orlicz function $\varphi$ by the formula
\[ \varphi(x) = \frac{M_{a,b,p}(\delta x)}{M_{a,b,p}(\delta)} \quad \text{for } x \ge 0, \ \delta \in \Big( 0, \frac{1}{b} \Big), \]
then $\varphi$ is equivalent to $M_{a,b,p}$ at $0$ with $\varphi(1) = 1$ and is supermultiplicative on $[0,1]$. Throughout our study, we consider only those supermultiplicative Orlicz functions $\varphi$ for which $\varphi(1) = 1$ holds.
Let $l^0$ be the space of all real sequences and $x = (x_n) \in l^0$ (here and after, $x = (x_n)_{n=0}^{\infty}$ will be replaced by $x = (x_n)$ in order to avoid ambiguity). Further, it is assumed that the Orlicz function $\varphi$ is fixed. The Orlicz sequence spaces are denoted by $l_\varphi$ and defined as follows:
\[ l_\varphi = \Big\{ x \in l^0 : \sum_{n=0}^{\infty} \varphi(r x_n) < \infty \ \text{for some } r > 0 \Big\}. \]
The space $l_\varphi$ is a Banach space equipped with the Orlicz-Luxemburg norm $\|\cdot\|_\varphi$ defined as below:
\[ \|x\|_\varphi = \inf \Big\{ r > 0 : \sum_{n=0}^{\infty} \varphi\Big( \frac{x_n}{r} \Big) \le 1 \Big\}. \tag{1} \]
It is easy to prove that if $|x_n| \le |y_n|$ for all $n = 0, 1, 2, \ldots$ then $\|x\|_\varphi \le \|y\|_\varphi$. Thus, the norm $\|\cdot\|_\varphi$ is monotonic. Further, if $0 < \|x\|_\varphi < \infty$ then $\sum_{n=0}^{\infty} \varphi\big( \frac{x_n}{\|x\|_\varphi} \big) \le 1$ holds (see [4] or [5], Lemma 1).
In particular, if $\varphi(t) = |t|^p$, $p \ge 1$, then we obtain the $p$-summable sequence spaces $l_p$ for $p \ge 1$ and Eq. (1) reduces to the $l_p$-norm $\|\cdot\|_p$ given below:
\[ \|x\|_p = \Big( \sum_{n=0}^{\infty} |x_n|^p \Big)^{1/p}. \]
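For a finitely supported sequence, the infimum in Eq. (1) can be approximated numerically. The sketch below uses bisection on $r$, assuming that $\varphi$ is increasing on $[0, \infty)$ so that the modular $\sum_n \varphi(|x_n|/r)$ decreases in $r$. The function name `luxemburg_norm` is our own, and the power functions are used only as a check against the $l_p$-norm.

```python
def luxemburg_norm(x, phi, tol=1e-12):
    """Approximate inf{r > 0 : sum(phi(|x_n| / r)) <= 1} by bisection.

    Assumes phi is increasing on [0, inf) with phi(0) = 0.
    """
    if all(v == 0 for v in x):
        return 0.0
    modular = lambda r: sum(phi(abs(v) / r) for v in x)
    lo, hi = 0.0, 1.0
    while modular(hi) > 1.0:          # enlarge hi until it is admissible
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if modular(mid) <= 1.0:
            hi = mid                  # mid is admissible, shrink from above
        else:
            lo = mid
    return hi

norm2 = luxemburg_norm([3.0, 4.0], lambda t: t ** 2)
norm3 = luxemburg_norm([1.0, 2.0, 2.0], lambda t: t ** 3)
print(norm2, norm3)
```

With $\varphi(t) = t^2$ and $x = (3, 4)$ the result agrees with the Euclidean norm $5$, as the reduction to the $l_p$-norm predicts.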

Let $A = (a_{n,k})$, $n, k = 0, 1, 2, \ldots$, be an infinite matrix with real entries and let $X$, $Y$ be two normed sequence spaces. Then $A$ defines a matrix transformation from $X$ to $Y$, denoted by $A : X \to Y$, if for every sequence $x = (x_n) \in X$, the sequence $Ax = ((Ax)_n) \equiv (A_n(x))$, the $A$-transform of $x$, is in $Y$, where
\[ A_n(x) = \sum_{k=0}^{\infty} a_{n,k} x_k, \quad n = 0, 1, 2, \ldots. \]
Throughout the text, we are mainly concerned with finding upper bounds $U$ (not depending on $x$) in the following inequality:
\[ \|Ax\|_Y \le U \|x\|_X, \]
where $X = l_\varphi$ and $Y = t_\varphi^\alpha$. The notation $\|\cdot\|_X$ (or $\|\cdot\|_Y$) stands for the norm on $X$ (or on $Y$). We shall obtain a general value of $U$ such that $\|A\|_{X,Y} \le U$. For investigations and recent developments on bounds of operator norms, the reader is referred to Refs. [6–10].
This paper consists of three sections besides this one. In Sect. 2, we introduce Orlicz-Taylor sequence spaces as an Orlicz extension of the Taylor sequence spaces introduced in [1] and obtain some inclusion relations. Section 3 is devoted to obtaining upper bounds of the operator norm $\|A\|_{l_\varphi, t_\varphi^\alpha}$, where $A$ is a generalized Hausdorff or Nörlund matrix. Finally, Sect. 4 gives the conclusion of this work.

2 Orlicz-Taylor Sequence Spaces

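To begin, we recall the definition of the Taylor matrix $T(\alpha) = (t^\alpha_{n,k})_{n,k \ge 0}$ of order $\alpha$ $(0 < \alpha < 1)$:
\[ t^\alpha_{n,k} = \begin{cases} \binom{k}{n} (1-\alpha)^{n+1} \alpha^{k-n} & \text{if } k \ge n, \\ 0 & \text{if } 0 \le k < n. \end{cases} \]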
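Two elementary identities of the Taylor matrix are easy to confirm numerically: each column sums to $1 - \alpha$ (by the binomial theorem), and each row sums to $1$. The sketch below checks both for a sample value of $\alpha$. The value $\alpha = 0.3$, the truncation of the infinite row sum at $k = 400$, and the function name `taylor_entry` are illustrative choices.

```python
from math import comb

def taylor_entry(alpha, n, k):
    """t^alpha_{n,k} = C(k, n) (1 - alpha)^(n+1) alpha^(k-n) for k >= n, else 0."""
    if k < n:
        return 0.0
    return comb(k, n) * (1 - alpha) ** (n + 1) * alpha ** (k - n)

alpha = 0.3

# Column sums: sum over n = 0..k of t_{n,k}; expected value 1 - alpha.
column_sums = [sum(taylor_entry(alpha, n, k) for n in range(k + 1))
               for k in range(6)]

# Row sum for n = 2: sum over k >= n of t_{n,k}, truncated at k = 400.
row_sum = sum(taylor_entry(alpha, 2, k) for k in range(2, 401))

print(column_sums, row_sum)
```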

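Let $\varphi$ be an Orlicz function. Then we define the Orlicz-Taylor sequence space $t_\varphi^\alpha$ as the set of all sequences $x$ whose $T(\alpha)$-transform belongs to $l_\varphi$, that is
\[ t_\varphi^\alpha = \big\{ x : T(\alpha)x \in l_\varphi \big\} = \Big\{ x : \sum_{n=0}^{\infty} \varphi\Big( r \sum_{k=n}^{\infty} \binom{k}{n} (1-\alpha)^{n+1} \alpha^{k-n} x_k \Big) < \infty \ \text{for some } r > 0 \Big\}. \]
The space $t_\varphi^\alpha$ is a normed linear space endowed with the norm $\|x\|_\varphi^\alpha = \|T(\alpha)x\|_\varphi$. Denote by $t_n^\alpha(x)$ the $T(\alpha)$-transform of the sequence $x$, that is
\[ t_n^\alpha(x) = \sum_{k=n}^{\infty} \binom{k}{n} (1-\alpha)^{n+1} \alpha^{k-n} x_k. \tag{2} \]
The inverse Taylor transform can be obtained easily from Eq. (2) and is given by the following expression:
\[ x_n = \sum_{k=n}^{\infty} \binom{k}{n} (1-\alpha)^{-(k+1)} (-\alpha)^{k-n}\, t_k^\alpha(x). \tag{3} \]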
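For a finitely supported sequence, both sums in Eqs. (2) and (3) are finite, so the inversion formula can be tested by a round trip. The following sketch applies the transform and then the inverse to a short sample sequence. The sample data, $\alpha = 0.4$, and the function names are hypothetical.

```python
from math import comb

def taylor_transform(x, alpha):
    """t_n = sum_{k>=n} C(k,n) (1-alpha)^(n+1) alpha^(k-n) x_k, as in Eq. (2)."""
    N = len(x)
    return [sum(comb(k, n) * (1 - alpha) ** (n + 1) * alpha ** (k - n) * x[k]
                for k in range(n, N))
            for n in range(N)]

def inverse_taylor_transform(t, alpha):
    """x_n = sum_{k>=n} C(k,n) (1-alpha)^(-(k+1)) (-alpha)^(k-n) t_k, as in Eq. (3)."""
    N = len(t)
    return [sum(comb(k, n) * (1 - alpha) ** (-(k + 1)) * (-alpha) ** (k - n) * t[k]
                for k in range(n, N))
            for n in range(N)]

x = [1.0, -2.0, 0.5, 3.0]        # finitely supported sample sequence
alpha = 0.4
t = taylor_transform(x, alpha)
x_back = inverse_taylor_transform(t, alpha)
print(t, x_back)
```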

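For the sake of completeness, we begin with a short proof of the following result, which states that $t_\varphi^\alpha$ is complete.
Theorem 1 The sequence space $t_\varphi^\alpha$ is a Banach space equipped with the norm $\|\cdot\|_\varphi^\alpha$.

Proof Let $(x^p)$ be a Cauchy sequence in $t_\varphi^\alpha$. Then for any $\varepsilon > 0$, there exists a $p_0 \in \mathbb{N}$ such that
\[ \|x^p - x^q\|_\varphi^\alpha < \varepsilon \quad \text{for each } p, q \ge p_0. \]
Choose an $r_\varepsilon > 0$ with $r_\varepsilon < \varepsilon$ such that
\[ \sum_{n=0}^{\infty} \varphi\Big( \frac{1}{r_\varepsilon} \sum_{k=n}^{\infty} \binom{k}{n} (1-\alpha)^{n+1} \alpha^{k-n} (x_k^p - x_k^q) \Big) \le 1 \quad \text{holds for each } p, q \ge p_0. \tag{4} \]
Using the assumption $\varphi(1) = 1$, one obtains
\[ \frac{1}{r_\varepsilon} \Big| \sum_{k=n}^{\infty} \binom{k}{n} (1-\alpha)^{n+1} \alpha^{k-n} (x_k^p - x_k^q) \Big| \le 1 \quad \text{for each } p, q \ge p_0 \text{ and } n \ge 0. \]
Then one can easily deduce that the sequence $(x_k^p)$ is a Cauchy sequence of real numbers for each $k \ge 0$ and hence converges, that is, $x_k^p \to x_k$ for each $k \ge 0$ as $p \to \infty$. Therefore, using the continuity of $\varphi$, from inequality (4) one gets
\[ \sum_{n=0}^{\infty} \varphi\Big( \frac{1}{r_\varepsilon} \sum_{k=n}^{\infty} \binom{k}{n} (1-\alpha)^{n+1} \alpha^{k-n} (x_k^p - x_k) \Big) \le 1 \quad \text{for each } p \ge p_0. \]
Thus, $x \in t_\varphi^\alpha$ and $\|x^p - x\|_\varphi^\alpha \le r_\varepsilon < \varepsilon$ for $p \ge p_0$. So $(t_\varphi^\alpha, \|\cdot\|_\varphi^\alpha)$ is a Banach space.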

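Now we establish an inclusion between $l_\varphi$ and $t_\varphi^\alpha$. The following lemma plays an important role in proving our further results.
Lemma 1 Let $\varphi$ be a supermultiplicative Orlicz function and $\varphi^{-1}$ be its inverse. Then $l_\varphi \subseteq t_\varphi^\alpha$ holds.

Proof Let $x = (x_n) \in l_\varphi$ be such that $x \neq 0$. Then applying Jensen's inequality, one obtains
\[ \begin{aligned} \sum_{n=0}^{\infty} \varphi\Big( \frac{t_n^\alpha(x)}{r} \Big) &\le \sum_{n=0}^{\infty} \sum_{k=n}^{\infty} \binom{k}{n} (1-\alpha)^{n+1} \alpha^{k-n}\, \varphi\Big( \frac{x_k}{r} \Big) \quad \text{(by Jensen's inequality)} \\ &= \sum_{n=0}^{\infty} \varphi\Big( \frac{x_n}{r} \Big) \sum_{k=0}^{n} \binom{n}{k} (1-\alpha)^{k+1} \alpha^{n-k} \\ &= (1-\alpha) \sum_{n=0}^{\infty} \varphi\Big( \frac{x_n}{r} \Big) \\ &= \sum_{n=0}^{\infty} \varphi\big( \varphi^{-1}(1-\alpha) \big)\, \varphi\Big( \frac{x_n}{r} \Big) \\ &\le \sum_{n=0}^{\infty} \varphi\Big( \frac{x_n}{r}\, \varphi^{-1}(1-\alpha) \Big) \quad \text{(since } \varphi \text{ is supermultiplicative).} \end{aligned} \]
Now put $r = \|x\|_\varphi\, \varphi^{-1}(1-\alpha)$. Then the above inequality implies that
\[ \sum_{n=0}^{\infty} \varphi\Big( \frac{t_n^\alpha(x)}{r} \Big) \le \sum_{n=0}^{\infty} \varphi\Big( \frac{x_n}{r}\, \varphi^{-1}(1-\alpha) \Big) = \sum_{n=0}^{\infty} \varphi\Big( \frac{x_n}{\|x\|_\varphi} \Big) \le 1. \]
This gives
\[ \|x\|_\varphi^\alpha \le r = \varphi^{-1}(1-\alpha)\, \|x\|_\varphi, \]
which yields the inclusion $l_\varphi \subseteq t_\varphi^\alpha$.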

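Lemma 2 Let $\varphi$ be an Orlicz and supermultiplicative function, and let $\varphi^{-1}$ be its inverse. Then for $0 < \beta \le \alpha < 1$, the following inequality holds:
\[ \|x\|_\varphi^\alpha \le \varphi^{-1}\left(\frac{1-\alpha}{1-\beta}\right) \|x\|_\varphi^\beta. \tag{5} \]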

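Proof Let $x \in l^0$ be a sequence such that $x \ne 0$. Applying Eq. (3) in Eq. (2), the following is obtained:
\begin{align*}
t_n^\alpha(x) &= \sum_{k=n}^{\infty} \binom{k}{n} (1-\alpha)^{n+1} \alpha^{k-n} x_k \\
&= \sum_{k=n}^{\infty} \binom{k}{n} (1-\alpha)^{n+1} \alpha^{k-n} \sum_{j=k}^{\infty} \binom{j}{k} (1-\beta)^{-(j+1)} (-\beta)^{j-k}\, t_j^\beta(x) \\
&= \sum_{j=n}^{\infty} \binom{j}{n} (1-\alpha)^{n+1} (1-\beta)^{-(j+1)}\, t_j^\beta(x) \sum_{k=n}^{j} \binom{j-n}{j-k} \alpha^{k-n} (-\beta)^{j-k} \\
&= \sum_{j=n}^{\infty} \binom{j}{n} (1-\alpha)^{n+1} (1-\beta)^{-(j+1)} (\alpha-\beta)^{j-n}\, t_j^\beta(x) \\
&= \sum_{j=n}^{\infty} \binom{j}{n} q^{j-n} (1-q)^{n+1}\, t_j^\beta(x), \quad \text{setting } q = \frac{\alpha-\beta}{1-\beta}. \tag{6}
\end{align*}

Hence, by Jensen's inequality applied to Eq. (6), one gets
\[ \varphi\left(\frac{|t_n^\alpha(x)|}{r}\right) \le \sum_{j=n}^{\infty} \binom{j}{n} q^{j-n} (1-q)^{n+1}\, \varphi\left(\frac{|t_j^\beta(x)|}{r}\right) \quad \text{for any } r > 0. \tag{7} \]

Now summing both sides of inequality (7) from $n = 0$ to $n = \infty$ and interchanging the order of summation, one obtains
\begin{align*}
\sum_{n=0}^{\infty} \varphi\left(\frac{|t_n^\alpha(x)|}{r}\right)
&\le \sum_{n=0}^{\infty} \sum_{j=n}^{\infty} \binom{j}{n} q^{j-n} (1-q)^{n+1}\, \varphi\left(\frac{|t_j^\beta(x)|}{r}\right) \\
&= \sum_{n=0}^{\infty} \varphi\left(\frac{|t_n^\beta(x)|}{r}\right) \sum_{k=0}^{n} \binom{n}{k} (1-q)^{k+1} q^{n-k} \\
&= (1-q) \sum_{n=0}^{\infty} \varphi\left(\frac{|t_n^\beta(x)|}{r}\right) = \frac{1-\alpha}{1-\beta} \sum_{n=0}^{\infty} \varphi\left(\frac{|t_n^\beta(x)|}{r}\right).
\end{align*}

Proceeding in a parallel way as in Lemma 1, the definition of the Orlicz-Luxemburg norm implies that
\[ \|x\|_\varphi^\alpha \le \varphi^{-1}\left(\frac{1-\alpha}{1-\beta}\right) \|x\|_\varphi^\beta, \]
as needed.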
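
Identity (6) says, in matrix terms, that the Taylor matrix of order $\alpha$ factors as the Taylor matrix of order $q$ times the Taylor matrix of order $\beta$. Because both matrices are upper triangular, each entry of the product is a finite sum, so the factorization can be tested exactly. The sketch below is illustrative only; the helper names `taylor_entry` and `composed_entry` and the sample values of `alpha` and `beta` are assumptions.

```python
from fractions import Fraction
from math import comb

def taylor_entry(n, k, a):
    """Entry (n, k) of the Taylor matrix of order a: C(k, n) (1 - a)^(n+1) a^(k-n)."""
    if k < n:
        return Fraction(0)
    return comb(k, n) * (1 - a) ** (n + 1) * a ** (k - n)

alpha, beta = Fraction(3, 5), Fraction(1, 4)   # illustrative, 0 < beta <= alpha < 1
q = (alpha - beta) / (1 - beta)

def composed_entry(n, k):
    """Entry (n, k) of the product (order q) * (order beta); a finite sum over n <= j <= k."""
    return sum(taylor_entry(n, j, q) * taylor_entry(j, k, beta)
               for j in range(n, k + 1))

N = 8
identity_holds = all(taylor_entry(n, k, alpha) == composed_entry(n, k)
                     for n in range(N) for k in range(N))
one_minus_q = 1 - q

print(identity_holds, one_minus_q)
```

The second printed value is the factor $1-q = (1-\alpha)/(1-\beta)$ that appears inside $\varphi^{-1}$ in inequality (5).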

Corollary 1 If $0 < \beta \le \alpha < 1$, then $t_\varphi^\beta \subseteq t_\varphi^\alpha$.

3 Matrix Operators on Orlicz-Taylor Sequence Spaces

In the following two subsections, upper bounds for the norms of the generalized Hausdorff matrix operator and the Nörlund matrix operator on Orlicz-Taylor sequence spaces are obtained.

3.1 Generalized Hausdorff Matrix Operator

In this subsection, the aim is to establish a Hardy-type formula as an upper estimate for $\|H\|_{l_\varphi, t_\varphi^\alpha}$, where $H : l_\varphi \to t_\varphi^\alpha$. Suppose that $a > -1$ and $c > 0$. The definition of the generalized Hausdorff matrix (see [4, 11]) is recalled first. The generalized Hausdorff matrix is denoted by $H = (h_{n,k})$, $n, k = 0, 1, 2, \ldots$, and is given by
\[ h_{n,k} = \begin{cases} 0 & \text{if } k > n, \\ \binom{n+a}{n-k} \Delta^{n-k} \mu_k & \text{if } 0 \le k \le n, \end{cases} \]
where $\Delta$ is the difference operator defined by $\Delta\mu_k = \mu_k - \mu_{k+1}$ and $\mu = (\mu_0, \mu_1, \ldots)$ is a sequence of real numbers, normalized so that $\mu_0 = 1$ and
\[ \mu_k = \int_0^1 \theta^{c(k+a)}\, d\mu(\theta), \]
where $d\mu(\theta)$ is a Borel probability measure on $[0, 1]$. Therefore, an equivalent expression for the matrix $H = (h_{n,k})$ is
\[ h_{n,k} = \begin{cases} 0 & \text{if } k > n, \\ \displaystyle\int_0^1 \binom{n+a}{n-k} \theta^{c(k+a)} (1-\theta^c)^{n-k}\, d\mu(\theta) & \text{if } 0 \le k \le n. \end{cases} \]

In the case $a = 0$ and $c = 1$, one obtains the ordinary Hausdorff matrix (see [6], p. 32), which includes the four famous classes of matrices given below, where $d\theta$ denotes Lebesgue measure:
(a) Put $d\mu(\theta) = \beta(1-\theta)^{\beta-1}\, d\theta$; then $H$ reduces to $(C, \beta)$, the Cesàro matrix of order $\beta$;
(b) Put $d\mu(\theta) = \frac{|\log\theta|^{\beta-1}}{\Gamma(\beta)}\, d\theta$; then $H$ reduces to $(H, \beta)$, the Hölder matrix of order $\beta$;
(c) Put $d\mu(\theta)$ = point evaluation at $\theta = \beta$; then $H$ reduces to $(E, \beta)$, the Euler matrix of order $\beta$;
(d) Put $d\mu(\theta) = \beta\theta^{\beta-1}\, d\theta$; then $H$ becomes $(\Gamma, \beta)$, the Gamma matrix of order $\beta$.
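
As an illustration of the integral formula in the ordinary case $a = 0$, $c = 1$, the entries $h_{n,k} = \binom{n}{k}\int_0^1 \theta^{k}(1-\theta)^{n-k}\, d\mu(\theta)$ can be evaluated in closed form for two of the measures listed above. The sketch below is a hedged example, not the author's construction: the function names are hypothetical, and the Cesàro case is restricted to order $\beta = 1$ so that the Beta integral reduces to factorials.

```python
from fractions import Fraction
from math import comb, factorial

def cesaro1_entry(n, k):
    """h_{n,k} for d(mu) = d(theta), i.e. (C, 1): C(n, k) * B(k + 1, n - k + 1)."""
    if k > n:
        return Fraction(0)
    # Beta integral of theta^k (1 - theta)^(n - k) over [0, 1]
    beta_integral = Fraction(factorial(k) * factorial(n - k), factorial(n + 1))
    return comb(n, k) * beta_integral

def euler_entry(n, k, beta):
    """h_{n,k} for point evaluation at theta = beta, i.e. (E, beta)."""
    if k > n:
        return Fraction(0)
    return comb(n, k) * beta ** k * (1 - beta) ** (n - k)

# Row 4 of each matrix, for illustration
print([cesaro1_entry(4, k) for k in range(5)])
print([euler_entry(4, k, Fraction(1, 2)) for k in range(5)])
```

For $(C, 1)$ every entry of row $n$ equals $1/(n+1)$, which is the familiar arithmetic-mean matrix, and each row of $(E, \beta)$ is a binomial distribution.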
Now the following hypothesis related to the Orlicz function and the Hausdorff matrix is considered:
‘Hypothesis OH’: Let $\varphi$ be an Orlicz and supermultiplicative function, $\varphi^{-1}$ be its inverse, and $\|\cdot\|_\varphi$ be the Orlicz-Luxemburg norm. Denote $(x)_q = \frac{\Gamma(x+q)}{\Gamma(x)}$ for $x \ge 0$ and $H = (h_{n,k})$, $h_{n,k} \ge 0$. Further, let $a > -1$, $c > 0$, $q > -a-1$ and let $\frac{1}{(n+a+1)_q}$ be non-increasing for $n \ge 0$.
The following result is due to Love (see [4], Theorem 2), and it is the main tool for proving our further results:

Lemma 3 ([4], Theorem 2) Suppose that ‘Hypothesis OH’ holds. Then for any nonnegative sequence $x = (x_k)$ and any sequence $\mu = (\mu_k)$ of real numbers normalized so that $\mu_0 = 1$, the following inequality holds:
\[ \|Hx\|_\varphi \le C\, \|x\|_\varphi, \tag{8} \]
where
\[ C = \int_0^1 \varphi^{-1}\left(\theta^{-(q+1)c}\right) d\mu(\theta). \tag{9} \]

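Theorem 2 Suppose that ‘Hypothesis OH’ holds. Then the Hausdorff matrix $H$ maps $l_\varphi$ into $t_\varphi^\alpha$ and
\[ \|H\|_{l_\varphi, t_\varphi^\alpha} \le C\, \varphi^{-1}(1-\alpha), \]
where $C$ is given by Eq. (9).

Proof Let $x \in l_\varphi$ be any nonnegative and nonzero sequence of real numbers, and let $r > 0$ be a real number. Applying Jensen's inequality, one obtains
\begin{align*}
\sum_{n=0}^{\infty} \varphi\left( \frac{1}{r} \sum_{k=n}^{\infty} \binom{k}{n} (1-\alpha)^{n+1} \alpha^{k-n} \sum_{i=0}^{k} h_{k,i} x_i \right)
&\le \sum_{n=0}^{\infty} \sum_{k=n}^{\infty} \binom{k}{n} (1-\alpha)^{n+1} \alpha^{k-n}\, \varphi\left( \frac{1}{r} \sum_{i=0}^{k} h_{k,i} x_i \right) \quad \text{(by Jensen's inequality)} \\
&= \sum_{n=0}^{\infty} \varphi\left( \frac{1}{r} \sum_{i=0}^{n} h_{n,i} x_i \right) \sum_{k=0}^{n} \binom{n}{k} (1-\alpha)^{k+1} \alpha^{n-k} \\
&= (1-\alpha) \sum_{n=0}^{\infty} \varphi\left( \frac{1}{r} \sum_{i=0}^{n} h_{n,i} x_i \right).
\end{align*}

Hence, applying the same technique as in Lemma 1 together with the definition of the Orlicz-Luxemburg norm, inequality (8) implies that
\[ \|Hx\|_\varphi^\alpha \le \varphi^{-1}(1-\alpha)\, \|Hx\|_\varphi \le C\, \varphi^{-1}(1-\alpha)\, \|x\|_\varphi, \]
which in turn implies that
\[ \|H\|_{l_\varphi, t_\varphi^\alpha} \le C\, \varphi^{-1}(1-\alpha). \]
This proves the theorem.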
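Corollary 2 Choose $c = 1$, $a = 0$. Then $C = \int_0^1 \varphi^{-1}\left(\theta^{-(q+1)}\right) d\mu(\theta)$, and the Cesàro, Hölder, Euler, and Gamma operators map $l_\varphi$ into $t_\varphi^\alpha$. Further, one obtains the following results:
(a) $\|(C, \beta)\|_{l_\varphi, t_\varphi^\alpha} \le \beta\, \varphi^{-1}(1-\alpha) \int_0^1 \varphi^{-1}\left(\theta^{-(q+1)}\right) (1-\theta)^{\beta-1}\, d\theta$, $\beta > 0$;
(b) $\|(H, \beta)\|_{l_\varphi, t_\varphi^\alpha} \le \frac{1}{\Gamma(\beta)}\, \varphi^{-1}(1-\alpha) \int_0^1 \varphi^{-1}\left(\theta^{-(q+1)}\right) |\log\theta|^{\beta-1}\, d\theta$, $\beta > 0$;
(c) $\|(E, \beta)\|_{l_\varphi, t_\varphi^\alpha} \le \varphi^{-1}(1-\alpha)\, \varphi^{-1}\left(\beta^{-(q+1)}\right)$, $0 < \beta < 1$;
(d) $\|(\Gamma, \beta)\|_{l_\varphi, t_\varphi^\alpha} \le \beta\, \varphi^{-1}(1-\alpha) \int_0^1 \varphi^{-1}\left(\theta^{-(q+1)}\right) \theta^{\beta-1}\, d\theta$.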

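Corollary 3 Choose $c = 1$, $a = 0$ and denote $p^* = \frac{p}{p-1}$. Choose $\varphi(t) = t^p$, $p \ge 1$, which gives $\varphi^{-1}(t) = t^{1/p}$. Then $C = \int_0^1 \theta^{-(q+1)/p}\, d\mu(\theta)$, and the Cesàro, Hölder, Euler, and Gamma operators map $l_p$ into $t_p^\alpha$. Further, one gets the following results from Corollary 2:
(a) $\|(C, \beta)\|_{l_p, t_p^\alpha} \le (1-\alpha)^{1/p}\, \dfrac{\Gamma(\beta+1)\, \Gamma\left(\frac{1}{p^*} - \frac{q}{p}\right)}{\Gamma\left(\beta + \frac{1}{p^*} - \frac{q}{p}\right)}$, $\beta > 0$;
(b) $\|(H, \beta)\|_{l_p, t_p^\alpha} \le (1-\alpha)^{1/p}\, \dfrac{1}{\Gamma(\beta)} \int_0^1 \theta^{-(q+1)/p}\, |\log\theta|^{\beta-1}\, d\theta$, $\beta > 0$;
(c) $\|(E, \beta)\|_{l_p, t_p^\alpha} \le (1-\alpha)^{1/p}\, \beta^{-(q+1)/p}$, $0 < \beta < 1$;
(d) $\|(\Gamma, \beta)\|_{l_p, t_p^\alpha} \le (1-\alpha)^{1/p}\, \dfrac{p\beta}{p\beta - q - 1}$, $p\beta > q + 1$.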
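
For $\varphi(t) = t^p$ the integrals of Corollary 2 reduce to elementary closed forms: the Cesàro integral $\beta\int_0^1 \theta^{-(q+1)/p}(1-\theta)^{\beta-1}\, d\theta$ is a Beta function, and the Gamma integral is a power integral. The following sketch evaluates these bounds; the function names and the sample parameters are illustrative assumptions, and the Cesàro case requires $p > 1$ and $\frac{1}{p^*} - \frac{q}{p} > 0$ so that the gamma function is finite.

```python
from math import gamma

def cesaro_bound(alpha, beta, p, q=0.0):
    """(1 - alpha)^(1/p) * beta * B(1 - (q + 1)/p, beta), written with gamma functions."""
    s = (1.0 - 1.0 / p) - q / p          # 1/p* - q/p, assumed positive
    return (1 - alpha) ** (1 / p) * gamma(beta + 1) * gamma(s) / gamma(beta + s)

def euler_bound(alpha, beta, p, q=0.0):
    """(1 - alpha)^(1/p) * beta^(-(q + 1)/p) for 0 < beta < 1."""
    return (1 - alpha) ** (1 / p) * beta ** (-(q + 1) / p)

def gamma_bound(alpha, beta, p, q=0.0):
    """(1 - alpha)^(1/p) * p*beta / (p*beta - q - 1), assuming p*beta > q + 1."""
    return (1 - alpha) ** (1 / p) * p * beta / (p * beta - q - 1)

# Illustrative parameters: alpha = 1/2, p = 2, q = 0
print(cesaro_bound(0.5, 1.0, 2.0))
print(gamma_bound(0.5, 1.0, 2.0))
print(euler_bound(0.5, 0.25, 2.0))
```

With $\beta = 1$ and $q = 0$ the Cesàro and Gamma bounds coincide, as expected, since both measures reduce to Lebesgue measure; the common factor is Hardy's constant $p^* = p/(p-1)$.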

Corollary 4 Under the same assumptions as in Corollary 3, and additionally putting $q = 0$, one obtains Corollary 3.2 established by Talebi in [1].

3.2 Nörlund Matrix Operator

Now the Nörlund matrix operator $N$, mapping $l_\varphi$ into $t_\varphi^\alpha$, is considered. Before proceeding further, the notion of the Nörlund matrix is recalled. Let $p = (p_n)$ be a sequence of nonnegative numbers such that $p_0 > 0$ and denote $P_n = \sum_{k=0}^{n} p_k$ for $n \ge 0$. Then the Nörlund matrix $N \equiv N(p_n) = (a_{n,k})_{n,k \ge 0}$ associated with the sequence $(p_n)$ is defined by
\[ a_{n,k} = \begin{cases} 0 & \text{if } k > n, \\ \dfrac{p_{n-k}}{P_n} & \text{if } 0 \le k \le n. \end{cases} \]

Note that one can assume that $p_0 = 1$ because $N(p_n) = N(cp_n)$ holds for any $c > 0$. Bounds for the operator norms of Nörlund matrix operators are studied by Johnson et al. in [12]. Our interest in the following study is to estimate a general upper bound for the Nörlund matrix operator norm $\|N\|_{l_\varphi, t_\varphi^\alpha}$. The statement of the theorem is given below:
Theorem 3 Suppose $p = (p_n)$ is a sequence of nonnegative numbers such that $p_0 = 1$. Then the Nörlund matrix $N$ maps $l_\varphi$ into $t_\varphi^\alpha$ and the following inequality holds:
\[ \|N\|_{l_\varphi, t_\varphi^\alpha} \le \varphi^{-1}\left( (1-\alpha) \sum_{k=0}^{\infty} \frac{p_k}{P_k} \right). \tag{10} \]
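Proof Let $x \in l_\varphi$ be any nonnegative and nonzero sequence of real numbers and $r > 0$. Applying Jensen's inequality, the following is obtained:
\begin{align*}
\sum_{n=0}^{\infty} \varphi\left( \frac{1}{r} \sum_{k=n}^{\infty} \binom{k}{n} (1-\alpha)^{n+1} \alpha^{k-n} \sum_{i=0}^{k} \frac{p_{k-i}}{P_k} x_i \right)
&\le \sum_{n=0}^{\infty} \sum_{k=n}^{\infty} \binom{k}{n} (1-\alpha)^{n+1} \alpha^{k-n}\, \varphi\left( \frac{1}{r} \sum_{i=0}^{k} \frac{p_{k-i}}{P_k} x_i \right) \\
&\qquad \left( \text{by Jensen's inequality, as } \sum_{k=n}^{\infty} \binom{k}{n} (1-\alpha)^{n+1} \alpha^{k-n} = 1 \right) \\
&= \sum_{n=0}^{\infty} \varphi\left( \frac{1}{r} \sum_{i=0}^{n} \frac{p_{n-i}}{P_n} x_i \right) \sum_{k=0}^{n} \binom{n}{k} (1-\alpha)^{k+1} \alpha^{n-k} \\
&\le (1-\alpha) \sum_{n=0}^{\infty} \sum_{i=0}^{n} \frac{p_{n-i}}{P_n}\, \varphi\left( \frac{x_i}{r} \right) \quad \left( \text{by Jensen's inequality, as } \sum_{i=0}^{n} \frac{p_{n-i}}{P_n} = 1 \right) \\
&\le (1-\alpha) \sum_{i=0}^{\infty} \sum_{n=i}^{\infty} \frac{p_{n-i}}{P_n}\, \varphi\left( \frac{x_i}{r} \right) \\
&= (1-\alpha) \sum_{i=0}^{\infty} \sum_{k=0}^{\infty} \frac{p_k}{P_{k+i}}\, \varphi\left( \frac{x_i}{r} \right) \\
&\le (1-\alpha) \sum_{k=0}^{\infty} \frac{p_k}{P_k} \sum_{i=0}^{\infty} \varphi\left( \frac{x_i}{r} \right).
\end{align*}

Using similar techniques as developed in Lemma 1, the definition of the Orlicz-Luxemburg norm implies that $\|Nx\|_\varphi^\alpha \le \varphi^{-1}\left( (1-\alpha) \sum_{k=0}^{\infty} \frac{p_k}{P_k} \right) \|x\|_\varphi$, which gives
\[ \|N\|_{l_\varphi, t_\varphi^\alpha} \le \varphi^{-1}\left( (1-\alpha) \sum_{k=0}^{\infty} \frac{p_k}{P_k} \right), \]
and this completes the proof.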

Corollary 5 Choose a sequence $p = (p_k)$ such that $\frac{p_k}{P_k} = \frac{1}{(k+1)^2}$. Then the Nörlund matrix $N$ maps $l_\varphi$ into $t_\varphi^\alpha$ with
\[ \|N\|_{l_\varphi, t_\varphi^\alpha} \le \varphi^{-1}\left( (1-\alpha) \frac{\pi^2}{6} \right). \]

Corollary 6 Let $\varphi(t) = t^p$ for $p \ge 1$ in Corollary 5. Then the Nörlund matrix $N$ maps $l_p$ into $t_p^\alpha$ and one gets
\[ \|N\|_{l_p, t_p^\alpha} \le \left( (1-\alpha) \frac{\pi^2}{6} \right)^{1/p}, \]
which is Corollary 3.4 obtained by Talebi in [1].
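
A sequence with $p_k / P_k = 1/(k+1)^2$, as required in Corollary 5, can be built recursively from $p_0 = 1$, since $P_k = P_{k-1} + p_k$ forces $P_k = P_{k-1} / \left(1 - (k+1)^{-2}\right)$. The sketch below is illustrative only (the function names and the truncation length are assumptions); it constructs such a sequence, forms rows of the associated Nörlund matrix, and compares the partial sums of $\sum p_k / P_k$ with $\pi^2/6$.

```python
from fractions import Fraction
from math import pi

def norlund_sequence(m):
    """Return lists (p, P) of length m with p_0 = 1 and p_k / P_k = 1 / (k + 1)^2."""
    p, P = [Fraction(1)], [Fraction(1)]
    for k in range(1, m):
        ratio = Fraction(1, (k + 1) ** 2)
        # P_k = P_{k-1} + p_k and p_k = ratio * P_k  imply  P_k = P_{k-1} / (1 - ratio)
        P_k = P[-1] / (1 - ratio)
        p.append(ratio * P_k)
        P.append(P_k)
    return p, P

def norlund_row(n, p, P):
    """Row n of the Norlund matrix: a_{n,k} = p_{n-k} / P_n for 0 <= k <= n."""
    return [p[n - k] / P[n] for k in range(n + 1)]

p, P = norlund_sequence(200)
partial_sum = float(sum(p[k] / P[k] for k in range(200)))

print(norlund_row(3, p, P))
print(partial_sum, pi ** 2 / 6)
```

Each row of the Nörlund matrix sums to 1, which is the normalization used in the second application of Jensen's inequality in the proof of Theorem 3.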

4 Conclusion

Upper bounds for the operator norms of generalized Hausdorff and Nörlund matrix operators on Orlicz-Taylor sequence spaces are obtained. This work strengthens the recent work presented by Talebi in [1] and shows a new direction of research. Jensen's inequality is the principal tool in all the proofs.

References

1. Talebi, G.: On the Taylor sequence spaces and upper boundedness of Hausdorff matrices and
Nörlund matrices. Indag. Math. (N.S.) 28(3), 629–636 (2017)
2. Musielak, J.: Orlicz Spaces and Modular Spaces, Springer Lecture Notes in Math., vol. 1034.
Springer, Berlin (1983)
3. González, M., Sari, B., Wójtowicz, M.: Semi-homogeneous bases in Orlicz sequence spaces.
Contemp. Math. 435, 171–181 (2007)
4. Love, E.R.: Hardy’s inequality in Orlicz-type sequence spaces for operators related to gener-
alized Hausdorff matrices. Math. Z. 193, 481–490 (1986)
5. Love, E.R.: Hardy’s inequality for Orlicz-Luxemburg norms. Acta Math. Hung. 56, 247–253
(1990)
6. Bennett, G.: Factorizing the classical inequalities. Mem. Am. Math. Soc. 120(576), 1–130
(1996)
7. Mohapatra, R.N., Salzmann, F., Ross, D.: Norm inequalities which yield inclusion for Euler
sequence spaces. Comput. Math. Appl. 30(3–6), 383–387 (1995)
8. Talebi, G., Dehghan, M.A.: Approximation of upper bounds for matrix operators on Fibonacci
weighted sequence spaces. Linear Multilinear Algebra 64(2), 196–207 (2016)
9. Talebi, G., Dehghan, M.A.: Upper bounds for the operator norms of Hausdorff matrices and
Nörlund matrices on the Euler-weighted sequence spaces. Linear Multilinear Algebra 62(10),
1275–1284 (2014)
10. Lashkaripour, R., Foroutannia, D.: Inequalities involving upper bounds for certain matrix oper-
ators. Proc. Indian Acad. Sci. (Math. Sci.) 116(3), 325–336 (2006)
11. Jakimovski, A., Rhoades, B.E., Tzimbalario, J.: Hausdorff matrices as bounded operators over
l p . Math. Z. 138, 173–181 (1974)
12. Johnson Jr., P.D., Mohapatra, R.N., Ross, D.: Bounds for the operator norms of some Nörlund
matrices. Proc. Am. Math. Soc. 124(2), 543–547 (1996)
Chapter 27
A Study on Fuzzy Triangle and Fuzzy
Trigonometric Properties

Debdas Ghosh and Debjani Chakraborty

Abstract This paper investigates fuzzy triangle, fuzzy triangular properties, and
fuzzy trigonometry. A fuzzy triangle on the plane is constructed by three fuzzy
points as its vertices. Using the proposed fuzzy triangle, basic fuzzy trigonometric
functions are investigated. The extension principle and the concepts of same and
inverse points in fuzzy geometry are used to define all the proposed ideas. It is
shown that some well-known trigonometric identities for crisp angles may not hold
with proper equality for fuzzy angles.

Keywords Fuzzy number · Fuzzy point · Same points · Inverse points · Fuzzy
angle · Fuzzy triangle · Extension principle

AMS Subject Classification 03E72

1 Introduction

In the literature on fuzzy trigonometry, a fuzzy triangle in a plane has been defined in four different ways: first, a fuzzy triangle is the intersection of three intersecting fuzzy half planes [12]; second, a fuzzy triangle is the union of three fuzzy line segments that are obtained by joining three fuzzy points (three vertices) [2]; third, a fuzzy triangle is a blurred image obtained by blurring the sides of a crisp triangle [9]; and last, a fuzzy triangle is an approximate crisp triangle [7, 8]. Chaudhuri [5] has defined a fuzzy triangle as a fuzzy set whose α-cuts are similar triangles.

D. Ghosh (B)
Department of Mathematical Sciences, Indian Institute of Technology (BHU),
Varanasi 221005, Uttar Pradesh, India
e-mail: [email protected]
D. Chakraborty
Department of Mathematics, Indian Institute of Technology Kharagpur,
Kharagpur 721302, West Bengal, India
e-mail: [email protected]

The fuzzy triangle defined in [5] is not really a fuzzy triangle; it is a fuzzy point [2] whose support is a triangular region. The membership value of a point in the fuzzy half plane defined in [12] depends on the perpendicular distance between the point and the boundary of the fuzzy half plane; the membership value of a point increases as this perpendicular distance increases. At the core, the fuzzy half plane considered in [12] is not similar to the definition of a crisp half plane. Moreover, the core of a fuzzy half plane must be a crisp half plane, which does not follow from the definition of the fuzzy half plane, and hence the definition of the fuzzy triangle therein may be questionable. In [12], the boundaries of the α-cuts of a fuzzy triangle are equivalent triangles with identical vertex angles, and hence it is shown that the well-known sine law for crisp triangles also holds for fuzzy triangles. The fuzzy trigonometric functions and angles in [12] are crisp for a fuzzy triangle. These should be fuzzy and cannot be crisp in general, because the considered environment is itself not precisely defined.
Buckley and Eslami [2] defined a fuzzy triangle by three fuzzy points as its vertices. To form a fuzzy triangle, three intersecting fuzzy line segments are adjoined. This definition may be acceptable in a fuzzy environment. Fuzzy trigonometric functions have also been defined in [2] for a right-angled fuzzy triangle using ratios of fuzzy distances; e.g., the fuzzy sine function is the ratio of the perpendicular and the hypotenuse. However, to generalize fuzzy trigonometric functions to an arbitrary fuzzy angle, the fuzzy distance from a vertex to the opposite side of a fuzzy triangle has to be measured, and how to measure it is not given. In [3], the same authors defined these functions through the extension principle.
Liu and Coghill [10] have defined fuzzy trigonometric functions using a fuzzy unit circle, which has been named the fuzzy qualitative circle. The boundary of the crisp circle is partitioned fuzzily, and fuzzy qualitative angles are defined as 4-tuple trapezoidal fuzzy numbers. But in general it is very difficult to obtain the values of the trigonometric functions for an arbitrary fuzzy angle.
Imran and Beg [7, 8] studied the fuzzy triangle, or f-triangle, as an approximate triangle. It is reported that, instead of drawing a triangle with a ruler, any triangle drawn by free hand is a fuzzy triangle. Subsequently, similarity of fuzzy triangles is also studied. But we note that the core of this fuzzy triangle is not a crisp triangle.
In [9], a fuzzy triangle is defined by blurring the boundary of a crisp triangle using a smooth unit step function and implicit functions. But in the obtained shape, the 1-level set contains all the points which lie outside the considered crisp triangle instead of the points on the boundary.
Recently, [15] has mentioned that the counterpart of a crisp triangle, C, in Euclidean geometry, is a fuzzy triangle. The fuzzy triangle is referred to as a fuzzy transform of C, with C playing the role of the prototype of the fuzzy triangle. It is helpful to visualize a fuzzy transform of C as the result of executing the instruction: draw C by hand with an unprecisiated spray pen. Here, the fuzzy transformation is a one-to-many function.
An overview of fuzzy geometrical concepts prior to the work of Buckley and Eslami is reported in [13]. Some simple constructions on fuzzy geometrical concepts can also be found in [11].

In this paper, new concepts about the fuzzy triangle, fuzzy triangular properties, and some basics of fuzzy trigonometry are proposed. After defining a fuzzy triangle, its side lengths, vertex angles, area and perimeter are studied. In [1, 13, 14], some concepts about the perimeter and area of fuzzy sets are given, but those measurements are crisp numbers. In contrast, the proposed concept yields fuzzy numbers as measurements of side lengths and vertex angles. In the studied concepts of the fuzzy triangle, the above observation about the fuzzy triangle reported by Zadeh [15] is followed. All the concepts introduced here depend on the newly defined concepts of same and inverse points [4, 6]. The following is the outline of this paper.
Section 2 covers the basic definitions and terminology used in this paper. The construction of a fuzzy triangle is proposed in Sect. 3. In Sect. 4, fuzzy trigonometric functions are introduced. A brief discussion of the work presented here and its future scope is added in Sect. 5.

2 Preliminaries

The basic definitions adopted here are taken from [2, 6] with slight alteration. Small or capital letters with an over-tilde, i.e., $\tilde{A}, \tilde{B}, \tilde{C}, \ldots$ and $\tilde{a}, \tilde{b}, \tilde{c}, \ldots$, represent fuzzy subsets of $\mathbb{R}^n$, $n = 1, 2$. The membership function of a fuzzy set $\tilde{A}$ of $\mathbb{R}^n$ is represented by $\mu(x|\tilde{A})$, $x \in \mathbb{R}^n$, with $\mu(\mathbb{R}^n) \subseteq [0, 1]$, $n = 1, 2$.

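Definition 1 (α-cut of a fuzzy set [6]) For a fuzzy set $\tilde{A}$ of $\mathbb{R}^n$, $n = 1, 2$, an α-cut of $\tilde{A}$ is denoted by $\tilde{A}(\alpha)$ and is defined by:
\[ \tilde{A}(\alpha) = \begin{cases} \{x : \mu(x|\tilde{A}) \ge \alpha\} & \text{if } 0 < \alpha \le 1, \\ \text{closure}\{x : \mu(x|\tilde{A}) > 0\} & \text{if } \alpha = 0. \end{cases} \]

The set $\{x : \mu(x|\tilde{A}) > 0\}$ is called the support of the fuzzy set $\tilde{A}$.

To represent the construction of the membership function of a fuzzy set $\tilde{A}$, the notation $\{x : x \in \tilde{A}(0)\}$ is frequently used, which means $\mu(x|\tilde{A}) = \sup\{\alpha : x \in \tilde{A}(\alpha)\}$.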
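
The relation $\mu(x|\tilde{A}) = \sup\{\alpha : x \in \tilde{A}(\alpha)\}$ can be illustrated on a triangular fuzzy number, whose α-cuts are closed intervals. The sketch below is a minimal illustration under stated assumptions: the function names, the sample number with support $[1, 4]$ and core $2$, and the finite grid of α levels used to approximate the supremum are all hypothetical choices.

```python
def triangular_membership(x, a, b, c):
    """Membership of a triangular fuzzy number with support [a, c] and core b (a < b < c)."""
    if a <= x <= b:
        return (x - a) / (b - a)
    if b < x <= c:
        return (c - x) / (c - b)
    return 0.0

def alpha_cut(alpha, a, b, c):
    """Closed interval A(alpha) for 0 <= alpha <= 1."""
    return (a + alpha * (b - a), c - alpha * (c - b))

def membership_from_cuts(x, a, b, c, levels=1001):
    """Approximate sup{alpha : x in A(alpha)} on a uniform grid of alpha levels."""
    best = 0.0
    for i in range(levels):
        alpha = i / (levels - 1)
        low, high = alpha_cut(alpha, a, b, c)
        if low <= x <= high:
            best = max(best, alpha)
    return best

print(alpha_cut(0.5, 1.0, 2.0, 4.0))
print(membership_from_cuts(2.5, 1.0, 2.0, 4.0), triangular_membership(2.5, 1.0, 2.0, 4.0))
```

The α-cuts shrink from the closed support at $\alpha = 0$ to the core at $\alpha = 1$, and the largest level whose cut still contains $x$ recovers the membership value.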

Definition 2 (Fuzzy numbers [2]) A fuzzy set $\tilde{A}$ of $\mathbb{R}$ is called a fuzzy number if its membership function $\mu$ has the following properties:
(i) $\mu(x|\tilde{A})$ is upper semi-continuous,
(ii) $\mu(x|\tilde{A}) = 0$ outside some interval $[a, d]$, and
(iii) there exist real numbers $b$ and $c$ so that $a \le b \le c \le d$ and $\mu(x|\tilde{A})$ is increasing on $[a, b]$, decreasing on $[c, d]$ and $\mu(x|\tilde{A}) = 1$ for each $x$ in $[b, c]$.

Since $\mu(x|\tilde{A})$ is upper semi-continuous for a fuzzy number $\tilde{A}$, the set $\{x : \mu(x|\tilde{A}) \ge \alpha\}$ is closed for all $\alpha$ in $\mathbb{R}$. So, the α-cut of a fuzzy number $\tilde{A}$, i.e., the set $\tilde{A}(\alpha)$, is a closed and bounded interval of $\mathbb{R}$ for all $\alpha$ in $[0, 1]$.

For $b = c$, letting $f(x) = \mu(x|\tilde{A})\ \forall x \in [a, b]$ and $g(x) = \mu(x|\tilde{A})\ \forall x \in [c, d]$, the notation $(a, c, d)_{fg}$ is used in this paper to represent the above-defined fuzzy number. In particular, if $f(x)$ and $g(x)$ are linear functions, then the fuzzy number is called a triangular fuzzy number and it is denoted by $(a/b/c)$.

Definition 3 (Fuzzy number along a line [6]) In defining a fuzzy number, conventionally the real line ($\mathbb{R}$) is taken as the universal set. Instead of the real line as the universal set, consider any line on the plane $\mathbb{R}^2$, where the x-axis represents the real line, and let $\tilde{p}$ be a fuzzy number. On the x-axis, the membership function of $\tilde{p}$ may be written as $\mu((x, 0)|\tilde{p}) = \mu(x|\tilde{p})\ \forall x \in \mathbb{R}$. More explicitly:
\[ \mu((x, y)|\tilde{p}) = \begin{cases} \mu(x|\tilde{p}) & \text{if } y = 0, \\ 0 & \text{elsewhere.} \end{cases} \]

Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be a transformation that includes rotation of the axes by an angle $\theta$ and translation of the origin to $\left( \frac{ac}{a^2+b^2}, \frac{bc}{a^2+b^2} \right)$, which is the point of intersection of $ax + by = c$ and its perpendicular line through the origin. $T$ can be expressed by
\[ T(x, y) = \left( x\cos\theta - y\sin\theta + \frac{ac}{a^2+b^2},\ x\sin\theta + y\cos\theta + \frac{bc}{a^2+b^2} \right). \]
$T$ is a bijective transformation that transforms the x-axis to $ax + by = c$. Now, $\tilde{p}$ may be considered as a fuzzy number on the line $ax + by = c$ and may be defined in the following way:
\[ \mu((u, v)|\tilde{p}) = \begin{cases} \mu((x, 0)|\tilde{p}) & \text{if } (u, v) = T(x, 0),\ au + bv = c, \\ 0 & \text{elsewhere.} \end{cases} \]
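
The transformation $T$ of Definition 3 can be sketched in a few lines. The definition does not state how the rotation angle $\theta$ is chosen; the code below assumes $\theta$ is the direction angle of the vector $(-b, a)$, which is parallel to the line $ax + by = c$. The function name `make_line_transform` and the sample line $3x + 4y = 10$ are illustrative assumptions.

```python
import math

def make_line_transform(a, b, c):
    """Return T mapping the x-axis onto the line a*x + b*y = c, assuming (a, b) != (0, 0)."""
    norm2 = a * a + b * b
    x0, y0 = a * c / norm2, b * c / norm2   # foot of the perpendicular from the origin
    theta = math.atan2(a, -b)               # assumed choice: direction of the vector (-b, a)
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    def T(x, y):
        # rotation by theta followed by translation to (x0, y0)
        return (x * cos_t - y * sin_t + x0, x * sin_t + y * cos_t + y0)

    return T

T = make_line_transform(3.0, 4.0, 10.0)
for x in (-2.0, 0.0, 1.5):
    u, v = T(x, 0.0)
    print((u, v), 3.0 * u + 4.0 * v)   # second value stays at c = 10 up to rounding
```

A fuzzy number on the x-axis is then carried to the line by assigning to $T(x, 0)$ the membership value of $x$.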

Definition 4 (Fuzzy points [2]) A fuzzy point at $(a, b)$ in $\mathbb{R}^2$, written as $\tilde{P}(a, b)$, is defined by its membership function:
(i) $\mu((x, y)|\tilde{P}(a, b))$ is upper semi-continuous,
(ii) $\mu((x, y)|\tilde{P}(a, b)) = 1$ if and only if $(x, y) = (a, b)$, and
(iii) $\tilde{P}(a, b)(\alpha)$ is a compact, convex subset of $\mathbb{R}^2$, for all $\alpha$ in $[0, 1]$.
Often the notations $\tilde{P}_1, \tilde{P}_2, \tilde{P}_3, \ldots$ are used to represent fuzzy points.

Definition 5 (Same points [6]) Let $(x_1, y_1)$ and $(x_2, y_2)$ be two points on the supports of two continuous fuzzy points $\tilde{P}(a, b)$ and $\tilde{P}(c, d)$, respectively. Let $L_1$ be a line joining $(x_1, y_1)$ and $(a, b)$. As $\tilde{P}(a, b)$ is a fuzzy point, along $L_1$ there exists a fuzzy number, $\tilde{r}_1$ say, on the support of $\tilde{P}(a, b)$. The membership function of this fuzzy number $\tilde{r}_1$ can be written as $\mu((x, y)|\tilde{r}_1) = \mu((x, y)|\tilde{P}(a, b))$ for $(x, y)$ on $L_1$, and 0 otherwise.
Similarly, along a line, $L_2$ say, joining $(x_2, y_2)$ and $(c, d)$, there exists a fuzzy number, $\tilde{r}_2$ say, on the support of $\tilde{P}(c, d)$. The points $(x_1, y_1)$, $(x_2, y_2)$ are said to be same points with respect to $\tilde{P}(a, b)$ and $\tilde{P}(c, d)$ if:
(i) $(x_1, y_1)$ and $(x_2, y_2)$ are same points with respect to $\tilde{r}_1$ and $\tilde{r}_2$, and
(ii) $L_1$, $L_2$ make the same angle with the line joining $(a, b)$ and $(c, d)$.

Definition 6 (Inverse points [6]) Let $(x_1, y_1)$ and $(x_2, y_2)$ be two points in the supports of two continuous fuzzy points $\tilde{P}(a, b)$ and $\tilde{P}(c, d)$, respectively. The points $(x_1, y_1)$, $(x_2, y_2)$ are said to be inverse points with respect to $\tilde{P}(a, b)$ and $\tilde{P}(c, d)$ if $(x_1, y_1)$, $(-x_2, -y_2)$ are same points with respect to $\tilde{P}(a, b)$ and $-\tilde{P}(c, d)$.

Definition 7 (Fuzzy distance [6]) The fuzzy distance $\tilde{D} = \tilde{D}(\tilde{P}_1, \tilde{P}_2)$ between two fuzzy points $\tilde{P}_1$ and $\tilde{P}_2$ is defined by its membership function: $\mu(d|\tilde{D}) = \sup\{\alpha : d = d(u, v)$, where $u \in \tilde{P}_1(0)$, $v \in \tilde{P}_2(0)$ are inverse points; $\mu(u|\tilde{P}_1) = \mu(v|\tilde{P}_2) = \alpha\}$.
Here $d(\cdot, \cdot)$ is the Euclidean distance metric.

Definition 8 (Fuzzy line segment [6]) The fuzzy line segment $\tilde{L}_{P_1P_2}$ joining two fuzzy points $\tilde{P}_1$, $\tilde{P}_2$ is defined by its membership function as $\mu((x, y)|\tilde{L}_{P_1P_2}) = \sup\{\alpha : (x, y)$ lies on the line joining same points $(x_1, y_1) \in \tilde{P}_1(0)$, $(x_2, y_2) \in \tilde{P}_2(0)$ and $\mu((x_1, y_1)|\tilde{P}_1) = \mu((x_2, y_2)|\tilde{P}_2) = \alpha\}$.

Definition 9 (Angle between two fuzzy line segments [6]) Let $\tilde{P}_1$, $\tilde{P}_2$, $\tilde{P}_3$ be three continuous fuzzy points. The angle between $\tilde{L}_{P_1P_2}$ and $\tilde{L}_{P_2P_3}$ is denoted by $\tilde{\Theta}$ and is defined by: $\mu(\theta|\tilde{\Theta}) = \sup\{\alpha : \theta$ is the angle between two line segments $L_{uv}$ and $L_{vw}$, where $u, v$ and $v, w$ are same points of membership value $\alpha$; $u \in \tilde{P}_1(0)$, $v \in \tilde{P}_2(0)$, $w \in \tilde{P}_3(0)\}$.

In the next section, first we introduce the formation of fuzzy triangle and then the
measurements of its side lengths, vertex angles, area, and perimeter.

3 Fuzzy Triangle

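Let us suppose that three distinct fuzzy points $\tilde{P}_1$, $\tilde{P}_2$, and $\tilde{P}_3$ are given and a fuzzy triangle ($\widetilde{\Delta}\tilde{P}_1\tilde{P}_2\tilde{P}_3$) is to be formed. A construction procedure may be designed as follows. Considering three same points $u$, $v$, and $w$ in the supports of $\tilde{P}_1$, $\tilde{P}_2$, and $\tilde{P}_3$, respectively, let us construct a triangle $\triangle$ having vertices $u$, $v$ and $w$. If $\mu(u|\tilde{P}_1) = \alpha$, then obviously $\mu(v|\tilde{P}_2) = \alpha$ and $\mu(w|\tilde{P}_3) = \alpha$. We may take the membership value of $\triangle$ in $\widetilde{\Delta}\tilde{P}_1\tilde{P}_2\tilde{P}_3$ to be $\alpha$ as well. Now $\widetilde{\Delta}\tilde{P}_1\tilde{P}_2\tilde{P}_3$ can be considered as the union of all these $\triangle$'s, that is, crisp triangles with different membership grades. Thus, a formal definition of a fuzzy triangle may be given by its membership function as
$\mu(x|\widetilde{\Delta}\tilde{P}_1\tilde{P}_2\tilde{P}_3) = \sup\{\alpha : x \in \triangle$, where $\triangle$ is constructed by the same points $u \in \tilde{P}_1(0)$, $v \in \tilde{P}_2(0)$, and $w \in \tilde{P}_3(0)$ as vertices; $\mu(u|\tilde{P}_1) = \mu(v|\tilde{P}_2) = \mu(w|\tilde{P}_3) = \alpha\}$.

Remark 1 The fuzzy triangle defined above is exactly equal to $\tilde{L}_{P_1P_2} \cup \tilde{L}_{P_2P_3} \cup \tilde{L}_{P_3P_1}$.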

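Example 1 Let us consider the fuzzy triangle $\widetilde{\Delta}\tilde{P}_1\tilde{P}_2\tilde{P}_3$ whose vertices are the three fuzzy points $\tilde{P}_1(1, 2)$, $\tilde{P}_2(5, 7)$ and $\tilde{P}_3(6, 1)$. Let the membership functions be right elliptical/circular cones with supports
\begin{align*}
\tilde{P}_1(1, 2)(0) &= \left\{(x, y) : \frac{(x-1)^2}{4} + (y-2)^2 \le 1\right\}, \\
\tilde{P}_2(5, 7)(0) &= \left\{(x, y) : (x-5)^2 + (y-7)^2 \le 4\right\} \text{ and} \\
\tilde{P}_3(6, 1)(0) &= \left\{(x, y) : (x-6)^2 + (y-1)^2 \le 1\right\}.
\end{align*}

Let us now evaluate the membership value of $(2, 4)$ in the fuzzy triangle $\widetilde{\Delta}\tilde{P}_1\tilde{P}_2\tilde{P}_3$. The same points with membership value $\alpha \in [0, 1]$ on $\tilde{P}_1(1, 2)$, $\tilde{P}_2(5, 7)$, and $\tilde{P}_3(6, 1)$ are (see [6]):
\begin{align*}
A_{\alpha,\theta} &: \left( 1 + \frac{2(1-\alpha)\cos\theta}{\sqrt{4\sin^2\theta + \cos^2\theta}},\ 2 + \frac{2(1-\alpha)\sin\theta}{\sqrt{4\sin^2\theta + \cos^2\theta}} \right), \\
B_{\alpha,\theta} &: \left( 5 + 2(1-\alpha)\cos\theta,\ 7 + 2(1-\alpha)\sin\theta \right) \text{ and} \\
C_{\alpha,\theta} &: \left( 6 + (1-\alpha)\cos\theta,\ 1 + (1-\alpha)\sin\theta \right),
\end{align*}
respectively, where $\theta \in [0, 2\pi]$.

Apparently, there is a possibility that $(2, 4)$ may lie on the line segment $\bar{L}_{P_1P_2}$, but $(2, 4)$ cannot lie on the line segments $\bar{L}_{P_2P_3}$ and $\bar{L}_{P_1P_3}$. Now the condition that $(2, 4)$ lies on $\bar{L}_{P_1P_2}$, i.e., on the line segment $A_{\alpha,\theta}B_{\alpha,\theta}$ (for some $\theta \in [0, 2\pi]$ and $\alpha \in [0, 1]$), is:
\[ \frac{4 - (7 + 2(1-\alpha)\sin\theta)}{2 - (5 + 2(1-\alpha)\cos\theta)} = \frac{2 + 2k(1-\alpha)\sin\theta - (7 + 2(1-\alpha)\sin\theta)}{1 + 2k(1-\alpha)\cos\theta - (5 + 2(1-\alpha)\cos\theta)}, \quad \text{where } k = \frac{1}{\sqrt{4\sin^2\theta + \cos^2\theta}} \]
\[ \Rightarrow\ \alpha = 1 - \frac{3}{(8+6k)\sin\theta - (10+6k)\cos\theta} = f(\theta), \text{ say.} \]
Here $f(\theta)$ must lie in $[0, 1]$, and hence the admissible domain of $f(\theta)$ is $D_f = [63^\circ, 222.66^\circ]$. The maximum value of $f(\theta)$ over $D_f$ occurs at $157.32^\circ$ and the value is 0.8352, the possibility of containment of $(2, 4)$ on $\bar{L}_{P_1P_2}$.
Thus, the point $(2, 4)$ lies on the triangle $A_{\alpha,\theta}B_{\alpha,\theta}C_{\alpha,\theta}$ for $\alpha = 0.8352$ and $\theta = 157.32^\circ$, i.e., $A \equiv (0.7726, 2.1193)$, $B \equiv (4.7081, 7.1531)$ and $C \equiv (5.8541, 1.0766)$.
Hence, $\mu((2, 4)|\widetilde{\Delta}\tilde{P}_1\tilde{P}_2\tilde{P}_3) = 0.8352$.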
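
The same-point parametrization used in Example 1 can be checked against the cone-shaped membership functions. The sketch below assumes that each right cone has height 1 over its elliptical or circular support, so that the membership value is one minus the normalized elliptical distance from the core; this reading, the function names, and the sample values of α and θ are assumptions rather than statements from the example. Under it, the three points $A_{\alpha,\theta}$, $B_{\alpha,\theta}$, $C_{\alpha,\theta}$ should all have membership value α.

```python
import math

def same_points(alpha, theta):
    """Same points A, B, C of Example 1 for membership level alpha and direction theta."""
    s = 1.0 - alpha
    k = 1.0 / math.sqrt(4 * math.sin(theta) ** 2 + math.cos(theta) ** 2)
    A = (1 + 2 * k * s * math.cos(theta), 2 + 2 * k * s * math.sin(theta))
    B = (5 + 2 * s * math.cos(theta), 7 + 2 * s * math.sin(theta))
    C = (6 + s * math.cos(theta), 1 + s * math.sin(theta))
    return A, B, C

def cone_memberships(A, B, C):
    """Membership values under the assumed unit-height cones over the given supports."""
    mu1 = 1 - math.sqrt((A[0] - 1) ** 2 / 4 + (A[1] - 2) ** 2)
    mu2 = 1 - math.sqrt((B[0] - 5) ** 2 + (B[1] - 7) ** 2) / 2
    mu3 = 1 - math.sqrt((C[0] - 6) ** 2 + (C[1] - 1) ** 2)
    return mu1, mu2, mu3

A, B, C = same_points(0.3, 1.1)   # illustrative level and direction
print(cone_memberships(A, B, C))
```

The factor $k$ is what places $A_{\alpha,\theta}$ on the boundary of the elliptical α-cut of $\tilde{P}_1$ in the direction θ; for the circular supports no such factor is needed.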
In Fig. 1, an α-cut of a fuzzy triangle is shown. The shaded regions represent $\tilde{P}_1(\alpha)$, $\tilde{P}_2(\alpha)$, and $\tilde{P}_3(\alpha)$. The lines $BM$ and $CN$ are inclined at the same angle with the line joining $P_2$ and $P_3$. The pairs of points $B, C$ and $M, N$ are same points with membership value $\alpha$ in $\tilde{P}_2$, $\tilde{P}_3$. Likewise, $A, B$ and $B, C$ are pairs of same points with respect to $\tilde{P}_1$, $\tilde{P}_2$ and $\tilde{P}_2$, $\tilde{P}_3$, respectively. According to the proposed definition, the fuzzy triangle with vertices $\tilde{P}_1$, $\tilde{P}_2$, and $\tilde{P}_3$ is the union of all triangles like $ABC$.
Note 1 Observe that if the supports of any two vertices of a fuzzy triangle have non-empty intersection, there may be several same points which are coincident. Corresponding to those same points, the crisp triangle in the support of the fuzzy triangle reduces to a crisp line segment. Another case may occur in which the supports of the vertices of a fuzzy triangle $\widetilde{\Delta}\tilde{P}_1\tilde{P}_2\tilde{P}_3$ are different but two or more of their core points are identical. In this case, since the core of the fuzzy triangle is a crisp line segment, it cannot form a fuzzy triangle. These are all degenerate cases of a fuzzy triangle. So to get a fuzzy triangle, we need to have three fuzzy points having distinct core points.

Fig. 1 Construction of a fuzzy triangle

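In the following theorem, the α-cut of a fuzzy triangle is found.

Theorem 1 Let $\widetilde{\Delta}\tilde{P}_1\tilde{P}_2\tilde{P}_3$ be a fuzzy triangle. Its α-cut is the set $\{x : x \in \triangle$, where $\triangle$ is a crisp triangle whose vertices are three same points $u \in \tilde{P}_1(\alpha)$, $v \in \tilde{P}_2(\alpha)$ and $w \in \tilde{P}_3(\alpha)\}$.

Proof The theorem follows from the observation that $\widetilde{\Delta}\tilde{P}_1\tilde{P}_2\tilde{P}_3 = \bigvee_{\alpha \in [0,1]} \{\triangle_\alpha : \triangle_\alpha$ is a triangle with vertices as same points of $\tilde{P}_1$, $\tilde{P}_2$ and $\tilde{P}_3$ with membership value $\alpha\}$.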
Note 2 The result of Theorem 1 directly shows that the fuzzy triangle joining three fuzzy points having three distinct core points is unique: once the vertices of the fuzzy triangle are changed, the crisp triangles in the support of the fuzzy triangle, which eventually construct the fuzzy triangle, also change, and hence the fuzzy triangle will have a different membership function.
Now let us try to find the side lengths of a fuzzy triangle. The lengths of the sides of the fuzzy triangle $\widetilde{\Delta}\tilde{P}_1\tilde{P}_2\tilde{P}_3$ may be defined by the fuzzy distances (Definition 7) between the vertices, i.e., $\tilde{D}(\tilde{P}_1, \tilde{P}_2)$, $\tilde{D}(\tilde{P}_2, \tilde{P}_3)$ and $\tilde{D}(\tilde{P}_3, \tilde{P}_1)$. Let us denote them as $\tilde{p}_3$, $\tilde{p}_1$, and $\tilde{p}_2$, respectively. The vertex angles of $\widetilde{\Delta}\tilde{P}_1\tilde{P}_2\tilde{P}_3$ may be defined as $\tilde{\angle}(\tilde{L}_{P_1P_2}, \tilde{L}_{P_2P_3})$, $\tilde{\angle}(\tilde{L}_{P_2P_3}, \tilde{L}_{P_3P_1})$, and $\tilde{\angle}(\tilde{L}_{P_3P_1}, \tilde{L}_{P_1P_2})$; the notations $\tilde{\angle}P_2$, $\tilde{\angle}P_3$ and $\tilde{\angle}P_1$, respectively, may be used to represent them. Note that the vertex angle $\tilde{\angle}P_i$ is situated opposite to the side with length $\tilde{p}_i$, $i = 1, 2, 3$.

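Example 2 Let $\widetilde{\Delta}\tilde{P}_1\tilde{P}_2\tilde{P}_3$ be a fuzzy triangle whose vertices $\tilde{P}_1(1, 0)$, $\tilde{P}_2(2, 0)$ and $\tilde{P}_3(1.5, 2)$ are as follows.
The shape of $\tilde{P}_1$ is a right circular cone with base $\tilde{P}_1(0) = \{(x, y) : (x-1)^2 + y^2 \le \frac{1}{4}\}$ and vertex $(1, 0)$.
The shape of $\tilde{P}_2$ is a right circular cone with base $\tilde{P}_2(0) = \{(x, y) : (x-2)^2 + y^2 \le \frac{1}{4}\}$ and vertex $(2, 0)$.
The shape of $\tilde{P}_3$ is a right elliptical cone with base $\tilde{P}_3(0) = \{(x, y) : (x-1.5)^2 + (y-2)^2 \le 1\}$ and vertex $(1.5, 2)$.

The same points with membership value $\alpha \in [0, 1]$ on $\tilde{P}_1(1, 0)$, $\tilde{P}_2(2, 0)$, and $\tilde{P}_3(1.5, 2)$ are
\begin{align*}
A_{\alpha,\theta} &: \left( 1 + \tfrac{1-\alpha}{2}\cos\theta,\ \tfrac{1-\alpha}{2}\sin\theta \right), \\
B_{\alpha,\theta} &: \left( 2 + \tfrac{1-\alpha}{2}\cos\theta,\ \tfrac{1-\alpha}{2}\sin\theta \right) \text{ and} \\
C_{\alpha,\theta} &: \left( 1.5 + (1-\alpha)\cos\theta,\ 2 + (1-\alpha)\sin\theta \right),
\end{align*}
respectively, where $\theta \in [0, 2\pi]$.
To calculate the length of the side $\bar{L}_{P_1P_2}$, i.e., $\tilde{p}_3$, first let us obtain the pairs of inverse points in $\tilde{P}_1$ and $\tilde{P}_2$. The inverse points with membership value $\alpha$ on $\tilde{P}_1$ and $\tilde{P}_2$ are $A_{\alpha,\theta}$ and $B_{\alpha,\pi+\theta}$. Here, $\min_{0 \le \theta \le 2\pi} d(A_{\alpha,\theta}, B_{\alpha,\pi+\theta}) = \alpha$ and $\max_{0 \le \theta \le 2\pi} d(A_{\alpha,\theta}, B_{\alpha,\pi+\theta}) = 2 - \alpha$ for all $\alpha \in [0, 1]$. Thus, $\tilde{p}_3(\alpha) = [\alpha, 2-\alpha]\ \forall \alpha \in [0, 1]$. Hence, the membership function of $\tilde{p}_3$ is obtained as
\[ \mu(d|\tilde{p}_3) = \begin{cases} d & \text{if } 0 \le d \le 1, \\ 2 - d & \text{if } 1 \le d \le 2, \\ 0 & \text{elsewhere.} \end{cases} \]

Similarly, the lengths of the sides $\bar{L}_{P_2P_3}$ and $\bar{L}_{P_3P_1}$, i.e., $\tilde{p}_1$ and $\tilde{p}_2$, respectively, are obtained as
\[ \mu(d|\tilde{p}_1) = \mu(d|\tilde{p}_2) = \begin{cases} \frac{2}{3}d - 0.3744 & \text{if } 0.5615 \le d \le 2.0616, \\ 2.3744 - \frac{2}{3}d & \text{if } 2.0616 \le d \le 3.5616, \\ 0 & \text{elsewhere.} \end{cases} \]

To evaluate the vertex angle $\tilde{\angle}P_2$ (Definition 9) of the fuzzy triangle $\widetilde{\Delta}\tilde{P}_1\tilde{P}_2\tilde{P}_3$, let us first calculate the angle between the line segments $A_{\alpha,\theta}B_{\alpha,\theta}$ and $B_{\alpha,\theta}C_{\alpha,\theta}$ joining same points of the vertices. Here,
\[ \angle(A_{\alpha,\theta}B_{\alpha,\theta}, B_{\alpha,\theta}C_{\alpha,\theta}) = \tan^{-1}\left( \frac{2 + \frac{1-\alpha}{2}\sin\theta}{-0.5 + \frac{1-\alpha}{2}\cos\theta} \right). \]
We obtain that the core of the angle $\tilde{\angle}P_2$ is $75.9627^\circ$ and its support is
\[ \tilde{\angle}P_2(0) = \bigcup_{0 \le \alpha \le 1,\ 0 \le \theta \le 2\pi} \tan^{-1}\left( \frac{2 + \frac{1-\alpha}{2}\sin\theta}{-0.5 + \frac{1-\alpha}{2}\cos\theta} \right) = [51.7839^\circ, 90^\circ]. \]
Similarly, the core and support of the angle $\tilde{\angle}P_3$ are $28.0749^\circ$ and $[22.0228^\circ, 157.6549^\circ]$, respectively. For the angle $\tilde{\angle}P_1$, its core and support are $75.9627^\circ$ and $[51.7839^\circ, 90^\circ]$, respectively.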
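
The α-cut $\tilde{p}_3(\alpha) = [\alpha, 2-\alpha]$ obtained in Example 2 can be reproduced numerically by scanning the inverse points $A_{\alpha,\theta}$ and $B_{\alpha,\pi+\theta}$ over θ. The sketch below is illustrative only: the function names and the number of θ steps are assumptions, and the scan approximates the minimum and maximum on a finite grid.

```python
import math

def point_on_P1(alpha, theta):
    """Point A_{alpha,theta} on the alpha-level boundary of the fuzzy point at (1, 0)."""
    r = (1 - alpha) / 2
    return (1 + r * math.cos(theta), r * math.sin(theta))

def point_on_P2(alpha, theta):
    """Point B_{alpha,theta} on the alpha-level boundary of the fuzzy point at (2, 0)."""
    r = (1 - alpha) / 2
    return (2 + r * math.cos(theta), r * math.sin(theta))

def side_length_cut(alpha, steps=3600):
    """Approximate [min, max] of d(A_{alpha,theta}, B_{alpha,pi+theta}) over theta."""
    dists = []
    for i in range(steps + 1):
        theta = 2 * math.pi * i / steps
        a_pt = point_on_P1(alpha, theta)
        b_pt = point_on_P2(alpha, math.pi + theta)   # inverse point of a_pt
        dists.append(math.dist(a_pt, b_pt))
    return min(dists), max(dists)

for alpha in (0.0, 0.4, 1.0):
    print(alpha, side_length_cut(alpha))
```

The squared distance equals $1 - 2(1-\alpha)\cos\theta + (1-\alpha)^2$, so the extremes are attained at $\theta = 0$ and $\theta = \pi$, in agreement with the interval $[\alpha, 2-\alpha]$.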

We note that $\widetilde{\Delta}\tilde{P}_1\tilde{P}_2\tilde{P}_3 = \tilde{L}_{P_1P_2} \cup \tilde{L}_{P_2P_3} \cup \tilde{L}_{P_3P_1}$, since fuzzy line segments are also defined by collections of crisp line segments joining same points of the extreme fuzzy points. The side lengths of $\widetilde{\Delta}\tilde{P}_1\tilde{P}_2\tilde{P}_3$ are defined as the lengths of the fuzzy line segments $\tilde{L}_{P_1P_2}$, $\tilde{L}_{P_2P_3}$, and $\tilde{L}_{P_3P_1}$. Obviously, the side lengths of a fuzzy triangle are fuzzy numbers, because the distance between two fuzzy points, measured by inverse points, is a fuzzy number [6]. The vertex angles of $\widetilde{\Delta}\tilde{P}_1\tilde{P}_2\tilde{P}_3$ are the angles between the fuzzy line segments $\tilde{L}_{P_1P_2}$, $\tilde{L}_{P_2P_3}$, and $\tilde{L}_{P_3P_1}$. It is worth mentioning that the vertex angles of a fuzzy triangle having three continuous fuzzy points as vertices are fuzzy numbers, since the angle between two fuzzy line segments joining fuzzy points can easily be shown to be a fuzzy number. But we observe that the support of the fuzzy number obtained by adding the vertex angles of a fuzzy triangle may contain angles of more than $180^\circ$. The following example supports this observation.

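Example 3 Let us consider the fuzzy triangle $\widetilde{\Delta}\tilde{P}_1\tilde{P}_2\tilde{P}_3$ whose vertices are the three fuzzy points $\tilde{P}_1(3, 2)$, $\tilde{P}_2(1, 1)$, and $\tilde{P}_3(1, 0)$. Their membership functions are right circular/elliptical cones with supports
\begin{align*}
\tilde{P}_1(3, 2)(0) &= \left\{(x, y) : \frac{(x-3)^2}{3^2} + \frac{(y-2)^2}{0.5^2} \le 1\right\}, \\
\tilde{P}_2(1, 1)(0) &= \left\{(x, y) : (x-1)^2 + (y-1)^2 \le \tfrac{1}{4}\right\} \text{ and} \\
\tilde{P}_3(1, 0)(0) &= \left\{(x, y) : (x-1)^2 + y^2 \le \tfrac{1}{4}\right\}.
\end{align*}

Let us suppose that the vertex angle $\tilde{\angle}P_2 = \tilde{\angle}(\tilde{L}_{P_1P_2}, \tilde{L}_{P_2P_3})$ of the fuzzy triangle $\widetilde{\Delta}\tilde{P}_1\tilde{P}_2\tilde{P}_3$ is to be evaluated.
The same points with membership value $\alpha \in [0, 1]$ on $\tilde{P}_1(3, 2)$, $\tilde{P}_2(1, 1)$, and $\tilde{P}_3(1, 0)$ are
\begin{align*}
A_{\alpha,\theta} &: \left( 3 + 3(1-\alpha)\cos\theta,\ 2 + \tfrac{1-\alpha}{2}\sin\theta \right), \\
B_{\alpha,\theta} &: \left( 1 + \tfrac{1-\alpha}{2}\cos\theta,\ 1 + \tfrac{1-\alpha}{2}\sin\theta \right) \text{ and} \\
C_{\alpha,\theta} &: \left( 1 + \tfrac{1-\alpha}{2}\cos\theta,\ \tfrac{1-\alpha}{2}\sin\theta \right),
\end{align*}
respectively, where $\theta \in [0, 2\pi]$.
Thus, $\tilde{\angle}P_2(0) = \bigcup_{\theta \in [0, 2\pi],\ \alpha \in [0, 1]} |\angle(A_{\alpha,\theta}B_{\alpha,\theta}, B_{\alpha,\theta}C_{\alpha,\theta})| = [102.5306^\circ, 194.0362^\circ]$.
A geometric visualization of the scenario is given in Fig. 2.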

Remark 2 As the vertex angles of the fuzzy triangle are defined as angles between the side line segments of the fuzzy triangle, there is a possibility that a vertex angle may contain an angle of more than $180^\circ$ in its support. It is shown in Fig. 2 that $\angle(A_{0,\pi}B_{0,\pi}, B_{0,\pi}C_{0,\pi}) = 194.0362^\circ$. However, we note that if the vertex angle $\tilde{\angle}P_2$ had been defined by the collection of the angles $\angle B_{\alpha,\theta}$ of the triangles $A_{\alpha,\theta}B_{\alpha,\theta}C_{\alpha,\theta}$ for all possible $\theta$ and $\alpha$, the measurement of $\tilde{\angle}P_2$ could not have more than $180^\circ$ in its support.

Fig. 2 Addition of the vertex angles may contain more than $180^\circ$ on its support

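In another way, we may define the vertex angle of a fuzzy triangle as follows. Let us consider a fuzzy triangle $\widetilde{\Delta}P_1P_2P_3$ having vertices $\widetilde{P}_1$, $\widetilde{P}_2$, and $\widetilde{P}_3$. The vertex angle $\widetilde{\angle P_2}$ may be defined as $\widetilde{\angle P_2} = \bigcup\{\angle ABC : ABC$ is a triangle, $A \in \widetilde{P}_1$, $B \in \widetilde{P}_2$, $C \in \widetilde{P}_3$ are three same points$\}$. Similarly, $\widetilde{\angle P_1}$ and $\widetilde{\angle P_3}$ can be defined. By this definition of fuzzy vertex angle, for the triangle considered in Example 3, we obtain that $\widetilde{\angle P_2}(0) = [102.5306^\circ, 180^\circ]$. However, by this definition also, the addition of the three vertex angles may contain more than $180^\circ$. Thus, in either definition,
$$\widetilde{\angle P_1} + \widetilde{\angle P_2} + \widetilde{\angle P_3} \ne \bigcup\{\angle A + \angle B + \angle C : ABC \text{ is a triangle having vertices as same points of } \widetilde{P}_1, \widetilde{P}_2, \text{ and } \widetilde{P}_3\},$$
since the right-hand side is the crisp number $180^\circ$ and the left-hand side is a fuzzy number.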
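The support quoted for the second (interior-angle) definition can be checked numerically. The following is a minimal sketch, not the authors' procedure: it assumes the same points of Example 3 with $\widetilde{P}_2$ centred at (1, 1) and $\widetilde{P}_3$ at (1, 0), measures the ordinary interior angle at $B$, and sweeps a grid over $\alpha$ and $\theta$. The function names and grid sizes are illustrative choices.

```python
import math

def same_points(alpha, theta):
    """Same points A, B, C of Example 3 for membership alpha and direction theta."""
    s = 1.0 - alpha
    c, sn = math.cos(theta), math.sin(theta)
    A = (3 + 3 * s * c, 2 + 0.5 * s * sn)
    B = (1 + 0.5 * s * c, 1 + 0.5 * s * sn)
    C = (1 + 0.5 * s * c, 0.5 * s * sn)
    return A, B, C

def interior_angle_deg(A, B, C):
    """Interior angle at B of the crisp triangle ABC, in degrees (0 to 180)."""
    bax, bay = A[0] - B[0], A[1] - B[1]
    bcx, bcy = C[0] - B[0], C[1] - B[1]
    cross = bax * bcy - bay * bcx
    dot = bax * bcx + bay * bcy
    return math.degrees(math.atan2(abs(cross), dot))

def angle_support(n_alpha=200, n_theta=720):
    """Grid estimate of the smallest and largest interior angle at B."""
    angles = [
        interior_angle_deg(*same_points(i / n_alpha, 2 * math.pi * k / n_theta))
        for i in range(n_alpha + 1)
        for k in range(n_theta + 1)
    ]
    return min(angles), max(angles)

low, high = angle_support()
```

On this grid the estimate agrees with the interval $[102.5306^\circ, 180^\circ]$ to about two decimal places; the grid is only an approximation of the continuous union.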
Now, let us try to investigate perimeter and area of a fuzzy triangle.

Definition 10 (Perimeter of a fuzzy triangle) Let us consider a fuzzy triangle $\widetilde{\Delta}P_1P_2P_3$. The fuzzy perimeter of the considered fuzzy triangle may be defined in the following ways.

(Method 1) Let us denote the fuzzy perimeter as $\widetilde{\delta}_1$. It may be defined by the membership function: $\mu(\delta|\widetilde{\delta}_1) = \sup\{\alpha : \delta$ is the perimeter of the triangle formed by three same points $u \in \widetilde{P}_1(0)$, $v \in \widetilde{P}_2(0)$, and $w \in \widetilde{P}_3(0)$ as its vertices with $\mu(u|\widetilde{P}_1) = \mu(v|\widetilde{P}_2) = \mu(w|\widetilde{P}_3) = \alpha\}$.
(Method 2) In this method, let us denote the fuzzy perimeter as $\widetilde{\delta}_2$. It may be defined by: $\widetilde{\delta}_2 = \widetilde{p}_1 + \widetilde{p}_2 + \widetilde{p}_3$.

Remark 3 Here, the addition $\widetilde{a} + \widetilde{b}$ of two fuzzy numbers $\widetilde{a}$ and $\widetilde{b}$ is performed by applying the concept of same points as defined in [6]. The definition in [6] for the addition of two fuzzy numbers says that $\widetilde{a} + \widetilde{b} = \bigcup\{x + y : x, y$ are same points in $\widetilde{a}, \widetilde{b}\}$. In fact, this addition and the extended addition give the same result, as shown in [6].

Note 3 Fuzzy perimeters obtained by the above two methods are not equal, i.e., $\widetilde{\delta}_1 \ne \widetilde{\delta}_2$.
This follows easily from the formation of $\widetilde{\Delta}P_1P_2P_3$ and the evaluation of the distance between two fuzzy vertices (points). $\widetilde{\Delta}P_1P_2P_3$ is formed by taking the union of all crisp triangles whose vertices are same points of the fuzzy vertices of $\widetilde{\Delta}P_1P_2P_3$, whereas the distance between two fuzzy vertices is evaluated by combining distances between inverse points. Thus, the addition of the side lengths, i.e., $\widetilde{p}_1 + \widetilde{p}_2 + \widetilde{p}_3$, cannot be equal to the union of all the perimeters of the crisp triangles on the support of $\widetilde{\Delta}P_1P_2P_3$.

The following example illustrates the fact that $\widetilde{\delta}_1 \ne \widetilde{\delta}_2$.
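Example 4 Let $\widetilde{\Delta}P_1P_2P_3$ be a fuzzy triangle whose vertices $\widetilde{P}_1(1, 0)$, $\widetilde{P}_2(2, 0)$, and $\widetilde{P}_3(1.5, 2)$ are as follows. All of them have right circular cones as membership functions, with support sets $\widetilde{P}_1(0) = \{(x, y) : (x - 1)^2 + y^2 \le \tfrac{1}{4}\}$, $\widetilde{P}_2(0) = \{(x, y) : (x - 2)^2 + y^2 \le \tfrac{1}{4}\}$, and $\widetilde{P}_3(0) = \{(x, y) : (x - 1.5)^2 + (y - 2)^2 \le 1\}$.

The same points with membership value $\alpha \in [0, 1]$ on $\widetilde{P}_1(1, 0)$, $\widetilde{P}_2(2, 0)$, and $\widetilde{P}_3(1.5, 2)$ are:
$$A_{\alpha,\theta} : \left(1 + \tfrac{1 - \alpha}{2}\cos\theta,\ \tfrac{1 - \alpha}{2}\sin\theta\right),$$
$$B_{\alpha,\theta} : \left(2 + \tfrac{1 - \alpha}{2}\cos\theta,\ \tfrac{1 - \alpha}{2}\sin\theta\right) \text{ and}$$
$$C_{\alpha,\theta} : \left(1.5 + (1 - \alpha)\cos\theta,\ 2 + (1 - \alpha)\sin\theta\right),$$
respectively, where $\theta \in [0, 2\pi]$.

Thus, from the definitions of $\widetilde{\delta}_1$ and $\widetilde{\delta}_2$, we get
$$\widetilde{\delta}_1(0) = \bigcup_{\alpha\in[0,1],\,\theta\in[0,2\pi]} \{d(A_{\alpha,\theta}, B_{\alpha,\theta}) + d(B_{\alpha,\theta}, C_{\alpha,\theta}) + d(C_{\alpha,\theta}, A_{\alpha,\theta})\} = [4.1623, 6.0990]$$
and
$$\widetilde{\delta}_2(0) = \widetilde{p}_1(0) + \widetilde{p}_2(0) + \widetilde{p}_3(0) = [0.5616, 3.5616] + [0.5616, 3.5616] + [0, 2] = [1.1232, 9.1232] \ne \widetilde{\delta}_1(0).$$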
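The two supports in Example 4 can be reproduced with a short numerical sketch. For Method 1 it sweeps the same points over a grid. For Method 2 it assumes that the support of the fuzzy distance between two circular fuzzy points is the interval from the centre distance minus the sum of the radii (clipped at 0) to the centre distance plus the sum of the radii; this assumption is not stated in the text but matches the side-length intervals reported above. All function names are illustrative.

```python
import math

def same_points(alpha, theta):
    """Same points A, B, C of Example 4 for membership alpha and direction theta."""
    s = 1.0 - alpha
    c, sn = math.cos(theta), math.sin(theta)
    A = (1 + 0.5 * s * c, 0.5 * s * sn)
    B = (2 + 0.5 * s * c, 0.5 * s * sn)
    C = (1.5 + s * c, 2 + s * sn)
    return A, B, C

def perimeter_support(n_alpha=100, n_theta=360):
    """Method 1: grid estimate of the support of the fuzzy perimeter delta_1."""
    values = []
    for i in range(n_alpha + 1):
        for k in range(n_theta + 1):
            A, B, C = same_points(i / n_alpha, 2 * math.pi * k / n_theta)
            values.append(math.dist(A, B) + math.dist(B, C) + math.dist(C, A))
    return min(values), max(values)

def distance_support(center1, radius1, center2, radius2):
    """Assumed support of the fuzzy distance between two circular fuzzy points."""
    d = math.dist(center1, center2)
    return max(0.0, d - radius1 - radius2), d + radius1 + radius2

def interval_sum(*intervals):
    """Method 2: add side-length supports endpoint by endpoint."""
    return sum(lo for lo, _ in intervals), sum(hi for _, hi in intervals)

delta1 = perimeter_support()
p1 = distance_support((2, 0), 0.5, (1.5, 2), 1.0)  # side opposite P1
p2 = distance_support((1, 0), 0.5, (1.5, 2), 1.0)  # side opposite P2
p3 = distance_support((1, 0), 0.5, (2, 0), 0.5)    # side opposite P3
delta2 = interval_sum(p1, p2, p3)
```

Under these assumptions `delta1` is close to [4.1623, 6.0990] and `delta2` is close to [1.1232, 9.1232], so the Method 2 support strictly contains the Method 1 support.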

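The results of the following two theorems give information to get the $\alpha$-cuts, and hence the membership functions, of $\widetilde{\delta}_1$ and $\widetilde{\delta}_2$.

Theorem 2 $\widetilde{\delta}_1$ is a fuzzy number and $\widetilde{\delta}_1(\alpha) = \{\delta : \delta$ is the perimeter of the triangle constructed by the same points $u \in \widetilde{P}_1(\alpha)$, $v \in \widetilde{P}_2(\alpha)$ and $w \in \widetilde{P}_3(\alpha)$ as its vertices$\}$.

Proof Similar to Theorem 5.1 of [6].

Theorem 3 $\widetilde{\delta}_2$ is a fuzzy number and $\widetilde{\delta}_2(\alpha) = \widetilde{p}_1(\alpha) + \widetilde{p}_2(\alpha) + \widetilde{p}_3(\alpha)$ $\forall \alpha \in [0, 1]$.

Proof The theorem follows directly from the addition of fuzzy numbers using same points.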

Now, let us define area of a fuzzy triangle.

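Definition 11 (Area of a fuzzy triangle) The fuzzy area $\widetilde{\Delta}$ of a fuzzy triangle $\widetilde{\Delta}P_1P_2P_3$ may be defined by its membership function as $\mu(\Delta|\widetilde{\Delta}) = \sup\{\alpha : \Delta$ is the area of the triangle constructed by the same points $u \in \widetilde{P}_1(0)$, $v \in \widetilde{P}_2(0)$ and $w \in \widetilde{P}_3(0)$ as its vertices with $\mu(u|\widetilde{P}_1) = \mu(v|\widetilde{P}_2) = \mu(w|\widetilde{P}_3) = \alpha\}$.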
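Example 5 Let us calculate the area $\widetilde{\Delta}$ of the fuzzy triangle in Example 4. The area of the triangle $A_{\alpha,\theta}B_{\alpha,\theta}C_{\alpha,\theta}$ for a particular value of $\theta$ and $\alpha$ is $\tfrac{1}{2}\left|2 + \tfrac{1}{2}(1 - \alpha)\sin\theta\right|$.
Thus, the support of $\widetilde{\Delta}$ is
$$\widetilde{\Delta}(0) = \bigcup_{\theta\in[0,2\pi]}\ \bigcup_{\alpha\in[0,1]} \tfrac{1}{2}\left|2 + \tfrac{1}{2}(1 - \alpha)\sin\theta\right| = \left[\tfrac{3}{4}, \tfrac{5}{4}\right]$$
and the core of $\widetilde{\Delta}$ is 1.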
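As a quick check of Example 5, the sketch below computes the area of each crisp triangle formed by same points with the shoelace formula, rather than with the closed form $\tfrac{1}{2}|2 + \tfrac{1}{2}(1 - \alpha)\sin\theta|$, and sweeps a grid over $\alpha$ and $\theta$. The helper names and grid sizes are assumptions made for illustration.

```python
import math

def same_points(alpha, theta):
    """Same points A, B, C of Example 4 for membership alpha and direction theta."""
    s = 1.0 - alpha
    c, sn = math.cos(theta), math.sin(theta)
    A = (1 + 0.5 * s * c, 0.5 * s * sn)
    B = (2 + 0.5 * s * c, 0.5 * s * sn)
    C = (1.5 + s * c, 2 + s * sn)
    return A, B, C

def triangle_area(A, B, C):
    """Area of a crisp triangle via the shoelace (cross-product) formula."""
    return 0.5 * abs((B[0] - A[0]) * (C[1] - A[1]) - (C[0] - A[0]) * (B[1] - A[1]))

def area_support(n_alpha=100, n_theta=360):
    """Grid estimate of the support of the fuzzy area."""
    areas = [
        triangle_area(*same_points(i / n_alpha, 2 * math.pi * k / n_theta))
        for i in range(n_alpha + 1)
        for k in range(n_theta + 1)
    ]
    return min(areas), max(areas)

support = area_support()
core = triangle_area(*same_points(1.0, 0.0))  # alpha = 1 gives the crisp triangle
```

The grid estimate agrees with the support $[\tfrac{3}{4}, \tfrac{5}{4}]$ and the core value 1 stated in the example.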
The result of the following theorem gives information to get the $\alpha$-cuts, and hence the membership function, of $\widetilde{\Delta}$.

Theorem 4 $\widetilde{\Delta}$ is a fuzzy number and $\widetilde{\Delta}(\alpha) = \{\Delta : \Delta$ is the area of the triangle constructed by the same points $u \in \widetilde{P}_1(\alpha)$, $v \in \widetilde{P}_2(\alpha)$ and $w \in \widetilde{P}_3(\alpha)$ as its vertices$\}$.

Proof Similar to Theorem 5.1 of [6].

In the next section, we introduce basic fuzzy trigonometric functions using the proposed fuzzy triangle. It is shown that several well-known trigonometric identities do not hold with proper equality for fuzzy angles.

4 Fuzzy Trigonometry

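To define fuzzy trigonometric functions, let us suppose a fuzzy triangle $\widetilde{\Delta}P_1P_2P_3$ is given and we have to find $\sin\widetilde{\angle P_2}$, $\cos\widetilde{\angle P_2}$, $\tan\widetilde{\angle P_2}$, etc. Here, a definition of $\sin\widetilde{\angle P_2}$ is studied; the other functions can be derived in a similar way.
Let $a, b, c$ be three same points taken from $\widetilde{P}_1(0)$, $\widetilde{P}_2(0)$, $\widetilde{P}_3(0)$, respectively ($a, b, c \in \mathbb{R}^2$). Now let us consider the triangle $abc$ in $\widetilde{\Delta}P_1P_2P_3$.
Let $\theta$ be the angle between $ab$ and $bc$, and let $n$ be the foot of the perpendicular from $a$ to the line $bc$.
Obviously, $\sin\theta = \frac{d(a,n)}{d(a,b)}$, where $d(\cdot,\cdot)$ is the usual Euclidean distance.
If $\mu(a|\widetilde{P}_1) = \alpha$, then $\mu(b|\widetilde{P}_2) = \alpha$ and $\mu(c|\widetilde{P}_3) = \alpha$.
Since the membership values of $a$, $b$, and $n$ in the fuzzy triangle $\widetilde{\Delta}P_1P_2P_3$ are $\alpha$, $\alpha$, and greater than or equal to $\alpha$, respectively, the membership value of $\sin\theta$ in $\sin\widetilde{\angle P_2}$ may be assigned as the minimum of the membership values of $a$, $b$, and $n$ in $\widetilde{\Delta}P_1P_2P_3$. Thus, $\mu(\sin\theta|\sin\widetilde{\angle P_2}) = \alpha$.
Now, $\sin\widetilde{\angle P_2}$ can be defined as the union of the above values of $\sin\theta$. Therefore, $\sin\widetilde{\angle P_2}$ is defined as follows.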

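Definition 12 (Fuzzy sine function) For a fuzzy triangle $\widetilde{\Delta}P_1P_2P_3$, let $\widetilde{\angle P_2} = \widetilde{\Theta}$. Then, $\sin\widetilde{\Theta}$ may be defined by the membership function: $\mu(s|\sin\widetilde{\Theta}) = \sup\{\alpha : s = \sin\theta = \frac{d(a,n)}{d(a,b)}$, where $a \in \widetilde{P}_1(0)$, $b \in \widetilde{P}_2(0)$, $c \in \widetilde{P}_3(0)$ are same points with membership value $\alpha$ and $n$ is the foot of the perpendicular from $a$ to the line joining $b$ and $c\}$.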

Below, it is proved that the above-defined $\sin\widetilde{\Theta}$ is a fuzzy number. In the proof, for $0 < \alpha \le 1$, the notation $A(\alpha)$ is used to represent the set $\{\frac{d(a,n)}{d(a,b)} : a \in \widetilde{P}_1(\alpha)$, $b \in \widetilde{P}_2(\alpha)$, $c \in \widetilde{P}_3(\alpha)$ are same points, and $n$ is the foot of the perpendicular from $a$ to the line joining $b$ and $c\}$. Similar to Theorem 5.1 of [6], we can prove that $A(\alpha) = \sin\widetilde{\Theta}(\alpha)$. Before proving the theorem, we observe one surprising fact: for a fuzzy triangle $\widetilde{\Delta}P_1P_2P_3$, $\sin\widetilde{\angle P_2}$ may have singleton support.

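Example 6 Let us consider a fuzzy triangle $\widetilde{\Delta}P_1P_2P_3$ with vertices $\widetilde{P}(2, 3)$, $\widetilde{P}(4, 5)$, and $\widetilde{P}(6, 7)$. All of these three fuzzy points have right circular cones as membership functions, having bases $(x - 2)^2 + (y - 3)^2 \le \tfrac{1}{4}$, $(x - 4)^2 + (y - 5)^2 \le \tfrac{1}{4}$, and $(x - 6)^2 + (y - 7)^2 \le \tfrac{1}{4}$, and vertices at $(2, 3)$, $(4, 5)$, and $(6, 7)$, respectively.
The same points with respect to the above fuzzy points can be represented by $a = \left(2 + \tfrac{1-\alpha}{2}\cos\theta,\ 3 + \tfrac{1-\alpha}{2}\sin\theta\right)$, $b = \left(4 + \tfrac{1-\alpha}{2}\cos\theta,\ 5 + \tfrac{1-\alpha}{2}\sin\theta\right)$, and $c = \left(6 + \tfrac{1-\alpha}{2}\cos\theta,\ 7 + \tfrac{1-\alpha}{2}\sin\theta\right)$, respectively, with $0 \le \theta \le 2\pi$, $0 \le \alpha \le 1$.
For any $a$, $b$, and $c$: $\angle(ab, bc) = \tfrac{\pi}{4}$. Apparently, $\widetilde{\angle P_2} = \tfrac{\pi}{4}$ and $\sin\widetilde{\angle P_2}$ is the crisp number $\tfrac{1}{\sqrt{2}}$. Hence, the support of $\sin\widetilde{\angle P_2}$ is the singleton set $\{\tfrac{1}{\sqrt{2}}\}$.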

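Note 4 It can be easily perceived that if the membership functions and supports of $\widetilde{P}_1$, $\widetilde{P}_2$, $\widetilde{P}_3$ are identical up to a translation, then all of $\sin\widetilde{\angle P_1}$, $\sin\widetilde{\angle P_2}$ and $\sin\widetilde{\angle P_3}$ must have singleton support, since in this situation the angles $\widetilde{\angle P_1}$, $\widetilde{\angle P_2}$, and $\widetilde{\angle P_3}$ are crisp angles.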
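Theorem 5 $\sin\widetilde{\Theta}$ evaluated by Definition 12 is a fuzzy number.

Proof Let us take three different fuzzy points $\widetilde{P}_1$, $\widetilde{P}_2$, and $\widetilde{P}_3$ and consider a fuzzy triangle using them as vertices. Let $\widetilde{\Theta}$ be the fuzzy angle between $\widetilde{L}_{P_1P_2}$ and $\widetilde{L}_{P_2P_3}$.
As $\widetilde{P}_1$, $\widetilde{P}_2$, $\widetilde{P}_3$ are fuzzy points, their $\alpha$-cuts $\widetilde{P}_1(\alpha)$, $\widetilde{P}_2(\alpha)$, $\widetilde{P}_3(\alpha)$ are non-empty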
compact subset of R2 . Hence, supremum and infimum of A(α) are attainable at
A(α). That is, if those elements are u(α), l(α), respectively, then l(α) ∈ A(α) and
u(α) ∈ A(α). Therefore, A(α) ⊆ [l(α), u(α)].
We prove that A(α) = [l(α), u(α)] for 0 < α ≤ 1. To prove this, it is sufficient to
prove that A(α) is convex, closed, and bounded set.
Boundedness of A(α) is trivially true, because it is assumed that the sets P1 (0),
P2 (0) and
P3 (0) have empty intersection.
Now as l(α) ∈ A(α) and u(α) ∈ A(α), obviously convexity of A(α) will imply
its closedness. We will prove that A(α) is convex.
It is easy to notice that the core of $\sin\widetilde{\Theta}$ is the singleton $\{s_0\}$ where $s_0 = \sin\angle(P_1P_2, P_2P_3)$. We argue that $A(\alpha)$ contains all the points of $[l(\alpha), s_0]$ and also of $[s_0, u(\alpha)]$.
If s0 = l(α) = u(α), then result is trivially true. If not, then let λ ∈ (0, 1) and t1 ,
t2 ∈ A(α) with t1 < t2 < s0 . Obviously, l(α) < t1 < λt1 + (1 − λ)t2 < t2 < s0 . Let

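$\theta_1 = \sin^{-1}(t_1)$, $\theta_2 = \sin^{-1}(t_2)$, $\theta_\alpha = \sin^{-1}(l(\alpha))$ and $\theta_\lambda = \sin^{-1}(\lambda t_1 + (1 - \lambda)t_2)$, where $\sin^{-1}$ does not necessarily represent the principal value. We take $\theta_1$ in such a manner that $0 \le \theta_1 < \theta_0 \le \pi$. A similar restriction is followed for $\theta_2$, $\theta_\lambda$, and $\theta_\alpha$. It can be easily observed that $0 \le \theta_\alpha < \theta_1 < \theta_\lambda < \theta_2 < \theta_0 \le \pi$. As the membership function of $\widetilde{\Theta}$ is continuous, it follows that $\theta_\lambda \in \widetilde{\Theta}(\alpha)$. Hence $\sin(\theta_\lambda) \in A(\alpha)$, i.e., $\lambda t_1 + (1 - \lambda)t_2 \in A(\alpha)$. So $[l(\alpha), s_0] \subseteq A(\alpha)$. Similarly, we can prove that $[s_0, u(\alpha)] \subseteq A(\alpha)$.
Hence, $A(\alpha) = [l(\alpha), u(\alpha)]$, a closed bounded interval. Therefore, the membership function of $\sin\widetilde{\Theta}$ is upper semi-continuous.
Let $0 < \alpha \le \beta \le 1$. Apparently, $\widetilde{P}_i(\beta) \subseteq \widetilde{P}_i(\alpha)$ for $i = 1, 2, 3$. Therefore, $A(\beta) \subseteq A(\alpha)$, i.e., $[l(\beta), u(\beta)] \subseteq [l(\alpha), u(\alpha)]$. This implies that $l$ is an increasing function and $u$ is a decreasing function. On the other hand, apparently, $A(0) = [l(0), u(0)]$ and $A(1) = \{s_0\}$ with $s_0 = \sin\theta_0$. Obviously, the membership value of $\sin\theta_0$ in the fuzzy set $\sin(\widetilde{\Theta})$ is one.
Hence the result is proved.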

Here, a natural question may arise: is there any relation between the value of $\sin\widetilde{\Theta}$ evaluated by the extension principle and by Definition 12? The following theorem establishes this relation.

Theorem 6 Let us consider a fuzzy triangle constructed by three different fuzzy points $\widetilde{P}_1$, $\widetilde{P}_2$, $\widetilde{P}_3$. Let $\widetilde{\Theta}$ be the fuzzy angle between $\widetilde{L}_{P_1P_2}$ and $\widetilde{L}_{P_2P_3}$, and let $S(\alpha) = \sin(\widetilde{\Theta})(\alpha)$, where $\sin(\widetilde{\Theta})$ is evaluated by the extension principle. Then, $S(\alpha)$ is identical to $A(\alpha)$ for $0 \le \alpha \le 1$.

Proof The theorem follows from the fact that $\sin\widetilde{\Theta} = \bigcup_{\alpha\in[0,1]} A(\alpha) = \bigcup_{\alpha\in[0,1]} \{s_\alpha : s_\alpha = \sin\Theta_\alpha\}$.

Therefore, the trigonometric sine functions evaluated by the extension principle and by Definition 12 are identical.
In a way similar to the definition of $\sin\widetilde{\Theta}$, the other fuzzy trigonometric functions, like cosine, tangent, etc., can also be defined for fuzzy angles.
Here, it is surprising to note that $\sin\widetilde{\Theta}$ may have a discontinuous membership function even if the membership function of $\widetilde{\Theta}$ is continuous. The following example illustrates this fact.

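Example 7 Let $\widetilde{\Theta} = (0/\tfrac{\pi}{4}/\pi)$. Then, $\sin\widetilde{\Theta}$ has the following membership function, which is discontinuous:
$$\mu(s|\sin\widetilde{\Theta}) = \begin{cases} \dfrac{4\sin^{-1}(s)}{\pi} & \text{if } 0 \le s < \dfrac{1}{\sqrt{2}} \\[2mm] \dfrac{4(\pi - \sin^{-1}(s))}{3\pi} & \text{if } \dfrac{1}{\sqrt{2}} \le s \le 1 \\[2mm] 0 & \text{elsewhere.} \end{cases}$$

Figures 3 and 4 depict the membership functions of $\widetilde{\Theta}$ and $\sin\widetilde{\Theta}$, respectively.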
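The membership function in Example 7 can be cross-checked against a direct application of the extension principle. The sketch below is an illustration under stated assumptions: $\widetilde{\Theta}$ is read as the triangular fuzzy number with support $[0, \pi]$ and core $\tfrac{\pi}{4}$, and the supremum is taken over the two angles in $[0, \pi]$ that share a given sine value. The function names are hypothetical.

```python
import math

def mu_theta(t):
    """Membership of the triangular fuzzy angle (0 / pi/4 / pi)."""
    if 0 <= t <= math.pi / 4:
        return 4 * t / math.pi
    if math.pi / 4 < t <= math.pi:
        return 4 * (math.pi - t) / (3 * math.pi)
    return 0.0

def mu_sin_extension(s):
    """Extension principle: sup of mu_theta over angles in [0, pi] with sine s."""
    if not 0 <= s <= 1:
        return 0.0
    t = math.asin(s)  # principal value in [0, pi/2]
    return max(mu_theta(t), mu_theta(math.pi - t))

def mu_sin_closed_form(s):
    """Piecewise formula stated in Example 7."""
    if 0 <= s < 1 / math.sqrt(2):
        return 4 * math.asin(s) / math.pi
    if 1 / math.sqrt(2) <= s <= 1:
        return 4 * (math.pi - math.asin(s)) / (3 * math.pi)
    return 0.0

samples = [k / 100 for k in range(101)]
max_gap = max(abs(mu_sin_extension(s) - mu_sin_closed_form(s)) for s in samples)
jump_at_one = mu_sin_extension(1.0) - mu_sin_extension(1.0 + 1e-9)
```

Under this reading the two functions agree on the sampled points, and the membership drops from $\tfrac{2}{3}$ at $s = 1$ to 0 immediately to the right of it, which is where the discontinuity sits.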

With the above fact, several well-known trigonometric identities do not hold with
proper equality in the fuzzy environment. Let us make a point wise analysis on those
identities. All of the analysis are supported by numerical illustration.
Fig. 3 Membership function of $\widetilde{\Theta}$ is continuous (in Example 7)

Fig. 4 Membership function of $\sin(\widetilde{\Theta})$ is discontinuous (in Example 7)

1. Pythagorean law for a right-angled fuzzy triangle does not hold. For instance, let us consider the fuzzy triangle $\widetilde{\Delta}P_1P_2P_3$ whose vertices have right circular cones as membership functions, with support sets $\widetilde{P}_1(0, 0)(0) = \{(x, y) : x^2 + y^2 \le \tfrac{1}{4}\}$, $\widetilde{P}_2(1, 0)(0) = \{(x, y) : (x - 1)^2 + y^2 \le \tfrac{1}{4}\}$ and $\widetilde{P}_3(1, 1)(0) = \{(x, y) : (x - 1)^2 + (y - 1)^2 \le \tfrac{1}{4}\}$. Here, the support of the fuzzy hypotenuse is $\widetilde{p}_2(0) = [\sqrt{2} - 1, \sqrt{2} + 1]$; the supports of the fuzzy perpendicular and fuzzy base sides are $\widetilde{p}_1(0) = \widetilde{p}_3(0) = [0, 2]$. Thus, $(\widetilde{p}_1)^2(0) + (\widetilde{p}_3)^2(0) = [0, 8] \ne [3 - 2\sqrt{2}, 3 + 2\sqrt{2}] = (\widetilde{p}_2)^2(0)$.
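2. For a fuzzy angle $\widetilde{\Theta}$, $\tan\widetilde{\Theta}$ cannot be written as the ratio of $\sin\widetilde{\Theta}$ and $\cos\widetilde{\Theta}$. For example, let us take the same fuzzy angle $\widetilde{\Theta}$ of Example 7. $\sin\widetilde{\Theta}$ is given in Example 7. $\cos\widetilde{\Theta}$ and $\tan\widetilde{\Theta}$ have the following membership functions:
$$\mu(c|\cos\widetilde{\Theta}) = \begin{cases} \dfrac{4(\pi - \cos^{-1}(c))}{3\pi} & \text{if } -1 \le c < \dfrac{1}{\sqrt{2}} \\[2mm] \dfrac{4\cos^{-1}(c)}{\pi} & \text{if } \dfrac{1}{\sqrt{2}} \le c \le 1 \\[2mm] 0 & \text{elsewhere} \end{cases}$$
and
$$\mu(t|\tan\widetilde{\Theta}) = \begin{cases} \dfrac{4\tan^{-1}(t)}{\pi} & \text{if } 0 \le t \le 1 \\[2mm] \dfrac{4(\pi - \tan^{-1}(t))}{3\pi} & \text{elsewhere.} \end{cases}$$

Fig. 5 Membership function of $\cos(\widetilde{\Theta})$

Fig. 6 Membership function of $\tan(\widetilde{\Theta})$

Figures 5 and 6 depict the membership functions of $\cos(\widetilde{\Theta})$ and $\tan(\widetilde{\Theta})$, respectively.

So, for each $\alpha \in [0, 1]$,
$$\frac{\sin\widetilde{\Theta}}{\cos\widetilde{\Theta}}(\alpha) = \left[-\sec\left(\tfrac{3\pi}{4}\alpha\right)\sin\left(\tfrac{\pi}{4}\alpha\right),\ \sec\left(\tfrac{\pi}{4}\alpha\right)\sin\left(\tfrac{3\pi}{4}\alpha\right)\right].$$
This $\alpha$-cut is not equal to the $\alpha$-cut of $\tan\widetilde{\Theta}$. For instance, if $\alpha = \tfrac{2}{3}$, then $\frac{\sin\widetilde{\Theta}}{\cos\widetilde{\Theta}}(\alpha) = \left(-\infty, \tfrac{2}{\sqrt{3}}\right]$, whereas $\tan\widetilde{\Theta}(\alpha) = \left[\tfrac{1}{\sqrt{3}}, \infty\right)$.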
3 3
3. For a fuzzy angle $\widetilde{\Theta}$ which is a vertex angle of a right-angled fuzzy triangle, $\sin\widetilde{\Theta}$ cannot be equal to the ratio of the fuzzy perpendicular side and the fuzzy hypotenuse side length. For a simple example, let us consider the fuzzy right-angled triangle taken in Point 1 to show that the Pythagorean law does not hold. In that triangle, $\widetilde{\angle P_1} = 45^\circ = \widetilde{\Theta}$ (say), the support of the fuzzy hypotenuse is $\widetilde{p}_2(0) = [\sqrt{2} - 1, \sqrt{2} + 1]$ and the support of the fuzzy perpendicular side is $\widetilde{p}_1(0) = [0, 2]$. Thus, $\sin\widetilde{\Theta}(0) = \tfrac{1}{\sqrt{2}} \ne [0, 4.8284] = \frac{\widetilde{p}_1}{\widetilde{p}_2}(0)$.

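4. For a fuzzy angle $\widetilde{\Theta}$, $\sin^2\widetilde{\Theta} + \cos^2\widetilde{\Theta}$ may not be equal to 1. The result follows trivially from the observation that $\sin^2\widetilde{\Theta} + \cos^2\widetilde{\Theta}$ is a fuzzy number which cannot always be a crisp number, viz. '1'. For a numerical example, let us take the angle $\widetilde{\Theta} = (0/\tfrac{\pi}{4}/\pi)$. We observe that $(\sin^2\widetilde{\Theta})(0) + (\cos^2\widetilde{\Theta})(0) = [0, 1] + [0, 1] = [0, 2] \ne 1$. Similarly, it can easily be noted that the identities $\sec^2\widetilde{\Theta} = 1 + \tan^2\widetilde{\Theta}$ and $\csc^2\widetilde{\Theta} = 1 + \cot^2\widetilde{\Theta}$ also may not hold.

Remark 4 Though $\sin^2\widetilde{\Theta} + \cos^2\widetilde{\Theta} \ne 1$, we note that if the length of the support of $\widetilde{\Theta}$ is greater than or equal to $\pi$, then $(\sin^2\widetilde{\Theta} + \cos^2\widetilde{\Theta})(0) = [0, 2]$. However, the core of the fuzzy number $(\sin^2\widetilde{\Theta} + \cos^2\widetilde{\Theta})$ is always 1.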

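5. For a fuzzy angle $\widetilde{\Theta}$, the identity $\sin^{-1}(\sin\widetilde{\Theta}) = \widetilde{\Theta}$ of the inverse circular function holds true. To prove the result, let $\theta_\alpha$ be the angle whose membership value on $\widetilde{\Theta}$ is $\alpha$. Now, we observe that
$$\sin^{-1}(\sin\widetilde{\Theta}) = \bigcup_{\alpha\in[0,1]} \sin^{-1}(\sin\theta_\alpha) = \bigcup_{\alpha\in[0,1]} \theta_\alpha = \widetilde{\Theta}.$$
Similar reasoning gives that $\cos^{-1}(\cos\widetilde{\Theta}) = \widetilde{\Theta}$, $\tan^{-1}(\tan\widetilde{\Theta}) = \widetilde{\Theta}$, etc.
6. Following the same way as in Point 5, the properties $\sin(-\widetilde{\Theta}) = -\sin\widetilde{\Theta}$, $\cos(-\widetilde{\Theta}) = \cos\widetilde{\Theta}$, $\sin^{-1}(\sin\widetilde{\Theta}) = \widetilde{\Theta}$, etc., can be proved to hold for a fuzzy angle $\widetilde{\Theta}$.
7. Periodic properties of trigonometric functions hold for fuzzy angles, e.g., $\sin(2\pi + \widetilde{\Theta}) = \sin\widetilde{\Theta}$, $\sin(\tfrac{\pi}{2} + \widetilde{\Theta}) = \cos\widetilde{\Theta}$, etc. The proof of these properties is the same as in Point 5.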
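8. Area of a fuzzy triangle may not be determined by the rule $\tfrac{1}{2}\widetilde{b}\,\widetilde{c}\sin\widetilde{A}$. Let us consider a fuzzy triangle constructed by three fuzzy points $\widetilde{A}$, $\widetilde{B}$, and $\widetilde{C}$. The lengths of the sides of the fuzzy triangle are $\widetilde{a} = \widetilde{D}(\widetilde{B}, \widetilde{C})$, $\widetilde{b} = \widetilde{D}(\widetilde{A}, \widetilde{C})$, $\widetilde{c} = \widetilde{D}(\widetilde{A}, \widetilde{B})$, and while computing these distances, combinations of distances between inverse points of the corresponding fuzzy points are taken into account. But the vertex angles $\widetilde{\angle A}$, $\widetilde{\angle B}$, etc., of the fuzzy triangle are evaluated by considering vertex angles of the crisp triangles whose vertices are same points with respect to the fuzzy points $\widetilde{A}$, $\widetilde{B}$, and $\widetilde{C}$. Apparently, in general the area $\widetilde{\Delta}$ of the fuzzy triangle cannot be equal to $\tfrac{1}{2}\widetilde{b}\,\widetilde{c}\sin\widetilde{A}$. For example, let us consider the fuzzy triangle $\widetilde{\Delta}ABC$ whose vertices are three fuzzy points having right circular cones as their membership functions, with support sets $\widetilde{A}(0) = \{(x, y) : (x - 2)^2 + y^2 \le \tfrac{1}{4}\}$, $\widetilde{B}(0) = \{(x, y) : (x - 2)^2 + (y - 2)^2 \le \tfrac{1}{4}\}$ and $\widetilde{C}(0) = \{(x, y) : x^2 + y^2 \le 1\}$. If the area of $\widetilde{\Delta}ABC$ is $\widetilde{\Delta}$, then
$$\widetilde{\Delta}(0) = \bigcup_{\theta\in[0,2\pi],\,\alpha\in[0,1]} \tfrac{1}{2}\left|4 - (1 - \alpha)\cos\theta\right| = [1.5, 2.5].$$
However,
$$\left(\tfrac{1}{2}\widetilde{b}\,\widetilde{c}\sin\widetilde{A}\right)(0) = \tfrac{1}{2}[0.5, 3.5][1, 3]\sin[56.3103^\circ, 123.6901^\circ] = \tfrac{1}{2}[0.5, 3.5][1, 3][0.8321, 1] = [0.2080, 5.2500] \ne [1.5, 2.5] = \widetilde{\Delta}(0).$$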
9. Sine law may not hold for a fuzzy triangle. The sine law for a fuzzy triangle may not hold with proper equality, i.e., $\widetilde{a}\sin\widetilde{B} \ne \widetilde{b}\sin\widetilde{A}$. For instance, let us consider the fuzzy triangle considered in Point 8 just above. Here, $(\widetilde{a}\sin\widetilde{B})(0) = [0.1716, 5.8284]\sin[39.8034^\circ, 140.1970^\circ] = [0.1716, 5.8284][0.6402, 1] = [0.1099, 5.8284]$, whereas $(\widetilde{b}\sin\widetilde{A})(0) = [0.5, 3.5]\sin[56.3103^\circ, 123.6901^\circ] = [0.5, 3.5][0.8321, 1] = [0.4160, 3.5] \ne [0.1099, 5.8284] = (\widetilde{a}\sin\widetilde{B})(0)$.
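The interval computations in Points 1 and 8 can be replayed with a few lines of interval arithmetic. This is a sketch under stated assumptions: the support intervals are taken as given in the text, products of nonnegative intervals are formed from endpoint products, and the sine of an interval of angles is evaluated only for intervals inside $[0^\circ, 180^\circ]$. The helper names are illustrative.

```python
import math

def interval_add(x, y):
    """Sum of two closed intervals given as (low, high)."""
    return x[0] + y[0], x[1] + y[1]

def interval_mul(x, y):
    """Product of two closed intervals from all endpoint products."""
    products = [a * b for a in x for b in y]
    return min(products), max(products)

def interval_sin_deg(lo, hi):
    """Range of sine over [lo, hi] in degrees, assuming 0 <= lo <= hi <= 180."""
    ends = (math.sin(math.radians(lo)), math.sin(math.radians(hi)))
    high = 1.0 if lo <= 90.0 <= hi else max(ends)
    return min(ends), high

# Point 1: squared legs versus squared hypotenuse (supports from the text).
p1 = p3 = (0.0, 2.0)
p2 = (math.sqrt(2) - 1, math.sqrt(2) + 1)
legs_squared = interval_add(interval_mul(p1, p1), interval_mul(p3, p3))
hypotenuse_squared = interval_mul(p2, p2)

# Point 8: support of (1/2) * b * c * sin(A) versus the area support [1.5, 2.5].
b, c = (0.5, 3.5), (1.0, 3.0)
sin_A = interval_sin_deg(56.3103, 123.6901)
area_rule = interval_mul(interval_mul((0.5, 0.5), b), interval_mul(c, sin_A))
```

With these inputs `legs_squared` is [0, 8] while `hypotenuse_squared` is about [0.1716, 5.8284], and `area_rule` is about [0.2080, 5.25], which is much wider than [1.5, 2.5].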

5 Conclusion

This paper discussed a few basic concepts of fuzzy triangles and fuzzy triangular properties. The sup-min composition of fuzzy sets and the concepts of same and inverse points are used throughout the discussion. We have studied basic ideas on the formation of a fuzzy triangle, its perimeter and area, and fuzzy trigonometric functions. Two different methods are proposed to find the perimeter of a fuzzy triangle; the less imprecise value may be preferred as the value of the fuzzy perimeter.
In the formation of the fuzzy triangle $\widetilde{\Delta}P_1P_2P_3$, we note that if we consider a line $l(x, y)$ perpendicular to $\widetilde{L}_{P_1P_2}(1)$ at $(x, y) \in \widetilde{L}_{P_1P_2}(1)$, then along the line $l(x, y)$ there must exist one fuzzy number on $(x, y) \in \widetilde{L}_{P_1P_2}(0)$ given by $l(x, y) \cap \widetilde{L}_{P_1P_2}$ [6]. We denote this fuzzy number by $\widetilde{l}_3(x, y)$. Thus, corresponding to each $(x, y)$, the function $\widetilde{l}_3(x, y)$ always gives one fuzzy number. Similarly, we get two functions $\widetilde{l}_1(x, y)$ and $\widetilde{l}_2(x, y)$ corresponding to $\widetilde{L}_{P_2P_3}$ and $\widetilde{L}_{P_1P_3}$. Now, taking the crisp triangle $P_1P_2P_3$ as prototype, the fuzzy triangle can be obtained by the $f$-transformation $(x, y) \to \widetilde{l}_1(x, y)$, $\widetilde{l}_2(x, y)$ or $\widetilde{l}_3(x, y)$. Thus, the defined concept of fuzzy triangle is similar to that of Zadeh [15]. According to the methodologies and definitions proposed, the measurement of the fuzzy area and fuzzy perimeter yields fuzzy numbers. The whole study of the fuzzy triangle has been made in a coordinate reference frame of $\mathbb{R}^2$ so that the imprecision present in the fuzzy triangle can be accounted for easily. Future research can focus on the study of fuzzy triangles in more general spaces.
Here, it is worth mentioning that the proposed definition of a fuzzy triangle and its properties can be easily generalized to obtain and analyze fuzzy polygons. Fuzzy polygons have applications in fuzzy optimization.
In defining trigonometric functions in a fuzzy environment, the proposed value of the sine of a fuzzy angle is exactly the same as the result obtained by direct use of the extension principle. Regarding fuzzy trigonometric properties, it is noted in Sect. 4 that almost all the trigonometric identities/rules which hold with proper equality in classical trigonometry do not hold with proper equality in fuzzy trigonometry. This happens because we have considered classical equality instead of fuzzy equality. To mend it, we may use a fuzzy equality. The literature on fuzzy equality relations offers several definitions, and which of them is appropriate here is not yet known. After obtaining an appropriate definition of a fuzzy equality relation, we may be able to investigate further. Future research work can focus on this topic.

References

1. Bogomolny, A.: On the perimeter and area of fuzzy sets. Fuzzy Sets Syst. 23, 257–269 (1987)
2. Buckley, J.J., Eslami, E.: Fuzzy plane geometry II: circles and polygons. Fuzzy Sets Syst. 87,
79–185 (1997)
3. Buckley, J.J., Eslami, E.: An Introduction to Fuzzy Logic and Fuzzy Systems. Physica-Verlag,
Heidelberg (2002)
4. Chakraborty, D., Ghosh, D.: Analytical fuzzy plane geometry II. Fuzzy Sets Syst. 243, 84–109
(2014)
5. Chaudhuri, B.B.: Some shape definitions in fuzzy geometry of space. Pattern Recogn. Lett. 12,
531–535 (1991)
6. Ghosh, D., Chakraborty, D.: Analytical fuzzy plane geometry I. Fuzzy Sets Syst. 209, 66–83
(2012)
7. Imran, B.M., Beg, M.M.S.: Estimation of f -similarity in f -triangles using fis. In: Meghanathan,
N., Chaki, N., Nagamalai, D. (eds.) CCSIT 2012, Part III LNICST, vol. 86, pp. 290–299.
Springer, Heidelberg (2012)
8. Imran, B.M., Beg, M.M.S.: Elements of sketching with words. In: Hu, X. (ed.) IEEE Interna-
tional Conference on Granular Computing, pp. 241–246. IEEE Computer Society, San Jose,
California, USA (2010)
9. Li, Q., Guo, S.: Fuzzy geometric object modelling. Fuzzy Inf. Eng. (ICFIE) ASC 40, 551–563
(2007)
10. Liu, H., Coghill, G.M.: Fuzzy qualitative trigonometry. Proceedings of the IEEE International
Conference on Systems, Man and Cybernetics, Hawaii, USA vol. 2, pp. 1291–1296 (2005)
11. Pham, B.: Representation of fuzzy shapes. In: Arcelli C., et al. (eds.) IWVF4, LNCS, Vol.
2059, pp. 239–248. Springer, Heidelberg (2001)
12. Rosenfeld, A.: Fuzzy plane geometry: triangles. Pattern Recogn. Lett. 15, 1261–1264 (1994)
13. Rosenfeld, A.: Fuzzy geometry: an updated overview. Inf. Sci. 110, 127–133 (1998)
14. Rosenfeld, A., Haber, S.: The perimeter of a fuzzy set. Pattern Recogn. 18, 125–130 (1985)
15. Zadeh, L.A.: Toward extended fuzzy logic-a first step. Fuzzy Sets Syst. 160, 3175–3181 (2009)
Chapter 28
An Extension Asymptotically
λ-Statistical Equivalent Sequences
via Ideals

Ekrem Savas and Rabia Savas

Abstract In (Savaş in Indian J Math 56(2):1–10 (2014) [27]), we examined asymptotically $\mathcal{I}_\lambda$-statistical equivalence of order $\alpha$, which is a natural combination of the definitions of asymptotic equivalence of order $\alpha$, where $0 < \alpha \le 1$, $\mathcal{I}$-statistical limit, and $\lambda$-statistical convergence. In this paper, we continue this study by proving some further results.

Keywords Asymptotically statistical equivalent · λ-asymptotically statistical

equivalent · Asymptotically equivalent of order α · Statistical limit points

1 Introduction and Background

Let $w$ be the set of all sequences of real or complex numbers and $\ell_\infty$, $c$, and $c_0$ be, respectively, the Banach spaces of bounded, convergent, and null sequences $x = (x_j)$ with the usual norm $\|x\| = \sup_j |x_j|$, where $j \in \mathbb{N} = \{1, 2, \ldots\}$, the set of positive integers.
The (relatively more general) concept of I-convergence was introduced by
Kostyrko et al. [10] in a metric space as a generalized form of the concept of sta-
tistical convergence, and it is based upon the notion of an ideal of the subset of the
set N of positive integers. This concept has been studied by many authors; see, for
instance, [18–20, 22–26].
The notion of the convergence of a real sequence has been extended to statistical convergence by Fast [7] (see also [29]) as follows: Let $E$ be a subset of $\mathbb{N}$. Then, the asymptotic density of $E$ is denoted by $\delta(E) := \lim_{n\to\infty} \frac{1}{n}|\{j \le n : j \in E\}|$, where the vertical bars denote the cardinality of the enclosed set. A number sequence $x = (x_j)$ is said to be statistically convergent to $\xi$ if for every $\varepsilon > 0$,

E. Savas (B)
Department of Mathematics, Usak University, Usak, Turkey
e-mail: [email protected]
R. Savas
Department of Mathematics, Sakarya University, Sakarya, Turkey
e-mail: [email protected]

D. Ghosh et al. (eds.), Mathematics and Computing, Springer Proceedings
in Mathematics & Statistics 253, https://doi.org/10.1007/978-981-13-2095-8_28
$\delta(\{j \in \mathbb{N} : |x_j - \xi| \ge \varepsilon\}) = 0$. If $(x_j)$ is statistically convergent to $\xi$, we write st-$\lim x_j = \xi$. Statistical convergence turned out to be one of the most active areas of research in summability theory after the works of Fridy [8], Nuray and Ruckle [15], and Šalát [17].
Let $\lambda = \{\lambda_p\}_{p\in\mathbb{N}}$ be a non-decreasing sequence of positive numbers tending to $\infty$ such that
$$\lambda_{p+1} \le \lambda_p + 1, \quad \lambda_1 = 1.$$

The collection of such sequences $\lambda$ will be denoted by $\Delta$. The idea of $\lambda$-statistical convergence was introduced and studied by Mursaleen [14]. Mursaleen defined $\lambda$-statistical convergence as follows: A sequence $(x_j)$ of real numbers is said to be $\lambda$-statistically convergent to $\xi$ (or $S_\lambda$-convergent to $\xi$) if for any $\varepsilon > 0$,
$$\lim_{p\to\infty} \frac{1}{\lambda_p}\left|\{j \in I_p : |x_j - \xi| \ge \varepsilon\}\right| = 0,$$
where $I_p = [p - \lambda_p + 1, p]$ and $|A|$ denotes the cardinality of $A \subset \mathbb{N}$.
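To make the quantity inside the limit concrete, here is a minimal sketch that evaluates $\frac{1}{\lambda_p}|\{j \in I_p : |x_j - \xi| \ge \varepsilon\}|$ for a single $p$, taking $I_p = [p - \lambda_p + 1, p]$ as in the paper. It assumes $\lambda_p$ is an integer so that $I_p$ can be enumerated, and it represents the sequence as a Python function of the index; both are choices made for illustration, as is the function name.

```python
import math

def lambda_ratio(x, xi, eps, p, lam_p):
    """(1/lam_p) * |{ j in I_p : |x(j) - xi| >= eps }| with I_p = [p - lam_p + 1, p].

    x is a function j -> x_j (indices start at 1); lam_p is assumed to be an integer.
    """
    start = p - lam_p + 1
    count = sum(1 for j in range(start, p + 1) if abs(x(j) - xi) >= eps)
    return count / lam_p

def indicator_of_squares(j):
    """x_j = 1 if j is a perfect square, 0 otherwise."""
    return 1.0 if math.isqrt(j) ** 2 == j else 0.0

# With lam_p = p the window I_p is [1, p], i.e. ordinary statistical convergence.
ratios = [lambda_ratio(indicator_of_squares, 0.0, 0.5, p, p) for p in (100, 10_000)]
```

For the indicator of the perfect squares the ratio is 10/100 at $p = 100$ and 100/10000 at $p = 10000$, consistent with the sequence being statistically convergent to 0 although it does not converge in the ordinary sense.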

$\lambda$-statistical convergence is a special case of $A$-statistical convergence, which was studied by Kolk in [9].
Later, Colak [2] introduced the notion of statistical convergence of order $\alpha$, $0 < \alpha \le 1$, by replacing $n$ by $n^\alpha$ in the denominator in the definition of statistical convergence. One can also see [1, 3–5] for related works.
Marouf [13] presented the definition of asymptotically equivalent sequences and asymptotic regular matrices. Further, in 1997, asymptotic equivalence of sequences and summability was studied by Li [12]. Also, Patterson [16] extended these concepts by using asymptotically statistical equivalence and natural regularity conditions for nonnegative summability matrices. Recently, asymptotically $\mathcal{I}_\lambda$-statistical equivalent sequences were studied by Gümüs and Savaş [6] (see also Kumar and Sharma [11]). $\mathcal{I}$-asymptotically lacunary statistical equivalent sequences and $\mathcal{I}$-asymptotically lacunary statistical equivalent sequences of order $\alpha$ were studied by Savaş [20, 28], and Savaş [21] also studied $\mathcal{I}_\lambda$-statistically convergent sequences in topological groups. Recently, Savaş [27] defined asymptotically $\mathcal{I}$-statistical equivalent sequences of order $\alpha$.
In the present paper, we continue to study the concept of asymptotically $\mathcal{I}_\lambda$-statistical equivalence of order $\alpha$. In addition, we study some more natural inclusion theorems.

2 Definitions and Preliminaries

The following definitions and notions will be needed in the sequel.

Definition 1 ([13]) Two nonnegative sequences $x = (x_j)$ and $y = (y_j)$ are said to be asymptotically equivalent if
$$\lim_j \frac{x_j}{y_j} = 1$$
(denoted by $x \sim y$).

Definition 2 ([8]) The sequence $x = (x_j)$ has statistical limit $\xi$, denoted by st-$\lim x = \xi$, provided that for every $\varepsilon > 0$,
$$\lim_n \frac{1}{n}\{\text{the number of } j \le n : |x_j - \xi| \ge \varepsilon\} = 0.$$

The next definition is a natural combination of Definitions 1 and 2.

Definition 3 ([16]) Two nonnegative sequences $x = (x_j)$ and $y = (y_j)$ are said to be asymptotically statistical equivalent of multiple $\xi$ provided that for every $\varepsilon > 0$,
$$\lim_n \frac{1}{n}\left\{\text{the number of } j < n : \left|\frac{x_j}{y_j} - \xi\right| \ge \varepsilon\right\} = 0$$
(denoted by $x \overset{S^\xi}{\sim} y$), and simply asymptotically statistical equivalent if $\xi = 1$.

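Definition 4 ([10]) A family $\mathcal{I} \subset 2^{\mathbb{N}}$ is said to be an ideal of $\mathbb{N}$ if the following conditions hold:
(a) $P, Q \in \mathcal{I}$ implies $P \cup Q \in \mathcal{I}$,
(b) $P \in \mathcal{I}$, $Q \subset P$ implies $Q \in \mathcal{I}$.

Definition 5 ([10]) A non-empty family $F \subset 2^{\mathbb{N}}$ is said to be a filter of $\mathbb{N}$ if the following conditions hold:
(a) $\emptyset \notin F$,
(b) $P, Q \in F$ implies $P \cap Q \in F$,
(c) $P \in F$, $P \subset Q$ implies $Q \in F$.

Definition 6 ([10]) A proper ideal $\mathcal{I}$ is said to be admissible if $\{n\} \in \mathcal{I}$ for each $n \in \mathbb{N}$.

Definition 7 (see [10]) Let $\mathcal{I} \subset 2^{\mathbb{N}}$ be a proper admissible ideal in $\mathbb{N}$. Then, the sequence $(x_j)$ of elements of $\mathbb{R}$ is said to be $\mathcal{I}$-convergent to $\xi \in \mathbb{R}$ if for each $\varepsilon > 0$, the set $K(\varepsilon) = \{j \in \mathbb{N} : |x_j - \xi| \ge \varepsilon\} \in \mathcal{I}$.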

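Definition 8 ([27]) The two nonnegative sequences $x = (x_j)$ and $y = (y_j)$ are said to be asymptotically $\mathcal{I}$-statistical equivalent of order $\alpha$ to multiple $\xi$ $(0 < \alpha \le 1)$, provided that for each $\varepsilon > 0$ and $\gamma > 0$
$$\left\{n \in \mathbb{N} : \frac{1}{n^\alpha}\left|\left\{j \le n : \left|\frac{x_j}{y_j} - \xi\right| \ge \varepsilon\right\}\right| \ge \gamma\right\} \in \mathcal{I}$$
(denoted by $x \overset{S^\xi(\mathcal{I})^\alpha}{\sim} y$), and simply asymptotically $\mathcal{I}$-statistical equivalent of order $\alpha$ if $\xi = 1$. Furthermore, let $S^\xi(\mathcal{I})^\alpha$ denote the set of $x$ and $y$ such that $x \overset{S^\xi(\mathcal{I})^\alpha}{\sim} y$.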

Remark 1 If I = Ifin = {B ⊆ N : B is a finite subset}, asymptotically I-statistical

equivalent of order α to multiple ξ reduces to asymptotically statistical equivalent
of order α to multiple ξ. For an arbitrary ideal I and for α = 1, it reduces to asymp-
totically I-statistical equivalent of multiple ξ (see [6]). When I = Ifin and α = 1, it
becomes only asymptotically statistical equivalent of multiple ξ, [16].

The following definition is given in [27].

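Definition 9 Let $\lambda = (\lambda_p) \in \Delta$. The two nonnegative sequences $x = (x_j)$ and $y = (y_j)$ are said to be asymptotically $\mathcal{I}_\lambda$-statistical equivalent of order $\alpha$ $(0 < \alpha \le 1)$ to multiple $\xi$ provided that for any $\varepsilon > 0$ and $\gamma > 0$
$$\left\{p \in \mathbb{N} : \frac{1}{\lambda_p^\alpha}\left|\left\{j \in I_p : \left|\frac{x_j}{y_j} - \xi\right| \ge \varepsilon\right\}\right| \ge \gamma\right\} \in \mathcal{I}$$
(denoted by $x \overset{S_\lambda^\xi(\mathcal{I})^\alpha}{\sim} y$), and simply asymptotically $\mathcal{I}_\lambda$-statistical equivalent of order $\alpha$ if $\xi = 1$. Furthermore, let $S_\lambda^\xi(\mathcal{I})^\alpha$ denote the set of $x$ and $y$ such that $x \overset{S_\lambda^\xi(\mathcal{I})^\alpha}{\sim} y$.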

Remark 2 If we take $\alpha = 1$, the above definition reduces to asymptotically $\mathcal{I}_\lambda$-statistical equivalent of multiple $\xi$ (see [6]). For $\mathcal{I} = \mathcal{I}_{fin}$, asymptotically $\lambda$-statistical equivalent of order $\alpha$ to multiple $\xi$ is a special case of asymptotically $\mathcal{I}_\lambda$-statistical equivalent of order $\alpha$ to multiple $\xi$.

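Definition 10 Let $\lambda = (\lambda_p) \in \Delta$, let $\alpha \in (0, 1]$ be any real number and let $r$ be a positive real number. Two nonnegative sequences $x = (x_j)$ and $y = (y_j)$ are strong $r$-asymptotically $\mathcal{I}_\lambda$-equivalent of order $\alpha$ to multiple $\xi$ provided that for any $\varepsilon > 0$
$$\left\{p \in \mathbb{N} : \frac{1}{\lambda_p^\alpha}\sum_{j\in I_p}\left|\frac{x_j}{y_j} - \xi\right|^r \ge \varepsilon\right\} \in \mathcal{I}$$
(denoted by $x \overset{V_\lambda^\xi(\mathcal{I})_r^\alpha}{\sim} y$), and simply strong $r$-asymptotically $\mathcal{I}_\lambda$-equivalent of order $\alpha$ if $\xi = 1$. Further, let $[V_\lambda^\xi](\mathcal{I})_r^\alpha$ denote the set of $x$ and $y$ such that $x \overset{V_\lambda^\xi(\mathcal{I})_r^\alpha}{\sim} y$.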

3 Main Results

In this section, we present the main theorems of this paper.


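Theorem 1 Let $\lambda = (\lambda_p) \in \Delta$ and $\alpha, \beta$ be fixed real numbers such that $0 < \alpha \le \beta \le 1$ and let $r$ be a positive real number. Then $[V_\lambda^\xi](\mathcal{I})_r^\alpha \subset S_\lambda^\xi(\mathcal{I})^\beta$ and the inclusion is strict.

Proof The inclusion part of the proof is easy. Taking $\lambda_p = p$ for all $p$, we prove the strictness of the inclusion $[V_\lambda^\xi](\mathcal{I})_r^\alpha \subset S_\lambda^\xi(\mathcal{I})^\beta$. For this, consider the sequence $x = (x_j)$ defined by
$$x_j = \begin{cases} 1, & \text{if } j = n^2 \\ 0, & \text{if } j \ne n^2 \end{cases} \qquad n = 1, 2, \ldots \tag{1}$$
and $y_j = 1$ for all $j$. Then, for every $\varepsilon > 0$ and $\alpha \in \left(\tfrac{1}{2}, 1\right]$, we have
$$\frac{1}{\lambda_p^\alpha}\left|\left\{j \in I_p : \left|\frac{x_j}{y_j} - 0\right| \ge \varepsilon\right\}\right| \le \frac{\sqrt{p}}{p^\alpha} = \frac{1}{p^{\alpha - \frac{1}{2}}}$$
and for any $\gamma > 0$, we get
$$\left\{p \in \mathbb{N} : \frac{1}{\lambda_p^\alpha}\left|\left\{j \in I_p : \left|\frac{x_j}{y_j} - 0\right| \ge \varepsilon\right\}\right| \ge \gamma\right\} \subseteq \left\{p \in \mathbb{N} : \frac{\sqrt{p}}{p^\alpha} \ge \gamma\right\}.$$
Since the set on the right-hand side is a finite set and so belongs to $\mathcal{I}$, it follows that $x_j \to 0\ (S_\lambda^\xi(\mathcal{I})^\alpha)$ for $\alpha \in \left(\tfrac{1}{2}, 1\right]$. On the other hand, for $\alpha \in \left(0, \tfrac{1}{2}\right]$, we have
$$\frac{\sqrt{p} - 1}{p^\alpha} \le \frac{1}{p^\alpha}\sum_{j\in I_p}\left|\frac{x_j}{y_j}\right|^r = \frac{1}{\lambda_p^\alpha}\sum_{j\in I_p}\left|\frac{x_j}{y_j} - 0\right|^r,$$
and so we have
$$\left\{p \in \mathbb{N} : \frac{\sqrt{p} - 1}{p^\alpha} \ge 1\right\} \subseteq \left\{p \in \mathbb{N} : \frac{1}{\lambda_p^\alpha}\sum_{j\in I_p}\left|\frac{x_j}{y_j} - 0\right|^r \ge 1\right\},$$
which belongs to $F(\mathcal{I})$, since $\mathcal{I}$ is admissible. So $x_j \nrightarrow 0\ ([V_\lambda^\xi](\mathcal{I})_r^\alpha)$.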
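The bounds used in the proof can be observed numerically. The following sketch takes $\lambda_p = p$ (so $I_p = [1, p]$), $y_j = 1$, $\xi = 0$ and $0 < \varepsilon \le 1$, as in the proof; under these choices both the counting quantity and the strong sum reduce to the number of perfect squares in $[1, p]$ divided by $p^\alpha$. The function name is an illustrative assumption.

```python
import math

def order_alpha_ratio(p, alpha):
    """Number of perfect squares in [1, p] divided by p**alpha.

    With lam_p = p, y_j = 1, xi = 0 and 0 < eps <= 1 this equals both
    (1/lam_p**alpha) * |{ j in I_p : |x_j/y_j - 0| >= eps }| and
    (1/lam_p**alpha) * sum over I_p of |x_j/y_j - 0|**r, because x_j is 0 or 1.
    """
    return math.isqrt(p) / p ** alpha

sizes = (10**2, 10**4, 10**6)
above_half = [order_alpha_ratio(p, 0.75) for p in sizes]  # exponent > 1/2: shrinks
at_half = [order_alpha_ratio(p, 0.5) for p in sizes]      # exponent = 1/2: stays at 1
below_half = [order_alpha_ratio(p, 0.25) for p in sizes]  # exponent < 1/2: grows
```

The three lists show the dichotomy the proof relies on: the ratio tends to 0 when the exponent exceeds $\tfrac{1}{2}$, and it stays bounded away from 0 when the exponent is at most $\tfrac{1}{2}$.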

Corollary 1 If two nonnegative sequences x = (xj ) and y = (yj ) are strong r-

asymptotically I λ -equivalent of order α to multiple ξ, then they are asymptotically
I λ -statistical equivalent of order α to multiple ξ.

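Even if $x = (x_j)$ and $y = (y_j)$ are bounded sequences, the converse of Theorem 1 does not hold in general. To show this, we must find two sequences that are bounded (that is, $x, y \in \ell_\infty$) with $x \overset{S_\lambda^\xi(\mathcal{I})^\alpha}{\sim} y$ but not necessarily $x \overset{V_\lambda^\xi(\mathcal{I})_r^\alpha}{\sim} y$, for some real numbers $\alpha$ and $\beta$ such that $0 < \alpha \le \beta \le 1$. For this, consider the sequence $x = (x_j)$ defined by (1) and $y_j = 1$ for all $j$. It can be shown that $x, y \in \ell_\infty$ are asymptotically $\mathcal{I}_\lambda$-statistical equivalent of order $\alpha$ to multiple $\xi$ for $\alpha \in \left(\tfrac{1}{3}, 1\right]$ and $x, y \notin [V_\lambda^\xi](\mathcal{I})_r^\alpha$ for $\alpha \in \left(0, \tfrac{1}{2}\right)$. Therefore, $x, y \in S_\lambda^\xi(\mathcal{I})^\beta \setminus [V_\lambda^\xi](\mathcal{I})_r^\alpha$ for $\alpha \in \left(\tfrac{1}{3}, \tfrac{1}{2}\right)$.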
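Theorem 2 Let $\alpha$ and $\beta$ be fixed real numbers such that $0 < \alpha \le \beta \le 1$ and let $r$ be a positive real number. Then $[V_\lambda^\xi](\mathcal{I})_r^\alpha \subseteq [V_\lambda^\xi](\mathcal{I})_r^\beta$ and the inclusion is strict.

Proof The inclusion part of the proof is given in [27]. Taking $\lambda_p = p$ for all $p$, we demonstrate the strictness of the inclusion $[V_\lambda^\xi](\mathcal{I})_r^\alpha \subseteq [V_\lambda^\xi](\mathcal{I})_r^\beta$ for a special case. Take the sequence as in (1). Then,
$$\frac{1}{\lambda_p^\beta}\sum_{j\in I_p}\left|\frac{x_j}{y_j} - 0\right|^r \le \frac{\sqrt{p}}{p^\beta} = \frac{1}{p^{\beta - 1/2}} \to 0 \quad (p \to \infty) \text{ for } \beta \in \left(\tfrac{1}{2}, 1\right],$$
but
$$\frac{1}{\lambda_p^\alpha}\sum_{j\in I_p}\left|\frac{x_j}{y_j} - 0\right|^r = \frac{1}{p^\alpha}\sum_{j\in I_p}\left|\frac{x_j}{y_j} - 0\right|^r \ge \frac{\sqrt{p} - 1}{p^\alpha} \to \infty \quad (p \to \infty) \text{ for } \alpha \in \left(0, \tfrac{1}{2}\right).$$
So $x \in [V_\lambda^\xi](\mathcal{I})_r^\beta$ for $\tfrac{1}{2} < \beta < 1$ but $x \notin [V_\lambda^\xi](\mathcal{I})_r^\alpha$ for $0 < \alpha < \tfrac{1}{2}$.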

The following result is a consequence of Theorem 2.

Corollary 2 Let $0 < \alpha \le 1$ be a positive real number and $\lambda \in \Delta$. Then, $[V_\lambda^\xi](\mathcal{I})_r^\alpha \subseteq [V_\lambda^\xi](\mathcal{I})_r$ for each $\alpha \in (0, 1]$.

Now, we shall prove some more inclusion relations.

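Theorem 3 Let $\lambda = (\lambda_p)$ and $\mu = (\mu_p)$ be two sequences in Δ such that $\lambda_p \le \mu_p$ for all p ∈ N, and let α and β be fixed real numbers such that 0 < α ≤ β ≤ 1.
(i) If
$$\liminf_{p \to \infty} \frac{\lambda_p^{\alpha}}{\mu_p^{\beta}} > 0 \tag{2}$$
then $S_\mu^{\xi}(I)^{\beta} \subseteq S_\lambda^{\xi}(I)^{\alpha}$,
(ii) If
$$\lim_{p \to \infty} \frac{\mu_p}{\lambda_p^{\beta}} = 1 \tag{3}$$
then $S_\lambda^{\xi}(I)^{\alpha} \subseteq S_\mu^{\xi}(I)^{\beta}$.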

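Proof (i) Suppose that $\lambda_p \le \mu_p$ for all p ∈ N, and let (2) be satisfied. For given ε > 0, we have

$$\left\{ j \in J_p : \left|\frac{x_j}{y_j} - \xi\right| \ge \varepsilon \right\} \supseteq \left\{ j \in I_p : \left|\frac{x_j}{y_j} - \xi\right| \ge \varepsilon \right\}$$

where $I_p = [p - \lambda_p + 1, p]$ and $J_p = [p - \mu_p + 1, p]$. Therefore, we can write

$$\frac{1}{\mu_p^{\beta}} \left|\left\{ j \in J_p : \left|\frac{x_j}{y_j} - \xi\right| \ge \varepsilon \right\}\right| \ge \frac{\lambda_p^{\alpha}}{\mu_p^{\beta}} \cdot \frac{1}{\lambda_p^{\alpha}} \left|\left\{ j \in I_p : \left|\frac{x_j}{y_j} - \xi\right| \ge \varepsilon \right\}\right|$$

and so for all p ∈ N, we have, for γ > 0,

$$\left\{ p \in \mathbb{N} : \frac{1}{\lambda_p^{\alpha}} \left|\left\{ j \in I_p : \left|\frac{x_j}{y_j} - \xi\right| \ge \varepsilon \right\}\right| \ge \gamma \right\} \subseteq \left\{ p \in \mathbb{N} : \frac{1}{\mu_p^{\beta}} \left|\left\{ j \in J_p : \left|\frac{x_j}{y_j} - \xi\right| \ge \varepsilon \right\}\right| \ge \gamma \frac{\lambda_p^{\alpha}}{\mu_p^{\beta}} \right\} \in I.$$

Hence, $S_\mu^{\xi}(I)^{\beta} \subseteq S_\lambda^{\xi}(I)^{\alpha}$.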
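(ii) Let $x = (x_j)$ and $y = (y_j) \in S_\lambda^{\xi}(I)^{\alpha}$ and (3) be satisfied. Since $I_p \subset J_p$, for ε > 0, we may write

$$\begin{aligned}
\frac{1}{\mu_p^{\beta}} \left|\left\{ j \in J_p : \left|\frac{x_j}{y_j} - \xi\right| \ge \varepsilon \right\}\right|
&= \frac{1}{\mu_p^{\beta}} \left|\left\{ p - \mu_p + 1 < j \le p - \lambda_p : \left|\frac{x_j}{y_j} - \xi\right| \ge \varepsilon \right\}\right| + \frac{1}{\mu_p^{\beta}} \left|\left\{ j \in I_p : \left|\frac{x_j}{y_j} - \xi\right| \ge \varepsilon \right\}\right| \\
&\le \frac{\mu_p - \lambda_p}{\mu_p^{\beta}} + \frac{1}{\lambda_p^{\beta}} \left|\left\{ j \in I_p : \left|\frac{x_j}{y_j} - \xi\right| \ge \varepsilon \right\}\right| \\
&\le \frac{\mu_p - \lambda_p^{\beta}}{\lambda_p^{\beta}} + \frac{1}{\lambda_p^{\alpha}} \left|\left\{ j \in I_p : \left|\frac{x_j}{y_j} - \xi\right| \ge \varepsilon \right\}\right| \\
&\le \left(\frac{\mu_p}{\lambda_p^{\beta}} - 1\right) + \frac{1}{\lambda_p^{\alpha}} \left|\left\{ j \in I_p : \left|\frac{x_j}{y_j} - \xi\right| \ge \varepsilon \right\}\right|
\end{aligned}$$

for all p ∈ N. Hence, we have

$$\left\{ p \in \mathbb{N} : \frac{1}{\mu_p^{\beta}} \left|\left\{ j \in J_p : \left|\frac{x_j}{y_j} - \xi\right| \ge \varepsilon \right\}\right| \ge \gamma \right\} \subseteq \left\{ p \in \mathbb{N} : \frac{1}{\lambda_p^{\alpha}} \left|\left\{ j \in I_p : \left|\frac{x_j}{y_j} - \xi\right| \ge \varepsilon \right\}\right| \ge \gamma \right\} \in I.$$

This implies that $S_\lambda^{\xi}(I)^{\alpha} \subseteq S_\mu^{\xi}(I)^{\beta}$.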

From Theorem 3, we have the following.


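Corollary 3 Let $\lambda = (\lambda_p)$ and $\mu = (\mu_p)$ be two sequences in Δ such that $\lambda_p \le \mu_p$ for all p ∈ N. If (2) holds, then
(i) $S_\mu^{\xi}(I)^{\alpha} \subseteq S_\lambda^{\xi}(I)^{\alpha}$ for each α ∈ (0, 1],
(ii) $S_\mu^{\xi}(I) \subseteq S_\lambda^{\xi}(I)^{\alpha}$ for each α ∈ (0, 1],
(iii) $S_\mu^{\xi}(I) \subseteq S_\lambda^{\xi}(I)$.

Corollary 4 Let $\lambda = (\lambda_p)$ and $\mu = (\mu_p)$ be two sequences in Δ such that $\lambda_p \le \mu_p$ for all p ∈ N. If (3) holds, then
(i) $S_\lambda^{\xi}(I)^{\alpha} \subseteq S_\mu^{\xi}(I)^{\alpha}$ for each α ∈ (0, 1],
(ii) $S_\lambda^{\xi}(I)^{\alpha} \subseteq S_\mu^{\xi}(I)$ for each α ∈ (0, 1],
(iii) $S_\lambda^{\xi}(I) \subseteq S_\mu^{\xi}(I)$.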

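Theorem 4 Let $\lambda = (\lambda_p)$ and $\mu = (\mu_p)$ be two sequences in Δ such that $\lambda_p \le \mu_p$ for all p ∈ N, and let α and β be fixed real numbers such that 0 < α ≤ β ≤ 1.
(i) If (2) holds, then $[V_\mu^{\xi}](I)^{\beta}_{r} \subset [V_\lambda^{\xi}](I)^{\alpha}_{r}$,
(ii) If (3) holds and $x, y \in \ell_\infty$, then $[V_\lambda^{\xi}](I)^{\alpha}_{r} \subset [V_\mu^{\xi}](I)^{\beta}_{r}$.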

Proof (i) Omitted.

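(ii) Let $x, y \in [V_\lambda^{\xi}](I)^{\alpha}_{r}$, and suppose that (3) holds. Since $x = (x_j)$, $y = (y_j) \in \ell_\infty$, there exists some M > 0 such that $\left|\frac{x_j}{y_j} - \xi\right| \le M$ for all j. Now, since $I_p \subseteq J_p$ and $\lambda_p \le \mu_p$ for all p ∈ N, we may write

$$\begin{aligned}
\frac{1}{\mu_p^{\beta}} \sum_{j \in J_p} \left|\frac{x_j}{y_j} - \xi\right|^{r}
&= \frac{1}{\mu_p^{\beta}} \sum_{j \in J_p - I_p} \left|\frac{x_j}{y_j} - \xi\right|^{r} + \frac{1}{\mu_p^{\beta}} \sum_{j \in I_p} \left|\frac{x_j}{y_j} - \xi\right|^{r} \\
&\le \frac{\mu_p - \lambda_p}{\mu_p^{\beta}} M^{r} + \frac{1}{\mu_p^{\beta}} \sum_{j \in I_p} \left|\frac{x_j}{y_j} - \xi\right|^{r} \\
&\le \frac{\mu_p - \lambda_p^{\beta}}{\lambda_p^{\beta}} M^{r} + \frac{1}{\lambda_p^{\beta}} \sum_{j \in I_p} \left|\frac{x_j}{y_j} - \xi\right|^{r} \\
&\le \left(\frac{\mu_p}{\lambda_p^{\beta}} - 1\right) M^{r} + \frac{1}{\lambda_p^{\alpha}} \sum_{j \in I_p} \left|\frac{x_j}{y_j} - \xi\right|^{r}
\end{aligned}$$

for all p ∈ N. So we have

$$\left\{ p \in \mathbb{N} : \frac{1}{\mu_p^{\beta}} \sum_{j \in J_p} \left|\frac{x_j}{y_j} - \xi\right|^{r} \ge \gamma \right\} \subseteq \left\{ p \in \mathbb{N} : \frac{1}{\lambda_p^{\alpha}} \sum_{j \in I_p} \left|\frac{x_j}{y_j} - \xi\right|^{r} \ge \gamma \right\} \in I.$$

Therefore, $[V_\lambda^{\xi}](I)^{\alpha}_{r} \subset [V_\mu^{\xi}](I)^{\beta}_{r}$.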

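Corollary 5 Let $\lambda = (\lambda_p)$ and $\mu = (\mu_p)$ be two sequences in Δ such that $\lambda_p \le \mu_p$ for all p ∈ N. If (2) holds, then
(i) $[V_\mu^{\xi}](I)^{\alpha}_{r} \subset [V_\lambda^{\xi}](I)^{\alpha}_{r}$ for each α ∈ (0, 1],
(ii) $[V_\mu^{\xi}](I)_{r} \subset [V_\lambda^{\xi}](I)^{\alpha}_{r}$ for each α ∈ (0, 1],
(iii) $[V_\mu^{\xi}](I)_{r} \subset [V_\lambda^{\xi}](I)_{r}$.

Corollary 6 Let $\lambda = (\lambda_p)$ and $\mu = (\mu_p)$ be two sequences in Δ such that $\lambda_p \le \mu_p$ for all p ∈ N. If (3) holds and $x, y \in \ell_\infty$, then
(i) $[V_\lambda^{\xi}](I)^{\alpha}_{r} \subset [V_\mu^{\xi}](I)^{\alpha}_{r}$ for each α ∈ (0, 1],
(ii) $[V_\lambda^{\xi}](I)^{\alpha}_{r} \subset [V_\mu^{\xi}](I)_{r}$ for each α ∈ (0, 1],
(iii) $[V_\lambda^{\xi}](I)_{r} \subset [V_\mu^{\xi}](I)_{r}$.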
Finally, we conclude this paper by presenting the following theorem.

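Theorem 5 Let $\lambda = (\lambda_p)$ and $\mu = (\mu_p)$ be two sequences in Δ such that $\lambda_p \le \mu_p$ for all p ∈ N, and let α and β be fixed real numbers such that 0 < α ≤ β ≤ 1 and 0 < r < ∞. Then, we have
(i) If (2) holds and $x \overset{V_\lambda^{\xi}(I)^{\alpha}_{r}}{\sim} y$, then $x \overset{S_\lambda^{\xi}(I)^{\alpha}}{\sim} y$,
(ii) If (3) holds, $x = (x_j)$ and $y = (y_j)$ are two bounded sequences, and $x \overset{S_\lambda^{\xi}(I)^{\alpha}}{\sim} y$, then $x \overset{V_\lambda^{\xi}(I)^{\alpha}_{r}}{\sim} y$.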
Proof (i) Omitted.
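(ii) Suppose that $x \overset{S_\lambda^{\xi}(I)^{\alpha}}{\sim} y$, that $x = (x_j)$ and $y = (y_j)$ are bounded, and that ε > 0 is given. Since $x = (x_j)$ and $y = (y_j)$ are bounded, there exists an integer M such that $\left|\frac{x_j}{y_j} - \xi\right| \le M$ for all j; then, we may write

$$\begin{aligned}
\frac{1}{\mu_p^{\beta}} \sum_{j \in J_p} \left|\frac{x_j}{y_j} - \xi\right|^{r}
&= \frac{1}{\mu_p^{\beta}} \sum_{j \in J_p - I_p} \left|\frac{x_j}{y_j} - \xi\right|^{r} + \frac{1}{\mu_p^{\beta}} \sum_{j \in I_p} \left|\frac{x_j}{y_j} - \xi\right|^{r} \\
&\le \frac{\mu_p - \lambda_p}{\mu_p^{\beta}} M^{r} + \frac{1}{\mu_p^{\beta}} \sum_{j \in I_p} \left|\frac{x_j}{y_j} - \xi\right|^{r} \\
&\le \frac{\mu_p - \lambda_p^{\beta}}{\mu_p^{\beta}} M^{r} + \frac{1}{\mu_p^{\beta}} \sum_{j \in I_p} \left|\frac{x_j}{y_j} - \xi\right|^{r} \\
&\le \left(\frac{\mu_p}{\lambda_p^{\beta}} - 1\right) M^{r} + \frac{1}{\lambda_p^{\beta}} \sum_{\substack{j \in I_p \\ |x_j/y_j - \xi| \ge \varepsilon}} \left|\frac{x_j}{y_j} - \xi\right|^{r} + \frac{1}{\lambda_p^{\beta}} \sum_{\substack{j \in I_p \\ |x_j/y_j - \xi| < \varepsilon}} \left|\frac{x_j}{y_j} - \xi\right|^{r} \\
&\le \left(\frac{\mu_p}{\lambda_p^{\beta}} - 1\right) M^{r} + \frac{M^{r}}{\lambda_p^{\alpha}} \left|\left\{ j \in I_p : \left|\frac{x_j}{y_j} - \xi\right| \ge \varepsilon \right\}\right| + \frac{\mu_p}{\lambda_p^{\beta}} \varepsilon^{r}
\end{aligned}$$

for all p ∈ N. So we have, for γ > 0,

$$\left\{ p \in \mathbb{N} : \frac{1}{\mu_p^{\beta}} \sum_{j \in J_p} \left|\frac{x_j}{y_j} - \xi\right|^{r} \ge \gamma \right\} \subseteq \left\{ p \in \mathbb{N} : \frac{1}{\lambda_p^{\alpha}} \left|\left\{ j \in I_p : \left|\frac{x_j}{y_j} - \xi\right| \ge \varepsilon \right\}\right| \ge \frac{\gamma}{M^{r}} \right\} \in I.$$

Using (3), we obtain that x = (x_j) is strong r-asymptotically $I_\lambda$-equivalent of order α to multiple ξ, whenever $x \overset{S_\lambda^{\xi}(I)^{\alpha}}{\sim} y$.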

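Corollary 7 Let $\lambda = (\lambda_p)$ and $\mu = (\mu_p)$ be two sequences in Δ such that $\lambda_p \le \mu_p$ for all p ∈ N. If (2) holds and α ∈ (0, 1], then
(i) If $x \overset{V_\mu^{\xi}(I)^{\alpha}_{r}}{\sim} y$, then $x \overset{S_\lambda^{\xi}(I)^{\alpha}}{\sim} y$,
(ii) If $x \overset{V_\mu^{\xi}(I)_{r}}{\sim} y$, then $x \overset{S_\lambda^{\xi}(I)^{\alpha}}{\sim} y$,
(iii) If $x \overset{V_\mu^{\xi}(I)_{r}}{\sim} y$, then $x \overset{S_\lambda^{\xi}(I)}{\sim} y$.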

References

1. Bhunia, S., Das, P., Pal, S.: Restricting statistical convergence. Acta Math. Hungar 134(1–2),
153–161 (2012)
2. Colak, R.: Statistical Convergence of Order α, Modern Methods in Analysis and its Applica-
tions, pp. 121–129. Anamaya Publishers, New Delhi (2010)
3. Colak, R., Bektas, C.A.: λ-statistical convergence of order α. Acta Math. Scientia 31B(3),
953–959 (2011)
4. Das, P., Savaş, E.: On I -statistical and I -lacunary statistical convergence of order α. Bull.
Iranian Soc. 40(2), 459–472 (2014)
5. Et, M., Çınar, M., Karakaş, M.: On λ-statistical convergence of order α of sequences of function.
J. Inequal. Appl. 2013, 204 (2013)
6. Gumus, H., Savas, E.: On $S_\lambda^{L}(I)$-asymptotically statistical equivalent sequences. In: Numer. Analy.
Appl. Math. (ICNAAM), AIP conference proceeding, vol. 1479, pp. 936–941 (2012)
7. Fast, H.: Sur la convergence statistique. Colloq. Math. 2, 241–244 (1951)
8. Fridy, J.A.: On statistical convergence. Analysis 5, 301–313 (1985)
9. Kolk, E.: The statistical convergence in Banach spaces. Acta Comment. Univ. Tartu 928, 41–52
(1991)
10. Kostyrko, P., Šalát, T., Wilczyński, W.: I-convergence. Real Anal. Exchange 26(2), 669–685
(2000/2001)
11. Kumar, V., Sharma, A.: On asymptotically generalized statistical equivalent sequences via
ideal. Tamkang J. Math. 43(3), 469–478 (2012)
12. Li, J.: Asymptotic equivalence of sequences and summability. Int. J. Math. Math. Sci. 20(4),
749–758 (1997)
13. Marouf, M.: Asymptotic equivalence and summability. Int. J. Math. Math. Sci. 16(4), 755–762
(1993)
14. Mursaleen, M.: λ-statistical convergence. Math. Slovaca 50, 111–115 (2000)
15. Nuray, F., Ruckle, W.H.: Generalized statistical convergence and convergence free spaces. J.
Math. Anal. Appl. 245(2), 513–527 (2000)
16. Patterson, R.F.: On asymptotically statistically equivalent sequences. Demonstratio Math.
36(1), 149–153 (2003)

17. Šalát, T.: On statistically convergent sequences of real numbers. Math. Slovaca 30, 139–150
(1980)
18. Savaş, E., Das, P.: A generalized statistical convergence via ideals. Appl. Math. Lett. 24, 826–
830 (2011)
19. Savaş, E., Das, P., Dutta, S.: A note on strong matrix summability via ideals. Appl. Math Lett.
25(4), 733–738 (2012)
20. Savaş, E.: On I -asymptotically lacunary statistical equivalent sequences. Adv. Differ. Equ.
2013, 2013:111 (18 April 2013)
21. Savaş, E.: On Iλ -statistically convergent sequences in topological groups. Acta Comment.
Univ. Tartu. Math. 18(1), 33–38 (2014)
22. Savaş, E.: Δm -strongly summable sequence spaces in 2-normed spaces defined by ideal con-
vergence and an Orlicz function. Appl. Math. Comput. 217, 271–276 (2010)
23. Savaş, E.: A-sequence spaces in 2-normed space defined by ideal convergence and an Orlicz
function. Abst. Appl. Anal. 2011 Article ID 741382 (2011)
24. Savaş, E.: On some new sequence spaces in 2-normed spaces using Ideal convergence and an
Orlicz function. J. Ineq. Appl. Article Number 482392
25. Savaş, E.: On generalized double statistical convergence via ideals. In: The Fifth Saudi Science
Conference 16–18 April 2012
26. Savaş, E.: On generalized A-difference strongly summable sequence spaces defined by ideal
convergence on a real n-normed space. J. Ineq. Appl. 2012, 87 (2012)
27. Savaş, E.: On asymptotically I -statistical equivalent sequences of order. Indian J. Math., Special
Volume Dedicated to Professor Billy E. Rhoades 56(2) 1–10 (2014)
28. Savaş, E.: On asymptotically I -lacunary statistical equivalent sequences of order α. In: The
2014 International Conference on Pure and Applied Mathematics, Venice, Italy, March 15–17
2014
29. Schoenberg, I.J.: The integrability of certain functions and related summability methods. Amer.
Math. Monthly 66, 361–375 (1959)
Chapter 29
Fuzzy Goal Programming Approach
for Resource Allocation in an NGO
Operation

Vinaytosh Mishra, Tanmoy Som, Cherian Samuel and S. K. Sharma

Abstract Diabetes is a major health challenge in India. The lifetime cost of treatment
in managing the disease is very high. India currently lacks the infrastructure
and resources to meet the demand created by the sudden surge of the disease. This
situation makes it imperative to optimally allocate the resources so that the treatment
can be made available to a maximum number of patients at affordable cost. This paper
uses fuzzy goal programming with exponential membership function for resource
allocation. The human and financial resources are described with fuzzy conditions for
determining the future strategies for unknown situations. A fuzzy goal programming
model is demonstrated using the case study of an NGO working in the area of
awareness and treatment of diabetes in Varanasi.

Keywords Resource allocation · Fuzzy goal programming · Diabetes · NGO · Exponential membership function

1 Introduction

The number of people living with diabetes is increasing exponentially in India [1].
The disease has become a major health challenge in the country in the last decade [2].
Such is the prevalence of the disease in the country that it is called the diabetes
capital of the world [3]. The chronic nature of the disease makes the treatment
extremely costly. The cost of treatment of diabetes can be divided into two categories,
namely direct cost and indirect cost [4]. The direct cost includes the expenses related
to treatment, while indirect cost includes the loss of productivity. In addition to this
cost, there is an intangible cost which includes reduced quality of life due to pain,
anxiety, and stress [5]. The studies indicate that there has been a significant increase in
the cost of diabetes management in recent times [6, 7]. With the progression of the
disease, the cost of treatment increases manyfold because of comorbidities [8–10].

V. Mishra (B) · T. Som · C. Samuel · S. K. Sharma

Indian Institute of Technology (BHU), Varanasi 221005, India

Studies suggest that the increasing cost of treatment results in low adherence to
medication regimens [11, 12]. Roebuck et al. concluded that improved
medication adherence by people with diabetes produced substantial medical savings
as a result of reductions in hospitalization and emergency department use [13]. India
lags in healthcare infrastructure, and the numbers of doctors and beds per patient are far
below World Health Organization (WHO) guidelines. Non-governmental
organizations (NGOs) can play an important role in bridging this gap. NGOs
have limited resources and need to allocate them optimally to
maximize social welfare.

1.1 Cost of Treatment

There is an acute shortage of hospital beds and doctors in India, and more than 50%
of the ambulatory care is provided by private players. The country has witnessed
spiraling medical expenses in recent years. According to a National Sample Survey
Office (NSSO) report, consumer expenditure on healthcare in rural India increased
from 6.6% in 2004–05 to 6.9% in 2011–12, and urban Indians’ expenditure on
medical care increased from 5.2% in 2004–05 to 5.5% in 2011–12. Medicines
constitute 70% of this cost. Once diagnosed, a diabetes patient undergoes
treatment for the rest of his or her life, which results in a high lifetime
cost of treatment of diabetes. The average cost of treatment per diabetes patient per
hospital admission, with and without multiple complications, is 314.15 (USD) and
29.1 (USD), respectively, out of which 255.32 (USD) falls under the direct cost of
treatment of the disease [7]. Table 1 provides the details of the constituents of
the direct cost of treatment of diabetes.
From Table 1, we can conclude that reducing the risk of hospitalization
can significantly reduce the cost of treatment of the disease. Self-management
education can help patients reduce the risk of hospitalization in diabetes [14].
Another measure suggested in the literature for reducing hospitalization
risk is income tax exemption [15].

Table 1 Direct medical cost per patient per hospitalization

Component of cost             Average cost (USD)    % of total cost
Lab investigations            29.45                 10.15
Medication for diabetes       7.00                  2.42
Medication for comorbidity    69.46                 23.94
Hospitalization               143.75                49.56
Doctor’s consultation         40.37                 13.93
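
The cost shares in Table 1 can be recomputed directly from the component costs. The following is a minimal sketch using only the five values listed in the table; the variable names are illustrative, and recomputed shares may differ from the printed ones in the last digit because of rounding.

```python
# Component costs in USD, copied from Table 1.
direct_cost_usd = {
    "Lab investigations": 29.45,
    "Medication for diabetes": 7.00,
    "Medication for comorbidity": 69.46,
    "Hospitalization": 143.75,
    "Doctor's consultation": 40.37,
}

# Sum of the listed components.
total_usd = sum(direct_cost_usd.values())

# Percentage share of each component in the listed total.
share_pct = {name: 100 * cost / total_usd for name, cost in direct_cost_usd.items()}

# Component with the largest share.
largest = max(share_pct, key=share_pct.get)

for name, pct in share_pct.items():
    print(f"{name:<28}{pct:6.2f}%")
print("Largest component:", largest)
```

Hospitalization accounts for roughly half of the listed direct cost, which is why measures that reduce hospitalization risk have the largest effect on the cost of treatment.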

1.1.1 Healthcare Finance Models

There are three main models for healthcare finance on the basis of their funding.
The first one is the Beveridge model [16], which is based on taxation and has many
public providers. The second is the Bismarck ‘mixed’ model [17], funded by a mix of
government and insurance providers. Finally, the ‘private insurance model’ in which
the cost of the treatment is borne by the health insurance provider. Health insurance
helps to spread the cost of treatment over a large time period. Properly designed and
administered health insurance can act as a bridge between patients and providers
balancing quality care at reasonable costs [18].
India has one of the largest private health sectors in the world, with over 80%
of ambulatory care being supported through out-of-pocket expenses [19, 20]. Out-
of-pocket (OOP) expenditure on health care has significant implications for poverty
in many developing countries [21]. In India, three-fourths of healthcare expenses
are met through out-of-pocket spending. Government spending on health care has
been paltry as a percentage of GDP compared with other developing and
developed countries. India spends only 5% of its annual gross domestic product (GDP) on
health care [22].
Diabetes patients need affordable, quality health care, self-management
education, and insurance coverage to meet the cost of diabetes management. The
government in India has not been able to develop adequate infrastructure and support to
manage the sudden surge in the number of diabetes patients [23]. Despite the recent thrust
to improve the healthcare infrastructure in India, inequalities related to
socioeconomic status, geography, and gender still persist. This situation is further aggravated
by high out-of-pocket expenditures [24].

2 Role of an NGO in Health care

Nonprofit organizations can be registered in India as a society under the Registrar
of Societies (Society Act 1860), as a trust by making a trust deed, or as a Section 8
Company under the Companies Act, 2013 [25]. They can work in capacity
building and policy shaping or ensure long-term results in healthcare areas. They work
in partnership with communities, health institutions, donors, academicians, and
governments to achieve these results. They fund their activities through international
funding, government funding, local philanthropy, and income-generating activities.
NGOs carry out a range of projects including emergency management and relief;
healthcare research; designing and implementing alternative funding and insurance
schemes; mobilization, advocacy, awareness raising, health campaigns, and protection
of patients’ rights; and balancing private players’ interests. NGOs can fill the gap
in diabetes care by working in areas like disease awareness, free consultation and
checkup camps, and funding for diabetic patients who are unable to meet their
healthcare expenses. They can also work to bring transparency and efficiency
to the healthcare supply chain so that medicine reaches patients at affordable cost.
Thus, NGOs can bridge the gap between demand and supply of
diabetes care, but they need to efficiently utilize the scarce resources available at
their disposal to maximize the welfare of the patients [26].
To the best of our knowledge, no study has addressed resource
allocation in cardio-diabetes management. This study attempts to fill this
gap. It uses a case study of an Indian NGO working in the area of cardio-diabetes
awareness and treatment in the eastern part of India. The study can be used
as a reference by NGOs and government bodies working in the area of diabetes
management.

3 Resource Allocation in Healthcare

Healthcare resources are limited, and demand exceeds supply, so the allocation of
resources becomes a challenge in healthcare. In the case of private healthcare
providers, the allocation is resolved on the basis of the ability to pay. Allocation
on the basis of ability to pay is against the principle of healthcare equity [27].
This section discusses the various approaches to resource allocation found in the
literature.

3.1 Hippocratic Model

According to this model, the focus of medical action revolves around the physician-
patient encounter. It establishes a fiduciary relationship between the physician and the
patient, which means that the physician’s duty toward the individual patient overrides
all other considerations except insofar as these affect the physician’s ability to fulfill
her or his patient-related duties [28–30].

3.2 Social Service Model

This model sees the health care in much broader perspective and considers medicine
as one among several social enterprises of which the overall purpose is to advance
the well-being of members of society [31, 32]. In this approach, the allocation issues
assume an entirely different nature. Although the physician-patient encounter still
remains an element of fiduciary duty, that element is limited by the constraints per-
taining to social welfare maximization [33, 34].
There is a need for a coherent and consistent model for healthcare resource allocation.
To the best of our knowledge, the literature lacks a quantitative model for the
allocation of healthcare resources. There is also a lack of a case-based approach
to resource allocation for Indian NGOs. This paper attempts to fill this gap.

3.3 Business Model

This model considers health care as neither a fiduciary undertaking nor a health-
oriented profession that operationalizes society’s duty to do the best for its members
[35]. The healthcare provider ethically works for the value maximization of its share-
holder [36].

4 Methodology

Goal programming (GP) is an important approach to multi-objective decision
making. In a standard GP formulation, goals and constraints
are defined precisely [37, 38]. In healthcare, the system aims and conditions include
vague and undetermined situations, as every healthcare event is unique and involves
uncertainties. The study uses the fuzzy membership function suggested by Turgay
and Taşkın [38] and proposes a fuzzy goal programming (FGP) model for optimizing
the resource allocation of an NGO working in the area of cardio-diabetes management and
education.
The heart of the methodology of fuzzy linear programming (FLP) lies in the construction
of membership functions for objective coefficients, technical coefficients, resource variables,
and decision variables [39]. The reasons for selecting the exponential membership function
in FLP are as follows: (1) it transforms into a linear membership function when dealing
with nonlinear aggregate operators, and (2) it is more realistic than the linear membership
function and has been successfully used for resource allocation in health care and
other industries [37, 38, 40, 41] (Figs. 1 and 2).
The exponential membership function depends on the fuzzy restriction given to a
fuzzy goal of the problem in a fuzzy decision-making situation. Let $t_{ln}$ and $t_{un}$ be the
lower- and upper-tolerance ranges considered, respectively, for the achievement of
the aspired level $b_n$ of the nth fuzzy goal. Then, the exponential membership function
for the fuzzy goal $F_n(x)$ having lower-tolerance limit $(b_n - t_{ln})$ and upper-tolerance
limit $(b_n + t_{un})$ can be given as follows:

Fig. 1 Exponential membership function for minimization objective

Fig. 2 Exponential membership function for maximization objective

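$$\mu_n(x) = \begin{cases}
1, & \text{if } F_n(x) \le b_n \\
1 + \dfrac{e^{-\alpha_i (b_n - F_n(x))/t_n(x)} - e^{-\alpha_i}}{1 - e^{-\alpha_i}}, & \text{if } b_n \le F_n(x) \le b_n + t_{un} \\
0, & \text{if } F_n(x) \ge b_n + t_{un}
\end{cases} \tag{1}$$

and

$$\mu_n(x) = \begin{cases}
1, & \text{if } F_n(x) \ge b_n \\
\dfrac{e^{-\alpha_i (b_n - F_n(x))/t_{un}(x)} - e^{-\alpha_i}}{1 - e^{-\alpha_i}}, & \text{if } b_n - t_{ln} \le F_n(x) \le b_n \\
0, & \text{if } F_n(x) \le b_n - t_{ln}
\end{cases} \tag{2}$$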

The exponential membership function-based fuzzy goal programming with upper
and lower level conditions can be presented as follows:
Maximize λ, subject to:

$$\frac{e^{-\alpha_i (b_n - F_n(x))/t_n(x)} - e^{-\alpha_i}}{1 - e^{-\alpha_i}} \le \lambda, \quad n = 1, 2, \ldots, N; \tag{3}$$

$$\sum_{i=1}^{n} x_{ij} = 1, \quad j = 1, 2, \ldots, N; \quad \lambda \ge 0 \tag{4}$$

$$x_{ij} = \begin{cases}
1, & \text{if the } i\text{th resource is assigned to the } j\text{th task} \\
0, & \text{if the } i\text{th resource is not assigned to the } j\text{th task}
\end{cases} \tag{5}$$

Minimize λ, subject to:

$$\frac{e^{-\alpha_i} - e^{-\alpha_i (b_n - F_n(x))/t_n(x)}}{1 - e^{-\alpha_i}} \ge \lambda, \quad n = 1, 2, \ldots, N; \tag{6}$$

$$\sum_{i=1}^{n} x_{ij} = 1, \quad j = 1, 2, \ldots, N; \quad \lambda \ge 0 \tag{7}$$

$$x_{ij} = \begin{cases}
1, & \text{if the } i\text{th resource is assigned to the } j\text{th task} \\
0, & \text{if the } i\text{th resource is not assigned to the } j\text{th task}
\end{cases} \tag{8}$$

The slack variables are minimized on the basis of the importance of achieving the
aspired goal levels in the decision-making context. The
fuzzy goal programming model of the problem under a preemptive priority structure
can be presented as follows:

$$\text{Min } Z = \left[ P_1(d^-), P_2(d^-), \ldots, P_i(d^-) \right] \tag{9}$$

$$\frac{e^{-\alpha_i} - e^{-\alpha_i (b_n - F_n(x))/t_n(x)}}{1 - e^{-\alpha_i}} + d_n^- - d_n^+ = 1 \tag{10}$$

$$1 - \frac{e^{-\alpha_i} - e^{-\alpha_i (b_n - F_n(x))/t_n(x)}}{1 - e^{-\alpha_i}} + d_n^- - d_n^+ = 1 \tag{11}$$

$$d_n^-, d_n^+ \ge 0, \quad n = 1, 2, \ldots, N \tag{12}$$

In the above formulation, Z represents the vector of i priority achievement functions,
and $d_n^-$, $d_n^+$ are the slack (under-deviational) and surplus (over-deviational)
variables of the nth goal. $P_i(d^-)$ is a linear function of the weighted under-deviational
variables, of the form:

$$P_i(d^-) = \sum_{n=1}^{N} w_{in}^- \, d_{in}^-, \quad d_{in}^- \ge 0, \quad (n = 1, 2, \ldots, N) \tag{13}$$

where $d_{in}^-$ is the under-deviational variable $d_n^-$ at the ith priority level and $w_{in}^-$
is the weight associated with it. Here,
the numerical weight is the weight of importance of achieving the aspired level of the
nth goal relative to others which are grouped together at the ith priority level [42].
The model uses the concept of preemptive priorities of the goals: the ith priority level
is preferred over the (i + 1)th irrespective of the weights associated with them.
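
The effect of preemptive priorities can be illustrated with a lexicographic comparison. The sketch below is not part of the authors' model: the three candidate plans and their deviation values are hypothetical, and it relies only on the fact that Python compares tuples element by element from left to right.

```python
# Hypothetical candidate solutions, each summarized by its weighted
# under-deviations (P1, P2): P1 for the first-priority goal, P2 for the second.
candidates = {
    "plan_a": (0.0, 0.9),
    "plan_b": (0.1, 0.0),
    "plan_c": (0.0, 1.4),
}

# Tuples compare lexicographically, so P1 is minimized first and P2 is
# used only to break ties among plans with the same P1.
best = min(candidates, key=candidates.get)
print(best)
```

Here plan_b has the smallest second-priority deviation but is rejected because its first-priority deviation is larger; no weight on the second goal can offset that, which is what "preemptive" means.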

5 Model Construction

5.1 Parameter Definition

The objective of the research is to obtain a solution which minimizes the service cost
as well as patient complaints. Since the objective of the NGO is to include a maximum
number of patients in the social welfare program without compromising
on the service quality, the first objective has a higher priority. The model includes
variable costs such as salary, cost of equipment, cost of medicines, and other relevant
costs (variable operating expenses). The parameters of the model are defined
below:

Table 2 Decision variable values for objective functions

Variable department       Demand per month (D_it)   Capacity per month (U_it)   Flexibility of the service (F_it) (%)   Target patients (B_it)
Endocrinology (x1)        90                        150                         15                                      100
Cardiology (x2)           100                       150                         10                                      75
Internal medicine (x3)    200                       150                         15                                      100

p_i     Number of IPD patients in each service
r_i     Cost of care per inpatient stay
D_i     Demand of each service
U_i     Capacity of each service
P       Total budget
F       Flexibility of service quota allocation
B_i     Number of patients targeted for each service
W1_ti   Number of physicians in each service in period t
W2_ti   Number of nurses in each service in period t
W3_ti   Number of technicians in each service in period t
CS1_i   Salary of physicians of department i in period t
CS2_i   Salary of nurses of department i in period t
CS3_i   Salary of technicians of department i in period t
CM_i    Medication cost per patient in department i
CE_i    Equipment cost per patient in department i
CO_i    Other relevant cost per patient in department i
a_i     Number of patients arrived in each service
Decision variables for the objective functions and constraints are decided by taking
input from the case organization and are listed in Tables 2 and 3, respectively.

5.2 Problem Statement

The first objective function minimizes the total cost to serve, while the second objective
function is related to the minimization of the total patient complaints. Using
the definition of the parameters in the earlier section, the two objective functions of the
study can be written as below:
Minimize total service cost:

$$Z_1 = \sum_{i} \sum_{t} \left[ (W1_{ti})(CS1_i) + (W2_{ti})(CS2_i) + (W3_{ti})(CS3_i) + CM_i + CE_i + CO_i \right] \tag{14}$$

The total complaints can be written as the sum product of arrived patients in each service
and the complaints per patient. The objective function related to the minimization
of total patient complaints is given as:

$$Z_2 = 300x_1 + 350x_2 + 400x_3 \tag{15}$$

5.3 Constraints

Constraint 1: Constraint on the demand for the healthcare services

$$\sum_{i=1}^{n} x_i = D \ (\text{when } D \prec U) \quad \text{or} \quad \sum_{i=1}^{n} x_i = U \ (\text{when } D \succ U)$$

Constraint 2: Capacity constraint for the healthcare services

$$x_i \le U_i \quad \text{for } i = 1, 2, 3$$

Constraint 3: Total budget constraint for the healthcare services

$$\sum_{i=1}^{n} r_i x_i \le P$$

Constraint 4: A constraint on the flexibility for the service quota allocation

$$\sum_{i=1}^{n} f_i x_i \ge F$$

Constraint 5: Nonnegativity constraint for all allocation quantities

$$x_i \ge 0$$

6 Objective Function

We assume that α is equal to 0.05, that the maximum targeted service cost for the month is
200,000, and that the current resource utilization is 0.02 for each of the services. The
flexibility of the overall service is required to be more than 5% by design. The NGO
also aims to serve at least 150 patients in the given time period (month). The study
considers a tolerance range of 10% for all the objectives, as suggested by
experts. Each of the services should be at least 10% utilized for the CSR, while
the upper limit for the same is 20%. Given the above assumptions, the membership
function for objective one (minimization of the service cost) can be written as:

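$$\mu(Z_1) = \begin{cases}
1, & \text{if } Z_1 < 180{,}000 \\
1 - \dfrac{e^{-0.05 (200{,}000 - Z_1)/(200{,}000 - 180{,}000)} - e^{-0.05}}{1 - e^{-0.05}}, & \text{if } 180{,}000 \le Z_1 \le 200{,}000 \\
0, & \text{if } Z_1 > 200{,}000
\end{cases} \tag{16}$$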

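The membership function for the second objective, minimizing patient complaints, is given as follows:

$$\mu(Z_2) = \begin{cases}
1, & \text{if } Z_2 < 0.045 \\
1 - \dfrac{e^{-0.05 (0.5 - Z_2)/(0.5 - 0.045)} - e^{-0.05}}{1 - e^{-0.05}}, & \text{if } 0.045 \le Z_2 \le 0.5 \\
0, & \text{if } Z_2 > 0.5
\end{cases} \tag{17}$$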
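
A minimal sketch of the membership function in the form of (16) is given below; the function name `cost_membership` and its argument names are illustrative and not from the paper. It returns 1 below the lower limit, 0 above the upper limit, and the exponential expression in between.

```python
import math


def cost_membership(z: float, lower: float, upper: float, alpha: float = 0.05) -> float:
    """Exponential membership for a minimization goal, in the form of Eq. (16)."""
    if z < lower:
        return 1.0
    if z > upper:
        return 0.0
    # Normalized distance of z from the upper limit: 1 at lower, 0 at upper.
    ratio = (upper - z) / (upper - lower)
    fraction = (math.exp(-alpha * ratio) - math.exp(-alpha)) / (1 - math.exp(-alpha))
    return 1.0 - fraction


# Limits of the service-cost goal: 180,000 and 200,000.
mu_low = cost_membership(180_000, 180_000, 200_000)
mu_mid = cost_membership(190_000, 180_000, 200_000)
mu_high = cost_membership(200_000, 180_000, 200_000)
print(mu_low, mu_mid, mu_high)
```

The function is continuous at both limits, and with α = 0.05 the midpoint value stays close to 0.5, so the curve is nearly linear over the tolerance range.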

Finally, the resource allocation model for the NGO can be formulated as below:

$$\text{Max } f(\mu) = \mu_1 + \mu_2 \tag{18}$$

For a small exponent, the exponential function can be approximated by a linear
function as:

$$\begin{aligned}
\mu_1 &: 10 - 0.00005 Z_1 + d_1^- - d_1^+ = 1 \\
\mu_2 &: 10 - 0.5 Z_2 + d_2^- - d_2^+ = 1 \\
& x_i, d_i^-, d_i^+ \ge 0
\end{aligned} \tag{19}$$
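
The coefficients of the linearized service-cost goal can be traced with a first-order expansion. The derivation below is a sketch that assumes the membership form of (16) and the approximation $e^{-x} \approx 1 - x$ for small x; the shorthand u is introduced here only for illustration.

```latex
\[
\mu(Z_1) = 1 - \frac{e^{-\alpha u} - e^{-\alpha}}{1 - e^{-\alpha}},
\qquad u = \frac{200{,}000 - Z_1}{200{,}000 - 180{,}000}, \quad \alpha = 0.05 .
\]
\[
e^{-\alpha u} \approx 1 - \alpha u, \quad e^{-\alpha} \approx 1 - \alpha
\;\Longrightarrow\;
\mu(Z_1) \approx 1 - \frac{\alpha (1 - u)}{\alpha} = u
= \frac{200{,}000 - Z_1}{20{,}000} = 10 - 0.00005\, Z_1 .
\]
```

This reproduces the term $10 - 0.00005 Z_1$ in the first goal constraint.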

Other constraints:

$$90x_1 + 100x_2 + 200x_3 \le 275 \tag{20}$$
$$150x_1 + 150x_2 + 150x_3 \le 275 \tag{21}$$
$$400x_1 + 500x_2 + 300x_3 \le 200 \tag{22}$$
$$0.15x_1 + 0.10x_2 + 0.15x_3 \ge 0.05 \tag{23}$$
$$0.1 \le x_1 \le 0.2; \quad 0.1 \le x_2 \le 0.2; \quad 0.1 \le x_3 \le 0.2 \tag{24}$$

7 Results and Discussion

Using the information given in Tables 2 and 3 and problem statement, we can modify
the values of Z 1 and Z 2 as below:

Table 3 Decision variable values for constraints

      W1_ti   W2_ti   W3_ti   CS1_i   CS2_i   CS3_i   CM_i   CE_i   CO_i   a_i
X1    1       4       2       60      10      12      0.2    0.1    0.1    300
X2    1       3       2       60      10      12      0.15   0.1    0.1    350
X3    2       4       2       60      8       12      0.1    0.8    0.1    400
Note All costs are taken in thousands

Table 4 Result for the model

Variables     Results
X1            0.2
X2            0.12
X3            0.2
d1−           0
d2−           0.9
Z1 (INR)      151,000
Z2 (Nos)      182

$$Z_1 = 274x_1 + 271.5x_2 + 316x_3 \tag{25}$$

$$Z_2 = 300x_1 + 350x_2 + 400x_3 \tag{26}$$

The problem was solved preemptively, with the service cost kept at a
higher priority than the patient complaints. Excel Solver was used to solve the linear
programming problem with the simplex method; the results are given in
Table 4.
This solution brings the service cost down to 151 thousand. The optimal
solution suggests that a total of 182 patients are served during the given time period.
The patients served by the endocrinology, cardiology, and internal medicine departments
number 60, 42, and 80, respectively. A different set of objectives may give a different
allocation of the resources.
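
The reported figures can be reproduced from the allocation in Table 4. The sketch below evaluates (25) and (26) at X1 = 0.2, X2 = 0.12, X3 = 0.2 and checks constraints (20) to (24); the helper `dot` and the tolerance `TOL` are illustrative choices, not part of the original model.

```python
# Allocation reported in Table 4 (x1, x2, x3).
x = [0.2, 0.12, 0.2]

TOL = 1e-9  # tolerance for floating-point comparisons


def dot(coeffs, values):
    """Sum of coefficient-by-value products."""
    return sum(c * v for c, v in zip(coeffs, values))


z1 = dot([274, 271.5, 316], x)   # Eq. (25), in thousands
z2 = dot([300, 350, 400], x)     # Eq. (26), number of patients
patients = [round(a * xi) for a, xi in zip([300, 350, 400], x)]

# Constraints (20)-(24) evaluated at the reported allocation.
feasible = (
    dot([90, 100, 200], x) <= 275 + TOL
    and dot([150, 150, 150], x) <= 275 + TOL
    and dot([400, 500, 300], x) <= 200 + TOL
    and dot([0.15, 0.10, 0.15], x) >= 0.05 - TOL
    and all(0.1 - TOL <= xi <= 0.2 + TOL for xi in x)
)

# Slack left in constraint (22).
budget_slack = 200 - dot([400, 500, 300], x)

print(round(z1, 2), round(z2), patients, feasible, round(budget_slack, 6))
```

Z1 evaluates to 150.58 thousand, which rounds to the reported 151 thousand, and Z2 to 182 patients split as 60, 42, and 80. Constraint (22) holds with equality at this allocation, which is consistent with X2 stopping at 0.12 rather than at its upper limit of 0.2.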

8 Conclusions

The study proposes and uses a fuzzy goal programming approach as a quantitative
method for resource allocation in a healthcare organization. As suggested by Turgay
and Taşkın [38], the exponential membership function was used for the study. The
reason behind the selection of the method is that it represents real-life
scenarios better than a linear function. Moreover, it can be easily converted into a linear
approximation for a small value of alpha. The case study suggests that, for the given
objectives, the optimal solution may be different from the most obvious solutions;
hence, a quantitative model can help in solving a resource allocation
problem. Qualitative allocation is usually very personal to the people involved in
the allocation and is therefore very subjective and quite unreliable. Quantitative
models are preferred over qualitative models because they are objective, based
on data and facts, and therefore impersonal. This model is easy to use and can
be adopted in other similar organizations involved in chronic care for diseases such as
diabetes, asthma, tuberculosis, and HIV.

References

1. Mishra, V., Samuel, C., Sharma, S.K.: Use of machine learning to predict the onset of diabetes.
Int. J. Recent Adv. Mech. Eng. (IJMECH) 4(2) (2015)
2. Mishra, V., Samuel, C., Sharma, S.K.: Visualization of perceived expensiveness of diabetes-
fuzzy MDS approach. In: 2016 IEEE Uttar Pradesh Section International Conference on Elec-
trical, Computer and Electronics Engineering (UPCON), pp. 67–71. IEEE, New York (2016
December)
3. Joshi, S.R., Parikh, R.M.: India; the diabetes capital of the world: now heading towards hyper-
tension. J.-Assoc. Physicians India 55(5), 323 (2007)
4. Jönsson, B.: Revealing the cost of type II diabetes in Europe. Diabetologia 45(7), S5–S12
(2002)
5. Bjork, S., Kapur, A., Sylvest, C., Kumar, D., Kelkar, S., Nair, J.: The economic burden of
diabetes in India: results from a national survey. Diabetes Res. Clin. Pract. 50, 190 (2000)
6. Kapur, A.: Economic analysis of diabetes care. Indian J. Med. Res. 125(3), 473 (2007)
7. Akari, S., Mateti, U.V., Kunduru, B.R.: Health-care cost of diabetes in South India: a cost of
illness study. J. Res. Pharm. Pract. 2(3), 114 (2013)
8. Al-Maskari, F., El-Sadig, M., Nagelkerke, N.: Assessment of the direct medical costs of diabetes
mellitus and its complications in the United Arab Emirates. BMC Public Health 10(1), 679
(2010)
9. Henriksson, F., Agardh, C.D., Berne, C., Bolinder, J., Lönnqvist, F., Stenström, P., Jönsson,
B.: Direct medical costs for patients with type 2 diabetes in Sweden. J. Intern. Med. 248(5),
387–396 (2000)
10. Hogan, P., Dall, T., Nikolov, P.: Economic costs of diabetes in the US in 2002. Diabetes Care
26(3), 917 (2003)
11. Sokol, M.C., McGuigan, K.A., Verbrugge, R.R., Epstein, R.S.: Impact of medication adherence
on hospitalization risk and healthcare cost. Med. Care 43(6), 521–530 (2005)
12. Ho, P.M., Bryson, C.L., Rumsfeld, J.S.: Medication adherence. Circulation 119(23), 3028–3035
(2009)
13. Roebuck, M.C., Liberman, J.N., Gemmill-Toyama, M., Brennan, T.A.: Medication adherence
leads to lower health care use and costs despite increased drug spending. Health Aff. 30(1),
91–99 (2011)
14. Norris, S.L., Lau, J., Smith, S.J., Schmid, C.H., Engelgau, M.M.: Self-management education
for adults with type 2 diabetes. Diabetes Care 25(7), 1159–1171 (2002)
15. Newhouse, J.P.: Medical care costs: how much welfare loss? J. Econ. Perspect. 6(3), 3–21
(1992)
16. Beveridge, R.: CAEP issues. J. Emerg. Med. 16, 507–511 (1998)
17. Vienonen, M.A., Wlodarczyk, W.C.: Health care reforms on the European scene: evolution,
revolution or seesaw? World Health Statistics Quarterly. Rapport trimestriel de statistiques
sanitaires mondiales 46(3), 166–169 (1993)
18. Srinivasan, R.: Health insurance in India. Health Population Perspect. Issues 24(2), 65–72
(2001)
29 Fuzzy Goal Programming for Resource Allocation 385

19. Duggal, R.: Poverty & health: criticality of public financing. Indian J. Med. Res. 126(4), 309
(2007)
20. Gangolli, L.V., Duggal, R., Shukla, A.: Review of Healthcare in India. Centre for Enquiry into
Health and Allied Themes, Mumbai (2005)
21. Berman, P., Ahuja, R., Bhandari, L.: The impoverishing effect of healthcare payments in India:
new methodology and findings. Econ. Political Wkly. 65–71 (2010)
22. Prinja, S., Bahuguna, P., Pinto, A.D., Sharma, A., Bharaj, G., Kumar, V., Kumar, R.: The cost
of universal health care in India: a model based estimate. PLoS ONE 7(1), e30362 (2012)
23. Patil, A.V., Somasundaram, K.V., Goyal, R.C.: Current health scenario in rural India. Aust. J.
Rural Health 10(2), 129–135 (2002)
24. Balarajan, Y., Selvaraj, S., Subramanian, S.V.: Health care and equity in India. The Lancet
377(9764), 505–515 (2011)
25. Ganesh, S.: The myth of the non-governmental organization: governmentality and transnation-
alism in an Indian NGO. Int. Multicultural Organ. Commun. 7, 193–219 (2005)
26. Delisle, H., Roberts, J.H., Munro, M., Jones, L., Gyorkos, T.W.: The role of NGOs in global
health research for development. Health Res. Policy Syst. 3(1), 3 (2005)
27. Kluge, E.H.W.: Resource allocation in healthcare: implications of models of medicine as a
profession. Medscape Gen. Med. 9(1), 57 (2007)
28. Veatch, R.M.: The Principle of Avoiding Killing. The Basics of Bioethics, pp. 88–104. Prentice
Hall, Upper Saddle River, NJ (2003)
29. Beauchamp, T.L., Childress, J.F.: Principles of Biomedical Ethics. Oxford University Press,
New York (2001)
30. Tauber, A.I.: Patient autonomy and the ethics of responsibility (2005)
31. Cruess, S.R.: Professionalism and medicine’s social contract with society. Clin. Orthoped.
Relat. Res. 449, 170–176 (2006)
32. Bernardin, J.C.: Renewing the covenant with patients and society. Linacre Q. 63(1), 3–10
(1996)
33. Daniels, N.: Just Health Care. Cambridge University Press, Cambridge (1985)
34. Freidson, E.: Profession of Medicine: A Study of the Sociology of Applied Knowledge. Uni-
versity of Chicago Press, Chicago (1988)
35. Hui, E.C.: The contractual model of the patient-physician relationship and the demise of medical
professionalism. Hong Kong Med. J. (2005)
36. Carroll, C.D., Manderscheid, R.W., Daniels, A.S., Compagni, A.: Convergence of service,
policy, and science toward consumer-driven mental health care. J. Mental Health Policy Econ.
9(4), 185–192 (2006)
37. Iskander, M.G.: Exponential membership functions in fuzzy goal programming: a computa-
tional application to a production problem in the textile industry. Am. J. Comput. Appl. Math.
5(1), 1–6 (2015)
38. Turgay, S., Taşkın, H.: Fuzzy goal programming for health-care organization. Comput. Ind.
Eng. 86, 14–21 (2015)
39. Rubin, P.A., Narasimhan, R.: Fuzzy goal programming with nested priorities. Fuzzy Sets Syst.
14(2), 115–129 (1984)
40. Carlsson, C., Korhonen, P.: A parametric approach to fuzzy linear programming. Fuzzy Sets
Syst. 20(1), 17–30 (1986)
41. Li, R.J., Lee, E.S.: An exponential membership function for fuzzy multiple objective linear
programming. Comput. Math. Appl. 22(12), 55–60 (1991)
42. Zimmermann, H.J.: Decision making in ill-structured environments and with multiple criteria.
In: Readings in Multiple Criteria Decision Aid, pp. 119–151. Springer, Berlin (1990)
Chapter 30
Stoichio Simulation of FACSP From
Graph Transformations to Differential
Equations

J. Philomenal Karoline, P. Helen Chandra, S. M. Saroja Theerdus Kalavathy

and A. Mary Imelda Jayaseeli

Abstract In this paper, a methodology for deriving ordinary differential equations
(ODEs) by means of graph transformation techniques is developed for Michaelis–Menten
kinetics. The approach is based on a variant of the construction of critical pairs. It
has been executed using the AGG tool and validated for FACSP.

Keywords Rate of reaction · Fuzzy artificial cell system · Parallel conflicts ·
Sequential dependencies · Stoichiometric matrix · Place transition net

1 Introduction

Multiset processing is a simple technique that is easy for biologists to use, in
contrast to most continuous models and simulation systems. Abstract Rewriting System
on Multisets (ARMS), a class of P systems based on multiset processing but with
a simple membrane structure, was introduced with the aim of modelling chemical
systems. It is a stochastic model in which rules are applied probabilistically [1].
In particular, ARMS is based on stoichiometric chemistry, and if the number of
elements in the system is large, the behaviour of the system is similar to that of
models based on differential equations [2].

J. Philomenal Karoline · P. Helen Chandra (B) · S. M. Saroja Theerdus Kalavathy · A. Mary

Imelda Jayaseeli
Jayaraj Annapackiam College for Women (Autonomous),
Periyakulam, Theni, Tamil Nadu, India
e-mail: [email protected]
J. Philomenal Karoline
e-mail: [email protected]
S. M. Saroja Theerdus Kalavathy
e-mail: [email protected]
A. Mary Imelda Jayaseeli
e-mail: [email protected]

In [3], a new device, Fuzzy ARMS in Artificial Cell System with Proteins on
Membrane (FACSP), is developed and its structure is analysed in terms of its
parameters. In [4], a methodology has been developed to model Michaelis–Menten
kinetic reaction networks in terms of DPO graph transformation.
In [5], chemical reaction kinetics is rephrased in terms of stochastic graph
transformations, and the ODEs that describe the evolution of the concentrations of
chemical species over time are derived. The derivation is based on stochastic graph
transformation [6], which combines rules that capture the reactive behaviour of the
system with a specification of the rate constants governing the speed at which the
reactions occur.
It is of great interest to study the dynamical properties of FACSP, and we
therefore apply mathematical methods developed for analysing differential
equations.
The paper is organized as follows: first, background and related work are given.
Then the molecular representation of FACSP is discussed, and critical pair analysis
of the DPO graph transformation rules is carried out using the AGG tool. Finally,
the stoichiometric matrix and the incidence matrix of the PT net are obtained.

2 Preliminaries

In [5], a stoichiometric matrix is obtained which relates each elementary reaction to
each molecular species in the system by the aggregate effect the reaction has on the
population of that species. The rate laws are extracted, and a rate law vector of
length n (one entry per reaction) is produced. Multiplying the stoichiometric matrix
by this vector produces a system of ordinary differential equations:

d[X]/dt = S · R (1)

where d[X]/dt is the derivative, with respect to time t, of the concentration of a
chemical species X in the system, S is the stoichiometric matrix, and R is the rate
law vector.
In [7], the translation of Petri nets whose transitions are labelled by rate constants
into differential equations is discussed.

2.1 The Graph Transformation System [5]

A type graph representing molecules using graph transformations is discussed in
[5]. Here atoms are represented as square nodes. The round nodes are atom-specific
bonding nodes. A bond between atoms is represented by an edge between two of
these bonding nodes. Each bonding node is connected to only one atom node. The
formal definitions of a typed graph transformation system and an Accountable GTS are
also given in [5].

Fig. 1 a Oxidation of sulphides and b evolution rules for FACSP

2.2 FACSP (Fuzzy Artificial Cell Systems With Proteins On Membranes) [3]

Oxidation of Sulphides: The oxidation of aryl methyl sulphides using iron–salen
complexes as catalyst in the presence of hydrogen peroxide as oxidant has been
followed kinetically and is described in [8]. Chemically, the reaction takes place in
two steps: first the intermediate oxo compound of the catalyst is formed, and then the
substrate is oxidized and the catalyst is regenerated. The general reaction rule is
presented in (a), and its structure is shown in Fig. 1a.

(a). Z + X(F3)X → X(F4O)X;
X(F4O)X + Y-RSR → X(F3)X + Y-RSOR

In [3], the authors carried out catalytic reactions of aryl methyl sulphides, varying
the substituent at Y over the groups H, Cl, Br, CH3, OCH3, F and NO2. In the case
X = H, with Y ranging over these seven substituents, (a) consists of seven reaction
rules.
Fuzzy ARMS in Artificial Cell Systems with Proteins on Membranes (FACSP)
has been introduced, in which the evolution rules (fuzzy rewriting rules) are the seven
reaction rules and the fuzzy data are the oxidant, catalyst and substrate (Fig. 1b).

3 Graph Transformations for FACSP

We present the molecular representation of FACSP using a graph transformation
system and the derivation of ordinary differential equations for the reactions through
critical pair analysis using the AGG tool.

3.1 Molecular Representation of FACSP Using Graphs

Let us consider the first evolution rule (R11) in FACSP from Sect. 2. The structure of
the corresponding reaction rule is shown in Fig. 2, which describes the formation of
the intermediate iron (IV)–oxo salen complex of the parent molecule. The complex acts
as a catalyst for the oxidation of phenyl methyl sulphide to phenyl methyl sulphoxide.
At the end of the reaction, the catalyst, iron (III)–salen complex, is regenerated.
The species hydrogen peroxide (Z), iron (III)–salen complex (A1), iron (IV)–oxo
salen complex (B), phenyl methyl sulphide (S1) and phenyl methyl sulphoxide (P1)
in Fig. 2 are represented as molecules. Each molecule consists of bonds that connect
two atoms. The intuitive representation of molecules consists of atoms as nodes and
bonds as edges that directly connect them.
The type graph is produced in AGG (Fig. 3a) for all atoms and groups in FACSP.
In this type graph, atoms and groups such as O, Fe, N, S, H, C, Cl, Br, F, CH3, OCH3
and NO2 are represented as square nodes, each distinct species having its own node
type. The round nodes are atom-specific bonding nodes. All bonding

Fig. 2 Reaction rule R11

Fig. 3 a Type graph for FACSP and b type graph for the molecule hydrogen peroxide

Fig. 4 Graph depicting starting materials for FACSP, produced in AGG

nodes are subtypes of the generic bond node type. Finally, the atoms chlorine (Cl),
bromine (Br) and fluorine (F) are grouped as halogens, denoted by X.
A bond connecting H and O is represented by an edge [arrows with filled arrowheads
(Fig. 3b)]. The bonding node of oxygen is connected to the atom node O, and that of
hydrogen is connected to the atom node H. Oxygen has two bonds, satisfying its valency
of two. The atoms and groups C, CH3 and OCH3 have the same bonding node type (C)
associated with them.
In our problem, atom C is less electronegative than atoms N and S, and C and N are
less electronegative than O. H is the least electronegative of all the atoms. Thus, a
bond between H and any other atom is directed away from H.
The type graph contains C and H node types, and so the methyl group is represented
as a single CH3 node type. The critical pair analysis constructs an overlap between
the graphs on the left-hand side and right-hand side of the evolution rules.
Expressed in terms of C atom nodes, H atom nodes, C bonding nodes and H bonding
nodes, the CH3 group would constitute a total of 10 nodes.
The type graphs are drawn (Fig. 4) for the starting materials (Z), (A1) and (S1),
namely hydrogen peroxide, iron (III)–salen complex and phenyl methyl sulphide
respectively, which are taken as molecular identity rules. The LHS and RHS of a
molecular identity rule are the same and contain only the graph of a particular molecule.
Type graphs are obtained for all possible general rules in FACSP using the above
methodology. The type graphs of the molecules in the reaction rules R11(a) and R11(b)
are shown in Figs. 5 and 6. Each one is added as a molecular identity rule. Fig. 7
depicts the abstraction of the reaction rule R11 in Fig. 2.

3.2 Critical Pair Analysis for FACSP

A critical pair analysis is done between each general reaction rule and each molecular
identity rule.

Fig. 5 Type graph for the molecule A1 and B in FACSP

Fig. 6 Type graph for the molecule Z , S1 and P1 in FACSP


Fig. 7 R11 (a)-top, R11 (b)-bottom

Fig. 8 Critical analysis: parallel conflicts (PC) and sequential dependencies (SD)

The parallel conflicts and sequential dependencies are verified by applying the
general rule to the molecule (LHS) and to the molecule (RHS), respectively, at the
match given by the critical overlapping. The results of the first iteration are given
in Fig. 8. Each entry signifies how many of the overlappings were critical for each
pair. Critical pair analysis checks all possible unions of L and M for parallel conflict
analysis and of R and M for sequential dependence analysis.
In a similar manner, the molecular identity rules are obtained, and hence the
critical pair analysis is done for all reaction rules in FACSP. The result of the critical
pair analysis is given in Fig. 9.
For the FACSP reaction studied, the results obtained after applying the reaction
rule to the overlappings at their critical matches are compared. There are two critical
overlappings wherever there is a conflict with A1, as shown in Fig. 10. The
critical nodes and edges in this overlapping (Fe) are covered by the shaded area.

Fig. 9 Critical pair analysis—parallel conflicts (PC) and sequential dependencies (SD)

Fig. 10 Critical overlapping between R11 and A1 (critical graph elements are contained within the
shaded area)

The other nodes and edges in the critical overlappings are identical. The two
overlappings arise from the symmetry around the critical Fe atom node. They have no
significance for the selection of a reaction or for its outcome, so they are
equivalent, and we obtain an isomorphism between the corresponding transformations.
Since the molecules involved are very small, these overlappings are reduced to 1 in
all cases. We have therefore reduced this entry in Fig. 9 to 1, which gives the
stoichiometric matrix (Fig. 12). From this matrix, the ODEs are obtained immediately.

4 Stochastic Graph Transformation System for FACSP

In a chemical system, the reaction speed is captured by the rate constant as a measure
of the reactivity of the given components. In FACSP, the reaction rules that act on the
molecules are specified by rewriting rules, and the membership value plays the role of
the rate constant. By assigning the membership values (rate constants) to the rules of
a graph transformation system, we obtain a stochastic graph transformation system.

4.1 Stoichiometric Matrix for FACSP

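Consider the evolution rule

$$R_{11}(a):\ [_1\, A_1 \mid Z\,]_1 \xrightarrow{\omega_1} [_1\, B \mid \varphi\,]_1; \qquad R_{11}(b):\ [_1\, B \mid S_1\,]_1 \xrightarrow{\omega_2} [_1\, A_1 \mid [_2\ \mid P_1\,]_2\,]_1 \qquad (2)$$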

which comprises an example reaction mechanism for FACSP. If it is known, for each
reaction, how many molecules of each chemical species are created or destroyed, we
can build up a matrix for the reactions in (2).
Each entry in the stoichiometric matrix corresponds to the aggregate number of
molecules consumed or produced in a reaction, negative for consumption and positive
for production. Since the first reaction in (2), with membership value ω1, consumes
one molecule of Z, the entry for ω1 and Z in the matrix is −1. Similarly, the
entry for ω1 and A1 is −1. Since the first reaction also produces one molecule of B,
the entry for ω1 and B is 1. Proceeding in this way, we build up a stoichiometric
matrix for the reactions in (2), which is tabulated in Fig. 11a.
In a similar way, we are able to build up the stoichiometric matrix (Fig. 12) for
all the reactions in the seven evolution rules of FACSP.
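The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the row order of the species, the labels `w1` and `w2` for ω1 and ω2, and the function name `stoichiometric_matrix` are choices made here, not the layout of Fig. 11a.

```python
# Species (rows) and reactions (columns) for the two steps of R11 in (2).
# The row order is an assumption; Fig. 11a may list the species differently.
SPECIES = ["A1", "Z", "B", "S1", "P1"]

# Each reaction lists what it consumes and produces (one molecule each).
REACTIONS = {
    "w1": {"consumes": ["Z", "A1"], "produces": ["B"]},         # R11(a)
    "w2": {"consumes": ["B", "S1"], "produces": ["A1", "P1"]},  # R11(b)
}


def stoichiometric_matrix(species, reactions):
    """Return S as a list of rows; S[i][j] is the net change of species i in reaction j."""
    matrix = [[0] * len(reactions) for _ in species]
    for j, rule in enumerate(reactions.values()):
        for name in rule["consumes"]:
            matrix[species.index(name)][j] -= 1  # negative for consumption
        for name in rule["produces"]:
            matrix[species.index(name)][j] += 1  # positive for production
    return matrix


S = stoichiometric_matrix(SPECIES, REACTIONS)
for name, row in zip(SPECIES, S):
    print(f"{name:>2}: {row}")
```

The entries for ω1 reproduce the values stated in the text (−1 for Z and A1, 1 for B); the remaining entries follow from R11(b) in the same way.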
The membership law for FACSP is defined such that the membership coefficient
for each row is multiplied by the concentrations of the species which are destroyed
in the corresponding reaction. For example, the membership laws for the oxidation of
sulphides in (2) are ω1[Z][A1] for the reaction R11(a) and ω2[B][S1] for R11(b).
The membership law matrix for the reactions in (2) is shown in Fig. 11b. In a
similar manner, we are able to define the membership law for all reactions in the seven
evolution rules of FACSP.
Multiplying the stoichiometric matrix by the membership law matrix, we obtain the
following ODEs.

d[A1]/dt = −ω1[Z][A1] + ω2[B][S1];

d[Z]/dt = −ω1[Z][A1];

d[B]/dt = ω1[Z][A1] − ω2[B][S1];

d[S1]/dt = −ω2[B][S1];

d[P1]/dt = ω2[B][S1].

Fig. 11 a Stoichiometric matrix for R11 and b membership law matrix for R11
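As a rough numerical check of the five equations for R11, the following Python sketch forms the product of the stoichiometric matrix and the membership-law vector and integrates it with an explicit Euler step. The initial concentrations, the membership values 0.7 and 0.4, the step size and the helper names are all hypothetical values chosen for illustration; they are not taken from the paper.

```python
SPECIES = ["A1", "Z", "B", "S1", "P1"]
# Rows follow SPECIES; columns are the reactions R11(a) and R11(b).
S = [[-1, 1], [-1, 0], [1, -1], [0, -1], [0, 1]]


def membership_laws(conc, w1, w2):
    """Membership-law vector: coefficient times concentrations of consumed species."""
    return [w1 * conc["Z"] * conc["A1"], w2 * conc["B"] * conc["S1"]]


def derivatives(conc, w1, w2):
    """d[X]/dt = S . R for every species X."""
    rates = membership_laws(conc, w1, w2)
    return {name: sum(s * r for s, r in zip(row, rates))
            for name, row in zip(SPECIES, S)}


def euler(conc, w1, w2, dt, steps):
    """Explicit Euler integration; adequate for a small illustration only."""
    conc = dict(conc)
    for _ in range(steps):
        rate = derivatives(conc, w1, w2)
        for name in SPECIES:
            conc[name] += dt * rate[name]
    return conc


# Hypothetical initial concentrations and membership values.
start = {"A1": 0.1, "Z": 1.0, "B": 0.0, "S1": 1.0, "P1": 0.0}
end = euler(start, w1=0.7, w2=0.4, dt=0.001, steps=20000)
print({name: round(value, 4) for name, value in end.items()})
```

Two conserved quantities can be read off the equations and used as a sanity check: [A1] + [B] stays constant because the catalyst is only converted between its two forms, and [S1] + [P1] stays constant because each consumed sulphide yields one sulphoxide.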

4.2 Place Transition Net Representing FACSP Reaction Mechanism

In [7], it is described how a discrete Petri net can be converted into a continuous
one by allowing places to have a positive real number of tokens representing the
concentration of that particular chemical species in the system. The ODEs can be
deduced from the incidence matrix for such a Petri net.

Fig. 12 Stoichiometric matrix for FACSP, M-molecules, M.V.-membership values

Fig. 13 PT Net for FACSP


In our problem, the place transition net with places Z, A1, B, Sn, Pn, where
n = 1 to 7, and transitions ωl, ωm, where l = 1, 3, 5, . . . , 13 and
m = 2, 4, 6, . . . , 14, is shown in Fig. 13. Its incidence matrix is evidently
similar to the stoichiometric matrix (Fig. 12), and hence we are able to obtain the ODEs.
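A minimal Python sketch of such an incidence matrix is given below. It assumes that rule n contributes the odd transition ω(2n−1) for the step Z + A1 → B and the even transition ω(2n) for the step B + Sn → A1 + Pn, with all arc weights equal to 1; this pairing, the labels `w1` to `w14` and the function names are assumptions made for the example and may differ from Figs. 12 and 13.

```python
SUBSTITUENTS = 7  # n = 1, ..., 7

places = (["Z", "A1", "B"]
          + [f"S{n}" for n in range(1, SUBSTITUENTS + 1)]
          + [f"P{n}" for n in range(1, SUBSTITUENTS + 1)])


def build_transitions():
    """Return {transition: (pre, post)}; pre and post map places to arc weights."""
    transitions = {}
    for n in range(1, SUBSTITUENTS + 1):
        # Odd index: Z + A1 -> B (formation of the oxo complex).
        transitions[f"w{2 * n - 1}"] = ({"Z": 1, "A1": 1}, {"B": 1})
        # Even index: B + Sn -> A1 + Pn (oxidation, catalyst regenerated).
        transitions[f"w{2 * n}"] = ({"B": 1, f"S{n}": 1}, {"A1": 1, f"P{n}": 1})
    return transitions


def incidence_matrix(places, transitions):
    """C[p][t] = post(p, t) - pre(p, t): tokens produced minus tokens consumed."""
    return [[post.get(p, 0) - pre.get(p, 0) for pre, post in transitions.values()]
            for p in places]


transitions = build_transitions()
C = incidence_matrix(places, transitions)
for place, row in zip(places, C):
    print(f"{place:>2}: {row}")
```

Under these assumptions the matrix has 17 rows (places) and 14 columns (transitions), and its first two columns restricted to Z, A1, B, S1 and P1 coincide with the stoichiometric entries derived for R11.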

5 Conclusion

We have derived ordinary differential equations from the stoichiometric matrix
obtained by critical pair analysis of the graph transformation system using the AGG
tool. In the same way, we have obtained the stoichiometric matrix using stochastic
graph transformation by assigning membership values to the evolution rules. We
have also obtained the incidence matrix of a Petri net representing the FACSP
mechanism, which is similar to the stoichiometric matrix of FACSP.
We have observed that once a stoichiometric matrix is established, the ODEs
can be derived. It is also understood that it is enough to encode the graph
transformation system into a place transition net to find the stoichiometric matrix.
This approach has been demonstrated by means of the oxidation of sulphides, a reaction
following Michaelis–Menten kinetics, using the AGG tool and validated for FACSP.

Acknowledgements The author Dr. Sr. P. Helen Chandra, Principal Investigator of the UGC Major
Research Project (F.No. -43-412/2014(SR) dated 05 September 2015), is grateful to UGC, New
Delhi, for the award of the project, which enabled this research work to be carried out at
Jayaraj Annapackiam College for Women (Autonomous), Periyakulam, Theni District, Tamil Nadu.

References

1. Suzuki, Y., Tsumoto, S., Tanaka, H.: Analysis of Cycles in Symbolic Chemical Systems
Based on Abstract Rewriting Systems on Multisets, pp. 522–528. Artificial Life V, MIT Press,
Cambridge, MA (1996)
2. Suzuki, Y., Fujiwara, Y., Takabayashi, J., Tanaka, H.: Artificial life applications of a class
of P systems: abstract rewriting system on multisets, multiset processing. Lecture Notes in
Computer Science, vol. 2235, pp. 299–346. Springer, Berlin (2001)
3. Helen Chandra, P., Saroja Theerdus Kalavathy, S.M., Mary Imelda Jayaseeli, A., Philomenal
Karoline, J.: Fuzzy ACS with biological catalysts on membranes in chemical reactions. J. Netw.
Innovative Comput. MIR Labs, USA 4, 143–151 (2016)
4. Philomenal Karoline, J., Helen Chandra, P., Saroja Theerdus Kalavathy, S.M., Mary Imelda
Jayaseeli, A.: Model based simulation of Michaelis-Menten kinetic reactions by DPO graph
transformation. IJPAM (2017)
5. Bapodra, M., Heckel, R.: From graph transformations to differential equations. Electron.
Commun. EASST 30, 1–21 (2010)
6. Heckel, R., Lajios, G., Menge, S.: Stochastic graph transformation systems. In: International
Colloquium on Theoretical Aspects of Computing 2005. Lecture Notes in Computer Science,
vol. 3256, pp. 210–225 (2004)

7. Gilbert, D., Heiner, M.: From Petri nets to differential equations-an integrative approach for
biochemical network analysis. In: ICATPN, Lecture Notes in Computer Science, vol. 4024,
pp. 181–200 (2006)
8. Mary Imelda Jayaseeli, A., Rajagopal, S.: [Iron(III)-salen] Ion Catalyzed H2O2 Oxidation of
organic sulfides and sulfoxides. J. Mol. Catal. A: Chem. 309, 103–110 (2009)
Chapter 31
Fully Dynamic Group Signature
Scheme with Member Registration
and Verifier-Local Revocation

Maharage Nisansala Sevwandi Perera and Takeshi Koshiba

Abstract Since Bellare et al. (EUROCRYPT 2003) proposed a security model for
group signature schemes, the security of almost all group signature schemes has
been discussed in their model (the BMW03 model). While the BMW03 model is
for static groups, Bellare et al. in 2005 considered the case of dynamic group
signature schemes and provided a solution to cope with dynamic groups. However,
their scheme supports only member registration, not member revocation. In
this paper, we incorporate a member revocation mechanism into a group signature
scheme with member registration and construct a fully dynamic group signature
scheme, which uses verifier-local revocation (VLR) to manage member revocation.
Moreover, we establish the security of the proposed scheme with a restricted version of
full anonymity to overcome the security complications that may arise due to member
revocation.

Keywords Dynamic group signature · Verifier-local revocation · Almost-full anonymity

1 Introduction

The notion of group signature was first introduced by Chaum and van Heyst [12]
in 1991. Each member has a private signing key and a corresponding public key.
The private signing key is used to generate signatures on messages while the public
key is used as a public verification key by verifiers to authenticate the signatures.
Group signatures allow group members to sign anonymously on behalf of the group
(anonymity). Only the authorized person can reveal the identity of the member who
signs (traceability).

M. N. S. Perera (B)
Graduate School of Science and Engineering, Saitama University, Saitama, Japan
e-mail: [email protected]
T. Koshiba
Faculty of Education and Integrated Arts and Sciences, Waseda University, Tokyo, Japan
e-mail: [email protected]

Besides the basic security notions (anonymity and traceability) for group
signatures, further security requirements such as unframeability, collusion
resistance, and unforgeability have been proposed. In 2003, Bellare et al. [2]
suggested a formal security notion with full anonymity and full traceability to
provide stronger security for group signature schemes (the BMW03 model). The BMW03
model supports only static groups, not dynamic groups. Hence, it does not guarantee
security when group members can be flexibly reorganized.
In the setting of dynamic group signatures, neither the number of group members
nor their keys should be fixed in the setup phase. Thus, a scheme should be able to
register or revoke members at any time. In 2005, Bellare et al. [3] provided
foundations for dynamic group signatures and suggested a scheme. The scheme in [3]
helps to bridge the gap between the results in [2] and the previous work on dynamic
group signature schemes. Dynamic groups are more complex than static groups, since
they raise more security concerns and more issues to be addressed. The schemes in [3]
and [14] provide formal security definitions for dynamic group signatures to overcome
those issues. Another scheme was suggested by Libert et al. [15]. However, none of
them is a fully dynamic group signature scheme, since they do not support member
revocation. Recently, Bootle et al. [7] suggested a security definition for fully
dynamic group signature schemes and also provided some fixes for existing schemes.
Hereafter, if a scheme supports both member registration and member revocation, we
refer to it as fully dynamic, and if a scheme supports either member registration or
revocation, we refer to it as dynamic.
Member revocation is an essential requirement in practice, and many researchers
have presented various approaches to managing member revocation in groups.
One approach is to replace the group public key and the private signing keys with
new keys for all existing members when a member is revoked. Since this requires
updating all the existing members and the verifiers, it is not the best solution and
is especially unsuitable for large groups. In 2001, Bresson et al. [8] provided a
solution that requires signers to prove, at the time of signing, that their member
certificates are not in the public revocation list. In 2002, Camenisch et al. [11]
proposed a different approach, which is based on dynamic accumulators. An accumulator
maps a set of values into a fixed-length string and permits efficient proofs of
membership. However, this approach requires existing members to keep track of the
revoked users, and thus increases the workload of existing members. Moreover, the
schemes in [5, 10, 18] have taken other revocation approaches.
A different and simple revocation mechanism was suggested by Brickell [9] and
subsequently formalized by Boneh and Shacham [6]. This revocation
mechanism is known as Verifier-Local Revocation (VLR). VLR allows the members
to convince the verifiers that they are valid members, who are not revoked and are
eligible to sign on behalf of the group. Every member has a unique token, and when he
is revoked, this token is added to a list called the Revocation List (RL). Then, the
group manager passes the latest RL to the verifiers. When a verifier needs to
authenticate a signature, he checks the validity of the signer with the help of RL.
Since the verifiers are smaller in number than the members, this mechanism is more
convenient than any others, especially for large groups. Moreover, it has an advantage
over the previous approaches since it does not affect existing members.
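The verifier-side check can be pictured with the following Python sketch. It is a deliberately simplified stand-in, not the construction of this paper or of [6]: the hash-based tag, the `sign_stub` helper and the dictionary layout of a signature are hypothetical, and the actual signature verification is omitted. It only illustrates that a verifier tests a signature against every token in RL and that unrevoked members need no update.

```python
import hashlib
import secrets


def make_tag(token: bytes, nonce: bytes) -> bytes:
    """Hypothetical link between a signature and the signer's revocation token."""
    return hashlib.sha256(token + nonce).digest()


def sign_stub(token: bytes, message: bytes) -> dict:
    """Stand-in signer: attaches a fresh nonce and a tag bound to the token."""
    nonce = secrets.token_bytes(16)
    return {"message": message, "nonce": nonce, "tag": make_tag(token, nonce)}


def is_revoked(signature: dict, revocation_list: list) -> bool:
    """Verifier-local check: test the signature against every token in RL."""
    return any(make_tag(token, signature["nonce"]) == signature["tag"]
               for token in revocation_list)


# Two members with independent tokens; only the second one is revoked.
token_alice = secrets.token_bytes(32)
token_bob = secrets.token_bytes(32)
RL = [token_bob]

sig_alice = sign_stub(token_alice, b"report")
sig_bob = sign_stub(token_bob, b"report")
print(is_revoked(sig_alice, RL), is_revoked(sig_bob, RL))
```

In this sketch only the verifier holds RL, and the cost of the check grows with the length of RL, which is the usual trade-off of verifier-local revocation.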
Our Contribution
This paper presents a fully dynamic group signature scheme that allows members to be
both added and revoked, together with a new security notion to overcome some security
barriers.
First, we take the scheme in [3], which includes an interactive protocol that
allows new users to join the group at any time, and we incorporate a member
revocation mechanism by adapting the methods of the scheme in [3] and suggesting
new methods to manage member revocation with VLR.
Then, we suggest a method to generate member revocation tokens in our scheme.
In general, any VLR scheme includes a token system, and the tokens are generated
as a part of the secret signing key. Since our intention is to apply full anonymity,
which requires providing all the secret signing keys to the adversary in the anonymity
game, this method is not suitable for our scheme. If we generated revocation tokens
from the signing keys of the members, the adversary could obtain the tokens of the
challenged indices and win the anonymity game. Thus, to form a member’s token, we use
his personal secret key (usk[i]) and his verification key (pk_i). Since pk_i is a
public attribute, revealing pk_i does not disclose any other information about the
member. Even though usk[i] is a secret key, no one can generate any secret signing key
from usk[i]. Besides, no one can create a group member’s token from the secret signing
key, since the token is not a part of the secret signing key. This ensures the
security of the scheme.
Moreover, we present a new security notion that is somewhat weaker than full
anonymity. VLR relies on a weaker security notion called selfless-anonymity. Even
though our intention is to apply full anonymity to our scheme, achieving the full
anonymity suggested in the BMW03 model for VLR is quite difficult. In the anonymity
game (used in the definition) between a challenger and an adversary, the BMW03 model
passes all the secret keys to the adversary. However, we cannot allow the adversary to
reveal all the secret keys, since he could then break the anonymity of the scheme. If
we allowed the adversary to reveal all the users’ personal private keys (usk), which
we use to create tokens, he could create any token, including the challenged users’
tokens. Then, he could check the challenge signature against these tokens and return
the correct user index. Thus, we suggest a new restricted version of full anonymity
(almost-full anonymity), which does not provide all the secret keys to the adversary,
to ensure the security of our scheme. It allows the adversary to reveal any member’s
secret signing key but not the member’s personal private key.

2 Preliminaries

In this section, we describe the notation used in the paper and the primitives
that we use to construct our scheme. The construction of dynamic group signature
schemes uses three building blocks: public-key encryption schemes secure against
chosen-ciphertext attack [13], digital signature schemes secure against
chosen-message attack [1], and simulation-sound adaptive non-interactive zero-knowledge
(NIZK) proofs for NP [17]. All three primitives are based on trapdoor permutations.

2.1 Notation

We denote by λ the security parameter of the scheme and let N = {1, 2, 3, . . .} be
the set of positive integers. For any integer k ≥ 1, we denote by [k] the set of integers
{1, …, k}. An empty string is denoted by ε. If s is a string, then |s| denotes the length
of the string, and if S is a set, then |S| denotes the size of the set. If S is a finite set,
$b \xleftarrow{\$} S$ denotes that b is chosen uniformly at random from S. We denote
experiments by Exp.

2.2 Digital Signature Schemes

A digital signature scheme DS = (Ks , Sig, Vf) consists of three algorithms: key gener-
ation Ks , signing Sig, and verification Vf. The scheme DS should satisfy the standard
notion of unforgeability under chosen-message attack.
For an adversary A, consider an experiment $\mathbf{Exp}^{\text{unforg-cma}}_{DS,A}(\lambda)$. First, a pair of a
public key and the corresponding secret key for the scheme DS is obtained by executing
Ks with the security parameter λ as $(pk, sk) \xleftarrow{\$} K_s(1^\lambda)$. Then, the public
key pk is given to the adversary, and the adversary can query the signing oracle
Sig(sk, ·) on any number of messages. Finally, the forging adversary A outputs
(m, σ). He wins if σ is a valid signature on the message m and m was never queried
to the signing oracle. We let $\mathbf{Adv}^{\text{unforg-cma}}_{DS,A}(\lambda) = \Pr[\mathbf{Exp}^{\text{unforg-cma}}_{DS,A}(\lambda) = 1]$.
A digital signature scheme DS is secure against forgeries under chosen-message
attack if $\mathbf{Adv}^{\text{unforg-cma}}_{DS,A}(\lambda)$ is negligible in λ for any polynomial-time adversary A.
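
The flow of the unforgeability experiment can be illustrated with a minimal Python sketch. The harness `exp_unforg_cma` and the toy scheme are assumptions made only for illustration: the toy scheme sets pk = sk and is deliberately insecure, so that one adversary visibly loses (it replays a queried message) and another visibly wins (it signs a fresh message with the leaked key). It is not a scheme used in this paper.

```python
import hashlib
import hmac
import os

def exp_unforg_cma(keygen, sign, verify, adversary, lam=16):
    """Return 1 if the adversary outputs a valid signature on an unqueried message."""
    pk, sk = keygen(lam)
    queried = set()

    def sign_oracle(m):
        queried.add(m)          # remember every message sent to Sig(sk, .)
        return sign(sk, m)

    m, sigma = adversary(pk, sign_oracle)
    return int(verify(pk, m, sigma) and m not in queried)

# Toy scheme (hypothetical, deliberately insecure): pk equals sk.
def toy_keygen(lam):
    sk = os.urandom(lam)
    return sk, sk

def toy_sign(sk, m):
    return hmac.new(sk, m, hashlib.sha256).digest()

def toy_verify(pk, m, sigma):
    return hmac.compare_digest(toy_sign(pk, m), sigma)

def replay_adversary(pk, sign_oracle):
    m = b"hello"
    return m, sign_oracle(m)    # valid signature, but m was queried

def key_leak_adversary(pk, sign_oracle):
    m = b"fresh"
    return m, toy_sign(pk, m)   # exploits pk == sk in the toy scheme

print(exp_unforg_cma(toy_keygen, toy_sign, toy_verify, replay_adversary))    # 0
print(exp_unforg_cma(toy_keygen, toy_sign, toy_verify, key_leak_adversary))  # 1
```

For a secure scheme, the second kind of adversary should succeed only with negligible probability.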

2.3 Encryption Scheme

An encryption scheme E = (Ke , Enc, Dec) consists of three algorithms: key gener-
ation Ke , encryption Enc, and decryption Dec. The scheme E should satisfy the
standard notion of indistinguishability under adaptive chosen-ciphertext attack.
For an adversary A and a bit b, consider an experiment $\mathbf{Exp}^{\text{ind-cca-}b}_{E,A}(\lambda)$. First, a pair of a
public key and the corresponding secret key for the encryption scheme E is obtained
by executing Ke with the security parameter λ and a randomness string r_e (where the
length of r_e is bounded by some fixed polynomial r(λ)) as $(pk, sk) \xleftarrow{\$} K_e(1^\lambda, r_e)$.
Let LR(m_0, m_1, b) be a function which returns m_b for a bit b and messages m_0, m_1.
The adversary A is given pk and access to the oracles Enc(pk, LR(·, ·, b)) and Dec(sk, ·).
We assume the adversary A never queries Dec(sk, ·) on a ciphertext previously
returned by Enc(pk, LR(·, ·, b)). We let
$\mathbf{Adv}^{\text{ind-cca}}_{E,A}(\lambda) = |\Pr[\mathbf{Exp}^{\text{ind-cca-1}}_{E,A}(\lambda) = 1] - \Pr[\mathbf{Exp}^{\text{ind-cca-0}}_{E,A}(\lambda) = 1]|$.
An encryption scheme E is IND-CCA secure if $\mathbf{Adv}^{\text{ind-cca}}_{E,A}(\lambda)$ is negligible in λ
for any polynomial-time adversary A.
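
As a rough illustration of the left-or-right game, the following Python sketch runs the experiment for both values of b and computes the advantage. All names here (`exp_ind_cca`, the toy XOR scheme, the sample adversary) are hypothetical. The toy scheme uses pk = sk and is intentionally broken, which is why the sample adversary reaches advantage 1.

```python
import os

def exp_ind_cca(b, keygen, enc, dec, adversary, lam=16):
    """Run the experiment for bit b and return the adversary's output bit."""
    pk, sk = keygen(lam, os.urandom(lam))    # (pk, sk) <- K_e(1^lam, r_e)
    challenge_cts = set()

    def lr_oracle(m0, m1):
        c = enc(pk, (m0, m1)[b])             # Enc(pk, LR(m0, m1, b))
        challenge_cts.add(c)
        return c

    def dec_oracle(c):
        if c in challenge_cts:               # forbidden query
            raise ValueError("challenge ciphertext")
        return dec(sk, c)

    return adversary(pk, lr_oracle, dec_oracle)

# Toy scheme (hypothetical, deliberately insecure): pk == sk, fixed XOR pad.
def xor(data, key):
    return bytes(x ^ k for x, k in zip(data, key))

def toy_keygen(lam, r_e):
    return r_e, r_e

def toy_enc(pk, m):
    return xor(m, pk)

def toy_dec(sk, c):
    return xor(c, sk)

def adversary(pk, lr_oracle, dec_oracle):
    m0, m1 = b"\x00" * 16, b"\xff" * 16
    c = lr_oracle(m0, m1)
    return 1 if xor(c, pk) == m1 else 0      # decrypts with the leaked key

advantage = abs(exp_ind_cca(1, toy_keygen, toy_enc, toy_dec, adversary)
                - exp_ind_cca(0, toy_keygen, toy_enc, toy_dec, adversary))
print(advantage)  # 1
```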

2.4 Simulation-Sound Non-interactive Zero-Knowledge Proof System

A two-party game between a prover and a verifier, in which the verifier needs to determine
whether a given string belongs to a language, is called an interactive proof system. An
interactive proof system allows the prover and the verifier to exchange messages.
Argument systems are like interactive proof systems, except that it is only required
to be computationally infeasible for a prover to convince the verifier to accept inputs
not in the language. Non-interactive proof systems are mono-directional [4]: the prover
sends a single message to the verifier. A non-interactive proof system allows a prover
to convince a verifier of the truth of a statement, while zero-knowledge ensures that
the verifier learns nothing from the proof other than the truth of the statement.
Non-interactive zero-knowledge proof systems show that computational zero-knowledge
can be achieved without any interaction by using a common reference string. In a
simulation-sound NIZK proof system, an adversary cannot prove any false statement
even after seeing simulated proofs of arbitrary statements.
An NP-relation over a domain Dom ⊆ {0, 1}^* is a subset ρ of {0, 1}^* × {0, 1}^*.
We say that x is a theorem and w is a proof of x if (x, w) ∈ ρ, and we denote by L_ρ
the set of all theorems. Membership of (x, w) in ρ is decidable in time polynomial in
the length of the first argument for all x in Dom.
We fix an NP-relation ρ over Dom and take a pair of polynomial-time algorithms
(P, V), where P is randomized and V is deterministic. Both P and V have access
to a common reference string R. The pair (P, V) is a non-interactive proof system for ρ
over Dom if the following two conditions are satisfied for polynomials p and l.
– Completeness: ∀λ ∈ N, ∀(x, w) ∈ ρ with |x| ≤ l(λ) and x ∈ Dom:
$\Pr[R \xleftarrow{\$} \{0,1\}^{p(\lambda)};\ \pi \xleftarrow{\$} P(1^\lambda, x, w, R) : V(1^\lambda, x, \pi, R) = 1] = 1$.
– Soundness: ∀λ ∈ N, ∀P̂ and x ∈ Dom such that $x \notin L_\rho$:
$\Pr[R \xleftarrow{\$} \{0,1\}^{p(\lambda)};\ \pi \xleftarrow{\$} \hat{P}(1^\lambda, x, R) : V(1^\lambda, x, \pi, R) = 1] \le 2^{-\lambda}$.

3 Our Scheme

We construct our scheme based on the scheme in [3]. The scheme in [3] takes a digital
signature scheme DS = (Ks, Sig, Vf) and a public-key encryption scheme
E = (Ke, Enc, Dec) as the building blocks of a group signature scheme GS. Moreover,
it uses an NIZK proof system to convince the verifier of the validity of a signature.
We also use the above-mentioned primitives DS, E, and NIZK to present a new scheme
FDGS = (GKg, UKg, Join, Issue, Revoke, Sign, Verify, Open, Judge). GKg, UKg, and
Judge are the same as in the scheme in [3]. We provide a new algorithm Revoke to
revoke members, and we modify Join, Issue, Sign, Verify, and Open to be compatible
with the revocation mechanism. We use DS for generating the group manager’s keys
and E for generating the opener’s keys. Thus, our group public key gpk consists of
the security parameter λ, the public keys of the group manager and the opener, and
two reference strings R_1, R_2 obtained for the NIZK proofs.
We describe our group-joining protocol, which executes Join and Issue, in Fig. 1,
and we describe the other algorithms of our scheme in Fig. 2.

Fig. 1 Group-joining protocol

3.1 Coping with VLR and Making the Scheme Secure

In general, VLR schemes satisfy a weaker security notion called selfless-anonymity,
which does not provide any secret keys to the adversary. Even though our scheme
supports the VLR mechanism, we make it more secure by using the techniques of the
scheme in [3] and by suggesting a new security notion called almost-full anonymity.
Making a VLR scheme fully anonymous is quite difficult, since full anonymity
requires providing all the secret keys to the adversary, and providing tokens to the
adversary makes the scheme insecure: the adversary can execute Verify with the
tokens of the challenged indices and win the game easily. Thus, we consider a new
security notion called almost-full anonymity, a restricted version of full anonymity
that does not provide tokens to the adversary.
Moreover, any VLR scheme has an associated tracing mechanism, called the implicit
tracing algorithm, to trace signers. The implicit tracing algorithm requires running
Verify a number of times linear in the number of group members. Compared to the
explicit tracing algorithm, which is used in schemes like [16], the implicit tracing
algorithm increases the time consumption. Hence, instead of using the implicit tracing
algorithm given in VLR, we use the algorithms provided in [3] for our scheme’s
tracing mechanism.

Fig. 2 Algorithms of the new fully dynamic group signature scheme

In addition, VLR manages a token system. Thus, our scheme should include user
tokens, and those tokens should be unique to the users. Furthermore, tokens should
not reveal a user’s identity if they are disclosed to outsiders. We generate tokens
for members that do not expose the identity of the members even if the tokens are
disclosed to outsiders. We use the combination of each group member’s personal
secret key and his verification key as his token, and we maintain the list RL of
revoked members’ tokens.

3.2 Description of Our Scheme

There are two authorities, group manager and opener. The trusted setup is respon-
sible for generating the group public key and keys for the authorities. The group
manager manages member registration and member revocation while the opener
traces signers.
When a new user wants to join the group, he interacts with the group manager
via the group-joining protocol (Fig. 1), which allows new users to generate their public
and secret keys. We assume that this interaction between the new user and the group
manager is done through a secure channel. The new user produces a signature on his
verification key and sends both the signature and the key to the group manager. If the
signature is acceptable, then the group manager accepts him as a new member. In the
registration table reg, we maintain a field called Status for each member to record
whether the member is active. Thus, the group manager stores the index i, the
verification key pk_i, and the signature sig_i of the new member in reg and sets the
status of the new member to active. After that, the group manager issues the member
certification to the new member. Now the new member can generate signatures on
messages using his secret key.
Each member has a unique token, which serves as the tracing key for checking
whether a signer has been revoked. Here we use the member’s personal
secret key usk[i] and his verification key pk_i as the token, since neither usk[i] nor
pk_i helps to reveal any other information. We check the new user’s keys
against reg in the joining protocol. Thus, when a revoked member wants
to join again, he cannot use his previous keys and has to follow the process as a
new user. This secures the scheme against adversaries who steal tokens and try to
join the group. During member revocation, the group manager adds the revoked
member’s token to RL and updates the member’s status in reg to inactive. When a
member needs to sign a message, he generates the signature on the message with his
secret key and passes it, with his token, to the verifier for verification. The verifier
authenticates the signature on the given message and checks the validity of the signer
with the provided token against the latest RL. When it is necessary to trace the
signer, the opener can trace the signer using the opener’s key and can check the status
of the signer in reg.

Our scheme is a tuple FDGS = (GKg, UKg, Join, Issue, Revoke, Sign, Verify,
Open, Judge) of polynomial-time algorithms. Each algorithm is described below.
GKg, UKg, and Judge are the same as in [3], while Join, Issue, Sign, Verify, and
Open differ from the algorithms given in [3], since we have to generate and pass
the member’s token as an additional attribute in our scheme. Revoke revokes
misbehaving users.
– GKg(1^λ): On input 1^λ, the trusted party obtains a group public key gpk and
the authority keys ik and ok. It then gives the secret key ik to the group manager
and ok to the opener.
– UKg(1^λ): Every user who wants to be a member should run this algorithm before
the group-joining protocol to obtain a personal public key and a personal private
key (upk[i], usk[i]). UKg takes 1^λ as input. We assume upk is publicly available.
– Join, Issue: The group-joining protocol is an interactive protocol between the
group manager and the user who wants to be a member. Join is implemented
by the user, while Issue is implemented by the group manager. Join allows new
users to generate the keys, and a signature on the keys, that are needed to join the
group. Issue allows the group manager to validate the keys and the signatures sent
by users and to generate member certifications. Each algorithm takes an incoming
message as input and returns an outgoing message. Join and Issue maintain the
current state of their respective parties. The user i generates a public/secret key
pair pk_i and sk_i. Then, he produces a signature sig_i on pk_i using usk[i], which
was obtained in UKg, and sends sig_i and pk_i to the group manager for
authentication. The group manager authenticates the signature sig_i on pk_i and
generates the member certification by signing pk_i with his private key ik (gmsk).
The group manager stores the new member’s information, i, pk_i, and sig_i, with
the status 1 (active) in reg. Then, he sends the member certification cert_i to the
user, who is now the new member of the group. After that, the new user can form
his secret key gsk[i] = (i, pk_i, sk_i, cert_i) and his token grt[i] = (usk[i], pk_i).
– Revoke(i, grt[i], ik, RL, reg): This algorithm takes the index i of the member who
is to be revoked, the member’s token grt[i], the group manager’s secret key ik, the
revocation list RL, and the registration table reg as inputs. First, the group manager
queries reg using the index i to obtain the stored information of the member. Then,
he checks whether the retrieved data are equal to the data obtained by parsing
grt[i]. If the data are equal and the user is active, he inserts (usk[i], pk_i) into RL
and updates the status in reg to 0 (inactive).
– Sign(gpk, gsk[i], grt[i], m): This randomized algorithm generates a signature σ
on a given message m. It takes the group public key gpk, the group member’s secret
key gsk[i], and the message m as inputs. In addition, we pass the group member’s
token grt[i] as an input to prove that the member is active at the time of signing.
– Verify(gpk, m, σ, RL): This deterministic algorithm allows anyone in possession of
group public key gpk to verify the given signature σ on the message m and checks
the validity of the signer against RL. This algorithm outputs 1 if both conditions
are valid. Otherwise, it returns 0.
– Open(gpk, ok, reg, m, σ): This deterministic algorithm traces the signer by taking
gpk, the opener’s secret key ok, reg, the message m, and the signature σ as inputs.
It returns the index of the signer, the proof of the claim τ, and the status st of the
signer in reg. If the algorithm fails to trace the signature to a particular group
member, it returns (0, ε, 0).
– Judge(gpk, i, upk[i], m, σ, τ): This deterministic algorithm outputs either 1 or
0 depending on the validity of the proof τ on σ. It takes the group public key
gpk, the member index i, the member verification key upk[i], the message m, the
signature σ, and the tracing proof τ as inputs. The algorithm outputs 1 if τ
proves that i produced σ. Otherwise, it returns 0.
In addition, we use the following simple polynomial-time algorithm.
– IsActive(i,reg): This algorithm determines whether the member i is active by
querying the registration table and outputs either 0 or 1.
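
The bookkeeping behind Revoke and IsActive can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the construction of Fig. 2: keys are opaque byte strings, the group manager's key ik is omitted, reg is modelled as a dictionary (the `RegEntry` type is hypothetical), and the consistency check compares only pk_i because reg stores pk_i and sig_i but not usk[i].

```python
from dataclasses import dataclass

@dataclass
class RegEntry:
    pk: bytes      # member verification key pk_i
    sig: bytes     # signature sig_i on pk_i
    status: int    # 1 = active, 0 = inactive

def is_active(i, reg):
    """IsActive(i, reg): 1 if member i is registered and active, else 0."""
    entry = reg.get(i)
    return 1 if entry is not None and entry.status == 1 else 0

def revoke(i, grt_i, reg, rl):
    """Add the token of member i to RL if it matches reg and i is active."""
    usk_i, pk_i = grt_i                      # parse grt[i] = (usk[i], pk_i)
    entry = reg.get(i)
    if entry is None or entry.pk != pk_i or entry.status != 1:
        return False
    rl.add((usk_i, pk_i))                    # token goes on the revocation list
    entry.status = 0                         # reg is updated to inactive
    return True

reg = {1: RegEntry(pk=b"pk1", sig=b"sig1", status=1)}
rl = set()
print(revoke(1, (b"usk1", b"pk1"), reg, rl))   # True
print(is_active(1, reg))                       # 0
print(revoke(1, (b"usk1", b"pk1"), reg, rl))   # False: already inactive
```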

4 Security Notions of the Scheme

Even though the BMW03 model has two key requirements, full anonymity and full
traceability, the scheme in [3] has three key requirements: anonymity, traceability,
and non-frameability. Since full traceability as discussed in the BMW03 model covers
both traceability and non-frameability, the BMW03 model has only two requirements.
In the setting of [3], traceability and non-frameability are separated, since
non-frameability can be achieved with lower levels of trust in the authorities than
traceability, as discussed below. According to the scheme in [3], the opener’s secret
key is provided to the adversary in the traceability game, but the issuer’s secret key is
not; that is, the opener is assumed to be partially corrupted in traceability. In
non-frameability, however, both the issuer’s and the opener’s secret keys are given to
the adversary. Thus, the adversary is stronger in non-frameability than in
traceability, and non-frameability is separated from traceability in [3]. Moreover,
anonymity allows the adversary to corrupt the issuer in [3]. Thus, we provide
the issuer’s secret key to the adversary but not the opener’s secret key.
However, the scheme in [3] does not support member revocation, whereas our scheme
does. Thus, we adapt the security experiments and the oracles to be compatible
with VLR. Before we discuss the security notions, we define the set of oracles that
we use. We suggest a new oracle, Revoke, to handle member revocation requested
by the adversary.
For the requirement of anonymity, we suggest a restricted version of full anonymity.
In the full-anonymity game, all the members’ secret keys, including the keys of the
challenged indices, are provided to the adversary. In our scheme, this
would help the adversary to create the challenged indices’ tokens, since he would know
all the members’ personal secret keys (usk), and he could execute Verify to check which
index was used to generate the challenged signature. Thus, we do not provide users’
personal secret keys to the adversary when he requests a user’s secret keys. However,
he can request any private signing key. Hence, we suggest a new security notion,
almost-full anonymity, to show the security of our scheme. Since almost-full anonymity
does not give members’ personal secret keys to the adversary, it is somewhat weaker
than full anonymity, and since it provides members’ secret signing keys, including
those of the challenged indices, to the adversary, it is stronger than selfless-anonymity.

4.1 The Oracles

All the oracles that we use are specified in Fig. 3. We maintain a set of global lists,
which are manipulated by the oracles in the security experiments discussed later.
HUL is the honest user list, which maintains the indexes of the users who are added
to the group. When the adversary corrupts any user, that user’s index is added to
the corrupted user list CUL. SL carries the signatures that are obtained from the Sign
oracle: when the adversary requests a signature, the generated signature, the index,
and the message are added to SL. When the adversary accesses the Challenge oracle,
the generated signature is added to CL with the message sent. We use a set S to keep
track of revoked users.
– AddU(i): The adversary can add a user i ∈ N to the group as an honest user. The
oracle adds i to HUL and selects keys for i. It then executes the group-joining
protocol. If Issue accepts, it adds the state to reg, and if Join accepts, it
generates gsk[i]. Finally, it returns upk[i].
– CrptU(i, upk): The adversary can corrupt user i by setting its personal public
key upk[i] to upk. The oracle adds i to CUL and initializes the issuer’s state in
group-joining protocol.
– SendToIssuer(i, Min ): The adversary acts as i and engages in group-joining pro-
tocol with Issue-executing issuer. The adversary provides i and Min to the oracle.
The oracle which maintains the Issue state returns the outgoing message and adds
a record to reg.
– SendToUser(i, Min ): The adversary corrupts the issuer and engages in group-
joining protocol with Join-executing user. The adversary provides i and Min to the
oracle. The oracle which maintains the user i state, returns the outgoing message,
and sets the private signing key of i to the final state of Join.
– RevealU(i): The adversary can reveal secret keys of the user i. We only provide
user’s private signing key gsk[i] not his personal private key usk[i].
– ReadReg(i): The adversary can read the entry of i in reg.
– ModifyReg(i, val): The adversary can modify the contents of the record for i in
reg by setting val.
– Sign(i, m): The adversary obtains a signature σ for a given message m and user i
who is an honest user and has private signing key.
– Chalb (i0 , i1 , m): This oracle is for defining anonymity and provides a group sig-
nature for the given message m under the private signing key of ib , as long as both
i0 , i1 are active and honest users having private signing keys.
– Revoke(i): The adversary can revoke user i. The oracle updates the record for i
in reg and adds revocation token of i to the set S.
– Open(m, σ): The adversary can access this opening oracle with a message m and
a signature σ to obtain the identity of the user who generated the signature σ. If
σ was previously returned by Chal_b, the oracle aborts.

Fig. 3 Oracles
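
The interaction between the global lists and the oracles can be sketched as a small state object. This Python sketch is only an illustration of the bookkeeping: keys and tokens are opaque placeholders, `add_u` stands in for AddU without modelling the join protocol, and the class and method names are hypothetical rather than taken from Fig. 3.

```python
class OracleState:
    """Global lists shared by the oracles (keys are opaque placeholders)."""

    def __init__(self):
        self.HUL = set()   # honest users
        self.CUL = set()   # corrupted users
        self.SL = []       # (i, m, sigma) from the Sign oracle
        self.CL = []       # (m, sigma) from the challenge oracle
        self.S = set()     # revocation tokens of revoked users
        self.gsk = {}      # private signing keys
        self.grt = {}      # revocation tokens (never revealed)
        self.active = {}   # status field of reg

    def add_u(self, i, gsk_i, grt_i):
        # Stand-in for AddU: the join protocol itself is not modelled.
        self.HUL.add(i)
        self.gsk[i], self.grt[i], self.active[i] = gsk_i, grt_i, True

    def reveal_u(self, i):
        # Only the signing key is returned, never usk[i] or the token.
        return self.gsk.get(i)

    def revoke(self, i):
        if self.active.get(i):
            self.active[i] = False
            self.S.add(self.grt[i])

    def chal_allowed(self, i0, i1):
        # Chal_b requires two active, honest users holding signing keys.
        return all(i in self.HUL and i in self.gsk and self.active.get(i, False)
                   for i in (i0, i1))

st = OracleState()
st.add_u(1, "gsk1", ("usk1", "pk1"))
st.add_u(2, "gsk2", ("usk2", "pk2"))
print(st.chal_allowed(1, 2))   # True
st.revoke(2)
print(st.chal_allowed(1, 2))   # False: user 2 is no longer active
```

The point of `reveal_u` is the restriction behind almost-full anonymity: the adversary obtains gsk[i] but not the material needed to rebuild grt[i].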

4.2 Correctness

The notion of correctness requires that any signature generated by an honest and
active user is valid and that Open correctly identifies the signer for a given
message and signature. Moreover, the proof returned by Open should be accepted
by Judge. Hence, a scheme is correct if the advantage in the correctness game is
0 for all λ ∈ N and for any adversary A.
We let $\mathbf{Adv}^{\mathrm{corr}}_{\mathrm{FDGS},A}(\lambda) = \Pr[\mathbf{Exp}^{\mathrm{corr}}_{\mathrm{FDGS},A}(\lambda) = 1]$.

$\mathbf{Exp}^{\mathrm{corr}}_{\mathrm{FDGS},A}(\lambda)$
(gpk, ok, ik) ← GKg(1^λ); HUL ← ∅; (i, m) ← A(gpk; AddU, ReadReg, Revoke);
If i ∉ HUL or gsk[i] = ε or IsActive(i, reg) = 0, then return 0.
σ ← Sign(gpk, gsk[i], m);
If Verify(gpk, m, σ, S) = 0, then return 1.
(i′, τ) ← Open(gpk, ok, reg, m, σ);
If i ≠ i′, then return 1.
If Judge(gpk, i, upk[i], m, σ, τ) = 0, then return 1 else return 0.
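
The control flow of the correctness experiment can be traced with the following Python sketch. It assumes a placeholder scheme `TransparentStub` whose "signature" is simply (i, m); this stub is neither anonymous nor secure and exists only so that the branches of the experiment can be exercised. Group keys are omitted, reg maps an index to its status, and S holds revoked indices instead of tokens; all of these are simplifications for illustration.

```python
class TransparentStub:
    """Placeholder scheme: a 'signature' is just (i, m). Not anonymous, not secure."""

    def sign(self, gsk_i, m):
        return (gsk_i["index"], m)

    def verify(self, m, sigma, S):
        return sigma[1] == m and sigma[0] not in S

    def open(self, reg, m, sigma):
        i = sigma[0]
        return (i, "tau") if i in reg else (0, "")

    def judge(self, i, m, sigma, tau):
        return tau == "tau" and sigma == (i, m)

class BrokenOpen(TransparentStub):
    def open(self, reg, m, sigma):
        return (0, "")                       # always fails to trace

def exp_corr(scheme, i, m, HUL, gsk, reg, S):
    """Return 1 if a correctness violation is observed, 0 otherwise."""
    if i not in HUL or gsk.get(i) is None or reg.get(i) != 1:
        return 0                             # adversary's choice is not admissible
    sigma = scheme.sign(gsk[i], m)
    if not scheme.verify(m, sigma, S):
        return 1                             # honest signature rejected
    j, tau = scheme.open(reg, m, sigma)
    if j != i:
        return 1                             # Open names the wrong signer
    return 1 if not scheme.judge(i, m, sigma, tau) else 0

HUL, gsk, reg, S = {1}, {1: {"index": 1}}, {1: 1}, set()
print(exp_corr(TransparentStub(), 1, "msg", HUL, gsk, reg, S))  # 0
print(exp_corr(BrokenOpen(), 1, "msg", HUL, gsk, reg, S))       # 1
```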

4.3 Anonymity

The anonymity requires that signatures do not reveal the identity of the signer. In the
anonymity game, the adversary’s goal is to identify the index that was used to create
the signature. We allow the adversary A to corrupt any user and to fully
corrupt the group manager. Also, A can learn the secret signing key of any user. In the
full-anonymity game, the adversary can access all the secret keys of any member.
However, we suggest a new security notion, almost-full anonymity, which does not
reveal the personal secret keys of the users, since otherwise the adversary could create
the tokens of the challenged members, check them with Verify, and easily win the game.
We say that the FDGS scheme is almost-fully anonymous if the advantage
$\mathbf{Adv}^{\mathrm{anon}}_{\mathrm{FDGS},A}(\lambda)$ of the adversary is negligible for any polynomial-time adversary.
In the game, A selects two active group members and a message as the challenge.
He has to guess which member was used to generate the signature. He wins if
he guesses the member correctly. We allow only one guess.
We let $\mathbf{Adv}^{\mathrm{anon}}_{\mathrm{FDGS},A}(\lambda) = \Pr[\mathbf{Exp}^{\text{anon-0}}_{\mathrm{FDGS},A}(\lambda) = 1] - \Pr[\mathbf{Exp}^{\text{anon-1}}_{\mathrm{FDGS},A}(\lambda) = 1]$.

$\mathbf{Exp}^{\text{anon-}b}_{\mathrm{FDGS},A}(\lambda)$
(gpk, ok, ik) ← GKg(1^λ); HUL, CUL, SL, CL ← ∅;
b* ← A(gpk, ik; CrptU, SendToUser, RevealU, Open, ModifyReg, Revoke, Chal_b);
Return b*;

4.4 Non-Frameability

The non-frameability ensures that no adversary can produce a signature that is
attributed to an honest member who did not produce it.
We let $\mathbf{Adv}^{\text{non-fram}}_{\mathrm{FDGS},A}(\lambda) = \Pr[\mathbf{Exp}^{\text{non-fram}}_{\mathrm{FDGS},A}(\lambda) = 1]$.
In this game, we only require that the framed member is honest. Thus, the
adversary A can fully corrupt the group manager and the opener.
Formally, the FDGS scheme is non-frameable if $\mathbf{Adv}^{\text{non-fram}}_{\mathrm{FDGS},A}(\lambda)$ is negligible
in λ for any polynomial-time adversary A.

$\mathbf{Exp}^{\text{non-fram}}_{\mathrm{FDGS},A}(\lambda)$
(gpk, ok, ik) ← GKg(1^λ); HUL, CUL, SL ← ∅;
(m, σ, i, τ) ← A(gpk, ik, ok; CrptU, SendToUser, RevealU, Sign, ModifyReg);
If Verify(gpk, m, σ, S) = 0, then return 0.
If Judge(gpk, i, upk[i], m, σ, τ) = 0, then return 0.
If i ∉ HUL or (i, m, σ, τ) ∈ SL, then return 0 else return 1.

4.5 Traceability

The traceability requires that no adversary can produce a signature whose origin
cannot be identified. That is, the adversary’s challenge is to generate
a signature that cannot be traced to an active member of the group. In this
game, A is allowed to corrupt any user, and he has the opener’s key, but he is not
allowed to corrupt the group manager, since otherwise he could produce dummy users.
He wins if he can create a signature whose signer cannot be identified, or whose signer
is an inactive member when creating the signature, or for which the Judge algorithm
does not accept the decision of the Open algorithm.
We let $\mathbf{Adv}^{\mathrm{trace}}_{\mathrm{FDGS},A}(\lambda) = \Pr[\mathbf{Exp}^{\mathrm{trace}}_{\mathrm{FDGS},A}(\lambda) = 1]$.

$\mathbf{Exp}^{\mathrm{trace}}_{\mathrm{FDGS},A}(\lambda)$
(gpk, ok, ik) ← GKg(1^λ); HUL, CUL, SL ← ∅;
(m, σ) ← A(gpk, ok; AddU, CrptU, SendToIssuer, RevealU, Sign,
ModifyReg, Revoke);
If Verify(gpk, m, σ, S) = 0, then return 0.
(i, τ) ← Open(gpk, ok, reg, m, σ);
If i = 0 or Judge(gpk, i, upk[i], m, σ, τ) = 0, then return 1 else return 0.
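
The final decision steps of the non-frameability and traceability experiments can be written as two small predicates. This Python sketch assumes that the outcomes of Verify, Judge, and Open have already been computed and are passed in as plain values; the function names are hypothetical.

```python
def non_fram_wins(verify_ok, judge_ok, i, m, sigma, tau, HUL, SL):
    """Win condition of the non-frameability experiment."""
    if not verify_ok:
        return 0
    if not judge_ok:
        return 0
    if i not in HUL or (i, m, sigma, tau) in SL:
        return 0                      # framed user not honest, or nothing was forged
    return 1

def trace_wins(verify_ok, opened_index, judge_ok):
    """Win condition of the traceability experiment."""
    if not verify_ok:
        return 0
    return 1 if opened_index == 0 or not judge_ok else 0

# A valid signature that Open cannot attribute to anyone breaks traceability.
print(trace_wins(True, 0, True))                                    # 1
# A signature obtained from the Sign oracle does not frame user 1.
print(non_fram_wins(True, True, 1, "m", "s", "t", {1},
                    [(1, "m", "s", "t")]))                          # 0
```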

5 Security Proof of Our Scheme

We can prove that our scheme is anonymous, non-frameable, and traceable according
to the experiments described above, which are discussed in [3] and [7]. Our scheme
uses a token system as an additional attribute compared with the scheme in [3].
However, since we do not provide the tokens to the adversary, and since the revocation
token consists of the member’s personal secret key usk[i] and his verification key pk_i,
which cannot be used to learn about the member, the token system has no impact on the
security of the scheme. Since full anonymity is problematic for our scheme and a
reasonable and sufficient security notion is required, we use almost-full anonymity,
and we use the security experiments provided above instead of the experiments given
in [3]. Due to the page limitation, we provide only a summary of the security proof;
a detailed proof will be given in the full version of this paper.

5.1 Anonymity

On the assumption that P_1 is computational zero knowledge for ρ_1 over Dom_1 and
P_2 is computational zero knowledge for ρ_2 over Dom_2, two simulators Sim_1 and
Sim_2 can be fixed such that Π_1 = (P_1, V_1, Sim_1) and Π_2 = (P_2, V_2, Sim_2) are
simulation-sound zero-knowledge non-interactive proof systems for L_ρ1 and
L_ρ2, respectively.
For any polynomial-time adversary B that challenges the anonymity of
our scheme, we can construct polynomial-time IND-CCA adversaries A_0, A_1
against the encryption scheme E, an adversary A_s against the simulation soundness of
Π, and distinguishers D_1, D_2 that distinguish real proofs of Π_1 and Π_2, respectively,
from simulated ones, such that for all λ ∈ N,

$\mathbf{Adv}^{\mathrm{anon}}_{\mathrm{FDGS},B}(k) \le \mathbf{Adv}^{\text{ind-cca}}_{E,A_0}(k) + \mathbf{Adv}^{\text{ind-cca}}_{E,A_1}(k) + \mathbf{Adv}^{\mathrm{ss}}_{\Pi,A_s}(k) + 2 \cdot (\mathbf{Adv}^{\mathrm{zk}}_{P_1,Sim_1,D_1}(k) + \mathbf{Adv}^{\mathrm{zk}}_{P_2,Sim_2,D_2}(k)).$

According to Lemma 5.1 described and proved in [3], the left side is negligible,
since all the functions on the right side are negligible under the
assumptions on the security of the building blocks. This proves the anonymity
of our scheme.

5.2 Non-Frameability

If there is a non-frameability adversary B, who creates at most n(k) honest users,
where n is a polynomial, and who constructs two adversaries A_2, A_3 against the digital
signature scheme, then, on the assumption that (P_1, V_1) and (P_2, V_2) are sound proof
systems for ρ_1 and ρ_2, respectively, we have

$\mathbf{Adv}^{\text{non-fram}}_{\mathrm{FDGS},B}(k) \le 2^{-k+1} + n(k) \cdot (\mathbf{Adv}^{\text{unforg-cma}}_{DS,A_2}(k) + \mathbf{Adv}^{\text{unforg-cma}}_{DS,A_3}(k)).$

On the assumption that the scheme DS is secure, all the functions on the right side
are negligible, and so is the left side. Thus, our scheme is non-frameable under
the assumed security of DS.

5.3 Traceability

If there is a traceability adversary B, who constructs an adversary A_1 against the
scheme DS, then, on the assumption that (P_1, V_1) is a sound proof system for ρ_1, we have

$\mathbf{Adv}^{\mathrm{trace}}_{\mathrm{FDGS},B}(k) \le 2^{-k+1} + \mathbf{Adv}^{\text{unforg-cma}}_{DS,A_1}(k).$

On the assumption that DS is unforgeable under chosen-message attack, all the
functions on the right side are negligible. Hence, the advantage of B is negligible,
which proves that our scheme is traceable.

6 Conclusion

In this paper, we have presented a simple fully dynamic group signature scheme
that can be used as a basic scheme to develop with different approaches. We have
constructed our scheme based on the scheme in [3] and proposed a verifier-local
revocation mechanism, which eases member revocation and is convenient for large groups.
Thus, our scheme is more flexible and suitable for dynamically changing groups,
even when they are large. We have shown how to achieve security with almost-full
anonymity, which is a limited version of full anonymity.

Acknowledgements This work is supported in part by JSPS Grants-in-Aid for Scientific Research
(A) JP16H01705 and for Scientific Research (B) JP17H01695.

References

1. Bellare, M., Micali, S.: How to sign given any trapdoor function. In: CRYPTO 1988, vol. 403,
pp. 200–215. LNCS (1988)
2. Bellare, M., Micciancio, D., Warinschi, B.: Foundations of group signatures: formal definitions,
simplified requirements, and a construction based on general assumptions. In: EUROCRYPT
2003, vol. 2656, pp. 614–629. LNCS (2003)
3. Bellare, M., Shi, H., Zhang, C.: Foundations of group signatures: the case of dynamic groups.
In: CT-RSA 2005, vol. 3376, pp. 136–153. LNCS (2005)
4. Blum, M., De Santis, A., Micali, S., Persiano, G.: Noninteractive zero-knowledge. SIAM J.
Comput. 20(6), 1084–1118 (1991)
5. Boneh, D., Boyen, X., Shacham, H.: Short group signatures. In: CRYPTO 2004, vol. 3152, pp.
41–55. LNCS (2004)
6. Boneh, D., Shacham, H.: Group signatures with verifier-local revocation. In: ACM-CCS 2004,
pp. 168–177. ACM (2004)
7. Bootle, J., Cerulli, A., Chaidos, P., Ghadafi, E., Groth, J.: Foundations of fully dynamic group
signatures. In: ACNS 2016, pp. 117–136. LNCS (2016)
8. Bresson, E., Stern, J.: Efficient revocation in group signatures. In: PKC 2001, vol. 1992, pp.
190–206. LNCS (2001)
9. Brickell, E.: An efficient protocol for anonymously providing assurance of the container of the
private key. Submitted to the Trusted Computing Group (April 2003)
10. Camenisch, J., Groth, J.: Group signatures: better efficiency and new theoretical aspects. In:
SCN 2004, vol. 3352, pp. 120–133. LNCS (2004)
11. Camenisch, J., Lysyanskaya, A.: Dynamic accumulators and application to efficient revocation
of anonymous credentials. In: CRYPTO 2002, vol. 2442, pp. 61–76. LNCS (2002)
12. Chaum, D., van Heyst, E.: Group signatures. In: EUROCRYPT 1991, vol. 547, pp. 257–265.
LNCS (1991)
13. Dolev, D., Dwork, C., Naor, M.: Nonmalleable cryptography. SIAM Rev. 45(4), 727–784
(2003)
14. Kiayias, A., Yung, M.: Secure scalable group signature with dynamic joins and separable
authorities. Int. J. Secur. Netw. 1(1–2), 24–45 (2006)
15. Libert, B., Ling, S., Mouhartem, F., Nguyen, K., Wang, H.: Signature schemes with efficient
protocols and dynamic group signatures from lattice assumptions. In: ASIACRYPT 2016, vol.
10032, pp. 373–403. LNCS (2016)
16. Ling, S., Nguyen, K., Wang, H.: Group signatures from lattices: simpler, tighter, shorter, ring-
based. In: PKC 2015, vol. 9020, pp. 427–449. LNCS (2015)
17. Sahai, A.: Non-malleable non-interactive zero knowledge and adaptive chosen-ciphertext
security. In: FOCS 1999, pp. 543–553. IEEE (1999)
18. Song, D.X.: Practical forward secure group signature schemes. In: ACM-CCS 2001,
pp. 225–234. ACM (2001)
Chapter 32
Fourier-Based Function Secret Sharing
with General Access Structure

Takeshi Koshiba

Abstract Function secret sharing (FSS) scheme is a mechanism that calculates a
function f(x) for x ∈ {0, 1}^n which is shared among p parties, by using distributed
functions f_i : {0, 1}^n → G (1 ≤ i ≤ p), where G is an Abelian group, while the
function f : {0, 1}^n → G is kept secret from the parties. Ohsawa et al. in 2017 observed
that any function f can be described as a linear combination of the basis functions
by regarding the function space as a vector space of dimension 2^n and gave new FSS
schemes based on the Fourier basis. All existing FSS schemes are of (p, p)-threshold
type. That is, to compute f(x), we have to collect f_i(x) for all the distributed functions.
In this paper, as in the secret sharing schemes, we consider FSS schemes with any
general access structure. To do this, we observe that Fourier-based FSS schemes by
Ohsawa et al. are compatible with linear secret sharing schemes. By incorporating the
techniques of linear secret sharing with any general access structure into the
Fourier-based FSS schemes, we propose Fourier-based FSS schemes with any general access
structure.

Keywords Function secret sharing · Distributed computation · Fourier basis ·
Linear secret sharing · Access structure · Monotone span program

1 Introduction

Secret sharing (SS) schemes are fundamental cryptographic primitives, which were
independently invented by Blakley [4] and Shamir [21]. SS schemes involve several
ordinary parties (say, p parties) and the special party called a dealer. We suppose
that the dealer has a secret information s and partitions the secret information s
into share information Si (0 ≤ i ≤ p) which will be distributed to the ith party. In
(n, p)-threshold SS scheme, the secret information S can be recovered from n shares
(collected if any n parties get together), but no information on s is obtained from

T. Koshiba (B)
Faculty of Education and Integrated Arts and Sciences,
Waseda University, 1-6-1 Nishiwaseda, Shinjuku-ku, Tokyo 169-8050, Japan
e-mail: [email protected]

at most n − 1 shares. This threshold property can be discussed in terms of access
structures. An access structure (A, B) consists of two classes of sets of parties such
that (1) if all parties in some set A ∈ A get together, then the secret information can be
recovered from their shares; (2) even if all parties in any set B ∈ B get together,
no information about the secret s can be obtained. For example, the access structure
(A, B) of the (n, p)-threshold SS scheme can be defined as A = {A ⊆ {1, . . . , p} :
|A| ≥ n} and B = {B ⊆ {1, . . . , p} : |B| < n}. Besides the access structure of the
threshold type, many variants have been investigated in the literature [3, 6, 7, 13,
15, 17]. As a standard technique for constructing access structures, monotone span
programs [10, 11, 14, 18] are often used.
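The threshold example above can be made concrete with a small enumeration. The following is a minimal sketch, not part of the original construction: the function name `threshold_access_structure` and the parameters n = 2, p = 3 are illustrative assumptions.

```python
from itertools import combinations

def threshold_access_structure(n, p):
    """Return (qualified, forbidden) families of the (n, p)-threshold structure."""
    parties = range(1, p + 1)
    # Enumerate every subset of {1, ..., p} as a frozenset.
    subsets = [frozenset(c) for size in range(p + 1)
               for c in combinations(parties, size)]
    qualified = {s for s in subsets if len(s) >= n}   # the family A
    forbidden = {s for s in subsets if len(s) < n}    # the family B
    return qualified, forbidden

# Hypothetical parameters: any 2 of 3 parties may reconstruct.
qualified, forbidden = threshold_access_structure(n=2, p=3)
print(sorted(map(sorted, qualified)))
print(sorted(map(sorted, forbidden)))
```

Every subset lands in exactly one of the two families, so the threshold structure is complete in the sense defined in Sect. 2.1.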
The idea of secretly distributing secret information to several parties can be
applied to a function. Secretly distributing a function has an application
in private information retrieval (PIR) [8, 9, 16], as demonstrated in [12]. Gilboa
et al. [12] consider distributed point functions (DPFs) fa,b : {0, 1}^n → G, where
fa,b (x) = b if x = a for some a ∈ {0, 1}^n and fa,b (x) = 0 otherwise. In a basic DPF
scheme, the function f is partitioned into two keys f0 , f1 , and each key is distributed
to one of the two parties. Each party calculates the share yi = fi (x)
for a common input x by using the key fi . On the other hand, each fi alone does not give
any important information (e.g., the value a for fa,b ) on the original function. The
functional value of the point function fa,b can be obtained by just summing up the two
shares y0 and y1 of the two parties. Boyle et al. [5] investigate the efficiency in the
key size and extend the two-party setting to the multi-party setting. Moreover, they
generalize the target functions (i.e., point functions) to other functions and propose
an FSS scheme for some function family F in which functions f : {0, 1}^n → G can
be calculated efficiently. In the multi-key FSS scheme, we partition a function f ∈ F
into p distributed functions (f1 , . . . , fp ). Likewise, the equation f (x) = Σ_{i=1}^{p} fi (x) is
satisfied for any x, and the information about the secret function f (except
the domain and the range) does not leak out from at most p − 1 distributed functions.
Moreover, the distributed functions fi can be described as short keys ki , and they are
required to be efficiently evaluable.
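To illustrate the additive relation f(x) = f1(x) + · · · + fp(x), here is a deliberately naive sketch that shares the full truth table of a point function. It is not the DPF construction of [12] or the FSS scheme of [5]: its keys have size 2^n, whereas those schemes produce short keys. The group G is assumed to be the integers modulo 17, and all function names are hypothetical.

```python
import secrets

def point_function_table(a, b, n):
    """Truth table of f_{a,b} on {0,1}^n, with inputs encoded as integers."""
    return [b if x == a else 0 for x in range(2 ** n)]

def gen(table, p, modulus):
    """Split the table into p additive shares; any p - 1 of them look uniform."""
    shares = [[secrets.randbelow(modulus) for _ in table] for _ in range(p - 1)]
    last = [(table[x] - sum(s[x] for s in shares)) % modulus
            for x in range(len(table))]
    return shares + [last]

def evaluate(key, x):
    """Each party evaluates its own share function at the common input x."""
    return key[x]

MODULUS = 17  # assumption: G = Z_17 under addition
table = point_function_table(a=5, b=3, n=3)
keys = gen(table, p=3, modulus=MODULUS)
# Summing the p output shares recovers f(x) for every input x.
outputs = [sum(evaluate(k, x) for k in keys) % MODULUS for x in range(2 ** 3)]
```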
In [20], Ohsawa et al. observed that any function f from {0, 1}^n to {0, 1} can
be described as a linear combination of basis functions by regarding the function
space as a vector space of dimension 2^n . While the point functions fa,1 (for
all a ∈ {0, 1}^n ) constitute a (standard) basis for the vector space, any function
f : {0, 1}^n → {±1} can be represented as a linear combination of the Fourier basis
functions χa (x) = (−1)^⟨a,x⟩ , where ⟨a, x⟩ denotes the inner product between vectors
a = (a1 , . . . , an ) and x = (x1 , . . . , xn ). Based on the above observation, Ohsawa
et al. gave new FSS schemes based on the Fourier basis. If we limit our concern to
polynomial-time computable FSS schemes, the functions for which the existing schemes
are available are limited. Since polynomial-time computable functions represented
by combinations of point functions are quite different from ones represented
by the Fourier basis functions, point function-based FSS schemes and Fourier
function-based FSS schemes are complementary.
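The Fourier expansion over {0, 1}^n can be checked numerically on a tiny example. The sketch below is an illustration only: the 3-bit `majority` and `parity` functions are hypothetical examples, not taken from [20]. It computes all Fourier coefficients by brute force and rebuilds the function from them.

```python
from itertools import product

def chi(a, x):
    """Fourier basis function chi_a(x) = (-1)^<a,x> on {0,1}^n."""
    return -1 if sum(ai & xi for ai, xi in zip(a, x)) % 2 else 1

def fourier_coefficients(f, n):
    """Brute-force coefficients: average of f(x) * chi_a(x) over all x."""
    points = list(product((0, 1), repeat=n))
    return {a: sum(f(x) * chi(a, x) for x in points) / len(points)
            for a in points}

def majority(x):
    """Hypothetical 3-bit example with range {+1, -1}."""
    return 1 if sum(x) >= 2 else -1

def parity(x):
    """Parity of all three bits; this is itself the basis function chi_(1,1,1)."""
    return chi((1, 1, 1), x)

n = 3
maj_hat = fourier_coefficients(majority, n)
par_hat = fourier_coefficients(parity, n)
# Rebuild majority from its Fourier coefficients.
rebuilt = {x: sum(c * chi(a, x) for a, c in maj_hat.items())
           for x in product((0, 1), repeat=n)}
```

Parity has a single nonzero Fourier coefficient, which is the kind of function for which a Fourier-based description is short.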
We note that properties of some functions are often discussed using the techniques
of Fourier analysis. Akavia et al. [1] introduced a novel framework for proving
hard-core properties in terms of Fourier analysis. Any predicate can be represented
as a linear combination of Fourier basis functions. Akavia et al. show that if the
number of nonzero coefficients in the Fourier representation of a hard-core predicate
is polynomially bounded, then the coefficients are efficiently approximable. This fact
leads to the hard-core properties. Besides hard-core predicates, it is well known that
low-degree polynomials are Fourier-concentrated [19].
Contribution
Since the existing FSS schemes are of (p, p)-threshold type, it is natural to consider
the possibility of FSS schemes with any threshold structure of (n, p)-type and even
general access structures as in the case of SS schemes.
In this paper, we affirmatively answer this question. As mentioned, the Fourier-based
FSS schemes in [20] are considerably simpler than the previous FSS schemes. This is
because Fourier basis functions have some linear structure. Shamir’s threshold SS scheme
can be seen as an application of the Reed–Solomon code, which is a linear code.
Both the distribution phase and the reconstruction phase can be described in a linear
algebraic way. From this viewpoint, we construct an (n, p)-threshold Fourier-based
FSS scheme. Moreover, SS schemes with general access structure can be discussed
in terms of monotone span program (MSP). The underlying structure of SS schemes
by using MSP is similar to the linear algebraic view of Shamir’s (n, p)-threshold
SS scheme, and we can similarly construct Fourier-based FSS schemes with general
access structure.
Technically speaking, Ohsawa et al. [20] consider a function from {0, 1}^n to C.
That is, they consider the Fourier transform over the n-dimensional vector space over F2 .
On the other hand, we consider a function from a finite field Fq (of prime order q) to
C. So, in this paper, we consider the Fourier transform over Fq rather than (F2 )^n .
This shift of the underlying mathematical structure enables us to construct FSS schemes
with general access structure.

2 Preliminaries

2.1 Access Structure and Monotone Span Program

Let us assume that there are p parties in an SS (or, FSS) scheme. A qualified group is
a set of parties who are allowed to reconstruct the secret, and a forbidden group is a
set of parties who should not be able to get any information about the secret. The set
of qualified groups is denoted by A and the set of forbidden groups by B. The set A is
said to be monotonically increasing if, for any set A ∈ A, any set A′ such that A′ ⊇ A
is also included in A. The set B is said to be monotonically decreasing if, for any set
B ∈ B, any set B′ such that B′ ⊆ B is also included in B. If a pair (A, B) satisfies that
A ∩ B = ∅, A is monotonically increasing and B is monotonically decreasing, then
the pair is called a (monotone) access structure. If an access structure (A, B) satisfies
that A ∪ B coincides with the power set of {1, . . . , p}, we say that the access structure
is complete. If we consider a complete access structure, we may simply denote the
access structure by A instead of (A, B), since B is equal to the complement set of A.
As mentioned, there are several ways to realize general access structures. The monotone
span program (MSP) is a typical way to construct general access structures.
Before describing MSPs, we prepare some basics and notation from linear algebra.
An m × d matrix M over a field F defines a linear map from F^d to F^m . The kernel
of M , denoted by ker(M ), is the set of vectors u ∈ F^d such that M u = 0. The image
of M , denoted by im(M ), is the set of vectors v ∈ F^m such that v = M u for some
u ∈ F^d .
A monotone span program (MSP) M is a triple (F, M , ρ), where F is a finite
field, M is an m × d matrix over F, and ρ : {1, . . . , m} → {1, . . . , p} is a surjective
function which labels each row of M by a party. For any set A ⊆ {1, . . . , p}, let MA
denote the submatrix obtained by restricting M to the rows labeled by parties in A.
We say that M accepts A if e1 = (1, 0, . . . , 0)^T ∈ im((MA )^T ); otherwise, we say M
rejects A. Moreover, we say that M accepts a (complete) access structure A if the
following holds: M accepts A if and only if A ∈ A.
When M accepts a set A, there exists a recombination vector λ such that
(MA )^T λ = e1 . Also, note that e1 ∉ im((MB )^T ) if and only if there exists a vector ξ such
that MB ξ = 0 and the first element of ξ is 1.
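The acceptance condition amounts to solving a linear system over the finite field. The following sketch is an illustration under stated assumptions: the field is F_7, and the 3-row matrix `M` with labeling `rho` is a hypothetical MSP for the access structure "(P1 and P2) or P3", chosen here only as an example. A recombination vector is found by Gaussian elimination modulo q when one exists.

```python
def solve_mod(matrix, rhs, q):
    """Return one solution x of matrix @ x = rhs over F_q, or None if none exists."""
    rows, cols = len(matrix), len(matrix[0])
    aug = [[v % q for v in row] + [b % q] for row, b in zip(matrix, rhs)]
    pivots, r = [], 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if aug[i][c]), None)
        if pivot is None:
            continue
        aug[r], aug[pivot] = aug[pivot], aug[r]
        inv = pow(aug[r][c], -1, q)          # modular inverse (Python 3.8+)
        aug[r] = [v * inv % q for v in aug[r]]
        for i in range(rows):
            if i != r and aug[i][c]:
                factor = aug[i][c]
                aug[i] = [(vi - factor * vr) % q for vi, vr in zip(aug[i], aug[r])]
        pivots.append(c)
        r += 1
    if any(row[-1] for row in aug[r:]):
        return None                          # inconsistent system: e1 is not in the image
    x = [0] * cols
    for i, c in enumerate(pivots):
        x[c] = aug[i][-1]
    return x

def recombination_vector(M, rho, A, q):
    """Solve (M_A)^T lambda = e1; None means the MSP rejects the set A."""
    rows = [row for row, owner in zip(M, rho) if owner in A]
    d = len(M[0])
    transposed = [[row[c] for row in rows] for c in range(d)]
    e1 = [1] + [0] * (d - 1)
    return solve_mod(transposed, e1, q)

# Hypothetical MSP over F_7: rows labeled by parties 1, 2, 3.
Q = 7
M = [[1, 1], [0, 6], [1, 0]]
rho = [1, 2, 3]
```

For the set {1, 2} the rows (1, 1) and (0, 6) add up to e1 modulo 7, so the set is accepted, while neither row alone spans e1.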

2.2 Function Secret Sharing

The original definition of FSS schemes in [5] is tailored for threshold schemes. We
adapt the definition for general access structures. In an FSS scheme, we partition
a function f into keys ki (the succinct descriptions of fi ), which the corresponding
parties Pi receive. Each party Pi calculates the share yi = fi (x) for the common input
x. The functional value f (x) is recovered from the shares yA of a qualified set A of
parties, where yA is a subvector of y = (y1 , y2 , . . . , yp ), by using a decode function
Dec. The joint keys ki held by a forbidden set B of parties do not leak any information
on the function f except the domain and the range of f . We first define the decoding
process from shares.
Definition 1 (Output Decoder) An output decoder Dec, on input a set T of parties
and shares from the parties in T , outputs a value in the range R of the target function f .
Next, we define FSS schemes. We assume that A is a complete access structure
among p parties and T ⊆ {1, 2, . . . , p} is a set of parties.
Definition 2 For any p ∈ N, T ⊆ {1, 2, . . . , p}, an A-secure FSS scheme with
respect to a function class F is a pair of PPT algorithms (Gen, Eval) satisfying the
following.
– The key generation algorithm Gen(1^λ , f ), on input the security parameter 1^λ and
a function f : D → R in F, outputs p keys (k1 , . . . , kp ).
– The evaluation algorithm Eval(i, ki , x), on input a party index i, a key ki , and an
element x ∈ D, outputs a value yi , corresponding to the ith party’s share of f (x).
Moreover, these algorithms must satisfy the following properties:
– Correctness: For all A ∈ A, f ∈ F and x ∈ D,

Pr[Dec(A, {Eval(i, ki , x)}_{i∈A} ) = f (x) | (k1 , . . . , kp ) ← Gen(1^λ , f )] = 1.

– Security: Consider the following indistinguishability challenge experiment for a
forbidden set B of parties, where B ∉ A:
1. The adversary D outputs (f0 , f1 ) ← D(1^λ ), where f0 , f1 ∈ F.
2. The challenger chooses b ← {0, 1} and (k1 , . . . , kp ) ← Gen(1^λ , fb ).
3. D outputs a guess b′ ← D({ki }_{i∈B} ), given the keys for the parties in the forbidden
set B.
The advantage of the adversary D is defined as Adv(1^λ , D) := Pr[b = b′ ] − 1/2.
The scheme (Gen, Eval) is secure if there exists a negligible function ν such that
for all non-uniform PPT adversaries D which corrupt the parties in any forbidden set
B, it holds that Adv(1^λ , D) ≤ ν(λ).

2.3 Basis Functions

The function space of functions f : Fq → C can be regarded as a vector space of
dimension q. Therefore, basis vectors for the function space exist, and we let
hj (x) denote each basis function. Any function f in the function space is described as a
linear combination of the basis functions

\[ f(x) = \sum_{j \in \mathbb{F}_q} \beta_j h_j(x), \]

where the βj are coefficients in C.

The Fourier basis
Let f : Fq → C, where q is an odd prime number. The Fourier transform of the
function f is defined as

\[ \hat{f}(a) = \frac{1}{q} \sum_{x \in \mathbb{F}_q} f(x)\, e^{-2\pi (ax/q) i}, \tag{1} \]

where i is the imaginary unit. Then, f (x) can be described as a linear combination
of the basis functions χa (x) = e^{2π(ax/q)i} , that is,

\[ f(x) = \sum_{a \in \mathbb{F}_q} \hat{f}(a)\, \chi_a(x). \]

In the above, f̂ (a) is called the Fourier coefficient of χa (x). By using ωq = e^{(2π/q)i} , a
primitive qth root of unity, we can denote each Fourier basis function by

\[ \chi_a(x) = (\omega_q)^{ax} \]

and let BF = {χa | a ∈ Fq } be the set of all the Fourier basis functions.
It is easy to see that the Fourier basis is orthonormal since

\[ \frac{1}{q} \sum_{x \in \mathbb{F}_q} \chi_a(x)\, \overline{\chi_b(x)} = \begin{cases} 1 & \text{if } a = b, \\ 0 & \text{otherwise.} \end{cases} \tag{2} \]

In this paper, we consider only Boolean-valued functions and assume that the range
of the Boolean function is {±1} instead of {0, 1} without loss of generality. That is,
we regard Boolean functions as mappings from Fq to {±1}. Also, we have

\[ \chi_{a+b}(x) = \chi_a(x)\, \chi_b(x). \]

This multiplicative property plays an important role in this paper.
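The properties of the Fourier basis over Fq can be verified numerically for a small prime. The sketch below is a minimal illustration in floating-point arithmetic: q = 7 and the ±1-valued test function `f` are hypothetical choices, and the orthonormality check uses the complex conjugate of the second basis function.

```python
import cmath

def chi(a, x, q):
    """Fourier basis function chi_a(x) = omega_q^(a*x) over F_q."""
    return cmath.exp(2j * cmath.pi * ((a * x) % q) / q)

def fourier_transform(f, q):
    """Coefficients f_hat(a) = (1/q) * sum_x f(x) * conj(chi_a(x))."""
    return [sum(f(x) * chi(a, x, q).conjugate() for x in range(q)) / q
            for a in range(q)]

def inverse_transform(f_hat, x, q):
    """Rebuild f(x) as the linear combination sum_a f_hat(a) * chi_a(x)."""
    return sum(c * chi(a, x, q) for a, c in enumerate(f_hat))

def inner(a, b, q):
    """Normalized inner product of chi_a and chi_b."""
    return sum(chi(a, x, q) * chi(b, x, q).conjugate() for x in range(q)) / q

q = 7                                          # hypothetical small odd prime
f = lambda x: 1 if x in (0, 1, 2, 4) else -1   # hypothetical {+1, -1}-valued function
f_hat = fourier_transform(f, q)
```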

3 Linear Secret Sharing

3.1 Shamir’s Threshold Secret Sharing

First, we give a traditional description of Shamir’s (n, p)-threshold SS scheme [21],
where p ≥ n ≥ 2. Let s be a secret integer which a dealer D has. First, the dealer
D chooses a prime number q > max(s, p) and a random polynomial g(X ) ∈ Fq [X ] of degree
at most n − 1 with constant term g(0) = s. Then, the dealer D computes si = (i, g(i)) as a
share for the ith party Pi and sends si to each Pi . For the reconstruction, n parties get
together and recover the secret s = g(0) by Lagrange interpolation from their shares.
The above procedure can be equivalently described as follows. Let M be a p × n
Vandermonde matrix and mi be the ith row of M . That is, mi = (1, i, i^2 , . . . , i^{n−1} ).
Let b = (b0 , b1 , . . . , bn−1 )^T be an n-dimensional vector such that b0 = s and b1 , . . . ,
bn−1 are randomly chosen elements in Fq . Let y = (s1 , s2 , . . . , sp )^T = M b. The share
si for Pi is the ith element of y, that is, si = ⟨(mi )^T , b⟩, where ⟨·, ·⟩ denotes the inner
product. Let A be a subset of {1, 2, . . . , p} which corresponds to a set of parties. Let MA be
the submatrix of M obtained by collecting rows mj for all j ∈ A. We similarly define a
subvector yA by collecting elements sj for all j ∈ A. Let e1 = (1, 0, 0, . . . , 0)^T ∈
(Fq )^n . Then, we can determine λ such that (MA )^T λ = e1 by solving a system of linear
equations if and only if |A| ≥ n (the solution is unique when |A| = n). Then, we have

\[ s = \langle b, e_1 \rangle = \langle b, M_A^T \lambda \rangle = \langle M_A b, \lambda \rangle = \langle y_A, \lambda \rangle. \]

Since yA corresponds to all shares for Pj (j ∈ A), we can reconstruct the secret s by
computing the inner product ⟨yA , λ⟩.
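The linear-algebraic view of Shamir’s scheme can be sketched in a few lines. In this illustration the prime Q = 101, the secret 42 and the parameters (n, p) = (3, 5) are hypothetical, and the recombination vector is computed with the Lagrange coefficients at X = 0, which is one solution of the linear system described above.

```python
import secrets

def vandermonde_row(i, n, q):
    """Row m_i = (1, i, i^2, ..., i^(n-1)) of the Vandermonde matrix over F_q."""
    return [pow(i, e, q) for e in range(n)]

def share(secret, n, p, q):
    """Compute y = M b with b = (secret, random, ..., random)."""
    b = [secret % q] + [secrets.randbelow(q) for _ in range(n - 1)]
    return {i: sum(m * c for m, c in zip(vandermonde_row(i, n, q), b)) % q
            for i in range(1, p + 1)}

def recombination_vector(parties, q):
    """Lagrange coefficients at X = 0; they satisfy (M_A)^T lambda = e1."""
    lam = {}
    for i in parties:
        num, den = 1, 1
        for j in parties:
            if j != i:
                num = num * (-j) % q
                den = den * (i - j) % q
        lam[i] = num * pow(den, -1, q) % q
    return lam

def reconstruct(shares, q):
    """Inner product <y_A, lambda> over F_q."""
    lam = recombination_vector(list(shares), q)
    return sum(shares[i] * lam[i] for i in shares) % q

Q = 101                                   # hypothetical prime, larger than secret and p
shares = share(secret=42, n=3, p=5, q=Q)
recovered = reconstruct({i: shares[i] for i in (1, 3, 5)}, Q)
```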

3.2 Monotone Span Program and Secret Sharing

Here, we give a construction of linear secret sharing (LSS) based on monotone span
programs (MSPs). We do not discuss how to construct MSPs; for their construction,
see the literature, e.g., [6, 10, 11, 14]. In this paper, we will use LSS
schemes. Since LSS schemes imply MSPs [2, 22], it is sufficient to consider
MSP-based SS schemes.
Let s ∈ Fq be a secret which the dealer D has and M = (Fq , M , ρ) be an MSP
which corresponds to a complete access structure A. The dealer D partitions s into
several shares. In the sharing phase, the dealer D chooses a random
vector r ∈ (Fq )^{d−1} , where d is the number of columns of M , and sends a share
⟨(mi )^T , (s, r)^T ⟩ to the ith party. In the reconstruction phase, using the recombination
vector λ, any qualified set A ∈ A of parties can reconstruct the secret as follows:

\[ \langle \lambda, M_A (s, r)^T \rangle = \langle M_A^T \lambda, (s, r)^T \rangle = \langle e_1, (s, r)^T \rangle = s. \]

Regarding the privacy, let B be a forbidden set of parties, and consider the joint
information held by the parties in B. That is, MB b = yB , where b = (s, r)^T . Let
s′ ∈ Fq be an arbitrary value, and let ξ be a vector such that MB ξ = 0 and the first
element in ξ is equal to 1. Then, yB = MB (b + ξ(s′ − s)), where the first coordinate
of the vector b + ξ(s′ − s) is now equal to s′ . This means that, from the viewpoint of
the parties in B, their shares yB are equally likely to be consistent with any secret s′ ∈ Fq .

4 Our Proposal

As mentioned, any function can be described as a linear combination of basis
functions. If the function is described as a linear combination of a super-polynomial
number of basis functions, then evaluating the function might be computationally
inefficient. We say that a function f has a succinct description (with respect to
the basis B) if f is described as f (x) = Σ_{h∈B′} βh h(x) for some B′ ⊂ B
such that |B′ | is polynomially bounded in the security parameter. If we can find a
good basis set B, some functions may have a succinct description with respect to B.
We take the Fourier basis as such a good basis candidate.
We will provide an FSS scheme for a function class whose elements are
functions with succinct descriptions with respect to the Fourier basis BF . Since the
Fourier basis has nice properties, our FSS scheme with general access structure can
be realized.
In what follows, we assume that the underlying basis is always the Fourier basis
BF . Moreover, we assume that M = (Fq , M , ρ) is an MSP which corresponds to a
general complete access structure A. We will consider Fourier-based FSS schemes
with this access structure.

4.1 FSS Scheme for the Fourier Basis

In this subsection, we consider partitioning each Fourier basis function χa (x) = (ωq )^{ax}
into several keys. That is, we give an FSS scheme with general access structure with
respect to the function class BF .
Our FSS scheme with respect to BF consists of three algorithms GenF1
(Algorithm 1), Eval F1 (Algorithm 2), and DecF1 (Algorithm 3). GenF1 is an algorithm
that divides the secret a (for χa (x)) into p keys (k1 , . . . , kp ) as in the SS scheme with
the same access structure. Each key ki is distributed to the ith party Pi . Note that the
secret a can be recovered from the keys ki for all i in a qualified set A ∈ A.
In Eval F1 , each party obtains the share by feeding x to the function distributed as
the key. DecF1 is invoked in order to obtain the value of the Fourier basis function
χa (x) from the shares.
The correctness follows from

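\[
\begin{aligned}
\chi_a(x) &= (\omega_q)^{ax} \\
&= (\omega_q)^{\langle y_A, \lambda \rangle x} \\
&= (\omega_q)^{\left(\sum_{i \in A} k_i \lambda_i\right) x} \\
&= \prod_{i \in A} \left( (\omega_q)^{k_i x} \right)^{\lambda_i}.
\end{aligned}
\]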

For the security, we assume that an adversary D chooses (f0 , f1 ) where f0 = χa
and f1 = χb . Then, the challenger chooses a random bit c to select fc and invokes
GenF1 (1^λ , a) if c = 0 and GenF1 (1^λ , b) if c = 1. If c = 0, then a is divided into p
keys. If c = 1, then b is divided into p different keys. From the argument in Sect. 3.2,
the keys held by a forbidden set are equally likely to be consistent with the secret a
and with the secret b. That is, the inputs to the adversary D are identically distributed
in the two cases. Thus, the adversary D cannot decide whether the target function is
χa (x) or χb (x). It implies that all D can do to guess the random bit c selected by the
challenger is to make a random guess. So, Adv(1^λ , D) = 0. This concludes the security
proof.
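The three algorithms of this subsection can be sketched as follows. This is an illustration under stated assumptions, not the scheme in full generality: the Vandermonde matrix of Shamir’s (n, p)-threshold scheme from Sect. 3.1 stands in for a general MSP, the prime Q = 11 and all function names are hypothetical, and roots of unity are approximated with floating-point complex numbers.

```python
import cmath
import secrets

Q = 11  # hypothetical small odd prime; must exceed the number of parties

def omega_pow(e):
    """(omega_Q)^e, where omega_Q is a primitive Q-th root of unity."""
    return cmath.exp(2j * cmath.pi * (e % Q) / Q)

def gen(a, n, p):
    """Sketch of Algorithm 1: key k_i = <m_i, (a, r)^T> with Vandermonde rows m_i."""
    b = [a % Q] + [secrets.randbelow(Q) for _ in range(n - 1)]
    return {i: sum(pow(i, e, Q) * c for e, c in enumerate(b)) % Q
            for i in range(1, p + 1)}

def evaluate(i, k_i, x):
    """Sketch of Algorithm 2: v_i = (omega_Q)^(k_i * x)."""
    return i, omega_pow(k_i * x)

def recombination_vector(parties):
    """Lagrange coefficients at 0, the recombination vector for the threshold case."""
    lam = {}
    for i in parties:
        num, den = 1, 1
        for j in parties:
            if j != i:
                num = num * (-j) % Q
                den = den * (i - j) % Q
        lam[i] = num * pow(den, -1, Q) % Q
    return lam

def dec(shares):
    """Sketch of Algorithm 3: w = prod_{i in A} (v_i)^(lambda_i)."""
    shares = dict(shares)
    lam = recombination_vector(list(shares))
    w = 1
    for i, v in shares.items():
        w *= v ** lam[i]
    return w

a = 7                               # the secret index of chi_a
keys = gen(a, n=2, p=4)
x = 3
w = dec(evaluate(i, keys[i], x) for i in (2, 4))   # any 2 parties suffice here
```

The decoded value `w` agrees with χa(x) up to floating-point error because the exponents only matter modulo Q.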

4.2 General FSS Scheme for Succinct Functions

Since we do not know how to evaluate any function efficiently, we limit ourselves to
succinct functions with respect to the Fourier basis BF . Note that succinct functions
with respect to BF do not coincide with succinct functions with respect to point
functions. Simple periodic functions are typical examples of succinct functions with
respect to BF , which might not be succinct functions with respect to point functions.
As mentioned, some hard-core predicates of one-way functions are succinct functions
with respect to BF .
Let F_{BF ,ℓ} be the class of functions f which can be represented as a linear combination
of at most ℓ basis functions (with respect to BF ), where ℓ is a polynomial in the
security parameter. That is, f has the following form:

\[ f(x) = \sum_{i=1}^{\ell} \beta_i \chi_{a_i}(x). \]

We construct an FSS scheme with general access structure (GenF≤ℓ , Eval F≤ℓ , DecF≤ℓ )
for a function f ∈ F_{BF ,ℓ} as follows. Note that the construction is a simple adaptation
of the Fourier-based FSS scheme over (F2 )^n in [20].

Algorithm 1 GenF1 (1^λ , a)

Choose a random vector r ∈ (Fq )^{d−1} uniformly, where d is the number of columns of M ;
for i = 1 to p do
  mi ← the i-th row of M ;
  ki ← ⟨mi , (a, r)^T ⟩
end for
Return (k1 , . . . , kp ).

Algorithm 2 Eval F1 (i, ki , x)

vi ← (ωq )^{ki x} ;
Return (i, vi ).

Algorithm 3 DecF1 (A, {(i, vi )}_{i∈A} )

Compute a recombination vector λ = (λ1 , . . . , λp )^T from A ;
Return w = Π_{i∈A} (vi )^{λi} .

– GenF≤ℓ (1^λ , f ) : On input the security parameter 1^λ and a function f , the key
generation algorithm (Algorithm 4) outputs p keys (k1 , . . . , kp ).
– Eval F≤ℓ (i, ki , x) : On input a party index i, a key ki , and an input x ∈ Fq , the
evaluation algorithm (Algorithm 5) outputs a value yi , corresponding to the ith
party’s share of f (x).
– DecF≤ℓ (A, {yi }_{i∈A} ) : On input shares {yi }_{i∈A} of parties in a (possibly) qualified set
A, the decoding algorithm (Algorithm 6) outputs the value f (x) for x.
In the above FSS scheme (GenF≤ℓ , Eval F≤ℓ , DecF≤ℓ ) for succinct functions f ∈ F_{BF ,ℓ} ,
we invoke the FSS scheme (GenF1 , Eval F1 , DecF1 ) for basis functions BF , since f can be
represented as a linear combination of at most ℓ basis functions. In this construction,
we distribute each basis function χ_{ai} (x) and each coefficient βi as follows. We invoke
(GenF1 , Eval F1 , DecF1 ) to distribute each basis function χ_{ai} (x) and use any SS scheme
with the same access structure to distribute each coefficient βi .
The correctness of (GenF≤ℓ , Eval F≤ℓ , DecF≤ℓ ) just comes from the correctness of
each FSS scheme (GenF1 , Eval F1 , DecF1 ) for the basis function χ_{ai} (x) and the
correctness of each SS scheme for the coefficients. But some care must be taken. By
assumption, f ∈ F_{BF ,ℓ} has at most ℓ terms. If we represent f as a linear combination
of exactly ℓ terms, some coefficients of basis functions must be zero. Since
the zero-function χ0 (x) = (ωq )^{0·x} = 1, which maps any element x ∈ Fq to 1, can be
partitioned into several functions just as the ordinary basis functions can be, we can
apply (GenF≤ℓ , Eval F≤ℓ , DecF≤ℓ ) as well.

Algorithm 4 GenF≤ℓ (1^λ , f (·) = Σ_{i=1}^{ℓ} βi χ_{ai} (·))

for i = 1 to ℓ do
  (k_1^i , k_2^i , . . . , k_p^i ) ← GenF1 (1^λ , ai ) ;
  (s_1^i , s_2^i , . . . , s_p^i ) ← the sharing phase of some SS scheme, given βi ;
end for
for j = 1 to p do
  Set kj ← (k_j^1 , k_j^2 , . . . , k_j^ℓ ) ;
  Set sj ← (s_j^1 , s_j^2 , . . . , s_j^ℓ ) ;
end for
Return ((k1 , s1 ), . . . , (kp , sp )).

Algorithm 5 Eval F≤ℓ (i, (ki , si ), x)

for j = 1 to ℓ do
  y_i^j ← Eval F1 (i, k_i^j , x) ;
end for
Set yi = (y_i^1 , y_i^2 , . . . , y_i^ℓ ) ;
Return (i, yi , si ).

Algorithm 6 DecF≤ℓ (A, {(i, yi , si )}_{i∈A} )

for i = 1 to ℓ do
  gi ← DecF1 (A, {(j, y_j^i )}_{j∈A} ) ;
  βi ← the reconstruction phase of the SS scheme, on input {s_j^i }_{j∈A} ;
end for
Return g = Σ_{i=1}^{ℓ} βi gi .
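Algorithms 4 to 6 can be sketched end to end on a toy instance. The following is an illustration under several assumptions that are not in the text: Shamir’s threshold scheme over F_13 stands in for the MSP-based sharing of both the indices ai and the coefficients βi, the coefficients are small integers decoded as centered residues modulo Q, and all names are hypothetical.

```python
import cmath
import secrets

Q = 13  # hypothetical small odd prime, larger than the number of parties

def share(secret, n, p):
    """Shamir-share `secret` in F_Q; stands in for the MSP-based sharing."""
    coeffs = [secret % Q] + [secrets.randbelow(Q) for _ in range(n - 1)]
    return {j: sum(c * pow(j, e, Q) for e, c in enumerate(coeffs)) % Q
            for j in range(1, p + 1)}

def recombination_vector(parties):
    """Lagrange coefficients at 0 for the given set of parties."""
    lam = {}
    for i in parties:
        num = den = 1
        for j in parties:
            if j != i:
                num, den = num * (-j) % Q, den * (i - j) % Q
        lam[i] = num * pow(den, -1, Q) % Q
    return lam

def gen(terms, n, p):
    """Sketch of Algorithm 4: `terms` is a list of (beta_i, a_i) pairs."""
    keys = {party: [] for party in range(1, p + 1)}
    for beta, a in terms:
        k, s = share(a, n, p), share(beta, n, p)
        for party in keys:
            keys[party].append((k[party], s[party]))
    return keys

def evaluate(party, key, x):
    """Sketch of Algorithm 5: evaluate each term share, pass coefficient shares on."""
    return [(cmath.exp(2j * cmath.pi * ((k * x) % Q) / Q), s) for k, s in key]

def dec(outputs):
    """Sketch of Algorithm 6: `outputs` maps each party in A to its evaluate() result."""
    lam = recombination_vector(list(outputs))
    total = 0
    for t in range(len(next(iter(outputs.values())))):
        g, beta = 1, 0
        for party, values in outputs.items():
            v, s = values[t]
            g *= v ** lam[party]                  # DecF1 for the t-th basis function
            beta = (beta + s * lam[party]) % Q    # reconstruct the t-th coefficient
        if beta > Q // 2:
            beta -= Q                             # centered representative (assumption)
        total += beta * g
    return total

# Hypothetical target: f(x) = 2 * chi_3(x) - chi_5(x) over F_13.
terms = [(2, 3), (-1, 5)]
keys = gen(terms, n=2, p=3)
x = 4
value = dec({party: evaluate(party, keys[party], x) for party in (1, 3)})
```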

The security of (GenF≤ℓ , Eval F≤ℓ , DecF≤ℓ ) can be discussed as follows. Without
loss of generality, we assume that all parties in a forbidden set B (where |B| = m) get
((k1 , s1 ), . . . , (km , sm )). For any i with 1 ≤ i ≤ ℓ, the m-tuples of the ith elements
of k1 , . . . , km are identically distributed whatever the basis function for the ith term
of the target function is, because the advantage of any adversary against
(GenF1 , Eval F1 , DecF1 ) is 0 as discussed in Sect. 4.1. Moreover, for any i with
1 ≤ i ≤ ℓ, the m-tuples of the ith elements of s1 , . . . , sm are identically distributed
whatever the coefficient for the ith term of the target function is, because of the
perfect security of the underlying SS scheme with the same access structure.
Furthermore, the outputs of several executions of GenF1 (even for the same target basis
function) are independent because each execution of GenF1 uses fresh randomness. Thus,
the distribution of the information that all the parties in B can get is always
the same regardless of the target function f ∈ F_{BF ,ℓ} . This guarantees the security of
(GenF≤ℓ , Eval F≤ℓ , DecF≤ℓ ).

Remark If we do not care about the leakage of the number of terms with nonzero
coefficients for f , we can omit the partitioning of zero-functions, which increases
the efficiency of the scheme.

5 Conclusion

By observing that Fourier-based FSS schemes by Ohsawa et al. [20] are compatible
with linear SS schemes, we have provided Fourier-based FSS schemes with general
access structure, which affirmatively answers the question raised in [20].

Acknowledgements TK is supported in part by JSPS Grants-in-Aid for Scientific Research (A)
JP16H01705 and for Scientific Research (B) JP17H01695.

References

1. Akavia, A., Goldwasser, S., Safra, S.: Proving hard-core predicates using list decoding. In:
Proceeding of the 44th Symposium on Foundations of Computer Science (FOCS 2003), pp.
146–157 (2003)
2. Beimel, A., Chor, B.: Universally ideal secret sharing schemes. IEEE Trans. Inf. Theor. 40(3),
786–794 (1994)
3. Benaloh, J., Leichter, J.: Generalized secret sharing and monotone functions. In: Proceeding
of CRYPTO ’88. Lecture Notes in Computer Science, vol. 403, pp. 27–35. Springer (1990)
4. Blakley, G.R.: Safeguarding cryptographic keys. In: American Federation of Information
Processing Societies: National Computer Conference, pp. 313–317 (1979)
5. Boyle, E., Gilboa, N., Ishai, Y.: Function secret sharing. In: EUROCRYPT 2015. Part II, Lecture
Notes in Computer Science, vol. 9057, pp. 337–367 (2015)
6. Brickell, E.F.: Some ideal secret sharing schemes. In: Proceeding of EUROCRYPT ’89. Lecture
Notes in Computer Science, vol. 434, pp. 468–475. Springer (1990)
7. Brickell, E.F., Davenport, D.M.: On the classification of ideal secret sharing schemes. In:
Proceeding of CRYPTO ’89. Lecture Notes in Computer Science, vol. 435, pp. 278–285.
Springer (1990)
8. Chor, B., Gilboa, N.: Computationally private information retrieval. In: Proceeding of the 29th
Annual Symposium on Theory of Computing (STOC’97), pp. 304–313 (1997)
9. Chor, B., Goldreich, O., Kushilevitz, E., Sudan, M.: Private information retrieval. J. ACM
45(6), 965–981 (1998)
10. Fehr, S.: Span programs over rings and how to share a secret from a module. Master’s thesis,
ETH Zurich, Institute for Theoretical Computer Science (1998)
11. Fehr, S.: Efficient construction of the dual span program. Manuscript (1999)
12. Gilboa, N., Ishai, Y.: Distributed point functions and their applications. In: Proceeding of
EUROCRYPT 2014. Lecture Notes in Computer Science, vol. 8441, pp. 640–658 (2014)
13. Ito, M., Saito, A., Nishizeki, T.: Secret sharing scheme realizing general access structure. In:
Proceeding of IEEE GLOBECOM ’87, pp. 99–102. IEEE Communications Society (1987)
14. Karchmer, M., Wigderson, A.: On span programs. In: Proceeding of the 8th Structures in
Complexity Theory Conference, pp. 102–111. IEEE Computer Society (1993)
15. Kothari, S.C.: Generalized linear threshold scheme. In: Proceeding of CRYPTO ’84. Lecture
Notes in Computer Science, vol. 196, pp. 231–241. Springer (1985)
16. Kushilevitz, E., Ostrovsky, R.: Replication is not needed: single database, computationally-
private information retrieval. In: Proceeding of the 38th IEEE Symposium on Foundations of
Computer Science (FOCS’97), pp. 364–373 (1997)
17. Nikov, V., Nikova, S., Preneel, B.: On multiplicative linear secret sharing schemes. In:
Proceeding of INDOCRYPT 2003. Lecture Notes in Computer Science, vol. 2904, pp. 135–147.
Springer (2003)
18. Nikov, V., Nikova, S.: New Monotone Span Programs from Old. Cryptology ePrint Archive,
Report 2004/282 (2004)
19. O’Donnell, R.: Analysis of Boolean Functions. Cambridge University Press, Cambridge (2014)
20. Ohsawa, T., Kurokawa, N., Koshiba, T.: Function secret sharing using Fourier basis. In:
Proceeding of the 8th International Workshop on Trustworthy Computing and Security (TwCSec-2017).
Lecture Notes on Data Engineering and Communications Technologies, vol. 7, pp. 865–875.
Springer (2018)
21. Shamir, A.: How to share a secret. Commun. ACM 22(11), 612–613 (1979)
22. van Dijk, M.: A linear construction of perfect secret sharing schemes. In: Proceeding of
EUROCRYPT ’94. Lecture Notes in Computer Science, vol. 950, pp. 23–34. Springer (1995)
Chapter 33
A Uniformly Convergent NIPG Method
for a Singularly Perturbed System
of Reaction–Diffusion Boundary-Value
Problems

Gautam Singh and Srinivasan Natesan

Abstract In this article, we study the numerical solution of a singularly perturbed
system of boundary-value problems for second-order ordinary differential equations
of reaction–diffusion type. The solution of these problems exhibits twin boundary
layers at both ends of the domain. To obtain the numerical solution of these
problems, we apply the nonsymmetric discontinuous Galerkin FEM with interior
penalties (NIPG method). Also, we prove that the method is O((N^{−1} ln N )^k ) accurate
in the energy norm on a Shishkin mesh with N intervals and piecewise polynomials
of degree k. Numerical results are presented to support the theoretical
results.

Keywords Singularly perturbed system of reaction–diffusion boundary-value
problems · Shishkin mesh · Discontinuous Galerkin finite element method ·
Uniform convergence

Subject Classification: 65L11 · 65L20 · 65L60 · 65L70

1 Introduction

The numerical solution of singularly perturbed differential equations (SPDEs) has
attracted many researchers in recent years; for more details, one can see the books
by Farrell et al. [1], Miller et al. [5], and Roos et al. [7]. The solution of SPDEs has
a multi-scale character; it varies rapidly inside the boundary layers and slowly
away from them; therefore, classical numerical schemes fail to yield
satisfactory numerical approximations on uniform meshes. Special care has to
be taken to devise parameter-uniform numerical methods for these problems. There
are two types of methods to solve SPDEs; one is known as fitted operator methods
(FOMs) and the other as fitted mesh methods (FMMs); see the book [5] for more
details.

G. Singh (B) · S. Natesan

Department of Mathematics, Indian Institute of Technology, Guwahati 781039, India
e-mail: [email protected]

Classical finite element methods (FEMs) also fail to provide parameter-uniform
numerical solutions to SPDEs on uniform meshes. Either one has to use
exponential basis functions as the trial functions [6], or one has to use layer-adapted
nonuniform meshes for classical FEM [7]. There are several research articles which
deal with the numerical solution of SPDEs by FEM; we cite a few of them [2, 10]
and the references therein.
Recently, researchers have started to apply the nonsymmetric discontinuous Galerkin
method with interior penalty (NIPG method), originally designed for elliptic
equations, to solve SPDEs. Zarin and Roos [9] applied the NIPG method to solve singularly
perturbed 2D convection–diffusion problems with parabolic layers. Zhu et al. [12]
applied the NIPG method to solve singularly perturbed 1D convection–diffusion
BVPs and showed that it converges ε-uniformly in the energy norm with optimal
order. Linß and Madden [3] applied the FEM to a singularly perturbed system of
reaction–diffusion BVPs on S-type meshes and proved that the method is uniformly
convergent with first order.
In this article, we obtain the numerical solution of a singularly perturbed system of
BVPs of reaction–diffusion type by applying the NIPG method on the layer-adapted
piecewise uniform Shishkin mesh. Also, we establish that the proposed method is
ε-uniformly convergent of order k, where k is the degree of the piecewise polynomials
in the finite element space. To support the theoretical findings, we carried out some
numerical experiments; the results are presented in the form of tables.
This paper is organized as follows: In Sect. 2, we describe the model
problem with some basic definitions. In Sect. 3, we apply the NIPG method to the system
of singularly perturbed problems and prove the existence and uniqueness of its solution.
A parameter-uniform error estimate is derived in Sect. 4. Section 5 shows the numerical
results obtained for a test problem.
In this paper, we use C to denote a generic positive constant that is independent of
both the perturbation parameter ε and the mesh parameter N . We shall also assume
that ε ≤ CN^{−1} , as is generally the case.

2 The Model Problem and the Analytical Solution

Here, we consider the following singularly perturbed system of reaction–diffusion
boundary-value problems (BVPs):

\[
\begin{cases}
-\varepsilon^2 u_1''(x) + a_{11}(x) u_1(x) + a_{12}(x) u_2(x) = f_1(x), & x \in \Omega = (0, 1), \\
-\varepsilon^2 u_2''(x) + a_{21}(x) u_1(x) + a_{22}(x) u_2(x) = f_2(x), & \\
u_1(0) = u_2(0) = u_1(1) = u_2(1) = 0, &
\end{cases}
\tag{1}
\]

where 0 < ε ≪ 1 is the perturbation parameter, and the coefficients aij and the source
functions fj are sufficiently smooth functions. We shall assume that the reaction
coefficient matrix A = {aij }_{i,j=1}^{2} is an L0 -matrix with

\[ \min\{a_{11} + a_{12},\ a_{21} + a_{22}\} > \beta^2, \tag{2} \]

i.e., A is an M -matrix whose inverse is bounded by β^{−2} in the maximum norm. The
solution u = (u1 , u2 )^T of (1) has layers at x = 0 and x = 1 of width O(ε |ln ε|).

Lemma 1 The solution u = (u1 , u2 )^T of (1) can be decomposed as u = S + E,
where S and E are the smooth and layer parts, respectively. Then, the bounds on the
smooth and layer components are

\[ |S_i^{(l)}(x)| \le C, \tag{3} \]

\[ |E_i^{(l)}(x)| \le C \varepsilon^{-l} D_\varepsilon(x), \quad \text{for } i = 1, 2 \text{ and } 0 \le l \le p, \tag{4} \]

where Dε (x) = exp(−βx/ε) + exp(−β(1 − x)/ε). Here, p > 0 depends on the
smoothness of the data.

Proof The proof of this lemma can be found in [4].

The space of square integrable functions on an interval K ⊂ R will be denoted by
L2 (K), with the associated inner product

\[ (u, v)_K = \int_K u_1(x) v_1(x)\,dx + \int_K u_2(x) v_2(x)\,dx. \]

We will also use the usual Sobolev space H^k (K) to denote the space of functions
on K whose generalized derivatives of orders 0, 1, 2, . . . , k are in L2 (K), and it is
equipped with the norm ‖·‖_{k,K} and seminorm |·|_{k,K} , respectively. For any vector-valued
function u = (u1 (x), u2 (x))^T , we will write

\[ \|u\|_{k,K}^2 = \|u_1\|_{k,K}^2 + \|u_2\|_{k,K}^2. \]

Let $T_N = \{K_j = (x_{j-1}, x_j) : j = 1, \ldots, N\}$ be a partition of the domain $\Omega$. Associated with $T_N$, we define the discrete Sobolev space of order $s$ by

\[ H^s(\Omega, T_N) = \{u \in L^2(\Omega) : u|_{K_j} \in H^s(K_j),\ \forall K_j \in T_N\}. \]

The discrete Sobolev norm and seminorm are given by

\[ \|u\|_{s,T_N}^2 = \sum_{j=1}^{N} \|u_1\|_{s,K_j}^2 + \sum_{j=1}^{N} \|u_2\|_{s,K_j}^2, \qquad |u|_{s,T_N}^2 = \sum_{j=1}^{N} |u_1|_{s,K_j}^2 + \sum_{j=1}^{N} |u_2|_{s,K_j}^2, \]

where $\|\cdot\|_{s,K_j}$ and $|\cdot|_{s,K_j}$ are the usual Sobolev norm and seminorm defined over the domain $K_j$, respectively.
432 G. Singh and S. Natesan

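To discretize the domain $\Omega = (0, 1)$, we use the layer-adapted piecewise-uniform Shishkin mesh, which is described in the following. We divide the domain into three subdomains as $\bar{\Omega} = \Omega_1 \cup \Omega_2 \cup \Omega_3$, where $\Omega_1$, $\Omega_2$ and $\Omega_3$ are $[0, \tau_\varepsilon]$, $[\tau_\varepsilon, 1 - \tau_\varepsilon]$ and $[1 - \tau_\varepsilon, 1]$, respectively. Here the transition point $\tau_\varepsilon$ is defined by

\[ \tau_\varepsilon = \min\left\{\frac{1}{4},\ \frac{\alpha\varepsilon}{\beta}\ln N\right\}. \]

In this article, we will take $\tau_\varepsilon = (\alpha\varepsilon/\beta)\ln N$.

The step size in each of the subdomains is given by

\[ h_i = \begin{cases} 4\tau_\varepsilon/N, & \text{for } \Omega_1 \text{ and } \Omega_3,\\ 2(1 - 2\tau_\varepsilon)/N, & \text{for } \Omega_2, \end{cases} \]

so that $\Omega_1$ and $\Omega_3$ each contain $N/4$ mesh intervals and $\Omega_2$ contains $N/2$ mesh intervals.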
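The mesh construction above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `shishkin_mesh` and the requirement that $N$ be divisible by 4 are assumptions made here (the step sizes above imply $N/4$ intervals in each layer region), and the parameter values in the usage note are arbitrary.

```python
import math


def shishkin_mesh(N, eps, alpha, beta):
    """Return (nodes, tau) for the piecewise-uniform Shishkin mesh on [0, 1].

    Assumption (not stated in the text): N is divisible by 4, so that each
    layer region holds N/4 intervals and the middle region holds N/2.
    """
    if N % 4 != 0:
        raise ValueError("N must be divisible by 4")
    # Transition point tau = min{1/4, (alpha*eps/beta) ln N}
    tau = min(0.25, (alpha * eps / beta) * math.log(N))
    h_fine = 4.0 * tau / N                   # step size in Omega_1 and Omega_3
    h_coarse = 2.0 * (1.0 - 2.0 * tau) / N   # step size in Omega_2
    quarter = N // 4
    nodes = [j * h_fine for j in range(quarter + 1)]
    nodes += [tau + j * h_coarse for j in range(1, N // 2 + 1)]
    nodes += [1.0 - tau + j * h_fine for j in range(1, quarter + 1)]
    nodes[-1] = 1.0  # remove rounding drift at the right end point
    return nodes, tau
```

For instance, `shishkin_mesh(16, 1e-3, 2.0, 1.0)` returns 17 nodes, of which the fifth is the transition point $\tau_\varepsilon$ and the thirteenth is $1 - \tau_\varepsilon$.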

Let us define the finite element space $V_N^k(\Omega)$ associated with the family $T_N$ of Shishkin meshes by

\[ V_N^k(\Omega) = \{u \in L^2(\Omega) : u|_{K_j} \in P^k(K_j),\ \forall K_j \in T_N\}, \]

where $P^k(K_j)$ denotes the space of polynomials of degree at most $k$ on $K_j$. The functions in $V_N^k(\Omega)$ may be discontinuous at each mesh point.

3 The NIPG Method

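The finite element problem corresponding to (1) by the NIPG method reads:

\[ \text{find } u_h \in V_N^k(\Omega)^2 \text{ such that } B(u_h, v_h) = L(v_h), \quad \forall v_h \in V_N^k(\Omega)^2, \tag{5} \]

with $B(u, v) = B_1(u, v) + B_2(u, v)$, where

\[
\begin{aligned}
B_1(u, v) ={}& \sum_{j=1}^{N} \int_{K_j} \varepsilon^2 u_1'(x)v_1'(x)\,dx + \varepsilon^2 \sum_{j=0}^{N} \big(\{u_1'(x_j)\}[v_1(x_j)] - \{v_1'(x_j)\}[u_1(x_j)]\big)\\
&+ \sum_{j=0}^{N} \sigma_j [u_1(x_j)][v_1(x_j)] + \sum_{j=1}^{N} \int_{K_j} \big(a_{11}(x)u_1(x) + a_{12}(x)u_2(x)\big)v_1(x)\,dx,
\end{aligned}
\]

\[
\begin{aligned}
B_2(u, v) ={}& \sum_{j=1}^{N} \int_{K_j} \varepsilon^2 u_2'(x)v_2'(x)\,dx + \varepsilon^2 \sum_{j=0}^{N} \big(\{u_2'(x_j)\}[v_2(x_j)] - \{v_2'(x_j)\}[u_2(x_j)]\big)\\
&+ \sum_{j=0}^{N} \sigma_j [u_2(x_j)][v_2(x_j)] + \sum_{j=1}^{N} \int_{K_j} \big(a_{21}(x)u_1(x) + a_{22}(x)u_2(x)\big)v_2(x)\,dx,
\end{aligned}
\]

and

\[ L(v) = \sum_{j=1}^{N} \int_{K_j} (f_1 v_1 + f_2 v_2)\,dx. \]

Here $\{\cdot\}$ and $[\cdot]$ denote the average and the jump at a node, and $\sigma_j \ge 0$ ($j = 0, 1, \ldots, N$) is the penalty parameter associated with the node $x_j$.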

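Lemma 2 Let $u$ be the exact solution of the problem (1). Then the bilinear form $B(\cdot, \cdot)$ defined in (5) satisfies the Galerkin orthogonality property

\[ B(u - u_h, v) = 0, \quad \forall v \in V_N^k(\Omega)^2. \]

Proof Since $u$ is the exact solution of (1), we have $[u_1(x_j)] = [u_2(x_j)] = 0$, $0 \le j \le N$, and $[u_1'(x_j)] = [u_2'(x_j)] = 0$, $1 \le j \le N - 1$. Then, for all $v \in V_N^k(\Omega)^2$, we easily get

\[
\begin{aligned}
B_1(u, v) ={}& \sum_{j=1}^{N} \int_{K_j} \varepsilon^2 u_1'(x)v_1'(x)\,dx + \varepsilon^2 \sum_{j=0}^{N} \{u_1'(x_j)\}[v_1(x_j)]\\
&+ \sum_{j=1}^{N} \int_{K_j} a_{11}(x)u_1(x)v_1(x)\,dx + \sum_{j=1}^{N} \int_{K_j} a_{12}(x)u_2(x)v_1(x)\,dx.
\end{aligned}
\]

Similarly, we can write the corresponding expression for $B_2(u, v)$. Using integration by parts and the definition of jump and average, one can show that

\[ \sum_{j=1}^{N} \int_{K_j} \varepsilon^2 u_1'(x)v_1'(x)\,dx = -\sum_{j=1}^{N} \int_{K_j} \varepsilon^2 u_1''(x)v_1(x)\,dx - \varepsilon^2 \sum_{j=0}^{N} \{u_1'(x_j)\}[v_1(x_j)], \]

and

\[ \sum_{j=1}^{N} \int_{K_j} \varepsilon^2 u_2'(x)v_2'(x)\,dx = -\sum_{j=1}^{N} \int_{K_j} \varepsilon^2 u_2''(x)v_2(x)\,dx - \varepsilon^2 \sum_{j=0}^{N} \{u_2'(x_j)\}[v_2(x_j)]. \]

Using the above identities and recalling the model problem (1), we obtain $B(u, v) = L(v) = B(u_h, v)$, that is,

\[ B(u - u_h, v) = 0, \quad \forall v \in V_N^k(\Omega)^2. \]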


A natural norm associated with the bilinear form $B(\cdot, \cdot)$ is the energy norm

\[
\begin{aligned}
|||v|||^2 ={}& \varepsilon^2 \sum_{j=1}^{N} \big(\|v_1'\|_{L^2(K_j)}^2 + \|v_2'\|_{L^2(K_j)}^2\big) + \beta^2 \sum_{j=1}^{N} \big(\|v_1\|_{L^2(K_j)}^2 + \|v_2\|_{L^2(K_j)}^2\big)\\
&+ \sum_{j=0}^{N} \sigma_j [v_1(x_j)]^2 + \sum_{j=0}^{N} \sigma_j [v_2(x_j)]^2,
\end{aligned}
\tag{6}
\]

where $\sigma_j$ are the penalty parameters and $\beta$ is such that $\min\{a_{11} + a_{12}, a_{21} + a_{22}\} > \beta^2$. It is easy to show that the bilinear form given in (5) satisfies the coercivity condition, and hence, it can be shown that

\[ |||u_h||| \le \|f\|_{L^2(\Omega)}. \]

Using the above bound, one can show the uniqueness of the solution of (5); since (5) is a square linear system, the rank-nullity theorem then yields the existence of the solution.

4 Error Analysis

In this section, we perform the error analysis for the NIPG method (5). We will show that the NIPG method possesses an optimal order of convergence. Following [8], we introduce a special interpolant on each element $K_j$: for any $w \in C(\bar{K}_j)$, we define $k + 1$ nodal functionals $N_l$ by

\[ N_0(w) = w(x_{j-1}), \qquad N_k(w) = w(x_j), \]

\[ N_l(w) = \frac{1}{(x_j - x_{j-1})^{l}} \int_{x_{j-1}}^{x_j} (x - x_{j-1})^{l-1} w(x)\,dx, \quad l = 1, \ldots, k - 1. \]

A local interpolant $w_I|_{K_j} \in P^k(K_j)$ is now defined by $N_l(w_I - w) = 0$, $l = 0, \ldots, k$.

Lemma 3 [8] The interpolation error satisfies the following bounds:

\[ |u - u_I|_{l,K_j} \le C h_j^{k+1-l} |u|_{k+1,K_j}, \quad l = 0, 1, \ldots, k + 1, \quad \forall u \in H^{k+1}(K_j), \]
\[ \|u - u_I\|_{L^\infty(K_j)} \le C h_j^{k+1} |u|_{k+1,\infty,K_j}, \quad \forall u \in W^{k+1,\infty}(K_j), \]

where $K_j$ are the elements of $T_N$ and $h_j$ are their lengths.

Lemma 4 Let $S_I$ and $E_I$ be the interpolants of $S$ and $E$, respectively. Then, we can write $u_I = S_I + E_I$, and the estimates

\[ \|u_m - u_{m,I}\|_{L^\infty(\Omega_i)} \le C(N^{-1}\ln N)^{k+1}, \quad \text{for } i = 1, 3, \tag{7} \]
\[ \|(S_m - S_{m,I})^{(l)}\|_{L^2(\Omega)} \le CN^{l-(k+1)}, \quad \text{for } l = 0, 1, 2, \ldots, k, \tag{8} \]
\[ \|E_m\|_{L^\infty(\Omega_2)} + \varepsilon^{-1/2}\|E_m\|_{L^2(\Omega_2)} \le CN^{-\alpha}, \tag{9} \]
\[ \|E_m^{(l)}\|_{L^2(\Omega_2)} \le C\varepsilon^{1/2-l}N^{-\alpha}, \tag{10} \]
\[ N^{-1}\|E_{m,I}'\|_{L^2(\Omega_2)} + \|E_{m,I}\|_{L^2(\Omega_2)} \le C(\varepsilon^{1/2}N^{-\alpha} + N^{-(1/2+\alpha)}), \tag{11} \]
\[ \|u_m - u_{m,I}\|_{L^\infty(\Omega_2)} \le C(N^{-(k+1)} + N^{-\alpha}), \tag{12} \]

hold true for $m = 1, 2$.

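Proof By the linearity of the interpolation operator, we can write $u_I = (S + E)_I = S_I + E_I$. Using Lemma 3, we have

\[ \|u_m - u_{m,I}\|_{L^\infty(\Omega_i)} \le C h_j^{k+1} |u_m^{(k+1)}|_{L^\infty(\Omega_i)}. \tag{13} \]

Using the solution decomposition and the bounds (3) and (4) on the smooth and layer parts, we obtain

\[ |u_m^{(k+1)}|_{L^\infty(\Omega_i)} \le |S_m^{(k+1)}|_{L^\infty(\Omega_i)} + |E_m^{(k+1)}|_{L^\infty(\Omega_i)} \le C\varepsilon^{-(k+1)} D_\varepsilon(x). \tag{14} \]

The interval length on the fine part of the mesh is $h_j = 4\tau_\varepsilon/N = (4\alpha\varepsilon/\beta)N^{-1}\ln N$. From (13) and (14), we obtain

\[ \|u_m - u_{m,I}\|_{L^\infty(\Omega_i)} \le C(N^{-1}\ln N)^{k+1} \quad \text{for } i = 1, 3. \]

Next, by using Lemma 3 for the smooth part of the solution and the bound (3) on the smooth part, we get

\[ \|(S_m - S_{m,I})^{(l)}\|_{L^2(\Omega)} \le CN^{l-(k+1)} |S_m|_{k+1,\Omega} \le CN^{l-(k+1)}. \]

To prove (9), we first note that

\[ \|E_m\|_{L^\infty(\Omega_2)} \le C \max_{[\tau_\varepsilon, 1-\tau_\varepsilon]} \big(\exp(-\beta x/\varepsilon) + \exp(-\beta(1 - x)/\varepsilon)\big) \le CN^{-\alpha} \quad (\text{using } \tau_\varepsilon = (\alpha\varepsilon/\beta)\ln N). \]

By using the definition of the $L^2$ norm and the value of $\tau_\varepsilon$, we obtain

\[
\begin{aligned}
\|E_m\|_{L^2(\Omega_2)}^2 &\le C \int_{\tau_\varepsilon}^{1-\tau_\varepsilon} (D_\varepsilon(x))^2\,dx\\
&\le C \int_{\tau_\varepsilon}^{1-\tau_\varepsilon} \big(\exp(-2\beta x/\varepsilon) + \exp(-2\beta(1 - x)/\varepsilon)\big)\,dx\\
&\le C\varepsilon N^{-2\alpha}.
\end{aligned}
\]

Taking the square root in the last bound and adding the two inequalities, we get

\[ \|E_m\|_{L^\infty(\Omega_2)} + \varepsilon^{-1/2}\|E_m\|_{L^2(\Omega_2)} \le CN^{-\alpha}. \]

In a similar way, we can prove (10). The proofs of the estimates (11)–(12) can be established following the idea used in Lemma 12 of [8].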

To obtain the error estimate of the NIPG method, we decompose the error into two parts, as

\[ |||u - u_h||| \le |||u - u_I||| + |||u_I - u_h|||. \]

Let $\eta = u - u_I$ and $\xi = u_I - u_h$. To bound the interpolation error at the mesh points, we will use the following lemma.

Lemma 5 [11] For any $w \in H^1(K_j)$, we have the following bound:

\[ |w(x_s)|^2 \le 2\big(h_j^{-1}\|w\|_{L^2(K_j)}^2 + \|w\|_{L^2(K_j)}\|w'\|_{L^2(K_j)}\big), \quad s \in \{j - 1, j\}. \]

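Lemma 6 Assuming $\alpha = k + 1$, we have the following estimates for $\{\eta_m'\}$:

\[ \{\eta_m'(x_j)\}^2 \le \begin{cases} C\varepsilon^{-2}(N^{-1}\ln N)^{2k}, & \text{for } x_j \in \Omega_1 \cup \Omega_3,\\ C\varepsilon^{-2}N^{-(2k+1)}, & \text{for } x_j \in \Omega_2, \end{cases} \]

where $\eta_m = u_m - u_{m,I}$.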

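Proof Using Lemma 5, we can write

\[
\begin{aligned}
\{\eta_m'(x_j)\}^2 &= \tfrac{1}{4}\big(\eta_m'(x_j^-) + \eta_m'(x_j^+)\big)^2 \le \tfrac{1}{2}\big(\eta_m'(x_j^-)^2 + \eta_m'(x_j^+)^2\big)\\
&\le h_j^{-1}\|\eta_m'\|_{L^2(K_j)}^2 + \|\eta_m'\|_{L^2(K_j)}\|\eta_m''\|_{L^2(K_j)} + h_{j+1}^{-1}\|\eta_m'\|_{L^2(K_{j+1})}^2 + \|\eta_m'\|_{L^2(K_{j+1})}\|\eta_m''\|_{L^2(K_{j+1})}.
\end{aligned}
\]

It remains to estimate $\|\eta_m'\|_{L^2(K_j)}$ and $\|\eta_m''\|_{L^2(K_j)}$ separately. Using (8), we obtain

\[ \|(S_m - S_{m,I})'\|_{L^2(K_j)} \le CN^{-k}, \qquad \|(S_m - S_{m,I})''\|_{L^2(K_j)} \le CN^{-(k-1)}, \quad \text{for all } K_j,\ j = 1, \ldots, N. \]

In order to obtain the bounds for $\|\eta_m'\|_{L^2(K_j)}$ and $\|\eta_m''\|_{L^2(K_j)}$, it remains to estimate $\|(E_m - E_{m,I})'\|_{L^2(K_j)}$ and $\|(E_m - E_{m,I})''\|_{L^2(K_j)}$ inside and outside the layer regions.
First, we estimate these quantities outside the layers, that is, for $K_j \subset \Omega_2$. By using (10) and (11) and the facts that $\varepsilon \le N^{-1}$ and $\alpha = k + 1$, we obtain

\[ \|(E_m - E_{m,I})'\|_{L^2(K_j)} \le \|E_m'\|_{L^2(K_j)} + \|E_{m,I}'\|_{L^2(K_j)} \le C\varepsilon^{-1/2}N^{-(k+1)}. \]

Similarly, using the inverse inequality and the facts that $\varepsilon \le N^{-1}$ and $\alpha = k + 1$, we obtain

\[
\begin{aligned}
\|(E_m - E_{m,I})''\|_{L^2(K_j)} &\le \|E_m''\|_{L^2(K_j)} + \|E_{m,I}''\|_{L^2(K_j)}\\
&\le \|E_m''\|_{L^2(K_j)} + C h_j^{-1}\|E_{m,I}'\|_{L^2(K_j)}\\
&\le C\varepsilon^{-3/2}N^{-(k+1)}.
\end{aligned}
\]

Now, we estimate $\|(E_m - E_{m,I})'\|_{L^2(K_j)}$ and $\|(E_m - E_{m,I})''\|_{L^2(K_j)}$ inside the boundary layer regions, that is, for $K_j \subset \Omega_1 \cup \Omega_3$. By using Lemmas 1 and 3, we have for $l = 1, 2$

\[
\begin{aligned}
\|(E_m - E_{m,I})^{(l)}\|_{L^2(K_j)} &\le C h_j^{k+1-l}\|E_m^{(k+1)}\|_{L^2(K_j)}\\
&\le C h_j^{k+1-l}\left(\int_{K_j} \varepsilon^{-2(k+1)} D_\varepsilon^2(x)\,dx\right)^{1/2}\\
&\le C\varepsilon^{1/2-l}(N^{-1}\ln N)^{k+3/2-l}.
\end{aligned}
\]

From the triangle inequality and the above estimates, we obtain

\[ \|\eta_m'\|_{L^2(K_j)} \le \begin{cases} C\varepsilon^{-1/2}(N^{-1}\ln N)^{k+1/2}, & \text{for } K_j \subset \Omega_1 \cup \Omega_3,\\ C\varepsilon^{-1/2}N^{-k}(\varepsilon^{1/2} + N^{-1}), & \text{for } K_j \subset \Omega_2, \end{cases} \]

and

\[ \|\eta_m''\|_{L^2(K_j)} \le \begin{cases} C\varepsilon^{-3/2}(N^{-1}\ln N)^{k-1/2}, & \text{for } K_j \subset \Omega_1 \cup \Omega_3,\\ C\varepsilon^{-3/2}N^{1-k}(\varepsilon^{3/2} + N^{-2}), & \text{for } K_j \subset \Omega_2. \end{cases} \]

Inserting these estimates into the first inequality of the proof gives the desired result.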

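Theorem 1 By taking $\alpha = k + 1$, we have the following interpolation error bound:

\[ |||\eta||| \le C(N^{-1}\ln N)^{k}, \]

where $\eta = u - u_I$.

Proof Because $u - u_I$ is continuous in $\bar{\Omega}$, we have $[\eta_1(x_j)] = [\eta_2(x_j)] = 0$, $j = 0, \ldots, N$. Then,

\[ |||\eta|||^2 = \varepsilon^2 \sum_{j=1}^{N} \|\eta'\|_{L^2(K_j)}^2 + \beta^2 \sum_{j=1}^{N} \|\eta\|_{L^2(K_j)}^2. \]

By Lemma 4, we can easily conclude that

\[
\begin{aligned}
\|u - u_I\|_{L^2(\Omega)} &\le |\Omega_1|^{1/2}\|u - u_I\|_{L^\infty(\Omega_1)} + |\Omega_2|^{1/2}\|u - u_I\|_{L^\infty(\Omega_2)} + |\Omega_3|^{1/2}\|u - u_I\|_{L^\infty(\Omega_3)}\\
&\le \tau_\varepsilon^{1/2}\|u - u_I\|_{L^\infty(\Omega_1)} + \|u - u_I\|_{L^\infty(\Omega_2)} + \tau_\varepsilon^{1/2}\|u - u_I\|_{L^\infty(\Omega_3)}\\
&\le C(N^{-1}\ln N)^{k}.
\end{aligned}
\]

Similarly, we can show that $\varepsilon|u - u_I|_{1,\Omega} \le C(N^{-1}\ln N)^{k}$. Hence, we have

\[ |||\eta||| \le C(N^{-1}\ln N)^{k}. \]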

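Theorem 2 Let $\xi = u_I - u_h$. Then, $\xi$ satisfies the following error bound:

\[ |||\xi||| \le C(N^{-1}\ln N)^{k}. \]

Proof We bound the terms of the bilinear form given in (5) one by one. For the first term,

\[
\begin{aligned}
\sum_{j=1}^{N} \int_{K_j} \varepsilon^2 \eta_1'(x)\xi_1'(x)\,dx &\le \left(\sum_{j=1}^{N} \int_{K_j} \varepsilon^2 (\eta_1')^2(x)\,dx\right)^{1/2} \left(\sum_{j=1}^{N} \int_{K_j} \varepsilon^2 (\xi_1')^2(x)\,dx\right)^{1/2}\\
&\le C\varepsilon\|\eta'\|_{L^2(\Omega)}\,|||\xi||| \le C(N^{-1}\ln N)^{k}\,|||\xi|||,
\end{aligned}
\]

and for the second term

\[
\begin{aligned}
\varepsilon^2 \sum_{j=0}^{N} \{\eta_1'(x_j)\}[\xi_1(x_j)] &\le \left(\sum_{j=0}^{N} \frac{\varepsilon^4}{\sigma_j}\{\eta_1'(x_j)\}^2\right)^{1/2} \left(\sum_{j=0}^{N} \sigma_j [\xi_1(x_j)]^2\right)^{1/2}\\
&\le C(N^{-1}\ln N)^{k}\,|||\xi|||.
\end{aligned}
\]

Similarly, we can show that the third and fourth terms satisfy

\[ \sum_{j=1}^{N} \int_{K_j} a_{11}(x)\eta_1(x)\xi_1(x)\,dx \le C(N^{-1}\ln N)^{k}\,|||\xi|||, \]

\[ \sum_{j=1}^{N} \int_{K_j} a_{12}(x)\eta_2(x)\xi_1(x)\,dx \le C(N^{-1}\ln N)^{k}\,|||\xi|||. \]

Hence, we have $|||\xi||| \le C(N^{-1}\ln N)^{k}$.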

From Theorems 1 and 2, we obtain the parameter-uniform error estimate for the NIPG method, which is given in the following theorem.

Theorem 3 Let $u$ and $u_h$ be the solutions of the continuous and discrete problems, respectively. Then,

\[ |||u - u_h||| \le C(N^{-1}\ln N)^{k}. \]

5 Numerical Result

In this section, we verify our convergence result experimentally by considering the numerical solution of a constant-coefficient problem.

Example 1 Consider the following singularly perturbed system of BVPs:

\[
\begin{cases}
-\varepsilon^2 u_1''(x) + 2u_1(x) - u_2(x) = f_1(x), & x \in \Omega = (0, 1),\\
-\varepsilon^2 u_2''(x) - u_1(x) + 2u_2(x) = f_2(x), & x \in \Omega,\\
u_1(0) = u_2(0) = u_1(1) = u_2(1) = 0,
\end{cases}
\tag{15}
\]

where $f_1(x)$, $f_2(x)$ are chosen such that

\[ u_1(x) = \frac{2}{1 + \exp(-1/\varepsilon)}\big(\exp(-x/\varepsilon) + \exp(-(1 - x)/\varepsilon)\big) - 2, \]
\[ u_2(x) = \frac{1}{1 + \exp(-1/\varepsilon)}\big(\exp(-x/\varepsilon) + \exp(-(1 - x)/\varepsilon)\big) - 1, \]

is the exact solution of (15).
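As a sanity check on the test problem, the following Python sketch evaluates the exact solution and the corresponding source terms. The closed forms used for $f_1$ and $f_2$ are not stated in the text; they are obtained here by substituting the exact solution into (15), writing $E(x) = (\exp(-x/\varepsilon) + \exp(-(1-x)/\varepsilon))/(1 + \exp(-1/\varepsilon))$ so that $\varepsilon^2 E'' = E$, which gives $f_1 = E - 3$ and $f_2 = -E$. All function names are illustrative.

```python
import math


def layer(x, eps):
    """E(x) = (exp(-x/eps) + exp(-(1-x)/eps)) / (1 + exp(-1/eps))."""
    return (math.exp(-x / eps) + math.exp(-(1.0 - x) / eps)) / (1.0 + math.exp(-1.0 / eps))


def u1(x, eps):
    return 2.0 * layer(x, eps) - 2.0


def u2(x, eps):
    return layer(x, eps) - 1.0


def f1(x, eps):
    # Derived here (assumption): -eps^2 u1'' + 2 u1 - u2 with eps^2 E'' = E
    # gives -2E + (4E - 4) - (E - 1) = E - 3.
    return layer(x, eps) - 3.0


def f2(x, eps):
    # Derived here (assumption): -eps^2 u2'' - u1 + 2 u2
    # gives -E - (2E - 2) + (2E - 2) = -E.
    return -layer(x, eps)


def residual1(x, eps, h=1e-3):
    """Finite-difference residual of the first equation of (15) at x."""
    d2 = (u1(x + h, eps) - 2.0 * u1(x, eps) + u1(x - h, eps)) / h**2
    return -eps**2 * d2 + 2.0 * u1(x, eps) - u2(x, eps) - f1(x, eps)
```

With these definitions, `u1` and `u2` vanish at both end points, and `residual1(0.5, 0.1)` is small (limited only by the finite-difference truncation error).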

We calculate the error in the energy norm defined in (6). The numerical convergence rate is computed by using the formula $r = \ln(e_N/e_{2N})/\ln 2$, where $e_N$ is the computed error with $N$ mesh intervals. Tables 1 and 2 provide the numerical results for finite element polynomials of degree $k = 1, 2$.
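The rate formula above is easy to reproduce. The short Python sketch below applies it to the errors reported in the first row of Table 1 ($\varepsilon = 10^{-2}$, $k = 1$); the function name `convergence_rates` is an illustrative choice.

```python
import math


def convergence_rates(errors):
    """Rates r = ln(e_N / e_2N) / ln 2 for errors listed at N, 2N, 4N, ..."""
    return [math.log(e_n / e_2n) / math.log(2.0)
            for e_n, e_2n in zip(errors, errors[1:])]


# Energy-norm errors from Table 1 for eps = 1e-2 and N = 16, 32, 64, 128, 256
errors_eps_1e2 = [4.1102e-01, 2.1731e-01, 1.0850e-01, 5.2935e-02, 2.5923e-02]
rates = convergence_rates(errors_eps_1e2)
```

The computed values agree with the tabulated rates 0.9194, 1.0021, 1.0353 and 1.0300 up to rounding of the reported errors.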

Table 1 Energy norm error for Example 1 for k = 1 (for each ε, the first row gives the error and the second row the convergence rate)

ε          N = 16        N = 32        N = 64        N = 128       N = 256
10^{-2}    4.1102e-01    2.1731e-01    1.0850e-01    5.2935e-02    2.5923e-02
           0.9194        1.0021        1.0353        1.0300
10^{-3}    4.0432e-01    2.1116e-01    9.9547e-02    4.3265e-02    1.7935e-02
           0.9371        1.0849        1.2022        1.2705
10^{-4}    4.0519e-01    2.1328e-01    1.0250e-01    4.5540e-02    1.8899e-02
           0.9258        1.0571        1.1704        1.2689
...        ...           ...           ...           ...           ...
10^{-10}   4.0530e-01    2.1356e-01    1.0307e-01    4.6427e-02    1.9855e-02
           0.92434       1.0511        1.1505        1.2255

Table 2 Energy norm error for Example 1 for k = 2 (for each ε, the first row gives the error and the second row the convergence rate)

ε          N = 16        N = 32        N = 64        N = 128       N = 256
10^{-1}    1.0326e-02    2.3568e-03    5.4424e-04    1.2900e-04    3.1266e-05
           2.1314        2.1145        2.0769        2.0447
10^{-2}    2.6762e-02    9.7779e-03    2.9795e-03    8.0849e-04    2.0313e-04
           1.4526        1.7144        1.8818        1.9928
10^{-3}    2.5394e-02    9.2121e-03    2.8263e-03    7.6634e-04    1.9307e-04
           1.4629        1.7046        1.8828        1.9889
...        ...           ...           ...           ...           ...
10^{-9}    2.5333e-02    9.1889e-03    2.8190e-03    7.6956e-04    1.8028e-04
           1.4631        1.7047        1.8731        2.0938
10^{-10}   2.5309e-02    9.1968e-03    2.7990e-03    7.6424e-04    1.4193e-04
           1.4604        1.7162        1.8728        2.4289

References

1. Farrell, P.A., Hegarty, A.F., Miller, J.J.H., O’Riordan, E., Shishkin, G.I.: Robust Computational
Techniques for Boundary Layers. Chapman & Hall/CRC Press, Boca Raton (2000)
2. Lin, R., Stynes, M.: A balanced finite element method for singularly perturbed reaction-
diffusion problems. SIAM J. Numer. Anal. 50(5), 2729–2743 (2012)
3. Linß, T., Madden, N.: A finite element analysis of a coupled system of singularly perturbed
reaction–diffusion equations. Appl. Math. Comput. 148(3), 869–880 (2004)
4. Madden, N., Stynes, M.: A uniformly convergent numerical method for a coupled system of two
singularly perturbed linear reaction-diffusion problems. IMA J. Numer. Anal. 23(4), 627–644
(2003)
5. Miller, J.J.H., O’Riordan, E., Shishkin, G.I.: Fitted Numerical Methods for Singular Perturba-
tion Problems. World Scientific, Singapore (1996)
6. O’Riordan, E., Stynes, M.: An analysis of some exponentially fitted finite element methods for
singularly perturbed elliptic problems. In: Computational Methods for Boundary and Interior
Layers in Several Dimensions, volume 1 of Adv. Comput. Methods Bound. Inter. Layers, pp.
138–153. Boole, Dublin (1991)
7. Roos, H.-G., Stynes, M., Tobiska, L.: Robust Numerical Methods for Singularly Perturbed
Differential Equations, vol. 24, 2nd edn. Springer Series in Computational Mathematics, Berlin
(2008)
8. Tobiska, L.: Analysis of a new stabilized higher order finite element method for advection-
diffusion equations. Comput. Methods Appl. Mech. Engrg. 196(1–3), 538–550 (2006)
9. Zarin, H., Roos, H.-G.: Interior penalty discontinuous approximations of convection-diffusion
problems with parabolic layers. Numer. Math. 100(4), 735–759 (2005)
10. Zhang, Z.: Finite element superconvergence approximation for one-dimensional singularly
perturbed problems. Numer. Methods Partial Differ. Equ. 18(3), 374–395 (2002)
11. Zhu, P., Xie, Z., Zhou, S.: A coupled continuous-discontinuous FEM approach for convection
diffusion equations. Acta Math. Sci. Ser. B Engl. Ed. 31(2), 601–612 (2011)
12. Zhu, P., Yang, Y., Yin, Y.: Higher order uniformly convergent NIPG methods for 1-d singularly
perturbed problems of convection–diffusion type. Appl. Math. Model. 39(22), 6806–6816
(2015)
Chapter 34
On Solving Bimatrix Games
with Triangular Fuzzy Payoffs

Subrato Chakravorty and Debdas Ghosh

Abstract The aim of this paper is to introduce the concept of bimatrix fuzzy games.
The fuzzy games are defined by payoff matrices constructed using triangular fuzzy
numbers. The bimatrix fuzzy game discussed in this paper differs from the one
given by Maeda and Cunlin in that it is not a zero-sum game, and two different
payoff matrices are provided for the two players. Three kinds of Nash equilibria
are introduced for fuzzy games, and their existence conditions are studied. A solution
method for bimatrix fuzzy games is given using crisp parametric bimatrix games.
Finally, a numerical example is discussed to support the model described in the paper.

Keywords Bimatrix games · Nash equilibrium · Fuzzy set theory · Fuzzy games ·
Non-cooperative games

1 Introduction

Game theory is the study of mathematical models for conflict resolution among
intelligent decision makers with applications in economics, finance, management,
engineering, etc. It can be broadly classified into cooperative and non-cooperative
games. In cooperative games, we have alliances that can be externally enforced,
whereas in non-cooperative games, only self-enforcing alliances are permitted. In
1951, Nash [1] gave the solution concept of Nash equilibrium for non-cooperative
games. Since then, Nash equilibrium has been one of the most fundamental concepts
in game theory. An equilibrium strategy in a non-cooperative game is said to be
a Nash equilibrium strategy if a player cannot improve its payoff by changing its
strategy provided that all other players keep their strategy constant.

S. Chakravorty · D. Ghosh (B)

Department of Mechanical Engineering, Indian Institute of Technology (BHU),
Varanasi 221005, Uttar Pradesh, India
e-mail: [email protected]
S. Chakravorty
e-mail: [email protected]

In traditional game theory, payoffs are assumed to be precise and well known to
all the players. But in reality, due to the complexity of the problems, or due to lack of
adequate information and imprecision in the knowledge of the environment, payoffs
cannot be defined precisely. The lack of precision and certainty in parameters is
modeled using various ways such as interval games, stochastic games, fuzzy games.
In our paper, we deal with matrix games which have only two players and model our
payoffs using fuzzy numbers.
Many excellent works have contributed to the field of fuzzy games. Butnariu
[2, 3] modeled the beliefs of each player about other players’ strategies as fuzzy sets
and determined the equilibrium strategies based on the fuzzy preference relations
of the investment of the players' pure strategies. Campos [4], in his analysis of
two-person zero-sum games with fuzzy payoffs, converted the solution of the game into
a fuzzy linear programming problem using Yager’s ranking index [5]. Li [6, 7]
gave an efficient method to solve matrix games with triangular fuzzy numbers as
payoffs. The fuzzy game value is taken as a triangular fuzzy number. Using the
duality theorem of linear programming (LP), the fuzzy game value is computed
by solving derived LP models. These LP models are defined using 1-cut and 0-
cut of fuzzy payoffs. Sakawa and Nishizaki [8] investigated single-objective and
multi-objective games with fuzzy goals and fuzzy payoffs. The models in [8] are
transformed into a fractional programming problem and ultimately solved using a
relaxed method. Vijay et al. [9, 10] showed that using a suitable defuzzification
function, solution to zero-sum matrix games is equivalent to primal-dual pair of a
fuzzy linear programming problem. Larbani [11, 12] investigated non-cooperative
games with payoff functions involving fuzzy parameters. The equilibrium strategy
in [11] considers the aspect of conflict as well as the aspect of decision making under
uncertainty concerning the use of fuzzy parameters.
Maeda [13, 14] characterized the Nash equilibrium [1] for games with symmetric
triangular fuzzy numbers as payoffs into three kinds using fuzzy max order rela-
tion. Cunlin and Qiang [15] extended the results in [13] for asymmetric triangular
fuzzy numbers which was further extended by Dutta and Gupta [16] for asymmetric
trapezoidal fuzzy numbers.
The above results investigate two-person zero-sum games in the fuzzy domain. In crisp
two-person zero-sum games, one player's gain is another player's loss. This fact is
hard to accept in fuzzy games: fuzzy numbers are used to model uncertainty in
payoffs, and to say with certainty that one player's gain is exactly equal to another
player's loss seems unrealistic in the fuzzy domain. Hence, the idea of a zero-sum
game using fuzzy numbers as payoffs is seldom realized. In this paper, our goal is
to extend the models of Maeda [13, 14] and Cunlin [15] to bimatrix games, where
we have a different payoff matrix for each player. These payoff matrices are formed
using triangular fuzzy numbers. The rest of the paper is arranged as follows. Section
2 briefly gives the definitions of some preliminaries needed to formulate a bimatrix
fuzzy game. Section 3 provides the definitions for three different types of equilibrium
strategies, namely Nash equilibrium, non-dominated Nash equilibrium, and weak
non-dominated Nash equilibrium. Conditions for the existence of these strategies are
given, and the relation between bimatrix fuzzy games and parametric bimatrix games
is established. Further, we introduce the concept of a new parametric crisp bimatrix
game whose payoffs are constructed using alpha cuts of triangular fuzzy numbers.
In Sect. 4, a numerical example of a fuzzy bimatrix game is discussed. Section 5
concludes the paper.

2 Preliminaries

In this section, we summarize the basic concepts of fuzzy numbers, triangular fuzzy
numbers as given by Zadeh [17] and give a ranking method for the same.

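Definition 1 (Fuzzy number [15]) A fuzzy set $\tilde{a}$ defined on the space of real numbers $\mathbb{R}$ is said to be a fuzzy number if its membership function $\mu_{\tilde{a}}(x)$ satisfies the following conditions:
(i) $\mu_{\tilde{a}}(x)$ is a mapping from $\mathbb{R}$ to the closed interval $[0, 1]$;
(ii) there exists a unique real number $c$, called the center of $\tilde{a}$, such that
a. $\mu_{\tilde{a}}(c) = 1$,
b. $\mu_{\tilde{a}}(x)$ is non-decreasing on $(-\infty, c]$,
c. $\mu_{\tilde{a}}(x)$ is non-increasing on $[c, \infty)$.
The α-cut or α-level set of a fuzzy number $\tilde{a}$ for $\alpha \in (0, 1]$ is given as $\tilde{a}_\alpha = \{x \mid \mu_{\tilde{a}}(x) \ge \alpha,\ x \in \mathbb{R}\}$. The 0-level α-cut $\tilde{a}_0$ is known as the support of $\tilde{a}$ and is given by the closure of $\{x \mid \mu_{\tilde{a}}(x) > 0,\ x \in \mathbb{R}\}$. We write $a_\alpha^R = \sup \tilde{a}_\alpha$ and $a_\alpha^L = \inf \tilde{a}_\alpha$, so that $\tilde{a}_\alpha = [a_\alpha^L, a_\alpha^R]$.
Let $\tilde{a}$, $\tilde{b}$ be two fuzzy numbers and $c$ be a real number. The sum of $\tilde{a}$ and $\tilde{b}$ and the scalar product of $c$ with $\tilde{a}$ are defined as follows:
(i) $\mu_{\tilde{a}+\tilde{b}}(x) = \sup_{x=u+v} \min(\mu_{\tilde{a}}(u), \mu_{\tilde{b}}(v))$,
(ii) $\mu_{c\tilde{a}}(x) = \max\{\sup_{x=cu} \mu_{\tilde{a}}(u), 0\}$, with $\sup\{\emptyset\} = -\infty$.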

Definition 2 (Triangular fuzzy number) A fuzzy number $\tilde{a}$ is said to be a triangular fuzzy number if its membership function is given by

\[
\mu_{\tilde{a}}(x) =
\begin{cases}
\dfrac{x - l}{m - l}, & l \le x \le m,\\[1ex]
\dfrac{x - n}{m - n}, & m \le x \le n,\\[1ex]
0, & \text{otherwise.}
\end{cases}
\]

From now onwards, we will denote a triangular fuzzy number by $(l, m, n)$. We consider $F$ to be the set of all triangular fuzzy numbers.

Lemma 1 Let $\tilde{a} = (l_1, m_1, n_1)$, $\tilde{b} = (l_2, m_2, n_2) \in F$ and $c \in \mathbb{R}_+$. Then,
(i) $\tilde{a} + \tilde{b} = (l_1 + l_2, m_1 + m_2, n_1 + n_2)$,
(ii) $c\tilde{a} = (cl_1, cm_1, cn_1)$.
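The definitions above translate directly into a small data type. The following Python sketch is illustrative only: the class name `TFN` and its method names are choices made here, and the α-cut formula $[l + \alpha(m - l),\ n - \alpha(n - m)]$ is obtained by inverting the two linear branches of the membership function in Definition 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TFN:
    """Triangular fuzzy number (l, m, n) with l <= m <= n (illustrative type)."""
    l: float
    m: float
    n: float

    def membership(self, x):
        """Membership function of Definition 2."""
        if self.l <= x <= self.m:
            # A degenerate left branch (l == m) has membership 1 at x = m.
            return 1.0 if self.m == self.l else (x - self.l) / (self.m - self.l)
        if self.m < x <= self.n:
            return (x - self.n) / (self.m - self.n)
        return 0.0

    def alpha_cut(self, alpha):
        """End points (a_alpha^L, a_alpha^R) of the alpha-cut."""
        return (self.l + alpha * (self.m - self.l),
                self.n - alpha * (self.n - self.m))

    def __add__(self, other):
        # Lemma 1 (i): component-wise addition
        return TFN(self.l + other.l, self.m + other.m, self.n + other.n)

    def scale(self, c):
        # Lemma 1 (ii): multiplication by a non-negative scalar
        if c < 0:
            raise ValueError("Lemma 1 (ii) is stated for c >= 0 only")
        return TFN(c * self.l, c * self.m, c * self.n)
```

For example, `TFN(10, 12, 17) + TFN(5, 9, 13)` gives `TFN(15, 21, 30)`, and the 0.5-cut of `TFN(10, 12, 17)` is the interval with end points 11 and 14.5.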

Definition 3 Let $x = (x_1, x_2, x_3, \ldots, x_n)$, $y = (y_1, y_2, y_3, \ldots, y_n) \in \mathbb{R}^n$. Then, we write
(i) $x \leqq y$ if $x_i \le y_i$ $\forall i = 1, 2, \ldots, n$,
(ii) $x \le y$ if $x_i \le y_i$ $\forall i = 1, 2, \ldots, n$ and $x \ne y$, and
(iii) $x < y$ if $x_i < y_i$ $\forall i = 1, 2, \ldots, n$.

Definition 4 (Maeda [13]) Let $\tilde{a}$ and $\tilde{b}$ be two fuzzy numbers. Then,
(i) $\tilde{b} \leqq \tilde{a}$ if $(b_\alpha^L, b_\alpha^R) \leqq (a_\alpha^L, a_\alpha^R)$ $\forall \alpha \in [0, 1]$;
(ii) $\tilde{b} \le \tilde{a}$ if $(b_\alpha^L, b_\alpha^R) \le (a_\alpha^L, a_\alpha^R)$ $\forall \alpha \in [0, 1]$;
(iii) $\tilde{b} < \tilde{a}$ if $(b_\alpha^L, b_\alpha^R) < (a_\alpha^L, a_\alpha^R)$ $\forall \alpha \in [0, 1]$.
Consequently, $\tilde{a} = \tilde{b}$ if their α-cuts are equal $\forall \alpha \in [0, 1]$.

Theorem 1 Let $\tilde{a} = (l_1, m_1, n_1)$ and $\tilde{b} = (l_2, m_2, n_2)$ be two triangular fuzzy numbers. Then,
(i) $\tilde{a} \leqq \tilde{b}$ if $(l_1, m_1, n_1) \leqq (l_2, m_2, n_2)$,
(ii) $\tilde{a} < \tilde{b}$ if $(l_1, m_1, n_1) < (l_2, m_2, n_2)$.
The proof is omitted in this paper as it is sufficiently intuitive.
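To make the order relations concrete, the sketch below compares triangular fuzzy numbers both through their defining triples (as in Theorem 1) and through sampled α-cuts (as in Definition 4). It is a numerical illustration rather than a proof: the α-cuts are checked only on a finite grid of α values, the α-cut formula is the one obtained from the linear branches of the membership function, and all function names are illustrative.

```python
def alpha_cut(t, alpha):
    """End points of the alpha-cut of the triangular fuzzy number t = (l, m, n)."""
    l, m, n = t
    return (l + alpha * (m - l), n - alpha * (n - m))


def leqq(u, v):
    """Definition 3 (i): u_i <= v_i for every component."""
    return all(a <= b for a, b in zip(u, v))


def leq(u, v):
    """Definition 3 (ii): component-wise <= and u != v."""
    return leqq(u, v) and tuple(u) != tuple(v)


def less(u, v):
    """Definition 3 (iii): u_i < v_i for every component."""
    return all(a < b for a, b in zip(u, v))


def holds_on_cuts(relation, b, a, samples=101):
    """Check relation(b_alpha, a_alpha) on an evenly spaced grid of alpha in [0, 1]."""
    grid = [k / (samples - 1) for k in range(samples)]
    return all(relation(alpha_cut(b, al), alpha_cut(a, al)) for al in grid)
```

For instance, the triples (7, 8, 10) and (10, 12, 17) are ordered both component-wise and on every sampled α-cut, whereas (5, 9, 13) and (7, 8, 10) are incomparable in either direction, which is why only a partial order is available for fuzzy payoffs.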

3 Nash Equilibrium Strategies for Bimatrix Games

Let $M = \{1, 2, \ldots, m\}$ and $N = \{1, 2, \ldots, n\}$ be the sets of pure strategies of player $I$ and player $J$, respectively. $A = (a_{ij})_{m \times n}$ is the payoff matrix of player $I$, and $B = (b_{ij})_{m \times n}$ is the payoff matrix of player $J$, such that when player $I$ selects strategy $i$ and player $J$ selects strategy $j$, the payoff of player $I$ is $a_{ij}$ and the payoff of player $J$ is $b_{ij}$. The mixed strategies of players $I$ and $J$ are probability distributions on their sets of pure strategies, given by

\[ S_I = \Big\{(x_1, x_2, \ldots, x_m) \in \mathbb{R}^m \ \Big|\ x_i \ge 0,\ i = 1, 2, \ldots, m,\ \sum_{i=1}^{m} x_i = 1\Big\}, \]

\[ S_J = \Big\{(y_1, y_2, \ldots, y_n) \in \mathbb{R}^n \ \Big|\ y_i \ge 0,\ i = 1, 2, \ldots, n,\ \sum_{i=1}^{n} y_i = 1\Big\}. \]

If the players are open to mixed strategies, we calculate the expected payoff of player $I$ as $E(x, y) = x^T A y$ and that of player $J$ as $E(x, y) = x^T B y$. The bimatrix game is given by

\[ \Gamma \equiv (\{I, J\}, S_I, S_J, A, B). \]

A fuzzy bimatrix game is one in which the payoffs are triangular fuzzy numbers. The expected payoff of player $I$ is given by $E(x, y) = x^T \tilde{A} y = \sum_{i=1}^{m} \sum_{j=1}^{n} x_i \tilde{a}_{ij} y_j$, and the expected payoff of player $J$ is given by $E(x, y) = x^T \tilde{B} y = \sum_{i=1}^{m} \sum_{j=1}^{n} x_i \tilde{b}_{ij} y_j$. By Lemma 1, the expected payoffs will also be triangular fuzzy numbers. The bimatrix fuzzy game is denoted by

\[ \tilde{\Gamma} \equiv (\{I, J\}, S_I, S_J, \tilde{A}, \tilde{B}), \tag{1} \]

where $\tilde{A} = (\tilde{a}_{ij})_{m \times n} = ((l_{ij}, m_{ij}, n_{ij}))_{m \times n}$ and $\tilde{B} = (\tilde{b}_{ij})_{m \times n} = ((p_{ij}, q_{ij}, r_{ij}))_{m \times n}$. In the rest of this paper, we denote $A_0^L = (l_{ij})_{m \times n}$, $A = (m_{ij})_{m \times n}$, $A_0^R = (n_{ij})_{m \times n}$, $B_0^L = (p_{ij})_{m \times n}$, $B = (q_{ij})_{m \times n}$, and $B_0^R = (r_{ij})_{m \times n}$.

Definition 5 A point $(x^*, y^*) \in S_I \times S_J$ is said to be a Nash equilibrium strategy to game $\tilde{\Gamma}$ if it holds that
(i) $x^T \tilde{A} y^* \leqq x^{*T} \tilde{A} y^*$ $\forall x \in S_I$, and
(ii) $x^{*T} \tilde{B} y \leqq x^{*T} \tilde{B} y^*$ $\forall y \in S_J$.

Definition 6 A point $(x^*, y^*) \in S_I \times S_J$ is said to be a non-dominated Nash equilibrium strategy to game $\tilde{\Gamma}$ if it holds that
(i) there exists no $x \in S_I$ such that $x^{*T} \tilde{A} y^* \le x^T \tilde{A} y^*$, and
(ii) there exists no $y \in S_J$ such that $x^{*T} \tilde{B} y^* \le x^{*T} \tilde{B} y$.

Definition 7 A point $(x^*, y^*) \in S_I \times S_J$ is said to be a weak non-dominated Nash equilibrium strategy to game $\tilde{\Gamma}$ if it holds that
(i) there exists no $x \in S_I$ such that $x^{*T} \tilde{A} y^* < x^T \tilde{A} y^*$, and
(ii) there exists no $y \in S_J$ such that $x^{*T} \tilde{B} y^* < x^{*T} \tilde{B} y$.

These three definitions of equilibrium strategies act as a natural extension of the

classical Nash equilibrium in the realm of fuzzy games. We will investigate the
existence of these strategies in our framework. It can be easily seen that the set of
Nash equilibrium strategies is a subset of non-dominated Nash equilibrium strategies
which further is a subset of weak non-dominated Nash equilibrium strategies. If the
individual payoffs become crisp, then these definitions reduce to the classical one.
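Theorem 2 A pair $(x^*, y^*) \in S_I \times S_J$ is a Nash equilibrium strategy of the fuzzy game $\tilde{\Gamma}$ if and only if the following inequalities hold true for all $x \in S_I$ and $y \in S_J$:

\[ x^T A_0^L y^* \le x^{*T} A_0^L y^*, \qquad x^{*T} B_0^L y \le x^{*T} B_0^L y^*, \]
\[ x^T A y^* \le x^{*T} A y^*, \qquad x^{*T} B y \le x^{*T} B y^*, \]
\[ x^T A_0^R y^* \le x^{*T} A_0^R y^*, \qquad x^{*T} B_0^R y \le x^{*T} B_0^R y^*. \]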


Proof Let $(x^*, y^*)$ be a Nash equilibrium of the game $\tilde{\Gamma}$. Then, from Definition 5, we have
(i) $x^T \tilde{A} y^* \leqq x^{*T} \tilde{A} y^*$ $\forall x \in S_I$, and
(ii) $x^{*T} \tilde{B} y \leqq x^{*T} \tilde{B} y^*$ $\forall y \in S_J$.
For the first condition to be true, using Theorem 1, we get

\[ (x^T A_0^L y^*,\ x^T A y^*,\ x^T A_0^R y^*) \leqq (x^{*T} A_0^L y^*,\ x^{*T} A y^*,\ x^{*T} A_0^R y^*). \tag{2} \]

Analogously, for the second condition to be true, we get

\[ (x^{*T} B_0^L y,\ x^{*T} B y,\ x^{*T} B_0^R y) \leqq (x^{*T} B_0^L y^*,\ x^{*T} B y^*,\ x^{*T} B_0^R y^*). \tag{3} \]

Combining (2) and (3), we derive

\[ x^T A_0^L y^* \le x^{*T} A_0^L y^* \quad \text{and} \quad x^{*T} B_0^L y \le x^{*T} B_0^L y^*, \tag{4} \]
\[ x^T A y^* \le x^{*T} A y^* \quad \text{and} \quad x^{*T} B y \le x^{*T} B y^*, \tag{5} \]
\[ x^T A_0^R y^* \le x^{*T} A_0^R y^* \quad \text{and} \quad x^{*T} B_0^R y \le x^{*T} B_0^R y^*. \tag{6} \]

Therefore, we may conclude that a Nash equilibrium of the fuzzy game $\tilde{\Gamma}$ is
a common Nash equilibrium of the three crisp bimatrix games given by (4),
(5), and (6). Since three different crisp games need not share an equilibrium, the
existence of a Nash equilibrium for fuzzy bimatrix games cannot be guaranteed. This
motivates the non-dominated and weak non-dominated Nash equilibria, which we
investigate next.

Theorem 3 A pair $(x^*, y^*)$ is a non-dominated Nash equilibrium of the game $\tilde{\Gamma}$ if and only if the following conditions hold true:
(i) there exists no $x \in S_I$ such that
$x^{*T}(A_0^L, A_0^R)y^* \le x^T(A_0^L, A_0^R)y^*$ and $x^{*T} A y^* \le x^T A y^*$,
(ii) there exists no $y \in S_J$ such that
$x^{*T}(B_0^L, B_0^R)y^* \le x^{*T}(B_0^L, B_0^R)y$ and $x^{*T} B y^* \le x^{*T} B y$.

Proof Let the pair $(x^*, y^*)$ be a non-dominated Nash equilibrium strategy for the game $\tilde{\Gamma}$, and assume, on the contrary, that there exists an $x \in S_I$ such that

\[ x^{*T}(A_0^L, A_0^R)y^* \le x^T(A_0^L, A_0^R)y^* \quad \text{and} \tag{7} \]
\[ x^{*T} A y^* \le x^T A y^*. \tag{8} \]

Inequality (7) can be broken down into the inequalities

\[ x^{*T} A_0^L y^* \le x^T A_0^L y^*, \qquad x^{*T} A_0^R y^* \le x^T A_0^R y^*, \]

in which equality cannot hold simultaneously. Thus, for any $\alpha \in [0, 1)$, we can write

\[ \big(x^{*T}(\alpha A + (1 - \alpha)A_0^L)y^*,\ x^{*T}(\alpha A + (1 - \alpha)A_0^R)y^*\big) \le \big(x^T(\alpha A + (1 - \alpha)A_0^L)y^*,\ x^T(\alpha A + (1 - \alpha)A_0^R)y^*\big). \]

Hence, with $A_\alpha^L = \alpha A + (1 - \alpha)A_0^L$ and $A_\alpha^R = \alpha A + (1 - \alpha)A_0^R$, it follows that

\[ (x^{*T} A_\alpha^L y^*,\ x^{*T} A_\alpha^R y^*) \le (x^T A_\alpha^L y^*,\ x^T A_\alpha^R y^*). \]

From Definition 4, we get $x^{*T} \tilde{A} y^* \le x^T \tilde{A} y^*$. But this is a contradiction. Analogously, there exists no $y$ such that condition (ii) is violated.
For the converse, let a pair $(x^*, y^*)$ be such that both the conditions (i) and (ii) are satisfied. Let us assume that it is not a non-dominated Nash equilibrium point. Hence, there will be a certain $x$ such that $x^{*T} \tilde{A} y^* \le x^T \tilde{A} y^*$. From Definition 4, we get, for all $\alpha \in [0, 1]$,

\[ (x^{*T} A_\alpha^L y^*,\ x^{*T} A_\alpha^R y^*) \le (x^T A_\alpha^L y^*,\ x^T A_\alpha^R y^*). \]

Since $A_\alpha^L$ and $A_\alpha^R$ are continuous with respect to $\alpha$, letting $\alpha$ tend to 1 gives

\[ x^{*T} A y^* \le x^T A y^*, \]

and taking $\alpha = 0$ gives

\[ x^{*T}(A_0^L, A_0^R)y^* \le x^T(A_0^L, A_0^R)y^*, \]

which is a contradiction.
Analogously, we can prove that there exists no $y$ such that $x^{*T}(B_0^L, B_0^R)y^* \le x^{*T}(B_0^L, B_0^R)y$ and $x^{*T} B y^* \le x^{*T} B y$. Hence $(x^*, y^*)$ is a non-dominated Nash equilibrium point.
Theorem 4 A pair $(x^*, y^*)$ is a weak non-dominated Nash equilibrium point of the game $\tilde{\Gamma}$ if and only if the following conditions hold:
(i) there exists no $x \in S_I$ such that
$x^{*T}(A_0^L, A, A_0^R)y^* < x^T(A_0^L, A, A_0^R)y^*$,
(ii) there exists no $y \in S_J$ such that
$x^{*T}(B_0^L, B, B_0^R)y^* < x^{*T}(B_0^L, B, B_0^R)y$.

Proof The above theorem can be proved along the lines of Theorem 3.

Now, we will investigate parametric bimatrix games, transform the solutions of fuzzy games into the solutions of parametric bimatrix games, and prove existence results for non-dominated and weak non-dominated Nash equilibria. We define a parametric bimatrix game using $A_0^L$, $A_0^R$, $B_0^L$, and $B_0^R$. A parametric bimatrix game is one in which the payoff matrices are given by

\[ A(\lambda) = \lambda A_0^L + (1 - \lambda)A_0^R \quad \text{and} \quad B(\mu) = \mu B_0^L + (1 - \mu)B_0^R. \]

The parametric bimatrix game is denoted by $\Gamma(\lambda, \mu) = (\{I, J\}, S_I, S_J, A(\lambda), B(\mu))$.

Lemma 2 Every parametric bimatrix game $\Gamma(\lambda, \mu)$ has at least one Nash equilibrium point. If $(x^*, y^*)$ is a Nash equilibrium strategy, then

\[ x^T A(\lambda) y^* \le x^{*T} A(\lambda) y^*, \quad \forall x \in S_I, \]
\[ x^{*T} B(\mu) y \le x^{*T} B(\mu) y^*, \quad \forall y \in S_J. \]

Theorem 5 If a point $(x^*, y^*) \in S_I \times S_J$ is a Nash equilibrium of the parametric bimatrix game $\Gamma(\lambda, \mu)$ with $\lambda, \mu \in (0, 1)$, then it is also a non-dominated Nash equilibrium of the fuzzy game $\tilde{\Gamma} = (\{I, J\}, S_I, S_J, \tilde{A}, \tilde{B})$.

Proof Let a pair $(x^*, y^*) \in S_I \times S_J$ be a Nash equilibrium of the parametric bimatrix game $\Gamma(\lambda, \mu)$. Therefore,

\[ x^T A(\lambda) y^* \le x^{*T} A(\lambda) y^*, \quad \forall x \in S_I, \tag{9} \]
\[ x^{*T} B(\mu) y \le x^{*T} B(\mu) y^*, \quad \forall y \in S_J. \tag{10} \]

Let us assume that it is not a non-dominated Nash equilibrium point for the fuzzy game $\tilde{\Gamma}$. Therefore, we have an $x \in S_I$ such that

\[ x^{*T} \tilde{A} y^* \le x^T \tilde{A} y^*. \tag{11} \]

From Definition 4, it follows that

\[ (x^{*T} A_\alpha^L y^*,\ x^{*T} A_\alpha^R y^*) \le (x^T A_\alpha^L y^*,\ x^T A_\alpha^R y^*). \]

If $\alpha = 0$, we get

\[ (x^{*T} A_0^L y^*,\ x^{*T} A_0^R y^*) \le (x^T A_0^L y^*,\ x^T A_0^R y^*). \]

Since equality cannot hold in both components together, we can write, using the same $\lambda \in (0, 1)$ as in the parametric bimatrix game $\Gamma(\lambda, \mu)$,

\[ \lambda x^{*T} A_0^L y^* + (1 - \lambda)x^{*T} A_0^R y^* < \lambda x^T A_0^L y^* + (1 - \lambda)x^T A_0^R y^*. \]

But this violates Condition (9). Hence, there exists no $x$ such that Condition (11) holds. Analogously, there exists no $y \in S_J$ such that

\[ x^{*T} \tilde{B} y^* \le x^{*T} \tilde{B} y \tag{12} \]

holds. Hence, from Definition 6, the pair $(x^*, y^*)$ is a non-dominated Nash equilibrium of the fuzzy game $\tilde{\Gamma}$.

Theorem 6 If a point $(x^*, y^*)$ is a Nash equilibrium point of the parametric bimatrix game $\Gamma(\lambda, \mu)$ with $\lambda, \mu \in [0, 1]$, then it is also a weak non-dominated Nash equilibrium of the fuzzy game $\tilde{\Gamma} = (\{I, J\}, S_I, S_J, \tilde{A}, \tilde{B})$.

Proof The proof of this theorem is along the lines of Theorem 5.

We can observe that the non-dominated Nash equilibrium set obtained using this method is a subset of the weak non-dominated Nash equilibrium set, as the values of $\lambda$ and $\mu$ need to lie in the open interval $(0, 1)$ for a non-dominated Nash equilibrium, while for a weak non-dominated Nash equilibrium, $\lambda, \mu \in [0, 1]$ is sufficient. Based on Lemma 2 and Theorems 5 and 6, we derive the following results.

Theorem 7 If $\tilde{\Gamma}$ is a fuzzy game with triangular fuzzy numbers as payoffs, then
(i) there exists at least one non-dominated Nash equilibrium strategy to the game $\tilde{\Gamma}$,
(ii) there exists at least one weak non-dominated Nash equilibrium strategy to the game $\tilde{\Gamma}$.

4 Numerical Example

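Example 1 Let us consider a bimatrix fuzzy game $\Gamma \equiv (\{I, J\}, S_I, S_J, \tilde{A}, \tilde{B})$ given by payoff matrices $\tilde{A}$ and $\tilde{B}$ as follows:

$$\tilde{A} = \begin{pmatrix} (10, 12, 17) & (5, 9, 13) \\ (7, 8, 10) & (12, 14, 15) \end{pmatrix}$$

$$\tilde{B} = \begin{pmatrix} (2, 5, 7) & (13, 17, 18) \\ (11, 13, 15) & (7, 9, 10) \end{pmatrix}$$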

We try to find whether the above game has a Nash equilibrium strategy. According
to Theorem 2, Nash equilibrium of the fuzzy game Γ exists if crisp bimatrix games
450 S. Chakravorty and D. Ghosh

defined in Theorem 2 have the same equilibrium strategy. Observe that the crisp
games have different Nash equilibrium strategies here; hence, the above game Γ has
no Nash equilibrium strategy.

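To find non-dominated and weak non-dominated Nash equilibrium strategies, we need to find the equilibrium strategies of the crisp bimatrix game $\Gamma(\lambda, \mu) = (\{I, J\}, S_I, S_J, A(\lambda), B(\mu))$, where

$$A(\lambda) = \lambda A^L_0 + (1 - \lambda) A^R_0 \quad \text{and} \quad B(\mu) = \mu B^L_0 + (1 - \mu) B^R_0.$$

$$A(\lambda) = \begin{pmatrix} 17 - 7\lambda & 13 - 8\lambda \\ 10 - 3\lambda & 15 - 3\lambda \end{pmatrix}$$

$$B(\mu) = \begin{pmatrix} 7 - 5\mu & 18 - 5\mu \\ 15 - 4\mu & 10 - 3\mu \end{pmatrix}$$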

In order to find a mixed strategy Nash equilibrium of the above bimatrix game,
we use the property that a player who optimizes using a mixed strategy is
indifferent among all pure strategies that occur in that mixed strategy with positive
probability. Let player I choose the row 1 strategy with probability p and, consequently,
the row 2 strategy with probability 1 − p, and let player J choose the column 1 strategy with
probability q and, consequently, the column 2 strategy with probability 1 − q. For this
strategy pair to be a Nash equilibrium, player J must be indifferent between its strategies.
Therefore,

$$p(7 - 5\mu) + (1 - p)(15 - 4\mu) = p(18 - 5\mu) + (1 - p)(10 - 3\mu).$$

Solving the above linear equation for p, we get

$$p = \frac{5 - \mu}{16 - \mu}.$$

Similarly, player I must be indifferent between its two rows, that is, $q(17 - 7\lambda) + (1 - q)(13 - 8\lambda) = q(10 - 3\lambda) + (1 - q)(15 - 3\lambda)$. Solving for q, we get

$$q = \frac{2 + 5\lambda}{9 + \lambda}.$$

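Hence, the Nash equilibrium strategy for the above problem is

$$x^{*T} = \left( \frac{5 - \mu}{16 - \mu},\ \frac{11}{16 - \mu} \right), \qquad y^{*T} = \left( \frac{2 + 5\lambda}{9 + \lambda},\ \frac{7 - 4\lambda}{9 + \lambda} \right).$$

By Theorems 5 and 6, the non-dominated Nash equilibrium and weak non-dominated Nash equilibrium strategies of the fuzzy game $\Gamma$ are as follows:

$$NDN = \{ (x^*, y^*) \mid \lambda, \mu \in (0, 1) \}$$

$$WNDN = \{ (x^*, y^*) \mid \lambda, \mu \in [0, 1] \}.$$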
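
The closed forms for p and q above can be checked mechanically. The following is a minimal sketch, not the authors' implementation: it builds $A(\lambda)$ and $B(\mu)$ from the end points of the triangular payoffs of Example 1, solves the two indifference conditions, and confirms that neither player gains by a pure deviation. The function names `crisp`, `mixed_equilibrium`, and `is_nash` are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction

# Triangular fuzzy payoffs (left end, peak, right end) taken from Example 1.
A = [[(10, 12, 17), (5, 9, 13)], [(7, 8, 10), (12, 14, 15)]]
B = [[(2, 5, 7), (13, 17, 18)], [(11, 13, 15), (7, 9, 10)]]


def crisp(M, w):
    """Return w * M_0^L + (1 - w) * M_0^R, using the end points of the alpha = 0 cut."""
    return [[w * e[0] + (1 - w) * e[2] for e in row] for row in M]


def mixed_equilibrium(lam, mu):
    """Fully mixed strategies (p, q) of the 2x2 game (A(lam), B(mu)) via indifference."""
    a, b = crisp(A, lam), crisp(B, mu)
    # Player J is indifferent between the two columns; this fixes p.
    p = (b[1][1] - b[1][0]) / (b[0][0] - b[1][0] - b[0][1] + b[1][1])
    # Player I is indifferent between the two rows; this fixes q.
    q = (a[1][1] - a[0][1]) / (a[0][0] - a[0][1] - a[1][0] + a[1][1])
    return p, q


def is_nash(lam, mu, p, q):
    """Check that no pure-strategy deviation improves either player's expected payoff."""
    a, b = crisp(A, lam), crisp(B, mu)
    rows = [a[i][0] * q + a[i][1] * (1 - q) for i in range(2)]
    cols = [b[0][j] * p + b[1][j] * (1 - p) for j in range(2)]
    value_i = p * rows[0] + (1 - p) * rows[1]
    value_j = q * cols[0] + (1 - q) * cols[1]
    return max(rows) <= value_i and max(cols) <= value_j


p, q = mixed_equilibrium(Fraction(1, 2), Fraction(1, 2))
print(p, q)
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the comparison with $(5 - \mu)/(16 - \mu)$ and $(2 + 5\lambda)/(9 + \lambda)$ is an equality test rather than a floating-point tolerance check.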

5 Conclusion

In this paper, we introduce the concept of bimatrix fuzzy games with different payoff
matrices for the two players. Models by Maeda and Cunlin are extended for this
game. Three different characterizations of Nash equilibrium are given for the realm
of fuzzy games. These characterizations can be seen as natural extensions of Nash
equilibrium for crisp games. Existence conditions are studied for the three equilibrium
strategies. It was seen that the existence of a Nash equilibrium for bimatrix fuzzy games
cannot be guaranteed, unlike for general crisp bimatrix games. Parametric bimatrix
games were introduced as a solution method to find non-dominated and weak
non-dominated Nash equilibrium strategies for a fuzzy bimatrix game. Further work is
needed to generalize the models discussed in this paper to n players.

References

1. Nash, J.F.: Non-cooperative games. In: Annals of Mathematics, Second Series, vol. 54, no. 2,
pp. 286–295 (1951)
2. Butnariu, D.: Fuzzy games: a description of the concept. Fuzzy Sets Syst. 1(3), 181–192 (July 1978)
3. Butnariu, D.: Stability and Shapley value for an n-persons fuzzy game. Fuzzy Sets Syst. 4(1),
63–72 (July 1980)
4. Campos, L.: Fuzzy linear programming models to solve fuzzy matrix games. Fuzzy Sets Syst.
32, 275–89 (1989)
5. Yager, R.: Ranking fuzzy subsets over the unit interval. In: Proceedings of the CDC, pp. 1435–
1437 (1978)
6. Li, D.F.: A fuzzy multiobjective approach to solve fuzzy matrix games. J. Fuzzy Math. 7, 907–12
(1999)
7. Li, D.F.: A fast approach to compute fuzzy values of matrix games with payoffs of triangular
fuzzy numbers. Eur. J. Oper. Res. 223(2), 421–429 (2012)
8. Sakawa, M., Nishizaki, I.: Max-min solutions for fuzzy multiobjective matrix games. Fuzzy Sets
Syst. 61, 265–75 (1994)
9. Bector, C.R., Chandra, S.: On duality in linear programming under fuzzy environment. Fuzzy
Sets Syst. 125, 317–25 (2002)
10. Vijay, V., Chandra, S., Bector, C.R.: Matrix games with fuzzy goals and fuzzy pay offs. Omega
33, 425–429 (2005)
11. Kacher, F., Larbani, M.: Existence of equilibrium solution for a non-cooperative game with
fuzzy goals and parameters. Fuzzy Sets Syst. 159(2), 164–176 (January 2008)
12. Larbani, M.: Non cooperative fuzzy games in normal form: a survey. Fuzzy Sets Syst. 160(22),
3184–3210 (November 2009)
13. Maeda, T.: On characterization of equilibrium strategy of two person zero-sum games with
fuzzy pay offs. Fuzzy Sets Syst. 139, 283–96 (2003)
14. Maeda, T.: On characterization of equilibrium strategy of bimatrix games with fuzzy pay offs.
J. Math. Anal. Appl. 251, 885–896 (2000)
15. Cunlin, L., Qiang, Z.: Nash equilibrium strategy for fuzzy non-cooperative games. Fuzzy Sets
Syst. 176, 46–55 (1976)
16. Dutta, B., Gupta, S.K.: On Nash equilibrium strategy of two-person zero-sum games with
trapezoidal fuzzy payoffs. Fuzzy Inf. Eng. 6, 299–314 (2014)
17. Zadeh, L.A.: Fuzzy Sets. Inf. Control 8(3), 338–352 (1968)
Chapter 35
Comparison of Two Methods Based
on Daubechies Scale Functions
and Legendre Multiwavelets for
Approximate Solution of Cauchy-Type
Singular Integral Equation on R

Swaraj Paul and B.N. Mandal

Abstract Two methods based on Daubechies scale functions and Legendre multiwavelets
for the approximate solution of a singular integral equation of the second
kind with Cauchy-type kernel on the real line $\mathbb{R}$ are developed and compared. The integral
equation considered here is of the form $u(x) + \lambda \int_{-\infty}^{\infty} K(x, t)u(t)\,dt = f(x)$, $x \in \mathbb{R}$,
where $K(x, t) = \frac{1}{t - x} + h(x, t)$, $h(x, t)$ being a regular kernel. In both cases,
two-scale relations involving the scale functions are used for the evaluation of the
multiscale representation of the integral operator. Then the given integral equation is
converted into a system of linear algebraic equations which can be solved easily by
using the library function ‘Solve[]’ available in MATHEMATICA. The convergence of
the method has been proved in $L^2$ spaces. Two examples are given and their approximate
solutions obtained by the proposed methods have been compared with the
available numerical results to assess the efficiency of the methods developed here.

Keywords Singular integral equation in R · Cauchy type singularity · Legendre

multiwavelet · Daubechies scale function

1 Introduction

Singular integral equations (SIE) with Cauchy-type kernel arise in a large class of
mixed boundary value problems of mathematical physics such as contact problems in
fracture mechanics, mainly crack problems in elasticity [1]. Closed form solution of
this SIE is well known by complex function theoretic method [2, 3]. Unfortunately,

S. Paul (B)
Department of Mathematics, Visva Bharati,
Santiniketan 731235, West Bengal, India
e-mail: [email protected]
B. N. Mandal
Physics and Applied Mathematics Unit,
Indian Statistical Institute, 203, B T Road, Kolkata 700108, India
e-mail: [email protected]

this method makes the calculation of the corresponding singular integral (Cauchy
type) required by a suitable quadrature rule quite complicated and time-consuming.
So the numerical solution of SIE by avoiding quadrature rule has an important impact.
Several methods have been proposed for the numerical solutions of the SIE with
Cauchy-type singularity. For a thorough review of Galerkin methods, see Ioakimidis
[4], Gong [5, 6], Monegato [7], etc. Collocation methods have been studied by
Junghans and Kaiser [8], Scuderi [9], etc. Bernstein polynomial method has been
studied by Setia [10], and Taylor series expansion and Legendre polynomial method
have been developed by Arzhang [11]. Method based on Daubechies scale function
has been proposed by Panja and Mandal [12]. But all these methods were used in the
case when the domain of the integral equation is finite. In this paper, Daubechies scale
functions and Legendre multiwavelets are each used to obtain a multiscale approximate
solution of a second-kind Fredholm integral equation with the Cauchy-type
kernel on $\mathbb{R}$ of the form

$$u(x) + \lambda \int_{-\infty}^{\infty} \left[ h(x, t) + \frac{1}{t - x} \right] u(t)\,dt = f(x), \quad x \in \mathbb{R}, \tag{1}$$

where f (x) is a known function defined on R, h(x, t) is a regular kernel, and u is

unknown. This type of integral equation arises in acoustic scattering problems [13],
in the scattering of elastic waves [14, 15], in the study of unsteady water waves [16],
to solve Laplace’s equation in the upper half plane subject to fairly general mixed
boundary data [17], etc. Bonis et al. [18] proposed a method based on interpolation
process, related to zeros of Hermite polynomials to solve this equation. Sheshko
and his collaborators [19, 20] investigated the exact solutions of the characteristic
and dominant equation with the Cauchy kernel on the real line by transforming the
domain from real line to a circle and using a complex function theoretic method.
Multiresolution analysis (MRA) of function space concerning refinable functions
and wavelets became an efficient tool in several areas of mathematical physics and
engineering [21, 22]. As more desirable properties of wavelet became available,
this tool has been called a “mathematical microscope” in the study of smoothness
or regularity of a function or signal. For numerical implementation, using wavelet bases
is more efficient than classical orthogonal or non-orthogonal polynomial bases due
to some useful properties of wavelet, e.g., capturing the local information about the
functions/signals and operators, discretization of the domain and sparseness nature of
the matrix representation of the operator in wavelet bases. The pioneer workers in the
development of multiwavelet involving polynomial are Alpert and his collaborators
[23, 24], and they have solved the weakly singular integral equation numerically with
logarithmic singularity by using these wavelets. Recently, using the LMW basis, Paul et
al. obtained approximate solutions of a class of Fredholm integral equations of the second
kind with singular kernels, i.e., with weakly singular Abel-type kernel [25], hypersingular kernel [26],
Cauchy-type kernel with constant coefficient [27], and Cauchy-type kernel with variable
coefficient [28]. To the authors’ best knowledge, multiwavelets have not been used
for getting approximate solution of Fredholm singular integral equation of the second
kind on the real line R with the Cauchy-type singular kernel. In this work, we have
successfully implemented the numerical scheme based on Daubechies scale functions

and LMW for getting the solution of the singular integral equation on the real line.
In this paper, we consider the integral equation (1) and study two different meth-
ods based on Daubechies scale functions and Legendre multiwavelets by compar-
ing the solutions. The organization of this paper is as follows: Basic properties of
Daubechies scale functions and method of solution to a singular integral equation
on R with Cauchy-type kernel by using Daubechies scale functions are discussed in
Sect. 2. Transformation into a finite range of integration from infinite range under
some suitable condition is discussed in Sect. 3.1. Basic definition and properties of
Legendre multiwavelets are described in Sect. 3.2, while in Sect. 3.3, the multiscale
approximation of a L2 function is presented. We evaluate the double integrals involv-
ing the basis of Legendre multiwavelets and the Cauchy singular kernel in Sect. 3.4.
Using the values of the double integrals obtained in Sect. 3.4, we evaluate the mul-
tiscale representation of the transformed integral operator in Sect. 3.5. In Sect. 3.6,
the integral equation (1) is transformed into a system of linear algebraic equations
which can be solved by a standard method. Error estimation of the proposed method
is also presented. The numerical scheme has been verified through two examples in
Sect. 4. Conclusion is presented in Sect. 5.

2 Solution of Singular Integral Equation on R with

Cauchy-Type Kernel by Daubechies Scale Functions

2.1 Basic Properties of Daubechies Scale Function

An important property of the Daubechies scale function is the refinement equation, or two-scale relation,

$$\phi(x) = \sqrt{2} \sum_{l=0}^{2K-1} h_l\, \phi(2x - l) \tag{2}$$

with the normalization $\int_{-\infty}^{\infty} \phi(x)\,dx = 1$. The translates of the scaling function are orthogonal at a particular resolution $j$, so that $\int_{-\infty}^{\infty} \phi_{jk_1}(x)\, \phi_{jk_2}(x)\,dx = \delta_{k_1 k_2}$, where $\phi_{jl}(x) = 2^{j/2} \phi(2^j x - l)$.
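
The normalization and orthogonality of the translates impose simple algebraic conditions on the filter coefficients $h_l$ in (2): integrating (2) gives $\sum_l h_l = \sqrt{2}$, and orthonormality gives $\sum_l h_l h_{l+2m} = \delta_{m0}$. The sketch below checks these two conditions for the classical four-tap Daubechies filter. The choice $K = 2$ is an assumption made only for illustration, since the text does not state which $K$ is used, and the helper name `shifted_product` is hypothetical.

```python
import math

SQRT3 = math.sqrt(3.0)

# Four-tap Daubechies low-pass filter (K = 2, an assumed illustrative choice),
# normalised so that sum(h) = sqrt(2), as required by relation (2) with integral(phi) = 1.
h = [c / (4.0 * math.sqrt(2.0)) for c in (1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3)]


def shifted_product(taps, m):
    """Return sum_l taps[l] * taps[l + 2m]; equals delta_{m,0} for an orthonormal filter."""
    return sum(
        taps[l] * taps[l + 2 * m]
        for l in range(len(taps))
        if 0 <= l + 2 * m < len(taps)
    )


total = sum(h)                                       # should equal sqrt(2)
norms = [shifted_product(h, m) for m in (0, 1)]      # should equal [1, 0]
print(total, norms)
```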

2.2 Method of Solution to Singular Integral Equation

on R with Cauchy-Type Kernel

Let us consider the integral equation (1). Express $u(x)$ in terms of Daubechies scale functions $\phi_{jk}(x)$ at resolution $j$ as

$$u(x) = \sum_{k = k_j^{\min}}^{k_j^{\max}} u_{jk}\, \phi_{jk}(x) \tag{3}$$

with raw images $u_{jk}$. Substituting $u(x)$ from (3) into (1), multiplying by $\phi_{jk}(x)$, and integrating both sides between $-\infty$ and $\infty$, we get

$$u_{jk} + \lambda \sum_{k' = k_j^{\min}}^{k_j^{\max}} \left( \rho_{jkk'} + I_{jkk'} \right) u_{jk'} = f_{jk}, \tag{4}$$

where

$$\rho_{jkk'} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{\phi_{jk}(x)\, \phi_{jk'}(t)}{t - x}\,dt\,dx, \tag{5}$$

$$I_{jkk'} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \phi_{jk}(x)\, \phi_{jk'}(t)\, h(x, t)\,dt\,dx, \tag{6}$$

$$f_{jk} = \int_{-\infty}^{\infty} \phi_{jk}(x)\, f(x)\,dx. \tag{7}$$

The solution of the set of simultaneous linear equations (4) gives the approximate numerical solution $u(x)$ of the integral equation (1).

2.3 Evaluation of Matrix Element

The integrals $f_{jk}$ and $I_{jkk'}$ can be evaluated by using a one-point quadrature rule [29] as

$$\int_{-\infty}^{\infty} \phi_{jk}(x) f(x)\,dx = \frac{1}{2^{(j + j_1)/2}} \sum_{l_{j_1}=0}^{2K-1} \sum_{l_{j_1 - 1}=0}^{2K-1} \cdots \sum_{l_1=0}^{2K-1} h_{l_{j_1}} h_{l_{j_1 - 1}} \cdots h_{l_1}\, f\left( x_{j, j_1, k} \right) \tag{8}$$

and

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \phi_{jk}(x)\, \phi_{jk'}(y)\, h(x, y)\,dy\,dx = \frac{1}{2^{j + j_1}} \sum_{l_{j_1}=0}^{2K-1} \cdots \sum_{l_1=0}^{2K-1}\ \sum_{m_{j_1}=0}^{2K-1} \cdots \sum_{m_1=0}^{2K-1} h_{l_{j_1}} \cdots h_{l_1}\, h_{m_{j_1}} \cdots h_{m_1}\, h\left( x_{j, j_1, k},\, y_{j, j_1, k'} \right), \tag{9}$$

where

$$x_{j, j_1, k} = \frac{1}{2^j} \left( \frac{2^{j_1 - 1} l_1 + \cdots + l_{j_1} + m}{2^{j_1}} + k \right), \qquad y_{j, j_1, k'} = \frac{1}{2^j} \left( \frac{2^{j_1 - 1} m_1 + \cdots + m_{j_1} + m}{2^{j_1}} + k' \right),$$

and $m = \int_{-\infty}^{\infty} x\, \phi(x)\,dx$. The main task in the numerical solution of Eq. (1) is the numerical evaluation of the double integral $\rho_{jkk'}$ of (5). Clearly, $\rho_{jkk'} = \rho_{k - k'}$ and $\rho_{k - k'} = -\rho_{k' - k}$, where

$$\rho_n = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{\phi_n(x)\, \phi(t)}{x - t}\,dt\,dx. \tag{10}$$

For the evaluation of $\rho_n$, we follow the technique given in [12], using the asymptotic values

$$\rho_n \approx \frac{1}{n}, \quad \text{when } |n| > 20 \tag{11}$$

in the two-scale relation

$$\rho_n = \sum_{l} \sum_{l'} h_l\, h_{l'}\, \rho_{2n + l - l'}. \tag{12}$$

For $5 < |n| < 20$, the values of the nonsingular integrals $\rho_n$ can be evaluated by using the formulae (11) and (12) simultaneously. The values of the singular integrals $\rho_n$ for $|n| \le 5$ can be evaluated by solving the linear equations (12).
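
The interplay of (11) and (12) for the nonsingular integrals can be sketched as a short recursion. The code below is an illustration under stated assumptions, not the authors' program: it uses the four-tap Daubechies filter ($K = 2$, an assumed choice, since the text does not fix $K$), groups the double sum in (12) by $d = l - l'$, and seeds the recursion with the asymptotic value (11) for $|n| > 20$. For this filter the recursion only moves to larger $|n|$ when $|n| \ge 4$, so smaller indices, which need the linear system mentioned above, are deliberately rejected. The names `C`, `N_ASYMPTOTIC`, and `rho` are hypothetical.

```python
import math
from functools import lru_cache

SQRT3 = math.sqrt(3.0)

# Four-tap Daubechies filter (K = 2 is an assumption made for this illustration).
H = [c / (4.0 * math.sqrt(2.0)) for c in (1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3)]

# Autocorrelation c_d = sum_l h_l * h_{l-d}; relation (12) then reads
# rho_n = sum_d c_d * rho_{2n+d}, with d = l - l'.
C = {
    d: sum(H[l] * H[l - d] for l in range(len(H)) if 0 <= l - d < len(H))
    for d in range(-(len(H) - 1), len(H))
}

N_ASYMPTOTIC = 20  # threshold used in (11)


@lru_cache(maxsize=None)
def rho(n):
    """Nonsingular rho_n from the two-scale relation (12), seeded by rho_n ~ 1/n in (11)."""
    if abs(n) > N_ASYMPTOTIC:
        return 1.0 / n
    if abs(n) < len(H):
        # Here 2n + d can fall back onto small indices, so a linear solve is required.
        raise ValueError("small |n| requires solving the linear system, not this recursion")
    return sum(c * rho(2 * n + d) for d, c in C.items())


for n in (4, 7, 10, 15):
    print(n, rho(n), 1.0 / n)
```

Because $c_d = c_{-d}$, the computed values inherit the antisymmetry $\rho_{-n} = -\rho_n$ stated in the text, which gives a cheap consistency check on any implementation.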

3 Solution of Singular Integral Equation on R

with Cauchy-Type Kernel by Legendre Multiwavelets

3.1 Transformation to the Finite Range of Integration

We consider an integral equation of the form

$$u(x) + \lambda \int_{-\infty}^{\infty} K(x, t)\, u(t)\,dt = f(x), \quad x \in \mathbb{R}, \tag{13}$$

where $K(x, t) = h(x, t) + \frac{1}{t - x}$, and the operator

$$(\mathcal{H}u)(x) = \int_{-\infty}^{\infty} h(x, t)\, u(t)\,dt, \quad x \in \mathbb{R} \tag{14}$$

is assumed to be compact, and $u(x)$ and $f(x)$ are bounded continuous functions on $\mathbb{R}$. We abbreviate (13) in operator form as

$$u + \lambda \mathcal{K} u = f, \tag{15}$$

where $\mathcal{K}$ is the integral operator defined by

$$(\mathcal{K}u)(x) = \int_{-\infty}^{\infty} K(x, t)\, u(t)\,dt, \quad x \in \mathbb{R}. \tag{16}$$

Define the finite section approximation

$$u_\beta(x) + \lambda \int_{|t| < \beta} K(x, t)\, u_\beta(t)\,dt = f(x), \quad |x| < \beta, \tag{17}$$

where $u_\beta(x)$, the finite section approximation of $u(x)$, converges to $u(x)$ as $\beta \to \infty$. Here the forcing term $f(x)$ is required to be negligibly small for $|x| \ge \beta$. We abbreviate (17) in operator form as

$$u_\beta + \lambda \mathcal{K}_\beta u_\beta = f, \tag{18}$$

where $\mathcal{K}_\beta$ is defined by

$$(\mathcal{K}_\beta u)(x) = \int_{-\beta}^{\beta} K(x, t)\, u(t)\,dt. \tag{19}$$

It has been shown in [30] and [31] that, under quite general conditions on the kernel
$K$, the convergence of $u_\beta$ to $u$ is uniform on finite intervals of $\mathbb{R}$. Conditions for the
existence and uniform boundedness of $(I + \mathcal{K}_\beta)^{-1}$ have been obtained in [31] for the
special case when $\mathcal{K} = W + H$, where $W$ is a Wiener–Hopf integral operator and $H$ is
a compact operator. Chandler-Wilde [32] proved that this finite section approximation
method is stable for a perturbed equation in which the kernel $K$ is replaced by
$K + h$. Also, Chandler-Wilde [33] showed that, under some conditions on $K(x, t)$,
if $f(x) \to 0$ as $x \to \infty$, then $u(x) \to 0$ as $x \to \infty$. We therefore solve
(17) instead of (13). Under a suitable change of variables, (17) can be transformed to

$$v(x) + \int_0^1 G(x, t)\, v(t)\,dt = F(x), \quad 0 < x < 1, \tag{20}$$

where

$$v(x) = u_\beta(2\beta x - \beta),$$

$$G(x, t) = 2\lambda\beta\, K(2\beta x - \beta,\ 2\beta t - \beta) = 2\lambda\beta\, h(2\beta x - \beta,\ 2\beta t - \beta) + \frac{\lambda}{x - t}. \tag{21}$$

3.2 Legendre Multiwavelets

The scaling functions in the Legendre multiwavelet (LMW) basis consist of $K$ component vectors

$$\phi^i(x) := (2i + 1)^{1/2}\, P_i(2x - 1), \quad i = 0, 1, \ldots, K - 1; \quad 0 \le x < 1, \tag{22}$$

where $P_i(x)$ is the Legendre polynomial of degree $i$ ($i = 0, 1, \ldots, K - 1$). Their expressions at resolution $j$ are given by

$$\phi^i_{j,k}(x) := 2^{j/2}\, \phi^i(2^j x - k), \quad j \in \{0\} \cup \mathbb{N}, \quad k = 0, 1, \ldots, 2^j - 1. \tag{23}$$

The refinement equations, or two-scale relations, among the scale functions $\phi^i_{j,k}(x)$ are

$$\phi^i_{j,k}(x) = \frac{1}{\sqrt{2}} \sum_{r=0}^{K-1} \left( h^{(0)}_{i,r}\, \phi^r_{j+1,2k}(x) + h^{(1)}_{i,r}\, \phi^r_{j+1,2k+1}(x) \right) = \frac{1}{\sqrt{2}} \sum_{r=0}^{K-1} \sum_{s=0}^{1} h^{(s)}_{i,r}\, \phi^r_{j+1,2k+s}(x). \tag{24}$$

The elements $\psi^i_{j,k}(x) := 2^{j/2}\, \psi^i(2^j x - k)$ of the Legendre multiwavelets $\Psi_{j,k}$, having $K$ components for each resolution $j$ and admissible shift $k$ ($0 \le k \le 2^j - 1$), are given by

$$\psi^i_{j,k}(x) = \frac{1}{\sqrt{2}} \sum_{r=0}^{K-1} \left( g^{(0)}_{i,r}\, \phi^r_{j+1,2k}(x) + g^{(1)}_{i,r}\, \phi^r_{j+1,2k+1}(x) \right) = \frac{1}{\sqrt{2}} \sum_{r=0}^{K-1} \sum_{s=0}^{1} g^{(s)}_{i,r}\, \phi^r_{j+1,2k+s}(x). \tag{25}$$

The elements $h^{(s)}_{i,r}$ and $g^{(s)}_{i,r}$ ($s = 0, 1$) of the low-pass filter $H = \frac{1}{\sqrt{2}} \left( h^{(0)}\ \vdots\ h^{(1)} \right)$ and the high-pass filter $G = \frac{1}{\sqrt{2}} \left( g^{(0)}\ \vdots\ g^{(1)} \right)$ are obtained by using (22) in (24) and from the following relations at resolution 0 [23], respectively:

$$\int_0^1 \psi^i_{0,0}(x)\, x^m\,dx = 0 \quad \text{for } i = 0, 1, \ldots, K - 1;\ m = 0, 1, \ldots, K - 1 + i, \tag{26}$$

$$\int_0^1 \psi^{i_1}_{0,0}(x)\, \psi^{i_2}_{0,0}(x)\,dx = \delta_{i_1, i_2} \quad \text{for } i_1, i_2 = 0, 1, \ldots, K - 1. \tag{27}$$

Explicit values of the elements of $h^{(0)}$, $g^{(0)}$ for $K = 4, 5$ can be found in [25].
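
The low-pass filter elements can be generated directly from (22) and (24). Since the functions $\phi^r_{1,s}$ are orthonormal, projecting (24) at $j = 0$, $k = 0$ onto $\phi^r_{1,s}$ gives $h^{(s)}_{i,r} = \int_0^1 \phi^i\left(\frac{y + s}{2}\right) \phi^r(y)\,dy$. The sketch below evaluates these integrals with exact polynomial arithmetic and only applies the square-root normalization at the end. It is a minimal illustration using the standard library; the helper names (`shifted_legendre`, `compose_affine`, `lowpass`, and so on) are hypothetical, and the high-pass elements $g^{(s)}_{i,r}$, which require the conditions (26) and (27), are not computed here.

```python
import math
from fractions import Fraction


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_add(p, q):
    """Add two polynomials given as coefficient lists."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]


def shifted_legendre(n):
    """Coefficients of P_n(2x - 1), built with Bonnet's recurrence in z = 2x - 1."""
    z = [Fraction(-1), Fraction(2)]
    prev, cur = [Fraction(1)], z
    if n == 0:
        return prev
    for m in range(1, n):
        # (m + 1) P_{m+1}(z) = (2m + 1) z P_m(z) - m P_{m-1}(z)
        term = [Fraction(2 * m + 1, m + 1) * c for c in poly_mul(z, cur)]
        prev, cur = cur, poly_add(term, [Fraction(-m, m + 1) * c for c in prev])
    return cur


def compose_affine(p, a, b):
    """Return the coefficients of p(a*x + b)."""
    out, power = [Fraction(0)], [Fraction(1)]
    for c in p:
        out = poly_add(out, [c * t for t in power])
        power = poly_mul(power, [b, a])
    return out


def integrate01(p):
    """Exact integral of a polynomial over [0, 1]."""
    return sum(c / (i + 1) for i, c in enumerate(p))


def lowpass(K):
    """Return h with h[s][i][r] = integral_0^1 phi^i((y + s)/2) phi^r(y) dy."""
    P = [shifted_legendre(i) for i in range(K)]
    h = []
    for s in (0, 1):
        block = []
        for i in range(K):
            half = compose_affine(P[i], Fraction(1, 2), Fraction(s, 2))
            block.append([
                math.sqrt((2 * i + 1) * (2 * r + 1)) * float(integrate01(poly_mul(half, P[r])))
                for r in range(K)
            ])
        h.append(block)
    return h


h = lowpass(4)
print(h[0][1], h[1][1])
```

A useful sanity check is that the rows of $H = \frac{1}{\sqrt{2}} (h^{(0)}\ \vdots\ h^{(1)})$ are orthonormal, which follows from the orthonormality of the $\phi^i$.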

3.3 Multiscale Approximation of a Function

Here we present the multiscale approximation of a function $f \in L^2[0, 1]$:

$$f(x) \approx \begin{cases} \left( P_{V_J^K} f \right)(x) \equiv \displaystyle\sum_{k=0}^{2^J - 1} \sum_{i=0}^{K-1} c^i_{J,k}\, \phi^i_{J,k}(x), & \text{in Legendre piecewise polynomial (LPP) basis,} \\[2ex] \left( P_{V_0^K \oplus \left( \bigoplus_{j=0}^{J-1} W_j^K \right)} f \right)(x) \equiv \displaystyle\sum_{i=0}^{K-1} \left( c^i_{0,0}\, \phi^i(x) + \sum_{j=0}^{J-1} \sum_{k=0}^{2^j - 1} d^i_{j,k}\, \psi^i_{j,k}(x) \right), & \text{in LMW basis.} \end{cases} \tag{28}$$

For clarity of presentation, we use the following notations [25]. For a given $j$ and $k = 0, 1, \ldots, 2^j - 1$,

$$\Phi_{J,k}(x) = \left( \phi^0_{J,k}(x), \phi^1_{J,k}(x), \ldots, \phi^{K-1}_{J,k}(x) \right), \tag{29}$$

$$\Psi_{j,k}(x) = \left( \psi^0_{j,k}(x), \psi^1_{j,k}(x), \ldots, \psi^{K-1}_{j,k}(x) \right). \tag{30}$$

The bases for $V_J^K$, $W_j^K$, and $\bigoplus_{j=0}^{J} W_j^K$ are then denoted by

$$\Phi_J := \left( \Phi_{J,0}(x), \Phi_{J,1}(x), \ldots, \Phi_{J,2^J - 1}(x) \right)_{1 \times 2^J K}, \tag{31}$$

$$\Psi_j := \left( \Psi_{j,0}(x), \Psi_{j,1}(x), \ldots, \Psi_{j,2^j - 1}(x) \right)_{1 \times 2^j K}, \tag{32}$$

and

$${}_{J}\Psi := \left( \Psi_0, \Psi_1, \ldots, \Psi_J \right)_{1 \times (2^{J+1} - 1)K}. \tag{33}$$

The components of $c_{J,k}$ and $d_{j,k}$, the coefficients in the multiscale approximation of $f(x)$, are the inner products of $f(x)$ with $\phi^i_{J,k}(x)$ and $\psi^i_{j,k}(x)$, respectively. Finally, we use the symbols

$$c_J := \left( c_{J,0}, c_{J,1}, \ldots, c_{J,2^J - 1} \right)_{1 \times 2^J K}, \tag{34}$$

$$d_j := \left( d_{j,0}, d_{j,1}, \ldots, d_{j,2^j - 1} \right)_{1 \times 2^j K}, \tag{35}$$

and

$${}_{J}d := \left( d_0, d_1, \ldots, d_J \right)_{1 \times (2^{J+1} - 1)K}. \tag{36}$$

Then (28) can be expressed as

$$f(x) \approx \begin{cases} \left( P_{V_J^K} f \right)(x) \equiv \Phi_J\, c_J^T, & \text{in LPP basis,} \\[2ex] \left( P_{V_0^K \oplus \left( \bigoplus_{j=0}^{J-1} W_j^K \right)} f \right)(x) \equiv \left( \Phi_0,\ {}_{(J-1)}\Psi \right) \begin{pmatrix} c_0^T \\ {}_{(J-1)}d^T \end{pmatrix}, & \text{in LMW basis,} \end{cases} \tag{37}$$

where the superscript $T$ denotes the transpose. These notations and symbols have been discussed in some detail because of their relevance to the subsequent discussion.

3.4 Evaluation of Integrals

Our main task is to solve the given integral equation (20), which is of the form

$$v(x) + (Hv)(x) + (Lv)(x) = h(x), \quad 0 < x < 1,$$

$$\text{where} \quad (Lv)(x) = \int_0^1 \frac{v(t)}{x - t}\,dt. \tag{38}$$

To evaluate the multiscale representation of the integral operator $L$, we have to evaluate the Cauchy principal value (CPV) integrals involving the product of elements of the basis and their images under $L$. The method to evaluate such integrals has been discussed in some detail by Paul et al. [27]. For the sake of completeness, the results are stated here.
Integrals involving scale functions We use the notation

$$\rho(n; l_1, l_2) = \int_0^1 \int_0^1 \frac{\phi^{l_1}(x)\, \phi^{l_2}(t)}{n + x - t}\,dt\,dx, \quad n \in \mathbb{Z}. \tag{39}$$

Theorem 1 $\rho(n; l_1, l_2)$ satisfy the following relations

$$\rho(n; l_1, l_2) = \sum_{k_1=0}^{K-1} \sum_{k_2=0}^{K-1} \left[ h^{(0)}_{l_1,k_1} h^{(1)}_{l_2,k_2}\, \rho(2n - 1; k_1, k_2) + \left( h^{(0)}_{l_1,k_1} h^{(0)}_{l_2,k_2} + h^{(1)}_{l_1,k_1} h^{(1)}_{l_2,k_2} \right) \rho(2n; k_1, k_2) + h^{(1)}_{l_1,k_1} h^{(0)}_{l_2,k_2}\, \rho(2n + 1; k_1, k_2) \right]. \tag{40}$$

Moreover, this relation forms a system of linear equations for $\rho(0; l_1, l_2)$, and it has a unique solution.
Integrals involving product of scale functions and wavelets

Theorem 2 We denote by

$$\alpha(n; l_1, l_2, j, k) = \int_0^1 \int_0^1 \frac{\phi^{l_1}(x)\, \psi^{l_2}_{j,k}(t)}{n + x - t}\,dt\,dx, \quad n \in \mathbb{Z},\ j \ge 0,\ k = 0, \ldots, 2^j - 1, \tag{41}$$

then

$$\alpha(n; l_1, l_2, 0, 0) = \sum_{k_1=0}^{K-1} \sum_{k_2=0}^{K-1} \left[ h^{(0)}_{l_1,k_1} g^{(1)}_{l_2,k_2}\, \rho(2n - 1; k_1, k_2) + \left( h^{(0)}_{l_1,k_1} g^{(0)}_{l_2,k_2} + h^{(1)}_{l_1,k_1} g^{(1)}_{l_2,k_2} \right) \rho(2n; k_1, k_2) + h^{(1)}_{l_1,k_1} g^{(0)}_{l_2,k_2}\, \rho(2n + 1; k_1, k_2) \right]. \tag{42}$$

Moreover, the formula for the evaluation of $\alpha(n; l_1, l_2, j, k)$ for $j > 0$ and $k = 0, 1, \ldots, 2^j - 1$ is given by

$$\alpha(n; l_1, l_2, j, k) = \begin{cases} \displaystyle\sum_{k_1=0}^{K-1} \left[ h^{(0)}_{l_1 k_1}\, \alpha(2n; k_1, l_2, j - 1, k) + h^{(1)}_{l_1 k_1}\, \alpha(2n + 1; k_1, l_2, j - 1, k) \right], & \text{for } k = 0, 1, \ldots, 2^{j-1} - 1, \\[2ex] \displaystyle\sum_{k_1=0}^{K-1} \left[ h^{(0)}_{l_1 k_1}\, \alpha(2n - 1; k_1, l_2, j - 1, k - 2^{j-1}) + h^{(1)}_{l_1 k_1}\, \alpha(2n; k_1, l_2, j - 1, k - 2^{j-1}) \right], & \text{for } k = 2^{j-1}, 2^{j-1} + 1, \ldots, 2^j - 1. \end{cases} \tag{43}$$
Theorem 3 We denote by

$$\beta(n; l_1, l_2, j, k) = \int_0^1 \int_0^1 \frac{\psi^{l_1}_{j,k}(x)\, \phi^{l_2}(t)}{n + x - t}\,dt\,dx, \quad n \in \mathbb{Z},\ j \ge 0,\ k = 0, \ldots, 2^j - 1, \tag{44}$$

then

$$\beta(n; l_1, j, k; l_2) = -\alpha(-n; l_2; l_1, j, k). \tag{45}$$

Integrals involving product of wavelets

Theorem 4 We denote by

$$\gamma(n; l_1, j_1, k_1; l_2, j_2, k_2) = \int_0^1 \int_0^1 \frac{\psi^{l_1}_{j_1,k_1}(x)\, \psi^{l_2}_{j_2,k_2}(t)}{n + x - t}\,dt\,dx, \quad j_1, j_2 \in \mathbb{N},\ k_1 = 0, 1, \ldots, 2^{j_1} - 1,\ k_2 = 0, 1, \ldots, 2^{j_2} - 1, \tag{46}$$

then

$$\gamma(n; l_1, 0, 0; l_2, 0, 0) = \sum_{k_1=0}^{K-1} \sum_{k_2=0}^{K-1} \left[ g^{(0)}_{l_1,k_1} g^{(1)}_{l_2,k_2}\, \rho(2n - 1; k_1, k_2) + \left( g^{(0)}_{l_1,k_1} g^{(0)}_{l_2,k_2} + g^{(1)}_{l_1,k_1} g^{(1)}_{l_2,k_2} \right) \rho(2n; k_1, k_2) + g^{(1)}_{l_1,k_1} g^{(0)}_{l_2,k_2}\, \rho(2n + 1; k_1, k_2) \right], \tag{47}$$

$$\gamma(n; l_1, 0, 0; l_2, j, k) = \begin{cases} \displaystyle\sum_{k_1=0}^{K-1} \left[ g^{(0)}_{l_1 k_1}\, \alpha(2n; k_1, l_2, j - 1, k) + g^{(1)}_{l_1 k_1}\, \alpha(2n + 1; k_1; l_2, j - 1, k) \right], & \text{for } k = 0, 1, \ldots, 2^{j-1} - 1, \\[2ex] \displaystyle\sum_{k_1=0}^{K-1} \left[ g^{(0)}_{l_1 k_1}\, \alpha(2n - 1; k_1; l_2, j - 1, k - 2^{j-1}) + g^{(1)}_{l_1 k_1}\, \alpha(2n; k_1; l_2, j - 1, k - 2^{j-1}) \right], & \text{for } k = 2^{j-1}, 2^{j-1} + 1, \ldots, 2^j - 1, \end{cases} \tag{48}$$

$$\gamma(0; l_1, j_1, k_1; l_2, j_2, k_2) = 2^{j_1}\, \gamma\left( k_1 - r; l_1, 0, 0; l_2, j_2 - j_1, k_2 - 2^{j_2 - j_1} r \right). \tag{49}$$

Here $r$ takes the values $0, 1, \ldots, 2^{j_1} - 1$, so that for given $r$, $k_2 \in \{ r 2^{j_2 - j_1}, r 2^{j_2 - j_1} + 1, \ldots, (r + 1) 2^{j_2 - j_1} - 1 \}$.
We now denote the matrices as

$$\begin{aligned} \rho &:= [\rho(0; l_1, l_2)]_{K \times K}, \\ \alpha(j) &:= [\alpha(0; l_1; l_2, j, k)]_{K \times (2^j K)}, \\ \beta(j) &:= [\beta(0; l_1, j, k; l_2)]_{(2^j K) \times K}, \\ \gamma(j_1, j_2) &:= [\gamma(0; l_1, j_1, k_1; l_2, j_2, k_2)]_{(2^{j_1} K) \times (2^{j_2} K)}. \end{aligned} \tag{50}$$

3.5 Multiscale Representation of the Operator L

The multiscale representation

$$\left\langle \left( \Phi_0,\ {}_{(J-1)}\Psi \right),\ L \left( \Phi_0,\ {}_{(J-1)}\Psi \right) \right\rangle$$

of $L$ in the basis $\left( \Phi_0,\ {}_{(J-1)}\Psi \right)$ can be written in the form

$$L^{MS}_J = \begin{pmatrix} \rho & \alpha(0) & \alpha(1) & \cdots & \alpha(J - 1) \\ \beta(0) & \gamma(0, 0) & \gamma(0, 1) & \cdots & \gamma(0, J - 1) \\ \beta(1) & \gamma(1, 0) & \gamma(1, 1) & \cdots & \gamma(1, J - 1) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \beta(J - 1) & \gamma(J - 1, 0) & \gamma(J - 1, 1) & \cdots & \gamma(J - 1, J - 1) \end{pmatrix}_{(2^J K) \times (2^J K)} \tag{51}$$

where the sub-matrices $\rho$, $\alpha$, $\beta$, $\gamma$ are given in (50).

Also, the multiscale representation

$$\left\langle \left( \Phi_0,\ {}_{(J-1)}\Psi \right),\ H \left( \Phi_0,\ {}_{(J-1)}\Psi \right) \right\rangle$$

of $H$ in the basis $\left( \Phi_0,\ {}_{(J-1)}\Psi \right)$ can be written in the form

$$H^{MS}_J = \begin{pmatrix} \rho_H & \alpha_H(0) & \alpha_H(1) & \cdots & \alpha_H(J - 1) \\ \beta_H(0) & \gamma_H(0, 0) & \gamma_H(0, 1) & \cdots & \gamma_H(0, J - 1) \\ \beta_H(1) & \gamma_H(1, 0) & \gamma_H(1, 1) & \cdots & \gamma_H(1, J - 1) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \beta_H(J - 1) & \gamma_H(J - 1, 0) & \gamma_H(J - 1, 1) & \cdots & \gamma_H(J - 1, J - 1) \end{pmatrix}_{(2^J K) \times (2^J K)} \tag{52}$$

where the sub-matrices $\rho_H$, $\alpha_H$, $\beta_H$, $\gamma_H$ can easily be calculated by any standard Gauss–Legendre quadrature rule, as the kernel of $H$ is regular.

3.6 Application to Singular Integral Equation on R with

Cauchy-Type Kernel

For the integral equation (20), we assume that $f \in L^2[0, 1]$. The solution $v(x)$ of
Eq. (20) is also in $L^2[0, 1]$, so it has a multiscale expansion similar to (37).
Using the multiscale representations (51) for $L$ and (52) for $H$, discussed in Sect. 3.5,
the integral equation (20) can be recast into a system of linear algebraic equations
given by

$$\left( I + \lambda H^{MS}_J + \lambda L^{MS}_J \right) \begin{pmatrix} a_0^T \\ {}_{(J-1)}b^T \end{pmatrix} = \begin{pmatrix} c_0^T \\ {}_{(J-1)}d^T \end{pmatrix}. \tag{53}$$

The matrix $I$ in (53) is the identity matrix of order $(2^J K) \times (2^J K)$. The matrix $\left( I + \lambda H^{MS}_J + \lambda L^{MS}_J \right)$ is well conditioned, so the unknown coefficients $a_0$, ${}_{(J-1)}b$ can be found from

$$\begin{pmatrix} a_0^T \\ {}_{(J-1)}b^T \end{pmatrix} = \left( I + \lambda H^{MS}_J + \lambda L^{MS}_J \right)^{-1} \begin{pmatrix} c_0^T \\ {}_{(J-1)}d^T \end{pmatrix}. \tag{54}$$

Error estimation The $L^2$-error $\| v - v^{MS}_J \|_{L^2}$ in the approximate solution $v^{MS}_J$ given by (53) can be derived as

$$\| v - v^{MS}_J \|_{L^2} = \left[ \sum_{l=0}^{K-1} \sum_{j=J}^{\infty} \sum_{k=0}^{2^j - 1} \left| b^l_{j,k} \right|^2 \right]^{1/2}. \tag{55}$$

4 Illustrative Examples

To test the efficiency of the numerical method developed here, we consider two
examples [18].

Example 1 We consider Eq. (1) with

$$\lambda = -3, \quad h(x, t) = \frac{e^{-t^2}}{\left( 1 + |t|^{7/2} + x^2 \right)^3} \tag{56}$$

and

$$f(x) = \frac{x^2}{(1 + x^2)^4}. \tag{57}$$

For the LMW basis, we choose $\beta = 20$; then $f(x)$ satisfies the condition for the finite range
transformation, with $f(x) < 10^{-8}$ if $|x| > 20$. We have also computed the $L^2$ error
of the solution without knowing the exact solution. It is found to be 0.0007 from the
wavelet coefficients for $J = 3$, which are presented in Table 1. The plot of the approximate
solution $\sqrt{e^{-x^2}}\, v_3(x)$ obtained by using the LMW basis is shown in Fig. 1. The values at some
points in the LMW basis are also displayed in Table 2 for comparison with the results
of Bonis et al. [18], obtained by using an interpolation process related to the zeros of Hermite
polynomials. The approximate solution obtained by using Daubechies scale functions is
shown in Fig. 2 for $j = 5$, $k_5^{\min} = -128$, $k_5^{\max} = 123$, and $j_1 = 3$. The values of the
approximate solution $\sqrt{e^{-x^2}}\, u(x)$ obtained by using Daubechies scale functions at some points
are also displayed in Table 3 to compare our results with those of Bonis et al. [18].
Example 2 We consider another equation of the form (1) with

$$\lambda = -1, \quad h(x, t) = \frac{t^2 e^{-t^2}}{\left( 1 + t^4 + x^4 \right)^3} \tag{58}$$

and

$$f(x) = \frac{\tan^{-1}(1 + x)}{(1 + x^2)^3}. \tag{59}$$

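Table 1 The coefficients $b^l_{j,k}$ ($j = 0, 1, 2$; $k = 0, 1, 2, \ldots, 2^j - 1$) obtained from (54) by using LMW in Example 1

| j | k | l = 0 | l = 1 | l = 2 | l = 3 |
|---|---|-------|-------|-------|-------|
| 0 | 0 | $2 \times 10^{-4}$ | $-8 \times 10^{-4}$ | $-5 \times 10^{-5}$ | $9 \times 10^{-4}$ |
| 1 | 0 | $-1 \times 10^{-5}$ | $-4 \times 10^{-4}$ | $-9 \times 10^{-5}$ | $-2 \times 10^{-4}$ |
| 1 | 1 | $2 \times 10^{-4}$ | $-3 \times 10^{-4}$ | $1 \times 10^{-4}$ | $-9 \times 10^{-5}$ |
| 2 | 0 | $1 \times 10^{-4}$ | $-4 \times 10^{-4}$ | $6 \times 10^{-5}$ | $-4 \times 10^{-4}$ |
| 2 | 1 | $-3 \times 10^{-4}$ | $-5 \times 10^{-4}$ | $-3 \times 10^{-5}$ | $-2 \times 10^{-4}$ |
| 2 | 2 | $4 \times 10^{-4}$ | $-3 \times 10^{-4}$ | $6 \times 10^{-5}$ | $-6 \times 10^{-5}$ |
| 2 | 3 | $6 \times 10^{-6}$ | $-4 \times 10^{-5}$ | $6 \times 10^{-6}$ | $-5 \times 10^{-5}$ |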

Fig. 1 The plot of $\sqrt{e^{-x^2}}\, v_3(x)$ for Example 1

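Table 2 The value of $\sqrt{e^{-x^2}}\, v_3(x)$ by using LMW in Example 1, with comparison with [18]

| | x = −1.5 | x = 0.2 | x = 0.5 |
|---|---|---|---|
| Present method for J = 3 | $2.2 \times 10^{-3}$ | $5.88 \times 10^{-3}$ | $2.9 \times 10^{-4}$ |
| Method in [18] for n = 256 | $1.97 \times 10^{-3}$ | $5.14 \times 10^{-3}$ | $1.99 \times 10^{-3}$ |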

Fig. 2 The plot of $\sqrt{e^{-x^2}}\, v_3(x)$ for Example 2

Table 3 The value of $\sqrt{e^{-x^2}}\, u(x)$ in Example 1 by using Daubechies scale functions, with comparison with [18]

| | x = −1.5 | x = 0.5 |
|---|---|---|
| Present method for J = 4 by using Daubechies scale functions | $2.519 \times 10^{-3}$ | $3.119 \times 10^{-3}$ |
| Present method for J = 5 by using Daubechies scale functions | $2.723 \times 10^{-3}$ | $3.129 \times 10^{-3}$ |
| Method in [18] for n = 256 | $1.97 \times 10^{-3}$ | $1.99 \times 10^{-3}$ |

For the LMW basis, we also choose $\beta = 20$ for this example; $f(x)$ satisfies the condition
for the finite range transformation, with $f(x) < 10^{-8}$ if $|x| > 20$. We have also
computed the $L^2$ error of the solution without knowing the exact solution. It is found
to be 0.01 from the wavelet coefficients for $J = 3$, which are presented in Table 4. The
plot of $\sqrt{e^{-x^2}}\, v_3(x)$ obtained by using LMW is shown in Fig. 3. The values at some points
are also displayed in Table 5 to compare our results with those of Bonis et al. [18].
The approximate solution obtained by using Daubechies scale functions is shown in Fig. 4
for $j = 5$, $k_5^{\min} = -128$, $k_5^{\max} = 123$, and $j_1 = 3$. The values of the approximate
solution $\sqrt{e^{-x^2}}\, u(x)$ obtained by using Daubechies scale functions at some points are also
displayed in Table 6 to compare our results with those of Bonis et al. [18]. From the
figures, it is noted that the result obtained by using Daubechies scale functions agrees more closely with
those of Bonis et al. [18] than the result obtained by using the LMW-based method. Hence
the Daubechies scale function-based method is more accurate than the
LMW-based method for dealing with an unbounded domain.

Table 4 The coefficients $b^l_{j,k}$ ($j = 0, 1, 2$; $k = 0, 1, 2, \ldots, 2^j - 1$) obtained from (54) by using LMW in Example 2

| j | k | l = 0 | l = 1 | l = 2 | l = 3 |
|---|---|-------|-------|-------|-------|
| 0 | 0 | 0.0026 | −0.0068 | −0.0084 | 0.02479 |
| 1 | 0 | 0.0086 | 0.0042 | 0.0026 | 0.0002 |
| 1 | 1 | 0.0013 | −0.0034 | 0.002 | −0.0026 |
| 2 | 0 | 0.0007 | −0.0013 | 0.0004 | −0.0012 |
| 2 | 1 | 0.0048 | 0.0016 | 0.0003 | −0.0004 |
| 2 | 2 | 0.005 | −0.0091 | 0.0032 | −0.0074 |
| 2 | 3 | 0.0006 | −0.0016 | 0.0005 | −0.0015 |

Fig. 3 The plot of $\sqrt{e^{-x^2}}\, u(x)$ by using Daubechies scale functions with $j = 5$ for Example 1

Table 5 The value of $\sqrt{e^{-x^2}}\, v_3(x)$ in Example 2 by using LMW, with comparison with [18]

| | x = −1 | x = −0.5 | x = 0.4 |
|---|---|---|---|
| Present method for J = 3 | $5.43 \times 10^{-2}$ | $1.25 \times 10^{-1}$ | $-1.97 \times 10^{-2}$ |
| Method in [18] for n = 256 | $5.17 \times 10^{-2}$ | $1.49 \times 10^{-1}$ | $-4.12 \times 10^{-2}$ |

Fig. 4 The plot of $\sqrt{e^{-x^2}}\, u(x)$ by using Daubechies scale functions with $j = 5$ for Example 2

Table 6 The value of $\sqrt{e^{-x^2}}\, u(x)$ in Example 2 by using Daubechies scale functions, with comparison with [18]

| | x = −1 | x = −0.5 |
|---|---|---|
| Present method for J = 4 by using Daubechies scale functions | $6.448 \times 10^{-2}$ | $1.671 \times 10^{-1}$ |
| Present method for J = 5 by using Daubechies scale functions | $6.429 \times 10^{-2}$ | $1.668 \times 10^{-1}$ |
| Method in [18] for n = 256 | $5.17 \times 10^{-2}$ | $1.49 \times 10^{-1}$ |

5 Conclusion

In this paper, we have presented a comparative study of two methods based on
Daubechies scale functions and Legendre multiwavelets. These methods are used
to solve the Fredholm integral equation of the second kind with Cauchy-type kernel in
an unbounded domain. The method of solution by using Daubechies scale functions
is discussed. In the process of our development, the recurrence relations among
elements or formulae involving the elements of the multiscale representation of the
integral operator (CPV) have been derived. The efficiency of the two methods has
been tested and compared on two examples, and the $L^2$-error of the solution has been
computed from the wavelet coefficients. It is found to be of order $10^{-3}$ when 32 basis elements are taken.
For solving the Fredholm integral equation of the second kind with Cauchy-type kernel in an
unbounded domain, it is found that the Daubechies scale function-based method is more
appropriate than the LMW-based method.

This study motivates us to extend the scheme based on Daubechies scale func-
tion and Legendre multiwavelets to get multiscale approximation and local behavior
of the solution of integro-differential equation, integro-differential-difference equa-
tions with constant or variable coefficients and regular or non-smooth input function
involving weakly singular, Cauchy singular or hypersingular kernels in finite as well
as infinite domain. Works in these directions are in progress and will be reported in
due course.

Acknowledgements S. Paul is thankful to Dr. M. M. Panja for his idea and valuable suggestions
during the preparation of this paper. This work is supported by a research grant from SERB(DST),
No. SR/S4/MS:821/13.

References

1. Helsing, J., Peters, G.: Integral equation methods and numerical solution of crack and inclusion
problems in planar elastostatics. SIAM J. Appl. Math. 59(3), 965–982 (1999)
2. Muskhelishvili, N.I.: Singular Integral Equations: Boundary Problems of Function Theory
and Their Application to Mathematical Physics (1953)
3. Gakhov, F.D.: Boundary Value Problems. Pergamon Press, New York (1966)
4. Ioakimidis, N.I.: On the weighted Galerkin method of numerical solution of Cauchy-type
singular integral equations. SIAM J. Numer. Anal. 18(6), 1120–1127 (1981)
5. Gong, Y.: Galerkin solution of a singular integral equation with constant coefficients. J. Comput.
Appl. Math. 230, 393–399 (2009)
6. Gong, Y.: New errors for Galerkin method to an airfoil equation. J. Comput. Appl. Math. 206,
278–287 (2007)
7. Monegato, G., Sloan, I.H.: Numerical solution of the generalized airfoil equation for an airfoil
with a flap. SIAM J. Numer. Anal. 34, 2288–2305 (1997)
8. Junghans, P., Kaiser, R.: Collocation for a Cauchy singular integral equation. Lin. Alg. Appl.
439(1), 729–770 (2013)
9. Scuderi, L.: A collocation method for the generalized airfoil equation for an airfoil with a flap.
SIAM J. Numer. Anal. 35, 1725–1739 (1998)
10. Setia, A.: Numerical solution of various cases of Cauchy type singular integral equation. Appl.
Math. Comput. 230(3), 200–207 (2014)
11. Arzhang, A.: Numerical solution of weakly singular integral equations by using Taylor series
and Legendre polynomial. Math. Sci. 4(2), 187–203 (2010)
12. Panja, M.M., Mandal, B.N.: Solution of second kind integral equation with Cauchy type kernel
using Daubechies scale function. J. Comput. Appl. Math. 241, 130–142 (2013)
13. Chandler-Wilde, S.N., Ross, C.R., Zhang, B.: Scattering by infinite one-dimensional rough
surfaces. Proc. R. Soc. Lond. A 455, 3767–3787 (1999)
14. Arens, T.: Uniqueness of elastic wave scattering by rough surfaces. SIAM J. Math. Anal. 33,
461–476 (2001)
15. Arens, T.: Existence of solution in elastic wave scattering by unbounded rough surfaces. Math.
Meth. Appl. Sci. 25, 507–526 (2002)
16. Preston, M.D., Chamberlain, P.G., Chandler-Wilde, S.N.: An integral equation method for
a boundary value problem arising in unsteady water wave problems. In: Advances in Boundary
Integral Methods, Proceedings of the 5th UK Conference on Boundary Integral Methods,
University of Liverpool, pp. 126–133 (2005)
17. Fariborz, S.J.: Singular integral equations with Cauchy kernel on the half line. Int. J. Engng.
Sci. 25, 123–126 (1987)
18. Bonis, M.C., Frammartino, C., Mastroianni, G.: Numerical methods for some special Fredholm
integral equations on the real line. J. Comput. Appl. Math. 164–165, 225–243 (2004)
19. Sheshko, M.A., Sheshko, S.M.: Singular integral equation with Cauchy kernel on the real axis.
Diff. Eqn. 46, 568–585 (2010)
20. Pylak, D., Karczmarek, P., Sheshko, M.A.: Cauchy-type singular integral equation with constant
coefficients on the real line. Appl. Math. Comput. 217, 2977–2988 (2010)
21. Daubechies, I.: Ten Lectures on Wavelets. In: CBMS Lecture Notes, SIAM Publication,
Philadelphia (1992)
22. Meyer, Y.: Wavelets and Operators. Cambridge University Press, Cambridge (1992)
23. Alpert, B.K.: A class of bases in L^2 for the sparse representation of integral operators. SIAM
J. Math. Anal. 24(1), 246–262 (1993)
24. Alpert, B.K., Beylkin, G., Coifman, R., Rokhlin, V.: Wavelet-like bases for the fast solution of
second-kind integral equations. SIAM J. Sci. Comput. 14, 159–184 (1993)
25. Paul, S., Panja, M.M., Mandal, B.N.: Multiscale approximation of the solution of weakly
singular second kind Fredholm integral equation in Legendre multiwavelet basis. J. Comput.
Appl. Math. 300, 275–289 (2016)
26. Paul, S., Panja, M.M., Mandal, B.N.: Wavelet based numerical solution of second kind
hypersingular integral equation. Appl. Math. Sci. 10(54), 2687–2707 (2016)
27. Paul, S., Panja, M.M., Mandal, B.N.: Use of Legendre multiwavelets in solving second kind
singular integral equations with Cauchy type kernel. Invest. Math. Sci. 5, 2687–2707 (2016)
28. Paul, S., Panja, M.M., Mandal, B.N.: Use of Legendre multiwavelets to solve Carleman type
singular integral equations. Appl. Math. Model. 55, 522–535 (2018)
29. Panja, M.M., Mandal, B.N.: A note on one-point quadrature formula for Daubechies scale
function with partial support. Appl. Math. Comput. 218, 4147–4151 (2011)
30. Atkinson, K.E.: The numerical solution of integral equations on the half line. SIAM J. Numer.
Anal. 6, 375–397 (1969)
31. Anselone, P.M., Sloan, I.H.: Integral equations on the half line. J. Integr. Eqn. 9, 3–23 (1985)
32. Chandler-Wilde, S.N.: On asymptotic behavior at infinity and the finite section method for
integral equations on the half line. J. Integr. Eqn. 6, 1–38 (1994)
33. Chandler-Wilde, S.N.: On the behavior at infinity of solutions of integral equations on the real
line. J. Integr. Eqn. 4(2), 1–25 (1992)