Pal, Sankar K.
Pattern recognition algorithms for data mining : scalability, knowledge discovery, and
soft granular computing / Sankar K. Pal and Pabitra Mitra.
p. cm.
Includes bibliographical references and index.
ISBN 1-58488-457-6 (alk. paper)
1. Data mining. 2. Pattern recognition systems. 3. Computer algorithms. 4. Granular
computing.
QA76.9.D343P38 2004
006.3'12—dc22 2004043539
This book contains information obtained from authentic and highly regarded sources. Reprinted material
is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable
efforts have been made to publish reliable data and information, but the author and the publisher cannot
assume responsibility for the validity of all materials or for the consequences of their use.
Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic
or mechanical, including photocopying, microfilming, and recording, or by any information storage or
retrieval system, without prior permission in writing from the publisher.
The consent of CRC Press LLC does not extend to copying for general distribution, for promotion, for
creating new works, or for resale. Specific permission must be obtained in writing from CRC Press LLC
for such copying.
Direct all inquiries to CRC Press LLC, 2000 N.W. Corporate Blvd., Boca Raton, Florida 33431.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are
used only for identification and explanation, without intent to infringe.
Foreword
Preface
1 Introduction
1.1 Introduction
1.2 Pattern Recognition in Brief
1.2.1 Data acquisition
1.2.2 Feature selection/extraction
1.2.3 Classification
1.3 Knowledge Discovery in Databases (KDD)
1.4 Data Mining
1.4.1 Data mining tasks
1.4.2 Data mining tools
1.4.3 Applications of data mining
1.5 Different Perspectives of Data Mining
1.5.1 Database perspective
1.5.2 Statistical perspective
1.5.3 Pattern recognition perspective
1.5.4 Research issues and challenges
1.6 Scaling Pattern Recognition Algorithms to Large Data Sets
1.6.1 Data reduction
1.6.2 Dimensionality reduction
1.6.3 Active learning
1.6.4 Data partitioning
1.6.5 Granular computing
1.6.6 Efficient search algorithms
1.7 Significance of Soft Computing in KDD
1.8 Scope of the Book
© 2004 by Taylor & Francis Group, LLC
References
Index
Indian Statistical Institute (ISI), the home base of Professors S.K. Pal and P.
Mitra, has long been recognized as the world’s premier center of fundamental
research in probability, statistics and, more recently, pattern recognition and
machine intelligence. The halls of ISI are adorned with the names of P.C. Ma-
halanobis, C.R. Rao, R.C. Bose, D. Basu, J.K. Ghosh, D. Dutta Majumder,
K.R. Parthasarathi and other great intellects of the past century–great intel-
lects who have contributed so much and in so many ways to the advancement
of science and technology. The work of Professors Pal and Mitra, “Pattern
Recognition Algorithms for Data Mining,” or PRDM for short, reflects this
illustrious legacy. The importance of PRDM is hard to exaggerate. It is a
treatise that is an exemplar of authority, deep insights, encyclopedic coverage
and high expository skill.
The primary objective of PRDM, as stated by the authors, is to provide
a unified framework for addressing pattern recognition tasks which are es-
sential for data mining. In reality, the book accomplishes much more; it
develops a unified framework and presents detailed analyses of a wide spec-
trum of methodologies for dealing with problems in which recognition, in one
form or another, plays an important role. Thus, the concepts and techniques
described in PRDM are of relevance not only to problems in pattern recog-
nition, but, more generally, to classification, analysis of dependencies, system
identification, authentication, and ultimately, to data mining. In this broad
perspective, conventional pattern recognition becomes a specialty–a specialty
with deep roots and a large store of working concepts and techniques.
Traditional pattern recognition is subsumed by what may be called recognition
technology. I take some credit for arguing, some time ago, that development
of recognition technology should be accorded a high priority. My
arguments may be found in the foreword, “Recognition Technology and Fuzzy
Logic,” Special Issue on Recognition Technology, IEEE Transactions on Fuzzy
Systems, 2001. A visible consequence of my arguments was the addition of
the subtitle “Soft Computing in Recognition and Search” to the title of the
journal “Approximate Reasoning.” What is important to note is that recognition
technology is based on soft computing, a coalition of methodologies which
collectively provide a platform for the conception, design and utilization of
intelligent systems. The principal constituents of soft computing are fuzzy logic,
neurocomputing, evolutionary computing, probabilistic computing, rough set
theory and machine learning. These are the methodologies which are described
and applied in PRDM with a high level of authority and expository
Foreword
This is the latest in a series of volumes by Professor Sankar Pal and his col-
laborators on pattern recognition methodologies and applications. Knowledge
discovery and data mining, the recognition of patterns that may be present in
very large data sets and across distributed heterogeneous databases, is an ap-
plication of current prominence. This volume provides a very useful, thorough
exposition of the many facets of this application from several perspectives.
The chapters provide overviews of pattern recognition and data mining, outline
some of the research issues, and carefully take the reader through the many
steps that are involved in reaching the desired goal of exposing the patterns
that may be embedded in voluminous data sets. These steps include preprocessing
operations for reducing the volume of the data and the dimensionality
of the feature space, clustering, segmentation, and classification. Search
algorithms and statistical and database operations are examined. Attention is
devoted to soft computing algorithms derived from the theories of rough sets,
fuzzy sets, genetic algorithms, multilayer perceptrons (MLP), and various hy-
brid combinations of these methodologies.
A valuable expository appendix describes various soft computing method-
ologies and their role in knowledge discovery and data mining (KDD). A sec-
ond appendix provides the reader with several data sets for experimentation
with the procedures described in this volume.
As has been the case with previous volumes by Professor Pal and his col-
laborators, this volume will be very useful to both researchers and students
interested in the latest advances in pattern recognition and its applications in
KDD.
I congratulate the authors of this volume and I am pleased to recommend
it as a valuable addition to the books in this field.
Preface
Sankar K. Pal
Pabitra Mitra
September 13, 2003
8.1 Rough set dependency rules for Vowel data along with the input fuzzification parameter values
8.2 Comparative performance of different models
8.3 Comparison of the performance of the rules extracted by various methods for Vowel, Pat and Hepatobiliary data
8.4 Rules extracted from trained networks (Model S) for Vowel data along with the input fuzzification parameter values
8.5 Rules extracted from trained networks (Model S) for Pat data along with the input fuzzification parameter values
8.6 Rules extracted from trained networks (Model S) for Hepatobiliary data along with the input fuzzification parameter values
8.7 Crude rules obtained via rough set theory for staging of cervical cancer
8.8 Rules extracted from the modular rough MLP for staging of cervical cancer
4.4 Variation of a_test with CPU time for (a) cancer, (b) ionosphere, (c) heart, (d) twonorm, and (e) forest cover type data.
4.5 Variation of confidence factor c and distance D for (a) cancer, (b) ionosphere, (c) heart, and (d) twonorm data.
4.6 Variation of confidence factor c with iterations of StatQSVM algorithm for (a) cancer, (b) ionosphere, (c) heart, and (d) twonorm data.
4.7 Margin distribution obtained at each iteration by the StatQSVM algorithm for the Twonorm data. The bold line denotes the final distribution obtained.
4.8 Margin distribution obtained by some SVM design algorithms for the Twonorm data set.
7.1 The basic network structure for the Kohonen feature map.
7.2 Neighborhood N_c, centered on unit c (x_c, y_c). Three different neighborhoods are shown at distance d = 1, 2, and 3.
7.3 Mapping of reducts in the competitive layer of RSOM.
7.4 Variation of quantization error with iteration for Pat data.
7.5 Variation of quantization error with iteration for vowel data.
7.6 Plot showing the frequency of winning nodes using random weights for the Pat data.
7.7 Plot showing the frequency of winning nodes using rough set knowledge for the Pat data.
1.1 Introduction
Pattern recognition (PR) is an activity that we humans normally excel
in. We do it almost all the time, and without conscious effort. We receive
information via our various sensory organs, which is processed instantaneously
by our brain so that, almost immediately, we are able to identify the source
of the information, without having made any perceptible effort. What is
even more impressive is the accuracy with which we can perform recognition
tasks even under non-ideal conditions, for instance, when the information that
needs to be processed is vague, imprecise or even incomplete. In fact, most
of our day-to-day activities are based on our success in performing various
pattern recognition tasks. For example, when we read a book, we recognize the
letters, words and, ultimately, concepts and notions, from the visual signals
received by our brain, which processes them speedily and probably does a
neurobiological implementation of template-matching! [189]
The discipline of pattern recognition (or pattern recognition by machine)
essentially deals with the problem of developing algorithms and methodolo-
gies/devices that can enable the computer-implementation of many of the
recognition tasks that humans normally perform. The motivation is to per-
form these tasks more accurately, or faster, and perhaps more economically
than humans and, in many cases, to release them from drudgery resulting from
performing routine recognition tasks repetitively and mechanically. The scope
of PR also encompasses tasks humans are not good at, such as reading bar
codes. The goal of pattern recognition research is to devise ways and means of
automating certain decision-making processes that lead to classification and
recognition.
Machine recognition of patterns can be viewed as a two-fold task, consisting
of learning the invariant and common properties of a set of samples charac-
terizing a class, and of deciding that a new sample is a possible member of
the class by noting that it has properties common to those of the set of sam-
ples. The task of pattern recognition by a computer can be described as a
transformation from the measurement space M to the feature space F and
finally to the decision space D; i.e.,
M → F → D.
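The chain of mappings can be pictured with a small sketch. The two features (mean and range of a signal) and the threshold rule below are hypothetical stand-ins chosen only for illustration; they are not taken from the text.

```python
from typing import List, Sequence

Measurement = Sequence[float]  # a point in the measurement space M
Feature = List[float]          # a point in the feature space F
Decision = str                 # a class label in the decision space D


def extract_features(m: Measurement) -> Feature:
    """M -> F: summarize a raw signal by its mean and its range (assumed features)."""
    return [sum(m) / len(m), max(m) - min(m)]


def decide(f: Feature) -> Decision:
    """F -> D: a hand-set threshold rule standing in for a trained classifier."""
    mean, _spread = f
    return "high" if mean > 0.5 else "low"


def recognize(m: Measurement) -> Decision:
    """Compose the two mappings: M -> F -> D."""
    return decide(extract_features(m))
```

In a real system both mappings would be designed from training samples rather than fixed by hand.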
Data mining is that part of knowledge discovery which deals with the pro-
cess of identifying valid, novel, potentially useful, and ultimately understand-
able patterns in data, and excludes the knowledge interpretation part of KDD.
Therefore, as it stands now, data mining can be viewed as applying PR and
machine learning principles in the context of voluminous, possibly heteroge-
neous data sets [189].
The objective of this book is to provide some results of investigations,
both theoretical and experimental, addressing certain pattern recognition
tasks essential for data mining. Tasks considered include data condensation,
feature selection, case generation, clustering, classification and rule genera-
tion/evaluation. Various methodologies based on both classical and soft com-
puting approaches (integrating fuzzy logic, artificial neural networks, rough
sets, genetic algorithms) have been presented. These methodologies emphasize
(a) handling data sets which are large (both in size and
dimension) and involve classes that are overlapping, intractable and/or have
nonlinear boundaries, and (b) demonstrating the significance of granular computing
in the soft computing paradigm for generating linguistic rules and dealing
with the knowledge discovery aspect. Before we describe the scope of the
book, we provide a brief review of pattern recognition, knowledge discovery
in databases, data mining, challenges in the application of pattern recognition
algorithms to data mining problems, and some of the possible solutions.
Section 1.2 briefly describes the basic concepts, features and techniques
of pattern recognition. Next, we define the KDD process and
describe its various components. In Section 1.4 we elaborate upon the data
mining aspects of KDD, discussing its components, tasks involved, approaches
and application areas. The pattern recognition perspective of data mining is
introduced next and related research challenges are mentioned. The problem
of scaling pattern recognition algorithms to large data sets is discussed in Sec-
tion 1.6. Some broad approaches to achieving scalability are listed. The role
of soft computing in knowledge discovery is described in Section 1.7. Finally,
Section 1.8 discusses the plan of the book.
two categories – feature selection in the measurement space and feature selec-
tion in a transformed space. The techniques in the first category generally
reduce the dimensionality of the measurement space by discarding redundant
or least information carrying features. On the other hand, those in the sec-
ond category utilize all the information contained in the measurement space
to obtain a new transformed space, thereby mapping a higher dimensional
pattern to a lower dimensional one. This is referred to as feature extraction.
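The difference between the two categories can be sketched as follows. Ranking features by variance and projecting with a fixed weight matrix are assumptions made for this illustration; the text does not prescribe a particular selection criterion or transform, and the function names are hypothetical.

```python
from statistics import pvariance
from typing import List, Sequence


def select_features(data: Sequence[Sequence[float]], k: int) -> List[int]:
    """Selection in the measurement space: keep the indices of the k
    highest-variance features and discard the rest (assumed criterion)."""
    n_features = len(data[0])
    variances = [pvariance([row[j] for row in data]) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:k])


def extract_features(pattern: Sequence[float],
                     weights: Sequence[Sequence[float]]) -> List[float]:
    """Extraction in a transformed space: every new feature is a weighted
    sum of all original features, so no measurement is thrown away outright."""
    return [sum(w * x for w, x in zip(row, pattern)) for row in weights]
```

Selection returns a subset of the original axes, while extraction returns new axes that mix all of them.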
1.2.3 Classification
The problem of classification is basically one of partitioning the feature
space into regions, one region for each category of input. Thus it attempts to
assign every data point in the entire feature space to one of the possible classes
(say, M). In real life, the complete description of the classes is not known.
We have instead a finite and usually smaller number of samples which often
provides partial information for optimal design of feature selector/extractor
or classifying/clustering system. Under such circumstances, it is assumed that
these samples are representative of the classes. Such a set of typical patterns
is called a training set. On the basis of the information gathered from the
samples in the training set, the pattern recognition systems are designed; i.e.,
we decide the values of the parameters of various pattern recognition methods.
Design of a classification or clustering scheme can be made with labeled or
unlabeled data. When the computer is given a set of objects with known
classifications (i.e., labels) and is asked to classify an unknown object based
on the information acquired by it during training, we call the design scheme
supervised learning; otherwise we call it unsupervised learning. Supervised
learning is used for classifying different objects, while clustering is performed
through unsupervised learning.
Pattern classification, by its nature, admits many approaches, sometimes
complementary, sometimes competing, to provide solution of a given problem.
These include decision theoretic approach (both deterministic and probabilis-
tic), syntactic approach, connectionist approach, fuzzy and rough set theoretic
approach and hybrid or soft computing approach.
In the decision theoretic approach, once a pattern is transformed, through
feature evaluation, to a vector in the feature space, its characteristics are ex-
pressed only by a set of numerical values. Classification can be done by using
deterministic or probabilistic techniques [55, 59]. In deterministic classifica-
tion approach, it is assumed that there exists only one unambiguous pattern
class corresponding to each of the unknown pattern vectors. Nearest neighbor
classifier (NN rule) [59] is an example of this category.
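A minimal sketch of the NN rule, assuming Euclidean distance in the feature space and a training set given as (feature vector, label) pairs; the function name is hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def nearest_neighbor(train: List[Tuple[Sequence[float], str]],
                     query: Sequence[float]) -> str:
    """Assign the query the label of its closest training pattern."""
    best_label, best_dist = None, math.inf
    for features, label in train:
        d = math.dist(features, query)  # Euclidean distance (assumed metric)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label
```

Each query receives exactly one class, which is the sense in which the rule is deterministic.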
In most of the practical problems, the features are usually noisy and the
classes in the feature space are overlapping. In order to model such systems,
the features x_1, x_2, ..., x_i, ..., x_p are considered as random variables in the
probabilistic approach. The most commonly used classifier in such probabilis-
tic systems is the Bayes maximum likelihood classifier [59].
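As an illustration of the probabilistic approach, the sketch below assumes Gaussian class-conditional densities with independent features, a simplification adopted here only to keep the example short. The names `fit`, `log_score` and `classify` are hypothetical.

```python
import math
from collections import defaultdict


def fit(samples):
    """Estimate the prior, per-feature means and variances for each class."""
    by_class = defaultdict(list)
    for x, label in samples:
        by_class[label].append(x)
    model = {}
    for label, rows in by_class.items():
        cols = list(zip(*rows))
        means = [sum(c) / len(c) for c in cols]
        # A small floor keeps every variance strictly positive.
        variances = [max(sum((v - m) ** 2 for v in c) / len(c), 1e-9)
                     for c, m in zip(cols, means)]
        model[label] = (len(rows) / len(samples), means, variances)
    return model


def log_score(x, prior, means, variances):
    """Log prior plus Gaussian log-likelihood (independence assumed)."""
    score = math.log(prior)
    for v, m, s2 in zip(x, means, variances):
        score += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
    return score


def classify(model, x):
    """Pick the class with the largest score."""
    return max(model, key=lambda label: log_score(x, *model[label]))
```

Overlapping classes are handled through the densities: a point between two class means is assigned to whichever class makes it more probable.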
Although both these concepts are important, it has often been observed that
actionability and unexpectedness are correlated. In the literature, unexpectedness
is often defined in terms of the dissimilarity of a discovered pattern from a
vocabulary provided by the user.
As an example, consider a database of student evaluations of different
courses offered at some university. This can be defined as EVALUATE (TERM,
YEAR, COURSE, SECTION, INSTRUCTOR, INSTRUCT RATING, COURSE RATING). We
describe two patterns that are interesting in terms of actionability and unex-
pectedness respectively. The pattern that “Professor X is consistently getting
the overall INSTRUCT RATING below the overall COURSE RATING” can be of in-
terest to the chairperson because this shows that Professor X has room for
improvement. If, on the other hand, in most of the course evaluations the
overall INSTRUCT RATING is higher than the COURSE RATING and it turns out
that in most of Professor X’s ratings overall the INSTRUCT RATING is lower
than the COURSE RATING, then such a pattern is unexpected and hence inter-
esting. ✸
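The unexpectedness test in this example can be sketched in a few lines. The row layout (instructor, instructor rating, course rating), the 0.5 threshold and the function names are assumptions made for illustration only.

```python
from collections import defaultdict


def share_below(rows):
    """Fraction of evaluations whose instructor rating is below the course rating."""
    return sum(1 for _, instr, course in rows if instr < course) / len(rows)


def unexpected_instructors(rows, threshold=0.5):
    """Return instructors whose own trend contradicts the overall trend."""
    overall_below = share_below(rows) > threshold
    by_instructor = defaultdict(list)
    for row in rows:
        by_instructor[row[0]].append(row)
    return sorted(name for name, own in by_instructor.items()
                  if (share_below(own) > threshold) != overall_below)
```

An instructor is flagged only when the majority pattern in his or her own evaluations runs against the majority pattern in the whole table, which mirrors the second pattern described above.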
Data mining is a step in the KDD process that consists of applying data
analysis and discovery algorithms which, under acceptable computational lim-
itations, produce a particular enumeration of patterns (or generate a model)
over the data. It uses historical information to discover regularities and im-
prove future decisions [161].
The overall KDD process is outlined in Figure 1.1. It is interactive and
iterative involving, more or less, the following steps [65, 66]:
[Figure 1.1 (the KDD process), labels recoverable from the diagram: Huge Heterogeneous Raw Data; Data Cleaning, Data Condensation, Dimensionality Reduction, Data Wrapping; Preprocessed Data; Machine Learning (Classification, Clustering, Rule Generation); Mathematical Model of Data (Patterns); Knowledge Interpretation (Knowledge Extraction, Knowledge Evaluation); Useful Knowledge]
Thus, KDD refers to the overall process of turning low-level data into high-
level knowledge. Perhaps the most important step in the KDD process is data
mining. However, the other steps are also important for the successful appli-
cation of KDD in practice. For example, steps 1, 2 and 3, mentioned above,
have been the subject of widespread research in the area of data warehousing.
We now focus on the data mining component of KDD.
that contain it. Businesses can use knowledge of these patterns to im-
prove placement of items in a store or for mail-order marketing. The
huge size of transaction databases and the exponential increase in the
number of potential frequent itemsets with increase in the number of attributes
(items) make the above problem a challenging one. The Apriori
algorithm [3] provided one early solution which was improved by subsequent
algorithms using partitioning, hashing, sampling and dynamic
itemset counting.
2. Clustering: maps a data item into one of several clusters, where clusters
are natural groupings of data items based on similarity metrics or prob-
ability density models. Clustering is used in several exploratory data
analysis tasks, customer retention and management, and web mining.
The clustering problem has been studied in many fields, including statis-
tics, machine learning and pattern recognition. However, large data
considerations were absent in these approaches. Recently, several new
algorithms with greater emphasis on scalability have been developed, in-
cluding those based on summarized cluster representation called cluster
feature (Birch [291], ScaleKM [29]), sampling (CURE [84]) and density
joins (DBSCAN [61]).
Some other tasks required in some data mining applications are outlier/anomaly
detection, link analysis, optimization and planning.
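The level-wise search for frequent itemsets mentioned under association rule discovery can be sketched as below. This is a compact illustration of the general idea, not the implementation of [3]; supports are counted as absolute numbers of transactions, and the function name is hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset contained in at least
    min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that contain exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = [c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent
```

The prune step is what limits the exponential growth in candidates: an itemset is counted only if all of its subsets have already been found frequent.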
5. Example based methods (e.g., nearest neighbor [7], lazy learning [5] and
case based reasoning [122, 208] methods)
The data mining algorithms determine both the flexibility of the model in
representing the data and the interpretability of the model in human terms.
Typically, the more complex models may fit the data better but may also
be more difficult to understand and to fit reliably. Also, each representation
suits some problems better than others. For example, decision tree classifiers
can be very useful for finding structure in high dimensional spaces and are
also useful in problems with mixed continuous and categorical data. However,
they may not be suitable for problems where the true decision boundaries are
nonlinear multivariate functions.
[Figure: application areas of data mining, with shares as extracted: Banking (17%), eCommerce/Web (15%), Telecom (11%), Other (11%), Biology/Genetics (8%), Fraud Detection (8%), Retail (6%), Insurance (6%), Pharmaceuticals (5%), Investment/Stocks (4%)]
• The World Wide Web: Information retrieval, resource location [62, 210].
8. Integration. Data mining tools are often only a part of the entire decision
making system. It is desirable that they integrate smoothly, both with
the database and the final decision-making procedure.
In the next section we discuss the issues related to the large size of the data
sets in more detail.
of the original massive data set [18]. The reduced representation should be
as faithful to the original data as possible, for its effective use in different
mining tasks. At present the following categories of reduced representations
are mainly used:
• Sampling/instance selection: Various random, deterministic and den-
sity biased sampling strategies exist in statistics literature. Their use
in machine learning and data mining tasks has also been widely stud-
ied [37, 114, 142]. Note that merely generating a random sample from
a large database stored on disk may itself be a non-trivial task from
a computational viewpoint. Several aspects of instance selection, e.g.,
instance representation, selection of interior/boundary points, and in-
stance pruning strategies, have also been investigated in instance-based
and nearest neighbor classification frameworks [279]. Challenges in de-
signing an instance selection algorithm include accurate representation
of the original data distribution, making fine distinctions at different
scales and noticing rare events and anomalies.
• Data squashing: It is a form of lossy compression where a large data
set is replaced by a small data set and some accompanying quantities,
while attempting to preserve its statistical information [60].
• Indexing data structures: Systems such as kd-trees [22], R-trees, hash
tables, AD-trees, multiresolution kd-trees [54] and cluster feature (CF)-
trees [29] partition the data (or feature space) into buckets recursively,
and store enough information regarding the data in the bucket so that
many mining queries and learning tasks can be achieved in constant or
linear time.
• Frequent itemsets: They are often applied in supermarket data analysis
and require that the attributes are sparsely valued [3].
• DataCubes: Use a relational aggregation database operator to represent
chunks of data [82].
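The sampling item notes that drawing a random sample from a large disk-resident database is itself non-trivial. One standard technique for this, not named in the text and shown here only as an illustration, is reservoir sampling, which draws a uniform sample of k records in a single sequential pass without knowing the number of records in advance.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Uniform random sample of k items from an iterable read once."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)      # fill the reservoir first
        else:
            j = rng.randint(0, i)       # inclusive on both ends
            if j < k:
                reservoir[j] = item     # replace with probability k / (i + 1)
    return reservoir
```

Only k records are held in memory at any time, so the cost is one scan of the data.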
The last four techniques fall into the general class of representation called
cached sufficient statistics [177]. These are summary data structures that lie
between the statistical algorithms and the database, intercepting the kinds of
operations that have the potential to consume large time if they were answered
by direct reading of the data set. Case-based reasoning [122] also involves a
related approach where salient instances (or descriptions) are either selected
or constructed and stored in the case base for later use.
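A small sketch may help fix the idea of cached sufficient statistics. It assumes one-dimensional data and keeps only the count, the sum and the sum of squares of each bucket; the class name `Summary` and its methods are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    """Count, sum and sum of squares of the values seen in one bucket."""
    n: int = 0
    total: float = 0.0
    total_sq: float = 0.0

    def add(self, x: float) -> None:
        self.n += 1
        self.total += x
        self.total_sq += x * x

    def merge(self, other: "Summary") -> "Summary":
        """Combine two buckets without revisiting their data."""
        return Summary(self.n + other.n,
                       self.total + other.total,
                       self.total_sq + other.total_sq)

    def mean(self) -> float:
        return self.total / self.n

    def variance(self) -> float:
        """Population variance recovered from the cached quantities."""
        return self.total_sq / self.n - self.mean() ** 2
```

Queries for the mean or variance of any union of buckets are then answered from the summaries alone, which is the role such structures play between the learning algorithm and the database.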
proved performance.
Genetic algorithms provide efficient search algorithms to select a model,
from mixed media data, based on some preference criterion/objective function.
They have been employed in regression and in discovering association rules.
Rough sets are suitable for handling different types of uncertainty in data and
have been mainly utilized for extracting knowledge in the form of rules.
Other hybridizations typically enjoy the generic and application-specific
merits of the individual soft computing tools that they integrate. Data mining
functions modeled by such systems include rule extraction, data summariza-
tion, clustering, incorporation of domain knowledge, and partitioning. Case-
based reasoning (CBR), a novel AI problem-solving paradigm, has recently
drawn the attention of both soft computing and data mining communities.
A profile of its theory, algorithms, and potential applications is available in
[262, 195, 208].
A review on the role of different soft computing tools in data mining prob-
lems is provided in Appendix A.
compression index and the effect of scale parameter are also presented.
While Chapters 2 and 3 deal with some preprocessing tasks of data mining,
Chapter 4 is concerned with its classification/learning aspect. Here we present
two active learning strategies for handling the large quadratic programming
(QP) problem of support vector machine (SVM) classifier design. The first
one is an error-driven incremental method for active support vector learning.
The method involves selecting a chunk of q new points, having an equal number of
correctly classified and misclassified points, at each iteration by resampling the
data set, and using it to update the current SV set. The resampling strategy
is computationally superior to random chunk selection, while achieving higher
classification accuracy. Since it allows for querying multiple instances at each
iteration, it is computationally more efficient than those that are querying for
a single example at a time.
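The chunk selection step can be sketched as follows. The function `select_chunk`, the `predict` callback standing for the current classifier, and the rule of topping up with correctly classified points when too few errors exist are assumptions for this illustration; the SVM retraining step itself is omitted.

```python
import random


def select_chunk(points, labels, predict, q, rng):
    """Return indices of q points: half misclassified by the current
    classifier and half correctly classified, drawn by resampling."""
    wrong = [i for i, (x, y) in enumerate(zip(points, labels)) if predict(x) != y]
    correct = [i for i, (x, y) in enumerate(zip(points, labels)) if predict(x) == y]
    half = q // 2
    chosen = rng.sample(wrong, min(half, len(wrong)))
    # Assumed fallback: fill the remainder with correctly classified points.
    chosen += rng.sample(correct, min(q - len(chosen), len(correct)))
    return chosen
```

The selected chunk would then be combined with the current SV set to train the next classifier.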
The second algorithm deals with active support vector learning in a statis-
tical query framework. Like the previous algorithm, it also involves queries
for multiple instances at each iteration. The intermediate statistical query
oracle, involved in the learning process, returns the value of the probability
that a new example belongs to the actual support vector set. A set of q new
points is selected according to the above probability and is used along with
the current SVs to obtain the new SVs. The probability is estimated using a
combination of two factors: the margin of the particular example with respect
to the current hyperplane, and the degree of confidence that the current set
of SVs provides the actual SVs. The degree of confidence is quantified by a
measure which is based on the local properties of each of the current support
vectors and is computed using the nearest neighbor estimates.
The methodology in the second part has some more advantages. It not only
queries for the error points (or points having low margin) but also a number of
other points far from the separating hyperplane (interior points). Thus, even if
a current hypothesis is erroneous there is a scope for its being corrected owing
to the interior points. If only error points were selected the hypothesis might
have actually been worse. The ratio of selected points having low margin and
those far from the hyperplane is decided by the confidence factor, which varies
adaptively with iteration. If the current SV set is close to the optimal one, the
algorithm focuses only on the low margin points and ignores the redundant
points that lie far from the hyperplane. On the other hand, if the confidence
factor is low (say, in the initial learning phase) it explores a higher number
of interior points. Thus, the trade-off between efficiency and robustness of
performance is adequately handled in this framework. Also, the efficiency of
most of the existing active SV learning algorithms depends on the sparsity
ratio (i.e., the ratio of the number of support vectors to the total number
of data points) of the data set. Due to the adaptive nature of the query in
the proposed algorithm, it is likely to be efficient for a wide range of sparsity
ratios.
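The adaptive query described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a candidate whose margin is at most 1 counts as a low margin point, that such points are drawn with weight equal to the confidence factor, and that interior points are drawn with weight one minus that factor. The function name select_queries and its parameters are hypothetical.

```python
import random


def select_queries(margins, confidence, q, rng=None):
    """Pick up to q candidate indices, mixing low margin and interior points.

    margins    -- margin of each candidate with respect to the current hyperplane
    confidence -- value in [0, 1]; high means the current SV set is trusted
    q          -- number of points to query in this iteration
    """
    rng = rng or random.Random(0)
    # Assumed weighting: low margin points get weight `confidence`,
    # interior points get weight `1 - confidence`.
    weights = [confidence if m <= 1.0 else 1.0 - confidence for m in margins]
    candidates = [i for i, w in enumerate(weights) if w > 0.0]
    chosen = []
    # Weighted sampling without replacement, one draw at a time.
    while candidates and len(chosen) < q:
        pick = rng.choices(
            candidates, weights=[weights[i] for i in candidates], k=1
        )[0]
        chosen.append(pick)
        candidates.remove(pick)
    return chosen
```

With a confidence of 1 the sketch queries only low margin points, and with a confidence of 0 it queries only interior points; intermediate values mix the two groups in proportion.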
Experimental results have been presented for five real life classification prob-
lems. The number of patterns ranges from 351 to 495141, dimension from
9 to 34, and the sparsity ratio from 0.01 to 0.51. The algorithms, particularly
the second one, are found to provide superior performance in terms of classi-
fication accuracy, closeness to the optimal SV set, training time and margin
distribution, as compared to several related algorithms for incremental and
active SV learning. Studies on effectiveness of the confidence factor, used in
statistical queries, are also presented.
In the previous three chapters all the methodologies described for data
condensation, feature selection and active learning are based on the classical
approach. The next three chapters (Chapters 5 to 7) emphasize demonstrating
the effectiveness of integrating different soft computing tools, e.g., fuzzy logic,
artificial neural networks, rough sets and genetic algorithms for performing
certain tasks in data mining.
In Chapter 5 methods based on the principle of granular computing in
a rough-fuzzy framework are described for efficient case (representative class
prototypes) generation of large data sets. Here, fuzzy set theory is used for
linguistic representation of patterns, thereby producing a fuzzy granulation of
the feature space. Rough set theory is used to obtain the dependency rules
which model different informative regions in the granulated feature space.
The fuzzy membership functions corresponding to the informative regions are
stored as cases along with the strength values. Case retrieval is made using a
similarity measure based on these membership functions. Unlike the existing
case selection methods, the cases here are cluster granules, and not the sam-
ple points. Also, each case involves a reduced number of relevant (variable)
features. Because of this twofold information compression the algorithm has
a low time requirement in generation as well as retrieval of cases. Superior-
ity of the algorithm in terms of classification accuracy, and case generation
and retrieval time is demonstrated experimentally on data sets having large
dimension and size.
In Chapter 6 we first describe, in brief, some clustering algorithms suit-
able for large data sets. Then an integration of a minimal spanning tree
(MST) based graph-theoretic technique and the expectation maximization (EM)
algorithm with rough set initialization is described for non-convex clustering.
Here, rough set initialization is performed using dependency rules generated
on a fuzzy granulated feature space. EM provides the statistical model of the
data and handles the associated uncertainties. Rough set theory helps in faster
convergence and avoidance of the local minima problem, thereby enhancing
the performance of EM. MST helps in determining non-convex clusters. Since
it is applied on Gaussians rather than the original data points, the time re-
quirement is very low. Comparison with related methods is made in terms
of a cluster quality measure and computation time. Its effectiveness is also
demonstrated for segmentation of multispectral satellite images into different
landcover types.
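The reason the MST step is cheap can be seen in a small sketch: the tree is built over a handful of component centers rather than over every data point, and components joined by short tree edges are merged into one (possibly non-convex) cluster. The sketch below is only illustrative. It assumes Euclidean distance between component means and a fixed cut length, neither of which is specified in this summary, and the names mst_edges and merge_components are hypothetical.

```python
import math


def mst_edges(centers):
    """Prim's algorithm on the complete Euclidean graph over `centers`."""
    n = len(centers)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Shortest edge leaving the current tree.
        best = min(
            (math.dist(centers[i], centers[j]), i, j)
            for i in in_tree
            for j in range(n)
            if j not in in_tree
        )
        edges.append(best)
        in_tree.add(best[2])
    return edges


def merge_components(centers, cut_length):
    """Group component indices linked by MST edges shorter than cut_length."""
    parent = list(range(len(centers)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for length, i, j in mst_edges(centers):
        if length < cut_length:
            parent[find(i)] = find(j)
    groups = {}
    for i in range(len(centers)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

A chain of nearby centers that traces an arc is merged into a single cluster, while a distant pair of centers stays separate.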
A rough self-organizing map (RSOM) with fuzzy discretization of feature
space is described in Chapter 7. Discernibility reducts obtained using rough
set theory are used to extract domain knowledge in an unsupervised frame-
work. Reducts are then used to determine the initial weights of the network,
which are further refined using competitive learning. Superiority of this net-
work in terms of quality of clusters, learning time and representation of data is
demonstrated quantitatively through experiments over the conventional SOM
with both random and linear initializations. A linguistic rule generation
algorithm is also described, and the extracted rules are found to be superior
in terms of coverage, reachability and fidelity. This methodology is unique in
demonstrating how rough sets could be integrated with SOM, and it provides
a fast and robust solution to the initialization problem of SOM learning.
While granular computing is performed in rough-fuzzy and neuro-rough
frameworks in Chapters 5 and 6 and Chapter 7, respectively, the same is done
in Chapter 8 in an evolutionary rough-neuro-fuzzy framework by a synergis-
tic integration of all the four soft computing components. After explaining
different ensemble learning techniques, a modular rough-fuzzy multilayer per-
ceptron (MLP) is described in detail. Here fuzzy sets, rough sets, neural
networks and genetic algorithms are combined with a modular decomposition
strategy. The resulting connectionist system achieves gain in terms of perfor-
mance, learning time and network compactness for classification and linguistic
rule generation.
Here, the role of the individual components is as follows. Fuzzy sets han-
dle uncertainties in the input data and output decision of the neural network
and provide linguistic representation (fuzzy granulation) of the feature space.
Multilayer perceptron is well known for providing a connectionist paradigm for
learning and adaptation. Rough set theory is used to extract domain knowl-
edge in the form of linguistic rules, which are then encoded into a number of
fuzzy MLP modules or subnetworks. Genetic algorithms (GAs) are used to
integrate and evolve the population of subnetworks as well as the fuzzification
parameters through efficient searching. The concept of a variable mutation
operator is introduced for preserving the localized structure of the constituent
knowledge-based subnetworks while they are integrated and evolved.
The nature of the mutation operator is determined by the domain knowledge
extracted by rough sets.
The modular concept, based on a “divide and conquer” strategy, provides
accelerated training, preserves the identity of individual clusters, reduces the
catastrophic interference due to overlapping regions, and generates a com-
pact network suitable for extracting a minimum number of rules with high
certainty values. A quantitative study of the knowledge discovery aspect is
made through different rule evaluation indices, such as interestingness, cer-
tainty, confusion, coverage, accuracy and fidelity. Different well-established
algorithms for generating classification and association rules are described in
this regard for convenience. These include Apriori, subset, MofN and dynamic
itemset counting methods.
The effectiveness of the modular rough-fuzzy MLP and its rule extraction
algorithm is extensively demonstrated through experiments along with com-
parisons. In some cases the rules generated are also validated by domain
2.1 Introduction
The current popularity of data mining and data warehousing, as well as the
decline in the cost of disk storage, has led to a proliferation of terabyte data
warehouses [66]. Mining a database of even a few gigabytes is an arduous
task for machine learning techniques and requires advanced parallel hardware
and algorithms. An approach for dealing with the intractable problem of
learning from huge databases is to select a small subset of data for learning
[230]. Databases often contain redundant data. It would be convenient if
large databases could be replaced by a small subset of representative patterns
so that the accuracy of estimates (e.g., of probability density, dependencies,
class boundaries) obtained from such a reduced set would be comparable to
that obtained using the entire data set.
The simplest approach for data reduction is to draw the desired number
of random samples from the entire data set. Various statistical sampling
methods such as random sampling, stratified sampling, and peepholing [37]
have been in existence. However, naive sampling methods are not suitable
for real world problems with noisy data, since the performance of the algo-
rithms may change unpredictably and significantly [37]. Better performance
is obtained using uncertainty sampling [136] and active learning [241], where a
simple classifier queries for informative examples. The random sampling ap-
proach effectively ignores all the information present in the samples not chosen
for membership in the reduced subset. An advanced condensation algorithm
should include information from all samples in the reduction process.
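Of the sampling schemes named above, stratified sampling is easy to make concrete. The following is a minimal sketch under the assumption that each pattern carries a class label and that the same fraction is drawn from every class; the function name stratified_sample is hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(labels, fraction, rng=None):
    """Return sorted indices of a sample drawn class by class."""
    rng = rng or random.Random(0)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    chosen = []
    for members in by_class.values():
        # Keep at least one pattern per class so rare classes are not lost,
        # and never ask for more patterns than the class contains.
        k = min(len(members), max(1, round(fraction * len(members))))
        chosen.extend(rng.sample(members, k))
    return sorted(chosen)
```

Unlike plain random sampling, the class proportions of the reduced set match those of the full set, although the patterns not chosen are still ignored.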
Some widely studied schemes for data condensation are built upon classi-
fication-based approaches, in general, and the k-NN rule, in particular [48].
The effectiveness of the condensed set is measured in terms of the classification
accuracy. These methods attempt to derive a minimal consistent set, i.e.,
a minimal set which correctly classifies all the original samples. The very
first development of this kind is the condensed nearest neighbor rule (CNN)
of Hart [91]. Other algorithms in this category including the popular IB3,
IB4 [4], reduced nearest neighbor and iterative condensation algorithms are
summarized in [279]. Recently a local asymmetrically weighted similarity
metric (LASM) approach for data compression [239] is shown to have superior
© 2004 by Taylor & Francis Group, LLC
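The condensed nearest neighbor rule named above can be sketched as follows. This is a minimal illustration of the commonly described procedure (start with one stored sample, add every sample that the current store misclassifies under the 1-NN rule, and repeat until a full pass adds nothing), assuming numeric feature vectors and squared Euclidean distance. It yields a consistent subset, which is not guaranteed to be minimal, and the function names are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_label(point, store):
    """Label of the stored sample closest to `point` (1-NN rule)."""
    return min(store, key=lambda item: squared_distance(point, item[0]))[1]


def condensed_nn(points, labels):
    """Return a list of (point, label) pairs that classifies all samples correctly."""
    store = [(points[0], labels[0])]
    in_store = {0}
    changed = True
    while changed:
        changed = False
        for i, (p, y) in enumerate(zip(points, labels)):
            if i in in_store:
                continue
            # A sample the store gets wrong must be kept.
            if nearest_label(p, store) != y:
                store.append((p, y))
                in_store.add(i)
                changed = True
    return store
```

On two well separated groups the store typically ends up with one sample per group, which is the kind of compression these classification-based schemes aim for.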
These letters were all sealed with the seal of the Sultan.
Hámma showed me also another letter which he had received
from the Sultan, and which I think interesting enough to be here
inserted, as it is a faithful image of the turbulent state of the country
at that time, and as it contains the simple expression of the sincere
and just proceedings of the new Sultan. Its purport was as follows,
though the language in which it is written is so incorrect that several
passages admit of different interpretations:—
“In the name of God, etc.
“From the Commander, the faithful Minister of Justice, the
Sultan ʿAbd el Káder, son of the Sultan Mohammed el Bákeri,
to the chiefs of all the tribe of Eʾ Núr, and Hámed, and Sëis,
and all those among you who have large possessions, perfect
peace to you.
“Your eloquence, compliments, and information are
deserving of praise. We have seen the auxiliaries sent to us
by your tribe, and we have taken energetic measures with
them against the marauders, who obstruct the way of the
caravans of devout people, and the intercourse of those who
travel, as well as those who remain at home. On this account
we desire to receive aid from you against their incursions.
The people of the Kél-fadaye, they are the marauders. We
should not have prohibited their chiefs to exercise rule over
them, except for three things: first, because I am afraid they
will betake themselves from the Aníkel [the community of the
people of Aír] to the Awelímmiden; secondly, in order that
they may not make an alliance with them against us, for they
are all marauders; and thirdly, in order that you may approve
of their paying us the tribute. Come, then, to us quickly. You
know that what the hand holds it holds only with the aid of
the fingers; for without the fingers the hand can seize
nothing.
“We therefore will expect your determination, that is to say
your coming, after the departure of the salt-caravan of the
Itísan, fixed among you for the fifteenth of the month. God!
God is merciful and answereth prayer! Come therefore to us,
and we will tuck up our sleeves, and drive away the
marauders, and fight valiantly against them as God (be He
glorified!) hath commanded.
“Lo, corruption hath multiplied on the face of the earth!
May the Lord not question us on account of the poor and
needy, orphans and widows, according to His word: ‘You are
all herdsmen, and ye shall all be questioned respecting your
herds, whether ye have indeed taken good care of them or
dried them up.’
“Delay not, therefore, but hasten to our residence, where
we are all assembled; for ‘zeal in the cause of religion is the
duty of all;’ or send thy messenger to us quickly with a
positive answer; send thy messenger as soon as possible.
Farewell!”
The whole population was in alarm, and everybody who was able
to bear arms prepared for the expedition. About sunset the “égehen”
left the town, numbering about four hundred men, partly on camels,
partly on horseback, besides the people on foot. Bóro as well as
Áshu accompanied the Sultan, who this time was himself mounted
on a camel. They went to take their encampment near that of
Astáfidet, in Tagúrast, ʿAbd el Káder pitching a tent of grey colour,
and in size like that of a Turkish aghá, in the midst of the Kél-gerés,
the Kél-ferwán, and the Emgedesíye; while Astáfidet, who had no
tent, was surrounded by the Kél-owí. The Sultan was kind and
attentive enough not to forget me even now; and having heard that
I had not yet departed, Hámma not having finished his business in
the town, he sent me some wheat, a large botta with butter and
vegetables (chiefly melons and cucumbers), and the promise of
another sheep.
In the evening the drummer again went his rounds through the
town, proclaiming the strict order of the Sultan that everybody
should lay in a large supply of provisions. Although the town in
general had become very silent when deserted by so many people,
our house was kept in constant bustle, and in the course of the night
three mehára came from the camp, with people who could get no
supper there, and sought it with us. Bóro sent a messenger to me
early the next morning, urgently begging for a little powder, as the
“Mehárebín” of the Imghád had sent off their camels and other
property, and were determined to resist the army of the Sultan.
However, I could send him but very little. My amusing friend
Mohammed spent the whole day with us, when he went to join the
ghazzia. I afterwards learnt that he obtained four head of cattle as
his share. There must be considerable herds of cattle in the more
favoured valleys of Asben; for the expedition had nothing else to live
upon, as Mohammed afterwards informed me, and slaughtered an
immense quantity of them. Altogether, the expedition was
successful, and the Fádë-ang and many tribes of the Imghád lost
almost all their property. Even the influential Háj Beshír was
punished, on account of his son having taken part in the expedition
against us. I received also the satisfactory information that ʿAbd el
Káder had taken nine camels from the man who retained my méheri;
but I gained nothing thereby, neither my own camel being returned
nor another given me in its stead. The case was the same with all
our things; but nevertheless the proceeding had a good effect,
seeing that people were punished expressly for having robbed
Christians, and thus the principle was established that it was not less
illegal to rob Christians than it was to rob Mohammedans, both
creeds being placed, as far as regards the obligations of peace and
honesty, on equally favourable terms.
Tuesday, October 22.—I spent the whole of Tuesday in my house,
principally in taking down information which I received from the
intelligent Ghadámsi merchant Mohammed, who, having left his
native town from fear of the Turks, had resided six years in Ágades,
and was a well-informed man.
Wednesday, October 23.—My old friend the blacksmith Hámmeda,
and the tall Elíyas, went off this morning with several camels laden
with provisions, while Hámma still stayed behind to finish the
purchases; for on account of the expedition, and the insecure state
of the road to Damerghú, it had been difficult to procure provisions
in sufficient quantity. Our house therefore became almost as silent
and desolate as the rest of the town; but I found a great advantage
in remaining a few days longer, for my chivalrous friend and
protector, who, as long as the Sultan and the great men were
present, had been very reserved and cautious, had now no further
scruple about taking me everywhere, and showing me the town
“within and without.”
We first visited the house of Ídder, a broker, who lived at a short distance
to the south from our house, and had also lodged Háj ʿAbdúwa during his stay
here. It was a large, spacious dwelling, well arranged with a view to comfort
and privacy, according to the conception and customs of the inhabitants, while
our house (being a mere temporary residence for Ánnur’s people occasionally
visiting the town) was a dirty, comfortless abode. We
entered first a vestibule, about twenty-five feet long and nine broad,
having on each side a separate space marked off by that low kind of
balustrade mentioned in my description of the Sultan’s house. This
vestibule or ante-room was followed by a second room of larger size
and irregular arrangement; opposite the entrance it opened into
another apartment, which, with two doors, led into a spacious inner
courtyard, which was very irregularly circumscribed by several rooms
projecting into it, while to the left it was occupied by an enormous
bedstead (1). These bedsteads are a most characteristic article of
furniture in all the dwellings of the Sónghay. In Ágades they are
generally very solidly built of thick boards, and furnished with a
strong canopy resting upon four posts, covered with mats on the top
and on three sides, the remaining side being shut in with boards.
Such a canopied bed looks like a little house by itself. On the wall of
the first chamber, which on the right projected into the courtyard,
several lines of large pots had been arranged, one above the other
(2), forming so many warm nests for a number of turtle-doves which
were playing all about the courtyard; while on the left, in the half-
decayed walls of two other rooms (3), about a dozen goats were
fastened each to a separate pole. The background of the courtyard
contained several rooms, and in front of it a large shade (4) had
been built of mats, forming a rather pleasant and cool resting-place.
Numbers of children were gambolling about, who gave to the whole
a very cheerful appearance. There is something very peculiar in
these houses, which are constructed evidently with a view to
comfort and quiet enjoyment.
We then went to visit a female friend of Hámma, who lived in the
south quarter of the town, in a house which likewise bespoke much
comfort; but here, on account of the number of inmates, the
arrangement was different, the second vestibule being furnished on
each side with a large bedstead instead of mats, though here also
there was in the courtyard an immense bedstead. The courtyard was
comparatively small, and a long corridor on the left of it led to an
inner courtyard or “tsakangída,” which I was not allowed to see. The
mistress of the house was still a very comely person, although she
had borne several children. She had a fine figure, though rather
under the middle size, and a fair complexion. I may here remark that
many of the women of Ágades are not a shade darker than Arab
women in general. She wore a great quantity of silver ornaments,
and was well dressed in a gown of coloured cotton and silk. Hámma
was very intimate with her, and introduced me to her as his friend
and protégé, whom she ought to value as highly as himself. She was
married, but her husband was residing in Kátsena, and she did not
seem to await his return in the Penelopean style. The house had as
many as twenty inmates, there being no less than six children, I
think, under five years of age, and among them a very handsome
little girl, the mother’s favourite; besides, there were six or seven
full-grown slaves. The children were all naked, but wore ornaments
of beads and silver.
After we had taken leave of this Emgedesíye lady, we followed the
street towards the south, where there were some very good houses,
although the quarter in general was in ruins; and here I saw the
very best and most comfortable-looking dwelling in the town. All the
pinnacles were ornamented with ostrich eggs. One will often find in
an eastern town, after the first impression of its desolate appearance
is gone by, many proofs that the period of its utter prostration is not
yet come, but that even in the midst of the ruins there is still a good
deal of ease and comfort. Among the ruins of the southern quarter
are to be seen the pinnacled walls of a building of immense
circumference and considerable elevation; but unfortunately I could
not learn from Hámma for what purpose it had been used; however,
it was certainly a public building, and probably a large khán rather
than the residence of the chief. With its high, towering walls, it still
forms a sort of outwork on the south side of the town, where in
general the wall is entirely destroyed, and the way is everywhere
open. Hámma had a great prejudice against this desolate quarter.
Even the more intelligent Mohammedans are often afraid to enter
former dwelling-places of men, believing them to be haunted by
spirits; but he took me to some inhabited houses, which were all
built on the same principle as that described, but varying greatly in
depth and in the size of the courtyard; the staircases (abi-n-háwa)
leading to the upper story are in the courtyard, and are rather
irregularly built of stones and clay. In some of them young ostriches
were running about. The inhabitants of all the houses seemed to
have the same cheerful disposition, and I was glad to find scarcely a
single instance of misery. I give here the ground-plan of another
house.
The artisans who work in leather (an occupation left entirely to females) seem
to live in a quarter by themselves, which originally was quite separated from
the rest of the town by a sort of gate; but I did not make a sufficient survey
of this quarter to mark it distinctly on the ground-plan of the town. We also
visited some of the mat-makers.
Our maimólo of the other day, who had discovered that we had slaughtered our
sheep,
paid us a visit in the evening, and for a piece of meat entertained
me with a clever performance on his instrument, accompanied with
a song. Hámma spent his evening with our friend the Emgedesíye
lady, and was kind enough to beg me to accompany him. This I
declined, but gave him a small present to take to her.
I had a fair sample of the state of morals in Ágades the following
day, when five or six girls and women came to pay me a visit in our
house, and with much simplicity invited me to make merry with
them, there being now, as they said, no longer reason for reserve,
“as the Sultan was gone.” It was indeed rather amusing to see what
conclusions they drew from the motto “Serki yátafi.” Two of them
were tolerably pretty and well-formed, with fine black hair hanging
down in plaits or tresses, lively eyes, and very fair complexion. Their
dress was decent, and that of one of them even elegant, consisting
of an under-gown reaching from the neck to the ankles, and an
upper one drawn over the head, both of white colour; but their
demeanour was very free, and I too clearly understood the caution
requisite in a European who would pass through these countries
unharmed and respected by the natives, to allow myself to be
tempted by these wantons. It would be better for a traveller in these
regions, both for his own comfort and for the respect felt for him by
the natives, if he could take his wife with him; for these simple
people do not understand how a man can live without a partner. The
Western Tuarek, who in general are very rigorous in their manners,
and quite unlike the Kél-owí, had nothing to object against me
except my being a bachelor. But as it is difficult to find a female
companion for such journeys, and as by marrying a native he would
expose himself to much trouble and inconvenience on the score of
religion, he will do best to maintain the greatest austerity of
manners with regard to the other sex, though he may thereby
expose himself to a good deal of derision from some of the lighter-
hearted natives. The ladies, however, became so troublesome that I
thought it best to remain at home for a few days, and was thus
enabled at the same time to note down the information which I had
been able to pick up. During these occupations I was greatly pleased
with the companionship of a diminutive species of finches which
frequent all the rooms in Ágades, and, as I may add from later
experience, in Timbúktu also; the male, with its red neck, in
particular looks extremely pretty. The poults were just about to
fledge.
Sunday, October 27.—There was one very characteristic building in
the town, which, though a most conspicuous object from the terrace
of our house, I had never yet investigated with sufficient accuracy.
This was the mesállaje, or high tower rising over the roof of the
mosque. The reason why this building in particular (the most famous
and remarkable one in the town) had been hitherto observed by me
only from a distance, and in passing by, must be obvious. Difference
of religious creed repelled me from it; and so long as the town was
full of strangers, some of them very fanatical, it was dangerous for
me to approach it too closely. I had often inquired whether it would
not be possible to ascend the tower without entering the mosque;
but I had always received for answer that the entrance was locked
up. As soon, however, as the Sultan was gone, and when the town
became rather quiet, I urged Hámma to do his best that I might
ascend to the top of this curious building, which I represented to
him as a matter of the utmost importance to me, since it would
enable me not only to control my route by taking a few angles of the
principal elevations round the valley Aúderas, but also to obtain a
distant view over the country towards the west and south, which it
was not my good luck to visit myself. To-day Hámma promised me
that he would try what could be done.
Having once more visited the lively house of Ídder, we took our
way over the market-places, which were now rather dull. The
vultures looked out with visible greediness and eagerness from the
pinnacles of the ruined walls around for their wonted food—their
share of offal during these days, when so many people were absent,
being of course much reduced, though some of them probably had
followed their fellow-citizens on the expedition. So few people being
in the streets, the town had a more ruined look than ever, and the
large heap of rubbish accumulated on the south side of the butchers’
market seemed to me more disgusting than before. We kept along
the principal street between Dígi and Arrafíya, passing the deep well
Shedwánka on our right, and on the other side a school, which
resounded with the shrill voices of about fifty little boys repeating
with energy and enthusiasm the verses of the Kurán, which their
master had written for them upon their little wooden tablets. Having
reached the open space in front of the mosque, and there being
nobody to disturb me, I could view at my leisure this simple but
curious building, which in the subsequent course of my journey
became still more interesting to me, as I saw plainly that it was built
on exactly the same principle as the tower which rises over the
sepulchre of the famed conqueror Háj Mohammed Áskiá (the
“Ischia” of Leo).
The mesállaje starts up from the platform or terrace formed by the
roof of the mosque, which is extremely low, resting apparently, as
we shall see, in its interior, upon four massive pillars. It is square,
and measures at its base about thirty feet, having a small lean-to, on
its east side, on the terrace of the mosque, where most probably
there was formerly the entrance. From this the tower rises
(decreasing in width, and with a sort of swelling or entasis in the
middle of its elevation, something like the beautiful model adopted
by nature in the deléb palm, and imitated by architects in the
columns of the Ionic and Corinthian orders) to a height of from
ninety to ninety-five feet. It measures at its summit not more than
about eight feet in width. The interior is lighted by seven openings
on each side. Like most of the houses in Ágades, it is built entirely of
clay; and in order to strengthen a building so lofty and of so soft a
material, its four walls are united by thirteen layers of boards of the
dúm-tree, crossing the whole tower in its entire breadth and width,
and coming out on each side from three to four feet, while at the
same time they afford the only means of getting to the top. Its
purpose is to serve as a watch-tower, or at least was so at a former
time, when the town, surrounded by a strong wall and supplied with
water, was well capable of making resistance, if warned in due time
of an approaching danger. But at present it seems rather to be kept
in repair only as a decoration of the town.
The mesállaje in its present state was only six years old at the
time of my visit (in 1850), and perhaps was not even quite finished
in the interior, as I was told that the layers of boards were originally
intended to support a staircase of clay. About fifty paces from the
south-western corner of the mosque, the ruins of an older tower are
seen still rising to a considerable height, though leaning much to one
side, more so than the celebrated Tower of Pisa, and most probably
in a few years it will give way to an attack of storm and rain. This
more ancient tower seems to have stood quite detached from the
mosque.
Having sufficiently surveyed the exterior of the tower, and made a
sketch of it, I accompanied my impatient companion into the interior
of the mosque, into which he felt no scruple in conducting me. The
lowness of the structure had already surprised me from without; but
I was still more astonished when I entered the interior, and saw that
it consisted of low, narrow naves, divided by pillars of immense
thickness, the reason of which it is not possible at present to
understand, as they have nothing to support but a roof of dúm-tree
boards, mats, and a layer of clay; but I think it scarcely doubtful that
originally these naves were but the vaults or cellars of a grand
superstructure, designed but not executed; and this conjecture
seems to be confirmed by all that at present remains of the mosque.
The gloomy halls were buried in a mournful silence, interrupted only
by the voice of a solitary man, seated on a dirty mat at the western
wall of the tower, and reading diligently the torn leaves of a
manuscript. Seeing that it was the kádhi, we went up to him and
saluted him most respectfully; but it was not in the most cheerful
and amiable way that he received our compliments—mine in
particular—continuing to read, and scarcely raising his eyes from the
sheets before him. Hámma then asked for permission to ascend the
tower, but received a plain and unmistakable refusal, the thing being
impossible, there being no entrance to the tower at present. It was
shut up, he said, on account of the Kél-gerés, who used to ascend
the tower in great numbers. Displeased with his uncourteous
behaviour, and seeing that he was determined not to permit me to
climb the tower, were it ever so feasible, we withdrew and called
upon the imám, who lives in a house attached to these vaults, and
which looked a little neater from having been whitewashed;
however, he had no power to aid us in our purpose, but rather
confirmed the statement of the kádhi. This is the principal mosque
of the town, and seems to have been always so, although there are
said to have been formerly as many as seventy mosques, of which
ten are still in use. They deserve no mention, however, with the
exception of three, the Msíd Míli, Msíd Éheni, and Msíd el Mékki. I
will only add here that the Emgedesíye, so far as their very slender
stock of theological learning and doctrine entitles them to rank with
any sect, are Malekíye, as well as the Kél-owí.
Resigning myself to the disappointment of not being able to
ascend the tower, I persuaded my friend to take a longer walk with
me round the northern quarter of the town. But I forgot to mention
that besides Hámma, I had another companion of a very different
character. This was Zúmmuzuk, a reprobate of the worst description,
and whose features bore distinct impress of the vile and brutal
passions which actuated him; yet being a clever fellow, and (as the
illegitimate son, or “dan néma,” of an Emgédesi woman) fully master
of the peculiar idiom of Ágades, he was tolerated not only by the old
chief Ánnur, who employed him as interpreter, but even by me. How
insolent the knave could be I shall soon have occasion to mention.
With this fellow, therefore, and with Hámma, I continued my walk,
passing the kófa-n-alkáli, and then, from the ruins of the quarter
Ben-Gottára, turning to the north. Here the wall of the town is in a
tolerable state of preservation, but very weak and insufficient,
though it is kept in repair, even to the pinnacles, on account of its
surrounding the palace of the Sultan. Not far from this is an open
space called Azarmádarangh, “the place of execution,” where
occasionally the head of a rebellious chieftain or a murderer is cut
off by the “dóka;” but as far as I could learn, such things happen
very seldom. Even on the north side, two gates are in a tolerable
state of preservation.
Having entered the town from this side, we went to visit the
quarter of the leather-workers, which, as I stated before, seems to
have formed originally a regular ward; all this handicraft, with the
exception of saddle-work, is carried on by women, who work with
great neatness. Very beautiful provision-bags are made here,
although those which I brought back from Timbúktu are much
handsomer. We saw also some fine specimens of mats, woven of a
very soft kind of grass, and dyed of various colours. Unfortunately, I
had but little with me wherewith to buy; and even if I had been able
to make purchases, the destination of our journey being so distant,
there was not much hope of carrying the things safely to Europe.
The blacksmiths’ work of Ágades is also interesting, although showy
and barbarous, and not unlike the work with which the Spaniards
used to adorn their long daggers.
Monday, October 28.—During all this time I prosecuted inquiries
with regard to several subjects connected with the geography and
ethnography of this quarter of the world. I received several visits
from Emgédesi tradesmen, many of whom are established in the
northern provinces of Háusa, chiefly in Kátsena and Tasáwa, where
living is infinitely cheaper than in Ágades. All these I found to be
intelligent men, having been brought up in the centre of intercourse
between a variety of tribes and nations of the most different
organization, and, through the web of routes which join here,
receiving information of distant regions. Several of them had even
made the pilgrimage, and thus come in contact with the relatively
high state of civilization in Egypt and near the coast; and I shall not
easily forget the enlightened view which the mʿallem Háj
Mohammed ʿOmár, who visited me several times, took of Islamism
and Christianity. The last day of my stay in Ágades, he reverted to
the subject of religion, and asked me, in a manner fully expressive of
his astonishment, how it came to pass that the Christians and
Moslemín were so fiercely opposed to one another, although their
creeds, in essential principles, approximated so closely. To this I
replied by saying that I thought the reason was that the great
majority both of Christians and Moslemín paid less regard to the
dogmas of their creeds than to external matters, which have very
little or no reference to religion itself. I also tried to explain to him
that in the time of Mohammed Christianity had entirely lost that
purity which was its original character, and that it had been mixed up
with many idolatrous elements, from which it was not entirely
disengaged till a few centuries ago, while the Mohammedans had
scarcely any acquaintance with Christians except those of the old
sects of the Jacobites and Nestorians. Mutually pleased with our
conversation, we parted from each other with regret.
In the afternoon I was agreeably surprised by the arrival of the
Tinýlkum Ibrahim, for the purpose of supplying his brother’s house
with what was wanted; and being determined to make only one
day’s stay in the town, he had learned with pleasure that we were
about to return by way of Áfasás, the village whither he himself was
going. I myself had cherished this hope, as all the people had
represented that place as one of the largest in the country, and as
pleasantly situated. Hámma had promised to take me this way on
our return to Tin-téllust; but having stayed so much longer in the
town than he had intended, and being afraid of arriving too late for
the salt-caravan of the Kél-owí on their way to Bilma, which he was
to supply with provisions, he changed his plan, and determined to
return by the shortest road. Meanwhile he informed me that the old
chief would certainly not go with us to Zínder till the salt-caravan
had returned from Bilma.
Fortunately, in the course of the 29th a small caravan with corn
arrived from Damerghú, and Hámma completed his purchases. He
had, however, first to settle a disagreeable affair; for our friend
Zúmmuzuk had bought, in Hámma’s name, several things for which
payment was now demanded. Hámma flew into a terrible rage, and
nearly finished the rogue. My Arab and Tawáti friends, who heard
that we were to start the following day, though they were rather
busy buying corn, came to take leave of me, and I was glad to part
from all of them in friendship. But before bidding farewell to this
interesting place, I shall make a few general observations on its
history.
CHAPTER XVIII.
HISTORY OF ÁGADES
At present I still think that I was not far wrong in estimating the
number of the inhabited houses at from six hundred to seven
hundred, and the population at about seven thousand, though it
must be borne in mind that, as the inhabitants have still preserved
their trading character, a great many of the male inhabitants are
always absent from home, a circumstance which reduces the armed
force of the place to about six hundred. A numerical check on the
estimated population is offered by the two hundred and fifty to three
hundred well-bred boys, who at the time of my visit were learning a
little reading and writing, in five or six schools scattered over the
town; for it is not every boy who is sent to school, but only those
belonging to families in easy circumstances, and they are all about
the same age, from eight to ten years old.
With regard to the names of the quarters of the town, which are
interesting from an historical point of view, I was not able to learn
exactly the application of each of the names; and I am sure very few
even of the inhabitants themselves can now tell the limits of the
quarters, on account of the desolate state of many of them. The
principal names which can be laid down with certainty in the plan
are Masráta, Gobetáren, Gáwa-Ngírsu, Dígi or Dégi, Katánga,
Terjemán, and Arrafía, which comprise the south-western quarter of
the town. The names of the other quarters, which I attempted to lay
down on the plan sent to Government together with my report, I
now deem it prudent to withdraw, as I afterwards found that there
was some uncertainty about them. I therefore collect here, for the
information of future travellers, the names of the other quarters of
the place, besides those mentioned above and marked in the plan—
Lárelóg, Churúd, Hásena, Amaréwuël, Imurdán (which name, I was
assured afterwards, has nothing in common with the name of the
tribe of the Imghád), Tafimáta (the quarter where the tribe of the
same name lived), Yobímme (“yobu-mé” meaning the mouth of the
market), Dégi-n-béne, or the Upper Dégi, and Bosenrára. Kachíyu
(not Kachín) seems to have been originally the name of a pool, as I
was assured that, besides the three ponds still visible, there were
formerly seven others, namely Kudúru, Kachíyu, Chikinéwan,
Lángusú, Gázará, Kurungúsu, and Rabafáda, this latter in the square
of the palace.
The whole ground upon which the town is built (being the edge of
a tableland which coincides with the transition from granite to
sandstone) seems to be greatly impregnated with salt at a certain
depth, of which not only the ponds, but even the wells bear
evidence, two of the three wells still in use having saltish water, and
only that of Shedwánka being, as to taste, free from salt, though it
is still regarded as unwholesome, and all the water used for drinking
is brought from the wells outside the walls. Formerly, it is said, there
were nine wells inside the town.
From what I have said above, it may be concluded that the
commerce of Ágades is now inconsiderable. Its characteristic feature
is that no kind of money whatever is current in the market—neither
gold, nor silver, nor kurdí, nor shells; while strips of cotton, or
gábagá (the Kanúri, and not the Háusa term being employed in this
case, because the small quantity of this stuff which is current is
imported from the north-western province of Bórnu), are very rare,
and indeed form almost as merely nominal a standard as the
mithkál. Nevertheless the value of the mithkál is divided into ten
rijáls, or érjel, which measure means eight drʿa, or cubits, of
gábagá. The real standard of the market, I must repeat, is millet or
dukhn (“géro” in Háusa, “éneli” in Temáshight, Pennisetum
typhoïdeum); durra, or Holcus sorghum, is scarcely ever brought
to market. And it is very remarkable that with this article a man may
buy everything at a much cheaper rate than with merchandise,
which in general fetches a low price in the place; at least it did so
during my stay, when the market had been well stocked with
everything in demand, by the people who had come along with us.
English calico of very good quality was sold by me at 20 per cent.
less than it had been bought for at Múrzuk. Senna in former times
formed an article of export of some importance; but the price which
it fetches on the coast has so decreased that it scarcely pays the
carriage, the distance from the coast being so very great; and it
scarcely formed at all an article in request here, nor did we meet on
our whole journey a single camel laden with it, though it grows in
considerable quantities in the valleys hereabouts.
Ágades is in no respect a place of resort for wealthy merchants,
not even Arabs, while with regard to Europe its importance at
present consists in its lying on the most direct road to Sókoto and
that part of Sudán. In my opinion it would form for a European
agent a very good and comparatively healthy place from which to
open relations with Central Africa. The native merchants seem only
to visit the markets of Kátsena, Tasáwa, Marádi, Kanó, and Sókoto,
and, as far as I was able to learn, never go to the northern markets
of Ghát or Múrzuk, unless on a journey to Mekka, which several of
them have made. Neither does there seem to exist any intercourse
at present with Gágho, or Gógo, or with Timbúktu; but the Arabs of
Azawád and those parts, when undertaking a pilgrimage, generally
go by way of Ágades.
I must here add, that I did not observe that the people of Ágades
use manna in their food, nor that it is collected in the neighbourhood
of the town; but I did not inquire about it on the spot, not having
taken notice of the passage of Leo relating to it.
My stay in Ágades was too short to justify my entering into detail
about the private life of the people, but all that I saw convinced me
that, although open to most serious censure on the part of the
moralist, it presented many striking features of cheerfulness and
happiness, and nothing like the misery which is often met with in
towns which have declined from their former glory. It still contains
many active germs of national life, which are most gratifying to the
philosophic traveller. The situation, on an elevated plateau, cannot
but be healthy, as the few waterpools, of small dimensions, are
incapable of infecting the air. The disease which I have mentioned in
my diary as prevalent at the time of my sojourn was epidemic.
Besides, it must be borne in mind that the end of the rainy season
everywhere in the tropical regions is the most unhealthy period of
the year.