Topological Constraints and Robustness in Liquid State Machines
Abstract

The Liquid State Machine (LSM) is a method of computing with temporal neurons which, unlike standard artificial neural networks, can be used, amongst other things, for classifying intrinsically temporal data directly. It has also been put forward as a natural model of certain kinds of brain functions. There are two results in this paper: (1) We show that Liquid State Machines as normally defined cannot serve as a natural model for brain function, because they are very vulnerable to failures in parts of the model. This result is in contrast to work by Maass et al., which showed that these models are robust to noise in the input data. (2) We show that specifying certain kinds of topological constraints (such as the ‘‘small world assumption’’), which have been claimed to be reasonably plausible biologically, can restore robustness in this sense to LSMs.

Keywords: Liquid State Machine; Reservoir computing; Small world topology; Robustness; Machine learning
The term detector is standard in the LSM community and dates back to Maass et al. (Jaeger, 2001a; Lukosevicius & Jaeger, 2009; Maass, 2002; Maass & Markram, 2004; Maass et al., 2002b); the idea is that the ‘‘detectors’’ test whether the information for classification resides in the liquid, and thus they are not required to be biological. In this way, it is theoretically possible for the detectors to recognize any spatio-temporal signal that has been fed into the liquid, and thus the system could be used for, e.g., speech recognition, vision, etc.

This is an exciting idea and, e.g., Maass and his colleagues have published a series of papers on it. Amongst other things, they have recently shown that once a detector has been sufficiently trained at any time frame, it is resilient to noise in the input data and thus can be used successfully for generalization (Bassett & Bullmore, 2006; Fern & Sojakka, n.d.; Maass et al., 2002b).

Furthermore, there is a claim that this abstraction is faithful to the potential capabilities of the natural neurons and thus is explanatory to some extent from the viewpoint of computational brain science. Note that one of the underlying assumptions is that the detector works without memory; that is, the detector should be able to classify based on instantaneous static information, i.e. by sampling the liquid at a specific time. That this is theoretically possible is the result of looking at the dynamical system of the liquid and noting that it is sufficient to cause the divergence of the two classes in the space of activation.

Note that the detector systems (e.g. a back-propagation neural network, a perceptron or a support vector machine (SVM)) are not required to have any biological plausibility, either in their design or in their training mechanism, since the model does not try to account for the way the information is used in nature. Despite this, since natural neurons exist in a biological and hence noisy environment, for these models to be successful in this domain, they must be robust to various kinds of noise. As mentioned above, Maass et al. (Lukosevicius & Jaeger, 2009; Maass, Legenstein, & Markram, 2002; Maass et al., 2002b; Maass & Markram, 2004) addressed one dimension of this problem by showing that the systems are in fact robust to noise in the input. Thus small random shifts in a temporal input pattern will not affect the LSM’s ability to recognize the pattern. From a machine learning perspective, this means that the model is capable of generalization.

However, there is another component to robustness: that of the components of the system itself.

In this paper we report on experiments performed with various kinds of ‘‘damage’’ to the LSM and, unfortunately, we have shown that the LSM with any of the above detectors is not resistant, in the sense that small damages to the LSM neurons reduce the trained classifiers dramatically, even to essentially random values (Hazan & Manevitz, 2010; Manevitz & Hazan, 2010).

Seeking to correct this problem, we experimented with different architectures of the liquid. The essential need of the LSM is that there should be sufficient recurrent connections so that, on the one hand, the network maintains the information in a signal, while on the other hand it separates different signals. The models typically used have random connections, or random connections with a bias towards ‘‘nearby’’ connections. Our experiments with these topologies show that the network is very sensitive to damage because the recurrent nature of the system causes substantial feedback.

Taking this as a clue, we tried networks with a ‘‘hub’’ or ‘‘small world’’ (Albert & Barabási, 2000; Barabási, 2000; Barabási & Albert, 1999) architecture. This architecture has been claimed (Achard, Salvador, Whitcher, Suckling, & Bullmore, 2006; Bassett & Bullmore, 2006; Varshney, Chen, Paniagua, Hall, & Chklovskii, 2011) to be ‘‘biologically feasible’’. The intuition was that the hub topology, on the one hand, integrates information from many locations and so is resilient to damage in some of them; and on the other hand, since such hubs follow a power-law distribution, they are rare enough that damage usually does not affect them directly. This intuition was in fact borne out by our experiments.

2. Materials and methods

We simulated the Liquid State Machine with 243 integrate and fire (LIF) neurons in the liquid, following the exact set up of Maass and using the code available at the Maass laboratory software ‘‘A neural Circuit SIMulator’’ (http://www.lsm.tugraz.at/csim/). To test variants of topology we re-implemented the code, available at our website (http://www.cri.haifa.ac.il/neurocomputation). The variants of the topologies implemented are described in the paper below, as are the types of damages. Input to the liquid was given at 30% of the neurons, with the same input at all locations at a given time instance. The detectors of the basic networks were back-propagation networks with three levels, with 3 neurons in the hidden level and one output neuron. In most experiments, the input was given by the output of all non-input neurons of the liquid (i.e. 170 inputs to the detector). In some experiments (see section below) the inputs to the detector were given over 20 time instances and so the detector had 3400 inputs. The networks were tested with 20 random temporal binary sequences of length 45 chosen with uniform distribution. The experiments were repeated 500 times and statistics reported.
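To make the protocol above concrete, the following is a minimal, self-contained sketch of how such an experiment could be organized. It is only an illustration under the description given here, not the released code: the liquid is a toy discrete-time LIF approximation, a logistic unit stands in for the three-level back-propagation detector, and all names (run_liquid, etc.) are ours.

import numpy as np

rng = np.random.default_rng(0)
N, T, N_SEQ = 243, 45, 20               # liquid size, sequence length, number of test sequences
input_idx = rng.choice(N, size=int(0.3 * N), replace=False)   # 30% of neurons receive the input
readout_idx = np.setdiff1d(np.arange(N), input_idx)           # the ~170 non-input neurons feed the detector

W = (rng.random((N, N)) < 0.2) * rng.normal(0.0, 0.5, (N, N))  # ~20% random connectivity

def run_liquid(seq):
    """Very simplified discrete-time LIF liquid; returns the state of the readout neurons at the last step."""
    v = np.zeros(N)
    spikes = np.zeros(N)
    for t in range(T):
        drive = W @ spikes
        drive[input_idx] += seq[t]          # same input at all chosen input locations
        v = 0.9 * v + drive                 # leaky integration
        spikes = (v > 1.0).astype(float)    # threshold
        v[spikes > 0] = 0.0                 # reset after a spike
    return v[readout_idx]

sequences = rng.integers(0, 2, size=(N_SEQ, T))   # 20 random 0/1 temporal inputs of length 45
labels = np.array([1] * 10 + [0] * 10)            # recognize ten of them, reject the other ten

X = np.array([run_liquid(s) for s in sequences])

# A minimal logistic-regression "detector" stands in for the 3-level back-propagation network.
w, b = np.zeros(X.shape[1]), 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    grad = p - labels
    w -= 0.05 * X.T @ grad / len(X)
    b -= 0.05 * grad.mean()
p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
print("training accuracy:", ((p > 0.5) == labels).mean())

In the paper's experiments this whole procedure is repeated 500 times with fresh random connections, and the statistics of the recognition rates are reported.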
3. Theory/calculations

As discussed in the introduction, in such a system there are two sources of potential instability. First is the issue of small variants in the input. Systems need to balance the need for separation with generalization. That is, on the one hand, one may need to separate inputs with small variations into separate treatment, but on the other hand, small variants may need to be treated as ‘‘noise’’ or generalization of the trained system. For the LSM, as is typically presented in the literature, it is understood, e.g. from the work of Lukosevicius and Jaeger (2009) and Maass (2002), that the LSM and its variants do this successfully in the case of spatio-temporal signals.

The second issue concerns the sensitivity of the system to small changes in the system itself, which we choose to call ‘‘damages’’ in this paper. This is very important if, as is the case for the LSM, it is supposed to be explanatory for biological systems.

Our experiments therefore are based on simulating the LSM with temporal sequences and calculating how resistant it is to two main kinds of such damages. The damages chosen for investigation were: (1) at each time instance a certain percentage of neurons in the liquid would refuse to fire regardless of the internal charge in their state; (2) at each time instance a certain percentage of neurons would fire regardless of the internal charge, subject only to the limitation of the refractory period.

Since the basic results (see below) showed that the standard variants of the LSM were not robust to these damages at various small levels, we considered topological differences in the connectivity of the LSM.

3.1. First experiments: LSMs are not robust

3.1.1. The experiments
To test the resistance of the standard LSM to noise, we (i) downloaded the code of Maass et al. from his laboratory site (http://www.lsm.tugraz.at/csim/) and implemented two kinds of damage to the liquid, and (ii) re-implemented the LSM code so that we could handle variants. These models use a basic neuron of the ‘‘leaky integrate and fire’’ (LIF) variety and, in Maass’ work, the neurons are connected randomly but with some biologically inspired parameters: 20% inhibitory neurons and a connectivity constraint giving a preference to geometrically nearby neurons over more remote ones. (For precise details on these parameters, see the neural Circuit SIMulator and Maass and Markram (2002).) External stimuli to the network were always sent to 30% of the neurons, always chosen to be excitatory neurons. Initially, we experimented with two parameters: (i) the percentage of neurons damaged; (ii) the kinds of damage. The kinds were either transforming a neuron into a ‘‘dead’’ neuron, i.e. one that never fires, or transforming a neuron into a ‘‘generator’’ neuron, i.e. one which fires as often as its refractory period allows, regardless of its input. We did experiments with different kinds of detectors: Adaline (Widrow & Hoff, 1960), Back-Propagation, SVM and Tempotron (Gutig & Sompolinsky, 2006). Classification of new data could then be done at any of the signal points.

We ran experiments as follows: we randomly chose twenty temporal inputs, i.e. random sequences of 0s and 1s of length 45, corresponding to spike inputs over a period of time, and trained an LSM composed of 243 integrate and fire neurons as in the liquid (Maass & Markram, 2002) to recognize ten of these inputs and reject the other ten. Each choice of architecture was run 500 times, varying the precise connections randomly. We tested the robustness of the recognition ability of the network with the following parameters:

– The neurons in the network were either leaky integrate and fire neurons (Maass, 2002) or Izhikevich (Izhikevich, 2003) style neurons.
– The average connectivity of the networks was maintained at about 20%, chosen randomly in all cases although with different distributions.
– The damages were either ‘‘generators’’, i.e. the neurons issued a spike whenever their refractory period allowed it; or they were ‘‘dead’’ neurons that could not spike.
– The degree of damage was systematically checked at 0.1%, 0.5%, 1%, 5%, and 10% of randomly chosen neurons.

The results shown in tables throughout the paper are in percentages, over the (500) repeated tests. One hundred percent indicates that all 20 vectors of one test, over 500 repetitions of the test, were fully recognized correctly. Fifty percent indicates that only half the vectors over the 500 repetitions were recognized. (This corresponds to a chance baseline.) The graphs presented below show the full distribution of all the tests and the results over all the kinds of damages and all varieties of topologies. As expected, they distribute as a Gaussian, but note that the average success rate varies from a baseline of 10 successes (50%) for random guessing (see Fig. 2) to as high as almost 20 (98%) for generalization in certain cases and 88% for some of the damages.

3.2. Second experiments: modifications of the LSM

3.2.1. Different kinds of basic neurons
In attempts to restore the robustness to damage, we experimented with the possibility that a different kind of basic neuron might result in a more resilient network. Accordingly, we implemented the LSM with various variants of ‘‘leaky integrate and fire neurons’’, e.g. with a history dependent refractory period (Manevitz & Marom, 2002), and by using the model of neurons due to Izhikevich (2003). The results under these variants were qualitatively the same as with the standard integrate and fire neuron. (The Izhikevich model produces much denser activity in the network and thus the detector was harder to train, but in the end the network was trainable and the results under damage were very similar.) Accordingly, we report only results with the standard integrate and fire neuron as appears, e.g., in Maass’ work (Maass, 2002).

3.2.2. Allowing detectors to have memory
In trying to consider how to make the model more robust to damage, we investigated the fact that the detector has no memory. Perhaps, if we allow the detector to follow the development of the network for a substantial amount of time, both in training and running, it would be more robust. To check this, we took the most extreme other case: we assumed that the detector system in fact takes as input a full time course of 20 iterations of the output neurons of the liquid. This means that instead of a neural network with an input of 170, we had one with 20 times 170 time-course inputs. It seemed reasonable that (i) with so much information, it should be relatively easy to train the detector; and (ii) one could hope that damage in the liquid would be local enough that, over the time period, the detector could correct for it. In order to test this, we re-implemented the LSM detector to allow for this time entry.

Our detector was trained and tested as follows. There were 170 output units. At a ‘‘signal point’’ each of them was sampled for the next 20 iterations and all of these values were used as a single data point for the detector. Thus the detector had 170 times 20 inputs. We chose separate detector points typically at intervals of 50. We then used back propagation on these data points. This means that eventually the detector could recognize the signal at any of the ‘‘signal points’’; after training there was no particular importance to the choice of separation of the signal points except that there was no overlap between the data points. While we did not control for any connections between the intervals of data points (i.e. 50, and we also checked other time intervals) and possible natural oscillations in the network, we do not believe there were any. As anticipated, there was no significant trouble in training the network to even 100% recognition of the training data.

The ‘‘detectors’’ were three level neural networks, trained by back-propagation. We also did some experiments with the Tempotron (Gutig & Sompolinsky, 2006) and with a simple Adaline detector (Widrow & Hoff, 1960). Training for classification could be performed in the damage-less environment successfully with any of these detectors. Then we exhaustively ran tests on these possibilities.

In all of these tests, following Maass (2002), Maass and Markram (2002) and Maass et al. (2002a), we assumed that approximately 20% of the neurons of the liquid were of the inhibitory type. The architecture of the neural network detector was 204 input neurons (which were never taken from the neurons in the LSM which were also used as inputs to the LSM), 100 hidden level neurons and one neuron for the output. Results running the Maass et al. architecture are presented in Fig. 4 and Table 4 and can be compared with a randomly connected network of 10% average connectivity, see Table 2.

The bottom line (see the results section) was that even with low amounts of damage and under most kinds of connectivity, the networks would fail; i.e. the trained but damaged network’s loss of function was very substantial and in many cases it could not perform substantially differently from a random selection.
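As an illustration of the two kinds of damage used above, the following sketch shows how ‘‘dead’’ and ‘‘generator’’ damage could be injected into one time step of a discrete-time liquid simulation. It is a minimal sketch under our own description, not the simulator's actual API; the names (apply_damage, refractory, etc.) are hypothetical.

import numpy as np

rng = np.random.default_rng(1)

def apply_damage(spikes, refractory, dead_idx, generator_idx):
    """Modify one time step's spike vector according to the two damage types.

    dead_idx: neurons that refuse to fire regardless of their internal charge.
    generator_idx: neurons that fire whenever their refractory period allows it,
                   regardless of their input.
    """
    damaged = spikes.copy()
    damaged[dead_idx] = 0.0                                  # "dead" neurons never spike
    can_fire = refractory[generator_idx] == 0                # only the refractory period is respected
    damaged[generator_idx] = np.where(can_fire, 1.0, 0.0)    # "generator" neurons spike whenever allowed
    return damaged

# Example: damage 5% of a 243-neuron liquid with dead neurons and 5% with generators.
N = 243
dead_idx = rng.choice(N, size=int(0.05 * N), replace=False)
remaining = np.setdiff1d(np.arange(N), dead_idx)
generator_idx = rng.choice(remaining, size=int(0.05 * N), replace=False)

spikes = (rng.random(N) < 0.1).astype(float)      # spikes produced by the undamaged dynamics
refractory = rng.integers(0, 3, size=N)           # time steps left in each neuron's refractory period
damaged_spikes = apply_damage(spikes, refractory, dead_idx, generator_idx)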
Fig. 4. Maass LSM (a) normal operation; (b) with 10% dead damage; (c) with 10% noise. One can easily discern the large change in the reaction of the network.
instantaneous changes in the architecture, it seems reasonable to design architectures that can somehow ‘‘filter’’ out minor changes. The liquids were varied in their topologies in the following ways:

1. Random connectivity. Each neuron in the network is connected to 20% of the other neurons in a random fashion. (i) In the original Maass topology the connections are chosen with a larger bias for nearby neurons (see Maass, 2002; Maass et al., 2002a; Maass, Natschläger, & Markram, 2002c). This is the literature standard and is what is usually meant by LSM. (ii) We also tested a network without such bias; i.e. the connections are chosen to 20% of the other neurons randomly and uniformly. The results presented below showed that these architectures are not robust.
2. Reducing the connectivity to 10% and 5% in the above arrangement. The intuition for this was that with lower connectivity, the feedback should be reduced. The results presented below show that this intuition is faulty and that these networks are even less robust than the above (see Tables 1, 2, 5 and 6).
3. Implementation of ‘‘hub’’ topologies in either input or output connectivity. The intuition here is that the relative rarity of ‘‘hubs’’ results in their damage being a very rare event. But when they are not damaged, they receive information from many sources and can thus filter out the damage, thus alleviating the feedback in the input case. In the output hub case, the existence of many hubs should allow the individual neurons to filter out noise.

The construction of hubs was done in various fashions:

a. Hand design of a network with one hub for input. See Appendix A for a full description of this design.
b. Small world topologies. Since small world topologies follow power law connectivity, they produce hubs. On the other hand, such topologies are thought to emerge in a ‘‘natural’’ fashion (Albert & Barabási, 2000; Barabási, 2000; Barabási & Albert, 1999; Varshney et al., 2011) and appear in real neuronal systems (Albert & Barabási, 2000; Bassett & Bullmore, 2006); see Fig. 3. Note, however, that in our context there are two directions in which to measure the power law: the input and the output connectivity histograms of the neurons. We checked the following variants:

i. Input connectivity is power law. That is, we assign a link from a uniformly randomly chosen neuron to a second neuron chosen randomly according to a power law. In this case the input connectivity follows a power law, while the output connectivity follows a Gaussian distribution.
ii. Output connectivity is power law. That is, we reverse the above. In this case the input connectivity is Gaussian while the output connectivity is power law.
iii. Replacing ‘‘Gaussian’’ with ‘‘uniform’’ in case (i) above.
iv. Replacing ‘‘Gaussian’’ with ‘‘uniform’’ in case (ii) above.
v. We also tried choosing a symmetric network with power law connectivity (i.e. for both input and output). Note that in this case, the same neurons served as ‘‘hubs’’ both for input and output.
vi. Finally, we designed an algorithm to allow distinct input and output power law connectivity. In this case the hubs in the two directions are distinct. Algorithms 1 and 2 below accomplish this task.

Algorithm 1. Generate random numbers between a min and max value with a power-law distribution.
Input: min, max, size, How_many_numbers, counterArry = array, Magnify = 5
for i = 1 to How_many_numbers
    index = random(array.start, array.end)
    end_array = array.end
    candidate = array[index]
    AddCells(array, Magnify)
    for t = 0 to Magnify
        array[end_array + t] = candidate
    end for
    shuffle(array)
    output_Array[i] = candidate
    counterArry[candidate]++
end for
shuffle(counterArry)
Output: output_Array, counterArry

Algorithm 2. Create the connectivity matrix for the liquid network using Algorithm 1 as an input.
use Algorithm 1 to create (arraylist, counterArry)
counter = 0
for i = 1 to counterArry.length
    for t = 1 to counterArry[i]
        weight_Matrix[i, arraylist[counter]] = true
        counter++
    end for
end for
Output: weight_Matrix

One problem with the various algorithms for designing power law connectivity is that under a ‘‘fair’’ sampling, the network might not be connected. This means that such a network actually has a lower, effective connectivity. We decided to eliminate this problem by randomly connecting the disconnected components (either from an input or output perspective) to another neuron chosen randomly but proportionally to the connectivity. (This does not guarantee connectivity of the graph, but makes disconnection unlikely, so that the effective connectivity is not substantially affected.)

Table 1
Five percent uniform random connectivity without memory input to the detector.(a)

Damage            None    0.1%    0.5%    1%      5%      10%
Dead neurons      100%    55%     53%     52%     51%     49%
Noisy neurons     100%    63%     54%     55%     51%     50%
Dead and noisy    100%    55%     52%     52%     50%     50%
Generalization    100%    93%     88%     80%     75%     78%

(a) For all the tables shown in this paper, 50% is the baseline of random classification.

Table 2
Ten percent uniform random connectivity without memory input to the detector.

Damage            None    0.1%    0.5%    1%      5%      10%
Dead neurons      100%    56%     53%     51%     51%     49%
Noisy neurons     100%    73%     58%     54%     51%     52%
Dead and noisy    100%    59%     54%     52%     52%     51%
Generalization    100%    100%    93%     88%     83%     81%
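The following sketch illustrates the idea behind Algorithms 1 and 2 above in runnable form: link targets are drawn with a preferential-attachment (‘‘rich get richer’’) rule, which produces an approximately power-law in-degree histogram, link sources are drawn uniformly, and rows left without outgoing links are patched afterwards as described in the preceding paragraph. It is a simplified illustration under stated assumptions, not the authors' implementation; all names here are hypothetical.

import numpy as np

rng = np.random.default_rng(2)

def power_law_targets(n_neurons, n_links, magnify=5):
    """Draw link targets with a preferential-attachment rule, in the spirit of Algorithm 1.

    Each time a neuron is chosen as a target, 'magnify' extra copies of it are added to the
    urn, so frequently chosen neurons become ever more likely to be chosen again.
    """
    urn = list(range(n_neurons))           # start with one ticket per neuron
    targets = np.empty(n_links, dtype=int)
    for i in range(n_links):
        candidate = urn[rng.integers(len(urn))]
        urn.extend([candidate] * magnify)  # "rich get richer"
        targets[i] = candidate
    return targets

def build_connectivity(n_neurons=243, density=0.2):
    """Boolean connectivity matrix with power-law in-degree and uniform out-degree
    (variant (i) built in the spirit of Algorithm 2). Rows are sources, columns are targets."""
    n_links = int(density * n_neurons * n_neurons)
    sources = rng.integers(0, n_neurons, size=n_links)       # uniform out-connectivity
    targets = power_law_targets(n_neurons, n_links)          # power-law in-connectivity
    W = np.zeros((n_neurons, n_neurons), dtype=bool)
    W[sources, targets] = True
    # Patch neurons left with no outgoing link by attaching them to a target chosen
    # proportionally to in-degree, as in the repair step described above.
    in_degree = W.sum(axis=0).astype(float)
    for i in np.flatnonzero(W.sum(axis=1) == 0):
        j = rng.choice(n_neurons, p=in_degree / in_degree.sum())
        W[i, j] = True
    return W

W = build_connectivity()
print("in-degree max/mean:", W.sum(axis=0).max(), W.sum(axis=0).mean())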
4. Results

4.1. First experiments: the LSM is not robust

First, there was not much difference between the detectors, so eventually we restricted ourselves to the back-propagation detector. (Note that none of the units of the liquid accessed by the detectors were allowed to be input neurons of the liquid.) It turned out that, while the detector is able to learn the randomly chosen test classes successfully, if there is sufficient average connectivity (e.g. 20%) almost any kind of damage caused the detector to have a very substantial decay in its detecting ability (see Table 3). Note that even with lower connectivity, which has less feedback, the same phenomenon occurs. See Table 1 (5% connectivity) and Table 2 (10% connectivity).

When the network is connected randomly but with a bias for geometric closeness as in Maass’ distribution, the network is still very sensitive (although a bit less so). Compare Table 4 to Table 3.

After our later experiments, we returned to this point (see concluding remarks, below). In Fig. 4 we illustrate the difference in reaction of the network by a raster (ISI) display. Note that with 10% damage, it is quite evident to the eye that the network diverges dramatically from the noise-free situation. In Tables 1–4 one can see this as well with 5% noise for purely random connectivity. Actually, with low degrees of damage the detectors under even the Maass connectivity (see Table 4) show dramatic decay in recognition, although not to the extremes of random connectivity. These results (see Tables 1–4) were robust and repeatable under many trials and variants.

Accordingly, we conclude that the LSM, either as purely defined with random connectivity, or as implemented in Maass et al. (2002a), cannot serve as a biologically relevant model.

Table 3
Twenty percent uniform random connectivity without memory input to the detector.

Table 4
Twenty percent connectivity under Maass’s distribution preferring local connections.

Damage            None    0.1%    0.5%    1%      5%      10%
Dead neurons      90%     60%     52%     51%     50%     50%
Noisy neurons     90%     78%     57%     52%     52%     52%
Dead and noisy    90%     54%     52%     53%     50%     50%
Generalization    90%     96%     93%     93%     84%     84%

4.2. Second experiments: varying the neurons and allowing the detectors to have memory

4.2.1. Variants of neurons (history dependent refractory period and Izhikevich)
The results under these variants were qualitatively the same as with the standard integrate and fire neuron. (The Izhikevich model produces much denser activity in the network and thus the detector was harder to train, but in the end the network was trainable and the results under damage were very similar.) Accordingly, we report only results with the standard integrate and fire neuron as appears, e.g., in Maass’ work.

4.2.2. Detectors with memory input
The ‘‘detectors’’ in our experiments were either three level neural networks trained by back-propagation, the Tempotron (Gutig & Sompolinsky, 2006), or a simple Adaline detector (Widrow & Hoff, 1960). Training for classification could be performed in the damage-less environment successfully with any of these detectors. We exhaustively ran tests on these possibilities, including damage degrees and kinds and detector types.

Tables 5–8 show the results with different uniform connectivity in the liquid when there is memory input to the detector.

Table 5
Five percent uniform random connectivity with memory input to the detector.

Damage            None    0.1%    0.5%    1%      5%      10%
Dead neurons      100%    55%     53%     53%     51%     50%
Noisy neurons     100%    63%     54%     54%     53%     51%
Dead and noisy    100%    56%     53%     52%     51%     51%
Generalization    100%    93%     87%     80%     75%     79%

Table 6
Ten percent uniform random connectivity with memory input to the detector.(a)

Damage            None    0.1%    0.5%    1%      5%      10%
Dead neurons      100%    58%     55%     53%     49%     50%
Noisy neurons     100%    74%     59%     57%     54%     50%
Dead and noisy    100%    61%     54%     55%     50%     50%
Generalization    100%    96%     92%     85%     82%     82%

(a) For all the tables shown in this paper, 50% is the baseline of random classification.

Table 7
Twenty percent uniform random connectivity with memory input to the detector.

Damage            None    0.1%    0.5%    1%      5%      10%
Dead neurons      100%    63%     55%     52%     50%     50%
Noisy neurons     100%    87%     67%     61%     54%     52%
Dead and noisy    100%    68%     57%     52%     50%     49%
Generalization    100%    98%     97%     95%     89%     86%

Table 8
Maass’s distribution as in Table 4 but with memory input to the detectors.

Damage            None    0.1%    0.5%    1%      5%      10%
Dead neurons      100%    61%     53%     49%     49%     50%
Noisy neurons     100%    79%     60%     55%     51%     49%
Dead and noisy    100%    64%     55%     52%     51%     52%
Generalization    100%    100%    96%     93%     84%     85%
Fig. 6. Histographs of correctness results in LSM networks with 20 time interval input, different amounts of ‘‘noise generator’’ neuron damage, and average connectivity of 20% with a uniform random distribution on the connections.

Fig. 7. Histographs of correctness results in LSM networks with one hub distribution with different amounts of ‘‘noise generator’’ neuron damage.
Table 9
One hub network with memory input to the detector.

Damage            None    0.1%    0.5%    1%      5%      10%
Dead neurons      100%    95%     88%     85%     76%     67%
Noisy neurons     100%    97%     91%     86%     70%     62%
Dead and noisy    100%    96%     89%     86%     75%     68%
Generalization    100%    100%    97%     97%     96%     95%

Table 10
Small world with a power-law distribution with memory input to the detector.

Damage            None    0.1%    0.5%    1%      5%      10%
Dead neurons      100%    55%     51%     51%     50%     51%
Noisy neurons     100%    79%     58%     53%     50%     51%
Dead and noisy    100%    58%     51%     50%     48%     50%
Generalization    100%    100%    97%     93%     90%     89%
Table 11
Small world with a double power-law distribution with memory input to the detector.

Table 12
Small world with a double power-law distribution without memory input to the detector.

Damage            None    0.1%    0.5%    1%      5%      10%
Dead neurons      62%     83%     67%     61%     56%     53%
Noisy neurons     62%     91%     75%     66%     54%     55%
Dead and noisy    62%     86%     69%     65%     52%     55%
Generalization    62%     100%    96%     95%     93%     91%

Fig. 9. Histographs of correctness results in LSM networks with different amounts of ‘‘dead’’ neuron damage with small world topology obtained with a power law distribution.

Fig. 10. Histographs of correctness results in LSM networks with different amounts of ‘‘noise generator’’ neuron damage for small world topology obtained with a power-law distribution.

Fig. 12. Histographs of correctness results in LSM networks with different amounts of ‘‘dead’’ neuron damage with small world topology obtained with a double power law distribution.

robustness from damage in the liquid. On the other hand, they had improved generalization capability (see Table 10).

Looking closer at the distribution, as can be seen from Fig. 3, Algorithm 1 actually creates a power-law distribution in terms of total connections, but when we separate the connections into input and output connections, we see that while the output has a power law distribution, the input connections have a roughly uniform random distribution.
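The observation above, that a topology can look power-law in total degree while its input and output degrees behave quite differently, is easy to check numerically. Below is a minimal sketch of such a check, assuming the connectivity is held in a boolean matrix W with rows as sources and columns as targets (as in the sketch following Algorithm 2); the names are ours, not the paper's code.

import numpy as np

def degree_histograms(W):
    """Return (in-degree, out-degree, total-degree) counts for a boolean connectivity matrix W,
    where W[i, j] = True means a connection from neuron i to neuron j."""
    in_deg = W.sum(axis=0)     # how many connections each neuron receives
    out_deg = W.sum(axis=1)    # how many connections each neuron sends
    return in_deg, out_deg, in_deg + out_deg

# Example usage with any connectivity matrix W built earlier:
#   in_deg, out_deg, total_deg = degree_histograms(W)
# np.histogram(total_deg) may look heavy-tailed even when only one of
# np.histogram(in_deg) or np.histogram(out_deg) actually follows a power law.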
5. Discussion
Fig. 14. A graphical summary of the results presented in this paper. The ‘‘standard’’ LSM topologies, either uniform or as in Maass’s original papers, are not robust; but small world topologies show an improvement, which is most marked in the case of a two-way power law distribution.
Appendix A

Divide all the neurons (240) into groups; the size of each group is randomly chosen between 3 and 6 neurons. Each neuron connects to 2 of its neighbors in the same group. Choose 1/4 of the groups to be hubs; the rest of the groups we call the base. For 20% connectivity (that is, 11,472 connections), 90% of the connections are from the base groups to the hub groups, 7% are from the hub groups to the base groups and 3% are connections between the hub groups. To accomplish that:

– Choose (10,324 times) a random neuron from the base groups and connect it to a randomly chosen neuron from a hub group.
– Randomly choose (803 times) a neuron from a hub group and connect it to a randomly chosen neuron from the base neurons.
– Connect (345 times) a randomly chosen hub neuron to another hub neuron from a different group (see Fig. A1).
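A runnable sketch of this construction is given below. It follows the recipe above (random groups of 3–6 neurons, a ring of 2 neighbors inside each group, a quarter of the groups as hubs, and base-to-hub, hub-to-base and hub-to-hub links in roughly a 90/7/3 split); the function names and the use of a boolean matrix are our own illustrative choices, not the original implementation.

import numpy as np

rng = np.random.default_rng(3)

def build_one_hub_topology(n_neurons=240, n_links=11472):
    """Group-based hub topology sketched in Appendix A. Returns a boolean matrix W
    with W[i, j] = True meaning a connection from neuron i to neuron j."""
    # Partition the neurons into groups of random size between 3 and 6.
    groups, start = [], 0
    while start < n_neurons:
        size = int(min(rng.integers(3, 7), n_neurons - start))
        groups.append(np.arange(start, start + size))
        start += size

    W = np.zeros((n_neurons, n_neurons), dtype=bool)
    # Inside each group, every neuron connects to its 2 ring neighbors.
    for g in groups:
        if len(g) < 2:
            continue
        for k, neuron in enumerate(g):
            W[neuron, g[(k + 1) % len(g)]] = True
            W[neuron, g[(k - 1) % len(g)]] = True

    # A quarter of the groups act as hubs, the rest form the base.
    order = rng.permutation(len(groups))
    groups = [groups[i] for i in order]
    n_hub = max(1, len(groups) // 4)
    hub = np.concatenate(groups[:n_hub])
    base = np.concatenate(groups[n_hub:])

    def connect(sources, targets, count):
        for _ in range(count):
            W[rng.choice(sources), rng.choice(targets)] = True

    connect(base, hub, int(0.90 * n_links))   # base -> hub (about 90% of the links)
    connect(hub, base, int(0.07 * n_links))   # hub -> base (about 7%)
    connect(hub, hub, int(0.03 * n_links))    # hub -> hub (about 3%)
    return W

W = build_one_hub_topology()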
References

Achard, S., Salvador, R., Whitcher, B., Suckling, J., & Bullmore, E. (2006). A resilient, low-frequency, small-world human brain functional network with highly connected association cortical hubs. The Journal of Neuroscience, 26(1), 63–72. doi:10.1523/JNEUROSCI.3874-05.2006.
Albert, R., & Barabási, A.-L. (2000). Topology of evolving networks: Local events and universality. Physical Review Letters, 85(24), 5234–5237. Retrieved from <http://www.ncbi.nlm.nih.gov/pubmed/11102229>.
Barabási, G. B. A.-L. (2000). Competition and multiscaling in evolving networks. cond-mat/0011029. Retrieved from <http://arxiv.org/abs/cond-mat/0011029>.
Barabási, A.-L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286(5439), 509–512. doi:10.1126/science.286.5439.509.
Bassett, D. S., & Bullmore, E. (2006). Small-world brain networks. The Neuroscientist, 12(6), 512–523. doi:10.1177/1073858406293182.
Fern, C., & Sojakka, S. (n.d.). Pattern recognition in a bucket. Retrieved from <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.97.3902>.
Gutig, R., & Sompolinsky, H. (2006). The tempotron: A neuron that learns spike timing-based decisions. Nature Neuroscience, 9(3), 420–428. doi:10.1038/nn1643.
Hazan, H., & Manevitz, L. M. (2010). The liquid state machine is not robust to problems in its components but topological constraints can restore robustness. In IJCCI (ICFC-ICNC) (pp. 258–264).
Izhikevich, E. M. (2003). Simple model of spiking neurons. IEEE Transactions on Neural Networks, 14(6), 1569–1572. doi:10.1109/TNN.2003.820440.
Jaeger, H. (2001a). The ‘‘echo state’’ approach to analysing and training recurrent neural networks (No. GMD Report 148). German National Research Center for Information Technology. Retrieved from <http://www.faculty.iu-bremen.de/hjaeger/pubs/EchoStatesTechRep.pdf>.
Jaeger, H. (2001b). Short term memory in echo state networks (No. GMD Report 152). German National Research Center for Information Technology. Retrieved from <http://www.faculty.iu-bremen.de/hjaeger/pubs/STMEchoStatesTechRep.pdf>.
Jaeger, H. (2002). Adaptive nonlinear system identification with echo state networks. Retrieved from <http://www.faculty.iu-bremen.de/hjaeger/pubs/esn_NIPS02>.
Lukosevicius, M., & Jaeger, H. (2009). Reservoir computing approaches to recurrent neural network training. Computer Science Review, 3(3), 127–149. doi:10.1016/j.cosrev.2009.03.005.
Maass, W. (2002). Paradigms for computing with spiking neurons. In J. L. van Hemmen, J. D. Cowan, & E. Domany (Eds.), Models of neural networks. Early vision and attention (Vol. 4, pp. 373–402). New York: Springer.
Maass, W., Legenstein, R. A., & Markram, H. (2002). A new approach towards vision suggested by biologically realistic neural microcircuit models. In Proceedings of the 2nd workshop on biologically motivated computer vision. Lecture notes in computer science. Springer. Retrieved from papers/lsm-vision-146.pdf.
Maass, W., & Markram, H. (2002). Temporal integration in recurrent microcircuits. In M. A. Arbib (Ed.), The handbook of brain theory and neural networks (2nd ed.). Cambridge: MIT Press.
Maass, W., & Markram, H. (2004). On the computational power of circuits of spiking neurons. Journal of Computer and System Sciences, 69(4), 593–616. doi:10.1016/j.jcss.2004.04.001.
Maass, W., Natschläger, T., & Markram, H. (2002a). Computational models for generic cortical microcircuits. In J. Feng (Ed.), Computational neuroscience: A comprehensive approach. CRC Press. Retrieved from papers/lsm-feng-chapter-149.pdf.
Maass, W., Natschläger, T., & Markram, H. (2002b). Real-time computing without stable states: A new framework for neural computation based on perturbations. Neural Computation, 14(11), 2531–2560. Retrieved from papers/lsm-nc-130.pdf.
Maass, W., Natschläger, T., & Markram, H. (2002c). A model for real-time computation in generic neural microcircuits. In Proceedings of NIPS 2002 (Vol. 15, pp. 229–236). Retrieved from papers/lsm-nips-147.pdf.
Maass, W., Natschläger, T., & Markram, H. (2002d). A fresh look at real-time computation in generic recurrent neural circuits. Tech. Report, Institute for Theoretical Computer Science, TU Graz, Graz, Austria.
Manevitz, L., & Hazan, H. (2010). Stability and topology in reservoir computing. In G. Sidorov, A. Hernández Aguirre, & C. Reyes García (Eds.), Advances in soft computing. Lecture notes in computer science (Vol. 6438, pp. 245–256). Berlin/Heidelberg: Springer. Retrieved from <http://dx.doi.org/10.1007/978-3-642-16773-7_21>.
Manevitz, L. M., & Marom, S. (2002). Modeling the process of rate selection in neuronal activity. Journal of Theoretical Biology, 216(3), 337–343. Retrieved from <http://www.ncbi.nlm.nih.gov/pubmed/12183122>.
Natschläger, T., Maass, W., & Markram, H. (2002). The ‘‘Liquid Computer’’: A novel strategy for real-time computing on time series. Special Issue on Foundations of Information Processing of TELEMATIK, 8(1), 39–43. Retrieved from papers/lsm-telematik.pdf.
Natschläger, T., Markram, H., & Maass, W. (2002). Computer models and analysis tools for neural microcircuits. In R. Kötter (Ed.), A practical guide to neuroscience databases and associated tools. Boston: Kluwer Academic Publishers. Retrieved from papers/lsm-koetter-chapter-144.pdf.
Pitts, W., & McCulloch, W. S. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biology, 52(1–2), 99–115, discussion 73–97.
Varshney, L. R., Chen, B. L., Paniagua, E., Hall, D. H., & Chklovskii, D. B. (2011). Structural properties of the Caenorhabditis elegans neuronal network. PLoS Computational Biology, 7(2), e1001066. doi:10.1371/journal.pcbi.1001066.
Widrow, B., & Hoff, M. (1960). Adaptive switching circuits. In 1960 IRE WESCON Convention Record, Part 4 (pp. 96–104). IRE. Retrieved from <http://isl-www.stanford.edu/~widrow/papers/c1960adaptiveswitching.pdf>.