
CHAPTER 3

Neurons, Neural Networks, and Linear Discriminants

We’ve spent enough time with the concepts of machine learning; now it is time to actually
see it in practice. To start the process, we will return to our demonstration that learning is
possible: the squishy thing that your skull protects.

3.1 THE BRAIN AND THE NEURON


In animals, learning occurs within the brain. If we can understand how the brain works, then
there might be things in there for us to copy and use for our machine learning systems.
While the brain is an impressively powerful and complicated system, the basic building
blocks that it is made up of are fairly simple and easy to understand. We’ll look at them
shortly, but it’s worth noting that in computational terms the brain does exactly what we
want. It deals with noisy and even inconsistent data, and from very high-dimensional data
(such as images) it produces answers that are usually correct, and does so very quickly. This
is all amazing for something that weighs about 1.5 kg and is losing parts of itself all the time
(neurons die at impressive/depressing rates as you age), yet its performance does not degrade
appreciably (in the jargon, this means it is robust).
So how does it actually work? We aren’t actually that sure on most levels, but in this
book we are only going to worry about the most basic level, which is the processing units
of the brain. These are nerve cells called neurons. There are lots of them (100 billion = 10¹¹
is the figure that is often given) and they come in lots of different types, depending upon
their particular task. However, their general operation is similar in all cases: transmitter
chemicals within the fluid of the brain raise or lower the electrical potential inside the body
of the neuron. If this membrane potential reaches some threshold, the neuron spikes or fires,
and a pulse of fixed strength and duration is sent down the axon. The axons divide (arborise)
into connections to many other neurons, connecting to each of these neurons in a synapse.
Each neuron is typically connected to thousands of other neurons, so that it is estimated
that there are about 100 trillion (= 10¹⁴) synapses within the brain. After firing, the neuron
must wait for some time to recover its energy (the refractory period) before it can fire again.
Each neuron can be viewed as a separate processor, performing a very simple computa-
tion: deciding whether or not to fire. This makes the brain a massively parallel computer
made up of 10¹¹ processing elements. If that is all there is to the brain, then we should be
able to model it inside a computer and end up with animal or human intelligence inside a
computer. This is the view of strong AI. We aren’t aiming at anything that grand in this
book, but we do want to make programs that learn. So how does learning occur in the
brain? The principal concept is plasticity: modifying the strength of synaptic connections
between neurons, and creating new connections. We don’t know all of the mechanisms by
which the strength of these synapses gets adapted, but one method that does seem to be
used was first postulated by Donald Hebb in 1949, and that is what is discussed now.

3.1.1 Hebb’s Rule


Hebb’s rule says that the changes in the strength of synaptic connections are proportional
to the correlation in the firing of the two connecting neurons. So if two neurons consistently
fire simultaneously, then any connection between them will change in strength, becoming
stronger. However, if the two neurons never fire simultaneously, the connection between
them will die away. The idea is that if two neurons both respond to something, then they
should be connected. Let’s see a trivial example: suppose that you have a neuron somewhere
that recognises your grandmother (this will probably get input from lots of visual processing
neurons, but don’t worry about that). Now if your grandmother always gives you a chocolate
bar when she comes to visit, then some neurons, which are happy because you like the taste
of chocolate, will also be stimulated. Since these neurons fire at the same time, they will
be connected together, and the connection will get stronger over time. So eventually, the
sight of your grandmother, even in a photo, will be enough to make you think of chocolate.
Sound familiar? Pavlov used this idea, called classical conditioning, to train his dogs so that
when food was shown to the dogs and the bell was rung at the same time, the neurons for
salivating over the food and hearing the bell fired simultaneously, and so became strongly
connected. Over time, the strength of the synapse between the neurons that responded to
hearing the bell and those that caused the salivation reflex was enough that just hearing
the bell caused the salivation neurons to fire in sympathy.
This idea, that synaptic connections between neurons and assemblies of neurons are formed
when they fire together and can become stronger, goes by other names as well: it is also
known as long-term potentiation and neural plasticity, and it does appear to have
correlates in real brains.
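
The chapter gives no explicit formula for Hebb’s rule, but a minimal sketch of the idea in code may help: weights grow in proportion to how often the two connected neurons are active together. This is an illustrative assumption of one possible update (the function name hebbian_update, the learning rate eta, and the 0/1 firing indicators are all made up for the example), written in Python/NumPy since that is what this book uses elsewhere.

```python
import numpy as np

def hebbian_update(weights, pre, post, eta=0.1):
    """One illustrative Hebbian step: connections between neurons that fire
    together are strengthened. pre and post are 0/1 firing indicators for the
    pre- and post-synaptic neurons; eta is an assumed learning rate."""
    # Entry (i, j) of the outer product is 1 only when post[i] and pre[j]
    # both fire, so only co-active connections grow.
    return weights + eta * np.outer(post, pre)

# Two input (pre-synaptic) neurons feeding one output (post-synaptic) neuron.
w = np.zeros((1, 2))
for _ in range(5):
    pre = np.array([1, 0])   # the first input neuron fires every time, the second never
    post = np.array([1])     # the output neuron fires at the same time
    w = hebbian_update(w, pre, post)

print(w)  # [[0.5 0. ]] -- the co-active connection has strengthened, the other has not
```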

3.1.2 McCulloch and Pitts Neurons


Studying neurons isn’t actually that easy. You need to be able to extract the neuron from the
brain, and then keep it alive so that you can see how it reacts in controlled circumstances.
Doing this takes a lot of care. One of the problems is that neurons are generally quite
small (they must be if you’ve got 10¹¹ of them in your head!) so getting electrodes into
the synapses is difficult. It has been done, though, using neurons from the giant squid,
which has some neurons that are large enough to see. Hodgkin and Huxley did this in 1952,
measuring and writing down differential equations that compute the membrane potential
based on various chemical concentrations, something that earned them a Nobel prize. We
aren’t going to worry about that; instead, we’re going to look at a mathematical model
of a neuron that was introduced in 1943. The purpose of a mathematical model is to
extract only the bare essentials required to accurately represent the entity being studied,
removing all of the extraneous details. McCulloch and Pitts produced a perfect example of
this when they modelled a neuron as:

FIGURE 3.1 A picture of McCulloch and Pitts’ mathematical model of a neuron. The
inputs x_i are multiplied by the weights w_i, and the neuron sums these values. If this sum
is greater than the threshold θ then the neuron fires; otherwise it does not.

(1) a set of weighted inputs w_i that correspond to the synapses,

(2) an adder that sums the input signals (equivalent to the membrane of the cell that
collects electrical charge), and

(3) an activation function (initially a threshold function) that decides whether the neuron
fires (‘spikes’) for the current inputs.

A picture of their model is given in Figure 3.1, and we’ll use the picture to write down
a mathematical description. On the left of the picture are a set of input nodes (labelled
x_1, x_2, ..., x_m). These are given some values, and as an example we’ll assume that there are
three inputs, with x_1 = 1, x_2 = 0, x_3 = 0.5. In real neurons those inputs come from the
outputs of other neurons. So the 0 means that a neuron didn’t fire, the 1 means it did,
and the 0.5 has no biological meaning, but never mind. (Actually, this isn’t quite fair, but
it’s a long story and not very relevant.) Each of these other neuronal firings flowed along
a synapse to arrive at our neuron, and those synapses have strengths, called weights. The
strength of the synapse affects the strength of the signal, so we multiply the input by the
weight of the synapse (so we get x_1 × w_1 and x_2 × w_2, etc.). Now when all of these signals
arrive into our neuron, it adds them up to see if there is enough strength to make it fire.
We’ll write that as
h = \sum_{i=1}^{m} w_i x_i ,    (3.1)

which just means sum (add up) all the inputs multiplied by their synaptic weights. I’ve
assumed that there are m of them, where m = 3 in the example. If the synaptic weights
are w_1 = 1, w_2 = −0.5, w_3 = −1, then the input to our model neuron is
h = 1 × 1 + 0 × (−0.5) + 0.5 × (−1) = 1 + 0 − 0.5 = 0.5. Now the neuron needs to decide if it is going to
fire. For a real neuron, this is a question of whether the membrane potential is above some
threshold. We’ll pick a threshold value (labelled θ), say θ = 0 as an example. Now, does
our neuron fire? Well, h = 0.5 in the example, and 0.5 > 0, so the neuron does fire, and
produces output 1. If the neuron did not fire, it would produce output 0.
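
To check the arithmetic, here is a minimal sketch of the same computation in Python/NumPy (the language used elsewhere in this book); the function name mcp_neuron is just for illustration.

```python
import numpy as np

def mcp_neuron(x, w, theta):
    """McCulloch and Pitts neuron: weighted sum (Equation 3.1) followed by a
    hard threshold (the activation function, Equation 3.2 below)."""
    h = np.dot(w, x)               # h = sum_i w_i * x_i
    return 1 if h > theta else 0   # fire (1) if the sum exceeds the threshold

x = np.array([1.0, 0.0, 0.5])        # the example inputs
w = np.array([1.0, -0.5, -1.0])      # the example synaptic weights
print(mcp_neuron(x, w, theta=0.0))   # h = 0.5 > 0, so this prints 1
```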

The McCulloch and Pitts neuron is a binary threshold device. It sums up the inputs
(multiplied by the synaptic strengths or weights) and either fires (produces output 1) or
does not fire (produces output 0) depending on whether the input is above some threshold.
We can write the second half of the work of the neuron, the decision about whether or not
to fire (which is known as an activation function), as:

o = g(h) = \begin{cases} 1 & \text{if } h > \theta \\ 0 & \text{if } h \leq \theta \end{cases}    (3.2)
This is a very simple model, but we are going to use these neurons, or very simple
variations on them using slightly different activation functions (that is, we’ll replace the
threshold function with something else) for most of our study of neural networks. In fact,
these neurons might look simple, but as we shall see, a network of such neurons can perform
any computation that a normal computer can, provided that the weights w_i are chosen
correctly. So one of the main things we are going to talk about for the next few chapters is
methods of setting these weights.
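
As a concrete, if hand-picked, illustration of the claim that suitably chosen weights let these neurons compute, here is a small sketch (not from the text) in which the weights and thresholds of single McCulloch and Pitts neurons are chosen by hand so that they compute the logical OR and AND functions of two binary inputs. Choosing different weights gives different functions, which is exactly why the coming chapters concentrate on setting the weights automatically.

```python
import numpy as np

def mcp_neuron(x, w, theta):
    """McCulloch and Pitts neuron: weighted sum, then hard threshold."""
    return 1 if np.dot(w, x) > theta else 0

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    x = np.array(x, dtype=float)
    # OR: fires if at least one input is 1 (weights 1, 1; threshold 0.5).
    or_out = mcp_neuron(x, np.array([1.0, 1.0]), theta=0.5)
    # AND: fires only when both inputs are 1 (weights 1, 1; threshold 1.5).
    and_out = mcp_neuron(x, np.array([1.0, 1.0]), theta=1.5)
    print(x, "OR:", or_out, "AND:", and_out)
```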

3.1.3 Limitations of the McCulloch and Pitts Neuronal Model


One question that is worth considering is how realistic this model of a neuron is. The
answer is: not very. Real neurons are much more complicated. The inputs to a real neuron
are not necessarily summed linearly: there may be non-linear summations. However, the
most noticeable difference is that real neurons do not output a single output response,
but a spike train, that is, a sequence of pulses, and it is this spike train that encodes
information. This means that neurons don’t actually respond as threshold devices, but
produce a graded output in a continuous way. They do still have a transition between
firing and not firing, but the threshold at which they fire changes over time. Because
neurons are biochemical devices, the amount of neurotransmitter (which affects how much
charge they require to spike, amongst other things) can vary according to the current
state of the organism. Furthermore, the neurons are not updated sequentially according to
a computer clock, but update themselves randomly (asynchronously), whereas in many of
our models we will update the neurons according to the clock. There are neural network
models that are asynchronous, but for our purposes we will stick to algorithms that are
updated by the clock.
Note that the weights w_i can be positive or negative. This corresponds to excitatory
and inhibitory connections that make neurons more likely to fire and less likely to fire,
respectively.
Both of these types of synapses do exist within the brain, but with the McCulloch and
Pitts neurons, the weights can change from positive to negative or vice versa, which has not
been seen biologically—synaptic connections are either excitatory or inhibitory, and never
change from one to the other. Additionally, real neurons can have synapses that link back
to themselves in a feedback loop, but we do not usually allow that possibility when we make
networks of neurons. Again, there are exceptions, but we won’t get into them.
It is possible to improve the model to include many of these features, but the picture is
complicated enough already, and McCulloch and Pitts neurons already provide a great deal
of interesting behaviour that resembles the action of the brain, such as the fact that networks
of McCulloch and Pitts neurons can memorise pictures and learn to represent functions and
classify data, as we shall see in the next couple of chapters.

Earlier in this chapter we saw a simple model of a neuron that simulated what seems to be
the most important function of a neuron (deciding whether or not to fire) and ignored
the nasty biological things like
chemical concentrations, refractory periods, etc. Having this model is only useful if we can
use it to understand what is happening when we learn, or use the model in order to solve
some kind of problem. We are going to try to do both in this chapter, although the learning
that we try to understand will be machine learning rather than animal learning.

3.2 NEURAL NETWORKS


One thing that is probably fairly obvious is that one neuron isn’t that interesting. It doesn’t
do very much, except fire or not fire when we give it inputs. In fact, it doesn’t even learn.
If we feed in the same set of inputs over and over again, the output of the neuron never
varies—it either fires or does not. So to make the neuron a little more interesting we need
to work out how to make it learn, and then we need to put sets of neurons together into
neural networks so that they can do something useful.
The question we need to think about first is how our neurons can learn. We are going to
look at supervised learning for the next few chapters, which means that the algorithms will
learn by example: the dataset that we learn from has the correct output values associated
with each datapoint. At first sight this might seem pointless, since if you already know the
correct answer, why bother learning at all? The key is in the concept of generalisation that
we saw in Section 1.2. Assuming that there is some pattern in the data, then by showing
the neural network a few examples we hope that it will find the pattern and predict the
other examples correctly. This is sometimes known as pattern recognition.
Before we worry too much about this, let’s think about what learning is. In the Intro-
duction it was suggested that you learn if you get better at doing something. So if you
can’t program in the first semester and you can in the second, you have learnt to program.
Something has changed (adapted), presumably in your brain, so that you can do a task that
you were not able to do previously. Have a look again at the McCulloch and Pitts neuron
(e.g., in Figure 3.1) and try to work out what can change in that model. The only things
that make up the neuron are the inputs, the weights, and the threshold (and there is only
one threshold for each neuron, but lots of inputs). The inputs can’t change, since they are
external, so we can only change the weights and the threshold, which is interesting since it
tells us that most of the learning is in the weights, which aren’t part of the neuron at all;
they are the model of the synapse! Getting excited about neurons turns out to be missing
something important, which is that the learning happens between the neurons, in the way
that they are connected together.
So in order to make a neuron learn, the question that we need to ask is:
How should we change the weights and thresholds of the neurons so that the network gets
the right answer more often?
Now that we know the right question to ask we’ll have a look at our very first neural
network, the space-age-sounding Perceptron, and see how we can use it to solve the problem
(it really was space-age, too: created in 1958). Once we’ve worked out the algorithm and
how it works, we’ll look at what it can and cannot do, and then see how statistics can give
us insights into learning as well.

3.3 THE PERCEPTRON


The Perceptron is nothing more than a collection of McCulloch and Pitts neurons together
with a set of inputs and some weights to fasten the inputs to the neurons. The network
is shown in Figure 3.2. On the left of the figure, shaded in light grey, are the input nodes.
These are not neurons, they are just a nice schematic way of showing how values are fed
