INTRODUCTION
1.1
INTRODUCTION
Borrowing from biology, researchers are exploring neural networks, a new, non-algorithmic approach to information processing. A Neural
Network is a powerful data-modeling tool that is
able to capture and represent complex input/output relationships. The
motivation for the development of neural network
technology stemmed from the desire to develop an artificial system that could
perform "intelligent" tasks similar to those performed
by the human brain. Neural networks
resemble the human brain in the following two ways:
Ø A neural network
acquires knowledge through learning.
Ø A neural
network's knowledge is stored within inter-neuron connection strengths known as
synaptic weights.
1.2 ADVANTAGES OF NEURAL NETWORKS
Ø Adaptive learning:
An ability to learn how to do tasks based on the data given for training or
initial experience.
Ø Self-Organization:
An ANN can create its own organization or representation of the information it
receives during learning time.
Ø Real Time Operation:
ANN computations may be carried out in parallel, and special hardware devices
are being designed and manufactured which take advantage of this capability.
Ø Fault Tolerance via Redundant
Information Coding: Partial destruction of a network leads to
the corresponding degradation of performance. However, some network
capabilities may be retained even with major network damage.
1.3 LEARNING IN NEURAL NETWORKS
Learning
is a process by which the free parameters of a neural network are adapted through
a process of stimulation by the environment in which the network is embedded.
The
type of learning is determined by the manner in which the parameter changes
take place. All learning methods used for neural networks can be classified
into two major categories:
Ø SUPERVISED
LEARNING which incorporates an external teacher, so
that each output unit is told what its desired response to input signals ought
to be. During the learning process global information may be required.
Paradigms of supervised learning include error-correction learning (back
propagation algorithm), reinforcement learning and stochastic learning.
Ø UNSUPERVISED
LEARNING uses no external teacher and is based upon
only local information. It is also referred to as self-organization, in the
sense that it self-organizes data presented to the network and detects their
emergent collective properties. Paradigms of unsupervised learning are Hebbian learning and competitive learning; a small sketch contrasting the two kinds of update rule follows below.
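As a concrete contrast between the two categories, here is a minimal sketch in Python/NumPy for a single linear unit; the input pattern, weights, target and learning rate are illustrative assumptions, not values taken from this thesis.

```python
import numpy as np

# Illustrative single linear unit: y = w . x
x = np.array([0.5, -1.0, 0.25])    # input pattern (assumed values)
w = np.array([0.1, 0.4, -0.2])     # synaptic weights (assumed values)
eta = 0.05                         # learning rate (assumed value)

y = np.dot(w, x)                   # output of the unit

# Supervised (error-correction / delta rule): needs a teacher-supplied target d
d = 1.0
w_supervised = w + eta * (d - y) * x

# Unsupervised (Hebbian): strengthens weights when input and output are co-active,
# using only information local to the unit (no external teacher)
w_hebbian = w + eta * y * x

print(w_supervised)
print(w_hebbian)
```

The point of the contrast is that the first update cannot be computed without the desired response d, while the second uses nothing but the unit's own input and output.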
1.4
OVERVIEW OF BACK PROPAGATION ALGORITHM
Minsky and Papert (1969) showed that there are many simple problems, such as the exclusive-or (XOR) problem, which linear neural networks cannot solve. Note that the term "solve" here means learn the desired associative links. The argument was that if such networks cannot solve such simple problems, how could they solve complex problems in vision, language, and motor control? Solutions to this problem were as follows (a small XOR illustration is sketched after the list):
·
Select an appropriate "recoding" scheme which transforms the inputs.
·
Perceptron Learning Rule -- Requires that you correctly "guess" an acceptable input-to-hidden-unit mapping.
·
Back-propagation
learning rule -- Learn both sets of weights simultaneously.
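To make the "recoding" idea concrete, the sketch below uses hand-picked weights and thresholds (an assumption for illustration, not a learned solution) to show that once a hidden layer recodes the two inputs into OR and AND features, a single threshold unit on those features reproduces the exclusive-or truth table, something no single linear threshold unit on the raw inputs can do.

```python
import numpy as np

def step(z):
    # Simple threshold (step) activation: 1 if z >= 0, else 0
    return (z >= 0).astype(int)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])   # all input patterns
xor_target = np.array([0, 1, 1, 0])              # desired exclusive-or outputs

# Hidden layer "recodes" the inputs (weights and thresholds chosen by hand):
h1 = step(X @ np.array([1, 1]) - 0.5)   # OR feature: fires if at least one input is 1
h2 = step(X @ np.array([1, 1]) - 1.5)   # AND feature: fires only if both inputs are 1

# Output unit operating on the recoded features: XOR = OR and not AND
y = step(1 * h1 - 2 * h2 - 0.5)

print(y)                                  # [0 1 1 0]
print(np.array_equal(y, xor_target))      # True
```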
Back
propagation is a form of supervised learning for multi-layer nets, also known
as the generalized delta rule. Error data at the output layer is "back
propagated" to earlier ones, allowing incoming weights to these layers to
be updated. It is most often used as training algorithm in current neural
network applications. The back propagation algorithm was developed by Paul
Werbos in 1974 and rediscovered independently by Rumelhart and Parker. Since
its rediscovery, the back propagation algorithm has been widely used as a learning
algorithm in feed forward multilayer neural networks.
What makes this algorithm different from the others is the process by which the weights are calculated during training of the network. In general, the difficulty with multilayer Perceptrons is calculating the weights of the hidden layers in an efficient way that results in the least (or zero) output error; the more hidden layers there are, the more difficult it becomes. To update the weights, one must calculate an error. At the output layer this error is easily measured; it is the difference between the actual and desired (target) outputs. At the hidden layers, however, there is no direct observation of the error; hence, some other technique must be used to calculate an error at the hidden layers that will lead to minimization of the output error, since this is the ultimate goal.
The
back propagation algorithm is an involved mathematical tool; however, execution
of the training equations is based on iterative processes, and thus is easily
implementable on a computer.
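As an illustration of that iterative process, here is a minimal sketch of back propagation for a small feed forward network trained on the exclusive-or problem; the layer sizes (2-3-1), sigmoid activation, learning rate, iteration count and random initialization are all illustrative assumptions rather than prescriptions from this thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # input patterns
T = np.array([[0], [1], [1], [0]], dtype=float)              # desired (target) outputs

W1 = rng.normal(scale=0.5, size=(2, 3))   # input -> hidden weights (assumed init)
b1 = np.zeros(3)
W2 = rng.normal(scale=0.5, size=(3, 1))   # hidden -> output weights (assumed init)
b2 = np.zeros(1)
eta = 0.5                                  # learning rate (assumed)

for epoch in range(10000):
    # Forward pass
    H = sigmoid(X @ W1 + b1)               # hidden-layer activations
    Y = sigmoid(H @ W2 + b2)               # actual outputs

    # Output-layer error: difference between actual and desired outputs,
    # scaled by the derivative of the sigmoid
    delta_out = (Y - T) * Y * (1 - Y)

    # Hidden-layer error: the output error "back propagated" through W2
    delta_hidden = (delta_out @ W2.T) * H * (1 - H)

    # Weight updates (gradient descent on the squared output error)
    W2 -= eta * H.T @ delta_out
    b2 -= eta * delta_out.sum(axis=0)
    W1 -= eta * X.T @ delta_hidden
    b1 -= eta * delta_hidden.sum(axis=0)

print(np.round(Y, 2))   # outputs should approach the targets [0, 1, 1, 0]
```

Each iteration measures the output error directly, propagates it backwards through the hidden-to-output weights to obtain a hidden-layer error, and then adjusts both sets of weights at once, which is exactly the point that distinguishes back propagation from the earlier single-layer rules.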
1.5 USE OF A BACK PROPAGATION NEURAL NETWORK SOLUTION
A back propagation neural network solution is appropriate when:
Ø A large amount of input/output data is available, but you're not sure how to relate the inputs to the outputs.
Ø The
problem appears to have overwhelming complexity, but there is clearly a solution.
Ø It
is easy to create a number of examples of the correct behavior.
Ø The
solution to the problem may change over time, within the bounds of the given input
and output parameters (i.e., today 2+2=4, but in the future we may find that 2+2=3.8).
Ø Outputs
can be "fuzzy", or non-numeric.
One
of the most common applications of NNs is in image processing. Some examples would
be: identifying hand-written characters; matching a photograph of a person's
face with a different photo in a database; performing data compression on an
image with minimal loss of content. Other applications could be voice
recognition; RADAR signature analysis; stock market prediction. All of these
problems involve large amounts of data, and complex relationships between the
different parameters.
It
is important to remember that with a NN solution, you do not have to understand
the solution at all. This is a major advantage of NN approaches. With more
traditional techniques, you must understand the inputs, and the algorithms, and
the outputs in great detail, to have any hope of implementing something that
works. With a NN, you simply show it: "this is the correct output, given
this input". With an adequate amount of training, the network will mimic
the function that you are demonstrating. Further, with a NN, it is OK to apply some inputs that turn out to be irrelevant to the solution; during the training process, the network will learn to ignore any inputs that don't contribute to the output. Conversely, if you leave out some critical inputs, then you will find out, because the network will fail to converge on a solution.
1.6 OBJECTIVE OF THESIS
The objectives of this thesis are:
·
Exploration of a supervised learning algorithm for artificial neural networks, i.e., the Error Back Propagation learning algorithm for a layered feed forward network.
·
Formulation of individual modules of the
Back Propagation algorithm for efficient implementation in hardware.
·
Analysis of the simulation results of Back
Propagation algorithm.
INTRODUCTION
TO NEURAL NETWORKS
2.1
INTRODUCTION
Borrowing from biology, researchers are exploring neural networks, a new, non-algorithmic approach to information processing.
A
neural network is a powerful data-modeling
tool that is able to capture and represent complex input/output relationships.
The motivation for the development of neural network technology stemmed from
the desire to develop an artificial system that could perform
"intelligent" tasks similar to those performed by the human brain.
Neural networks resemble the human brain in the following two ways:
Ø A neural network
acquires knowledge through learning.
Ø A neural
network's knowledge is stored within inter-neuron connection strengths known as
synaptic weights.
Artificial
Neural Networks are being counted as the wave of the future in computing.
They
are indeed self-learning mechanisms which don't require the traditional skills
of a programmer. But unfortunately, misconceptions have arisen. Writers have
hyped that these neuron-inspired processors can do almost anything. These
exaggerations have created disappointments for some potential users who have
tried, and failed, to solve their problems with neural networks. These
application builders have often come to the conclusion that neural nets are
complicated and confusing. Unfortunately, that confusion has come from the
industry itself. An avalanche of articles has appeared touting a large assortment
of different neural networks, all with unique claims and specific examples.
Currently,
only a few of these neuron-based structures, paradigms actually, are being used
commercially. One particular structure, the feed forward, back-propagation network, is by far the most popular. Most of the other neural network structures represent models for "thinking" that are still being evolved in the laboratories. Yet, all of these networks are simply tools, and as such the only real demand they make is that the network architect learn how to use them.
The
power and usefulness of artificial neural networks have been demonstrated in
several applications including speech synthesis, diagnostic problems, medicine,
business and finance, robotic control, signal processing, computer vision and
many other problems that fall under the category of pattern recognition. For
some application areas, neural models show promise in achieving human-like
performance over more traditional artificial intelligence techniques.
2.2
HISTORY OF NEURAL NETWORKS
The
study of the human brain is thousands of years old. With the advent of modern
electronics, it was only natural to try to harness this thinking process.
The history of neural networks can be divided into several periods:
Ø First
Attempts: There were some initial simulations using
formal logic. McCulloch and Pitts (1943) developed models of neural networks
based on their understanding of neurology. These models made several assumptions
about how neurons worked. Their networks were based on simple neurons which
were considered to be binary devices with fixed thresholds. The results of
their model were simple logic functions such as "a or b" and "a and b" (a tiny sketch of such a threshold unit is given after this history). Another attempt used computer simulations, carried out by two groups (Farley and Clark, 1954; Rochester, Holland, Haibit and Duda, 1956). The first group (IBM researchers) maintained close contact with neuroscientists at McGill University, so whenever their models did not work, they consulted the neuroscientists. This interaction established a multidisciplinary trend which continues to the present day.
Ø Promising
& Emerging Technology: Not only was neuroscience influential in
the development of neural networks, but psychologists and engineers also contributed
to the progress of neural network simulations. Rosenblatt (1958) stirred
considerable interest and activity in the field when he designed and developed
the Perceptron. The Perceptron had three layers with the
middle layer known as the association layer. This system could learn to connect
or associate a given input to a random output unit. Another system was the
ADALINE (Adaptive Linear Element)
which was developed in 1960 by Widrow and Hoff (of Stanford University). The
ADALINE was an analogue electronic device made from simple components. The
method used for learning was different from that of the Perceptron; it employed the Least-Mean-Squares (LMS) learning rule.
Ø Period of
Frustration & Disrepute: In 1969 Minsky and Papert
wrote a book in which they generalized the limitations of single layer
Perceptrons to multilayered systems. In the book they said: "...our
intuitive judgment that the extension (to multilayer systems) is sterile".
The significant result of their book was to eliminate funding for research with
neural network simulations. The conclusions supported the disenchantment of
researchers in the field. As a result, considerable prejudice against this
field was activated.
Ø Innovation:
Although public interest and available funding were minimal, several researchers
continued working to develop neuromorphically based computational methods for
problems such as pattern recognition. During this period several paradigms were
generated which modern work continues to enhance. Grossberg's (Steve Grossberg
and Gail Carpenter in 1988) influence founded a school of thought which
explores resonating algorithms. They developed the ART (Adaptive Resonance
Theory) networks based on biologically plausible models. Anderson and Kohonen
developed associative techniques independent of each other. Klopf (A. Henry
Klopf) in 1972 developed a basis for learning in artificial neurons based on a
biological principle for neuronal learning called heterostasis.
Werbos (Paul Werbos, 1974) developed and used the back-propagation learning method; however, several years passed before this
approach was popularized. Backpropagation nets are probably the most well known
and widely applied of the neural networks today. In essence, the
back-propagation net is a Perceptron with multiple layers, a different
threshold function in the artificial neuron, and a more robust and capable
learning rule. Amari (A. Shun-Ichi 1967) was involved with theoretical developments:
he published a paper which established a mathematical theory for a learning
basis (error-correction method) dealing with adaptive pattern classification. Fukushima (F. Kunihiko) developed a stepwise trained multilayered neural network
for interpretation of handwritten characters. The original network was published
in 1975 and was called the Cognitron.
Ø Re-Emergence:
Progress during the late 1970s and early 1980s was important to the re-emergence of interest in the neural network field. Several factors influenced this movement.
For example, comprehensive books and conferences provided a forum for people in
diverse fields with specialized technical languages, and the response to conferences
and publications was quite positive. The news media picked up on the increased
activity and tutorials helped disseminate the technology. Academic programs
appeared and courses were introduced at most major Universities (in US and
Europe). Attention is now focused on funding levels throughout Europe, Japan and the US, and as this funding becomes available, several new commercial applications in industry and financial institutions are emerging.
Ø Today:
Significant progress has been made in the field of neural networks, enough to attract a great deal of attention and fund further research. Advancement beyond current commercial applications appears to be possible, and research is advancing the field on many fronts. Neurally based chips are emerging, and applications to complex problems are being developed. Clearly, today is a period of transition for neural network technology.
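As referred to in the "First Attempts" period above, the following is a tiny, purely illustrative sketch of a McCulloch-Pitts style binary neuron: inputs and output are 0/1 and the unit fires when the sum of its inputs reaches a fixed threshold, which is enough to realize simple logic functions such as "a or b" and "a and b".

```python
# McCulloch-Pitts style binary threshold unit (thresholds chosen for illustration)
def mcculloch_pitts(inputs, threshold):
    return 1 if sum(inputs) >= threshold else 0

for a in (0, 1):
    for b in (0, 1):
        a_or_b = mcculloch_pitts([a, b], threshold=1)   # fires if either input is active
        a_and_b = mcculloch_pitts([a, b], threshold=2)  # fires only if both inputs are active
        print(a, b, a_or_b, a_and_b)
```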
2.3
ADVANTAGES OF NEURAL NETWORKS
Neural networks, with their remarkable ability to derive meaning from complicated or imprecise data, can be used to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques. A trained neural network can be thought of as an "expert" in the category of information it has been given to analyze.
Advantages
include:
Ø Adaptive
learning:
An ability to learn how to do tasks based on the data given for training or
initial experience.
Ø Self-Organization:
An ANN can create its own organization or representation of the information it
receives during learning time.
Ø Real Time Operation:
ANN computations may be carried out in parallel, and special hardware devices
are being designed and manufactured which take advantage of this capability.
Ø Fault Tolerance via Redundant
Information Coding: Partial destruction of a network
leads to the corresponding degradation of performance. However, some network
capabilities may be retained even with major network damage.
2.4
NEURAL NETWORKS VERSUS CONVENTIONAL
COMPUTERS
Neural
networks take a different approach to problem solving than that of conventional
computers.
Ø Conventional computers
use an algorithmic approach i.e. the computer follows a set of instructions
in order to solve a problem. Unless the specific steps that the computer needs to follow are known, the computer cannot solve the problem. That restricts the problem
solving capability of conventional computers to problems that we already understand
and know how to solve. But computers would be so much more useful if they could
do things that we don't exactly know how to do.
Ø Neural networks, on the other hand, process information in a way similar to the human brain. The network is composed of a large number of highly
interconnected processing elements (neurons) working in parallel to solve a
specific problem. Neural networks learn by example. They cannot be programmed
to perform a specific task.
Ø The
disadvantage of neural
networks is that because
the network finds out how to solve the problem by itself, its operation can be
unpredictable.
On the other hand, conventional computers use a cognitive approach to problem solving; the way the problem is to be solved must be known and stated in small, unambiguous instructions. These instructions are then converted to a high-level language program and then into machine code that the computer can understand. These machines are totally predictable; if anything goes wrong, it is due to a software or hardware fault.
Neural
networks and conventional algorithmic computers are not in competition but complement
each other. There are tasks that are more suited to an algorithmic approach, like arithmetic operations, and tasks that are more suited to neural networks. Moreover, a large
number of tasks require systems that use a combination of the two approaches (normally
a conventional computer is used to supervise the neural network) in order to perform
at maximum efficiency.
2.5
HUMAN AND ARTIFICIAL NEURONS - INVESTIGATING THE
SIMILARITIES
2.5.1
LEARNING PROCESS IN HUMAN BRAIN
Much
is still unknown about how the brain trains itself to process information, so theories
abound. In the human brain, a typical neuron collects signals from others
through a host of fine structures called dendrites.
Fig-
2.1: Components of a Neuron
The neuron
sends out spikes of electrical activity through a long, thin strand known as an axon,
which splits into thousands of branches. At the end of each branch, a structure
called a synapse converts the
activity from the axon into electrical effects that inhibit or excite the
activity in the connected neurons. When a neuron receives excitatory input that
is sufficiently large compared with its inhibitory input, it sends a spike of
electrical activity down its axon. Learning occurs by changing the
effectiveness of the synapses so that the influence of one neuron on another
changes.
Fig- 2.2: The
Synapse
2.5.2
HUMAN NEURONS TO ARTIFICIAL NEURONS
We construct these neural networks by first trying to deduce the essential features of neurons and their interconnections. We then typically program a computer to simulate these features. However, because our knowledge of neurons is incomplete and our computing power is limited, our models are necessarily gross idealizations of real networks of neurons.
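A minimal sketch of such an idealization is given below; the weights, bias and input values are illustrative assumptions. The artificial neuron weights each incoming signal by a synaptic strength (positive for excitatory, negative for inhibitory), sums the results, and passes the sum through a smooth activation function in place of the all-or-nothing spike of the biological neuron.

```python
import numpy as np

def artificial_neuron(inputs, weights, bias):
    activation = np.dot(weights, inputs) + bias     # weighted sum of incoming signals
    return 1.0 / (1.0 + np.exp(-activation))        # smooth "firing rate" between 0 and 1

inputs = np.array([0.9, 0.1, 0.4])      # signals arriving from other neurons (assumed)
weights = np.array([0.8, -0.5, 0.3])    # synaptic strengths, excitatory and inhibitory (assumed)
print(artificial_neuron(inputs, weights, bias=-0.2))
```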