Figure 2.2 Propagation of signal through neurons.
ANN, a domain of artificial intelligence, mimics the biological neural networks of the nervous system discussed above. The connections between neurons in an ANN are modeled computationally and mathematically in much the same way as the connections between biological neurons.
In the following section, we highlight the basic network topology, the different types of ANN models, and the learning rules.
2.3 Artificial Neural Networks
An ANN can be defined as a mathematical and computational tool for nonlinear statistical data modeling, inspired by the structure and function of the biological nervous system. An ANN is built from a large number of densely interconnected processing units, termed neurons.
In general, an ANN receives a set of inputs, computes their weighted sum, and passes the result to a nonlinear function that generates the output. Like a human being, an ANN learns by example, so ANN models must be trained appropriately to generate the output efficiently. In the biological nervous system, learning involves adaptations in the synaptic connections between neurons, and this idea shapes the learning procedure of ANNs: the system parameters of an ANN are adjusted according to the input/output (I/O) patterns. Through this learning process, ANNs can be applied in domains such as data classification and pattern recognition.
Researchers have been working on ANNs for several decades. The field was established even before the advent of computers. The artificial neuron [1] was first introduced in 1943 by Warren McCulloch, a neurophysiologist, and Walter Pitts, a logician.
2.3.1 McCulloch-Pitts Neural Model
The model proposed by McCulloch and Pitts is known as the linear threshold gate [1]. The artificial neuron takes a set of inputs I1, I2, I3, …, IN ∈ {0, 1} and produces one output, y ∈ {0, 1}. The inputs are of two types: dependent inputs, termed excitatory inputs, and independent inputs, termed inhibitory inputs. Mathematically, the function can be expressed by the following equations:
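$$S = \sum_{i=1}^{N} I_i W_i \tag{2.1}$$
$$y = f(S) = \begin{cases} 1, & S \geq \theta \\ 0, & S < \theta \end{cases} \tag{2.2}$$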
where
W1, W2, W3, …, WN ≡ weight values associated with the corresponding inputs, normalized to the range (0, 1) or (−1, 1);
S ≡ weighted sum;
θ ≡ threshold constant.
The function f is called the linear step function and is shown in Figure 2.3.
The schematic diagram of the linear threshold gate is given in Figure 2.4.
This initial two-state ANN model is simple but has immense computational power. Its disadvantage is a lack of flexibility, because the weights and threshold values are fixed. The McCulloch-Pitts neuron model was later improved with more flexible features to extend its application domain.
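The linear threshold gate described above can be sketched in a few lines of Python. This is a minimal illustration rather than the original formulation: the function names, the weights of 0.5, the threshold values, and the convention that the neuron fires when S ≥ θ are all assumptions chosen for the example.

```python
from typing import Sequence


def linear_threshold_gate(inputs: Sequence[int],
                          weights: Sequence[float],
                          theta: float) -> int:
    """Return 1 if the weighted sum of the binary inputs reaches theta, else 0."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    s = sum(i * w for i, w in zip(inputs, weights))  # weighted sum S
    return 1 if s >= theta else 0                    # step function f(S)


# Hypothetical fixed settings: the weights are identical for both gates,
# and only the threshold decides which logic function is realized.
def and_gate(a: int, b: int) -> int:
    return linear_threshold_gate([a, b], [0.5, 0.5], theta=1.0)


def or_gate(a: int, b: int) -> int:
    return linear_threshold_gate([a, b], [0.5, 0.5], theta=0.5)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_gate(a, b), or_gate(a, b))
```

Because the weights and threshold are hard-coded, each gate computes exactly one function. Changing the behavior means editing the constants by hand, which is the lack of flexibility noted above.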
2.3.2 The Perceptron
The McCulloch-Pitts neuron model was enhanced by Frank Rosenblatt in 1957, when he proposed the concept of the perceptron [2] to solve linear classification problems. The perceptron is a supervised learning algorithm for binary classifiers. This binary single-neuron model merges the concept of the McCulloch-Pitts model [1] with the Hebbian learning rule for adjusting weights [3]. In the perceptron, an extra constant, termed the bias, is added. The bias is independent of any input value and shifts the decision boundary away from the origin. To define the perceptron, Equation (2.1) is modified as follows:
$$S = \sum_{i=1}^{N} I_i W_i + b \tag{2.3}$$
where
b ≡ bias value.
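The effect of the bias on the decision boundary can be illustrated with a short Python sketch. It assumes a step activation that fires when the weighted sum plus bias is at least zero. The names `perceptron_output`, `classify_all`, `WEIGHTS`, and `PATTERNS`, as well as the numeric values, are hypothetical and chosen only for this example. No learning is performed here; the weights are held fixed so that only the bias changes.

```python
from typing import List, Sequence


def perceptron_output(inputs: Sequence[float],
                      weights: Sequence[float],
                      bias: float) -> int:
    """Weighted sum plus bias, followed by a step at zero (assumed convention)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    s = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if s >= 0 else 0


WEIGHTS = [0.5, 0.5]  # hypothetical values, fixed so that only the bias varies
PATTERNS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def classify_all(bias: float) -> List[int]:
    """Classify the four binary input patterns with the given bias."""
    return [perceptron_output(p, WEIGHTS, bias) for p in PATTERNS]


print(classify_all(0.0))    # [1, 1, 1, 1]: boundary passes through the origin
print(classify_all(-0.25))  # [0, 1, 1, 1]: boundary shifted, behaves like OR
print(classify_all(-0.75))  # [0, 0, 0, 1]: boundary shifted further, behaves like AND
```

With a bias of zero the boundary passes through the origin, so the pattern (0, 0) lies on it and cannot be separated from the others. A negative bias moves the boundary away from the origin along the direction of the weight vector.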
2.3.3 ANN With Continuous Characteristics
This model is also an extension of the McCulloch-Pitts neuron model. An ANN with continuous characteristics is described in two stages. The schematic diagram of the model is presented in Figure 2.5. In the first stage, the linear combination of the input values is calculated. The weight associated with each element of the input array lies between 0 and 1. The summation function, denoted σ, is expressed in Equation (2.4):
$$\sigma = \sum_{i=1}^{N} I_i W_i + T \tag{2.4}$$
where
T ≡ extra input value, associated with a weight of 1, that represents the threshold or bias of the neuron.
Figure 2.3 Linear threshold function.
Figure 2.4 Schematic diagram of linear threshold gate.
The second stage of the model is the activation function, which takes the sum-of-products value as input and produces the output. The behavior of this stage determines the characteristics of the ANN model. The function compresses the amplitude of the output so that it lies in the range [0, 1] or [−1, 1]. This compression mimics the signal produced by a biological neuron in the form of continuous action-potential spikes.
The function used in the model discussed above is semi-linear and is termed the logistic sigmoid function. It is plotted in Figure 2.6 and defined mathematically in Equation (2.5).
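The two stages can be sketched together in Python. The example assumes the standard form of the logistic sigmoid, 1 / (1 + e^(−σ)), which compresses any real value into the range [0, 1]. The function names and sample values are hypothetical, and the branch on the sign of σ is an implementation detail added here only to avoid floating-point overflow for large negative sums.

```python
import math
from typing import Sequence


def weighted_sum(inputs: Sequence[float],
                 weights: Sequence[float],
                 t: float) -> float:
    """Stage 1: linear combination of the inputs plus the extra input T (weight 1)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(x * w for x, w in zip(inputs, weights)) + t


def logistic_sigmoid(sigma: float) -> float:
    """Stage 2: standard logistic sigmoid, written to avoid overflow in exp()."""
    if sigma >= 0:
        return 1.0 / (1.0 + math.exp(-sigma))
    e = math.exp(sigma)  # safe: sigma is negative, so exp() cannot overflow
    return e / (1.0 + e)


def neuron_output(inputs: Sequence[float],
                  weights: Sequence[float],
                  t: float) -> float:
    """Compose both stages: continuous output in the range [0, 1]."""
    return logistic_sigmoid(weighted_sum(inputs, weights, t))


# Hypothetical inputs and weights; the weights lie between 0 and 1.
for t in (-2.0, 0.0, 2.0):
    print(t, neuron_output([1.0, 0.0, 1.0], [0.2, 0.7, 0.4], t))
```

Unlike the step function of the linear threshold gate, the output here varies smoothly with the weighted sum, so a small change in T or in a weight produces a small change in the output.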
Figure 2.5 ANN model with continuous characteristics.