Artificial Neural Network - Theory (Part I)

#ann #perceptron #backpropagation

Harshita Pandey Jan 07 2022 · 3 min read

An Artificial Neural Network (ANN) is a computational model inspired by the human nervous system. The idea behind an ANN is to have machines artificially mimic biological neural intelligence. Just as neurons (nerve cells) are the fundamental units of the human nervous system, artificial neurons are the elementary units of an ANN.

Biological Neuron

Neurons (or nerve cells) are the primary components of the nervous system that process and transmit signals in the body. A neuron is composed of three main parts: the dendrites, the soma (cell body), and the axon. Signals are received through the dendrites, travel to the cell body, and continue down the axon until they reach the synapse. At the synapse, a neuron passes either an electrical or a chemical signal to another neuron or to a target cell.

Figure 1. Biological Neuron and Structure of a chemical synapse.
  • Neurons are specialized cells that receive and transmit electrical signals to other nerve cells, muscle cells, or gland cells.
  • The soma, or cell body, of a neuron contains the nucleus, the soma's key feature, as most protein synthesis occurs there.
  • Dendrites are branched extensions of a nerve cell that receive signals from other neurons and then carry the signals to the cell body or soma.
  • The axon carries an electrical signal from the cell body to the synapse.
  • The axon terminals are found at the end of the branches of an axon farthest from the soma.
  • The junction, or point of communication, between two neurons or between a neuron and a target cell, such as a muscle or a gland, is called a synapse.
The Perceptron

Single Computational Layer Neural Network: The simplest kind of neural network is the single-layer perceptron, which comprises a single input layer and an output node. In this network, a set of inputs is mapped directly to an output.

Figure 2. Architecture of a single-layer perceptron.

The architecture of the perceptron is shown in Figure 2: a single input layer transmits the features to the output node. The edges from the input to the output carry the weights w1 . . . w5 by which the features are multiplied; an additional bias can be added at the output node. An activation function is then applied to convert the aggregated value into a class label. However, the single-layer perceptron is only a linear classifier and cannot capture non-linear regularities.
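
To make this concrete, here is a minimal Python sketch of a perceptron's decision rule. The feature values, weights, and bias below are made-up examples, not taken from the article:

```python
import numpy as np

def perceptron_predict(x, w, b):
    """Weighted sum of inputs plus bias, passed through a step activation."""
    z = np.dot(w, x) + b          # aggregate: w1*x1 + ... + w5*x5 + bias
    return 1 if z >= 0 else 0     # step activation converts z into a class label

# Five input features, matching the w1 . . . w5 edges in Figure 2 (values are made up).
x = np.array([0.5, -1.2, 3.0, 0.7, -0.3])
w = np.array([0.4,  0.6, -0.1, 0.9,  0.2])
b = -0.5

print(perceptron_predict(x, w, b))  # -> 0 or 1
```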

Multi-layer Neural Network: Multilayer perceptron networks overcome many of the limitations of single-layer perceptrons and can be trained using the backpropagation algorithm. A multilayer neural network consists of multiple layers of artificial neurons, usually interconnected in a feed-forward way: each neuron in one layer is connected to the neurons of the subsequent layer.

Simple Feed Forward Neural Network: In a feed-forward neural network, information moves in only one direction, flowing from the input layer to the output layer through the hidden layer(s). A feed-forward neural network has no feedback connections.
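
To make the layer-to-layer connectivity concrete, the following sketch builds the weight matrices for a hypothetical feed-forward network with layer sizes 3, 4, and 2; the sizes and random initialization are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer sizes: 3 inputs -> 4 hidden neurons -> 2 outputs.
layer_sizes = [3, 4, 2]

# One weight matrix and bias vector per pair of consecutive layers.
# W has shape (fan_out, fan_in): every neuron in one layer connects
# to every neuron in the next layer.
weights = [rng.normal(scale=0.1, size=(m, n))
           for n, m in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(m) for m in layer_sizes[1:]]

for W, b in zip(weights, biases):
    print(W.shape, b.shape)   # (4, 3) (4,) then (2, 4) (2,)
```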

Artificial Neural Network with Backpropagation

The objective of backpropagation is to train a multi-layered feed-forward neural network so that it can learn the internal representations needed to model an arbitrary mapping of inputs to outputs. The learning process in the neurons is simply the modification, or update, of weights and biases during training with backpropagation.

Figure 3. Artificial Neural Network with Backpropagation.
  • The input layer simply receives and transmits the information; no computation is performed in that layer. Hidden layers are the intermediate layers between the input and output layers. The output layer generates the outcome for the given inputs; the number of output neurons can be one or more.
  • Weights and biases are the learnable parameters. The weights play an important role in the propagation of signals through the network: a weight decides how much influence an input will have on the output. The bias helps control the value at which the activation function triggers; it allows the activation function to be shifted to the left or right to better fit the data.
  • The activation function normalizes the data and introduces non-linearity into the output of a neuron.
  • The loss function describes how well the model performs with respect to the expected outcome.
  • The purpose of the optimizer is to reduce the loss function (ideally reaching the global minimum) by updating the weights and biases during backpropagation. A single-neuron sketch after this list ties these pieces together.
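
The sketch below combines a weight and bias, a sigmoid activation, a squared-error loss, and a plain gradient-descent optimizer step. All values, and the choices of sigmoid and squared error, are illustrative assumptions, not the article's code:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # squashes z into (0, 1), adds non-linearity

x, target = 1.5, 1.0                   # hypothetical input and expected outcome
w, b = 0.2, 0.0                        # learnable parameters
lr = 0.5                               # learning rate used by the optimizer

for step in range(3):
    z = w * x + b                      # bias shifts where the activation triggers
    y = sigmoid(z)                     # neuron output
    loss = (y - target) ** 2           # squared-error loss vs. expected outcome

    # Gradients by the chain rule: dL/dw = dL/dy * dy/dz * dz/dw
    dy = 2 * (y - target)
    dz = dy * y * (1 - y)              # sigmoid'(z) = y * (1 - y)
    w -= lr * dz * x                   # optimizer reduces the loss ...
    b -= lr * dz                       # ... by updating weight and bias
    print(f"step {step}: loss = {loss:.4f}")
```
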
Training a Neural Network with Backpropagation

The backpropagation algorithm has two main phases: a forward phase and a backward phase.

Forward Propagation: In this phase, the neurons at the input layer receive signals and, without performing any computation, simply transmit the information to the hidden layer. The net input to a neuron in the hidden layer is calculated as the sum of the outputs of the input layer multiplied by the corresponding weights (the weights are initialized as small random numbers), and an additional bias can be incorporated. An activation function is then applied to compute the output of each neuron in the hidden layer. The forward-propagation phase continues as the activation calculations propagate through the hidden layer(s) toward the output layer: in each successive layer, every neuron sums its inputs and then applies a transfer function to compute its output.

The output layer then produces the final response, i.e., the estimated target value. This predicted output is compared to the expected value, and the derivative of the loss function with respect to the output is computed. The derivative of this loss must then be computed with respect to the weights and biases in all layers during the backward-propagation phase.
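
Here is a minimal numpy sketch of the forward phase, assuming one hidden layer, sigmoid activations, and a squared-error loss; the layer sizes and random initialization are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical shapes: 3 input features, 4 hidden neurons, 1 output neuron.
x = rng.normal(size=3)                       # signals received at the input layer
W1 = rng.normal(scale=0.1, size=(4, 3))      # small random initial weights
b1 = np.zeros(4)
W2 = rng.normal(scale=0.1, size=(1, 4))
b2 = np.zeros(1)

# Hidden layer: net input = weighted sum of inputs + bias, then activation.
z1 = W1 @ x + b1
h = sigmoid(z1)

# Output layer repeats the same computation on the hidden activations.
z2 = W2 @ h + b2
y_hat = sigmoid(z2)                          # estimated target value

target = np.array([1.0])                     # hypothetical expected instance
loss = 0.5 * np.sum((y_hat - target) ** 2)   # squared-error loss
print(y_hat, loss)
```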

Backward Propagation: The main goal of the backward phase is to compute the gradient of the loss function with respect to the different weights by using the chain rule of differential calculus. These gradients are used to update the weights. Since the gradients are computed in the backward direction, starting from the output node, this process is referred to as the backward phase.
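
Continuing the forward-phase sketch above, here is a hedged illustration of the backward phase: the chain rule is applied layer by layer, starting from the output node, and the resulting gradients update the weights and biases:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Same hypothetical network as in the forward-phase sketch.
x = rng.normal(size=3)
W1 = rng.normal(scale=0.1, size=(4, 3)); b1 = np.zeros(4)
W2 = rng.normal(scale=0.1, size=(1, 4)); b2 = np.zeros(1)
target = np.array([1.0])
lr = 0.1

# Forward phase (as before).
z1 = W1 @ x + b1; h = sigmoid(z1)
z2 = W2 @ h + b2; y_hat = sigmoid(z2)

# Backward phase: chain rule, starting from the output node.
dL_dyhat = y_hat - target                    # derivative of 0.5*(y_hat - t)^2
dL_dz2 = dL_dyhat * y_hat * (1 - y_hat)      # through the output activation
dL_dW2 = np.outer(dL_dz2, h)                 # gradient w.r.t. output weights
dL_db2 = dL_dz2

dL_dh = W2.T @ dL_dz2                        # propagate the error back to the hidden layer
dL_dz1 = dL_dh * h * (1 - h)                 # through the hidden activation
dL_dW1 = np.outer(dL_dz1, x)
dL_db1 = dL_dz1

# Gradient-descent update of all weights and biases.
W2 -= lr * dL_dW2; b2 -= lr * dL_db2
W1 -= lr * dL_dW1; b1 -= lr * dL_db1
```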

References

  • Neural Networks and Deep Learning by Charu C. Aggarwal
  • Artificial Neural Networks: Multilayer Perceptron for Ecological Modeling
  • https://en.wikipedia.org/wiki/Neuron
  • https://www.khanacademy.org/science/biology/human-biology/neuron-nervous-system/a/the-synapse
  • https://www.brainfacts.org/brain-anatomy-and-function/anatomy/2012/the-neuron