Neural networks are machine learning models inspired by the structure of the biological brain. They consist of layers of interconnected neurons (nodes) that learn to transform inputs into outputs by adjusting their internal weights during training. Neural networks are the foundation of deep learning and power many modern AI applications — from image recognition and language translation to game playing and drug discovery.
A biological neuron receives signals through dendrites, processes them in the cell body, and transmits the output through the axon to other neurons. An artificial neuron mimics this:
| Biological | Artificial |
|---|---|
| Dendrites (inputs) | Input features (x1, x2, ...) |
| Synaptic weights | Learnable weights (w1, w2, ...) |
| Cell body (processing) | Weighted sum + activation function |
| Axon (output) | Output value |
The perceptron is the simplest neural network — a single neuron that computes a weighted sum of inputs and applies a step function:
output = step(w1*x1 + w2*x2 + ... + wn*xn + b)

Where the step function returns 1 if the weighted sum is non-negative and 0 otherwise.
import numpy as np

def perceptron(X, weights, bias):
    """A simple perceptron: weighted sum of inputs followed by a step function."""
    linear_output = np.dot(X, weights) + bias
    return (linear_output >= 0).astype(int)

# Example: AND gate
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])  # expected AND outputs
weights = np.array([0.5, 0.5])
bias = -0.7

predictions = perceptron(X, weights, bias)
print(f"AND gate predictions: {predictions}")
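With these weights, only the input [1, 1] produces a weighted sum (1.0) large enough to overcome the bias of -0.7, so the printed predictions are [0 0 0 1], matching the AND truth table stored in y.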
A single perceptron can only learn linearly separable functions. It cannot solve the XOR problem (where the classes are not separable by a straight line). This limitation motivated the development of multi-layer networks.
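To see what stacking layers buys you, here is a small illustration (not part of the lesson's own code) that reuses the perceptron and inputs defined above with hand-picked weights: XOR can be built as AND(OR(x1, x2), NAND(x1, x2)), which requires two layers of neurons.

```python
# A single perceptron cannot fit XOR, but two layers of the same perceptron can:
# XOR(x1, x2) = AND(OR(x1, x2), NAND(x1, x2)).
# The weights below are hand-picked for illustration, not learned.
h1 = perceptron(X, np.array([0.5, 0.5]), -0.2)    # first hidden neuron: OR gate
h2 = perceptron(X, np.array([-0.5, -0.5]), 0.7)   # second hidden neuron: NAND gate
hidden = np.column_stack([h1, h2])                # hidden-layer outputs become new inputs
xor_pred = perceptron(hidden, np.array([0.5, 0.5]), -0.7)  # output neuron: AND gate
print(f"XOR predictions: {xor_pred}")             # [0 1 1 0]
```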
A multi-layer neural network (also called a multi-layer perceptron or MLP) has:
| Layer | Role | Typical Size |
|---|---|---|
| Input | Receives features | Number of features |
| Hidden | Learns intermediate representations | 32, 64, 128, 256, or more neurons |
| Output | Produces predictions | 1 neuron (regression/binary) or n neurons (multi-class) |
Each neuron computes a weighted sum of its inputs plus a bias, then applies an activation function:

output = activation(w1*x1 + w2*x2 + ... + wn*xn + b)
The output of one layer becomes the input to the next.
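To make the layer-by-layer flow concrete, here is a minimal sketch of a forward pass through a 2-4-1 network in NumPy. The layer sizes, random weights, and the choice of a ReLU hidden layer with a sigmoid output are illustrative assumptions, not values taken from the lesson.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Forward pass through a 2-4-1 network: 2 input features, 4 hidden ReLU neurons,
# 1 sigmoid output neuron. Weights are random here purely for illustration.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

hidden = relu(X @ W1 + b1)          # layer 1: weighted sum + activation
output = sigmoid(hidden @ W2 + b2)  # layer 2: the hidden output becomes the input
print(output.round(3))
```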
Activation functions introduce non-linearity into the network, enabling it to learn complex patterns.
| Function | Formula | Range | Use Case |
|---|---|---|---|
| Sigmoid | 1 / (1 + e^(-z)) | (0, 1) | Binary classification output |
| Tanh | (e^z - e^(-z)) / (e^z + e^(-z)) | (-1, 1) | Hidden layers (centred output) |
| ReLU | max(0, z) | [0, infinity) | Most popular for hidden layers |
| Leaky ReLU | max(0.01z, z) | (-infinity, infinity) | Avoids "dying ReLU" problem |
| Softmax | e^(zi) / sum(e^(zj)) | (0, 1), sums to 1 | Multi-class classification output |
Tip: ReLU is the default choice for hidden layers. Use sigmoid for binary output and softmax for multi-class output.
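For reference, here is a sketch of how each activation in the table can be implemented in NumPy; the sample input vector is arbitrary and chosen only to show the output ranges.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def tanh(z):
    return np.tanh(z)

def relu(z):
    return np.maximum(0, z)

def leaky_relu(z, alpha=0.01):
    return np.where(z > 0, z, alpha * z)

def softmax(z):
    # subtract the max for numerical stability; the result still sums to 1
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print("sigmoid:   ", sigmoid(z).round(3))
print("tanh:      ", tanh(z).round(3))
print("relu:      ", relu(z))
print("leaky relu:", leaky_relu(z))
print("softmax:   ", softmax(z).round(3))
```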
Neural networks learn through backpropagation combined with gradient descent: the prediction error is propagated backwards through the network to compute the gradient of the loss with respect to each weight, and each weight is then nudged a small step against its gradient.
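As a minimal sketch of that idea in the simplest possible case, here is a single sigmoid neuron trained on the AND gate with plain gradient descent. The cross-entropy loss, learning rate, and iteration count are illustrative choices, not values from the lesson.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Illustrative mini-example: train one sigmoid neuron on the AND gate
# with gradient descent on the cross-entropy loss.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 0.0, 0.0, 1.0])

w, b, lr = np.zeros(2), 0.0, 0.5
for _ in range(2000):
    pred = sigmoid(X @ w + b)        # forward pass
    grad_z = pred - y                # gradient of cross-entropy w.r.t. the pre-activation
    w -= lr * X.T @ grad_z / len(y)  # gradient descent step on the weights
    b -= lr * grad_z.mean()          # gradient descent step on the bias

print(sigmoid(X @ w + b).round(2))   # predictions move towards [0, 0, 0, 1]
```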