Implement a simple neural network
Aim
To implement a simple feedforward neural network with one hidden layer using Python and NumPy, and demonstrate its ability to learn the XOR function.
Algorithm
1. Initialize the input data (XOR inputs/outputs) and the network parameters (weights and biases).
2. Feedforward: compute activations for the hidden and output layers using the sigmoid activation function.
3. Compute loss: measure the prediction error using mean squared error.
4. Backpropagation: calculate the gradients of the loss with respect to the weights and biases using the chain rule (the gradient equations appear after this list).
5. Update parameters: adjust the weights and biases using the computed gradients and the learning rate.
6. Repeat steps 2–5 for a fixed number of epochs.
7. Predict: after training, generate outputs for all inputs.
Program (Python)
import numpy as np

# 1. Input and output for XOR
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [0]])

# 2. Network parameters
input_size, hidden_size, output_size = 2, 2, 1
lr = 0.5
epochs = 10000

np.random.seed(42)
W1 = np.random.randn(input_size, hidden_size)
b1 = np.zeros((1, hidden_size))
W2 = np.random.randn(hidden_size, output_size)
b2 = np.zeros((1, output_size))

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def dsigmoid(x):
    # Expects the sigmoid *output* a = sigmoid(z), since sigma'(z) = a * (1 - a)
    return x * (1 - x)

# Training loop
for epoch in range(epochs):
    # Forward pass
    z1 = X @ W1 + b1
    a1 = sigmoid(z1)
    z2 = a1 @ W2 + b2
    a2 = sigmoid(z2)
    loss = np.mean((y - a2) ** 2)

    # Backward pass (chain rule)
    delta2 = (a2 - y) * dsigmoid(a2)
    dW2 = a1.T @ delta2
    db2 = np.sum(delta2, axis=0, keepdims=True)
    delta1 = (delta2 @ W2.T) * dsigmoid(a1)
    dW1 = X.T @ delta1
    db1 = np.sum(delta1, axis=0, keepdims=True)

    # Gradient-descent update
    W2 -= lr * dW2
    b2 -= lr * db2
    W1 -= lr * dW1
    b1 -= lr * db1

    if epoch % 2000 == 0:
        print(f"Epoch {epoch} Loss: {loss:.4f}")

# Prediction after training
print("Final predictions:")
print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)))
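As a readability note, the final forward pass can also be factored into a small helper. The predict function below is an illustrative sketch, not part of the original program; it reuses the trained W1, b1, W2, b2 and the sigmoid defined above:

def predict(X, W1, b1, W2, b2):
    # Forward pass with the trained weights; outputs in (0, 1) are rounded to 0/1
    a1 = sigmoid(X @ W1 + b1)      # hidden-layer activations
    a2 = sigmoid(a1 @ W2 + b2)     # output-layer activations
    return np.round(a2)

print(predict(X, W1, b1, W2, b2))  # matches the "Final predictions" shown below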
Output
Epoch 0 Loss: 0.3467
Epoch 2000 Loss: 0.2471
Epoch 4000 Loss: 0.1391
Epoch 6000 Loss: 0.0667
Epoch 8000 Loss: 0.0419
Final predictions:
[[0.]
 [1.]
 [1.]
 [0.]]
Result
- The loss decreases steadily as training progresses, indicating that the network is learning.
- The final output correctly reproduces the XOR truth table:
  - 0 XOR 0 → 0
  - 0 XOR 1 → 1
  - 1 XOR 0 → 1
  - 1 XOR 1 → 0
- This demonstrates a correct implementation of a simple neural network, capable of learning a non-linearly-separable pattern using only basic Python and NumPy (see the counter-experiment after this list).
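Why is the hidden layer necessary? XOR is not linearly separable, so a model without a hidden non-linearity cannot fit it. The counter-experiment below is a sketch, not part of the original exercise; it reuses X, y, lr, sigmoid, and dsigmoid from the program above to train a fresh single-layer model with the same update rule. Its loss typically plateaus near 0.25, and the rounded predictions are provably wrong on at least one input:

# Single-layer baseline: the output depends linearly on X before the sigmoid
W = np.random.randn(2, 1)
b = np.zeros((1, 1))
for _ in range(10000):
    a = sigmoid(X @ W + b)           # no hidden layer
    delta = (a - y) * dsigmoid(a)    # same MSE/sigmoid gradient as above
    W -= lr * (X.T @ delta)
    b -= lr * np.sum(delta, axis=0, keepdims=True)
print(np.round(a))                   # never matches [[0], [1], [1], [0]]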