Gradient Descent
Gradient descent is a fundamental optimization technique widely used in deep learning and machine learning to minimize functions, typically the loss or error function, by iteratively moving towards the minimum value.
1. Concept and Principle
- Gradient descent solves optimization problems of the form:
  $$ \min_{w} f(w) $$
  where $f$ is a differentiable function (often a loss or cost function in machine learning).
- The algorithm proceeds by moving in the direction of the negative gradient, which points towards the fastest local decrease of the function (made precise by the expansion below).
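One way to see why the negative gradient is the direction of fastest local decrease is the first-order Taylor expansion (a standard argument, added here for completeness rather than taken from the original text):

$$ f(w + \Delta) \approx f(w) + \nabla f(w)^\top \Delta , $$

so among all small steps $\Delta$ of a fixed length, the change $\nabla f(w)^\top \Delta$ is most negative when $\Delta$ points along $-\nabla f(w)$.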
2. Update Rule
- At each iteration, the parameters are updated as (see the one-step sketch after this list):
  $$ w \leftarrow w - \eta \, \nabla f(w) $$
  where:
  - $w$ denotes the parameters being optimized,
  - $\eta$ is the learning rate (step size),
  - $\nabla f(w)$ is the gradient of the loss with respect to the parameters.
- If the gradient is zero, the update no longer changes the parameters: the algorithm has reached a stationary point, typically a local or global minimum.
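As a minimal sketch of a single update (an illustrative addition, not part of the original text), the snippet below applies the rule once to the simple convex function $f(w) = w_1^2 + w_2^2$; the starting point and learning rate are arbitrary choices.

```python
import numpy as np

def grad_f(w):
    # gradient of the illustrative function f(w) = w1^2 + w2^2
    return 2 * w

w = np.array([3.0, -1.5])      # arbitrary initial parameters
eta = 0.1                      # learning rate (step size)

w = w - eta * grad_f(w)        # one update: w <- w - eta * grad f(w)
print(w)                       # [2.4, -1.2], a step towards the minimum at (0, 0)
```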
3. Algorithm Steps
1. Initialize the parameters randomly.
2. Compute the gradient of the loss with respect to the parameters.
3. Update the parameters using the rule above.
4. Repeat steps 2-3 until convergence or for a set number of iterations.
Pseudo-code
```python
def gradient_descent(w, gradient_of_loss_function, learning_rate, num_iterations):
    # w: the initialized parameters, refined iteratively below
    for i in range(num_iterations):
        grad = gradient_of_loss_function(w)   # gradient of the loss at the current parameters
        w = w - learning_rate * grad          # step in the direction of the negative gradient
    return w
```
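As a quick usage check (an illustrative addition), the routine above can be run on a hypothetical one-dimensional loss $f(w) = (w - 3)^2$, whose gradient is $2(w - 3)$; the starting point and hyperparameters are arbitrary.

```python
def grad(w):
    return 2 * (w - 3)   # gradient of the illustrative loss f(w) = (w - 3)^2

w_final = gradient_descent(w=0.0, gradient_of_loss_function=grad,
                           learning_rate=0.1, num_iterations=100)
print(w_final)           # approximately 3.0, the minimizer of f
```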
4. Variants
- Batch Gradient Descent: The gradient is computed using the whole dataset at each iteration.
- Stochastic Gradient Descent (SGD): The gradient is computed using one randomly chosen example at a time.
- Mini-batch Gradient Descent: The gradient is computed using a small subset (batch) of the data per iteration, as in the sketch after this list.
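The following is a minimal sketch of mini-batch gradient descent for a linear model with squared-error loss; the synthetic data, batch size, and learning rate are illustrative assumptions rather than details from the original text.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))              # synthetic features (illustrative)
y = X @ np.array([1.0, -2.0, 0.5])         # synthetic targets from a known linear model

w = np.zeros(3)                            # parameters to learn
learning_rate, batch_size = 0.1, 16

for epoch in range(50):
    indices = rng.permutation(len(X))      # reshuffle the data each epoch
    for start in range(0, len(X), batch_size):
        batch = indices[start:start + batch_size]
        Xb, yb = X[batch], y[batch]
        # gradient of the mean squared error over this mini-batch
        grad = 2 * Xb.T @ (Xb @ w - yb) / len(batch)
        w = w - learning_rate * grad       # same update rule, smaller data slice

print(w)                                   # close to [1.0, -2.0, 0.5]
```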
5. Example
To minimize the simple quadratic function:
$$ f(w) = w^2 $$
The gradient is:
$$ f'(w) = 2w $$
The update steps are:
$$ w \leftarrow w - \eta \cdot 2w = (1 - 2\eta)\, w $$
By repeating the above updates, $w$ converges to the minimizer $w = 0$ (provided the learning rate satisfies $0 < \eta < 1$).
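Using the same illustrative quadratic $f(w) = w^2$, the short loop below shows the iterates shrinking towards 0; the starting point and learning rate are arbitrary demonstration values.

```python
w = 5.0            # arbitrary starting point
eta = 0.1          # learning rate

for step in range(5):
    grad = 2 * w                 # gradient of f(w) = w^2
    w = w - eta * grad           # update: w <- (1 - 2*eta) * w
    print(step, w)               # 4.0, 3.2, 2.56, 2.048, 1.6384 -> towards 0
```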
6. Applications
- Training machine learning models: Linear regression, logistic regression, neural networks (a logistic-regression sketch follows this list).
- Minimizing error/loss: Fitting the best solution to the data.
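As one concrete instance of model training (an illustrative sketch, with synthetic data and hyperparameters chosen only for demonstration), logistic regression can be fit with exactly the update rule from Section 2.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))                        # synthetic features (illustrative)
y = (X[:, 0] - X[:, 1] > 0).astype(float)           # synthetic binary labels

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.zeros(2)                                      # logistic-regression weights
learning_rate = 0.5

for step in range(500):
    p = sigmoid(X @ w)                               # predicted probabilities
    grad = X.T @ (p - y) / len(y)                    # gradient of the mean log loss
    w = w - learning_rate * grad                     # gradient-descent update

accuracy = np.mean((sigmoid(X @ w) > 0.5) == y)
print(w, accuracy)                                   # weights align with [1, -1]; accuracy near 1.0
```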
7. Choosing Learning Rate
- A too-small learning rate leads to slow convergence.
- A too-large learning rate may cause divergence or oscillations, as the sketch after this list illustrates.
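Continuing the illustrative quadratic $f(w) = w^2$, the comparison below shows how the learning rate changes the behaviour of the same update; the specific values are assumptions for demonstration only.

```python
def run(eta, steps=10, w=1.0):
    # repeatedly apply the gradient-descent update for f(w) = w^2
    for _ in range(steps):
        w = w - eta * 2 * w
    return w

print(run(eta=0.01))   # ~0.82: too small, convergence is slow
print(run(eta=0.4))    # ~1e-7: a well-chosen rate converges quickly
print(run(eta=1.1))    # ~6.2: too large, the iterates oscillate and diverge
```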
8. Summary
- Gradient descent iteratively moves towards the minimum by following the negative gradient.
- It forms the backbone of model training in deep learning and machine learning.