
Optimizers

PyTorch Implementation of Optimizers from scratch

BGD - Batch Gradient Descent

BGD computes the gradient of the cost function J(θ) with respect to the parameters θ over the entire training dataset, then performs a single update in the direction of the negative gradient. The learning rate η determines how large a step we take. Update rule:

θ = θ − η · ∇θ J(θ)

Batch gradient descent will converge to the global minimum for convex error surfaces and to a local minimum for non-convex surfaces.
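Below is a minimal sketch of what a from-scratch BGD step might look like in PyTorch. The `BGD` class, its parameter names, and the toy model are illustrative assumptions, not necessarily the code used in this repository; the point is that the loss is computed over the full dataset before each parameter update.

```python
import torch

class BGD:
    """Illustrative batch gradient descent: one update per pass over the full dataset."""
    def __init__(self, params, lr=0.01):
        self.params = list(params)
        self.lr = lr

    def zero_grad(self):
        # Clear accumulated gradients before the next full-batch backward pass.
        for p in self.params:
            if p.grad is not None:
                p.grad.zero_()

    @torch.no_grad()
    def step(self):
        # theta = theta - eta * grad_theta J(theta)
        for p in self.params:
            if p.grad is not None:
                p -= self.lr * p.grad

# Usage (hypothetical toy example): compute the loss on the entire training set,
# backpropagate once, then take a single step.
model = torch.nn.Linear(3, 1)
X, y = torch.randn(100, 3), torch.randn(100, 1)
opt = BGD(model.parameters(), lr=0.1)

for epoch in range(10):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(model(X), y)  # full-batch loss
    loss.backward()
    opt.step()
```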

All implementations are based on the paper 'An overview of gradient descent optimization algorithms' by Sebastian Ruder.

References

@article{ruder2016overview,
  title={An overview of gradient descent optimization algorithms},
  author={Ruder, Sebastian},
  journal={arXiv preprint arXiv:1609.04747},
  year={2016},
  url={https://arxiv.org/abs/1609.04747}
}
