RE: Why Do I Sometimes Get Better Accuracy With A Higher Learning Rate Gradient Descent?
I briefly mentioned some of those techniques at the end.
You're correct that stochastic gradient descent has somewhat overtaken traditional gradient descent, if for no other reason than that each update is faster to compute.
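To make the cost difference concrete, here is a rough sketch rather than anything taken from the article. It fits a one-parameter linear model to made-up data, once with full-batch gradient descent and once with single-sample stochastic gradient descent. The data, the function names (make_data, batch_step, sgd_step), the learning rates, and the step counts are all assumptions I picked for illustration.

```python
import random


def make_data(n, seed=0):
    """Synthetic data (an assumption for this sketch): y = 3x plus small noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [3.0 * x + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def batch_step(w, xs, ys, lr):
    """One full-batch update: the gradient of the mean squared error
    is averaged over every sample, so the cost grows with len(xs)."""
    grad = sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    return w - lr * grad


def sgd_step(w, xs, ys, lr, rng):
    """One stochastic update: the gradient comes from a single randomly
    chosen sample, so the cost does not depend on len(xs)."""
    i = rng.randrange(len(xs))
    grad = 2.0 * (w * xs[i] - ys[i]) * xs[i]
    return w - lr * grad


xs, ys = make_data(1000)

# Full batch: 100 updates, each touching all 1000 samples.
w_batch = 0.0
for _ in range(100):
    w_batch = batch_step(w_batch, xs, ys, lr=0.5)

# Stochastic: 5000 updates, each touching exactly one sample.
rng = random.Random(1)
w_sgd = 0.0
for _ in range(5000):
    w_sgd = sgd_step(w_sgd, xs, ys, lr=0.05, rng=rng)

print("batch GD estimate:", w_batch)
print("SGD estimate:     ", w_sgd)
```

With these settings the full-batch run evaluates 100 x 1000 = 100,000 per-sample gradients, while the stochastic run evaluates 5,000. Both should end up near the true slope of 3, with the SGD estimate being somewhat noisier because each step only sees one sample.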
Anyway, there will definitely be more articles about these other techniques in the future.