class GradientDescentOptimizer(stepsize=0.01)

Bases: object

Base class for other gradient-descent-based optimizers.

A step of the gradient descent optimizer computes the new values via the rule

$x^{(t+1)} = x^{(t)} - \eta \nabla f(x^{(t)}),$

where $$\eta$$ is a user-defined hyperparameter corresponding to the step size.
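As a minimal sketch of this rule in plain Python (not this class's implementation), take the objective $$f(x) = x^2$$ with gradient $$2x$$. The names `gd_update` and `grad_f` are illustrative assumptions and do not appear in the API.

```python
def gd_update(x, grad, stepsize=0.01):
    """Return x - stepsize * grad, i.e. the update rule for a scalar x."""
    return x - stepsize * grad


def grad_f(x):
    # Analytic gradient of the example objective f(x) = x**2.
    return 2.0 * x


x0 = 1.0
# One step with eta = 0.1: x1 = 1.0 - 0.1 * 2.0
x1 = gd_update(x0, grad_f(x0), stepsize=0.1)
```

The step moves x towards the minimiser at 0 by an amount proportional to both the gradient and $$\eta$$.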

Parameters

stepsize (float) – the user-defined hyperparameter $$\eta$$

Methods

• apply_grad(grad, x) – Update the variables x to take a single optimization step.

• compute_grad(objective_fn, x[, grad_fn]) – Compute gradient of the objective_fn at the point x.

• step(objective_fn, x[, grad_fn]) – Update x with one step of the optimizer.

• update_stepsize(stepsize) – Update the initialized stepsize value $$\eta$$.

apply_grad(grad, x)

Update the variables x to take a single optimization step. Flattens and unflattens the inputs to maintain nested iterables as the parameters of the optimization.

Parameters
• grad (array) – the gradient of the objective function at the point $$x^{(t)}$$: $$\nabla f(x^{(t)})$$

• x (array) – the current value of the variables $$x^{(t)}$$

Returns

the new values $$x^{(t+1)}$$

Return type

array
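
The flatten/update/unflatten idea described above can be sketched in plain Python. This is an assumption-laden illustration, not the method's actual code: `flatten`, `unflatten` and `apply_grad_sketch` are hypothetical helpers, and the sketch handles only nested lists and tuples of floats (tuples come back as lists).

```python
def flatten(nested):
    """Yield the scalar leaves of arbitrarily nested lists/tuples, in order."""
    for item in nested:
        if isinstance(item, (list, tuple)):
            yield from flatten(item)
        else:
            yield item


def unflatten(flat, template):
    """Rebuild the nesting of `template` from an iterator of scalars."""
    out = []
    for item in template:
        if isinstance(item, (list, tuple)):
            out.append(unflatten(flat, item))
        else:
            out.append(next(flat))
    return out


def apply_grad_sketch(grad, x, stepsize=0.01):
    """Apply x - stepsize * grad leaf by leaf, preserving the nesting of x."""
    updated = (xi - stepsize * gi for xi, gi in zip(flatten(x), flatten(grad)))
    return unflatten(updated, x)


x = [1.0, [2.0, 3.0]]
grad = [0.5, [1.0, -1.0]]
x_new = apply_grad_sketch(grad, x, stepsize=0.1)  # same nesting as x
```

Working on the flattened leaves keeps the update rule itself one-dimensional, while the caller still gets back parameters in the shape it supplied.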

static compute_grad(objective_fn, x, grad_fn=None)

Compute the gradient of objective_fn at the point x.

Parameters
• objective_fn (function) – the objective function for optimization

• x (array) – NumPy array containing the current values of the variables to be updated

• grad_fn (function) – Optional gradient function of the objective function with respect to the variables x. If None, the gradient function is computed automatically.

Returns

NumPy array containing the gradient $$\nabla f(x^{(t)})$$

Return type

array
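
The optional grad_fn argument can be illustrated with the following sketch. It is not the method's implementation: where the documented method computes the gradient function automatically, this stand-in falls back to central finite differences, and `compute_grad_sketch`, `objective` and `eps` are hypothetical names.

```python
def compute_grad_sketch(objective_fn, x, grad_fn=None, eps=1e-6):
    """Return the gradient of objective_fn at x (a flat list of floats).

    Uses grad_fn when it is given; otherwise approximates the gradient
    with central finite differences (a stand-in for automatic
    differentiation, chosen only to keep the sketch dependency-free).
    """
    if grad_fn is not None:
        return list(grad_fn(x))
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += eps
        backward[i] -= eps
        grad.append((objective_fn(forward) - objective_fn(backward)) / (2 * eps))
    return grad


def objective(x):
    # Example objective with known gradient [2 * x0, 3].
    return x[0] ** 2 + 3.0 * x[1]


numeric = compute_grad_sketch(objective, [1.0, 2.0])
analytic = compute_grad_sketch(
    objective, [1.0, 2.0], grad_fn=lambda x: [2.0 * x[0], 3.0]
)
```

Both calls should agree up to the finite-difference error, which is why supplying grad_fn only matters when a cheaper or more accurate gradient is available.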

step(objective_fn, x, grad_fn=None)

Update x with one step of the optimizer.

Parameters
• objective_fn (function) – the objective function for optimization

• x (array) – NumPy array containing the current values of the variables to be updated

• grad_fn (function) – Optional gradient function of the objective function with respect to the variables x. If None, the gradient function is computed automatically.

Returns

the new variable values $$x^{(t+1)}$$

Return type

array
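
Calling step repeatedly gives the usual optimization loop. The sketch below mimics that pattern in plain Python under stated assumptions: `step_sketch` is a hypothetical stand-in that requires grad_fn (it does not compute gradients automatically), and the quadratic objective is chosen only for illustration.

```python
def step_sketch(objective_fn, x, grad_fn, stepsize=0.01):
    """One optimizer step: evaluate the gradient, then apply the update rule.

    objective_fn is kept only to mirror the documented signature; this
    sketch always relies on the explicitly supplied grad_fn.
    """
    grad = grad_fn(x)
    return [xi - stepsize * gi for xi, gi in zip(x, grad)]


def objective(x):
    # Minimised at x[0] = 3.
    return (x[0] - 3.0) ** 2


def grad_objective(x):
    return [2.0 * (x[0] - 3.0)]


x = [0.0]
for _ in range(100):
    # Each call returns new values; the caller rebinds x.
    x = step_sketch(objective, x, grad_objective, stepsize=0.1)
```

With this step size the distance to the minimiser shrinks by a factor of 0.8 per iteration, so x[0] ends up very close to 3.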

update_stepsize(stepsize)

Update the initialized stepsize value $$\eta$$.

This allows for techniques such as learning rate scheduling.

Parameters

stepsize (float) – the user-defined hyperparameter $$\eta$$
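
Learning rate scheduling could look like the sketch below. `ScheduledGD` is a hypothetical minimal class written for this example, not this optimizer, and the inverse-time decay schedule is just one assumed choice.

```python
class ScheduledGD:
    """Hypothetical minimal optimizer with a mutable step size."""

    def __init__(self, stepsize=0.01):
        self.stepsize = stepsize

    def update_stepsize(self, stepsize):
        # Replace the step size used by all subsequent steps.
        self.stepsize = stepsize

    def step(self, grad_fn, x):
        return x - self.stepsize * grad_fn(x)


opt = ScheduledGD(stepsize=0.4)
x = 5.0
stepsizes = []
for t in range(20):
    # Inverse-time decay: eta_t = eta_0 / (1 + 0.1 * t)
    opt.update_stepsize(0.4 / (1.0 + 0.1 * t))
    stepsizes.append(opt.stepsize)
    x = opt.step(lambda v: 2.0 * v, x)  # gradient of f(v) = v**2
```

Larger early steps make fast initial progress, and the shrinking step size reduces the risk of overshooting near the minimum.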