qml.AdamOptimizer

class AdamOptimizer(stepsize=0.01, beta1=0.9, beta2=0.99, eps=1e-08)

Bases: pennylane.optimize.gradient_descent.GradientDescentOptimizer

Gradient-descent optimizer with adaptive learning rate, first and second moment.

Adaptive Moment Estimation uses a step-dependent learning rate, a first moment \(a\) and a second moment \(b\), reminiscent of the momentum and velocity of a particle:

\[x^{(t+1)} = x^{(t)} - \eta^{(t+1)} \frac{a^{(t+1)}}{\sqrt{b^{(t+1)}} + \epsilon },\]

where the update rules for the two moments are given by

\[\begin{split}a^{(t+1)} &= \beta_1 a^{(t)} + (1-\beta_1) \nabla f(x^{(t)}),\\ b^{(t+1)} &= \beta_2 b^{(t)} + (1-\beta_2) (\nabla f(x^{(t)}))^{\odot 2},\\ \eta^{(t+1)} &= \eta \frac{\sqrt{(1-\beta_2^{t+1})}}{(1-\beta_1^{t+1})}.\end{split}\]

Above, \((\nabla f(x^{(t)}))^{\odot 2}\) denotes the element-wise square operation, which means that each element in the gradient is multiplied by itself. The hyperparameters \(\beta_1\) and \(\beta_2\) can also be step-dependent. Initially, the first and second moments are zero.

The shift \(\epsilon\) avoids division by zero.
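
As a worked illustration derived from the rules above (not an additional statement from the library), consider the very first step, where \(a^{(0)} = b^{(0)} = 0\) and \(g = \nabla f(x^{(0)})\). All operations are element-wise:

\[\begin{split}a^{(1)} &= (1-\beta_1)\, g, \qquad b^{(1)} = (1-\beta_2)\, g^{\odot 2}, \qquad \eta^{(1)} = \eta \frac{\sqrt{1-\beta_2}}{1-\beta_1},\\ x^{(1)} &= x^{(0)} - \eta \frac{\sqrt{1-\beta_2}\, g}{\sqrt{1-\beta_2}\, |g| + \epsilon} \approx x^{(0)} - \eta \operatorname{sign}(g) \quad \text{when } \sqrt{1-\beta_2}\, |g| \gg \epsilon.\end{split}\]

In other words, the first update moves each component by roughly \(\eta\), largely independent of the gradient's magnitude.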

For more details, see arXiv:1412.6980.
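
The update rule above can be sketched in plain NumPy. This is a minimal illustration of the formulas as written on this page, not PennyLane's implementation; the function name `adam_step` and the quadratic test objective are assumptions made for the example.

```python
import numpy as np


def adam_step(x, grad, a, b, t, stepsize=0.01, beta1=0.9, beta2=0.99, eps=1e-8):
    """Apply one Adam update; return the new point, both moments and the step count."""
    t = t + 1
    # Exponential moving averages of the gradient and of its element-wise square.
    a = beta1 * a + (1 - beta1) * grad
    b = beta2 * b + (1 - beta2) * grad**2
    # Step-dependent learning rate eta^(t+1) from the formula above.
    eta_t = stepsize * np.sqrt(1 - beta2**t) / (1 - beta1**t)
    x = x - eta_t * a / (np.sqrt(b) + eps)
    return x, a, b, t


# Hypothetical objective f(x) = sum(x**2), whose gradient is 2 * x.
x = np.array([1.0, -2.0])
a = np.zeros_like(x)  # first moment starts at zero
b = np.zeros_like(x)  # second moment starts at zero
t = 0

x, a, b, t = adam_step(x, 2 * x, a, b, t)
first_step = x.copy()  # each component has moved by about `stepsize`

for _ in range(999):
    x, a, b, t = adam_step(x, 2 * x, a, b, t)
```

Because both moments start at zero, the first update shifts every component by approximately `stepsize` towards the minimum, whatever the size of the gradient.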

Parameters

stepsize (float) – the user-defined hyperparameter \(\eta\)
beta1 (float) – hyperparameter \(\beta_1\) governing the update of the first moment
beta2 (float) – hyperparameter \(\beta_2\) governing the update of the second moment
eps (float) – offset \(\epsilon\) added for numerical stability

Note

When using the Torch, TensorFlow or JAX interfaces, refer to Gradients and training for suitable optimizers.

Attributes

`fm`	Returns estimated first moments of gradient
`sm`	Returns estimated second moments of gradient
`t`	Returns accumulated timesteps

fm: Returns estimated first moments of gradient

sm: Returns estimated second moments of gradient

t: Returns accumulated timesteps

Methods

`apply_grad`(grad, args)	Update the variables args to take a single optimization step.
`compute_grad`(objective_fn, args, kwargs[, …])	Compute gradient of the objective function at the given point and return it along with the objective function forward pass (if available).
`reset`()	Reset optimizer by erasing memory of past steps.
`step`(objective_fn, *args[, grad_fn])	Update trainable arguments with one step of the optimizer.
`step_and_cost`(objective_fn, *args[, grad_fn])	Update trainable arguments with one step of the optimizer and return the corresponding objective function value prior to the step.

apply_grad(grad, args)

Update the variables args to take a single optimization step. Flattens and unflattens the inputs to maintain nested iterables as the parameters of the optimization.

Parameters

grad (tuple[ndarray]) – the gradient of the objective function at point \(x^{(t)}\): \(\nabla f(x^{(t)})\)
args (tuple) – the current value of the variables \(x^{(t)}\)

Returns

the new values \(x^{(t+1)}\)

Return type

list

static compute_grad(objective_fn, args, kwargs, grad_fn=None)

Compute gradient of the objective function at the given point and return it along with the objective function forward pass (if available).

Parameters

objective_fn (function) – the objective function for optimization
args (tuple) – tuple of NumPy arrays containing the current parameters for the objective function
kwargs (dict) – keyword arguments for the objective function
grad_fn (function) – optional gradient function of the objective function with respect to the variables args. If None, the gradient function is computed automatically. Must return the same shape of tuple [array] as the autograd derivative.

Returns

NumPy array containing the gradient \(\nabla f(x^{(t)})\), together with the objective function output. If grad_fn is provided, the objective function will not be evaluated and None will be returned in place of its output.

Return type

tuple (array)

reset(): Reset optimizer by erasing memory of past steps.

step(objective_fn, *args, grad_fn=None, **kwargs)

Update trainable arguments with one step of the optimizer.

Parameters

objective_fn (function) – the objective function for optimization
*args – variable-length argument list for the objective function
grad_fn (function) – optional gradient function of the objective function with respect to the variables *args. If None, the gradient function is computed automatically. Must return a tuple[array] with the same number of elements as *args. Each array of the tuple should have the same shape as the corresponding argument.
**kwargs – variable-length keyword arguments for the objective function

Returns

the new variable values \(x^{(t+1)}\). If a single argument is provided, list [array] is replaced by array.

Return type

list [array]

step_and_cost(objective_fn, *args, grad_fn=None, **kwargs)

Update trainable arguments with one step of the optimizer and return the corresponding objective function value prior to the step.

Parameters

objective_fn (function) – the objective function for optimization
*args – variable-length argument list for the objective function
grad_fn (function) – optional gradient function of the objective function with respect to the variables *args. If None, the gradient function is computed automatically. Must return a tuple[array] with the same number of elements as *args. Each array of the tuple should have the same shape as the corresponding argument.
**kwargs – variable-length keyword arguments for the objective function

Returns

the new variable values \(x^{(t+1)}\) and the objective function output prior to the step. If a single argument is provided, list [array] is replaced by array.

Return type

tuple[list [array], float]
