qml.AdamOptimizer

class AdamOptimizer(stepsize=0.01, beta1=0.9, beta2=0.99, eps=1e-08)

Bases: pennylane.optimize.gradient_descent.GradientDescentOptimizer

Gradient-descent optimizer with an adaptive learning rate and first and second moment estimates.

Adaptive Moment Estimation uses a step-dependent learning rate, a first moment \(a\) and a second moment \(b\), reminiscent of the momentum and velocity of a particle:

\[x^{(t+1)} = x^{(t)} - \eta^{(t+1)} \frac{a^{(t+1)}}{\sqrt{b^{(t+1)}} + \epsilon },\]

where the update rules for the three values are given by

\[\begin{split}a^{(t+1)} &= \beta_1 a^{(t)} + (1-\beta_1)\nabla f(x^{(t)}),\\ b^{(t+1)} &= \beta_2 b^{(t)} + (1-\beta_2) ( \nabla f(x^{(t)}))^{\odot 2},\\ \eta^{(t+1)} &= \eta \frac{\sqrt{(1-\beta_2^{t+1})}}{(1-\beta_1^{t+1})}.\end{split}\]

Above, \(( \nabla f(x^{(t)}))^{\odot 2}\) denotes the element-wise square of the gradient, meaning that each element of the gradient is multiplied by itself. The hyperparameters \(\beta_1\) and \(\beta_2\) can also be step-dependent. Initially, the first and second moments are zero.

The shift \(\epsilon\) avoids division by zero.

For more details, see arXiv:1412.6980.
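The update can be sketched in a few lines of pure Python. This is a minimal illustration of one Adam step as described in arXiv:1412.6980, with the bias correction folded into the step-dependent learning rate; it is not PennyLane's implementation. The names `adam_step`, `grad_f`, and the `state` dictionary are hypothetical and exist only for this sketch.

```python
import math


def adam_step(x, grad, state, stepsize=0.01, beta1=0.9, beta2=0.99, eps=1e-8):
    """Apply one Adam update to the list of floats x.

    state is a hypothetical dict holding the step counter t and the
    moments a and b; it is mutated in place.
    """
    if not state:
        # Initially, the first and second moments are zero.
        state.update(t=0, a=[0.0] * len(x), b=[0.0] * len(x))

    state["t"] += 1
    t = state["t"]

    # Moving averages of the gradient and of its element-wise square.
    state["a"] = [beta1 * a + (1 - beta1) * g for a, g in zip(state["a"], grad)]
    state["b"] = [beta2 * b + (1 - beta2) * g * g for b, g in zip(state["b"], grad)]

    # Step-dependent learning rate (bias correction).
    eta_t = stepsize * math.sqrt(1 - beta2**t) / (1 - beta1**t)

    return [
        xi - eta_t * a / (math.sqrt(b) + eps)
        for xi, a, b in zip(x, state["a"], state["b"])
    ]


def grad_f(x):
    """Gradient of the toy objective f(x) = sum(x_i ** 2)."""
    return [2.0 * xi for xi in x]


state = {}
x = [1.0, -2.0]
for _ in range(500):
    x = adam_step(x, grad_f(x), state, stepsize=0.05)
```

With zero initial moments, the first step moves every coordinate by approximately the stepsize, regardless of the gradient's magnitude, because the first moment is divided by the square root of the second.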

Parameters
  • stepsize (float) – the user-defined hyperparameter \(\eta\)

  • beta1 (float) – hyperparameter \(\beta_1\) governing the update of the first moment

  • beta2 (float) – hyperparameter \(\beta_2\) governing the update of the second moment

  • eps (float) – offset \(\epsilon\) added for numerical stability

Methods
  • apply_grad(grad, x) – Update the variables x to take a single optimization step.

  • compute_grad(objective_fn, x[, grad_fn]) – Compute gradient of the objective_fn at the point x.

  • reset() – Reset optimizer by erasing memory of past steps.

  • step(objective_fn, x[, grad_fn]) – Update x with one step of the optimizer.

  • update_stepsize(stepsize) – Update the initialized stepsize value \(\eta\).

apply_grad(grad, x)

Update the variables x to take a single optimization step. Flattens and unflattens the inputs to maintain nested iterables as the parameters of the optimization.

Parameters
  • grad (array) – the gradient of the objective function at the point \(x^{(t)}\), i.e. \(\nabla f(x^{(t)})\)

  • x (array) – the current value of the variables \(x^{(t)}\)

Returns

the new values \(x^{(t+1)}\)

Return type

array
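The flatten-and-unflatten pattern mentioned above can be illustrated as follows. This is a rough sketch only: the helpers `flatten`, `unflatten`, and `nested_update` are hypothetical names written for this example, not PennyLane's internal utilities, and a plain gradient step stands in for the Adam rule to keep the example short.

```python
def flatten(nested):
    """Yield the scalar leaves of arbitrarily nested lists/tuples in order."""
    for item in nested:
        if isinstance(item, (list, tuple)):
            yield from flatten(item)
        else:
            yield item


def unflatten(flat, template):
    """Rebuild the nesting of template from the flat sequence flat."""
    iterator = iter(flat)

    def rebuild(node):
        if isinstance(node, (list, tuple)):
            return type(node)(rebuild(child) for child in node)
        return next(iterator)

    return rebuild(template)


def nested_update(x, grad, stepsize):
    """Plain gradient step on the flattened values, then restore the nesting."""
    flat_x = list(flatten(x))
    flat_grad = list(flatten(grad))
    flat_new = [xi - stepsize * gi for xi, gi in zip(flat_x, flat_grad)]
    return unflatten(flat_new, x)


params = [[1.0, 2.0], [3.0, [4.0]]]
grads = [[0.5, 0.5], [1.0, [2.0]]]
updated = nested_update(params, grads, stepsize=0.5)
```

The returned value has the same nesting as `params`, so the caller can keep using whatever parameter layout the objective function expects.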

static compute_grad(objective_fn, x, grad_fn=None)

Compute gradient of the objective_fn at the point x.

Parameters
  • objective_fn (function) – the objective function for optimization

  • x (array) – NumPy array containing the current values of the variables to be updated

  • grad_fn (function) – Optional gradient function of the objective function with respect to the variables x. If None, the gradient function is computed automatically.

Returns

NumPy array containing the gradient \(\nabla f(x^{(t)})\)

Return type

array
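When supplying a custom grad_fn, it can help to check it against a numerical approximation first. The sketch below is one possible way to do that in pure Python; `objective`, `objective_grad`, and `finite_difference_grad` are hypothetical names, and plain lists are used in place of NumPy arrays only to keep the example dependency-free.

```python
def objective(x):
    """Toy objective f(x) = sum(x_i ** 2) + x_0 * x_1."""
    return sum(xi * xi for xi in x) + x[0] * x[1]


def objective_grad(x):
    """Hand-written gradient of objective, playing the role of grad_fn."""
    return [2 * x[0] + x[1], 2 * x[1] + x[0]]


def finite_difference_grad(fn, x, h=1e-6):
    """Central-difference approximation of the gradient of fn at x."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((fn(forward) - fn(backward)) / (2 * h))
    return grad


point = [0.5, -1.5]
analytic = objective_grad(point)
numeric = finite_difference_grad(objective, point)

# Largest component-wise disagreement between the two gradients.
max_error = max(abs(a - n) for a, n in zip(analytic, numeric))
```

A small `max_error` suggests the hand-written gradient is consistent with the objective at that point; it does not prove correctness everywhere.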

reset()

Reset optimizer by erasing memory of past steps.

step(objective_fn, x, grad_fn=None)

Update x with one step of the optimizer.

Parameters
  • objective_fn (function) – the objective function for optimization

  • x (array) – NumPy array containing the current values of the variables to be updated

  • grad_fn (function) – Optional gradient function of the objective function with respect to the variables x. If None, the gradient function is computed automatically.

Returns

the new variable values \(x^{(t+1)}\)

Return type

array

update_stepsize(stepsize)

Update the initialized stepsize value \(\eta\).

This allows for techniques such as learning rate scheduling.

Parameters
  • stepsize (float) – the user-defined hyperparameter \(\eta\)
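As an illustration of learning rate scheduling, the sketch below applies an exponential decay through update_stepsize. The schedule `exponential_decay` and the class `RecordingOptimizer` are hypothetical: the class is a stand-in that exposes only update_stepsize so the example runs without PennyLane, and any optimizer offering that method could take its place.

```python
def exponential_decay(initial_stepsize, decay_rate, step):
    """Return initial_stepsize * decay_rate ** step."""
    return initial_stepsize * decay_rate**step


class RecordingOptimizer:
    """Hypothetical stand-in that only stores the stepsize it is given."""

    def __init__(self, stepsize):
        self.stepsize = stepsize
        self.history = [stepsize]

    def update_stepsize(self, stepsize):
        self.stepsize = stepsize
        self.history.append(stepsize)


opt = RecordingOptimizer(stepsize=0.1)
for step in range(1, 4):
    # In a real training loop, an optimization step would be taken here
    # before the stepsize is lowered for the next iteration.
    opt.update_stepsize(exponential_decay(0.1, 0.5, step))
```

Here the stepsize is halved after each iteration; other schedules (step decay, warm-up, cosine) follow the same pattern of computing a new value and passing it to update_stepsize.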
