# qml.SPSAOptimizer¶

class SPSAOptimizer(maxiter=None, alpha=0.602, gamma=0.101, c=0.2, A=None, a=None)[source]

Bases: object

The Simultaneous Perturbation Stochastic Approximation method (SPSA) is a stochastic approximation algorithm for optimizing cost functions whose evaluation may involve noise.

While other gradient-based optimization methods usually attempt to compute the gradient analytically, SPSA involves approximating gradients at the cost of evaluating the cost function twice in each iteration step. This cost may result in a significant decrease in the overall cost of function evaluations for the entire optimization. It is based on an approximation of the unknown gradient $$\hat{g}(\hat{\theta}_{k})$$ through a simultaneous perturbation of the input parameters:

$\begin{split}\hat{g}_k(\hat{\theta}_k) = \frac{y(\hat{\theta}_k+c_k\Delta_k)- y(\hat{\theta}_k-c_k\Delta_k)}{2c_k} \begin{bmatrix} \Delta_{k1}^{-1} \\ \Delta_{k2}^{-1} \\ \vdots \\ \Delta_{kp}^{-1} \end{bmatrix}\text{,}\end{split}$

where

• $$k$$ is the current iteration step,

• $$\hat{\theta}_k$$ are the input parameters at iteration step $$k$$,

• $$y$$ is the objective function,

• $$c_k=\frac{c}{k^\gamma}$$ is the gain sequence corresponding to evaluation step size and it can be controlled with

• scaling parameter $$c$$ and

• scaling exponent $$\gamma$$

• $$\Delta_{ki}^{-1} \left(1 \leq i \leq p \right)$$ are the inverted elements of random pertubation vector $$\Delta_k$$.

$$\hat{\theta}_k$$ is updated to a new set of parameters with

$\hat{\theta}_{k+1} = \hat{\theta}_{k} - a_k\hat{g}_k(\hat{\theta}_k)\text{,}$

where the gain sequences $$a_k=\frac{a}{(A+k)^\alpha}$$ controls parameter update step size.

The gain sequence $$a_k$$ can be controlled with

• scaling parameter $$a$$,

• scaling exponent $$\alpha$$ and

• stability constant $$A$$

For more details, see Spall (1998a).

Note

• One SPSA iteration step of a cost function that involves computing the expectation value of a Hamiltonian with M terms requires $$2*M$$ quantum device executions.

• The forward-pass value of the cost function is not computed when stepping the optimizer. Therefore, in case of using step_and_cost method instead of step, the number of executions will include the cost function evaluations.

Examples:

For VQE/VQE-like problems, the objective function can be the following:

>>> coeffs = [0.2, -0.543, 0.4514]
>>> obs = [qml.PauliX(0) @ qml.PauliZ(1), qml.PauliZ(0) @ qml.Hadamard(2),
...             qml.PauliX(3) @ qml.PauliZ(1)]
>>> H = qml.Hamiltonian(coeffs, obs)
>>> num_qubits = 4
>>> dev = qml.device("default.qubit", wires=num_qubits)
>>> @qml.qnode(dev)
... def cost(params, num_qubits=1):
...     qml.BasisState(np.array([1, 1, 0, 0]), wires=range(num_qubits))
...     for i in range(num_qubits):
...         qml.Rot(*params[i], wires=0)
...         qml.CNOT(wires=[2, 3])
...         qml.CNOT(wires=[2, 0])
...         qml.CNOT(wires=[3, 1])
...     return qml.expval(H)
...
>>> params = np.random.normal(0, np.pi, (num_qubits, 3), requires_grad=True)


Once constructed, the cost function can be passed directly to the step or step_and_cost function of the optimizer:

>>> max_iterations = 100
>>> opt = qml.SPSAOptimizer(maxiter=max_iterations)
>>> for _ in range(max_iterations):
...     params, energy = opt.step_and_cost(cost, params, num_qubits=num_qubits)
>>> print(energy)
-0.4294539602541956


The algorithm provided by SPSA does not rely on built-in automatic differentiation capabilities of the interface being used and therefore the optimizer can be used in more complex hybrid classical-quantum workflow with any of the interfaces:

>>> n_qubits = 1
>>> max_iterations = 20
>>> dev = qml.device("default.qubit", wires=n_qubits)
>>> @qml.qnode(dev, interface="tf")
... def layer_fn_spsa(inputs, weights):
...     qml.AngleEmbedding(inputs, wires=range(n_qubits))
...     qml.BasicEntanglerLayers(weights, wires=range(n_qubits))
...     return qml.expval(qml.PauliZ(wires=0))
...
>>> opt = qml.SPSAOptimizer(maxiter=max_iterations)
... def fn(params, tensor_in, tensor_out):
...     with tf.init_scope():
...             for _ in range(max_iterations):
...                     # Some classical steps before the quantum computation
...                     params_a, layer_res = opt.step_and_cost(layer_fn_spsa,
...                                     np.tensor(params))
...                     params = params_a[1]
...                     tensor_out = layer_res
...                     # Some classical steps after the quantum computation
...     return layer_res
...
>>> tensor_in = tf.Variable([0.27507603])
>>> tensor_out = tf.Variable([0])
>>> params = tf.Variable([[3.97507603],
...     [3.12950603],
...     [1.00854038],
...     [1.25907603]])
>>> loss = fn(params, tensor_in, tensor_out)
>>> print(loss)
tf.Tensor(-0.9995854230771829, shape=(), dtype=float64)

Keyword Arguments
• maxiter (int) – the maximum number of iterations expected to be performed. Used to determine $$A$$, if $$A$$ is not supplied, otherwise ignored.

• alpha (float) – A hyperparameter to calculate $$a_k=\frac{a}{(A+k+1)^\alpha}$$ for each iteration. Its asymptotically optimal value is 1.0.

• gamma (float) – An hyperparameter to calculate $$c_k=\frac{c}{(k+1)^\gamma}$$ for each iteration. Its asymptotically optimal value is 1/6.

• c (float) – A hyperparameter related to the expected noise. It should be approximately the standard deviation of the expected noise of the cost function.

• A (float) – The stability constant; if not provided, set to be 10% of the maximum number of expected iterations.

• a (float) – A hyperparameter expected to be small in noisy situations, its value could be picked using A, $$\alpha$$ and $$\hat{g_0} (\hat{\theta_0})$$. For more details, see Spall (1998b).

 apply_grad(grad, args) Update the variables to take a single optimization step. compute_grad(objective_fn, args, kwargs) Approximate the gradient of the objective function at the given point. step(objective_fn, *args, **kwargs) Update trainable arguments with one step of the optimizer. step_and_cost(objective_fn, *args, **kwargs) Update the parameter array $$\hat{\theta}_k$$ with one step of the optimizer and return the step and the corresponding objective function.
apply_grad(grad, args)[source]

Update the variables to take a single optimization step.

Parameters
• grad (tuple [array]) – the gradient approximation of the objective function at point $$\hat{\theta}_{k}$$

• args (tuple) – the current value of the variables $$\hat{\theta}_{k}$$

Returns

the new values $$\hat{\theta}_{k+1}$$

Return type

list [array]

compute_grad(objective_fn, args, kwargs)[source]

Approximate the gradient of the objective function at the given point.

Parameters
• objective_fn (function) – The objective function for optimization

• args (tuple) – tuple of NumPy array containing the current parameters for objective function

• kwargs (dict) – keyword arguments for the objective function

Returns

$$\hat{g}_k(\hat{\theta}_k)$$

Return type

tuple (array)

step(objective_fn, *args, **kwargs)[source]

Update trainable arguments with one step of the optimizer. The number of steps is being counted through calls to step_and_cost and cost.

Parameters
• objective_fn (function) – the objective function for optimization

• *args – variable length argument array for objective function

• **kwargs – variable length of keyword arguments for the objective function

Returns

the new variable values $$\hat{\theta}_{k+1}$$.

Return type

list [array]

step_and_cost(objective_fn, *args, **kwargs)[source]

Update the parameter array $$\hat{\theta}_k$$ with one step of the optimizer and return the step and the corresponding objective function. The number of steps stored by the k attribute of the optimizer is counted internally when calling step_and_cost and cost.

Parameters
• objective_fn (function) – the objective function for optimization

• *args – variable length argument array for objective function

• **kwargs – variable length of keyword arguments for the objective function

Returns

the new variable values $$\hat{\theta}_{k+1}$$ and the objective function output prior to the step.

Return type

tuple[list [array], float]