# opennmt.optimizers.mixed_precision_wrapper module¶

Wrapper that maintains and updates a float32 copy of the weights.

class opennmt.optimizers.mixed_precision_wrapper.MixedPrecisionOptimizerWrapper(optimizer, loss_scale=None)[source]

Bases: tensorflow.python.training.optimizer.Optimizer

compute_gradients(loss, var_list=None, gate_gradients=1, aggregation_method=None, colocate_gradients_with_ops=False, grad_loss=None)[source]

Compute gradients of loss for the variables in var_list.

This is the first part of minimize(). It returns a list of (gradient, variable) pairs where “gradient” is the gradient for “variable”. Note that “gradient” can be a Tensor, an IndexedSlices, or None if there is no gradient for the given variable.

Parameters:

- loss – A Tensor containing the value to minimize, or a callable taking no arguments which returns the value to minimize. When eager execution is enabled it must be a callable.
- var_list – Optional list or tuple of tf.Variable to update to minimize loss. Defaults to the list of variables collected in the graph under the key GraphKeys.TRAINABLE_VARIABLES.
- gate_gradients – How to gate the computation of gradients. Can be GATE_NONE, GATE_OP, or GATE_GRAPH.
- aggregation_method – Specifies the method used to combine gradient terms. Valid values are defined in the class AggregationMethod.
- colocate_gradients_with_ops – If True, try colocating gradients with the corresponding op.
- grad_loss – Optional. A Tensor holding the gradient computed for loss.

Returns: A list of (gradient, variable) pairs. Variable is always present, but gradient can be None.

Raises:

- TypeError – If var_list contains anything other than Variable objects.
- ValueError – If some arguments are invalid.
- RuntimeError – If called with eager execution enabled and loss is not callable.

Eager compatibility: when eager execution is enabled, gate_gradients, aggregation_method, and colocate_gradients_with_ops are ignored.
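To make the compute-then-apply flow concrete, here is a minimal pure-Python sketch of what a mixed-precision update amounts to: gradients are computed on a loss that was multiplied by a loss scale, then unscaled and applied to the float32 master copy of the weights. The function name and signature are illustrative, not the library's API.

```python
def mixed_precision_step(master_weights, scaled_grads, loss_scale, lr):
    """Sketch of one optimizer step on float32 master weights.

    scaled_grads are assumed to have been computed on loss * loss_scale,
    so each gradient is divided by loss_scale before the update.
    """
    updated = []
    for w, g in zip(master_weights, scaled_grads):
        updated.append(w - lr * (g / loss_scale))
    return updated
```

After the step, the updated float32 weights would be cast back to float16 for the next forward pass; the scaling exists only to keep small float16 gradients from flushing to zero.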

apply_gradients(grads_and_vars, global_step=None, name=None)[source]

Apply gradients to variables. This is the second part of minimize(). It returns an Operation that applies gradients.

Parameters:

- grads_and_vars – List of (gradient, variable) pairs as returned by compute_gradients().
- global_step – Optional Variable to increment by one after the variables have been updated.
- name – Optional name for the returned operation. Defaults to the name passed to the Optimizer constructor.

Returns: An Operation that applies the specified gradients. If global_step was not None, that operation also increments global_step.

Raises:

- TypeError – If grads_and_vars is malformed.
- ValueError – If none of the variables have gradients.
- RuntimeError – If you should use _distributed_apply() instead.
class opennmt.optimizers.mixed_precision_wrapper.AutomaticLossScaler(algorithm='Backoff', params=None)[source]

Bases: object

SUPPORTED_ALGOS = ['backoff', 'logmax']
update_op(has_nan, amax)[source]
loss_scale
static check_grads(grads_and_vars)[source]
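The automatic scaler needs two signals from each step: whether any gradient overflowed to NaN/inf, and the largest finite gradient magnitude. A hypothetical pure-Python analogue of check_grads (operating on plain lists rather than tensors) looks like this:

```python
import math

def check_grads(grads_and_vars):
    """Return (has_nan, amax): whether any gradient value is non-finite,
    and the largest absolute finite gradient value seen.

    Pure-Python sketch; gradients here are lists of floats, and None
    gradients (variables with no gradient) are skipped.
    """
    has_nan = False
    amax = 0.0
    for grad, _var in grads_and_vars:
        if grad is None:
            continue
        for g in grad:
            if not math.isfinite(g):
                has_nan = True
            else:
                amax = max(amax, abs(g))
    return has_nan, amax
```

These two values feed update_op(has_nan, amax): the backoff algorithm only consumes the overflow flag, while a log-max style algorithm also uses the gradient magnitude.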
class opennmt.optimizers.mixed_precision_wrapper.BackoffScaler(params)[source]

Bases: object

update_op(has_nan, amax)[source]
loss_scale
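The backoff strategy is simple to state: halve the loss scale whenever an overflow is detected (and skip that update), and double it again after a run of clean steps. A toy sketch of that logic, with illustrative parameter names and defaults that are not the library's:

```python
class BackoffScalerSketch:
    """Toy backoff loss scaling: halve the scale on overflow, double it
    after step_window consecutive clean steps. Bounds and defaults here
    are assumptions for illustration only.
    """

    def __init__(self, scale=2.0 ** 15, scale_min=1.0,
                 scale_max=2.0 ** 24, step_window=2000):
        self.loss_scale = scale
        self.scale_min = scale_min
        self.scale_max = scale_max
        self.step_window = step_window
        self.good_steps = 0

    def update(self, has_overflow):
        if has_overflow:
            # Back off: halve the scale and restart the clean-step count.
            self.loss_scale = max(self.scale_min, self.loss_scale / 2.0)
            self.good_steps = 0
        else:
            self.good_steps += 1
            if self.good_steps >= self.step_window:
                # Stable for a while: try a larger scale.
                self.loss_scale = min(self.scale_max, self.loss_scale * 2.0)
                self.good_steps = 0
```

The clamping to [scale_min, scale_max] keeps the scale from collapsing to zero during a burst of overflows or growing without bound during long stable stretches.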
class opennmt.optimizers.mixed_precision_wrapper.LogMaxScaler(params)[source]

Bases: object

update_op(has_nan, amax)[source]
loss_scale
opennmt.optimizers.mixed_precision_wrapper.get_loss_scale_from_params(params)[source]

Returns the loss scale argument from user parameters.

Parameters: params – A dictionary containing the user parameters.

Returns: A value that can be passed to the opennmt.optimizers.mixed_precision_wrapper.MixedPrecisionOptimizerWrapper loss_scale constructor argument.
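As a rough illustration of this kind of helper, the sketch below resolves a loss-scale setting from a user parameter dictionary: a numeric value means a fixed scale, while a string would name an automatic scaling algorithm. The key name, default, and return convention are assumptions for illustration, not the library's actual behavior.

```python
def get_loss_scale_sketch(params):
    """Hypothetical resolution of a loss-scale user parameter.

    Assumed convention: params may hold a "loss_scale" entry that is
    either a number (fixed scale) or an algorithm name; absent entries
    fall back to an automatic algorithm.
    """
    value = params.get("loss_scale", "backoff")
    if isinstance(value, (int, float)):
        return float(value)  # fixed loss scale
    return value  # algorithm name, e.g. for an automatic scaler
```

The resolved value could then be forwarded to the wrapper's loss_scale constructor argument.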