InvSqrtDecay

class opennmt.schedules.InvSqrtDecay(learning_rate, warmup_steps, initial_learning_rate=0)[source]

Decay based on the reciprocal of the step square root. This corresponds to inverse_sqrt in Fairseq and --lr-decay-inv-sqrt in Marian.

During warmup (linear increase of the learning rate):

schedule(step) = init\_lr + (lr - init\_lr) \times \frac{step}{warmup\_steps}

After warmup:

schedule(step) = lr \times \sqrt{\frac{warmup\_steps}{step}}
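
For concreteness, the two phases can be combined into a single piecewise function. The sketch below restates the formulas above in plain Python; it is only illustrative (the helper name compute_inv_sqrt_lr is invented for this example), not the library's implementation:

def compute_inv_sqrt_lr(step, learning_rate, warmup_steps, initial_learning_rate=0.0):
    """Illustrative restatement of the InvSqrtDecay formulas."""
    if step < warmup_steps:
        # Warmup: linear increase from initial_learning_rate to learning_rate.
        return initial_learning_rate + (learning_rate - initial_learning_rate) * step / warmup_steps
    # After warmup: decay with the reciprocal of the square root of the step.
    return learning_rate * (warmup_steps / step) ** 0.5

Note that both branches give learning_rate at step = warmup_steps, so the schedule is continuous at the end of warmup.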

Inherits from: keras.src.optimizers.schedules.learning_rate_schedule.LearningRateSchedule

__init__(learning_rate, warmup_steps, initial_learning_rate=0)[source]

Initializes the decay function.

Parameters
  • learning_rate – The base learning rate.

  • warmup_steps – The number of warmup steps.

  • initial_learning_rate – The initial learning rate at the start of the warmup phase.
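
Usage follows the standard Keras learning rate schedule pattern. A minimal sketch, assuming a TensorFlow 2 environment with OpenNMT-tf installed (the learning rate and warmup values below are arbitrary examples):

import tensorflow as tf
from opennmt import schedules

# Warm up linearly for 4000 steps, then decay with the inverse square root of the step.
schedule = schedules.InvSqrtDecay(
    learning_rate=2e-3,
    warmup_steps=4000,
    initial_learning_rate=1e-7,
)

# As a Keras LearningRateSchedule, it can be passed directly to a Keras optimizer.
optimizer = tf.keras.optimizers.Adam(learning_rate=schedule)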