InvSqrtDecay
- class opennmt.schedules.InvSqrtDecay(learning_rate, warmup_steps, initial_learning_rate=0)
Decay based on the reciprocal of the step square root. This corresponds to inverse_sqrt in Fairseq and --lr-decay-inv-sqrt in Marian.

During warmup (linear increase of the learning rate):
\[\text{schedule}(\text{step}) = \text{init_lr} + (\text{lr} - \text{init_lr}) \times \frac{\text{step}}{\text{warmup_steps}}\]

After warmup:

\[\text{schedule}(\text{step}) = \text{lr} \times \sqrt{\frac{\text{warmup_steps}}{\text{step}}}\]

See also
Inherits from:
keras.src.optimizers.schedules.learning_rate_schedule.LearningRateSchedule
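
The warmup and decay formulas above can be checked by hand with a small standalone function. This is only a sketch of the documented math, not the library's implementation: the function name `inv_sqrt_decay` is hypothetical, and it assumes `warmup_steps` is at least 1 and that the warmup formula applies while `step < warmup_steps` (both formulas give the same value at `step == warmup_steps`).

```python
import math


def inv_sqrt_decay(step, learning_rate, warmup_steps, initial_learning_rate=0.0):
    """Hypothetical re-implementation of the two documented formulas.

    Assumes warmup_steps >= 1 and step >= 0.
    """
    if step < warmup_steps:
        # Linear warmup from initial_learning_rate up to learning_rate.
        fraction = step / warmup_steps
        return initial_learning_rate + (learning_rate - initial_learning_rate) * fraction
    # After warmup: scale by sqrt(warmup_steps / step).
    return learning_rate * math.sqrt(warmup_steps / step)


if __name__ == "__main__":
    # Example values chosen for illustration only.
    for step in (0, 2000, 4000, 16000):
        print(step, inv_sqrt_decay(step, learning_rate=1e-3, warmup_steps=4000))
```

With these illustrative values, the rate rises linearly to `learning_rate` at step 4000 and then falls to half of it at step 16000, since the square root of 4000/16000 is 0.5.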