lr_schedule

Classes

`BaseLRSchedule`
`LinearLRSchedule`	Linear learning rate decay from initial_value to final_value.
`CosineAnnealingLRSchedule`	Cosine annealing learning rate schedule.
`ExponentialLRSchedule`	Exponential learning rate decay.
`CosineWarmupLRSchedule`	Cosine annealing with linear warmup.
`PolynomialLRSchedule`	Polynomial learning rate decay.

Module Contents

class lr_schedule.BaseLRSchedule(cfg)

Parameters:: cfg (dict | float)

classmethod create(cfg)

Parameters:: cfg (dict | float)
Return type:: BaseLRSchedule

func(progress_remaining)

Get the learning rate.

Parameters:: progress_remaining (float)
Return type:: float

__call__(progress_remaining)

Parameters:: progress_remaining (float)
Return type:: float
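Every schedule in this module takes progress_remaining, which the method descriptions in this module give as 1.0 at start and 0.0 at end. The following is a minimal sketch of how a training loop might derive that value from step counts. The helper name progress_remaining_at and its step arguments are hypothetical and are not part of this module.

```python
def progress_remaining_at(step: int, total_steps: int) -> float:
    """Fraction of training left: 1.0 at step 0, 0.0 at total_steps."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so that steps past the planned end do not go negative.
    return max(0.0, 1.0 - step / total_steps)


if __name__ == "__main__":
    for step in (0, 250, 500, 1000):
        print(step, progress_remaining_at(step, 1000))
```

The resulting float is what would be passed to a schedule object, for example schedule(progress_remaining_at(step, total_steps)), assuming the caller tracks step counts itself.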

class lr_schedule.LinearLRSchedule(cfg)

Bases: BaseLRSchedule

Linear learning rate decay from initial_value to final_value. Standard linear annealing schedule.

Parameters:: cfg (dict | float)

func(progress_remaining)

Get the current learning rate depending on remaining progress. progress_remaining is 1.0 at start and 0.0 at end. Returns the learning rate.

Parameters:: progress_remaining (float)
Return type:: float
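The linear decay described above can be sketched as a standalone function. This is an illustration under assumptions, not the class's actual implementation: the function name linear_lr is hypothetical, and initial_value and final_value are passed as plain arguments because the layout of cfg is not documented here.

```python
def linear_lr(progress_remaining: float,
              initial_value: float,
              final_value: float) -> float:
    """Linear interpolation: initial_value at 1.0, final_value at 0.0."""
    return final_value + (initial_value - final_value) * progress_remaining


if __name__ == "__main__":
    for p in (1.0, 0.5, 0.0):
        print(p, linear_lr(p, initial_value=3e-4, final_value=1e-5))
```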

class lr_schedule.CosineAnnealingLRSchedule(cfg)

Bases: BaseLRSchedule

Cosine annealing learning rate schedule. Smooth decay following a cosine curve: the learning rate changes slowly near the start and end and fastest in the middle, which often works better than linear decay. Popular in modern deep learning (e.g., ResNet, Transformers).

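Formula: lr = final_lr + 0.5 * (initial_lr - final_lr) * (1 + cos(π * progress)), where progress = 1 - progress_remaining (0.0 at start, 1.0 at end), so that lr equals initial_lr at the start and final_lr at the end.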

Parameters:: cfg (dict | float)

func(progress_remaining)

Cosine annealing from initial_value to final_value. progress_remaining is 1.0 at start and 0.0 at end. Returns the learning rate.

Parameters:: progress_remaining (float)
Return type:: float
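A minimal sketch of the cosine formula given for this class. It assumes that the formula's progress term is elapsed progress, i.e. 1 - progress_remaining, which is the reading under which the schedule runs from the initial to the final value. The function name cosine_annealing_lr is hypothetical and the rates are passed directly rather than through cfg.

```python
import math


def cosine_annealing_lr(progress_remaining: float,
                        initial_lr: float,
                        final_lr: float) -> float:
    """Cosine annealing: initial_lr at 1.0 remaining, final_lr at 0.0."""
    # Assumption: the formula's "progress" is the elapsed fraction.
    progress = 1.0 - progress_remaining
    return final_lr + 0.5 * (initial_lr - final_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for p in (1.0, 0.75, 0.5, 0.25, 0.0):
        print(p, cosine_annealing_lr(p, initial_lr=3e-4, final_lr=1e-5))
```

At the halfway point the cosine term is zero, so the rate is the midpoint of initial_lr and final_lr, the same as linear decay would give there.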

class lr_schedule.ExponentialLRSchedule(cfg)

Bases: BaseLRSchedule

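Exponential learning rate decay. For 0 < decay_rate < 1, the learning rate drops faster early and slower later.

Formula: lr = initial_lr * decay_rate^progress, where progress = 1 - progress_remaining (0.0 at start, 1.0 at end).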

Parameters:: cfg (dict | float)

decay_rate

func(progress_remaining)

Exponential decay from initial_value to initial_value * decay_rate. progress_remaining is 1.0 at start and 0.0 at end. Returns the learning rate.

Parameters:: progress_remaining (float)
Return type:: float
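The exponential formula for this class can be sketched as follows, assuming that its progress term means elapsed progress (1 - progress_remaining), which matches the stated end value of initial_value * decay_rate. The function name exponential_lr is hypothetical and the values are passed directly rather than read from cfg.

```python
def exponential_lr(progress_remaining: float,
                   initial_lr: float,
                   decay_rate: float) -> float:
    """Exponential decay: initial_lr at start, initial_lr * decay_rate at end."""
    # Assumption: the formula's "progress" is the elapsed fraction.
    progress = 1.0 - progress_remaining
    return initial_lr * decay_rate ** progress


if __name__ == "__main__":
    for p in (1.0, 0.5, 0.0):
        print(p, exponential_lr(p, initial_lr=1e-3, decay_rate=0.25))
```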

class lr_schedule.CosineWarmupLRSchedule(cfg)

Bases: BaseLRSchedule

Cosine annealing with linear warmup. Starts from a small learning rate, increases linearly to initial_value during warmup, then follows cosine annealing down to final_value.

Widely used in Transformer training (e.g., GPT-style models).

Parameters:: cfg (dict | float)

func(progress_remaining)

Linear warmup followed by cosine annealing. progress_remaining is 1.0 at start and 0.0 at end. Returns the learning rate.

Parameters:: progress_remaining (float)
Return type:: float
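A sketch of warmup followed by cosine annealing, under stated assumptions. This page does not document how long the warmup lasts or what the starting rate is, so warmup_fraction and warmup_start_value below are hypothetical parameters chosen for illustration, as is the function name cosine_warmup_lr.

```python
import math


def cosine_warmup_lr(progress_remaining: float,
                     initial_value: float,
                     final_value: float,
                     warmup_fraction: float = 0.1,      # hypothetical parameter
                     warmup_start_value: float = 0.0    # hypothetical parameter
                     ) -> float:
    """Linear warmup to initial_value, then cosine annealing to final_value."""
    if not 0.0 <= warmup_fraction < 1.0:
        raise ValueError("warmup_fraction must be in [0, 1)")
    progress = 1.0 - progress_remaining  # elapsed fraction of training
    if progress < warmup_fraction:
        # Warmup phase: ramp linearly from warmup_start_value to initial_value.
        ramp = progress / warmup_fraction
        return warmup_start_value + (initial_value - warmup_start_value) * ramp
    # Decay phase: rescale the post-warmup span to [0, 1] and apply cosine.
    decay_progress = (progress - warmup_fraction) / (1.0 - warmup_fraction)
    return final_value + 0.5 * (initial_value - final_value) * (
        1.0 + math.cos(math.pi * decay_progress)
    )


if __name__ == "__main__":
    for p in (1.0, 0.95, 0.9, 0.5, 0.0):
        print(p, cosine_warmup_lr(p, initial_value=1e-3, final_value=1e-5))
```

Rescaling the decay phase to its own [0, 1] range means the cosine curve starts exactly at initial_value when warmup ends, so the two phases join without a jump.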

class lr_schedule.PolynomialLRSchedule(cfg)

Bases: BaseLRSchedule

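Polynomial learning rate decay. The power parameter controls the shape of the curve: power = 1 reduces to linear decay, while power > 1 lowers the learning rate faster early in training and flattens out near final_lr.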

Formula: lr = (initial_lr - final_lr) * (progress_remaining**power) + final_lr

Parameters:: cfg (dict | float)

func(progress_remaining)

Polynomial decay from initial_value to final_value. progress_remaining is 1.0 at start and 0.0 at end. Returns the learning rate.

Parameters:: progress_remaining (float)
Return type:: float
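The polynomial formula for this class translates directly into a short function. This is a sketch only: the function name polynomial_lr and the default power of 2.0 are hypothetical, and the values are passed as arguments rather than read from cfg.

```python
def polynomial_lr(progress_remaining: float,
                  initial_lr: float,
                  final_lr: float,
                  power: float = 2.0) -> float:  # default power is an assumption
    """Polynomial decay: initial_lr at 1.0 remaining, final_lr at 0.0."""
    return (initial_lr - final_lr) * (progress_remaining ** power) + final_lr


if __name__ == "__main__":
    for power in (1.0, 2.0, 3.0):
        print(power, [polynomial_lr(p, 1e-3, 1e-5, power) for p in (1.0, 0.5, 0.0)])
```

With power set to 1.0 the expression is identical to linear interpolation between final_lr and initial_lr.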