lr_schedule

Classes

BaseLRSchedule

LinearLRSchedule

Linear learning rate decay from initial_value to final_value.

CosineAnnealingLRSchedule

Cosine annealing learning rate schedule.

ExponentialLRSchedule

Exponential learning rate decay.

CosineWarmupLRSchedule

Cosine annealing with linear warmup.

PolynomialLRSchedule

Polynomial learning rate decay.

Module Contents

class lr_schedule.BaseLRSchedule(cfg)
Parameters:

cfg (dict | float)

classmethod create(cfg)
Parameters:

cfg (dict | float)

Return type:

BaseLRSchedule

func(progress_remaining)

Get the learning rate for the given remaining progress.

Parameters:

progress_remaining (float)

Return type:

float

__call__(progress_remaining)
Parameters:

progress_remaining (float)

Return type:

float
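A minimal sketch of how the interface documented above might fit together. It assumes that __call__ simply delegates to func and that the training loop computes progress_remaining as 1 - step / total_steps; neither is confirmed by this page. SketchSchedule, ConstantSchedule and progress_remaining_at are hypothetical stand-ins, not part of lr_schedule.

```python
class SketchSchedule:
    """Hypothetical stand-in mirroring the documented func/__call__ interface."""

    def func(self, progress_remaining: float) -> float:
        raise NotImplementedError

    def __call__(self, progress_remaining: float) -> float:
        # Assumption: calling the schedule delegates to func().
        return self.func(progress_remaining)


class ConstantSchedule(SketchSchedule):
    """Hypothetical schedule that ignores progress and returns a fixed value."""

    def __init__(self, value: float) -> None:
        self.value = float(value)

    def func(self, progress_remaining: float) -> float:
        return self.value


def progress_remaining_at(step: int, total_steps: int) -> float:
    # 1.0 at step 0, 0.0 once total_steps have elapsed.
    return 1.0 - step / total_steps


schedule = ConstantSchedule(3e-4)
for step in (0, 500, 1000):
    lr = schedule(progress_remaining_at(step, total_steps=1000))
```

Subclasses would then only need to override func, which matches how the schedule classes below are documented.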

class lr_schedule.LinearLRSchedule(cfg)

Bases: BaseLRSchedule

Linear learning rate decay from initial_value to final_value. Standard linear annealing schedule.

Parameters:

cfg (dict | float)

func(progress_remaining)

Get the current learning rate for the given remaining progress. progress_remaining is 1.0 at the start and 0.0 at the end.

Parameters:

progress_remaining (float)

Return type:

float
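This page gives no formula for the linear schedule. One plausible reading, sketched here as a standalone function, is plain linear interpolation between the two endpoints. The name linear_lr and its keyword arguments are hypothetical; the real class reads its values from cfg.

```python
def linear_lr(progress_remaining: float,
              initial_value: float,
              final_value: float) -> float:
    # Assumed formula: interpolate from final_value (progress_remaining = 0.0)
    # up to initial_value (progress_remaining = 1.0).
    return final_value + progress_remaining * (initial_value - final_value)


start = linear_lr(1.0, initial_value=1.0, final_value=0.0)
middle = linear_lr(0.5, initial_value=1.0, final_value=0.0)
end = linear_lr(0.0, initial_value=1.0, final_value=0.0)
```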

class lr_schedule.CosineAnnealingLRSchedule(cfg)

Bases: BaseLRSchedule

Cosine annealing learning rate schedule. The learning rate decays smoothly along a cosine curve, changing slowly near the start and end and fastest in the middle. Often works better than linear decay and is popular in modern deep learning (e.g., ResNet, Transformers).

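Formula: lr = final_lr + 0.5 * (initial_lr - final_lr) * (1 + cos(π * progress)), where progress = 1 - progress_remaining, so lr equals initial_lr at the start and final_lr at the end.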

Parameters:

cfg (dict | float)

func(progress_remaining)

Cosine annealing from initial_value to final_value. progress_remaining is 1.0 at the start and 0.0 at the end.

Parameters:

progress_remaining (float)

Return type:

float
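The cosine formula for this class can be sketched as a standalone function. The sketch assumes that progress in the formula means the elapsed fraction, 1 - progress_remaining, which is the reading that makes the curve start at the initial value. The name cosine_lr is hypothetical.

```python
import math


def cosine_lr(progress_remaining: float,
              initial_lr: float,
              final_lr: float) -> float:
    # Assumption: "progress" in the formula is the elapsed fraction of training.
    progress = 1.0 - progress_remaining
    return final_lr + 0.5 * (initial_lr - final_lr) * (1.0 + math.cos(math.pi * progress))


start = cosine_lr(1.0, initial_lr=1.0, final_lr=0.0)
middle = cosine_lr(0.5, initial_lr=1.0, final_lr=0.0)
end = cosine_lr(0.0, initial_lr=1.0, final_lr=0.0)
```

Under this assumption the schedule returns initial_lr at the start, the midpoint of the two values halfway through, and final_lr at the end.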

class lr_schedule.ExponentialLRSchedule(cfg)

Bases: BaseLRSchedule

Exponential learning rate decay. For decay_rate < 1, the learning rate drops fastest early and more slowly later.

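Formula: lr = initial_lr * decay_rate^progress, where progress = 1 - progress_remaining, so lr equals initial_lr at the start and initial_lr * decay_rate at the end.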

Parameters:

cfg (dict | float)

decay_rate
func(progress_remaining)

Exponential decay from initial_value to initial_value * decay_rate. progress_remaining is 1.0 at the start and 0.0 at the end.

Parameters:

progress_remaining (float)

Return type:

float
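A minimal sketch of the exponential formula as a standalone function. It assumes the exponent is the elapsed fraction, 1 - progress_remaining, which is consistent with the documented endpoints (initial_value at the start, initial_value * decay_rate at the end). The name exponential_lr is hypothetical.

```python
def exponential_lr(progress_remaining: float,
                   initial_lr: float,
                   decay_rate: float) -> float:
    # Assumption: the exponent is the elapsed fraction of training.
    progress = 1.0 - progress_remaining
    return initial_lr * decay_rate ** progress


start = exponential_lr(1.0, initial_lr=1.0, decay_rate=0.25)
middle = exponential_lr(0.5, initial_lr=1.0, decay_rate=0.25)
end = exponential_lr(0.0, initial_lr=1.0, decay_rate=0.25)
```

Because the decay is geometric, the first half of the run removes more learning rate in absolute terms than the second half.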

class lr_schedule.CosineWarmupLRSchedule(cfg)

Bases: BaseLRSchedule

Cosine annealing with linear warmup. The learning rate starts small, increases linearly to initial_value during warmup, then anneals to final_value along a cosine curve.

Widely used in Transformer training (e.g., BERT, GPT).

Parameters:

cfg (dict | float)

func(progress_remaining)

Linear warmup followed by cosine annealing. progress_remaining is 1.0 at the start and 0.0 at the end.

Parameters:

progress_remaining (float)

Return type:

float
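This page does not say how the warmup length or the starting learning rate are configured, so the following is only a sketch of the general technique. The parameters warmup_fraction and warmup_start_lr, their defaults, and the name cosine_warmup_lr are all hypothetical.

```python
import math


def cosine_warmup_lr(progress_remaining: float,
                     initial_value: float,
                     final_value: float,
                     warmup_fraction: float = 0.1,   # hypothetical parameter
                     warmup_start_lr: float = 0.0    # hypothetical parameter
                     ) -> float:
    # Requires 0 <= warmup_fraction < 1.
    progress = 1.0 - progress_remaining  # elapsed fraction of training
    if progress < warmup_fraction:
        # Linear ramp from warmup_start_lr up to initial_value.
        scale = progress / warmup_fraction
        return warmup_start_lr + scale * (initial_value - warmup_start_lr)
    # Rescale the post-warmup part of training to [0, 1] and apply cosine annealing.
    cosine_progress = (progress - warmup_fraction) / (1.0 - warmup_fraction)
    return final_value + 0.5 * (initial_value - final_value) * (
        1.0 + math.cos(math.pi * cosine_progress)
    )


first = cosine_warmup_lr(1.0, 1.0, 0.0, warmup_fraction=0.5)
mid_warmup = cosine_warmup_lr(0.75, 1.0, 0.0, warmup_fraction=0.5)
peak = cosine_warmup_lr(0.5, 1.0, 0.0, warmup_fraction=0.5)
last = cosine_warmup_lr(0.0, 1.0, 0.0, warmup_fraction=0.5)
```

With these choices the peak learning rate is reached exactly when warmup ends, and the cosine phase then covers the rest of the run.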

class lr_schedule.PolynomialLRSchedule(cfg)

Bases: BaseLRSchedule

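Polynomial learning rate decay. The power parameter controls the shape of the curve: power = 1 reduces to linear decay, while power > 1 lowers the learning rate faster early and flattens out near final_lr.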

Formula: lr = (initial_lr - final_lr) * (progress_remaining**power) + final_lr

Parameters:

cfg (dict | float)

func(progress_remaining)

Polynomial decay from initial_value to final_value. progress_remaining is 1.0 at the start and 0.0 at the end.

Parameters:

progress_remaining (float)

Return type:

float
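The polynomial formula for this class can be sketched directly as a standalone function. The name polynomial_lr and the default power=2.0 are hypothetical; how the real class reads power from cfg is not documented here.

```python
def polynomial_lr(progress_remaining: float,
                  initial_lr: float,
                  final_lr: float,
                  power: float = 2.0) -> float:  # default is an assumption
    # Direct transcription of the documented formula.
    return (initial_lr - final_lr) * (progress_remaining ** power) + final_lr


start = polynomial_lr(1.0, initial_lr=1.0, final_lr=0.0)
quadratic_mid = polynomial_lr(0.5, initial_lr=1.0, final_lr=0.0, power=2.0)
linear_mid = polynomial_lr(0.5, initial_lr=1.0, final_lr=0.0, power=1.0)
end = polynomial_lr(0.0, initial_lr=1.0, final_lr=0.0)
```

Comparing the two midpoints shows the effect of power: with power = 1 the schedule is exactly linear, and with power = 2 the learning rate is already lower at the halfway point.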