trw.train.optimizers_v2¶
Module Contents¶
Classes¶
- CosineAnnealingWarmRestartsDecayed: Scheduler based on torch.optim.lr_scheduler.CosineAnnealingWarmRestarts.
- Optimizer
- OptimizerSGD
Attributes¶
- trw.train.optimizers_v2.SchedulerType¶
- trw.train.optimizers_v2.StepSchedulerType¶
- class trw.train.optimizers_v2.CosineAnnealingWarmRestartsDecayed(optimizer: torch.optim.Optimizer, T_0: int, T_mult: int = 1, eta_min: float = 0, last_epoch: int = -1, decay_factor: float = 0.7)¶
Bases:
torch.optim.lr_scheduler.CosineAnnealingWarmRestarts
Scheduler based on torch.optim.lr_scheduler.CosineAnnealingWarmRestarts. In addition, every time the learning rate is restarted, the base learning rate is decayed by decay_factor.
- step(self, epoch=None)¶
Step could be called after every batch update
Example
>>> scheduler = CosineAnnealingWarmRestarts(optimizer, T_0, T_mult)
>>> iters = len(dataloader)
>>> for epoch in range(20):
>>>     for i, sample in enumerate(dataloader):
>>>         inputs, labels = sample['inputs'], sample['labels']
>>>         optimizer.zero_grad()
>>>         outputs = net(inputs)
>>>         loss = criterion(outputs, labels)
>>>         loss.backward()
>>>         optimizer.step()
>>>         scheduler.step(epoch + i / iters)
This function can be called in an interleaved way.
Example
>>> scheduler = CosineAnnealingWarmRestarts(optimizer, T_0, T_mult)
>>> for epoch in range(20):
>>>     scheduler.step()
>>> scheduler.step(26)
>>> scheduler.step()  # scheduler.step(27), instead of scheduler(20)
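The decayed subclass is stepped the same way. A rough sketch (the optimizer, net, dataloader and train_one_epoch names below are placeholders, not part of this module): with decay_factor=0.7 the base learning rate is expected to shrink at each restart, e.g. 0.1 → 0.07 → 0.049.
>>> scheduler = CosineAnnealingWarmRestartsDecayed(optimizer, T_0=10, T_mult=2, decay_factor=0.7)
>>> for epoch in range(70):
>>>     train_one_epoch(net, dataloader, optimizer)  # placeholder for one training epoch
>>>     scheduler.step()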
- class trw.train.optimizers_v2.Optimizer(optimizer_fn: Callable[[Iterator[torch.nn.parameter.Parameter]], torch.optim.Optimizer], scheduler_fn: Optional[Callable[[torch.optim.Optimizer], SchedulerType]] = None, step_scheduler_fn: Optional[Callable[[torch.optim.Optimizer], StepSchedulerType]] = None)¶
- set_scheduler_fn(self, scheduler_fn: Optional[Callable[[torch.optim.Optimizer], SchedulerType]])¶
- set_step_scheduler_fn(self, step_scheduler_fn: Optional[Callable[[torch.optim.Optimizer], StepSchedulerType]])¶
- __call__(self, datasets: trw.basic_typing.Datasets, model: torch.nn.Module) → Tuple[Dict[str, torch.optim.Optimizer], Optional[Dict[str, SchedulerType]], Optional[Dict[str, StepSchedulerType]]]¶
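A minimal construction sketch (the torch callables are standard; datasets and model are assumed to be supplied by the surrounding trw training code):
>>> import functools
>>> import torch
>>> from trw.train.optimizers_v2 import Optimizer
>>> optimizer = Optimizer(optimizer_fn=functools.partial(torch.optim.Adam, lr=1e-3))
>>> optimizer.set_scheduler_fn(functools.partial(torch.optim.lr_scheduler.StepLR, step_size=30))
>>> # returns dictionaries of optimizers and (optionally) schedulers, presumably keyed by dataset name
>>> optimizers, schedulers, per_step_schedulers = optimizer(datasets, model)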
- scheduler_step_lr(self, step_size: int, gamma: float = 0.1) → Optimizer¶
Apply a scheduler on the learning rate.
Decays the learning rate of each parameter group by gamma every step_size epochs.
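For example (a sketch with arbitrary values; OptimizerSGD is documented below), the call can be chained because the method returns the Optimizer:
>>> optimizer = OptimizerSGD(learning_rate=0.1).scheduler_step_lr(step_size=30, gamma=0.5)  # halve the learning rate every 30 epochs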
- scheduler_cosine_annealing_warm_restart(self, T_0: int, T_mult: int = 1, eta_min: float = 0, last_epoch=-1) → Optimizer¶
Apply a scheduler on the learning rate.
Restart the learning rate every T_0 * (T_mult)^(#restart) epochs.
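For example, with T_0=10 and T_mult=2 the successive cycle lengths are 10, 20 and 40 epochs, so restarts occur at epochs 10, 30 and 70.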
- scheduler_cosine_annealing_warm_restart_decayed(self, T_0: int, T_mult: int = 1, eta_min: float = 0, last_epoch=-1, decay_factor=0.7) → Optimizer¶
Apply a scheduler on the learning rate. Each time the learning rate is restarted, the base learning rate is decayed by decay_factor.
Restart the learning rate every T_0 * (T_mult)^(#restart) epochs.
- scheduler_one_cycle(self, max_learning_rate: float, epochs: int, steps_per_epoch: int, learning_rate_start_div_factor: float = 25.0, learning_rate_end_div_factor: float = 10000.0, percentage_cycle_increase: float = 0.3, anneal_strategy: str = 'cos', cycle_momentum: bool = True, base_momentum: float = 0.85, max_momentum: float = 0.95)¶
This scheduler should not be used with another scheduler!
The learning rate or momentum provided by the Optimizer will be overridden by this scheduler.
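A short sketch with arbitrary values (dataloader is a placeholder; steps_per_epoch is typically the number of batches per epoch):
>>> optimizer = OptimizerSGD(learning_rate=0.1)
>>> optimizer.scheduler_one_cycle(max_learning_rate=0.1, epochs=20, steps_per_epoch=len(dataloader))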
- clip_gradient_norm(self, max_norm: float = 1.0, norm_type: float = 2.0)¶
Clips the gradient norm during optimization
- Parameters
max_norm – the maximum norm of the concatenated gradients of the optimizer. Note: the gradient is modulated by the learning rate
norm_type – type of the used p-norm. Can be 'inf' for infinity norm.
- See: torch.nn.utils.clip_grad_norm_()
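A short sketch (values are arbitrary):
>>> optimizer = OptimizerSGD(learning_rate=0.01)
>>> optimizer.clip_gradient_norm(max_norm=1.0)  # clip the total gradient 2-norm to 1.0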
- class trw.train.optimizers_v2.OptimizerSGD(learning_rate: float, momentum: float = 0.9, weight_decay: float = 0, nesterov: bool = False)¶
Bases:
Optimizer
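Since OptimizerSGD derives from Optimizer, the scheduler and gradient clipping helpers above apply to it directly; a short sketch with arbitrary values:
>>> optimizer = OptimizerSGD(learning_rate=0.1, momentum=0.9, weight_decay=1e-4, nesterov=True)
>>> optimizer = optimizer.scheduler_cosine_annealing_warm_restart_decayed(T_0=10, T_mult=2, decay_factor=0.7)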