Replies: 3 comments 1 reply
-
@tnq177, apologies for the delay. I will take a closer look this week.
-
Thus, you can pass any of the torch lr scheduler objects, or any object that implements the expected API, to `deepspeed.initialize`.
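For concreteness, here is a minimal sketch of the client-side scheduler route; the toy model, the `gamma` value, and the `"ds_config.json"` path are placeholders, not taken from this thread:

```python
import torch
import deepspeed

model = torch.nn.Linear(512, 512)  # toy model, stand-in for the real one
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Any torch.optim.lr_scheduler object can be handed over, e.g. exponential decay:
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.999)

# "ds_config.json" is a placeholder path; if a scheduler is passed here, the
# config should not also define one under "scheduler".
model_engine, optimizer, _, lr_scheduler = deepspeed.initialize(
    model=model,
    optimizer=optimizer,
    lr_scheduler=scheduler,
    config="ds_config.json",
)
```

As I understand it, the engine then steps the scheduler for you on each `model_engine.step()`, so no manual stepping is needed.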
-
@tjruwase sorry, I missed this part in the docs. So apparently I can explicitly manage the lr myself.
Update: I tested it, and it turns out I can just manually adjust the learning rate as usual with PyTorch :) Thanks @tjruwase
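To make the "adjust it as usual" route concrete, here is a small sketch (my own assumption, not quoted from the docs); it presumes no DeepSpeed scheduler is configured, since an active scheduler would overwrite the value on its next step:

```python
def set_lr(optimizer, new_lr):
    """Plain PyTorch-style LR override on the optimizer
    returned by deepspeed.initialize."""
    for param_group in optimizer.param_groups:
        param_group["lr"] = new_lr

# e.g. drop the learning rate at some point in the training loop
set_lr(optimizer, 5e-5)
```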
-
Hi, my code structure is simple, like this:
This is working well for me so far using 8 GPUs on a single machine. Right now, I'm trying to do 2 things:
1. Using DeepSpeed's learning rate scheduler
   If using DeepSpeed's learning rate scheduler, I wonder how I can manually change the learning rate, please? I understand that with PyTorch we can simply go through each entry in `param_groups` and modify the `lr`, but since with DeepSpeed the learning rate scheduler is wrapped within `initialize`, I'm a bit hesitant to do so.
2. Custom learning rate scheduler
   Right now I'm using the Warmup + Decay LR. However, I think its decay only goes down linearly. I want to try exponential decay as well, and potentially in the future I'd like to experiment with other custom learning rate schedulers. How do I implement a custom one, please? (See the sketch below.)
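For example, going by the reply above that any object implementing the expected API can be passed, a warmup + exponential-decay scheduler might look roughly like this; the class name, method set, and decay formula are guesses for illustration, not DeepSpeed code:

```python
class ExponentialWarmupLR:
    """Hypothetical scheduler: linear warmup, then exponential decay.
    Not a DeepSpeed class; the method set mirrors torch/DeepSpeed schedulers."""

    def __init__(self, optimizer, warmup_steps, base_lr, gamma=0.9999):
        self.optimizer = optimizer
        self.warmup_steps = warmup_steps
        self.base_lr = base_lr
        self.gamma = gamma
        self.step_count = 0

    def _compute_lr(self):
        if self.step_count < self.warmup_steps:
            # linear warmup from 0 up to base_lr
            return self.base_lr * (self.step_count + 1) / self.warmup_steps
        # exponential decay after warmup
        return self.base_lr * self.gamma ** (self.step_count - self.warmup_steps)

    def step(self):
        lr = self._compute_lr()
        for param_group in self.optimizer.param_groups:
            param_group["lr"] = lr
        self.step_count += 1

    def get_lr(self):
        return [pg["lr"] for pg in self.optimizer.param_groups]

    def state_dict(self):
        return {"step_count": self.step_count}

    def load_state_dict(self, state_dict):
        self.step_count = state_dict["step_count"]
```

An instance of this would then be passed as `lr_scheduler=...` to `deepspeed.initialize`, the same way as a torch scheduler object.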
Thanks,
T