remove minibatch_size attribute to SAC trainer #655
Open
alexnikulkov wants to merge 631 commits into facebookresearch:main from alexnikulkov:export-D37834043
Conversation
…ter) to github/third-party/PyTorchLightning/pytorch-lightning

Summary:

### Manual
- (ephemeral*) make `ResultCollection._extract_batch_size` a class method
- (ephemeral) commented out the MisconfigurationException in https://fburl.com/diffusion/agbk3mxc
- reagent/gym/tests/test_gym.py: wrap EpisodicDataset with a dataloader before passing it to .fit() to fix the type-checker error

\* ephemeral means that the changes are made in-place in Lightning and will disappear after another sync.

### Automatic

### New commit log messages
cdcc483e CHANGELOG update after v1.3.6 release (#7988)
7978a537 Ipynb update (#8004)
c6e02e48 [feat] Allow overriding optimizer_zero_grad and/or optimizer_step when using accumulate_grad_batches (#7980)
eebdc910 progressive restoring of trainer state (#7652)
3fece17f [feat] Add `{,load_}state_dict` to `ResultCollection` 1/n (#7948)
906de2a7 [feat] Named Parameter Groups in `LearningRateMonitor` (#7987)
5647087f New speed documentation (#7665)
55494e87 Fix Special Tests (#7841)
bc2c2db2 Do not override the logged epoch in `logged_metrics` (#7982)
21342165 Change `WarningCache` to subclass `set` (#7995)
4ffba600 Add predict hook test (#7973)
917cf836 [doc] Add more reference around predict_step (#7997)
d2983c7c [fix] Enable manual optimization DeepSpeed (#7970)
b093a9e6 Support `save_hyperparameters()` in LightningModule dataclass (#7992)
341adad8 Loop Refactor 2/N - Remove Old Training Loop (#7985)
b71aa55b Make optimizers skippable when using amp (#7975)
0004216f Easier configurability of callbacks that should always be present in LightningCLI (#7964)
78a14a3f Add `tpu_spawn_debug` to plugin registry (#7933)
92024df2 Pt 1.9 breaking fix: __iter__ type hint (#7993)
b2e9fa81 Improvements related to save of config file by LightningCLI (#7963)
971908a1 Loop Refactor 1/N - Training Loop (#7871)
560b1970 Standardize positional datamodule and argument names (#7431)
0974d66c Add docs for IPUs (#7923)
024cf23c Remove convert_to_half, suggest using `model.half` (#7974)

Reviewed By: colin2328 Differential Revision: D29203448 fbshipit-source-id: 0e866b869bda06349828ec4fc61af19e4ea21f0e
…research#490) Summary: Pull Request resolved: facebookresearch#490 Fix world model simulation. The previous failure was due to the world model not being loaded properly from the warmstart path. This diff also updates the `prepare_data()` API: `prepare_data()` is now assumed not to return setup data, following PyTorch Lightning's API. Reviewed By: kittipatv Differential Revision: D29157160 fbshipit-source-id: 7d52e12793b8bbc827bb2a14567993a7f63dd54c
…ookresearch#496) Summary: Pull Request resolved: facebookresearch#496 Offline Batch RL runs were failing with an import error, which arose from a missing init.py file. Reviewed By: czxttkl Differential Revision: D29284160 fbshipit-source-id: 4e69941028f5d00bc0ef7dc30049929a9d44c306
Summary: Fixes: pytorch/pytorch#24892 In the paper https://arxiv.org/pdf/1908.03265.pdf, Liyuan Liu et al. suggest a new optimization algorithm similar in essence to the Adam algorithm. The paper discusses how, without a warmup heuristic, adaptive optimization algorithms can produce undesirably large variance in the early stage of training, which can slow overall convergence. The authors propose rectifying the variance of the adaptive learning rate when it is expected to be high. Differing from the paper, we selected a variance tractability cut-off of 5 instead of 4. This adjustment is common practice and can be found in the authors' code repository as well as in the TensorFlow Swift optimizer library: https://github.com/LiyuanLucasLiu/RAdam/blob/2f03dd197022da442c6a15c47321f4335d113a3f/radam/radam.py#L156 https://github.com/tensorflow/swift-apis/blob/f51ee4618d652a2419e998bf9418ad80bda67454/Sources/TensorFlow/Optimizers/MomentumBased.swift#L638 Pull Request resolved: pytorch/pytorch#58968 Reviewed By: gchanan Differential Revision: D29241736 Pulled By: iramazanli fbshipit-source-id: 288b9b1f3125fdc6c7a7bb23fde1ea5c201c0448
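As a rough illustration of the rectification described above (not the actual PyTorch implementation), the variance-tractability check with the cut-off of 5 might look like the sketch below; all names are hypothetical:

```python
import math

def radam_rectification(beta2: float, step: int, cutoff: float = 5.0):
    """Return (use_adaptive_lr, rectification_factor) for the given step."""
    rho_inf = 2.0 / (1.0 - beta2) - 1.0           # maximum length of the approximated SMA
    beta2_t = beta2 ** step
    rho_t = rho_inf - 2.0 * step * beta2_t / (1.0 - beta2_t)
    if rho_t > cutoff:                            # variance is tractable: rectify the adaptive step
        r_t = math.sqrt(
            ((rho_t - 4.0) * (rho_t - 2.0) * rho_inf)
            / ((rho_inf - 4.0) * (rho_inf - 2.0) * rho_t)
        )
        return True, r_t
    return False, 1.0                             # fall back to an unadapted (momentum-only) update
```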
Differential Revision: D29241736 (facebookresearch@0ff3634) Original commit changeset: 288b9b1f3125 fbshipit-source-id: 56c4ec98647c6f1822b130726741a1c9ca193670
Summary: Pull Request resolved: facebookresearch#487 We shouldn't need to yield the placeholder loss. Differential Revision: D29111772 fbshipit-source-id: 0971221583bd9a5de770860ff15cc80eb8d749c3
Summary: According to the original [SlateQ paper](https://arxiv.org/abs/1905.12767) (p28, 2nd paragraph, last sentence), the discount factor `gamma` is scaled by the time difference as `gamma^((t2-t1)/time_scale)`. Here, `t1` and `t2` are the timestamps of the current and next state-action pairs within a training sample, and `time_scale` is a hyperparameter that can scale the time difference up or down. This diff implements this mechanism by adding a `discount_time_scale` parameter to `SlateQTrainer`; its value is the `time_scale` in the formula above. If this parameter is not set, i.e. `None`, we keep the discount factor as it is. Reviewed By: kittipatv Differential Revision: D29297804 fbshipit-source-id: 5bd9101a2fe3b1b3d9817a3233357cab197e8ce8
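A minimal sketch of that time-scaled discount (illustrative names only, not the actual `SlateQTrainer` code):

```python
def time_scaled_discount(gamma: float, t1: float, t2: float, discount_time_scale=None) -> float:
    """gamma ** ((t2 - t1) / time_scale); falls back to plain gamma when no time scale is set."""
    if discount_time_scale is None:
        return gamma
    return gamma ** ((t2 - t1) / discount_time_scale)

# e.g. gamma = 0.9, 6 time units between samples, time_scale = 3  ->  0.9 ** 2 = 0.81
print(time_scaled_discount(0.9, t1=0.0, t2=6.0, discount_time_scale=3.0))
```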
Summary: Fixes: pytorch/pytorch#5804 In the paper https://openreview.net/forum?id=OM0jvwB8jIp57ZJjtNEZ, Timothy Dozat suggests a new optimization algorithm that is in essence a combination of the NAG and Adam algorithms. It is known that the idea of momentum can be improved with Nesterov acceleration, and Dozat investigates applying this idea to the momentum component of the Adam algorithm. The author provides experimental evidence showing the effectiveness of the idea. In this PR we implement the NAdam algorithm proposed in the paper. In a preliminary work, http://cs229.stanford.edu/proj2015/054_report.pdf, the author shows that the decay base constant should be taken as 0.96, which we also follow here, in line with Keras. Moreover, the implementation follows Keras coding practice in some other places as well: https://github.com/tensorflow/tensorflow/blob/f9d386849581d15d72f6f1f96f12aac230a8edbe/tensorflow/python/keras/optimizer_v2/nadam.py Pull Request resolved: pytorch/pytorch#59009 Reviewed By: gchanan, vincentqb Differential Revision: D29220375 Pulled By: iramazanli fbshipit-source-id: 4b4bb4b15f7e16f7527f368bbf4207ed345751aa
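The 0.96 constant appears in the momentum decay schedule. A hedged sketch of that schedule is below; the `momentum_decay` default of 0.004 is an assumption borrowed from common implementations, not stated in this commit:

```python
def nadam_momentum_coefficients(beta1: float, step: int, momentum_decay: float = 0.004):
    """Momentum coefficients for the current and next step, decayed with base 0.96."""
    mu_t = beta1 * (1.0 - 0.5 * 0.96 ** (step * momentum_decay))
    mu_next = beta1 * (1.0 - 0.5 * 0.96 ** ((step + 1) * momentum_decay))
    return mu_t, mu_next

print(nadam_momentum_coefficients(beta1=0.9, step=1))
```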
Summary: Fixes: pytorch/pytorch#24892 In the paper https://arxiv.org/pdf/1908.03265.pdf, Liyuan Liu et al. suggest a new optimization algorithm similar in essence to the Adam algorithm. The paper discusses how, without a warmup heuristic, adaptive optimization algorithms can produce undesirably large variance in the early stage of training, which can slow overall convergence. The authors propose rectifying the variance of the adaptive learning rate when it is expected to be high. Differing from the paper, we selected a variance tractability cut-off of 5 instead of 4. This adjustment is common practice and can be found in the authors' code repository as well as in the TensorFlow Swift optimizer library: https://github.com/LiyuanLucasLiu/RAdam/blob/2f03dd197022da442c6a15c47321f4335d113a3f/radam/radam.py#L156 https://github.com/tensorflow/swift-apis/blob/f51ee4618d652a2419e998bf9418ad80bda67454/Sources/TensorFlow/Optimizers/MomentumBased.swift#L638 Pull Request resolved: pytorch/pytorch#58968 Reviewed By: vincentqb Differential Revision: D29310601 Pulled By: iramazanli fbshipit-source-id: b7bd487f72f1074f266687fd9c0c6be264a748a9
Summary: Pull Request resolved: facebookresearch#491 Reviewed By: czxttkl, bankawas Differential Revision: D29251412 fbshipit-source-id: 0a6cbcf59956ecc113e9425079f91a6b3098c2de
Summary: Pull Request resolved: facebookresearch#492 Reviewed By: bankawas Differential Revision: D29252722 fbshipit-source-id: d855c6688199d2c3a09fab200e9b8d66c52d7273
Summary: We've implemented data modules; this method is redundant Reviewed By: bankawas Differential Revision: D29252903 fbshipit-source-id: 044cde768b481d4a12d4a17cca42180b4bd989cb
Summary: redundant Reviewed By: bankawas Differential Revision: D29252914 fbshipit-source-id: 536982d3b7886bda68fc14c5c933343167213224
Summary: redundant Reviewed By: bankawas Differential Revision: D29253003 fbshipit-source-id: cd05c62a0840b4f2d10c8bf4d9fe9ea057b6a13f
Summary: redundant Reviewed By: bankawas Differential Revision: D29253030 fbshipit-source-id: 969d03b6428aead6c6982a26b2e2c4a9a940273f
Summary: This is the start of making model manager stateless to reduce complexity Reviewed By: czxttkl Differential Revision: D29253248 fbshipit-source-id: 681d141cb46784e40c8802f2325c1636044c61de
…less Summary: Removing state from model managers Reviewed By: czxttkl Differential Revision: D29253249 fbshipit-source-id: 93ecb090cd2e2b66f86480679ae6145519227360
Summary: Prereq for making model managers stateless Reviewed By: czxttkl Differential Revision: D29253385 fbshipit-source-id: 9db747f46a84f26bce079efe8c4394efd3c8adc7
…h#493) Summary: Pull Request resolved: facebookresearch#493 Finally removed normalization data from model manager state Reviewed By: czxttkl Differential Revision: D29253429 fbshipit-source-id: 619b93b473e49b07fe74d0b525d6fc5f30f52550
Summary: Give it to `build_trainer()` directly so that we can remove state in model managers Reviewed By: czxttkl Differential Revision: D29258017 fbshipit-source-id: 39f4a7e8ad9a92499ffeb3c04e2e1c61c10769c0
Summary: Pull Request resolved: facebookresearch#494
- Remove `initialize_trainer()`
- Implement `train()` on the ModelManager base class; remove all the duplicates
- Make `build_serving_module[s]()` take the trainer module so it can extract whatever nets it needs from the trainer module
- `ModelManager.train()` now returns `Tuple[RLTrainingOutput, pl.Trainer]` so that the `_lightning_trainer` member can be deleted
Reviewed By: czxttkl Differential Revision: D29258016 fbshipit-source-id: 71545dc77c386b532bb48fe4c8ee94c79c20f5c6
Summary: Implement a multi-stage trainer module so that multi-stage training looks the same as other training. Internally, the multi-stage trainer forwards calls to its internal trainers. Reviewed By: czxttkl Differential Revision: D29273266 fbshipit-source-id: b51e91e5670362fc8ed85d9eeb05bd685fc7cbfd
Differential Revision: D29398026 fbshipit-source-id: 76923009da0f6fbc82a9fa8ae96c9417422c2577
Summary: Pull Request resolved: facebookresearch#497 Reviewed By: czxttkl Differential Revision: D29405221 fbshipit-source-id: 3e3524d92fb8d243b7fe62a04830b8f2b80df6ce
Differential Revision: D29458224 fbshipit-source-id: dcef29cd83ee7aecc94100ed579d023072ab581e
Summary: Pull Request resolved: facebookresearch#498 Add some assertions to make sure end users can use algorithms correctly. Reviewed By: bankawas Differential Revision: D29481662 fbshipit-source-id: 0332d990df7d3eca61e1f7bd205136d32f04a7b2
Summary: Pull Request resolved: facebookresearch#499 Remove Seq2SlateDifferentiableRewardTrainer because it's not tested and wouldn't be used. Reviewed By: kittipatv Differential Revision: D29522083 fbshipit-source-id: 9cd7e0d6d1d10c17cc174a54d77a4b37b0f279b7
Summary: Pull Request resolved: facebookresearch#500 Migrate the regular seq2slate to PyTorch Lightning, which includes one model manager `Seq2SlateTransformer` and three trainers `Seq2SlateTrainer`, `Seq2SlateSimulationTrainer` and `Seq2SlateTeacherForcingTrainer`. Manual optimization (https://pytorch-lightning.readthedocs.io/en/latest/common/optimizers.html#manual-optimization) is used to handle the sophisticated usage of optimizers during training. Model manager `Seq2SlatePairwiseAttn` and trainer `Seq2SlatePairwiseAttnTrainer` are not migrated in this diff. But to make them compatible with the changes, the setting of `minibatch_size` is also moved from `trainer_params` to `reader_options`. Reviewed By: czxttkl Differential Revision: D29436608 fbshipit-source-id: 612a1de4923eb7d138fcb6cb4715be6e4d05b424
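For reference, the manual-optimization pattern in PyTorch Lightning that this migration relies on looks roughly like the sketch below (a generic example, not the actual Seq2SlateTrainer code):

```python
import pytorch_lightning as pl
import torch

class ManualOptExample(pl.LightningModule):
    def __init__(self, model: torch.nn.Module):
        super().__init__()
        self.model = model
        self.automatic_optimization = False  # opt into manual optimization

    def configure_optimizers(self):
        return torch.optim.Adam(self.model.parameters(), lr=1e-3)

    def training_step(self, batch, batch_idx):
        opt = self.optimizers()
        loss = self.model(batch).mean()  # placeholder loss for illustration
        opt.zero_grad()
        self.manual_backward(loss)       # Lightning handles precision/accelerator details
        opt.step()
```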
Summary: AutoDataModule yields dictionary of tensors. Therefore, we need to manually type the input Reviewed By: czxttkl Differential Revision: D29479986 fbshipit-source-id: ab135bb869d8f0eb1fba1813aebf5af6d5ca3401
Differential Revision: D29573192 fbshipit-source-id: 65dc670d1777dd1d6b86c9228a198cd16f504c6e
Differential Revision: D35968031 fbshipit-source-id: 80d19aab074a8f4aaea544a56b7309b46901f1cc
Summary: Pull Request resolved: facebookresearch#633 Add `test_shift_kjt_by_one` and `test_reorder_data_kjt`. `test_reorder_data_kjt` will reorder data within each key; `test_shift_kjt_by_one` will left-shift data by one within each key. The two functions will be used in the Ads LTV project. Reviewed By: alexnikulkov Differential Revision: D35970439 fbshipit-source-id: dfc67f00216bcb575e4c9fb439ec570dc96f0951
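As a plain-Python illustration of those two operations on jagged, per-key data (this is not the torchrec `KeyedJaggedTensor` API, just the idea):

```python
# Each key maps to a list of per-example value lists (jagged data).
data = {
    "click": [[1, 2], [3], [4, 5, 6]],
    "view":  [[7], [8, 9], []],
}

def shift_by_one(per_key):
    # Left-shift the examples within each key by one, padding with an empty slot at the end.
    return {k: v[1:] + [[]] for k, v in per_key.items()}

def reorder(per_key, order):
    # Reorder the examples within each key according to `order`.
    return {k: [v[i] for i in order] for k, v in per_key.items()}

print(shift_by_one(data))
print(reorder(data, order=[2, 0, 1]))
```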
Summary: Pull Request resolved: facebookresearch#634 When KeyedJaggedTensor doesn't have weights, `.weights()` will throw an assertion error. We should use `.weights_or_none()` to check if a KJT has weights. Reviewed By: BerenLuthien Differential Revision: D36005910 fbshipit-source-id: b075ef9949b44fc1186bc124fd42a00e3c9d77f3
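A small sketch of the safer check, assuming the usual torchrec `from_lengths_sync` constructor; the surrounding logic is purely illustrative:

```python
import torch
from torchrec.sparse.jagged_tensor import KeyedJaggedTensor

# A KJT built without weights.
kjt = KeyedJaggedTensor.from_lengths_sync(
    keys=["f1"],
    values=torch.tensor([1.0, 2.0, 3.0]),
    lengths=torch.tensor([2, 1]),
)

weights = kjt.weights_or_none()
if weights is not None:
    print("weighted path:", weights)
else:
    print("unweighted path")  # calling kjt.weights() here would raise an assertion error
```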
Summary: Pull Request resolved: facebookresearch#637 Simple implementation of the Bayesian Network tutorial at https://www.internalfb.com/intern/anp/view/?id=1798614 in the context of a ReAgent model and trainer. Reviewed By: czxttkl Differential Revision: D35852022 fbshipit-source-id: c91b8982129b747bdf2049ba38a8b7c447002548
Summary: Applies the black-fbsource codemod with the new build of pyfmt. paintitblack Reviewed By: lisroach Differential Revision: D36324783 fbshipit-source-id: 280c09e88257e5e569ab729691165d8dedd767bc
Summary: Pull Request resolved: facebookresearch#640 We have seen some import errors: https://fb.workplace.com/groups/2126278550786248/permalink/5403372223076848/ So far, the only way I have found to unblock is to use stable torchrec + torch instead of nightly. I believe changing to stable will reduce our maintenance burden overall. The only downside is that if we use some latest feature internally, our OSS tests may not catch that in time. But I think we won't have such a case frequently; the more frequent case should be that the nightly version breaks something. Reviewed By: alexnikulkov Differential Revision: D36156197 fbshipit-source-id: b32bc518a573edf489c01b28e6ad9cc10886396f
Summary: Pull Request resolved: facebookresearch#642 By convention, logging metrics only happens on rank 0. self.logger.experiment will return None on rank != 0. Reviewed By: alexnikulkov Differential Revision: D36039765 fbshipit-source-id: 7444bd5ceae33f29ad017dd127359f27f5004471
Summary: Applies new import merging and sorting from µsort v1.0. When merging imports, µsort will make a best effort to move associated comments to match merged elements, but there are known limitations due to the dynamic nature of Python and developer tooling. These changes should not produce any dangerous runtime changes, but may require touch-ups to satisfy linters and other tooling. Note that µsort uses case-insensitive, lexicographical sorting, which results in a different ordering compared to isort. This provides a more consistent sorting order, matching the case-insensitive order used when sorting import statements by module name, and ensures that "frog", "FROG", and "Frog" always sort next to each other. For details on µsort's sorting and merging semantics, see the user guide: https://usort.readthedocs.io/en/stable/guide.html#sorting Reviewed By: lisroach Differential Revision: D36402214 fbshipit-source-id: b641bfa9d46242188524d4ae2c44998922a62b4c
Summary: Pull Request resolved: facebookresearch#635 as titled Reviewed By: alexnikulkov Differential Revision: D36021439 fbshipit-source-id: ce008f941caf2d2b137851662a0b7926bd8520f8
Summary: Pull Request resolved: facebookresearch#641 as titled Reviewed By: alexnikulkov Differential Revision: D36039334 fbshipit-source-id: 863027d8ad1a65cd5510853ccca8e947f88ef6e0
Summary: Pull Request resolved: facebookresearch#643
1. Add new sections to the YAML for model and optimizer configs
2. Add support for weights in Parametric DQN input
3. Expose FC hidden layer dims in the config
4. Sort data in the batch by separable_id, timestamp, position
5. Zero out the weight for observations for which we don't know the next state ("terminal", although they are not actually terminal; we just don't know their next state), the time_diff is negative, or the position feature is missing (which prevents us from sorting properly)
6. Read and pass in the batch time gap to the next state
7. Clip reward (paced bid)
To launch MC LTV training:
- local run: `starlight app run -j 1 free.reagent.train_ltv:train`
- submit to MAST: `starlight app submit reagent/submit_config.py:get_config_ltv`
To launch SARSA LTV training:
- local run: `starlight app run -j 1 free.reagent.train_ltv:train_sarsa`
- submit to MAST: `starlight app submit reagent/submit_config.py:get_config_ltv -- --model_type SARSA`
Reviewed By: czxttkl Differential Revision: D36360500 fbshipit-source-id: c07f0b2ea297844970389b2059a7c42d63d16a8d
Summary: Pull Request resolved: facebookresearch#644 Add extra optional columns mdp_id and arms to the ReAgent codebase. These are used in the eval workflow for LinUCB. Reviewed By: alexnikulkov Differential Revision: D36493574 fbshipit-source-id: 509a5b17617381b244202d7a857a7d7b1eb8bcc9
…ch#645) Summary: Pull Request resolved: facebookresearch#645 Instead of always using a linear output activation, I want to specify which activation to use. I want to try this with positive activation functions (e.g. relu) because I know that my Q-values have to be positive (all rewards are non-negative) Reviewed By: czxttkl Differential Revision: D36744360 fbshipit-source-id: 81296d2dfebb0ec77917d6024c0216d0e3fed4d1
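A generic sketch of the idea (a hypothetical network builder, not ReAgent's actual factory): let the caller choose the output activation so a Q-value head known to be non-negative can end in relu instead of a linear output:

```python
import torch.nn as nn

def make_q_network(input_dim: int, hidden_dims, output_activation: str = "linear") -> nn.Sequential:
    layers, last_dim = [], input_dim
    for dim in hidden_dims:
        layers += [nn.Linear(last_dim, dim), nn.ReLU()]
        last_dim = dim
    layers.append(nn.Linear(last_dim, 1))
    if output_activation == "relu":        # Q-values are non-negative when all rewards are non-negative
        layers.append(nn.ReLU())
    elif output_activation != "linear":
        raise ValueError(f"unsupported output activation: {output_activation}")
    return nn.Sequential(*layers)

q_net = make_q_network(16, [64, 32], output_activation="relu")
```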
Summary: pyfmt now specifies a target Python version of 3.8 when formatting with black. With this new config, black adds trailing commas to all multiline function calls. This applies the new formatting as part of rolling out the linttool-integration for pyfmt. paintitblack Reviewed By: zertosh, lisroach Differential Revision: D37084377 fbshipit-source-id: 781a1b883a381a172e54d6e447137657977876b4
Differential Revision: D37172467 fbshipit-source-id: c5b8f6fceb327eb61a013836f57d315fd6b17211
Summary: Pull Request resolved: facebookresearch#646 Fixing this problem: https://fb.workplace.com/groups/horizon.users/posts/1105605083328644/ Differential Revision: D37244190 fbshipit-source-id: 498d6f0790d3954f83758c4d46a6203c672fff67
Differential Revision: D37305159 fbshipit-source-id: 3532b6de87431137832c546d6544ba8d419cd726
Summary: Pull Request resolved: facebookresearch#647 X-link: meta-pytorch/torchrec#447 Add a static function to concat a list of KJTs Reviewed By: dstaay-fb Differential Revision: D36944002 fbshipit-source-id: 1b6865f60dcea91ee250b69360e4606184ffad53
Summary: Made LinUCB neural-based to enable distributed training. Differential Revision: D37009962 fbshipit-source-id: 4bb3e68ea60a264d26e13e4ce19832bca67a2c7e
Differential Revision: D37353746 fbshipit-source-id: b0dc7a8b59f4c6a1e39daa67fbae1e519488b6ef
Differential Revision: D37371791 fbshipit-source-id: 8247af28ab27242782ab8d317e7d9763121bcc74
Summary: Pull Request resolved: facebookresearch#649 Fix the world model reporter so that we can read losses per epoch. Make the logic that needs to run at the end of a train/test/validation epoch more explicit in reagent_lightning_module. Differential Revision: D37305377 fbshipit-source-id: 2204cfe94269cfba839b72c77bfea341ab63637d
Summary: We want to make EB/EBC scriptable by default without the need to run torch.fx first. Not being able to script EB/EBC modules by default (especially when needed for inference) is very non-intuitive and inconvenient. At the same time, we don't plan to make ShardedEB/EBC scriptable. 1. Do not use property; use methods instead. This is consistent with KJT. 2. _embedding_bag_configs has a complex type List[EmbeddingBagConfig] which does not script. Use List[str] to store features instead. Pull Request resolved: facebookresearch#648 X-link: meta-pytorch/torchrec#467 Reviewed By: colin2328 Differential Revision: D37389962 fbshipit-source-id: 5ce079a946b9458ee63658cae2fd731cfc1c7958
Summary: Pull Request resolved: facebookresearch#650 The base LightningModule class defines an optional `logger` property which may be initialized to None, while the DQNTrainer._log_dqn method tries to access the `logger` object without first checking whether it was initialized. The issue surfaced when trying to run unit tests analogous to those in `test_qrdqn`. This commit adds a check that the `logger` is initialized before attempting to use it. Interestingly, the analogous QRDQNTrainer class implementation does not use the `logger` property for logging; perhaps it's redundant? Reviewed By: czxttkl Differential Revision: D37529027 fbshipit-source-id: 5fe81cf715ee9f759b937290f1184d1c67e5325f
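A minimal sketch of the guard (generic LightningModule logging, assuming a TensorBoard-style `logger.experiment` with `add_scalar`; not the exact `_log_dqn` body):

```python
import pytorch_lightning as pl

class SafeLoggingTrainer(pl.LightningModule):
    def _log_scalars(self, metrics: dict, step: int) -> None:
        # self.logger is Optional and may be None (e.g. logging disabled or a non-zero rank),
        # so check it before touching logger.experiment.
        if self.logger is None:
            return
        for name, value in metrics.items():
            self.logger.experiment.add_scalar(name, value, step)
```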
Summary: Pull Request resolved: facebookresearch#651 Adds the test_dqn.py with a set of unit tests for DQNTrainer class, mirroring those in test_qrdqn.py Reviewed By: czxttkl Differential Revision: D37536537 fbshipit-source-id: 60cef76adb62c54e66b3fda39596c1cf0ad20555
Summary: Remove `minibatch_size` attribute in SAC and DQN trainer Differential Revision: D37834043 fbshipit-source-id: 5d3cf3d772842a0a60c9ad8ce11c8317649e51ff
This pull request was exported from Phabricator. Differential Revision: D37834043