gji1 (Contributor) commented Apr 19, 2022

Summary: To cater for the negative subsampling scheme of the creative ranking project.
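The summary does not describe the subsampling scheme itself. As a rough illustration only, a common form of negative subsampling keeps every positive example, keeps each negative with some probability, and up-weights the surviving negatives so that weighted statistics stay unbiased. The function name, the `keep_rate` parameter, and the reweighting step below are assumptions for illustration and are not taken from this PR.

```python
import random
from typing import List, Tuple


def subsample_negatives(
    labels: List[int], keep_rate: float, rng: random.Random
) -> List[Tuple[int, float]]:
    """Return (index, weight) pairs for the examples that survive subsampling.

    Hypothetical helper: positives (label == 1) are always kept with weight 1.0;
    negatives are kept with probability keep_rate and weighted by 1 / keep_rate.
    """
    if not 0.0 < keep_rate <= 1.0:
        raise ValueError("keep_rate must be in (0, 1]")
    kept = []
    for index, label in enumerate(labels):
        if label == 1:
            kept.append((index, 1.0))
        elif rng.random() < keep_rate:
            # Up-weight the kept negative to compensate for the dropped ones.
            kept.append((index, 1.0 / keep_rate))
    return kept


labels = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
kept = subsample_negatives(labels, keep_rate=0.5, rng=random.Random(0))
```

Whether the trainer changed here reweights the kept negatives in this way is not stated in the summary.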

Differential Revision: D35664393

gji1 and others added 30 commits May 5, 2021 10:30
Summary:
Pull Request resolved: facebookresearch#465

A recent change in PyTorch Lightning set the states of optimizers (https://fburl.com/code/5tpf2i0j), which contradicts the frozen dataclass we had for the Optimizer wrapper in ReAgent. This diff removes the frozen settings, and replaces `__getattr__` with the safer, more explicit property functions.
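The conflict described above can be sketched in plain Python. `FakeOptimizer`, `FrozenWrapper`, and `OptimizerWrapper` are hypothetical stand-ins, not ReAgent's actual classes; the sketch only shows why a frozen dataclass rejects an external assignment to `state`, and how explicit properties forward the same assignment to the wrapped optimizer.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Any, Dict


class FakeOptimizer:
    """Hypothetical stand-in for a torch optimizer."""

    def __init__(self) -> None:
        self.param_groups = [{"lr": 0.01}]
        self.state: Dict[str, Any] = {}


@dataclass(frozen=True)
class FrozenWrapper:
    optimizer: Any
    state: Dict[str, Any]


@dataclass
class OptimizerWrapper:
    optimizer: Any

    # Explicit properties instead of a catch-all __getattr__.
    @property
    def param_groups(self):
        return self.optimizer.param_groups

    @property
    def state(self):
        return self.optimizer.state

    @state.setter
    def state(self, value):
        self.optimizer.state = value


frozen = FrozenWrapper(FakeOptimizer(), {})
try:
    frozen.state = {"step": 1}  # what a framework assigning state would attempt
    frozen_rejected = False
except FrozenInstanceError:
    frozen_rejected = True

wrapper = OptimizerWrapper(FakeOptimizer())
wrapper.state = {"step": 1}  # forwarded to the wrapped optimizer
```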

Reviewed By: MisterTea

Differential Revision: D28205046

fbshipit-source-id: 848e3a0f90565eb041c0e91ef27c2be9102c5a7d
Summary:
Pull Request resolved: facebookresearch#466

See title

Reviewed By: MisterTea

Differential Revision: D28236105

fbshipit-source-id: 9fc750e4c73d40b42d25b5378af94e722d96f5c5
Summary: Pull Request resolved: facebookresearch#467

Reviewed By: alexnikulkov

Differential Revision: D28237308

fbshipit-source-id: 0025540b11ffa7d4325147c4304728c644f65c5d
Summary:
Pull Request resolved: facebookresearch#469

One test failure only happens in OSS: https://app.circleci.com/pipelines/github/facebookresearch/ReAgent/1655/workflows/cbf167ec-76b2-423a-91b2-d454ba8d41d2/jobs/10454.

This diff fixes it.

Reviewed By: gji1

Differential Revision: D28248488

fbshipit-source-id: efc777757d9bc18d6b573e394e81997404252fb7
…kresearch#473)

Summary:
Pull Request resolved: facebookresearch#473

To satisfy a client team's request

Differential Revision: D28327536

fbshipit-source-id: d3b1f9ef0c6b6bc09b29930d59ed2834cdadd7df
…el (facebookresearch#471)

Summary:
Pull Request resolved: facebookresearch#471

As titled. See T83887308 & T83886520 for more details.

Reviewed By: kaiwenw

Differential Revision: D26498062

fbshipit-source-id: ea0242d16f7673cad25d018235abb31742ab7434
Summary: Pull Request resolved: facebookresearch#474

Reviewed By: czxttkl

Differential Revision: D28312845

fbshipit-source-id: abb039d445a1228bb11ffb6103744854b209b3dc
Summary:
Pull Request resolved: facebookresearch#476

Add an n-gram MLP for synthetic reward attribution. This model uses an MLP to predict each step's reward.

Compared with the single-step reward model, it uses n-grams: a context window centered on each step, with zero padding.
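A minimal sketch of the windowing idea, using plain lists rather than tensors. The function name, the requirement that the window size be odd, and the choice to flatten each window into one feature vector are assumptions for illustration; the actual network code may differ.

```python
from typing import List


def ngram_windows(steps: List[List[float]], context_size: int) -> List[List[float]]:
    """Build one flattened, zero-padded context window per step.

    Hypothetical helper: `steps` holds one feature vector per time step, and
    `context_size` is the (odd) number of steps in each centered window.
    """
    if context_size < 1 or context_size % 2 == 0:
        raise ValueError("context_size must be a positive odd number")
    if not steps:
        return []
    half = context_size // 2
    zero = [0.0] * len(steps[0])
    # Pad both ends so the first and last steps still get a full window.
    padded = [zero] * half + steps + [zero] * half
    windows = []
    for t in range(len(steps)):
        window = padded[t : t + context_size]
        windows.append([x for step in window for x in step])
    return windows


windows = ngram_windows([[1.0], [2.0], [3.0]], context_size=3)
```

Each row of `windows` would then be the input from which the MLP predicts that step's reward.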

Reviewed By: czxttkl

Differential Revision: D28362111

fbshipit-source-id: 624de95f14b7fedb79ccb0cd47cb811b651fab04
Summary:
Pull Request resolved: facebookresearch#472

Distributed readers are not supported yet, as shown in the test plan below czxttkl.

Reviewed By: czxttkl

Differential Revision: D28292330

fbshipit-source-id: 0f03d27fdba75740ab9590747ae025c6da6ce9fa
Summary: Pull Request resolved: facebookresearch#478

Reviewed By: bankawas

Differential Revision: D28427686

fbshipit-source-id: b53a9f974f9c2ee615fb453b5efe48b9de487dbf
…ch#479)

Summary:
Pull Request resolved: facebookresearch#479

These changes should finally enable distributed training for reward networks (hopefully; we still need to wait for the workflow to finish). Fixes the error raised in https://fb.workplace.com/groups/pytorchLightning/permalink/455491295468768/.

Reviewed By: gji1

Differential Revision: D28318470

fbshipit-source-id: fe3836ef49864a20af07511a10e25c0d1a20ba0d
Summary:
Pull Request resolved: facebookresearch#480

Lower the number of training samples & threshold, use Adam instead of SGD.

Reviewed By: j-jiafei

Differential Revision: D28464831

fbshipit-source-id: 918329290be62bd846507e2bd3697af4c3e710db
…bookresearch#470)

Summary: Pull Request resolved: facebookresearch#470

Reviewed By: czxttkl

Differential Revision: D28093192

fbshipit-source-id: 6b260c3e8d49c8b302e40066e2be49a0bfe96688
Summary:
Pull Request resolved: facebookresearch#477

Add ConvNet support to n-gram synthetic reward network.

Reviewed By: czxttkl

Differential Revision: D28402551

fbshipit-source-id: c2201be3d71c32977c2f19b69e5a0abcaf0a855d
Summary:
Pull Request resolved: facebookresearch#481

Add LSTM synthetic reward net.

Reviewed By: czxttkl

Differential Revision: D28448615

fbshipit-source-id: e8c77ef8c7b4ad69fcda2fd432cc018cfb7495cd
Summary:
Pull Request resolved: facebookresearch#482

As titled. Also supports discrete actions.

Reviewed By: j-jiafei

Differential Revision: D28248528

fbshipit-source-id: bf87afa18914e9331177b22f0c9a823ac2ba2337
…h#483)

Summary:
Pull Request resolved: facebookresearch#483

As title.

Reviewed By: czxttkl

Differential Revision: D28551285

fbshipit-source-id: 3cc14daa930399daa0880c8569f8f36b46c1ff94
Summary:
Pull Request resolved: facebookresearch#484

Refactoring so that we can use spark transform to bulk eval synthetic reward models.

Things changed:
1. Improve API for defining models. In `reagent/models/synthetic_reward.py`, we create `SyntheticRewardNet`, which takes in different architecture implementations with standardized input/output shapes.
2. Net builders will build different architectures to construct `SyntheticRewardNet`. So we follow a composite pattern in net builders.
3. All net builders now share the same `build_serving_module` method.
4. Improve test methods so they share as much code as possible between different architectures.

Reviewed By: j-jiafei

Differential Revision: D28549704

fbshipit-source-id: 535a6191b6cfc4c55ed8b4f8c366af77ceac5c79
Summary: Added binary_difference_scorer to discrete_dqn.py

Reviewed By: czxttkl

Differential Revision: D28691568

fbshipit-source-id: dd9fe5518b13aea2acb94dae10823cdfd9253926
…cebookresearch#485)

Summary:
Pull Request resolved: facebookresearch#485

As title.

Reviewed By: czxttkl

Differential Revision: D28790947

fbshipit-source-id: 26405326402a0b913731c2a9ccb4badde4b47a9b
…s set up, required in lightning 1.3.3

Summary: With the move to Lightning 1.3 (D28792413), `MDNRNNTrainer` cannot call `self.log()` without setting up a LoggerConnector.

Reviewed By: kandluis

Differential Revision: D28825504

fbshipit-source-id: 145028b62647f7466d44833bde0c0d4fb4c6d729
Summary: Data module for CFEval

Reviewed By: gji1

Differential Revision: D28661138

fbshipit-source-id: c248600105bad5e66c717deb1fc0dee44d415005
…esearch#486)

Summary:
Pull Request resolved: facebookresearch#486

1. Add batch norm to the single-step synthetic reward network;
2. Add layer norm to the single-step, n-gram FC, and n-gram ConvNet synthetic reward networks.

The normalization helps mitigate the problem of zero predictions that arises from combining an MSE loss with a sigmoid output layer.
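One plausible reading of the zero-prediction problem, stated here as an assumption rather than something the summary confirms, is sigmoid saturation: once the pre-activation is strongly negative, the MSE gradient with respect to it becomes very small, so the output stays near zero. The sketch below evaluates that gradient, 2(s - y)s(1 - s) with s = sigmoid(z), at a few pre-activations.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def mse_grad_wrt_logit(z: float, target: float) -> float:
    """Derivative of (sigmoid(z) - target)**2 with respect to z."""
    s = sigmoid(z)
    return 2.0 * (s - target) * s * (1.0 - s)


# The target is 1.0, yet the gradient shrinks as the pre-activation z
# becomes more negative and the prediction approaches zero.
for z in (0.0, -5.0, -10.0):
    print(
        f"z={z:6.1f}  prediction={sigmoid(z):.5f}  "
        f"grad={mse_grad_wrt_logit(z, 1.0):.6f}"
    )
```

Normalizing the inputs to the output layer keeps pre-activations in a range where this gradient is not negligible, which is consistent with the mitigation described above.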

Reviewed By: czxttkl

Differential Revision: D28888793

fbshipit-source-id: c041e0602880b270f10acba91d77b1cb4d8d17a2
Summary:
Pull Request resolved: facebookresearch#415

Currently, we have some test failures (https://app.circleci.com/pipelines/github/facebookresearch/ReAgent/1460/workflows/ecc21254-779b-4a89-a40d-ea317e839d96/jobs/8655) because we are missing some of the latest features.

Reviewed By: MisterTea

Differential Revision: D26977836

fbshipit-source-id: 9243d194ddf5c62895c9f1369830309c379fd7dd
Summary: A standalone workflow to train reward models for discrete-action contextual bandit problems.

Reviewed By: kittipatv

Differential Revision: D28937902

fbshipit-source-id: 9d3a28a195654eb9892f9aba56c499ccc59079c2
Summary: As titled. Otherwise for very large datasets we see the Presto memory limit error.

Reviewed By: j-jiafei

Differential Revision: D29020301

fbshipit-source-id: a35198cf0da83f2fc454e92844d6a7ea17e2b8f7
…DQNBase (facebookresearch#475)

Summary:
Pull Request resolved: facebookresearch#475

As titled. Mimicking changes done in D25377364 (facebookresearch@7584cd1).

1) Create a data module class `ParametricDqnDataModule` inheriting from `ManualDataModule`, and move the implementations of the following methods from `ParametricDQNBase` to it:
- `should_generate_eval_dataset`
- `run_feature_identification`
- `query_data`
- `build_batch_preprocessor`

Methods that were not implemented are left unimplemented in `ParametricDqnDataModule`.

2) Create `get_data_module()` method in `ParametricDQNBase` which returns a `ParametricDqnDataModule` object.

Reviewed By: czxttkl

Differential Revision: D26888159

fbshipit-source-id: 2e4ce8eaa0e2a5871b0746f36a83506ce0bd7707
…ter) to github/third-party/PyTorchLightning/pytorch-lightning

Summary:
### Manual
- (ephemeral*) make `ResultCollection._extract_batch_size` a class method
- (ephemeral*) commented out the MisconfigurationException in https://fburl.com/diffusion/agbk3mxc
- reagent/gym/tests/test_gym.py: wrap EpisodicDataset with a dataloader before passing it to .fit() to fix the type checker error

\* ephemeral means that the changes are made in-place in Lightning and will disappear after another sync.

### Automatic
### New commit log messages
  cdcc483e CHANGELOG update after v1.3.6 release (#7988)
  7978a537 Ipynb update (#8004)
  c6e02e48 [feat] Allow overriding optimizer_zero_grad and/or optimizer_step when using accumulate_grad_batches (#7980)
  eebdc910 progressive restoring of trainer state (#7652)
  3fece17f [feat] Add `{,load_}state_dict` to `ResultCollection` 1/n (#7948)
  906de2a7 [feat] Named Parameter Groups in `LearningRateMonitor` (#7987)
  5647087f New speed documentation (#7665)
  55494e87 Fix Special Tests (#7841)
  bc2c2db2 Do not override the logged epoch in `logged_metrics` (#7982)
  21342165 Change `WarningCache` to subclass `set` (#7995)
  4ffba600 Add predict hook test (#7973)
  917cf836 [doc] Add more reference around predict_step (#7997)
  d2983c7c [fix] Enable manual optimization DeepSpeed (#7970)
  b093a9e6 Support `save_hyperparameters()` in LightningModule dataclass (#7992)
  341adad8 Loop Refactor 2/N - Remove Old Training Loop (#7985)
  b71aa55b Make optimizers skippable when using amp (#7975)
  0004216f Easier configurability of callbacks that should always be present in LightningCLI (#7964)
  78a14a3f Add `tpu_spawn_debug` to plugin registry (#7933)
  92024df2 Pt 1.9 breaking fix: __iter__ type hint (#7993)
  b2e9fa81 Improvements related to save of config file by LightningCLI (#7963)
  971908a1 Loop Refactor 1/N - Training Loop (#7871)
  560b1970 Standardize positional datamodule and argument names (#7431)
  0974d66c Add docs for IPUs (#7923)
  024cf23c Remove convert_to_half, suggest using `model.half` (#7974)

Reviewed By: colin2328

Differential Revision: D29203448

fbshipit-source-id: 0e866b869bda06349828ec4fc61af19e4ea21f0e
…research#490)

Summary:
Pull Request resolved: facebookresearch#490

Fix world model simulation. The previous failure occurred because the world model was not loaded properly from the warmstart path.
Also, this diff updates the `prepare_data()` API. `prepare_data()` is now assumed not to return setup data, following PyTorch Lightning's API.

Reviewed By: kittipatv

Differential Revision: D29157160

fbshipit-source-id: 7d52e12793b8bbc827bb2a14567993a7f63dd54c
…ookresearch#496)

Summary:
Pull Request resolved: facebookresearch#496

Offline Batch RL runs were failing with an import error caused by a missing `__init__.py` file.

Reviewed By: czxttkl

Differential Revision: D29284160

fbshipit-source-id: 4e69941028f5d00bc0ef7dc30049929a9d44c306
czxttkl and others added 25 commits February 24, 2022 11:02
Summary:
Pull Request resolved: facebookresearch#608

Update README and add ForkedPdb to reagent

Reviewed By: alexnikulkov

Differential Revision: D34425175

fbshipit-source-id: c59ee44b8ff89cf87a13794f23d85f0890f52cb2
Summary:
Pull Request resolved: facebookresearch#609

X-link: meta-pytorch/torchrec#112

As discussed in D33960410, we want processing KeyedTensor into sparse features to be the responsibility of SparseArch.

A motivation for this is that we want to have an extension EsuhmDLRM, where all we would need to do is replace the sparse arch component. However, the esuhm sparse arch's output doesn't adhere to the current KeyedTensor output.

Reviewed By: bigning

Differential Revision: D34482853

fbshipit-source-id: 90048cc1d36327593422d459b49cb8d3783226e2
Summary:
- Model Manager for BehaviorCloning
- UnitTest of the ModelManager
- DataModule for UnitTest

Reviewed By: czxttkl

Differential Revision: D33829752

fbshipit-source-id: 9d1d6af293f652e095b914608108fc0d215ff257
Summary:
Our mission at [Meta Open Source](https://opensource.facebook.com/) is to empower communities through open source, and we believe that it means building a welcoming and safe environment for all. As a part of this work, we are adding this banner in support for Ukraine during this crisis.

Pull Request resolved: facebookresearch#613

Reviewed By: alexnikulkov

Differential Revision: D34630775

Pulled By: dmitryvinn-fb

fbshipit-source-id: 7108199313663725759377fe0972e59e9ae2cb22
Summary:
### New commit log messages
- [a52a6ea03 Add support for pluggable Accelerators (#12030)](Lightning-AI/pytorch-lightning#12030)

Reviewed By: edward-io

Differential Revision: D34608197

fbshipit-source-id: ee87d0ce693659a4e689290a079f8c5a4772faf2
Differential Revision: D34666657

fbshipit-source-id: 02546bd9ce2d328ad1210eb18499d8db86267e65
Summary: Pull Request resolved: facebookresearch#614

Reviewed By: czxttkl

Differential Revision: D34657092

fbshipit-source-id: 47e0af9b751dffaeafbf9019b7bb5967c0ff84c1
…#189)

Summary:
X-link: facebookresearch/d2go#189

X-link: facebookresearch/recipes#14

Pull Request resolved: facebookresearch#616

### New commit log messages
- [9b011606f Add callout items to the Docs landing page (#12196)](Lightning-AI/pytorch-lightning#12196)

Reviewed By: edward-io

Differential Revision: D34687261

fbshipit-source-id: 3ef6be5169a855582384f9097a962d2261625882
Summary:
Pull Request resolved: facebookresearch#617

Improve the reinforce trainer by
1. Allowing reward mean subtraction without normalization,
2. Providing the option to log training loss and ips ratio mean per epoch.
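The first item can be read as decoupling two steps that are often bundled: centering the rewards and dividing by their standard deviation. The sketch below uses hypothetical flag names (`subtract_mean`, `normalize`) and plain Python lists; it illustrates that reading and is not the trainer's actual code.

```python
from statistics import mean, pstdev
from typing import List


def adjust_rewards(
    rewards: List[float], subtract_mean: bool, normalize: bool, eps: float = 1e-8
) -> List[float]:
    """Optionally center rewards, and optionally also scale them to unit variance.

    Hypothetical helper: normalize=True implies centering; subtract_mean=True
    alone centers the rewards without rescaling them.
    """
    if not rewards:
        return []
    adjusted = list(rewards)
    if subtract_mean or normalize:
        mu = mean(adjusted)
        adjusted = [r - mu for r in adjusted]
    if normalize:
        sigma = pstdev(rewards)
        # eps guards against division by zero when all rewards are equal.
        adjusted = [r / (sigma + eps) for r in adjusted]
    return adjusted


centered = adjust_rewards([1.0, 2.0, 3.0], subtract_mean=True, normalize=False)
scaled = adjust_rewards([1.0, 2.0, 3.0], subtract_mean=True, normalize=True)
```

Centering alone preserves the reward scale, which matters when the magnitude of the reward carries meaning.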

Reviewed By: alexnikulkov

Differential Revision: D34688279

fbshipit-source-id: 50e94140fbf2182523e03c350f7bbe6812cb6e74
Summary:
Pull Request resolved: facebookresearch#618

as titled

Reviewed By: sinannasir

Differential Revision: D34587407

fbshipit-source-id: 738aa3fb580716628330efa65a8c5ca7596aff14
Summary:
Pull Request resolved: facebookresearch#615

as titled

Reviewed By: PavlosApo

Differential Revision: D34677139

fbshipit-source-id: 9fa8a0884d8f4abf0c7ca47fa669932d739a2d4c
Summary:
Pull Request resolved: facebookresearch#619

as titled

Reviewed By: alexnikulkov

Differential Revision: D34940029

fbshipit-source-id: 9f6add38bd7f03f6811b6f4c51db431a1412660c
Summary:
Pull Request resolved: facebookresearch#620

Officially import torchrec

Reviewed By: alexnikulkov

Differential Revision: D34942469

fbshipit-source-id: d4d47f4e90ff99f738f27c0720fd5462f40abe86
Summary:
Pull Request resolved: facebookresearch#621

As the creative ranking project runs only 1 epoch, enable per-batch logging to TensorBoard, as is done in the SAC trainer in ReAgent.

Reviewed By: czxttkl

Differential Revision: D35100625

fbshipit-source-id: 37bf361a4f668665de7691731467755c37b31067
Differential Revision: D35275827

fbshipit-source-id: e1e402f8a07f97e3243318bb0101e2943a40c48c
Differential Revision: D35313194

fbshipit-source-id: 30b3f317f90b2e736453ae5162caad765fbfa414
…->context, action->arm (facebookresearch#624)

Summary:
Pull Request resolved: facebookresearch#624

1. Rename state -> context
2. Rename action -> arm
3. Add capability to read context-arm features from the input
4. Remove action probability from contextual bandit input (will add back in when we add algorithms which require it)
5. Improve offset validation in `FixedLengthSequences` transform

Differential Revision: D35372899

fbshipit-source-id: b00fa256aec344a2d7fcf2034e1f00132fef62f3
…kresearch#625)

Summary:
Pull Request resolved: facebookresearch#625

As titled

Differential Revision: D35417882

fbshipit-source-id: 74bf4799cebce3f8f35f0b83fd7fd9825c34c7c2
Summary:
Pull Request resolved: facebookresearch#626

Use pure PyTorch operators to perform the array length check

Reviewed By: alexnikulkov

Differential Revision: D35423434

fbshipit-source-id: 397879eb2d0cbbcaaf9624e9b4cbead2f445263e
Summary:
Pull Request resolved: facebookresearch#627

Removing an unused parameter
Disable interaction features by default

Reviewed By: czxttkl

Differential Revision: D35442407

fbshipit-source-id: fdc0fd3137226565656b8feddbdffdb054026fe2
Summary:
Pull Request resolved: facebookresearch#628

Fixing some device issues in ReAgent code

Reviewed By: alexnikulkov

Differential Revision: D34995851

fbshipit-source-id: 2f0376c2d53b7797e6193deffa95ca162bd1153a
Differential Revision: D35589581

fbshipit-source-id: b08bb906c6703876a3be2be5345f69342d123a1c
…search#629)

Summary:
Pull Request resolved: facebookresearch#629

The attributes weren't registered properly, so they weren't pushed to the device when `model.to(device)` was called
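The summary does not say which attributes were affected or how they were fixed. As a general illustration, a tensor stored as a plain attribute is invisible to `nn.Module` bookkeeping, while one registered with `register_buffer` is converted along with the module. The class names below are hypothetical, and `.double()` is used instead of `.to(device)` so the sketch runs without a GPU; both go through the same traversal of parameters and buffers.

```python
import torch
from torch import nn


class PlainAttribute(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        # A bare tensor attribute is not tracked by the module.
        self.scale = torch.ones(3)


class RegisteredBuffer(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        # A registered buffer is moved and cast together with the module.
        self.register_buffer("scale", torch.ones(3))


plain = PlainAttribute().double()
registered = RegisteredBuffer().double()

print(plain.scale.dtype)       # the conversion skipped this tensor
print(registered.scale.dtype)  # the conversion reached this tensor
```

Tensors that should be trained would instead be wrapped in `nn.Parameter`, which is tracked in the same way.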

Reviewed By: soudia

Differential Revision: D35560710

fbshipit-source-id: 67492e7f64829750e395bdec85e04b7fb6fff04c
…arch#630)

Summary:
Pull Request resolved: facebookresearch#630

Removing device assignment following changes in D35560710

Reviewed By: alexnikulkov

Differential Revision: D35656985

fbshipit-source-id: 423124fdc9615c74476152f39e259bcf1f9f94d0
Summary: To cater for the negative subsampling scheme of the creative ranking project.

Differential Revision: D35664393

fbshipit-source-id: a0e2752a07024b6650cf17adfba3ae00052233ec
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D35664393

codecov-commenter commented Apr 19, 2022

Codecov Report

Merging #631 (4b3fda5) into main (6b8cfb9) will decrease coverage by 0.03%.
The diff coverage is 56.66%.

@@            Coverage Diff             @@
##             main     #631      +/-   ##
==========================================
- Coverage   86.84%   86.80%   -0.04%     
==========================================
  Files         350      350              
  Lines       22205    22226      +21     
  Branches       44       44              
==========================================
+ Hits        19283    19294      +11     
- Misses       2896     2906      +10     
  Partials       26       26              
Impacted Files Coverage Δ
reagent/training/reinforce_trainer.py 67.34% <40.00%> (-3.44%) ⬇️
reagent/training/utils.py 87.80% <71.42%> (-8.87%) ⬇️
reagent/core/types.py 86.70% <100.00%> (+0.02%) ⬆️
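The report quotes a decrease of 0.03% in its first line and -0.04% in the table. The numbers in the table are consistent with both figures if percentages are truncated to two decimals before display; that truncation rule is an inference from the figures shown here, not something the report states. The sketch below recomputes the values from the hit and line counts.

```python
def coverage_bp(hits: int, lines: int) -> int:
    """Coverage in hundredths of a percent, truncated rather than rounded."""
    return hits * 10000 // lines


base_hits, base_lines = 19283, 22205  # main (6b8cfb9)
head_hits, head_lines = 19294, 22226  # #631 (4b3fda5)

base_bp = coverage_bp(base_hits, base_lines)
head_bp = coverage_bp(head_hits, head_lines)

# Difference between the two displayed (truncated) percentages.
displayed_delta = (head_bp - base_bp) / 100

# Difference between the exact ratios, in percentage points.
exact_delta = 100 * (head_hits / head_lines - base_hits / base_lines)

print(base_bp / 100, head_bp / 100, displayed_delta, round(exact_delta, 4))
```

Under that assumption, the table's delta is the difference of the two displayed values, while the headline figure is the exact difference truncated to two decimals.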

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 6b8cfb9...4b3fda5.
