alexnikulkov
Contributor

Summary: Title

Differential Revision: D34626435

kaiwenw and others added 30 commits April 17, 2021 11:15
Summary:
Pull Request resolved: facebookresearch#454

title

Reviewed By: alexnikulkov

Differential Revision: D27800185

fbshipit-source-id: 406001b48f55d7304d18e06237e7bf82ed07c11b
Reviewed By: divchenko

Differential Revision: D27835360

fbshipit-source-id: cbb23793ee57382e43bd65bd40cfeb2820c6eec2
Summary: Pull Request resolved: facebookresearch#384

Test Plan:
CI Tests
...but without running open source tests.

Reviewed By: gji1

Differential Revision: D27842452

Pulled By: MisterTea

fbshipit-source-id: 6fb192d30217d358e86a04e6bcc5a69911276e71
…ainers (facebookresearch#457)

Summary:
Pull Request resolved: facebookresearch#457

trainer.train(batch) was the old, pre-Lightning ReAgent trainer API.
With this diff we make sure that nobody is trying to call trainer.train(batch).
trainer.train() or trainer.train(True/False) is allowed - this puts the network into training/eval mode.
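
A minimal sketch of the kind of guard described above, assuming a hypothetical `GuardedTrainer` class (not ReAgent's actual trainer): `train()` accepts only a bool, so an old-style call that passes a batch fails loudly.

```python
class GuardedTrainer:
    """Hypothetical stand-in for a trainer; not ReAgent's actual class."""

    def __init__(self) -> None:
        self.training = True

    def train(self, mode: bool = True) -> "GuardedTrainer":
        # Only a bool is accepted; passing a batch (the old API) raises.
        if not isinstance(mode, bool):
            raise TypeError(
                "train() only toggles training/eval mode; "
                f"got {type(mode).__name__} instead of bool"
            )
        self.training = mode
        return self


trainer = GuardedTrainer()
trainer.train(False)  # allowed: switches to eval mode

try:
    trainer.train({"state": [0.1, 0.2]})  # old-style call with a batch
except TypeError as err:
    rejected = str(err)
```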

Reviewed By: MisterTea

Differential Revision: D27862583

fbshipit-source-id: b0875e11cd4ef214c75fd1bef5b696f1cdf2b8d6
Summary:
Fix bugs: GreedyActionSampler returned one as a log prob, and EpsilonGreedyActionSampler didn't work.
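
For illustration only, a pure-Python sketch of what the two samplers are expected to return. It assumes the intended greedy log prob is log(1) = 0; the function names and signatures here are hypothetical and are not the ReAgent classes.

```python
import math
import random
from typing import List, Tuple


def greedy_sample(scores: List[float]) -> Tuple[int, float]:
    # The greedy action is picked with probability 1, so its log prob is 0.
    action = max(range(len(scores)), key=scores.__getitem__)
    return action, 0.0


def epsilon_greedy_sample(
    scores: List[float], epsilon: float, rng: random.Random
) -> Tuple[int, float]:
    n = len(scores)
    best = max(range(n), key=scores.__getitem__)
    # Explore uniformly with probability epsilon, otherwise act greedily.
    action = rng.randrange(n) if rng.random() < epsilon else best
    # Uniform exploration can also land on the best action.
    prob = epsilon / n + (1.0 - epsilon if action == best else 0.0)
    return action, math.log(prob)


rng = random.Random(0)
greedy_action, greedy_log_prob = greedy_sample([0.1, 0.9, 0.3])
eps_action, eps_log_prob = epsilon_greedy_sample([0.1, 0.9, 0.3], 0.3, rng)
```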

Pull Request resolved: facebookresearch#393

Test Plan:
Imported from GitHub, without a `Test Plan:` line.
...but without running open source tests.

Reviewed By: kaiwenw

Differential Revision: D27842450

Pulled By: MisterTea

fbshipit-source-id: 9b4aa85f352f2d7565473127b280d61bcc6d3b71
Summary: Pull Request resolved: facebookresearch#455

Test Plan: CI Tests

Reviewed By: czxttkl

Differential Revision: D27842449

Pulled By: MisterTea

fbshipit-source-id: bee6d009236e87eaddae7ea7d083c7500dc1220b
Summary:
Pull Request resolved: facebookresearch#458

When trying to follow the [tutorial](https://reagent.ai/rasp_tutorial.html) there are a few things that need fixing:

1. When running the script serving/scripts/rasp_to_model.py I came across this error

```
python serving/scripts/rasp_to_model.py /tmp/rasp_logging/log.txt /tmp/input_df.pkl

Traceback (most recent call last):
  File "serving/scripts/rasp_to_model.py", line 13, in <module>
    logger.setLevel(logging.info)
  File "/usr/local/anaconda3/envs/reagent/lib/python3.7/logging/__init__.py", line 1353, in setLevel
    self.level = _checkLevel(level)
  File "/usr/local/anaconda3/envs/reagent/lib/python3.7/logging/__init__.py", line 195, in _checkLevel
    raise TypeError("Level not an integer or a valid string: %r" % level)
TypeError: Level not an integer or a valid string: <function info at 0x7fb8000d73b0>
```

Luckily it is an easy fix to pass an actual loglevel.

2. This config file probably is outdated: serving/examples/ecommerce/training/contextual_bandit.yaml
- changed indentation level
- changed key name

3. There is an __init__.py file missing in the gym tests, which leads to an error

4. The path to the SPARK_JAR was not resolving correctly.
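
The error in item 1 can be reproduced with the standard library alone. In this sketch the logger name is made up; `logging.info` is a function, while `logging.INFO` is the integer level that `setLevel` expects.

```python
import logging

logger = logging.getLogger("rasp_to_model_example")  # hypothetical logger name

try:
    logger.setLevel(logging.info)  # bug: logging.info is a function, not a level
except TypeError as err:
    error_message = str(err)

logger.setLevel(logging.INFO)  # fix: pass the integer level constant
```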

Pull Request resolved: facebookresearch#391

Test Plan:
Imported from GitHub, without a `Test Plan:` line.
...but without running open source tests.

Reviewed By: czxttkl

Differential Revision: D27842451

Pulled By: MisterTea

fbshipit-source-id: 2175296c6b60db4dc4b22804a74c2259b14fee7e
…Set test model type appropriately.

Reviewed By: bankawas

Differential Revision: D27863892

fbshipit-source-id: 0084920bd82d54f5aece46f36c32fbbec5ba3380
Summary:
Pull Request resolved: facebookresearch#459

As titled. Also includes some small polish on the codebase.

Reviewed By: kaiwenw

Differential Revision: D27899809

fbshipit-source-id: 882471f1a9376d0d50bd935e02328667f1867450
…rch#460)

Summary:
Pull Request resolved: facebookresearch#460

OOM issues can occur in CFEval of DQN and CRR workflows when the validation set is too large, as in https://fb.workplace.com/groups/horizon.users/permalink/836921400197015/. This diff solves this issue by computing the numbers needed for CFEval in `validation_step`, instead of just stacking the raw batches, which include all the state features that can take a lot of memory.

Note that if `use_gpu=True`, the CFEval-required numbers are computed on the GPUs for speed, since that is where both the validation batch and the trainer are stored. The returned `EvaluationDataPage` is then moved to the CPU, because everything in `validation_epoch_end` is done on the CPU for its larger memory capacity. To enable this transfer between devices, this diff changes `EvaluationDataPage` from a `NamedTuple` to a subclass of `TensorDataClass`.
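
A rough pure-Python sketch of the memory idea only, assuming a hypothetical `EvalSummary` dataclass and toy batches. It does not use tensors and leaves out the GPU-to-CPU move; the point is that each validation step keeps a few per-row numbers and drops the wide state features.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class EvalSummary:
    """Hypothetical compact page: only the numbers the evaluator needs."""

    logged_rewards: List[float]
    model_values: List[float]


def validation_step(
    batch: Dict[str, list], model: Callable[[Sequence[float]], float]
) -> EvalSummary:
    # Compute the per-row numbers now; the state features are not kept.
    return EvalSummary(
        logged_rewards=list(batch["reward"]),
        model_values=[model(state) for state in batch["state"]],
    )


def validation_epoch_end(pages: List[EvalSummary]) -> float:
    # Aggregate the small summaries instead of stacked raw batches.
    values = [v for page in pages for v in page.model_values]
    return sum(values) / len(values)


batches = [
    {"state": [[1.0, 2.0], [3.0, 4.0]], "reward": [0.0, 1.0]},
    {"state": [[5.0, 6.0]], "reward": [1.0]},
]
pages = [validation_step(b, model=sum) for b in batches]  # toy model: sum of features
mean_value = validation_epoch_end(pages)
```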

Reviewed By: kaiwenw

Differential Revision: D27929283

fbshipit-source-id: f57948232f395b297d957cdc2afbc38a874a1810
Differential Revision: D27949485

fbshipit-source-id: 7f0fde8111150922bd0c62cb473f71a3a2bc7367
…ookresearch#450)

Summary: Pull Request resolved: facebookresearch#450

Reviewed By: kaiwenw

Differential Revision: D27692807

fbshipit-source-id: 2b880d2a5543db0fa244b818747328d6bce7ed20
Summary:
- Add more elements to the output
- Fix dependency in TARGETS
- Fix some typos in comments
- Wrap paths in `os.path.expanduser()`

Reviewed By: bankawas

Differential Revision: D27946814

fbshipit-source-id: b9cd0bedfecc1e63007e7d15f40a5431ed85e3ae
Summary: Pull Request resolved: facebookresearch#447

Reviewed By: czxttkl

Differential Revision: D26627900

fbshipit-source-id: 7be325fada7819f011092726d1cd29fb5483d599
Summary: Change the Klotski training code to use the Lightning training API

Reviewed By: alexzhangxx

Differential Revision: D28018402

fbshipit-source-id: 8c3054da176f5e08a68f4b87cc522af1fcd4912b
facebookresearch#463)

Summary: Pull Request resolved: facebookresearch#463

Reviewed By: czxttkl

Differential Revision: D28114174

fbshipit-source-id: c6f9953b2b4922c4c1b0271f3243c14f7261e103
Summary:
Pull Request resolved: facebookresearch#462

title

Reviewed By: czxttkl

Differential Revision: D28044160

fbshipit-source-id: ac3d3231a164208d27deb4a0ddd0ac3de8fe8948
Differential Revision: D28150387

fbshipit-source-id: b6409f37823e99027baec8cc349215c3fd799bb4
Summary:
Add backbone of one particular model of synthetic reward attribution. This model uses an MLP to predict each step's reward.

A single step synthetic reward model works as follows:
1. Suppose you have an MDP: s0, a0, r0, s1, a1, r1, ...st, at, rt.
2. However, you only know the aggregated reward R = r0 + r1 + ... + rt. To facilitate RL model learning, it is ideal to distribute the aggregated reward to individual steps.
3. So we create a neural network net.
4. Fit the neural network by minimizing: MSE(R, net(s0, a0) + net(s1, a1) + ... + net(st, at))
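
The loss in step 4 can be sketched in plain Python. Here `toy_net` is a hypothetical stand-in for the MLP, and the episodes and rewards are made-up numbers; only the aggregated reward per episode is compared with the sum of per-step predictions.

```python
from typing import Callable, List, Sequence, Tuple

Step = Tuple[Sequence[float], Sequence[float]]  # (state, action)


def aggregated_reward_mse(
    net: Callable[[Sequence[float], Sequence[float]], float],
    episodes: List[List[Step]],
    aggregated_rewards: List[float],
) -> float:
    # For each episode, sum the per-step predictions and compare with R.
    squared_errors = []
    for episode, total_reward in zip(episodes, aggregated_rewards):
        predicted_total = sum(net(s, a) for s, a in episode)
        squared_errors.append((total_reward - predicted_total) ** 2)
    return sum(squared_errors) / len(squared_errors)


def toy_net(state: Sequence[float], action: Sequence[float]) -> float:
    # Hypothetical stand-in for the MLP: a fixed linear function.
    return 0.5 * sum(state) + 1.0 * sum(action)


episodes = [
    [([1.0, 0.0], [1.0]), ([0.0, 2.0], [0.0]), ([1.0, 1.0], [1.0])],
    [([2.0, 0.0], [0.0])],
]
aggregated_rewards = [4.0, 1.0]
loss = aggregated_reward_mse(toy_net, episodes, aggregated_rewards)
```

Once fitted, `net(s_i, a_i)` evaluated on a single step serves as that step's synthetic reward.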

Reviewed By: j-jiafei

Differential Revision: D27934701

fbshipit-source-id: c57418459e9378c8d690596cab8a627784551a18
Differential Revision: D28190581

fbshipit-source-id: a976503c8ea44495350744f68c7306e686dc4c28
Summary:
This applies the formatting changes from black v21.4b2 to all covered
projects in fbsource. Most changes are to single line docstrings, as black
will now remove leading and trailing whitespace to match PEP8. Any other
formatting changes are likely due to files that landed without formatting,
or files that previously triggered errors in black.

Any changes to code should be AST identical. Any test failures are likely
due to bad tests, or testing against the output of pyfmt.

Reviewed By: thatch

Differential Revision: D28204910

fbshipit-source-id: 804725bcd14f763e90c5ddff1d0418117c15809a
Summary:
Pull Request resolved: facebookresearch#465

A recent change in PyTorch Lightning set the states of optimizers (https://fburl.com/code/5tpf2i0j), which contradicts the frozen dataclass we had for the Optimizer wrapper in ReAgent. This diff removes the frozen settings, and replaces `__getattr__` with the safer, more explicit property functions.
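
To illustrate the conflict, here is a sketch with a hypothetical `FakeOptimizer` and two hypothetical wrappers (none of these are the ReAgent or PyTorch classes). A frozen dataclass rejects any attribute assignment, while a non-frozen wrapper with explicit properties lets a caller set `state`.

```python
import dataclasses


class FakeOptimizer:
    """Hypothetical stand-in for a real optimizer object."""

    def __init__(self) -> None:
        self.state = {}
        self.param_groups = [{"lr": 0.01}]


@dataclasses.dataclass(frozen=True)
class FrozenWrapper:
    optimizer: FakeOptimizer


@dataclasses.dataclass
class MutableWrapper:
    optimizer: FakeOptimizer

    # Explicit properties instead of a catch-all __getattr__.
    @property
    def state(self) -> dict:
        return self.optimizer.state

    @state.setter
    def state(self, value: dict) -> None:
        self.optimizer.state = value

    @property
    def param_groups(self) -> list:
        return self.optimizer.param_groups


frozen = FrozenWrapper(FakeOptimizer())
try:
    frozen.state = {"step": 1}  # frozen dataclasses reject any assignment
    frozen_error = False
except dataclasses.FrozenInstanceError:
    frozen_error = True

mutable = MutableWrapper(FakeOptimizer())
mutable.state = {"step": 1}  # forwarded to the wrapped optimizer
```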

Reviewed By: MisterTea

Differential Revision: D28205046

fbshipit-source-id: 848e3a0f90565eb041c0e91ef27c2be9102c5a7d
Summary:
Pull Request resolved: facebookresearch#466

See title

Reviewed By: MisterTea

Differential Revision: D28236105

fbshipit-source-id: 9fc750e4c73d40b42d25b5378af94e722d96f5c5
Summary: Pull Request resolved: facebookresearch#467

Reviewed By: alexnikulkov

Differential Revision: D28237308

fbshipit-source-id: 0025540b11ffa7d4325147c4304728c644f65c5d
Summary:
Pull Request resolved: facebookresearch#469

One test failure only happens in OSS: https://app.circleci.com/pipelines/github/facebookresearch/ReAgent/1655/workflows/cbf167ec-76b2-423a-91b2-d454ba8d41d2/jobs/10454.

This diff fixes it.

Reviewed By: gji1

Differential Revision: D28248488

fbshipit-source-id: efc777757d9bc18d6b573e394e81997404252fb7
…kresearch#473)

Summary:
Pull Request resolved: facebookresearch#473

To satisfy a client team's request

Differential Revision: D28327536

fbshipit-source-id: d3b1f9ef0c6b6bc09b29930d59ed2834cdadd7df
…el (facebookresearch#471)

Summary:
Pull Request resolved: facebookresearch#471

As titled. See T83887308 & T83886520 for more details.

Reviewed By: kaiwenw

Differential Revision: D26498062

fbshipit-source-id: ea0242d16f7673cad25d018235abb31742ab7434
Summary: Pull Request resolved: facebookresearch#474

Reviewed By: czxttkl

Differential Revision: D28312845

fbshipit-source-id: abb039d445a1228bb11ffb6103744854b209b3dc
Summary:
Pull Request resolved: facebookresearch#476

Add an n-gram MLP for synthetic reward attribution. This model uses an MLP to predict each step's reward.

Compared with single-step reward model, it uses n-gram with a context window centered around each step and zero padding.
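
A small sketch of how such centered, zero-padded context windows could be built. It assumes an odd n and that each step is already a flat feature vector; the helper name `ngram_windows` is hypothetical.

```python
from typing import List


def ngram_windows(steps: List[List[float]], n: int) -> List[List[float]]:
    """Concatenate each step with its neighbours in a centered window of size n."""
    if n % 2 == 0:
        raise ValueError("this sketch assumes an odd window size")
    dim = len(steps[0])
    half = n // 2
    # Zero vectors pad both ends so every step gets a full window.
    pad = [[0.0] * dim for _ in range(half)]
    padded = pad + steps + pad
    return [
        [x for step in padded[i : i + n] for x in step]
        for i in range(len(steps))
    ]


windows = ngram_windows([[1.0], [2.0], [3.0]], n=3)
```

Each window would then be the MLP input for the step at its center.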

Reviewed By: czxttkl

Differential Revision: D28362111

fbshipit-source-id: 624de95f14b7fedb79ccb0cd47cb811b651fab04
Summary:
Pull Request resolved: facebookresearch#472

Distributed readers are not supported yet, as shown in the test plan below czxttkl.

Reviewed By: czxttkl

Differential Revision: D28292330

fbshipit-source-id: 0f03d27fdba75740ab9590747ae025c6da6ce9fa
Pyre Bot Jr and others added 23 commits December 28, 2021 13:36
Differential Revision: D33337676

fbshipit-source-id: 34ddb3312749e8c1ae80e5c688d4c3d7f2da40af
Summary:
Pull Request resolved: facebookresearch#595

The test was flaky because:
1. The seed wasn't fixed
2. Both UCB1 and MetricUCB were estimating variance, so UCB1 wasn't always at a disadvantage
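
For the first cause, a generic sketch of fixing the seed. The helper `simulate_rewards` is hypothetical and is not taken from the test in question; it shows that a locally seeded generator yields the same draws on every run.

```python
import random
from typing import List


def simulate_rewards(seed: int, n: int = 5) -> List[float]:
    # A local, seeded generator keeps the test independent of global RNG state.
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


first_run = simulate_rewards(seed=0)
second_run = simulate_rewards(seed=0)
```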

Reviewed By: czxttkl

Differential Revision: D33340651

fbshipit-source-id: 2e94997eb2a7c0c209ed1ecd62412900ed701152
Summary:
Pull Request resolved: facebookresearch#598

Implemented:
- synthetic data
   - To match state features with labels (actions), the patterns [++++++++, ++++----, ----++++, --------] correspond to 4 different actions, respectively.
   - supports state features with random noise to emulate stochasticity
   - supports labels as either one-hot vectors or integers, e.g., action=[1,0,0,0] or action=[0].
- trainer
   - CrossEntropyLoss is adopted on top of model from dqn.py
- unittest
   - training & validation loss both approach zero, as validation of reasonable training
   - probability matches labels
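
A sketch of how the synthetic data described above might be generated. It assumes `+` maps to +1.0, `-` maps to -1.0, and the noise is Gaussian; the names `PATTERNS` and `make_example` are hypothetical.

```python
import random
from typing import List, Tuple

# One 8-dimensional pattern per action (assumed encoding: + -> 1.0, - -> -1.0).
PATTERNS = [
    [1.0] * 8,
    [1.0] * 4 + [-1.0] * 4,
    [-1.0] * 4 + [1.0] * 4,
    [-1.0] * 8,
]


def make_example(
    action: int, noise_std: float, rng: random.Random, one_hot: bool
) -> Tuple[List[float], List[int]]:
    # State is the action's pattern plus optional Gaussian noise.
    state = [x + rng.gauss(0.0, noise_std) for x in PATTERNS[action]]
    if one_hot:
        label = [1 if i == action else 0 for i in range(len(PATTERNS))]
    else:
        label = [action]
    return state, label


rng = random.Random(0)
clean_state, one_hot_label = make_example(0, 0.0, rng, one_hot=True)
noisy_state, int_label = make_example(2, 0.1, rng, one_hot=False)
```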

Reviewed By: gji1

Differential Revision: D33409534

fbshipit-source-id: 3d9bfac68f0ef405e379ad88add7b533f72f1e2a
Summary:
Pull Request resolved: facebookresearch#600

Add missing init file in reagent/prediction/cfeval/

Reviewed By: czxttkl

Differential Revision: D33795738

fbshipit-source-id: bee4f88bfce9aa21af81db1eb96843706c07afeb
Summary: as titled

Reviewed By: wenwei202

Differential Revision: D33796163

fbshipit-source-id: 8b9480c71f6f174b05bcf8d95b9313760a86d1aa
Summary:
Pull Request resolved: facebookresearch#601

as titled

Reviewed By: PavlosApo

Differential Revision: D33802718

fbshipit-source-id: 2c2668a1bcddfe706c6303c80544f997356af417
Reviewed By: daniellepintz

Differential Revision: D33848208

fbshipit-source-id: ccd590d0286cb2bd2f381e5003bba230c9406b58
Summary:
Pull Request resolved: facebookresearch#597

as titled

Reviewed By: alexnikulkov

Differential Revision: D33225789

fbshipit-source-id: d0dcf72329bef88fd0ace08f3c674ee3bff67242
Summary:
See "Feature config definition" section in https://fb.quip.com/1RdkAeTsSjgh for why I made the change.

Alex raised a good point that we may need to unify the representation of sparse features. Will consider this in a later diff.

Reviewed By: alexnikulkov

Differential Revision: D34081716

fbshipit-source-id: 0a2ff14360640435f7db7bc59b87f85b8a5f4b7e
Summary: See data reading section in https://fb.quip.com/1RdkAeTsSjgh for why I made the change.

Reviewed By: alexnikulkov

Differential Revision: D34081719

fbshipit-source-id: a57612a84eed2a2f6211db31f635cba01ddc9b45
Summary:
As a showcase for how to add sparse features to ReAgent

See "Model Training" section in quip https://fb.quip.com/1RdkAeTsSjgh

Reviewed By: alexnikulkov

Differential Revision: D34082047

fbshipit-source-id: 5d02b337cf3059c5f986a4b2d95b92d56c5cd7e0
Summary:
As a showcase for how to add sparse features to ReAgent

See "Model Training" section in quip https://fb.quip.com/1RdkAeTsSjgh

Reviewed By: alexnikulkov

Differential Revision: D34082046

fbshipit-source-id: 82a7294f0d9dd36c0f63d85c6366b9b2e0114dc4
Summary: Necessary changes in model managers to accommodate previous changes in the stack.

Reviewed By: alexnikulkov

Differential Revision: D34082048

fbshipit-source-id: 638554012aefaf71acc058b8add679dfb4382703
Summary: as titled

Reviewed By: alexnikulkov

Differential Revision: D34082045

fbshipit-source-id: 2f71e1b735512f01b65778d7b83a283832aa4ffe
…arch#604)

Summary:
Pull Request resolved: facebookresearch#604

All tests accompanied with D33850915

Reviewed By: alexnikulkov

Differential Revision: D33971614

fbshipit-source-id: 215ce0f609ab0d0a47cc1e6f88806444ef900ae0
Summary:
Pull Request resolved: facebookresearch#605

as titled

Reviewed By: gji1

Differential Revision: D34114567

fbshipit-source-id: e5a792c36c55fe047ef7bdd1620ee56c76104f58
Summary:
Pull Request resolved: facebookresearch#606

A new foreach flag is being added to the optimizers to indicate whether foreach logic or single tensor logic is used (see D33767870 and the associated stack).

This causes reagent tests to fail such as https://www.internalfb.com/intern/testinfra/diagnostics/7318349469673867.281475021413633.1644559942/

The issue arises from this line https://fburl.com/code/lroy3a2p where the value for foreach cannot be found in `getattr(self, k)`.

This PR adds the foreach flag to `uninferrable_optimizers.py` to address this. (Note that we do not add this flag to `LBFGS` and `SparseAdam`, as they do not support this option.)
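
The linked line is not visible here, so the following is only a guess at the mechanism, with hypothetical names throughout (`FakeAdam`, `AdamConfigOld`, `AdamConfigNew`, `collect_kwargs`). If a config object is read with `getattr` for every constructor parameter of the optimizer, a newly added parameter breaks the lookup until the config gains a matching field.

```python
import inspect
from dataclasses import dataclass
from typing import Optional


class FakeAdam:
    """Hypothetical optimizer whose constructor gained a `foreach` parameter."""

    def __init__(self, params, lr: float = 1e-3, foreach: Optional[bool] = None):
        self.params, self.lr, self.foreach = params, lr, foreach


@dataclass
class AdamConfigOld:
    lr: float = 1e-3


@dataclass
class AdamConfigNew:
    lr: float = 1e-3
    foreach: Optional[bool] = None  # new field mirrors the new parameter


def collect_kwargs(config, optimizer_cls) -> dict:
    # Look up every constructor parameter (except self/params) on the config.
    names = [
        k
        for k in inspect.signature(optimizer_cls.__init__).parameters
        if k not in ("self", "params")
    ]
    return {k: getattr(config, k) for k in names}


try:
    collect_kwargs(AdamConfigOld(), FakeAdam)
    old_config_failed = False
except AttributeError:
    old_config_failed = True

new_kwargs = collect_kwargs(AdamConfigNew(), FakeAdam)
```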

Reviewed By: alexnikulkov

Differential Revision: D34216723

fbshipit-source-id: fac4e6095157c7cd33184bfa5b7042bdd151688e
Reviewed By: shannonzhu

Differential Revision: D34226909

fbshipit-source-id: 4045a574efe46205ddf87ff839f52e2aac454fc5
Reviewed By: shannonzhu

Differential Revision: D34333122

fbshipit-source-id: 896c3306d85863ee8831ed08023bcd87e36f1657
Summary:
Pull Request resolved: facebookresearch#607

1. log the performance of each episode
2. create a use-case-specific episode post callback
3. create a step post callback
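
As a generic illustration of step and episode post callbacks (the function `run_episode` and its parameters are hypothetical, not the ReAgent API), the loop below calls one hook after each step and another once the episode ends.

```python
from typing import Callable, List, Optional


def run_episode(
    rewards: List[float],
    post_step: Optional[Callable[[int, float], None]] = None,
    post_episode: Optional[Callable[[float], None]] = None,
) -> float:
    total = 0.0
    for t, reward in enumerate(rewards):
        total += reward
        if post_step is not None:
            post_step(t, reward)  # called after every step
    if post_episode is not None:
        post_episode(total)  # called once, after the episode ends
    return total


step_log: List[int] = []
episode_log: List[float] = []
episode_return = run_episode(
    [1.0, 0.0, 2.0],
    post_step=lambda t, r: step_log.append(t),
    post_episode=episode_log.append,
)
```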

Reviewed By: vgup0, alexnikulkov

Differential Revision: D34295015

fbshipit-source-id: 2a72c9d291421707fb3192c34b74f5bcbd788a53
Summary:
Pull Request resolved: facebookresearch#608

Update README and add ForkedPdb in reagent
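
For readers unfamiliar with the idea, a commonly used recipe for a pdb subclass that works inside forked child processes looks roughly like this. It is a sketch of the general pattern, assumes a POSIX system with `/dev/stdin`, and may differ from the version added in this diff.

```python
import pdb
import sys


class ForkedPdb(pdb.Pdb):
    """Pdb variant usable from a forked child process (POSIX only)."""

    def interaction(self, *args, **kwargs):
        original_stdin = sys.stdin
        try:
            # Child processes often lose the interactive stdin; reopen it.
            with open("/dev/stdin") as new_stdin:
                sys.stdin = new_stdin
                super().interaction(*args, **kwargs)
        finally:
            sys.stdin = original_stdin
```

Inside a worker process, `ForkedPdb().set_trace()` would then be used in place of `pdb.set_trace()`.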

Reviewed By: alexnikulkov

Differential Revision: D34425175

fbshipit-source-id: c59ee44b8ff89cf87a13794f23d85f0890f52cb2
Summary:
Pull Request resolved: facebookresearch#609

X-link: meta-pytorch/torchrec#112

As discussed in D33960410, we want processing KeyedTensor into sparse features to be the responsibility of SparseArch.

A motivation for this is that we want to have an extension EsuhmDLRM, where all we would need to do is replace the sparse arch component. However, the esuhm sparse arch's output doesn't adhere to the current KeyedTensor output.

Reviewed By: bigning

Differential Revision: D34482853

fbshipit-source-id: 90048cc1d36327593422d459b49cb8d3783226e2
Summary:
- Model Manager for BehaviorCloning
- UnitTest of the ModelManager
- DataModule for UnitTest

Reviewed By: czxttkl

Differential Revision: D33829752

fbshipit-source-id: 9d1d6af293f652e095b914608108fc0d215ff257
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D34626435

Summary:
Pull Request resolved: facebookresearch#611

Title

Differential Revision: D34626435

fbshipit-source-id: 3a6c52ebd28955e84a769060d4c97b586048a131