Conversation

ArashPartow
Contributor

No description provided.

igfox and others added 30 commits August 16, 2021 08:59
Summary:
Pull Request resolved: facebookresearch#520

Adds a dedicated unit test for the PPO Trainer; additionally:
- Fixes a bug with the fully connected value net
- Fixes some bugs in PPO training around using the value net
- Adds possible_action_mask to DuelingQNetwork

Reviewed By: czxttkl

Differential Revision: D30114686

fbshipit-source-id: 3735af1ea65429867d63f7da1462194242ad8254
Differential Revision: D30114686 (facebookresearch@8d00eb1)

Original commit changeset: 3735af1ea654

fbshipit-source-id: 905ff36cdf587565487b8ad2e623c3cfbd77effc
…evision.

Summary: ^

Reviewed By: yifuwang

Differential Revision: D30346110

fbshipit-source-id: 154e69f233132635e947ddbd252ffbd957ead6f1
Summary:
Pull Request resolved: facebookresearch#526

Adds a dedicated unit test for the PPO Trainer; additionally:
- Fixes a bug with the fully connected value net
- Fixes some bugs in PPO training around using the value net
- Adds possible_action_mask to DuelingQNetwork (see the sketch below)

Note: a continuation of D30114686 (facebookresearch@8d00eb1), which I reverted after it caused some CircleCI failures
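
A minimal sketch of how a possible-actions mask is typically applied to Q-values so that unavailable actions are never selected; this is my own illustration, not the ReAgent implementation:

```python
import torch


def mask_q_values(q_values: torch.Tensor, possible_actions_mask: torch.Tensor) -> torch.Tensor:
    """Set Q-values of unavailable actions to -inf so argmax never picks them.

    q_values: (batch, num_actions) float tensor
    possible_actions_mask: (batch, num_actions) 0/1 tensor, 1 = action allowed
    """
    return q_values.masked_fill(possible_actions_mask == 0, float("-inf"))


# Example: the greedy action is chosen only among allowed actions.
q = torch.tensor([[1.0, 5.0, 3.0]])
mask = torch.tensor([[1, 0, 1]])
greedy_action = mask_q_values(q, mask).argmax(dim=1)  # tensor([2])
```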

Reviewed By: czxttkl

Differential Revision: D30342897

fbshipit-source-id: 9be5e86d234619e97e476e46556a4dee07e3b734
Reviewed By: czxttkl

Differential Revision: D29880479

fbshipit-source-id: 61c241d5570c7b81567974c50068a672e6058278
Summary: Implement DiscreteDqnDataModule as an AutoDataModule.

Reviewed By: czxttkl

Differential Revision: D29835012

fbshipit-source-id: 384413ac3d61cd52285c6a860cff0e0f15e299e0
Summary: Update SAC to support ID-list features

Reviewed By: czxttkl

Differential Revision: D29880917

fbshipit-source-id: b7be1b7727a1749af38e1640d192b15c1b7608d1
Summary:
Pull Request resolved: facebookresearch#527

ActorPredictorUnwrapper takes state features as positional arguments, not ServingFeatureData.

Reviewed By: igfox

Differential Revision: D30428162

fbshipit-source-id: aaa7307cef35200545478c621b7cb3fe9a1f4eea
Summary: Diff D30342897 (facebookresearch@9b25610) replaced uses of FullyConnected (which takes a tensor as input) with FloatFeatureFullyConnected (which takes FeatureData as input). This broke an assumption made in the predictor wrapper.

Reviewed By: kittipatv

Differential Revision: D30432700

fbshipit-source-id: 732eda23f97cb21f094daed6857fb44dc49316b3
Summary: When `feature_type` is given, the parameters for the Box-Cox transformation are not computed. That causes an error when we try to instantiate the normalization.

Reviewed By: igfox

Differential Revision: D30437833

fbshipit-source-id: 2e9c25a28e6d9cfe85670eb3b6714668f4cefff6
Summary:
Pull Request resolved: facebookresearch#528

Simplify the comments

Reviewed By: xuruiyang

Differential Revision: D30343990

fbshipit-source-id: 8b3c4172a4af9e01c27e8e511486bed68c1032b5
Differential Revision: D30514142

fbshipit-source-id: 4e9d8facc613e67d7806a26a170c8b7545a1c742
Summary:
- Pass down action_names to DiscreteDqnDataModule
- Fix action query formatting

Reviewed By: czxttkl

Differential Revision: D30527307

fbshipit-source-id: b9128b3708dc922f774d7fa97d041c0e16df1088
Summary:
Pull Request resolved: facebookresearch#529

Using torch::jit::load has some privacy issues (T93507328); instead, we're supposed to load the model as a caffe2::PyTorchPredictorContainer and then extract the PyTorch module.

Reviewed By: kittipatv

Differential Revision: D30285801

fbshipit-source-id: b13330d5a27eec943a46fe13a2be4c203e2e993c
Summary:
Steps:
1. Run the scorer (learned CRR actor/Q-network) on the whole evaluation dataset and get a list of scores (for action 1).
2. Find the percentiles of these scores. This determines a threshold for, e.g., 60% promo.
3. Use this threshold to construct a new predictor wrapper, which outputs 1/0 (sketched below).
4. Replace the original wrapper with this new wrapper, at the same Manifold path.

The validator takes a parameter that is null by default. If not specified, the promo threshold is set such that the promo ratio matches the dataset.
If specified, the promo threshold is set such that the promo ratio equals the specified percentile.
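
A rough sketch of steps 1-3; the class and function names here are hypothetical, not the actual ReAgent wrapper:

```python
import torch


def compute_promo_threshold(scores: torch.Tensor, promo_ratio: float) -> float:
    """Pick the score threshold so that `promo_ratio` of examples fall above it."""
    # The (1 - promo_ratio) quantile leaves promo_ratio of the mass above it.
    return torch.quantile(scores, 1.0 - promo_ratio).item()


class ThresholdWrapper(torch.nn.Module):
    """Hypothetical predictor wrapper that turns the scorer's output into 1/0."""

    def __init__(self, scorer: torch.nn.Module, threshold: float):
        super().__init__()
        self.scorer = scorer
        self.threshold = threshold

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return (self.scorer(state) >= self.threshold).long()
```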

Reviewed By: DavidV17

Differential Revision: D30584956

fbshipit-source-id: 310f91bc25470904dfcf1b8b6455376334d2a8f0
Summary: make Pyre complain less

Reviewed By: czxttkl

Differential Revision: D30560574

fbshipit-source-id: ec419dd2ec0fae0285f916d61d6f262e1732eb00
Summary:
Pull Request resolved: facebookresearch#532

Adding unit tests to cover some functions in transform.py

I'm leaving some methods uncovered in this diff to try out bootcamping unit test creation

Reviewed By: czxttkl

Differential Revision: D30607144

fbshipit-source-id: 08a993ab8afadd49cc30c6b691989b8f867a151a
Summary: A lighter weight way to experiment with sparse features

Reviewed By: czxttkl

Differential Revision: D30560575

fbshipit-source-id: 21ea8b560c0578e81f3ddf127b017db16630da3c
Summary: Some choices of feature type overrides were not respected.

Reviewed By: DavidV17

Differential Revision: D30658323

fbshipit-source-id: 5d6d2f54a7904ef47b5c1e89fdca858cb0af5c61
Summary:
Gym will be installed by tox before running unit tests. There is no need to install Gym outside of the virtual env.

Pull Request resolved: facebookresearch#533

Reviewed By: czxttkl

Differential Revision: D30731643

fbshipit-source-id: 19ad746de6712bebb89770366b3d04a65294eeb9
Summary:
Pull Request resolved: facebookresearch#534

Catching PickleError stopped working because the exception is now a RuntimeError. Since RuntimeError is quite generic, I don't think it's a good idea to catch it; therefore, let's just disable parallel evaluation.

Reviewed By: igfox

Differential Revision: D30730645

fbshipit-source-id: 4f9be1dd5fd9e559d76c6cda0aaa183da410d2ed
Summary: Exposes the upper bound clip limit for action weights in CRR as a max_weight parameter
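
As a minimal sketch of what an upper-bound clip on action weights typically looks like in CRR-style training (the exact formula and parameter wiring in ReAgent may differ):

```python
import torch


def crr_action_weights(advantages: torch.Tensor, beta: float = 1.0, max_weight: float = 20.0) -> torch.Tensor:
    # Exponentiated-advantage weights, clipped from above so that a few very
    # large advantages do not dominate the actor loss.
    return torch.clamp(torch.exp(advantages / beta), max=max_weight)
```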

Reviewed By: DavidV17

Differential Revision: D30739945

fbshipit-source-id: 3a8273d32f0566e4801ae30c90703e880a4f6691
Summary:
Pull Request resolved: facebookresearch#531

A lite API for solving combinatorial problems. Currently it only supports discrete input spaces.

Reviewed By: kittipatv

Differential Revision: D30453019

fbshipit-source-id: 47d0cdb12ef4e2b7b26d1a00a90f70016ba67af0
…facebookresearch#505)

Summary:
Pull Request resolved: facebookresearch#505

When we set `reader_options.min_nodes` > 1, we turn on distributed training. The koski reader in each trainer process should only read `1/min_nodes` of the data.

Reviewed By: j-jiafei

Differential Revision: D28779856

fbshipit-source-id: 9665c6b65b6d02066ae38d2f37be8d268c624797
Differential Revision: D30797764

fbshipit-source-id: c7c9fa99d5de21acb6917e7d70ade5049e20bab3
Summary:
Pull Request resolved: facebookresearch#536

np.array -> np.ndarray
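
For context, `np.array` is the factory function while `np.ndarray` is the array type, so only the latter is valid in type annotations; a small illustration:

```python
import numpy as np


# np.ndarray is the type; np.array is just the function that constructs one.
def normalize(x: np.ndarray) -> np.ndarray:
    return (x - x.mean()) / (x.std() + 1e-8)
```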

Reviewed By: wenwei202

Differential Revision: D30812091

fbshipit-source-id: 52e6fea3be48983981e28b49b5e709593951763f
…lity assert.

Summary: Add a function to convert idx to raw choices. More tests with probability asserts.

Reviewed By: czxttkl

Differential Revision: D30824852

fbshipit-source-id: 502c814f8cf629603fa7ee9576706d1833ca182e
Summary:
Pull Request resolved: facebookresearch#537

We need a unique identity for each epoch and dataset type (train/val/test).
We must use a CPU-based batch preprocessor.
Some other small fixes.

Reviewed By: j-jiafei

Differential Revision: D30861672

fbshipit-source-id: e89a1a03bc345123a164987c3f4c7876fc783b93
Summary: as titled

Reviewed By: wenwei202

Differential Revision: D30909621

fbshipit-source-id: a76f5298566dfc05360f83be565f91714eac4084
Summary:
Pull Request resolved: facebookresearch#538

Support specifying estimated_budgets and optimizer_name.

Reviewed By: teytaud

Differential Revision: D30912782

fbshipit-source-id: e4dd8804face839bb6175afd22944dd7893fe5c7
Xiaoxiang Zhang and others added 25 commits November 7, 2022 11:30
Summary:
Pull Request resolved: facebookresearch#690

In GPU mode, the function raises an error since tensors created by torch.ones or torch.zeros are on the CPU while the rest of the tensors are on the GPU.
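
A minimal illustration of the kind of fix this implies (not the actual diff): create new tensors on the same device and dtype as the incoming tensors instead of defaulting to CPU.

```python
import torch


def add_bias_column(features: torch.Tensor) -> torch.Tensor:
    # Creating the ones on features.device avoids a CPU/GPU mismatch when
    # features is a CUDA tensor.
    ones = torch.ones(features.shape[0], 1, device=features.device, dtype=features.dtype)
    return torch.cat([features, ones], dim=1)
```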

Reviewed By: alexnikulkov

Differential Revision: D41062175

fbshipit-source-id: 27a4be58804f72f258476c749c64731de157c7f2
Summary:
X-link: pytorch/pytorch#88701

X-link: meta-pytorch/tnt#269

Pull Request resolved: facebookresearch#691

X-link: meta-pytorch/torchsnapshot#129

X-link: meta-pytorch/torchrec#799

X-link: facebookresearch/detectron2#4649

Context in https://fburl.com/4irjskbe

This change deletes distributed.pyi, so that lintrunner will run mypy on distributed.py for type checking.
It also helps fix a lot of Pyre false alarms, so that we can remove those Pyre suppressions.

Reviewed By: zhaojuanmao

Differential Revision: D41028360

fbshipit-source-id: 577f212ca4c47e23a8577c4b92385c483f96e2c1
Summary:
Pull Request resolved: facebookresearch#692

Fixing a bug introduced in D41062175

Reviewed By: qfettes

Differential Revision: D41164406

fbshipit-source-id: d6bac862ea37fd9a807f6a344ddd8e6cb0b31c45
…h#693)

Summary: Pull Request resolved: facebookresearch#693

Reviewed By: BerenLuthien

Differential Revision: D41226452

fbshipit-source-id: 57c384670f2fd28c56a9554581a07289dc217868
Summary:
Pull Request resolved: facebookresearch#697

I recently exposed LRScheduler as a public endpoint, as that is the right direction for users. This diff adds LRScheduler as a torch LR scheduler, which it is.

This would fix test errors such as https://www.internalfb.com/intern/testinfra/diagnostics/281475249445597.562950030008200.1668085134/, which were introduced by my landing of D41109279.

Created from CodeHub with https://fburl.com/edit-in-codehub

Reviewed By: czxttkl

Differential Revision: D41187073

fbshipit-source-id: 2637b6a80247c24620cf0ce8310e8181135637cd
Summary:
Pull Request resolved: facebookresearch#695

Add Offline Evaluation for non-stationary Contextual Bandit policies.
This diff includes only the Policy Evaluator algorithm from the LinUCB paper (https://arxiv.org/pdf/1003.0146.pdf, Algorithm 3).
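
For reference, the replay-style Policy Evaluator (Algorithm 3 in the LinUCB paper) can be sketched roughly as follows; the function signature is an assumption, not ReAgent's API:

```python
from typing import Any, Callable, Iterable, List, Tuple


def replay_policy_evaluator(
    policy: Callable[[Any, List[Any]], Any],
    logged_events: Iterable[Tuple[Any, List[Any], Any, float]],
) -> float:
    """Estimate a policy's average reward from logged (context, arms, action, reward) events.

    Only events where the evaluated policy picks the same arm as the logging
    policy contribute to the estimate (Li et al. 2010, Algorithm 3).
    """
    total_reward, matched = 0.0, 0
    for context, arms, logged_action, reward in logged_events:
        if policy(context, arms) == logged_action:
            total_reward += reward
            matched += 1
    return total_reward / matched if matched else float("nan")
```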

Reviewed By: BerenLuthien

Differential Revision: D41226450

fbshipit-source-id: 10fae8b9b0fb10d44d8ddf313938028585a94c07
…search#694)

Summary:
Pull Request resolved: facebookresearch#694

`BaseCBTrainerWithEval` integrates Offline Eval into the training process. By default the behavior is the same as before the refactor, but after the `.attach_eval_module()` method is called, every batch is processed by the eval module before training on it. The processing includes keeping track of the reward and filtering the training batch.
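
A hypothetical sketch of that flow (names and structure are illustrative only, not ReAgent's classes):

```python
class TrainerWithOptionalEval:
    """Wraps a trainer so an offline-eval module can intercept training batches."""

    def __init__(self, trainer):
        self.trainer = trainer
        self.eval_module = None

    def attach_eval_module(self, eval_module) -> None:
        self.eval_module = eval_module

    def training_step(self, batch, batch_idx):
        if self.eval_module is not None:
            # The eval module tracks the policy's reward on the incoming batch
            # and returns a (possibly filtered) batch to actually train on.
            batch = self.eval_module.ingest_training_batch(batch)
        return self.trainer.training_step(batch, batch_idx)
```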

Reviewed By: BerenLuthien

Differential Revision: D41239491

fbshipit-source-id: f5c506d14a736a71ddc1b64270d1e8842a23488b
Summary:
Pull Request resolved: facebookresearch#700

In addition to the number of observations, keep track of the total weight of consumed data

Reviewed By: BerenLuthien

Differential Revision: D41483276

fbshipit-source-id: fab4c95455d7ef611706b9356ffaf3416adc1d6d
Summary: Add new is_causal flag introduced by nn.Transformer API

Reviewed By: houseroad

Differential Revision: D42095517

fbshipit-source-id: 8bd7813aa86fa49d50c0fcfac6c0d9bb71320b2b
…sor (facebookresearch#701)

Summary:
Pull Request resolved: facebookresearch#701

Make changes to the ReAgent transforms and the CB preprocessor to add support for a variable number of arms:
1. The main API change: when `num_arms=None` in `FbContBanditBatchPreprocessor`, we use the variable-length version of the code. A presence tensor is generated to indicate which arms are present vs. zero-padded (see the sketch below).
2. Add an `arm_presence` field to `CBInput` to indicate which arms are present.
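
A small illustration of the zero-padding plus presence-mask idea; the shapes and helper name are hypothetical, not the actual preprocessor:

```python
from typing import List, Tuple

import torch


def pad_arm_features(arm_features: List[torch.Tensor], max_arms: int) -> Tuple[torch.Tensor, torch.Tensor]:
    """Zero-pad per-example arm features to max_arms and build an arm_presence mask.

    arm_features: list of (num_arms_i, feature_dim) tensors, one per example.
    Returns (batch, max_arms, feature_dim) padded features and a boolean
    (batch, max_arms) arm_presence mask.
    """
    feature_dim = arm_features[0].shape[1]
    padded = torch.zeros(len(arm_features), max_arms, feature_dim)
    presence = torch.zeros(len(arm_features), max_arms, dtype=torch.bool)
    for i, feats in enumerate(arm_features):
        n = feats.shape[0]
        padded[i, :n] = feats
        presence[i, :n] = True
    return padded, presence
```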

Reviewed By: BerenLuthien

Differential Revision: D41989361

fbshipit-source-id: 4804556d427c5e4fd7cf2d8da66359cbacce2514
Summary: When some arms might be missing, apply a masked softmax to model scores during offline eval to avoid selecting the missing arms.
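
A minimal sketch of a masked softmax over arm scores, assuming a boolean `arm_presence` mask and at least one present arm per row:

```python
import torch


def masked_softmax(scores: torch.Tensor, arm_presence: torch.Tensor, dim: int = -1) -> torch.Tensor:
    # Missing arms get a score of -inf, so they receive exactly zero probability.
    masked_scores = scores.masked_fill(~arm_presence, float("-inf"))
    return torch.softmax(masked_scores, dim=dim)
```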

Reviewed By: BerenLuthien

Differential Revision: D41990957

fbshipit-source-id: e04d370ba9001a730cb17ff6e825bfbe732f3996
Summary:
Pull Request resolved: facebookresearch#704

numpy 1.24.0 introduced test failures in CircleCI (example: https://app.circleci.com/pipelines/github/facebookresearch/ReAgent/2548/workflows/3cb8da07-f49f-4300-ab65-28dca3f24633)
My preliminary investigation showed that it might be related to several things:
1. The enforcement of ranges for data types (https://numpy.org/devdocs/release/1.24.0-notes.html#conversion-of-out-of-bound-python-integers); specifically, we might be generating very large values in tests, which we then attempt to histogram, and some variables start overflowing. But I'm not 100% sure.
2. Deprecation of `np.object` (https://numpy.org/devdocs/release/1.24.0-notes.html#expired-deprecations)

I created a task T140754266 to properly upgrade us to numpy 1.24. Timeline is TBD

Reviewed By: BerenLuthien

Differential Revision: D42198006

fbshipit-source-id: 033aaddcd305c5b4065e948c4757e5bbc8a5a846
…arch#705)

Summary:
Pull Request resolved: facebookresearch#705

Pass `SummaryWriter` to offline evaluators, so that they can log metrics to TensorBoard.

Reviewed By: BerenLuthien

Differential Revision: D42251660

fbshipit-source-id: 202cee318ad2ba47055b1ed945eebeeab88db226
Summary:
N2904645

torch.inverse will report an error when the condition number of a matrix is too high. We propose to use pinv to fix this for now.
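
For illustration, the pseudo-inverse is a drop-in, more forgiving alternative for nearly singular matrices (the actual call site is internal to the trainer):

```python
import torch

A = torch.tensor([[1.0, 1.0],
                  [1.0, 1.0 + 1e-9]])  # nearly singular (huge condition number)
b = torch.tensor([1.0, 2.0])

# torch.linalg.pinv stays well-defined where torch.inverse may fail or blow up.
coefficients = torch.linalg.pinv(A) @ b
```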

Reviewed By: alexnikulkov

Differential Revision: D42322767

fbshipit-source-id: a8a2060a1aa55db722a38d70f7793b6aae76f97b
Summary: This is a copy of D42322767 to fix the inverse of an ill-conditioned matrix for joint LinUCB

Reviewed By: alexnikulkov

Differential Revision: D42334903

fbshipit-source-id: 7ad8176688f4645b8fdf252779c99a77fcfa182e
…ch#707)

Summary:
Pull Request resolved: facebookresearch#707

I'm updating the training logic of LinUCB to keep track of the average values of `A` and `b` instead of cumulative values. This should improve the numerical stability of training by preventing numerical overflows.

The average values are aggregated among the trainers and among epochs when computing the coefficients.
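
A sketch of the idea under these assumptions (illustrative names, not the actual trainer): since θ = A⁻¹b is unchanged when A and b are divided by the same total weight, tracking weighted averages preserves the solution while keeping magnitudes bounded.

```python
import torch


class AveragedLinUCBState:
    """Keep running averages of A = E[x xᵀ] and b = E[reward · x] instead of sums."""

    def __init__(self, dim: int):
        self.avg_A = torch.zeros(dim, dim)
        self.avg_b = torch.zeros(dim)
        self.total_weight = 0.0

    def update(self, x: torch.Tensor, reward: float, weight: float = 1.0) -> None:
        new_total = self.total_weight + weight
        # Incremental update of the weighted means keeps values bounded.
        self.avg_A += (weight / new_total) * (torch.outer(x, x) - self.avg_A)
        self.avg_b += (weight / new_total) * (reward * x - self.avg_b)
        self.total_weight = new_total

    def coefficients(self) -> torch.Tensor:
        return torch.linalg.pinv(self.avg_A) @ self.avg_b
```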

Reviewed By: BerenLuthien

Differential Revision: D42334470

fbshipit-source-id: f103c9ab2fe84fa2da639d7ec726c82737d6c4f0
Summary:
Pull Request resolved: facebookresearch#708

Adding support for distributed Offline Eval. This requires maintaining local buffers in each trainer instance and syncing them across all trainers periodically. The sync happens under one of two conditions:
1. When the "critical" weight of data has been consumed (set approximately equal to the size of a 1-hour partition)
2. At the end of the training epoch (if data has been consumed since the last sync)

Also, updating the FREE pipeline to remove the restriction on number of nodes for Offline Eval runs

Differential Revision: D42407669

fbshipit-source-id: ce436b42b1bb01f3688c6f1f80c52a3d66a47b22
Summary:
Pull Request resolved: facebookresearch#709

Reduce the logging level of unwanted messages.
See the image below for an example of the messages which will be removed:
{F848736094}

Reviewed By: rodrigodesalvobraz

Differential Revision: D42563840

fbshipit-source-id: 9ceff0e6a5419e0ff61d1b93009a7922754f747c
Summary:
As title

Note: we cannot put this num_obs as a parameter in the model in the buffer, as this would cause bugs for the current launch candidate when doing a future package upgrade.

TODO: Register num_obs as a parameter of the LinUCB model when we are about to replace the old launch candidate.

Reviewed By: alexnikulkov

Differential Revision: D42588867

fbshipit-source-id: 5b94973ecf3d58efbcc2e89b3fd4b3e32ba2462a
Summary:
Pull Request resolved: facebookresearch#710

Add a method for applying discounting to better control when it gets applied
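
A hypothetical one-liner of what such a discounting step might look like for LinUCB-style sufficient statistics (the function name and placement are assumptions, not ReAgent's API):

```python
import torch


def apply_discounting(avg_A: torch.Tensor, avg_b: torch.Tensor, gamma: float = 0.99):
    # Exponentially down-weight older observations before the next update.
    return gamma * avg_A, gamma * avg_b
```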

Reviewed By: PoojaAg18

Differential Revision: D42815841

fbshipit-source-id: 19851f7c8e83ce460bbe88bc6e7447ecfc873318
…h#711)

Summary:
The current version is specified via a commit hash. This started generating an error during install for an unknown reason. I analyzed the releases and the commit history; 1.6.0 is the closest released version to this commit. I tried using a recent Lightning version, but it generated quite a few errors on CircleCI due to changes in APIs.

Pull Request resolved: facebookresearch#711

Reviewed By: speedystream, BillMatrix

Differential Revision: D42873869

fbshipit-source-id: fd7b34fc7193cfb15f6d4ff5d9d379e7c422835d
Differential Revision: D42946511

fbshipit-source-id: ce36494c8f42fd1010f4a89f13c6991cf29ef4e6
Differential Revision: D43043970

fbshipit-source-id: 454cdd9a6bd4a952e33530ca51ad83654df965d1
Summary:
Pull Request resolved: facebookresearch#713

I'm using recmetrics from torchrec to log supervised learning metrics. Metrics logged:
- MSE
- MAE
- Calibration

New metrics can easily be added by updating the model_utils.py file

The metrics are logged every 100 steps by default (controlled by `config.trainer.log_every_n_steps`). If this parameter is set too small, training will be slowed down by frequent matrix inversions.

The default window size is equal to batch_size*world_size, so the window is just 1 step.

An unfortunate side effect of per-ts discounting is that each epoch (ts partition) has a separate curve in TensorBoard.
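
For reference, the three metrics can be computed from weighted predictions and labels roughly as follows; this is a plain-PyTorch illustration, not the torchrec recmetrics API:

```python
import torch


def supervised_metrics(preds: torch.Tensor, labels: torch.Tensor, weights: torch.Tensor) -> dict:
    mse = (weights * (preds - labels) ** 2).sum() / weights.sum()
    mae = (weights * (preds - labels).abs()).sum() / weights.sum()
    # Calibration: total predicted value over total observed value; 1.0 means
    # predictions are on average neither over- nor under-shooting.
    calibration = (weights * preds).sum() / (weights * labels).sum()
    return {"mse": mse.item(), "mae": mae.item(), "calibration": calibration.item()}
```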

Reviewed By: PoojaAg18

Differential Revision: D42971642

fbshipit-source-id: 52e1d23209aca487e3755e039597aea4e19a0345
@ArashPartow
Contributor Author

@MisterTea, when you have a moment can you please review and merge this PR?
