
Conversation

adhiiisetiawan

Update the documentation in the quick start example to improve clarity. The following changes have been made:

  • Modified the command in the documentation to use the run_test_replay_buffer function instead of run_test in the reagent.gym.tests.test_gym module.
  • Updated the example code snippet to reflect the change:

Before

# set the config
export CONFIG=reagent/gym/tests/configs/cartpole/discrete_dqn_cartpole_online.yaml
# train and evaluate model on gym environment
./reagent/workflow/cli.py run reagent.gym.tests.test_gym.run_test $CONFIG

After

# set the config
export CONFIG=reagent/gym/tests/configs/cartpole/discrete_dqn_cartpole_online.yaml
# train and evaluate model on gym environment
./reagent/workflow/cli.py run reagent.gym.tests.test_gym.run_test_replay_buffer $CONFIG

I also made the same change in the "On-Policy Training" section.

These changes ensure that the example aligns with the current implementation and usage of the reagent.gym.tests.test_gym module.

Please review these modifications and let me know if any further adjustments are needed.

Wei Wen and others added 30 commits September 27, 2021 09:19
Summary:
1. super net sampling (with Reagent APIs)
2. Other utils to support 1
2.1. update `SuperNNConfig` attribute by a path str so that samples from Reagent ng.p.Dict can be easily mapped to masks within `SuperNNConfig`: `replace_named_tuple_by_path`
3. test samples such that counts of masks are close to configured probabilities

Reviewed By: dehuacheng

Differential Revision: D31126805

fbshipit-source-id: 95e48728773c2afd7e6856f8a7a831b00214bbda
Summary: Add a unit test for OneHotActions.

Reviewed By: igfox

Differential Revision: D31248082

fbshipit-source-id: 74d55ab5d3a23c75f5d0020b53616c87023afcf0
Summary: Adds unit test to the test_processing.py for columnvector function from transform.py

Reviewed By: igfox

Differential Revision: D31247953

fbshipit-source-id: 8e6eee0fecf3dfb0bff8fb3d168e15f002c0acf3
Summary: I found some of the documentation confusing, this is an attempt to clarify the functionality of the code.

Reviewed By: czxttkl

Differential Revision: D31071280

fbshipit-source-id: 62e7e299d40e7a431ed29dea0c6582646a855fd9
Summary:
Pull Request resolved: facebookresearch#548

as titled

Reviewed By: gji1

Differential Revision: D31217654

fbshipit-source-id: 514ab8ae7561b8a5a7ff5094642314f83c6b5be1
Summary:
Pull Request resolved: facebookresearch#550

update miniconda and update T101565175

Reviewed By: gji1

Differential Revision: D31290939

fbshipit-source-id: cbecdb63048fb3fb79a7b7eb87406408309026c1
Summary:
Pull Request resolved: facebookresearch#549

Tests for replay buffer's behavior

Reviewed By: alexnikulkov

Differential Revision: D30978005

fbshipit-source-id: aa034db5699071654d607fe7795bc8be232157c2
Summary:
### New commit log messages
  3aba9d16a Remove `ABC` from `LightningModule` (#9517)

Reviewed By: ananthsub

Differential Revision: D31296721

fbshipit-source-id: a9992486c61a6f86fb251f2733bbc9311d93f293
Summary:
Pull Request resolved: facebookresearch#551

as titled

Reviewed By: igfox

Differential Revision: D31296738

fbshipit-source-id: 3672485ccd230f9b1a029f90759bdf598f5990e4
…tly into `trainer.py` (#9495)

Summary:
### New commit log messages
  290398f81 Deprecate TrainerProperties Mixin and move property definitions directly into `trainer.py` (#9495)

Reviewed By: ananthsub

Differential Revision: D31317981

fbshipit-source-id: 9a6270f326cebb59ef5fb53b8db9d0797f62be77
Summary:
Pull Request resolved: facebookresearch#552

By relaxing the threshold...

Also set seeds

Reviewed By: bankawas

Differential Revision: D31334025

fbshipit-source-id: d5d666b2b5f5e5e4f06dea2a1353e85456f39a60
…rch#553)

Summary:
Pull Request resolved: facebookresearch#553

Using [0.01, 0.99] may cause some performance loss when boosting with entropy metrics.

Reviewed By: czxttkl

Differential Revision: D31346456

fbshipit-source-id: dae1ef0f6e36e67a182ced5793555e0d78dbf51e
Summary:
Pull Request resolved: facebookresearch#554

as titled.
This is one step towards a config/script-based rl orchestrator which can start necessary workflows automatically.

Reviewed By: j-jiafei

Differential Revision: D31334081

fbshipit-source-id: 0355b46396d922cf82f041734ffb8d20ceeab8e5
Summary:
Adding basic UCB MAB classes to ReAgent.
3 variants of UCB are added (including the one currently used for Ads Creative Exploration - MetricUCB)
Supported functionality:
1. Batch training (feed in counts of samples and total reward from each arm). We'll use this mode for Ads Creative Exploration.
2. Online training (query the bandit for next action one step at a time).
3. Dumping the state of the bandit and loading it from a JSON string

Reviewed By: czxttkl

Differential Revision: D31355506

fbshipit-source-id: 978ec16cba289dc08af599a2c05bb49fcae2843a
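The batch-vs-online distinction described above can be sketched with a minimal UCB1 bandit. This is plain Python for illustration only, not the ReAgent API; class and method names are made up:

```python
import math

class UCB1:
    """Minimal UCB1 bandit: score = empirical mean + sqrt(2*ln(t) / n)."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms      # number of pulls per arm
        self.rewards = [0.0] * n_arms   # total reward per arm

    def add_batch(self, arm, n_obs, total_reward):
        # Batch training: feed in counts of samples and total reward per arm.
        self.counts[arm] += n_obs
        self.rewards[arm] += total_reward

    def select_arm(self):
        # Online training: query the bandit for the next action.
        t = sum(self.counts)
        scores = []
        for n, r in zip(self.counts, self.rewards):
            if n == 0:
                return self.counts.index(0)  # pull each arm once first
            scores.append(r / n + math.sqrt(2.0 * math.log(t) / n))
        return scores.index(max(scores))

bandit = UCB1(n_arms=2)
bandit.add_batch(arm=0, n_obs=10, total_reward=6.0)
bandit.add_batch(arm=1, n_obs=10, total_reward=4.0)
print(bandit.select_arm())  # arm 0 wins: higher mean, same exploration bonus
```

With equal pull counts the exploration bonuses cancel, so the arm with the higher empirical mean is chosen.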
Summary: Replace numpy with PyTorch. This is a step towards using the standard ReAgent interface for MABs

Reviewed By: czxttkl

Differential Revision: D31423841

fbshipit-source-id: 04ccf92fba7b0f44ab6c19bdef3d098bf62394cf
Differential Revision: D31496257

fbshipit-source-id: 0f6b56075e4d24bdfd9d54bcecee90c5d86efbaf
…ng the same variable (facebookresearch#555)

Summary:
Pull Request resolved: facebookresearch#555

The current implementation was buggy if the env reused the same variable for possible_actions_mask and modified it in place. I fix the bug by copying the possible_actions_mask values instead of assigning the variable directly.

Reviewed By: czxttkl

Differential Revision: D31487641

fbshipit-source-id: ebc70164e42dc097291a7aeecba60d2ef30117b3
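The in-place aliasing bug described above can be reproduced in miniature. A plain Python list stands in for the env's mask array here; the `Env` class is illustrative, not ReAgent code:

```python
import copy

class Env:
    def __init__(self):
        self.mask = [1, 1, 1]

    def step(self):
        self.mask[0] = 0  # env mutates the same mask object in place

env = Env()

# Buggy: storing a reference means a later in-place mutation
# silently changes the stored value too.
stored_ref = env.mask
# Fixed: copy the values instead of assigning the variable directly.
stored_copy = copy.copy(env.mask)

env.step()
print(stored_ref)   # [0, 1, 1] -- corrupted by the env's mutation
print(stored_copy)  # [1, 1, 1] -- preserved
```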
Summary:
Pull Request resolved: facebookresearch#558

add some input check and simplify code

Reviewed By: gji1

Differential Revision: D31529090

fbshipit-source-id: 0c38d9b927d0149256fa78d373687bc9048a0c85
Summary:
Pull Request resolved: facebookresearch#556

Convert possible_actions_mask to a Tensor

Reviewed By: czxttkl

Differential Revision: D31497491

fbshipit-source-id: c0b8eb479b6be517a9c74c1d61ad68e4120d388a
Summary:
Pull Request resolved: facebookresearch#559

cleanly_stop is a manually set variable which needs to be placed on the correct device. Otherwise we will see errors like in f301990179.

Also, ddp is not needed in single cpu/gpu training.

Reviewed By: alexnikulkov

Differential Revision: D31530342

fbshipit-source-id: 98879fc130616aaccc454f939cd7cf2a704eb0eb
Differential Revision: D31605682

fbshipit-source-id: 6c2d89926ecab45cdbbcdd48058ef3697f94f92b
Summary:
Pull Request resolved: facebookresearch#560

Bayesian Optimization Optimizer mutation-based optimization and acquisition function.

Reviewed By: czxttkl

Differential Revision: D31424105

fbshipit-source-id: 97872516e1c633071f983ebe6b254cbabee7b037
…etworks, independent Thompson sampling, and mutation. (facebookresearch#561)

Summary:
Pull Request resolved: facebookresearch#561

Bayesian Optimization Optimizer with ensemble of feedforward networks, ITS, and mutation based optimization.

Reviewed By: czxttkl

Differential Revision: D31424065

fbshipit-source-id: 8ffc1e7fd5de303cd572ea5bcd880429af67d173
Summary:
Pull Request resolved: facebookresearch#557

See title

Reviewed By: czxttkl

Differential Revision: D31524614

fbshipit-source-id: e7aa7996de570f4ff990b402fbd23688a4ed12f4
Differential Revision: D31739112

fbshipit-source-id: d7ab577f32eadf56fa8ad1846a0e916ab9fcb778
… methods to unify (facebookresearch#565)

Summary:
Pull Request resolved: facebookresearch#565

1. Add 2 Thompson sampling MAB algorithms: 1 for Bernoulli rewards, 1 for Normal rewards
2. Refactor UCB code so that Thompson sampling could reuse as much as possible

Reviewed By: czxttkl

Differential Revision: D31642370

fbshipit-source-id: c4447a22ad11e1bb9696cf269ea9f45523d22f28
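The Bernoulli variant above can be sketched with Beta posteriors. This is a self-contained illustration of the algorithm, not the ReAgent class; names are made up:

```python
import random

class BernoulliTS:
    """Thompson sampling for Bernoulli rewards with Beta(1, 1) priors."""

    def __init__(self, n_arms):
        self.successes = [1] * n_arms  # Beta alpha parameters
        self.failures = [1] * n_arms   # Beta beta parameters

    def select_arm(self):
        # Draw one sample from each arm's posterior; play the argmax.
        draws = [random.betavariate(s, f)
                 for s, f in zip(self.successes, self.failures)]
        return draws.index(max(draws))

    def update(self, arm, reward):
        # A Bernoulli reward in {0, 1} updates the Beta posterior counts.
        if reward:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1

random.seed(0)
ts = BernoulliTS(2)
for _ in range(500):
    arm = ts.select_arm()
    reward = 1 if random.random() < (0.9 if arm == 0 else 0.1) else 0
    ts.update(arm, reward)
# Arm 0 (true rate 0.9) accumulates most of the pulls.
```

The Normal-reward variant follows the same select/update shape with a Normal posterior instead of a Beta, which is what the refactor lets the two classes share.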
Summary:
Pull Request resolved: facebookresearch#566

Adding some tools to evaluate the performance of MAB algorithms in a simple simulated environment
Notebook shows how to use this: https://fburl.com/anp/f7y0gzl8

Reviewed By: czxttkl

Differential Revision: D31672454

fbshipit-source-id: 32e3d4a8daa8f15a4c777c37f70c7962f949c299
Summary:
1. Add option to estimate reward variance and scale the confidence interval width by SQRT(VAR).
2. Add an option to multiply confidence interval width by a constant scalar to make exploration more/less aggressive
3. Remove UCBTuned algorithm because it is essentially UCB1 + variance estimation

Reviewed By: czxttkl

Differential Revision: D31741828

fbshipit-source-id: 684788746e2e626228cb522c49b2bafa9179d6fe
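Options 1 and 2 above compose multiplicatively into the confidence-interval half-width. The exact radius formula below is illustrative (standard UCB1-style bonus), not the precise expression in the diff:

```python
import math

def ucb_radius(t, n, var=1.0, alpha=1.0):
    """Confidence-interval half-width for an arm pulled n times out of t total.

    var:   estimated reward variance for this arm (option 1: scale by sqrt(var))
    alpha: scalar multiplier making exploration more/less aggressive (option 2)
    """
    return alpha * math.sqrt(var) * math.sqrt(2.0 * math.log(t) / n)

# Quadrupling the variance doubles the interval width.
print(ucb_radius(t=100, n=10, var=4.0), ucb_radius(t=100, n=10, var=1.0))
```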
Summary: Pull Request resolved: facebookresearch#567

Reviewed By: czxttkl

Differential Revision: D31743265

fbshipit-source-id: 3508027a8ab23c8569d4cf416560f1b9a6891752
Summary:
### New commit log messages
  6429de894 Add support for `len(datamodule)` (#9895)

Removed the following internal patch which may be conflicting with this change:
```
 --- a/fbcode/github/third-party/PyTorchLightning/pytorch-lightning/pytorch_lightning/trainer/connectors/data_connector.py
+++ b/fbcode/github/third-party/PyTorchLightning/pytorch-lightning/pytorch_lightning/trainer/connectors/data_connector.py
@@ -215,6 +215,7 @@
     def attach_datamodule(
         self, model: "pl.LightningModule", datamodule: Optional["pl.LightningDataModule"] = None
     ) -> None:
+        datamodule = datamodule or getattr(model, 'datamodule', None)
         # If we have a datamodule, attach necessary hooks + dataloaders
         if datamodule is None:
             return
```

Reviewed By: yifuwang

Differential Revision: D31693305

fbshipit-source-id: 48e58aa6a6f9cdf7029b93663004f9243de5d3d8
Hongbo Guo and others added 26 commits April 3, 2023 12:13
Summary: documentation README for Deep Learning based LinUCB model

Reviewed By: rodrigodesalvobraz

Differential Revision: D44508375

fbshipit-source-id: 4408b2ea85b1bea728815af20a526c674f0a062b
Summary:
[need below pictures in Summary so as to use their links in README.md file]
{F927315010}

{F927382569}

{F927429214}

Reviewed By: rodrigodesalvobraz

Differential Revision: D44561198

fbshipit-source-id: 77e74778b5b725297039cd4f1396fe610257efd0
Summary:
Our NN models have batch norm and dropout, which behave differently during training and eval.
In this diff I:
1. Switch the evaluated model to eval mode when it's attached to the offline evaluator
2. Switch the model to eval mode when an inference wrapper is created. This might not be 100% necessary, but I'm not sufficiently familiar with the publishing process to be 100% sure it's been switched to eval mode before a wrapper is used.

Reviewed By: BerenLuthien

Differential Revision: D44552990

fbshipit-source-id: c8b4d9690d959da7187418c3c81096256e22f101
Summary:
1. Add a `ResidualWrapper` module, which can be used to turn a regular layer into a residual/skip layer
2. Modify `FullyConnectedNetwork` `__init__` method to support the residual wrapper
3. Add support for new `use_skip_connections` argument to all CB NN models
4. Add unit tests for `ResidualWrapper`

Reviewed By: BerenLuthien

Differential Revision: D44552989

fbshipit-source-id: d092f943737d8ead4bd484da455dfea5023c4e7c
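The idea in item 1 is that the wrapper turns a layer `f` into `x + f(x)`. The sketch below shows just that transformation in plain Python; the actual ReAgent `ResidualWrapper` is a `torch.nn.Module` and requires the layer's input and output dimensions to match:

```python
class ResidualWrapper:
    """Wraps a layer f so the output is x + f(x) (a skip connection)."""

    def __init__(self, layer):
        self.layer = layer

    def __call__(self, x):
        # Residual/skip connection: add the input to the layer's output.
        return x + self.layer(x)

def double(x):
    return 2 * x

skip = ResidualWrapper(double)
print(skip(3))  # 3 + 2*3 = 9
```

In the real module, `use_skip_connections` would wrap each hidden layer of `FullyConnectedNetwork` this way.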
… to the base CB trainer class

Summary:
Requiring each CB trainer class to do input check, chosen arm feature extraction and recmetric logging is cumbersome and error-prone (there was already a bug due to SupervisedTrainer not logging recmetrics - https://fb.workplace.com/groups/4402889573094258/posts/5999347940115072/?comment_id=6004958952887304).
Instead, these 3 actions will be performed centrally in the base class and each trainer can focus on just computing the loss or executing the parameter update.

Reviewed By: BerenLuthien

Differential Revision: D44593201

fbshipit-source-id: 349f6702c9c756d80d23bd9ede6c4fb2e5940e94
Summary:
Outputting the accumulated rewards and regrets.  This helps evaluation.

Demo
N3067100

Reviewed By: alexnikulkov

Differential Revision: D44352607

fbshipit-source-id: 06e6f756a35229f294a98650c2d67d3a78e3c513
Differential Revision: D44719105

fbshipit-source-id: 9e73e110d4c3ed858ac9e8944c73404bb6fa6122
Summary:
T148338245 (the next Diff will address T148655761).

Output `{"pred_reward": pred_reward, "pred_sigma": pred_sigma, "ucb": ucb}` from Contextual Bandit models

Reviewed By: alexnikulkov

Differential Revision: D44775830

fbshipit-source-id: 2ed22bda5d8ae0491602ee8ffe4ac126f7f4774c
Summary:
Original commit changeset: 2ed22bda5d8a

Original Phabricator Diff: D44775830

Reverting to fix broken release tests. Example: https://www.internalfb.com/mast/job/aienv-20be949aec-f429693811

Reviewed By: zxpmirror1994

Differential Revision: D45132749

fbshipit-source-id: 35bb3496ac720d2569040fc020bcba3fb71af8cd
Summary:
redo D44775830 ,  plus D45133345

(previous D44775830  passed all Unit Tests but failed on starlight run. D45133345 should fix it)

Reviewed By: alexnikulkov

Differential Revision: D45159072

fbshipit-source-id: f621f71672a4fc64ec457deb4f73a4a1e4897a45
Summary:
 ---
# What
This Diff switches between `inv` and `pinv` to save calculation cost on the matrix inverse.

------

### Details :
- Matrix inverse cost in NNLinUCB
  - We need a matrix inverse operation in both LinUCB and NNLinUCB.
  - In NNLinUCB, we have to compute this inverse on every back-propagation step, so it is costly. Saving calculation on the inverse operation matters more for NNLinUCB than for LinUCB.

  - hermitian=True reduces the complexity of `pinv`
   https://pytorch.org/docs/stable/generated/torch.linalg.pinv.html
"If hermitian= True, A is assumed to be Hermitian if complex or symmetric if real, but this is not checked internally. Instead, just the lower triangular part of the matrix is used in the computations."
   - Our matrix is Hermitian.

- `inv` is more efficient than `pinv` but may be numerically unstable
   - `torch.linalg.inv` is computationally more efficient than `torch.linalg.pinv`, but `pinv` is more stable.
  - If the matrix is rank deficient or its condition number is too large, its inverse is unstable to compute.
  - The regularization diagonal (identity) matrix makes the matrix `A` always full rank. However, we also need to consider how large each arm's contribution is compared to the identity.
  - If some arm has a lot of historical data (equivalently, its feature values are huge), the corresponding eigenvalue of `A` may be so large that the condition number of `A` is very bad. That makes `inv` unstable.

- Given the above observations/analysis, we adopt: try `inv`, fall back to `pinv` on exception.

**NOTE** : let D45159072 land first, then update this Diff accordingly before landing it.

Reviewed By: alexnikulkov

Differential Revision: D44564771

fbshipit-source-id: 5d2c120d0c5faef6390af9c96cdb7453f22c3524
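The "try `inv`, exception `pinv`" policy can be sketched as below. The diff uses `torch.linalg`; NumPy is used here only to keep the sketch self-contained, and both libraries expose the same `hermitian` flag on `pinv`:

```python
import numpy as np

def robust_inverse(a):
    """Try the cheap exact inverse; fall back to the stable pseudo-inverse."""
    try:
        return np.linalg.inv(a)
    except np.linalg.LinAlgError:
        # Rank-deficient or badly conditioned: pinv is slower but stable.
        # hermitian=True is safe here because A is symmetric by construction.
        return np.linalg.pinv(a, hermitian=True)

well_conditioned = 3.0 * np.eye(2)
singular = np.ones((2, 2))  # rank 1, so inv raises LinAlgError

print(robust_inverse(well_conditioned))  # exact inverse: 1/3 on the diagonal
print(robust_inverse(singular))          # pseudo-inverse via the fallback
```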
Summary: Add a test for recmetric logging in CB trainer (test for LinUCB only, but should work the same for other trainers)

Reviewed By: BerenLuthien

Differential Revision: D45404107

fbshipit-source-id: cb58809ffbe3948518bdc0ecf25348257f6e704d
Summary: Adding support for weighted supervised losses. This is especially important for Offline Eval because it uses weights to filter training data

Reviewed By: BerenLuthien

Differential Revision: D44597203

fbshipit-source-id: 8af32344123cd6b68e3df085952e35d1244fea8b
Summary:
Log a few more metrics to improve Offline Eval understanding:
1. Average accepted/rejected rewards
2. Fraction of accepted to rejected rewards
3. Average reward across all (accepted + rejected) data
4. Average slate sizes of accepted/rejected observations

Reviewed By: BerenLuthien

Differential Revision: D45300609

fbshipit-source-id: d6e776d1d05e789942272fef42993471719e19be
Summary: See title

Reviewed By: BerenLuthien

Differential Revision: D45323720

fbshipit-source-id: 0e01614cdf91c2360e8b2aa63c8352d3d418d279
Summary:
Adding a separate concept of `label`, which is used as the prediction target for model training. In the basic case, `label` is equal to `reward` and if `CBInput` is created without specifying `label`, we automatically set `label=reward` in `__post_init__`. But we can also define `label` differently, e.g. as a transformation of `reward`, to give the model a more stable learning target.
I have observed improvements in performance from using `log` or `sqrt` transforms in AP Container Selection.
The `reward` field is now used only for Offline Evaluation, while `label` is used for model training and supervised learning accuracy metrics.
In a FREE workflow the transform is specified in `config.features.label_transform`, which can be one of `["identity", "log", "sqrt"]`

Reviewed By: BerenLuthien, PoojaAg18

Differential Revision: D45300610

fbshipit-source-id: 7005b10e652549948e9104c9d90ef76475276f67
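The `label` derivation above amounts to a small transform table keyed by the config value. The dispatch below is illustrative; the actual config key handling, and whether "log" means `log` or `log1p`, are assumptions (a `log1p`-style transform is used here so a zero reward stays finite):

```python
import math

# Illustrative mapping of the three supported transform names.
LABEL_TRANSFORMS = {
    "identity": lambda r: r,
    "log": lambda r: math.log(1.0 + r),  # assumed log1p-style transform
    "sqrt": lambda r: math.sqrt(r),
}

def make_label(reward, transform="identity"):
    """Derive the training label from the reward, as in __post_init__."""
    return LABEL_TRANSFORMS[transform](reward)

print(make_label(4.0, "sqrt"))  # 2.0
```

The `reward` value itself stays untouched for Offline Evaluation; only the `label` the model trains on is transformed.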
…ias/intercept

Summary: Currently we pass the output of MLP directly to the input of LinUCB. This is equivalent to using a linear layer without a bias term. This diff appends a column of ones as an extra feature to the output of MLP in order to allow LinUCB to have a bias/intercept.

Reviewed By: PoojaAg18

Differential Revision: D45539068

fbshipit-source-id: 48504c276590c04e274b380f89f806b2307809cf
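The ones-column trick above is the standard way to give a linear model an intercept without adding a separate bias parameter. A minimal NumPy sketch (function name is made up):

```python
import numpy as np

def append_ones_column(x):
    """Append a constant-1 feature so the downstream linear model
    (here, LinUCB on top of the MLP output) learns a bias/intercept."""
    ones = np.ones((x.shape[0], 1))
    return np.hstack([x, ones])

mlp_out = np.array([[0.5, -1.0],
                    [2.0, 0.3]])
print(append_ones_column(mlp_out))  # each row gains a trailing 1.0
```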
Summary: allow gradients to backpropagate through the NN

Reviewed By: alexnikulkov

Differential Revision: D45997912

fbshipit-source-id: 68ac01811f3b0eb9610ee5bf8652ae5163316a27
Summary:
Pyre upgrade and continuous jobs never get black
formatting anymore, and I don't have time to fix
it this week.

It's quicker to create diffs locally than to
commandeer and fix the bot-generated diffs.

```
LOCAL_CONFIG=reagent
IDENTIFIER="$(sed 's/\//-/g' <<< $LOCAL_CONFIG)"

ERRORS_FILE=/tmp/$IDENTIFIER
PYRE_UPGRADE=~/fbsource/"$(buck2 build //tools/pyre/facebook/tools:upgrade --show-output | awk '{ print $2}')"

# get errors
pyre -l "$LOCAL_CONFIG" --output json check > $ERRORS_FILE

# fix errors
cat $ERRORS_FILE | $PYRE_UPGRADE fixme-single $LOCAL_CONFIG --lint --error-source stdin --no-commit
```

Reviewed By: grievejia

Differential Revision: D46025666

fbshipit-source-id: e1d3c8a7dca99707b48a5c93e7e30a3b7dfc89eb
Summary:
X-link: meta-pytorch/torchrec#1171

Pull Request resolved: facebookresearch#717

ATT. If window_size is smaller than the overall/global batch size, window metrics will be NaN since we'll pop the entire batch out of the window state buffer.

Reviewed By: joshuadeng

Differential Revision: D45590488

fbshipit-source-id: 6d84e24cf1c77760e3ff2ef8fb9a86b5ab775f68
Differential Revision: D46117822

fbshipit-source-id: 811c2fc92ca39622f74d05ea1298863328ac6eda
Summary:
Addresses this error:
https://pxl.cl/2JNBR

This shouldn't happen because the if statement above checks the range. Perhaps the feature range is smaller than the boxcox resolution, so the if statement now also checks that the range is larger than the boxcox resolution.

Differential Revision: D46269758

fbshipit-source-id: 2e6272a7da6e63b5c93cd8aeab52a8fb2e8166b2
Summary: Allow inference to get specific `ucb_alpha`

Reviewed By: alexnikulkov

Differential Revision: D46284800

fbshipit-source-id: 00daa34e0679e2d8c6d268b6efd2dd349d047c8a
Differential Revision: D46355537

fbshipit-source-id: 696677a13e131de466df1e919b0b976a988eb67e
Summary:
numpy 1.20.0 removed `np.object`. It was an alias for the builtin `object`, so it's safe to replace it directly with `object`.

Change is generated mechanically using the following oneliner:
```
fbgr -sl 'np\.object\b' | xargs perl -pi -e 's,\bnp\.object\b,object,g'
```

Differential Revision: D46585978

fbshipit-source-id: 21f2a5f0d1379ebd3fc5f89c9362699cbce0ef50
@facebook-github-bot

Hi @adhiiisetiawan!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@facebook-github-bot

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!
