PavlosApo
Contributor

Differential Revision: D31583265

kittipatv and others added 30 commits February 7, 2021 15:32
Summary: We missed the `_create_parameters_CLIP_LOG` function

Reviewed By: badrinarayan

Differential Revision: D26026641

fbshipit-source-id: 17d453c2667e3b1abd2ab41b1ea5890cb1ca8cbd
Differential Revision: D26395711

fbshipit-source-id: ec1679789f05545058d568898a379a06e3286923
Summary:
Pull Request resolved: facebookresearch#395

There is an inheritance problem when `BaseDataClass` is a standard dataclass but the subclass is a pydantic dataclass. Since `BaseDataClass` has no fields of its own, it doesn't need to be a dataclass.
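A minimal sketch of the pattern described above, assuming a plain (non-dataclass) base that only carries shared behaviour. The names `BaseConfig` and `TrainerParams` are hypothetical, and the standard-library `dataclass` decorator stands in for the pydantic one here, so this does not reproduce the original failure; it only illustrates why a field-less base can stay a regular class.

```python
from dataclasses import dataclass, fields, is_dataclass


class BaseConfig:
    """Hypothetical plain base class: no fields, only shared helpers."""

    def field_names(self):
        # Works for any subclass that has been decorated as a dataclass.
        return [f.name for f in fields(self)]


@dataclass
class TrainerParams(BaseConfig):
    """Hypothetical subclass; only this class is a dataclass."""

    learning_rate: float = 0.01
    gamma: float = 0.99


params = TrainerParams(learning_rate=0.1)
```

Because the base defines no fields, nothing is lost by leaving it undecorated, and only one dataclass implementation is involved in the class hierarchy.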

Reviewed By: czxttkl

Differential Revision: D26434426

fbshipit-source-id: 1517f7e68541912f017dbd48b7ea05f95537868c
Summary:
Pull Request resolved: facebookresearch#398

n/a

Reviewed By: alexnikulkov

Differential Revision: D26539186

fbshipit-source-id: 54166a0624a4b56c93a47b7c3ac8e5d25d0288a0
Summary: When use_gpu=True, all evaluation batches are accumulated in CUDA memory, which easily causes OOM. Given that only the models that inherit from DQNTrainerBase need to hold the full evaluation data in memory, I think the best hack is to manually enforce that evaluation happens on CPU.

Reviewed By: igfox

Differential Revision: D26568575

fbshipit-source-id: 4a7ed2e7147bef40b34e414038fa5d51a0638c2e
Summary:
Migrate LearnVM from rl_exp to reagent proper. Still keep the code in
rl_exp as is.

Reviewed By: igfox

Differential Revision: D26580567

fbshipit-source-id: c555f7e3ec03b77aeb7bc74f34cd1def1aa41750
Summary:
Pull Request resolved: facebookresearch#396

Enable the logging of actor_loss

Reviewed By: czxttkl

Differential Revision: D26472756

fbshipit-source-id: 43c6549cc8df3346a76cb1a1ca0d923a0103a746
Reviewed By: alexnikulkov

Differential Revision: D26592353

fbshipit-source-id: 7f72d966d6aa793ae35fc7460606c7b025ed65b4
Summary:
Pull Request resolved: facebookresearch#390

Adding a function that wraps a gym env in a DataLoader in order to use PyTorch Lightning trainers for these environments.

Reviewed By: kittipatv

Differential Revision: D26246713

fbshipit-source-id: 2b994af6dc8458ee382d9a21cb46f0b5746ad37b
Summary: This will allow us to create trainers with a single optimizer

Reviewed By: kittipatv

Differential Revision: D26270800

fbshipit-source-id: ed514a30a1c2dd6313f000d444db9b4de3570aea
Summary:
Pull Request resolved: facebookresearch#388

I migrated REINFORCE trainer to Lightning.
I also added a gym test, based on a new parametric test function which performs online learning without a replay buffer.

Reviewed By: kittipatv

Differential Revision: D26246712

fbshipit-source-id: f00ecb1e7406df7d1b477219efa0825c21127a73
Summary:
Pull Request resolved: facebookresearch#403

1. Add support for decaying temperature to `SoftmaxActionSampler`
2. Make sure we don't sample invalid actions in `EpsilonGreedyActionSampler` (indicated by hugely negative scores)
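The two changes above can be sketched in plain Python. This is an illustration under stated assumptions, not the ReAgent implementation: the class `DecayingSoftmaxSampler`, the function `epsilon_greedy`, the multiplicative decay schedule, and the constants `INVALID_SCORE` and `INVALID_THRESHOLD` are all hypothetical.

```python
import math
import random

INVALID_SCORE = -1e10      # assumption: invalid actions carry a hugely negative score
INVALID_THRESHOLD = -1e9   # assumption: scores below this are treated as invalid


def softmax_probs(scores, temperature):
    """Numerically stable softmax over scores at the given temperature."""
    scaled = [s / temperature for s in scores]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


class DecayingSoftmaxSampler:
    """Softmax sampling whose temperature shrinks after every draw."""

    def __init__(self, temperature=1.0, decay=0.99, min_temperature=0.1):
        self.temperature = temperature
        self.decay = decay
        self.min_temperature = min_temperature

    def sample(self, scores, rng=random):
        probs = softmax_probs(scores, self.temperature)
        action = rng.choices(range(len(scores)), weights=probs, k=1)[0]
        # Multiplicative decay, clipped from below (assumed schedule).
        self.temperature = max(self.min_temperature, self.temperature * self.decay)
        return action


def epsilon_greedy(scores, epsilon, rng=random):
    """Explore uniformly over valid actions only; otherwise pick the best valid one."""
    valid = [i for i, s in enumerate(scores) if s > INVALID_THRESHOLD]
    if not valid:
        raise ValueError("no valid actions to sample from")
    if rng.random() < epsilon:
        return rng.choice(valid)
    return max(valid, key=lambda i: scores[i])
```

Restricting the exploration branch to the valid indices is what keeps the random draw from landing on a masked action; a uniform draw over all indices would not.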

Reviewed By: czxttkl

Differential Revision: D26676495

fbshipit-source-id: 4248fc0b979be484252a2baa73690242e66e78e1
Summary:
Signed-off-by: Manish Pandit <[email protected]>
<img width="1084" alt="Screen Shot 2021-02-23 at 11 06 35 AM" src="https://user-images.githubusercontent.com/7349834/108871498-37339080-75c7-11eb-9b17-a3a199d3d3d6.png">

Tested locally; the docstring coverage check ran successfully. Current coverage is around 16%, so I have set the threshold to 15% to make sure the check passes initially. We can raise the threshold as docstring coverage improves.
Notes:
1. I am using the circleci/python:3.7 image.
2. I am installing `interrogate` via pip.

Pull Request resolved: facebookresearch#399

Reviewed By: kaiwenw

Differential Revision: D26705510

Pulled By: manishpandit

fbshipit-source-id: 37cbd69e0f4b83461213ff93f9bc1435b589c5d5
Summary: Pyre errors are pre-existing and will be cleaned up in a follow-up diff

Reviewed By: czxttkl

Differential Revision: D26635084

fbshipit-source-id: 12f53ec4f3bc4b063aa709608191438f12bb7343
Summary: Pass actor values to CPE when calculating model propensities

Reviewed By: kaiwenw

Differential Revision: D26730220

fbshipit-source-id: f621ef6ea22d6cd274dbb3226f9fe92d8f61144d
Summary:
Pull Request resolved: facebookresearch#402

Implementation notes:
1. I had to create a Dataloader to handle the buildup of the trajectory buffer for PPO.
2. PPO operates on batches of trajectories. I chose to implement the batches in the simplest (but probably inefficient) way - as lists of trajectories.
3. Distributed training will not work for this implementation. I don't think it's a high priority for PPO right now, so we can implement it when it's necessary. Since PPO is an online algorithm, it would actually need a different approach than what we do for offline (batch) RL.
4. I made a change to `ReAgentLightningModule` to enable automatic conversion of not only dictionaries, but also lists of dictionaries (a list represents a batch)
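Point 4 can be illustrated with a small sketch. It assumes a hypothetical typed batch class `Transition` and a hypothetical helper `convert_batch`; the actual conversion logic in `ReAgentLightningModule` is not shown in this commit message and may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Transition:
    """Hypothetical typed batch; the field names are illustrative only."""

    state: List[float]
    action: int
    reward: float


def convert_batch(batch, batch_type=Transition):
    """Convert a dict, or a list of dicts (a batch of trajectories), to typed objects."""
    if isinstance(batch, dict):
        return batch_type(**batch)
    if isinstance(batch, list):
        # A list represents a batch; convert each element recursively.
        return [convert_batch(item, batch_type) for item in batch]
    # Anything else is assumed to be typed already and passes through.
    return batch


raw = [
    {"state": [0.0, 1.0], "action": 1, "reward": 0.5},
    {"state": [1.0, 0.0], "action": 0, "reward": -0.5},
]
typed = convert_batch(raw)
```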

Reviewed By: czxttkl

Differential Revision: D26651755

fbshipit-source-id: af09720a8603a8eeb56502bddb3d978eb0ad1f9d
Summary:
Pull Request resolved: facebookresearch#404

Since PPO trainer involves many customized flows and tricks, I think it might be a good idea to use manual_backward()

Reviewed By: bankawas

Differential Revision: D26747860

fbshipit-source-id: d56345448d65ef6d006bc1b1314df7b420405b12
Summary:
Pull Request resolved: facebookresearch#400

`train_workflow()` is basically the same for every algorithm. The internal and OSS versions differ, so let's separate them

Reviewed By: kaiwenw

Differential Revision: D26642559

fbshipit-source-id: 126fc202b519396eb9c3ba43d522a3ed7abad745
Summary:
Pull Request resolved: facebookresearch#401

This function is standardized. No need to put it in model manager for customization

Reviewed By: kaiwenw

Differential Revision: D26645777

fbshipit-source-id: 28fa4b348e77c4096dc586f7d03ca77bc9f07f41
Summary: For DQN, we would like to see CPE results for every epoch.

Reviewed By: MisterTea

Differential Revision: D26773197

fbshipit-source-id: 41335acfdc62aa5985310638d1b0943949f2fbf5
Summary: Code which trains a tensor placement policy using PG

Reviewed By: kittipatv

Differential Revision: D25593933

fbshipit-source-id: e2137d46ddc800269cf49547beea2659718b9a78
Summary: Add hyperparameter tuning using Ax for non-FBL workflows

Reviewed By: kittipatv

Differential Revision: D25487673

fbshipit-source-id: 16c7bd9ff6f63c9222acd3413c398219f8d2c140
Summary:
Pull Request resolved: facebookresearch#405

Adding a GNN model based on GraphSAGE to ReAgent (outside of the main codebase for now)

Reviewed By: czxttkl

Differential Revision: D25934888

fbshipit-source-id: 48e7e038818b79e332339ec72a0e0a949e30e757
Summary: Pull Request resolved: facebookresearch#407

Reviewed By: czxttkl

Differential Revision: D26787543

fbshipit-source-id: 4e74e01c7d04569a599e2493f3bea0218e8fb116
Summary: Use the evaluation dataset if it's not None; otherwise fall back to the previous logic. This allows a custom evaluation dataset.

Reviewed By: czxttkl

Differential Revision: D26694566

fbshipit-source-id: f831dae9fd36b4ba0e3f33b6e353e81fa0dea7d3
Summary: Pull Request resolved: facebookresearch#408

Reviewed By: czxttkl

Differential Revision: D26635649

fbshipit-source-id: 9d6a3aa554dfa91b431c9e9e6785625f71c2ae66
)

Summary:
Pull Request resolved: facebookresearch#410

Once we dedupe workflow directories, we can add autodeps.  For now we can get close.

Reviewed By: czxttkl

Differential Revision: D26772795

fbshipit-source-id: 070bc3d2982155452a658c92b1f56af10336afb9
Reviewed By: czxttkl

Differential Revision: D26809740

fbshipit-source-id: e51aada18b9d31ae5b5ce71f0b30addf315c50e6
…er files

Summary: Generalized ips_use_cases.py, added some comments and printing to other files

Reviewed By: kaiwenw

Differential Revision: D26878973

fbshipit-source-id: 4025d076dbd8dfa5eafa91ad456fff756a91eca8
Reviewed By: kaiwenw

Differential Revision: D26920016

fbshipit-source-id: 76000f76f7ed365719cb2e6678e3e3a2a48d0ed1
PavlosApo and others added 26 commits September 21, 2021 16:54
Summary:
Pull Request resolved: facebookresearch#543

Creating a unit test to cover the FixedLengthSequences function.

Reviewed By: igfox

Differential Revision: D31084450

fbshipit-source-id: 747caa5669ea6f353009236311f66c2ba2bd20a2
…#544)

Summary:
Pull Request resolved: facebookresearch#544

Adding unit test for transforms.StackDenseFixedSizeArray

Reviewed By: igfox

Differential Revision: D31114407

fbshipit-source-id: acd1a15c524ca2a990b879e31bea2832c8549be2
Summary:
### New commit log messages
  e0f2e041b Share the training step output data via `ClosureResult` (#9349)

Reviewed By: kandluis

Differential Revision: D31058705

fbshipit-source-id: 1b7b59087129406c0164b30b49a40383c65e6250
…earch#545)

Summary: Pull Request resolved: facebookresearch#545

Reviewed By: igfox

Differential Revision: D31136906

fbshipit-source-id: 63e7b2555bff4a6cda8487f85218473ed736a4c9
Summary:
Pull Request resolved: facebookresearch#546

Write a unit test for the SlateView class to check that it functions as expected and raises errors when it should

Reviewed By: igfox

Differential Revision: D31151826

fbshipit-source-id: e5750eff2a256c04ab5740d94917cee321c0265e
Summary:
1. super net sampling (with Reagent APIs)
2. Other utils to support 1
2.1. update `SuperNNConfig` attribute by a path str so that samples from Reagent ng.p.Dict can be easily mapped to masks within `SuperNNConfig`: `replace_named_tuple_by_path`
3. test samples such that counts of masks are close to configured probabilities
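The check in point 3 can be sketched as a frequency test. The helpers `sample_masks` and `frequencies_match`, the example masks, and the tolerance are hypothetical; the sketch draws from a plain categorical distribution rather than the `ng.p.Dict` sampling mentioned above.

```python
import random
from collections import Counter


def sample_masks(mask_probs, n_samples, seed=0):
    """Draw masks according to their configured probabilities and count them."""
    rng = random.Random(seed)
    masks = list(mask_probs)
    weights = [mask_probs[m] for m in masks]
    return Counter(rng.choices(masks, weights=weights, k=n_samples))


def frequencies_match(counts, mask_probs, tolerance=0.02):
    """True if every empirical frequency is within `tolerance` of its target."""
    total = sum(counts.values())
    return all(
        abs(counts[mask] / total - prob) <= tolerance
        for mask, prob in mask_probs.items()
    )


# Hypothetical masks over three sub-network choices.
mask_probs = {(1, 0, 0): 0.5, (0, 1, 0): 0.3, (0, 0, 1): 0.2}
counts = sample_masks(mask_probs, n_samples=20000)
```

The tolerance should be chosen relative to the sample size: with 20000 draws the standard error of each frequency is well below 0.01, so a 0.02 band is loose enough to avoid flaky failures.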

Reviewed By: dehuacheng

Differential Revision: D31126805

fbshipit-source-id: 95e48728773c2afd7e6856f8a7a831b00214bbda
Summary: Add a unit test for OneHotActions.

Reviewed By: igfox

Differential Revision: D31248082

fbshipit-source-id: 74d55ab5d3a23c75f5d0020b53616c87023afcf0
Summary: Adds a unit test to test_processing.py for the columnvector function from transform.py

Reviewed By: igfox

Differential Revision: D31247953

fbshipit-source-id: 8e6eee0fecf3dfb0bff8fb3d168e15f002c0acf3
Summary: I found some of the documentation confusing; this is an attempt to clarify the functionality of the code.

Reviewed By: czxttkl

Differential Revision: D31071280

fbshipit-source-id: 62e7e299d40e7a431ed29dea0c6582646a855fd9
Summary:
Pull Request resolved: facebookresearch#548

as titled

Reviewed By: gji1

Differential Revision: D31217654

fbshipit-source-id: 514ab8ae7561b8a5a7ff5094642314f83c6b5be1
Summary:
Pull Request resolved: facebookresearch#550

update miniconda and update T101565175

Reviewed By: gji1

Differential Revision: D31290939

fbshipit-source-id: cbecdb63048fb3fb79a7b7eb87406408309026c1
Summary:
Pull Request resolved: facebookresearch#549

Tests for replay buffer's behavior

Reviewed By: alexnikulkov

Differential Revision: D30978005

fbshipit-source-id: aa034db5699071654d607fe7795bc8be232157c2
Summary:
### New commit log messages
  3aba9d16a Remove `ABC` from `LightningModule` (#9517)

Reviewed By: ananthsub

Differential Revision: D31296721

fbshipit-source-id: a9992486c61a6f86fb251f2733bbc9311d93f293
Summary:
Pull Request resolved: facebookresearch#551

as titled

Reviewed By: igfox

Differential Revision: D31296738

fbshipit-source-id: 3672485ccd230f9b1a029f90759bdf598f5990e4
…tly into `trainer.py` (#9495)

Summary:
### New commit log messages
  290398f81 Deprecate TrainerProperties Mixin and move property definitions directly into `trainer.py` (#9495)

Reviewed By: ananthsub

Differential Revision: D31317981

fbshipit-source-id: 9a6270f326cebb59ef5fb53b8db9d0797f62be77
Summary:
Pull Request resolved: facebookresearch#552

By relaxing the threshold...

Also set seeds

Reviewed By: bankawas

Differential Revision: D31334025

fbshipit-source-id: d5d666b2b5f5e5e4f06dea2a1353e85456f39a60
…rch#553)

Summary:
Pull Request resolved: facebookresearch#553

Using [0.01, 0.99] may cause some performance loss in boosting with entropy
metrics.

Reviewed By: czxttkl

Differential Revision: D31346456

fbshipit-source-id: dae1ef0f6e36e67a182ced5793555e0d78dbf51e
Summary:
Pull Request resolved: facebookresearch#554

as titled.
This is one step towards a config/script-based rl orchestrator which can start necessary workflows automatically.

Reviewed By: j-jiafei

Differential Revision: D31334081

fbshipit-source-id: 0355b46396d922cf82f041734ffb8d20ceeab8e5
Summary:
Adding basic UCB MAB classes to ReAgent.
3 variants of UCB are added (including the one currently used for Ads Creative Exploration - MetricUCB)
Supported functionality:
1. Batch training (feed in counts of samples and total reward from each arm). We'll use this mode for Ads Creative Exploration.
2. Online training (query the bandit for next action one step at a time).
3. Dumping the state of the bandit and loading it from a JSON string
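The three supported modes can be sketched with a textbook UCB1 bandit. This is an assumption-laden illustration: the class `UCB1Bandit` and its method names are hypothetical, and it is not one of the ReAgent classes added in this commit (in particular, it is not MetricUCB).

```python
import json
import math


class UCB1Bandit:
    """Textbook UCB1 with batch updates, online updates, and JSON state."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.total_rewards = [0.0] * n_arms

    def add_batch(self, counts, rewards):
        """Batch mode: per-arm sample counts and per-arm summed rewards."""
        for arm, (n, r) in enumerate(zip(counts, rewards)):
            self.counts[arm] += n
            self.total_rewards[arm] += r

    def add_single(self, arm, reward):
        """Online mode: record one observation for one arm."""
        self.counts[arm] += 1
        self.total_rewards[arm] += reward

    def ucb_scores(self):
        total = sum(self.counts)
        scores = []
        for n, r in zip(self.counts, self.total_rewards):
            if n == 0:
                scores.append(math.inf)  # untried arms are pulled first
            else:
                scores.append(r / n + math.sqrt(2 * math.log(total) / n))
        return scores

    def get_action(self):
        scores = self.ucb_scores()
        return max(range(len(scores)), key=scores.__getitem__)

    def to_json(self):
        return json.dumps({"counts": self.counts, "total_rewards": self.total_rewards})

    @classmethod
    def from_json(cls, payload):
        state = json.loads(payload)
        bandit = cls(len(state["counts"]))
        bandit.counts = list(state["counts"])
        bandit.total_rewards = list(state["total_rewards"])
        return bandit
```

Storing only per-arm counts and reward sums is what makes the batch and online modes interchangeable: both reduce to the same sufficient statistics, which also serialize cleanly to JSON.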

Reviewed By: czxttkl

Differential Revision: D31355506

fbshipit-source-id: 978ec16cba289dc08af599a2c05bb49fcae2843a
Summary: Replace numpy with PyTorch. This is a step towards using the standard ReAgent interface for MABs

Reviewed By: czxttkl

Differential Revision: D31423841

fbshipit-source-id: 04ccf92fba7b0f44ab6c19bdef3d098bf62394cf
Differential Revision: D31496257

fbshipit-source-id: 0f6b56075e4d24bdfd9d54bcecee90c5d86efbaf
…ng the same variable (facebookresearch#555)

Summary:
Pull Request resolved: facebookresearch#555

The current implementation was buggy when the env reused the same variable for possible_actions_mask and modified it in place. I fix the bug by copying the possible_actions_mask values instead of assigning the variable directly.
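The aliasing problem can be reproduced with a toy example. `ToyEnv`, `record_buggy`, and `record_fixed` are hypothetical names, and plain lists stand in for whatever array type the real env uses; the sketch only shows why storing a reference differs from storing a copy.

```python
class ToyEnv:
    """Hypothetical env that reuses one mask object and mutates it in place."""

    def __init__(self):
        self.possible_actions_mask = [1, 1, 1]

    def step(self, action):
        # In-place update: the same list object is returned on every step.
        self.possible_actions_mask[action] = 0
        return self.possible_actions_mask


def record_buggy(env, actions):
    # Stores references to the same list, so earlier entries change later.
    return [env.step(a) for a in actions]


def record_fixed(env, actions):
    # Copies the values at each step, so each entry is a snapshot.
    return [list(env.step(a)) for a in actions]


buggy = record_buggy(ToyEnv(), [0, 1])
fixed = record_fixed(ToyEnv(), [0, 1])
```

In the buggy version every stored mask ends up equal to the final state of the env's mask, while the fixed version preserves the mask as it was at each step.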

Reviewed By: czxttkl

Differential Revision: D31487641

fbshipit-source-id: ebc70164e42dc097291a7aeecba60d2ef30117b3
Summary:
Pull Request resolved: facebookresearch#558

Add some input checks and simplify the code

Reviewed By: gji1

Differential Revision: D31529090

fbshipit-source-id: 0c38d9b927d0149256fa78d373687bc9048a0c85
Summary:
Pull Request resolved: facebookresearch#556

Convert possible_actions_mask to a Tensor

Reviewed By: czxttkl

Differential Revision: D31497491

fbshipit-source-id: c0b8eb479b6be517a9c74c1d61ad68e4120d388a
Differential Revision: D31577462

fbshipit-source-id: d93f64e803b76c79c0539ab975eaeb4ec42e902b
Differential Revision: D31583265

fbshipit-source-id: b712aa627aa06436af0c5d8cb926220406849b96
@facebook-github-bot
This pull request was exported from Phabricator. Differential Revision: D31583265
