[Doc] Tutorial: recurrent training on sequence batches #3860
Open
theap06 wants to merge 1 commit into
Adds `tutorials/sphinx-tutorials/recurrent_sequence_training.py`, the multi-step / sequence-training complement to `dqn_with_rnn.py` (which covers single-step recurrent DQN at collection time). Walks through the post-pytorch#3695 recurrent contract end-to-end:

- Collector auto-wiring of `InitTracker` + the recurrent-state primer via `auto_register_policy_transforms=True`
- Trajectory-aware sampling with `SliceSampler`
- Multi-step LSTM forward under `set_recurrent_mode(True)`
- Boundary safety: hand-built two-trajectory packed batch + isolation check that proves hidden state does not leak across `is_init` markers
- A tiny end-to-end training loop closing the BC-style sequence path

Runs in ~3s on CPU. Cross-references the recurrent state lifecycle guide (pytorch#3792), collector internals page (pytorch#3796), and the glossary. Toctree entry added to `docs/source/index.rst` next to `dqn_with_rnn`.
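The boundary-safety check mentioned in the description can be illustrated with a toy example. The sketch below is not the tutorial's code and does not use TorchRL: it is a plain-Python scalar RNN, and `rnn_forward`, the weights `w` and `u`, and the sample batches are all hypothetical names chosen for illustration. It only shows the idea of the isolation check: perturbing the first trajectory of a packed batch should leave the second trajectory's outputs unchanged when the hidden state is reset at each `is_init` marker.

```python
import math


def rnn_forward(xs, is_init, w=0.5, u=0.9):
    """Toy scalar RNN over one packed sequence (hypothetical helper).

    The hidden state is reset to zero wherever is_init is True,
    i.e. at the first step of every trajectory in the packed batch.
    """
    h = 0.0
    outputs = []
    for x, init in zip(xs, is_init):
        if init:
            h = 0.0  # trajectory boundary: drop the carried state
        h = math.tanh(w * x + u * h)
        outputs.append(h)
    return outputs


# Two trajectories of length 3 packed back to back.
is_init = [True, False, False, True, False, False]
batch_a = [1.0, 2.0, 3.0, 0.5, -0.5, 0.25]
# Same second trajectory, but the first trajectory is perturbed.
batch_b = [-4.0, 7.0, 9.0, 0.5, -0.5, 0.25]

out_a = rnn_forward(batch_a, is_init)
out_b = rnn_forward(batch_b, is_init)

# Isolation check: the second trajectory is unaffected by the first.
isolated = out_a[3:] == out_b[3:]

# Control: without the boundary marker, state leaks across trajectories.
no_boundary = [True, False, False, False, False, False]
leak_a = rnn_forward(batch_a, no_boundary)
leak_b = rnn_forward(batch_b, no_boundary)
leaked = leak_a[3:] != leak_b[3:]

print("isolated with markers:", isolated)
print("leaks without markers:", leaked)
```

The control case matters: an isolation check that passes both with and without the markers would not be testing anything.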
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3860
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEV
There is 1 currently active SEV. If your PR is affected, please view it below:
❌ 3 New Failures, 1 Unrelated Failure
As of commit 69543ac with merge base ed22b88:
NEW FAILURES - The following jobs have failed:
BROKEN TRUNK - The following job failed but was present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Benchmark Results: PR
| Benchmark | main ops | PR ops | Change |
|---|---|---|---|
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 50.80 | 192.80 | +279.50% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 201.31 | 40.42 | -79.92% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 773.00 | 1,107 | +43.24% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 3,408 | 2,603 | -23.63% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 3,231 | 2,521 | -21.97% |
| benchmarks/test_envs_benchmark.py::test_cat_frames_functional[16-same] | 19.32 | 22.32 | +15.53% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 3,027 | 2,577 | -14.86% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 924.34 | 790.57 | -14.47% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 3,224 | 3,682 | +14.20% |
| benchmarks/test_envs_benchmark.py::test_cat_frames_functional[4-same] | 27.57 | 23.79 | -13.73% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 582.86 | 516.38 | -11.41% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2,026 | 1,824 | -9.97% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2,983 | 2,701 | -9.47% |
| benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-backward] | 259.60 | 284.07 | +9.43% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 2,127 | 2,313 | +8.76% |
| benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[True-backward] | 130.72 | 141.94 | +8.58% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 3,115 | 3,372 | +8.22% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 2,132 | 2,307 | +8.20% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape1-atari] | 5,190 | 4,837 | -6.79% |
| benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[True-None] | 265.01 | 282.58 | +6.63% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 3,198 | 3,402 | +6.39% |
| benchmarks/test_objectives_benchmarks.py::test_values[td0_return_estimate-False-False] | 7,727 | 7,237 | -6.33% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape2-large_img] | 583.29 | 546.41 | -6.32% |
| benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-backward] | 389.97 | 414.57 | +6.31% |
| benchmarks/test_envs_benchmark.py::test_cat_frames_functional[4-constant] | 4,154 | 4,370 | +5.20% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[200-img_shape3-large_batch] | 771.57 | 732.55 | -5.06% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 3,022 | 2,876 | -4.83% |
| benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 673.08 | 641.21 | -4.73% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 560.74 | 534.49 | -4.68% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape2-large_img] | 400.26 | 418.02 | +4.44% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2,070 | 2,160 | +4.36% |
| benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[safetensors] | 22,874 | 23,870 | +4.35% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[50-img_shape0-small] | 4,312 | 4,494 | +4.22% |
| benchmarks/test_objectives_benchmarks.py::test_sac_speed[reduce-overhead-None] | 457.36 | 475.60 | +3.99% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[200-img_shape3-large_batch] | 139.63 | 134.18 | -3.90% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape2-large_img] | 177.09 | 170.31 | -3.83% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-True] | 35,370 | 34,198 | -3.31% |
| benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-lstm] | 2.0238 | 1.9602 | -3.14% |
| benchmarks/test_envs_benchmark.py::test_transformed | 0.8880 | 0.9156 | +3.11% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 51.55 | 53.12 | +3.05% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[50-img_shape0-small] | 7,100 | 7,311 | +2.97% |
| benchmarks/test_objectives_benchmarks.py::test_ppo_speed[True-backward] | 111.12 | 114.39 | +2.94% |
| benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-None] | 460.84 | 474.36 | +2.93% |
| benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[untyped_storage] | 7.9115 | 8.1399 | +2.89% |
| benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[numpy] | 347,396 | 337,827 | -2.75% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-True] | 42,929 | 41,756 | -2.73% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[50-img_shape0-small] | 3,482 | 3,573 | +2.63% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-True] | 38,821 | 37,811 | -2.60% |
| benchmarks/test_envs_benchmark.py::test_parallel | 0.9650 | 0.9405 | -2.53% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-True] | 38,982 | 38,015 | -2.48% |
| benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 0.5227 | 0.5356 | +2.47% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-True] | 31,648 | 30,877 | -2.44% |
| benchmarks/test_objectives_benchmarks.py::test_td3_speed[reduce-overhead-None] | 570.11 | 556.30 | -2.42% |
| benchmarks/test_objectives_benchmarks.py::test_iql_speed[reduce-overhead-None] | 117.93 | 115.13 | -2.38% |
| benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-backward] | 59.25 | 57.85 | -2.36% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-False] | 52,363 | 51,139 | -2.34% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-True] | 20,568 | 20,090 | -2.32% |
| benchmarks/test_objectives_benchmarks.py::test_redq_speed[True-None] | 221.44 | 226.57 | +2.32% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-False] | 78,753 | 76,976 | -2.26% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape1-atari] | 273.06 | 279.07 | +2.20% |
| benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 0.2230 | 0.2279 | +2.17% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 53.69 | 52.57 | -2.09% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 928.36 | 908.96 | -2.09% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-True] | 21,784 | 21,330 | -2.08% |
| benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-False-0-gru] | 1.3502 | 1.3774 | +2.01% |
| benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[False-backward] | 240.57 | 245.35 | +1.98% |
| benchmarks/test_objectives_benchmarks.py::test_redq_speed[reduce-overhead-None] | 232.89 | 228.47 | -1.90% |
| benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[True-backward] | 123.56 | 125.90 | +1.89% |
| benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[reduce-overhead-None] | 285.20 | 290.60 | +1.89% |
| benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-None] | 1,776 | 1,743 | -1.86% |
| benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-gru] | 3.0589 | 3.0050 | -1.76% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 168.28 | 171.24 | +1.76% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 162.11 | 164.94 | +1.75% |
| benchmarks/test_objectives_benchmarks.py::test_dqn_speed[False-backward] | 516.09 | 507.08 | -1.75% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-True] | 31,549 | 31,000 | -1.74% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape2-large_img] | 390.21 | 396.82 | +1.70% |
| benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-backward] | 116.57 | 118.46 | +1.62% |
| benchmarks/test_envs_benchmark.py::test_cat_frames_functional[16-constant] | 2,624 | 2,666 | +1.60% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-False] | 65,368 | 64,322 | -1.60% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 167.98 | 170.66 | +1.60% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-True] | 18,559 | 18,853 | +1.58% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-False] | 28,299 | 27,859 | -1.55% |
| benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-None] | 693.75 | 704.39 | +1.53% |
| benchmarks/test_objectives_benchmarks.py::test_dqn_speed[False-None] | 703.47 | 692.71 | -1.53% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-False] | 56,152 | 55,294 | -1.53% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-True] | 19,460 | 19,169 | -1.50% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-True] | 21,773 | 21,454 | -1.46% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-False] | 36,119 | 35,593 | -1.46% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-True] | 22,774 | 22,442 | -1.45% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-False] | 38,959 | 39,523 | +1.45% |
| benchmarks/test_objectives_benchmarks.py::test_ppo_speed[False-backward] | 78.43 | 79.52 | +1.38% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 48.77 | 49.43 | +1.35% |
| benchmarks/test_objectives_benchmarks.py::test_cql_speed[False-backward] | 28.57 | 28.20 | -1.28% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-True] | 24,386 | 24,088 | -1.22% |
| benchmarks/test_objectives_benchmarks.py::test_a2c_speed[False-None] | 175.70 | 177.82 | +1.21% |
| benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-True-0-lstm] | 0.9480 | 0.9367 | -1.20% |
| benchmarks/test_objectives_benchmarks.py::test_cql_speed[False-None] | 38.71 | 38.25 | -1.19% |
| benchmarks/test_envs_benchmark.py::test_simple | 1.7912 | 1.8121 | +1.17% |
| benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-None] | 84.81 | 85.79 | +1.16% |
| benchmarks/test_envs_benchmark.py::test_serial | 0.5778 | 0.5843 | +1.13% |
| benchmarks/test_objectives_benchmarks.py::test_redq_speed[False-backward] | 56.70 | 56.09 | -1.08% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-False] | 32,885 | 32,531 | -1.08% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 164.30 | 165.97 | +1.02% |
| benchmarks/test_objectives_benchmarks.py::test_values[td1_return_estimate-False-False] | 35.59 | 35.24 | -0.98% |
| benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[False-None] | 346.33 | 349.73 | +0.98% |
| benchmarks/test_objectives_benchmarks.py::test_a2c_speed[False-backward] | 82.41 | 83.21 | +0.97% |
| benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-backward] | 131.47 | 132.73 | +0.96% |
| benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-None] | 288.00 | 290.72 | +0.94% |
| benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-False-0-lstm] | 0.8717 | 0.8636 | -0.92% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-False] | 51,530 | 51,999 | +0.91% |
| benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[200-img_shape1-large_batch] | 15.25 | 15.39 | +0.90% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-False] | 64,680 | 65,264 | +0.90% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 511.38 | 515.85 | +0.87% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-False] | 58,549 | 59,053 | +0.86% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-False] | 32,895 | 32,616 | -0.85% |
| benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb[200-img_shape1-large_batch] | 13.37 | 13.48 | +0.85% |
| benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-True-0-lstm] | 3.1335 | 3.1075 | -0.83% |
| benchmarks/test_collectors_benchmark.py::test_single_with_rb | 8.6332 | 8.7042 | +0.82% |
| benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-None] | 211.77 | 213.50 | +0.82% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 25.61 | 25.40 | -0.81% |
| ... | ... | ... | ... |

Showing 120 of 192 comparisons, sorted by absolute change.
GPU
Compared 202 benchmarks. Regressions over 5%: 14. Improvements over 5%: 10.
| Benchmark | main ops | PR ops | Change |
|---|---|---|---|
| benchmarks/test_objectives_benchmarks.py::test_iql_speed[reduce-overhead-None] | 78.17 | 102.99 | +31.76% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 1,037 | 738.83 | -28.73% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape1-atari] | 3,584 | 4,582 | +27.85% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 1,044 | 826.98 | -20.80% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 818.77 | 656.11 | -19.87% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1,854 | 2,199 | +18.62% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 712.38 | 838.54 | +17.71% |
| benchmarks/test_objectives_benchmarks.py::test_sac_speed[reduce-overhead-None] | 86.62 | 101.95 | +17.69% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2,848 | 3,334 | +17.04% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 3,689 | 3,138 | -14.94% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 3,620 | 3,080 | -14.92% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 45.69 | 39.48 | -13.58% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2,685 | 2,993 | +11.46% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2,208 | 1,973 | -10.65% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 530.89 | 474.51 | -10.62% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 2,327 | 2,108 | -9.42% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2,761 | 3,007 | +8.90% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 3,023 | 3,265 | +8.00% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2,685 | 2,869 | +6.85% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[200-img_shape3-large_batch] | 774.89 | 721.96 | -6.83% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 2,204 | 2,054 | -6.81% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-False] | 37,450 | 35,036 | -6.45% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape2-large_img] | 565.08 | 532.02 | -5.85% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-False] | 59,503 | 56,430 | -5.16% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape2-large_img] | 423.70 | 405.07 | -4.40% |
| benchmarks/test_collectors_benchmark.py::test_sync_pixels | 10.42 | 10.87 | +4.37% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 2,808 | 2,687 | -4.30% |
| benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-single-True] | 1.3145 | 1.3705 | +4.26% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape2-large_img] | 405.03 | 388.36 | -4.12% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 485.89 | 466.25 | -4.04% |
| benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 0.6088 | 0.5842 | -4.04% |
| benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-None] | 743.68 | 715.34 | -3.81% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-True] | 21,557 | 20,744 | -3.77% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-True] | 38,887 | 37,431 | -3.75% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-False] | 51,831 | 49,912 | -3.70% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-True] | 36,110 | 34,828 | -3.55% |
| benchmarks/test_envs_benchmark.py::test_simple | 1.2480 | 1.2056 | -3.40% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 719.93 | 744.31 | +3.39% |
| benchmarks/test_objectives_benchmarks.py::test_ppo_speed[True-backward] | 343.07 | 331.46 | -3.38% |
| benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[False-backward] | 237.68 | 245.71 | +3.38% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape2-large_img] | 174.51 | 168.68 | -3.34% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-True] | 41,988 | 43,359 | +3.27% |
| benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-None] | 710.63 | 733.28 | +3.19% |
| benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[True-backward] | 268.98 | 260.43 | -3.18% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-True] | 28,695 | 29,568 | +3.04% |
| benchmarks/test_collectors_benchmark.py::test_async | 11.34 | 11.00 | -3.00% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-True] | 33,923 | 32,962 | -2.83% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-False] | 64,863 | 63,115 | -2.70% |
| benchmarks/test_collectors_benchmark.py::test_sync_preempt | 10.20 | 10.46 | +2.56% |
| benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 0.5389 | 0.5251 | -2.56% |
| benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sample_mixed_devices[1000000-cuda_storage_cuda_samp... | 1,520 | 1,482 | -2.50% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 21.97 | 21.44 | -2.43% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-True] | 23,676 | 24,245 | +2.40% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape1-atari] | 270.99 | 277.42 | +2.37% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape1-atari] | 715.50 | 698.70 | -2.35% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape1-atari] | 664.43 | 649.24 | -2.29% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-True] | 20,564 | 20,099 | -2.27% |
| benchmarks/test_envs_benchmark.py::test_parallel | 0.5405 | 0.5289 | -2.14% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 491.09 | 480.64 | -2.13% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[200-img_shape3-large_batch] | 303.78 | 310.19 | +2.11% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-True] | 22,178 | 22,639 | +2.08% |
| benchmarks/test_objectives_benchmarks.py::test_a2c_speed[False-backward] | 154.51 | 151.35 | -2.05% |
| benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[untyped_storage] | 8.6762 | 8.5060 | -1.96% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-False] | 30,242 | 29,650 | -1.96% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-False] | 77,381 | 75,889 | -1.93% |
| benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 1,329 | 1,304 | -1.92% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-False] | 32,652 | 32,032 | -1.90% |
| benchmarks/test_objectives_benchmarks.py::test_ppo_speed[reduce-overhead-None] | 806.14 | 821.31 | +1.88% |
| benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb_cuda[200-img_shape1-large_batch] | 8.2501 | 8.4047 | +1.87% |
| benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sampler_sample_scale[1000000-cuda] | 2,293 | 2,252 | -1.78% |
| benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 1,365 | 1,341 | -1.77% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 52.55 | 53.47 | +1.76% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-True] | 21,160 | 20,790 | -1.75% |
| benchmarks/test_objectives_benchmarks.py::test_iql_speed[True-backward] | 235.66 | 239.73 | +1.73% |
| benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-lstm] | 21.60 | 21.24 | -1.65% |
| benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-None] | 362.35 | 368.29 | +1.64% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-False] | 56,335 | 55,418 | -1.63% |
| benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-True-0-lstm] | 75.41 | 74.19 | -1.63% |
| benchmarks/test_objectives_benchmarks.py::test_td3_speed[False-None] | 113.85 | 115.69 | +1.61% |
| benchmarks/test_objectives_benchmarks.py::test_a2c_speed[reduce-overhead-None] | 863.60 | 876.99 | +1.55% |
| benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-backward] | 283.52 | 279.15 | -1.54% |
| benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-backward] | 367.68 | 373.20 | +1.50% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 195.49 | 198.39 | +1.48% |
| benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[False-backward] | 72.60 | 73.65 | +1.45% |
| benchmarks/test_objectives_benchmarks.py::test_td3_speed[False-backward] | 86.64 | 85.40 | -1.43% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-True] | 19,861 | 20,140 | +1.40% |
| benchmarks/test_objectives_benchmarks.py::test_sac_speed[False-backward] | 79.25 | 80.34 | +1.37% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-True] | 19,417 | 19,152 | -1.36% |
| benchmarks/test_collectors_benchmark.py::test_single_with_rb_pixels | 5.3496 | 5.2774 | -1.35% |
| benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-backward] | 309.79 | 313.93 | +1.34% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[50-img_shape0-small] | 3,551 | 3,506 | -1.28% |
| benchmarks/test_objectives_benchmarks.py::test_ppo_speed[False-None] | 233.95 | 236.93 | +1.27% |
| benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-gru] | 22.77 | 22.48 | -1.26% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-False] | 35,304 | 34,867 | -1.24% |
| benchmarks/test_objectives_benchmarks.py::test_gae_speed[generalized_advantage_estimate-False-1-512] | 48.57 | 49.16 | +1.22% |
| benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb_cuda[200-img_shape1-large_batch] | 8.6049 | 8.7097 | +1.22% |
| benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-backward] | 883.44 | 894.18 | +1.22% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 22.37 | 22.64 | +1.21% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-False] | 45,132 | 45,667 | +1.19% |
| benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[reduce-overhead-None] | 110.60 | 109.33 | -1.15% |
| benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sample_mixed_devices[1000000-cuda_storage_cpu_sampler] | 89.26 | 88.24 | -1.14% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 170.57 | 168.65 | -1.13% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-True] | 18,637 | 18,845 | +1.12% |
| benchmarks/test_objectives_benchmarks.py::test_sac_speed[False-None] | 113.60 | 114.79 | +1.05% |
| benchmarks/test_objectives_benchmarks.py::test_iql_speed[False-None] | 100.65 | 101.71 | +1.05% |
| benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[False-None] | 101.12 | 102.18 | +1.04% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 49.06 | 48.55 | -1.03% |
| benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[True-backward] | 347.15 | 350.65 | +1.01% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-False] | 32,475 | 32,151 | -1.00% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 168.28 | 169.85 | +0.93% |
| benchmarks/test_objectives_benchmarks.py::test_ppo_speed[True-None] | 672.14 | 678.32 | +0.92% |
| benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 0.7056 | 0.6992 | -0.92% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[200-img_shape3-large_batch] | 326.95 | 329.95 | +0.92% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-False] | 51,054 | 51,517 | +0.91% |
| benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb_cuda[100-img_shape0-atari] | 16.46 | 16.60 | +0.85% |
| benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-backward] | 443.11 | 446.77 | +0.83% |
| benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-backward] | 215.96 | 217.74 | +0.83% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 54.26 | 54.69 | +0.79% |
| benchmarks/test_collectors_benchmark.py::test_sync | 10.28 | 10.36 | +0.78% |
| benchmarks/test_objectives_benchmarks.py::test_values[generalized_advantage_estimate-True-True] | 48.44 | 48.82 | +0.77% |
| ... | ... | ... | ... |

Showing 120 of 202 comparisons, sorted by absolute change.
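For readers checking the GPU summary line by hand, here is a rough sketch of how the Change column and the 5% buckets appear to relate. It assumes that "ops" is a throughput figure (higher is better), so a negative change counts as a regression. That reading is an assumption, but it is consistent with the displayed GPU rows: 14 fall below -5% and 10 rise above +5%. The function names `pct_change` and `classify` are hypothetical and are not part of the benchmark tooling.

```python
def pct_change(main_ops: float, pr_ops: float) -> float:
    """Relative change of the PR figure against main, in percent."""
    return (pr_ops - main_ops) / main_ops * 100.0


def classify(main_ops: float, pr_ops: float, threshold: float = 5.0) -> str:
    """Bucket one benchmark row (assumes higher ops is better)."""
    change = pct_change(main_ops, pr_ops)
    if change < -threshold:
        return "regression"
    if change > threshold:
        return "improvement"
    return "neutral"


# Three rows copied from the GPU table: (benchmark, main ops, PR ops).
rows = [
    ("test_iql_speed[reduce-overhead-None]", 78.17, 102.99),
    ("test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400]", 1037.0, 738.83),
    ("test_sync", 10.28, 10.36),
]

summary = {name: classify(main, pr) for name, main, pr in rows}
for name, main, pr in rows:
    print(f"{name}: {pct_change(main, pr):+.2f}% -> {summary[name]}")
```

Percentages recomputed from the rounded figures shown in the table can differ from the Change column in the second decimal place, presumably because the bot computes them from unrounded measurements.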