[Feature] Feat/loss module helpers #3901
Merged
Add LossModule._prepare_value_estimator_kwargs() to eliminate the 9-11 line boilerplate that every make_value_estimator override was repeating: defaulting value_type, delegating the ValueEstimatorBase instance/class path to the base handler, and building the hp dict from default_value_kwargs merged with self.gamma and caller overrides.

Refactored losses: A2CLoss, PPOLoss, SACLoss, DiscreteSACLoss, DDPGLoss, TD3Loss (~55 lines removed). The remaining 11 losses follow the identical pattern and can be migrated in a follow-up.

Adds 30 regression tests in TestPrepareValueEstimatorKwargs covering the helper in isolation and enum dispatch for all supported value types across each refactored loss.

Also adds .envrc and notebooks/ to .gitignore.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
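The three steps the commit message lists (default the value type, delegate estimator instances or classes to the base handler, merge hyperparameters) can be sketched in isolation. This is a minimal illustration only: `ValueType`, `EstimatorBase`, `default_kwargs`, `SketchLoss`, and the `(value_type, hp)` return shape are hypothetical stand-ins, not the actual TorchRL signatures.

```python
from enum import Enum


class ValueType(Enum):
    """Hypothetical stand-in for the value-estimator enum."""
    TD0 = "td0"
    GAE = "gae"


class EstimatorBase:
    """Hypothetical stand-in for ValueEstimatorBase."""


def default_kwargs(value_type):
    """Hypothetical stand-in for default_value_kwargs."""
    defaults = {
        ValueType.TD0: {"gamma": 0.99},
        ValueType.GAE: {"gamma": 0.99, "lmbda": 0.95},
    }
    return dict(defaults[value_type])


class SketchLoss:
    default_value_estimator = ValueType.TD0

    def __init__(self, gamma=None):
        if gamma is not None:
            self.gamma = gamma
        self.delegated = None  # records what the base handler received

    def _base_make_value_estimator(self, value_type, **hyperparams):
        # Stand-in for the base-class path that handles instances/classes.
        self.delegated = (value_type, hyperparams)

    def _prepare_value_estimator_kwargs(self, value_type=None, **hyperparams):
        # 1. Default the value type.
        if value_type is None:
            value_type = self.default_value_estimator
        # 2. Delegate estimator instances or classes to the base handler.
        is_instance = isinstance(value_type, EstimatorBase)
        is_class = isinstance(value_type, type) and issubclass(value_type, EstimatorBase)
        if is_instance or is_class:
            self._base_make_value_estimator(value_type, **hyperparams)
            return None
        # 3. Defaults, then self.gamma (if set), then caller overrides.
        hp = default_kwargs(value_type)
        if hasattr(self, "gamma"):
            hp["gamma"] = self.gamma
        hp.update(hyperparams)
        return value_type, hp
```

Under these assumptions, an override of make_value_estimator would call the helper first, return early when it yields `None`, and otherwise dispatch on the returned value type with the merged `hp` dict.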
…_value_estimator_keys

Follow-up to the make_value_estimator preamble extraction. Adds two more LossModule helpers that remove copy-paste from new loss modules:

- register_coeff_buffer(name, value, *, device, dtype): converts a scalar (or tensor) coefficient to a tensor and registers it as a buffer, with None setting the attribute to None instead (the common optional-coefficient idiom). Replaces the repeated isinstance/torch.tensor/register_buffer block.
- A default _forward_value_estimator_keys that forwards the six universally accepted value-estimator keys (advantage, value_target, value, reward, done, terminated) present on the loss's tensor_keys, then calls _set_in_keys when defined. Losses that remap key names (value -> state_action_value / global_value) or forward estimator-specific keys (sample_log_prob) keep their own override.

Migrated: A2CLoss (coeff buffers), SACLoss and IQLLoss (drop bespoke _forward_value_estimator_keys). The remaining losses follow identical patterns and can be migrated in a follow-up; losses whose _AcceptedKeys include extra value keys (e.g. REDQ's sample_log_prob) need their forwarding verified first.

Adds TestRegisterCoeffBuffer and TestDefaultForwardValueEstimatorKeys to test/objectives/test_loss_module.py (12 tests).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
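The default key forwarding described in this commit message can be sketched without TorchRL. Everything here is a hypothetical stand-in written for illustration (`Keys`, `StubEstimator`, `SketchLoss`, and the `in_keys_refreshed` flag); only the six key names and the "forward what is present, then call _set_in_keys when defined" order come from the text.

```python
from dataclasses import dataclass

# The six keys the commit message names as universally accepted.
FORWARDED_KEYS = ("advantage", "value_target", "value", "reward", "done", "terminated")


@dataclass
class Keys:
    """Hypothetical stand-in for a loss's tensor_keys."""
    value: str = "state_value"
    reward: str = "reward"
    done: str = "done"
    terminated: str = "terminated"
    priority: str = "td_error"  # loss-specific key, not forwarded


class StubEstimator:
    """Hypothetical stand-in for a value estimator exposing set_keys."""

    def __init__(self):
        self.keys = {}

    def set_keys(self, **kwargs):
        self.keys.update(kwargs)


class SketchLoss:
    def __init__(self):
        self.tensor_keys = Keys()
        self._value_estimator = StubEstimator()
        self.in_keys_refreshed = False

    def _set_in_keys(self):
        self.in_keys_refreshed = True

    def _forward_value_estimator_keys(self):
        # Forward only the accepted keys that this loss actually defines.
        if self._value_estimator is not None:
            present = {
                name: getattr(self.tensor_keys, name)
                for name in FORWARDED_KEYS
                if hasattr(self.tensor_keys, name)
            }
            self._value_estimator.set_keys(**present)
        # Refresh in_keys only when the loss defines the hook.
        refresh = getattr(self, "_set_in_keys", None)
        if callable(refresh):
            refresh()


loss = SketchLoss()
loss._forward_value_estimator_keys()
print(loss._value_estimator.keys)
```

A loss that remaps names (for example value -> state_action_value) would not fit this name-for-name forwarding, which is consistent with the note that such losses keep their own override.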
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3901

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs
There are 1 currently active SEVs. If your PR is affected, please view them below:

This comment was automatically generated by Dr. CI and updates every 15 minutes.
Benchmark Results: PR

| Benchmark | main ops | PR ops | Change |
|---|---|---|---|
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 430.59 | 2,050 | +376.07% |
| benchmarks/test_objectives_benchmarks.py::test_sac_speed[False-backward] | 54.06 | 84.84 | +56.93% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 2,573 | 3,628 | +40.99% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 2,612 | 3,571 | +36.71% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2,484 | 3,353 | +35.02% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2,504 | 3,364 | +34.36% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2,527 | 3,189 | +26.20% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 1,896 | 2,298 | +21.18% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2,796 | 3,354 | +19.94% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1,849 | 2,204 | +19.21% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 3,035 | 3,612 | +19.01% |
| benchmarks/test_envs_benchmark.py::test_cat_frames_functional[4-same] | 23.99 | 28.35 | +18.18% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 2,629 | 3,087 | +17.42% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 687.79 | 785.31 | +14.18% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 727.20 | 829.64 | +14.09% |
| benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 33.98 | 29.24 | -13.94% |
| benchmarks/test_envs_benchmark.py::test_cat_frames_functional[16-same] | 22.52 | 19.90 | -11.65% |
| benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[True-None] | 248.31 | 275.40 | +10.91% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 491.46 | 537.49 | +9.37% |
| benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-backward] | 228.71 | 248.19 | +8.52% |
| benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-backward] | 258.85 | 278.48 | +7.59% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-True] | 21,502 | 23,128 | +7.56% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 51.16 | 54.79 | +7.08% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 2,295 | 2,133 | -7.07% |
| benchmarks/test_envs_benchmark.py::test_simple | 1.7003 | 1.7977 | +5.73% |
| benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[safetensors] | 24,360 | 23,161 | -4.92% |
| benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 637.95 | 666.44 | +4.47% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-True] | 19,874 | 20,750 | +4.41% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape1-atari] | 629.46 | 655.72 | +4.17% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 46.60 | 48.52 | +4.11% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-True] | 19,073 | 19,850 | +4.07% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 517.69 | 498.15 | -3.77% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape2-large_img] | 395.39 | 409.97 | +3.69% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 1,051 | 1,090 | +3.67% |
| benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[numpy] | 359,382 | 372,516 | +3.65% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-False] | 30,642 | 31,760 | +3.65% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-True] | 20,334 | 21,066 | +3.60% |
| benchmarks/test_objectives_benchmarks.py::test_gae_speed[generalized_advantage_estimate-False-1-512] | 106.52 | 110.35 | +3.60% |
| benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-gru] | 3.0190 | 3.1259 | +3.54% |
| benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-single-True] | 1.3598 | 1.3123 | -3.50% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 739.92 | 765.43 | +3.45% |
| benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-None] | 82.21 | 84.89 | +3.26% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape1-atari] | 5,241 | 5,072 | -3.23% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 25.33 | 24.51 | -3.21% |
| benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 0.6026 | 0.5833 | -3.21% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-False] | 43,703 | 42,420 | -2.94% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape2-large_img] | 173.08 | 168.12 | -2.87% |
| benchmarks/test_objectives_benchmarks.py::test_values[td1_return_estimate-False-False] | 35.68 | 36.71 | +2.87% |
| benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[False-backward] | 63.14 | 61.36 | -2.82% |
| benchmarks/test_objectives_benchmarks.py::test_redq_speed[False-backward] | 56.39 | 54.83 | -2.78% |
| benchmarks/test_envs_benchmark.py::test_transformed | 0.8868 | 0.9101 | +2.62% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-True] | 17,695 | 18,150 | +2.57% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape1-atari] | 681.95 | 699.45 | +2.57% |
| benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[True-backward] | 122.67 | 125.78 | +2.54% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-False] | 37,649 | 38,577 | +2.47% |
| benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[untyped_storage] | 7.6489 | 7.8350 | +2.43% |
| benchmarks/test_objectives_benchmarks.py::test_values[td0_return_estimate-False-False] | 7,670 | 7,854 | +2.41% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-False] | 26,794 | 27,438 | +2.40% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-True] | 36,942 | 37,810 | +2.35% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-True] | 21,643 | 22,151 | +2.35% |
| benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb[100-img_shape0-atari] | 25.62 | 26.21 | +2.31% |
| benchmarks/test_objectives_benchmarks.py::test_a2c_speed[False-None] | 175.10 | 178.96 | +2.21% |
| benchmarks/test_objectives_benchmarks.py::test_values[td_lambda_return_estimate-True-False] | 24.09 | 24.62 | +2.19% |
| benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[reduce-overhead-None] | 274.00 | 279.90 | +2.16% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-True] | 36,959 | 37,745 | +2.13% |
| benchmarks/test_envs_benchmark.py::test_serial | 0.5704 | 0.5823 | +2.09% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-True] | 18,214 | 18,593 | +2.08% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[50-img_shape0-small] | 4,396 | 4,486 | +2.05% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-False] | 76,088 | 77,608 | +2.00% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[50-img_shape0-small] | 3,441 | 3,509 | +1.99% |
| benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[100-img_shape0-atari] | 29.31 | 29.89 | +1.98% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-True] | 19,161 | 19,537 | +1.97% |
| benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-None] | 281.85 | 287.38 | +1.96% |
| benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb[200-img_shape1-large_batch] | 13.10 | 13.36 | +1.94% |
| benchmarks/test_objectives_benchmarks.py::test_redq_speed[True-None] | 227.64 | 223.36 | -1.88% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-False] | 56,128 | 57,176 | +1.87% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-False] | 43,994 | 44,809 | +1.85% |
| benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-None] | 689.46 | 702.21 | +1.85% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-True] | 29,532 | 30,060 | +1.79% |
| benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[False-None] | 348.98 | 342.87 | -1.75% |
| benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[False-backward] | 244.88 | 240.60 | -1.75% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[200-img_shape3-large_batch] | 328.83 | 334.52 | +1.73% |
| benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-backward] | 56.93 | 57.91 | +1.72% |
| benchmarks/test_objectives_benchmarks.py::test_values[generalized_advantage_estimate-True-True] | 95.45 | 97.08 | +1.71% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape2-large_img] | 421.59 | 428.72 | +1.69% |
| benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[200-img_shape1-large_batch] | 14.85 | 15.09 | +1.65% |
| benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 0.5266 | 0.5181 | -1.61% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 195.80 | 198.93 | +1.60% |
| benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 0.2228 | 0.2264 | +1.59% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-True] | 30,045 | 30,522 | +1.59% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 193.75 | 196.80 | +1.58% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 495.60 | 487.79 | -1.57% |
| benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[torch.save] | 7,226 | 7,112 | -1.57% |
| benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[True-None] | 335.08 | 329.85 | -1.56% |
| benchmarks/test_collectors_benchmark.py::test_single_with_rb | 8.5892 | 8.7218 | +1.54% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-True] | 32,250 | 32,748 | +1.54% |
| benchmarks/test_objectives_benchmarks.py::test_td3_speed[False-backward] | 89.63 | 90.96 | +1.49% |
| benchmarks/test_objectives_benchmarks.py::test_sac_speed[reduce-overhead-None] | 475.64 | 482.67 | +1.48% |
| benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-backward] | 131.85 | 129.91 | -1.47% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-False] | 53,908 | 54,652 | +1.38% |
| benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[False-None] | 88.96 | 87.75 | -1.37% |
| benchmarks/test_objectives_benchmarks.py::test_td3_speed[False-None] | 120.09 | 121.72 | +1.35% |
| benchmarks/test_objectives_benchmarks.py::test_sac_speed[False-None] | 120.74 | 122.31 | +1.30% |
| benchmarks/test_objectives_benchmarks.py::test_ppo_speed[False-backward] | 77.81 | 76.80 | -1.29% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-False] | 62,712 | 63,508 | +1.27% |
| benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 0.5842 | 0.5912 | +1.19% |
| benchmarks/test_collectors_benchmark.py::test_sync | 16.50 | 16.70 | +1.17% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-False] | 34,252 | 33,859 | -1.15% |
| benchmarks/test_objectives_benchmarks.py::test_iql_speed[False-None] | 49.09 | 49.65 | +1.14% |
| benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-True-0-lstm] | 3.1154 | 3.1506 | +1.13% |
| benchmarks/test_collectors_benchmark.py::test_single | 8.8213 | 8.9170 | +1.09% |
| benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-False-0-lstm] | 0.8560 | 0.8470 | -1.06% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape1-atari] | 273.15 | 275.95 | +1.02% |
| benchmarks/test_objectives_benchmarks.py::test_redq_speed[reduce-overhead-None] | 223.20 | 225.45 | +1.01% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 24.68 | 24.43 | -1.01% |
| benchmarks/test_objectives_benchmarks.py::test_iql_speed[True-None] | 116.73 | 117.83 | +0.95% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-False] | 49,417 | 49,882 | +0.94% |
| benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 558.95 | 564.19 | +0.94% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[200-img_shape3-large_batch] | 308.83 | 311.68 | +0.92% |
| benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 0.2103 | 0.2123 | +0.92% |
| ... | ... | ... | ... |

Showing 120 of 192 comparisons, sorted by absolute change.
GPU
Compared 202 benchmarks. Regressions over 5%: 13. Improvements over 5%: 7.
| Benchmark | main ops | PR ops | Change |
|---|---|---|---|
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 50.38 | 490.50 | +873.51% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 196.49 | 37.58 | -80.88% |
| benchmarks/test_objectives_benchmarks.py::test_cql_speed[reduce-overhead-None] | 57.77 | 89.41 | +54.77% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 3,656 | 2,447 | -33.07% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 937.57 | 702.69 | -25.05% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 3,416 | 2,654 | -22.33% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 915.36 | 738.56 | -19.32% |
| benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[reduce-overhead-None] | 106.51 | 86.19 | -19.08% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 868.83 | 704.66 | -18.90% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1,862 | 2,210 | +18.69% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 415.91 | 493.48 | +18.65% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 848.71 | 719.89 | -15.18% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 3,309 | 2,994 | -9.52% |
| benchmarks/test_objectives_benchmarks.py::test_a2c_speed[reduce-overhead-None] | 794.90 | 861.97 | +8.44% |
| benchmarks/test_objectives_benchmarks.py::test_ppo_speed[reduce-overhead-None] | 787.46 | 845.13 | +7.32% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape2-large_img] | 410.04 | 380.17 | -7.29% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape2-large_img] | 413.14 | 389.14 | -5.81% |
| benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-None] | 1,863 | 1,966 | +5.51% |
| benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-backward] | 399.72 | 378.60 | -5.28% |
| benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-backward] | 478.37 | 454.17 | -5.06% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[200-img_shape3-large_batch] | 733.09 | 697.69 | -4.83% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 2,604 | 2,728 | +4.80% |
| benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1,293 | 1,233 | -4.61% |
| benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[False-backward] | 245.32 | 234.30 | -4.49% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2,489 | 2,600 | +4.45% |
| benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-single-True] | 1.2995 | 1.3560 | +4.34% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape1-atari] | 4,246 | 4,064 | -4.29% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape2-large_img] | 551.73 | 528.38 | -4.23% |
| benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 1,397 | 1,338 | -4.22% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape2-large_img] | 173.33 | 166.11 | -4.17% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-True] | 17,743 | 18,479 | +4.15% |
| benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-backward] | 973.28 | 1,012 | +3.99% |
| benchmarks/test_objectives_benchmarks.py::test_a2c_speed[False-backward] | 155.57 | 149.49 | -3.91% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[50-img_shape0-small] | 3,555 | 3,422 | -3.72% |
| benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sampler_sample_scale[1000000-cuda] | 2,345 | 2,258 | -3.71% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape1-atari] | 685.68 | 660.33 | -3.70% |
| benchmarks/test_collectors_benchmark.py::test_sync | 10.16 | 10.53 | +3.67% |
| benchmarks/test_objectives_benchmarks.py::test_dqn_speed[False-None] | 647.14 | 624.10 | -3.56% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-False] | 48,388 | 50,098 | +3.53% |
| benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-lstm] | 22.17 | 21.41 | -3.43% |
| benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 0.6052 | 0.5851 | -3.31% |
| benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 1,359 | 1,314 | -3.31% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[200-img_shape3-large_batch] | 137.97 | 133.64 | -3.14% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-True] | 27,707 | 28,574 | +3.13% |
| benchmarks/test_objectives_benchmarks.py::test_dqn_speed[reduce-overhead-None] | 1,881 | 1,940 | +3.12% |
| benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[True-None] | 758.49 | 781.33 | +3.01% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 22.17 | 21.52 | -2.94% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-False] | 75,418 | 77,632 | +2.94% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-True] | 19,224 | 19,781 | +2.90% |
| benchmarks/test_objectives_benchmarks.py::test_td3_speed[reduce-overhead-None] | 43.31 | 44.49 | +2.72% |
| benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[torch.save] | 6,977 | 7,162 | +2.66% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-False] | 41,876 | 42,970 | +2.61% |
| benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-None] | 726.83 | 745.43 | +2.56% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 23.22 | 22.64 | -2.48% |
| benchmarks/test_envs_benchmark.py::test_serial | 0.4227 | 0.4328 | +2.39% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 53.55 | 52.30 | -2.33% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-True] | 32,190 | 32,936 | +2.32% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[50-img_shape0-small] | 4,164 | 4,259 | +2.28% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-False] | 33,880 | 34,645 | +2.26% |
| benchmarks/test_objectives_benchmarks.py::test_values[td0_return_estimate-False-False] | 12,098 | 11,828 | -2.23% |
| benchmarks/test_objectives_benchmarks.py::test_sac_speed[False-backward] | 80.89 | 79.10 | -2.22% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-True] | 29,514 | 30,162 | +2.19% |
| benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape1-atari] | 630.86 | 617.07 | -2.19% |
| benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[pickle] | 11,946 | 12,207 | +2.18% |
| benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sample_mixed_devices[1000000-cuda_storage_cpu_sampler] | 87.77 | 89.63 | +2.12% |
| benchmarks/test_envs_benchmark.py::test_simple | 1.2439 | 1.2177 | -2.10% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-False] | 27,169 | 27,738 | +2.10% |
| benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-gru] | 23.37 | 22.88 | -2.10% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-True] | 37,417 | 38,149 | +1.96% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 3,600 | 3,671 | +1.95% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-False] | 55,995 | 57,087 | +1.95% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 485.95 | 495.42 | +1.95% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-True] | 34,723 | 34,087 | -1.83% |
| benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[True-backward] | 348.38 | 354.71 | +1.82% |
| benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-None] | 370.22 | 376.82 | +1.78% |
| benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb_cuda[200-img_shape1-large_batch] | 8.7318 | 8.5801 | -1.74% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 2,115 | 2,079 | -1.73% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-False] | 34,798 | 34,198 | -1.72% |
| benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-backward] | 222.68 | 218.85 | -1.72% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-True] | 41,809 | 42,521 | +1.70% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 160.18 | 162.89 | +1.69% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-False] | 28,495 | 28,964 | +1.64% |
| benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 0.5112 | 0.5196 | +1.64% |
| benchmarks/test_objectives_benchmarks.py::test_values[vec_generalized_advantage_estimate-True-True] | 295.26 | 300.04 | +1.62% |
| benchmarks/test_objectives_benchmarks.py::test_iql_speed[True-None] | 507.92 | 516.12 | +1.61% |
| benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb_cuda[200-img_shape1-large_batch] | 8.6198 | 8.4824 | -1.59% |
| benchmarks/test_envs_benchmark.py::test_cat_frames_functional[4-constant] | 4,984 | 4,904 | -1.59% |
| benchmarks/test_objectives_benchmarks.py::test_values[td_lambda_return_estimate-True-False] | 12.77 | 12.56 | -1.59% |
| benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb_cuda[100-img_shape0-atari] | 17.48 | 17.21 | -1.55% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-True] | 19,359 | 19,652 | +1.51% |
| benchmarks/test_objectives_benchmarks.py::test_ppo_speed[True-None] | 684.14 | 694.37 | +1.50% |
| benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb[200-img_shape1-large_batch] | 13.49 | 13.29 | -1.46% |
| benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1,969 | 1,997 | +1.45% |
| benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sample_mixed_devices[1000000-memmap_cpu_storage_cud... | 994.08 | 979.76 | -1.44% |
| benchmarks/test_envs_benchmark.py::test_cat_frames_functional[4-same] | 7.0274 | 6.9262 | -1.44% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-False] | 63,566 | 64,438 | +1.37% |
| benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb_cuda[100-img_shape0-atari] | 17.05 | 16.82 | -1.35% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-False] | 38,190 | 38,703 | +1.34% |
| benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 0.5910 | 0.5831 | -1.34% |
| benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[False-None] | 346.21 | 341.63 | -1.32% |
| benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 0.5191 | 0.5125 | -1.28% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-False] | 49,426 | 48,800 | -1.27% |
| benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sample_mixed_devices[1000000-cuda_storage_cuda_samp... | 1,538 | 1,519 | -1.24% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-False] | 31,898 | 32,291 | +1.23% |
| benchmarks/test_collectors_benchmark.py::test_async_pixels | 10.71 | 10.84 | +1.20% |
| benchmarks/test_envs_benchmark.py::test_parallel | 0.5386 | 0.5322 | -1.19% |
| benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-True] | 23,784 | 23,502 | -1.19% |
| benchmarks/test_collectors_benchmark.py::test_async | 11.03 | 10.90 | -1.18% |
| benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-None] | 743.09 | 751.75 | +1.17% |
| benchmarks/test_objectives_benchmarks.py::test_iql_speed[reduce-overhead-None] | 105.66 | 106.89 | +1.16% |
| benchmarks/test_objectives_benchmarks.py::test_ppo_speed[False-None] | 234.57 | 231.86 | -1.16% |
| benchmarks/test_collectors_benchmark.py::test_sync_preempt | 10.55 | 10.43 | -1.13% |
| benchmarks/test_objectives_benchmarks.py::test_cql_speed[False-backward] | 41.22 | 40.77 | -1.11% |
| benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-backward] | 316.42 | 319.87 | +1.09% |
| benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[safetensors] | 23,663 | 23,409 | -1.07% |
| benchmarks/test_objectives_benchmarks.py::test_gae_speed[generalized_advantage_estimate-False-1-512] | 50.94 | 50.42 | -1.01% |
| benchmarks/test_objectives_benchmarks.py::test_values[vec_td_lambda_return_estimate-True-False] | 878.58 | 869.94 | -0.98% |
| benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[100-img_shape0-atari] | 30.66 | 30.36 | -0.95% |
| benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[200-img_shape1-large_batch] | 15.27 | 15.12 | -0.94% |
| benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[reduce-overhead-None] | 833.26 | 840.83 | +0.91% |
| ... | ... | ... | ... |

Showing 120 of 202 comparisons, sorted by absolute change.
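For readers checking the Change column and the "over 5%" counts, the arithmetic can be reproduced roughly as follows. This is a sketch under two assumptions that the bot output does not state explicitly: that Change is the relative difference of PR ops against main ops, and that higher ops counts as an improvement. Because the displayed ops values are rounded, recomputed percentages may differ slightly from the table.

```python
def percent_change(main_ops: float, pr_ops: float) -> float:
    """Relative change of the PR measurement against main, in percent."""
    return (pr_ops - main_ops) / main_ops * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Bucket a change using the 5% threshold mentioned in the GPU summary."""
    if change_pct > threshold:
        return "improvement"
    if change_pct < -threshold:
        return "regression"
    return "within threshold"


# (benchmark, main ops, PR ops) copied from rows of the GPU table.
rows = [
    ("test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400]", 50.38, 490.50),
    ("test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]", 196.49, 37.58),
    ("test_dqn_speed[True-None]", 1863.0, 1966.0),
    ("test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000]", 2604.0, 2728.0),
]

for name, main_ops, pr_ops in rows:
    change = percent_change(main_ops, pr_ops)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Swings as large as the first two rows, on replay-buffer benchmarks this PR does not touch, are more plausibly runner noise than an effect of the loss-module helpers, though the page itself does not say so.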
Summary
This PR adds reusable `LossModule` helpers for boilerplate that appears across several loss modules, and migrates a small set of callsites to use them.

New helpers
- `LossModule.register_coeff_buffer(...)`
  - Handles tensor conversion, `None` behavior, and scalar validation.
  - Replaces repeated `torch.tensor(...)` / `register_buffer(...)` / `None` handling for coefficient-like fields.
- `LossModule._forward_value_estimator_keys(...)`
  - Forwards the common value-estimator keys present on `tensor_keys`.
  - Refreshes `in_keys` via `_set_in_keys()` when available.

Migrated callsites
- `A2CLoss`: `entropy_coeff`, `critic_coeff`, and `clip_value` now use `register_coeff_buffer`.
- `PPOLoss`: `entropy_coeff`, optional `critic_coeff`, and `clip_value` now use `register_coeff_buffer`.
- `IQLLoss`: uses the default `_forward_value_estimator_keys` implementation instead of carrying an identical local override.

Losses with non-standard value-estimator key remapping, estimator-specific keys, or other custom behavior keep their explicit overrides.

Tests
- Tests for `register_coeff_buffer`, including `None`, dtype preservation, non-scalar rejection, and bool rejection.
- Tests for the default `_forward_value_estimator_keys` behavior and `set_keys` propagation.
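The behaviors named in the Tests list for register_coeff_buffer (None handling, dtype preservation, non-scalar rejection, bool rejection) can be illustrated with a small stand-alone module. This is a sketch, not the merged implementation: `SketchModule` is hypothetical, the exception types are guesses, and "non-scalar" is assumed here to mean any value with more than one element.

```python
import torch
from torch import nn


class SketchModule(nn.Module):
    """Hypothetical module showing one way such a helper could behave."""

    def register_coeff_buffer(self, name, value, *, device=None, dtype=None):
        # Optional coefficient: store None as a plain attribute, no buffer.
        if value is None:
            setattr(self, name, None)
            return
        # Assumed: Python bools are rejected rather than silently cast.
        if isinstance(value, bool):
            raise TypeError(f"{name} must be a number or tensor, got bool")
        if isinstance(value, torch.Tensor):
            tensor = value  # keeps the caller's dtype unless one is requested
        else:
            tensor = torch.tensor(value)
        if device is not None:
            tensor = tensor.to(device)
        if dtype is not None:
            tensor = tensor.to(dtype)
        # Assumed: anything with more than one element is not a coefficient.
        if tensor.numel() != 1:
            raise ValueError(f"{name} must be a scalar, got shape {tuple(tensor.shape)}")
        self.register_buffer(name, tensor)


module = SketchModule()
module.register_coeff_buffer("entropy_coeff", 0.01)
module.register_coeff_buffer("clip_value", None)
module.register_coeff_buffer("critic_coeff", torch.tensor(1.0, dtype=torch.float64))
print(sorted(name for name, _ in module.named_buffers()))
```

Registering the coefficient as a buffer means it follows the module through `.to(device)` calls and appears in `state_dict()`, while a `None` coefficient stays a plain attribute.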