[Refactor] Add structured inference server config objects #3893

Draft
vmoens wants to merge 2 commits into gh/vmoens/288/base from gh/vmoens/288/head

Conversation

[ghstack-poisoned]
pytorch-bot Bot commented Jun 21, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3893

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 12 New Failures, 1 Cancelled Job

As of commit 804b9e1 with merge base d7ef78b:

NEW FAILURES - The following jobs have failed:

CANCELLED JOB - The following job was cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions Bot commented Jun 21, 2026

Contributor

Benchmark Results: PR 804b9e19 vs main 7fef8524

Benchmark run: https://github.com/pytorch/rl/actions/runs/27939185334

Higher ops/sec is better. Tables are sorted by largest absolute change.
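
For readers who want to check the numbers, here is a minimal sketch of how the Change column, the sort order, and the "over 5%" counts could be derived. It assumes the change is the relative difference of PR ops/sec against main ops/sec; the `Comparison` class and `summarize` helper are hypothetical names for illustration, not part of the benchmark tooling. The three sample rows use rounded values from the CPU table with shortened names, so the recomputed percentages can differ from the reported ones in the last digits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comparison:
    """One benchmark measured on main and on the PR (hypothetical helper)."""

    name: str
    main_ops: float  # ops/sec on main
    pr_ops: float    # ops/sec on the PR

    @property
    def change_pct(self) -> float:
        # Assumed definition: relative change of the PR against main, in percent.
        return (self.pr_ops - self.main_ops) / self.main_ops * 100.0


def summarize(rows, threshold_pct=5.0):
    """Sort by largest absolute change and count rows beyond the threshold."""
    ordered = sorted(rows, key=lambda r: abs(r.change_pct), reverse=True)
    regressions = sum(r.change_pct < -threshold_pct for r in rows)
    improvements = sum(r.change_pct > threshold_pct for r in rows)
    return ordered, regressions, improvements


# Rounded values copied from three CPU rows; names shortened for readability.
rows = [
    Comparison("test_simple", 1.6590, 1.7998),
    Comparison("test_rb_populate (LazyTensorStorage, SamplerWithoutReplacement)", 1099.0, 937.25),
    Comparison("test_sync", 16.79, 16.54),
]

ordered, regressions, improvements = summarize(rows)
for row in ordered:
    print(f"{row.name}: {row.change_pct:+.2f}%")
print(f"Regressions over 5%: {regressions}. Improvements over 5%: {improvements}.")
```

Since higher ops/sec is better, a negative change counts as a regression and a positive one as an improvement; sorting on the absolute value is what interleaves the two in the tables.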

CPU

Compared 187 benchmarks. Regressions over 5%: 8. Improvements over 5%: 28.

Benchmark    main (ops/sec)    PR (ops/sec)    Change
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 35.98 51.57 +43.34%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 2,674 3,621 +35.38%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2,577 3,349 +29.98%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2,565 3,320 +29.44%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 2,827 3,634 +28.54%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2,763 3,437 +24.40%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-True] 17,215 20,435 +18.70%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-backward] 836.90 986.01 +17.82%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1,891 2,178 +15.20%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 469.05 539.57 +15.04%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1,099 937.25 -14.68%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[True-backward] 109.94 125.74 +14.37%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-None] 1,569 1,792 +14.24%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 2,066 2,312 +11.89%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 2,814 3,119 +10.82%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape1-atari] 5,105 4,576 -10.36%
benchmarks/test_envs_benchmark.py::test_simple 1.6590 1.7998 +8.49%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 713.78 770.50 +7.95%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[reduce-overhead-None] 108.00 116.52 +7.89%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 900.59 832.08 -7.61%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-True] 35,159 37,788 +7.48%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape2-large_img] 406.78 435.85 +7.15%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-None] 273.43 292.68 +7.04%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 484.56 518.41 +6.99%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[pickle] 11,649 12,451 +6.88%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[reduce-overhead-None] 705.25 659.42 -6.50%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-True-0-gru] 1.4916 1.3965 -6.38%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-True] 35,721 37,925 +6.17%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-False] 41,744 44,304 +6.13%
benchmarks/test_objectives_benchmarks.py::test_redq_speed[reduce-overhead-None] 219.79 232.81 +5.93%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 552.65 520.26 -5.86%
benchmarks/test_objectives_benchmarks.py::test_redq_speed[True-None] 215.53 228.02 +5.79%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-True] 40,323 42,626 +5.71%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 2,196 2,072 -5.66%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[safetensors] 22,867 24,118 +5.47%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-False-0-gru] 1.3772 1.3056 -5.20%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape2-large_img] 178.61 169.75 -4.96%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape2-large_img] 390.86 410.01 +4.90%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-False] 73,487 76,963 +4.73%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[reduce-overhead-None] 544.55 570.15 +4.70%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 2,922 3,059 +4.68%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 781.17 816.58 +4.53%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-backward] 115.22 120.43 +4.53%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-False] 61,063 63,712 +4.34%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-True] 18,661 17,853 -4.33%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[50-img_shape0-small] 4,235 4,418 +4.32%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-False] 40,477 42,219 +4.30%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-False] 36,975 38,436 +3.95%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2,026 1,947 -3.90%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[True-backward] 135.12 140.27 +3.81%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-False] 32,052 30,931 -3.50%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[torch.save] 7,023 7,264 +3.44%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-True] 22,966 23,755 +3.44%
benchmarks/test_envs_benchmark.py::test_serial 0.5596 0.5787 +3.42%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-False] 33,063 34,192 +3.41%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[reduce-overhead-None] 1,782 1,842 +3.40%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-True] 35,487 34,302 -3.34%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-False] 27,795 26,899 -3.22%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[numpy] 372,555 361,097 -3.08%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[False-None] 708.86 687.42 -3.03%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 647.98 666.92 +2.92%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-True-0-lstm] 0.9675 0.9395 -2.89%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[False-None] 91.25 88.62 -2.88%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-None] 541.31 556.33 +2.78%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-False-0-lstm] 0.8721 0.8480 -2.77%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape1-atari] 685.05 703.90 +2.75%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[50-img_shape0-small] 3,435 3,530 +2.75%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-None] 685.00 703.20 +2.66%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape1-atari] 642.54 658.40 +2.47%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[untyped_storage] 8.1328 7.9320 -2.47%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[False-backward] 63.67 62.10 -2.47%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2,714 2,779 +2.41%
benchmarks/test_envs_benchmark.py::test_transformed 0.8727 0.8937 +2.41%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-True] 30,944 30,199 -2.40%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[reduce-overhead-None] 280.01 286.73 +2.40%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[100-img_shape0-atari] 29.36 30.05 +2.38%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 545.43 557.36 +2.19%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 164.61 168.18 +2.17%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-False] 62,189 63,501 +2.11%
benchmarks/test_objectives_benchmarks.py::test_redq_speed[False-backward] 54.32 53.19 -2.09%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 52.74 53.83 +2.07%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-True] 22,085 21,645 -1.99%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-True] 29,212 28,637 -1.97%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-True-0-gru] 4.2458 4.3293 +1.97%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[reduce-overhead-None] 474.70 483.37 +1.83%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sampler_sample_scale[10000000-cpu] 51.72 52.64 +1.78%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[200-img_shape3-large_batch] 774.10 787.30 +1.71%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-True] 18,297 18,602 +1.66%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 187.51 184.39 -1.66%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-None] 470.46 478.25 +1.66%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-False] 48,317 49,113 +1.65%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 24.91 24.51 -1.64%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[50-img_shape0-small] 7,187 7,303 +1.61%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[200-img_shape3-large_batch] 310.12 315.06 +1.59%
benchmarks/test_collectors_benchmark.py::test_sync 16.79 16.54 -1.53%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 161.22 163.64 +1.50%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape2-large_img] 568.72 577.08 +1.47%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-backward] 58.89 58.03 -1.46%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[50-img_shape0-small] 858.73 871.19 +1.45%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-False] 55,512 56,236 +1.31%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 51.70 52.37 +1.29%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-False] 28,622 28,992 +1.29%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[200-img_shape3-large_batch] 332.62 336.82 +1.26%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-None] 208.89 211.50 +1.25%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-backward] 246.42 249.48 +1.24%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 0.5923 0.5997 +1.24%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb[200-img_shape1-large_batch] 13.20 13.36 +1.24%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[False-backward] 240.66 243.62 +1.23%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-True] 30,519 30,156 -1.19%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-False] 31,741 31,364 -1.19%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-single-False] 1.5971 1.6159 +1.18%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[reduce-overhead-None] 259.99 263.03 +1.17%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 23.10 23.37 +1.17%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb[100-img_shape0-atari] 25.92 26.21 +1.13%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-True] 20,576 20,807 +1.13%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 168.14 170.02 +1.12%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-True] 32,506 32,149 -1.10%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[generalized_advantage_estimate-False-1-512] 105.93 104.77 -1.09%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[False-None] 38.30 37.89 -1.09%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-backward] 407.16 411.57 +1.08%
Showing 120 of 187 comparisons, sorted by absolute change.

GPU

Compared 197 benchmarks. Regressions over 5%: 30. Improvements over 5%: 17.

Benchmark    main (ops/sec)    PR (ops/sec)    Change
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 46.67 192.24 +311.92%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 194.18 48.90 -74.82%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1,808 2,236 +23.69%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3,624 3,018 -16.72%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 2,956 3,432 +16.12%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3,123 3,625 +16.10%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2,686 3,116 +15.97%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 395.51 455.55 +15.18%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 2,040 2,347 +15.04%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 2,179 1,868 -14.28%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 2,781 3,161 +13.68%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 758.22 848.41 +11.89%
benchmarks/test_collectors_benchmark.py::test_single_with_rb_pixels 5.3468 4.7139 -11.84%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[reduce-overhead-None] 118.88 105.46 -11.29%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-True] 30,378 27,267 -10.24%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3,188 2,879 -9.70%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-False] 31,472 28,520 -9.38%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[untyped_storage] 7.8355 8.5624 +9.28%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-False] 57,293 52,044 -9.16%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-True] 30,535 27,813 -8.91%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-False] 27,482 25,115 -8.61%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-True] 28,449 26,095 -8.27%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-True] 32,649 29,960 -8.23%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-backward] 371.48 399.96 +7.67%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-True] 19,452 17,999 -7.47%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 483.26 518.73 +7.34%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-True] 20,722 19,231 -7.19%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-False] 34,090 31,656 -7.14%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-True] 42,004 39,014 -7.12%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-False] 44,506 41,338 -7.12%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-False] 37,979 35,354 -6.91%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 46.81 50.04 +6.90%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-False] 64,005 59,625 -6.84%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 747.05 797.53 +6.76%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[safetensors] 22,543 24,041 +6.65%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-False] 34,214 32,054 -6.31%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-False] 48,966 45,887 -6.29%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-False] 49,670 46,583 -6.21%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-True] 18,553 17,404 -6.20%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2,606 2,762 +6.02%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-True] 37,630 35,414 -5.89%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-False] 54,149 50,996 -5.82%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-True] 17,934 16,894 -5.80%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 523.46 493.94 -5.64%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-False] 42,230 39,904 -5.51%
benchmarks/test_objectives_benchmarks.py::test_values[td1_return_estimate-False-False] 20.32 19.24 -5.32%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[True-backward] 331.42 348.21 +5.07%
benchmarks/test_envs_benchmark.py::test_transformed 0.6731 0.7065 +4.97%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-False] 28,603 27,188 -4.95%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape1-atari] 274.53 261.65 -4.69%
benchmarks/test_objectives_benchmarks.py::test_values[td_lambda_return_estimate-True-False] 12.26 11.69 -4.68%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-True] 20,565 19,606 -4.66%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-True] 34,375 32,777 -4.65%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-True] 37,167 35,471 -4.56%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 23.02 24.04 +4.44%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-None] 735.89 703.70 -4.37%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-False] 31,530 30,158 -4.35%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 51.60 53.81 +4.29%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[pickle] 11,706 12,201 +4.23%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-backward] 330.93 317.37 -4.10%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[50-img_shape0-small] 6,088 5,840 -4.07%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-backward] 273.13 284.25 +4.07%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-True] 19,253 18,471 -4.06%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-True] 23,043 22,108 -4.06%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape2-large_img] 424.11 406.96 -4.04%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 1,355 1,302 -3.90%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[reduce-overhead-None] 42.12 43.75 +3.85%
benchmarks/test_objectives_benchmarks.py::test_values[generalized_advantage_estimate-True-True] 48.41 46.61 -3.71%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-False] 75,270 72,557 -3.60%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 185.86 179.39 -3.48%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[True-backward] 238.38 246.60 +3.45%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[reduce-overhead-None] 108.60 104.90 -3.40%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 1,321 1,277 -3.38%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb[200-img_shape1-large_batch] 13.39 12.94 -3.35%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sampler_sample_scale[1000000-cuda] 2,181 2,251 +3.19%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-True] 21,429 20,745 -3.19%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 22.57 23.29 +3.18%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape1-atari] 686.22 707.40 +3.09%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape2-large_img] 175.35 170.00 -3.05%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[200-img_shape1-large_batch] 15.36 14.91 -2.92%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape1-atari] 630.00 648.40 +2.92%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sample_mixed_devices[1000000-memmap_cpu_storage_cud... 970.66 998.82 +2.90%
benchmarks/test_objectives_benchmarks.py::test_values[vec_generalized_advantage_estimate-True-True] 301.62 309.86 +2.73%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 21.60 22.19 +2.71%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape2-large_img] 552.16 537.30 -2.69%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb_cuda[100-img_shape0-atari] 16.69 16.25 -2.64%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-backward] 903.93 880.27 -2.62%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[True-backward] 352.10 361.24 +2.60%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-None] 1,932 1,883 -2.53%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 53.24 54.55 +2.47%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape1-atari] 4,270 4,166 -2.43%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[50-img_shape0-small] 4,158 4,259 +2.43%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape2-large_img] 402.26 392.54 -2.42%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb[100-img_shape0-atari] 26.19 25.56 -2.42%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[False-None] 113.50 110.87 -2.32%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[200-img_shape3-large_batch] 752.11 735.19 -2.25%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[numpy] 364,365 372,541 +2.24%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-backward] 220.48 225.40 +2.23%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1,279 1,251 -2.23%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2,160 2,207 +2.17%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3,331 3,403 +2.16%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb_cuda[200-img_shape1-large_batch] 8.3259 8.1460 -2.16%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[False-backward] 70.79 69.28 -2.12%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-None] 747.89 732.92 -2.00%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb_cuda[200-img_shape1-large_batch] 8.7432 8.5693 -1.99%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[generalized_advantage_estimate-False-1-512] 48.23 47.28 -1.98%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[False-backward] 84.37 82.76 -1.91%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[reduce-overhead-None] 130.63 128.15 -1.90%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[200-img_shape3-large_batch] 132.74 135.26 +1.90%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-False] 63,074 61,931 -1.81%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 0.5318 0.5223 -1.79%
benchmarks/test_objectives_benchmarks.py::test_values[vec_td_lambda_return_estimate-True-False] 859.23 843.92 -1.78%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 0.5154 0.5063 -1.77%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sample_mixed_devices[1000000-cuda_storage_cuda_samp... 1,474 1,500 +1.72%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[reduce-overhead-None] 855.68 841.50 -1.66%
benchmarks/test_objectives_benchmarks.py::test_values[td0_return_estimate-False-False] 11,689 11,876 +1.60%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[50-img_shape0-small] 3,481 3,425 -1.59%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 0.2241 0.2207 -1.53%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[False-None] 345.36 340.09 -1.53%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[100-img_shape0-atari] 30.37 29.93 -1.47%
Showing 120 of 197 comparisons, sorted by absolute change.

Collaborator Author

this is worth a paragraph in the doc somewhere

[ghstack-poisoned]
Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Collectors Documentation Improvements or additions to documentation Integrations/torch_geometric Integrations Modules Refactoring Refactoring of an existing feature

Projects

None yet

1 participant