[Feature] Add process inference server control plane#3896

Draft
vmoens wants to merge 2 commits into gh/vmoens/291/base from gh/vmoens/291/head
[ghstack-poisoned]
pytorch-bot Bot commented Jun 21, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3896

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 12 New Failures, 1 Cancelled Job

As of commit 53b0f0a with merge base d7ef78b:

NEW FAILURES - The following jobs have failed:

CANCELLED JOB - The following job was cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions Bot commented Jun 21, 2026

Contributor

Benchmark Results: PR 53b0f0ad vs main 93f02659

Benchmark run: https://github.com/pytorch/rl/actions/runs/27939185249

Higher ops/sec is better. Tables are sorted by largest absolute change.

CPU

Compared 187 benchmarks. Regressions over 5%: 5. Improvements over 5%: 13.
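
For readers who want to reproduce the Change column and the "over 5%" counts, here is a minimal sketch. It assumes the change is the relative difference of the PR value against main, and that "over 5%" means strictly greater than 5 in absolute value; the benchmark workflow's actual script is not shown in this thread, so the function names below are hypothetical. Because the table prints rounded ops/sec, recomputed percentages can differ from the printed ones in the last digit.

```python
from typing import List, Tuple

# Assumption: "over 5%" is a strict inequality on the relative change.
THRESHOLD_PCT = 5.0


def percent_change(main_ops: float, pr_ops: float) -> float:
    """Relative change of the PR throughput against main, in percent."""
    return (pr_ops - main_ops) / main_ops * 100.0


def classify(change_pct: float, threshold: float = THRESHOLD_PCT) -> str:
    """Higher ops/sec is better, so a positive change is an improvement."""
    if change_pct > threshold:
        return "improvement"
    if change_pct < -threshold:
        return "regression"
    return "neutral"


def summarize(rows: List[Tuple[str, float, float]]) -> List[Tuple[str, float]]:
    """Return (name, change %) pairs sorted by largest absolute change."""
    changes = [(name, percent_change(main, pr)) for name, main, pr in rows]
    changes.sort(key=lambda item: abs(item[1]), reverse=True)
    return changes


# Three rows copied from the CPU table in this comment.
ROWS = [
    ("benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-backward]", 114.00, 118.91),
    ("benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400]", 768.79, 693.51),
    ("benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-backward]", 349.86, 418.34),
]

if __name__ == "__main__":
    for name, change in summarize(ROWS):
        print(f"{name} {change:+.2f}% {classify(change)}")
```

Under these assumptions, the rows visible in the CPU table are consistent with the reported counts of 5 regressions and 13 improvements.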

Benchmark    main (ops/sec)    PR (ops/sec)    Change
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 36.01 53.51 +48.61%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-backward] 349.86 418.34 +19.57%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3,183 3,592 +12.84%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2,641 2,963 +12.20%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape1-atari] 4,878 5,470 +12.14%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-backward] 230.27 254.08 +10.34%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 768.79 693.51 -9.79%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[True-None] 107.57 117.59 +9.32%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3,425 3,111 -9.17%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2,198 2,003 -8.86%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2,739 2,929 +6.92%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[True-backward] 131.12 139.88 +6.68%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[reduce-overhead-None] 269.48 286.53 +6.32%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 2,320 2,176 -6.19%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[50-img_shape0-small] 4,481 4,234 -5.53%
benchmarks/test_envs_benchmark.py::test_simple 1.7027 1.7952 +5.44%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-False] 48,651 51,201 +5.24%
benchmarks/test_collectors_benchmark.py::test_async 17.44 18.35 +5.21%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2,963 2,818 -4.91%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-True] 32,227 33,801 +4.88%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 758.45 721.62 -4.85%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-False] 43,366 45,387 +4.66%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-gru] 2.9645 3.0995 +4.56%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-backward] 114.00 118.91 +4.31%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[reduce-overhead-None] 471.73 490.52 +3.98%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-False] 48,419 50,342 +3.97%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1,875 1,948 +3.88%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 670.44 644.55 -3.86%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-None] 467.70 484.82 +3.66%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[pickle] 12,244 11,799 -3.64%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[50-img_shape0-small] 7,112 7,361 +3.50%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1,902 1,838 -3.40%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-False] 26,784 27,691 +3.39%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-True] 29,506 30,494 +3.35%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[True-backward] 57.95 59.78 +3.16%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-True-0-gru] 1.4794 1.4347 -3.02%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-True] 36,934 38,013 +2.92%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-single-True] 1.3598 1.3205 -2.89%
benchmarks/test_envs_benchmark.py::test_transformed 0.8865 0.9113 +2.79%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[torch.save] 7,162 6,966 -2.73%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-backward] 60.18 58.58 -2.66%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2,740 2,669 -2.62%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape1-atari] 274.88 281.99 +2.59%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[reduce-overhead-None] 114.06 116.96 +2.54%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[200-img_shape3-large_batch] 328.69 337.02 +2.53%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[True-backward] 122.93 126.01 +2.50%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 721.64 703.63 -2.50%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-None] 1,771 1,815 +2.47%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 0.6044 0.6193 +2.47%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-False] 62,963 64,481 +2.41%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sampler_sample_scale[10000000-cpu] 51.59 52.82 +2.40%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[200-img_shape3-large_batch] 766.64 748.33 -2.39%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-True] 34,062 34,847 +2.30%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-backward] 973.68 995.54 +2.24%
benchmarks/test_objectives_benchmarks.py::test_values[td0_return_estimate-False-False] 7,707 7,877 +2.21%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-True] 18,466 18,868 +2.17%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-False] 63,668 65,043 +2.16%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[True-None] 278.67 284.68 +2.15%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-lstm] 2.0176 2.0596 +2.08%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-True] 28,213 28,790 +2.05%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-single-False] 1.5861 1.6183 +2.03%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[200-img_shape3-large_batch] 307.78 313.75 +1.94%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3,097 3,039 -1.88%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-False] 76,558 77,971 +1.85%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sample_mixed_devices[1000000-memmap_cpu_storage_cpu... 81.68 83.15 +1.80%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape1-atari] 644.84 656.40 +1.79%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 194.17 197.61 +1.77%
benchmarks/test_envs_benchmark.py::test_parallel 0.9736 0.9567 -1.74%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3,115 3,168 +1.68%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 512.63 521.15 +1.66%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 186.86 189.94 +1.64%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-True] 19,384 19,687 +1.57%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 0.6815 0.6920 +1.54%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-None] 554.56 563.08 +1.54%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 2,205 2,238 +1.52%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 23.47 23.82 +1.51%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[False-None] 50.20 49.45 -1.49%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[reduce-overhead-None] 710.08 720.67 +1.49%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape1-atari] 702.72 692.51 -1.45%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[reduce-overhead-None] 570.59 578.79 +1.44%
benchmarks/test_collectors_benchmark.py::test_sync 16.78 16.54 -1.42%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-False] 29,054 29,463 +1.41%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[reduce-overhead-None] 83.79 84.96 +1.41%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-True] 41,906 42,484 +1.38%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[100-img_shape0-atari] 29.52 29.92 +1.34%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-False] 31,645 32,063 +1.32%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 25.16 25.48 +1.28%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 165.95 168.05 +1.27%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[50-img_shape0-small] 3,507 3,551 +1.26%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-False] 34,590 35,020 +1.24%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-None] 700.28 708.96 +1.24%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb[100-img_shape0-atari] 25.82 26.13 +1.22%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[False-None] 38.54 38.07 -1.22%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[safetensors] 23,268 22,986 -1.21%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-False] 54,697 55,343 +1.18%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-backward] 281.76 285.07 +1.18%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 0.5122 0.5180 +1.13%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-True] 37,632 38,052 +1.12%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[numpy] 361,414 357,403 -1.11%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[False-None] 159.96 161.70 +1.09%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-True] 20,551 20,761 +1.02%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[200-img_shape1-large_batch] 14.99 15.14 +1.00%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 0.5916 0.5974 +0.99%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[True-backward] 114.97 113.84 -0.98%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 29.32 29.04 -0.97%
benchmarks/test_objectives_benchmarks.py::test_values[td_lambda_return_estimate-True-False] 24.20 24.43 +0.96%
benchmarks/test_objectives_benchmarks.py::test_redq_speed[True-None] 226.74 228.91 +0.96%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 682.68 676.24 -0.94%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 0.2118 0.2098 -0.94%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sampler_sample_scale[1000000-cpu] 97.06 97.97 +0.93%
benchmarks/test_objectives_benchmarks.py::test_values[vec_td_lambda_return_estimate-True-False] 54.43 54.93 +0.93%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-True] 30,242 30,515 +0.90%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-backward] 132.52 131.33 -0.90%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[generalized_advantage_estimate-False-1-512] 105.39 106.31 +0.87%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 0.5929 0.5980 +0.87%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-False] 56,412 56,897 +0.86%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 48.19 48.59 +0.82%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[False-backward] 91.06 90.31 -0.81%
benchmarks/test_objectives_benchmarks.py::test_redq_speed[False-None] 94.92 94.16 -0.80%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape2-large_img] 572.04 567.51 -0.79%
... ... ... Showing 120 of 187 comparisons, sorted by absolute change.

GPU

Compared 197 benchmarks. Regressions over 5%: 11. Improvements over 5%: 20.

Benchmark    main (ops/sec)    PR (ops/sec)    Change
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 42.64 195.34 +358.08%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 194.75 38.94 -80.00%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[reduce-overhead-None] 82.91 106.42 +28.35%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1,805 2,279 +26.31%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3,681 2,895 -21.37%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1,862 2,256 +21.15%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1,992 2,373 +19.14%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 2,010 2,383 +18.56%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2,759 3,194 +15.78%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3,055 3,503 +14.67%
benchmarks/test_collectors_benchmark.py::test_single_with_rb_pixels 5.3673 4.6908 -12.60%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 3,275 2,871 -12.33%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3,317 3,724 +12.27%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 820.74 726.06 -11.54%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 512.13 453.37 -11.47%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3,305 3,635 +9.98%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 815.01 743.85 -8.73%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape2-large_img] 600.18 549.25 -8.49%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 495.90 536.91 +8.27%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[untyped_storage] 8.2437 8.9137 +8.13%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[True-backward] 334.15 358.87 +7.40%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[True-backward] 347.47 372.91 +7.32%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-backward] 977.20 907.06 -7.18%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[safetensors] 22,270 23,826 +6.99%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 746.01 796.52 +6.77%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2,616 2,778 +6.18%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape2-large_img] 382.12 405.69 +6.17%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-backward] 314.69 333.82 +6.08%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1,027 968.56 -5.73%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[200-img_shape3-large_batch] 815.63 769.17 -5.70%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[200-img_shape3-large_batch] 133.70 140.76 +5.28%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[True-backward] 240.89 252.74 +4.92%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape2-large_img] 405.63 425.11 +4.80%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[numpy] 361,387 375,991 +4.04%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 22.91 23.81 +3.93%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape2-large_img] 169.74 176.30 +3.87%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-True] 17,425 18,087 +3.80%
benchmarks/test_envs_benchmark.py::test_simple 1.2538 1.2062 -3.79%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[False-backward] 128.41 133.25 +3.77%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 21.60 22.40 +3.69%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sample_mixed_devices[1000000-cuda_storage_cpu_sampler] 85.35 88.44 +3.62%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape1-atari] 276.36 266.81 -3.45%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-backward] 220.20 227.52 +3.32%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-backward] 270.00 278.90 +3.29%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape1-atari] 4,039 3,913 -3.13%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1,035 1,003 -3.06%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-True] 23,831 23,134 -2.93%
benchmarks/test_collectors_benchmark.py::test_single 6.7586 6.5624 -2.90%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[50-img_shape0-small] 4,376 4,252 -2.85%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-backward] 392.89 404.07 +2.85%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 159.42 163.76 +2.72%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape1-atari] 688.98 707.56 +2.70%
benchmarks/test_objectives_benchmarks.py::test_values[vec_generalized_advantage_estimate-True-True] 298.47 306.23 +2.60%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 162.86 166.87 +2.46%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-backward] 360.37 369.17 +2.44%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 164.38 168.16 +2.30%
benchmarks/test_objectives_benchmarks.py::test_values[td1_return_estimate-False-False] 20.49 20.03 -2.26%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[reduce-overhead-None] 796.60 814.53 +2.25%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape1-atari] 638.33 652.47 +2.21%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 184.23 188.30 +2.21%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-True] 37,816 37,023 -2.10%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-False] 48,354 49,363 +2.09%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[reduce-overhead-None] 1,905 1,866 -2.05%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-False] 63,707 62,405 -2.04%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 52.13 53.16 +1.98%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-True] 20,279 19,879 -1.97%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-False] 42,411 41,576 -1.97%
benchmarks/test_envs_benchmark.py::test_parallel 0.5525 0.5419 -1.93%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-None] 820.73 836.50 +1.92%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-True] 32,627 32,008 -1.90%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[False-None] 53.11 54.12 +1.89%
benchmarks/test_envs_benchmark.py::test_transformed 0.7033 0.7163 +1.85%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-False] 34,368 33,735 -1.84%
benchmarks/test_objectives_benchmarks.py::test_values[td_lambda_return_estimate-True-False] 12.40 12.18 -1.81%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-True] 18,037 18,360 +1.79%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb_cuda[200-img_shape1-large_batch] 8.5477 8.4006 -1.72%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 159.29 162.03 +1.72%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-backward] 474.22 482.35 +1.71%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[200-img_shape3-large_batch] 305.76 311.00 +1.71%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[reduce-overhead-None] 43.68 42.95 -1.69%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb_cuda[200-img_shape1-large_batch] 8.8261 8.6781 -1.68%
benchmarks/test_objectives_benchmarks.py::test_values[generalized_advantage_estimate-True-True] 48.61 47.80 -1.67%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 158.40 161.03 +1.66%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb_cuda[100-img_shape0-atari] 16.98 16.70 -1.63%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 163.84 166.51 +1.63%
benchmarks/test_objectives_benchmarks.py::test_values[td0_return_estimate-False-False] 11,793 11,602 -1.62%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 0.5986 0.6082 +1.61%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[50-img_shape0-small] 6,008 6,102 +1.56%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-None] 1,921 1,950 +1.53%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 52.84 53.64 +1.52%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-False] 76,889 75,758 -1.47%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[True-None] 512.27 519.75 +1.46%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 48.87 49.58 +1.46%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[50-img_shape0-small] 3,534 3,483 -1.43%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-False] 53,565 54,322 +1.41%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 22.54 22.86 +1.41%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1,280 1,263 -1.40%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[reduce-overhead-None] 88.50 89.72 +1.38%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb_cuda[100-img_shape0-atari] 17.77 17.53 -1.36%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-False] 63,024 62,174 -1.35%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[False-backward] 71.16 72.10 +1.32%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-lstm] 21.99 21.71 -1.31%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 0.5295 0.5363 +1.30%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-True] 33,925 34,361 +1.28%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-True] 21,781 21,512 -1.24%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-True-0-gru] 47.68 48.26 +1.22%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-None] 732.17 741.07 +1.22%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[reduce-overhead-None] 858.36 868.74 +1.21%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-True] 27,795 28,122 +1.18%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[200-img_shape1-large_batch] 15.33 15.15 -1.17%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-None] 716.48 724.49 +1.12%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[reduce-overhead-None] 827.30 836.29 +1.09%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[False-backward] 40.41 40.84 +1.08%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-True] 20,348 20,567 +1.07%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-True] 30,574 30,249 -1.06%
benchmarks/test_collectors_benchmark.py::test_async_pixels 10.76 10.65 -1.06%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[False-backward] 460.12 455.25 -1.06%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 1,321 1,335 +1.06%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-False] 37,681 37,287 -1.04%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-None] 398.43 394.32 -1.03%
... ... ... Showing 120 of 197 comparisons, sorted by absolute change.

[ghstack-poisoned]
Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Feature New feature Integrations/torch_geometric Integrations Modules

1 participant