[BugFix] Fix vLLM CompilationConfig compat and Windows CI pybind11 #3673

Merged
vmoens merged 1 commit into main from fix-for-12
Apr 27, 2026

@vmoens
Collaborator

@vmoens vmoens commented Apr 27, 2026

Summary

  • Fix vLLM >=0.18.0 compatibility: the level keyword was removed from CompilationConfig and replaced with mode. The fix uses pyvers.implement_for to select the right key based on the installed vLLM version.
  • Fix Windows CI build failure on release branches: cmake and pybind11[global] are now installed unconditionally (they were previously only installed when RELEASE==0, but torchrl's C++ extensions need them regardless).
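
The version-dependent keyword selection described in the first bullet can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the code from this PR: it does not use pyvers.implement_for, and the helper names `parse_version` and `compilation_kwargs` are hypothetical. It only assumes what the summary states, namely that `level` applies before vLLM 0.18.0 and `mode` from 0.18.0 onward.

```python
import re


def parse_version(version: str) -> tuple[int, ...]:
    """Return the leading numeric release parts, padded to three components.

    Pre-release suffixes are ignored in this sketch, so '0.18.0rc1' is
    treated like '0.18.0'.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = tuple(int(part) for part in match.group().split("."))
    return parts + (0,) * (3 - len(parts))


def compilation_kwargs(vllm_version: str, value: int) -> dict[str, int]:
    """Hypothetical helper: choose the CompilationConfig keyword by version."""
    # Per the PR summary, `level` was replaced by `mode` in vLLM >= 0.18.0.
    key = "mode" if parse_version(vllm_version) >= (0, 18, 0) else "level"
    return {key: value}


print(compilation_kwargs("0.17.1", 3))  # {'level': 3}
print(compilation_kwargs("0.18.0", 3))  # {'mode': 3}
```

The returned dictionary could then be unpacked into the config constructor. Whether a given integer carries the same meaning under both keywords is not stated in this PR and should be checked against the vLLM documentation.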

Test plan

  • vLLM async tests that were previously skipped (~115 tests) due to CompilationConfig validation error should now load successfully
  • Windows CI unittests-cpu job should no longer fail at pip install -e .

🤖 Generated with Claude Code

@pytorch-bot

pytorch-bot Bot commented Apr 27, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3673

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed label ("This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.") Apr 27, 2026
@github-actions github-actions Bot added the BugFix, CI (Has to do with CI setup, e.g. wheels & builds, tests...), Collectors, llm/ (LLM-related PR, triggers LLM CI tests), Modules, Record, Integrations/torch_geometric, and Integrations labels Apr 27, 2026
vLLM >=0.18.0 removed the `level` keyword from CompilationConfig
(replaced by `mode`). Use pyvers implement_for to select the right
key based on installed vLLM version.

Also install cmake and pybind11 unconditionally in the Windows CI
script so release-branch builds can compile TorchRL C++ extensions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@vmoens vmoens changed the base branch from release/0.12.0 to main April 27, 2026 11:59
@vmoens vmoens merged commit 4a976a0 into main Apr 27, 2026
92 of 96 checks passed
@vmoens vmoens deleted the fix-for-12 branch April 27, 2026 12:05
@github-actions
Contributor

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 172. Improved: $\large\color{#35bf28}15$. Worsened: $\large\color{#d91a1a}13$.

Name Max Mean Ops Ops on Repo HEAD Change
test_tensor_to_bytestream_speed[pickle] 82.0970μs 80.7073μs 12.3905 KOps/s 12.3388 KOps/s $\color{#35bf28}+0.42\%$
test_tensor_to_bytestream_speed[torch.save] 0.1424ms 0.1415ms 7.0686 KOps/s 7.0814 KOps/s $\color{#d91a1a}-0.18\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1105s 0.1101s 9.0855 Ops/s 9.5746 Ops/s $\textbf{\color{#d91a1a}-5.11\%}$
test_tensor_to_bytestream_speed[numpy] 2.5956μs 2.5917μs 385.8449 KOps/s 386.4880 KOps/s $\color{#d91a1a}-0.17\%$
test_tensor_to_bytestream_speed[safetensors] 38.8019μs 38.2812μs 26.1225 KOps/s 25.6584 KOps/s $\color{#35bf28}+1.81\%$
test_simple 0.7886s 0.7857s 1.2728 Ops/s 1.2349 Ops/s $\color{#35bf28}+3.07\%$
test_transformed 1.3859s 1.3815s 0.7238 Ops/s 0.7124 Ops/s $\color{#35bf28}+1.60\%$
test_serial 2.3166s 2.3109s 0.4327 Ops/s 0.4293 Ops/s $\color{#35bf28}+0.79\%$
test_parallel 1.9138s 1.8263s 0.5475 Ops/s 0.5611 Ops/s $\color{#d91a1a}-2.41\%$
test_step_mdp_speed[True-True-True-True-True] 0.2281ms 40.5122μs 24.6839 KOps/s 23.5973 KOps/s $\color{#35bf28}+4.60\%$
test_step_mdp_speed[True-True-True-True-False] 89.2440μs 22.9979μs 43.4821 KOps/s 42.7788 KOps/s $\color{#35bf28}+1.64\%$
test_step_mdp_speed[True-True-True-False-True] 49.7920μs 23.6182μs 42.3403 KOps/s 42.4327 KOps/s $\color{#d91a1a}-0.22\%$
test_step_mdp_speed[True-True-True-False-False] 35.9320μs 12.9313μs 77.3317 KOps/s 77.9088 KOps/s $\color{#d91a1a}-0.74\%$
test_step_mdp_speed[True-True-False-True-True] 74.1440μs 44.0569μs 22.6979 KOps/s 22.1821 KOps/s $\color{#35bf28}+2.33\%$
test_step_mdp_speed[True-True-False-True-False] 49.2030μs 25.4432μs 39.3033 KOps/s 39.0138 KOps/s $\color{#35bf28}+0.74\%$
test_step_mdp_speed[True-True-False-False-True] 92.0650μs 26.4352μs 37.8284 KOps/s 37.9880 KOps/s $\color{#d91a1a}-0.42\%$
test_step_mdp_speed[True-True-False-False-False] 45.9930μs 15.3472μs 65.1585 KOps/s 65.1400 KOps/s $\color{#35bf28}+0.03\%$
test_step_mdp_speed[True-False-True-True-True] 86.8350μs 47.4215μs 21.0875 KOps/s 21.3604 KOps/s $\color{#d91a1a}-1.28\%$
test_step_mdp_speed[True-False-True-True-False] 57.7230μs 28.2505μs 35.3976 KOps/s 35.2241 KOps/s $\color{#35bf28}+0.49\%$
test_step_mdp_speed[True-False-True-False-True] 52.6530μs 26.2219μs 38.1360 KOps/s 37.0341 KOps/s $\color{#35bf28}+2.98\%$
test_step_mdp_speed[True-False-True-False-False] 46.0720μs 15.4701μs 64.6408 KOps/s 64.3220 KOps/s $\color{#35bf28}+0.50\%$
test_step_mdp_speed[True-False-False-True-True] 82.0840μs 50.2857μs 19.8864 KOps/s 20.3990 KOps/s $\color{#d91a1a}-2.51\%$
test_step_mdp_speed[True-False-False-True-False] 61.8430μs 30.9501μs 32.3101 KOps/s 32.8884 KOps/s $\color{#d91a1a}-1.76\%$
test_step_mdp_speed[True-False-False-False-True] 60.3630μs 28.5952μs 34.9709 KOps/s 35.0161 KOps/s $\color{#d91a1a}-0.13\%$
test_step_mdp_speed[True-False-False-False-False] 47.4230μs 17.9352μs 55.7561 KOps/s 54.6769 KOps/s $\color{#35bf28}+1.97\%$
test_step_mdp_speed[False-True-True-True-True] 77.6440μs 47.2938μs 21.1444 KOps/s 21.3050 KOps/s $\color{#d91a1a}-0.75\%$
test_step_mdp_speed[False-True-True-True-False] 2.4937ms 28.3831μs 35.2322 KOps/s 35.3506 KOps/s $\color{#d91a1a}-0.33\%$
test_step_mdp_speed[False-True-True-False-True] 76.2740μs 29.7021μs 33.6677 KOps/s 34.2147 KOps/s $\color{#d91a1a}-1.60\%$
test_step_mdp_speed[False-True-True-False-False] 48.3920μs 17.1638μs 58.2621 KOps/s 57.9616 KOps/s $\color{#35bf28}+0.52\%$
test_step_mdp_speed[False-True-False-True-True] 3.7352ms 48.9605μs 20.4246 KOps/s 20.6358 KOps/s $\color{#d91a1a}-1.02\%$
test_step_mdp_speed[False-True-False-True-False] 61.2240μs 30.0321μs 33.2977 KOps/s 33.3578 KOps/s $\color{#d91a1a}-0.18\%$
test_step_mdp_speed[False-True-False-False-True] 0.4430ms 31.6039μs 31.6417 KOps/s 31.1502 KOps/s $\color{#35bf28}+1.58\%$
test_step_mdp_speed[False-True-False-False-False] 0.4436ms 19.3003μs 51.8126 KOps/s 52.7033 KOps/s $\color{#d91a1a}-1.69\%$
test_step_mdp_speed[False-False-True-True-True] 91.3040μs 51.5947μs 19.3818 KOps/s 19.3239 KOps/s $\color{#35bf28}+0.30\%$
test_step_mdp_speed[False-False-True-True-False] 0.4581ms 32.8698μs 30.4231 KOps/s 30.2445 KOps/s $\color{#35bf28}+0.59\%$
test_step_mdp_speed[False-False-True-False-True] 0.4616ms 31.4527μs 31.7937 KOps/s 30.7854 KOps/s $\color{#35bf28}+3.28\%$
test_step_mdp_speed[False-False-True-False-False] 0.4384ms 19.1677μs 52.1711 KOps/s 51.1117 KOps/s $\color{#35bf28}+2.07\%$
test_step_mdp_speed[False-False-False-True-True] 89.0740μs 53.0135μs 18.8631 KOps/s 18.3820 KOps/s $\color{#35bf28}+2.62\%$
test_step_mdp_speed[False-False-False-True-False] 0.4619ms 35.2075μs 28.4030 KOps/s 27.4987 KOps/s $\color{#35bf28}+3.29\%$
test_step_mdp_speed[False-False-False-False-True] 0.4578ms 33.5965μs 29.7650 KOps/s 29.4479 KOps/s $\color{#35bf28}+1.08\%$
test_step_mdp_speed[False-False-False-False-False] 0.4582ms 21.7817μs 45.9100 KOps/s 45.2121 KOps/s $\color{#35bf28}+1.54\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.8489s 0.7462s 1.3402 Ops/s 1.3409 Ops/s $\color{#d91a1a}-0.06\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7178s 0.6109s 1.6368 Ops/s 1.6411 Ops/s $\color{#d91a1a}-0.26\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7358s 1.6495s 0.6062 Ops/s 0.5984 Ops/s $\color{#35bf28}+1.31\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5118s 1.4260s 0.7013 Ops/s 0.7047 Ops/s $\color{#d91a1a}-0.49\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 1.9748s 1.9086s 0.5239 Ops/s 0.5287 Ops/s $\color{#d91a1a}-0.91\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7647s 1.6808s 0.5950 Ops/s 0.5963 Ops/s $\color{#d91a1a}-0.22\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.7205s 4.6460s 0.2152 Ops/s 0.2172 Ops/s $\color{#d91a1a}-0.90\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.5963s 4.4699s 0.2237 Ops/s 0.2259 Ops/s $\color{#d91a1a}-0.95\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 1.9524s 1.8740s 0.5336 Ops/s 0.5289 Ops/s $\color{#35bf28}+0.89\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.7167s 1.6118s 0.6204 Ops/s 0.6274 Ops/s $\color{#d91a1a}-1.11\%$
test_values[generalized_advantage_estimate-True-True] 20.5931ms 20.1996ms 49.5059 Ops/s 50.2810 Ops/s $\color{#d91a1a}-1.54\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1465s 3.8475ms 259.9109 Ops/s 284.4241 Ops/s $\textbf{\color{#d91a1a}-8.62\%}$
test_values[td0_return_estimate-False-False] 0.1044ms 82.9568μs 12.0545 KOps/s 12.1033 KOps/s $\color{#d91a1a}-0.40\%$
test_values[td1_return_estimate-False-False] 51.8291ms 49.3894ms 20.2472 Ops/s 21.1359 Ops/s $\color{#d91a1a}-4.20\%$
test_values[vec_td1_return_estimate-False-False] 1.3209ms 1.0899ms 917.5515 Ops/s 927.4081 Ops/s $\color{#d91a1a}-1.06\%$
test_values[td_lambda_return_estimate-True-False] 85.0968ms 83.1667ms 12.0240 Ops/s 13.0194 Ops/s $\textbf{\color{#d91a1a}-7.65\%}$
test_values[vec_td_lambda_return_estimate-True-False] 1.3057ms 1.0915ms 916.2110 Ops/s 933.9399 Ops/s $\color{#d91a1a}-1.90\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 22.1607ms 21.3285ms 46.8856 Ops/s 46.7647 Ops/s $\color{#35bf28}+0.26\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0356ms 0.7617ms 1.3128 KOps/s 1.3335 KOps/s $\color{#d91a1a}-1.55\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.9276ms 0.6901ms 1.4490 KOps/s 1.5011 KOps/s $\color{#d91a1a}-3.47\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5309ms 1.4849ms 673.4516 Ops/s 674.7496 Ops/s $\color{#d91a1a}-0.19\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7723ms 0.7245ms 1.3802 KOps/s 1.4689 KOps/s $\textbf{\color{#d91a1a}-6.04\%}$
test_dqn_speed[False-None] 1.7495ms 1.5938ms 627.4196 Ops/s 627.2225 Ops/s $\color{#35bf28}+0.03\%$
test_dqn_speed[False-backward] 2.2684ms 2.2273ms 448.9688 Ops/s 449.8450 Ops/s $\color{#d91a1a}-0.19\%$
test_dqn_speed[True-None] 0.9217ms 0.6401ms 1.5622 KOps/s 1.5988 KOps/s $\color{#d91a1a}-2.29\%$
test_dqn_speed[True-backward] 2.0668ms 1.2243ms 816.7633 Ops/s 779.6715 Ops/s $\color{#35bf28}+4.76\%$
test_dqn_speed[reduce-overhead-None] 0.6782ms 0.6231ms 1.6048 KOps/s 1.5783 KOps/s $\color{#35bf28}+1.68\%$
test_ddpg_speed[False-None] 3.3490ms 2.9858ms 334.9239 Ops/s 336.1010 Ops/s $\color{#d91a1a}-0.35\%$
test_ddpg_speed[False-backward] 4.5644ms 4.2270ms 236.5744 Ops/s 236.3471 Ops/s $\color{#35bf28}+0.10\%$
test_ddpg_speed[True-None] 1.5187ms 1.4298ms 699.3926 Ops/s 697.3808 Ops/s $\color{#35bf28}+0.29\%$
test_ddpg_speed[True-backward] 2.5668ms 2.4703ms 404.8040 Ops/s 404.1112 Ops/s $\color{#35bf28}+0.17\%$
test_ddpg_speed[reduce-overhead-None] 1.4918ms 1.4138ms 707.2920 Ops/s 712.3854 Ops/s $\color{#d91a1a}-0.71\%$
test_sac_speed[False-None] 8.8513ms 8.4435ms 118.4338 Ops/s 118.5848 Ops/s $\color{#d91a1a}-0.13\%$
test_sac_speed[False-backward] 11.8926ms 11.3424ms 88.1646 Ops/s 88.0878 Ops/s $\color{#35bf28}+0.09\%$
test_sac_speed[True-None] 2.3256ms 2.0498ms 487.8640 Ops/s 510.1116 Ops/s $\color{#d91a1a}-4.36\%$
test_sac_speed[True-backward] 3.6750ms 3.6160ms 276.5509 Ops/s 263.9930 Ops/s $\color{#35bf28}+4.76\%$
test_sac_speed[reduce-overhead-None] 16.5376ms 10.2529ms 97.5339 Ops/s 99.3157 Ops/s $\color{#d91a1a}-1.79\%$
test_redq_deprec_speed[False-None] 10.2966ms 9.4372ms 105.9632 Ops/s 106.3206 Ops/s $\color{#d91a1a}-0.34\%$
test_redq_deprec_speed[False-backward] 12.9532ms 12.4301ms 80.4501 Ops/s 78.5155 Ops/s $\color{#35bf28}+2.46\%$
test_redq_deprec_speed[True-None] 2.8780ms 2.7602ms 362.2965 Ops/s 349.3980 Ops/s $\color{#35bf28}+3.69\%$
test_redq_deprec_speed[True-backward] 4.9809ms 4.3231ms 231.3181 Ops/s 221.6821 Ops/s $\color{#35bf28}+4.35\%$
test_redq_deprec_speed[reduce-overhead-None] 14.9980ms 9.6699ms 103.4141 Ops/s 103.7897 Ops/s $\color{#d91a1a}-0.36\%$
test_td3_speed[False-None] 8.7172ms 8.3301ms 120.0472 Ops/s 120.4793 Ops/s $\color{#d91a1a}-0.36\%$
test_td3_speed[False-backward] 11.8870ms 10.8662ms 92.0282 Ops/s 92.1456 Ops/s $\color{#d91a1a}-0.13\%$
test_td3_speed[True-None] 1.7889ms 1.7453ms 572.9600 Ops/s 571.4550 Ops/s $\color{#35bf28}+0.26\%$
test_td3_speed[True-backward] 4.3767ms 3.3944ms 294.6008 Ops/s 300.3258 Ops/s $\color{#d91a1a}-1.91\%$
test_td3_speed[reduce-overhead-None] 0.1001s 26.4148ms 37.8576 Ops/s 38.2002 Ops/s $\color{#d91a1a}-0.90\%$
test_cql_speed[False-None] 18.1412ms 17.6526ms 56.6490 Ops/s 56.9921 Ops/s $\color{#d91a1a}-0.60\%$
test_cql_speed[False-backward] 23.5101ms 23.1127ms 43.2662 Ops/s 43.5832 Ops/s $\color{#d91a1a}-0.73\%$
test_cql_speed[True-None] 3.6416ms 3.5348ms 282.9000 Ops/s 283.0828 Ops/s $\color{#d91a1a}-0.06\%$
test_cql_speed[True-backward] 7.7772ms 6.0142ms 166.2728 Ops/s 174.7650 Ops/s $\color{#d91a1a}-4.86\%$
test_cql_speed[reduce-overhead-None] 0.8598s 17.8009ms 56.1769 Ops/s 83.3098 Ops/s $\textbf{\color{#d91a1a}-32.57\%}$
test_a2c_speed[False-None] 3.4122ms 3.3006ms 302.9722 Ops/s 303.5799 Ops/s $\color{#d91a1a}-0.20\%$
test_a2c_speed[False-backward] 6.8494ms 6.4088ms 156.0364 Ops/s 164.3789 Ops/s $\textbf{\color{#d91a1a}-5.08\%}$
test_a2c_speed[True-None] 1.6326ms 1.4921ms 670.2105 Ops/s 682.7849 Ops/s $\color{#d91a1a}-1.84\%$
test_a2c_speed[True-backward] 3.4906ms 3.3967ms 294.4022 Ops/s 317.0611 Ops/s $\textbf{\color{#d91a1a}-7.15\%}$
test_a2c_speed[reduce-overhead-None] 1.1730ms 1.0998ms 909.2203 Ops/s 878.7725 Ops/s $\color{#35bf28}+3.46\%$
test_ppo_speed[False-None] 4.2980ms 4.0340ms 247.8942 Ops/s 247.2559 Ops/s $\color{#35bf28}+0.26\%$
test_ppo_speed[False-backward] 7.7090ms 7.3246ms 136.5261 Ops/s 140.0110 Ops/s $\color{#d91a1a}-2.49\%$
test_ppo_speed[True-None] 1.6865ms 1.6289ms 613.9120 Ops/s 611.5344 Ops/s $\color{#35bf28}+0.39\%$
test_ppo_speed[True-backward] 3.5775ms 3.5354ms 282.8531 Ops/s 293.3794 Ops/s $\color{#d91a1a}-3.59\%$
test_ppo_speed[reduce-overhead-None] 1.2920ms 1.1530ms 867.3319 Ops/s 829.7149 Ops/s $\color{#35bf28}+4.53\%$
test_reinforce_speed[False-None] 2.4747ms 2.3849ms 419.3077 Ops/s 422.5815 Ops/s $\color{#d91a1a}-0.77\%$
test_reinforce_speed[False-backward] 3.9661ms 3.5225ms 283.8872 Ops/s 286.2979 Ops/s $\color{#d91a1a}-0.84\%$
test_reinforce_speed[True-None] 1.5263ms 1.4711ms 679.7496 Ops/s 686.4746 Ops/s $\color{#d91a1a}-0.98\%$
test_reinforce_speed[True-backward] 3.7921ms 3.3489ms 298.6084 Ops/s 294.8758 Ops/s $\color{#35bf28}+1.27\%$
test_reinforce_speed[reduce-overhead-None] 16.1792ms 8.9128ms 112.1988 Ops/s 112.3940 Ops/s $\color{#d91a1a}-0.17\%$
test_iql_speed[False-None] 9.9822ms 9.7067ms 103.0217 Ops/s 103.9128 Ops/s $\color{#d91a1a}-0.86\%$
test_iql_speed[False-backward] 13.9533ms 13.5256ms 73.9337 Ops/s 75.8465 Ops/s $\color{#d91a1a}-2.52\%$
test_iql_speed[True-None] 2.5855ms 2.3828ms 419.6706 Ops/s 425.4972 Ops/s $\color{#d91a1a}-1.37\%$
test_iql_speed[True-backward] 5.7033ms 5.1921ms 192.6006 Ops/s 194.3898 Ops/s $\color{#d91a1a}-0.92\%$
test_iql_speed[reduce-overhead-None] 16.8437ms 10.0957ms 99.0520 Ops/s 97.3981 Ops/s $\color{#35bf28}+1.70\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.4842ms 6.0142ms 166.2727 Ops/s 165.8897 Ops/s $\color{#35bf28}+0.23\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.9063ms 0.4087ms 2.4465 KOps/s 2.7751 KOps/s $\textbf{\color{#d91a1a}-11.84\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6635ms 0.3480ms 2.8737 KOps/s 2.6904 KOps/s $\textbf{\color{#35bf28}+6.81\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.1526ms 5.7908ms 172.6870 Ops/s 172.7820 Ops/s $\color{#d91a1a}-0.06\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.1878ms 0.3131ms 3.1943 KOps/s 2.7172 KOps/s $\textbf{\color{#35bf28}+17.56\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5609ms 0.3041ms 3.2889 KOps/s 2.7160 KOps/s $\textbf{\color{#35bf28}+21.10\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.4895ms 1.2723ms 785.9612 Ops/s 736.7014 Ops/s $\textbf{\color{#35bf28}+6.69\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4367ms 1.1854ms 843.6060 Ops/s 751.0688 Ops/s $\textbf{\color{#35bf28}+12.32\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 12.9040ms 6.0648ms 164.8855 Ops/s 169.3966 Ops/s $\color{#d91a1a}-2.66\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.3949ms 0.5242ms 1.9076 KOps/s 2.3150 KOps/s $\textbf{\color{#d91a1a}-17.60\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7454ms 0.4864ms 2.0557 KOps/s 2.4108 KOps/s $\textbf{\color{#d91a1a}-14.73\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.1769ms 5.6811ms 176.0215 Ops/s 171.6604 Ops/s $\color{#35bf28}+2.54\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.3595ms 0.3014ms 3.3175 KOps/s 2.6679 KOps/s $\textbf{\color{#35bf28}+24.35\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4606ms 0.2710ms 3.6897 KOps/s 2.8175 KOps/s $\textbf{\color{#35bf28}+30.95\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.9264ms 5.6869ms 175.8424 Ops/s 169.4830 Ops/s $\color{#35bf28}+3.75\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.6115ms 0.3225ms 3.1007 KOps/s 2.9638 KOps/s $\color{#35bf28}+4.62\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5210ms 0.3199ms 3.1261 KOps/s 3.1673 KOps/s $\color{#d91a1a}-1.30\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 8.2497ms 5.8704ms 170.3458 Ops/s 164.6493 Ops/s $\color{#35bf28}+3.46\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.9998s 2.0416ms 489.8037 Ops/s 2.2738 KOps/s $\textbf{\color{#d91a1a}-78.46\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8634ms 0.4281ms 2.3360 KOps/s 2.1669 KOps/s $\textbf{\color{#35bf28}+7.80\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.7498ms 5.1250ms 195.1205 Ops/s 194.3504 Ops/s $\color{#35bf28}+0.40\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.9968ms 1.8585ms 538.0713 Ops/s 445.3141 Ops/s $\textbf{\color{#35bf28}+20.83\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.2959ms 0.9576ms 1.0442 KOps/s 760.3484 Ops/s $\textbf{\color{#35bf28}+37.34\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 11.2835ms 5.1710ms 193.3854 Ops/s 192.8515 Ops/s $\color{#35bf28}+0.28\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 6.5257ms 1.8766ms 532.8683 Ops/s 433.1508 Ops/s $\textbf{\color{#35bf28}+23.02\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.0833ms 0.9687ms 1.0323 KOps/s 1.0287 KOps/s $\color{#35bf28}+0.35\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.6782s 18.8063ms 53.1736 Ops/s 44.0321 Ops/s $\textbf{\color{#35bf28}+20.76\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 4.1381ms 2.0067ms 498.3307 Ops/s 443.7691 Ops/s $\textbf{\color{#35bf28}+12.30\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.2670ms 1.1675ms 856.5124 Ops/s 831.0734 Ops/s $\color{#35bf28}+3.06\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 43.3097ms 40.4400ms 24.7280 Ops/s 24.9966 Ops/s $\color{#d91a1a}-1.07\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 20.3285ms 18.6177ms 53.7123 Ops/s 54.3620 Ops/s $\color{#d91a1a}-1.20\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 45.0534ms 41.1975ms 24.2733 Ops/s 24.2954 Ops/s $\color{#d91a1a}-0.09\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.3145ms 18.6329ms 53.6686 Ops/s 52.6948 Ops/s $\color{#35bf28}+1.85\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 45.3212ms 43.5211ms 22.9774 Ops/s 23.4474 Ops/s $\color{#d91a1a}-2.00\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.1821ms 20.0188ms 49.9530 Ops/s 50.6071 Ops/s $\color{#d91a1a}-1.29\%$
test_storage_write_lazystack[50-img_shape0-small] 0.8915ms 0.2259ms 4.4272 KOps/s 4.4203 KOps/s $\color{#35bf28}+0.16\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.7727ms 1.4618ms 684.1110 Ops/s 686.2627 Ops/s $\color{#d91a1a}-0.31\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.8059ms 2.3871ms 418.9239 Ops/s 405.6179 Ops/s $\color{#35bf28}+3.28\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.2757ms 3.0334ms 329.6590 Ops/s 330.4118 Ops/s $\color{#d91a1a}-0.23\%$
test_storage_write_contiguous[50-img_shape0-small] 0.5225ms 0.1670ms 5.9896 KOps/s 5.8949 KOps/s $\color{#35bf28}+1.61\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.3994ms 0.2347ms 4.2609 KOps/s 4.6189 KOps/s $\textbf{\color{#d91a1a}-7.75\%}$
test_storage_write_contiguous[100-img_shape2-large_img] 2.3277ms 1.9413ms 515.1097 Ops/s 574.6648 Ops/s $\textbf{\color{#d91a1a}-10.36\%}$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.5505ms 1.3783ms 725.5449 Ops/s 753.3862 Ops/s $\color{#d91a1a}-3.70\%$
test_collector_stack_then_write[50-img_shape0-small] 1.2377ms 1.1687ms 855.6176 Ops/s 863.7022 Ops/s $\color{#d91a1a}-0.94\%$
test_collector_stack_then_write[100-img_shape1-atari] 3.7871ms 3.6391ms 274.7944 Ops/s 273.4248 Ops/s $\color{#35bf28}+0.50\%$
test_collector_stack_then_write[100-img_shape2-large_img] 6.1005ms 5.8136ms 172.0115 Ops/s 165.2400 Ops/s $\color{#35bf28}+4.10\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 7.6422ms 7.1865ms 139.1497 Ops/s 133.3703 Ops/s $\color{#35bf28}+4.33\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4343ms 0.2807ms 3.5624 KOps/s 3.6321 KOps/s $\color{#d91a1a}-1.92\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.6919ms 1.4823ms 674.6319 Ops/s 628.7577 Ops/s $\textbf{\color{#35bf28}+7.30\%}$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.9530ms 2.4755ms 403.9522 Ops/s 384.4728 Ops/s $\textbf{\color{#35bf28}+5.07\%}$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.5205ms 3.2333ms 309.2814 Ops/s 306.9423 Ops/s $\color{#35bf28}+0.76\%$
test_collector_without_rb[100-img_shape0-atari] 34.2966ms 33.5362ms 29.8185 Ops/s 30.5369 Ops/s $\color{#d91a1a}-2.35\%$
test_collector_without_rb[200-img_shape1-large_batch] 66.9733ms 65.7879ms 15.2004 Ops/s 15.7199 Ops/s $\color{#d91a1a}-3.30\%$
test_collector_with_rb[100-img_shape0-atari] 38.9078ms 38.1376ms 26.2208 Ops/s 26.9012 Ops/s $\color{#d91a1a}-2.53\%$
test_collector_with_rb[200-img_shape1-large_batch] 75.0100ms 74.2734ms 13.4638 Ops/s 13.7254 Ops/s $\color{#d91a1a}-1.91\%$
test_collector_without_rb_cuda[100-img_shape0-atari] 57.9222ms 56.8486ms 17.5906 Ops/s 17.9775 Ops/s $\color{#d91a1a}-2.15\%$
test_collector_without_rb_cuda[200-img_shape1-large_batch] 0.1141s 0.1132s 8.8303 Ops/s 9.0837 Ops/s $\color{#d91a1a}-2.79\%$
test_collector_with_rb_cuda[100-img_shape0-atari] 59.4365ms 58.7268ms 17.0280 Ops/s 17.4782 Ops/s $\color{#d91a1a}-2.58\%$
test_collector_with_rb_cuda[200-img_shape1-large_batch] 0.1167s 0.1157s 8.6447 Ops/s 8.8017 Ops/s $\color{#d91a1a}-1.78\%$
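
The Change column above appears consistent with the relative difference between Ops and Ops on Repo HEAD, and the Improved and Worsened totals match the number of bold rows, all of which exceed 5% in magnitude. The sketch below reproduces that arithmetic for two rows. The formula and the 5% threshold are inferred from the table, not taken from the benchmark tooling, and the helper names are hypothetical.

```python
# Multipliers for the two throughput units that appear in the table.
UNIT_FACTORS = {"Ops/s": 1.0, "KOps/s": 1_000.0}


def to_ops(value: float, unit: str) -> float:
    """Convert a throughput reading to plain Ops/s."""
    return value * UNIT_FACTORS[unit]


def change_pct(ops: float, ops_head: float) -> float:
    """Relative change of the PR throughput versus repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


def is_flagged(pct: float, threshold: float = 5.0) -> bool:
    """Assumed rule: a row is highlighted when |change| exceeds the threshold."""
    return abs(pct) > threshold


# test_tensor_to_bytestream_speed[pickle]: 12.3905 KOps/s vs 12.3388 KOps/s
pickle_change = change_pct(to_ops(12.3905, "KOps/s"), to_ops(12.3388, "KOps/s"))
print(f"{pickle_change:+.2f}%")  # +0.42%

# test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000]:
# 489.8037 Ops/s vs 2.2738 KOps/s (note the mixed units)
rb_change = change_pct(to_ops(489.8037, "Ops/s"), to_ops(2.2738, "KOps/s"))
print(f"{rb_change:+.2f}%", is_flagged(rb_change))  # -78.46% True
```

Because the table values are rounded, recomputing other rows this way may differ from the reported figure in the last digit.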

@github-actions
Contributor

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 172. Improved: $\large\color{#35bf28}10$. Worsened: $\large\color{#d91a1a}13$.

Name Max Mean Ops Ops on Repo HEAD Change
test_tensor_to_bytestream_speed[pickle] 87.2799μs 85.9093μs 11.6402 KOps/s 11.4088 KOps/s $\color{#35bf28}+2.03\%$
test_tensor_to_bytestream_speed[torch.save] 0.1477ms 0.1459ms 6.8542 KOps/s 6.7914 KOps/s $\color{#35bf28}+0.92\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1116s 0.1114s 8.9787 Ops/s 9.0543 Ops/s $\color{#d91a1a}-0.83\%$
test_tensor_to_bytestream_speed[numpy] 2.5482μs 2.5432μs 393.2111 KOps/s 387.9236 KOps/s $\color{#35bf28}+1.36\%$
test_tensor_to_bytestream_speed[safetensors] 39.8115μs 39.2511μs 25.4770 KOps/s 25.2148 KOps/s $\color{#35bf28}+1.04\%$
test_simple 0.5631s 0.5622s 1.7788 Ops/s 1.7093 Ops/s $\color{#35bf28}+4.06\%$
test_transformed 1.1128s 1.1110s 0.9001 Ops/s 0.8782 Ops/s $\color{#35bf28}+2.49\%$
test_serial 1.7349s 1.7320s 0.5774 Ops/s 0.5648 Ops/s $\color{#35bf28}+2.23\%$
test_parallel 1.0611s 1.0589s 0.9444 Ops/s 0.9558 Ops/s $\color{#d91a1a}-1.20\%$
test_step_mdp_speed[True-True-True-True-True] 0.3284ms 42.9893μs 23.2616 KOps/s 23.2085 KOps/s $\color{#35bf28}+0.23\%$
test_step_mdp_speed[True-True-True-True-False] 57.3010μs 23.8826μs 41.8714 KOps/s 42.3344 KOps/s $\color{#d91a1a}-1.09\%$
test_step_mdp_speed[True-True-True-False-True] 56.9310μs 24.3742μs 41.0271 KOps/s 39.1866 KOps/s $\color{#35bf28}+4.70\%$
test_step_mdp_speed[True-True-True-False-False] 37.4810μs 13.1643μs 75.9633 KOps/s 73.8884 KOps/s $\color{#35bf28}+2.81\%$
test_step_mdp_speed[True-True-False-True-True] 0.1362ms 46.0777μs 21.7025 KOps/s 21.7240 KOps/s $\color{#d91a1a}-0.10\%$
test_step_mdp_speed[True-True-False-True-False] 58.6910μs 26.2649μs 38.0736 KOps/s 37.6634 KOps/s $\color{#35bf28}+1.09\%$
test_step_mdp_speed[True-True-False-False-True] 52.1110μs 27.0389μs 36.9837 KOps/s 36.3154 KOps/s $\color{#35bf28}+1.84\%$
test_step_mdp_speed[True-True-False-False-False] 45.7510μs 15.9313μs 62.7694 KOps/s 61.7637 KOps/s $\color{#35bf28}+1.63\%$
test_step_mdp_speed[True-False-True-True-True] 0.1174ms 48.7543μs 20.5110 KOps/s 20.1798 KOps/s $\color{#35bf28}+1.64\%$
test_step_mdp_speed[True-False-True-True-False] 56.8910μs 29.5215μs 33.8736 KOps/s 33.9461 KOps/s $\color{#d91a1a}-0.21\%$
test_step_mdp_speed[True-False-True-False-True] 60.5910μs 27.5196μs 36.3377 KOps/s 36.1618 KOps/s $\color{#35bf28}+0.49\%$
test_step_mdp_speed[True-False-True-False-False] 54.6310μs 16.0341μs 62.3670 KOps/s 62.5925 KOps/s $\color{#d91a1a}-0.36\%$
test_step_mdp_speed[True-False-False-True-True] 89.3020μs 50.8676μs 19.6589 KOps/s 18.9267 KOps/s $\color{#35bf28}+3.87\%$
test_step_mdp_speed[True-False-False-True-False] 69.6420μs 32.1754μs 31.0796 KOps/s 31.2105 KOps/s $\color{#d91a1a}-0.42\%$
test_step_mdp_speed[True-False-False-False-True] 69.3410μs 29.7864μs 33.5723 KOps/s 32.9014 KOps/s $\color{#35bf28}+2.04\%$
test_step_mdp_speed[True-False-False-False-False] 53.4210μs 18.7168μs 53.4280 KOps/s 53.0710 KOps/s $\color{#35bf28}+0.67\%$
test_step_mdp_speed[False-True-True-True-True] 0.1292ms 49.0658μs 20.3808 KOps/s 19.9189 KOps/s $\color{#35bf28}+2.32\%$
test_step_mdp_speed[False-True-True-True-False] 2.1008ms 29.7551μs 33.6077 KOps/s 34.0208 KOps/s $\color{#d91a1a}-1.21\%$
test_step_mdp_speed[False-True-True-False-True] 62.8210μs 31.1114μs 32.1425 KOps/s 31.6159 KOps/s $\color{#35bf28}+1.67\%$
test_step_mdp_speed[False-True-True-False-False] 46.1300μs 17.4616μs 57.2684 KOps/s 55.4812 KOps/s $\color{#35bf28}+3.22\%$
test_step_mdp_speed[False-True-False-True-True] 84.6810μs 51.2803μs 19.5007 KOps/s 19.1622 KOps/s $\color{#35bf28}+1.77\%$
test_step_mdp_speed[False-True-False-True-False] 62.4220μs 31.9947μs 31.2552 KOps/s 31.4174 KOps/s $\color{#d91a1a}-0.52\%$
test_step_mdp_speed[False-True-False-False-True] 61.3810μs 33.7800μs 29.6034 KOps/s 29.8350 KOps/s $\color{#d91a1a}-0.78\%$
test_step_mdp_speed[False-True-False-False-False] 58.1310μs 20.3954μs 49.0306 KOps/s 49.4408 KOps/s $\color{#d91a1a}-0.83\%$
test_step_mdp_speed[False-False-True-True-True] 86.8810μs 54.4759μs 18.3567 KOps/s 18.3644 KOps/s $\color{#d91a1a}-0.04\%$
test_step_mdp_speed[False-False-True-True-False] 74.7010μs 34.5157μs 28.9723 KOps/s 28.6492 KOps/s $\color{#35bf28}+1.13\%$
test_step_mdp_speed[False-False-True-False-True] 67.1610μs 33.9555μs 29.4503 KOps/s 29.2240 KOps/s $\color{#35bf28}+0.77\%$
test_step_mdp_speed[False-False-True-False-False] 46.1100μs 20.1766μs 49.5624 KOps/s 48.3239 KOps/s $\color{#35bf28}+2.56\%$
test_step_mdp_speed[False-False-False-True-True] 93.7620μs 56.9961μs 17.5450 KOps/s 17.9952 KOps/s $\color{#d91a1a}-2.50\%$
test_step_mdp_speed[False-False-False-True-False] 64.3710μs 37.7239μs 26.5084 KOps/s 27.2110 KOps/s $\color{#d91a1a}-2.58\%$
test_step_mdp_speed[False-False-False-False-True] 68.8510μs 35.9263μs 27.8348 KOps/s 28.2265 KOps/s $\color{#d91a1a}-1.39\%$
test_step_mdp_speed[False-False-False-False-False] 56.4010μs 22.8473μs 43.7688 KOps/s 43.7272 KOps/s $\color{#35bf28}+0.09\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.7577s 0.7547s 1.3251 Ops/s 1.2867 Ops/s $\color{#35bf28}+2.98\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7313s 0.6337s 1.5779 Ops/s 1.5727 Ops/s $\color{#35bf28}+0.33\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.8042s 1.7096s 0.5849 Ops/s 0.5810 Ops/s $\color{#35bf28}+0.67\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5535s 1.4740s 0.6784 Ops/s 0.6734 Ops/s $\color{#35bf28}+0.74\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 2.0423s 1.9628s 0.5095 Ops/s 0.5062 Ops/s $\color{#35bf28}+0.64\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.8188s 1.7346s 0.5765 Ops/s 0.5720 Ops/s $\color{#35bf28}+0.80\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 5.0686s 4.7708s 0.2096 Ops/s 0.2115 Ops/s $\color{#d91a1a}-0.91\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.5743s 4.4528s 0.2246 Ops/s 0.2227 Ops/s $\color{#35bf28}+0.85\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 2.0614s 1.9445s 0.5143 Ops/s 0.5046 Ops/s $\color{#35bf28}+1.91\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.8453s 1.7219s 0.5808 Ops/s 0.6086 Ops/s $\color{#d91a1a}-4.57\%$
test_values[generalized_advantage_estimate-True-True] 10.4936ms 10.0788ms 99.2180 Ops/s 99.0876 Ops/s $\color{#35bf28}+0.13\%$
test_values[vec_generalized_advantage_estimate-True-True] 21.4637ms 17.6237ms 56.7416 Ops/s 56.4927 Ops/s $\color{#35bf28}+0.44\%$
test_values[td0_return_estimate-False-False] 0.2947ms 0.1409ms 7.0956 KOps/s 7.5592 KOps/s $\textbf{\color{#d91a1a}-6.13\%}$
test_values[td1_return_estimate-False-False] 28.8287ms 27.6672ms 36.1438 Ops/s 36.1557 Ops/s $\color{#d91a1a}-0.03\%$
test_values[vec_td1_return_estimate-False-False] 20.4663ms 17.7500ms 56.3379 Ops/s 56.3818 Ops/s $\color{#d91a1a}-0.08\%$
test_values[td_lambda_return_estimate-True-False] 45.1468ms 40.9761ms 24.4045 Ops/s 24.3242 Ops/s $\color{#35bf28}+0.33\%$
test_values[vec_td_lambda_return_estimate-True-False] 20.7641ms 17.7867ms 56.2219 Ops/s 56.5270 Ops/s $\color{#d91a1a}-0.54\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.0125ms 8.8819ms 112.5884 Ops/s 113.4966 Ops/s $\color{#d91a1a}-0.80\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 4.6383ms 1.5230ms 656.5877 Ops/s 653.2781 Ops/s $\color{#35bf28}+0.51\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5504ms 0.4294ms 2.3286 KOps/s 2.3751 KOps/s $\color{#d91a1a}-1.96\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 38.6520ms 34.4793ms 29.0029 Ops/s 32.5684 Ops/s $\textbf{\color{#d91a1a}-10.95\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 2.0770ms 1.7421ms 574.0342 Ops/s 577.6597 Ops/s $\color{#d91a1a}-0.63\%$
test_dqn_speed[False-None] 1.7770ms 1.4735ms 678.6504 Ops/s 687.4567 Ops/s $\color{#d91a1a}-1.28\%$
test_dqn_speed[False-backward] 2.0831ms 2.0136ms 496.6228 Ops/s 503.9315 Ops/s $\color{#d91a1a}-1.45\%$
test_dqn_speed[True-None] 1.2964ms 0.6724ms 1.4871 KOps/s 1.7124 KOps/s $\textbf{\color{#d91a1a}-13.16\%}$
test_dqn_speed[True-backward] 1.1228ms 1.0674ms 936.8862 Ops/s 886.2539 Ops/s $\textbf{\color{#35bf28}+5.71\%}$
test_dqn_speed[reduce-overhead-None] 0.7941ms 0.5664ms 1.7655 KOps/s 1.6909 KOps/s $\color{#35bf28}+4.41\%$
test_ddpg_speed[False-None] 3.4584ms 2.9955ms 333.8305 Ops/s 336.3141 Ops/s $\color{#d91a1a}-0.74\%$
test_ddpg_speed[False-backward] 4.3567ms 4.2068ms 237.7092 Ops/s 235.0208 Ops/s $\color{#35bf28}+1.14\%$
test_ddpg_speed[True-None] 1.7932ms 1.4924ms 670.0765 Ops/s 675.1641 Ops/s $\color{#d91a1a}-0.75\%$
test_ddpg_speed[True-backward] 2.6099ms 2.5195ms 396.9026 Ops/s 386.8775 Ops/s $\color{#35bf28}+2.59\%$
test_ddpg_speed[reduce-overhead-None] 1.8713ms 1.4630ms 683.5375 Ops/s 685.4717 Ops/s $\color{#d91a1a}-0.28\%$
test_sac_speed[False-None] 8.8608ms 8.3793ms 119.3417 Ops/s 121.1605 Ops/s $\color{#d91a1a}-1.50\%$
test_sac_speed[False-backward] 12.2536ms 11.7559ms 85.0635 Ops/s 86.9490 Ops/s $\color{#d91a1a}-2.17\%$
test_sac_speed[True-None] 2.3815ms 2.2537ms 443.7241 Ops/s 448.6298 Ops/s $\color{#d91a1a}-1.09\%$
test_sac_speed[True-backward] 4.3946ms 4.2109ms 237.4804 Ops/s 223.6417 Ops/s $\textbf{\color{#35bf28}+6.19\%}$
test_sac_speed[reduce-overhead-None] 2.4679ms 2.2150ms 451.4575 Ops/s 435.0004 Ops/s $\color{#35bf28}+3.78\%$
test_redq_speed[False-None] 15.9435ms 10.9637ms 91.2101 Ops/s 91.5009 Ops/s $\color{#d91a1a}-0.32\%$
test_redq_speed[False-backward] 19.0715ms 18.4182ms 54.2940 Ops/s 54.7825 Ops/s $\color{#d91a1a}-0.89\%$
test_redq_speed[True-None] 4.9007ms 4.6640ms 214.4061 Ops/s 202.4052 Ops/s $\textbf{\color{#35bf28}+5.93\%}$
test_redq_speed[reduce-overhead-None] 4.7859ms 4.6221ms 216.3541 Ops/s 207.5069 Ops/s $\color{#35bf28}+4.26\%$
test_redq_deprec_speed[False-None] 11.9764ms 11.4380ms 87.4282 Ops/s 87.1803 Ops/s $\color{#35bf28}+0.28\%$
test_redq_deprec_speed[False-backward] 16.7941ms 16.3897ms 61.0140 Ops/s 61.1083 Ops/s $\color{#d91a1a}-0.15\%$
test_redq_deprec_speed[True-None] 4.1542ms 3.7410ms 267.3108 Ops/s 269.8014 Ops/s $\color{#d91a1a}-0.92\%$
test_redq_deprec_speed[True-backward] 7.9291ms 7.6189ms 131.2529 Ops/s 125.4550 Ops/s $\color{#35bf28}+4.62\%$
test_redq_deprec_speed[reduce-overhead-None] 3.9943ms 3.6709ms 272.4128 Ops/s 268.3400 Ops/s $\color{#35bf28}+1.52\%$
test_td3_speed[False-None] 8.6428ms 8.3532ms 119.7149 Ops/s 118.6070 Ops/s $\color{#35bf28}+0.93\%$
test_td3_speed[False-backward] 15.2811ms 11.2807ms 88.6468 Ops/s 89.1464 Ops/s $\color{#d91a1a}-0.56\%$
test_td3_speed[True-None] 1.9333ms 1.8954ms 527.5852 Ops/s 532.1715 Ops/s $\color{#d91a1a}-0.86\%$
test_td3_speed[True-backward] 3.8066ms 3.6774ms 271.9314 Ops/s 265.7450 Ops/s $\color{#35bf28}+2.33\%$
test_td3_speed[reduce-overhead-None] 1.9524ms 1.8537ms 539.4693 Ops/s 557.6263 Ops/s $\color{#d91a1a}-3.26\%$
test_cql_speed[False-None] 30.9460ms 27.2084ms 36.7533 Ops/s 38.0942 Ops/s $\color{#d91a1a}-3.52\%$
test_cql_speed[False-backward] 41.2703ms 36.8148ms 27.1630 Ops/s 28.1531 Ops/s $\color{#d91a1a}-3.52\%$
test_cql_speed[True-None] 13.0108ms 12.6026ms 79.3488 Ops/s 79.1653 Ops/s $\color{#35bf28}+0.23\%$
test_cql_speed[True-backward] 18.6539ms 18.2857ms 54.6877 Ops/s 55.0228 Ops/s $\color{#d91a1a}-0.61\%$
test_cql_speed[reduce-overhead-None] 15.6962ms 12.7595ms 78.3732 Ops/s 77.7796 Ops/s $\color{#35bf28}+0.76\%$
test_a2c_speed[False-None] 5.9602ms 5.5472ms 180.2713 Ops/s 178.4195 Ops/s $\color{#35bf28}+1.04\%$
test_a2c_speed[False-backward] 12.4232ms 12.1142ms 82.5478 Ops/s 81.7565 Ops/s $\color{#35bf28}+0.97\%$
test_a2c_speed[True-None] 4.1671ms 3.9223ms 254.9524 Ops/s 252.3215 Ops/s $\color{#35bf28}+1.04\%$
test_a2c_speed[True-backward] 9.1626ms 8.8462ms 113.0423 Ops/s 110.9903 Ops/s $\color{#35bf28}+1.85\%$
test_a2c_speed[reduce-overhead-None] 4.2451ms 3.9069ms 255.9554 Ops/s 251.7850 Ops/s $\color{#35bf28}+1.66\%$
test_ppo_speed[False-None] 6.4472ms 6.1850ms 161.6824 Ops/s 162.4711 Ops/s $\color{#d91a1a}-0.49\%$
test_ppo_speed[False-backward] 13.2195ms 12.9767ms 77.0611 Ops/s 77.5154 Ops/s $\color{#d91a1a}-0.59\%$
test_ppo_speed[True-None] 4.3564ms 3.8855ms 257.3659 Ops/s 251.9778 Ops/s $\color{#35bf28}+2.14\%$
test_ppo_speed[True-backward] 9.3458ms 8.9164ms 112.1526 Ops/s 101.9403 Ops/s $\textbf{\color{#35bf28}+10.02\%}$
test_ppo_speed[reduce-overhead-None] 4.1091ms 3.8154ms 262.0982 Ops/s 260.1894 Ops/s $\color{#35bf28}+0.73\%$
test_reinforce_speed[False-None] 4.8597ms 4.6810ms 213.6274 Ops/s 211.6662 Ops/s $\color{#35bf28}+0.93\%$
test_reinforce_speed[False-backward] 7.9186ms 7.6251ms 131.1466 Ops/s 131.5952 Ops/s $\color{#d91a1a}-0.34\%$
test_reinforce_speed[True-None] 3.2954ms 3.0906ms 323.5624 Ops/s 322.1116 Ops/s $\color{#35bf28}+0.45\%$
test_reinforce_speed[True-backward] 8.3232ms 8.0960ms 123.5178 Ops/s 119.5138 Ops/s $\color{#35bf28}+3.35\%$
test_reinforce_speed[reduce-overhead-None] 3.4583ms 3.0637ms 326.4049 Ops/s 328.0903 Ops/s $\color{#d91a1a}-0.51\%$
test_iql_speed[False-None] 26.1515ms 21.1202ms 47.3480 Ops/s 48.5021 Ops/s $\color{#d91a1a}-2.38\%$
test_iql_speed[False-backward] 31.7406ms 31.0681ms 32.1874 Ops/s 32.1199 Ops/s $\color{#35bf28}+0.21\%$
test_iql_speed[True-None] 9.0472ms 8.6920ms 115.0490 Ops/s 114.4446 Ops/s $\color{#35bf28}+0.53\%$
test_iql_speed[True-backward] 17.3447ms 17.0003ms 58.8225 Ops/s 58.4840 Ops/s $\color{#35bf28}+0.58\%$
test_iql_speed[reduce-overhead-None] 9.1768ms 8.6448ms 115.6767 Ops/s 114.7038 Ops/s $\color{#35bf28}+0.85\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.6275ms 6.2859ms 159.0867 Ops/s 159.6610 Ops/s $\color{#d91a1a}-0.36\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 3.0705ms 0.4060ms 2.4628 KOps/s 2.5225 KOps/s $\color{#d91a1a}-2.37\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6895ms 0.3430ms 2.9153 KOps/s 2.6353 KOps/s $\textbf{\color{#35bf28}+10.63\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3630ms 6.0635ms 164.9221 Ops/s 165.1474 Ops/s $\color{#d91a1a}-0.14\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.0185ms 0.3451ms 2.8981 KOps/s 2.5879 KOps/s $\textbf{\color{#35bf28}+11.98\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5821ms 0.3403ms 2.9382 KOps/s 2.7118 KOps/s $\textbf{\color{#35bf28}+8.35\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.8885ms 1.4444ms 692.3398 Ops/s 674.6993 Ops/s $\color{#35bf28}+2.61\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.7163ms 1.3826ms 723.2770 Ops/s 718.6310 Ops/s $\color{#35bf28}+0.65\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 10.4678ms 6.3024ms 158.6697 Ops/s 163.1136 Ops/s $\color{#d91a1a}-2.72\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8380ms 0.5137ms 1.9468 KOps/s 1.7783 KOps/s $\textbf{\color{#35bf28}+9.48\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8067ms 0.4884ms 2.0475 KOps/s 1.8437 KOps/s $\textbf{\color{#35bf28}+11.05\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2538ms 6.0633ms 164.9256 Ops/s 167.7885 Ops/s $\color{#d91a1a}-1.71\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9102ms 0.3565ms 2.8050 KOps/s 3.3286 KOps/s $\textbf{\color{#d91a1a}-15.73\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5533ms 0.3401ms 2.9406 KOps/s 3.5538 KOps/s $\textbf{\color{#d91a1a}-17.25\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.2742ms 5.9965ms 166.7630 Ops/s 169.8851 Ops/s $\color{#d91a1a}-1.84\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.8348ms 0.3658ms 2.7336 KOps/s 3.2158 KOps/s $\textbf{\color{#d91a1a}-15.00\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5840ms 0.3585ms 2.7894 KOps/s 2.9030 KOps/s $\color{#d91a1a}-3.91\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4504ms 6.1865ms 161.6415 Ops/s 162.0333 Ops/s $\color{#d91a1a}-0.24\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.1830ms 0.5047ms 1.9814 KOps/s 2.1620 KOps/s $\textbf{\color{#d91a1a}-8.35\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7458ms 0.4806ms 2.0809 KOps/s 2.0925 KOps/s $\color{#d91a1a}-0.55\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.6630ms 5.2086ms 191.9918 Ops/s 192.8645 Ops/s $\color{#d91a1a}-0.45\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.5464ms 2.2020ms 454.1289 Ops/s 503.1417 Ops/s $\textbf{\color{#d91a1a}-9.74\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 9.6314ms 1.3209ms 757.0641 Ops/s 1.0585 KOps/s $\textbf{\color{#d91a1a}-28.48\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.6661s 18.4942ms 54.0710 Ops/s 194.5615 Ops/s $\textbf{\color{#d91a1a}-72.21\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.6109ms 2.0125ms 496.8972 Ops/s 538.4357 Ops/s $\textbf{\color{#d91a1a}-7.71\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 3.5916ms 0.9901ms 1.0100 KOps/s 1.0462 KOps/s $\color{#d91a1a}-3.46\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 7.7509ms 5.4554ms 183.3047 Ops/s 51.3125 Ops/s $\textbf{\color{#35bf28}+257.23\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 11.1348ms 2.1262ms 470.3210 Ops/s 481.0555 Ops/s $\color{#d91a1a}-2.23\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 3.5161ms 1.1728ms 852.6809 Ops/s 879.9716 Ops/s $\color{#d91a1a}-3.10\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 44.4593ms 40.1832ms 24.8860 Ops/s 24.4799 Ops/s $\color{#35bf28}+1.66\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 21.0755ms 19.0020ms 52.6260 Ops/s 52.8467 Ops/s $\color{#d91a1a}-0.42\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 45.6155ms 41.1183ms 24.3201 Ops/s 23.9269 Ops/s $\color{#35bf28}+1.64\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.6568ms 19.1883ms 52.1151 Ops/s 51.8078 Ops/s $\color{#35bf28}+0.59\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 44.5807ms 42.9158ms 23.3014 Ops/s 23.0191 Ops/s $\color{#35bf28}+1.23\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 22.4736ms 20.8731ms 47.9085 Ops/s 48.2047 Ops/s $\color{#d91a1a}-0.61\%$
test_storage_write_lazystack[50-img_shape0-small] 0.9465ms 0.2336ms 4.2802 KOps/s 4.4096 KOps/s $\color{#d91a1a}-2.93\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.7536ms 1.4235ms 702.4886 Ops/s 716.5838 Ops/s $\color{#d91a1a}-1.97\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.8240ms 2.3621ms 423.3482 Ops/s 422.1170 Ops/s $\color{#35bf28}+0.29\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.1463ms 2.9138ms 343.1938 Ops/s 338.5552 Ops/s $\color{#35bf28}+1.37\%$
test_storage_write_contiguous[50-img_shape0-small] 0.4625ms 0.1430ms 6.9941 KOps/s 6.9215 KOps/s $\color{#35bf28}+1.05\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.3714ms 0.2139ms 4.6750 KOps/s 5.1710 KOps/s $\textbf{\color{#d91a1a}-9.59\%}$
test_storage_write_contiguous[100-img_shape2-large_img] 1.9037ms 1.7844ms 560.4125 Ops/s 568.9050 Ops/s $\color{#d91a1a}-1.49\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.5427ms 1.3323ms 750.5704 Ops/s 768.0659 Ops/s $\color{#d91a1a}-2.28\%$
test_collector_stack_then_write[50-img_shape0-small] 1.2524ms 1.1540ms 866.5234 Ops/s 865.6851 Ops/s $\color{#35bf28}+0.10\%$
test_collector_stack_then_write[100-img_shape1-atari] 3.7547ms 3.5958ms 278.0985 Ops/s 279.8875 Ops/s $\color{#d91a1a}-0.64\%$
test_collector_stack_then_write[100-img_shape2-large_img] 6.2067ms 5.8427ms 171.1547 Ops/s 171.8398 Ops/s $\color{#d91a1a}-0.40\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 7.5880ms 7.3842ms 135.4246 Ops/s 134.2915 Ops/s $\color{#35bf28}+0.84\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4612ms 0.2885ms 3.4666 KOps/s 3.5252 KOps/s $\color{#d91a1a}-1.66\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.7097ms 1.5322ms 652.6662 Ops/s 654.7056 Ops/s $\color{#d91a1a}-0.31\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.9270ms 2.4880ms 401.9330 Ops/s 403.0020 Ops/s $\color{#d91a1a}-0.27\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.3961ms 3.1167ms 320.8570 Ops/s 314.1511 Ops/s $\color{#35bf28}+2.13\%$
test_collector_without_rb[100-img_shape0-atari] 34.4024ms 33.2967ms 30.0330 Ops/s 29.8565 Ops/s $\color{#35bf28}+0.59\%$
test_collector_without_rb[200-img_shape1-large_batch] 65.6397ms 65.3183ms 15.3096 Ops/s 15.1083 Ops/s $\color{#35bf28}+1.33\%$
test_collector_with_rb[100-img_shape0-atari] 0.7048s 64.2470ms 15.5649 Ops/s 25.8539 Ops/s $\textbf{\color{#d91a1a}-39.80\%}$
test_collector_with_rb[200-img_shape1-large_batch] 75.7539ms 75.3336ms 13.2743 Ops/s 13.1154 Ops/s $\color{#35bf28}+1.21\%$
