### Problem description
Hi, once again, thank you for this amazing tool!
I started noticing that global updates are quite slow, so I started investigating a bit.
If I understood well, `pixi global update` processes all environments sequentially. Each environment goes through repodata fetch → SAT solve → install → manifest sync, one at a time. With many global environments, this adds up quickly. On my setup with 33 environments, a no-op update (everything already up-to-date) takes ~1m 26s.
Also, when an environment is out of sync, it is installed twice with identical specs: once to populate the prefix for executable scanning, then again for the actual update. Both calls produce the same result, so the first install is redundant.
### Proposed solution
Example implementation at: flferretti@2ff60a2
Split the update into two phases:
- **Parallel install**: All environments run `environment_in_sync` + `install_environment` + expose-type detection concurrently via `futures::future::join_all`, sharing an immutable `&Project` reference (and therefore the same repodata gateway and authenticated client).
- **Sequential manifest sync**: The cheap manifest updates (`sync_exposed_names`, `sync_shortcuts`, `expose_executables_from_environment`, `sync_completions`) run sequentially since they need `&mut self`.

The redundant double install is eliminated by reading the pre-update executables before the install when the environment is already in sync.
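The two-phase shape can be sketched with scoped threads over a shared borrow (the actual change uses async tasks with `futures::future::join_all`; `Project` and `check_and_install` here are hypothetical stand-ins, not pixi's real types):

```rust
use std::thread;

// Hypothetical stand-in for pixi's Project: only shared access is needed
// during phase 1, so all workers can borrow it immutably at once.
struct Project {
    name_list: Vec<String>,
}

// Phase 1 work per environment: in-sync check + install.
// Takes only a shared reference, so it can run concurrently.
fn check_and_install(env: &str) -> String {
    format!("{env}: installed")
}

fn main() {
    let project = Project {
        name_list: vec!["bat".into(), "ruff".into(), "uv".into()],
    };

    // Phase 1: spawn all environments concurrently against the immutable borrow,
    // then collect results in input order.
    let results: Vec<String> = thread::scope(|s| {
        let handles: Vec<_> = project
            .name_list
            .iter()
            .map(|env| s.spawn(move || check_and_install(env)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });

    // Phase 2: the cheap manifest sync runs sequentially (it needs &mut self
    // in the real code); here we just report the results in order.
    for r in &results {
        println!("{r}");
    }
}
```

The key property is that everything needing `&mut` is deferred to the sequential second phase, so the expensive first phase never contends for exclusive access.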
As a minor improvement, I think that environment names from CLI input could be de-duplicated via `IndexSet`.
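`IndexSet` (from the `indexmap` crate) de-duplicates while keeping first-seen order; the same behavior can be sketched with the standard library alone (`dedup_preserving_order` is a made-up helper name for illustration):

```rust
use std::collections::HashSet;

// Drop repeated CLI environment names while keeping first-seen order,
// mimicking what collecting into an indexmap::IndexSet would do.
fn dedup_preserving_order(names: Vec<String>) -> Vec<String> {
    let mut seen = HashSet::new();
    // insert() returns false for names already seen, filtering repeats.
    names.into_iter().filter(|n| seen.insert(n.clone())).collect()
}

fn main() {
    let args = vec!["ruff".to_string(), "bat".to_string(), "ruff".to_string()];
    assert_eq!(dedup_preserving_order(args), vec!["ruff", "bat"]);
    println!("ok");
}
```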
### Benchmarks

#### Scenario 1: Real update with 20 environments (17 updated + 3 already up-to-date)

20 environments (bat, black, codespell, dust, fd-find, flake8, git-delta, httpie, hyperfine, isort, jq, mypy, pre-commit, pylint, pyupgrade, ripgrep, ruff, sd, starship, zoxide) installed at older pinned versions, then updated to latest with `"*"` constraints. Warm repodata cache.
|                     | Wall time | CPU time |
|---------------------|-----------|----------|
| Before (sequential) | 27.0s     | ~29.5s   |
| After (parallel)    | 10.8s     | ~33.5s   |
| Speedup             | 2.5×      | —        |
#### Scenario 2: No-op update with 33 environments (all already up-to-date)
33 global environments (bat, black, chezmoi, codespell, conda-recipe-manager, conda-smithy, direnv, eza, gh, gdb, git-lfs, go, htop, htop-rs, flake8, isort, nvitop, nvtop, pixi-browse, pixi-gui, pre-commit, pylint, pyupgrade, rattler-build, refurb, ripgrep, ruff, speedtest-cli, tectonic, ty, uv, wandb, zensical), all already at their latest versions.
|                     | Wall time | CPU time |
|---------------------|-----------|----------|
| Before (sequential) | 1m 26s    | ~1m 26s  |
| After (parallel)    | ~33s      | ~1m 30s  |
| Speedup             | 2.6×      | —        |
#### Per-environment timing (after fix)
Sync checks complete in parallel (~670ms each). Install times vary by environment complexity:
| Environment          | Install time |
|----------------------|--------------|
| chezmoi              | 1.2s         |
| go                   | 1.4s         |
| uv, direnv, git-lfs  | ~1.9s        |
| black                | 2.9s         |
| pylint               | 3.3s         |
| wandb                | 5.5s         |
| conda-recipe-manager | 11.6s        |
| conda-smithy         | 14.2s        |
| zensical             | 25.3s        |
Wall-clock time is bounded by the slowest environment (zensical at 25s) plus overhead, rather than the sum of all environments.
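A quick sanity check of that bound, comparing the sum of the table's install times (what sequential execution pays) against their maximum (roughly what parallel execution pays) — illustration only, not part of the patch:

```rust
fn main() {
    // Install times in seconds, taken from the per-environment table above.
    let times = [1.2, 1.4, 1.9, 2.9, 3.3, 5.5, 11.6, 14.2, 25.3];

    // Sequential execution pays the sum of all install times.
    let sequential: f64 = times.iter().sum();

    // Parallel execution is bounded by the slowest environment plus overhead.
    let parallel = times.iter().cloned().fold(f64::MIN, f64::max);

    println!("sequential ≈ {sequential:.1}s, parallel ≈ {parallel:.1}s");
}
```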
### System info

- Linux 7.0.0-13-generic x86_64
- Intel Core Ultra 9 285H (16 cores, 1 thread/core), 32 GB RAM
- Rust 1.90.0
- pixi built from `cf4a8cc` (`main`)
- Build profile: debug (unoptimized)