RFC: conda-forge epochs for solver accuracy, speed & debuggability?
Conda and mamba’s solvers take into account the entirety of packages ever published when trying to resolve an environment (with some accelerations, e.g. checking first whether things are resolvable with `repodata_current.json`).
This can sometimes lead the solver astray and force it into very weird contortions, where very old packages are picked just because they seemingly satisfy the constraints (though realistically, this is almost always an error in our metadata). There are many examples of this, including a few that came up recently.
While considering every package ever published definitely has some advantages (fewer rebuilds, old packages stay installable), it also inevitably runs into problems where old packages haven’t been rebuilt for modern dependencies (e.g. no run-exports), aren’t aware of ABI breaks that were unknown at the time, noarch vs. yesarch, etc.
So it would be nice to give users a way to enforce an option that says “I only want comparatively recent packages” or, in other words, “please don’t do unexpected/unintended/crazy things while trying to resolve my environment”.
I was thinking about how this could be done in a way that wouldn’t require constant rebuilds (i.e. say, if a “conda-forge epoch” were to be defined as equal to a calendar year, nothing would be installable in January until all common packages have been rebuilt).
My current idea looks as follows:
- There’s an empty metapackage `__conda-forge-epoch` that gets built every day (or week, or month), and versioned accordingly, e.g. `2022.12.19`.
- All outputs gain an automatic run-constraint:

      run_constrained:
        {% set epoch = datetime.date.today().strftime('%Y.%m.%d') %}
        - __conda-forge-epoch <={{ epoch }}

  - note the `<=`, which is the other way around from e.g. our usual run-exports.
  - implementing this (without having to modify every recipe) probably needs support from conda-build, but for now I’m assuming this is possible.
- By default, `__conda-forge-epoch` does not get installed, and therefore the constraints don’t get triggered.
  - This also means we wouldn’t have to rebuild stuff more often than we already do, as the proposed default is effectively the same as the status quo.
  - In other words, there are no hard “epoch breaks” (like we had once upon a time for going from the old compilers to the new ones).
- If a user wants to avoid certain solver errors, or simply enforce recent builds, they can add `__conda-forge-epoch>=yyyy.mm.dd` to their environment specs (now we have the `>=`). This would force the solver to only take into account packages built on or after that date.
- Perhaps even more importantly, it would allow users (& conda-forge members) to more easily debug solver errors, by forcing the solver to only consider a more recent subset of packages, without getting lost in the weeds of the past.
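To make the interplay of the `<=` (on packages) and `>=` (from the user) concrete, here is a rough Python model of the idea. It is only a sketch, not how conda or mamba actually evaluate constraints: the function names and the package names are made up for illustration, and it assumes an epoch metapackage exists for every calendar day.

```python
from datetime import date
from typing import Dict, List, Optional


def parse_epoch(version: str) -> date:
    """Turn a 'YYYY.MM.DD' epoch version into a date for comparison."""
    year, month, day = (int(part) for part in version.split("."))
    return date(year, month, day)


def admissible(build_epochs: Dict[str, str],
               user_min_epoch: Optional[str] = None) -> List[str]:
    """Return the package names the solver could still pick.

    Each package carries `__conda-forge-epoch <=<build date>`; the user may
    request `__conda-forge-epoch >=<user_min_epoch>`.
    """
    if user_min_epoch is None:
        # Metapackage not requested: the run-constraints stay inert (status quo).
        return sorted(build_epochs)
    floor = parse_epoch(user_min_epoch)
    # An epoch E with floor <= E <= build date exists only if build date >= floor.
    return sorted(
        name for name, built in build_epochs.items()
        if parse_epoch(built) >= floor
    )


# Hypothetical packages and build dates, for illustration only.
builds = {"old-lib": "2019.03.02", "mid-lib": "2022.06.30", "new-lib": "2022.12.19"}
print(admissible(builds))                # no epoch requested: all three remain
print(admissible(builds, "2022.01.01"))  # only mid-lib and new-lib remain
```

If epochs were only published weekly or monthly, the check would additionally need an actual published epoch version between the user's floor and the build date, which this model glosses over.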
I think just the debugging capabilities of this would make this worth considering, but maybe I’m just not very good at debugging resolver errors. 😅
Would be interested to hear people’s thoughts.
Top GitHub Comments
Rather than having a metapackage I think this could be an install flag as the build timestamps are already included in the repodata.
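As a rough illustration of what such a flag could do under the hood, the sketch below filters repodata-style records by build time. It is not an existing conda or mamba option. It assumes each record has a `timestamp` field in milliseconds since the Unix epoch, and it treats records without one as too old; the filenames and the function names are hypothetical.

```python
from datetime import datetime, timezone


def to_ms(moment: datetime) -> int:
    """Convert a datetime to milliseconds since the Unix epoch."""
    return int(moment.timestamp() * 1000)


def built_on_or_after(records: dict, cutoff: datetime) -> dict:
    """Keep only the records whose build timestamp is at or after the cutoff."""
    cutoff_ms = to_ms(cutoff)
    return {
        filename: record
        for filename, record in records.items()
        # Assumption: a missing timestamp means a very old build, so drop it.
        if record.get("timestamp", 0) >= cutoff_ms
    }


# Hypothetical records shaped like the "packages" mapping of a repodata file.
records = {
    "old-lib-1.0-h0_0.tar.bz2": {
        "name": "old-lib",
        "timestamp": to_ms(datetime(2019, 3, 2, tzinfo=timezone.utc)),
    },
    "new-lib-2.0-h0_0.conda": {
        "name": "new-lib",
        "timestamp": to_ms(datetime(2022, 12, 19, tzinfo=timezone.utc)),
    },
    "ancient-lib-0.1-0.tar.bz2": {"name": "ancient-lib"},  # no timestamp at all
}

recent = built_on_or_after(records, datetime(2022, 1, 1, tzinfo=timezone.utc))
print(sorted(recent))
```

Compared with the metapackage approach, this needs no new run-constraints on any package, but it does need support in the solver frontend itself.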
This is in fact explicitly what I’d like to be able to do (not by default of course). Packages that haven’t been rebuilt in a while are often subtly incompatible (compare the recent libxml2 issues), and figuring out which feedstocks among a given set of dependencies haven’t been rebuilt in a while is a useful tool for chasing down resolver errors.
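For the debugging use case, something along these lines could list the dependencies that haven’t been rebuilt in a while, oldest first. Again only a sketch: the record layout (a `name` plus a `timestamp` in milliseconds since the Unix epoch), the package names, and the one-year threshold are assumptions for illustration.

```python
from datetime import datetime, timezone


def stale_packages(records, now, max_age_days):
    """Return (name, age_in_days) for builds older than max_age_days, oldest first."""
    report = []
    for record in records:
        # Assumption: timestamps are in milliseconds since the Unix epoch.
        built = datetime.fromtimestamp(record["timestamp"] / 1000, tz=timezone.utc)
        age_days = (now - built).days
        if age_days > max_age_days:
            report.append((record["name"], age_days))
    return sorted(report, key=lambda item: item[1], reverse=True)


def ms(year, month, day):
    """Helper: midnight UTC on the given day, in milliseconds."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp() * 1000)


# Hypothetical dependency set of one environment.
deps = [
    {"name": "stale-dep", "timestamp": ms(2020, 12, 19)},
    {"name": "fresh-dep", "timestamp": ms(2022, 12, 1)},
]

now = datetime(2022, 12, 19, tzinfo=timezone.utc)
for name, age in stale_packages(deps, now, max_age_days=365):
    print(f"{name}: last built {age} days ago")
```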