Pushing commit to master on total conversion failure
It is common enough that the conversion job fails due to some transient networking or web API issue (e.g. with the CIs) that we end up having to restart frequently. If at least a few recipes convert, we try to save our progress by removing the converted recipes and retriggering Travis CI to perform conversion again. However, we don't do this if conversion fails to convert any recipe. Instead we just fail on `master`, and it is up to someone with permissions to check, or some other user to ping the right people.
Given how common it is to have a transient failure on `master`, and the turnaround time on getting it fixed, it seems that it would make sense to just auto-retry. Perhaps we can limit it to a certain number of retries (maybe 5). To do this, the conversion process could retrigger itself by pushing a commit (ideally non-empty) to `master`, which would retrigger Travis CI. This commit could, for instance, write the current retry count to a file. If conversion ever succeeds (even partially), we clear the retry count file.
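A minimal sketch of what that retry bookkeeping could look like. This is only an illustration of the idea, not anything staged-recipes actually runs; the file name, commit message, and `MAX_RETRIES` value are all hypothetical choices:

```shell
#!/usr/bin/env bash
# Hypothetical retry bookkeeping for the conversion job.
RETRY_FILE=".conversion_retry_count"  # hypothetical file name
MAX_RETRIES=5

current_retries() {
    # Read the stored count, defaulting to 0 if the file is absent.
    cat "$RETRY_FILE" 2>/dev/null || echo 0
}

record_retry() {
    # Bump the count and commit it, so that pushing this (non-empty)
    # commit retriggers Travis CI.
    local n=$(( $(current_retries) + 1 ))
    echo "$n" > "$RETRY_FILE"
    git add "$RETRY_FILE"
    git commit -m "[ci retry] conversion attempt $n"
}

clear_retries() {
    # On any (even partial) conversion success, reset the counter.
    rm -f "$RETRY_FILE"
}

# Give up once the limit is reached instead of retrying forever.
if [ "$(current_retries)" -ge "$MAX_RETRIES" ]; then
    echo "Conversion failed $MAX_RETRIES times; giving up." >&2
fi
```

The only subtle part is that the commit must change something (here, the counter file), since an empty commit would otherwise be needed to wake the CI up.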
Thoughts on this idea?
Issue Analytics
- Created 7 years ago
- Comments: 11 (11 by maintainers)
Top GitHub Comments
Have enabled a daily cron job on this repo with Travis. I know that isn’t exactly what is proposed in this issue. However, we regularly have recipes sitting for a few days, so it seems like this would help and probably not hurt.
Thanks for the support.
To try to summarize what you are saying @ericdill, it sounds like you are concerned about race conditions in staged-recipes, correct? FWIW we already encounter these situations when several merges occur at staged-recipes in short order.
When it comes to trying to create the same feedstock around the same time, `conda-smithy` is smart enough to handle this case gracefully. The feedstock may get some extra commits as part of token encryption, but this doesn’t seem to have any ill effect on the feedstock, though it does trigger extra, unnecessary builds. FWIW this has basically been the case since the beginning.

As for the other issues, this definitely was very problematic a year ago. A very common case was for a conversion (or part of one) to complete, only to fail because it could not push to staged-recipes; IOW it was beaten by another conversion job. One would need to push empty commits just to restart the thing, which was pretty annoying. Another annoying issue was that some recipes would be converted to feedstocks but would not be cleared from staged-recipes if any conversion job failed. This resulted in some manual removals needing to be done.
Early last fall I made some changes to make staged-recipes a bit more robust in these various scenarios. First, it grabs the latest changes on `master` before starting conversion. Second, we convert as many feedstocks as we can, skipping over CI registration errors if they occur. Third, we pull in the latest `master` changes. Fourth, we push our changes back to `master`. Now these obviously don’t fix CI web API failures, nor do they stop race conditions from occurring, but they do make staged-recipes more robust to both and provide a few perks.

That said, the cron job is a good thing. Just noting we can still do more in this direction without worrying too much.
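The four steps above can be sketched roughly as follows. `convert_recipes` is a hypothetical placeholder for the actual conversion step (which, per the above, also skips individual CI registration errors rather than aborting on them):

```shell
# A rough sketch of the four-step flow described above.
# `convert_recipes` is a hypothetical placeholder command.
sync_convert_push() {
    local remote="$1" branch="$2"
    git pull --rebase "$remote" "$branch"  # 1. start from the latest master
    convert_recipes || true                # 2. convert as many as possible
    git pull --rebase "$remote" "$branch"  # 3. fold in merges that landed
                                           #    while we were converting
    git push "$remote" "$branch"           # 4. push converted changes back
}
```

The second pull before pushing is what narrows (but does not eliminate) the race window against other conversion jobs pushing to the same branch.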
TL;DR staged-recipes should be robust enough to handle repeated restarts.