Ignore missing submodules?

Proposed change

Don’t fail the build if a submodule is missing.

Example of failure (used to work, broke once a student removed their personal repo): https://mybinder.org/v2/gh/mdeff/ntds_2018/outputs?urlpath=lab.

Alternative options

Update the repository to remove the now-missing submodules. However, that has a maintenance cost and defeats the purpose of preserving the original state of a repository for reproducibility.

Who would use this feature?

People who freeze (archive) repositories for the sake of reproducibility. An old repo might depend on submodules that are not available anymore. This shouldn’t completely prevent people from building a container and running the code.

Downside: this effectively allows a build with missing dependencies. The problem is, however, more severe for submodules, because GitHub repositories are deleted more often than PyPI or conda packages. I would even proceed with missing PyPI or conda packages after emitting a warning (which should ideally be made more visible than in the build log).

(Off-topic, but a way to be notified of Binder build failures would be great. Manually checking that a build still works is suboptimal.)

How much effort will adding it take?

Easy. Check the return value of git submodule update --init --recursive, emit a warning if non-zero, and move on with the build.
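
A minimal sketch of that check, in Python since repo2docker is a Python project. The function name update_submodules and the injectable run parameter are hypothetical and only for illustration; this is not repo2docker's actual code, just the "check the return value, warn, move on" idea applied to the command named above.

```python
import subprocess
import warnings


def update_submodules(repo_dir, run=subprocess.run):
    """Try to initialise submodules; warn instead of failing the build.

    Hypothetical helper: `run` defaults to subprocess.run and is only
    injectable so the behaviour can be exercised without a real repository.
    Returns True if git exited with 0, False otherwise.
    """
    result = run(
        ["git", "submodule", "update", "--init", "--recursive"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Surface git's own error message, then let the build continue.
        details = (result.stderr or "").strip()
        warnings.warn(
            f"Submodule update failed (exit code {result.returncode}); "
            f"continuing without missing submodules. {details}"
        )
        return False
    return True
```

A caller would invoke update_submodules(checkout_dir) after cloning and carry on regardless of the return value, possibly repeating the warning somewhere more visible than the build log.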

Who can do this work?

Anybody with a shallow understanding of the codebase.

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
mih commented, May 4, 2021

One more data point in favor of being less strict about “missing” submodules. DataLad (http://handbook.datalad.org/) uses the submodule mechanism to specify subdataset components/dependencies. Scientific datasets, e.g. https://github.com/psychoinformatics-de/studyforrest-data, use this to link all components in a single top-level repository (which is the most useful entry point for demos). However, not all dataset components can have the same level of access (think personal data in a neuroimaging study), so some components will be inaccessible to a public Binder instance. They are not missing or invalid, though.

0 reactions
zerothi commented, Sep 9, 2020

Perhaps a clearer way would be to allow repo2docker to not initialise certain submodules, i.e. a configuration file that determines which submodules should be initialised and, optionally, whether that initialisation is allowed to fail. Say, in YAML:

submodules:
  submodules/a:
    - error: pass|error|warn

I have a repo where I don’t want to download the submodule.
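
A rough sketch of how such a per-submodule policy could be applied, assuming the configuration above has already been parsed into a plain dict mapping submodule path to one of pass, warn, or error. The function name init_submodules, the dict shape, and the injectable run parameter are all hypothetical; repo2docker has no such option as far as this thread shows. Submodules not listed in the dict are simply never initialised.

```python
import subprocess
import warnings


def init_submodules(repo_dir, policies, run=subprocess.run):
    """Initialise only the listed submodules, honouring a failure policy.

    `policies` maps a submodule path to 'pass', 'warn' or 'error'
    (the values proposed in the comment above). Returns the list of
    paths whose initialisation failed without aborting the build.
    """
    failed = []
    for path, policy in policies.items():
        result = run(
            ["git", "submodule", "update", "--init", "--", path],
            cwd=repo_dir,
            capture_output=True,
            text=True,
        )
        if result.returncode == 0:
            continue
        if policy == "error":
            # Strict mode: abort the build for this submodule.
            raise RuntimeError(f"Required submodule {path!r} could not be initialised")
        if policy == "warn":
            warnings.warn(f"Submodule {path!r} could not be initialised; continuing")
        # 'pass' (and 'warn') fall through: record the failure and move on.
        failed.append(path)
    return failed
```

For example, init_submodules(checkout_dir, {"submodules/a": "warn"}) would attempt only submodules/a and continue with a warning if it is unreachable.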
