Auto-pairing and synching new and edited files

I work on a shared repository with several other (non-technical) folk who use a variety of set-ups: some use notebooks with commit hooks set, some edit markdown documents in a text editor, and some scripts update ipynb files automatically. I was wondering whether there is a way of combining a set of jupytext commands into a GitHub Action that would work with a hybrid ipynb/md setup with the following broad structure:

- notebooks/
  - chapter1/
    - file.ipynb
    - .md
      - file.md

and then synchronise files as follows:

  • if file.ipynb and file.md are paired, updating one forces a sync operation that updates the paired document (so something as simple as jupytext --sync notebooks/* should do that? If I update .md/file.md, will jupytext --sync notebooks/file.ipynb then update the ipynb?);
  • if someone creates and commits a new file in the .md directory, pair it with a dynamically created ipynb document in the parent directory (essentially, run a --to ipynb on the md file);
  • if someone creates and commits a new .ipynb file in the notebooks/ dir, create a markdown file in .md and pair it with the notebook.

Ideally, anyone should be able to edit or create either an ipynb or an .md/md file and commit it to the repo, and the GitHub Action would ensure that paired files are created in the appropriate locations (if required) or synched otherwise. Files of a given filetype that are out of path (e.g. a markdown file in notebooks/) shouldn’t be processed and shouldn’t throw an error.

In my mind, something like jupytext --use-formats ipynb,.md//md --sync --force notebooks/** maybe, where --use-formats sets the mapping and the paths, --sync says to sync, --force says to run --to if the paired file is missing, and notebooks/** checks the notebooks directory tree? If the paired file doesn’t exist, create it. If, e.g., an .md file appears in notebooks (notebooks/example.md), then don’t pair it and don’t throw an error.
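The rules described in the question can be written down as a small decision procedure. The sketch below is only an illustration of that logic, not a Jupytext feature: `plan_action` and its `root` parameter are hypothetical names, it works on repo-relative paths only, and it does not call jupytext at all. It just reports whether a committed file should be synced with an existing counterpart, needs a counterpart created, or should be skipped.

```python
from pathlib import PurePosixPath


def plan_action(path, existing, root="notebooks"):
    """Decide what to do with one repo-relative file path.

    `existing` is the set of repo-relative paths currently in the repository.
    Returns (action, counterpart), where action is "sync", "create" or "skip".
    Hypothetical helper: it only plans, it never runs jupytext.
    """
    p = PurePosixPath(path)

    # Anything outside the notebooks tree is left alone.
    if PurePosixPath(root) not in p.parents:
        return ("skip", None)

    if p.suffix == ".ipynb" and p.parent.name != ".md":
        # chapter1/file.ipynb pairs with chapter1/.md/file.md
        partner = p.parent / ".md" / (p.stem + ".md")
    elif p.suffix == ".md" and p.parent.name == ".md":
        # chapter1/.md/file.md pairs with chapter1/file.ipynb
        partner = p.parent.parent / (p.stem + ".ipynb")
    else:
        # e.g. notebooks/example.md: out of path, no pairing and no error
        return ("skip", None)

    action = "sync" if str(partner) in existing else "create"
    return (action, str(partner))


repo = {
    "notebooks/chapter1/file.ipynb",
    "notebooks/chapter1/.md/file.md",
    "notebooks/chapter1/new.ipynb",
    "notebooks/example.md",
}
for name in sorted(repo):
    print(name, plan_action(name, repo))
```

A GitHub Action built on this idea would still need to translate each "sync" or "create" result into an actual jupytext invocation; which flags achieve that is exactly what the question asks, so that step is deliberately left out here.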

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 19 (9 by maintainers)

Top GitHub Comments

3 reactions
mwouts commented, Nov 22, 2021

Hi everyone, sorry for the delay in answering the latest comments.

@FeryET I like your use case, and there are a few reasons why it is rather challenging 😄

  1. You have ignored all the .ipynb files, and indeed that does not seem to work so well with the pre-commit framework. As @psychemedia mentioned we used to have the possibility to write custom pre-commit scripts, and a previous version of the doc did mention a use case in which .ipynb files were removed from the index.
  2. You use VS Code to edit your .ipynb files, but we don’t have a plugin yet for VS Code (I’d like to do something about this, see your issue #875).

Maybe I can comment on the Jupytext plugin for Jupyter (our “contents manager”). The synchronization between the two files happens as follows. When a paired notebook is opened, the inputs are loaded from the most recent file and merged with the outputs from the ipynb file (but the files don’t change on disk). When the paired notebook is saved, both files are updated with the content of the notebook (so, after a save, the two files are consistent and are in the same state as if you had run jupytext --sync). In your case, I guess you want to sync the two versions not only after a change in VS Code (py or ipynb file), but also when one of the files is changed externally, e.g. through a pull, or at least just before the file is opened in VS Code.
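The open-time behaviour described in this comment (inputs from the most recent file, outputs from the ipynb file) can be illustrated with a toy model. This is a simplified sketch and not Jupytext's actual implementation: the `Cell` class and `load_paired` function are hypothetical, outputs are matched to inputs by identical source text (an assumption made for the example), and the tie-break in favour of the ipynb file when timestamps are equal is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    source: str
    outputs: list = field(default_factory=list)


def load_paired(text_sources, text_mtime, ipynb_cells, ipynb_mtime):
    """Build the in-memory notebook for a paired text/ipynb couple.

    Inputs come from whichever file was modified most recently; outputs
    always come from the ipynb file. Nothing is written to disk here.
    """
    if ipynb_mtime >= text_mtime:
        # The ipynb file is the most recent one: use it as is.
        return [Cell(c.source, list(c.outputs)) for c in ipynb_cells]

    # The text file is more recent: take its inputs and reattach the
    # outputs of ipynb cells whose source is unchanged (toy matching rule).
    pool = {}
    for cell in ipynb_cells:
        pool.setdefault(cell.source, []).append(cell.outputs)

    merged = []
    for source in text_sources:
        candidates = pool.get(source)
        outputs = list(candidates.pop(0)) if candidates else []
        merged.append(Cell(source, outputs))
    return merged


ipynb = [Cell("x = 1"), Cell("print(x)", ["1"])]
edited_text = ["x = 2", "print(x)"]  # the text file was edited after the ipynb
for cell in load_paired(edited_text, 200.0, ipynb, 100.0):
    print(repr(cell.source), cell.outputs)
```

In this model, an edited cell loses its outputs while an unchanged cell keeps them, which mirrors the idea that the text file carries the inputs and the ipynb file carries the outputs.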

2 reactions
griai commented, May 3, 2022

Hi everyone, sorry for jumping into this discussion, which has already been open for quite some time. I just wanted to mention that we have a very similar use case (which, I believe, is not too esoteric): we use notebooks for several things where visualizations are necessary, but would like to track only pure Python source files in order to ease code reviews and to avoid checking in the notebook metadata. Therefore, we would like to git-ignore the corresponding notebooks, because otherwise we would just duplicate the code in our repo. However, people might want to use the proper notebook files (e.g. in PyCharm, where there is no jupytext plugin at the moment), which is why we wanted to use jupytext --sync in a pre-commit hook. When doing so, the jupytext hook always errors out, telling us to git-add the notebook files, which is exactly what we do not want. Is it possible for the jupytext pre-commit hook to respect the existing .gitignore file and not raise an error there?
