Help running napari with IPykernel on GitHub actions

Background

First the good news! Thanks to the efforts of the executable book community, and of @GenevieveBuckley, @potating-potato, and @tlambert03 in napari/napari.github.io#95, I’ve got a very nice workflow going for the skan docs: authoring content in markdown within a jupyter notebook, and executing it in GitHub Actions, including grabbing screenshots “invisibly” (using :tags: ["remove-input"] at the start of the code cell containing the napari.utils.nbscreenshot() call, as well as any calls setting up e.g. the camera position). I really think this is going to be amazing for our docs!

Also, aside: I don’t know when this started, but when you restart builds, GitHub Actions now lets you see previous builds!

Ok, the bad news: I am seeing cells sporadically time out when they interact with napari. Unfortunately, the timeouts happen frequently enough that you’d never get a complete docs build in napari. So I’m really interested in stamping them out!

I’m starting this as a thread for people to provide ideas, since I know a few people on here have dealt with timeouts in the past, and I’m volunteering the skan repo as a nice place to experiment, since it has just a single page that depends on napari.

Authoring with myst markdown

Here’s the PR implementing myst markdown on skan: jni/skan#151. Summary for a new page:

  • install jupytext
  • install myst-nb (both are in the docs requirements)
  • open a jupyter notebook
  • click file > jupytext > pair with myst-nb
  • work on your notebook as normal
  • when done, close it, delete the ipynb (the .md contains all the info)
  • commit the .md file
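For anyone who prefers a script to the notebook menu, the pairing and cleanup steps above can be approximated with jupytext's Python API. This is only a sketch: `paired_markdown_path` and `convert_to_myst` are hypothetical helper names, and it assumes jupytext (with MyST support) is installed as in the docs requirements.

```python
from pathlib import Path


def paired_markdown_path(notebook_path):
    """Return the .md path that sits next to the given .ipynb file."""
    return Path(notebook_path).with_suffix(".md")


def convert_to_myst(notebook_path, delete_ipynb=False):
    """Write a MyST markdown copy of a notebook (hypothetical helper).

    Assumes jupytext is installed; it is imported lazily so the path
    helper above stays usable without it.
    """
    import jupytext  # third-party dependency

    source = Path(notebook_path)
    target = paired_markdown_path(source)
    notebook = jupytext.read(str(source))
    # "md:myst" is jupytext's name for the MyST markdown format.
    jupytext.write(notebook, str(target), fmt="md:myst")
    if delete_ipynb:
        # The .md file contains all the info, so the .ipynb can go.
        source.unlink()
    return target
```

Calling `convert_to_myst("page.ipynb", delete_ipynb=True)` would leave only `page.md` to commit.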

For editing an existing page, you can either edit the markdown, or

  • launch jupyter notebook (again, ensuring you have jupytext and myst-nb installed)
  • click on the .md file
  • edit as usual

Here’s an example myst markdown source page, and here’s the rendered version.

Rendering napari windows on myst-nb

Any time you want to add a napari screenshot, you can add a cell with contents like so:

:tags: ["remove-input"]
viewer.camera.angles = (-30, 30, -135)
viewer.camera.zoom = 6.5
napari.utils.nbscreenshot(viewer)

This causes a screenshot to appear but the code to not be shown. You can see an example page with napari screenshots here, though the last screenshot is currently missing because of the timeout issue.
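In the markdown source, those contents sit inside a MyST-NB `{code-cell}` directive. The sketch below builds that text from a code snippet so the full shape of the cell is visible; `hidden_input_cell` is a hypothetical helper, and the `ipython3` label is an assumption about the notebook's language.

```python
FENCE = "`" * 3  # built this way to avoid nesting literal fences


def hidden_input_cell(code, language="ipython3"):
    """Return MyST-NB source for a code cell whose input is hidden."""
    lines = [
        f"{FENCE}{{code-cell}} {language}",
        ':tags: ["remove-input"]',  # hide the code, keep the output
        code.strip("\n"),
        FENCE,
    ]
    return "\n".join(lines)


screenshot_cell = hidden_input_cell(
    "viewer.camera.angles = (-30, 30, -135)\n"
    "viewer.camera.zoom = 6.5\n"
    "napari.utils.nbscreenshot(viewer)\n"
)
print(screenshot_cell)
```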

Rendering on CI

In addition to needing Talley’s “setup Qt libs” action, I also copied the steps for starting a display and setting the DISPLAY env variable from @GenevieveBuckley’s napari/napari.github.io#95.
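As a rough illustration of what "starting a display and setting DISPLAY" amounts to, here is a Python sketch that launches Xvfb and exports the variable. It is not the workflow from the linked PR: the helper names, the `:99` display number, and the screen geometry are all assumptions, and it requires Xvfb to be installed on the runner.

```python
import os
import shutil
import subprocess


def xvfb_command(display=":99", screen="1024x768x24"):
    """Build the Xvfb command line for a virtual display (assumed values)."""
    return ["Xvfb", display, "-screen", "0", screen]


def display_env(display=":99", base_env=None):
    """Return a copy of the environment with DISPLAY pointing at the display."""
    env = dict(os.environ if base_env is None else base_env)
    env["DISPLAY"] = display
    return env


def start_virtual_display(display=":99"):
    """Start Xvfb in the background and set DISPLAY for this process."""
    if shutil.which("Xvfb") is None:
        raise RuntimeError("Xvfb not found on PATH")
    process = subprocess.Popen(xvfb_command(display))
    os.environ["DISPLAY"] = display
    return process  # caller is responsible for process.terminate()
```

Any process launched afterwards with `env=display_env()` (for example the docs build) would then see the virtual display.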

Timeouts

So that just leaves the timeouts. They seem to happen arbitrarily on any cells that have napari interaction, not necessarily screenshot cells. Here are two builds during the PR adding that page:

  • In the first build, I get the message “ERROR: Execution Failed with traceback saved in /home/runner/work/skan/skan/doc/_build/html/reports/visualizing_3d_skeletons.log”. The build artifact is unfortunately not preserved, but, spoiler alert, it was a timeout error. Here’s the full traceback:
nbclient.exceptions.CellTimeoutError: A cell timed out while it was being executed, after 300 seconds.
The message was: Cell execution timed out.
Here is a preview of the cell contents:
-------------------
skeleton_layer.edge_color = 'branch-distance'
skeleton_layer.edge_colormap = 'viridis'
# for now, we need to set the face color as well
skeleton_layer.face_color = 'branch-distance'
skeleton_layer.face_colormap = 'viridis'
-------------------
  File "/opt/hostedtoolcache/Python/3.9.10/x64/lib/python3.9/site-packages/nbclient/client.py", line 618, in _async_poll_for_reply
    msg = await ensure_async(self.kc.shell_channel.get_msg(timeout=new_timeout))
  File "/opt/hostedtoolcache/Python/3.9.10/x64/lib/python3.9/site-packages/nbclient/util.py", line 96, in ensure_async
    result = await obj
  File "/opt/hostedtoolcache/Python/3.9.10/x64/lib/python3.9/site-packages/jupyter_client/channels.py", line 230, in get_msg
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.9.10/x64/lib/python3.9/site-packages/jupyter_cache/executors/utils.py", line 51, in single_nb_execution
    executenb(
  File "/opt/hostedtoolcache/Python/3.9.10/x64/lib/python3.9/site-packages/nbclient/client.py", line 1085, in execute
    return NotebookClient(nb=nb, resources=resources, km=km, **kwargs).execute()
  File "/opt/hostedtoolcache/Python/3.9.10/x64/lib/python3.9/site-packages/nbclient/util.py", line 84, in wrapped
    return just_run(coro(*args, **kwargs))
  File "/opt/hostedtoolcache/Python/3.9.10/x64/lib/python3.9/site-packages/nbclient/util.py", line 62, in just_run
    return loop.run_until_complete(coro)
  File "/opt/hostedtoolcache/Python/3.9.10/x64/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/opt/hostedtoolcache/Python/3.9.10/x64/lib/python3.9/site-packages/nbclient/client.py", line 551, in async_execute
    await self.async_execute_cell(
  File "/opt/hostedtoolcache/Python/3.9.10/x64/lib/python3.9/site-packages/nbclient/client.py", line 830, in async_execute_cell
    exec_reply = await self.task_poll_for_reply
  File "/opt/hostedtoolcache/Python/3.9.10/x64/lib/python3.9/site-packages/nbclient/client.py", line 642, in _async_poll_for_reply
    await self._async_handle_timeout(timeout, cell)
  File "/opt/hostedtoolcache/Python/3.9.10/x64/lib/python3.9/site-packages/nbclient/client.py", line 689, in _async_handle_timeout
    raise CellTimeoutError.error_from_timeout_and_cell(
nbclient.exceptions.CellTimeoutError: A cell timed out while it was being executed, after 300 seconds.
The message was: Cell execution timed out.
Here is a preview of the cell contents:
-------------------
skeleton_layer.edge_color = 'branch-distance'
skeleton_layer.edge_colormap = 'viridis'
# for now, we need to set the face color as well
skeleton_layer.face_color = 'branch-distance'
skeleton_layer.face_colormap = 'viridis'
-------------------

The workflow on main after merging the PR got a timeout in the final cell.
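Since the build artifact is not preserved, the log text is all that is left to inspect, so it can help to pull the timeout value and the offending cell out of it automatically. Below is a minimal standard-library sketch; `find_cell_timeouts` is a hypothetical helper, and the regular expression assumes the message layout shown in the traceback above.

```python
import re

# Matches the CellTimeoutError message and the dashed-line cell preview.
TIMEOUT_PATTERN = re.compile(
    r"CellTimeoutError: A cell timed out while it was being executed, "
    r"after (\d+) seconds\.\n"
    r".*?\n-{5,}\n"   # skip ahead to the opening dashed line
    r"(.*?)\n-{5,}",  # capture the cell preview up to the closing line
    re.DOTALL,
)


def find_cell_timeouts(log_text):
    """Return unique (timeout_seconds, cell_preview) pairs found in a log."""
    matches = TIMEOUT_PATTERN.findall(log_text)
    # The same message appears twice in a chained traceback, so
    # de-duplicate while keeping the original order.
    unique = dict.fromkeys(matches)
    return [(int(seconds), preview) for seconds, preview in unique]
```

Run over several build logs, this would show whether the timeouts cluster on particular cells or really do land arbitrarily.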

Help!

If anyone has ideas on how to deal with the timeouts, I’m all ears and deeply interested in implementing them! Thank you all! 🙏

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

jni commented, Feb 13, 2022 (1 reaction)

@tlambert03

Here’s a diff that I would apply

This was on the right track but not quite! TIL that steps and jobs can have a working-directory: tag, and that xvfb-action can’t do bashy things like if/then/fi, and that each xvfb run step is run independently of the others so cd has no effect, and that xvfb-action also has a working-directory: parameter. 😜 You can see the final diff at jni/skan#156.

Anyway, after that change I’ve had two successful builds without timeouts, so maybe that’s fixed it…??? 🤞 Gonna keep fiddling, I’ll very happily close this if I get a few more uneventful builds…!

Possibly related: on the first screenshot cell I get the following warning (this was the case before the xvfb-action change also):

WARNING: QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-runner'

Do I just add XDG_RUNTIME_DIR='/tmp/runtime-runner' to my env? Or is there something cleaner here?
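One possible approach, not confirmed anywhere in this thread, is to create a private directory and point the variable at it before any Qt code runs. The sketch below uses only the standard library; `ensure_xdg_runtime_dir` is a hypothetical helper, and whether it has any effect on the timeouts is an open question.

```python
import os
import tempfile


def ensure_xdg_runtime_dir():
    """Set XDG_RUNTIME_DIR to a private directory if it is not usable."""
    existing = os.environ.get("XDG_RUNTIME_DIR")
    if existing and os.path.isdir(existing):
        return existing
    # mkdtemp creates a directory accessible only by the current user.
    path = tempfile.mkdtemp(prefix="runtime-")
    os.environ["XDG_RUNTIME_DIR"] = path
    return path


runtime_dir = ensure_xdg_runtime_dir()
```

This would need to run in the same process as napari (or in a parent process whose environment is inherited) and before the Qt application is created.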

jni commented, Feb 13, 2022 (0 reactions)

Dang, 3/4 built without error, but timeout happened again.

https://github.com/jni/skan/runs/5171591210?check_suite_focus=true#step:7:46
