Implement liveness check for notebook extensions

Currently, each TensorBoard process writes its meta-information to a file in the shared .tensorboard-info temp directory, and tries to clean up the file on graceful exit. This has two problems on Windows:

  • The base temporary directory %TMP% is never automatically cleaned, even after logout or reboot.
  • It is not possible to gracefully shut down an arbitrary process given its PID.

The result is that almost any TensorBoard started by %tensorboard leaves its info file around forever unless it is manually cleaned up, and the instructions suggested to the user (“use !kill …”) are not adequate to effect this cleanup. See #2481.

We can ameliorate this by implementing the liveness check mentioned in a TODO in manager.py, cleaning up the dead info file on failure:

https://github.com/tensorflow/tensorboard/blob/c1c2771a5ed5ce1b9ad605dd1834616905c2f190/tensorboard/manager.py#L453-L458
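As a rough illustration of what such a liveness check could look like, here is a minimal sketch using only the standard library. It is not TensorBoard's implementation: the helper names `is_port_live` and `prune_dead_info_files` are hypothetical, and it assumes each info file is a JSON object with a `port` field. Liveness is approximated by whether anything accepts a TCP connection on the recorded port.

```python
import json
import os
import socket


def is_port_live(port, host="localhost", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def prune_dead_info_files(info_dir, host="localhost"):
    """Delete info files whose recorded port no longer accepts connections.

    Assumes (hypothetically) that each file holds a JSON object with a
    "port" key. Returns the parsed info dicts that are still live.
    """
    live = []
    for name in os.listdir(info_dir):
        path = os.path.join(info_dir, name)
        try:
            with open(path) as f:
                info = json.load(f)
            port = int(info["port"])
        except (OSError, ValueError, KeyError, TypeError):
            continue  # unreadable or malformed: leave it alone in this sketch
        if is_port_live(port, host=host):
            live.append(info)
        else:
            try:
                os.remove(path)
            except OSError:
                pass  # another process may have removed it first
    return live
```

One limitation of this approach: an open port only shows that some process is listening there, not that it is the same TensorBoard that wrote the info file, so a real check would probably need to verify the server's identity as well.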

We can also provide a notebook.kill(pid) function with an implementation vaguely like

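import os
import signal
import subprocess
from tensorboard import manager

def kill(pid):
  if os.name == "nt":
    # Windows cannot shut the process down gracefully, so force-kill it
    # and remove its info file ourselves (proposed signature taking a pid).
    subprocess.check_output(["taskkill", "/pid", str(int(pid)), "/f"])
    manager.remove_info_file(pid)
  else:
    os.kill(pid, signal.SIGTERM)  # graceful exit cleans up its own file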

and then replace the “reusing TensorBoard” user-facing messaging on Windows with something like

template = (
    "Reusing TensorBoard on port {port} (pid {pid}), started {delta} ago. "
    "To kill it, run `from tensorboard import notebook; notebook.kill({pid})`."
)

though that incantation still is a bit of a mouthful.
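To show how the proposed message would read once filled in, here is a small sketch. The helper `reuse_message` and the sample values are hypothetical. It assumes the start time is available as a `datetime` and renders `delta` as a `timedelta` truncated to whole seconds.

```python
import datetime

TEMPLATE = (
    "Reusing TensorBoard on port {port} (pid {pid}), started {delta} ago. "
    "To kill it, run `from tensorboard import notebook; notebook.kill({pid})`."
)


def reuse_message(port, pid, start_time, now=None):
    """Format the reuse message, rounding the age down to whole seconds."""
    if now is None:
        now = datetime.datetime.now()
    age = now - start_time
    delta = datetime.timedelta(seconds=int(age.total_seconds()))
    return TEMPLATE.format(port=port, pid=pid, delta=delta)


start = datetime.datetime(2021, 8, 10, 12, 0, 0)
now = datetime.datetime(2021, 8, 10, 12, 5, 12)
print(reuse_message(6006, 1234, start, now=now))
```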

Issue Analytics

  • State: open
  • Created 4 years ago
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

rmcrowley2000 commented, Aug 10, 2021

For those on Windows trying to do this from Jupyter, the following two commands will automate the process. You can just toss them in a cell and run them after you are done with the open TensorBoard instance (or before you need to check a different log file).

!taskkill /IM "tensorboard.exe" /F
!rmdir /S /Q %temp%\.tensorboard-info
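
For the directory cleanup step, a platform-independent alternative could be sketched in Python as below. The helper names `info_dir` and `clear_info_dir` are hypothetical, and the sketch assumes the info directory is `.tensorboard-info` under the directory returned by `tempfile.gettempdir()`. It only removes files. It does not stop any running TensorBoard process, so those should be killed first.

```python
import os
import shutil
import tempfile


def info_dir(base=None):
    """Path of the shared info directory (assumed to live under the temp dir)."""
    return os.path.join(base or tempfile.gettempdir(), ".tensorboard-info")


def clear_info_dir(base=None):
    """Remove the info directory and its contents.

    Returns True if the directory existed before the call, else False.
    """
    path = info_dir(base)
    existed = os.path.isdir(path)
    shutil.rmtree(path, ignore_errors=True)
    return existed
```

Calling `clear_info_dir()` with no arguments targets the current user's temp directory. The `base` parameter exists mainly so the function can be tried out safely against a scratch directory.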

mthomp89 commented, Jul 21, 2020

Same issue: TensorBoard is not reliable within Jupyter. Based on this thread I did the following, with marginal success. TensorBoard worked only once after the cleanup.

  • Ran "taskkill /im tensorboard.exe /f" at the command prompt to kill all live TensorBoard processes.
  • Deleted all the pid-xxxx.info files in the "%TMP%\.tensorboard-info" directory.
  • Deleted the whole "%TMP%\.tensorboard-info" directory.