
Facing issue while converting tensorboard logs to Aim

See original GitHub issue

🐛 Bug

I am trying to convert a TensorBoard event log file to an Aim Run, but I get the error below.

One more question: is there a way to sync TensorBoard logs in real time, that is, in parallel while training is still in progress? Currently, syncing is a CLI command that we run once the TensorBoard logs are in place.

Many thanks!

The lock file /mnt/c/sharath_mk/ubuntu/aim/.aim/.repo_lock is on a filesystem of type `drvfs` (device id: 14). Using soft file locks to avoid potential data corruption.
2022-07-25 15:21:57.067693: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-07-25 15:21:57.067771: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Converting TensorBoard logs:   0%|                                                                                                                         | 0/1 [00:00<?, ?it/sWARNING:tensorflow:From /home/miniconda3/lib/python3.8/site-packages/tensorflow/python/summary/summary_iterator.py:27: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`
Parsing logs in /mnt/c/sharath_mk/ubuntu/aim/tensorboard/run_tb_sync/test_tb:   0%|                                                              | 0/2 [00:00<?, ?it/s]
Converting TensorBoard logs:   0%|                                                                                                                         | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/miniconda3/bin/aim", line 8, in <module>
    sys.exit(cli_entry_point())
  File "/home/miniconda3/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/miniconda3/lib/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/miniconda3/lib/python3.8/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/miniconda3/lib/python3.8/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/miniconda3/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/miniconda3/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/miniconda3/lib/python3.8/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/miniconda3/lib/python3.8/site-packages/aim/cli/convert/commands.py", line 39, in convert_tensorboard
    parse_tb_logs(logdir, repo_inst, flat, no_cache)
  File "/home/miniconda3/lib/python3.8/site-packages/aim/cli/convert/processors/tensorboard.py", line 220, in parse_tb_logs
    track_val = value.tensor.float_val[0]
IndexError: list index (0) out of range
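
The traceback ends at `value.tensor.float_val[0]`, which raises `IndexError` whenever `float_val` is empty. One plausible cause, not confirmed in this thread, is that the event stores its value in the tensor's packed `tensor_content` bytes instead of in `float_val`. The sketch below is a minimal, hypothetical guard and not Aim's actual fix: `scalar_from_tensor` is an illustrative name, the `SimpleNamespace` objects stand in for real `TensorProto` messages, and the packed payload is assumed to be a little-endian float32.

```python
import struct
from types import SimpleNamespace


def scalar_from_tensor(tensor):
    """Return a float from a TensorProto-like object, or None if none is found.

    Assumption: `tensor` exposes `float_val` (a sequence of floats) and
    `tensor_content` (raw bytes), as a TensorProto message does.
    """
    # Case 1: the value is stored in the repeated float field.
    if len(tensor.float_val) > 0:
        return float(tensor.float_val[0])
    # Case 2 (assumed): the value is packed as a little-endian float32.
    if len(tensor.tensor_content) >= 4:
        return struct.unpack("<f", tensor.tensor_content[:4])[0]
    # Neither field holds a float, so the caller can skip this value.
    return None


# Stand-ins for real TensorProto messages, for illustration only.
in_field = SimpleNamespace(float_val=[0.5], tensor_content=b"")
packed = SimpleNamespace(float_val=[], tensor_content=struct.pack("<f", 0.25))
empty = SimpleNamespace(float_val=[], tensor_content=b"")

for stand_in in (in_field, packed, empty):
    print(scalar_from_tensor(stand_in))
```

Returning `None` instead of raising lets a converter skip values it cannot decode rather than abort the whole run. With TensorFlow installed, `tf.make_ndarray(value.tensor)` is probably a more robust way to decode the tensor, since it also takes the dtype into account.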

To reproduce

Write a TensorBoard event log file and convert it to Aim.


Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
Sharathmk99 commented, Jul 31, 2022

I’ll open a separate issue to track this request.

0 reactions
Sharathmk99 commented, Jul 31, 2022

@Sharathmk99 I would appreciate that a lot. Regarding the real-time conversion, I guess setting up a cron job that calls the convert command every 5 minutes, for example, would do the trick in the short term. Would that help you? I’ll check that out myself and will let you know if it works as expected.

It would be better for the Run class to accept a new parameter, sync_tensorboard_dir, that takes a TensorBoard event log directory and starts a separate thread to monitor file events and sync whenever something changes. Every subfolder inside the TensorBoard event log directory becomes an entity. What do you think, @mihran113?
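
Both ideas in this comment, the periodic cron job and the background monitoring thread, can be approximated with a small polling watcher. The sketch below is only an illustration under stated assumptions: `snapshot`, `watch`, and `start_watcher` are hypothetical helper names, not part of Aim, and `on_change` is a caller-supplied callback that would trigger the actual conversion. Polling is swapped in here for real file-system event notifications to keep the example dependency-free.

```python
import os
import threading


def snapshot(logdir):
    """Map every file under logdir to its (size, mtime_ns) pair."""
    state = {}
    for root, _dirs, files in os.walk(logdir):
        for name in files:
            path = os.path.join(root, name)
            try:
                stat = os.stat(path)
            except FileNotFoundError:
                continue  # file vanished between the walk and the stat
            state[path] = (stat.st_size, stat.st_mtime_ns)
    return state


def watch(logdir, on_change, stop_event, interval, previous):
    """Poll logdir and call on_change(logdir) whenever its files change."""
    # Event.wait returns False on timeout and True once stop_event is set.
    while not stop_event.wait(interval):
        current = snapshot(logdir)
        if current != previous:
            on_change(logdir)
            previous = current


def start_watcher(logdir, on_change, interval=300.0):
    """Start a daemon thread that watches logdir; return (thread, stop_event)."""
    stop_event = threading.Event()
    # Take the baseline before the thread starts so no early write is missed.
    baseline = snapshot(logdir)
    thread = threading.Thread(
        target=watch,
        args=(logdir, on_change, stop_event, interval, baseline),
        daemon=True,
    )
    thread.start()
    return thread, stop_event
```

The default interval of 300 seconds mirrors the 5-minute cron suggestion. In practice `on_change` could invoke the convert command shown in the traceback, for example through `subprocess.run`; the exact command line is left to the caller. Calling `stop_event.set()` ends the loop cleanly once training finishes.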
