Multiple synchronization of a running offline job

Hi,

I’m working with Slurm, and my compute nodes do not have internet access, so I’m using Neptune’s offline mode.

I was wondering whether it is possible to sync an offline run while it is still running. In my case Neptune writes to a shared disk, so I can sync the .neptune folder from another node that has internet access. I’d like to monitor my training while it runs, to see early on how it performs.

I have tried neptune sync, and it seems to work, but only the first time. When I synchronize again, it does not upload any more logs until the program ends.

Reproduction

Here is an example that reproduces what I experienced, on a single computer with internet access:

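import neptune.new as neptune

# Offline mode: nothing is sent to the server while the script runs.
run = neptune.init(mode="offline")

# Log one value per prompt; the loop runs until the script is interrupted.
while True:
    x = float(input("New log:"))
    run["inputs"].log(x)

In a shell, let’s run the script and enter a few values: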

# Shell 1
$ python my_script.py
New log: 0.5
New log: 1.5
New log: 2.5

After this, let’s synchronize the run while keeping the script running:

# Shell 2
$ neptune sync
Offline run run__01ee0d02-dfcc-4424-a0ce-d99cec75733a registered as raphaelreme/MyExp/TRAC-10
Synchronising raphaelreme/MyExp/TRAC-10
Synchronization of run raphaelreme/MyExp/TRAC-10 completed.

Now I see my new run, with three inputs: 0.5, 1.5, 2.5.

But if I continue to log data in my script:

# Shell 1
New log: 2.5  # From before
New log: 3.5
New log: 4.5

And try to sync it again:

# Shell 2
$ neptune sync
Synchronising raphaelreme/MyExp/TRAC-10
Synchronization of run raphaelreme/MyExp/TRAC-10 completed.

Nothing new appears in my dashboard: it still shows only the three inputs (0.5, 1.5, 2.5), with no sign of 3.5 or 4.5.

I found that stopping the script and then synchronizing uploads the missing data. But if my training is long, I don’t want to wait until the end to see the training curves. Even though I have other ways to see the logs, it would be nice to be able to manually sync Neptune while the training is running.
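
The manual "Shell 2" step above can be automated from the node that has internet access. What follows is a minimal sketch, not a fix: the only Neptune command it uses is the neptune sync call shown in the report, and the helper name sync_periodically and its parameters (interval_s, rounds, runner, sleep) are hypothetical names introduced here for illustration. Given the behaviour described above, repeated calls may still upload nothing new until the logging script stops, so this mainly makes the issue easier to observe over time.

```python
import subprocess
import time
from typing import Callable, List, Sequence


def _run_command(command: Sequence[str]) -> int:
    """Run a command in the current working directory and return its exit code."""
    return subprocess.run(list(command)).returncode


def sync_periodically(
    command: Sequence[str] = ("neptune", "sync"),  # CLI call taken from the report
    interval_s: float = 60.0,                      # hypothetical default
    rounds: int = 3,                               # hypothetical default
    runner: Callable[[Sequence[str]], int] = _run_command,
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Run `command` `rounds` times, pausing `interval_s` seconds between runs.

    Returns the exit code of each run, in order.
    """
    exit_codes: List[int] = []
    for i in range(rounds):
        exit_codes.append(runner(command))
        # Do not sleep after the final round.
        if i < rounds - 1:
            sleep(interval_s)
    return exit_codes
```

Calling sync_periodically(interval_s=300, rounds=12) would attempt a sync every five minutes for about an hour. This assumes the process is started from the same directory in which neptune sync was run by hand in Shell 2. The runner and sleep parameters exist only so the loop can be exercised without the Neptune CLI installed.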

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 6 (5 by maintainers)

Top GitHub Comments

1 reaction
Blaizzy commented, May 26, 2022

Hi @raphaelreme,

Prince here, from Neptune.ai.

Thanks for reaching out!

That’s an interesting use case. I will run a few tests, perhaps talk to the devs, and get back to you!

0 reactions
Blaizzy commented, Aug 9, 2022

Hi @raphaelreme,

I’m closing this issue as it’s a bit stale.

Feel free to leave a comment here if you still need help with this, so we can re-open the issue.

Until then, have a great day.
