bug: `Task.set_initial_iteration` doesn't affect the server

I want to report two problems:

  1. The docs for the iteration parameter of the Logger.report_* methods do not match the actual behavior.
  2. Setting the initial iteration doesn’t affect the server.

For instance, if I have a clearml.Task instance that is being resumed from iteration 100, the initial iteration (as returned by Task.get_initial_iteration) is 100. Let's say I want to report a scalar for iteration 105 by calling Logger.report_scalar:

clearml_logger.report_scalar(value, iteration=105)

However, it gets plotted at iteration 205 on the server (i.e. 100 + 105). The docs should therefore state that iteration is really an offset from the initial iteration. There are two ways to get the value plotted at iteration 105:

  1. Manually calculate the offset:
    clearml_logger.report_scalar(value, iteration=105 - clearml_task.get_initial_iteration())
    
    This works for all manually reported summaries, but not for the automatically reported ones (GPU / CPU / memory utilization; see #440).
  2. Set the initial iteration to 0 up front and rely on an external mechanism to recover the current iteration. This is broken, however: when I do this, my local clearml_task instance is updated properly, but
    clearml_logger.report_scalar(value, iteration=105)
    
    still causes the server to show that summary value at iteration 205, not 105.
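
The arithmetic behind the reported behavior and the first workaround can be sketched in plain Python. This is a minimal illustration that assumes the server adds the initial iteration to every reported iteration, as described above; it is not ClearML's actual implementation. The helper names `plotted_iteration`, `offset_for`, and `report_scalar_at` are hypothetical. The snippets in this issue abbreviate the reporting call, so the wrapper passes `title` and `series` as well, which `Logger.report_scalar` expects alongside `value` and `iteration`.

```python
def plotted_iteration(initial_iteration: int, reported_iteration: int) -> int:
    """Iteration the server shows, under the offset behavior described in this issue."""
    return initial_iteration + reported_iteration


def offset_for(absolute_iteration: int, initial_iteration: int) -> int:
    """Value to pass as `iteration` so the point lands on `absolute_iteration`."""
    return absolute_iteration - initial_iteration


def report_scalar_at(logger, task, title, series, value, absolute_iteration):
    """Hypothetical wrapper for workaround 1.

    `logger` is expected to behave like a clearml Logger and `task` like a
    clearml Task; only `report_scalar` and `get_initial_iteration` are used.
    """
    logger.report_scalar(
        title=title,
        series=series,
        value=value,
        iteration=offset_for(absolute_iteration, task.get_initial_iteration()),
    )


if __name__ == "__main__":
    initial = 100  # task resumed from iteration 100
    # Passing the absolute iteration directly lands on the wrong x-value.
    print(plotted_iteration(initial, 105))
    # Passing the offset lands on the intended iteration.
    print(plotted_iteration(initial, offset_for(105, initial)))
```

As noted in the first workaround, a wrapper like this only covers values reported by hand; the automatically reported utilization metrics never pass through it.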

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

patzm commented, Sep 1, 2021 (1 reaction)

I think this comment is a good summary of the idea already: https://github.com/allegroai/clearml/issues/440#issuecomment-908284422

bmartinn commented, Aug 31, 2021 (1 reaction)

I like the idea. I would see (2) as an extension to (1) though. I would still implement set_last_checkpoint (and thus expose), and then simply call it when detecting that a checkpoint was written

Agreed! Let’s call it a new feature request 😃 @patzm Could you close this issue and update the feature request issue with the discussed design?
