** Environment **

  • OS: Ubuntu 22.04 LTS
  • GPU:
  *-display                 
       description: VGA compatible controller
       product: GP102 [GeForce GTX 1080 Ti]
       vendor: NVIDIA Corporation
       physical id: 0
       bus info: pci@0000:01:00.0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress vga_controller bus_master cap_list rom
       configuration: driver=nvidia latency=0
       resources: irq:58 memory:fa000000-faffffff memory:c0000000-cfffffff memory:d0000000-d1ffffff ioport:e000(size=128) memory:c0000-dffff
  *-graphics
       product: EFI VGA
       physical id: 2
       logical name: /dev/fb0
       capabilities: fb
       configuration: depth=32 resolution=1024,768
  • Cuda: 11.7
  • Composer: 0.8.2
  1. loss should specify micro_batch instead of batch
  2. Cannot log multiple losses, allow me to return dictionary of losses
  3. Cannot log things at batch level (not micro-batch level) inside of loss method
  4. (Bug) grad_accum fails with CUDA OOM even though batch_size=1 w/ no grad_accum works
  5. (Bug) Nothing is printed to indicate that Composer is restarting the forward method when grad_accum="auto" is set

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 13 (8 by maintainers)

Top GitHub Comments

hanlint commented, Aug 5, 2022

@mvpatel2000 for the auto grad accum issue, it may be that we need to do a CUDA cache clear after each grad accum adjustment; otherwise the repeated restarts can cause memory fragmentation on the GPU. This can be triggered by increasing the image resolution such that a known small batch size just fits into memory, then greatly increasing the batch size to force multiple grad accum adjustments.
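To make the suggestion concrete, here is a minimal sketch of a retry loop that doubles grad_accum after an out-of-memory error, prints a message on each restart, and clears the CUDA cache before retrying. This is not Composer's actual implementation: the function names (`fit_batch_with_auto_grad_accum`, `run_microbatches`), the doubling policy, and the string-based OOM check are all assumptions made for illustration. The only real library call is `torch.cuda.empty_cache()`, and it is skipped when PyTorch is not installed.

```python
from typing import Callable


def _default_clear_cache() -> None:
    """Release cached CUDA blocks if PyTorch with CUDA is available."""
    try:
        import torch
    except ImportError:
        return  # nothing to clear without PyTorch
    if torch.cuda.is_available():
        torch.cuda.empty_cache()


def _is_oom(err: RuntimeError) -> bool:
    # Assumption: OOM errors are recognised by their message text.
    return "out of memory" in str(err).lower()


def fit_batch_with_auto_grad_accum(
    run_microbatches: Callable[[int], None],  # hypothetical: runs one batch split into N microbatches
    batch_size: int,
    grad_accum: int = 1,
    clear_cache: Callable[[], None] = _default_clear_cache,
) -> int:
    """Retry a batch with a larger grad_accum until it fits; return the value that worked."""
    while True:
        try:
            run_microbatches(grad_accum)
            return grad_accum
        except RuntimeError as err:
            # Re-raise anything that is not OOM, or OOM at microbatch size 1.
            if not _is_oom(err) or grad_accum >= batch_size:
                raise
            new_grad_accum = min(grad_accum * 2, batch_size)
            # Tell the user that the batch is being restarted.
            print(
                f"CUDA OOM with grad_accum={grad_accum}; "
                f"restarting batch with grad_accum={new_grad_accum}"
            )
            grad_accum = new_grad_accum
            # Clear the cache so repeated restarts do not build on fragmented memory.
            clear_cache()
```

The cache clear is injected as a parameter so the loop can be exercised without a GPU; in real training the default would be used.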

abhi-mosaic commented, Aug 2, 2022

Hey @vedantroy , I’m having a bit of trouble reproducing the LR issue. Could you try running it again and printing the logs to console via ProgressBarLogger(progress_bar=False, log_to_console=True, log_level='BATCH') and report back what the raw float values for lr are?

I’m trying to figure out whether something is missing in my testing, or whether WandB is just truncating the float values it logs. For example, in the early stage of a cosine decay the true value might be 0.9999912… but it may just be displayed as 1.0.
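A small sketch of the display effect described here, assuming the textbook cosine decay multiplier 0.5 * (1 + cos(pi * t)) that decays to zero. Composer's own scheduler may differ in its details, and `cosine_multiplier` is a name made up for this example. It shows that the value one step into a 1000-step schedule is already below 1.0, yet rounds to 1.0000 when printed with four decimals.

```python
import math


def cosine_multiplier(step: int, total_steps: int) -> float:
    """Textbook cosine decay from 1.0 down to 0.0 (assumed form, not Composer's code)."""
    frac = step / total_steps
    return 0.5 * (1.0 + math.cos(math.pi * frac))


raw = cosine_multiplier(step=1, total_steps=1000)

print(repr(raw))     # full-precision value, slightly below 1.0
print(f"{raw:.4f}")  # rounded for display: 1.0000
```

Printing the raw float, as requested above, is what distinguishes a schedule that is not decaying from one whose early values are only rounded in the dashboard.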
