
Textual Inversion Broken: it updates entire `embeddings` weights, loss diverges

See original GitHub issue

Describe the bug

The weights of the entire text_encoder evolve over the course of training, which breaks the text_encoder. I’m not sure why yet, but this in turn breaks Textual Inversion.

To demonstrate it,

1.) Save a random token id and its embedding outside the main loop:

    # snapshot of the full embedding matrix, taken before training starts
    token_embed_w_copy = text_encoder.get_input_embeddings().weight.data.clone().detach().requires_grad_(False).to(accelerator.device)

    # a token that never appears in the training prompts
    test_tok_id = tokenizer.convert_tokens_to_ids('alligator')
    test_tok = token_embed_w_copy[test_tok_id]

2.) Inside the loop, assert that it’s not changing:

    # inside the training loop: the row for the unused token should still
    # match the snapshot taken before training
    test_tok_actual = text_encoder.get_input_embeddings().weight.data[test_tok_id]
    assert(torch.allclose(test_tok, test_tok_actual))
    # BREAKS!

The assertion passes until an entire batch completes, at which time the embeddings diverge.

The training script currently tries to solve this by zeroing the gradients of all non-placeholder_token embedding rows, but this (or something else) fails to keep the weights from updating.
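
For reference, that gradient-zeroing step looks roughly like the sketch below (names such as `placeholder_token_id` and `optimizer` are assumptions standing in for the script’s actual variables):

    # sketch: after loss.backward(), zero the gradient of every embedding row
    # except the placeholder token's, then let the optimizer step
    grads = text_encoder.get_input_embeddings().weight.grad
    index_no_update = torch.arange(grads.shape[0], device=grads.device) != placeholder_token_id
    grads[index_no_update] = 0.0
    optimizer.step()
    optimizer.zero_grad()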

I’ve confirmed that this is what breaks TI: manually copying back the entire set of non-placeholder weights after every batch fixes it. But that’s duct tape, really, and I’m hoping someone has a better idea.
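
Roughly, that copy-back looks like the following sketch (reusing `token_embed_w_copy` from step 1; `placeholder_token_id` is again an assumed name):

    # sketch of the duct-tape fix: after optimizer.step(), restore every
    # embedding row except the placeholder token's from the saved snapshot
    with torch.no_grad():
        emb_weights = text_encoder.get_input_embeddings().weight
        index_no_update = torch.arange(emb_weights.shape[0], device=emb_weights.device) != placeholder_token_id
        emb_weights.data[index_no_update] = token_embed_w_copy[index_no_update]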

EDIT: this does not actually solve it. It helps a little, it seems, but the loss still random-walks / diverges. I can even zero out all the gradients each step and it still behaves strangely.

System Info

Debian, Python 3.9.2, revision b2b3b1a8ab83b020ecaf32f45de3ef23644331cf

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 12 (9 by maintainers)

Top GitHub Comments

4 reactions
patil-suraj commented, Sep 28, 2022

Working on the fix, should be ready by the end of the week, sorry to get back to this only now!

3 reactions
patil-suraj commented, Sep 16, 2022

You are right, @JunnYu! The weight_decay indeed updates the whole embedding matrix. Will send a fix soon.
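
For context, this is exactly how decoupled weight decay behaves: AdamW multiplies every parameter by (1 - lr * weight_decay) on each step, independently of its gradient, so zeroing the gradients of the other embedding rows is not enough to freeze them. A minimal standalone sketch (separate from the training script) that shows the effect:

    import torch

    # a tiny stand-in for the embedding matrix
    emb = torch.nn.Parameter(torch.ones(4, 3))
    opt = torch.optim.AdamW([emb], lr=0.1, weight_decay=0.1)

    (emb * 0.0).sum().backward()   # the gradient is exactly zero for every row
    opt.step()

    # every row has shrunk to 0.99 even though its gradient was zero, because
    # decoupled weight decay is applied regardless of the gradient
    print(emb.data)

The usual remedies are therefore to set weight_decay to 0 for the embedding parameters, or to keep restoring the untouched rows after every optimizer step.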

Read more comments on GitHub >

Top Results From Across the Web

[Textual Inversion] Do not update other embeddings #1665
Successfully merging this pull request may close these issues. Textual Inversion Broken: it updates entire embeddings weights, loss diverges.
Read more >
Stable Diffusion Tutorial Part 2: Using Textual Inversion ...
This tutorial shows in detail how to train Textual Inversion for Stable ... using new "words" in the embedding space of pre-trained text-to-image...
Read more >
What should I do when my neural network doesn't learn?
The best method I've ever found for verifying correctness is to break your ... in a network will still train and the weights...
Read more >
Textual Inversion - Make Anything Using Stable Diffusion
Well, now you can thanks to textual inversion ! Create personalised embeddings to easily add your favourite things into your stable diffusion ...
Read more >
