
Bad generations from `generate.py`

See original GitHub issue

Thanks for the repo.

I trained DALL-E on the Visual Genome dataset. During training, one of the generations is shown below:

[training-phase sample image]

But when I generate an image with generate.py, the generated images are nonsense, even though I use text that also appeared in the training phase.

The commands I use:

# punctuation marks separated from words by spaces
python generate.py --dalle_path ./dalle.pt --text "tire on bus . window on bus . window on bus . window on bus . window on bus . pole in grass . window on bus ."

and

# punctuation marks not separated from words
python generate.py --dalle_path ./dalle.pt --text "tire on bus. window on bus. window on bus. window on bus. window on bus. pole in grass. window on bus."

The results of both commands are similar, and the generations are:

[generated sample images]

I have checked that the model weights are loaded correctly. Any thoughts on this issue?

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Reactions: 1
  • Comments: 10 (10 by maintainers)

Top GitHub Comments

3 reactions
ylsung commented, Mar 27, 2021

@afiaka87 It turns out it’s because the way we save images differs between train_dalle.py and generate.py.

In train_dalle.py we use wandb.Image to process the image, and it automatically normalizes and scales the image: https://github.com/wandb/client/blob/9cc04578ebc6d593450e9dbbcae07452bf7bec35/wandb/sdk/data_types.py#L1676-L1679

However, in generate.py we use torchvision.utils.save_image, which won’t normalize the image to (0, 1) unless we pass normalize=True. Since the VAE’s output range is roughly within -1 and 1, if we don’t normalize the image, save_image directly converts the float values to uint8 via

ndarr = grid.mul(255).add_(0.5).clamp_(0, 255).permute(1, 2, 0).to('cpu', torch.uint8).numpy()

Hence, lots of pixels that were originally smaller than 0 become 0, which is why a big part of each image is black.
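
For illustration, here is a minimal, self-contained sketch (not taken from the repository) of the effect described above: a random tensor standing in for the VAE output, saved once without and once with normalize=True.

# Minimal sketch: a fake VAE output with values roughly in (-1, 1)
import torch
from torchvision.utils import save_image

fake_vae_output = torch.rand(3, 64, 64) * 2 - 1  # roughly uniform in (-1, 1)

# Without normalize: mul(255).clamp_(0, 255) sends every negative pixel to 0 (black)
save_image(fake_vae_output, 'clamped.png')

# With normalize=True: the tensor is first rescaled to (0, 1), so the full range survives
save_image(fake_vae_output, 'normalized.png', normalize=True)

# About half of the values are negative, so 'clamped.png' comes out mostly black
print((fake_vae_output < 0).float().mean())  # ~0.5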

Here are some results; the config I use is

EPOCHS = 20
BATCH_SIZE = 8
LEARNING_RATE = 3e-4
GRAD_CLIP_NORM = 0.5

MODEL_DIM = 256 # 512
TEXT_SEQ_LEN = 64 # 256
DEPTH = 32
HEADS = 16
DIM_HEAD = 64
REVERSIBLE = False
ATTN_TYPES = None

And the outputs of generate.py, given the text “frame on wall. lamp by bed. wall on building.”, are: [five generated samples]

After I add normalize=True, i.e. save_image(image, outputs_dir / f'{i}.jpg', normalize=True), the outputs are: [five generated samples]

Looks much better.

BTW, the mask also seems important for generation: output = dalle.generate_images(text_chunk, mask = mask, filter_thres = args.top_k). The above results were generated with the mask passed in, but I haven’t dug into this too much.
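
For reference, a rough sketch of how such a padding mask might be built; the pad token id of 0 and the toy tokenization here are assumptions for illustration, not necessarily what generate.py actually does.

# Hypothetical sketch of building a padding mask for generate_images
# Assumes pad positions are filled with token id 0; the real tokenizer/pad id may differ
import torch

text_seq_len = 64
token_ids = [12, 845, 33, 7]                      # toy encoded caption
padded = token_ids + [0] * (text_seq_len - len(token_ids))
text_chunk = torch.tensor(padded).unsqueeze(0)    # shape (1, text_seq_len)

# True where there is a real token, False on padding, so pad positions are excluded
mask = text_chunk != 0

# Passed the same way as in the comment above:
# output = dalle.generate_images(text_chunk, mask = mask, filter_thres = args.top_k)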


Edit: Generations without masks (original implementation): [five generated samples]

The results are quite weird, and I cannot even relate the images to the text.

Generations with masks: [five generated samples]

We can see some beds and lamps in the images, so the quality is higher than without masks. It seems the pad tokens influence the results a lot, so we need a mask to exclude them.

2 reactions
afiaka87 commented, Mar 28, 2021

This is great work and an obvious opportunity to submit a pull request if you’d like, @louis2889184.

