
Some details about the training parameters.


I trained the network from scratch but got poor results. Below are my training parameters (the same as in the training guide you provide):

--program_name=twingan
--dataset_name="image_only"
--dataset_dir="dir to celeba tfrecord files"
--unpaired_target_dataset_name="anime_faces"
--unpaired_target_dataset_dir="dir to the anime tfrecord you provided"
--train_dir="dir to save results"
--dataset_split_name=train
--preprocessing_name="danbooru"
--resize_mode=RANDOM_CROP
--do_random_cropping=True
--learning_rate=0.0001
--learning_rate_decay_type=fixed
--is_training=True
--generator_network="pggan"
--loss_architecture=dragan
--pggan_max_num_channels=256
--generator_norm_type=batch_renorm
--use_ttur=True
--num_images_per_resolution=50000

Compared with the official PGGAN repo, I found some differences:

  1. num_images_per_resolution is 50000 in TwinGAN, while PGGAN uses 600000.
  2. TwinGAN uses RANDOM_CROP, while PGGAN resizes the whole image directly (see the sketch below).
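
For context, the two modes behave roughly like this (a minimal TF 1.x sketch; preprocess, target_hw, and mode are illustrative names, not TwinGAN's actual preprocessing code):

import tensorflow as tf

def preprocess(image, target_hw, mode):
    # Illustrative contrast between the two resize modes discussed above.
    if mode == 'RANDOM_CROP':
        # Keeps only a random target_hw x target_hw window of the source
        # image, so the network may never see the whole face at once.
        return tf.random_crop(image, size=[target_hw, target_hw, 3])
    else:  # RESIZE / RESHAPE
        # Rescales the entire image, preserving its global structure.
        return tf.image.resize_images(image, size=[target_hw, target_hw])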

Could you please help me out here?

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
jerryli27 commented, Aug 17, 2018

Hi @lionel3, I updated the training documentation. There was indeed a bug in my default parameters; after fixing it I am able to reproduce my previous results.

Please sync to the latest version and see https://github.com/jerryli27/TwinGAN/blob/master/docs/training.md.

The parameters I added are:

--do_pixel_norm=True
--l_content_weight=0.1
--l_cycle_weight=1.0
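
For context, --do_pixel_norm presumably enables PGGAN-style pixelwise feature normalization; --l_cycle_weight most likely weights the cycle-consistency term (map to the other domain and back, then compare to the input). Roughly, pixel norm does the following (a sketch, not the exact code in the repo):

import tensorflow as tf

def pixel_norm(x, epsilon=1e-8):
    # Rescales each pixel's feature vector (across channels) to roughly
    # unit length, as in the PGGAN paper, which applies it after each
    # generator convolution to keep activation magnitudes in check.
    return x * tf.rsqrt(
        tf.reduce_mean(tf.square(x), axis=-1, keepdims=True) + epsilon)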

Please reopen this issue if you cannot reproduce. Thanks!

1 reaction
jerryli27 commented, Aug 8, 2018

I added two flags to the training script; it should work now: --gradient_penalty_lambda=0.25 and --use_unet=True.

The whole script now looks like:

# Assumes you have data like
# ./data/celeba/train-00000-of-00100.tfrecord,
# ./data/celeba/train-00001-of-00100.tfrecord ...
# hw_to_batch_size maps output resolution to batch size.
python pggan_runner.py \
  --program_name=twingan \
  --dataset_name="image_only" \
  --dataset_dir="./data/celeba/" \
  --unpaired_target_dataset_name="anime_faces" \
  --unpaired_target_dataset_dir="./data/anime_faces/" \
  --train_dir="./checkpoints/twingan_faces/" \
  --dataset_split_name=train \
  --preprocessing_name="danbooru" \
  --resize_mode=RESHAPE \
  --do_random_cropping=True \
  --learning_rate=0.0001 \
  --learning_rate_decay_type=fixed \
  --is_training=True \
  --generator_network="pggan" \
  --use_unet=True \
  --num_images_per_resolution=300000 \
  --loss_architecture=dragan \
  --gradient_penalty_lambda=0.25 \
  --pggan_max_num_channels=256 \
  --generator_norm_type=batch_renorm \
  --hw_to_batch_size="{4: 8, 8: 8, 16: 8, 32: 8, 64: 8, 128: 4, 256: 3, 512: 2}"
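
For reference, --loss_architecture=dragan with --gradient_penalty_lambda=0.25 adds a DRAGAN-style gradient penalty on the discriminator. A rough sketch of the standard DRAGAN term (TF 1.x style; names are illustrative, not the exact code in the repo):

import tensorflow as tf

def dragan_gradient_penalty(discriminator, real_images, penalty_lambda=0.25):
    # DRAGAN perturbs real samples with noise proportional to their
    # standard deviation, then penalizes discriminator gradients whose
    # norm at the perturbed points deviates from 1.
    mean = tf.reduce_mean(real_images)
    std = tf.sqrt(tf.reduce_mean(tf.square(real_images - mean)))
    perturbed = real_images + 0.5 * std * tf.random_uniform(tf.shape(real_images))
    grads = tf.gradients(discriminator(perturbed), [perturbed])[0]
    grad_norm = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2, 3]) + 1e-8)
    return penalty_lambda * tf.reduce_mean(tf.square(grad_norm - 1.0))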

I haven’t tested the multi-GPU setting thoroughly yet due to hardware limits, so yes, there may be bugs, but you can try adding the following flags.

--sync_replicas=False
--replicas_to_aggregate=1
--num_clones=2
--worker_replicas=1

I updated the training readme with the comments above.


