[Help wanted] Unsatisfying image_stylization result
Hello,
based on the info at https://github.com/tensorflow/magenta/tree/master/magenta/models/image_stylization I tried to train new models from scratch but when transferring the style I get bad results, probably due to the same cause as #935 and #1285.
Here is as much information as I can provide to help solve the issue:
- I ran
image_stylization_create_dataset \
--vgg_checkpoint=vgg/vgg_16.ckpt \
--style_files=STYLE_IMAGE.jpg \
--output_file=TRAIN_RECORD.tfrecord
then
image_stylization_train \
--train_dir=TRAIN_DIR \
--style_dataset_file=TRAIN_RECORD.tfrecord \
--num_styles=1 \
--vgg_checkpoint=vgg/vgg_16.ckpt \
--imagenet_data_dir=imagenet_tf_output
After the model was generated, I used the following command to transfer the style:
image_stylization_transform \
--num_styles=1 \
--checkpoint=MODEL.CKPT-XXXX \
--input_image=INPUT_IMAGE.jpg \
--which_styles="[0]" \
--output_dir=OUTPUT_DIR \
--output_basename="stylized"
- Attached you can find the style image I used for training, starry_night_1280.jpg; its resolution is 1280 × 1014 (I get OOM errors during image_stylization_create_dataset if I use higher-resolution images).
- I used the default content/style loss hyperparameters from the scripts; I didn’t change them.
- Attached you can find the content image I used, home_1600.jpg; its resolution is 1600 × 992.
- Attached you can find first the style transfer result obtained using the Varied model provided, which has the Starry night style, and then the result I got using the model I trained. If you look at the one obtained with the Varied model (home_varied_model_result.jpg) you can clearly see the starry night style has been applied; if you look at the one obtained using the model I trained (home_custom_model_result.jpg), you can see there is “something” missing.
- Environment info: TensorFlow v1.12.0, Nvidia K80, Nvidia driver v390.77, CUDA v9.0, installed using the conda package https://anaconda.org/anaconda/tensorflow-gpu
Could you please point me in the right direction to find out what’s wrong with the process I followed? Thank you very much.
Issue Analytics
- State:
- Created 5 years ago
- Reactions:1
- Comments:6 (4 by maintainers)
Top GitHub Comments
Hi Emanuele!
The resolution of the content and style images you are using is probably too large for the model. We usually train on square content images of size 256px (see here). I don’t recall what exact style image sizes we used, but I remember them being smaller than the resolutions you trained at (this is also the case for other commonly-used fast stylization models; see here and here).
This has several consequences which could explain your observations:
With all of that in mind, I would suggest that you lower the resolution of the style image you use to train as well as the resolution of the content image you use for evaluation.
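As a rough illustration of that suggestion, here is a minimal Python sketch that shrinks an image so that its shorter side matches a target size while keeping the aspect ratio. The helper names `scaled_size` and `resize_image` are hypothetical and not part of Magenta. The 256 px default is an assumption taken from the content image size mentioned above; the style image size originally used is not known. The actual resizing step assumes the third-party Pillow library is installed.

```python
# Sketch only: helper names are hypothetical and not part of Magenta.
# The 256 px default is an assumption based on the 256 px training size mentioned above.


def scaled_size(width, height, short_side=256):
    """Return (new_width, new_height) whose shorter side equals short_side,
    preserving the aspect ratio (rounded to the nearest pixel)."""
    if width <= 0 or height <= 0 or short_side <= 0:
        raise ValueError("dimensions must be positive")
    scale = short_side / min(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))


def resize_image(src_path, dst_path, short_side=256):
    """Resize the image at src_path and save the result to dst_path."""
    from PIL import Image  # third-party dependency: pip install Pillow

    with Image.open(src_path) as img:
        new_size = scaled_size(img.width, img.height, short_side)
        img.resize(new_size, Image.LANCZOS).save(dst_path)
    return new_size


if __name__ == "__main__":
    # Sizes reported in the issue: style 1280 x 1014, content 1600 x 992.
    print(scaled_size(1280, 1014))  # (323, 256)
    print(scaled_size(1600, 992))   # (413, 256)
```

Calling, for example, `resize_image("starry_night_1280.jpg", "starry_night_small.jpg")` before running image_stylization_create_dataset would produce a smaller style image; the output filename here is only a placeholder, and other target sizes may work better.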
Please don’t hesitate to reach out again if you have further questions!
I have similar issues to the ones @ema987 mentioned in his latest post. I can’t get a satisfying output from a 256x256 input image. I followed every step of the OP link.
The content loss appears to be definitely smaller than the style loss.