
[TensorFlow] Failed to get reproducible trainings with albumentations included to the data pipeline

🐛 Bug

I could not get my training to run reproducibly once albumentations was added to the data pipeline. I followed this thread https://github.com/albumentations-team/albumentations/issues/93 and fixed all possible seeds, so overall my snippet that should have enabled reproducible experiments looks like this:

import os
import random

import numpy as np
import tensorflow as tf

def set_random_seed(seed: int = 42):
    """
    Globally fix all possible sources of randomness to keep experiments reproducible.
    """
    random.seed(seed)
    np.random.seed(seed)
    tf.random.set_seed(seed)
    # Note: PYTHONHASHSEED only affects interpreters started after it is set.
    os.environ['PYTHONHASHSEED'] = str(seed)
    os.environ['TF_DETERMINISTIC_OPS'] = '1'
    os.environ['TF_CUDNN_DETERMINISTIC'] = '1'

Unfortunately, this doesn’t give me reproducible results. I ran the training process 6 times and got a different result each time. You can also see the whole picture in W&B:

[Screenshot, 2021-05-23: W&B results of the six training runs]
  • Mean: 0.74478
  • Std: 0.044726
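To put the spread in proportion, the two reported numbers can be turned into a relative figure (the coefficient of variation). This is only arithmetic on the values above; the variable names are illustrative.

```python
mean_metric = 0.74478   # mean over the six runs, as reported above
std_metric = 0.044726   # standard deviation over the same runs

# Relative spread: standard deviation as a fraction of the mean.
coefficient_of_variation = std_metric / mean_metric
print(f"{coefficient_of_variation:.1%}")  # 6.0%
```

A run-to-run spread of about 6% of the mean is large enough to hide the effect of many small changes to the training setup.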

I also tried calling random.seed() right before passing my batch into the a.Compose() pipeline, but that did not help.

However, when I remove albumentations from my data pipeline or replace it with pure TF augmentations, my training becomes reproducible.

Any clues as to what’s wrong here?
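One possible contributing factor, which this report does not confirm, is the order in which samples consume random numbers. If the augmentation code draws from a process-global generator such as Python's `random`, then a fixed seed only fixes the sequence of draws, not which sample receives which draw. The sketch below uses only the standard library and simulates two processing orders; the function names and the seed-mixing constant are illustrative assumptions.

```python
import random

BASE_SEED = 42  # same default as set_random_seed in the report


def draw_with_global_rng(order):
    """Seed the shared generator once, then draw one value per item in `order`."""
    random.seed(BASE_SEED)
    return {index: random.random() for index in order}


def draw_with_local_rng(order):
    """Give every item its own generator, seeded from BASE_SEED and the item index."""
    values = {}
    for index in order:
        rng = random.Random(BASE_SEED * 1_000_003 + index)
        values[index] = rng.random()
    return values


forward = list(range(8))
backward = list(reversed(forward))  # stands in for a different scheduling order

# Shared generator: the value an item gets depends on when it is processed.
print(draw_with_global_rng(forward) == draw_with_global_rng(backward))  # False

# Per-item generators: the value depends only on the seed and the item index.
print(draw_with_local_rng(forward) == draw_with_local_rng(backward))  # True
```

With a parallel `map`, the processing order can change from run to run, so the first case is the one to rule out.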

To Reproduce

Steps to reproduce the behavior:

  1. Clone the project state at the 0.1.0-bugrep tag:
git clone --depth 1 --branch 0.1.0-bugrep https://github.com/roma-glushko/rock-paper-scissor
  2. Pull the dataset:
cd data
kaggle datasets download --unzip frtgnn/rock-paper-scissor
  3. Install project deps:
poetry install
  4. Uncomment any of the reported augmentations in the config file (they are all commented out in the repository): https://github.com/roma-glushko/rock-paper-scissor/blob/master/configs/basic_config.py

  5. Run training a couple of times and you get results that differ by a lot:

python train.py

Expected behavior

In order to run experiments that analyze the impact of different ideas and changes, I would like my training process to be reproducible.

Environment

  • Albumentations version (e.g., 0.1.8): 0.5.2
  • Python version (e.g., 3.7): 3.8.6
  • OS (e.g., Linux): Ubuntu 20.10
  • How you installed albumentations (conda, pip, source): poetry (pip-like)
  • tensorflow-gpu: 2.5.0 (for compatibility with the RTX 3070, Ampere architecture)

Additional context

This report is reproduced in a project that is also mentioned in https://github.com/albumentations-team/albumentations/issues/905

The data pipeline is the same for both issues:

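# Imports the snippet relies on (not shown in the original report, so assumed here).
from functools import partial
from typing import Tuple

import albumentations as a
import tensorflow as tf
from tensorflow.keras.preprocessing import image_dataset_from_directory

AUTOTUNE = tf.data.AUTOTUNE
# class_names is defined elsewhere in the project.

def augment_image(inputs, labels, augmentation_pipeline: a.Compose):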
    def apply_augmentation(images):
        aug_data = augmentation_pipeline(image=images.astype('uint8'))
        return aug_data['image']

    inputs = tf.numpy_function(func=apply_augmentation, inp=[inputs], Tout=tf.uint8)

    return inputs, labels


def get_dataset(
        dataset_path: str,
        subset_type: str,
        augmentation_pipeline: a.Compose,
        validation_fraction: float = 0.2,
        batch_size: int = 32,
        image_size: Tuple[int, int] = (300, 300),
        seed: int = 42
) -> tf.data.Dataset:
    augmentation_func = partial(
        augment_image,
        augmentation_pipeline=augmentation_pipeline,
    )

    dataset = image_dataset_from_directory(
        dataset_path,
        subset=subset_type,
        class_names=class_names,
        validation_split=validation_fraction,
        image_size=image_size,
        batch_size=batch_size,
        seed=seed,
    )

    return dataset \
        .map(augmentation_func, num_parallel_calls=AUTOTUNE) \
        .prefetch(AUTOTUNE)
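Because `map` runs with `num_parallel_calls=AUTOTUNE`, the Python function behind `tf.numpy_function` may be called from several threads in an order that is not fixed. One way to make the augmentation independent of that order is sketched below: it reseeds the global generators from a per-sample index under a lock before each call. This is an assumption-laden sketch, not a fix confirmed in this thread. `make_seeded_augmentation` and `toy_augment` are hypothetical names, and `toy_augment` merely stands in for the `a.Compose` pipeline.

```python
import random
import threading

import numpy as np

_rng_lock = threading.Lock()  # serializes access to the global RNG state


def make_seeded_augmentation(augment, base_seed: int = 42):
    """Wrap `augment` so each call reseeds the global RNGs from (base_seed, index)."""

    def seeded(image: np.ndarray, index: int) -> np.ndarray:
        seed = (base_seed * 1_000_003 + int(index)) % (2 ** 32)
        # Hold the lock so no other thread reseeds or draws between
        # the seeding calls and the augmentation itself.
        with _rng_lock:
            random.seed(seed)
            np.random.seed(seed)
            return augment(image)

    return seeded


def toy_augment(image: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a.Compose: random horizontal flip plus Gaussian noise."""
    if random.random() < 0.5:
        image = image[:, ::-1]
    noise = np.random.normal(0.0, 5.0, size=image.shape)
    return np.clip(image + noise, 0, 255).astype("uint8")


seeded_augment = make_seeded_augmentation(toy_augment)
image = np.arange(48, dtype="uint8").reshape(4, 4, 3)

first = seeded_augment(image, index=7)
second = seeded_augment(image, index=7)
print(np.array_equal(first, second))  # True
```

In the pipeline above, the index would have to be passed to the mapped function as an extra input, for example by enumerating the dataset first; that wiring is left out here and is untested. Reusing the same index every epoch repeats the same augmentation, so the epoch number would need to be mixed into the seed as well. The lock also serializes the augmentation step, which trades throughput for determinism.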

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 19 (3 by maintainers)

Top GitHub Comments

1 reaction
Dipet commented, May 31, 2021

Looks good. I think the current differences are associated with the instability of algorithms and hardware.

1 reaction
BloodAxe commented, May 27, 2021

Hmm. All of a sudden, this issue starts looking more interesting than at the beginning.

On Thu, May 27, 2021 at 11:57, Roman Glushko wrote:

@Dipet (https://github.com/Dipet) sure, all tests were performed with the following configuration of the augmentation pipeline:

args['train_augmentation'] = a.Compose([
    a.VerticalFlip(),
    a.HorizontalFlip(),
    a.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.1, brightness_by_max=False),
    a.CoarseDropout(max_holes=20, max_height=8, max_width=8, min_holes=10, min_height=8, min_width=8),
    a.GaussNoise(p=1.0, var_limit=(10.0, 50.0)),
])
args['validation_augmentation'] = a.Compose([])

I kept the validation step augmentation-free as @BloodAxe (https://github.com/BloodAxe) suggested above.

