[BUG] INVALID_PARAMETER_VALUE: Changing param values is not allowed.

Issues Policy acknowledgement

  • I have read and agree to submit bug reports in accordance with the issues policy

Willingness to contribute

No. I cannot contribute a bug fix at this time.

MLflow version

1.30.0

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): WSL Ubuntu 20.04
  • Python version: 3.10.6
  • yarn version, if running the dev UI: N/A

Describe the problem

Autologging for TensorFlow (tf.keras) works when I run the script directly with python train.py, but fails when I launch the same train.py script through mlflow run on the MLproject.

It appears that the autologger logs the model's parameters when the model is created, which then prevents it from updating those values after training.

Here’s the error I get:

2022/10/24 19:46:15 WARNING mlflow.utils.autologging_utils: Encountered unexpected error during tensorflow autologging: INVALID_PARAMETER_VALUE: Changing param values is not allowed. Params were already logged='[{'key': 'validation_split', 'old_value': None, 'new_value': '0.0'}, {'key': 'shuffle', 'old_value': None, 'new_value': 'True'}, {'key': 'class_weight', 'old_value': None, 'new_value': 'None'}, {'key': 'sample_weight', 'old_value': None, 'new_value': 'None'}, {'key': 'initial_epoch', 'old_value': None, 'new_value': '0'}, {'key': 'steps_per_epoch', 'old_value': None, 'new_value': 'None'}, {'key': 'validation_steps', 'old_value': None, 'new_value': 'None'}, {'key': 'validation_batch_size', 'old_value': None, 'new_value': 'None'}, {'key': 'validation_freq', 'old_value': None, 'new_value': '1'}, {'key': 'max_queue_size', 'old_value': None, 'new_value': '10'}, {'key': 'workers', 'old_value': None, 'new_value': '1'}, {'key': 'use_multiprocessing', 'old_value': None, 'new_value': 'False'}]' for run ID='402712a4625a43bca38c0bce38fa4ed1'.

As you can see, autologging apparently recorded None as the old value for every one of these parameters.
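To make the warning easier to read, the conflicting parameters can be pulled out of the message text. The following is a minimal sketch, assuming the payload between "logged='" and "' for run ID=" is always a Python-literal list of dicts as in the log above. The helper name parse_param_conflicts is hypothetical and not part of MLflow, and the sample message is shortened to two of the twelve entries.

```python
import ast
import re

# Shortened copy of the warning above (two of the twelve entries kept).
WARNING = (
    "INVALID_PARAMETER_VALUE: Changing param values is not allowed. "
    "Params were already logged='[{'key': 'validation_split', 'old_value': None, "
    "'new_value': '0.0'}, {'key': 'shuffle', 'old_value': None, "
    "'new_value': 'True'}]' for run ID='402712a4625a43bca38c0bce38fa4ed1'."
)


def parse_param_conflicts(message):
    """Return (run_id, conflicts) extracted from the warning text.

    Hypothetical helper: it assumes the message format shown in the log
    above and raises ValueError for anything else.
    """
    match = re.search(
        r"already logged='(\[.*\])' for run ID='([0-9a-f]+)'",
        message,
        re.DOTALL,
    )
    if match is None:
        raise ValueError("message does not match the expected format")
    # The payload is a Python literal (single quotes, None), not JSON.
    conflicts = ast.literal_eval(match.group(1))
    return match.group(2), conflicts


run_id, conflicts = parse_param_conflicts(WARNING)
print(f"run {run_id}")
for entry in conflicts:
    print(f"{entry['key']}: {entry['old_value']!r} -> {entry['new_value']!r}")
```

Listing the entries this way shows at a glance that every old_value is None, which is the pattern described in the report.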

Again, the same script works well when I run it outside of MLproject.

Tracking information

No response

Code to reproduce issue

"""
TF/Keras Training script for MLFlow
"""

# mlflow run -e train_entry --env-manager=local --experiment-name=tony-reina-experiments .

from datetime import datetime
import os

os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

import click  # pip install click
import tensorflow as tf  # pip install tensorflow

# The following import and function call
# are the only additions to the code required
# to automatically log
# metrics and parameters to MLflow.
import mlflow  # pip install mlflow

EXPERIMENT_NAME = "tony-reina-experiments"


def load_data():
    """Load dataset and pre-process
    Fashion MNIST https://github.com/zalandoresearch/fashion-mnist
       28x28 grayscale images of clothes from 10 different categories
    """
    fashion_mnist = tf.keras.datasets.fashion_mnist

    (train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

    # Normalize the images from 0.0 to 1.0
    train_images = train_images / 255.0
    test_images = test_images / 255.0

    # Human-readable class names
    class_names = [
        "T-shirt/top",
        "Trouser",
        "Pullover",
        "Dress",
        "Coat",
        "Sandal",
        "Shirt",
        "Sneaker",
        "Bag",
        "Ankle boot",
    ]

    return train_images, train_labels, test_images, test_labels, class_names


def create_model(parameters):
    """Create a simple TensorFlow Keras model

    Args:
        parameters(dict): Number of units,
                          optimizer, and metrics for model

    """

    model = tf.keras.Sequential(
        [
            tf.keras.layers.Flatten(input_shape=(28, 28)),
            tf.keras.layers.Dense(parameters["num_units"], activation="relu"),
            tf.keras.layers.Dense(10),
        ]
    )

    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=parameters["learning_rate"]),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=[parameters["metrics"]],
    )

    return model


@click.command(help="The base training script for MLFlow.")
@click.option(
    "--num-units", default=128, type=int, help="Number of units in the dense layer"
)
@click.option("--epochs", default=3, type=int, help="Number of training epochs")
@click.option("--batch-size", default=32, type=int, help="Batch size")
@click.option("--learning-rate", default=1e-4, type=float, help="Learning rate")
@click.option("--metrics", default="accuracy", type=str, help="Model metric to track")
@click.option(
    "--training-data", default=".", type=str, help="Path to the training data"
)
@click.option("--testing-data", default=".", type=str, help="Path to the testing data")
def train(
    num_units,
    epochs,
    batch_size,
    learning_rate,
    metrics,
    training_data,
    testing_data,
):
    """Run training"""

    train_images, train_labels, test_images, test_labels, class_names = load_data()

    # Instead of passing lots of variables,
    # we'll just pass a dictionary
    parameters = {
        "num_units": num_units,
        "num_epochs": epochs,
        "batch_size": batch_size,
        "learning_rate": learning_rate,
        "metrics": metrics,
        "training_data": training_data,
        "testing_data": testing_data,
    }

    click.secho(parameters)

    click.secho("Setting up MLflow tracking uri...")
    mlflow.tracking.set_tracking_uri(os.environ.get("MLFLOW_TRACKING_URI"))
    mlflow.set_experiment(experiment_name=EXPERIMENT_NAME)

    mlflow.tensorflow.autolog(
        log_models=True,
        silent=False,
        registered_model_name="ye_olde_mnist_fashion",
    )

    current_time = datetime.now().strftime("%Y-%m-%d %H-%M-%S")
    click.secho("Starting the MLFlow Run...")

    model = create_model(parameters)

    with mlflow.start_run(
        #run_name=f"YeOldDemo-{current_time}",
        tags={"ImageTag": "local"},
        description="Ye Olde Model Xample",
    ):

        model.fit(
            train_images,
            train_labels,
            epochs=parameters["num_epochs"],
            batch_size=parameters["batch_size"],
        )

        click.secho("Finished training")

        test_loss, test_acc = model.evaluate(test_images, test_labels)

        mlflow.log_param(key="test_loss", value=test_loss)
        mlflow.log_param(key="test_acc", value=test_acc)

        mlflow.log_param(key="Class names", value=class_names)
        mlflow.log_param(key="TensorFlow version", value=tf.__version__)


if __name__ == "__main__":
    train()

Stack trace

2022/10/24 19:49:34 WARNING mlflow.utils.autologging_utils: Encountered unexpected error during tensorflow autologging: INVALID_PARAMETER_VALUE: Changing param values is not allowed. Params were already logged='[{'key': 'validation_split', 'old_value': None, 'new_value': '0.0'}, {'key': 'shuffle', 'old_value': None, 'new_value': 'True'}, {'key': 'class_weight', 'old_value': None, 'new_value': 'None'}, {'key': 'sample_weight', 'old_value': None, 'new_value': 'None'}, {'key': 'initial_epoch', 'old_value': None, 'new_value': '0'}, {'key': 'steps_per_epoch', 'old_value': None, 'new_value': 'None'}, {'key': 'validation_steps', 'old_value': None, 'new_value': 'None'}, {'key': 'validation_batch_size', 'old_value': None, 'new_value': 'None'}, {'key': 'validation_freq', 'old_value': None, 'new_value': '1'}, {'key': 'max_queue_size', 'old_value': None, 'new_value': '10'}, {'key': 'workers', 'old_value': None, 'new_value': '1'}, {'key': 'use_multiprocessing', 'old_value': None, 'new_value': 'False'}]' for run ID='a6fc78cd0973486aa6b0ddb5f36581ae'.


What component(s) does this bug affect?

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/pipelines: Pipelines, Pipeline APIs, Pipeline configs, Pipeline Templates
  • area/projects: MLproject format, project running backends
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/server-infra: MLflow Tracking server backend
  • area/tracking: Tracking Service, tracking client APIs, autologging

What interface(s) does this bug affect?

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/docker: Docker use across MLflow’s components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

What language(s) does this bug affect?

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

What integration(s) does this bug affect?

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
harupy commented, Oct 26, 2022

@tonyreina I was able to reproduce the issue with this command:

docker run --rm -w /workdir -v $(pwd):/workdir -e MLFLOW_TRACKING_URI=sqlite:///mlflow.db python:3.8 bash -c "pip install mlflow==1.29.0 tensorflow && mlflow run --env-manager=local -e train_entry --experiment-name=tony-reina-experiments . && rm mlflow.db"

I think you’re using mlflow 1.29.0. https://github.com/mlflow/mlflow/pull/7057 fixed the issue. mlflow 1.30.0 contains this patch.

0 reactions
tonyreina commented, Oct 27, 2022

Thanks. I finally figured it out. My MLflow was 1.30.0 but my company server was using MLflow 1.29.0. I’ve asked them to upgrade.
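Since the root cause here was a version mismatch between environments, a quick check of the installed version in each environment can save time. The following is a minimal sketch using only the standard library. The helper names are hypothetical, and the 1.30.0 threshold is taken from the maintainer's comment above rather than verified independently.

```python
from importlib import metadata

# Per the maintainer's comment above, 1.30.0 contains the patch.
FIXED_IN = (1, 30, 0)


def version_tuple(version):
    """Turn "1.29.0" into (1, 29, 0); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def has_fix(version):
    """True if the version is at least FIXED_IN (pre-releases count as older)."""
    return version_tuple(version) >= FIXED_IN


def installed_mlflow_version():
    """Return the installed mlflow version string, or None if it is absent."""
    try:
        return metadata.version("mlflow")
    except metadata.PackageNotFoundError:
        return None


found = installed_mlflow_version()
if found is None:
    print("mlflow is not installed in this environment")
else:
    print(f"mlflow {found}: fix present = {has_fix(found)}")
```

Run it in every environment that executes the training code, including the one that mlflow run uses and any shared server, because the versions may differ as they did in this report.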
