Handling param limit exceeded error

I have a node which takes an S3 artifact URL as an input. This broke my pipeline because of an mlflow.exceptions.MlflowException: Param value 's3://very/long/string' had length 593, which exceeded length limit of 250.

IMO there should at least be the option to truncate long param strings.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

2 reactions
Galileo-Galilei commented, Sep 23, 2020

I slightly disagree with @kaemo here: I agree we should obviously align with mlflow and we will not support a specific trick to enable logging above this limit, but we still have to manage this situation in the plugin. Indeed, if someone still uses a parameter above this limit, I don’t think the best solution is “do not use the plugin at all”.

I can see 2 situations (in my personal experience) where overly long parameters are used:

  • when you pass a dict which contains a lot of keys as a parameter (typically, all the hyperparameters of an ML model like xgboost). => This situation is already managed with the flatten_dict_params argument of MlflowNodeHook
  • when you pass a path to a file or a URL to a node (your situation here), which can be very long. This is seen as bad practice in Kedro, because it means you want to perform some I/O operations, and those should be handled in the DataCatalog (if you want to implement your own logic and no Kedro dataset suits you, they recommend creating your own dataset for clear separation from compute and better portability). However, I acknowledge that I have sometimes encountered situations where I preferred doing this operation inside a node because the logic was very specific and not reusable elsewhere.

Potential solution

A possible solution I would support would be adding a long_parameters_strategy in the mlflow.yml, in the section:

hooks:
    node:
        long_parameters_strategy: fail  # one of: fail, truncate, tag

This could take the following values:

  • fail: raise an error in the pipeline, like the current situation.
  • truncate: truncate the string by logging param[0:250].
  • tag: log the string with set_tag instead of log_param if it is above the limit, as it seems to be the recommended way in mlflow: mlflow/mlflow#1976

And then:

  1. Add the same key long_parameters_strategy in the KedroMlflowConfig class: https://github.com/Galileo-Galilei/kedro-mlflow/blob/94bae3df9a054c85dfc0bf13de8db876363de475/kedro_mlflow/framework/context/config.py#L20 and modify accordingly the methods calls
  2. modify the before_pipeline_run method of the MlflowNodeHook to replace this line and add the corresponding logical tests to address all possible situations.

https://github.com/Galileo-Galilei/kedro-mlflow/blob/94bae3df9a054c85dfc0bf13de8db876363de475/kedro_mlflow/framework/hooks/node_hook.py#L53

Some points to have in mind:

  • a logger should put a warning in the log whenever a value is truncated or logged as a tag, because altering the logged value prevents reproducibility
  • the logical test on length must use MAX_PARAM_VAL_LENGTH (and not a hardcoded 250 value) to ensure we will adapt in the future.
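
The proposal above could be sketched roughly as follows. This is only an illustration, not the plugin's actual code: the function name `log_param_with_strategy` and its signature are hypothetical, and the two logging callables are injected so the sketch runs without mlflow installed. The constant is a local stand-in for `MAX_PARAM_VAL_LENGTH` from `mlflow.utils.validation`; a real implementation would import it instead of hardcoding it, as noted in the points above.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)

# Assumption: local stand-in for mlflow.utils.validation.MAX_PARAM_VAL_LENGTH,
# defined here only so the sketch has no mlflow dependency.
MAX_PARAM_VAL_LENGTH = 250

VALID_STRATEGIES = ("fail", "truncate", "tag")


def log_param_with_strategy(
    key: str,
    value: Any,
    strategy: str,
    log_param: Callable[[str, Any], None],
    set_tag: Callable[[str, Any], None],
    max_length: int = MAX_PARAM_VAL_LENGTH,
) -> None:
    """Log one parameter, applying the chosen strategy if it is too long.

    Hypothetical helper: `log_param` and `set_tag` are whatever callables
    the caller uses to record parameters and tags.
    """
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"Unknown strategy {strategy!r}, expected one of {VALID_STRATEGIES}")

    str_value = str(value)
    if len(str_value) <= max_length:
        # Short enough: log it unchanged.
        log_param(key, value)
        return

    if strategy == "fail":
        raise ValueError(
            f"Param {key!r} has length {len(str_value)}, "
            f"which exceeds the limit of {max_length}"
        )

    # Both remaining strategies change what gets recorded, so warn about it.
    logger.warning(
        "Param %r exceeds %d characters; applying strategy %r",
        key, max_length, strategy,
    )
    if strategy == "truncate":
        log_param(key, str_value[:max_length])
    else:  # strategy == "tag"
        set_tag(key, str_value)
```

In the plugin, the two callables would presumably be `mlflow.log_param` and `mlflow.set_tag`, and the strategy would be read from the `long_parameters_strategy` key of mlflow.yml.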

I don’t have time right now and it is not among my top priorities for the plugin, but I will definitely address this in the coming months (likely by the end of November). If you are in a hurry @crypdick, feel free to open a PR 😉. I am open to discussion about the best way to address this, so do not hesitate to suggest alternative possibilities.

1 reaction
crypdick commented, Sep 24, 2020

@Galileo-Galilei ty for the ideas! Indeed, I had to disable the MlflowNodeHook, which was a bummer.

In the interest of time, I’m going to port the full URIs into a YAMLDataSet, and keep just the pointers in the parameters.yml.
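
A minimal sketch of what that workaround might look like inside a node, assuming the YAML dataset loads as a plain dict mapping short keys to full URIs. The helper name `resolve_artifact_uri`, the key `training_data`, and the example bucket are all hypothetical. Only the short key lives in parameters.yml, so it stays well under the param length limit.

```python
from typing import Dict


def resolve_artifact_uri(artifact_uris: Dict[str, str], artifact_key: str) -> str:
    """Return the full URI for a short key passed as a parameter.

    `artifact_uris` stands for the dict loaded from the YAML dataset;
    `artifact_key` is the short pointer kept in parameters.yml.
    """
    try:
        return artifact_uris[artifact_key]
    except KeyError:
        raise KeyError(
            f"Unknown artifact key {artifact_key!r}; known keys: {sorted(artifact_uris)}"
        ) from None


# Hypothetical content of the YAML dataset, as loaded by the catalog.
artifact_uris = {
    "training_data": "s3://example-bucket/" + "nested/" * 50 + "data.csv",
}

# Only this short string would be logged as a parameter.
artifact_key = "training_data"
full_uri = resolve_artifact_uri(artifact_uris, artifact_key)
```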
