Handling param limit exceeded error
I have a node which takes an S3 artifact URL as an input. This broke my pipeline because of an mlflow.exceptions.MlflowException: Param value 's3://very/long/string' had length 593, which exceeded length limit of 250.
IMO there should at least be the option to truncate long param strings.
Issue Analytics
- State:
- Created 3 years ago
- Comments: 6 (4 by maintainers)
I slightly disagree with @kaemo here: I agree we should obviously align with mlflow and we will not support a specific trick to enable logging above this limit, but we still have to manage this situation in the plugin. Indeed, if someone still uses a parameter above this limit, I don’t think the best solution is “do not use the plugin at all”.
I can see 2 situations (in my personal experience) where overly long parameters are used:
- `flatten_dict_params` argument of `MlflowNodeHook`
- `DataCatalog` (if you want to implement your own logic and no Kedro dataset suits you, they recommend creating your own dataset for clear separation from compute and better portability). However, I acknowledge that I sometimes encountered situations where I preferred doing this operation inside a `node` because the logic was very specific and not reusable elsewhere.
Potential solution
A possible solution I would support would be adding a `long_parameters_strategy` in the `mlflow.yml`, in the section:
This could take the following values:
- `fail`: raise an error in the pipeline, as in the current situation.
- `truncate`: truncate the string by logging param[0:250].
- `tag`: log the string with `set_tag` instead of `log_param` if it is above the limit, as it seems to be the recommended way in mlflow: mlflow/mlflow#1976
And then:
- add `long_parameters_strategy` in the `KedroMlflowConfig` class: https://github.com/Galileo-Galilei/kedro-mlflow/blob/94bae3df9a054c85dfc0bf13de8db876363de475/kedro_mlflow/framework/context/config.py#L20 and modify the method calls accordingly
- modify the `before_pipeline_run` method of the `MlflowNodeHook` to replace this line and add the corresponding logical tests to address all possible situations: https://github.com/Galileo-Galilei/kedro-mlflow/blob/94bae3df9a054c85dfc0bf13de8db876363de475/kedro_mlflow/framework/hooks/node_hook.py#L53
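To make the three proposed values concrete, here is a minimal sketch of how such a strategy could be dispatched. It is not the plugin's implementation: the function name `log_long_param`, its signature, and the use of `ValueError` for the `fail` case are assumptions made for illustration. The two logging callables are injected so the sketch runs without an MLflow server; in the plugin they would presumably be `mlflow.log_param` and `mlflow.set_tag`.

```python
from typing import Callable

# Limit quoted in the error message of this issue; kept in a single place
# so it can be adapted if the limit changes.
MAX_PARAM_LENGTH = 250


def log_long_param(
    key: str,
    value: object,
    strategy: str,
    log_param: Callable[[str, str], None],
    set_tag: Callable[[str, str], None],
    max_length: int = MAX_PARAM_LENGTH,
) -> None:
    """Log one parameter, applying `strategy` only when it is too long.

    Hypothetical helper: the name and signature are not part of kedro-mlflow.
    """
    text = str(value)
    if len(text) <= max_length:
        # Short enough: log as a regular parameter whatever the strategy.
        log_param(key, text)
        return
    if strategy == "fail":
        # Assumption: a plain ValueError stands in for the MLflow exception.
        raise ValueError(
            f"Param '{key}' has length {len(text)}, "
            f"which exceeds the limit of {max_length}"
        )
    if strategy == "truncate":
        log_param(key, text[:max_length])
    elif strategy == "tag":
        set_tag(key, text)
    else:
        raise ValueError(f"Unknown long_parameters_strategy: {strategy!r}")


# Usage with in-memory dicts standing in for the MLflow run.
logged_params: dict = {}
logged_tags: dict = {}
long_uri = "s3://" + "x" * 588  # 593 characters, as in the reported error

log_long_param("uri_truncated", long_uri, "truncate",
               logged_params.__setitem__, logged_tags.__setitem__)
log_long_param("uri_tagged", long_uri, "tag",
               logged_params.__setitem__, logged_tags.__setitem__)
```

Passing the callables explicitly keeps the length check separate from the tracking backend, which also makes the logical tests mentioned above easy to write.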
Some points to have in mind:
- 250 value to ensure we will adapt in the future.
I don’t have time right now and it is not in my top priorities for the plugin, but I will definitely address this in the coming months (likely by the end of November). If you are in a hurry @crypdick, feel free to open a PR 😉. I am open to discussion about the best way to address this, so do not hesitate to suggest alternative possibilities.
@Galileo-Galilei ty for the ideas! Indeed, I had to disable the `MlflowNodeHook`, which was a bummer.
In the interest of time, I’m going to port the full URIs into a YAMLDataSet, and keep just the pointers in the `parameters.yml`.
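The workaround described here can be sketched roughly as follows. Everything in the block is hypothetical: the dictionary `artifact_uris` stands in for the content a YAML dataset would load, `parameters` stands in for `parameters.yml`, and `resolve_artifact_uri` is an illustrative helper, not a Kedro or kedro-mlflow function. The point is only that the logged parameter becomes a short key, well under the 250-character limit, while the long URI lives in a dataset.

```python
from typing import Dict

# Hypothetical content loaded from a YAML dataset: short key -> full URI.
artifact_uris: Dict[str, str] = {
    "training_data": "s3://very/long/string",
}

# Hypothetical content of parameters.yml: only the short pointer is kept,
# so this is the value that would be logged as a parameter.
parameters = {"artifact_key": "training_data"}


def resolve_artifact_uri(artifact_key: str, uris: Dict[str, str]) -> str:
    """Return the full URI for a short key, with a clear error if missing."""
    try:
        return uris[artifact_key]
    except KeyError:
        raise KeyError(
            f"No URI registered for artifact key {artifact_key!r}"
        ) from None


# Inside the node, the pointer is resolved to the real URI.
uri = resolve_artifact_uri(parameters["artifact_key"], artifact_uris)
```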