Issues with scientific notation metric values (and metrics that are too large)
/kind feature
Describe the solution you’d like: It looks like the default regex filter that matches metric lines accepts lines like
metric1=0.54
metric2=0.54
But not lines like
unsupportedMetric1=1.2e+32
unsupportedMetric2=1.5E-1
unsupportedMetric4=12e2
It seems this could be extended by changing this line: https://github.com/kubeflow/katib/blob/master/pkg/metricscollector/v1alpha3/common/const.go#L33
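To make the requested change concrete, here is a minimal sketch using Python's `re` module. The two patterns are illustrative assumptions, not copies of what is in `const.go`, and `parse_metric` is a hypothetical helper; Katib itself is written in Go, whose regexp syntax may differ in details.

```python
import re

# Illustrative patterns only (assumed, not taken from Katib's const.go).
# PLAIN_VALUE accepts plain decimals; SCI_VALUE adds an optional exponent part.
PLAIN_VALUE = re.compile(r"([\w|-]+)\s*=\s*((-?\d+)(\.\d+)?)")
SCI_VALUE = re.compile(r"([\w|-]+)\s*=\s*((-?\d+)(\.\d+)?([eE][-+]?\d+)?)")


def parse_metric(pattern, line):
    """Return (name, value) if the whole line matches the pattern, else None."""
    match = pattern.fullmatch(line.strip())
    if match is None:
        return None
    return match.group(1), float(match.group(2))


if __name__ == "__main__":
    lines = [
        "metric1=0.54",
        "unsupportedMetric1=1.2e+32",
        "unsupportedMetric2=1.5E-1",
        "unsupportedMetric4=12e2",
    ]
    for line in lines:
        print(line, parse_metric(PLAIN_VALUE, line), parse_metric(SCI_VALUE, line))
```

With whole-line matching, the plain pattern rejects the three scientific-notation lines while the extended one accepts all four.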
Anything else you would like to add: As a workaround, I tried following this example: https://github.com/kubeflow/katib/blob/master/examples/v1alpha3/file-metricscollector-example.yaml#L16 . Things work if the metricValue is small, but if the metric is a large float, the experiment doesn’t proceed. In the example below, the first trial’s pod starts, but nothing else happens.
apiVersion: "kubeflow.org/v1alpha3"
kind: Experiment
metadata:
  namespace: kubeflow
  labels:
    controller-tools.k8s.io: "1.0"
  name: test-search-broken
spec:
  objective:
    type: minimize
    # 0 is optimal and is practically unreachable
    goal: 0.0
    objectiveMetricName: final-objective
  algorithm:
    algorithmName: bayesianoptimization
    algorithmSettings:
      - name: "random_state"
        value: "10"
  parallelTrialCount: 1
  maxTrialCount: 6
  maxFailedTrialCount: 3
  metricsCollectorSpec:
    source:
      filter:
        metricsFormat:
          - "{metricName:([\\w|-]+), metricValue:((-?\\d+)(\\.\\d+)?([eE](-|\\+)?\\d+)?)}"
      fileSystemPath:
        path: "/tmp/final_objective"
        kind: File
    collector:
      kind: File
  parameters:
    - name: -arena__LG
      parameterType: int
      feasibleSpace:
        min: "8"
        max: "15"
  trialTemplate:
    goTemplate:
      rawTemplate: |-
        apiVersion: batch/v1
        kind: Job
        metadata:
          name: {{.Trial}}
          namespace: {{.NameSpace}}
        spec:
          template:
            spec:
              containers:
              - image: python:3.6
                name: {{.Trial}}
                imagePullPolicy: Always
                command: ["bash", "-c"]
                args:
                - set -eou pipefail;
                  echo "{metricName:final-objective, metricValue:1.9999e200}" >/tmp/final_objective
              restartPolicy: Never
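As a quick sanity check of the custom `metricsFormat` filter in the manifest, the sketch below applies the same pattern (with the YAML double backslashes unescaped) to the line the trial writes. It uses Python's `re` module purely for illustration; the collector is written in Go, so this only approximates its behaviour, and `extract_metric` is a hypothetical helper.

```python
import math
import re

# The metricsFormat pattern from the manifest, after YAML unescaping.
METRICS_FORMAT = re.compile(
    r"{metricName:([\w|-]+), metricValue:((-?\d+)(\.\d+)?([eE](-|\+)?\d+)?)}"
)


def extract_metric(line):
    """Return (name, raw_value) for the first match in the line, else None."""
    match = METRICS_FORMAT.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    written = "{metricName:final-objective, metricValue:1.9999e200}"
    name, raw = extract_metric(written)
    # The raw string still parses to a finite double-precision value.
    print(name, raw, math.isfinite(float(raw)))
```

If the pattern matches here, the filter itself is unlikely to be what blocks the experiment, which would suggest looking at what happens to the value after it is collected.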
Pods
$ kubectl get pods -nkubeflow | grep test-search-broken
test-search-broken-bayesianoptimization-8c75f457f-h4mw6 1/1 Running 0 3m2s
test-search-broken-jmn6ld78-nb6gf 0/2 Completed 0 2m45s
Output of the metric logs container
I0313 00:43:02.625970 15 main.go:85] Trial Name: test-search-broken-jmn6ld78
I0313 00:43:02.627181 15 main.go:79] {metricName:final-objective, metricValue:1.9999e200}
W0313 00:43:02.627871 15 file-metricscollector.go:59] Metrics will not have timestamp since error parsing time {metricName:final-objective,: parsing time "{metricName:final-objective," as "2006-01-02T15:04:05.999999999Z07:00": cannot parse "{metricName:final-objective," as "2006"
I0313 00:43:02.658876 15 main.go:129] Metrics reported. :
metric_logs:<time_stamp:"0001-01-01T00:00:00Z" metric:<name:"final-objective" value:"1.9999e200" > >
I should also note that if you change 1.9999e200 to 1.9999e20, things work, so I’m guessing there’s some ceiling on the values that work. I’m also not sure how to specify “infinity”, but that seems like another alternative if I knew how to specify it.
Top GitHub Comments
Thanks for the issue. It is helpful.
/cc @hougangliu
Feel free to re-open issue if it is needed.