[Proposal] Support JSON format for `file-metrics-collector`


/kind feature

Describe the solution you’d like [A clear and concise description of what you want to happen.]

Motivation

Currently, it is difficult to parse JSON-format files with file-metrics-collector using the regexp filter, since file-metrics-collector is designed for TEXT-format files. I believe that supporting JSON-format files would make Katib more powerful, because users could consume JSON metrics files easily and without regexps. Therefore, I would like to support the JSON format in file-metrics-collector, as in the following example, where each line is a separate JSON object.

{"foo": "bar", "fiz": "buz"…}
{"foo": "bar", "fiz": "buz"…}
{"foo": "bar", "fiz": "buz"…}
{"foo": "bar", "fiz": "buz"…}
…

This JSON format is also used by cloudml-hypertune, which is recommended for use with GCP AI Platform or Vertex AI.

If you use a custom container for training or if you want to perform hyperparameter tuning with a framework other than TensorFlow, then you must use the cloudml-hypertune Python package to report your hyperparameter metric to AI Platform Training.

https://cloud.google.com/ai-platform/training/docs/using-hyperparameter-tuning#other_machine_learning_frameworks_or_custom_containers

Design

I’m thinking of the following Kubernetes API and webhook changes. In addition, if FileSystemFileFormat is set to Json, file-metrics-collector collects the values whose keys are spec.objective.objectiveMetricName and spec.objective.additionalMetricNames from the metrics file.

+ type FileSystemFileFormat string
+
+ const (
+   TextFormat    FileSystemFileFormat = "Text"
+   JsonFormat    FileSystemFileFormat = "Json"
+ )

type FileSystemPath struct {
  Path string                     `json:"path,omitempty"`
  Kind FileSystemKind             `json:"kind,omitempty"`
+ FileFormat FileSystemFileFormat `json:"fileFormat,omitempty"`
}

func (g *DefaultValidator) validateMetricsCollector(inst *experimentsv1beta1.Experiment) error {
  mcSpec := inst.Spec.MetricsCollectorSpec
  mcKind := mcSpec.Collector.Kind
  ...
  switch mcKind {
  ...
  case commonapiv1beta1.FileCollector:
    ...
+     // Default to the Text format when fileFormat is unset; reject unknown values.
+     fileFormat := mcSpec.Source.FileSystemPath.FileFormat
+     if fileFormat == "" {
+       fileFormat = commonapiv1beta1.TextFormat
+     } else if fileFormat != commonapiv1beta1.TextFormat && fileFormat != commonapiv1beta1.JsonFormat {
+       return fmt.Errorf("The format of the metrics file is required by .spec.metricsCollectorSpec.source.fileSystemPath.fileFormat.")
+     }
  ...
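
The defaulting and validation rule in the webhook fragment above can also be tried in isolation. The following is a self-contained sketch under stated assumptions: resolveFileFormat is a hypothetical helper name, the error wording is mine, and the type and constants are copied from the proposal rather than imported from Katib.

```go
package main

import "fmt"

// FileSystemFileFormat and its constants mirror the proposed API above.
type FileSystemFileFormat string

const (
	TextFormat FileSystemFileFormat = "Text"
	JsonFormat FileSystemFileFormat = "Json"
)

// resolveFileFormat is a hypothetical helper (not Katib code). It returns
// TextFormat when the field is unset, passes through the two supported
// values, and rejects anything else.
func resolveFileFormat(f FileSystemFileFormat) (FileSystemFileFormat, error) {
	switch f {
	case "":
		return TextFormat, nil
	case TextFormat, JsonFormat:
		return f, nil
	default:
		return "", fmt.Errorf("unsupported file format %q: must be %q or %q", f, TextFormat, JsonFormat)
	}
}

func main() {
	// "Yaml" is an arbitrary unsupported value used to show the error path.
	for _, in := range []FileSystemFileFormat{"", JsonFormat, "Yaml"} {
		got, err := resolveFileFormat(in)
		fmt.Printf("input=%q resolved=%q err=%v\n", in, got, err)
	}
}
```

Using a switch keeps the set of accepted formats in one place, so adding another format later would only need one more case.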

Does it sound good to you? @kubeflow/wg-automl-leads

Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.]

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 15 (15 by maintainers)

Top GitHub Comments

3 reactions
andreyvelich commented, Dec 6, 2021

@gaocegege @johnugeorge Please give your feedback on this proposal

0 reactions
tenzen-y commented, Dec 10, 2021

Please assign me when the PR is ready to review

Thanks for your contribution! 🎉 👍

Sure, Thanks for your review! @gaocegege
