The file metrics collector example Docker image is out of sync with the code


/kind bug

What steps did you take and what happened: The trial image docker.io/liuhougangxa/pytorch-mnist:1.0 used in https://github.com/kubeflow/katib/blob/master/examples/v1alpha3/file-metricscollector-example.yaml is out of date with respect to https://github.com/kubeflow/katib/blob/master/examples/v1alpha3/file-metrics-collector/mnist.py.

The mnist.py in the Docker image:

def test(args, model, device, test_loader, epoch):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item() # sum up batch loss
            pred = output.max(1, keepdim=True)[1] # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)
    logging.info('\n{{metricName: accuracy, metricValue: {:.4f}}};{{metricName: loss, metricValue: {:.4f}}}\n'.format(float(correct) / len(test_loader.dataset), test_loss))

Here the logging format is {{metricName: accuracy, metricValue: {:.4f}}}, so the file metrics collector cannot parse it correctly.
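To make the mismatch concrete, here is a minimal sketch of what that format string renders to and why a filter looking for name=value pairs finds nothing in it. The format string is copied from the snippet above. Everything else is an assumption made for illustration: the regular expression `NAME_VALUE_FILTER` and the helpers `render_old`, `render_name_value`, and `parse_metrics` are hypothetical and not taken from the Katib source, so check the collector's actual filter before relying on this.

```python
import re

# Format string copied from the mnist.py inside the image.
# In str.format, "{{" and "}}" are escapes that render as literal braces.
OLD_FORMAT = (
    "\n{{metricName: accuracy, metricValue: {:.4f}}};"
    "{{metricName: loss, metricValue: {:.4f}}}\n"
)

# ASSUMPTION: a hypothetical "name=value" filter, for illustration only.
# It is not copied from the Katib file metrics collector.
NAME_VALUE_FILTER = re.compile(r"([\w-]+)\s*=\s*(-?\d+(?:\.\d+)?)")


def render_old(accuracy, loss):
    """Render the line the way the outdated image logs it."""
    return OLD_FORMAT.format(accuracy, loss)


def render_name_value(accuracy, loss):
    """Render the same metrics as one name=value pair per line (hypothetical)."""
    return "accuracy={:.4f}\nloss={:.4f}\n".format(accuracy, loss)


def parse_metrics(text):
    """Return every name=value pair found in text as a dict of floats."""
    return {name: float(value) for name, value in NAME_VALUE_FILTER.findall(text)}


if __name__ == "__main__":
    old_line = render_old(0.9876, 0.0432)
    print(repr(old_line))           # the braces-and-colons line from the image
    print(parse_metrics(old_line))  # nothing matches: the line contains no "="
    print(parse_metrics(render_name_value(0.9876, 0.0432)))
```

Under these assumptions the old line yields no metrics at all, while the name=value rendering yields both accuracy and loss.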

@hougangliu

What did you expect to happen:

Anything else you would like to add:

Environment:

  • Kubeflow version:
  • Minikube version:
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments:6 (5 by maintainers)

Top GitHub Comments

2 reactions
hougangliu commented, Dec 4, 2019

Sorry for blocking you. I updated the image in https://github.com/kubeflow/katib/pull/947

0 reactions
k8s-ci-robot commented, Dec 5, 2019

@johnugeorge: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
