Duplicate entry '[Experiment Name]' for key 'name' in Katib DB


/kind bug

What steps did you take and what happened:

First step: We cleared all existing experiments, so the experiments table in the Katib DB is empty.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| katib              |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
5 rows in set (0.00 sec)

mysql> use katib;
Database changed
mysql> show tables;
+--------------------------+
| Tables_in_katib          |
+--------------------------+
| experiments              |
| extra_algorithm_settings |
| observation_logs         |
| trials                   |
+--------------------------+
mysql> describe experiments;
+------------------------+--------------+------+-----+---------+----------------+
| Field                  | Type         | Null | Key | Default | Extra          |
+------------------------+--------------+------+-----+---------+----------------+
| id                     | int(11)      | NO   | PRI | NULL    | auto_increment |
| name                   | varchar(255) | NO   | UNI | NULL    |                |
| parameters             | text         | YES  |     | NULL    |                |
| objective              | text         | YES  |     | NULL    |                |
| algorithm              | text         | YES  |     | NULL    |                |
| trial_template         | text         | YES  |     | NULL    |                |
| metrics_collector_spec | text         | YES  |     | NULL    |                |
| parallel_trial_count   | int(11)      | YES  |     | NULL    |                |
| max_trial_count        | int(11)      | YES  |     | NULL    |                |
| status                 | tinyint(4)   | YES  |     | NULL    |                |
| start_time             | datetime(6)  | YES  |     | NULL    |                |
| completion_time        | datetime(6)  | YES  |     | NULL    |                |
| nas_config             | text         | YES  |     | NULL    |                |
+------------------------+--------------+------+-----+---------+----------------+
13 rows in set (0.00 sec)

mysql> SELECT * FROM experiments;
Empty set (0.00 sec)

Second step: We launched a new experiment.

kubectl get experiment
NAME                                     STATUS    AGE
katib-mnist-with-summaries-from-hdfs-3   Running   6m16s

One new record in the DB:

mysql> SELECT * FROM experiments;

| id | name                                   | parameters                                                                                                                                                                                               | objective                                                          | algorithm                  | trial_template | metrics_collector_spec | parallel_trial_count | max_trial_count | status | start_time                 | completion_time            | nas_config |
+----+----------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------+----------------------------+----------------+------------------------+----------------------+-----------------+--------+----------------------------+----------------------------+------------+
| 25 | katib-mnist-with-summaries-from-hdfs-3 | {"parameters":[{"name":"--learning_rate","parameterType":"DOUBLE","feasibleSpace":{"max":"0.05","min":"0.01"}},{"name":"--batch_size","parameterType":"INT","feasibleSpace":{"max":"200","min":"100"}}]} | {"type":"MAXIMIZE","goal":0.99,"objectiveMetricName":"accuracy_1"} | {"algorithmName":"random"} |                |                        |                    3 |               0 |      1 | 2019-08-01 05:17:08.000000 | 0001-01-01 00:00:00.000000 |            |

1 row in set (0.00 sec)

The Katib controller then logs the error Duplicate entry 'katib-mnist-with-summaries-from-hdfs-3' for key 'name':

{"level":"error","ts":1564637349.4766934,"logger":"experiment-controller","caller":"experiment/experiment_controller.go:250","msg":"Create experiment in DB error","Experiment":"ml-algorithms/katib-mnist-with-summaries-from-hdfs-3","error":"rpc error: code = Unknown desc = Error 1062: Duplicate entry 'katib-mnist-with-summaries-from-hdfs-3' for key 'name'","stacktrace":"github.com/kubeflow/katib/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/kubeflow/katib/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/kubeflow/katib/pkg/controller/v1alpha2/experiment.(*ReconcileExperiment).Reconcile\n\t/go/src/github.com/kubeflow/katib/pkg/controller/v1alpha2/experiment/experiment_controller.go:250\ngithub.com/kubeflow/katib/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/kubeflow/katib/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:207\ngithub.com/kubeflow/katib/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/src/github.com/kubeflow/katib/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:157\ngithub.com/kubeflow/katib/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/kubeflow/katib/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\ngithub.com/kubeflow/katib/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/kubeflow/katib/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\ngithub.com/kubeflow/katib/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/kubeflow/katib/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}
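For anyone triaging similar controller logs, here is a minimal Python sketch that pulls the MySQL error code, the duplicated value, and the key name out of a log line like the one above. The log string is an abbreviated copy of that line (stacktrace omitted), and the helper name `parse_duplicate_entry` is made up for illustration; it is not part of Katib.

```python
import json
import re

# Abbreviated copy of the controller log line above (stacktrace omitted).
log_line = (
    '{"level":"error","ts":1564637349.4766934,'
    '"logger":"experiment-controller",'
    '"caller":"experiment/experiment_controller.go:250",'
    '"msg":"Create experiment in DB error",'
    '"Experiment":"ml-algorithms/katib-mnist-with-summaries-from-hdfs-3",'
    '"error":"rpc error: code = Unknown desc = Error 1062: '
    "Duplicate entry 'katib-mnist-with-summaries-from-hdfs-3' for key 'name'\"}"
)

# Matches the MySQL duplicate-key message embedded in the "error" field.
DUPLICATE_RE = re.compile(
    r"Error (\d+): Duplicate entry '([^']*)' for key '([^']*)'"
)


def parse_duplicate_entry(line):
    """Hypothetical helper: return (error_code, value, key) or None."""
    record = json.loads(line)
    match = DUPLICATE_RE.search(record.get("error", ""))
    if match is None:
        return None
    return int(match.group(1)), match.group(2), match.group(3)


record = json.loads(log_line)
# The "Experiment" field has the form "<namespace>/<experiment name>".
namespace, _, experiment = record["Experiment"].partition("/")
parsed = parse_duplicate_entry(log_line)
print(namespace, experiment, parsed)
```

Error 1062 is MySQL's duplicate-entry error, so the parsed tuple confirms that the rejected value is the experiment name itself and that the violated key is the unique index on `name`.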

Question

Why is Katib trying to insert a duplicate experiment name into the DB? What is the intended design here?

The name column in the DB schema is marked as unique.
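To illustrate what the unique constraint does when the same experiment name is written twice, here is a small Python sketch. It uses an in-memory SQLite database as a stand-in for MySQL and a cut-down version of the experiments table, and both function names are hypothetical. This is not Katib's code and not a confirmed explanation of the bug; it only shows that a second insert of the same name is rejected, and one possible way to make such a write idempotent.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL katib database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE experiments ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " name VARCHAR(255) NOT NULL UNIQUE,"
    " status TINYINT)"
)


def create_experiment(name, status=1):
    """Plain insert: raises if the name already exists."""
    conn.execute(
        "INSERT INTO experiments (name, status) VALUES (?, ?)",
        (name, status),
    )


def create_experiment_if_absent(name, status=1):
    """Hypothetical idempotent variant: True only if a row was inserted."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO experiments (name, status) VALUES (?, ?)",
        (name, status),
    )
    return cur.rowcount == 1


name = "katib-mnist-with-summaries-from-hdfs-3"
create_experiment(name)

# A second plain insert of the same name violates the unique constraint.
try:
    create_experiment(name)
    duplicate_rejected = False
except sqlite3.IntegrityError as exc:
    duplicate_rejected = True
    print("second insert rejected:", exc)

# The idempotent variant leaves the existing row alone instead of failing.
inserted_again = create_experiment_if_absent(name)
row_count = conn.execute("SELECT COUNT(*) FROM experiments").fetchone()[0]
print(duplicate_rejected, inserted_again, row_count)
```

In MySQL the closest equivalents to SQLite's `INSERT OR IGNORE` are `INSERT IGNORE` and `INSERT ... ON DUPLICATE KEY UPDATE`. Whether either is appropriate for Katib depends on why the controller attempts the second insert, which this issue does not establish.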

In a previous discussion, @johnugeorge mentioned that Trials are not completed until the trial metrics are persisted in the DB. There is a metrics collector cronjob per trial that runs every minute, collects the metrics, and writes them into the DB.

Is there anything we should change on our side to make the DB work? Is the data being fed into it wrong, or is the table schema wrong?

What did you expect to happen: No duplicate entries in the Katib DB. Maybe use the Trial as the name instead of the Experiment? I'm not sure what Katib's design is here.


Environment:

  • Kubeflow version: 0.5
  • Minikube version: N/A, own cluster
  • Kubernetes version (use kubectl version):
    Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.0", GitCommit:"e8462b5b5dc2584fdcd18e6bcfe9f1e4d970a529", GitTreeState:"clean", BuildDate:"2019-06-19T16:40:16Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"darwin/amd64"}
    Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.0", GitCommit:"641856db18352033a0d96dbc99153fa3b27298e5", GitTreeState:"clean", BuildDate:"2019-03-25T15:45:25Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
  • OS (e.g. from /etc/os-release): rhel6

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 15 (8 by maintainers)

Top GitHub Comments

johnugeorge commented, Aug 1, 2019 (1 reaction)

Can you provide me a dump of controller logs?

k8s-ci-robot commented, Oct 10, 2019 (0 reactions)

@gaocegege: Closing this issue.

In response to this:

/close

It is stale. But feel free to ask here.



