
KeyError: 'Data point X is not unique'

See original GitHub issue

Hi, and congrats on this sweet piece of software, really love it 😃

I’ve implemented an async solution very similar to the one in the example notebook, with 13 virtual machines connecting to a master server hosting the Tornado webserver (like in the notebook), and I now keep running into the same symptom on the master server (the server that registers the tested points and their respective targets in the async example):

Error: 'Data point [3.47653914e+04 2.10539196e+02 3.15656650e+00 6.77134492e+00  1.01962491e+01] is not unique'
Traceback (most recent call last):
  File "playground_master.py", line 72, in post
    self._bo.register(params=body["params"], target=body["target"])
  File "/usr/local/lib/python3.5/dist-packages/bayes_opt/bayesian_optimization.py", line 104, in register
    self._space.register(params, target)
  File "/usr/local/lib/python3.5/dist-packages/bayes_opt/target_space.py", line 161, in register
    raise KeyError('Data point {} is not unique'.format(x))
KeyError: 'Data point [3.47653914e+04 2.10539196e+02 3.15656650e+00 6.77134492e+00 1.01962491e+01] is not unique'
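
(For anyone trying to reproduce the symptom outside the async setup, here is a minimal sketch, assuming the bayes_opt version from the traceback, where TargetSpace.register raises KeyError for an already-registered point; the bounds and values below are made up:)

    from bayes_opt import BayesianOptimization

    # Register-only optimizer, as on the master in the async notebook (no objective function).
    optimizer = BayesianOptimization(f=None, pbounds={"x": (0.0, 10.0)}, verbose=0, random_state=1)

    optimizer.register(params={"x": 2.5}, target=1.0)
    # Registering the exact same point a second time raises the error above:
    optimizer.register(params={"x": 2.5}, target=1.0)  # KeyError: 'Data point [2.5] is not unique'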

It seems like BO is suggesting points (sets of 5 values in my case) to the 13 VMs that have already been tested, almost constantly after ~800 points tested.

Here’s my troubleshooting so far:

  • Using the load/save examples from the front page of BO, I save all tested points to a JSON file; for this instance of the problem, I see 817 points already registered in the JSON (see the sketch just after this list)
  • The data point thrown in the traceback is indeed ALREADY in the JSON, with an associated target value
  • The 13 slave VMs keep asking for “suggestions” (further sets of points to test), but BO mostly hands back sets of points that have already been tested; I still see some rare cases where a point hasn’t been tested yet and the count increases slightly to 818, 819 points… (but most of the time the traceback is thrown)
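
(The load/save in the first bullet follows the logger pattern from the BO README; a minimal sketch with a placeholder log path, not my exact code:)

    from bayes_opt import BayesianOptimization
    from bayes_opt.logger import JSONLogger
    from bayes_opt.event import Events
    from bayes_opt.util import load_logs

    optimizer = BayesianOptimization(f=None, pbounds={"x": (0.0, 10.0)}, verbose=0, random_state=1)

    # Every point registered on the master is appended to ./logs.json as it arrives.
    logger = JSONLogger(path="./logs.json")
    optimizer.subscribe(Events.OPTIMIZATION_STEP, logger)
    optimizer.register(params={"x": 2.5}, target=1.0)  # logs.json now holds one point

    # On a restart, the saved points can be replayed into a fresh optimizer:
    restored = BayesianOptimization(f=None, pbounds={"x": (0.0, 10.0)}, verbose=0, random_state=1)
    load_logs(restored, logs=["./logs.json"])
    print(len(restored.space))  # number of registered points (817 in my run, 1 here)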

I’m a little bit surprised you can end up in such a scenario given my pbounds is very broad, and so has a lot of points to work on without having to test the same ones again:

    pbounds = {'learning_timesteps': (5000, 40000),
               'timesteps_per_batch': (4, 72),
               'observation_loopback': (1, 20),
               'mmr__timeperiod': (7, 365),
               'model': (-0.49, 5.49)}

This is how I initialized the utility function which, as far as I understood, drives which points get suggested (and hence the duplicates): _uf = UtilityFunction(kind="ucb", kappa=2.576, xi=0)
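
For context, the master follows the suggest/register pattern from the async notebook, roughly like this (a simplified sketch, not my exact playground_master.py; body stands in for the parsed Tornado request body):

    from bayes_opt import BayesianOptimization, UtilityFunction

    pbounds = {'learning_timesteps': (5000, 40000),
               'timesteps_per_batch': (4, 72),
               'observation_loopback': (1, 20),
               'mmr__timeperiod': (7, 365),
               'model': (-0.49, 5.49)}

    _bo = BayesianOptimization(f=None, pbounds=pbounds, verbose=0, random_state=1)
    _uf = UtilityFunction(kind="ucb", kappa=2.576, xi=0)

    # GET handler: a worker VM asks for the next point to try.
    suggestion = _bo.suggest(_uf)  # dict of the 5 parameter values

    # POST handler: a worker VM reports back the target it measured for that point.
    body = {"params": suggestion, "target": 0.0}
    _bo.register(params=body["params"], target=body["target"])  # raises KeyError on a duplicate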

Should I modify the acquisition function, or some of its hyperparameters (“kappa”, “xi”)?

I see https://github.com/fmfn/BayesianOptimization/issues/92, which is related to this, but I’m not doing any manual point probing or any initialization; I really stuck to the async notebook example, so I’m not sure that issue applies to me 😦

Let me know if I can share further information & more context on this. Thanks in advance for the help 😃 Lucas

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 12 (2 by maintainers)

Top GitHub Comments

2 reactions
rmcconke commented, Jun 14, 2022

It seems this could be part of a larger feature related to a “termination” condition. AFAIK, the current code only runs for a specified number of iterations; it does not have a “convergence criterion”. The error arises from the register function finding an identical point, so when the optimizer gets “stuck” here I think the same point will be suggested repeatedly.

After suggesting a duplicate point, the point is not registered (the try statement fails), and the optimizer will suggest the same point on the next iteration (the posterior hasn’t changed). So, it may be possible to detect this immediately (i.e., after only a single duplicate suggestion). It isn’t really an error; it is the optimizer getting stuck at a point where the utility function is higher than at any other unprobed location.

This tends to happen with highly exploitative values of kappa or xi, so we could also suggest a higher value in the error message if this occurs.
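
In code, the immediate detection described above could look roughly like this (a sketch, not built-in behaviour; the objective and the kappa bump are placeholders, and it assumes a bayes_opt version where a duplicate registration raises KeyError):

    from bayes_opt import BayesianOptimization, UtilityFunction

    optimizer = BayesianOptimization(f=None, pbounds={"x": (0.0, 10.0)}, verbose=0, random_state=1)

    def evaluate(point):
        # Placeholder objective, just for the sketch.
        return -(point["x"] - 5.0) ** 2

    kappa = 2.576  # exploitative starting value
    utility = UtilityFunction(kind="ucb", kappa=kappa, xi=0)

    for _ in range(50):
        point = optimizer.suggest(utility)
        try:
            optimizer.register(params=point, target=evaluate(point))
        except KeyError:
            # Duplicate suggestion: the posterior hasn't changed, so either treat this as a
            # termination signal or make the utility more explorative before asking again.
            kappa *= 2
            utility = UtilityFunction(kind="ucb", kappa=kappa, xi=0)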

0 reactions
bwheelz36 commented, Dec 4, 2022

fixed in #372


Top Results From Across the Web

  • Python KeyError Exceptions and How to Handle Them: In this tutorial, you'll learn how to handle Python KeyError exceptions. They are often caused by a bad key lookup in a dictionary,...
  • How to fix Python KeyError Exceptions in simple steps?: A Python KeyError is raised when you try to access an invalid key in a dictionary. In simple terms, when you see a...
  • I'm getting Key error in python - Stack Overflow: A KeyError generally means the key doesn't exist. So, are you sure the path key exists? From the official python docs: exception KeyError....
  • How to Fix KeyError in Pandas (With Example) - Statology: The way to fix this error is to simply make sure we spell the column name correctly. ... We can see that there...
  • Error Types in Python - TutorialsTeacher: In Python 3.x, print is a built-in function and requires parentheses. ... KeyError, Raised when a key is not found in a dictionary....
