UCB1 arm_to_expectation not updated for all arms (potential bug)

When partial_fit() is called with UCB1, arm_to_expectation is updated only for the arms that have a reward (because of https://github.com/fmr-llc/mabwiser/blob/0c860253be017d1f393e18bf9d9d7e1739f93dca/mabwiser/ucb.py#L62 ). If an arm does not have a reward, its arm_to_expectation is not updated.

arm_to_expectation depends on self.total_count, which is updated whenever any arm is invoked. Thus, arm_to_expectation needs to be updated for all arms whenever self.total_count changes.

Solution: remove the above condition (if arm_rewards.size).
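
To illustrate the effect, here is a minimal, self-contained sketch. It is not the MABWiser code: ToyUCB1, the ucb1 helper, and the refresh_all flag are hypothetical names made up for this example, and the score uses the textbook UCB1 formula (mean + alpha * sqrt(2 * ln(total_count) / count)), which is assumed here rather than taken from the library. With refresh_all=False only the arms that received rewards are rescored, so the other arm keeps a score computed from an older total_count.

```python
import math


def ucb1(mean, count, total_count, alpha=1.0):
    """Textbook UCB1 score: mean reward plus an exploration bonus."""
    if count == 0:
        return float("inf")  # arms never pulled are maximally attractive
    return mean + alpha * math.sqrt(2 * math.log(total_count) / count)


class ToyUCB1:
    """Hypothetical toy bandit, for illustration only."""

    def __init__(self, arms, alpha=1.0):
        self.alpha = alpha
        self.total_count = 0
        self.arm_to_sum = {arm: 0.0 for arm in arms}
        self.arm_to_count = {arm: 0 for arm in arms}
        self.arm_to_expectation = {arm: float("inf") for arm in arms}

    def partial_fit(self, decisions, rewards, refresh_all=True):
        self.total_count += len(decisions)
        touched = set()
        for arm, reward in zip(decisions, rewards):
            self.arm_to_sum[arm] += reward
            self.arm_to_count[arm] += 1
            touched.add(arm)
        # refresh_all=False mimics rescoring only the arms that have rewards
        arms_to_update = self.arm_to_count if refresh_all else touched
        for arm in arms_to_update:
            n = self.arm_to_count[arm]
            mean = self.arm_to_sum[arm] / n if n else 0.0
            self.arm_to_expectation[arm] = ucb1(
                mean, n, self.total_count, self.alpha
            )


stale = ToyUCB1(["a", "b"])
fresh = ToyUCB1(["a", "b"])
for model, refresh in ((stale, False), (fresh, True)):
    model.partial_fit(["a", "b"], [1.0, 0.0], refresh_all=refresh)
    model.partial_fit(["a"], [1.0], refresh_all=refresh)  # arm "b" gets no reward

# Arm "b" was scored with total_count=2 in `stale`, but with total_count=3 in `fresh`.
print(stale.arm_to_expectation["b"], fresh.arm_to_expectation["b"])
```

In this sketch the score of arm "a" is identical in both models; only arm "b", which had no reward in the second call, ends up with a lower (stale) score when it is skipped.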

Happy to submit a fix if this change can be made.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

erstrong commented, Dec 20, 2019 (1 reaction)

PR #12 has been merged. Closing issue. Thanks again for reporting this @harisankarh!

skadio commented, Dec 20, 2019 (1 reaction)

Thank you both for brainstorming on this! The current PR seems to address this issue.
