UCB1 arm_to_expectation not updated for all arms (potential bug)

When partial_fit() is called with UCB1, arm_to_expectation is updated only for the arms that received a reward in that batch (because of the check at https://github.com/fmr-llc/mabwiser/blob/0c860253be017d1f393e18bf9d9d7e1739f93dca/mabwiser/ucb.py#L62 ). If an arm has no reward in the batch, its arm_to_expectation is left unchanged.

arm_to_expectation depends on self.total_count, which is updated whenever "any" arm is pulled. Thus, arm_to_expectation needs to be recomputed for all arms whenever self.total_count changes.
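For reference, the dependency is visible in the textbook UCB1 index. This is the standard form only; the exact expression in mabwiser may differ (for example by a scaling factor on the exploration term). Here N is assumed to correspond to self.total_count, n_a to the number of pulls of arm a, and \bar{x}_a to that arm's mean reward:

```latex
\mathrm{UCB}_a = \bar{x}_a + \sqrt{\frac{2 \ln N}{n_a}}
```

Since N appears in the index of every arm, a pull of any one arm changes the index of all the others, even when n_a and \bar{x}_a for those arms stay the same.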

Solution: remove the above condition (if arm_rewards.size).
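A minimal sketch of the effect, not mabwiser's actual code: ToyUCB1, ucb1_value, and the skip_arms_without_rewards flag are hypothetical names made up for illustration, and the index uses the textbook UCB1 formula. With the flag set, the toy mimics the reported behaviour (arms without rewards in the batch are skipped); with it cleared, every arm is recomputed on each partial_fit().

```python
import math


def ucb1_value(mean, arm_count, total_count):
    """Textbook UCB1 index: mean reward plus an exploration bonus."""
    if arm_count == 0:
        # Assumption: arms never pulled get an infinite index so they are tried first.
        return float("inf")
    return mean + math.sqrt(2.0 * math.log(total_count) / arm_count)


class ToyUCB1:
    """Hypothetical stand-in for a UCB1 learner; not the mabwiser implementation."""

    def __init__(self, arms, skip_arms_without_rewards):
        self.arms = list(arms)
        self.skip = skip_arms_without_rewards
        self.total_count = 0
        self.arm_count = {a: 0 for a in self.arms}
        self.arm_sum = {a: 0.0 for a in self.arms}
        self.arm_to_expectation = {a: float("inf") for a in self.arms}

    def partial_fit(self, decisions, rewards):
        # total_count grows whenever any arm is pulled.
        self.total_count += len(decisions)
        for arm in self.arms:
            arm_rewards = [r for d, r in zip(decisions, rewards) if d == arm]
            if self.skip and not arm_rewards:
                continue  # reported behaviour: this arm keeps a stale expectation
            self.arm_count[arm] += len(arm_rewards)
            self.arm_sum[arm] += sum(arm_rewards)
            n = self.arm_count[arm]
            mean = self.arm_sum[arm] / n if n else 0.0
            self.arm_to_expectation[arm] = ucb1_value(mean, n, self.total_count)


def demo():
    """Return arm b's expectation with and without the skip, keyed by the flag."""
    results = {}
    for skip in (True, False):
        mab = ToyUCB1(["a", "b"], skip_arms_without_rewards=skip)
        mab.partial_fit(["a", "b"], [1.0, 0.0])   # both arms pulled once
        mab.partial_fit(["a"] * 8, [1.0] * 8)     # only arm a pulled afterwards
        results[skip] = mab.arm_to_expectation["b"]
    return results


if __name__ == "__main__":
    print(demo())
```

In this toy run, arm b's expectation stays at the value computed when total_count was 2 if the skip is active, while without the skip it is recomputed with total_count at 10 and so carries a larger exploration bonus.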

Happy to submit a fix if this change can be made.

Issue Analytics

State:
Created 4 years ago
Comments: 6 (6 by maintainers)

Top GitHub Comments

erstrong commented, Dec 20, 2019

PR #12 has been merged. Closing issue. Thanks again for reporting this @harisankarh!

skadio commented, Dec 20, 2019

Thank you both for brainstorming on this! And the current PR seems to address this issue.
