Questions about MCTS
Hi, in the update_branch function, you update the parent node's value with the same total_reward as the current node. Why? I think they should have different values, because the action leading from the parent node to the current node yields a reward, which should be added to the parent's total_reward.
def update_branch(self, total_reward):
    self.update(total_reward)
    if self.parent:
        self.parent.update_branch(total_reward)
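To make the question concrete, here is a minimal sketch of the alternative being suggested, not necessarily the fix the maintainers adopted. It assumes a hypothetical `Node` class in which each node stores the reward of the transition from its parent in a `reward` attribute, keeps a running mean in `value`, and accepts an optional discount factor `gamma`; none of these names come from the repository.

```python
class Node:
    """Hypothetical minimal tree node, for illustration only."""

    def __init__(self, parent=None, reward=0.0):
        self.parent = parent
        # Assumption: reward received on the transition parent -> this node.
        self.reward = reward
        self.count = 0
        self.value = 0.0  # running mean of the returns backed up through this node

    def update(self, total_reward):
        # Incremental mean update.
        self.count += 1
        self.value += (total_reward - self.value) / self.count

    def update_branch(self, total_reward, gamma=1.0):
        # total_reward is the return collected from this node onward.
        self.update(total_reward)
        if self.parent:
            # The parent's return also includes the reward of the step that led here.
            self.parent.update_branch(self.reward + gamma * total_reward, gamma)


root = Node()
child = Node(parent=root, reward=1.0)
leaf = Node(parent=child, reward=2.0)
leaf.update_branch(5.0)
```

With this convention each node's value estimates the return from that node onward, so a parent and its child generally hold different values.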
Issue Analytics
- State:
- Created 3 years ago
- Comments: 13 (9 by maintainers)
Top GitHub Comments
That makes sense. I suppose the shift invariance of UCB selection means the magnitude of the value function doesn’t really matter, but for non-argmax policies it might make a difference.
Yes, it should! Thank you for catching this, I’ll fix it.
Indeed, it would matter for a Boltzmann policy, for instance. In that case, it is probably best to ignore the past.
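The shift-invariance remark about UCB can be checked with a small sketch. It assumes a UCB1-style score (mean value plus an exploration bonus) and made-up values and visit counts for three sibling nodes; the function names and the constant `c` are illustrative, not taken from the repository. Adding the same constant to every sibling's value, as happens when all siblings share the same rewards collected before their parent, leaves the selected child unchanged.

```python
import math


def ucb_scores(values, counts, c=1.0):
    """UCB1-style score per child: mean value plus an exploration bonus."""
    total = sum(counts)
    return [v + c * math.sqrt(math.log(total) / n) for v, n in zip(values, counts)]


def argmax(xs):
    """Index of the largest element."""
    return max(range(len(xs)), key=xs.__getitem__)


# Made-up statistics for three sibling nodes.
values = [0.2, 0.5, 0.4]
counts = [3, 10, 5]

# Hypothetical reward accumulated before reaching the parent, shared by all siblings.
shift = 7.0

base_choice = argmax(ucb_scores(values, counts))
shifted_choice = argmax(ucb_scores([v + shift for v in values], counts))
print(base_choice == shifted_choice)
```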