DQN with state vectors not frames
I wanted to start from example_1 and apply it to my specific environment. The state is a vector of 5001 floats. For some reason, this is interpreted as a sequence of frames:
Frame-based buffer using 5001-frame sequences.
How do I make sure it is doing the right thing?
Issue Analytics
- State:
- Created 4 years ago
- Comments: 12 (6 by maintainers)
Top GitHub Comments
Oh! You’re right! That’s embarrassing, haha, sorry about that… fixed in 85d4e01 😃, with a warning that it’s totally up to the user to make sure the provided replay buffer class has all the right behaviors.
@astooke I did what you recommended in my own customized environment (my state is also a vector), and it solved the “low >= high” problem.
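For readers wondering what a replay buffer for vector states (as opposed to frame sequences) might look like, here is a minimal, library-independent sketch. It is not the buffer class from this repository, and it does not implement whatever interface the library expects from a user-provided replay buffer class; the name `VectorReplayBuffer` and its methods `append` and `sample` are hypothetical, chosen only for illustration. Each observation is stored whole as one flat vector, so no frame stacking or frame deduplication is involved.

```python
import random


class VectorReplayBuffer:
    """Hypothetical uniform replay buffer that stores each state as one flat vector.

    Illustrative only: this is not the library's buffer class or its interface.
    """

    def __init__(self, capacity, obs_dim):
        self.capacity = capacity
        self.obs_dim = obs_dim
        self.storage = [None] * capacity  # circular storage of transitions
        self.next_idx = 0                 # slot the next transition is written to
        self.size = 0                     # number of valid transitions stored

    def append(self, obs, action, reward, next_obs, done):
        # Reject anything that is not a single vector of the expected length.
        if len(obs) != self.obs_dim or len(next_obs) != self.obs_dim:
            raise ValueError(f"expected vectors of length {self.obs_dim}")
        self.storage[self.next_idx] = (tuple(obs), action, reward, tuple(next_obs), done)
        # Wrap around so the oldest transition is overwritten once full.
        self.next_idx = (self.next_idx + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def sample(self, batch_size, rng=random):
        # Sampling from an empty buffer has no valid index range, so fail early
        # with a clear message.
        if self.size == 0:
            raise ValueError("cannot sample from an empty buffer")
        idxs = [rng.randrange(self.size) for _ in range(batch_size)]
        return [self.storage[i] for i in idxs]


if __name__ == "__main__":
    buf = VectorReplayBuffer(capacity=1000, obs_dim=5001)
    state = [0.0] * 5001
    buf.append(state, action=1, reward=0.5, next_obs=state, done=False)
    batch = buf.sample(batch_size=4)
    print(len(batch), len(batch[0][0]))
```

One quick check that a buffer is treating the state as a single vector is to sample a batch and confirm that every observation in it has length 5001, with no extra sequence dimension.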