Modifying DQN model to accept 3D images
I’ve had some difficulty modifying your code to work directly on image stacks. Your RL model uses the past few 2D frames as channels and does 3D convolutions on that frame history. Instead, I want my agent to see only the current step (i.e. FRAME_HISTORY = 1), with single-channel image stacks as the inputs. I was hoping you could give me some insight.
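A minimal sketch of the input layout the question describes, assuming a channels-last 3D convolution that expects tensors shaped (batch, depth, height, width, channels). The helper name to_network_input and the 45-voxel volume size are illustrative assumptions (the size is borrowed from the FakeData shape quoted later in this thread); neither is part of the repository's API.

```python
import numpy as np

FRAME_HISTORY = 1            # only the current step is visible to the agent
IMAGE_SIZE = (45, 45, 45)    # assumed volume size, taken from the FakeData shape in this thread


def to_network_input(volume):
    """Hypothetical helper: wrap one single-channel image stack as a batch of one.

    Adds a leading batch axis and a trailing channel axis, so a (D, H, W)
    volume becomes (1, D, H, W, FRAME_HISTORY) with FRAME_HISTORY == 1.
    """
    volume = np.asarray(volume)
    if volume.shape != IMAGE_SIZE:
        raise ValueError(f"expected shape {IMAGE_SIZE}, got {volume.shape}")
    return volume[np.newaxis, ..., np.newaxis]


stack = np.zeros(IMAGE_SIZE, dtype=np.uint8)   # placeholder image stack
batch = to_network_input(stack)
print(batch.shape)  # (1, 45, 45, 45, 1)
```

With this layout the last axis holds a single channel instead of a frame history, so the network's input placeholder would need a matching final dimension of 1.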
Issue Analytics
- State:
- Created 5 years ago
- Comments: 13 (8 by maintainers)
Top GitHub Comments
I usually get around 3-4 it/sec using the default big architecture on a GTX 1080. You can try a tiny architecture first until you see some decent results, then try a bigger one. And if you do not need 3D convs, it will be much faster to use 2D convs.
You can also monitor the performance using TensorBoard by pointing it at the train_log directory. Signs like playing more games over time, or an increasing number of successful episodes/games, are healthy indicators that the agent is learning.
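To give a rough sense of why 2D convolutions are so much cheaper, here is a back-of-the-envelope count of multiply-accumulate operations for a single layer. It is only a sketch: it assumes stride 1 and "same" padding, ignores pooling and memory traffic, and borrows the 45-voxel input, 5 input channels, 5x5x5 kernel and 32 output channels from the configuration quoted further down in this thread. The function name conv_macs is made up for illustration.

```python
from math import prod


def conv_macs(spatial, kernel, in_ch, out_ch):
    """Multiply-accumulates for one stride-1, 'same'-padded convolution layer."""
    return prod(spatial) * prod(kernel) * in_ch * out_ch


# One 3D layer over a 45x45x45 volume versus one 2D layer over a 45x45 slice.
macs_3d = conv_macs((45, 45, 45), (5, 5, 5), in_ch=5, out_ch=32)
macs_2d = conv_macs((45, 45), (5, 5), in_ch=5, out_ch=32)
ratio = macs_3d // macs_2d

print(macs_3d)  # 1822500000
print(macs_2d)  # 8100000
print(ratio)    # 225
```

Under these assumptions the 3D layer costs 225 times as much as the 2D layer, which is the extra depth (45) times the extra kernel dimension (5).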
@amiralansary @crypdick @gravity1989 @sunalbert
Training speed is too slow
My current environment is Windows 7 x64 with an Nvidia GeForce GTX 1080 (8 GB), CUDA 9.0, cuDNN 7.0.5, tensorflow-gpu 1.6.0, tensorpack 0.8.0, and gym 0.12.1.
I used the example data for training:
    \tensorpack-medical\examples\LandmarkDetection\DQN\data\filenames\image_files
    \tensorpack-medical\examples\LandmarkDetection\DQN\data\filenames\landmark_files
Because of the GPU memory limit, I used this parameter:
    BATCH_SIZE = 24
and these GPU and CPU settings:
    mem_fraction = 0.8
    # conf = tf.ConfigProto(log_device_placement=True)
    conf = tf.ConfigProto()
    # conf.allow_soft_placement = True
    conf.intra_op_parallelism_threads = 6
    conf.inter_op_parallelism_threads = 6
    conf.gpu_options.per_process_gpu_memory_fraction = mem_fraction
    conf.gpu_options.allow_growth = True
To exclude the effect of data loading, I used FakeData:
    dataflow = FakeData(
        [[BATCH_SIZE,45,45,45,5],[BATCH_SIZE],[BATCH_SIZE],[BATCH_SIZE]],
        size=1000, random=False,
        dtype=['uint8','float32','int8','bool'])
and a minimal training setting:
    return TrainConfig(
        data=QueueInput(dataflow),
        model=Model(),
        callbacks=[],
        # steps_per_epoch=10,
        steps_per_epoch=10,
        max_epoch=1000,
        session_config=conf,
    )
The training speed is 28 seconds per iteration.
Even when I reduce the model complexity (by commenting out most of the Conv3D and pooling layers):
    with argscope(Conv3D, nl=PReLU.symbolic_function, use_bias=True):
        # core layers of the network
        conv = (LinearWrap(image)
                .Conv3D('conv0', out_channel=32,
                        kernel_shape=[5,5,5], stride=[1,1,1])
                .MaxPooling3D('pool0',16)
                # .Conv3D('conv1', out_channel=32,
                #         kernel_shape=[5,5,5], stride=[1,1,1])
                # .MaxPooling3D('pool1',2)
                # .Conv3D('conv2', out_channel=64,
                #         kernel_shape=[4,4,4], stride=[1,1,1])
                # .MaxPooling3D('pool2',2)
                # .Conv3D('conv3', out_channel=64,
                #         kernel_shape=[3,3,3], stride=[1,1,1])
                )
the training speed is still 22 seconds per iteration.
That is about 100x slower than the speed you reported (around 3-4 it/sec using the default big architecture on a GTX 1080).
I would like to know why, and I would appreciate any suggestions for reducing the training time.
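The "100x" estimate can be checked from the figures quoted in this thread. The sketch below converts the reported seconds per iteration into a slowdown factor against the 3-4 it/sec reference. It assumes both measurements count an iteration the same way, which the thread does not confirm (batch sizes may differ, for example). The variable and function names are illustrative.

```python
REFERENCE_IT_PER_SEC = (3.0, 4.0)   # maintainer's figure on a GTX 1080
REPORTED_SEC_PER_IT = {
    "full model": 28.0,             # seconds per iteration, as reported
    "reduced model": 22.0,
}


def slowdown(sec_per_it, reference_it_per_sec):
    """How many times slower a run is than a reference given in it/sec."""
    return sec_per_it * reference_it_per_sec


factors = {
    name: tuple(slowdown(sec, ref) for ref in REFERENCE_IT_PER_SEC)
    for name, sec in REPORTED_SEC_PER_IT.items()
}

for name, (low, high) in factors.items():
    print(f"{name}: {low:.0f}x to {high:.0f}x slower")
# full model: 84x to 112x slower
# reduced model: 66x to 88x slower
```

So the full model is roughly 84 to 112 times slower than the reference, and removing most of the layers only brings that down to roughly 66 to 88 times.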