Minor issues - but nothing's broken
Thanks for this amazing code. I was working with your implementation when I came across a number of issues. Since my changes might not be to your liking, I'm not going to make a PR (my implementation is a little different), but I thought I should report the issues I've found.
Before I list the issues, you should know that none of them break the implementation: your code works just the way it is. But if someone wants to make a change to it (like me), they'll have a headache, and some parts of the code are simply incorrect.
1. The way the fake states are initialized is unnecessarily complicated:
def create_inital_state(inputs, hidden_size):
    # We are not using initial states, but need to pass something to the K.rnn function
    fake_state = K.zeros_like(inputs)  # <= (batch_size, enc_seq_len, latent_dim)
    fake_state = K.sum(fake_state, axis=[1, 2])  # <= (batch_size,)
    fake_state = K.expand_dims(fake_state)  # <= (batch_size, 1)
    fake_state = K.tile(fake_state, [1, hidden_size])  # <= (batch_size, hidden_size)
    return fake_state

fake_state_c = create_inital_state(encoder_out_seq, encoder_out_seq.shape[-1])  # <= (batch_size, latent_dim)
fake_state_e = create_inital_state(encoder_out_seq, encoder_out_seq.shape[1])  # <= (batch_size, enc_seq_len)
Instead, simply initialize the tensors with zeros:
fake_state_e = K.zeros_like(K.placeholder(shape=(decoder_out_seq.shape[0], 1)))
fake_state_c = K.zeros_like(K.placeholder(shape=(decoder_out_seq.shape[0], 1)))
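For what it's worth, here is a minimal alternative sketch that avoids K.placeholder entirely (assuming K is the Keras backend and encoder_out_seq has shape (batch_size, enc_seq_len, latent_dim)); slicing keeps the dynamic batch dimension, so none of the sum/tile gymnastics are needed:

# Alternative sketch: the slice preserves the dynamic batch dimension,
# and the resulting shape is irrelevant since K.rnn discards these states.
fake_state_e = K.zeros_like(encoder_out_seq[:, 0, :1])  # <= (batch_size, 1)
fake_state_c = K.zeros_like(encoder_out_seq[:, 0, :1])  # <= (batch_size, 1)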
2. In both of your step functions, you return the state like this:
def energy_step(inputs, states):
    ...
    return e_i, [e_i]

def context_step(inputs, states):
    ...
    return c_i, [c_i]
While this does not throw any errors (because the states are discarded), it is actually wrong. You just have to return the states unchanged, like this:
def energy_step(inputs, states):
    ...
    return e_i, states

def context_step(inputs, states):
    ...
    return c_i, states
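To see why passing the incoming states through is the right contract, here is a self-contained toy illustration of how K.rnn calls a step function (assuming tf.keras; the step function and sizes here are invented for demonstration). K.rnn expects step_function(inputs, states) to return (output, new_states), where new_states mirrors the structure of initial_states:

import tensorflow as tf
from tensorflow.keras import backend as K

def step(x_t, states):
    out = K.sum(x_t, axis=-1, keepdims=True)  # toy per-timestep output
    return out, states  # stateless step: pass states through unchanged

inputs = tf.random.normal((2, 5, 3))  # (batch, timesteps, dim)
last_out, outputs, final_states = K.rnn(step, inputs, initial_states=[tf.zeros((2, 1))])
print(outputs.shape)  # (2, 5, 1)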
3. You already have a PR about this; I'm just mentioning it for the sake of completeness. Your output shape is this:
def compute_output_shape(self, input_shape):
    """ Outputs produced by the layer """
    return [
        tf.TensorShape((input_shape[1][0], input_shape[1][1], input_shape[1][2])),
        tf.TensorShape((input_shape[1][0], input_shape[1][1], input_shape[0][1]))
    ]
But it should be this (the first output is the context vector, so its last dimension is the encoder's feature dimension, input_shape[0][2], not the decoder's):
def compute_output_shape(self, input_shape):
    """ Outputs produced by the layer """
    return [
        tf.TensorShape((input_shape[1][0], input_shape[1][1], input_shape[0][2])),
        tf.TensorShape((input_shape[1][0], input_shape[1][1], input_shape[0][1]))
    ]
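To make the difference concrete, here are some hypothetical shapes (the sizes are invented purely for illustration):

# With input_shape = [encoder_shape, decoder_shape]:
# input_shape[0] = (batch, enc_seq_len, enc_dim) = (None, 20, 64)  # encoder_out_seq
# input_shape[1] = (batch, dec_seq_len, dec_dim) = (None, 15, 32)  # decoder_out_seq
#
# Corrected outputs:
# context vectors:   (None, 15, 64)  # uses input_shape[0][2], the encoder dim
# attention weights: (None, 15, 20)  # uses input_shape[0][1], the encoder seq len
#
# The original code returned (None, 15, 32) for the context vectors,
# i.e. the decoder's feature dimension, which is wrong.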
Thanks again. I learned a lot from your code.
Top GitHub Comments
@ziadloo ,
Thanks for pointing these out. I'll go through them and make the necessary changes.
@John-8704 ,
This is something I will be working on soon. Yes, since an LSTM returns two states, it might not work properly as is.
@OmniaZayed ,
Regarding your question: I think you're right. encoder_inf_states seems to be redundant. You should be able to use encoder_inf_out as you suggested, but I'll need to double-check whether there was a reason I did it this way. I suspect it's just a mistake; I'll update the issue if I find anything.