Minor issues - but nothing's broken
Thanks for this amazing code. I was working with your implementation when I came across a number of issues. Since my changes might not be to your liking, I'm not going to make a PR (my implementation is a little different), but I thought I should report the issues I've found.
Before I list the issues, you should know that none of them break the implementation: your code works just the way it is. But if someone wants to make a change to it (like me), they'll have a headache, and some parts of the code are simply incorrect.
1. The way the fake states are initialized is unnecessarily complicated:
def create_inital_state(inputs, hidden_size):
    # We are not using initial states, but need to pass something to the K.rnn function
    fake_state = K.zeros_like(inputs)  # <= (batch_size, enc_seq_len, latent_dim)
    fake_state = K.sum(fake_state, axis=[1, 2])  # <= (batch_size,)
    fake_state = K.expand_dims(fake_state)  # <= (batch_size, 1)
    fake_state = K.tile(fake_state, [1, hidden_size])  # <= (batch_size, hidden_size)
    return fake_state

fake_state_c = create_inital_state(encoder_out_seq, encoder_out_seq.shape[-1])  # <= (batch_size, latent_dim)
fake_state_e = create_inital_state(encoder_out_seq, encoder_out_seq.shape[1])  # <= (batch_size, enc_seq_len)
Instead, simply initialize the tensors with zeros:
fake_state_e = K.zeros_like(K.placeholder(shape=(decoder_out_seq.shape[0], 1)))
fake_state_c = K.zeros_like(K.placeholder(shape=(decoder_out_seq.shape[0], 1)))
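For what it's worth, here is a minimal alternative sketch that avoids K.placeholder entirely (assuming K is the Keras backend and encoder_out_seq has shape (batch_size, enc_seq_len, latent_dim)); slicing keeps the dynamic batch dimension, so none of the sum/tile gymnastics are needed:

# Alternative sketch: the slice preserves the dynamic batch dimension,
# and the resulting shape is irrelevant since K.rnn discards these states.
fake_state_e = K.zeros_like(encoder_out_seq[:, 0, :1])  # <= (batch_size, 1)
fake_state_c = K.zeros_like(encoder_out_seq[:, 0, :1])  # <= (batch_size, 1)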
2. In both of your step functions, you return the state like this:
def energy_step(inputs, states):
    ...
    return e_i, [e_i]

def context_step(inputs, states):
    ...
    return c_i, [c_i]
While this does not throw any errors (because the states are discarded), it is actually wrong. You just have to return the states unchanged, like this:
def energy_step(inputs, states):
    ...
    return e_i, states

def context_step(inputs, states):
    ...
    return c_i, states
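To see why passing the incoming states through is the right contract, here is a self-contained toy illustration of how K.rnn calls a step function (assuming tf.keras; the step function and sizes here are invented for demonstration). K.rnn expects step_function(inputs, states) to return (output, new_states), where new_states mirrors the structure of initial_states:

import tensorflow as tf
from tensorflow.keras import backend as K

def step(x_t, states):
    out = K.sum(x_t, axis=-1, keepdims=True)  # toy per-timestep output
    return out, states  # stateless step: pass states through unchanged

inputs = tf.random.normal((2, 5, 3))  # (batch, timesteps, dim)
last_out, outputs, final_states = K.rnn(step, inputs, initial_states=[tf.zeros((2, 1))])
print(outputs.shape)  # (2, 5, 1)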
3. You already have a PR about this; I'm just mentioning it for the sake of completeness. Your output shape is this:
def compute_output_shape(self, input_shape):
    """ Outputs produced by the layer """
    return [
        tf.TensorShape((input_shape[1][0], input_shape[1][1], input_shape[1][2])),
        tf.TensorShape((input_shape[1][0], input_shape[1][1], input_shape[0][1]))
    ]
But it should be this (the first output is the context vector, so its last dimension is the encoder's feature dimension, input_shape[0][2], not the decoder's):
def compute_output_shape(self, input_shape):
    """ Outputs produced by the layer """
    return [
        tf.TensorShape((input_shape[1][0], input_shape[1][1], input_shape[0][2])),
        tf.TensorShape((input_shape[1][0], input_shape[1][1], input_shape[0][1]))
    ]
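To make the difference concrete, here are some hypothetical shapes (the sizes are invented purely for illustration):

# With input_shape = [encoder_shape, decoder_shape]:
# input_shape[0] = (batch, enc_seq_len, enc_dim) = (None, 20, 64)  # encoder_out_seq
# input_shape[1] = (batch, dec_seq_len, dec_dim) = (None, 15, 32)  # decoder_out_seq
#
# Corrected outputs:
# context vectors:   (None, 15, 64)  # uses input_shape[0][2], the encoder dim
# attention weights: (None, 15, 20)  # uses input_shape[0][1], the encoder seq len
#
# The original code returned (None, 15, 32) for the context vectors,
# i.e. the decoder's feature dimension, which is wrong.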
Thanks again. I learned a lot from your code.
Top GitHub Comments
@ziadloo ,
Thanks for pointing these out. I'll go through them and make the necessary changes.
@John-8704 ,
This is something I will be working on soon. Yes, since an LSTM returns two states, it might not work properly as is.
@OmniaZayed ,
Regarding your question: I think you're right. encoder_inf_states seems to be redundant. You should be able to use encoder_inf_out as you suggested, but I'll need to double-check whether there was a reason I did it this way. I suspect it's just a mistake; I'll update the issue if I find anything.