Hello @pemami4911,
The problem really was with the mask. I've fixed it and the network has started to learn. My Decoder is now:
```python
class Decoder(nn.Module):
    def __init__(self, features_dim, hidden_size, n_layers=1):
        super(Decoder, self).__init__()
        # Var is my helper that creates trainable parameters of the given shape
        self.W1 = Var(hidden_size, hidden_size)
        self.W2 = Var(hidden_size, hidden_size)
        self.b2 = Var(hidden_size)
        self.V = Var(hidden_size)
        self.lstm = nn.LSTM(hidden_size, hidden_size, batch_first=True,
                            num_layers=n_layers)

    def forward(self, input, hidden, enc_outputs, mask, prev_idxs):
        # Additive (pointer-network) attention over the encoder outputs
        w1e = torch.matmul(enc_outputs, self.W1)
        w2h = (torch.matmul(hidden[0][-1], self.W2) + self.b2).unsqueeze(1)
        u = F.tanh(w1e + w2h)
        a = torch.matmul(u, self.V)
        a, mask = self.apply_mask(a, mask, prev_idxs)
        a = F.softmax(a)
        res, hidden = self.lstm(input, hidden)
        return a, hidden, mask

    def apply_mask(self, attentions, mask, prev_idxs):
        if mask is None:
            mask = Variable(torch.ones(attentions.size())).cuda()
        maskk = mask.clone()
        if prev_idxs is not None:
            # Zero out the cities already visited in this batch
            for i, j in zip(range(attentions.size(0)), prev_idxs.data):
                maskk[i, j[0]] = 0
            # maskk.log() is -inf at visited positions, so the later
            # softmax assigns them zero probability
            masked = maskk * attentions + maskk.log()
        else:
            masked = attentions
        return masked, maskk
```
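For anyone puzzled by the `maskk * attentions + maskk.log()` line: since the mask holds 0 at visited cities and 1 elsewhere, its log is `-inf` at visited positions and 0 elsewhere, which drives those logits to `-inf` so the subsequent softmax gives them exactly zero probability. A minimal sketch of just that trick (the tensor values here are made up for illustration; written against current PyTorch, so no `Variable`):

```python
import torch
import torch.nn.functional as F

# Batch of 1, 4 cities; suppose city 2 was already visited.
logits = torch.tensor([[1.0, 2.0, 3.0, 0.5]])
mask = torch.tensor([[1.0, 1.0, 0.0, 1.0]])  # 0 marks visited cities

# mask.log() is [0, 0, -inf, 0]; adding it sends the visited
# city's logit to -inf, so softmax assigns it zero probability.
masked = mask * logits + mask.log()
probs = F.softmax(masked, dim=-1)
print(probs)  # entry at index 2 is exactly 0
```

The same effect can be obtained with `logits.masked_fill(mask == 0, float('-inf'))`, which avoids computing `log` on every step.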
For n=10, I'm seeing the following while training the actor-critic (AC) version:
```
Step 0
Average train model loss:  -81.07959747314453
Average train critic loss:  29.443866729736328
Average train pred-reward:  -0.08028505742549896
Average train reward:        5.2866997718811035
Average loss:              -15.106804847717285
------------------------
Step 1000
Average train model loss:   -0.7814755792869255
Average train critic loss:   0.7740849611759186
Average train pred-reward:   4.219553744537756
Average train reward:        4.272005982398987
Average loss:               -6.201663847446442
------------------------
(...)
Step 19000
Average train model loss:   -0.06441724334075116
Average train critic loss:   0.1361817416474223
Average train pred-reward:   3.0573679950237276
Average train reward:        3.0583059163093567
Average loss:               -1.5689961900115013
```
I've checked that the returned solutions are all feasible, so it really does seem to be converging. I'll clean up my code, and I hope to have some training-history plots and test-set validation by the end of the week. If you like, I can share my notebook.
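On the feasibility check: for TSP, a decoded index sequence is feasible iff it visits each city exactly once, i.e. it is a permutation of `0..n-1`. A quick sketch of such a check (the function name is illustrative, not from the original notebook):

```python
def is_feasible_tour(idxs, n):
    """A TSP tour over n cities is feasible iff it is a
    permutation of 0..n-1 (each city visited exactly once)."""
    return sorted(idxs) == list(range(n))

print(is_feasible_tour([2, 0, 3, 1], 4))  # True
print(is_feasible_tour([2, 0, 2, 1], 4))  # False: city 2 visited twice, city 3 never
```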
Best regards
Issue Analytics
- State:
- Created: 6 years ago
- Reactions: 1
- Comments: 35 (17 by maintainers)
I haven't made any nice plots yet, but these are quick screenshots from Tensorboard for TSP 50. It's been running for ~21 hours, and it looks like the validation avg reward has just about reached 6.05-6.10.
After 50 epochs for TSP 20, I got a `3.95` average val reward! Fairly close to `3.89`; it probably would have gotten there eventually if I had let it keep training.

After two epochs on TSP 50, I'm seeing:

not bad!