Invalid value error because of labels data on Movie Dialog Dataset

Hi, I tried to train Memory Networks on #moviedd-qa, but I ran into the problem below. I think the label data in Movie Dialog is invalid because a single label contains multiple entities. The candidates (cands) come from entities.txt, and each one is a single entry such as ‘Three Kings’ or ‘hungury’.

  • python: 3.6.1
  • Ubuntu 16.04, 64bit
  • ParlAI: f43343f4bf, Wed Sep 20 17:15:55
python examples/train_model.py -m memnn -t "#moviedd-qa" --hops 1 -e 5 -bs 1 --gpu 0
[ Main ParlAI Arguments: ] 
[  task: #moviedd-qa ]
[  download_path: /home/jonki/work/clean/ParlAI/downloads ]
[  datatype: train ]
[  image_mode: raw ]
[  numthreads: 1 ]
[  batchsize: 1 ]
[  datapath: /home/jonki/work/clean/ParlAI/data ]
[ ParlAI Model Arguments: ] 
[  model: memnn ]
[  model_file: None ]
[  dict_class: None ]
[ Dictionary Arguments: ] 
[  dict_file: None ]
[  dict_initpath: None ]
[  dict_language: english ]
[  dict_max_ngram_size: -1 ]
[  dict_minfreq: 0 ]
[  dict_nulltoken: __NULL__ ]
[  dict_endtoken: __END__ ]
[  dict_unktoken: __UNK__ ]
[  dict_starttoken: __START__ ]
[  dict_maxexs: 100000 ]
[ MemNN Arguments: ] 
[  learning_rate: 0.01 ]
[  embedding_size: 128 ]
[  hops: 1 ]
[  mem_size: 100 ]
[  time_features: True ]
[  position_encoding: False ]
[  output: rank ]
[  rnn_layers: 2 ]
[  dropout: 0.1 ]
[  optimizer: adam ]
[  no_cuda: False ]
[  gpu: 0 ]
[ Training Loop Arguments: ] 
[  evaltask: None ]
[  display_examples: False ]
[  num_epochs: 5.0 ]
[  max_train_time: -1 ]
[  log_every_n_secs: 2 ]
[  validation_every_n_secs: -1 ]
[  validation_max_exs: -1 ]
[  validation_patience: 5 ]
[  validation_metric: accuracy ]
[  dict_build_first: True ]
[ building dictionary first... ]
Tried to build dictionary but `--dict-file` is not set. Set this param so the dictionary can be saved.
[ Using CUDA ]
[creating task(s): moviedialog:Task:1]
[building data: /home/jonki/work/clean/ParlAI/data/MovieDialog]

unpacking moviedialog.tar.gz
unpacking p6tyohj.tgz
[loading fbdialog data:/home/jonki/work/clean/ParlAI/data/MovieDialog/movie_dialog_dataset/task1_qa/task1_qa_train.txt]
[ training... ]
[ time:1006s parleys:1 total_exs:1 time_left:483939705s ] {'total': 1, 'accuracy': 0, 'f1': 0, 'hits@k': {1: 0, 5: 0, 10: 0, 50: 0, 100: 0}}
[ time:2008s parleys:2 total_exs:2 time_left:483044539s ] {'total': 1, 'accuracy': 0, 'f1': 0, 'hits@k': {1: 0, 5: 0, 10: 0, 50: 0, 100: 0}}
[ time:3059s parleys:3 total_exs:3 time_left:490416900s ] {'total': 1, 'accuracy': 0, 'f1': 0, 'hits@k': {1: 0, 5: 0, 10: 0, 50: 0, 100: 0}}
Traceback (most recent call last):
  File "/home/jonki/work/clean/ParlAI/examples/train_model.py", line 217, in <module>
    main()
  File "/home/jonki/work/clean/ParlAI/examples/train_model.py", line 131, in main
    world.parley()
  File "/home/jonki/work/clean/ParlAI/parlai/core/worlds.py", line 243, in parley
    acts[1] = agents[1].act()
  File "/home/jonki/work/clean/ParlAI/parlai/agents/memnn/memnn.py", line 315, in act
    return self.batch_act([self.observation])[0]
  File "/home/jonki/work/clean/ParlAI/parlai/agents/memnn/memnn.py", line 306, in batch_act
    predictions = self.predict(xs, cands, ys)
  File "/home/jonki/work/clean/ParlAI/parlai/agents/memnn/memnn.py", line 141, in predict
    label_inds = [cand_list.index(self.labels[i]) for i, cand_list in enumerate(cands)]
  File "/home/jonki/work/clean/ParlAI/parlai/agents/memnn/memnn.py", line 141, in <listcomp>
    label_inds = [cand_list.index(self.labels[i]) for i, cand_list in enumerate(cands)]
ValueError: 'Morgan Freeman, Mickey Rourke, Ellen Barkin' is not in list
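
The failure in the traceback can be reproduced outside ParlAI. The sketch below is only an illustration: the candidate list is made up (only the label string comes from the traceback), and `find_label_indices` is a hypothetical helper, not part of memnn.py. It assumes that a multi-entity label is joined with commas, which would break for any entity whose own name contains a comma.

```python
def find_label_indices(label, cand_list):
    """Return the candidate indices matching `label`.

    If the whole label is a candidate, return its index. Otherwise assume
    (hypothetically) that the label is a comma-joined list of entities and
    return the index of each part that appears in the candidates.
    """
    if label in cand_list:
        return [cand_list.index(label)]
    parts = [part.strip() for part in label.split(",")]
    return [cand_list.index(part) for part in parts if part in cand_list]


# Made-up candidate list: one entity per entry, as in entities.txt.
cands = ["Three Kings", "Morgan Freeman", "Mickey Rourke", "Ellen Barkin"]
# Label string taken from the traceback above.
label = "Morgan Freeman, Mickey Rourke, Ellen Barkin"

# list.index raises ValueError because the joined string is not a candidate.
try:
    cands.index(label)
    raised = False
except ValueError:
    raised = True

# Splitting the label recovers the individual entities.
indices = find_label_indices(label, cands)
```

A check like this could be used to scan a data file for labels that do not match any candidate before training starts.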

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 11 (4 by maintainers)

Top GitHub Comments

alexholdenmiller commented, Sep 26, 2017

The data has been fixed. #325 will force ParlAI to download a new copy of the data, but you can get the fixed data immediately by running rm -rf data/MovieDialog.
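
For anyone who would rather clear the cached copy from Python than from the shell, here is a minimal sketch. `clear_cached_dataset` is a hypothetical helper, not a ParlAI function. It assumes the dataset sits in a MovieDialog folder under the data path, as the log in the issue suggests.

```python
import shutil
from pathlib import Path


def clear_cached_dataset(datapath, name="MovieDialog"):
    """Delete the cached dataset folder `name` under `datapath`.

    Returns True if a folder was removed and False if there was nothing
    to remove. The folder name is an assumption based on the issue's log.
    """
    target = Path(datapath) / name
    if target.is_dir():
        shutil.rmtree(target)
        return True
    return False
```

Calling `clear_cached_dataset("data")` from the ParlAI checkout should have the same effect as the `rm -rf` command, and the next training run would then download the data again.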

alexholdenmiller commented, Sep 26, 2017

Yes exactly, that’s what has given me pause. I’ll have an updated version soon which resolves that!
