
Invalid value error because of labels data on Movie Dialog Dataset


Hi, I tried to train Memory Networks on #moviedd-qa but ran into the problem below. I think the label data in the Movie Dialog dataset is invalid, because a single label can contain multiple entities. The candidates (cands), by contrast, come from entities.txt, and each one is a single vocabulary entry such as ‘Three Kings’ or ‘hungury’.

python: 3.6.1
Ubuntu 16.04, 64bit
ParlAI: f43343f4bf, Wed Sep 20 17:15:55

python examples/train_model.py -m memnn -t "#moviedd-qa" --hops 1 -e 5 -bs 1 --gpu 0
[ Main ParlAI Arguments: ] 
[  task: #moviedd-qa ]
[  download_path: /home/jonki/work/clean/ParlAI/downloads ]
[  datatype: train ]
[  image_mode: raw ]
[  numthreads: 1 ]
[  batchsize: 1 ]
[  datapath: /home/jonki/work/clean/ParlAI/data ]
[ ParlAI Model Arguments: ] 
[  model: memnn ]
[  model_file: None ]
[  dict_class: None ]
[ Dictionary Arguments: ] 
[  dict_file: None ]
[  dict_initpath: None ]
[  dict_language: english ]
[  dict_max_ngram_size: -1 ]
[  dict_minfreq: 0 ]
[  dict_nulltoken: __NULL__ ]
[  dict_endtoken: __END__ ]
[  dict_unktoken: __UNK__ ]
[  dict_starttoken: __START__ ]
[  dict_maxexs: 100000 ]
[ MemNN Arguments: ] 
[  learning_rate: 0.01 ]
[  embedding_size: 128 ]
[  hops: 1 ]
[  mem_size: 100 ]
[  time_features: True ]
[  position_encoding: False ]
[  output: rank ]
[  rnn_layers: 2 ]
[  dropout: 0.1 ]
[  optimizer: adam ]
[  no_cuda: False ]
[  gpu: 0 ]
[ Training Loop Arguments: ] 
[  evaltask: None ]
[  display_examples: False ]
[  num_epochs: 5.0 ]
[  max_train_time: -1 ]
[  log_every_n_secs: 2 ]
[  validation_every_n_secs: -1 ]
[  validation_max_exs: -1 ]
[  validation_patience: 5 ]
[  validation_metric: accuracy ]
[  dict_build_first: True ]
[ building dictionary first... ]
Tried to build dictionary but `--dict-file` is not set. Set this param so the dictionary can be saved.
[ Using CUDA ]
[creating task(s): moviedialog:Task:1]
[building data: /home/jonki/work/clean/ParlAI/data/MovieDialog]

unpacking moviedialog.tar.gz
unpacking p6tyohj.tgz
[loading fbdialog data:/home/jonki/work/clean/ParlAI/data/MovieDialog/movie_dialog_dataset/task1_qa/task1_qa_train.txt]
[ training... ]
[ time:1006s parleys:1 total_exs:1 time_left:483939705s ] {'total': 1, 'accuracy': 0, 'f1': 0, 'hits@k': {1: 0, 5: 0, 10: 0, 50: 0, 100: 0}}
[ time:2008s parleys:2 total_exs:2 time_left:483044539s ] {'total': 1, 'accuracy': 0, 'f1': 0, 'hits@k': {1: 0, 5: 0, 10: 0, 50: 0, 100: 0}}
[ time:3059s parleys:3 total_exs:3 time_left:490416900s ] {'total': 1, 'accuracy': 0, 'f1': 0, 'hits@k': {1: 0, 5: 0, 10: 0, 50: 0, 100: 0}}
Traceback (most recent call last):
  File "/home/jonki/work/clean/ParlAI/examples/train_model.py", line 217, in <module>
    main()
  File "/home/jonki/work/clean/ParlAI/examples/train_model.py", line 131, in main
    world.parley()
  File "/home/jonki/work/clean/ParlAI/parlai/core/worlds.py", line 243, in parley
    acts[1] = agents[1].act()
  File "/home/jonki/work/clean/ParlAI/parlai/agents/memnn/memnn.py", line 315, in act
    return self.batch_act([self.observation])[0]
  File "/home/jonki/work/clean/ParlAI/parlai/agents/memnn/memnn.py", line 306, in batch_act
    predictions = self.predict(xs, cands, ys)
  File "/home/jonki/work/clean/ParlAI/parlai/agents/memnn/memnn.py", line 141, in predict
    label_inds = [cand_list.index(self.labels[i]) for i, cand_list in enumerate(cands)]
  File "/home/jonki/work/clean/ParlAI/parlai/agents/memnn/memnn.py", line 141, in <listcomp>
    label_inds = [cand_list.index(self.labels[i]) for i, cand_list in enumerate(cands)]
ValueError: 'Morgan Freeman, Mickey Rourke, Ellen Barkin' is not in list
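
The failure in the traceback can be reproduced outside ParlAI with a plain list lookup. The sketch below is only an illustration: the candidate list is made up, `find_label_indices` is a hypothetical helper that is not part of ParlAI, and splitting on ", " is an assumption about how the multi-entity labels are joined (it would mis-split an entity whose name itself contains a comma).

```python
def find_label_indices(labels, cand_list, sep=", "):
    """Look up each label in cand_list, splitting multi-entity labels on sep.

    Hypothetical helper for illustration only; not ParlAI code.
    Returns (indices found, label parts that are not candidates).
    """
    indices, missing = [], []
    for label in labels:
        # Use the label as-is if it is a candidate, otherwise try its parts.
        parts = [label] if label in cand_list else label.split(sep)
        for part in parts:
            if part in cand_list:
                indices.append(cand_list.index(part))
            else:
                missing.append(part)
    return indices, missing


# Made-up candidates standing in for entries from entities.txt.
cands = ["Three Kings", "Morgan Freeman", "Mickey Rourke", "Ellen Barkin"]
label = "Morgan Freeman, Mickey Rourke, Ellen Barkin"

# Same lookup as memnn.py line 141: the joined label is not a single candidate.
try:
    cands.index(label)
    raised = False
except ValueError:
    raised = True

indices, missing = find_label_indices([label], cands)
```

Each individual entity is a valid candidate, so the data is consistent once the label is treated as a list of answers rather than one string.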


Top GitHub Comments

alexholdenmiller commented, Sep 26, 2017

The data has been fixed. #325 will force ParlAI to download a new copy of the data, but you can get the fixed data immediately by running rm -rf data/MovieDialog
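
For anyone who prefers to do this from Python, here is a minimal sketch of the same cleanup. `remove_cached_dataset` is a hypothetical helper, not a ParlAI function, and the demo runs against a temporary directory rather than a real ParlAI data path; presumably the next training run then downloads the dataset again, as the comment above implies.

```python
import os
import shutil
import tempfile


def remove_cached_dataset(datapath, name="MovieDialog"):
    """Delete datapath/name if it exists; return True if it was removed.

    Hypothetical helper, roughly equivalent to `rm -rf data/MovieDialog`.
    """
    target = os.path.join(datapath, name)
    if os.path.isdir(target):
        shutil.rmtree(target)
        return True
    return False


# Demo on a throwaway directory so no real data is touched.
with tempfile.TemporaryDirectory() as datapath:
    os.makedirs(os.path.join(datapath, "MovieDialog", "movie_dialog_dataset"))
    removed_first = remove_cached_dataset(datapath)   # directory existed
    removed_again = remove_cached_dataset(datapath)   # already gone
```

To use it on a real checkout, pass the `datapath` value printed in the argument dump at startup.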

alexholdenmiller commented, Sep 26, 2017

Yes exactly, that’s what has given me pause. I’ll have an updated version soon which resolves that!
