DataLossError (see above for traceback): Unable to open table file /wmt16_systems/en-de/model.npz: Data loss: not an sstable

Hi, I am trying to use the pretrained en-de model from http://data.statmt.org/rsennrich/wmt16_systems/ to translate English sentences with this script:

# this sample script translates a test set, including
# preprocessing (tokenization, truecasing, and subword segmentation),
# and postprocessing (merging subword units, detruecasing, detokenization).

# instructions: set paths to mosesdecoder, subword_nmt, and nematus,
# then run "./translate.sh < input_file > output_file"

# suffix of source language
SRC=en

# suffix of target language
TRG=de

# path to moses decoder: https://github.com/moses-smt/mosesdecoder
mosesdecoder=../../mosesdecoder

# path to subword segmentation scripts: https://github.com/rsennrich/subword-nmt
subword_nmt=../../subword-nmt

# path to nematus ( https://www.github.com/rsennrich/nematus )
nematus=../../nematus

# theano device
device=cpu

# preprocess
$mosesdecoder/scripts/tokenizer/normalize-punctuation.perl -l $SRC | \
$mosesdecoder/scripts/tokenizer/tokenizer.perl -l $SRC -penn | \
$mosesdecoder/scripts/recaser/truecase.perl -model truecase-model.$SRC | \
$subword_nmt/apply_bpe.py -c $SRC$TRG.bpe | \
# translate
THEANO_FLAGS=mode=FAST_RUN,floatX=float32,device=$device,on_unused_input=warn python $nematus/nematus/translate.py \
     -m model.npz \
     -k 12 -n | \
#-n -p 1 --suppress-unk | \
# postprocess
sed 's/\@\@ //g' | \
$mosesdecoder/scripts/recaser/detruecase.perl | \
$mosesdecoder/scripts/tokenizer/detokenizer.perl -l $TRG

When I execute ./translate.sh < en_text.txt > output.txt, I get this error:

DataLossError (see above for traceback): Unable to open table file /wmt16_systems/en-de/model.npz: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
	 [[{{node model0/save/RestoreV2}} = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_model0/save/Const_0_0, model0/save/RestoreV2/tensor_names, model0/save/RestoreV2/shape_and_slices)]]

ERROR: Translate worker process 600 crashed with exitcode 1
Warning: No built-in rules for language de.
Detokenizer Version $Revision: 4134 $
Language: de

Could you give me any suggestions? Thanks.
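One possible reading of this error, offered as an assumption rather than a confirmed diagnosis: the script sets THEANO_FLAGS and passes model.npz, and .npz is NumPy's zip-based archive format, whereas the RestoreV2 node in the traceback belongs to TensorFlow and expects a TensorFlow checkpoint. A minimal sketch for checking whether a file starts with the ZIP signature follows; the helper name is_zip_archive and the MODEL variable are hypothetical and not part of nematus.

```shell
#!/bin/sh
# Hypothetical helper (not part of nematus): succeeds if the file begins
# with the ZIP local-file signature "PK\003\004", which NumPy .npz
# archives use.
is_zip_archive() {
    # Render the first four bytes as lowercase hex with no separators.
    magic=$(head -c 4 "$1" 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "504b0304" ]
}

# MODEL is an assumed variable; it defaults to the file name used in the script.
MODEL=${MODEL:-model.npz}

if is_zip_archive "$MODEL"; then
    echo "$MODEL looks like a zip/NumPy archive, not a TensorFlow checkpoint"
else
    echo "$MODEL does not start with the ZIP signature"
fi
```

If the file does turn out to be a NumPy archive, a TensorFlow restore operation would not be expected to read it directly, which is consistent with the hint in the error message about a different file format.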

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 6

Top GitHub Comments

1 reaction
rsennrich commented, Nov 13, 2018

Yes, these instructions should help you train your own model. You may want to change some things, e.g. the preprocessing, depending on the language pair.

1 reaction
rsennrich commented, Nov 12, 2018
  1. This should work with all 11 language pairs on http://data.statmt.org/wmt17_systems/

  2. Each directory has a postprocess.sh script that performs the necessary post-processing. For example, see http://data.statmt.org/wmt17_systems/de-en/postprocess.sh
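As an illustration of one post-processing step, the script in the question merges subword units with a sed expression. The sketch below isolates that step; the function name merge_bpe is hypothetical, and whether each postprocess.sh does exactly this is an assumption.

```shell
#!/bin/sh
# Hypothetical wrapper around the subword-merging step from the script in
# the question: delete every "@@ " marker so split subwords are rejoined.
merge_bpe() {
    sed 's/@@ //g'
}

# Example: the BPE-segmented tokens "trans@@ lation" become one word.
printf '%s\n' 'a trans@@ lation of the sen@@ tence' | merge_bpe
# prints: a translation of the sentence
```

The detruecasing and detokenization steps that follow in the original pipeline rely on the Moses scripts and are not reproduced here.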
