
Unable to check Evaluation results


I want to test and evaluate the performance of this model on the GAP dataset. I am using the Google Colab notebook below to avoid machine-dependency issues: https://colab.research.google.com/drive/1SlERO9Uc9541qv6yH26LJz5IM9j7YVra#scrollTo=H0xPknceFORt

How can I view the evaluation metrics reported in the referenced research paper?

Secondly, I tried running ! GPU=0 python evaluate.py $CHOSEN_MODEL in Colab, assuming it would generate the evaluation results, but I get the error below:

..
..
..
W0518 20:39:24.163360 140409457641344 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/optimizer_v2/learning_rate_schedule.py:409: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
W0518 20:39:24.184890 140409457641344 deprecation_wrapper.py:119] From /content/coref/optimization.py:64: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.

bert:task 199 27
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gradients_util.py:93: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gradients_util.py:93: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
2020-05-18 20:39:34.733225: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2020-05-18 20:39:34.735705: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2020-05-18 20:39:34.735782: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (fcef0d45f3e2): /proc/driver/nvidia/version does not exist
2020-05-18 20:39:34.736239: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-05-18 20:39:34.750671: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2200160000 Hz
2020-05-18 20:39:34.751024: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x78bf9c0 executing computations on platform Host. Devices:
2020-05-18 20:39:34.751074: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
Restoring from ./spanbert_base/model.max.ckpt
2020-05-18 20:39:40.322311: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set.  If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU.  To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
W0518 20:39:47.165590 140409457641344 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:1276: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
Traceback (most recent call last):
  File "evaluate.py", line 26, in <module>
    model.evaluate(session, official_stdout=True, eval_mode=True)
  File "/content/coref/independent.py", line 538, in evaluate
    self.load_eval_data()
  File "/content/coref/independent.py", line 532, in load_eval_data
    with open(self.config["eval_path"]) as f:
FileNotFoundError: [Errno 2] No such file or directory: './dev.english.384.jsonlines'
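
For context, the traceback shows evaluate.py opening the path stored as eval_path in the model's config (self.config["eval_path"]), so the failure suggests that ./dev.english.384.jsonlines, the converted OntoNotes dev set, was never created in the Colab environment. A quick check one could run (assuming the repo keeps its configs in experiments.conf; the /content/coref path is taken from the traceback):

# Inspect where the config points and whether the file actually exists.
! grep -n "eval_path" /content/coref/experiments.conf
! ls -l /content/coref/dev.english.384.jsonlines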

Any idea what the reason could be? I am new to this area/environment and am just following the Colab code for now.

Looking for suggestions on the steps needed to run the evaluation.

Thanks in advance!

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 16 (5 by maintainers)

Top GitHub Comments

1 reaction
mandarjoshi90 commented, Jun 19, 2020

I’m not sure I understand. How did you generate the input file for predict.py? Did you not use gap_to_jsonlines.py? If not, that error makes sense. Here’s the full pipeline in case it wasn’t clear:

#!/bin/bash
# Usage: <script> <gap_file_prefix> <vocab_file>
gap_file_prefix=$1
vocab_file=$2
# Convert the GAP tsv into the jsonlines format the model expects.
python gap_to_jsonlines.py $gap_file_prefix.tsv $vocab_file
# Run coreference prediction on the converted file.
GPU=0 python predict.py bert_base $gap_file_prefix.jsonlines $gap_file_prefix.output.jsonlines
# Convert predictions back to GAP's tsv format and score against the gold tsv.
python to_gap_tsv.py $gap_file_prefix.output.jsonlines
python2 ../gap-coreference/gap_scorer.py --gold_tsv $gap_file_prefix.tsv --system_tsv $gap_file_prefix.output.tsv

$1/$gap_file_prefix points to the path of the original GAP file without the .tsv extension.
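
As a concrete invocation (a sketch, not from the comment above: the script name, the gap-development split, and the vocab location are assumptions):

# Assumes the pipeline above is saved as run_gap_eval.sh, the GAP development
# split sits in the working directory as gap-development.tsv, and vocab.txt is
# the BERT vocabulary file shipped with the downloaded model.
bash run_gap_eval.sh gap-development ./vocab.txt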

1 reaction
mandarjoshi90 commented, Jun 13, 2020

Sorry, I’m getting to this only now. GAP uses accuracy for evaluation as opposed to F1 for OntoNotes. If you’ve been able to make predictions, you can convert them to tsv files and call the gap scorer script.

python to_gap_tsv.py <prediction_jsonline_file> <tsv_output_file> should do the trick.
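
For instance (the file names here are hypothetical; the scorer flags match the pipeline above):

# Convert predictions to GAP's tsv format, then score against the gold tsv.
python to_gap_tsv.py gap-development.output.jsonlines gap-development.output.tsv
python2 ../gap-coreference/gap_scorer.py --gold_tsv gap-development.tsv --system_tsv gap-development.output.tsv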

