Unable to check Evaluation results
I want to test and evaluate the performance of this model on the GAP dataset, using the Google Colab notebook below to avoid machine-dependency issues: https://colab.research.google.com/drive/1SlERO9Uc9541qv6yH26LJz5IM9j7YVra#scrollTo=H0xPknceFORt
How can I view the results of the evaluation metrics reported in the mentioned research paper?
Secondly, I tried to run the command
! GPU=0 python evaluate.py $CHOSEN_MODEL
in Colab, assuming it would generate the evaluation results, but I get the error below:
...
W0518 20:39:24.163360 140409457641344 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/optimizer_v2/learning_rate_schedule.py:409: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
W0518 20:39:24.184890 140409457641344 deprecation_wrapper.py:119] From /content/coref/optimization.py:64: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.
bert:task 199 27
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gradients_util.py:93: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gradients_util.py:93: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
2020-05-18 20:39:34.733225: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2020-05-18 20:39:34.735705: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2020-05-18 20:39:34.735782: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (fcef0d45f3e2): /proc/driver/nvidia/version does not exist
2020-05-18 20:39:34.736239: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-05-18 20:39:34.750671: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2200160000 Hz
2020-05-18 20:39:34.751024: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x78bf9c0 executing computations on platform Host. Devices:
2020-05-18 20:39:34.751074: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): <undefined>, <undefined>
Restoring from ./spanbert_base/model.max.ckpt
2020-05-18 20:39:40.322311: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU. To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
W0518 20:39:47.165590 140409457641344 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:1276: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
Traceback (most recent call last):
File "evaluate.py", line 26, in <module>
model.evaluate(session, official_stdout=True, eval_mode=True)
File "/content/coref/independent.py", line 538, in evaluate
self.load_eval_data()
File "/content/coref/independent.py", line 532, in load_eval_data
with open(self.config["eval_path"]) as f:
FileNotFoundError: [Errno 2] No such file or directory: './dev.english.384.jsonlines'
Any idea what the possible reason could be? I am new to this area/environment and am following the Colab code for now.
I am looking for suggestions / steps on how to do the evaluation part.
Thanks in advance!
Top GitHub Comments
I’m not sure I understand. How did you generate the input file for predict.py? Did you not use gap_to_jsonlines.py? If not, that error makes sense. Here’s the full pipeline in case it wasn’t clear: $1/$gap_file_prefix points to the path of the original GAP file without the .tsv extension.

Sorry, I’m getting to this only now. GAP uses accuracy for evaluation, as opposed to F1 for OntoNotes. If you’ve been able to make predictions, you can convert them to tsv files and call the GAP scorer script.
python to_gap_tsv.py <prediction_jsonline_file> <tsv_output_file>
should do the trick.
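For completeness, here is a rough sketch of the full GAP evaluation pipeline described in the comments above. The to_gap_tsv.py step is quoted from the comment; the arguments shown for gap_to_jsonlines.py and predict.py, and the gap_scorer.py invocation (from the separate GAP dataset repository), are assumptions and placeholders, so check each script for its actual arguments.

# 1. Convert the original GAP tsv file into the jsonlines format predict.py expects
#    (argument shown is a placeholder; check gap_to_jsonlines.py for the real signature)
python gap_to_jsonlines.py <gap_file_prefix>

# 2. Run the chosen model over the converted file to produce prediction jsonlines
#    (argument order assumed)
GPU=0 python predict.py $CHOSEN_MODEL <gap_jsonlines_file> <prediction_jsonline_file>

# 3. Convert the predictions back to GAP's tsv format (command from the comment above)
python to_gap_tsv.py <prediction_jsonline_file> <tsv_output_file>

# 4. Score with the GAP scorer, which reports accuracy rather than OntoNotes-style F1
#    (flag names assumed from the GAP dataset repository's gap_scorer.py)
python gap_scorer.py --gold_tsv <original_gap_file>.tsv --system_tsv <tsv_output_file>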