Improve wn.ic documentation and error messages regarding unsmoothed frequencies

While building a list of similarities between synsets, I run Lin similarity between a synset and each of its hyponyms, so basically:

first_word = wn.synsets(token.text, pos=wordnet_pos)[0]
hyponyms_ic = [(hyponym, lin(first_word, hyponym, wn_ic)) for hyponym in first_word.hyponyms()]

This raises a ValueError at line 32 of ic.py (as of this writing):

return -log(synset_probability(synset, freq))
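For readers unfamiliar with the failure mode, here is a minimal sketch using only the standard library. It assumes the error comes from taking the logarithm of a zero probability, which is what an unsmoothed frequency of zero would produce. The `information_content` function below is a simplified stand-in written for illustration, not the library's actual code.

```python
from math import log


def information_content(probability: float) -> float:
    """Simplified stand-in: information content is -log(probability)."""
    return -log(probability)


def fails_on_zero() -> bool:
    """Return True if a zero probability triggers a ValueError."""
    try:
        information_content(0.0)
    except ValueError:
        # math.log rejects 0, so a synset with zero frequency cannot
        # be assigned an information content value.
        return True
    return False


print(fails_on_zero())
```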

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
goodmami commented, Jul 12, 2021

Are there any disadvantages to using an IC file with the -add1 smoothing?

I don’t think so. It prevents these kinds of divide-by-zero and other math domain errors, and I don’t think anyone cares about the actual frequencies of the words. The -resnik version is probably good, too, as it gives weight proportional to the number of senses for a word. I’m not sure which performs better empirically, if either does.
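As a rough illustration of what add-one smoothing does, the sketch below adds 1 to every count before converting counts to probabilities, so no probability is zero and the logarithm is always defined. The synset names and counts are made up, and the calculation is deliberately simplified: real information-content files also propagate counts up the hypernym hierarchy, which is not modeled here.

```python
from math import log


def smoothed_probabilities(counts: dict, smoothing: float = 1.0) -> dict:
    """Add `smoothing` to each count, then normalize to probabilities."""
    total = sum(counts.values()) + smoothing * len(counts)
    return {key: (count + smoothing) / total for key, count in counts.items()}


# Hypothetical counts; the second synset was never observed.
counts = {"dog.n.01": 3, "puppy.n.01": 0}

probs = smoothed_probabilities(counts)
# Every probability is now positive, so -log(p) cannot raise ValueError.
ic = {key: -log(p) for key, p in probs.items()}

print(probs)
print(ic)
```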

If not, then perhaps the documentation could default to […]

Yes, that’s probably a good idea.

Either that or maybe an Exception for the ic.py code which explains why the ValueError was thrown and what can be done about it.

Yeah, maybe best to do both. Not everyone reads the documentation 😃

Thanks for the feedback!
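The custom exception suggested above could look something like the sketch below. This is only an illustration of the idea: `UnsmoothedFrequencyError` and `checked_information_content` are hypothetical names and are not part of the wn library. Subclassing `ValueError` is one possible design choice, so that existing code catching `ValueError` would keep working.

```python
from math import log


class UnsmoothedFrequencyError(ValueError):
    """Hypothetical error type; not part of the wn library."""


def checked_information_content(probability: float, synset_id: str) -> float:
    """Return -log(probability), with an explanatory error for zero."""
    if probability <= 0.0:
        raise UnsmoothedFrequencyError(
            f"Synset {synset_id!r} has zero probability in the loaded "
            "information-content data; consider using a smoothed file "
            "(for example, one with -add1 in its name)."
        )
    return -log(probability)
```

Calling `checked_information_content(0.0, "example-synset")` would then raise an error that names the synset and points to smoothing, rather than a bare math domain error.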

0 reactions
goodmami commented, Nov 22, 2021

@M49ICKPIxi3 it took me a while to get to this but I’ve now updated the wn.ic documentation so the example shows the file with smoothing and there’s a short explanation in the text.

I did not introduce a custom exception on the ValueError because I could no longer recreate the error with the latest version of the software and data, and therefore I couldn’t test it out. If some example persists with the newest software and data, I’d probably add in the custom error message. But for now I think the documentation fix is sufficient.
