Order of lemmas returned by derivationally_related_forms() of WordNet?

I use derivationally_related_forms() to get the derivationally related lemmas of a WordNet lemma. When I upgraded nltk from version 2.0.4 to 3.2.3, I realized that the lemmas returned by this function are no longer sorted as they were. Instead, the order could change from one Python session to another. Of course, I could sort the list afterwards to get a deterministic result, but I want to replicate the same order as I found in nltk 2 (which doesn’t work on Python 3).

The following code is an example. Please save it as a script, and run it a few times to see if the output stays the same for each run. (Remember that for nltk 2, you need to change synset.lemmas() to synset.lemmas on the third line.)

from nltk.corpus import wordnet as wn
synset = wn.synsets('study', 'n')[1]
lemma = synset.lemmas()[0]
print(lemma.derivationally_related_forms())

The output from nltk 2 is always: [Lemma('study.v.02.study'), Lemma('study.v.05.study'), Lemma('bookish.s.01.studious'), Lemma('learn.v.04.study')]

However, according to my observation, the output from nltk 3 is either:

[Lemma('bookish.s.01.studious'), Lemma('study.v.05.study'), Lemma('study.v.02.study'), Lemma('learn.v.04.study')]

or:

[Lemma('bookish.s.01.studious'), Lemma('study.v.02.study'), Lemma('learn.v.04.study'), Lemma('study.v.05.study')]

Does anyone know how the list was ordered in nltk 2? Does the order have any linguistic significance? A temporary workaround that lets me replicate the same order in nltk 3 would be appreciated.
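For the "sort the list afterwards" option mentioned above, a minimal sketch could look like the following. It does not reproduce the nltk 2 order; it only makes the result the same on every run. The helper names `lemma_sort_key` and `deterministic_related_forms` are hypothetical, and the sketch assumes the nltk 3 accessors `Lemma.synset()`, `Synset.name()` and `Lemma.name()`.

```python
def lemma_sort_key(lemma):
    """Hypothetical key: synset name first, then lemma name.

    Two lemmas in different synsets never tie on this key, so the
    sorted order does not depend on the input order.
    """
    return (lemma.synset().name(), lemma.name())


def deterministic_related_forms(lemma):
    """Return the derivationally related lemmas in a reproducible order."""
    return sorted(lemma.derivationally_related_forms(), key=lemma_sort_key)


# Usage with nltk 3 (assumes nltk and the WordNet corpus are installed):
#   from nltk.corpus import wordnet as wn
#   lemma = wn.synsets('study', 'n')[1].lemmas()[0]
#   print(deterministic_related_forms(lemma))
```

Sorting on the synset name rather than only the lemma name matters here, because three of the four related lemmas in the example share the name study.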

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

1 reaction
ymfa commented, Aug 22, 2017

@alvations These are the Python versions I use.

$ python3 -c "import sys; print(sys.version)"
3.6.1 |Anaconda 4.4.0 (x86_64)| (default, May 11 2017, 13:04:09) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]

$ /usr/bin/python -c "import sys; print(sys.version)"
2.7.10 (default, Feb  7 2017, 00:08:15) 
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.34)]

Please note that in the tests you just posted (both on Ubuntu and on Mac), the lists are not in the same order.

When I looked at the source code, I realized that in NLTK 3 there is a sorted() wrapped around the return value of Lemma._related() in nltk/nltk/corpus/reader/wordnet.py, which didn’t exist in NLTK 2:

    def _related(self, relation_symbol):
        get_synset = self._wordnet_corpus_reader.synset_from_pos_and_offset
        return sorted([
            get_synset(pos, offset)._lemmas[lemma_index]
            for pos, offset, lemma_index
            in self._synset._lemma_pointers[self._name, relation_symbol]
        ])

However, removing the sorted() does not revert it to the NLTK 2 behaviour either: it still returns the lemmas in a different order.
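One possible explanation, offered here as an assumption rather than something confirmed in this thread: if the pointers behind `_lemma_pointers` are kept in a hash-based set, and the comparison used by sorted() only looks at the lemma name, then lemmas sharing a name would keep the set's iteration order, which Python 3 varies between processes through string hash randomization. The sketch below illustrates that mechanism with plain tuples standing in for lemmas. It does not use NLTK, and `run_with_seed` is a hypothetical helper.

```python
import os
import subprocess
import sys

# Child script: (synset name, lemma name) tuples stand in for Lemma objects.
CHILD = r"""
pointers = {("study.v.02", "study"), ("study.v.05", "study"),
            ("bookish.s.01", "studious"), ("learn.v.04", "study")}
# Sort on the lemma name only; sorted() is stable, so tuples that tie
# on the name keep whatever order the set iteration produced.
print(sorted(pointers, key=lambda p: p[1]))
"""


def run_with_seed(seed):
    """Run the child script in a fresh interpreter with a fixed hash seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # The same seed gives the same order; different seeds may not.
    for seed in (0, 1, 2, 3):
        print(seed, run_with_seed(seed))
```

In every run the studious entry comes first, because it is the only one with a distinct name, while the three study entries may swap places from seed to seed. That matches the shape of the nltk 3 outputs reported above. If this assumption holds, fixing `PYTHONHASHSEED` in the environment would be another way to get repeatable output within one setup.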

0 reactions
alvations commented, Apr 24, 2018

Now that Python 3.6 is supported, I think the issue is resolved going forward, both upstream in CPython and in #1918.

Thanks @ymfa !
