Found why best_match has low performance with Levenshtein distance comparison

@gunthercox, @vkosuri, @mymusise

I see you are concerned about the performance of getting a response for a statement, so what I did to improve it may be helpful. Response time improved from 1.9s to 96.8ms with the following changes to LevenshteinDistance:

Move

import sys
from difflib import SequenceMatcher

to the front of the class.

Comment out the try … except … block for the library import.

        # import sys
        #
        # # Use python-Levenshtein if available
        # try:
        #     from Levenshtein.StringMatcher import StringMatcher as SequenceMatcher
        # except ImportError:
        #     from difflib import SequenceMatcher

        # PYTHON = sys.version_info[0]

        # Return 0 if either statement has a falsy text value
        # if not statement.text or not other_statement.text:
        #     return 0
        #
        # # Get the lowercase version of both strings
        # if PYTHON < 3:
        #     statement_text = unicode(statement.text.lower()) # NOQA
        #     other_statement_text = unicode(other_statement.text.lower()) # NOQA
        # else:
        #     statement_text = str(statement.text.lower())
        #     other_statement_text = str(other_statement.text.lower())

        statement_text = str(statement.text.lower())
        other_statement_text = str(other_statement.text.lower())

Good luck!
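One possible explanation for the speedup, which the post itself does not state, is that when python-Levenshtein is not installed, the in-method `try` import fails and is retried on every comparison, whereas a module-level import is resolved once. The sketch below is a rough way to check this on your own machine. The module name `_no_such_optional_module` and both function names are made up for illustration, and the size of the gap will depend on the Python version and the length of `sys.path`.

```python
import timeit
from difflib import SequenceMatcher  # module-level import, resolved once


def ratio_with_failing_import(a, b):
    # Mimics the per-call pattern when the optional package is absent:
    # the import is attempted, and fails, on every single call.
    try:
        # Hypothetical module name, chosen so that the import always fails.
        from _no_such_optional_module import Matcher
    except ImportError:
        from difflib import SequenceMatcher as Matcher
    return Matcher(None, a, b).ratio()


def ratio_with_module_import(a, b):
    # Uses the name that was bound once at module import time.
    return SequenceMatcher(None, a, b).ratio()


# Time both variants on the same pair of short strings.
args = ("how are you", "how are you doing")
n = 2000
per_call = timeit.timeit(lambda: ratio_with_failing_import(*args), number=n)
hoisted = timeit.timeit(lambda: ratio_with_module_import(*args), number=n)
print("import attempted on every call: %.4fs" % per_call)
print("import done once at module top: %.4fs" % hoisted)
```

Both functions return the same ratio, so any difference in the printed timings comes from the import handling alone.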

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 8

Top GitHub Comments

2 reactions
pylobot commented, Jan 12, 2018

Yeah, Levenshtein.StringMatcher.StringMatcher and difflib.SequenceMatcher come from different libraries. The change may be faster, but I think this try/except is there because ChatterBot supports both Python 2.7 and 3.
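If the fallback needs to stay, one option is to keep the try/except but run it once at module level instead of inside the comparison method. This is only a minimal sketch, not ChatterBot's actual code: the helper name `similarity` is hypothetical, and the `Levenshtein.StringMatcher` import path is copied from the issue and is assumed to exist only in python-Levenshtein releases that ship that module.

```python
# Resolve the optional dependency once, when this module is first imported.
try:
    # Import path taken from the issue; if the package or this submodule
    # is missing, ImportError is raised and the stdlib fallback is used.
    from Levenshtein.StringMatcher import StringMatcher as SequenceMatcher
except ImportError:
    from difflib import SequenceMatcher


def similarity(text, other_text):
    """Return a similarity ratio between 0 and 1 (hypothetical helper)."""
    # Return 0 if either text is falsy, as the original method does.
    if not text or not other_text:
        return 0
    # Compare the lowercase versions of both strings.
    return SequenceMatcher(None, text.lower(), other_text.lower()).ratio()


print(similarity("Hello there", "hello there"))
```

With this layout each call only pays for the comparison itself, and the faster matcher is still picked up automatically when it is installed.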

2 reactions
zxsimple commented, Jan 11, 2018

@vkosuri I’ll create a PR after fully testing it.
