Non deterministic results

To test the disambiguation process, several kinds of query were tried. In the test cases on PDF files, the service behaved strangely: it gave different results even for the same query. The following test cases were run on a single PDF file, 2009.Infiniti.pdf, with the same query template:

{
    "mentions": [
        "ner",
        "wikipedia"
    ],
    "nbest": false,
    "customisation": "generic"
}

The service gave different results from run to run. For instance, the mention ‘Francesco Speranza’ was sometimes fully recognized as ‘Francesco Speranza’, sometimes only partially recognized as ‘Speranza’, and sometimes not recognized at all.

Below are some screenshots of the results.

  1. ‘Francesco Speranza’ fully recognized (screenshot, 2018-01-15 at 16:41:10)

  2. ‘Francesco Speranza’ partially recognized as ‘Speranza’ (screenshot, 2018-01-15 at 16:51:44)

  3. ‘Francesco Speranza’ not recognized at all (screenshot, 2018-01-15 at 16:39:54)
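One way to quantify this kind of report is to send the identical query many times and tally the distinct answers. The sketch below is only an illustration: `tally_results`, `make_flaky_service` and `QUERY` are hypothetical names, and the real service call is replaced by a stand-in that cycles through the three outcomes described above, since the issue does not show how the service is invoked.

```python
import itertools
import json
from collections import Counter
from typing import Callable

# Query template from the issue.
QUERY = {"mentions": ["ner", "wikipedia"], "nbest": False, "customisation": "generic"}


def tally_results(disambiguate: Callable[[dict], list], query: dict, runs: int = 10) -> Counter:
    """Send the same query `runs` times and count each distinct result."""
    counts: Counter = Counter()
    for _ in range(runs):
        mentions = disambiguate(query)
        # Serialise the result so that equal answers share one counter key.
        counts[json.dumps(mentions, sort_keys=True, ensure_ascii=False)] += 1
    return counts


def make_flaky_service() -> Callable[[dict], list]:
    """Hypothetical stand-in for the service: cycles through the three
    outcomes reported in the issue (full, partial, no recognition)."""
    outcomes = itertools.cycle([["Francesco Speranza"], ["Speranza"], []])
    return lambda _query: next(outcomes)


if __name__ == "__main__":
    for result, n in tally_results(make_flaky_service(), QUERY, runs=9).items():
        print(n, result)
```

A deterministic service would produce a single key in the returned counter; more than one key for the same query confirms the behaviour reported here.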

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 12 (5 by maintainers)

Top GitHub Comments

1 reaction
kermitt2 commented, Jan 16, 2018

Also note that this is not specific to PDF: it shows up whatever input we provide to the monster 😃

1 reaction
kermitt2 commented, Jan 16, 2018

This has been visible for a long time. My hypothesis is that it is due to the random seed of SMILE, which leads to a different partitioning of the decision trees in the ensemble decision algorithm, and thus to different probabilities that fall above or below the thresholds, and finally to this non-deterministic behaviour. If that is the cause, note that in sklearn we typically fix the random seed for reproducibility.
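The hypothesis can be illustrated with a toy model. This is a minimal sketch using only Python's standard library, not SMILE or the service's actual code: `ensemble_probability`, `is_selected` and `THRESHOLD` are made-up names, and each "tree" is reduced to a majority vote over a random bootstrap sample. It shows how an unseeded random generator can push an ensemble probability to either side of a fixed threshold, while a fixed seed makes the outcome repeatable.

```python
import random

THRESHOLD = 0.5  # hypothetical selection threshold


def ensemble_probability(seed=None, n_trees=25, sample_size=9):
    """Fraction of toy 'trees' that vote to keep a candidate mention."""
    rng = random.Random(seed)  # seed=None means a different stream per run
    # Toy training labels: 1 = "is an entity", 0 = "is not an entity".
    labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
    votes = 0
    for _ in range(n_trees):
        # Each tree is built from its own randomly drawn bootstrap sample.
        sample = [rng.choice(labels) for _ in range(sample_size)]
        # Stand-in for a trained tree: predict the sample's majority label.
        if sum(sample) * 2 > sample_size:
            votes += 1
    return votes / n_trees


def is_selected(seed=None):
    """Keep the mention only if the ensemble probability reaches the threshold."""
    return ensemble_probability(seed) >= THRESHOLD


if __name__ == "__main__":
    # Unseeded: the decision may change from one call to the next.
    print([is_selected() for _ in range(5)])
    # Seeded: the decision is the same on every call.
    print([is_selected(seed=42) for _ in range(5)])
```

In scikit-learn the corresponding knob is the `random_state` parameter of the ensemble estimators. Whether SMILE exposes an equivalent setting for the model used here is not stated in the thread.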
