Documents with duplicate ids are being added normally

Hello. Thank you for your hard work on this package!

I think I may have encountered a problem with the way ids are treated. In particular, when two documents with the same id are added, both end up in the search pool. No error is raised and no upsert is carried out: both documents are added normally, as if they had different ids. I am not sure whether that's expected behavior.

The following example code:

const Minisearch = require('minisearch')

// Index only the `value` field.
const minisearch = new Minisearch({
  fields: ['value']
})

// Add two documents that share the same id.
minisearch.add({ id: 'b', value: 'bob' })
minisearch.add({ id: 'b', value: 'boba' })

// The fuzzy search matches both documents.
const ans = minisearch.search('bob', { fuzzy: true })

console.dir(ans, { depth: 4 })

outputs:

[
  {
    id: 'b',
    terms: [ 'bob' ],
    score: 2.0794415416798357,
    match: { bob: [ 'value' ] }
  },
  {
    id: 'b',
    terms: [ 'boba' ],
    score: 0.6398773880082578,
    match: { boba: [ 'value' ] }
  }
]

For convenience, I have created a repository which reproduces the issue outlined in the example above.

Kind Regards, lilsweetcaligula

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

lucaong commented, Aug 19, 2021 (3 reactions)

Hi @lilsweetcaligula, thanks for reporting this. I agree that it would be better to throw an error if a document is added with an id that is already in the index. Unfortunately, this is tricky to implement without additionally saving a map of all encountered ids, which would increase memory utilization on large indexes.

Let me think if there is a simple way to fix this.
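
Until the library handles this itself, the id tracking described above can be done in userland. The following is only a minimal sketch: `createUniqueAdder`, `addUnique` and `seenIds` are hypothetical names, not part of MiniSearch, and the only thing assumed about the index is that it exposes `add(document)`. It carries the same memory cost mentioned above, one stored id per indexed document.

```javascript
// Hypothetical helper, not part of MiniSearch.
// Wraps any index object that exposes add(document).
function createUniqueAdder(index, idField = 'id') {
  // Extra memory: one entry per indexed document.
  const seenIds = new Set()

  return function addUnique(document) {
    const id = document[idField]
    if (seenIds.has(id)) {
      // Fail loudly instead of silently indexing a second copy.
      throw new Error(`Duplicate id: ${id}`)
    }
    index.add(document)
    seenIds.add(id)
  }
}
```

With the example from the issue, `const addUnique = createUniqueAdder(minisearch)` followed by two `addUnique` calls using id 'b' would throw on the second call rather than index both documents.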

KeKs0r commented, Sep 24, 2021 (1 reaction)

I also just came across this issue. Unfortunately, simply ignoring duplicates would not work for me: the reason I am trying to re-add the same document is that I know the document itself has changed. My callback is also triggered with all documents, so removing just a single one and re-adding it is not possible.

I know that this is more of a userland problem, and that I need to better identify the actual change that is happening, but I thought I'd mention that this use case exists.
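
For the changed-document case, one possible userland approach is to remember the last indexed version of each document and remove it before adding the new one. This is a sketch under assumptions, not the maintainers' solution: `createUpserter`, `upsert` and `indexed` are hypothetical names, and the index is assumed to expose `add(document)` and `remove(document)`, where `remove` must be given the document as it was originally indexed.

```javascript
// Hypothetical helper, not part of MiniSearch.
// Assumes the index exposes add(document) and remove(document).
function createUpserter(index, idField = 'id') {
  // Last indexed version of each document, keyed by id.
  const indexed = new Map()

  return function upsert(document) {
    const id = document[idField]
    const previous = indexed.get(id)
    if (previous !== undefined) {
      // Remove the old version so only the new one stays searchable.
      index.remove(previous)
    }
    index.add(document)
    // Keep a shallow copy so later mutation by the caller
    // does not change what is passed to remove().
    indexed.set(id, { ...document })
  }
}
```

A callback that receives all documents could then call `upsert` on each of them. Every document would be removed and re-added on each run, which is simple but does more work than detecting the actual change.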
