How to get all documents?
Is there a way to get all documents returned as results?
For example:
miniSearch.search("")
returns an empty array, but I’m looking for a way to get the opposite, all documents.
A use case I have is that I want to only filter by a numeric range in some cases. Something like this:
// get all documents with val property >= minVal
miniSearch.search('', {
  filter: (result) => {
    return result.val >= minVal
  }
})
but that currently returns nothing since no results are given to the filter.
I know it’s not the best use of this library, as mentioned here: https://github.com/lucaong/minisearch/issues/119#issuecomment-1027726138. However, it’s just one of several scenarios I’m using it for, and it would be great to be able to leverage it for this as well.
& awesome library btw 🙌 🙏
Issue Analytics
- State:
- Created a year ago
- Comments:8 (4 by maintainers)
Top GitHub Comments
It’s unfortunately more complicated than that. For example, how should documents be sorted? It would seem reasonable to return them in the original order, but what if one defines a boostDocument function? Then it makes more sense to compute the boost for each document and re-sort them. But since the original list is static, a smart developer would prefer to pre-sort the list only once, and skip the search-time boosting calculation when returning all documents.
Similarly, since MiniSearch returns an array of SearchResult, not documents, when returning all results it would have to first map each document into a search result. But depending on the use case, developers might map results back to documents (like I did in my example before). In that case, it’s a lot more efficient to avoid mapping to SearchResult[] in the first place (especially as it maps the whole collection, potentially tens of thousands of documents).
Moreover, at the moment MiniSearch does not keep a reference to the original collection of documents, so it cannot return it. This is by choice: it is possible to make some documents searchable without storing the document itself in memory.
Of course, it is theoretically possible to implement options for each of these choices, but that would make the API surface huge and hard to learn. Instead, these details are better defined in code. The reason why code is better than configuration in this case is that configuration is something that has to be learned for each and every library, while code is general purpose: for a configuration option to be ergonomic, it has to save the developer a non-trivial amount of code or cognitive load. If it generates more open questions, it is not worth it, because learning all the implications takes more effort than taking control of the issue with code.
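As an illustration of what "taking control of the issue with code" could look like, here is a minimal sketch of a thin layer around the index. It is not part of MiniSearch: createDocumentSearch and searchDocuments are hypothetical names, the index argument is assumed to be a MiniSearch instance (only its search() method is used), and the application is assumed to keep its own array of documents, each with an id field. The sketch returns documents rather than search results, and keeps the original list order when the query is empty.

```javascript
// Hypothetical helper, not part of MiniSearch.
// `index` is assumed to expose search(query) returning results with an `id`;
// `documents` is the application's own array (MiniSearch does not keep one).
function createDocumentSearch(index, documents, idField = 'id') {
  // Lookup table used to map search results back to the original documents.
  const byId = new Map(documents.map((doc) => [doc[idField], doc]))

  return function searchDocuments(query, predicate = () => true) {
    if (query.trim() === '') {
      // Empty query: bypass the index and filter the original list,
      // preserving its original (possibly pre-sorted) order.
      return documents.filter(predicate)
    }
    // Non-empty query: let the index rank the matches, then map each
    // result back to its document and apply the same predicate.
    return index
      .search(query)
      .map((result) => byId.get(result.id))
      .filter((doc) => doc !== undefined && predicate(doc))
  }
}
```

With a real index, the numeric-range case from the question would then be written as createDocumentSearch(miniSearch, documents)('', (doc) => doc.val >= minVal). Because the predicate runs on the original documents, this sketch does not require val to be a stored field.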
I would say, one does not want to write code at the wrong level of abstraction. What I mean is: even when using a library, one does have to write code. The point is that one normally prefers to avoid writing code that pertains to the internal details of the problem solved by the library, and instead focus on code pertaining to the higher level goal of the application.
Therefore, a library has to choose its own boundaries and goals. MiniSearch, as its design document outlines, “enables developers to build [turn-key opinionated solutions] on top of its core API, but does not provide them out of the box.” MiniSearch takes care, for example, of implementation details of the inverted index or of the document scoring, but it leaves to the developers the responsibility to write code that defines their specific full-text search problem.
It would be absolutely appropriate to build a library on top of MiniSearch that makes some of these decisions and builds a higher level of abstraction. That would save developers from writing some code, but also restrict their options. For developers that have those specific needs, such a library would facilitate things. MiniSearch itself, though, also has to enable developers that have different needs. In other words, your request is completely legitimate, it just lies outside of MiniSearch’s self-assigned boundaries of abstraction.
I understand and respect the fact that many people have this need. As a matter of fact, even some of my own apps have the same need. But apart from using MiniSearch in my production applications, I do not profit from MiniSearch: my motivation in maintaining it stems from the satisfaction of what I consider a well-crafted piece of software. I am happy if more people use it, because it means that it is solving more problems than it was originally conceived for, but I would not sacrifice the solidity of its design for popularity. By open-sourcing my library, I get to keep the satisfaction of crafting software the way I consider best, without having to sacrifice it to chase more users. Users, in turn, get the freedom to use my library, and to create applications or other libraries on top of it.
In sum, I do agree with you that yours is a common need. My opinion, though, is that such a need is better served by writing some thin layer of code, like the example I provided, than by adding more configuration options. But it is perfectly reasonable to disagree with that, and such a thin layer can be packaged in a library for convenience.
@lucaong
Thank you for the in-depth reply and explanation. I overlooked the stated goal of “[…] enables developers to build [turn-key opinionated solutions] on top of its core API, but does not provide them out of the box” and was looking for (expecting) a drop-in replacement for fuse.js.
On a side note, I have a question regarding your i18n workflow at megaloop. Can you send me a DM on Twitter or via email?