$lookup should reference join collection instead of clone data.
I’m not sure what’s happening in lookup.js:
each(joinColl, (obj, i) => {
  let k = hashCode(obj[foreignField])
  hash[k] = hash[k] || []
  hash[k].push(i)
})
each(collection, (obj) => {
  let k = hashCode(obj[localField])
  let indexes = hash[k] || []
  obj[asField] = map(indexes, (i) => clone(joinColl[i]))
  result.push(obj)
})
I’m confused by this specifically:
obj[asField] = map(indexes, (i) => clone(joinColl[i]))
Is the $lookup data being cloned, or added by reference?
Cloning, of course, is trivial for small datasets but is a huge (and unnecessary) overhead on big datasets.
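To make the question concrete, here is a minimal, self-contained sketch of the same hash join in plain JavaScript. It is not mingo's implementation: `lookupSketch`, its `copy` parameter, `deepCopy`, and the use of `JSON.stringify` in place of `hashCode` and `clone` are all assumptions made for illustration.

```javascript
// Hypothetical sketch, not mingo's code: a hash join where the caller
// chooses whether joined documents are copied or attached by reference.
function lookupSketch(collection, joinColl, localField, foreignField, asField, copy) {
  // Index the join collection by a key derived from the foreign field.
  // JSON.stringify stands in for mingo's hashCode (an assumption).
  const hash = new Map()
  joinColl.forEach((obj, i) => {
    const k = JSON.stringify(obj[foreignField])
    if (!hash.has(k)) hash.set(k, [])
    hash.get(k).push(i)
  })

  return collection.map((obj) => {
    const indexes = hash.get(JSON.stringify(obj[localField])) || []
    // Build a new outer object so the input document is not mutated.
    return { ...obj, [asField]: indexes.map((i) => copy(joinColl[i])) }
  })
}

// JSON round-trip as a simple stand-in for a deep clone (plain data only).
const deepCopy = (d) => JSON.parse(JSON.stringify(d))

const orders = [{ id: 1, sku: 'a' }]
const items = [{ sku: 'a', meta: { stock: 5 } }]

const byRef = lookupSketch(orders, items, 'sku', 'sku', 'items', (d) => d)
const cloned = lookupSketch(orders, items, 'sku', 'sku', 'items', deepCopy)

console.log(byRef[0].items[0] === items[0])  // true: same object, no copy made
console.log(cloned[0].items[0] === items[0]) // false: a deep copy was allocated
```

With the identity function the joined documents are shared with the join collection, so the cost per match is one array slot; with `deepCopy` every match allocates a full copy of the joined document.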
Issue Analytics
- State:
- Created 6 years ago
- Comments:15 (7 by maintainers)
Top GitHub Comments
Spot on @Redsandro.
Your assumptions are correct and I very much like the breakdown. It describes how things should work.
To answer your question, the clone function does not maintain state.
Cloning currently happens on demand, but for each stage. I considered cloning once and then keeping a history for subsequent stages, but opted out because of the complexity and significant code size it would introduce. Even with that in place, it would not address the $lookup case, since the collection being cloned is the secondary data source. Too many decision points open up with this approach, and it is not always clear what the right thing to do is.
mingo does reference values of the input data, but it will return a new object with references to the unchanged parts of the original if an operation modifies, adds, or removes a value from the original object. So what you are hypothesizing is what happens in some cases. Cloning the whole object is therefore not necessary, but it is the simplest and safest approach to take in some cases.
For example, given the object {a: {b: {c: {d: {e: 1}}}}}, if the value for key e is changed, the entire object must be cloned. Given another object, say {x: 3, a: {b: {c: {d: {e: 1}}}}}, if we change the value to x: 4 then we must create a new object with the updated value but reference the value of a, such that {x: 4, a: <ref-a-val>}.
I found a bug in $lookup here https://github.com/kofrasa/mingo/issues/60 which does not do what I describe above but instead modifies the original. The fix should also remove the need to clone the join collection, which will address your use case.
I am marking this issue as a bug.
Thanks for reporting and taking the time to discuss.
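The copy-on-change behavior described in the comment above can be sketched as follows. This is only an illustration of the idea, not mingo's code: `setIn` is a hypothetical helper that copies the objects along the changed path and keeps everything else by reference.

```javascript
// Hypothetical helper, not part of mingo: set a value at a path while
// copying only the objects along that path (structural sharing).
function setIn(obj, path, value) {
  if (path.length === 0) return value
  const [head, ...rest] = path
  // Shallow-copy this level; untouched siblings are kept by reference.
  return { ...obj, [head]: setIn(obj[head], rest, value) }
}

const original = { x: 3, a: { b: { c: { d: { e: 1 } } } } }

// Changing x copies only the top-level object; a is shared.
const changedX = setIn(original, ['x'], 4)
console.log(changedX.a === original.a) // true
console.log(original.x)                // 3

// Changing e copies every object on the path from the root down to d.
const changedE = setIn(original, ['a', 'b', 'c', 'd', 'e'], 2)
console.log(changedE.a === original.a) // false
console.log(original.a.b.c.d.e)        // 1
```

In both cases the original object is left untouched, which is the property the $lookup bug mentioned above violates.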