Omit filtered-out groupings of map-reduce
Not entirely sure whether to file this under crossfilter or dc.js, and this phenomenon probably happens by design, but could someone explain to me the reasoning behind this:
- See fiddle http://jsfiddle.net/amergin/23yRD/ for a simple example
- In the fiddle, two dimensions are created for the samples. Create a filter on typeDimension; the other dimension should then have two sample rows matching the “visa” filter.
- Group the tipDimension and reduce it with custom init/add/remove functions. Result:
console.log( JSON.stringify( reducedGroup.top(Infinity) ) );
[{"key":100,"value":{"count":1,"tip":100,"total":200,"quantity":1,"set":0}},{"key":200,"value":{"count":1,"tip":200,"total":300,"quantity":1,"set":1}},{"key":0,"value":{"count":0}}]
- In the result, the third grouping seems to be for the filtered-out samples. Its structure shows that it has only gone through the reduceInitial function and therefore carries no payload.
My question is this: is there a way to omit this third grouping, or to easily avoid it being formed? The problem becomes clear on the scatterplot: the keyAccessor and valueAccessor both return undefined for it, so a point (undefined, undefined) is plotted in the top-left corner. What am I missing here?
Issue Analytics
- State:
- Created 10 years ago
- Comments:6 (4 by maintainers)
I documented the technique here: https://github.com/dc-js/dc.js/wiki/FAQ#filter-the-data-before-its-charted
I hope to create a crossfilter wrapper library to support this sort of thing. I don’t think it should be supported directly by dc.js, so I’m closing this issue.
Right now I’m responding to another question on the list about this very frequently requested feature. I forgot to mention that you can also create a “pseudo-group”; since dc.js really only calls the .all() method on a group, it’s pretty easy to create another object that wraps the group and implements its own .all() that filters the results.
Pretty hacky, and it would be great to support this directly, but a workaround for now…
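A minimal sketch of the pseudo-group idea described above, assuming the custom reduce keeps a count field as in the output shown in the question. The helper name removeEmptyBins and the stubGroup object are hypothetical and only for illustration; stubGroup stands in for the real reduced group so the snippet runs without crossfilter.

```javascript
// Hypothetical helper: wraps a group and exposes only .all(), dropping bins
// whose reduced value still has count === 0 (i.e. only reduceInitial ran).
function removeEmptyBins(sourceGroup) {
  return {
    all: function () {
      return sourceGroup.all().filter(function (d) {
        return d.value.count !== 0;
      });
    }
  };
}

// Stand-in for the reduced group from the question (assumption: same shape
// as the JSON output above), so this sketch does not need crossfilter.
var stubGroup = {
  all: function () {
    return [
      { key: 0, value: { count: 0 } },
      { key: 100, value: { count: 1, tip: 100, total: 200, quantity: 1, set: 0 } },
      { key: 200, value: { count: 1, tip: 200, total: 300, quantity: 1, set: 1 } }
    ];
  }
};

var filteredGroup = removeEmptyBins(stubGroup);

// Only the bins that received at least one record remain.
console.log(JSON.stringify(filteredGroup.all().map(function (d) { return d.key; })));
// prints [100,200]
```

In a real chart, the wrapped object would be passed to chart.group(...) in place of the original group. This sketch only implements .all(); a chart type that also calls other group methods such as .top() would need those added to the wrapper as well.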