Allow columns to "pass through" summarize? (e.g. across)
I have another question, one that could really streamline the way I work:
I often perform calculations on aggregates and would like some
features (those that are constant within the group) to pass through the `summarize`.
I know it’s possible to create new variables along the lines of
`summarize(new_col=_.feature.mean(), old_col=_.old_col.iloc[0])`, for example, but this gets
tedious if there are many columns (or even with a few columns).
Is there a way to tell siuba (more specifically `summarize`) to pass through some variables?
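As a stopgap, the pass-through can be expressed in plain pandas rather than siuba. The sketch below is only an illustration under assumptions: the data frame, the column names (`group`, `feature`, `site`, `year`) and the `passthrough` list are made up for the example, and it relies on pandas named aggregation rather than on any siuba feature.

```python
import pandas as pd

# Hypothetical data: `site` and `year` are constant within each group.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "feature": [1.0, 3.0, 10.0, 20.0],
    "site": ["x", "x", "y", "y"],
    "year": [2019, 2019, 2020, 2020],
})

# Columns that should simply be carried through the aggregation.
passthrough = ["site", "year"]

# Build the aggregation spec programmatically: one real summary,
# plus "take the first value" for every pass-through column.
agg_spec = {"feature_mean": ("feature", "mean")}
agg_spec.update({col: (col, "first") for col in passthrough})

result = df.groupby("group", as_index=False).agg(**agg_spec)
print(result)
```

Because the pass-through columns are constant within each group, another option is to add them to the grouping keys, which keeps them in the output without any extra aggregation.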
And on a related note: is there a way to apply the same operation to many columns without having to use `gather`?
(Currently I go through gather -> group_by -> summarize -> spread to operate on many similar columns.)
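Until something like `across` exists, one way to avoid the gather/spread round trip is to drop down to pandas and aggregate a list of columns directly. This is a minimal sketch, assuming made-up column names (`x1`, `x2`) and a simple prefix rule for picking the columns; it is a pandas workaround, not a siuba API.

```python
import pandas as pd

# Hypothetical data with several columns that need the same summary.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "x1": [1.0, 3.0, 5.0, 7.0],
    "x2": [2.0, 4.0, 6.0, 8.0],
    "other": [0, 1, 0, 1],
})

# Pick the columns by name pattern instead of listing them by hand.
value_cols = [col for col in df.columns if col.startswith("x")]

# Apply the same aggregation to every selected column in one step,
# staying in wide format the whole time.
wide_means = df.groupby("group", as_index=False)[value_cols].mean()
print(wide_means)
```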
Thanks for the awesome library! Omri
Issue Analytics
- Created 2 years ago
- Comments:7 (1 by maintainers)
Top GitHub Comments
Hey, sorry for the delay. I’ve been thinking about how `across` could be implemented. It seems like, similar to siuba’s implementation of `case_when()`, `across()` could essentially take data as its first argument (verbs do this too, e.g. `select` or `mutate`).
Here’s a case_when example (since apparently it is undocumented 😬).
Across proposal
Essentially what could happen is:
- `across` takes the data as its first argument (`across(_, ...)`, `across(mtcars, ...)`)
- `across(_, _.contains('abc'), _.mean(), ...)` within verbs will just get evaluated like other symbolic calls
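To make the "data as first argument" idea concrete, here is a rough pandas-only sketch. The helper name `across_sketch`, its signature, and the sample data are all hypothetical; this is not siuba's implementation or API, only an illustration of a function that receives the data first, then a column selector, then the function to apply.

```python
import pandas as pd

def across_sketch(data, predicate, func):
    """Hypothetical helper (not part of siuba): apply `func` to every
    column of `data` whose name satisfies `predicate`."""
    selected = [col for col in data.columns if predicate(col)]
    return data[selected].apply(func)

# Made-up data for the illustration.
df = pd.DataFrame({
    "g": ["a", "a", "b", "b"],
    "abc_1": [1.0, 3.0, 5.0, 7.0],
    "abc_2": [10.0, 20.0, 30.0, 40.0],
    "other": [0, 0, 0, 0],
})

# Because the helper takes data first, it can be called once per group,
# which is roughly what a summarize-style verb would do internally.
per_group = {
    key: across_sketch(part, lambda name: "abc" in name, lambda col: col.mean())
    for key, part in df.groupby("g")
}
result = pd.DataFrame(per_group).T.rename_axis("g")
print(result)
```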
Examples
Does siuba have an across verb?