Suggestion: Improve Compare-Object by adding set operations (union, intersection, symmetric difference, relative complement)
This suggestion is the result of a conversation between @iSazonov and me - see #4293.
The idea is to introduce new parameter sets that:
- frame the operations in the more established set-theory terms
- while introducing the relative-complement operation to make it easy to determine the objects unique to one side.
- make it easier to retrieve just the selected input objects (without the custom-object wrapper that contains the .SideIndicator property)
- improve the performance of certain use cases
Below are examples of each desired new parameter (parameter set) that would be mutually incompatible and also incompatible with the current parameter sets.
Aside from referring to the desired set operation, their desired behavior is expressed as a command using Compare-Object’s current capabilities:
- Compare-Object -Union $A $B: union (A ∪ B)
  Compare-Object $A $B -IncludeEqual -PassThru
- Compare-Object -Intersection $A $B: intersection (A ∩ B)
  Compare-Object $A $B -IncludeEqual -ExcludeDifferent -PassThru
- Compare-Object -SymmetricDifference $A $B: symmetric difference (A ∆ B) - the same as the current default behavior, but without the wrapper objects
  Compare-Object $A $B -PassThru
- Compare-Object -Complement $A $B: relative complement (B ∖ A) - getting objects unique to $B
  Compare-Object $A $B | ? SideIndicator -eq '=>' | % InputObject
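The equivalents listed above can be tried with today's Compare-Object. The following is a minimal sketch, assuming the same sample values used in the -Group example later in this issue. Note that -PassThru returns the input objects themselves but decorates them with a SideIndicator NoteProperty, which is why the wrapper is only "effectively" suppressed.

```powershell
# Sample inputs (same values as in the -Group example in this issue).
$A = 1, 2, 3, 4
$B = 1, 3, 4, 5

# Union (A ∪ B): objects from either side; matched objects are reported once.
$union = Compare-Object $A $B -IncludeEqual -PassThru

# Intersection (A ∩ B): only objects present on both sides.
$intersection = Compare-Object $A $B -IncludeEqual -ExcludeDifferent -PassThru

# Symmetric difference (A ∆ B): objects present on exactly one side.
$symmetricDifference = Compare-Object $A $B -PassThru

# Relative complement: objects unique to $B (the '=>' side).
$complement = Compare-Object $A $B |
    Where-Object SideIndicator -eq '=>' |
    ForEach-Object InputObject
```

The results are sorted before comparing them in any follow-up check, because the output order is not documented.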
Syntax note: @dragonwolf83 proposes using a single parameter such as -SetOperation <operation> (e.g., -SetOperation Intersection or -SetOperation Union) instead of distinct switches (e.g., -Intersection or -Union), which makes for easier implementation (no need for a distinct parameter set for each operation) and better discoverability, though it is slightly more cumbersome to type for experienced users who already know what they want.
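To make the single-parameter variant concrete, here is a rough sketch of a wrapper function built on the current cmdlet. The function name Compare-ObjectSet and its -SetOperation parameter are hypothetical and exist only for illustration; this is not how the cmdlet itself would be implemented.

```powershell
# Hypothetical wrapper: neither Compare-ObjectSet nor -SetOperation exists in PowerShell.
function Compare-ObjectSet {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory, Position = 0)]
        [object[]] $ReferenceObject,

        [Parameter(Mandatory, Position = 1)]
        [object[]] $DifferenceObject,

        # One validated parameter instead of one switch (and parameter set) per operation.
        [Parameter(Mandatory)]
        [ValidateSet('Union', 'Intersection', 'SymmetricDifference', 'Complement')]
        [string] $SetOperation
    )

    # Map the requested operation to the SideIndicator values to keep.
    $wanted = switch ($SetOperation) {
        'Union'               { '==', '=>', '<=' }
        'Intersection'        { '==' }
        'SymmetricDifference' { '=>', '<=' }
        'Complement'          { '=>' }
    }

    # Unwrap the original input objects, dropping the SideIndicator wrapper.
    Compare-Object $ReferenceObject $DifferenceObject -IncludeEqual |
        Where-Object { $_.SideIndicator -in $wanted } |
        ForEach-Object InputObject
}

$A = 1, 2, 3, 4
$B = 1, 3, 4, 5
$result = Compare-ObjectSet $A $B -SetOperation Intersection
```

ValidateSet also gives tab completion for the operation names, which is the discoverability benefit mentioned above.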
Note that all commands above (effectively) suppress the custom-object wrapper with the .SideIndicator property and return the selected input objects directly (or, with -Property specified, the resulting [pscustomobject] instance would lack the .SideIndicator property).
One thing to note is the order in which objects are output - this is not currently documented (from a set perspective, order doesn’t matter, but for subsequent processing it may), and I haven’t dug into the source to verify, but from what I can tell, it is:
- == (identical) objects first
- => right-side-only objects next
- <= left-side-only objects last
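One way to observe the order on a given PowerShell version is to list the SideIndicator values in the sequence they are emitted. This is only a quick check on sample data, not a guarantee, since the order is undocumented.

```powershell
# Sample inputs chosen so that all three SideIndicator values occur.
$A = 1, 2, 3, 4
$B = 1, 3, 4, 5

# Collect the indicators in emission order via member enumeration.
$indicators = (Compare-Object $A $B -IncludeEqual).SideIndicator

# Print them on one line for inspection.
$indicators -join ' '
```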
On a related note, adding a new switch would make sense in order to return a hashtable of original objects grouped by what is currently the .SideIndicator value.
Two names have been proposed for this switch:
- -Group
- -AsHashtable
-AsHashtable has the advantage of being familiar from Group-Object, although there it doesn’t indicate a fundamental change in output structure.
The following example uses -Group for now:
$A = 1, 2, 3, 4
$B = 1, 3, 4, 5
# Wishful thinking
Compare-Object -Group -Union $A $B
The above would yield the equivalent of:
@{
'==' = 1, 3, 4
'<=' = , 2
'=>' = , 5
}
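Until such a switch exists, a similar hashtable can be assembled by hand from the current output. The following is a minimal sketch; the variable name $grouped is made up for illustration.

```powershell
$A = 1, 2, 3, 4
$B = 1, 3, 4, 5

# Start with an empty array per SideIndicator value so every key exists.
$grouped = @{ '==' = @(); '<=' = @(); '=>' = @() }

# File each original input object under its SideIndicator.
foreach ($entry in (Compare-Object $A $B -IncludeEqual)) {
    $grouped[$entry.SideIndicator] += $entry.InputObject
}
```

Piping to Group-Object SideIndicator -AsHashTable -AsString is shorter, but the values would then be the wrapper objects rather than the original inputs.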
Environment data
PowerShell Core v6.0.0-beta.4
Issue Analytics
- Created 6 years ago
- Reactions: 11
- Comments: 6 (2 by maintainers)
I don’t think the -Group parameter should be used. That starts overloading parameters when the original intent is to pipe to a standard set of cmdlets to do that, like Group-Object. It is not any more complex to use and keeps the cmdlet code clean.
Back to using Sets, it would be nicer to have one parameter, like -UsingSet, with the option of Union, Intersect, Except instead of separate parameters. Hopefully a better name than I came up with.
I like this quite a bit. Compare-Object has long been… lacklustre… in implementation. I would personally also prefer if rather than ==, <=, and => symbols for the grouping (or indeed even for the current behaviour) the SideIndicators were changed to match the actual parameters the objects are passed to (i.e., ReferenceSet, DifferenceSet, and Both).
Currently the default display is actually surprisingly difficult to make sense of in my opinion with any appreciably large comparison sets. It may even make more sense for the default display to actually do a Format-Table -GroupBy SideIndicator similar to how Get-ChildItem will group the table display by folder in a recursive search. It would make comprehending the data you’re getting significantly less befuddling. 😃
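For reference, both suggestions from the comments above (piping to Group-Object, and a grouped table display) can be approximated with existing cmdlets. This is a sketch on sample data, not a proposal for the default formatting.

```powershell
$A = 1, 2, 3, 4
$B = 1, 3, 4, 5
$comparison = Compare-Object $A $B -IncludeEqual

# Grouping by piping to Group-Object instead of a dedicated -Group switch.
$groups = $comparison | Group-Object -Property SideIndicator

# Grouped table display; sorting first keeps each SideIndicator value in one block.
$comparison |
    Sort-Object -Property SideIndicator |
    Format-Table -Property InputObject -GroupBy SideIndicator
```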