`Seq.zip`, `Seq.map2` etc. behave differently from `List.zip`, `List.map2`, `Array.zip` etc. w.r.t. raising for different lengths
In cases where you apply a pairwise operation to two sequences, like `Seq.zip`, the behavior in F# Core is defined by the implementation of `Seq.map2` and `List.map2` and the like. The behavior between collection types is different, however.
- `Array.zip` and `List.zip` throw an `ArgumentException` whenever the sequences have differing lengths
- `Seq.zip` on the other hand doesn’t.
I doubt this behavior can be changed (backwards compatibility), but I do think it is a bug/oversight or whatchamacallit. I’m currently working on implementing and extending `TaskSeq`, based on @dsyme’s original code from this repo, and raised this as a question to myself: https://github.com/abelbraaksma/TaskSeq/issues/32. Then I figured, let’s broaden the discussion scope 😉.
Repro steps
// this is fine
[1;2;3] |> Seq.zip ["one"; "two" ] |> Seq.toList
// this raises
[1;2;3] |> List.zip ["one"; "two" ]
// this raises too (Array.zip needs arrays, not lists)
[|1;2;3|] |> Array.zip [|"one"; "two"|]
Also, this is quite weird:
// returns true??
[1;2;3] |> Seq.forall2 (=) Seq.empty // true
[1;2;3] |> Seq.forall2 (=) [1;2;3;4] // true
[1;2;3] |> List.forall2 (=) [] // exception
[1;2;3] |> List.forall2 (=) [1;2;3;4] // exception
Expected behavior
The same behavior for all collection types.
Actual behavior
Functions like `Seq.map2`, `Seq.map3`, `Seq.mapi2` and `Seq.zip` do not raise an `ArgumentException` when the sizes are different. However, the implementations do read past the end of the shorter sequence: even if `MoveNext` on one sequence returns false, they still read the next item of the paired sequence as well (see the `MapEnumerator` code). In other words, the information whether one or both sequences are exhausted is available.
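The extra read described above can be observed with a side-effecting sequence. This is only a rough sketch: `pulled` and `tracked` are names made up for this example, and what gets printed depends on the FSharp.Core version in use.

```fsharp
// Hypothetical tracking helper: records every item the consumer pulls.
let pulled = ResizeArray<int>()

let tracked =
    seq {
        for i in 1 .. 3 do
            pulled.Add i // side effect: remember that item i was requested
            yield i
    }

// The first sequence has two items, the tracked one has three.
let result = Seq.zip [ "one"; "two" ] tracked |> Seq.toList

printfn "zipped: %A" result
printfn "pulled: %A" (List.ofSeq pulled)
```

If the third item shows up in `pulled` even though only two pairs come out, the enumerator already knew that one side ended before the other.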
Known workarounds
In lazily evaluated sequences, the only workaround is to “roll your own”. Easy enough, but still. Alternatively, you could, of course, cache the sequences as an eager sequence like `List` or `Array`.
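One way to “roll your own” could look like the sketch below. `zipStrict` is a hypothetical name, not something in FSharp.Core, and the choice of `ArgumentException` simply mirrors what `List.zip` does. It stays lazy, so the exception only surfaces once enumeration reaches the point where one side runs out.

```fsharp
/// Hypothetical helper (not part of FSharp.Core): a lazy zip that raises
/// ArgumentException when one sequence ends before the other.
let zipStrict (source1: seq<'a>) (source2: seq<'b>) : seq<'a * 'b> =
    seq {
        use e1 = source1.GetEnumerator()
        use e2 = source2.GetEnumerator()

        yield!
            Seq.unfold
                (fun () ->
                    // Advance both enumerators before deciding what to do.
                    match e1.MoveNext(), e2.MoveNext() with
                    | true, true -> Some((e1.Current, e2.Current), ())
                    | false, false -> None // both ended together: normal end
                    | _ -> invalidArg "source2" "The sequences have different lengths.")
                ()
    }

// Equal lengths behave like Seq.zip.
let pairs = zipStrict [ 1; 2; 3 ] [ "one"; "two"; "three" ] |> Seq.toList

// Different lengths raise once the shorter sequence is exhausted.
let mismatchRaised =
    try
        zipStrict [ 1; 2; 3 ] [ "one"; "two" ] |> Seq.toList |> ignore
        false
    with
    | :? System.ArgumentException -> true
```

Caching both inputs with `Seq.toList` and calling `List.zip` gives the same check, at the cost of evaluating everything up front.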
Related information
I did try to find a motivation for this behavior in the source code and online, but failed to do so. There’s certainly an argument to be made for not raising an exception, but then one would expect that to be the case for all collection types.
Perhaps there’s something with lazy sequences that suggests not raising exceptions in general. But something like `[1..3] |> Seq.take 4` raises (not immediately, but when iterating over the sequence), in other words, it does not seem to be a taboo.
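The deferred exception from `Seq.take` can be checked with a small sketch like this one; `tooMany` and `outcome` are names invented for the example.

```fsharp
// Building the pipeline does not raise: nothing has been enumerated yet.
let tooMany = [ 1..3 ] |> Seq.take 4

// Enumerating it does, once the source runs out before four items are taken.
let outcome =
    try
        tooMany |> Seq.toList |> ignore
        "no exception"
    with
    | :? System.InvalidOperationException -> "raised InvalidOperationException"

printfn "%s" outcome
```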
Top GitHub Comments
Altering this behavior would be a breaking change. I like the Python-like behaviour of not throwing; throwing is what makes `List.zip` unusable in many cases. Compare this to the slice operations, which are safe now.
I understand the argument “it’s always been this way”, and that we can’t change it. I don’t understand the rationale, as none of my examples require iteration of the whole sequence to throw. In fact, already during the standard operation, all information is available to throw or not (mainly, the two or three booleans that know whether to continue).
This is different from Haskell, btw, in which the order of arguments determines which of the `MoveNext`s are called, which IMO is wrong in a different way (non-commutative arguments).
Anyway, fair enough to close this out, I agree we shouldn’t change the defaults here.