Proposal: Slicing

(Note: this proposal was briefly discussed in #98, the C# design notes for Jan 21, 2015. It has not been updated based on the discussion that’s already occurred on that thread.)

Background

Arrays are extremely prevalent in C# code, as they are in most programming languages, and it’s very common to hand arrays around from one method to another.

Problem

However, it’s also very common to want to share only a portion of an array. This is typically achieved either by copying that portion out into its own array, or by passing around the array along with range indicators for which portion of the array is intended to be used. The former can lead to inefficiencies due to unnecessary copies of non-trivial amounts of data, and the latter can lead both to more complicated code and to a lack of trust that the intended subset is the only subset that’s actually going to be used.
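For comparison, the two workarounds described above look roughly like this in C# as it exists today. This is only an illustrative sketch: `SumRange` and the sample data are hypothetical and do not come from the proposal.

```csharp
using System;

int[] data = { 10, 20, 30, 40, 50 };

// Workaround 1: copy the wanted portion into its own array
// (costs an extra allocation and a copy of the elements).
int[] middle = new int[3];
Array.Copy(data, 1, middle, 0, 3);

// Workaround 2: pass the whole array plus range indicators.
// The callee is trusted, not forced, to stay inside [offset, offset + count).
int SumRange(int[] array, int offset, int count)
{
    int sum = 0;
    for (int i = offset; i < offset + count; i++)
    {
        sum += array[i];
    }
    return sum;
}

int total = SumRange(data, 1, 3);
Console.WriteLine(total); // 20 + 30 + 40 = 90
```

In the second form nothing stops `SumRange` from reading or writing `array[0]`, which is the trust problem the paragraph describes.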

Solution: `Slice<T>`

To address this common need, .NET and C# should support “slices.” A slice, represented by the Slice<T> value type, is a subset of an array or other contiguous region of memory, including both unmanaged memory and other slices. The act of creating such a slice is referred to as “slicing.” Beyond the API surface of the Slice<T> type itself, the C# language would include syntax for declaring slices, slicing off pieces of arrays or other slices, and reading from and writing to them.

An array is represented using array brackets:

int[] array = …;

Similarly, a slice would be represented using square brackets that contain a colon between them:

int[:] slice = …; // same as "Slice<int> slice = ..."

The presence of the colon maps to the syntax for creating slices, which would use an inclusive ‘from’ index before the colon and an exclusive ‘to’ index after the colon to indicate the range that should be sliced (omission of either index would simply imply the start of the array or the end of the array, respectively, and omission of both would mean the entire array):

int[] primes = new int[] { 2, 3, 5, 7, 11, 13, 17 };
int item = primes[1];   // Regular array access, producing the value 3
int[:] a = primes[0:3]; // A slice with elements {2, 3, 5}
int[:] b = primes[1:2]; // A slice with elements {3}
int[:] c = primes[:5];  // A slice with elements {2, 3, 5, 7, 11}
int[:] d = primes[2:];  // A slice with elements {5, 7, 11, 13, 17}
int[:] e = primes[:];   // A slice with elements {2, 3, 5, 7, 11, 13, 17}
int[:] f = a[1:2];      // A slice with elements {3}

Arrays could also be implicitly converted to slices (via an implicit conversion operator on the slice type), with the resulting slice representing the entire array, as if both ‘from’ and ‘to’ indices had been omitted from the slicing operation:

int[:] g = primes[:];   // A slice with elements {2, 3, 5, 7, 11, 13, 17}
int[:] h = primes;      // A slice with elements {2, 3, 5, 7, 11, 13, 17}
int[:] i = h[:];        // A slice with elements {2, 3, 5, 7, 11, 13, 17}
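The `int[:]` syntax above is proposed syntax and does not compile. As a rough, runnable approximation of the same half-open range semantics, the sketch below uses `Span<T>` and C# 8 ranges from current .NET. Treating `Span<T>` as a stand-in for `Slice<T>` is an assumption made for illustration, not something the proposal states, and the sample data is hypothetical.

```csharp
using System;

int[] numbers = { 10, 20, 30, 40, 50 };

Span<int> whole = numbers;                   // implicit array-to-span conversion, covers the entire array
Span<int> firstThree = numbers.AsSpan(0, 3); // start 0, length 3 -> {10, 20, 30}
Span<int> fromTwo = whole[2..];              // from index 2 to the end -> {30, 40, 50}
Span<int> upToFour = whole[..4];             // from the start up to, not including, index 4 -> {10, 20, 30, 40}
Span<int> nested = firstThree[1..2];         // slicing a slice -> {20}

Console.WriteLine(string.Join(", ", fromTwo.ToArray())); // 30, 40, 50
Console.WriteLine(nested.Length);                        // 1
```

One difference worth knowing: applying a range directly to an array (`numbers[0..3]`) allocates a new array, so the view semantics only apply once the array has been converted to a span.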

Slices could also be used much like arrays, with elements read and written via indexing:

int[:] somePrimes = primes[1:3];  // A slice with elements { 3, 5 }

Debug.Assert(primes is Array);            // true
Debug.Assert(somePrimes is Slice<int>);   // true

Debug.Assert(somePrimes.Length == 2);     // true
Debug.Assert(somePrimes[0] == primes[1]); // true
Debug.Assert(somePrimes[1] == primes[2]); // true
somePrimes[0] = 17;
Debug.Assert(primes[1] == 17);            // true

As demonstrated in this code example, slicing wouldn’t make a copy of the original data; rather, it would simply create an alias for a particular region of the larger range. This allows for efficient referencing and handing around of a sub-portion of an array without necessitating inefficient copying of data. However, if a copy is required, the ToArray method of Slice<T> could be used to forcibly introduce such a copy, which could then be stored as either an array or as a slice (since arrays implicitly convert to slices):

int[:] aliased = primes[1:3];           // Alias of a portion of the original array
int[:] copied  = primes[1:3].ToArray(); // Copy of a portion of the original array
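The alias-versus-copy distinction can be observed with today's `Span<T>`, used here as an assumed stand-in for the proposed `Slice<T>`. The sketch is illustrative only and the variable names are hypothetical.

```csharp
using System;

int[] source = { 1, 2, 3, 4, 5 };

Span<int> aliased = source.AsSpan(1, 2);      // view over source[1] and source[2]
int[] copied = source.AsSpan(1, 2).ToArray(); // independent copy of the same two elements

aliased[0] = 99; // writes through to the original array

Console.WriteLine(source[1]); // 99: the alias shares storage with source
Console.WriteLine(copied[0]); // 2: the copy was taken before the write and is unaffected
```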

This gives developers the flexibility to choose whether the recipient of the slice works with the original array or with a copy, minimizing unnecessary copies and ensuring that only the appropriate region of the larger array is used. By design, there would be no way, through either the public surface area of Slice<T> or the C# language syntax, to get back from a slice to the larger entity from which it was sliced.

As creating slices would be very efficient, methods that would otherwise be defined to take an array, an offset, and a count can then be defined to just take a slice.
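A sketch of what such a signature change might look like, again assuming today's `ReadOnlySpan<T>` in place of the proposed slice types. `Sum` is a hypothetical helper, not an API from the proposal.

```csharp
using System;

// One parameter replaces the (array, offset, count) triple.
// The method can only see the elements it was handed.
int Sum(ReadOnlySpan<int> values)
{
    int sum = 0;
    foreach (int value in values)
    {
        sum += value;
    }
    return sum;
}

int[] data = { 10, 20, 30, 40, 50 };

int all = Sum(data);                 // whole array, via implicit conversion
int middle = Sum(data.AsSpan(1, 3)); // only the elements at indices 1, 2, and 3

Console.WriteLine(all);    // 150
Console.WriteLine(middle); // 90
```

Callers decide which region to expose, so the range arithmetic lives in one place instead of being repeated inside every callee.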

Solution: `ReadOnlySlice<T>`

In addition to Slice<T>, the .NET Framework could also include a ReadOnlySlice<T> type, which would be almost identical to Slice<T> except that it would not provide any way to write to the slice. A Slice<T> would be implicitly convertible to a ReadOnlySlice<T>, but not the other way around.

As with slicing an array, creation of a ReadOnlySlice<T> wouldn’t copy data, but rather would create a read-only alias to the original data; this means that while you couldn’t change the contents of a ReadOnlySlice<T> through it, if you had a writable reference to the underlying data, you could still manipulate it:

int[] primes = new int[] { 2, 3, 5, 7, 11, 13, 17 };
int[:] a = primes[1:3];     // A slice with elements {3, 5}
ReadOnlySlice<int> b = a;   // A read-only slice with elements {3, 5}
Debug.Assert(a[0] == 3);    // true
Debug.Assert(b[0] == 3);    // true
b[0] = 42;                  // Error: no set accessor available
a[0] = 42;                  // Ok
Debug.Assert(b[0] == 42);   // true
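The same behavior can be tried with the `Span<T>` and `ReadOnlySpan<T>` types in current .NET, which are assumed here to be reasonable stand-ins for `Slice<T>` and `ReadOnlySlice<T>`. This is an illustrative sketch with hypothetical names.

```csharp
using System;

int[] values = { 2, 3, 5, 7 };

Span<int> writable = values.AsSpan(1, 2); // {3, 5}
ReadOnlySpan<int> readOnly = writable;    // implicit conversion; same underlying storage

// readOnly[0] = 42; // would not compile: the indexer returns a read-only reference
writable[0] = 42;    // allowed through the writable view

Console.WriteLine(readOnly[0]); // 42: the read-only view observes the write
```

Read-only here restricts what can be done through that particular view. It does not make the underlying data immutable.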

While C# would not have special syntax to represent a ReadOnlySlice<T>, it could still have knowledge of the type. In particular, there is a very commonly-used type in C# that behaves like an array but that’s immutable: string. It’s very common for developers to want to slice off substrings from strings, and historically this has been a relatively expensive operation, as it involves allocating a new string object and copying the string data to it. With ReadOnlySlice<T>, the compiler could provide built-in support for slicing off substrings represented as ReadOnlySlice<char>. This could be done using the same slicing syntax as exists for arrays.

string helloWorld = "hello, world";
ReadOnlySlice<char> hello = helloWorld[0:5];

This would allow for substrings to be taken and handed around in a very efficient manner. In addition to new methods on String like Slice (a call to which is what the slicing syntax on strings would compile down to), String would also support an explicit conversion from a ReadOnlySlice<char> back to a string. This would enable developers to work with substrings efficiently, and then only create a copy as a string when actually needed.
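As a point of comparison, current .NET offers `AsSpan` on strings, which yields a `ReadOnlySpan<char>` without allocating a new string. The sketch below assumes that type as a stand-in for `ReadOnlySlice<char>` and uses `ToString()` where the proposal describes an explicit conversion back to `string`.

```csharp
using System;

string helloWorld = "hello, world";

ReadOnlySpan<char> hello = helloWorld.AsSpan(0, 5); // view over the first five chars, no new string
string materialized = hello.ToString();             // explicit copy back into a string when needed

Console.WriteLine(hello.Length); // 5
Console.WriteLine(materialized); // hello
```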

Further, just as the C# compiler today has support for concatenating strings and switching on strings, it could also have support for concatenating ReadOnlySlice<char> and switching on ReadOnlySlice<char>:

string helloWorld = "hello, world";
ReadOnlySlice<char> hello = helloWorld[:5];
ReadOnlySlice<char> world = helloWorld[7:];
switch(hello) { // no allocation necessary to switch on a ReadOnlySlice<char>
    case "hello": Hello(); break;
    case "world": World(); break;
}
Debug.Assert(hello + world == "helloworld"); // only a single allocation needed for the concatenation
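Comparable operations exist as library calls on `ReadOnlySpan<char>` in current .NET. The sketch below uses `SequenceEqual` in place of the proposed `switch` support and `string.Concat` in place of the `+` operator. Both substitutions are assumptions made for illustration.

```csharp
using System;

string helloWorld = "hello, world";
ReadOnlySpan<char> hello = helloWorld.AsSpan(0, 5);
ReadOnlySpan<char> world = helloWorld.AsSpan(7);

// Content comparison against a literal, without materializing a substring first.
bool isHello = hello.SequenceEqual("hello".AsSpan());

// Concatenation allocates only the resulting string.
string joined = string.Concat(hello, world);

Console.WriteLine(isHello); // True
Console.WriteLine(joined);  // helloworld
```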

Top GitHub Comments

cesarsouza commented, Jun 18, 2016 (5 reactions)

+1 against having slice as an array of indices. It would be better to learn from frameworks/libraries that got it right, such as Python’s NumPy. Slices should represent views of the original array and be interpretable by ordinary functions just like ordinary arrays. It should be totally transparent to called functions whether they are processing an int[] or an int[5:10], or however a slice ends up being defined.

To be honest, I couldn’t completely understand from the above discussion why touching the compiler is sometimes seen as something to be avoided. In my view, this is a critical feature for #10378 that cannot be left half-baked (for example, by having only a pure BCL solution). Also, deprecating ArraySegment and re-implementing it in terms of array slices should be considered as an option. It is not as if there haven’t been any breaking changes since .NET 1.0.

Array slices (or more generally, safe memory views) are absolutely necessary for the success of C# as a language for high-performance computing. Right now, Python is taken far more seriously for high-performance computing than C#, and this really shouldn’t be the case. Python is a fine language, but it was C# that initially proposed the no-compromise solution of unsafe contexts for more performant code, for example. The fact that we are not able to fulfill one of the first premises of the language might be a sign that even large or possibly breaking changes should be considered at this point.

prasannavl commented, May 19, 2015 (2 reactions)

I don’t understand why there are so many comments about slicing IEnumerable or IList. It simply doesn’t make sense, since they aren’t a contiguous representation of memory. They aren’t even a direct representation of memory. They are very high-level structures. Conceptual slicing of them is already possible, and is no different from using Skip, Take, and their relatives. We’re talking about efficient referencing of existing memory, which really only applies to arrays, or could be extended to objects overall, in which case the garbage collector itself would have to be tweaked, which changes a lot more dynamics and brings the whole language closer to C/C++. If that is indeed being proposed, it seems completely out of scope for this thread.

I think the focus here should be only on arrays. If arrays are done the right way, ILists can easily be extended, perhaps by another interface that allows access to the IList’s source array, which in turn can be sliced.
