Slice can be simplified does not produce the same code

See original GitHub issue

Currently, for Span<T>, the IDE suggests replacing Slice with the range indexer (IDE0057: Slice can be simplified). So the original code:

new Vector<T>(B.Slice(Idx))

is then translated to

new Vector<T>(B[Idx..])

The resulting generated code is different (which is OK). Given these two methods:

    public void WithSlice() {
        ReadOnlySpan<float> A = new float[100];

        for (int i = 0; i < A.Length; i += Vector<float>.Count)
            new Vector<float>(A.Slice(i));
    }

    public void WithRange() {
        ReadOnlySpan<float> A = new float[100];

        for (int i = 0; i < A.Length; i += Vector<float>.Count)
            new Vector<float>(A[i..]);
    }

the decompiled result taken from SharpLab is:

    public void WithSlice()
    {
        ReadOnlySpan<float> readOnlySpan = new float[100];
        for (int i = 0; i < readOnlySpan.Length; i += Vector<float>.Count)
        {
            new Vector<float>(readOnlySpan.Slice(i));
        }
    }

    public void WithRange()
    {
        ReadOnlySpan<float> readOnlySpan = new float[100];
        for (int i = 0; i < readOnlySpan.Length; i += Vector<float>.Count)
        {
            ReadOnlySpan<float> readOnlySpan2 = readOnlySpan;
            int length = readOnlySpan2.Length;
            int num = i;
            int length2 = length - num;
            new Vector<float>(readOnlySpan2.Slice(num, length2));
        }
    }

That difference in the lowered C# is expected, but the JIT asm also seems to be different. I haven't made any performance comparisons, but in my environment it is crucial that the suggested rewrite does not alter the performance of the code.

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 4
  • Comments: 9 (6 by maintainers)

Top GitHub Comments

3 reactions
stephentoub commented, Sep 11, 2020

@svick The C# compiler, and the types it generates uses of, result in significantly more complex code that the JIT then needs to unwind. Such problems are covered by existing issues like https://github.com/dotnet/runtime/issues/11848 and https://github.com/dotnet/runtime/issues/11870. If this issue is about the JIT, it should just be closed.

That said, the C# compiler is doing a disservice here by using Slice(start, count) rather than just Slice(start). The latter has less code in it, fewer checks to perform, and less code required to invoke it. I don’t know that the JIT would ever be able to make them identical; in theory it could, but in practice that’s a whole lot of analysis. This was all raised when the indexing feature was introduced, and the language team decided that this level of optimization wasn’t important.

If the analyzer from dotnet/roslyn is encouraging this transformation and folks expect the resulting code to be identical, then the dotnet/roslyn compiler should be updated to abide. There’s no guarantee the JIT will produce the same asm, otherwise.
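To make the difference discussed above concrete, here is a minimal sketch (not code from the issue) that places the two source forms next to a hand-written version of the lowering shown in the SharpLab output. The class name `SliceVsRange` and its method names are made up for illustration. It only demonstrates that all three forms return the same elements; it says nothing about the asm the JIT emits for each.

```csharp
using System;

public static class SliceVsRange
{
    // Direct call to the single-argument overload.
    public static ReadOnlySpan<float> WithSlice(ReadOnlySpan<float> span, int start)
        => span.Slice(start);

    // Range indexer, the form suggested by IDE0057.
    public static ReadOnlySpan<float> WithRange(ReadOnlySpan<float> span, int start)
        => span[start..];

    // Hand-written equivalent of the lowering seen in the decompiled output:
    // the two-argument overload with an explicitly computed length.
    public static ReadOnlySpan<float> Lowered(ReadOnlySpan<float> span, int start)
    {
        int length = span.Length;
        return span.Slice(start, length - start);
    }

    // Hypothetical helper: true when all three forms yield the same elements.
    public static bool AllAgree(float[] data, int start)
    {
        ReadOnlySpan<float> a = WithSlice(data, start);
        ReadOnlySpan<float> b = WithRange(data, start);
        ReadOnlySpan<float> c = Lowered(data, start);
        return a.SequenceEqual(b) && a.SequenceEqual(c);
    }

    public static void Main()
    {
        float[] data = { 0f, 1f, 2f, 3f, 4f, 5f, 6f, 7f };
        Console.WriteLine(AllAgree(data, 3)); // prints "True"
    }
}
```

Since the results are identical, any cost difference comes purely from which overload is called and how well the JIT folds the extra length computation, which is why keeping `Slice(start)` in hot loops is a reasonable choice until a benchmark on the target runtime shows otherwise.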

0 reactions
msedi commented, Sep 14, 2020

I just wanted to share another benchmark from another system. I also wanted to point out that it has been a long way from the “old” world without Span<T>, where we did everything in regular C#, then moved to unsafe code with pointers, then to Span<T>, and finally to Vector<T>. Honestly, thanks for the improvement in speed. The last method was a suggestion from @benaadams (thanks for that). One interesting finding, though, is that AddVectorizedSpanWithRange is definitely slower on this example computer than on the first.

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18363.1082 (1909/November2018Update/19H2)
Intel Xeon CPU E5-2670 0 2.60GHz, 2 CPU, 32 logical and 16 physical cores
.NET Core SDK=5.0.100-preview.6.20318.15
  [Host]     : .NET Core 5.0.0 (CoreCLR 5.0.20.30506, CoreFX 5.0.20.30506), X64 RyuJIT
  DefaultJob : .NET Core 5.0.0 (CoreCLR 5.0.20.30506, CoreFX 5.0.20.30506), X64 RyuJIT


|                     Method |     Mean |     Error |    StdDev |
|--------------------------- |---------:|----------:|----------:|
| AddVectorizedSpanWithSlice | 2.953 ms | 0.0143 ms | 0.0111 ms |
| AddVectorizedSpanWithRange | 4.143 ms | 0.0163 ms | 0.0128 ms |
|         AddVectorizedArray | 4.043 ms | 0.0166 ms | 0.0139 ms |
|            AddArrayClassic | 8.396 ms | 0.0050 ms | 0.0045 ms |
|             AddArrayUnsafe | 4.180 ms | 0.0036 ms | 0.0030 ms |
|   AddVectorizedArrayUnsafe | 2.512 ms | 0.0149 ms | 0.0132 ms |

Just for reference, here is the code:

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

using System;
using System.Numerics;
using System.Runtime.CompilerServices;

namespace ConsoleApp8
{
    class Program
    {
        static void Main(string[] args)
        {
            BenchmarkRunner.Run<VectorizedBenchmark>();
        }
    }

    public class VectorizedBenchmark
    {
        int N = 2048 * 2048;
        float[] DataA;
        float[] DataB;
        float[] Result;

        [GlobalSetup]
        public void GlobalSetup()
        {
            DataA = new float[N];
            DataB = new float[N];
            Result = new float[N];
            for (int i = 0; i < N; i++)
            {
                DataA[i] = i;
                DataB[i] = i;
            }
        }

        [Benchmark]
        public float[] AddVectorizedSpanWithSlice()
        {
            int StepSize = Vector<float>.Count;
            int L = N - (N % StepSize);
            int i = 0;

            ReadOnlySpan<float> A = DataA;
            ReadOnlySpan<float> B = DataB;
            Span<float> C = Result;

            for (; i < L; i += StepSize)
            {
                (new Vector<float>(A.Slice(i)) + new Vector<float>(B.Slice(i))).CopyTo(C.Slice(i));
            }

            for (; i < N; i++)
            {
                Result[i] = DataA[i] + DataB[i];
            }

            return Result;
        }

        [Benchmark]
        public float[] AddVectorizedSpanWithRange()
        {
            int StepSize = Vector<float>.Count;
            int L = N - (N % StepSize);
            int i = 0;

            ReadOnlySpan<float> A = DataA;
            ReadOnlySpan<float> B = DataB;
            Span<float> C = Result;

            for (; i < L; i += StepSize)
            {
                (new Vector<float>(A[i..]) + new Vector<float>(B[i..])).CopyTo(C[i..]);
            }

            for (; i < N; i++)
            {
                Result[i] = DataA[i] + DataB[i];
            }

            return Result;
        }

        [Benchmark]
        public float[] AddVectorizedArray()
        {
            int StepSize = Vector<float>.Count;
            int L = N - (N % StepSize);
            int i = 0;

            for (; i < L; i += StepSize)
            {
                (new Vector<float>(DataA, i) + new Vector<float>(DataB, i)).CopyTo(Result, i);
            }

            for (; i < N; i++)
            {
                Result[i] = DataA[i] + DataB[i];
            }

            return Result;
        }

        [Benchmark]
        public float[] AddArrayClassic()
        {
            for (int i = 0; i < N; i++)
            {
                Result[i] = DataA[i] + DataB[i];
            }

            return Result;
        }

        [Benchmark]
        public float[] AddArrayUnsafe()
        {
            unsafe
            {
                fixed (float* dataA = DataA)
                fixed (float* dataB = DataB)
                fixed (float* result = Result)
                {
                    float* dataAptr = dataA, dataBptr = dataB, resultptr = result;
                    for (int i = 0; i < N; i++, dataAptr++, dataBptr++, resultptr++)
                    {
                        *resultptr = *dataAptr + *dataBptr;
                    }
                }
            }

            return Result;
        }

        [Benchmark]
        public float[] AddVectorizedArrayUnsafe()
        {
            int StepSize = Vector<float>.Count;
            int L = N - (N % StepSize);
            int i = 0;

            ref float A = ref DataA[0];
            ref float B = ref DataB[0];
            ref float C = ref Result[0];

            for (; i < L; i += StepSize)
            {
                var vectorA = Unsafe.ReadUnaligned<Vector<float>>(ref Unsafe.As<float, byte>(ref Unsafe.Add(ref A, i)));
                var vectorB = Unsafe.ReadUnaligned<Vector<float>>(ref Unsafe.As<float, byte>(ref Unsafe.Add(ref B, i)));
                var vectorC = vectorA + vectorB;

                Unsafe.WriteUnaligned<Vector<float>>(ref Unsafe.As<float, byte>(ref Unsafe.Add(ref C, i)), vectorC);
            }

            for (; i < N; i++)
            {
                Result[i] = DataA[i] + DataB[i];
            }
            return Result;
        }
    }
}