[CodeGen] Incorrect vectorization for fp16 on X86-64 (AVX512)

Hi,

There seems to be a bug related to vectorization with the fp16 type on Skylake machines. Specifically, when fp16 values are converted to uint8 using a select on an AVX512 machine, the generated code produces wrong results when vectorization is turned on.

The generated LLVM IR looks correct after checking the dumped LLVM code, but the generated assembly looks wrong. The cause might be some configuration used during code generation in /src/codegen/llvm/llvm_common.cc.

import tvm
import numpy
# This shape causes a wrong result
m=3
n=2
k=2
# This shape causes an LLVM ERROR
# m=3
# n=1
# k=4

dtype = "float16"
target = 'llvm -mcpu=skylake-avx512'
ctx = tvm.context(target, 0)

input = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]
in_array = numpy.array(input).reshape(m,n,k).astype(dtype)
a = tvm.nd.array(in_array, ctx)

X=tvm.placeholder((m,n,k), name='X', dtype="float16")
zero = tvm.const(0, "uint8")
one = tvm.const(1, "uint8")
z16 = tvm.const(0, "float16")

Y=tvm.compute((m,n,k), lambda i,j,k: tvm.expr.Select(X[i,j,k] == z16, zero, one))
s=tvm.create_schedule(Y.op)
s[Y].vectorize(Y.op.axis[2])
print(tvm.lower(s, [X, Y], simple_mode=True))
func = tvm.build(s, [X,Y], target=target, name='bug')
assert func

output = [False, True, True, True, True, True, True, True, True, True, True, True]
out = numpy.array(output).reshape(m,n,k).astype("bool")
b= tvm.nd.array(numpy.zeros((m,n,k), dtype="bool"), ctx)
func(a,b)

tvm.testing.assert_allclose(b.asnumpy(), out)
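As a minimal sketch, the expected result of the select can also be derived with NumPy alone instead of being hard-coded, which makes it easier to compare against other shapes. The helper name `reference_select` is hypothetical and not part of TVM; the sketch assumes the same input values (0.0 through 11.0) and the `m=3, n=2, k=2` shape from the reproduction script.

```python
import numpy as np


def reference_select(x):
    """NumPy reference for Select(x == 0, 0, 1) with a uint8 result.

    Hypothetical helper for illustration only; it is not a TVM API.
    """
    return np.where(x == np.float16(0), 0, 1).astype("uint8")


# Same shape and input values as the reproduction script.
m, n, k = 3, 2, 2
x = np.arange(m * n * k, dtype="float16").reshape(m, n, k)
expected = reference_select(x)
print(expected.flatten())
```

Only the first element of the input is zero, so only the first element of the reference result is 0. The array could then be compared with the output of the built function in place of the hand-written `output` list.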

Issue Analytics

Created 4 years ago
Comments: 12 (6 by maintainers)

Top GitHub Comments

topperc commented, Oct 16, 2019

FYI, I think I might have fixed the LLVM bug this was hitting here: https://github.com/llvm/llvm-project/commit/7b49e8ac359bc35f95af548fbed4b7afd625caab

topperc commented, Oct 17, 2019

I was asked to look at why -mcpu=cascadelake didn’t generate the same code as -mcpu=skylake-avx512 when using tvm with llvm 8.0. After ruling out llvm I started poking around tvm and happened to see this avx512 issue in my search results. Coincidentally we had hit the same underlying llvm bug internally last week with a different frontend.
