AVX2GATHER registers
Hi there,
I think I found one more issue. XED disassembles C4 E2 BD 90 5C DD A2 as vpgatherdq ymm3, qword ptr [rbp+xmm3*8-0x5e], ymm8.
The SDM says:
If any pair of the index, mask, or destination registers are the same, this instruction results a UD fault.
Obviously XMM3 is not the same register as YMM3, but since they share the lower 128 bits, this combination should #UD as well. The SDM is not very clear about this, but after looking at the pseudo-code and the implications, I’m pretty sure the #UD condition refers to identical register IDs (rather than to actually identical registers).
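To make the register-ID reading concrete, here is a minimal Python sketch that pulls the destination, index, and mask register IDs out of a 3-byte-VEX gather encoding and flags the #UD case when any two IDs match. The function names are made up for illustration and are not part of XED or Zydis; the sketch assumes 64-bit mode, a C4 prefix, and one of the AVX2 gather opcodes (90 to 93 in the 0F38 map), and it ignores everything else a real decoder has to validate.

```python
def gather_register_ids(code: bytes):
    """Return (dest, index, mask) register IDs of a C4-encoded AVX2 gather.

    Hypothetical helper for illustration; assumes 64-bit mode.
    """
    if len(code) < 6 or code[0] != 0xC4:
        raise ValueError("expected a 3-byte VEX (C4) encoding")
    p1, p2, opcode, modrm, sib = code[1], code[2], code[3], code[4], code[5]
    if (p1 & 0x1F) != 0x02 or not 0x90 <= opcode <= 0x93:
        raise ValueError("not an AVX2 gather opcode (0F38 map, 90..93)")
    if (modrm >> 6) == 0b11 or (modrm & 0x07) != 0b100:
        raise ValueError("gather requires a memory operand with a SIB byte")

    vex_r = ((p1 >> 7) & 1) ^ 1      # VEX.R is stored inverted
    vex_x = ((p1 >> 6) & 1) ^ 1      # VEX.X is stored inverted
    vvvv = ~(p2 >> 3) & 0x0F         # VEX.vvvv is stored inverted

    dest = (vex_r << 3) | ((modrm >> 3) & 0x07)   # ModRM.reg
    index = (vex_x << 3) | ((sib >> 3) & 0x07)    # SIB.index (vector register)
    mask = vvvv
    return dest, index, mask


def gather_raises_ud(code: bytes) -> bool:
    """True if any two of the dest/index/mask register IDs are equal."""
    return len(set(gather_register_ids(code))) < 3


if __name__ == "__main__":
    sample = bytes.fromhex("C4E2BD905CDDA2")
    print(gather_register_ids(sample), gather_raises_ud(sample))
```

For the bytes above, the destination and the index both carry register ID 3 (YMM3 and XMM3) and the mask is 8, so under this interpretation the encoding is rejected even though the operand widths differ.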
Issue Analytics
- State:
- Created 6 years ago
- Comments:5 (3 by maintainers)
Top GitHub Comments
We feed the XED decoder with some random data and compare the results with Zydis.
If only one of the two disassemblers is able to decode an instruction, I take a deeper look into what could be causing the discrepancy. We think it’s only fair to report these findings back to you, which improves both libraries 😄
You are correct! I changed the title.
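For readers curious about the differential testing mentioned in the comments, a rough Python sketch follows. It is not the actual harness used for XED and Zydis: `decode_a` and `decode_b` are hypothetical callables standing in for wrappers around the two decoders, each assumed to return `None` when the bytes do not decode. Buffers are 15 bytes long because that is the maximum x86 instruction length.

```python
import random


def differential_fuzz(decode_a, decode_b, iterations, seed=0, max_len=15):
    """Feed random byte buffers to two decoders and collect disagreements.

    decode_a / decode_b are hypothetical wrappers: they take bytes and
    return a decoded result, or None if decoding fails.
    """
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    mismatches = []
    for _ in range(iterations):
        data = bytes(rng.randrange(256) for _ in range(max_len))
        ok_a = decode_a(data) is not None
        ok_b = decode_b(data) is not None
        if ok_a != ok_b:
            # Only one decoder accepted the bytes: worth a manual look.
            mismatches.append(data)
    return mismatches
```

Each buffer in the returned list is a candidate for manual inspection against the SDM, as was done for the gather encoding in this issue.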