Hey,

It would be really handy for me if this could support SIMD instruction sets such as AVX, and specifically SSE4.1. I’m hoping to benchmark simple SIMD math techniques so that I can make informed decisions about SIMD performance.

Here’s a little test I did to see if quick-bench would help me do what I’m trying to do:

#include <x86intrin.h>

static void DPPS(benchmark::State& state) {
  __m128 left, right;
  left = _mm_set_ps(1.0f, 2.0f, 3.0f, 4.0f);
  right = _mm_set_ps(1.0f, 2.0f, 3.0f, 4.0f);
  for (auto _ : state) {
    __m128 dotted = _mm_dp_ps(left, right, 0xff);
    
    benchmark::DoNotOptimize(dotted);  
  }
  benchmark::DoNotOptimize(left);
  benchmark::DoNotOptimize(right);
}
// Register the function as a benchmark
BENCHMARK(DPPS);

static void MULHADD(benchmark::State& state) {
  __m128 left, right;
  left = _mm_set_ps(1.0f, 2.0f, 3.0f, 4.0f);
  right = _mm_set_ps(1.0f, 2.0f, 3.0f, 4.0f);
  for (auto _ : state) {
    __m128 dotted = _mm_mul_ps(left, right);
    dotted = _mm_hadd_ps(dotted, dotted);
    dotted = _mm_hadd_ps(dotted, dotted);
    
    benchmark::DoNotOptimize(dotted);  
  }
  benchmark::DoNotOptimize(left);
  benchmark::DoNotOptimize(right);
}
BENCHMARK(MULHADD);

The errors generated:

Error or timeout
bench-file.cpp:9:21: error: '__builtin_ia32_dpps' needs target feature sse4.1
    __m128 dotted = _mm_dp_ps(left, right, 0xff);
                    ^
/usr/lib/clang/5.0.0/include/smmintrin.h:620:12: note: expanded from macro '_mm_dp_ps'
  (__m128) __builtin_ia32_dpps((__v4sf)(__m128)(X), \
           ^
bench-file.cpp:26:14: error: always_inline function '_mm_hadd_ps' requires target feature 'sse3', but would be inlined into function 'MULHADD' that is compiled without support for 'sse3'
    dotted = _mm_hadd_ps(dotted, dotted);
             ^
bench-file.cpp:27:14: error: always_inline function '_mm_hadd_ps' requires target feature 'sse3', but would be inlined into function 'MULHADD' that is compiled without support for 'sse3'
    dotted = _mm_hadd_ps(dotted, dotted);
             ^
3 errors generated.

Cheers
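
For context on those errors: Clang and GCC only accept an intrinsic when the matching instruction set is enabled for the code being compiled, and they advertise what is enabled through predefined macros such as `__SSE3__` and `__SSE4_1__`. A minimal sketch of a check that could be compiled in the same environment follows; the helper names are made up for illustration and are not part of quick-bench or Google Benchmark.

```cpp
// Hypothetical helpers: report which SSE levels the compiler was told to
// target for this translation unit. GCC and Clang predefine __SSE3__ and
// __SSE4_1__ when -msse3 / -msse4.1 (or a -march value that implies them)
// is in effect.
constexpr bool compiled_with_sse3() {
#ifdef __SSE3__
  return true;
#else
  return false;
#endif
}

constexpr bool compiled_with_sse41() {
#ifdef __SSE4_1__
  return true;
#else
  return false;
#endif
}
```

If both helpers return false, the errors above are the expected outcome: `_mm_hadd_ps` needs SSE3 and `_mm_dp_ps` needs SSE4.1, and neither was enabled for the build.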

Issue Analytics

  • State: open
  • Created 6 years ago
  • Reactions: 2
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

1 reaction
ZongyiZhou commented, Jun 12, 2020

@FredTingaud You can try adding “-march=native” to the compiler options.
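
Where the compiler options cannot be changed, GCC and Clang also accept a per-function `target` attribute, which enables an instruction set for one function only. The sketch below is untested on quick-bench itself; `dot4`, `dot4_dpps` and `dot4_scalar` are hypothetical names, and the runtime check with `__builtin_cpu_supports` is a GCC/Clang builtin, so none of this applies to MSVC.

```cpp
#if defined(__x86_64__)
#include <immintrin.h>

// SSE4.1 is enabled for this function only (GCC/Clang attribute), so the
// rest of the file can still be built without -msse4.1.
__attribute__((target("sse4.1")))
static float dot4_dpps(const float* a, const float* b) {
  __m128 left = _mm_loadu_ps(a);
  __m128 right = _mm_loadu_ps(b);
  // 0xff: multiply all four lanes and broadcast the sum to every lane.
  return _mm_cvtss_f32(_mm_dp_ps(left, right, 0xff));
}
#endif

// Portable fallback for CPUs (or architectures) without SSE4.1.
static float dot4_scalar(const float* a, const float* b) {
  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

// Picks the SSE4.1 path only when the running CPU reports support for it.
float dot4(const float* a, const float* b) {
#if defined(__x86_64__)
  if (__builtin_cpu_supports("sse4.1")) {
    return dot4_dpps(a, b);
  }
#endif
  return dot4_scalar(a, b);
}
```

For a benchmark, the same attribute could be placed directly on the benchmark function so the intrinsic stays inside the timed loop rather than behind a call.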

0 reactions
xoorath commented, Feb 1, 2022

Running into this again 4 years later, so I’m back to +1 my own issue. 😃

This time I’m trying to benchmark __popcnt against other methods of counting bits in an integer.
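
A rough sketch of the kind of candidates such a comparison might include is below. It is an assumption about what "other methods" means, not the code from the issue. It uses `__builtin_popcount` (the GCC/Clang builtin) in place of `__popcnt`, which as far as I know is the MSVC spelling, and the function names are made up.

```cpp
#include <cstdint>

// Compiler builtin; it lowers to a POPCNT instruction only when the build
// targets a CPU feature set that includes it, otherwise to a software routine.
int popcount_builtin(std::uint32_t x) {
  return __builtin_popcount(x);
}

// Kernighan's method: each iteration clears the lowest set bit.
int popcount_kernighan(std::uint32_t x) {
  int count = 0;
  while (x != 0) {
    x &= x - 1;
    ++count;
  }
  return count;
}

// Branch-free SWAR method: sum bits in 2-, 4-, then 8-bit groups, then add
// the four byte sums with a multiply.
int popcount_swar(std::uint32_t x) {
  x = x - ((x >> 1) & 0x55555555u);
  x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
  x = (x + (x >> 4)) & 0x0F0F0F0Fu;
  return static_cast<int>((x * 0x01010101u) >> 24);
}
```

Each function would then go inside its own `for (auto _ : state)` loop with `benchmark::DoNotOptimize` on the result, in the same way as the DPPS and MULHADD benchmarks above.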


Top Results From Across the Web

Clang error on SSE4.1 intrinsic - Visual Studio Feedback
Trying to compile with Clang (on a A10-7870k) I get. error : '__builtin_ia32_roundps' needs target feature sse4.1. for a call to _mm_floor_ps.

Proper way to enable SSE4 on a per-function / per-block of ...
There is currently no way to target different ISA extensions at block / function granularity in clang. You can only do it at...

x86 Options (Using the GNU Compiler Collection (GCC))
VIA Nano 3xxx CPU with x86-64, MMX, SSE, SSE2, SSE3, SSSE3 and SSE4.1 ... This option is overridden when -march indicates that the...

[C++] O(n * m) 132 ms - LeetCode Discuss
... #pragma GCC target("sse,sse2,sse3,ssse3,sse4,popcnt,abm,mmx,avx,avx2 ... i--;) { for (int lg : languages[i]) lmap[i][lg] = 1; ...

I told the Microsoft Visual C++ compiler not to generate AVX ...
You explicitly requested an SSE4 instruction, so the compiler honored your request. ... [[gnu::target("sse4.1")]] void something(int alpha)
