Add CMakePresets for target micro arch #1348
I really like your approach and will eagerly merge it once it validates \o/
I've only kept the micro architecture target in I have ongoing work to actually do the same as these presets at the CMake level, with a function that can be made available to users to help in the tooling for dynamic dispatch (our current solution in Arrow is very verbose).
0d3d6ea to c573d9d
@serge-sans-paille this is in a ready state, but I am not fully happy with it. Getting into AVX512 and AVX512-256, the combinatorial explosion of possibilities starts to show again. This reinforces my belief that I should keep on with the work to do it in CMake (that could also be installed for our users to improve our dynamic dispatch tooling), and also homogenized with the test
This PR is not completely worthless though. For example we now have the possibility to really test with
What do you think? Should we give this some mileage before I get the time to work on a CMake solution?
fi
if [[ '${{ matrix.sys.flags }}' == 'i386' ]]; then
  CXX_FLAGS="$CXX_FLAGS -m32"
  export CXXFLAGS="$CXXFLAGS -m32"
Yes, there is a weird mismatch in master. Both CXX_FLAGS and CXXFLAGS were set, but only CXX_FLAGS was explicitly passed to CMake. CXXFLAGS is picked up automatically by CMake, but it was not exported.
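The export issue described above can be illustrated with a minimal sketch. The variable names `DEMO_UNEXPORTED` and `DEMO_EXPORTED` are hypothetical and exist only for this illustration; a child `bash` stands in for the `cmake` process, which would read `CXXFLAGS` from its environment in the same way.

```shell
#!/usr/bin/env bash
# Hypothetical variable names, used only for this illustration.
DEMO_UNEXPORTED="-m32"          # plain shell variable: not passed to child processes
export DEMO_EXPORTED="-m32"     # exported: part of the environment of child processes

# A child process (another bash here, standing in for cmake) only sees exported variables.
child_view=$(bash -c 'echo "unexported=[${DEMO_UNEXPORTED:-}] exported=[${DEMO_EXPORTED:-}]"')
echo "$child_view"
```

In this sketch the unexported variable is empty in the child, which matches the situation where a flag set in the workflow step never reaches the configure step.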
{
  "name": "avx2",
  "cacheVariables": {
    "CMAKE_CXX_FLAGS": "$env{CXXFLAGS} -march=x86-64-v2 -mno-sse4a -mavx2 -mno-avx512f"
We sometimes fall back from AVX2 instructions to SSE instructions. How can this work?
I do understand the need to prune higher instruction sets, but not the need to prune lower ones, please explain.
Do you mean -mno-sse4a? This can be removed; I added it when trying to debug a -march=native that was added because TARGET_ARCH was absent.
Though it is not a problem here: sse4a is an AMD extension that was never implemented on Intel (and that is why it was failing in SDE).
serge-sans-paille left a comment
All looks good, except the question on pruning lower architectures which raises a big unknown to me.
I've taken the direction of explicit flags such as -mavx -mno-avx2. This is IMHO less error prone and more accurate than using an architecture name such as haswell. The main difference is that this does not add other feature flags or change the -mtune model. For a test setting, accuracy is more important IMHO.
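One way to check what a given set of explicit flags enables is to ask the compiler which feature macros it predefines. The sketch below assumes a GCC- or Clang-compatible driver targeting x86; the helper name `feature_macros` is hypothetical, and `CXX` falling back to `c++` is an assumption about the local setup.

```shell
#!/usr/bin/env bash
# Sketch: list which x86 SIMD feature macros a set of compiler flags defines.
# Assumes a GCC- or Clang-compatible driver targeting x86; CXX defaults to c++.
# feature_macros is a hypothetical helper name, not part of the project.
feature_macros() {
    # -dM -E dumps the predefined macros of an empty translation unit.
    # The helper's arguments are forwarded to the compiler as-is.
    "${CXX:-c++}" "$@" -dM -E -x c++ /dev/null 2>/dev/null \
        | grep -oE '__(SSE2|SSE4_2|AVX|AVX2|AVX512F)__' \
        | sort -u
}

# Flags taken from the comment above.
feature_macros -mavx -mno-avx2
```

With such a toolchain, the listing would be expected to contain `__AVX__` and the lower SSE levels that `-mavx` implies, but not `__AVX2__`. If the compiler rejects the flags (for instance on a non-x86 host), the helper prints nothing.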