Description
The following simple test code does not compile when using -DXTENSOR_USE_XSIMD:
#include <xtensor/xfixed.hpp>
void xtensor_or(xt::xtensor_fixed<bool, xt::xshape<16>>& b1,
                const xt::xtensor_fixed<bool, xt::xshape<16>>& b2) {
    b1 = b1 | b2;
}
The compiler complains that it cannot convert an xsimd::batch_bool<int, xsimd::sse2> to xsimd::batch<int, xsimd::sse2>. Apparently xtensor and/or xsimd internally use these types for performing the actual computations.
When I change b1 | b2 to xt::cast<bool>(b1 | b2), the compiler accepts the code. The resulting assembly looks good: b1 and b2 are loaded from memory, then there is a single por instruction, and b1 is stored again.
When I change the two bools to char or uint8_t, the resulting assembly shows many pack and unpack instructions along with four por instructions. This suggests that xtensor/xsimd internally widens the chars to ints. When I disable xsimd, the resulting assembly looks good again: no pack/unpack instructions and a single por instruction. I suspect that xsimd_return_type forces the conversion of char and uint8_t to integers, but I couldn't pinpoint exactly where it happens.
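The count of four por instructions is consistent with simple register arithmetic, assuming 128-bit SSE2 registers and elements widened from 8 to 32 bits. The sketch below only illustrates that arithmetic; registers_needed is a hypothetical helper and not part of xtensor or xsimd.

```cpp
#include <cstddef>

// Hypothetical helper: number of SIMD registers needed to hold `count`
// elements of `element_bits` bits each, rounding up.
// Assumes 128-bit registers (SSE2) by default.
constexpr std::size_t registers_needed(std::size_t count,
                                       std::size_t element_bits,
                                       std::size_t register_bits = 128) {
    return (count * element_bits + register_bits - 1) / register_bits;
}

// 16 elements kept as 8-bit values fit into one register: one OR suffices.
static_assert(registers_needed(16, 8) == 1, "16 x 8 bits fit in 128 bits");

// The same 16 elements widened to 32 bits occupy four registers,
// so the OR has to be performed four times.
static_assert(registers_needed(16, 32) == 4, "16 x 32 bits need 4 registers");
```

The pack and unpack instructions would then be the cost of converting between the 8-bit storage layout and the widened 32-bit layout.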
I can understand why xsimd unpacks 8-bit booleans into 128-bit SIMD registers holding four 32-bit booleans each: xsimd::batch_bool allows booleans to be used together with other types. However, this feature should not result in compilation errors. Also, operations that only use booleans should not suffer degraded performance.
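For reference, the boolean-only operation in question amounts to the plain loop below, written without xtensor or xsimd. This is a minimal sketch: mask16 and or_bytes are hypothetical names, and whether a given compiler turns the loop into a single 128-bit OR depends on the compiler, target, and optimization flags, which were not checked here.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a fixed-size tensor of 16 byte-sized flags.
using mask16 = std::array<std::uint8_t, 16>;

// Element-wise OR of b2 into b1, one byte per element.
// No widening is involved: every element stays 8 bits wide.
void or_bytes(mask16& b1, const mask16& b2) {
    for (std::size_t i = 0; i < b1.size(); ++i) {
        b1[i] = static_cast<std::uint8_t>(b1[i] | b2[i]);
    }
}
```

Comparing the assembly of this loop with that of the xtensor version built with -DXTENSOR_USE_XSIMD is one way to check whether the extra pack/unpack instructions come from the SIMD dispatch rather than from the operation itself.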
Since this issue occurs when combining xtensor with xsimd, I am raising it here. Without xsimd, xtensor<bool, N> works fine. Also, xsimd::batch_bool<bool> works fine in isolation. Fixes may therefore be needed in both xtensor and xsimd.
Please have a look at this issue and at least investigate the compilation error.