C++ - Why can't GCC optimize the logical / bitwise AND pair in "x && (x & 4242)" to "x & 4242"?
Here are two functions which I claim do exactly the same thing:

    bool fast(int x)
    {
        return x & 4242;
    }

    bool slow(int x)
    {
        return x && (x & 4242);
    }
Logically they do the same thing, and just to be 100% sure I wrote a test that ran all four billion possible inputs through both of them, and they matched. But the assembly code is a different story:

    fast:
        andl    $4242, %edi
        setne   %al
        ret

    slow:
        xorl    %eax, %eax
        testl   %edi, %edi
        je      .L3
        andl    $4242, %edi
        setne   %al
    .L3:
        rep
        ret
I was surprised that GCC could not make the leap of logic to eliminate the redundant test. I tried g++ 4.4.3 and 4.7.2 with -O2, -O3, and -Os, all of which generated the same code. The platform is Linux x86_64.

Can someone explain why GCC shouldn't be smart enough to generate the same code in both cases? I'd also like to know if other compilers can do better.

Edit to add test harness:
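The exhaustive check mentioned above is not shown in the question. A minimal sketch of what such a check might look like follows; the helper names `count_mismatches` and `count_all_mismatches` are hypothetical and are not taken from the original post.

```cpp
#include <cstdint>
#include <limits>

bool fast(int x) { return x & 4242; }
bool slow(int x) { return x && (x & 4242); }

// Hypothetical helper (not from the original post): counts the inputs in
// [lo, hi] for which fast() and slow() disagree. The bounds are expected to
// lie within the range of int; a 64-bit loop counter is used so that
// hi == INT_MAX does not overflow on the final increment.
std::uint64_t count_mismatches(std::int64_t lo, std::int64_t hi)
{
    std::uint64_t mismatches = 0;
    for (std::int64_t v = lo; v <= hi; ++v) {
        const int x = static_cast<int>(v);
        if (fast(x) != slow(x))
            ++mismatches;
    }
    return mismatches;
}

// Hypothetical helper: runs the check over every value representable in int.
std::uint64_t count_all_mismatches()
{
    return count_mismatches(std::numeric_limits<int>::min(),
                            std::numeric_limits<int>::max());
}
```

If the two functions really are equivalent, `count_all_mismatches()` should return zero. Note that an optimizing compiler may fold the comparison away entirely, so a zero result shows equivalence but says nothing about the generated code.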
    #include <cstdlib>
    #include <vector>
    using namespace std;

    int main(int argc, char* argv[])
    {
        // make a vector filled with numbers starting from argv[1]
        int seed = atoi(argv[1]);
        vector<int> v(100000);
        for (int j = 0; j < 100000; ++j)
            v[j] = j + seed;

        // count how many times the function returns true
        int result = 0;
        for (int j = 0; j < 100000; ++j)
            for (int i : v)
                result += slow(i); // or fast(i), try both

        return result;
    }
I tested the above with clang 5.1 on Mac OS with -O3. It took 2.9 seconds using fast() and 3.8 seconds using slow(). If I instead use a vector of all zeros, there is no significant difference in performance between the two functions.
You are correct that this appears to be a deficiency, and possibly an outright bug, in the optimizer.

Consider:

    bool slow(int x)
    {
        return x && (x & 4242);
    }

    bool slow2(int x)
    {
        return (x & 4242) && x;
    }

Assembly emitted by GCC 4.8.1 (-O3):

    slow:
        xorl    %eax, %eax
        testl   %edi, %edi
        je      .L2
        andl    $4242, %edi
        setne   %al
    .L2:
        rep
        ret

    slow2:
        andl    $4242, %edi
        setne   %al
        ret
In other words, slow2 is misnamed.
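One way to see why the test on x is redundant in either operand order is the short derivation below. It is a sketch of the reasoning, not something taken from the GCC sources; the symbol m simply stands for the constant 4242.

```latex
% m denotes the mask constant (4242 in the question).
\begin{align*}
x = 0 &\implies x \mathbin{\&} m = 0
  && \text{(ANDing zero with anything gives zero)} \\
x \mathbin{\&} m \neq 0 &\implies x \neq 0
  && \text{(contrapositive)} \\
(x \neq 0) \land (x \mathbin{\&} m \neq 0) &\iff (x \mathbin{\&} m \neq 0)
  && \text{(the first conjunct is implied by the second)}
\end{align*}
```

Because the conjunction is symmetric and neither operand has side effects, the same argument covers both slow and slow2.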
I have only contributed the occasional patch to GCC, so whether my point of view carries any weight is debatable :-). But it is certainly strange, in my view, for GCC to optimize one of these and not the other. I suggest filing a bug report.

[Update]

Surprisingly small changes appear to make a big difference. For example:

    bool slow3(int x)
    {
        int y = x & 4242;
        return y && x;
    }

...generates the "slow" code again. I have no hypothesis for this behavior.

You can experiment with all of these on multiple compilers here.