c++ - Why can't GCC optimize the logical bitwise AND pair in "x && (x & 4242)" to "x & 4242"?


Here are two functions which I claim do exactly the same thing:

bool fast(int x)
{
  return x & 4242;
}

bool slow(int x)
{
  return x && (x & 4242);
}

Logically they do the same thing, and just to be 100% sure I wrote a test that ran all four billion possible inputs through both of them, and they matched. But the assembly code is a different story:

fast:
    andl    $4242, %edi
    setne   %al
    ret

slow:
    xorl    %eax, %eax
    testl   %edi, %edi
    je      .L3
    andl    $4242, %edi
    setne   %al
.L3:
    rep
    ret

I was surprised that GCC could not make the leap of logic to eliminate the redundant test. I tried g++ 4.4.3 and 4.7.2 with -O2, -O3, and -Os, all of which generated the same code. The platform is Linux x86_64.

Can someone explain why GCC shouldn't be smart enough to generate the same code in both cases? I'd also like to know if other compilers can do better.
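The exhaustive comparison mentioned above is not shown in the post. A minimal sketch of how such a check could look follows; `count_mismatches` is a hypothetical helper name introduced here for illustration, not the author's actual test.

```cpp
#include <cstdint>

// The two functions from the question, repeated so the sketch is self-contained.
bool fast(int x) { return x & 4242; }
bool slow(int x) { return x && (x & 4242); }

// Hypothetical helper (not from the original post): counts the inputs in
// [lo, hi] on which fast() and slow() disagree. The loop variable is a
// long long so that incrementing past hi == INT_MAX does not overflow.
std::uint64_t count_mismatches(int lo, int hi)
{
    std::uint64_t mismatches = 0;
    for (long long x = lo; x <= hi; ++x) {
        const int v = static_cast<int>(x);
        if (fast(v) != slow(v))
            ++mismatches;
    }
    return mismatches;
}
```

Calling `count_mismatches(INT_MIN, INT_MAX)` (both from `<climits>`) would cover every value of a 32-bit int; a result of zero means the two functions agree everywhere.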

Edit to add test harness:

#include <cstdlib>
#include <vector>
using namespace std;

// fast() and slow() as defined above must be visible here

int main(int argc, char* argv[])
{
    // make a vector filled with numbers starting from argv[1]
    int seed = atoi(argv[1]);
    vector<int> v(100000);
    for (int j = 0; j < 100000; ++j)
        v[j] = j + seed;

    // count how many times the function returns true
    int result = 0;
    for (int j = 0; j < 100000; ++j)
        for (int i : v)
            result += slow(i); // or fast(i), try both

    return result;
}

I tested the above with clang 5.1 on Mac OS with -O3. It took 2.9 seconds using fast() and 3.8 seconds using slow(). If I instead use a vector of all zeros, there is no significant difference in performance between the two functions.
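The post does not say how the run times were measured. One way to time the two functions from inside the program is sketched below using `std::chrono`; `count_true` and `time_seconds` are hypothetical helper names, and because the functions are passed as parameters here, inlining (and therefore the timings) may differ from the harness above.

```cpp
#include <chrono>
#include <vector>

// The two functions from the question, repeated so the sketch is self-contained.
bool fast(int x) { return x & 4242; }
bool slow(int x) { return x && (x & 4242); }

// Hypothetical helper: applies f to every element of v, reps times over,
// and returns how many of those calls returned true.
template <typename F>
long long count_true(F f, const std::vector<int>& v, int reps)
{
    long long result = 0;
    for (int r = 0; r < reps; ++r)
        for (int i : v)
            result += f(i);
    return result;
}

// Hypothetical helper: wall-clock seconds taken by one count_true() run.
// The count is written to `out` so the measured work has a visible result.
template <typename F>
double time_seconds(F f, const std::vector<int>& v, int reps, long long& out)
{
    const auto start = std::chrono::steady_clock::now();
    out = count_true(f, v, reps);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

For example, `time_seconds(slow, v, 100000, n)` and `time_seconds(fast, v, 100000, n)` on the same vector give two figures to compare; both calls should leave the same count in `n`.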

You are correct that this appears to be a deficiency, and possibly an outright bug, in the optimizer.
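For reference, the reason the leading test is redundant can be written as a short derivation. Here $m$ stands for the constant 4242 and $\mathbin{\&}$ for bitwise AND; the notation is only illustrative.

```latex
\begin{aligned}
x = 0 &\implies (x \mathbin{\&} m) = 0 \\
\text{hence}\quad (x \mathbin{\&} m) \neq 0 &\implies x \neq 0 \\
\text{so}\quad (x \neq 0) \land \bigl((x \mathbin{\&} m) \neq 0\bigr) &\iff (x \mathbin{\&} m) \neq 0
\end{aligned}
```

Nothing in this argument depends on the particular value of $m$.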

Consider:

bool slow(int x)
{
  return x && (x & 4242);
}

bool slow2(int x)
{
  return (x & 4242) && x;
}

Assembly emitted by GCC 4.8.1 (-O3):

slow:
    xorl    %eax, %eax
    testl   %edi, %edi
    je      .L2
    andl    $4242, %edi
    setne   %al
.L2:
    rep ret

slow2:
    andl    $4242, %edi
    setne   %al
    ret

In other words, slow2 is misnamed.

I have contributed the occasional patch to GCC, so whether my point of view carries any weight is debatable :-). Still, it is strange, in my view, for GCC to optimize one of these and not the other. I suggest filing a bug report.

[Update]

Surprisingly small changes appear to make a big difference. For example:

bool slow3(int x)
{
  int y = x & 4242;
  return y && x;
}

...generates "slow" code again. I have no hypothesis for this behavior.

You can experiment with all of these on multiple compilers here.

