1 Mar 2008 19:21
Re: binary compiled with -O1 and w/ individual optimization flags are not the same
Brian Dessent <brian <at> dessent.net>
2008-03-01 18:21:46 GMT
CSights wrote:

> Also, I didn't mention earlier (did I?) that the program's output when
> compiled on the Macintosh matched at all optimization levels. (O0 == O1 ==
> O2) (Though the output did not match any output from the program compiled on
> linux.) Is this possibly b/c the Mac has sse2 (Core 2 Duo) and able to use
> those instructions which have more meaningful decimal places?

Yes, it's probably using the sse2 unit.

> If this is the problem, what would be a good way of dealing with it?

Well, first realize that it's not a problem per se. The results *are*
equivalent in the significant digits that actually represent what a double
can hold. The only reason they seem different is that there are extra bits
of precision that result from the value still being in a 387 register. But
those bits shouldn't matter, because as soon as the result is moved into
memory they are truncated away.

> Throwing away the meaningless decimal digits is okay with me, but avoiding
> the performance hit that comes with -ffloat-store would be nice. Also, it

Like I said, you can use -mpc64 to explicitly set the 387 to 64-bit (double)
precision, just like the sse2 unit. If you don't have a gcc new enough to
have this option, or you don't want to depend on requiring an option, you
can simply configure the 387 manually at the beginning of your program to
disable the extended precision. See
<http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323#c60> for a code snippet of
how to do this. (That relies on a glibc-specific fpu_control.h header, but
the definitions in that header are pretty self-contained.)
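
For illustration, here is a minimal sketch of the manual approach, assuming glibc's fpu_control.h on an x86 target. It is not copied from the bug report linked above; the function names set_x87_double_precision and x87_precision_bits and the HAVE_X87_CONTROL macro are made up for this example. On any other platform both functions do nothing and report that.

```c
#include <assert.h>

/* Assumption: glibc on x86/x86-64, where <fpu_control.h> provides
 * _FPU_GETCW/_FPU_SETCW and the precision-control constants. */
#if defined(__GLIBC__) && (defined(__i386__) || defined(__x86_64__))
#include <fpu_control.h>
#define HAVE_X87_CONTROL 1
#else
#define HAVE_X87_CONTROL 0
#endif

/* Hypothetical helper: report the x87 significand width currently
 * selected (24, 53 or 64 bits), or -1 if it cannot be determined. */
int x87_precision_bits(void)
{
#if HAVE_X87_CONTROL
    fpu_control_t cw;
    _FPU_GETCW(cw);
    switch (cw & _FPU_EXTENDED) {   /* _FPU_EXTENDED covers both PC bits */
    case _FPU_SINGLE:   return 24;
    case _FPU_DOUBLE:   return 53;
    case _FPU_EXTENDED: return 64;
    default:            return -1;  /* reserved encoding */
    }
#else
    return -1;
#endif
}

/* Hypothetical helper: switch the x87 unit from extended to double
 * precision. Returns 1 if the control word was changed, 0 if this
 * platform is not covered by the sketch. */
int set_x87_double_precision(void)
{
#if HAVE_X87_CONTROL
    fpu_control_t cw;
    _FPU_GETCW(cw);
    cw = (cw & ~_FPU_EXTENDED) | _FPU_DOUBLE;  /* clear PC bits, select double */
    _FPU_SETCW(cw);
    assert(x87_precision_bits() == 53);        /* sanity check: setting took effect */
    return 1;
#else
    return 0;
#endif
}
```

The idea would be to call set_x87_double_precision() once at the top of main(), before any floating-point work. Note that this only narrows the significand; the exponent range of the 387 registers stays extended, so results near overflow or underflow may still differ from the sse2 unit.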