1 Mar 2008 18:25
Re: binary compiled with -O1 and w/ individual optimization flags are not the same
Tim Prince <tprince <at> myrealbox.com>
2008-03-01 17:25:28 GMT
CSights wrote:
>
> Currently using doubles, but thanks for reminding me about the number of
> decimals that make sense.
>
>
>> By default calculations on the 387 are done by the hardware in 80 bits
>> precision, but truncated down to 64 (assuming double types) when moved
>> out of the registers. There are a number of ways to deal with it, or at
>> least expose it:
>>
>> -ffloat-store will cause gcc to always move intermediate results out of
>> registers and into memory, which effectively gets rid of the excess
>> precision at the cost of a speed hit.
>>
>
> Progress! Now the program output matching blocks are
> (O0 -ffloat-store == O1 ffloat-store == O2 ffloat-store) != (O0) != (O1 == O2
> == O3) In other words, now the O0 matches 1,2 with the addition
> of -ffloat-store, even though it still doesn't match the Ox without
> ffloat-store.
> Does this suggest to you the mismatching output was due to decimal point
> differences rather than other problems (aliasing for example)?
>
It suggests that you were in fact getting more than 53-bit double
somewhere, and that it's not an aliasing error.

> Also, I didn't mention earlier (did I?) that the program's output when
> compiled on the Macintosh matched at all optimization levels. (O0 == O1 ==
> O2) (Though the output did not match any output from the program compiled on
> linux.) Is this possibly b/c the Mac has sse2 (Core 2 Duo) and able to use
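
To make the "more than 53-bit double" point concrete, here is a minimal sketch in C. It is not code from this thread: the function names sum_rounded and sum_extended are hypothetical, and it imitates the two behaviours by hand rather than relying on compiler flags. A volatile double stands in for the store to memory that -ffloat-store forces on variables, and a long double accumulator stands in for a value kept in a 387 register. It assumes long double has a wider significand than double, which holds on x86 but not on every platform.

```c
#include <float.h>

/* Hypothetical helper: every partial sum is written to a volatile
 * double, so it is rounded to a 53-bit significand after each add.
 * This is roughly the effect -ffloat-store has on a variable. */
double sum_rounded(const double *x, int n)
{
    volatile double acc = 0.0;
    for (int i = 0; i < n; i++)
        acc = acc + x[i];
    return acc;
}

/* Hypothetical helper: partial sums stay in long double (64-bit
 * significand on x86), standing in for a value held in a 387
 * register. Rounding to double happens only once, at the end. */
double sum_extended(const double *x, int n)
{
    long double acc = 0.0L;
    for (int i = 0; i < n; i++)
        acc += x[i];
    return (double)acc;
}
```

With the input { 1.0, DBL_EPSILON / 2, DBL_EPSILON / 2 }, sum_rounded returns exactly 1.0, because each half-ulp addend is a tie that rounds back to 1.0. Where long double is wider than double, sum_extended keeps both addends and returns 1.0 + DBL_EPSILON. Same source, same data, different last bit, which is the kind of mismatch that goes away once the intermediates are forced through memory.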