Andrew C. Morrow | 27 Aug 23:56 2012
Picon

macports valgrind 3.8.0 built with XCode clang on OS X 10.8.1 fails to start

Hi -

I recently built valgrind-3.8.0 using clang from XCode 4.5, via
macports, on OS X 10.8.1. It builds successfully, but the resulting
valgrind binary doesn't work. Any invocation of valgrind (even just
'valgrind --help') fails as follows:

--2030:0:main     Root stack 0x238E20000 to 0x238F24000, a local 0x238F21FB0
--2030:0:main     Valgrind: FATAL: Initial stack switched failed.
--2030:0:main        Cannot continue.  Sorry.

Turning on debug logging doesn't reveal anything interesting:

--2127:1:debuglog DebugLog system started by Stage 1, level 1 logging requested
--2127:1:launcher no tool requested, defaulting to 'memcheck'
--2127:1:launcher valgrind_lib = /opt/local/lib/valgrind
--2127:1:launcher arch 'x86' IS NOT installed
--2127:1:launcher arch 'amd64' IS installed
--2127:1:launcher arch 'arm' IS NOT installed
--2127:1:launcher arch 'ppc32' IS NOT installed
--2127:1:launcher arch 'ppc64' IS NOT installed
--2127:1:launcher no client specified, defaulting arch to 'amd64'
--2127:1:launcher launcher_name = /opt/local/bin/valgrind
--2127:1:launcher launching /opt/local/lib/valgrind/memcheck-amd64-darwin
--2127:1:debuglog DebugLog system started by Stage 2 (main), level 1
logging requested
--2127:1:main     Welcome to Valgrind version 3.8.0 debug logging
--2127:1:main     Checking current stack is plausible
--2127:0:main     Root stack 0x238E20000 to 0x238F24000, a local 0x238F21FB0
--2127:0:main     Valgrind: FATAL: Initial stack switched failed.
(Continue reading)

Julian Seward | 28 Aug 11:05 2012
Picon

Re: macports valgrind 3.8.0 built with XCode clang on OS X 10.8.1 fails to start


On Monday, August 27, 2012, Andrew C. Morrow wrote:
> --2030:0:main     Root stack 0x238E20000 to 0x238F24000, a local
> 0x238F21FB0 --2030:0:main     Valgrind: FATAL: Initial stack switched
> failed.
> --2030:0:main        Cannot continue.  Sorry.
> 
> If I rebuild valgrind with llvm-gcc-4.2 as the compiler, all is well.

Ah yes.  I fell over this too, and documented it in the sources, but
failed to document it anywhere else.

It is to do with this bit of m_main.c, around line 1614 -- see the
comment.  I didn't chase it further, because it compiled with gcc and
it had (and still does have) more serious problems on OSX 10.8, so
I spent time investigating those instead.

If you want to try and make sense of what's going on here with
clang, that would be good, since I find it hard to believe that
clang would really have miscompiled an expression as simple
as "aLocal < limLo || aLocal >= limHi".

J

   //--------------------------------------------------------------
   // Ensure we're on a plausible stack.
   //   p: logging
   //--------------------------------------------------------------
   VG_(debugLog)(1, "main", "Checking current stack is plausible\n");
   { HChar* limLo  = (HChar*)(&VG_(interim_stack).bytes[0]);
(Continue reading)

Andrew C. Morrow | 28 Aug 17:27 2012
Picon

Re: macports valgrind 3.8.0 built with XCode clang on OS X 10.8.1 fails to start

Hi Julian -

Thanks for the pointer to the relevant code.

I can make the check work again by marking aLocal as volatile:

     HChar* volatile aLocal = (HChar*)&limLo; /* any auto local will do */

Complete speculation, but maybe what is happening is that clang is
optimizing away the entire conditional as always taken, reasoning
(correctly or incorrectly) that an auto variable will never fall
within an address range that is reserved for an extern char array?

On Tue, Aug 28, 2012 at 5:05 AM, Julian Seward <jseward@...> wrote:
>
> On Monday, August 27, 2012, Andrew C. Morrow wrote:
>> --2030:0:main     Root stack 0x238E20000 to 0x238F24000, a local
>> 0x238F21FB0 --2030:0:main     Valgrind: FATAL: Initial stack switched
>> failed.
>> --2030:0:main        Cannot continue.  Sorry.
>>
>> If I rebuild valgrind with llvm-gcc-4.2 as the compiler, all is well.
>
> Ah yes.  I fell over this too, and documented it in the sources, but
> failed to document it anywhere else.
>
> It is to do with this bit of m_main.c, around line 1614 -- see the
> comment.  I didn't chase it further, because it compiled with gcc and
> it had (and still does have) more serious problems on OSX 10.8, so
> I spent time investigating those instead.
(Continue reading)

Julian Seward | 28 Aug 17:43 2012
Picon

Re: macports valgrind 3.8.0 built with XCode clang on OS X 10.8.1 fails to start

On Tuesday, August 28, 2012, Andrew C. Morrow wrote:

> I can make the check work again by marking aLocal as volatile:
> 
>      HChar* volatile aLocal = (HChar*)&limLo; /* any auto local will do */
> 
> Complete speculation, but maybe what is happening is that clang is
> optimizing away the entire conditional as always taken, reasoning
> (correctly or incorrectly) that an auto variable will never fall
> within an address range that is reserved for an extern char array?

Hmm, that's a good point.  I couldn't figure out why it was failing.

If you can verify that the change works with both clang and gcc then
I'll be happy to commit it.

J

------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
Andrew C. Morrow | 28 Aug 19:02 2012
Picon

Re: macports valgrind 3.8.0 built with XCode clang on OS X 10.8.1 fails to start

>
> Hmm, that's a good point.  I couldn't figure out why it was failing.
>
> If you can verify that the change works with both clang and gcc then
> I'll be happy to commit it.
>

I just tried it with both llvm-gcc-4.2 and clang, and the resulting
valgrind binary does start up correctly in both cases. However, as
subsequent emails from Florian and Patrick point out, this is still
undefined behavior.

------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
Florian Krohm | 28 Aug 18:34 2012
Picon

Re: macports valgrind 3.8.0 built with XCode clang on OS X 10.8.1 fails to start

On 08/28/2012 11:27 AM, Andrew C. Morrow wrote:
> Hi Julian -
> 
> Thanks for the pointer to the relevant code.
> 
> I can make the check work again by marking aLocal as volatile:
> 
>      HChar* volatile aLocal = (HChar*)&limLo; /* any auto local will do */
> 
> Complete speculation, but maybe what is happening is that clang is
> optimizing away the entire conditional as always taken, reasoning
> (correctly or incorrectly) that an auto variable will never fall
> within an address range that is reserved for an extern char array?

The reason is that the comparisons aLocal < limLo and aLocal >= limHi
cause undefined behaviour (according to c99 6.5.8) because they compare
pointers that do not point into the same aggregate.
Compilers are free to do whatever they like in that case.
Another fix would be to transform the pointers to like-sized integers
and compare those. That kind of conversion causes implementation defined
behaviour which compilers are supposed to document. So you could look up
whether it works.. I suppose, using volatile is the easier "fix".

  Florian

>>    VG_(debugLog)(1, "main", "Checking current stack is plausible\n");
>>    { HChar* limLo  = (HChar*)(&VG_(interim_stack).bytes[0]);
>>      HChar* limHi  = limLo + sizeof(VG_(interim_stack));
>>      HChar* aLocal = (HChar*)&limLo; /* any auto local will do */
>>      /* "Apple clang version 4.0 (tags/Apple/clang-421.0.57) (based on
(Continue reading)

Patrick J. LoPresti | 28 Aug 18:56 2012
Picon

Re: macports valgrind 3.8.0 built with XCode clang on OS X 10.8.1 fails to start

On Tue, Aug 28, 2012 at 9:34 AM, Florian Krohm <britzel@...> wrote:
>
> The reason is that the comparisons aLocal < limLo and aLocal >= limHi
> cause undefined behaviour (according to c99 6.5.8) because they compare
> pointers that do not point into the same aggregate.
> Compilers are free to do whatever they like in that case.
> Another fix would be to transform the pointers to like-sized integers
> and compare those.

This sounds like a better fix to me, if you do not mind using C99.
Just cast to uintptr_t (from <stdint.h>).  (If not C99, "unsigned
long" should work.)

Comparing such integers is technically implementation-defined, but (a)
that is much better than "undefined" and (b) let's face it, on any
conceivable system it will work fine.

Assuming this actually fixes the problem, of course.

 - Pat

------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
Florian Krohm | 28 Aug 19:32 2012
Picon

Re: macports valgrind 3.8.0 built with XCode clang on OS X 10.8.1 fails to start

On 08/28/2012 12:56 PM, Patrick J. LoPresti wrote:
> On Tue, Aug 28, 2012 at 9:34 AM, Florian Krohm <britzel@...> wrote:
>>

> Comparing such integers is technically implementation-defined, but (a)
> that is much better than "undefined" 

Yes, very true.

> and (b) let's face it, on any
> conceivable system it will work fine.

But so will be the volatile hack by making one of the pointers in the
comparison volatile. The compiler would have to prove that there is
undefined behavior in order to exploit it. But as a volatile variable
can change its value in ways invisible to the compiler, the compiler
must make the conservative assumption that it points into the same
aggregate as the other pointer its compared against. I.e. the behaviour
is possibly defined.

   Florian

------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
Julian Seward | 28 Aug 20:56 2012
Picon

Re: macports valgrind 3.8.0 built with XCode clang on OS X 10.8.1 fails to start

On Tuesday, August 28, 2012, Patrick J. LoPresti wrote:
> On Tue, Aug 28, 2012 at 9:34 AM, Florian Krohm <britzel@...> wrote:
> > The reason is that the comparisons aLocal < limLo and aLocal >= limHi
> > cause undefined behaviour (according to c99 6.5.8) because they compare
> > pointers that do not point into the same aggregate.
> > Compilers are free to do whatever they like in that case.
> > Another fix would be to transform the pointers to like-sized integers
> > and compare those.
> 
> This sounds like a better fix to me, if you do not mind using C99.
> Just cast to uintptr_t (from <stdint.h>).  (If not C99, "unsigned
> long" should work.)
> 
> Comparing such integers is technically implementation-defined, but (a)
> that is much better than "undefined" and (b) let's face it, on any
> conceivable system it will work fine.
> 
> Assuming this actually fixes the problem, of course.

I don't remember exactly, but I think I tried casting the 3 values
to "HWord" (unsigned int the size of a host word) and it still
didn't work.  But, Andrew, if you are inclined to check that,
pls do :)

Since the volatile fix is verified by Andrew as working, I'm inclined
to go with it, unless the cast-to-int thing works, in which case
let's do that instead.

J

(Continue reading)

Julian Seward | 28 Aug 20:58 2012
Picon

Re: macports valgrind 3.8.0 built with XCode clang on OS X 10.8.1 fails to start

On Tuesday, August 28, 2012, Florian Krohm wrote:
> On 08/28/2012 11:27 AM, Andrew C. Morrow wrote:
> > Hi Julian -
> > 
> > Thanks for the pointer to the relevant code.
> > 
> > I can make the check work again by marking aLocal as volatile:
> >      HChar* volatile aLocal = (HChar*)&limLo; /* any auto local will do
> >      */
> > 
> > Complete speculation, but maybe what is happening is that clang is
> > optimizing away the entire conditional as always taken, reasoning
> > (correctly or incorrectly) that an auto variable will never fall
> > within an address range that is reserved for an extern char array?
> 
> The reason is that the comparisons aLocal < limLo and aLocal >= limHi
> cause undefined behaviour (according to c99 6.5.8) because they compare
> pointers that do not point into the same aggregate.
> Compilers are free to do whatever they like in that case.

Wow .. I think I would not have guessed that in a long time.
So, it seems we've been holed beneath the waterline by LLVM's alias
analyser, yes?  Interesting.

J

------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
(Continue reading)

Andrew C. Morrow | 28 Aug 21:42 2012
Picon

Re: macports valgrind 3.8.0 built with XCode clang on OS X 10.8.1 fails to start

On Tue, Aug 28, 2012 at 2:58 PM, Julian Seward <jseward@...> wrote:
> On Tuesday, August 28, 2012, Florian Krohm wrote:
>> On 08/28/2012 11:27 AM, Andrew C. Morrow wrote:
>> > Hi Julian -
>> >
>> > Thanks for the pointer to the relevant code.
>> >
>> > I can make the check work again by marking aLocal as volatile:
>> >      HChar* volatile aLocal = (HChar*)&limLo; /* any auto local will do
>> >      */
>> >
>> > Complete speculation, but maybe what is happening is that clang is
>> > optimizing away the entire conditional as always taken, reasoning
>> > (correctly or incorrectly) that an auto variable will never fall
>> > within an address range that is reserved for an extern char array?
>>
>> The reason is that the comparisons aLocal < limLo and aLocal >= limHi
>> cause undefined behaviour (according to c99 6.5.8) because they compare
>> pointers that do not point into the same aggregate.
>> Compilers are free to do whatever they like in that case.
>
> Wow .. I think I would not have guessed that in a long time.
> So, it seems we've been holed beneath the waterline by LLVM's alias
> analyser, yes?  Interesting.
>
> J

I was curious about this, so I asked on stackoverflow:

http://stackoverflow.com/questions/12165288/c99-is-it-possible-to-portably-determine-if-two-pointers-point-within-the-same
(Continue reading)

Andrew C. Morrow | 30 Aug 16:25 2012
Picon

Re: macports valgrind 3.8.0 built with XCode clang on OS X 10.8.1 fails to start

There were three possible solutions proposed in this thread:

- Using volatile
- Converting to an integer representation and comparing.
- My proposal to just do a linear scan of interim_stack looking for aLocal

I didn't get any feedback on the last idea, so I guess it wasn't too
well received. Is there any consensus on what is the best fix? I'm
happy to put together a patch if that would help get this resolved (or
maybe someone with commit rights can just add 'volatile' if that is
the chosen solution) but I'd rather make sure it is the right patch
first.

Thanks,
Andrew

------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
Julian Seward | 30 Aug 16:44 2012
Picon

Re: macports valgrind 3.8.0 built with XCode clang on OS X 10.8.1 fails to start

On Thursday, August 30, 2012, Andrew C. Morrow wrote:
> There were three possible solutions proposed in this thread:
> 
> - Using volatile
> - Converting to an integer representation and comparing.
> - My proposal to just do a linear scan of interim_stack looking for aLocal
> 
> I didn't get any feedback on the last idea, so I guess it wasn't too
> well received.

It sounded a bit expensive.

> Is there any consensus on what is the best fix?  I'm happy to put
> together a patch if that would help get this resolved (or
> maybe someone with commit rights can just add 'volatile' if that is
> the chosen solution) but I'd rather make sure it is the right patch
> first.

I'd be happy to go with either the volatile thing, or the cast-to-HWord
scheme assuming that works.  I don't know enough about the details of the
C99 language spec to guess which is less likely to get messed up by 
future compilers.  Florian, any opinion here?

J

------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
(Continue reading)

Florian Krohm | 30 Aug 17:19 2012
Picon

Re: macports valgrind 3.8.0 built with XCode clang on OS X 10.8.1 fails to start

On 08/30/2012 10:25 AM, Andrew C. Morrow wrote:
> There were three possible solutions proposed in this thread:
> 
> - Using volatile
> - Converting to an integer representation and comparing.
> - My proposal to just do a linear scan of interim_stack looking for aLocal
> 
> I didn't get any feedback on the last idea, so I guess it wasn't too
> well received. 

Yeah, it seems excessive...

> Is there any consensus on what is the best fix?

Everybody felt, that relying on implementation behaviour would be better
than relying on undefined behaviour be handled in a specific way.
But it was not verified yet, that that casting the pointers to HWords
and comparing them would actually work. Did you try it?

  Florian

------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
Andrew C. Morrow | 10 Sep 03:19 2012
Picon

Re: macports valgrind 3.8.0 built with XCode clang on OS X 10.8.1 fails to start

On Thu, Aug 30, 2012 at 11:19 AM, Florian Krohm <britzel@...> wrote:
> On 08/30/2012 10:25 AM, Andrew C. Morrow wrote:
>> There were three possible solutions proposed in this thread:
>>
>> - Using volatile
>> - Converting to an integer representation and comparing.
>> - My proposal to just do a linear scan of interim_stack looking for aLocal
>>
>> I didn't get any feedback on the last idea, so I guess it wasn't too
>> well received.
>
> Yeah, it seems excessive...
>
>> Is there any consensus on what is the best fix?
>
> Everybody felt, that relying on implementation behaviour would be better
> than relying on undefined behaviour be handled in a specific way.
> But it was not verified yet, that that casting the pointers to HWords
> and comparing them would actually work. Did you try it?
>
>   Florian
>

Hi Florian -

Apologies for the delay: I was away on vacation last week. I have not
yet tried the HWord conversion but I will try to do so this week.

Thanks,
Andrew
(Continue reading)

Florian Krohm | 11 Sep 04:19 2012
Picon

Re: macports valgrind 3.8.0 built with XCode clang on OS X 10.8.1 fails to start

On 09/09/2012 09:19 PM, Andrew C. Morrow wrote:
> On Thu, Aug 30, 2012 at 11:19 AM, Florian Krohm <britzel@...> wrote:

>>
>> Everybody felt, that relying on implementation behaviour would be better
>> than relying on undefined behaviour be handled in a specific way.
>> But it was not verified yet, that that casting the pointers to HWords
>> and comparing them would actually work. Did you try it?
>>
>>   Florian
>>
> 
> Hi Florian -
> 
> Apologies for the delay: I was away on vacation last week. I have not
> yet tried the HWord conversion but I will try to do so this week.
> 

Hi Andrew, before you embark on something do check the source. I think
Julian already committed a change using volatile.

   Florian

------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
(Continue reading)


Gmane