Markus Trippelsdorf | 24 Oct 22:11 2011
Picon

general protection faults with "git grep" version 1.7.7.1

Suddenly I'm getting strange protection faults when I run "git grep" on
the gcc tree:

git[4245] general protection ip:7f291f01461f sp:7fff5618a8b0 error:0 in libc-2.14.90.so[7f291ef9a000+15d000]

 % gdb git
GNU gdb (Gentoo 7.3.1 p1) 7.3.1
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.gentoo.org/>...
Reading symbols from /usr/bin/git...done.
(gdb) run grep composite_pointer_type
Starting program: /usr/bin/git grep composite_pointer_type
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000
[Thread debugging using libthread_db enabled]
[New Thread 0x7ffff7859700 (LWP 18367)]
[New Thread 0x7ffff7058700 (LWP 18368)]
[New Thread 0x7ffff6857700 (LWP 18369)]
[New Thread 0x7ffff6056700 (LWP 18370)]
[New Thread 0x7ffff5855700 (LWP 18371)]
[New Thread 0x7ffff5054700 (LWP 18372)]
[New Thread 0x7ffff4853700 (LWP 18373)]
[New Thread 0x7ffff4052700 (LWP 18374)]

Program received signal SIGSEGV, Segmentation fault.
(Continue reading)

Richard W.M. Jones | 24 Oct 23:49 2011
Picon

Re: general protection faults with "git grep" version 1.7.7.1

On Mon, Oct 24, 2011 at 10:11:53PM +0200, Markus Trippelsdorf wrote:
> Suddenly I'm getting strange protection faults when I run "git grep" on
> the gcc tree:

Jim Meyering and I are trying to chase what looks like a similar or
identical bug in git-grep.  We've not got much further than gdb and
valgrind so far, but see:

https://bugzilla.redhat.com/show_bug.cgi?id=747377

It's slightly suspicious that this bug only started to happen with the
latest glibc, but that could be coincidence, or could be just that
glibc exposes a latent bug in git-grep.

Rich.

--

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://et.redhat.com/~rjones/virt-df/
Markus Trippelsdorf | 25 Oct 00:58 2011
Picon

Re: general protection faults with "git grep" version 1.7.7.1

On 2011.10.24 at 22:49 +0100, Richard W.M. Jones wrote:
> On Mon, Oct 24, 2011 at 10:11:53PM +0200, Markus Trippelsdorf wrote:
> > Suddenly I'm getting strange protection faults when I run "git grep" on
> > the gcc tree:
> 
> Jim Meyering and I are trying to chase what looks like a similar or
> identical bug in git-grep.  We've not got much further than gdb and
> valgrind so far, but see:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=747377
> 
> It's slightly suspicious that this bug only started to happen with the
> latest glibc, but that could be coincidence, or could be just that
> glibc exposes a latent bug in git-grep.

Thanks for the pointer.

Compiling git with -O1 "solves" the problem for me. 
This issue is independent of the exact git version being used (I tried
three different ones and always hit the problem).
It happens always on the _second_ run of "git grep" on my machine. The
first run always succeeds. So this might be a cache related issue.

--

-- 
Markus
Bernt Hansen | 25 Oct 02:00 2011
Picon

Re: general protection faults with "git grep" version 1.7.7.1

Markus Trippelsdorf <markus <at> trippelsdorf.de> writes:

> On 2011.10.24 at 22:49 +0100, Richard W.M. Jones wrote:
>> On Mon, Oct 24, 2011 at 10:11:53PM +0200, Markus Trippelsdorf wrote:
>> > Suddenly I'm getting strange protection faults when I run "git grep" on
>> > the gcc tree:
>> 
>> Jim Meyering and I are trying to chase what looks like a similar or
>> identical bug in git-grep.  We've not got much further than gdb and
>> valgrind so far, but see:
>> 
>> https://bugzilla.redhat.com/show_bug.cgi?id=747377
>> 
>> It's slightly suspicious that this bug only started to happen with the
>> latest glibc, but that could be coincidence, or could be just that
>> glibc exposes a latent bug in git-grep.
>
> Thanks for the pointer.
>
> Compiling git with -O1 "solves" the problem for me. 
> This issue is independent of the exact git version being used (I tried
> three different ones and always hit the problem).
> It happens always on the _second_ run of "git grep" on my machine. The
> first run always succeeds. So this might be a cache related issue.

Hi,

I updated from an old commit 2883969 (Sync with maint, 2011-10-15)
to origin/master 10b2a48 (Merge branch 'maint', 2011-10-23) today and
promptly got segfaults on git status in my org-mode repository.
(Continue reading)

Jeff King | 25 Oct 07:53 2011
Picon

Re: general protection faults with "git grep" version 1.7.7.1

On Mon, Oct 24, 2011 at 08:00:15PM -0400, Bernt Hansen wrote:

> I updated from an old commit 2883969 (Sync with maint, 2011-10-15)
> to origin/master 10b2a48 (Merge branch 'maint', 2011-10-23) today and
> promptly got segfaults on git status in my org-mode repository.
> 
> Going back to the old commit makes it work again.
> 
> Git bisect identifies the following commit as the problem:
> 
> [2548183badb98d62079beea62f9d2e1f47e99902] fix phantom untracked files when core.ignorecase is set

I think this is a separate problem. See this thread and patch:

  http://thread.gmane.org/gmane.comp.version-control.git/184094/focus=184148

-Peff
Bernt Hansen | 25 Oct 13:11 2011
Picon

Re: general protection faults with "git grep" version 1.7.7.1

Jeff King <peff <at> peff.net> writes:

> On Mon, Oct 24, 2011 at 08:00:15PM -0400, Bernt Hansen wrote:
>
>> I updated from an old commit 2883969 (Sync with maint, 2011-10-15)
>> to origin/master 10b2a48 (Merge branch 'maint', 2011-10-23) today and
>> promptly got segfaults on git status in my org-mode repository.
>> 
>> Going back to the old commit makes it work again.
>> 
>> Git bisect identifies the following commit as the problem:
>> 
>> [2548183badb98d62079beea62f9d2e1f47e99902] fix phantom untracked files when core.ignorecase is set
>
> I think this is a separate problem. See this thread and patch:
>
>   http://thread.gmane.org/gmane.comp.version-control.git/184094/focus=184148

Thanks,

I'll look at that thread.

-Bernt
Thomas Rast | 25 Oct 15:50 2011
Picon

Re: general protection faults with "git grep" version 1.7.7.1

[Shawn, Peff, Nicolas: maybe you can say something on the
(non)raciness of xmalloc() in parallel with read_sha1_file().  See the
last paragraph below.]

Richard W.M. Jones wrote:
> On Mon, Oct 24, 2011 at 10:11:53PM +0200, Markus Trippelsdorf wrote:
> > Suddenly I'm getting strange protection faults when I run "git grep" on
> > the gcc tree:
> 
> Jim Meyering and I are trying to chase what looks like a similar or
> identical bug in git-grep.  We've not got much further than gdb and
> valgrind so far, but see:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=747377
> 
> It's slightly suspicious that this bug only started to happen with the
> latest glibc, but that could be coincidence, or could be just that
> glibc exposes a latent bug in git-grep.

I'm tempted to write this off as a GCC bug.  If that's ok for you,
I'll leave further investigation and communication with the GCC folks
to you.

My findings are as follows:

It's easy to reproduce the behavior described in the above bug report,
using an F16 beta install in a VM.  I gave the VM two cores, but
didn't test what happens with only one.  By "easy" I mean I didn't
have to do any fiddling and it crashes at least one out of two times.

(Continue reading)

Jim Meyering | 25 Oct 17:17 2011
Picon

Re: general protection faults with "git grep" version 1.7.7.1

Thomas Rast wrote:
> [Shawn, Peff, Nicolas: maybe you can say something on the
> (non)raciness of xmalloc() in parallel with read_sha1_file().  See the
> last paragraph below.]
>
> Richard W.M. Jones wrote:
>> On Mon, Oct 24, 2011 at 10:11:53PM +0200, Markus Trippelsdorf wrote:
>> > Suddenly I'm getting strange protection faults when I run "git grep" on
>> > the gcc tree:
>>
>> Jim Meyering and I are trying to chase what looks like a similar or
>> identical bug in git-grep.  We've not got much further than gdb and
>> valgrind so far, but see:
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=747377
>>
>> It's slightly suspicious that this bug only started to happen with the
>> latest glibc, but that could be coincidence, or could be just that
>> glibc exposes a latent bug in git-grep.
>
> I'm tempted to write this off as a GCC bug.  If that's ok for you,
> I'll leave further investigation and communication with the GCC folks
> to you.
>
> My findings are as follows:
>
> It's easy to reproduce the behavior described in the above bug report,
> using an F16 beta install in a VM.  I gave the VM two cores, but
> didn't test what happens with only one.  By "easy" I mean I didn't
> have to do any fiddling and it crashes at least one out of two times.
(Continue reading)

Markus Trippelsdorf | 25 Oct 17:32 2011
Picon

Re: general protection faults with "git grep" version 1.7.7.1

On 2011.10.25 at 17:17 +0200, Jim Meyering wrote:
> Thomas Rast wrote:
> > [Shawn, Peff, Nicolas: maybe you can say something on the
> > (non)raciness of xmalloc() in parallel with read_sha1_file().  See the
> > last paragraph below.]
> >
> > Richard W.M. Jones wrote:
> >> On Mon, Oct 24, 2011 at 10:11:53PM +0200, Markus Trippelsdorf wrote:
> >> > Suddenly I'm getting strange protection faults when I run "git grep" on
> >> > the gcc tree:
> >>
> >> Jim Meyering and I are trying to chase what looks like a similar or
> >> identical bug in git-grep.  We've not got much further than gdb and
> >> valgrind so far, but see:
> >>
> >> https://bugzilla.redhat.com/show_bug.cgi?id=747377
> >>
> >> It's slightly suspicious that this bug only started to happen with the
> >> latest glibc, but that could be coincidence, or could be just that
> >> glibc exposes a latent bug in git-grep.
> >
> > I'm tempted to write this off as a GCC bug.  If that's ok for you,
> > I'll leave further investigation and communication with the GCC folks
> > to you.
> >
> > My findings are as follows:
> >
> > It's easy to reproduce the behavior described in the above bug report,
> > using an F16 beta install in a VM.  I gave the VM two cores, but
> > didn't test what happens with only one.  By "easy" I mean I didn't
(Continue reading)

Thomas Rast | 25 Oct 18:00 2011
Picon

Re: general protection faults with "git grep" version 1.7.7.1

Jim Meyering wrote:
> Thomas Rast wrote:
> > [GCC moves access to a file-static variable across pthread_mutex_lock()]
> 
> Thanks for the investigation.
> Actually, isn't gcc -O2's code-motion justified?
> While we *know* that those globals may be modified asynchronously,
> builtin/grep.c forgot to tell gcc about that.

I'm somewhat unwilling to believe that:

* "volatile" enforces three unrelated things, see e.g. [1].

* Removing "static" would do the same as it prevents the compiler from
  proving at compile-time that pthread_mutex_lock() cannot affect the
  variable in question.

  If this is correct, it also means that all code in all pthreads
  tutorials I can find works merely by the accident of not declaring
  their variables "static".

  Furthermore, a future smarter compiler with better link-time
  optimization might again prove the same and eliminate the
  "superfluous" load.

However, as a result of the discussion I now have a shorter testcase:

  #include <pthread.h>

  int y;
(Continue reading)

Thomas Rast | 25 Oct 18:07 2011
Picon

Re: general protection faults with "git grep" version 1.7.7.1

Thomas Rast wrote:
> Jim Meyering wrote:
> > Thomas Rast wrote:
> > > [GCC moves access to a file-static variable across pthread_mutex_lock()]
> > 
> > Thanks for the investigation.
> > Actually, isn't gcc -O2's code-motion justified?
> > While we *know* that those globals may be modified asynchronously,
> > builtin/grep.c forgot to tell gcc about that.
> 
> I'm somewhat unwilling to believe that:
> 
> * "volatile" enforces three unrelated things, see e.g. [1].

Argh, forgot my reference:
[1] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2016.html
    section "Existing portable uses"

--

-- 
Thomas Rast
trast <at> {inf,student}.ethz.ch
Jim Meyering | 25 Oct 18:37 2011
Picon

Re: general protection faults with "git grep" version 1.7.7.1

Thomas Rast wrote:
> Jim Meyering wrote:
>> Thomas Rast wrote:
>> > [GCC moves access to a file-static variable across pthread_mutex_lock()]
>>
>> Thanks for the investigation.
>> Actually, isn't gcc -O2's code-motion justified?
>> While we *know* that those globals may be modified asynchronously,
>> builtin/grep.c forgot to tell gcc about that.
>
> I'm somewhat unwilling to believe that:

You're right to be skeptical.
I should have stuck with "using volatile works around the problem for me".
The real problem seems to be in glibc, with its addition of
the "leaf" attribute to those synchronization primitives:

  http://bugzilla.redhat.com/747377#c22
Thomas Rast | 25 Oct 18:54 2011
Picon

Re: general protection faults with "git grep" version 1.7.7.1

Jim Meyering wrote:
> Thomas Rast wrote:
> > Jim Meyering wrote:
> >> Thomas Rast wrote:
> >> > [GCC moves access to a file-static variable across pthread_mutex_lock()]
> >>
> >> Thanks for the investigation.
> >> Actually, isn't gcc -O2's code-motion justified?
> >> While we *know* that those globals may be modified asynchronously,
> >> builtin/grep.c forgot to tell gcc about that.
> >
> > I'm somewhat unwilling to believe that:
> 
> You're right to be skeptical.
> I should have stuck with "using volatile works around the problem for me".
> The real problem seems to be in glibc, with its addition of
> the "leaf" attribute to those synchronization primitives:
> 
>   http://bugzilla.redhat.com/747377#c22

Aha.  Glad you found it :-)

Meanwhile I read

  http://www.hpl.hp.com/techreports/2004/HPL-2004-209.html

which discusses a similar issue in section 4.3, but is very
interesting on its own.  It's funny how it says

  We know of at least three optimizing compilers (two of them
(Continue reading)

Jim Meyering | 25 Oct 22:24 2011
Picon

Re: general protection faults with "git grep" version 1.7.7.1

Thomas Rast wrote:
...
>> The real problem seems to be in glibc, with its addition of
>> the "leaf" attribute to those synchronization primitives:
>>
>>   http://bugzilla.redhat.com/747377#c22
>
> Aha.  Glad you found it :-)
>
> Meanwhile I read
>
>   http://www.hpl.hp.com/techreports/2004/HPL-2004-209.html
>
> which discusses a similar issue in section 4.3, but is very
> interesting on its own.  It's funny how it says
>
>   We know of at least three optimizing compilers (two of them
>   production compilers) that performed this transformation at some
>   point during their lifetime; usually at least partially reversing
>   the decision when the implications on multi-threaded code became
>   known.
>
> I guess that would be four now if it was literally the same problem.

Yep.  For those not following the BZ comments at the about URL,
POSIX is quite clear.  Quoting from
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_11:

    The following functions synchronize memory with respect to other threads:

(Continue reading)

Jeff King | 25 Oct 17:37 2011
Picon

Re: general protection faults with "git grep" version 1.7.7.1

On Tue, Oct 25, 2011 at 03:50:21PM +0200, Thomas Rast wrote:

> That being said, I'm not entirely convinced that the code in
> builtin/grep.c works in the face of memory pressure.  It guards
> against concurrent access to read_sha1_file() with the
> read_sha1_mutex, but any call to xmalloc() outside of that mutex can
> still potentially invoke the try_to_free_routine.  Maybe one of the
> pack experts can say whether this is safe.  (However, I implemented
> locking around try_to_free_routine as a quick hack and it did not fix
> the issue discussed in the bug report.)

Yes, I think it needs to set try_to_free_routine. See this thread:

  http://thread.gmane.org/gmane.comp.version-control.git/180446

which discusses a possible subtlety with doing so.

-Peff

Gmane