PRINGLE Chris | 29 Jun 2012 10:08

Lookaside cache with high volume of objects hangs the kernel on destruction (deadlock)

Dear All,

I posted about this a little while ago but ran out of time to look at the problem; unfortunately our problem
has now resurfaced and I'm looking for some advice as to what might be going wrong.

We have a device driver that utilises lookaside caches; some of those caches are subject to a fairly high
volume of objects. During normal operation there isn't a problem; however, on destruction (when we unload
the driver) the kernel hangs. I've tried this on v3.0.3 on 32-bit i686 with rt12 (9 months ago), and I've also
tried it on PowerPC kernel v3.0.34 with rt55 (this week); both exhibit the same behaviour. Without the RT
patches, everything appears to be okay; the code also appears to work on older 2.6.33 kernels (with the RT patch).

As far as I can tell, it looks like kmem_cache_free is trying to acquire a lock that it already has and the
system subsequently deadlocks; I'm no expert on the PI code and I'm not entirely sure what is wrong, but it
looks like it's the RT patches that have created the problem.

The following simplified driver reproduces the issue:

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

#include <linux/init.h>
#include <linux/module.h>
#include <linux/poll.h>
#include <linux/version.h>
#include <linux/signal.h>
#include <linux/sched.h>
#include <linux/delay.h>
#include <linux/slab.h>

MODULE_DESCRIPTION("test - driver module for Development Testing");
MODULE_AUTHOR("Chris Pringle");

PRINGLE Chris | 29 Jun 2012 12:40

RE: Lookaside cache with high volume of objects hangs the kernel on destruction (deadlock)

FYI: This also happens with 3.4.4 with rt13 on i686 (cannot verify on PowerPC at the moment but it's
consistent across architectures so I've no reason to believe it won't be the same).

The problem shown below occurs with Full preemption; with basic RT preemption the system just hangs (no
output whatsoever), and with low-latency desktop preemption the issue appears to go away.

Disabling SMP doesn't fix the issue but the OOPS stack trace changes a bit, although it's the same up to rt_spin_lock.

In all cases, the kernel consistently prints out an OOPS at "BUG_ON(rt_mutex_owner(lock) == self)" in
rt_spin_lock_slowlock and then hangs.

Any suggestions anyone could offer would be great.

Cheers,
C


PRINGLE Chris | 4 Jul 2012 14:55

RE: Lookaside cache with high volume of objects hangs the kernel on destruction (deadlock)

Is anyone able to give the test driver demonstrating the problem a try? If more information is needed please
let me know and I'll try and supply it. All you need is a Linux kernel with full preemption enabled and you'll
see the issue with my test driver.

Thanks,
Chris


Luis Claudio R. Goncalves | 4 Jul 2012 15:15

Re: Lookaside cache with high volume of objects hangs the kernel on destruction (deadlock)


On Wed, Jul 04, 2012 at 12:55:17PM +0000, PRINGLE Chris wrote:
| Is anyone able to give the test driver demonstrating the problem a try? If more information is needed
| please let me know and I'll try and supply it. All you need is a Linux kernel with full preemption enabled and
| you'll see the issue with my test driver.
| 
| Thanks,
| Chris

Yes, I was able to reproduce the issue on 3.4.4-rt13. I ran a few tests
under ftrace and got a bit more information.

The problem seems to happen in kmem_cache_free(), at this piece of code:

        local_lock_irqsave(slab_lock, flags);
        __cache_free(cachep, objp, __builtin_return_address(0));
        unlock_slab_and_free_delayed(flags);

It seems to fail at the local_lock_irqsave() statement with:

	[  149.328013] kernel BUG at kernel/rtmutex.c:725!
	[  149.328013] invalid opcode: 0000 [#1] PREEMPT SMP

And by the end I see:

	[  149.420228] note: insmod[2801] exited with preempt_count 1

Here is the ftrace log, followed by the backtrace:

[  149.262424] Creating

PRINGLE Chris | 6 Jul 2012 11:34

RE: Lookaside cache with high volume of objects hangs the kernel on destruction (deadlock)

> Yes, I was able to reproduce the issue on 3.4.4-rt13. I ran a few tests under
> ftrace and got a bit more information.
> 
> The problem seems to happen in kmem_cache_free(), at this piece of code:
> 
>         local_lock_irqsave(slab_lock, flags);
>         __cache_free(cachep, objp, __builtin_return_address(0));
>         unlock_slab_and_free_delayed(flags);
> 
> It seems to fail at the local_lock_irqsave() statement with:
> 
> 	[  149.328013] kernel BUG at kernel/rtmutex.c:725!
> 	[  149.328013] invalid opcode: 0000 [#1] PREEMPT SMP
> 
> And by the end I see:
> 
> 	[  149.420228] note: insmod[2801] exited with preempt_count 1 

Thanks for this. Any ideas where we might go from here? This used to work in 2.6.33 but got broken between
2.6.33 and 3.0; unfortunately there don't appear to be any RT patch releases between those two versions so
it's hard to see what might have been broken.

It worries me a bit because this could potentially go wrong during runtime if the kernel ever reaps the
lookaside cache for memory; this will then crash the entire system.

Cheers,
Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org

Thomas Gleixner | 9 Jul 2012 12:11

RE: Lookaside cache with high volume of objects hangs the kernel on destruction (deadlock)

On Fri, 6 Jul 2012, PRINGLE Chris wrote:

> > Yes, I was able to reproduce the issue on 3.4.4-rt13. I ran a few tests under
> > ftrace and got a bit more information.
> > 
> > The problem seems to happen in kmem_cache_free(), at this piece of code:
> > 
> >         local_lock_irqsave(slab_lock, flags);
> >         __cache_free(cachep, objp, __builtin_return_address(0));
> >         unlock_slab_and_free_delayed(flags);
> > 
> > It seems to fail at the local_lock_irqsave() statement with:
> > 
> > 	[  149.328013] kernel BUG at kernel/rtmutex.c:725!
> > 	[  149.328013] invalid opcode: 0000 [#1] PREEMPT SMP
> > 
> > And by the end I see:
> > 
> > 	[  149.420228] note: insmod[2801] exited with preempt_count 1 
> 
> Thanks for this. Any ideas where we might go from here? This used to
> work in 2.6.33 but got broken between 2.6.33 and 3.0; unfortunately
> there don't appear to be any RT patch releases between those two
> versions so it's hard to see what might have been broken.
>
> It worries me a bit because this could potentially go wrong during
> runtime if the kernel ever reaps the lookaside cache for memory;
> this will then crash the entire system.

Found the root cause, working on a fix.

Thomas Gleixner | 9 Jul 2012 15:40

RE: Lookaside cache with high volume of objects hangs the kernel on destruction (deadlock)

Chris,

On Mon, 9 Jul 2012, Thomas Gleixner wrote:
> On Fri, 6 Jul 2012, PRINGLE Chris wrote:
> > It worries me a bit because this could potentially go wrong during
> > runtime if the kernel ever reaps the lookaside cache for memory;
> > this will then crash the entire system.
> 
> Found the root cause, working on a fix.

Does the patch below work for you?

Thanks,

	tglx

Index: linux-stable-rt/mm/slab.c
===================================================================
--- linux-stable-rt.orig/mm/slab.c
+++ linux-stable-rt/mm/slab.c
@@ -743,8 +743,26 @@ slab_on_each_cpu(void (*func)(void *arg,
 {
 	unsigned int i;

+	get_cpu_light();
 	for_each_online_cpu(i)
 		func(arg, i);
+	put_cpu_light();
+}
+

PRINGLE Chris | 9 Jul 2012 18:40

RE: Lookaside cache with high volume of objects hangs the kernel on destruction (deadlock)

> 
> Chris,
> 
> On Mon, 9 Jul 2012, Thomas Gleixner wrote:
> > On Fri, 6 Jul 2012, PRINGLE Chris wrote:
> > > It worries me a bit because this could potentially go wrong during
> > > runtime if the kernel ever reaps the lookaside cache for memory;
> > > this will then crash the entire system.
> >
> > Found the root cause, working on a fix.
> 
> Does the patch below work for you?
> 
> Thanks,
> 
> 	tglx
>

The supplied patch seems to have fixed the issue. Thanks a lot for looking at this; I'll do some more testing with
the patch applied to make sure everything works as expected.

Cheers,
Chris
