Travis Rhoden | 19 Jun 2012 20:32

kernel crash from RBD in Ubuntu 12.04

Hey folks,

Ran into this today.  Not sure what I did wrong.  =)

I had an RBD successfully mounted and was done with it.  Proceeded to
do the following:

root@spcnode2:~# ls /sys/bus/rbd/devices/
0
root@spcnode2:~# echo 0 > /sys/bus/rbd/remove
root@spcnode2:~# ls /sys/bus/rbd/devices/
      <--- At this point, I believe the RBD has been successfully removed
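
The check above (an empty listing of /sys/bus/rbd/devices/ after writing
the device ID to /sys/bus/rbd/remove) can be scripted.  A minimal sketch,
assuming the sysfs layout shown in the transcript; the function name
rbd_devices_present and its optional directory argument are hypothetical
and exist only for illustration:

```shell
#!/bin/sh
# Hypothetical helper: exits 0 if the given directory contains at least
# one entry, i.e. at least one RBD device still has a sysfs entry.
# The default path is the sysfs directory used in the transcript above.
rbd_devices_present() {
    dir="${1:-/sys/bus/rbd/devices}"
    # A missing directory (rbd module not loaded) counts as "none present".
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}
```

Calling rbd_devices_present with no argument inspects the real sysfs
directory.  An empty directory only shows that the sysfs entry is gone;
it does not prove that the kernel has released everything associated
with the device.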

----  About an hour passes where I am messing with my ceph cluster.
No other commands are run on this machine ----
----  New cluster is up.  Time to mount my new RBD

root@spcnode2:~# echo "10.55.30.0,10.55.30.1,10.55.30.2
name=admin,secret=AQCNv+BPoPQENBAAxlm39kJ5XteNxg2S/dulXw== rbd
perftest" | tee /sys/bus/rbd/add
10.55.30.0,10.55.30.1,10.55.30.2
name=admin,secret=AQCNv+BPoPQENBAAxlm39kJ5XteNxg2S/dulXw== rbd
perftest
Segmentation fault

Well that's ugly.  What's in syslog?

Jun 19 11:16:56 spcnode2 kernel: [76564.387890] ------------[ cut here ]------------
Jun 19 11:16:56 spcnode2 kernel: [76564.392569] WARNING: at [...]

Alex Elder | 19 Jun 2012 20:45

Re: kernel crash from RBD in Ubuntu 12.04

On 06/19/2012 01:32 PM, Travis Rhoden wrote:
> Hey folks,
> 
> Ran into this today.  Not sure what I did wrong.  =)

It appears you are running Linux 3.2.0.  The symptoms could be
explained by a bug that has been fixed in newer Ceph code.
Specifically, I think the following fix is the one whose absence
would produce something like this:

    rbd: don't drop the rbd_id too early

https://github.com/ceph/ceph-client/commit/32eec68d2f233e8a6ae1cd326022f6862e2b9ce3

					-Alex


Travis Rhoden | 19 Jun 2012 20:50

Re: kernel crash from RBD in Ubuntu 12.04

Awesome.  Thanks Alex.  I'll eagerly await 0.48 once it has finished QA.

 - Travis


Dan Mick | 20 Jun 2012 01:33

Re: kernel crash from RBD in Ubuntu 12.04

Actually, it appears this fix is in the kernel (repo 'ceph-client'), so I
don't think 0.48 will contain it (I could be wrong).  You may need to
grab that repo and build the kernel (or wait until that sha1 gets into
your distro's kernel release).
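
One way to check whether a given kernel source tree already contains
that sha1 is git's merge-base --is-ancestor.  A minimal sketch, assuming
a local git clone of the tree; the function name commit_in_tree and the
clone path in the usage note are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: exits 0 if commit $2 is reachable from ref $3
# (default: HEAD) in the git repository at $1, non-zero otherwise
# (including when the commit is unknown to that repository).
commit_in_tree() {
    repo="$1"
    sha="$2"
    ref="${3:-HEAD}"
    git -C "$repo" merge-base --is-ancestor "$sha" "$ref" 2>/dev/null
}
```

For example, assuming a clone at ~/src/ceph-client:

    commit_in_tree ~/src/ceph-client 32eec68d2f233e8a6ae1cd326022f6862e2b9ce3 && echo "fix present"

A negative result is not conclusive for distro kernels: a backported
(cherry-picked) fix gets a different sha1, so searching the log for the
subject line "rbd: don't drop the rbd_id too early" may also be needed.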


