Dave Jones | 10 Aug 2012 05:17

3.5.1 ext4_ sleeping while atomic bug.

BUG: sleeping function called from invalid context at include/linux/buffer_head.h:333
in_atomic(): 1, irqs_disabled(): 0, pid: 9894, name: fstest
3 locks held by fstest/9894:
 #0:  (&type->i_mutex_dir_key#4/1){+.+.+.}, at: [<ffffffff811d5dae>] kern_path_create+0x7e/0x140
 #1:  (&ei->i_data_sem){++++..}, at: [<ffffffff81252e76>] ext4_map_blocks+0xb6/0x250
 #2:  (&(&bgl->locks[i].lock)->rlock){+.+...}, at: [<ffffffff8124a5e7>] ext4_validate_block_bitmap+0x77/0x230
Pid: 9894, comm: fstest Not tainted 3.5.1-1.fc17.x86_64.debug #1
Call Trace:
 [<ffffffff8109cd0a>] __might_sleep+0x18a/0x240
 [<ffffffff811fb430>] __sync_dirty_buffer+0x30/0xf0
 [<ffffffff811fb503>] sync_dirty_buffer+0x13/0x20
 [<ffffffff81273018>] ext4_commit_super+0x1e8/0x260
 [<ffffffff81273283>] save_error_info+0x23/0x30
 [<ffffffff81274539>] __ext4_error+0x89/0xa0
 [<ffffffff8124a5e7>] ? ext4_validate_block_bitmap+0x77/0x230
 [<ffffffff8124a72b>] ext4_validate_block_bitmap+0x1bb/0x230
 [<ffffffff8124b0ae>] ext4_read_block_bitmap_nowait+0x8e/0x3b0
 [<ffffffff812891c0>] ext4_mb_init_cache+0x160/0x990
 [<ffffffff810d16bd>] ? trace_hardirqs_on_caller+0x10d/0x1a0
 [<ffffffff81289b16>] ext4_mb_init_group+0x126/0x250
 [<ffffffff81289d56>] ext4_mb_good_group+0x116/0x130
 [<ffffffff8128c493>] ext4_mb_regular_allocator+0x1a3/0x420
 [<ffffffff811aa920>] ? kmem_cache_alloc+0xe0/0x290
 [<ffffffff8128e2c1>] ext4_mb_new_blocks+0x4f1/0xb90
 [<ffffffff811fad9f>] ? __find_get_block+0xaf/0x220
 [<ffffffff81293e7e>] ext4_alloc_branch+0x42e/0x690
 [<ffffffff816c6030>] ? _raw_spin_unlock_irq+0x30/0x50
 [<ffffffff812949a7>] ext4_ind_map_blocks+0x1e7/0x990
 [<ffffffff816c348a>] ? down_write+0x9a/0xb0
 [<ffffffff81252e76>] ? ext4_map_blocks+0xb6/0x250

Theodore Ts'o | 10 Aug 2012 20:24

Re: 3.5.1 ext4_ sleeping while atomic bug.

Hi Dave,

Thanks for the bug report!  The following should address the bug which
you found.

						- Ted

From 05ca87aa00121756b5d41f3d71eb8b51bed3bc92 Mon Sep 17 00:00:00 2001
From: Theodore Ts'o <tytso@mit.edu>
Date: Fri, 10 Aug 2012 13:57:52 -0400
Subject: [PATCH] ext4: don't call ext4_error while block group is locked

While in ext4_validate_block_bitmap(), if a block allocation bitmap
is found to be invalid, we call ext4_error() while the block group is
still locked.  This causes ext4_commit_super() to call a function
which might sleep while in an atomic context.

There's no need to keep the block group locked at this point, so hoist
the ext4_error() call up to ext4_validate_block_bitmap() and release
the block group spinlock before calling ext4_error().

The reported stack trace can be found at:

	http://article.gmane.org/gmane.comp.file-systems.ext4/33731

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
---
 fs/ext4/balloc.c | 62 +++++++++++++++++++++++++++++++++-----------------------
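
[Editorial illustration, not part of the patch.] The fix described in the
commit message follows a common pattern: do the check under the lock, drop
the lock, and only then call the error path that may sleep. Below is a
minimal userspace sketch of that pattern, assuming a pthread mutex as a
stand-in for the block group spinlock. Every name in it (group_lock,
report_error, validate_bitmap, and so on) is hypothetical and does not come
from fs/ext4/balloc.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-group lock; not the ext4 data structure. */
static pthread_mutex_t group_lock = PTHREAD_MUTEX_INITIALIZER;
static bool group_lock_held;   /* models "we are in atomic context" */
static int reports_total;      /* number of errors reported so far */

/*
 * Stand-in for an error path that may sleep. The assert loosely models the
 * kernel's might-sleep check: calling this with the lock held is a bug.
 */
static void report_error(const char *msg)
{
    assert(!group_lock_held);
    reports_total++;
    fprintf(stderr, "error: %s\n", msg);
}

/* Pure check with no side effects, so it is safe to run under the lock. */
static bool bitmap_bit_set(const unsigned char *bitmap, unsigned int bit)
{
    return (bitmap[bit / 8] >> (bit % 8)) & 1;
}

/*
 * Validate under the lock, remember the verdict in a local variable,
 * release the lock, and report afterwards.
 * Returns 0 if the bit is set, -1 otherwise.
 */
static int validate_bitmap(const unsigned char *bitmap, unsigned int bit)
{
    bool ok;

    pthread_mutex_lock(&group_lock);
    group_lock_held = true;
    ok = bitmap_bit_set(bitmap, bit);
    group_lock_held = false;
    pthread_mutex_unlock(&group_lock);

    if (!ok) {
        report_error("required block is not marked in use");
        return -1;
    }
    return 0;
}
```

Calling validate_bitmap() on a bitmap whose required bit is clear returns -1
and reports exactly once, with the lock already released. Moving the
report_error() call inside the locked region would trip the assert, which is
the userspace analogue of the "sleeping function called from invalid
context" warning in the report above.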