KOSAKI Motohiro | 1 Aug 2010 10:19
Favicon

Re: [PATCH 6/6] vmscan: Kick flusher threads to clean pages when reclaim is encountering dirty pages

Hi Trond,

> There is that, and then there are issues with the VM simply lying to the
> filesystems.
> 
> See https://bugzilla.kernel.org/show_bug.cgi?id=16056
> 
> Which basically boils down to the following: kswapd tells the filesystem
> that it is quite safe to do GFP_KERNEL allocations in pageouts and as
> part of try_to_release_page().
> 
> In the case of pageouts, it does set the 'WB_SYNC_NONE', 'nonblocking'
> and 'for_reclaim' flags in the writeback_control struct, and so the
> filesystem has at least some hint that it should do non-blocking i/o.
> 
> However if you trust the GFP_KERNEL flag in try_to_release_page() then
> the kernel can and will deadlock, and so I had to add in a hack
> specifically to tell the NFS client not to trust that flag if it comes
> from kswapd.

Can you please elaborate your issue more? vmscan logic is, briefly, below

	if (PageDirty(page))
		pageout(page)
	if (page_has_private(page)) {
		try_to_release_page(page, sc->gfp_mask))

So, I'm interest why nfs need to writeback at ->release_page again even
though pageout() call ->writepage and it was successfull.

(Continue reading)

Trond Myklebust | 1 Aug 2010 18:21
Picon
Picon

Re: [PATCH 6/6] vmscan: Kick flusher threads to clean pages when reclaim is encountering dirty pages

On Sun, 2010-08-01 at 17:19 +0900, KOSAKI Motohiro wrote:
> Hi Trond,
> 
> > There is that, and then there are issues with the VM simply lying to the
> > filesystems.
> > 
> > See https://bugzilla.kernel.org/show_bug.cgi?id=16056
> > 
> > Which basically boils down to the following: kswapd tells the filesystem
> > that it is quite safe to do GFP_KERNEL allocations in pageouts and as
> > part of try_to_release_page().
> > 
> > In the case of pageouts, it does set the 'WB_SYNC_NONE', 'nonblocking'
> > and 'for_reclaim' flags in the writeback_control struct, and so the
> > filesystem has at least some hint that it should do non-blocking i/o.
> > 
> > However if you trust the GFP_KERNEL flag in try_to_release_page() then
> > the kernel can and will deadlock, and so I had to add in a hack
> > specifically to tell the NFS client not to trust that flag if it comes
> > from kswapd.
> 
> Can you please elaborate your issue more? vmscan logic is, briefly, below
> 
> 	if (PageDirty(page))
> 		pageout(page)
> 	if (page_has_private(page)) {
> 		try_to_release_page(page, sc->gfp_mask))
> 
> So, I'm interest why nfs need to writeback at ->release_page again even
> though pageout() call ->writepage and it was successfull.
(Continue reading)

KOSAKI Motohiro | 2 Aug 2010 09:57
Favicon

Re: [PATCH 6/6] vmscan: Kick flusher threads to clean pages when reclaim is encountering dirty pages

Hi

> The problem that I am seeing is that the try_to_release_page() needs to
> be told to act as a non-blocking call when the process is kswapd, just
> like the pageout() call.
> 
> Currently, the sc->gfp_mask is set to GFP_KERNEL, which normally means
> that the call may wait on I/O to complete. However, what I'm seeing in
> the bugzilla above is that if kswapd waits on an RPC call, then the
> whole VM may gum up: typically, the traces show that the socket layer
> cannot allocate memory to hold the RPC reply from the server, and so it
> is kicking kswapd to have it reclaim some pages, however kswapd is stuck
> in try_to_release_page() waiting for that same I/O to complete, hence
> the deadlock...

Ah, I see. so as far as I understand, you mean
 - Socket layer use GFP_ATOMIC, then they don't call try_to_free_pages().
   IOW, kswapd is only memory reclaiming thread.
 - Kswapd got stuck in ->release_page().
 - In usual use case, another thread call kmalloc(GFP_KERNEL) and makes
   foreground reclaim, then, restore kswapd stucking. but your case
   there is no such thread.

Hm, interesting.

In short term, current nfs fix (checking PF_MEMALLOC in nfs_wb_page())
seems best way. it's no side effect if my understanding is correct.

> IOW: I think kswapd at least should be calling try_to_release_page()
> with a gfp-flag of '0' to avoid deadlocking on I/O.
(Continue reading)


Gmane