Nigel Cunningham | 1 Feb 2006 13:24

Re: [RFC][PATCH -mm][Experimental] swsusp: freeze userspace processes first

Hi.

On Wednesday 01 February 2006 21:47, Pavel Machek wrote:
> Hi!
>
> > This is an experimental patch aimed at the "unable to freeze processes
> > under load" problem.
> >
> > On my box the 2.6.16-rc1-mm4 kernel with this patch applied survives the
> > "dd if=/dev/hda of=/dev/null" test.
> >
> > Please have a look.
>
> It makes it better (well, I used my own, simpler variant, but that
> should not matter; patch is attached). I now can't reproduce hangs
> with simple stress testing, but running kernel make alongside that
> makes it hang sometimes. Example of non-frozen gcc:
>
> gcc           D EEE06A70     0  1750   1749  1751
> (NOTLB)
> df85df38 00000046 bf878130 eee06a70 00004111 eee06a70 eee06a70
> 003d0900
>        00000000 c0137cf5 df85c000 00000000 c058ada2 c012503e ef2c915c
> ef2c9030
>        c1c0b480 7c3b8500 003d0927 df85c000 00000a98 7c3b8500 003d0927
> c0770800
> Call Trace:
>  [<c0137cf5>] attach_pid+0x25/0xb0
>  [<c058ada2>] _write_unlock_irq+0x12/0x30
>  [<c012503e>] copy_process+0xe5e/0x11b0

Pavel Machek | 1 Feb 2006 13:49

Re: [RFC][PATCH -mm][Experimental] swsusp: freeze userspace processes first

Hi!

> > > This is an experimental patch aimed at the "unable to freeze processes
> > > under load" problem.
> > >
> > > On my box the 2.6.16-rc1-mm4 kernel with this patch applied survives the
> > > "dd if=/dev/hda of=/dev/null" test.
> > >
> > > Please have a look.
> >
> > It makes it better (well, I used my own, simpler variant, but that
> > should not matter; patch is attached). I now can't reproduce hangs
> > with simple stress testing, but running kernel make alongside that
> > makes it hang sometimes. Example of non-frozen gcc:
> >
> > gcc           D EEE06A70     0  1750   1749  1751
> > (NOTLB)
> > df85df38 00000046 bf878130 eee06a70 00004111 eee06a70 eee06a70
> > 003d0900
> >        00000000 c0137cf5 df85c000 00000000 c058ada2 c012503e ef2c915c
> > ef2c9030
> >        c1c0b480 7c3b8500 003d0927 df85c000 00000a98 7c3b8500 003d0927
> > c0770800
> > Call Trace:
> >  [<c0137cf5>] attach_pid+0x25/0xb0
> >  [<c058ada2>] _write_unlock_irq+0x12/0x30
> >  [<c012503e>] copy_process+0xe5e/0x11b0
> >  [<c0588f74>] wait_for_completion+0x94/0xd0
> >  [<c0121690>] default_wake_function+0x0/0x10
> >  [<c01254d9>] do_fork+0x149/0x210

Rafael J. Wysocki | 2 Feb 2006 00:57

Re: [RFC][PATCH -mm][Experimental] swsusp: freeze userspace processes first

Hi,

On Wednesday 01 February 2006 13:49, Pavel Machek wrote:
> > > > This is an experimental patch aimed at the "unable to freeze processes
> > > > under load" problem.
> > > >
}-- snip --{
> > >  /*
> > >   * Timeout for stopping processes
> > >   */
> > > -#define TIMEOUT	(6 * HZ)
> > > +#define TIMEOUT	(60 * HZ)
> > 
> > You're kidding, right?
> 
> sync takes long time... and 6 seconds were not enough to deliver
> signals on highly-loaded ext2.

The appended modified version of the original patch solves the timeout vs sync
problem. ;-)

Seriously speaking, I have incorporated your changes (extended to the userland
interface) and made some fixes (the usermodehelper-related part is still
there, for completeness).

Greetings,
Rafael

Signed-off-by: Rafael J. Wysocki <rjw <at> sisk.pl>


Rafael J. Wysocki | 2 Feb 2006 14:55

Re: Re: [RFC][PATCH -mm][Experimental] swsusp: freeze userspace processes first

Hi,

On Thursday 02 February 2006 00:57, Rafael J. Wysocki wrote:
> On Wednesday 01 February 2006 13:49, Pavel Machek wrote:
> > > > > This is an experimental patch aimed at the "unable to freeze processes
> > > > > under load" problem.
> > > > >
> }-- snip --{
> > > >  /*
> > > >   * Timeout for stopping processes
> > > >   */
> > > > -#define TIMEOUT	(6 * HZ)
> > > > +#define TIMEOUT	(60 * HZ)
> > > 
> > > You're kidding, right?
> > 
> > sync takes long time... and 6 seconds were not enough to deliver
> > signals on highly-loaded ext2.
> 
> The appended modified version of the original patch solves the timeout vs sync
> problem. ;-)
> 
> Seriously speaking, I have incorporated your changes (extended to the userland
> interface) and made some fixes (the usermodehelper-related part is still
> there, for completeness).
}-- snip --{  
> +	mutex_lock(&freezer_lock);
> +	freezing_processes = 1;
> +	mutex_unlock(&freezer_lock);
> +	while (atomic_read(&usermodehelper_waiting))

Pavel Machek | 2 Feb 2006 16:08

Re: Re: [RFC][PATCH -mm][Experimental] swsusp: freeze userspace processes first

Hi!

> That requires a timeout in case we have a user mode helper in the D state.
> The corrected patch is appended.
> 
> BTW, it contains a change that may help solve the unfreezeable gcc problem
> that has appeared in your tests.  Could you please try it or tell me what I
> should do to reproduce the problem?

I'm away from a real machine just now... I could reproduce it with
Nigel's "stress ..." command, then trying to build kernel.

								Pavel

-- 
Thanks, Sharp!

Rafael J. Wysocki | 2 Feb 2006 19:32

Re: Re: [RFC][PATCH -mm][Experimental] swsusp: freeze userspace processes first

Hi,

On Thursday 02 February 2006 16:08, Pavel Machek wrote:
> > That requires a timeout in case we have a user mode helper in the D state.
> > The corrected patch is appended.
> > 
> > BTW, it contains a change that may help solve the unfreezeable gcc problem
> > that has appeared in your tests.  Could you please try it or tell me what I
> > should do to reproduce the problem?
> 
> I'm away from a real machine just now... I could reproduce it with
> Nigel's "stress ..." command, then trying to build kernel.

OK, I did the following:
1) run "swapoff -a"
2) run kernel make on one vt,
3) run "stress -d 5 --hdd-bytes 100M -i 5 -c 5" on another vt,
4) run "for f in 1 2 3 4 5 6 7 8 9 10; do echo disk > /sys/power/state ; sleep 5; done" on the 3rd vt.

Appended is the version of the patch that has frozen processes in 10 attempts
out of 10 (please note the "if (!freezing(p))" in freeze_process() ;-)).

Still, freezing the userspace processes may take more than 15 secs under such
a load on my box, so the timeout is set to 20 sec (probably overkill for any
sane real-life situation).
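
For readers without the appended patch at hand, the idea discussed above can
be modelled in plain userspace C. This is only a rough sketch under stated
assumptions, not the patch itself: struct task, task_step(), freeze_kind()
and the tick-based timeout are hypothetical names invented for illustration,
and the reading of "if (!freezing(p))" as "do not signal a task that has
already been asked to freeze" is an assumption. Only the ordering (userspace
first, kernel threads second) and the use of a timeout come from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these names are kernel API. */
enum task_kind { TASK_USER, TASK_KERNEL };

struct task {
	enum task_kind kind;
	bool freezing;      /* a freeze request has already been delivered */
	bool frozen;        /* the task has parked itself */
	int  ticks_to_stop; /* polls the task needs before it can park */
	int  requests;      /* how many freeze requests it has received */
};

/* Ask one task to freeze; the !freezing check avoids signalling it twice. */
static void freeze_process(struct task *p)
{
	if (!p->freezing) {
		p->freezing = true;
		p->requests++;
	}
}

/* Advance one task by one poll; it parks once its pending work is done. */
static void task_step(struct task *p)
{
	if (p->freezing && !p->frozen) {
		if (p->ticks_to_stop > 0)
			p->ticks_to_stop--;
		if (p->ticks_to_stop == 0)
			p->frozen = true;
	}
}

/*
 * Try to freeze every task of one kind. At least one pass is always made.
 * Returns the number of tasks of that kind still not frozen (0 = success).
 */
static int freeze_kind(struct task *tasks, size_t n, enum task_kind kind,
		       int timeout_ticks)
{
	int todo;
	int tick = 0;

	assert(tasks != NULL);
	do {
		todo = 0;
		for (size_t i = 0; i < n; i++) {
			struct task *p = &tasks[i];

			if (p->kind != kind || p->frozen)
				continue;
			freeze_process(p);
			task_step(p);
			if (!p->frozen)
				todo++;
		}
	} while (todo > 0 && ++tick < timeout_ticks);

	return todo;
}

/* Userspace first; kernel threads are touched only if that succeeded. */
static int freeze_processes(struct task *tasks, size_t n, int timeout_ticks)
{
	int todo = freeze_kind(tasks, n, TASK_USER, timeout_ticks);

	if (todo == 0)
		todo = freeze_kind(tasks, n, TASK_KERNEL, timeout_ticks);
	return todo;
}
```

A caller would fill an array of struct task and call freeze_processes() with
a tick budget. A nonzero return means some task did not stop in time, which
in this model corresponds to the freezer giving up after the timeout rather
than hanging.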

Greetings,
Rafael

Signed-off-by: Rafael J. Wysocki <rjw <at> sisk.pl>

Pavel Machek | 4 Feb 2006 22:26

Re: Re: [RFC][PATCH -mm][Experimental] swsusp: freeze userspace processes first

Hi!

> > > That requires a timeout in case we have a user mode helper in the D state.
> > > The corrected patch is appended.
> > > 
> > > BTW, it contains a change that may help solve the unfreezeable gcc problem
> > > that has appeared in your tests.  Could you please try it or tell me what I
> > > should do to reproduce the problem?
> > 
> > I'm away from a real machine just now... I could reproduce it with
> > Nigel's "stress ..." command, then trying to build kernel.
> 
> OK, I did the following:
> 1) run "swapoff -a"
> 2) run kernel make on one vt,
> 3) run "stress -d 5 --hdd-bytes 100M -i 5 -c 5" on another vt,
> 4) run "for f in 1 2 3 4 5 6 7 8 9 10; do echo disk > /sys/power/state ; sleep 5; done" on the 3rd vt.
> 
> Appended is the version of the patch that has frozen processes in 10 attempts
> out of 10 (please note the "if (!freezing(p))" in freeze_process() ;-)).
> 
> Still, freezing the userspace processes may take more than 15 secs under such
> a load on my box, so the timeout is set to 20 sec (probably overkill for any
> sane real-life situation).

You have my ACK on freezer parts, but please reserve usermode helper
parts for separate patch. Is there simple way to demonstrate usermode
helper problem?
								Pavel
-- 

Rafael J. Wysocki | 4 Feb 2006 22:47

Re: Re: [RFC][PATCH -mm][Experimental] swsusp: freeze userspace processes first

Hi,

On Saturday 04 February 2006 22:26, Pavel Machek wrote:
> > > > That requires a timeout in case we have a user mode helper in the D state.
> > > > The corrected patch is appended.
> > > > 
> > > > BTW, it contains a change that may help solve the unfreezeable gcc problem
> > > > that has appeared in your tests.  Could you please try it or tell me what I
> > > > should do to reproduce the problem?
> > > 
> > > I'm away from a real machine just now... I could reproduce it with
> > > Nigel's "stress ..." command, then trying to build kernel.
> > 
> > OK, I did the following:
> > 1) run "swapoff -a"
> > 2) run kernel make on one vt,
> > 3) run "stress -d 5 --hdd-bytes 100M -i 5 -c 5" on another vt,
> > 4) run "for f in 1 2 3 4 5 6 7 8 9 10; do echo disk > /sys/power/state ; sleep 5; done" on the 3rd vt.
> > 
> > Appended is the version of the patch that has frozen processes in 10 attempts
> > out of 10 (please note the "if (!freezing(p))" in freeze_process() ;-)).
> > 
> > Still, freezing the userspace processes may take more than 15 secs under such
> > a load on my box, so the timeout is set to 20 sec (probably overkill for any
> > sane real-life situation).
> 
> You have my ACK on freezer parts, but please reserve usermode helper
> parts for separate patch.

OK, I'll remove the usermodehelper-related part for now.  If we have a test

Nigel Cunningham | 1 Feb 2006 22:41

Re: [RFC][PATCH -mm][Experimental] swsusp: freeze userspace processes first

Hi.

On Wednesday 01 February 2006 22:49, Pavel Machek wrote:
> Hi!
>
> > > > This is an experimental patch aimed at the "unable to freeze
> > > > processes under load" problem.
> > > >
> > > > On my box the 2.6.16-rc1-mm4 kernel with this patch applied survives
> > > > the "dd if=/dev/hda of=/dev/null" test.
> > > >
> > > > Please have a look.
> > >
> > > It makes it better (well, I used my own, simpler variant, but that
> > > should not matter; patch is attached). I now can't reproduce hangs
> > > with simple stress testing, but running kernel make alongside that
> > > makes it hang sometimes. Example of non-frozen gcc:
> > >
> > > gcc           D EEE06A70     0  1750   1749  1751
> > > (NOTLB)
> > > df85df38 00000046 bf878130 eee06a70 00004111 eee06a70 eee06a70
> > > 003d0900
> > >        00000000 c0137cf5 df85c000 00000000 c058ada2 c012503e ef2c915c
> > > ef2c9030
> > >        c1c0b480 7c3b8500 003d0927 df85c000 00000a98 7c3b8500 003d0927
> > > c0770800
> > > Call Trace:
> > >  [<c0137cf5>] attach_pid+0x25/0xb0
> > >  [<c058ada2>] _write_unlock_irq+0x12/0x30
> > >  [<c012503e>] copy_process+0xe5e/0x11b0

