Adam Hawthorne | 14 Oct 20:31 2010

CMS initial mark pauses

Hi all,


I'm seeing a customer running CMS with some fairly long initial mark pauses, especially relative to all the other pauses.  The machine is (I believe) an Itanium 6-way running HP-UX, with HP's version of Hotspot, version 1.6.0.06 .

What I'm seeing is that all the remark and young gc pauses are less than 500ms, and very consistent.  There are a lot of initial mark pauses that also fall in this range.  That's our target, and things look really good.  The problem is that there are occasional pauses of up to 1.5s, which is unacceptable for the customer.  Does anyone have any ideas what can cause (seemingly random) long initial mark pauses?

Here's an example of one of the long mark pauses:

45946.930: [GC [1 CMS-initial-mark: 957254K(1598236K)] 1473999K(2151196K), 1.3505680 secs] [Times: user=1.34 sys=0.00, real=1.35 secs] 

Here's an example of one of the more typical pauses:

45954.362: [GC [1 CMS-initial-mark: 963824K(1598236K)] 1001370K(2151196K), 0.1579016 secs] [Times: user=0.16 sys=0.00, real=0.16 secs] 


From my understanding, initial mark pauses are supposed to be relatively short, and usually shorter than remark pauses, but I don't have a remark pause greater than 300ms.

Any help is appreciated.  Thanks,

Adam

--
Adam Hawthorne
Software Engineer
BASIS International Ltd.
www.basis.com
+1.505.345.5232 Phone
_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use@...
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
Y. S. Ramakrishna | 14 Oct 20:56 2010
Picon

Re: CMS initial mark pauses

Hi Adam --

Do you have a fuller GC log (perhaps including
PrintCMSStatistics=2) to help make a sharper diagnosis?
It could be:-

    6412968 CMS: Long initial mark pauses

which we have unfortunately not gotten to addressing yet:
CMS initial mark work is (still) done single-threaded.
Usually there is little such work so we are usually fine,
but if survivor spaces are large and full and/or if CMS triggering
occupancy is such that CMS runs frequently then you can
be affected by long serial initial mark pauses because
the work is non-trivial. (CMS-remark and scavenges on
the other hand are done by several worker threads working
in parallel.)

Here's an excerpt from the "workaround" section of that bug
(reproduced here because i cannot seem to get bugs.sun.com
to display it) :-

> This is not really a viable workaround since it might lead to suboptimal
> heap configuration:
> (1) use no survivor spaces (at the risk of larger scavenge pauses, larger remark pauses,
>     even concurrent mode failures)
> (2) use a sufficiently large heap so as to be able to afford to set a
>     mark initiation threshold above the low water-mark (after a major
>     collection cycle). This will keep init-mark's riding on the coat-tails
>     of scavenges.
> *** (#1 of 2): 2006-04-13 09:53:14 PDT xxxx@...
> 
> Also, if using iCMS (Inceremental CMS), drop the Incremental mode and revert to
> vanilla CMS.
> *** (#2 of 2): 2010-04-14 11:02:03 PDT xxxx@...

If you have support, you can try escalating it via your support channels
to get this addressed, especially if the workaround/retuning doesn't
do the job.

-- ramki

On 10/14/10 11:31, Adam Hawthorne wrote:
> Hi all,
> 
> I'm seeing a customer running CMS with some fairly long initial mark 
> pauses, especially relative to all the other pauses.  The machine is (I 
> believe) an Itanium 6-way running HP-UX, with HP's version of Hotspot, 
> version 1.6.0.06 .
> 
> What I'm seeing is that all the remark and young gc pauses are less than 
> 500ms, and very consistent.  There are a lot of initial mark pauses that 
> also fall in this range.  That's our target, and things look really 
> good.  The problem is that there are occasional pauses of up to 1.5s, 
> which is unacceptable for the customer.  Does anyone have any ideas what 
> can cause (seemingly random) long initial mark pauses?
> 
> Here's an example of one of the long mark pauses:
> 
> 45946.930: [GC [1 CMS-initial-mark: 957254K(1598236K)] 
> 1473999K(2151196K), 1.3505680 secs] [Times: user=1.34 sys=0.00, 
> real=1.35 secs] 
> 
> Here's an example of one of the more typical pauses:
> 
> 45954.362: [GC [1 CMS-initial-mark: 963824K(1598236K)] 
> 1001370K(2151196K), 0.1579016 secs] [Times: user=0.16 sys=0.00, 
> real=0.16 secs] 
> 
> 
>  From my understanding, initial mark pauses are supposed to be 
> relatively short, and usually shorter than remark pauses, but I don't 
> have a remark pause greater than 300ms.
> 
> Any help is appreciated.  Thanks,
> 
> Adam
> 
> --
> Adam Hawthorne
> Software Engineer
> BASIS International Ltd.
> www.basis.com <http://www.basis.com>
> +1.505.345.5232 Phone
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use@...
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use@...
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

Y. S. Ramakrishna | 14 Oct 22:17 2010
Picon

Re: CMS initial mark pauses

Just realized that this is on HPUX/Itanium on a
JVM built by HP. You'll of course need  to go to HP to have that
addressed; sorry for not reading yr email carefully enough to
note the platform information that you had included before
writing my response below.

-- ramki

On 10/14/10 11:56, Y. S. Ramakrishna wrote:
> Hi Adam --
> 
> Do you have a fuller GC log (perhaps including
> PrintCMSStatistics=2) to help make a sharper diagnosis?
> It could be:-
> 
>    6412968 CMS: Long initial mark pauses
> 
> which we have unfortunately not gotten to addressing yet:
> CMS initial mark work is (still) done single-threaded.
> Usually there is little such work so we are usually fine,
> but if survivor spaces are large and full and/or if CMS triggering
> occupancy is such that CMS runs frequently then you can
> be affected by long serial initial mark pauses because
> the work is non-trivial. (CMS-remark and scavenges on
> the other hand are done by several worker threads working
> in parallel.)
> 
> Here's an excerpt from the "workaround" section of that bug
> (reproduced here because i cannot seem to get bugs.sun.com
> to display it) :-
> 
>> This is not really a viable workaround since it might lead to suboptimal
>> heap configuration:
>> (1) use no survivor spaces (at the risk of larger scavenge pauses, 
>> larger remark pauses,
>>     even concurrent mode failures)
>> (2) use a sufficiently large heap so as to be able to afford to set a
>>     mark initiation threshold above the low water-mark (after a major
>>     collection cycle). This will keep init-mark's riding on the 
>> coat-tails
>>     of scavenges.
>> *** (#1 of 2): 2006-04-13 09:53:14 PDT xxxx@...
>>
>> Also, if using iCMS (Inceremental CMS), drop the Incremental mode and 
>> revert to
>> vanilla CMS.
>> *** (#2 of 2): 2010-04-14 11:02:03 PDT xxxx@...
> 
> If you have support, you can try escalating it via your support channels
> to get this addressed, especially if the workaround/retuning doesn't
> do the job.
> 
> -- ramki
> 
> On 10/14/10 11:31, Adam Hawthorne wrote:
>> Hi all,
>>
>> I'm seeing a customer running CMS with some fairly long initial mark 
>> pauses, especially relative to all the other pauses.  The machine is 
>> (I believe) an Itanium 6-way running HP-UX, with HP's version of 
>> Hotspot, version 1.6.0.06 .
>>
>> What I'm seeing is that all the remark and young gc pauses are less 
>> than 500ms, and very consistent.  There are a lot of initial mark 
>> pauses that also fall in this range.  That's our target, and things 
>> look really good.  The problem is that there are occasional pauses of 
>> up to 1.5s, which is unacceptable for the customer.  Does anyone have 
>> any ideas what can cause (seemingly random) long initial mark pauses?
>>
>> Here's an example of one of the long mark pauses:
>>
>> 45946.930: [GC [1 CMS-initial-mark: 957254K(1598236K)] 
>> 1473999K(2151196K), 1.3505680 secs] [Times: user=1.34 sys=0.00, 
>> real=1.35 secs]
>> Here's an example of one of the more typical pauses:
>>
>> 45954.362: [GC [1 CMS-initial-mark: 963824K(1598236K)] 
>> 1001370K(2151196K), 0.1579016 secs] [Times: user=0.16 sys=0.00, 
>> real=0.16 secs]
>>
>>  From my understanding, initial mark pauses are supposed to be 
>> relatively short, and usually shorter than remark pauses, but I don't 
>> have a remark pause greater than 300ms.
>>
>> Any help is appreciated.  Thanks,
>>
>> Adam
>>
>> -- 
>> Adam Hawthorne
>> Software Engineer
>> BASIS International Ltd.
>> www.basis.com <http://www.basis.com>
>> +1.505.345.5232 Phone
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use@...
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
> 
_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use@...
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use


Gmane