Coyle, James J [ITACD] | 22 Oct 02:00 2013

pbs_mom consuming 19 Gbytes of memory on idle nodes after a few weeks.

I’m running Torque version 4.2.2 under Red Hat Enterprise Linux 6.3,
and pbs_mom starts out after a reboot using a small amount of virtual and
resident memory (VIRT and RES in the top -a listings below).
After running for a while, I see about 19 Gbytes for each.

Is this a known problem? Is there a fix?

Thanks,

- Jim C.

 

Just after reboot:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 2991 root      20   0 96876  48m 9112 S  0.7  0.0   0:01.07 pbs_mom

From a server that has been up a few weeks:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 7330 root      20   0 19.1g  19g 9112 S  0.0 15.2 123:15.95 pbs_mom

The 19.1 and 19 Gbytes figures seem consistent across the nodes that exhibit this issue.
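
For anyone who wants to compare nodes the same way, here is a minimal sketch (not part of the original post) that converts top's VIRT/RES fields to bytes and flags a pbs_mom whose resident size looks suspicious. It assumes the default procps top layout shown above, where RES is the sixth column and unsuffixed values are in KiB. The function names and the 1 GiB threshold are hypothetical choices for illustration only.

```python
import re

# Multipliers for the suffixes top prints; a bare number is in KiB.
_UNITS = {"": 1024, "k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def top_mem_to_bytes(field):
    """Convert a top memory field such as '96876', '48m' or '19.1g' to bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([kmgt]?)", field.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised memory field: {field!r}")
    number, suffix = match.groups()
    return int(float(number) * _UNITS[suffix])


def res_from_top_line(line):
    """Return the RES column (sixth field) of a top process line, in bytes."""
    return top_mem_to_bytes(line.split()[5])


# Hypothetical threshold: treat more than 1 GiB resident as suspicious.
THRESHOLD = 1024**3

fresh = " 2991 root      20   0 96876  48m 9112 S  0.7  0.0   0:01.07 pbs_mom"
leaky = " 7330 root      20   0 19.1g  19g 9112 S  0.0 15.2 123:15.95 pbs_mom"

for line in (fresh, leaky):
    res = res_from_top_line(line)
    status = "suspicious" if res > THRESHOLD else "ok"
    print(f"pid {line.split()[0]}: RES = {res} bytes ({status})")
```

Run against the two listings above, the freshly rebooted node stays well under the threshold while the long-running node exceeds it.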

 

 

James Coyle, PhD
High Performance Computing Group
217 Durham Center
Iowa State Univ.           phone: (515)-294-2099
Ames, Iowa 50011           web: http://jjc.public.iastate.edu/

 

_______________________________________________
torqueusers mailing list
torqueusers <at> supercluster.org
http://www.supercluster.org/mailman/listinfo/torqueusers
Ken Nielson | 22 Oct 11:44 2013

Re: pbs_mom consuming 19 Gbytes of memory on idle nodes after a few weeks.

We have fixed a large memory leak in pbs_mom; the fix will be in the upcoming 4.2.6 and 4.5.0 releases.








--
Ken Nielson
+1 801.717.3700 office +1 801.717.3738 fax
1712 S. East Bay Blvd, Suite 300  Provo, UT  84606
www.adaptivecomputing.com

Eva Hocks | 22 Oct 23:18 2013

Re: pbs_mom consuming 19 Gbytes of memory on idle nodes after a few weeks.


Ken,

when will 4.2.6 be available?

Thanks
Eva

David Beer | 22 Oct 23:19 2013

Re: pbs_mom consuming 19 Gbytes of memory on idle nodes after a few weeks.

4.2.6 is slated for early next month.

David






--
David Beer | Senior Software Engineer
Adaptive Computing
Ken Nielson | 23 Oct 15:12 2013

Re: pbs_mom consuming 19 Gbytes of memory on idle nodes after a few weeks.

Eva,

I do not have an exact date yet. We are approaching a code freeze this week or next, so it should be available in November. That is only my estimate, though, so do not make any firm plans based on it.

Regards








--
Ken Nielson
+1 801.717.3700 office +1 801.717.3738 fax
1712 S. East Bay Blvd, Suite 300  Provo, UT  84606
www.adaptivecomputing.com
