Cameron Pierce | 27 Jul 02:36 2010

[Check_mk (english)] Question regarding livestatus and pnp4nagios

Now that check_mk has livestatus, should pnp4nagios' processes still be run?

I've been having problems with the nagios daemon ceasing to run its checks and suspect it might be caused by a conflict between the two. Nagios hasn't generated any error messages in its logs. I came back to the office to find that it hadn't run any service checks since Friday. The nagios process was still running. I've been having this problem for a few weeks now.

Both livestatus and npcd (npcdmod) load broker_modules, and I wonder if they might be conflicting somehow. I was using npcd to process my performance data but disabled it last week while troubleshooting this issue, switching to pnp4nagios' synchronous mode instead, with the same results.

So today I've commented out the broker_module entry for livestatus.

Any suggestions?




_______________________________________________
checkmk-en mailing list
checkmk-en@...
http://lists.mathias-kettner.de/mailman/listinfo/checkmk-en
Mathias Kettner | 27 Jul 08:08 2010

Re: [Check_mk (english)] Question regarding livestatus and pnp4nagios

Please make sure that:

1. enable_environment_macros=0 is set in nagios.cfg
2. debug=0 is set, or debug is not set at all, when loading the livestatus module

In rare situations, both the debug logging and the environment macros
can cause Nagios to hang when Livestatus is in use.
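
A minimal sketch of how both settings could be verified, assuming a POSIX shell and a nagios.cfg path supplied by the caller. The function name check_livestatus_cfg is hypothetical and not part of Nagios or check_mk.

```shell
#!/bin/sh
# Sketch: report whether a nagios.cfg follows the two recommendations above.
# The function name is hypothetical; the config path is passed in by the caller.

check_livestatus_cfg() {
    cfg="$1"
    status=0

    # Recommendation 1: an explicit enable_environment_macros=0 line.
    if ! grep -Eq '^[[:space:]]*enable_environment_macros[[:space:]]*=[[:space:]]*0[[:space:]]*$' "$cfg"; then
        echo "WARN: enable_environment_macros is not set to 0"
        status=1
    fi

    # Recommendation 2: the livestatus broker_module line carries no debug=1 (or higher).
    if grep -E '^[[:space:]]*broker_module[[:space:]]*=.*livestatus' "$cfg" | grep -Eq 'debug=[1-9]'; then
        echo "WARN: livestatus is loaded with debug enabled"
        status=1
    fi

    if [ "$status" -eq 0 ]; then
        echo "OK: settings look as recommended"
    fi
    return "$status"
}
```

It could be called as: check_livestatus_cfg /etc/nagios/nagios.cfg (the path is an assumption and varies by installation). A missing enable_environment_macros line is reported too, on the assumption that the macros may then be enabled by default.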

Mathias

Mark Taffar | 28 Jul 14:54 2010

Re: [Check_mk (english)] Question regarding livestatus and pnp4nagios

I have had the same hanging nagios issue, but I thought it had to do with my backup script or the version of nagios I was on. I set up a clone of nagios that runs the built-in check_nagios plugin against the production nagios log file (shared over NFS from the production host) to monitor whether nagios is still working.
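
The watchdog idea can be sketched as a freshness test on a file that the production instance keeps updating. This is not the check_nagios plugin itself, only an illustration of the same principle; the function name, the mount path in the usage line, and the 5 minute threshold are assumptions.

```shell
#!/bin/sh
# Sketch: treat a Nagios instance as stalled when a file it should keep
# updating has gone stale. Returns plugin-style exit codes: 0 = OK, 2 = CRITICAL.

check_file_fresh() {
    file="$1"         # file written by the monitored instance (path is caller-supplied)
    max_age_min="$2"  # maximum tolerated age in minutes

    if [ ! -e "$file" ]; then
        echo "CRITICAL: $file not found"
        return 2
    fi

    # find prints the file name only if it was last modified more than
    # max_age_min minutes ago. -mmin is available in GNU and BSD find,
    # but it is not part of POSIX.
    if [ -n "$(find "$file" -mmin +"$max_age_min" 2>/dev/null)" ]; then
        echo "CRITICAL: $file not updated for more than $max_age_min minutes"
        return 2
    fi

    echo "OK: $file was updated recently"
    return 0
}
```

A call might look like: check_file_fresh /mnt/prod-nagios/nagios.log 5 (hypothetical NFS mount point). Wrapped in a command definition on the second instance, the exit code would raise an alert when the production daemon stops writing.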

 

You might want to confirm with Mathias, but if I recall correctly, he said in another thread to set the following in nagios.cfg to fix this issue: enable_environment_macros=0

Since my problem hasn’t recurred since I disabled my backup script and upgraded nagios (and check_mk) to the latest version, I haven’t disabled enable_environment_macros, because frankly I don’t understand the setting.

 

Mark

 

Alex Broad | 29 Jul 12:20 2010

Re: [Check_mk (english)] Question regarding livestatus and pnp4nagios

I have also had this issue. In my case it was caused by a local check on the nagios host itself: check_dhcp.

After some time debugging it, I found that the steps to reproduce the issue are simple:

1) On the nagios host (localhost), add a check that tests whether dhcp is working and set its timeout to 30 seconds (another server needs to be the dhcp server).
2) Stop the dhcpd server, so that the check takes its time to respond.
3) Watch for nagios running the local check: ps -ef |grep nagios |grep dhcp
4) As soon as nagios starts the process that checks dhcp, run check_mk -R.
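
Steps 3 and 4 depend on timing, so a small polling loop can help hit the window. The following is a sketch only: the function names, the CHECK_PATTERN variable, and the one-second polling interval are assumptions, and the ps filter is the one quoted in step 3.

```shell
#!/bin/sh
# Sketch: wait until the slow check shows up in the process list.
# CHECK_PATTERN and both function names are hypothetical.
CHECK_PATTERN="${CHECK_PATTERN:-dhcp}"

check_in_flight() {
    # Same filter as step 3; "grep -v grep" drops the grep processes themselves.
    ps -ef | grep nagios | grep "$CHECK_PATTERN" | grep -v grep >/dev/null
}

wait_for_check() {
    tries="$1"  # number of one-second polls before giving up
    while [ "$tries" -gt 0 ]; do
        if check_in_flight; then
            return 0
        fi
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}
```

Used as: wait_for_check 600 && check_mk -R, this runs the restart from step 4 as soon as the check is seen, or gives up after roughly ten minutes.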

Everything reports as OK, but 'ps -ef |grep nagios' then shows that the nagios check on dhcp is still waiting and fails to exit correctly. Nagios no longer runs checks or updates the pnp graphs; the nagios process just hangs.

Our solution was to add a few extra lines to the restart section of the /etc/init.d/nagios script.

Now when a restart is done, it issues a stop, checks that all nagios processes have stopped, kills any that remain, and then starts nagios.

Since this change it has worked without issue.
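
The extra lines were not posted, so the following is only a sketch of such a stop, verify, kill, start sequence. NAGIOS_INIT, NAGIOS_USER, WAIT_SECS, the function names, and the pgrep/pkill match on the nagios user are all assumptions, not values from this thread.

```shell
#!/bin/sh
# Sketch of a restart that refuses to start Nagios while old processes remain.
# All names and defaults below are assumptions; adjust them to the installation.
NAGIOS_INIT="${NAGIOS_INIT:-/etc/init.d/nagios}"
NAGIOS_USER="${NAGIOS_USER:-nagios}"
WAIT_SECS="${WAIT_SECS:-10}"

nagios_running() {
    # True while any process owned by NAGIOS_USER has "nagios" in its command line.
    pgrep -u "$NAGIOS_USER" -f nagios >/dev/null 2>&1
}

kill_leftovers() {
    # $1 is the signal name (TERM or KILL); no match is not treated as an error.
    pkill "-$1" -u "$NAGIOS_USER" -f nagios || true
}

safe_restart() {
    "$NAGIOS_INIT" stop

    # Give the daemon and its check children a few seconds to exit on their own.
    i=0
    while nagios_running && [ "$i" -lt "$WAIT_SECS" ]; do
        sleep 1
        i=$((i + 1))
    done

    # Whatever is still alive is terminated, then killed, before the start.
    if nagios_running; then
        kill_leftovers TERM
        sleep 2
        if nagios_running; then
            kill_leftovers KILL
        fi
    fi

    "$NAGIOS_INIT" start
}
```

Inside the init script itself, the restart branch would call its own stop and start code rather than NAGIOS_INIT. Matching on the user is deliberately broad and would also hit other daemons running as that user, so the pattern should be narrowed where that matters.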


Hope this helps

Alex

Mathias Kettner | 12 Sep 13:28 2010

Re: [Check_mk (english)] Question regarding livestatus and pnp4nagios

Hi,

I think this is not a Livestatus problem but an old problem: a Nagios
restart does not shut down cleanly before starting anew. This should
be fixed in /etc/init.d/nagios...

Mathias
