Nicolas Ross | 5 Apr 2011 18:48

fence_apc and Apc AP-8941

Hi !

I've got my cluster now set up in its final position at the colo facility,
and we've got an APC AP-8941 power bar. At the moment, our fencing is
configured with ipmilan via the RMM3 modules on our Intel servers. But I'd
like to add a backup fence device, namely the APC.

I can't seem to make it work. On our APC bar, I enabled ssh and disabled
telnet. I can ssh from our cluster nodes to the IP of the APC bar and
perform operations, although connecting via ssh takes about 2 or 3 seconds.
I try to call fence_apc manually from the command line like so :

fence_apc -a ip -l user -p pass -n node101 -x -v

and I get very rapidly :

Unable to connect/login to fencing device

Netstat shows me a time_wait connection, so it has made a tcp connection.

Any hints ? 

Fabio M. Di Nitto | 6 Apr 2011 08:23

Re: fence_apc and Apc AP-8941

On 4/5/2011 6:48 PM, Nicolas Ross wrote:
> Hi !
> 
> I've got my cluster now set up in its final position at the colo
> facility, and we've got an APC ap-8941 power bar. At the moment, our
> fencing is configured with ipmilan via our RMM3 modules on our intel
> servers. But I'd like to add a backup fence device, being the apc.
> 
> I can't seem to make it work. On our apc bar, I enabled ssh and disabled
> telnet. I can ssh from our cluster nodes to the ip of the apc bar and
> perform operations, although connecting via ssh takes about 2 or 3
> seconds. I try to call fence_apc manually from the command line like so :
> 
> fence_apc -a ip -l user -p pass -n node101 -x -v
> 
> and I get very rapidly :
> 
> Unable to connect/login to fencing device
> 
> Netstat shows me a time_wait connection, so it has made a tcp connection.
> 
> Any hints ?

It would be very useful if you could collect the output from the verbose
log and send it to Marek (in CC).

Also, what version of agents are you using? OS?

Fabio


Nicolas Ross | 6 Apr 2011 15:02

Re: fence_apc and Apc AP-8941

(...)

>>
>> fence_apc -a ip -l user -p pass -n node101 -x -v
>>
>> and I get very rapidly :
>>
>> Unable to connect/login to fencing device
>>
>> Netstat shows me a time_wait connection, so it has made a tcp connection.
>>
>> Any hints ?
>
> It would be very useful if you could collect the output from the verbose
> log and send it to Marek (in CC).
>
> Also, what version of agents are you using? OS?

I am on RHEL6, with fence-agents version 3.0.12.8.el6_0.3 (so, up to date).

When executed, the command above only displays the error I mentioned (unable
to connect). If I add --debug-file to the command line, the file it creates
is empty.

I also tried re-enabling telnet instead of ssh, and I got the same
result, except that now the debug file looks like :

--------------------
telnet> set binary
Negotiating binary mode with remote host.

Fabio M. Di Nitto | 6 Apr 2011 15:38

Re: fence_apc and Apc AP-8941

Nicolas, please report the issue via GSS.

Marek can start looking into it.

Fabio

On 4/6/2011 3:02 PM, Nicolas Ross wrote:
> (...)
> 
>>>
>>> fence_apc -a ip -l user -p pass -n node101 -x -v
>>>
>>> and I get very rapidly :
>>>
>>> Unable to connect/login to fencing device
>>>
>>> Netstat shows me a time_wait connection, so it has made a tcp
>>> connection.
>>>
>>> Any hints ?
>>
>> It would be very useful if you could collect the output from the verbose
>> log and send it to Marek (in CC).
>>
>> Also, what version of agents are you using? OS?
> 
> I am on RHEL6, with fence-agents version 3.0.12.8.el6_0.3 (so, up to date).
> 
> When executed, the command above only displays the error I mentioned
> (unable to connect). If I add --debug-file to the command line, the file

Nicolas Ross | 6 Apr 2011 16:51

Re: fence_apc and Apc AP-8941

> Nicolas, please report the issue via GSS.
> 
> Marek can start looking into it.
> 
> Fabio
> 

Sorry, what's GSS ? Is it bugzilla.redhat.com ?

Fabio M. Di Nitto | 6 Apr 2011 21:27

Re: fence_apc and Apc AP-8941

On 04/06/2011 04:51 PM, Nicolas Ross wrote:
>> Nicolas, please report the issue via GSS.
>>
>> Marek can start looking into it.
>>
>> Fabio
>>
> 
> Sorry, what's GSS ? Is it bugzilla.redhat.com ?

Red Hat Global Support Service... the one you contact to report
customer/product related issues. No it's not bugzilla.

Fabio

Marek Grac | 6 Apr 2011 16:19

Re: fence_apc and Apc AP-8941

On 04/06/2011 03:02 PM, Nicolas Ross wrote:
> (...)
>
>>>
>>> fence_apc -a ip -l user -p pass -n node101 -x -v
>>>
>>> and I get very rapidly :
>>>
>>> Unable to connect/login to fencing device
>>>
>>> Netstat shows me a time_wait connection, so it has made a tcp 
>>> connection.
>>>
>>> Any hints ?
>>
> I am on RHEL6, with fence-agents version 3.0.12.8.el6_0.3 (so, up to
> date).
>
> When executed, the command above only displays the error I mentioned
> (unable to connect). If I add --debug-file to the command line, the
> file it creates is empty.
>
> I also tried re-enabling telnet instead of ssh, and I got the same
> result, except that now the debug file looks like :
>
> --------------------
> telnet> set binary
> Negotiating binary mode with remote host.
> telnet> open 1.1.1.1 -23
> Trying 1.1.1.1...

Nicolas Ross | 6 Apr 2011 17:06

Re: fence_apc and Apc AP-8941

(...)
>> When executed, the command above only displays the error I mentioned
>> (unable to connect). If I add --debug-file to the command line, the file
>> it creates is empty.
>>
>> I also tried re-enabling telnet instead of ssh, and I got the same
>> result, except that now the debug file looks like :
>>
>> --------------------
>> telnet> set binary
>> Negotiating binary mode with remote host.
>> telnet> open 1.1.1.1 -23
>> Trying 1.1.1.1...
>> Connected to 1.1.1.1.
>> Escape character is '^]'.
>>
>> User Name : user
>> Password
>> --------------------
>> I replaced username and ip with fake ones.
>>
>> Regards,
>
> If the response is too fast, then the problem is in the connection
> information/process. If it takes long enough (timeout problem), then it can
> be a problem with a change in the command prompt. If possible, please send
> me what is displayed when you try to do it manually.

Response is indeed very fast when trying with the agent. With tcpdump running
on the node where I run the agent, I see ssh packets going to and from the
apc switch.
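
The rule of thumb quoted above (an immediate failure points at the connection or login step, a failure only after a timeout points at an unexpected command prompt) can be sketched as a small timing wrapper. This is only an illustration: the function names and the 5 second `login_timeout_s` default are assumptions made for the example, not values taken from fence-agents.

```python
import subprocess
import sys
import time


def classify_failure(elapsed_s, login_timeout_s=5.0):
    """Apply the rule of thumb: fast failure vs. failure after a timeout.

    login_timeout_s is an assumed value; use whatever timeout the agent
    is actually run with.
    """
    if elapsed_s < login_timeout_s:
        return "connect/login"   # failed before any timeout could expire
    return "prompt/timeout"      # waited for a prompt that never matched


def timed_run(argv):
    """Run a command and return (exit code, elapsed seconds)."""
    start = time.monotonic()
    proc = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    return proc.returncode, time.monotonic() - start


if __name__ == "__main__" and len(sys.argv) > 1:
    # The command to time is passed as arguments, for example the
    # fence_apc command line quoted earlier in this thread.
    code, elapsed = timed_run(sys.argv[1:])
    if code != 0:
        print("failed after %.1fs, suspect: %s" % (elapsed, classify_failure(elapsed)))
```

Run against the failing agent call, a result of "connect/login" would match what is reported here (the error comes back "very rapidly").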

Marek Grac | 8 Apr 2011 10:17

Re: fence_apc and Apc AP-8941

Hi,

On 04/06/2011 05:06 PM, Nicolas Ross wrote:
>> If the response is too fast, then the problem is in the connection
>> information/process. If it takes long enough (timeout problem), then it
>> can be a problem with a change in the command prompt. If possible,
>> please send me what is displayed when you try to do it
>> manually.
>
>
> Response is indeed very fast when trying with the agent. With tcpdump
> running on the node where I run the agent, I see ssh packets going to
> and from the apc switch.
>
> When ssh-ing to the apc switch, it takes about 2 or 3 seconds before
> I get the password prompt, and then I see :
> ------------------------------
> user@1.1.1.1's password:
>
>
> American Power Conversion               Network Management Card AOS v5.1.2
> (c) Copyright 2009 All Rights Reserved  RPDU 2g v5.1.0
> -------------------------------------------------------------------------------
> Name      : Unknown                                   Date : 04/06/2011
> Contact   : Unknown                                   Time : 10:52:44
> Location  : Unknown                                   User : Device Manager
> Up Time   : 0 Days 1 Hour 52 Minutes                  Stat : P+ N4+ 

Nicolas Ross | 8 Apr 2011 16:57

Re: fence_apc and Apc AP-8941

>> Protocol major versions differ: 1 vs. 2
>
> So they dropped support for ssh v1. Unfortunately, old versions were not
> usable with v2. I can make ssh_options tunable and not only pre-set.
>
>>
>> So, there is no ssh version 1 on this version of the apc switch. I 
>> commented out that line in /usr/sbin/fence_apc, and now the fence agent 
>> is able to establish the connection, but it cannot go any further.
>
> Add "cmd_prompt" into device_opt in fence_apc. Then you will be able
> to set --command-prompt to "apc>".
>
> Both fixes will be simple, feel free to create a bugzilla entry for them.

Ok, we are progressing. I did create a support case, as suggested by Fabio. 
It's case # 447666

I get a little further, but now it seems that the commands have also
changed.

Now my log shows :

-------------------------
American Power Conversion               Network Management Card AOS v5.1.2
(c) Copyright 2009 All Rights Reserved  RPDU 2g v5.1.0
-------------------------------------------------------------------------------
Name      : Unknown                                   Date : 04/08/2011

Nicolas Ross | 8 Apr 2011 21:00

Re: fence_apc and Apc AP-8941

>> So, there is no ssh version 1 on this version of the apc switch. I 
>> commented out that line in /usr/sbin/fence_apc, and now the fence 
>> agent is able to establish the connection, but it cannot go any further.
> 
> Add "cmd_prompt" into device_opt in fence_apc. Then you will be able
> to set --command-prompt to "apc>".
> 
> Both fixes will be simple, feel free to create a bugzilla entry for them.

I submitted bug # 694894 for this. Let's take it there.
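
The two problems discussed in this thread (the "Protocol major versions differ: 1 vs. 2" error and the prompt now being "apc>") can each be checked with a few lines of Python. The sketch below is an illustration only, not how fence_apc does it: the function names are made up for the example, the banner strings in the usage note are hypothetical, and the prompt check is a plain string comparison rather than the expect-style matching a real agent would use. It relies on the fact that an SSH server introduces itself with a line of the form SSH-<protoversion>-<software>, where protoversion 1.99 means the server accepts both protocol 1 and protocol 2.

```python
import re

# Identification line an SSH server sends first: SSH-<protoversion>-<software>
_BANNER_RE = re.compile(r"^SSH-(\d+\.\d+)-(\S+)")


def ssh_protocols(banner):
    """Return the set of SSH protocol major versions a banner advertises."""
    m = _BANNER_RE.match(banner.strip())
    if m is None:
        raise ValueError("not an SSH identification string: %r" % banner)
    proto = m.group(1)
    if proto == "1.99":          # 1.99 means "speaks both 1 and 2"
        return {1, 2}
    return {int(proto.split(".")[0])}


def prompt_seen(output, prompts):
    """True if the last non-empty line of output ends with one of the prompts."""
    lines = [ln for ln in output.splitlines() if ln.strip()]
    return bool(lines) and any(lines[-1].rstrip().endswith(p) for p in prompts)
```

For a hypothetical banner "SSH-2.0-example_1.0", ssh_protocols() returns {2}, so a client forced to protocol 1 cannot connect. Likewise, prompt_seen(session_text, ["apc>"]) tells whether a captured session ended at the new prompt, which is what an agent still waiting for a different prompt would never detect.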

Nicolas Ross | 20 May 2011 14:37

Re: fence_apc and Apc AP-8941

> Add "cmd_prompt" into device_opt in fence_apc. Then you will be able
> to set --command-prompt to "apc>".
> 
> Both fixes will be simple, feel free to create a bugzilla entry for them.

Hi !

It appears that development management won't fix the problem :

https://bugzilla.redhat.com/show_bug.cgi?id=694894

It's not all that bad, since I now use fence_apc_snmp instead.

Regards,

