Rob C | 7 Mar 2012 23:30

Re: newsletter: Slow page loads when in failover with mod_auth_kerb

Found a smoking gun when creating a parallel lab environment.  This is a forehead-slapping omission from my original problem description, so apologies in advance!

I did not mention that I contact my web server via a CNAME alias of "www", i.e. http://www gets you to our intranet, which is protected by mod_auth_kerb.  The server name is of the form server1.ny1.example.com.  In the normal case, this doesn't make any appreciable difference to the client connections: the client authenticates to the http://www server instantaneously and without problems.  In a failure scenario, every page takes 30 seconds.  However, pointing the browser directly at http://server1 instead of the CNAME is instantaneous even in the failure scenario.

If you are using another kerberized application like ssh, an ssh login to the CNAME "www" is fine, i.e. passwordless Kerberos works and the timeout is a couple of seconds as expected.  It is only apache/mod_auth_kerb that has this very specific issue with using a CNAME when one of the Kerberos servers is down.

Not sure if this clarifies anything or muddies the waters further though!

------------------------------------------------------------------------------
Virtualization & Cloud Management Using Capacity Planning
Cloud computing makes use of virtualization - but cloud computing 
also focuses on allowing computing to be delivered as a service.
http://www.accelacomm.com/jaw/sfnl/114/51521223/
_______________________________________________
modauthkerb-help mailing list
modauthkerb-help <at> lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/modauthkerb-help
jm | 8 Mar 2012 10:34

Re: newsletter: Re: newsletter: Slow page loads when in failover with mod_auth_kerb

Hi Rob,

I strongly recommend that you run tcpdump/wireshark on both the client requesting the
service ticket and the kerberized webserver to see what is going on.
Ports 53 (DNS) and 88 (Kerberos) usually give very good hints about what went wrong.
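
A minimal sketch of such a capture, assuming a Linux host with tcpdump installed and root privileges. The interface name (eth0), the output path and the function names are placeholders chosen for illustration, not anything from the original setup:

```shell
# BPF filter matching DNS (port 53) and Kerberos (port 88) traffic.
kdc_dns_filter() {
    printf '%s\n' 'port 53 or port 88'
}

# Capture matching packets to a pcap file for later inspection in wireshark.
# $1 = interface (assumed default: eth0), $2 = output file (hypothetical path)
capture_kdc_dns() {
    iface=${1:-eth0}
    outfile=${2:-/tmp/krb-dns.pcap}
    # -n   do not resolve names (avoids adding DNS traffic of our own)
    # -s 0 capture whole packets instead of truncating them
    tcpdump -n -s 0 -i "$iface" -w "$outfile" "$(kdc_dns_filter)"
}
```

Run capture_kdc_dns as root on the client and on the webserver while reproducing a slow page load, then stop it with Ctrl-C and open the resulting file in wireshark.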

Furthermore, I would try to run apache in single-process mode (http://httpd.apache.org/dev/debugging.html#truss)
and watch what it does with strace, in parallel with the tcpdump capture.
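
As a rough sketch of that step, assuming Linux with strace available: the binary name (httpd here, apache2 on some distributions), the log path and the function name are assumptions.

```shell
# Start a single-process Apache under strace, logging network syscalls.
# $1 = Apache binary (assumed: httpd), $2 = strace log file (hypothetical path)
trace_httpd() {
    httpd_bin=${1:-httpd}
    log=${2:-/tmp/httpd.strace}
    # -f               follow any child processes
    # -tt              timestamp each call with microseconds
    # -e trace=network only log socket-related system calls
    # -o               write the trace to a file
    # httpd -X         debug mode: one worker, no detaching from the terminal
    strace -f -tt -e trace=network -o "$log" "$httpd_bin" -X
}
```

Stop the regular Apache service before calling trace_httpd. Long gaps between consecutive timestamps in the log show which connect or receive call the request is waiting on.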

I would first test with the real name only, to confirm that it works when
a KDC fails, and once you are sure all is fine, continue to check whether this is a CNAME issue.

I also struggled with CNAMEs once; the problem is that browsers handle them differently.
Firefox and older IE versions do not request a service ticket for the CNAME but for the
canonical name (A record). They always resolve the CNAME down to the A-record name first and then request a service ticket
for that.

Therefore we are currently using the A name in our keytab file and apache conf instead of the CNAME.
This works fine for firefox and IE but not for Safari (and afaik this is not rfc compliant).

If you would like to do it right, you would have to have two SPNs,
one for the real server name and one for the CNAME, and you would have to duplicate
your key entry in your keytab; then Safari, IE and Firefox would all work with CNAMEs.

E.g. www.example.com is the CNAME to srv01.example.com and runs there as vhost 

When accessing www.example.com, Firefox would try to get a service ticket
for HTTP/srv01.example.com@EXAMPLE.COM, which is not correct but this is how it currently works.
Safari would do it right and try to get a service ticket for HTTP/www.example.com@EXAMPLE.COM
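
To make the difference concrete, here is a small illustration that only builds the two principal names described above. The helper name spn_for_host is made up for this example; the host names and realm are the ones used in this message.

```shell
# Build the HTTP service principal a client would request for a host name.
# $1 = host name, $2 = Kerberos realm
spn_for_host() {
    printf 'HTTP/%s@%s\n' "$1" "$2"
}

alias_name=www.example.com        # name typed into the browser (the CNAME)
canonical_name=srv01.example.com  # A-record name the CNAME points to
realm=EXAMPLE.COM

# Clients that canonicalize first (Firefox, older IE) ask for the A-record name.
echo "canonicalizing client: $(spn_for_host "$canonical_name" "$realm")"
# Clients that use the name as typed (Safari) ask for the CNAME.
echo "literal-name client:   $(spn_for_host "$alias_name" "$realm")"
```

The webserver's keytab has to contain a key for whichever of these principals the client actually requests, which is why both are needed.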

To work around this you have to make sure that there is an SPN for both www and srv01 in your ADS.
Assuming you already have a service principal for HTTP/srv01.example.com@EXAMPLE.COM, create an alias:

setspn -A HTTP/www.example.com@EXAMPLE.COM srv01-http

(where srv01-http is the ADS user to which the service principal HTTP/srv01.example.com@EXAMPLE.COM is
mapped)

Now you also have to create an alias entry in your keytab with the same secret key

cp /etc/srv01-http.keytab /etc/srv01-http.keytab.bak
ktutil
ktutil:  rkt /etc/srv01-http.keytab
ktutil:  list -e -k
slot KVNO Principal
---- ---- ---------------------------------------------------------------------
   1    3 HTTP/srv01.example.com@EXAMPLE.COM (AES-256 CTS mode with 96-bit SHA-1 HMAC) (0xabc123abc1231111111111111111111111111111111111111111111111111111)

So 0xabc... is the secret key.
Create an alias that uses the same key (without the leading 0x) and the same KVNO:

ktutil:  addent -key -p HTTP/example.com@EXAMPLE.COM -k 3 -e aes256-cts
Key for HTTP/www.example.com@EXAMPLE.COM (hex): abc123abc1231111111111111111111111111111111111111111111111111111

Verify

ktutil:  list -e -k
slot KVNO Principal
---- ---- ---------------------------------------------------------------------
   1    3 HTTP/srv01.example.com@EXAMPLE.COM (AES-256 CTS mode with 96-bit SHA-1 HMAC) (0xabc123abc1231111111111111111111111111111111111111111111111111111)
   2    3 HTTP/www.example.com@EXAMPLE.COM (AES-256 CTS mode with 96-bit SHA-1 HMAC) (0xabc123abc1231111111111111111111111111111111111111111111111111111)

Write
ktutil:  wkt /etc/srv01-http.keytab

Now cname should work in all browsers.
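
One possible way to double-check the result from a shell, assuming the MIT Kerberos client tools (klist, kvno) are installed. The function names are made up for this sketch; the keytab path and principal are the ones used above.

```shell
keytab=/etc/srv01-http.keytab   # keytab path used earlier in this message

# List the keytab entries with timestamps and encryption types;
# both principals should appear with the same KVNO.
show_keytab() {
    klist -k -t -e "${1:-$keytab}"
}

# Ask the KDC for a service ticket for the alias and print its KVNO.
# Needs a valid TGT, so run kinit as an ordinary user first.
alias_kvno() {
    kvno "${1:-HTTP/www.example.com@EXAMPLE.COM}"
}
```

If the KVNO reported by alias_kvno differs from the one show_keytab prints for the same principal, the keytab entry does not match what the KDC issues and the alias will not authenticate.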

jm | 8 Mar 2012 10:41

Re: newsletter: Re: newsletter: Re: newsletter: Slow page loads when in failover with mod_auth_kerb

typo,

> ktutil:  addent -key -p HTTP/example.com@EXAMPLE.COM -k 3 -e aes256-cts

should read

> ktutil:  addent -key -p HTTP/www.example.com@EXAMPLE.COM -k 3 -e aes256-cts

Henry B. Hotz | 8 Mar 2012 19:15

Re: newsletter: Re: newsletter: Slow page loads when in failover with mod_auth_kerb

Good of you to provide a detailed workaround!!

I don't think any RFCs cover this (though perhaps the naming convention draft does).  The community
considers DNS name canonicalization to be a DOS risk and generally wants to move away from its historical
dependence on it.  Note the addition of the "TrustDNS" configuration item to OpenSSH.

It's kind of ironic that PKI implementations are increasing their dependence on DNS canonicalization.

On Mar 8, 2012, at 1:34 AM, jm <at> wn.de wrote:

> Therefore we are currently using the A name in our keytab file and apache conf instead of the CNAME.  This
> works fine for firefox and IE but not for Safari (and afaik this is not rfc compliant).

------------------------------------------------------
The opinions expressed in this message are mine,
not those of Caltech, JPL, NASA, or the US Government.
Henry.B.Hotz <at> jpl.nasa.gov, or hbhotz <at> oxy.edu

Rob C | 8 Mar 2012 22:08

Re: newsletter: Re: newsletter: Slow page loads when in failover with mod_auth_kerb

Well, I have a solution, and it isn't strictly mod_auth_kerb related, but I hate googling and finding open-ended forum posts on the web, so I thought I would at least close the loop and see if anyone has comments on the *why* as well as the *what* that eventually fixed my issue.

I took the advice above and went through the wireshark/tcpdump captures of both the client browser's interaction with the 2 DCs and the apache server, and the apache server's traffic to the DCs and the client.  It was immediately apparent that the hang-up did not appear to be Kerberos-based but rather a rash of DNS lookups on the apache host trying to resolve the CNAME to an A record (itself).  In fact, in the failure scenario the Kerberos ticket had already been negotiated, issued and cached before the KDC was taken down, so from a mod_auth_kerb perspective it was using cached credentials anyway (KrbSaveCredentials=on): no Kerberos network traffic was taking place!

Long story short, adding the CNAME of the webserver as a host alias in /etc/hosts allowed the page to load instantaneously in the DC failure state.  As such, I have a workaround but don't fully understand why it works.  I am doing some further investigation.
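
For anyone finding this later, a sketch of what that workaround looks like. The IP address is a placeholder from the documentation range, and the helper hosts_has_name is made up for this example; only the host names come from this thread.

```shell
# Example /etc/hosts line on the apache host (192.0.2.10 is a placeholder IP):
#   192.0.2.10  server1.ny1.example.com server1 www

# Succeed if a hosts-format file lists the given name as a host name or alias.
# $1 = name to look for, $2 = hosts file (defaults to /etc/hosts)
hosts_has_name() {
    awk -v name="$1" '
        { sub(/#.*/, "") }    # drop comments before looking at the fields
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }   # exit status 0 only if the name was seen
    ' "${2:-/etc/hosts}"
}
```

With the alias in place, "hosts_has_name www" succeeds, and "getent hosts www" on the apache host should return the local entry without querying DNS, provided "files" is listed before "dns" for hosts in /etc/nsswitch.conf.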
