Tom H | 16 Jul 2010 17:01

Why do attempts to access an NFS v3 filesystem (ro,soft) block the process for minutes at a time when the server is down?


Hi all,

I have a web server which serves some content from an NFS filesystem
mounted like so:
nfsserver1:/somemount /var/www/html/somefiles  nfs     rw,soft  0 0

# mount | grep nfs
nfsserver1:/somemount on /var/www/html/somefiles type nfs (ro,soft,addr=xx.xx.xx.xx)

According to the documentation, an NFS operation on a soft mount should
wait for a "major timeout", then report "server not responding" to
syslog and return an error. A major timeout occurs after the default
retrans=3 retransmissions.

I understand the process to be like this:
call ---> 0.7 secs ---> retransmission ---> 1.4 secs ---> retransmission ---> 2.8 secs ---> server not responding (major timeout)
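
To make the arithmetic of that timeline explicit, here is a minimal sketch of the schedule as I have described it. It is only a model of my understanding, not the kernel's actual behaviour: the function name is made up for illustration, and it assumes an initial timeout of 0.7 secs (timeo=7, in tenths of a second), a timeout that doubles after each unanswered attempt, and a major timeout after three waits as in the diagram above.

```python
def major_timeout_schedule(timeo_ds=7, retrans=3):
    """Return the wait before each give-up point, in tenths of a second.

    Hypothetical helper: models the timeline in the post, not real NFS
    client code. timeo_ds is the initial timeout in deciseconds.
    """
    waits = []
    wait = timeo_ds
    for _ in range(retrans):
        waits.append(wait)
        wait *= 2  # assumption: the timeout doubles after each unanswered attempt
    return waits


waits = major_timeout_schedule()
total_seconds = sum(waits) / 10  # convert tenths of a second to seconds
print(waits, total_seconds)  # [7, 14, 28] 4.9
```

Under these assumptions a single operation should fail after roughly 4.9 secs, which is why blocking for minutes looks wrong to me.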

However, it is pretty clear that this is retrying indefinitely, as the
log files show loads of:
Jul 16 07:56:09 server1 kernel: nfs: server server2 not responding, timed out
Jul 16 07:57:09 server1 last message repeated 4 times
Jul 16 07:57:09 server1 last message repeated 6 times

and eventually this kills the Apache server, as all the available
processes are blocked while "retrying indefinitely", until the apache
[...]

