Discussion:
CLOSE_WAIT not closing in BIND 9.2.0
D.M.
2002-01-21 19:52:17 UTC
Permalink
We have multiple lingering CLOSE_WAITs on our BIND 9.2.0 server, which
means the client sent a FIN but the server hasn't closed its socket
yet. This leads me to believe this is possibly a problem with the
server. Anyone know how to control these and make sure they die after a
set time?
Barry Margolin
2002-01-21 22:03:07 UTC
Permalink
Post by D.M.
We have multiple lingering CLOSE_WAITs on our BIND 9.2.0 server, which
means the client sent a FIN but the server hasn't closed its socket
yet. This leads me to believe this is possibly a problem with the
server. Anyone know how to control these and make sure they die after a
set time?
Since DNS clients normally use UDP, not TCP, are you sure that these
connections are related to BIND? Normally, TCP is only used for zone
transfers. Are the connections going to port 53 on the server?
--
Barry Margolin, ***@genuity.net
Genuity, Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.
D.M.
2002-01-21 22:36:50 UTC
Permalink
Post by Barry Margolin
Post by D.M.
We have multiple lingering CLOSE_WAITs on our BIND 9.2.0 server, which
means the client sent a FIN but the server hasn't closed its socket
yet. This leads me to believe this is possibly a problem with the
server. Anyone know how to control these and make sure they die after a
set time?
Since DNS clients normally use UDP, not TCP, are you sure that these
connections are related to BIND? Normally, TCP is only used for zone
transfers. Are the connections going to port 53 on the server?
Thanks for the reply.

Yes, netstat says port 53. It's all TCP stuff.

# netstat -an |grep WAIT
Local Address Remote Address Swind Send-Q Rwind Recv-Q State
-------------------- -------------------- ----- ------ ----- ------ -------
192.168.30.2.53 210.156.196.31.1260 32120 0 24583 0 CLOSE_WAIT
192.168.25.2.53 130.125.1.10.1397 17520 0 24774 0 CLOSE_WAIT
192.168.30.2.53 209.192.164.70.59595 17520 0 24774 0 CLOSE_WAIT
192.168.25.2.53 216.37.1.19.1269 32120 0 24570 0 CLOSE_WAIT
192.168.30.2.53 62.232.35.13.4088 17520 0 24777 0 CLOSE_WAIT
192.168.30.2.53 204.201.58.9.59167 17520 0 24774 0 CLOSE_WAIT

Any thoughts?
Barry Margolin
2002-01-22 00:15:23 UTC
Permalink
Post by D.M.
Yes, netstat says port 53. It's all TCP stuff.
# netstat -an |grep WAIT
Local Address Remote Address Swind Send-Q Rwind Recv-Q State
-------------------- -------------------- ----- ------ ----- ------ -------
192.168.30.2.53 210.156.196.31.1260 32120 0 24583 0 CLOSE_WAIT
192.168.25.2.53 130.125.1.10.1397 17520 0 24774 0 CLOSE_WAIT
192.168.30.2.53 209.192.164.70.59595 17520 0 24774 0 CLOSE_WAIT
192.168.25.2.53 216.37.1.19.1269 32120 0 24570 0 CLOSE_WAIT
192.168.30.2.53 62.232.35.13.4088 17520 0 24777 0 CLOSE_WAIT
192.168.30.2.53 204.201.58.9.59167 17520 0 24774 0 CLOSE_WAIT
Any thoughts?
Interesting. I assume none of those are your slave servers. My guess was
going to be that you were being port-scanned, but some of those addresses
reverse-resolve to names that look like ISP servers (216.37.1.19 =
ns1.onecall.net).

Do you have any large entries in your DNS that would cause DNS queries to
switch from UDP to TCP? Like a name with several dozen A records, or an
address with lots of PTR records.
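
A quick way to check, if you have dig handy (the owner name below is a
placeholder for one of your wide RRsets, and the address is your server):

# dig @192.168.30.2 big-rrset.example.com A +ignore | egrep 'flags|MSG SIZE'
# dig @192.168.30.2 big-rrset.example.com A +vc

If the first one comes back with the "tc" flag set, the answer didn't fit
in a 512-byte UDP packet and resolvers will retry the query over TCP --
which would explain TCP connections to port 53. The second one forces the
query over TCP by hand so you can watch what happens to that connection.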
--
Barry Margolin, ***@genuity.net
Genuity, Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.
D.M.
2002-01-22 00:41:46 UTC
Permalink
Post by Barry Margolin
Post by D.M.
Yes, netstat says port 53. It's all TCP stuff.
<snip>
Post by Barry Margolin
Post by D.M.
Any thoughts?
Interesting. I assume none of those are your slave servers. My guess was
going to be that you were being port-scanned, but some of those addresses
reverse-resolve to names that look like ISP servers (216.37.1.19 =
ns1.onecall.net).
Do you have any large entries in your DNS that would cause DNS queries to
switch from UDP to TCP? Like a name with several dozen A records, or an
address with lots of PTR records.
Nope, those aren't our slaves.

Yeah, we have quite a few records with data that would exceed the UDP
packet size limit, which explains the use of TCP.

I can't figure out why named isn't closing those connections. Why
would it sit in CLOSE_WAIT? The following doc says I should truss
and snoop to find the answer.

http://sunsolve.sun.com/pub-cgi/retrieve.pl?doc=finfodoc%2F19137&zone_32=closing%20tcp%20connections

Guess that's my only path of action at this point.
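
Something like this is probably where I'll start, per that doc (the pgrep
pattern and the capture file name are just my guesses):

# truss -f -t close -p `pgrep named`
# snoop -o /tmp/dns-tcp.cap tcp and port 53

The truss should show whether named ever gets around to close()-ing those
descriptors, and the snoop capture should show the client's FIN being
ACKed with no FIN ever going back from our side.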

Thanks.
Pete Ehlke
2002-01-22 05:42:35 UTC
Permalink
Post by D.M.
Nope, those aren't our slaves.
Yeah, we have quite a few records with data that would exceed the UDP
packet size limit, which explains the use of TCP.
I can't figure out why named isn't closing those connections. Why
would it sit in CLOSE_WAIT? The following doc says I should truss
and snoop to find the answer.
Interesting. One sees a lot of this behaviour on busy web servers -- it's
usually caused by browsers that die or are killed in the middle of
receiving data. ISTR that some versions of MSIE from a few years ago
would do the same sort of thing. I wonder if there is something out
there that's causing some name servers to behave the same way.
I notice that your name server is in RFC 1918 space. Perhaps your NAT or
firewall device is doing something interesting with 53/tcp? What sort of
NAT device are you using?

At any rate, I don't recall you telling us what OS you run.
http://www.sean.de/Solaris/tune.html used to be of some use on Solaris
web servers, but tuning a web server and tuning a name server are *vastly*
different enterprises, and even at that, really affecting CLOSE_WAIT (as
opposed to TIME_WAIT) is *hard*.
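
For reference, the knobs that page covers are the ndd ones, along these
lines (the set example is illustrative -- check your own defaults before
touching anything):

# ndd -get /dev/tcp tcp_time_wait_interval
# ndd -get /dev/tcp tcp_fin_wait_2_flush_interval
# ndd -set /dev/tcp tcp_time_wait_interval 60000

Those cover TIME_WAIT and FIN_WAIT_2 (values in milliseconds). There is no
equivalent for CLOSE_WAIT, because that state only ends when the
application itself calls close(). IIRC the TIME_WAIT tunable was even
named tcp_close_wait_interval on older Solaris, which fools people into
thinking CLOSE_WAIT is tunable; it isn't.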

Were I you (and I'm not, but...), I'd be looking hard at the reasons for
those >512-byte records. There's virtually *always* a way to do
whatever it is that you're trying to do without forcing clients to retry
over TCP.

my $.02

-Pete
Barry Margolin
2002-01-22 15:13:50 UTC
Permalink
Post by Pete Ehlke
At any rate, I don't recall you telling us what OS you run.
http://www.sean.de/Solaris/tune.html used to be of some use on Solaris
web servers, but tuning a web server and tuning a name server are *vastly*
different enterprises, and even at that, really affecting CLOSE_WAIT (as
opposed to TIME_WAIT) is *hard*.
CLOSE_WAIT isn't something you can tune away. In this case, it suggests a
bug in the server -- it isn't closing the connection after it's done
sending the response.
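
Since you mentioned truss and snoop I'll assume Solaris; the proc tools
there will confirm it's named holding the descriptors (adjust the pgrep
pattern to match your named process):

# pfiles `pgrep named` | egrep 'S_IFSOCK|sockname|peername'

Every CLOSE_WAIT line netstat shows should appear there as a socket named
still has open. If so, there's nothing to tune; named has to call close().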
--
Barry Margolin, ***@genuity.net
Genuity, Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.
Nate Campi
2002-01-22 16:38:19 UTC
Permalink
Post by Barry Margolin
Post by Pete Ehlke
At any rate, I don't recall you telling us what OS you run.
http://www.sean.de/Solaris/tune.html used to be of some use on Solaris
web servers, but tuning a web server and tuning a name server are *vastly*
different enterprises, and even at that, really affecting CLOSE_WAIT (as
opposed to TIME_WAIT) is *hard*.
CLOSE_WAIT isn't something you can tune away. In this case, it suggests a
bug in the server -- it isn't closing the connection after it's done
sending the response.
W. Richard Stevens backs up Barry:

FIN_WAIT_2 State
In the FIN_WAIT_2 state we have sent our FIN and the other end has
acknowledged it. Unless we have done a half-close, we are waiting for
the application on the other end to recognize that it has received an
end-of-file notification and close its end of the connection, which
sends us a FIN. Only when the process at the other end does this close
will our end move from the FIN_WAIT_2 to the TIME_WAIT state.

This means our end of the connection can remain in this state forever.
The other end is still in the CLOSE_WAIT state, and can remain there
forever, until the application decides to issue its close.

Many Berkeley-derived implementations prevent this infinite wait in the
FIN_WAIT_2 state as follows. If the application that does the active
close does a complete close, not a half-close indicating that it expects
to receive data, then a timer is set. If the connection is idle for 10
minutes plus 75 seconds, TCP moves the connection into the CLOSED state.
A comment in the code acknowledges that this implementation feature
violates the protocol specification.

...this is from TCP/IP Illustrated Vol 1.
--
Nate Campi | Terra Lycos DNS | WiReD UNIX Operations

He may look like an idiot and talk like an idiot but don't let that fool
you. He really is an idiot. - Groucho Marx
D.M.
2002-01-23 16:11:36 UTC
Permalink
Solaris (at least Solaris 7) will sometimes leave a socket in
"CLOSE_WAIT" even after the socket is closed (and the program has
terminated). This times out after a little while -- maybe around 4-5
minutes. When I have seen this, there is always some unsent data in
the Send-Q.
We're running Solaris 8. My experience shows nothing in the Send-Q:

Local Address Remote Address Swind Send-Q Rwind Recv-Q State
-------------------- -------------------- ----- ------ ----- ------ -------
192.168.10.2.domain 192.168.10.2.44440 32768 0 32768 0 CLOSE_WAIT
192.168.10.2.domain 192.168.10.2.44469 32768 0 32768 0 CLOSE_WAIT

Strangely, I can duplicate (create more of) the CLOSE_WAITs by
telnet-ing to our DNS server's VIP on port 53 and then '^]' escaping
out of it and closing. If I do this, by the 9th time you'll be unable
to connect on 53, as the CLOSE_WAITs make named unresponsive.
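
One guess on my part, not verified: named caps the number of simultaneous
TCP clients it will serve, and a socket stuck in CLOSE_WAIT is still an
open connection as far as named is concerned, so each one eats into that
cap. In named.conf terms that would be something like this (the number is
made up, not what we actually run):

options {
        // new TCP queries get refused once this many TCP clients are
        // outstanding -- and stuck CLOSE_WAIT sockets still count
        tcp-clients 10;
};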

# Daryl
D.M.
2002-02-15 20:33:25 UTC
Permalink
Post by D.M.
We have multiple lingering CLOSE_WAITs on our BIND 9.2.0 server, which
means the client sent a FIN but the server hasn't closed its socket
yet. This leads me to believe this is possibly a problem with the
server. Anyone know how to control these and make sure they die after a
set time?
We finally resolved this issue. A little about our implementation to
help describe the problem: our name servers sit in pairs behind a
load balancer. The interfaces (lots of virtual IFs) between the
servers and the load balancer are mostly private nets and are NATed
via the balancer. The BIND confs had those private nets in the blackhole
ACL, which you wouldn't think would be a problem since the source IPs
shouldn't be in those ranges, but it appears that something was breaking
over those NATed addresses. I'm thinking the clients were sending
their FINs and closing before the server could send its close ACKs, and
it just hung. Anyway, we removed those nets from the blackhole ACL
and put them on the load balancer's ACLs for inbound traffic. Since
then, no more persistent CLOSE_WAIT sessions.
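
For anyone else who runs into this, the offending piece of config looked
roughly like the following -- the ACL name and the prefixes are
paraphrased here, not lifted from our real confs:

acl "rfc1918" {
        10.0.0.0/8;
        172.16.0.0/12;
        192.168.0.0/16;
};

options {
        // This is what we took out. The NATed nets between the servers
        // and the load balancer fell inside it, and the TCP teardown
        // broke over those addresses; filtering the same nets at the
        // load balancer instead cleared the CLOSE_WAITs.
        blackhole { "rfc1918"; };
};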

Our duplication of the hung CLOSE_WAIT state on the sockets was done
by telnet-ing to port 53 on the server from a blackhole ACL-ed net,
then escaping out and quitting. Not sure if this is how the blackhole
option was intended to behave, but that's what it does.

I'm just glad it's fixed.

# D
