I had an interesting call a few days back. Old customer, old SCO box overdue for the graveyard but still in use. The customer needed to renumber his IP addressing scheme and ran into a confusing situation after changing from 192.168.2.x to 10.126.10.x.
Here are the observed conditions:
"ifconfig" shows correct address, broadcast, mask.
"netstat -rn" shows correct routes.
"netstat -in" does show a high number of collisions.
Nothing funky in /etc/hosts.
Ping a local machine or the router, ping hangs.
Delete the default route and ping works.
On the face of it, this is wrong. Local LAN packets have nothing to do with the default route; it shouldn't matter whether the machine has a default route or not - local packets don't need to go through the router.
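To make that expectation concrete, here is a minimal sketch of the decision the IP stack makes for each destination. It assumes a /24 mask on the new LAN (the article only says 10.126.10.x), and the host addresses are made up for illustration.

```python
import ipaddress

# Assumption: /24 mask on the renumbered LAN; the article only gives 10.126.10.x.
LOCAL_NET = ipaddress.ip_network("10.126.10.0/24")

def needs_gateway(dest, local_net=LOCAL_NET):
    """Return True if a packet to dest has to be handed to the default route."""
    return ipaddress.ip_address(dest) not in local_net

# Hypothetical host addresses, for illustration only.
print(needs_gateway("10.126.10.7"))   # a machine on the new LAN
print(needs_gateway("192.168.2.1"))   # an address from the old scheme
```

The first destination is delivered directly on the wire; only the second depends on a default route existing at all.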
I should have realized what this was instantly, because there is something that local packets could need the router for.. I'll give you a second to think about it..
Ok.. that's long enough. The reason I didn't hit on this instantly is because I had talked to the admin there and reminded him that he needed to change /etc/resolv.conf as part of this work. Unfortunately, he forgot it - well, he remembered that he *needed* to do it, but never actually did it. So resolv.conf still pointed at a 192.168.2 address for a nameserver.
Why does that matter? Because of reverse lookups. Ping, like many network tools, tries to resolve the addresses it handles to names if possible. The resolver looks at /etc/resolv.conf to find out how to do that, and of course it found that 192.168.2.x address. Without a default route, there is nothing it can do with that: it can't ask the name server to resolve any IPs because it has no way to get there. Therefore it instantly gives up, and the ping works.
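A quick way to see whether reverse lookups are the thing that hangs is to time one directly. This is only a sketch: timed_reverse_lookup is a name made up here, and the resolver argument exists just so the function can be exercised without touching the network.

```python
import socket
import time

def timed_reverse_lookup(addr, resolver=socket.gethostbyaddr):
    """Try to turn addr into a name; return (name_or_None, seconds_taken)."""
    start = time.monotonic()
    try:
        # gethostbyaddr returns (hostname, aliases, addresses)
        name = resolver(addr)[0]
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses
        name = None
    return name, time.monotonic() - start
```

Calling timed_reverse_lookup on a local address and getting an answer (or a failure) only after many seconds points at the name server configuration rather than at the LAN itself.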
But with a handy default route, it's going to try to reach the name server. That might have been OK if the router knew nothing about any 192.168.2.x addresses and had said so instantly, but apparently it tried to route that.. I'm not sure where, though they do have hardware VPNs also, so that may be where it sent the packets. Whatever actually happened, the resolver requests would never be answered, so everything had to wait for a timeout.. I don't know how long that timeout is configured for, but whatever it is, it was plenty long enough to hang ping.
Putting the correct DNS server into resolv.conf solved this instantly (there's no need to reboot - each newly started command reads resolv.conf afresh, so the next ping picks up the change).
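After a renumbering, a check along these lines could flag leftover name server entries. It is a sketch under assumptions: the /24 mask, the sample file contents and the function name offnet_nameservers are all invented for illustration, and a name server outside the local network is not necessarily wrong (it may be a perfectly good external server), just worth a second look.

```python
import ipaddress

def offnet_nameservers(resolv_text, local_net):
    """Return the nameserver addresses in resolv_text that lie outside local_net."""
    net = ipaddress.ip_network(local_net)
    found = []
    for line in resolv_text.splitlines():
        fields = line.split()
        # Only "nameserver <address>" lines matter here
        if len(fields) >= 2 and fields[0] == "nameserver":
            if ipaddress.ip_address(fields[1]) not in net:
                found.append(fields[1])
    return found

# Hypothetical resolv.conf contents left over from the old scheme.
sample = "domain example.com\nnameserver 192.168.2.1\n"
print(offnet_nameservers(sample, "10.126.10.0/24"))
```

On the real machine the text would come from reading /etc/resolv.conf instead of the sample string.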
I don't know where the collisions came from.. but I wasn't asked to do anything more, so I left that to the normal admin. The system worked at this point and the collisions were not increasing.
© 2011-03-10 Anthony Lawrence