From: Bela Lubkin <fi...@armory.com> Subject: Re: SCO 5.0.7 MP5 network hung up Date: Thu, 26 Feb 2009 02:03:20 -0800 Message-ID: <email@example.com> References: <49A050C6.firstname.lastname@example.org> Steve Fabac wrote: > I had the user run netstat -m and log this result. I then had him reboot the system. > > I've never seen negative numbers in a netstat -m field before. These stats are all 32-bit numbers; the fact that it prints a negative shows that it's signed, so 2147483647 (2^31 - 1) is the largest number it can represent. If you add up the "total" column for classes 1..14, it's well over 2^31. Thus the negative number. The numbers actually don't add up: you need to add another ~1 billion (1184352475) to wrap it around to the "total mblks" value here. I believe this is because there's a missing line in this table. Notice that the sum of 14 classes worth of "0" in the fail column == 13. There's a missing line with 13 fails, 1.2 billion total allocations, and... ... what is this missing line? I half remember that allocations smaller than the smallest buffer size (64) can be stored within a buffer header itself. Ah yes: /usr/include/sys/stream.h: #define SMALLBUFSIZE 16 /* size of buffer contained in header */ I believe the whole missing line should look like: config alloc free total max fail stream 8192 15 8177 247532 136 0 queues 566 30 536 495073 279 0 mblks 19556 17474 2082 -676977404 19414 13 buffer headers 20410 19063 1347 147485852 20323 0 hdr internal, 16 bytes 493 0 493 1184352475 411 13 class 1, 64 bytes 16064 15860 204 1248340854 16022 0 class 2, 128 bytes 256 0 256 445159253 252 0 Those are the numbers that make the "mblks" line add up. Each mblks field should equal the sum of class[1..N] same field, plus the number of header-internal buffers in that column. You don't have a problem with 16-byte buffers, I just wanted to explain the negative numbers and other discrepancies (mostly to make sure I had a handle on it myself). BTW, "total" means "number ever used". If something allocated a 1K buffer, freed it, allocated it again, repeat 10 million times, then for that process's purposes "max" would never have gone higher than 1; "total" would be 10 million. I guess you do have a small problem since there were 13 failures. But I think the missing line also gets credited with some allocation failures that had nothing to do with actually being out of memory (things like an allocation from an interrupt context where it isn't allowed). Nothing here shows you to be totally out of memory, in fact your system went bad while way below the given STREAMS memory limit. That's odd in itself. > class 3, 256 bytes 864 0 864 444583140 857 0 > class 4, 512 bytes 88 0 88 14160819 81 0 > class 5, 1024 bytes 76 0 76 2589218 76 0 > class 6, 2048 bytes 890 832 58 218841126 890 0 > class 7, 4096 bytes 808 782 26 42992457 808 0 > class 8, 8192 bytes 6 0 6 102440 6 0 > class 9, 16384 bytes 8 0 8 16868076 8 0 > class 10, 32768 bytes 3 0 3 34 3 0 > class 11, 65536 bytes 0 0 0 0 0 0 > class 12, 131072 bytes 0 0 0 0 0 0 > class 13, 262144 bytes 0 0 0 0 0 0 > class 14, 524288 bytes 0 0 0 0 0 0 > total configured streams memory: 25024.00KB > streams memory in use: 6493.53KB > maximum streams memory used: 7446.71KB Given the low number of failures and the moderate use of total configured memory, I'm thinking this isn't a STREAMS failure at all. To be precise: you have a small STREAMS failure problem that probably isn't a worry; you have an ongoing STREAMS leak that probably isn't causing the lockup; and you have something else causing the lockup. There's another point of total failure in the network stack: the NIC itself. Look at `netstat -in`, `ndstat -l` at the time of failure. Collect some pre-fail samples so you'll have a baseline. In fact, add these commands to your cron job, and dial it back to once an hour. The leakage you show is very slow, you don't need 5min samples. The failure seems to be an all-or-nothing proposition, but when you look at those additional stats you may see something changing over time. This will be another gradual thing heading towards an eventual failure point. If it's something that happens all at once, it's probably something you'll never see in these statistics. So no point in collecting them too frequently. One thing you can do that _might_ clear the condition and also definitively point the finger at the NIC: # (netstat -am; netstat -in; ndstat -l) > /tmp/before # ifconfig net0 down # sleep 5 # (netstat -am; netstat -in; ndstat -l) > /tmp/during # ifconfig net0 up # sleep 5 # (netstat -am; netstat -in; ndstat -l) > /tmp/after If the memory-in-use stats drop after bringing the interface down, the NIC driver was leaking. Unfortunately, if they don't drop that doesn't prove the NIC driver wasn't leaking -- it proves it's either innocent _or_ does a crap job of cleaning up after itself. In any case, whether this cleans up the leak and whether it awakens the network are also two separate things. So take note of both: Does it reduce STREAMS memory allocation? Does it wake the net back up? I can't go fooling with NIC interfaces on deeptht, and I don't have an OSR5 VM handy right here, so: there is some chance that after a down/up cycle, the NIC will have forgotten its IP address etc. So you'll want to experiment in advance on your own test system. If it does forget stuff, it should all be stuff you'd have seen when running `ifconfig -a`. So run that in advance; then if it forgets, you have reference to get back to where you were. Notice that I'm using `netstat -am`. This adds some extra stats: streams allocation: ... maximum streams memory used: 27582.71KB inet mblk cache: 256 = 0, 2048 = 1000, 4096 = 1000 networking allocation: type alloc max fail socket 388 1323 0 rawcb 0 1 0 inpcb 397 3462 0 tcpcb 254 3337 0 ifnet 4 4 0 route 641 641 0 ifaddr 37 37 0 ipfrag 0 2 0 sockaddr 776 2646 0 iovec 0 0 0 moptions 0 2 0 ipmaddr 2 2 0 arpinfo 5 8 0 mbcl 0 0 0 ppp 0 2 0 usock 44 274 0 The "inet mblk cache" refers to STREAMS blocks which are held by the IP stack. On deeptht, these have reached their limit of 1000 each. (I'm not sure why 256-byte blocks are pegged at 0.) I believe these blocks are allocated out of 2 & 4K STREAMS blocks. IP's algorithm is to allocate these and hold onto them forever, up to the 1000 limit. For this reason it is perfectly normal for an OSR5 system to "leak" STREAMS memory until these two values have reached steady state at 1000. At which point there will be at least 1000 2K and 4K blocks allocated; therefore the "config" and "max" values for those STREAMS block classes will also be at least 1000. And baseline STREAMS memory usage, at steady state, will be at least (1000 * 2K + 1000 * 4K) == 6M. The stuff under "networking allocations" are certain structures that the IP stack chooses to allocate out of STREAMS buffers. Some research would be needed to pin down their actual sizes, which would tell you which classes of STREAMS buffers are held down by those allocations. Without doing that, you can at least add up all the `alloc' values and subtract it from the `alloc mblks' value. Whatever leak you have is of no more than (alloc mblks - blocks used by various IP stack allocations) blocks. Note: the net structures, but not (I believe) the 2 & 4K block caches, were moved into regular kmem_alloc()'d memory some time after 505. I don't remember if 506 or 507 was the first to ship that way. Evidence here on deeptht (which is running 506) suggests it's already changed. So leakage on 506/507 is (alloc mblks - 2K & 4K blocks held by IP), and you can specifically see which sizes are leaking. -- Oh, and I suddenly realize that you're running 507, not 505 like I thought, so these hidden stats could only account for your pinned-down 2 & 4K blocks. You have nearly 16K 64-byte blocks allocated. If anything is leaking, it's 64-byte blocks. But that only accounts for 1MB, so it's not that significant anyway. > syslog at the time of the reboot: ... doesn't show anything interesting. > I was told that that SCO had done a lot of work to fix streams leaks > in the network drivers for 5.0.7. The following is excerpted from > my nstr.log that I collect ever five minutes via cron when the > system is up. I inserted the "syslog: restart" lines (extracted from > /usr/adm/syslog showing when the system was rebooted) and included > the lines around the restart to show the reset in "streams memory in use." > The period from Aug 25 2008 to Jan 19 2009 shows that streams in use > at 00:00:00 each night crept from 1738.70Kb to 13316.70KB. [lots of log stuff follows] ... but this is mostly irrelevant since I don't believe your lockup is caused by a leak. Oh, if you fix the basic cause of the lockup it might be valid for you to then worry about what happens when your STREAMS usage hits the ceiling, but not yet. Next time it locks, first, see if very simple networking with small packets still works: # ping -n -c3 -s8 127.0.0.1 & Background it so that if it locks, you don't lose your shell. There are cases where a process like that won't accept interrupts so then you're stuck. If that works, try larger packets (e.g. 500, 1400, 8000, 32000). Just trying to feel around in the network / network memory allocation systems to see what kind of shape they're in. After that, move on to the NIC down/up stuff I was talking about. If you reasonably suspect the NIC / driver, a replacement NIC is like $100 ($2 if you want something that'll be at least as flaky as the current one). Choose a well regarded NIC that uses a different driver. If the lockup problem disappears, you're done. Separately, observing the new behavior of STREAMS buffer allocation will tell you whether the old NIC was leaking. And if the new NIC doesn't change anything then it proves (well, strongly suggests) that you should look elsewhere. >Bela<
Got something to add? Send me email.
If you ask "Should we be in space?" you ask a nonsense question. We are in space. We will be in space. (Frank Herbert)