This is an old SCO Unix article. If you are having trouble with Linux kernel building, this isn't going to help, sorry. There are other posts here that might, so use the Search box above to find them.
That's a pretty awful feeling, isn't it? You have to link a new kernel because you need to change a value or add something, and the link fails. The near-gibberish it outputs looks completely unhelpful and you haven't a clue where to start. Well, this article hopes to give you some clues.
A cover-your-butt procedure I always follow is to link a kernel BEFORE changing anything. If it fails, you know it was already broken, and didn't break because of something you did. If you are feeling really paranoid, answer "N" to the "Do you want this kernel to boot by default" prompt, and then do:
sum -r /stand/unix ./unix
and see if the two files are the same. They certainly should be if you haven't changed anything yet. If they aren't, I suggest saving a copy of the kernel that currently boots:
btmnt -w
cd /stand
cp unix unix.good
cd
btmnt -d
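As a hedged sketch, the comparison can be wrapped in a small function so you get a yes/no answer instead of two checksums to eyeball. The name kernels_match is made up for illustration, and the sketch assumes a POSIX-style shell and a "sum" that accepts -r and reads standard input.

```shell
# kernels_match FILE1 FILE2
# Hypothetical helper: succeeds when both files produce the same
# `sum -r` checksum and block count, fails when they differ, and
# returns 2 if either file cannot be read.
kernels_match() {
    a=$(sum -r < "$1") || return 2
    b=$(sum -r < "$2") || return 2
    [ "$a" = "$b" ]
}

# Example use, run from /etc/conf/cf.d after a test link:
#   kernels_match /stand/unix ./unix && echo "kernels are identical"
```

A matching checksum is strong evidence rather than proof; "cmp" would give a byte-for-byte answer if you want one.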
We're going to start with an actual case. A local consultant called me because he had tried to increase a kernel variable, but the link failed. The increase was critical to the proper functioning of the system, and he couldn't fix it.
As it turns out, I could have identified the problem in seconds. Unfortunately, I didn't realize that at the time (live and learn), but even if I had thought of that method, I would have dismissed it because I was sure the problem was elsewhere. I'll tell you what I should have done that would have instantly told me what was wrong, but I'll hold off explaining why until later. Here's what would have given me the answer I needed:
cd /etc/conf/cf.d
diff sdevice sdevice.new
Think about that as you read along.
This article doesn't go into the whole subject of drivers and the link directories very deeply. You might want to read Understanding Device Drivers if you want to understand more.
The first thing I did was this:
cd /etc/conf/cf.d
script /tmp/linkerr
./link_unix
After the script finished belching out its errors, I used CTRL-D to exit "script", and went to look at /tmp/linkerr. Here it is:
# ./link_unix
The UNIX Operating System will now be rebuilt.
This will take a few minutes. Please wait.
Root for this system build is /
undefined           first referenced
 symbol                 in file
putctl              /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
sdistributed        /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
freemsg             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
qreply              /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
flushq              /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
putq                /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
qsize               /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
getq                /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
putbq               /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
allocb              /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
linkb               /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
copyb               /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/log/Driver.o
dupb                /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/log/Driver.o
freeb               /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/log/Driver.o
canput              /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/log/Driver.o
putnext             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/log/Driver.o
putctl1             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ptm/Driver.o
qenable             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ptm/Driver.o
bufcall             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ldterm/Driver.o
pullupmsg           /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/timod/Driver.o
copymsg             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/timod/Driver.o
msgdsize            /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/tirdwr/Driver.o
unlinkb             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/tirdwr/Driver.o
rmvq                /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/tirdwr/Driver.o
insq                /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/tirdwr/Driver.o
lock_stp            /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
backq               /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
unlock_stp          /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
qdetach             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
at_qrunflag         /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
strwaitbuf          /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
dupmsg              /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
lock_str_bfsleep    /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
strmaxblk           /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
getclass            /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
allocq              /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
streams             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
freeq               /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
setq                /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
shlock_str_qnext    /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
clnopen             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
noenable            /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
qdisable            /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
strdoioctl          /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ksl/Driver.o
strwaitq            /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ksl/Driver.o
findmod             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ksl/Driver.o
qattach             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ksl/Driver.o
strqset             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ip/Driver.o
adjmsg              /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ip/Driver.o
strmsgsz            /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/rip/Driver.o
unbufcall           /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/iknt/Driver.o
bsize               /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/iknt/Driver.o
esballoc            /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/net0/Driver.o
mblock              /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ipl/Driver.o
emblock             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ipl/Driver.o
rbsize              /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/nfs/Driver.o
i386ld fatal: Symbol referencing errors. No output written to unix
ERROR: Can not link-edit unix
idbuild: idmkunix had errors.
System build failed.
#
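With this many lines, it can help to pull just the symbol names out of the captured log before hunting for them. Here is a minimal sketch, assuming the log has one "symbol path" pair per line as the linker prints it. The function name undefined_symbols is hypothetical, and the carriage-return stripping is there only because script(1) captures often contain them.

```shell
# undefined_symbols LOGFILE
# Hypothetical helper: print the name of every undefined symbol listed
# in a link_unix log, one per line, in the order the linker reported them.
undefined_symbols() {
    # Keep only two-field lines whose second field is a Driver.o path;
    # this skips the banner, the column headers and the trailing errors.
    tr -d '\r' < "$1" |
        awk 'NF == 2 && $2 ~ /\/Driver\.o$/ { print $1 }'
}

# Example use:
#   undefined_symbols /tmp/linkerr
```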
Pretty awful mess, isn't it? I was convinced that a driver file in /etc/conf/pack.d must be missing or horribly corrupted. Actually, though, it couldn't have been a missing driver file: link_unix would have reported that in plain English. A really badly corrupted driver file would also have barfed differently, though the error message wouldn't be as obvious (I'll show examples of that later). Could it be that a good driver had been copied to the wrong place, for example /etc/conf/pack.d/clone/Driver.o copied to /etc/conf/pack.d/kbd? No, because that would give us multiply defined symbols, and there's no mention of that in the output.
How about a Driver.o from a different release, or from a backup made before patches were applied? Yes, that could cause this kind of error, and that was my first thought. Yet I know the local consultant pretty well, and that doesn't sound like something he would have done, even accidentally, so I gave up on that idea and decided that some needed driver was just not being linked into the kernel. Now to find it.
I picked a symbol from the list of errors and went looking for it like this:
cd /etc/conf/pack.d
for i in */Driver.o
do
    strings "$i" | grep esballoc && echo "$i"
done
Let me say right away: that's NOT the best way to look for symbols in a .o file, but I got lucky and "str" popped up as a match. I checked /etc/conf/sdevice.d/str, and it was marked N:
str N 0 0 0 0 0 0 0 0
Now that's pretty odd. It shouldn't have been: "str" is the Streams driver and is necessary for just about everything on the network. I changed the "N" to "Y" and tried the link again:
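Once you suspect a driver has been switched off, a quick listing of everything marked "N" can be worth a look. This is only a sketch: the function name disabled_drivers is made up, it assumes the second field of each sdevice.d line is the Y/N flag as in the "str" line above, and plenty of drivers are legitimately "N", so treat the list as a starting point and not as a list of faults.

```shell
# disabled_drivers [DIR]
# Hypothetical helper: list the drivers whose sdevice.d entry has "N"
# in the second (configure) field. DIR defaults to /etc/conf/sdevice.d.
disabled_drivers() {
    dir=${1:-/etc/conf/sdevice.d}
    # Field 1 is the driver name, field 2 is the Y/N flag.
    cat "$dir"/* | awk '$2 == "N" { print $1 }' | sort -u
}

# Example use:
#   disabled_drivers | grep -e str -e clone
```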
# ./link_unix
The UNIX Operating System will now be rebuilt.
This will take a few minutes. Please wait.
Root for this system build is /
undefined           first referenced
 symbol                 in file
clnopen             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
i386ld fatal: Symbol referencing errors. No output written to unix
ERROR: Can not link-edit unix
idbuild: idmkunix had errors.
System build failed.
That's better: far fewer errors, but still no success. When you are linking a kernel, even one error is one too many. So I tried my loop again, but with clnopen this time:
cd /etc/conf/pack.d
for i in */Driver.o
do
    strings "$i" | grep clnopen && echo "$i"
done
This didn't work, though. It's not that "clnopen" isn't somewhere in one of those Driver.o files, it's that "strings" isn't good enough to find it. However, I had other weapons: I was dialed in to the customer, but was working from my own machine which happens to be the same OS release. On my machine, I have the Development System installed, and the Development System has "nm". So on my system I did this:
cd /etc/conf/pack.d
for i in */Driver.o
do
    nm "$i" | grep clnopen && echo "$i"
done
Bingo! The "clone" driver has "clnopen", and sure enough, it too was turned off in /etc/conf/sdevice.d (nobody knows how or why this happened, by the way). I turned it back on, and now the kernel linked successfully.
If I had not had "nm", I could have done this:
cd /etc/conf/pack.d
for i in */Driver.o
do
    hd "$i" | grep clnopen && echo "$i"
done
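The three loops above differ only in the tool that dumps the object file, so they fold naturally into one function. What follows is a sketch and not anything shipped with the system: find_symbol and its argument order are invented here, and it assumes a POSIX-style shell. Like the loops, it reports any Driver.o that mentions the symbol, whether the object defines it or merely references it.

```shell
# find_symbol SYMBOL [TOOL] [PACKDIR]
# Hypothetical helper: print the name of each driver directory whose
# Driver.o mentions SYMBOL. TOOL is the dumper to use (nm, strings, hd);
# it defaults to nm. PACKDIR defaults to /etc/conf/pack.d.
find_symbol() {
    sym=$1
    tool=${2:-nm}
    packdir=${3:-/etc/conf/pack.d}
    for f in "$packdir"/*/Driver.o
    do
        [ -f "$f" ] || continue
        if "$tool" "$f" 2>/dev/null | grep "$sym" > /dev/null
        then
            # Reduce .../pack.d/clone/Driver.o to just "clone"
            basename "$(dirname "$f")"
        fi
    done
}

# Example use:
#   find_symbol clnopen          # with nm, if the Development System is installed
#   find_symbol clnopen hd       # fallback when nm is not available
```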
As I said at the outset, if I had done a diff on the two sdevice files, this would have shown me:
60c60
< clone Y 1 0 0 0 0 0 0 0
---
> clone N 1 0 0 0 0 0 0 0
319c319
< str Y 0 0 0 0 0 0 0 0
---
> str N 0 0 0 0 0 0 0 0
The reason that works is that link_unix apparently doesn't replace sdevice until the link is successful (sdevice is built from the individual files in /etc/conf/sdevice.d). That's very helpful for this kind of error, because it immediately shows you what has changed since the last successful link.
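If the diff is long, the same comparison can be boiled down to just the drivers whose Y/N flag flipped. The sketch below makes some assumptions: changed_drivers is a made-up name, each driver is taken to have one line per file (a driver with several lines would be reported by its last one), and lines starting with "#" or "*" are treated as comments.

```shell
# changed_drivers OLD NEW
# Hypothetical helper: print "name: old -> new" for every driver whose
# second field differs between OLD (sdevice, from the last good link)
# and NEW (sdevice.new, from the failed attempt).
changed_drivers() {
    awk '
        /^[#*]/ || NF < 2 { next }        # skip comments and blank lines
        FNR == NR { old[$1] = $2; next }  # first file: remember each flag
        ($1 in old) && old[$1] != $2 {    # second file: report differences
            print $1 ": " old[$1] " -> " $2
        }
    ' "$1" "$2"
}

# Example use:
#   cd /etc/conf/cf.d && changed_drivers sdevice sdevice.new
```

Unlike diff, this ignores changes in the other fields, so it complements the diff and does not replace it.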
Of course, there are other things that can go wrong. One I see now and then is a new device that has been partially installed or partially removed, where the kernel fails to link because enough of the device is still configured to confuse the build. In a case like this, you want to look in /etc/conf/cf.d/mdevice, and the offending device will probably be at the end of it. If you are not really sure, you can just comment out the line you think is the problem by putting a "#" at the beginning of the line; if the kernel then relinks, that was it. For example, here's the end of my mdevice; the e3H was the last thing I added to this machine:
vdsp    ocriI   ioc     vdsp    0       126     0       0       -1
vgic    ociI    ioc     vgic    0       127     1       1       -1
vkbd    ocwiI   ioc     vkbd    0       128     0       0       -1
vmouse  ociI    ioc     vmse    0       129     1       1       -1
vw      I       icS     vw      0       130     8       128     -1
net0    I       iSc     net0    0       131     1       256     -1
e3E     I       icSH    e3e     0       132     0       1       -1
ipl     Iocir   ico     ipl     0       133     1       1       -1
net1    -       iSc     net1    0       134     1       256     -1
e3H     I       icSH    e3H     0       135     0       1       -1
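Commenting out a suspect line by hand is easy enough, but a filter that writes the result to a new file makes it harder to damage the original. A minimal sketch, with the hypothetical name comment_out_driver, assuming the driver name is the first field on its line:

```shell
# comment_out_driver NAME FILE
# Hypothetical helper: copy FILE to standard output, putting "#" in
# front of every line whose first field is exactly NAME. FILE itself
# is not modified.
comment_out_driver() {
    awk -v name="$1" '
        $1 == name { print "#" $0; next }  # disable the suspect entry
        { print }                          # pass everything else through
    ' "$2"
}

# Example use (keep the original so you can put it back):
#   cd /etc/conf/cf.d
#   cp mdevice mdevice.save
#   comment_out_driver e3H mdevice.save > mdevice
```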
What about a corrupted driver? The errors you get will depend upon the nature of the corruption, but let's try some experiments (if you aren't comfortable and sure of yourself, don't try this on a working machine):
cd /etc/conf/pack.d/str
mv Driver.o Safe
date > Driver.o
cd /etc/conf/cf.d/
./link_unix
When I did this, I got a message saying that the file "Wed" (it happened to be Wednesday) couldn't be opened for input. Let's try something else:
cd /etc/conf/pack.d/str
cp /bin/ls Driver.o
cd /etc/conf/cf.d/
./link_unix
This time I got a message complaining that it couldn't open "file ELF". That would be a very definite sign of corruption: Driver files would always be "COFF".
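You can also check a suspect Driver.o before running the link at all by looking at its first few bytes. The sketch below is illustrative only: driver_format is an invented name, it relies on a POSIX "od" (with -A, -t and -N), and it assumes i386 COFF objects begin with the magic number 0x014c stored low byte first, while ELF files begin with 0x7f followed by "ELF".

```shell
# driver_format FILE
# Hypothetical helper: print ELF, COFF or unknown depending on the
# first bytes of FILE.
driver_format() {
    # First four bytes as lowercase hex with no separators, e.g. 7f454c46
    magic=$(od -An -tx1 -N4 "$1" 2>/dev/null | tr -d ' \n')
    case "$magic" in
        7f454c46) echo ELF ;;      # 0x7f 'E' 'L' 'F'
        4c01*)    echo COFF ;;     # i386 COFF magic 0x014c, little-endian
        *)        echo unknown ;;  # plain text, truncated file, and so on
    esac
}

# Example use:
#   for f in /etc/conf/pack.d/*/Driver.o
#   do
#       echo "$f: $(driver_format "$f")"
#   done
```

Anything other than COFF in that listing deserves a closer look, although a file with the right magic number can of course still be damaged further in.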
To put everything back as it was:
cd /etc/conf/pack.d/str
rm Driver.o
mv Safe Driver.o
I hope this gives you a little more confidence should you ever run into a broken kernel relink. Certainly other errors are possible, but these are the most common I've seen.
More Articles by Tony Lawrence © 2011-03-17 Tony Lawrence