The subject of disk fragmentation will almost always draw heated arguments but seldom gets treated in its entirety. I'm going to try to do that here, but will probably miss a point or two: this is an extraordinarily complex subject and honestly there aren't any easy answers.
The basic idea is this: if a file's disk blocks are contiguous, following one after another on the physical disk drive, the disk heads won't have to move very much (perhaps not even at all) when reading the file. As moving those heads obviously takes more time than not moving them, fragmentation is undesirable.
That doesn't mean your disk needs defragging, though. The first thing you need to know is that many modern file systems defrag themselves "on the fly". For example, the HFS+ file system currently used in Mac OS X defrags some files automatically (from https://www.osxbook.com/book/bonus/chapter12/hfsdebug/fragmentation.html):
When a file is opened on an HFS+ volume, the following conditions are tested:

* If the file is less than 20 MB in size
* If the file is not already busy
* If the file is not read-only
* If the file has more than eight extents
* If the system has been up for at least three minutes

If all of the above conditions are satisfied, the file is relocated -- it is defragmented on-the-fly.
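The five conditions quoted above amount to a single yes/no test. Here is a minimal sketch of that test in Python. It is not Apple's code: the names FileState and should_relocate are made up for illustration, and I'm assuming "20 MB" means 20 * 1024 * 1024 bytes.

```python
from dataclasses import dataclass

# Thresholds taken from the quoted list. The byte value of "20 MB" is an assumption.
MAX_SIZE_BYTES = 20 * 1024 * 1024   # "less than 20 MB in size"
MIN_EXTENTS = 8                     # "more than eight extents"
MIN_UPTIME_SECONDS = 3 * 60         # "up for at least three minutes"


@dataclass
class FileState:
    """Hypothetical snapshot of what is known about a file when it is opened."""
    size_bytes: int
    busy: bool
    read_only: bool
    extent_count: int


def should_relocate(f: FileState, uptime_seconds: float) -> bool:
    """Return True only if every condition in the quoted list holds."""
    return (
        f.size_bytes < MAX_SIZE_BYTES
        and not f.busy
        and not f.read_only
        and f.extent_count > MIN_EXTENTS
        and uptime_seconds >= MIN_UPTIME_SECONDS
    )


small_and_scattered = FileState(5 * 1024 * 1024, False, False, 12)
print(should_relocate(small_and_scattered, uptime_seconds=600))
```

Notice how narrow the test is: a large file, a busy file, or a file with only a handful of extents is left alone.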
Modern file systems tend to use a "worst fit" algorithm when trying to decide where to put a file initially - that is, rather than looking for the smallest block of free space that will fit the new file and jamming it in between other files, they'll look for a big unused chunk of space and put the new file far away from any other files - and it certainly helps that we often have very large, mostly unused disks now! The post "Why doesn't Linux need defragmenting?" attempts to show that graphically. I have a little quibble with it: Linux has many possible filesystems and not all of them approach fragmentation the same way, but it can help you visualize the basic concept.
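To make the difference concrete, here is a toy sketch of the two placement policies. It is not taken from any real filesystem: the free-space list, the (start, length) representation and the function names are all assumptions for illustration.

```python
def best_fit(free_extents, size):
    """Start of the smallest free extent that can hold `size` blocks, or None."""
    candidates = [(length, start) for start, length in free_extents if length >= size]
    return min(candidates)[1] if candidates else None


def worst_fit(free_extents, size):
    """Start of the largest free extent that can hold `size` blocks, or None."""
    candidates = [(length, start) for start, length in free_extents if length >= size]
    return max(candidates)[1] if candidates else None


# Hypothetical free space as (start_block, length_in_blocks):
# two small gaps between existing files and one big empty region.
free = [(0, 12), (100, 10), (500, 4000)]

print(best_fit(free, 10))   # 100: squeezed into the gap that exactly fits
print(worst_fit(free, 10))  # 500: placed in the big empty region
```

With best fit the new file has no room to grow without fragmenting. With worst fit it has thousands of free blocks right behind it.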
Before we get too excited about the various techniques to avoid defragmentation, consider a Linux or Unix system supporting a few dozen programmers and web developers. Obviously they'll mostly be working on different files, so those disk heads are going to be jumping around from file to file constantly. Fragmentation is of less concern on such a system, so you'll often hear people arguing that fragmentation has nothing to do with multi-user systems.
But what about a large multi-user system where there's a big database and the users are all accessing that? Wouldn't that need to worry about fragmentation? Well, perhaps more so than the system where everybody is doing different things, but even so it's still likely that the users will be asking for different sections, so the disk heads will be fluttering about just the same.
Well maybe not. Instead of just asking the drive to give you the block Joe wants, why not wait a little bit and see what Sarah and Jane might need? If you do that, and Jane's next requested block is close to where the heads are now, why not get that first? Indeed, that's a common optimization technique.
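This reordering is usually called elevator seeking. Here is a simplified sketch of the idea, not any particular kernel's scheduler: block numbers stand in for head positions, and the request queue and starting position are invented for the example.

```python
def elevator_order(head, requests):
    """Serve everything at or above the head on one upward sweep, then sweep back down."""
    upward = sorted(r for r in requests if r >= head)
    downward = sorted((r for r in requests if r < head), reverse=True)
    return upward + downward


def head_travel(head, order):
    """Total distance the head moves when serving requests in the given order."""
    total = 0
    for block in order:
        total += abs(block - head)
        head = block
    return total


# Hypothetical queue: requests from several users in arrival order.
pending = [900, 120, 880, 150, 910]
head = 500

print(head_travel(head, pending))                        # 3430, arrival order
print(head_travel(head, elevator_order(head, pending)))  # 1200, elevator order
```

The same five requests cost roughly a third of the head movement once they are sorted into sweeps. That saving comes from reordering the queue. It has nothing to do with whether any one file is contiguous.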
But wait: modern disk drives are pretty smart. Why couldn't they optimize requests the same way? In fact, they can. And they probably need to.
Logical block addressing throws a new wrinkle into this. The block numbers the OS wants to access are translated into real physical addresses. On the face of it, that shouldn't necessarily affect fragmentation: logical block 1 is likely to still be physically right beside logical block 2 or at least close by. But not if there has been bad sector reassignment: in that case, the supposedly contiguous data may in fact have a chunk that has been moved far, far away. The data is contiguous as far as the OS knows, but in fact it may not be. If the disk drive itself implements elevator seeking, it might be able to intelligently work around that to soften the effect.
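A toy model shows how one reassigned sector breaks up a file the OS believes is contiguous. This is only a sketch under loud assumptions: real drives do not map logical to physical addresses one-to-one, and the remap table, spare-area address and function names here are invented.

```python
def physical_address(lba, remapped):
    """Toy translation: identity mapping unless the sector has been reassigned."""
    return remapped.get(lba, lba)


def count_jumps(lbas, remapped):
    """Count places where the next physical block does not directly follow the previous one."""
    jumps = 0
    previous = None
    for lba in lbas:
        current = physical_address(lba, remapped)
        if previous is not None and current != previous + 1:
            jumps += 1
        previous = current
    return jumps


# A file the OS sees as eight contiguous logical blocks.
file_blocks = range(1000, 1008)

print(count_jumps(file_blocks, {}))                 # 0: truly contiguous
print(count_jumps(file_blocks, {1003: 9_000_000}))  # 2: out to the spare area and back
```

An OS-level defragmenter works only with the logical block numbers, so it has no way to see those two jumps, let alone fix them.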
Note that disk caching also tends to destroy any value from OS-based anticipation and reordering: the drive heads may not have to move at all because the data you want is already in the drive's own on-board cache, but OS-based reordering may actually cause movement when none was necessary!
Partitioning can help reduce fragmentation. For example, if you put /tmp and other frequently used file systems in their own partitions, files coming and going won't have to compete for disk blocks with more stable files. The other way to look at that is that a heavily accessed file like that multi-user database mentioned above might benefit from its own separate partition and filesystem.
LVM screws that royally. Again, it's reasonable to assume that when first created, an LVM file system is likely to be built from contiguous blocks. But if it is extended, who knows where the new blocks came from?
Let's just pause for a second: imagine an LVM filesystem on a disk capable of elevator seeks that also has a large on-board cache and is being used on a multi-user system with hundreds of users all charged with different responsibilities.. how much do you think you need to worry about fragmentation?
Ok, but MY system is single user. Well.. sort of. Actually, whether it's Linux or OS X or Windows XP, it isn't really single user anymore. There's a lot of "system" stuff constantly going on - logging, daemons checking their config files, downloading OS updates and patches.. it's not just you asking for disk blocks, is it?
I just ran "sar -d 5 30" on my Mac and walked away.. Here are the results, stripped of the lines where there was no disk activity at all:
New Disk: [disk1] IODeviceTree:/PCI0@0/USB7@1D,7/@3:0
New Disk: [disk0] IODeviceTree:/PCI0@0/SATA@1F,2/PRT2@2/PMP@0/@0:0

13:05:08  device   r+w/s   blks/s
13:05:13  disk0        1       11
13:05:23  disk0        1        9
13:05:33  disk0        1        9
13:05:38  disk0        4       85
13:05:43  disk0        1        9
13:05:53  disk0        1        9
13:06:03  disk0        1        9
13:06:08  disk0        2       39
13:06:13  disk0        1       11
13:06:23  disk0        1        9
13:06:33  disk0        1        9
13:06:38  disk0      186    19165
13:06:43  disk0       52    13837
13:06:53  disk0        1        9
13:07:03  disk0        1       11
13:07:08  disk0        3       82
13:07:13  disk0        1        9
13:07:23  disk0        1        9
13:07:33  disk0        1        9
13:07:38  disk0        2       34

disk1 IODeviceTree:/PCI0@0/USB7@1D,7/@3:0
Average:  disk1        0        0
disk0 IODeviceTree:/PCI0@0/SATA@1F,2/PRT2@2/PMP@0/@0:0
Average:  disk0        9     1113
I wasn't doing anything, but OS X found plenty of reason to access my disk, didn't it? If I had been reading files, my reads would have had to compete with system reads.. what does that do with the carefully managed defragmentation mentioned above?
The other thing I often hear is "What the heck - it only takes a few minutes and it can't hurt anything". Well, it can hurt: if the defrag gets interrupted midstream, that could hurt a lot. Defragging also obviously adds more wear and tear: you are reading and rewriting a lot of stuff and that does add up. But let's look at it from another slant: what file or files is the defragger worried about?
That is, are the files that cause the defrag utility to want to rearrange our whole disk anything that we are going to be accessing sequentially? The answer might be "yes", but it also might be "no". The defragger may see a system log file scattered willy-nilly here and there.. do we care? No, because we certainly are not reading that front to back on a daily basis and the OS itself is only adding to the end of it - it may be horribly fragmented, but that's never going to cause a disk head to move anywhere unusual.
But no, the fragmented file is that big database we mentioned earlier - this time on a single user system. The stupid thing is all over your drive, here, there, everywhere.. surely a good defragging is in order?
Maybe. But often large databases use indexes.. your access probably involves reading that index (which may remain sitting in cache after the first read) and jumping out to particular sections of the big file from there - no sequential access at all. If that's the case (and it often is), the fragmentation of the big file is completely irrelevant: defragging it won't speed up a thing.
As I said at the beginning, it's a big, complicated subject and I'm sure I've missed something. Feel free to add your thoughts to the comments.
© 2010-10-30 Anthony Lawrence