Sparse files are a result of the way file systems allocate storage space. Sparse files appear to take up much more disk space than they actually do (use ls -s, which shows blocks used, to see if a file is sparse).
Most modern file systems support sparse files to some degree or another. Not all utilities do, though, as we shall see.
Traditional Unix file systems have thirteen disk block pointers in the inode. The first 10 are direct pointers; that is they point to actual disk blocks.
The next pointer is an indirect pointer: it points to a block that contains pointers to actual storage blocks. If the pointers are 32 bit numbers (more likely to be larger nowadays but this makes the math easy for us), there will be 128 blocks covered by the 11th pointer. The next points to double indirect blocks: blocks which themselves contain more pointers to the actual blocks. The 13th is triple indirect: pointers to blocks of pointers to blocks of pointers. This scheme lets you have very large files without very large inode records.
If a program opens a new file and seeks to a point beyond that covered by the first 10 pointers and writes data there, the OS doesn't bother to allocate data blocks that are not used, so those first 10 pointers are still empty. The same is true for some of the indirect pointers if the next seek extends beyond their reach. Database files using hash keys often create such files as a natural result of the hash key creation. The file may show it has a very large size in an ls -l listing, but it could actually be consuming only a very few disk blocks. If you read data from such a file, the OS just returns null bytes for the unallocated space, which means that naively copying it will NOT create another sparse file (although GNU cp is aware of sparse files and will preserve the sparseness). The same is true when backing up to tape (again, modern utilities can properly handle this).
The need for sparse files is almost always for hashed files.
A hash is an access/storage method where a mathematical function is applied to a key. The number that results from that function is used as the record number, or offset of the file. For example, suppose we had the keys mary and tom (with associated data, of course), Our hash function turns the word "mary" into the number 45, and turns "tom" into 128. Pretending that the data stored is 512 bytes for each record, you'd find mary's data 512 * 45 bytes from the start of the file, and tom's at 512 * 128 bytes. This sort of "indexing" with hashed keys gives incredibly fast access to records (there are issues with how to deal with keys that hash to the same value, biut we'll ignore that here).
A good hash function is going to generate widely disparate numbers (that's one of the ways to minimize the duplication problem). So rather than 45 and 128, we'd really get something like 2 and 438,785. Now suppose that these were the only data stored in the file so far: it would be a pretty big file, over 200 megabytes (433,785 * 512), but there's really only 1024 bytes of real data in it- a whole bunch of wasted space.
As noted above, the first ten pointers point directly to data blocks, the next points to indirect blocks which in turn point to real data blocks and so on.
So, the "mary" data ends up in the second data block (assuming 1k blocks here) and the "tom" data ends way out in one of the double indirect blocks somewehre. None of the other pointers are being used. No data needs to be stored, so no need to waste space: this is a sparse file.
If you look at it with "ls -l" it looks like it's 200+ MB, but if you removed it, you wouldn't gain 200 MB of space. If you do something that reads it sequentially, the driver just returns nulls for the data that isn't there. And that's a problem some older utilities: ordinary tape utilities like tar would write those nulls, using up 200+ MB of tape, and if it is restored with the same non-aware utility, the data blocks actually get allocated and filled with ascii 0's- now you really have used 200 MB of space.
GNU tar and cpio have options for handling sparse files correcty, but you would need to watch out for this on older Unix (SCO Unix, for example).
GNU cp is even better: if a file is NOT sparse, but could be, you can create a copy that IS sparse.
As implied in the previous paragraph, you may find sparse files in VM's. This comes from telling the new VM that you want (for example) a 500GB file system but don't want it all allocated now. See Using thin provisioned disks with virtual machines.
I suppose there is the case (which I've certainly used where I could) where some sequentially assigned number is part of the data- in that case you can use that number in the same manner as a hash, and it is, in effect, a degenerate hash- the function is just times 1 or plus zero- but to my mind that's a hash even though you don't actually do any math.. Purists can disagree, of course, but I ignore purists anyway :-) I'm pretty sure that the rationale for sparse files was hashing- I'm not sure whether the driving force was efficiency in creating and reading or to save space, but certainly both space and media speed were much more critical then than now.
Apple's HFS+ file system does NOT support sparse files (at least not as I write this). Weirdly, you can have a disk image of another HFS+ file system stored on HFS as a sparse file.
On a Linux system:
# dd of=sparse bs=1k seek=6000 count=0 0+0 records in 0+0 records out 0 bytes (0 B) copied, 2.0325e-05 s, 0.0 kB/s # ls -lskh sparse 0 -rw-r--r-- 1 root root 5.9M Aug 2 07:29 sparse
MAC OS X on HFS+:
$ dd of=sparse bs=1k seek=6000 count=0 0+0 records in 0+0 records out 0 bytes transferred in 0.000008 secs (0 bytes/sec) $ ls -lksh sparse 6000 -rw-r--r-- 1 tony staff 5.9M Aug 2 07:32 sparse
Almost 6 MB of wasted space there..
This also means that you have to be careful about copying and backing up VM's stored on HFS+ disks. See Excluding the Virtual Machines folder from being backed up by Time Machine and Time Machine-friendly HDD images..
Fortunately, HFS+ is not your only choice. You can create UFS file systems if you need sparse files and maybe someday it may get ZFS or at least something more modern than HFS+.
Got something to add? Send me email.
More Articles by Anthony Lawrence © 2013-08-02 Anthony Lawrence
As I said in my comments to the committee, [Fortran 90' would be a] nice language, too bad it's not Fortran. (Dan Davison)