One of the problems computers have is identifying data. Given a random chunk of bytes in a disk file, how can anyone (you or the computer) know what their purpose is? Is it a program? A configuration file or other data for some other program? How can you tell?
Sometimes the answer was simply "you can't". The earliest systems didn't do much in this area. The only real attempt was to use file name extensions: a DOS ".com" file is nothing more than the actual machine code that will execute on the CPU - and I mean absolutely nothing more - it IS the code. Therefore, any random block of data, if saved with a ".com" extension, could be loaded and executed (not likely to have good results, of course).
But even poor old DOS had .exe files, which are a bit more complicated. The extension is part of what tells DOS that you have a .exe file, but the file also has a specific structure; you can't just take any bunch of junk, rename it as "junk.exe" and convince DOS it ought to try running it. Specifically, .exe files use a "magic number": hex 4d5a (the ASCII characters "MZ") in the first two bytes.
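To make the magic number idea concrete, here is a minimal Python sketch of that kind of check. The function name looks_like_dos_exe is made up for illustration, and this only tests the first two bytes; it is not how DOS itself validates the rest of the .exe header.

```python
# The two-byte signature described above: hex 4d 5a, i.e. ASCII "MZ".
MZ_MAGIC = bytes.fromhex("4d5a")


def looks_like_dos_exe(path):
    """Return True if the file at `path` starts with the MZ signature.

    Hypothetical helper for illustration: it ignores the extension
    entirely and looks only at the first two bytes.
    """
    with open(path, "rb") as f:
        return f.read(2) == MZ_MAGIC
```

A file of junk renamed to "junk.exe" fails this test, while a file that really starts with "MZ" passes no matter what it is called.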
Mac OS X, of course, has a much more sophisticated way to do this kind of thing, but it still has a magic number file, /usr/share/file/magic, and "man magic" will tell you all about it.
But neither magic numbers nor extensions really provide a satisfactory answer to the problem. Enter UTIs (Uniform Type Identifiers).
UTIs (stored in metadata, of course) identify data. It's not just files, by the way; streaming data, cut-and-paste data, a disk volume - if it exists, it can have a UTI. Obviously this is much more powerful than identifying things by extensions or by magic numbers. For one thing, UTIs can be related to other UTIs: if a .html file's UTI conforms to (that is, is a subtype of) the text UTI, then any app that handles text can handle it. Extensions aren't needed, and data files are easily associated with applications.
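The "related to other UTIs" idea can be sketched as a small lookup in Python. This is a toy model, not Apple's API: the CONFORMS_TO table and the conforms_to function are invented for illustration, and the table holds only a few entries, not the real hierarchy.

```python
# Toy table: each identifier maps to the identifiers it directly
# conforms to. Illustrative only; the real hierarchy is much larger.
CONFORMS_TO = {
    "public.html": ["public.text"],
    "public.plain-text": ["public.text"],
    "public.text": ["public.data"],
    "public.data": [],
}


def conforms_to(uti, target, table=CONFORMS_TO):
    """Return True if `uti` is `target` or inherits from it, directly
    or through any chain of parents in `table`."""
    if uti == target:
        return True
    return any(conforms_to(parent, target, table)
               for parent in table.get(uti, []))


# An app that declares it handles "public.text" would accept HTML too.
print(conforms_to("public.html", "public.text"))
print(conforms_to("public.text", "public.html"))
```

The relationship only runs one way: HTML counts as text, but plain text does not count as HTML.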
When I first read about this, my immediate reaction was to think "MIME?". MIME types aren't the answer, at least in Apple's opinion. MIME starts off in the right direction, but it only has a two-level hierarchy and requires approval by the IANA for any type that doesn't use the "x-" prefix. The UTI scheme that Apple proposes is a more extensible format, for which it has provided top-level identifiers such as public.text, public.plain-text, etc. If you want to define a type yourself, you don't need Apple: "com.yourdomain.yourtype" is how you do it. In your app's Info.plist, you declare that and say what UTIs it inherits from. See https://developer.apple.com/documentation/Carbon/Conceptual/understanding_utis/ for more details; https://arstechnica.com/reviews/os/macosx-10.4.ars/11 also has a nice overview of all this.
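As a rough idea of what such a declaration looks like, the Python sketch below uses the standard plistlib module to print an Info.plist fragment. The key names (UTExportedTypeDeclarations, UTTypeIdentifier, UTTypeConformsTo, UTTypeTagSpecification) are the ones I understand Apple's documentation to use; the description, the parent type and the "yourtype" extension are placeholders I made up. Check Apple's documentation before relying on any of it.

```python
import plistlib

# Hypothetical declaration for a custom type. Only the identifier comes
# from the article; the other values are illustrative placeholders.
declaration = {
    "UTExportedTypeDeclarations": [
        {
            "UTTypeIdentifier": "com.yourdomain.yourtype",
            "UTTypeDescription": "Example document type",
            # The UTIs this type inherits from.
            "UTTypeConformsTo": ["public.plain-text"],
            # Optional tie-in to an old-style filename extension.
            "UTTypeTagSpecification": {
                "public.filename-extension": ["yourtype"],
            },
        }
    ]
}

# plistlib.dumps produces XML plist bytes by default.
xml_bytes = plistlib.dumps(declaration)
print(xml_bytes.decode("utf-8"))
```

In a real app these keys would sit alongside everything else in Info.plist rather than in a file of their own.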
© 2011-03-21 Tony Lawrence