APLawrence.com -  Resources for Unix and Linux Systems, Bloggers and the self-employed

Mess of Metadata

© May 2019 Anthony Lawrence

There was a long series of comments at the article about mdfind that got very confused talking about OS X metadata. I thought I'd try to straighten some of that out in a separate post - though honestly I'm still easily confused myself!

First, what metadata are we talking about? For an old Unix hand, metadata is the information stored in the inode: file size, permissions, pointers to data blocks, link counts. That's traditional metadata.

However, there's more metadata today, not just in Unix systems but especially in Mac OS X. There are extended permissions, ACLs, extended attributes (xattrs), and Spotlight-related metadata. It's very hard to ferret all this out of Google because similar terms are used for dissimilar features.

Macs had "resource forks" early on. OS X still has resource forks, but apparently Apple would like to move away from them. That's probably why things get so darn confusing: search for information on metadata and OS X and you'll find lots of pointers to things that talk about resource forks, but that material describes a deprecated mechanism and usually doesn't apply to OS X.

Let's take Spotlight metadata first. These are specific keys that Spotlight indexes. For example, you can do things like this:

mdfind 'kMDItemFSSize > 20000000'
mdfind 'kMDItemFinderComment == "script application wrapper"'
mdfind 'kMDItemTextContent == "*Seneca*" && kMDItemFSName != "*emlx"'
mdfind 'kMDItemTextContent == "*Seneca*" && kMDItemContentType != "com.apple.mail.emlx"'
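
If you want to build queries like these from a script, here is a minimal Python sketch. The helper names (build_query, mdfind_argv, run_mdfind) are my own invention for illustration; the only things taken from the real tool are the mdfind command itself, its -onlyin option, and the && operator used above.

```python
import shutil
import subprocess


def build_query(*clauses):
    """Join Spotlight query clauses with && (logical AND)."""
    return " && ".join(clauses)


def mdfind_argv(query, onlyin=None):
    """Return the argument list for an mdfind call.

    onlyin, if given, restricts the search to one directory (mdfind -onlyin).
    """
    argv = ["mdfind"]
    if onlyin is not None:
        argv += ["-onlyin", onlyin]
    argv.append(query)
    return argv


def run_mdfind(query, onlyin=None):
    """Run mdfind and return the matching paths.

    Returns None when mdfind is not on the PATH (for example, on a non-Mac system).
    """
    if shutil.which("mdfind") is None:
        return None
    result = subprocess.run(
        mdfind_argv(query, onlyin), capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()


query = build_query('kMDItemTextContent == "*Seneca*"', 'kMDItemFSName != "*emlx"')
print(mdfind_argv(query))
```

Passing the query as a single list element avoids shell quoting problems, which is the main reason to bother with a wrapper at all.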

How does Spotlight get the info to index? It asks a Spotlight Importer. This BASICS OF SPOTLIGHT page explains:

Once the Mac OS does kick-off the extraction of metadata from a file, it does so through a Spotlight Importer. Spotlight Importers are plug-ins for the Mac OS that a developer provides specifically for helping files created by their applications to be searchable within Spotlight. Spotlight crawls through its list of changed files, handing each one to the appropriate importer. The importers then read the files, compile a list of metadata, and then hand the metadata back to Spotlight. At this point, the changed file is available for searching within Spotlight.
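
To make that flow concrete, here is a toy model of it in Python. This is not Apple's API: real importers are plug-ins, and as far as I know Spotlight chooses them by content type rather than by file extension. Every function and table name below is hypothetical; only the kMDItem key names come from Spotlight.

```python
import os


def text_importer(path):
    """Hypothetical importer: returns the file's text as searchable content."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return {"kMDItemTextContent": fh.read()}


def fallback_importer(path):
    """Used when no importer claims the file: contributes nothing extra."""
    return {}


# Assumption: the extension stands in for the content type a real system would use.
IMPORTERS = {".txt": text_importer}


def index_changed_files(changed_paths, index):
    """Hand each changed file to its importer and store what comes back."""
    for path in changed_paths:
        ext = os.path.splitext(path)[1].lower()
        importer = IMPORTERS.get(ext, fallback_importer)
        metadata = importer(path)
        # Filesystem-level keys need no importer; the indexer can fill them in itself.
        metadata["kMDItemFSName"] = os.path.basename(path)
        metadata["kMDItemFSSize"] = os.path.getsize(path)
        index[path] = metadata
    return index
```

Note that a file with no matching importer still ends up with a few filesystem keys in this model, because the indexer supplies those on its own.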

OK, great, but where does the metadata that the importer supplies come from? Apparently, that's up to the developer. Apple's Extracting Metadata from Documents says:

Avoid the use of external files to store metadata content. All critical metadata should be in the same file as the data. The system store of metadata should be considered volatile.

I want to quibble a little: if it's stored in the data file, it's really not metadata, is it? But never mind. Some apps do it that way: ID3 tags, for example. But other apps do not. In my ~/Library/Caches/Metadata I found some interesting stuff: *some* apps store Spotlight metadata there. I found:

$ ls  ~/Library/Caches/Metadata 
Billings		Microsoft		Safari
Camino			Precipitate		com.evernote.Evernote

If I look in Billings, I find this:

                <string>Extreme rework</string>
                <string>Extreme rework</string>

But obviously not all apps store their Spotlight-related metadata there. Entourage does, as described in this HOW DOES ENTOURAGE WORK WITH SPOTLIGHT? bit:

When you enable Spotlight indexing within Entourage, a "cache" file is created for each item within your Entourage database. If you have 100,000 e-mail messages in your Entourage database, 100,000 cache files will be created. If you want to see the cache files, you can find them within your Library/Caches/Metadata/Microsoft folder.

Each cache file contains all the metadata that will be needed for indexing by Spotlight. All changes within Entourage are reflected to the cache files. Create a new item and a new cache file will be created. Updated an item and its cache file will update. Delete an item and its cache file will be deleted. With all these changes, Spotlight receives file change notifications and eventually will ask the modified cache files to go through the import process using the Entourage Spotlight Importer.

But there's no iTunes folder there.

There are also defaults. If I create a text file with "date > file", running "mdls file" shows these Spotlight keys:

kMDItemContentCreationDate     = 2009-04-12 12:07:02 -0400
kMDItemContentModificationDate = 2009-04-12 12:07:02 -0400
kMDItemContentType             = "public.data"
kMDItemContentTypeTree         = (
kMDItemDisplayName             = "file"
kMDItemFSContentChangeDate     = 2009-04-12 12:07:02 -0400
kMDItemFSCreationDate          = 2009-04-12 12:07:02 -0400
kMDItemFSCreatorCode           = ""
kMDItemFSFinderFlags           = 0
kMDItemFSHasCustomIcon         = 0
kMDItemFSInvisible             = 0
kMDItemFSIsExtensionHidden     = 0
kMDItemFSIsStationery          = 0
kMDItemFSLabel                 = 0
kMDItemFSName                  = "file"
kMDItemFSNodeCount             = 0
kMDItemFSOwnerGroupID          = 501
kMDItemFSOwnerUserID           = 501
kMDItemFSSize                  = 29
kMDItemFSTypeCode              = ""
kMDItemKind                    = "Plain text"
kMDItemLastUsedDate            = 2009-04-12 12:07:02 -0400
kMDItemUsedDates               = (
    2009-04-12 00:00:00 -0400

Obviously the "date" command didn't create those. Spotlight won't even index that file (no extension), but it has some default keys just the same! See Spotlight, mdfind (Mac OS X Tiger searching) for more on that.
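
If you want to work with mdls output in a script, a rough Python sketch like this turns the simple "key = value" lines into a dictionary. The parse_mdls name is mine, the sample text is a few lines in the same shape as the listing above, and array values (the parenthesized ones) are deliberately skipped rather than parsed.

```python
def parse_mdls(text):
    """Parse the scalar 'key = value' lines of mdls output into a dict of strings."""
    attrs = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        key, value = key.strip(), value.strip()
        # Skip array openers, array bodies, and anything that is not a key line.
        if not sep or not key.startswith("kMD") or value == "(":
            continue
        attrs[key] = value.strip('"')
    return attrs


sample = """kMDItemContentType             = "public.data"
kMDItemFSName                  = "file"
kMDItemFSSize                  = 29
kMDItemUsedDates               = (
    2009-04-12 00:00:00 -0400
)"""

print(parse_mdls(sample))
```

All values come back as strings; converting sizes to integers or dates to datetime objects is left to the caller.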

You can add metadata yourself and can modify one item of Spotlight's domain.

$ xattr -w mystuff "hello there" file
$ xattr -l file
mystuff: hello there

The only Spotlight-related data you can modify is kMDItemFinderComment. You do that with Finder's Get Info; after adding a comment, xattr shows this:

$ xattr -l file
com.apple.metadata:kMDItemFinderComment:
0000   62 70 6C 69 73 74 30 30 5A 4D 79 20 43 6F 6D 6D    bplist00ZMy Comm
0010   65 6E 74 08 00 00 00 00 00 00 01 01 00 00 00 00    ent.............
0020   00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00    ................
0030   00 00 00 13                                        ....

mystuff: hello there
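
That hex dump starts with "bplist00", which marks it as a binary property list. As a quick check, Python's standard plistlib module can decode it; the bytes below are copied from the dump above, and the variable names are just for illustration.

```python
import plistlib

# Hex bytes copied from the xattr dump (offsets and ASCII column removed).
hex_dump = """
62 70 6C 69 73 74 30 30 5A 4D 79 20 43 6F 6D 6D
65 6E 74 08 00 00 00 00 00 00 01 01 00 00 00 00
00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 13
"""

# Drop all whitespace before converting the hex digits to raw bytes.
raw = bytes.fromhex("".join(hex_dump.split()))

# plistlib detects the binary format from the "bplist00" header.
comment = plistlib.loads(raw)
print(comment)  # prints: My Comment
```

So the attribute value is nothing more than the comment string wrapped in a binary plist.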

Note that this gives us the clue as to where the data was stored, but I don't find a file with that "com.apple.metadata" name. I do find:


But those aren't related.

So what do we know? Well, we know it's up to the application responsible for a file to provide importer code. It's up to the same app to decide where to store metadata. That implies that for data that would be the same across all files of a given type, there's no need to store it anywhere: the importer could generate the response when Spotlight asks.

That's as far as I've gone. Maybe someone else can add more.

Got something to add? Send me email.


Mon Jun 1 16:42:53 2009: 6431   TomAndersen

Spotlight importers import metadata for one (or a group) of file types. But Apple has a system to import metadata on any file: if you set an xattr extended attribute with a plist that is not a dictionary, under a key like com.apple.metadata:YourKeyNameHere then in the spotlight database you will find an entry under 'YourKeyNameHere'. (you see them by doing an mdls on the file). We have developed OpenMeta to come up with a standard set of keys and safe procedures for setting this data. Apple itself uses this feature all over the place, with kMDItemWhereFroms, with time machine, and many other places.

By creating a spotlight plugin, a search for 'tag:superman' in the Spotlight UI will find all files with a com.apple.metadata:kOMUserTags array with 'Superman' as an array element.

See (link)

--Tom Andersen

Mon Jun 1 19:29:04 2009: 6433   TonyLawrence

Thanks for the info and link!

