While it's not a specific New Year's Resolution, I've decided to try and get better about how I organise my photos. I understand the importance and the general 'how to' of organising files; however, I've never been very good at actually implementing it.
Generally after a photography 'session' (sometimes a few days after), I will copy the images from the card to my computer. If I was mainly photographing landscape style photos, then I'll put them in a date based subfolder of a folder with the location name. If I was photographing plants or insects, then I'll separate the images by subject, and put each one in a different subfolder of a 'Needs sorting' folder.
Now, I don't actually intend to change that behaviour, as that system works well for me. However, what I do intend to change is what happens to the photos after that.
Previously, nothing would happen to the photos until (if) I came to processing them. After processing them, I would then add metadata, such as shooting info, keywords, description etc. (Actually sometimes I would add shooting info after copying the files from the memory card, but quite often I would forget).
The issue with this is that I have a large number of photos that I haven't processed, and these photos have nothing to identify them. So, for example, if I wanted to find a nice sunset photo, I would probably have 100s of unprocessed sunset photos, but no easy way to find them.
Similarly, by the time I got round to processing a photo I would often have forgotten the shooting details. While this information isn't important for most people, it is useful for me if I want to find an image to illustrate a point, such as the use of a polarising filter.
So, my aim is to try to make sure I add relevant keywords and shooting data to my photos as part of the file 'ingest' process.
Another, related issue is that of image filenames. So far I have just been copying the photos straight from the memory card. This leaves me with filenames like _MG_6031.CR2. The issue with this is that these numbers loop every 10,000 files. Plus, I currently shoot with two Canon cameras, both of which use the same file naming conventions. So ending up with duplicate filenames is quite easy.
Because of my filing system, and the fact that I don't use both my Canon cameras to shoot the same thing at the same time, the duplicate filename issue doesn't cause a problem for storing files. However, it does cause a problem if I want to find the RAW of a JPEG I've uploaded to the web.
The issue isn't a big one: I'll likely only have 3 or 4 images with the same filename, so it's easy to quickly check which one is the one I want. But I was thinking there must be a better system than this.
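To get a feel for how often the same camera filename turns up in more than one folder, a rough Python sketch like the one below could be used. The function name `find_duplicate_names` is my own invention for illustration, and it only compares bare filenames; it knows nothing about the image contents.

```python
from collections import defaultdict
from pathlib import PurePath


def find_duplicate_names(paths):
    """Group paths by bare filename and keep only names seen more than once.

    `paths` is any iterable of path strings. Hypothetical helper, not part
    of any photo software.
    """
    by_name = defaultdict(list)
    for p in paths:
        by_name[PurePath(p).name].append(str(p))
    # Only names that occur in two or more places are of interest.
    return {name: found for name, found in by_name.items() if len(found) > 1}
```

In practice the list of paths could come from something like `Path("Photos").rglob("*.CR2")`, where `Photos` stands in for wherever the image folders actually live.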
ImageUniqueID EXIF value
In looking into this, it seems there is an ImageUniqueID EXIF property. This seems like a good candidate to tie RAWs and JPEGs together, and could be used as the RAW filename upon ingest to allow searching via filename rather than metadata (should be faster).
However, when I looked into this further, it seems that most cameras and software do not make use of this EXIF tag. The only references I could find said that the tag is displayed by Google Picasa.
ImageUniqueID with Picasa
I downloaded Picasa to see how that works. It assigns the RAW file a 32 digit hex value, e.g. fcde1dbd711d7bb20000000000000000. It ignores the ImageUniqueID EXIF value of the RAW file if it already exists, and will still use its own generated unique ID as above.
When you convert the RAW file to a JPEG, the JPEG is assigned an ImageUniqueID, which is recorded in the EXIF. This ID is based on the unique ID that Picasa assigned to the RAW file, using the same first 16 digits for the ID as the RAW file. The last 16 digits are unique to that JPEG, e.g.
This is quite a sensible system, as it ensures that each JPEG has its own unique ID while still being able to be traced back to the RAW original.
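As a small illustration of the scheme described above, the Python sketch below checks whether a JPEG's ID shares its first 16 hex digits with a RAW's ID. This is only my reading of the behaviour I observed, not a documented Picasa format, and the function names are hypothetical.

```python
HEX_DIGITS = set("0123456789abcdef")


def raw_prefix(image_unique_id):
    """Return the first 16 hex digits of a 32 digit ID (the part shared with the RAW)."""
    value = image_unique_id.strip().lower()
    if len(value) != 32 or not set(value) <= HEX_DIGITS:
        raise ValueError("expected a 32 digit hex value")
    return value[:16]


def same_raw_source(raw_id, jpeg_id):
    """True if the JPEG's ID starts with the same 16 digits as the RAW's ID."""
    return raw_prefix(raw_id) == raw_prefix(jpeg_id)
```

For the RAW ID shown earlier, any JPEG ID beginning with `fcde1dbd711d7bb2` would be treated as coming from that RAW.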
I did a couple more tests with Picasa. A copy of a RAW file is assigned the same ImageUniqueID as the original. Renaming the original does not affect the ImageUniqueID either. But if you modify the EXIF of the RAW file (I changed the DateTimeOriginal value), then Picasa will generate a different ImageUniqueID for the RAW.
So, using Picasa's method will only work so long as you never modify the EXIF of your RAWs after generating JPEGs from the RAWs. If you do, the Unique ID assigned to the RAW will change, and so the Unique ID assigned to the JPEG will no longer point to the correct RAW file.
The other issue with Picasa's method is that these Unique IDs for the RAW files are not stored with the original files. They are stored in Picasa's database.
So, say you have a JPEG exported through Picasa, and want to find the RAW file it came from based on the JPEG's ImageUniqueID EXIF value. You will only be able to do this through Picasa. If you decide to use some other software in the future, or Google stops supporting Picasa, then you'll be out of luck.
Image Unique ID (or not) with Bridge Photo Downloader
The Bridge Photo Downloader allows automatically renaming files upon ingest to include 'Image Unique ID'. However, this does not appear to be a true 'Image Unique ID'. Instead it is just the sequence number section of the current filename. E.g. the 'Image Unique ID' for _MG_6031.CR2 would be 6031.
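In other words, the value behaves roughly like the result of the Python sketch below, which pulls the run of digits that sits just before the file extension. This is a guess at the observable behaviour, not Bridge's actual implementation, and `sequence_number` is a made-up name.

```python
import re


def sequence_number(filename):
    """Return the digits immediately before the extension, or None if there are none."""
    match = re.search(r"(\d+)\.[^.]+$", filename)
    return match.group(1) if match else None


print(sequence_number("_MG_6031.CR2"))  # prints 6031
```

Since two cameras can both produce a `_MG_6031.CR2`, this value is no more unique than the filename it came from.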
Generating Unique IDs with Exiftool
Something that would make sense to me would be to calculate the MD5 sum of the image data portion of the RAW file. This would then give a 128 bit / 32 hex digit string that would always be the same for the RAW file, even if you altered the metadata stored in the file.
The file could be renamed to this value, and the value stored in the metadata. Any subsequent conversions would carry the value in the metadata, making it easy to trace the file back to the RAW original. However, I don't think doing this would be very easy.
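The idea can be sketched in Python as below. The hard part, pulling the image data out of a real RAW file, is deliberately skipped: the sketch assumes you already have the metadata and the image data as two separate byte strings, and both function names are hypothetical.

```python
import hashlib


def image_data_id(image_data):
    """MD5 of the image data only: a 128 bit digest written as 32 hex digits."""
    return hashlib.md5(image_data).hexdigest()


def whole_file_id(metadata, image_data):
    """MD5 of metadata plus image data, for comparison; changes when metadata changes."""
    return hashlib.md5(metadata + image_data).hexdigest()


# Toy stand-ins for a RAW file before and after a metadata edit.
pixels = b"\x00\x01\x02\x03" * 1000
before = b"DateTimeOriginal=2012:01:05 10:00:00"
after = b"DateTimeOriginal=2012:01:05 11:00:00"

# Same image data gives the same ID, whatever the metadata says.
assert image_data_id(pixels) == image_data_id(pixels)
# Hashing the whole file gives a different value once the metadata changes.
assert whole_file_id(before, pixels) != whole_file_id(after, pixels)
```

Hashing only the image data is what would keep the ID stable across metadata edits, which is exactly where Picasa's IDs fell down in my tests.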
Now, you can assign a Unique ID to RAWs yourself. Exiftool is very handy for this, as it can automatically generate Unique IDs and assign them to RAWs. Note that the generated ID includes random numbers, so if you generate a GUID for a file multiple times, it will be different each time. In the command below I use -wm c, which will make sure that a new GUID is not generated if the file already has a GUID assigned.
exiftool -wm c '-ImageUniqueID<NewGUID#' -ext CR2 -overwrite_original ./
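The 'only create it if it isn't already there' behaviour can be pictured with the Python sketch below. It is an analogy only: it works on a plain dictionary rather than a real file, and it uses `uuid.uuid4()` as a stand-in random ID, which is not how Exiftool builds its NewGUID value.

```python
import uuid


def assign_unique_id(metadata):
    """Add a random 32 hex digit ImageUniqueID unless one is already present.

    `metadata` is a plain dict standing in for a file's tags (an assumption
    for illustration; nothing here reads or writes real EXIF data).
    """
    if "ImageUniqueID" not in metadata:
        metadata["ImageUniqueID"] = uuid.uuid4().hex
    return metadata["ImageUniqueID"]


tags = {}
first = assign_unique_id(tags)
second = assign_unique_id(tags)  # already set, so no new ID is generated
assert first == second
```

Without the 'already present' check, every run would hand the file a fresh random ID and any references to the old one would be lost.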
However, when converting the RAWs to JPEGs, the ImageUniqueID EXIF value is not copied over from the RAW file. (At least it isn't when using ACR to convert RAWs to JPEGs).
When you think about it, this kind of makes sense. A JPEG is a different image to the original RAW, and so it should have a different Unique ID. (Just as it does when using Picasa).
Making use of the xmpMM:DocumentID XMP value
The solution to the above problem seemed to be to write the Unique ID to the RAW's XMP. This is a form of metadata, similar to EXIF. The XMP is copied over to the JPEG when the RAW file is processed, so this would work for writing a reference to the RAW file into the JPEG.
When I looked into this, I found that there is actually something already set up for this. All Adobe software (I think) automatically creates a DocumentID for an original file. Versions / derivatives of that file then have an OriginalDocumentID field written to them, which references the DocumentID of the original file.
So it seems like what I was looking for was already being done for me automatically; I just didn't realise it!
I did do a bit of research on the xmpMM:DocumentID value. With Adobe Bridge, it will automatically create a DocumentID for a RAW file when it creates the sidecar XMP file. The DocumentID appears to be based on the RAW file, so if you delete the sidecar and then get Bridge to create a new one, the DocumentID assigned will still be the same.
If you modify the RAW file (e.g. writing geo data to the EXIF) after having already used Bridge to assign a DocumentID, then this is not a problem. Bridge will continue to read the DocumentID stored in the sidecar XMP file. The only issue would be if you then deleted the sidecar XMP: Bridge would then generate a different DocumentID for the RAW file (since the RAW file has been modified).
So, this does seem like a pretty solid way of referencing the original RAW file from a JPEG.
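As a rough sketch of how the lookup could work outside Adobe software, the Python below reads the xmpMM values out of XMP text and compares a JPEG's OriginalDocumentID with a RAW's DocumentID. The sample XMP and its ID values are invented for illustration, the function names are hypothetical, and extracting the XMP packet from an actual JPEG is assumed to have been done already.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XMPMM_NS = "http://ns.adobe.com/xap/1.0/mm/"

# Invented sample: a RAW sidecar with DocumentID stored as an attribute.
RAW_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
    xmpMM:DocumentID="xmp.did:EXAMPLE-RAW"/>
 </rdf:RDF>
</x:xmpmeta>"""

# Invented sample: XMP from a JPEG with the values stored as child elements.
JPEG_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/">
   <xmpMM:DocumentID>xmp.did:EXAMPLE-JPEG</xmpMM:DocumentID>
   <xmpMM:OriginalDocumentID>xmp.did:EXAMPLE-RAW</xmpMM:OriginalDocumentID>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""


def read_xmpmm(xmp_text, field):
    """Return an xmpMM field stored as an attribute or child element, else None."""
    root = ET.fromstring(xmp_text)
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        value = desc.get(f"{{{XMPMM_NS}}}{field}")
        if value is not None:
            return value
        child = desc.find(f"{{{XMPMM_NS}}}{field}")
        if child is not None and child.text:
            return child.text.strip()
    return None


def is_derived_from(jpeg_xmp, raw_xmp):
    """True if the JPEG's OriginalDocumentID matches the RAW's DocumentID."""
    original = read_xmpmm(jpeg_xmp, "OriginalDocumentID")
    return original is not None and original == read_xmpmm(raw_xmp, "DocumentID")
```

Because the values live in the files themselves (or their sidecars), this kind of lookup does not depend on any one program's database.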
I seem to remember something about Lightroom generating unique image IDs as well. However, after searching the web, I couldn't find any reference to Lightroom creating unique image IDs.
I did, however, find this blog post: 5 Lightroom Organizing Mistakes and How To Avoid Them. As usual with these things, the comments are much more valuable than the article itself. Lots of photographers discussing the best way to store and rename images.
After some further research I discovered that Photo Mechanic can rename files using a sequence and continue the sequencing from one ingest to the next: Photo Mechanic Wiki: Rename Ingested Photos As. The Bridge Photo Downloader also features the ability to rename files using a sequence. But unfortunately the numbering starts again on each subsequent ingest. So this isn't any good for getting unique image filenames with Bridge.
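The difference between the two behaviours comes down to whether the last number used is remembered between ingests. A minimal Python sketch of a continuing sequence is below; the function name, the six digit width and the idea of passing the last used number in by hand are all my own assumptions, not how Photo Mechanic or Bridge work internally.

```python
import os


def sequence_names(last_used, originals, width=6):
    """Pair each original filename with the next number in a running sequence.

    Returns the list of (original, new_name) pairs and the new last used
    number, which needs to be saved somewhere for the next ingest.
    """
    renamed = []
    number = last_used
    for name in originals:
        number += 1
        ext = os.path.splitext(name)[1]
        renamed.append((name, f"{number:0{width}d}{ext}"))
    return renamed, number


first_batch, last = sequence_names(0, ["_MG_6031.CR2", "_MG_6032.CR2"])
# Passing `last` back in continues the numbering; passing 0 again would
# restart it, giving the same names as the first ingest.
second_batch, last = sequence_names(last, ["_MG_0001.CR2"])
```

Restarting from zero each time is what makes a per-ingest sequence useless as a unique filename.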
The best solution I've seen so far is one of the comments on the Lightroom article suggesting to use a combination of date and original filename / number. They also use a shoot sequence number, but this wouldn't work for the type of photography I do. But something like YYYY-MM-DDoriginalfilename should at least give a more unique ID than just the filename as I am using at the moment.
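The naming pattern itself is simple enough to show in a couple of lines of Python. The sketch below assumes the shooting date has already been read from somewhere (e.g. the DateTimeOriginal EXIF value); the function name and the example date are made up.

```python
from datetime import date


def dated_filename(shot_on, original_name):
    """Prefix the original filename with the shooting date as YYYY-MM-DD."""
    return f"{shot_on.isoformat()}{original_name}"


print(dated_filename(date(2012, 1, 5), "_MG_6031.CR2"))  # prints 2012-01-05_MG_6031.CR2
```

Two files would now only clash if they had the same camera filename and were shot on the same day, which for two cameras that are never used on the same subject at the same time should be rare.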
So, my aim for my ingest process is:
- Copy images to PC. Organised into folders based on location then date, or into separate folders for flora & fauna for later IDing.
- Geotag if relevant.
- Rename to YYYY-MM-DDoriginalfilename using Batch rename in Bridge.
- Assign DocumentIDs to RAWs (not actually something I need to do, since this is done automatically by Bridge).
- Add relevant shooting data and keywords to images.
That will hopefully make my images better organised and easier to search through when looking for a specific image. Of course, it doesn't help with all the images I already have. But at least it will be better for all new images.