This Python program automates the offloading, organizing, categorizing, and archiving of photos and videos from an iOS device to a Linux computer’s directory structure. It executes these steps:
- Incrementally offload media to a new, datestamped “raw offload” folder.
- Copy offloaded media into a directory structure based on the year and month each item was created.
- Categorize each photo/video: display each one and prompt the user for where to copy it in the main directory structure. The program provides shortcuts for quick processing (‘c’ for cats, ‘h’ for house, etc.), each corresponding to a predefined directory path.
For years, I’ve used a manual system to back up my phone’s pictures/videos to my home storage hierarchy and copy them into the right directories for later access.
Offloading all the phone’s media would take hours each time (tens of gigabytes), so it can only be backed up incrementally. The first step is to find the first image on the phone that was not included in the last offload. This was a tedious process, especially since the iPhone doesn’t maintain a consistent naming convention. Editing photos and videos can result in oddly named files showing up at the end of the sorted folders. The iPhone also breaks groups of images into different folders to keep a mostly even quantity in each folder.
Once that breakpoint is identified, the next step is copying that image and all later images in that folder from phone to computer. If the phone has created a new folder since the last offload, that folder must be copied in its entirety as well.
This “raw offload” gets put into a folder on the computer labeled with today’s date, e.g. /Raw_Offload/2018-04-11.
Since I often find myself needing to search for pictures by date, the next step in the process is mapping these offloaded images to a separate folder hierarchy based on the year and month taken. For example, pictures from October 2015 will be copied to /Date_Organized/2015/2015-10. Unfortunately, neither the image name nor the iPhone folder indicates the date taken, so in the manual system I had to reference the date shown in the phone’s photo app to find the breakpoint between months. Then I copied each group of pictures to its proper folder in the date-organized hierarchy.
So far, the offloaded photos could be found if I knew either what day they were offloaded or what year and month they were taken. But what I need is to have my family photos, vacations, cat videos, museum-visit photos, etc. separated, labeled, and distributed to their rightful place in my main storage hierarchy. That way, I can browse all my vacation photos neatly organized into their own folders. So in the manual system, I looked through each of the photos in the raw-offload set to determine where each should be copied. Then I had to copy them into one of a dozen other folders while trying not to lose my place in the raw-offload set. This involved a lot of hunting through the directory structure over and over as I came across all types of photos.
I realized most of this could be automated, so I wrote a Python program to do each of the above steps. It automatically finds the phone’s virtual file system mount point on the computer (something I could not access with Windows). It compares the phone’s picture set with the computer’s previous offload sets to find which pics haven’t been offloaded before. It then offloads those to a new, datestamped raw-offload folder.
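The incremental comparison described above can be sketched roughly as follows. This is a minimal illustration, not the program's actual code: the function names `find_new_files` and `offload_new_media` are hypothetical, the phone's media directory is passed in as a parameter rather than discovered automatically, and it assumes each raw-offload set keeps the phone's folder name (such as `100APPLE`) as a subfolder so that a folder name plus file name pair identifies a picture.

```python
import shutil
from datetime import date
from pathlib import Path


def find_new_files(phone_media: Path, raw_offload_root: Path) -> list:
    """Return phone files that do not appear in any previous offload set."""
    already_offloaded = set()
    if raw_offload_root.exists():
        for path in raw_offload_root.rglob("*"):
            if path.is_file():
                # Assumption: (phone folder name, file name) identifies a picture.
                already_offloaded.add((path.parent.name, path.name))
    return [
        path
        for path in sorted(phone_media.rglob("*"))
        if path.is_file() and (path.parent.name, path.name) not in already_offloaded
    ]


def offload_new_media(phone_media: Path, raw_offload_root: Path, today=None) -> list:
    """Copy not-yet-offloaded files into a new datestamped raw-offload folder."""
    today = today or date.today()
    dest_root = raw_offload_root / today.isoformat()  # e.g. Raw_Offload/2018-04-11
    copied = []
    for src in find_new_files(phone_media, raw_offload_root):
        dest = dest_root / src.parent.name / src.name
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy2 also preserves file timestamps
        copied.append(dest)
    return copied
```

Comparing against every earlier offload set, rather than only the most recent one, means an oddly named edited photo that sorts out of order is still picked up.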
Next, it must determine when each photo was taken, which took some research. The creation time and modification time listed by the phone and computer file systems do not report the right date. I found out that each photo has what’s called EXIF data embedded in the file. This can be extracted and read, and it usually contains the original date taken, as well as many other attributes (sometimes hundreds). Programmer Phil Harvey has written and generously made available an awesome Perl program called ExifTool. I needed a way to use it in Python, and Sven Marnach had just the thing: the PyExifTool wrapper. I was able to call this from my program and extract the photo’s date taken from its EXIF data. There are often a half-dozen timestamps listed in EXIF data with ambiguous names, and they’re inconsistently named across different formats (e.g. PNG vs. JPG), so it took some research to identify the appropriate EXIF tag for each (cross-referencing the timestamp in the iOS Photos app). In addition, some are not timezone-adjusted, which added some complication. Now, using ExifTool, my program could sort each offloaded pic by date taken and copy it to the correct location in the date-organized hierarchy, creating new folders as necessary as the months and years pass.
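Once a date-taken string has been extracted, mapping it to a folder is straightforward. The sketch below assumes the timestamp has already been read from whichever tag is appropriate for the file's format and that it uses the standard EXIF layout `YYYY:MM:DD HH:MM:SS`. The function names are hypothetical, and any trailing sub-second or timezone suffix is simply ignored here, so the timezone adjustment mentioned above is not handled.

```python
import shutil
from datetime import datetime
from pathlib import Path

EXIF_FORMAT = "%Y:%m:%d %H:%M:%S"  # standard EXIF date/time layout


def date_organized_dir(root: Path, exif_timestamp: str) -> Path:
    """Map an EXIF timestamp to root/YYYY/YYYY-MM."""
    # Keep only the first 19 characters, dropping any sub-second or
    # timezone suffix (assumption: no timezone correction is applied).
    taken = datetime.strptime(exif_timestamp[:19], EXIF_FORMAT)
    return root / f"{taken:%Y}" / f"{taken:%Y-%m}"


def copy_by_date(src: Path, root: Path, exif_timestamp: str) -> Path:
    """Copy src into the date-organized hierarchy, creating folders as needed."""
    dest_dir = date_organized_dir(root, exif_timestamp)
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(src, dest_dir / src.name))
```

For example, `date_organized_dir(Path("/Date_Organized"), "2015:10:03 14:22:11")` gives `/Date_Organized/2015/2015-10`.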
Finally, the program should distribute the photos to various folders according to content: nature pictures, documentation of a house project, etc. The computer can’t know for sure where I want each photo, so I still need to be in the loop somehow, but since a big chunk of the pictures end up going to the same group of destination folders each time, I put those folder paths into the program for quick lookup. The program displays each image or video one by one (using an xdg-open call) and awaits input in the terminal. Each common destination path is associated with a one- or two-character shortcut (like ‘f’ for family pics, or ‘h’ for house ideas), so I just have to type the shortcut instead of finding the correct folder in the hierarchy each time. In addition to the common destinations mapped to short commands, the program allows special inputs like ‘n’ for no transfer (if it is a throw-away pic that I don’t need saved) or ‘u’ for uncategorized, which moves the pic to a buffer folder for later manual transfer after I figure out where I want it. It can also copy the picture to multiple destination folders, repeating its prompt when the ‘&’ symbol is appended to the first destination input. If I get into a batch of pictures I know will all go to the same destination, I can input the target directory and append a ‘+’ then a number (say, 10), and the program will copy the next 10 pictures to that target directory as well, instead of requiring me to repeat the same command 11 times. Aside from the many shortcut strings, the program also accepts a raw folder path as input for those photos not going to one of the more generic destinations.
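The input handling described above could be parsed along these lines. This is only a sketch: the `Command` class, the `parse_command` function, and the destination paths in `SHORTCUTS` are all hypothetical, and it assumes the ‘&’ suffix comes last and the ‘+N’ suffix follows the shortcut or path directly.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical shortcut table; the real paths live in the author's program.
SHORTCUTS = {
    "c": "/Pictures/Cats",
    "f": "/Pictures/Family",
    "h": "/Pictures/House",
}


@dataclass
class Command:
    action: str            # "copy", "skip", or "uncategorized"
    target: Optional[str]  # destination directory when action is "copy"
    repeat: int            # how many following pictures reuse this command
    ask_again: bool        # True when '&' asks for another destination


def parse_command(text: str, shortcuts=SHORTCUTS) -> Command:
    """Parse one line of terminal input for the current picture."""
    text = text.strip()
    ask_again = text.endswith("&")
    if ask_again:
        text = text[:-1].rstrip()
    repeat = 0
    head, plus, count = text.rpartition("+")
    if plus and count.isdigit():
        # "h+10" means: apply 'h' now and to the next 10 pictures.
        text, repeat = head.rstrip(), int(count)
    if text == "n":
        return Command("skip", None, repeat, ask_again)
    if text == "u":
        return Command("uncategorized", None, repeat, ask_again)
    # Anything that is not a known shortcut is treated as a raw folder path.
    return Command("copy", shortcuts.get(text, text), repeat, ask_again)
```

With this table, `parse_command("h+10")` yields a copy to `/Pictures/House` with `repeat=10`, and `parse_command("f&")` yields a copy to `/Pictures/Family` with `ask_again=True`, signaling the caller to prompt once more for the same picture.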