Dad25
Participating Frequently
March 11, 2019
Question

Beginner workflow for scanned photos

I scanned about 10,000 printed family photos over the course of the last few months, so they do not have any useful metadata. The photos were very disorganized, and I know that there were many duplicate prints. All files are on one external hard drive (with a backup copy on another one). To complete the work, I need to do some efficient clean up and add useful metadata.

When the project is completed, I will give a hard drive with a copy of the photos to each of my adult children. I want each photo to contain the following saved metadata, in the correct EXIF/IPTC/XMP fields (not just as keywords):

  • Date the photo was taken (or at least a close approximation).
  • Place names and GPS coordinates.
  • People names, preferably in the “Person Shown in the Image” field.
  • People tags (MWG face regions) if possible.
  • Events.
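
As a rough illustration of where each item in the list above could live in a file, the sketch below builds an argument list for ExifTool. ExifTool is not mentioned in this thread and is assumed here only as an example of an external metadata tool; the helper name `build_exiftool_args` and all sample values are hypothetical. Nothing is executed, and MWG face regions are left out because they are structured data that does not fit a one-line assignment.

```python
from datetime import datetime


def build_exiftool_args(path, taken=None, city=None, lat=None, lon=None,
                        people=(), event=None):
    """Return an ExifTool argument list (hypothetical helper; nothing is run)."""
    args = ["exiftool"]
    if taken is not None:
        # EXIF stores dates as "YYYY:MM:DD HH:MM:SS".
        args.append("-EXIF:DateTimeOriginal=" + taken.strftime("%Y:%m:%d %H:%M:%S"))
    if city:
        args.append("-XMP-photoshop:City=" + city)
    if lat is not None and lon is not None:
        # EXIF GPS keeps the magnitude and the hemisphere in separate tags.
        args += [
            f"-GPS:GPSLatitude={abs(lat)}",
            "-GPS:GPSLatitudeRef=" + ("N" if lat >= 0 else "S"),
            f"-GPS:GPSLongitude={abs(lon)}",
            "-GPS:GPSLongitudeRef=" + ("E" if lon >= 0 else "W"),
        ]
    for name in people:
        # "+=" appends to a list-type tag instead of replacing it.
        args.append("-XMP-iptcExt:PersonInImage+=" + name)
    if event:
        args.append("-XMP-iptcExt:Event=" + event)
    args.append(str(path))
    return args
```

Calling it with a date, a coordinate pair and one name returns a plain list of strings that could be inspected before anything is written to a photo.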

I have no past experience with this and I would love to have your thoughts, feedback, advice or warnings on my planned “workflow.”

  1. Duplicates. Find and delete visual duplicates (or near duplicates), using free software such as VisiPics.
  2. Dates, Locations and Events. Create a new catalog in Daminion (free standalone). Assign dates, place names, GPS coordinates and events. Sync to the metadata of the files.
  3. People names. Import all photos into a new catalog in PSE 2018 Organizer (since I already own it). Name and tag all of the people, using facial recognition. (Unfortunately, the actual tagging of the faces in their respective spots in the photo may not matter, since Organizer will not preserve the MWG face regions information in the metadata of the files.) Save the people names to the files’ metadata as keywords. (Unfortunately, this does not save the names in the IPTC extension field “Person Shown in the Image,” which I would prefer.)

In the above workflow, I have tried to use free (or already owned) software. I am not opposed to paying for software, such as Lightroom, if that would work more efficiently. However, I need the end result to have metadata saved in the files and not be dependent on a catalog that would force my family to purchase software in order to effectively read and use the photos.

Thanks!

    6 replies

    Dad25 (Author)
    Participating Frequently
    March 15, 2019

    Thank you all for your time and helpful comments. Here is a summary of my conclusions and amended workflow. How does this look?

    COMPLETED:

    Duplicates. Find and delete visual duplicates.

    The helpful discussion in this forum persuaded me that I would not find any effective way to achieve this from within LR (with or without a plug-in). Therefore, I moved forward, using the free VisiPics software. The results were great! The software took about an hour to process, plus I spent a couple hours selecting and deleting the duplicates. I successfully deleted 800 files. It is really a shame that LR does not have this technology. It would be very helpful, not only for removing duplicates, but for helping to show groups of similar photos for tagging, culling, etc.

    PLANNED:

    Import. Import all my photos into a LR catalog.

    The above discussions had good advice about setting keywords on import, using Smart Collections, etc. However, the photos currently have no identifying data, so these suggestions will be more applicable once they are in LR.

    People names. Name and tag all of the people, using facial recognition.

    I have read countless online discussions about naming conventions, but still feel largely undecided. For now, I will just use “First Last” as of the time of the photo. I will add “(Maiden)” or some other term to distinguish those with the same name. Initially, I do not plan on using keyword hierarchies.

    Dates, Locations and Events.

    I don’t know exactly how to go about recording these pieces of information efficiently. Since LR cannot automatically “look” at and group the photos for me, I suppose I will just have to manually look at them in the image display area and dive in. I need to assign the date it was taken, place name, GPS coordinates and events. This will take many months.

    Backups. After each significant work session, I will back up to a separate drive.

    Another question:

    Apparently, once I am completely done, and ready to create the collection to give to my family, I will need to “export” all the photos in order to have the people names saved in the XMP:PersonInImage field of the files. However, when should I do the Metadata > Save Metadata To File function? Should I do it frequently as I go, or wait until I have all of the above completed and then do it all at once?

    johnrellis
    Legend
    March 15, 2019

    Apparently, once I am completely done, and ready to create the collection to give to my family, I will need to “export” all the photos in order to have the people names saved in the XMP:PersonInImage field of the files. However, when should I do the Metadata > Save Metadata To File function? Should I do it frequently as I go, or wait until I have all of the above completed and then do it all at once?

    For the purposes of providing photos to others, Export will correctly write the metadata into the exported photos. Metadata > Save Metadata To File isn't necessary for that.

    However, Metadata > Save Metadata To File does have other uses:

    1. If you want to use an external tool to manipulate the metadata, you need to run Save Metadata To File first so that the master photos have the correct metadata written from the catalog available for the external tool to read.

    2. As a belt-and-suspenders, last-ditch backup mechanism. Hopefully, you've got a regular backup mechanism in place (e.g. Mac Time Machine), and depending on your backup program, you're having LR make regular catalog backups when it exits.  But backup regimes are notoriously prone to failure (due to human error and lack of testing), so some of us ensure that metadata is also always written to the photo files, providing a last-ditch backup in case the primary catalog backups don't work.  Rather than doing manual Metadata > Save Metadata To File commands, you can set the option Catalog Settings > Automatically Write Changes To XMP.  Years ago that used to cause performance issues for some people, but such reports are rare these days. 

    Legend
    March 13, 2019

    You don't mention it, so I will. In planning this project, think hard about your backup strategy. EXPECT the computer and any hard disk to fail at some time during the project, and plan so that you lose no more than a day's work. Test your plan! This is annoying, time-consuming stuff, but losing everything is worse.
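
One possible way to "test your plan" is to compare the working drive with the backup drive file by file. The sketch below is a minimal example using only the Python standard library; the function names and the folder layout are assumptions for illustration and are not tied to any particular backup program.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large scans do not fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_backup(source_dir, backup_dir):
    """Return (missing, mismatched) lists of paths relative to source_dir."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    missing, mismatched = [], []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        if not dst.is_file():
            missing.append(str(rel))      # never copied, or lost
        elif file_digest(src) != file_digest(dst):
            mismatched.append(str(rel))   # contents differ
    return missing, mismatched
```

A mismatch is not always a failure: if metadata has been written into the master files since the last backup, those files are expected to differ until the next backup runs.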

    Dad25 (Author)
    Participating Frequently
    March 13, 2019

    @ItWasNotMe: “I started sorting them out as I was scanning”

    Unfortunately, I did not have that luxury. I suddenly had temporary access to a scanner and needed to do all of the scanning without being able to pre-sort or organize the photos. Currently, the only organization is that I put the physical photos into small boxes as they were scanned. The name of the digital folder just matches the label on the box. So now, if I needed to find a particular photo, I would “only” have to search through about 500 in a given box.

    @Todd Shaner:

    Thank you for sharing test results of those various duplicate finders. It sounds like Antidupl.net, Image Comparer, and Deduplicator will not do the work I am hoping for. So, I have downloaded and tried VisiPics. For my small sample (268 photos) it seemed to work pretty well. It took less than 2 minutes to find 14 duplicate groups with a total of 44 pics. That was using the “Loose” setting, so there were many false positives (non-identical photos). It is too bad that kind of technology is not built into LR. It would be very helpful in creating stacks of visually similar photos. When I put the setting to “Basic” it created 5 groups, with 2 identical photos in each group. Then I could automatically select the ones to delete or move. I think this will probably work as a way to pre-clean my collection before importing it into LR.
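
How VisiPics compares images internally, and what its "Loose" and "Basic" settings correspond to, is not stated in this thread. As a generic illustration of why a looser setting produces more false positives, the sketch below uses a difference hash (dHash) and a Hamming-distance threshold. It works on a plain 2D list of grayscale values; decoding real image files would need an image library, which is left out. All function names are hypothetical.

```python
def shrink(pixels, width, height):
    """Nearest-neighbour downsample of a 2D list of grayscale values."""
    src_h, src_w = len(pixels), len(pixels[0])
    return [
        [pixels[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def dhash(pixels, size=8):
    """Difference hash: one bit per horizontally adjacent pixel pair."""
    small = shrink(pixels, size + 1, size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def looks_same(a, b, max_distance):
    """True if the two pixel grids hash to within max_distance bits."""
    return hamming(dhash(a), dhash(b)) <= max_distance
```

A small `max_distance` only pairs near-identical scans, while a larger one also pairs photos that merely share a similar layout of light and dark, which is the trade-off described above.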

    Eventually, once I have added metadata, perhaps I will want to use a LR plug-in to help find and manage possible duplicates from within LR. Victoria Bampton, in her blog (https://www.lightroomqueen.com/clean-duplicate-photos/) said: “The downside of an external duplicate detection app is it doesn’t know which of the duplicates you’ve edited in Lightroom, so you can end up deleting the wrong ones and creating a long job restoring broken links.”

    @johnrellis:

    “LR does a reasonable job of following the MWG standard, including using tagged regions for faces. With just a little care, you'll be able to move your photos and their metadata from LR to other platforms.”

    That is excellent news! So, if I correctly understand, you are saying that when I run face recognition and place tags on the faces of all known people in my photos, LR will save the location of those tags in the MWG face regions of the files? Do you happen to know whether or not the people names will be saved to the IPTC “Person Shown in the Image” field (and not just as keywords)?

    Todd Shaner
    Legend
    March 13, 2019

    Dad25  wrote

    @ItWasNotMe: “I started sorting them out as I was scanning”

    Eventually, once I have added metadata, perhaps I will want to use a LR plug-in to help find and manage possible duplicates from within LR.

    Give the Deduplicator LR Plugin a try. It's easy to use and creates a collection named 'Duplicates' with the found duplicate image files.

    Todd Shaner
    Legend
    March 12, 2019

    I agree with the suggestion to keep your work inside LR. There are numerous LR plugins available as needed:

    https://www.photographers-toolbox.com/

    Any Comment Lightroom Plugin (and many more here)

    Jeffrey Friedl's Blog » Jeffrey’s Lightroom Goodies (Plugins and Tools)

    A good book on the subject. It says LR 5, but is applicable to the newer versions as well.

    Organizing Your Photos with Lightroom 5 - The DAM Book

    For finding duplicates here are two LR plugins with instructions on the Lightroom Queen's website, but they probably won't find all of the duplicates. When keywording and creating collections make sure to add specific information (place, people names, event, etc.) that can be used with the LR Text filter to help find the remaining duplicates. This is also useful for sharing the archive since many photo viewers have the ability to filter on keywords to find specific pictures.

    https://www.lightroomqueen.com/clean-duplicate-photos/

    Concerning sharing the finished photo archive you can use USB 3.1 flash drives. This one is available in capacities of 64GB to 256GB and is very small and economical.

    https://www.amazon.com/SanDisk-128GB-Ultra-Flash-Drive/dp/B07855LJ99/ref=sr_1_fkmrnull_3?keywords=sandisk+ultra+fit+128g…

    ItWasNotMe
    Known Participant
    March 12, 2019

    I'm not sure that these plug-ins do what the OP is looking for. They seem to match largely on metadata, whereas the programme he is proposing matches the visuals.

    I think he is trying to reduce the amount of work by eliminating the duplicates before he adds keywords etc.

    Dad25 (Author)
    Participating Frequently
    March 12, 2019

    30,000? I have no right to be concerned about my measly 10k!

    Yes, the reason I listed eliminating dups first is to reduce the work and to do some initial clean up. Also, I'm hoping that finding visually similar photos would allow automatically creating "stacks." Then, I could assign metadata (such as date taken, locations and events) to those stacks in batches.
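
The idea of stacking visually similar photos and then tagging each stack in one go can be sketched as follows. This assumes each photo already has some integer perceptual hash (how it is computed is out of scope here) and groups photos whose hashes differ by only a few bits. The function names and the metadata keys are hypothetical, and this is not something Lightroom is known to do on its own.

```python
def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def build_stacks(hashes, max_distance):
    """Group file names whose hashes are within max_distance bits of each other."""
    names = sorted(hashes)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the group representative (with path halving).
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if hamming(hashes[a], hashes[b]) <= max_distance:
                parent[find(b)] = find(a)

    stacks = {}
    for n in names:
        stacks.setdefault(find(n), []).append(n)
    return sorted(stacks.values())


def tag_stack(stack, metadata):
    """Give every photo in a stack its own copy of the same metadata."""
    return {name: dict(metadata) for name in stack}
```

The pairwise comparison is quadratic, so for roughly 10,000 photos it means about 50 million comparisons; workable for a one-off job, but slow in pure Python.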

    ItWasNotMe
    Known Participant
    March 12, 2019

    I've been through this with c. 30,000 images that I scanned. I second the suggestion to do it all in one application.

    What I would do is also organise the photos into (approximate) 'date taken' folders, i.e. a limited number of photos in each folder.

    I started with this, and also

    • set the exif time and date in the photo as I went
    • used a simple algorithm to set the time so any photos for a date would appear in the best guess of correct order when sorted on date/time.
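
The "simple algorithm" is not described in the post. One guess at what it might look like is to give the first photo of a day an arbitrary start time and space the rest at fixed intervals, so that sorting on date/time reproduces the intended order. The function name, the noon start and the one-minute step below are all assumptions.

```python
from datetime import date, datetime, time, timedelta


def assign_times(day, names_in_order, start=time(12, 0), step=timedelta(minutes=1)):
    """Map each name to an EXIF-style timestamp that preserves the given order."""
    first = datetime.combine(day, start)
    return {
        name: (first + i * step).strftime("%Y:%m:%d %H:%M:%S")
        for i, name in enumerate(names_in_order)
    }
```

With these defaults, more than 720 photos on one date would spill past midnight into the next day, so a smaller step would be needed for very large groups.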

    Once I'd done that I worked through them chronologically doing the rest so could measure progress and be sure that I had done them all.

    Once I'd "finished" a suitably large date range I wrote it to non-erasable media, e.g. DVD/Blu-ray depending on size of images.

    Lightroom will allow you to write the metadata into the file (or into an XMP sidecar if the files are raw).

    If you do Lightroom edits, they can also be written into the file, but either:

    1. The recipient will need Lightroom to see these, or

    2. You export copies that contain the edits and distribute those

    When looking for visual duplicates you could first create 'stacks' in Lightroom and then choose the best photo from each stack.

    Tony_See
    Inspiring
    March 12, 2019

    In your instance (10,000 scans, what an effort!!) I'd say keeping it simpler would be better.

    Do EVERYTHING in Lightroom, but use plugins if you need results that Lightroom does not provide by default.

    KIS: keep it simple

    Dad25 (Author)
    Participating Frequently
    March 12, 2019

    Yes, I had no idea that we had that many photos! I would love to be able to do everything in one application. However, my very first step would be to eliminate dups and I understand that Lightroom does not have a way to automatically identify and bring together visually similar photos, so that is why I assume I would need to use VisiPics (or some similar program). Is there a plugin for LR that you would recommend that will do this?