How to fix folders with a large number of exact duplicate files before importing to PSE 2019
Hello,
I'm posting on this forum on my mother's behalf. I have to admit, I am pulling my hair out after spending the last month building my mother, a digital photographer, a brand new computer. Here's the situation:
- My mother has 19 years' worth of photos (~85,000).
- She is not good with computers. I have tried my best to be supportive, and I am at the end of my rope!
- Her previous computer's extra internal hard drive died and she hadn't performed a backup onto the external drive for some time. She ended up losing some photos.
- I decided it was time for me to build her a brand new, powerful desktop computer, which I successfully did. The computer has multiple internal hard drives and a number of backup schemes so we'll never lose another file.
- After building the new computer, we had to upgrade her to Photoshop Elements 2019, because PSE 8-10 was no longer supported on Windows 10.
- After installing PSE 2019, I tried importing her ~85,000 photo files using PSE Organizer's import feature.
- And then... after an hour or two... PSE was done and reported that ~9,000 of her ~85,000 photos were duplicates (date/size exact... not "similar").
- I examined a few files and to my horror, my mother's method for organizing files did in fact produce many exact duplicates (a result of her copying and pasting duplicates all over her folders). She would make these duplicates in Windows Explorer (not through the PSE catalog interface).
- I did some research to determine if there's a setting that allows duplicates to be imported and discovered somewhere in the Adobe forum that this particular setting was removed in more recent versions of PSE. The reason for this is that PSE Organizer should be considered a database, and there shouldn't be duplicate entries in a database.
- As a software engineer, who meticulously manages thousands of files every day, I agree with this philosophy - duplicates are horrible.
- However, for a 62 year old mother who doesn't have this experience or perspective (and has spent her life putting her family first at her expense), this concept of avoiding duplicates just doesn't make sense. (I'll skip the heated argument part.)
- So I am trying to figure out a few things:
- Is there truly no way to have PSE 2019 allow exact duplicates to be imported into the catalog? I would have figured different file paths would have allowed this to be possible.
- If not... then I'm stuck trying to eliminate the duplicate files before re-importing the catalog from scratch. I know there are tools that do this, but they present the duplicates on a photo-by-photo basis, instead of identifying and grouping the photos in sub-folders. Example below...
Here's an example of what my mom has going on. Folders are shown in <CAPS>, whereas individual files are shown with lowercase names and the .jpeg extension. I've bolded all the files that are considered duplicates. From my mother's perspective, she would copy (duplicate) a file if it also fit a sub-category.
- She would start by uploading all the photos she took at Christmas time in 2018 into the <2018>\<CHRISTMAS> folder.
- Then she would want to further sub-categorize the photos based on family members in the photo, so she would:
- Create sub-folders like: <PETS>, <SMITH_FAMILY>, and <WILSON_FAMILY>.
- And then, instead of moving (cutting) the files from the CHRISTMAS folder into the sub-folders, she would COPY and PASTE (duplicate) these photos into the sub-category folders.
- Then she would make a folder to organize photos to make a photo book on Snapfish. She'd then copy photos from all over her other folders into the <MY_2018_PHOTOBOOK> folder (creating more duplicates).
- (See the folder structure below as a simplified example...)
- <2016>
  - image_100.jpeg
  - ...
  - image_123.jpeg
- <2017>
  - image_200.jpeg
  - ...
  - image_234.jpeg
- <2018>
  - <CHRISTMAS>
    - <PETS>
      - image_03.jpeg
    - <SMITH_FAMILY>
      - image_01.jpeg
      - image_03.jpeg
    - <WILSON_FAMILY>
      - image_04.jpeg
      - image_02.jpeg
    - image_01.jpeg
    - image_02.jpeg
    - image_03.jpeg
    - image_04.jpeg
    - image_05.jpeg
    - image_06.jpeg
  - <PETS>
    - <CHRISTMAS>
- <MY_2018_PHOTOBOOK>
  - image_02.jpeg
  - image_03.jpeg
  - image_04.jpeg
  - image_123.jpeg
  - image_200.jpeg
In the end, I thought I could simply import the parent folder containing the folders <2016>, <2017>, <2018>, and <MY_2018_PHOTOBOOK>. This is where I encountered the message that ~9,000 photos were skipped because they were duplicates. I can't figure out the order in which PSE Organizer imports files, or how it chooses which copy of a duplicated file will be imported and which copies will be excluded. For example, will it import image_03.jpeg from the <PETS> folder and then skip the image_03.jpeg from the <SMITH_FAMILY> and <2018> folders? Or will it take image_03.jpeg from the <2018> folder and then skip it in any lower sub-folder? And will it find image_03.jpeg in the <MY_2018_PHOTOBOOK> folder before the <2018> folder, or will it find it in the <2018> folder first, because numbers have higher path priority than letters?
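To make the ordering question concrete, here is a minimal Python sketch of one possible traversal: plain name-sorted order, parent files before sub-folders. I have no idea whether PSE Organizer actually does this; the function name walk_in_name_order is just something I made up for illustration. The only thing I'm sure of is that in a simple character-by-character sort, digits come before uppercase letters.

```python
import os


def walk_in_name_order(root):
    """Yield file paths top-down, visiting folders and files in sorted name order.

    This is an assumed ordering for illustration only, not PSE's documented behavior.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sorting in place controls the order os.walk descends
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)


# Digits sort before uppercase letters in a plain string sort.
top_level = ["MY_2018_PHOTOBOOK", "2018", "2016", "2017"]
print(sorted(top_level))  # ['2016', '2017', '2018', 'MY_2018_PHOTOBOOK']
```

Under that assumed order, a "first copy wins" importer would keep the copy in the higher-level folder and skip the copies in its sub-folders and in <MY_2018_PHOTOBOOK>.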
Is there a tool I can use that will easily show me duplicate files grouped by folder? Meaning, it would present me with some prompt like: "Hey, you have 5 photos in your <MY_2018_PHOTOBOOK> folder that are duplicated in these other folders. Would you like to delete the duplicates in the other folders all at once, or in <MY_2018_PHOTOBOOK> all at once?"
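As a rough sketch of the folder-level report I have in mind (Python, standard library only): it treats two files as duplicates when their contents are byte-identical, which is my assumption and may not match PSE's date/size check. The function names here are hypothetical, and it only reports; it deletes nothing.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicate_groups(root):
    """Return lists of paths under root whose contents are byte-identical."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have an exact duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups


def duplicates_by_folder(root):
    """Map each folder to {other_folder: set of its file names also found there}."""
    report = defaultdict(lambda: defaultdict(set))
    for group in find_duplicate_groups(root):
        for path in group:
            folder = os.path.dirname(path)
            for other in group:
                other_folder = os.path.dirname(other)
                if other_folder != folder:
                    report[folder][other_folder].add(os.path.basename(path))
    return report


def print_report(root):
    """Print, per folder, which of its files are duplicated in which other folders."""
    for folder, others in sorted(duplicates_by_folder(root).items()):
        names = set().union(*others.values())
        print(f"{folder}: {len(names)} file name(s) duplicated elsewhere")
        for other, shared in sorted(others.items()):
            print(f"    also in {other}: {', '.join(sorted(shared))}")
```

Calling print_report on the parent folder (say, a hypothetical D:\Photos) would list each folder along with the other folders that hold copies of its files, which is the grouping I can't get from the photo-by-photo tools.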
So I'm begging for help because I'm about to fake my own death and run away from this mess. I hope this is making sense and that someone out there has been through this and can give some guidance.
Thanks in advance...
Phil
