Is anyone smart enough to know how to avoid duplicate sub-keywords?
Is anyone smart enough to know how to avoid duplicate sub-keywords? I am not. I originally made phylogenetic keywords in the form Diptera (FLIES). I found out, when importing to my website, that the proper separator is a comma and that () causes problems. So I decided to rename to Diptera, FLIES,
This is an Order of Insects, with Families, Subfamilies and Tribes below it and the parents Arthropods and Insects above. I was first just trying to work on one family of Bee Flies, Bombyliidae, whose format I changed accordingly. But then I found that when I checked one of the subkeywords, the same one under the older keyword would also check. Likewise with uncheck. I spent hours trying to view all the subkeywords and starting over, unchecking all the levels, but they still came back duplicated. I then went to the parent keyword Diptera and did the same thing, but it made no difference except creating another duplicate of Bee Flies in the new format. Then I selected all the images in the family folder, selected every level and unchecked all the way up to Arthropods. I deleted all levels of duplicate keywords. I closed and reopened Bridge and tried again, all to no avail. I had also tried deleting both instances of a subkeyword in the old and new keywords, then putting it back in the new one and checking it. The old one came back with all the levels. I changed the preference to not apply parents, but the subkeyword still duplicates.
Have a look at this post. It might explain what's going on. I think your case is more difficult to solve because you have a lot of deeply nested keywords.
Unable to delete identical but re-assigned child-keyword
Hierarchical keywords are designed to peacefully co-exist.
Take, as a simple example, the parents "House" and "Car". Both can have valid sub-keywords "Door" or "Window"; however, selecting the Door sub-keyword under House shouldn't also select the Door sub-keyword under Car.
I think I found the problem, or at least I figured out why @Stephen Marsh's example works.
If you enable Preferences > Keywords > Options "Write Hierarchical Keywords", the duplicate child keywords can be edited independently.
When "Write Hierarchical Keywords" is selected, the dc:subject is "Test - Car; Test - Car|Door; Test - House; Test - House|Door"
When "Write Hierarchical Keywords" is NOT selected, dc:subject is "Test - House; Door; Test - Car"
I think the problem is that "Door" and "Car" only exist once here, and that affects how the Keyword panel edits them. When I delete "Door" under "Test-Car", "Door" under "Test-House" is also deleted. This is not apparent until you click off the thumbnail and then click back on it, which refreshes the Keyword panel display.
When I re-saved my keywords with "Write Hierarchical Keywords" enabled, I was able to delete the child keywords independently.
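To make the difference concrete, here is a minimal Python sketch of the two dc:subject forms quoted above. It only models the strings I observed and is not how Bridge is actually implemented; the function names are hypothetical, and the order of entries may differ from what Bridge writes.

```python
def flat_subject(paths):
    """Model of the non-hierarchical form: every term is stored once."""
    terms = []
    for path in paths:
        for term in path.split("|"):
            if term not in terms:
                terms.append(term)
    return terms


def hierarchical_subject(paths):
    """Model of the hierarchical form: each level keeps its full parent path."""
    entries = []
    for path in paths:
        parts = path.split("|")
        for depth in range(1, len(parts) + 1):
            entry = "|".join(parts[:depth])
            if entry not in entries:
                entries.append(entry)
    return entries


# The two "Door" keywords assigned in the keyword panel (paths assumed for illustration)
assigned = ["Test - House|Door", "Test - Car|Door"]

flat = flat_subject(assigned)
hier = hierarchical_subject(assigned)

print("; ".join(flat))
print("; ".join(hier))
```

In the flat form there is a single "Door" entry shared by both parents, which would explain why removing it under one parent removes it under the other. In the hierarchical form each "Door" carries its own parent path, so the two can be edited separately.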
I should add that you can't just enable "Write Hierarchical Keywords" and solve the problem if your previous keywords were not saved in that form. The dc:subject (IPTC Core) keywords would have to be rewritten in hierarchical form, which would require re-selecting them in the Keyword panel or using a script.
I did not perfectly understand your case. I would advise reproducing it with a dead simple example, such as apples and pears, and leaving animal taxonomies aside.
Still, the situation looks to me as if some keyword swap hadn't worked properly. The keywords in italic are (no longer?) in your keyword list, but still seem to be assigned at file level. Bridge requires a rigid workflow for deleting or renaming keywords. You need to have absolutely all items that have a keyword-old assigned on screen and selected to perform a keyword deletion that applies both to your keyword list and to your files.
This is a major difference from programs that use a catalogue, where you may freely rename and delete keywords and the metadata change will get written back to all files or their database records, whether or not they are visible or selected.
Based on the link suggested, I tried something similar. I had an arrangement such as this: Arthropods>Insects>Diptera, FLIES,>Acalyptratae>Tephritoidea>Ulidiidae (PICTURE WINGED-FLIES)>Otitinae>Cephalini.
I already had duplicates of almost all the subfolders in two places from previous renames of the parent folder, the old names being Diptera , and Diptera (FLIES). So almost everything below that was duplicated twice, and I wanted to get rid of the extras. I decided to work with everything under Acalyptratae. I renamed that to "Acalyptrate", and then changed all the FAMILIES (names ending in -dae) below it to the comma format, so the example above became Ulidiidae, PICTURE-WINGED FLIES. All child levels below that I renamed with a period at the end, so Otitinae becomes Otitinae. (with the period).
I'd go to a family folder where images are stored and, since Bridge removed the ability to find 'ALL' that it once had, I'd run a find for all images that don't contain 'zzz'. This pulls up everything in the subfolders as well. Then in the folder panel I'd select a lowest-level child keyword, such as Cephalini, and in the keyword panel check Cephalini. (with the period), then uncheck all the ones without the period, after which I could delete that keyword. From the other levels, I'd go up to the family level and delete that. Once the family was gone, I'd also remove the parents above it if they weren't removed automatically. I would look at the folder panel with all images selected to see if there were any old keywords, check them, and uncheck them in the keyword panel.
Several hours of this worked well for quite a few families, which didn't come back. It got harder when Acalyptratae would show once in the folder panel but be in 2 different old parent FLY keywords. Earlier, it seemed one check in the folder panel would show all checked in the keywords, and I'd uncheck and they'd all be gone. But then I started getting cases where the 2 on the right would show (-), and I'd have to make smaller and smaller selections of images until I got a solid check, then uncheck and get rid of that group. Sometimes only a single image would allow a solid check. Tedious. But then a stranger thing happened, before I could finish the last families in the group.
"Acalyptratae" became uncheckable. Earlier, as a parent, it would be checked by checking anything below it. Now the unquoted Acalyptratae would check instead when I checked a family or its child, and if I checked "Acalyptratae" the check would go away. Mind-boggling, but I got rid of all the extra families and their children for the 25 families in the one subgroup. Not perfect, but I was able to export less cluttered keywords to the metadata file being prepared for re-import into my website.
@robirdman1 you had a very complex case. I'm glad you got it mostly resolved. See my comment above about enabling Preferences > Keywords > Options "Write Hierarchical Keywords".
You might try this on one image to see if the results work for you. You will get IPTC Core keywords containing delimiters, e.g., "Diptera, FLIES|Tephritoidea|Ulidiidae, PICTURE WINGED FLIES". These will also appear in the Keywords filter panel, like this:
Although the families I did so far did not duplicate, it seems that Bridge does not like " ". When I went back to look at the ones I changed, I found that every parent "Acalyptrate" was unchecked and 4 different Acalyptrate keywords in other parent folders were checked. I once had a beautifully organized phylogenetic hierarchy, and I should have left it alone. My idea of using standard commas instead of parentheses () has just introduced a chaos of multiplicity, and attempts to correct it keep adding more. I haven't tried Write Hierarchical Keywords yet, as I'm worried about how many levels I have to go up without creating still more duplicates.
It may be time for drastic measures - starting over. You could delete all the current keywords then create a new, clean keyword list and add keywords fresh. This is really drastic and painful.
Another approach would be to export all your metadata to a spreadsheet and do the cleanup there using find/replace and formulas. Then you would remove your old saved keyword list and create a new one based on the clean keywords in the spreadsheet. Finally, you would import the metadata from the spreadsheet into the images. This could also be time consuming but, depending on the complexity of the keywords and your Excel expertise, much of the cleanup could be done in batch edits. I can help with scripts for the export-import process.
I would like to try the export. I thought I had already posted this, but I don't see it. I went through all the possible files with Acalyptratae in the keywords and changed preferences so that parent keywords aren't automatically selected. Then I unselected all instances of any version of the keyword acalyptrate, whether alone, in "" or with *. Then I went through all families that had () and applied the comma style instead, all of which are under Diptera, FLIES. Then I unselected any with () so only the comma ones were left. Searches for keywords with acalyptrate now turned up none. I went to the "Acalyptrate" version under Diptera, FLIES and renamed it without the "". Even though no Acalyptratae of any type was checked, a whole new set of families in the scientific name, COMMON NAME form was created, undoing all I had done.
@robirdman1 For the export-to-spreadsheet approach, download this script and README file: https://drive.google.com/drive/folders/1fb2Vob7KHEZWt5ZXLYpoJaLvbmFLCVuo?usp=sharing
I'm trying Google drive for sharing this file for the first time. Let me know if you have trouble downloading it and I will find another method.
See the README file for instructions.
Start by installing the script in the Bridge Startup scripts (see README ## Installation)
Then:
## Open the plugin in Adobe Bridge
### Export
Follow the instructions to open your .txt file in a spreadsheet (Excel, Google Sheets, or any other spreadsheet app) using tab delimiters. Look for the "Keywords" and "Lightroom Keywords" columns for the current values. They will look messy, but we can use spreadsheet functions to break them apart and clean them. Send me a private message and we can collaborate to accomplish this.
Finally, save your spreadsheet as a tab-delimited .txt file and use the script to import the clean keywords back into the files. We'll also reset your keyword list based on your clean keywords list to make new keywording easier.
I started the procedure. I see the script in the startup folder, along with a simpler metadata extractor program. But when I reopened Bridge, in the metadata tab (there is no metadata deluxe tab) I only see my other script. Also, it isn't in the list of startup scripts. If I get beyond that, I am wondering how extensively I have to select files. When I was trying before, I went to my main archive of Arthropods and selected all with Acalyptratae, then I went to my reject folder (these are due to be deleted, but I keep them temporarily until I export that metadata, so I can correlate with my Access database which numbers are deleted and delete them from there too). I also went to recent day folders and to another drive that had some videos. When done, Acalyptratae was unchecked everywhere and all but "acalyptratae" were deleted, but when I removed the "", I had a whole new set with all the child keywords.
It looks like Google converted the file to a Word document on download. That's annoying. Delete that file.
Let's try using Dropbox. When you open this link, close the login box and you should be able to click download. Dropbox will try very hard to get you to log in or create an account, but you don't have to.
If this doesn't work, we'll try another method.
As for selecting which files to export to a spreadsheet, if your files are nested in subfolders (if you have a parent folder for this project) you can use the export from subfolders option in the script. You'll get metadata from all your files, which you can then sort through.
So I got further. It downloaded and didn't ask me to sign up, maybe because I already have the free account. I got it into Bridge and the metadata deluxe came up. I decided to first just try the last day folder, with few shots, to see how it does. I did the extraction and imported the text file into Excel, leaving pretty much just the file number, description and keywords, and I can see that Lightroom Keywords is the problem. Bridge Keywords are in column C; I can extend that and it doesn't go very far. The LR ones are in D and it just goes on and on, with so much duplication. I had to extend E, which has nothing in it, just to get to the end, and it overlaps E, apparently having reached the limit of extension. I don't really use LR, so do I just delete everything in D and make the wanted adjustments in C?
thanks for all the great help!
You have to open the exported .txt file in Excel.
- Open your .txt file in a spreadsheet using tab delimiters
- Because the .txt file is created as UTF-8, it is best to use the Excel import text wizard to retain proper character encoding, e.g. ©, 작, ü
- Use import option and select a .txt file to import
- Text Import Wizard
- File origin = 65001 - Unicode (UTF-8) or (UTF-16)
- Delimited = Tab
- Finish
- The first row will contain the field names
- Each row below that will contain metadata for a single exported file
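If you'd rather check the file outside Excel, the layout described in the list above (UTF-8, tab-delimited, field names in the first row, one row per file) can be read with a few lines of Python. This is only a sketch: the "Keywords" column name comes from the export, but the other column names and the sample values are made up for illustration, and `read_export` is a hypothetical helper.

```python
import csv
import io


def read_export(handle):
    """Read a tab-delimited export: first row is field names, one row per file."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Stand-in for: open("export.txt", encoding="utf-8", newline="")
# "File Name" and "Description" are assumed column names, not taken from the script.
sample = io.StringIO(
    "File Name\tDescription\tKeywords\n"
    "fly_001.jpg\t© 2023 ü\tDiptera; Bombyliidae\n"
)

rows = read_export(sample)
for row in rows:
    print(row["File Name"], "->", row["Keywords"])
```

Opening the real file with `encoding="utf-8"` plays the same role as choosing 65001 in the Text Import Wizard: characters such as © and ü survive intact.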
You can also find instructions by clicking the "?" button.
You might be able to copy and paste columns from Excel and Access tables.
Your Keywords columns are probably going to be quite complex and will need splitting, filtering, sorting, etc. to clean up. Have you ever tried OpenRefine? It's an amazing tool for cleaning and transforming data. The Cluster and Edit feature would be very useful for your work.
You can accomplish a lot in Excel too. I can help with that, if you need it.
Also remember that you'll want to purge and reload your keyword panel list. You might want to start with creating your optimized hierarchical keyword list, then fix your file keywords based on that and then import everything back in.
One last thing... you mentioned using standard commas instead of parentheses (). I would avoid commas, if possible. They are allowed in keywords but, because some photo applications treat commas as separators, they are dangerous. Even in Bridge, you'll see that any keyword phrase containing a comma will be surrounded by quotes, which can be confusing.
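Here is a tiny Python illustration of that risk. It does not reproduce any particular application; `naive_comma_split` is a hypothetical stand-in for software that treats commas as keyword separators.

```python
def naive_comma_split(field):
    """Split a keyword field the way a comma-separating application might."""
    return [part.strip() for part in field.split(",") if part.strip()]


one_keyword = "Diptera, FLIES"  # intended as a single keyword
parts = naive_comma_split(one_keyword)
print(parts)
```

What was meant as one keyword comes out as two separate ones, and the link between the scientific name and the common name is lost.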
I was away for a while and have returned to try to get back to some kind of order with my keywords. Since Bridge had updated, the script was gone, but I got it back. Before I do any more, the status is that I had exported and tried to correct things in Excel, replacing ( and ) with commas. But I created a worse mess with the bulk replacement, as sometimes the ) was at the end and sometimes it wasn't, so I have an extra multiplicity of commas and even more keywords. You had said that commas were not recommended, though the website accepts them and rejects (). So would using | as in an earlier thread post be acceptable? I am so sorry I ruined my well-ordered phylogenetic tree and really want to have something clear again that will also be accepted by the site and amenable to Google searches. Now I have lots like this:
Variations of Diptera FLIES can be sorted out in Excel using sorting, filtering and formulas. I can work with you on that. We could work on the Excel clean up privately because that's outside the Bridge community. Sorry I missed your private message, but I finally responded to it.
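If a script is easier than formulas, the same kind of cleanup can be sketched in Python. This is a rough sketch under two assumptions: that common names are always written in ALL CAPS, as in the examples in this thread, and that you want a single separator between the scientific and common name. The `normalize` function and the " - " separator are hypothetical choices, so substitute whatever your website accepts.

```python
import re


def normalize(keyword, sep=" - "):
    """Collapse variants like 'Diptera (FLIES)' and 'Diptera, FLIES,' into one form."""
    # Treat parentheses and commas as plain gaps so their position no longer matters
    words = re.sub(r"[(),]", " ", keyword).split()
    # Assumption: the first ALL-CAPS word after the first word starts the common name
    for i, word in enumerate(words):
        if i > 0 and word.isupper():
            return " ".join(words[:i]) + sep + " ".join(words[i:])
    return " ".join(words)


variants = ["Diptera (FLIES)", "Diptera, FLIES,", "Diptera , FLIES", "Acalyptratae"]
cleaned = [normalize(v) for v in variants]
print(cleaned)
```

Because every variant maps to the same string, sorting or de-duplicating the cleaned column groups them together, which avoids the stray commas that a plain find/replace of ( and ) leaves behind.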
I don't know where the README file is, so I just tried to figure out the procedure. Initially I got a message about not using the Bridge subfolder option, but I wasn't using it, so I proceeded. I just took one group in a folder, selected thumbnails, and exported. I didn't remember how to get it into Excel, so I just pasted it into a blank sheet from Notepad. I made the desired modifications to the keywords field only and then saved it as a Homoptera (the types I'd selected) text file. I went to import, chose the file and tried to import it as both basic and IPTC Core, and it said no metadata in both cases. So I don't remember the procedure, though I can modify fields in Excel.
when I went to import
I'm attaching the README file here.
In the script, click the "?" button for instructions:
Yes, you can just make your changes in a text editor. Working in a spreadsheet can make the metadata values easier to see and evaluate, but it does introduce additional steps.
If you get error messages again, post a screenshot so I can figure out the cause. It would also help if you post your import text file so I can look for problems.
Thank you. This time I am reading the directions, but now export is grayed out and I can't proceed.
Oh, I see what's going on, you are working with search results. The script works at the folder level, either selected files or the entire folder.
You could export the entire folder and use Excel to sort or filter "1 sort" in the description column.
If you only want to import to the "1 sort" files, you could sort on that value, then delete all the other rows in the spreadsheet.
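The same filtering could be done with a short Python script instead of Excel. This is a sketch, not part of the export script: the "Description" column name and the sample rows are assumptions, and both helper functions are hypothetical.

```python
import csv
import io


def keep_matching(handle, column, needle):
    """Return the field names and only the rows whose column contains needle."""
    reader = csv.DictReader(handle, delimiter="\t")
    rows = [row for row in reader if needle in (row.get(column) or "")]
    return reader.fieldnames, rows


def to_tab_text(fieldnames, rows):
    """Write the kept rows back out as tab-delimited text with a header row."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Stand-in for the exported folder; column names are assumed for illustration
sample = (
    "File Name\tDescription\tKeywords\n"
    "spider_01.jpg\t1 sort\tAraneae\n"
    "fly_02.jpg\tkeep\tDiptera\n"
)

fieldnames, rows = keep_matching(io.StringIO(sample), "Description", "1 sort")
result = to_tab_text(fieldnames, rows)
print(result)
```

Only the "1 sort" rows remain in the output, so an import from that file would touch only those images.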
So now I successfully exported a selection of spiders from the sort folder. But when I go to Excel and choose Open, the file doesn't show in the folder where it is saved. Also, I don't see an import option on the Excel menu.
As for the previously saved Homoptera export file that I couldn't import, it seems that the Adobe forum doesn't allow me to add attachments.
You should be able to drag & drop a file in your reply on this forum. If that doesn't work, we can work it out in a private message.
To open the export .txt file in Excel, try this quick method:
Right click on the .txt file and select Open With > Excel
The longer method, which is best if you have any special characters in your metadata:
- Open Excel
- Select the Data menu > Get External Data > From text
- Text Import Wizard
- File origin = 65001- Unicode (UTF-8) or (UTF-16)
- Delimited = Tab
- Finish
Here is an instructional video. The Excel import starts at 6:30
For some reason, I don't get the option to open with Excel in the menu. Then I chose open with program and looked in the list, and still couldn't find Excel. So I used the other method, imported, and modified some fields. Then, when I named the file spider sort and saved it as txt, I got

