
Inflated JPG File Size - Photoshop Document:Ancestors Metadata

Enthusiast, Feb 02, 2016

Has anyone run into this issue?

I have a JPG from which I removed all image content and filled with white. When I save it, the size is 7.89 MB. If I go into Bridge and clear the metadata (most likely the "<photoshop:DocumentAncestors>" entry, which contains an "<rdf:Bag>" and 100+ lines of hexadecimal code), the file size goes down to around 150 KB. Maybe I'm missing a save option in Photoshop, but it seems this metadata should be scrubbed when the file is saved. Additionally, does anyone know what kind of information is stored in Document Ancestors, or whether it embeds image information that could potentially be extracted?

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines

Community Expert, Feb 02, 2016

According to the XMP specifications (search Google for "XMP Specifications Part" -- there are three parts), the Document Ancestors denote "copy-and-paste or place" operations. These do not identify what was incorporated into the file -- it could be an entire picture or a portion of a picture. We only know that these four separate files were incorporated into an existing file. These records identify other documents (DID) that were added to this document. This is explicitly the definition of a composition: a picture made from other pictures.

Gene
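
To make the structure described in the reply above concrete: in a serialized XMP packet, the ancestors are stored as an unordered list (an rdf:Bag) of document IDs under photoshop:DocumentAncestors. The sketch below is plain JavaScript that runs in Node, outside Photoshop. It uses a regular expression rather than a real XMP parser, the sample packet and its IDs are made up for illustration, and listDocumentAncestors is a hypothetical helper name.

```javascript
// Hypothetical, hand-written XMP fragment; the IDs are invented for illustration.
const samplePacket = [
  '<rdf:Description xmlns:photoshop="http://ns.adobe.com/photoshop/1.0/">',
  '  <photoshop:ColorMode>3</photoshop:ColorMode>',
  '  <photoshop:DocumentAncestors>',
  '    <rdf:Bag>',
  '      <rdf:li>xmp.did:00000000-0000-0000-0000-000000000001</rdf:li>',
  '      <rdf:li>xmp.did:00000000-0000-0000-0000-000000000002</rdf:li>',
  '      <rdf:li>uuid:00000000000000000000000000000003</rdf:li>',
  '    </rdf:Bag>',
  '  </photoshop:DocumentAncestors>',
  '</rdf:Description>'
].join('\n');

// Returns the document IDs listed under photoshop:DocumentAncestors.
// Regex-based on purpose: good enough for inspection, not a full XMP parser.
function listDocumentAncestors(xmpText) {
  const block = /<photoshop:DocumentAncestors>([\s\S]*?)<\/photoshop:DocumentAncestors>/.exec(xmpText);
  if (!block) {
    return []; // no ancestors element in this packet
  }
  const ids = [];
  const item = /<rdf:li>([^<]*)<\/rdf:li>/g;
  let match;
  while ((match = item.exec(block[1])) !== null) {
    ids.push(match[1].trim());
  }
  return ids;
}

const ancestors = listDocumentAncestors(samplePacket);
console.log(ancestors.length + ' ancestor IDs found'); // prints "3 ancestor IDs found"
```

Each entry is only an identifier of another document. Nothing in the list itself carries pixel data, which matches the description above that the records say which documents were incorporated but not what was taken from them.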

Enthusiast, Feb 02, 2016

Thanks for this! Unfortunately, Save for Web isn't an option since these JPGs are for print or reference, and I generally have to email several of them to coworkers. Save for Web also fails when an image is too big. I cleared this info directly from a Photoshop file and resaved it, only to find that Photoshop immediately added it back.

I will definitely look that up, and I hope to get a better understanding of why it's inflating my files so much. I may have to figure out how to scrub the file on save with a script.

Community Expert, Feb 02, 2016

That's why I removed Save for Web (SFW) from my reply. It shouldn't be hard to remove what you don't want and keep important information like print size.

Also if Photoshop Preferences > History Log is active, you can decide to disable that or direct the History to a text file if you want clients to have a history of your work for billing purposes.

Gene

Enthusiast, Feb 18, 2016

I was able to reach out to Adobe through my work. They said it is abnormal for a file to contain over 100,000 lines of Document Ancestors, and thought it might be because the file was a template. It turns out a lot of the art I work with has extensive lists of ancestors, and when those assets are placed into a completely new file, that file acquires them. A lot of these assets are CG renders, which I would also have expected to be new files. While I'm not sure where the Document Ancestors originated, I was able to figure out how to eliminate them:

// Removes the photoshop:DocumentAncestors property from the active document's XMP.
// Save the document afterwards; the file size only drops once it is re-saved.
function deleteDocumentAncestorsMetadata() {
    // Check for Photoshop specifically, or this will cause errors in other hosts.
    var whatApp = String(app.name);
    if (whatApp.indexOf("Photoshop") === -1) {
        return;
    }

    if (!documents.length) {
        alert("There are no open documents. Please open a file to run this script.");
        return;
    }

    // Load the XMP scripting library if it is not already loaded.
    if (ExternalObject.AdobeXMPScript == undefined) {
        ExternalObject.AdobeXMPScript = new ExternalObject("lib:AdobeXMPScript");
    }

    var xmp = new XMPMeta(activeDocument.xmpMetadata.rawData);

    // Begone foul Document Ancestors!
    xmp.deleteProperty(XMPConst.NS_PHOTOSHOP, "DocumentAncestors");
    app.activeDocument.xmpMetadata.rawData = xmp.serialize();
}

// Now run the function to remove the document ancestors
deleteDocumentAncestorsMetadata();

Adobe mentioned that these entries are meant for file forensics and do not contain any sensitive information. I also tested scripting this with a loop provided in one of the JavaScript reference documents. While that does eliminate the ancestors without opening the file, the file size is not reduced until the file is opened and saved again. So the script must be run on an open file before saving, which makes it easy to incorporate into my save scripts. Problem solved!

If you are new to scripting, copy the script above into a plain text document (no formatting, or it may not work) and change the extension to .jsx. Then, in Photoshop, go to File > Scripts > Browse and locate the script to run it on the open file. This is tedious to do per file, so if you have many files, keep the script in a fixed location and record an action that runs it. Once the action is recorded, you can batch it across files via File > Automate > Batch. Because the file needs to be saved to eradicate the document ancestors, I recommend pairing the script action with a Save As to a fixed location, since Save As steps in an action remember the folder you saved to.
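
For a rough idea of how much of a packet the ancestors list can account for, the same removal can be mimicked on serialized XMP text outside Photoshop. This is only an illustrative sketch in plain JavaScript (Node): it swaps the XMPScript deleteProperty call for a regular expression, the packet is synthetic with made-up IDs, and buildPacket and stripDocumentAncestors are hypothetical helper names. It does not read or modify image files.

```javascript
// Builds a synthetic XMP fragment with `count` made-up ancestor IDs.
function buildPacket(count) {
  const items = [];
  for (let i = 0; i < count; i++) {
    items.push('<rdf:li>xmp.did:' + String(i).padStart(32, '0') + '</rdf:li>');
  }
  return '<rdf:Description xmlns:photoshop="http://ns.adobe.com/photoshop/1.0/">' +
    '<photoshop:ColorMode>3</photoshop:ColorMode>' +
    '<photoshop:DocumentAncestors><rdf:Bag>' + items.join('') +
    '</rdf:Bag></photoshop:DocumentAncestors>' +
    '</rdf:Description>';
}

// Removes the whole photoshop:DocumentAncestors element from XMP text.
// Assumes the element form built above (not an attribute or a different prefix).
function stripDocumentAncestors(xmpText) {
  return xmpText.replace(
    /<photoshop:DocumentAncestors>[\s\S]*?<\/photoshop:DocumentAncestors>/g,
    ''
  );
}

const before = buildPacket(100000); // 100,000 entries, as in the case Adobe called abnormal
const after = stripDocumentAncestors(before);

console.log('before: ' + Buffer.byteLength(before, 'utf8') + ' bytes');
console.log('after:  ' + Buffer.byteLength(after, 'utf8') + ' bytes');
```

With 100,000 synthetic entries the list alone runs to several megabytes of text, which is the same order of magnitude as the inflated JPGs discussed in this thread. Other properties, such as photoshop:ColorMode here, are left untouched.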

New Here, Sep 01, 2017

This saved me. So helpful. We had 100 KB files that refused to save at anything less than 8 MB because of all the document ancestor data that had built up over the years.

At my company, people just open last month's PSD file (as a template), replace the contents, save under a new filename, delete the layer, and move on. Over years of this, the data built up to the point where we couldn't get anything to come out at the right size without doing Save for Web (Legacy) on every single file, when a simple Save-As-JPG action/batch would be much easier.

Thank you so much.

Enthusiast, Sep 01, 2017

Glad this thread helps! Also worth noting: each Photoshop smart object retains its own document ancestors, so a working file that uses a lot of them can grow significantly. Similarly, InDesign retains info like document ancestors that can be expunged with a simple Save As:

How to reduce the InDesign file size?

Just thought it worth mentioning, as I recently ran into a 5 GB .indd file that was reduced significantly by this method. Adobe stores way too much information hidden in metadata that the everyday user isn't privy to.

Community Expert, Sep 02, 2017

Community Expert, Sep 02, 2017

Side note re InDesign: If you really want to clean up an ID document, export to IDML and reopen, save as new .indd document. This clears out all kinds of accumulated junk. It's a standard troubleshooting procedure.

Enthusiast, Sep 09, 2017

Stephen_A_Marsh wrote

Thanks for the clarification! In the link I posted, [Jongware] mentioned it was undo information that built up over time. Since Document Ancestors contain copy-and-paste information and other artifacts, my thought was that they were stored similarly. I'm now wondering if one could scrub this inflation in InDesign via scripting...

Participant, Sep 11, 2018

Perfect!  Thank you

Explorer, Sep 17, 2021

I logged in just to say a huge THANK YOU and you are a genius. True lifesaver you are! 😄

Community Beginner, Oct 31, 2022

This really helped me, so I wanted to say thank you!

New Here, Jul 06, 2023

I have to admit, I had NO FAITH whatsoever that this script would work, and when I ran it, it seemed like nothing happened. Then I saved the file that I ran the script on and it went from 101 MB down to 1.6 MB. Unbelievable. I owe you big time. Thank you!

Explorer, Mar 16, 2016

Is there any way to do this on Photoshop files without the use of scripts? I am not a programmer or coder. Save for Web does the trick for a single save of a layer, but what about a Photoshop file with layers or smart objects, or one that is bigger than 8K? This problem is very frustrating.

Enthusiast, Mar 16, 2016

You can open the XMP in Bridge and apply a template that doesn't contain the document ancestors, but this is really the only other way. If you're working mostly on Photoshop files, the inflation is negligible, and you hardly notice it on a 100 MB+ file. But if you are like me and export JPGs to email, it is a huge pain. The script is incredibly simple, and I can post it if you want; combined with an action, it's super easy to use. Surprisingly, it takes only 11 lines of code to eradicate an annoying issue.

Explorer, Mar 16, 2016

Thanks for replying so quickly. Yes, I tried that exact same thing with Bridge, and it worked on some files, but today I can't get it to work: the templates are grayed out when I try to apply them. When I import them into Photoshop via File Info and select Clear, it does wipe and replace the metadata, but as soon as I close File Info it's back. Ugh! And yes, I can no longer run Layers to Files with this in the Photoshop file; every JPEG it outputs is 30 MB or more, and the only workaround is Save for Web, which is very limiting. What a royal pain in the a**. I would love to try your script, and yes, if you could explain how to apply and run it, that would be much appreciated. I've been dealing with this issue for many months now and am at my wits' end.

Enthusiast, Mar 16, 2016

Sure thing. I think I ran into the same thing in Photoshop: even when saving out, the metadata miraculously came back. The trick with the script is that it has to save the file, or save a new file, in order to work. If the script is run without opening the file, the file holds on to its disk space, which is why my script relies on saving.

When I was emailing with Adobe support, they too recommended Save for Web, and quickly withdrew that recommendation when I raised the same point: it does not work well for anything over web size.

If you want to send me a file to test, shoot me a message with a Dropbox link, WeTransfer link, or email address, and I can test it and make sure it works for your issue. You can even fill the image with a solid color (if you'd prefer not to share a full image); if it is truly a metadata issue, the file will still be inflated.

Explorer, Mar 18, 2016

That's very generous of you. The file I have is quite big, i.e. 4GB. Would you mind running it? If so send me your email to michael@raygunstudio.com and I'll post it via wetransfer. Thank you!

Enthusiast, Mar 18, 2016

I emailed you, and I also posted the script here using the forum's code markup feature, with instructions included.

Community Expert, Apr 06, 2016

I too would be interested in looking at a smaller sized sample file bloated with this unwanted metadata. It *should* be easy enough to remove using ExifTool.

Community Expert, Apr 21, 2017

It is easy to remove only this metadata using ExifTool; see the topic "Photoshop files much larger than usual."

New Here, Apr 20, 2017

I have another way to work around this: select all the layers you want, right-click, choose Duplicate Layers, then select New as the destination at the bottom of the options, and save the new document. You will find that it is much smaller than the original file.

New Here, Jun 13, 2017

I too have this problem. I find it amazing that there isn't an easy solution to this issue within Photoshop. I'm trying to export simple JPEGs and they are coming out at way over 100 MB!

Regarding the text for the script, should it be pasted into TextEdit (on a Mac) and then saved with the extension .jsx?

Community Expert, Jun 13, 2017

Correct, and for best results place it in the Startup Scripts folder of Bridge. That way it loads as a menu item under Tools.

You can select one or many files, run the command and it removes the ancestor metadata without the need to resave the files.
