
Parsing/Extracting info from InDesign via API or SDK

Explorer, Sep 03, 2021


Hello,

 

After parsing or extracting info from the INDD/IDML files we receive, we want to build HTML from it or convert it into a different format such as JSON.

Without having InDesign installed, are there any APIs or SDKs from Adobe that allow extracting or parsing detailed info from INDD/IDML files? Or are there any open-source libraries that Adobe suggests to accomplish the same?

 

Looking forward to your response

 

Thanks!

TOPICS
How to , Scripting , SDK , Type

Views

1.3K

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more

Community Expert, Sep 03, 2021


IDML is your friend. The format is open: you can read the specification and parse the IDML file to pull out any information you want. Parsing it means parsing XML, as an IDML file is a collection of mainly XML files that together contain all the document info.
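
To make that concrete, here is a minimal sketch using only Python's standard library. It relies on a few assumptions that are not stated in this thread: that an IDML file is a ZIP package, that story parts live under `Stories/`, and that text sits in `<Content>` elements with `<Br/>` marking paragraph breaks. The function names are made up for illustration; check the IDML specification before relying on any of this.

```python
import zipfile
import xml.etree.ElementTree as ET


def list_parts(idml):
    """Return the sorted names of all XML parts inside an IDML package."""
    with zipfile.ZipFile(idml) as package:
        return sorted(n for n in package.namelist() if n.endswith(".xml"))


def story_texts(idml):
    """Map each story part name to the plain text it contains.

    Assumption: text lives in <Content> elements and <Br/> marks a
    paragraph break, as described in the IDML specification.
    """
    texts = {}
    with zipfile.ZipFile(idml) as package:
        for name in package.namelist():
            if not (name.startswith("Stories/") and name.endswith(".xml")):
                continue
            root = ET.fromstring(package.read(name))
            pieces = []
            # iter() walks the tree in document order, so text and
            # paragraph breaks come out in reading order.
            for element in root.iter():
                if element.tag == "Content":
                    pieces.append(element.text or "")
                elif element.tag == "Br":
                    pieces.append("\n")
            texts[name] = "".join(pieces)
    return texts
```

Calling `story_texts("layout.idml")` (a hypothetical file name) would return a dictionary keyed by story part name. Formatting, geometry and styles are in other attributes and parts that this sketch ignores.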

-Manan

Explorer, Sep 03, 2021


Thanks for your response Manan.

 

As a backend developer who is fairly new to the IDML format, I wanted to understand whether it contains all the info about the file (element location, overlay info, formatting info, etc.) or whether some info is left out.

Also, are images and other graphic assets accessible via the XML as well?

 

Thanks!

Community Expert, Sep 03, 2021


Yes @Nikhil Ranka, the IDML file contains all the information, like styling, color, object properties, etc. That information is all that is needed to construct the document in InDesign. Regarding graphic assets, the IDML file contains the file path of the placed file; for embedded images, I think it stores the base64 data of the embedded asset.
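
As a rough illustration of pulling out object properties and placed-file paths, the sketch below reads the spread parts of an IDML package. It assumes details that come from the IDML specification rather than from this thread: spreads are stored under `Spreads/`, page items carry a `Self` id and an `ItemTransform` attribute of six numbers, and a placed graphic has a `<Link>` child whose `LinkResourceURI` holds the file path. The function names are hypothetical, and embedded images are not handled.

```python
import zipfile
import xml.etree.ElementTree as ET


def _spread_roots(idml):
    """Yield the parsed root element of every spread part in the package."""
    with zipfile.ZipFile(idml) as package:
        for name in package.namelist():
            if name.startswith("Spreads/") and name.endswith(".xml"):
                yield ET.fromstring(package.read(name))


def placed_links(idml):
    """Collect the LinkResourceURI of every <Link> element in the spreads."""
    uris = []
    for root in _spread_roots(idml):
        for link in root.iter("Link"):
            uri = link.get("LinkResourceURI")
            if uri:
                uris.append(uri)
    return uris


def item_transforms(idml, tag="TextFrame"):
    """Map each page item's Self id to its ItemTransform as six floats.

    Assumption: the transform is relative to the parent's coordinate
    space, so real page positions need further work with the path points.
    """
    transforms = {}
    for root in _spread_roots(idml):
        for item in root.iter(tag):
            raw = item.get("ItemTransform")
            if raw:
                transforms[item.get("Self")] = [float(v) for v in raw.split()]
    return transforms
```

Passing `tag="Rectangle"` to `item_transforms` would list rectangle frames instead of text frames.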

-Manan

Explorer, Sep 05, 2021


Thanks @Manan Joshi for your detailed response.

 

Explorer, Sep 05, 2021


@Manan Joshi Are there any APIs or libraries, Adobe or non-Adobe, that allow extraction of this info?

LEGEND, Sep 05, 2021


The APIs all require InDesign, of course! Adobe made IDML for their own purposes. They do not want to encourage non-Adobe apps to work without InDesign; Adobe prefer to sell InDesign or InDesign Server. So this is a complex adventure in reverse engineering, not a simple task. IDML contains the info InDesign uses to do the layout, not a convenient extraction for reuse. I believe you would have to duplicate some of that complex layout work yourself. But it's the only way.

Explorer, Sep 06, 2021


Thanks for sharing Adobe's approach on this. So I am assuming that InDesign Server would be required to build an automated publishing solution that converts InDesign files to HTML, PDF or other formats.

Any idea if InDesign Server is available via an API? Google did not help.

Community Expert, Sep 06, 2021


There are some samples that demonstrate querying, creating, etc. of IDML without the use of InDesign, of course. You can have a look at them by downloading the C++ SDK from the following site:

https://www.adobe.io/console/servicesandapis

Look at the following path in the SDK folder: <SDK_ROOT>/devtools/idmltools

As I said, it's all about understanding the file format and using XSL to parse it and get the stuff you need; these samples do the same.

As far as third-party APIs or SDKs are concerned, there are some available in Java and Python, but all of them have a paid plan. You can google them if you want to go this route; otherwise there is no shortcut other than getting your hands dirty with the format specification.

-Manan 

New Here, Nov 19, 2024


Sorry, @Nikhil Ranka, I'm late to the party. I had requirements similar to yours and created an open-source library that converts IDML documents into JSON and back again. This makes it easy to parse or even modify the information in an InDesign document. The library is written in PHP and can be found here: https://github.com/BitAndBlack/idml-json-converter Maybe it will help you as well. 🙂
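
For readers who want to see the general XML-to-JSON idea outside PHP, here is a small generic sketch in Python. It is not the API or the output format of the library linked above; the function names and the JSON shape (`tag`, `attributes`, `text`, `children`) are assumptions chosen only for illustration, and it converts in one direction only.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Convert one XML element and its descendants into plain dicts and lists."""
    node = {"tag": element.tag, "attributes": dict(element.attrib)}
    # Keep text only when it is more than whitespace; tail text is ignored.
    if element.text and element.text.strip():
        node["text"] = element.text
    children = [element_to_dict(child) for child in element]
    if children:
        node["children"] = children
    return node


def xml_to_json(xml_text):
    """Serialize an XML document (for example one IDML part) as a JSON string."""
    root = ET.fromstring(xml_text)
    return json.dumps(element_to_dict(root), ensure_ascii=False, indent=2)
```

Feeding it the contents of a single story or spread part gives a JSON tree that is easier to inspect from backend code.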
