Participant
May 2, 2024
Answered

Importing a DOCX file into InDesign not working

  • May 2, 2024
  • 5 replies
  • 2127 views

I am working with a PDF file of a client's memoir and have converted each chapter to DOCX format using a free online tool. When I open a chapter on my Mac, it opens in Pages, and the text styles differ from those in the PDF. When I import it into an InDesign text box for the book design, some words come in run together. For example, the text looks like this:

" I saw a duck, butitranaway."

I don't want to go through the text manually figuring out which words to separate, especially as it contains a lot of slang that I don't fully know. I would like a more efficient way to fix this. I have tried googling the problem, but I can't find anyone who has solved it. The memoir is around 280 pages, so fixing everything by hand is not worth it money-wise. Does anyone know a solution?

 
 
 

5 replies

Robert at ID-Tasker
Legend
May 3, 2024

@defaultov3ej5blkkjy

 

Why is this memoir in PDF? Can't you get your hands on the source file?

 

Participant
May 3, 2024

I wanted to see if I could work from just the PDF without asking, as they want to make sure their content stays safe.

 

Robert at ID-Tasker
Legend
May 3, 2024
quote

I wanted to see if I could work from just the PDF without asking, as they want to make sure their content stays safe.

 

By @defaultov3ej5blkkjy

 

I'm sorry, but I don't understand.

 

"Safe"?? 

 

Brad @ Roaring Mouse
Community Expert
May 3, 2024

You have to remember that a PDF is, essentially, merely a container of the print instructions to create a page, and as such has no idea of how the original document was constructed or laid out. More often than not, chunks of text are broken up into individual objects and, whether you use Acrobat's own export-to-Word function, or a sketchy free online tool, both have to GUESS how to put things back together, like how words are spaced and also which blocks of text are part of a paragraph. Better tools do this better than others, but NONE are perfect.
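To illustrate the guessing described above, here is a minimal Python sketch of how a converter might decide where spaces go. It is not taken from Acrobat or any real tool: the `Fragment` class, the `join_fragments` function, the positions, and the thresholds are all hypothetical. The point is only that a space is inferred from the gap between positioned text objects, so a threshold that is slightly too large fuses words together.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """A hypothetical positioned chunk of text, as a PDF might store it."""
    text: str
    x: float      # left edge of the chunk on the line
    width: float  # horizontal extent of the chunk


def join_fragments(fragments, gap_threshold):
    """Rebuild one line, inserting a space wherever the horizontal gap
    between neighbouring fragments is larger than gap_threshold."""
    ordered = sorted(fragments, key=lambda f: f.x)
    pieces = []
    prev_right = None
    for frag in ordered:
        if prev_right is not None and frag.x - prev_right > gap_threshold:
            pieces.append(" ")
        pieces.append(frag.text)
        prev_right = frag.x + frag.width
    return "".join(pieces)


# Made-up positions: every word is separated by a gap of 1.5 units.
line = [
    Fragment("but", 0.0, 15.0),
    Fragment("it", 16.5, 8.0),
    Fragment("ran", 26.0, 14.0),
    Fragment("away.", 41.5, 24.0),
]

print(join_fragments(line, gap_threshold=1.0))  # gaps count as spaces
print(join_fragments(line, gap_threshold=2.0))  # gaps are ignored, words fuse
```

With the smaller threshold the words stay separate; with the larger one they run together, much like the "butitranaway" example in the original question. Real converters use more elaborate heuristics, but they face the same ambiguity.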

James Gifford—NitroPress
Legend
May 3, 2024

Excellent summary. I'll just add that it was never meant to be an editable format, either, any more than a printed sheet of paper. All of the edit/modify/extract etc. features are badly glued on to a structure that doesn't really support them.

Barb Binder
Community Expert
May 3, 2024

Hi @defaultov3ej5blkkjy:

 

Here's one more thing to look at: click on the sentence with "butitranaway". Enable Type > Show Hidden Characters and then choose Edit > Edit in Story Editor. This is an unformatted view of the same story. Do you see blue dots between the words?

  • If so, that means you're dealing with overrides that are collapsing the spaces. They are easy to remove.
  • If not, I would do the PDF to InDesign conversion in Adobe Acrobat, which does a great job these days. I'm assuming you don't already have it since you used a free conversion tool. But you could ask a friend, download the free 7-day trial, or subscribe for a month. At US$30 for a month, that's well worth it compared with trying to fix this manually.

 

~Barb

~Barb at Rocky Mountain Training
James Gifford—NitroPress
Legend
May 3, 2024

I realize there's a limit to the number of blades even a mega-Swiss Army Knife like ID can have, but especially for these fairly common workflows between Adobe/Adobe-compliant formats and such... I just sigh at the need for One More D*mned Plugin. 😛

 

(Plugin, conversion service, outside tool, script, helper, whatever.)

James Gifford—NitroPress
Legend
May 2, 2024

The takeaway here is that not all putative Word files are actually fully Word standard or compliant. Most export options from things like Google Docs and Pages do just a good enough job that Word itself can usually open the files. But most (I'm tempted to say all) files from these secondary sources and converters are not compatible with InDesign import.

 

The simple solution is to open any such files in a real copy of Word, and re-save in RTF, DOC and DOCX format under a new name. InDesign will often import one of those much more cleanly than the pretender version.

 

As Derek notes, using Acrobat Pro to export PDF to Word is one of the more reliable workflows, although it's still a good idea to open the result and do basic tidying and cleanup in Word before saving as a valid Word file set.

 

I have also seen more than one reference to "spacing issues with Word files" that trace to the file being set for full justification. Select all styles in Word and set them to left justification, and on top of the other fixes above, I'd be surprised if that doesn't clear up the spacing issue.

 

All that said, keep in mind that PDF does not store most text as absolutely linear flows. Depending on what tool and version created the PDF (and there are many, many second-rate ones), the text may actually have soft or hard returns at the end of each line as it was presented in the PDF. Search and replace to eliminate all soft returns (change to spaces) can help at the technical level, although you'll still have a lot of cleanup formatting and proofreading to do.
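For anyone who would rather script the soft-return cleanup than run it as a search and replace, here is a minimal Python sketch. It assumes the chapter has been saved as plain text and that real paragraph breaks show up as blank lines, which may not hold for every converter. The function name `unwrap_lines` is made up for this example. It does not rejoin words that were hyphenated across a line break, so proofreading is still needed.

```python
import re


def unwrap_lines(text):
    """Turn single line breaks inside a paragraph into spaces,
    keeping blank lines as paragraph boundaries."""
    # Normalise Windows and old Mac line endings first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Assumption: a blank line marks a genuine paragraph break.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = []
    for para in paragraphs:
        # Replace each in-paragraph line break with a single space.
        joined = re.sub(r"\s*\n\s*", " ", para.strip())
        # Collapse any runs of spaces left over from the original layout.
        joined = re.sub(r" {2,}", " ", joined)
        if joined:
            cleaned.append(joined)
    return "\n\n".join(cleaned)


sample = "I saw a duck,\nbut it ran\naway.\n\nThe next day\nit came back."
print(unwrap_lines(sample))
```

The same idea can be applied in Word or InDesign with find and change. The script is only a convenience for running it over many chapter files at once.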

 

There just isn't any really clean way to get text back out of PDF, not without fairly specialized tools that are often too expensive for one-shot use.

Eugene Tyson
Community Expert
Correct answer
May 3, 2024
quote

 

 

The simple solution is to open any such files in a real copy of Word, and re-save in RTF, DOC and DOCX format under a new name. InDesign will often import one of those much more cleanly than the pretender version.

 


By @James Gifford—NitroPress

 

Yes, I was quite surprised this works. I found this out a few months ago too.

But no matter what - something always goes out of kilter.

 

The best success I have had is saving as a .doc file, not a .docx.

James Gifford—NitroPress
Legend
May 3, 2024

MS ran out of kilter long ago. 🙂

 

I have found all three formats to work in varying situations, but DOC does seem to be the most consistent. I automatically create all three options for any but the most expedient workflow, just so I don't have to back up and dig for more kilter.

Derek Cross
Community Expert
May 2, 2024

Have you tried exporting from the PDF to a Word Doc using Acrobat Pro?

Participant
May 3, 2024

I have not, I will try this. Thank you.