
Importing a DOCX file into InDesign is not working

Community Beginner, May 02, 2024

I am working with a PDF file of a client's memoir and have converted each chapter to DOCX format using a free online tool. When I open a chapter on my Mac, it opens in Pages and the text styles differ from the PDF. When I import it into an InDesign text box for the book design, some words come in run together. For example, the text looks like this:

" I saw a duck, butitranaway."

I don't want to go through manually working out which words to separate, since the text includes a lot of slang that I don't fully know. I have tried googling this, but I can't find anyone who has solved it. The memoir is around 280 pages, so fixing everything by hand is not cost-effective. Does anyone know a more efficient solution?

 
 
 
TOPICS
How to , Print , Publish online , Type

Community Expert, May 02, 2024

Have you tried exporting from the PDF to a Word Doc using Acrobat Pro?

Community Beginner, May 03, 2024

I have not, I will try this. Thank you. 

Community Expert, May 02, 2024

The takeaway here is that not all putative Word files are actually fully Word-standard or compliant. Most export options from things like Google Docs and Pages do just a good enough job that Word itself can usually open the files. But most (I'm tempted to say all) files from these secondary sources and converters are not compatible with InDesign import.

 

The simple solution is to open any such files in a real copy of Word, and re-save in RTF, DOC and DOCX format under a new name. InDesign will often import one of those much more cleanly than the pretender version.

 

As Derek notes, using Acrobat Pro to export PDF to Word is one of the more reliable workflows, although it's still a good idea to open the result and do basic tidying and cleanup in Word before saving as a valid Word file set.

 

I have also seen more than one reference to "spacing issues with Word files" that trace to the file being set for full justification. Select all styles in Word and set them to left justification, and on top of the other fixes above, I'd be surprised if that doesn't clear up the spacing issue.

 

All that said, keep in mind that PDF does not store most text as absolutely linear flows. Depending on what tool and version created the PDF (and there are many, many second-rate ones), the text may actually have soft or hard returns at the end of each line as it was presented in the PDF. Search and replace to eliminate all soft returns (change to spaces) can help at the technical level, although you'll still have a lot of cleanup formatting and proofreading to do.
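The search-and-replace step described above can also be done on plain text before it ever reaches Word or InDesign. The following is a minimal sketch, assuming the extracted text has been saved as plain text and that a blank line marks a real paragraph break; the function name `unwrap_lines` is hypothetical, and the result still needs proofreading.

```python
import re


def unwrap_lines(text: str) -> str:
    """Turn line-end returns into spaces, keeping blank lines as paragraph breaks."""
    # Normalise Windows/old-Mac line endings and common soft-return characters
    # (U+2028 line separator, vertical tab) to a plain newline.
    text = re.sub(r"\r\n?|\u2028|\x0b", "\n", text)

    cleaned = []
    # Assumption: one or more blank lines separate genuine paragraphs.
    for para in re.split(r"\n\s*\n", text):
        # Each remaining line break becomes a single space.
        para = re.sub(r"\s*\n\s*", " ", para)
        # Collapse any doubled spaces left behind.
        para = re.sub(r" {2,}", " ", para).strip()
        if para:
            cleaned.append(para)
    return "\n\n".join(cleaned)


if __name__ == "__main__":
    sample = "I saw a duck,\nbut it ran away.\n\nNext paragraph\r\nhere."
    print(unwrap_lines(sample))
```

Words that were hyphenated across a line break are deliberately left alone here, because a script cannot tell a line-end hyphen from a genuine one.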

 

There just isn't any really clean way to get text back out of PDF, not without fairly specialized tools that are often too expensive for one-shot use.

Community Expert, May 03, 2024 (marked as the correct answer)
quote

 

 

The simple solution is to open any such files in a real copy of Word, and re-save in RTF, DOC and DOCX format under a new name. InDesign will often import one of those much more cleanly than the pretender version.

 


By @James Gifford—NitroPress

 

Yes, I was quite surprised this works. I found this out a few months ago too.

But no matter what - something always goes out of kilter.

 

The best success I have had is saving as a .doc file, not a .docx.

Community Expert, May 03, 2024

MS ran out of kilter long ago. 🙂

 

I have found all three formats to work in varying situations, but DOC does seem to be the most consistent. I automatically create all three options for any but the most expedient workflow, just so I don't have to back up and dig for more kilter.

Community Expert, May 03, 2024

Yeah, the various ways give different results. Sometimes I import the .docx or the .doc file; sometimes it needs to go all the way to RTF.

Sometimes this doesn't work.

 

Sometimes I'll copy and paste directly. Otherwise I try .docx first (as supplied), then .doc, then RTF as a last resort.

 

Things often don't import/paste in correctly - and it's a struggle. 

Community Beginner, May 03, 2024

Thank you, this is very helpful. I wanted to learn more about InDesign while working on this project, and it is good to learn why PDFs don't work well.

 

LEGEND, May 03, 2024
quote

Thank you, this is very helpful. I wanted to learn more about InDesign while working on this project, and it is good to learn why PDFs don't work well.


By @defaultov3ej5blkkjy

 

PDF is the end result of the work done in InDesign.

 

It's never "the first step" or source - it's the "last resort" if you don't have access to the original file. 

 

Community Expert, May 03, 2024

Hi @defaultov3ej5blkkjy:

 

Here's one more thing to look at: click on the sentence with "butitranaway". Enable Type > Show Hidden Characters and then choose Edit > Edit in Story Editor. This is an unformatted view of the same story. Do you see blue dots between the words?

  • If so, that means you're dealing with overrides that are collapsing the spaces. They are easy to remove.
  • If not, I would do the PDF to InDesign conversion in Adobe Acrobat, which does a great job these days. I'm assuming you don't already have it since you used a free conversion tool. But you could ask a friend, download the free 7-day trial, or subscribe for a month. At US$30 for a month, that's well worth it vs. trying to fix this manually.

 

~Barb
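The same "are the spaces actually there?" question can be asked of the converted file itself, before it is placed in InDesign. This is a rough sketch, assuming the file is a standard .docx (a zip archive with the body text in `word/document.xml`); the helper names `docx_paragraphs` and `show_spaces` are hypothetical, and the dot marker only imitates the hidden-characters view.

```python
import zipfile
from xml.etree import ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(source):
    """Return the raw text of each paragraph in a .docx (path or binary file object)."""
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(W_NS + "p"):
        pieces = []
        for node in para.iter():
            if node.tag == W_NS + "t":        # a run of text
                pieces.append(node.text or "")
            elif node.tag == W_NS + "tab":    # a tab character
                pieces.append("\t")
            elif node.tag == W_NS + "br":     # a soft line break
                pieces.append("\n")
        paragraphs.append("".join(pieces))
    return paragraphs


def show_spaces(text: str) -> str:
    """Make ordinary spaces visible as middle dots."""
    return text.replace(" ", "\u00b7")


if __name__ == "__main__":
    import sys

    # Usage: python check_docx.py chapter1.docx
    for line in docx_paragraphs(sys.argv[1]):
        print(show_spaces(line))
```

If the dots are missing here, the spaces were already lost by the converter and no InDesign import setting will bring them back. If they are present, the problem is on the import or formatting side.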

Community Expert, May 03, 2024

I realize there's a limit to the number of blades even a mega-Swiss Army Knife like ID can have, but especially for these fairly common workflows between Adobe/Adobe-compliant formats and such... I just sigh at the need for One More D*mned Plugin. 😛

 

(Plugin, conversion service, outside tool, script, helper, whatever.)

Community Expert, May 03, 2024

You have to remember that a PDF is, essentially, merely a container of the print instructions to create a page, and as such has no idea of how the original document was constructed or laid out. More often than not, chunks of text are broken up into individual objects and, whether you use Acrobat's own export-to-Word function, or a sketchy free online tool, both have to GUESS how to put things back together, like how words are spaced and also which blocks of text are part of a paragraph. Better tools do this better than others, but NONE are perfect.
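To make the guessing concrete, here is a toy sketch of how a converter might decide where spaces go when a line arrives as separately positioned chunks. It is not how Acrobat or any particular tool works; the `Chunk` type, the coordinates and the thresholds are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    x: float      # left edge of the chunk on the page, in points (hypothetical)
    width: float  # width of the drawn text, in points (hypothetical)


def join_chunks(chunks, gap_threshold):
    """Rebuild one line of text, inserting a space wherever the horizontal
    gap between neighbouring chunks exceeds gap_threshold."""
    pieces = []
    prev_end = None
    for chunk in sorted(chunks, key=lambda c: c.x):
        if prev_end is not None and chunk.x - prev_end > gap_threshold:
            pieces.append(" ")
        pieces.append(chunk.text)
        prev_end = chunk.x + chunk.width
    return "".join(pieces)


if __name__ == "__main__":
    # Four words drawn 2.5 pt apart, with no space characters stored at all.
    line = [
        Chunk("but", 0.0, 15.0),
        Chunk("it", 17.5, 8.0),
        Chunk("ran", 28.0, 15.0),
        Chunk("away.", 45.5, 25.0),
    ]
    print(join_chunks(line, gap_threshold=1.0))  # but it ran away.
    print(join_chunks(line, gap_threshold=3.0))  # butitranaway.
```

The same page data gives correct or fused words depending only on the threshold the tool picks, which is why tightly set or justified text tends to convert worse.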

Community Expert, May 03, 2024

Excellent summary. I'll just add that it was never meant to be an editable format, either, any more than a printed sheet of paper. All of the edit/modify/extract etc. features are badly glued on to a structure that doesn't really support them.

LEGEND, May 03, 2024

@defaultov3ej5blkkjy

 

Why is this memoir in PDF? Can't you get your hands on the source file?

 

Community Beginner, May 03, 2024

I wanted to see if I could still use just the PDF without asking, as they want to make sure their content stays safe.

 

LEGEND, May 03, 2024
quote

I wanted to see if I could still use just the pdf without asking as they want to make sure their content stays safe. 

 

By @defaultov3ej5blkkjy

 

I'm sorry, but I don't understand.

 

"Safe"?? 

 

Community Beginner, May 03, 2024

That is fine, you don't need to understand that part. It's an NDA contract.

LEGEND, May 04, 2024
quote

That is fine, you don't need to understand that part. It's an NDA contract.


By @defaultov3ej5blkkjy

 

You said it's just a memoir...

 

So you are working on a text, but you can't have access to the original source and need to recreate it from the "printed" version?

 

Community Expert, May 04, 2024

Since the primary question (all the lousy ways to get text out of PDF for editing) has been answered, I will note that—

  • First-time/one-time/amateur authors are always convinced/terrified that their work will be stolen by the first person allowed to glance at it;
  • It's not unreasonable to have some protections, mostly a formal contract or NDA, with an unknown print/publication provider;
  • Someone in the chain thinks PDF is "safe" since they can't edit it like the Word version;
  • Just maybe the OP/provider here is running ahead of the process and trying to do some spec work, before having full authorization to use the live material.

 

All of the problems would be solved or minimized by getting the live material, so resolving the reasons why that hasn't been done (some/any of the above, or who-knows) is the next useful step.
