Participant
August 20, 2018
Question

Problems with docx Word import

I've experienced problems with text disappearing when importing Word docx files into InDesign CS5.5 for PC, and I think the problem also arises in other versions. The solution seems to be to use the older Word doc format.

I'll illustrate with a short piece of text in two paragraphs (see screenshot). I imported the text in docx format, then copied and pasted the same text. Then I saved the docx file in the older doc format and imported that, which gave a different (better) result. Pasting from the doc file gave the same result as pasting from the docx file.

There are two problems. First, the hyphen sometimes drops out of 'so-called', which appears twice in parentheses in the first paragraph: on import, the hyphen drops out the first time but not the second; on pasting, it drops out the first time but leaves a space, and the second time it's fine. Second, a whole chunk of text drops out of the docx import. That doesn't happen with the doc import, though there are numerous missing glyphs. Those at least are easy to see and fix. The dropped hyphens and missing text aren't easy to see, of course.

The solution seems to be to forget about docx imports and use doc always.

    5 replies

    r-j-f (Author)
    Participant
    August 20, 2018

    Thank you Jongware and Bob.

    I always 'eyeball' a Word file and do the obvious tidy-up of things like superfluous tabs and spaces and so on. But I'd be interested to know what might be involved in a 'rigid clean-up' and what Bob is checking for, and how.

    Following on from jane-e's post, I copied the character at the point where the text drops out and pasted that into a search in Word, and indeed Word found it. But it isn't revealed when you tell Word to show all formatting/characters (although I have just noticed a difference in the hyphens I mentioned above, so that's a step forward).  

    jane-e
    Community Expert
    August 20, 2018

    It is a Word issue, and it has to do with the unknown characters in Word. I can select them and copy and paste them into InDesign, but I can't figure out what they are. Copying them into the Glyphs panel reveals nothing.

    The only direct formatting is the Arial italics on the space before (RV), so it's not a formatting issue.

    r-j-f (Author)
    Participant
    August 20, 2018

    Many thanks Uwe - that's a useful confirmation.

    Jane - files from two different authors, not linked in any way.

    r-j-f (Author)
    Participant
    August 20, 2018

    Thank you for your comments and suggestions. I'm not sure about linking the file - I tried uploading 'test2.docx' to Dropbox and the link is here (not sure whether this will work...): Dropbox - test2.docx

    I did investigate the point at which the text drops out. There's nothing on the hyphens, but at 'biography of...', if I change the fount to Times and look at the Glyphs panel, it says the character is GID 846, Unicode U+202A, 'left-to-right embedding', which is interesting. I hadn't tried this before, but if I use the arrow keys to traverse that point in the Word file, there is an invisible character (after the word space and before the capital R) at the point where the text drops out.
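The manual check described above (finding an invisible U+202A at the point where the text drops out) could in principle be automated before import. The following is a minimal Python sketch, not a tested workflow: the function names `find_suspect_chars` and `docx_text` and the `SUSPECT` set are hypothetical, the list of code points is an illustrative selection rather than a complete one, and it assumes the offending characters are stored as literal text in `word/document.xml`. Word may store some special hyphens as markup rather than as characters, in which case this scan would not see them.

```python
import unicodedata
import xml.etree.ElementTree as ET
import zipfile

# Illustrative selection of invisible or easily overlooked characters.
# This set is an assumption for the sketch, not an exhaustive list.
SUSPECT = {
    "\u00ad",  # soft hyphen
    "\u2011",  # non-breaking hyphen
    "\u200b",  # zero width space
    "\u200e", "\u200f",                                # directional marks
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # embeddings/overrides
    "\u2066", "\u2067", "\u2068", "\u2069",            # directional isolates
}


def find_suspect_chars(text):
    """Return (offset, code point, Unicode name, context) for each suspect character."""
    hits = []
    for i, ch in enumerate(text):
        if ch in SUSPECT:
            name = unicodedata.name(ch, "UNKNOWN")
            context = text[max(0, i - 15): i + 15]
            hits.append((i, f"U+{ord(ch):04X}", name, context))
    return hits


def docx_text(path_or_file):
    """Crude text extraction: concatenate all text nodes in word/document.xml.

    Paragraph breaks are lost, so offsets refer to the concatenated text only.
    """
    with zipfile.ZipFile(path_or_file) as z:
        root = ET.fromstring(z.read("word/document.xml"))
    return "".join(root.itertext())


if __name__ == "__main__":
    import sys

    # Usage: python scan_docx.py test2.docx
    for offset, code, name, context in find_suspect_chars(docx_text(sys.argv[1])):
        # repr() makes the invisible character show up as an escape sequence.
        print(offset, code, name, repr(context))
```

Run against a file before placing it, a report like this would show where to look in Word, which addresses the point that one would rather not investigate every file by hand.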

    The point is though that in the normal course of events you wouldn't want to be investigating the file before getting on with the job of importing it. These and similar problems recently cropped up on two completely separate jobs, which is what prompted me to post.

    Thanks again.

    Community Expert
    August 20, 2018

    Hi,

    I just tested the docx file you uploaded by placing it, and the result was the trimmed text.

    InDesign CC 2018.1 version 13.1.0.76 on Windows 10 (1803).

    From my German InDesign:

    Result:

    Regards,
    Uwe

    AnneMarie Concepcion
    Community Expert
    August 20, 2018

    Maybe it's an InDesign CS5.5 problem: because it's so old, its Word import filters are also old and have problems with .docx.

    Do you have a friend with CC 2018 (or any recent vintage of ID)? They could test to verify. Or post a link to that snippet of .docx file and we can test for you.

    AM

    BobLevine
    Community Expert
    August 20, 2018

    Looks to me like the Word files are a mess. I've never had an issue using DOCX, and if there are images involved, DOCX is a must to keep the original images intact.

    jane-e
    Community Expert
    August 20, 2018

    I agree with Bob. Examine the Word files at the point where the text drops. That's where your problem is.

    Select the first character that disappears both in Word and in the "paste" version. What can you tell us about it?