
Need Grep To Add Single White Space Between English and Chinese Character

Engaged ,
Apr 12, 2019

Windows 10 InDesign 2019 64 bit

I'm working on a 115 page product manual for a client that has mostly simplified Chinese throughout, but some English words as well. I'm noticing there aren't any spaces where a Chinese symbol ends and an English character begins and vice versa. My first thought was to use Grep, but I don't know how to get it to differentiate between the two languages, or if this can even be done.

Here is an example of what I'm seeing:

她的工作由米歇尔拉尔森在Core Dynamics延续。

Does anyone know how to write a GREP expression in Find/Change that would add a single space before and after the English words, so that it reads:

她的工作由米歇尔拉尔森在 Core Dynamics 延续。

By the way, I have a paragraph style applied to the Chinese symbols using the Noto SC typeface, but I used Grep to apply a character style to the English which uses Bryant typeface.


It's only an island if you look at it from the water.
Community Expert ,
Apr 12, 2019

find-change.png

SUMMARY

Find what: "([a-z,A-Z, ]+)"

Change to: " $1 "

Note: Do not include the " character in the Find/Change fields

FIND BLOCKS OF ENGLISH

This GREP expression should find simple blocks of English composed of just letters and spaces (there are many ways to do this kind of search).

[a-z,A-Z, ]+

The square brackets identify a range of characters to be searched for. In this case we are looking for all the lower case letters, all the upper case letters and any spaces. The "+" character repeats the expression "one or more times".

If there are other characters you wanted included in your search you could just add them inside the square brackets. The following example now includes the "-" character in the search.

[a-z,A-Z, ,-]+

FIND AND REPLACE BLOCKS OF ENGLISH

To be able to reference everything that is found put the expression inside brackets.

([a-z,A-Z, ]+)

With the brackets included in the "Find what:" field, we can now refer to that result in the "Change to:" field by using "$1". To add a space at either end of the block of English simply add a space before and after the "$1" in the "Change to:" field.
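For readers who want to try the idea outside InDesign, here is a minimal sketch using Python's `re` module as a stand-in for the Find/Change dialog. It is written without commas inside the square brackets, because a comma inside a character class is matched as a literal comma. The names `ENGLISH_BLOCK` and `pad_english` are made up for this illustration, and `\1` is Python's spelling of `$1`.

```python
import re

# One or more ASCII letters or spaces, captured as group 1.
ENGLISH_BLOCK = re.compile(r"([a-zA-Z ]+)")


def pad_english(text: str) -> str:
    """Add one space before and after each block of English letters."""
    # r" \1 " plays the role of " $1 " in the "Change to:" field.
    return ENGLISH_BLOCK.sub(r" \1 ", text)


sample = "她的工作由米歇尔拉尔森在Core Dynamics延续。"
print(pad_english(sample))
```

Because the space is part of the character class, "Core Dynamics" is matched as one block and the space between the two words is left alone. Running the replacement a second time would add more spaces around the block.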

Guide ,
Apr 13, 2019

Hmm! … Btw:

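You can't use a comma as a "separator" inside the square brackets! There it is just a literal comma, so the expression will also match commas.

The parentheses are unnecessary: $0 already refers to the whole match. Find: xxx Replace by: [space]$0[space]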

[That doesn't mean I agree with the way you define to treat the user's matter. It's just general comments on the GREP syntax used]

Best,

Michel, for FRIdNGE
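Both points can be checked with a small sketch in Python's `re` module (an assumption here: Python stands in for InDesign GREP, and `\g<0>` is Python's equivalent of `$0`). The sample string and the helper name `pad` are invented for the illustration.

```python
import re

with_commas = re.compile(r"[a-z,A-Z, ]+")   # the comma is a literal member of the class
without_commas = re.compile(r"[a-zA-Z ]+")  # letters and spaces only

text = "在Core,Dynamics延续"

# With the commas in the class, the comma is swallowed into the match.
print(with_commas.findall(text))
# Without them, the comma splits the text into two separate matches.
print(without_commas.findall(text))


def pad(s: str) -> str:
    # No capture group needed: \g<0> is the whole match, like $0 in InDesign.
    return without_commas.sub(r" \g<0> ", s)


print(pad("在Core延续"))
```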

Community Expert ,
Apr 13, 2019

Hi Michel,

I agree that there are different and more succinct ways to create these GREP elements. My thinking was to suggest a method that could be easily modified and expanded upon.

Cheers,

Michael

Mentor ,
Apr 13, 2019

Chris Panny  wrote

I used Grep to apply a character style to the English

Well, so you can base your search on the change between English and non-English characters, or use the already applied Char style. Let’s try the second approach. In 2 steps:

Find what:

(?<!.).

Change to:

[space]$0

Find Format: point to your *English* char style here.

This will insert a space before EN text.

Then:

Find what:

.(?!.)

Change to:

$0[space]

Find Format: point to your *English* char style.

This will insert a space after EN text.

Notice there must be a space character inserted before $0 in the first step, and after $0 in the second. That’s essential!

Both searches should be run only once. Every other run will keep inserting additional spaces.
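The effect of the two steps can be modelled outside InDesign with a rough sketch in Python. It assumes that Find Format makes each contiguous run of styled text its own search range, so the first character of a run has nothing before it and the last has nothing after it. The `(text, style)` tuples, the style names, and `pad_styled_runs` are all hypothetical stand-ins, not InDesign's scripting API.

```python
from typing import List, Tuple

# (text, character style name): a hypothetical stand-in for styled text.
Run = Tuple[str, str]


def pad_styled_runs(runs: List[Run], style: str = "English") -> str:
    """Add a space before and after every run that carries `style`."""
    out = []
    for text, run_style in runs:
        if run_style == style and text:
            out.append(" " + text + " ")  # step 1 (before) and step 2 (after)
        else:
            out.append(text)
    return "".join(out)


# The English style covers the whole phrase, including the inner space.
whole = [("在", "Chinese"), ("Core Dynamics", "English"), ("延续。", "Chinese")]
print(repr(pad_styled_runs(whole)))

# The English style was applied character by character and skipped the space.
split = [("在", "Chinese"), ("Core", "English"), (" ", "Chinese"),
         ("Dynamics", "English"), ("延续。", "Chinese")]
print(repr(pad_styled_runs(split)))
```

In the second case each word is its own styled run, so the existing space between the words ends up with an extra space on either side.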

Engaged ,
Apr 17, 2019

Thanks for everyone's replies! I've tried the suggestions, but they still aren't doing what I intended. I now see there are some more conditions that I should have included in the original post. When I first applied my Character style to just the English characters, I did so using the following character set:

[\da-zA-Z~2~d~r]

This sniffed out each English character and applied my character style. It did this on a character by character basis. In doing so, it skipped any white space between these words.

Now I need to describe a location that resides between an English and Chinese character and add a single white space.
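One way to describe such a location is with a pair of lookarounds that match the empty position between the two scripts. The sketch below is in Python's `re` module rather than InDesign, and it is not taken from the replies in this thread. It assumes the Chinese text falls in the CJK Unified Ideographs range U+4E00 to U+9FFF and treats only ASCII letters and digits as "English"; the extra symbols from the character set above could be added to `LATIN` if needed. The names `CJK`, `LATIN`, `BOUNDARY`, and `space_boundaries` are made up for this example.

```python
import re

CJK = r"\u4e00-\u9fff"   # assumed range: CJK Unified Ideographs
LATIN = r"0-9A-Za-z"     # ASCII digits and letters

# Zero-width match: a CJK character behind and a Latin one ahead, or the reverse.
BOUNDARY = re.compile(
    rf"(?<=[{CJK}])(?=[{LATIN}])"
    rf"|(?<=[{LATIN}])(?=[{CJK}])"
)


def space_boundaries(text: str) -> str:
    """Insert one space wherever Chinese and English characters touch."""
    return BOUNDARY.sub(" ", text)


sample = "她的工作由米歇尔拉尔森在Core Dynamics延续。"
print(space_boundaries(sample))
```

Because the match only succeeds where the two scripts touch directly, existing spaces between English words are untouched, and running the replacement again adds nothing.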


It's only an island if you look at it from the water.
Mentor ,
Apr 18, 2019

So your concern is white spaces between English words, because they get doubled?

I see two possible quick workarounds here, without investing too much time and effort into this.

1. After your [\da-zA-Z~2~d~r] string, add another one:

[\da-zA-Z~2~d~r]\K\s

which in plain English means "also pick up every white space which follows my 'English' character", and apply the same Eng Char style to it.

Now the search strings from my previous post will work as expected.

2. Or, after running the two searches from my previous post, just run another search that replaces the double spaces that appear between English words with single ones.
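Both workarounds can be sketched in Python's `re` module. Python's `re` has no `\K`, so the first pattern uses a lookbehind to the same effect, and the symbol metacharacters from the original set are left out for brevity. The helper names are invented for the illustration, and this version of the second workaround collapses every run of spaces, not only those between English words.

```python
import re

# Workaround 1: a whitespace character that directly follows an "English" character.
# (?<=...) stands in for \K, which Python's re module does not support.
SPACE_AFTER_ENGLISH = re.compile(r"(?<=[0-9A-Za-z])\s")

# Workaround 2: any run of two or more spaces.
MULTIPLE_SPACES = re.compile(r" {2,}")


def spaces_to_restyle(text: str) -> list:
    """Positions of the spaces that should also receive the English style."""
    return [m.start() for m in SPACE_AFTER_ENGLISH.finditer(text)]


def collapse_spaces(text: str) -> str:
    """Replace each run of multiple spaces with a single space."""
    return MULTIPLE_SPACES.sub(" ", text)


print(spaces_to_restyle("在Core Dynamics延续"))
print(collapse_spaces("在 Core   Dynamics 延续。"))
```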
