
GREP to subscript chemical compounds

Community Beginner, Apr 30, 2024

Good afternoon,

 

I am looking for a simple grep script to find chemical compounds without subscripts and change them to subscripts.

 

They may appear with or without parentheses, e.g. (C2H4).

 

Many thanks in advance.

TOPICS
How to , Scripting
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Community Expert, Apr 30, 2024

Use a GREP Style and this expression:

 

 

(?<=\u\l)\d+|(?<=\u)\d+

 

 

Screenshot 2024-04-30 at 4.30.04 PM.png 

 

For the nerds: I couldn’t get the OR (|) to work inside a single positive lookbehind (PLB), hence the two separate lookbehinds.
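
For anyone who wants to try the logic outside InDesign, here is a rough Python sketch of the same two-lookbehind idea. It is only an approximation: [A-Z] and [a-z] stand in for InDesign's \u and \l, and the function name and sample strings are made up for illustration.

```python
import re

# Assumption: [A-Z] and [a-z] approximate InDesign's \u and \l.
# Two separate fixed-width lookbehinds, joined by "|" on the outside.
PATTERN = re.compile(r"(?<=[A-Z][a-z])\d+|(?<=[A-Z])\d+")


def find_subscript_digits(text):
    """Return (position, digits) for every digit run that follows a letter pair or capital."""
    return [(m.start(), m.group()) for m in PATTERN.finditer(text)]


print(find_subscript_digits("C2H4 and Na2SO4"))
# [(1, '2'), (3, '4'), (11, '2'), (14, '4')]
```

Python's re module also needs each lookbehind to have a fixed width, so the alternation sits outside the lookbehinds here as well.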

LEGEND, Apr 30, 2024

@Scott Falkner

 

How about:

 

(?<=\u\l?)\d+

 

I'm on my phone so I can't check, but "?" means "zero or one", so it should work?

You are right: it looks like "?" can't be used inside a positive lookbehind either.
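
Python's re module happens to have a similar restriction, which makes the point easy to illustrate. This is only an analogy, since InDesign uses its own GREP engine. The helper name compiles is hypothetical, and [A-Z] and [a-z] stand in for \u and \l.

```python
import re


def compiles(pattern):
    """Return True if Python's re module accepts the pattern, False otherwise."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# "?" makes the lookbehind one OR two characters wide: rejected by Python's re.
print(compiles(r"(?<=[A-Z][a-z]?)\d+"))               # False

# Two fixed-width lookbehinds joined outside: accepted.
print(compiles(r"(?<=[A-Z][a-z])\d+|(?<=[A-Z])\d+"))  # True
```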

 

Community Expert, Apr 30, 2024

Just to add to the other suggestions (I tried them and they didn't quite work for me, for some reason):

 

(?<=(\l|\u))\d+

 

EugeneTyson_0-1714546409915.png

 

 

The caveat is that it will also find other things that you might not want found, like C1 and C2 in the example.
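
To make the caveat concrete, here is a small Python sketch of the same idea. It assumes [a-zA-Z] as a stand-in for (\l|\u); the sample sentence and the function name are invented for illustration.

```python
import re

# Assumption: [a-zA-Z] approximates InDesign's (\l|\u).
PATTERN = re.compile(r"(?<=[a-zA-Z])\d+")


def matches_with_context(text):
    """Return each matched digit run together with the letter just before it."""
    return [text[m.start() - 1:m.end()] for m in PATTERN.finditer(text)]


print(matches_with_context("Cells C1 and C2 hold H2O"))
# ['C1', 'C2', 'H2']
```

The cell references C1 and C2 are matched exactly like the 2 in H2O, so the pattern alone cannot tell them apart.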

 

 

 

Community Expert, May 01, 2024

A typographic tip: define the subscript as an OpenType subscript (in the Character Style) if the font has one available. This will look much nicer.

Community Expert, May 01, 2024

The most comprehensive approach I've seen was written up at CreativePro.com. It uses GREP styles to do this automatically, but the best GREP style is in the comments section, by Laurent Tournier: https://creativepro.com/auto-format-superscript-and-subscript-numbers-using-grep-styles/

 

If the answer wasn't in my post, perhaps it might be on my blog at colecandoo!
Guide, May 01, 2024

@ronnag74426794 

 

Our colleague Laurent Tournier worked on this GREP fifteen years ago 😉

I guess this compact form should sum it up:

 

 

((?<=Uu[bhopqst])|(?<=A[cglmrstu]|B[aehikr]|C[adeflmnorsu]|D[bsy]|E[rsu]|F[emr]|G[ade]|H[efgos]|I[nr]|Kr|L[airu]|M[dgnot]|N[abdeiop]|Os|P[abdmortu]|R[abefghnu]|S[bcegimnr]|T[abcehilm]|Xe|Yb|Z[nr])|(?<=[BCFHIKNOPSUVWY]))[1-9]\d{0,1}

 

 

(Edit: if you need more than two digits, change the ending \d{0,1} into \d{0,2} or more. I don't know whether it's chemically relevant, though.)
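
For testing outside InDesign, the pattern above can be tried in Python more or less as is, since it uses no \u or \l. The sketch below is only an illustration: it swaps the character style for Unicode subscript digits (which is not what a GREP style does), and the names ELEMENT_DIGITS, SUBSCRIPTS and to_subscript are hypothetical.

```python
import re

# The expression from the post, split across lines for readability only.
ELEMENT_DIGITS = re.compile(
    r"((?<=Uu[bhopqst])"
    r"|(?<=A[cglmrstu]|B[aehikr]|C[adeflmnorsu]|D[bsy]|E[rsu]|F[emr]|G[ade]"
    r"|H[efgos]|I[nr]|Kr|L[airu]|M[dgnot]|N[abdeiop]|Os|P[abdmortu]"
    r"|R[abefghnu]|S[bcegimnr]|T[abcehilm]|Xe|Yb|Z[nr])"
    r"|(?<=[BCFHIKNOPSUVWY]))"
    r"[1-9]\d{0,1}"
)

# Assumption: Unicode subscript digits stand in for a subscript character style.
SUBSCRIPTS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")


def to_subscript(text):
    """Replace digits that follow an element symbol with Unicode subscript digits."""
    return ELEMENT_DIGITS.sub(lambda m: m.group().translate(SUBSCRIPTS), text)


print(to_subscript("C2H4"))      # C₂H₄
print(to_subscript("Na2SO4"))    # Na₂SO₄
print(to_subscript("A4 paper"))  # A4 paper
```

Because the lookbehinds only accept element symbols, "A4" is left alone, though a label such as C1 would still match because C is itself an element symbol.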

 

Best,

Marc
