Framescript: Retrieving text from a broken xref

Participant, Jan 26, 2017

Hello fellows,

I'd like to write a script that finds broken external xrefs and attempts to restore them. In my case, the xrefs break because the xref markers in the source files are often updated by the system that generates those files. I'd like to restore the xrefs by retrieving the xref string and looking for it in the source files. But here is the problem: once an xref is broken, XRefSrcText can no longer be retrieved properly; the xref string no longer appears in it. I used the following code to test that:

Set vCurrentDoc = ActiveDoc;
Set vXRef = vCurrentDoc.FirstXRefInDoc;
Loop While(vXRef)
  If vXRef.XRefIsUnresolved
    Write Console vXRef.XRefSrcText;
  EndIf
  Set vXRef = vXRef.NextXRefInDoc;
EndLoop

So, the XRefSrcText of an unresolved xref appears as follows:

25117: TableTitleTable Entry: ;2511718;2511719

While the XRefSrcText of a live xref appears as follows:

41369: TableTitle: Table 329: Item Numbering

What I would need to extract from here is the string "Item Numbering" and then find it in the xref source file to create a new link.

My question is: how can one retrieve the text string from a broken xref? One way I can think of is converting the xrefs to text and copying the relevant part into a variable using regex. Any other ideas?

Thank you!

TOPICS
Scripting

Mentor, Jan 27, 2017

Hi rombanks,

Interesting question. If you have to resort to conversion to text, I would copy xrefs into another "scratch pad" document, then convert to text. Seems messy to do it in the original doc.

What is that first number in the XRefSrcText string, like 25117? The docs don't explain what it is, that I can find. Does that point to anything?

Russ

Mentor, Jan 27, 2017

One other thought... I would expect the XRefSrcText string to stay static, until an update attempt. Do you have these files set to automatically update xrefs when they are opened? I'm a little fuzzy in this area, so maybe this is not useful.

Community Expert, Jan 27, 2017

The XRefSrcText string does stay the same, whether the cross-reference is resolved or not. The cross-reference is unresolved because the Cross-Ref marker with the matching text is missing (or the target file is missing or has been renamed). What Roman is trying to do (I think) is grab the actual cross-reference text (for example, See "Installation" on page 1) and match on that. Sometimes that text will be missing for an unresolved cross-reference.

As I read Roman's original post, I think he may be misunderstanding the XRefSrcText property. The value of this property needs to match the value of the Cross-Ref marker that it is pointing to. As I said above, this property doesn't change if a cross-reference becomes unresolved.

It seems to me that the problem is with the process that creates the source files. One solution may be to have this process also create Cross-Ref markers in the source documents that will have fixed text. You will have to make sure that each Cross-Ref marker's text is unique, but then the cross-references would still be resolved even after new source files are generated.

Participant, Jan 28, 2017

Rick, Russ,

Hi!

I appreciate your response!

After running additional tests, I saw that even some resolved xrefs don't include their corresponding xref text string in their XRefSrcText -- Rick is right.

I also agree with Rick that the generated source files should have included static xref markers. Unfortunately, that's not the case, and this is not going to change. That's the main source of the problem, so I'll have to create a workaround. As far as I can see, the only way to do so is to convert each unresolved xref to text, extract the target xref text (using regex), delete the rest, search for a matching string in the xref source file, and create a new xref if possible. For some reason, I can't find a FrameScript command that converts an xref to plain text. Please advise!

Thanks,

Roman
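
The lookup step of the workaround described above (take the text recovered from a flattened xref and search the source file for a matching paragraph) can be sketched in plain JavaScript, outside FrameMaker. Everything here is hypothetical: `sourceParagraphs` stands in for data a script would collect from the source document, the second entry is invented for illustration, and `findMarkerForText` is an illustrative helper, not a FrameScript or ExtendScript API.

```javascript
// Hypothetical data: paragraphs in the xref source file, each paired with the
// text of the Cross-Ref marker it currently carries.
var sourceParagraphs = [
  { paraText: "Item Numbering", markerText: "41369: TableTitle: Table 329: Item Numbering" },
  { paraText: "Catalog Overview", markerText: "52001: Heading1: Catalog Overview" } // invented entry
];

// Collapse whitespace and ignore case so minor differences do not block a match.
function normalize(text) {
  return text.replace(/\s+/g, " ").replace(/^ | $/g, "").toLowerCase();
}

// Return the marker text of the single paragraph whose text matches, or null
// when there is no match or more than one candidate.
function findMarkerForText(extractedText, paragraphs) {
  var wanted = normalize(extractedText);
  var found = null;
  var count = 0;
  for (var i = 0; i < paragraphs.length; i++) {
    if (normalize(paragraphs[i].paraText) === wanted) {
      found = paragraphs[i].markerText;
      count++;
    }
  }
  return count === 1 ? found : null;
}

console.log(findMarkerForText("  item   numbering ", sourceParagraphs));
console.log(findMarkerForText("Missing Title", sourceParagraphs));
```

Returning null for ambiguous matches is a deliberate choice in this sketch: relinking an xref to the wrong paragraph is harder to spot later than an xref that stays visibly unresolved.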

Community Expert, Jan 28, 2017

Hi Roman. Try this.

// Assuming oDoc is your document object and oXRef is the cross-reference you want to "flatten".

// Create a temporary Cross-Reference format.
New XRefFmt Name('~temp~') DocObject(oDoc) NewVar(oXRefFmt);
Set oXRefFmt.Fmt = oXRef.XRefFmt.Fmt;

// Assign the temporary Cross-Reference format to the Cross-Reference.
Set oXRef.XRefFmt = oXRefFmt;

// Now delete the temporary format, which should "flatten" the Cross-Reference.
Delete Object(oXRefFmt);

I haven't tested the code above, but I have used this technique successfully. Note that you can use this same method to flatten user variables. Please let me know if you have any questions or comments. -Rick

Participant, Jan 30, 2017

Hi Rick,

I appreciate your response and the code sample! It indeed converts the xref to plain text.

However, the current problem is that the code doesn't select the xref first. I need the xref to be selected to extract the relevant text string once the xref becomes plain text and then delete the xref text.

So, I tried to select the xref and retrieve its text, as follows:

Set oDoc = ActiveDoc;
Set oXRef = oDoc.FirstXRefInDoc;
New TextRange NewVar(vXrefRange) Object(oXRef) Offset(0) Offset(oXRef.Size);
Set oDoc.TextSelection = vXrefRange;
Display vXrefRange.Text;

So, the xref is selected, but its text is not retrieved/displayed for some reason.

The text is only displayed if I write "Display oDoc.TextSelection.Text;" instead.

What am I missing?

Thank you!

Community Expert, Jan 30, 2017 (Correct answer)

You don't need to make a text range object; an XRef already has one.

Display oXRef.TextRange.Text;

Rick

Participant, Jan 30, 2017

Hi Rick,

Thank you for your valuable input!

I've just noticed that too, while checking the Reference Guide.

Thanks,

Roman

Participant, Jan 30, 2017

I wonder if it is possible to access the xref building blocks. For example, is there a way to retrieve the <$paratext> text of the xref?

As per the Reference Guide, there is only a way to retrieve the building blocks as a string.

Thanks,

Roman

Advocate, Jan 30, 2017

Hello Roman,

The XRefSrcText property contains the paragraph text, preceded by identification info. When referencing a structured element, this identification is the element ID plus element tag name. In unstructured FM, it is the paragraph ID plus the paragraph tag name (the name of the paragraph format).

To get the referenced paratext, search for the last occurrence of the : character in XRefSrcText and use whatever follows it. Note that this text is only replaced when the XRef is resolved. Unresolved cross-refs will not show any paratext.

PS. If you are working in structured FM, you should really get the XRef Wizard from Russ Ward - its $35 price tag is dirt cheap for what it does.

Good luck

Jang
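
The "text after the last colon" rule above can be tried on the two XRefSrcText values quoted in the original post. This is a minimal JavaScript sketch that assumes the "ID: Tag: text" layout seen in those examples; `parseXRefSrcText` is a hypothetical helper name, and values built from custom marker text may not follow this layout at all.

```javascript
// Split an XRefSrcText value into its leading ID and the text after the last colon.
// Assumes the "ID: Tag: text" layout shown in the thread's examples.
function parseXRefSrcText(srcText) {
  var firstColon = srcText.indexOf(":");
  var lastColon = srcText.lastIndexOf(":");
  if (firstColon === -1) {
    return null; // not in the expected layout (for example, a custom marker)
  }
  return {
    id: srcText.substring(0, firstColon),
    // Everything after the last colon, with surrounding whitespace removed.
    paraText: srcText.substring(lastColon + 1).replace(/^\s+|\s+$/g, "")
  };
}

var live = parseXRefSrcText("41369: TableTitle: Table 329: Item Numbering");
var broken = parseXRefSrcText("25117: TableTitleTable Entry: ;2511718;2511719");

console.log(live.paraText);   // Item Numbering
console.log(broken.paraText); // ;2511718;2511719
```

The second result shows why this rule did not help with the unresolved example: what follows the last colon there is a list of IDs, not paragraph text. The rule would also cut off a title that itself contains a colon.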

Participant, Jan 30, 2017

Hi Jang,

Thank you for joining the discussion!

I am using unstructured FM.

In my case, the XRefSrcText doesn't contain the paratext, even if an xref is resolved. That's because the xref source text has a custom xref marker.

I wondered if the xref format's <$paratext> element can be somehow accessed to retrieve its value. Theoretically, this should have been possible, but based on what I see in the Framescript Ref. Guide, it's not.

Regards,

Roman

Advocate, Jan 30, 2017

Hi Roman,

If the XRefSrcText does not contain the para text, I really don't know what your system has been up to. Are you sure they are FrameMaker cross-references and not hyperlinks?

If they are true XRefs (or even if they are not - if they turn out to be hypertext links), and there is some text visible in them in FM, you can always retrieve that text. Assuming you have your active document in variable oDoc, try this at home:

var oXRef = oDoc.FirstXRefInDoc;
var oaTextItems = oXRef.GetText( Constants.FTI_String );
var sText = "";
var i;
for( i = 0; i < oaTextItems.length; i++ )
{
     // Append the string data of each text item in turn.
     sText += oaTextItems[i].sdata;
}
alert( sText );

Then you will have to figure out how to pull the text without the page number etc. from that string and find that text in the target document. That is up to you to code.

Ciao

Jang
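
As one possible way to pull the title out of the displayed text without the page number, here is a minimal JavaScript sketch. It assumes the cross-reference format wraps the title in quotation marks, as in the See "Installation" on page 1 example mentioned earlier in the thread; other formats would need a different pattern. `extractQuotedTitle` is a hypothetical helper name.

```javascript
// Pull the quoted title out of displayed cross-reference text.
// Assumes a format like: See "Installation" on page 1
// Accepts straight quotes or curly quotes (U+201C / U+201D).
function extractQuotedTitle(xrefText) {
  var match = /["\u201C]([^"\u201C\u201D]+)["\u201D]/.exec(xrefText);
  return match ? match[1] : null;
}

console.log(extractQuotedTitle('See "Installation" on page 1')); // Installation
console.log(extractQuotedTitle("See page 1"));                   // null
```

Inside FrameMaker, `sText` from the loop above would be the input, and `alert` would replace `console.log`.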

Participant, Jan 30, 2017

Hi Jang,

Thanks for your input!

These are paragraph xrefs. I create xrefs manually. With Framescript, you can easily retrieve the xref text with oXRef.TextRange.Text;.

The tricky part comes next. It seems that the only way to retrieve the relevant string from the xref text range is by using regex. I am not sure whether FrameScript v5.2, the version I have, supports regex; I haven't found any information on that anywhere. The Find String command doesn't seem to include a regex option.

What are my options? I deliberately tried to avoid getting into ExtendScript, as it's not well documented, but it seems there is no other choice.

Thanks,

Roman

Community Expert, Jan 30, 2017

You can get regular expression support in FrameScript 5.2 by using the VBScript.RegExp COM object built into most versions of Windows. Here is how you can use a regular expression to test the selected text:

Set sText = TextSelection.Text;
Set oRegex = InitializeXObject{'VBScript.RegExp'};
If oRegex = 0
  Write Console 'VBScript.RegExp object could not be initialized.';
  LeaveSub;
EndIf

// This is the regular expression pattern.
// These use the JavaScript RegExp syntax.
Set oRegex.Pattern = '[\xd0\xd1\x2d]';

// These flags are optional; in VBScript.RegExp both default to False.
Set oRegex.Global = True;
Set oRegex.IgnoreCase = True;

Display oRegex.Test{sText};
If oRegex.Test{sText} = 1
  Set oMatches = oRegex.Execute{sText};
  Loop While(i < oMatches.Count) LoopVar(i) Init(0) Incr(1)
    // Assumption: the post never assigns oMatch; indexing the collection is the likely intent.
    Set oMatch = oMatches[i];
    Display oMatch.Value; // Match value.
    Display oMatch.FirstIndex; // Offset from beginning of string.
  EndLoop
EndIf
Delete Object(oRegex);

Function InitializeXObject sProgId
//
// Initializes and returns an EActiveXObject if it exists, or 0 (zero) if
// it doesn't exist.
Local Result(0);
// Create the object.
New EActiveXObject NewVar(Result) ProgId(sProgId);
If Result.ErrorCode <> 0
  Write Console sProgId + ' could not be initialized.';
  Write Console 'ErrorCode: '+Result.ErrorCode;
  Set Result = 0;
EndIf
//
EndFunc //--------------------------------------------------------------------

There is a way to get matching groups, and you can do replacements. For more details, google VBScript regular expressions. Or ask me here.

--Rick

Community Expert, Jan 30, 2017

Here is a link to a chapter from my out-of-print FrameScript book:

http://www.rickquatro.com/resources/RegularExpressions.pdf

Participant, Jan 31, 2017

Hi Rick,

This is definitely a very interesting regex workaround. I greatly appreciate your guidance!

I need to do my homework first -- look into the regex syntax.

Thanks,

Roman

Participant, Feb 13, 2017

Hello fellows,

I took some time to play with regex and think how to implement the xref fixing procedure.

While testing my regex, I encountered the following problem. Let's say I have a string <Text Goes Here> that is the 1st string on the line and I'd like to match it using regex. I guess there are a couple of options syntax-wise to do it. One is ^<([^<>]*?)>.

So, when I run this regex query in FM, it matches <Text Goes Here>. But when I add it to my code, the script fails to match the string. I can't figure out why.

When I set oRegex.Pattern = '^<([^<>]*?)>'; the string is not found.

My additional question is how to match Text Goes Here in <Text Goes Here> (without the angle brackets). AFAIK, this can be done using the lookaround technique but I didn't manage to create a matching query so far. Theoretically, it should be (?<=(<))[\w\s+])(?=(>)) but it doesn't match anything.

Thank you for your input in advance!

Community Expert, Feb 13, 2017

The VBScript RegExp syntax is based on JavaScript regular expressions, and lookbehind is not supported.

I might need to see your code to see why the pattern doesn't match in your script.
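
Because lookbehind is unavailable, one common alternative is a capture group: match the whole `<...>` run and read only the captured part. The sketch below shows the idea in JavaScript, whose syntax VBScript.RegExp follows. In VBScript.RegExp, captured groups are exposed through each match's SubMatches collection; how FrameScript addresses that collection is not shown in this thread, so treat that part as something to verify. `textInsideBrackets` is a hypothetical helper name.

```javascript
// Capture the text between leading angle brackets without using lookbehind.
var bracketPattern = /^<([^<>]*)>/;

function textInsideBrackets(line) {
  var match = bracketPattern.exec(line);
  // match[0] is the full "<...>" run; match[1] is only the text inside it.
  return match ? match[1] : null;
}

console.log(textInsideBrackets("<Text Goes Here> on page 12")); // Text Goes Here
console.log(textInsideBrackets("no brackets here"));            // null
```

The brackets still take part in the match, so the pattern stays anchored to the start of the line, but they are left out of the captured value.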

Participant, Feb 13, 2017

Hi Rick,

Thank you for your response!

Lack of lookaround support is bad news. What are the alternatives in this case?

Here is the code part that deals with the relevant string match and extraction, before the new xref is created.

Set oDoc = ActiveDoc;
Set oXRef = oDoc.FirstXRefInDoc;

New EActiveXObject ProgId('VBScript.RegExp') NewVar(oRegex);
If oRegex.ErrorCode <> 0
  MsgBox 'The VBScript.RegExp object could not be initialized.';
  LeaveSub;
EndIf

Loop While(oXRef)
    If oXRef.XRefIsUnresolved
        Set vXrefTxt = oXRef.TextRange.Text;
        Set vXrefFmt = oXRef.XRefFmt.Name;
        Set vXrefSrc = oXref.XRefFile;
        NEW TextLoc NewVar(targetLoc) Object(oXref) Offset(oXref.Begin);

        If vXrefFmt = (("Table Entry and Table") | ("Table Entry") | ("Cell and Table"))
            Set oRegex.Pattern = '^<([^<>]*?)>';
        ElseIf vXrefFmt = (("Catalog Name") | ("Catalog Name, Number, Page (short)"))
            Set oRegex.Pattern = '^.*(?=(\s)+(Catalog))';
        EndIf

        Set oRegex.Global = False;
        Set oMatches = oRegex.Execute{vXrefTxt};

        Loop While(i < oMatches.Count) LoopVar(i) Init(0) Incr(1)
          Set TrgtValue = oMatches[i].Value;
          //Display oMatches[i].Value;
        EndLoop

        If TrgtValue = ''
         MsgBox 'No matches found.';
         LEAVELOOP;
        EndIf
...

Participant, Feb 14, 2017

Guys,

Do you have any suggestions/remarks?

Thank you!

Participant, Feb 14, 2017

After running some debug tests, I see that the offending code is the code that checks the xref Pgf format. Without it, the regexp works well.

I changed the code to:

If vXrefFmt == (('Table Entry and Table') | ('Table Entry') | ('Cell and Table'))
    Set oRegex.Pattern = '^<([^<>]*?)>';
ElseIf vXrefFmt == (('Catalog Name') | ('Catalog Name, Number, Page (short)'))
    Set oRegex.Pattern = '^.*(?=(\s)+(Catalog))';
EndIf

But it still causes the regexp not to be matched. What's wrong with the IF part here?

Thanks!
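
The thread does not show the syntax fix that was eventually made. One way to avoid a compound comparison altogether is to look the format name up in a table that maps each name to its pattern, so every name is tested on its own. The JavaScript sketch below illustrates that logic with the format names and patterns quoted above; `patternsByFormat` and `patternFor` are hypothetical names, and this is an illustration of the approach rather than FrameScript code.

```javascript
// Map each cross-reference format name to the pattern that extracts its title.
// Format names and patterns are the ones quoted in the post above.
var bracketSource = "^<([^<>]*?)>";
var catalogSource = "^.*(?=(\\s)+(Catalog))";

var patternsByFormat = {
  "Table Entry and Table": bracketSource,
  "Table Entry": bracketSource,
  "Cell and Table": bracketSource,
  "Catalog Name": catalogSource,
  "Catalog Name, Number, Page (short)": catalogSource
};

// Return the pattern for a format name, or null for formats the script should skip.
function patternFor(formatName) {
  return Object.prototype.hasOwnProperty.call(patternsByFormat, formatName)
    ? patternsByFormat[formatName]
    : null;
}

console.log(patternFor("Table Entry"));
console.log(patternFor("Some Other Format"));
```

Returning null for unknown formats also gives the calling loop a clear signal to skip the xref instead of running a pattern left over from a previous iteration.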

Participant, Feb 15, 2017

Hello again,

I've resolved the issue above, it was just a matter of syntax.

The next hurdle I have is pretty strange. I can retrieve the target xref text's marker string and the target filename and store them in their variables.

But, when I try to create a new xref in place of the broken one and specify the new xref's XRefSrcText and XRefFile, I get the following error:

Run Error (Read Only Variable on command SET at line 86 and line 87...

Here is the old xref deletion and new xref creation code:

Displaying = False;
//Delete the old Xref and create a new one in target file;
SET ActiveDoc = objTarget;
NEW TextLoc NewVar(targetLoc) Object(oXref) Offset(oXref.Begin.Offset);
//Display targetLoc;
SET tempXRef = oXRef;
SET oXRef = tempXRef.NextXRefInDoc;
Delete Object (tempXRef);
NEW XRef Format(vXrefFmt) TextLoc(targetLoc) NewVar(vxrefVar);
[Line 86] SET vxrefVar.XRefSrcText = vTrgtMarkerTxt;
[Line 87] SET vxrefVar.XRefFile = vXrefSrc;

It is important to note that the vTrgtMarkerTxt and vXrefSrc values are the right ones (checked by displaying them).

Please, advise!

Thanks in advance!

Community Expert, Feb 16, 2017

Hi Roman,

A few things of note here:

1) Test if the new Cross-Reference exists before trying to set its properties:

If vxrefVar
  // Set the properties here (lines 86, 87)
Else
  Write Console 'Cross-reference was not inserted.';
EndIf

2) There is no reason to explicitly create a TextLoc for the new cross-reference. You can simply do this:

Set targetLoc = oXRef.TextRange.Begin;

3) It is not necessary to delete the old cross-reference and insert a new one. Simply change the properties on the old cross-reference:

Set oXRef.XRefSrcText = vTrgtMarkerTxt;
Set oXRef.XRefFile = vXrefSrc;

-Rick
