# replaced by %23 when creating a web link

Report · May 20, 2020

I need help with this one. I have been setting up documents for my organization with a URL fragment for 4 years now. (A fragment is an internal page reference, sometimes called a named anchor. It usually appears at the end of a URL and begins with a hash (#) character followed by an identifier. It refers to a section within a web page. In HTML documents, the browser looks for an element whose id attribute, or an anchor tag whose name attribute, matches the fragment.)

I use #page=[page number here] so that the URL goes to a specific page in the PDF when opened in a browser.

Today, it isn't working! I type in the #page= and Acrobat automatically changes the URL text that I am typing to %23page=

Simply: If I enter the Link with # in a browser, it works perfectly. The page and anchor exist.

If I enter the Link with %23 in any browser, I get an error 404
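The difference between the two forms can be seen by parsing them. Below is a minimal sketch using the standard `URL` class available in browsers and Node.js. The host `example.org` and the file name are placeholders, not the actual links from this thread:

```javascript
// Placeholder URLs: example.org and manual.pdf are assumptions for illustration.
const withHash = new URL("https://example.org/docs/manual.pdf#page=12");
const withPercent = new URL("https://example.org/docs/manual.pdf%23page=12");

// With a literal "#", the fragment is split off and is not part of the
// path that gets requested from the server.
console.log(withHash.pathname); // "/docs/manual.pdf"
console.log(withHash.hash);     // "#page=12"

// With "%23", the text stays inside the path, so the server is asked for a
// resource literally named "manual.pdf#page=12", which normally does not exist.
console.log(withPercent.pathname); // "/docs/manual.pdf%23page=12"
console.log(withPercent.hash);     // ""
```

That is consistent with the 404: the server is being asked for a different path, and there is no fragment left for the browser to act on.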

Why is Acrobat replacing my URL text?

Could it be a preference setting I'm not aware of?

Any help would be GREATLY appreciated.

Report · May 20, 2020

Have you noticed if this happened after an update took place in your machine?

Report · May 20, 2020

I'm not sure. My Adobe CC subscription auto-updates Acrobat, and since this happens in the background, I don't know whether it has updated recently. Since my first post, I uninstalled Acrobat and then reinstalled it: no change.

Report · May 21, 2020

I was wondering whether this is happening because you're using more than one opening parameter in a single URL and the page parameter is not the first one in that line. If that is the case, have you tried rearranging the parameters so that the page parameter comes first?

Also, I was wondering if this could be related to having spaces between the opening parameters, or to not using the correct case-sensitive spelling of the commands.

There should be no spaces: #page=20 is correct and #page= 20 is incorrect. The exception is separating page numbers with a comma and a space.
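To rule out stray spaces when producing many links, the fragment can be assembled by a small script instead of being typed by hand. This is only a sketch: `withOpenParams` is a hypothetical helper, the URL is a placeholder, and joining several parameters with `&` follows the convention described in the open parameters guide.

```javascript
// Hypothetical helper: appends PDF open parameters as a URL fragment.
// Leading and trailing whitespace is trimmed from each value, so a
// fragment like "#page= 20" cannot be produced by accident.
function withOpenParams(baseUrl, params) {
  const fragment = Object.entries(params)
    .map(([name, value]) => `${name}=${String(value).trim()}`)
    .join("&");
  return `${baseUrl}#${fragment}`;
}

// Placeholder document URL, for illustration only.
const base = "https://example.org/docs/manual.pdf";
console.log(withOpenParams(base, { page: 20 }));
console.log(withOpenParams(base, { page: " 20", zoom: 150 }));
```

The first call prints the base URL followed by `#page=20`, and the second prints it followed by `#page=20&zoom=150`.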

This is better explained (although very briefly) in the Parameters for Opening PDF Files guide: https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/pdf_open_parameters.pdf

Report · May 22, 2020

Hello ls_rbls: thanks for replying and your input.

The issue is not the URL structure that I am using. I have been using the same structure for 4 years now, with no previous issues. The automatic replacement happens after I type in the URL with the # sign in the (create) web link window and then hit OK.

Since my first post, I have uninstalled my subscription-based Acrobat DC completely (I used the Adobe cleanup app too) and installed Acrobat Pro 2017, and the problem went away. So I am deducing that this issue has something to do with the last update. I am wondering if there is a conflict between my Windows 7 platform and the latest Acrobat update Adobe pushes out with the subscription...?

I am back to work thanks to having the older version installed, but I'm not happy, because it feels like a band-aid solution.

Report · May 22, 2020

Hey alexandra,

Just a couple of days ago I came across this thread posted by clairecessford, which provides a very quick third-party solution:

https://community.adobe.com/t5/acrobat/ist-replaced-by-23-in-urls-gt-error-404/td-p/5811448

In my opinion this does not explain why the issue happens or what causes it, but it seems to have worked for other users.

Report · May 23, 2020

Hi ls_rbls ,

Yes, I saw this post too. But with the amount of web links I generate, there are 2 issues:

1. I would need a subscription to another app, because I would easily exceed the free limit of 1000

2. It would almost double my work time to do this: copy > paste > generate there, then copy > paste > generate back in Acrobat.

argh. So frustrating 😞

Report · May 23, 2020

This is not related to Windows 7. It happens with MS Windows 10 too.

And it does seem like a bug was introduced with the May 2020 update.

However, you can apply JavaScript manually to the link.

In "Link Properties", go to the "Actions" tab and, instead of selecting "Open a web link" from the Select Action dropdown menu, scroll down and select "Run a JavaScript".

Enter your URL with its opening parameter like this:

app.launchURL("https://www.adobe.com/#page=3");

You can test with the same URL that I am using in the example script above. It works.
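If there are many links to set up, the action text for each one could be generated outside Acrobat and then pasted in. The sketch below runs in Node.js or a browser console, not inside Acrobat; `launchUrlAction` is a hypothetical helper and the second URL is a placeholder.

```javascript
// Hypothetical helper: returns the text to paste into a "Run a JavaScript" action.
// JSON.stringify adds the surrounding quotes and escapes any quotes or
// backslashes inside the URL, so the result is a valid string literal.
function launchUrlAction(url) {
  return `app.launchURL(${JSON.stringify(url)});`;
}

const urls = [
  "https://www.adobe.com/#page=3",
  "https://example.org/docs/manual.pdf#page=12", // placeholder URL
];

for (const url of urls) {
  console.log(launchUrlAction(url));
}
```

Each printed line has the same shape as the one-line script above, with the "#" left untouched.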

Report · May 24, 2020

Hi ls_rbls,

I plan on waiting for the next Acrobat upgrade (and/or a personal PC upgrade to Windows 10) and will try out your possible fix if the issue persists at that point.

Thanks for your reply: I am saving your JavaScript code just in case I need it then!

For now the Acrobat Pro 2017 version doesn't seem to have this glitch. So I'm good for now (keeping my fingers crossed).

Report · May 24, 2020

This is nothing to do with Unicode or spaces or JavaScript. Acrobat has always had major problems understanding the structure of a URI, and it makes an invalid assumption about reserved characters.

The RFC3986 standard defines which characters can and cannot be used in a URI, and also defines a set of "reserved" characters that have a special meaning (such as / : ? # ). Any character not on the permitted list, such as a space, must be percent-encoded (commonly called "URL-encoding"). The RFC is perfectly clear about the rules:

1. If a reserved character is being used for a reserved purpose (such as # to indicate an anchor name, or ? to show the start of a parameter list) then it must NEVER be percent-encoded. Browsers must process the character using its defined reserved function.

2. If a reserved character is being used for any other purpose (a parameter might want to define mystring=has#tagz) then it must ALWAYS be percent-encoded, so it must be written mystring=has%23tagz . Browsers must process the character as plain text.
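The two cases can be illustrated with the standard JavaScript encoding functions. This is a sketch of the distinction only, with `example.org` as a placeholder host; it says nothing about how Acrobat is implemented.

```javascript
// Reserved purpose: "#" introduces the fragment and must stay as it is.
// encodeURI leaves reserved characters alone and only escapes characters
// that may not appear in a URI at all, such as the space.
const link = encodeURI("https://example.org/docs/my file.pdf#page=3");
console.log(link); // https://example.org/docs/my%20file.pdf#page=3

// Any other purpose: a "#" that is data inside a parameter value must be escaped.
const value = encodeURIComponent("has#tagz");
const query = `https://example.org/search?mystring=${value}#results`;
console.log(query); // https://example.org/search?mystring=has%23tagz#results

// Parsing shows that each "#" was understood as intended.
const parsed = new URL(query);
console.log(parsed.searchParams.get("mystring")); // has#tagz
console.log(parsed.hash);                         // #results
```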

Some browsers are a little flexible when it comes to spaces in URLs, but they will always follow RFC3986 when they encounter a ? # or %

Acrobat always applies rule 2 and percent-encodes everything, no matter what. For the vast majority of URIs that is going to break things, and there is no way to tell Acrobat to stop messing about with the string. Adobe will argue it is playing safe by guaranteeing there will never be any "illegal" characters in a web link, but it's totally wrong to forcibly change the intention of user-entered data. It should leave the URI exactly as entered; if it's invalid that's the author's problem.

Report · May 24, 2020

Awesome!

thank you for clarifying.

Report · May 24, 2020

Hi Dave_Merchant ,

Yes, I understand this and have seen it in previous posts regarding this issue, and I don't want to disagree with your premise. However: why is this only happening now and not in the previous 4 years I have been using Acrobat, performing the same repetitive "create link > open a web page > Enter a URL for this link" action? Why now?

It appears something changed either:

a) within the latest Acrobat update itself; or

b) in the preferences, which something (that I did?) set differently, enabling the "Acrobat always applies rule 2 and percent-encodes everything, no matter what" behavior

Consider that it is not happening in the Acrobat Pro 2017 version that I had to revert to and am using now successfully.

If only Adobe would let the user just apply the URL as typed, and let the user worry about its validity before "it makes an invalid assumption about reserved characters". Alas, I digress.

Report · May 24, 2020

The way Acrobat "improves" your URIs keeps changing, it's pot luck which versions do or do not mess with the characters. It's been broken on and off since Acrobat 8, but I've never seen it mentioned in the change notes.

There is no config option in Acrobat that controls URI parsing, even in the secret PreferenceReference.

Report · May 27, 2020

Hey Alexandra, I will try to get to the bottom of this with a very long answer.

I have to thank Dave again for how specifically he documents his answers. Dave's reference to RFC3986 is spot on, and it led me to dig up more. Below are my suggestions.

IN ADDITION TO THE JAVASCRIPT OPTION THAT I POSTED EARLIER, BELOW IS THE MOST UP-TO-DATE SOLUTION THAT I WAS ABLE TO CONFIRM

****************** DO NOT USE THE WEBLINK PLUG-IN ***********************

************* UNTIL ADOBE CAN LOOK INTO THIS *****************

*********** AND FIX IT ************

Links work perfectly fine if you do the following:

1. Go to Edit > Preferences > General and check the tickbox "Create Links from URLs".

2. Right-click on the document and select "Edit Text & Images" from the context menu (or open the "Edit PDF" tool).

3. Select "Add Text" and type in the desired URL, with its opening parameters, in plain text.

NOTE 1: You can also copy the URL text string from a plain-text editor such as Notepad and paste it into the text box. (DO NOT copy from MS Word or Wordpad; they both convert the text string to a hyperlink automatically, which is what we're trying to avoid.) This method won't work when you copy a hyperlink that is already encoded as Text/HTML and paste it into the text box in Acrobat; it will create the same issue because it triggers the problematic weblink creation tool in Acrobat.

NOTE 2: When you're done adding your URLs to your document, SAVE and close it. Reopening your document in Acrobat will apply the conversion to hyperlinks automatically and, most importantly, without the incorrect %23 encoding.

HERE ARE MORE OBSERVATIONS FOR THE COMMUNITY THAT MAY BE INTERESTED IN LOOKING INTO THIS:

Copying a URL from a hyperlink that was generated with MS Word, Wordpad, and/or a web browser and pasting it directly into an open PDF document in Acrobat will trigger the incorrect percent-encoding to UTF-8.

Manually creating a web link using the "Add-Edit link" tool will also trigger the incorrect percent-encoding.

Both of these methods fail to decode strings of plain text that contain special characters, such as the "#" in this case, into valid UTF-8 encoded URL formats.

These methods fail to properly convert text to a URL because the Acrobat WebLink plug-in that handles this part of the conversion is doing it improperly.

Not only does the Acrobat WebLink plug-in seem to be glitchy, it is also encoding only some of the special characters with the correct encoding to UTF-8.

You WOULD NOT see this problem if the Weblink plug-in was DECODING UTF-8 to TEXT STRINGS instead of encoding to UTF-8, ASCII or ISO-8859-1. AND you would also be able to manually fix this problem if Acrobat allowed these weblinks to be properly tagged and containerized as text annotations in the document structure.

That is the bug! It should NOT encode; IT SHOULD DECODE to begin with.

A web page that uses the older HTML 4, for example, will accept both the text string "#" and %23 in a URL created with the Weblink plug-in, even with Acrobat Pro DC web linking acting up.

You can test with the URLs below using the weblink method in Acrobat:

https://www.w3.org/TR/REC-html40/interact/forms.html%23form-content-type

And you can also type it like this in any of your browsers address bar and it will also open up the page successfully:

https://www.w3.org/TR/REC-html40/interact/forms.html#form-content-type

To see the HTML document type used in that page and its encoded character set, press Ctrl+U on your keyboard and you will see this clearly outlined in the first few lines:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">

Notice the HTML document type and the type of content accepted to include the character set.

By itself this is not a bug but, as mentioned by Dave, it is related to security rules that are constantly under revision by the Internet Society (as addressed in the RFC). These were improved in HTML5, and the same behavior now results in a violation of the HTML5 standards.

Now compare how a page with HTML5 displays this same information:

//JUST THE FIRST LINE: !DOCTYPE HTML is how you identify HTML5; it doesn't have any other disclaimer notes. Simple.

<!DOCTYPE HTML>
    <html class="spectrum--medium" lang="en">


//THEN SCROLL DOWN PAST ALL THE SUPPORTED LANGUAGES AND YOU WILL SEE THE ENCODING TYPE

 <meta charset="UTF-8"/>

That said, it is worth noting that back in 2016 a vulnerability was addressed in the XMPCore libraries (which are used by the Weblink plug-in)

See the quote below here: https://www.cvedetails.com/cve/CVE-2016-4216/

"XMPCore in Adobe XMP Toolkit for Java before 5.1.3 allows remote attackers to read arbitrary files via XML data containing an external entity declaration in conjunction with an entity reference, related to an XML External Entity (XXE) issue."

To fix this exploit, Adobe recommended updating the XMPCore library:

https://helpx.adobe.com/security/products/xmpcore/apsb16-24.html

So I'm wondering whether this may explain, in a very indirect way, why your version, Acrobat Pro 2017, works as if there is no problem.

The period before this vulnerability was addressed coincides with the timeframe in which your version was released. It would be worth comparing Acrobat Pro DC and Acrobat Pro 2017 to see what the XMP that the Weblink driver registers with looks like in each.

In Acrobat Pro DC there is no bug when it comes to enforcing this UTF-8 decoding and encoding. At least we have a clear indication that the security mechanism is doing its job.

But the actual bug that I am identifying for the Adobe community lives in the Weblink plug-in driver registration process that is used in Acrobat Pro DC.

It is causing an unnecessary and incorrect encoding of string text to UTF-8 in the PDF itself which doesn't let the HTML 5 web pages that already enforce UTF-8 to properly fall back to the appropriate decoding.

Let's use this URL as an example: https://www.adobe.com/#page=3

You can visually spot that the conversion is incomplete after a link is created with the Weblink plug-in, because the "#" sign gets encoded to %23 but the "=" sign, for example, and the other reserved special characters do not.

If the link used in my example above was encoded consistently to UTF-8, ASCII, or ISO-8859-1 (also known as ISO-Latin-1), it would look like this:

https://www.adobe.com/%23page%3D3

But this is also incorrect.

See the default character-set in HTML5 for UTF-8 and how it works.

If this Weblink plug-in was doing its job correctly, when you manually type in a URL as plain text in the Weblink tool like this:

https://www.adobe.com/#page=3

it should read like this:

https://www.adobe.com/#page=3

No changes should occur.

And if the Weblink plug-in was actually encoding the whole text string to UTF-8, ASCII, or ISO-8859-1, then it would look like this:

https%3A%2F%2Fwww.adobe.com%2F%23page%3D3

The RFC that Dave referenced clearly states: "URI producing applications should percent-encode data octets that correspond to characters in the reserved set unless these characters are specifically allowed by the URI scheme to represent data in that component."

To see this for yourself use this online ENCODER/DECODER tool: https://www.urlencoder.org/
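The same comparison can be made without an online tool, in a browser console or Node.js. The sketch below only imitates the results described in this thread with standard JavaScript functions; it is not Acrobat's actual code.

```javascript
const typed = "https://www.adobe.com/#page=3";

// Escaping only the "#" reproduces the kind of link reported in this thread.
const hashOnly = typed.replace("#", "%23");
console.log(hashOnly); // https://www.adobe.com/%23page=3

// Escaping every reserved character turns the whole URL into one opaque
// data string, which is no longer usable as a link on its own.
const everything = encodeURIComponent(typed);
console.log(everything); // https%3A%2F%2Fwww.adobe.com%2F%23page%3D3

// Decoding restores the text exactly as it was typed.
console.log(decodeURIComponent(everything) === typed); // true
```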

From digging further into the Acrobat DC SDK documentation, the Weblink plug-in seems to be having an issue somewhere with one or more of the following: Unicode paths (on MS Windows, because Unicode is separate from the default encoding standard of the OS, which nowadays is UTF-16; this is not a problem in macOS and other Unix-like OSs because Unicode is already the standard), the Host Function Tables used by the Weblink plug-in (https://help.adobe.com/en_US/acrobat/acrobat_dc_sdk/2015/HTMLHelp/Acro12_MasterBook/API_References_S...), the XMPCore libraries, the Weblink extended API, incorrect character sets or character tables, and possibly incorrect XMP metadata. More importantly, look here: https://help.adobe.com/en_US/acrobat/acrobat_dc_sdk/2015/HTMLHelp/Acro12_MasterBook/API_References_S...

Also look at this good old document:

https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/WeblinkAPIReference.pdf

The document linked below actually explains how the URI actions are "parsed" on the Acrobat side of the house:

https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf

Report · Jun 02, 2020

Hi,

Today a new optional update was released that addresses this issue.

Update Acrobat and see how it goes.

Creating a URL with the Weblink plug-in seems to be working now.

I am getting a different error though.

"The Web Capture operation you have requested has failed because of an error"

Just checking on my end what is the issue.

Report · Jun 02, 2020

THANK YOU ADOBE!!!

this issue has been fixed with the last update to version:

2020.009.20067

Additional note to the Acrobat community users:

If you get the error "The Web Capture operation you have requested has failed because of an error", it is because, when you type an http URL using a URI-producing application such as the Acrobat Weblink plug-in, you don't need to type https://www.adobe.com.

Just type your desired URL as:

https://adobe.com

Websites written in HTML5 will handle this standard convention.

You will notice that your web browser will find the page and convert the whole URL automatically to:

https://www.adobe.com

Report · Aug 31, 2021

This seems to be a problem again in version 2021.005.20060. Does anyone know of a good fix this time?