Unicode words mixed up with special characters and numeric values.

Explorer, Mar 16, 2023

A user copies Arabic words from a PDF file and pastes them into a text box.

After pasting, the words look fine, with no alien characters. But on submit, the value arrives with alien characters: special characters mixed with numeric values.

Please suggest a way to identify such alien string values on the client side and validate the user's input right there.

e.g.

دھﺎﻧﺎراج ﺗﻮﻣﺎس ﺟﯿﺎراج

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more

1 Correct answer

Community Expert, Mar 16, 2023

A suggestion:

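  1.  If the string is written as a literal in CFML code, escape every # by doubling it to ## (a single # starts a CFML expression).
  2.  Then pass the string to the canonicalize function, which decodes the numeric character references (&#NNNNN;) back into characters.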

    Example:
    <cfscript>
    originalString="دھ&##65166;&##65255;&##65166;راج &##65175;&##65262;&##65251;&##65166;س &##65183;&##64511;&##65166;را;";
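    // Decode the &#NNNNN; references; the two false flags turn off the multiple- and mixed-encoding restrictions.
    canonicalizedString=canonicalize(originalString,false,false);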
    writeoutput(canonicalizedString);
    </cfscript>
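To see what the decoding step does to the submitted value, here is a minimal JavaScript sketch. It is not the canonicalize function, which handles many more encodings. `decodeNumericRefs` is a hypothetical helper that only understands decimal and hexadecimal numeric character references, and the sample input is shortened from the string above.

```javascript
// Hypothetical helper: decodes &#NNNN; and &#xHHHH; references only.
// It does not cover named entities, percent-encoding, or the
// multiple/mixed-encoding checks that canonicalize performs.
function decodeNumericRefs(input) {
  return input.replace(/&#([xX][0-9a-fA-F]+|[0-9]+);/g, (match, ref) => {
    const isHex = ref[0] === "x" || ref[0] === "X";
    const codePoint = isHex ? parseInt(ref.slice(1), 16) : parseInt(ref, 10);
    // Leave out-of-range references untouched rather than throwing.
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
  });
}

// 65166 is U+FE8E (ARABIC LETTER ALEF FINAL FORM),
// 65255 is U+FEE7 (ARABIC LETTER NOON INITIAL FORM).
const decoded = decodeNumericRefs("&#65166;&#65255;");
console.log(decoded.codePointAt(0).toString(16)); // fe8e
```

The code points in the example all fall in the Arabic Presentation Forms blocks, which suggests the "numeric values" are references to characters the form's character set could not represent.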

Community Expert, Mar 16, 2023

Do you want to perform validation in the PDF or in the HTML form? Can you possibly accept the PDF as a form submission rather than copying from PDF to HTML? What's the character set in the PDF, and what is it in the HTML form and the CF server?

 

Dave Watts, Eidolon LLC

Community Beginner, Mar 16, 2023

Sometimes users copy text from a PDF and paste it into the HTML form. That is when the issue occurs.

Copying from Google Translate or typing directly into the HTML text box causes no problem. Only text copied and pasted from a PDF is affected. I hope this makes the matter clear.

Community Expert, Mar 16, 2023

Right, but my questions still stand. Do you want to change the PDF? Do you want to change how you accept form data? Right now, I think the problem is an incompatibility between the character set used by the PDF and the one used by CF and your HTML form.

 

Dave Watts, Eidolon LLC

Community Beginner, Mar 16, 2023

No, Dave, I don't want to change the PDF. I want to validate the text pasted into the HTML text box on the client side, with either JavaScript or jQuery, so that the user cannot submit incorrect Unicode text.
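A minimal sketch of such a client-side check in plain JavaScript. It assumes the "incorrect" characters are Arabic presentation forms (the positional glyph variants in U+FB50 to U+FDFF and U+FE70 to U+FEFC), which is what the sample string in this thread contains. The function names and the `attachTo` wiring are hypothetical, and other kinds of bad input would need their own checks.

```javascript
// Arabic Presentation Forms-A (U+FB50-U+FDFF) and Forms-B (U+FE70-U+FEFC).
// Text copied from a PDF can contain these positional glyph forms
// instead of the base letters in U+0600-U+06FF.
const PRESENTATION_FORMS = /[\uFB50-\uFDFF\uFE70-\uFEFC]/;

// True if the text contains at least one presentation-form character.
function hasPresentationForms(text) {
  return PRESENTATION_FORMS.test(text);
}

// NFKC normalization folds each presentation form back to its base letter.
function normalizeArabic(text) {
  return text.normalize("NFKC");
}

// Hypothetical wiring: clean up a text field whenever its value changes.
function attachTo(field) {
  field.addEventListener("change", () => {
    if (hasPresentationForms(field.value)) {
      field.value = normalizeArabic(field.value);
    }
  });
}

const pasted = "\uFE8E\uFEE7"; // ALEF FINAL FORM, NOON INITIAL FORM
console.log(hasPresentationForms(pasted));               // true
console.log(normalizeArabic(pasted) === "\u0627\u0646"); // true
```

Whether to normalize the value silently, as sketched here, or to reject it and show the user a message is a design choice. Either way the server should repeat the check, since client-side validation can be bypassed.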

Explorer, Mar 16, 2023

BKBK,
This is really amazing. This solution works.
The issue had persisted in my application for a very long time. I always advised users to avoid copying and pasting Arabic words from PDFs. Now both our clients and I can relax.
I still have to test it thoroughly, and then I will deploy it to production.

Community Expert, Mar 17, 2023

This is a better answer than mine. Thanks!

 

Dave Watts, Eidolon LLC 
