Problems exporting PDF to Word with Indigenous language characters

New Here, Jan 30, 2021

Hello, I am working on our Indigenous language and trying to convert my PDF to Word. The linguistic symbols are not coming across. Is there a specific language I should select on export? Once the document is in Word, I can quickly edit it, save it as a CSV, and import it into SQL Server.
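Editor's note: for the CSV step mentioned above, the file's encoding matters as much as the export itself. The following is a minimal sketch, assuming the cleaned-up entries are available as rows of text and that UTF-8 with a byte order mark is an acceptable encoding for the later import. The function names and the sample strings are hypothetical (the strings are placeholders made of phonetic symbols, not real vocabulary), and the SQL Server side, which also has to be set up to accept Unicode, is not shown.

```python
import csv
import io


def rows_to_csv_bytes(rows):
    """Serialize rows to CSV bytes encoded as UTF-8 with a byte order mark."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    # "utf-8-sig" prepends the BOM so the encoding is explicit in the file.
    return buffer.getvalue().encode("utf-8-sig")


def csv_bytes_to_rows(data):
    """Decode CSV bytes written by rows_to_csv_bytes back into rows."""
    text = data.decode("utf-8-sig")
    return list(csv.reader(io.StringIO(text, newline="")))


# Placeholder rows (assumption): phonetic symbols, not real words.
sample = [
    ["headword", "gloss"],
    ["ʔ", "placeholder gloss 1"],
    ["x̌ʷ", "placeholder gloss 2"],
]

data = rows_to_csv_bytes(sample)
# Reading the bytes back should return exactly the rows that were written.
print(csv_bytes_to_rows(data) == sample)
```

If the round trip does not return the original rows, the symbols were already lost before the CSV step, which points back at the PDF or OCR stage rather than the export.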

TOPICS
Edit and convert PDFs, General troubleshooting, How to, Scan documents and OCR

Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Most Valuable Participant, Jan 30, 2021

Unfortunately, the encoding of the fonts in this file is bad. You'll notice you can't even correctly copy and paste text in English from it to another document. This means it can't be exported to another format, like Word.
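Editor's note: the copy-and-paste check described above can be made a little more systematic. This is only a sketch, assuming you paste text copied from the PDF into a Python string. The function name `suspicious_characters` is hypothetical, and the categories it flags (private use, unassigned, control, and replacement characters) are common signs of a font encoding problem, not a definitive test.

```python
import unicodedata


def suspicious_characters(text):
    """List code points that often indicate text copied from a badly encoded font."""
    found = {}
    for ch in text:
        category = unicodedata.category(ch)
        if ch == "\ufffd":
            found[ch] = "replacement character"
        elif category == "Co":
            found[ch] = "private use area"
        elif category == "Cn":
            found[ch] = "unassigned code point"
        elif category == "Cc" and ch not in "\t\n\r":
            found[ch] = "control character"
    # Report each distinct character once, in code point order.
    return [(f"U+{ord(ch):04X}", reason) for ch, reason in sorted(found.items())]


# Hypothetical pasted text containing a private use character and a replacement character.
pasted = "a\ue001b\ufffd"
for code_point, reason in suspicious_characters(pasted):
    print(code_point, reason)
```

An empty result does not prove the encoding is good, since a bad font mapping can also produce ordinary but wrong letters. A non-empty result is a strong hint that export to Word will fail in the same way.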

New Here, Jan 30, 2021

Ahh, you hit on something that prompted me to try a different method of preparation. The English is coming through now. I will attach the new files to the original post.

New Here, Jan 30, 2021

Capture.PNG

Most Valuable Participant, Jan 30, 2021

You're creating the file? Is it being scanned and then OCRed? If so, make sure the fonts used support Unicode for best results.

New Here, Jan 30, 2021

I have a scanned picture of the page, and I create a PDF from it. I then OCR the PDF using English. Which language should I select during the OCR process?

Most Valuable Participant, Jan 31, 2021

To get effective OCR, you need to choose the actual language of the text. Accurate OCR is a complex process that relies on, among other things, the language's structure, its punctuation and accent set, spell checking, and other language-specific techniques. Many languages, including major world languages like Arabic, have no support in Acrobat, so if you are working with a less well-known language, the chances of turning a scan into accurate text are small.
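Editor's note: one way to see how far an English OCR setting falls short is to hand-type a short passage and compare it against what the OCR produced for the same passage. The sketch below assumes both are available as strings. `ocr_substitutions` is a hypothetical helper built on Python's standard `difflib`, and the sample strings are placeholders made of phonetic symbols, not real words or real Acrobat output.

```python
import difflib
import unicodedata


def ocr_substitutions(reference, ocr_output):
    """Return (expected, got) pairs wherever the OCR text differs from the reference."""
    # Normalize both sides so composed and decomposed accents compare equal.
    ref = unicodedata.normalize("NFC", reference)
    out = unicodedata.normalize("NFC", ocr_output)
    matcher = difflib.SequenceMatcher(None, ref, out, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((ref[i1:i2], out[j1:j2]))
    return changes


# Placeholder strings (assumption): a hand-typed reference and a made-up OCR result.
reference = "ʔa ƛə"
ocr_output = "?a Xe"
for expected, got in ocr_substitutions(reference, ocr_output):
    print(repr(expected), "->", repr(got))
```

If the same handful of symbols is replaced every time, a find-and-replace pass in Word may be workable. If the substitutions are inconsistent, retyping or a different OCR tool is likely the more reliable route.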
