OCR - w/ColdFusion

November 18, 2017

Has anyone attempted to perform OCR with CF? I have been tasked with extracting the text from a faxed image file (TIF). Are there any open source OCR APIs to integrate with CF?

Timothy29913166fgds

Participant

Here's what worked for me:

Download tesseract ocr, it's free and open-source: https://tesseract-ocr.github.io/tessdoc/Downloads.html

Then in cfscript:

public function ocr(FilePath) {
    ocrtext = "";
    cfexecute(
        name=full_tesseract_filepath
        arguments="""" & FilePath & """ stdout"
        variable="ocrtext",
        timeout="10"
        );
    return ocrtext;
}

Note: full_tesseract_filepath is usually "C:\Program Files\Tesseract-OCR\tesseract.exe" on Windows machines.

Timothy29913166fgds

Participant

Oops: I was missing a few commas in that function call. It should be:

cfexecute(
    name=full_tesseract_filepath,
    arguments="""" & FilePath & """ stdout",
    variable="ocrtext",
    timeout="10"
);

EddieLotter

Inspiring

Note that to avoid clumsy double quotes and concatenation you can change the arguments parameter syntax to the following:

arguments='"#FilePath#" stdout',

It's easier to read.
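
Putting the corrected call and the simpler quoting together, a minimal sketch of the whole function might look like the following. The tesseractPath value, the buildOcrArguments helper, the 60-second timeout, and the OCR.FileNotFound error type are assumptions for illustration, not something from the posts above.

```cfscript
// Assumption: path to the Tesseract executable; adjust for your install.
variables.tesseractPath = "C:\Program Files\Tesseract-OCR\tesseract.exe";

// Builds the argument string: the quoted input file, then "stdout" so
// Tesseract writes the recognized text to standard output.
public string function buildOcrArguments(required string filePath) {
    return '"#arguments.filePath#" stdout';
}

public string function ocr(required string filePath) {
    // Function-local variables keep results out of the shared variables scope.
    var ocrText = "";
    var ocrError = "";

    if (!fileExists(arguments.filePath)) {
        throw(type="OCR.FileNotFound", message="File not found: #arguments.filePath#");
    }

    cfexecute(
        name=variables.tesseractPath,
        arguments=buildOcrArguments(arguments.filePath),
        variable="local.ocrText",
        errorVariable="local.ocrError",
        timeout=60
    );

    return trim(local.ocrText);
}
```

A call such as ocr("C:\faxes\page1.tif") (a hypothetical path) would then return the recognized text. Anything Tesseract writes to standard error ends up in ocrError, which may be worth logging when the result comes back empty.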

danielheighton

Participant

You will likely not find a free OCR platform out there.

However, my company wrote a ColdFusion and Java integration for Google Cloud Vision (also referenced above). You would obviously still need to pay for your usage of Cloud Vision.

https://github.com/Construction-Monitor/coldfusion-vision-api

danielheighton

Participant

Also, as a note: you mention you have TIF images. As far as I am aware, you can't send TIF images in directly, so you would need to convert them to JPEG first. My recommendation for that would be to fall back to command-line tools like GraphicsMagick, as their performance is much better than ColdFusion's built-in image tag.
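
As a rough sketch of that conversion step, GraphicsMagick's "gm convert" command can be called through cfexecute in the same way as Tesseract. The gmPath value is a placeholder, and the buildConvertArguments and tifToJpeg names are made up for this example. It also assumes a single-page TIF; multi-page faxes would need extra handling.

```cfscript
// Placeholder location of the GraphicsMagick executable; adjust for your install.
variables.gmPath = "C:\path\to\gm.exe";

// Builds the arguments for: gm convert <input> <output>
public string function buildConvertArguments(required string tifPath, required string jpgPath) {
    return 'convert "#arguments.tifPath#" "#arguments.jpgPath#"';
}

public string function tifToJpeg(required string tifPath) {
    // Write the JPEG next to the source file, swapping the .tif/.tiff extension.
    var jpgPath = reReplaceNoCase(arguments.tifPath, "\.tiff?$", "") & ".jpg";
    var gmError = "";

    cfexecute(
        name=variables.gmPath,
        arguments=buildConvertArguments(arguments.tifPath, jpgPath),
        errorVariable="local.gmError",
        timeout=60
    );

    // Returns the path of the JPEG that gm was asked to write.
    return jpgPath;
}
```

The function only returns the target path, so it is worth checking fileExists() on the result before passing it on.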

armandof44358221 (Author)

Participant

Thanks Daniel,

I went ahead and purchased an SDK and combined it with a DLL I created to communicate with CF, which converts the image to text via CFOBJECT. Thanks for your input.

rbuong

Participant

The Google Cloud Vision API can extract text from an image or PDF file.

https://cloud.google.com/vision/

WolfShade

Legend

rbuong, that is neither open-source, nor ColdFusion.

armandof44358221, I did a quick Google search, and found a blog entry from 2009 that might get you started on the right path.

http://coldfusion.sys-con.com/node/1173727

HTH,

^ _ ^

UPDATE: Okay.. it's not CF, but it is an API that can be used from CF (ColdFusion or ColdBox). However, it is a commercial API, and not free. You will be charged anywhere from US$1.50 to US$3.50 per 1,000 API calls, depending on what you are trying to do with it.
