Participant
November 18, 2017
Question

OCR - w/ColdFusion

Has anyone attempted to perform OCR with CF? I have been tasked with extracting the text from a faxed image file (TIF). Are there any open-source OCR APIs that integrate with CF?

3 replies

Participant
January 5, 2024

Here's what worked for me:

Download Tesseract OCR; it's free and open source: https://tesseract-ocr.github.io/tessdoc/Downloads.html

Then in cfscript:

public function ocr(FilePath) {
    ocrtext = "";
    cfexecute(
        name=full_tesseract_filepath
        arguments="""" & FilePath & """ stdout"
        variable="ocrtext",
        timeout="10"
        );
    return ocrtext;
}

Note: full_tesseract_filepath is usually "C:\Program Files\Tesseract-OCR\tesseract.exe" on Windows machines.

Participant
January 5, 2024

Oops: I was missing a few commas in that function call. It should be:

cfexecute(
    name=full_tesseract_filepath,
    arguments="""" & FilePath & """ stdout",
    variable="ocrtext",
    timeout="10"
);
EddieLotter
Inspiring
January 5, 2024

Note that to avoid clumsy double quotes and concatenation you can change the arguments parameter syntax to the following:

arguments='"#FilePath#" stdout',

It's easier to read. 
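
Putting the corrected call and the single-quote syntax together, the whole function might look something like the sketch below. The `tesseractPath` argument, the 30-second timeout, and the `errorVariable` handling are assumptions added for illustration and are not part of the posts above, so adjust them for your own server.

```cfscript
// A minimal sketch, not a tested implementation.
// tesseractPath is a hypothetical argument; its default is the usual Windows install location.
public string function ocr(
    required string filePath,
    string tesseractPath = "C:\Program Files\Tesseract-OCR\tesseract.exe"
) {
    // Keep the output in the function-local scope so requests do not share it.
    local.ocrText = "";
    local.ocrError = "";

    cfexecute(
        name = arguments.tesseractPath,
        // "stdout" as the output base makes Tesseract print the text instead of writing a file.
        arguments = '"#arguments.filePath#" stdout',
        variable = "local.ocrText",
        // Tesseract writes warnings to stderr; capture them separately from the OCR text.
        errorVariable = "local.ocrError",
        timeout = 30 // assumed value; large faxes may need longer
    );

    return trim(local.ocrText);
}
```

Calling it would then be something like `text = ocr("C:\faxes\incoming.tif");`, where the fax path is only an example.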

Participant
November 21, 2017

You will likely not find a free OCR platform out there.

However, my company wrote a ColdFusion and Java integration for Google Cloud Vision (also referenced above). You would obviously still need to pay for the usage on Cloud Vision.

https://github.com/Construction-Monitor/coldfusion-vision-api

Participant
November 21, 2017

Also, as a note: you mention you have TIF images. As far as I am aware, you can't send TIF images in directly, so you would need to convert them to JPEG first. My recommendation for that would be to fall back to command-line tools like GraphicsMagick, as their performance is much better than ColdFusion's built-in image tag.
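
For what it's worth, the conversion step could be sketched in cfscript roughly as follows. This assumes GraphicsMagick is installed on the server; the function name `tifToJpeg` and the `gmPath` argument are made up for illustration, and a multi-page fax would need extra handling that this sketch does not attempt.

```cfscript
// A minimal sketch, assuming GraphicsMagick is installed and gmPath points at the gm executable.
// tifToJpeg and gmPath are hypothetical names, not something from the posts above.
public string function tifToJpeg(
    required string tifPath,
    required string gmPath
) {
    // Write the JPEG next to the source file, swapping the .tif/.tiff extension for .jpg.
    local.jpgPath = reReplaceNoCase(arguments.tifPath, "\.tiff?$", "") & ".jpg";
    local.gmOutput = "";
    local.gmError = "";

    cfexecute(
        name = arguments.gmPath,
        // Equivalent to running: gm convert "input.tif" "input.jpg"
        arguments = 'convert "#arguments.tifPath#" "#local.jpgPath#"',
        variable = "local.gmOutput",
        errorVariable = "local.gmError",
        timeout = 30 // assumed value
    );

    // Fail loudly if GraphicsMagick did not produce the expected file.
    if (!fileExists(local.jpgPath)) {
        throw(message = "TIF to JPEG conversion failed", detail = local.gmError);
    }

    return local.jpgPath;
}
```

The returned path can then be handed to whichever OCR service you are using.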

Participant
November 21, 2017

Thanks Daniel,

I went ahead and purchased an SDK and combined it with a DLL I created to communicate with CF, which converts the image to text via CFOBJECT. Thanks for your input.

Participant
November 20, 2017

The Google Cloud Vision API can extract text from an image or PDF file. 

https://cloud.google.com/vision/

WolfShade
Legend
November 20, 2017

rbuong, that is neither open-source, nor ColdFusion.

armandof44358221, I did a quick Google search and found a blog entry from 2009 that might get you started on the right path.

http://coldfusion.sys-con.com/node/1173727

HTH,

^ _ ^

UPDATE:  Okay.. it's not CF, but it is an API that can be used from CF (ColdFusion or ColdBox).  However, it is a commercial API, and not free.  You will be charged anywhere from US$1.50 to US$3.50 per 1,000 API calls, depending on what you are trying to do with it.