Participant
November 11, 2020
Question

When I perform OCR on a PDF using the PDF Tools API, the output is always a PDF. How can I get the output as .txt?


I'm using the PDF Tools API for OCR. Below is the code.

 

/*
* Copyright 2019 Adobe
* All Rights Reserved.
*
* NOTICE: Adobe permits you to use, modify, and distribute this file in
* accordance with the terms of the Adobe license agreement accompanying
* it. If you have received this file from a source other than Adobe,
* then your use, modification, or distribution of it requires the prior
* written permission of Adobe.
*/

package com.journaldev.bootifulmongo.adobe;

import java.io.IOException;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.adobe.platform.operation.ExecutionContext;
import com.adobe.platform.operation.auth.Credentials;
import com.adobe.platform.operation.exception.SdkException;
import com.adobe.platform.operation.exception.ServiceApiException;
import com.adobe.platform.operation.exception.ServiceUsageException;
import com.adobe.platform.operation.io.FileRef;
import com.adobe.platform.operation.pdfops.OCROperation;

/**
* This sample illustrates how to perform OCR operation on a PDF file and convert it into a searchable PDF file.
* <p>
* Note that OCR operation on a PDF file results in a PDF file.
* <p>
* Refer to README.md for instructions on how to run the samples.
*/
public class OcrPDF {

    // Initialize the logger.
    private static final Logger LOGGER = LoggerFactory.getLogger(OcrPDF.class);

    public static void main(String[] args) {
        try {
            // Initial setup, create credentials instance.
            Credentials credentials = Credentials.serviceAccountCredentialsBuilder()
                    // .fromFile("pdftools-api-credentials.json")
                    .fromFile("D:\\New folder\\bootiful-mongo\\src\\pdftools-api-credentials.json")
                    .build();

            // Create an ExecutionContext using credentials and create a new operation instance.
            ExecutionContext executionContext = ExecutionContext.create(credentials);
            OCROperation ocrOperation = OCROperation.createNew();

            // Set operation input from a source file.
            // FileRef source = FileRef.createFromLocalFile("src/main/resources/ocrInput.pdf");
            FileRef source = FileRef.createFromLocalFile("D:\\New folder\\bootiful-mongo\\src\\main\\resources\\7_input\\P7P1.pdf");
            ocrOperation.setInput(source);

            // Execute the operation.
            FileRef result = ocrOperation.execute(executionContext);

            // Save the result at the specified location.
            // Note: the OCR result is a PDF, so this file holds PDF data despite its .txt extension.
            result.saveAs("D:\\New folder\\bootiful-mongo\\src\\main\\resources\\adobe_output\\res.txt");

        } catch (ServiceApiException | IOException | SdkException | ServiceUsageException ex) {
            LOGGER.error("Exception encountered while executing operation", ex);
        }
    }
}



I want the output as a .txt file, in English.

1 reply

Participating Frequently
December 8, 2020

Hi @jaymin5CFE, thanks for reaching out to us. We currently do not support saving the OCR output directly to a .txt file.
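
Since the OCR result is a searchable PDF, one possible workaround is to save it with a .pdf extension and extract its text layer in a separate step. The following is only a minimal sketch of that post-processing step, not something the PDF Tools SDK provides: the class OcrTextExport, the PdfTextExtractor hook, and the helpers textPathFor and export are hypothetical names introduced here. The extractor is left pluggable; the comment in main shows how it might look with Apache PDFBox 2.x, assuming that library is on the classpath.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class OcrTextExport {

    /** Hypothetical hook: turns a searchable PDF into plain text. */
    @FunctionalInterface
    public interface PdfTextExtractor {
        String extract(Path pdf) throws IOException;
    }

    /** Returns the path of a .txt file that sits next to the given PDF. */
    static Path textPathFor(Path pdf) {
        String name = pdf.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        return pdf.resolveSibling(base + ".txt");
    }

    /** Extracts the text of the OCR'd PDF and writes it as UTF-8 next to the PDF. */
    static Path export(Path pdf, PdfTextExtractor extractor) throws IOException {
        Path txt = textPathFor(pdf);
        Files.write(txt, extractor.extract(pdf).getBytes(StandardCharsets.UTF_8));
        return txt;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in extractor for demonstration only. With Apache PDFBox 2.x
        // (an assumption, not part of the PDF Tools SDK) it could be:
        //   pdf -> {
        //       try (PDDocument doc = PDDocument.load(pdf.toFile())) {
        //           return new PDFTextStripper().getText(doc);
        //       }
        //   }
        PdfTextExtractor extractor = pdf -> "text extracted from " + pdf.getFileName();

        // A temporary file stands in for the PDF saved by result.saveAs(...).
        Path pdf = Files.createTempFile("ocr-result", ".pdf");
        Path txt = export(pdf, extractor);
        System.out.println("Wrote " + txt);
    }
}
```

In the original sample this would mean calling result.saveAs(...) with a path ending in .pdf, then passing that path to export together with a real extractor. Text quality depends entirely on the text layer that the OCR step produced.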