Participant
August 16, 2023
Answered

.writeToStream on the fileRef returned from extractPDFOperation not working

I am working in a read-only serverless environment, so instead of saving the zip file to disk and then reading it back, I tried to read it directly into memory. The writeToStream method does not work for this. It works with other operations where the SDK returns PDF files (such as Split PDF), but not with the zip file returned from extracting text. The documentation clearly states that this method should be an option, but it does not seem to be in this case.

Exception encountered while executing operation TypeError: result.saveToStream is not a function

    Correct answer Raymond Camden

    Oh, you are calling saveToStream: "result.saveToStream is not a function". Shouldn't it be writeToStream?

    1 reply

    Raymond Camden
    Community Manager
    August 16, 2023

    Could you share a bit more of your code?

    Participant
    August 16, 2023
     ScrapeK1: async function ScrapeK1(data) {
        console.log("FX ScrapeK1");
        try {
         
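          const PDFServicesSdk = require("@adobe/pdfservices-node-sdk");
          const path = require("path");
          const util = require("util"); // needed for util.inspect below
          const { Readable } = require("stream"); // needed for Readable.from below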

          const credentials = await PDFServicesSdk.Credentials.servicePrincipalCredentialsBuilder()
            .withClientId(process.env.PDF_SERVICES_CLIENT_ID)
            .withClientSecret(process.env.PDF_SERVICES_CLIENT_SECRET)
            .build();

          const executionContext = PDFServicesSdk.ExecutionContext.create(credentials);

          const options = new PDFServicesSdk.ExtractPDF.options.ExtractPdfOptions.Builder()
            .addElementsToExtract(PDFServicesSdk.ExtractPDF.options.ExtractElementType.TEXT)
            .build();

          const extractPDFOperation = PDFServicesSdk.ExtractPDF.Operation.createNew();

          const buffer = data.file;
          console.log(buffer);

          const stream = Readable.from(buffer);

          console.log(stream);
          const input = PDFServicesSdk.FileRef.createFromStream(stream, "application/pdf");

          extractPDFOperation.setInput(input);

          extractPDFOperation.setOptions(options);

          // Generating a file name
          let outputFilePath = createOutputFilePath("/tmp");

          const AdmZip = require("adm-zip");

          return await extractPDFOperation
            .execute(executionContext)
            .then(async (result) => {
              // Save the zip file -- Right here is the function that cannot use the stream method
              await result.saveAsFile(outputFilePath);

              const zip = new AdmZip(outputFilePath);
              const zipEntries = zip.getEntries();

              const structuredDataEntry = zipEntries.find((entry) => entry.entryName === "structuredData.json");
              if (!structuredDataEntry) {
                console.log("structuredData.json not found in the zip.");
                return;
              }

              const jsonData = structuredDataEntry.getData().toString("utf8");

              const parsedData = JSON.parse(jsonData);

              console.log("JSON Data:", util.inspect(parsedData, { depth: null }));

              return parsedData.elements;
            })
            .catch((err) => {
              if (err instanceof PDFServicesSdk.Error.ServiceApiError || err instanceof PDFServicesSdk.Error.ServiceUsageError) {
                console.log("Exception encountered while executing operation", err);
              } else {
                console.log("Exception encountered while executing operation", err);
              }
            });

          //Generates a string containing a directory structure and file name for the output file.
          function createOutputFilePath(directory) {
            let date = new Date();
            let dateString =
              date.getFullYear() +
              "-" +
              ("0" + (date.getMonth() + 1)).slice(-2) +
              "-" +
              ("0" + date.getDate()).slice(-2) +
              "T" +
              ("0" + date.getHours()).slice(-2) +
              "-" +
              ("0" + date.getMinutes()).slice(-2) +
              "-" +
              ("0" + date.getSeconds()).slice(-2);
            return path.join(directory, dateString + ".zip");
          }
        } catch (err) {
          console.log("Exception encountered while executing operation", err);
        }
      },
    Raymond Camden
    Community Manager
    August 16, 2023

    If you log out result, is it a FileRef? What object does it appear to be?

    Also, if I were doing serverless work with our APIs, I'd skip the SDK and just hit the *super* simple REST API directly. Much more control that way.
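
For the in-memory approach the original question was after, here is a minimal sketch. It assumes the result object exposes writeToStream(writable), as the accepted answer suggests, and that the call ends the writable once all data has been written (as a pipe would). BufferCollector and fileRefToBuffer are hypothetical names used for illustration; they are not part of the SDK.

```javascript
const { Writable } = require("stream");

// Writable stream that keeps every chunk it receives in memory.
class BufferCollector extends Writable {
  constructor(options) {
    super(options);
    this.chunks = [];
  }

  _write(chunk, encoding, callback) {
    // Chunks normally arrive as Buffers; convert defensively otherwise.
    this.chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk, encoding));
    callback();
  }

  toBuffer() {
    return Buffer.concat(this.chunks);
  }
}

// Hypothetical helper: resolves with the full contents of `fileRef`.
// Assumption: fileRef.writeToStream(writable) writes all data and then
// ends the writable, which triggers the "finish" event.
function fileRefToBuffer(fileRef) {
  return new Promise((resolve, reject) => {
    const collector = new BufferCollector();
    collector.on("finish", () => resolve(collector.toBuffer()));
    collector.on("error", reject);
    fileRef.writeToStream(collector);
  });
}
```

Inside the .then(async (result) => ...) callback from the posted code, something like `const zipBuffer = await fileRefToBuffer(result);` could stand in for the saveAsFile call. adm-zip accepts a Buffer in place of a file path (`new AdmZip(zipBuffer)`), so the rest of the zip handling would not need to touch /tmp. If writeToStream does not end the stream in your SDK version, the promise would never resolve, so it is worth checking that behavior first.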