I need help with the following:
I want to apply OCR to a PDF document and save the result as a new PDF document.
The problem is that the output file contains only those pages of the original that had no text. Pages that already contain text are skipped by OCR, and they are also missing from the output file.
What I need is the following behaviour:
- OCR is applied to the document
- Pages that already contain text are excluded from OCR
- The output file contains all pages of the original; the only difference is that the previously textless pages now contain text.
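To make the expected behaviour concrete, here is a minimal sketch in plain VB.NET that does not call the SDK at all. The names `PageInfo`, `HasText`, `PagesToOcr` and `PagesInOutput` are made up for illustration; they only model which pages I expect to be recognized and which pages I expect to find in the output file.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module ExpectedOcrBehaviour
    ' Hypothetical model of a page; this is not an SDK type.
    Public Class PageInfo
        Public Property Index As Integer
        Public Property HasText As Boolean
    End Class

    ' Pages I expect OCR to process: only those without text.
    Public Function PagesToOcr(pages As IEnumerable(Of PageInfo)) As List(Of Integer)
        Return pages.Where(Function(p) Not p.HasText).
                     Select(Function(p) p.Index).ToList()
    End Function

    ' Pages I expect in the output file: all of them, in original order.
    Public Function PagesInOutput(pages As IEnumerable(Of PageInfo)) As List(Of Integer)
        Return pages.Select(Function(p) p.Index).ToList()
    End Function

    Sub Main()
        Dim pages = New List(Of PageInfo) From {
            New PageInfo With {.Index = 0, .HasText = True},
            New PageInfo With {.Index = 1, .HasText = False},
            New PageInfo With {.Index = 2, .HasText = True},
            New PageInfo With {.Index = 3, .HasText = False}
        }
        Console.WriteLine("OCR:    " & String.Join(", ", PagesToOcr(pages)))
        Console.WriteLine("Output: " & String.Join(", ", PagesInOutput(pages)))
    End Sub
End Module
```

For this four-page example, pages 1 and 3 would be sent to OCR, while the output file should still contain pages 0 to 3. What I currently get instead is an output file with only pages 1 and 3.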
My settings look like this:
Private Sub SetOptionsNode(ByRef optionNode As ICabNode, ByVal argInfo As ArgumentInfo)
    optionNode("PagesRange.Type").v = "All"
    optionNode("ExtParams.Accuracy").v = argInfo.dpi
    optionNode("ExtParams.Language").v = argInfo.languages
    optionNode("ExtParams.AutoDeskew").v = argInfo.autoDeskew
    optionNode("OCRNoTextPagesOnly").v = argInfo.ocrNoTextPagesOnly
    optionNode("OutputType").v = 1
    optionNode("OutputDPI").v = 0
End Sub
The difference from the other OCR posts is the OutputType setting. So far I have only found posts where OutputType is set to 0. But to be able to skip pages that already contain text, OutputType = 1 is needed. Accordingly, in my case "IOperation.Params.Root("Output").v" is written as the output file.
I hope you can help me.