Combine convert and OCR tasks

Posted: Sun Feb 18, 2018 12:54 pm
by trwul62
Up front: I am a first time user of PDF-Tools (v7.0). It is still in demo mode.
I'd like to use the tool primarily to batch convert files, something that is really poor within Acrobat.
A few questions:
1. It seems that converting and OCR are two different jobs - can they be combined into one task: first convert, then OCR immediately afterwards?

2. what is the limit of a single conversion task? 100 files, 1000 files?

3. how to convert Excel and Word documents simultaneously?

4. will PDF-Tools launch Excel and Word each time, for each conversion? (making it impossible to perform any other tasks)

Thanks!

Re: Combine convert and OCR tasks

Posted: Mon Feb 19, 2018 3:04 pm
by Tracker Supp-Stefan
Hello trwul62,

Welcome to our forums, and thanks for your enquiry.
Yes - the steps are separate: you need to convert the file(s) to the PDF file format first, and then our OCR engine can do its job. However, you can create custom tools that take the source file(s) as input and then perform several operations on them.
Take a look at e.g. the "Scan to PDF" tool that is already prepared. It has as a first step - get images from scanner, then the second one is to create the PDF files, and then the third one is to OCR the pages of the newly created PDF files.

As for converting different file formats - that's also possible - just make sure that in the "Choose Input Files" menu you've selected to allow all supported file formats.

We do call on the Word or Excel APIs to help us with the conversion process, but having a separate window (e.g. Word) open should not be an issue.

If you process more than 100 files at a time, we might need to allocate a bit more memory and some processing power for managing the number of files, but there should not really be any hard limit on the number. Still, I'd recommend running reasonable batches, so that if something happens you can recover from it more quickly and need to reprocess fewer files.

Regards,
Stefan
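
Stefan's advice to run reasonable batches can be prepared outside PDF-Tools before the files are handed over. The following is only a minimal sketch: the function name `make_batches`, the batch size of 100 and the sample file names are assumptions made for illustration, and nothing here calls PDF-Tools itself.

```python
from pathlib import Path
from typing import Iterator, List


def make_batches(files: List[Path], batch_size: int = 100) -> Iterator[List[Path]]:
    """Yield consecutive slices of `files`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(files), batch_size):
        yield files[start:start + batch_size]


if __name__ == "__main__":
    # Hypothetical file names, used only to show the batch sizes.
    sample = [Path(f"report_{i:04d}.docx") for i in range(250)]
    for number, batch in enumerate(make_batches(sample), start=1):
        print(f"batch {number}: {len(batch)} files")
```

Each batch could then be copied into its own folder and processed separately, so that a failure only means reprocessing that one batch.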

Re: Combine convert and OCR tasks

Posted: Mon Feb 19, 2018 10:11 pm
by DenisO
Hi trwul62,
1. It seems that converting and OCR are two different jobs - can they be combined into one task: first convert, then OCR immediately afterwards?
In the attachment you can find a tool that converts images to PDF and then applies the OCR operation. Just choose Options -> Import Tools and select the file from the archive.
If you do not need the extended options for converting images to PDF, you can use the standard "OCR pages" tool. Change its file types so that the tool accepts raster images as input.
2. what is the limit of a single conversion task? 100 files, 1000 files?
Note that you can uncheck Batch processing mode so that each input is processed by every tool action one by one. That way you don't need to wait until all inputs are processed to get results.
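
The difference between the two modes can be pictured as a difference in processing order. This is only a conceptual sketch and not PDF-Tools code: the function names and the sample files are made up, and the ordering shown for batch mode (each action runs over all inputs before the next action starts) is an assumption inferred from the description above.

```python
from typing import List, Sequence, Tuple

Step = Tuple[str, str]  # (action name, input file)


def batch_order(inputs: Sequence[str], actions: Sequence[str]) -> List[Step]:
    """Assumed batch mode: each action runs over all inputs before the next action."""
    return [(action, item) for action in actions for item in inputs]


def one_by_one_order(inputs: Sequence[str], actions: Sequence[str]) -> List[Step]:
    """Batch mode unchecked: each input passes through every action in turn."""
    return [(action, item) for item in inputs for action in actions]


if __name__ == "__main__":
    files = ["a.png", "b.png"]   # hypothetical inputs
    steps = ["convert", "ocr"]   # hypothetical tool actions
    print(batch_order(files, steps))
    print(one_by_one_order(files, steps))
```

In the one-by-one ordering the first file is completely finished after the first two steps, which is why its result is available before the remaining inputs have been touched.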

Re: Combine convert and OCR tasks

Posted: Mon Feb 19, 2018 11:17 pm
by TrackerSupp-Daniel
Thanks for the assist DenisO! :)

If any further questions come up, do not hesitate to ask!

Re: Combine convert and OCR tasks

Posted: Tue Feb 20, 2018 7:09 am
by trwul62
Many thanks - I truly appreciate the feedback!

Re: Combine convert and OCR tasks

Posted: Tue Feb 20, 2018 10:03 am
by Tracker Supp-Stefan
:D

Re: Combine convert and OCR tasks

Posted: Tue Feb 20, 2018 11:46 pm
by Vasyl-Tracker Dev Team
Hi, trwul62.
3. how to convert Excel and Word documents simultaneously?
You can do this by running two or more copies of the PDF-Tools application. Then, for example, you can start the docx-to-pdf conversion in one process, the xlsx-to-pdf conversion in a second, and OCR in a third.
All will work simultaneously.

HTH.
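
To feed each running copy of PDF-Tools a single file type, as suggested above, the input files can be sorted by extension beforehand. This is a minimal sketch under stated assumptions: `group_by_extension` and the sample file names are hypothetical, and the code only groups paths; it does not launch or control PDF-Tools.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable, List


def group_by_extension(files: Iterable[Path]) -> Dict[str, List[Path]]:
    """Group paths by lower-cased extension, e.g. '.docx' or '.xlsx'."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path in files:
        groups[path.suffix.lower()].append(path)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical file names for illustration only.
    sample = [Path("q1.docx"), Path("budget.XLSX"), Path("q2.docx")]
    for extension, paths in group_by_extension(sample).items():
        print(extension, [p.name for p in paths])
```

Each group could then be placed in its own folder and dropped onto the matching tool in a separate PDF-Tools window.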

Re: Combine convert and OCR tasks

Posted: Wed May 20, 2020 12:35 pm
by inspector71
Unfortunately I can't open Denis Oleksenko's file. It's a pdtex file which I presume is no longer supported. If you're out there Denis, can you re-post your guide in another format? Or maybe someone else can help?

Thanks, Graham

Re: Combine convert and OCR tasks

Posted: Wed May 20, 2020 11:37 pm
by TrackerSupp-Daniel
Hi, inspector71

These files are not intended to be opened directly. Download the file, then in PDF-Tools choose "import tool" and select that tool file. It should still be compatible.

Kind regards,

Re: Combine convert and OCR tasks

Posted: Thu May 21, 2020 10:47 am
by inspector71
Ah, thank you. Forgive my ignorance - I bought Editor yesterday.

Cheers

Re: Combine convert and OCR tasks

Posted: Thu May 21, 2020 10:55 am
by inspector71
Daniel, to ensure I don't mess up my new app, can I ask you to confirm that this page has the info I need to import Denis's tool file:
viewtopic.php?f=70&t=34274&p=142198&hil ... ol#p142198

If not, I ask you to indulge my ignorance again ...

Thanks for your patience

Re: Combine convert and OCR tasks

Posted: Thu May 21, 2020 4:01 pm
by TrackerSupp-Daniel
Hi, inspector71

That post pertains to running a tool from the command line after it has been imported. To import any tool into PDF-Tools, simply open the PDF-Tools application window and click "Options > import tool", then select the PDTex file mentioned earlier. It will import that specific tool without affecting your other tools or settings.
Once the tool has been imported you can run it by double clicking on it directly in the PDF-Tools window, by dragging the desired files overtop of the tool in the window, or by setting up a watched folder.
Note that you can also create your own custom tools at any time by following the steps here.

Kind regards,

Re: Combine convert and OCR tasks

Posted: Thu May 21, 2020 5:35 pm
by inspector71
Morning Daniel

Thanks for getting back to me quickly. For the life of me I can't find the tools, import option you're referring to. I'm using Editor 8.0.339

Kind regards
Graham

Re: Combine convert and OCR tasks

Posted: Thu May 21, 2020 5:45 pm
by TrackerSupp-Daniel
Hi, inspector71

That will be where the problem lies. PDF-Tools is a separate application from our Editor, with its own functions, files and the like; it is not a part of the Editor at all.
You can download PDF-Tools from its product page here to test it and decide if it is worth making a purchase.
If you already hold an Editor license and decide that you also need PDF-Tools, you can upgrade your license to a Tools or PRO license (both of which include PDF-Tools and the Editor) from your account page, on the upgrade options tab.

Kind regards,

Re: Combine convert and OCR tasks

Posted: Thu May 21, 2020 5:54 pm
by inspector71
Aw nuts. Penny's dropped. Tools is a separate, batch op app.

Never mind. Thanks for your help for this poor dunce

Re: Combine convert and OCR tasks

Posted: Thu May 21, 2020 6:03 pm
by TrackerSupp-Daniel
:)