Combine convert and OCR tasks

This Forum is for the use of End Users requiring help and assistance for Tracker Software's PDF-Tools.

trwul62
User
Posts: 2
Joined: Sun Feb 18, 2018 10:06 am

Combine convert and OCR tasks

Post by trwul62 »

Up front: I am a first time user of PDF-Tools (v7.0). It is still in demo mode.
I'd like to use the tool primarily to batch convert files, something that is really poor within Acrobat.
A few questions:
1. It seems that converting and OCR are two different jobs - can they be combined into one task: first convert, then OCR immediately thereafter?

2. what is the limit of a single conversion task? 100 files, 1000 files?

3. how to convert Excel and Word documents simultaneously?

4. will PDF-Tools launch Excel and Word each time, for each conversion? (making it impossible to perform any other tasks)

Thanks!
Tracker Supp-Stefan
Site Admin
Posts: 17824
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: Combine convert and OCR tasks

Post by Tracker Supp-Stefan »

Hello trwul62,

Welcome to our forums, and thanks for your enquiry.
Yes - the steps are separate. You need to convert the file(s) to the PDF file format first, and then our OCR engine can do its job. However, you can create custom tools that take the source file(s) as input and then perform several operations on them.
Take a look at, for example, the preconfigured "Scan to PDF" tool. Its first step gets images from the scanner, the second creates the PDF files, and the third runs OCR on the pages of the newly created PDF files.

As for converting different file formats - that's also possible - just make sure that in the "Choose Input Files" menu you've selected to allow all supported file formats.

We do need to call the Word or Excel APIs to help us with the conversion process, but having a separate window (e.g. Word) open should not be an issue.

If you process more than 100 files at a time, we might need to allocate a bit more memory and processing power to manage the number of files, but there should not really be any hard limit on the number. Still, I'd recommend running reasonably sized batches, so that if something goes wrong you can recover from it more quickly and need to reprocess fewer files.
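
As a rough illustration of the batching advice (this is not part of PDF-Tools; the function name `make_batches` and the batch size of 100 are assumptions chosen for the example), a small Python helper could split a long file list into smaller groups before each group is handed to a tool:

```python
from pathlib import Path
from typing import Iterator, List


def make_batches(files: List[Path], batch_size: int = 100) -> Iterator[List[Path]]:
    """Yield consecutive slices of `files`, each holding at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(files), batch_size):
        yield files[start:start + batch_size]
```

If one batch fails, only that batch needs to be reprocessed rather than the whole set.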

Regards,
Stefan
DenisO
Site Admin
Posts: 104
Joined: Fri Jun 09, 2017 5:40 pm

Re: Combine convert and OCR tasks

Post by DenisO »

Hi trwul62,
1. it seems that converting and OCR are two different jobs - can they be joined/combined to 1 task - first convert, then OCR-ed immediately thereafter?
In the attachment you can find a tool that converts images to PDF and then applies the OCR operation. Just choose Options -> Import Tools and select the file from the archive.
If you do not need the extended options for converting images to PDF, you can use the standard "OCR pages" tool instead. Change its accepted file types so that the tool can take raster images as input.
2. what is the limit of a single conversion task? 100 files, 1000 files?
Note that you can uncheck Batch processing mode so that each input is processed by every tool action one at a time. That way you don't need to wait until all inputs are processed to get results.
Attachments
batch.JPG (16.82 KiB)
PDFFromImageAndOCR.zip (1.47 KiB)
Denis Oleksenko
Software Developer
Tracker Software Products (Canada) LTD
TrackerSupp-Daniel
Site Admin
Posts: 8440
Joined: Wed Jan 03, 2018 6:52 pm

Re: Combine convert and OCR tasks

Post by TrackerSupp-Daniel »

Thanks for the assist DenisO! :)

If any further questions come up, do not hesitate to ask!
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

+++++++++++++++++++++++++++++++++++
Our Web site domain and email address has changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com
trwul62
User
Posts: 2
Joined: Sun Feb 18, 2018 10:06 am

Re: Combine convert and OCR tasks

Post by trwul62 »

Many thanks - I truly appreciate the feedback!
Tracker Supp-Stefan
Site Admin
Posts: 17824
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: Combine convert and OCR tasks

Post by Tracker Supp-Stefan »

:D
Vasyl-Tracker Dev Team
Site Admin
Posts: 2352
Joined: Thu Jun 30, 2005 4:11 pm
Location: Canada

Re: Combine convert and OCR tasks

Post by Vasyl-Tracker Dev Team »

Hi, trwul62.
3. how to convert Excel and Word documents simultaneously?
You can do this by running two or more copies of the PDF-Tools application. For example, you can start the docx-to-pdf conversion in one process, the xlsx-to-pdf conversion in a second, and OCR in a third.
All will work simultaneously.
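
To prepare the inputs for separate instances, the files could first be sorted by type. The following is only a sketch under stated assumptions: `group_by_extension` is a hypothetical helper, it does not launch or control PDF-Tools, and it simply groups paths by their file extension:

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable, List


def group_by_extension(paths: Iterable[str]) -> Dict[str, List[Path]]:
    """Group file paths by lower-cased extension, e.g. '.docx' or '.xlsx'."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for raw in paths:
        path = Path(raw)
        groups[path.suffix.lower()].append(path)
    return dict(groups)
```

Each resulting group (for example the `.docx` files and the `.xlsx` files) could then be given to its own copy of the application.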

HTH.
Vasyl Yaremyn
Tracker Software Products
Project Developer

Please archive any files posted to a ZIP, 7z or RAR file or they will be removed and not posted.
inspector71
User
Posts: 5
Joined: Wed May 20, 2020 11:47 am

Re: Combine convert and OCR tasks

Post by inspector71 »

Unfortunately I can't open Denis Oleksenko's file. It's a pdtex file which I presume is no longer supported. If you're out there Denis, can you re-post your guide in another format? Or maybe someone else can help?

Thanks, Graham
TrackerSupp-Daniel
Site Admin
Posts: 8440
Joined: Wed Jan 03, 2018 6:52 pm

Re: Combine convert and OCR tasks

Post by TrackerSupp-Daniel »

Hi, inspector71

These files are not intended to be opened directly. Download the file, then in PDF-Tools choose "import tool" and select that tool file. It should still be compatible.

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

inspector71
User
Posts: 5
Joined: Wed May 20, 2020 11:47 am

Re: Combine convert and OCR tasks

Post by inspector71 »

Ah, thank you. Forgive my ignorance - I bought Editor yesterday.

Cheers
inspector71
User
Posts: 5
Joined: Wed May 20, 2020 11:47 am

Re: Combine convert and OCR tasks

Post by inspector71 »

Daniel, to ensure I don't mess up my new app, can I ask you to confirm that this page has the info I need to import Denis's tool file:
viewtopic.php?f=70&t=34274&p=142198&hil ... ol#p142198

If not, I ask you to indulge my ignorance again ...

Thanks for your patience
TrackerSupp-Daniel
Site Admin
Posts: 8440
Joined: Wed Jan 03, 2018 6:52 pm

Re: Combine convert and OCR tasks

Post by TrackerSupp-Daniel »

Hi, inspector71

That post pertains to running a tool from the command line after it has been imported. To import any tool into PDF-Tools, simply open the PDF-Tools application window and click "Options > import tool", then select the PDTex file mentioned earlier. It will import that specific tool without affecting your other tools or settings.
PDFXTools_UqTPKA6knT.png
Once the tool has been imported, you can run it by double-clicking on it directly in the PDF-Tools window, by dragging the desired files onto the tool in the window, or by setting up a watched folder.
Note that you can also create your own custom tools at any time by following the steps here.

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

inspector71
User
Posts: 5
Joined: Wed May 20, 2020 11:47 am

Re: Combine convert and OCR tasks

Post by inspector71 »

Morning Daniel

Thanks for getting back to me quickly. For the life of me I can't find the tools, import option you're referring to. I'm using Editor 8.0.339

Kind regards
Graham
TrackerSupp-Daniel
Site Admin
Posts: 8440
Joined: Wed Jan 03, 2018 6:52 pm

Re: Combine convert and OCR tasks

Post by TrackerSupp-Daniel »

Hi, inspector71

That will be where the problem lies. PDF-Tools is a separate application from our Editor, with its own functions, files and the like; it is not a part of the Editor at all.
You can download PDF-Tools from its product page here to test it and decide if it is worth making a purchase.
If you already hold an Editor license and decide that you also need PDF-Tools, you can upgrade your license to a Tools or PRO license (both of which include PDF-Tools and the Editor) from your account page, on the upgrade options tab.

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

inspector71
User
Posts: 5
Joined: Wed May 20, 2020 11:47 am

Re: Combine convert and OCR tasks

Post by inspector71 »

Aw nuts. Penny's dropped. Tools is a separate, batch op app.

Never mind. Thanks for your help for this poor dunce
TrackerSupp-Daniel
Site Admin
Posts: 8440
Joined: Wed Jan 03, 2018 6:52 pm

Combine convert and OCR tasks

Post by TrackerSupp-Daniel »

:)
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD
