Combine convert and OCR tasks
Up front: I am a first time user of PDF-Tools (v7.0). It is still in demo mode.
I'd like to use the tool primarily to batch convert files, something that is really poor within Acrobat.
A few questions:
1. It seems that converting and OCR are two different jobs - can they be combined into one task, first converting and then running OCR immediately afterwards?
2. What is the limit of a single conversion task? 100 files, 1000 files?
3. How can I convert Excel and Word documents simultaneously?
4. Will PDF-Tools launch Excel and Word each time, for each conversion (making it impossible to perform any other tasks)?
Thanks!
- Tracker Supp-Stefan
- Site Admin
- Posts: 17908
- Joined: Mon Jan 12, 2009 8:07 am
- Location: London
Re: Combine convert and OCR tasks
Hello trwul62,
Welcome to our forums, and thanks for your enquiry.
Yes - the steps are separate: you need to convert the file(s) to the PDF file format, and then our OCR engine can do its job. However, you can create custom tools that will, e.g., take the source file(s) as input and then perform several operations on them.
Take a look at e.g. the "Scan to PDF" tool that is already prepared. It has as a first step - get images from scanner, then the second one is to create the PDF files, and then the third one is to OCR the pages of the newly created PDF files.
As for converting different file formats - that's also possible - just make sure that in the "Choose Input Files" menu you've selected to allow all supported file formats.
We do need to call on the Word or Excel APIs to help us with the conversion process, but having a separate window (e.g. Word) open at the same time should not be an issue.
If you process more than 100 files at a time, we might need to allocate a bit more memory and some processing power for managing the number of files, but there should not really be any hard limit on the number. Still, I'd recommend running reasonable batches, so that if something happens you can recover from it more quickly and need to reprocess fewer files.
Regards,
Stefan
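Stefan's advice to run "reasonable batches" could be prepared outside PDF-Tools before the files are handed to a tool. The following is only a minimal Python sketch of that idea: the helper name `batches`, the batch size of 100 and the example file names are assumptions for illustration, and nothing here calls PDF-Tools itself.

```python
from itertools import islice
from pathlib import Path
from typing import Iterable, Iterator, List


def batches(paths: Iterable[Path], size: int) -> Iterator[List[Path]]:
    """Yield successive lists of at most `size` paths (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(paths)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# Example: 250 made-up file names split into groups of at most 100.
files = [Path(f"report_{i:04d}.docx") for i in range(250)]
groups = list(batches(files, 100))
```

Each group can then be processed as its own run, so a failure part-way through only affects the files in that group.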
Re: Combine convert and OCR tasks
Hi trwul62,
"1. It seems that converting and OCR are two different jobs - can they be combined into one task, first converting and then running OCR immediately afterwards?"
In the attachment you can find a tool that converts images to PDF and then applies the OCR operation. Just choose Options -> Import Tools and select the file from the archive.
If you do not need the extended options for converting images to PDF, you can use the standard tool "OCR pages". Change the file types so that the tool can accept raster images as input.
"2. What is the limit of a single conversion task? 100 files, 1000 files?"
Note that you can uncheck Batch processing mode so that every input is processed by every tool action one by one. To get a result you then don't need to wait until all inputs are processed.
Attachments:
- batch.JPG (16.82 KiB)
- PDFFromImageAndOCR.zip (1.47 KiB)
Denis Oleksenko
Software Developer
Tracker Software Products (Canada) LTD
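The difference Denis describes between Batch processing mode and one-by-one processing can be pictured with a small conceptual model. This is a sketch in Python under stated assumptions, not PDF-Tools code: `to_pdf` and `ocr` are hypothetical stand-ins that only rename strings, and the two runner functions simply model the ordering of work as described in the post.

```python
from typing import Callable, Iterable, Iterator, List, Sequence

Action = Callable[[str], str]


def to_pdf(name: str) -> str:
    """Hypothetical stand-in for an image-to-PDF action: swaps the extension."""
    return name.rsplit(".", 1)[0] + ".pdf"


def ocr(name: str) -> str:
    """Hypothetical stand-in for an OCR action: tags the name."""
    return name + " [OCR]"


def run_batch_mode(inputs: Iterable[str], actions: Sequence[Action]) -> List[str]:
    """Each action runs over ALL inputs before the next action starts."""
    items = list(inputs)
    for action in actions:
        items = [action(item) for item in items]
    return items


def run_one_by_one(inputs: Iterable[str], actions: Sequence[Action]) -> Iterator[str]:
    """Each input passes through every action before the next input starts."""
    for item in inputs:
        for action in actions:
            item = action(item)
        yield item  # this result is available before later inputs are touched


images = ["page1.jpg", "page2.png"]
batch_result = run_batch_mode(images, [to_pdf, ocr])
first_ready = next(run_one_by_one(images, [to_pdf, ocr]))
```

In this model both modes end with the same outputs; the one-by-one runner just makes the first finished file available without waiting for the rest.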
- TrackerSupp-Daniel
- Site Admin
- Posts: 8588
- Joined: Wed Jan 03, 2018 6:52 pm
Re: Combine convert and OCR tasks
Thanks for the assist DenisO!
If any further questions come up, do not hesitate to ask!
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD
+++++++++++++++++++++++++++++++++++
Our Web site domain and email address has changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com
Re: Combine convert and OCR tasks
Many thanks - I truly appreciate the feedback!
- Vasyl-Tracker Dev Team
- Site Admin
- Posts: 2353
- Joined: Thu Jun 30, 2005 4:11 pm
- Location: Canada
Re: Combine convert and OCR tasks
Hi, trwul62.
"3. How can I convert Excel and Word documents simultaneously?"
You may do it by running two or more copies of the PDF-Tools application. In one process, for example, you may start the docx-to-pdf conversion, in a second the xlsx-to-pdf conversion, and in a third OCR.
All will work simultaneously.
HTH.
Vasyl Yaremyn
Tracker Software Products
Project Developer
Please archive any files posted to a ZIP, 7z or RAR file or they will be removed and not posted.
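To feed separate PDF-Tools instances as Vasyl suggests, the input files first need to be split by type. A minimal Python sketch of that preparation step follows; the helper name `group_by_extension` and the sample file names are assumptions for illustration, and launching PDF-Tools itself is deliberately left out.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable, List


def group_by_extension(paths: Iterable[Path]) -> Dict[str, List[Path]]:
    """Group paths by lower-cased file extension (hypothetical helper)."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path in paths:
        groups[path.suffix.lower()].append(path)
    return dict(groups)


# Example with made-up file names.
sample = [Path("a.docx"), Path("b.XLSX"), Path("c.docx")]
by_type = group_by_extension(sample)
```

The `.docx` group could then go to one running copy of PDF-Tools and the `.xlsx` group to another.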
-
- User
- Posts: 5
- Joined: Wed May 20, 2020 11:47 am
Re: Combine convert and OCR tasks
Unfortunately I can't open Denis Oleksenko's file. It's a pdtex file which I presume is no longer supported. If you're out there Denis, can you re-post your guide in another format? Or maybe someone else can help?
Thanks, Graham
- TrackerSupp-Daniel
- Site Admin
- Posts: 8588
- Joined: Wed Jan 03, 2018 6:52 pm
Re: Combine convert and OCR tasks
Hi, inspector71
These files are not intended to be opened directly. Download the file, then in PDF-Tools choose "import tool" and select that tool file. It should still be compatible.
Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD
-
- User
- Posts: 5
- Joined: Wed May 20, 2020 11:47 am
Re: Combine convert and OCR tasks
Ah, thank you. Forgive my ignorance - I bought Editor yesterday.
Cheers
-
- User
- Posts: 5
- Joined: Wed May 20, 2020 11:47 am
Re: Combine convert and OCR tasks
Daniel, to ensure I don't mess up my new app, can I ask you to confirm that this page has the info I need to import Denis's tool file:
viewtopic.php?f=70&t=34274&p=142198&hil ... ol#p142198
If not, I ask you to indulge my ignorance again ...
Thanks for your patience
- TrackerSupp-Daniel
- Site Admin
- Posts: 8588
- Joined: Wed Jan 03, 2018 6:52 pm
Re: Combine convert and OCR tasks
Hi, inspector71
That post pertains to running a tool from the command line after it has been imported. To import any tool into PDF-Tools, simply open the PDF-Tools application window and click "Options > import tool", then select the PDTex file mentioned earlier; it will import that specific tool without affecting your other tools or settings. Once the tool has been imported, you can run it by double-clicking it directly in the PDF-Tools window, by dragging the desired files on top of the tool in the window, or by setting up a watched folder.
Note that you can also create your own custom tools at any time by following the steps here.
Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD
-
- User
- Posts: 5
- Joined: Wed May 20, 2020 11:47 am
Re: Combine convert and OCR tasks
Morning Daniel
Thanks for getting back to me quickly. For the life of me I can't find the tools, import option you're referring to. I'm using Editor 8.0.339
Kind regards
Graham
- TrackerSupp-Daniel
- Site Admin
- Posts: 8588
- Joined: Wed Jan 03, 2018 6:52 pm
Re: Combine convert and OCR tasks
Hi, inspector71
That will be where the problem lies. PDF-Tools is a separate application from our Editor, with its own functions, files and the like; it is not a part of the Editor at all.
You can download PDF-Tools from its product page here to test it and decide if it is worth making a purchase.
If you already hold an Editor license and decide that you also need PDF-Tools, you can upgrade your license to a Tools or PRO license (both of which include PDF-Tools and the Editor) from your account page, on the upgrade options tab.
Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD
-
- User
- Posts: 5
- Joined: Wed May 20, 2020 11:47 am
Re: Combine convert and OCR tasks
Aw nuts. Penny's dropped. Tools is a separate, batch op app.
Never mind. Thanks for your help for this poor dunce