Combine convert and OCR tasks

This Forum is for the use of End Users requiring help and assistance for Tracker Software's PDF-Tools.

Moderators: TrackerSupp-Daniel, Tracker Support, Vasyl-Tracker Dev Team, Sean - Tracker, Chris - Tracker Supp, Tracker Supp-Stefan

trwul62
User
Posts: 2
Joined: Sun Feb 18, 2018 10:06 am

Combine convert and OCR tasks

Post by trwul62 » Sun Feb 18, 2018 12:54 pm

Up front: I am a first time user of PDF-Tools (v7.0). It is still in demo mode.
I'd like to use the tool primarily to batch convert files, something that is really poor within Acrobat.
A few questions:
1. it seems that converting and OCR are two different jobs - can they be combined into one task - first convert, then OCR immediately thereafter?

2. what is the limit of a single conversion task? 100 files, 1000 files?

3. how to convert Excel and Word documents simultaneously?

4. will PDF-Tools launch Excel and Word each time, for each conversion? (making it impossible to perform any other tasks)

Thanks!

Tracker Supp-Stefan
Site Admin
Posts: 13651
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: Combine convert and OCR tasks

Post by Tracker Supp-Stefan » Mon Feb 19, 2018 3:04 pm

Hello trwul62,

Welcome to our forums, and thanks for your enquiry.
Yes - the steps are separate: the file(s) must first be converted to the PDF file format, and only then can our OCR engine do its job. However, you can create custom tools that take the source file(s) as input and then perform several operations on them.
Take a look at, for example, the preconfigured "Scan to PDF" tool. Its first step gets images from the scanner, the second creates the PDF files, and the third runs OCR on the pages of the newly created PDF files.

As for converting different file formats - that's also possible - just make sure that in the "Choose Input Files" menu you've selected to allow all supported file formats.

We do need to call the Word or Excel APIs to help us with the conversion process, but having a separate window (e.g. Word) open at the same time should not be an issue.

If you process more than 100 files at a time, we might need to allocate a bit more memory and some processing power for managing the number of files, but there should not really be any hard limit on the number. Still, I'd recommend running reasonably sized batches, so that if something goes wrong you can recover from it more quickly and need to reprocess fewer files.
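As a rough illustration of the "reasonable batches" advice, here is a minimal Python sketch that splits a list of input files into fixed-size groups before feeding each group to a conversion run. The function name `make_batches`, the batch size of 100 and the sample file names are assumptions for illustration only; nothing here is part of PDF-Tools itself.

```python
from pathlib import Path
from typing import Iterator, List, Sequence


def make_batches(files: Sequence[Path], batch_size: int = 100) -> Iterator[List[Path]]:
    """Yield consecutive groups of at most batch_size files (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(files), batch_size):
        yield list(files[start:start + batch_size])


# Made-up input list: 250 Word documents.
files = [Path(f"doc_{i:04d}.docx") for i in range(250)]

batches = list(make_batches(files, batch_size=100))
batch_sizes = [len(batch) for batch in batches]  # two full batches and one partial batch
```

If one batch fails, only the files in that batch need to be queued again, which is the recovery benefit described above.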

Regards,
Stefan

DenisO
Site Admin
Posts: 45
Joined: Fri Jun 09, 2017 5:40 pm

Re: Combine convert and OCR tasks

Post by DenisO » Mon Feb 19, 2018 10:11 pm

Hi trwul62,
1. it seems that converting and OCR are two different jobs - can they be combined into one task - first convert, then OCR immediately thereafter?
Attached you can find a tool that converts images to PDF and then applies the OCR operation. Just choose Options -> Import Tools and select the file from the archive.
If you do not need the extended options for converting images to PDF, you can use the standard "OCR pages" tool instead. Change its file types so that the tool accepts raster images as input.
2. what is the limit of a single conversion task? 100 files, 1000 files?
Note that you can uncheck "Batch processing mode" so that each input is processed by every tool action in turn, one input at a time. That way you get results without waiting until all inputs have been processed.
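To make the difference in processing order concrete, here is a small Python sketch. It reflects one reading of the description above, not the actual PDF-Tools implementation: it assumes that in batch mode each action runs over all inputs before the next action starts, while with batch mode unchecked each input passes through every action before the next input begins. The functions `to_pdf` and `ocr` are placeholders that only manipulate file names.

```python
from typing import Callable, List, Sequence, Tuple

Action = Tuple[str, Callable[[str], str]]


def to_pdf(name: str) -> str:
    # Placeholder for conversion: only swaps the extension.
    return name.rsplit(".", 1)[0] + ".pdf"


def ocr(name: str) -> str:
    # Placeholder for OCR: the file name stays the same.
    return name


ACTIONS: List[Action] = [("convert", to_pdf), ("ocr", ocr)]


def batch_order(inputs: Sequence[str], actions: Sequence[Action]) -> List[Tuple[str, str]]:
    """Assumed batch mode: finish one action for all inputs, then the next action."""
    log = []
    items = list(inputs)
    for label, fn in actions:
        next_items = []
        for item in items:
            log.append((label, item))
            next_items.append(fn(item))
        items = next_items
    return log


def one_by_one_order(inputs: Sequence[str], actions: Sequence[Action]) -> List[Tuple[str, str]]:
    """Batch mode unchecked: run every action on one input before the next input."""
    log = []
    for item in inputs:
        for label, fn in actions:
            log.append((label, item))
            item = fn(item)
    return log


inputs = ["a.png", "b.png"]
batch_log = batch_order(inputs, ACTIONS)
single_log = one_by_one_order(inputs, ACTIONS)
```

In the one-by-one ordering the first file is fully converted and OCR-ed before the second file is touched, which is why a finished result is available early.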
Attachments
batch.JPG (16.82 KiB)
PDFFromImageAndOCR.zip (1.47 KiB)
Denis Oleksenko
Software Developer
Tracker Software Products (Canada) LTD

TrackerSupp-Daniel
Site Admin
Posts: 2694
Joined: Wed Jan 03, 2018 6:52 pm

Re: Combine convert and OCR tasks

Post by TrackerSupp-Daniel » Mon Feb 19, 2018 11:17 pm

Thanks for the assist DenisO! :)

If any further questions come up, do not hesitate to ask!
Daniel McIntyre
Support Technician
Tracker Software Products (Canada) LTD

Sales: +1 (250) 324-1621
Fax: +1 (250) 324-1623

trwul62
User
Posts: 2
Joined: Sun Feb 18, 2018 10:06 am

Re: Combine convert and OCR tasks

Post by trwul62 » Tue Feb 20, 2018 7:09 am

Many thanks - I truly appreciate the feedback!

Tracker Supp-Stefan
Site Admin
Posts: 13651
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: Combine convert and OCR tasks

Post by Tracker Supp-Stefan » Tue Feb 20, 2018 10:03 am

:D

Vasyl-Tracker Dev Team
Site Admin
Posts: 1945
Joined: Thu Jun 30, 2005 4:11 pm
Location: Canada

Re: Combine convert and OCR tasks

Post by Vasyl-Tracker Dev Team » Tue Feb 20, 2018 11:46 pm

Hi, trwul62.
3. how to convert Excel and Word documents simultaneously?
You can do this by running two or more copies of the PDF-Tools application. For example, you can start the docx-to-pdf conversion in one process, the xlsx-to-pdf conversion in a second, and OCR in a third.
All will work simultaneously.

HTH.
Vasyl Yaremyn
Tracker Software Products
Project Developer
