Is my intent for a Scanned OCR possible?

Forum for the PDF-XChange Editor - Free and Licensed Versions


mark@greenbergcm.com
User
Posts: 16
Joined: Sun Dec 18, 2011 1:34 pm

Is my intent for a Scanned OCR possible?

Post by mark@greenbergcm.com »

I have a 30+ page Costco 2020 hardcopy printout of purchase expenses. The document is obviously some table format. Costco stores still use small IBM mainframes--not Windows.

The output is not filtered. I will have to manually tick off each item line that is pertinent then manually total.

Although I doubt it, I have to ask: is there a way to scan the document using the HP machine so that PDFX could 'somehow' convert the scan to a reasonably acceptable Excel spreadsheet?

Mark
Tracker screenshot version.jpg
Tracker screenshot version.jpg
Dimitar - Tracker Supp
Site Admin
Posts: 1778
Joined: Mon Jan 15, 2018 9:01 am

Re: Is my intent for a Scanned OCR possible?

Post by Dimitar - Tracker Supp »

Hello Mark,

For some reason, I cannot see your screenshots but I think that our products can handle such tasks.

The PDF Editor cannot do this directly, so you will need to first OCR the files and then convert them to Excel.

https://www.pdf-xchange.com/produc ... nge-editor

However, we also have a batch processing tool called PDF Tools that can automate this process:

https://www.pdf-xchange.com/product/pdf-tools

You can test their abilities by installing the free versions of both products.

The only thing that you will not be able to test is our Enhanced OCR that is based on the ABBYY OCR engine.

Regards.
mark@greenbergcm.com
User
Posts: 16
Joined: Sun Dec 18, 2011 1:34 pm

Re: Is my intent for a Scanned OCR possible?

Post by mark@greenbergcm.com »

The screenshot problem is likely my goof. There was only supposed to be 1 image, not 2: the version, which is the paid Editor Plus version 8 build 342.

Following your first link -- 2 points:
  • I need to update my version of Plus from 8 to 9, and Version 9 'Plus' DOES include the Enhanced OCR that you mention?
  • OCR clarification: The Plus product page says: "**Please note that source files must be text-based in order to be converted into editable text."

Does that (in English<g>) mean when I scan the document I instruct the HP scanner to output NOT as a PDF file but as OCR? If I select HP output as PDF then I'll get an image not text and that's precisely what your Enhanced OCR does not want. No?

It then appears that, although 2 different documents may both have a PDF extension, they can be 100% different functional types of PDF--1 is an image (snapshot/photo) and the other is 'sort of an image' that retains the ability to be read/converted using OCR. And the two different flavors of PDF files are indistinguishable from each other on the surface?
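
For readers wondering how the two flavors of PDF can be told apart: a scan saved as PDF usually holds only page images, while a text-based (or already OCR'd) PDF also references fonts for its text. The Python sketch below is a rough heuristic built on that assumption, not how PDF-XChange itself decides. The function name `looks_text_based` is made up for illustration, and unusual or encrypted PDFs may be misclassified.

```python
import re
import zlib

# Matches the raw bytes between "stream" and "endstream" in a PDF file.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)


def looks_text_based(pdf_bytes: bytes) -> bool:
    """Heuristic: True if the PDF appears to reference at least one font.

    Assumption: image-only scans contain no /Font resources, while PDFs
    with real or OCR'd text do. This is a guess, not a guarantee.
    """
    # Case 1: the font reference is visible in the uncompressed file body.
    if b"/Font" in pdf_bytes:
        return True
    # Case 2: newer PDFs may pack their dictionaries into Flate-compressed
    # streams, so inflate each stream and look again.
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            inflated = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not a Flate stream, e.g. a JPEG page image
        if b"/Font" in inflated:
            return True
    return False
```

It could be called as `looks_text_based(open("scan.pdf", "rb").read())`, where `scan.pdf` is a placeholder file name. A `False` result suggests the file still needs an OCR pass before any export to Excel.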

Practically, for now, my steps are (please correct as necessary)
1. Update my paid Editor Plus 8 to 9
2. Scan Costco with HP machine auto document feeder
3. Set HP output file to be: PDF? or OCR?????
4. Open HP output file with Editor Plus 9
5. Run OCR (either the first OCR on document or a second OCR this time using your OCR to 'clean up' HP's OCR)?
6. While in Editor Plus after OCR, Export OCR product to Excel
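
Once step 6 has produced a spreadsheet, the manual tick-off-and-total from the first post could in principle be scripted. The Python sketch below assumes the sheet was saved as CSV with columns named `Description` and `Amount`. Those column names, the keywords, and the sample rows are all invented for illustration; a real Costco export will look different, and OCR errors would still need a manual check.

```python
import csv
import io
from decimal import Decimal


def total_pertinent(csv_text, keywords):
    """Sum the Amount column for rows whose Description contains any keyword.

    Returns (total, matched_descriptions). Column names are hypothetical.
    """
    wanted = [k.lower() for k in keywords]
    total = Decimal("0")
    matched = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        description = row["Description"].strip()
        if any(k in description.lower() for k in wanted):
            # Drop currency symbols and thousands separators before parsing.
            cleaned = row["Amount"].replace("$", "").replace(",", "").strip()
            total += Decimal(cleaned)
            matched.append(description)
    return total, matched


# Made-up sample rows standing in for the exported spreadsheet.
sample = """Date,Description,Amount
2020-01-14,PRINTER PAPER,34.99
2020-01-14,ROTISSERIE CHICKEN,4.99
2020-02-03,INK CARTRIDGE,$59.98
"""

total, matched = total_pertinent(sample, ["paper", "ink"])
print(total, matched)
```

`Decimal` is used instead of `float` so that currency amounts add up without rounding surprises.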

(Edited):
1. Am I correct that I have to purchase Ver 9 Editor Plus to obtain the 'included' Enhanced OCR Plugin that's not a separately purchased plugin but now in 9 an included ability?
2. Given my intent as described, would the Enhanced OCR be significantly beneficial to the Scan/OCR/Excel process RESULT?

Mark
Tracker Supp-Stefan
Site Admin
Posts: 17824
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: Is my intent for a Scanned OCR possible?

Post by Tracker Supp-Stefan »

Hello mark@greenbergcm.com,

That text with the two asterisks at the front was a clarification for some text further up on the same page (shown in the attached image.png).
We have now updated the text on that page to make things clearer and easier to understand!

Kind regards,
Stefan