Unhandled exception OCRing

PDF-X OCR SDK is a new product from us, intended to complement our existing PDF and imaging tools and to provide developers with an expanding set of professional tools for Optical Character Recognition tasks.


tborntrager
User
Posts: 4
Joined: Thu Sep 06, 2012 9:37 pm

Unhandled exception OCRing

Post by tborntrager »

We are evaluating the PDF XChange Pro SDK (eval copy downloaded and installed on 9/5/12) to use its OCR function to create searchable PDFs.

On the development station, my test app runs fine on PDFs that cause an error on the network scanner we intend to run the application on.
I'm working with a slightly modified version of a VB.net project that was posted on this forum by Walter.
https://forum.pdf-xchange.com/ ... vb.net+ocr

I've attached a zip file containing a screenshot, the text of the error, and one of the PDFs I'm testing with (4 pages of the exact same image).

The network scanner is a Kodak Scanstation 520.
It is running Windows XP Embedded SP2, with 1 GB RAM, 220 GB free space on the hard drive, and an Intel Atom 1.66 GHz CPU.
I'm using ocrtools.dll ver 1.0.10.0, last modified 8/22/12.

There is plenty of free memory on the Scanstation while the OCR process is running. So far the error usually occurs on page 3 or 4 of the process, but I've seen it happen on the second page as well. I know that the eval version is set to OCR only the first 2 pages, but I want to resolve the crashing before considering this SDK as a possible solution.

The error happens at different stages (Autorotating or Rasterizing).
I've tried a number of different PDFs created from different applications, so it isn't tied to a specific file. So far, the first page always runs through OK, even if it is an exact copy of the page that it crashes on.
I've also tried turning autorotate on and off, and changing the OCR_RegionMode from auto to word.
This Scanstation has run several other OCR engines without trouble.
Any insight?
Thanks.

UPDATE:
I tested on another workstation (Windows 7) and got the same error. I then installed PDF XChange Pro on it to be sure I wasn't missing a prerequisite (from what I've found, should only need ocrtools.dll and the ocrdats folder), and still got the error.
Attachments
PDFXChange error.zip
(248.98 KiB) Downloaded 226 times
Walter-Tracker Supp
User
Posts: 381
Joined: Mon Jun 13, 2011 5:10 pm

Re: Unhandled exception OCRing

Post by Walter-Tracker Supp »

I'm unfamiliar with the Kodak Scanstation; from your description it sounds as if you are running the OCR application on the Scanstation hardware itself (rather than on a workstation connected to the scanner). Is this correct?

I tried the PDF you attached on a standard workstation and could not reproduce the problem; it OCRs fine (with or without the demo limitation). Any other information you could give that could help reproduce the problem would be great.

Is the PDF file completely written to the storage device before reading it for OCR? I wonder if there is a timing issue here.

Also, if you could post your sample application (or email it to us at support@pdf-xchange.com) that might help reproduce the issue or narrow it down.

-Walter
tborntrager
User
Posts: 4
Joined: Thu Sep 06, 2012 9:37 pm

Re: Unhandled exception OCRing

Post by tborntrager »

The Kodak Scanstation is basically a workstation with a built-in scanner, and yes, we are attempting to run OCR directly on it.
But I also tested on another regular desktop and had the same error.
The PDFs I'm testing with were saved previously, so timing, in that respect at least, shouldn't be an issue.

I did further testing with the bundled OCRDemoCSharp project and found that it ran through OK, at least on the regular desktop that was giving the error with our VB.net app. So I converted it to a VB.net project, and at least in initial testing it isn't having the same issues when running on the Scanstation. I need to do further testing, but it may have been an issue with the initial project. What threw me is that it ran fine on one station with the same PDFs that it failed with on others.

Also, is there any demo dll that we can use to test on more than just 2 pages at once?

Thanks.
Walter-Tracker Supp
User
Posts: 381
Joined: Mon Jun 13, 2011 5:10 pm

Re: Unhandled exception OCRing

Post by Walter-Tracker Supp »

Okay, hopefully you've resolved it. Out of curiosity, what was the cause of the problem? It may be helpful to refer to if it arises again for someone else.

If you have more issues please don't hesitate to contact us.
tborntrager wrote: Also, is there any demo dll that we can use to test on more than just 2 pages at once?
Unfortunately no; this limitation is what differentiates the demo version from the paid version. If you have a license key for the PRO SDK you can download the unlimited DLL (which is in a password-protected archive). Email sales@pdf-xchange.com if you need the password.

-Walter
tborntrager
User
Posts: 4
Joined: Thu Sep 06, 2012 9:37 pm

Re: Unhandled exception OCRing

Post by tborntrager »

I'm finding now that the issue is related to using the callback to get the status of the OCR process. This is happening with both the original project and the new one that I thought had resolved it. For some reason it doesn't happen on the development station (VS 2008), but it does on all 5 other stations (a variety of OSes and specs) I've tested on.
Do you have any other VB.NET sample projects that I can take a look at?
Nico - Tracker Supp
User
Posts: 205
Joined: Fri May 18, 2012 8:41 pm

Re: Unhandled exception OCRing

Post by Nico - Tracker Supp »

Hi tborntrager,

Thank you for your post.
Please take a look at the code examples folder at: C:\Program Files\Tracker Software\PDF-XChange PRO <version number> SDK\Examples.
Thanks.

Sincerely,
tborntrager
User
Posts: 4
Joined: Thu Sep 06, 2012 9:37 pm

Re: Unhandled exception OCRing

Post by tborntrager »

I was looking for one that used a callback specific to OCR in VB.NET. There doesn't appear to be one.
Walter-Tracker Supp
User
Posts: 381
Joined: Mon Jun 13, 2011 5:10 pm

Re: Unhandled exception OCRing

Post by Walter-Tracker Supp »

tborntrager wrote: I was looking for one that used a callback specific to OCR in VB.NET. There doesn't appear to be one.
There is a callback function in the VB.net example: see Callback.vb

The callback must return 1; returning 0 sets an abort condition and stops OCR. You'd use this to signify something like a "cancel" request from a user.
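
As a rough illustration of that return-value contract only, here is a small Python simulation. It does not call ocrtools.dll and is not taken from Callback.vb. The function simulate_ocr, the class CancelAfter, the callback's (page, stage) parameters, and the "Recognizing" stage name and stage order are all hypothetical, chosen just to show how returning 1 keeps processing going while returning 0 stops it.

```python
from typing import Callable, List, Tuple

CONTINUE = 1  # per the post above: the callback must return 1 to keep going
ABORT = 0     # returning 0 sets an abort condition and stops OCR

# Hypothetical callback shape: (page_number, stage_name) -> 1 or 0.
ProgressCallback = Callable[[int, str], int]


def simulate_ocr(page_count: int, on_progress: ProgressCallback) -> List[int]:
    """Stand-in for an OCR engine; NOT the ocrtools.dll API.

    Returns the page numbers that completed before any abort.
    """
    completed: List[int] = []
    for page in range(1, page_count + 1):
        # Stage names and their order are illustrative assumptions.
        for stage in ("Rasterizing", "Autorotating", "Recognizing"):
            if on_progress(page, stage) != CONTINUE:
                return completed  # abort requested: stop immediately
        completed.append(page)
    return completed


class CancelAfter:
    """Callback object that requests an abort once a given page is passed,
    standing in for a user pressing "cancel"."""

    def __init__(self, last_page: int) -> None:
        self.last_page = last_page
        self.calls: List[Tuple[int, str]] = []

    def __call__(self, page: int, stage: str) -> int:
        self.calls.append((page, stage))
        return CONTINUE if page <= self.last_page else ABORT


def always_continue(page: int, stage: str) -> int:
    """Callback that never cancels."""
    return CONTINUE


all_pages = simulate_ocr(4, always_continue)
cancel = CancelAfter(last_page=2)
partial = simulate_ocr(4, cancel)
print(all_pages, partial)
```

In this sketch a callback that always returns 1 lets all four simulated pages finish, while the cancelling callback stops the run at the first stage of page 3. A real callback handed to the DLL would follow the signature declared in ocrtools.h rather than the one assumed here.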

Some of the functionality for the OCR DLL is missing from OcrCommon.vb, which just provides access to core functions. See the C++ Header file ocrtools.h if you would like to try to expose more (you'll have to sort out the marshalling yourself).

-Walter
Attachments
VBNetexample.zip
(15.81 KiB) Downloaded 208 times