Poor OCR results

Discussion for the End User use of OCR in PDF-XChange Editor and Viewer

Moderators: TrackerSupp-Daniel, Tracker Support, Paul - Tracker Supp, Vasyl-Tracker Dev Team, Chris - Tracker Supp, Sean - Tracker, Ivan - Tracker Software, Tracker Supp-Stefan

SteepleChase
User
Posts: 5
Joined: Wed Jan 04, 2012 5:47 pm

Poor OCR results

Post by SteepleChase »

Some documents' OCR results are excellent. This one's are not. Why? What would help?

I suspect that the original document is of too poor quality.
Attachments
test document.pdf
The top is the text that had the OCR process applied, the bottom is the result.
(161.43 KiB) Downloaded 327 times
Willy Van Nuffel
User
Posts: 2393
Joined: Wed Jan 18, 2006 12:10 pm

Re: Poor OCR results

Post by Willy Van Nuffel »

Hi SteepleChase.
The OCR functionality is a nice piece of software that has been added to PDF-XChange Viewer Pro.
I ran several tests with it and found that the "Accuracy" parameter does not really do what it is supposed to do. When you set it to "Low" or "Medium", the result is fairly good. When you set it to "High", the result is rather bad.
A test with your "test document.pdf" confirms this.
So, to the people at Tracker Software: if you could make something out of it that gives the best combination of "Low" and "Medium" accuracy, it would be almost perfect. Thanks to all of you for the effort already put in!
Paul - Tracker Supp
Site Admin
Posts: 6897
Joined: Wed Mar 25, 2009 10:37 pm
Location: Chemainus, Canada

Re: Poor OCR results

Post by Paul - Tracker Supp »

Hi, we have the document and I'll pass it on to the OCR lead developer when he returns from his Christmas holidays next week. We'll provide feedback here.

hth
Best regards

Paul O'Rorke
Tracker Support North America
http://www.tracker-software.com
wilfriedh
User
Posts: 14
Joined: Fri Mar 12, 2010 1:42 pm

Re: Poor OCR results

Post by wilfriedh »

Just out of interest, I ran "test document.pdf" through PDF-XChange Viewer's OCR (at each of the three accuracy levels) and ABBYY FineReader 9. You can see that FineReader's results are not perfect either. The original is too blurred.

Wilfried
Attachments
test document ocr.zip
(1.03 KiB) Downloaded 273 times
Paul - Tracker Supp
Site Admin
Posts: 6897
Joined: Wed Mar 25, 2009 10:37 pm
Location: Chemainus, Canada

Re: Poor OCR results

Post by Paul - Tracker Supp »

Hi wilfriedh,

Thanks for the input. That's quite interesting to see. FineReader did do a better job, but it is also a $169.99 product that has had years of marketplace trial. This is a free OCR that is in its first release.

We will still be working on this. hth
Best regards

Paul O'Rorke
Tracker Support North America
http://www.tracker-software.com
Walter-Tracker Supp
User
Posts: 381
Joined: Mon Jun 13, 2011 5:10 pm

Re: Poor OCR results

Post by Walter-Tracker Supp »

Hi,

The recommended accuracy setting for most documents is "medium"; in some cases the trade-off between speed and accuracy makes it worthwhile to use "low", which is faster but slightly more error-prone. High accuracy should generally be used for high-resolution documents with small text; for general use with typical scanned documents (letters, forms, etc.) it may end up performing worse than medium. We have left it up to end users to determine which setting is best for their specific document.
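
The guidance above can be summarised as a small decision helper. This is only a sketch under stated assumptions: the function, its parameters, and the numeric thresholds (300 dpi, 9 pt) are hypothetical and are not taken from PDF-XChange. Only the three setting names come from the product.

```python
def suggest_accuracy(dpi: int, font_size_pt: float, prefer_speed: bool = False) -> str:
    """Suggest an OCR accuracy setting from the guidance in this post.

    All thresholds are illustrative assumptions, not product values.
    """
    HIGH_RES_DPI = 300    # assumed cut-off for a "high resolution" scan
    SMALL_TEXT_PT = 9.0   # assumed cut-off for "small text"

    # "High" is suggested only for high-resolution scans with small text.
    if dpi >= HIGH_RES_DPI and font_size_pt <= SMALL_TEXT_PT:
        return "high"
    # "Low" trades some accuracy for speed.
    if prefer_speed:
        return "low"
    # "Medium" is the general recommendation for typical scanned documents.
    return "medium"


if __name__ == "__main__":
    print(suggest_accuracy(dpi=200, font_size_pt=11))
    print(suggest_accuracy(dpi=600, font_size_pt=7))
```

In practice it is still worth trying more than one setting on a sample page, since the best choice depends on the individual document.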

Also, the input document you provided is fairly low resolution. If you zoom in, you can see several things that typically cause problems for OCR: the letters are poorly delineated and often touch their neighbours. A higher scanning resolution should resolve this.
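
To put rough numbers on the resolution point: the height of text in pixels grows linearly with scan resolution, because one typographic point is 1/72 inch. The sketch below is illustrative only. The function names and the 20-pixel threshold are assumptions (a common rule of thumb), not values used by PDF-XChange.

```python
def glyph_height_px(font_size_pt: float, dpi: int) -> float:
    """Approximate nominal (em) height in pixels of text scanned at a given resolution."""
    # One typographic point is 1/72 inch, so the height in inches is font_size_pt / 72.
    return font_size_pt / 72.0 * dpi


# Assumed rule-of-thumb minimum; not a PDF-XChange value.
MIN_HEIGHT_PX = 20.0


def likely_too_coarse(font_size_pt: float, dpi: int,
                      min_height_px: float = MIN_HEIGHT_PX) -> bool:
    """Return True if the text is probably too small in pixels for reliable OCR."""
    return glyph_height_px(font_size_pt, dpi) < min_height_px


if __name__ == "__main__":
    for dpi in (100, 150, 300, 600):
        height = glyph_height_px(10, dpi)
        print(f"10 pt text at {dpi} dpi: about {height:.1f} px tall")
```

Actual letter shapes are smaller than the nominal font size, so the real margin is tighter than these figures suggest.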


We are continually developing our OCR functionality and our products in general, so your feedback is greatly appreciated.

-Walter