Poor OCR results

Discussion for the End User use of OCR in PDF-XChange Editor and Viewer

Moderators: TrackerSupp-Daniel, Tracker Support, Sean - Tracker, Paul - Tracker Supp, Chris - Tracker Supp, Vasyl-Tracker Dev Team, Ivan - Tracker Software, Tracker Supp-Stefan

SteepleChase
User
Posts: 5
Joined: Wed Jan 04, 2012 5:47 pm

Poor OCR results

Post by SteepleChase » Fri Jan 06, 2012 1:09 am

Some documents' OCR results are excellent. This one's are not. Why? What would help?

I suspect that the original document is of too poor quality.
Attachments
test document.pdf
The top is the text that had the OCR process applied, the bottom is the result.
(161.43 KiB) Downloaded 225 times

Willy Van Nuffel
User
Posts: 1363
Joined: Wed Jan 18, 2006 12:10 pm

Re: Poor OCR results

Post by Willy Van Nuffel » Fri Jan 06, 2012 1:26 pm

Hi SteepleChase.
The OCR functionality is a nice piece of software that has been added to PDF-XChange Viewer Pro.
I ran several tests with it and found that the "Accuracy" parameter does not really do what it is supposed to do. When you set it to "Low" or "Medium" the result is fairly good. When you set it to "High" the result is rather poor.
A test with your "test document.pdf" confirms this.
So, to the people at Tracker Software: if you could come up with something that gives the best combination of the "Low" and "Medium" accuracy results, it would be almost perfect. Thanks already to all of you for the effort put in so far!

Paul - Tracker Supp
Site Admin
Posts: 4838
Joined: Wed Mar 25, 2009 10:37 pm
Location: Chemainus, Canada

Re: Poor OCR results

Post by Paul - Tracker Supp » Fri Jan 06, 2012 5:07 pm

Hi, we have the document and I'll pass it on to the OCR lead developer when he returns from his Christmas holidays next week. We'll provide feedback here.

hth
_________________
If posting files to this forum - you must archive the files to a ZIP, RAR or 7z file or they will not be uploaded - thank you.

Best regards

Paul O'Rorke
Tracker Support North America
http://www.tracker-software.com

wilfriedh
User
Posts: 14
Joined: Fri Mar 12, 2010 1:42 pm

Re: Poor OCR results

Post by wilfriedh » Mon Jan 09, 2012 2:54 pm

Just out of interest I ran "test document.pdf" through PDF-XChange Viewer's OCR (at each of the three accuracy levels) and ABBYY FineReader 9. You can see that FineReader's results are not perfect either. The original is too blurred.

Wilfried
Attachments
test document ocr.zip
(1.03 KiB) Downloaded 175 times

Paul - Tracker Supp
Site Admin
Posts: 4838
Joined: Wed Mar 25, 2009 10:37 pm
Location: Chemainus, Canada

Re: Poor OCR results

Post by Paul - Tracker Supp » Mon Jan 09, 2012 3:50 pm

Hi wilfriedh,

Thanks for the input. That's quite interesting to see. FineReader did do a better job, but it is also a $169.99 product that has had years of marketplace trial. This is a free OCR that is in its first release.

We will still be working on this. hth

Walter-Tracker Supp
User
Posts: 383
Joined: Mon Jun 13, 2011 5:10 pm

Re: Poor OCR results

Post by Walter-Tracker Supp » Mon Jan 09, 2012 5:30 pm

Hi,

The recommended accuracy setting for most documents is "medium"; in some cases the trade-off between speed and accuracy makes it worthwhile to use "low", which is faster but slightly more error-prone. High accuracy should generally be used for high-resolution documents with small text; for general use with typical scanned documents (letters, forms, etc.) it may end up performing worse than medium. We have left it up to end users to determine which setting is best for their specific document.
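
As a rough illustration of that guidance, here is a minimal Python sketch of how one might pick a setting. It is not part of PDF-XChange and does not call any Tracker API; the function name `suggest_accuracy` and both thresholds (300 dpi for "high resolution", 8 pt for "small text") are assumptions chosen only for the example, since the guidance above does not give numbers.

```python
def suggest_accuracy(dpi: float, text_height_pt: float, prefer_speed: bool = False) -> str:
    """Return "low", "medium" or "high" following the guidance in this thread.

    The thresholds below are illustrative assumptions, not values
    published by Tracker Software.
    """
    HIGH_RES_DPI = 300.0   # assumed cut-off for a "high resolution" scan
    SMALL_TEXT_PT = 8.0    # assumed cut-off for "small text"

    # "High" is only suggested for high-resolution scans with small text.
    if dpi >= HIGH_RES_DPI and text_height_pt <= SMALL_TEXT_PT:
        return "high"
    # "Low" trades some accuracy for speed.
    if prefer_speed:
        return "low"
    # "Medium" is the recommended default for typical scanned documents.
    return "medium"


if __name__ == "__main__":
    print(suggest_accuracy(dpi=150, text_height_pt=11))  # a typical scanned letter
```

In practice it may still be worth running the same page at more than one setting and comparing the results, as was done earlier in this thread.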

Also, the input document you provided is fairly low resolution. If you zoom in you can see a lot of things that typically cause problems for OCR: poor delineation of letters, and letters that often touch their neighbours quite significantly. A higher scanning resolution should resolve this.
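
To check how low the resolution of a scan actually is, the effective resolution can be estimated from the image's pixel size and the page size. The sketch below assumes you already know both (for example from the image properties and the page dimensions); the function name `effective_dpi` and the sample numbers are hypothetical and are not measurements of the attached test document. The only fixed fact used is that a PDF point is 1/72 of an inch.

```python
POINTS_PER_INCH = 72.0  # PDF page sizes are expressed in points (1/72 inch)


def effective_dpi(pixel_width: int, pixel_height: int,
                  page_width_pt: float, page_height_pt: float) -> tuple[float, float]:
    """Return the (horizontal, vertical) resolution of a scanned page image.

    pixel_width / pixel_height: size of the scanned image in pixels.
    page_width_pt / page_height_pt: size of the page it fills, in points.
    """
    dpi_x = pixel_width / (page_width_pt / POINTS_PER_INCH)
    dpi_y = pixel_height / (page_height_pt / POINTS_PER_INCH)
    return dpi_x, dpi_y


if __name__ == "__main__":
    # Hypothetical example: a 1275 x 1650 pixel image on a US Letter page
    # (612 x 792 points, i.e. 8.5 x 11 inches).
    print(effective_dpi(1275, 1650, 612, 792))
```

If the result is well below what the scanner can deliver, rescanning at a higher resolution is likely to help more than changing the accuracy setting.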


We are continually developing our OCR functionality and our products in general, so your feedback is greatly appreciated.

-Walter
