A few questions about OCR

Posted: Fri Apr 27, 2018 6:57 pm
by Willy Van Nuffel
Hello,

I have a few questions about OCR in PDF-XChange Editor:

1) Via "OCR Page(s)..." it is possible to click OK when no language has been selected.
In contrast, via "Enhance Scanned Pages" it is NOT possible to click OK when only the "Recognize text" check box is active and NO language is selected.
Is this "by design"?

2) Sometimes you have to OCR a page with first names, last names or words that are not in any of the available dictionaries. There may be a lot of "special characters". Let us say that a large part of the Unicode character set must be recognized as correct.
Is there a way to tell the OCR feature that all these characters must be treated as correct, even though they do not appear as words in a dictionary?

Best regards

Re: A few questions about OCR

Posted: Fri Apr 27, 2018 7:53 pm
by TrackerSupp-Daniel
Hi Willy,
It is intended to function "without a language"; in that case it should essentially default to English. It is a good point, though, that the new function does not allow the same handling. Perhaps we should look at changing it.

On to names: in most cases names come out quite well, despite them not being "words". As an example, just the other day I had a user who was OCR'ing an old Russian phone book page, and it came out perfectly. So while there is no way to expand the dictionary, if no word matches what it is interpreting, I believe it does exactly that and defaults to showing what is present.