OCR makes scanned text invisible in 7.0.323.2

Posted: Sun Jan 07, 2018 8:14 pm
by scan_user
Problem: Scanned pages appear blank when OCR is turned on in “image insertion options”. The text selection tool and search tool both find the text on the scanned page, but the text is invisible. When scanned without auto-OCR the text is visible, and when OCR is run after the scan the text remains visible. The problem only occurs when OCR is done automatically with the scan.
v6.0 does not seem to have this problem.

Example enclosed

First off, I’ve been using TurboPDF for a while, and their scan workflow drives me mad (too many clicks, does not remember preferences), so I started looking around, and after playing with Power PDF I finally looked at PDF-XChange.
And I must say I am impressed by the attention to detail in PDF-XChange. Little things like search placing a big fat blue highlight on the found text (makes it much easier to find than the hard-to-see highlights in both others). Or the many settings, e.g. to be able to have an automatic date-bookmark for a scanned page. Really nice. Also the new user interface of 7.0 greatly improves productivity over 6.0 (which I tried too, because of the bug). Being able to define very low-level shortcuts (e.g. “insert from scan”) in the top-bar and drop-to-open… all great stuff.
Definitely the smoothest workflow for scan that I could find. (fewest clicks with all settings remembered)

So I’m basically ready to switch, and then I run into this ugly bug:
Scanning documents works fine, until I switch on OCR. As soon as automatic OCR is selected in “Images insertion options” then the scanned page looks BLANK.

When I use the “text select tool” and sweep across that scanned/OCRd page, the tool highlights all the invisible text. So does the “search” function: it highlights the right areas. So the text is there, but it is invisible. It happens with both the TWAIN and the WIA driver, so I don’t think it’s related to the scanner, which works fine with TurboPDF and Power PDF.
I tried v6.0 and the scan plus auto-OCR works fine. (but it doesn’t find my TWAIN driver, and the GUI is not as nice as 7.0).
Can you please fix this and make auto OCR possible?
Example document enclosed.

Oh, one more thing: with so many nice settings, how about a setting for keeping or deleting preferences on uninstall? Currently it seems to remember all preferences, even with a full uninstall: the settings (e.g. toolbars) are still there after a reinstall. (Only after installing the previous rev are they gone.)
I tried uninstall to see if something was wrong with my install. For that purpose I would have liked to completely clean out the registry of all settings, just to make sure I go back to all defaults…

Re: OCR makes scanned text invisible in 7.0.323.2

Posted: Mon Jan 08, 2018 9:29 am
by Tracker Supp-Stefan
Hello scan_user,

Welcome to our forums and thanks for your report.
This is a known issue in the current build.
Please do the scanning without OCR, and then run the OCR as a separate step - and this will preserve the images and add the needed text layer on top.
We are working on a fix, so the above is just a workaround until the next build.

I will pass on the suggestion for a full settings cleanup at uninstall time for consideration, but if you want to do a clean uninstall, you can use the uninstaller tool and then manually remove the HKEY_CURRENT_USER\Software\Tracker Software tree, which stores all the settings for our software.
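
For anyone who prefers to script the registry step rather than use regedit, here is a minimal sketch. It assumes the standard Windows `reg` command-line tool is available; the registry path is the one named above, while the function names, the backup file name and the dry-run behaviour are illustrative choices, not anything provided by Tracker Software. It exports the tree to a backup file first, because the delete cannot be undone.

```python
import subprocess
import sys

# Registry tree named in the reply above; everything else here is illustrative.
TRACKER_KEY = r"HKEY_CURRENT_USER\Software\Tracker Software"


def build_export_command(key, backup_path):
    """Command that saves the key to a .reg file (/y overwrites an existing file)."""
    return ["reg", "export", key, backup_path, "/y"]


def build_delete_command(key):
    """Command that deletes the key and all of its subkeys (/f skips the prompt)."""
    return ["reg", "delete", key, "/f"]


def reset_settings(backup_path="tracker-settings-backup.reg", dry_run=True):
    """Back up, then delete, the settings tree.

    The commands are only printed when dry_run is True or when the script
    is not running on Windows; otherwise they are executed in order.
    """
    commands = [
        build_export_command(TRACKER_KEY, backup_path),
        build_delete_command(TRACKER_KEY),
    ]
    for cmd in commands:
        if dry_run or sys.platform != "win32":
            print("would run:", subprocess.list2cmdline(cmd))
        else:
            # check=True stops before the delete if the export fails.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    reset_settings(dry_run=True)
```

Run it once with the default `dry_run=True` to see the two commands, close the application, and only then call `reset_settings(dry_run=False)`.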


Re: OCR makes scanned text invisible in 7.0.323.2

Posted: Sat Jan 13, 2018 7:13 pm
by scan_user
Thanks, Stefan,
I'll wait for the next release then to enable auto OCR.

I did get the license, and now that I've started using it, I ran into another unexpected thing, for which I could not seem to find an answer in either the help or the forums...

When I run OCR, every other software that I tried (Acrobat, Power PDF, Nuance) will automatically rotate the page so that the text it found is horizontal. That sometimes gets interesting when there is text in different directions (e.g. on a margin), but it generally works very well. And it saves another lengthy step of having to manually select the scanned pages that were originally printed in landscape mode and then manually rotate them.
I could not find such an "auto-rotate" option in the OCR settings or anywhere else.

It seems like such a no-brainer to have this as an option.... am I missing something?
( I have auto-deskew on in the image insertion options, but that didn't seem to do the trick)
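
To make the request concrete, here is a rough sketch of the kind of decision I mean. It is purely hypothetical: I don't know how any of these products actually do it, and the input format (one orientation and character count per recognised text block) and the "most characters wins" rule are my own assumptions. Orientation is taken to mean how many degrees clockwise the text is turned away from upright.

```python
from collections import defaultdict


def dominant_orientation(blocks):
    """Return the text orientation (in degrees) backed by the most characters.

    blocks is a list of (orientation_degrees, character_count) pairs, one per
    recognised text block. Both the input shape and the voting rule are
    assumptions made for this sketch.
    """
    votes = defaultdict(int)
    for orientation, char_count in blocks:
        votes[orientation % 360] += char_count
    if not votes:
        return 0  # nothing recognised: leave the page alone
    return max(votes, key=votes.get)


def correction_angle(blocks):
    """Degrees to rotate the page clockwise so the dominant text ends up upright."""
    return (360 - dominant_orientation(blocks)) % 360


if __name__ == "__main__":
    # A landscape page: a long body turned 90 degrees plus a short upright margin note.
    page = [(90, 1200), (0, 40)]
    print(correction_angle(page))
```

Weighting by character count is what keeps a short note on the margin from flipping the whole page.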


Re: OCR makes scanned text invisible in 7.0.323.2

Posted: Mon Jan 15, 2018 10:41 am
by Tracker Supp-Stefan
Hello Scan_user,

You are right, we do not currently do such auto-rotation, but I believe a colleague is working on some further improvements to the OCR engine, and I will pass this suggestion on to him for consideration as well!


Re: OCR makes scanned text invisible in 7.0.323.2

Posted: Wed May 23, 2018 9:58 am
by Willy Van Nuffel
In PDF-XChange Editor V7 build 325.1 the "auto-rotate" option in OCR is still missing.

Any news about this?

Re: OCR makes scanned text invisible in 7.0.323.2

Posted: Wed May 23, 2018 11:12 am
by Tracker Supp-Stefan
Hello Willy,

I am afraid that I do not have any news on the subject, so I will ask around now to see if I can find anything else to share here!