OCR makes scanned text invisible in 7.0.323.2

Forum for the PDF-XChange Editor - Free and Licensed Versions

scan_user
User
Posts: 13
Joined: Sun Jan 07, 2018 7:14 pm

OCR makes scanned text invisible in 7.0.323.2

Post by scan_user »

Problem: Scanned pages appear blank when OCR is turned on in “image insertion options”. The text selection tool and search tool both find the text on the scanned page, but the text is invisible. When scanned without auto-OCR the text is visible, and when OCR is run after the scan the text remains visible. The problem only occurs when OCR is done automatically with the scan.
v6.0 does not seem to have this problem.

Example enclosed

Hi,
First off, I’ve been using TurboPDF for a while, and its scan workflow drives me mad (too many clicks, does not remember preferences), so I started looking around, and after playing with Power PDF I finally looked at PDF-XChange.
And I must say I am impressed by the attention to detail in PDF-XChange. Little things like search placing a big fat blue highlight on the found text (which makes it much easier to find than the hard-to-see highlights in both others). Or the many settings, e.g. being able to have an automatic date-bookmark for a scanned page. Really nice. Also, the new user interface of 7.0 greatly improves productivity over 6.0 (which I tried too, because of the bug). Being able to define very low-level shortcuts (e.g. “insert from scan”) in the top bar, and drop-to-open… all great stuff.
Definitely the smoothest scanning workflow that I could find (fewest clicks, with all settings remembered).

So I’m basically ready to switch, and then I run into this ugly bug:
Scanning documents works fine, until I switch on OCR. As soon as automatic OCR is selected in “Images insertion options” then the scanned page looks BLANK.

When I use the “text select tool” and sweep across the scanned/OCRed page, the tool highlights all the invisible text. So does the “search” function: it highlights the right areas. So the text is there, but it is invisible. This happens with both the TWAIN and the WIA driver, so I don’t think it’s related to the scanner, which works fine with TurboPDF and Power PDF.
I tried v6.0, and scan plus auto-OCR works fine there (but it doesn’t find my TWAIN driver, and the GUI is not as nice as in 7.0).
Can you please fix this and make auto OCR possible?
Example document enclosed.
thanks

Oh, one more thing: with so many nice settings, how about a setting for keeping or deleting preferences on uninstall? Currently it seems to remember all preferences, even after a full uninstall: the settings (e.g. toolbars) are still there after a reinstall (only after installing the previous version are they gone).
I tried uninstall to see if something was wrong with my install. For that purpose I would have liked to completely clean out the registry of all settings, just to make sure I go back to all defaults…
Attachments
New Document from TW-Brother MFC-L2740DW LAN.pdf
(1.09 MiB) Downloaded 81 times
Tracker Supp-Stefan
Site Admin
Posts: 17823
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: OCR makes scanned text invisible in 7.0.323.2

Post by Tracker Supp-Stefan »

Hello scan_user,

Welcome to our forums and thanks for your report.
This is a known issue in the current build.
Please do the scanning without OCR, and then run the OCR as a separate step - and this will preserve the images and add the needed text layer on top.
We are working on a fix, so the above is just a workaround until the next build.

I will pass on the suggestion for a full settings cleanup at uninstall time for consideration. If you want to do a clean uninstall, you can use the uninstaller tool and then manually remove the HKEY_CURRENT_USER\Software\Tracker Software tree, which stores all the settings for our software.
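
The manual registry cleanup can also be scripted. The following is only a minimal sketch in Python, not an official Tracker Software tool: it assumes the reg.exe utility that ships with Windows is on the PATH, and the function names and the backup file name are made up for illustration. It exports the key to a backup file first and then deletes the whole tree. By default it runs nothing and only returns the commands it would run.

```python
import subprocess
import sys

# Key named in the reply above; HKCU is reg.exe's abbreviation for
# HKEY_CURRENT_USER.
TRACKER_KEY = r"HKCU\Software\Tracker Software"


def build_backup_command(backup_path):
    """Command that exports the key to a .reg file (/y overwrites silently)."""
    return ["reg", "export", TRACKER_KEY, backup_path, "/y"]


def build_delete_command():
    """Command that deletes the key and all its subkeys (/f skips the prompt)."""
    return ["reg", "delete", TRACKER_KEY, "/f"]


def clean_tracker_settings(backup_path, dry_run=True):
    """Back up, then delete, the settings tree.

    With dry_run=True (the default), or on a non-Windows system, nothing is
    executed and the commands are only returned for inspection.
    """
    commands = [build_backup_command(backup_path), build_delete_command()]
    if dry_run or sys.platform != "win32":
        return commands
    for cmd in commands:
        # check=True stops before the delete if the backup export fails.
        subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical backup file name; prints the commands without running them.
    for cmd in clean_tracker_settings("tracker_settings_backup.reg"):
        print(" ".join(cmd))
```

Close the Editor before running it with dry_run=False, and keep the exported .reg file so the settings can be restored by importing it again.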

Regards,
Stefan
scan_user
User
Posts: 13
Joined: Sun Jan 07, 2018 7:14 pm

Re: OCR makes scanned text invisible in 7.0.323.2

Post by scan_user »

Thanks, Stefan,
I'll wait for the next release then to enable auto OCR.

I did get the license, and now that I have started using it, I ran into another unexpected thing, for which I could not seem to find an answer in either the help or the forums...

When I run OCR, every other software that I tried (Acrobat, Power PDF, Nuance) will automatically rotate the page so that the text it found is horizontal. That sometimes gets interesting when there is text in different directions (e.g. in a margin), but it generally works very well. And it saves another lengthy step of manually selecting the scanned pages that were originally printed in landscape mode and then rotating them by hand.
I could not find such an "auto-rotate" option in the OCR settings or anywhere else.

It seems like such a no-brainer to have this as an option.... am I missing something?
(I have auto-deskew on in the image insertion options, but that didn't seem to do the trick.)
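
To make the request concrete, here is a rough sketch in Python of how such an auto-rotate step could choose an orientation. This is purely an illustration and not how PDF-XChange or any of the other products actually does it. It assumes some OCR engine can report a quality score for the page at each of the four right-angle rotations; the function name and the example scores are made up.

```python
def best_rotation(score_at_angle, angles=(0, 90, 180, 270)):
    """Return the rotation (in degrees) whose OCR score is highest.

    score_at_angle is any callable mapping an angle to a number, for
    example the mean recognition confidence an OCR engine reports for
    the page rotated by that angle. On a tie the first angle wins.
    """
    return max(angles, key=score_at_angle)


# Hypothetical scores for a landscape page that was scanned sideways.
example_scores = {0: 0.12, 90: 0.91, 180: 0.10, 270: 0.35}
rotation = best_rotation(example_scores.get)
print("rotate page by", rotation, "degrees")
```

Text printed in a different direction, such as a margin note, would simply lose out to the dominant orientation in this scheme.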

thanks
Tracker Supp-Stefan
Site Admin
Posts: 17823
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: OCR makes scanned text invisible in 7.0.323.2

Post by Tracker Supp-Stefan »

Hello Scan_user,

You are right, we do not currently do such auto-rotations, but I believe a colleague is working on some further improvements to the OCR engine, and I will pass this suggestion on to him for consideration as well!

Regards,
Stefan
Willy Van Nuffel
User
Posts: 2347
Joined: Wed Jan 18, 2006 12:10 pm

Re: OCR makes scanned text invisible in 7.0.323.2

Post by Willy Van Nuffel »

In PDF-XChange Editor V7 build 325.1 the "auto-rotate" in OCR is still missing.

Any news about this?
Tracker Supp-Stefan
Site Admin
Posts: 17823
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: OCR makes scanned text invisible in 7.0.323.2

Post by Tracker Supp-Stefan »

Hello Willy,

I am afraid that I do not have any news on the subject, so I will ask around now to see if I can find anything else to share here!

Regards,
Stefan