Searchable image issues in V9

Forum for the PDF-XChange Editor - Free and Licensed Versions

frebbe
User
Posts: 1
Joined: Wed Jan 11, 2006 4:16 pm

Searchable image issues in V9

Post by frebbe »

Just updated from version 7 to 9. In the old version, without the Enhanced OCR PlugIn, I had no problems with text in tables. My first test of the new version with a PDF fresh from the digitization service of the Saxon State Library in Dresden (without OCR) is very sobering. Recognition is good on pure text pages, but tables are arbitrarily disfigured with newly drawn vertical and horizontal lines. That looks really bad:
Output.jpg
These are the settings I used for the text recognition. However, the settings make no difference: the lines in the table always appear.
Settings.jpg
This is a serious regression compared to the results I had with version 7. You can see that I have been using your products since 2004. Today is the first time I am considering withdrawing from the purchase of a new version!

And, by the way, why is there no recognition of Fraktur fonts, which unfortunately have been used in German for a very long time?
TrackerSupp-JohnG
Site Admin
Posts: 18
Joined: Tue Sep 15, 2020 11:43 pm

Re: Version 9 is now available

Post by TrackerSupp-JohnG »

Hello,

Thank you for informing us about this issue. Would you mind sending this document to support@pdf-xchange.com so that we can take a closer look?

Kind regards,
John Gareth
Support Technician
Tracker Software Products (Canada) LTD

Support: <Support@tracker-software.com>
Sales: +1 (250) 324-1621
Fax: +1 (250) 324-1623
Markus Stamm
User
Posts: 10
Joined: Mon Apr 20, 2020 8:38 pm

Re: Version 9 is now available

Post by Markus Stamm »

frebbe wrote: Sat Jan 16, 2021 11:11 pm (...) And, by the way, why is there no recognition of Fraktur fonts, which unfortunately have been used in German for a very long time?
I second this question, see also my post on this topic: https://forum.pdf-xchange.com/viewtopic.php?f=63&p=147689#p147689

ABBYY FineReader 15 and FineReader Server are capable of black letter OCR, as was Tesseract, so perhaps this is only a question of adding the training data?
Vasyl-Tracker Dev Team
Site Admin
Posts: 2353
Joined: Thu Jun 30, 2005 4:11 pm
Location: Canada

Re: Version 9 is now available

Post by Vasyl-Tracker Dev Team »

Hi frebbe
frebbe wrote: Just updated from version 7 to 9. In the old version, without the Enhanced OCR PlugIn, I had no problems with text in tables. My first test of the new version with a PDF fresh from the digitization service of the Saxon State Library in Dresden (without OCR) is very sobering. Recognition is good on pure text pages, but tables are arbitrarily disfigured with newly drawn vertical and horizontal lines. That looks really bad..

It seems that this output mode has a problem:
image.png
In that mode the Editor should not add any lines at all, only invisible text over the scanned image.
We will fix that soon; sorry for the inconvenience.
In the next build we will also add an option to suppress those lines for tables. You will be able to use it with the other output modes too.

A tip: with V9 you can still use the previous OCR engine if you want. You can enable it here:
image1.png
frebbe wrote: And, by the way, why is there no recognition of Fraktur fonts, which unfortunately have been used in German for a very long time?
As I said, with V9 you can still use the previous OCR engine, and that engine can recognize Fraktur fonts as well:
image.png

Unfortunately, the new EnhancedOCR does not have this ability, although it is definitely faster and gives significantly better results for most documents. We are still improving its performance and quality.

Cheers.
Vasyl Yaremyn
Tracker Software Products
Project Developer
