Just updated from version 7 to 9. In the old version, without the Enhanced OCR PlugIn, I had no problems with text in tables. My first test of the new version with a PDF fresh from the digitization service of the Saxon State Library in Dresden (without OCR) is very sobering. Recognition is good on pure text pages, but tables are arbitrarily disfigured with newly drawn vertical and horizontal lines. That looks really bad:
These are the settings I used for the text recognition. However, it does not matter which settings OCR is run with; the lines in the table always appear.
This is a real step backwards compared to the results I had with version 7. You can see that I have been using your products since 2004. Today is the first time I am considering withdrawing from the purchase of a new version!
And, by the way, why is there no recognition of Fraktur fonts, which unfortunately have been used in German for a very long time?
Searchable image issues in V9
- TrackerSupp-JohnG
- Site Admin
- Posts: 18
- Joined: Tue Sep 15, 2020 11:43 pm
Re: Version 9 is now available
Hello,
Thank you for letting us know about this issue. To help us support you better, would you mind sending this document to support@pdf-xchange.com so that we can take a closer look?
Kind regards,
John Gareth
Support Technician
Tracker Software Products (Canada) LTD
Support: <Support@tracker-software.com>
Sales: +1 (250) 324-1621
Fax: +1 (250) 324-1623
-
- User
- Posts: 10
- Joined: Mon Apr 20, 2020 8:38 pm
Re: Version 9 is now available
I second this question, see also my post on this topic: https://forum.pdf-xchange.com/viewtopic.php?f=63&p=147689#p147689
ABBYY FineReader 15 and FineReader Server are capable of black letter OCR, as was Tesseract, so perhaps this is only a question of adding the training data?
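For anyone who wants to try the Tesseract route outside PDF-XChange, here is a minimal sketch of how such a call could look. It assumes Tesseract is installed locally together with its separately downloadable Fraktur model (`frk`); the function names and the file names `page.png` and `page` are made up for illustration and have nothing to do with how the Editor works internally.

```python
from __future__ import annotations

import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image: str, output_base: str, lang: str = "frk") -> list[str]:
    """Build a Tesseract CLI call that writes a searchable PDF.

    "frk" is the name of Tesseract's Fraktur traineddata file; having it
    installed is an assumption about the local setup.
    """
    # The trailing "pdf" selects the output config that keeps the page image
    # and puts an invisible text layer on top of it.
    return ["tesseract", image, output_base, "-l", lang, "pdf"]


def ocr_to_searchable_pdf(image: str, output_base: str, lang: str = "frk") -> Path:
    """Run Tesseract and return the path of the PDF it writes."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    subprocess.run(build_tesseract_command(image, output_base, lang), check=True)
    return Path(output_base + ".pdf")


if __name__ == "__main__":
    # Hypothetical file names; only the command is printed here.
    print(" ".join(build_tesseract_command("page.png", "page")))
```

Since the text layer is invisible, a result like this would not add any drawn table lines to the page; how well the Fraktur model copes with a given scan is something that has to be tested per document.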
- Vasyl-Tracker Dev Team
- Site Admin
- Posts: 2353
- Joined: Thu Jun 30, 2005 4:11 pm
- Location: Canada
Re: Version 9 is now available
Hi frebbe
frebbe wrote: Just updated from version 7 to 9. In the old version, without the Enhanced OCR PlugIn, I had no problems with text in tables. My first test of the new version with a PDF fresh from the digitization service of the Saxon State Library in Dresden (without OCR) is very sobering. Recognition is good on pure text pages, but tables are arbitrarily disfigured with newly drawn vertical and horizontal lines. That looks really bad..
It seems that this output mode has a problem: in that mode the Editor should not add any lines at all, only invisible text over the scanned image.
We will fix that soon, sorry for the inconvenience.
In the next build we will also add an option to suppress those lines for tables. You will be able to use it for other output modes too.
Another tip: with V9 you can still use the previous OCR engine if you want. You can enable it there:
frebbe wrote: And, by the way, why is there no recognition of Fraktur fonts, which unfortunately have been used in German for a very long time?
As I said, with V9 you can still use the previous OCR engine, and that engine is able to recognize Fraktur fonts as well.
Unfortunately, the new EnhancedOCR does not have this ability, although it is definitely faster and gives significantly better results for most documents. We are still improving its performance and quality...
Cheers.
Vasyl Yaremyn
Tracker Software Products
Project Developer
Please archive any files posted to a ZIP, 7z or RAR file or they will be removed and not posted.