A bug for Enhance Scanned Pages

This Forum is for the use of End Users requiring help and assistance for Tracker Software's PDF-Tools.

Moderators: TrackerSupp-Daniel, Tracker Support, Chris - Tracker Supp, Vasyl-Tracker Dev Team, Sean - Tracker, Tracker Supp-Stefan

eu4you
User
Posts: 4
Joined: Sat Sep 25, 2021 10:44 am

A bug for Enhance Scanned Pages

Post by eu4you » Sat Sep 25, 2021 11:04 am

Hi,

I have some PDF files that were OCRed with ABBYY FineReader 14.
FineReader is a great OCR program, but it has one fault: it splits the text into many separate objects.
As a result, the files are slow to open and read in any viewer.

So I tried the 'Enhance Scanned Pages' feature in PDF-Tools,
and it does a good job of combining the picture and text objects.
The combining only happens when at least one filter in the feature is turned on.
Attachment: Calling Guide Book.pdf (3.9 MiB)
However, there is a problem: the order of the objects is wrong.
The text object should be under the picture object, but it always ends up on top of the picture.
As a result, the OCR text is drawn over the text in the picture and the page looks terrible.

Could you please fix this, or add a feature for reordering objects?
Thanks.

TrackerSupp-Daniel
Site Admin
Posts: 5174
Joined: Wed Jan 03, 2018 6:52 pm

Re: A bug for Enhance Scanned Pages

Post by TrackerSupp-Daniel » Mon Sep 27, 2021 11:14 pm

Hello, eu4you

I am afraid that I do not fully understand the issue you are reporting here. The document you sent already appears to have had OCR run on it, and the OCR text is transparent, so it does not impact the images.
Could I ask you to provide:
1. A screenshot of what you see as problematic in this file
2. A screenshot of your OCR settings, so that we can try to reproduce the issue
3. A copy of the original file, before any OCR has been performed

Kind regards,
Daniel McIntyre
Support Technician
Tracker Software Products (Canada) LTD

Support: <Support@tracker-software.com>
Sales: +1 (250) 324-1621
Fax: +1 (250) 324-1623

eu4you
User
Posts: 4
Joined: Sat Sep 25, 2021 10:44 am

Re: A bug for Enhance Scanned Pages

Post by eu4you » Tue Sep 28, 2021 1:50 pm

Hello Daniel,
Yes, I think my explanation was lacking.

1. This is the original PDF, as produced by the OCR feature of FineReader 14:
Calling Guide Book(Before).pdf (3.9 MiB)

The text and pictures in this PDF display normally, but pages can be very slow to render.
The content of the file looks like this:
Before Contents.png


2. I then ran the 'Enhance Scanned Pages' feature on the file with the settings below:
Setting.png


3. The resulting file is here:
https://kutt.it/orc7up (over 5 MB)

The pages now look wrong. The text should be under the picture, but the order is reversed, so the text layer is drawn over the text that is already in the picture.

The content of the result file looks like this:
After contents.png
As you know, the contents of each page are structured as layers, and the item at the top of the list is placed at the bottom of the actual page.
In the original PDF, the container holding the text is above the container holding the pictures in the list, so the text is hidden under the picture in page view.
In the file produced by PDF-Tools the order is wrong: the text layer is below the picture layer in the list, so the text is drawn on top of the picture, which already contains the same text. The two copies of the text overlap each other.
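
For anyone else reading this thread, the ordering rule described above can be sketched in a few lines of Python. This is only a toy model and not how PDF-Tools works internally: the `PageObject` class and both helper functions are hypothetical names made up for illustration, and the sketch assumes each picture covers the whole page, as a scanned page image normally does.

```python
from dataclasses import dataclass


@dataclass
class PageObject:
    kind: str  # "text" or "image" (hypothetical labels for this sketch)
    name: str


def text_is_covered(content):
    """Return True if every text object is painted before the last image.

    Objects are painted in list order, so an object that comes later in
    the list ends up on top. Assumes each image covers the whole page.
    """
    text_pos = [i for i, obj in enumerate(content) if obj.kind == "text"]
    image_pos = [i for i, obj in enumerate(content) if obj.kind == "image"]
    if not text_pos or not image_pos:
        return False
    return max(text_pos) < max(image_pos)


def move_text_first(content):
    """Return a new list with text objects first, so images paint over them.

    sorted() is stable, so the relative order inside each group is kept.
    """
    return sorted(content, key=lambda obj: obj.kind != "text")


# Order in the original FineReader file: text first, picture last.
before = [PageObject("text", "ocr layer"), PageObject("image", "scan")]
# Order reported after Enhance Scanned Pages: picture first, text last.
after = [PageObject("image", "scan"), PageObject("text", "ocr layer")]

print(text_is_covered(before))                  # text hidden under the scan
print(text_is_covered(after))                   # text drawn over the scan
print(text_is_covered(move_text_first(after)))  # order restored
```

The requested fix amounts to something like `move_text_first`: keep the objects as they are, but write the text before the picture so the picture is painted last.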


So my suggestion is either to change the order in which the picture and text layers are written, or to add a separate function that sorts the layers according to a priority setting.

Note that this problem does not appear when all filters in the feature are turned off. At least one filter must be turned on for it to appear.

eu4you
User
Posts: 4
Joined: Sat Sep 25, 2021 10:44 am

Re: A bug for Enhance Scanned Pages

Post by eu4you » Tue Oct 12, 2021 4:05 pm

Hello?
Has this bug been accepted?

TrackerSupp-Daniel
Site Admin
Posts: 5174
Joined: Wed Jan 03, 2018 6:52 pm

Re: A bug for Enhance Scanned Pages

Post by TrackerSupp-Daniel » Tue Oct 12, 2021 11:30 pm

Hello, eu4you

My apologies! Yes, I managed to reproduce this issue and created the following bug report for it:

RT#5745: Enhance Scans re-orders page content incorrectly.

I thought I had posted it here after creating the ticket, so again, please accept my apologies for missing this.

Kind regards,
Daniel McIntyre
Support Technician
Tracker Software Products (Canada) LTD

eu4you
User
Posts: 4
Joined: Sat Sep 25, 2021 10:44 am

Re: A bug for Enhance Scanned Pages

Post by eu4you » Mon Nov 15, 2021 11:43 am

Hello,

It looks like the latest update does not include a fix for this bug.
May I ask how it is going?

Tracker Supp-Stefan
Site Admin
Posts: 14680
Joined: Mon Jan 12, 2009 8:07 am
Location: London
Contact:

Re: A bug for Enhance Scanned Pages

Post by Tracker Supp-Stefan » Mon Nov 15, 2021 11:54 am

Hello eu4you,

Yes, indeed, the bug reported in this ticket could not be addressed in build 358.
The ticket is still in our system and will be looked at as soon as possible!

Kind regards,
Stefan

Vasyl-Tracker Dev Team
Site Admin
Posts: 2102
Joined: Thu Jun 30, 2005 4:11 pm
Location: Canada

Re: A bug for Enhance Scanned Pages

Post by Vasyl-Tracker Dev Team » Tue Nov 16, 2021 10:37 pm

Hi eu4you.

The 'overlapped-text-becomes-visible-after-EnhanceScans' bug is confirmed and will be fixed soon.

Also, a tip based on your screenshot of the EnhanceScans dialog: using Descreen=High isn't a good idea for most images, because it may add visual artifacts (vertical and horizontal lines). Setting every parameter to High doesn't guarantee the best result for your case either. It is better to experiment with these parameters. Sometimes Medium or even Low may give you better results, depending on the kind of image.

Cheers.
Vasyl Yaremyn
Tracker Software Products
Project Developer
