A bug for Enhance Scanned Pages


eu4you
User
Posts: 8
Joined: Sat Sep 25, 2021 10:44 am

A bug for Enhance Scanned Pages

Post by eu4you »

Hi,

I have some PDF files that were OCRed with ABBYY FineReader 14.
FineReader is a great program for OCR, but one drawback is that it splits the text into many objects.
As a result, the files are slow to read in any viewer.

So I tried the 'Enhance Scanned Pages' feature in PDF-Tools,
and it works well for combining the picture and text objects.
This only works when at least one filter in the feature is applied.
Attachment: Calling Guide Book.pdf (3.9 MiB)
But there is a problem: the object order is wrong.
The text object should be under the picture object, but it always ends up on top of the picture.
So the text in the PDF overlaps and looks horrible.

So I would like to ask you to fix this, or to add a feature for ordering objects.
Thanks.
TrackerSupp-Daniel
Site Admin
Posts: 8371
Joined: Wed Jan 03, 2018 6:52 pm

Re: A bug for Enhance Scanned Pages

Post by TrackerSupp-Daniel »

Hello, eu4you

I am afraid that I do not fully understand the issue you are reporting here. The document you sent already appears to have had OCR run on it, and the OCR text is transparent, so it does not impact the images.
Could I ask you to
1. Provide a screenshot of what you see in this file as problematic
2. Provide a screenshot of what your OCR settings are, so that we can try to reproduce the issue
3. Provide a copy of the original file, before any OCR has been performed.

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

+++++++++++++++++++++++++++++++++++
Our Web site domain and email address has changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com
eu4you
User
Posts: 8
Joined: Sat Sep 25, 2021 10:44 am

Re: A bug for Enhance Scanned Pages

Post by eu4you »

Hello Daniel,
Yes, I think my explanation was lacking.

1. This is the original PDF, which was processed with the OCR feature of FineReader 14.
Attachment: Calling Guide Book(Before).pdf (3.9 MiB)

The text and pictures in the PDF file display normally, but every page may be very slow to load.
The contents of the file look like this:
Before Contents.png


2. So I ran the 'Enhance Scanned Pages' feature on the file with the settings below:
Setting.png


3. The resulting file is here:
https://kutt.it/orc7up (over 5 MB)

You can see that the pages look wrong. The text should be under the picture, but the order is reversed. So in the page view the text layer overlaps the text in the picture.

The contents of this file look like this:
After contents.png
As you know, the contents of each page are structured as layers, and the item at the top of the list is placed at the bottom of the actual page.
In the original PDF, the container holding the text is above the container holding the pictures in the list, so the text is hidden under the picture in the page view.
But the result produced by the feature in PDF-Tools is wrong: the text layer is below the picture layer in the list, so in the page view the text sits on top of the picture, which already shows the same text. The two copies of the text overlap each other.
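
The ordering rule described above can be illustrated with a minimal Python sketch. It is only a toy model of "earlier in the list is painted first, so it ends up underneath"; the `PageObject` class and the `visible_object_per_cell` function are hypothetical names made up for this illustration and are not part of PDF-Tools or of any PDF library.

```python
from dataclasses import dataclass


@dataclass
class PageObject:
    name: str  # e.g. "text" or "picture" (illustrative labels only)
    x0: int
    x1: int    # the object covers the half-open cell range [x0, x1)


def visible_object_per_cell(content_list, width):
    """Paint objects in list order; a later object covers an earlier one."""
    cells = [None] * width
    for obj in content_list:  # first entry is painted first (bottom of the page)
        for x in range(max(obj.x0, 0), min(obj.x1, width)):
            cells[x] = obj.name
    return cells


text = PageObject("text", 2, 6)
picture = PageObject("picture", 0, 8)

# Text listed first: the picture is painted over it, so the text is hidden.
before = visible_object_per_cell([text, picture], 8)
# Order swapped: the text is painted last and shows on top of the picture.
after = visible_object_per_cell([picture, text], 8)

print(before)
print(after)
```

In this model, `before` corresponds to the original file (text hidden under the picture) and `after` to the converted file (text visible on top of the picture).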


So what I would suggest is either changing the order of the picture and text layers, or adding another function that sorts the layers by a configurable priority.

Note that this problem does not appear when the filters in the feature are turned off. At least one filter must be turned on for it to appear.
eu4you
User
Posts: 8
Joined: Sat Sep 25, 2021 10:44 am

Re: A bug for Enhance Scanned Pages

Post by eu4you »

Hello?
Is this bug accepted?
TrackerSupp-Daniel
Site Admin
Posts: 8371
Joined: Wed Jan 03, 2018 6:52 pm

Re: A bug for Enhance Scanned Pages

Post by TrackerSupp-Daniel »

Hello, eu4you

My apologies! Yes, I managed to reproduce this issue and created the following bug report for it:

RT#5745: Enhance Scans re-orders page content incorrectly.

I thought I had posted it here after creating the ticket, so again, please accept my apologies for missing this.

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

eu4you
User
Posts: 8
Joined: Sat Sep 25, 2021 10:44 am

Re: A bug for Enhance Scanned Pages

Post by eu4you »

Hello,

It looks like the latest update did not include a fix for this bug.
May I ask how it is going?
Tracker Supp-Stefan
Site Admin
Posts: 17765
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: A bug for Enhance Scanned Pages

Post by Tracker Supp-Stefan »

Hello eu4you,

Yes, indeed, this ticket and the bug reported in it could not be addressed in build 358.
The ticket is still in our system and will be looked at as soon as possible!

Kind regards,
Stefan
Vasyl-Tracker Dev Team
Site Admin
Posts: 2351
Joined: Thu Jun 30, 2005 4:11 pm
Location: Canada

Re: A bug for Enhance Scanned Pages

Post by Vasyl-Tracker Dev Team »

Hi eu4you.

The 'overlapped-text-becomes-visible-after-EnhanceScans' bug is confirmed and will be fixed soon.

Also, a tip related to your case and to your screenshot of the EnhanceScans dialog: using Descreen=High isn't a good idea for the majority of images, because it may add visual artifacts (vertical and horizontal lines). And using a High value for all parameters doesn't guarantee the best result for your case. It is better to experiment more with these parameters. Sometimes the Medium or even the Low value may give you better results, depending on the kind of image.

Cheers.
Vasyl Yaremyn
Tracker Software Products
Project Developer

Please archive any files posted to a ZIP, 7z or RAR file or they will be removed and not posted.
eu4you
User
Posts: 8
Joined: Sat Sep 25, 2021 10:44 am

Re: A bug for Enhance Scanned Pages

Post by eu4you »

Thank you very much for the good news and also for your hard work!
Tracker Supp-Stefan
Site Admin
Posts: 17765
Joined: Mon Jan 12, 2009 8:07 am
Location: London

A bug for Enhance Scanned Pages

Post by Tracker Supp-Stefan »

:)
eu4you
User
Posts: 8
Joined: Sat Sep 25, 2021 10:44 am

Re: A bug for Enhance Scanned Pages

Post by eu4you »

Hi,
has this problem been resolved in this version update?
Tracker Supp-Stefan
Site Admin
Posts: 17765
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: A bug for Enhance Scanned Pages

Post by Tracker Supp-Stefan »

Hello eu4you,

I just tested with the sample file from the ticket - and no - it does not seem like this was fixed in build 360. I will ask our devs to check this again!

Kind regards,
Stefan
Vasyl-Tracker Dev Team
Site Admin
Posts: 2351
Joined: Thu Jun 30, 2005 4:11 pm
Location: Canada

Re: A bug for Enhance Scanned Pages

Post by Vasyl-Tracker Dev Team »

Hi eu4you.

This issue will be fixed in the upcoming 361 build.

Cheers.
Vasyl Yaremyn
Tracker Software Products
Project Developer

eu4you
User
Posts: 8
Joined: Sat Sep 25, 2021 10:44 am

Re: A bug for Enhance Scanned Pages

Post by eu4you »

Hi,
I recently updated PDF-Tools, and I've been busy, so I only tried this function today.

With the file I previously attached, the error seems to be resolved. But when I tried other files, the problem was still there.

Among the options, there is one that moves the background to the bottom of the layers, but then the text ends up on top of the background, so I don't think the problem is solved. Instead, I think the background should be at the top. I would like that option to be added as well.

Please take a look at the file attached below.

I am providing the link below instead of an attachment, because of an error saying that the attachment is too large:
https://Enoch.myqnapcloud.com:8001/share.cgi?ssid=81b900ccb27b4da1af5dcdd52026510c
TrackerSupp-Daniel
Site Admin
Posts: 8371
Joined: Wed Jan 03, 2018 6:52 pm

Re: A bug for Enhance Scanned Pages

Post by TrackerSupp-Daniel »

Hello, eu4you

Looking at these files, the document seems to already be well formed and to contain a set of text data. I need to ask why you are trying to run the "Enhance scanned pages" function on this document to begin with, as very little improvement, if any at all, could come from it.

Is there some specific reason that you are running Enhance scanned pages on this file to begin with?

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

eu4you
User
Posts: 8
Joined: Sat Sep 25, 2021 10:44 am

Re: A bug for Enhance Scanned Pages

Post by eu4you »

Hi Daniel,

Because it takes too much time to read this file. When reading with a PDF reader program on a computer, it is slow when turning multiple pages quickly. And most importantly, I want to read these documents on an e-book reader (e.g. the Onyx Boox series), where it is seriously slow: it takes about 10 seconds to turn a page.

I think your program will be very useful in improving this problem, and in fact, some files that have been converted are already useful.
TrackerSupp-Daniel
Site Admin
Posts: 8371
Joined: Wed Jan 03, 2018 6:52 pm

Re: A bug for Enhance Scanned Pages

Post by TrackerSupp-Daniel »

Hello, eu4you

Thank you for the explanation. "Enhance scanned pages" is not the feature you are looking for in that case. It is not designed to reduce file size or overhead, and it can often actually increase the size of each page, making pages take longer to load in the cases you just described.

What you are looking for here is the "save as optimized" function, which is located on the File tab. This is used to trim unnecessary information from pages, recompress images, and discard other unused data, such as duplicate fonts. I would recommend that you take a look at that tool and all of its various options, to see if they work for you.

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD
