Batch fix content skew and incorrect page rotation

Jensen Head
User
Posts: 412
Joined: Mon Sep 13, 2021 8:12 am

Batch fix content skew and incorrect page rotation

Post by Jensen Head »

What is the best way to batch fix page skew without changing a document's existing text objects? That is, where a text layer is present, it should remain unchanged by the tool's operation; where no text layer exists, none should be created. Failing that, how can I set up an equivalent of the "Deskew Pages Content" [1] tool from PDF-XChange Editor in PDF-Tools?

[1] Deskew scanned images in the document to improve reading and text recognition.

Related topic: "Deskewing with PDF-Tools not working" https://forum.pdf-xchange.com/viewtopic.php?f=70&t=35895
TrackerSupp-Daniel
Site Admin
Posts: 8371
Joined: Wed Jan 03, 2018 6:52 pm

Re: Batch fix content skew and incorrect page rotation

Post by TrackerSupp-Daniel »

Hello, Jensen Head

As mentioned in the other topic that you linked, the issue in PDF-Tools has already been fixed, so that would be your way to accomplish this. The Editor is not designed for batch processing, so it does not offer this tool as a batch option. PDF-Tools however, already offers this function by disabling all options except deskew in "enhance scanned pages".

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

+++++++++++++++++++++++++++++++++++
Our Web site domain and email address has changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com

Re: Batch fix content skew and incorrect page rotation

Post by Jensen Head »

TrackerSupp-Daniel wrote: Wed Apr 06, 2022 9:40 pm The Editor is not designed for batch processing, so it does not offer this tool as a batch option.
Of course. That's why I created a thread in the "PDF-Tools" section.
TrackerSupp-Daniel wrote: Wed Apr 06, 2022 9:40 pm PDF-Tools however, already offers this function by disabling all options except deskew in "enhance scanned pages".
That is, automatic correction of page skew of multiple documents is achieved through the following set of settings:
2022-04-11_11-33-56.png
Thank you!

Batch fix content skew and incorrect page rotation

Post by TrackerSupp-Daniel »

:)
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD


Re: Batch fix content skew and incorrect page rotation

Post by Jensen Head »

If I set "If document contains text" in the "OCR Pages" module to "Do not OCR but continue processing" and the document contains text objects, processing of that document ends with the error "The PDF document contains text. OCR operation will not be applied". Does this mean that "Detect skew of page content" in the "Recognition Options" block and "Fix content skew and incorrect page rotation" in the "Output Options" will also not be applied? This is not entirely obvious. Looking at this window, one might expect that recognition is skipped but alignment is still performed, since alignment is not, strictly speaking, text recognition on images.

To prevent recognition while still performing page alignment, I tried disabling all languages in the "Languages" field of the "Recognition Options" block in the "OCR Pages (Enhanced)" window. However, after I closed the window with the "OK" button, the empty language selection was not saved.

I tried toggling "Put the resulting image(s) as a background" on and off, but in both cases I got
Screenshot_2023-02-09_15-06-46.png
from
Screenshot_2023-02-09_15-06-15.png
(original document contains invisible text)

I know that the tool previously suggested [1] for page alignment (including bitmap and text objects) was "Enhance Scanned Pages" with the "Deskew: On" setting in the "Filters" block. Is it still the all-in-one tool you recommend for batch page alignment in documents with complex formatting?

Also, there is a problem with finding the right tool for aligning the page skew in the Actions Library panel:
Screenshot_2023-02-09_13-42-30.png
Screenshot_2023-02-09_13-42-43.png
[1] "Deskewing with PDF-Tools not working" https://forum.pdf-xchange.com/viewtopic.php?t=35895

Re: Batch fix content skew and incorrect page rotation

Post by TrackerSupp-Daniel »

Hello, Jensen Head
Jensen Head wrote: Thu Feb 09, 2023 10:32 am [...]"Fix content skew and incorrect page rotation" in the "Output Options" will also not be applied?
That is correct, the entire "OCR Pages" action in your tool process is skipped if this is enabled. Effectively, this is an option designed to let you make sure that all your documents have text within them, without needing to manually curate the list. Simply drop the whole folder in, and OCR will be performed only on those that need it.
Jensen Head wrote: Thu Feb 09, 2023 10:32 am I know that the tool previously suggested [1] for page alignment (including bitmap and text objects) was "Enhance Scanned Pages" with the "Deskew: On" setting in the "Filters" block. Is it still the all-in-one tool you recommend for batch page alignment in documents with complex formatting?
Yes. If you need to deskew pages without running OCR, the "Enhance Scanned Pages" tool/action is still what you are looking for. Simply configure it as follows and only deskew will be applied to the pages:
image.png
Jensen Head wrote: Thu Feb 09, 2023 10:32 am Also, there is a problem with finding the right tool for aligning the page skew in the Actions Library panel:
As for this, there is technically no deskew tool, so that is working as intended at the moment. However, I do believe we could improve this to match sub-options of actions even when a term is not in the name of the action. I will pass this feedback along to our Tools team.
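
For readers who prefer to see the behaviour described in this reply as logic, here is a small Python sketch. It is only a toy model of the two actions as explained above, not PDF-Tools code or any PDF-Tools API; the `Doc` class and both function names are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Toy stand-in for a PDF file; not a PDF-Tools object."""
    name: str
    has_text: bool
    deskewed: bool = False
    ocr_done: bool = False


def ocr_pages(doc: Doc, fix_skew: bool = True) -> Doc:
    """Models "OCR Pages" set to "Do not OCR but continue processing"."""
    if doc.has_text:
        # The whole action is skipped, so its skew options do nothing.
        return doc
    if fix_skew:
        doc.deskewed = True
    doc.ocr_done = True
    doc.has_text = True
    return doc


def enhance_scanned_pages(doc: Doc, deskew: bool = True) -> Doc:
    """Models "Enhance Scanned Pages" with only Deskew switched on."""
    if deskew:
        doc.deskewed = True
    return doc


scanned = ocr_pages(Doc("scan.pdf", has_text=False))
texted = ocr_pages(Doc("report.pdf", has_text=True))
fixed = enhance_scanned_pages(Doc("report.pdf", has_text=True))
print(scanned.deskewed, texted.deskewed, fixed.deskewed)  # True False True
```

In this model, a document that already contains text leaves "OCR Pages" untouched, while "Enhance Scanned Pages" deskews regardless of whether text is present, which is why it is the action suggested for batch deskewing without OCR.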

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD
