OCR is autorotating pages 180 degrees

PDF-X OCR SDK is a new product from us, intended to complement our existing PDF and imaging tools and to provide the developer with an expanding set of professional tools for Optical Character Recognition tasks.

Moderators: TrackerSupp-Daniel, Tracker Support, Vasyl-Tracker Dev Team, Chris - Tracker Supp, Sean - Tracker, Tracker Supp-Stefan

pmazurk
User
Posts: 27
Joined: Tue Feb 22, 2011 10:25 pm

OCR is autorotating pages 180 degrees

Post by pmazurk »

We're in the middle of a project that involves creating searchable PDFs out of non-searchable PDFs. We have over 10,000 files to process. We purchased the PDF SDK that includes the OCR module.
Most of our output files are just as we expect them to be. We've had to manually rotate the input with the PDF-XChange Editor, as the pages were scanned 90 degrees off. When we feed the correctly rotated files to the OCR engine, we get searchable PDFs.

A small number of the output files are being auto-rotated 180 degrees. The output OCR'd PDFs have pages that are 180 degrees off from their source files. Searchable text is gibberish, as you might expect.

What could be causing the OCR to flip these pages? I tried a few through the PDF Editor, and that also rotated the pages. We're well into this project and this is a disappointing surprise. Any help would be most appreciated.
Sasha - Tracker Dev Team
User
Posts: 5522
Joined: Fri Nov 21, 2014 8:27 am

Re: OCR is autorotating pages 180 degrees

Post by Sasha - Tracker Dev Team »

Hello pmazurk,

Currently we are searching for a solution to this problem.

Cheers,
Alex
Subscribe at:
https://www.youtube.com/channel/UC-TwAMNi1haxJ1FX3LvB4CQ
pmazurk
User
Posts: 27
Joined: Tue Feb 22, 2011 10:25 pm

Re: OCR is autorotating pages 180 degrees

Post by pmazurk »

Is there a targeted release date? Clearly this is a fundamental issue.

Also - is there a workaround, configuration change, or switch setting that would stop this? Should I not use Fast Autorotate?

Thanks -
TrackerSupp-Daniel
Site Admin
Posts: 8439
Joined: Wed Jan 03, 2018 6:52 pm

Re: OCR is autorotating pages 180 degrees

Post by TrackerSupp-Daniel »

Hello pmazurk,

We do not have a set release date for this fix in particular; apologies for the inconvenience. Please see my email for more information.

I will leave the Fast Autorotate question for my development colleagues to answer, as I am not a developer and do not know the answer there.
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

+++++++++++++++++++++++++++++++++++
Our website domain and email address have changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com
pmazurk
User
Posts: 27
Joined: Tue Feb 22, 2011 10:25 pm

Re: OCR is autorotating pages 180 degrees

Post by pmazurk »

Is it possible that the OCR operation is creating a new image layer that is rotated while the original layer is not? We've observed that the OCR is actually correct, in that searching for a word finds it in a location that, while incorrect for the visible image layer, would be correct if the image layer were properly rotated.

Also - if this is the case, can we detect and remove the incorrectly rotated image layer?
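
To make the geometry of that hypothesis concrete, here is a minimal sketch of how a word's bounding box moves when a page is turned 180 degrees. It does not use the SDK and does not confirm the cause; the page size, the box coordinates, and the `rotate_180` helper are all made-up illustrations.

```python
def rotate_180(box, page_width, page_height):
    """Map a box (x0, y0, x1, y1) to where it lands after a 180-degree page rotation."""
    x0, y0, x1, y1 = box
    # A 180-degree turn mirrors both axes, so the corners swap roles.
    return (page_width - x1, page_height - y1,
            page_width - x0, page_height - y0)


# Hypothetical page (US Letter, in points) and a hypothetical word near the top left.
page_w, page_h = 612.0, 792.0
word_box = (72.0, 700.0, 144.0, 712.0)

flipped = rotate_180(word_box, page_w, page_h)
print(flipped)  # (468.0, 80.0, 540.0, 92.0)

# Rotating twice returns the original box, as expected for 180 degrees.
print(rotate_180(flipped, page_w, page_h) == word_box)  # True
```

If the text layer and the visible image disagree by exactly this mapping, a search hit would highlight the diagonally opposite spot on the page, which matches the symptom described.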
Sasha - Tracker Dev Team
User
Posts: 5522
Joined: Fri Nov 21, 2014 8:27 am

Re: OCR is autorotating pages 180 degrees

Post by Sasha - Tracker Dev Team »

Hello pmazurk,

Please check this thread out - hopefully this can help:
viewtopic.php?f=42&t=31745#p129064

Cheers,
Alex
pmazurk
User
Posts: 27
Joined: Tue Feb 22, 2011 10:25 pm

Re: OCR is autorotating pages 180 degrees

Post by pmazurk »

That worked! Turns out including the PDFXOCR_Funcs.OCR_ImageProcessingFlags.OCR_Image_SuppressOutput option is key. Docs are coming out of the converter as expected. I left the PDFXOCR_Funcs.OCR_ImageProcessingFlags.OCR_Image_FastAutorotate option in, and the pages are being deskewed but not over-rotated.
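
For readers unfamiliar with combining flag options, the sketch below shows the general pattern of OR-ing two flags together. It is only an illustration: the `OcrImageProcessingFlags` class is a hypothetical stand-in, the member names follow the post above, and the numeric values are placeholders rather than the SDK's real constants. Check the SDK headers or documentation for the actual values and call signature.

```python
from enum import IntFlag


class OcrImageProcessingFlags(IntFlag):
    """Hypothetical stand-in for PDFXOCR_Funcs.OCR_ImageProcessingFlags.

    Names mirror the post; values are placeholders, NOT the SDK's constants.
    """
    NONE = 0
    FAST_AUTOROTATE = 0x01  # placeholder value
    SUPPRESS_OUTPUT = 0x02  # placeholder value


def build_flags(keep_autorotate: bool = True) -> OcrImageProcessingFlags:
    """Always include SUPPRESS_OUTPUT; optionally keep FAST_AUTOROTATE."""
    flags = OcrImageProcessingFlags.SUPPRESS_OUTPUT
    if keep_autorotate:
        flags |= OcrImageProcessingFlags.FAST_AUTOROTATE
    return flags


flags = build_flags()
print(OcrImageProcessingFlags.SUPPRESS_OUTPUT in flags)  # True
print(OcrImageProcessingFlags.FAST_AUTOROTATE in flags)  # True
```

The combined value would then be passed wherever the OCR call accepts its image-processing flags; that call is not shown here because its signature is not given in this thread.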

Thanks for your help-
Sasha - Tracker Dev Team
User
Posts: 5522
Joined: Fri Nov 21, 2014 8:27 am

Re: OCR is autorotating pages 180 degrees

Post by Sasha - Tracker Dev Team »

:)