OCR is autorotating pages 180 degrees

PDF-X OCR SDK is a new product from us, intended to complement our existing PDF and imaging tools and to provide developers with an expanding set of professional tools for Optical Character Recognition tasks.

Moderators: TrackerSupp-Daniel, Tracker Support, Vasyl-Tracker Dev Team, Sean - Tracker, Chris - Tracker Supp, Tracker Supp-Stefan

pmazurk
User
Posts: 27
Joined: Tue Feb 22, 2011 10:25 pm

OCR is autorotating pages 180 degrees

Post by pmazurk » Mon Nov 12, 2018 11:14 pm

We're in the middle of a project that involves creating searchable PDFs out of non-searchable PDFs. We have over 10,000 files to process. We purchased the PDF SDK that includes the OCR module.
Most of our output files are just as we expect them to be. We've had to manually rotate the input files with the PDF-XChange Editor, as they were scanned 90 degrees off. When we feed the correctly rotated files to the OCR engine, we get searchable PDFs.

A small number of the output files are being auto-rotated 180 degrees. The output OCR'd PDFs have pages that are 180 degrees off from their source files. Searchable text is gibberish, as you might expect.

What could be causing the OCR to flip these pages? I tried a few through the PDF Editor, and that also rotated the pages. We're well into this project and this is a disappointing surprise. Any help would be most appreciated.

Sasha - Tracker Dev Team
User
Posts: 4209
Joined: Fri Nov 21, 2014 8:27 am

Re: OCR is autorotating pages 180 degrees

Post by Sasha - Tracker Dev Team » Tue Nov 13, 2018 7:14 am

Hello pmazurk,

Currently we are searching for a solution to this problem.

Cheers,
Alex
Join us at Google+:
https://plus.google.com/+PDFXChangeEditorTS
Subscribe at:
https://www.youtube.com/channel/UC-TwAMNi1haxJ1FX3LvB4CQ

pmazurk
User
Posts: 27
Joined: Tue Feb 22, 2011 10:25 pm

Re: OCR is autorotating pages 180 degrees

Post by pmazurk » Tue Nov 13, 2018 8:54 pm

Is there a targeted release date? Clearly this is a fundamental issue.

Also, is there a workaround, a configuration change, or a switch setting that would stop this? Should I not use Fast Autorotate?

Thanks -

TrackerSupp-Daniel
Site Admin
Posts: 2408
Joined: Wed Jan 03, 2018 6:52 pm

Re: OCR is autorotating pages 180 degrees

Post by TrackerSupp-Daniel » Tue Nov 13, 2018 11:39 pm

Hello pmazurk,

We do not have a set release date for this fix in particular; apologies for the inconvenience. Please see my email for more information.

I will leave the Fast Autorotate question for my development colleagues to answer, as I am not a developer and do not know the answer.
Daniel McIntyre
Support Technician
Tracker Software Products (Canada) LTD

Sales: +1 (250) 324-1621
Fax: +1 (250) 324-1623

pmazurk
User
Posts: 27
Joined: Tue Feb 22, 2011 10:25 pm

Re: OCR is autorotating pages 180 degrees

Post by pmazurk » Fri Nov 16, 2018 5:17 pm

Is it possible that the OCR operation is creating a new image layer that is rotated while the original layer is not? We've observed that the OCR is actually correct, in that searching for a word finds it in a location that, while incorrect for the visible image layer, would be correct if the image layer were properly rotated.

Also - if this is the case, can we detect and remove the incorrectly rotated image layer?

Sasha - Tracker Dev Team
User
Posts: 4209
Joined: Fri Nov 21, 2014 8:27 am

Re: OCR is autorotating pages 180 degrees

Post by Sasha - Tracker Dev Team » Sat Nov 17, 2018 6:55 am

Hello pmazurk,

Please check this thread out - hopefully this can help:
viewtopic.php?f=42&t=31745#p129064

Cheers,
Alex

pmazurk
User
Posts: 27
Joined: Tue Feb 22, 2011 10:25 pm

Re: OCR is autorotating pages 180 degrees

Post by pmazurk » Tue Nov 20, 2018 11:58 pm

That worked! Turns out including the PDFXOCR_Funcs.OCR_ImageProcessingFlags.OCR_Image_SuppressOutput option is key. Docs are coming out of the converter as expected. I left the PDFXOCR_Funcs.OCR_ImageProcessingFlags.OCR_Image_FastAutorotate option in, and the pages are being deskewed but not over-rotated.

Thanks for your help-
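For readers following along, the fix described above amounts to OR-ing two image-processing flags together before calling the OCR engine. The following is only a minimal Python sketch of that bitwise combination, not the SDK's API: the flag names `OCR_Image_SuppressOutput` and `OCR_Image_FastAutorotate` come from the post, while the class `OCRImageProcessingFlags`, the helper `build_flags`, and the numeric values are hypothetical placeholders. The real values must be taken from the SDK's own headers or wrapper.

```python
from enum import IntFlag


class OCRImageProcessingFlags(IntFlag):
    """Illustrative stand-in for PDFXOCR_Funcs.OCR_ImageProcessingFlags.

    The numeric values are placeholders chosen for this sketch,
    NOT the values defined by the real SDK.
    """
    NONE = 0
    OCR_Image_FastAutorotate = 0x01  # placeholder value (assumption)
    OCR_Image_SuppressOutput = 0x02  # placeholder value (assumption)


def build_flags(fast_autorotate: bool = True,
                suppress_output: bool = True) -> OCRImageProcessingFlags:
    """Combine the selected options into one bitmask (hypothetical helper)."""
    flags = OCRImageProcessingFlags.NONE
    if fast_autorotate:
        flags |= OCRImageProcessingFlags.OCR_Image_FastAutorotate
    if suppress_output:
        flags |= OCRImageProcessingFlags.OCR_Image_SuppressOutput
    return flags


# The combination reported to work in this thread: both options set.
flags = build_flags()
print(OCRImageProcessingFlags.OCR_Image_SuppressOutput in flags)
```

In the actual SDK the combined value would be passed wherever the OCR call accepts its image-processing flags; that call is not shown here because its signature does not appear in this thread.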

Sasha - Tracker Dev Team
User
Posts: 4209
Joined: Fri Nov 21, 2014 8:27 am
Contact:

Re: OCR is autorotating pages 180 degrees

Post by Sasha - Tracker Dev Team » Wed Nov 21, 2018 8:54 am

:)
