We're in the middle of a project that involves creating searchable PDFs out of non-searchable PDFs. We have over 10,000 files to process. We purchased the PDF SDK that includes the OCR module.
Most of our output files are just as we expect them to be. We've had to manually rotate the input files with the PDF-XChange Editor, as they were scanned 90 degrees off. When we feed the correctly rotated files to the OCR engine, we get searchable PDFs.
A small number of the output files are being auto-rotated 180 degrees: the OCR'd output PDFs have pages that are 180 degrees off from their source files. The searchable text is gibberish, as you might expect.
What could be causing the OCR to flip these pages? I tried a few through the PDF Editor, and that also rotated the pages. We're well into this project and this is a disappointing surprise. Any help would be most appreciated.
OCR is autorotating pages 180 degrees
-
- User
- Posts: 5522
- Joined: Fri Nov 21, 2014 8:27 am
- Contact:
Re: OCR is autorotating pages 180 degrees
Hello pmazurk,
Currently we are searching for a solution to this problem.
Cheers,
Alex
Subscribe at:
https://www.youtube.com/channel/UC-TwAMNi1haxJ1FX3LvB4CQ
Re: OCR is autorotating pages 180 degrees
Is there a targeted release date? Clearly this is a fundamental issue.
Also - is there a workaround or a configuration change or a switch setting that would stop this? Should I not use Fast Autorotate?
Thanks -
- TrackerSupp-Daniel
- Site Admin
- Posts: 8610
- Joined: Wed Jan 03, 2018 6:52 pm
Re: OCR is autorotating pages 180 degrees
Hello pmazurk,
We do not have a set release date for this particular fix; apologies for the inconvenience. Please see my email for more information.
I will leave the Fast Autorotate question for my development colleagues to answer, as I am not a developer, and do not know the answer there.
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD
+++++++++++++++++++++++++++++++++++
Our website domain and email address have changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com
Re: OCR is autorotating pages 180 degrees
Is it possible that the OCR operation is creating a new image layer that is rotated while the original layer is not? We've observed that the OCR is actually correct, in that searching for a word finds it in a location that, while incorrect for the visible image layer, would be correct if the image layer were properly rotated.
Also - if this is the case, can we detect and remove the incorrectly rotated image layer?
Re: OCR is autorotating pages 180 degrees
Hello pmazurk,
Please check this thread out - hopefully this can help:
viewtopic.php?f=42&t=31745#p129064
Cheers,
Alex
Re: OCR is autorotating pages 180 degrees
That worked! Turns out including the PDFXOCR_Funcs.OCR_ImageProcessingFlags.OCR_Image_SuppressOutput option is key. Docs are coming out of the converter as expected. I left the PDFXOCR_Funcs.OCR_ImageProcessingFlags.OCR_Image_FastAutorotate option in, and the pages are being deskewed but not over-rotated.
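For anyone landing here later, here is a rough sketch of the flag combination described above. It is only an illustration: the enum is a stand-in that borrows the two flag names from PDFXOCR_Funcs.OCR_ImageProcessingFlags, the numeric values are placeholders rather than the SDK's real values, and build_image_flags is a hypothetical helper. The actual SDK call that takes these flags is not shown.

```python
from enum import Flag


class OcrImageProcessingFlags(Flag):
    """Stand-in for PDFXOCR_Funcs.OCR_ImageProcessingFlags.

    Only the two names mentioned in this thread are modelled, and the
    numeric values are placeholders, NOT the values used by the SDK.
    """
    NONE = 0
    OCR_Image_FastAutorotate = 1
    OCR_Image_SuppressOutput = 2


def build_image_flags(fast_autorotate=True, suppress_output=True):
    """Hypothetical helper: OR together the image-processing flags to pass on."""
    flags = OcrImageProcessingFlags.NONE
    if fast_autorotate:
        flags |= OcrImageProcessingFlags.OCR_Image_FastAutorotate
    if suppress_output:
        flags |= OcrImageProcessingFlags.OCR_Image_SuppressOutput
    return flags


# The combination that worked in this thread: both options enabled.
flags = build_image_flags()
print(flags)
```

In real code you would use the SDK's own enum and pass the combined value wherever the OCR call expects its image-processing flags.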
Thanks for your help-