OCR is rotating the originals after processing ~ 15 degrees

The PDF-X OCR SDK is a new product from us, intended to complement our existing PDF and imaging tools and to provide developers with an expanding set of professional tools for Optical Character Recognition tasks.

Moderators: TrackerSupp-Daniel, Tracker Support, Vasyl-Tracker Dev Team, Chris - Tracker Supp, Sean - Tracker, Tracker Supp-Stefan

michipapa
User
Posts: 41
Joined: Tue Dec 08, 2009 10:44 pm

OCR is rotating the originals after processing ~ 15 degrees

Post by michipapa »

Hi Tracker,

We have found that some of our 1000 PDFs are damaged after processing.
The image is rotated by ~15 degrees. It does not happen with all PDFs; it looks like it only happens when the image is something like a Google Maps map.
We can reproduce the error. I am sending you the files via email. [Files attached to post by administrator]

regards Michael
Attachments
before.pdf
(69.77 KiB) Downloaded 266 times
after.pdf
(66.69 KiB) Downloaded 253 times
TrackerSupp-Daniel
Site Admin
Posts: 8436
Joined: Wed Jan 03, 2018 6:52 pm

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by TrackerSupp-Daniel »

Hello Michael,
Thank you for the email. I have attached those files to your forum post so that all of our support members can easily access them in the future.

Looking this over, I notice that the scanned document was created with build 326.1 of our software. Could you try updating that machine to 327.1 and let us know if the handling improves? If the issue persists after the update, please send screenshots of your scanning settings in the Editor so that we can reproduce this.

You can attach images, pdf files, and zip files to your forum post at any time by dragging the files from your computer, over the text window while typing your reply.
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

+++++++++++++++++++++++++++++++++++
Our website domain and email address have changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com
michipapa
User
Posts: 41
Joined: Tue Dec 08, 2009 10:44 pm

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by michipapa »

Hi Support,

1. I tried it with 326.1 and 327.1, with the same result.
2. We are talking about the OCR.dll; in the Editor's OCR everything is OK.
3. My settings are:

Options.ImageFlags=BinaryOR(0x0001,0x0048)
Options.DataPath=sOCR_DataPath
Options.Lang=2 //"PXO_German"
Options.RegionMode=1
Options.accMode=0
Options.Blacklist=""
Options.Whitelist=""
Options.raster_dpi=300

regards Michael
Willy Van Nuffel
User
Posts: 2347
Joined: Wed Jan 18, 2006 12:10 pm

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by Willy Van Nuffel »

The problem is indeed still reproducible with the latest build, V7.327.1, of PDF-XChange Editor, in the following way:
- open the "before.pdf"
- apply OCR via the Document-menu > OCR Pages... > Accuracy=Medium, Output Type=Create New Searchable PDF, Auto Deskew=ON, and OK

The rotation is far smaller when the new OCR feature is used via the Document menu > Enhance Scanned Pages... > set only Deskew=ON, and OK

The reason for the rotation is clearly the "Deskew" process.
I would say that if Deskew is not really needed for your OCR processing, you can best turn it off as a (temporary) workaround.

In the given example there is no text at all, but we may suppose that maps will contain street names and other text, mostly positioned at mixed angles on the page. So how should the OCR process know how to rotate the page to the correct angle?

Now I see that this is about the SDK environment; probably there is also an option or parameter for Deskew?
michipapa
User
Posts: 41
Joined: Tue Dec 08, 2009 10:44 pm

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by michipapa »

Hi Tracker,

Any solutions or ideas?

We have stopped our online OCR service on our upload server until we have a solution.

regards Michael
Sasha - Tracker Dev Team
User
Posts: 5522
Joined: Fri Nov 21, 2014 8:27 am
Contact:

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by Sasha - Tracker Dev Team »

Hello Michael,

Well, you can leave the original image as is and just place the recognized text on top of it. For that, you will have to specify these flags:

(uint)PDFXOCR.PDFXOCR_Funcs.OCR_ImageProcessingFlags.OCR_Content_Original | (uint)PDFXOCR.PDFXOCR_Funcs.OCR_ImageProcessingFlags.OCR_Image_SuppressOutput 
Cheers,
Alex
Subscribe at:
https://www.youtube.com/channel/UC-TwAMNi1haxJ1FX3LvB4CQ
michipapa
User
Posts: 41
Joined: Tue Dec 08, 2009 10:44 pm

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by michipapa »

Hi Alex,



I already use:

Options.ImageFlags=BinaryOR(0x0001,0x0048)

In the other thread we had weeks ago, you advised using:

OCR_Content_Original = 0x0040 // output original content instead of image

As I understand it, that means the original is not affected and only the text layer is added.

regards Michael
Sasha - Tracker Dev Team
User
Posts: 5522
Joined: Fri Nov 21, 2014 8:27 am
Contact:

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by Sasha - Tracker Dev Team »

Hello Michael,

Well, you are using the OCR_Image_Autorotate = 0x0001 flag, thus the image is being rotated. Try launching the OCR without it.
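
The arithmetic behind this can be checked with a minimal sketch. It is written in Python purely for illustration and does not call the OCR SDK. The only flag names and values it uses are the two quoted in this thread (OCR_Image_Autorotate = 0x0001 and OCR_Content_Original = 0x0040). The remaining 0x0008 bit of the 0x0048 operand is not named anywhere in this thread, so the sketch leaves it unlabeled rather than guessing.

```python
# Flag values quoted in this thread; nothing here calls the OCR SDK.
OCR_IMAGE_AUTOROTATE = 0x0001   # value stated in this reply
OCR_CONTENT_ORIGINAL = 0x0040   # value quoted by Michael from the earlier thread

KNOWN_FLAGS = {
    OCR_IMAGE_AUTOROTATE: "OCR_Image_Autorotate",
    OCR_CONTENT_ORIGINAL: "OCR_Content_Original",
}


def describe_flags(value: int) -> list[str]:
    """List every set bit, using a name where this thread provides one."""
    names = []
    bit = 1
    while bit <= value:
        if value & bit:
            names.append(KNOWN_FLAGS.get(bit, f"unnamed bit 0x{bit:04X}"))
        bit <<= 1
    return names


def without_autorotate(value: int) -> int:
    """Clear only the autorotate bit and keep all other bits unchanged."""
    return value & ~OCR_IMAGE_AUTOROTATE


# Michael's setting: BinaryOR(0x0001, 0x0048)
current_flags = 0x0001 | 0x0048
new_flags = without_autorotate(current_flags)

print(f"0x{current_flags:04X}", describe_flags(current_flags))
print(f"0x{new_flags:04X}", describe_flags(new_flags))
```

In the notation of the settings posted earlier, dropping the 0x0001 operand from BinaryOR(0x0001,0x0048) leaves 0x0048.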

Cheers,
Alex
michipapa
User
Posts: 41
Joined: Tue Dec 08, 2009 10:44 pm

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by michipapa »

Hi Sasha,

OK, it works. But what disadvantages do I have without autorotate?

And why does
> OCR_Content_Original = 0x0040 // output original content instead of image
not do what I expect? I only need the text layer in the original file ...




regards Michael
Sasha - Tracker Dev Team
User
Posts: 5522
Joined: Fri Nov 21, 2014 8:27 am
Contact:

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by Sasha - Tracker Dev Team »

Hello Michael,

If the rotation is less than 10 degrees, then everything should work OK; if it's 10-15 degrees or more, then there can be some problems.
As for the flags, the OCR_Content_Original flag will leave the original content as the background, and the OCR_Image_SuppressOutput flag will not add the OCR rendered image as the background. Both are needed, because if you specify only the OCR_Content_Original flag, you will have both the old image and the new OCR rendered image on your page.
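
The flag combinations described above can be summarized as a small toy model. This is only a sketch in Python of the description in this reply, not SDK code: the function name `page_layers`, the layer labels, the bottom-to-top ordering, and the result for the combinations not discussed in this thread are all assumptions made for illustration.

```python
def page_layers(content_original: bool, suppress_output: bool) -> list[str]:
    """Toy model: which layers end up on the page, listed bottom to top.

    content_original stands for OCR_Content_Original being set,
    suppress_output for OCR_Image_SuppressOutput being set.
    The ordering of the two image layers is an assumption.
    """
    layers = []
    if content_original:
        # The original page content is kept as the background.
        layers.append("original content")
    if not suppress_output:
        # The image rendered (and possibly rotated) by the OCR step is added.
        layers.append("OCR rendered image")
    # The recognized text is placed on top.
    layers.append("recognized text")
    return layers


# Only OCR_Content_Original: old image plus the new OCR rendered image.
print(page_layers(content_original=True, suppress_output=False))

# Both flags: original content with just the text layer on top.
print(page_layers(content_original=True, suppress_output=True))
```

The second call corresponds to the case Michael is asking for, where the original file only gains a text layer.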

Cheers,
Alex
michipapa
User
Posts: 41
Joined: Tue Dec 08, 2009 10:44 pm

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by michipapa »

Hi Alex,

thx.

regards Michael
Sasha - Tracker Dev Team
User
Posts: 5522
Joined: Fri Nov 21, 2014 8:27 am
Contact:

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by Sasha - Tracker Dev Team »

:)
Tracker Supp-Stefan
Site Admin
Posts: 17818
Joined: Mon Jan 12, 2009 8:07 am
Location: London
Contact:

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by Tracker Supp-Stefan »

Hello Timothy,

Welcome to our forums.
This is an SDK (developer) topic. Are you also using our SDK products, or are you using the OCR tool in our standalone Editor?

Regards,
Stefan