OCR is rotating the originals after processing ~ 15 degrees

PDF-X OCR SDK is a new product from us, intended to complement our existing PDF and imaging tools and to provide developers with an expanding set of professional tools for Optical Character Recognition tasks.

Moderators: Tracker Support, TrackerSupp-Daniel, Chris - Tracker Supp, Vasyl-Tracker Dev Team, Sean - Tracker, Tracker Supp-Stefan

michipapa
User
Posts: 41
Joined: Tue Dec 08, 2009 10:44 pm

Post by michipapa » Tue Nov 13, 2018 3:42 pm

Hi Tracker,

We have found that some of our 1000 PDFs are damaged after processing.
The image is rotated by ~15 degrees. It does not happen with all PDFs; it looks like it only happens when the image is something like a Google Maps map.
We can reproduce the error. I have sent you the files via email. [Files attached to post by administrator]

regards Michael
Attachments
before.pdf
(69.77 KiB) Downloaded 54 times
after.pdf
(66.69 KiB) Downloaded 50 times

TrackerSupp-Daniel
Site Admin
Posts: 2168
Joined: Wed Jan 03, 2018 6:52 pm

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by TrackerSupp-Daniel » Tue Nov 13, 2018 5:32 pm

Hello Michael,
Thank you for the email. I have attached those files to your forum post so that all of our support members can easily access them in the future.

Looking this over, I notice that the scanned document was created with build 326.1 of our software. Could you try updating that machine to 327.1 and let us know if the handling improves? If the issue persists after the update, please send screenshots of your scanning settings in the Editor so that we can reproduce this.

You can attach images, pdf files, and zip files to your forum post at any time by dragging the files from your computer, over the text window while typing your reply.
Daniel McIntyre
Support Technician
Tracker Software Products (Canada) LTD

Sales: +1 (250) 324-1621
Fax: +1 (250) 324-1623

michipapa
User
Posts: 41
Joined: Tue Dec 08, 2009 10:44 pm

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by michipapa » Tue Nov 13, 2018 6:36 pm

Hi Support,

1. I tried it with both 326.1 and 327.1, with the same result.
2. We are talking about the OCR.dll. In the Editor's OCR, everything is OK.
3. My settings are:

Options.ImageFlags=BinaryOR(0x0001,0x0048)
Options.DataPath=sOCR_DataPath
Options.Lang=2 //"PXO_German"
Options.RegionMode=1
Options.accMode=0
Options.Blacklist=""
Options.Whitelist=""
Options.raster_dpi=300

regards Michael

Willy Van Nuffel
User
Posts: 1374
Joined: Wed Jan 18, 2006 12:10 pm

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by Willy Van Nuffel » Tue Nov 13, 2018 6:49 pm

The problem is indeed still reproducible with the latest build V7.327.1 of PDF-XChange Editor, in the following way:
- open the "before.pdf"
- apply OCR via the Document-menu > OCR Pages... > Accuracy=Medium, Output Type=Create New Searchable PDF, Auto Deskew=ON, and OK

The rotation is far smaller when the new OCR feature is used via the Document-menu > Enhance Scanned Pages... > only set Deskew=ON, and OK

The reason for the rotation is clearly the "Deskew"-process.
I would say that if Deskew is not really needed for your OCR processing, it is best to turn it off as a (temporary) workaround.

In the given example there is no text at all, but we may suppose that maps will contain street names and other text, mostly positioned at mixed angles on the page. So, how should the OCR process know which angle is the correct one for the page?

Now I see that this concerns the SDK environment. Is there perhaps also an option or parameter for Deskew there?

michipapa
User
Posts: 41
Joined: Tue Dec 08, 2009 10:44 pm

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by michipapa » Wed Nov 14, 2018 11:53 am

Hi Tracker,

Any solutions or ideas?

We have stopped our online OCR service on our upload server until we have a solution.

regards Michael

Sasha - Tracker Dev Team
User
Posts: 4056
Joined: Fri Nov 21, 2014 8:27 am

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by Sasha - Tracker Dev Team » Thu Nov 15, 2018 8:19 am

Hello Michael,

Well, you can leave the original image as is and just place the recognized text on top of it. For that, you will have to specify these flags:

(uint)PDFXOCR.PDFXOCR_Funcs.OCR_ImageProcessingFlags.OCR_Content_Original | (uint)PDFXOCR.PDFXOCR_Funcs.OCR_ImageProcessingFlags.OCR_Image_SuppressOutput 
Cheers,
Alex
Join us at Google+:
https://plus.google.com/+PDFXChangeEditorTS
Subscribe at:
https://www.youtube.com/channel/UC-TwAMNi1haxJ1FX3LvB4CQ

michipapa
User
Posts: 41
Joined: Tue Dec 08, 2009 10:44 pm

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by michipapa » Thu Nov 15, 2018 10:05 am

Hi Alex,



I already use:

Options.ImageFlags=BinaryOR(0x0001,0x0048)

In the other thread a few weeks ago, you advised using:

OCR_Content_Original = 0x0040 // output original content instead of image

As I understand it, that means the original is not affected and only the text layer is added.

regards Michael

Sasha - Tracker Dev Team
User
Posts: 4056
Joined: Fri Nov 21, 2014 8:27 am

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by Sasha - Tracker Dev Team » Thu Nov 15, 2018 10:29 am

Hello Michael,

Well, you are using the OCR_Image_Autorotate = 0x0001 flag, thus the image is being rotated. Try launching the OCR without it.
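
The bit arithmetic behind this advice can be sketched in a few lines of Python. This is only an illustration of the masking, not SDK code: the two named values are the ones quoted in this thread, and the remaining bit (0x0008) is not named anywhere here, so it is left unnamed.

```python
# Flag values quoted in this thread.
OCR_IMAGE_AUTOROTATE = 0x0001
OCR_CONTENT_ORIGINAL = 0x0040

# Value from the posted settings: BinaryOR(0x0001, 0x0048)
current_flags = 0x0001 | 0x0048


def set_bits(value):
    """Return the individual bits that are set in value, lowest first."""
    return [1 << i for i in range(value.bit_length()) if value & (1 << i)]


# Clear only the autorotate bit and leave every other bit untouched.
without_autorotate = current_flags & ~OCR_IMAGE_AUTOROTATE

print([hex(b) for b in set_bits(current_flags)])  # ['0x1', '0x8', '0x40']
print(hex(without_autorotate))                    # 0x48
```

In other words, passing 0x0048 alone as Options.ImageFlags keeps the other two bits and drops only the autorotate bit.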

Cheers,
Alex

michipapa
User
Posts: 41
Joined: Tue Dec 08, 2009 10:44 pm

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by michipapa » Thu Nov 15, 2018 12:02 pm

Hi Sasha,

OK, it works. But what disadvantages do I have without autorotate?

And why does
> OCR_Content_Original = 0x0040 // output original content instead of image
not do what I expect? I only need the text layer in the original file ...




regards Michael

Sasha - Tracker Dev Team
User
Posts: 4056
Joined: Fri Nov 21, 2014 8:27 am

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by Sasha - Tracker Dev Team » Thu Nov 15, 2018 1:22 pm

Hello Michael,

If the rotation is less than 10 degrees, then everything should work OK; if it is 10-15 degrees or more, there can be some problems.
As for the flags, the OCR_Content_Original flag leaves the original content as the background, and the OCR_Image_SuppressOutput flag prevents the OCR-rendered image from being added as the background. Both are needed, because if you specify only the OCR_Content_Original flag, you will have both the old image and the new OCR-rendered image on your page.
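
As a rough illustration of that description, here is a toy Python model of which layers end up on the page for a given flag combination. It is not SDK code. OCR_Content_Original = 0x0040 is the value quoted earlier in this thread; the value used for OCR_Image_SuppressOutput (0x0008) is an assumption made for the sketch and should be checked against the SDK headers. The function page_layers is hypothetical.

```python
OCR_CONTENT_ORIGINAL = 0x0040       # value quoted earlier in this thread
OCR_IMAGE_SUPPRESS_OUTPUT = 0x0008  # ASSUMED value, not confirmed in this thread


def page_layers(flags):
    """Toy model of the flag description above (hypothetical helper)."""
    layers = []
    if flags & OCR_CONTENT_ORIGINAL:
        # The original page content stays as the background.
        layers.append("original content")
    if not flags & OCR_IMAGE_SUPPRESS_OUTPUT:
        # Without the suppress flag, the OCR-rendered image is also added.
        layers.append("OCR-rendered image")
    layers.append("recognized text")
    return layers


# Only OCR_Content_Original: old image plus the new OCR-rendered image.
print(page_layers(OCR_CONTENT_ORIGINAL))
# Both flags: original content with just the text layer on top.
print(page_layers(OCR_CONTENT_ORIGINAL | OCR_IMAGE_SUPPRESS_OUTPUT))
```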

Cheers,
Alex

michipapa
User
Posts: 41
Joined: Tue Dec 08, 2009 10:44 pm

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by michipapa » Thu Nov 15, 2018 1:29 pm

Hi Alex,

thx.

regards Michael

Sasha - Tracker Dev Team
User
Posts: 4056
Joined: Fri Nov 21, 2014 8:27 am

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by Sasha - Tracker Dev Team » Thu Nov 15, 2018 2:33 pm

:)

Tracker Supp-Stefan
Site Admin
Posts: 13296
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: OCR is rotating the originals after processing ~ 15 degrees

Post by Tracker Supp-Stefan » Tue Jan 29, 2019 12:51 pm

Hello Timothy,

Welcome to our forums.
This is an SDK (developer) topic. Are you also using our SDK products, or are you using the OCR tool in our standalone Editor?

Regards,
Stefan
