OCR is rotating the originals after processing ~ 15 degrees

Posted: Tue Nov 13, 2018 3:42 pm
by michipapa
Hi Tracker,

We have found that some of our 1000 PDFs are damaged after processing.
The image is rotated by about 15 degrees. It does not happen with all PDFs; it looks like it only happens when the image is something like a Google Maps map.
We can reproduce the error. I am sending you the files via email. [Files attached to post by administrator]

regards Michael

Re: OCR is rotating the originals after processing ~ 15 degrees

Posted: Tue Nov 13, 2018 5:32 pm
by TrackerSupp-Daniel
Hello Michael,
Thank you for the email. I have attached those files to your forum post so that all of our support members can easily access them in the future.

Looking this over, I notice that the scanned document was created with build 326.1 of our software. Could you try updating that machine to 327.1 and let us know if the handling improves? If the issue persists after the update, please send screenshots of your scanning settings in the Editor so that we can reproduce this.

You can attach images, PDF files, and zip files to your forum post at any time by dragging the files from your computer over the text window while typing your reply.

Re: OCR is rotating the originals after processing ~ 15 degrees

Posted: Tue Nov 13, 2018 6:36 pm
by michipapa
Hi Support,

1. I tried it with both 326.1 and 327.1, with the same result.
2. We are talking about the OCR.dll. In the Editor's OCR everything is OK.
3. My settings are:

Options.ImageFlags=BinaryOR(0x0001,0x0048)
Options.DataPath=sOCR_DataPath
Options.Lang=2 //"PXO_German"
Options.RegionMode=1
Options.accMode=0
Options.Blacklist=""
Options.Whitelist=""
Options.raster_dpi=300

regards Michael
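
For readers following the settings above: `BinaryOR(0x0001,0x0048)` combines several bit flags into one value. The following is a minimal Python sketch, not SDK code, of how that value breaks down into individual bits. Only the two flag names that are quoted later in this thread are attached; the helper names `set_bits` and `describe` are made up for illustration, and the 0x0008 bit is left unnamed because the thread does not identify it.

```python
# Flag names and values as quoted later in this thread.
# Any other bit is deliberately left unnamed.
KNOWN_FLAGS = {
    0x0001: "OCR_Image_Autorotate",
    0x0040: "OCR_Content_Original",
}

def set_bits(value):
    """Return the individual bits that are set in value, lowest bit first."""
    return [1 << i for i in range(value.bit_length()) if value & (1 << i)]

def describe(value):
    """Return one text line per set bit, with a name where the thread gives one."""
    return [
        f"0x{bit:04X} {KNOWN_FLAGS.get(bit, '(not named in this thread)')}"
        for bit in set_bits(value)
    ]

# Same arithmetic as BinaryOR(0x0001, 0x0048) in the settings above.
image_flags = 0x0001 | 0x0048

for line in describe(image_flags):
    print(line)
```

The combined value is 0x0049, so the autorotate bit (0x0001) is part of what gets passed to the OCR call.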

Re: OCR is rotating the originals after processing ~ 15 degrees

Posted: Tue Nov 13, 2018 6:49 pm
by Willy Van Nuffel
The problem is indeed still reproducible with the latest build, V7.327.1, of PDF-XChange Editor, as follows:
- open "before.pdf"
- apply OCR via the Document menu > OCR Pages... > Accuracy=Medium, Output Type=Create New Searchable PDF, Auto Deskew=ON, and OK

The rotation is far smaller when the new OCR feature is used via the Document menu > Enhance Scanned Pages... > only set Deskew=ON, and OK

The reason for the rotation is clearly the "Deskew" process.
If Deskew is not really needed for your OCR processing, I would suggest turning it off as a (temporary) workaround.

In the given example there is no text at all, but we may suppose that maps will contain street names and other text that is mostly positioned at mixed angles on the page. So how should the OCR process know how to turn the page to the correct angle?

Now I see that this is about the SDK environment; there is probably also an option or parameter for Deskew?

Re: OCR is rotating the originals after processing ~ 15 degrees

Posted: Wed Nov 14, 2018 11:53 am
by michipapa
Hi Tracker,

Any solutions or ideas?

We have stopped the online OCR service on our upload server until we have a solution.

regards Michael

Re: OCR is rotating the originals after processing ~ 15 degrees

Posted: Thu Nov 15, 2018 8:19 am
by Sasha - Tracker Dev Team
Hello Michael,

Well, you can leave the original image as is and just place the recognized text on top of it. For that, you will have to specify these flags:

Code:

(uint)PDFXOCR.PDFXOCR_Funcs.OCR_ImageProcessingFlags.OCR_Content_Original | (uint)PDFXOCR.PDFXOCR_Funcs.OCR_ImageProcessingFlags.OCR_Image_SuppressOutput

Cheers,
Alex

Re: OCR is rotating the originals after processing ~ 15 degrees

Posted: Thu Nov 15, 2018 10:05 am
by michipapa
Hi Alex,

I already use:

Options.ImageFlags=BinaryOR(0x0001,0x0048)

In the other thread we had weeks ago, you advised me to use:

OCR_Content_Original = 0x0040 // output original content instead of image

In my opinion, that means the original is not affected and only the text layer is added.

regards Michael

Re: OCR is rotating the originals after processing ~ 15 degrees

Posted: Thu Nov 15, 2018 10:29 am
by Sasha - Tracker Dev Team
Hello Michael,

Well, you are using the OCR_Image_Autorotate = 0x0001 flag, thus the image is being rotated. Try launching the OCR without it.

Cheers,
Alex
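
A minimal Python sketch of "launching the OCR without it", assuming the flags are plain bit masks with the two values quoted in this thread. The helper `without_flag` is a hypothetical name, not part of the OCR SDK; it only shows the bit arithmetic for removing one flag while leaving the others in place.

```python
OCR_IMAGE_AUTOROTATE = 0x0001   # value quoted in this thread
OCR_CONTENT_ORIGINAL = 0x0040   # value quoted in this thread

def without_flag(flags, flag):
    """Return flags with the given bit(s) cleared; all other bits stay as they were."""
    return flags & ~flag

# The value currently passed as Options.ImageFlags.
current = 0x0001 | 0x0048

# Drop only the autorotate bit.
fixed = without_flag(current, OCR_IMAGE_AUTOROTATE)

print(hex(current), "->", hex(fixed))
```

With these values, `fixed` keeps OCR_Content_Original and the unnamed 0x0008 bit, which is the same as passing 0x0048 directly.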

Re: OCR is rotating the originals after processing ~ 15 degrees

Posted: Thu Nov 15, 2018 12:02 pm
by michipapa
Hi Sasha,

OK, it works. But what disadvantages do I have without autorotate?

And why does
> OCR_Content_Original = 0x0040 // output original content instead of image
not do what I expect? I only need the text layer in the original file ...

regards Michael

Re: OCR is rotating the originals after processing ~ 15 degrees

Posted: Thu Nov 15, 2018 1:22 pm
by Sasha - Tracker Dev Team
Hello Michael,

If the rotation is less than 10 degrees, then everything should work OK; if it's 10-15 degrees or more, then there can be some problems.
As for the flags, the OCR_Content_Original flag will leave the original content as the background, and the OCR_Image_SuppressOutput flag will stop the OCR-rendered image from being added as the background. Both are needed, because if you specify only the OCR_Content_Original flag, you will have both the old image and the new OCR-rendered image on your page.

Cheers,
Alex
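
The interaction of the two flags described above can be written down as a small truth table. This is a toy Python model of that description only, not the SDK's implementation. The function `background_layers` is hypothetical, and the numeric value used for OCR_Image_SuppressOutput is a placeholder assumption, since this thread never states it; take the real value from the SDK's OCR_ImageProcessingFlags definition.

```python
OCR_CONTENT_ORIGINAL = 0x0040       # value quoted in this thread
OCR_IMAGE_SUPPRESS_OUTPUT = 0x0008  # ASSUMPTION: placeholder, not stated in this thread

def background_layers(flags):
    """Model which backgrounds end up on the page, per the description above."""
    layers = []
    if flags & OCR_CONTENT_ORIGINAL:
        # The original page content is kept as the background.
        layers.append("original content")
    if not flags & OCR_IMAGE_SUPPRESS_OUTPUT:
        # Unless suppressed, the OCR-rendered image is added as well.
        layers.append("OCR-rendered image")
    return layers

for flags in (
    0,
    OCR_CONTENT_ORIGINAL,
    OCR_CONTENT_ORIGINAL | OCR_IMAGE_SUPPRESS_OUTPUT,
):
    print(hex(flags), background_layers(flags))
```

In this model, only the combination of both flags leaves the original content as the sole background, which is the "text layer on top of the untouched original" result asked for earlier in the thread.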

Re: OCR is rotating the originals after processing ~ 15 degrees

Posted: Thu Nov 15, 2018 1:29 pm
by michipapa
Hi Alex,

thx.

regards Michael

Re: OCR is rotating the originals after processing ~ 15 degrees

Posted: Thu Nov 15, 2018 2:33 pm
by Sasha - Tracker Dev Team
:)

Re: OCR is rotating the originals after processing ~ 15 degrees

Posted: Tue Jan 29, 2019 12:51 pm
by Tracker Supp-Stefan
Hello Timothy,

Welcome to our forums.
This is an SDK (developer) topic. Are you also using our SDK products, or are you using the OCR tool in our standalone Editor?

Regards,
Stefan