OCR much worse than in XChange Viewer (EXAMPLE)

The_GTA
User
Posts: 4
Joined: Wed Apr 18, 2018 1:12 pm

OCR much worse than in XChange Viewer (EXAMPLE)

Post by The_GTA »

Did you know that the OCR feature of PDF-XChange Editor is much worse than the one in PDF-XChange Viewer? Take this PDF as an example:
Scan Samo Zycie 3.pdf (525.59 KiB)
Output from PDF-XChange Viewer:
Scan Samo Zycie 3 OCR XChange Viewer.pdf (582.02 KiB)
Output from PDF-XChange Editor:
Scan Samo Zycie 3 OCR XChange Editor.pdf (562.89 KiB)
Let's compare the first paragraph from the given PDF.
Image

PDF-XChange Viewer:
0TeresaSkulska-Wagner, Kurallee 20, Bad Fiissing, 08531 24140
0Dr. med. JoannaGottoehiit. LetzierHasenpiad 64. Franklurl7Main, 017622260579
0WandaHerzog, Feldblumenweg 28, Köln, 0221 481453
0BogdanFugiel,Alnienstr. 212,Mü|heim7Ruhr, 02087871700
0+ortooeda, Drmed. BozenaNteswiatowski, AltstādterKirchplatz7, Hoigeismar, 05671 1808
0Hudoli Konietzny, Kopersand 12, Emden, 04921766407
0IrenaKubaiokKieyne,wiihelmsplatz1A, Güniiz, 035817767280

PDF-XChange Editor:
ITereee 81111151154111551151. 145151155 28. 8511 8155115. 88881 24148 I 8". 111511. .11151115 851111511111. Le1215185551111155 55'. 851111112111115111. 8128 22288828 I '1115111111 1151211131. F5|1111|1.1111511'.'.1'51_.'1 28. 11111111. 8221 481488 I 81.11.111.511 8.11.1151. 211111511511. 212. 1141111151251811111. 8288 2821288 I +511555115. 81111511. 81125115 11155111151111115111. .4111515131151 111151111158 2. H511'1515111111. 88821 1888 I 111111511' 1151115121111. 1151151551111 12. Er111ler1. 1214821155482 I 115115 141111511111 18511115. 11111111151111511121121141. 8811112. 8218815282288

How could you mess this up so badly? Did someone sabotage your product?
Willy Van Nuffel
User
Posts: 2347
Joined: Wed Jan 18, 2006 12:10 pm

Re: OCR much worse than in XChange Viewer (EXAMPLE)

Post by Willy Van Nuffel »

I can only confirm this. OCR is indeed going from bad to worse in PDF-XChange Editor 7.0.325.

I also wonder what has happened?

:-(
TrackerSupp-Daniel
Site Admin
Posts: 8436
Joined: Wed Jan 03, 2018 6:52 pm

Re: OCR much worse than in XChange Viewer (EXAMPLE)

Post by TrackerSupp-Daniel »

Hi everyone,
We are indeed working on this; you may have noticed the new "Enhance Scanned Pages" option on the Convert tab.
Using this new option, I got the results below, with the new function on the left and the Viewer file you sent earlier on the right:

Image

I still see some text missing, so I know this is in no way perfect, but I also see that some of the text is improved over the Viewer.
As mentioned before, we are working on this and will continue to iterate on it in the future.

Also note that I too received nothing but numbers, as in the original post, when I had English selected as the OCR language instead of Czech.
(You can change this by clicking "Edit..." at the bottom of the new Enhance Scanned Pages window, as well as in the middle of the old OCR window.)

Note that I also manually removed the background image from both of these and set the text color to black so we could see a direct comparison within the PDF itself.
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

+++++++++++++++++++++++++++++++++++
Our Web site domain and email address has changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com
Willy Van Nuffel
User
Posts: 2347
Joined: Wed Jan 18, 2006 12:10 pm

Re: OCR much worse than in XChange Viewer (EXAMPLE)

Post by Willy Van Nuffel »

Still concerning OCR:

1) The result is indeed acceptable with the new "Enhance Scanned Pages" feature.

2) Maybe removing the original scanned image and turning the recognized text black could be considered as an additional feature.

3) In the few OCR tests I ran (with the new release 325 of the Editor), the application exited several times without any warning or message.

Regards.
TrackerSupp-Daniel
Site Admin
Posts: 8436
Joined: Wed Jan 03, 2018 6:52 pm

Re: OCR much worse than in XChange Viewer (EXAMPLE)

Post by TrackerSupp-Daniel »

Hello Willy,

1 - Glad to hear it is acceptable :)
2 - I'll gladly pass this request on, I think it's a great idea!
3 - Could you send me a copy of the document this happens with? I've done some testing of the feature, but have never encountered the Editor exiting or crashing during its use...

Thanks!
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

The_GTA
User
Posts: 4
Joined: Wed Apr 18, 2018 1:12 pm

Re: OCR much worse than in XChange Viewer (EXAMPLE)

Post by The_GTA »

Thank you, @TrackerSupp-Daniel and @Willy van Nuffel for your quick responses! We have verified that optimizing scanned PDFs does indeed lead to better OCR results. Thus we have decided to obtain PDF-XChange-Editor for our office. ;)

Here is the first paragraph text with the new options:
.Teresa Skulska-Wagner, Kurallee 20,
Bad F'Lissing, 08531 24140
. Dr, med. Joanna Gottbehüt. Letzter Hasenptad 64.
FranldurUlvlain, 0176 22260579
.Wanda Herzog, Feldblurrienweg 28. Köln, 0221 481453
. Bogdan Fugiel, Aktienstr. 212, MülheimfRuhr, 0208 7671700
. +ortopeda, Drmed, Bozena Nieswiatowski,
Altstädter Kirchplatz 7. Hotgeismar, 05671 1806
. Rudoli Konietzny, Kopersand 12, Emden, 04921166407
.Irena Kubalok Kieyne. Wilhelmsplatz 1A,
Görlitz, 03581/767280

This is acceptable because the numbers turned out right, mostly, and the rest would be looked over by the guys processing the scans.
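
A check like "the numbers turned out right" can also be done mechanically. The following is only a minimal Python sketch, not part of PDF-XChange: it takes one line as read by the Viewer and the same line as read by the new Editor option (both copied from the posts in this thread), pulls out the digits of the number at the end of each line, and reports a rough text similarity. The helper names `trailing_digits` and `similarity` are made up for this illustration.

```python
import re
from difflib import SequenceMatcher

# Two OCR readings of the same scanned line, copied from this thread.
viewer_line = "0WandaHerzog, Feldblumenweg 28, Köln, 0221 481453"
editor_line = ".Wanda Herzog, Feldblurrienweg 28. Köln, 0221 481453"


def trailing_digits(line: str) -> str:
    """Return only the digits of the number-like run at the end of a line.

    Hypothetical helper: it assumes the phone number is the last thing on
    the line and may contain spaces or slashes between digit groups.
    """
    match = re.search(r"[\d /]+$", line.strip())
    return re.sub(r"\D", "", match.group()) if match else ""


def similarity(a: str, b: str) -> float:
    """Rough 0..1 similarity of two OCR readings (difflib ratio)."""
    return SequenceMatcher(None, a, b).ratio()


# True when both engines read the same phone number.
print(trailing_digits(viewer_line) == trailing_digits(editor_line))
# Overall character-level similarity of the two readings.
print(round(similarity(viewer_line, editor_line), 2))
```

Comparing only the digits keeps the check focused on the part that matters here (the phone numbers), while the similarity ratio gives a coarse idea of how much manual correction the remaining text would need.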

Options used (in German, sorry):
Image

We are also really happy with the excellent PDF export options that your tool provides. We did notice one annoyance, though: aborting the PDF optimization is impossible (it continues anyway, wasting time and power).
TrackerSupp-Daniel
Site Admin
Posts: 8436
Joined: Wed Jan 03, 2018 6:52 pm

Re: OCR much worse than in XChange Viewer (EXAMPLE)

Post by TrackerSupp-Daniel »

Glad to hear the results are optimal :)

As for the issue at the bottom: it is curious, but thank you for reporting it.
I was going to pass it on to the devs, but I am unable to reproduce it. Could you tell me more about the settings you have in place for the process?
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

The_GTA
User
Posts: 4
Joined: Wed Apr 18, 2018 1:12 pm

Re: OCR much worse than in XChange Viewer (EXAMPLE)

Post by The_GTA »

https://www.youtube.com/watch?v=9qt-Pyg8b8U

Not sure why you cannot reproduce this bug (your PC is too fast, the steps were done in a different order, ...).

Glad to help ya guys. Looking forward to this new OCR module :)
TrackerSupp-Daniel
Site Admin
Posts: 8436
Joined: Wed Jan 03, 2018 6:52 pm

Re: OCR much worse than in XChange Viewer (EXAMPLE)

Post by TrackerSupp-Daniel »

Hi,
What I see in your video there seems to simply be the OCR function "undo"ing itself.
It may take longer depending on your system specs, however it should close itself (and undo the process) if left for a moment.
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

The_GTA
User
Posts: 4
Joined: Wed Apr 18, 2018 1:12 pm

Re: OCR much worse than in XChange Viewer (EXAMPLE)

Post by The_GTA »

TrackerSupp-Daniel wrote:
Hi,
What I see in your video there seems to simply be the OCR function "undo"ing itself.
It may take longer depending on your system specs, however it should close itself (and undo the process) if left for a moment.

Then why is it that if I press "undo" after the OCR, the "undo"ing takes less than a second? But this flaw is not something we would reject this tool over, so don't worry :)
Tracker Supp-Stefan
Site Admin
Posts: 17823
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: OCR much worse than in XChange Viewer (EXAMPLE)

Post by Tracker Supp-Stefan »

Hi,

Thanks for the video a couple posts above!
I am also unable to reproduce the issue. Could it be that, for example, some swapping occurred at the time, which would explain the slowdown?

We would definitely want to investigate this and find out why it happens, but for me, canceling the "Enhance" operation is almost immediate, as it is for Daniel. My machine has a good CPU, but the SSD with the OS on it is 6-7 years old now and is definitely not the snappiest one.

Regards,
Stefan