question(ocr): black text on noise background

Discussion for the End User use of OCR in PDF-XChange Editor and Viewer

Moderators: Tracker Support, TrackerSupp-Daniel, Paul - Tracker Supp, Chris - Tracker Supp, Vasyl-Tracker Dev Team, Sean - Tracker, Tracker Supp-Stefan, Ivan - Tracker Software

SashaChernykh
User
Posts: 11
Joined: Tue Aug 20, 2019 8:01 am

question(ocr): black text on noise background

Post by SashaChernykh » Fri Aug 23, 2019 9:06 am

1. Summary

PDF-XChange Editor doesn't add an OCR layer for black text on a noisy background.

2. Data
  • KiraSuperhero.pdf: a B&W page of a scanned Russian book to which I want to add an OCR layer
Image

The human eye recognizes the symbols in the bottom block of text.

3. Steps to reproduce

I open KiraSuperhero.pdf in PDF-XChange Editor → I add an OCR layer with the Russian language → I save the file with OCR.

4. Actual behavior

PDF-XChange Editor doesn't recognize letters in the lower block:

Image

5. Expected behavior

  1. Make it possible to recognize black letters on a noisy background, as in my case.
  2. Or tell me how I can edit this PDF, without losing the book design, so that PDF-XChange Editor recognizes this document correctly.

6. Note

I don't have a paper copy of this book, so I can't re-scan this page in grayscale or at a better quality.

7. Environment

  • Windows 10 Enterprise LTSB 64-bit EN
  • PDF-XChange Editor 8.0 Build 331.0, Portable
  • Russian Language Pack from Default OCR Engine

Thanks.

Dimitar - Tracker Supp
Site Admin
Posts: 561
Joined: Mon Jan 15, 2018 9:01 am

Re: question(ocr): black text on noise background

Post by Dimitar - Tracker Supp » Fri Aug 23, 2019 11:32 am

Hello SashaChernykh,

Unfortunately, our Enhanced OCR currently has some problems recognizing text on a colored background.

The developers are working on resolving this issue.

Regards.

Ovg
User
Posts: 275
Joined: Tue Sep 05, 2017 4:56 pm
Location: Moscow

Re: question(ocr): black text on noise background

Post by Ovg » Fri Aug 23, 2019 4:21 pm

The Default OCR engine doesn't work either.
It's impossible to lead us astray for we don't care even to choose the way.
PDF-XChange PRO, 8.0 (Build 336.0) / W7 x64 SP1

TrackerSupp-Daniel
Site Admin
Posts: 3049
Joined: Wed Jan 03, 2018 6:52 pm

Re: question(ocr): black text on noise background

Post by TrackerSupp-Daniel » Fri Aug 23, 2019 5:22 pm

Hello OVG,
Have you tried this with the accuracy setting on "Low"?

In my tests, that allowed the lower text to be recognized. However, it should be noted that this is a fairly extreme case of "peppering" on the page, which will almost always seriously reduce OCR accuracy.
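For readers unfamiliar with the term: "peppering" of this kind is what image processing usually calls salt-and-pepper noise, and a common pre-OCR cleanup for it is a median filter. The block below is only a minimal sketch of that general technique in plain Python. It is not a PDF-XChange Editor feature, it has not been tried on KiraSuperhero.pdf, and the function name `median_filter_3x3` and the list-of-rows image layout are assumptions made for illustration.

```python
from statistics import median


def median_filter_3x3(image):
    """Return a filtered copy of a grayscale image given as a list of rows.

    Each pixel is replaced by the median of its 3x3 neighbourhood.
    Coordinates outside the image are clamped to the nearest edge pixel.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the 9 neighbouring values, clamping at the borders.
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            row.append(int(median(window)))
        result.append(row)
    return result


# Hypothetical example: a white 5x5 patch (255) with one isolated black speck (0).
speck = [[255] * 5 for _ in range(5)]
speck[2][2] = 0
cleaned = median_filter_3x3(speck)
```

An isolated speck is outvoted by its white neighbours and disappears, while a stroke at least three pixels wide survives. On a page as heavily peppered as this one, the same filter may also erode thin letter strokes, so any result would need checking by eye before running OCR again.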

Kind regards,
Daniel McIntyre
Support Technician
Tracker Software Products (Canada) LTD

Sales: +1 (250) 324-1621
Fax: +1 (250) 324-1623

Ovg
User
Posts: 275
Joined: Tue Sep 05, 2017 4:56 pm
Location: Moscow

Re: question(ocr): black text on noise background

Post by Ovg » Fri Aug 23, 2019 7:38 pm

Hello Daniel! Yes, I have tried low accuracy, and it doesn't work for me.

TrackerSupp-Daniel
Site Admin
Posts: 3049
Joined: Wed Jan 03, 2018 6:52 pm

Re: question(ocr): black text on noise background

Post by TrackerSupp-Daniel » Mon Aug 26, 2019 9:52 pm

Hello OVG, and Sasha!

Sorry for the delay. I've just come back to retry this one and realized that I was testing with the EOCR instead of the default OCR. I do indeed see the issue: the default engine recognizes nothing, even on low accuracy. I have informed the Dev team of this, but cannot offer a timeline for a resolution.

Kind regards,
Daniel McIntyre

SashaChernykh
User
Posts: 11
Joined: Tue Aug 20, 2019 8:01 am

Re: question(ocr): black text on noise background

Post by SashaChernykh » Wed Aug 28, 2019 8:36 am

@Tracker Software support:

Type: Question :?:

What priority (e.g. low, medium, high) does this task have in PDF-XChange Editor task prioritization?

Thanks.

Dimitar - Tracker Supp
Site Admin
Posts: 561
Joined: Mon Jan 15, 2018 9:01 am

Re: question(ocr): black text on noise background

Post by Dimitar - Tracker Supp » Wed Aug 28, 2019 8:57 am

Hello SashaChernykh,

Since this affects our customers' work, this problem has high priority.

However, we cannot give an exact date for when this issue will be resolved, since this OCR engine is developed and supported by external developers.

Regards.
