PDF-XChange Editor V9 – Trouble with OCR of Thai-language pages


Post by nsa-chanaphon »

Hi,


I am very interested in the OCR and Enhance Scanned Pages features for Thai-language documents, so I have been testing them with the software version and system detailed below, using a PDF file printed through PDF-XChange Lite from a Thai news website.

Software version
PDF-XChange Editor Version 9.0 build 350.0

System information
Processor Intel(R) Pentium(R) CPU N3700 @ 1.60GHz
Memory 4096MB
Hard Drive 465.8GB
OS Windows 10 Pro 64-Bit

The recognition itself seems to run fine, but when I copy the recognized paragraphs and paste them into a document or a browser, or save them as a text file, the result is the same each time: only the Thai text is recognized incorrectly.

I have also attached the files.


About 70 percent of the words are incorrect. How can I improve the accuracy? Do I need to change some settings? Please advise.
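
One way to put a number on an estimate like "about 70 percent" is a character error rate: the edit distance between the OCR output and a hand-checked reference text, divided by the length of the reference. Thai is written without spaces between words, so a character-level measure is simpler than a word-level one. The following is a minimal sketch, independent of PDF-XChange Editor; the function names are illustrative, and it assumes you already have the correct text (for example, copied from the original web page) and the text copied out of the OCRed PDF.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance between two strings, counted in Unicode code points."""
    # previous[j] holds the distance between the processed prefix of
    # `reference` and the first j characters of `hypothesis`.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the reference length (0.0 means identical)."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    correct_text = "ภาษาไทย"   # hand-checked reference text
    ocr_text = "ภาษาไทบ"       # text copied from the OCRed page
    print(f"Character error rate: {character_error_rate(correct_text, ocr_text):.2%}")
```

Running the same comparison on output from different OCR engines or settings gives a like-for-like accuracy figure, provided the same reference text is used each time.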


Thanks in advance
Fah

Post by TrackerSupp-Daniel »

Hi, nsa-chanaphon

Thank you for the report. To begin, I notice you are using an older build of the software. Please update to build 351.0 and let me know whether that helps.

After that, please confirm which OCR engine (Default or Enhanced) you are using. You can tell whether the Enhanced OCR is running by checking the title bar when the OCR dialog appears. If you can also send a screenshot of the OCR settings you used, that would be very helpful.

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

+++++++++++++++++++++++++++++++++++
Our Web site domain and email address has changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com

Post by Vasyl-Tracker Dev Team »

Hi nsa-chanaphon.

With the latest build, 351, the result produced by the Enhanced OCR is:
OCRed_page_Thai.pdf
Do you get a similar result with build 351, or is it worse for you?
Please also provide your OCR'ed document.

Cheers.
Vasyl Yaremyn
Tracker Software Products
Project Developer

Please archive any files posted to a ZIP, 7z or RAR file or they will be removed and not posted.

Post by nsa-chanaphon »

Hi!

Thanks for your replies.

Here are screenshots of the OCR and Enhance settings I used. Compared with your OCRed document, my result looks different. Can you give me your settings?

OCR setting.JPG
Enhanced setting.JPG

My OCRed and Enhanced documents are below.
OCRed Thai.pdf
Enhanced Thai.pdf

Thanks,
Fah

Post by Vasyl-Tracker Dev Team »

Hi, nsa-chanaphon.

There seems to be a small misunderstanding. From your screenshots, I see that you used only the Default OCR and not the Enhanced OCR at all.

To use the Enhanced OCR, go to File > Preferences > OCR and choose the "Enhanced OCR (FineReader)" item from the drop-down list:
image.png
Note: this option isn't available in the free version of the Editor. You need a license key for the EditorPlus or PDF-XChange Pro product.

It seems you have confused it with the 'Enhance Scanned Pages' feature, which is different. That feature visually improves scanned pages (by sharpening, adding contrast, removing noise, etc.). It also has an additional option to OCR the scanned pages, but in your case it uses the same Default OCR engine. So to try the new Enhanced OCR, you need to change the OCR engine globally, for the whole application; you only need to do this once.

And here are the settings I used to OCR your document:
image1.png


Cheers.
Vasyl Yaremyn
Tracker Software Products
Project Developer


Post by nsa-chanaphon »

Hi!


Thank you for your helpful advice and clarification.

I went back and checked the result of your Enhanced OCR run. The accuracy looks better, but I can't decide whether to purchase an additional license or upgrade to EditorPlus without testing more Thai documents.

It would be really great if there were a demo or trial account so I could do this for about a week.
Do you have any suggestions or offers?


Regards,
Fah

Post by TrackerSupp-Daniel »

Hi, nsa-chanaphon

Currently we do not have a trial available for the EOCR plugin, but after some internal discussion, one should be coming in a near-future build.

If you do not have enough information to make the purchase now, I would suggest waiting; you should be able to test it after one of the next few updates.

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD


Post by EUS_Global_Packaging_Team »

Hello
I am from an application packaging team and want to create a package of PDF-XChange Pro 9.0.352.0.

I installed the MSI file, and it installed successfully. But when I checked the OCR languages, only 6 languages are installed by default (Czech, English, Finnish, French, German, Spanish); the others are not installed. Please let me know whether there is a command-line parameter with which I can install the other languages directly during installation.

Please advise how I should proceed to install the other OCR languages silently.

Best Regards

Post by TrackerSupp-Daniel »

Hi, EUS_Global_Packaging_Team

Currently there is no way to do this silently during installation, but I have spoken with the Dev team and they offered to create a command-line switch that would allow it. The following feature request has therefore been made; although it is internal only, you can check back with me here in the future for an update on its progress:

#5551: Install/Update Editor's external resources(EOCR langs) via Editor's command Line

In the meantime, you will need to tell your users to use the "add/update languages" feature to download the languages they need.

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD
