Hi there,
When OCR-ing a PDF, the resulting (incorrect) text is as follows:
ľŏōŒŘœŝōŒ ŞŏŝŞŚŖŋŘ
ĶřŋŎ ŋŘŎ ĽŞŜŏŝŝ ľŏŝŞ Đ
ĽţŝŞŏŗ ijŘŞŏőŜŋŞœřŘ ľŏŝŞ
etc. etc.
This happens when I select Dutch as the language; the result is not Dutch at all. If I use English, the resulting PDF is OK. I downloaded and am using the complete language pack. Can I solve this?
Strange output in resulting pdf
-
- User
- Posts: 381
- Joined: Mon Jun 13, 2011 5:10 pm
Re: Strange output in resulting pdf
We will investigate this immediately. If it is not confidential, could you send us the PDF you are working with? You can send it to support@pdf-xchange.com, with "attention: Walter" in the subject. Otherwise we will do our best to reproduce with other PDF inputs.
-Walter
-
- User
- Posts: 7
- Joined: Mon Dec 05, 2011 9:16 pm
Re: Strange output in resulting pdf
Thanks for the quick response, Walter. I've mailed you an example PDF.
-
- User
- Posts: 381
- Joined: Mon Jun 13, 2011 5:10 pm
Re: Strange output in resulting pdf
I have been unable to reproduce the problem with the provided sample PDF. I wonder if it is a Unicode vs. ASCII issue? The text in the PDF should be UTF-8 encoded. It is conceivable that whatever method you are using to extract the text layer from the PDF assumes a single-byte (ASCII-style) encoding.
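One thing worth noting about the sample in the first post: the garbled text does not look random. Every non-space character appears to sit exactly 0xEA (234) code points above a plain ASCII character. Below is a minimal Python sketch for checking that, assuming the offset holds for the whole sample. The name `unshift` and the `OFFSET` constant are made up for illustration and are not part of PDF-XChange or the OCR DLL; this only inspects the symptom and is not a fix.

```python
# Offset observed in the sample from the first post (an assumption, not a
# documented property of the OCR engine): each garbled character sits 0xEA
# code points above the intended ASCII character.
OFFSET = 0xEA


def unshift(garbled: str, offset: int = OFFSET) -> str:
    """Shift non-ASCII characters back by `offset`; leave ASCII (e.g. spaces) alone."""
    out = []
    for ch in garbled:
        code = ord(ch)
        # Only touch characters that land in printable ASCII after the shift.
        if code > 0x7F and 0x20 <= code - offset <= 0x7E:
            out.append(chr(code - offset))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    sample = [
        "ľŏōŒŘœŝōŒ ŞŏŝŞŚŖŋŘ",
        "ĶřŋŎ ŋŘŎ ĽŞŜŏŝŝ ľŏŝŞ Đ",
        "ĽţŝŞŏŗ ijŘŞŏőŜŋŞœřŘ ľŏŝŞ",
    ]
    for line in sample:
        print(unshift(line))
```

On the three sample lines this prints "Technisch testplan", "Load and Stress Test &" and "System Integration Test". That suggests the recognition step itself worked and only the character mapping written into the PDF was off, though the sketch cannot say where that mapping went wrong.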
-
- User
- Posts: 381
- Joined: Mon Jun 13, 2011 5:10 pm
Re: Strange output in resulting pdf
Hi Joost,
We have diagnosed this and will provide a fix shortly. It had to do with a small bug in the PDF-level Unicode / character encodings, which caused problems with certain PDF viewers.
A new PDF-X OCR DLL which fixes this will be up by the end of the working week (DLL version 1.0.5).
As always we really appreciate you bringing this to our attention!
-Walter
-
- User
- Posts: 381
- Joined: Mon Jun 13, 2011 5:10 pm
Re: Strange output in resulting pdf
Hi Joost,
The new version has been built and it resolves this issue with Dutch language files (and greatly improves memory consumption!). I understand that you are a Clarion developer, so you will need to wait for the Clarion build to be available, which will be very soon.
-Walter
-
- User
- Posts: 7
- Joined: Mon Dec 05, 2011 9:16 pm
Re: Strange output in resulting pdf
Hi Walter, I'm very pleased with how quickly you managed to resolve this issue. I'm not a Clarion developer; can I already download this new version somewhere? The "live version" in the downloads is from last month.
-
- User
- Posts: 381
- Joined: Mon Jun 13, 2011 5:10 pm
Re: Strange output in resulting pdf
I will contact you via email shortly!
-Walter