Copying RTL text with diacritics

Forum for the PDF-XChange Editor - Free and Licensed Versions

yossizahn
User
Posts: 33
Joined: Mon Nov 13, 2017 12:15 pm

Copying RTL text with diacritics

Post by yossizahn »

Hi,
Another bug related to RTL text.
It is my understanding that some software lays out RTL text on the page in LTR (visual) order, so the text is stored in the PDF in the reverse of its logical order. This causes trouble when copying or extracting text from the PDF, since the logical order has not been preserved. I understand that PDF-XChange Editor (like other programs) compensates for this by reversing the text when it is copied. There seems to be a bug, however, when the text contains diacritics, since they need special handling. In the Unicode standard, a diacritic (combining mark) follows its base character in logical order, and most software draws the diacritic to the page after its base character even when drawing in reverse, i.e. from left to right. It is therefore incorrect to simply reverse the whole string, because that puts each diacritic before its base character. What must be done is to reverse only the order of the base characters while keeping each diacritic after its base character.
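To illustrate the idea, here is a minimal Python sketch of a mark-aware reversal. It is only an illustration, not how PDF-XChange Editor or any other viewer actually does it. The function names `split_clusters` and `reverse_keeping_marks` are made up for this example, and the sketch assumes that every character whose Unicode general category starts with "M" is a diacritic belonging to the preceding base character. It ignores the rest of the Bidi algorithm (numbers, mixed-direction runs, joiners).

```python
import unicodedata


def split_clusters(text):
    """Group each base character with the combining marks that follow it."""
    clusters = []
    for ch in text:
        # General categories Mn, Mc and Me are combining marks (assumption:
        # each such mark belongs to the base character just before it).
        is_mark = unicodedata.category(ch).startswith("M")
        if is_mark and clusters:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def reverse_keeping_marks(text):
    """Reverse the order of the clusters, leaving each cluster intact."""
    return "".join(reversed(split_clusters(text)))


# The Hebrew word "shalom" with vowel points, in logical order:
# shin + qamats + shin dot, lamed, vav + holam, final mem.
logical = "\u05E9\u05B8\u05C1\u05DC\u05D5\u05B9\u05DD"

# The same word as it might be stored when drawn left to right:
# base characters reversed, but each mark still after its base.
visual = "\u05DD\u05D5\u05B9\u05DC\u05E9\u05B8\u05C1"

naive = visual[::-1]                    # marks end up before their bases
fixed = reverse_keeping_marks(visual)   # marks stay after their bases
```

With this sample, `fixed` equals `logical`, while `naive` does not, because the plain reversal moves every vowel point in front of the letter it belongs to.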
Yes, Bidi is hard... :)
Tracker Supp-Stefan
Site Admin
Posts: 17910
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: Copying RTL text with diacritics

Post by Tracker Supp-Stefan »

Hello yossizahn,

Can we please get a sample file illustrating the problem so that I can pass it along for investigation and fixing?

Regards,
Stefan
yossizahn

Re: Copying RTL text with diacritics

Post by yossizahn »

Hi,
Thanks for your reply. I have attached a zip file containing a sample PDF in which the problem occurs. I have also added two .txt files containing the results of copying the text from PDF-XChange Editor and from Adobe. The text copied using Adobe is perfect.
Attachments
diacritics sample.zip
(86.23 KiB) Downloaded 58 times
Tracker Supp-Stefan

Re: Copying RTL text with diacritics

Post by Tracker Supp-Stefan »

Thanks for the samples, yossizahn,

I have passed them along for investigation!
#4191: Editor 323.2: Issues copying RTL text with diacritics.

Cheers,
Stefan