When double-byte character comments are flattened, edited, and then moved, they are recognized separately.

Forum for the PDF-XChange Editor - Free and Licensed Versions


rakunavi
User
Posts: 871
Joined: Sat Sep 11, 2021 5:04 am

When double-byte character comments are flattened, edited, and then moved, they are recognized separately.

Post by rakunavi »

Hello all,

If you enter double-byte characters using the Typewriter or Text Box tool, flatten the comments, and then edit the resulting content, the text may be split into separately recognized blocks after you move it. When the same operation is performed with single-byte (alphabetic) characters, the text is correctly recognized as a single block. The "Edit Text Elements as Blocks" option is enabled, since all settings are at their defaults.

I have confirmed that the issue occurs for Japanese, Korean, Traditional Chinese, and Simplified Chinese. This issue is not new to build 364 released today. I have separately confirmed that it has existed since earlier builds. In some cases, the issue may not occur depending on the way edits are made. The following verification procedure includes such cases.

Please see the attached verification video for more details. The corresponding timecode in the video is given for each step.
([MM:SS]: "MM" shows minutes and "SS" shows seconds.)

Single-byte characters case

  • [00:00] Create a new document.
  • [00:04] Enter single-byte characters using the typewriter tool.
  • [00:07] Flatten the comments and convert them into content.
  • [00:09] Activate content edit mode and enter several spaces at the beginning of the text.
  • [00:14] Select and move the text content, confirming that the text is recognized as a single block and moves correctly.

Double-byte characters case (Please pay attention to the section after [00:44], where the issue occurs.)

  • [00:18] Create a new document.
  • [00:23] Copy the appropriate Japanese text to the clipboard.
  • [00:25] Paste the Japanese text several times using the typewriter tool.
  • [00:30] Flatten the comments and convert them into content.
  • [00:34] Activate content edit mode and enter several spaces at the beginning of the text.
  • [00:39] Select and move the text content, confirming that the text is still recognized as a single block and moves correctly.
  • [00:44] Activate content edit mode and enter a space in the middle of the text.
  • [00:48] Select and move the text content, and observe that the text is now unintentionally recognized as separate blocks.
    Also note that spaces are left behind in irregular positions. This is the area circled in red in the screenshot below.

    Screenshot_ja-JP.png

For your reference, the verification videos for Korean, Traditional Chinese, and Simplified Chinese are also attached.

  • Traditional Chinese
    CapturedVideo_zh-TW.zip
    (3.06 MiB)
    Traditional Chinese : "Microsoft JhengHei" was selected as the font.
  • Simplified Chinese
    CapturedVideo_zh-CN.zip
    (1.72 MiB)
    Simplified Chinese : "Microsoft YaHei" was selected as the font.

Hoping that the above information will be of some help to you.
Thank you so much for your continued support.

Best regards,
rakunavi

- PDF-XChange Editor Plus Version: 9.4 build 364.0
- OS Version: Windows 11 Home 21H2 Build 22000.1042
- PC Model: Lenovo IdeaPad C340-15IWL
TOP desires for PDFXCE
forum.pdf-xchange.com/viewtopic.php?t=39665 LassoTool
forum.pdf-xchange.com/viewtopic.php?t=38554 CmtGarbled
forum.pdf-xchange.com/viewtopic.php?t=37353 FulScrMultiMon
forum.pdf-xchange.com/viewtopic.php?t=41002 DisableTouchSelect
Tracker Supp-Stefan
Site Admin
Posts: 17824
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: When double-byte character comments are flattened, edited, and then moved, they are recognized separately.

Post by Tracker Supp-Stefan »

Hello rakunavi,

Many thanks for the detailed report and the clear videos!
I've asked colleagues from the dev team who work with fonts to take a look and advise what the next steps are.
I will post again here as soon as I have an update!

Kind regards,
Stefan
rakunavi
User
Posts: 871
Joined: Sat Sep 11, 2021 5:04 am

Re: When double-byte character comments are flattened, edited, and then moved, they are recognized separately.

Post by rakunavi »

Hi Stefan,

Thank you for your quick reply.
I have often reported issues related to double-byte characters, and I hope that things will be improved for the better.

Speaking of fonts, thank you for resolving the text color issue with Microsoft IME in build 364 released today. I am surprised at how quickly it was resolved. Thank you to the Tracker golden team.

Best regards,
rakunavi
Tracker Supp-Stefan
Site Admin
Posts: 17824
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: When double-byte character comments are flattened, edited, and then moved, they are recognized separately.

Post by Tracker Supp-Stefan »

Hello rakunavi,

I got the following reply from our devs:

+++++
We do not really support any single/double/multibyte characters at all. All we support is Unicode. All text in PDF files is processed as Unicode only: it is converted to Unicode before editing/copying, edited as Unicode, and then stored back into the PDF. So, any difference in text behaviour observed has no relation to the character codes; it is only a side effect of other problems.
As may be seen from the videos provided, after moving the edited text, some empty boxes (space characters?) were not moved together with the rest of the text.

We can see these two problems with the provided samples:
1. The Text Editor inserts spaces and/or newlines into CJK text when it should not. This is an already known issue.
2. For some reason, these characters are not included in the text block composed from content (as far as I can see, the spaces overlap with the actual text, so that might be one of the reasons), and after the selected text block is moved, these characters remain in place and are not moved with it. This breaks the text composure algorithms for the follow-up edit, and that is why the text block gets separated.

For the moment, as a simple workaround for this case, we suggest that you simply delete the spaces left in the original text positions (the small empty text blocks). After that, the text composure algorithms will work properly again.
+++++
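
As a side illustration of the developers' first point (this is not taken from their reply, and it is not PDF-XChange code): "single-byte" and "double-byte" describe how a character is stored in a particular encoding, not the character itself. The small Python sketch below uses a helper name, `byte_lengths`, and sample strings that are made up for this example. It shows the same three Japanese characters taking 9, 6, or 6 bytes depending on the encoding, while the Unicode code point count stays at 3.

```python
def byte_lengths(text: str) -> dict[str, int]:
    """Return the code point count and the encoded size of `text` per encoding."""
    return {
        "code_points": len(text),                    # Unicode code points
        "utf-8": len(text.encode("utf-8")),          # variable width, 1 to 4 bytes
        "utf-16-le": len(text.encode("utf-16-le")),  # 2 or 4 bytes, no BOM
        "shift_jis": len(text.encode("shift_jis")),  # legacy Japanese encoding
    }


# Illustrative sample strings only; any Latin and CJK text would do.
for sample in ("PDF", "日本語"):
    print(sample, byte_lengths(sample))
```

Once text has been converted to Unicode, as the reply describes, the "byte width" of the original input no longer distinguishes the two cases.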

The first problem is already assigned to one of our devs, who is aware of it and working on it.
The second problem is more complex, and to properly solve it we will need to completely rework our text composing algorithms. One of the senior devs is working on this, but unfortunately work is far from finished at this moment.

Solving either of the two problems above will remove the inconsistent behaviour you demonstrated, and we are working on getting both sorted!
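
To make the suggested workaround a little more concrete, here is a rough Python sketch of what "finding the small empty text blocks" could look like. Everything in it is an assumption for illustration: `TextRun`, `leftover_space_runs`, and the sample coordinates are hypothetical and do not correspond to any PDF-XChange Editor API or to the Editor's internal data model. In the Editor itself, the workaround is simply to select and delete those empty blocks by hand.

```python
from dataclasses import dataclass


@dataclass
class TextRun:
    """Hypothetical stand-in for one piece of page text; not an Editor API type."""
    text: str
    x: float  # left edge on the page, in points
    y: float  # baseline position on the page, in points


def leftover_space_runs(runs: list[TextRun]) -> list[TextRun]:
    """Return the runs that consist only of whitespace."""
    # str.isspace() is False for an empty string and True for the
    # ideographic space U+3000 as well as the ordinary ASCII space.
    return [run for run in runs if run.text.isspace()]


# Made-up page content: one moved text block plus two spaces left behind.
page = [
    TextRun("日本語のテキスト", x=200.0, y=500.0),  # the block after it was moved
    TextRun(" ", x=72.0, y=700.0),                  # space left at the old position
    TextRun("\u3000", x=90.0, y=700.0),             # full-width space left behind
]

for run in leftover_space_runs(page):
    print(f"stray space run at ({run.x}, {run.y})")
```

The sketch only flags whitespace-only runs; it does not model how the Editor composes text blocks, which is the part the developers say needs to be reworked.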

Kind regards,
Stefan
rakunavi
User
Posts: 871
Joined: Sat Sep 11, 2021 5:04 am

Re: When double-byte character comments are flattened, edited, and then moved, they are recognized separately.

Post by rakunavi »

Hi Stefan,

Thank you for taking the time to look into this in detail. I learned a lot from your detailed explanation, and I am relieved to have been able to share my situation with you.

Please give my best regards to the developer.
Thank you so much for your continued support.

Best regards,
rakunavi