- Extract text using IPXC_Page.GetText(null).GetChars(nFirstCharIndex, nCharsCount)
- Do something with the extracted text (in this case, parse the text and determine a word to highlight).
- Highlight the word using the following code:
IPXV_TextSelection textSelection = (IPXV_TextSelection)IPXV_Document.CreateStdSel((uint)IPXV_Inst.Str2ID("selection.text"));
IPXV_PageTextSelection pageTextSelection = textSelection.GetSel(pageIndex, true);
pageTextSelection.SelectChars(charIndex, length);
textSelection.OnAdd(IPXV_Document);
IPXV_Document.ActiveSel = textSelection;
textSelection.Show(true);

The text is extracted with the ligatures flag set:
IPXC_GetPageTextOptions getPageTextOptions = IPXC_Inst.CreateGetPageTextOptions(2);
getPageTextOptions.Flags = 2; // With ligatures
IPXC_Page.GetText(getPageTextOptions).GetChars(nFirstCharIndex, nCharsCount);
For example:
- Text with separate characters: "Jeg spiser aebler"
- I want to select the word "aebler", so I select with charIndex = 11 and length = 6. This works fine.
- Text with ligatures: "Jeg spiser æbler"
- I want to select the word "æbler", so I select with charIndex = 11 and length = 5. This incorrectly selects only the letters "æble".
This is a made-up example, but I've attached a PDF with ligatures. Attempting to highlight the word "ægtefæller" in the first line reproduces the issue.
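
To make the mismatch concrete, here is a minimal sketch of the index arithmetic. It rests on an unverified assumption drawn only from the symptom above: that SelectChars counts each ligature as its two component characters, while the extracted string counts it as one. The LigatureOffsets class, the ToSelectionRange helper, and the ligature table are all hypothetical and are not part of the SDK.

```csharp
using System;
using System.Collections.Generic;

public static class LigatureOffsets
{
    // Hypothetical table: number of selection units each ligature is
    // ASSUMED to occupy. Not taken from the SDK documentation.
    private static readonly Dictionary<char, int> ExpandedLength = new Dictionary<char, int>
    {
        { 'æ', 2 }, { 'Æ', 2 }, { 'œ', 2 }, { 'Œ', 2 },
        { '\uFB01', 2 }, // fi ligature
        { '\uFB02', 2 }, // fl ligature
    };

    // Returns the assumed number of selection units for one extracted character.
    private static int UnitsFor(char c)
    {
        return ExpandedLength.TryGetValue(c, out int n) ? n : 1;
    }

    // Maps a (start, length) range in the extracted string to the range that
    // would have to be passed to the selection if the assumption above holds.
    public static (int Start, int Length) ToSelectionRange(string text, int start, int length)
    {
        if (start < 0 || length < 0 || start + length > text.Length)
            throw new ArgumentOutOfRangeException();

        int selStart = 0;
        for (int i = 0; i < start; i++) selStart += UnitsFor(text[i]);

        int selLength = 0;
        for (int i = start; i < start + length; i++) selLength += UnitsFor(text[i]);

        return (selStart, selLength);
    }

    public static void Main()
    {
        var plain = ToSelectionRange("Jeg spiser aebler", 11, 6);
        var liga = ToSelectionRange("Jeg spiser æbler", 11, 5);
        Console.WriteLine($"{plain.Start}, {plain.Length}"); // 11, 6
        Console.WriteLine($"{liga.Start}, {liga.Length}");   // 11, 6
    }
}
```

If the assumption is right, passing the mapped length (6 instead of 5) would select all of "æbler", and any ligature earlier on the page would shift the start index as well. This is only a possible workaround, not a confirmed description of how the selection indexes characters.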