Search PDF for text, if found, replace with hyperlink

TKFlex
User
Posts: 3
Joined: Mon Jul 08, 2013 8:17 pm

Search PDF for text, if found, replace with hyperlink

Post by TKFlex » Mon Jul 08, 2013 8:56 pm

I'm on a Win 8 64 bit machine. Using VB, I'd like to search an existing PDF file for specific text values. If the values are found, replace them with a hyperlink.

Will - Tracker Supp
Site Admin
Posts: 6905
Joined: Mon Oct 15, 2012 9:21 pm
Location: London, UK

Re: Search PDF for text, if found, replace with hyperlink

Post by Will - Tracker Supp » Tue Jul 09, 2013 11:07 am

Hi TKFlex,

Thanks for the post. You'll need to extract the text page by page and search it for the text you need. When the text is found, you can get its coordinates and add a link annotation at that position.

Cheers, hope that helps!

Best regards

Will Travaglini
Tracker Support (Europe)
Tracker Software Products Ltd.
http://www.tracker-software.com

TKFlex
User
Posts: 3
Joined: Mon Jul 08, 2013 8:17 pm

Re: Search PDF for text, if found, replace with hyperlink

Post by TKFlex » Tue Jul 09, 2013 3:34 pm

"you'll need to extract the text, page by page"
Am I on the right track with this code?
hr = XCPro40_Defs.PXCp_GetPagesCount(m_pdf, PageCnt)
' Starting at 0, the loop must stop at PageCnt - 1; 0 To PageCnt would visit one page too many
For curPage = 0 To PageCnt - 1
    k = XCPro40_Defs.PXCp_ET_AnalyzePageContent(m_pdf, curPage)
    k = XCPro40_Defs.PXCp_ET_GetElementCount(m_pdf, curPage)
Next

"find the text you need"
What function can I use to find the text?

"when the text is found you then can get the coordinates of this text and add link annotation"
PCX_AddLink?

Nico - Tracker Supp
User
Posts: 220
Joined: Fri May 18, 2012 8:41 pm

Re: Search PDF for text, if found, replace with hyperlink

Post by Nico - Tracker Supp » Tue Jul 09, 2013 4:59 pm

Hi TKFlex,

Thank you for your post.
We have uploaded a code example that shows how to do this.
All you have to do is to translate that example to your language of preference and adapt it to your needs.
The example can be found in the following page: http://help.tracker-software.com/PDF-To ... _hyperlink
If you have any questions please feel free to come back.
Thanks.

Sincerely,

Ivan - Tracker Software
Site Admin
Posts: 3620
Joined: Thu Jul 08, 2004 10:36 pm
Location: Vancouver Island - Canada

Re: Search PDF for text, if found, replace with hyperlink

Post by Ivan - Tracker Software » Tue Jul 09, 2013 5:06 pm

Text on a PDF page is not represented as continuous text. Instead, it consists of text "blocks", and each block has its own position on the page.
The visual order of text on the page does not necessarily correspond to the actual order of the text blocks in the page's content.

The Tools SDK gives you two options. You can get each text block and compose the text of the page from these blocks; this is a somewhat complex job, but it gives you the position of each block, so you can add the link. The other option is to get the composed text from the page, but the position information is not available in that case.
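
To illustrate the first option, here is a minimal sketch in plain JavaScript of composing page text from positioned blocks while keeping track of which block each character came from. It does not call the Tools SDK: the block shape `{ text, x, y }`, the function names `composePageText` and `blocksForMatch`, and the `lineTolerance` value are all assumptions made for this example, and real pages (columns, rotated text, words split across blocks) will need more care.

```javascript
// Hypothetical block shape: { text, x, y } in PDF user space, where the
// origin is at the bottom-left, so a larger y is higher on the page.
function composePageText(blocks, lineTolerance = 2) {
  // Order blocks top-to-bottom, ignoring their order in the page content.
  const byHeight = [...blocks].sort((a, b) => b.y - a.y);

  // Group blocks whose baselines are within lineTolerance into one line.
  const lines = [];
  for (const block of byHeight) {
    const last = lines[lines.length - 1];
    if (last && Math.abs(last.y - block.y) <= lineTolerance) {
      last.blocks.push(block);
    } else {
      lines.push({ y: block.y, blocks: [block] });
    }
  }

  // Concatenate left-to-right within each line, recording the character
  // range that every block occupies in the composed text.
  let text = "";
  const spans = [];
  for (const line of lines) {
    line.blocks.sort((a, b) => a.x - b.x);
    for (const block of line.blocks) {
      if (text.length > 0) text += " ";
      spans.push({ start: text.length, end: text.length + block.text.length, block });
      text += block.text;
    }
  }
  return { text, spans };
}

// Return the blocks that overlap the first occurrence of needle, so their
// positions can be used for the link rectangle.
function blocksForMatch(composed, needle) {
  const at = composed.text.indexOf(needle);
  if (at < 0) return [];
  const end = at + needle.length;
  return composed.spans
    .filter((s) => s.start < end && s.end > at)
    .map((s) => s.block);
}

// Blocks deliberately listed out of visual order.
const sampleBlocks = [
  { text: "World", x: 60, y: 700 },
  { text: "second line", x: 10, y: 680 },
  { text: "Hello", x: 10, y: 700.5 },
];
const composed = composePageText(sampleBlocks);
console.log(composed.text); // "Hello World second line"
console.log(blocksForMatch(composed, "Hello World").map((b) => b.text)); // [ 'Hello', 'World' ]
```

The point of keeping `spans` is that a match found in the composed string can be mapped back to the blocks, and therefore to coordinates, which is exactly the information the second option loses.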

If it is possible in your project, I would recommend using the Viewer ActiveX SDK, which is much more powerful and much easier to use in complex cases.
For example, there is a 'classic' sample for your task in the JavaScript reference (you can run JS in the Viewer AX):

for (var p = 0; p < this.numPages; p++)
{
  var numWords = this.getPageNumWords(p);
  for (var i = 0; i < numWords; i++)
  {
    // true strips punctuation and whitespace from the returned word
    var ckWord = this.getPageNthWord(p, i, true);
    if (ckWord == "Acrobat")
    {
      var q = this.getPageNthWordQuads(p, i);
      // Convert quads in default user space to the rotated
      // user space used by links.
      var m = (new Matrix2D).fromRotated(this, p);
      var mInv = m.invert();
      var r = mInv.transform(q).toString().split(",");
      // Link rectangle taken from the third and second points of the quad
      var l = this.addLink(p, [r[4], r[5], r[2], r[3]]);
      l.borderColor = color.red;
      l.borderWidth = 1;
      l.setAction("this.getURL('http://www.adobe.com/');");
    }
  }
}
Tracker Software (Project Director)


TKFlex
User
Posts: 3
Joined: Mon Jul 08, 2013 8:17 pm

Re: Search PDF for text, if found, replace with hyperlink

Post by TKFlex » Thu Jul 18, 2013 3:43 pm

I'm using that JavaScript example above. It seems to only search on a per word basis. Can I search for a phrase or combination of words?

Nico - Tracker Supp
User
Posts: 220
Joined: Fri May 18, 2012 8:41 pm

Re: Search PDF for text, if found, replace with hyperlink

Post by Nico - Tracker Supp » Thu Jul 18, 2013 4:12 pm

Hi TKFlex,

There isn't a built-in function you can use directly, but you can make your own.
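
As a rough starting point, a phrase search can be built on top of the per-word approach by comparing runs of consecutive words. The sketch below is plain JavaScript and only an illustration: `findPhrase` is a hypothetical helper name, the case-insensitive comparison is an assumption, and the word list is expected to have been collected first (for example by calling `this.getPageNthWord(p, i, true)` for each `i`, as in the sample earlier in this thread).

```javascript
// Find every run of consecutive page words that matches the phrase.
// pageWords: array of words for one page, in reading order.
// Returns the index of the first and last word of each match.
function findPhrase(pageWords, phrase) {
  const norm = (w) => w.toLowerCase(); // assumption: ignore case
  const target = phrase.trim().split(/\s+/).map(norm);
  const hits = [];
  if (target[0] === "") return hits; // empty or whitespace-only phrase

  for (let i = 0; i + target.length <= pageWords.length; i++) {
    if (target.every((w, k) => norm(pageWords[i + k]) === w)) {
      hits.push({ first: i, last: i + target.length - 1 });
    }
  }
  return hits;
}

// In the viewer, the word list might be gathered like this (not run here):
//   var words = [];
//   for (var i = 0; i < this.getPageNumWords(p); i++)
//     words.push(this.getPageNthWord(p, i, true));
const words = ["Open", "the", "PDF", "file", "then", "open", "the", "pdf"];
console.log(findPhrase(words, "the pdf"));
// [ { first: 1, last: 2 }, { first: 6, last: 7 } ]
```

For each hit, the simplest way to place the hyperlink is probably to add one link per word from `first` to `last`, reusing the quad conversion from the sample; that also copes with a phrase that wraps onto the next line. Because the words are stripped of punctuation when the third argument is `true`, the phrase should be given without punctuation as well.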
Thanks.

Sincerely,
