PDF-XChange - Tracker PDF Viewer - TIFF-XChange - Image-XChange - XMF-XChange - Raster-XChange - Support

 
occam
User
Topic Author
Posts: 5
Joined: Sun Aug 22, 2010 7:35 am

OCR - any way of accessing the text overlay as a .txt doc?

Thu Sep 05, 2013 5:44 pm

Hi

I have the latest portable version of PDF-XChange Viewer running under Windows 8. Using the OCR function, I am able to make a PDF searchable. Is there a way, however, of accessing the text overlay, e.g. as a .txt document or in any other format?

Thanks
 
Walter-Tracker Supp
User
Posts: 383
Joined: Mon Jun 13, 2011 5:10 pm

Re: OCR - any way of accessing the text overlay as a .txt do

Thu Sep 05, 2013 6:44 pm

OCR text is essentially the same as visible text, except that it is not rendered. You can extract it by selecting it with the mouse and copying and pasting, or you can use the Viewer's JavaScript support. I have attached a simple script that extracts the text from the current page and outputs it to a text file.

Simply press Ctrl+J within the Viewer to bring up the JavaScript console, and paste in the contents of the attached script (a JavaScript file compressed with 7-Zip). Press the Run button and it will prompt you for an output filename to save the plain-text results to. You can modify the script as you see fit, for example to save to a text file without user intervention.

Our Viewer replicates much of the functionality of the Adobe JavaScript API, so you can check Adobe's reference manual for information on usage:

http://www.adobe.com/devnet/acrobat/pdfs/js_api_reference.pdf
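For anyone who cannot open the attachment, here is a rough sketch of the general idea. It is not the attached script. It relies on `getPageNumWords` and `getPageNthWord` from the Adobe reference above, and it is an assumption that the Viewer implements both the same way. The helper name `extractPageText` is made up for illustration, and the sketch only builds the text string; saving it to a file is left out.

```javascript
// Collect every word on one page into a single string.
// `doc` is a Doc object (inside the console this would be `this`);
// `pageNum` is the zero-based page index.
function extractPageText(doc, pageNum) {
    var count = doc.getPageNumWords(pageNum);
    var parts = [];
    for (var i = 0; i < count; i++) {
        // Passing false for bStrip keeps the punctuation and
        // whitespace that follow each word, so joining with ""
        // gives back readable text.
        parts.push(doc.getPageNthWord(pageNum, i, false));
    }
    return parts.join("");
}

// Possible use from the console (assumes `this` is the open document):
// console.println(extractPageText(this, this.pageNum));
```

Looping `pageNum` from 0 to `this.numPages - 1` would cover the whole document in the same way.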
Attachments
extract_text.7z
(519 Bytes) Downloaded 136 times
 
occam
User
Topic Author
Posts: 5
Joined: Sun Aug 22, 2010 7:35 am

Re: OCR - any way of accessing the text overlay as a .txt do

Thu Sep 05, 2013 7:11 pm

Great, thanks Walter! I appreciate the quick feedback.

occam
 
Will - Tracker Supp
Site Admin
Posts: 6063
Joined: Mon Oct 15, 2012 9:21 pm
Location: London, UK

Re: OCR - any way of accessing the text overlay as a .txt do

Thu Sep 05, 2013 8:34 pm

:)
If posting files to this forum, you must archive the files to a ZIP, RAR or 7z file or they will not be uploaded.
Thank you.

Best regards

Will Travaglini
Tracker Support (Europe)
Tracker Software Products Ltd.
http://www.tracker-software.com