Extract Text from PDF
Moderators: TrackerSupp-Daniel, Tracker Support, Vasyl-Tracker Dev Team, Chris - Tracker Supp, Sean - Tracker, Tracker Supp-Stefan
Forum rules
DO NOT post your license/serial key, or your activation code - these forums, and all posts within, are public and we will be forced to immediately deactivate your license.
If you encounter errors, use the IAUX_Inst::FormatHRESULT method to get their description and include it in your post along with the error code.
-
- User
- Posts: 1
- Joined: Fri Aug 17, 2018 2:42 pm
Extract Text from PDF
I am evaluating your CoreAPI SDK and have downloaded the CoreAPIDemo from GitHub, which provides a lot of useful insight and a framework within which I can do my evaluation. My primary interest is extracting the entire text from a PDF, paragraph by paragraph. I will be doing further processing on the text of each paragraph. Can you point me toward useful resources or specific API calls to accomplish this? Thank you.
-
- User
- Posts: 5522
- Joined: Fri Nov 21, 2014 8:27 am
Re: Extract Text from PDF
Hello ddinnebeil,
Please check out the "9.3. Convert from PDF to txt file" sample. It writes the text to a txt file with a visual layout, the way you see it on screen.
Also, you can obtain each individual character from the IPXC_PageText by using the line information, for example:
Code:
Text.GetChars(textsLineInfo[i].nFirstCharIndex, textsLineInfo[i].nCharsCount)
From that you can see where the end-of-line character is, so you can build paragraphs and implement your own logic.
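As a rough illustration of the "build paragraphs with your own logic" step, here is a minimal Python sketch. It is not SDK code: the `Line` record, its `top`/`bottom` fields, the downward-growing y axis, and the `gap_factor` threshold are all assumptions made for the example. It only shows one possible heuristic, namely starting a new paragraph whenever the vertical gap between two consecutive lines is noticeably larger than normal line spacing.

```python
from dataclasses import dataclass


@dataclass
class Line:
    # Hypothetical per-line record; these field names are NOT from the SDK.
    text: str      # characters of one text line
    top: float     # y of the line's top edge (assumed: y grows downward)
    bottom: float  # y of the line's bottom edge


def group_paragraphs(lines, gap_factor=0.6):
    """Join consecutive lines, starting a new paragraph on a large vertical gap."""
    paragraphs = []
    current = []
    prev = None
    for line in lines:
        if prev is not None:
            gap = line.top - prev.bottom
            height = prev.bottom - prev.top
            # A gap larger than gap_factor * line height is treated as a paragraph break.
            if gap > gap_factor * height and current:
                paragraphs.append(" ".join(current))
                current = []
        stripped = line.text.strip()
        if stripped:
            current.append(stripped)
        prev = line
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


sample = [
    Line("My primary interest is extracting", 0, 10),
    Line("the entire text from a PDF.", 12, 22),
    Line("I will be doing further processing.", 34, 44),
]
paragraphs = group_paragraphs(sample)
```

With the sample above, the first two lines end up in one paragraph and the third line starts a second one. Other signals (first-line indentation, a short last line, sentence-ending punctuation) could be added to the same loop if gap size alone is not reliable for your documents.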
Cheers,
Alex
Subscribe at:
https://www.youtube.com/channel/UC-TwAMNi1haxJ1FX3LvB4CQ