How to extract field values from a pdf form in vb/csharp

goussarova
User
Posts: 9
Joined: Fri Nov 11, 2011 9:58 pm

Post by goussarova »

Hi,
I am looking for a tool to extract field values from a pdf form.
From reading this forum I understand that it is doable with pdf tools sdk.

I would like to get a code sample for the extraction with .Net vb /csharp, so I can build a tiny test app as a proof of concept.

Could you please help me out.
Thank you in advance for your help.

Helen.
Paul - Tracker Supp
Site Admin
Posts: 6897
Joined: Wed Mar 25, 2009 10:37 pm
Location: Chemainus, Canada

Post by Paul - Tracker Supp »

Hi goussarova,

I do mostly end-user support and we are on skeleton staffing today, so I'll need to get this confirmed by one of our devs, but I think you need to use the ActiveX Viewer SDK to do that. Check back here after the weekend and I'll have a definite answer for you.

hth
Best regards

Paul O'Rorke
Tracker Support North America
http://www.tracker-software.com
Tracker Supp-Stefan
Site Admin
Posts: 17901
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Post by Tracker Supp-Stefan »

Hello goussarova,

After speaking with the guys from the dev team, they advised me that this is not possible by means of the PDF Tools SDK; you will need to use the Viewer AX one instead.

Best,
Stefan

Post by goussarova »

Thank you very much for your very quick response.

Can I download a free trial version of this Viewer AX?
Is there a code sample on how to do the extraction in vb / csharp?

Your reply is greatly appreciated.

Post by Tracker Supp-Stefan »

Hello goussarova,

Yes certainly - you can download it from here:
https://www.pdf-xchange.com/product ... ctivex-sdk

And there are samples inside. I am not sure whether there is one particularly for your needs but take a look at the JS demo application - as you will most likely need to use some JS code and functions from here:
http://www.adobe.com/content/dam/Adobe/ ... erence.pdf
to extract the needed form fields information.

Best,
Stefan

Post by goussarova »

There are several things that disappointed me:

- Contradiction in terms. Your site explicitly recommends developing user applications with a trial version and buying a license only when ready to deploy. Yet your installer pdfvActiveXSDK.exe forces me to accept a very restrictive license agreement just to be able to read the documentation. Your license states that I become legally bound as soon as I 'open the package'. It does **not** say that I am free to use the software in a dev environment on a local machine. This is vastly unfair.

- Contradiction in representation. You advised me that it is doable to extract field values using your Viewer. Yet, this functionality is not mentioned anywhere in the documentation that came with the viewer - file PDFV_AX.pdf. Please correct me if I am wrong.

- Quality of samples. Your VB Samples are still in VB 5. They provide very limited help. You should take a cue from the samples that pdfsharp provides for its users.

On the positive note,
this discussion helped me find this link that shows how to use adobe api to do just what I need. And I am very grateful for this.

http://acrobatusers.com/forum/forms-liv ... m-database

Post by Paul - Tracker Supp »

Hi goussarova

Your points are appreciated. We are deep in the development of new versions of both the Viewer and Drivers, both end-user products and SDKs. I appreciate what you have said and agree that the documentation needs to be updated. We are aware of the need for this; however, given the push here to complete development of the new versions, we are unlikely to address this before the new version work is complete.

What I would offer you, until the documentation is updated, is the support of our staff both through this forum and email to get you going with the right information - even if it is not yet in the manuals - please do ask. We are here to help.

I do appreciate the feedback and we will endeavour to get this all together for the new versions.

hth

Post by goussarova »

Hi Paul, thank you for your timely reply.

What I need to do looks so simple, yet I have spent a lot of time trying in vain to figure out how to get it done.

Our internal users need to fill out pdf forms, such as quality control check lists. These check lists take time to fill out, as they reflect real-time work being done in the shop. So the managers want to see which forms have already been completed and which ones are still outstanding. Here comes my little application in VB.NET.

I need to open a pdf file, let the users make whatever changes they need, and on save get the key/value list of the fields in the file and run some business logic to determine if the form is complete.

I would like the extraction routine not to be bound to a specific form - e.g. I do not want to spell out field names, as they will be different in different forms.

I am assuming that I will need to reference your PDFXCviewAx.dll.
Could you please help me from here - could you please give me a vb sample that shows how to get the values using your dll.

Thanks in advance for your help!

Post by Paul - Tracker Supp »

thanks for getting back to us with this detail goussarova,

I think we can probably put something together for you over the next few days or so. I will get back to you here. :-)

Paul

Post by goussarova »

Hi,

After some extensive googling, I came to realize that all I need is to execute exportAsFDFStr command against the pdf document.

Could you please tell me if there is a way to do it using your product and perhaps provide me with some sample code?

Thanks
Helen.

Post by Paul - Tracker Supp »

Hi goussarova,

Yes, I came to the same conclusion myself. It is best to use JavaScript; exportAsFDFStr is supported. You would want to use the Viewer AX examples, not PDF-Tools, to load the document and run the JS on it. There are VB examples in the folder C:\Program Files\Tracker Software\PDF-XChange Viewer SDK\Examples\VBExamples of the Viewer AX SDK, which you can download here: https://www.pdf-xchange.com/product ... ctivex-sdk
With those you can load the PDF in the Viewer and then run your JavaScript on it.
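
For the form-independent extraction asked about earlier in the thread, a script could also walk the fields itself instead of exporting FDF. The following is only a minimal sketch, not taken from the SDK samples. It assumes the viewer's JavaScript engine supports the Acrobat JavaScript members numFields, getNthFieldName and getField (they are described in Adobe's JavaScript reference; their availability in the Viewer AX is not confirmed in this thread). The helper name collectFieldValues is hypothetical.

```javascript
// Hypothetical helper: builds a name -> value map for every field in a document.
// `doc` is expected to look like an Acrobat JavaScript Doc object
// (numFields, getNthFieldName, getField). No field names are hard-coded.
function collectFieldValues(doc) {
  var values = {};
  for (var i = 0; i < doc.numFields; i++) {
    var name = doc.getNthFieldName(i);
    var field = doc.getField(name);
    // Skip names that do not resolve to a field object.
    if (field !== null && field !== undefined) {
      values[name] = String(field.value);
    }
  }
  return values;
}
```

Inside the viewer, the script text passed to RunJavaScript would call collectFieldValues(this) and turn the map into a string as its final expression. How that string is best built (for example with JSON.stringify, if the engine provides it) is an assumption to check against the SDK manual.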

Do note that the Viewer SDK is licensed a little differently from the others, so talk to us regarding your anticipated distribution numbers and we will advise you as to the most efficient license for you.

hth

Post by goussarova »

Hi Paul,

Correct me if I am wrong, but as I understand, I need to execute JS function exportAsFDFStr against the document using your RunJavaScript method.

Looking up the vb examples you mentioned, I see this call:
Dim JResults As String
JResults = ""
Call CoPDFXCview1.RunJavaScript(t_Jscript.Text, JResults)

This is VB6. Now, in .Net dll I am using, there are two overloads -
one is simple:
CoPDFXCview1.RunJavaScript(script as String) ' but I need to return a value
and another is complex:
CoPDFXCview1.RunJavaScript(script as String, byRef result as string, iD as integer, flag as integer)

I have no way of knowing what to pass as iD and flag. Passing zeros gives me "can't convert Function to string 0:" in the result string.

Hence, I have two questions:

a. Could you please tell me how to use RunJavaScript function. Or tell me if I need to be using something else.

b. I am a developer, and the only reason I will be buying a license from you is that I will need to build my project using your dll.

Please tell me what my method of using your dll will be - poke around your VB6 code, find something remotely suitable, learn from IntelliSense that the syntax changed, post a question, and wait for your reply? Or maybe I am not getting something?

Sorry if I sound a bit bitter. This whole thing scares me. If my boss pays upward of 2k for a license, he will be expecting some rapid development from me. And I will be in deep ouch if I am not able to use this efficiently.

Thank you as always for your help.
Helen.

Post by Paul - Tracker Supp »

Hi Helen,

Thanks for getting back to us with what you have found out. Take a look at page 28 of our ActiveX SDK manual; it details that function. I've attached that page of the manual to this post. The manual gets installed with the ActiveX SDK and should be available from the Windows Start Menu for the SDK.

If you send us the JavaScript you are running, we can take a look at what may be happening such that you get the error "can't convert Function to string 0:".

What is your timeframe for your proof of concept? Using the SDK in 'trial mode' I'm sure we can get your proof of concept working before spending any money. That way you can approach your boss with a working example (with water marks of course) and if he likes it then you have just to purchase the license, inject the key and deploy.

We will do what we can to help you in your efforts. I'm sorry that my answers thus far have been frustrating. Send that script and let's see what we can manage.

regards
Attachments
Pages from PDFV_AX.zip

Post by goussarova »

Thanks Paul.

Your answers are just fine; it is the learning curve that causes the frustration. I never realized that the manual would be under the Start menu. I was looking for it in the samples :-(. Sorry.

With the help of that PDFV_AX.pdf, everything looks good now - except the output string is incorrect and I am getting a warning.

Below is my code:

Private Sub btnSave_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnSave.Click
    Try
        ' Look up the document ID for index 0.
        Dim Id As Integer = 0
        AxCoPDFXCview1.GetDocumentID(0, Id)

        ' Export the field data, then save the document.
        Dim a As String = GetDocumentFieldsFromTracker(Id)
        'TODO - parse output, write to db.
        AxCoPDFXCview1.SaveDocument(Id, fileLocation, 1, 0)
        Debug.WriteLine(a)
    Catch ex As Exception
        Debug.WriteLine(ex.Message)
    End Try
End Sub

Private Function GetDocumentFieldsFromTracker(ByVal docid As Integer) As String
    Dim OutputStr As String = ""
    Dim flags As Integer = 0
    ' Run the script against the document; its result comes back in OutputStr.
    AxCoPDFXCview1.RunJavaScript("this.exportAsFDFStr()", OutputStr, docid, flags)
    Return OutputStr
End Function

The output string looks like this:

%FDF-1.4
%âãÏÓ
1 0 obj
<<
/FDF <<
/F (C:\\Users\\hgoussarova\\Documents\\Learn_Adobe_Forms\\PDFReader\\bin\\Debug\\PDFFiles\\Quality Checklist Printed Insert-Outsert.pdf)
/ID [(ýRÇ»Ù"²dèœ-Ïâ®?) (t/R®ï


but in fact it should contain the key value pairs like this:

...skip header...<</T(Date)/V(11-03-11)>><</T(Job)/V(123456)>><</T(Processed By)/V(USERID)>>...skip rest

From my previous work on this FDF topic, I had same results when I handled characters incorrectly. This header string contained char(192), parsing blew up, and the output string ended up truncated just like yours.
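
To illustrate what pulling the key/value pairs out of such a string could look like, here is a minimal sketch. It assumes each field dictionary lists /T immediately followed by /V, both as literal strings, exactly as in the sample line above. Real FDF output may order the keys differently, nest fields under /Kids, or use name values such as /Yes for check boxes, none of which this handles. The function names parseFdfPairs and unescapePdfString are hypothetical.

```javascript
// Hypothetical helper: undo the backslash escapes for ( ) and \ in a PDF literal string.
function unescapePdfString(s) {
  return s.replace(/\\([()\\])/g, "$1");
}

// Hypothetical helper: collect /T(name)/V(value) pairs from an FDF string.
// Only handles the flat "/T(...)/V(...)" layout shown in the sample.
function parseFdfPairs(fdf) {
  var pairs = {};
  // A literal string body is a run of ordinary characters or backslash escapes.
  var re = /\/T\s*\(((?:[^()\\]|\\.)*)\)\s*\/V\s*\(((?:[^()\\]|\\.)*)\)/g;
  var m;
  while ((m = re.exec(fdf)) !== null) {
    pairs[unescapePdfString(m[1])] = unescapePdfString(m[2]);
  }
  return pairs;
}
```

Because the header and the /ID array can contain bytes outside ASCII, the string has to reach a parser like this without being truncated or re-encoded along the way.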

Additionally, I am getting this Adobe message when opening the pdf after saving it with your SaveDocument function - "This document enabled extended features in Adobe Reader. The document has been changed since it was created and use of extended features is no longer available...". Could you please tell me why, and how to get rid of it?

I hope I am not too annoying, just want to get through with this, and there is no other way but to keep asking you questions even though you probably have too many other things to finish before you go home tonight.

I am really close now!
Thanks!
Helen.
Ivan - Tracker Software
Site Admin
Posts: 3549
Joined: Thu Jul 08, 2004 10:36 pm
Location: Vancouver Island - Canada

Post by Ivan - Tracker Software »

goussarova wrote: "but in fact it should contain the key value pairs"

It does contain them, but I'm afraid there are problems with the /ID array, which is generated without hex encoding.
Can you use exportAsXFDFStr?
Tracker Software (Project Director)

Post by goussarova »

Thanks Ivan,

I can use exportAsXFDFStr just fine - works perfectly.
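
As a rough illustration of the step that follows (turning the XFDF string into key/value pairs and checking completeness), here is a minimal sketch. It assumes the common flat XFDF layout, in which each field element carries a name attribute and a single value child, and it uses a regular expression only to stay self-contained. A proper XML parser in the host application would be the more robust choice, and hierarchical (nested) fields are not handled. The names parseXfdfFields, decodeXmlEntities and findEmptyFields are hypothetical.

```javascript
// Hypothetical helper: decode the five predefined XML entities in one pass.
function decodeXmlEntities(text) {
  var map = { lt: "<", gt: ">", quot: '"', apos: "'", amp: "&" };
  return text.replace(/&(lt|gt|quot|apos|amp);/g, function (whole, name) {
    return map[name];
  });
}

// Hypothetical helper: read flat <field name="..."><value>...</value></field> entries.
function parseXfdfFields(xfdf) {
  var values = {};
  var re = /<field\s+name="([^"]*)"\s*>\s*<value>([\s\S]*?)<\/value>\s*<\/field>/g;
  var m;
  while ((m = re.exec(xfdf)) !== null) {
    values[decodeXmlEntities(m[1])] = decodeXmlEntities(m[2]);
  }
  return values;
}

// Hypothetical completeness rule: a field counts as outstanding when its value is blank.
function findEmptyFields(values) {
  return Object.keys(values).filter(function (name) {
    return values[name].trim() === "";
  });
}
```

A form would then count as complete when findEmptyFields returns an empty list. Whether unfilled fields appear in the export with an empty value or are left out altogether should be checked against real output, since this sketch only sees fields that have a value element.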

But now I am having two problems with SaveDocument:

(a) - I am getting the following message when I open documents that were saved with SaveDocument(): "This document enabled extended features in Adobe Reader. The document has been changed since it was created and use of extended features is no longer available..."

(b) - Adobe Reader does not recognize the form fields any longer, even though my application keeps working with the fields just fine (except that it keeps showing the message on load).

From what I Googled, I understand that the two are closely related, but I could not find a workaround.

The most bizarre thing is that when I open the original form with Adobe Reader 9, I can change the field values just fine, and save by overwriting the original document just fine (after going through the "file already exists" dialog). And the fields are still there the next time I open the form.

On the other hand, when I save the same form using SaveDocument, I start getting the message above, even though it saves, and I can re-open it through my little project and make more changes. BUT Adobe Reader cannot see the fields any more!

When I look at that pdf's properties, I can see that it has pdf version 1.6 (created by acrobat 7.x) and Security / Filling Of Form Fields set to Not Allowed.

Could you please tell me how to get through this last glitch?

Thanks a million!
Helen.

Post by Paul - Tracker Supp »

Hi again Helen,

Ivan is the lead developer for Tracker Software, you have the ear of one of the most qualified people there are to help you here. Unfortunately he is finished for today, and it being Friday you may not hear back from him until Monday. I'll see if he can take a look over the week-end but cannot promise anything.

hth

Post by goussarova »

Thanks anyway.
I will be on vacation next week.

Have a great Thanksgiving!
Helen.

Post by Paul - Tracker Supp »

Lucky you! We already had Thanksgiving in Canada so I'll be avoiding the turkey this weekend, but thanks!

I'm sure you'll get an answer anyway.

:-)