How to extract the value(s) of barcodes from a PDF

TerraD
User
Posts: 25
Joined: Tue Jan 28, 2014 9:34 pm

How to extract the value(s) of barcodes from a PDF

Post by TerraD »

1) Is there really no elegant way to directly extract the values of barcodes contained in a .PDF with PDF-Tools? I would expect PDF-Tools to offer a way to extract those values, much as we can extract text from a .PDF, perhaps with an option to extract only the values of, e.g., QR codes...

My workaround is to convert a .PDF to a .PNG, save the file and hand that file over to ZXing to extract the values of the barcodes. Not really elegant... The following Command Line will convert my file:


PDFXTools.exe /RunTool pdfToImages "y:\MyTestFile.pdf" /Output:folder=\"y:\";filename=\"MyTestFile.png\";overwrite=makeuniq;showfiles=yes
However, this leads to a second question:
2) How do I make the above command create a .PNG with a resolution of 300 dpi instead of the 150 dpi default? I assume there are additional parameters for 'pdfToImages', but I cannot find any hint about this in the online help.
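
For more than one file, the conversion command from the first question could be generated per document. The following is only a minimal Python sketch that assembles the command string; it uses nothing beyond the /RunTool and /Output options already shown above, and it does not run the tool. The function name `pdf_to_images_command` and the second file name are hypothetical, and the folder value is simply placed between the escaped quotes as in the original command.

```python
from pathlib import PureWindowsPath


def pdf_to_images_command(pdf_path: str, folder: str, filename: str,
                          exe: str = "PDFXTools.exe") -> str:
    """Assemble the pdfToImages command line from the post for one PDF.

    Only options that appear in the original command are used; no
    additional pdfToImages parameters are assumed to exist.
    """
    output = (
        f'/Output:folder=\\"{folder}\\";'
        f'filename=\\"{filename}\\";'
        "overwrite=makeuniq;showfiles=yes"
    )
    return f'{exe} /RunTool pdfToImages "{pdf_path}" {output}'


# "Another.pdf" is a made-up second input, used only for illustration.
for pdf in (r"y:\MyTestFile.pdf", r"y:\Another.pdf"):
    png_name = PureWindowsPath(pdf).stem + ".png"
    print(pdf_to_images_command(pdf, "y:", png_name))
```

Each printed line can then be run from a batch file or a scheduler, and the resulting .PNG files handed to ZXing as before.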
DenisO
Site Admin
Posts: 94
Joined: Fri Jun 09, 2017 5:40 pm

Re: How to extract the value(s) of barcodes from a PDF

Post by DenisO »

Hi TerraD,
Try using the 'Export Form Data' action.
Kind Regards,
Denis Oleksenko
Software Developer
Tracker Software Products (Canada) LTD
TerraD
User
Posts: 25
Joined: Tue Jan 28, 2014 9:34 pm

Re: How to extract the value(s) of barcodes from a PDF

Post by TerraD »

Maybe I should have been more specific about my input: my .PDFs are either scanned or received by mail/downloaded. They are produced by many different authorities, so each has a different structure and look. The only thing they have in common is a QR code with strictly defined content.

There are no form fields in the .PDFs, so I do not see how 'Export Form Data' could be helpful. In fact, I ran a quick test and the generated .fdf did not contain any useful information...
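
A quick way to check whether an exported .fdf carries any field values at all is sketched below in Python. It is a rough sketch under stated assumptions: it assumes the .fdf stores fields as plain `/T (name)` and `/V (value)` literal strings inside simple dictionaries, and it will miss hex-encoded strings, escaped parentheses and nested field hierarchies. The function name `fdf_field_values` and the sample content are made up for illustration.

```python
import re

# Innermost PDF dictionaries: "<< ... >>" with no nested dictionary inside.
_DICT = re.compile(r"<<([^<>]*)>>")
_NAME = re.compile(r"/T\s*\(([^)]*)\)")
_VALUE = re.compile(r"/V\s*\(([^)]*)\)")


def fdf_field_values(fdf_text: str) -> dict:
    """Return {field name: value} for simple literal-string fields."""
    fields = {}
    for body in _DICT.findall(fdf_text):
        name = _NAME.search(body)
        value = _VALUE.search(body)
        if name and value:
            fields[name.group(1)] = value.group(1)
    return fields


# Hypothetical FDF content with a single field, for illustration only.
sample = (
    "%FDF-1.2\n1 0 obj\n"
    "<< /FDF << /Fields [ << /T (qr) /V (ABC123) >> ] >> >>\n"
    "endobj\n"
)
print(fdf_field_values(sample))
```

An empty result for a scanned document would be consistent with the file having no form fields to export.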
TrackerSupp-Daniel
Site Admin
Posts: 5971
Joined: Wed Jan 03, 2018 6:52 pm

Re: How to extract the value(s) of barcodes from a PDF

Post by TrackerSupp-Daniel »

Hello, TerraD

Unfortunately if the barcode is not already a form field, we have no way to extract the value at the moment. I know that some OCR engines are capable of reading and converting barcodes, so I will ask our Dev team if it is possible for our OCR engine to offer this in the future, but for the moment I am afraid that the steps you are taking now are likely the best solution to this problem.

If you have a document which DOES have a barcode form field within it, these articles may help you to better understand the field type:
https://help.tracker-software.com/pdfxe ... es_ed.html
https://www.tracker-software.com/knowle ... s#barcodes

Kind regards,
Daniel McIntyre - Support Technician
Tracker Software Products (Canada) LTD

Support: <Support@tracker-software.com>
TerraD
User
Posts: 25
Joined: Tue Jan 28, 2014 9:34 pm

Re: How to extract the value(s) of barcodes from a PDF

Post by TerraD »

Thank you for your answer. I am not sure that barcode decoding fits into the concept of an OCR engine. I think it's rather a separate treatment of images... I would definitely like to have it in a separate interface (= own <ToolID>).

Please have a look at my Question 2. My workaround will only work if I can get a resolution of 300 dpi instead of the 150 dpi default! What parameter do I need to pass on the Command Line to 'pdfToImages' in order to get that resolution?

I'll append my solution to the barcode extraction here as an example as soon as I can complete it - but I need that question to be solved for that! The solution is based on an OpenSource library.
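
To illustrate why the resolution matters for decoding, here is a small Python sketch of the arithmetic. It relies on the fact that a QR code of version v has 17 + 4·v modules per side; the printed size of 20 mm and version 5 are made-up example values, not taken from my documents, and the sketch makes no claim about the minimum a particular decoder needs.

```python
MM_PER_INCH = 25.4


def qr_modules_per_side(version: int) -> int:
    """Return the number of modules per side for a QR code version (1..40)."""
    if not 1 <= version <= 40:
        raise ValueError("QR versions range from 1 to 40")
    return 17 + 4 * version


def pixels_per_module(printed_size_mm: float, version: int, dpi: int) -> float:
    """Pixels covering one module when the page is rendered at `dpi`.

    `printed_size_mm` is the side length of the symbol without its quiet zone.
    """
    size_px = printed_size_mm / MM_PER_INCH * dpi
    return size_px / qr_modules_per_side(version)


# Hypothetical example: a 20 mm wide version-5 QR code.
for dpi in (150, 300):
    print(dpi, "dpi:", round(pixels_per_module(20.0, 5, dpi), 2), "px per module")
```

Doubling the resolution doubles the number of pixels per module, which leaves the decoder more margin on small or dense codes.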
TrackerSupp-Daniel
Site Admin
Posts: 5971
Joined: Wed Jan 03, 2018 6:52 pm

Re: How to extract the value(s) of barcodes from a PDF

Post by TrackerSupp-Daniel »

Hello, TerraD

It most certainly is possible and is built into the OCR engine. I just confirmed with the Dev team that our OCR already does this. The only caveat is that it does not create visible, searchable, or editable text; it adds a hidden "alternative text" item, which is currently only visible to screen readers. We are therefore looking at a way to allow copying of that alternative text in the future.

Regarding the image DPI, there is no direct control for this (even via the command line). However, if you disable the "compression" options in the "Image to PDF" function, then the images will be placed within the document at their original resolution/DPI. Making a change in the UI and running the tool once will save those settings; they will be remembered and used when running via the command line, as long as they are left unchanged.

Kind regards,
Daniel McIntyre - Support Technician
Tracker Software Products (Canada) LTD

Support: <Support@tracker-software.com>