Barcode finding and reading from PDF content?

This Forum is for the use of Software Developers requiring help and assistance for Tracker Software's PDF-Tools SDK of Library DLL functions(only) - Please use the PDF-XChange Drivers API SDK Forum for assistance with all PDF Print Driver related topics or PDF-XChange Viewer SDK if appropriate.

Moderators: TrackerSupp-Daniel, Tracker Support, Vasyl-Tracker Dev Team, Chris - Tracker Supp, Sean - Tracker, Tracker Supp-Stefan

omascia
User
Posts: 48
Joined: Thu Mar 11, 2010 7:07 pm

Barcode finding and reading from PDF content?

Post by omascia »

Are there known good ways to recognize barcodes within a PDF (or should I ask this in the OCR topic?), either through means offered by PDF-Tools SDK or through third-party tools? It would surely be easy to rasterize a page and then hand it to another SDK for processing. I'm just checking that I'm not about to invest in something I already have (or nearly have), and whether other users have past experience doing this, and with which additional tool, if they're willing to share their experience of course.

Thanks a lot,
Paul - Tracker Supp
Site Admin
Posts: 6829
Joined: Wed Mar 25, 2009 10:37 pm
Location: Chemainus, Canada

Re: Barcode finding and reading from PDF content?

Post by Paul - Tracker Supp »

Hi omascia,

Typically, bar codes in a PDF are rendered using a barcode font or directly as an image. At present there is nothing in our SDK for reading bar codes; however, we are planning to add some functionality to locate bar codes in a PDF and extract some information from them. This is a long-term goal, though, and is not going to be ready for some time.

Presently you would need to use third party libraries or functions to actually read the bar codes.

I hope that helps.
Best regards

Paul O'Rorke
Tracker Support North America
http://www.tracker-software.com
omascia
User
Posts: 48
Joined: Thu Mar 11, 2010 7:07 pm

Re: Barcode finding and reading from PDF content?

Post by omascia »

Yes thanks, it helps to know I'm not wrong in seeking some additional tool for that need.

The barcodes I'd like to recognize will be found as images, or parts of images, within the PDF pages. These could be ordinary text PDFs with embedded images (built using PDF-Tools SDK) or PDFs coming from scanners or fax engines (and therefore wholly picture-based). I will simply follow this path for now: render the PDF page to a bitmap, then analyze the bitmap. After only a couple of hours of searching, it looks like there are a lot of commercial solutions available, though most have ridiculous (higher than reasonable) pricing. I have also found some open-source code with appropriate / compatible licensing terms, and some good academic papers on the techniques for parsing images for barcodes. It is pretty much the same business as OCR, although the target is slightly different, the patterns to recognize are a lot simpler, and their count is really limited. Not to mention that I do not need to recognize all the symbologies, just one or two of them, among the simplest to decode (EAN/GS1 7/8/12/13).
I just have to evaluate a number of these solutions and then proceed.
So thanks for the confirmation I needed.
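
Since EAN-8 and EAN-13 are singled out above as among the simplest symbologies to decode, here is a minimal sketch of the one part of them that needs no image processing at all: the check digit. Whichever library ends up doing the scanning, re-checking this digit is a cheap way to discard misreads. The function names are made up for illustration and do not come from any SDK mentioned in this thread.

```python
def ean_check_digit(payload: str) -> int:
    """Return the check digit for the data digits of an EAN code.

    `payload` is the code WITHOUT its final check digit:
    7 digits for EAN-8, 12 digits for EAN-13.
    """
    if not (payload.isascii() and payload.isdigit()):
        raise ValueError("payload must contain only the digits 0-9")
    total = 0
    # Weights alternate 3, 1, 3, 1, ... starting from the rightmost data digit.
    for index, char in enumerate(reversed(payload)):
        weight = 3 if index % 2 == 0 else 1
        total += int(char) * weight
    # The check digit brings the weighted sum up to a multiple of 10.
    return (10 - total % 10) % 10


def is_valid_ean(code: str) -> bool:
    """True if `code` is an 8- or 13-digit string with a correct check digit."""
    if len(code) not in (8, 13):
        return False
    if not (code.isascii() and code.isdigit()):
        return False
    return ean_check_digit(code[:-1]) == int(code[-1])
```

A decoded value would be accepted only if `is_valid_ean(value)` returns True; anything else can be treated as a misread and the page scanned again with different settings.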
Paul - Tracker Supp
Site Admin
Posts: 6829
Joined: Wed Mar 25, 2009 10:37 pm
Location: Chemainus, Canada

Re: Barcode finding and reading from PDF content?

Post by Paul - Tracker Supp »

My pleasure omascia,

Do let us know if you need anything further from us here, and maybe keep us in the loop regarding a solution? I'm sure others here will appreciate hearing what you find out.

Sincerely
Best regards

Paul O'Rorke
Tracker Support North America
http://www.tracker-software.com
omascia
User
Posts: 48
Joined: Thu Mar 11, 2010 7:07 pm

Re: Barcode finding and reading from PDF content?

Post by omascia »

In this ongoing quest, I have successfully experimented with rendering the pages and then scanning them for the EAN8 barcodes I'm interested in. I am currently experimenting with the "zbar" project for that purpose (quite nice, albeit clearly inferior to, and more limited than, some commercial tools I have quickly tested).

Now, to optimize the process, I intend to access the images within the PDF and analyze them directly, instead of rendering the page and analyzing the resulting bitmap. As I'm focused on processing multi-page documents which will have been scanned with EAN8 stickers on some pages (to mark new documents and tag their nature: invoice, order, letter, unknown, you get the idea), I clearly have no use for spending the resources (and time) it takes to render each page for the sole purpose of processing the resulting bitmap.

I can access and extract images from the PDF (including as TIFF files, sorry for the obvious questions in my last topic).

How could I extract the actual embedded images in whatever format they are stored in the PDF, instead of having to save them to some specific format I'd choose (and thus possibly suffering from an image conversion which *might* in some circumstances imply losing some quality and negatively impact my later detection and decoding of barcodes)?

My knowledge of PDF internals is limited. Are all images in there always resampled and recoded as a single type of image (at time of storage)? Or are we talking about a whole zoo of formats (depending on whatever software did the scanning and constructed the PDF storing the scanned pages)?

To better understand the context of this question: my final goal is to get a grayscale (not indexed) 8bpp bitmap in memory to hand to the barcode scanning code. It feels convoluted to let PDF-Tools convert whatever is in the PDF to some common disk file format, then re-read it, map it to the required bitmap format, and only then proceed with it.
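
For illustration, the in-memory target described above (8 bpp, true grayscale, not indexed) can be sketched in plain Python. This assumes the source pixels are packed 24-bit colour in R, G, B byte order with an optional row stride; many Windows bitmaps are stored as B, G, R with padded rows, so the order and stride would need checking first. It uses the common ITU-R BT.601 luma weights. The function name and parameters are hypothetical and are not part of PDF-Tools or any other SDK.

```python
from typing import Optional


def rgb24_to_gray8(rgb: bytes, width: int, height: int,
                   stride: Optional[int] = None) -> bytes:
    """Convert packed 24-bit RGB pixels to a tightly packed 8 bpp grayscale buffer.

    `stride` is the number of bytes per source row (assumed to be
    width * 3 when omitted). The result has exactly width * height bytes.
    """
    if stride is None:
        stride = width * 3
    if stride < width * 3:
        raise ValueError("stride is smaller than one row of pixels")
    if height > 0 and len(rgb) < stride * (height - 1) + width * 3:
        raise ValueError("source buffer is too small")

    gray = bytearray(width * height)
    for y in range(height):
        row_start = y * stride
        for x in range(width):
            offset = row_start + 3 * x
            r, g, b = rgb[offset], rgb[offset + 1], rgb[offset + 2]
            # Integer BT.601 luma (0.299 R + 0.587 G + 0.114 B), rounded.
            gray[y * width + x] = (299 * r + 587 * g + 114 * b + 500) // 1000
    return bytes(gray)
```

A pure-Python loop like this is slow on full scanned pages; it is only meant to pin down what the buffer should contain, and a native conversion routine would be the practical choice.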
omascia
User
Posts: 48
Joined: Thu Mar 11, 2010 7:07 pm

Re: Barcode finding and reading from PDF content?

Post by omascia »

Answering my own question... :)
It didn't strike me at first that Image-XChange SDK has actually been included along with PDF-XChange Pro SDK since versions 5.x.
That is *great* (old) news.
Do you want to know why I missed it? Its documentation is missing from the PDF-XChange Pro SDK download. I had to download the Image-XChange SDK kit to get to it. :)

Now to my task at hand: it actually revolves around the following, thanks to the Image-XChange SDK:

- browse the images using PXCp_ImageGetFromPage()
- get them as Image-XChange objects through PXCp_GetDocImageAsXCPage()
- check their encoding format using IMG_PageGetFormat()
- if needed, convert them to my needed grayscale 8 bpp format using IMG_PageConvertToFormat()
- get access to the array of bytes representing the pixels through IMG_PageLockBlock()
- do my stuff from there
- then cleanup resources, which involves IMG_PageUnlockBlock(), IMG_PageDestroy(), PXCp_ImageClearPageData(), and later PXCp_ImageClearAllData()

It works *really* well.
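
As a rough idea of what the "do my stuff from there" step can start with once the 8 bpp grayscale bytes are accessible: threshold one pixel row and collapse it into runs of dark and light pixels, which is the raw material a 1D barcode decoder works on. This is only a sketch in plain Python, not how zbar or any SDK named in this thread does it. The fixed threshold of 128 is an assumption (real scans usually need an adaptive one), and the function names are hypothetical.

```python
from typing import List, Tuple


def scanline_runs(row: bytes, threshold: int = 128) -> List[Tuple[bool, int]]:
    """Collapse one row of 8 bpp grayscale pixels into (is_dark, length) runs.

    A pixel counts as dark (part of a bar) when its value is below `threshold`.
    """
    runs: List[Tuple[bool, int]] = []
    for value in row:
        dark = value < threshold
        if runs and runs[-1][0] == dark:
            # Same colour as the previous pixel: extend the current run.
            runs[-1] = (dark, runs[-1][1] + 1)
        else:
            # Colour changed (or first pixel): start a new run.
            runs.append((dark, 1))
    return runs


def get_row(gray: bytes, width: int, y: int) -> bytes:
    """Return row `y` of a tightly packed 8 bpp buffer that is `width` pixels wide."""
    return gray[y * width:(y + 1) * width]
```

For example, `scanline_runs(get_row(gray, width, height // 2))` gives the bar and space widths along the middle row of an image. A decoder would then compare those widths against the symbology's module patterns, typically repeating this over several rows to cope with noise.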
Will - Tracker Supp
Site Admin
Posts: 6815
Joined: Mon Oct 15, 2012 9:21 pm
Location: London, UK

Re: Barcode finding and reading from PDF content?

Post by Will - Tracker Supp »

Hi omascia,

So am I right in thinking that you have everything you need? Or is there anything else that we can help you with :)
If posting files to this forum, you must archive the files to a ZIP, RAR or 7z file or they will not be uploaded.
Thank you.

Best regards

Will Travaglini
Tracker Support (Europe)
Tracker Software Products Ltd.
http://www.tracker-software.com
omascia
User
Posts: 48
Joined: Thu Mar 11, 2010 7:07 pm

Re: Barcode finding and reading from PDF content?

Post by omascia »

So am I right in thinking that you have everything you need? Or is there anything else that we can help you with :)
You are right! :wink:

Just keep in mind the idea of possibly adding a barcode finding and decoding feature at some point in the future, either as a special feature of the OCR kit or separately. But I can now reach my goal, with very good performance, and with few add-ons.
Keep up the good work. 8)
Patrick-Tracker Supp
Site Admin
Posts: 1645
Joined: Thu Mar 27, 2014 6:14 pm
Location: Vancouver Island

Re: Barcode finding and reading from PDF content?

Post by Patrick-Tracker Supp »

:D

Cheers,

Patrick Charest
Tracker Support North America
myhealthylawn
User
Posts: 12
Joined: Wed Jan 10, 2018 1:30 pm

Re: Barcode finding and reading from PDF content?

Post by myhealthylawn »

Hello,
Have there been any changes to the status of the bar code conversation? I would like to be able to copy and print them.
It is a big deal when printing my ballgame tickets, or tickets for a play.
Tracker Supp-Stefan
Site Admin
Posts: 17818
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: Barcode finding and reading from PDF content?

Post by Tracker Supp-Stefan »

Hello myhealthylawn,

It will depend on the barcode (some are using a special font, some are just images, and some might be dynamically generated), so can you please send us a sample that we can look at?

Cheers,
Stefan
myhealthylawn
User
Posts: 12
Joined: Wed Jan 10, 2018 1:30 pm

Re: Barcode finding and reading from PDF content?

Post by myhealthylawn »

Can we just take all the options you suggested and get them all to work? Your program is of no use to me as my default if it cannot do what Adobe does.
Patrick-Tracker Supp
Site Admin
Posts: 1645
Joined: Thu Mar 27, 2014 6:14 pm
Location: Vancouver Island

Re: Barcode finding and reading from PDF content?

Post by Patrick-Tracker Supp »

Hello myhealthylawn,

Could you please provide some examples of what you mean? I am afraid that it is not as simple as you might think, but we should be able to do everything Adobe can. If that is not the case, we will need to see one of these files so that we can investigate the issue, whatever it may be.
Thank you!

Cheers,

Patrick Charest
Tracker Support North America
myhealthylawn
User
Posts: 12
Joined: Wed Jan 10, 2018 1:30 pm

Re: Barcode finding and reading from PDF content?

Post by myhealthylawn »

Attached are 2 images showing the bar codes from the original ticket, one as handled by Tracker software and one as handled by Adobe.
Attachments
Showing how Adobe handles the barcode printing.
Showing how your software handles the barcode printing. The same thing happens with certain bank statements. My hunch is that it has to do with security.
Tracker Supp-Stefan
Site Admin
Posts: 17818
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: Barcode finding and reading from PDF content?

Post by Tracker Supp-Stefan »

Hello myhealthylawn,

Thanks for sharing those screenshots.
We are aware of the different sub specifications of the PDF file format, and it's unlikely that a specific sub format is causing the issue.
Barcodes are quite specific on their own - so a sample file will definitely help.
Can you please send a copy of that expired ticket PDF to support@pdf-xchange.com and we will take a further look?

Regards,
Stefan