Problems with PDF Object IDs

tliliith · Post by **tliliith** » Mon Dec 19, 2011 6:22 pm

Hello,

I am working on a piece of software (http://docear.org) which, amongst other things, extracts and keeps track of bookmarks and annotations in PDFs in order to inform the user about changes he has made to these items.
I am currently facing a larger issue and wonder if you can help me with the following:

I thought the object number and generation number were sufficient to identify annotations in a PDF across different versions, because the PDF 1.7 specification (http://www.adobe.com/content/dam/Adobe/ ... ce_1-7.pdf) states on page 63:

"Together, the combination of an object number and a generation number
uniquely identifies an indirect object. The object retains the same object number
and generation number throughout its existence, even if its value is modified."

But:

When I open a PDF file containing annotations created by another PDF reader (e.g. Foxit Reader) with the PDF-XChange Viewer, all the object numbers of the annotations seem to be set to different values after saving.
However, if I use PDF files which only contain annotations created with the PDF-XChange Viewer, the object numbers remain the same.
If I go the other way around and take a PDF with annotations created by your software, then open, change and save it with Foxit Reader, the object numbers remain the same as well.
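One rough way to reproduce this comparison is to scan the file before and after saving for indirect object headers of the form `N G obj`. The sketch below is only a heuristic under stated assumptions: the function name `object_ids` is made up for illustration, it only sees objects stored uncompressed (not those inside object streams), and it can report false matches if the same byte pattern happens to occur inside stream data.

```python
import re

# Matches indirect object headers such as "12 0 obj".
# Heuristic only: objects packed into object streams are not visible this way.
OBJ_HEADER = re.compile(rb"(?<!\d)(\d+)\s+(\d+)\s+obj\b")


def object_ids(pdf_bytes: bytes) -> set[tuple[int, int]]:
    """Return the (object number, generation number) pairs found in the raw bytes."""
    return {(int(num), int(gen)) for num, gen in OBJ_HEADER.findall(pdf_bytes)}


# Two tiny hand-written fragments standing in for "before" and "after" files.
before = b"1 0 obj\n<< >>\nendobj\n12 0 obj\n<< /Type /Annot >>\nendobj\n"
after = b"1 0 obj\n<< >>\nendobj\n7 0 obj\n<< /Type /Annot >>\nendobj\n"

print(sorted(object_ids(before) - object_ids(after)))  # [(12, 0)]
```

With real files, the bytes would come from reading each version in binary mode; IDs that disappear between the two versions point to renumbering.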

The renumbering of the objects in the first case means that we are not able to compare the annotations to their previous versions.
I would greatly appreciate any advice or help you can provide.

Best wishes,
Stefan

P.S.: I was using Windows 7 Professional 64-Bit and Version 2.5.200 of the PDF-XChange Viewer.

Tue Dec 20, 2011 9:18 am

Sorry, but you cannot use PDF object numbers to identify annotations or anything else after a PDF file has been changed. Most programs may change object numbers when saving a document, and this is not prohibited.

For example, take a document with 10 000 pages. This requires at least 20 000 objects (actually a few more for common document structures and resources, but approximately 20 000).
Now let's delete all pages except the last and save.

In most cases the last page will have the greatest object number, so to keep that number we would need records for almost all 20 000 objects. Even though most of these records would only note that an object was deleted, they would still occupy space in the resulting file.

For example, if the file conforms to the PDF spec. version 1.4 or higher, 20 000 objects or so will be recorded and will occupy 400 000 bytes - too big an overhead for one page.
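The 400 000-byte figure is consistent with a classic cross-reference table, where every entry has a fixed length of 20 bytes (a 10-digit offset, a 5-digit generation number, a one-letter type and an end-of-line). A minimal sketch of that arithmetic, assuming a classic table rather than a compressed cross-reference stream; the helper name is hypothetical:

```python
# Each entry in a classic PDF cross-reference table is exactly 20 bytes:
# 10-digit offset, space, 5-digit generation, space, "n" or "f", 2-byte end-of-line.
XREF_ENTRY_BYTES = 10 + 1 + 5 + 1 + 1 + 2


def xref_entries_size(object_count: int) -> int:
    """Bytes taken by the xref entries alone (hypothetical helper; ignores section headers)."""
    return object_count * XREF_ENTRY_BYTES


print(xref_entries_size(20_000))  # 400000
```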

Another reason not to keep object numbers is the possibility of extensive PDF editing. The PDF format has an architectural limit in this regard - a maximum of 8 388 607 objects. When you add or delete pages, annotations, etc., you cannot reuse free object numbers if you are using them for identification, so 8 million is not such a big number and may be reached quickly. PDF editing and modification as it happens now was never intended when the format was first designed, so some of the originally prescribed limits may need updating by the ISO standard to accommodate the ongoing extension of the format - which no doubt will happen in time.

These are the primary reasons we and other developers do not keep the original object numbers in our applications.

HTH.

joeran · Post by **joeran** » Mon Jan 09, 2012 9:10 am

I am a user of Docear too, and I would also like to request keeping the object numbers. Currently I am working with Foxit Reader, which keeps the object numbers, and I have never had trouble over the last few years. Maybe you can make this optional?

Post by **Paul - Tracker Supp** » Mon Jan 09, 2012 4:14 pm

Hi joeran,

after reading the post from Victor I think that the position here is fairly firm. I will however pass on your request and see if making this 'optional' is something we can/will do.

hth

emrea · Post by **emrea** » Tue Jan 10, 2012 3:59 pm

Hello all,
I am an avid user of PDFX-Change and Docear, and in fact I've been among those users who have been cheering for full compliance between Docear and PDFX-Change in Docear forums (even before it was called Docear). I have no clue on the technical details of the subject, and I understand that there are some hindrances beneath, but I heartily request from you technical people to provide a solution, or an option to "keep the object numbers" and make the best of two worlds for us the mortal users of both software.
Thank you for such an excellent product,
Best wishes,
Ayca.

Post by **Paul - Tracker Supp** » Tue Jan 10, 2012 4:02 pm

Thanks emrea,

noted.

rieman123 · Post by **rieman123** » Mon Jan 23, 2012 8:51 am

I REALLY wish that you could make PDFXV work smoothly with DocEar.
Please, can it be done?
rieman

Mon Jan 23, 2012 11:50 am

Hello Rieman,

Your voice is heard! But as Paul said, at the moment we can't promise that we can do something in this respect. If time allows, we will investigate the possibility of supporting this optionally.

Best,
Stefan

saulalbert · Post by **saulalbert** » Wed Feb 15, 2012 7:37 pm

Hi PDF-XChange people.

Just a quick note to request this functionality/compatibility with Docear.

I should also add a huge thank you for your amazing work on PDF-XChange Viewer - I'm using it under WINE and it completely wipes the floor with any native Linux pdf viewer/annotator I've used so far - in reliability, features and performance, which is outstanding for a piece of software under emulation.

Thanks a lot, and if you do have the option to add this feature, it would make me even happier!

Saul.

Post by **Paul - Tracker Supp** » Wed Feb 15, 2012 8:58 pm

Hi saulalbert,

well I guess this is the way to get our attention - the more noise we hear the more likely we will reconsider this position.

I must say that we are not going to do anything with new features until after V3 is released in May. We simply cannot afford at this point to digress at all from finalizing it. My suggestion is to keep making noise. We consider ourselves to be very responsive to user feedback, and while we must at all times balance user requests against 'bloat', the more individuals request this, the more likely it is to be viewed as something we should seriously investigate.

hth

breiterjanosch · Post by **breiterjanosch** » Thu Feb 16, 2012 11:37 am

In line with the previous requests, I would kindly ask you to integrate the requests made in this forum in order to make Docear and XChange more compatible.

I teach around 180 students who I would be happy to recommend Xchange reader to once this feature is implemented.

Best,

Jan Breitsohl

Thu Feb 16, 2012 11:43 am

Hello Jan Breitsohl,

I am afraid that just as Paul said above - we will not be working on any new features for ver2.5 of the Viewer, but once ver3 is released this May we will check this topic once again.

Best,
Stefan

tom.method@gmail.com · Tue Feb 21, 2012 9:35 pm

As both a PDFXchange customer and someone interested in using DocEar, count me in on this request too!

Finally there's a tool that allows me to convert bookmarks and comments (amongst which highlights) into a mindmap (which is my preferred tool for building summaries in a visual sticking format).
It's a d.... waste to have to re-establish all links between the pdf and the mindmap all the time.

I'm also in favor of parameterizing this behavior, so that people who rely on the current way PdfXChange works, or who would be negatively impacted if that changes, don't have to suffer from this demand.

Post by **Paul - Tracker Supp** » Tue Feb 21, 2012 9:44 pm

Thanks for the input Tom,

your voice has been heard and judging by the growing reaction to this I'm sure this will be discussed further once V3 is out.

regards

yanaoka · Post by **yanaoka** » Thu Mar 01, 2012 1:05 pm

Dear All,

Please count me in as well as a supporter of better integration between Docear and PDFXV. It would really be a great combo! I hope you will be able to implement this in the next versions.
best
davide

Post by **Tracker Supp-Stefan** » Thu Mar 01, 2012 1:50 pm

Thanks Davide,

Your voice is also heard and counted for this feature.

Best,
Stefan

phil_shvarcz · Post by **phil_shvarcz** » Fri Mar 02, 2012 4:11 pm

Hello Tracker team,
I'm also a long time user of pdfxchange viewer under wine (best pdf viewer I've encountered in any OS) and would also love to see persistent object IDs for smooth interoperation with Docear.
Thank you for offering the free reader and for considering implementing this feature.
Regards
Phil

Post by **Paul - Tracker Supp** » Fri Mar 02, 2012 4:14 pm

Thanks Phil,

nice to see you here and voicing your support for integration with Docear. Every voice counts...

dre · Post by **dre** » Fri Mar 09, 2012 9:10 pm

Hello Tracker team,

Thank you for PDF-XChange Viewer. I regard it as the best free PDF reader. It works great on my Linux computer through WINE.

PDF-XChange and Docear are the programs I use the most. I believe that collaboration between the developers of the two programs will add value to both.

Regards
Dré

gutberle · Post by **gutberle** » Sat Mar 10, 2012 1:26 pm

Dear folks at Tracker,

being a long time user and an avid promoter of your PDF-XChange Viewer I feel a bit guilty about pitching in "to make noise" as my first post on this forum, so let me say THANKS for the best PDFViewer/Editor out there before I start griping ...

As valid as the reasoning in the initial reply by HTH/lzcat to Stefan's request for persistent object IDs was, to hinge the reason for a possible non-support on more or less theoretical extremes such as PDFs with 20000+ pages, etc. seems a bit overly cautious to me. I am sure such PDFs do exist and I am also sure that someone somewhere would feel the pinch if you simply implemented IDs as persistent, but I and probably beyond 99.9% of all users of your software typically work with 1-200 page PDFs and it would take a fair amount of re-editing annotations to bust the 8,388,607 object limit at those page counts. Simple maths says ~42,000 edits for a file of 200 pages, which may not be the exactly correct way to calculate this, but the number would remain impressive even if I got it an order of magnitude wrong, right? - And even if someone managed to bust this object count limit due to non-reuse of IDs, well then simply present an error message and give the option to reorganize and compact the IDs at that time and at the danger of the user having to realign with DocEar - ONCE - instead of EVERY time as it stands now!
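The ~42,000 figure can be reproduced by dividing the object limit by the page count, which amounts to assuming that each edit pass consumes roughly one fresh object number per page. That per-page assumption is only illustrative and not something confirmed in this thread; the function name is likewise made up. A quick sketch:

```python
# Object limit quoted earlier in the thread; it equals 2**23 - 1.
MAX_INDIRECT_OBJECTS = 8_388_607


def edits_until_limit(pages: int, objects_per_page_per_edit: int = 1) -> int:
    """Rough count of edit passes before object numbers run out if none are reused."""
    return MAX_INDIRECT_OBJECTS // (pages * objects_per_page_per_edit)


print(edits_until_limit(200))  # 41943
```

Even if each edit used ten object numbers per page instead of one, the same estimate would still allow about 4,000 edit passes on a 200-page file.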

I do see your point, though, and think that making ID persistence an optional feature sounds like a perfect solution! You would not only be doing us as DocEar users (check it out, it's cool as well), but potentially the PDF user community at large a wonderful favor, AND you would make PDFViewer even more indispensable to even more people out there. Sound good? ...

Many, many thanks,
Ingmar

Mon Mar 12, 2012 11:43 am

Hello Ingmar,

While you are correct that probably 99.9% of the documents used won't have more than 200 pages, we still cannot knowingly introduce a limitation that could potentially cause documents to break. The alternative is much better: you opt in to such static IDs, and when you do so you are aware of the possible consequences. That is preferable to throwing error messages at someone who has no idea what is about to happen in his normal day's work (which may involve hundreds or thousands of edits to the same file).

So let's get v3 of the Viewer ready first - and see how this case will develop once the new version is out.

Best,
Stefan

spicedreams · Post by **spicedreams** » Wed Mar 21, 2012 8:59 pm

Hi Stefan,

Thanks for a great product.

I agree you should not willingly do something that will cause documents to break. However the PDF specification the original poster quoted makes it clear that you break documents every time you change the IDs.

If the original poster is right in his observations, you only change pre-existing annotations' IDs when the annotations were made by something other than PDF-X. So PDF-X's behaviour doesn't seem to support your argument that you need to change the IDs, because in one case you don't change them.

I look forward to version 3, and I understand why you want to put discussion off till that's done. But please take a hard look at the logic and recognise that static IDs are the right solution, and adjusting them should be the 'opt-in' for advanced users.

Thanks

Graham

Post by **Paul - Tracker Supp** » Wed Mar 21, 2012 9:28 pm

Thanks for your input Graham,

this has indeed turned into quite the debate! All your ideas/thoughts are welcome.

Gadgety · Post by **Gadgety** » Thu Mar 22, 2012 6:14 am

I want to add a "me too" to this request, just to confirm Tracker is making the right decision to respond to this request.

Thu Mar 22, 2012 11:05 am

Noted Gadgety

Best,
Stefan

Shiyin · Post by **Shiyin** » Sun Mar 25, 2012 4:13 am

Me too!

Post by **Tracker Supp-Stefan** » Mon Mar 26, 2012 8:40 am

technatica · Post by **technatica** » Sat Apr 21, 2012 4:56 am

I just wanted to add my voice to the chorus. As a long time Linux user this is the first time I have needed either a software package like Docear or PDF-XChange viewer. I look forward to seeing this happen.

Sat Apr 21, 2012 11:27 pm

Thank you

lukewallace · Post by **lukewallace** » Tue May 01, 2012 10:15 am

Please make this an option!

Tue May 01, 2012 11:02 am

Hi lukewallace,

Your voice is also noted.

Cheers,
Stefan

Yaakov · Post by **Yaakov** » Fri May 04, 2012 3:09 am

Hello

I will probably be using Docear as the hub for managing mindmaps and bibliographic databases for two dozen complex academic projects, each with 100-2000 associated PDFs from various origins (and encouraging my students to do likewise). After recently abandoning Adobe Acrobat for most daily work, and being very impressed with PDF-Xchange Viewer as an alternative, I would love for this to be my default PDF reader interfacing with Docear if the object renumbering thing can be worked out through an option.

Thanks for the great product, free OCR, and the civilized tone that pervades all your documentation and responses (and good luck with the V3 release. . . .)

Yaakov

Sat May 05, 2012 12:21 am

Hi,

Thanks for the kind words - I am afraid for now however we can make no promises regarding integration with Docear ...

critStock · Post by **critStock** » Mon May 07, 2012 8:01 pm

I've been using PDFX for four years or so now. It is hands down the best viewer! The response to problems and feature requests in these forums has also been absolutely stellar. At one point I made a series of requests having to do with exporting comments, and the features were added very quickly, which massively enhanced my workflow and cemented my loyalty to this product. I understand that v3 is the team's top priority right now, but I'd like to add my voice to those requesting an eventual option for fixed IDs to improve compatibility with Docear.

Thanks again for all your hard work and support!

Post by **Paul - Tracker Supp** » Mon May 07, 2012 9:09 pm

Hi critStock,

thanks for the vote and also the kind words. Your loyalty is appreciated and your comments noted.

regards

Padrig · Post by **Padrig** » Fri May 25, 2012 1:37 pm

Hello,

I am also a fan of pdfxviewer and docear and will be so happy to see them working well together.

I hope version 3 (or 2.xx) of pdfxviewer will fulfil this wish.

Post by **Tracker Supp-Stefan** » Fri May 25, 2012 1:44 pm

Thanks for adding your vote here Patrick

It's heard and taken into account.

Cheers,
Stefan

cassiocv · Post by **cassiocv** » Mon Jun 04, 2012 11:18 am

Me too!!!!!!!!!!!!!

Mon Jun 04, 2012 11:20 am

Thanks for adding your voice cassiocv!
It's noted!

Cheers,
Stefan

priitl · Post by **priitl** » Wed Jun 06, 2012 7:08 am

In general I love your pdf reader and I have used it a lot, but since I started using Docear I really am choosing between Foxit reader and PDF X-Change Viewer, and whichever is first to add the features needed to work with Docear will be the one I stick with. And the one I will recommend to my students and coworkers.

Post by **Tracker Supp-Stefan** » Wed Jun 06, 2012 8:44 am

Thanks for your input priitl,

This topic is certainly gaining a lot of popularity - so while we cannot make any promises at this moment that we will actually implement it, once v3 of our Viewer is released we will investigate this in detail!

Best,
Stefan

Mark Phillips · Post by **Mark Phillips** » Tue Jun 12, 2012 9:27 am

I am trying to understand when the object IDs change and when they can be relied on.

Will they stay the same (and hence allow the docear integration to work) if

a) I do NOT edit the PDF file at all
b) I only add comments to the PDF file using PDF-Xchange

Or will PDF-Xchange sometimes renumber the comment object IDs if I add a comment in between two existing comments or something?

In other words, if I stick to static PDF files and only add comments using PDF-Xchange, will it all work?!?!

Cheers
Mark

Post by **Paul - Tracker Supp** » Tue Jun 12, 2012 6:04 pm

Hi Mark,

I will need to get confirmation of this from Victor. He should be able to answer by tomorrow.

For all the other users out there who have added their voice to requesting persistent Object IDs, I am pleased to be able to say that this is now formally a Feature Request for V3 of the Viewer, and while this is not yet a promise to deliver, the chances are good that it will be done. RT#1401: Feature Request : Persistent Object IDs

joeran · Post by **joeran** » Tue Jul 10, 2012 2:45 pm

Paul - Tracker Supp wrote:
"For all the other users out there who have added their voice to requesting persistent Object IDs, I am pleased to be able to say that this is now formally a Feature Request for V3 of the Viewer and while not yet a promise to deliver the chances are good that this will be done. RT#1401: Feature Request : Persistent Object IDs"

As one of the founders of Docear I am really glad to read this and I sincerely hope that you will decide to implement this feature.

Best regards
Joeran

Post by **John - Tracker Supp** » Tue Jul 10, 2012 6:34 pm

Pleasure and good to have your voice here Joeran

joeran · Post by **joeran** » Fri Jul 13, 2012 6:34 am

One of our users just told me that in PDF X-Change Viewer you can set the option "Options - General - Saving Documents" to "Always incremental save" (see http://screencast.com/t/SGe4argZgsA ), and this seems to have exactly the effect we all want (stable objectIDs).

Can you confirm this? Or does this setting have some disadvantages?

Nico - Tracker Supp · Fri Jul 13, 2012 10:00 pm

Hi joeran,

Thanks for your post.
The option "Always incremental save" from "Options - General - Saving Documents" may have the effect of preserving stable objectIDs, because it appends modified or added objects at the end of the document and keeps track of the existing object IDs when creating new ones. However, this is not the intended goal of this functionality, and it cannot be considered a substitute for what you are looking for, which is a unique relationship between objects and their given IDs.
Thanks.
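The difference between the two saving strategies can be illustrated with a toy model. This is only a sketch under simplifying assumptions and not how PDF-XChange Viewer is actually implemented: `ToyDocument`, `incremental_save` and `compacting_save` are hypothetical names, and a "document" here is just a mapping from object number to a label.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDocument:
    """Toy stand-in for a PDF: object number -> label. Not a real PDF model."""
    objects: dict = field(default_factory=dict)
    next_number: int = 1

    def add(self, value: str) -> int:
        number = self.next_number
        self.objects[number] = value
        self.next_number += 1
        return number

    def delete(self, number: int) -> None:
        del self.objects[number]


def incremental_save(doc: ToyDocument) -> dict:
    # Changes are appended; surviving objects keep their original numbers.
    return dict(doc.objects)


def compacting_save(doc: ToyDocument) -> dict:
    # A full rewrite may renumber the surviving objects densely from 1.
    survivors = [doc.objects[n] for n in sorted(doc.objects)]
    return {new: value for new, value in enumerate(survivors, start=1)}


doc = ToyDocument()
doc.add("page")
note_a = doc.add("annotation A")
note_b = doc.add("annotation B")
doc.delete(note_a)

print(incremental_save(doc))  # {1: 'page', 3: 'annotation B'}
print(compacting_save(doc))   # {1: 'page', 2: 'annotation B'}
```

In the toy model "annotation B" keeps number 3 under the incremental strategy but becomes number 2 after compaction, which is the kind of change that breaks an external tool that stored the old number.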

Sincerely,

joeran · Post by **joeran** » Mon Jul 16, 2012 10:47 am

thank you for the clarification

Mon Jul 16, 2012 11:34 am

Steve Pixley · Post by **Steve Pixley** » Wed Jul 18, 2012 12:09 pm

I'm facing some issues with formatting the PDF's source URL:

it works fine with path/file.pdf
it fails to load the document if the URL contains %2F instead of a slash
it fails if the file is fetched through an indirect request such as fileget.php?id=3 (the same request downloads a PDF if invoked directly in a browser)