Xaris Eireinei wrote: ↑Thu Jun 17, 2021 4:49 pm
I'm bumping into this same issue.
[...]
I tried doing a java script for the Additional Metadata (https://forum.pdf-xchange.com/viewtopic.php?f=62&t=36762) but only the following carry over:
this.info.title = "document title"; [WORKED]
this.info.author = "author"; [WORKED]
this.info.authortitle = "authortitle"; [FAILED]
this.info.subject = "subject"; [WORKED but not after this]
this.info.description = "description"; [FAILED]
this.info.keywords = "keywords"; [WORKED but not after this]
this.info.copyright = "copyright"; [FAILED]
this.info.copyrightnotice = "copyrightnotice"; [FAILED]
Hi, Xaris.
I had a hunch that perhaps the fields you were looking for were not stored in the place where you were looking.
In other words, just because the document title can be found under this.info.title, that doesn't guarantee that the description is available at this.info.description; it might instead be found at (say) this.info.docDesc, or at this.detailedInfo.description, or at this.info.details.description.
Before I go on, I want to make a few quick points. First, I am not an expert in JavaScript ("JS"). Second, despite the similar names, JavaScript is not the same as "Java", and it's not even similar. Third, if I enter this.info.qwertyuiop (which does not exist) in the PDF-XChange Editor's JS Console, it surprisingly doesn't complain; I presume this is what you saw for the lines you marked "FAILED". Lastly, I didn't initially understand what you meant by "but not after this"; I now gather that you meant "but I do not need this".
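On the third point, here is a minimal sketch of why nothing "complains". It uses a plain stand-in object (the name "info" and its contents are made up for illustration) and assumes this.info behaves like an ordinary JavaScript object when a missing property is read. I have run this kind of thing outside the Editor only.

```javascript
// Stand-in for this.info (hypothetical contents); the real object lives inside the Editor.
var info = { Title: "document title", Author: "author" };

// Reading a property that does not exist throws no error; it just yields undefined.
var missing = info.qwertyuiop;

// An explicit existence check is more informative than reading the value.
var exists = Object.prototype.hasOwnProperty.call(info, "qwertyuiop");

console.log(typeof missing); // prints "undefined"
console.log(exists);         // prints false
```

So a silent result in the console doesn't by itself tell you that the property is really there.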
I should also warn you that what follows is not a complete solution, but I think it is some progress.
On the question of where to look for the Description metadata and so on, I wondered if there's a way to list the 'children' of "this" or "this.info", or, in perhaps more correct jargon, to find out their properties and methods.
It turns out that there's a pretty simple way: Object.getOwnPropertyNames(object), in which object could be "this" or "this.info", for instance. Sample outputs are given below.
Code:
Object.getOwnPropertyNames(this.info)
ContactEmail,Authors,Author,CreationDate,Creator,Keywords,ModDate,Producer,Subject,Title
So the above are the only things available under "this.info". Notice that, as foreshadowed, the things you marked as "FAILED" don't appear.
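To double-check that, here is a small sketch that compares the names from your script against the list the console returned. The two arrays are simply copied from this thread. The comparison ignores upper/lower case, which is my assumption (this.info.title worked for you even though the listed name is "Title").

```javascript
// Names returned by Object.getOwnPropertyNames(this.info), copied from the output above.
var available = "ContactEmail,Authors,Author,CreationDate,Creator,Keywords,ModDate,Producer,Subject,Title".split(",");

// Names tried in the original script.
var attempted = ["title", "author", "authortitle", "subject",
                 "description", "keywords", "copyright", "copyrightnotice"];

// Keep only the attempted names with no case-insensitive match in the available list.
var notFound = attempted.filter(function (name) {
  return !available.some(function (a) {
    return a.toLowerCase() === name.toLowerCase();
  });
});

console.log(notFound.join(",")); // prints "authortitle,description,copyright,copyrightnotice"
```

The four names left over are exactly the four marked "FAILED".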
Code:
Object.getOwnPropertyNames(this)
alternatePresentations,author,baseURL,bookmarkRoot,calculate,certified,closed,collection,creationDate,creator,dataObjects,delay,dirty,disclosed,docID,documentFileName,dynamicXFAForm,external,filesize,hidden,hostContainer,icons,info,innerAppWindowRect,innerDocWindowRect,isInCollection,isModal,isInProtectedView,keywords,layout,media,metadata,modDate,mouseX,mouseY,noautocomplete,nocache,numFields,numIcons,numPages,numTemplates,pane,path,outerAppWindowRect,outerDocWindowRect,pageNum,pageWindowRect,permStatusReady,producer,requiresFullSave,securityHandler,selectedAnnots,sounds,spellDictionaryOrder,spellLanguageOrder,subject,templates,title,URL,viewState,wireframe,xfa,XFAForeground,zoom,zoomType,history,addAnnot,addField,addIcon,addLink,addRecipientListCryptFilter,addRequirement,addScript,addThumbnails,addWatermarkFromFile,addWatermarkFromText,addWeblinks,applyRedactions,bringToFront,calculateNow,certifyInvisibleSign,closeDoc,createDataObject,createIcon,createTemplate,deletePages,deleteSound,embedDocAsDataObject,embedOutputIntent,encryptForRecipients,encryptUsingPolicy,exportAsFDF,exportAsFDFStr,exportAsText,exportAsXFDF,exportAsXFDFStr,exportDataObject,exportXFAData,extractPages,flattenPages,getAnnot,getAnnotRichMedia,getAnnot3D,getAnnots,getAnnotsRichMedia,getAnnots3D,getDataObject,getDataObjectContents,getField,getIcon,getLegalWarnings,getLinks,getNthFieldName,getNthTemplate,getOCGs,getOCGOrder,getPageBox,getPageLabel,getPageNthWord,getPageNthWordQuads,getPageNumWords,getPageRotation,getPageTransition,getPreflightAuditTrail,getPrintParams,getSound,getTemplate,getURL,gotoNamedDest,importAnFDF,importAnXFDF,importDataObject,importIcon,importSound,importTextData,importXFAData,insertPages,mailDoc,mailForm,movePage,newPage,openDataObject,preflight,print,removeDataObject,removeField,removeIcon,removeLinks,removePreflightAuditTrail,removeRequirement,removeScript,removeTemplate,removeThumbnails,removeWeblinks,replacePages,resetForm,saveAs,scroll,selectPageNthWord,setAction,setDataObjectContents,setOCGOrder,setPageAction,setPageBoxes,setPageLabels,setPageRotations,setPageTabOrder,setPageTransitions,spawnPageFromTemplate,submitForm,syncAnnotScan,timestampSign,validatePreflightAuditTrail
There are a huge number of properties directly under "this", as above, and each might well have numerous subproperties (as "this.info" did). I haven't gone through them all. In fact, I only tried one, but it turned out to be a promising lead.
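For anyone who does want to work through that long list, a rough sketch like the following would at least separate the data properties from the methods. The helper name describeObject and the stand-in object fakeDoc are my own inventions for illustration. I have only tried this on ordinary objects, not on the Editor's "this", and in the Editor's console the printing call would presumably be console.println rather than console.log.

```javascript
// Hypothetical helper: split an object's own property names into data properties and methods.
function describeObject(obj) {
  var result = { properties: [], methods: [] };
  Object.getOwnPropertyNames(obj).forEach(function (name) {
    var kind;
    try {
      kind = typeof obj[name] === "function" ? "methods" : "properties";
    } catch (e) {
      kind = "properties"; // reading the value failed; list the name anyway
    }
    result[kind].push(name);
  });
  return result;
}

// Stand-in object for illustration only; in the console one would pass "this" instead.
var fakeDoc = { title: "mytit", numPages: 1, info: {}, saveAs: function () {} };
var summary = describeObject(fakeDoc);

console.log(summary.properties.join(",")); // prints "title,numPages,info"
console.log(summary.methods.join(","));    // prints "saveAs"
```

The try/catch is there because I don't know whether every property of a real document object can be read without an error.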
I noticed "metadata" among the above properties, so I tried "this.metadata". Lo and behold, it lists pretty much everything.
Code:
this.metadata
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP Core 5.5.0">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about=""
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
xmlns:xmp="http://ns.adobe.com/xap/1.0/"
xmlns:pdf="http://ns.adobe.com/pdf/1.3/"
xmlns:xmpRights="http://ns.adobe.com/xap/1.0/rights/"
xmlns:photoshop="http://ns.adobe.com/photoshop/1.0/">
<dc:format>application/pdf</dc:format>
<dc:title>
<rdf:Alt>
<rdf:li xml:lang="x-default">mytit</rdf:li>
</rdf:Alt>
</dc:title>
<dc:creator>
<rdf:Seq>
<rdf:li>myau</rdf:li>
</rdf:Seq>
</dc:creator>
<dc:description>
<rdf:Alt>
<rdf:li xml:lang="x-default">mysubj</rdf:li>
</rdf:Alt>
</dc:description>
<dc:rights>
<rdf:Alt>
<rdf:li xml:lang="x-default">myCN</rdf:li>
</rdf:Alt>
</dc:rights>
<dc:subject>
<rdf:Bag>
<rdf:li>myKW</rdf:li>
</rdf:Bag>
</dc:subject>
<xmpMM:DocumentID>uuid:21b02a54-525c-4144-b792-54fdf74dc9a9</xmpMM:DocumentID>
<xmpMM:InstanceID>uuid:bc081d5c-2ba9-4a60-9623-014943d0e692</xmpMM:InstanceID>
<xmp:ModifyDate>2021-06-18T17:13:38+10:00</xmp:ModifyDate>
<xmp:CreateDate>2021-06-18T17:13:38+10:00</xmp:CreateDate>
<xmp:CreatorTool>PDF-XChange Editor 9.0.351</xmp:CreatorTool>
<pdf:Producer>PDF-XChange Core API SDK (9.0.351)</pdf:Producer>
<pdf:Keywords>myKey</pdf:Keywords>
<xmpRights:Marked>True</xmpRights:Marked>
<xmpRights:WebStatement>www.myurl.com</xmpRights:WebStatement>
<photoshop:AuthorsPosition>myAT</photoshop:AuthorsPosition>
<photoshop:CaptionWriter>myDW</photoshop:CaptionWriter>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>
(Note: this was a 'dummy' file with no content that I manually added various metadata for; I deleted some blank lines in the above excerpt.)
The things to note in relation to copyright are
- <dc:rights><rdf:Alt><rdf:li xml:lang="x-default">myCN</rdf:li></rdf:Alt></dc:rights>
- <xmpRights:Marked>True</xmpRights:Marked>
- <xmpRights:WebStatement>www.myurl.com</xmpRights:WebStatement>
and for description (confusingly and inconsistently labelled as "Subject" and "Description" in different parts of the Document Properties dialogue box in PDF-XChange Editor v9) see
- <dc:description><rdf:Alt><rdf:li xml:lang="x-default">mysubj</rdf:li></rdf:Alt></dc:description>
Cursory testing on a couple of my other documents shows that "this.metadata" seems to grab everything that you could want, including pdfx and xmp entries.
Examples:
- <xmp:CreateDate>2009-05-05T14:11:41+10:00</xmp:CreateDate>
- <pdfx:Deweyↂ0020Number>628.162</pdfx:Deweyↂ0020Number>
Now, for an experienced programmer the above is probably all they need. For anyone else it'd be nice to have a simple route to access the specific information required, without needing to parse a 'monolithic' block of metadata. Those simple routes may indeed exist, and I would suggest that if you can understand the above discussion, then it might (I hope) provide some hints towards finding those simple routes. And please do report back here if you make progress!!
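As a starting point, here is a rough sketch of one such route: a small text-matching helper that pulls a single value out of the metadata block. The name xmpValue is my own, and the sample string is just the copyright lines from my dummy file above. It assumes this.metadata gives back plain text like the listing above, it only returns the first rdf:li entry when there are several, and it is pattern matching rather than proper XML parsing, so treat it as a hint and not a finished solution.

```javascript
// Hypothetical helper: return the text of the first <tag>...</tag> element in an XMP string.
function xmpValue(xmp, tag) {
  // Match <tag> or <tag attr="...">, then capture everything up to the closing tag.
  var block = new RegExp("<" + tag + "(?:\\s[^>]*)?>([\\s\\S]*?)</" + tag + ">").exec(xmp);
  if (!block) {
    return null; // element not present
  }
  // Values wrapped in rdf:Alt / rdf:Seq / rdf:Bag sit inside rdf:li; take the first one.
  var li = /<rdf:li(?:\s[^>]*)?>([\s\S]*?)<\/rdf:li>/.exec(block[1]);
  return (li ? li[1] : block[1]).trim();
}

// Sample text copied from the dummy file's metadata; in the console one would pass this.metadata.
var sample =
  '<dc:rights><rdf:Alt><rdf:li xml:lang="x-default">myCN</rdf:li></rdf:Alt></dc:rights>' +
  '<xmpRights:Marked>True</xmpRights:Marked>' +
  '<xmpRights:WebStatement>www.myurl.com</xmpRights:WebStatement>';

console.log(xmpValue(sample, "dc:rights"));              // prints "myCN"
console.log(xmpValue(sample, "xmpRights:WebStatement")); // prints "www.myurl.com"
console.log(xmpValue(sample, "dc:description"));         // prints null
```

In the console the equivalent call would presumably be something like xmpValue(this.metadata, "dc:rights"), but I haven't confirmed how the Editor's JS engine handles every part of this.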
HTH,
DIV
P.S. I have previously noted some potentially confusing/inconsistent links between metadata property names in the GUI and behind the scenes. See
https://forum.pdf-xchange.com/viewtopic.php?f=62&t=32237&p=131691&hilit=keywords#p131691