Adding new standard or custom metadata fields to PDF files

Forum for the PDF-XChange Editor - Free and Licensed Versions

DIV
User
Posts: 252
Joined: Fri Jun 23, 2017 1:47 am

Adding new standard or custom metadata fields to PDF files

Post by DIV »

Hello.

There are numerous details that users might want to add as metadata into a PDF, besides the basic items usually allowed (such as Title, Author/Creator, Subject, Keywords, Description).
Examples might be:
  • Volume
  • Number of pages
  • Number of words
  • Approver
  • Project manager
  • Version number
And so on.

There are several internationally respected standards for metadata, such as Dublin Core (in both the dc and dcterms namespaces) and PRISM (in several namespaces, such as prism, prism-ad, and pur).

There would also be innumerable in-house systems.

Currently it seems to be either very unwieldy or outright impossible to add metadata beyond the most basic fields (such as Title, Author/Creator, ...) into a PDF file, based on my experience with Adobe Acrobat, PDF-XChange Editor, and one other application.

I don't really understand why it should be so difficult.
The main issue I can think of is whether an appropriate schema is available. That could add an extra level of bother for in-house conventions, or for little-known international 'standards'.
However, for the most popular standards, such as DC and PRISM, I really think by now the metadata supported by those standards should be able to be entered into a PDF without much hassle. Possibly involving a couple of drop-down menus:

Button 1: Add a new standard metadata field to this PDF.
Drop-down/Choice 1: Which schema/namespace to use? (Schema is version-specific. It might be easier to let the user choose the namespace, and then the software can default to the most recent (known) schema for that namespace.)
Drop-down/Choice 2: Which metadata name to include from an enumerated list.

Maybe custom metadata can be added too:
Button 2: Add a new user-defined/custom metadata field to this PDF.
*Display warning like: "May not be supported by all software." if appropriate.

Personally I also think it should be possible to edit the metadata in the XMP structure.
Of course, if it is free-text entry, the software would have to validate the user's changes before accepting them. This may be why Adobe Acrobat (at version 7) only allows deleting entries from here, but not adding. Yet PDF-XChange Editor version 7 doesn't directly allow even deletions from the XMP structure listing.
Alternatively a structured dialogue could be created to ensure well-formed entries in the XMP structure. Indeed, that would partly be satisfied by the above buttons (but only for the case of adding new metadata fields, not editing or deleting existing fields).
(It is supposed that XMP files should be able to be imported to update the metadata, but that is (personally) my least preferred option, because it relies on some external method of creating an XMP file, which seems to be an arcane art. Maybe beneficial for some corporate/industry users.)

Yours sincerely,
DIV
Ivan - Tracker Software
Site Admin
Posts: 3549
Joined: Thu Jul 08, 2004 10:36 pm
Location: Vancouver Island - Canada

Re: Adding new standard or custom metadata fields to PDF files

Post by Ivan - Tracker Software »

Hi,

While it is technically easy to implement what you propose, it will not solve the issue, first of all because it will rely on the application's built-in namespaces/schemas.

There is another way. It is more complicated but more reliable.
The Editor supports custom XMP panels which can be created following the spec described here: http://metadatadeluxe.pbworks.com/f/XMP ... Panels.pdf

The only difference is that the Editor needs a proper XML file, so the content of the panel tag should be enclosed in <![CDATA[ ... ]]>.
Custom panels for the Editor should be located in the "%ProgramFiles%\Tracker Software\PDF Editor\XMP\Custom File Info Panels" or "%AppData%\Tracker Software\PDFXEditor\3.0\XMP\Custom File Info Panels" folder.


As an example, see the attached source of the Editor 'standard' Description panel.
description.zip
Tracker Software (Project Director)

When attaching files to any message - please ensure they are archived and posted as a .ZIP, .RAR or .7z format - or they will not be posted - thanks.

Re: Adding new standard or custom metadata fields to PDF files

Post by DIV »

Hi, Ivan.

Thank you for your advice on an alternative to my proposal.

One thing was still unclear to me. You mentioned, "[...] it will rely on the application's built-in namespaces/schemas." I don't understand why the application can't access whatever the 'official' schema is for a few common 'standards', such as DC and PRISM. Or did you mean that users might feel frustrated about not being able to specify something obscure like AcmeSchema?

—DIV
Xaris Eireinei
User
Posts: 7
Joined: Sun Jan 24, 2021 6:33 pm

Re: Adding new standard or custom metadata fields to PDF files

Post by Xaris Eireinei »

I'm bumping into this same issue. I need to apply the same document properties over and over. Unfortunately, these are one-offs (can't do it in batches). Is there a way to create some sort of predesigned XML template to apply the following in PDF-XChange Editor Plus?

File > Document Properties > Description > Additional Metadata (Document Title, Author, Author Title, Description, Copyright Status, Copyright Notice)

File > Document Properties > Security > Document Security > (Document Passwords, Permissions)

Thanks for any ideas. I tried doing a java script for the Additional Metadata (https://forum.pdf-xchange.com/viewtopic.php?f=62&t=36762) but only the following carry over:

this.info.title = "document title"; // WORKED
this.info.author = "author"; // WORKED
this.info.authortitle = "authortitle"; // FAILED
this.info.subject = "subject"; // WORKED but not after this
this.info.description = "description"; // FAILED
this.info.keywords = "keywords"; // WORKED but not after this
this.info.copyright = "copyright"; // FAILED
this.info.copyrightnotice = "copyrightnotice"; // FAILED

In essence, only the common fields that already exist in MS Word seem to be applied.
Willy Van Nuffel
User
Posts: 2347
Joined: Wed Jan 18, 2006 12:10 pm

Re: Adding new standard or custom metadata fields to PDF files

Post by Willy Van Nuffel »

This is how it goes in Ad*be applications:
https://wiki.creativecommons.org/wiki/XMP_help_for_Adobe_applications

"Custom File Info Panels" seem to work in PDF-XChange Editor, but what about XMP-"Metadata Templates" ?

@Tracker Software Support:
Thanks in advance for more info.

Re: Adding new standard or custom metadata fields to PDF files

Post by DIV »

Xaris Eireinei wrote: Thu Jun 17, 2021 4:49 pm I'm bumping into this same issue.
[...]

I tried doing a java script for the Additional Metadata (https://forum.pdf-xchange.com/viewtopic.php?f=62&t=36762) but only the following carry over:
this.info.title = "document title"; // WORKED
this.info.author = "author"; // WORKED
this.info.authortitle = "authortitle"; // FAILED
this.info.subject = "subject"; // WORKED but not after this
this.info.description = "description"; // FAILED
this.info.keywords = "keywords"; // WORKED but not after this
this.info.copyright = "copyright"; // FAILED
this.info.copyrightnotice = "copyrightnotice"; // FAILED
Hi, Xaris.
I had a hunch that perhaps the fields you were looking for were not stored in the place where you were looking.
In other words, just because document title can be found under this.info.title, it doesn't guarantee that the description must be available at this.info.description — it might rather be found at (say) this.info.docDesc or at this.detailedInfo.description or at this.info.details.description.

Before I go on, I want to make a few quick points. First, I am not an expert in JavaScript ("JS"). Second, despite the similar names, JavaScript is not the same as "Java", and it's not even similar. Third, in the PDF-XChange Editor's JS Console, if I enter this.info.qwertyuiop (which does not exist), it surprisingly doesn't complain: I presume this was what you found when you said "FAILED". Lastly, I didn't initially understand what you meant by "but not after this"; I now gather that you meant "but I do not need this".

So I next warn you that I don't provide a complete solution below, but I think it is some progress.

On the question of where to look for the Description metadata and so on, I wondered if there's a way to list the 'children' of "this" or "this.info". Or in perhaps more correct jargon, to find out their properties and methods.
It turns out that there's a pretty simple way: Object.getOwnPropertyNames(object), in which object could be "this" or "this.info", for instance. Sample outputs are given below.

Object.getOwnPropertyNames(this.info)
ContactEmail,Authors,Author,CreationDate,Creator,Keywords,ModDate,Producer,Subject,Title
So the above are the only things available under "this.info". Notice that, as foreshadowed, the things you marked as "FAILED" don't appear.
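The same check can be done programmatically rather than by eye. The following is a minimal sketch, not a tested PDF-XChange Editor script: mockInfo is a hypothetical stand-in for this.info built from the names listed above, and checkInfoNames is an illustrative helper name. Names are compared case-insensitively on the assumption (suggested by this.info.title working while the listed name is "Title") that the standard entries ignore case.

```javascript
// Hypothetical stand-in for this.info, holding the property names listed above.
var mockInfo = {
  ContactEmail: "", Authors: "", Author: "", CreationDate: "", Creator: "",
  Keywords: "", ModDate: "", Producer: "", Subject: "", Title: ""
};

// Split the wanted names into those the object exposes and those it does not.
// The comparison ignores case (an assumption based on the results in the thread).
function checkInfoNames(info, wanted) {
  var known = Object.getOwnPropertyNames(info).map(function (name) {
    return name.toLowerCase();
  });
  var result = { present: [], missing: [] };
  wanted.forEach(function (name) {
    var bucket = known.indexOf(name.toLowerCase()) >= 0 ? "present" : "missing";
    result[bucket].push(name);
  });
  return result;
}

var report = checkInfoNames(mockInfo, [
  "title", "author", "authortitle", "subject",
  "description", "keywords", "copyright", "copyrightnotice"
]);
console.log("present: " + report.present.join(", "));
console.log("missing: " + report.missing.join(", "));
```

In the Editor's JS Console one would presumably pass this.info instead of mockInfo and print with console.println; that substitution is untested here.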

Object.getOwnPropertyNames(this)
alternatePresentations,author,baseURL,bookmarkRoot,calculate,certified,closed,collection,creationDate,creator,dataObjects,delay,dirty,disclosed,docID,documentFileName,dynamicXFAForm,external,filesize,hidden,hostContainer,icons,info,innerAppWindowRect,innerDocWindowRect,isInCollection,isModal,isInProtectedView,keywords,layout,media,metadata,modDate,mouseX,mouseY,noautocomplete,nocache,numFields,numIcons,numPages,numTemplates,pane,path,outerAppWindowRect,outerDocWindowRect,pageNum,pageWindowRect,permStatusReady,producer,requiresFullSave,securityHandler,selectedAnnots,sounds,spellDictionaryOrder,spellLanguageOrder,subject,templates,title,URL,viewState,wireframe,xfa,XFAForeground,zoom,zoomType,history,addAnnot,addField,addIcon,addLink,addRecipientListCryptFilter,addRequirement,addScript,addThumbnails,addWatermarkFromFile,addWatermarkFromText,addWeblinks,applyRedactions,bringToFront,calculateNow,certifyInvisibleSign,closeDoc,createDataObject,createIcon,createTemplate,deletePages,deleteSound,embedDocAsDataObject,embedOutputIntent,encryptForRecipients,encryptUsingPolicy,exportAsFDF,exportAsFDFStr,exportAsText,exportAsXFDF,exportAsXFDFStr,exportDataObject,exportXFAData,extractPages,flattenPages,getAnnot,getAnnotRichMedia,getAnnot3D,getAnnots,getAnnotsRichMedia,getAnnots3D,getDataObject,getDataObjectContents,getField,getIcon,getLegalWarnings,getLinks,getNthFieldName,getNthTemplate,getOCGs,getOCGOrder,getPageBox,getPageLabel,getPageNthWord,getPageNthWordQuads,getPageNumWords,getPageRotation,getPageTransition,getPreflightAuditTrail,getPrintParams,getSound,getTemplate,getURL,gotoNamedDest,importAnFDF,importAnXFDF,importDataObject,importIcon,importSound,importTextData,importXFAData,insertPages,mailDoc,mailForm,movePage,newPage,openDataObject,preflight,print,removeDataObject,removeField,removeIcon,removeLinks,removePreflightAuditTrail,removeRequirement,removeScript,removeTemplate,removeThumbnails,removeWeblinks,replacePages,resetForm,saveAs,scroll,selectPageNthWord,setAction,setDataObjectContents,setOCGOrder,setPageAction,setPageBoxes,setPageLabels,setPageRotations,setPageTabOrder,setPageTransitions,spawnPageFromTemplate,submitForm,syncAnnotScan,timestampSign,validatePreflightAuditTrail
Now there are a huge number of properties found directly under "this", as above, and each might well have numerous subproperties (like "this.info" did). I haven't gone through them all. In fact, I only tried one, but it turned out to be a promising lead.

I noticed "metadata" among the above properties, so I tried "this.metadata". Lo and behold it lists pretty much everything.

this.metadata
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP Core 5.5.0">
	<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
		<rdf:Description rdf:about=""
				xmlns:dc="http://purl.org/dc/elements/1.1/"
				xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
				xmlns:xmp="http://ns.adobe.com/xap/1.0/"
				xmlns:pdf="http://ns.adobe.com/pdf/1.3/"
				xmlns:xmpRights="http://ns.adobe.com/xap/1.0/rights/"
				xmlns:photoshop="http://ns.adobe.com/photoshop/1.0/">
			<dc:format>application/pdf</dc:format>
			<dc:title>
				<rdf:Alt>
					<rdf:li xml:lang="x-default">mytit</rdf:li>
				</rdf:Alt>
			</dc:title>
			<dc:creator>
				<rdf:Seq>
					<rdf:li>myau</rdf:li>
				</rdf:Seq>
			</dc:creator>
			<dc:description>
				<rdf:Alt>
					<rdf:li xml:lang="x-default">mysubj</rdf:li>
				</rdf:Alt>
			</dc:description>
			<dc:rights>
				<rdf:Alt>
					<rdf:li xml:lang="x-default">myCN</rdf:li>
				</rdf:Alt>
			</dc:rights>
			<dc:subject>
				<rdf:Bag>
					<rdf:li>myKW</rdf:li>
				</rdf:Bag>
			</dc:subject>
			<xmpMM:DocumentID>uuid:21b02a54-525c-4144-b792-54fdf74dc9a9</xmpMM:DocumentID>
			<xmpMM:InstanceID>uuid:bc081d5c-2ba9-4a60-9623-014943d0e692</xmpMM:InstanceID>
			<xmp:ModifyDate>2021-06-18T17:13:38+10:00</xmp:ModifyDate>
			<xmp:CreateDate>2021-06-18T17:13:38+10:00</xmp:CreateDate>
			<xmp:CreatorTool>PDF-XChange Editor 9.0.351</xmp:CreatorTool>
			<pdf:Producer>PDF-XChange Core API SDK (9.0.351)</pdf:Producer>
			<pdf:Keywords>myKey</pdf:Keywords>
			<xmpRights:Marked>True</xmpRights:Marked>
			<xmpRights:WebStatement>www.myurl.com</xmpRights:WebStatement>
			<photoshop:AuthorsPosition>myAT</photoshop:AuthorsPosition>
			<photoshop:CaptionWriter>myDW</photoshop:CaptionWriter>
		</rdf:Description>
	</rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>
(Note: this was a 'dummy' file with no content that I manually added various metadata for; I deleted some blank lines in the above excerpt.)

The things to note in relation to copyright are
  • <dc:rights><rdf:Alt><rdf:li xml:lang="x-default">myCN</rdf:li></rdf:Alt></dc:rights>
  • <xmpRights:Marked>True</xmpRights:Marked>
  • <xmpRights:WebStatement>www.myurl.com</xmpRights:WebStatement>
and for description (confusingly and inconsistently labelled as "Subject" and "Description" in different parts of the Document Properties dialogue box in PDF-XChange Editor v9) see
  • <dc:description><rdf:Alt><rdf:li xml:lang="x-default">mysubj</rdf:li></rdf:Alt></dc:description>
Cursory testing on a couple of my other documents shows that "this.metadata" seems to grab everything that you could want, including pdfx and xmp entries.
Examples:
  • <xmp:CreateDate>2009-05-05T14:11:41+10:00</xmp:CreateDate>
  • <pdfx:Deweyↂ0020Number>628.162</pdfx:Deweyↂ0020Number>
Now, for an experienced programmer the above is probably all they need. For anyone else it'd be nice to have a simple route to access the specific information required, without needing to parse a 'monolithic' block of metadata. Those simple routes may indeed exist, and I would suggest that if you can understand the above discussion, then it might (I hope) provide some hints towards finding those simple routes. And please do report back here if you make progress!!
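One such simple route can be sketched with plain string matching. The helper below, getXmpValues, is a hypothetical name and not part of any Editor or Acrobat API; it assumes the XMP text has the shape shown above, that each element occurs at most once, and that a regular expression is good enough for a quick lookup (it is not a real XML parser and does not decode entities such as &amp;).

```javascript
// Hypothetical helper: return the values stored under one XMP element.
// For rdf:Alt / rdf:Seq / rdf:Bag containers the rdf:li texts are returned;
// for a simple element the text content is returned as a one-item array.
function getXmpValues(xmp, qname) {
  // Escape regex metacharacters in the qualified name, e.g. "dc:rights".
  var name = qname.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  var element = new RegExp("<" + name + "(?:\\s[^>]*)?>([\\s\\S]*?)</" + name + ">");
  var match = element.exec(xmp);
  if (!match) {
    return [];
  }
  var inner = match[1];
  var items = [];
  var li = /<rdf:li(?:\s[^>]*)?>([\s\S]*?)<\/rdf:li>/g;
  var m;
  while ((m = li.exec(inner)) !== null) {
    items.push(m[1].trim());
  }
  return items.length > 0 ? items : [inner.trim()];
}

// Trimmed-down sample based on the listing above.
var sampleXmp = [
  '<rdf:Description rdf:about="">',
  '  <dc:rights><rdf:Alt><rdf:li xml:lang="x-default">myCN</rdf:li></rdf:Alt></dc:rights>',
  '  <dc:subject><rdf:Bag><rdf:li>myKW</rdf:li></rdf:Bag></dc:subject>',
  '  <xmpRights:WebStatement>www.myurl.com</xmpRights:WebStatement>',
  '  <photoshop:AuthorsPosition>myAT</photoshop:AuthorsPosition>',
  '</rdf:Description>'
].join("\n");

console.log(getXmpValues(sampleXmp, "dc:rights"));
console.log(getXmpValues(sampleXmp, "photoshop:AuthorsPosition"));
```

With a real document, the first argument would presumably be this.metadata, so the copyright notice would be looked up as getXmpValues(this.metadata, "dc:rights").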

HTH,
DIV


P.S. I have previously noted some potentially confusing/inconsistent links between metadata property names in the GUI and behind the scenes. See
https://forum.pdf-xchange.com/viewtopic.php?f=62&t=32237&p=131691&hilit=keywords#p131691
Joxon
User
Posts: 47
Joined: Sat Sep 12, 2015 4:54 am

Re: Adding new standard or custom metadata fields to PDF files

Post by Joxon »

I always use BeCyPDFMetaEdit. The official homepage is dead and the program is rather dated, but it still works perfectly and it is freeware.

You have to open a pdf in ‘Complete Rewrite’ mode, make your adjustments and save the template as an ini file (under Extras).

Info and download:

www.heise.de/download/product/becypdfmetaedit-36720

Re: Adding new standard or custom metadata fields to PDF files

Post by DIV »

Adobe's "JavaScript API" (provided as part of their Acrobat DC SDK Documentation) has some further information on the "info" and "metadata" properties, as below.
  • "info" — https://opensource.adobe.com/dc-acrobat-sdk-docs/acrobatsdk/html2015/index.html#t=Acro12_MasterBook%2FJS_API_AcroJS%2FDoc_properties.htm%23TOC_info1bc-21&rhtocid=_6_1_8_23_0_20
  • "metadata" — https://opensource.adobe.com/dc-acrobat-sdk-docs/acrobatsdk/html2015/index.html#t=Acro12_MasterBook%2FJS_API_AcroJS%2FDoc_properties.htm%23TOC_metadata1bc-28&rhtocid=_6_1_8_23_0_27
Looking at the latter, I am no longer so sure that simple methods for changing arbitrary individual elements of the metadata are necessarily built in to the API. But such user-friendly methods are still something that an experienced programmer could develop. Or (for a small number of files) it could be done manually, I suppose, following "Example 3" on use of E4X to change the metadata of a document, per the second URL above.

Note that although "this" can be a convenient way to access the above methods for the 'current' PDF file, fundamentally both "info" and "metadata" are properties of a "Doc" (as in 'document') object.
https://opensource.adobe.com/dc-acrobat-sdk-docs/acrobatsdk/html2015/index.html#t=Acro12_MasterBook%2FJS_API_AcroJS%2FDoc1.htm
Therefore, if you have created a variable called "myDoc" (say) that corresponds to a specific PDF file, then you could call e.g. "myDoc.info.Title".
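For the manual route, a change to one simple element can also be sketched as a string operation on the metadata text, without E4X. This is only an illustration under stated assumptions: setSimpleXmpValue and escapeXml are hypothetical helper names, the approach is only suitable for elements that hold plain text (not rdf:Alt, rdf:Seq or rdf:Bag containers, whose structure it would overwrite), and writing the result back to a document was not tested in PDF-XChange Editor.

```javascript
// Escape the characters that must not appear verbatim in XML text.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: replace the text of one simple element in an XMP string.
// Returns the new string, or null if the element is not present.
function setSimpleXmpValue(xmp, qname, value) {
  var name = qname.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  var element = new RegExp("(<" + name + "(?:\\s[^>]*)?>)[\\s\\S]*?(</" + name + ">)");
  if (!element.test(xmp)) {
    return null;
  }
  // A replacer function is used so that "$" in the value is taken literally.
  return xmp.replace(element, function (whole, open, close) {
    return open + escapeXml(value) + close;
  });
}

var before = "<rdf:Description>" +
  "<xmpRights:Marked>True</xmpRights:Marked>" +
  "<xmpRights:WebStatement>www.myurl.com</xmpRights:WebStatement>" +
  "</rdf:Description>";
var after = setSimpleXmpValue(before, "xmpRights:WebStatement", "https://example.org/?a=1&b=2");
console.log(after);
```

Adobe's reference lists the metadata property as read/write, so with a Doc object such as myDoc the result could in principle be assigned back (myDoc.metadata = after, when after is not null); whether the Editor accepts that assignment is an open question here.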

—DIV

Re: Adding new standard or custom metadata fields to PDF files

Post by Willy Van Nuffel »

Dear people at Tracker Software Support,

Can you please ask your development team for some more information about the possibility of reading and writing XMP metadata in PDF files via dialog boxes and/or via JavaScript, in PDF-XChange Editor and possibly also in PDF-Tools (for batch handling)?

At this moment, not all standard fields (like Copyright Status, Copyright Notice, Copyright info URL, ...) are available or correctly mapped via JavaScript.

If the staff agrees, would you please be so kind as to make an official request/ticket for better/easier management of XMP metadata (incl. import/export of metadata) via the classic dialog box and also via JavaScript?

Many thanks in advance.


Willy.
TrackerSupp-Daniel
Site Admin
Posts: 8436
Joined: Wed Jan 03, 2018 6:52 pm

Re: Adding new standard or custom metadata fields to PDF files

Post by TrackerSupp-Daniel »

Hi, Willy Van Nuffel et al.

Though I cannot make any promises for such an implementation, I have created a formal feature request ticket on this matter for you all:
RT#5636: Simplify XMP-metadata management

Hopefully we can see something of this ilk in the future, but for the time being, the methods that Ivan mentioned above will be the most effective for your needs.

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

+++++++++++++++++++++++++++++++++++
Our Web site domain and email address has changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com

Re: Adding new standard or custom metadata fields to PDF files

Post by Xaris Eireinei »

Thanks for putting in that request for us!

Adding new standard or custom metadata fields to PDF files

Post by TrackerSupp-Daniel »

:)

Re: Adding new standard or custom metadata fields to PDF files

Post by Ivan - Tracker Software »

Willy Van Nuffel wrote: Fri Jun 18, 2021 6:59 am This is how it goes in Ad*be applications:
https://wiki.creativecommons.org/wiki/X ... plications

"Custom File Info Panels" seem to work in PDF-XChange Editor, but what about XMP-"Metadata Templates" ?

@Tracker Software Support:
Thanks in advance for more info.
Metadata templates support will be included in build 365.