I am reporting what may be a bug: certain metadata fields are reproducibly stripped from PDF files when they are saved in a PDF/A format (specifically PDF/A-1a, PDF/A-2a or PDF/A-3a).
I annotated a scanned PDF with metadata in PDF-XChange Editor (originally in version 6.0, but here in version 7.0).
The metadata was entered via the "Additional Metadata" button in the Document Properties dialogue box, under the "Description" category, as follows:
- Document Title — maps to dc:title; retained in PDF/A
- Author — maps to dc:creator; retained in PDF/A
- Author Title — maps to xmp:AuthorsPosition; stripped from PDF/A
- Description — maps to dc:description; retained in PDF/A
- Description Writer — maps to xmp:CaptionWriter; stripped from PDF/A
- Keywords — maps to dc:subject(!!); stripped from PDF/A
- Copyright Status — maps to xmpRights:Marked; retained in PDF/A
- Copyright Notice — maps to dc:rights; retained in PDF/A
- Copyright Info URL — maps to xmpRights:WebStatement; retained in PDF/A
The items above that do not map to the dc namespace map to the xmp or xmpRights namespaces.
Note further that the sixth item above ('Keywords', mapped to dc:subject) does not correspond to 'Keywords' in the "Document Info" display in the main Document Properties dialogue box, so there is an inconsistency there. The latter instead maps to pdf:Keywords, which is retained in PDF/A.
dc:format is (rightly) set to "application/pdf", and cannot be amended — and, of course, is retained in PDF/A.
Also visible in the "Document Info" display in the main Document Properties dialogue box (for PDF files in general) are:
- PDF Producer — maps to pdf:Producer; retained in PDF/A
- Application — maps to xmp:CreatorTool; retained in PDF/A
- PDF Version — doesn't map to any stored metadata in the XMP structure
- Created — maps to xmp:CreateDate; retained in PDF/A
- Modified — maps to xmp:ModifyDate; retained in PDF/A
- Page Count — doesn't map to any stored metadata in the XMP structure
- Page Size — doesn't map to any stored metadata in the XMP structure
- PDF-XChange — no data; not sure where this maps to, if anywhere
When exporting a file as PDF/A, some of this metadata is stripped out. I have used PDF-XChange Editor in both version 6.0 and version 7.0 (build 328.1), with apparently the same behaviour. Detailed testing was with version 7.0.
For Conformance (under Options) I have variously chosen PDF/A-1a, PDF/A-2a and PDF/A-3a, with seemingly no difference.
As indicated above, three of the metadata fields are stripped out:
- Author Title / xmp:AuthorsPosition
- Description Writer / xmp:CaptionWriter
- Keywords / dc:subject
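For anyone wanting to reproduce the comparison outside the Editor, here is a minimal Python sketch that lists the XMP property names in two files and reports those missing from the second. It is only an illustration, not how PDF-XChange Editor works internally: the function names `xmp_properties` and `stripped_properties` are my own, and it assumes the XMP packet is stored uncompressed and that only the first packet in the file matters.

```python
import re
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Prefixes for the namespaces mentioned in this report; any other
# namespace is shown by its full URI.
PREFIXES = {
    "http://purl.org/dc/elements/1.1/": "dc",
    "http://ns.adobe.com/xap/1.0/": "xmp",
    "http://ns.adobe.com/xap/1.0/rights/": "xmpRights",
    "http://ns.adobe.com/pdf/1.3/": "pdf",
}

# Capture everything between the xpacket header and trailer.
XPACKET = re.compile(rb"<\?xpacket begin=.*?\?>(.*?)<\?xpacket end=", re.DOTALL)


def _short(qname):
    """Turn '{uri}local' into 'prefix:local' where the prefix is known."""
    if not qname.startswith("{"):
        return qname
    uri, local = qname[1:].split("}", 1)
    return f"{PREFIXES.get(uri, uri)}:{local}"


def xmp_properties(pdf_bytes):
    """Return the top-level XMP property names in the first uncompressed packet."""
    match = XPACKET.search(pdf_bytes)
    if match is None:
        return set()
    root = ET.fromstring(match.group(1).strip())
    names = set()
    for rdf in root.iter(f"{{{RDF_NS}}}RDF"):
        for desc in rdf.findall(f"{{{RDF_NS}}}Description"):
            # Properties may be written as attributes or as child elements.
            for attr in desc.attrib:
                if not attr.startswith(f"{{{RDF_NS}}}"):
                    names.add(_short(attr))
            for child in desc:
                names.add(_short(child.tag))
    return names


def stripped_properties(original_bytes, pdfa_bytes):
    """Return the sorted property names present in the original but not in the PDF/A."""
    return sorted(xmp_properties(original_bytes) - xmp_properties(pdfa_bytes))
```

To use it, read both files in binary mode and pass their contents to `stripped_properties`, for example the original PDF first and the exported PDF/A second. An empty list would mean no top-level XMP properties were lost.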
It is possible that this stripping is intentional. However, given that the other fields are retained, I am more inclined to believe that it is a bug.
Indeed, from a quick reading of the Technical Note linked to above, it seems that one may include any metadata one likes in a PDF file, provided it is correctly structured and follows a suitable schema.
Yours sincerely,
DIV