Optimising PDF files, manual font removal

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Optimising PDF files, manual font removal

Post by Christian Flatscher » Thu Jan 24, 2019 4:33 pm

Hi,

Last Saturday I sent an e-mail to support - support@tracker-software.com - but unfortunately I have not received a reply or acknowledgement to date. :(

Here again my problem report:


I am creating PDF files with LibreOffice writer. Unfortunately there is a bug in LibreOffice writer that leads to fonts being embedded that are actually not in the source file. See these bugs:

https://bugs.documentfoundation.org/sho ... ?id=122657
( Bug 122657 - Converting .odt file to PDF contains fonts that are not in the .odf file )

and

https://bugs.documentfoundation.org/sho ... ?id=118541
( Bug 118541 - Undesired font embedded in pdf form created with Libre Office Writer )

It seems that the LO people do not know how to remove these fonts from the ODT file. Details are in bug #122657.

I thought I could achieve this by using PDF-XChange Editor, but it appears that this is not possible when saving the file as optimised. See the attached video file and sample files.

Please let me know why this is the case and how I can remove unwanted fonts from a PDF file created by LibreOffice.

Thank you!

Some additional information:

unembed_fonts_problem.zip - contains two pdf files and a LibreOffice writer file
pdf_font_remove.rar.zsp.rar and pdf_font_remove.rar.1.rar - contain a demo .mp4 video of the issue; rebuilding it requires unsplit_script.zip
unsplit_script.zip - contains a script file to rebuild the original pdf_font_remove.rar which contains the .mp4 video file


BTW - I converted a MS PPTX file with PDF-Xchange and see the same as with a LO document.

Please advise / help on how I can unembed non-existent fonts from a PDF file.

Thank you.

Regards,

--Christian
Attachments
unsplit_script.zip
(313 Bytes) Downloaded 12 times
pdf_font_remove.rar.zsp.rar
(4.77 MiB) Downloaded 15 times
pdf_font_remove.rar.1.rar
(3.02 MiB) Downloaded 20 times
unembed_fonts_problem.zip
(1.52 MiB) Downloaded 15 times

TrackerSupp-Daniel
Site Admin
Posts: 2090
Joined: Wed Jan 03, 2018 6:52 pm

Re: Optimising PDF files, manual font removal

Post by TrackerSupp-Daniel » Thu Jan 24, 2019 10:40 pm

Hello Christian,

Terribly sorry for the missed communication, we have found your email and are looking into why it did not get a reply in the usual timeframe. We will also be taking further steps to avoid this happening again in the future.

Moving on to the issue at hand. As LibreOffice is causing issues by leaving non-existent fonts in the document, could I ask if you have tried using our Lite printer to handle the conversion?
It offers Font control options, and the ability to embed only used fonts, (or even only subsets of those) at your discretion. When printing from any application, simply choose the Lite printer and click printer properties, then navigate to the Fonts Category:
image.png
As for unembedding afterwards, it looks from my end like the PDF files only have the 16 used fonts embedded in them (one file does have 17 fonts). I am not sure why our optimization is unable to see these to offer the unembed function, but I do not see any that are not used.

Please let us know if using our lite printer offers a better solution.
Daniel McIntyre
Support Technician
Tracker Software Products (Canada) LTD

Sales: +1 (250) 324-1621
Fax: +1 (250) 324-1623

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Re: Optimising PDF files, manual font removal

Post by Christian Flatscher » Fri Jan 25, 2019 5:05 pm

Thank you for your prompt reply and e-mail, Daniel.

I was thinking my mail did not go through because of the size of the attachments and the fact that it contains an .mp4 file.


I am copying your last reply from your e-mail into this post for completeness sake:


"On to the topic at hand, thank you very much for links to the LibreOffice bug reports, they have helped us to sort through the issue a bit more. Upon testing in house, we do see that the conversion is handled incorrectly from their end, but I am afraid that there is nothing much we can do in that case.

"The cause from what we see, is how the fonts are being embedded by LibreOffice. Usually when a font is embedded, each character is given an identifier that tells the document which steps to use to find the visual of the character. In this case, LibreOffice has embedded the fonts as "built in" fonts, meaning that there is no identifier with steps, it simply contains the visual directly. This in turn means that we cannot unembed the font, as if we did there would be no reference for these characters and they would instantly become broken text. I am afraid that this means no matter how we process it, there will be some fonts that cannot be removed from this document.

"There is good news, however. As in my earlier forum post, it is possible to get much better results by using our Lite printer for the conversion, which gives you more control over the font handling, and takes the control away from LibreOffice. When printing to our printer, a file of similar size is created, however half of the embedded fonts are able to be unembedded with our optimize function. This in turn created a file that is less than half the size in my tests, with the default settings, and "unembed recommended fonts" selected."


Unfortunately I do not have the PDF-XChange Lite Printer installed but the PDF-XChange Standard Printer driver. See 01.png. Therefore my settings are slightly different from yours - see 02.png. I am not so keen to use this function because it appears that LibreOffice Writer pulls any random font into the .odt file.

What I find rather strange is that when I create a PDF file via this printer the resulting PDF file is around 53 MB in size. So not really an improvement. When I then open up that PDF file in the PDF-XChange Editor it takes quite a long time to load. I can then optimise the file. When clicking on "Fonts" it gathers the fonts embedded in the PDF - see 03.png. However, again the software does not recognise any font at all! See 04.png. This is rather strange. Why is the file size all of a sudden so big, even after I chose PDF v1.6? Why does the application not recognise any fonts and how can I resolve this?

My current installation of PDFXChange Pro was done via the integrated update mechanism - which appears to have caused some changes which are for me not very pleasant. The measurement is now in imperial ( pixel / inch ) rather than metric. See 05.png.

There is indeed a reason to why I do not use any PDF printer software - which also includes the PDFXChange printer:

When using such software it does not create any bookmarks in the resulting PDF file. As I write a lot of technical documentation for my company's customers I have no choice but to use the LibreOffice integrated PDF generator, as adding bookmarks manually is a very tedious task. I would be very happy if there was a standalone or LibreOffice plug-in version of PDF-XChange that would work in LibreOffice Writer and the other apps just like the MS Office integrated PDF-XChange. This would help me greatly.

As a test I have also activated the automatic bookmark generation but to no effect. See also 06.png.

Due to the file size limitations of 5 MB in this forum I will e-mail you the PDF file.

Thank you very much for your help.

Regards,

--Christian
Attachments
06.png
05.png
04.png
03.png
02.png
01.png

TrackerSupp-Daniel
Site Admin
Posts: 2090
Joined: Wed Jan 03, 2018 6:52 pm

Re: Optimising PDF files, manual font removal

Post by TrackerSupp-Daniel » Fri Jan 25, 2019 6:03 pm

Hello Christian,

The Standard printer and Lite printer handle the Font options in exactly the same way, so that should not be an issue. With only the "Embed all used fonts" and "embed a font subset only..." options selected, you should end up with a smaller file. I am unsure what you mean by:
Christian Flatscher wrote:
Fri Jan 25, 2019 5:05 pm
I am not so keen to use this function because it appears that LibreOffice Writer pulls any random font into the .odt file.
If Libre is adding random fonts to the files, would this not be more reason to avoid using it for conversion?
Christian Flatscher wrote:
Fri Jan 25, 2019 5:05 pm
What I find rather strange is that when I create a PDF file via this printer the resulting PDF file is around 53 MB in size. So not really an improvement. When I then open up that PDF file in the PDF-XChange Editor it takes quite a long time to load. I can then optimise the file. When clicking on "Fonts" it gathers the fonts embedded in the PDF - see 03.png. However, again the software does not recognise any font at all!
Is this (53 MB) in comparison to the 1 MB file that you sent us as a sample? If not, how large is the ODT file, and the PDF that LibreOffice creates from it? I only ask because if it is growing 53x in size, there is certainly something wrong, but if it is a comparison of 51 MB to 53 MB, it would be within acceptable limits.
As for how long it takes to load, does the load time improve after optimizing, and how long does the LibreOffice version take to load? Finally, how long does it take to load if you create the file with the Standard printer using the font settings I recommended above?
The fonts are likely not seen because as you selected earlier in the print options, they have been force embedded. This does not always happen, but it can on occasion be the case. Please try with the settings recommended above and let me know if these fonts are manageable by the optimize function.
Christian Flatscher wrote:
Fri Jan 25, 2019 5:05 pm
My current installation of PDFXChange Pro was done via the integrated update mechanism - which appears to have caused some changes which are for me not very pleasant. The measurement is now in imperial ( pixel / inch ) rather than metric.
Apologies for this. Sometimes during an update some of the settings can be reset; typically this only happens if you use a registry cleaner, or have updated between major versions (e.g. V6 to V7). You can fix this setting in the Editor from the preferences (Ctrl+K) under measurement:
image.png
Christian Flatscher wrote:
Fri Jan 25, 2019 5:05 pm
When using such software it does not create any bookmarks in the resulting PDF file. As I write a lot of technical documentation for my company's customers I have no choice but to use the LibreOffice integrated PDF generator, as adding bookmarks manually is a very tedious task. I would be very happy if there was a standalone or LibreOffice plug-in version of PDF-XChange that would work in LibreOffice Writer and the other apps just like the MS Office integrated PDF-XChange. This would help me greatly.
Currently we do not have plans to create an add-in for the LibreOffice suite, but if there is enough demand I am sure we would consider it.
Christian Flatscher wrote:
Fri Jan 25, 2019 5:05 pm
As a test I have also activated the automatic bookmark generation but to no effect.
When using automatic bookmark detection, some parameters must be set. After checking "Enable automatic Bookmark Detection" the rest of the options will light up, and you can see the window below has "add" and "remove" buttons. You will need to add criteria for these bookmarks. As an example, maybe all of your bookmark headings are 14pt Comic Sans bold, while the rest of the text is 12pt Times New Roman. In this case you would specify the font and font size for the bookmarks to be generated from. You can add multiple criteria to catch all the needed bookmarks.
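To make the idea of bookmark criteria a little more concrete, here is a rough Python sketch of matching text runs against font and size rules. It only models the concept described above and is not how PDF-XChange implements detection; the `TextRun` and `Criterion` classes and the `detect_bookmarks` function are hypothetical names made up for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextRun:
    """A piece of text with the formatting it was printed in (hypothetical model)."""
    text: str
    font: str
    size_pt: float
    bold: bool = False


@dataclass(frozen=True)
class Criterion:
    """One bookmark rule: text in this font, size and weight becomes a bookmark."""
    font: str
    size_pt: float
    bold: bool = False

    def matches(self, run: TextRun) -> bool:
        return (
            run.font.lower() == self.font.lower()
            and abs(run.size_pt - self.size_pt) < 0.01
            and run.bold == self.bold
        )


def detect_bookmarks(runs, criteria):
    """Return the text of every run that satisfies at least one criterion."""
    return [run.text for run in runs if any(c.matches(run) for c in criteria)]


runs = [
    TextRun("1 Introduction", "Comic Sans MS", 14, bold=True),
    TextRun("This manual describes ...", "Times New Roman", 12),
    TextRun("2 Installation", "Comic Sans MS", 14, bold=True),
]
criteria = [Criterion("Comic Sans MS", 14, bold=True)]

print(detect_bookmarks(runs, criteria))
```

With no criteria added, nothing matches, which would be consistent with the option appearing to have no effect.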

I am still unable to see your email in our inbox, so it seems likely that your earlier suspicion was correct and it bounced due to the file size. If after testing the above, you find that it is still generating unreasonably large files, you can upload a sample of these larger files to our useruploads server.
Daniel McIntyre

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Re: Optimising PDF files, manual font removal

Post by Christian Flatscher » Sun Jan 27, 2019 6:45 pm

Hi Daniel,

What I mean by this statement:

" I am not so keen to use this function because it appears that LibreOffice Writer pulls any random font into the .odt file."

is that LibreOffice itself seems to pull in some random font files before any PDF file is created - see my bug report at https://bugs.documentfoundation.org/sho ... ?id=122657. So a random font is added at some point to the source .odt file.

I did some thorough investigations and found that PDF-XChange PRO v7 - most likely all releases - is buggy. What I found is that when I convert an MS Office 2016 document - be it Word or PowerPoint - into PDF, I cannot remove any fonts from the resulting PDF when trying to optimise it.

I found that depending on the LibreOffice version (5 or 6) the resulting PDF file, when created with the PDF-XChange Standard Printer, can be anywhere between 50 and 60 MB. If I save the .odt file in LibreOffice as .docx, open it in Word and use the PDF-XChange Printer, the resulting PDF file is also around 50 MB. This initially led me to believe that there is an issue with LibreOffice. I continued my tests.

When using PDF24 from https://en.pdf24.org/ or Nitro PDF 12 Pro the resulting PDF files are no larger than 2 MB. Unfortunately I cannot remove any fonts from PDF files created by either Nitro or PDF24 when using the PDF-XChange Editor and trying to optimise the PDF file.

I also set up a Windows 10 Pro Virtual machine and then I tested PDF-XChange PRO v7 and v6. I found that the size of a PDF file with v6 is around 2 MB and I can remove fonts that I do not want to be in the PDF file.

Conclusion:

It is very likely that PDF-XChange PRO v7 is buggy.

I have attached these files:

0.zip
1.zip
2.zip
3.zip
4.zip
5.zip
6.zip
7.zip
8.zip

These are actually not zip files. Please unpack the file UNSPLIT.zip and run the batch file. It will unpack 3 .MP4 videos showing how I came to my test results.

Please discuss this issue with your developers and have the problem investigated and solved.

In the meantime I will continue with a few more tests and let you know my findings in due course.

Thank you.

--Christian

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Re: Optimising PDF files, manual font removal

Post by Christian Flatscher » Sun Jan 27, 2019 7:47 pm

Hi Daniel,

Just tested with the same Windows 10 VM and LibreOffice v5 and PDF-XChange PRO v6. The result is as expected and perfectly fine.

This proves to me beyond any doubt that the issue lies only with PDF-XChange PRO v7.

Please have this issue resolved.

I have attached these files:

00.zip
01.zip

These are not zip files. Unpack the file unsplit2.zip and run the batch file in order to unpack the .MP4 video that confirms that the issue does not exist with PDF-XChange PRO v6 and LibreOffice 5.

Thank you for resolving this issue.

--Christian

TrackerSupp-Daniel
Site Admin
Posts: 2090
Joined: Wed Jan 03, 2018 6:52 pm

Re: Optimising PDF files, manual font removal

Post by TrackerSupp-Daniel » Tue Jan 29, 2019 12:17 am

Hello Christian,

Thank you for uploading the files, however due to our security policies we first tested these in a secured environment and found that the first set of files created a file that bluescreened our test machine. I have thus removed all copies of these files from your posts as a precaution. We will not be attempting to investigate those files any further.

Please zip and upload the original MP4, and the larger (50mb) ODT/PDF files to our USERUPLOADS server as I had requested earlier. This file sharing will be able to handle the full sized files without issue and without splitting the files into fragmented pieces.

Once we have these files we will investigate the issue directly. While I cannot agree that this "proves beyond a doubt" our V7 is the cause of the issue, I do agree that it sounds more likely after those tests. I am curious to see what other variables we can fiddle with to make this better for you.
Thank you!
Daniel McIntyre

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Re: Optimising PDF files, manual font removal

Post by Christian Flatscher » Tue Jan 29, 2019 1:15 pm

Hi Daniel,

I have uploaded the file 2019-01-27.zip to the server as requested.

Thank you for your help.

Christian

TrackerSupp-Daniel
Site Admin
Posts: 2090
Joined: Wed Jan 03, 2018 6:52 pm

Re: Optimising PDF files, manual font removal

Post by TrackerSupp-Daniel » Tue Jan 29, 2019 6:34 pm

Hello Christian,

Thank you for the videos. I will still need you to send us the ODT file that generated the 52 MB PDF, and both versions of the output PDFs from v6 and v7. We will need these to compare, test and find the actual cause of the issue.

Kind regards,
Daniel McIntyre

TrackerSupp-Daniel
Site Admin
Posts: 2090
Joined: Wed Jan 03, 2018 6:52 pm

Re: Optimising PDF files, manual font removal

Post by TrackerSupp-Daniel » Tue Jan 29, 2019 8:13 pm

I have been testing with the original "safety" file you provided for us earlier, but in all my tests, ensuring that I have the same settings as in your videos, the output file has been 1mb in both V6 and V7, making no alterations to anything other than the settings you selected. LibreOffice is a fresh installation, and I've rolled back to v6 build 322.7 in this instance.

Can I ask that you create a printer profile from your V6 and V7 settings, then use the "Export to file" option that our printers offer to send us a copy of the printer settings you are using? Nonetheless, we will still need the original files that generate the large document.

Regards,
Daniel McIntyre

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Re: Optimising PDF files, manual font removal

Post by Christian Flatscher » Wed Jan 30, 2019 7:49 am

Hi Daniel,

I have uploaded the file

2019-01-30_bug_#32093.zip

to the Tracker server.

It contains the printer configuration files, the LibreOffice Writer file, and all font files in use in that file.

It would be interesting to try your settings on my computer for both v6 and v7 of the PDF Tools.

Thanks!

--Christian

TrackerSupp-Daniel
Site Admin
Posts: 2090
Joined: Wed Jan 03, 2018 6:52 pm

Re: Optimising PDF files, manual font removal

Post by TrackerSupp-Daniel » Wed Jan 30, 2019 7:42 pm

Hello Christian,

Thank you for the original file, preferences and your fonts.
Even with these items in place, testing with your v7 printer profile, manually changing the settings to what you displayed in the earlier video, and by using the settings I recommended above, I still ended up with a file that is almost exactly the same size as the V6 printer.

As I have asked multiple times already, we will need the output PDF file that has an excessively bloated size on your end. Please send that so we can dig into it and hopefully find the cause of this.

I should also note that the extensive load times you are seeing whenever opening the "fonts" sections of our driver and optimize function are very likely due to the extensive font library you have in place, or possibly (less likely) an error in one or more of those fonts is causing a heavier load than it should.

As for my printer settings, they are the defaults on a fresh installation, unless I am mimicking exactly what you have changed in your videos. So sure enough, when I export I have an identical file to your settings; no need to try that anymore.

Please send the output PDF files from V6 and V7 so that we can compare the differences and find out what is going on. Thank you for your cooperation.
Daniel McIntyre

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Re: Optimising PDF files, manual font removal

Post by Christian Flatscher » Thu Jan 31, 2019 7:31 am

Hello Daniel,

pdf_files.rar has been uploaded.

--Christian

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Re: Optimising PDF files, manual font removal

Post by Christian Flatscher » Thu Jan 31, 2019 9:25 am

Added also the file pdf_win10_virtual.rar.

TrackerSupp-Daniel
Site Admin
Posts: 2090
Joined: Wed Jan 03, 2018 6:52 pm

Re: Optimising PDF files, manual font removal

Post by TrackerSupp-Daniel » Thu Jan 31, 2019 6:52 pm

Thank you Christian,

We are looking into the files with the dev team now and hopefully can figure out what the root cause is from this. I will let you know if we need any more from you in the meantime.

Kind regards,
Daniel McIntyre

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Re: Optimising PDF files, manual font removal

Post by Christian Flatscher » Fri Feb 01, 2019 6:46 am

Thank you very much, Daniel.

TrackerSupp-Daniel
Site Admin
Posts: 2090
Joined: Wed Jan 03, 2018 6:52 pm

Re: Optimising PDF files, manual font removal

Post by TrackerSupp-Daniel » Fri Feb 01, 2019 6:24 pm

Hello Christian,

It does indeed seem to be caused by some font "bloat", for lack of a better term. The devs have asked if you can provide printer tmp files from the V7 printer (please make sure you are using build 328.2 for this test). This article details the creation of printer tmp files: https://www.tracker-software.com/knowle ... -I-do-this
This will help us to discern why the files created seem to have more of the font embedded than necessary, and why in this case the resulting file is so much larger than V6.

Thank you!
Daniel McIntyre

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Re: Optimising PDF files, manual font removal

Post by Christian Flatscher » Sat Feb 02, 2019 9:05 am

Hi Daniel,

I have uploaded the file bug_#32093_pxp4780.zip.

Regards,

--Christian

TrackerSupp-Daniel
Site Admin
Posts: 2090
Joined: Wed Jan 03, 2018 6:52 pm

Re: Optimising PDF files, manual font removal

Post by TrackerSupp-Daniel » Tue Feb 05, 2019 12:19 am

Hello Christian,

Thank you for the tmp files. Our devs now see the issue and are looking into ways to prevent the file bloat in V7 from happening in the future. Due to the nature and complexity of this, I cannot guarantee a timeline or an immediate fix. We will be working on this over time, so you will likely see gradual improvements over a few builds in the future.
For reference, the development ticket regarding this is RT #4642. You can ask any member of our support team for an update on the progress as time goes on.

Regarding being able to remove the fonts when using the optimize function: as I have mentioned before, in some instances fonts are embedded in a way that makes it impossible to remove them. This will not change as there is no fix to be made or special rules we can apply, it is intended to work this way and we cannot alter it. Any fonts that can be removed will be available for removal here; any other fonts that are visible in the document properties are embedded this way.

Kind regards,
Daniel McIntyre

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Re: Optimising PDF files, manual font removal

Post by Christian Flatscher » Tue Feb 05, 2019 10:56 am

Thank you, Daniel.

I understand that fonts cannot be removed when they are embedded the way LibreOffice does it. But when I use PDF Tools Pro v6 I can select fonts to be removed.

I will also post your statement on how fonts are embedded to the LibreOffice forum.

Thank you again for your support.

Regards,

--Christian

TrackerSupp-Daniel
Site Admin
Posts: 2090
Joined: Wed Jan 03, 2018 6:52 pm

Re: Optimising PDF files, manual font removal

Post by TrackerSupp-Daniel » Tue Feb 05, 2019 7:41 pm

Absolutely, that is included in the issue that we are investigating from the files you have sent us earlier. So now we simply need to wait patiently as the dev team works on it.

I am glad that we could work together on this and hope that it pays off soon.

Until next time!
Daniel McIntyre

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Re: Optimising PDF files, manual font removal

Post by Christian Flatscher » Wed Feb 06, 2019 5:59 am

Many thanks, Daniel.

Is there anything that Tracker Software can give me with respect to correctly embedding fonts into PDF files?

I am not interested in any kind of source code but rather in public, freely available documentation that explains how to do this. I googled this subject but I am not sure if the references I found are correct:

https://www.karlrupp.net/2016/01/embed- ... -pdflatex/

https://stackoverflow.com/questions/226 ... with-tcpdf

https://tcpdf.org/

Once I have this information I will open up a bug with the Libre Office team.

Thank you.

Regards,

--Christian

TrackerSupp-Daniel
Site Admin
Posts: 2090
Joined: Wed Jan 03, 2018 6:52 pm

Re: Optimising PDF files, manual font removal

Post by TrackerSupp-Daniel » Wed Feb 06, 2019 6:22 pm

Hello Christian,

With regards to how they are handled in PDF, we base everything off the ISO Standard: https://www.iso.org/standard/51502.html
Far from free or public documentation, but it is what should be used by anyone who even remotely handles PDF documents.

I should note that I personally do not have access to the documentation, so I cannot provide a passage or page number that might be helpful either. Beyond that, I am afraid that I do not have any further information to share.

Kind regards,
Daniel McIntyre

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Re: Optimising PDF files, manual font removal

Post by Christian Flatscher » Thu Feb 07, 2019 4:40 am

Many thanks, Daniel!

TrackerSupp-Daniel
Site Admin
Posts: 2090
Joined: Wed Jan 03, 2018 6:52 pm

Re: Optimising PDF files, manual font removal

Post by TrackerSupp-Daniel » Thu Feb 07, 2019 11:15 pm

:D
Daniel McIntyre

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Re: Optimising PDF files, manual font removal

Post by Christian Flatscher » Sat Feb 23, 2019 4:20 pm

Just did a little test with converting my LibreOffice Writer document to WinWord 2003 and RTF format. When opening these files with Word 2016 or WordPad the resulting PDF file is bloated. Deleting the first two pages, which contain mainly the Noto fonts, significantly reduces the PDF file size.

DIV
User
Posts: 62
Joined: Fri Jun 23, 2017 1:47 am

Why can some embedded fonts never be removed?

Post by DIV » Mon Feb 25, 2019 2:33 pm

TrackerSupp-Daniel wrote:
Tue Feb 05, 2019 12:19 am
As I have mentioned before, in some instances fonts are embedded in a way that makes it impossible to remove them. This will not change as there is no fix to be made or special rules we can apply, it is intended to work this way and we cannot alter it.
Hello, Daniel.

I was not familiar with this before.
Is it referring to entire fonts, or subsets?

Is there a simple reason why some embedded fonts cannot be removed? (Maybe you can say what would happen if they hypothetically were removed?)
Is there any way for users to identify in advance which embedded fonts can and can't be removed, aside from simply trying to remove everything and seeing what remains?
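As a rough first check, one can at least list the font names a PDF declares and see which look like subsets: subset fonts carry a tag of six uppercase letters followed by `+` in front of the font name, and embedded font programs are referenced through `/FontFile`, `/FontFile2` or `/FontFile3` keys. The Python sketch below scans raw bytes for these markers. It is a shortcut resting on several assumptions: it misses anything stored in compressed object streams, it does not decode `#xx` escapes in names, and it says nothing about which fonts the optimiser will agree to remove. The function names are made up for this example.

```python
import re

# /BaseFont /SomeName inside a font dictionary
BASEFONT = re.compile(rb"/BaseFont\s*/([^\s/<>\[\]()]+)")
# Subset fonts are named like ABCDEF+RealName
SUBSET_TAG = re.compile(r"^[A-Z]{6}\+")


def list_font_names(pdf_bytes):
    """Return sorted (name, looks_like_subset) pairs found in uncompressed PDF data."""
    names = sorted({m.group(1).decode("latin-1") for m in BASEFONT.finditer(pdf_bytes)})
    return [(name, bool(SUBSET_TAG.match(name))) for name in names]


def count_font_program_refs(pdf_bytes):
    """Count references to embedded font programs (not necessarily unique streams)."""
    return len(re.findall(rb"/FontFile[23]?\b", pdf_bytes))


# Hand-written fragment standing in for a real file; for an actual PDF one
# would read the bytes with open(path, "rb").read() instead.
sample = (
    b"<< /Type /Font /Subtype /TrueType /BaseFont /ABCDEF+NotoSans >>\n"
    b"<< /Type /FontDescriptor /FontName /ABCDEF+NotoSans /FontFile2 12 0 R >>\n"
    b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>\n"
)

print(list_font_names(sample))
print(count_font_program_refs(sample))
```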

—DIV

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Re: Optimising PDF files, manual font removal

Post by Christian Flatscher » Mon Feb 25, 2019 4:16 pm

I am considering slowly moving my files over to WinWord, as regrettably this is the only directly supported word processing application for the majority of PDF suppliers.

I have found a workaround for my issue that I need to dig into a little deeper. If I save the first two pages as a PDF file and then copy and paste the text into WinWord - not yet playing with the Noto fonts - then it seems that the resulting PDF file is OK size-wise.

My investigations in that arena will unfortunately take a little longer as I have for some time a very busy work schedule.

I shall update this ticket soon with my findings.

Thank you for your understanding.

TrackerSupp-Daniel
Site Admin
Posts: 2090
Joined: Wed Jan 03, 2018 6:52 pm

Re: Optimising PDF files, manual font removal

Post by TrackerSupp-Daniel » Mon Feb 25, 2019 8:29 pm

Thank you for the feedback Christian, we will look forward to that information.

DIV, as for your question about the fonts, it is difficult to explain, but I will try.
Normally a font is stored in one of two places: in the system, or in the file that uses it. Either or both of these locations can be used over time.

When a font is embedded, its structure is copied from the system directly into the file, and the respective "pointers" in the document are changed to point to the internal font instead. In turn, if this font is removed from the document, the character IDs that the pointers use will still try to find a relative font in the system fonts to substitute.

There is another way that fonts are sometimes embedded. A visual diagram might help here. Normally, fonts in a PDF are set up like this:

 PDF Content -> Font pointer (embedded/system) -> (embedded shape/system shape)
Theoretically, let's say that the letter A is always assigned the binary value of 1 in a font (it's much more complex than that normally). In this situation, the PDF content shows the letter A because the pointer (labelled as font value 1) is seen in both the embedded and system listing for the font as the character A. If, for example, the document was corrupted and the value changed to 10, we might see the letter D in A's place instead (again, a purely theoretical example). The advantage of this method is of course that the embedded font can be removed and character data is retained.
However, in some cases, the "shapes" of the font are directly embedded, without any pointers:

 PDF Content -> (embedded shape only)
Back to our A = 1 example: in this case, there is no pointer value (1) to be listed or stored, so A = A (the shape) and the pointer is skipped over. When this is moved between machines it works so long as the file is intact, but if the embedded font is broken in any way, there is no fallback to check the system fonts; because this character does not exist, there is no reference (basically a null pointer exception).
In this case, most applications will not allow you to remove the font, for two reasons.
One, doing so would effectively delete all data relating to the font, and you would instead be presented with a broken series of hashes and other completely useless information that cannot be recovered.
Two (and this is very often the case), because there is no reference/pointer for the font, it is commonly not possible to locate this information within the PDF itself, and most applications cannot tell whether it is necessary for the document. It therefore cannot be removed without the very real possibility of breaking something else in the document irreparably.
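The first reason can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the actual PDF structure: the private ids (`"g7"`, `"g3"`), the dictionaries and the `render` function are all hypothetical names chosen for illustration.

```python
# Illustrative only: the content stores private glyph ids that mean something
# only inside the embedded font. All names here are hypothetical.
EMBEDDED_ONLY_FONT = {"g7": "A-shape", "g3": "B-shape"}
SYSTEM_FONT = {1: "A-shape (system)", 2: "B-shape (system)"}
PAGE_CONTENT = ["g7", "g3", "g7"]  # would draw "ABA"


def render(content, embedded, system):
    """Return the shape drawn for each id, or None where nothing can be resolved."""
    shapes = []
    for glyph_id in content:
        if embedded is not None and glyph_id in embedded:
            shapes.append(embedded[glyph_id])
        else:
            # Private ids are unknown to the system font, so this yields None.
            shapes.append(system.get(glyph_id))
    return shapes


intact = render(PAGE_CONTENT, EMBEDDED_ONLY_FONT, SYSTEM_FONT)
font_removed = render(PAGE_CONTENT, None, SYSTEM_FONT)

print(intact)        # every id resolves while the embedded font is present
print(font_removed)  # nothing resolves once it has been removed
```

In this model there is nothing for a system font to fall back on, which is why removal loses the text rather than just changing its appearance.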

As a side note, I have built this entire post from memory of a conversation with the Dev team a few months back. I will get this verified for accuracy.
Daniel McIntyre
Support Technician
Tracker Software Products (Canada) LTD

Sales: +1 (250) 324-1621
Fax: +1 (250) 324-1623

DIV
User
Posts: 62
Joined: Fri Jun 23, 2017 1:47 am

Methods of embedding of fonts

Post by DIV » Tue Feb 26, 2019 12:13 am

Hi, Daniel.

Can I have your indulgence to speculate?

Regarding the two methods for font embedding, the first method makes sense to me: the content (i.e. text) contains letters ("A", "a", etc.); each letter is uniquely associated with a standard code number (e.g. Unicode); embedded fonts are effectively tables to look up shape for a given standard code number (e.g. Unicode). Even if the font isn't embedded, if it's present on the user's system the document will still look the same, because the system-installed font file will also match shapes to standard code numbers (e.g. Unicode). Even if the user doesn't have that specific font installed, the viewer (i.e. viewing application) can often choose a similar font that is installed on their system, so that the document will display and the letters are still recognisable and in the right places.
When users highlight text in the PDF to copy it and paste into another application, it will also retain the correct identification of each letter.

Regarding the second method, is it possible it is something like storing indexed colour? You mentioned/suggested/recalled that the letters ("A", "a", etc.) map directly to shapes. I find it hard to imagine that. Rather, I can imagine another method as follows: the content (i.e. text) contains letters ("A", "B", etc.); each letter is uniquely associated with an ad hoc code number (i.e. pointer), which might be assigned in some arbitrary way, such as whichever letter appears first in the document shall be number 1, whichever appears second is 2, and so on; embedded fonts are effectively tables to look up shape for the ad hoc code number (i.e. pointer) defined specifically in that PDF file. If the font isn't embedded, the document won't be able to render correctly, regardless of whether or not that font is installed on the user's system, because the pointers in the PDF file won't correspond to standard code numbers (e.g. Unicode).
When users highlight text in the PDF to copy it and paste into another application, it will not retain the correct identification of each letter. Although I'm just speculating about the mechanism, I have certainly seen PDF files that render correctly yet when copying text from them it comes out as mangled garbage (e.g. "Once upon a time" gets copied and pasted as something like "%5}[Vxh%5VlV.!T[").
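As a rough illustration of this speculation (and only that), the Python sketch below numbers characters in order of first appearance and then shows what a naive copy and paste would do with those private numbers. The function name `build_ad_hoc_encoding` and the offset of 32 are assumptions made for the example; they are not taken from any PDF tool.

```python
def build_ad_hoc_encoding(text):
    """Assign code 1 to the first distinct character seen, 2 to the next, and so on."""
    codes = {}
    for ch in text:
        if ch not in codes:
            codes[ch] = len(codes) + 1
    return codes


text = "Once upon a time"
encoding = build_ad_hoc_encoding(text)
stored_codes = [encoding[ch] for ch in text]

# The embedded font (here just the reverse table) turns codes back into the right shapes.
glyph_table = {code: ch for ch, code in encoding.items()}
rendered = "".join(glyph_table[code] for code in stored_codes)

# A naive copy/paste that treats the private codes as standard code points
# yields unrelated characters (offset by 32 only to land on printable ASCII).
pasted = "".join(chr(32 + code) for code in stored_codes)

print(rendered)
print(pasted)
```

The page still renders correctly because the embedded table is consulted, while the pasted string is garbage because nothing maps the private codes back to standard ones.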

—DIV

TrackerSupp-Daniel
Site Admin
Posts: 2090
Joined: Wed Jan 03, 2018 6:52 pm

Re: Optimising PDF files, manual font removal

Post by TrackerSupp-Daniel » Tue Feb 26, 2019 1:26 am

Hello DIV,

I am glad the first example made sense.

Regarding the second method: yes, it is more similar to an indexed color. As I mentioned, it is difficult to describe, but I do believe that in essence an "ad hoc" system is a better analogy for how it works under the hood. I was struggling to describe it that way, as in my writing it sounded nearly identical to the first method. Like I said, difficult to explain. I am glad that you were able to extrapolate my meaning from there.

As before, however, I am still going from memory of a single conversation months ago, so I will leave the definitive answers to the Dev team when they have free time. At the moment they are very busy sorting out multiple features and functions, so it may be a bit of a wait on that front.

Kind regards
Daniel McIntyre

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Re: Optimising PDF files, manual font removal

Post by Christian Flatscher » Tue Feb 26, 2019 6:41 am

Thank you very much for your explanations, Daniel and DIV.

I understand both now.

--Christian

TrackerSupp-Daniel
Site Admin
Posts: 2090
Joined: Wed Jan 03, 2018 6:52 pm

Re: Optimising PDF files, manual font removal

Post by TrackerSupp-Daniel » Wed Feb 27, 2019 9:47 pm

Glad to hear it was informative, Christian!

Have a great day :D
Daniel McIntyre

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Re: Optimising PDF files, manual font removal

Post by Christian Flatscher » Thu Feb 28, 2019 6:26 am

Thank you, sir, and the same to you, too!

Will - Tracker Supp
Site Admin
Posts: 6501
Joined: Mon Oct 15, 2012 9:21 pm
Location: London, UK

Re: Optimising PDF files, manual font removal

Post by Will - Tracker Supp » Thu Feb 28, 2019 9:29 am

:D
If posting files to this forum, you must archive the files to a ZIP, RAR or 7z file or they will not be uploaded.
Thank you.

Best regards

Will Travaglini
Tracker Support (Europe)
Tracker Software Products Ltd.
http://www.tracker-software.com

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Re: Optimising PDF files, manual font removal

Post by Christian Flatscher » Sun Mar 24, 2019 8:41 pm

Hello everybody,

Sorry that it took so long for me to complete my last tests.

I have now converted the LibreOffice Writer document to Word 2016. Below are the steps I took:

1. I reduced the original document to its first two pages, which are multi-language, as I found that they are the source of the problem I have with the size of the PDF file created with PDFXchange.
2. I selected the entire text, set the font to Calibri, and saved that LO Writer document as a Word 2016 file.
3. As a test, I converted the resulting file to PDF with PDFXchange; the result is a file of around 200 KB.
4. Then I changed the fonts to Noto, except the Western fonts, which I replaced with the Fira font.
5. Converting the Word file created in step 4 to PDF with PDFXchange yielded a PDF file of nearly 30 MB.
6. Converting the same file with the Word built-in PDF converter ("Save as PDF") creates an 800 KB PDF.
7. Converting the LibreOffice Writer file with the built-in PDF converter creates a 180 KB PDF file.
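
To put the sizes from the steps above side by side, here is a small Python calculation. It only uses the approximate figures quoted in the list; treating "nearly 30 MB" as exactly 30 MB and 1 MB as 1024 KB are assumptions made for the arithmetic.

```python
# Approximate sizes quoted in the steps above, in kilobytes.
# Assumption: "nearly 30 MB" is taken as 30 MB, with 1 MB = 1024 KB.
sizes_kb = {
    "PDFXchange, Calibri (step 3)": 200,
    "PDFXchange, Noto + Fira (step 5)": 30 * 1024,
    "Word 'Save as PDF' (step 6)": 800,
    "LibreOffice export (step 7)": 180,
}

baseline = sizes_kb["LibreOffice export (step 7)"]
ratios = {name: size / baseline for name, size in sizes_kb.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x the LibreOffice export")
```

On these rough numbers the step 5 file is on the order of 170 times larger than the LibreOffice export, while the Word export is only about 4 to 5 times larger.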

Here is an overview of the files in the attached word_lowriter_noto.zip file:

safety_lo.odt - The source LibreOffice writer document with Noto fonts
safety_lo.pdf - PDF file created with the LibreOffice integrated PDF converter

safety_calibri.odt - LibreOffice Writer file after converting all fonts in safety_lo.odt to Calibri
safety_calibri.docx - Word file created from safety_calibri.odt
safety_calibri.pdf - Word file converted to PDF with PDFXchange

safety_word.docx - Word file created from safety_calibri.docx, changed all non-Western fonts to the corresponding Noto fonts
safety_word.pdf - PDF file created with the Word built in PDF converter

It may be that PDFXchange has issues with the Noto font family from Google.

Thank you for looking into this matter again.

--Christian
Attachments
word_lowriter_noto.zip
(1.42 MiB) Downloaded 11 times

TrackerSupp-Daniel
Site Admin
Posts: 2090
Joined: Wed Jan 03, 2018 6:52 pm

Re: Optimising PDF files, manual font removal

Post by TrackerSupp-Daniel » Tue Mar 26, 2019 12:12 am

Hello Again Christian,

Thank you for the sample files and the detailed step-by-step process. We are currently digging into this issue and I will get back to you tomorrow with a progress update (please post here if I do not, just in case I get busy and forget). It does indeed appear to be a font-specific issue; however, our software is not the only software affected in my tests. Many other virtual printers seem to have the same issue with the Noto fonts (as an example, Microsoft Print to PDF creates a 53 MB file when printing).

All the best,
Daniel McIntyre

TrackerSupp-Daniel
Site Admin
Posts: 2090
Joined: Wed Jan 03, 2018 6:52 pm

Re: Optimising PDF files, manual font removal

Post by TrackerSupp-Daniel » Tue Mar 26, 2019 8:25 pm

Hello Christian.

The devs have just come back to me, and it seems that they have found the cause of the issue and are working on a resolution. Unfortunately, it is highly unlikely that they will be able to include it in the initial release of V8 (build 329, which is just around the corner), but they are hoping to include it in one of the builds shortly after that.

Kind regards,
Daniel McIntyre

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Re: Optimising PDF files, manual font removal

Post by Christian Flatscher » Wed Mar 27, 2019 10:29 am

This is great news, Daniel.

Thanks!

Have a great day.

--Christian

Will - Tracker Supp
Site Admin
Posts: 6501
Joined: Mon Oct 15, 2012 9:21 pm
Location: London, UK

Re: Optimising PDF files, manual font removal

Post by Will - Tracker Supp » Wed Mar 27, 2019 11:09 am

:D

Best regards

Will Travaglini

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Re: Optimising PDF files, manual font removal

Post by Christian Flatscher » Wed Apr 24, 2019 1:43 pm

Hi,

I have now downloaded PDFXPro v8, build 331.

Please let me know if the issue reported has been fixed.

If not, please provide me with a time line until when I can expect it to be fixed.

Just as a last question - is it OK to install v8 over v7 or should I remove the v7 prior to installing the latest version?

Thanks!

--Christian

TrackerSupp-Daniel
Site Admin
Posts: 2090
Joined: Wed Jan 03, 2018 6:52 pm

Re: Optimising PDF files, manual font removal

Post by TrackerSupp-Daniel » Fri Apr 26, 2019 8:01 pm

Hello Christian,

Installing V8 over the top of V7 should not have any negative side effects; it will, however, overwrite the installation, so V7 will no longer be available. We have made some changes in this area, but I cannot confirm that this issue has specifically been addressed. Please update and test, then let us know if it is all working on your end.

If the issue is still not fixed on your end, we will begin looking into other changes to help with this.

Kind regards,
Daniel McIntyre

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Re: Optimising PDF files, manual font removal

Post by Christian Flatscher » Sat Apr 27, 2019 4:47 pm

Hi Daniel,

Today I installed PDFXChange Pro 8.0 Build 331.

Nothing has changed.

The resulting PDF file is still around 77 MB in size for two pages. :(

The only thing that has improved a bit is the time it takes to create the PDF file.

Please let me know what is required in order to investigate this matter.

What I don't understand is that with PDFXChange 6 Build 332.7 this issue does not exist...

Thank you.

Regards,

--Christian

Tracker Supp-Stefan
Site Admin
Posts: 13293
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: Optimising PDF files, manual font removal

Post by Tracker Supp-Stefan » Fri May 03, 2019 1:12 pm

Hello Christian,

I just spoke with the colleague who implemented the fix for ticket 4642, mentioned above in the topic. The fix could not actually be included in build 331, and will be available in the next one.
For the time being you can install an older version of our products (e.g. V6), which should work correctly:
https://www.tracker-software.com/versio ... change-pro
We will have the fix released with the next version.

Regards,
Stefan

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Re: Optimising PDF files, manual font removal

Post by Christian Flatscher » Fri May 03, 2019 3:18 pm

Great, many thanks, Stefan.

I have kept the old v6 installer... :)

Can you please find out roughly when the next build is going to be released?

Thanks again!

--Christian

Will - Tracker Supp
Site Admin
Posts: 6501
Joined: Mon Oct 15, 2012 9:21 pm
Location: London, UK

Re: Optimising PDF files, manual font removal

Post by Will - Tracker Supp » Fri May 03, 2019 3:54 pm

Hi Christian,

As we've just released a new build, we don't have any ETA at this time.

Thanks,

Best regards

Will Travaglini

Christian Flatscher
User
Posts: 27
Joined: Wed May 23, 2012 12:34 pm

Re: Optimising PDF files, manual font removal

Post by Christian Flatscher » Fri May 03, 2019 8:43 pm

Thank you, Will.

Then I will check back later on the download page.

--Christian

TrackerSupp-Daniel
Site Admin
Posts: 2090
Joined: Wed Jan 03, 2018 6:52 pm

Re: Optimising PDF files, manual font removal

Post by TrackerSupp-Daniel » Fri May 03, 2019 8:55 pm

:D
Daniel McIntyre
