Remove Fonts

Forum for the PDF-XChange Editor - Free and Licensed Versions

Puffolino
User
Posts: 317
Joined: Wed Feb 09, 2011 1:06 pm

Remove Fonts

Post by Puffolino »

I try to shrink my PDF library but embedded fonts are hard to beat :roll:

Here's a small example document (1,5MB) which contains 20 fonts.

Using the PDF-XChange printer driver strips most of them, but the resulting file is around 9MB; optimizing does not work either.

The only way which works in general is to select the text page by page and then change the font to one of the "Standard PDF fonts". This can be done manually, but it would be great if a similar function could be automated. So what about this: allow replacing a font for the whole document? This could be initiated from the font list in the document properties and might show a warning message that replacing could change the rendering result or so...

Another point I would really like to see in the document property dialog: an instance of the button "Audit Space Usage" from the optimizing dialog.
Tracker Supp-Stefan
Site Admin
Posts: 17824
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: Force font embedding in Editor

Post by Tracker Supp-Stefan »

Hello Puffolino,

I do not see this suggestion ever working out. Different fonts have different font metrics, so even if the correct font with the correct size is used, text might still need more (or less) space than with the original font, and text positioning will suffer. So the only way for you to change fonts is to do it manually, as you can then see the result immediately and adjust the font metrics accordingly if needed. Automating such a feature would cause way more trouble than good, even with "simple" and "straightforward" looking files!
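
To make the metrics point concrete, here is a minimal sketch. It assumes two made-up fonts ("Font A" and "Font B") whose advance widths are invented for illustration; only the unit convention (widths given in 1/1000 of the font size) reflects how PDF stores widths for simple fonts.

```python
# Hypothetical advance widths in 1/1000 of the font size.
# The numbers are invented for illustration, not taken from real fonts.
FONT_A_WIDTHS = {"f": 300, "i": 250, "r": 350, "s": 450, "t": 300}
FONT_B_WIDTHS = {"f": 350, "i": 300, "r": 400, "s": 500, "t": 350}


def text_width_pt(text: str, widths: dict[str, int], size_pt: float) -> float:
    """Return the horizontal space a run of text needs, in points."""
    return sum(widths[ch] for ch in text) * size_pt / 1000


width_a = text_width_pt("first", FONT_A_WIDTHS, 12)
width_b = text_width_pt("first", FONT_B_WIDTHS, 12)
print(f"Font A: {width_a:.1f} pt, Font B: {width_b:.1f} pt")
```

The same word at the same size occupies a different width in each font, so every following text run would have to be repositioned after a replacement.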

I can pass the "Audit Space Usage" suggestion to the devs for consideration.

Cheers,
Stefan
Puffolino
User
Posts: 317
Joined: Wed Feb 09, 2011 1:06 pm

Re: Force font embedding in Editor

Post by Puffolino »

Stefan, you're definitely right that changing text in any way may change the document in a very aggressive way (that is clear, therefore I mentioned the warning message :))
Besides different font face parameters, there are (at least) two more well-known reasons why editing text using the PDF editor is tricky: tabulators (especially because of the "Edit Text Elements as Blocks" AI) and line spacing in paragraphs (it would be perfect to have a virtual line spacing value which could be incremented or decremented)...

...but back to my "problem": how to reduce the size of PDF files (especially before sending them via mail)? Doing the job manually is horrible. I tried it with just one single page of the document above (page 2), which has 240KB when extracted but could be squeezed to around 6KB by getting rid of the embedded fonts.
It is also difficult to see which part of the text uses which font (and whether certain characters are embedded or not). Maybe a special option in the find function could give this information by highlighting these places?

Another option would be to add additional rude optimizing methods in the font section, like "remove all embedded fonts" and "remove all embedded fonts where similar font names are installed". Anyhow, I'm not sure if other users also need to shrink files as much as I do.
Timur Born
User
Posts: 874
Joined: Tue Jun 26, 2012 1:50 pm

Re: Force font embedding in Editor

Post by Timur Born »

But there already is a "Remove all embedded fonts" option in the "Save as optimized" function?! Is this what you are looking for?

It would be great if there was an automated way to replace fonts with similar substitutes, which is not possible at the moment. What is possible, though, is to select all text of the whole PDF file and replace it with a single font (like Arial). This alone decreases filesize of your sample file to 835 kb.
DIV
User
Posts: 252
Joined: Fri Jun 23, 2017 1:47 am

Re: Force font embedding in Editor

Post by DIV »

Timur, Puffolino stated that they had done the following: "select the text page by page then changing the font to one of the "Standard PDF fonts"."
Perhaps it would be worthwhile to explain to Puffolino specifically how you recommend "to select all text of the whole PDF file and replace it with a single font (like Arial)".
—DIV

P.S. I think Puffolino could really have posted this as a new topic, because they are asking how to not embed fonts, whereas the original question was the converse: ensuring fonts would be embedded.
Timur Born
User
Posts: 874
Joined: Tue Jun 26, 2012 1:50 pm

Re: Remove Fonts

Post by Timur Born »

Content Pane -> right-click -> Select -> Text

This will select all text in the document. Choose a font in the properties panel and all text is changed to the new font while text size and style remain untouched.

If I remember correctly then I once suggested a feature to find & replace specific fonts in a PDF file. That would help a lot in this case.
Timur Born
User
Posts: 874
Joined: Tue Jun 26, 2012 1:50 pm

Re: Remove Fonts

Post by Timur Born »

I also had a look at the "Unembed all fonts" option. It does still keep those parts of the fonts embedded that are not present on my system, which mostly seems to be that HP41. Some fonts are still listed as "Embedded subset" and curiously "Symbol" is listed as "Embedded", despite being present on my system. File size still decreases by 400 kb, which corresponds to removing all fonts that Editor is capable of removing (down from 645 to 245 kb).

This offers the best visual compromise between embedding those fonts that are necessary ("AMC_OS/X Module" headlines) and removing everything that's already installed on the system anyway.

It would still be useful to get a specific font-replacement option.
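
The "Embedded" versus "Embedded subset" labels mentioned above can be sketched roughly as follows. This is only an illustration on plain Python dicts, not a PDF parser and not how Editor actually decides. It relies on two conventions of the PDF format: an embedded font program sits under `/FontFile`, `/FontFile2` or `/FontFile3` in the font descriptor, and subset fonts carry a tag of six uppercase letters plus "+" in front of the font name. The sample entries are made up, and composite (Type0) fonts, which keep the descriptor on a descendant font, are ignored here.

```python
import re

# Keys that hold an embedded font program inside a PDF font descriptor.
FONT_FILE_KEYS = ("/FontFile", "/FontFile2", "/FontFile3")
# Subset fonts carry a tag of six uppercase letters and a plus sign.
SUBSET_TAG = re.compile(r"^[A-Z]{6}\+")


def embedding_status(font: dict) -> str:
    """Classify a font dictionary (here a plain dict, not a parsed PDF)."""
    descriptor = font.get("/FontDescriptor", {})
    if not any(key in descriptor for key in FONT_FILE_KEYS):
        return "Not embedded"
    name = font.get("/BaseFont", "").lstrip("/")
    return "Embedded subset" if SUBSET_TAG.match(name) else "Embedded"


# Made-up sample entries for illustration only.
fonts = [
    {"/BaseFont": "/ABCDEF+Tahoma", "/FontDescriptor": {"/FontFile2": b"..."}},
    {"/BaseFont": "/Symbol", "/FontDescriptor": {"/FontFile": b"..."}},
    {"/BaseFont": "/ArialMT", "/FontDescriptor": {}},
]
for font in fonts:
    print(font["/BaseFont"], "->", embedding_status(font))
```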

As a side-note: At one point those special fonts were installed as "temporary" fonts on my system and as such they are accessible in MS Word. Once I closed Editor those fonts were removed and I could not reproduce it again.
Puffolino
User
Posts: 317
Joined: Wed Feb 09, 2011 1:06 pm

Re: Remove Fonts

Post by Puffolino »

Sorry 'bout hijacking this thread - thought it would be a good place here...

...but I also thought "Unembed all font" would do the trick, but after doing so, the file is still 1,1MB here - which means at least it got smaller (in many cases, optimizing results in larger files). But the optimized file contains the same 20 embedded fonts as before (at least this will be shown in the document properties).

So when I saw this and looked at my actual document which has a size of around 30MB and 136 embedded fonts I was writing the first post in this thread :oops:
TrackerSupp-Daniel
Site Admin
Posts: 8440
Joined: Wed Jan 03, 2018 6:52 pm

Re: Remove Fonts

Post by TrackerSupp-Daniel »

Hello All,

It appears that Stefan split the topics here, so now this post is its own thread, no need to worry there. :)

Thank you for the lengthy discussions here, it makes for good reading material! Unembedding all fonts is a function that attempts to remove every embedded font it can. There are times when this cannot be done, most commonly when a font is in use and no valid substitute is available, and the Editor must respect those cases. Another situation when a font will be left in place is when it exists in the document improperly, as in cases like this we cannot detect if it is still in use, thus removing it is unsafe.
While we offer these functions to give you more freedom of action, there are times when we must say "this cannot be done", as making those changes will not result in something simple like missing content, it will most often result in a corrupted and unusable document.

Kind regards,


As an addendum: as Stefan mentioned above, it is highly unlikely that we will ever offer a "find and replace" function for text, fonts, or otherwise, as there are simply far too many issues that can occur with it, and such a feature would almost never live up to expectations when making these changes.
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

+++++++++++++++++++++++++++++++++++
Our Web site domain and email address has changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com
Timur Born
User
Posts: 874
Joined: Tue Jun 26, 2012 1:50 pm

Re: Remove Fonts

Post by Timur Born »

Well, in this very case common fonts like Comic, Tahoma and Symbol are not removed (entirely). And once the PDF is saved as optimized, Editor claims that no more fonts can be unembedded. The font list in the Save as Optimized dialog is then empty, while the list in the document properties dialog still shows 20 fonts. So at least for these fonts it seems strange that Editor cannot remove them.
TrackerSupp-Daniel
Site Admin
Posts: 8440
Joined: Wed Jan 03, 2018 6:52 pm

Re: Remove Fonts

Post by TrackerSupp-Daniel »

Hello Timur,
TrackerSupp-Daniel wrote: Tue Feb 18, 2020 6:56 pm There are times when this cannot be done, most commonly when a font is in use and no valid substitute is available, and the Editor must respect those cases. Another situation when a font will be left in place is when it exists in the document improperly, as in cases like this we cannot detect if it is still in use, thus removing it is unsafe.
Note that these were only examples, and there are other possible cases, but I do believe that this explains what is happening there. The fonts in question likely exist in the document in some form that it is not possible to remove without risking damage to your document.

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

Timur Born
User
Posts: 874
Joined: Tue Jun 26, 2012 1:50 pm

Re: Remove Fonts

Post by Timur Born »

This is rather vague, though, and the "risking to damage the document" part does not convince me.

I assume that font embedding is standardized in some form?! When I replace all text with Arial in this example document then all other fonts are removed without issues. The HP41 headlines look funny then (not consisting of real letters to begin with).

For testing I replaced all text with "Arial" (non MT) in a 640-page file containing 453 fonts (a mix of Type1 and TrueType). This left 4 original fonts (same font name, 2x bold, 2x regular) intact for no apparent reason. It also embedded "Arial MT" as Identity-H and lists "Arial MT" as WinAnsiEncoding unembedded. So the removing part mostly does not seem to be the issue here, but rather Editor's willingness to do so.

Frankly, when someone specifically and intentionally chooses "Unembed all fonts" then this choice should be followed. Furthermore all fonts should be listed in the "Selects fonts to unembed manually" list, this way users can trial & error if removing a specific font causes issues. It's not as if users destroy their original document by saving an optimized version.

Optimizing my 640-page file leads to only 21 out of 453 fonts being listed here, and after optimization all 453 fonts are still embedded at least partially, albeit font size decreased from 1.1 mb to 680 kb. But I could easily select and replace all text with a different font with no obvious document corruption happening. This is a strong case for offering a specific font replacement option, so that the user can choose which fonts to manually substitute (aka replace) and which to keep intact. Again, the document is not broken after replacing 453 fonts with Arial, despite Editor's unwillingness to unembed said fonts. Of course Arial does not fit into the original formatting in many places, but this is not the same as breaking a document.
TrackerSupp-Daniel
Site Admin
Posts: 8440
Joined: Wed Jan 03, 2018 6:52 pm

Re: Remove Fonts

Post by TrackerSupp-Daniel »

Hi, Timur Born

Much like outright removing something that is in use from anything else could damage it, PDF files are no exception. If we cannot replace it, we will not outright remove it, because that process could damage the document. As I said before, it could be that:
  1. The font is in use and no suitable substitute could be found
  2. The font exists in the document improperly (was embedded incorrectly)
  3. One of a few other, smaller and less common issues (I honestly do not know every possibility well enough to give accurate details)
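
The first two cases in the list can be pictured as a simple decision rule. This is a hypothetical sketch only: the `FontInfo` fields and the `unembed_decision` function are invented for illustration and do not describe the Editor's actual implementation, which covers more cases than these.

```python
from dataclasses import dataclass


@dataclass
class FontInfo:
    """Hypothetical summary of one embedded font."""
    name: str
    in_use: bool
    has_substitute: bool  # a matching font can stand in for it
    well_formed: bool     # embedded according to the PDF standard


def unembed_decision(font: FontInfo) -> str:
    """Return 'remove' or the reason for keeping the font embedded."""
    if not font.well_formed:
        return "keep: embedded improperly, usage cannot be verified"
    if font.in_use and not font.has_substitute:
        return "keep: in use and no suitable substitute"
    return "remove"


for font in (
    FontInfo("HP41", in_use=True, has_substitute=False, well_formed=True),
    FontInfo("Tahoma", in_use=True, has_substitute=True, well_formed=True),
):
    print(font.name, "->", unembed_decision(font))
```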
Yes, there are indeed standards to be used while embedding fonts in a PDF. As you are well aware, however, there are countless PDF applications in the world which do not adhere to these standards. We do our best to adhere to the PDF standards and maintain compatibility with anything else which does, without damaging your documents irreparably.
As many people use the optimize feature and immediately overwrite the original file (despite the fact that we offer "save as" and a new file name to avoid that) we cannot easily implement a function which may possibly cause unexpected changes like this. Especially when, unlike OCR for example, no preview is visible before the save function happens.

If manually changing the fonts is possible and rectifies the issue, it would mean that case 1 is the most likely reason why we cannot unembed the fonts automatically. They are in use and no equivalent substitute is available to be put in place for the removal. Please note that manually changing the font in use does not emulate unembedding the fonts, and cannot be used as a comparative scenario.

While your option to offer all fonts, including those we cannot safely unembed, might possibly be considered viable in an abstract sense, I am sure you will be quite hard-pressed to find anyone willing to go through the process you are suggesting here. Removing any number of fonts from a document, one at a time, waiting for the entire optimization process to complete each time to find out which ones they cannot remove (and as above, potentially overwriting the original). All before finally arriving at a usable version of the document which still has the issue that most of the fonts are imbalanced because they were replaced with something generic and incorrectly sized. Frankly, it is far too niche to be worth implementing on our end, or even worth using from most users' perspective.
It is simply a feature that, if implemented, would result in dozens of additional support requests each week (slowing down our response time), with requests for help recovering lost files, which we cannot do.

As has been iterated many times, we will not likely offer a search and replace fonts function for the foreseeable future. It may come in time, when we can accurately replace these items without heavily impacting the quality/layout of the document in question, but it is not something we are considering currently.

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

DIV
User
Posts: 252
Joined: Fri Jun 23, 2017 1:47 am

Re: Remove Fonts

Post by DIV »

Hi, Daniel.

Thanks for your extended response.

I must admit to naïvely expecting that all fonts should 'easily' be able to be unembedded. This expectation probably comes from some misconception(s) I have about the way PDF documents are created/structured/read.

I imagined that an application that creates a PDF is at complete liberty to either include or not include the fonts in use (for simplicity I discuss only full fonts, not individual characters), and that when the font is embedded, it just means adding 'extra information' to the PDF file. It is then easy to imagine that any 'extra information' in a PDF file could subsequently be removed, because it should produce a file equivalent to one that could have been created without embedded fonts from the outset.

Conceptually as follows.

FILE WITH EMBEDDING:
~~~~~~~~~~
<Font definition. ID:00001. Name:"myFont BOLD". Definition: "1"=draw fat vertical line; "2"=draw fat curvy line; ....>
<Font definition. ID:00002. Name:"yourFont LIGHT". Definition: "1"=draw thin vertical line with hook at top; "2"=draw thin zigzag line; ....>
<Page definition: ...>
<Text element 00001. Font=00001. Size=12pt. Position=...>Much Ado About Nothing<end>
<Text element 00002. Font=00001. Size=12pt. Position=...>by<end>
<Text element 00003. Font=00002. Size=12pt. Position=...>William Shakespeare<end>
<Text element 00004. Font=00002. Size=-10pt. Position=...>Copyright<end>
~~~~~~~~~~

FILE WITHOUT EMBEDDING:
~~~~~~~~~~
<Page definition: ...>
<Text element 00001. Font="myFont BOLD". Size=12pt. Position=...>Much Ado About Nothing<end>
<Text element 00002. Font="myFont BOLD". Size=12pt. Position=...>by<end>
<Text element 00003. Font="yourFont LIGHT". Size=12pt. Position=...>William Shakespeare<end>
<Text element 00004. Font="yourFont LIGHT". Size=-10pt. Position=...>Copyright<end>
~~~~~~~~~~

So in this highly simplified conception of PDF syntax that I am imagining, it seems like it would be straightforward to include or exclude font definitions, and if there were a corrupted usage (e.g. Size=-10pt), then it would affect the file either way.
I agree that there are probably numerous applications that don't respect the relevant standards, but even so I'd be surprised to learn that corrupted font definitions are commonplace.
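
The simplified conception above can be mimicked in a few lines. This is only a toy model of the imagined structure, not real PDF syntax; the dictionary layout and the `unembed` function are assumptions made for illustration.

```python
def unembed(document: dict) -> dict:
    """Drop font definitions and reference fonts by name instead of by ID."""
    names = {font_id: font["name"] for font_id, font in document["fonts"].items()}
    return {
        "fonts": {},
        "text": [
            {**element, "font": names[element["font"]]}
            for element in document["text"]
        ],
    }


# Toy document following the conceptual listing (glyph data is a placeholder).
embedded = {
    "fonts": {
        1: {"name": "myFont BOLD", "glyphs": "..."},
        2: {"name": "yourFont LIGHT", "glyphs": "..."},
    },
    "text": [
        {"font": 1, "size": 12, "content": "Much Ado About Nothing"},
        {"font": 1, "size": 12, "content": "by"},
        {"font": 2, "size": 12, "content": "William Shakespeare"},
        {"font": 2, "size": -10, "content": "Copyright"},
    ],
}
plain = unembed(embedded)
print(plain["text"][0])
```

In this toy model the odd `size` of -10 survives in both versions, which is the point made above: a corrupted usage affects the file whether or not the fonts are embedded.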

One thing I am struggling to understand is the stress on "removing something that is in use". Embedded font definitions are presumably (almost) always 'in use'. That is, besides the exceptional case where the originating software embeds a particular font, but either the original document has no text in that font, or perhaps the resultant PDF was subsequently edited to remove all text using that font, without removing the embedded definition. Even if the user has an identically named font on their system, the PDF viewer is supposed to respect the embedded definition, in case there is a mismatch between the two identically named definitions.

If I request software to unembed fonts, then I fully expect that the resulting file will not render quite correctly unless the person viewing the PDF has all of the necessary fonts installed on their system. Are you saying that most/many/some users would not have this expectation?

—David
Timur Born
User
Posts: 874
Joined: Tue Jun 26, 2012 1:50 pm

Re: Remove Fonts

Post by Timur Born »

TrackerSupp-Daniel wrote: Thu Feb 20, 2020 9:18 pm Much like outright removing something that is in use from anything else could damage it, PDF files are no exception. if we cannot replace it, we will not outright remove it because that process could damage the document. As I said before, It could be that:
First, let me clarify that Adobe Acrobat behaves the same as Editor, even including the same issues. So Editor is adhering to the "standard" here. That does not mean that I concur with the limitations, though, some things can still be done better than the "standard".
TrackerSupp-Daniel wrote: 1. The font is in use and no suitable substitute could be found
What is a "suitable substitute" then? In my test both Editor and Acrobat substitute one unembedded font with a different *embedded* font that seems to be of equal width. Both chose the same substitute font and both cause the same issue with the word "first" in this example. Copy & paste still shows the "fi" letters being present, and when I enter any other letter in the space the "i" suddenly becomes visible.

Original:
font_original.png (49.01 KiB)
Unembedded:
font_unembedded.png (46.41 KiB)
So the suggestion that the current implementation of unembedding is "safe" seems not entirely correct. There are problems even within the current limitations that call said limitations into question when it comes to "safety". And if unembedding is unsafe anyway, then giving users optional (!) control over removing even more fonts seems viable.
TrackerSupp-Daniel wrote: 2. The font exists in the document improperly (was embedded incorrectly)
Selecting all "text" and then exchanging fonts leaves 4 fonts (2 x 2 of the same) of the original document still embedded. So those might be affected by being improperly embedded. They are listed as being "custom" encoded.
TrackerSupp-Daniel wrote: 3. One of a few other, smaller and less common issues (I honestly do not know every possibility well enough to give accurate details)
The most common issue would be that the substitute font was too small, especially in width. One of the "less common issues" was just demonstrated by me with the above screenshots. Obviously the current limitation did not prevent this issue from happening, though, so the limitation seems somewhat arbitrary.
TrackerSupp-Daniel wrote: Yes, there are indeed standards to be used while embedding fonts in a PDF. As you are well aware, however, there are countless PDF applications in the world which do not adhere to these standards. We do our best to adhere to the PDF standards and maintain compatibility with anything else which does, without damaging your documents irreparably.
If I can select all text and replace all fonts with a single one, how would unembedding the same fonts on the same text damage the document irreparably? Other than possible layout changes in that text may not fit in its original text boxes (too large or too small)?!
TrackerSupp-Daniel wrote: As many people use the optimize feature and immediately overwrite the original file (despite the fact that we offer "save as" and a new file name to avoid that) we cannot easily implement a function which may possibly cause unexpected changes like this.
Editor does not keep me from selecting all text and exchanging all fonts, I can hit CTRL-S right afterwards without warning. How is unembedding fonts an "unexpected change" then? It uses 1. a dedicated optimization panel with 2. a specific manually to be chosen stronger unembed option and 3. manually changing the filename back to the original. Keeping knowledgeable users from using functions, because other users may be stupid enough to jump three (3!) protective mechanisms is unfair to the knowledgeable users.
TrackerSupp-Daniel wrote: Especially when, unlike OCR for example, no preview is visible before the save function happens.
I can follow that argument, but at one point users have to take responsibility for using advanced features. Just blocking said features for everyone else, because some people specifically overwrite their original file without backups seems questionable at best.
TrackerSupp-Daniel wrote: If manually changing the fonts is possible and rectifies the issue, it would mean that case 1 is the most likely reason why we cannot unembed the fonts automatically. They are in use and no equivalent substitute is available to be put in place for the removal.
Which begs the question again: what qualifies as an "equivalent substitute", and why did the current implementation fail nevertheless? (see above screenshot)
TrackerSupp-Daniel wrote: Please note that manually changing the font in use does not emulate unembedding the fonts, and cannot be used as a comparative scenario.
This still seems to be very vague to me. If a font is embedded properly enough for being displayed by Editor then I would expect it to be embedded properly enough for everything else. I frankly don't know how this case would present itself.
TrackerSupp-Daniel wrote: ... Removing any number of fonts from a document, one at a time, waiting for the entire optimization process to complete each time to find out which ones they cannot remove (and as above, potentially overwriting the original).
First of all, unembedding fonts in my 642 pages (454 fonts) document is so fast that I cannot even time how long it takes Editor to do so. It is much faster than simply selecting all text of that document.

The potential for overwriting without a preview could easily be avoided by opening the optimized version in a new document, just like OCR offers as an option.
TrackerSupp-Daniel wrote: All before finally arriving at a usable version of the document which still has the issue that most of the fonts are imbalanced because they were replaced with something generic and incorrectly sized. Frankly, it is far too niche to be worth implementing on our end, or even worth using from most users perspective.
Infix PDF Editor offers find & replace for fonts. Furthermore it then analyzes the results to find those text-boxes where the substituted font does not fit inside the box anymore. It then allows to semi-automatically change fonts in size, letter spacing and line spacing (shrink/stretch to fit). It's not perfect, but it's also not impossible.
grafik.png
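
The "shrink/stretch to fit" idea boils down to a ratio. The following is a minimal sketch under stated assumptions: the numbers are hypothetical, the `fit_scale` helper is invented for illustration, and this is not a description of how Infix actually does it.

```python
def fit_scale(box_width_pt: float, text_width_pt: float) -> float:
    """Scale factor that makes substituted text fill its original box again."""
    if text_width_pt <= 0:
        raise ValueError("text width must be positive")
    return box_width_pt / text_width_pt


# Hypothetical numbers: the original run was 19.8 pt wide,
# the same text in the substitute font is 22.8 pt wide.
scale = fit_scale(19.8, 22.8)
new_size = 12 * scale  # shrink a 12 pt font so the run fits
print(f"scale {scale:.3f}, new size {new_size:.2f} pt")
```

The same factor could instead be applied to letter spacing or horizontal scaling, which keeps the font height unchanged.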
Find & Replace still is different from unembedding, though, and I feel that we should concentrate on the latter here.
DIV
User
Posts: 252
Joined: Fri Jun 23, 2017 1:47 am

Re: Remove Fonts

Post by DIV »

I have to commend Timur Born on an excellent contribution, setting out a series of arguments in detail with appropriate examples.

For the benefit of other readers, I presume that the "fi" characters referred to in Timur Born's example ("the first to stand") were, in the original text at least, actually the ligature "ﬁ", which is a single Unicode character (U+FB01). [Try to select/highlight just the "f" or only the "i" in "ﬁ": it's not possible.] This is quite a common ligature, but it is possible that some fonts might not include character U+FB01, in which case the result reported by Timur could occur.
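
As a small illustration of the difference between the ligature and the two separate letters, the sketch below uses Python's standard `unicodedata` module. It only inspects Unicode text and says nothing about how any particular PDF stores its glyphs.

```python
import unicodedata

ligature = "\ufb01"  # the single-character "fi" ligature, U+FB01
print(len(ligature), unicodedata.name(ligature))

# Compatibility normalization (NFKC) splits the ligature into plain "f" + "i".
decomposed = unicodedata.normalize("NFKC", ligature)
print(len(decomposed), decomposed)

# A word typed with the ligature compares equal to "first" only after normalizing.
word = "\ufb01rst"
print(word == "first", unicodedata.normalize("NFKC", word) == "first")
```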

Now this brings up a few possible courses of action:
  1. Change to 'unconstrained' font removal functionality. Reckless/inexperienced users might thereby create PDF files that don't behave as they expected — thereby causing problems for Tracker Software's support staff (and perhaps affecting sales?). But careful/experienced users might be very happy.
  2. Maintain the status quo. Timur has explained some issues with the current 'constrained' font removal functionality.
  3. Tighten the definition of a "suitable substitute" to be more 'conservative', e.g. to prevent the "fi" substitution problem. Could this perhaps mean that the font removal functionality would rarely be available, because "suitable substitutes" (where all used glyphs/characters can be suitably mapped) might be rare?
  4. Just give up and do not provide any font removal functionality. Careful/experienced users might be disappointed.
I appreciate that this is not just a technical consideration, but also a business support/marketing/sales consideration.
Nevertheless, some of the difficulties might be addressed by the following:
  • showing a preview before saving, or simply making the change as an (undoable) 'edit' without necessarily saving; or
  • automatically saving with a different filename; or
  • showing additional warnings along the lines of "Removing all embedded fonts might cause problems viewing the content. Are you sure you want to proceed?"
Looking back over some of the discussion in this thread, I am not clear on whether we are all talking about the same thing.
I.e. unembedding fonts should drastically reduce file size, but if the fonts are installed in the file reader's OS, then the file should still display (almost) exactly the same as the original PDF that contained embedded fonts.
When I refer to "removing" fonts, I mean unembedding.
Of course, actions such as changing the formatting of all text to use a single font, say, would be a reasonable trigger for unembedding (the now unused) fonts. But I don't think that font substitution — regardless of whether it's done manually or automatically — should be the only mechanism for unembedding fonts.

—DIV
TCJCM
User
Posts: 6
Joined: Tue Feb 18, 2020 8:07 pm

Re: Remove Fonts

Post by TCJCM »

Puffolino wrote: Sun Feb 16, 2020 6:53 am I try to shrink my PDF library but embedded fonts are hard to beat :roll:

Here's a small example document (1,5MB) which contains 20 fonts.

(..), optimizing does not work either.
  • The publishing program used is no good at all.
  • E.g. three different fonts in the footer line is no good, so reducing the number of fonts used would be good. Too many different styles/fonts hurt the eyes!
Timur Born
User
Posts: 874
Joined: Tue Jun 26, 2012 1:50 pm

Re: Remove Fonts

Post by Timur Born »

Thanks for the feedback! Looks like these "fi" characters are true f + i, though. Here I am using copy & paste from Editor to Firefox:
had been the first to stand against
they were the first to fall
There also is the case where the "i" becomes visible again, once I insert a space.
Timur Born
User
Posts: 874
Joined: Tue Jun 26, 2012 1:50 pm

Re: Remove Fonts

Post by Timur Born »

To clarify, I can select both the "f" and the "i" by themselves, so it does not seem to be a "fi" ligature. At least Editor's copy & paste is able to differentiate them.
TrackerSupp-Daniel
Site Admin
Posts: 8440
Joined: Wed Jan 03, 2018 6:52 pm

Re: Remove Fonts

Post by TrackerSupp-Daniel »

Hi, Timur Born

When copying content, the Editor has an option enabled by default to "preserve original ligatures"
image.png
If this is enabled, it should maintain the originals as they are.

We have been monitoring this discussion and I wanted to inform you that our Dev team is beginning the considerations for a rework, both improved font removal (to address the issues mentioned here) and proper font embedding options. I should note that this is a large undertaking and as such, development on that front likely will not start until after we have our current large project (accessibility overhaul) complete. I cannot offer a timeline or guarantee for this implementation, but it is being seriously considered.

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

Timur Born
User
Posts: 874
Joined: Tue Jun 26, 2012 1:50 pm

Re: Remove Fonts

Post by Timur Born »

Thanks Daniel,

the option was already enabled (default?) and disabling it does not seem to make any difference. The "fi" still vanishes after unembedding fonts and I can still separately select the "f" and the "i". I can even select them after they vanished! Earlier in the paragraph another "fi" is missing within the word "battleFIeld", too.

Then I checked the text in the content pane, and indeed the "fi"s are still present in the text. Changing the font manually makes the "fi" visible again, even when choosing the same original font that the "fi" is already supposed to be encoded with (when I select the invisible "fi" the font is properly selected).
grafik.png (5.17 KiB)
But I noticed something else: When I replace larger parts of the paragraph with the same original font, the white-spaces shift (become wider) and the "fi"s reappear. So I tried the various white-space options in the same settings you linked to before optimization, but unfortunately (though as expected) they did not help. On the other hand, I reported earlier that manually inserting a white-space between the invisible "f" and "i" makes the "i" reappear, although I cannot produce the same for the "f". So there seems to be some link between white-spaces and the "fi" issue.

Again, unembedding fonts from the same document via Adobe Acrobat produces the same error, so it may be document-specific anyway, but it demonstrates that unembedding can sometimes cause more issues than font replacement.

Thanks for keeping us informed about the devs discussion, it's well appreciated! :!:
TrackerSupp-Daniel
Site Admin
Posts: 8440
Joined: Wed Jan 03, 2018 6:52 pm

Re: Remove Fonts

Post by TrackerSupp-Daniel »

Hi, Timur Born

The checkbox mentioned above is specifically for when Copying content, usually with the "Select text" tool, it should not have an impact on the editing of text. Despite Adobe experiencing the same errors, we are still looking into this to see if there is anything we can do to handle it better than they do.

I will try to remember to post here about any updates on the discussion, but as we are ramping up for the 337 release there will likely not be any news for the next few weeks. After that, the big feature we will be focusing on is still the accessibility overhaul, so you will likely not hear anything on this front until after the V9 release (when we plan to have accessibility completed).

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

+++++++++++++++++++++++++++++++++++
Our Web site domain and email address has changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com
Timur Born
User
Posts: 874
Joined: Tue Jun 26, 2012 1:50 pm

Re: Remove Fonts

Post by Timur Born »

Copy & Paste works the same with ligature option enabled or disabled. The content pane also reveals that these really seem to be separate "f" and "i" characters to begin with. So who knows what's going on there.

I mostly only chimed in on the ongoing discussion, so the time-line is fine with me. Thank you so far.
User avatar
TrackerSupp-Daniel
Site Admin
Posts: 8440
Joined: Wed Jan 03, 2018 6:52 pm

Re: Remove Fonts

Post by TrackerSupp-Daniel »

8)
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

DIV
User
Posts: 252
Joined: Fri Jun 23, 2017 1:47 am

Re: Remove Fonts

Post by DIV »

Take your time, and stay safe :-)

We appreciate that you and the team are considering these questions, and are also making the effort to provide updates along the way.

—DIV
User avatar
TrackerSupp-Daniel
Site Admin
Posts: 8440
Joined: Wed Jan 03, 2018 6:52 pm

Re: Remove Fonts

Post by TrackerSupp-Daniel »

Hi, DIV

Thank you for the patience and understanding. We are all doing our best to stay safe, and most members of the team are working from home (myself included) currently. As such, things have slowed down a bit, and I unfortunately still do not have any new updates on this topic for you all.

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

DIV
User
Posts: 252
Joined: Fri Jun 23, 2017 1:47 am

Timur's file with disappearing "fi" (or "ﬁ"?)

Post by DIV »

Dear Timur,
it's been a while since I posted my 'presumption' about the "fi" combination being a ligature "ﬁ" in your file. There were a couple of reasons for that:
  • it is something that could conceivably affect those specific letters, because they are often available as a ligature in a font, but not always;
  • in the snapshot image labelled "Original" that you provided they definitely appeared to be styled as a ligature — the most obvious characteristics being that there is no "dot" shown above the "i", and secondly the cross-bar on the "f" seems, visually, to join onto the stem of the i at the top; other characteristics are that the hooky bit at the top of the "f" is wider in the word "first" than it is in the word "fall", and the serif at the top of the "i" stem is oblique in "raiment", but is horizontal in "first" (besides being visually joined to the "f").


Based on your subsequent description I can think of only two other possibilities, as follows.
  • There is something weird about the way PDF-XChange is displaying/handling that combination of letters in that font in that file. However, it seems you've also found that "fi" is also exhibiting strange behaviour when that file is opened in (or handled by) other PDF-viewing/editing software.
  • Normally the "fi" ligature would be created in an OpenType font as a "standard ligature" in the form of a suitable single character (and single glyph) at Unicode codepoint U+FB01. Another (different) feature of OpenType fonts is that they can adjust the appearance of an individual character depending on the neighbouring characters. For example, if the capital "L" is usually represented with a glyph that has a swash to the lower right, the font could be programmed so that it would be represented with an alternative glyph if the "L" were immediately followed by a character with a descender that might clash with the swash. Thus, the individual characters can change their appearance based on the context. It is even possible to program the OpenType font to replace one character with an entirely different character in specified contexts. For example, you could program the font to automatically render the hash symbol as a sharp whenever the hash character occurs after a capital letter A–F (i.e. B# would automatically become B♯); it's kind of an 'auto-correct' feature that can optionally be built into the font. So it is possible that what looks for all the world like a single-glyph, single-character ligature is actually two separate glyphs from two separate characters, but that the font programming has used some contextual rules to replace the usual glyphs (and characters?) with custom glyphs that each appear like part of a ligature, and those glyphs/characters would likely be saved at private-use Unicode codepoints. For example, "f" (U+0066) might be replaced by something that looks like "Ғ" (at, say, U+E000) if it is followed by an "i", and that following "i" (U+0069) might then be replaced by something that looks like "ɩ" (at, say, U+E001). Put them side-by-side and they might appear (visually) like a ligature "Ғɩ", even though they are still two separate (but nearby or even touching) characters. This would surely lead to problems (in any software application!) if the font were changed, because it is extremely unlikely that any other font would define its private-use Unicode codepoints like that.
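The difference described above, between a single-character ligature at U+FB01 and glyphs parked at private-use codepoints, can be checked on any text copied out of a PDF. Below is a minimal Python sketch using only the standard `unicodedata` module. The helper names (`describe`, `is_private_use`, `expand_ligatures`) are invented for illustration, and U+E000 is merely the example codepoint from the text, not something known to be in Timur's file.

```python
import unicodedata


def describe(text):
    """Return (codepoint, general category, Unicode name) for each character."""
    rows = []
    for ch in text:
        name = unicodedata.name(ch, "<unnamed>")  # private-use characters have no name
        rows.append((f"U+{ord(ch):04X}", unicodedata.category(ch), name))
    return rows


def is_private_use(ch):
    """True if the character lies in a private-use area (general category 'Co')."""
    return unicodedata.category(ch) == "Co"


def expand_ligatures(text):
    """Use NFKC compatibility normalisation to split ligatures such as U+FB01."""
    return unicodedata.normalize("NFKC", text)


sample = "\uFB01rst"                  # the word "first" written with the fi ligature
print(describe(sample))
print(expand_ligatures(sample))
print(is_private_use("\uE000"))       # hypothetical private-use slot from the example
```

If the pasted text shows plain U+0066 and U+0069, it contains no ligature character. If it shows codepoints in category "Co", the font is using private slots that another font is unlikely to define in the same way.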
—DIV
Timur Born
User
Posts: 874
Joined: Tue Jun 26, 2012 1:50 pm

Re: Remove Fonts

Post by Timur Born »

The font is listed as Type 1 font with "Custom" encoding. When I replace the font with itself then the combined fi "ligature" changes to individual fi letters while I hover the mouse over the font selection list, the letters also change their lines to be more individual (connection between f and i). Some other spaces also change then. Once I click the mouse-button to select the font everything turns back to how it was before.
grafik.png
grafik1.png
Replacing the font with another font is no problem, the "fi" remains readable, only unembedding had that strange effect of making the "fi" vanish (and only the "i" reclaimable via inserting a space) in both Editor and Acrobat. When I replace the missing "fi" of the unembedded version with another font then the "fi" comes back, too.
User avatar
Tracker Supp-Stefan
Site Admin
Posts: 17824
Joined: Mon Jan 12, 2009 8:07 am
Location: London
Contact:

Re: Remove Fonts

Post by Tracker Supp-Stefan »

Hello Timur Born,

Thanks for your input!

I hope that is useful, DIV?

Kind regards,
Stefan
johnlernert
User
Posts: 1
Joined: Thu Apr 16, 2020 10:23 am

Re: Remove Fonts

Post by johnlernert »

Timur Born wrote: Wed Apr 08, 2020 7:00 pm The font is listed as Type 1 font with "Custom" encoding. When I replace the font with itself then the combined fi "ligature" changes to individual fi letters while I hover the mouse over the font selection list, the letters also change their lines to be more individual (connection between f and i). Some other spaces also change then. Once I click the mouse-button to select the font everything turns back to how it was before.

grafik.png
grafik1.png

Replacing the font with another font is no problem, the "fi" remains readable, only unembedding had that strange effect of making the "fi" vanish (and only the "i" reclaimable via inserting a space) in both Editor and Acrobat. When I replace the missing "fi" of the unembedded version with another font then the "fi" comes back, too.
https://makeupexp.com/
Type 1 is a font format which came to market around 1984, together with PostScript and the Apple LaserWriter. This is the reason why the font format is sometimes called PostScript Type 1, even though you can also print these fonts on non-PostScript devices since the early nineties.

Since Type 1 is over 20 years old, it is positively ancient in technology terms. The format has effectively been superseded by OpenType, which keeps all the advantages of Type 1 and adds cross-platform compatibility and a slew of sophisticated typographic features.

Some of the key limitations of Type 1:
  • It does not allow for more than 256 glyphs (character shapes) to be included in a single font. There is a special flavor of Type 1, called CID fonts, that gets around this limitation but it relies on a rather rigid ‘character collections’ mechanism. Old (pre-2003) output devices sometimes cannot handle CID fonts properly.
  • Type 1 fonts are not cross-platform. There are tools to convert Mac fonts to Windows and vice versa but the conversion can be a hassle.
  • The font format stores its data in a number of separate files (the minimum being 2 files).
  • Since the font format is so old, font names stick to the 8 character limit of the older DOS days. The cryptic names make it difficult to determine which typeface is stored in a file.
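
The 256-glyph limit ties in with how such fonts are usually addressed in a PDF: text is stored as one-byte codes, and an encoding maps each code to a glyph name. The Python sketch below models that, assuming a PDF-style "Differences" list in which an integer sets the next code and the names after it fill consecutive codes. The codes, glyph sets and function names are all hypothetical. It shows one way a glyph can go missing once the embedded font is no longer available, and is not a diagnosis of the file discussed in this thread.

```python
def apply_differences(base, differences):
    """Overlay a PDF-style Differences list on a base {code: glyph name} map."""
    encoding = dict(base)
    code = None
    for item in differences:
        if isinstance(item, int):
            code = item                      # an integer sets the next code to fill
            continue
        if code is None:
            raise ValueError("a glyph name appeared before any starting code")
        if not 0 <= code <= 255:
            raise ValueError(f"code {code} does not fit in one byte")
        encoding[code] = item                # a name fills the current code
        code += 1
    return encoding


def glyphs_for(codes, encoding, available_glyphs):
    """Return the glyph shown for each code; '.notdef' marks a missing glyph."""
    shown = []
    for code in codes:
        name = encoding.get(code, ".notdef")
        shown.append(name if name in available_glyphs else ".notdef")
    return shown


# Hypothetical data: a tiny base encoding plus a custom slot for the "fi" glyph.
base = {ord(c): c for c in "first"}
custom = apply_differences(base, [174, "fi", "fl"])
text = bytes([174, 114, 115, 116])           # "fi" glyph, then r, s, t

embedded = {"f", "i", "r", "s", "t", "fi", "fl"}   # glyphs in the embedded font
substitute = {"f", "i", "r", "s", "t"}             # a replacement lacking "fi"

print(glyphs_for(text, custom, embedded))
print(glyphs_for(text, custom, substitute))
```

In this model the text itself never changes; only the set of glyphs available to draw it does.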
Timur Born
User
Posts: 874
Joined: Tue Jun 26, 2012 1:50 pm

Re: Remove Fonts

Post by Timur Born »

Well, it is what it is. As a customer I have no say in what font type my bought PDF files come in, but many of them use lots of Type 1 fonts, alongside Truetype.
DIV
User
Posts: 252
Joined: Fri Jun 23, 2017 1:47 am

Re: Remove Fonts

Post by DIV »

As I read the above posts last year, I remember having the following thoughts.
  • Some old technologies remain in common use. For instance with images the BMP, JPEG and TIFF formats are all similarly 'ancient' (by the above standard), but all are frequently encountered today, and even remain popular for creating new image files. "Adobe announced on 27 January 2021 that they would end support for Type 1 fonts in Adobe products after January 2023." according to https://en.wikipedia.org/wiki/PostScript_fonts#Type_1 — that's still a couple of years off in the future.
  • Being unable to work with 'legacy' formats can also cause frustration. Although I realise there has to be a practical limit to how many formats are supported.
—DIV

P.S. FYI, on my Windows 8.1 system my Windows\Fonts folder contains 192 *.fon files (out of 636 total files, in other words 30 % of all files). I have no idea how the files got there, and which applications (if any) use them, or what they're used for. But they're there, and there in large number, even though allegedly "FON files are now obsolete." per https://fileinfo.com/extension/fon#:~:text=A%20FON%20file%20is%20a%20Windows%203.x%20font,the.FNT%20file%20format.%20FON%20files%20are%20now%20obsolete. (There may be some answers in the URL...)
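
A count like the one above can be reproduced with a few lines of Python. This is only a sketch: the folder path is an assumption (the usual location on Windows), and the function names are invented for the example.

```python
from collections import Counter
from pathlib import Path


def count_by_extension(folder):
    """Count the files directly inside a folder, keyed by lower-cased extension."""
    counts = Counter()
    for entry in Path(folder).iterdir():
        if entry.is_file():
            counts[entry.suffix.lower()] += 1
    return counts


def share(counts, extension):
    """Percentage of all counted files that carry the given extension."""
    total = sum(counts.values())
    return 100.0 * counts[extension] / total if total else 0.0


if __name__ == "__main__":
    fonts = Path(r"C:\Windows\Fonts")        # assumed location, adjust as needed
    if fonts.is_dir():
        counts = count_by_extension(fonts)
        print(counts.most_common())
        print(f".fon share: {share(counts, '.fon'):.0f} %")
```

With the figures quoted above, 192 of 636 files works out to about 30 %.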
User avatar
TrackerSupp-Daniel
Site Admin
Posts: 8440
Joined: Wed Jan 03, 2018 6:52 pm

Remove Fonts

Post by TrackerSupp-Daniel »

:)
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

Locked