
Optimizing PDF bigger than 150 Pages takes hours

Posted: Fri Aug 09, 2019 3:04 pm
by pesce
Hello!

I need to optimize large PDF Files, i.e. use "Save as optimized PDF".

By large I mean both in file size and in number of pages. Sample PDF documents can be found here.
The PDF files are produced by a Java Swing application. They contain many complex transparent objects which can be optimized.

PDF-XChange Pro is able to optimize these files and compresses them by a factor of about 10:1 :D However, it takes hours (!) to complete :(

Results, depending on how many pages the document contains:
  • 100 pages only: Duration 2 mins
  • 150 pages only: Duration 5 mins
  • 250 pages only: Duration 20 mins
  • full document, ca. 1000 pages: ca. 6 hours
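
To put a number on how quickly the duration grows with page count, the four timings above can be fitted on a log-log scale. This is only a rough sketch: it treats "ca. 6 hours" as 360 minutes and "ca. 1000 pages" as exactly 1000, and four data points are too few for a precise estimate. A slope near 1 would mean linear scaling; a slope near 2 or above means the time per page itself grows as the document gets longer.

```python
import math

# Timings reported in the post: (pages, minutes).
# Assumption: "ca. 6 hours" is taken as 360 minutes, "ca. 1000 pages" as 1000.
timings = [(100, 2), (150, 5), (250, 20), (1000, 360)]

def loglog_slope(points):
    """Least-squares slope of log(minutes) against log(pages)."""
    xs = [math.log(pages) for pages, _ in points]
    ys = [math.log(minutes) for _, minutes in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

exponent = loglog_slope(timings)
print(f"estimated scaling exponent: {exponent:.2f}")
```

The fitted exponent comes out well above 2, so the duration grows much faster than the page count.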
Windows version: Microsoft Windows 10 Pro, Version 10.0.16299 Build 16299.
PDF-XChange Editor Version: Version 8 Build 331.0
Hardware: LENOVO_MT_20HG_BU_Think_FM_ThinkPad T470s, i7-7600U CPU @ 2.80GHz, Physical Memory (RAM): 20.0 GB

Optimizer settings: see attachment. Note that the last setting, "Find and remove content outside of the Crop Box", is especially important; only this setting leads to small PDFs.
Optimize_Settings.PNG
Many thanks for your help

Re: Optimizing PDF bigger than 150 Pages takes hours

Posted: Tue Aug 13, 2019 4:55 pm
by TrackerSupp-Daniel
Hello Pesce,

Thank you for the report. Optimizing is a very in-depth process, and because the odds are slim that every page is identical, a simple "X pages takes Y time" comparison doesn't really work in this case.

The content that optimization needs to sift through changes from page to page, and while it might not seem like much, this includes tiny details that even the human eye cannot pick up. Take this screenshot for example.
image.png
On Page 3, there are arrows created as images, which is no big issue. There are, however, also a few dozen images that are "clipped" to 1x1 and completely invisible to the naked eye. Each of these will negatively impact the optimization time, and a cursory glance through the rest of the document shows that many pages are set up like Page 3. Conversely, Page 2 is very clean, with no images whatsoever and minimal content to go through, which results in more efficient optimizing.
image.png
In practice, if the issue is that optimizing locks up the Editor and you are unable to use it during this processing time, I would advise using PDF-Tools for the process instead, as it offers all the same functions and is also able to run in the background. It may not be ideal, but it is expected for a document of several hundred pages to take tens of minutes, or even multiple hours, to optimize, depending on the content of each page.

We on the support team are running some additional tests with your documents to see if we can find something that would provide any improvement.

Kind regards,

Re: Optimizing PDF bigger than 150 Pages takes hours

Posted: Tue Aug 13, 2019 6:17 pm
by TrackerSupp-Daniel
Hello Pesce,

Thank you again for the report. These tests have brought to light an issue we hadn't caught before. It appears that each consecutive page processed by "Save as optimized" with your settings takes slightly longer than the previous one. For example, pages 1-10 take under a second each, pages 40-50 take around 1 second each, pages 100+ take multiple seconds each, and so on. I have reported this to the Development team and created a high-priority ticket to rectify this issue. I cannot say when it will be resolved, but we are currently working on it.
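
To illustrate why a small per-page slowdown adds up to hours, here is a minimal sketch. It assumes, purely for illustration, that page k costs `base + growth * k` seconds; the linear form and the values 0.5 and 0.02 are hypothetical and were picked only to roughly match the per-page times mentioned above, not measured from the Editor. Under that assumption the total time grows roughly with the square of the page count.

```python
def total_seconds(pages, base=0.5, growth=0.02):
    """Sum the assumed per-page cost base + growth * k for k = 1..pages."""
    return sum(base + growth * k for k in range(1, pages + 1))

def total_seconds_closed_form(pages, base=0.5, growth=0.02):
    """Same total via the arithmetic-series formula for 1 + 2 + ... + pages."""
    return base * pages + growth * pages * (pages + 1) / 2

for pages in (100, 200, 400):
    print(pages, "pages:", round(total_seconds(pages)), "seconds")
```

With these hypothetical values, doubling the page count more than triples the total time, which is the same kind of faster-than-linear growth seen in the timings reported at the top of the thread.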

For reference, you can ask any member of our support team for the following ticket number, and we will provide an update if available:
RT#4872: Optimization Takes longer per page processed

I am looking for a workaround you can use to speed up the process in the meantime, but have so far been unsuccessful. I will be sure to come back as soon as I have found anything that may be of use.

Re: Optimizing PDF bigger than 150 Pages takes hours

Posted: Tue Aug 20, 2019 12:13 pm
by pesce
Hi @TrackerSupp-Daniel,

Many thanks for your analysis and especially for opening a ticket.

I am looking forward to hearing from you.

Kind regards,
Pesce

Re: Optimizing PDF bigger than 150 Pages takes hours

Posted: Tue Aug 20, 2019 3:08 pm
by Will - Tracker Supp
:)

Re: Optimizing PDF bigger than 150 Pages takes hours

Posted: Thu Aug 22, 2019 10:49 pm
by Timur Born
It would also be nice if PDF-XChange could use multithreading for its optimization process. Maybe something like one thread per page or so.
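
As a rough sketch of what "one thread per page" could look like structurally, the snippet below hands each page to a worker from a pool. It says nothing about how PDF-XChange works internally: `optimize_page` is a hypothetical stand-in that merely drops empty items, and the sketch assumes pages can be processed independently, which may not hold for a real PDF where pages share fonts, images and other resources.

```python
from concurrent.futures import ThreadPoolExecutor

def optimize_page(page):
    """Hypothetical per-page work: drop empty content items."""
    return [item for item in page if item]

def optimize_document(pages, workers=4):
    """Process pages concurrently; map() returns results in document order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(optimize_page, pages))

# Toy document: each page is a list of content items, "" marks removable content.
doc = [["text", "", "image"], ["", ""], ["text"]]
result = optimize_document(doc)
print(result)
```

Note that in Python, threads do not speed up CPU-bound work because of the global interpreter lock; the sketch only shows the shape of the idea, not a performance gain.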

Re: Optimizing PDF bigger than 150 Pages takes hours

Posted: Thu Aug 22, 2019 11:56 pm
by TrackerSupp-Daniel
Hello Timur,

Optimizing should now make use of multiple cores. At least in the test build for 332 it does, as you can see from the spike that occurs when I begin optimizing an 800-page document:
image.png
With that said, unfortunately the issue reported earlier in this thread is not yet resolved, so the overall process is still a long one.

Kind regards,

Re: Optimizing PDF bigger than 150 Pages takes hours

Posted: Fri Aug 23, 2019 7:15 am
by Timur Born
Not seeing any meaningful multi-threading here. The bottleneck is one single thread, the others are quick to come and go with little real extra load happening.