Greetings,
I have a 140 page pdf that was created from jpg images. The images are 150 dpi (1218 pixels wide by 1650 pixels tall). The images are of a school yearbook. The resulting pdf is 539,579 KB.
The pdf will be available for viewing on our Alumni website. Some classmates may want to print the pdf.
We will have about 40 other PDFs on the website. Each PDF will be between 40 and 140 pages, and currently they all have the same dpi, width and height. The files will take up a lot of space on our website. I do not know what our allocated space is on our website.
Years ago 72 dpi was ideal for viewing and 96 dpi for printing. Monitors and printers have changed.
What is the best way to shrink the file size and have it look halfway decent on a computer monitor? Should I reduce the dpi (for viewing) and mention that if someone wants a PDF to print, they can let me know and I will share a larger PDF file with them using a service like Dropbox? (I've used Dropbox for other things.)
I have all the original images at 300 dpi and in BMP format. I'll use those to recreate JPGs at a lower dpi before creating a PDF if I have to.
Thanks - David
Smaller pdf file size
- TrackerSupp-Daniel
- Site Admin
- Posts: 8557
- Joined: Wed Jan 03, 2018 6:52 pm
Re: Smaller pdf file size
Hello David,
That is quite the predicament indeed. My suggestion would fall in line with your idea: keep a lower-resolution version "live", and store the higher-resolution versions in another location for those who explicitly request them. In terms of the live version, I am not sure how much lower than 150 dpi you can go without sacrificing too much image fidelity; however, reducing the images to 96 dpi will likely still offer a nice "middle ground" between size and visual acceptability.
Theoretically, a 96 dpi image can hold as little as ~1/10 of the data of a 300 dpi image of the same physical size, since the pixel count scales with the square of the dpi ((96/300)² ≈ 0.10). That gives an idea of the potential savings from that kind of reduction.
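As a rough illustration of that arithmetic, here is a minimal Python sketch using the page dimensions from the original post (1218 x 1650 pixels at 150 dpi). The function names are made up for this example, and it only compares pixel counts; actual JPEG file sizes depend on image content and compression settings, so they will not follow these ratios exactly.

```python
def pixel_dimensions(width_in, height_in, dpi):
    """Pixel width and height of a page of the given physical size at the given dpi."""
    return round(width_in * dpi), round(height_in * dpi)


def pixel_count_ratio(target_dpi, source_dpi):
    """Fraction of the source pixel count left after resampling to target_dpi.

    Both width and height scale linearly with dpi, so the count scales with the square.
    """
    return (target_dpi / source_dpi) ** 2


# Physical page size implied by the original post: 1218 x 1650 px at 150 dpi.
width_in = 1218 / 150
height_in = 1650 / 150

for dpi in (300, 150, 96, 72):
    w, h = pixel_dimensions(width_in, height_in, dpi)
    share = pixel_count_ratio(dpi, 300)
    print(f"{dpi:>3} dpi: {w} x {h} px, {share:.3f} of the 300 dpi pixel count")
```

Going from the current 150 dpi to 96 dpi keeps about 41% of the pixels ((96/150)² ≈ 0.41), so the saving relative to the existing files is smaller than the saving relative to the 300 dpi originals.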
The Editor also offers an optimization function, via File > Save as optimized, which allows you to tweak many settings before optimizing, and allows you to audit the current space usage so that you can see how much of the document is images, fonts, etc. This may help in your endeavours.
Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD
+++++++++++++++++++++++++++++++++++
Our Web site domain and email address has changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com
Convert to text.
In general there is another way to drastically shrink file size, which is to convert from images (of text) to actual text.
A couple of warnings up front. This will only be appropriate if:
- the original scan quality is excellent; and
- the document comprises mostly prose (not mostly photographs or graphics); and
- you have plenty of time and patience.
The steps are:
- Acquire high-quality scans. In general 300 dpi is suitable, but 400 dpi may be needed for small text.
- Perform Optical Character Recognition (a.k.a. "OCR"). This creates a 'layer' of real text. This text is set invisible, so that it won't intrude on viewing the underlying scan image.
- Make the text visible. (See also forum.)
- Delete the image.
I originally supposed that this could only be done page by page, but I have now discovered that the text colour (in the OCR layer) can be adjusted for the entire document in just one step. However, the images seem to be deletable only by manual selection. (Unless there's a clever trick?)
In any case, for a real-life use case I believe you would probably want to go page by page anyway, because (i) you should check that the OCR results are OK, and (ii) you won't want to rashly delete the scan images for pages that contain graphics or photographs. The latter would be troublesome to optimise 'manually', but in theory could be done by using the functionality to right-click images and edit them in third-party software. (In this case the editing would be cropping or blanking of text portions of a scan.)
As I say, a pain to do manually, but theoretically a good way of reducing file size.
Alternatively, before the scan images are deleted, the PDF file can be exported to Microsoft Word document format. The images can then be manipulated within Microsoft Word (deleted, cropped, moved, etc.). An advantage of this is that it would be easier to find and correct spelling issues caused by mistakes in the OCR.
This might also help explain the reason why file size should shrink: it is like 'reverse engineering' the scanned images to get a formatted-text document (with perhaps just a few embedded graphics).
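To put a rough number on that, here is a back-of-envelope Python sketch. The average page size comes from the figures in the original post (539,579 KB over 140 pages); the 3,000 characters per page and one byte per character are purely assumptions for illustration, and fonts, formatting and any retained graphics are ignored.

```python
def average_page_kb(total_kb, pages):
    """Average size of one scanned page, in KB."""
    return total_kb / pages


def text_page_kb(chars_per_page, bytes_per_char=1):
    """Approximate size of one page stored as plain text, in KB (no fonts or formatting)."""
    return chars_per_page * bytes_per_char / 1024


# Figures from the original post: 539,579 KB for 140 pages.
image_kb = average_page_kb(539_579, 140)

# Hypothetical page of prose: about 3,000 characters at 1 byte each.
text_kb = text_page_kb(3000)

print(f"scanned page: about {image_kb:.0f} KB")
print(f"text page:    about {text_kb:.1f} KB")
print(f"ratio:        roughly {image_kb / text_kb:.0f} to 1")
```

The gap only applies to pages that really are mostly prose; pages that keep their photographs or graphics will stay close to their image size.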
—DIV
P.S. In principle it would be possible for a software application to implement this automatically with a dedicated OCR option like "PDF Output Style" = "Formatted Text & Graphics". Hypothetically, of course.
- TrackerSupp-Daniel
- Site Admin
- Posts: 8557
- Joined: Wed Jan 03, 2018 6:52 pm
Re: Smaller pdf file size
Hello DIV,
Thank you for the comprehensive write-up; I hope that it helps DWC121.
As for my slip-up regarding the ratio, it seems that was a typo on my part. Apologies for the error!
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD