PDF-XChange - Tracker PDF Viewer - TIFF-XChange - Image-XChange - XMF-XChange - Raster-XChange - Support


 
Mitch

Get a Scribd book as a searchable PDF with PDF Xchange Editor

Mon May 01, 2017 3:54 am

Create a searchable PDF from any Scribd book with PDF Xchange Editor
How? Use software to automate flipping the pages and taking screenshots, then use PDF Xchange to create a high-quality searchable PDF.

Windows or Mac

I need my books to be available offline, and Scribd Premium's offline storage often fails. The solution is to capture the pages from the screen and create a searchable PDF. This is for personal use. I tried OSX Automator, Abbyy Finereader, Acrobat DC professional, and ePub. All of these have issues with readability or OCR accuracy.

Here we go:

Log in to Scribd and open the book you want to read.

Step 1: Screenshot pages as PNG with Keyboard Maestro (Mac) or AutoHotkey (Windows). AutoHotkey is free; Keyboard Maestro is paid software with a free trial. Make sure the screenshots are taken with the book in full screen.

Keyboard Maestro for Mac:

Keyboard Maestro Editor.png
Keyboard Maestro settings


AutoHotkey for Windows
This is a bit more complex and involves a script:

^!r:: ; Ctrl+Alt+R starts the script
Loop, 400 ; repeat once per page; 400 here, adjust to the length of the book
{
    Send, +{PrintScreen} ; keystroke Shift+PrintScreen takes the screenshot
    Sleep, 5000 ; pause for 5 seconds (SetKeyDelay only sets the delay between sent keystrokes)
    Send, {Right} ; keystroke Right flips to the next page
    Sleep, 5000 ; pause for 5 seconds so the next page can load
}
return
For more info check https://autohotkey.com/board/topic/5811 ... re-script/
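
To plan a capture run, it helps to estimate how long the loop will take. The sketch below is only an illustration in Python, assuming the intended timing of the script above (two pauses of 5 seconds per page, 400 pages); the function name `capture_duration` and its parameters are hypothetical.

```python
def capture_duration(pages, delay_after_shot=5.0, delay_after_flip=5.0):
    """Estimate the total run time of the screenshot loop.

    Assumes each iteration pauses once after the screenshot and once
    after the page flip, and ignores the time the keystrokes take.
    """
    total_seconds = pages * (delay_after_shot + delay_after_flip)
    minutes, seconds = divmod(int(total_seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return total_seconds, f"{hours}h {minutes:02d}m {seconds:02d}s"


print(capture_duration(400))  # (4000.0, '1h 06m 40s')
```

So a 400-page book takes a little over an hour at these settings. Shorter pauses speed things up, but the page may not have finished loading when the screenshot is taken.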

Step 2: Batch conversion and rename with XnView (Mac) or Irfanview (Windows).
PDF Xchange can also sharpen scanned images, but not in batch. That's why I use XnView or Irfanview.
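
When renaming, zero-padded page numbers keep the files in reading order when they are sorted by name for Step 3. The following is a small Python sketch of such a naming scheme; the `page_` prefix and the function name `page_filename` are assumptions for illustration, not settings taken from XnView or Irfanview.

```python
def page_filename(index, total, prefix="page_", ext=".png"):
    """Build a file name whose number is zero-padded to the width of `total`."""
    width = len(str(total))
    return f"{prefix}{index:0{width}d}{ext}"


names = [page_filename(i, 400) for i in (1, 10, 400)]
print(names)  # ['page_001.png', 'page_010.png', 'page_400.png']
```

Without the padding, a name-based sort would place page 10 before page 2.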

Xnview for Mac
Choose Tools - Batch Convert and set the actions below under the second tab:

Xnview.png
XnView settings


Irfanview for Windows
Choose Menu - Batch conversion and rename
a. PNG compression level 6
b. Crop to 1170 x 770. This is optional and removes the grey Scribd borders. The size is based on a MacBook screen resolution of 1440 x 900. Higher-quality screenshots require a higher screen resolution; on a Mac, SwitchResX can provide one.
c. Sharpen 10, Contrast 15
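
If your screen resolution differs, the crop position has to be worked out again. The Python sketch below computes the crop rectangle under the assumption that the page area sits centered in the screenshot, which the steps above do not state; check a sample screenshot before relying on it. The function name `centered_crop_box` is hypothetical.

```python
def centered_crop_box(screen_w, screen_h, crop_w, crop_h):
    """Return (left, top, right, bottom) of a crop centered in the screenshot."""
    if crop_w > screen_w or crop_h > screen_h:
        raise ValueError("crop area is larger than the screenshot")
    left = (screen_w - crop_w) // 2
    top = (screen_h - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)


# Values from the post: 1440 x 900 screen, 1170 x 770 crop.
print(centered_crop_box(1440, 900, 1170, 770))  # (135, 65, 1305, 835)
```

With these numbers, 135 pixels are trimmed from the left and right and 65 pixels from the top and bottom.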

Step 3: Image to PDF with PDF Xchange Editor (V6 build 321)
a. File - New Document from Image files
Go to Options
b. Select Paper size from Image size (under New Page Options)
c. Fit Image to Cell (centre-middle) under Images Layout Options
d. Flate compression all (True color, Grayscale, etc.) under Image compression
e. Set OCR Medium under Image Postprocessing. You can also skip this setting at first to check the quality of the converted images, and then make the document searchable with the desired OCR accuracy via Menu - Document - OCR.

New Page Options.png
Image options


After setting the above options, click OK and then OK again to process the images. PDF Xchange will now OCR (recognize) the images you selected.
Alternatively, skip OCR at this stage and run it afterwards via Menu - Document - OCR.

PDF-XChange Editor OCR process.png
PDF Xchange OCR process


Step 4: Split Pages
a. Split pages with PDF Xchange Editor
b. Menu - Document - Split Pages
c. Click the icon to select a vertical split at 50%; a dotted vertical red line then appears in the preview
d. Select Remove Source pages
e. Select Change physical size

Split pages.jpg
Split Pages
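
To illustrate what the 50% vertical split does to each page, here is a small Python sketch. It assumes every image is a two-page spread cropped to 1170 x 770 as in Step 2; the function name `split_vertically` is hypothetical and is not part of PDF Xchange.

```python
def split_vertically(width, height, ratio=0.5):
    """Return the (left, top, right, bottom) boxes of the two resulting pages."""
    if not 0 < ratio < 1:
        raise ValueError("ratio must be between 0 and 1")
    cut = round(width * ratio)  # x position of the dotted split line
    return (0, 0, cut, height), (cut, 0, width, height)


left_page, right_page = split_vertically(1170, 770)
print(left_page, right_page)  # (0, 0, 585, 770) (585, 0, 1170, 770)
```

Each spread becomes two pages of 585 x 770, so the finished PDF has twice as many pages as there were screenshots.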


That's it. Good luck.

Mitch
Last edited by Mitch on Sat Jul 22, 2017 11:19 am, edited 4 times in total.
 
Tracker Supp-Stefan

Re: Get a Scribd book as a searchable PDF with PDF Xchange Editor

Wed May 03, 2017 9:42 am

Hello Mitch,

Many thanks for this tutorial!
Hope other people will find it useful as well!

Cheers,
Stefan
