splitting pdf files

PDF-XChange Editor SDK for Developers


Post by jusWest »

What is considered the best way to split large documents into smaller files with the PDF-XChange SDK?

Today we use op.document.extractPages for this, passing in the range of pages to write to the target PDF.

On very large files, like 1.5 GB, with a lot of images, this can take a long time.

Other SDKs I have tried are much faster at this, which makes me believe I'm doing it wrong.

Regards
Ronny

Post by Vasyl-Tracker Dev Team »

Hi Ronny.

This looks like it could be a document-specific issue. Could you share a test document so we can reproduce the problem on our side?

Also, please provide the options you used for op.document.extractPages.

Thanks.
Vasyl Yaremyn
Tracker Software Products
Project Developer

Please archive any files posted to a ZIP, 7z or RAR file or they will be removed and not posted.

Post by jusWest »

I can only reproduce this with certain legal documents, and I have none that I can share for legal reasons.

I use the 64-bit version of the library. When I try this in a 32-bit project it goes much faster, but on a couple of my test files it then crashes in this function with an unknown error.

Memory consumption on the 64-bit version is sometimes up to 3.5 GB.

Here is the code:

        public void ExtractPages(IPXC_Document doc,
                                 string pageRange,
                                 string outputPath,
                                 string destDocName = "", 
                                 int commentsAction = 0,
                                 int bookmarksAction = 2,
                                 int extractPagesAction = 1,
                                 bool openFolder = false)
        {
            try
            {
                var nID = _Inst.Str2ID("op.document.extractPages", false);
                var Op = _Inst.CreateOp(nID);
                ICabNode input = Op.Params.Root["Input"];
                input.v = doc;

                // https://sdkhelp.pdf-xchange.com/view/PXV:op_document_extractPages_Options
                ICabNode options = Op.Params.Root["Options"];
                options["PagesRange.Type"].v = "Exact";
                options["PagesRange.Text"].v = pageRange;
                options["CommentsAction"].v = commentsAction;           // 0 (Copy), 1 (Flatten), 2 (DontCopy)
                options["BookmarksAction"].v = bookmarksAction;         // 0 (DontCopy), 1 (CopyAll), 2 (CopyRelated)
                options["DeletePages"].v = false;
                options["ExtractPagesAction"].v = extractPagesAction;   // 0 (AllToOneDoc), 1 (AllToOneFile), 2 (EachToFile), 3 (EachRangeToFile)
                options["OverwriteAll"].v = true;

                if (!string.IsNullOrEmpty(destDocName))
                    options["FileName"].v = destDocName;
                else
                    options["FileName"].v = "%[FileName]";

                options["LocalFolder"].v = outputPath;
                options["OpenFolder"].v = openFolder;

                Op.Do();
            }
            catch (Exception ex)
            {
                IAUX_Inst auxInst = (IAUX_Inst)_Inst.GetExtension("AUX");
                HasError = true;
                ErrorMessage = auxInst.FormatHRESULT(ex.HResult);
                var fileName = Path.GetFileName(destDocName);
                _AppLogger?.Error("PdfToolkit:ExtractPages(" + fileName + ") => " + ex.Message + ", (" + ErrorMessage + ")");
            }

        }

Post by Vasyl-Tracker Dev Team »

Hi Ronny.

Please try the following:
1. Use DontCopy for FieldsAction, BookmarksAction and CommentsAction. This is just an experiment to see whether it affects performance; please let us know if it has any effect.
2. If you run the extractPages operation multiple times on the same document with different page ranges, you could modify your code to run it once per document, either by setting a complex page range like: 1-5, 10-20, ..., or by using the "PagesRange.Array" option.
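
To illustrate the single-run idea, here is a minimal sketch of a helper that joins several page ranges into one string in the "1-5, 10-20" form mentioned above. The class name PageRangeText and the method Build are hypothetical and not part of the SDK. The sketch assumes 1-based page numbers and that a single page can be written as "7-7"; neither is confirmed in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the PDF-XChange SDK.
public static class PageRangeText
{
    // Joins (first, last) page pairs into a string such as "1-5, 10-20".
    // Ranges are kept in the order given and are NOT merged, so each range
    // can still end up in its own output file.
    // Assumption: page numbers are 1-based, as in the "1-5, 10-20" example.
    public static string Build(IEnumerable<(int First, int Last)> ranges)
    {
        if (ranges == null)
            throw new ArgumentNullException(nameof(ranges));

        var parts = new List<string>();
        foreach (var r in ranges)
        {
            // Reject ranges that start before page 1 or run backwards.
            if (r.First < 1 || r.Last < r.First)
                throw new ArgumentException($"Invalid page range {r.First}-{r.Last}");

            parts.Add($"{r.First}-{r.Last}");
        }

        return string.Join(", ", parts);
    }
}
```

The result could then be passed as the pageRange argument of the ExtractPages method posted earlier, which writes it to options["PagesRange.Text"]. According to the comments in that code, ExtractPagesAction = 3 (EachRangeToFile) is the value that would write each range to its own file in a single run.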

Also, some additional questions:
1. How many pages are in your document?
2. Does it have bookmarks, comments, links, form fields, or named destinations?
3. Do you extract all pages at once, or multiple page ranges over multiple runs?

Cheers.
Vasyl Yaremyn
Tracker Software Products
Project Developer
