Select Identical from Multiple Pages

Forum for the PDF-XChange Editor - Free and Licensed Versions

Locked
FLuser
User
Posts: 13
Joined: Thu Mar 18, 2021 7:00 am

Select Identical from Multiple Pages

Post by FLuser »

I've searched the forums, knowledgebase and Google (several instances) and I've never found an answer to this --

I would like to select the same area of multiple pages to do things like remove headers/footers/(non-PDF-Xchange watermarks), etc.

A similar feature has been built into PDF-XChange for ages: the "Home > Selection > Duplicate" option.

The problem is that this creates a duplicate of the selected item, rather than simply repeating the selection process.

Is there any way of doing this that I'm missing?

If not, would it be possible to either search for controls within the same area (based on the active selection) on every page -- or based on the selection lasso that was used to choose that selection?

If that's not an option, how about something that would allow me to do this by selecting similar items through the "Content" hierarchy?

The third option would be something that allows the user to select all instances of a "Search", allowing the user to then delete the full control of every matched item?

The fourth option could be identical to the third option; however, rather than doing this through the "Search" feature, it could be implemented in the "Mark for Redaction" (currently redaction is based on text which means that a person cannot select full textboxes, etc)

Thank you in advance for any ideas :)
Tracker Supp-Stefan
Site Admin
Posts: 17824
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: Select Identical from Multiple Pages

Post by Tracker Supp-Stefan »

Hello FLuser,

Please try the "Mark for redaction tool" - set it as desired on one page, and you can then Duplicate that redaction object on all the pages of your file. After that you can apply all redactions and that will effectively clear up your document from the unwanted headers.

This shows how you can use the redaction tool to e.g. remove page numbering, and the process will be the same for any other content:
https://www.pdf-xchange.com/knowle ... -numbering#

Kind regards,
Stefan

Re: Select Identical from Multiple Pages

Post by FLuser »

Thank you, unfortunately that doesn't fit my need.

If possible, could you read my post once more? In my original post I explained the reason why the redact option doesn't fit this specific need.

Thanks again.

Re: Select Identical from Multiple Pages

Post by Tracker Supp-Stefan »

Hello FLuser,

The redaction tool definitely does work on both text and images that are under the redaction rectangle (once applied of course):
image.png
The above is a text file in which I pasted an image (shot taken from this forum topic) - and that image is actually inside a stamp object - so both annotations and base content are affected by the redaction tool. I did read your initial post carefully again - and still cannot see why the redaction tool won't be suitable?

Kind regards,
Stefan

Re: Select Identical from Multiple Pages

Post by FLuser »

Hi Stefan,
Thank you for taking the time to re-read my post and for the very thorough explanation/screenshot.

There are a variety of ways in which the redact functionality doesn't serve the need of a bulk-selection tool. Unfortunately, because the overall task and approach are somewhat complex, I am worried that the explanation may be a bit confusing.

There is one very easy example I can provide, which focuses on the key differences. If this example is not enough and you're okay with more complex explanations, let me know and I'll explain more thoroughly.

One example; simplest explanation:
  • A "bulk-select" or "duplicate selection" capability allows one to take action on a large number of objects at once, which is far more useful and robust than merely deleting.
  • In addition, if the user does prefer to use it for deletion purposes, it's a true delete -- it will not create new overriding content.
  • Imagine: you want to delete a raster logo that's located in the center of every page of a PDF.
  • If you use the redact capability, every piece of content (text, raster images, etc) will be permanently hidden by a new, tangible redact-object.
  • However -- if you were able to select this logo on every page, you could delete them all (or change opacity, move, etc) -- without any redact-remnants, and without accidentally deleting every visible piece of info in that area.
  • Next: Imagine I intended on redacting the same piece of text, for example a header/footer. Now imagine some pages have raster background images. I could set the redact properties to be transparent hoping to overcome this issue, but that does not help. There will be white-out boxes (editing the original raster images) everywhere that matches the redact selection. If I were to use a bulk-selection to delete those headers/footers, this side-effect would not occur.
Does this help clarify?

Thank you again for your time.
TrackerSupp-Daniel
Site Admin
Posts: 8440
Joined: Wed Jan 03, 2018 6:52 pm

Re: Select Identical from Multiple Pages

Post by TrackerSupp-Daniel »

Hello, FLuser

Thank you for the clarification, yes that does tidy things up quite a bit. I should note that the Redaction tool can have its "redaction color" set to none, which will prevent the "new tangible redaction object" you mention twice. Also, much like manually deleting the content, the redaction tool is specifically designed to fully delete any content within the redaction area, so for those particular purposes it certainly would still be useful.
That said, your mention of possibly wanting to edit the content, instead of simply deleting it, as well as wanting to retain the content behind what is being removed, are certainly valid concerns. Unfortunately, due to the way that PDF is formatted, there is no reliable way to select "alike objects" for bulk editing. If you need to make fine changes to these objects, you will need to do so manually on each page.

I should also note that in PDF, Headers and footers can be a specially formatted content group, even though you can select them on the page like normal text. If you are looking to modify the text within your headers, you may wish to first check the "Headers and footers > Manage" function, on the organize tab, before you go about selecting and deleting the text manually.

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD

+++++++++++++++++++++++++++++++++++
Our Web site domain and email address has changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com

Re: Select Identical from Multiple Pages

Post by FLuser »

TrackerSupp-Daniel wrote: the Redaction tool can have its "redaction color" set to none, which will prevent the "new tangible redaction object" you mention twice.
In my example, I explicitly mentioned the transparent redaction because I suspected you would recommend it.

My point is that this does not address the issues I mentioned. If we discuss only deletion without discussing added functionality, the transparent redaction still causes problems. Anything in that area, even if it's overlapping (and wouldn't qualify in the selection) will be overridden -- and any raster content will be modified to be whited-out.
TrackerSupp-Daniel wrote: due to the way that PDF is formatted, there is no reliable way to select "alike objects" for bulk editing.
As a software developer myself, one who has built solutions w/Foxit and abcPDF libraries (and is familiar w/the PDF spec), I can tell you that this statement is not accurate.

I understand if Tracker Software doesn't want to implement the feature or it's not worth the hassle -- however, it most certainly is possible.

In fact the functionality could be "cheated" if the developer didn't want to do it properly, by simply emulating the same selection points across all pages (eg user selects from 1000,1000 to 1500,1500, repeat on all rasterized pages). This is a bit of a kludge, but it avoids more complex logic.
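
A minimal sketch of that coordinate-repeat idea, for illustration only. It assumes a hypothetical in-memory model where each page is a list of objects with bounding boxes; `Rect`, `PageObject`, and `select_same_area` are invented names, not PDF-XChange, Foxit, or abcPDF APIs. It also tests object bounding boxes rather than hit-testing rasterized pages, which is a simplification of the approach described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, other: "Rect") -> bool:
        """True when `other` lies entirely inside this rectangle."""
        return (self.x0 <= other.x0 and self.y0 <= other.y0
                and other.x1 <= self.x1 and other.y1 <= self.y1)


@dataclass(frozen=True)
class PageObject:
    name: str   # hypothetical label, e.g. "logo" or "body text"
    bbox: Rect  # bounding box in page coordinates


def select_same_area(pages, area):
    """Repeat one selection rectangle on every page.

    Returns {page_index: [objects fully inside `area`]}.
    """
    return {
        index: [obj for obj in objects if area.contains(obj.bbox)]
        for index, objects in enumerate(pages)
    }


# The user drags from 1000,1000 to 1500,1500 on one page ...
area = Rect(1000, 1000, 1500, 1500)
pages = [
    [PageObject("logo", Rect(1100, 1100, 1400, 1400)),
     PageObject("body text", Rect(100, 100, 900, 900))],
    [PageObject("logo", Rect(1100, 1100, 1400, 1400))],
    [PageObject("body text", Rect(100, 100, 900, 900))],
]

# ... and the same rectangle is applied to all pages.
selection = select_same_area(pages, area)
for index, objects in selection.items():
    print(index, [obj.name for obj in objects])
```

The result is a selection rather than a deletion, so any follow-up action (delete, move, change opacity) would be applied to the returned objects only.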
TrackerSupp-Daniel wrote: PDF, Headers and footers can be a specially formatted content group,
Your comments regarding "Headers and footers" are based on the assumption that the headers/footers were created in PDF-XChange. In the examples I've provided, clearly I'm not working with documents I've created natively, or I would not have a need to retrofit them.

In short, I now understand these features are not available and Tracker isn't interested in adding them; however, to be clear, they certainly could be added if Tracker felt they were worthwhile.

Thank you.

Re: Select Identical from Multiple Pages

Post by TrackerSupp-Daniel »

Hello, FLuser
FLuser wrote: Tue Mar 07, 2023 11:18 pm In my example, I explicitly mentioned the transparent redaction because I suspected you would recommend it.
If you read my entire statement, you would see that I was agreeing with you that the redaction tool does not accomplish all of what you are looking for, and was only clarifying that it can do some of the specific individual items you mentioned it being unable to accomplish.

Regarding the bulk editing functionality, the key word there was "reliably". You are certainly correct that there are methods which could be used to emulate this in roundabout and unreliable ways, but doing so with certainty that we are selecting only the objects which the user wants to select on each page is not something we can confidently implement at this time. We try to avoid offering features which allow for someone to believe they are making only the changes they want, and then find later they made many changes they did not want to make as well.
With that said, exceptions are made in cases where there is extreme user demand. So far, you are the first request I have seen for a selection method like this. I will discuss with my colleagues on the support team to see if there is enough support that we might bring this to the Dev team for their consideration though.

Regarding header and footers, these containers are defined within the PDF specification, and many other applications create them properly, including Adobe and Foxit. If these headers have been made by any app which respects that part of the Specification, we will be able to see the existing container and modify it. We do not currently offer any features which create objects that only our software can use, as we strictly stick to the spec wherever possible.

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD


Re: Select Identical from Multiple Pages

Post by FLuser »

I most certainly read your entire reply, thoroughly, twice; you simply appear to have misunderstood my reply.

I was explicitly addressing your claim that transparency-redaction can offer a workaround. It doesn't and I explicitly addressed that claim in your reply, where you wrote, QUOTE:
I should note that the Redaction tool can have its "redaction color" set to none, which will prevent the "new tangible redaction object" you mention twice.
This is a hypothetical workaround that has no pragmatic/real-world value in the examples I provided.
TrackerSupp-Daniel wrote: you are certainly correct that there are methods which could be used to emulate this in roundabout and unreliable ways, {...}
We try to avoid offering features which allow for someone to believe they are making only the changes they want, and then find later they made many changes they did not want to make as well.
Here again there is a misunderstanding of my commentary. The redaction feature produces unreliable results; however, nothing about a bulk-select tool, or my work-around suggestion produces anything unreliable whatsoever.

To be clear: The shortcut (at code-level, nothing to do w/user experience) I mentioned did not introduce any lack of reliability, either. It was merely a slower alternative (passing select coordinates against the cached/raster representation of pages) rather than doing it natively based on internal PDF markup.
TrackerSupp-Daniel wrote: With that said, exceptions are made in cases where there is extreme user demand. So far, you are the first request I have seen for a selection method like this.
Yes, that is completely understandable. I agree and said as much in my prior post.

With that said -- I believe this is one of those features that many people don't know to request.

I work with hundreds of attorneys and while every one of them requires the redact functionality, most don't understand more than the bare minimum (Find/Redact feature using it like search/replace). They'd like more functionality; however, they're not sure how to navigate the tool, much less willing to invest time to post on forums.

To put that more simply: One of your largest user-bases and a lucrative segment of PDF users -- doesn't use a feature because they're not aware of it, nor do they spend the time to understand the capabilities (much less request them).

That's not to proclaim my request should be added; it's merely to suggest that being the first person to articulate something, particularly something that requires some explaining -- doesn't *necessarily* or implicitly make it less valuable.

Rather, I'm explaining a method that people probably haven't considered, one that addresses many things in PDF-XChange that don't work in many real-world applications. Therefore, other people might request something like "I'd like to delete all headers/footers and the feature in XChange doesn't work" -- my "request" is technically fulfilling that person's request, not merely the technical description I've provided you.
TrackerSupp-Daniel wrote: Regarding header and footers, these containers are defined within the PDF specification, and many other applications create them properly
The vast majority of PDFs where I need to modify headers/footers are the result of production by PDF printers that do not properly designate/meta-tag header/footer content.

In short:
I had already tested PDF-Xchange's capability on hundreds of documents in the past -- otherwise, I would not have stated that it's not an effective solution based on the real-world applications as I explained.

Much like PDF-Xchange's add/remove "watermarks" feature, it's great in theory-- and PDF-Xchange does as good of a job implementing the feature as anyone can (PDF-XChange can't help the limits of the PDF/metadata). However, in real-world applications of retrofitting documents produced w/PDF printers, etc -- this simply doesn't provide a working solution.

This is ZERO fault of Tracker/PDF -- they're at the mercy of factors they do not control. None-the-less, I'm offering you a suggestion on how you can actually implement a solution that truly does work in real-world scenarios.

Whether you decide to implement it or not, please don't confuse the numerous use-case scenarios (which are many, several of which you already offer w/o real-world/in-the-wild efficacy) with the functional implementation that has only been mentioned by me.

Thank you.

Re: Select Identical from Multiple Pages

Post by FLuser »

As I initially alluded to in my first post -- the "Selection > Duplicate" does indeed already require/use the exact approach/code that could be replicated to enable duplicate selection for purposes other than redaction.

I've now tested about 40 different scenarios to better understand why you believe that this cannot be accomplished (or cannot be reliably accomplished) -- I can confirm that in every instance the Selection > Duplicate could work identically to the manner that it's used in Redact, yet would be orders of magnitude more powerful if allowed to be a true selection.

All of the functions that do not presently work properly in real-world situations, from watermarks to header/footers w/o metadata classification -- all become possible using a single interface rather than multiple different vertical-features.

In addition, this would open-up the "paste to multiple pages" feature, which would be huge for PDF-Xchange.

PDF-Xchange can already do this. For example, copy an image on three contiguous pages (select the image on three pages, Control-C). Next, scroll down to the subsequent page, page 4 - and paste. You'll now see images pasted on three pages: 4,5,6.

There is no reason this existing feature couldn't simply be expanded to allow "Paste > Duplicate" just like "Selection > Duplicate" because the functionality is already there.

Here again, you take one single, consistent UI/approach and it solves many unique use-cases, contrary to a single "watermark" feature, etc.

Re: Select Identical from Multiple Pages

Post by TrackerSupp-Daniel »

Hello, FLuser

Those examples relate to creating new content, and yes, that could be a handy feature. In your "copy three items" example, the existing duplicate tool would already work to accomplish this: simply select the desired items manually, and then use the "duplicate" tool to copy them to all pages (if, as your example dictates, they are across 3 pages in a row, simply set the duplicate to place the content once per 3 pages).
image.png
The issues I am mentioning do not relate to performing actions on the selected content; the trouble is with accurately selecting only the desired content across multiple pages automatically, as your original suggestion and the title of this topic allude to: "Select identical", with the intent of applying the same edits to that content afterwards.

Where is the line between identical objects drawn? For some people that means exactly what I have selected: if I pick a header that says "Tracker Software", it should ignore the header on the next page that says "PDF-XChange" because that is not the same, correct?
Now we apply that logic to page numbers. In the exact same region, I see "page 1", then on the next page it is "page 2"; obviously you would want to select both of these for modification in 99% of cases. Despite the different desired outcomes, you are asking to perform both with the exact same operation.

Your prior suggestion of simply drawing the selection box in the same region on each page is a commendable solution, but what if the pages are different in size, perhaps one is landscape while the rest are portrait? Do we adjust the selection area based on the relative height/width of the page, or do we leave it in the same place? If we leave it in the same place, we will likely select the incorrect content, but if we adjust it, the selection bounds would be warped and may not select anything.
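
The trade-off between those two choices can be sketched with some arithmetic. This is an illustrative example only: the page sizes, the header rectangle, and the helper names (`scale_proportional`, `fits_on_page`) are assumptions made for the sketch, with rectangles given as `(x0, y0, x1, y1)` and the origin at the bottom-left of the page.

```python
def scale_proportional(rect, src_size, dst_size):
    """Scale a rectangle by the ratio of the two page sizes."""
    sx = dst_size[0] / src_size[0]
    sy = dst_size[1] / src_size[1]
    x0, y0, x1, y1 = rect
    return (x0 * sx, y0 * sy, x1 * sx, y1 * sy)


def fits_on_page(rect, page_size):
    """True when the rectangle lies entirely within the page."""
    x0, y0, x1, y1 = rect
    width, height = page_size
    return 0 <= x0 and 0 <= y0 and x1 <= width and y1 <= height


portrait = (600, 800)          # width, height of the page where the user selected
landscape = (900, 600)         # a differently sized page later in the file
header = (100, 700, 500, 760)  # selection near the top of the portrait page

# Option 1: leave the box in the same place.
# Its top edge (760) is above the landscape page height (600).
print(fits_on_page(header, landscape))  # False

# Option 2: adjust the box by relative width/height.
scaled = scale_proportional(header, portrait, landscape)
print(scaled)  # (150.0, 525.0, 750.0, 570.0)

# The box went from 400 x 60 to 600 x 45, so its shape is warped.
print(scaled[2] - scaled[0], scaled[3] - scaled[1])  # 600.0 45.0
```

In this example, neither strategy is guaranteed to land on the header of the landscape page, which is the ambiguity being described.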

The "simple" answer from a user perspective is "offer options to control it" which leads us down the rabbit hole of how much fine control do we need? When you try to use this function, do you want to see a regex prompt every single time one of the selected items is text based? Most people would not, and so we need to have an accurate handling for each of these situations, and automatically determine which approach a user would want based on what they have selected.

This was a very basic example with a fairly easy solution (add a conditional for "Page **" text), but beyond this case, there are simply too many variables for us to offer a function which could safely allow users to know they are selecting exactly what they want to select before they begin editing.
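
As a rough illustration of the "Page **" conditional mentioned above, the sketch below compares header text either exactly or with digits treated as wildcards. The function names and the `ignore_numbers` switch are hypothetical, and this covers only plain text; it says nothing about how PDF-XChange represents or matches content internally.

```python
import re


def template(text: str) -> str:
    """Replace every run of digits with '#' so 'page 1' and 'page 2' compare equal."""
    return re.sub(r"\d+", "#", text.strip())


def is_match(selected: str, candidate: str, ignore_numbers: bool) -> bool:
    """Compare two text items exactly, or with numbers treated as wildcards."""
    if ignore_numbers:
        return template(selected) == template(candidate)
    return selected.strip() == candidate.strip()


# Exact matching: different headers are different objects.
print(is_match("Tracker Software", "PDF-XChange", ignore_numbers=False))  # False

# Exact matching also rejects the next page number, which is usually unwanted.
print(is_match("page 1", "page 2", ignore_numbers=False))  # False

# Treating digits as wildcards selects both page numbers ...
print(is_match("page 1", "page 2", ignore_numbers=True))  # True

# ... while still rejecting an unrelated header.
print(is_match("Tracker Software", "PDF-XChange", ignore_numbers=True))  # False
```

The same operation gives different results depending on one flag, which shows why the tool would either have to guess the user's intent or ask for it.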

The end result here is that offering something like this is far more complex than you might think, and to do so reliably and accurately enough to serve the uses that our millions of clients may have, would be borderline impossible. For the moment, this is not something that we will be implementing, but perhaps, if we see enough support in the future, we will reconsider.

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD


Re: Select Identical from Multiple Pages

Post by FLuser »

Thank you, the Selection > Duplicate Annotations works brilliantly for the purpose you mentioned.

...

Regarding the issues you explained re:complexity of my request -- there appears to be some confusion.

For quick background, I'm a software developer. I've worked for MS and a variety of companies in the Bay Area. I've also managed teams of developers and I.P. at a VC firm (although I prefer writing code over any of that nonsense).

Like any developer with a single week's experience in the field, I've had plenty of people ask for a "simple" feature that is a complex nightmare they'll never understand.

Therefore, I can assure you that nothing I'm suggesting is a case of "something like this is far more complex than you might think."

...

To understand this point, you must understand the Pareto principle (aka the 80/20 rule), which guides any efficient / 10x dev team. Essentially, you can find a sweet spot where you address the bulk of the functionality with the minority of the work.

This isn't about how you code, it's about how you choose to spec it.

In this case, 90% of the functionality can exist with only 10% of the complexity. However, if your goal is to dissuade someone from expecting a feature, I can clearly see how you could seek exceptions and proclaim those to be roadblocks.

In this instance, in the practical use case, those aren't required to gain the 90% functionality. I'm not saying it's child's play, every feature is more complex than at first glance -- however -- there is absolutely no reason to tie oneself into knots in order to overcome hurdles when they can simply be stipulated out.

Here are a few examples of simpler implementations:

Solution 1. Open a large document and explore PDF-Xchange's "Content Pane". Find an asset on the document that repeats. Imagine a "Select Similar" or "Filter By", etc. In short, every element matching the selected element (or some parameter of that element, eg minimum width or height w/raster) -- could be selected via context menu. Done, all without concern of different page sizes, etc.

Alternative 2. Add one sentence to manage user-expectations/limit extent of issues. Simply use the identical approach used w/redact and just as you do w/redact, it won't apply w/alternate page sizes (again - if you feel that creates a problem for users, simply include note that selection will only be made across matching page sizes within the selected range).

Alternative 3. Improve the Mark-for-Redaction feature by allowing the universally standard select-from-left = only controls entirely within selection range -vs- select-from-right = any control within range. This would greatly improve the precision and usability of the Mark-for-Redact feature -- in addition -- it also opens the ability to turn that into a selection instead of choosing to "Apply". Simply put, rather than have only "Apply" you have a second option "Select All Marked"
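
The drag-direction rule in Alternative 3 can be sketched as follows. This is a hedged illustration with a made-up data model: objects are a dict of name to bounding box `(x0, y0, x1, y1)`, and `drag_select` is a hypothetical helper, not an existing PDF-XChange function.

```python
def normalize(start, end):
    """Turn two drag points into an (x0, y0, x1, y1) rectangle."""
    return (min(start[0], end[0]), min(start[1], end[1]),
            max(start[0], end[0]), max(start[1], end[1]))


def contains(area, box):
    """True when `box` lies entirely inside `area`."""
    return (area[0] <= box[0] and area[1] <= box[1]
            and box[2] <= area[2] and box[3] <= area[3])


def intersects(area, box):
    """True when `box` overlaps `area` at all."""
    return (area[0] < box[2] and box[0] < area[2]
            and area[1] < box[3] and box[1] < area[3])


def drag_select(objects, start, end):
    """Left-to-right drag: fully enclosed objects only.
    Right-to-left drag: anything the box touches."""
    area = normalize(start, end)
    test = contains if start[0] <= end[0] else intersects
    return [name for name, box in objects.items() if test(area, box)]


objects = {
    "header": (50, 700, 300, 730),
    "photo": (250, 600, 500, 720),  # overlaps the header region
}

# Dragging left to right picks up only the header.
print(drag_select(objects, (40, 690), (320, 740)))  # ['header']

# Dragging right to left over the same area also picks up the photo.
print(drag_select(objects, (320, 740), (40, 690)))  # ['header', 'photo']
```

A "Select All Marked" action, as proposed above, would then operate on the returned names instead of redacting everything under the rectangle.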

...Yes, there is more than meets the eye -- I'm aware of precisely what that is. One perfect example is that you have no control over this being one page or 5,000 -- therefore you have an additional blocking/progress dialog or a host of async issues when users choose to move a bulk selection.

If that's an issue then don't allow moving/editing the bulk selections in early (or any) releases. Simply allowing deletion addresses a very large user-base (the redact crowd that prefers something more elegant).

...

I could elaborate in far more detail; however, the short version is that I'm well aware of the complexities.

The problem, as I see it, isn't about whether I'm aware of the complexities, but whether you're willing/interested in pragmatic applications (eg- modification to redact), or your team is affected by the stagnant-dev-virus ... aka .. "we can't build that because it would require us to construct hundreds of esoteric nightmares ... all of which could easily be avoided, however, we can't ever mention this because it would override our code of complaining about every new feature request." ..lol

Re: Select Identical from Multiple Pages

Post by TrackerSupp-Daniel »

Hello, FLuser

You have already mentioned being a developer yourself, and if that is true, you would understand that there are a number of factors beyond just the complexity of a feature, how it will be used, and how it could be mis-used. Unlike Adobe and Foxit, we do not have teams of hundreds of developers. Our whole team is less than 20 people, so development time comes at a premium. There are numerous significantly more highly demanded features (and bugfixes) which are on the list already, and the approval process is very strict with that in mind.
As I have pointed out, there are many problems to account for, and as you pointed out there are many solutions to those problems. The end result is still that between the complexity of implementing this, the time it would take from our developers working on other tasks, and the relatively small demand this particular item has, this will not be seeing implementation at this moment in time.

As before, perhaps in the future this will change, but for now, I am sorry to say that it is not something which we are planning.

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD


Re: Select Identical from Multiple Pages

Post by FLuser »

Right, that was my point early-on.

I merely explained that it was not accurate to proclaim that I wasn't understanding the complexities and that there was a lot more to the issue than I implied.

I provided you with three highly constructive options to make far less significant modifications that would take existing features (eg Content Pane or Redact as-is) and make them far more valuable / easier to use.

You have now admitted that this wasn't the issue after all, but simply that you don't see the potential relative to other things in your queue.

As I stated from the beginning -- that's certainly fine; and it's nice that you're being honest.

In fact...

If Tracker had been honest in the first reply (where they claimed that due to the PDF format it's not possible) -- it would have saved me several replies and a great deal of my time.


Regardless, thanks for the eventual honesty.

Re: Select Identical from Multiple Pages

Post by TrackerSupp-Daniel »

Hello, FLuser

My initial reply never said it was impossible:
TrackerSupp-Daniel wrote: Tue Mar 07, 2023 5:39 pm Unfortunately, due to the way that PDF is formatted, there is no reliable way to select "alike objects" for bulk editing
Only that it would be unreliable, as I explained in my later posts. I will admit I oversimplified it, because I was intending to offer enough information to get the point across. You are right that this conversation has gone on for too long now though.

As this topic is no longer focused on the initial request, which has been determined as rejected for now, I will be locking this thread. If anyone in the future wishes to add their voice to this request so that we can reconsider, I would ask that they open a new thread, or email support@pdf-xchange.com

Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD
