Problem with Advanced Search Advanced Criterions

Posted: Mon Apr 24, 2017 9:53 am
by David.P
Hi forum and Tracker support team,

I heavily rely on the Search pane in order to quickly find certain text passages that contain certain search terms. Sometimes, I need to find paragraphs where certain search terms do occur, but where other search terms are not present, using "Advanced Criterion" in the Search Pane.

However, there seems to be a problem with advanced search when using such advanced criteria:
Image

For example, with the attached document, if I try to find all paragraphs that contain the word "Emma" but do not contain the word "she", the Search function still returns many paragraphs that actually contain both search terms.
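To make the expected behavior concrete, here is a minimal Python sketch of an "include X but not Y" paragraph filter. It is not PDF-XChange Editor's implementation: the function name `paragraphs_with_but_without`, the whole-word, case-insensitive matching, and the sample paragraphs are all assumptions made for illustration.

```python
import re

def paragraphs_with_but_without(paragraphs, include, exclude):
    """Return paragraphs containing `include` but not `exclude`.

    Assumption: terms are matched as whole words, case-insensitively.
    The Editor's real matching rules may differ.
    """
    inc = re.compile(r"\b" + re.escape(include) + r"\b", re.IGNORECASE)
    exc = re.compile(r"\b" + re.escape(exclude) + r"\b", re.IGNORECASE)
    return [p for p in paragraphs if inc.search(p) and not exc.search(p)]

# Hypothetical sample paragraphs, not taken from the attached document.
paragraphs = [
    "Emma Woodhouse was handsome, clever, and rich.",
    "Emma knew that she had been wrong.",
    "Mr. Knightley said nothing.",
]
result = paragraphs_with_but_without(paragraphs, "Emma", "she")
```

With a correct filter, the second paragraph must be excluded because it contains both terms. That is exactly the case the Search pane currently gets wrong.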

This is a rather severe problem if, like me, you rely daily on finding specific paragraphs in documents that run to thousands of pages.

Therefore it would be great if Support and/or Devs. could look into this issue.

Thanks very much,
David

Re: Problem with Advanced Search Advanced Criterions

Posted: Thu Apr 27, 2017 1:12 pm
by Will - Tracker Supp
Hi David,

I've passed this along to the Devs. and will post back when I hear more.

Cheers,

Re: Problem with Advanced Search Advanced Criterions

Posted: Thu Apr 27, 2017 1:28 pm
by David.P
Thank you Will, that's great.
:)

Re: Problem with Advanced Search Advanced Criterions

Posted: Thu Apr 27, 2017 8:59 pm
by Vasyl-Tracker Dev Team
Hi, David.

This issue will be fixed in the next build. Thanks for the detailed bug report.

Cheers.

Re: Problem with Advanced Search Advanced Criterions

Posted: Fri Apr 28, 2017 7:59 am
by David.P
Thank you very much Vasyl -- it's great to hear that this will be fixed so fast. Further proof that superb product support and active developer response go hand in hand with a vastly superior product!

When investigating this problem, maybe you could check whether it is related to the recently reported issue where only the first of the search terms in the "OR" Advanced Criterion search box is highlighted reliably. See, for example, the screenshot below from a search in the attached document: all occurrences of the first search term "her" are highlighted correctly, whereas the highlighting of the second search term "she" is sometimes missing.
Image

Thank you and keep up the great work!
David

Re: Problem with Advanced Search Advanced Criterions

Posted: Fri Apr 28, 2017 7:11 pm
by Vasyl-Tracker Dev Team
Hi David.

I see the issue, but unfortunately it is too late to fix it in the upcoming build.
We will still try to fix it for this build, but I cannot promise that.

Cheers.

Re: Problem with Advanced Search Advanced Criterions

Posted: Sat Apr 29, 2017 10:30 am
by David.P
Thank you Vasyl. The latter issue also seems to be less severe because -- while it does not highlight everything correctly -- it still finds all paragraphs that fall under the search condition.

Which is not the case with the former issue (in the first post). Glad that you were able to fix that one already!

Cheers
David
:)

Re: Problem with Advanced Search Advanced Criterions

Posted: Sun Apr 30, 2017 9:01 pm
by Will - Tracker Supp
:D

Re: Problem with Advanced Search Advanced Criterions

Posted: Tue Jul 04, 2017 10:12 am
by David.P
Hi forum, Tracker support team and Devs,

it seems that the problems described in posts #1 and #5 are resolved in Build 322.5, which is great -- thank you!

However, I just came across another issue with Advanced Search. I think that it has to do with the 'paragraph recognition' of PDF-XChange Editor.

Please see the attached file that shows how PDF-XChange Editor does not find certain search term combinations because obviously, the respective text on the page is not recognized as a text paragraph by PDF-XChange Editor.

Hopefully, this helps fix the issue, which in my case is quite severe, because I rely heavily on PDF-XChange Editor to find certain occurrences of search term combinations in very long documents -- with possible serious legal consequences if some occurrences are not found :(

Thanks very much for looking into this,

Regards
David

Re: Problem with Advanced Search Advanced Criterions

Posted: Tue Jul 04, 2017 11:24 am
by Tracker Supp-Stefan
Hi David,

Glad to hear both of the original issues are resolved.
As you say in your file - the paragraph numbers are probably what is interfering with the recognition logic!
I will have the devs look at it and see what can be done!

Cheers,
Stefan

Re: Problem with Advanced Search Advanced Criterions

Posted: Tue Jul 04, 2017 1:00 pm
by David.P
Thank you Stefan,

Unfortunately (regarding this problem) legal files, to a large extent, do tend to have paragraph numbers in the margin. Therefore, in order to get optimally (i.e. legally) correct search results, a reliable proximity search can be crucial.

However, this will always continue to be a problem, at least with scanned and OCR'ed documents, where it is probably even harder to recognize paragraphs.

Therefore, maybe the proximity search feature could be changed to be "steplessly adjustable" -- in the sense that you would specify the proximity distance as a percentage of a page dimension (e.g. page height), instead of having to specify "paragraph" or "page":

Image

This way, paragraphs and their recognition basically would not matter anymore. Instead, the feature would simply calculate the vertical distance between the found search terms on the page. If this distance is smaller than the user-specified "XY percent" of the page height, the occurrence would be registered as found!

For multi-column documents this still would have to be tweaked, such that search term occurrences that are vertically close to each other, but in different columns, would not be (falsely) registered as being within proximity.
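The suggested check can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how PDF-XChange Editor works internally: the `Hit` record, the `within_proximity` function, and the idea of deriving a column index from a fixed `column_width` are all hypothetical, and real PDF content would need proper layout analysis to detect columns.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    term: str
    page: int   # page number of the hit
    x: float    # horizontal position on the page, in page units
    y: float    # vertical position, measured from the top of the page

def within_proximity(a, b, page_height, max_percent, column_width=None):
    """True if two hits on the same page are vertically closer than
    `max_percent` percent of the page height.

    Assumption: if `column_width` is given, the page consists of equal-width
    columns, and hits in different columns never count as being in proximity.
    """
    if a.page != b.page:
        return False
    if column_width is not None:
        if int(a.x // column_width) != int(b.x // column_width):
            return False
    return abs(a.y - b.y) < page_height * max_percent / 100.0

# Hypothetical hits on an 800-unit-high page with a 10 % threshold (80 units).
emma = Hit("Emma", page=1, x=100.0, y=200.0)
she_same_column = Hit("she", page=1, x=120.0, y=250.0)
she_other_column = Hit("she", page=1, x=400.0, y=210.0)

same_col = within_proximity(emma, she_same_column, 800.0, 10.0, column_width=300.0)
other_col = within_proximity(emma, she_other_column, 800.0, 10.0, column_width=300.0)
```

Here `same_col` is true (50 units apart, same column), while `other_col` is false even though the hits are only 10 units apart vertically, because they fall into different columns.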

I'd be happy to hear the Dev's opinion on this!

Cheers
David

Re: Problem with Advanced Search Advanced Criterions

Posted: Tue Jul 04, 2017 3:54 pm
by Tracker Supp-Stefan
Hi David,

An interesting suggestion indeed! But it might be much trickier to implement than it seems - given how content in the PDF file is positioned and all.
I will definitely check with my colleagues and see what their thoughts are on that, but please note that it might not be something we will look at implementing (that % offset feature).

Regards,
Stefan

Re: Problem with Advanced Search Advanced Criterions

Posted: Tue Jul 04, 2017 4:28 pm
by David.P
Thanks very much Stefan for the approval, and for passing this on to your colleagues.
Cheers
David
:)

Re: Problem with Advanced Search Advanced Criterions

Posted: Wed Jul 05, 2017 8:28 am
by David.P
PS: as an addendum, regarding the current implementation of proximity search:
Image

... in addition to the issue described above regarding the unreliable paragraph recognition, it is currently also not possible to find a search term combination in a paragraph that spans adjacent pages. This, too, can lead to important text passages being missed, which (at least in the legal sector) can be decisive for the success or failure of proceedings.

This problem also could be overcome with the above feature suggestion, where text proximity is not tied to the (uncertain) detection of text paragraphs anymore, but instead is specified in a percentage of the page height (or rather, of the page contents' height), like shown below:
Image
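One way to picture a percentage-based proximity that works across a page break is to treat the content areas of consecutive pages as one continuous column. The Python sketch below is an assumption-laden illustration, not a description of the Editor: `stacked_offset`, `within_percent`, and the parameters `content_top` and `content_height` are hypothetical, and all pages are assumed to share the same content area.

```python
def stacked_offset(page_index, y, content_top, content_height):
    """Map a position (page_index, y) to an offset in one continuous strip
    made of the content areas of all pages stacked on top of each other.

    Assumption: every page has the same content area, starting at
    `content_top` and spanning `content_height` units.
    """
    return page_index * content_height + (y - content_top)

def within_percent(pos_a, pos_b, content_top, content_height, max_percent):
    """True if two positions are closer than `max_percent` percent of the
    content height, even when they lie on adjacent pages."""
    off_a = stacked_offset(pos_a[0], pos_a[1], content_top, content_height)
    off_b = stacked_offset(pos_b[0], pos_b[1], content_top, content_height)
    return abs(off_a - off_b) < content_height * max_percent / 100.0

# Hypothetical layout: content starts at y=50 and is 700 units high.
end_of_page_1 = (0, 740.0)    # near the bottom of the first page
start_of_page_2 = (1, 60.0)   # near the top of the second page
middle_of_page_2 = (1, 400.0)

across_break = within_percent(end_of_page_1, start_of_page_2, 50.0, 700.0, 10.0)
far_apart = within_percent(end_of_page_1, middle_of_page_2, 50.0, 700.0, 10.0)
```

In this sketch the two hits on either side of the page break are only 20 units apart in the stacked strip, so `across_break` is true, whereas `far_apart` is false.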
Thanks to Support and Devs for having a look at this suggestion,

Best Regards
David.P

Re: Problem with Advanced Search Advanced Criterions

Posted: Wed Jul 05, 2017 11:41 am
by Tracker Supp-Stefan
Thanks again David,

I've passed it along to a colleague in Canada for serious consideration as an FR.

Cheers,
Stefan

Re: Problem with Advanced Search Advanced Criterions

Posted: Wed Jul 05, 2017 12:26 pm
by David.P
Thank you again Stefan,
Cheers
David
;)

Re: Problem with Advanced Search Advanced Criterions

Posted: Wed Jul 05, 2017 1:41 pm
by Tracker Supp-Stefan
:)

Re: Problem with Advanced Search Advanced Criterions

Posted: Tue Jan 30, 2018 10:44 am
by David.P
Just as a friendly reminder, because I am working on a legal case again, where the success largely depends on finding all occurrences of certain (boolean) word combinations in text paragraphs of very long documents:

It would be amazing if the proximity search function of PDF-XChange Editor could be improved such that it also finds a word combination contained in a paragraph that spans a page break, as suggested above in detail and shown again in the mock-up screenshot below:
Image
Keep up the great work!
David

Re: Problem with Advanced Search Advanced Criterions

Posted: Tue Jan 30, 2018 4:08 pm
by Tracker Supp-Stefan
Hi David,

Thanks for bringing this up again, but at the moment we are focusing on clearing up some bugs and issues that came with V7, and as such new feature development is a bit on the back burner. It will be considered, but I still cannot make any promises as to if or when it can actually be implemented.

Regards,
Stefan

Re: Problem with Advanced Search Advanced Criterions

Posted: Tue Jan 30, 2018 4:35 pm
by David.P
Hi Stefan,

that's of course understood. It's good to hear that the feature remains under consideration for the mid-term.
I think it will be an indispensable feature for everyone who works a lot with (particularly legal) text analysis -- as soon as they start considering, or using it :)

Cheers
David

Re: Problem with Advanced Search Advanced Criterions

Posted: Tue Jan 30, 2018 7:46 pm
by TrackerSupp-Daniel
Glad to hear that's sorted :)
Stefan or I will get back to you as soon as we have further news on this matter.

Re: Problem with Advanced Search Advanced Criterions

Posted: Wed Dec 18, 2019 11:27 am
by kauenegrao@gmail.com
Hi,
I'm also hoping for improvements in the proximity search. In addition to what David.P suggested, I suggest using character distance as a new criterion. It would solve the following problem, already explained by David.P:

-Situation 1: sometimes a word is very close to another, but it's at the beginning of the next page;
-Situation 2: and another situation where two words are much farther apart, but on the same page;

Currently, it is possible to use the search criterion "Words from the same page" for the second situation, but there is no way to get a search result in the first situation, even though the words are closer together than in the second situation.
So, it would be useful to be able to choose a distance between words, measured in characters, when creating a proximity search. Such an option exists, for example, in File Locator Pro.
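A character-distance criterion could be sketched roughly as follows in Python. This is a simplified illustration and not the behavior of PDF-XChange Editor or File Locator Pro: the functions `find_offsets` and `near_pairs` are hypothetical, each page is assumed to be already extracted as plain text in reading order, and pages are simply joined with a newline so that a page break costs one character.

```python
import re

def find_offsets(text, term):
    """Character offsets of whole-word, case-insensitive matches of `term`."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return [m.start() for m in pattern.finditer(text)]

def near_pairs(pages, term_a, term_b, max_chars):
    """Return (offset_a, offset_b) pairs whose start offsets are at most
    `max_chars` characters apart, ignoring page boundaries.

    Assumption: `pages` is a list of page texts in reading order.
    """
    text = "\n".join(pages)
    offsets_b = find_offsets(text, term_b)
    return [
        (a, b)
        for a in find_offsets(text, term_a)
        for b in offsets_b
        if abs(a - b) <= max_chars
    ]

# Hypothetical two-page document: the sentence continues on the next page.
pages = [
    "The contract was signed by the buyer",
    "and the seller on the same day.",
]
close = near_pairs(pages, "buyer", "seller", max_chars=20)
too_strict = near_pairs(pages, "buyer", "seller", max_chars=5)
```

In this example `close` contains one pair although the two words sit on different pages (Situation 1), while `too_strict` is empty because the words are more than 5 characters apart.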

Thanks,

Re: Problem with Advanced Search Advanced Criterions

Posted: Wed Dec 18, 2019 12:27 pm
by Tracker Supp-Stefan
Hello again kauenegrao,

We are aware of the limitation, but the structure of PDF pages (each page is separate and independent) makes this a bit more complex than it looks at first!
There is still no news on the availability of such an "across page" search feature.

Regards,
Stefan

Re: Problem with Advanced Search Advanced Criterions

Posted: Mon Nov 15, 2021 12:22 am
by David.P
Hello all,

when improving the search function, maybe another look could be taken at the feature suggestion above:

Proximity search should find search term combinations in paragraphs that span page breaks:
https://forum.pdf-xchange.com/viewtopic.php?f=62&t=28966&hilit=paragraph+page%20break#p120917

Image

Best regards
David.P

Re: File load order of Session documents

Posted: Mon Nov 15, 2021 12:09 pm
by Tracker Supp-Stefan
Hello David.P,

I believe that the main focus of this topic is still the document load order rather than the search, so would you like me to move your latest post to a topic of its own, or move it to the original topic that you reference?
I would then also like to make a FR ticket for this suggestion so that we have it in our internal system, but without any promise when or if this would actually be implemented.

Kind regards,
Stefan

Re: Problem with Advanced Search Advanced Criterions

Posted: Mon Nov 15, 2021 2:26 pm
by David.P
Hello Stefan,

yes please feel free to do so. I have already deleted the big screenshot above, such that only the link is still there.

Best regards
David

Re: Problem with Advanced Search Advanced Criterions

Posted: Mon Nov 15, 2021 5:38 pm
by Tracker Supp-Stefan
Hello David.P,

I just moved your two posts, and my post in between them, to the original topic, and then created this ticket:
#5806: FR: Ability to specify "Proximity search" of a percentage of a page
With an explanation of your request.

While you removed your image - I created a similar one and included it in the ticket so that it's clearer what is being asked.
Please note that as this is not a critical thing - this feature might take longer to be reviewed and approved (if at all).

Kind regards,
Stefan

Re: Problem with Advanced Search Advanced Criterions

Posted: Mon Nov 15, 2021 6:22 pm
by David.P
Thank you Stefan,

I believe the mockup screenshot is appropriate and informative here, after you have moved the posts, so I added it back.

Thanks for the support and ticket creation,

Kind regards
David

Re: Problem with Advanced Search Advanced Criterions

Posted: Mon Nov 15, 2021 10:16 pm
by Paul - Tracker Supp
Thanks David,

Stefan is done for the day and it is not immediately apparent to me which screen shot you need in the ticket. I want to make sure the right one gets in there. Can you link to the right screen shot here please so I can ensure it makes it into the ticket?

please and thanks.

Re: Problem with Advanced Search Advanced Criterions

Posted: Tue Nov 16, 2021 9:08 am
by Tracker Supp-Stefan
Hello David,

Thanks for adding the image back!
I've included your sample in the ticket as well.

Kind regards,
Stefan