Problem with Advanced Search Advanced Criterions

Forum for the PDF-XChange Editor - Free and Licensed Versions

David.P
User
Posts: 878
Joined: Thu Feb 28, 2008 8:16 pm
Location: Germany

Problem with Advanced Search Advanced Criterions

Post by David.P » Mon Apr 24, 2017 9:53 am

Hi forum and Tracker support team,

I heavily rely on the Search pane in order to quickly find certain text passages that contain certain search terms. Sometimes, I need to find paragraphs where certain search terms do occur, but where other search terms are not present, using "Advanced Criterion" in the Search Pane.

However, there seems to be a problem with Advanced Search when using such advanced search criteria:
Image

For example, with the attached document, if I try and find all paragraphs containing the word "Emma", but not at the same time containing the word "she", the Search function still finds many paragraphs that actually do contain both search terms.
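For reference, the result I would expect from such a search can be sketched in a few lines of Python. This is only an illustration of the intended AND/NOT logic at paragraph level, not how PDF-XChange Editor implements it. The function name `paragraphs_matching`, the blank-line paragraph splitting and the whole-word matching are assumptions made for this example.

```python
import re

def paragraphs_matching(text, include, exclude):
    """Return paragraphs that contain every `include` word and no `exclude` word."""
    results = []
    # Assumption: paragraphs are separated by blank lines.
    for para in re.split(r"\n\s*\n", text):
        # Assumption: case-insensitive, whole-word matching.
        words = set(re.findall(r"[a-z']+", para.lower()))
        has_all = all(w.lower() in words for w in include)
        has_excluded = any(w.lower() in words for w in exclude)
        if has_all and not has_excluded:
            results.append(para.strip())
    return results

sample = (
    "Emma Woodhouse was handsome, clever, and rich.\n\n"
    "Emma knew that she had been wrong.\n\n"
    "Mr. Knightley smiled."
)

# Only the first paragraph contains "Emma" without also containing "she".
hits = paragraphs_matching(sample, include=["Emma"], exclude=["she"])
print(hits)
```

The second sample paragraph contains both terms, so a correct search should leave it out; the bug described here is that paragraphs of that kind are still reported.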

This is a rather severe problem if you rely, as I do on a daily basis, on finding certain text paragraphs in long documents with thousands of pages.

Therefore it would be great if Support and/or Devs. could look into this issue.

Thanks very much,
David
Attachments
Sample e-Book 'Emma' (with highlighted Find results).pdf
(511.96 KiB) Downloaded 53 times
David.P
PDF-XChange Pro

Will - Tracker Supp
Site Admin
Posts: 6826
Joined: Mon Oct 15, 2012 9:21 pm
Location: London, UK
Contact:

Re: Problem with Advanced Search Advanced Criterions

Post by Will - Tracker Supp » Thu Apr 27, 2017 1:12 pm

Hi David,

I've passed this along to the Devs. and will post back when I hear more.

Cheers,
If posting files to this forum, you must archive the files to a ZIP, RAR or 7z file or they will not be uploaded.
Thank you.

Best regards

Will Travaglini
Tracker Support (Europe)
Tracker Software Products Ltd.
http://www.tracker-software.com

David.P
User
Posts: 878
Joined: Thu Feb 28, 2008 8:16 pm
Location: Germany

Re: Problem with Advanced Search Advanced Criterions

Post by David.P » Thu Apr 27, 2017 1:28 pm

Thank you Will, that's great.
:)
David.P
PDF-XChange Pro

Vasyl-Tracker Dev Team
Site Admin
Posts: 1957
Joined: Thu Jun 30, 2005 4:11 pm
Location: Canada

Re: Problem with Advanced Search Advanced Criterions

Post by Vasyl-Tracker Dev Team » Thu Apr 27, 2017 8:59 pm

Hi, David.

This issue will be fixed in the next upcoming build. Thanks for the detailed bug report.

Cheers.
Vasyl Yaremyn
Tracker Software Products
Project Developer

Please archive any files posted to a ZIP, 7z or RAR file or they will be removed and not posted.

David.P
User
Posts: 878
Joined: Thu Feb 28, 2008 8:16 pm
Location: Germany

Re: Problem with Advanced Search Advanced Criterions

Post by David.P » Fri Apr 28, 2017 7:59 am

Thank you very much Vasyl -- it's great to hear that this will be fixed so fast. Another proof that supreme product support and active developer response go hand in hand with providing a vastly superior product!

When investigating this problem, maybe it can be checked whether this is related to the recently reported issue where only the first of the search terms in the "OR" Advanced Criterion Search box is highlighted reliably. See, for example, the screenshot below from a Search action in the attached document: whereas all occurrences of the first search term "her" are highlighted correctly, the highlighting of the second search term "she" is sometimes missing.
Image

Thank you and keep up the great work!
David
Attachments
Sample e-Book 'Emma'.pdf
(994.01 KiB) Downloaded 42 times
David.P
PDF-XChange Pro

Vasyl-Tracker Dev Team
Site Admin
Posts: 1957
Joined: Thu Jun 30, 2005 4:11 pm
Location: Canada

Re: Problem with Advanced Search Advanced Criterions

Post by Vasyl-Tracker Dev Team » Fri Apr 28, 2017 7:11 pm

Hi David.

I see the issue, but unfortunately it is too late to fix it in the new upcoming build.
We will try to fix it for this build, but I cannot promise that.

Cheers.
Vasyl Yaremyn
Tracker Software Products
Project Developer


David.P
User
Posts: 878
Joined: Thu Feb 28, 2008 8:16 pm
Location: Germany

Re: Problem with Advanced Search Advanced Criterions

Post by David.P » Sat Apr 29, 2017 10:30 am

Thank you Vasyl. The latter issue also seems to be less severe because -- while it does not highlight everything correctly -- it still finds all paragraphs that fall under the search condition.

Which is not the case with the former issue (in the first post). Glad that you were able to fix that one already!

Cheers
David
:)
David.P
PDF-XChange Pro

Will - Tracker Supp
Site Admin
Posts: 6826
Joined: Mon Oct 15, 2012 9:21 pm
Location: London, UK
Contact:

Re: Problem with Advanced Search Advanced Criterions

Post by Will - Tracker Supp » Sun Apr 30, 2017 9:01 pm

:D

Best regards

Will Travaglini
Tracker Support (Europe)
Tracker Software Products Ltd.
http://www.tracker-software.com

David.P
User
Posts: 878
Joined: Thu Feb 28, 2008 8:16 pm
Location: Germany

Re: Problem with Advanced Search Advanced Criterions

Post by David.P » Tue Jul 04, 2017 10:12 am

Hi forum, Tracker support team and Devs,

it seems that the problems described in posts #1 and #5 are resolved in Build 322.5, which is great -- thank you!

However, I just came across another issue with Advanced Search. I think that it has to do with the 'paragraph recognition' of PDF-XChange Editor.

Please see the attached file that shows how PDF-XChange Editor does not find certain search term combinations because obviously, the respective text on the page is not recognized as a text paragraph by PDF-XChange Editor.

Hopefully, this helps to fix the issue, which in my case is quite severe, because I rely heavily on PDF-XChange Editor to find certain occurrences of search term combinations in very long documents -- with possible serious legal consequences if some occurrences are not found :(

Thanks very much for looking into this,

Regards
David
Attachments
Advanced Search, problems with paragraph recognition.pdf
(70.43 KiB) Downloaded 67 times
David.P
PDF-XChange Pro

Tracker Supp-Stefan
Site Admin
Posts: 13787
Joined: Mon Jan 12, 2009 8:07 am
Location: London
Contact:

Re: Problem with Advanced Search Advanced Criterions

Post by Tracker Supp-Stefan » Tue Jul 04, 2017 11:24 am

Hi David,

Glad to hear both of the original issues are resolved.
As you say in your file - the paragraph numbers are probably the ones interfering with the recognition logic!
I will have the devs look at it and see what can be done!

Cheers,
Stefan

David.P
User
Posts: 878
Joined: Thu Feb 28, 2008 8:16 pm
Location: Germany

Re: Problem with Advanced Search Advanced Criterions

Post by David.P » Tue Jul 04, 2017 1:00 pm

Thank you Stefan,

Unfortunately (regarding this problem) legal files, to a large extent, do tend to have paragraph numbers in the margin. Therefore, in order to get optimally (i.e. legally) correct search results, a reliable proximity search can be crucial.

However, this will always continue to be a problem, at least with scanned and OCR'ed documents, where it is probably even harder to recognize paragraphs.

Therefore, maybe the proximity search feature could be changed to be "steplessly adjustable" -- in the sense that you would specify the proximity distance as a percentage of a page dimension (e.g. page height), instead of having to specify "paragraph" or "page":

Image

This way, paragraphs and their recognition basically would not matter anymore. Instead, the feature would simply calculate the vertical distance between the found search terms on the page. If this distance is smaller than the user-specified "XY percent" of the page height, the occurrence would be registered as found!

For multi-column documents this still would have to be tweaked, such that search term occurrences that are vertically close to each other, but in different columns, would not be (falsely) registered as being within proximity.
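A rough sketch of the suggested check, in Python. All names here (`Hit`, `within_proximity`, `column_width`) are hypothetical, and the code assumes that each found search term already comes with a page number and x/y coordinates; how the Editor actually stores hit positions is not known from this thread. The column test is a simple stand-in that assumes equal-width columns.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    term: str
    page: int   # page number of the hit
    x: float    # horizontal position on the page, in page units
    y: float    # vertical position on the page, in page units

def within_proximity(a, b, page_height, max_percent, column_width=None):
    """True if two hits on the same page are vertically closer than
    max_percent of the page height (and, optionally, in the same column)."""
    if a.page != b.page:
        return False
    # Assumption: columns have equal width, so the column index is x // width.
    if column_width is not None:
        if int(a.x // column_width) != int(b.x // column_width):
            return False
    return abs(a.y - b.y) < page_height * max_percent / 100.0

a = Hit("Emma", page=1, x=50.0, y=100.0)
b = Hit("she", page=1, x=60.0, y=150.0)
c = Hit("she", page=1, x=400.0, y=120.0)

# 50 units apart on an 800-unit page with a 10 percent limit (80 units): close enough.
near = within_proximity(a, b, page_height=800.0, max_percent=10.0)
# Vertically close, but in a different column when columns are 300 units wide.
other_column = within_proximity(a, c, page_height=800.0, max_percent=10.0, column_width=300.0)
print(near, other_column)
```

Without the `column_width` argument, hits `a` and `c` would count as being within proximity, which is exactly the false positive that the multi-column tweak is meant to avoid.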

I'd be happy to hear the Dev's opinion on this!

Cheers
David
David.P
PDF-XChange Pro

Tracker Supp-Stefan
Site Admin
Posts: 13787
Joined: Mon Jan 12, 2009 8:07 am
Location: London
Contact:

Re: Problem with Advanced Search Advanced Criterions

Post by Tracker Supp-Stefan » Tue Jul 04, 2017 3:54 pm

Hi David,

An interesting suggestion indeed! But it might be much trickier to implement than it seems - given how content in the PDF file is positioned and all.
I will definitely check with my colleagues and see what their thoughts are on that, but please note that it might not be something we will look at implementing (that % offset feature).

Regards,
Stefan

David.P
User
Posts: 878
Joined: Thu Feb 28, 2008 8:16 pm
Location: Germany

Re: Problem with Advanced Search Advanced Criterions

Post by David.P » Tue Jul 04, 2017 4:28 pm

Thanks very much Stefan for the approval, and for passing this on to your colleagues.
Cheers
David
:)
David.P
PDF-XChange Pro

David.P
User
Posts: 878
Joined: Thu Feb 28, 2008 8:16 pm
Location: Germany

Re: Problem with Advanced Search Advanced Criterions

Post by David.P » Wed Jul 05, 2017 8:28 am

PS: as an addendum, regarding the current implementation of proximity search:
Image

... in addition to the issue described above regarding the problematic reliability of paragraph recognition, it is also currently not possible to find a search term combination in a paragraph that spans adjacent pages. This can also lead to not finding important text passages in documents, which (at least in the legal sector) can be decisive for the success or failure of proceedings.

This problem also could be overcome with the above feature suggestion, where text proximity is not tied to the (uncertain) detection of text paragraphs anymore, but instead is specified in a percentage of the page height (or rather, of the page contents' height), like shown below:
Image
Thanks to Support and Devs for having a look at this suggestion,

Best Regards
David.P
David.P
PDF-XChange Pro

Tracker Supp-Stefan
Site Admin
Posts: 13787
Joined: Mon Jan 12, 2009 8:07 am
Location: London
Contact:

Re: Problem with Advanced Search Advanced Criterions

Post by Tracker Supp-Stefan » Wed Jul 05, 2017 11:41 am

Thanks again David,

I've passed it along to a colleague in Canada for serious consideration as a feature request (FR).

Cheers,
Stefan

David.P
User
Posts: 878
Joined: Thu Feb 28, 2008 8:16 pm
Location: Germany

Re: Problem with Advanced Search Advanced Criterions

Post by David.P » Wed Jul 05, 2017 12:26 pm

Thank you again Stefan,
Cheers
David
;)
David.P
PDF-XChange Pro

Tracker Supp-Stefan
Site Admin
Posts: 13787
Joined: Mon Jan 12, 2009 8:07 am
Location: London
Contact:

Re: Problem with Advanced Search Advanced Criterions

Post by Tracker Supp-Stefan » Wed Jul 05, 2017 1:41 pm

:)

David.P
User
Posts: 878
Joined: Thu Feb 28, 2008 8:16 pm
Location: Germany

Re: Problem with Advanced Search Advanced Criterions

Post by David.P » Tue Jan 30, 2018 10:44 am

Just as a friendly reminder, because I am working on a legal case again, where the success largely depends on finding all occurrences of certain (boolean) word combinations in text paragraphs of very long documents:

It would be amazing if the proximity search function of PDF-XChange Editor could be improved such that it also finds a word combination contained in a paragraph that spans a page break, as suggested above in detail and shown again in the below mock-up screenshot:
Image
Keep up the great work!
David
David.P
PDF-XChange Pro

Tracker Supp-Stefan
Site Admin
Posts: 13787
Joined: Mon Jan 12, 2009 8:07 am
Location: London
Contact:

Re: Problem with Advanced Search Advanced Criterions

Post by Tracker Supp-Stefan » Tue Jan 30, 2018 4:08 pm

Hi David,

Thanks for bringing this up again, but at the moment we are focusing on clearing up some bugs and issues that came with V7, and as such new feature development is a bit on the back burner. It will be considered, but I still cannot make any promises as to if or when it can actually be implemented.

Regards,
Stefan

David.P
User
Posts: 878
Joined: Thu Feb 28, 2008 8:16 pm
Location: Germany

Re: Problem with Advanced Search Advanced Criterions

Post by David.P » Tue Jan 30, 2018 4:35 pm

Hi Stefan,

that's of course understood. It's good to hear that the feature remains under consideration for the mid-term.
I think it will be an indispensable feature for everyone who works a lot with (particularly legal) text analysis -- as soon as they start considering, or using it :)

Cheers
David
Last edited by David.P on Thu Feb 01, 2018 9:35 am, edited 1 time in total.
David.P
PDF-XChange Pro

TrackerSupp-Daniel
Site Admin
Posts: 2887
Joined: Wed Jan 03, 2018 6:52 pm

Re: Problem with Advanced Search Advanced Criterions

Post by TrackerSupp-Daniel » Tue Jan 30, 2018 7:46 pm

Glad to hear that's sorted :)
Stefan or I will get back to you as soon as we have further news on this matter.
Daniel McIntyre
Support Technician
Tracker Software Products (Canada) LTD

Sales: +1 (250) 324-1621
Fax: +1 (250) 324-1623

kauenegrao@gmail.com
User
Posts: 2
Joined: Wed Dec 18, 2019 11:05 am

Re: Problem with Advanced Search Advanced Criterions

Post by kauenegrao@gmail.com » Wed Dec 18, 2019 11:27 am

Hi,
I'm also hoping for improvements in the proximity search. In addition to what David.P suggested, I suggest using character distance as a new criterion. It would solve the following problem, already explained by David.P:

-Situation 1: sometimes a word is very close to another, but it's at the beginning of the next page;
-Situation 2: in another situation, two words are much farther apart, but on the same page;

Currently, it is possible to use the search criterion "Words from the same page" for the second situation, but there is no way to get the search result in the first situation, even though the words are closer together than in the second situation.
So, it would be useful to be able to choose a distance between words, measured in characters, when creating a proximity search. Such an option exists, for example, in File Locator Pro.
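To illustrate the idea, here is a minimal Python sketch of a character-distance proximity search that ignores page breaks. It is not based on how PDF-XChange Editor or File Locator Pro work internally; the function name `char_proximity_hits`, joining the page texts with a single space, and measuring the distance between the start offsets of the two words are all assumptions for this example.

```python
import bisect
import re

def char_proximity_hits(pages, term_a, term_b, max_chars):
    """Find pairs of term_a/term_b whose start offsets are at most max_chars
    apart in the concatenated text. Returns (page_a, page_b, distance) tuples."""
    text = ""
    page_starts = []  # character offset at which each page begins
    for page_text in pages:
        page_starts.append(len(text))
        # Assumption: a page break counts as one space between pages.
        text += page_text + " "

    def offsets(term):
        pattern = r"\b" + re.escape(term) + r"\b"
        return [m.start() for m in re.finditer(pattern, text, re.IGNORECASE)]

    def page_of(offset):
        # 1-based number of the page that contains this offset
        return bisect.bisect_right(page_starts, offset)

    hits = []
    for pos_a in offsets(term_a):
        for pos_b in offsets(term_b):
            distance = abs(pos_a - pos_b)
            if distance <= max_chars:
                hits.append((page_of(pos_a), page_of(pos_b), distance))
    return hits

# Situation 1: the two words are close together, but separated by a page break.
pages = ["the lease ends with a notice", "period of three months"]
hits = char_proximity_hits(pages, "notice", "period", max_chars=20)
print(hits)
```

In this sample, "notice" is the last word on page 1 and "period" is the first word on page 2, so a page-bound search misses the pair while the character-distance search reports it.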

Thanks,

Tracker Supp-Stefan
Site Admin
Posts: 13787
Joined: Mon Jan 12, 2009 8:07 am
Location: London
Contact:

Re: Problem with Advanced Search Advanced Criterions

Post by Tracker Supp-Stefan » Wed Dec 18, 2019 12:27 pm

Hello again kauenegrao,

We are aware of the limitation, but the structure of PDF pages (each page is separate and independent) makes this a bit more complex than it looks at first!
There is still no news on the availability of such an "across page" search feature.

Regards,
Stefan
