Does IFilter put PDF contents in Windows Search index?

libove
User
Posts: 11
Joined: Tue Oct 14, 2014 11:43 am

Does IFilter put PDF contents in Windows Search index?

Post by libove » Mon Mar 23, 2020 12:00 pm

Is the Tracker PDF-XChange IFilter supposed to allow the Windows 10 Search indexer to index the CONTENTS of PDF files, in addition to the file names?
It doesn't seem to be doing so. (Neither did the IFilter supplied with Adobe Reader.)

I've already checked out this other thread:
viewtopic.php?f=62&t=33183&hilit=ifilter#p136953
When I look at the Windows 10 Search advanced settings, the .pdf file type is set to use the PDF-XChange filter handler, and to index both properties and file contents.
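
For anyone who wants to double-check that registration outside the Indexing Options dialog, here is a minimal sketch in Python. It follows the standard registry chain that Windows uses to find a filter handler: the extension's PersistentHandler value, then the IFilter entry under PersistentAddinsRegistered, then the DLL under InprocServer32. The function names `resolve_ifilter` and `read_hkcr_default` are made up for this example, and the sketch assumes a 64-bit Python so that it reads the same registry view as the 64-bit indexer. It only reports what is registered; it does not prove the indexer is actually calling the filter.

```python
import sys

# Interface ID of IFilter; this GUID is the same on every Windows system.
IFILTER_IID = "{89BCB740-6119-101A-BCB7-00DD010655AF}"


def resolve_ifilter(ext, read_default):
    """Follow .ext -> PersistentHandler -> IFilter CLSID -> DLL path.

    read_default(subkey) must return the default value stored under
    HKEY_CLASSES_ROOT for that subkey, or None if it is missing.
    """
    handler = read_default(ext + "\\PersistentHandler")
    if not handler:
        return None  # no persistent handler registered for this extension
    filter_clsid = read_default(
        "\\".join(("CLSID", handler, "PersistentAddinsRegistered", IFILTER_IID))
    )
    if not filter_clsid:
        return {"handler": handler, "filter": None, "dll": None}
    dll = read_default("\\".join(("CLSID", filter_clsid, "InprocServer32")))
    return {"handler": handler, "filter": filter_clsid, "dll": dll}


def read_hkcr_default(subkey):
    """Read a default value from HKEY_CLASSES_ROOT (Windows only)."""
    import winreg  # imported here so the module still loads on other systems

    try:
        with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, subkey) as key:
            return winreg.QueryValueEx(key, "")[0]
    except OSError:
        return None


if __name__ == "__main__" and sys.platform == "win32":
    # Prints the handler GUID, the filter CLSID and the DLL path for .pdf.
    print(resolve_ifilter(".pdf", read_hkcr_default))
```

If the DLL path that comes back belongs to a different product than the one shown in the advanced settings, that mismatch would be worth mentioning in a support request.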

How do I get the contents of PDF files indexed so that Windows 10 desktop Search can find files not just by name, but also by content?

thanks

Tracker Supp-Stefan
Site Admin
Posts: 14037
Joined: Mon Jan 12, 2009 8:07 am
Location: London
Contact:

Re: Does IFilter put PDF contents in Windows Search index?

Post by Tracker Supp-Stefan » Mon Mar 23, 2020 2:25 pm

Hello libove,

Thanks for your post and enquiry!
Yes - you should be able to search inside the contents of a PDF file - as long as it has searchable text in it:
[Screenshot: search.png]

If the file is image-based, or the 'text' inside it has been converted to vector objects (curves), then our IFilter will not be able to search such content, as there is no machine-readable text inside such files.

Kind regards,
Stefan

libove
User
Posts: 11
Joined: Tue Oct 14, 2014 11:43 am

Re: Does IFilter put PDF contents in Windows Search index?

Post by libove » Mon Mar 23, 2020 5:29 pm

I'm going off-topic here, but I'm doing so because I think it may be why this isn't working on one of my computers, but does seem to be working on another. Apologies in advance, and thanks in advance if someone can help with this.

The computer on which the content of PDF files is NOT being found in Windows Desktop Search is a Windows 10 Enterprise v1909 system.
Another computer on which the content of PDF files IS being found in Windows Desktop Search is a Windows 10 Pro v1909 system.

There appears to be a difference in the Search version between the two machines:
Pro: Click on the Search magnifying glass icon in the task bar, then the menu dots in the upper right, Build 2020.03.22.6240933.
Enterprise: same path, Build 2019.07.18.6227079.
Although obviously that could cause significant user experience differences, I wouldn't expect it to cause what I'm seeing (PDF file contents not being indexed on the Enterprise machine). Anyway, more information:

I also notice a difference in the Search options between the Pro and Enterprise machine:
On the Pro machine, there is a Cortana Permissions page in the Settings app. It looks like this:
Cortana permissions Pro.jpg
On the Enterprise machine, trying to go to the Cortana Permissions page in Settings gives me this instead:
Cortana permissions Ent.jpg
(Both machines have the same Microsoft Live account associated with the logged-in user. That is, going to the Email & accounts Settings page shows the same content on both machines).

Furthermore, the Cortana Language Settings pages are also different:
On the Pro machine:
Cortana Language settings Pro.jpg
On the Enterprise machine:
Cortana Language settings Ent.jpg
I get the feeling that either Microsoft forcibly disables Cortana on Windows 10 Enterprise 1909, or I just haven't figured out how to turn it on (despite searching and trying all kinds of things I've found online over the past hour or more).
And, crazy as it would be, that some or all of this just possibly IS stopping the PDF-XChange IFilter (or the Reader IFilter, which I notice is the one enabled on one of my Pro machines) from indexing the contents of PDF files.
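
One way to take the Search box and Cortana out of the picture is to ask the index directly whether it holds any PDF whose contents match a phrase. The sketch below builds a Windows Search SQL query against SystemIndex and, on Windows, runs it through the Search.CollatorDSO OLE DB provider. It rests on several assumptions: the function names are made up for this example, running the query requires the third-party pywin32 package, and "invoice" is only a placeholder for a phrase you know appears in one of your PDFs. If the query returns paths while the Search UI finds nothing, the problem is in the UI rather than in the IFilter; if it returns nothing, the content was never indexed.

```python
import sys

# Connection string for the Windows Search OLE DB provider.
CONNECTION_STRING = (
    "Provider=Search.CollatorDSO;Extended Properties='Application=Windows';"
)

QUERY_TEMPLATE = (
    "SELECT TOP {limit} System.ItemPathDisplay FROM SystemIndex "
    "WHERE System.FileExtension = '{ext}' "
    "AND CONTAINS(System.Search.Contents, '\"{phrase}\"')"
)


def build_content_query(phrase, extension=".pdf", limit=20):
    """Build a query for indexed files whose contents contain the phrase."""
    if '"' in phrase:
        raise ValueError("phrase must not contain double quotes")
    # Single quotes are escaped by doubling them inside SQL string literals.
    safe_phrase = phrase.replace("'", "''")
    safe_ext = extension.replace("'", "''")
    return QUERY_TEMPLATE.format(
        limit=int(limit), ext=safe_ext, phrase=safe_phrase
    )


def run_query(sql):
    """Run the query via ADO (Windows only; assumes pywin32 is installed)."""
    import win32com.client  # third-party: pywin32

    conn = win32com.client.Dispatch("ADODB.Connection")
    conn.Open(CONNECTION_STRING)
    rs = win32com.client.Dispatch("ADODB.Recordset")
    rs.Open(sql, conn)
    paths = []
    while not rs.EOF:
        paths.append(rs.Fields.Item(0).Value)  # the single selected column
        rs.MoveNext()
    rs.Close()
    conn.Close()
    return paths


if __name__ == "__main__" and sys.platform == "win32":
    for path in run_query(build_content_query("invoice")):
        print(path)
```

Running the same script on both the Pro and the Enterprise machine, with the same test PDF on each, would show whether the two indexes really differ.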

I'm confused.

Again, sorry for almost surely going off-topic, and thanks in advance if someone can suggest how to fix this.

TrackerSupp-Daniel
Site Admin
Posts: 3442
Joined: Wed Jan 03, 2018 6:52 pm

Re: Does IFilter put PDF contents in Windows Search index?

Post by TrackerSupp-Daniel » Tue Mar 24, 2020 4:47 pm

Hi, libove

There should not be any differences in the IFilter handler between versions of Windows, so this should work regardless of the other settings that are in place.
Could you try disabling and then re-enabling our IFilter on the Enterprise machine? After that, please restart the PC and test again to see whether it helped. This article goes over the process:
https://www.tracker-software.com/knowle ... extensions

Kind regards,
Daniel McIntyre
Support Technician
Tracker Software Products (Canada) LTD

Support: <Support@tracker-software.com>
Sales: +1 (250) 324-1621
Fax: +1 (250) 324-1623

libove
User
Posts: 11
Joined: Tue Oct 14, 2014 11:43 am

Re: Does IFilter put PDF contents in Windows Search index?

Post by libove » Fri Mar 27, 2020 7:55 am

Hi Daniel, thanks for sticking with me on this.
I had already used the XCShInfoSetup.exe utility, and I've now also re-run the REGSVR32 commands. I've confirmed that the Tracker IFilter appears in the Indexing advanced settings for the .pdf file type, and that file contents are set to be indexed. (I then rebooted and checked again.)
I also, then, made a new copy of a PDF file, and deleted the original, to make sure that the indexer would have a clean look at the file and its contents.
Still no joy.
I'm pretty sure it's a problem with Windows 10 Search, not with the PDF filter (whether Tracker's or Adobe's).
In recent days I've totally rebuilt this Windows system, downgraded it from Enterprise to Pro, tried a local account vs. a Microsoft account, turned Cortana off and on and off and on, and played with the Windows Search Permissions (do/don't search Cloud content).
I also worked over the ACLs of the second hard drive, which holds the PDF files that were not reliably found and seemingly not content-indexed.
Somewhere along the way, I simply cannot say exactly when/why, indexing of content of PDF files on the second drive began working.
So, this probably never had anything to do with Tracker in particular.
Microsoft, simply, sucks.
Thanks everyone.

TrackerSupp-Daniel
Site Admin
Posts: 3442
Joined: Wed Jan 03, 2018 6:52 pm

Re: Does IFilter put PDF contents in Windows Search index?

Post by TrackerSupp-Daniel » Fri Mar 27, 2020 6:26 pm

Hi, libove

Thank you for the details of your findings and investigation.

Kind regards,
Daniel McIntyre
