Content searcher - txt doc pdf

General discussion related to "Everything".
EchterStahlmann
Posts: 6
Joined: Fri Aug 25, 2017 4:52 pm

Content searcher - txt doc pdf

Post by EchterStahlmann » Fri Aug 25, 2017 5:24 pm

Hello,

Is it possible to search inside the contents of txt, doc and pdf files with Everything?
If not, what free software can do it? (Not the best question for this forum :lol: )

Cheers

void
Site Admin
Posts: 5547
Joined: Fri Oct 16, 2009 11:31 pm

Re: Content searcher - txt doc pdf

Post by void » Sat Aug 26, 2017 12:30 am

Content searching is possible with Everything 1.4.

Please try the Advanced Search in Everything 1.4 to set up a content search:
  • In Everything, from the Search menu, click Advanced Search.
  • Change A word or phrase in the file to the content you would like to find.
  • Combine with other filters for the best performance.
  • For example:
    • Change Extensions to: txt;doc;pdf to limit the search to txt, doc and pdf files.
    • Change the first Date modified field to: 1/1/2017 to limit the search to files modified this year.
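
Why combining filters helps can be pictured with a brute-force sketch in Python. This is not how Everything works internally. It only illustrates the general idea that cheap checks on the extension and the modified date can rule out most files before the expensive step of reading their contents. The function name `find_files_containing` and its parameters are made up for this illustration, and it only reads plain-text files (doc and pdf files would need a format-specific text extractor).

```python
import os
from datetime import datetime


def find_files_containing(root, needle, extensions, modified_since):
    """Yield paths under root whose text contains needle.

    Hypothetical helper for illustration only. Cheap filters run first so
    that most files are skipped without ever being opened.
    """
    wanted = {"." + ext.lower() for ext in extensions}
    cutoff = modified_since.timestamp()
    for folder, _subfolders, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            # Cheap check 1: extension (uses the file name only).
            if os.path.splitext(name)[1].lower() not in wanted:
                continue
            try:
                # Cheap check 2: modified date (metadata only).
                if os.path.getmtime(path) < cutoff:
                    continue
                # Expensive check: read the file and look for the text.
                with open(path, "r", encoding="utf-8", errors="ignore") as handle:
                    if needle in handle.read():
                        yield path
            except OSError:
                continue  # unreadable file: skip it
```

For example, `list(find_files_containing(".", "invoice", ["txt"], datetime(2017, 1, 1)))` would only open txt files modified since the start of 2017.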

EchterStahlmann
Posts: 6
Joined: Fri Aug 25, 2017 4:52 pm

Re: Content searcher - txt doc pdf

Post by EchterStahlmann » Sat Aug 26, 2017 12:56 pm

Thanks!
So do you think this comparison is outdated?

[Image: comparison chart, not preserved]

void
Site Admin
Posts: 5547
Joined: Fri Oct 16, 2009 11:31 pm

Re: Content searcher - txt doc pdf

Post by void » Sun Aug 27, 2017 10:47 am

Everything
Find text in documents (PDF/DOC/TXT): Yes
Find all file types: Yes
Search file attributes/properties: attributes: Yes; some properties (image dimensions/ID3 tags): Yes
Advanced boolean searches: Yes
Regular expression support: Yes
Maintains separate search index (performance hit): No
Instant search: Yes
Preview search results: Yes
Search Virtual folders (non-filesystem): No (coming in Everything 1.5)
Find photos by map location: No
Compare file attributes: No, maybe: attribdupe:
Find documents in deep folders (path>260 characters): Yes

EchterStahlmann
Posts: 6
Joined: Fri Aug 25, 2017 4:52 pm

Re: Content searcher - txt doc pdf

Post by EchterStahlmann » Mon Aug 28, 2017 11:14 am

void wrote:Everything
Find text in documents (PDF/DOC/TXT): Yes
Find all file types: Yes
Search file attributes/properties: attributes: Yes; some properties (image dimensions/ID3 tags): Yes
Advanced boolean searches: Yes
Regular expression support: Yes
Maintains separate search index (performance hit): No
Instant search: Yes
Preview search results: Yes
Search Virtual folders (non-filesystem): No (coming in Everything 1.5)
Find photos by map location: No
Compare file attributes: No, maybe: attribdupe:
Find documents in deep folders (path>260 characters): Yes

Good for you!
You could put this updated info on your site ;)
The competition should be worried :P

MikeKaye
Posts: 2
Joined: Sat Jan 13, 2018 12:44 am

Re: Content searcher - txt doc pdf

Post by MikeKaye » Sun Jan 14, 2018 7:02 pm

I would like to search the 8-bit ASCII content of specific source file types, namely: asm, s, inc, c, cpp, h, hpp, java, class, html. There is no point in indexing every file type by default; users could specify which types to index.

I have a few odd file types such as msa (asm backwards) which I use as prototype files with template substitution.

Searching within doc and docx files would be a plus.

void
Site Admin
Posts: 5547
Joined: Fri Oct 16, 2009 11:31 pm

Re: Content searcher - txt doc pdf

Post by void » Sun Jan 21, 2018 2:38 am

Please try making a custom filter:
  • In Everything, from the Search menu, click Add to Filter....
  • Change Name to: src
  • Change the search to: ext:asm;s;inc;c;cpp;h;hpp;java;class;html;msa
  • Change macro to: src
  • Click OK.
ext: will search for files with a matching extension.

Now, when you search for src: Everything will instead search for: ext:asm;s;inc;c;cpp;h;hpp;java;class;html;msa

For example, to search the contents of all files with a matching extension for the text SomeFunctionName, search for:
src: content:SomeFunctionName
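
The effect of the macro can be pictured as a simple text substitution. The Python sketch below only illustrates that idea and is not Everything's actual implementation; the `MACROS` table and the `expand_macros` function are hypothetical names.

```python
# Hypothetical macro table: macro name -> the search it stands for.
MACROS = {
    "src": "ext:asm;s;inc;c;cpp;h;hpp;java;class;html;msa",
}


def expand_macros(search, macros=MACROS):
    """Replace each bare 'name:' term with the search stored for that macro."""
    expanded = []
    for term in search.split():
        # Only a term that ends with ':' (such as 'src:') is treated as a macro.
        name = term[:-1] if term.endswith(":") else None
        expanded.append(macros.get(name, term))
    return " ".join(expanded)


print(expand_macros("src: content:SomeFunctionName"))
# ext:asm;s;inc;c;cpp;h;hpp;java;class;html;msa content:SomeFunctionName
```

Terms that are not in the table, such as content:SomeFunctionName, pass through unchanged.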

You could do the same for doc and docx files; there is already a doc: macro that searches for both.

Combine content: with other search functions for the best performance, such as:
dm:thisyear size:<1MB d:\dev\

rSalois
Posts: 1
Joined: Fri Jul 27, 2018 1:09 pm

Re: Content searcher - txt doc pdf

Post by rSalois » Fri Jul 27, 2018 1:38 pm

I am quite unsure about the pdf extension, though. Searching through txt and Word files is no big deal, but PDF requires an OCR feature, which is more complicated. Most services offer it only as a "pro" feature, like here for example: https://edit-pdf.pdffiller.com/ Although it is available in a free trial, its options are quite restricted.

horst.epp
Posts: 265
Joined: Fri Apr 04, 2014 3:24 pm

Re: Content searcher - txt doc pdf

Post by horst.epp » Fri Jul 27, 2018 2:51 pm

rSalois wrote:I am quite unsure about the pdf extension, though. Searching through txt and Word files is no big deal, but PDF requires an OCR feature, which is more complicated. Most services offer it only as a "pro" feature, like here for example: https://edit-pdf.pdffiller.com/ Although it is available in a free trial, its options are quite restricted.
There is no problem searching PDF content with the Windows Search indexer, which uses an iFilter to extract the PDF content.
There are free iFilters available which work fine, e.g.:
SumatraPDF https://www.sumatrapdfreader.org/prerelease.html
TET PDF IFilter https://www.pdflib.com/products/tet-pdf-ifilter/

NotNull
Posts: 2153
Joined: Wed May 24, 2017 9:22 pm

Re: Content searcher - txt doc pdf

Post by NotNull » Fri Jul 27, 2018 8:25 pm

PDFs can be text documents, images (scan2pdf, for example) or a combination of both (digital magazines).
Everything uses iFilters to read the contents of files (as @horst.epp already mentioned). AFAIK, there is no iFilter that scans images for text.
Luckily, the vast majority of PDF files out there are text based. (A quick way to test that is to try selecting some text in your PDF viewer; you can't do that with an image.)
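
For checking many files at once, a rough programmatic stand-in for the select-some-text test might look like the Python sketch below. It is only a crude heuristic, not something Everything or any iFilter does: text drawn on a page needs a font, so a PDF that never mentions `/Font` is probably made of images only. The function name `looks_text_based` is hypothetical, and PDFs that keep their objects in compressed streams can hide the entry, so a False result is a hint rather than proof.

```python
def looks_text_based(pdf_bytes):
    """Crude guess: does this PDF mention any font resources?

    Hypothetical helper for illustration. Returns True when the raw bytes
    contain a '/Font' entry, which text-based PDFs normally have. A False
    result can be wrong if the PDF stores its objects in compressed streams.
    """
    if not pdf_bytes.startswith(b"%PDF-"):
        raise ValueError("not a PDF file")
    return b"/Font" in pdf_bytes
```

It could be called with the raw bytes of a file, for example `looks_text_based(open("scan.pdf", "rb").read())`, where scan.pdf is a placeholder file name.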

@horst.epp: Thank you for your TET PDF IFilter suggestion. Sounds very promising. Certainly going to try it out. Do you have personal experience with it?

horst.epp
Posts: 265
Joined: Fri Apr 04, 2014 3:24 pm

Re: Content searcher - txt doc pdf

Post by horst.epp » Sat Jul 28, 2018 7:50 am

NotNull wrote:PDFs can be text documents, images (scan2pdf, for example) or a combination of both (digital magazines).
Everything uses iFilters to read the contents of files (as @horst.epp already mentioned). AFAIK, there is no iFilter that scans images for text.
Luckily, the vast majority of PDF files out there are text based. (A quick way to test that is to try selecting some text in your PDF viewer; you can't do that with an image.)

@horst.epp: Thank you for your TET PDF IFilter suggestion. Sounds very promising. Certainly going to try it out. Do you have personal experience with it?
Yes, I have been running it for 4 months now under Windows 10 x64 without problems.
I depend heavily on searching PDFs.
My own scanned PDF files are always OCR generated or printed the right way.
Even image-based PDFs can often be made searchable with the ReadIris software from my HP scanner.

NotNull
Posts: 2153
Joined: Wed May 24, 2017 9:22 pm

Re: Content searcher - txt doc pdf

Post by NotNull » Sat Jul 28, 2018 9:02 pm

Thanks for sharing!
