A Question for content searching, written in Korean

gamjalook
Posts: 4
Joined: Fri Nov 13, 2020 8:46 am

Post by gamjalook » Fri Nov 13, 2020 9:14 am

Hi.

I recently found the Everything tool, and it is pretty helpful.

But I think I'm having a problem using it.

As I understand it, the content: and utf8content: functions each search .txt files of every encoding at the same time, not only files of one encoding, right?

But when I search for text written in Korean, the content: function does not properly find it in UTF-8 encoded .txt files.

This happens with newly created .txt files, but not with some old .txt files.
So I'm confused about what is going on.

Can anyone tell me the reason or how to fix the problem?

Thanks in advance.
Last edited by gamjalook on Sat Nov 14, 2020 5:52 am, edited 2 times in total.

void
Site Admin
Posts: 5814
Joined: Fri Oct 16, 2009 11:31 pm

Re: A Question for content searching

Post by void » Fri Nov 13, 2020 10:38 am

content: will use the extension-associated iFilter to find content.

The default iFilter for txt files will auto-detect ANSI/UTF-8.

Use utf8content: if you don't want to use the associated iFilter and wish to treat the content as UTF-8.

If no associated iFilter is found, Everything will treat the content as UTF-8.

void

Re: A Question for content searching, written in Korean

Post by void » Fri Nov 13, 2020 10:42 am

gamjalook wrote:
Fri Nov 13, 2020 9:14 am
But when searching for contents which is written in Korean, using "contents" function cannot make the tool search the contents
in UTF-8 encoded .txt file properly.
The default iFilter is picky with detecting ANSI/UTF-8 and works best if there is a BOM.

If the files are UTF-8 encoded, please make sure there is a UTF-8 BOM.

Improving ANSI/UTF-8 detection is in development.
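
To check whether a file already has a UTF-8 BOM, and to add one if it does not, something like the following Python sketch could be used. It is only an illustration: the function names has_utf8_bom and add_utf8_bom are made up for this example and are not part of Everything. The BOM is the three bytes EF BB BF at the very start of the file.

```python
import codecs
from pathlib import Path


def has_utf8_bom(path):
    """Return True if the file starts with the 3-byte UTF-8 BOM (EF BB BF)."""
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8


def add_utf8_bom(path):
    """Prepend a UTF-8 BOM if the file is valid UTF-8 and has none yet.

    Returns True if the file was modified, False if a BOM was already there.
    Raises UnicodeDecodeError if the file is not valid UTF-8, so files in
    other encodings (for example ANSI) are never touched.
    """
    p = Path(path)
    data = p.read_bytes()
    if data.startswith(codecs.BOM_UTF8):
        return False
    data.decode("utf-8")  # validation only; the decoded text is discarded
    p.write_bytes(codecs.BOM_UTF8 + data)
    return True
```

Calling add_utf8_bom on a copy of one of the newly created files first is the safer way to try this, since it rewrites the file in place.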

raccoon
Posts: 117
Joined: Thu Oct 18, 2018 1:24 am

Re: A Question for content searching, written in Korean

Post by raccoon » Fri Nov 13, 2020 4:03 pm

gamjalook wrote:
Fri Nov 13, 2020 9:14 am
But when searching for contents which is written in Korean, using "contents" function cannot make the tool search the contents
in UTF-8 encoded .txt file properly.

This happens for newly made .txt files, but not for some old .txt files.
So I'm confused what is going on.
Are you sure that the newly made txt files are actually UTF-8 encoded, and not a different encoding entirely?
What program are you using to create the files? And can you double-check the save settings?
Try each of the content search encodings to be sure.

content:<text> Search file content for text.
ansicontent:<text> Search ANSI file content for text.
utf8content:<text> Search UTF-8 file content for text.
utf16content:<text> Search UTF-16 file content for text.
utf16becontent:<text> Search UTF-16 Big Endian file content for text.
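
Why the encoding matters can be seen from the bytes themselves: the same Korean word is stored as completely different byte sequences in each encoding, so a search that assumes the wrong one cannot match. The Python sketch below only illustrates that point and says nothing about how Everything works internally. The codec names are assumptions: "cp949" stands in for ANSI on a Korean-locale Windows, and "utf-16-le" for utf16content:.

```python
# Python codec names chosen for illustration (assumptions, see above).
ENCODINGS = {
    "ansicontent": "cp949",        # assumed Korean ANSI code page
    "utf8content": "utf-8",
    "utf16content": "utf-16-le",   # assumed little endian
    "utf16becontent": "utf-16-be",
}


def matching_functions(file_bytes, text):
    """Return the function names whose encoding of `text` occurs in file_bytes."""
    return [name for name, enc in ENCODINGS.items()
            if text.encode(enc) in file_bytes]


term = "감자"
# Show the raw bytes of the same word in each encoding.
for name, enc in ENCODINGS.items():
    print(f"{name:15} {term.encode(enc).hex(' ')}")

# A file saved as UTF-8 only contains the UTF-8 byte sequence.
utf8_file = b"x " + term.encode("utf-8") + b" y"
print(matching_functions(utf8_file, term))
```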

gamjalook

Re: A Question for content searching, written in Korean

Post by gamjalook » Sat Nov 14, 2020 5:39 am

raccoon wrote:
Fri Nov 13, 2020 4:03 pm

Are you sure that the newly made txt files are actually UTF-8 encoded, and not a different encoding entirely?
What program are you using to create the files? And can you double-check the save settings?
Try each of the content search encodings to be sure.
Thanks for your reply, raccoon.

I'm not 100% sure, but yes.
I believe each .txt file is encoded in its respective way.

I'm just creating those files the way MS Windows (Win 10 20H2 19042.610) provides: the right-click context menu.
The .txt files created through the context menu are UTF-8 encoded by default.
For the test, I made copies of that file with "Save As" under a different file name, changing "Encoding" to UTF-8 (BOM) and to ANSI.
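
A test setup like this one can also be reproduced and double-checked with a small script. The Python sketch below is an illustration only: the file names, the helper names guess_encoding and write_variants, and the use of "cp949" as the Korean ANSI code page are all assumptions. It writes the same text in the three encodings and then makes a rough guess at each file's encoding from its bytes.

```python
import codecs
import tempfile
from pathlib import Path


def guess_encoding(data):
    """Rough guess: BOM first, then strict UTF-8 decoding, else assume ANSI."""
    if data.startswith(codecs.BOM_UTF8):
        return "UTF-8 (BOM)"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "UTF-16"
    try:
        data.decode("utf-8")
        return "UTF-8 (no BOM)"
    except UnicodeDecodeError:
        return "ANSI (or other)"


def write_variants(folder, text):
    """Write the same text in three encodings, like the Save As test."""
    variants = {
        "utf8.txt": "utf-8",          # UTF-8 without BOM
        "utf8_bom.txt": "utf-8-sig",  # UTF-8 with BOM
        "ansi.txt": "cp949",          # assumed Korean ANSI code page
    }
    paths = []
    for name, enc in variants.items():
        path = Path(folder) / name
        path.write_text(text, encoding=enc)
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as folder:
    for path in write_variants(folder, "한글 검색 테스트"):
        print(path.name, "->", guess_encoding(path.read_bytes()))
```

Note that a file containing only plain ASCII text is valid UTF-8 as well, so the guess is only meaningful for files that actually contain Korean (or other non-ASCII) text.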

I've attached a picture that shows, at least I hope so, the question I have.
Sorry it's messy.
Attachments
test.png (148.38 KiB)
Last edited by gamjalook on Sat Nov 14, 2020 7:48 am, edited 2 times in total.

gamjalook

Re: A Question for content searching, written in Korean

Post by gamjalook » Sat Nov 14, 2020 5:46 am

Thank you void.

I'm still trying to understand this. :D

Until I get it, I'll use both the content: and utf8content: functions, I guess.

NotNull
Posts: 2408
Joined: Wed May 24, 2017 9:22 pm

Re: A Question for content searching, written in Korean

Post by NotNull » Sat Nov 14, 2020 4:41 pm

Some suggestions:

You can use
content:"some text" | utf8content:"some text"

to use both at the same time ("|" is Everything-speak for OR).

You can also create your own function, so you don't have to type this each time.
Let's say you want your function to be named mycontent:. Searching for mycontent:"some text" will then automatically be translated to:
content:"some text" | utf8content:"some text"

Here is how to configure that:
  • Go to Menu:Search > Organize Filters
  • Click the New button
  • Fill the fields as follows (Name field is arbitrary):
    2020-11-14 17_35_09-Edit Filter.png (5.67 KiB)
  • Press the OK button to save
From now on you can use mycontent: to search inside text files.

Note that this is case sensitive: Mycontent:text will not give any results.
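
The translation such a filter performs can be pictured with a few lines of Python. This is only a simplified model of the behaviour described above, not Everything's own code: expand_macro is a hypothetical helper, and it assumes the whole search is a single macro call.

```python
def expand_macro(search, macro="mycontent"):
    """Expand macro:<arg> into content:<arg> | utf8content:<arg>.

    Simplified model: the whole search must start with the macro name,
    and the comparison is case sensitive.
    """
    prefix = macro + ":"
    if not search.startswith(prefix):
        return search  # e.g. "Mycontent:text" is left untouched
    arg = search[len(prefix):]
    return f"content:{arg} | utf8content:{arg}"


print(expand_macro('mycontent:"some text"'))
# prints: content:"some text" | utf8content:"some text"
```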

gamjalook

Re: A Question for content searching, written in Korean

Post by gamjalook » Sun Nov 15, 2020 10:40 am

NotNull wrote:
Sat Nov 14, 2020 4:41 pm
Some suggestions:
...
You can also create your own function, so you don't have to type this each time
...
Here is how to configure that:
NotNull, thank you so much for your help.
I've just followed your instructions and succeeded in creating my own filter.
Now all I have to do is use that filter. :D

Wishing you all the best for your future.

Thank you!
