Is property indexing multi-threaded

Discussion related to "Everything" 1.5 Alpha.
aviasd
Posts: 64
Joined: Sat Oct 07, 2017 2:18 am

Is property indexing multi-threaded

Post by aviasd » Sun Mar 21, 2021 8:14 am

Hi,

I've been testing property indexing and added indexing for two video properties: Length and Total bitrate.

Indexing seemed to run sequentially, even though my videos are spread across separate drives (some SSDs, some HDDs).
It took around 5 minutes for ~70k files.

Is there a flag to enable multi-threaded property indexing?
(If not, I'd like to suggest it as a feature 8-) )

Thx

NotNull
Posts: 2861
Joined: Wed May 24, 2017 9:22 pm

Re: Is property indexing multi-threaded

Post by NotNull » Sun Mar 21, 2021 9:01 am

Take a look here ...

aviasd
Posts: 64
Joined: Sat Oct 07, 2017 2:18 am

Re: Is property indexing multi-threaded

Post by aviasd » Sun Mar 21, 2021 9:52 am

It seems /content_max_threads controls the number of threads used for reading properties and content, not for the indexing part.

/no_incur_seek_penalty_multithreaded=1 seems to be a viable option for NVMe drives but not for HDDs. Thanks, it does help on the NVMe drives.

I'm not sure what the ratio is between the time to seek and read a file and the time to process a property; it depends on the property.
E.g., computing a word count for a document would probably take longer than seeking to it, so a multithreaded parsing solution would have to be designed differently for that case than general one-thread-per-core threading (maybe separate queues for seeking and processing? IDK, I'm not an expert).
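
To illustrate the "separate queues for seeking and processing" idea, here is a minimal Python sketch. It is not how Everything works internally: the function names, the queue size and the in-memory "disk" are all hypothetical. One reader thread per device reads files in turn, so reads on the same disk do not compete with each other, while a shared pool of workers does the property parsing.

```python
import queue
import threading

def run_pipeline(files_by_device, read_file, parse_property, workers=4):
    """Read sequentially per device, parse in a shared worker pool."""
    work = queue.Queue(maxsize=64)  # bounded, so readers cannot outrun parsers
    results = {}
    lock = threading.Lock()

    def reader(paths):
        # One reader per device: files on the same disk are read one at a time.
        for path in paths:
            work.put((path, read_file(path)))

    def worker():
        while True:
            item = work.get()
            if item is None:  # sentinel: no more work
                break
            path, data = item
            value = parse_property(data)
            with lock:
                results[path] = value

    readers = [threading.Thread(target=reader, args=(paths,))
               for paths in files_by_device.values()]
    pool = [threading.Thread(target=worker) for _ in range(workers)]
    for t in readers + pool:
        t.start()
    for t in readers:
        t.join()
    for _ in pool:
        work.put(None)  # one sentinel per worker, queued after all real items
    for t in pool:
        t.join()
    return results

# Hypothetical stand-in for real file contents, keyed by path.
fake_disk = {"D:\\a.txt": "one two three", "D:\\b.txt": "four five", "J:\\c.txt": "six"}
files_by_device = {1: ["D:\\a.txt", "D:\\b.txt"], 0: ["J:\\c.txt"]}
counts = run_pipeline(files_by_device, fake_disk.get, lambda text: len(text.split()))
print(sorted(counts.items()))
```

In CPython, threads will not speed up CPU-heavy parsing because of the global interpreter lock, so a real version of the processing stage would more likely use processes or native code.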

void
Site Admin
Posts: 6462
Joined: Fri Oct 16, 2009 11:31 pm

Re: Is property indexing multi-threaded

Post by void » Sun Mar 21, 2021 10:03 am

Everything supports a separate thread for each device by default.

Everything should index each separate device at the same time.
Everything should read properties from each separate device at the same time.

Please check to see if Everything supports separate device threads:
  • In Everything, from the Tools menu, under the Debug submenu, click Statistics.
  • For each volume at the bottom, Everything will list the current multithreaded value.
  • It can be one of the following:
    • Separate device thread - Use a single separate thread for this volume
    • Enabled - volume supports multiple threads.
    • Disabled - volume does not support multiple threads.
  • Please also check that Everything is reporting a unique Disk device index for each volume. If not, Everything will only use one thread.
You will not see any performance increase by using more than one thread for each separate HDD.
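
The rule described above (one thread per separate device, with volumes that share a Disk device index sharing a thread) could be sketched roughly as follows. This is only an illustration of the rule, not Everything's actual code; `group_by_device`, `scan_all` and the sample volume list are hypothetical.

```python
import threading
from collections import defaultdict

def group_by_device(volumes):
    """Map disk device index -> list of volume paths on that device."""
    groups = defaultdict(list)
    for path, device_index in volumes:
        groups[device_index].append(path)
    return dict(groups)

def scan_all(volumes, scan_volume):
    """Run one thread per device; volumes sharing a device are scanned in turn."""
    def scan_device(paths):
        for path in paths:
            scan_volume(path)

    threads = [threading.Thread(target=scan_device, args=(paths,))
               for paths in group_by_device(volumes).values()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(threads)

# Sample values: D: and F: report the same disk device index, so they share a thread.
volumes = [("C:", 3), ("D:", 1), ("F:", 1), ("H:", 2), ("J:", 0)]
scanned = []
thread_count = scan_all(volumes, scanned.append)
print(thread_count, sorted(scanned))
```

With five volumes on four devices, only four threads are started, which is why a non-unique Disk device index reduces concurrency.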

To reduce the number of properties Everything indexes:
  • In Everything, from the Tools menu, click Options.
  • Click Properties on the left.
  • Set include only folders to a semicolon delimited (;) list of folders.
    For example:
    c:\media;d:\media;e:\
  • Set include only files to a semicolon delimited (;) list of extensions.
    For example:
    *.mp4;*.mkv;*.webm
  • Click OK.
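
As a rough model of what these two settings express, the sketch below checks a path against a semicolon-delimited folder list and a semicolon-delimited wildcard list. The matching rules are assumptions made for illustration (case-insensitive, a file must pass both lists, an empty list means no restriction); Everything's real matching may differ, and `parse_list` and `is_included` are hypothetical names.

```python
from fnmatch import fnmatchcase

def parse_list(setting):
    """Split a semicolon-delimited setting into its non-empty items."""
    return [item.strip() for item in setting.split(";") if item.strip()]

def is_included(path, include_folders, include_files):
    """Return True if path passes both include-only lists (empty list = no limit)."""
    path = path.lower()
    name = path.rsplit("\\", 1)[-1]
    # Normalise each folder to end in a backslash so c:\media does not match c:\mediafoo.
    folders = [f.lower().rstrip("\\") + "\\" for f in parse_list(include_folders)]
    patterns = [p.lower() for p in parse_list(include_files)]
    folder_ok = not folders or any(path.startswith(f) for f in folders)
    file_ok = not patterns or any(fnmatchcase(name, p) for p in patterns)
    return folder_ok and file_ok

FOLDERS = "c:\\media;d:\\media;e:\\"
FILES = "*.mp4;*.mkv;*.webm"
print(is_included("C:\\media\\film.MP4", FOLDERS, FILES))
print(is_included("d:\\media\\notes.txt", FOLDERS, FILES))
```

Narrowing both lists reduces the number of files whose properties have to be read at all, which is usually the cheapest way to shorten indexing time.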

aviasd
Posts: 64
Joined: Sat Oct 07, 2017 2:18 am

Re: Is property indexing multi-threaded

Post by aviasd » Sun Mar 21, 2021 10:34 am

My setup (C and H are NVMe drives):
Indexes are unique.
Update: my mistake, the disk device index is not unique for D and F. Should that disable reading other drives concurrently with either F or D?

NTFS Index
Volume Name:	\\?\Volume{}
Path:	C:
Root:	
Include only:	
Drive Type:	Fixed
Label:	
Index number:	0
Out of date:	No
Disk device index:	3
Multithreaded:	Enabled
Folder count:	470,768
File count:	2,094,182
USN Journal ID:	
Next USN:	

NTFS Index
Volume Name:	\\?\Volume{}
Path:	D:
Root:	
Include only:	
Drive Type:	Fixed
Label:	
Index number:	1
Out of date:	No
Disk device index:	1
Multithreaded:	Separate device thread
Folder count:	140,895
File count:	812,744
USN Journal ID:	
Next USN:	

NTFS Index
Volume Name:	\\?\Volume{}
Path:	F:
Root:	
Include only:	
Drive Type:	Fixed
Label:	
Index number:	2
Out of date:	No
Disk device index:	1
Multithreaded:	Separate device thread
Folder count:	13,029
File count:	39,176
USN Journal ID:	
Next USN:	

NTFS Index
Volume Name:	\\?\Volume{}
Path:	H:
Root:	
Include only:	
Drive Type:	Fixed
Label:	
Index number:	3
Out of date:	No
Disk device index:	2
Multithreaded:	Enabled
Folder count:	9,881
File count:	18,646
USN Journal ID:	
Next USN:	

NTFS Index
Volume Name:	\\?\Volume{}
Path:	J:
Root:	
Include only:	
Drive Type:	Fixed
Label:	
Index number:	4
Out of date:	No
Disk device index:	0
Multithreaded:	Separate device thread
Folder count:	327,306
File count:	1,278,235
USN Journal ID:	
Next USN:	

[While property indexing] It does not seem like Everything reads from multiple disks at the same time:
at this point of the scan, there are multiple unindexed files on drives J and F.
Image

void
Site Admin
Posts: 6462
Joined: Fri Oct 16, 2009 11:31 pm

Re: Is property indexing multi-threaded

Post by void » Mon Mar 22, 2021 7:10 am

By default, Everything will index properties for 1024 files at a time.
This is different to searching properties (which will use multiple threads).
Indexing properties will generally only use 1 thread that runs in the background.

In Everything 1.5.0.1248a, I've added an option, indexed_property_max_request, to customize the number of files to request at a time when indexing properties.

To request all files when indexing properties:
  • In Everything, type in the following search and press ENTER:
    /indexed_property_max_request=0
    where 0 is the number of files to request when indexing properties.
    0 = unlimited
    1024 is the default.
    If successful, you should see indexed_property_max_request=0 in the status bar for a few seconds.
Setting indexed_property_max_request to 0 will now index your properties with multiple threads.
Setting indexed_property_max_request to 0 can consume a large amount of RAM.
I'll look into making this the default without consuming any additional RAM.

aviasd
Posts: 64
Joined: Sat Oct 07, 2017 2:18 am

Re: Is property indexing multi-threaded

Post by aviasd » Mon Mar 22, 2021 8:42 am

The new build does seem to improve things somewhat.
Drives are queried concurrently while indexing properties, though the effect seems small:
Previous build indexing time: 17:23
New build indexing time: 15:50

After the faster drives have been indexed, Everything is bottlenecked by the slower HDD at ~3Mb/s, which is in line with my HDD's benchmark for random seeks. Nothing can be done there...

Image

However, the new build has a couple of issues:

1. With an existing database (from the previous build), properties are shown only once indexing has completed. Deleting the database fixes this issue.

2. Crash dump when rebuilding the index, at the point where property indexing should start. It's inconsistent though - happened 3 out of 4 times.
(dumps were sent via email) - Note: is it possible the debug log filename would have some random seed/timestamp appended? I keep overwriting those by mistake.

Thanks for the quick response!

JTCGiants56
Posts: 108
Joined: Fri Nov 28, 2014 3:58 pm

Re: Is property indexing multi-threaded

Post by JTCGiants56 » Mon Mar 22, 2021 11:15 pm

What is the max index request value for default indexing options in settings > indexes (ie file size, date created, etc)?

I'd like to match indexed_property_max_request with this if possible.

void
Site Admin
Posts: 6462
Joined: Fri Oct 16, 2009 11:31 pm

Re: Is property indexing multi-threaded

Post by void » Tue Mar 23, 2021 11:07 am

JTCGiants56 wrote:
What is the max index request value for default indexing options in settings > indexes (ie file size, date created, etc)?

The file size and date modified properties are different: they are read during the initial index, usually from the NTFS MFT.
They are read using separate device threads (if supported).

You can use the value of 0 for indexed_property_max_request to request all properties.
The default value is 1024.

Properties are indexed in "chunks".
indexed_property_max_request determines how many files are in each of these chunks.
While these chunks do support multiple threads, they are usually filled with files from the same device.
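
A small sketch of why chunking limits parallelism across drives, assuming the pending file list is ordered by device (an assumption made here for illustration; `chunk_requests` and `devices_per_chunk` are hypothetical helpers, not Everything functions). With the default of 1024 every chunk holds files from a single device, whereas 0 (unlimited) produces one request that spans all devices.

```python
def chunk_requests(files, max_request=1024):
    """Split files into request chunks; 0 means a single unlimited chunk."""
    if max_request == 0:
        return [files] if files else []
    return [files[i:i + max_request] for i in range(0, len(files), max_request)]

def devices_per_chunk(chunks):
    """Count how many distinct devices each chunk touches."""
    return [len({device for device, _ in chunk}) for chunk in chunks]

# Hypothetical pending list: 2048 files on device "D" followed by 2048 on device "J".
files = [("D", n) for n in range(2048)] + [("J", n) for n in range(2048)]

print(devices_per_chunk(chunk_requests(files)))      # default chunk size of 1024
print(devices_per_chunk(chunk_requests(files, 0)))   # 0 = unlimited
```

Only a chunk that mixes devices gives per-device threads something to do at the same time, which matches the speed-up seen with a value of 0.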




Thanks for the information aviasd,

Multiple thread support for indexing properties will need a lot more work.

If you use 0 for indexed_property_max_request, there is currently some oddness when you exit and restart Everything: the property index progress will no longer be shown, although Everything will continue to index your properties.

I'll look into the issue with the properties only showing after the indexing is complete.

aviasd wrote:
2. Crash dump when rebuilding the index, at the point where property indexing should start. It's inconsistent though - happened 3 out of 4 times.
(dumps were sent via email) - Note: is it possible the debug log filename would have some random seed/timestamp appended? I keep overwriting those by mistake.

Thank you for the mini crash dumps.
All of the mini crash dumps showed Everything trying to access a file that is no longer in memory.
I have fixed an issue with Everything not clearing results correctly in Everything 1.5.0.1249a.

Please let me know if the issue persists.

aviasd wrote:
Note: is it possible the debug log filename would have some random seed/timestamp appended? I keep overwriting those by mistake.

Added to my TODO list: add timestamp to log filename.

aviasd
Posts: 64
Joined: Sat Oct 07, 2017 2:18 am

Re: Is property indexing multi-threaded

Post by aviasd » Tue Mar 23, 2021 1:03 pm

void wrote:
Tue Mar 23, 2021 11:07 am
If you use 0 for indexed_property_max_request there is currently some oddness when you exit and restart Everything.. the property index progress will no longer be shown, although Everything will continue to index your properties.
Hi,
In the new build, I'm not seeing the issue you describe ("no progress is shown"); I do have progress shown.
My issue is that the property columns do not get populated (nothing is shown in the UI for those columns) until Everything has completely finished indexing in the background.
When it has finished, all the properties appear and I can query based on a column, e.g.:

width:>1

But while indexing,

width:>1
does not return anything.
(I know some images had already been indexed before the query, since I saw some under Options -> Properties.)
Deleting the database and starting fresh resolves this, but a force rebuild still has the issue.
Image

Update: Whoops, I missed this line there, thanks for that:
void wrote:
I'll look into the issue with the properties only showing after the indexing is complete.
Update 2: Ah, you meant that progress is not shown if Everything is closed while indexing. Yep, I do see this as well :oops:

void wrote:
Thank you for the mini crash dumps.
All of the mini crash dumps showed Everything trying to access a file that is no longer in memory.
I have fixed an issue with Everything not clearing results correctly in Everything 1.5.0.1249a.

Please let me know if the issue persists.
After two attempts I did not have a crash dump, so it seems to be working now.
Thanks!

JTCGiants56
Posts: 108
Joined: Fri Nov 28, 2014 3:58 pm

Re: Is property indexing multi-threaded

Post by JTCGiants56 » Tue Mar 23, 2021 10:08 pm

void wrote:
Tue Mar 23, 2021 11:07 am
What is the max index request value for default indexing options in settings > indexes (ie file size, date created, etc)?
File size, date modified properties are different, these are read during the initial index. Usually from the NTFS MFT.
File size, date modified properties are read using separate device threads (if supported).

You can use the value of 0 for indexed_property_max_request to request all properties.
The default value is 1024.

properties are indexed in "chunks".
indexed_property_max_request determines how many files are in these "chunks".
while these chunks do support multiple threads, they are usually filled with files from the same device.

Thanks for the info, but for some reason I'm not wrapping my head around how it works. I have my files scattered across 8+ drives and was not looking to request all properties, just the few custom ones I have set now.

If I set this to 0, it will read properties from all drives simultaneously, thus speeding up the process?

void
Site Admin
Posts: 6462
Joined: Fri Oct 16, 2009 11:31 pm

Re: Is property indexing multi-threaded

Post by void » Wed Mar 24, 2021 9:46 am

Thank you for your feedback JTCGiants56 and aviasd,
JTCGiants56 wrote:
If I set this to 0, it will read properties from all drives simultaneously, thus speeding up the process?

Yes, this will speed up your initial "Indexing Properties" at the cost of higher RAM usage.

aviasd wrote:
My issue is that the property columns do not get populated ( nothing is shown in the UI for those columns) until everything completely finished indexing in the background.

This should be fixed in Everything 1.5.0.1250a.

aviasd
Posts: 64
Joined: Sat Oct 07, 2017 2:18 am

Re: Is property indexing multi-threaded

Post by aviasd » Thu Mar 25, 2021 11:40 am

void wrote:
Wed Mar 24, 2021 9:46 am

This should be fixed in the Everything 1.5.0.1250a.
Confirmed :mrgreen:

void
Site Admin
Posts: 6462
Joined: Fri Oct 16, 2009 11:31 pm

Re: Is property indexing multi-threaded

Post by void » Sat Mar 27, 2021 6:37 am

Everything 1.5.0.1251a makes the following changes:
  • Removed indexed_property_max_request; all indexed properties are now requested.
  • Use multiple threads for SSDs by default.
  • Fixed an issue with the property indexing progress being lost after exiting.
I will trial these settings, and if it hurts system performance too much I will revert these changes.

aviasd
Posts: 64
Joined: Sat Oct 07, 2017 2:18 am

Re: Is property indexing multi-threaded

Post by aviasd » Mon Mar 29, 2021 12:40 pm

void wrote:
Sat Mar 27, 2021 6:37 am
Everything 1.5.0.1251a makes the following changes:
  • Removed indexed_property_max_request -all indexed properties are requested.
  • Use multiple threads for SSDs by default.
  • Fixed an issue with the property indexing progress being lost after exiting.
I will trial these settings, and if it hurts system performance too much I will revert these changes.

Changes in indexing performance:
Indexed: Width (images + video) and Length (video), 451,509 items

Updated results (dunno what happened with the first attempts):

Everything 1.5.0.1251a = 07:14 minutes
Everything 1.5.0.1248a with /indexed_property_max_request=0 = 08:06 minutes
So, some improvement after all.
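
For a sense of scale, the two timings can be turned into a percentage and a throughput figure. This assumes the times are minutes:seconds and that both runs indexed the same 451,509 items; `to_seconds` and `compare` are just illustrative helpers.

```python
def to_seconds(mmss):
    """Convert a 'mm:ss' string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

def compare(old, new, items):
    """Summarise the time saved and the items indexed per second."""
    old_s, new_s = to_seconds(old), to_seconds(new)
    return {
        "saved_pct": round(100 * (old_s - new_s) / old_s, 1),
        "old_items_per_s": round(items / old_s),
        "new_items_per_s": round(items / new_s),
    }

summary = compare("08:06", "07:14", 451_509)
print(summary)
```

Under those assumptions, 1251a is roughly 11% faster than 1248a with /indexed_property_max_request=0, at a little over 1,000 items per second.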

void
Site Admin
Posts: 6462
Joined: Fri Oct 16, 2009 11:31 pm

Re: Is property indexing multi-threaded

Post by void » Tue Mar 30, 2021 1:28 am

Thank you for your feedback aviasd,

I still need to make some improvements to the RAM usage.
Everything will use about 100MB (for 450,000 files) to request all this information. Once all properties are indexed, this memory is returned to the system.
This is not so bad for your case. However, for users indexing millions of files this might be an issue.
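
A back-of-the-envelope estimate of how this scales, assuming memory grows linearly with the number of files and taking 1 MB as 1,000,000 bytes (both are assumptions; the real per-file cost will depend on the properties requested). The helper names are hypothetical.

```python
def bytes_per_file(total_mb=100, files=450_000):
    """Approximate per-file cost from the figures quoted above."""
    return total_mb * 1_000_000 / files

def estimate_mb(files, per_file=None):
    """Linear extrapolation of temporary RAM use for a given file count."""
    per_file = bytes_per_file() if per_file is None else per_file
    return files * per_file / 1_000_000

print(round(bytes_per_file()))          # bytes per file
print(round(estimate_mb(4_500_000)))    # MB for 4.5 million files
```

That works out to a bit over 200 bytes per file, so several million files would need on the order of 1 GB while the request is in flight.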
