I am using Everything to search multiple drives for duplicates by MD5.
There are many files, so the query takes a couple of hours.
Yesterday the search results were ready by the end of the day, but I had no time left to manage them (deleting unnecessary duplicates).
Before closing Everything, I exported the results to a .csv file (180 MB).
Today, Everything starts the same search from scratch.
Question: is there a way to use the previously saved search results?
Thanks!
How to save and re-use MD5 duplicate search results?
-
TheBestPessimist
- Posts: 46
- Joined: Sat Jan 14, 2023 6:36 pm
Re: How to save and re-use MD5 duplicate search results?
I can't help you with your problem (unless you open the CSV in Excel and search for the duplicates there), but I want to say that I stopped using Everything for managing any computed file/folder property, for example a file hash (MD5, SHA, it doesn't matter).
I found multiple problems, at least in my workflow:
dupe:md5 <multiple | paths | here>
- Everything doesn't remember 'ad-hoc' computed properties: i.e. if I add a column for MD5 and it computes the hash for all the visible files on my local drive, they're all gone the next time I start Everything.
- Everything doesn't carry properties over when files/folders are moved: if I have C:/a/b/c and move "a" to D:/a/b/c, all computed properties are lost.
- The Everything interface is completely frozen while indexing properties, and if I then move a file, it freezes again, so I cannot search for anything, not even a normal name/path search.
- I'm sure there was something more, but I can't remember it on the spot.
I have had these issues both on local drives (HDD or SSD, it doesn't matter) and on mounted SMB shares.
All I can say is: please pay very close attention to which properties you're indexing, especially if they're calculated over a large number of files. There are many papercuts, some bigger than others.
Re: How to save and re-use MD5 duplicate search results?
MD5 information is cached per tab.
The simple answer is to keep the tab open.
Please try opening your csv as a file list under File menu -> Open File List.
MD5 should be stored in the CSV file.
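If the exported CSV does contain the hashes, the duplicate groups can also be rebuilt outside Everything without re-hashing anything. A minimal sketch in Python follows; the column names "Filename" and "MD5" are assumptions, so check the header row of your own export and adjust them.

```python
import csv
from collections import defaultdict


def find_duplicates(csv_path, hash_column="MD5", path_column="Filename"):
    """Group file paths by hash and keep only hashes seen more than once.

    hash_column and path_column are assumed names, not confirmed
    Everything export headers.
    """
    groups = defaultdict(list)
    # utf-8-sig tolerates a byte-order mark at the start of the file
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        for row in csv.DictReader(handle):
            digest = (row.get(hash_column) or "").strip().lower()
            if digest:  # skip rows where no hash was computed
                groups[digest].append(row[path_column])
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

Calling find_duplicates on the export returns a dictionary that maps each repeated hash to the list of paths sharing it. The file is read row by row, so a 180 MB export should not be a problem.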
-
TheBestPessimist
- Posts: 46
- Joined: Sat Jan 14, 2023 6:36 pm
Re: How to save and re-use MD5 duplicate search results?
"MD5 information is cached per tab."
Proposal: after that info is calculated, why not save it to the database? This way it can be used in the future too.
Re: How to save and re-use MD5 duplicate search results?
Maybe an option for a more persistent property cache?
The property cache would persist between sessions.
The user would need to clear the cache manually.
I will consider such an option.
Use property indexing if you want to index md5.
Tools -> Options -> Properties -> Add property -> md5.
-
TheBestPessimist
- Posts: 46
- Joined: Sat Jan 14, 2023 6:36 pm
Re: How to save and re-use MD5 duplicate search results?
"where would this stop?
How do we choose what property to cache and what not to cache?"
It would not stop. At least not for calculated properties.
My use case is computing SHA1 for files which are stored in Google Drive. That access is slow and I cannot download all the TB of data I have locally, so I 'stream' the files and load some properties only for the folder(s) I am currently working in.
And when I try to make Everything precalculate all the SHA1 hashes from the parent folder of where I work, every so often Everything decides to nuke all the calculated properties, whether they came from adding a column to the view or from the Settings->Properties menu, and then reindexes them. This means I have lost hours in which I could not use Everything even for a basic search, because it was indexing the properties and the whole UI was frozen; sometimes even right-clicking the tray icon is frozen.
If those properties were saved, even only the 'ad-hoc' loaded ones, I would get a much greater benefit from Everything, as I have to do multiple searches throughout the day in the same 'set of folders'.
---
Why SHA1 and not MD5? On my laptop, SHA1 computes at SSD read speed (4 GB/s), while MD5 manages 500-600 MB/s at best. Maybe it's hardware accelerated? I don't know, but SHA1 is definitely faster.
Last edited by TheBestPessimist on Fri Dec 19, 2025 6:55 am, edited 1 time in total.
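The SHA1 vs MD5 speed difference is easy to check on your own machine. Many recent x86 CPUs have dedicated SHA instructions that cover SHA-1 and SHA-256 and have no MD5 equivalent, which may explain the gap, but whether that applies here is an assumption. The sketch below uses Python's hashlib, so the absolute numbers say nothing about Everything's own implementation; only the relative speed on the same machine is of interest.

```python
import hashlib
import time


def throughput_mb_per_s(algorithm, payload, rounds=5):
    """Return the best observed in-memory hashing speed in MiB per second."""
    best = float("inf")
    for _ in range(rounds):
        start = time.perf_counter()
        hashlib.new(algorithm, payload).hexdigest()
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading from a coarse timer
    return len(payload) / (1024 * 1024) / max(best, 1e-9)


if __name__ == "__main__":
    data = bytes(64 * 1024 * 1024)  # 64 MiB of zeros, hashed from memory
    for name in ("md5", "sha1"):
        print(f"{name}: {throughput_mb_per_s(name, data):.0f} MiB/s")
```

Hashing from memory takes disk and network speed out of the picture, so the result is an upper bound for a local SSD and far above anything a streamed Google Drive folder will deliver.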
-
TheBestPessimist
- Posts: 46
- Joined: Sat Jan 14, 2023 6:36 pm
Re: How to save and re-use MD5 duplicate search results?
"Maybe an option for a more persistent property cache?
property cache would persist between sessions.
User would need to manually clear cache."
I would be happy with that option.
1. Cache forever all the properties that I calculated (or loaded?), whether ad hoc or via Settings.
2. Do not delete those properties, even if the drive is not available (think Google Drive restarts, an external USB drive removed from the laptop, or a network share that is currently unavailable because I'm not at home).
3.1 Give an option 'Delete all cached properties older than X amount of time' and let me select the time (maybe this is not needed?).
3.2 Give an option 'Delete cached properties X,Y,Z for folder F (including subfolders)' and let me select the folder.
"I will consider such an option."
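To make the request above concrete, here is a rough sketch of such a cache as a standalone Python class backed by SQLite. It is not how Everything works internally; the class name HashCache, the table layout, and the method names are all hypothetical. Entries are keyed by path and validated against file size and modification time, and nothing is ever removed unless one of the two purge methods is called.

```python
import hashlib
import os
import sqlite3
import time


class HashCache:
    """Hypothetical persistent cache mapping a file path to its SHA-1 digest."""

    def __init__(self, db_path):
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS hashes ("
            "path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER, "
            "digest TEXT, cached_at REAL)"
        )

    def get(self, path):
        """Point 1: return the digest, hashing only if size or mtime changed."""
        stat = os.stat(path)
        row = self.db.execute(
            "SELECT digest FROM hashes WHERE path=? AND size=? AND mtime_ns=?",
            (path, stat.st_size, stat.st_mtime_ns),
        ).fetchone()
        if row:
            return row[0]
        sha1 = hashlib.sha1()
        with open(path, "rb") as handle:
            # Read in 1 MiB chunks so large files are never loaded whole
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                sha1.update(chunk)
        digest = sha1.hexdigest()
        self.db.execute(
            "INSERT OR REPLACE INTO hashes VALUES (?, ?, ?, ?, ?)",
            (path, stat.st_size, stat.st_mtime_ns, digest, time.time()),
        )
        self.db.commit()
        return digest

    def cached(self, path):
        """Point 2: last known digest, without touching a possibly offline file."""
        row = self.db.execute(
            "SELECT digest FROM hashes WHERE path=?", (path,)
        ).fetchone()
        return row[0] if row else None

    def purge_older_than(self, max_age_seconds):
        """Point 3.1: drop entries cached more than max_age_seconds ago."""
        cutoff = time.time() - max_age_seconds
        self.db.execute("DELETE FROM hashes WHERE cached_at < ?", (cutoff,))
        self.db.commit()

    def purge_folder(self, folder):
        """Point 3.2: drop entries for a folder, including its subfolders."""
        prefix = os.path.join(folder, "")  # ensures a trailing separator
        self.db.execute(
            "DELETE FROM hashes WHERE substr(path, 1, ?) = ?",
            (len(prefix), prefix),
        )
        self.db.commit()
```

Because the key is the path, this sketch would still lose entries when a folder is moved to another drive; surviving moves would need a key that does not depend on the path, which is left out here.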