For a few months I've been de-duping using sizedupe: namedupe: date-createddupe: date-modifieddupe:
That was until I realised this can lead to removing unique files when the duplicates exist only within one of the two folders being compared.
That isn't a problem when you compare 2 folders you know have the same total size and file count, but that is not always the case.
So I figured a more accurate method was necessary. Because I export file lists anyway, I thought I could just modify an .efu to point to another location and then find out how big the files there are.
This half works: if there aren't too many files, you can select them, right-click and choose Properties.
The file count reported is the frozen one from the export and can be ignored. The size, however, reflects the live files.
But if the sizes don't match, this doesn't tell me which files are missing or smaller.
I hoped Read extended information would update the file list to reflect the actual files at the paths in the modified .efu, but it does nothing.
So how do I update an .efu that has had its paths modified, when a file at a modified path does not exist?
If there are 5000 files in the original export, but the modified paths point to only 4990 files, how do I make the search results report the other 10 as non-existent, maybe by having all their metadata removed besides the path? Then I could sort by any metadata other than path and de-select anything that doesn't exist.
If this is a confusing read, I can attempt to clarify in a reply.
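The question of which listed files are missing or smaller can also be answered outside Everything. Below is a minimal Python sketch, not an Everything feature. It assumes the .efu is a UTF-8 CSV file with a header row containing at least the `Filename` and `Size` columns; the function name `check_efu` is made up for this illustration.

```python
import csv
import os


def check_efu(efu_path):
    """Compare each row of an EFU file list with the live file system.

    Returns a list of (status, path, listed_size, live_size) tuples where
    status is "missing", "size-differs" or "ok".
    """
    results = []
    # Assumption: the EFU is UTF-8 CSV with "Filename" and "Size" columns.
    with open(efu_path, newline="", encoding="utf-8-sig") as handle:
        for row in csv.DictReader(handle):
            path = row["Filename"]
            listed = int(row["Size"]) if row.get("Size") else None
            if not os.path.isfile(path):
                # Existing folders are skipped; anything that is neither a
                # file nor a folder on disk is reported as missing.
                if not os.path.isdir(path):
                    results.append(("missing", path, listed, None))
                continue
            live = os.path.getsize(path)
            status = "ok" if live == listed else "size-differs"
            results.append((status, path, listed, live))
    # Sort so that missing and mismatched entries come first.
    order = {"missing": 0, "size-differs": 1, "ok": 2}
    results.sort(key=lambda item: (order[item[0]], item[1]))
    return results
```

Printing the entries whose status is not "ok" gives the list of files that do not exist at the modified paths, or whose size changed since the export.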
Update/calculate actual files in an exported .efu for de-duplication
Herkules97
- Posts: 220
- Joined: Tue Oct 08, 2019 6:42 am
Re: Update/calculate actual files in an exported .efu for de-duplication
Herkules97 wrote: Sun Apr 13, 2025 1:44 pm
If this is a confusing read, I can attempt to clarify in a reply.

Steps to recreate your environment would indeed help.
Some of the following might help (in random order):
1. Use dupe-from:
This can be used without using EFU file(s)
Search for:
Code:
"X:\first folder\" | "Y:\second folder\" dupe-from:"X:\first folder\" unique:name

2. Add EFU to the index
(If I understood correctly, this is what you have done)
This will give doubled items in the index for "Y:\second folder\"
Search for:
Code:
"Y:\second folder\" unique:path;name addcol:filelistfilename
Items without an entry in the filelistfilename column are on Y:, but not in the EFU
Items with an entry in the filelistfilename column are in the EFU, but not on Y:

3. Use the diff macro from the thread "Compare 2 file lists with each other?"
(when folder structures are largely identical)
Search for:
Code:
diff:"X:\first folder\";"Y:\second folder\"
Comparisons will be done based on name+path, size and date.
That can be changed in this part of the search query:
unique:regmatch1;size;dm
Herkules97
Re: Update/calculate actual files in an exported .efu for de-duplication
NotNull wrote: Sun Apr 13, 2025 7:22 pm
Herkules97 wrote: Sun Apr 13, 2025 1:44 pm
If this is a confusing read, I can attempt to clarify in a reply.
Steps to recreate your environment would indeed help.
Some of the following might help (in random order):
1. Use dupe-from:
This can be used without using EFU file(s)
Search for:
Code:
"X:\first folder\" | "Y:\second folder\" dupe-from:"X:\first folder\" unique:name
2. Add EFU to the index
(If I understood correctly, this is what you have done)
This will give doubled items in the index for "Y:\second folder\"
Search for:
Code:
"Y:\second folder\" unique:path;name addcol:filelistfilename
Items without an entry in the filelistfilename column are on Y:, but not in the EFU
Items with an entry in the filelistfilename column are in the EFU, but not on Y:
3. Use the diff macro from the thread "Compare 2 file lists with each other?"
(when folder structures are largely identical)
Search for:
Code:
diff:"X:\first folder\";"Y:\second folder\"
Comparisons will be done based on name+path, size and date.
That can be changed in this part of the search query:
unique:regmatch1;size;dm

I do not add the efus to an instance, I load them via the File drop-down. They are intentionally smaller-scale because usually it's just one folder or so.
A benefit of this is that I always export efus anyway. And by using a sacrificial instance, it doesn't freeze my active instances, which can cause a re-build when I'm done deleting files.
Anyway, on to what I do, in steps:
I have a folder I want to delete all files in.
I export the B:\ folder to an .efu.
I modify the .efu using Notepad++ so that the paths point to C:\ instead of B:\ for that folder.
I load the C:\ efu via the load file list option under the File drop-down (top-left) in a sacrificial instance.
I select all files, right-click, Properties. Check the size: is it the same as the B:\ folder? Then it is safe to delete everything in the B:\ folder.
Does it differ by even a byte from the B:\ folder? Well shit, now you have to figure out what the difference is.
Sometimes I've known what those are, often not.
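The Notepad++ edit and the Properties check in the steps above can be sketched in Python. This is an illustration under assumptions, not part of Everything: it assumes the .efu is UTF-8 CSV with `Filename` and `Size` columns, and `remap_efu` and `listed_and_live_totals` are hypothetical names. Whether Everything loads the rewritten file exactly like its own export (for example regarding quoting) is not verified here.

```python
import csv
import os


def remap_efu(src_efu, dst_efu, old_prefix, new_prefix):
    """Copy an EFU file list, replacing old_prefix with new_prefix in paths.

    Returns the number of rows whose path was rewritten.
    """
    changed = 0
    with open(src_efu, newline="", encoding="utf-8-sig") as src, \
         open(dst_efu, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames,
                                extrasaction="ignore")
        writer.writeheader()
        for row in reader:
            path = row["Filename"]
            # Windows paths are case-insensitive, so compare in lower case.
            if path.lower().startswith(old_prefix.lower()):
                row["Filename"] = new_prefix + path[len(old_prefix):]
                changed += 1
            writer.writerow(row)
    return changed


def listed_and_live_totals(efu_path):
    """Return (bytes according to the list, bytes actually found on disk)."""
    listed_total = 0
    live_total = 0
    with open(efu_path, newline="", encoding="utf-8-sig") as handle:
        for row in csv.DictReader(handle):
            if row.get("Size"):
                listed_total += int(row["Size"])
            if os.path.isfile(row["Filename"]):
                live_total += os.path.getsize(row["Filename"])
    return listed_total, live_total
```

If the two totals are equal, the result corresponds to the "same size" case in the steps; if they differ, a per-file check is still needed to find out which files account for the difference.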
I am not sure I want to use dupe-from: either. It would run into the same issue as the 4-property method, where I don't actually know whether I am deleting exactly the files I have elsewhere.
It's also why I stopped using dupcleaner pro. I just don't trust the method.
Granted, maybe I could just compare file count and total size if I use dupe-from:.
I still think the .efu method is the safest, but it seems to lack a way to temporarily update the .efu with live data.
This can be a one-time trigger, not constant monitoring. Just a single check of each live file to see whether it exists where the .efu says.
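For the worry about not knowing whether the files elsewhere are exactly the same, a one-time check can go further than size and dates and compare contents. The sketch below uses Python's standard `filecmp.cmp` with `shallow=False`; the function name `identical_copy_exists` and the idea of mapping paths by their position relative to a root folder are assumptions made for this illustration.

```python
import filecmp
import os


def identical_copy_exists(original, old_root, new_root):
    """Return True if the file under new_root matches original byte for byte."""
    # Map e.g. <old_root>/sub/file to <new_root>/sub/file.
    relative = os.path.relpath(original, old_root)
    candidate = os.path.join(new_root, relative)
    if not os.path.isfile(candidate):
        return False
    # shallow=False forces a content comparison instead of trusting
    # size and modification time alone.
    return filecmp.cmp(original, candidate, shallow=False)
```

This reads both files in full, so it is slower than a metadata comparison, which fits a one-time check better than constant monitoring.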
I also think this method would require the name to be kept too. It's the only part, without full row select enabled, that can be selected.
Name and path would stay; everything else can be emptied to imply the file does not exist at the C:\ location.
Sort by anything other than name and path, and the files that do not exist on C:\ would end up together at either the top or the bottom.
Select all of those and copy path, add as exclusions in a search for the B:\ folder.
Now invert selection on C:\ so it points to all files that do exist.
Right-click for properties for both locations and confirm that the size matches with the partial results for both.
Now the files remaining on B:\ are probably few enough that you can manually go through and see where copies may exist.
Or just let them be. Usually not large enough to warrant emptying the B:\ folder entirely if some remain.
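The select, exclude and invert workflow described above amounts to splitting the B:\ files into those with a counterpart on C:\ and those without. Here is a rough Python sketch of that split, working on two live folders rather than on an .efu. `split_by_counterpart` is a hypothetical helper, and "counterpart" here means only the same relative path and the same size, which is an assumption of this example rather than a proof of identical content.

```python
import os


def split_by_counterpart(source_root, target_root):
    """Split files under source_root by whether target_root holds a file of
    the same size at the same relative path.

    Returns (matched, remaining), each a sorted list of (relative_path, size).
    """
    matched, remaining = [], []
    for folder, _subfolders, files in os.walk(source_root):
        for name in files:
            source = os.path.join(folder, name)
            relative = os.path.relpath(source, source_root)
            target = os.path.join(target_root, relative)
            size = os.path.getsize(source)
            if os.path.isfile(target) and os.path.getsize(target) == size:
                matched.append((relative, size))
            else:
                remaining.append((relative, size))
    return sorted(matched), sorted(remaining)
```

The `matched` list corresponds to the inverted selection whose total size should agree on both locations, and `remaining` is the usually short list of files left on B:\ to go through manually.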