Search Duplicate Files w/ different names

Discussion related to "Everything" 1.5.
komobu
Posts: 1
Joined: Mon Jun 01, 2026 12:10 pm

Search Duplicate Files w/ different names

Post by komobu »

I have about 3000 e-books ("epub" files) spread out over many directories on my computer, with different names. Take a look at the following:

A Summers Moon.epub
A Summers Moon(2).epub
A Summers Moon by William Smith.epub

All these files are the same book. What is the best way to search for duplicate files by content instead of File Name? I thought of size, but most of my epub files are really close in size.

Thanks for any help
void
Developer
Posts: 19863
Joined: Fri Oct 16, 2009 11:31 pm

Re: Search Duplicate Files w/ different names

Post by void »

Please try an exact size match first:

In Everything 1.5, search for:

Code:

*.epub dupe:size
This will instantly show possibly duplicated ebooks.



To find epubs that have the same content, search for:

Code:

*.epub dupe:size;sha256
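
For anyone who wants to run the same kind of check outside Everything, here is a minimal Python sketch. It assumes that "same content" means byte-identical files, i.e. the same size and the same SHA-256 digest. The function names `sha256_of` and `find_content_duplicates` are made up for this example and are not part of Everything.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_content_duplicates(root, pattern="*.epub"):
    """Group files under root that share both size and SHA-256 (hypothetical helper)."""
    # Cheap first pass: group by size so only size collisions get hashed.
    by_size = defaultdict(list)
    for path in Path(root).rglob(pattern):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a unique size cannot have a byte-identical twin
        by_hash = defaultdict(list)
        for path in same_size:
            by_hash[sha256_of(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Calling `find_content_duplicates(r"D:\Books")` (the path is only a placeholder) would return a list of groups, each group being a list of paths with identical bytes. Copies that were re-saved or had their metadata edited will not match, because a single changed byte changes the digest.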


For epubs with slightly different size, please try:

Code:

*.epub add-column:a a:=INT($size:/1024) dupe:a
Adjust 1024 as needed.
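
The bucketing idea behind that search can be sketched in Python as well. This assumes that `INT($size:/1024)` truncates the size to whole multiples of 1024 bytes; `group_by_size_bucket` and its `bucket_bytes` parameter are names invented for this example.

```python
from collections import defaultdict
from pathlib import Path


def group_by_size_bucket(root, bucket_bytes=1024, pattern="*.epub"):
    """Group files whose sizes fall into the same bucket (hypothetical helper)."""
    by_bucket = defaultdict(list)
    for path in Path(root).rglob(pattern):
        if path.is_file():
            # Integer division maps e.g. 0..1023 bytes to bucket 0,
            # 1024..2047 bytes to bucket 1, and so on.
            by_bucket[path.stat().st_size // bucket_bytes].append(path)
    # Keep only buckets that hold more than one file.
    return {b: sorted(ps) for b, ps in by_bucket.items() if len(ps) > 1}
```

One limitation of fixed buckets: two files just either side of a boundary (say 1023 and 1025 bytes) land in different buckets even though they differ by only two bytes, so a larger `bucket_bytes` trades precision for fewer missed pairs.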



For epubs with slightly different name, please try:

Code:

*.epub regex:name:^(.{12}) dupe:1
This will find epubs starting with the same first 12 characters.
Adjust 12 as needed.
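
As a rough illustration of the same prefix grouping, here is a Python sketch. `group_by_name_prefix` is a name invented for this example, and lower-casing the prefix is my own assumption (Windows file names are case-insensitive); it is not a statement about how Everything compares names.

```python
from collections import defaultdict
from pathlib import Path


def group_by_name_prefix(root, prefix_len=12, pattern="*.epub"):
    """Group files whose names start with the same characters (hypothetical helper)."""
    by_prefix = defaultdict(list)
    for path in Path(root).rglob(pattern):
        # Like ^(.{12}), skip names shorter than the prefix length.
        if path.is_file() and len(path.name) >= prefix_len:
            by_prefix[path.name[:prefix_len].lower()].append(path)
    # Keep only prefixes shared by more than one file.
    return {k: sorted(ps) for k, ps in by_prefix.items() if len(ps) > 1}
```

With the three example files from the first post, all of them share the 12-character prefix "A Summers Mo" and would end up in one group. Unrelated books with the same opening words would be grouped too, so the results still need a manual look.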



See also: Find duplicates in Everything 1.5
Herkules97
Posts: 220
Joined: Tue Oct 08, 2019 6:42 am

Re: Search Duplicate Files w/ different names

Post by Herkules97 »

komobu wrote: Mon Jun 01, 2026 1:57 pm
I have about 3000 e-books ("epub" files) spread out over many directories on my computer, with different names. Take a look at the following:

A Summers Moon.epub
A Summers Moon(2).epub
A Summers Moon by William Smith.epub

All these files are the same book. What is the best way to search for duplicate files by content instead of File Name? I thought of size, but most of my epub files are really close in size.

Thanks for any help

We don't know your exact setup.

If the files have different sizes, you can't use sizedupe.
If the files have different names, you can't use namedupe.
If the files have different content (which usually also means a different size), you can't use hashes like SHA256.
If the files aren't exactly the same, they likely won't have the same timestamps either, so you can't use date-created-dupe or date-modified-dupe. If you use Windows Explorer to copy, it only carries over the modified time, so the created time would be irrelevant anyway.

The best way is to not have gathered every copy in the first place; the second best is to de-duplicate manually.

Or you can do like me and just keep everything. For some songs I have 10 or more copies with varying metadata, sound quality and whatever other differences.