How to find missing files from a text list using Everything diff/filelist? (Spaces/formatting issue)

Debugger
Posts: 719
Joined: Thu Jan 26, 2017 11:56 am

Post by Debugger »

Hi everyone,

I am completely stuck and frustrated. I've spent months trying to solve what seems like an impossible task.

I have two text lists of music files (with titles, artists, and remixes). One list has about 300 names from the internet, and the other has around 500 filenames from my hard drives. I want to find the tracks on the internet list that I don't have on my PC yet.

I have tested thousands of scripts, regex patterns, Python, PowerShell, CMD, ChatGPT/Google AI, and professional text comparers. EVERY single tool returns massive false positives and completely fails.

Why? Because the strings are messy. The files on my disk and the list from the internet might be identical to the human eye, but they differ by a single trailing space, an extra space before a parenthesis like (Dj...), minor formatting differences, or folder paths. Because of this, standard text-matching tools treat them as completely different files.

Right now, my hand literally hurts from clicking because I have to manually copy and paste each of the 300 names into Everything one by one to verify whether I already have a duplicate. It takes whole days and it is unbearable.

I know Everything is the most powerful search tool out there. Is there a way to import these two text lists into Everything (or use the diff: function) to find the exact duplicates or missing files, while completely ignoring white space, formatting differences, and paths?

Please help me save my sanity and my hands. Thank you!

P.S. I cannot use any audio duplicate finders (like FinDupe, Similarity, etc.). The tracks cannot be compared by audio length or checksums because some sources cut off the end of a song right after the main beat ends to save space. This means the same song can have a completely different duration, file size, and audio waveform depending on where it was sourced.

Examples of strings that differ only by spacing:

english/chinese title(ArtitsMixer
english/chinese title (ArtitsMixer
 english/chinese title(ArtitsMixer
void
Developer
Posts: 19863
Joined: Fri Oct 16, 2009 11:31 pm

Re: How to find missing files from a text list using Everything diff/filelist? (Spaces/formatting issue)

Post by void »

  • Download ES
  • Add ES to your PATH.
  • Download es_batch_converter-Debugger.zip
  • Open the html file.
  • Paste in your filenames.
  • Set your Everything instance.
  • Click Convert.
  • Click Copy.
  • Create a new BAT file and paste in the contents.
  • Save and run the BAT file; it will list the missing files.
Spaces are ignored in the search.
Adjust the script to your desired needs.
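
For anyone who wants to compare the two text lists directly, the same "ignore spaces" idea can be sketched in Python. This is not the converter script from the zip above, just a minimal illustration under stated assumptions: names are compared after dropping any Windows folder path, an optional audio extension, letter case, and all white space. The function names and the extension list are hypothetical and should be adjusted to the actual lists.

```python
import re

# Assumption: extensions that may appear on the disk list but not on the
# internet list. Adjust to match your own files.
AUDIO_EXTENSIONS = (".mp3", ".flac", ".wav", ".m4a")


def normalize(name):
    """Build a comparison key: no folder path, no extension, no white space."""
    # Keep only the part after the last backslash (drops a Windows folder path).
    base = name.strip().rsplit("\\", 1)[-1]
    key = base.casefold()
    # Remove one trailing audio extension, if present.
    for ext in AUDIO_EXTENSIONS:
        if key.endswith(ext):
            key = key[: -len(ext)]
            break
    # Delete every white-space character, including tabs and non-breaking spaces.
    return re.sub(r"\s+", "", key)


def find_missing(wanted, owned):
    """Return the entries of `wanted` whose key does not occur in `owned`."""
    owned_keys = {normalize(name) for name in owned}
    return [name for name in wanted if normalize(name) not in owned_keys]
```

With this key, the three sample strings from the first post compare as equal, and `find_missing(internet_list, disk_list)` returns only the internet entries with no whitespace-insensitive match on disk. Differences other than spacing, case, path, and extension (for example a misspelled artist) would still count as a mismatch.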