Hi everyone,

I am completely stuck and frustrated. I’ve spent months trying to solve what seems like an impossible task.

I have two text lists of music files (with titles, artists, and remixes). One list has about 300 names from the internet, and the other has around 500 filenames from my hard drives. I want to find the missing tracks from the internet list that I don't have on my PC yet.

I have tested thousands of scripts, regex patterns, Python, PowerShell, CMD, ChatGPT/Google AI, and professional text comparers. EVERY single tool returns massive false positives and completely fails.

Why? Because the strings are messy. The files on my disk and the list from the internet might be identical to the human eye, but they differ by a single trailing space, an extra space before a parenthesis like (Dj...), minor formatting differences, or folder paths. Because of this, standard text-matching tools treat them as completely different files.

Right now, my hand literally hurts from clicking because I have to manually copy and paste each of the 300 names into Everything one by one to verify if I already have a duplicate. It takes whole days and it is unbearable.

I know Everything is the most powerful search tool out there. Is there a way to import these two text lists into Everything (or use the diff: function) to find the absolute duplicates or missing files, while completely ignoring white spaces, formatting differences, and paths?

Please help me save my sanity and my hands. Thank you!
P.S. I cannot use any audio duplicate finders (like FinDupe, Similarity, etc.). The tracks cannot be compared by audio length or checksums because some sources cut off the ends of the songs right after the main beat ends to save size. This means the same song can have a completely different duration, file size, and audio waveform depending on where it was sourced.
Examples of strings that differ (spacing, etc.):
english/chinese title(ArtitsMixer
english/chinese title (ArtitsMixer
english/chinese title(ArtitsMixer
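
To make the goal concrete, here is a minimal Python sketch of the kind of comparison I am after. It is not an Everything feature, just an illustration. The names `normalize`, `find_missing`, and `AUDIO_EXTS` are made up for this example, and it assumes that disk entries use backslash paths and that ignoring case, all whitespace, and full-width vs. normal punctuation is acceptable.

```python
import os
import re
import unicodedata

# Assumed list of audio extensions to strip; adjust to what is on disk.
AUDIO_EXTS = {".mp3", ".flac", ".wav", ".m4a", ".ogg", ".wma"}


def normalize(entry: str) -> str:
    """Turn a title or a full file path into a comparison key."""
    # Drop any folder path (assumes Windows-style backslashes, so a
    # forward slash inside a title is left alone).
    name = entry.strip().rsplit("\\", 1)[-1]
    # Drop the extension only if it is a known audio extension.
    root, ext = os.path.splitext(name)
    if ext.lower() in AUDIO_EXTS:
        name = root
    # Unify full-width characters such as a full-width parenthesis.
    name = unicodedata.normalize("NFKC", name)
    # Ignore case and every kind of whitespace.
    return re.sub(r"\s+", "", name.casefold())


def find_missing(wanted, owned):
    """Return the entries of `wanted` whose key is not found in `owned`."""
    owned_keys = {normalize(line) for line in owned if line.strip()}
    return [w for w in wanted if w.strip() and normalize(w) not in owned_keys]


wanted = ["Song A (Dj X Remix) ", "Song B(Dj Y Mix)"]
owned = ["D:\\Music\\Song A(Dj X Remix).mp3"]
print(find_missing(wanted, owned))
```

With two real lists, each text file would be read line by line and passed in as `wanted` (internet list) and `owned` (disk list). Real typos or different word order would still not match with this approach; that would need fuzzy matching on top.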