Find Duplicate Images in Your Library (2026 Guide)
On this page
- Why reference libraries fill up with duplicates
- Before you start
- Step 1: Run the duplicate search
- Step 2: Review each duplicate pair
- Step 3: Delete the copies you do not need
- Step 4: Use "Find similar" for near-duplicates pHash misses
- Step 5: Set up a smart folder to catch future duplicates
- Common problems and fixes
- How other tools compare on duplicate detection
- Next steps
- Frequently asked questions
By refern. Last updated: June 2026.
The short answer: type is:duplicate in refern's search bar and you get a list of all near-duplicate pairs in your library, detected by perceptual hashing on your local machine. No cloud, no upload, no waiting. From there you review each pair and delete the copies you do not want.
The rest of this guide walks through exactly how to do that, why duplicates accumulate in the first place, and what to do when the results need a closer look.
Why reference libraries fill up with duplicates
If you have been collecting references for more than a few months, you almost certainly have duplicates. They get in through several routes:
- You save the same image twice from the browser because you forgot you already had it.
- You import a folder that overlaps with one you imported before.
- You drag an image out of a canvas or project folder into your main library, not realizing you already collected the original.
- A site serves the same image at two different URLs (different crop, different resolution, same visual content) and you saved both.
- You downloaded a JPEG and a PNG of the same illustration from different sources.
None of these are failures of discipline. They are just what happens when a library grows organically. The problem is that once you have several thousand images, spotting a duplicate by eye is not practical.
Before you start
You need refern installed and a workspace set up. If you have not done that yet, download it from refern.app and follow the first-run prompts to point refern at the folder where your references live. refern indexes the files in place and never copies or moves them.
Wait for the initial indexing to complete (you will see the progress bar in the pipeline card disappear). For libraries under 50,000 files this usually takes a few minutes. For larger libraries it takes longer, but you can close the app and it will resume where it left off on next open.
Once indexing is done, your images all have their pHash values computed and stored in the local database. The duplicate search is a fast query against those stored values, not a real-time scan.
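To make the "hash once, query later" idea concrete, here is a minimal Python sketch. It is not refern's implementation: this guide does not say which perceptual-hash variant refern uses, so the sketch substitutes a simple difference hash (dHash) over a tiny grayscale grid. The names dhash, hamming, and index are illustrative assumptions.

```python
from typing import Dict, List

Gray = List[List[int]]  # rows of grayscale values, 0-255


def dhash(pixels: Gray) -> int:
    """Difference hash: one bit per horizontally adjacent pixel pair.

    A real pipeline would first shrink the image to a small fixed grid;
    here the caller passes that grid in directly.
    """
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# "Index time": hash every image once and store the result.
original = [[10, 40, 90, 200], [200, 90, 40, 10], [5, 5, 80, 80]]
resaved = [[12, 41, 88, 198], [199, 92, 41, 9], [6, 5, 79, 81]]  # slight noise
unrelated = [[200, 90, 40, 10], [10, 40, 90, 200], [80, 80, 5, 5]]

index: Dict[str, int] = {
    "original.png": dhash(original),
    "resaved.jpg": dhash(resaved),
    "unrelated.png": dhash(unrelated),
}

# "Query time": only the stored integers are compared, no pixels are re-read.
near = hamming(index["original.png"], index["resaved.jpg"])
far = hamming(index["original.png"], index["unrelated.png"])
```

The resaved copy lands only a bit or so away from the original, while the unrelated image differs in most bits. That gap is what a near-duplicate query relies on.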
Step 1: Run the duplicate search
Click the search bar at the top of the library view (or press Ctrl+K / Cmd+K to open the search overlay).
Type:
is:duplicate
Hit Enter. refern queries all pHash values in the library and returns every image that has at least one near-identical counterpart.
The result set shows all members of each duplicate group. If you have three copies of the same image, all three appear. If your library holds 500 images and 20 of them are duplicates forming 10 pairs, you will see 20 results.
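The counting above can be sketched in a few lines of Python. This is an illustration under assumptions, not refern's code: the threshold of 2 bits, the 16-bit hashes, and the file names are all made up, and chained matches are merged into one group.

```python
from itertools import combinations
from typing import Dict, List

THRESHOLD = 2  # assumed tolerance in bits; refern's real value is not stated here


def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")


def duplicate_groups(hashes: Dict[str, int]) -> List[List[str]]:
    """Group files whose hashes are within THRESHOLD bits of each other."""
    parent = {name: name for name in hashes}

    def find(name: str) -> str:
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for a, b in combinations(hashes, 2):
        if hamming(hashes[a], hashes[b]) <= THRESHOLD:
            parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for name in hashes:
        groups.setdefault(find(name), []).append(name)
    # Only groups with two or more members count as duplicates.
    return [sorted(g) for g in groups.values() if len(g) > 1]


hashes = {
    "cat.png": 0b1111_0000_1111_0000,
    "cat_copy.jpg": 0b1111_0000_1111_0001,   # 1 bit away from cat.png
    "cat_small.png": 0b1111_0000_1111_0011,  # 2 bits away from cat.png
    "dog.png": 0b0000_1111_0000_1111,
    "tree.png": 0b1010_1010_0101_0101,
}

groups = duplicate_groups(hashes)
results = [name for group in groups for name in group]
```

Here the three cat files form one group and all three land in the results, while the dog and tree images are left out because they have no counterpart. Comparing every pair is fine for a sketch but grows quadratically, so a real index would likely use something smarter.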
You can combine the operator with other filters. For example:
- is:duplicate tag:anatomy shows only duplicate images that are also tagged "anatomy."
- is:duplicate in:moodboard-folder narrows the scan to one folder.
- is:duplicate rating:>=3 shows only highly-rated duplicates, which is useful when you want to confirm you are keeping the better copy before deleting the other.
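One way to think about combined filters is that every term is a separate test and an image must pass all of them. The Python sketch below models that reading. The Image record, the parse function, and the assumption that terms are ANDed together are all hypothetical. It only covers the operators shown in this guide and says nothing about how refern parses queries internally.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class Image:
    # Hypothetical record; not refern's schema.
    name: str
    folder: str
    rating: int = 0
    tags: Set[str] = field(default_factory=set)
    is_duplicate: bool = False


def parse(query: str) -> List[Callable[[Image], bool]]:
    """Turn space-separated key:value terms into predicates (all must match)."""
    predicates: List[Callable[[Image], bool]] = []
    for term in query.split():
        key, _, value = term.partition(":")
        if key == "is" and value == "duplicate":
            predicates.append(lambda img: img.is_duplicate)
        elif key == "tag":
            predicates.append(lambda img, v=value: v in img.tags)
        elif key == "in":
            predicates.append(lambda img, v=value: img.folder == v)
        elif key == "rating":
            if value.startswith(">="):
                predicates.append(lambda img, n=int(value[2:]): img.rating >= n)
            else:
                predicates.append(lambda img, n=int(value): img.rating == n)
        else:
            raise ValueError(f"unsupported term: {term}")
    return predicates


def search(images: List[Image], query: str) -> List[str]:
    predicates = parse(query)
    return [img.name for img in images if all(p(img) for p in predicates)]


library = [
    Image("arm_1.png", "studies", 4, {"anatomy"}, True),
    Image("arm_1_copy.jpg", "studies", 0, {"anatomy"}, True),
    Image("forest.png", "moodboard-folder", 5, {"landscape"}, True),
    Image("forest_dup.png", "moodboard-folder", 2, set(), True),
    Image("unique.png", "studies", 5, {"anatomy"}, False),
]

anatomy_dupes = search(library, "is:duplicate tag:anatomy")
folder_dupes = search(library, "is:duplicate in:moodboard-folder")
rated_dupes = search(library, "is:duplicate rating:>=3")
```

Each extra term can only shrink the result set, which is why adding a tag or folder filter is a quick way to cut a long duplicate list down to a reviewable size.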
Step 2: Review each duplicate pair
The search returns a flat grid. You want to look at each pair side by side before deleting.
Click any image to open the metadata sidebar on the right. Check the source URL, rating, tags, and notes. This tells you which copy has more metadata attached to it, which copy you imported more recently, and which you rated higher.
For a closer look, right-click the image and choose "Open original." This opens the file directly in your OS default viewer, so you can compare the full resolution of both copies.
Things to check:
- Resolution. If one copy is 4000 x 3000 and the other is 800 x 600, keep the larger one.
- Format. A PNG original is usually better than a JPEG recompression. Check the file extension in the filename or metadata panel.
- Metadata richness. If one copy has tags, a source URL, and a rating while the other is bare, keep the richer one.
- Tags and links. If one copy is tagged and linked to canvases or groups, deleting it breaks those connections. Prefer to delete the untagged copy.
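If you want to reason about the checklist above more mechanically, it can be expressed as a sort key. The Python sketch below is only an illustration. The Copy record and its field names are hypothetical, and the priority order (links first, then resolution, format, metadata) is one reasonable assumption rather than a rule refern applies.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Copy:
    # Hypothetical record; field names are illustrative, not refern's schema.
    path: str
    width: int
    height: int
    tags: List[str] = field(default_factory=list)
    source_url: str = ""
    rating: int = 0
    linked_canvases: int = 0


def keep_priority(copy: Copy) -> tuple:
    """Sort key: higher tuples are better candidates to keep."""
    lossless = copy.path.lower().endswith(".png")
    metadata_fields = len(copy.tags) + bool(copy.source_url) + (copy.rating > 0)
    return (
        copy.linked_canvases > 0,   # deleting a linked copy breaks connections
        copy.width * copy.height,   # prefer the larger resolution
        lossless,                   # prefer a PNG over a JPEG recompression
        metadata_fields,            # prefer the richer metadata
    )


pair = [
    Copy("refs/hand_study.jpg", 800, 600),
    Copy("refs/hand_study.png", 4000, 3000, tags=["anatomy"], rating=4),
]
keeper = max(pair, key=keep_priority)
to_review_for_deletion = [c for c in pair if c is not keeper]
```

The result is a suggestion to check by eye, not an automatic verdict. A small linked copy can outrank a large unlinked one here, and you may prefer to move the links over to the better file instead.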
Step 3: Delete the copies you do not need
Select the image you want to remove. Press Delete or right-click and choose "Move to trash." refern soft-deletes the entity: the thumbnail and index entry are removed, but the original file on disk is moved to refern's trash area, not permanently deleted yet. You have a window to undo.
To permanently remove files from disk, open the trash (Settings, Trash), select the items, and choose "Delete permanently."
If you have many duplicates, you can multi-select in the duplicate search results. Hold Shift and click to range-select, or Ctrl/Cmd and click to pick individual images. Then delete all selected at once.
A practical order of operations:
- Sort by date added (oldest first) so the original tends to appear before the repeat.
- Scan through, rating the best copy with at least one star if you have not already.
- Filter to is:duplicate rating:0 (no rating) and delete all of those. This leaves only the rated copies.
- Re-run is:duplicate to confirm the count dropped.
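One caveat with the unrated sweep: if no copy in a group was rated, deleting every unrated duplicate removes all of them. The Python sketch below shows the check worth doing in your head before the bulk delete. The data layout and the plan_cleanup helper are hypothetical and exist only to illustrate the logic.

```python
from typing import Dict, List, Tuple

# Hypothetical data: each duplicate group maps file name -> star rating (0 = unrated).
Group = Dict[str, int]


def plan_cleanup(groups: List[Group]) -> Tuple[List[str], List[Group]]:
    """Return (files safe to delete, groups that still need a rating pass)."""
    to_delete: List[str] = []
    needs_review: List[Group] = []
    for group in groups:
        rated = [name for name, stars in group.items() if stars > 0]
        if not rated:
            # Deleting every unrated file here would remove all copies.
            needs_review.append(group)
            continue
        to_delete.extend(name for name, stars in group.items() if stars == 0)
    return to_delete, needs_review


groups = [
    {"pose_a.png": 3, "pose_a_copy.jpg": 0},
    {"sky_1.jpg": 0, "sky_2.jpg": 0},  # nobody rated a keeper yet
    {"armor.png": 5, "armor_small.png": 0, "armor_old.jpg": 0},
]
to_delete, needs_review = plan_cleanup(groups)
```

In practice this means finishing the rating pass before the sweep. Any group where both copies are still unrated should be reviewed rather than deleted in bulk.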
Step 4: Use "Find similar" for near-duplicates pHash misses
pHash catches images that are visually nearly identical, but it allows only a small tolerance. A heavily cropped version of the same image, or one with a strong color grade applied, may not match.
For those cases, right-click any image in the library or on a canvas and choose "Find similar." This opens a radial menu with a visual similarity search. It uses a 512-byte local descriptor (HSV histogram, dominant colors, color layout, edge histogram) rather than pHash, so it catches softer matches.
This is useful for:
- Finding that you saved both a tight crop and the full original.
- Spotting images from the same photoshoot (same lighting, same model, slightly different pose).
- Locating multiple versions of an illustration that went through different color palettes.
Visual similarity results appear in the search overlay, ranked by similarity score. Review them the same way as the pHash results.
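To see why a color-based descriptor catches matches that a structural hash misses, here is a minimal Python sketch of one ingredient named above, the HSV histogram. It is a simplification under assumptions: it uses hue only, the 8-bin layout is invented, and the pixel lists stand in for real images. The actual 512-byte descriptor layout is not documented in this guide.

```python
import colorsys
from typing import List, Tuple

RGB = Tuple[int, int, int]
BINS = 8  # assumed bin count; the real descriptor layout is not documented here


def hue_histogram(pixels: List[RGB]) -> List[float]:
    """Normalized histogram of hue values across the image's pixels."""
    counts = [0] * BINS
    for r, g, b in pixels:
        h, _s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        counts[min(int(h * BINS), BINS - 1)] += 1
    total = len(pixels)
    return [c / total for c in counts]


def similarity(a: List[float], b: List[float]) -> float:
    """Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint."""
    return sum(min(x, y) for x, y in zip(a, b))


full = [(200, 40, 40)] * 6 + [(40, 40, 200)] * 4        # mostly red, some blue
tight_crop = [(200, 40, 40)] * 5 + [(40, 40, 200)] * 1  # crop keeps mostly red
other = [(40, 200, 40)] * 10                            # all green

query = hue_histogram(full)
candidates = {"tight_crop": tight_crop, "other": other}
ranked = sorted(
    candidates.items(),
    key=lambda item: similarity(query, hue_histogram(item[1])),
    reverse=True,
)
```

The crop changes which pixels are present but keeps roughly the same color mix, so it still ranks well above the unrelated image. A histogram ignores where colors sit in the frame, which is presumably why the descriptor also includes layout and edge components.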
Step 5: Set up a smart folder to catch future duplicates
Once your library is clean, you want to stop duplicates from piling up again.
Go to the smart folders panel (left sidebar, the folder icon with a filter mark). Create a new smart folder. Set its query to is:duplicate. Name it "Duplicates."
Now whenever duplicates accumulate, that folder shows a non-zero count as a visual reminder. You can check it after a big import session instead of having to remember to run the search manually.
Common problems and fixes
"I ran is:duplicate and got zero results even though I know I have duplicates."
Check that indexing finished. If the pipeline progress card is still visible, the pHash values have not all been computed yet. Wait for it to complete, then retry.
"I deleted the wrong copy and now the better file is in trash."
Go to Settings, Trash. Find the file, right-click, and choose Restore. It returns to its original folder with all metadata intact.
"The duplicate search returned two images that look completely different to me."
pHash is a visual hash, not a byte hash. Two very similar-looking abstract textures or solid-color gradients can produce close hashes. Review the pair manually. If they are genuinely different, keep both. The false-positive rate is low but not zero.
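The effect is easy to reproduce with a toy hash. The Python sketch below uses a simple difference hash as a stand-in, which is an assumption, since refern's exact pHash variant is not described here. It shows only the general failure mode: low-detail images leave a structural hash very little to distinguish them by.

```python
def dhash(pixels):
    """Difference hash: one bit per horizontally adjacent pixel pair."""
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


# Two flat fills with very different brightness.
solid_dark = [[30] * 5 for _ in range(4)]
solid_light = [[220] * 5 for _ in range(4)]

# Two left-to-right gradients with different ranges.
gradient_a = [[0, 50, 100, 150, 200] for _ in range(4)]
gradient_b = [[100, 110, 120, 130, 140] for _ in range(4)]

same_flat = dhash(solid_dark) == dhash(solid_light)
same_gradient = dhash(gradient_a) == dhash(gradient_b)
```

Both comparisons come out equal even though the images are clearly different to the eye, because the hash records only the direction of brightness changes. Other perceptual hashes differ in detail, but flat fills and smooth gradients are the kind of content where a manual check is worth the time.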
"I have thousands of results and no time to review them all."
Use the combination filters described in Step 1. Start with is:duplicate rating:0 tag: (no rating, specific tag category) to pick off the lowest-value duplicates first. Work through categories rather than trying to process everything at once.
How other tools compare on duplicate detection
PureRef, BeeRef, and Allusion do not have any form of duplicate detection.
PureRef has no search, no metadata index, and no tag system at all. pureref.com/handbook/features confirms this. Finding a duplicate in PureRef means scrolling the canvas by eye. At any board size above a few dozen images this is impractical.
BeeRef (free, open source, GPL-3.0) also has no search or metadata layer. The feature list at beeref.org shows a canvas-focused tool with no library system. There is no path to duplicate detection without a database, and BeeRef has none.
Allusion (free, GPL-3.0) is the closest competitor in spirit because it does index files by tag and folder, but it has no pHash, no visual similarity, and no duplicate operator. A GitHub issue documents a RAM crash at 358 images during thumbnail generation (issue #640), and the project has been effectively unmaintained since February 2023, so this gap is unlikely to close. The project does have basic tag-based search, and hierarchical tags out of the box are a genuine strength for smaller libraries that do not need deduplication.
refern is the only one of these four tools that ships local duplicate detection. Because the pHash is computed at index time, the is:duplicate query runs against stored values instead of rescanning image files, so it stays fast as the library grows.
Next steps
Once your library is clean:
- Set up a smart folder for ongoing duplicate monitoring as described above.
- If you are coming from PureRef and want to understand what else refern adds, see the refern vs PureRef comparison.
- If you are evaluating whether to switch from Eagle or are managing a very large library, the best Eagle alternatives roundup covers the full landscape.
- Use is:duplicate combined with rating:>=3 occasionally to audit whether any of your highest-rated references have duplicates that could be cleaned up.
A clean library is faster to search, easier to browse, and less likely to surface the wrong version of an image when you are mid-project and in a hurry. Running the duplicate scan once a month, especially after a batch import session, keeps things manageable.
Frequently asked questions
What is perceptual hashing (pHash) for images?
Does refern send my images to a server to find duplicates?
Can refern find duplicates that are slightly different, such as a cropped or resaved copy?
Do PureRef, BeeRef, and Allusion have a duplicate detection feature?
How large a library can refern's duplicate scan handle?