Add a "hide duplicate images" option to import dialogs#21319
Conversation
There was a problem hiding this comment.
This commit changes the 'swap' argument of dt_gui_preferences_bool function (which controls the position in the grid and has nothing to do with memory allocation). I've failed to see how this can fix "pre-existing memory leak". Could you please explain why the commit comment mentions a memory leak?
EDIT: Unexpected behavior from GitHub UI, I wrote this comment under a specific commit, and it is published in the general thread. It is about commit 71d8f2a
|
That's my bad; I made two back-to-back commits (in order to put the leak fix in its own), but added the wrong files to each. I'll add a comment on the offending commit to explain for any other reviewers The first leak fix is in The second leak fix is properly in its own commit, here (making a scope-local copy of the list so that we don't inhibit the caller's ability to free the |
|
You should do a proper PR with the proposed fix for 5.6 first i think. |
|
@jenshannoschwalm by that, do you mean separate out the memory leak fixes into their own PR? (I'm happy to do that, just making sure I understand you properly) |
Exactly.
|
|
I think this is an enhancement and not a bug. It's also somewhat of an edge cause because you have to be saving the same image to both cards and then importing from the camera The question I have is can one image be corrupt and the other not and still have the same file size. Would check summing be a better test of duplicate (though painfully slow). |
|
I separated out the memory leaks into their own PR (#21326), but seem to have goofed up this PR with my rebase. Fixing that now |
- skip duplicate checking when size or timestamp is not set
- resolve "chains" (all dupes point to the same canonical image)
- match UI defaults and config defaults ("hide duplicates" starts
unchecked)
"copy&import" dialogs.
|
Okay, rebase is untangled and should be good for further review
I actually did implement this at first -- the timestamp+size test marked "possible duplicate" images, then they were still copied from the camera and checksummed. It only discarded images if their "canonical" version's checksum matched perfectly (I had originally wanted to read the "thumbnail preview" for every image and compare the checksums of those, but I realized this would indeed be painfully slow). I later changed to the current behavior because I realized that doing it that way effectively doubled the import time. I suspect (without evidence) that a corrupted image would most likely be due to an incomplete write (and would thus have a different filesize), but even if that's not true I think it's fair to let the user handle that case on their own. The current system for detecting "already imported" doesn't take into account file checksums, only the file name and timestamp, so doing this the performant way (with metadata only) rather than the "catches any edge case" way (with checksums) felt like the right course. |
I have a camera with two SD card slots (Canon R7), and when I have "save to both cards" enabled, darktable shows both copies of every file in the "import from camera" window. This is frustrating, because I have to then sort by name and manually deselect all of the duplicates, or import them and delete them manually.
This PR adds a
hide duplicate imagesoption to the import dialogs, which looks at the timestamp (from the photo metadata) and filesize of every image. If two (or more) match, it considers the one with the alphabetically-first path the "canonical" version and marks the others with a "possible duplicate" flag. Checking the "hide duplicate images" box filters the possible duplicates out of the list. It only compares against other images in the same import job, not against anything already in the darktable library/database.This doesn't take into account the filename(s) of the images, which is how the "only new images" box works (in addition to timestamp), but relying on a size check felt more reliable to me. I'm happy to update that to match if you'd prefer.
This PR also fixes two small pre-existing memory leaks in the import dialog process.
note: I used AI tools to write most of this code, since C++ is not my strongest language. I reviewed the changes as carefully as I could and they all look reasonable