feat: add ImgEdit benchmark with edit type subsets by davidberenstein1957 · Pull Request #517 · PrunaAI/pruna

davidberenstein1957 · 2026-01-31T16:05:19Z

Closes #510

Summary

Add ImgEdit benchmark for image editing evaluation with 8 edit type subsets
Fetch instructions and judge prompts from GitHub (PKU-YuanGroup/ImgEdit)
Support subset filtering: replace, add, remove, adjust, extract, style, background, compose

Usage

from pruna.data import PrunaDataModule

# Load all edit types
dm = PrunaDataModule.from_string("ImgEdit")

# Load specific edit type
dm = PrunaDataModule.from_string("ImgEdit", subset="replace")
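The subset filter described above can be pictured as a plain filter over instruction records. The following is a minimal sketch, not the code from this PR: `filter_by_edit_type`, the `edit_type` key, and the sample records are hypothetical, and only the eight edit type names come from the summary. A later commit in this PR renames the `subset` parameter of `setup_imgedit_dataset` to `category`.

```python
from typing import Optional

# The eight edit types named in this PR's summary.
EDIT_TYPES = ("replace", "add", "remove", "adjust", "extract", "style", "background", "compose")


def filter_by_edit_type(records: list, subset: Optional[str] = None) -> list:
    """Return the records whose 'edit_type' equals subset, or all records if subset is None."""
    if subset is None:
        return list(records)
    if subset not in EDIT_TYPES:
        raise ValueError(f"Unknown edit type {subset!r}; expected one of {EDIT_TYPES}")
    return [r for r in records if r["edit_type"] == subset]


# Hypothetical records; the real benchmark fetches its instructions from GitHub.
records = [
    {"image_id": "0001", "edit_type": "replace", "prompt": "Replace the cat with a dog."},
    {"image_id": "0002", "edit_type": "style", "prompt": "Render the photo as a watercolor."},
]
replace_only = filter_by_edit_type(records, subset="replace")
```

Rejecting unknown names early keeps a typo in the subset from silently producing an empty dataset.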

Test plan

PrunaDataModule.from_string("ImgEdit") works
Subset filter works for all 8 edit types
Auxiliaries include subset, image_id, judge_prompt fields
Docstring tests pass
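To illustrate the auxiliaries item in the test plan, here is a hedged sketch of how one instruction record might be paired with the three auxiliary fields. `build_sample`, the record layout, and the judge prompt text are assumptions made for illustration; only the field names `subset`, `image_id`, and `judge_prompt` come from the test plan.

```python
def build_sample(record: dict, judge_prompts: dict) -> tuple:
    """Pair an edit instruction with the auxiliary fields named in the test plan."""
    subset = record["edit_type"]
    auxiliaries = {
        "subset": subset,
        "image_id": record["image_id"],
        # One judge prompt per edit type (assumed layout).
        "judge_prompt": judge_prompts[subset],
    }
    return record["prompt"], auxiliaries


# Placeholder judge prompt; the real prompts are fetched from PKU-YuanGroup/ImgEdit.
judge_prompts = {"replace": "Rate how well the requested object was replaced."}
record = {"image_id": "0001", "edit_type": "replace", "prompt": "Replace the cat with a dog."}
prompt, aux = build_sample(record, judge_prompts)
```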

Closes #510
- Add setup_imgedit_dataset in datasets/prompt.py
- Support subset filter (replace, add, remove, adjust, extract, style, background, compose)
- Fetch instructions and judge prompts from GitHub (PKU-YuanGroup/ImgEdit)
- Register ImgEdit in base_datasets
- Add BenchmarkInfo entry with accuracy metric, task_type image_edit
- Add test for loading with subset filter
Co-authored-by: Cursor <cursoragent@cursor.com>

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.


tests/data/test_datamodule.py

src/pruna/data/datasets/prompt.py

…nting
- Rename subset parameter to category in setup_imgedit_dataset
- Add empty dataset guard before ds.select([0])
- Fix trailing newlines linting issue
- Update tests to use category parameter
Co-authored-by: Cursor <cursoragent@cursor.com>
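The empty dataset guard mentioned in this commit can be sketched as follows. This is an assumption about its shape, not the actual change: a plain list stands in for the dataset object, and `select_first_safely` is a hypothetical helper that mimics selecting row 0 only when at least one row survived the category filter.

```python
def select_first_safely(rows: list) -> list:
    """Return a one-row list like `ds.select([0])`, or an empty list if there are no rows."""
    if len(rows) == 0:
        # Guard: selecting index 0 from an empty collection would fail, so return early.
        return []
    return [rows[0]]


# A category filter that matches nothing yields an empty list (hypothetical data).
all_rows = [{"prompt": "Add a hat.", "category": "add"}]
no_match = [r for r in all_rows if r["category"] == "compose"]

empty_result = select_first_safely(no_match)
first_row = select_first_safely(all_rows)
```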


github-actions · 2026-02-13T00:13:02Z

This PR has been inactive for 10 days and is now marked as stale.


…hmark pattern
- Resolve conflicts in prompt.py and test_datamodule.py
- Refactor setup_imgedit_dataset: ImgEditCategory, fraction, test_sample_size, _prepare_test_only_prompt_dataset
- Add ImgEdit to BENCHMARK_CATEGORY_CONFIG
Made-with: Cursor
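One way a category type such as `ImgEditCategory` could be expressed is as a `typing.Literal`, whose allowed values can be read back with `typing.get_args`. The sketch below assumes that shape; the actual definition in pruna is not shown in this PR page, and `validate_category` is a hypothetical helper.

```python
from typing import Literal, Optional, get_args

# Assumed definition: a Literal over the eight edit types listed in the PR summary.
ImgEditCategory = Literal[
    "replace", "add", "remove", "adjust", "extract", "style", "background", "compose"
]


def validate_category(category: Optional[str]) -> Optional[str]:
    """Return category unchanged if it is None or one of the Literal values, else raise."""
    allowed = get_args(ImgEditCategory)
    if category is not None and category not in allowed:
        raise ValueError(f"Unknown category {category!r}; expected one of {allowed}")
    return category


allowed_categories = get_args(ImgEditCategory)
checked = validate_category("style")
```

Keeping the values in one `Literal` lets type checkers and runtime validation share a single source of truth.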


davidberenstein1957 · 2026-02-27T09:40:12Z

src/pruna/data/__init__.py

+        name="imgedit",
+        display_name="ImgEdit",
+        description="Image editing benchmark with 8 edit types for evaluating editing capabilities.",
+        metrics=["accuracy"],


This should be img_edit_score, but that metric is not implemented in pruna.
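For context, a benchmark entry like the one in the hunk above can be modeled as a small dataclass. This is a sketch only: `BenchmarkInfoSketch` is a hypothetical stand-in for pruna's `BenchmarkInfo`, whose real definition is not shown here. The field names come from the diff and the initial commit message, and the empty metrics list reflects the later commit that removes accuracy from ImgEdit.

```python
from dataclasses import dataclass, field


@dataclass
class BenchmarkInfoSketch:
    """Hypothetical stand-in for a benchmark registry entry."""

    name: str
    display_name: str
    description: str
    metrics: list = field(default_factory=list)
    task_type: str = ""


imgedit_info = BenchmarkInfoSketch(
    name="imgedit",
    display_name="ImgEdit",
    description="Image editing benchmark with 8 edit types for evaluating editing capabilities.",
    # Assumption: no metric is listed, because img_edit_score is not implemented
    # in pruna and accuracy does not measure the same thing.
    metrics=[],
    task_type="image_edit",
)
```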


review-notebook-app · 2026-02-27T10:17:15Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.


cursor bot reviewed Jan 31, 2026


tests/data/test_datamodule.py (outdated)

src/pruna/data/datasets/prompt.py (outdated)

davidberenstein1957 and others added 3 commits January 31, 2026 17:19

fix: add empty dataset guard to setup_parti_prompts_dataset

5b560b1

Prevents crash when category filter produces empty dataset. Co-authored-by: Cursor <cursoragent@cursor.com>

fix: shorten ImgEdit description to fix line length linting

36c291d

Co-authored-by: Cursor <cursoragent@cursor.com>

github-actions bot added the stale label Feb 13, 2026

davidberenstein1957 requested review from begumcig and simlang February 20, 2026 22:24

github-actions bot removed the stale label Feb 21, 2026

davidberenstein1957 added 2 commits February 26, 2026 14:48

merge: align feat/add-partiprompts-benchmark-to-pruna into feat/add-i…

8ce885a

…mgedit-benchmark Made-with: Cursor

davidberenstein1957 force-pushed the feat/add-imgedit-benchmark branch from 7e59c76 to ce71981 Compare February 27, 2026 09:03

davidberenstein1957 changed the base branch from main to feat/add-partiprompts-benchmark-to-pruna February 27, 2026 09:08

davidberenstein1957 added 5 commits February 27, 2026 10:09

chore: apply ruff format to data module, add lint-before-push script

c7ffa2e

Made-with: Cursor

chore: fix get_literal_values_from_param docstring, add SCOPE to lint…

81fd644

… script Made-with: Cursor

chore: remove scripts/lint-before-push.sh

ddaf8c2

Made-with: Cursor

chore: align metrics with Pruna, comment unsupported InferBench metrics

347f8ec

Made-with: Cursor

fix: remove accuracy from ImgEdit - img_edit_score ≠ accuracy

f8c232a

Made-with: Cursor

davidberenstein1957 commented Feb 27, 2026


davidberenstein1957 added 2 commits February 27, 2026 10:43

chore: simplify metric comment

757555e

Made-with: Cursor

Merge remote-tracking branch 'origin/main' into feat/add-imgedit-benc…

5ac24a1

…hmark

Merge feat/add-partiprompts-benchmark-to-pruna: resolve conflicts, ke…

c108e6b

…ep ImgEdit and GenEval Made-with: Cursor

davidberenstein1957 force-pushed the feat/add-partiprompts-benchmark-to-pruna branch from 3be835c to 7ebb4cd Compare February 27, 2026 10:26

davidberenstein1957 added 2 commits February 27, 2026 11:31

Merge feat/add-partiprompts-benchmark-to-pruna: resolve conflicts, ke…

fb7e92a

…ep ImgEdit, HPS, LongTextBench, GenEval Made-with: Cursor

Merge feat/add-partiprompts-benchmark-to-pruna: resolve conflicts, ke…

1881552

…ep ImgEdit and GenEval Made-with: Cursor

davidberenstein1957 merged commit be60209 into feat/add-partiprompts-benchmark-to-pruna Feb 27, 2026

davidberenstein1957 mentioned this pull request Feb 27, 2026

feat: add benchmark support to PrunaDataModule and implement PartiPrompts #502

Open

davidberenstein1957 merged 16 commits into feat/add-partiprompts-benchmark-to-pruna from feat/add-imgedit-benchmark
