Skip to content

Add bulk delete to query builder#8120

Open
alesan99 wants to merge 27 commits into
mainfrom
issue-6882
Open

Add bulk delete to query builder#8120
alesan99 wants to merge 27 commits into
mainfrom
issue-6882

Conversation

@alesan99

@alesan99 alesan99 commented May 27, 2026

Copy link
Copy Markdown
Contributor

Partially fixes issue #6882

Adds a "Bulk Delete" button to Query Builder. Also adds a new "bulk_delete" permission.

Implemented through the new bulk_delete endpoint.

TODO:

  • Add deletion blockers to the warning window. It should should which related records are going to be deleted, like in the regular deletion warning.
  • Add a "bulk delete" button to the record set form. This should delete all the records within a dataset, rather than the dataset itself.
  • Audit Log?

Checklist

  • Self-review the PR after opening it to make sure the changes look good and
    self-explanatory (or properly documented)
  • Add relevant issue to release milestone
  • Add pr to documentation list
  • Add automated tests

Testing instructions

  • Create a bunch of records (Bulk carry forward is a good way)
  • Create a query in Query Builder, and run it.
  • See that the "Bulk Delete" button appears if you are an institution admin
  • Select some records in your query results
  • Click bulk delete, and see you are warned that you will delete X records permantently
  • Click delete, and see that the records were deleted and removed from the query results.
  • Reset your selection
  • Click bulk delete, and see you are warned that you will delete ALL record from the query permanently
  • Check existing behavior:
  • Try to delete individual records in query builder through browse in forms. They should still get removed from query results like normal

Summary by CodeRabbit

  • New Features

    • Bulk delete: remove multiple records from a query or selected rows; includes background processing and task status reporting.
    • Multi-step confirmation with required typed confirmation of the exact delete count.
    • UI control shown only when delete permissions apply.
  • Notifications

    • Progress and lifecycle notifications for bulk-delete start, success (with count), and failure.
  • Localization

    • Added UI and notification strings for bulk delete.
  • Documentation

    • Bulk delete API endpoint added to generated OpenAPI docs.

@coderabbitai

coderabbitai Bot commented May 27, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

Adds a backend atomic bulk-delete POST endpoint accepting explicit IDs or a stored query and a frontend two-step bulk-delete UI with permissions, result updates, localization, notifications, and OpenAPI documentation.

Changes

Bulk Delete Feature

Layer / File(s) Summary
Backend bulk delete endpoint
specifyweb/backend/bulk_copy/bulk_copy.py
New atomic collection_dispatch_bulk_delete endpoint accepts POST with ids array or query object, resolves model and PKs (via stored query when needed), deletes records via delete_resource, returns 400 if no IDs/query, and 204 on success.
Routing and views
specifyweb/backend/bulk_copy/urls.py, specifyweb/backend/bulk_copy/views.py
Adds routes for initiating bulk delete and checking task status. Views dispatch bulk_delete_task.apply_async, return a task_id, and expose task status/progress via JSON.
Background task & transactional deletion
specifyweb/backend/bulk_copy/bulk_delete_task.py
Adds bulk_delete_task Celery task and do_bulk_delete which runs within a transaction, resolves IDs from spquery when provided, deletes each record via delete_resource, and reports progress and notifications.
BulkDelete dialog and button components
specifyweb/frontend/js_src/lib/components/QueryBuilder/BulkDelete.tsx
New QueryBulkDelete and BulkDeletionDialog implement two-step confirmation, compute delete count from selected IDs or total, POST { ids, query } to backend, and invoke onDeleted on success.
Query results integration
specifyweb/frontend/js_src/lib/components/QueryBuilder/Results.tsx
Conditionally renders QueryBulkDelete when permissions allow. Refactors handleDelete to accept single or multiple IDs, deduplicate deletions, remove rows from results, update selection and total count.
Bulk delete permission policy
specifyweb/frontend/js_src/lib/components/Permissions/definitions.ts
Added '/record/bulk_delete' operation policy requiring the delete permission.
Notification renderers
specifyweb/frontend/js_src/lib/components/Notifications/NotificationRenderers.tsx
Adds renderers for bulk-delete-started, bulk-delete-succeeded, and bulk-delete-failed using localized notification text and table labels.
Localization and OpenAPI schema
specifyweb/frontend/js_src/lib/localization/forms.ts, specifyweb/frontend/js_src/lib/localization/query.ts, specifyweb/frontend/js_src/lib/localization/notifications.ts, specifyweb/specify/models_utils/schema.py
Adds bulk-delete localization strings for dialogs and notifications, and OpenAPI documentation for POST /bulk_copy/bulk_delete/{table} with request body schema and 204 response.

Sequence Diagram(s)

sequenceDiagram
  participant UI as Browser UI (BulkDeletionDialog)
  participant API as Backend API (collection_dispatch_bulk_delete)
  participant Celery as Celery (bulk_delete_task)
  participant DoDelete as do_bulk_delete
  participant DB as Database

  UI->>API: POST /bulk_copy/bulk_delete/{table} { ids?, query? }
  API->>Celery: apply_async(collection_id, user_id, agent_id, table, ids, spquery)
  Celery->>DoDelete: invoke do_bulk_delete(...)
  DoDelete->>DB: resolve IDs (stored query) / delete_resource(id)*
  DB-->>DoDelete: deletion results
  DoDelete-->>Celery: progress updates / completion
  Celery-->>UI: notifications via Message records (started/failed/succeeded)
Loading

Suggested reviewers:

  • emenslin
  • acwhite211
  • CarolineDenis
  • bhumikaguptaa
🚥 Pre-merge checks | ✅ 4 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Automatic Tests ⚠️ Warning The PR adds significant bulk delete functionality (backend endpoints, Celery task, frontend UI) but includes no automated tests. The PR objectives explicitly list "automated tests are pending". Add automated tests for backend endpoints, Celery task transaction handling, ID resolution logic, and frontend component state management and permission checks.
Testing Instructions ⚠️ Warning Testing says "institution admin" can see bulk delete, but code requires generic delete permission, not admin role. Also omits tests for distinct query mode and preview button. Clarify permission requirement as generic delete permission rather than admin role; add tests for distinct query disabling button and preview functionality.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: adding a bulk delete feature to the query builder interface.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch issue-6882

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Comment thread specifyweb/backend/bulk_copy/bulk_copy.py Fixed
Comment thread specifyweb/frontend/js_src/lib/components/QueryBuilder/Results.tsx Fixed
Comment thread specifyweb/backend/bulk_copy/bulk_copy.py Fixed
@alesan99 alesan99 marked this pull request as ready for review June 3, 2026 18:44

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@specifyweb/backend/bulk_copy/bulk_copy.py`:
- Around line 94-124: The code is using spquery["contexttableid"] to fetch IDs
but then deletes rows from the URL {model}, allowing mismatched-table deletes;
update the logic in the block that builds the query (the session_context /
build_query path using fields_from_json and build_query) to verify the resolved
query target matches the route model before proceeding: after obtaining entity =
query.column_descriptions[0]["entity"] (or inspecting pk_col via
inspect(entity)) compare the entity/table identifier to the route model's table
identity (e.g., model.__tablename__ or model's table id), and if they differ
raise/return a 400 error (abort the operation) rather than continuing to call
delete_resource; keep the check local to the code that computes ids (the
session_context / build_query / ids logic) so deletions only proceed when the
query resolves to the same model.
- Around line 84-90: The handler currently assumes json.loads(request.body)
returns a dict and calls data.get(...), which will crash for list/scalar bodies;
update the code in bulk_copy.py (the section handling data, ids, spquery and
model) to first verify that data is a dict and return a 400 (bad request) if
not, then validate that ids (if present) is a list and spquery (if present) is
the expected type (e.g., str or dict per your API) before using .get; keep the
existing model handling (isinstance(model, str) -> get_model_or_404(model)) but
perform these body/type checks immediately after parsing to reject invalid
bodies and provide clear error messages.

In `@specifyweb/frontend/js_src/lib/components/QueryBuilder/BulkDelete.tsx`:
- Around line 132-156: The final confirmation Dialog (rendered when
isWarningOpen) only shows generic text; update the bulk-delete confirmation UI
to surface dependent/related records and potential deletion blockers before the
delete is committed: compute or fetch the dependent records (e.g., a new
dependentRecords array derived alongside totalDeleteCount in the BulkDelete
logic or via an async call in the component), then render a clear section inside
the Dialog (near formsText.bulkDeleteFinalConfirmationDescription()) listing
each dependent record type/count and any blockers, and disable the Button.Danger
(handleClick) unless the user has acknowledged or resolved blockers; update the
Dialog content to include this dependent-record summary and any required user
acknowledgement state so handleClick only runs when safe.
- Around line 34-36: The button is incorrectly disabled when queryResource is
undefined even if there are selected rows; change the disabled logic on
Button.Small to only be true when there are no rows to delete (e.g. replace
disabled={totalCount === 0 || queryResource === undefined} with something like
disabled={totalCount === 0 && (selectedIds?.length ?? 0) === 0} so the button is
enabled when selectedIds (or your actual selected-row array/ids variable) has
entries even if queryResource is undefined; update the expression to use the
actual selected-row variable name present in this file (e.g. selectedIds,
selectedRowIds, or selectedRows) and keep onClick={handleOpen} as-is.

In `@specifyweb/frontend/js_src/lib/components/QueryBuilder/Results.tsx`:
- Around line 407-420: The bulk-delete button renders even for distinct queries
which makes recordIds() unsafe; update the rendering guard so QueryBulkDelete
(used here with props table, totalCount, onDeleted, recordIds, queryResource) is
only shown when canBulkDeleteTable is true AND the result set is not distinct
(e.g., !isDistinct) or when a guaranteed primary-key derivation is available;
specifically, change the condition around the QueryBulkDelete component to check
isDistinct (or a new safeBulkDelete boolean) and skip rendering QueryBulkDelete
when isDistinct is true unless you can derive real primary keys for each row
instead of casting display values via queryIdField from
loadedResults/selectedRows.
- Around line 221-238: The bug is that totalRemoveCount is accumulated with a
negative sign (totalRemoveCount = totalRemoveCount - removeCount) so later
subtracting it in setTotalCount increases totalCount; change the accumulator to
add positives (use totalRemoveCount += removeCount) and ensure the "no ids
provided" branch sets totalRemoveCount to the positive count (totalRemoveCount =
totalCount ?? 0) so the final setTotalCount(... Math.max(0, totalCount -
totalRemoveCount)) correctly decreases totalCount; update references in the
block handling recordIds, the loop using removeCount and deletingRef, and the
no-ids branch so all usages treat totalRemoveCount as a positive number of
removed rows.
- Around line 240-244: The selection filter currently uses a stale recordIds and
ends up preserving deleted row ids; update the newSelectedRows implementation so
it computes the current set of existing IDs (e.g., map the current records to
currentIds) and then return new Set(Array.from(selectedRows).filter(id =>
currentIds.includes(id))) so setSelectedRows(newSelectedRows(selectedRows))
drops any ids that no longer exist; modify newSelectedRows to reference that
fresh currentIds (instead of the old recordIds) to ensure deleted selections are
removed.
- Around line 199-247: The handleDelete closure uses stale
results/selectedRows/totalCount because those values aren't in its deps; change
it to read latest state from refs and use functional state updates: inside
handleDelete, read the current list from resultsRef.current (fallback to [])
instead of the closed-over results, compute filteredResults and
totalRemoveCount, then call setResults(prev => /* use prev or resultsRef.current
to compute new array */) and setTotalCount(prev => /* compute from prev and
totalRemoveCount */) and setSelectedRows(prev => new
Set(Array.from(prev).filter(id => !recordIds.includes(id)))), and keep
deletingRef usage; update the dependency list to only include stable setters and
refs (setResults, setTotalCount, setSelectedRows, deletingRef, resultsRef) so
the callback no longer captures stale snapshots.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 427c36b6-8c21-45e8-af63-818c2fc6b464

📥 Commits

Reviewing files that changed from the base of the PR and between 7fc918f and 0d1e07c.

📒 Files selected for processing (9)
  • specifyweb/backend/bulk_copy/bulk_copy.py
  • specifyweb/backend/bulk_copy/urls.py
  • specifyweb/backend/bulk_copy/views.py
  • specifyweb/frontend/js_src/lib/components/Permissions/definitions.ts
  • specifyweb/frontend/js_src/lib/components/QueryBuilder/BulkDelete.tsx
  • specifyweb/frontend/js_src/lib/components/QueryBuilder/Results.tsx
  • specifyweb/frontend/js_src/lib/localization/forms.ts
  • specifyweb/frontend/js_src/lib/localization/query.ts
  • specifyweb/specify/models_utils/schema.py

Comment thread specifyweb/backend/bulk_copy/bulk_copy.py
Comment thread specifyweb/backend/bulk_copy/bulk_copy.py
Comment on lines +132 to +156
<div className="mb-4 flex flex-col gap-4">
<section>{formsText.deleteConfirmationDescription()}</section>
</div>
</Dialog>
{
isWarningOpen ?
(
<Dialog
buttons={
<>
<Button.DialogClose>{commonText.close()}</Button.DialogClose>
<Button.Danger
disabled={confirmationText !== totalDeleteCount.toString()}
onClick={handleClick}
>
{formsText.bulkDeleteFinalConfirmationOption({ count: totalDeleteCount })}
</Button.Danger>
</>
}
header={formsText.bulkDeleteFinalConfirmation()}
onClose={onClose}
>
<div className="mb-4 flex flex-col gap-4">
<section>{formsText.deleteConfirmationDescription()}</section>
<section>{formsText.bulkDeleteFinalConfirmationDescription()}</section>

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Show dependent-record impact before the destructive confirmation.

This dialog still only shows generic warning text and the typed count. The linked issue/PR objective requires surfacing related records/deletion blockers before confirmation, so users can see what else will be affected before committing the delete.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@specifyweb/frontend/js_src/lib/components/QueryBuilder/BulkDelete.tsx` around
lines 132 - 156, The final confirmation Dialog (rendered when isWarningOpen)
only shows generic text; update the bulk-delete confirmation UI to surface
dependent/related records and potential deletion blockers before the delete is
committed: compute or fetch the dependent records (e.g., a new dependentRecords
array derived alongside totalDeleteCount in the BulkDelete logic or via an async
call in the component), then render a clear section inside the Dialog (near
formsText.bulkDeleteFinalConfirmationDescription()) listing each dependent
record type/count and any blockers, and disable the Button.Danger (handleClick)
unless the user has acknowledged or resolved blockers; update the Dialog content
to include this dependent-record summary and any required user acknowledgement
state so handleClick only runs when safe.

Comment thread specifyweb/frontend/js_src/lib/components/QueryBuilder/Results.tsx
Comment thread specifyweb/frontend/js_src/lib/components/QueryBuilder/Results.tsx Outdated
Comment thread specifyweb/frontend/js_src/lib/components/QueryBuilder/Results.tsx
Comment thread specifyweb/frontend/js_src/lib/components/QueryBuilder/Results.tsx
Comment thread specifyweb/specify/models_utils/schema.py
@github-project-automation github-project-automation Bot moved this from 📋Back Log to Dev Attention Needed in General Tester Board Jun 3, 2026
Verify query table matches bulk_delete table
Comment thread specifyweb/backend/bulk_copy/bulk_copy.py Fixed
alesan99 and others added 3 commits June 3, 2026 14:29
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Disable bulk delete for distinct queries
@alesan99 alesan99 added this to the 7.12.2 milestone Jun 3, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (3)
specifyweb/frontend/js_src/lib/components/Notifications/NotificationRenderers.tsx (1)

400-411: ⚡ Quick win

Consider including table name in the localization string for consistency.

The bulk-delete-started handler manually concatenates the table name (lines 406-408) outside the localization string, whereas bulk-delete-succeeded and bulk-delete-failed include {tableName} as a parameter within their localization strings. This inconsistency makes the code harder to maintain and localize properly.

♻️ Proposed refactor for consistency

Update the localization string to include the table name parameter, then simplify the handler:

In specifyweb/frontend/js_src/lib/localization/notifications.ts:

  bulkDeleteHasStarted: {
-    'en-us': 'Bulk delete has started in the background',
+    'en-us': 'Bulk delete has started in the background — {tableName:string}.',
  },

Then in this file:

  'bulk-delete-started'(notification) {
    const tableName = notification.payload.table;
    const table = getTable(tableName);
    return (
-      <>
-        <p>
-          {notificationsText.bulkDeleteHasStarted()} —{' '}
-          {table?.label ?? tableName}
-          {'.'}
-        </p>
-      </>
+      <p>
+        {notificationsText.bulkDeleteHasStarted({
+          tableName: table?.label ?? tableName,
+        })}
+      </p>
    );
  },
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@specifyweb/frontend/js_src/lib/components/Notifications/NotificationRenderers.tsx`
around lines 400 - 411, The 'bulk-delete-started' renderer currently appends the
table name outside the i18n string (see handler labeled 'bulk-delete-started',
the tableName variable and getTable call), making it inconsistent with
bulk-delete-succeeded/failed; update the localization API to accept a tableName
parameter (e.g. add {tableName} to bulkDeleteHasStarted in notifications
localization) and then change the renderer to pass
notificationsText.bulkDeleteHasStarted({ tableName: table?.label ?? tableName })
and remove the manual concatenation so the entire message is produced by the
i18n call.
specifyweb/backend/bulk_copy/bulk_delete_task.py (2)

101-101: 💤 Low value

Replace Http404 with ValueError in background task context.

get_model_or_404 raises Http404 for invalid model names, but HTTP exceptions are not appropriate in a Celery background task. The exception will still fail the task, but the error semantics are confusing.

♻️ Suggested alternative
+from specifyweb.specify.api.crud import strict_get_model
+
 `@transaction.atomic`
 def do_bulk_delete(
     collection,
     agent,
     table_name: str,
     ids: list[int],
     spquery: dict | None = None,
     progress=None,
 ) -> None:
     """Core deletion logic, wrapped in an atomic block."""
 
-    model = get_model_or_404(table_name)
+    try:
+        model = strict_get_model(table_name)
+    except AttributeError as e:
+        raise ValueError(f'Invalid model name: {table_name}') from e
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@specifyweb/backend/bulk_copy/bulk_delete_task.py` at line 101, The background
Celery task uses get_model_or_404 which raises Http404; replace this HTTP
exception with a plain ValueError for background task semantics by catching
Http404 around the call in bulk_delete_task (the line assigning model =
get_model_or_404(table_name)) and re-raising a ValueError with a clear message
(or use a non-HTTP lookup helper if available) so the task fails with a
ValueError instead of an Http404; ensure you reference get_model_or_404 and the
model variable in your change.

60-60: 💤 Low value

Replace EN DASH with HYPHEN-MINUS in comment.

The comment contains an EN DASH () character instead of a standard HYPHEN-MINUS (-), which can cause encoding or linting issues.

♻️ Suggested fix
     except Exception:
-        # 2a) Task failed – notify the user and re-raise
+        # 2a) Task failed - notify the user and re-raise
         Message.objects.create(
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@specifyweb/backend/bulk_copy/bulk_delete_task.py` at line 60, In
bulk_delete_task.py update the comment that currently reads "# 2a) Task failed –
notify the user and re-raise" to use a normal hyphen-minus instead of the EN
DASH; locate the comment (search for "Task failed – notify the user and
re-raise") and replace the en dash (U+2013) with a hyphen-minus (U+002D) so it
becomes "# 2a) Task failed - notify the user and re-raise" to avoid
encoding/lint issues.

Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@specifyweb/backend/bulk_copy/bulk_delete_task.py`:
- Around line 35-36: Wrap the calls that fetch models (the lines using
spmodels.Collection.objects.get(id=collection_id) and
spmodels.Agent.objects.get(id=agent_id)) in try/except blocks to catch
spmodels.Collection.DoesNotExist and spmodels.Agent.DoesNotExist, and handle
each by logging a clear error or raising a descriptive exception (e.g.,
ValueError or a custom exception) that includes the missing id and context, then
exit or return the task gracefully; ensure you reference these exact symbols
when editing so the task doesn't crash with an unhandled DoesNotExist.
- Around line 106-110: The code directly indexes spquery['contexttableid'] and
spquery['fields'], which will raise KeyError when those keys are missing; update
the validation in bulk_delete_task (around the logic that checks
model.specify_model.tableId) to explicitly verify presence and types of
'contexttableid' and 'fields' (e.g., use spquery.get(...) or an if
'contexttableid' not in spquery / if 'fields' not in spquery) and raise a clear
ValueError like "Missing or invalid 'contexttableid' in spquery" or "Missing or
invalid 'fields' in spquery' before comparing to model.specify_model.tableId or
iterating fields.
- Around line 17-87: The success/count reporting is wrong because
bulk_delete_task uses the input ids list (which may be empty) instead of the
actual resolved IDs from do_bulk_delete and mixes types for the initial 'count'
notification; update do_bulk_delete to return the final number of deleted IDs
(or the resolved IDs list) and change bulk_delete_task to capture that return
value (e.g., total = do_bulk_delete(...)) and use it for all notifications and
update_state calls, and replace the mixed-type initial Message content ('entire
query' string) with a consistent numeric value or null/None when unknown; ensure
references to do_bulk_delete, bulk_delete_task, Message, and the progress
callback are updated accordingly so progress/FINAL meta and success Message use
the returned total.

In `@specifyweb/backend/bulk_copy/views.py`:
- Line 24: The code does not handle malformed JSON at the data =
json.loads(request.body) call; wrap that call in a try/except catching
json.JSONDecodeError and return a 400 response (e.g., HttpResponseBadRequest or
JsonResponse({"error": "Invalid JSON"}, status=400)) from the view instead of
letting the exception bubble, and ensure json and the chosen response class are
imported; locate the assignment to data (the json.loads(request.body) line) in
the view and apply the try/except there.
- Around line 40-52: Update bulk_delete_status to validate that the requested
task belongs to the current user and to normalize Celery state names to match
backup_status conventions: fetch the AsyncResult via
bulk_delete_task.AsyncResult(task_id), determine the task metadata/kwargs or
stored owner id (same source used by backup_status) and return HTTP 403 if the
requesting user is not the owner; map raw states (e.g., "SUCCESS"→"SUCCEEDED",
"FAILURE"→"FAILED", keep others consistent with backup_status) before building
the JSON, and keep the existing logic for taskprogress extraction (use
result.info if dict else repr(result.info)). Ensure checks use the same
identifiers/fields as backup_status to locate ownership.

In
`@specifyweb/frontend/js_src/lib/components/Notifications/NotificationRenderers.tsx`:
- Line 415: The code blindly casts notification.payload.count to a number;
instead validate it before use: in the NotificationRenderers component/function,
check that notification and notification.payload exist and that typeof
notification.payload.count === "number" and
Number.isFinite(notification.payload.count) (or parseInt/Number if you expect
numeric strings), then use that validated value (or a safe default like 0 or
bail out/render a fallback) rather than the direct assertion to avoid runtime
errors.

---

Nitpick comments:
In `@specifyweb/backend/bulk_copy/bulk_delete_task.py`:
- Line 101: The background Celery task uses get_model_or_404 which raises
Http404; replace this HTTP exception with a plain ValueError for background task
semantics by catching Http404 around the call in bulk_delete_task (the line
assigning model = get_model_or_404(table_name)) and re-raising a ValueError with
a clear message (or use a non-HTTP lookup helper if available) so the task fails
with a ValueError instead of an Http404; ensure you reference get_model_or_404
and the model variable in your change.
- Line 60: In bulk_delete_task.py update the comment that currently reads "# 2a)
Task failed – notify the user and re-raise" to use a normal hyphen-minus instead
of the EN DASH; locate the comment (search for "Task failed – notify the user
and re-raise") and replace the en dash (U+2013) with a hyphen-minus (U+002D) so
it becomes "# 2a) Task failed - notify the user and re-raise" to avoid
encoding/lint issues.

In
`@specifyweb/frontend/js_src/lib/components/Notifications/NotificationRenderers.tsx`:
- Around line 400-411: The 'bulk-delete-started' renderer currently appends the
table name outside the i18n string (see handler labeled 'bulk-delete-started',
the tableName variable and getTable call), making it inconsistent with
bulk-delete-succeeded/failed; update the localization API to accept a tableName
parameter (e.g. add {tableName} to bulkDeleteHasStarted in notifications
localization) and then change the renderer to pass
notificationsText.bulkDeleteHasStarted({ tableName: table?.label ?? tableName })
and remove the manual concatenation so the entire message is produced by the
i18n call.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: ca3f230b-0014-416d-bf46-2ab4cdc86de1

📥 Commits

Reviewing files that changed from the base of the PR and between 925f58f and 8f0e181.

📒 Files selected for processing (7)
  • specifyweb/backend/bulk_copy/bulk_delete_task.py
  • specifyweb/backend/bulk_copy/urls.py
  • specifyweb/backend/bulk_copy/views.py
  • specifyweb/frontend/js_src/lib/components/Notifications/NotificationRenderers.tsx
  • specifyweb/frontend/js_src/lib/components/QueryBuilder/BulkDelete.tsx
  • specifyweb/frontend/js_src/lib/components/QueryBuilder/Results.tsx
  • specifyweb/frontend/js_src/lib/localization/notifications.ts

Comment on lines +17 to +87
@app.task(base=LogErrorsTask, bind=True)
def bulk_delete_task(
self,
collection_id: int,
user_id: int,
agent_id: int,
table_name: str,
ids: list[int],
spquery: dict | None = None,
) -> None:
"""Run bulk deletion as a background Celery task within a single transaction.

If *ids* is empty but *spquery* is provided, all IDs matching the
query are resolved and then deleted. The entire operation runs inside
``transaction.atomic`` so that any failure rolls back everything.
"""
import specifyweb.specify.models as spmodels

collection = spmodels.Collection.objects.get(id=collection_id)
agent = spmodels.Agent.objects.get(id=agent_id)

def progress(current: int, total: int) -> None:
if not self.request.called_directly:
self.update_state(
state='PROGRESS',
meta={'current': current, 'total': total},
)

# 1) Notify the user that the task has started
Message.objects.create(
user_id=user_id,
content=json.dumps({
'type': 'bulk-delete-started',
'table': table_name,
'count': len(ids) if ids else 'entire query',
'taskid': self.request.id,
}),
)

try:
do_bulk_delete(collection, agent, table_name, ids, spquery, progress)

except Exception:
# 2a) Task failed – notify the user and re-raise
Message.objects.create(
user_id=user_id,
content=json.dumps({
'type': 'bulk-delete-failed',
'table': table_name,
'taskid': self.request.id,
}),
)
raise

# 2b) Task succeeded
total = len(ids)
if not self.request.called_directly:
self.update_state(
state='SUCCESS',
meta={'current': total, 'total': total},
)

Message.objects.create(
user_id=user_id,
content=json.dumps({
'type': 'bulk-delete-succeeded',
'table': table_name,
'taskid': self.request.id,
'count': total,
}),
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Fix incorrect count tracking when IDs are resolved from query.

When ids is empty and resolved from spquery inside do_bulk_delete, line 72 computes total = len(ids) using the input ids list (which is empty), not the resolved IDs. This causes the success notification at line 85 to report count: 0 even when records were deleted.

Additionally, line 51 uses mixed types for count: len(ids) if ids else 'entire query' (int or string). The frontend notification renderer likely expects a consistent type.

🐛 Suggested fix

Modify do_bulk_delete to return the resolved ID count, and use it in the success path:

 `@transaction.atomic`
 def do_bulk_delete(
     collection,
     agent,
     table_name: str,
     ids: list[int],
     spquery: dict | None = None,
     progress=None,
-) -> None:
+) -> int:
-    """Core deletion logic, wrapped in an atomic block."""
+    """Core deletion logic, wrapped in an atomic block.
+    
+    Returns the number of records deleted.
+    """
 
     model = get_model_or_404(table_name)
 
     # Resolve IDs from query if no explicit IDs were provided
     if len(ids) == 0 and spquery is not None:
         with models.session_context() as session:
             tableid = spquery['contexttableid']
             if model.specify_model.tableId != tableid:
                 raise ValueError(
                     'Query table does not match bulk delete table.'
                 )
 
             field_specs = fields_from_json(spquery['fields'])
             query, __ = build_query(
                 session,
                 collection,
                 None,         # user
                 tableid,
                 field_specs,
             )
 
             entity = query.column_descriptions[0]['entity']
             pk_col = inspect(entity).primary_key[0]
 
             ids = list(
                 session.execute(
                     query.with_entities(pk_col).distinct()
                 ).scalars()
             )
 
     if len(ids) == 0:
         raise ValueError('No record IDs to delete.')
 
     total = len(ids)
     for current, record_id in enumerate(ids, start=1):
         # delete_resource raises an exception on failure, causing
         # the entire transaction to roll back.
         delete_resource(collection, agent, model, record_id, None)
         if progress is not None:
             progress(current, total)
+    
+    return total

Then update the task to use the returned count:

 def bulk_delete_task(
     self,
     collection_id: int,
     user_id: int,
     agent_id: int,
     table_name: str,
     ids: list[int],
     spquery: dict | None = None,
 ) -> None:
     """Run bulk deletion as a background Celery task within a single transaction.
 
     If *ids* is empty but *spquery* is provided, all IDs matching the
     query are resolved and then deleted.  The entire operation runs inside
     ``transaction.atomic`` so that any failure rolls back everything.
     """
     import specifyweb.specify.models as spmodels
 
     collection = spmodels.Collection.objects.get(id=collection_id)
     agent = spmodels.Agent.objects.get(id=agent_id)
 
     def progress(current: int, total: int) -> None:
         if not self.request.called_directly:
             self.update_state(
                 state='PROGRESS',
                 meta={'current': current, 'total': total},
             )
 
+    initial_count = len(ids) if ids else None
+    
     # 1) Notify the user that the task has started
     Message.objects.create(
         user_id=user_id,
         content=json.dumps({
             'type': 'bulk-delete-started',
             'table': table_name,
-            'count': len(ids) if ids else 'entire query',
+            'count': initial_count,
             'taskid': self.request.id,
         }),
     )
 
     try:
-        do_bulk_delete(collection, agent, table_name, ids, spquery, progress)
+        total = do_bulk_delete(collection, agent, table_name, ids, spquery, progress)
 
     except Exception:
         # 2a) Task failed – notify the user and re-raise
         Message.objects.create(
             user_id=user_id,
             content=json.dumps({
                 'type': 'bulk-delete-failed',
                 'table': table_name,
                 'taskid': self.request.id,
             }),
         )
         raise
 
     # 2b) Task succeeded
-    total = len(ids)
     if not self.request.called_directly:
         self.update_state(
             state='SUCCESS',
             meta={'current': total, 'total': total},
         )
 
     Message.objects.create(
         user_id=user_id,
         content=json.dumps({
             'type': 'bulk-delete-succeeded',
             'table': table_name,
             'taskid': self.request.id,
             'count': total,
         }),
     )

Note: Using None for initial_count when IDs will be resolved from a query is clearer than the string 'entire query'.

🧰 Tools
🪛 Ruff (0.15.15)

[warning] 60-60: Comment contains ambiguous (EN DASH). Did you mean - (HYPHEN-MINUS)?

(RUF003)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@specifyweb/backend/bulk_copy/bulk_delete_task.py` around lines 17 - 87, The
success/count reporting is wrong because bulk_delete_task uses the input ids
list (which may be empty) instead of the actual resolved IDs from do_bulk_delete
and mixes types for the initial 'count' notification; update do_bulk_delete to
return the final number of deleted IDs (or the resolved IDs list) and change
bulk_delete_task to capture that return value (e.g., total =
do_bulk_delete(...)) and use it for all notifications and update_state calls,
and replace the mixed-type initial Message content ('entire query' string) with
a consistent numeric value or null/None when unknown; ensure references to
do_bulk_delete, bulk_delete_task, Message, and the progress callback are updated
accordingly so progress/FINAL meta and success Message use the returned total.

Comment on lines +35 to +36
collection = spmodels.Collection.objects.get(id=collection_id)
agent = spmodels.Agent.objects.get(id=agent_id)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add error handling for missing collection or agent.

If the provided collection_id or agent_id does not exist, objects.get() will raise DoesNotExist, causing the task to fail with an unclear error. Consider adding explicit error handling.

🛡️ Suggested fix
     import specifyweb.specify.models as spmodels
 
-    collection = spmodels.Collection.objects.get(id=collection_id)
-    agent = spmodels.Agent.objects.get(id=agent_id)
+    try:
+        collection = spmodels.Collection.objects.get(id=collection_id)
+        agent = spmodels.Agent.objects.get(id=agent_id)
+    except spmodels.Collection.DoesNotExist:
+        raise ValueError(f'Collection {collection_id} not found')
+    except spmodels.Agent.DoesNotExist:
+        raise ValueError(f'Agent {agent_id} not found')
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
collection = spmodels.Collection.objects.get(id=collection_id)
agent = spmodels.Agent.objects.get(id=agent_id)
try:
collection = spmodels.Collection.objects.get(id=collection_id)
agent = spmodels.Agent.objects.get(id=agent_id)
except spmodels.Collection.DoesNotExist:
raise ValueError(f'Collection {collection_id} not found')
except spmodels.Agent.DoesNotExist:
raise ValueError(f'Agent {agent_id} not found')
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@specifyweb/backend/bulk_copy/bulk_delete_task.py` around lines 35 - 36, Wrap
the calls that fetch models (the lines using
spmodels.Collection.objects.get(id=collection_id) and
spmodels.Agent.objects.get(id=agent_id)) in try/except blocks to catch
spmodels.Collection.DoesNotExist and spmodels.Agent.DoesNotExist, and handle
each by logging a clear error or raising a descriptive exception (e.g.,
ValueError or a custom exception) that includes the missing id and context, then
exit or return the task gracefully; ensure you reference these exact symbols
when editing so the task doesn't crash with an unhandled DoesNotExist.

Comment on lines +106 to +110
tableid = spquery['contexttableid']
if model.specify_model.tableId != tableid:
raise ValueError(
'Query table does not match bulk delete table.'
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add validation for missing query fields.

Direct dictionary access to spquery['contexttableid'] at line 106 and spquery['fields'] at line 112 will raise KeyError if these keys are missing. Consider validating the query structure.

🛡️ Suggested fix
     # Resolve IDs from query if no explicit IDs were provided
     if len(ids) == 0 and spquery is not None:
+        if 'contexttableid' not in spquery or 'fields' not in spquery:
+            raise ValueError('Query must contain contexttableid and fields')
+        
         with models.session_context() as session:
             tableid = spquery['contexttableid']
             if model.specify_model.tableId != tableid:
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@specifyweb/backend/bulk_copy/bulk_delete_task.py` around lines 106 - 110, The
code directly indexes spquery['contexttableid'] and spquery['fields'], which
will raise KeyError when those keys are missing; update the validation in
bulk_delete_task (around the logic that checks model.specify_model.tableId) to
explicitly verify presence and types of 'contexttableid' and 'fields' (e.g., use
spquery.get(...) or an if 'contexttableid' not in spquery / if 'fields' not in
spquery) and raise a clear ValueError like "Missing or invalid 'contexttableid'
in spquery" or "Missing or invalid 'fields' in spquery' before comparing to
model.specify_model.tableId or iterating fields.

``query`` (dict representation of a :class:`SpQuery`). Returns the
Celery task ID so the frontend can poll for progress.
"""
data = json.loads(request.body)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle JSON decode errors.

The endpoint does not handle malformed JSON in the request body. A client sending invalid JSON will trigger an unhandled exception.

🛡️ Suggested fix
+from json import JSONDecodeError
+
 `@login_maybe_required`
 `@require_POST`
 def bulk_delete_background(request, model: str) -> http.JsonResponse:
     """Dispatch a Celery task to bulk-delete records in the background.
 
     Accepts a JSON body with ``ids`` (list of int) and an optional
     ``query`` (dict representation of a :class:`SpQuery`).  Returns the
     Celery task ID so the frontend can poll for progress.
     """
-    data = json.loads(request.body)
+    try:
+        data = json.loads(request.body)
+    except JSONDecodeError:
+        return http.JsonResponse({'error': 'Invalid JSON'}, status=400)
     ids = data.get('ids', [])
     spquery = data.get('query')
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@specifyweb/backend/bulk_copy/views.py` at line 24, The code does not handle
malformed JSON at the data = json.loads(request.body) call; wrap that call in a
try/except catching json.JSONDecodeError and return a 400 response (e.g.,
HttpResponseBadRequest or JsonResponse({"error": "Invalid JSON"}, status=400))
from the view instead of letting the exception bubble, and ensure json and the
chosen response class are imported; locate the assignment to data (the
json.loads(request.body) line) in the view and apply the try/except there.

Comment on lines +40 to +52
@login_maybe_required
@require_GET
def bulk_delete_status(request, task_id: str) -> http.JsonResponse:
"""Return the Celery task state for a background bulk-delete."""
result = bulk_delete_task.AsyncResult(task_id)
status = {
'taskstatus': result.state,
'taskprogress': (
result.info if isinstance(result.info, dict) else repr(result.info)
),
'taskid': task_id,
}
return http.JsonResponse(status) No newline at end of file

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Add task ownership validation and normalize state values.

Two issues:

  1. Security: Any authenticated user can query the status of any task by guessing or observing task IDs. There is no validation that the task belongs to the requesting user.

  2. Consistency: Unlike backup_status (which normalizes SUCCESSSUCCEEDED, FAILUREFAILED, etc.), this endpoint returns raw Celery state names. The frontend may expect normalized values.

🔒 Suggested fixes
 `@login_maybe_required`
 `@require_GET`
 def bulk_delete_status(request, task_id: str) -> http.JsonResponse:
     """Return the Celery task state for a background bulk-delete."""
     result = bulk_delete_task.AsyncResult(task_id)
+    
+    # Validate task ownership by checking task args contain current user
+    # (task args are: collection_id, user_id, agent_id, ...)
+    if result.args and len(result.args) > 1:
+        task_user_id = result.args[1]
+        if task_user_id != request.specify_user.id:
+            return http.HttpResponseForbidden()
+    
+    # Normalize state names to match frontend expectations
+    state_map = {
+        'PENDING': 'RUNNING',
+        'RECEIVED': 'RUNNING',
+        'STARTED': 'RUNNING',
+        'PROGRESS': 'RUNNING',
+        'SUCCESS': 'SUCCEEDED',
+        'FAILURE': 'FAILED',
+        'RETRY': 'FAILED',
+    }
+    
     status = {
-        'taskstatus': result.state,
+        'taskstatus': state_map.get(result.state, 'FAILED'),
         'taskprogress': (
             result.info if isinstance(result.info, dict) else repr(result.info)
         ),
         'taskid': task_id,
     }
     return http.JsonResponse(status)

Based on learnings from backup_status implementation pattern.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@specifyweb/backend/bulk_copy/views.py` around lines 40 - 52, Update
bulk_delete_status to validate that the requested task belongs to the current
user and to normalize Celery state names to match backup_status conventions:
fetch the AsyncResult via bulk_delete_task.AsyncResult(task_id), determine the
task metadata/kwargs or stored owner id (same source used by backup_status) and
return HTTP 403 if the requesting user is not the owner; map raw states (e.g.,
"SUCCESS"→"SUCCEEDED", "FAILURE"→"FAILED", keep others consistent with
backup_status) before building the JSON, and keep the existing logic for
taskprogress extraction (use result.info if dict else repr(result.info)). Ensure
checks use the same identifiers/fields as backup_status to locate ownership.

},
'bulk-delete-succeeded'(notification) {
const tableName = notification.payload.table;
const count = notification.payload.count as number;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Validate count from payload before use.

The count is extracted using a type assertion without validation. If the backend sends an invalid value, this could cause runtime issues.

🛡️ Proposed fix to add validation
-    const count = notification.payload.count as number;
+    const count = Number(notification.payload.count) || 0;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const count = notification.payload.count as number;
const count = Number(notification.payload.count) || 0;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@specifyweb/frontend/js_src/lib/components/Notifications/NotificationRenderers.tsx`
at line 415, The code blindly casts notification.payload.count to a number;
instead validate it before use: in the NotificationRenderers component/function,
check that notification and notification.payload exist and that typeof
notification.payload.count === "number" and
Number.isFinite(notification.payload.count) (or parseInt/Number if you expect
numeric strings), then use that validated value (or a safe default like 0 or
bail out/render a fallback) rather than the direct assertion to avoid runtime
errors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Dev Attention Needed

Development

Successfully merging this pull request may close these issues.

3 participants