Rebuild full search document for tag deletes by mgwozdz-unicon · Pull Request #38477 · openedx/openedx-platform

mgwozdz-unicon · 2026-04-29T16:32:54Z

Description

This PR in conjunction with the openedx-core PR (openedx/openedx-core#571) is the backend component for openedx/modular-learning#260. The openedx-core PR (openedx/openedx-core#571) ensures that when a tag and its descendants are deleted that CONTENT_OBJECT_ASSOCIATIONS_CHANGED events take place. This PR ensures that once those events happen, the entire search document is replaced for the relevant content. Without this PR, those events were only leading to a partial update of the document, which was sufficient to update renamed tags but not sufficient to remove deleted tags from the search index.

Goal:
When a Course Author deletes a tag that is associated to Library content, they should be able to return to the Library page and not see those tags in the tag search filter.

Supporting information

Github Issue: openedx/modular-learning#260

Testing instructions

Note: This cannot be tested without the code from the following PRs in place:

Go to Taxonomies and select a taxonomy
Add new tags to this taxonomy that you plan to delete. (I have been adding a subtag (e.g. Wind1) with 2 sibbling children (W1.1 and W1.2) for my testing.)
Go to a Library and create 2 new pieces of content.
For one piece of content, associate the parent subtag (e.g. Wind1) and click Save.
For the other piece of content, associate a child (e.g. W1.2) and click Save.
Click on the tag search filter button and note that Wind1 and W1.2 appear in the tag search filter.
Return to the taxonomy and delete the subtag Wind1.
It should disappear from the taxonomy and a toast should display indicating what was deleted.
Go back to the Library and refresh the page in the browser.
The pieces of content you previously associated those tags to should now display that 0 tags are associated to them.
Click on the tag search filter button and note that Wind1 and W1.2 no longer appear in the tag search filter.

Deadline

This is for our Verawood code cut stretch goal.

Other information

While this code requires other PRs in place to be able to test, I believe it can be safely merged without them.

openedx-webhooks · 2026-04-29T16:33:00Z

Thanks for the pull request, @mgwozdz-unicon!

This repository is currently maintained by @openedx/wg-maintenance-openedx-platform.

Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review.

🔘 Get product approval

If you haven't already, check this list to see if your contribution needs to go through the product review process.

If it does, you'll need to submit a product proposal for your contribution, and have it reviewed by the Product Working Group.
- This process (including the steps you'll need to take) is documented here.
If it doesn't, simply proceed with the next step.

🔘 Provide context

To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:

Dependencies

This PR must be merged before / after / at the same time as ...
Blockers

This PR is waiting for OEP-1234 to be accepted.
Timeline information

This PR must be merged by XX date because ...
Partner information

This is for a course on edx.org.
Supporting documentation
Relevant Open edX discussion forum threads

🔘 Get a green build

If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.

Details

Where can I find more information?

If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:

When can I expect my changes to be merged?

Our goal is to get community contributions seen and reviewed as efficiently as possible.

However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:

The size and impact of the changes that it introduces
The need for product review
Maintenance status of the parent repository

💡 As a result it may take up to several weeks or months to complete a review and merge your PR.

ormsbee

Just a couple of questions on things that I'm confused about.

ormsbee · 2026-04-29T17:13:23Z

+    Returns a tuple of (doc, should_delete_index_doc). The second value is True when the
+    object no longer exists in a form that should remain indexed.
+    """
+    if isinstance(key, LibraryCollectionLocator):


[Nit (optional)]: The match statement with class matching works well for this kind of code:

match key: case LibraryCollectionLocator(): ... case LibraryContainerLocator(): ... case _: raise TypeError(...)

ormsbee · 2026-04-29T20:23:27Z

+    try:
+        doc, should_delete_doc = _build_full_content_object_index_doc(key)
+    except ItemNotFoundError:
+        should_delete_doc = True


I'm a bit confused by this code. What raises the ItemNotFoundError, and why does it bubble up to here?

The exception was coming from modulestore().get_item() in the course-block path. I’m moving that handling into _build_full_content_object_index_doc() so missing objects are normalized there for all key types, and the caller just deletes the stale index doc when should_delete_doc is true.

ormsbee · 2026-04-29T20:24:55Z

    _wait_for_meili_tasks(tasks)


+def _replace_index_docs(docs) -> None:


Am I right in understanding that this takes a list of docs, but it's only ever called with a single doc in that list in upsert_content_object_tags_index_doc()? Is it intended to be called with a list of docs elsewhere?

Yes. I'm updating it to take a single doc to match the actual usage.

bradenmacdonald · 2026-04-29T22:39:19Z

+            if not doc.get(Fields.display_name):
+                return None, True


Can we mention in a comment what this is checking for? ("searchable_doc_for_container always returns a document, even if the container has been deleted, but in that case some fields will be missing.")

Nit: I also think we should be careful here to only return None, True if Fields.display_name is actually absent, and not just empty "".

The docstring for searchable_doc_for_container seems wrong; it says Field.type will be missing, but looking at the code that doesn't seem to be the case. Out of scope for this PR but feel free to correct if you want.

bradenmacdonald · 2026-04-29T22:45:56Z

+    Helper function that fully replaces a document in the search index.
+
+    We use this when nested fields need to be removed from indexed documents, because
+    Meilisearch partial updates do not reliably clear nested facet values.


Nit: This seems like a bug in Meilisearch. Were you able to find a bug report on their GitHub, or some explanation in their documentation as to why this is necessary?

bradenmacdonald · 2026-04-29T23:17:43Z

@mgwozdz-unicon OK, I played around with your openedx-core PR and the authoring PR but not using this PR and I definitely see the Meilisearch bug you're fixing here.

Just setting Field.tags: {} does not seem to clear the values of the sub-tags properly in the Meilisearch document, even though it shows the tags field being totally empty in the Meilisearch UI:

However, I found a much simpler fix for this problem that doesn't require reindexing the whole docs or refactoring this code so significantly. I think the following is sufficient:

--- a/openedx/core/djangoapps/content/search/documents.py
+++ b/openedx/core/djangoapps/content/search/documents.py
@@ -383,7 +383,10 @@ def searchable_doc_tags(object_id: OpaqueKey) -> dict:
     if not all_tags:
         # Clear out tags in the index when unselecting all tags for the block, otherwise
         # it would remain the last value if a cleared Fields.tags field is not included
-        return {Fields.tags: {}}
+        return {Fields.tags: {
+            Fields.tags_taxonomy: [],
+            Fields.tags_level0: [],
+        }}
     result = {
         Fields.tags_taxonomy: [],
         Fields.tags_level0: [],

What do you think?

(Note: it may be best to add level1, level2, etc. as well. Also the code could be re-organized to avoid duplication with the result = ... below, since we're now essentially returning result in either case.)

ormsbee · 2026-04-30T04:49:57Z

@mgwozdz-unicon: Just a note that openedx-core 1.0.0 has been released.

Co-authored-by: Copilot <copilot@github.com>

openedx-webhooks added the open-source-contribution PR author is not from Axim or 2U label Apr 29, 2026

openedx-webhooks added this to Contributions Apr 29, 2026

github-project-automation Bot moved this to Needs Triage in Contributions Apr 29, 2026

mgwozdz-unicon mentioned this pull request Apr 29, 2026

Delete tag updates meilisearch openedx/openedx-core#571

Merged

mphilbrick211 moved this from Needs Triage to Waiting on Author in Contributions Apr 29, 2026

mgwozdz-unicon force-pushed the tag-delete-updates-meilisearch branch from 53c46c2 to 03d7d0f Compare April 29, 2026 17:19

ormsbee reviewed Apr 29, 2026

View reviewed changes

bradenmacdonald reviewed Apr 29, 2026

View reviewed changes

Comment thread openedx/core/djangoapps/content/search/api.py Outdated

bradenmacdonald reviewed Apr 29, 2026

View reviewed changes

mgwozdz-unicon force-pushed the tag-delete-updates-meilisearch branch 2 times, most recently from 05e6f0c to a55c105 Compare April 30, 2026 00:36

mgwozdz-unicon and others added 3 commits April 30, 2026 06:45

feat: switch approach to clearing nested tag fields

12f9a45

Co-authored-by: Copilot <copilot@github.com>

feat: bump openedx-core version

ae1afb1

fix: test_documents should expect output on all levels for tags

e8d81d3

Co-authored-by: Copilot <copilot@github.com>

mgwozdz-unicon force-pushed the tag-delete-updates-meilisearch branch from 857dd5f to e8d81d3 Compare April 30, 2026 13:50

fix: put back whitespace

7f90cdc

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rebuild full search document for tag deletes#38477

Rebuild full search document for tag deletes#38477
mgwozdz-unicon wants to merge 4 commits intoopenedx:masterfrom
mgwozdz-unicon:tag-delete-updates-meilisearch

mgwozdz-unicon commented Apr 29, 2026

Uh oh!

openedx-webhooks commented Apr 29, 2026

Uh oh!

ormsbee left a comment

Uh oh!

ormsbee Apr 29, 2026

Uh oh!

ormsbee Apr 29, 2026

Uh oh!

mgwozdz-unicon Apr 29, 2026

Uh oh!

ormsbee Apr 29, 2026

Uh oh!

mgwozdz-unicon Apr 29, 2026

Uh oh!

bradenmacdonald Apr 29, 2026

Uh oh!

Uh oh!

bradenmacdonald Apr 29, 2026

Uh oh!

bradenmacdonald commented Apr 29, 2026 •

edited

Loading

Uh oh!

ormsbee commented Apr 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

		_wait_for_meili_tasks(tasks)


		def _replace_index_docs(docs) -> None:

Conversation

mgwozdz-unicon commented Apr 29, 2026

Description

Supporting information

Testing instructions

Deadline

Other information

Uh oh!

openedx-webhooks commented Apr 29, 2026

Uh oh!

ormsbee left a comment

Choose a reason for hiding this comment

Uh oh!

ormsbee Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

ormsbee Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

mgwozdz-unicon Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

ormsbee Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

mgwozdz-unicon Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

bradenmacdonald Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

bradenmacdonald Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

bradenmacdonald commented Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ormsbee commented Apr 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

bradenmacdonald commented Apr 29, 2026 •

edited

Loading