runtime/rp2: handle RP2350 shared FIFO IRQ for GC #5482
Open
rdon-key wants to merge 1 commit into
Fixes #5151.
This PR fixes an RP2350 -scheduler=cores deadlock observed when repeatedly calling runtime.GC().
On Pico 2, the test program below stops at the first GC cycle, while the identical program completes on RP2040 and on RP2350 with -scheduler=tasks.
Before this change:
This points to an RP2350-specific issue in the multicore GC path, not a general spinlock problem.
Root cause
The deadlock is in the SIO FIFO IRQ handling used during the GC stop-the-world phase.
RP2040 and RP2350 have different SIO FIFO IRQ topology:
The previous common RP2 runtime registered FIFO IRQ handling with a per-core model that works on RP2040, but does not behave correctly on RP2350 during stop-the-world.
Separating the FIFO IRQ setup per chip resolves the deadlock.
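To illustrate the per-core versus shared distinction, here is a small host-side model in Go. It is only a sketch: the `Chip` type and the `fifoIRQ` helper are hypothetical and are not part of the TinyGo runtime, and the IRQ numbers (15 and 16 for RP2040's SIO_IRQ_PROC0 and SIO_IRQ_PROC1, 25 for RP2350's SIO_IRQ_FIFO) are assumptions taken from the chip datasheets that should be verified against them.

```go
package main

import "fmt"

// Chip is a hypothetical type used only for this illustration.
type Chip int

const (
	RP2040 Chip = iota
	RP2350
)

// IRQ numbers assumed from the RP2040 and RP2350 datasheets.
// Verify against the datasheets before relying on them.
const (
	rp2040SIOIRQProc0 = 15 // FIFO IRQ seen by core 0 on RP2040
	rp2040SIOIRQProc1 = 16 // FIFO IRQ seen by core 1 on RP2040
	rp2350SIOIRQFIFO  = 25 // single FIFO IRQ number used by both cores on RP2350
)

// fifoIRQ returns the IRQ number a given core would enable to receive
// inter-core FIFO interrupts. On RP2040 the number depends on the core;
// on RP2350 both cores use the same number.
func fifoIRQ(chip Chip, core int) int {
	if chip == RP2040 {
		if core == 0 {
			return rp2040SIOIRQProc0
		}
		return rp2040SIOIRQProc1
	}
	return rp2350SIOIRQFIFO
}

func main() {
	names := map[Chip]string{RP2040: "RP2040", RP2350: "RP2350"}
	for _, chip := range []Chip{RP2040, RP2350} {
		for core := 0; core < 2; core++ {
			fmt.Printf("%s core %d -> IRQ %d\n", names[chip], core, fifoIRQ(chip, core))
		}
	}
}
```

Under these assumptions, code that selects the IRQ number from the core index alone fits RP2040 but not RP2350, which is why the setup is kept separate per chip.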
Change
Hardware spinlock note
The existing hardware spinlock implementation is intentionally left unchanged.
I checked whether this reproduced failure needs to be addressed by switching the runtime to software spinlocks. It does not appear to be necessary for this bug.
The runtime currently uses the following hardware spinlock IDs:
This PR adds no new hardware spinlock usage, does not change these lock IDs, and does not add writes to the newer SIO registers involved in the RP2350-E2 aliasing issue.
With the hardware spinlock implementation left unchanged, fixing the SIO FIFO IRQ setup is enough for the test below to pass repeatedly on Pico 2. Therefore, spinlock changes are out of scope for this PR.
Test program
Before this change
With the previous implementation, the test stopped on Pico 2 with -scheduler=cores.
The program did not reach PASS.
After this change
All four tested combinations pass:
pico2 -scheduler=cores was run multiple times with N = 10000 and completed successfully each time.
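For reference, a repeated-GC stress loop of the kind described above could look roughly like the following. This is a hypothetical sketch and not the test program used in this PR: the `stress` function, the worker count, and the allocation size are all assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// stress runs n explicit GC cycles while `workers` goroutines keep
// allocating, and returns the number of cycles that completed.
// With a multicore scheduler, each runtime.GC() call has to stop the
// other cores, which is the path exercised by this PR.
func stress(n, workers int) int {
	done := make(chan struct{})
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				select {
				case <-done:
					return
				default:
					_ = make([]byte, 64) // create some garbage
					runtime.Gosched()    // let other goroutines run
				}
			}
		}()
	}

	completed := 0
	for i := 0; i < n; i++ {
		runtime.GC()
		completed++
	}

	close(done)
	wg.Wait()
	return completed
}

func main() {
	const N = 100 // hypothetical; the PR reports runs with N = 10000
	if stress(N, 2) == N {
		fmt.Println("PASS")
	}
}
```

A deadlock in the stop-the-world phase would show up as the loop never reaching `PASS`, matching the failure mode described for the previous implementation.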