8367151: [lworld] CorrectlyRestoreRfp.java triggers "bad oop found" during deoptimization #1751

marc-chevalier · 2025-11-20T12:13:53Z

We are on aarch64.

When a function needs stack extension, we build a stack that has this shape:

valhalla/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp

Lines 6040 to 6073 in b1d14c6

    
    // Remove the extension of the caller's frame used for inline type unpacking
    //
    // Right now the stack looks like this:
    //
    // | Arguments from caller     |
    // |---------------------------|  <-- caller's SP
    // | Saved LR #1               |
    // | Saved FP #1               |
    // |---------------------------|
    // | Extension space for       |
    // |   inline arg (un)packing  |
    // |---------------------------|  <-- start of this method's frame
    // | Saved LR #2               |
    // | Saved FP #2               |
    // |---------------------------|  <-- FP
    // | sp_inc                    |
    // | method locals             |
    // |---------------------------|  <-- SP
    //
    // There are two copies of FP and LR on the stack. They will be identical at
    // first, but that can change.
    // If the caller has been deoptimized, LR #1 will be patched to point at the
    // deopt blob, and LR #2 will still point into the old method.
    // If the saved FP (x29) was not used as the frame pointer, but to store an
    // oop, the GC will be aware only of FP #2 as the spilled location of x29 and
    // will fix only this one.
    //
    // When restoring, one must then load FP #2 into x29, and LR #1 into x30,
    // while keeping in mind that from the scalarized entry point, there will be
    // only one copy of each.
    //
    // The sp_inc stack slot holds the total size of the frame including the
    // extension space minus two words for the saved FP and LR. That is how to
    // find LR #1. FP #2 is always located just after sp_inc.
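To make the sp_inc arithmetic concrete, here is a minimal toy model of the layout in the comment above. It is not HotSpot code: the stack is just a vector of 64-bit words with SP at index 0, and `ToySlots`, `build_toy_frame` and `locate_slots` are names made up for this sketch. It assumes sp_inc is stored in bytes and covers everything from SP up to the caller's SP except the two words of FP/LR #1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: index 0 is SP, higher indices are higher addresses.
constexpr std::size_t kWordSize = sizeof(std::uint64_t);

struct ToySlots {
  std::size_t sp_inc;     // index of the sp_inc slot
  std::size_t fp2;        // index of saved FP #2 (just above sp_inc)
  std::size_t lr2;        // index of saved LR #2
  std::size_t fp1;        // index of saved FP #1
  std::size_t lr1;        // index of saved LR #1
  std::size_t caller_sp;  // index of the caller's SP
};

// Lay out an extended frame with the given number of local and extension
// words, and store sp_inc in bytes (assumption of this sketch).
inline std::vector<std::uint64_t> build_toy_frame(std::size_t locals, std::size_t extension) {
  // locals + sp_inc + FP/LR #2 + extension + FP/LR #1
  const std::size_t total_words = locals + 1 + 2 + extension + 2;
  std::vector<std::uint64_t> stack(total_words, 0);
  stack[locals] = (total_words - 2) * kWordSize;  // sp_inc, in bytes
  return stack;
}

// Recover every slot: sp_inc gives the caller's SP, and FP/LR #1 sit
// directly below it.
inline ToySlots locate_slots(const std::vector<std::uint64_t>& stack, std::size_t locals) {
  ToySlots s{};
  s.sp_inc = locals;
  s.fp2 = s.sp_inc + 1;
  s.lr2 = s.fp2 + 1;
  assert(stack[s.sp_inc] % kWordSize == 0);
  s.caller_sp = stack[s.sp_inc] / kWordSize + 2;  // bytes -> words, plus FP and LR
  s.lr1 = s.caller_sp - 1;
  s.fp1 = s.caller_sp - 2;
  return s;
}
```

With 3 words of locals and 4 words of extension space, this model gives a 12-word frame, an sp_inc of 80 bytes, and FP/LR #1 at indices 10 and 11.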

Currently, when leaving the frame, we use LR §1 as the return address (because it can be patched for deoptimization) and FP §2 to restore x29 (because when it contains an oop, the GC is only aware of this copy). I write § rather than # so that GitHub does not render these as PR references.

In our failing case, we have a C2-compiled frame that is being deoptimized when returning from a call to an interpreted method. During deoptimization, the function frame::sender_for_compiled_frame(RegisterMap*) const is used to find the stack slot where rfp (x29) is saved.

valhalla/src/hotspot/cpu/aarch64/frame_aarch64.inline.hpp

Line 446 in b1d14c6

inline frame frame::sender_for_compiled_frame(RegisterMap* map) const {

Actually, this function is a bit more general: it computes the sender frame of a compiled frame and builds the RegisterMap. The problem is that during deoptimization, this function locates the wrong saved copy of rfp (FP §1), because the C2 frame is being modified by the deoptimization process and is no longer recognized as a C2-compiled method that needs stack repair. In this modified frame the sender's sp is correctly known (or the deoptimization mechanism would not work), and the saved FP is taken from the slot two words away from it: that is FP §1. On top of that, if rfp contained an oop and the GC moved the pointed-to object during the call we are returning from, the value we get for rfp from FP §1 is stale.
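The mismatch can be reproduced in a toy model. This is only a sketch, not HotSpot code: `SavedPairs`, `FpCopy` and the three helpers are hypothetical names, and plain integers stand in for oops and return addresses. It shows that whichever copy the GC updates, a reader looking at the other copy sees the old value.

```cpp
#include <cassert>
#include <cstdint>

// Toy model: the four saved slots of an extended frame.
struct SavedPairs {
  std::uint64_t fp1, lr1;  // copy next to the caller's SP
  std::uint64_t fp2, lr2;  // copy at the start of the method's own frame
};

enum class FpCopy { One, Two };

// The GC relocates the object referenced by the saved x29, but only in the
// copy it was told about.
inline void gc_relocate(SavedPairs& f, FpCopy known_to_gc, std::uint64_t new_oop) {
  (known_to_gc == FpCopy::One ? f.fp1 : f.fp2) = new_oop;
}

// Deoptimization patches the return address next to the caller's SP.
inline void patch_for_deopt(SavedPairs& f, std::uint64_t deopt_blob_pc) {
  assert(deopt_blob_pc != 0);  // toy sanity check
  f.lr1 = deopt_blob_pc;
}

// Value a frame walker reports for x29 when it reads the given copy.
inline std::uint64_t read_saved_fp(const SavedPairs& f, FpCopy read_from) {
  return read_from == FpCopy::One ? f.fp1 : f.fp2;
}
```

If the GC is told about copy §2 and deoptimization reads copy §1, `read_saved_fp` returns the pre-relocation value, which corresponds to the "bad oop" symptom. If both sides agree on the same copy, the reader sees the relocated value.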

The good news is that the GC also finds the saved location of rfp through this same function. The bad news is that the GC sees the C2 frame correctly, so for the GC sender_for_compiled_frame locates FP §2. We can consider a few ideas:

- Make the bottom of the deoptimized frame sit under FP/LR §2. This is not possible, for many reasons: we need LR §1, we need to remove the whole frame to find the sender's frame...
- Make sender_for_compiled_frame detect when the deoptimized frame belongs to a C2-compiled method that needs stack repair. No idea how to do that! Also, it seems brittle and more complicated than the next solution.
- Always pick FP §1: since the deoptimized frame will pick FP §1, we can make sure to use FP §1 for a regular C2 frame as well. It is the simplest solution and the one I explain below.

In JDK-8365996, the problem was pretty much the opposite: remove_frame was using FP §1 to restore rfp but the GC only updates FP §2. So the solution was to restore from FP §2:

https://github.com/openjdk/valhalla/pull/1540/files#diff-0f4150a9c607ccd590bf256daa800c0276144682a92bc6bdced5e8bc1bb81f3aR6140-R6145

Here the solution is to revert this part (restore rfp from FP §1) and, in sender_for_compiled_frame, let the GC know about FP §1 only. Overall, let's never speak about FP/LR §2. This way, the sender's sp, the saved LR and the saved FP are always consecutive. FP/LR §2 is only needed to make space between the unpacked arguments and the locals, as there would be between regular arguments and locals. These slots could hold dummy values, and we should probably implement that.

This makes virtual thread tests fail massively, surely because of a mismatch between our choices of FP §1 or §2. Let's problem-list them for now... To help with that, I introduced frame::compiled_frame_details() const to do all these little tricks and return the location of LR/FP §1 and the sender's sp at once, so that frame users don't have to figure out the internal structure.
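For readers who have not opened the diff, the shape of such a helper might look roughly like the sketch below. This is not the code from the PR: `ToyFramePointers` and `toy_frame_details` are hypothetical names, the stack is a plain array of words, and the inputs (`frame_words`, `needs_stack_repair`) are assumptions of this sketch. Only the idea of returning the sender's sp, the address of FP §1 and the address of LR §1 together comes from the description above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the CompiledFramePointers idea: the three
// locations a frame user needs, computed in one place.
struct ToyFramePointers {
  std::uint64_t* sender_sp;       // caller's SP
  std::uint64_t* saved_fp_addr;   // address of FP #1
  std::uint64_t* sender_pc_addr;  // address of LR #1
};

// Assumed inputs: `sp` is the frame's SP, `frame_words` is the compiled frame
// size in words (SP up to and including saved FP/LR #2), and
// `needs_stack_repair` says whether an sp_inc slot sits below FP #2.
inline ToyFramePointers toy_frame_details(std::uint64_t* sp, int frame_words,
                                          bool needs_stack_repair) {
  std::uint64_t* sender_sp = sp + frame_words;
  if (needs_stack_repair) {
    // sp_inc sits just below saved FP #2 and is expressed in bytes.
    const std::uint64_t sp_inc = *(sender_sp - 3);
    assert(sp_inc % sizeof(std::uint64_t) == 0);
    sender_sp = sp + sp_inc / sizeof(std::uint64_t) + 2;
  }
  // After the repair, FP and LR are always the two words below sender_sp.
  return ToyFramePointers{sender_sp, sender_sp - 2, sender_sp - 1};
}
```

The point of the design is that callers never touch FP/LR §2: once the sender's sp has been repaired, the saved FP and LR are read at fixed offsets below it, whether or not the frame was extended.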

I found the extraction of compiled_frame_details a bit risky, so I proceeded in steps: first writing the function, calling it from sender_for_compiled_frame and comparing the results with the old way of getting everything. These temporary asserts weren't triggered, so I then actually used the value returned by compiled_frame_details and removed the now-useless code from sender_for_compiled_frame. So I'm rather confident it does as well as before.
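The stepwise approach is a general pattern and can be sketched independently of HotSpot. In the snippet below, `checked_migration`, `legacy` and `extracted` are hypothetical names: the two callables stand in for the inlined computation and the newly factored-out helper. During the trial phase both run, a temporary assert compares them, and the trusted old value is still the one returned.

```cpp
#include <cassert>
#include <utility>

// Run the old and the new computation side by side and check they agree.
template <typename Legacy, typename Extracted, typename... Args>
auto checked_migration(Legacy&& legacy, Extracted&& extracted, const Args&... args) {
  auto old_result = std::forward<Legacy>(legacy)(args...);
  auto new_result = std::forward<Extracted>(extracted)(args...);
  // Temporary check: remove once the new helper has proven equivalent.
  assert(old_result == new_result && "extracted helper diverges from legacy code");
  return old_result;  // keep returning the trusted value during the trial phase
}
```

Once the assert has survived the test suite, the call site can switch to the extracted helper alone and the legacy computation can be deleted.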

I don't have a strong opinion about the names compiled_frame_details and CompiledFramePointers; feel free to suggest better ones.

Thanks,
Marc

Progress

Change must not contain extraneous whitespace
Commit message must refer to an issue
Change must be properly reviewed (1 review required, with at least 1 Committer)

Issue

JDK-8367151: [lworld] CorrectlyRestoreRfp.java triggers "bad oop found" during deoptimization (Bug - P3)

Reviewers

Tobias Hartmann (@TobiHartmann - Committer)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/valhalla.git pull/1751/head:pull/1751
$ git checkout pull/1751

Update a local copy of the PR:
$ git checkout pull/1751
$ git pull https://git.openjdk.org/valhalla.git pull/1751/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 1751

View PR using the GUI difftool:
$ git pr show -t 1751

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/valhalla/pull/1751.diff

Using Webrev

Link to Webrev Comment

bridgekeeper · 2025-11-20T12:15:06Z

👋 Welcome back mchevalier! A progress list of the required criteria for merging this PR into lworld will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk · 2025-11-20T12:15:55Z

@marc-chevalier This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8367151: [lworld] CorrectlyRestoreRfp.java triggers "bad oop found" during deoptimization

Reviewed-by: thartmann

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 1 new commit pushed to the lworld branch:

c73220a: 8372345: [lworld] Problem list JDK-8372341

Please see this link for an up-to-date comparison between the source branch of this pull request and the lworld branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the lworld branch, type /integrate in a new comment.

mlbridge · 2025-11-20T12:24:41Z

Webrevs

TobiHartmann

Thanks for the detailed explanation, this is great for future reference!

The changes look good to me. Just a few comments.

> I don't have much opinion about the names of compiled_frame_details and CompiledFramePointers, feel free to suggest better if you have a better idea.

Naming is fine with me but do we even need to factor this logic out? Do you expect it to be used in more places in the future?

TobiHartmann · 2025-11-21T08:28:28Z

src/hotspot/cpu/aarch64/frame_aarch64.cpp

+  cfp.sender_pc_addr = (address*)(l_sender_sp - frame::return_addr_offset);
+
+#ifdef ASSERT
+  // when the stack was extnded (so LR #1 and LR #2 are distinct) and LR #1 was patched


Suggested change

-  // when the stack was extnded (so LR #1 and LR #2 are distinct) and LR #1 was patched
+  // when the stack was extended (so LR #1 and LR #2 are distinct) and LR #1 was patched

TobiHartmann · 2025-11-21T08:30:51Z

src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp

+    // find FP/LR #1. This size is expressed in bytes. Be careful when using it
+    // from C++ in pointer arithmetic; you might need to divide it by wordSize.
+    //
+    // TODO 8371993 store fake values instyead of LR/FP#2


Suggested change

-    // TODO 8371993 store fake values instyead of LR/FP#2
+    // TODO 8371993 store fake values instead of LR/FP#2

TobiHartmann · 2025-11-21T08:35:47Z

There seems to be a merge conflict.

openjdk · 2025-11-21T08:37:18Z

@marc-chevalier this pull request can not be integrated into lworld due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout JDK-8367151
git fetch https://git.openjdk.org/valhalla.git lworld
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge lworld"
git push

marc-chevalier · 2025-11-21T08:59:46Z

> Naming is fine with me but do we even need to factor this logic out? Do you expect it to be used in more places in the future?

I suspect we might need a similar thing in virtual threads. I've seen other places where we do this trick of finding the increment and fixing the framesize to find the sp of the caller. For instance

valhalla/src/hotspot/cpu/aarch64/continuationFreezeThaw_aarch64.inline.hpp

Lines 56 to 81 in 29569fb

    
    template<typename FKind>
    inline frame FreezeBase::sender(const frame& f) {
      assert(FKind::is_instance(f), "");
      if (FKind::interpreted) {
        return frame(f.sender_sp(), f.interpreter_frame_sender_sp(), f.link(), f.sender_pc());
      }
      intptr_t** link_addr = link_address<FKind>(f);
      intptr_t* sender_sp = (intptr_t*)(link_addr + frame::sender_sp_offset); //  f.unextended_sp() + (fsize/wordSize); //
      address sender_pc = ContinuationHelper::return_address_at(sender_sp - 1);
      assert(sender_sp != f.sp(), "must have changed");
      int slot = 0;
      CodeBlob* sender_cb = CodeCache::find_blob_and_oopmap(sender_pc, slot);
      // Repair the sender sp if the frame has been extended
      if (sender_cb->is_nmethod()) {
        sender_sp = f.repair_sender_sp(sender_sp, link_addr);
      }
      return sender_cb != nullptr
        ? frame(sender_sp, sender_sp, *link_addr, sender_pc, sender_cb,
                slot == -1 ? nullptr : sender_cb->oop_map_for_slot(slot, sender_pc),
                false /* on_heap ? */)
        : frame(sender_sp, sender_sp, *link_addr, sender_pc);
    }

looks a lot like what was in sender_for_compiled_frame before I extracted it. I think the logic is a bit subtle and worth delegating to a common method. True, it uses repair_sender_sp, but so does my new compiled_frame_details, and there is still work to do around that call.

TobiHartmann · 2025-11-21T09:23:01Z

Right, that makes sense to me. @pchilano might want to re-use that code when fixing the Virtual Threads part.

marc-chevalier · 2025-11-24T07:33:52Z

/integrate

Thanks @TobiHartmann!

openjdk · 2025-11-24T07:34:58Z

Going to push as commit 405db7a.
Since your change was applied there have been 2 commits pushed to the lworld branch:

a483e8c: 8209554: [lworld] ClassCastException thrown for JCK test instead of expected IllegalArgumentException
c73220a: 8372345: [lworld] Problem list JDK-8372341

Your commit was automatically rebased without conflicts.

openjdk · 2025-11-24T07:35:08Z

@marc-chevalier Pushed as commit 405db7a.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

marc-chevalier added 4 commits November 14, 2025 15:19

First fix

c59d4b0

Factor compiled frame pointers computation

4488e74

ProblemList and comment

e53e14c

Actually use compiled_frame_details

06923e1

marc-chevalier marked this pull request as ready for review November 20, 2025 12:18

openjdk bot added the rfr Pull request is ready for review label Nov 20, 2025

TobiHartmann approved these changes Nov 21, 2025

openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Nov 21, 2025

Address comments

b266f50

TobiHartmann approved these changes Nov 21, 2025

Merge remote-tracking branch 'origin/lworld' into JDK-8367151

04a3651

openjdk bot added ready Pull request is ready to be integrated and removed merge-conflict Pull request has merge conflict with target branch labels Nov 21, 2025

TobiHartmann approved these changes Nov 21, 2025

openjdk bot added the integrated Pull request has been integrated label Nov 24, 2025

openjdk bot closed this Nov 24, 2025

openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Nov 24, 2025

This was referenced Nov 27, 2025

8371993: [lworld] Aarch64: save bad values instead of rfp and lr above the extension space #1764

Closed

8367553: [lworld] compiler/valhalla/inlinetypes/TestNullableArrays.java fails with segfault in C1 compiled code on aarch64 #1766

Closed
