Move bgfx initialization from main to JS thread.#1620
Move bgfx initialization from main to JS thread.#1620bkaradzic-microsoft wants to merge 6 commits intomasterfrom
Conversation
66def78 to
ff71c90
Compare
ff71c90 to
18288df
Compare
There was a problem hiding this comment.
Pull request overview
This PR refactors the graphics threading model so bgfx initialization and frame submission happen on the JS (API) thread, while the main/UI thread pumps bgfx rendering via a new Device::RenderFrame() loop, and teardown is coordinated via Device::Shutdown().
Changes:
- Move bgfx init/frame ownership to the JS thread via an auto-driven JS-side frame loop started by
Device::AddToJavaScript(). - Introduce main-thread
RenderFrame()pumping and a two-phaseShutdown()to ensure bgfx shutdown occurs on the JS thread. - Update NativeEngine/Canvas/XR and apps/tests to the new frame-loop + RenderFrame pumping model.
Reviewed changes
Copilot reviewed 33 out of 33 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| Polyfills/Canvas/Source/nanovg/nanovg_babylon.cpp | Update FrameBuffer bind/viewport calls to new signatures (no encoder arg). |
| Polyfills/Canvas/Source/Context.cpp | Switch to bgfx::begin() and new FrameBuffer API during Canvas flush. |
| Plugins/NativeXr/Source/NativeXrImpl.cpp | Adjust XR frame scheduling/clears to new bgfx threading/encoder usage. |
| Plugins/NativeEngine/Source/PerFrameValue.h | Remove encoder parameters from per-frame flag accessors. |
| Plugins/NativeEngine/Source/NativeEngine.h | Remove update-token API and simplify bound framebuffer access. |
| Plugins/NativeEngine/Source/NativeEngine.cpp | Replace update-token encoders with bgfx::begin(); update framebuffer binding logic. |
| Plugins/ExternalTexture/Source/ExternalTexture_Shared.h | Add render-thread dispatch chain to safely call overrideInternal after resource creation. |
| Core/Graphics/Source/SafeTimespanGuarantor.cpp | Simplify safe-timespan state machine and callback triggering. |
| Core/Graphics/Source/FrameBuffer.cpp | Remove encoder dependency from Bind/Unbind/viewport/scissor internals. |
| Core/Graphics/Source/DeviceImpl.h | Add RenderFrame() and render-thread callback dispatch; remove per-thread encoder cache. |
| Core/Graphics/Source/DeviceImpl.cpp | Move bgfx init to JS thread; implement main-thread RenderFrame(); adjust shutdown behavior. |
| Core/Graphics/Source/DeviceContext.cpp | Remove UpdateToken plumbing; expose render-thread dispatch. |
| Core/Graphics/Source/Device.cpp | Start JS-side frame loop from AddToJavaScript; implement RenderFrame() + Shutdown(). |
| Core/Graphics/InternalInclude/Babylon/Graphics/SafeTimespanGuarantor.h | Simplify guarantor state and remove locking/waiting API surface. |
| Core/Graphics/InternalInclude/Babylon/Graphics/FrameBuffer.h | Update FrameBuffer API to remove encoder params where no longer needed. |
| Core/Graphics/InternalInclude/Babylon/Graphics/DeviceContext.h | Remove UpdateToken; add render-thread dispatch API; update viewId allocation signature. |
| Core/Graphics/Include/Shared/Babylon/Graphics/Device.h | Publicly expose RenderFrame()/Shutdown() and document new lifecycle expectations. |
| Apps/UnitTests/Source/Tests.ShaderCache.cpp | Update test to pump RenderFrame() and call Device::Shutdown(). |
| Apps/UnitTests/Source/Tests.JavaScript.cpp | Update test to pump RenderFrame() until JS completion, then shutdown. |
| Apps/UnitTests/Source/Tests.ExternalTexture.cpp | Update tests to new rendering lifecycle and async completion via RenderFrame() pumping. |
| Apps/UnitTests/Source/Tests.ExternalTexture.D3D11.cpp | Update D3D11 layer-index external texture test to new lifecycle. |
| Apps/UnitTests/Source/Tests.Device.D3D11.cpp | Use EnableRendering()/DisableRendering() without per-frame update tokens. |
| Apps/StyleTransferApp/Win32/App.cpp | Replace manual Start/Finish frame sequencing with RenderFrame() pumping and Shutdown(). |
| Apps/PrecompiledShaderTest/Source/App.cpp | Replace update-token driven frame sequencing with RenderFrame() pumping + shutdown. |
| Apps/Playground/visionOS/LibNativeBridge.mm | Replace frame Start/Finish sequence with Device::RenderFrame(). |
| Apps/Playground/macOS/ViewController.mm | Replace frame Start/Finish sequence with Device::RenderFrame(). |
| Apps/Playground/iOS/LibNativeBridge.mm | Replace frame Start/Finish sequence with Device::RenderFrame(). |
| Apps/Playground/X11/App.cpp | Replace frame Start/Finish sequence with Device::RenderFrame(). |
| Apps/Playground/Win32/App.cpp | Replace frame Start/Finish sequence with Device::RenderFrame(); set thread description. |
| Apps/Playground/UWP/App.cpp | Replace frame Start/Finish sequence with Device::RenderFrame(). |
| Apps/Playground/Shared/AppContext.cpp | Centralize teardown with Device::Shutdown() before destroying JS runtime/canvas. |
| Apps/Playground/Android/BabylonNative/src/main/cpp/BabylonNativeJNI.cpp | Replace frame Start/Finish sequence with Device::RenderFrame(). |
| Apps/HeadlessScreenshotApp/Win32/App.cpp | Replace update-token driven sequencing with RenderFrame() pumping + shutdown. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
| throw std::runtime_error{"Safe timespan cannot end if guarantor state is not open"}; | ||
| } | ||
| if (m_count == 0) | ||
| { | ||
| m_state = State::Closed; | ||
| m_closeDispatcher.tick(*m_cancellation); | ||
| } | ||
| else | ||
| { | ||
| m_state = State::Closing; | ||
| } | ||
| } | ||
|
|
||
| void SafeTimespanGuarantor::Lock() | ||
| { | ||
| std::scoped_lock lock{m_mutex}; | ||
| if (m_state != State::Closed) | ||
| { | ||
| throw std::runtime_error{"SafeTimespanGuarantor can only be locked from a closed state"}; | ||
| } | ||
| m_state = State::Locked; | ||
| } | ||
|
|
||
| void SafeTimespanGuarantor::Unlock() | ||
| { | ||
| std::scoped_lock lock{m_mutex}; | ||
| if (m_state != State::Locked) | ||
| { | ||
| throw std::runtime_error{"SafeTimespanGuarantor can only be unlocked if it was locked"}; | ||
| } | ||
| m_state = State::Closed; | ||
| } | ||
|
|
||
| SafeTimespanGuarantor::SafetyGuarantee SafeTimespanGuarantor::GetSafetyGuarantee() | ||
| { | ||
| std::unique_lock lock{m_mutex}; | ||
| if (m_state == State::Closed || m_state == State::Locked) | ||
| { | ||
| m_condition_variable.wait(lock, [this]() { return m_state != State::Closed && m_state != State::Locked; }); | ||
| } | ||
| m_count++; | ||
|
|
||
| return gsl::finally(std::function<void()>{[this] { | ||
| std::scoped_lock lock{m_mutex}; | ||
| if (--m_count == 0 && m_state == State::Closing) | ||
| { | ||
| m_state = State::Closed; | ||
| m_closeDispatcher.tick(*m_cancellation); | ||
| } | ||
| }}); | ||
| m_closeDispatcher.tick(*m_cancellation); | ||
| } |
There was a problem hiding this comment.
m_closeDispatcher.tick(...) is invoked while holding m_mutex, which can deadlock if any close callback attempts to interact with SafeTimespanGuarantor (or anything else that needs the same lock). Consider moving the tick out of the locked region.
| { | ||
| std::scoped_lock lock{m_mutex}; | ||
| if (m_state != State::Closed) | ||
| { | ||
| throw std::runtime_error{"Safe timespan cannot begin if guarantor state is not closed"}; | ||
| } | ||
| m_state = State::Open; | ||
| } | ||
|
|
||
| m_condition_variable.notify_all(); | ||
| std::this_thread::yield(); | ||
|
|
||
| m_openDispatcher.tick(*m_cancellation); | ||
| } |
There was a problem hiding this comment.
m_openDispatcher.tick(...) executes arbitrary queued callbacks; calling it while holding m_mutex risks deadlocks/re-entrancy issues if any callback (directly or indirectly) calls back into SafeTimespanGuarantor. Consider releasing the lock before ticking (set state under lock, then tick after unlocking).
| auto framePump = std::make_shared<std::function<void(Napi::Env)>>(); | ||
| auto* stopFlag = &m_stopRequested; | ||
| auto* runningFlag = &m_frameLoopRunning; | ||
| *framePump = [impl = m_impl.get(), &guarantor, &jsRuntime, framePump, stopFlag, runningFlag](Napi::Env) { | ||
| if (stopFlag->load()) |
There was a problem hiding this comment.
StartFrameLoop captures raw pointers/references to m_stopRequested/m_frameLoopRunning and a raw DeviceImpl* into an async JS dispatch loop. If the Device instance is moved/assigned (move semantics are still enabled) or destroyed without a prior Shutdown(), these captured addresses can become stale, leading to hangs (stop flag not observed) or UAF. Consider storing the frame-loop state in a shared_ptr that moves with the Device, or disallow moving/destroying while the frame loop is running (assert + delete move ops after start).
| std::atomic<bool> done{false}; | ||
| m_jsRuntime->Dispatch([this, &done, beforeDisableRendering = std::move(beforeDisableRendering)](Napi::Env) { | ||
| if (beforeDisableRendering) |
There was a problem hiding this comment.
Shutdown() unconditionally dereferences m_jsRuntime (m_jsRuntime->Dispatch(...)). If Shutdown() is called on a Device that was never AddToJavaScript-ed (or after a move that nulled the pointer), this will crash. Consider guarding with an explicit precondition (assert/throw) or falling back to a non-JS shutdown path when m_jsRuntime == nullptr.
|
@bkaradzic-microsoft I've opened a new pull request, #1623, to work on those changes. Once the pull request is ready, I'll request review from you. |
`Plugins/NativeEngine/Source/NativeEngine.cpp` had a UTF-8 BOM (`EF BB BF`) prepended, which can break toolchains and linters that expect plain ASCII/UTF-8 without BOM. ## Changes - Stripped the three-byte UTF-8 BOM from the start of `NativeEngine.cpp` so the file begins directly with `#include "NativeEngine.h"` <!-- START COPILOT CODING AGENT TIPS --> --- 💬 We'd love your input! Share your thoughts on Copilot coding agent in our [2 minute survey](https://gh.io/copilot-coding-agent-survey). --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: bkaradzic-microsoft <260535795+bkaradzic-microsoft@users.noreply.github.com>
No description provided.