[DO-NOT-MERGE] Partner integration test: #1652 + #1646#1674
Draft
bghgary wants to merge 57 commits intoBabylonJS:masterfrom
Draft
[DO-NOT-MERGE] Partner integration test: #1652 + #1646#1674bghgary wants to merge 57 commits intoBabylonJS:masterfrom
bghgary wants to merge 57 commits intoBabylonJS:masterfrom
Conversation
- Add Tests.ExternalTexture.Render.cpp: end-to-end test that renders a texture array through a ShaderMaterial to an external render target, verifying each slice (red, green, blue) via pixel readback. - Add tests.externalTexture.render.ts: JS test with sampler2DArray shader. - Add RenderDoc.h/cpp to UnitTests for optional GPU capture support. - Add Utils helpers: CreateTestTextureArrayWithData, CreateRenderTargetTexture, ReadBackRenderTarget, DestroyRenderTargetTexture (D3D11, Metal, stubs for D3D12/OpenGL). - Fix ExternalTexture_Shared.h: pass m_impl->NumLayers() instead of hardcoded 1 in Attach(), preserving texture array metadata. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove layerIndex parameter from Impl::Update declaration to match the updated signature in ExternalTexture_Shared.h. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Migrate HeadlessScreenshotApp, StyleTransferApp, and PrecompiledShaderTest from AddToContextAsync (promise-based) to CreateForJavaScript (synchronous). This removes the extra frame pump and promise callbacks that were previously required. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…llers - Move RenderDoc.h/cpp to D3D11-only CMake block with HAS_RENDERDOC define - Guard RenderDoc calls with HAS_RENDERDOC instead of WIN32 - Update StyleTransferApp and PrecompiledShaderTest to use CreateForJavaScript Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
On OpenGL, TextureT is unsigned int (not a pointer), so reinterpret_cast fails. Add NativeHandleToUintPtr helper using if constexpr to handle both pointer types (D3D11/Metal/D3D12) and integer types (OpenGL). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- RenderDoc.h/cpp now accept void* device instead of ID3D11Device* - Move RenderDoc to WIN32 block (not D3D11-only) since it works with any API - Fix OpenGL build: use NativeHandleToUintPtr helper for TextureT cast - Add Linux support (dlopen librenderdoc.so) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move DestroyTestTexture after FinishRenderingCurrentFrame so bgfx::frame() processes the texture creation command before the native resource is released. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Ensure the JS startup dispatch completes before calling deviceUpdate.Finish() and FinishRenderingCurrentFrame(). This guarantees that bgfx::frame() processes the CreateForJavaScript texture creation command, making the texture available for subsequent render frames. The old async API had an implicit sync point (addToContext.wait) that the new sync API lost. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…eation - Rename CreateForJavaScriptWithTextureArray to CreateForJavaScript and use arraySize=1 since texture array rendering is covered by RenderTextureArray. The old test crashed on CI (STATUS_BREAKPOINT in bgfx when creating texture arrays via encoder on WARP). - Revert two-step create+override approach back to single createTexture2D call with _external parameter (overrideInternal from JS thread doesn't work since the D3D11 resource isn't created until bgfx::frame). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CreateForJavaScript already exists in Tests.ExternalTexture.cpp, causing a linker duplicate symbol error. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… test) The D3D11-specific CreateForJavaScript test crashed on CI due to bgfx assertions when calling createTexture2D with _external on the encoder thread. The cross-platform CreateForJavaScript test in Tests.ExternalTexture.cpp already covers this functionality. The texture array rendering is covered by RenderTextureArray. Also revert app startup ordering to Finish->Wait (matching the pattern used by HeadlessScreenshotApp on master). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The bgfx callback's fatal() handler was silently calling debugBreak() on DebugCheck assertions with no output, making CI crashes impossible to diagnose. Now logs the file, line, error code and message to stderr before breaking, so the assertion details appear in CI logs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add DISM/d3dconfig step to CI to enable D3D debug layer, which will provide detailed D3D11 validation messages for the CreateShaderResourceView E_INVALIDARG failure. Kept the _external createTexture2D path (reverted the AfterRenderScheduler approach) so we can see the actual D3D debug output that explains the SRV mismatch. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The bgfx _external texture path triggers E_INVALIDARG in CreateShaderResourceView on CI's WARP D3D11 driver. The overrideInternal alternative doesn't support full array textures (hardcodes ArraySize=1). Since the _external path works on real GPUs, skip the render test on CI via BABYLON_NATIVE_SKIP_RENDER_TESTS and keep the direct _external path. Also adds D3D debug layer enablement to CI for future diagnostics, and logs bgfx fatal errors to stderr before crashing. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Switch from _external parameter (crashes on WARP) to create+overrideInternal two-step approach. The overrideInternal path is compatible with WARP but sets ArraySize=1 in the SRV, so the RenderTextureArray test (which needs full array access) is skipped on CI. The render test works on real GPUs where the _external path succeeds. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The overrideInternal call fires on AfterRenderScheduler after the first bgfx::frame(). An additional frame pump ensures the native texture backing is applied before the scene render. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove deleted Tests.ExternalTexture.D3D11.cpp from Install/Test/CMakeLists.txt - Add extra frame pump after CreateForJavaScript in HeadlessScreenshotApp and StyleTransferApp so overrideInternal has time to apply the native texture backing before the first render. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…er FrameCompletionScope acquisition.
…ompletionScope instead of deferred member release.
This reverts commit 0d315bb.
…cquired FrameCompletionScope in SubmitCommands.
…ck on FrameCompletionScope.
…enderScheduler fires one frame late when called outside RAF.
…te leaking into nanovg rendering.
… getFrameBufferData outside RAF.
…nder-test # Conflicts: # .github/jobs/win32.yml
…erResourceView The createTexture2D _external path now works on WARP after the bgfx update in master. Drop the placeholder + AfterRenderScheduler + overrideInternal two-step dance and the extra frame pump that existed to apply it. This also restores full array-slice SRV so RenderTextureArray no longer needs a sanitizer-only skip. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove 'pump an extra frame' block from StyleTransferApp and HeadlessScreenshotApp (no longer needed after the _external revert). - Update ExternalTexture Readme to match the one-call createTexture2D path and direct-construction consumer pattern. - Drop unused DispatchSync declaration from UnitTests/Utils.h. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The SKIP_RENDER_TESTS flag was originally added because WARP could not handle the _external path in bgfx. bgfx PR BabylonJS#1669 fixed CreateShaderResourceView for the _external parameter, so the render test should pass on WARP now. Let CI verify. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Reword BABYLON_NATIVE_SKIP_RENDER_TESTS option description. WARP is available on Win32, so 'no real GPU' is inaccurate; the real reason to skip is a missing per-backend test harness. - Add braces around if constexpr / else branches in NativeHandleToUintPtr per coding style. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- RenderDoc.cpp: drop absolute Windows path; use plain #include "renderdoc_app.h" - Tests.ExternalTexture.Render.cpp: handle renderSlice rejection and add wait timeout to prevent test hangs - Tests.ExternalTexture.cpp: add ExternalTexture.Update test (Update() path was dropped in the refactor) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Restore AddToContextAsync as a deprecated shim around CreateForJavaScript for source compat with existing consumers - Tighten render test: color tolerance 25 -> 2; require exact 100%% match (was 90%%) - Drop BABYLON_NATIVE_SKIP_RENDER_TESTS option; gate SKIP_RENDER_TESTS directly on GRAPHICS_API STREQUAL "D3D12" Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
# Conflicts: # Apps/HeadlessScreenshotApp/Win32/App.cpp # Apps/PrecompiledShaderTest/Source/App.cpp # Apps/StyleTransferApp/Win32/App.cpp
d42b95a to
4c79b1d
Compare
Matches the established pattern from ExternalTexture.AddToContextAsyncAndUpdate on master: wait for JS work to complete while the frame is still open, then Finish. The new CreateForJavaScript and RenderTextureArray tests in this PR had accidentally dropped this pattern, which works on master but deadlocks under the reworked threading model in BabylonJS#1652 (Finish closes the frame gate while JS is still acquiring FrameCompletionScopes). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…e-mutex deadlock Reproduce the deadlock reported in BabylonJS#1646 (comment) without needing any Babylon scene, GC pressure, or external scheduling. The bug: CreateForJavaScript holds m_impl->Mutex() across the JS-side property lookup `Graphics::DeviceContext::GetFromJavaScript(env)`. That lookup can run user JS or engine GC/finalizers (e.g. a sibling Texture finalizer registered by Napi::Pointer::Create) which themselves re-enter m_impl->Mutex() on the same thread. On MSVC, recursively locking std::mutex throws std::system_error("resource_deadlock_would_occur"), which then escapes the AppRuntime dispatch lambda and triggers std::abort. The test redefines `_native._Graphics` as an accessor whose getter, when invoked, asks the ExternalTexture for its Width() (which also takes m_impl->Mutex()) and observes whether the lock is currently held on the caller's thread. With the bug present, Width() throws system_error and the test reports `recursiveLockObserved=true`. With the fix, the lookup runs before the mutex is acquired and Width() succeeds. Test is Win32-only because it relies on MSVC's std::mutex deadlock detection to surface the recursive lock as a recoverable exception. [Created by Copilot on behalf of @bghgary] Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The JS-side property lookup `Graphics::DeviceContext::GetFromJavaScript` can run engine GC and finalizers, which on Chakra can finalize a previously-collected Texture wrapper for the same ExternalTexture. That finalizer also takes m_impl->Mutex(). Holding the mutex across the lookup therefore re-enters the same std::mutex on the same thread and throws std::system_error on MSVC, which then escapes the AppRuntime dispatch lambda and triggers std::abort. DeviceContext is process-scoped and does not require the impl mutex, so move its resolution above the scoped_lock. The lock still covers all m_impl reads and the AddTexture call. Verified by the regression test added in the prior commit: ExternalTexture.CreateForJavaScriptDoesNotHoldImplMutexAcrossJsCallout fails before this change and passes after. [Created by Copilot on behalf of @bghgary] Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The regression test added in dd75619 uses the Node-API C api_define_properties to install a test accessor for `_native._Graphics`. The JSI Node-API shim does not expose that symbol, so the JSI build (e.g. Win32_x64_JSI_D3D11) failed to compile. Gate the test on `defined(NAPI_VERSION)`, which is defined by the shared Node-API header (Chakra) but not by the JSI Node-API shim. The bug being tested is also Chakra-specific (synchronous GC during a JS-side property lookup), so skipping on JSI matches the test's actual scope. [Created by Copilot on behalf of @bghgary] Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The test installed a JS getter on `_native._Graphics` to deterministically reproduce the recursive-lock condition by re-entering `m_impl->Mutex()` from inside the lookup. It worked, but on review it bought too little: - It validates this specific shape of fix (lock not held across the GetFromJavaScript callout) rather than the absence of the bug. If anyone later switched the impl to `std::recursive_mutex`, the test would silently pass even with the buggy "lock-held-across-callout" pattern, since recursive_mutex does not throw on re-entrant lock. - It is tightly coupled to MSVC's std::mutex deadlock-detection throwing std::system_error on recursive lock; the probe is silent on libc++/libstdc++ where re-entrant lock is undefined behavior rather than an exception. - The wild-case crash involves a Chakra finalizer firing during a JS property lookup, which we cannot deterministically arrange. The test had to fabricate the recursive-lock shape via a getter, which makes it more of a fault- injection test than a true regression for the original failure path. The fix itself (resolve DeviceContext before locking impl) is small and well-reviewed; the cost/value trade-off does not justify the test complexity. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The mutex on ExternalTexture::Impl bridges graphics-thread state (m_info, m_ptr, m_textures) to JS-thread reads in CreateForJavaScript. Previously the outer ExternalTexture class reached through Impl::Mutex() to lock around bodies that mixed cached-field reads, bgfx calls, and napi object allocation. That shape is what produced BabylonJS#1646: a JS object allocation triggered a sibling Texture finalizer on the same thread, which re-entered the same mutex and aborted under MSVC's std::mutex. Move the lock inside Impl so the lock scope matches the actual cross-thread boundary, no more, no less: - ImplBase exposes two JS-thread entry points, CreateTexture (returns a fully-constructed Graphics::Texture*) and DestroyTexture (called from the napi finalizer). Each takes the mutex internally and contains no JS callouts, so the lock can never be held across user-visible JS execution. This structurally prevents the recursive-mutex bug at this call site -- not just the one already fixed in 415f193. - Impl::Update self-locks around the m_info / m_ptr / UpdateTextures publish, with explicit braces around the locked region. - The Mutex() accessor, AddTexture, and RemoveTexture are removed from ImplBase's public surface. m_mutex is no longer reachable from outside the impl. - ExternalTexture::Width/Height/Get/Update no longer lock. Per the public contract documented in ExternalTexture.h, every operation except CreateForJavaScript is graphics-thread-only and therefore serialized against Update() by single-threaded execution. The previous internal locks on these getters did not actually make them safe to call from other threads (no happens-before with prior Updates) and only obscured the real synchronization boundary. - ExternalTexture::CreateForJavaScript becomes a thin wrapper: resolve DeviceContext (JS lookup, must run unlocked), call Impl::CreateTexture, then Napi::Pointer::Create with a finalizer that delegates to Impl::DestroyTexture via weak_ptr. No platform-specific Impl files (D3D11/D3D12/Metal/OpenGL) need changes -- none of them touched the mutex or texture set.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Test vehicle for partner integration: combines #1652 + #1646. This PR exists solely to run CI and give partners a single branch to consume. Do not merge — the two source PRs land independently.
Contents
origin/master@ 81a7bfe (latest, includes Tighten errorRatio on visual tests. #1673 and Add programmatic GPU frame capture via TestUtils.captureNextFrame. #1675)origin/rework-thread-model(Reworked threading model. #1652: Reworked threading model)bghgary/external-texture-render-test(Add synchronous ExternalTexture.CreateForJavaScript and deprecate AddToContextAsync #1646: Synchronous ExternalTexture API + recursive-mutex deadlock fix + Impl mutex encapsulation refactor)Conflict resolution
3 apps (
HeadlessScreenshotApp,PrecompiledShaderTest,StyleTransferApp) conflicted where #1652 added a "reopen the gate so JS can continue running" block in the same region #1646 removed the old extra-frame-pump pattern. Kept #1652's new block in all three.Testing
ExternalTexture.*unit tests: all pass (Construction, CreateForJavaScript, Update; RenderTextureArray skipped on D3D12 as in Add synchronous ExternalTexture.CreateForJavaScript and deprecate AddToContextAsync #1646).ExternalTexture.*. Root cause was a recursive-mutex deadlock inExternalTexture::CreateForJavaScript(lock held across a JS property lookup that ran a sibling Texture finalizer on the same thread, re-entering the samestd::mutexand aborting under MSVC). Fixed in Add synchronous ExternalTexture.CreateForJavaScript and deprecate AddToContextAsync #1646 by resolvingGraphics::DeviceContextbefore acquiring the impl lock; the broader Impl-mutex encapsulation refactor in Add synchronous ExternalTexture.CreateForJavaScript and deprecate AddToContextAsync #1646 makes this bug shape structurally impossible at the call sites that previously held the lock across JS callouts.