Created concepts for samplers, added quotient_and_pdf variants to satisfy the concepts by karimsayedre · Pull Request #1001 · Devsh-Graphics-Programming/Nabla

karimsayedre · 2026-02-18T20:17:45Z

Examples PR

Notes:

The quotient_and_pdf() methods in UniformHemisphere, UniformSphere, ProjectedHemisphere, and ProjectedSphere shadow the struct type sampling::quotient_and_pdf<Q, P> from quotient_and_pdf.hlsl. DXC can't resolve the return type because the method name takes precedence over the struct name during lookup. Fixed by fully qualifying with ::nbl::hlsl::sampling::quotient_and_pdf<U, T>.
Obviously there's some refactoring to be done to satisfy all the concepts, so for now only Basic (Level1) samplers are concept tested

…isfy the concepts

include/nbl/builtin/hlsl/sampling/box_muller_transform.hlsl

include/nbl/builtin/hlsl/sampling/concepts.hlsl

include/nbl/builtin/hlsl/sampling/cos_weighted_spheres.hlsl

include/nbl/builtin/hlsl/sampling/quotient_and_pdf.hlsl

…concepts
- Move codomain_and_*Pdf and domain_and_*Pdf structs into their own warp_and_pdf.hlsl header
- Keeping quotient_and_pdf.hlsl focused on importance sampling quotients for BxDFs
- Add SampleWithPDF, SampleWithRcpPDF, and SampleWithDensity concepts to validate sample types
- Used concept composition (NBL_CONCEPT_REQ_TYPE_ALIAS_CONCEPT) to build ResamplableSampler on TractableSampler and BijectiveSampler on ResamplableSampler

include/nbl/builtin/hlsl/sampling/cos_weighted_spheres.hlsl

include/nbl/builtin/hlsl/sampling/warp_and_pdf.hlsl

include/nbl/builtin/hlsl/sampling/concepts.hlsl

…ical triangle

…nstead

…tion

…cal tri

… projected/spherical triangle

devshgraphicsprogramming · 2026-03-20T23:36:32Z

include/nbl/builtin/hlsl/sampling/spherical_rectangle.hlsl

+        const scalar_type au = u.x * solidAngle + k;
        const scalar_type fu = (hlsl::cos<scalar_type>(au) * b0 - b1) / hlsl::sin<scalar_type>(au);
        const scalar_type cu_2 = hlsl::max<scalar_type>(fu * fu + b0 * b0, 1.f); // forces `cu` to be in [-1,1]
        const scalar_type cu = ieee754::flipSignIfRHSNegative<scalar_type>(scalar_type(1.0) / hlsl::sqrt<scalar_type>(cu_2), fu);


use inversesqrt instead of scalar_type(1.0) / hlsl::sqrt<scalar_type>(cu_2)

also cu*sign(fu) can possibly be faster than integer bit tricks, try to profile

also fix the naming, cu_2 is actually rcpCu_2 from what I see (but carried over from my code) double check with the paper

devshgraphicsprogramming · 2026-03-20T23:38:52Z

include/nbl/builtin/hlsl/sampling/spherical_rectangle.hlsl


-        const scalar_type au = uv.x * S + k;
+        const scalar_type au = u.x * solidAngle + k;
        const scalar_type fu = (hlsl::cos<scalar_type>(au) * b0 - b1) / hlsl::sin<scalar_type>(au);


read the paper, or deduce if au can't be more than PI or less than 0

if indeed 0 <= au <=PI then you can use inversesqrt(1.f-cos*cos) instead of division by sin and profile/benchmark
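A minimal sketch of the substitution suggested above, written in C++ rather than HLSL so it can be run on the host. It assumes 0 < au < PI so that sin(au) is positive; `rsqrt` is a hypothetical stand-in for the shader inverse square root, and `b0`/`b1` are just inputs here, not the values the sampler computes.

```cpp
#include <cmath>

// Stand-in for an inverse-square-root intrinsic (an assumption for this sketch).
inline double rsqrt(double x) { return 1.0 / std::sqrt(x); }

// Form currently in the diff: divides by sin(au).
inline double fuWithSin(double au, double b0, double b1)
{
    return (std::cos(au) * b0 - b1) / std::sin(au);
}

// Suggested form: sqrt(1 - cos^2) only recovers |sin(au)|,
// so this matches fuWithSin only while sin(au) > 0, i.e. 0 < au < PI.
inline double fuWithRsqrt(double au, double b0, double b1)
{
    const double c = std::cos(au);
    return (c * b0 - b1) * rsqrt(1.0 - c * c);
}
```

Outside that range the two forms differ by a sign, which is why the bound on au has to be established from the paper before swapping them.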

devshgraphicsprogramming · 2026-03-20T23:46:36Z

include/nbl/builtin/hlsl/sampling/spherical_rectangle.hlsl

        const scalar_type cu_2 = hlsl::max<scalar_type>(fu * fu + b0 * b0, 1.f); // forces `cu` to be in [-1,1]
        const scalar_type cu = ieee754::flipSignIfRHSNegative<scalar_type>(scalar_type(1.0) / hlsl::sqrt<scalar_type>(cu_2), fu);

        scalar_type xu = -(cu * r0.z) / hlsl::sqrt<scalar_type>(scalar_type(1.0) - cu * cu);


I see an opportunity for algebraic optimization

r0.z * sign(negFu) * inversesqrt(rcpCu_2 - 1.0);

Note that rcpCu_2 is probably the current cu_2, because that's wrongly named

this way you should get rid of the cu variable altogether
and compute negFu instead of fu by reversing the order of the subtraction in the current fu equation
(note that the later cu_2 computation uses fu*fu, which stays constant regardless of the sign of fu)
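The rewrite suggested in this comment can be checked numerically. The sketch below is host-side C++, not the HLSL from the PR: `std::copysign` stands in for `ieee754::flipSignIfRHSNegative` (assumed to behave the same for a positive left-hand side), `rsqrt` is a hypothetical inverse-square-root helper, and the later clamp on `xu` is left out.

```cpp
#include <cmath>

// Stand-in for an inverse-square-root intrinsic (an assumption for this sketch).
inline double rsqrt(double x) { return 1.0 / std::sqrt(x); }

// Formulation from the diff: builds cu, then divides by sqrt(1 - cu*cu).
inline double xuOriginal(double fu, double b0, double r0z)
{
    const double cu_2 = std::fmax(fu * fu + b0 * b0, 1.0);
    const double cu = std::copysign(1.0 / std::sqrt(cu_2), fu);
    return -(cu * r0z) / std::sqrt(1.0 - cu * cu);
}

// Suggested formulation: no cu at all, takes negFu = -fu.
// Uses cu / sqrt(1 - cu^2) == sign(fu) / sqrt(rcpCu_2 - 1) with rcpCu_2 = 1 / cu^2.
inline double xuSimplified(double negFu, double b0, double r0z)
{
    const double rcpCu_2 = std::fmax(negFu * negFu + b0 * b0, 1.0);
    return r0z * std::copysign(1.0, negFu) * rsqrt(rcpCu_2 - 1.0);
}
```

Both versions blow up when rcpCu_2 hits the clamp value of exactly 1, so the existing clamp of xu against r0.x and r1.x would still be needed.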

devshgraphicsprogramming · 2026-03-20T23:52:01Z

include/nbl/builtin/hlsl/sampling/spherical_rectangle.hlsl

-        r0.z = -hlsl::abs(r0.z);
-        vector3_type r1 = r0 + vector3_type(rect.extents.x, rect.extents.y, 0);
+        retval.r0 = hlsl::promote<vector3_type>(-hlsl::abs(retval.r0.z));


I'm 99% sure this change broke everything and the solid angle rectangle sampler is broken

you're assigning -abs(r0.z) to all 3 components

it was just supposed to be r0.z = ieee754::negativeAbs(r0.z) I think, that's all

could also just abs(r0.z) and correct everything that uses r0.z to flip its sign
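To make the difference concrete, here is a small host-side C++ sketch. `Vec3` and both function names are hypothetical; the first function mimics what broadcasting a scalar through `hlsl::promote<vector3_type>` does, the second does what the comment says was intended.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

// What the changed line does: the scalar -abs(r0.z) is written to all three components.
inline Vec3 broadcastNegativeAbsZ(Vec3 r0)
{
    const double v = -std::fabs(r0.z);
    return Vec3{ v, v, v };
}

// What was intended: only the z component is forced negative, x and y are kept.
inline Vec3 withNegativeAbsZ(Vec3 r0)
{
    r0.z = -std::fabs(r0.z);
    return r0;
}
```

For an input of (1, 2, 3) the broadcast version yields (-3, -3, -3), while the intended version yields (1, 2, -3).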

devshgraphicsprogramming · 2026-03-21T00:26:31Z

include/nbl/builtin/hlsl/sampling/spherical_rectangle.hlsl


        scalar_type xu = -(cu * r0.z) / hlsl::sqrt<scalar_type>(scalar_type(1.0) - cu * cu);
        xu = hlsl::clamp<scalar_type>(xu, r0.x, r1.x); // avoid Infs
        const scalar_type d_2 = xu * xu + r0.z * r0.z;


could be computed as r0_z_squared*(1/(rcpCu_2-1)+1) and the r0_z_squared kept as a member

Actually algebra-ing more

d_2 = r0_z_squared*rcpCu_2/(rcpCu_2-1)
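The first form above, r0_z_squared*(1/(rcpCu_2-1)+1), can be checked against the direct xu*xu + r0.z*r0.z from the diff. Putting the two terms over a common denominator gives r0_z_squared*rcpCu_2/(rcpCu_2-1), and the sketch below compares all three. It is host-side C++ under two assumptions: xu is not changed by the clamp, and xu*xu equals r0_z_squared/(rcpCu_2-1) as in the earlier simplification.

```cpp
#include <cmath>

// Direct form from the diff, with xu^2 = r0z^2 / (rcpCu_2 - 1) substituted in.
inline double d2Direct(double r0z, double rcpCu_2)
{
    const double xu = r0z / std::sqrt(rcpCu_2 - 1.0); // sign dropped, it cancels in xu*xu
    return xu * xu + r0z * r0z;
}

// First suggested form, with r0_z_squared precomputed (e.g. kept as a member).
inline double d2Suggested(double r0zSquared, double rcpCu_2)
{
    return r0zSquared * (1.0 / (rcpCu_2 - 1.0) + 1.0);
}

// Same expression over a common denominator: 1/(x-1) + 1 == x/(x-1).
inline double d2SingleDivision(double r0zSquared, double rcpCu_2)
{
    return r0zSquared * rcpCu_2 / (rcpCu_2 - 1.0);
}
```

If the clamp on xu does kick in, none of these shortcuts apply and d_2 has to be computed from the clamped value.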

devshgraphicsprogramming · 2026-03-21T00:35:02Z

include/nbl/builtin/hlsl/sampling/spherical_rectangle.hlsl

@@ -74,18 +90,45 @@

        const scalar_type h0 = r0.y / hlsl::sqrt<scalar_type>(d_2 + r0.y * r0.y);
        const scalar_type h1 = r1.y / hlsl::sqrt<scalar_type>(d_2 + r1.y * r1.y);


looks like you should also keep a r0_y_squared and r1_y_squared

btw everything that happens to h0 and h1 here is sampling a line's projection on the sphere, which is useful if you want to make the rectangle sampling robust to being thin (it would involve flipping the local X and Y axes of the rectangle / basis rows such that the y-direction is always longer)

also use inversesqrt here

devshgraphicsprogramming · 2026-03-21T00:38:12Z

include/nbl/builtin/hlsl/sampling/spherical_triangle.hlsl

+
+	struct cache_type
+	{
+		density_type pdf;


pdf is constant, have an empty cache

devshgraphicsprogramming · 2026-03-21T00:41:37Z

include/nbl/builtin/hlsl/sampling/spherical_triangle.hlsl

+
+		vector3_type retval = tri_vertices[1];
+		const scalar_type cosBC_s = nbl::hlsl::dot(C_s, tri_vertices[1]);
+		const scalar_type csc_b_s = 1.0 / nbl::hlsl::sqrt(1.0 - cosBC_s * cosBC_s);


can we use inversesqrt without messing up the precision ?

devshgraphicsprogramming · 2026-03-21T00:42:10Z

include/nbl/builtin/hlsl/sampling/spherical_triangle.hlsl

+	domain_type _generateInverse(const codomain_type L)
+	{
+		const scalar_type cosAngleAlongBC_s = nbl::hlsl::dot(L, tri_vertices[1]);
+		const scalar_type csc_a_ = 1.0 / nbl::hlsl::sqrt(1.0 - cosAngleAlongBC_s * cosAngleAlongBC_s);


inversesqrt without messing up precision ?

devshgraphicsprogramming · 2026-03-21T00:44:08Z

include/nbl/builtin/hlsl/shapes/spherical_rectangle.hlsl

        angle_adder.addCosine(cosGamma[1]);
        angle_adder.addCosine(cosGamma[2]);
        angle_adder.addCosine(cosGamma[3]);
        return angle_adder.getSumofArccos() - scalar_type(2.0) * numbers::pi<float>;


template sincos_accumulator::getSumofArccos on which acos implementation it should use (let the default template arg be impl::acos_helper) and try some fast inverse trig functions

devshgraphicsprogramming · 2026-03-21T00:45:56Z

include/nbl/builtin/hlsl/sampling/spherical_triangle.hlsl

+		math::sincos_accumulator<scalar_type> angle_adder = math::sincos_accumulator<scalar_type>::create(cosA, sinA);
+		angle_adder.addAngle(cosB_, sinB_);
+		angle_adder.addAngle(cosC_, sinC_);
+		const scalar_type subTriSolidAngleRatio = (angle_adder.getSumofArccos() - numbers::pi<scalar_type>) * (scalar_type(1.0) / solidAngle);


let's store rcpSolidAngle instead of solidAngle

devshgraphicsprogramming · 2026-03-21T00:46:28Z

include/nbl/builtin/hlsl/sampling/spherical_triangle.hlsl

+		// 1 ULP below 1.0, ensures (1.0 - cosBC_s) is strictly positive in float
+		const scalar_type one_below_one = bit_cast<scalar_type>(bit_cast<uint_type>(scalar_type(1)) - uint_type(1));


don't we have a ieee754 function for this ?

just wrote this in `ieee754.hlsl`, looks good?

yes great, you should probably require that T is scalar

maaaybe make a nextTowardZero, then you don't need to care about sign or negativity

devshgraphicsprogramming · 2026-03-21T00:48:59Z

include/nbl/builtin/hlsl/sampling/spherical_triangle.hlsl

+		{
+			const scalar_type cosAngleAlongAC = ((v_ * q - u_ * p) * cosA - v_) / ((v_ * p + u_ * q) * sinA);
+			if (nbl::hlsl::abs(cosAngleAlongAC) < 1.f)
+				C_s += math::quaternion<scalar_type>::slerp_delta(tri_vertices[0], tri_vertices[2] * triCscB, cosAngleAlongAC);


store tri_vertices[2] * triCscB instead of tri_vertices[2] and triCscB

the triCscB < numeric_limits<scalar_type>::max check can be done with cosA<1.f-epsilon instead

leave a comment that there's plenty of opportunity for optimization given that first two arguments to slerp_delta are constant

Nabla/include/nbl/builtin/hlsl/math/quaternions.hlsl

Line 295 in 00017ad

vector3_type planeNormal = hlsl::cross(start,preScaledWaypoint);

devshgraphicsprogramming · 2026-03-21T00:57:47Z

include/nbl/builtin/hlsl/sampling/spherical_triangle.hlsl

+			const scalar_type cosAngleAlongBC_s = nbl::hlsl::clamp(scalar_type(1.0) + cosBC_s * u.y - u.y, scalar_type(-1.0), scalar_type(1.0));
+			if (nbl::hlsl::abs(cosAngleAlongBC_s) < scalar_type(1.0))


you don't need the clamp, you have an abs(...)<1.f right before you use it for anything

devshgraphicsprogramming · 2026-03-21T01:02:32Z

include/nbl/builtin/hlsl/sampling/spherical_triangle.hlsl

+	vector3_type tri_vertices[3];
+	scalar_type triCosC;
+	scalar_type triCscB;
+	scalar_type triCscC;


take a template parameter for whether you want the SphericalTriangle to be bijective (have a generateInverse method) and implement the Bijective in terms of Resamplable & Tractable + the triCscC member

devshgraphicsprogramming · 2026-03-21T01:04:12Z

include/nbl/builtin/hlsl/sampling/projected_spherical_triangle.hlsl


+    weight_type backwardWeight(const vector3_type L)
+    {
+        return backwardPdf(L);


now here's our time to shine!

The reason for this whole damn refactor and distinction between PDF and MIS weight

you can simply return abs(L.z)*triangleProjectedSolidAngle for the weight!

You would need to compute the triangleProjectedSolidAngle during create though.

I'd implement backwardWeight in terms of forwardWeight (can fill unused cache variables with garbage)

devshgraphicsprogramming · 2026-03-21T01:07:29Z

include/nbl/builtin/hlsl/sampling/projected_spherical_triangle.hlsl

-        retval.sphtri = sampling::SphericalTriangle<T>::create(tri);
-        return retval;
-    }
+        density_type pdf;


you want this to be basically only L_z for the weight functions and bilinear cache to get the forward PDF

devshgraphicsprogramming · 2026-03-21T01:08:08Z

include/nbl/builtin/hlsl/sampling/projected_spherical_triangle.hlsl

+    // NOTE: produces a degenerate (all-zero) bilinear patch when the receiver normal faces away
+    // from all three triangle vertices, resulting in NaN PDFs (0 * inf). Callers must ensure
+    // at least one vertex has positive projection onto the receiver normal.
+    Bilinear<scalar_type> computeBilinearPatch()


this is why this sampler shouldn't hand out a PDF but only weights

add a __ prefix to the function

also why aren't you storing the bilinear patch as a member instead of re-creating it on the fly?

the receiver normal doesn't change between calls to generate

Also where's a static create method and creation params ?

devshgraphicsprogramming · 2026-03-21T01:11:31Z

include/nbl/builtin/hlsl/sampling/projected_spherical_triangle.hlsl

+        const vector2_type u = sphtri.generateInverse(L);
+        Bilinear<scalar_type> bilinear = computeBilinearPatch();
        return pdf * bilinear.backwardPdf(u);
    }


this function is not needed at all

devshgraphicsprogramming · 2026-03-21T01:26:28Z

include/nbl/builtin/hlsl/sampling/projected_spherical_triangle.hlsl

+    density_type forwardPdf(const cache_type cache)
    {
-        const scalar_type cos_c = sphtri.tri.cos_sides[2];
-        const scalar_type csc_b = sphtri.tri.csc_sides[1];
-        const scalar_type solidAngle = sphtri.tri.solidAngle();
-        return generate(rcpPdf, solidAngle, cos_c, csc_b, receiverNormal, isBSDF, u);
+        return cache.pdf;


we'd get the cache.bilinearCache and query the PDF of that and multiply it with sphtri.rcpSolidAngle

devshgraphicsprogramming · 2026-03-21T01:27:21Z

include/nbl/builtin/hlsl/sampling/projected_spherical_triangle.hlsl

+
    sampling::SphericalTriangle<T> sphtri;
+    vector3_type receiverNormal;
+    bool receiverWasBSDF;


don't need receiverNormal and receiverWasBSDF members, need bilinearPatch and rcpProjSolidAngle instead

so it's useful to have a create (which computes bilinear for you from receiverNormal and wasBSDF) but not the kind I told you to remove before

#966 (comment)

# Conflicts: # examples_tests # include/nbl/builtin/hlsl/shapes/spherical_triangle.hlsl

devshgraphicsprogramming · 2026-03-24T15:33:03Z

include/nbl/builtin/hlsl/sampling/cumulative_probability_builder.h

+// Builds a normalized cumulative histogram from an array of non-negative weights.
+// Output has N-1 entries (last bucket implicitly 1.0).
+template<typename T>
+void computeNormalizedCumulativeHistogram(const T* weights, uint32_t N, T* outCumProb)


maybe take a span<T> weights then the N is implicit, also do a special path for size()<2 because right now it will crash
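A minimal sketch of what that signature change could look like. To stay buildable on pre-C++20 toolchains it uses a tiny hypothetical `Span` struct in place of `std::span`; the early return for fewer than two weights and the assumption that the weights sum to something positive are choices of this sketch, not taken from the PR.

```cpp
#include <cstddef>

// Minimal stand-in for std::span<T> (C++20); hypothetical, for illustration only.
template<typename T>
struct Span
{
    T* ptr;
    std::size_t count;
    std::size_t size() const { return count; }
    T& operator[](std::size_t i) const { return ptr[i]; }
};

// Writes weights.size()-1 normalized cumulative probabilities to outCumProb;
// the last bucket is implicitly 1.0. Assumes the weights sum to a positive value.
template<typename T>
void computeNormalizedCumulativeHistogram(Span<const T> weights, T* outCumProb)
{
    const std::size_t N = weights.size();
    if (N < 2) // zero or one bucket: there are no explicit entries to write
        return;
    T total = T(0);
    for (std::size_t i = 0; i < N; ++i)
        total += weights[i];
    const T rcpTotal = T(1) / total;
    T running = T(0);
    for (std::size_t i = 0; i + 1 < N; ++i)
    {
        running += weights[i];
        outCumProb[i] = running * rcpTotal;
    }
}
```

For weights {1, 1, 2, 4} this writes {0.125, 0.25, 0.5}, and a single-weight input leaves the output untouched.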

devshgraphicsprogramming · 2026-03-24T17:38:56Z

include/nbl/builtin/hlsl/algorithm.hlsl

 {
    impl::bound_t<Accessor,Comparator> implementation = impl::bound_t<Accessor,Comparator>::setup(begin,end,value,comp);
-    return implementation(accessor);
+    const uint32_t retval = implementation(accessor);


WARNING impl::bound_t<Accessor,Comparator>::setup takes comp as const

need to change the impl and provide it from a NBL_REF_ARG through the methods of impl::bound_t

devshgraphicsprogramming · 2026-03-24T17:44:20Z

include/nbl/builtin/hlsl/sampling/uniform_spheres.hlsl

 	static weight_type backwardWeight(const codomain_type sample)
 	{
-		return backwardPdf(sample);
+		return T(0.5) * hemisphere_t::backwardWeight(sample);


remember sample is an HLSL keyword

devshgraphicsprogramming · 2026-03-24T17:48:30Z

include/nbl/builtin/hlsl/sampling/uniform_spheres.hlsl

-		T z = T(1.0) - T(2.0) * _sample.x;
-		T r = hlsl::sqrt<T>(hlsl::max<T>(T(0.0), T(1.0) - z * z));
-		T phi = T(2.0) * numbers::pi<T> * _sample.y;
-		return vector_t3(r * hlsl::cos<T>(phi), r * hlsl::sin<T>(phi), z);
+		// Map _sample.x from [0,1] into hemisphere sample + sign flip:
+		// upper hemisphere when _sample.x < 0.5, lower when >= 0.5
+		const bool chooseLower = _sample.x >= T(0.5);
+		const T hemiX = chooseLower ? (T(2.0) * _sample.x - T(1.0)) : (T(2.0) * _sample.x);
+		vector_t3 retval = hemisphere_t::__generate(vector_t2(hemiX, _sample.y));
+		retval.z = chooseLower ? (-retval.z) : retval.z;
+		return retval;


ugh no the code is definitely worse here with all the conditionals, I just need a shared common function between both that takes a z coord externally
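One way to read that request, sketched in host-side C++: a single helper takes the z coordinate from the caller, and the hemisphere and sphere generators differ only in how they map `u.x` to z. `Vec3` and all function names are hypothetical; the sphere mapping is the one from the removed lines of the diff, while the hemisphere mapping z = 1 - u.x is only an assumption.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

// Shared helper: builds a unit vector from an externally supplied z in [-1,1]
// and the second random number uy in [0,1], which selects the azimuth.
inline Vec3 generateFromZ(double z, double uy)
{
    const double pi = 3.14159265358979323846;
    const double r = std::sqrt(std::fmax(0.0, 1.0 - z * z));
    const double phi = 2.0 * pi * uy;
    return Vec3{ r * std::cos(phi), r * std::sin(phi), z };
}

// Sphere: z = 1 - 2*u.x, as in the lines removed by the diff.
inline Vec3 generateSphere(double ux, double uy)
{
    return generateFromZ(1.0 - 2.0 * ux, uy);
}

// Hemisphere: z in [0,1]; this particular mapping is an assumption of the sketch.
inline Vec3 generateHemisphere(double ux, double uy)
{
    return generateFromZ(1.0 - ux, uy);
}
```

Neither generator needs a conditional this way, and the sphere version no longer has to rescale `u.x` and flip a sign.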

devshgraphicsprogramming · 2026-03-24T18:07:27Z

include/nbl/builtin/hlsl/sampling/alias_table.hlsl

+	NBL_PRIMARY_REQUIRES(
+		concepts::accessors::GenericReadAccessor<ProbabilityAccessor, T, Codomain> &&
+		concepts::accessors::GenericReadAccessor<AliasIndexAccessor, Codomain, Codomain> &&
+		concepts::accessors::GenericReadAccessor<PdfAccessor, T, Codomain>)


require that Domain is an UnsignedIntegralScalar (or whatever the concept is called)

devshgraphicsprogramming · 2026-03-24T18:17:43Z

include/nbl/builtin/hlsl/sampling/box_muller_transform.hlsl

        math::sincos<scalar_type>(scalar_type(2.0) * numbers::pi<scalar_type> * u.y - numbers::pi<scalar_type>, sinPhi, cosPhi);
        const codomain_type outPos = vector2_type(cosPhi, sinPhi) * nbl::hlsl::sqrt(scalar_type(-2.0) * nbl::hlsl::log(u.x)) * stddev;
-        cache.pdf = backwardPdf(outPos);
+        cache.u_x = u.x;
        return outPos;
    }

    density_type forwardPdf(const cache_type cache)
    {
-        return cache.pdf;
+        return halfRcpStddev2 * numbers::inv_pi<scalar_type> * cache.u_x;


ok so you checked my math from the comment and it was wrong ?
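The cached form in the diff can be checked against the plain 2D Gaussian density. Since the sample radius satisfies |p|^2 = -2 ln(u.x) stddev^2, the exponential term exp(-|p|^2 / (2 stddev^2)) collapses to u.x. The sketch below is host-side C++ and assumes `halfRcpStddev2` means 1 / (2 stddev^2); `Vec2` and the function names are hypothetical.

```cpp
#include <cmath>

static const double kPi = 3.14159265358979323846;

struct Vec2 { double x, y; };

// Box-Muller transform as in the diff: angle from u.y, radius from u.x.
inline Vec2 boxMuller(double ux, double uy, double stddev)
{
    const double phi = 2.0 * kPi * uy - kPi;
    const double radius = std::sqrt(-2.0 * std::log(ux)) * stddev;
    return Vec2{ std::cos(phi) * radius, std::sin(phi) * radius };
}

// Density of an isotropic 2D Gaussian evaluated at the generated point.
inline double gaussianPdf(Vec2 p, double stddev)
{
    const double halfRcpStddev2 = 0.5 / (stddev * stddev);
    return halfRcpStddev2 / kPi * std::exp(-(p.x * p.x + p.y * p.y) * halfRcpStddev2);
}

// Cached form from the diff: only u.x is stored, no exp() is evaluated.
inline double forwardPdfFromCache(double ux, double stddev)
{
    const double halfRcpStddev2 = 0.5 / (stddev * stddev);
    return halfRcpStddev2 / kPi * ux;
}
```

Under that reading of `halfRcpStddev2` the two agree up to rounding, so caching `u.x` is enough to reproduce the forward PDF.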

Created concepts for samplers, added quotient_and_pdf variants to sat…

f71610b

…isfy the concepts