Created concepts for samplers, added quotient_and_pdf variants to satisfy the concepts #1001

Open
karimsayedre wants to merge 39 commits into master from sampler-concepts

Contributor

@karimsayedre karimsayedre commented Feb 18, 2026

Examples PR

Notes:

  • The quotient_and_pdf() methods in UniformHemisphere, UniformSphere, ProjectedHemisphere, and ProjectedSphere shadow the struct type sampling::quotient_and_pdf<Q, P> from quotient_and_pdf.hlsl. DXC can't resolve the return type because the method name takes precedence over the struct name during lookup. Fixed by fully qualifying with ::nbl::hlsl::sampling::quotient_and_pdf<U, T>.
  • Obviously there's some refactoring to be done to satisfy all the concepts, so for now only the Basic (Level1) samplers are concept tested
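
A C++ analogue of the name clash from the first note, as a minimal sketch. The simplified `quotient_and_pdf` struct and `UniformHemisphereSketch` are illustrative stand-ins, not the actual Nabla definitions, and C++ compilers do not resolve this lookup exactly as DXC does. The part that mirrors the fix is the fully qualified return type.

```cpp
namespace nbl { namespace hlsl { namespace sampling {

// Simplified stand-in for the struct template from quotient_and_pdf.hlsl.
template<typename Q, typename P>
struct quotient_and_pdf
{
    Q quotient;
    P pdf;
};

// Hypothetical sampler whose method shares its name with the struct template above.
struct UniformHemisphereSketch
{
    // Inside this class the unqualified name `quotient_and_pdf` refers to the method,
    // so the return type is spelled with its full namespace path.
    static ::nbl::hlsl::sampling::quotient_and_pdf<float, float> quotient_and_pdf()
    {
        ::nbl::hlsl::sampling::quotient_and_pdf<float, float> retval;
        retval.quotient = 1.0f;
        retval.pdf = 0.15915494f; // 1 / (2*pi), the uniform hemisphere density
        return retval;
    }
};

}}} // namespace nbl::hlsl::sampling
```

Spelling the type from the global namespace keeps the declaration unambiguous no matter which name the compiler prefers inside the class scope.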

…concepts

- Move codomain_and_*Pdf and domain_and_*Pdf structs into their own warp_and_pdf.hlsl header
- Keep quotient_and_pdf.hlsl focused on importance sampling quotients for BxDFs
- Add SampleWithPDF, SampleWithRcpPDF, and SampleWithDensity concepts to validate sample types
- Use concept composition (NBL_CONCEPT_REQ_TYPE_ALIAS_CONCEPT) to build ResamplableSampler on TractableSampler and BijectiveSampler on ResamplableSampler
const scalar_type au = u.x * solidAngle + k;
const scalar_type fu = (hlsl::cos<scalar_type>(au) * b0 - b1) / hlsl::sin<scalar_type>(au);
const scalar_type cu_2 = hlsl::max<scalar_type>(fu * fu + b0 * b0, 1.f); // forces `cu` to be in [-1,1]
const scalar_type cu = ieee754::flipSignIfRHSNegative<scalar_type>(scalar_type(1.0) / hlsl::sqrt<scalar_type>(cu_2), fu);

use inversesqrt instead of scalar_type(1.0) / hlsl::sqrt<scalar_type>(cu_2)

also cu*sign(fu) can possibly be faster than integer bit tricks, try to profile

also fix the naming: cu_2 is actually rcpCu_2 from what I see (carried over from my code), but double check with the paper


const scalar_type au = uv.x * S + k;
const scalar_type au = u.x * solidAngle + k;
const scalar_type fu = (hlsl::cos<scalar_type>(au) * b0 - b1) / hlsl::sin<scalar_type>(au);

read the paper, or deduce if au can't be more than PI or less than 0

if indeed 0 <= au <=PI then you can use inversesqrt(1.f-cos*cos) instead of division by sin and profile/benchmark
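
A small C++ sketch of the suggested replacement, assuming the range check holds (it still has to be confirmed against the paper). For 0 < au < PI the sine is positive, so 1/sin(au) equals the inverse square root of 1 - cos(au)^2. `rcpSinFromCos` is a hypothetical helper name, and `1/sqrt` stands in for HLSL's `rsqrt`.

```cpp
#include <cmath>

// Reciprocal of sin(au) computed from cos(au) alone.
// Only valid for 0 < au < pi, where sin(au) is strictly positive.
inline float rcpSinFromCos(float cosAu)
{
    // HLSL would use rsqrt here; standard C++ has no such intrinsic.
    return 1.0f / std::sqrt(1.0f - cosAu * cosAu);
}
```

Whether this is faster than dividing by `sin` depends on the target hardware, which is why the comment asks for a benchmark.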

const scalar_type cu_2 = hlsl::max<scalar_type>(fu * fu + b0 * b0, 1.f); // forces `cu` to be in [-1,1]
const scalar_type cu = ieee754::flipSignIfRHSNegative<scalar_type>(scalar_type(1.0) / hlsl::sqrt<scalar_type>(cu_2), fu);

scalar_type xu = -(cu * r0.z) / hlsl::sqrt<scalar_type>(scalar_type(1.0) - cu * cu);
Member

@devshgraphicsprogramming devshgraphicsprogramming Mar 20, 2026

I see an opportunity for algebraic optimization

r0.z * sign(negFu) * inversesqrt(rcpCu_2 - 1.0);

Note that rcpCu_2 is probably the current cu_2, because that's wrongly named

this way you should get rid of the cu variable altogether
and compute negFu instead of fu by reversing the order of the subtraction in the current fu equation
(note that the subsequent cu_2 computation uses fu*fu, which stays constant regardless of the sign of fu)
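
The simplification can be checked numerically. Since cu = sign(fu)/sqrt(rcpCu_2), the ratio cu/sqrt(1 - cu*cu) reduces to sign(fu)/sqrt(rcpCu_2 - 1). The sketch below compares both forms in plain C++; `xuOriginal` and `xuSimplified` are hypothetical names, `std::copysign` stands in for `ieee754::flipSignIfRHSNegative`, and the later clamp on `xu` is left out.

```cpp
#include <cmath>

// xu as written in the diff, with cu_2 renamed to rcpCu_2.
inline float xuOriginal(float fu, float b0, float r0z)
{
    const float rcpCu_2 = std::fmax(fu * fu + b0 * b0, 1.0f);
    const float cu = std::copysign(1.0f / std::sqrt(rcpCu_2), fu);
    return -(cu * r0z) / std::sqrt(1.0f - cu * cu);
}

// Proposed form: no cu variable, takes negFu = -fu directly.
inline float xuSimplified(float negFu, float b0, float r0z)
{
    const float rcpCu_2 = std::fmax(negFu * negFu + b0 * b0, 1.0f); // unchanged by the sign flip
    return r0z * std::copysign(1.0f, negFu) / std::sqrt(rcpCu_2 - 1.0f);
}
```

Both forms divide by zero when rcpCu_2 hits the lower bound of 1, so the clamp that follows in the real code is still needed.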

Comment on lines +62 to +66
r0.z = -hlsl::abs(r0.z);
vector3_type r1 = r0 + vector3_type(rect.extents.x, rect.extents.y, 0);
retval.r0 = hlsl::promote<vector3_type>(-hlsl::abs(retval.r0.z));

I'm 99% sure this change broke everything and the solid angle rectangle sampler is broken

you're assigning -abs(r0.z) to all 3 components

it was just supposed to be r0.z = ieee754::negativeAbs(r0.z) I think, that's all
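
To make the difference concrete, here is a plain C++ sketch of the two behaviours. `float3`, `negateAbsZ` and `promoteNegAbsZ` are illustrative names only; `std::fabs` stands in for the HLSL intrinsics.

```cpp
#include <cmath>

struct float3 { float x, y, z; };

// Keeps x and y and forces z to be non-positive, like the original `r0.z = -abs(r0.z)`.
inline float3 negateAbsZ(float3 r0)
{
    r0.z = -std::fabs(r0.z);
    return r0;
}

// Broadcasts -abs(z) to every component, which is what assigning a promoted scalar to r0 does.
inline float3 promoteNegAbsZ(float3 r0)
{
    const float s = -std::fabs(r0.z);
    return float3{s, s, s};
}
```

For an input of (1, 2, 3) the first returns (1, 2, -3) while the second returns (-3, -3, -3), losing the rectangle's x and y origin.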

could also just abs(r0.z) and correct everything that uses r0.z to flip its sign

Contributor Author

Resolved


scalar_type xu = -(cu * r0.z) / hlsl::sqrt<scalar_type>(scalar_type(1.0) - cu * cu);
xu = hlsl::clamp<scalar_type>(xu, r0.x, r1.x); // avoid Infs
const scalar_type d_2 = xu * xu + r0.z * r0.z;

could be computed as r0_z_squared*(1/(rcpCu_2-1)+1) and the r0_z_squared kept as a member

Actually algebra-ing more

d_2 = r0_z_squared*rcpCu_2/(rcpCu_2-1)
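
A quick numerical check of the factored form, as a sketch: it assumes `xu` comes from the unclamped expression r0.z * sign * inversesqrt(rcpCu_2 - 1), so that xu*xu = r0_z_squared/(rcpCu_2 - 1). Putting 1/(rcpCu_2 - 1) + 1 over a common denominator gives rcpCu_2/(rcpCu_2 - 1). The function names are hypothetical.

```cpp
#include <cmath>

// d_2 as currently computed: xu*xu + r0.z*r0.z (clamp on xu not applied).
inline float d2Direct(float rcpCu_2, float r0z)
{
    const float xu = r0z / std::sqrt(rcpCu_2 - 1.0f); // sign dropped, it is squared below
    return xu * xu + r0z * r0z;
}

// d_2 from a cached r0_z_squared, without going through xu.
inline float d2Factored(float rcpCu_2, float r0_z_squared)
{
    return r0_z_squared * (1.0f / (rcpCu_2 - 1.0f) + 1.0f);
}
```

Once `xu` is clamped to [r0.x, r1.x] the identity no longer holds, so the shortcut only applies if `d_2` is meant to use the unclamped value.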

@@ -74,18 +90,45 @@

const scalar_type h0 = r0.y / hlsl::sqrt<scalar_type>(d_2 + r0.y * r0.y);
const scalar_type h1 = r1.y / hlsl::sqrt<scalar_type>(d_2 + r1.y * r1.y);

looks like you should also keep a r0_y_squared and r1_y_squared

btw everything that happens to h0 and h1 here is sampling a line's projection on the sphere, useful if you want to make the rectangle sampling robust to being thin (it would involve flipping the local X and Y axes of the rectangle / basis rows such that the y-direction is always longer)

also use inversesqrt here


struct cache_type
{
density_type pdf;

pdf is constant, have an empty cache

Contributor Author

Resolved


vector3_type retval = tri_vertices[1];
const scalar_type cosBC_s = nbl::hlsl::dot(C_s, tri_vertices[1]);
const scalar_type csc_b_s = 1.0 / nbl::hlsl::sqrt(1.0 - cosBC_s * cosBC_s);

can we use inversesqrt without messing up the precision ?

domain_type _generateInverse(const codomain_type L)
{
const scalar_type cosAngleAlongBC_s = nbl::hlsl::dot(L, tri_vertices[1]);
const scalar_type csc_a_ = 1.0 / nbl::hlsl::sqrt(1.0 - cosAngleAlongBC_s * cosAngleAlongBC_s);

inversesqrt without messing up precision ?

angle_adder.addCosine(cosGamma[1]);
angle_adder.addCosine(cosGamma[2]);
angle_adder.addCosine(cosGamma[3]);
return angle_adder.getSumofArccos() - scalar_type(2.0) * numbers::pi<float>;

template sincos_accumulator::getSumofArccos on which acos implementation it should use (let the default template arg be impl::acos_helper) and try some fast inverse trig functions
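
One way this could look, sketched in plain C++. `StdAcos`, `FastAcos` and `sumOfArccos` are hypothetical names and the function is a much simplified stand-in for `sincos_accumulator::getSumofArccos`, whose real internals are not shown in this thread. The fast variant uses the well-known cubic polynomial approximation with an absolute error below 1e-4.

```cpp
#include <cmath>

// Default implementation: defers to the standard library.
struct StdAcos
{
    static float eval(float x) { return std::acos(x); }
};

// Polynomial approximation of acos on [-1,1], absolute error below 1e-4.
struct FastAcos
{
    static float eval(float x)
    {
        const float ax = std::fabs(x);
        const float poly = 1.5707288f + ax * (-0.2121144f + ax * (0.0742610f + ax * -0.0187293f));
        const float r = std::sqrt(1.0f - ax) * poly;
        return x < 0.0f ? 3.14159265f - r : r; // acos(-x) = pi - acos(x)
    }
};

// The acos implementation is a template argument with a default.
template<typename AcosImpl = StdAcos>
float sumOfArccos(float cosA, float cosB)
{
    return AcosImpl::eval(cosA) + AcosImpl::eval(cosB);
}
```

Callers that do not care keep writing `sumOfArccos(a, b)`, while a benchmark can switch to `sumOfArccos<FastAcos>(a, b)` without touching the accumulator logic.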

math::sincos_accumulator<scalar_type> angle_adder = math::sincos_accumulator<scalar_type>::create(cosA, sinA);
angle_adder.addAngle(cosB_, sinB_);
angle_adder.addAngle(cosC_, sinC_);
const scalar_type subTriSolidAngleRatio = (angle_adder.getSumofArccos() - numbers::pi<scalar_type>) * (scalar_type(1.0) / solidAngle);

let's store `rcpSolidAngle` instead of `solidAngle`

Contributor Author

Resolved

Comment on lines +113 to +114
// 1 ULP below 1.0, ensures (1.0 - cosBC_s) is strictly positive in float
const scalar_type one_below_one = bit_cast<scalar_type>(bit_cast<uint_type>(scalar_type(1)) - uint_type(1));

don't we have a ieee754 function for this ?

Contributor Author

[image] just wrote this in `ieee754.hlsl`, looks good?

yes great, you should probably require that T is scalar

maybe make a nextTowardZero, then you don't need to care about sign or negativity
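
A possible shape for such a helper, sketched in C++ for 32-bit floats only. The name `nextTowardZero` comes from the comment above, but this body is an assumption and not the code that was added to `ieee754.hlsl`. `std::memcpy` plays the role of `bit_cast`.

```cpp
#include <cstdint>
#include <cstring>

// Returns the representable float one ULP closer to zero.
// IEEE 754 floats are sign-magnitude, so decrementing the raw bits shrinks the
// magnitude for positive and negative inputs alike.
// NaN and infinity are not handled specially in this sketch.
inline float nextTowardZero(float x)
{
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits)); // portable bit cast
    if ((bits & 0x7FFFFFFFu) == 0u)       // +0 or -0: nothing is closer to zero
        return x;
    bits -= 1u;
    std::memcpy(&x, &bits, sizeof(bits));
    return x;
}
```

With this, `nextTowardZero(1.0f)` gives the `one_below_one` constant from the diff, and the same call works for negative inputs without a sign branch.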

{
const scalar_type cosAngleAlongAC = ((v_ * q - u_ * p) * cosA - v_) / ((v_ * p + u_ * q) * sinA);
if (nbl::hlsl::abs(cosAngleAlongAC) < 1.f)
C_s += math::quaternion<scalar_type>::slerp_delta(tri_vertices[0], tri_vertices[2] * triCscB, cosAngleAlongAC);

store tri_vertices[2] * triCscB instead of tri_vertices[2] and triCscB

the triCscB < numeric_limits<scalar_type>::max check can be done with cosA<1.f-epsilon instead

leave a comment that there's plenty of opportunity for optimization given that first two arguments to slerp_delta are constant

vector3_type planeNormal = hlsl::cross(start,preScaledWaypoint);

Comment on lines +83 to +84
const scalar_type cosAngleAlongBC_s = nbl::hlsl::clamp(scalar_type(1.0) + cosBC_s * u.y - u.y, scalar_type(-1.0), scalar_type(1.0));
if (nbl::hlsl::abs(cosAngleAlongBC_s) < scalar_type(1.0))

you don't need the clamp, you have an abs(...)<1.f right before you use it for anything

vector3_type tri_vertices[3];
scalar_type triCosC;
scalar_type triCscB;
scalar_type triCscC;

take a template parameter for whether you want the SphericalTriangle to be bijective (have a generateInverse method) and implement the Bijective in terms of Resamplable & Tractable + the triCscC member


weight_type backwardWeight(const vector3_type L)
{
return backwardPdf(L);
Member

@devshgraphicsprogramming devshgraphicsprogramming Mar 21, 2026

now here's our time to shine!

The reason for this whole damn refactor and distinction between PDF and MIS weight

you can simply return abs(L.z)*triangleProjectedSolidAngle for the weight!

You would need to compute the triangleProjectedSolidAngle during create though.

I'd implement backwardWeight in terms of forwardWeight (can fill unused cache variables with garbage)

retval.sphtri = sampling::SphericalTriangle<T>::create(tri);
return retval;
}
density_type pdf;
Member

@devshgraphicsprogramming devshgraphicsprogramming Mar 21, 2026

you want this to be basically only L_z for the weight functions and bilinear cache to get the forward PDF

Comment on lines +40 to +43
// NOTE: produces a degenerate (all-zero) bilinear patch when the receiver normal faces away
// from all three triangle vertices, resulting in NaN PDFs (0 * inf). Callers must ensure
// at least one vertex has positive projection onto the receiver normal.
Bilinear<scalar_type> computeBilinearPatch()

this is why this sampler shouldn't hand out a PDF but only weights

add a __ prefix to the function

Member

@devshgraphicsprogramming devshgraphicsprogramming Mar 21, 2026

also why aren't you storing the bilinear patch as a member instead of re-creating it on the fly?

the receiver normal doesn't change between calls to generate

Also, where's the static create method and its creation params?

const vector2_type u = sphtri.generateInverse(L);
Bilinear<scalar_type> bilinear = computeBilinearPatch();
return pdf * bilinear.backwardPdf(u);
}

this function is not needed at all

Comment on lines +65 to +67
density_type forwardPdf(const cache_type cache)
{
const scalar_type cos_c = sphtri.tri.cos_sides[2];
const scalar_type csc_b = sphtri.tri.csc_sides[1];
const scalar_type solidAngle = sphtri.tri.solidAngle();
return generate(rcpPdf, solidAngle, cos_c, csc_b, receiverNormal, isBSDF, u);
return cache.pdf;

we'd get the cache.bilinearCache, query the PDF of that, and multiply it by sphtri.rcpSolidAngle


sampling::SphericalTriangle<T> sphtri;
vector3_type receiverNormal;
bool receiverWasBSDF;

don't need receiverNormal and receiverWasBSDF members, need bilinearPatch and rcpProjSolidAngle instead

so it's useful to have a create (which computes the bilinear patch for you from receiverNormal and wasBSDF), but not the kind I told you to remove before

# Conflicts:
#	examples_tests
#	include/nbl/builtin/hlsl/shapes/spherical_triangle.hlsl
// Builds a normalized cumulative histogram from an array of non-negative weights.
// Output has N-1 entries (last bucket implicitly 1.0).
template<typename T>
void computeNormalizedCumulativeHistogram(const T* weights, uint32_t N, T* outCumProb)

maybe take a span<T> weights then the N is implicit, also do a special path for size()<2 because right now it will crash
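
A hedged sketch of what the size-implicit signature with an early-out could look like. It uses `std::vector` rather than a span so it builds with older C++ standards, and the body is an assumption, since the PR's implementation is not shown here. Weights are assumed non-negative with a positive sum.

```cpp
#include <cstddef>
#include <vector>

// Returns N-1 normalized cumulative probabilities (the last bucket is implicitly 1.0).
// Fewer than two weights yield an empty result instead of touching out-of-range entries.
template<typename T>
std::vector<T> computeNormalizedCumulativeHistogram(const std::vector<T>& weights)
{
    std::vector<T> outCumProb;
    if (weights.size() < 2)
        return outCumProb; // special path: nothing to store for 0 or 1 buckets

    T total = T(0);
    for (const T w : weights)
        total += w;

    outCumProb.reserve(weights.size() - 1);
    T running = T(0);
    for (std::size_t i = 0; i + 1 < weights.size(); ++i)
    {
        running += weights[i];
        outCumProb.push_back(running / total);
    }
    return outCumProb;
}
```

For weights {1, 1, 2} this returns {0.25, 0.5}, and a single weight returns an empty vector.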

Comment on lines 167 to +169
{
impl::bound_t<Accessor,Comparator> implementation = impl::bound_t<Accessor,Comparator>::setup(begin,end,value,comp);
return implementation(accessor);
const uint32_t retval = implementation(accessor);

WARNING: impl::bound_t<Accessor,Comparator>::setup takes comp as const

need to change the impl and provide it from a NBL_REF_ARG through the methods of impl::bound_t

Comment on lines 155 to +157
static weight_type backwardWeight(const codomain_type sample)
{
return backwardPdf(sample);
return T(0.5) * hemisphere_t::backwardWeight(sample);

remember `sample` is an HLSL keyword

Comment on lines +118 to +112
T z = T(1.0) - T(2.0) * _sample.x;
T r = hlsl::sqrt<T>(hlsl::max<T>(T(0.0), T(1.0) - z * z));
T phi = T(2.0) * numbers::pi<T> * _sample.y;
return vector_t3(r * hlsl::cos<T>(phi), r * hlsl::sin<T>(phi), z);
// Map _sample.x from [0,1] into hemisphere sample + sign flip:
// upper hemisphere when _sample.x < 0.5, lower when >= 0.5
const bool chooseLower = _sample.x >= T(0.5);
const T hemiX = chooseLower ? (T(2.0) * _sample.x - T(1.0)) : (T(2.0) * _sample.x);
vector_t3 retval = hemisphere_t::__generate(vector_t2(hemiX, _sample.y));
retval.z = chooseLower ? (-retval.z) : retval.z;
return retval;

ugh no the code is definitely worse here with all the conditionals, I just need a shared common function between both that takes a z coord externally
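
A sketch of the shared helper being asked for, in plain C++. `generateFromZ` is a hypothetical name, and the hemisphere mapping z = 1 - u.x is an assumption since `hemisphere_t::__generate` is not shown in this thread. The sphere mapping z = 1 - 2*u.x is the one from the removed lines.

```cpp
#include <cmath>

struct float3 { float x, y, z; };

// Shared helper: builds a unit vector from an externally chosen z and an azimuth fraction in [0,1).
inline float3 generateFromZ(float z, float uPhi)
{
    const float r = std::sqrt(std::fmax(0.0f, 1.0f - z * z));
    const float phi = 2.0f * 3.14159265f * uPhi;
    return float3{r * std::cos(phi), r * std::sin(phi), z};
}

// Uniform sphere: z spans [-1,1].
inline float3 generateSphere(float ux, float uy)
{
    return generateFromZ(1.0f - 2.0f * ux, uy);
}

// Uniform hemisphere: z spans [0,1] (assumed mapping).
inline float3 generateHemisphere(float ux, float uy)
{
    return generateFromZ(1.0f - ux, uy);
}
```

Each sampler then differs only in how it maps u.x to z, with no branch on which hemisphere was chosen.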

Comment on lines +35 to +38
NBL_PRIMARY_REQUIRES(
concepts::accessors::GenericReadAccessor<ProbabilityAccessor, T, Codomain> &&
concepts::accessors::GenericReadAccessor<AliasIndexAccessor, Codomain, Codomain> &&
concepts::accessors::GenericReadAccessor<PdfAccessor, T, Codomain>)

require that Domain is an UnsignedIntegralScalar (or whatever the concept is called)

Comment on lines 47 to +55
math::sincos<scalar_type>(scalar_type(2.0) * numbers::pi<scalar_type> * u.y - numbers::pi<scalar_type>, sinPhi, cosPhi);
const codomain_type outPos = vector2_type(cosPhi, sinPhi) * nbl::hlsl::sqrt(scalar_type(-2.0) * nbl::hlsl::log(u.x)) * stddev;
cache.pdf = backwardPdf(outPos);
cache.u_x = u.x;
return outPos;
}

density_type forwardPdf(const cache_type cache)
{
return cache.pdf;
return halfRcpStddev2 * numbers::inv_pi<scalar_type> * cache.u_x;

ok so you checked my math from the comment and it was wrong?
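
The new `forwardPdf` can be checked independently. The Box-Muller radius satisfies r^2 = -2 * stddev^2 * ln(u.x), so exp(-r^2 / (2 * stddev^2)) = u.x and the isotropic 2D Gaussian density becomes u.x / (2 * pi * stddev^2). The C++ sketch below assumes `halfRcpStddev2` means 1 / (2 * stddev^2), which the name suggests but the thread does not confirm. The struct and function names are hypothetical.

```cpp
#include <cmath>

struct Gaussian2DSample { double x, y, pdfFromU; };

// Box-Muller transform with the density computed from u.x alone.
inline Gaussian2DSample boxMuller(double ux, double uy, double stddev)
{
    const double pi = 3.14159265358979323846;
    const double phi = 2.0 * pi * uy - pi;
    const double radius = std::sqrt(-2.0 * std::log(ux)) * stddev;
    const double halfRcpStddev2 = 0.5 / (stddev * stddev); // assumed meaning of the member
    return { std::cos(phi) * radius, std::sin(phi) * radius, halfRcpStddev2 / pi * ux };
}

// Reference: isotropic 2D Gaussian density evaluated at the generated position.
inline double gaussianPdf(double x, double y, double stddev)
{
    const double pi = 3.14159265358979323846;
    const double rcpVar = 1.0 / (stddev * stddev);
    return 0.5 * rcpVar / pi * std::exp(-0.5 * (x * x + y * y) * rcpVar);
}
```

Under that assumption the two densities agree to rounding error, so caching `u.x` is enough to recover the forward PDF.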

3 participants