Multi-res grid refinement + Neon backend support #159
hsalehipour wants to merge 265 commits into Autodesk:main from
Conversation
Simplifies the `add_to_app` method in the multiresolution stepper. It now leverages keyword arguments and introspection for more flexible and maintainable operator calls. This change enhances code readability and reduces the risk of errors when adding new operators.
(perf) Introduce two new scheduling strategies for the MRES algorithm
Merged and resolved conflicts of the latest XLB/main into dev
…So no need to further multiply by rho in KBC.
…mplifying the function signatures across multiple classes.
Fixed KBC bug
* Fixed some runtime bugs
* Fixed some naming/spelling errors
* Removed some debugging comments
* Introduced a new file `cell_type.py` containing boundary-mask constants for fluid voxels, replacing hardcoded values with the new constants
* Renamed 254 to SFV in function names
- Unified multi-resolution recursion builder in `simulation_manager.py` to streamline the construction of simulation steps.
- Refactored `nse_multires_stepper` for improved clarity.
- Updated performance optimization handling in `multires_momentum_transfer.py` to support multiple fusion strategies.
…ine and clarify the implementation of multi-resolution streaming steps.
(refactoring) Cleaning up multi-res stepper.
Documentation
… multi-res by ensuring consistent use of `store_dtype` and `compute_dtype`.
Fixed mixed precision handling of the Neon backend
…ed + README update (#39)
* (build) Introducing Neon backend as an optional installation parameter. * (install) new installation mode for neon backend. * (build) Add ARM support for Neon wheel resolution * (documentation) Fixes to README and AUTHORS * (ruff) fixes to the style * (documentation) fix list of supported python versions
mehdiataei
left a comment
Thank you @massimim, this looks like a strong contribution. I left code-level comments in this round, but I also wanted to share a few higher-level suggestions.
- I think framing Neon as a new compute "backend" may be slightly misleading. Conceptually, Neon here does not seem fully parallel to JAX. The implementation largely reuses Warp functionals and then executes them through Neon handles, containers, and skeletons. In that sense, Neon feels more like an execution/runtime layer on top of Warp code generation than a standalone compute backend.
I think a different framing could make the design clearer:
- If Neon is fundamentally "Warp math + Neon execution", I would be hesitant to model it as a third peer backend throughout the operator hierarchy.
- Instead, I would consider splitting the abstraction into:
- kernel / math backend: JAX vs Warp
- execution runtime: direct Warp launch vs Neon container/skeleton launch
I think this would make Neon more generic and would better highlight its real strength: the execution model and skeleton abstraction, rather than presenting it as a bespoke backend. It may also make the integration easier to extend and adopt. Several of the current issues feel like symptoms of the abstraction boundary being one layer off. This likely needs some careful design thought, but I would strongly encourage it. To me, the more compelling framing is that Neon provides a skeleton/runtime that Warp kernels can target.
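To make the suggested split concrete, here is a minimal sketch of how a kernel/math backend could be modeled separately from an execution runtime. All names (`KernelBackend`, `ExecutionRuntime`, `validate`) are illustrative, not XLB's actual API:

```python
from enum import Enum, auto

class KernelBackend(Enum):
    """What generates the compute code."""
    JAX = auto()
    WARP = auto()

class ExecutionRuntime(Enum):
    """How generated kernels are launched."""
    DIRECT = auto()          # plain JAX/Warp launch
    NEON_SKELETON = auto()   # Warp kernels lowered into Neon containers/skeletons

def validate(backend: KernelBackend, runtime: ExecutionRuntime) -> bool:
    # Per the framing above, Neon's skeleton runtime targets
    # Warp-generated kernels only; JAX runs directly.
    if runtime is ExecutionRuntime.NEON_SKELETON:
        return backend is KernelBackend.WARP
    return True

assert validate(KernelBackend.WARP, ExecutionRuntime.NEON_SKELETON)
assert not validate(KernelBackend.JAX, ExecutionRuntime.NEON_SKELETON)
```

With this split, "Neon support" becomes a property of the launch path rather than a third peer in the operator hierarchy.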
- The multires implementation also feels too monolithic. Kernels, schedule planning, state ownership, and runtime graph compilation all live in roughly the same layer, which makes the system harder to reason about, test, and extend.
One possible improvement would be to introduce a typed MultiresPlan / Schedule layer that represents the recursive timestep as explicit operations, then have a separate Neon graph builder that lowers that plan into containers/skeletons. I would also keep simulation state in a manager and keep kernels separate from schedule construction.
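As a rough illustration of the typed plan idea, the recursive timestep could be represented as explicit operations that a separate graph builder then lowers. The op names and 2:1 time-refinement pattern here are assumptions for the sketch, not XLB's actual schedule:

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class PlanOp:
    kind: str    # e.g. "collide", "stream", "coalesce"
    level: int   # grid-refinement level the op runs on

def build_plan(num_levels: int, level: int = 0) -> List[PlanOp]:
    """Express the recursive multires timestep as a flat list of ops.

    Each coarse step runs two fine sub-steps (standard 2:1 time
    refinement), then transfers fine results back to the coarse level.
    """
    ops = [PlanOp("collide", level)]
    if level + 1 < num_levels:
        ops += build_plan(num_levels, level + 1)  # first fine sub-step
        ops += build_plan(num_levels, level + 1)  # second fine sub-step
        ops.append(PlanOp("coalesce", level))     # fine -> coarse transfer
    ops.append(PlanOp("stream", level))
    return ops

plan = build_plan(num_levels=2)
# A separate Neon graph builder could then lower `plan` into
# containers/skeletons without touching kernel code or state ownership.
```

The point is that the plan is inspectable and testable on its own, independent of kernels and of the Neon runtime.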
- The topology and coordinate model feels too implicit at the moment, which makes it harder to debug and reuse. An explicit `MultiresTopology` or `LevelInfo` abstraction could help a lot, with methods such as:
  - `level_shape(level)`
  - `global_bounds(level)`
  - `to_global(level, coords)`
  - `face_indices(level, side)`
  - `active_indices(level)`
I think making those concepts explicit would improve both clarity and correctness.
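For illustration, a minimal sketch of such an abstraction, assuming octree-style 2:1 refinement over a level-0 base grid (the refinement convention and field names are assumptions, not the PR's actual model):

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class MultiresTopology:
    """Explicit level/coordinate model for a multires grid hierarchy."""
    base_shape: Tuple[int, int, int]  # level-0 grid dimensions

    def level_shape(self, level: int) -> Tuple[int, ...]:
        # Each refinement level doubles the resolution per axis.
        return tuple(n * 2**level for n in self.base_shape)

    def to_global(self, level: int, coords) -> Tuple[float, ...]:
        # Map level-local integer coords into level-0 (global) units.
        scale = 1 / 2**level
        return tuple(c * scale for c in coords)

    def global_bounds(self, level: int):
        return ((0, 0, 0), self.level_shape(level))

topo = MultiresTopology(base_shape=(64, 64, 64))
# topo.level_shape(1) == (128, 128, 128)
```

Even a small value object like this makes coordinate conversions unit-testable instead of being scattered through kernels.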
- For the new BC, I suggest clearly stating the Reynolds number (Re) range it has been validated for.
| "jax>=0.8.2", # Base JAX CPU-only requirement | ||
| ], | ||
| extras_require={ | ||
| "warp": ["warp-lang>=1.10.0"], # Warp backend (single-GPU); included by default for full backend support |
The new default install path is broken: `pip install xlb` no longer guarantees that `import xlb` succeeds.

Repro:

```shell
uv venv .venv --python 3.12
uv pip install --python .venv/bin/python .
source .venv/bin/activate && python -c "import xlb"
```

I suggest adding warp-lang to the base install, or fully decoupling top-level imports from Warp so that a minimal CPU/JAX install can import successfully.
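One way the decoupling could look, as a hedged sketch (the helper name and error message are illustrative, not XLB's actual module layout):

```python
import importlib

def get_warp():
    """Import warp lazily, so `import xlb` works on a JAX-only install.

    The error is raised only when the Warp backend is actually
    requested, with a hint pointing at the optional extra.
    """
    try:
        return importlib.import_module("warp")
    except ImportError as exc:
        raise ImportError(
            "The Warp backend requires warp-lang. "
            "Install it with: pip install 'xlb[warp]'"
        ) from exc
```

Call sites that register Warp operators would then call `get_warp()` instead of importing `warp` at module load time.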
```python
    },
    python_requires=">=3.11",
    dependency_links=["https://storage.googleapis.com/jax-releases/libtpu_releases.html"],
    cmdclass={"install": InstallWithNeonHooks},
```
This will bake in Python 3.12, which may break if the user has a different Python version. I think we need either version-specific XLB wheels per interpreter, or to move the Neon installation to a runtime script instead.
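A per-interpreter wheel resolution could look roughly like this; the package name, URL template, and tag scheme are all illustrative assumptions, not the real Neon distribution layout:

```python
import sys

def neon_wheel_requirement(base_url: str = "https://example.com/wheels") -> str:
    """Build a PEP 508 direct reference matching the running interpreter.

    Resolves the CPython ABI tag from sys.version_info instead of
    hardcoding cp312, so the install adapts to the user's Python.
    """
    tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
    # Hypothetical wheel filename: neon-<python tag>-<abi tag>-<platform>.whl
    return f"py-neon @ {base_url}/neon-{tag}-{tag}-linux_x86_64.whl"
```

The real fix would also need to fail clearly (or fall back) when no wheel is published for the detected interpreter.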
| "warp": ["warp-lang>=1.10.0"], # Warp backend (single-GPU); included by default for full backend support | ||
| "cuda": ["jax[cuda13]>=0.8.2"], # For CUDA installations (pip install -U "jax[cuda13]") | ||
| "tpu": ["jax[tpu]>=0.8.2"], # For TPU installations | ||
| "neon": [_neon_wheel_requirement()], |
Another issue I found: if you already have warp-lang installed, installing with the `[neon]` extra will not remove or replace Warp, so there will be a conflict.

Check:

```shell
uv venv .venv --python 3.12
uv pip install --python .venv/bin/python warp-lang
uv pip install --python .venv/bin/python '.[neon]'
```
```python
        nvtx.pop_range()

    @Operator.register_backend(ComputeBackend.NEON)
    def neon_launch(self, f_0, f_1, bc_mask, missing_mask, omega, timestep):
```
`MultiresIncompressibleNavierStokesStepper.neon_launch()` will hit `TypeError: 'dict' object is not callable` when the registered NEON `neon_launch` is invoked. I think you can just remove `neon_launch()`; it is not used in the execution paths.
```python
class MultiresSimulationManager(MultiresIncompressibleNavierStokesStepper):
    """Orchestrates multi-resolution LBM simulations on the Neon backend.
```
`MultiresSimulationManager(..., force_vector=...)` will fail. This should be handled explicitly: either implement the `forcedCollision` (which should be easy) or add a check for this.
| "0.238" : { "x-velocity" : [24.405,24.168,22.782,20.196,16.970,13.937,12.137,11.757,12.851,14.649,16.780,18.995,21.070,23.335,25.280,27.468,29.262,30.832,32.133,33.102,33.856,34.473,34.922,35.340,35.698,36.039,36.336,36.629,36.906,37.193,37.454,37.691,37.929,38.329,38.611,38.875,39.126,39.414,39.677,39.917,40.097,40.259,40.380,40.478,40.568], "height" : [0.028,0.038,0.048,0.058,0.068,0.078,0.088,0.098,0.108,0.118,0.128,0.138,0.148,0.158,0.168,0.178,0.188,0.198,0.208,0.218,0.228,0.238,0.248,0.258,0.268,0.278,0.288,0.298,0.308,0.318,0.328,0.338,0.348,0.368,0.388,0.408,0.428,0.458,0.488,0.518,0.558,0.598,0.638,0.688,0.738]}, | ||
| "0.288" : { "x-velocity" : [21.489,22.225,22.127,21.456,20.404,19.743,19.541,19.909,21.002,22.381,24.018,25.670,27.421,28.998,30.371,31.523,32.406,33.111,33.670,34.155,34.532,34.893,35.240,35.567,35.875,36.158,36.437,36.708,36.974,37.230,37.473,37.709,37.932,38.266,38.515,38.773,39.008,39.270,39.562,39.782,39.962,40.148,40.266,40.369,40.475], "height" : [0.028,0.038,0.048,0.058,0.068,0.078,0.088,0.098,0.108,0.118,0.128,0.138,0.148,0.158,0.168,0.178,0.188,0.198,0.208,0.218,0.228,0.238,0.248,0.258,0.268,0.278,0.288,0.298,0.308,0.318,0.328,0.338,0.348,0.368,0.388,0.408,0.428,0.458,0.488,0.518,0.558,0.598,0.638,0.688,0.738]} | ||
| } | ||
| } No newline at end of file |
Please fix the trailing whitespace (and the missing newline at end of file).
```python
    velocity_set = velocity_set or DefaultConfig.velocity_set
    if compute_backend == ComputeBackend.WARP:
        from xlb.grid.warp_grid import WarpGrid
```
You can delete the `WarpGrid` import.
```python
        self.bk = neon.Backend(runtime=neon.Backend.Runtime.stream, dev_idx_list=dev_idx_list)
        self.bk.info_print()
        self.grid = neon.dense.dGrid(backend=self.bk, dim=self.dim, sparsity=None, stencil=self.neon_stencil)
        pass
```
Can you remove these stray `pass` statements? Same for `initialize_backend()`.
```python
            self.neon_stencil.append([xval, yval, zval])

        self.bk = neon.Backend(runtime=neon.Backend.Runtime.stream, dev_idx_list=dev_idx_list)
        self.bk.info_print()
```
Can you hide these print statements behind a debug flag? There are some more in the grid code.
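A minimal way to gate the prints, as a sketch; the `XLB_DEBUG` environment variable name and the helper are assumptions, not existing XLB conventions:

```python
import os

# Read once at import time; default runs stay quiet.
XLB_DEBUG = os.environ.get("XLB_DEBUG", "0") == "1"

def debug_print(fn) -> None:
    """Invoke a zero-argument print-like callable only when debugging.

    Usage inside backend setup, e.g.:
        debug_print(self.bk.info_print)
    """
    if XLB_DEBUG:
        fn()
```

Passing the callable (rather than a pre-built string) also avoids paying for any formatting work when the flag is off.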
```python
        return self.velocity_set

    def _initialize_backend(self):
        # FIXME@max: for now we hardcode the number of devices to 0
```
Contributing Guidelines
Description
Grid refinement capability is now supported in XLB through the Neon backend. The Neon backend provides full support for dense grids on multi-GPU systems, as well as multi-resolution grids on single GPUs. All newly introduced functionalities have been carefully tested and optimized. This represents a major enhancement to the library and involves substantial additions and improvements to the codebase.
Type of change
How Has This Been Tested?
Linting and Code Formatting
Make sure the code follows the project's linting and formatting standards. This project uses Ruff for linting.
To run Ruff, execute the following command from the root of the repository:
ruff check .