2 changes: 1 addition & 1 deletion docs/source/conf.py
@@ -230,7 +230,7 @@
],
"show_version_warning_banner": True,
"use_edit_page_button": True,
"header_links_before_dropdown": 8,
"header_links_before_dropdown": 9,
"navbar_align": "left",
"navbar_start": ["navbar-logo", "version-switcher"],
"navbar_center": ["navbar-nav"],
27 changes: 0 additions & 27 deletions docs/source/desktop-backends.md

This file was deleted.

1 change: 0 additions & 1 deletion docs/source/desktop-coreml.md

This file was deleted.

107 changes: 107 additions & 0 deletions docs/source/desktop-linux.md
@@ -0,0 +1,107 @@
# Linux Desktop Deployment

ExecuTorch provides comprehensive support for Linux environments, enabling high-performance model execution across a wide range of hardware configurations. The runtime leverages optimized backends like XNNPACK for CPU execution and OpenVINO for Intel hardware acceleration.

This guide details the system requirements, available backends, and steps to build and run ExecuTorch natively on Linux distributions.

## Prerequisites

ExecuTorch is actively tested and supported on several major Linux distributions. Ensure your environment meets the following minimum requirements:

- **Operating System**: CentOS 8+, Ubuntu 20.04.6 LTS+, or RHEL 8+ [1].
- **Compiler Toolchain**: `g++` version 7 or higher, `clang++` version 5 or higher, or another C++17-compatible toolchain [1].
- **Python Environment**: Python 3.10–3.13, preferably managed via Conda or `venv` [1].
- **Build Tools**: CMake and an optional compiler cache (`ccache`) to significantly speed up recompilation [1].

## Available Backends

ExecuTorch supports multiple backends for Linux, allowing you to optimize execution for specific CPU architectures or dedicated AI accelerators.

| Backend | Hardware Target | Architecture | Key Features |
|---|---|---|---|
| **XNNPACK** | CPU | x86, x86-64, ARM64 | Highly optimized CPU execution; supports fp32, fp16, and 8-bit quantization; uses SIMD instruction sets up to AVX-512 on x86-64 [2]. |
| **OpenVINO** | Intel Hardware | x86-64 | Accelerates inference on Intel CPUs, integrated GPUs, discrete GPUs, and NPUs [3]. |
| **Vulkan** | GPU | Cross-platform | Executes on GPUs via GLSL compute shaders; primarily focused on Android but supports Linux GPUs with Vulkan 1.1+ [4]. |

## Building for Linux

The ExecuTorch CMake build system includes a dedicated `linux` preset that configures the runtime with the features and backends common for Linux targets [5].

### 1. Environment Setup

Begin by cloning the ExecuTorch repository and configuring your Python environment. Once the environment is active, install the Python dependencies and the default backends (which include XNNPACK).

```bash
# Clone and setup the environment
git clone -b viable/strict https://github.com/pytorch/executorch.git
cd executorch
conda create -yn executorch python=3.10.0
conda activate executorch

# Install Python packages and dependencies
./install_executorch.sh
```

### 2. Compile the Runtime

With the environment configured, use the `linux` CMake preset to build the C++ runtime. This process will compile the core ExecuTorch libraries and the registered backends.

```bash
# Configure with the Linux preset and build the runtime libraries
mkdir cmake-out
cmake -B cmake-out --preset linux
cmake --build cmake-out -j10
```

This will generate the `libexecutorch.a` static library and the associated backend libraries (e.g., `libxnnpack_backend.a`).

### 3. OpenVINO Integration (Optional)

If you intend to target Intel hardware, the OpenVINO backend requires additional setup. You must install the OpenVINO toolkit and build the backend separately using the provided scripts.

```bash
# From the executorch/backends/openvino/ directory
pip install -r requirements.txt
cd scripts/
./openvino_build.sh
```

This generates `libopenvino_backend.a` in the `cmake-out/backends/openvino/` directory [3].

## Runtime Integration

To integrate ExecuTorch into your Linux C++ application, link against the compiled runtime and backend libraries.

When linking the XNNPACK backend, the use of static initializers requires linking with the whole-archive flag to ensure the backend registration code is not stripped by the linker [2].

```cmake
# CMakeLists.txt
add_subdirectory("executorch")

target_link_libraries(
    my_linux_app
    PRIVATE
        executorch
        extension_module_static
        extension_tensor
        optimized_native_cpu_ops_lib
        $<LINK_LIBRARY:WHOLE_ARCHIVE,xnnpack_backend>
)
```

No additional code is required to initialize the backends; any `.pte` file exported for XNNPACK or OpenVINO will automatically execute on the appropriate hardware when loaded by the `Module` API.
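
As a minimal sketch of that flow, the following program loads a lowered `.pte` file with the `Module` API and runs a single inference. The file name `model_xnnpack.pte` and the 1x3x224x224 input shape are placeholders; substitute the values for your exported model.

```cpp
#include <vector>

#include <executorch/extension/module/module.h>
#include <executorch/extension/tensor/tensor.h>

using namespace ::executorch::extension;

int main() {
  // Load a program lowered for XNNPACK or OpenVINO; the statically
  // registered backend is resolved automatically at load time.
  // "model_xnnpack.pte" is a placeholder file name.
  Module module("model_xnnpack.pte");

  // Wrap an existing buffer as the input tensor (placeholder shape).
  std::vector<float> input(1 * 3 * 224 * 224, 0.f);
  auto tensor = from_blob(input.data(), {1, 3, 224, 224});

  // Run the model's "forward" method.
  const auto result = module.forward(tensor);
  if (result.ok()) {
    const float* output = result->at(0).toTensor().const_data_ptr<float>();
    (void)output; // Post-process the output here.
  }
  return 0;
}
```

Here `extension_module_static` and `extension_tensor` from the CMake snippet above provide `Module` and `from_blob`, respectively.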

## Next Steps

- **{doc}`backends/xnnpack/xnnpack-overview`** — Deep dive into XNNPACK export and execution.
- **{doc}`build-run-openvino`** — Deep dive into OpenVINO setup and hardware acceleration.
- **{doc}`using-executorch-cpp`** — Learn how to use the C++ `Module` API to load and run models.

---

## References

[1] ExecuTorch Documentation: [System Requirements](using-executorch-building-from-source.md#system-requirements)
[2] ExecuTorch Documentation: [XNNPACK Backend](backends/xnnpack/xnnpack-overview.md)
[3] ExecuTorch Documentation: [OpenVINO Backend](build-run-openvino.md)
[4] ExecuTorch Documentation: [Vulkan Backend](backends/vulkan/vulkan-overview.md)
[5] ExecuTorch Documentation: [Building the C++ Runtime](using-executorch-building-from-source.md#building-the-c-runtime)
95 changes: 95 additions & 0 deletions docs/source/desktop-macos.md
@@ -0,0 +1,95 @@
# macOS Desktop Deployment

ExecuTorch provides robust support for macOS deployment, offering hardware-accelerated execution across both Apple Silicon and Intel-based Macs. The runtime is optimized to take advantage of Apple's Core ML framework, Metal Performance Shaders (MPS), and the CPU-optimized XNNPACK backend.

This guide covers the platform-specific requirements, available backends, and steps to build and run ExecuTorch natively on macOS.

## Prerequisites

To build and run ExecuTorch on macOS, ensure your system meets the following minimum requirements:

- **Operating System**: macOS Big Sur (11.0) or higher. For Core ML and MPS support, macOS 13.0+ and 12.4+ are required, respectively [1].
- **Development Tools**: Xcode 14.1 or higher. The Xcode Command Line Tools must be installed (`xcode-select --install`) [2].
- **Python Environment**: Python 3.10–3.13, preferably managed via Conda or `venv` [3].

### Intel Mac Considerations

For Intel-based macOS systems, PyTorch does not provide pre-built binaries. When installing the ExecuTorch Python dependencies, you must build PyTorch from source by passing the following flags to the installation script:

```bash
./install_executorch.sh --use-pt-pinned-commit --minimal
```

## Available Backends

ExecuTorch supports three primary backends for macOS, allowing you to target the CPU, GPU, or Apple Neural Engine (ANE) depending on your hardware and model requirements.

| Backend | Hardware Target | Minimum macOS | Key Features |
|---|---|---|---|
| **Core ML** | CPU, GPU, ANE | 13.0 | Dynamic dispatch across all Apple hardware; supports fp32 and fp16; recommended for Apple Silicon [2]. |
| **MPS** | Apple Silicon GPU | 12.4 | Direct execution on Metal Performance Shaders; supports fp32 and fp16 [4]. |
| **XNNPACK** | CPU (ARM64 & x86_64) | 11.0 | Highly optimized CPU execution; supports 8-bit quantization; works on both Apple Silicon and Intel Macs [5]. |

## Building for macOS

The ExecuTorch CMake build system includes a dedicated `macos` preset that configures the runtime with the features and backends common for Mac targets [3].

### 1. Enable Required Backends

By default, the ExecuTorch installation script builds the XNNPACK and Core ML backends. If you intend to use the MPS backend, you must enable it during the initial setup:

```bash
CMAKE_ARGS="-DEXECUTORCH_BUILD_MPS=ON" ./install_executorch.sh
```

### 2. Compile the Runtime

Once the Python environment is configured, use the `macos` CMake preset to build the C++ runtime:

```bash
# Configure with the macOS preset and build the runtime libraries
mkdir cmake-out
cmake -B cmake-out --preset macos
cmake --build cmake-out -j10
```

This will compile the core ExecuTorch libraries and the registered backends (e.g., `libxnnpack_backend.a`, `libcoremldelegate.a`).

## Runtime Integration

To integrate ExecuTorch into your macOS C++ application, link against the compiled runtime and backend libraries.

When linking the Core ML or XNNPACK backends, the use of static initializers requires linking with the whole-archive flag to ensure the backend registration code is not stripped by the linker [2], [5].

```cmake
# CMakeLists.txt
add_subdirectory("executorch")

target_link_libraries(
    my_macos_app
    PRIVATE
        executorch
        extension_module_static
        extension_tensor
        optimized_native_cpu_ops_lib
        $<LINK_LIBRARY:WHOLE_ARCHIVE,coremldelegate>
        $<LINK_LIBRARY:WHOLE_ARCHIVE,xnnpack_backend>
)
```

No additional code is required to initialize the backends; any `.pte` file exported for Core ML, MPS, or XNNPACK will automatically execute on the appropriate hardware when loaded by the `Module` API.
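
As an illustration of this flow, the sketch below loads a lowered `.pte` file eagerly so that a missing backend surfaces as a load error rather than a failure at the first `forward()` call. The file name `model_coreml.pte` and the input shape are placeholders for your exported model.

```cpp
#include <vector>

#include <executorch/extension/module/module.h>
#include <executorch/extension/tensor/tensor.h>

using namespace ::executorch::extension;

int main() {
  // "model_coreml.pte" is a placeholder for a program lowered to
  // Core ML, MPS, or XNNPACK; the matching registered backend is
  // selected automatically when the program loads.
  Module module("model_coreml.pte");

  // Load eagerly so a missing file or unregistered backend surfaces
  // here rather than at the first forward() call.
  if (module.load() != executorch::runtime::Error::Ok) {
    return 1;
  }

  // Placeholder input shape; match it to your exported model.
  std::vector<float> input(1 * 3 * 224 * 224, 0.f);
  const auto result = module.forward(from_blob(input.data(), {1, 3, 224, 224}));
  return result.ok() ? 0 : 1;
}
```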

## Next Steps

- **{doc}`backends/coreml/coreml-overview`** — Deep dive into Core ML export and execution.
- **{doc}`backends/mps/mps-overview`** — Deep dive into MPS export and execution.
- **{doc}`using-executorch-cpp`** — Learn how to use the C++ `Module` API to load and run models.

---

## References

[1] ExecuTorch Documentation: [Building from Source](using-executorch-building-from-source.md)
[2] ExecuTorch Documentation: [Core ML Backend](backends/coreml/coreml-overview.md)
[3] ExecuTorch Documentation: [Building the C++ Runtime](using-executorch-building-from-source.md#building-the-c-runtime)
[4] ExecuTorch Documentation: [MPS Backend](backends/mps/mps-overview.md)
[5] ExecuTorch Documentation: [XNNPACK Backend](backends/xnnpack/xnnpack-overview.md)
1 change: 0 additions & 1 deletion docs/source/desktop-mps.md

This file was deleted.

1 change: 0 additions & 1 deletion docs/source/desktop-openvino.md

This file was deleted.

63 changes: 52 additions & 11 deletions docs/source/desktop-section.md
@@ -1,24 +1,65 @@
(desktop-section)=

# Desktop & Laptop Platforms

Deploy ExecuTorch on Linux, macOS, and Windows with optimized backends for each platform.
ExecuTorch provides robust, high-performance deployment capabilities for desktop and laptop environments across macOS, Linux, and Windows. By leveraging native hardware acceleration and a cross-platform C++ runtime, developers can execute PyTorch models efficiently on CPUs, GPUs, and dedicated AI accelerators (NPUs/ANEs).

This section provides comprehensive, platform-specific guides for setting up, building, and optimizing ExecuTorch for native desktop execution.

## Platform-Specific Guides

Select your target operating system below for detailed setup instructions, prerequisites, and backend integration steps.

::::{grid} 3
:::{grid-item-card} macOS
:class-card: card-prerequisites
**→ {doc}`desktop-macos`**

Native execution on Apple Silicon and Intel Macs using Core ML, MPS, and XNNPACK.
:::
:::{grid-item-card} Linux
:class-card: card-prerequisites
**→ {doc}`desktop-linux`**

## Platform Overview & Runtime
High-performance deployment on Linux distributions using XNNPACK and OpenVINO.
:::
:::{grid-item-card} Windows
:class-card: card-prerequisites
**→ {doc}`desktop-windows`**

- {doc}`using-executorch-cpp` — C++ runtime integration guide
- {doc}`using-executorch-building-from-source` — Building ExecuTorch from source
Native Windows and WSL support using XNNPACK and OpenVINO with Visual Studio.
:::
::::

## Backends
## Backend Hardware Support

- {doc}`desktop-backends` — Available desktop backends and platform-specific optimization
ExecuTorch relies on specialized backends to map model execution to the underlying desktop hardware. The table below summarizes the available backends and their supported platforms.

| Backend | Primary Hardware Target | macOS | Linux | Windows | Key Features |
|---|---|:---:|:---:|:---:|---|
| **[XNNPACK](backends/xnnpack/xnnpack-overview)** | CPU (ARM64, x86-64) | ✅ | ✅ | ✅ | Highly optimized CPU execution; supports fp32, fp16, and 8-bit quantization. Included by default. |
| **[Core ML](backends/coreml/coreml-overview)** | Apple CPU, GPU, ANE | ✅ | ❌ | ❌ | Dynamic dispatch across Apple hardware; recommended for Apple Silicon. |
| **[MPS](backends/mps/mps-overview)** | Apple Silicon GPU | ✅ | ❌ | ❌ | Direct execution on Metal Performance Shaders for high-throughput GPU inference. |
| **[OpenVINO](build-run-openvino)** | Intel CPU, GPU, NPU | ❌ | ✅ | ✅ | Intel-optimized execution across integrated graphics, discrete GPUs, and NPUs. |
| **[Vulkan](backends/vulkan/vulkan-overview)** | Cross-platform GPU | ❌ | ✅ | ❌ | GPU execution via GLSL compute shaders; primarily focused on Android but supports Linux. |

## Core Runtime Integration

Regardless of the target desktop platform, integrating ExecuTorch into a native application follows a consistent pattern using the C++ `Module` API.

- **{doc}`using-executorch-cpp`** — Learn how to use the C++ `Module` API to load `.pte` files, configure memory allocation, and execute inferences natively.
- **{doc}`using-executorch-building-from-source`** — Comprehensive reference for the CMake build system, configuration options, and presets used across all desktop platforms.

## Tutorials

- {doc}`raspberry_pi_llama_tutorial` — Cross compiling ExecuTorch for the Raspberry Pi on Linux Host
- **{doc}`raspberry_pi_llama_tutorial`** — Cross compiling ExecuTorch for the Raspberry Pi on a Linux Host.

```{toctree}
:hidden:
using-executorch-cpp
using-executorch-building-from-source
desktop-backends
raspberry_pi_llama_tutorial
:maxdepth: 2
:caption: Desktop Platforms

desktop-macos
desktop-linux
desktop-windows
```