From 6f3def4f4d990b01375eb38ef1354409e48dc891 Mon Sep 17 00:00:00 2001
From: peter941221
Date: Wed, 10 Jun 2026 12:25:57 +0800
Subject: [PATCH 1/2] #4802 - Fix QDP Python Plugin Guide link

Signed-off-by: peter941221
---
 samples/python/quickly_deployable_plugins/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/samples/python/quickly_deployable_plugins/README.md b/samples/python/quickly_deployable_plugins/README.md
index 3dd0d03da..a592eb66a 100644
--- a/samples/python/quickly_deployable_plugins/README.md
+++ b/samples/python/quickly_deployable_plugins/README.md
@@ -284,7 +284,7 @@ options:
 # Additional Resources

 **Python Plugin Guide**
-- [pluginGuide.md](../../../documentation/python/pluginGuide.md)
+- [Python Plugin Guide](https://docs.nvidia.com/deeplearning/tensorrt/latest/_static/python-api/pluginGuide.html)

 **`tensorrt.plugin` API reference**
 - [`tensorrt.plugin` module API reference](https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/infer/tensorrt.plugin/index.html)

From 8c45866723195876fd3e884802a66a544189b1b0 Mon Sep 17 00:00:00 2001
From: peter941221
Date: Wed, 10 Jun 2026 12:36:54 +0800
Subject: [PATCH 2/2] #4802 - Fix stale QDP Python doc links

Signed-off-by: peter941221
---
 samples/python/quickly_deployable_plugins/README.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/samples/python/quickly_deployable_plugins/README.md b/samples/python/quickly_deployable_plugins/README.md
index a592eb66a..875c4ebf9 100644
--- a/samples/python/quickly_deployable_plugins/README.md
+++ b/samples/python/quickly_deployable_plugins/README.md
@@ -66,7 +66,7 @@ def add_plugin_desc(inp0: trtp.TensorDesc, block_size: int) -> trtp.TensorDesc:
     return inp0.like()
 ```

-The argument "sample::elemwise_add_plugin" defines the namespace ("sample") and name ("elemwise_add_plugin") of the plugin. Input arguments to the decorated function (`plugin_desc`) annotated with `trt.plugin.TensorDesc` denote the input tensors; all others are interpreted as plugin attributes (see the [TRT API Reference](https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/infer/tensorrt.plugin/trt_plugin_register.html) for a full list of allowed attribute types). The output signature is a `trt.plugin.TensorDesc` describing the output. `inp0.like()` returns a tensor descriptor with identical shape and type characteristics to `inp0`.
+The argument "sample::elemwise_add_plugin" defines the namespace ("sample") and name ("elemwise_add_plugin") of the plugin. Input arguments to the decorated function (`plugin_desc`) annotated with `trt.plugin.TensorDesc` denote the input tensors; all others are interpreted as plugin attributes (see the [TRT API Reference](https://docs.nvidia.com/deeplearning/tensorrt/latest/_static/python-api/infer/tensorrt.plugin/trt_plugin_register.html) for a full list of allowed attribute types). The output signature is a `trt.plugin.TensorDesc` describing the output. `inp0.like()` returns a tensor descriptor with identical shape and type characteristics to `inp0`.

 The computation function, decorated with `trt.plugin.impl`, receives `trt.plugin.Tensor`s for each input and output. In contrast to `TensorDesc`s, a `Tensor` references an underlying data buffer, directly accessible through `Tensor.data_ptr`. When working with Torch and OpenAI Triton kernels, it is easier to use `torch.as_tensor()` to zero-copy construct a `torch.Tensor` corresponding to the `trt.plugin.Tensor`.

@@ -124,7 +124,7 @@ Non-zero is an operation where the indices of the non-zero elements of the input

 To handle DDS, the extent of each data-dependent output dimension must be expressed in terms of a *_size tensor_*, which is a scalar that communicates to TRT an upper-bound and an autotune value for that dimension, in terms of the input shapes. The TRT engine build may be optimized for the autotune value, but the extent of that dimension may stretch up to the upper-bound at runtime.

-In this sample, we consider a 2D input tensor `inp0`; the output will be an $N x 2$ tensor (a set of $N$ 2D indices), where $N$ is the number of non-zero indices. At maximum, all elements could be non-zero, and so the upper-bound could be expressed as `upper_bound = inp0.shape_expr[0] * inp0.shape_expr[1]`. Note that `trt.plugin.TensorDesc.shape_expr` returns symbolic shape expressions for that tensor. Arithmetic operations on shape expressions are supported through standard Python binary operators (see [TRT Python API reference](https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/infer/tensorrt.plugin/Shape/ShapeExpr.html) for full list of supported operations).
+In this sample, we consider a 2D input tensor `inp0`; the output will be an $N x 2$ tensor (a set of $N$ 2D indices), where $N$ is the number of non-zero indices. At maximum, all elements could be non-zero, and so the upper-bound could be expressed as `upper_bound = inp0.shape_expr[0] * inp0.shape_expr[1]`. Note that `trt.plugin.TensorDesc.shape_expr` returns symbolic shape expressions for that tensor. Arithmetic operations on shape expressions are supported through standard Python binary operators (see [TRT Python API reference](https://docs.nvidia.com/deeplearning/tensorrt/latest/_static/python-api/infer/tensorrt.plugin/Shape/ShapeExpr.html) for full list of supported operations).

 On average, we can expect half of the input to be filled with zero, so a size tensor can be constructed with that as the autotune value:
 ```python
@@ -157,7 +157,7 @@ python3 qdp_runner.py non_zero [-v]

 This sample contains a circular padding plugin, which is useful for ops like circular convolution. It is equivalent to PyTorch's [torch.nn.CircularPad2d](https://pytorch.org/docs/stable/generated/torch.nn.CircularPad2d.html#torch.nn.CircularPad2d).

-Refer [this section about circular padding plugin](https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/pluginGuide.html#example-circular-padding-plugin) in the python plugin guide for more info.
+Refer [this section about circular padding plugin](https://docs.nvidia.com/deeplearning/tensorrt/latest/_static/python-api/pluginGuide.html#example-circular-padding-plugin) in the python plugin guide for more info.

 ## ONNX model with a plugin

@@ -205,7 +205,7 @@ def circ_pad_plugin_autotune(inp0: trtp.TensorDesc, pads: npt.NDArray[np.int32],

 Note that we're using another way of constructing a `trt.plugin.AutoTuneCombination` here -- namely, through `pos(...)` to populate the type/format information and `tactics(...)` to specify the tactics. In this sample, we use an OpenAI Triton kernel and `torch.nn.functional.pad` as two methods to compute the circular padding.

-Refer [this section](https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/pluginGuide.html#example-plugins-with-multiple-backends-using-custom-tactics) in the Python plugin guide for more info.
+Refer [this section](https://docs.nvidia.com/deeplearning/tensorrt/latest/_static/python-api/pluginGuide.html#example-plugins-with-multiple-backends-using-custom-tactics) in the Python plugin guide for more info.

 ## Loading and running a TRT engine containing a plugin

@@ -234,7 +234,7 @@ Let's extend the [above sample](#using-multiple-tactics-and-onnx-cirular-padding

 Instead of specifying the OpenAI Triton Kernel callback to TRT through `@trt.plugin.impl`, we can directly compile the kernel ahead of time, and provide that to TRT under `@trt.plugin.aot_impl`.

-Refer [this section](https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/pluginGuide.html#providing-an-ahead-of-time-aot-implementation) in the Python plugin guide for more info.
+Refer [this section](https://docs.nvidia.com/deeplearning/tensorrt/latest/_static/python-api/pluginGuide.html#providing-an-ahead-of-time-aot-implementation) in the Python plugin guide for more info.

 ## ONNX model with an AOT plugin

@@ -287,7 +287,7 @@ options:
 - [Python Plugin Guide](https://docs.nvidia.com/deeplearning/tensorrt/latest/_static/python-api/pluginGuide.html)

 **`tensorrt.plugin` API reference**
-- [`tensorrt.plugin` module API reference](https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/infer/tensorrt.plugin/index.html)
+- [`tensorrt.plugin` module API reference](https://docs.nvidia.com/deeplearning/tensorrt/latest/_static/python-api/infer/tensorrt.plugin/index.html)

 **Developer Guide**
 - [Extending TensorRT with Custom Layers](https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#extending)