@@ -244,7 +244,7 @@ def generate_rand_batch(
 
 ######################################################################
 # Using SDPA with ``torch.compile``
-# =================================
+# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 #
 # With the release of PyTorch 2.0, a new feature called
 # ``torch.compile()`` has been introduced, which can provide
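A minimal sketch of what this section goes on to demonstrate: compiling a function that calls SDPA directly. The function name ``sdpa_fn`` and the tensor shapes are illustrative assumptions, not taken from the diff:

import torch
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"

def sdpa_fn(query, key, value):
    # Eager SDPA; is_causal=True applies a causal mask internally.
    return F.scaled_dot_product_attention(query, key, value, is_causal=True)

# torch.compile captures the whole function, SDPA call included.
compiled_fn = torch.compile(sdpa_fn)

# Shapes: (batch, num_heads, seq_len, head_dim)
q, k, v = (torch.randn(8, 8, 256, 64, device=device) for _ in range(3))
out = compiled_fn(q, k, v)
print(out.shape)  # torch.Size([8, 8, 256, 64])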
@@ -324,9 +324,9 @@ def generate_rand_batch(
 #
 
 ######################################################################
-# Using SDPA with attn_bias subclasses`
-# ==========================================
-#
+# Using SDPA with attn_bias subclasses
+# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
 # As of PyTorch 2.3, we have added a new submodule that contains tensor subclasses
 # designed to be used with ``torch.nn.functional.scaled_dot_product_attention``.
 # The module is named ``torch.nn.attention.bias`` and contains the following two
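The hunk cuts off mid-sentence above. As a sketch of how a tensor subclass from ``torch.nn.attention.bias`` is used, assuming the ``causal_lower_right`` helper available since PyTorch 2.3 (shapes are illustrative):

import torch
import torch.nn.functional as F
from torch.nn.attention.bias import causal_lower_right

batch, heads, head_dim = 2, 8, 64
q_len, kv_len = 4, 10  # query shorter than key/value, as in decoding

query = torch.randn(batch, heads, q_len, head_dim)
key = torch.randn(batch, heads, kv_len, head_dim)
value = torch.randn(batch, heads, kv_len, head_dim)

# The subclass encodes a causal mask aligned to the lower-right corner,
# which matters when q_len != kv_len.
attn_bias = causal_lower_right(q_len, kv_len)

# Passed as attn_mask, the subclass lets SDPA dispatch to a fused kernel
# instead of materializing a dense boolean mask.
out = F.scaled_dot_product_attention(query, key, value, attn_mask=attn_bias)

Passing the subclass rather than a plain boolean tensor is what preserves the fused-kernel dispatch.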
@@ -394,7 +394,7 @@ def generate_rand_batch(
 
 ######################################################################
 # Conclusion
-# ==========
+# ~~~~~~~~~~~
 #
 # In this tutorial, we have demonstrated the basic usage of
 # ``torch.nn.functional.scaled_dot_product_attention``. We have shown how
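For reference, the basic call the conclusion summarizes looks like this (shapes are illustrative):

import torch
import torch.nn.functional as F

# (batch, num_heads, seq_len, head_dim)
query, key, value = (torch.randn(2, 8, 128, 64) for _ in range(3))
out = F.scaled_dot_product_attention(query, key, value)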