Fix rotation #1724

Open
wenhuach21 wants to merge 16 commits into main from fix_rotation

@wenhuach21 (Contributor) commented Apr 23, 2026

Description

Please briefly describe your main changes, the motivation.

Type of Change

  • Bug fix
  • New feature
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Other (please specify):

Related Issues

Fixes or relates to #

Checklist Before Submitting

  • My code has been tested locally.
  • Documentation has been updated as needed.
  • New or updated tests are included where applicable.

Copilot AI review requested due to automatic review settings April 23, 2026 04:38
Copilot AI (Contributor) left a comment

Pull request overview

This PR updates the experimental inplace-rotation implementation to better support models where head_dim is not hidden_size // num_heads (e.g., Qwen-3), reduce peak memory during weight rotation, and introduce a mechanism for model-specific rotation overrides.

Changes:

  • Resolve head_dim using mapping.attn_head_dim → config.head_dim → hidden_size // num_heads in rotation + hook codepaths.
  • Add chunked (memory-efficient) weight rotation for large matrices and adjust OV rotation to always use a decomposed per-head + cross-head form.
  • Introduce a special-model override registry and wire it into apply_rotation_transform.
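
The head_dim fallback order in the first bullet can be sketched as follows. This is a minimal illustration, not the PR's `_resolve_head_dim`: the function name `resolve_head_dim`, the use of `num_attention_heads` as the head-count attribute, and the config values are all assumptions made for the example.

```python
from types import SimpleNamespace
from typing import Any, Optional


def resolve_head_dim(config: Any, mapping: Optional[Any] = None) -> int:
    """Hypothetical helper: take head_dim from the most specific source available."""
    # 1) A per-model mapping override, if one is supplied and sets attn_head_dim.
    mapped = getattr(mapping, "attn_head_dim", None)
    if mapped is not None:
        return int(mapped)
    # 2) A head_dim declared directly on the model config.
    declared = getattr(config, "head_dim", None)
    if declared is not None:
        return int(declared)
    # 3) Fall back to the derived value (attribute name is an assumption).
    return config.hidden_size // config.num_attention_heads


# Illustrative values only; they do not describe any real checkpoint.
decoupled = SimpleNamespace(hidden_size=1024, num_attention_heads=16, head_dim=128)
classic = SimpleNamespace(hidden_size=1024, num_attention_heads=16)

print(resolve_head_dim(decoupled))  # declared head_dim wins over 1024 // 16
print(resolve_head_dim(classic))    # no declared head_dim, so the derived value is used
```

Deriving the value only as a last resort is what keeps models with a decoupled head_dim from being rotated with the wrong block size.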

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| auto_round/experimental/rotation_inplace/utils.py | Head-dim resolution update; new Hadamard handling paths (incl. chunked full-dim custom matrix) and non-pow2 behavior changes. |
| auto_round/experimental/rotation_inplace/special_model_handler.py | New registry/helpers for model-specific override application with logging. |
| auto_round/experimental/rotation_inplace/apply_rotation_transform.py | Adds _resolve_head_dim, chunked rotation helper, OV rotation fix, and applies model-specific overrides. |
| auto_round/__main__.py | Extends CLI --rotation_type choices to include quarot_hadamard. |
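
For readers unfamiliar with the pattern, a model-specific override registry with logging might look roughly like the sketch below. Every name here (`register_override`, `apply_overrides`, the `"example_model"` key, the `example_flag` setting) is hypothetical and is not taken from special_model_handler.py.

```python
import logging
from typing import Callable, Dict

logger = logging.getLogger("rotation_override_sketch")

# Hypothetical registry: model type -> function that adjusts rotation settings.
_OVERRIDES: Dict[str, Callable[[dict], dict]] = {}


def register_override(model_type: str):
    """Decorator that records an override function under a model type."""
    def decorator(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        _OVERRIDES[model_type] = fn
        return fn
    return decorator


def apply_overrides(model_type: str, settings: dict) -> dict:
    """Return settings adjusted by the registered override, if there is one."""
    fn = _OVERRIDES.get(model_type)
    if fn is None:
        return settings  # unknown model types pass through unchanged
    updated = fn(dict(settings))  # work on a copy so the caller's dict is untouched
    logger.info("Applied rotation override for %s: %s", model_type, updated)
    return updated


@register_override("example_model")  # hypothetical model type
def _example_override(settings: dict) -> dict:
    settings["example_flag"] = True  # hypothetical setting
    return settings
```

Keeping the overrides in one lookup table means the main rotation path only needs a single call site, and the log line records when a model took a non-default route.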
Comments suppressed due to low confidence (1)

auto_round/experimental/rotation_inplace/utils.py:521

  • In apply_exact_had_to_linear, the full-dimension had_matrix path applies W @ H.T regardless of output. This ignores the output=True contract and is inconsistent with the non-custom output=True path (which rotates via W = H @ W). Please implement the correct left-multiply when output=True for had_dim == -1 (and update the chunked fast path accordingly).
    if had_matrix is not None:
        H = had_matrix.to(device=compute_dev, dtype=torch.float64)
        if had_dim == -1:
            # Full-dimension custom matrix
            if output:
                # W.T = H @ W.T  →  W = (H @ W.T).T = W @ H.T
                W_ = W_ @ H.T
            else:
                # W = H @ W  (rotate input columns: W_new[i,:] = sum H[i,j]*W[j,:])
                # Actually for input side: W_new = W @ H (each row is rotated)
                W_ = W_ @ H.T
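
The distinction the comment draws can be checked on a tiny example. The sketch below assumes the usual linear-layer convention y = W x with W of shape (out_features, in_features) and an orthogonal H; the helper `rotate_weight` is hypothetical and only illustrates the left-multiply versus right-multiply behaviour, not the PR's `apply_exact_had_to_linear`.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(row) for row in zip(*a)]


def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def rotate_weight(W, H, output):
    """Hypothetical helper: rotate a weight on its output or input side."""
    if output:
        return matmul(H, W)          # output side: y_new = H @ y, so W_new = H @ W
    return matmul(W, transpose(H))   # input side: x_new = H @ x, so W_new = W @ H.T


def close(u, v):
    return all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(u, v))


# A non-symmetric orthogonal matrix (a 2-D rotation), so H and H.T differ.
H = [[0.6, -0.8], [0.8, 0.6]]
W = [[1.0, 2.0], [3.0, 4.0]]
x = [5.0, -7.0]
y = matvec(W, x)

# Output side: the rotated layer produces H @ y for the original input.
out_ok = close(matvec(rotate_weight(W, H, output=True), x), matvec(H, y))
# Input side: the rotated layer fed with H @ x reproduces the original y.
in_ok = close(matvec(rotate_weight(W, H, output=False), matvec(H, x)), y)
# The two rotations are different matrices, so one branch cannot serve both cases.
differ = not close(rotate_weight(W, H, True)[0], rotate_weight(W, H, False)[0])
print(out_ok, in_ok, differ)
```

With a symmetric Hadamard matrix H and H.T coincide, but W @ H.T and H @ W still differ in general, so identical branches cannot honor both settings of output.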

Comment thread auto_round/experimental/rotation_inplace/utils.py
Comment thread auto_round/experimental/rotation_inplace/special_model_handler.py Outdated
Comment thread auto_round/experimental/rotation_inplace/special_model_handler.py
Comment thread auto_round/__main__.py
Comment thread auto_round/experimental/rotation_inplace/utils.py
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@wenhuach21
Contributor Author

#1721
#1722

@wenhuach21
Contributor Author

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@wenhuach21
Contributor Author

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@wenhuach21
Contributor Author

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Labels

None yet

Projects

None yet

3 participants