Normalize OpenPangu RoPE scaling defaults #28361
Conversation
Code Review
This pull request aims to normalize the RoPE scaling defaults for OpenPangu models by reading from config.rope_scaling and backfilling any missing YaRN parameters. My main concern is that the implementation seems to be in OpenPanguMLAAttention, while the verification logs and model configuration for FreedomIntelligence/openPangu-Embedded-7B indicate that OpenPanguEmbeddedAttention is being used. This suggests the fix might be in the wrong place and would not be applied as intended. I've left a critical comment detailing this issue. Please verify the correct location for this logic.
/gemini review
Code Review
This pull request introduces a helper function _normalize_rope_scaling_defaults to centralize the logic for handling RoPE scaling parameters for OpenPangu models. This change effectively removes hardcoded default values from OpenPanguMLAAttention and ensures that both OpenPanguMLAAttention and OpenPanguEmbeddedAttention consistently apply these defaults. The implementation is clean, robust, and improves the model's configurability by allowing rope_scaling parameters to be specified in the model's configuration file. The changes are well-verified and represent a good improvement in code maintainability and consistency.
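For readers following along, a minimal sketch of what such a normalization helper could look like, assuming a plain dict-based config; the helper name and the idea of backfilling missing YaRN parameters come from the review above, but the concrete default values and extra key names below are placeholders, not taken from the PR:

```python
# Hypothetical sketch of the normalization described above.
# The default values are illustrative placeholders, not the PR's actual defaults.
_YARN_DEFAULTS = {
    "factor": 1.0,
    "original_max_position_embeddings": 4096,
    "beta_fast": 32,
    "beta_slow": 1,
}


def _normalize_rope_scaling_defaults(rope_scaling: dict | None) -> dict:
    """Copy config.rope_scaling and backfill any missing YaRN parameters."""
    normalized = dict(rope_scaling or {"rope_type": "yarn"})
    if normalized.get("rope_type", normalized.get("type")) == "yarn":
        for key, value in _YARN_DEFAULTS.items():
            normalized.setdefault(key, value)
    return normalized
```

Both attention classes would then call this once on config.rope_scaling before building their rotary embeddings, instead of each carrying its own hardcoded constants.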
💡 Codex Review
Here are some automated review suggestions for this pull request.
Signed-off-by: zitian.zhao <zitian.zhao@tencentmusic.com>
/gemini review
@codex review
Codex Review: Didn't find any major issues. Can't wait for the next one!
Code Review
This pull request refactors the RoPE scaling configuration for OpenPangu models by introducing a normalization function, _normalize_rope_scaling_defaults. This change removes hardcoded values and allows for using RoPE scaling parameters from the model's configuration, with sensible defaults. The implementation is applied to both OpenPanguMLAAttention and OpenPanguEmbeddedAttention.
My review identifies a potential logic issue in the new normalization function that could lead to inconsistent values for type and rope_type in the resulting configuration dictionary. I've provided a suggestion to refactor the logic for better clarity and correctness, ensuring that the configuration remains consistent regardless of which keys are provided by the user.
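To illustrate the kind of consistency the reviewer is asking for, here is a small sketch that resolves whichever key is present and writes the result back to both; the function name is hypothetical and the exact key handling in the PR may differ:

```python
def _resolve_rope_type(rope_scaling: dict, default: str = "yarn") -> dict:
    # Prefer "rope_type", fall back to the legacy "type" key, then the default,
    # and write the resolved value back to both keys so they never disagree.
    resolved = rope_scaling.get("rope_type") or rope_scaling.get("type") or default
    rope_scaling["rope_type"] = resolved
    rope_scaling["type"] = resolved
    return rope_scaling
```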
Summary
Verification
- CUDA_VISIBLE_DEVICES=0 vllm serve FreedomIntelligence/openPangu-Embedded-7B --port 8818 --trust_remote_code --served-model-name openPangu-Embedded-7B – logs show every OpenPanguEmbeddedAttention layer printing the resolved rope_scaling_config, confirming it was read from the config (see screenshot).
- curl http://localhost:8818/v1/chat/completions … – inference succeeds with the expected response, proving the model loads and serves normally (see screenshot).
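A roughly equivalent check can be scripted in Python against the OpenAI-compatible endpoint; the port and served model name are taken from the serve command above, while the prompt and token limit are arbitrary:

```python
# Illustrative equivalent of the curl verification above.
import requests

resp = requests.post(
    "http://localhost:8818/v1/chat/completions",
    json={
        "model": "openPangu-Embedded-7B",
        "messages": [{"role": "user", "content": "Hello, who are you?"}],
        "max_tokens": 64,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```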