Skip to content

refactor teacher server #9457

Open
hjh0119 wants to merge 6 commits into modelscope:main from hjh0119:refactor-teacher-server

Collaborator

hjh0119 commented May 31, 2026

No description provided.

Contributor

gemini-code-assist Bot left a comment

Code Review

This pull request refactors the Generalized Knowledge Distillation (GKD) trainer to fetch teacher logprobs through a new inference-only client (VLLMInferClient) and a dedicated /infer/ endpoint. It also lazy-loads transformers imports across utility files to reduce startup time. The review identified several critical issues in the new implementation: self.teacher_client is initialized only on the main process, which will crash non-zero ranks during evaluation; incorrect tuple unpacking and an undefined variable encoded_chunkbatch in _assemble_topk_for_chunk will raise ValueError and NameError; parse_prompt_logprobs has no safety checks for None values and does no padding, which can lead to AttributeError and ValueError crashes; and popping _teacher_raw from chunks can raise a KeyError when the key is absent.
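The None-handling and padding concerns around parse_prompt_logprobs can be sketched in isolation. The following is a minimal illustration, not the PR's implementation: the function name `parse_prompt_logprobs_safe`, the input shape (a list whose entries are either None or a dict mapping token id to logprob), and the padding values `PAD_ID` and `PAD_LOGPROB` are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

PAD_ID = -1                   # hypothetical padding token id
PAD_LOGPROB = float("-inf")   # hypothetical padding log-probability


def parse_prompt_logprobs_safe(
    prompt_logprobs: Optional[List[Optional[Dict[int, float]]]],
    topk: int,
) -> Tuple[List[List[int]], List[List[float]]]:
    """Return per-position top-k token ids and logprobs; every row has length `topk`."""
    ids: List[List[int]] = []
    logps: List[List[float]] = []
    # Treat a missing list as empty instead of iterating over None.
    for entry in prompt_logprobs or []:
        # A single position may also be None; treat it as having no candidates.
        items = sorted((entry or {}).items(), key=lambda kv: kv[1], reverse=True)[:topk]
        pad = topk - len(items)
        # Pad short rows so all rows share one shape.
        ids.append([tok for tok, _ in items] + [PAD_ID] * pad)
        logps.append([lp for _, lp in items] + [PAD_LOGPROB] * pad)
    return ids, logps
```

Because every row is padded to `topk`, the result can be stacked into a rectangular tensor without a shape mismatch, and None entries never reach an attribute or `.items()` call.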

Comment thread swift/rlhf_trainers/gkd_trainer.py
Comment thread swift/rlhf_trainers/gkd_trainer.py Outdated
Comment thread swift/rlhf_trainers/utils.py
Comment thread swift/rlhf_trainers/gkd_trainer.py
Comment thread swift/pipelines/infer/deploy.py
Comment thread swift/pipelines/infer/infer.py Outdated
Comment thread swift/utils/env.py Outdated
Comment thread swift/infer_engine/vllm_engine.py Outdated
Collaborator Author

hjh0119 commented Jun 1, 2026

/gemini review

Contributor

gemini-code-assist Bot left a comment

Code Review

This pull request refactors the GKD (Generalized Knowledge Distillation) pipeline to fetch teacher logprobs more efficiently using a new /infer/ endpoint and VLLMInferClient, aligning the standard and Megatron-based trainers. The review feedback highlights critical issues: a rank mismatch in the Megatron trainer, where the teacher client is initialized on the last rank but called on rank 0; missing safety checks and padding in parse_prompt_logprobs, which could cause shape mismatches or AttributeErrors; and potential KeyErrors when popping keys from batch dictionaries.
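The rank mismatch and the KeyError risk can be illustrated with a small, framework-free sketch. Everything here is hypothetical: `TeacherClientOwner`, `owner_rank`, `make_client`, and `take_teacher_raw` are illustrative names rather than anything in swift or Megatron, and the only identifier taken from the review is the `_teacher_raw` key. The idea is that a single rank value decides both where the client is built and where it is called, and that optional keys are popped with a default.

```python
from typing import Any, Callable, Dict, Optional


class TeacherClientOwner:
    """Builds the client on exactly one rank and only calls it from that rank."""

    def __init__(self, rank: int, owner_rank: int,
                 make_client: Callable[[], Callable[[Any], Any]]):
        self.rank = rank
        self.owner_rank = owner_rank
        # The same condition guards construction and use, so they cannot disagree.
        self.client = make_client() if self.is_owner else None

    @property
    def is_owner(self) -> bool:
        return self.rank == self.owner_rank

    def fetch(self, payload: Any) -> Optional[Any]:
        if not self.is_owner:
            # Non-owner ranks never touch the client; they would get the
            # result from the owner through a collective instead.
            return None
        return self.client(payload)


def take_teacher_raw(batch: Dict[str, Any]) -> Optional[Any]:
    """Remove and return `_teacher_raw` if present; never raises KeyError."""
    return batch.pop("_teacher_raw", None)
```

In a real distributed run, the owner's result would then be shared with the other ranks, for example via `torch.distributed.broadcast_object_list` with the owner's rank as `src`, so that the rank that built the client is also the source of the broadcast.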

Comment thread swift/rlhf_trainers/utils.py
Comment thread swift/megatron/trainers/gkd_trainer.py
Comment thread swift/megatron/trainers/gkd_trainer.py
Comment thread swift/rlhf_trainers/gkd_trainer.py

2 participants