Skip to content

Conversation

@Ceng23333
Copy link
Contributor

@Ceng23333 Ceng23333 commented Jan 26, 2026

#193

  1. /models接口返回正确modelid
  2. 非流式接口返回openai格式
  3. 响应请求中sampling params和chat_template_kwargs
  4. tokenizer decode调用参考vllm
  5. request调度检查

@Ceng23333 Ceng23333 requested a review from a team January 26, 2026 06:14
@Ceng23333 Ceng23333 changed the title Issue/193 Issue/193: inference_server适配部署需求 Jan 26, 2026
@Ceng23333 Ceng23333 force-pushed the issue/193 branch 2 times, most recently from 9aaef0e to 89acdcc Compare February 2, 2026 06:31
last_len = getattr(req, "_stream_last_yielded_length", 0)
token_text = decoded_text[last_len:]
if token_text:
req._stream_last_yielded_length = len(decoded_text)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

处理尾部可能存在的不完整字符(max_tokens 导致的finished)

top_k=int(pick("top_k", self.top_k)),
max_tokens=int(max_tokens) if max_tokens is not None else None,
stop=stop,
stop_token_ids=stop_token_ids,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_check_request_finished中添加stop_token_ids的检查

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

start_time删掉,下面_chat的start_time也删掉

Ceng23333 and others added 5 commits February 4, 2026 10:24
Signed-off-by: Ceng23333 <441651826@qq.com>
Signed-off-by: Ceng23333 <441651826@qq.com>
Signed-off-by: Ceng23333 <441651826@qq.com>
Signed-off-by: Ceng23333 <441651826@qq.com>
Signed-off-by: Ceng23333 <441651826@qq.com>
Signed-off-by: Ceng23333 <441651826@qq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants