feat: Add SGLang inference guide and fuse checkpoint script#77

Open
CloudRipple wants to merge 2 commits into OpenMOSS:main from CloudRipple:main
Conversation

@CloudRipple
This pull request introduces support for an accelerated inference backend using SGLang and adds a new processor class for handling fused MOSS-TTS and MOSS-Audio-Tokenizer models. The documentation has been updated in both English and Chinese to reflect these enhancements, including detailed setup instructions and usage examples.

Documentation updates for SGLang backend:

  • Added a new section in README.md describing the SGLang backend for accelerated inference, including quick start instructions, request/response format, and notes on compilation behavior.
  • Updated the table of contents in README.md to include the SGLang backend section.
  • Added a corresponding section in README_zh.md with SGLang backend setup and usage instructions in Chinese.
  • Updated the table of contents in README_zh.md to reference the SGLang backend section.
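The request/response format mentioned above can be illustrated with a minimal sketch. SGLang's native HTTP API exposes a `/generate` endpoint accepting a JSON body with `text` and `sampling_params`; the host, port, prompt format, and sampling values below are assumptions for illustration, not the documented MOSS-TTS format.

```python
import json

# Hypothetical request payload for a locally running SGLang server.
# The /generate endpoint and sampling_params field follow SGLang's
# native HTTP API; the prompt string and parameter values here are
# placeholder assumptions.
payload = {
    "text": "Hello, world.",  # assumed prompt format
    "sampling_params": {
        "temperature": 0.7,
        "max_new_tokens": 512,
    },
}
body = json.dumps(payload).encode("utf-8")

# Sending the request requires a running server, e.g.:
# import urllib.request
# req = urllib.request.Request(
#     "http://127.0.0.1:30000/generate",  # assumed host/port
#     data=body,
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["text"])
```

See the README section added by this PR for the exact endpoint and payload fields used by the MOSS-TTS deployment.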

Codebase enhancement:

  • Introduced the MossTTSDelayWithCodecProcessor class in moss_tts_delay/processing_moss_tts_delay_with_codec.py to support processing for the fused MOSS-TTS and codec models, including token handling, audio normalization, and delay pattern application.
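The delay pattern application mentioned above can be sketched as follows: in delay-pattern decoding (as used by multi-codebook TTS models), codebook k is shifted right by k steps so each frame's codebooks are generated across successive decoding steps. This is a generic illustration of the technique, not the actual `MossTTSDelayWithCodecProcessor` implementation; the function name and `pad_id` handling are assumptions.

```python
import numpy as np

def apply_delay_pattern(codes: np.ndarray, pad_id: int) -> np.ndarray:
    """Shift codebook k right by k steps, padding with pad_id.

    codes: (num_codebooks, seq_len) array of codec token ids.
    Returns an array of shape (num_codebooks, seq_len + num_codebooks - 1).
    """
    num_q, seq_len = codes.shape
    out = np.full((num_q, seq_len + num_q - 1), pad_id, dtype=codes.dtype)
    for k in range(num_q):
        # Row k keeps its tokens but starts k steps later.
        out[k, k:k + seq_len] = codes[k]
    return out
```

For example, with two codebooks and pad id 0, `[[1, 2, 3], [4, 5, 6]]` becomes `[[1, 2, 3, 0], [0, 4, 5, 6]]`.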

@xiami2019
Member

Please add to the README a speed comparison against the HF backend, similar to the earlier TTSD one showing how many times faster it is.

