Some small fixes in this release.
- [Docs] Documents are now easier to read.
- [Docs] PD disaggregation documentation update: the PD disaggregation documentation no longer passes the --enforce-eager argument when starting the vllm service, so graph mode is enabled by default at startup.
- [Feat] Completely remove UCconnector; please use UCMConnector from now on.
- [Feat] UCM supports recovery from load failure: implement the get_block_ids_with_load_errors interface in the KVConnectorBase_V1 class, enabling vLLM to re-execute inference for requests whose KV cache failed to load from UCM.
- [Build] Use pip install uc-manager==0.1.2 and the install will build from source for both vllm and vllm-ascend.
- [Build] Sparse modules are now built and used only if the environment variable is set: export ENABLE_SPARSE=TRUE.
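
The load-failure recovery item above can be illustrated with a small standalone sketch. It is not the UCM or vLLM implementation: `ToyConnector`, `load_blocks`, `requests_to_rerun`, and the in-memory `store` are hypothetical names made up for illustration, and only the method name `get_block_ids_with_load_errors` comes from these notes. The sketch assumes the connector records the IDs of blocks it could not load and hands them back on request, so the caller can decide which requests to run again.

```python
from dataclasses import dataclass, field


@dataclass
class ToyConnector:
    """Hypothetical stand-in for a KV connector; not the UCM or vLLM class."""

    # Simulated external store: block ID -> payload.
    store: dict[int, bytes]
    # Block IDs whose load failed since the last report.
    _failed: set[int] = field(default_factory=set)

    def load_blocks(self, block_ids: list[int]) -> dict[int, bytes]:
        """Load what is available and remember the blocks that are missing."""
        loaded = {}
        for block_id in block_ids:
            payload = self.store.get(block_id)
            if payload is None:
                self._failed.add(block_id)
            else:
                loaded[block_id] = payload
        return loaded

    def get_block_ids_with_load_errors(self) -> set[int]:
        """Return the failed block IDs and reset the record."""
        failed, self._failed = self._failed, set()
        return failed


def requests_to_rerun(request_blocks: dict[str, list[int]], failed: set[int]) -> list[str]:
    """Pick the requests that touch at least one failed block (assumed policy)."""
    return sorted(
        rid for rid, blocks in request_blocks.items() if failed.intersection(blocks)
    )


# Block 3 is absent from the store, so only the request that needs it is rerun.
connector = ToyConnector(store={1: b"a", 2: b"b"})
request_blocks = {"req-0": [1, 2], "req-1": [2, 3]}
for blocks in request_blocks.values():
    connector.load_blocks(blocks)
failed = connector.get_block_ids_with_load_errors()
rerun = requests_to_rerun(request_blocks, failed)
print(failed, rerun)
```

Resetting the record on each call is a design choice of this sketch, so that a block is reported only once; the actual interface may behave differently.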
What's Changed
- [cleancode]rm video by @Lijiachen1018 in #459
- [fix] pick fixes from Release to develop by @Lijiachen1018 in #465
- [cleancode]remove uc connector by @Lijiachen1018 in #460
- [build] project docs for pypi by @Lijiachen1018 in #466
- [build]build sparse only if enabled by @Lijiachen1018 in #470
- [Misc] fetch dependence from gitcode as backup by @mag1c-h in #469
- [docs] renew docs by @Lijiachen1018 in #476
- release v0.1.1 by @Lijiachen1018 in #478
- feat: add MetaX MACA device support for PC by @simshi in #387
- [Docs] PD disaggregation documentation update by @sumingZero in #479
- [Feat] UCM supports recovery form load failure by @sumingZero in #477
- [feat]Add configurable scattergatter by @qyh111 in #483
- [bugfix]add synchronize on ascend platform by @qyh111 in #485
- [build] fix build by source distribution by @Lijiachen1018 in #484
- release v0.1.2 by @Lijiachen1018 in #491
- develop merge into main by @ygwpz in #492
- [docs] fix links in docs and add clarifications by @Lijiachen1018 in #499
Full Changelog: v0.1.0...v0.1.2