v0.1.2

Lijiachen1018 released this 10 Dec 07:56
· 10 commits to develop since this release
aa31619

Some small fixes in this release.

  • [Docs] Documents are now easier to read.
  • [Docs] PD disaggregation documentation update: The PD disaggregation documentation no longer passes the --enforce-eager argument when starting the vllm service, so graph mode is enabled by default at startup.
  • [Feat] UCconnector has been completely removed; use UCMConnector from now on.
  • [Feat] UCM supports recovery from load failure: The get_block_ids_with_load_errors interface of the KVConnectorBase_V1 class is now implemented, enabling vLLM to re-execute inference for requests whose KV cache failed to load from UCM.
  • [Build] Install with pip install uc-manager==0.1.2; the install builds from source for both vllm and vllm-ascend.
  • [Build] The sparse module is now built and used only if the environment variable is set with export ENABLE_SPARSE=TRUE.
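
The load-failure recovery item can be illustrated with a small, self-contained sketch. It does not import vLLM or UCM: ToyConnector, load_blocks, failing_blocks, and requests_to_recompute are hypothetical names made up for this example, and only the method name get_block_ids_with_load_errors comes from the note above. The return type (a set of block IDs) and the clear-after-read behavior are assumptions, not a description of the actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ToyConnector:
    """Hypothetical stand-in for a KV connector; not the real UCMConnector."""

    # Block IDs whose load is simulated to fail (illustration only).
    failing_blocks: set[int] = field(default_factory=set)
    # Block IDs that failed to load since the last query.
    _load_errors: set[int] = field(default_factory=set)

    def load_blocks(self, block_ids: list[int]) -> None:
        """Pretend to load KV blocks and record the ones that fail."""
        for block_id in block_ids:
            if block_id in self.failing_blocks:
                self._load_errors.add(block_id)

    def get_block_ids_with_load_errors(self) -> set[int]:
        """Return the failed block IDs and reset the record (assumed behavior)."""
        errors, self._load_errors = self._load_errors, set()
        return errors


def requests_to_recompute(requests: dict[str, list[int]], failed: set[int]) -> list[str]:
    """Pick the requests that touch at least one failed block."""
    return sorted(
        req_id for req_id, blocks in requests.items() if failed.intersection(blocks)
    )


connector = ToyConnector(failing_blocks={3})
requests = {"req-a": [1, 2], "req-b": [3, 4]}
for blocks in requests.values():
    connector.load_blocks(blocks)

failed = connector.get_block_ids_with_load_errors()
retry = requests_to_recompute(requests, failed)
```

In this sketch only req-b is selected for re-execution, because it is the only request that references a block whose load failed; req-a keeps its loaded KV cache.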

Full Changelog: v0.1.0...v0.1.2