143 commits
b24765a
Update setup.py
Jiang-Jia-Jun Apr 3, 2026
55dbc83
[Cherry-Pick][BugFix] prevent requests from entering running state wi…
liyonghua0910 Apr 3, 2026
7ab48c4
[Cherry-Pick][CI] Use GPU-Build-RL runner for _build_linux_rl.yml (#7…
EmmonsCurse Apr 3, 2026
36909bf
[Cherry-Pick][BugFix] fix MTP bugs in TP and overlap(#7172) (#7192)
huicongyao Apr 8, 2026
403ce13
remove arctic_inference deps (#7236)
Deleter-D Apr 8, 2026
6b78981
Split enable_mm (#7183) (#7233)
EmmonsCurse Apr 8, 2026
84d6271
[Feature]distinguish whl version (#7204) (#7224)
EmmonsCurse Apr 8, 2026
0181884
support moe for sm103 (#7240)
BingooYang Apr 8, 2026
9c65655
[Cherry-Pick][RL] support moe-topk use topk_reduce_func #7218 (#7256)
zoooo0820 Apr 9, 2026
5fd8020
[Cherry-Pick][BugFix] Fix batch_size derivation and relax shape check…
xiaoxiaohehe001 Apr 9, 2026
098dd2c
[XPU][CI] lock xvllm version for fix bug (#7264) (#7266)
EmmonsCurse Apr 9, 2026
849eb3d
[Cherry-Pick][Optimization] merge matmul and add (#6986) (#7191)
BingooYang Apr 9, 2026
6fcc25f
Update ci_metax.yml (#7286)
plusNew001 Apr 9, 2026
921a0ae
[Docs] Update docs for release/2.5 (#7267) (#7277)
EmmonsCurse Apr 9, 2026
dea9d35
[OP]Unify MoE op with moe_permute path for bf16 GLM (#7164) (#7279)
fxyfxy777 Apr 9, 2026
dd0863b
[BugFix] Fix Async D2H copy bug & flash mash atten cache V out of bou…
EmmonsCurse Apr 10, 2026
4f36346
[Cherry-Pick] change rms norm for glm #7269 (#7276)
zhangbo9674 Apr 10, 2026
c756038
[Cherry-Pick][FDConfig] Auto-scale CUDA Graph Capture & CLI Quantizat…
Deleter-D Apr 10, 2026
2ac9b89
[XPU][CI]Update xtdk version in download_dependencies.sh (#7320) (#7322)
EmmonsCurse Apr 10, 2026
65c6e72
[Cherry-Pick][Docs] Update Release Note(#7302) (#7341)
EmmonsCurse Apr 11, 2026
42b0f59
[Cherry-Pick][RL] change glm rope_emb calculation #7316 (#7318)
zoooo0820 Apr 11, 2026
7446665
[Cherry-Pick][RL]moe bf16 ep support paddle batch_gemm(#7337) (#7339)
ckl117 Apr 11, 2026
9e8ea7d
[Cherry-Pick][CI] Sync dev optimizations to 2.6(#7335) (#7343)
EmmonsCurse Apr 12, 2026
9cb82d7
[Cherry-Pick][TI-consistent] support quant use pow2scale(#7308) (#7310)
liuruyan Apr 13, 2026
b2997f3
fix overlap mtp empty run (#7314)
Sunny-bot1 Apr 13, 2026
d9a008f
[Feature] Support set PREEMPTED_TOKEN_ID in GET_SAVE_OUTPUT_V1 (#7159…
rainyfly Apr 13, 2026
9823d63
remove fa4 requirements (#7354)
zoooo0820 Apr 13, 2026
144dc17
update attn_mask_q 2 (#7373)
ckl117 Apr 13, 2026
e7c8dc2
[Speculate Decoding] Fix step_idx semantics in limit_thinking and set…
lonelygsh Apr 14, 2026
8a8beca
[BugFix][PD Disaggregation][KVCache] Fix low cache hit rate in PD spl…
EmmonsCurse Apr 14, 2026
f6c066f
Revert "[Optimization] Optimize ttft for prefill pd (#6680)" (#7386)
freeliuzc Apr 14, 2026
5f7524e
fix rl moe gate type (#7394)
Sunny-bot1 Apr 14, 2026
2ee1cc3
check init_flash_attn_version log (#7401)
ckl117 Apr 15, 2026
61bfe6e
modify flashmask version (#7414)
BingooYang Apr 15, 2026
26674bb
[Cherry-Pick][RL] Add clear_graph_opt_backend for glm4_mtp (#7378) (#…
Deleter-D Apr 15, 2026
b8e8a62
PD deployment support without router (#7412) (#7424)
juncaipeng Apr 16, 2026
72ce56b
[BugFix] fix tool call parser (#7369) (#7419)
EmmonsCurse Apr 16, 2026
185708b
[Cherry-Pick][BugFix] Fix real token exceeding max_batched_tokens lim…
freeliuzc Apr 17, 2026
650d1e4
[Cherry-Pick][Speculative Decoding] Add MTP logprob support for PD di…
Deleter-D Apr 17, 2026
56b761d
[Cherry-Pick][Speculative Decoding][BugFix] Fix apply repeat times pe…
freeliuzc Apr 17, 2026
fc801f8
[Bugfix][RL] fix control request timeout in async update weights pipe…
jackyYang6 Apr 20, 2026
f4f7760
[CI] Temporarily pin paddlepaddle-gpu to 3.5.0.dev20260417 (#7486) (#…
EmmonsCurse Apr 20, 2026
95261f0
Unify num_experts_per_tok to moe_k in ModelConfig for MoE model compa…
xyxinyang Apr 21, 2026
74ddb20
[RL][Cherry-Pick] Fix the out-of-bounds issue caused by int32 in the…
gongshaotian Apr 21, 2026
be2fd17
add m_grouped_bf16_gemm_nn_contiguous(#7536)
ckl117 Apr 21, 2026
13034ef
[BugFix] Fix skip_x_record_stream incompatibility across deep_ep vers…
EmmonsCurse Apr 21, 2026
d551846
Mooncake storage register local buffer by chunk (#7416) (#7540)
juncaipeng Apr 22, 2026
86df2a9
Update args_utils.py (#7549)
Jiang-Jia-Jun Apr 22, 2026
b0fde16
Enable output caching by default
Jiang-Jia-Jun Apr 22, 2026
2961400
[Cherry-Pick][BugFix] Fix clear_parameters hang issue in MTP during w…
Deleter-D Apr 22, 2026
9c91ecb
[Cherry-Pick][BugFix] Fix bugs in /v1/abort_requests interface from P…
qwes5s5 Apr 22, 2026
3d6d3a2
[DataProcessor] add completions (#7543) (#7558)
EmmonsCurse Apr 22, 2026
2c04dfd
Update args_utils.py
Jiang-Jia-Jun Apr 22, 2026
9ef8467
[Scheduler][BugFix] Fix token_budget calculation to use actual decode…
EmmonsCurse Apr 22, 2026
258b22a
support deepgemm without bias input (#7559) (#7565)
EmmonsCurse Apr 23, 2026
b3aa469
[KSM] support keep sampling mask (#7460)
zeroRains Apr 23, 2026
eb92613
[Cherry-Pick][BugFix] Fix save_output_specualate parameter bugs in su…
Deleter-D Apr 23, 2026
10f5a20
Cache queue support ipc (#7589)
juncaipeng Apr 23, 2026
af68b26
[RL] Remove redundant barrier and optimize model weights signal broad…
EmmonsCurse Apr 24, 2026
4cbae62
Use triton qk_norm both in Prefill and Decode (#7213) (#7306)
EmmonsCurse Apr 24, 2026
8d7063e
[Cherry-Pick][Optimization]Change default workers and max-concurrency…
EmmonsCurse Apr 24, 2026
0de0be4
[Others] print evictable blocks in console log (#7384) (#7580)
EmmonsCurse Apr 24, 2026
d88982b
[Optimization] Support async D2H copy for MTP logprobs & Clean up ove…
EmmonsCurse Apr 24, 2026
5508979
Fix PD interaction and error response (#7606)
juncaipeng Apr 24, 2026
c8a59a3
[Cherry-Pick][CI] Sync dev optimizations to 2.6(#7602) (#7610)
EmmonsCurse Apr 24, 2026
6ad8fce
[RL][Feature] R3 Support GPUPrefixCache, CPUPrefixCache, PD Disaggreg…
gongshaotian Apr 27, 2026
e0cad0f
[Cherry-Pick][Speculative Decoding][BugFix] overlap compute logprobs …
huicongyao Apr 27, 2026
eee8289
[Bugfix]compile support SM100 (#7581) (#7629)
ChowMingSing Apr 27, 2026
99444f6
fix fp8 infer error (#7627) (#7631)
EmmonsCurse Apr 28, 2026
23e0a84
[Cherry-Pick][CI] Pin Paddle to release/3.3 last_commit build in 2.6(…
EmmonsCurse Apr 28, 2026
5582b5a
[BugFix][Speculative Decoding] Fix tokens_per_seq min value calculati…
EmmonsCurse Apr 28, 2026
ecb31fb
[KVCache] Support flush FD GPU/CPU Cache index by AttentionStore (#7644)
jackyYang6 Apr 28, 2026
37672f9
support different AS interface for GPU and XPU (#7380) (#7647)
ApplEOFDiscord Apr 28, 2026
bfff3d9
[Cherry-Pick][KVCache] Support environment variable overrides for Att…
jackyYang6 Apr 28, 2026
188db35
[RL] Correct the semantics of max_num_batched_tokens with multimodal …
gongshaotian Apr 28, 2026
0aa3e25
[Cherry-Pick][RL] rl support mix_quant (#7645) (#7650)
ckl117 Apr 29, 2026
32d5f5b
Refine metrics and trace for pd (#7613) (#7661)
juncaipeng Apr 29, 2026
3ac8ff2
Remove recode info for request when finish sending cache (#7664)
juncaipeng Apr 29, 2026
f8e38f6
abort requests fix2 (#7652)
qwes5s5 Apr 29, 2026
97cda57
[Cherry-Pick][Optimize]Compute slot_mapping and position_ids(#7313 #7…
ShaneGZhu Apr 29, 2026
75f328c
[Cherry-Pick][Optimization] Support logprob overlap in speculative de…
Deleter-D Apr 29, 2026
d3a2c71
[Cherry-Pick][BugFix][KVCache] Fix inference slowdown when enabling C…
kevincheng2 Apr 29, 2026
df1d64c
Fix key error for updating mtp model weights (#7676)
juncaipeng Apr 30, 2026
c1f9714
[Cherry-Pick] [BugFix] Fix get_tasks returns empty list and incorrect…
liyonghua0910 Apr 30, 2026
0ec9625
[Cherry-Pick] [BugFix] fix preempted token id not returned when a ful…
liyonghua0910 Apr 30, 2026
66dea60
[BugFix] Fix get_tasks returns empty list and incorrect nnode computa…
liyonghua0910 May 1, 2026
d0a0b3e
fix rl overlap (#7745)
Sunny-bot1 May 8, 2026
d5af459
[Cherry-Pick] [BugFix] Fix stop token sequence pointer offset and act…
chang-wenbin May 9, 2026
a5fa727
[BugFix][KVCache][Speculative Decoding] Fix get_max_chunk_tokens for …
EmmonsCurse May 9, 2026
d92163f
[BugFix] Fix ZMQ multipart frame interleaving in Splitwise connector …
yuanlehome May 9, 2026
228987a
[Cherry-Pick] [BugFix] [RL] Fix cpu cache for rl (#7764) (#7765)
liyonghua0910 May 11, 2026
ad431c7
[RL] R3 Support Overlap Schedule (#7674)
gongshaotian May 11, 2026
53af5cc
[Cherry-Pick][CI] Remove checklist validation from CheckPRTemplate.py…
EmmonsCurse May 11, 2026
f8a0cf2
[BugFix][KSM] Fix sampling_mask reordering in recover_batch_index_for…
DesmonDay May 11, 2026
7901aeb
[XPU][CI] fix XPU CI bug (#7778)
plusNew001 May 11, 2026
a5191f2
[Cherry-Pick][Cleanup] Replace torch proxy alias with public compat A…
SigureMo May 11, 2026
fae4a8b
[BugFix] Fix KSM bug in MTP and Overlap (#7788)
zeroRains May 12, 2026
ae5dac1
[Cherry-Pick][Optimization] enable trtllm_all_reduce fusion kernel in…
BingooYang May 12, 2026
0077822
[FDConfig] Enable by default FD_ENABLE_E2W_TENSOR_CONVERT and FD_ENGINE_TASK_QUEUE_W…
sunlei1024 May 13, 2026
976cb7b
[BugFix] fix: cast image_mask.any() to bool for task queue serializat…
EmmonsCurse May 13, 2026
90c010d
[Cherry-Pick][Speculative Decoding] Support mtp super ultra overlap i…
freeliuzc May 13, 2026
4e7a46e
prepare request in prefill instance by multi threads (#7724)
juncaipeng May 13, 2026
d38eeb8
[Scheduler] [Optimization] Only preempt decode requests and better ma…
liyonghua0910 May 13, 2026
5e76c8b
fix(PrefixCache): fix garbled text in PD disaggregation by early retu…
EmmonsCurse May 13, 2026
33b22b3
[Cherry-pick] [Optimization] Elemenwise fusion (#6880) (#7683)
BingooYang May 13, 2026
dc1fea1
[Cherry-Pick] [BugFix] Fix abort when enabling overlap schedule (#780…
liyonghua0910 May 13, 2026
478c9fa
[RL] pause: use abort pipeline with scheduling loop alive for gracefu…
jackyYang6 May 14, 2026
d02f3ba
[Feature] Add TritonMoEMethod for BF16 MoE inference (#7815)
xuanyuanminzheng May 14, 2026
df637af
refact abort requests (#7808)
qwes5s5 May 14, 2026
18cab83
fix paddle optional get assert in sm103 (#7820)
zoooo0820 May 14, 2026
72beb9e
opt moe_align_kernel (#7786)
yongqiangma May 14, 2026
04e4ae8
[Cherry-Pick][BugFix] Fix pause drain hang caused by stale abort mark…
jackyYang6 May 15, 2026
d71bdda
[Cherry-Pick][CI] Optimize clean_ports logic by removing redundant co…
EmmonsCurse May 15, 2026
514ed5c
[Cherry-Pick][Op][Optimization]Kernel fusion: cast+sigmoid+bias+noaux…
ShaneGZhu May 18, 2026
9894b32
[Cherry-Pick][RL] Support cpu tensor broadcast(#7833) (#7840)
Sunny-bot1 May 18, 2026
ab3c5f4
[Cherry-Pick][CI] Set --workers=1 to avoid intermittent timeout failu…
EmmonsCurse May 19, 2026
41d44d6
fix refact abort (#7838)
qwes5s5 May 19, 2026
8c4f5a6
[Cherry-Pick] update fleet_ops(#7859) (#7858)
liuruyan May 20, 2026
31b12ee
[Cherry-Pick][Optimization] Reduce logprob processing overhead by usi…
Sunny-bot1 May 20, 2026
b5c8290
[RL] Reset buffer size of `slot_mapping` (#7868)
gongshaotian May 21, 2026
b562b8d
fix ce bug (#7874)
liuruyan May 21, 2026
485f6c2
[Cherry-Pick][Feature][Log]console metrics log for pd disaggregation …
CSWYF3634076 May 22, 2026
e7815be
[Cherry-Pick][Benchmark] Add inner benchmark metrics component (#7881…
Deleter-D May 22, 2026
5d18984
fix(kvcache): buffer early layer0 signals (#7896)
kevincheng2 May 25, 2026
3ffeb44
[Cherry-Pick][CI] Restore self-hosted runners for GitHub workflows(#7…
EmmonsCurse May 25, 2026
85399db
[Cherry-pick][XPU][CI] fix logs update bug (#7915)
plusNew001 May 25, 2026
e7a02e2
supoort glm yarn rope (#7894)
Sunny-bot1 May 25, 2026
0a5d4b6
[bugfix] AS block leaks (#7895)
zccjjj May 26, 2026
bf0dace
[Scheduler] Increase sleep interval in fetch loops and cancel schedul…
liyonghua0910 May 26, 2026
a095d6f
[Cherry-Pick][Feature] support decode unified attention for mix(#7688…
lizhenyun01 May 26, 2026
c52b063
[Cherry-Pick][Optimization][Speculative Decoding]opt mtp logprob (#78…
Sunny-bot1 May 26, 2026
261041b
[Cherry-Pick][Bugfix] Fix clear bug in RL causing CUDA error 700 duri…
freeliuzc May 27, 2026
8a1e71d
[PD] PD send cache via storage & Refine swap_cache_layout op (#7839)
juncaipeng May 27, 2026
2b0fd53
[Cherry-Pick][Optimization]support fused noauxtc kernel on ep mode(#7…
ShaneGZhu May 28, 2026
1e7ee22
[Cherry-Pick] [Optimization] TopP=1.0 using _random_sample (#7892) an…
ckl117 May 28, 2026
fefbcff
[Cherry-Pick] [BugFix] fix all reduce fusion accurate issue (#7923) (…
BingooYang May 28, 2026
ac24fcc
[Cherry-Pick][BugFix] fix mtp reset bugs in rl (#7957) (#7958)
Deleter-D May 29, 2026
7198b58
[RL] Fix the incorrect routing of EOS tokens, which leads to changes …
gongshaotian Jun 1, 2026
eeed8a3
[RL] Fix Ernie mm bug (#7966)
gongshaotian Jun 2, 2026
780c000
[Cherry-Pick][RL][Feature] Add GDR streaming weight update path (#795…
jackyYang6 Jun 3, 2026
99c7df1
fix moe accurate issue
BingooYang Jun 3, 2026
f232ed9
fix bug
BingooYang Jun 3, 2026
b5ec4fa
add test
BingooYang Jun 3, 2026
1 change: 1 addition & 0 deletions .flake8
@@ -5,3 +5,4 @@ max-line-length = 119
 # E402: module level import not at top of file
 per-file-ignores =
 __init__.py:F401,F403,E402
+fastdeploy/model_executor/layers/sample/ops/top_k_top_p_triton.py:E241,E121,E131,E266
3 changes: 2 additions & 1 deletion .github/workflows/CheckPRTemplate.yml
@@ -10,7 +10,8 @@ jobs:
 check:
 name: Check PR Template
 if: ${{ github.repository_owner == 'PaddlePaddle' }}
-runs-on: ubuntu-latest
+runs-on:
+group: APPROVAL
 env:
 PR_ID: ${{ github.event.pull_request.number }}
 BASE_BRANCH: ${{ github.event.pull_request.base.ref }}
3 changes: 2 additions & 1 deletion .github/workflows/Codestyle-Check.yml
@@ -10,7 +10,8 @@ jobs:
 pre-commit:
 name: Pre Commit
 if: ${{ github.repository_owner == 'PaddlePaddle' }}
-runs-on: ubuntu-latest
+runs-on:
+group: APPROVAL
 env:
 PR_ID: ${{ github.event.pull_request.number }}
 BRANCH: ${{ github.event.pull_request.base.ref }}
36 changes: 31 additions & 5 deletions .github/workflows/_accuracy_test.yml
@@ -69,12 +69,27 @@ jobs:
 if ls "${REPO_NAME}"* >/dev/null 2>&1; then
 echo "ERROR: Failed to clean ${REPO_NAME}* after multiple attempts"
 ls -ld "${REPO_NAME}"*
-exit 1
+echo "Attempting force cleanup with find..."
+find /workspace -mindepth 1 -maxdepth 1 -name "${REPO_NAME}*" -type d -exec chmod -R u+rwx {} \; -exec rm -rf {} + 2>/dev/null || true
+if ls "${REPO_NAME}"* >/dev/null 2>&1; then
+echo "ERROR: Force cleanup still failed"
+exit 1
+else
+echo "Force cleanup succeeded"
+fi
 fi
 '

-wget -q --no-proxy ${fd_archive_url}
-tar -xf FastDeploy.tar.gz
+wget -q --no-proxy ${fd_archive_url} || {
+echo "ERROR: Failed to download archive from ${fd_archive_url}"
+exit 1
+}
+
+tar --no-same-owner -xf FastDeploy.tar.gz || {
+echo "ERROR: Failed to extract archive"
+exit 1
+}
+
 rm -rf FastDeploy.tar.gz
 cd FastDeploy
 git config --global user.name "FastDeployCI"
@@ -145,7 +160,10 @@ jobs:
 docker rm -f ${runner_name} || true
 fi

-docker run --rm --ipc=host --pid=host --net=host \
+docker run --rm --net=host \
+--shm-size=64g \
+--sysctl kernel.msgmax=1048576 \
+--sysctl kernel.msgmnb=268435456 \
 --name ${runner_name} \
 -v $(pwd):/workspace \
 -w /workspace \
@@ -160,8 +178,9 @@ jobs:
 -v "${CACHE_DIR}/.cache:/root/.cache" \
 -v "${CACHE_DIR}/ConfigDir:/root/.config" \
 -e TZ="Asia/Shanghai" \
+-e "no_proxy=localhost,127.0.0.1,0.0.0.0,bcebos.com,.bcebos.com,bj.bcebos.com,su.bcebos.com,paddle-ci.gz.bcebos.com,apiin.im.baidu.com,baidu-int.com,.baidu.com,aliyun.com,gitee.com,pypi.tuna.tsinghua.edu.cn,.tuna.tsinghua.edu.cn" \
 --gpus '"device='"${DEVICES}"'"' ${docker_image} /bin/bash -xc '
-python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/
+python -m pip install https://paddle-qa.bj.bcebos.com/paddle-pipeline/Release-TagBuild-Training-Linux-Gpu-Cuda12.6-Cudnn9.5-Trt10.5-Mkl-Avx-Gcc11-SelfBuiltPypiUse/2b9f8b689bc8988f97a5ede056c8c81bfa0332c2/paddlepaddle_gpu-3.3.1.post20260420+2b9f8b689bc-cp310-cp310-linux_x86_64.whl --extra-index-url https://www.paddlepaddle.org.cn/packages/stable/cu126/

 pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple

@@ -204,3 +223,10 @@ jobs:
 fi
 echo "TEST_EXIT_CODE=${TEST_EXIT_CODE}"
 exit ${TEST_EXIT_CODE}
+
+- name: Terminate and delete the container
+if: always()
+run: |
+set +e
+docker exec -t ${{ runner.name }} /bin/bash -c 'find /workspace -mindepth 1 -delete'
+docker rm -f ${{ runner.name }}
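
The cleanup fallback added to this workflow (plain removal first, then `find` with `chmod -R u+rwx` before a second `rm -rf`) can be tried outside CI with a small sketch like the one below. The function name `force_clean` and its two arguments are hypothetical and do not appear in the workflow; the sketch returns a status instead of calling `exit 1` so it can be sourced safely.

```shell
#!/usr/bin/env bash
# Hypothetical helper; names are illustrative and not part of the workflow.
force_clean() {
  local root="$1" prefix="$2"

  # First attempt: plain recursive removal; errors are tolerated here.
  rm -rf "${root:?}/${prefix}"* 2>/dev/null || true

  # Fallback: if anything still matches, restore owner permissions and retry.
  if ls -d "${root}/${prefix}"* >/dev/null 2>&1; then
    find "${root}" -mindepth 1 -maxdepth 1 -name "${prefix}*" -type d \
      -exec chmod -R u+rwx {} \; -exec rm -rf {} + 2>/dev/null || true
  fi

  # Fail only when the fallback also left something behind.
  if ls -d "${root}/${prefix}"* >/dev/null 2>&1; then
    echo "ERROR: Force cleanup still failed" >&2
    return 1
  fi
  echo "Cleanup succeeded"
}
```

Calling `force_clean /workspace FastDeploy` would mirror the workflow's intent: a directory tree that lost its write bits no longer aborts the job unless the forced pass fails as well.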
29 changes: 24 additions & 5 deletions .github/workflows/_base_test.yml
@@ -81,7 +81,14 @@ jobs:
 if ls "${REPO_NAME}"* >/dev/null 2>&1; then
 echo "ERROR: Failed to clean ${REPO_NAME}* after multiple attempts"
 ls -ld "${REPO_NAME}"*
-exit 1
+echo "Attempting force cleanup with find..."
+find /workspace -mindepth 1 -maxdepth 1 -name "${REPO_NAME}*" -type d -exec chmod -R u+rwx {} \; -exec rm -rf {} + 2>/dev/null || true
+if ls "${REPO_NAME}"* >/dev/null 2>&1; then
+echo "ERROR: Force cleanup still failed"
+exit 1
+else
+echo "Force cleanup succeeded"
+fi
 fi
 '

@@ -111,7 +118,11 @@ jobs:
 exit 1
 fi

-tar -xf FastDeploy.tar.gz
+tar --no-same-owner -xf FastDeploy.tar.gz || {
+echo "ERROR: Failed to extract archive"
+exit 1
+}
+
 rm -rf FastDeploy.tar.gz
 cd FastDeploy
 git config --global user.name "FastDeployCI"
@@ -200,8 +211,9 @@ jobs:
 -v "${CACHE_DIR}/.cache:/root/.cache" \
 -v "${CACHE_DIR}/ConfigDir:/root/.config" \
 -e TZ="Asia/Shanghai" \
+-e "no_proxy=localhost,127.0.0.1,0.0.0.0,bcebos.com,.bcebos.com,bj.bcebos.com,su.bcebos.com,paddle-ci.gz.bcebos.com,apiin.im.baidu.com,baidu-int.com,.baidu.com,aliyun.com,gitee.com,pypi.tuna.tsinghua.edu.cn,.tuna.tsinghua.edu.cn" \
 --gpus '"device='"${DEVICES}"'"' ${docker_image} /bin/bash -xc '
-python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/
+python -m pip install https://paddle-qa.bj.bcebos.com/paddle-pipeline/Release-TagBuild-Training-Linux-Gpu-Cuda12.6-Cudnn9.5-Trt10.5-Mkl-Avx-Gcc11-SelfBuiltPypiUse/2b9f8b689bc8988f97a5ede056c8c81bfa0332c2/paddlepaddle_gpu-3.3.1.post20260420+2b9f8b689bc-cp310-cp310-linux_x86_64.whl --extra-index-url https://www.paddlepaddle.org.cn/packages/stable/cu126/

 pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple

@@ -254,13 +266,13 @@ jobs:

 curl -X POST http://0.0.0.0:${FLASK_PORT}/switch \
 -H "Content-Type: application/json" \
--d "{ \"--model\": \"/MODELDATA/ERNIE-4.5-0.3B-Paddle\", \"--max-concurrency\": 5, \"--max-waiting-time\": 1 }"
+-d "{ \"--model\": \"/MODELDATA/ERNIE-4.5-0.3B-Paddle\", \"--workers\": 1, \"--max-concurrency\": 5, \"--max-waiting-time\": 1 }"
 check_service 90
 python -m pytest -sv test_max_concurrency.py || TEST_EXIT_CODE=1

 curl -X POST http://0.0.0.0:${FLASK_PORT}/switch \
 -H "Content-Type: application/json" \
--d "{ \"--model\": \"/MODELDATA/ERNIE-4.5-0.3B-Paddle\", \"--max-concurrency\": 5000, \"--max-waiting-time\": 1 }"
+-d "{ \"--model\": \"/MODELDATA/ERNIE-4.5-0.3B-Paddle\", \"--workers\": 1, \"--max-concurrency\": 5000, \"--max-waiting-time\": 1 }"
 check_service 90
 python -m pytest -sv test_max_waiting_time.py || TEST_EXIT_CODE=1

@@ -294,3 +306,10 @@ jobs:
 fi
 echo "TEST_EXIT_CODE=${TEST_EXIT_CODE}"
 exit ${TEST_EXIT_CODE}
+
+- name: Terminate and delete the container
+if: always()
+run: |
+set +e
+docker exec -t ${{ runner.name }} /bin/bash -c 'find /workspace -mindepth 1 -delete'
+docker rm -f ${{ runner.name }}
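
The `command || { echo ...; exit 1; }` guard that this change puts around `tar --no-same-owner -xf` can be exercised locally with a sketch such as the following. The function name `extract_archive` and its arguments are hypothetical, and the sketch uses `return 1` where the workflow uses `exit 1`, so that sourcing it does not terminate the calling shell.

```shell
#!/usr/bin/env bash
# Hypothetical helper; function name and arguments are illustrative.
extract_archive() {
  local archive="$1" dest="$2"

  # --no-same-owner: extracted files belong to the invoking user instead of
  # the owner IDs recorded in the archive.
  tar --no-same-owner -xf "${archive}" -C "${dest}" || {
    echo "ERROR: Failed to extract archive" >&2
    return 1
  }
}
```

The guard turns a failed extraction into an explicit error message and a non-zero status at the point of failure, rather than letting a later `cd FastDeploy` fail with a less specific message.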
14 changes: 12 additions & 2 deletions .github/workflows/_build_linux.yml
@@ -125,6 +125,7 @@ jobs:
 git config --global user.name "FastDeployCI"
 git config --global user.email "fastdeploy_ci@example.com"
 git log -n 3 --oneline
+
 - name: FastDeploy Build
 shell: bash
 env:
@@ -156,7 +157,8 @@ jobs:
 PARENT_DIR=$(dirname "$WORKSPACE")
 echo "PARENT_DIR:$PARENT_DIR"
 docker run --rm --net=host \
---cap-add=SYS_PTRACE --privileged --shm-size=64G \
+--cap-add=SYS_PTRACE --shm-size=64G \
+--name ${runner_name} \
 -v $(pwd):/workspace -w /workspace \
 -v "${CACHE_DIR}/gitconfig:/etc/gitconfig:ro" \
 -v "${CACHE_DIR}/.cache:/root/.cache" \
@@ -171,6 +173,7 @@ jobs:
 -e "PADDLE_WHL_URL=${PADDLE_WHL_URL}" \
 -e "BRANCH_REF=${BRANCH_REF}" \
 -e "CCACHE_MAXSIZE=50G" \
+-e "no_proxy=localhost,127.0.0.1,0.0.0.0,bcebos.com,.bcebos.com,bj.bcebos.com,su.bcebos.com,paddle-ci.gz.bcebos.com,apiin.im.baidu.com,baidu-int.com,.baidu.com,aliyun.com,gitee.com,pypi.tuna.tsinghua.edu.cn,.tuna.tsinghua.edu.cn" \
 --gpus "\"device=${gpu_id}\"" ${docker_image} /bin/bash -c '
 if [[ -n "${FD_VERSION}" ]]; then
 export FASTDEPLOY_VERSION=${FD_VERSION}
@@ -193,7 +196,7 @@ jobs:
 elif [[ "${PADDLEVERSION}" != "" ]];then
 python -m pip install paddlepaddle-gpu==${PADDLEVERSION} -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
 else
-python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/
+python -m pip install https://paddle-qa.bj.bcebos.com/paddle-pipeline/Release-TagBuild-Training-Linux-Gpu-Cuda12.6-Cudnn9.5-Trt10.5-Mkl-Avx-Gcc11-SelfBuiltPypiUse/2b9f8b689bc8988f97a5ede056c8c81bfa0332c2/paddlepaddle_gpu-3.3.1.post20260420+2b9f8b689bc-cp310-cp310-linux_x86_64.whl --extra-index-url https://www.paddlepaddle.org.cn/packages/stable/cu126/
 fi

 pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
@@ -248,3 +251,10 @@ jobs:
 target_path_stripped="${target_path#paddle-github-action/}"
 WHEEL_PATH=https://paddle-github-action.bj.bcebos.com/${target_path_stripped}/${fd_wheel_name}
 echo "wheel_path=${WHEEL_PATH}" >> $GITHUB_OUTPUT
+
+- name: Terminate and delete the container
+if: always()
+run: |
+set +e
+docker exec -t ${{ runner.name }} /bin/bash -c 'find /workspace -mindepth 1 -delete'
+docker rm -f ${{ runner.name }}
14 changes: 12 additions & 2 deletions .github/workflows/_build_linux_cu129.yml
@@ -112,6 +112,7 @@ jobs:
 git config --global user.name "FastDeployCI"
 git config --global user.email "fastdeploy_ci@example.com"
 git log -n 3 --oneline
+
 - name: FastDeploy Build
 shell: bash
 env:
@@ -143,7 +144,8 @@ jobs:
 PARENT_DIR=$(dirname "$WORKSPACE")
 echo "PARENT_DIR:$PARENT_DIR"
 docker run --rm --net=host \
---cap-add=SYS_PTRACE --privileged --shm-size=64G \
+--cap-add=SYS_PTRACE --shm-size=64G \
+--name ${runner_name} \
 -v $(pwd):/workspace -w /workspace \
 -v "${CACHE_DIR}/gitconfig:/etc/gitconfig:ro" \
 -v "${CACHE_DIR}/.cache:/root/.cache" \
@@ -158,6 +160,7 @@ jobs:
 -e "PADDLE_WHL_URL=${PADDLE_WHL_URL}" \
 -e "BRANCH_REF=${BRANCH_REF}" \
 -e "CCACHE_MAXSIZE=50G" \
+-e "no_proxy=localhost,127.0.0.1,0.0.0.0,bcebos.com,.bcebos.com,bj.bcebos.com,su.bcebos.com,paddle-ci.gz.bcebos.com,apiin.im.baidu.com,baidu-int.com,.baidu.com,aliyun.com,gitee.com,pypi.tuna.tsinghua.edu.cn,.tuna.tsinghua.edu.cn" \
 --gpus "\"device=${gpu_id}\"" ${docker_image} /bin/bash -c '
 if [[ -n "${FD_VERSION}" ]]; then
 export FASTDEPLOY_VERSION=${FD_VERSION}
@@ -180,7 +183,7 @@ jobs:
 elif [[ "${PADDLEVERSION}" != "" ]];then
 python -m pip install paddlepaddle-gpu==${PADDLEVERSION} -i https://www.paddlepaddle.org.cn/packages/stable/cu129/
 else
-python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu129/
+python -m pip install https://paddle-qa.bj.bcebos.com/paddle-pipeline/Release-TagBuild-Training-Linux-Gpu-Cuda12.9-Cudnn9.9-Trt10.5-Mkl-Avx-Gcc11-SelfBuiltPypiUse/2b9f8b689bc8988f97a5ede056c8c81bfa0332c2/paddlepaddle_gpu-3.3.1.post20260420+2b9f8b689bc-cp310-cp310-linux_x86_64.whl --extra-index-url https://www.paddlepaddle.org.cn/packages/stable/cu129/
 fi

 pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
@@ -235,3 +238,10 @@ jobs:
 target_path_stripped="${target_path#paddle-github-action/}"
 WHEEL_PATH=https://paddle-github-action.bj.bcebos.com/${target_path_stripped}/${fd_wheel_name}
 echo "wheel_path_cu129=${WHEEL_PATH}" >> $GITHUB_OUTPUT
+
+- name: Terminate and delete the container
+if: always()
+run: |
+set +e
+docker exec -t ${{ runner.name }} /bin/bash -c 'find /workspace -mindepth 1 -delete'
+docker rm -f ${{ runner.name }}
14 changes: 12 additions & 2 deletions .github/workflows/_build_linux_cu130.yml
@@ -112,6 +112,7 @@ jobs:
 git config --global user.name "FastDeployCI"
 git config --global user.email "fastdeploy_ci@example.com"
 git log -n 3 --oneline
+
 - name: FastDeploy Build
 shell: bash
 env:
@@ -143,7 +144,8 @@ jobs:
 PARENT_DIR=$(dirname "$WORKSPACE")
 echo "PARENT_DIR:$PARENT_DIR"
 docker run --rm --net=host \
---cap-add=SYS_PTRACE --privileged --shm-size=64G \
+--cap-add=SYS_PTRACE --shm-size=64G \
+--name ${runner_name} \
 -v $(pwd):/workspace -w /workspace \
 -v "${CACHE_DIR}/gitconfig:/etc/gitconfig:ro" \
 -v "${CACHE_DIR}/.cache_cu130:/root/.cache" \
@@ -158,6 +160,7 @@ jobs:
 -e "PADDLE_WHL_URL=${PADDLE_WHL_URL}" \
 -e "BRANCH_REF=${BRANCH_REF}" \
 -e "CCACHE_MAXSIZE=50G" \
+-e "no_proxy=localhost,127.0.0.1,0.0.0.0,bcebos.com,.bcebos.com,bj.bcebos.com,su.bcebos.com,paddle-ci.gz.bcebos.com,apiin.im.baidu.com,baidu-int.com,.baidu.com,aliyun.com,gitee.com,pypi.tuna.tsinghua.edu.cn,.tuna.tsinghua.edu.cn" \
 --gpus "\"device=${gpu_id}\"" ${docker_image} /bin/bash -c '
 if [[ -n "${FD_VERSION}" ]]; then
 export FASTDEPLOY_VERSION=${FD_VERSION}
@@ -180,7 +183,7 @@ jobs:
 elif [[ "${PADDLEVERSION}" != "" ]];then
 python -m pip install paddlepaddle-gpu==${PADDLEVERSION} -i https://www.paddlepaddle.org.cn/packages/stable/cu130/
 else
-python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu130/
+python -m pip install https://paddle-qa.bj.bcebos.com/paddle-pipeline/Release-TagBuild-Training-Linux-Gpu-Cuda130-Cudnn913-Trt1013-Mkl-Avx-Gcc11-SelfBuiltPypiUse/2b9f8b689bc8988f97a5ede056c8c81bfa0332c2/paddlepaddle_gpu-3.3.1.post20260420+2b9f8b689bc-cp310-cp310-linux_x86_64.whl --extra-index-url https://www.paddlepaddle.org.cn/packages/stable/cu130/
 fi

 pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
@@ -235,3 +238,10 @@ jobs:
 target_path_stripped="${target_path#paddle-github-action/}"
 WHEEL_PATH=https://paddle-github-action.bj.bcebos.com/${target_path_stripped}/${fd_wheel_name}
 echo "wheel_path_cu130=${WHEEL_PATH}" >> $GITHUB_OUTPUT
+
+- name: Terminate and delete the container
+if: always()
+run: |
+set +e
+docker exec -t ${{ runner.name }} /bin/bash -c 'find /workspace -mindepth 1 -delete'
+docker rm -f ${{ runner.name }}
12 changes: 11 additions & 1 deletion .github/workflows/_build_linux_fd_router.yml
@@ -107,6 +107,7 @@ jobs:
 git config --global user.name "FastDeployCI"
 git config --global user.email "fastdeploy_ci@example.com"
 git log -n 3 --oneline
+
 - name: FastDeploy FD_ROUTER Build
 shell: bash
 env:
@@ -137,7 +138,8 @@ jobs:
 PARENT_DIR=$(dirname "$WORKSPACE")
 echo "PARENT_DIR:$PARENT_DIR"
 docker run --rm --net=host \
---cap-add=SYS_PTRACE --privileged --shm-size=64G \
+--cap-add=SYS_PTRACE --shm-size=64G \
+--name ${runner_name} \
 -v $(pwd):/workspace -w /workspace \
 -v "${CACHE_DIR}/gitconfig:/etc/gitconfig:ro" \
 -v "${CACHE_DIR}/.cache:/root/.cache" \
@@ -151,6 +153,7 @@ jobs:
 -e "PADDLE_WHL_URL=${PADDLE_WHL_URL}" \
 -e "BRANCH_REF=${BRANCH_REF}" \
 -e "CCACHE_MAXSIZE=50G" \
+-e "no_proxy=localhost,127.0.0.1,0.0.0.0,bcebos.com,.bcebos.com,bj.bcebos.com,su.bcebos.com,paddle-ci.gz.bcebos.com,apiin.im.baidu.com,baidu-int.com,.baidu.com,aliyun.com,gitee.com,pypi.tuna.tsinghua.edu.cn,.tuna.tsinghua.edu.cn" \
 --gpus "\"device=${gpu_id}\"" ${docker_image} /bin/bash -c '
 if [[ -n "${FD_VERSION}" ]]; then
 export FASTDEPLOY_VERSION=${FD_VERSION}
@@ -211,3 +214,10 @@ jobs:
 target_path_stripped="${target_path#paddle-github-action/}"
 FD_ROUTER_PATH=https://paddle-github-action.bj.bcebos.com/${target_path_stripped}/fd-router
 echo "fd_router_path=${FD_ROUTER_PATH}" >> $GITHUB_OUTPUT
+
+- name: Terminate and delete the container
+if: always()
+run: |
+set +e
+docker exec -t ${{ runner.name }} /bin/bash -c 'find /workspace -mindepth 1 -delete'
+docker rm -f ${{ runner.name }}