[KVCache] Add e2e v1 cache manager tests for prefix caching and swap mechanism verification by kevincheng2 · Pull Request #7814 · PaddlePaddle/FastDeploy

kevincheng2 · 2026-05-14T05:42:48Z

Motivation

新增 V1 CacheManager 端到端测试，验证前缀缓存（Prefix Caching）、GPU/CPU 交换机制（SWAP2CPU/SWAP2GPU）、LRU 淘汰策略及缓存监控指标在实际推理服务中的运行效果。当前 CacheManager 缺少端到端测试覆盖，无法自动化验证缓存功能正确性，本 PR 填补这一空白。

Modifications

新增 tests/ci_use/Prefix_Caching_Swap/test_v1_cache_manager.py（630 行），包含 6 个端到端测试用例：
- test_basic_prefix_cache_functionality: 冷启动验证 cached_tokens=0，重复请求及多轮对话共享前缀验证缓存命中
- test_lru_eviction_policy: 填满缓存后按 LRU 顺序重新访问，验证淘汰策略正确性

Usage or Command

# 启动 FastDeploy 推理服务（启用 V1 CacheManager）
ENABLE_V1_KVCACHE_MANAGER=1 python -m fastdeploy.entrypoints.openai.api_server \
    --model <model_path> \
    --port 12211 \
    --max-model-len 128 \
    --num-gpu-blocks-override 4 \
    --swap-space 10 \
    --enable-prefix-caching \
    --metrics-port 9090 \
    --cache-queue-port 8081 \
    --engine-worker-queue-port 8082

# 运行 e2e 测试
pytest tests/ci_use/Prefix_Caching_Swap/test_v1_cache_manager.py -v

Accuracy Tests

N/A（本 PR 仅新增测试用例，不修改模型前向或 kernel 代码）

Checklist

Add at least a tag in the PR title.
- Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
- You can add new tags based on the PR content, but the semantics must be clear.
Format your code, run pre-commit before commit.
Add unit tests. Please write the reason in this PR if no unit tests.
Provide accuracy results.
If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

## Motivation 新增 CacheManager 端到端测试，验证前缀缓存（Prefix Caching）在 FastDeploy 推理服务中的实际运行效果。测试覆盖冷启动、重复请求命中、共享前缀命中、metrics 端点及非流式请求等场景，确保缓存功能在生产环境中正常运行。 ## Modifications - 新增 tests/e2e/test_cache_manager.py，包含 5 个端到端测试用例 - test_cache_cold_start: 冷启动请求验证 cached_tokens=0 - test_cache_hit_on_repeat: 重复请求验证缓存命中（cached_tokens > 0） - test_cache_shared_prefix: 多轮对话共享 system prompt 前缀验证缓存命中 - test_cache_metrics_endpoint: /metrics 端点包含缓存相关指标 - test_cache_non_stream: 非流式请求缓存命中验证 ## Usage or Command ```bash # 启动服务（需要 MODEL_PATH 环境变量） export MODEL_PATH=/path/to/models python -m pytest tests/e2e/test_cache_manager.py -vv -s # 关键启动参数 python -m fastdeploy.entrypoints.openai.api_server \ --model <model_path> \ --enable-prefix-caching \ --swap-space 20 \ --max-model-len 32768 \ --max-num-seqs 128 ```

paddle-bot · 2026-05-14T05:42:55Z

Thanks for your contribution!

PaddlePaddle-bot · 2026-05-14T05:53:13Z

🤖 Paddle-CI-Agent | ci_status_monitor | 2026-05-14 20:05:22

CI报告基于以下代码生成（30分钟更新一次）:

PR commit: b126011
Merge base: cb2d7c0 (branch: develop)
查看完整 Diff
CI 详情

1 任务总览

所有 Required 任务均已通过 ✅，PR 可正常合并。有 4 个 Optional 任务仍在等待中（不影响合并）。

总执行（rerun次数）	总任务	✅ 通过	❌ 失败	⏳ 运行中	⏸️ 等待中	跳过
13（0）	13	9	0	0	4	0

2 任务状态汇总

2.1 Required任务：2/2 通过

必选任务阻塞合并，失败需优先处理。

状态	任务	耗时	根因	修复建议	日志	重跑
✅	其余 2 个必选任务全部通过	-	-	-	-	-

2.2 可选任务 — 7/11 通过

可选任务不阻塞合并，失败仅供参考。

状态	任务	耗时	日志	重跑
⏸️	`FD-Clone-Linux / code-clone`	-	-	-
⏸️	`FD-Clone-Linux-ILUVATAR / code-clone`	-	-	-
⏸️	`FD-Clone-Linux-XPU / code-clone`	-	-	-
⏸️	`CI_HPU`	-	-	-
✅	其余 7 个可选任务通过	-	-	-

3 失败详情（仅 required）

无 required 失败任务。

codecov-commenter · 2026-05-14T06:24:38Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@cb2d7c0). Learn more about missing BASE report.

Additional details and impacted files

@@            Coverage Diff             @@
##             develop    #7814   +/-   ##
==========================================
  Coverage           ?   63.52%           
==========================================
  Files              ?      461           
  Lines              ?    64580           
  Branches           ?     9897           
==========================================
  Hits               ?    41026           
  Misses             ?    20728           
  Partials           ?     2826

Flag	Coverage Δ
GPU	`73.21% <ø> (?)`
XPU	`7.13% <ø> (?)`

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

PaddlePaddle-bot

🤖 Paddle-CI-Agent | pr_review | 2026-05-14 20:18:04

📋 Review 摘要

PR 概述：新增 V1 CacheManager 端到端测试，覆盖前缀缓存、SWAP 机制、LRU 淘汰及缓存监控指标验证
变更范围：tests/ci_use/Prefix_Caching_Swap/
影响面 Tag：[KVCache] [CI]

📝 PR 规范检查

PR 标题格式合规（[KVCache] 为官方 Tag），描述包含 Motivation / Modifications / Usage or Command / Accuracy Tests / Checklist 所有必填 section，内容充实，规范通过 ✓

问题

级别	文件	概述
🟡 建议	`test_v1_cache_manager.py:335`	`test_lru_eviction_policy` 仅断言 status==200，未验证 LRU 淘汰顺序，测试可信度低
❓ 疑问	`test_v1_cache_manager.py:47`	`make_usage_payload` 等三个 payload 方法均缺少 `model` 字段，可能导致全部用例 422 失败

⚠️ 覆盖说明：PR 描述提到文件共 630 行 / 6 个测试用例，但本次 diff 仅包含 335 行（test_basic_prefix_cache_functionality 和 test_lru_eviction_policy），其余 4 个测试用例（swap、metrics、robustness 等）未在 diff 中出现，本次 Review 无法覆盖。

总体评价

整体测试框架设计合理，fixture 生命周期管理、流式响应解析逻辑清晰。主要问题是 LRU 测试断言太弱，建议补强校验逻辑；同时请确认 payload 中 model 字段的省略是否被服务端接受，避免 CI 全量失败。

kevincheng2 temporarily deployed to Metax_ci May 14, 2026 05:42 — with GitHub Actions Inactive

This comment was marked as outdated.

Sign in to view

update ci case

0267cdb

kevincheng2 temporarily deployed to Metax_ci May 14, 2026 11:45 — with GitHub Actions Inactive

kevincheng2 changed the title ~~[Tests] add e2e cache manager tests for prefix caching verification~~ [KVCache] Add e2e v1 cache manager tests for prefix caching and swap mechanism verification May 14, 2026

update ci case

b126011

kevincheng2 temporarily deployed to Metax_ci May 14, 2026 11:59 — with GitHub Actions Inactive

This comment was marked as outdated.

Sign in to view

PaddlePaddle-bot reviewed May 14, 2026

View reviewed changes

Comment thread tests/ci_use/Prefix_Caching_Swap/test_v1_cache_manager.py

Comment thread tests/ci_use/Prefix_Caching_Swap/test_v1_cache_manager.py

EmmonsCurse approved these changes May 15, 2026

View reviewed changes

EmmonsCurse merged commit 45fb3d1 into PaddlePaddle:develop May 15, 2026
40 of 43 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[KVCache] Add e2e v1 cache manager tests for prefix caching and swap mechanism verification#7814

[KVCache] Add e2e v1 cache manager tests for prefix caching and swap mechanism verification#7814
EmmonsCurse merged 3 commits into
PaddlePaddle:developfrom
kevincheng2:feature/e2e-cache-manager-tests

kevincheng2 commented May 14, 2026 •

edited

Loading

Uh oh!

paddle-bot Bot commented May 14, 2026

Uh oh!

PaddlePaddle-bot commented May 14, 2026 •

edited

Loading

Uh oh!

This comment was marked as outdated.

Uh oh!

codecov-commenter commented May 14, 2026 •

edited

Loading

Uh oh!

This comment was marked as outdated.

Uh oh!

PaddlePaddle-bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

kevincheng2 commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation

Modifications

Usage or Command

Accuracy Tests

Checklist

Uh oh!

paddle-bot Bot commented May 14, 2026

Uh oh!

PaddlePaddle-bot commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

1 任务总览

2 任务状态汇总

2.1 Required任务：2/2 通过

2.2 可选任务 — 7/11 通过

3 失败详情（仅 required）

Uh oh!

This comment was marked as outdated.

Uh oh!

codecov-commenter commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

This comment was marked as outdated.

Uh oh!

PaddlePaddle-bot left a comment

Choose a reason for hiding this comment

📋 Review 摘要

📝 PR 规范检查

问题

总体评价

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

kevincheng2 commented May 14, 2026 •

edited

Loading

PaddlePaddle-bot commented May 14, 2026 •

edited

Loading

codecov-commenter commented May 14, 2026 •

edited

Loading