feat: add aggregation user detail stripping hook #224

Merged
yyiilluu merged 10 commits into main from codex/user-detail-stripping-foundation
Jun 26, 2026

Conversation

@yilu331 yilu331 commented Jun 24, 2026

Collaborator

Summary

  • Keeps the OSS aggregation hook inert by default, while supporting embedding apps that use typed stripping placeholders.
  • Replaces person-only placeholder cleanup with typed cleanup across storage/logging fallbacks.
  • Removes the dynamic bridge helpers and calls the configurator hooks directly now that BaseConfigurator owns the no-op defaults.

Changes

  • Adds count_stripping_placeholders() and replace_stripping_placeholders() with generic replacements for email, phone, and person placeholders.
  • Sanitizes typed placeholder residue in structured aggregation output, string/dict/list fallback logging, nested values, and exception messages.
  • Wires manual/API and inline aggregation paths through direct create_user_detail_stripper() and get_playbook_aggregation_prompt_extra_instructions() configurator calls.
  • Updates tests so instance-level configurator stubs prove the direct-call behavior.
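
As a rough illustration of the two helpers named above, here is a minimal sketch. The token shape (`[PERSON_1]`, `[EMAIL_2]`, `[PHONE_3]`) and the generic replacement strings are assumptions made for this example; the actual implementation in user_detail_stripping.py may differ.

```python
import re

# Assumed token shape: [PERSON_1], [EMAIL_2], [PHONE_3]. The real format may differ.
_PLACEHOLDER_RE = re.compile(r"\[(PERSON|EMAIL|PHONE)_\d+\]")

# Hypothetical generic replacements, one per placeholder type.
_GENERIC = {
    "PERSON": "the user",
    "EMAIL": "an email address",
    "PHONE": "a phone number",
}


def count_stripping_placeholders(text: str) -> int:
    """Return how many typed placeholder tokens remain in text."""
    return len(_PLACEHOLDER_RE.findall(text))


def replace_stripping_placeholders(text: str) -> str:
    """Swap each typed placeholder for generic, user-agnostic wording."""
    return _PLACEHOLDER_RE.sub(lambda m: _GENERIC[m.group(1)], text)


leaked = "Ask [PERSON_1] to confirm [EMAIL_1] before calling [PHONE_2]."
print(count_stripping_placeholders(leaked))
print(replace_stripping_placeholders(leaked))
```

Keeping the count separate from the replacement lets a caller record how many tokens leaked before the text is cleaned.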

Test Plan

  • uv run ruff check reflexio/lib/_generation.py reflexio/server/services/playbook/playbook_aggregator.py reflexio/server/services/playbook/playbook_generation_service.py reflexio/server/services/playbook/user_detail_stripping.py tests/lib/test_generation_unit.py tests/server/services/playbook/test_playbook_aggregator.py tests/server/services/playbook/test_playbook_generation_service.py
  • uv run pyright reflexio/server/services/playbook/user_detail_stripping.py reflexio/server/services/playbook/playbook_aggregator.py reflexio/lib/_generation.py reflexio/server/services/playbook/playbook_generation_service.py
  • uv run pytest tests/server/services/playbook/test_playbook_aggregator.py tests/lib/test_generation_unit.py tests/server/services/playbook/test_playbook_generation_service.py -q -o 'addopts=' (101 passed)
  • uv run pytest tests/lib/test_generation_unit.py tests/server/services/playbook/test_playbook_generation_service.py -q -o 'addopts=' (31 passed after final wrap cleanup)
  • uv run pytest tests/ --ignore=tests/e2e_tests/ --ignore=tests/benchmarks -q -o 'addopts=' (3615 passed, 70 skipped, 6 subtests passed)
  • uv run pytest tests/e2e_tests/ -q -o 'addopts=' (41 passed, 47 skipped)

coderabbitai Bot commented Jun 24, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

This PR adds configurable user-detail stripping to playbook aggregation. It defines stripping contracts and placeholder helpers, threads an optional stripper through generation and service entry points, sanitizes prompts, model outputs, and logs, and updates the playbook aggregation prompt and tests.

Changes

User Detail Stripping Pipeline

| Layer | File(s) | Summary |
| --- | --- | --- |
| Stripping contracts and config | reflexio/server/services/playbook/user_detail_stripping.py, reflexio/server/services/configurator/base_configurator.py, reflexio/server/services/playbook/README.md | Defines the stripping protocols, passthrough implementation, placeholder helper functions, and the base configurator accessors that return optional stripping and prompt-instruction values. |
| Aggregator sanitization and prompt updates | reflexio/server/services/playbook/playbook_aggregator.py, reflexio/server/prompt/prompt_bank/playbook_aggregation/v2.3.0.prompt.md, tests/server/services/playbook/test_playbook_aggregator.py | PlaybookAggregator accepts an optional stripper, strips prompt inputs, sanitizes structured and fallback outputs and logs, and records placeholder-leakage events; the prompt template accepts extra aggregation instructions. |
| Configured stripper wiring | reflexio/lib/_generation.py, reflexio/server/services/playbook/playbook_generation_service.py, tests/lib/test_generation_unit.py, tests/server/services/playbook/test_playbook_generation_service.py | GenerationMixin and PlaybookGenerationService create configured aggregation extras from the configurator and forward them into PlaybookAggregator, with tests covering the default and configured paths. |
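
The contract layer summarized here could look roughly like the sketch below. The class names UserDetailStripper and BaseConfigurator and the two accessor names come from this PR; the method signature, the return types, and the `PassthroughUserDetailStripper` and `EmbeddingAppConfigurator` classes are assumptions for illustration only.

```python
from __future__ import annotations

from typing import Protocol


class UserDetailStripper(Protocol):
    """Structural contract; the signature here is assumed, not confirmed."""

    def strip_user_details(self, text: str, shared_mapping: dict[str, str]) -> str: ...


class PassthroughUserDetailStripper:
    """Hypothetical passthrough: returns text unchanged, never touches the mapping."""

    def strip_user_details(self, text: str, shared_mapping: dict[str, str]) -> str:
        return text


class BaseConfigurator:
    """No-op defaults: no stripper and no extra prompt instructions."""

    def create_user_detail_stripper(self) -> UserDetailStripper | None:
        return None

    def get_playbook_aggregation_prompt_extra_instructions(self) -> str | None:
        return None


class EmbeddingAppConfigurator(BaseConfigurator):
    """Hypothetical embedding app that opts in by overriding one hook."""

    def create_user_detail_stripper(self) -> UserDetailStripper | None:
        return PassthroughUserDetailStripper()


default_stripper = BaseConfigurator().create_user_detail_stripper()
custom_stripper = EmbeddingAppConfigurator().create_user_detail_stripper()
print(default_stripper is None, custom_stripper is not None)
```

Because the base class owns the no-op defaults, callers can invoke the hooks directly and treat `None` as "feature off", which matches the "inert by default" goal stated in the summary.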

Sequence Diagram(s)

sequenceDiagram
  participant PlaybookGenerationService
  participant BaseConfigurator
  participant PlaybookAggregator
  participant UserDetailStripper
  participant llm_client
  participant record_usage_event

  PlaybookGenerationService->>BaseConfigurator: create_user_detail_stripper()
  BaseConfigurator-->>PlaybookGenerationService: UserDetailStripper | None
  PlaybookGenerationService->>PlaybookAggregator: __init__(..., user_detail_stripper)
  PlaybookAggregator->>UserDetailStripper: strip_user_details(prompt fields, shared_mapping)
  PlaybookAggregator->>llm_client: send sanitized aggregation prompt
  llm_client-->>PlaybookAggregator: aggregation output
  PlaybookAggregator->>record_usage_event: record placeholder leakage count

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

I hopped through prompts and sniffed out [PERSON_N] trails,
then tucked them softly into user-agnostic tales.
The aggregator now tiptoes, neat and clear,
while my little paws leave placeholders near.
🐇✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 27.85%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly reflects the main change: adding an optional user-detail stripping hook for aggregation. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@reflexio/server/services/playbook/playbook_aggregator.py`:
- Around line 1355-1359: The fallback logging path in playbook aggregation still
emits unsanitized raw responses when structured parsing fails, so placeholder
tokens can leak in logs. Update the response handling around the
PlaybookAggregationOutput branch in playbook_aggregator.py to apply the same
placeholder sanitization logic to string and dict responses before
log_model_response() is called, using the existing
_sanitize_aggregation_response helper and preserving the placeholder leak
recording behavior.
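
To make the requested fix concrete, here is a minimal sketch of sanitizing string, dict, and list responses before they reach a logger. The function `sanitize_for_logging`, the `[redacted]` replacement text, and the token shape are assumptions for this example; the PR's own helper is `_sanitize_aggregation_response`, whose real behavior is not shown on this page.

```python
import re
from typing import Any

PLACEHOLDER_RE = re.compile(r"\[(?:PERSON|EMAIL|PHONE)_\d+\]")  # assumed token shape


def sanitize_for_logging(value: Any) -> tuple[Any, int]:
    """Return a sanitized copy of value plus the number of placeholders replaced."""
    if isinstance(value, str):
        # re.subn returns (new_string, number_of_substitutions)
        return PLACEHOLDER_RE.subn("[redacted]", value)
    if isinstance(value, dict):
        out: dict[Any, Any] = {}
        total = 0
        for key, item in value.items():  # keys are left untouched in this sketch
            out[key], count = sanitize_for_logging(item)
            total += count
        return out, total
    if isinstance(value, list):
        cleaned = [sanitize_for_logging(item) for item in value]
        return [c for c, _ in cleaned], sum(n for _, n in cleaned)
    return value, 0  # numbers, None, and other types pass through unchanged


raw = {"text": "Call [PHONE_1]", "items": ["[PERSON_2] agreed", 7]}
safe, leak_count = sanitize_for_logging(raw)
print(safe, leak_count)
```

Returning the count alongside the cleaned value preserves the leak-recording behavior the comment asks for: the caller can log `safe` and report `leak_count` separately.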

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: b367dfb0-967c-4af7-aaa6-789bf9087585

📥 Commits

Reviewing files that changed from the base of the PR and between ba9a2b2 and a471320.

📒 Files selected for processing (8)
  • reflexio/lib/_generation.py
  • reflexio/models/config_schema.py
  • reflexio/server/services/configurator/base_configurator.py
  • reflexio/server/services/playbook/playbook_aggregator.py
  • reflexio/server/services/playbook/playbook_generation_service.py
  • reflexio/server/services/playbook/user_detail_stripping.py
  • tests/models/test_config_f1_fields.py
  • tests/server/services/playbook/test_playbook_aggregator.py

Comment thread reflexio/server/services/playbook/components/aggregator.py

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
reflexio/lib/_generation.py (1)

47-66: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Consider extracting the aggregator-stripping kwargs assembly into a shared helper.

This block (fetch user_detail_stripper + aggregation_prompt_extra_instructions from the configurator, then conditionally build aggregator_kwargs) is duplicated verbatim in PlaybookGenerationService._trigger_playbook_aggregation (Lines 511-525). If the inclusion contract changes (e.g. a third hook, or always-pass semantics), both sites must be kept in sync. A single helper in user_detail_stripping.py returning the kwargs dict would centralize it.

♻️ Sketch of a shared helper
# user_detail_stripping.py
from typing import Any


def build_aggregator_stripping_kwargs(configurator: object) -> dict[str, Any]:
    # Include a key only when the configurator provides a value for it.
    kwargs: dict[str, Any] = {}
    stripper = create_configured_user_detail_stripper(configurator)
    if stripper is not None:
        kwargs["user_detail_stripper"] = stripper
    extra = get_configured_playbook_aggregation_prompt_extra_instructions(configurator)
    if extra:  # skips both None and an empty string
        kwargs["aggregation_prompt_extra_instructions"] = extra
    return kwargs
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@reflexio/lib/_generation.py` around lines 47 - 66, The aggregator kwargs
assembly is duplicated between this generation path and
PlaybookGenerationService._trigger_playbook_aggregation, so centralize it in a
shared helper in user_detail_stripping.py. Add a helper that takes the
configurator, builds the conditional kwargs for user_detail_stripper and
aggregation_prompt_extra_instructions, and returns the dict, then replace the
local assembly here and in _trigger_playbook_aggregation with calls to that
helper to keep the inclusion contract in one place.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 96ba040e-0e1d-4baa-9a0a-3de1300060e5

📥 Commits

Reviewing files that changed from the base of the PR and between d5a41ef and eb9fcfc.

📒 Files selected for processing (9)
  • reflexio/lib/_generation.py
  • reflexio/server/prompt/prompt_bank/playbook_aggregation/v2.3.0.prompt.md
  • reflexio/server/services/configurator/base_configurator.py
  • reflexio/server/services/playbook/playbook_aggregator.py
  • reflexio/server/services/playbook/playbook_generation_service.py
  • reflexio/server/services/playbook/user_detail_stripping.py
  • tests/lib/test_generation_unit.py
  • tests/server/services/playbook/test_playbook_aggregator.py
  • tests/server/services/playbook/test_playbook_generation_service.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • reflexio/server/services/playbook/user_detail_stripping.py
  • reflexio/server/prompt/prompt_bank/playbook_aggregation/v2.3.0.prompt.md
  • reflexio/server/services/playbook/playbook_aggregator.py

yilu331 force-pushed the codex/user-detail-stripping-foundation branch from 44c0761 to 6e8f8f7 on June 25, 2026 08:07
Comment thread reflexio/server/services/playbook/user_detail_stripping.py Outdated
Comment thread reflexio/server/services/playbook/user_detail_stripping.py Outdated
@yyiilluu

Contributor

One more; not sure whether it applies to this change:
[P2] Env drift guard does not actually see the new env reads
The new env names are documented in .env.template, which is good. However, the code reads them through constants in configurator.py, while the drift guard in test_env_file_validation.py only regex-matches literal names such as env_str("NAME", ...). Future edits could therefore rename or remove template entries without the guard catching it. Either call env_str("REFLEXIO_USER_DETAIL_STRIPPING_ENABLED", "true") directly, or extend the guard to resolve simple constants.
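
A small sketch of the gap and of the second option (resolving simple constants) follows. It is not the project's actual guard: the constant name `STRIPPING_ENABLED_ENV`, the variable `REFLEXIO_LOG_LEVEL`, and both function names are hypothetical, and the sample source is only parsed, never executed.

```python
import ast
import re

# Sample source to scan; the names inside it are illustrative only.
SOURCE = '''
STRIPPING_ENABLED_ENV = "REFLEXIO_USER_DETAIL_STRIPPING_ENABLED"

a = env_str("REFLEXIO_LOG_LEVEL", "info")
b = env_str(STRIPPING_ENABLED_ENV, "true")
'''


def literal_env_names(source: str) -> set[str]:
    """Regex guard: only sees names written as string literals in the call."""
    return set(re.findall(r'env_str\(\s*"([A-Z0-9_]+)"', source))


def resolved_env_names(source: str) -> set[str]:
    """AST guard: also resolves simple module-level NAME = "literal" constants."""
    tree = ast.parse(source)
    constants: dict[str, str] = {}
    for node in tree.body:
        if isinstance(node, ast.Assign) and len(node.targets) == 1:
            target, value = node.targets[0], node.value
            if (
                isinstance(target, ast.Name)
                and isinstance(value, ast.Constant)
                and isinstance(value.value, str)
            ):
                constants[target.id] = value.value
    names: set[str] = set()
    for node in ast.walk(tree):
        is_env_call = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "env_str"
            and node.args
        )
        if not is_env_call:
            continue
        first = node.args[0]
        if isinstance(first, ast.Constant) and isinstance(first.value, str):
            names.add(first.value)
        elif isinstance(first, ast.Name) and first.id in constants:
            names.add(constants[first.id])
    return names


print(sorted(literal_env_names(SOURCE)))
print(sorted(resolved_env_names(SOURCE)))
```

The regex version reports only the literal name, while the AST version also picks up the name hidden behind the constant. Constants imported from another module would still be missed by this sketch.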

yilu331 force-pushed the codex/user-detail-stripping-foundation branch 2 times, most recently from f0f2802 to b0a1065 on June 26, 2026 04:47
Comment thread reflexio/server/services/playbook/playbook_aggregator.py Outdated
Comment thread reflexio/server/services/playbook/playbook_aggregator.py Outdated
Comment thread reflexio/server/services/playbook/playbook_aggregator.py Outdated
yilu331 force-pushed the codex/user-detail-stripping-foundation branch from c6cc155 to cabb9a3 on June 26, 2026 07:59
yilu331 force-pushed the codex/user-detail-stripping-foundation branch from cabb9a3 to dc8e1c0 on June 26, 2026 08:19
@yyiilluu yyiilluu merged commit 6b6b61b into main Jun 26, 2026
1 check passed
3 participants