From 320a30af7b0bcc993dde35555553b669d957ebff Mon Sep 17 00:00:00 2001
From: Kane Williams
Date: Tue, 5 May 2026 15:31:56 +1200
Subject: [PATCH 1/3] docs(skill): Harden ruleset-builder against FK-cascade and mask-type errors

Targets two observed agent failure modes in the ruleset-builder skill:
- FK cascade misunderstanding (skipping IDs, inventing source_table-style params)
- Mask-type hallucination (from_random_words, hash_text, etc.)

Adds references/fk-cascade.md documenting that the engine auto-replicates
the parent PK rule onto every FK column at runtime when the mask is one of
the cascading masks (imitate_unique, imitate_uuid, imitate_nz_ird).
Reinforces the existing Mask Types Quick Reference as a closed list and
adds a kill-table of common hallucinations.

Fixes missing --type flag on dm rulesets validate.
---
 .../.claude-plugin/plugin.json                |   2 +-
 .../skills/ruleset-builder/SKILL.md           |  41 +++----
 .../ruleset-builder/references/fk-cascade.md  | 109 ++++++++++++++++++
 .../references/ruleset-yaml-reference.md      |  24 ++++
 4 files changed, 155 insertions(+), 21 deletions(-)
 create mode 100644 claude-skills/ruleset-builder/skills/ruleset-builder/references/fk-cascade.md

diff --git a/claude-skills/ruleset-builder/.claude-plugin/plugin.json b/claude-skills/ruleset-builder/.claude-plugin/plugin.json
index 66d3086..0ed5a57 100644
--- a/claude-skills/ruleset-builder/.claude-plugin/plugin.json
+++ b/claude-skills/ruleset-builder/.claude-plugin/plugin.json
@@ -1,6 +1,6 @@
 {
   "name": "ruleset-builder",
-  "version": "1.0.0",
+  "version": "1.1.0",
   "description": "Convert auto-generated DataMasque rulesets into production-ready form.
Validate and iterate.", "author": { "name": "DataMasque Ltd" }, "repository": "https://github.com/datamasque/datamasque-cli", diff --git a/claude-skills/ruleset-builder/skills/ruleset-builder/SKILL.md b/claude-skills/ruleset-builder/skills/ruleset-builder/SKILL.md index da58556..857c2ef 100644 --- a/claude-skills/ruleset-builder/skills/ruleset-builder/SKILL.md +++ b/claude-skills/ruleset-builder/skills/ruleset-builder/SKILL.md @@ -12,21 +12,26 @@ Transform auto-generated DataMasque rulesets into production-ready rulesets with 2. **`hash_columns`** — on every applicable `mask_table` task for deterministic consistency 3. **Clean structure** — `skip_defaults`, no doc blocks, validated -**4-step process. Complete all 4 steps. Report after each step before proceeding.** +FK cascade is automatic: mask the parent PK with `imitate_unique` (or `imitate_uuid` / `imitate_nz_ird`) and the engine replicates the rule onto every FK column referencing it. **Do NOT add explicit rules for FK columns.** Avoid `from_unique_imitate` and `mask_unique_key` (both deprecated). Never skip IDs. -Use `TaskCreate` for all 4 steps before starting. The prompt must include business domain and application type — ask if missing. +5-step process (0–4). Use `TaskCreate` to track all 5; report after each step before proceeding. The prompt must include business domain and application type — ask if missing. --- -## Step 0: Report version -Report: **Version 1.5** +## Step 0: Report versions + +Report the Ruleset Builder version (from `plugin.json`) and `dm version` so the operator can correlate output with releases. 
--- ## Step 1: Read reference docs -Read all three before any other work: +Canonical mask reference: + + +Read all of these before any other work: ``` +${CLAUDE_PLUGIN_ROOT}/skills/ruleset-builder/references/fk-cascade.md ${CLAUDE_PLUGIN_ROOT}/skills/ruleset-builder/references/mask-definitions-guide.md ${CLAUDE_PLUGIN_ROOT}/skills/ruleset-builder/references/hash-columns-guide.md ${CLAUDE_PLUGIN_ROOT}/skills/ruleset-builder/references/ruleset-yaml-reference.md @@ -52,11 +57,11 @@ masks: ### Classification rules (apply in order) **1. ID columns** — any column ending in `_ID`, `_NO`, `_NR`, `_NBR` is an entity identifier. -- Strip adjective/verb prefixes before the noun: `PREVIOUS_`, `OLD_`, `TRANSFERRED_`, `PRIOR_`, `CURR_`, `NEW_`, `NEXT_`, `ALT_`, `PARENT_`, `CHILD_`, `SOURCE_`, `TARGET_`, `ORIG_`, `PENDING_`, `ARCHIVED_`, `DELETED_` -- Extract the core entity: `PREVIOUS_INVOICE_ID` → `invoice`, `TRANSFERRED_ACCOUNT_ID` → `account`, `INVOICE_ACCOUNT_ID` → `invoice_account` (compound kept — no prefix stripped) -- Group all derivatives to one rule: `$ref: "Global/RuleLib#masks/{entity}_id"` -- Library entry: `type: imitate_unique`, `seed: "{entity}"` — **seed is required** -- This overrides whatever mask was originally generated (even `imitate_unique`, `from_random_number`, etc.) +- **FK side: drop the rule entirely.** If an ID column is a foreign key (the table's `Foreign Keys` metadata in the discovery CSV has an entry for it), do NOT emit a rule for it. The engine cascades automatically from the parent PK rule. See `fk-cascade.md`. +- **PK side: use `imitate_unique` with `seed:`.** Strip adjective/verb prefixes before the noun: `PREVIOUS_`, `OLD_`, `TRANSFERRED_`, `PRIOR_`, `CURR_`, `NEW_`, `NEXT_`, `ALT_`, `PARENT_`, `CHILD_`, `SOURCE_`, `TARGET_`, `ORIG_`, `PENDING_`, `ARCHIVED_`, `DELETED_`. Extract the core entity (`PREVIOUS_INVOICE_ID` → `invoice`). +- Library entry name: `{entity}_id`. Reference it as `$ref: "Global/RuleLib#masks/{entity}_id"`. 
+- Library entry body: `type: imitate_unique`, `seed: "{entity}"`. The `seed` is optional but recommended: it namespaces by entity so unrelated IDs don't collide (e.g. `customer.id=42` doesn't mask to the same value as `product.id=42`). Doesn't affect FK cascade. +- This overrides whatever mask was originally generated (even `from_random_number`). **2. Named patterns** — detect by mask structure: @@ -84,18 +89,16 @@ masks: **3. Remaining** — group by column name concept. Where column names share a root (e.g., `RESULT3_VALUE`, `RESULT5_VALUE` → `result_value`; `GENERAL_2`, `GENERAL_6` → `general`), use one shared rule. Strip adjective prefixes. Use first occurrence's parameters. -- `imitate_unique` (non-ID cols) → `{col_group}: type: imitate_unique, seed: "{col_group}"` — **seed is required** +- `imitate_unique` (non-ID cols) → `{col_group}: type: imitate_unique, seed: "{col_group}"` (seed recommended for namespacing; see ID columns section). - `from_random_date` → `{col_group}: type: from_random_date, min/max from first occurrence` - `from_random_number` → `{col_group}: type: from_random_number, min/max from first occurrence` -- `imitate` (non-phone) → `{col_group}: type: imitate` +- String catch-all → `{col_group}: type: imitate_unique, seed: "{col_group}"` (use `imitate` only for types `imitate_unique` can't handle, e.g. datetime, bool). - Complex chains → keep structure, group by column name -> **Critical rule:** Every `imitate_unique` entry in `ruleset_library.yaml` MUST have a `seed` value. -> - Entity ID rules: `seed: "{entity_name}"` (e.g., `account_id` → `seed: "account"`) -> - All other `imitate_unique` rules: `seed: "{rule_name}"` (e.g., `field_name` → `seed: "field_name"`) - ### Output format +`Global/RuleLib` below is a placeholder for `/` — substitute the operator's real values, and create the library with `dm libraries create` before running the ruleset. 
+ ```yaml version: '1.0' skip_defaults: @@ -154,10 +157,8 @@ Build a lookup of `(schema, table)` → columns with constraint and FK metadata: Remove any comment lines containing `ROWID`. -Run: -```bash -dm rulesets validate --file -``` +Run `dm rulesets validate --file --type database` +(use `file` for file-masking rulesets). Fix any errors and re-validate until passing. diff --git a/claude-skills/ruleset-builder/skills/ruleset-builder/references/fk-cascade.md b/claude-skills/ruleset-builder/skills/ruleset-builder/references/fk-cascade.md new file mode 100644 index 0000000..74a0e69 --- /dev/null +++ b/claude-skills/ruleset-builder/skills/ruleset-builder/references/fk-cascade.md @@ -0,0 +1,109 @@ +# FK Cascade Invariant + +The most important rule when refining a DataMasque ruleset that spans +related tables. Get this wrong and you either leak identity (by skipping +IDs entirely) or break the engine (by adding rules for FK columns). + +## The rule + +**Mask only the parent PK column. The engine cascades the same masked value +to every FK column referencing it.** + +Three masks support this cascade: + +- `imitate_unique` — recommended for new work. +- `imitate_uuid` — for UUID-shaped IDs. +- `imitate_nz_ird` — for NZ IRD numbers. + +(`from_unique_imitate` and `mask_unique_key` are deprecated; do not emit.) + +When `mask_table` runs and a rule on a referenced column uses one of these +masks, the engine: + +1. Discovers child tables with FKs referencing this column. +2. Auto-replicates the parent's rule onto every FK column. +3. Same mask config → same masked output → joins survive. + +This is documented at +: + +> "You can apply an `imitate_unique` mask to a primary key column or a +> column that is used as a foreign key in another table. References will be +> updated automatically. Composite primary keys are supported." 
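The three engine steps above can be illustrated with a toy model. This is only a sketch of the idea (same mask config, same masked output, so joins survive); `toy_mask` is a hypothetical stand-in built on SHA-256, not DataMasque's `imitate_unique` algorithm, and it makes no uniqueness guarantee.

```python
import hashlib


def toy_mask(value: int, seed: str) -> str:
    """Deterministic stand-in for a cascading mask (NOT DataMasque's algorithm)."""
    digest = hashlib.sha256(f"{seed}:{value}".encode()).hexdigest()
    return digest[:12]


def parent_rule(value: int) -> str:
    """The single rule written for the parent PK, seeded by entity name."""
    return toy_mask(value, seed="customer")


customers = [{"id": 1}, {"id": 2}]
orders = [{"id": 10, "customer_id": 2}, {"id": 11, "customer_id": 1}]

# The rule is applied to the parent PK column ...
masked_customers = [{"id": parent_rule(c["id"])} for c in customers]
# ... and replicated unchanged onto the FK column, as the engine is described to do.
masked_orders = [{**o, "customer_id": parent_rule(o["customer_id"])} for o in orders]

# Every masked FK value still points at a masked parent row.
masked_ids = {c["id"] for c in masked_customers}
joins_survive = all(o["customer_id"] in masked_ids for o in masked_orders)
```

In the real engine the replication happens at runtime, which is why the ruleset itself carries no rule for the FK column.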
+ +## Worked example + +Schema: +- `customers.id` (PK), `customers.email` +- `orders.id` (PK), `orders.customer_id` (FK → `customers.id`), `orders.tracking_number` + +Correct ruleset: + +```yaml +- type: mask_table + table: customers + key: id + rules: + - column: id + masks: + - type: imitate_unique + seed: customer + - column: email + masks: + - type: from_file + seed_file: DataMasque_emails.csv + seed_column: email + +- type: mask_table + table: orders + key: id + rules: + # customer_id is intentionally absent — the engine replicates the + # `customers.id` rule onto it automatically. Adding it here would + # be rejected by the runtime FK check. + - column: tracking_number + masks: + - type: imitate_unique + seed: tracking +``` + +After the run, `orders.customer_id` holds the same masked values as +`customers.id`, joins remain intact, and `tracking_number` is independently +masked with its own seed. + +## Anti-patterns to refuse + +- **Adding explicit FK rules** ("I'll mask both PK and FK with shared + `$ref` so the cascade works"). The runtime rejects this by default with + the error: + *"To preserve referential integrity, the following foreign key columns + cannot be directly masked by this task."* + The engine will replicate the rule for you; adding your own conflicts. +- **Skipping IDs to "preserve FK joins"**. Leaves identifiers in plain + sight. Mask the parent PK with `imitate_unique` — joins survive via + the auto-cascade. +- **Inventing linking parameters** (`source_table`, `source_column`, + `parent_column`, `link_to`). None of these exist on any DataMasque mask. +- **Inventing a hashing mask** (`hash_text`, `hash`, `link`, `match_id`). + None of these exist. `imitate_unique` is the deterministic mask. +- **Using `from_unique_imitate` or `mask_unique_key`**. Both deprecated. + `imitate_unique` replaces both. + +## Cross-run consistency requires `run_secret` + +Within a single run, `imitate_unique` is deterministic via a per-run +`insecure_seed`. 
Across runs, the cascade only holds if the run is +invoked with a `run_secret`. Without it, the same input maps to a +different masked value next run. If cross-run consistency matters, flag +this in the final summary. + +## Self-check before finishing + +For each FK relationship in the schema: + +1. Is the parent PK masked with `imitate_unique`, `imitate_uuid`, or + `imitate_nz_ird`? +2. Is the FK column **absent** from your output (no explicit rule)? +3. Are `from_unique_imitate` and `mask_unique_key` absent from your output? + +If any answer is "no", fix it before validation. diff --git a/claude-skills/ruleset-builder/skills/ruleset-builder/references/ruleset-yaml-reference.md b/claude-skills/ruleset-builder/skills/ruleset-builder/references/ruleset-yaml-reference.md index 6a610ca..c7f20af 100644 --- a/claude-skills/ruleset-builder/skills/ruleset-builder/references/ruleset-yaml-reference.md +++ b/claude-skills/ruleset-builder/skills/ruleset-builder/references/ruleset-yaml-reference.md @@ -72,6 +72,16 @@ For PostgreSQL/MySQL, plain names work: `table: users`, `key: id`. ## Mask Types Quick Reference +This is the **closed list** of every `type:` value DataMasque accepts. Do +not invent mask types or parameters (no `source_table`, no `link_to`, no +`parent_column` — none exist). For per-mask parameter details, see the +canonical source: +. + +For a deterministic hash, use `imitate_unique` (or `imitate_uuid` for UUIDs) +optionally with `seed:` to namespace. The cascade is automatic; no +cross-table reference parameter exists. See `fk-cascade.md`. + ### Generic - `from_fixed` — fixed replacement value - `from_column` — copy from another column @@ -124,6 +134,20 @@ For PostgreSQL/MySQL, plain names work: `table: users`, `key: id`. 
### Document - `json` — mask JSON fields within a column - `xml` — mask XML elements within a column +- `unstructured_text` — mask entities inside free text + +### Commonly-hallucinated names that do NOT exist + +These plausible-sounding names are not in DataMasque. Refuse to emit them: + +| Hallucinated name | What was wanted | Use instead | +|-------------------------------------------------------------|-----------------------------------|---------------------------------------------------------------| +| `hash_text`, `hash` | deterministic hash of a value | `imitate_unique` (or `imitate_uuid` for UUIDs) | +| `link`, `match_id`, `link_to` | join two columns after masking | shared `imitate_unique` config on both sides | +| `from_random_words` | random words / short text | `from_random_text` (random chars) or `from_file` | +| `from_random_string` | random string | `from_random_text` | +| `redact`, `mask_value` | constant placeholder | `from_fixed` with `value:` | +| `source_table`, `source_column`, `parent_column`, `link_to` | param to point a FK at its parent | does not exist — cascade is automatic with shared mask config | ## skip_defaults From bb72774d41ce4e855fa42c777fdebbd4091445d1 Mon Sep 17 00:00:00 2001 From: Kane Williams Date: Thu, 7 May 2026 08:27:55 +1200 Subject: [PATCH 2/3] docs(skill): renumber steps 1-5 and align tables in SKILL.md --- .../skills/ruleset-builder/SKILL.md | 72 +++++++++---------- 1 file changed, 36 insertions(+), 36 deletions(-) diff --git a/claude-skills/ruleset-builder/skills/ruleset-builder/SKILL.md b/claude-skills/ruleset-builder/skills/ruleset-builder/SKILL.md index 857c2ef..7f56bce 100644 --- a/claude-skills/ruleset-builder/skills/ruleset-builder/SKILL.md +++ b/claude-skills/ruleset-builder/skills/ruleset-builder/SKILL.md @@ -14,17 +14,17 @@ Transform auto-generated DataMasque rulesets into production-ready rulesets with FK cascade is automatic: mask the parent PK with `imitate_unique` (or `imitate_uuid` / `imitate_nz_ird`) and 
the engine replicates the rule onto every FK column referencing it. **Do NOT add explicit rules for FK columns.** Avoid `from_unique_imitate` and `mask_unique_key` (both deprecated). Never skip IDs. -5-step process (0–4). Use `TaskCreate` to track all 5; report after each step before proceeding. The prompt must include business domain and application type — ask if missing. +5-step process (1–5). Use `TaskCreate` to track all 5; report after each step before proceeding. The prompt must include business domain and application type — ask if missing. --- -## Step 0: Report versions +## Step 1: Report versions Report the Ruleset Builder version (from `plugin.json`) and `dm version` so the operator can correlate output with releases. --- -## Step 1: Read reference docs +## Step 2: Read reference docs Canonical mask reference: @@ -39,7 +39,7 @@ ${CLAUDE_PLUGIN_ROOT}/skills/ruleset-builder/references/ruleset-yaml-reference.m --- -## Step 2: Extract ruleset_library +## Step 3: Extract ruleset_library Write a Python script using `ruamel.yaml` (`uv pip install ruamel.yaml`). @@ -65,27 +65,27 @@ masks: **2. 
Named patterns** — detect by mask structure: -| Pattern | Detection | Library rule | -|---------|-----------|--------------| -| Email | `chain(concat(concat(firstName+lastName, glue='.')+email_suffix)+transform_case(lower))` | `email_address` | -| Full name | `chain(concat(firstName+lastName, glue=' ')+take_substring)` OR plain `concat(firstName+lastName, glue=' ')` — column not containing USERNAME/LOGIN | `full_name` | -| Username | Same mask as full_name but column name contains USERNAME, USER_NAME, LOGIN, LOGON | `username` | -| First name only | `from_file` with firstNames seed | `name_first` | -| Last name only | `from_file` with lastNames seed | `name_last` | -| DOB | Column name contains DOB/BIRTH/DATE_OF_BIRTH — use `retain_age` regardless of original type | `dob` | -| Company | `chain(from_file(companies)+take_substring)` | `company_name` | -| Country name | `from_file(country_codes, seed_column=name)` | `country_name` | -| Country alpha-2 | `from_file(country_codes, seed_column=alpha_2)` | `country_code_2` | -| Country alpha-3 | `from_file(country_codes, seed_column=alpha_3)` | `country_code_3` | -| Phone/fax | `imitate` on column name containing PHONE, TEL, FAX, MOBILE, CELL | `phone` | -| Address line 1 | `from_file(addresses, seed_column=street_address)` on LINE_1/ADDRESS_LINE_1 columns | `address_line1` | -| Address line N | Same for LINE_2, LINE_3 etc. 
| `address_lineN` | -| Address full | `from_file(addresses, seed_column=street_address)` on non-line-numbered columns | `address_full` | -| Address expr | `concat(address+city+state+postcode, glue=', ')` | `network_address_expr` | -| City | `from_file(addresses, seed_column=city)` | `city` | -| Postcode | `from_file(addresses, seed_column=postcode)` | `post_code` | -| Suburb | `from_file(addresses, seed_column=suburb)` | `suburb` | -| Occupation | `from_file(occupations)` | `occupation` | +| Pattern | Detection | Library rule | +|-----------------|-----------------------------------------------------------------------------------------------------------------------------------------------------|------------------------| +| Email | `chain(concat(concat(firstName+lastName, glue='.')+email_suffix)+transform_case(lower))` | `email_address` | +| Full name | `chain(concat(firstName+lastName, glue=' ')+take_substring)` OR plain `concat(firstName+lastName, glue=' ')` — column not containing USERNAME/LOGIN | `full_name` | +| Username | Same mask as full_name but column name contains USERNAME, USER_NAME, LOGIN, LOGON | `username` | +| First name only | `from_file` with firstNames seed | `name_first` | +| Last name only | `from_file` with lastNames seed | `name_last` | +| DOB | Column name contains DOB/BIRTH/DATE_OF_BIRTH — use `retain_age` regardless of original type | `dob` | +| Company | `chain(from_file(companies)+take_substring)` | `company_name` | +| Country name | `from_file(country_codes, seed_column=name)` | `country_name` | +| Country alpha-2 | `from_file(country_codes, seed_column=alpha_2)` | `country_code_2` | +| Country alpha-3 | `from_file(country_codes, seed_column=alpha_3)` | `country_code_3` | +| Phone/fax | `imitate` on column name containing PHONE, TEL, FAX, MOBILE, CELL | `phone` | +| Address line 1 | `from_file(addresses, seed_column=street_address)` on LINE_1/ADDRESS_LINE_1 columns | `address_line1` | +| Address line N | Same for LINE_2, LINE_3 etc. 
| `address_lineN` | +| Address full | `from_file(addresses, seed_column=street_address)` on non-line-numbered columns | `address_full` | +| Address expr | `concat(address+city+state+postcode, glue=', ')` | `network_address_expr` | +| City | `from_file(addresses, seed_column=city)` | `city` | +| Postcode | `from_file(addresses, seed_column=postcode)` | `post_code` | +| Suburb | `from_file(addresses, seed_column=suburb)` | `suburb` | +| Occupation | `from_file(occupations)` | `occupation` | **3. Remaining** — group by column name concept. Where column names share a root (e.g., `RESULT3_VALUE`, `RESULT5_VALUE` → `result_value`; `GENERAL_2`, `GENERAL_6` → `general`), use one shared rule. Strip adjective prefixes. Use first occurrence's parameters. @@ -119,11 +119,11 @@ tasks: Do NOT write a custom YAML serializer. Use `ruamel.yaml` round-trip dumper. Use `DoubleQuotedScalarString` for `$ref` values. -**Report:** "Step 2 done — extracted N rule library definitions: [list each name and usage count]." +**Report:** "Step 3 done — extracted N rule library definitions: [list each name and usage count]." --- -## Step 3: Add hash_columns +## Step 4: Add hash_columns Write a Python script that: @@ -149,11 +149,11 @@ Build a lookup of `(schema, table)` → columns with constraint and FK metadata: 4. Write to output file -**Report:** "Step 3 done — added hash_columns to N tables, skipped M (all-unique), skipped K (no suitable key). Top hash columns: [column → count]." +**Report:** "Step 4 done — added hash_columns to N tables, skipped M (all-unique), skipped K (no suitable key). Top hash columns: [column → count]." --- -## Step 4: Validate and clean up +## Step 5: Validate and clean up Remove any comment lines containing `ROWID`. @@ -166,11 +166,11 @@ Fix any errors and re-validate until passing. 
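The validate step can be wrapped in a small helper. This is a sketch under assumptions: the command line is the `dm rulesets validate --file ... --type database` form given in Step 5 (with `file` for file-masking rulesets), the function names are illustrative, and treating a zero exit code as "passed" is an assumption about the CLI rather than documented behaviour.

```python
import subprocess


def build_validate_command(ruleset_path: str, ruleset_type: str = "database") -> list[str]:
    """Build the argv for `dm rulesets validate` as described in Step 5."""
    if ruleset_type not in {"database", "file"}:
        raise ValueError(f"unsupported ruleset type: {ruleset_type!r}")
    return ["dm", "rulesets", "validate", "--file", ruleset_path, "--type", ruleset_type]


def validate(ruleset_path: str, ruleset_type: str = "database") -> bool:
    """Run the validator once; assumes exit code 0 means the ruleset passed."""
    result = subprocess.run(build_validate_command(ruleset_path, ruleset_type), check=False)
    return result.returncode == 0
```

Building the argv in one place makes it harder to drop the `--type` flag again, which is the omission this patch series fixes.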
## Summary -| Metric | Value | -|--------|-------| -| Total tables | N | +| Metric | Value | +|----------------------------|----------------| +| Total tables | N | | Mask definitions extracted | N (list names) | -| Tables with hash_columns | N | -| Tables skipped (no key) | N | -| Validation | passed/failed | -| Output file | path | +| Tables with hash_columns | N | +| Tables skipped (no key) | N | +| Validation | passed/failed | +| Output file | path | From 3371da6d6313d207aa0da6570b54cf1e5e2bb047 Mon Sep 17 00:00:00 2001 From: Kane Williams Date: Thu, 7 May 2026 08:30:10 +1200 Subject: [PATCH 3/3] docs(skill): align tables in ruleset-builder reference guides --- .../references/hash-columns-guide.md | 16 ++++++++-------- .../references/mask-definitions-guide.md | 14 +++++++------- .../references/ruleset-libraries-guide.md | 14 +++++++------- 3 files changed, 22 insertions(+), 22 deletions(-) diff --git a/claude-skills/ruleset-builder/skills/ruleset-builder/references/hash-columns-guide.md b/claude-skills/ruleset-builder/skills/ruleset-builder/references/hash-columns-guide.md index 0de3770..cecdc28 100644 --- a/claude-skills/ruleset-builder/skills/ruleset-builder/references/hash-columns-guide.md +++ b/claude-skills/ruleset-builder/skills/ruleset-builder/references/hash-columns-guide.md @@ -71,14 +71,14 @@ hash_columns: Every table belongs to a domain entity. 
Find the column that identifies that entity: -| Domain | Typical hash column | Examples | -|--------|-------------------|----------| -| Customer | `cust_id`, `customer_id`, `client_id` | CUST_MASTER, CUST_ADDRESS | -| Account | `acc_id`, `account_id`, `account_no` | DEP_ACCOUNT, DEP_EMAIL_ALERT | -| Card | `card_id`, `card_no` | CARD_MASTER, CARD_INSURANCE | -| Loan | `loan_id`, `loan_no` | LOAN_COLLATERAL, LOAN_GUARANTOR | -| Employee | `emp_id`, `emp_no`, `employee_id` | COM_EMPLOYEE, COM_EMP_ROLE | -| Transaction | `tx_id`, `trf_id`, `fx_tx_id` | TRF_MASTER, FX_RECEIPT | +| Domain | Typical hash column | Examples | +|-------------|---------------------------------------|---------------------------------| +| Customer | `cust_id`, `customer_id`, `client_id` | CUST_MASTER, CUST_ADDRESS | +| Account | `acc_id`, `account_id`, `account_no` | DEP_ACCOUNT, DEP_EMAIL_ALERT | +| Card | `card_id`, `card_no` | CARD_MASTER, CARD_INSURANCE | +| Loan | `loan_id`, `loan_no` | LOAN_COLLATERAL, LOAN_GUARANTOR | +| Employee | `emp_id`, `emp_no`, `employee_id` | COM_EMPLOYEE, COM_EMP_ROLE | +| Transaction | `tx_id`, `trf_id`, `fx_tx_id` | TRF_MASTER, FX_RECEIPT | ### Step 2: Check foreign keys in the DDL diff --git a/claude-skills/ruleset-builder/skills/ruleset-builder/references/mask-definitions-guide.md b/claude-skills/ruleset-builder/skills/ruleset-builder/references/mask-definitions-guide.md index d389afb..8a014b0 100644 --- a/claude-skills/ruleset-builder/skills/ruleset-builder/references/mask-definitions-guide.md +++ b/claude-skills/ruleset-builder/skills/ruleset-builder/references/mask-definitions-guide.md @@ -175,14 +175,14 @@ tasks: Common seed files for `from_file` masks: -| Category | Files | -|----------|-------| -| Names | `DataMasque_firstNames_mixed.csv`, `DataMasque_lastNames_v2.csv` | +| Category | Files | +|-----------|-------------------------------------------------------------------------------------------------------| +| Names | 
`DataMasque_firstNames_mixed.csv`, `DataMasque_lastNames_v2.csv` | | Addresses | `DataMasque_US_addresses.csv`, `DataMasque_AU_addresses_real.csv`, `DataMasque_NZ_addresses_real.csv` | -| Companies | `DataMasque_companies.csv`, `DataMasque_NZ_companies.csv`, `DataMasque_AU_companies.csv` | -| Email | `DataMasque_fake_email_suffixes.csv`, `DataMasque_email_suffixes.csv` | -| Reference | `DataMasque_country_codes.csv`, `DataMasque_occupations.csv` | -| Cards | `DataMasque_credit_card_numbers.csv`, `DataMasque_credit_card_prefixes.csv` | +| Companies | `DataMasque_companies.csv`, `DataMasque_NZ_companies.csv`, `DataMasque_AU_companies.csv` | +| Email | `DataMasque_fake_email_suffixes.csv`, `DataMasque_email_suffixes.csv` | +| Reference | `DataMasque_country_codes.csv`, `DataMasque_occupations.csv` | +| Cards | `DataMasque_credit_card_numbers.csv`, `DataMasque_credit_card_prefixes.csv` | Regional variants exist for BR, IN, AU, NZ, US. Use `from_file` when there are more than ~50 distinct values; diff --git a/claude-skills/ruleset-builder/skills/ruleset-builder/references/ruleset-libraries-guide.md b/claude-skills/ruleset-builder/skills/ruleset-builder/references/ruleset-libraries-guide.md index d823b33..b845b5e 100644 --- a/claude-skills/ruleset-builder/skills/ruleset-builder/references/ruleset-libraries-guide.md +++ b/claude-skills/ruleset-builder/skills/ruleset-builder/references/ruleset-libraries-guide.md @@ -143,13 +143,13 @@ tasks: [...] 
## Libraries vs YAML Anchors -| Feature | YAML Anchors (`&`/`*`) | Libraries (`$ref`) | -|---------|----------------------|-------------------| -| Scope | Within one ruleset | Across multiple rulesets | -| Management | Inline in YAML | Managed via API/CLI, versioned | -| Syntax | `<<: *anchor_name` | `$ref: "lib#path"` | -| Override | `<<:` merge key | Not supported (use as-is) | -| Best for | Single-ruleset reuse | Organisation-wide standards | +| Feature | YAML Anchors (`&`/`*`) | Libraries (`$ref`) | +|------------|------------------------|--------------------------------| +| Scope | Within one ruleset | Across multiple rulesets | +| Management | Inline in YAML | Managed via API/CLI, versioned | +| Syntax | `<<: *anchor_name` | `$ref: "lib#path"` | +| Override | `<<:` merge key | Not supported (use as-is) | +| Best for | Single-ruleset reuse | Organisation-wide standards | **Recommendation:** - Start with YAML anchors (`mask_definitions`) for within-ruleset deduplication