datamasque · kw-datamasque · May 6, 2026 · May 5, 2026 · May 6, 2026 · May 6, 2026
diff --git a/claude-skills/ruleset-builder/.claude-plugin/plugin.json b/claude-skills/ruleset-builder/.claude-plugin/plugin.json
@@ -1,6 +1,6 @@
 {
   "name": "ruleset-builder",
-  "version": "1.0.0",
+  "version": "1.1.0",
   "description": "Convert auto-generated DataMasque rulesets into production-ready form. Validate and iterate.",
   "author": { "name": "DataMasque Ltd" },
   "repository": "https://github.com/datamasque/datamasque-cli",

diff --git a/claude-skills/ruleset-builder/skills/ruleset-builder/SKILL.md b/claude-skills/ruleset-builder/skills/ruleset-builder/SKILL.md
@@ -12,29 +12,34 @@ Transform auto-generated DataMasque rulesets into production-ready rulesets with
 2. **`hash_columns`** — on every applicable `mask_table` task for deterministic consistency
 3. **Clean structure** — `skip_defaults`, no doc blocks, validated
 
-**4-step process. Complete all 4 steps. Report after each step before proceeding.**
+FK cascade is automatic: mask the parent PK with `imitate_unique` (or `imitate_uuid` / `imitate_nz_ird`) and the engine replicates the rule onto every FK column referencing it. **Do NOT add explicit rules for FK columns.** Avoid `from_unique_imitate` and `mask_unique_key` (both deprecated). Never skip IDs.
 
-Use `TaskCreate` for all 4 steps before starting. The prompt must include business domain and application type — ask if missing.
+5-step process (1–5). Use `TaskCreate` to track all 5; report after each step before proceeding. The prompt must include business domain and application type — ask if missing.
 
 ---
 
-## Step 0: Report version
-Report: **Version 1.5**
+## Step 1: Report versions
+
+Report the Ruleset Builder version (from `plugin.json`) and `dm version` so the operator can correlate output with releases.
 
 ---
 
-## Step 1: Read reference docs
+## Step 2: Read reference docs
+
+Canonical mask reference:
+<https://portal.datamasque.com/portal/documentation/latest/masking-functions-overview.html>
 
-Read all three before any other work:
+Read all of these before any other work:
 ```
+${CLAUDE_PLUGIN_ROOT}/skills/ruleset-builder/references/fk-cascade.md
 ${CLAUDE_PLUGIN_ROOT}/skills/ruleset-builder/references/mask-definitions-guide.md
 ${CLAUDE_PLUGIN_ROOT}/skills/ruleset-builder/references/hash-columns-guide.md
 ${CLAUDE_PLUGIN_ROOT}/skills/ruleset-builder/references/ruleset-yaml-reference.md
 ```
 
 ---
 
-## Step 2: Extract ruleset_library
+## Step 3: Extract ruleset_library
 
 Write a Python script using `ruamel.yaml` (`uv pip install ruamel.yaml`).
 
@@ -52,50 +57,48 @@ masks:
 ### Classification rules (apply in order)
 
 **1. ID columns** — any column ending in `_ID`, `_NO`, `_NR`, `_NBR` is an entity identifier.
-- Strip adjective/verb prefixes before the noun: `PREVIOUS_`, `OLD_`, `TRANSFERRED_`, `PRIOR_`, `CURR_`, `NEW_`, `NEXT_`, `ALT_`, `PARENT_`, `CHILD_`, `SOURCE_`, `TARGET_`, `ORIG_`, `PENDING_`, `ARCHIVED_`, `DELETED_`
-- Extract the core entity: `PREVIOUS_INVOICE_ID` → `invoice`, `TRANSFERRED_ACCOUNT_ID` → `account`, `INVOICE_ACCOUNT_ID` → `invoice_account` (compound kept — no prefix stripped)
-- Group all derivatives to one rule: `$ref: "Global/RuleLib#masks/{entity}_id"`
-- Library entry: `type: imitate_unique`, `seed: "{entity}"` — **seed is required**
-- This overrides whatever mask was originally generated (even `imitate_unique`, `from_random_number`, etc.)
+- **FK side: drop the rule entirely.** If an ID column is a foreign key (the table's `Foreign Keys` metadata in the discovery CSV has an entry for it), do NOT emit a rule for it. The engine cascades automatically from the parent PK rule. See `fk-cascade.md`.
+- **PK side: use `imitate_unique` with `seed:`.** Strip adjective/verb prefixes before the noun: `PREVIOUS_`, `OLD_`, `TRANSFERRED_`, `PRIOR_`, `CURR_`, `NEW_`, `NEXT_`, `ALT_`, `PARENT_`, `CHILD_`, `SOURCE_`, `TARGET_`, `ORIG_`, `PENDING_`, `ARCHIVED_`, `DELETED_`. Extract the core entity (`PREVIOUS_INVOICE_ID` → `invoice`).
+- Library entry name: `{entity}_id`. Reference it as `$ref: "Global/RuleLib#masks/{entity}_id"`.
+- Library entry body: `type: imitate_unique`, `seed: "{entity}"`. The `seed` is optional but recommended: it namespaces by entity so unrelated IDs don't collide (e.g. `customer.id=42` doesn't mask to the same value as `product.id=42`). It doesn't affect FK cascade.
+- This overrides whatever mask was originally generated (even `from_random_number`).
 
 **2. Named patterns** — detect by mask structure:
 
-| Pattern | Detection | Library rule |
-|---------|-----------|--------------|
-| Email | `chain(concat(concat(firstName+lastName, glue='.')+email_suffix)+transform_case(lower))` | `email_address` |
-| Full name | `chain(concat(firstName+lastName, glue=' ')+take_substring)` OR plain `concat(firstName+lastName, glue=' ')` — column not containing USERNAME/LOGIN | `full_name` |
-| Username | Same mask as full_name but column name contains USERNAME, USER_NAME, LOGIN, LOGON | `username` |
-| First name only | `from_file` with firstNames seed | `name_first` |
-| Last name only | `from_file` with lastNames seed | `name_last` |
-| DOB | Column name contains DOB/BIRTH/DATE_OF_BIRTH — use `retain_age` regardless of original type | `dob` |
-| Company | `chain(from_file(companies)+take_substring)` | `company_name` |
-| Country name | `from_file(country_codes, seed_column=name)` | `country_name` |
-| Country alpha-2 | `from_file(country_codes, seed_column=alpha_2)` | `country_code_2` |
-| Country alpha-3 | `from_file(country_codes, seed_column=alpha_3)` | `country_code_3` |
-| Phone/fax | `imitate` on column name containing PHONE, TEL, FAX, MOBILE, CELL | `phone` |
-| Address line 1 | `from_file(addresses, seed_column=street_address)` on LINE_1/ADDRESS_LINE_1 columns | `address_line1` |
-| Address line N | Same for LINE_2, LINE_3 etc. | `address_lineN` |
-| Address full | `from_file(addresses, seed_column=street_address)` on non-line-numbered columns | `address_full` |
-| Address expr | `concat(address+city+state+postcode, glue=', ')` | `network_address_expr` |
-| City | `from_file(addresses, seed_column=city)` | `city` |
-| Postcode | `from_file(addresses, seed_column=postcode)` | `post_code` |
-| Suburb | `from_file(addresses, seed_column=suburb)` | `suburb` |
-| Occupation | `from_file(occupations)` | `occupation` |
+| Pattern         | Detection                                                                                                                                           | Library rule           |
+|-----------------|-----------------------------------------------------------------------------------------------------------------------------------------------------|------------------------|
+| Email           | `chain(concat(concat(firstName+lastName, glue='.')+email_suffix)+transform_case(lower))`                                                            | `email_address`        |
+| Full name       | `chain(concat(firstName+lastName, glue=' ')+take_substring)` OR plain `concat(firstName+lastName, glue=' ')` — column not containing USERNAME/LOGIN | `full_name`            |
+| Username        | Same mask as full_name but column name contains USERNAME, USER_NAME, LOGIN, LOGON                                                                   | `username`             |
+| First name only | `from_file` with firstNames seed                                                                                                                    | `name_first`           |
+| Last name only  | `from_file` with lastNames seed                                                                                                                     | `name_last`            |
+| DOB             | Column name contains DOB/BIRTH/DATE_OF_BIRTH — use `retain_age` regardless of original type                                                         | `dob`                  |
+| Company         | `chain(from_file(companies)+take_substring)`                                                                                                        | `company_name`         |
+| Country name    | `from_file(country_codes, seed_column=name)`                                                                                                        | `country_name`         |
+| Country alpha-2 | `from_file(country_codes, seed_column=alpha_2)`                                                                                                     | `country_code_2`       |
+| Country alpha-3 | `from_file(country_codes, seed_column=alpha_3)`                                                                                                     | `country_code_3`       |
+| Phone/fax       | `imitate` on column name containing PHONE, TEL, FAX, MOBILE, CELL                                                                                   | `phone`                |
+| Address line 1  | `from_file(addresses, seed_column=street_address)` on LINE_1/ADDRESS_LINE_1 columns                                                                 | `address_line1`        |
+| Address line N  | Same for LINE_2, LINE_3 etc.                                                                                                                        | `address_lineN`        |
+| Address full    | `from_file(addresses, seed_column=street_address)` on non-line-numbered columns                                                                     | `address_full`         |
+| Address expr    | `concat(address+city+state+postcode, glue=', ')`                                                                                                    | `network_address_expr` |
+| City            | `from_file(addresses, seed_column=city)`                                                                                                            | `city`                 |
+| Postcode        | `from_file(addresses, seed_column=postcode)`                                                                                                        | `post_code`            |
+| Suburb          | `from_file(addresses, seed_column=suburb)`                                                                                                          | `suburb`               |
+| Occupation      | `from_file(occupations)`                                                                                                                            | `occupation`           |
 
 **3. Remaining** — group by column name concept. Where column names share a root (e.g., `RESULT3_VALUE`, `RESULT5_VALUE` → `result_value`; `GENERAL_2`, `GENERAL_6` → `general`), use one shared rule. Strip adjective prefixes. Use first occurrence's parameters.
 
-- `imitate_unique` (non-ID cols) → `{col_group}: type: imitate_unique, seed: "{col_group}"` — **seed is required**
+- `imitate_unique` (non-ID cols) → `{col_group}: type: imitate_unique, seed: "{col_group}"` (seed recommended for namespacing; see ID columns section).
 - `from_random_date` → `{col_group}: type: from_random_date, min/max from first occurrence`
 - `from_random_number` → `{col_group}: type: from_random_number, min/max from first occurrence`
-- `imitate` (non-phone) → `{col_group}: type: imitate`
+- String catch-all → `{col_group}: type: imitate_unique, seed: "{col_group}"` (use `imitate` only for types `imitate_unique` can't handle, e.g. datetime, bool).
 - Complex chains → keep structure, group by column name
 
-> **Critical rule:** Every `imitate_unique` entry in `ruleset_library.yaml` MUST have a `seed` value.
-> - Entity ID rules: `seed: "{entity_name}"` (e.g., `account_id` → `seed: "account"`)
-> - All other `imitate_unique` rules: `seed: "{rule_name}"` (e.g., `field_name` → `seed: "field_name"`)
-
 ### Output format
 
+`Global/RuleLib` below is a placeholder for `<namespace>/<library_name>` — substitute the operator's real values, and create the library with `dm libraries create` before running the ruleset.
+
 ```yaml
 version: '1.0'
 skip_defaults:
@@ -116,11 +119,11 @@ tasks:
 
 Do NOT write a custom YAML serializer. Use `ruamel.yaml` round-trip dumper. Use `DoubleQuotedScalarString` for `$ref` values.
 
-**Report:** "Step 2 done — extracted N rule library definitions: [list each name and usage count]."
+**Report:** "Step 3 done — extracted N rule library definitions: [list each name and usage count]."
 
 ---
 
-## Step 3: Add hash_columns
+## Step 4: Add hash_columns
 
 Write a Python script that:
 
@@ -146,30 +149,28 @@ Build a lookup of `(schema, table)` → columns with constraint and FK metadata:
 
 4. Write to output file
 
-**Report:** "Step 3 done — added hash_columns to N tables, skipped M (all-unique), skipped K (no suitable key). Top hash columns: [column → count]."
+**Report:** "Step 4 done — added hash_columns to N tables, skipped M (all-unique), skipped K (no suitable key). Top hash columns: [column → count]."
 
 ---
 
-## Step 4: Validate and clean up
+## Step 5: Validate and clean up
 
 Remove any comment lines containing `ROWID`.
 
-Run:
-```bash
-dm rulesets validate --file <output_file>
-```
+Run `dm rulesets validate --file <output_file> --type database`
+(use `--type file` for file-masking rulesets).
 
 Fix any errors and re-validate until passing.
 
 ---
 
 ## Summary
 
-| Metric | Value |
-|--------|-------|
-| Total tables | N |
+| Metric                     | Value          |
+|----------------------------|----------------|
+| Total tables               | N              |
 | Mask definitions extracted | N (list names) |
-| Tables with hash_columns | N |
-| Tables skipped (no key) | N |
-| Validation | passed/failed |
-| Output file | path |
+| Tables with hash_columns   | N              |
+| Tables skipped (no key)    | N              |
+| Validation                 | passed/failed  |
+| Output file                | path           |
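A minimal sketch of classification rule 1 from the SKILL.md hunk above (PK side), not part of this change. It assumes the listed prefixes are stripped repeatedly from the front of the column name and that matching is case-insensitive; neither is stated in the diff. The helper names `entity_from_id_column` and `library_entry` are hypothetical.

```python
# Suffixes that mark a column as an entity identifier (from the skill text).
ID_SUFFIXES = ("_ID", "_NO", "_NR", "_NBR")

# Adjective/verb prefixes to strip before the noun (from the skill text).
PREFIXES = (
    "PREVIOUS_", "OLD_", "TRANSFERRED_", "PRIOR_", "CURR_", "NEW_",
    "NEXT_", "ALT_", "PARENT_", "CHILD_", "SOURCE_", "TARGET_",
    "ORIG_", "PENDING_", "ARCHIVED_", "DELETED_",
)


def entity_from_id_column(column):
    """Return the lower-case core entity for an ID column, or None."""
    name = column.upper()
    suffix = next((s for s in ID_SUFFIXES if name.endswith(s)), None)
    if suffix is None:
        return None  # not an ID column under rule 1
    core = name[: -len(suffix)]
    # Assumption: strip prefixes repeatedly, but never strip the whole name.
    stripped = True
    while stripped:
        stripped = False
        for prefix in PREFIXES:
            if core.startswith(prefix) and len(core) > len(prefix):
                core = core[len(prefix):]
                stripped = True
                break
    return core.lower() or None


def library_entry(column):
    """Build the `{entity}_id` library entry for a PK-side ID column."""
    entity = entity_from_id_column(column)
    if entity is None:
        return None
    return {f"{entity}_id": {"type": "imitate_unique", "seed": entity}}


print(entity_from_id_column("PREVIOUS_INVOICE_ID"))
print(library_entry("TRANSFERRED_ACCOUNT_ID"))
```

Compound names such as `INVOICE_ACCOUNT_ID` keep both nouns because `INVOICE_` is not in the prefix list. FK-side columns should be filtered out before calling `library_entry`, since the skill text says no rule is emitted for them.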
diff --git a/claude-skills/ruleset-builder/skills/ruleset-builder/references/fk-cascade.md b/claude-skills/ruleset-builder/skills/ruleset-builder/references/fk-cascade.md
@@ -0,0 +1,109 @@
+# FK Cascade Invariant
+
+The most important rule when refining a DataMasque ruleset that spans
+related tables. Get this wrong and you either leak identity (by skipping
+IDs entirely) or have the run rejected (by adding rules for FK columns).
+
+## The rule
+
+**Mask only the parent PK column. The engine cascades the same masked value
+to every FK column referencing it.**
+
+Three masks support this cascade:
+
+- `imitate_unique` — recommended for new work.
+- `imitate_uuid` — for UUID-shaped IDs.
+- `imitate_nz_ird` — for NZ IRD numbers.
+
+(`from_unique_imitate` and `mask_unique_key` are deprecated; do not emit.)
+
+When `mask_table` runs and a rule on a referenced column uses one of these
+masks, the engine:
+
+1. Discovers child tables with FKs referencing this column.
+2. Auto-replicates the parent's rule onto every FK column.
+3. Same mask config → same masked output → joins survive.
+
+This is documented at
+<https://portal.datamasque.com/portal/documentation/latest/unique-masks.html>:
+
+> "You can apply an `imitate_unique` mask to a primary key column or a
+> column that is used as a foreign key in another table. References will be
+> updated automatically. Composite primary keys are supported."
+
+## Worked example
+
+Schema:
+- `customers.id` (PK), `customers.email`
+- `orders.id` (PK), `orders.customer_id` (FK → `customers.id`), `orders.tracking_number`
+
+Correct ruleset:
+
+```yaml
+- type: mask_table
+  table: customers
+  key: id
+  rules:
+    - column: id
+      masks:
+        - type: imitate_unique
+          seed: customer
+    - column: email
+      masks:
+        - type: from_file
+          seed_file: DataMasque_emails.csv
+          seed_column: email
+
+- type: mask_table
+  table: orders
+  key: id
+  rules:
+    # customer_id is intentionally absent — the engine replicates the
+    # `customers.id` rule onto it automatically. Adding it here would
+    # be rejected by the runtime FK check.
+    - column: tracking_number
+      masks:
+        - type: imitate_unique
+          seed: tracking
+```
+
+After the run, `orders.customer_id` holds the same masked values as
+`customers.id`, joins remain intact, and `tracking_number` is independently
+masked with its own seed.
+
+## Anti-patterns to refuse
+
+- **Adding explicit FK rules** ("I'll mask both PK and FK with shared
+  `$ref` so the cascade works"). The runtime rejects this by default with
+  the error:
+  *"To preserve referential integrity, the following foreign key columns
+  cannot be directly masked by this task."*
+  The engine will replicate the rule for you; adding your own conflicts.
+- **Skipping IDs to "preserve FK joins"**. Leaves identifiers in plain
+  sight. Mask the parent PK with `imitate_unique` — joins survive via
+  the auto-cascade.
+- **Inventing linking parameters** (`source_table`, `source_column`,
+  `parent_column`, `link_to`). None of these exist on any DataMasque mask.
+- **Inventing a hashing mask** (`hash_text`, `hash`, `link`, `match_id`).
+  None of these exist. `imitate_unique` is the deterministic mask.
+- **Using `from_unique_imitate` or `mask_unique_key`**. Both deprecated.
+  `imitate_unique` replaces both.
+
+## Cross-run consistency requires `run_secret`
+
+Within a single run, `imitate_unique` is deterministic via a per-run
+`insecure_seed`. Across runs, masked values stay the same only if each
+run is invoked with the same `run_secret`. Without it, the same input
+maps to a different masked value next run. If cross-run consistency
+matters, flag this in the final summary.
+
+## Self-check before finishing
+
+For each FK relationship in the schema:
+
+1. Is the parent PK masked with `imitate_unique`, `imitate_uuid`, or
+   `imitate_nz_ird`?
+2. Is the FK column **absent** from your output (no explicit rule)?
+3. Are `from_unique_imitate` and `mask_unique_key` absent from your output?
+
+If any answer is "no", fix it before validation.
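The self-check in `fk-cascade.md` could be automated roughly as follows; this is a sketch, not part of this change. It assumes the ruleset tasks are already loaded into plain dicts (for example with `ruamel.yaml`) and that FK relationships are supplied as a hypothetical mapping of `(child_table, child_column)` to `(parent_table, parent_column)`. It inspects only inline top-level mask `type` values, so `$ref` entries and masks nested inside chains are not resolved.

```python
CASCADE_MASKS = {"imitate_unique", "imitate_uuid", "imitate_nz_ird"}
DEPRECATED = {"from_unique_imitate", "mask_unique_key"}


def column_masks(tasks):
    """Map (table, column) -> mask type names found in mask_table tasks."""
    found = {}
    for task in tasks:
        if task.get("type") != "mask_table":
            continue
        for rule in task.get("rules", []):
            types = [m.get("type") for m in rule.get("masks", [])]
            found.setdefault((task["table"], rule["column"]), []).extend(types)
    return found


def check_fk_cascade(tasks, foreign_keys):
    """Return a list of problems; an empty list means every check passed."""
    masks = column_masks(tasks)
    problems = []
    for (c_table, c_col), (p_table, p_col) in foreign_keys.items():
        # 1. Parent PK must use one of the cascade-capable masks.
        if not CASCADE_MASKS.intersection(masks.get((p_table, p_col), [])):
            problems.append(f"{p_table}.{p_col}: parent PK has no cascade mask")
        # 2. The FK column must have no explicit rule.
        if (c_table, c_col) in masks:
            problems.append(f"{c_table}.{c_col}: explicit rule on FK column")
    # 3. Deprecated names must not appear as a mask type or a task type.
    for (table, col), types in masks.items():
        for name in types:
            if name in DEPRECATED:
                problems.append(f"{table}.{col}: deprecated mask {name}")
    for task in tasks:
        if task.get("type") in DEPRECATED:
            problems.append(f"deprecated task type {task['type']}")
    return problems


# The worked example from fk-cascade.md, as already-parsed data.
tasks = [
    {"type": "mask_table", "table": "customers", "key": "id", "rules": [
        {"column": "id", "masks": [{"type": "imitate_unique", "seed": "customer"}]},
        {"column": "email", "masks": [{"type": "from_file",
                                       "seed_file": "DataMasque_emails.csv",
                                       "seed_column": "email"}]},
    ]},
    {"type": "mask_table", "table": "orders", "key": "id", "rules": [
        {"column": "tracking_number",
         "masks": [{"type": "imitate_unique", "seed": "tracking"}]},
    ]},
]
foreign_keys = {("orders", "customer_id"): ("customers", "id")}

print(check_fk_cascade(tasks, foreign_keys))
```

Running a check like this before `dm rulesets validate` would catch the explicit-FK-rule anti-pattern earlier than the runtime FK check does.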
diff --git a/...-skills/ruleset-builder/skills/ruleset-builder/references/hash-columns-guide.md b/...-skills/ruleset-builder/skills/ruleset-builder/references/hash-columns-guide.md
@@ -71,14 +71,14 @@ hash_columns:
 
 Every table belongs to a domain entity. Find the column that identifies that entity:
 
-| Domain | Typical hash column | Examples |
-|--------|-------------------|----------|
-| Customer | `cust_id`, `customer_id`, `client_id` | CUST_MASTER, CUST_ADDRESS |
-| Account | `acc_id`, `account_id`, `account_no` | DEP_ACCOUNT, DEP_EMAIL_ALERT |
-| Card | `card_id`, `card_no` | CARD_MASTER, CARD_INSURANCE |
-| Loan | `loan_id`, `loan_no` | LOAN_COLLATERAL, LOAN_GUARANTOR |
-| Employee | `emp_id`, `emp_no`, `employee_id` | COM_EMPLOYEE, COM_EMP_ROLE |
-| Transaction | `tx_id`, `trf_id`, `fx_tx_id` | TRF_MASTER, FX_RECEIPT |
+| Domain      | Typical hash column                   | Examples                        |
+|-------------|---------------------------------------|---------------------------------|
+| Customer    | `cust_id`, `customer_id`, `client_id` | CUST_MASTER, CUST_ADDRESS       |
+| Account     | `acc_id`, `account_id`, `account_no`  | DEP_ACCOUNT, DEP_EMAIL_ALERT    |
+| Card        | `card_id`, `card_no`                  | CARD_MASTER, CARD_INSURANCE     |
+| Loan        | `loan_id`, `loan_no`                  | LOAN_COLLATERAL, LOAN_GUARANTOR |
+| Employee    | `emp_id`, `emp_no`, `employee_id`     | COM_EMPLOYEE, COM_EMP_ROLE      |
+| Transaction | `tx_id`, `trf_id`, `fx_tx_id`         | TRF_MASTER, FX_RECEIPT          |
 
 ### Step 2: Check foreign keys in the DDL