Proximal L1 solver (method=prox, l1_lambda) + honest method label #184

Open
MaxGhenis wants to merge 2 commits into codex/target-alignment-audit-20260624 from l1-proximal-solver
@MaxGhenis MaxGhenis commented Jun 25, 2026

Proximal L1 solver + honest method label

Adds the L1 selection path the l0-paper comparison needs, and fixes the apg mislabel surfaced while building it.

The apg bug

calibrate(method=...) accepted "apg"/"adam", but both ran torch Adam on log-weights: method was validated, written into the run manifest, and then never passed to _optimize. So every default run recorded method: "apg" for an Adam run (false provenance), and the docstring claimed "Adam is the accelerated proximal gradient," which is wrong (Adam has no prox step). Not a results bug, but a real mislabel, and it actively blocks L1, which needs a proximal operator.

Fix: method ∈ {"adam", "prox"}, default "adam"; "apg" is rejected. No back-compat alias (no caller passed it).
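As a rough illustration of the contract described above, the argument check could look like the sketch below. The names validate_method and VALID_METHODS are hypothetical, and the l1_lambda default of 0.0 is an assumption; the actual calibrate signature may differ.

```python
VALID_METHODS = ("adam", "prox")  # hypothetical constant name


def validate_method(method="adam", l1_lambda=0.0):
    """Reject unknown solver labels and L1 penalties on non-proximal paths."""
    if method not in VALID_METHODS:
        # "apg" lands here: there is no back-compat alias.
        raise ValueError(f"method must be one of {VALID_METHODS}, got {method!r}")
    if l1_lambda > 0 and method != "prox":
        # Adam has no prox step, so it cannot honor an L1 penalty.
        raise ValueError('l1_lambda > 0 requires method="prox"')
    return method
```

Returning the validated label makes it easy to pass the same value to both the manifest and the optimizer, which is the link that was missing in the apg case.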

The L1 path (method="prox", l1_lambda)

Proximal gradient (ISTA-style) minimizing the same capped weighted-MAPE loss plus l1_lambda · mean(w / w0):

  • Optimizes the ratio r = w / w0 (O(1) scale) so the step is well-conditioned.
  • Uses plain gradient, not Adam: Adam normalizes every coordinate's update and prevents clean soft-threshold selection.
  • Uses the same effective smooth-step size for the prox: r ← max(r - eta · l1_lambda / n, 0), so l1_lambda is the coefficient on the recorded mean initial-weight-ratio objective, not an unnormalized sum penalty.
  • Produces exact zeros, so L1 selects a sparse calibrated subset (the convex analog of the L0 gates).
  • Applies mass conservation once at the end over survivors; per-step rescaling would resurrect zeroed records.
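The bullets above can be sketched in plain Python as follows. This is a minimal illustration, not the code in solve.py: prox_step and conserve_mass are hypothetical names, the RMS-normalized effective step follows the review discussion on this PR, the fallback to the raw learning rate when the gradient is exactly zero is an assumption, and so is rescaling survivors to the initial total weight.

```python
import math


def prox_step(r, grad, learning_rate, l1_lambda):
    """One proximal-gradient update on the weight ratios r = w / w0."""
    n = len(r)
    rms = math.sqrt(sum(g * g for g in grad) / n)
    # Assumption: use the raw learning rate when the gradient is exactly zero.
    eta = learning_rate / rms if rms > 0 else learning_rate
    shrink = eta * l1_lambda / n  # threshold for l1_lambda * mean(r)
    # Gradient step, then soft-threshold, then clamp at zero (exact zeros).
    return [max(ri - eta * gi - shrink, 0.0) for ri, gi in zip(r, grad)]


def conserve_mass(r, w0):
    """Rescale surviving weights once so they sum to the initial total."""
    weights = [ri * w0i for ri, w0i in zip(r, w0)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("all weights were zeroed; lower l1_lambda")
    scale = sum(w0) / total
    return [wi * scale for wi in weights]  # zeroed records stay at zero


# One step with a zero gradient: the small ratio is pruned, the other shrinks.
r = prox_step([0.01, 1.0], [0.0, 0.0], learning_rate=0.1, l1_lambda=2.0)
w = conserve_mass(r, w0=[10.0, 10.0])
```

Because the rescale is multiplicative and runs only once, a record that reached zero cannot come back, which is the reason given above for not rescaling at every step.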

Validation

Focused checks passed:

  • ruff format / ruff check on the touched solver/test files.
  • pytest packages/populace-calibrate/tests/test_solve.py -q → 38 passed.
  • pytest packages/populace-calibrate/tests -q → 99 passed, with existing torch sparse warnings.

New tests cover sparse selection, lambda monotonicity, apg rejection, l1_lambda requiring method="prox", and a one-step zero-gradient case that pins the l1_lambda · mean(w / w0) prox scaling.

On the L2 concentration fixture after the mean-penalty scaling fix: l1=0 → 160 survivors, 0.5 → 106, 1.0 → 79, 2.0 → 57; larger penalties correctly error if all weights are zeroed.

Based on codex/target-alignment-audit-20260624 at dcdde76, so it stacks on the current formula-owned aggregate export fix.

@MaxGhenis MaxGhenis left a comment

Review: PolicyEngine/populace PR #184

Recommendation: REQUEST_CHANGES

Findings

  1. [CRITICAL] The L1 path does not implement the objective it documents and records.

In packages/populace-calibrate/src/populace/calibrate/solve.py, _optimize_proximal documents the objective as capped weighted-MAPE plus l1_lambda * mean(w_i / w0_i), and result metadata records l1_penalty="mean_initial_weight_ratio_abs". But the update at lines 685-690 normalizes the smooth gradient step by RMS, making the effective step size learning_rate / rms, while the soft-threshold is fixed at learning_rate * l1_lambda. For the documented mean penalty, the threshold should use the same effective step size and the mean divisor, approximately (learning_rate / rms) * l1_lambda / n. The current update is therefore not the proximal operator for the recorded objective; it behaves like an unnormalized heuristic whose lambda scale changes with record count and gradient scale. That is especially risky because this PR is meant to supply a solver/provenance path for l0-paper comparisons.

Relevant lines: packages/populace-calibrate/src/populace/calibrate/solve.py:627, :632, :648, :652, :685, :690, :1289.

Suggested fix: choose one contract and make code, docs, metadata, and tests agree. If the contract is mean(w/w0), use the actual effective step in the prox shrink and divide by n. If the desired behavior is a budget heuristic, rename the metadata away from an objective penalty and stop claiming it minimizes target_loss + lambda * mean(r).
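To make the scaling in this finding concrete, here is a short derivation sketch. The symbols are shorthand introduced here rather than names from solve.py: $L(r)$ is the smooth capped weighted-MAPE term in the ratios $r_i = w_i / w0_i \ge 0$, $g = \nabla L(r)$, $\alpha$ is learning_rate, and $\lambda$ is l1_lambda.

```latex
\begin{aligned}
F(r) &= L(r) + \frac{\lambda}{n}\sum_{i=1}^{n} r_i, \qquad r_i \ge 0, \\
\eta &= \frac{\alpha}{\operatorname{rms}(g)}, \qquad
\operatorname{rms}(g) = \sqrt{\tfrac{1}{n}\sum_{i=1}^{n} g_i^{2}}, \\
r_i^{+} &= \max\!\Bigl(r_i - \eta\, g_i - \eta\,\frac{\lambda}{n},\; 0\Bigr).
\end{aligned}
```

The last line is the proximal operator of $\eta\,(\lambda/n)\sum_i r_i$ on $r \ge 0$, applied to the gradient step $r - \eta g$. A fixed shrink of $\alpha\lambda$ equals $\eta\,\lambda'/n$ with $\lambda' = \lambda\, n\, \operatorname{rms}(g)$, so the penalty actually applied would vary with record count and gradient scale.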

Test Gap

The new tests show that prox can produce exact zeros and that pruning is monotone on one fixture. They do not test the L1 scale/objective contract. Add a small deterministic one-step prox test or a duplicate-record invariance test for the declared mean penalty.
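One possible shape for the duplicate-record invariance check is sketched below. It assumes the RMS-normalized mean-penalty update discussed in the finding and uses a toy linear loss in place of the real target loss; prox_step, toy_ratio_gradient, and duplicate are hypothetical helpers, not functions from the package.

```python
import math


def prox_step(r, grad, learning_rate, l1_lambda):
    """Mean-penalty prox update on ratios r, with an RMS-normalized step."""
    n = len(r)
    rms = math.sqrt(sum(g * g for g in grad) / n)
    eta = learning_rate / rms if rms > 0 else learning_rate  # assumed fallback
    shrink = eta * l1_lambda / n
    return [max(ri - eta * gi - shrink, 0.0) for ri, gi in zip(r, grad)]


def toy_ratio_gradient(w0, x):
    """Gradient of the toy loss sum_i r_i * w0_i * x_i with respect to r_i."""
    return [w0i * xi for w0i, xi in zip(w0, x)]


def duplicate(values, split=False):
    """Repeat each record twice, optionally splitting its value in half."""
    factor = 0.5 if split else 1.0
    return [v * factor for v in values for _ in range(2)]


w0, x, r = [4.0, 2.0], [1.0, 3.0], [1.0, 1.0]
base = prox_step(r, toy_ratio_gradient(w0, x), 0.1, 0.5)
dup = prox_step(
    duplicate(r),
    toy_ratio_gradient(duplicate(w0, split=True), duplicate(x)),
    0.1,
    0.5,
)
# Each copy of a record should land on the same ratio as the original record.
for i, expected in enumerate(base):
    assert math.isclose(dup[2 * i], expected)
    assert math.isclose(dup[2 * i + 1], expected)
```

Splitting every record in two halves the ratio gradient and its RMS, doubles the effective step, and doubles n, so both the smooth step and the shrink are unchanged. Dropping only the division by n, or only the RMS factor, from the shrink breaks the equality. A fixed learning_rate * l1_lambda shrink would still pass, so this check complements a one-step scale test rather than replacing it.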

Other Notes

The PR is mergeable but the branch is three commits behind the current #182 base. It should be rebased after the solver issue is fixed. No GitHub checks were reported for this branch during review; local focused test_solve.py passed 37 tests.

The method arg accepted "apg"/"adam" but both ran torch Adam on log-weights, so manifests could record method="apg" for an Adam run. Adam also cannot perform the soft-thresholding L1 needs.

This adds method="prox" for proximal gradient on weight ratios r=w/w0, rejects the misleading apg alias, and guards l1_lambda so it can only be used with the prox path.

The L1 contract is explicit: l1_lambda multiplies mean(w/w0). The prox shrink uses the same effective smooth-step size and divides by n, so the soft-threshold matches the recorded mean-initial-weight-ratio objective rather than an unnormalized sum penalty.

Tests cover sparse selection, lambda monotonicity, apg rejection, prox-only l1_lambda, and a one-step zero-gradient case that pins the mean-ratio prox scale.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@MaxGhenis MaxGhenis force-pushed the l1-proximal-solver branch from 42c1a58 to 258e7eb on June 25, 2026 12:50
@MaxGhenis

Fixed the review finding and pushed the update to #184.

What changed:

  • Rebases the branch onto the current #182 base, "Treat PE-US aggregate outputs as formula-owned" (dcdde76).
  • Keeps the documented L1 objective as l1_lambda * mean(w / w0).
  • Changes the prox shrink to use the same effective smooth-step size and divide by n, so the implementation matches the recorded mean_initial_weight_ratio_abs provenance.
  • Adds a one-step zero-gradient regression test that would fail if the code drifts back to an unnormalized sum-penalty shrink.
  • Updates the PR body so the L1 formula and validation examples match the pushed code.

Verification:

  • ruff format / ruff check on the touched files.
  • pytest packages/populace-calibrate/tests/test_solve.py -q → 38 passed.
  • pytest packages/populace-calibrate/tests -q → 99 passed, with existing torch sparse warnings.

@MaxGhenis MaxGhenis left a comment

Follow-up after the pushed fix: the L1 proximal update now matches the documented objective, l1_lambda * mean(w / w0), because the shrink threshold uses the same effective smooth step size and divides by record count. The new one-step regression test covers the scale contract that the original review flagged.

Validated locally with pytest packages/populace-calibrate/tests/test_solve.py -q and pytest packages/populace-calibrate/tests -q. I am leaving this as a comment review rather than approving my own pushed fix.

@MaxGhenis MaxGhenis left a comment

Follow-up after the second review pass: I pushed 99b4f36 to address the remaining solver-contract issues. method="prox" now rejects l2_lambda > 0 instead of recording an L2 objective it does not optimize, and the L1 scale tests now cover both the zero-gradient / n shrink and a nonzero-gradient case that pins use of the RMS-normalized effective step.
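For readers who want to see what the nonzero-gradient case pins, here is a minimal sketch. It is not the test in test_solve.py: effective_step and prox_step are hypothetical names, and the zero-gradient fallback to the raw learning rate is an assumption.

```python
import math


def effective_step(grad, learning_rate):
    """Learning rate divided by the gradient RMS (assumed zero-RMS fallback)."""
    rms = math.sqrt(sum(g * g for g in grad) / len(grad))
    return learning_rate / rms if rms > 0 else learning_rate


def prox_step(r, grad, learning_rate, l1_lambda):
    """Gradient step and shrink that share the same effective step size."""
    eta = effective_step(grad, learning_rate)
    shrink = eta * l1_lambda / len(r)
    return [max(ri - eta * gi - shrink, 0.0) for ri, gi in zip(r, grad)]


# rms = 2, so eta = 0.2 / 2 = 0.1 and shrink = 0.1 * 1.0 / 2 = 0.05.
updated = prox_step([1.0, 1.0], [2.0, 2.0], learning_rate=0.2, l1_lambda=1.0)
# Each ratio moves to about 1 - 0.2 - 0.05 = 0.75. A shrink built from the raw
# learning rate (0.2 * 1.0 / 2 = 0.1) would give about 0.70 instead.
```

A gradient whose RMS differs from 1 is what separates the two rules; with a zero gradient and this fallback they coincide, which is why both cases are worth pinning.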

Validated with uv run --package populace-calibrate --group dev python -m pytest packages/populace-calibrate/tests/test_solve.py -q, uv run --package populace-calibrate --group dev python -m pytest packages/populace-calibrate/tests -q, uv run --no-sync ruff check packages/populace-calibrate/src/populace/calibrate/solve.py packages/populace-calibrate/tests/test_solve.py, and git diff --check. Existing torch sparse warnings only.
