fix(earnings): venue-rule compound-word under-count — compound_type axis + fail-loud boundaries (D-30) #93
Merged
…rted guard)
- Remove the inverted (?<!\w-)/(?!-\w) guard in _form_to_pattern; every form now uses plain (?<!\w)/(?!\w) boundaries so pre-tariff/tariff-based/non-fat/pro-Palestine/New York-based count for the bare term (both venues' PDFs).
- Rewrite the Hyphenated-compound docstring with verbatim Kalshi + Polymarket PDF URLs so the rule cannot be re-inverted.
- Correct test_hyphenated_compound_is_not_the_bare_word -> test_hyphenated_compound_counts_as_bare_word (locks the fixed behavior).
- GLP-1 slash-term + plural/possessive regressions stay green.
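The boundary change described above can be illustrated with a minimal sketch. It assumes the removed guard was layered on top of the plain boundaries; `plain_pattern` and `inverted_pattern` are hypothetical names for illustration, not the repository's `_form_to_pattern`.

```python
import re


def plain_pattern(form: str) -> re.Pattern:
    """Plain word boundaries: a neighbouring hyphen does not block a match."""
    return re.compile(rf"(?<!\w){re.escape(form)}(?!\w)", re.IGNORECASE)


def inverted_pattern(form: str) -> re.Pattern:
    """Assumed shape of the removed guard: also rejects hyphen-attached matches."""
    return re.compile(
        rf"(?<!\w)(?<!\w-){re.escape(form)}(?!-\w)(?!\w)", re.IGNORECASE
    )


TEXT = "Pre-tariff margins held, tariff-based pricing rose, and the tariff stays."

# Every hyphenated element counts for the bare term under plain boundaries.
plain_hits = [m.group(0) for m in plain_pattern("tariff").finditer(TEXT)]
# The inverted guard keeps only the free-standing occurrence.
inverted_hits = [m.group(0) for m in inverted_pattern("tariff").finditer(TEXT)]
```

With this text the plain pattern finds three occurrences and the guarded pattern only one, which is the under-count this commit removes.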
- earnings_fact: COMPOUND_TYPE_VALUES (standalone|open|hyphenated|closed|
affix_derivation) + KALSHI/POLYMARKET_AUTOCOUNT_COMPOUND_TYPES frozensets +
nullable compound_type ColumnSpec (SEPARATE axis from MATCH_RULE_VALUES);
all three exported in __all__.
- stt: classify_mentions() SIBLING to count_mentions — one record per
occurrence {surface,start,compound_type}, reusing the same form-expansion
machinery. Word-boundary pass tags standalone/open/hyphenated; substring pass
tags closed candidates vs affix_derivation via a curated stdlib prefix/suffix
heuristic (no dictionary dep, D-30 decision 4). Ambiguous -> conservative
closed candidate, never a silent drop.
…andidate review
- build_fact_rows emits compound_type per row (default standalone for pre-fix / PR #89 rows); each classify_mentions occurrence is one row, never an aggregate mixing compound_types.
- kalshi_boolean_settles + polymarket_count restrict to KALSHI/POLYMARKET_AUTOCOUNT_COMPOUND_TYPES (standalone|open|hyphenated); closed is Kalshi-No + Polymarket candidate-only; affix_derivation counts for neither.
- New closed_candidate_count + resolve_polymarket_status: fail-loud 'disputed' when auto < threshold but auto+closed >= threshold (closed candidates could flip the outcome), never a silent resolved_no.
- Back-compat: a row without compound_type settles exactly as before.
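The settlement rule in this commit can be sketched as follows. This is an illustration under stated assumptions, not the repository's code: rows are plain dicts, a single `AUTOCOUNT_COMPOUND_TYPES` set stands in for both venue frozensets, the `'resolved_yes'` label is assumed (the source names only `'disputed'` and `resolved_no`), and Kalshi is assumed to settle Yes on any auto-counted occurrence.

```python
AUTOCOUNT_COMPOUND_TYPES = frozenset({"standalone", "open", "hyphenated"})


def _type_of(row: dict) -> str:
    # Back-compat: a row without compound_type behaves as standalone.
    return row.get("compound_type") or "standalone"


def polymarket_count(rows: list) -> int:
    return sum(1 for r in rows if _type_of(r) in AUTOCOUNT_COMPOUND_TYPES)


def closed_candidate_count(rows: list) -> int:
    return sum(1 for r in rows if _type_of(r) == "closed")


def resolve_polymarket_status(rows: list, threshold: int) -> str:
    auto = polymarket_count(rows)
    if auto >= threshold:
        return "resolved_yes"  # assumed label
    if auto + closed_candidate_count(rows) >= threshold:
        return "disputed"  # closed candidates could flip the outcome
    return "resolved_no"


def kalshi_boolean_settles(rows: list) -> bool:
    # Assumption: Yes as soon as one auto-counted occurrence exists.
    return polymarket_count(rows) > 0
```

For example, a single closed row at threshold 1 yields `'disputed'` on Polymarket while Kalshi stays No, and an affix_derivation row moves neither tally.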
…type
- scripts/export_schemas.py output: add nullable compound_type enum column to json/schema.earnings_fact.v1.json + update EXPORT_MANIFEST sha256/size, keeping the schema-drift gate (test_schemas_codegen.py) green after the Task 2 axis add.
… (review round 1)
F2: classify_mentions silently fell through to exact semantics on an unknown match_rule and returned [] for empty/degenerate terms, the silent-settle hazards count_mentions guards against. Extract the four guards into _validated_forms_for_term; both count_mentions and classify_mentions now share the identical fail-loud form-prep path.
…ndidate scan (review round 1)
F3: pass 2 scanned acronym and case-sensitive forms case-blind, contradicting its own comment: OCI tagged a candidate inside 'social', Block inside 'blockchain', flooding human review and bypassing the case-sensitive over-count guard pass 1 enforces. Acronym forms are now excluded from pass 2 entirely; case-sensitive forms scan case-sensitively (documented: capitalized 'Blockchain' surfaces as a conservative closed candidate for 'Block'; review-only, never auto-counted).
…review round 1)
F4: _DERIVATIONAL_PREFIXES silently dropped genuine closed compounds: oversupply/prepayment/underdog classified affix_derivation, excluded from closed_candidate_count, so a Polymarket straddle resolved resolved_no with no human review (the silent drop the locked design forbids). Remove the prefix branch entirely; only the grammatical-suffix branch (joyful, running) may return affix_derivation. All prefix-attached cases fall through to the conservative closed candidate.
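A suffix-only classifier of the kind F4 describes might look like the sketch below. The suffix set and the doubled-consonant rule are assumptions chosen for illustration; the repository's `_closed_or_affix` and its curated lists are not shown in this PR page.

```python
# Hypothetical grammatical-suffix set; the real curated list may differ.
GRAMMATICAL_SUFFIXES = frozenset({"ful", "ing", "ed", "ly", "ness", "less"})


def closed_or_affix(word: str, surface: str) -> str:
    """Classify a fused match: affix_derivation only for term + one suffix."""
    w, s = word.lower(), surface.lower()
    if w.startswith(s):
        tail = w[len(s):]
        # Allow a doubled final consonant before the suffix: run -> running.
        if tail[:1] == s[-1:] and tail[1:] in GRAMMATICAL_SUFFIXES:
            tail = tail[1:]
        if tail in GRAMMATICAL_SUFFIXES:
            return "affix_derivation"
    # No prefix branch: anything else is a conservative closed candidate.
    return "closed"
```

Because there is no prefix branch, oversupply (for supply) and underdog (for dog) come out as closed, and a suffix chain such as joyfully also falls through to closed, so it surfaces for review instead of being dropped.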
…test (review round 1)
F6: the set-membership assertion (in {closed, affix_derivation}) stayed green
if firefighter regressed to affix_derivation — the exact silent drop the test
claims to prevent. Assert == 'closed' exactly (firefighter is a named
closed-compound example in the research doc / Polymarket PDF); killjoy and
wildfire already assert == 'closed' in test_classify_closed_candidate.
…eview round 1)
F5: an out-of-enum compound_type (e.g. 'Standalone') vanished from ALL tallies (not auto-counted, not a closed candidate), so a true count of 5 resolved resolved_no at threshold 5 silently. _compound_type now validates against COMPOUND_TYPE_VALUES and raises ValueError on any unknown non-null value (applied in kalshi_boolean_settles, polymarket_count, polymarket_threshold_met, closed_candidate_count, resolve_status, and at build_fact_rows write time). None/missing still defaults to standalone (back-compat with pre-fix rows).
…view round 1)
F1 (CRITICAL): the closed-candidate fail-loud path had no production producer: _count_final still used count_mentions, FactDelta carried no compound_type, and build_fact_rows defaulted absent keys to standalone, so 'firefighter' for term 'fire' produced NO closed candidate anywhere and a Polymarket closed-straddle still resolved resolved_no silently.
- _count_final now runs classify_mentions and aggregates ONE delta per (term, compound_type); word-boundary occurrences tally identically to the old path, just split per type; closed compounds now produce candidate deltas.
- FactDelta gains compound_type (additive, default 'standalone'; existing SSE consumers and persisted deltas unaffected) + to_stt_count() as the propagation seam into build_fact_rows occurrence records.
- End-to-end test: streaming closed compound -> closed delta -> fact rows -> polymarket 'disputed' at threshold 1 (never silent resolved_no); Kalshi still resolves no.
No core schema change -> no export regen needed (earnings_fact.v1 already carries the nullable compound_type column).
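The per-type aggregation in F1 can be sketched roughly as below. `FactDeltaSketch` and `aggregate_deltas` are hypothetical stand-ins with only the fields needed here; the real FactDelta carries more, and the occurrence records are assumed to be dicts with a `compound_type` key.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class FactDeltaSketch:
    term: str
    count: int
    compound_type: str = "standalone"  # additive default, as in the commit


def aggregate_deltas(term: str, occurrences: list) -> list:
    """One delta per (term, compound_type); types are never mixed in a delta."""
    tally = Counter(o["compound_type"] for o in occurrences)
    return [FactDeltaSketch(term, n, ctype) for ctype, n in sorted(tally.items())]


deltas = aggregate_deltas(
    "fire",
    [
        {"surface": "fire", "start": 0, "compound_type": "standalone"},
        {"surface": "firefighter", "start": 20, "compound_type": "closed"},
        {"surface": "fire", "start": 40, "compound_type": "standalone"},
    ],
)
```

Here the two standalone hits stay in one auto-countable delta while firefighter becomes its own closed delta, which is what lets the downstream status check see a candidate at all.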
…y (review round 2)
R2-F1 (CRITICAL): _PASSTHROUGH_FIELDS stripped compound_type at the SDK's canonical read boundary: both _project_row (hosted /facts) and _project_stream_row (live SSE) laundered closed/affix occurrences into standalone auto-counts on BOTH venues, inverting Kalshi's closed-compound No and bypassing the fail-loud Polymarket review for every downstream consumer.
- Add compound_type to _PASSTHROUGH_FIELDS (projected only-when-present: transcript rows + pre-fix payloads unaffected).
- _compound_type treats float NaN as missing (default standalone): a MIXED old/new frame round-tripped through from_rows -> to_dict fills NaN for pre-fix rows; that is missing, not an out-of-enum value.
- Tests: closed row survives BOTH projection paths and still resolves Kalshi no / Polymarket disputed; pre-fix payloads project cleanly; mixed frames resolve NaN-safe.
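Taken together, round-1 F5 and the NaN rule above suggest a validator along these lines. This is a hedged sketch assuming dict rows; `compound_type_of` is a hypothetical name, not the repository's `_compound_type`.

```python
import math

COMPOUND_TYPE_VALUES = (
    "standalone", "open", "hyphenated", "closed", "affix_derivation",
)


def compound_type_of(row: dict) -> str:
    """Return the row's compound_type, failing loud on out-of-enum values."""
    value = row.get("compound_type")
    if value is None:
        return "standalone"  # pre-fix rows omit the nullable column
    if isinstance(value, float) and math.isnan(value):
        return "standalone"  # NaN filled by a mixed old/new frame is missing
    if value not in COMPOUND_TYPE_VALUES:
        raise ValueError(f"unknown compound_type: {value!r}")
    return value
```

The point of raising rather than defaulting is that a mis-cased value such as 'Standalone' would otherwise fall out of every tally without any signal.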
…s (review round 2)
R2-F2: round-1 F3 overcorrected: excluding acronym forms from pass 2 entirely silently dropped plausible acronym compounds: 'GenAI'/'OpenAI' for term 'AI' produced zero rows, so a Polymarket threshold-1 market resolved resolved_no instead of surfacing a closed-candidate review. Acronym forms (already case_sensitive=True) now flow through the round-1 case-sensitive substring path: the case-preserved 'AI' inside GenAI/OpenAI is a closed candidate (review-only, never auto-counts); lowercase 'social'/'said' never match.
… spoken_at (review round 2)
R2-F3: to_stt_count copied the engine-relative float spoken_at (12.5 s into stream) verbatim; build_fact_rows wrote it to the timestamp_utc spoken_at column and pyarrow silently coerced float->us-after-epoch, persisting 1970-01-01 00:00:00.000012+00:00 as the temporal audit marker on every seam-built row.
- to_stt_count maps the float into offset_seconds (the engine-relative int audit field build_fact_rows already accepts) and never emits spoken_at.
- build_fact_rows gains _validated_spoken_at: None passes (nullable column); tz-aware datetime passes through; float/int/string/naive datetime raises, the fail-loud write seam, consistent with the SSE projection path's _assert_live_temporal_contract tz-aware enforcement (documented).
- Tests: to_stt_count emits offset_seconds not spoken_at; float + naive datetime raise; tz-aware wallclock passes; FactLedger round-trip of a seam-built row has NULL spoken_at, no epoch-1970 timestamps.
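The write-seam check described for _validated_spoken_at could be sketched as follows. The exception type is an assumption (the commit only says the value "raises"), and `validated_spoken_at` is a hypothetical stand-in for the private helper.

```python
from datetime import datetime, timezone
from typing import Optional


def validated_spoken_at(value: object) -> Optional[datetime]:
    """Accept None or a tz-aware datetime; reject everything else loudly."""
    if value is None:
        return None  # nullable column
    if isinstance(value, datetime) and value.utcoffset() is not None:
        return value  # tz-aware wallclock passes through unchanged
    # float/int/str/naive datetime: refuse instead of letting the writer
    # coerce an engine-relative offset into a 1970-epoch timestamp.
    raise ValueError(f"spoken_at must be a tz-aware datetime, got {value!r}")


wallclock = datetime(2025, 1, 1, 10, 0, tzinfo=timezone.utc)
accepted = validated_spoken_at(wallclock)
```

Rejecting strings outright, rather than parsing them here, keeps this seam strict; parsing is left to whichever boundary owns the wire format.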
…review round 3)
R3-F1 (CRITICAL): pass 2 skipped any token containing '-' ('hyphenated handled
in pass 1') — but pass 1 only matches the term as a distinct hyphen-separated
ELEMENT (fire-related); it cannot match 'fire' FUSED inside the component
'wildfire' of 'wildfire-related'. 'wildfire-related costs' for term 'fire' at
Polymarket threshold 1 emitted nothing -> auto=0, closed=0 -> resolved_no
instead of disputed — the exact silent under-count class this fix targets.
Pass 2 now splits a hyphenated token into components and scans EACH exactly
like an unhyphenated word (same case rules: case-blind normal forms,
case-preserved acronym/case-sensitive forms):
- a component exactly equal to the surface (under the form's case rule) is
pass 1's boundary-match territory — skipped explicitly (covered-span
overlap check also guards; both exercised by tests);
- fused matches classify via _closed_or_affix on the COMPONENT, not the whole
token (joyful-sounding for joy -> affix_derivation);
- absolute offsets account for the component's position inside the token so
covered-span bookkeeping stays correct.
Tests: wildfire-related/firefighter-led -> closed; OpenAI-based -> closed
(case-preserved); pre-tariff + state-of-the-art -> exactly one hyphenated
occurrence, no pass-2 duplicate; joyful-sounding -> affix_derivation; count
conservation across fire/fire-related/wildfire/wildfire-related (2 auto +
2 closed, no overlapping spans). Pass-2 comment + docstrings updated.
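The component scan for hyphenated tokens can be sketched as below. It covers only the case-blind rule for normal forms and reports at most one fused match per component; `scan_hyphenated_token` is a hypothetical name, and the real pass 2 additionally applies case-preserved rules and covered-span checks.

```python
def scan_hyphenated_token(token: str, token_start: int, surface: str) -> list:
    """Return (component, absolute_start) for each fused match in the token."""
    if not surface:
        raise ValueError("empty surface")  # degenerate terms fail loud
    needle = surface.lower()
    hits = []
    offset = 0  # position of the current component inside the token
    for component in token.split("-"):
        lowered = component.lower()
        # A component equal to the surface is word-boundary (pass 1) territory.
        if lowered != needle:
            idx = lowered.find(needle)
            if idx != -1:
                hits.append((component, token_start + offset + idx))
        offset += len(component) + 1  # +1 skips the hyphen
    return hits
```

For wildfire-related starting at offset 10, the match on fire is reported at absolute offset 14, while fire-related and pre-tariff produce no pass-2 hit because their matching component is an exact element.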
… SSE wire (review round 4)
_fact_payload used asdict(delta) verbatim, leaking spoken_at (an ENGINE-RELATIVE float) labeled as the schema's tz-aware wallclock column. That reopened on the live wire the 1970-epoch coercion path R2-F3 closed in to_stt_count(). The payload now carries offset_seconds (int) and omits spoken_at entirely.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… (review round 4)
_assert_live_temporal_contract no-opped when either temporal field was absent, so a fact_delta carrying only a malformed spoken_at (engine-relative float or naive string) passed silently into the row. Each PRESENT field is now independently required to parse tz-aware; the ordering check still runs when both are present.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
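A per-field version of the temporal contract might look like the sketch below. Several details are assumptions: the second field is taken to be knowledge_time, the ordering rule is taken to be spoken_at <= knowledge_time, a None value is treated as absent, and `LiveStreamError` is redefined locally so the block runs on its own.

```python
from datetime import datetime


class LiveStreamError(Exception):
    """Local stand-in for the consumer's error type."""


def _parse_tz_aware(name: str, value: object) -> datetime:
    if not isinstance(value, str):
        raise LiveStreamError(f"{name} must be an ISO-8601 string, got {value!r}")
    try:
        parsed = datetime.fromisoformat(value)
    except ValueError as exc:
        raise LiveStreamError(f"{name} is not ISO-8601: {value!r}") from exc
    if parsed.utcoffset() is None:
        raise LiveStreamError(f"{name} is naive (no UTC offset): {value!r}")
    return parsed


def assert_live_temporal_contract(payload: dict) -> None:
    # Validate each present field on its own; absence of one no longer
    # short-circuits the check on the other.
    parsed = {
        name: _parse_tz_aware(name, payload[name])
        for name in ("spoken_at", "knowledge_time")
        if payload.get(name) is not None
    }
    # Assumed ordering rule, applied only when both fields are present.
    if len(parsed) == 2 and parsed["spoken_at"] > parsed["knowledge_time"]:
        raise LiveStreamError("spoken_at is later than knowledge_time")
```

A payload carrying only `{"spoken_at": 12.5}` now raises instead of passing through, which is the single-sided case this commit closes.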
… boundary (review round 4)
The ledger re-derived kalshi_counted but persisted compound_type verbatim: an out-of-enum value would be durably written, served by /facts, and vanish from every venue tally downstream. The durable write now refuses out-of-enum compound_type (None passes: pre-fix rows omit the nullable column).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
✅ Docs-required check: PASS. API-surface change includes docs updates; no reminder needed.
Parity ticket gate: PASSED
Summary
Settlement-correctness fix for the earnings-mention venue rules (d30 / quick-260702-d30). An inverted guard dropped hyphenated compounds (e.g. supply-chain for chain) from BOTH venue counts, under-counting Kalshi KXEARNINGSMENTION and Polymarket mention markets.
Counting fix + compound axis
- compound_type axis on schema.earnings_fact.v1 (standalone / open / hyphenated / closed / affix_derivation; additive, nullable; pre-fix rows default to standalone semantics). Closed compounds (wildfire for fire) are Kalshi-No + Polymarket human-review candidates; affix derivations (joyful for joy) count for neither venue.
- … (term, compound_type); acronyms scanned case-sensitively and excluded from the closed-candidate pass; firefighter-class ambiguity locked as closed.
- EXPORT_MANIFEST regenerated through the byte-deterministic drift gate; compound_type projected through the catalog read boundary (stripping it laundered closed/affix occurrences into auto-counts).
Fail-loud boundaries (no fail-open path for the new axis)
- … compound_type raises in every venue tally, at the FactLedger durable-write boundary (round 4), and shared form validation with classify_mentions.
- FactDelta.spoken_at (engine-relative float) maps to offset_seconds in to_stt_count() (round 2) and now also on the SSE wire in _fact_payload (round 4), never as the schema's wallclock spoken_at.
Review history
- … review round 1/2/3).
- … offset_seconds wire mapping; ledger compound_type enum guard). P3 (suffix-chain joyfully classifies closed → conservative disputed, not a silent mis-settle) accepted per severity gate.
Known pre-existing issue (flagged, out of scope)
The transcript_segment SSE path emits Segment.spoken_at / knowledge_time as raw floats (main code, untouched here; the consumer rejected those frames before this branch too). It sits in the deliberately-unwired live path (/stream 404s live calls until the Phase 28 deploy follow-up). Proper fix = wallclock anchoring at capture time, coordinated with the 27-09/27-10 live validation; tracked as a follow-up task.
Test evidence
- … uv run pytest -m "not live", exit 0) on the merged branch; pre-push hook test + TS typecheck green.
- … offset_seconds and never spoken_at; consumer raises LiveStreamError on float and on naive single-sided spoken_at; ledger rejects out-of-enum compound_type before the parquet write (nothing persisted), accepts None + all five canonical values.
- … origin/main (post-PR Phase 28: Hosted GCE Data Platform — serving apps + IaC + SDK hosted surface (platform layer) #92): zero file overlap.
TS Parity
Wire-level fix. The TS stream consumer (earnings_stream.ts) reads spoken_at only as a string and tolerates its absence: no public TS API change; no TS code change required. compound_type already crossed to TS via the regenerated schema export.
Settlement firewall
No changes to research(), merge/*, live/_sources.py, or CWOP registration. Changes are confined to the earnings engine/serving/consumer + schema.earnings_fact.v1.
🤖 Generated with Claude Code
python_only: true. Wire-level + Python-internal fix. The compound_type axis lives on schema.earnings_fact.v1 (already crossed to TS via the PR #89 schema export; not a hand-edited TS generated type). The SSE wire changes (offset_seconds in place of an engine-relative spoken_at float) are already forward-compatible with the TS stream consumer packages-ts/weather/src/_fetchers/earnings_stream.ts, which reads spoken_at as optional (pickString(...) ?? null) and already maps offset_seconds → offsetSeconds. No TS source change required.