
feat: Feast First-Class LabelView Implementation #6292

Open
ntkathole wants to merge 1 commit into feast-dev:master from ntkathole:labelView

Member

@ntkathole ntkathole commented Apr 17, 2026

What this PR does / why we need it:

RFC - https://docs.google.com/document/d/1mcJM0sHeBk269oMIsLsLnAlrMtgnxn5Pws0oEeogc48/edit?usp=sharing

Summary

Introduces LabelView, a new alpha-stage Feast primitive that manages mutable labels and annotations separately from immutable feature data stored in regular FeatureViews. This supports multi-labeler workflows (human reviewers, automated safety scanners, reward models) where different sources independently write labels for the same entity keys.

LabelView is fully opt-in — existing workflows are unaffected unless a user explicitly defines one.

What's included

Core implementation (feast.labeling module)

  • LabelView class inheriting from BaseFeatureView, with label-specific attributes: labeler_field, conflict_policy, retain_history, reference_feature_view
  • ConflictPolicy enum (LAST_WRITE_WINS, LABELER_PRIORITY, MAJORITY_VOTE)
  • Protobuf definition (LabelView.proto) with LabelViewSpec, LabelViewMeta, and ConflictResolutionPolicy
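To make the shape of these attributes concrete, here is a minimal standalone sketch. It does not import Feast: `LabelViewSketch`, the enum string values, and every default shown are assumptions for illustration. Only the four attribute names and the three enum members come from the list above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ConflictPolicy(Enum):
    """Members named in this PR; the string values are assumed."""

    LAST_WRITE_WINS = "last_write_wins"
    LABELER_PRIORITY = "labeler_priority"
    MAJORITY_VOTE = "majority_vote"


@dataclass
class LabelViewSketch:
    """Hypothetical stand-in for LabelView, not the real class."""

    name: str
    labeler_field: str = "labeler"  # assumed default
    conflict_policy: ConflictPolicy = ConflictPolicy.LAST_WRITE_WINS  # assumed default
    retain_history: bool = False  # assumed default
    reference_feature_view: Optional[str] = None  # assumed to hold a view name


view = LabelViewSketch(
    name="safety_labels",
    conflict_policy=ConflictPolicy.MAJORITY_VOTE,
)
```

The real class inherits from BaseFeatureView and carries a schema, entities, and a source, all of which this sketch leaves out.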

Full SDK integration

  • FeatureStore.apply() — registers LabelViews, auto-registers their PushSource, validates name uniqueness across all view types
  • FeatureStore.push() — routes push data to LabelView when its PushSource matches
  • FeatureStore.get_historical_features() — works via batch_source/stream_source properties on LabelView
  • FeatureStore.teardown() — includes LabelView tables
  • FeatureStore.list_label_views() / get_label_view() — new public API methods
  • FeatureService composability — LabelViews can be included alongside regular feature views
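The push routing described above can be pictured roughly as follows. This is a simplified sketch, not the code in this PR: `route_push` and the two name-to-source dictionaries are hypothetical, and the real implementation works on registry objects rather than plain strings.

```python
from typing import Dict, List, Tuple


def route_push(
    push_source_name: str,
    feature_view_sources: Dict[str, str],
    label_view_sources: Dict[str, str],
) -> List[Tuple[str, str]]:
    """Return (view_kind, view_name) pairs fed by the given push source.

    Each dictionary maps a view name to the name of its push source.
    """
    targets: List[Tuple[str, str]] = []
    groups = (
        ("feature_view", feature_view_sources),
        ("label_view", label_view_sources),
    )
    for kind, sources in groups:
        for view_name, source_name in sources.items():
            if source_name == push_source_name:
                targets.append((kind, view_name))
    return targets


targets = route_push(
    "label_push",
    feature_view_sources={"driver_stats": "driver_push"},
    label_view_sources={"safety_labels": "label_push"},
)
```

Here the push lands only on the label view, because only its source name matches.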

Registry support (all backends)

  • File registry — full CRUD, apply_materialization, name-conflict checks
  • SQL registry — label_views table, _infer_fv_table/_infer_fv_classes support, proto() builder
  • Remote (gRPC) registry — label_view arm added to ApplyFeatureViewRequest oneof, full client + server wiring
  • Snowflake registry — DDL for LABEL_VIEWS table, correct column-name mapping in delete_feature_view

CLI

  • feast label-views list — lists all label views with name, entities, conflict policy
  • feast label-views describe <name> — shows full label view details
  • feast feature-views list unchanged (label views are only in their dedicated command)

Permissions

  • LABEL_VIEW = 11 added to Permission.proto Type enum
  • _PERMISSION_TYPES map updated for from_proto/to_proto roundtrip

Regression safety

  • LabelViews are excluded from feast materialize / materialize-incremental (labels come in via push(), not batch materialization); explicit error if attempted by name
  • No changes to existing feature-views list output
  • All registry proto() builders include label_views so cached/exported blobs are complete
  • Cross-type name uniqueness enforced via _ensure_feature_view_name_is_unique
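Two of these guards, cross-type name uniqueness and the exclusion from materialization, can be sketched in plain Python. The function names, the dictionary-based registry, and the error messages are assumptions made for illustration; they are not the `_ensure_feature_view_name_is_unique` implementation.

```python
from typing import Dict, List, Optional, Set

Registry = Dict[str, Set[str]]  # view type -> registered names


def ensure_name_is_unique(name: str, new_type: str, registry: Registry) -> None:
    """Reject a name already used by a view of a different type."""
    for view_type, names in registry.items():
        if view_type != new_type and name in names:
            raise ValueError(f"{name!r} is already registered as a {view_type}")


def select_for_materialize(
    requested: Optional[List[str]], registry: Registry
) -> List[str]:
    """Pick views to materialize, never including label views."""
    if requested is None:
        # No names given: materialize every feature view, skip label views.
        return sorted(registry["feature_view"])
    for name in requested:
        if name in registry["label_view"]:
            # Asking for a label view by name is an explicit error.
            raise ValueError(f"{name!r} is a label view; ingest labels via push()")
    return [name for name in requested if name in registry["feature_view"]]


registry: Registry = {
    "feature_view": {"driver_stats"},
    "label_view": {"safety_labels"},
}
selected = select_for_materialize(None, registry)
```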

Documentation

  • Docstrings with [Alpha] tags and Google-style Args blocks on all public classes/methods
  • Sphinx/RST API reference entries for LabelView and ConflictPolicy
  • Concept guide (docs/getting-started/concepts/label-view.md) with usage examples, conflict policies, and alpha limitations
  • Alpha limitations clearly documented: conflict_policy stored but not enforced at read time (requires online-store query-path changes); retain_history stored but not enforced at write time (requires online-store write-path changes); batch materialization not supported

Tests

  • 30 unit tests covering creation, defaults, proto roundtrip, copy/equality, validation, FeatureService composition, registry proto roundtrip, batch_source/stream_source properties, name-conflict detection, and MaterializationTask acceptance

Alpha Limitations

  • conflict_policy — persisted in registry metadata but not enforced during get_online_features. Online store returns last-written row regardless of policy.
  • retain_history — persisted but not acted on. Online store always overwrites previous value.
  • Batch materialization — not supported; labels ingested via FeatureStore.push() only.
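To show why the unenforced conflict_policy matters, the sketch below resolves the same three label writes under each policy. The resolution rules are assumptions inferred from the policy names (including the tie-breaks), `LabelRow` is a hypothetical record type, and none of this code exists in Feast's read path today.

```python
from collections import Counter
from typing import List, NamedTuple, Sequence


class LabelRow(NamedTuple):
    labeler: str
    label: str
    ts: int  # write time; larger means more recent


def last_write_wins(rows: Sequence[LabelRow]) -> str:
    """Return the label from the most recent write."""
    return max(rows, key=lambda r: r.ts).label


def labeler_priority(rows: Sequence[LabelRow], priority: List[str]) -> str:
    """Return the label from the highest-ranked labeler (latest write on ties)."""
    rank = {name: i for i, name in enumerate(priority)}
    best = min(rows, key=lambda r: (rank.get(r.labeler, len(rank)), -r.ts))
    return best.label


def majority_vote(rows: Sequence[LabelRow]) -> str:
    """Return the most common label; ties go to the most recent write."""
    counts = Counter(r.label for r in rows)
    top = max(counts.values())
    tied = {label for label, n in counts.items() if n == top}
    return max((r for r in rows if r.label in tied), key=lambda r: r.ts).label


rows = [
    LabelRow("human", "safe", 1),
    LabelRow("scanner", "safe", 2),
    LabelRow("reward_model", "unsafe", 3),
]
lww = last_write_wins(rows)
vote = majority_vote(rows)
ranked = labeler_priority(rows, ["human", "scanner", "reward_model"])
```

With these rows, last-write-wins yields "unsafe" while majority vote and a human-first priority both yield "safe". Under the alpha behavior, the online store would return the last-written value regardless of which policy is configured.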

Test Plan

  • All 30 test_label_view.py unit tests pass
  • All 5 test_registry_diff.py tests pass (including test_diff_registry_objects_permissions)
  • No linter errors introduced
  • Existing feature view / entity / feature service workflows unaffected when no LabelView is defined
  • Manual: feast apply with a repo containing a LabelView definition
  • Manual: feast label-views list / feast label-views describe
  • Manual: FeatureStore.push() to a LabelView's PushSource

Which issue(s) this PR fixes:

#5456

@ntkathole ntkathole self-assigned this Apr 17, 2026
@ntkathole ntkathole force-pushed the labelView branch 2 times, most recently from 0f755c2 to 468ea54 on April 17, 2026 08:31
@ntkathole ntkathole marked this pull request as ready for review April 17, 2026 08:43
@ntkathole ntkathole requested a review from a team as a code owner April 17, 2026 08:43
devin-ai-integration[bot]

This comment was marked as resolved.

@ntkathole ntkathole force-pushed the labelView branch 2 times, most recently from 623acc0 to d8daa1f on April 17, 2026 10:51
devin-ai-integration[bot]

This comment was marked as resolved.

@ntkathole ntkathole force-pushed the labelView branch 2 times, most recently from 85c9875 to 2678090 on April 17, 2026 12:51
devin-ai-integration[bot]

This comment was marked as resolved.

@JiwaniZakir JiwaniZakir left a comment

In docs/getting-started/concepts/label-view.md, the LabelView definition example includes interaction_id both as the entity join key (Entity(join_keys=["interaction_id"])) and as an explicit Field inside the schema list. Standard Feast FeatureView definitions typically exclude join-key columns from the schema since they're inferred from the entity definition — duplicating them here either reflects an inconsistency with how the underlying implementation handles schema validation, or it will silently produce duplicate columns. This should either be removed from the schema to match Feast conventions or explicitly documented as a required pattern specific to LabelView.

On the alpha-limitations transparency: documenting that conflict_policy and retain_history are persisted but not enforced is the right call, but consider surfacing a runtime warning (e.g., a UserWarning on apply() or push()) when a non-default ConflictPolicy or retain_history=True is configured. As written, users who set ConflictPolicy.MAJORITY_VOTE will silently get last-write-wins behavior, which is a subtle correctness hazard that the docs alone may not prevent.
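A rough sketch of what such a warning could look like, using only the standard library. `warn_if_unenforced` is a hypothetical helper, the policy is passed as a plain string for simplicity, and treating LAST_WRITE_WINS as the default is an assumption.

```python
import warnings


def warn_if_unenforced(conflict_policy: str, retain_history: bool) -> None:
    """Warn when a LabelView setting is stored but not yet enforced (alpha)."""
    if conflict_policy != "LAST_WRITE_WINS":  # assumed default policy
        warnings.warn(
            f"conflict_policy={conflict_policy} is stored but not enforced; "
            "reads return the last-written row.",
            UserWarning,
            stacklevel=2,
        )
    if retain_history:
        warnings.warn(
            "retain_history=True is stored but not enforced; "
            "writes overwrite the previous value.",
            UserWarning,
            stacklevel=2,
        )


# Capture the warnings so the example can inspect them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_if_unenforced("MAJORITY_VOTE", retain_history=True)
```

Calling this from apply() rather than push() would warn once per deployment instead of once per write, which is probably less noisy.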

@ntkathole
Member Author

In docs/getting-started/concepts/label-view.md, the LabelView definition example includes interaction_id both as the entity join key (Entity(join_keys=["interaction_id"])) and as an explicit Field inside the schema list. Standard Feast FeatureView definitions typically exclude join-key columns from the schema since they're inferred from the entity definition — duplicating them here either reflects an inconsistency with how the underlying implementation handles schema validation, or it will silently produce duplicate columns. This should either be removed from the schema to match Feast conventions or explicitly documented as a required pattern specific to LabelView.

Thanks for the review, @JiwaniZakir. Could you please re-confirm? This is how a regular feature view works; it isn't duplication.

On the alpha-limitations transparency: documenting that conflict_policy and retain_history are persisted but not enforced is the right call, but consider surfacing a runtime warning (e.g., a UserWarning on apply() or push()) when a non-default ConflictPolicy or retain_history=True is configured. As written, users who set ConflictPolicy.MAJORITY_VOTE will silently get last-write-wins behavior, which is a subtle correctness hazard that the docs alone may not prevent.

Valid point, but the next PR will add support for conflict_policy and retain_history, along with the required online-store changes.

@JiwaniZakir

The Python 3.12 worker crash in test_e2e_local.py is worth investigating before merging — a gw6 worker crash typically signals a segfault or unhandled exception during collection or teardown, which could be unrelated to LabelView itself but should be confirmed. Can you share the full traceback from that run? Also, given the retain_history and conflict_policy attributes on LabelView, it would be worth confirming the batched write tests cover the MAJORITY_VOTE conflict resolution path specifically, since that's the most stateful and error-prone of the three policies.

devin-ai-integration[bot]

This comment was marked as resolved.

Signed-off-by: ntkathole <nikhilkathole2683@gmail.com>