ExistQuery may not be accurate when relying on fast-field #6288

@evance-br

Describe the bug

An ExistQuery such as field:* can return wrong results when it is answered from fast-field storage instead of index_field_presence.
When the fast field normalizer is set to raw, the fast-field path treats the whole string as one token and drops values longer than 255 bytes, so nothing is written to the fast column for those docs.
As a result, field:* misses those documents and NOT field:* matches documents that do have field in the source JSON, while term queries on the same field (e.g. field:"github") still work because they use the inverted index.
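
To make the mismatch concrete, here is a toy Python model of the behaviour described above. It is not Quickwit or tantivy code: the helper names (build_fast_column, build_inverted_index) and the whitespace tokenizer are assumptions made for illustration, and only the 255-byte limit comes from this report.

```python
MAX_RAW_TOKEN_BYTES = 255  # limit described in this issue


def build_fast_column(docs, field):
    """Hypothetical stand-in for the fast-field writer with the raw normalizer."""
    column = {}
    for doc_id, doc in enumerate(docs):
        value = doc.get(field)
        if value is None:
            continue
        # The whole string is one token; oversized tokens are dropped.
        if len(value.encode("utf-8")) > MAX_RAW_TOKEN_BYTES:
            continue
        column[doc_id] = value
    return column


def build_inverted_index(docs, field):
    """Hypothetical inverted index using a simple lowercase/whitespace tokenizer."""
    index = {}
    for doc_id, doc in enumerate(docs):
        for token in doc.get(field, "").lower().split():
            index.setdefault(token, set()).add(doc_id)
    return index


docs = [
    {"field": "github"},                 # short value, kept in the fast column
    {"field": "github " + "x" * 300},    # 307 bytes, dropped from the fast column
    {"other": "no field here"},          # field genuinely absent
]

column = build_fast_column(docs, "field")
index = build_inverted_index(docs, "field")
all_ids = set(range(len(docs)))

exists_ids = set(column)                 # field:* answered from the fast column
not_exists_ids = all_ids - exists_ids    # NOT field:*
term_ids = index.get("github", set())    # field:"github" from the inverted index
```

In this model doc 1 is found by the term query but is treated as missing by the exists query, which is the inconsistency reported here.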

Suggestions:

Instead of dropping these values, we could truncate them. Truncation would break ordering on that fast column, but with the current dropping logic the ordering is already broken.
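
A minimal Python sketch of what truncation could look like, assuming the limit stays at 255 bytes and that the cut must land on a UTF-8 character boundary. The function name truncate_utf8 is hypothetical, and this is not the actual fast-field code path.

```python
MAX_RAW_TOKEN_BYTES = 255  # limit described in this issue


def truncate_utf8(value: str, max_bytes: int = MAX_RAW_TOKEN_BYTES) -> str:
    """Cut value to at most max_bytes of UTF-8 without splitting a character."""
    raw = value.encode("utf-8")
    if len(raw) <= max_bytes:
        return value
    # errors="ignore" discards a trailing partial multi-byte sequence.
    return raw[:max_bytes].decode("utf-8", errors="ignore")


# Two long values that differ only after the 255th byte.
a = "a" * 255 + "b"
b = "a" * 255 + "c"
truncated_a = truncate_utf8(a)
truncated_b = truncate_utf8(b)
```

With truncation, every document keeps an entry in the fast column, so an exists check would see it. The cost is that values sharing the same first 255 bytes, like a and b here, become indistinguishable for sorting.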

Quickwit version

  • Quickwit 0.8.2
  • Latest dockerhub image (quickwit/quickwit:edge-slim-bookworm)

Labels: bug