Skip to content

Commit 6dea1dc

Browse files
authored
fix(knowledge): send tag filters as a JSON string so the document filter works (#5259)
* fix(knowledge): send tag filters as a JSON string so the document filter works The document-list tag filter never reached the database. The `tagFilters` query field was a Zod `.transform()` that decoded the JSON string into an array of objects; the client's `requestJson` parses the query before serializing, so `appendQuery` received the array and emitted `tagFilters=[object Object]` into the URL. The route then failed to `JSON.parse` it and returned 400, so the list came back empty (or stale via keepPreviousData) regardless of operator or value. - Model `tagFilters` as the wire string it actually is; decode it server-side via a new `parseDocumentTagFiltersParam` helper (route maps a bad value to 400). - Harden `appendQuery`: throw on an array-of-objects query param instead of silently serializing `[object Object]`, so this whole class fails loudly. - Default the text tag-filter operator to `contains` so a partial value matches. - Tests: requestJson serializes the JSON param verbatim + the guard throws; the query schema keeps tagFilters a string; the decode helper round-trips. A full sweep of every GET/DELETE contract query field confirmed this was the only field of this class — logs filters and table filter/sort are unaffected. * fix(knowledge): reject tag-filter operators invalid for the field type Greptile P2: documentTagFilterSchema accepted any non-empty operator string, so an unsupported operator was silently dropped by the query builder instead of returning 400. Validate the operator against the field type's allowed set (single source of truth in filters/types) via superRefine. * fix(knowledge): validate tag-filter type against the slot, not the client claim Greptile P1: operator validation trusted the client-supplied fieldType, so a numeric slot could be sent with fieldType 'text' + 'contains' and slip through to build a text LIKE on a numeric column. Validate against the slot's inherent type via getFieldTypeForSlot (the source of truth): reject unknown slots and fieldType/slot mismatches at the boundary before checking the operator. * fix(knowledge): validate tag-filter values against the field type Greptile P1: value/valueTo were z.unknown(), so a number filter accepted 'abc', a date filter 'not-a-date', etc. — unusable values the query builder then silently dropped. Add a shared isValidFilterValue (single source of truth in filters/types) and reject unusable value/valueTo at the boundary, including the between upper bound. * fix(knowledge): only send a between tag filter once both bounds are set Cursor Bugbot: the strict valueTo validation made a partially-entered between filter (lower bound only) 400 and break the whole document list mid-entry. activeTagFilters now withholds a between row until both bounds are filled — consistent with already requiring the lower bound before sending any filter — so the list keeps loading while the range is being entered. * fix(knowledge): reject impossible calendar dates in tag filters Greptile P1: the date value check was format-only, so 2026-02-30 / 2026-99-99 passed the boundary and then made the document query's ::date cast throw a 500. Validate real calendar dates by round-tripping the parsed parts.
1 parent f5f87de commit 6dea1dc

7 files changed

Lines changed: 385 additions & 26 deletions

File tree

apps/sim/app/api/knowledge/[id]/documents/route.ts

Lines changed: 14 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,7 @@ import {
88
bulkKnowledgeDocumentsContract,
99
createKnowledgeDocumentsContract,
1010
listKnowledgeDocumentsQuerySchema,
11+
parseDocumentTagFiltersParam,
1112
} from '@/lib/api/contracts/knowledge'
1213
import { parseRequest } from '@/lib/api/server'
1314
import { getSession } from '@/lib/auth'
@@ -67,6 +68,18 @@ export const GET = withRouteHandler(
6768
const { enabledFilter, search, limit, offset, sortBy, sortOrder, tagFilters } =
6869
queryResult.data
6970

71+
let parsedTagFilters: TagFilterCondition[] | undefined
72+
try {
73+
parsedTagFilters = parseDocumentTagFiltersParam(tagFilters) as
74+
| TagFilterCondition[]
75+
| undefined
76+
} catch {
77+
return NextResponse.json(
78+
{ error: 'tagFilters must be a valid JSON array' },
79+
{ status: 400 }
80+
)
81+
}
82+
7083
const result = await getDocuments(
7184
knowledgeBaseId,
7285
{
@@ -76,7 +89,7 @@ export const GET = withRouteHandler(
7689
offset,
7790
...(sortBy && { sortBy }),
7891
...(sortOrder && { sortOrder }),
79-
tagFilters: tagFilters as TagFilterCondition[] | undefined,
92+
tagFilters: parsedTagFilters,
8093
},
8194
requestId
8295
)

apps/sim/app/workspace/[workspaceId]/knowledge/[id]/base.tsx

Lines changed: 25 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -264,13 +264,20 @@ export function KnowledgeBase({
264264
const activeTagFilters: DocumentTagFilter[] = useMemo(
265265
() =>
266266
tagFilterEntries
267-
.filter((f) => f.tagSlot && f.value.trim())
267+
.filter((f) => {
268+
if (!f.tagSlot || !f.value.trim()) return false
269+
// A `between` filter only applies once both bounds are set. Sending it
270+
// with just the lower bound would be rejected at the API boundary and
271+
// break the whole list while the user is still entering the range.
272+
if (f.operator === 'between' && !f.valueTo.trim()) return false
273+
return true
274+
})
268275
.map((f) => ({
269276
tagSlot: f.tagSlot,
270277
fieldType: f.fieldType,
271278
operator: f.operator,
272279
value: f.value,
273-
...(f.operator === 'between' && f.valueTo ? { valueTo: f.valueTo } : {}),
280+
...(f.operator === 'between' ? { valueTo: f.valueTo } : {}),
274281
})),
275282
[tagFilterEntries]
276283
)
@@ -1466,11 +1473,25 @@ const createEmptyEntry = (): TagFilterEntry => ({
14661473
tagName: '',
14671474
tagSlot: '',
14681475
fieldType: 'text',
1469-
operator: 'eq',
1476+
operator: 'contains',
14701477
value: '',
14711478
valueTo: '',
14721479
})
14731480

1481+
/**
1482+
* Default operator when a tag is selected. Text filters default to `contains`
1483+
* so typing part of a value finds matches (exact `equals` stays one click away
1484+
* in the operator dropdown); other field types keep their first, equality
1485+
* operator.
1486+
*/
1487+
function getDefaultOperatorForFieldType(
1488+
fieldType: FilterFieldType,
1489+
operators: ReturnType<typeof getOperatorsForFieldType>
1490+
): string {
1491+
if (fieldType === 'text') return 'contains'
1492+
return operators[0]?.value ?? 'eq'
1493+
}
1494+
14741495
interface TagFilterSectionProps {
14751496
tagDefinitions: TagDefinition[]
14761497
entries: TagFilterEntry[]
@@ -1601,7 +1622,7 @@ function TagFilterSection({ tagDefinitions, entries, onChange }: TagFilterSectio
16011622
tagName,
16021623
tagSlot: def?.tagSlot || '',
16031624
fieldType,
1604-
operator: operators[0]?.value || 'eq',
1625+
operator: getDefaultOperatorForFieldType(fieldType, operators),
16051626
value: '',
16061627
valueTo: '',
16071628
})
Lines changed: 71 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,71 @@
1+
/**
2+
* @vitest-environment node
3+
*/
4+
import { afterEach, describe, expect, it, vi } from 'vitest'
5+
import { z } from 'zod'
6+
import { requestJson } from '@/lib/api/client/request'
7+
import { listKnowledgeDocumentsContract } from '@/lib/api/contracts/knowledge'
8+
import { defineRouteContract } from '@/lib/api/contracts/types'
9+
10+
/**
11+
* Captures the URL of the last fetch call and returns a valid JSON response so
12+
* `requestJson`'s response validation passes.
13+
*/
14+
function mockFetchReturning(body: unknown) {
15+
const fetchMock = vi.fn(
16+
async () =>
17+
new Response(JSON.stringify(body), {
18+
status: 200,
19+
headers: { 'content-type': 'application/json' },
20+
})
21+
)
22+
vi.stubGlobal('fetch', fetchMock)
23+
return fetchMock
24+
}
25+
26+
describe('requestJson query serialization', () => {
27+
afterEach(() => {
28+
vi.unstubAllGlobals()
29+
})
30+
31+
it('serializes a JSON-string query param verbatim (regression: tagFilters)', async () => {
32+
const fetchMock = mockFetchReturning({
33+
success: true,
34+
data: {
35+
documents: [],
36+
pagination: { total: 0, limit: 50, offset: 0, hasMore: false },
37+
},
38+
})
39+
40+
const tagFilters = JSON.stringify([
41+
{ tagSlot: 'tag1', fieldType: 'text', operator: 'contains', value: 'Ada Lovelace' },
42+
])
43+
44+
await requestJson(listKnowledgeDocumentsContract, {
45+
params: { id: 'kb-1' },
46+
query: { tagFilters },
47+
})
48+
49+
const calledUrl = String(fetchMock.mock.calls[0][0])
50+
const url = new URL(calledUrl, 'https://example.test')
51+
// The param must round-trip as the exact JSON, never "[object Object]".
52+
expect(url.searchParams.get('tagFilters')).toBe(tagFilters)
53+
expect(calledUrl).not.toContain('object+Object')
54+
expect(calledUrl).not.toContain('[object Object]')
55+
})
56+
57+
it('throws instead of silently corrupting an array-of-objects query param', async () => {
58+
mockFetchReturning({ ok: true })
59+
60+
const badContract = defineRouteContract({
61+
method: 'GET',
62+
path: '/api/test',
63+
query: z.object({ items: z.array(z.object({ a: z.string() })) }),
64+
response: { mode: 'json', schema: z.object({ ok: z.boolean() }) },
65+
})
66+
67+
await expect(requestJson(badContract, { query: { items: [{ a: 'x' }] } })).rejects.toThrow(
68+
/arrays of objects are not URL-safe/
69+
)
70+
})
71+
})

apps/sim/lib/api/client/request.ts

Lines changed: 12 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -68,9 +68,19 @@ function appendQuery(path: string, query: unknown): string {
6868

6969
if (Array.isArray(value)) {
7070
for (const item of value) {
71-
if (item !== undefined && item !== null && item !== '') {
72-
searchParams.append(key, String(item))
71+
if (item === undefined || item === null || item === '') continue
72+
// A non-scalar in a query array would stringify to "[object Object]" and
73+
// silently corrupt the request. Encode such values as a single JSON
74+
// string param and decode them server-side instead. Failing loudly here
75+
// keeps the boundary honest (this is how the knowledge tagFilters bug
76+
// shipped undetected).
77+
if (typeof item === 'object') {
78+
throw new Error(
79+
`Cannot serialize query param "${key}": arrays of objects are not URL-safe — ` +
80+
'encode the value as a JSON string param and decode it server-side.'
81+
)
7382
}
83+
searchParams.append(key, String(item))
7484
}
7585
continue
7686
}
Lines changed: 145 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,145 @@
1+
/**
2+
* @vitest-environment node
3+
*/
4+
import { describe, expect, it } from 'vitest'
5+
import {
6+
listKnowledgeDocumentsQuerySchema,
7+
parseDocumentTagFiltersParam,
8+
} from '@/lib/api/contracts/knowledge/documents'
9+
10+
describe('listKnowledgeDocumentsQuerySchema.tagFilters', () => {
11+
it('keeps tagFilters a raw string (must NOT transform to an array)', () => {
12+
// A transform-to-array here breaks requestJson outbound serialization
13+
// (the array serializes as "[object Object]"). The wire type must stay a
14+
// string; decoding happens server-side via parseDocumentTagFiltersParam.
15+
const tagFilters = JSON.stringify([
16+
{ tagSlot: 'tag1', fieldType: 'text', operator: 'contains', value: 'x' },
17+
])
18+
const parsed = listKnowledgeDocumentsQuerySchema.parse({ tagFilters })
19+
expect(parsed.tagFilters).toBe(tagFilters)
20+
expect(typeof parsed.tagFilters).toBe('string')
21+
})
22+
})
23+
24+
describe('parseDocumentTagFiltersParam', () => {
25+
it('returns undefined for an absent param', () => {
26+
expect(parseDocumentTagFiltersParam(undefined)).toBeUndefined()
27+
expect(parseDocumentTagFiltersParam('')).toBeUndefined()
28+
})
29+
30+
it('decodes a valid JSON array of filters', () => {
31+
const filters = [
32+
{ tagSlot: 'tag1', fieldType: 'text', operator: 'contains', value: 'x' },
33+
{ tagSlot: 'date1', fieldType: 'date', operator: 'eq', value: '2026-04-21' },
34+
]
35+
expect(parseDocumentTagFiltersParam(JSON.stringify(filters))).toEqual(filters)
36+
})
37+
38+
it('throws on malformed JSON', () => {
39+
expect(() => parseDocumentTagFiltersParam('[object Object]')).toThrow()
40+
expect(() => parseDocumentTagFiltersParam('{not json')).toThrow()
41+
})
42+
43+
it('throws when the shape is wrong', () => {
44+
expect(() => parseDocumentTagFiltersParam(JSON.stringify([{ tagSlot: '' }]))).toThrow()
45+
})
46+
47+
it('rejects an operator that is not valid for the field type', () => {
48+
// unknown operator
49+
expect(() =>
50+
parseDocumentTagFiltersParam(
51+
JSON.stringify([{ tagSlot: 'tag1', fieldType: 'text', operator: 'bogus', value: 'x' }])
52+
)
53+
).toThrow()
54+
// valid operator name, wrong field type (contains is text-only)
55+
expect(() =>
56+
parseDocumentTagFiltersParam(
57+
JSON.stringify([
58+
{ tagSlot: 'number1', fieldType: 'number', operator: 'contains', value: '1' },
59+
])
60+
)
61+
).toThrow()
62+
})
63+
64+
it('rejects a fieldType that does not match the tag slot', () => {
65+
// number1 is a numeric column; claiming it is text must fail
66+
expect(() =>
67+
parseDocumentTagFiltersParam(
68+
JSON.stringify([
69+
{ tagSlot: 'number1', fieldType: 'text', operator: 'contains', value: 'x' },
70+
])
71+
)
72+
).toThrow()
73+
})
74+
75+
it('rejects an unknown tag slot', () => {
76+
expect(() =>
77+
parseDocumentTagFiltersParam(
78+
JSON.stringify([{ tagSlot: 'tag99', fieldType: 'text', operator: 'eq', value: 'x' }])
79+
)
80+
).toThrow()
81+
})
82+
83+
it('rejects values that are unusable for the field type', () => {
84+
// non-numeric value on a number field
85+
expect(() =>
86+
parseDocumentTagFiltersParam(
87+
JSON.stringify([{ tagSlot: 'number1', fieldType: 'number', operator: 'eq', value: 'abc' }])
88+
)
89+
).toThrow()
90+
// non-date value on a date field
91+
expect(() =>
92+
parseDocumentTagFiltersParam(
93+
JSON.stringify([{ tagSlot: 'date1', fieldType: 'date', operator: 'eq', value: 'nope' }])
94+
)
95+
).toThrow()
96+
// well-formed but impossible calendar dates (would 500 on ::date)
97+
for (const value of ['2026-02-30', '2026-99-99', '2026-13-01', '2026-00-10']) {
98+
expect(() =>
99+
parseDocumentTagFiltersParam(
100+
JSON.stringify([{ tagSlot: 'date1', fieldType: 'date', operator: 'eq', value }])
101+
)
102+
).toThrow()
103+
}
104+
// non-boolean value on a boolean field
105+
expect(() =>
106+
parseDocumentTagFiltersParam(
107+
JSON.stringify([
108+
{ tagSlot: 'boolean1', fieldType: 'boolean', operator: 'eq', value: 'maybe' },
109+
])
110+
)
111+
).toThrow()
112+
})
113+
114+
it('rejects a between filter missing a usable upper bound', () => {
115+
expect(() =>
116+
parseDocumentTagFiltersParam(
117+
JSON.stringify([
118+
{
119+
tagSlot: 'number1',
120+
fieldType: 'number',
121+
operator: 'between',
122+
value: '1',
123+
valueTo: 'x',
124+
},
125+
])
126+
)
127+
).toThrow()
128+
})
129+
130+
it('accepts a valid number, date, boolean, and between filter', () => {
131+
const filters = [
132+
{ tagSlot: 'number1', fieldType: 'number', operator: 'gte', value: '42' },
133+
{ tagSlot: 'date1', fieldType: 'date', operator: 'eq', value: '2026-04-21' },
134+
{ tagSlot: 'boolean1', fieldType: 'boolean', operator: 'eq', value: 'true' },
135+
{
136+
tagSlot: 'number2',
137+
fieldType: 'number',
138+
operator: 'between',
139+
value: '1',
140+
valueTo: '10',
141+
},
142+
]
143+
expect(parseDocumentTagFiltersParam(JSON.stringify(filters))).toEqual(filters)
144+
})
145+
})

0 commit comments

Comments
 (0)