Commit 950c260

fix(uploads): attach compiled binary for AI-generated docs, not source (#5266)
* fix(uploads): attach compiled binary for AI-generated docs, not source

  AI-generated documents (pdf/docx/pptx/xlsx) created in Chat are stored as their generation source, with the rendered binary in a separate content-addressed artifact store. Read/preview paths swap in the binary, but attachment/upload/provider paths downloaded the raw source, so a generated PDF emailed via Gmail (and 30+ other tools) arrived as the generator script renamed .pdf.

  - Add shared resolveServableDocBytes resolver + downloadServableFileFromStorage wrapper; the file-serve route now delegates to the same resolver so the two paths resolve identically.
  - Migrate ~34 attachment/upload/parse tool routes + the LLM provider attachment path to the servable download; media-only tools and source-editing paths keep the raw download intentionally.
  - Surface a retryable 409 (shared docNotReadyResponse) when a doc artifact is still compiling instead of shipping source.

* fix(uploads): return retryable 409 for not-ready docs in slack/teams sends

  The slack send-message and teams write_channel/write_chat routes call download helpers that can throw DocCompileUserError while a generated doc is still compiling. Map it to the shared docNotReadyResponse 409 (matching the other migrated tool routes) instead of a generic 500. The provider attachment path is internal LLM execution (no HTTP response), so it intentionally propagates the typed error.

* fix(uploads): not-ready 409 for uptimerobot, real MIME for non-doc, xlsx tests

  Address review findings:

  - uptimerobot create-psp/update-psp now map DocCompileUserError to the shared 409 (Greptile + Cursor flagged the gap alongside slack/teams).
  - downloadServableFileFromStorage returns the extension-derived MIME (getMimeTypeFromExtension) for non-doc files instead of an empty string when userFile.type is unset.
  - Add resolveServableDocBytes tests for the three xlsx branches (binary ZIP passthrough, not-ready throw under E2B+beta, no-workspaceId raw passthrough).

* fix(uploads): enforce attachment size limits on resolved bytes

  Size limits were checked against userFile.size (source metadata) before resolution, but a generated doc resolves to a larger compiled binary, so a small-source doc could pass the pre-check yet exceed the service limit. Add a post-resolution check on the actual resolved bytes (mirroring docusign/vanta) across gmail send/draft/edit-draft, smtp, outlook send/draft, telegram, sftp, and teams; the cheap source pre-check stays as an early reject.

* chore(uploads): drop extraneous inline comments from servable-file changes

* fix(sftp): enforce 100MB cap on cumulative resolved bytes, not per-file

  The SFTP batch upload checked each resolved file against the 100MB cap individually, so multiple resolved attachments could each pass while their combined size exceeded the limit. Accumulate resolved bytes across the loop and reject once the running total exceeds the cap.

* fix(sendgrid): reject attachments exceeding SendGrid's 30MB limit on resolved bytes

  SendGrid had no attachment-size guard, so a generated doc resolving to a large compiled binary could be sent and fail opaquely at the API. Add a post-resolution total-size check (30MB, SendGrid's documented message limit) matching the gmail/smtp/outlook routes.
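The size-limit fixes described above (a post-resolution check, and for SFTP a cap on the running total rather than on each file) could be sketched roughly as follows. This is a minimal illustration, not the code from the changed routes: `ResolvedAttachment`, `SizeCheckResult`, and `checkTotalResolvedBytes` are hypothetical names, and the 100MB figure is simply the cap quoted in the commit message.

```typescript
// Hypothetical shape for illustration; not the repository's actual type.
interface ResolvedAttachment {
  name: string
  bytes: Uint8Array // the resolved (compiled) bytes, not the stored source
}

type SizeCheckResult =
  | { ok: true; totalBytes: number }
  | { ok: false; totalBytes: number; offendingFile: string }

/**
 * Accumulates resolved sizes across the batch and fails as soon as the
 * running total exceeds the cap, so several files cannot each pass an
 * individual check while their combined size is over the limit.
 */
function checkTotalResolvedBytes(
  files: ResolvedAttachment[],
  maxTotalBytes: number
): SizeCheckResult {
  let totalBytes = 0
  for (const file of files) {
    totalBytes += file.bytes.length
    if (totalBytes > maxTotalBytes) {
      return { ok: false, totalBytes, offendingFile: file.name }
    }
  }
  return { ok: true, totalBytes }
}

// Example cap, taken from the SFTP figure quoted in the commit message.
const SFTP_MAX_TOTAL_BYTES = 100 * 1024 * 1024
```

A route would typically run a check like this after every attachment has been resolved, keeping the cheaper pre-check on `userFile.size` as an early reject.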
1 parent a6196e0 commit 950c260

51 files changed

Lines changed: 1129 additions & 332 deletions

apps/sim/app/api/files/serve/[...path]/route.ts

Lines changed: 13 additions & 108 deletions
@@ -1,18 +1,14 @@
 import { readFile } from 'fs/promises'
 import { createLogger } from '@sim/logger'
-import { sha256Hex } from '@sim/security/hash'
 import type { NextRequest } from 'next/server'
 import { NextResponse } from 'next/server'
 import { fileServeParamsSchema, fileServeQuerySchema } from '@/lib/api/contracts/storage-transfer'
 import { checkSessionOrInternalAuth } from '@/lib/auth/hybrid'
 import {
   DocCompileUserError,
-  getE2BDocFormat,
-  loadCompiledDocByExt,
+  resolveServableDocBytes,
 } from '@/lib/copilot/tools/server/files/doc-compile'
-import { isE2BDocEnabled } from '@/lib/core/config/env-flags'
 import { withRouteHandler } from '@/lib/core/utils/with-route-handler'
-import { runSandboxTask } from '@/lib/execution/sandbox/run-task'
 import { CopilotFiles, isUsingCloudStorage } from '@/lib/uploads'
 import type { StorageContext } from '@/lib/uploads/config'
 import { parseWorkspaceFileKey } from '@/lib/uploads/contexts/workspace/workspace-file-manager'
@@ -26,47 +22,14 @@ import {
   findLocalFile,
   getContentType,
 } from '@/app/api/files/utils'
-import type { SandboxTaskId } from '@/sandbox-tasks/registry'

 const logger = createLogger('FilesServeAPI')

-const ZIP_MAGIC = Buffer.from([0x50, 0x4b, 0x03, 0x04])
-const PDF_MAGIC = Buffer.from([0x25, 0x50, 0x44, 0x46, 0x2d]) // %PDF-
-
-interface CompilableFormat {
-  magic: Buffer
-  taskId: SandboxTaskId
-  contentType: string
-}
-
-const COMPILABLE_FORMATS: Record<string, CompilableFormat> = {
-  '.pptx': {
-    magic: ZIP_MAGIC,
-    taskId: 'pptx-generate',
-    contentType: 'application/vnd.openxmlformats-officedocument.presentationml.presentation',
-  },
-  '.docx': {
-    magic: ZIP_MAGIC,
-    taskId: 'docx-generate',
-    contentType: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
-  },
-  '.pdf': {
-    magic: PDF_MAGIC,
-    taskId: 'pdf-generate',
-    contentType: 'application/pdf',
-  },
-}
-
-const MAX_COMPILED_DOC_CACHE = 10
-const compiledDocCache = new Map<string, Buffer>()
-
-function compiledCacheSet(key: string, buffer: Buffer): void {
-  if (compiledDocCache.size >= MAX_COMPILED_DOC_CACHE) {
-    compiledDocCache.delete(compiledDocCache.keys().next().value as string)
-  }
-  compiledDocCache.set(key, buffer)
-}
-
+/**
+ * Resolves the bytes + content type to serve for a stored file via the shared
+ * {@link resolveServableDocBytes} (generated docs → compiled artifact). `raw=1`
+ * bypasses resolution and serves the stored source as-is.
+ */
 async function compileDocumentIfNeeded(
   buffer: Buffer,
   filename: string,
@@ -76,71 +39,13 @@ async function compileDocumentIfNeeded(
   signal: AbortSignal | undefined
 ): Promise<{ buffer: Buffer; contentType: string }> {
   if (raw) return { buffer, contentType: getContentType(filename) }
-
-  const ext = filename.slice(filename.lastIndexOf('.')).toLowerCase()
-  const extNoDot = ext.replace(/^\./, '')
-  const format = COMPILABLE_FORMATS[ext]
-
-  // Already a binary file (uploaded or pre-compiled)? Serve as-is.
-  if (format) {
-    const magicLen = format.magic.length
-    if (buffer.length >= magicLen && buffer.subarray(0, magicLen).equals(format.magic)) {
-      return { buffer, contentType: getContentType(filename) }
-    }
-  }
-
-  // .xlsx is a ZIP container with no JS compile path. An uploaded/binary xlsx
-  // must short-circuit here (it isn't in COMPILABLE_FORMATS) — otherwise every
-  // xlsx open would utf-8-decode the whole binary and do an always-miss S3 GET.
-  // Only a Python-source xlsx (UTF-8 text, no ZIP magic) falls through.
-  if (
-    extNoDot === 'xlsx' &&
-    buffer.length >= ZIP_MAGIC.length &&
-    buffer.subarray(0, ZIP_MAGIC.length).equals(ZIP_MAGIC)
-  ) {
-    return { buffer, contentType: getContentType(filename) }
-  }
-
-  // Generated docs render from a content-addressed compiled binary that is built
-  // exactly ONCE per edit_content/create (at write time) and stored in S3. Serve
-  // only LOADS it — it must never compile, or it would re-run E2B on every preview
-  // fetch, including against the incomplete source mid-generation. A hit returns
-  // the (possibly partial) committed doc; a miss in the E2B regime means the doc
-  // is still being generated → 409, and the client polls until the artifact lands.
-  if (workspaceId && (format || extNoDot === 'xlsx')) {
-    const source = buffer.toString('utf-8')
-    // Load the prebuilt artifact directly from S3 (content-addressed). No extra
-    // in-memory layer here: the store is the source of truth, the client (react
-    // query) already caches the bytes, and this branch never recomputes.
-    const stored = await loadCompiledDocByExt(workspaceId, source, extNoDot)
-    if (stored) {
-      return { buffer: stored.buffer, contentType: stored.contentType }
-    }
-
-    if (isE2BDocEnabled && (await getE2BDocFormat(filename))) {
-      // Artifact not built yet (still generating, or the source didn't compile at
-      // write time). Signal "not ready" without compiling — handled as 409.
-      throw new DocCompileUserError('Document is still being generated')
-    }
-  }
-
-  if (!format) return { buffer, contentType: getContentType(filename) }
-
-  // E2B disabled and no stored artifact → compile JS source via isolated-vm.
-  const code = buffer.toString('utf-8')
-  const cacheKey = sha256Hex(`${ext}${code}${workspaceId ?? ''}`)
-  const cached = compiledDocCache.get(cacheKey)
-  if (cached) {
-    return { buffer: cached, contentType: format.contentType }
-  }
-
-  const compiled = await runSandboxTask(
-    format.taskId,
-    { code, workspaceId: workspaceId || '' },
-    { ownerKey, signal }
-  )
-  compiledCacheSet(cacheKey, compiled)
-  return { buffer: compiled, contentType: format.contentType }
+  return resolveServableDocBytes({
+    rawBuffer: buffer,
+    fileName: filename,
+    workspaceId,
+    ownerKey,
+    signal,
+  })
 }

 const STORAGE_KEY_PREFIX_RE = /^\d{13}-[a-z0-9]{7}-/
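
The removed lines above tell an already-compiled binary apart from generation source by comparing the leading bytes against a magic signature (`PK\x03\x04` for ZIP containers, `%PDF-` for PDF). Below is a minimal standalone sketch of that check. The magic values are copied from the removed code; `startsWithMagic` and `looksLikeCompiledDoc` are hypothetical helper names, and whether `resolveServableDocBytes` performs the check in exactly this way is an assumption, since its body is not shown in this part of the diff.

```typescript
// Magic signatures copied from the removed route code.
const ZIP_MAGIC = new Uint8Array([0x50, 0x4b, 0x03, 0x04]) // "PK\x03\x04"
const PDF_MAGIC = new Uint8Array([0x25, 0x50, 0x44, 0x46, 0x2d]) // "%PDF-"

// Hypothetical helper: true when `bytes` begins with every byte of `magic`.
function startsWithMagic(bytes: Uint8Array, magic: Uint8Array): boolean {
  if (bytes.length < magic.length) return false
  return magic.every((value, index) => bytes[index] === value)
}

/**
 * Hypothetical helper: decides from the extension and leading bytes whether a
 * stored file is already a rendered binary (serve as-is) or is more likely
 * generation source that still needs its compiled artifact.
 */
function looksLikeCompiledDoc(fileName: string, bytes: Uint8Array): boolean {
  const dot = fileName.lastIndexOf('.')
  const ext = dot === -1 ? '' : fileName.slice(dot + 1).toLowerCase()
  if (ext === 'pdf') return startsWithMagic(bytes, PDF_MAGIC)
  if (ext === 'docx' || ext === 'pptx' || ext === 'xlsx') {
    return startsWithMagic(bytes, ZIP_MAGIC)
  }
  return false
}
```

For a `.pdf` whose bytes start with plain script text, the check returns false, which is the case the commit routes to the compiled artifact instead of the raw source.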

apps/sim/app/api/tools/a2a/send-message/route.ts

Lines changed: 17 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -17,7 +17,8 @@ import { enforceUserOrIpRateLimit } from '@/lib/core/rate-limiter'
1717
import { generateRequestId } from '@/lib/core/utils/request'
1818
import { withRouteHandler } from '@/lib/core/utils/with-route-handler'
1919
import { processFilesToUserFiles } from '@/lib/uploads/utils/file-utils'
20-
import { downloadFileFromStorage } from '@/lib/uploads/utils/file-utils.server'
20+
import { downloadServableFileFromStorage } from '@/lib/uploads/utils/file-utils.server'
21+
import { docNotReadyResponse } from '@/lib/uploads/utils/servable-file-response'
2122
import { assertToolFileAccess } from '@/app/api/files/authorization'
2223

2324
export const dynamic = 'force-dynamic'
@@ -90,13 +91,19 @@ export const POST = withRouteHandler(async (request: NextRequest) => {
9091
if (denied) return denied
9192
}
9293
files = await Promise.all(
93-
userFiles.map(async (userFile) => ({
94-
bytes: await downloadFileFromStorage(userFile, requestId, logger, {
95-
maxBytes: A2A_MAX_FILE_BYTES,
96-
}),
97-
name: userFile.name,
98-
mediaType: userFile.type || 'application/octet-stream',
99-
}))
94+
userFiles.map(async (userFile) => {
95+
const { buffer, contentType } = await downloadServableFileFromStorage(
96+
userFile,
97+
requestId,
98+
logger,
99+
{ maxBytes: A2A_MAX_FILE_BYTES }
100+
)
101+
return {
102+
bytes: buffer,
103+
name: userFile.name,
104+
mediaType: contentType || userFile.type || 'application/octet-stream',
105+
}
106+
})
100107
)
101108
}
102109

@@ -130,6 +137,8 @@ export const POST = withRouteHandler(async (request: NextRequest) => {
130137
output,
131138
})
132139
} catch (error) {
140+
const notReady = docNotReadyResponse(error)
141+
if (notReady) return notReady
133142
logger.error(`[${requestId}] A2A send failed`, { error: getErrorMessage(error) })
134143
return NextResponse.json({ success: false, error: getErrorMessage(error) }, { status: 502 })
135144
}
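
The catch block above asks `docNotReadyResponse` first and falls back to the route's generic error only when it returns nothing. The helper's body is not part of the diff shown here, so the following is only a sketch of how such a mapper might look. The local `DocCompileUserError` is a stand-in for the real class, and `NotReadyResult`, `mapDocNotReady`, and the `retryable` field are hypothetical; only the 409 status and the error message come from the commit.

```typescript
// Stand-in for the DocCompileUserError exported by the doc-compile module.
class DocCompileUserError extends Error {
  constructor(message: string) {
    super(message)
    this.name = 'DocCompileUserError'
  }
}

// Hypothetical plain-object result; the real helper returns an HTTP response.
interface NotReadyResult {
  status: 409
  body: { success: false; error: string; retryable: true }
}

/**
 * Returns a retryable 409 payload when the error signals that a generated
 * document is still compiling, and null for every other error so the caller
 * can fall through to its own error handling.
 */
function mapDocNotReady(error: unknown): NotReadyResult | null {
  if (!(error instanceof Error) || error.name !== 'DocCompileUserError') return null
  return {
    status: 409,
    body: { success: false, error: error.message, retryable: true },
  }
}
```

Returning null rather than throwing keeps each route's existing status code (502 here, 500 elsewhere) for unrelated failures.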

apps/sim/app/api/tools/agiloft/attach/route.test.ts

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -21,7 +21,7 @@ vi.mock('@/lib/uploads/utils/file-utils', () => ({
2121
processFilesToUserFiles: mockProcessFilesToUserFiles,
2222
}))
2323
vi.mock('@/lib/uploads/utils/file-utils.server', () => ({
24-
downloadFileFromStorage: mockDownloadFileFromStorage,
24+
downloadServableFileFromStorage: mockDownloadFileFromStorage,
2525
}))
2626
vi.mock('@/app/api/files/authorization', () => ({
2727
assertToolFileAccess: mockAssertToolFileAccess,
@@ -77,7 +77,10 @@ beforeEach(() => {
7777
{ key: 's3://bucket/file.txt', name: 'file.txt', size: 5, type: 'text/plain' },
7878
])
7979
mockAssertToolFileAccess.mockResolvedValue(null)
80-
mockDownloadFileFromStorage.mockResolvedValue(Buffer.from('hello'))
80+
mockDownloadFileFromStorage.mockResolvedValue({
81+
buffer: Buffer.from('hello'),
82+
contentType: 'application/octet-stream',
83+
})
8184
})
8285

8386
describe('POST /api/tools/agiloft/attach', () => {

apps/sim/app/api/tools/agiloft/attach/route.ts

Lines changed: 14 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,8 @@ import { generateRequestId } from '@/lib/core/utils/request'
99
import { withRouteHandler } from '@/lib/core/utils/with-route-handler'
1010
import type { RawFileInput } from '@/lib/uploads/utils/file-schemas'
1111
import { processFilesToUserFiles } from '@/lib/uploads/utils/file-utils'
12-
import { downloadFileFromStorage } from '@/lib/uploads/utils/file-utils.server'
12+
import { downloadServableFileFromStorage } from '@/lib/uploads/utils/file-utils.server'
13+
import { docNotReadyResponse } from '@/lib/uploads/utils/servable-file-response'
1314
import { assertToolFileAccess } from '@/app/api/files/authorization'
1415
import { buildAttachFileUrl } from '@/tools/agiloft/utils'
1516
import {
@@ -74,7 +75,18 @@ export const POST = withRouteHandler(async (request: NextRequest) => {
7475

7576
const denied = await assertToolFileAccess(userFile.key, authResult.userId, requestId, logger)
7677
if (denied) return denied
77-
const fileBuffer = await downloadFileFromStorage(userFile, requestId, logger)
78+
79+
let fileBuffer: Buffer
80+
try {
81+
const servable = await downloadServableFileFromStorage(userFile, requestId, logger)
82+
fileBuffer = servable.buffer
83+
} catch (error) {
84+
const notReady = docNotReadyResponse(error)
85+
if (notReady) return notReady
86+
logger.error(`[${requestId}] Failed to download file from storage:`, error)
87+
return NextResponse.json({ success: false, error: toError(error).message }, { status: 500 })
88+
}
89+
7890
const resolvedFileName = data.fileName || userFile.name || 'attachment'
7991

8092
let resolvedIP: string

apps/sim/app/api/tools/box/upload/route.ts

Lines changed: 13 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,8 @@ import { checkInternalAuth } from '@/lib/auth/hybrid'
77
import { generateRequestId } from '@/lib/core/utils/request'
88
import { withRouteHandler } from '@/lib/core/utils/with-route-handler'
99
import { processFilesToUserFiles, type RawFileInput } from '@/lib/uploads/utils/file-utils'
10-
import { downloadFileFromStorage } from '@/lib/uploads/utils/file-utils.server'
10+
import { downloadServableFileFromStorage } from '@/lib/uploads/utils/file-utils.server'
11+
import { docNotReadyResponse } from '@/lib/uploads/utils/servable-file-response'
1112
import { assertToolFileAccess } from '@/app/api/files/authorization'
1213

1314
export const dynamic = 'force-dynamic'
@@ -53,7 +54,17 @@ export const POST = withRouteHandler(async (request: NextRequest) => {
5354

5455
const denied = await assertToolFileAccess(userFile.key, authResult.userId, requestId, logger)
5556
if (denied) return denied
56-
fileBuffer = await downloadFileFromStorage(userFile, requestId, logger)
57+
try {
58+
const result = await downloadServableFileFromStorage(userFile, requestId, logger)
59+
fileBuffer = result.buffer
60+
} catch (error) {
61+
const notReady = docNotReadyResponse(error)
62+
if (notReady) return notReady
63+
return NextResponse.json(
64+
{ success: false, error: getErrorMessage(error, 'Failed to download file') },
65+
{ status: 500 }
66+
)
67+
}
5768
fileName = validatedData.fileName || userFile.name
5869
} else if (validatedData.fileContent) {
5970
logger.info(`[${requestId}] Using legacy base64 content input`)

apps/sim/app/api/tools/brex/upload-receipt/route.test.ts

Lines changed: 9 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -21,7 +21,7 @@ vi.mock('@/lib/uploads/utils/file-utils', () => ({
2121
processFilesToUserFiles: mockProcessFilesToUserFiles,
2222
}))
2323
vi.mock('@/lib/uploads/utils/file-utils.server', () => ({
24-
downloadFileFromStorage: mockDownloadFileFromStorage,
24+
downloadServableFileFromStorage: mockDownloadFileFromStorage,
2525
}))
2626
vi.mock('@/app/api/files/authorization', () => ({
2727
assertToolFileAccess: mockAssertToolFileAccess,
@@ -65,7 +65,10 @@ beforeEach(() => {
6565
{ key: 'uploads/receipt.pdf', name: 'receipt.pdf', size: 5, type: 'application/pdf' },
6666
])
6767
mockAssertToolFileAccess.mockResolvedValue(null)
68-
mockDownloadFileFromStorage.mockResolvedValue(Buffer.from('receipt-bytes'))
68+
mockDownloadFileFromStorage.mockResolvedValue({
69+
buffer: Buffer.from('receipt-bytes'),
70+
contentType: 'application/pdf',
71+
})
6972
})
7073

7174
describe('POST /api/tools/brex/upload-receipt', () => {
@@ -192,7 +195,10 @@ describe('POST /api/tools/brex/upload-receipt', () => {
192195
})
193196

194197
it('rejects files over the 50 MB limit', async () => {
195-
mockDownloadFileFromStorage.mockResolvedValueOnce(Buffer.alloc(50 * 1024 * 1024 + 1))
198+
mockDownloadFileFromStorage.mockResolvedValueOnce({
199+
buffer: Buffer.alloc(50 * 1024 * 1024 + 1),
200+
contentType: 'application/pdf',
201+
})
196202

197203
const response = await POST(createMockRequest('POST', baseBody))
198204
expect(response.status).toBe(400)

apps/sim/app/api/tools/brex/upload-receipt/route.ts

Lines changed: 15 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,8 @@ import {
1111
import { generateRequestId } from '@/lib/core/utils/request'
1212
import { withRouteHandler } from '@/lib/core/utils/with-route-handler'
1313
import { processFilesToUserFiles, type RawFileInput } from '@/lib/uploads/utils/file-utils'
14-
import { downloadFileFromStorage } from '@/lib/uploads/utils/file-utils.server'
14+
import { downloadServableFileFromStorage } from '@/lib/uploads/utils/file-utils.server'
15+
import { docNotReadyResponse } from '@/lib/uploads/utils/servable-file-response'
1516
import { assertToolFileAccess } from '@/app/api/files/authorization'
1617
import { BREX_API_BASE, buildBrexHeaders } from '@/tools/brex/utils'
1718

@@ -48,7 +49,19 @@ export const POST = withRouteHandler(async (request: NextRequest) => {
4849
const denied = await assertToolFileAccess(userFile.key, authResult.userId, requestId, logger)
4950
if (denied) return denied
5051

51-
const fileBuffer = await downloadFileFromStorage(userFile, requestId, logger)
52+
let fileBuffer: Buffer
53+
try {
54+
const resolved = await downloadServableFileFromStorage(userFile, requestId, logger)
55+
fileBuffer = resolved.buffer
56+
} catch (error) {
57+
const notReady = docNotReadyResponse(error)
58+
if (notReady) return notReady
59+
logger.error(`[${requestId}] Failed to download receipt file:`, error)
60+
return NextResponse.json(
61+
{ success: false, error: getErrorMessage(error, 'Unknown error') },
62+
{ status: 500 }
63+
)
64+
}
5265
if (fileBuffer.length > MAX_RECEIPT_SIZE_BYTES) {
5366
return NextResponse.json(
5467
{ success: false, error: 'Receipt file exceeds the 50 MB limit' },

apps/sim/app/api/tools/confluence/upload-attachment/route.ts

Lines changed: 9 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,8 @@ import { checkSessionOrInternalAuth } from '@/lib/auth/hybrid'
77
import { validateAlphanumericId, validateJiraCloudId } from '@/lib/core/security/input-validation'
88
import { withRouteHandler } from '@/lib/core/utils/with-route-handler'
99
import { processSingleFileToUserFile, type RawFileInput } from '@/lib/uploads/utils/file-utils'
10-
import { downloadFileFromStorage } from '@/lib/uploads/utils/file-utils.server'
10+
import { downloadServableFileFromStorage } from '@/lib/uploads/utils/file-utils.server'
11+
import { docNotReadyResponse } from '@/lib/uploads/utils/servable-file-response'
1112
import { assertToolFileAccess } from '@/app/api/files/authorization'
1213
import { getConfluenceCloudId } from '@/tools/confluence/utils'
1314
import { parseAtlassianErrorMessage } from '@/tools/jira/utils'
@@ -91,9 +92,14 @@ export const POST = withRouteHandler(async (request: NextRequest) => {
9192
if (denied) return denied
9293

9394
let fileBuffer: Buffer
95+
let resolvedContentType: string
9496
try {
95-
fileBuffer = await downloadFileFromStorage(userFile, 'confluence-upload', logger)
97+
const servable = await downloadServableFileFromStorage(userFile, 'confluence-upload', logger)
98+
fileBuffer = servable.buffer
99+
resolvedContentType = servable.contentType
96100
} catch (error) {
101+
const notReady = docNotReadyResponse(error)
102+
if (notReady) return notReady
97103
logger.error('Failed to download file from storage:', error)
98104
return NextResponse.json(
99105
{
@@ -104,7 +110,7 @@ export const POST = withRouteHandler(async (request: NextRequest) => {
104110
}
105111

106112
const uploadFileName = fileName || userFile.name || 'attachment'
107-
const mimeType = userFile.type || 'application/octet-stream'
113+
const mimeType = resolvedContentType || userFile.type || 'application/octet-stream'
108114

109115
const url = `https://api.atlassian.com/ex/confluence/${cloudId}/wiki/rest/api/content/${pageId}/child/attachment`
110116
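
The `mimeType` change above prefers the content type of the resolved bytes over the stored metadata, and the commit message adds that non-doc files get an extension-derived MIME when `userFile.type` is unset. Those two fallbacks, folded into one chain, might look roughly like the sketch below. `MIME_BY_EXTENSION` and `pickContentType` are hypothetical, the table is deliberately tiny, and the real `getMimeTypeFromExtension` is not reproduced here.

```typescript
// Hypothetical lookup table for illustration only.
const MIME_BY_EXTENSION: Record<string, string> = {
  pdf: 'application/pdf',
  docx: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
  pptx: 'application/vnd.openxmlformats-officedocument.presentationml.presentation',
  xlsx: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
  txt: 'text/plain',
}

/**
 * Picks a content type in priority order: the type of the resolved bytes,
 * then the type declared in the file metadata, then a guess from the file
 * extension, and finally a generic binary type.
 */
function pickContentType(
  resolvedType: string | undefined,
  declaredType: string | undefined,
  fileName: string
): string {
  const dot = fileName.lastIndexOf('.')
  const ext = dot === -1 ? '' : fileName.slice(dot + 1).toLowerCase()
  return resolvedType || declaredType || MIME_BY_EXTENSION[ext] || 'application/octet-stream'
}
```

Using `||` rather than `??` matters here because an empty string, the value the commit message says was previously returned, should also fall through to the next candidate.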
