@vellanki-santhosh

🐛 Bug Fix

Fixes the "Unable to upload documents" error when uploading PDF files to Agentflows with full file upload enabled.

🔍 Root Cause

A copy-paste bug in the file type mapping logic: the fallback branch assigned fileInputFieldFromExt instead of fileInputFieldFromMimeType. Because that branch only runs when fileInputFieldFromExt is 'txtFile', PDF files were processed with TextLoader instead of PDFLoader.

Affected code pattern:

if (fileInputFieldFromExt !== 'txtFile') {
    fileInputField = fileInputFieldFromExt
} else if (fileInputFieldFromMimeType !== 'txtFile') {
    fileInputField = fileInputFieldFromExt  // ❌ BUG: should be fileInputFieldFromMimeType
}

🔧 Changes

Fixed file input field assignment in 4 files:

  • packages/server/src/utils/createAttachment.ts - Agent file attachments
  • packages/server/src/utils/upsertVector.ts - Vector store uploads
  • packages/server/src/utils/buildChatflow.ts - Chatflow file inputs
  • packages/server/src/services/documentstore/index.ts - Document store uploads

📝 Why PDFs Failed but Word/Excel Worked

  • .docx and .xlsx extensions are properly mapped in mapExtToInputField()
  • .pdf extension was not mapped, so it fell to the MIME type fallback
  • The fallback incorrectly assigned the extension value instead of MIME type value
  • This caused PDFs to use txtFile loader instead of pdfFile loader
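
The chain of events above can be reproduced in isolation. The following is a minimal sketch, not the Flowise implementation: mapExtStub and mapMimeStub are hypothetical stand-ins for mapExtToInputField() and mapMimeTypeToInputField(), and the table contents (including the 'docxFile' and 'excelFile' field names) are assumptions chosen to mirror this description.

```typescript
// Hypothetical stand-ins for mapExtToInputField / mapMimeTypeToInputField.
// The table contents are assumptions that mirror the PR description;
// they are not Flowise's actual mappings.
const extTable = new Map<string, string>([
    ['.docx', 'docxFile'],
    ['.xlsx', 'excelFile']
])
const mimeTable = new Map<string, string>([['application/pdf', 'pdfFile']])

// Both stubs return 'txtFile' for anything they do not recognize.
const mapExtStub = (ext: string): string => extTable.get(ext) ?? 'txtFile'
const mapMimeStub = (mime: string): string => mimeTable.get(mime) ?? 'txtFile'

// Pattern before the fix: the fallback branch reassigns the extension result.
function pickBuggy(ext: string, mime: string): string {
    const fromExt = mapExtStub(ext)
    const fromMime = mapMimeStub(mime)
    let field = 'txtFile'
    if (fromExt !== 'txtFile') {
        field = fromExt
    } else if (fromMime !== 'txtFile') {
        field = fromExt // bug: fromExt is always 'txtFile' in this branch
    }
    return field
}

// Pattern after the fix: the fallback branch uses the MIME type result.
function pickFixed(ext: string, mime: string): string {
    const fromExt = mapExtStub(ext)
    const fromMime = mapMimeStub(mime)
    let field = 'txtFile'
    if (fromExt !== 'txtFile') {
        field = fromExt
    } else if (fromMime !== 'txtFile') {
        field = fromMime
    }
    return field
}

const docxMime = 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
console.log(pickBuggy('.pdf', 'application/pdf')) // txtFile
console.log(pickFixed('.pdf', 'application/pdf')) // pdfFile
console.log(pickBuggy('.docx', docxMime)) // docxFile (extension branch, unaffected by the bug)
```

Under these assumptions, only files whose extension is unmapped reach the faulty branch, which is consistent with Word and Excel uploads working while PDF uploads failed.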

✅ Testing

Tested with:

  • Flowise Version: 3.0.12
  • File: Teaching Large Language Models an Unseen Language on the Fly.pdf
  • Configuration: Agentflows with Full File Upload enabled

Expected behavior: PDF files should now be correctly processed using PDFLoader with proper configuration (including pdfUsage and legacyBuild settings).

📚 Related

  • Reported issue: PDF uploads failing with "Unable to upload documents"
  • Related security fix: CVE-2025-61687 (file validation improvements)

Fix a copy-paste bug where fileInputFieldFromExt was incorrectly used
instead of fileInputFieldFromMimeType in the fallback condition. This
caused PDF files to be processed with TextLoader instead of PDFLoader,
resulting in 'Unable to upload documents' errors.

The bug affected file uploads across multiple features:
- Agent file attachments (createAttachment)
- Vector store uploads (upsertVector)
- Chatflow file inputs (buildChatflow)
- Document store uploads (documentstore)

Why it failed for PDFs but not Word/Excel:
- .docx and .xlsx extensions are properly mapped in mapExtToInputField()
- .pdf extension was not mapped, so it fell to the MIME type fallback
- The fallback incorrectly assigned the ext value instead of MIME value
- This caused PDFs to use 'txtFile' instead of 'pdfFile' loader

Fixes: PDF upload failures in Agentflows with full file upload enabled
Related: CVE-2025-61687 file validation improvements
@gemini-code-assist
Contributor

Summary of Changes

Hello @vellanki-santhosh, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request fixes a bug that prevented PDF files from being uploaded and processed correctly within Agentflows. It corrects the fallback branch of the file type mapping logic so that, when the file extension is not recognized, the file loader is selected from the MIME type.

Highlights

  • Bug Fix for PDF Uploads: Resolves the "Unable to upload documents" error specifically for PDF files when using Agentflows with full file upload enabled.
  • Root Cause Identified: The issue stemmed from a copy-paste bug in the file type mapping logic where fileInputFieldFromExt was incorrectly used instead of fileInputFieldFromMimeType in a fallback condition. This led to PDFs being processed by the generic TextLoader instead of the specialized PDFLoader.
  • Corrected File Type Assignment: The fix ensures that when the extension-based mapping defaults to 'txtFile', the system correctly falls back to using the MIME type to determine the appropriate file input field, thus assigning the correct loader (e.g., PDFLoader for PDFs).
  • Widespread Application of Fix: The correction has been applied consistently across four critical files: createAttachment.ts, upsertVector.ts, buildChatflow.ts, and documentstore/index.ts, covering agent file attachments, vector store uploads, chatflow file inputs, and document store uploads.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request effectively addresses the critical bug where PDF and other files were incorrectly mapped, leading to processing failures. The fix correctly prioritizes MIME type mapping when extension-based mapping defaults to 'txtFile'. The changes are consistent across all affected files and directly resolve the reported issue.

Comment on lines 1861 to 1865
  if (fileInputFieldFromExt !== 'txtFile') {
      fileInputField = fileInputFieldFromExt
  } else if (fileInputFieldFromMimeType !== 'txtFile') {
-     fileInputField = fileInputFieldFromExt
+     fileInputField = fileInputFieldFromMimeType
  }

medium

The fix correctly addresses the bug by using fileInputFieldFromMimeType in the fallback condition. To improve readability and prevent similar copy-paste errors in the future, consider refactoring this conditional logic. The current structure, while fixed, could still be prone to misinterpretation or accidental reintroduction of the bug. A more sequential approach to determining fileInputField would enhance maintainability.

            let fileInputField = mapExtToInputField(fileExtension);
            if (fileInputField === 'txtFile') {
                fileInputField = mapMimeTypeToInputField(file.mimetype);
            }

Comment on lines 448 to 452
  if (fileInputFieldFromExt !== 'txtFile') {
      fileInputField = fileInputFieldFromExt
  } else if (fileInputFieldFromMimeType !== 'txtFile') {
-     fileInputField = fileInputFieldFromExt
+     fileInputField = fileInputFieldFromMimeType
  }

medium

This change correctly applies the fix for the file input field mapping. Similar to the comment in packages/server/src/services/documentstore/index.ts, consider refactoring this conditional logic for improved clarity and maintainability. The suggested pattern simplifies the decision-making process, making it less susceptible to future errors.

            let fileInputField = mapExtToInputField(fileExtension);
            if (fileInputField === 'txtFile') {
                fileInputField = mapMimeTypeToInputField(file.mimetype);
            }

Comment on lines 177 to 181
  if (fileInputFieldFromExt !== 'txtFile') {
      fileInputField = fileInputFieldFromExt
  } else if (fileInputFieldFromMimeType !== 'txtFile') {
-     fileInputField = fileInputFieldFromExt
+     fileInputField = fileInputFieldFromMimeType
  }

medium

The correction here is accurate and resolves the bug. For better code robustness and to prevent recurrence of this type of error, I recommend refactoring this if/else if block into a more linear assignment, as suggested in the other files. This enhances the clarity of the fallback logic.

            let fileInputField = mapExtToInputField(fileExtension);
            if (fileInputField === 'txtFile') {
                fileInputField = mapMimeTypeToInputField(file.mimetype);
            }

Comment on lines 96 to 100
  if (fileInputFieldFromExt !== 'txtFile') {
      fileInputField = fileInputFieldFromExt
  } else if (fileInputFieldFromMimeType !== 'txtFile') {
-     fileInputField = fileInputFieldFromExt
+     fileInputField = fileInputFieldFromMimeType
  }

medium

This change correctly implements the fix for the file input field mapping. To enhance code clarity and reduce the potential for similar logical errors, consider refactoring this conditional block. A more direct assignment with a fallback check can make the intent clearer and the code more maintainable.

            let fileInputField = mapExtToInputField(fileExtension);
            if (fileInputField === 'txtFile') {
                fileInputField = mapMimeTypeToInputField(file.mimetype);
            }

Apply code review suggestion to use clearer linear fallback instead
of nested conditionals. This makes the intent more obvious and
prevents future copy-paste errors.

Changes:
- Refactored mapping logic in all 4 affected files to use:
  let fileInputField = mapExtToInputField(fileExtension)
  if (fileInputField === 'txtFile') {
      fileInputField = mapMimeTypeToInputField(file.mimetype)
  }
- Added comprehensive regression tests to verify PDF, Word, and Excel
  file mappings work correctly with the new logic
- Tests ensure .pdf files correctly map to 'pdfFile' loader via MIME
  type fallback

The simplified logic is more maintainable and self-documenting:
1. Try file extension mapping first
2. If unrecognized (returns 'txtFile'), fall back to MIME type

Co-authored-by: Gemini Code Assist <code-assist@google.com>
Author

✅ Updates Applied

Thank you @gemini-code-assist for the review suggestion! I've refactored the code to use the cleaner linear fallback pattern.

🔄 Changes Made

1. Refactored mapping logic (all 4 files):

// Before (nested conditionals - prone to copy-paste errors):
let fileInputField = 'txtFile'
if (fileInputFieldFromExt !== 'txtFile') {
    fileInputField = fileInputFieldFromExt
} else if (fileInputFieldFromMimeType !== 'txtFile') {
    fileInputField = fileInputFieldFromMimeType
}

// After (clear linear fallback):
let fileInputField = mapExtToInputField(fileExtension)
if (fileInputField === 'txtFile') {
    fileInputField = mapMimeTypeToInputField(file.mimetype)
}

This makes the intent crystal clear:

  1. Try extension-based mapping first
  2. If unrecognized (returns 'txtFile'), fall back to MIME type
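
As a sanity check on the refactor, the two forms can be compared directly on already-mapped values. This is only a sketch: nestedForm and linearForm are hypothetical names, and the sample field values are assumptions used for illustration.

```typescript
// Nested form (after the first fix): takes both mapping results up front.
function nestedForm(fromExt: string, fromMime: string): string {
    let field = 'txtFile'
    if (fromExt !== 'txtFile') {
        field = fromExt
    } else if (fromMime !== 'txtFile') {
        field = fromMime
    }
    return field
}

// Linear form: start from the extension result, fall back to the MIME result.
function linearForm(fromExt: string, fromMime: string): string {
    let field = fromExt
    if (field === 'txtFile') {
        field = fromMime
    }
    return field
}

// Compare the two forms over every pair of sample values.
const samples = ['txtFile', 'pdfFile', 'docxFile']
const mismatches: string[] = []
for (const fromExt of samples) {
    for (const fromMime of samples) {
        if (nestedForm(fromExt, fromMime) !== linearForm(fromExt, fromMime)) {
            mismatches.push(`${fromExt}/${fromMime}`)
        }
    }
}
console.log(mismatches.length) // 0
```

The two forms agree on every pair, so the refactor should not change behavior. One practical difference is that the linear form only needs to call the MIME type mapper when the extension mapping returns 'txtFile'.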

🧪 Regression Tests Added

Created a comprehensive test suite at packages/server/test/utils/file-mapping.util.test.ts:

  • ✅ Tests PDF file mapping via MIME type fallback
  • ✅ Tests Word/Excel mapping via extension
  • ✅ Tests fallback behavior for unknown extensions
  • ✅ Verifies both mapExtToInputField and mapMimeTypeToInputField functions

📊 Test Coverage

The new tests specifically cover the bug scenario:

  • PDF with .pdf extension should map to pdfFile (not txtFile)
  • Verifies the fallback logic works correctly
  • Ensures the bug won't reappear in future releases
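
The test file itself is not shown in this conversation. The following is only a sketch of what table-driven checks for these scenarios might look like: resolveField, byExtStub, and byMimeStub are hypothetical, and the stub mappings are assumptions rather than the real mapExtToInputField and mapMimeTypeToInputField behavior.

```typescript
type Mapper = (value: string) => string

// Linear fallback with the mappers passed in, so stubs can replace the real helpers.
function resolveField(ext: string, mime: string, byExt: Mapper, byMime: Mapper): string {
    let field = byExt(ext)
    if (field === 'txtFile') {
        field = byMime(mime)
    }
    return field
}

// Assumed stub behavior: .docx and .xlsx are known by extension, PDF only by MIME type.
const byExtStub: Mapper = (ext) =>
    ext === '.docx' ? 'docxFile' : ext === '.xlsx' ? 'excelFile' : 'txtFile'
const byMimeStub: Mapper = (mime) => (mime === 'application/pdf' ? 'pdfFile' : 'txtFile')

const cases = [
    // The bug scenario: PDF resolved through the MIME type fallback.
    { ext: '.pdf', mime: 'application/pdf', expected: 'pdfFile' },
    // Word and Excel resolved through the extension.
    {
        ext: '.docx',
        mime: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
        expected: 'docxFile'
    },
    {
        ext: '.xlsx',
        mime: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
        expected: 'excelFile'
    },
    // Unknown extension and MIME type stay on the default.
    { ext: '.xyz', mime: 'application/octet-stream', expected: 'txtFile' }
]

const failures = cases.filter(
    (c) => resolveField(c.ext, c.mime, byExtStub, byMimeStub) !== c.expected
)
console.log(failures.length === 0 ? 'all cases pass' : failures)
```

Passing the mappers as parameters is a choice made for this sketch so that it runs on its own; a real test would more likely import the actual helpers and assert on them directly.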

Ready for maintainer review and workflow approval! 🚀

@HenryHengZJ
Contributor

It's working fine for me when uploading a full PDF file:
[image]
