Conversation
Pull request overview
This PR addresses the upload “total file size limit” logic in Hubble’s backend, aiming to avoid incorrectly enforcing the limit across unrelated uploads (issue #496).
Changes:
- Add `FileMappingService.listByConnId(connId)` to fetch file mappings scoped to a graph connection.
- Update upload validation to sum sizes using `listByConnId(connId)` instead of `listAll()`.
- Fix the i18n error key used when exceeding the total size limit (`exceed-total-size`).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/service/load/FileMappingService.java | Adds a new query helper to list file mappings by conn_id. |
| hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/controller/load/FileUploadController.java | Updates total-size validation to be connection-scoped and uses the correct error key for total-limit violations. |
```diff
 long totalFileSizeLimit = this.config.get(
         HubbleOptions.UPLOAD_TOTAL_FILE_SIZE_LIMIT);
-List<FileMapping> fileMappings = this.service.listAll();
+List<FileMapping> fileMappings = this.service.listByConnId(connId);
 long currentTotalSize = fileMappings.stream()
                                     .map(FileMapping::getTotalSize)
                                     .reduce(0L, (Long::sum));
 Ex.check(fileSize + currentTotalSize <= totalFileSizeLimit,
-         "load.upload.file.exceed-single-size",
+         "load.upload.file.exceed-total-size",
          FileUtils.byteCountToDisplaySize(totalFileSizeLimit));
```
The total upload limit is now computed by summing total_size across all FileMapping records for the connection. Since completed mappings are persisted (and totalSize is set on completion), this quota becomes cumulative across jobs and won’t “reset” when a new job is created, which seems to be the core problem described in #496. Consider scoping the sum to the current job (conn_id + job_id), or using jobEntity.getJobSize() (which is already maintained on upload/delete) so the limit is enforced per job and naturally resets between jobs.
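The per-job scoping suggested above can be sketched in plain Java. This is a minimal, self-contained illustration, not Hubble's actual code: the `FileMapping` record and `currentJobSize` helper are simplified stand-ins for the real entity and a hypothetical `conn_id + job_id` query, used only to show how the quota would naturally reset when a new job (with a new `jobId`) starts.

```java
import java.util.List;

// Sketch of a job-scoped quota check: sum totalSize only over mappings whose
// connId AND jobId match the current upload. The record below is a simplified
// stand-in for Hubble's FileMapping entity, not its real API.
public class PerJobQuotaSketch {

    record FileMapping(int connId, int jobId, long totalSize) {}

    // Sum the sizes of mappings belonging to the current connection and job.
    static long currentJobSize(List<FileMapping> mappings, int connId, int jobId) {
        return mappings.stream()
                       .filter(m -> m.connId() == connId && m.jobId() == jobId)
                       .mapToLong(FileMapping::totalSize)
                       .sum();
    }

    // The quota check: the new file must fit within the per-job limit.
    static boolean withinLimit(long fileSize, long currentJobSize, long limit) {
        return fileSize + currentJobSize <= limit;
    }

    public static void main(String[] args) {
        List<FileMapping> mappings = List.of(
                new FileMapping(1, 10, 400L),  // finished job on the same connection
                new FileMapping(1, 11, 100L),  // current job
                new FileMapping(2, 20, 900L)); // unrelated connection

        // Only job 11 on connection 1 counts toward the quota, so the 400L
        // from the earlier job no longer "uses up" the limit.
        long used = currentJobSize(mappings, 1, 11);
        System.out.println(used);                          // 100
        System.out.println(withinLimit(300L, used, 500L)); // true: 300 + 100 <= 500
        System.out.println(withinLimit(500L, used, 500L)); // false: 600 > 500
    }
}
```

In a real implementation the filtering would happen in the database query rather than in memory, as the second review comment below also notes for the summation itself.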
```java
.map(FileMapping::getTotalSize)
.reduce(0L, (Long::sum));
```
currentTotalSize is calculated by loading every FileMapping for the connection into memory and then reducing with boxed Longs. For large numbers of mappings this is unnecessarily expensive on every upload request. Consider using a DB-side SUM(total_size) aggregate (or at least mapToLong(...).sum()) to avoid allocating and iterating over full entities.
Suggested change:
```diff
-        .map(FileMapping::getTotalSize)
-        .reduce(0L, (Long::sum));
+        .mapToLong(FileMapping::getTotalSize)
+        .sum();
```
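To illustrate why the suggestion prefers `mapToLong`, here is a small standalone comparison of the two reductions. Both compute the same sum; the boxed `reduce` allocates an intermediate `Long` per element, while `mapToLong` switches to a primitive `LongStream` and sums raw `long` values.

```java
import java.util.List;

// Compare the boxed reduction with the primitive-stream sum.
public class SumComparison {
    public static void main(String[] args) {
        List<Long> sizes = List.of(10L, 20L, 30L);

        // Boxed reduction: every intermediate sum is a boxed Long.
        long boxed = sizes.stream().reduce(0L, Long::sum);

        // Primitive reduction: unboxes each element once, then sums raw longs.
        long primitive = sizes.stream().mapToLong(Long::longValue).sum();

        System.out.println(boxed);     // 60
        System.out.println(primitive); // 60
    }
}
```

In the PR's code the element type is `FileMapping`, so the primitive form is `mapToLong(FileMapping::getTotalSize).sum()`, exactly as suggested above; the DB-side `SUM(total_size)` the comment mentions would avoid loading the entities at all.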
Purpose of the PR
Main Changes
Verifying these changes
Does this PR potentially affect the following parts?
Documentation Status
- [ ] Doc - TODO
- [ ] Doc - Done
- [ ] Doc - No Need