Bug: V3 endpoint migration is extremely slow and throughput degrades over time #15124

@DarkR0ast

Description

Running the V3 endpoint-to-location migration is significantly slower than expected. In addition to the low initial throughput, migration performance steadily degrades as more endpoints are processed.

For a dataset of approximately 65k endpoints (65,194), the migration starts at around 3.4 endpoints/sec and gradually drops to 0.3 endpoints/sec by the time roughly 36% of the endpoints have been processed, increasing the estimated completion time from roughly 5 hours to 38+ hours.

Steps to Reproduce

  1. Enable V3_FEATURE_LOCATIONS.
  2. Run the endpoint migration (migrate_endpoints_to_locations).
  3. Monitor the migration progress and throughput.

Expected Behavior

Migration throughput should remain relatively stable throughout execution, with only minor fluctuations.

Actual Behavior

Migration speed continuously decreases as the migration progresses.

Example output:

Migrated 3,900/65,194 endpoints (6.0%) — 3.4 endpoints/sec — ETA 5h 4m
...
Migrated 4,350/65,194 endpoints (6.7%) — 2.5 endpoints/sec — ETA 6h 53m
...
Migrated 23,300/65,194 endpoints (35.7%) — 0.3 endpoints/sec — ETA 38h 29m

The slowdown appears cumulative rather than transient, suggesting the migration becomes less efficient as more records are processed.
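
As a sanity check on the figures above, the ETAs can be recomputed from the remaining endpoint count and the reported rate. This is a minimal sketch: the function names are illustrative and not part of DefectDojo, and it assumes the ETA is simply remaining endpoints divided by the current rate, which I have not verified against the migration command.

```python
# Progress samples taken from the example output: (endpoints migrated, endpoints/sec).
TOTAL_ENDPOINTS = 65_194
SAMPLES = [(3_900, 3.4), (4_350, 2.5), (23_300, 0.3)]


def eta_seconds(done: int, total: int, rate: float) -> float:
    """Estimated seconds remaining, assuming the current rate holds."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return (total - done) / rate


def format_eta(seconds: float) -> str:
    """Render a duration as 'Xh Ym', truncating to whole minutes."""
    hours, remainder = divmod(int(seconds), 3600)
    return f"{hours}h {remainder // 60}m"


if __name__ == "__main__":
    for done, rate in SAMPLES:
        eta = eta_seconds(done, TOTAL_ENDPOINTS, rate)
        print(f"{done:>6,} done at {rate} endpoints/sec: ETA {format_eta(eta)}")
```

The recomputed values land within a few minutes to about twenty minutes of the logged ETAs, which is what I would expect if the displayed rate is rounded to one decimal place. The point is that the ETA growth is fully explained by the falling rate, not by a reporting glitch.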

Additional Observations

During the migration, individual endpoint failures are logged and skipped, for example:

ValidationError: {'protocol': ['icmp is not a supported protocol']}

ERROR [dojo.management.commands.migrate_endpoints_to_locations:475]
Failed to migrate endpoint id=104369; continuing

These validation failures are handled correctly and the migration continues, but they do not appear to explain the overall degradation in throughput.

Impact

For larger DefectDojo instances, the migration may require tens of hours or longer to complete, making adoption of V3 Locations difficult in production environments.

Questions

  • Is this level of performance expected for the current implementation?
  • Are there known bottlenecks in the migration (e.g. repeated queries, missing bulk operations, growing in-memory state, transaction handling, or index updates) that could explain the progressive slowdown?
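
To help narrow down the second question, per-batch throughput could be recorded and compared against the number of items already processed. The helper below is a generic, hypothetical sketch (none of these names exist in DefectDojo, and it does not reflect how the migration command is actually structured); `process` stands in for whatever migrates a single endpoint.

```python
import time
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Tuple


def batched(items: Iterable, size: int) -> Iterator[list]:
    """Yield consecutive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def time_batches(
    items: Iterable,
    process: Callable[[object], None],
    batch_size: int = 100,
    clock: Callable[[], float] = time.perf_counter,
) -> List[Tuple[int, float]]:
    """Return (items processed so far, items/sec for that batch) per batch."""
    samples = []
    done = 0
    for chunk in batched(items, batch_size):
        start = clock()
        for item in chunk:
            process(item)
        elapsed = clock() - start
        done += len(chunk)
        # Guard against a zero reading from a coarse clock.
        rate = len(chunk) / elapsed if elapsed > 0 else float("inf")
        samples.append((done, rate))
    return samples
```

If the per-batch rate falls steadily as the processed count rises, the cost per endpoint depends on how much has already been migrated (for example growing state or queries over already-migrated rows). A rate that is low but flat would point to a fixed per-endpoint cost instead.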