
qrouting: fix multiple bugs across the module #3861

Open
abdoulosseni wants to merge 1 commit into OpenSIPS:master from abdoulosseni:fix/qrouting-multiple-bugs


Summary

Fix 10 bugs found in the qrouting module:

Critical (incorrect behavior):

  • MI enable_dst disables instead of enabling: copy-paste error caused the MI command to call mi_qr_disable_dst_* functions instead of mi_qr_enable_dst_*
  • Read lock leak in qr_score_grp(): missing lock_stop_read() when gateway is disabled and not dirty, causing progressive deadlock
  • update_grp_stats() pass-by-value: QR_STATUS_DIRTY flag written to stack copy, never applied to actual group — carrier scores remain stale
  • Wrong union member in qr_set_dst_state(): dst->gw->ref_lock accessed unconditionally, but for QR_DST_GRP destinations this is undefined behavior (reinterprets grp union member as gw pointer)
  • Weighted sort score/index mismatch: qr_weight_based_sort() swapped dsts[] but not scores[], causing wrong weights after first iteration
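
For reviewers less familiar with the pass-by-value item above, here is a minimal standalone sketch of that bug class. The names `demo_grp_t`, `DEMO_STATUS_DIRTY`, `mark_dirty_by_value()` and `mark_dirty_by_ptr()` are hypothetical stand-ins for illustration only, not the module's actual definitions.

```c
/* Hypothetical stand-ins; NOT the real qrouting types or flags. */
#define DEMO_STATUS_DIRTY (1 << 0)

typedef struct demo_grp {
    int state;
} demo_grp_t;

/* Buggy shape: the struct is copied on call, so the flag is set on the
 * callee's local copy and is lost when the function returns. */
static void mark_dirty_by_value(demo_grp_t grp)
{
    grp.state |= DEMO_STATUS_DIRTY;
}

/* Fixed shape: the flag is written through the pointer, so the
 * caller's struct is the one that gets updated. */
static void mark_dirty_by_ptr(demo_grp_t *grp)
{
    grp->state |= DEMO_STATUS_DIRTY;
}
```

Calling `mark_dirty_by_value(g)` leaves `g.state` unchanged, while `mark_dirty_by_ptr(&g)` sets the flag on `g` itself, which is the behavior the fix relies on.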

Resource leaks & crashes:

  • dialog_prop shm leak: 3 error paths in qr_check_reply_tmcb() after shm_malloc don't free on failure
  • NULL deref on *qr_main_list: MI status/enable/disable functions and w_qr_set_dst_state() dereference without NULL check, crashing if called before drouting loads
  • DB connection leak in qr_check_db(): handle not closed on capability/version check failures
  • Division by zero: sampling_interval=0 not validated, causes crash at history_span * 60 / sampling_interval
  • Memory leak in qr_reload(): shm_realloc failure overwrites profs with NULL (losing original block) and doesn't free DB result set; now uses temp variable
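
The temp-variable approach mentioned for qr_reload() can be sketched roughly as follows. This is an illustration only: it uses the standard `realloc()` as a stand-in for `shm_realloc`, and `demo_grow()` is a hypothetical helper, not code from the patch.

```c
#include <stdlib.h>

/* Grow an int array to new_n elements.
 * Returns 0 on success, -1 on failure. On failure *arr is left
 * untouched, so the caller still owns the original block and can
 * free it or keep using it. */
static int demo_grow(int **arr, size_t new_n)
{
    int *tmp;

    if (new_n == 0)
        return -1;          /* avoid the realloc(ptr, 0) corner case */

    tmp = realloc(*arr, new_n * sizeof **arr);
    if (tmp == NULL)
        return -1;          /* original pointer is preserved */

    *arr = tmp;             /* commit only after success */
    return 0;
}
```

Assigning the result of the reallocation directly to the original pointer would lose the only reference to the old block whenever the call fails; going through `tmp` avoids that.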

Test plan

  • Verify MI enable_dst command correctly re-enables a previously disabled gateway
  • Verify MI disable_dst still works as expected
  • Confirm no lock contention/deadlock under load with disabled gateways
  • Test qr_reload behavior under memory pressure
  • Test module startup with sampling_interval=0 (should fail cleanly)
  • Verify weighted dynamic routing distributes traffic correctly across multiple gateways
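
As a rough picture of what "fail cleanly" means for the sampling_interval=0 case, the guard could look like the sketch below. `demo_history_slots()` is a hypothetical helper, and the units (minutes for the span, seconds for the interval) are an assumption inferred from the `history_span * 60 / sampling_interval` expression, not taken from the module source.

```c
/* Hypothetical helper: computes the number of history slots.
 * Returns -1 for a non-positive span or interval instead of
 * dividing by zero. Units are assumed: minutes and seconds. */
static int demo_history_slots(int history_span_min, int sampling_interval_s)
{
    if (history_span_min <= 0 || sampling_interval_s <= 0)
        return -1;

    return history_span_min * 60 / sampling_interval_s;
}
```

A startup routine would treat the -1 return as a configuration error and refuse to load, rather than reaching the division.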

🤖 Generated with Claude Code

Fix the following issues in the qrouting module:

1. MI "enable_dst" command was calling mi_qr_disable_dst_* functions
   instead of mi_qr_enable_dst_*, effectively disabling destinations
   when users intended to enable them (copy-paste error).

2. Read lock leak in qr_score_grp() when a gateway is disabled and
   not dirty: the else branch was missing a lock_stop_read() call.

3. update_grp_stats() took its qr_grp_t argument by value, so the
   QR_STATUS_DIRTY flag was written to a stack copy and never applied
   to the actual group, causing stale cached scores.

4. qr_set_dst_state() always locked dst->gw->ref_lock regardless of
   destination type. For QR_DST_GRP destinations, this reinterprets
   the grp union member as a gw pointer (undefined behavior). Now
   locks dst->grp.ref_lock for carrier destinations.

5. qr_weight_based_sort() swapped destination indices but not their
   corresponding scores, causing wrong weights after the first
   iteration.

6. Memory leak of dialog_prop (shm) on three error paths in
   qr_check_reply_tmcb() after successful allocation.

7. NULL pointer dereference on *qr_main_list in MI status/enable/
   disable functions and w_qr_set_dst_state() when called before
   drouting has loaded data.

8. DB connection leak in qr_check_db(): the handle was not closed
   on the capability check and table version check error paths.

9. Division by zero when sampling_interval is set to 0: added
   parameter validation.

10. Memory leak in qr_reload() on shm_realloc failure: the old profs
    pointer was overwritten with NULL (losing the original block),
    and the DB result set was not freed. Now uses a temporary variable
    to preserve the original pointer and properly cleans up.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>