-
Notifications
You must be signed in to change notification settings - Fork 175
fix(websocket): Fix websocket client race on abort and memory leak(IDFGH-16555) #924
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
|
67bd7e3 to
46871bf
Compare
| #else | ||
| // When separate TX lock is not configured, we already hold client->lock | ||
| // which protects the transport, so we can send PONG directly | ||
| esp_transport_ws_send_raw(client->transport, WS_TRANSPORT_OPCODES_PONG | WS_TRANSPORT_OPCODES_FIN, data, client->payload_len, |
Check warning
Code scanning / clang-tidy
The value '138' provided to the cast expression is not in the valid range of values for 'ws_transport_opcodes' [clang-analyzer-optin.core.EnumCastOutOfRange] Warning
46871bf to
5577e03
Compare
ca2956e to
0e58789
Compare
| } | ||
| ESP_LOGD(TAG, "Calling abort_connection due to send error"); | ||
| #ifdef CONFIG_ESP_WS_CLIENT_SEPARATE_TX_LOCK | ||
| xSemaphoreGiveRecursive(client->tx_lock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is better to move this verification to abort connection function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@euripedesrocha I am not sure about it, as abort connection is used in 5 different places, if we move it inside abort_connection it adds complex detection logic "which lock am I holding?" Only 1 place (send error path) needs lock switching (tx_lock → client->lock)
0e58789 to
62925a5
Compare
15dcb35 to
f474654
Compare
a6e4259 to
50e3068
Compare
50e3068 to
22eb17e
Compare
- Add state check in abort_connection to prevent double-close - Fix memory leak: free errormsg_buffer on disconnect - Reset connection state on reconnect to prevent stale data - Implement lock ordering for separate TX lock mode - Read buffered data immediately after connection to prevent data loss - Added sdkconfig.ci.tx_lock config
22eb17e to
52abfc0
Compare
TODO - Will remove comments after the review ( left it for easier review)
Description
This PR fixes critical memory leaks and crashes in the ESP WebSocket client that occur during reconnection scenarios(CONFIG_ESP_WS_CLIENT_SEPARATE_TX_LOCK = y).
Changes Made:
Related
#898
Checklist
Before submitting a Pull Request, please ensure the following:
Note
Prevents double-close races and error-buffer leaks, improves TX lock ordering and PONG handling, resets state on connect, and tweaks recv/poll flow; adds tx-lock CI config.
esp_websocket_client_abort_connection()to skip abort when closing/closed and set error type.client->errormsg_bufferto fix leak; add debug logging.tx_lockandlock(whenCONFIG_ESP_WS_CLIENT_SEPARATE_TX_LOCK) before callingesp_websocket_client_abort_connection().payload_len/offset,last_fin, andlast_opcode.recvto capture early data.client->lockbeforerecvwhenread_select > 0; avoid duplicate lock ops.client->transport_listandclient->transport.examples/target/sdkconfig.ci.tx_lockenablingCONFIG_ESP_WS_CLIENT_SEPARATE_TX_LOCKand TX lock timeout.Written by Cursor Bugbot for commit 52abfc0. This will update automatically on new commits. Configure here.