Add functionality for export of latency logs via telemetry #608
saishreeeee merged 90 commits into telemetry from
Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com>
…r operations Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com>
…ze and get telemetry client Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com>
…pTelemetryClient class with NOOP_TELEMETRY_CLIENT singleton, updated tests accordingly Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com>
Thanks for your contribution! To satisfy the DCO policy in our contributing guide, every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (
jprakash-db left a comment:
LGTM, thanks for making the changes.
* send telemetry to unauth endpoint in case of connection/auth errors
* formatting
* added unit test for send_connection_error_telemetry
* retry errors
* Add functionality for export of latency logs via telemetry (#608)
  * added functionality for export of failure logs
  * changed logger.error to logger.debug in exc.py
  * Fix telemetry loss during Python shutdown
  * unit tests for export_failure_log
  * try-catch blocks to make telemetry failures non-blocking for connector operations
  * removed redundant try/catch blocks, added try/catch block to initialize and get telemetry client
  * skip null fields in telemetry request
  * removed dup import, renamed func, changed a filter_null_values to lamda
  * removed unnecassary class variable and a redundant try/except block
  * public functions defined at interface level
  * changed export_event and flush to private functions
  * formatting
  * changed connection_uuid to thread local in thrift backend
  * made errors more specific
  * revert change to connection_uuid
  * reverting change in close in telemetry client
  * JsonSerializableMixin
  * isdataclass check in JsonSerializableMixin
  * convert TelemetryClientFactory to module-level functions, replace NoopTelemetryClient class with NOOP_TELEMETRY_CLIENT singleton, updated tests accordingly
  * renamed connection_uuid as session_id_hex
  * added NotImplementedError to abstract class, added unit tests
  * formatting
  * added PEP-249 link, changed NoopTelemetryClient implementation
  * removed unused import
  * made telemetry client close a module-level function
  * unit tests verbose
  * debug logs in unit tests
  * debug logs in unit tests
  * removed ABC from mixin, added try/catch block around executor shutdown
  * checking stuff
  * finding out
  * finding out more
  * more more finding out more nice
  * locks are useless anyways
  * haha
  * normal
  * := looks like walrus horizontally
  * one more
  * walrus again
  * old stuff without walrus seems to fail
  * manually do the walrussing
  * change 3.13t, v2
  * formatting, added walrus
  * formatting
  * removed walrus, removed test before stalling test
  * changed order of stalling test
  * removed debugging, added TelemetryClientFactory
  * remove more debugging
  * latency logs funcitionality
  * fixed type of return value in get_session_id_hex() in thrift backend
  * debug on TelemetryClientFactory lock
  * formatting
  * type notation for _waiters
  * called connection.close() in test_arraysize_buffer_size_passthrough
  * run all unit tests
  * more debugging
  * removed the connection.close() from that test, put debug statement before and after TelemetryClientFactory lock
  * more debug
  * more more more
  * why
  * whywhy
  * thread name
  * added teardown to all tests except finalizer test (gc collect)
  * added the get_attribute functions to the classes
  * removed tearDown, added connection.close() to first test
  * finally
  * remove debugging
  * added test for export_latency_log, made mock of thrift backend with retry policy
  * added multi threaded tests
  * formatting
  * added TelemetryExtractor, removed multithreaded tests
  * formatting
  * fixes in test
  * fix in telemetry extractor
  * added doc strings to latency_logger, abstracted export_telemetry_log
  * statement type, unit test fix
  * unit test fix
  * statement type changes
  * test_fetches fix
  * added mocks to resolve the errors caused by log_latency decorator in tests
  * removed function in test_fetches cuz it is only used once
  * added _safe_call which returns None in case of errors in the get functions
  * removed the changes in test_client and test_fetches
  * removed the changes in test_fetches
  * test_telemetry
  * removed test
  ---------
  Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com>
* MaxRetryDurationError
* main changes
* formatting
* import json
* without the max retry errors
* unauth telemetry client
* remove duplicate code setting telemetry_enabled
* removed unused errors
* merge with main changes
* test
* without try/catch block
* -
* error log for auth provider, ThriftDatabricksClient
* error log for session.open
* retry tests fix
* test connection failure log
* check types fix
* test
* rephrase import
---------
Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com>
What type of PR is this?
Description
Add functionality for export of latency logs via telemetry
Example latency log:
{
  "frontend_log_event_id": "30565427-1299-4fbc-bec0-25378cfe912b",
  "context": {
    "client_context": {
      "timestamp_millis": 1750592858108,
      "user_agent": "PyDatabricksSqlConnector/4.0.4"
    }
  },
  "entry": {
    "sql_driver_log": {
      "session_id": "01f04f5e-b1a8-1dd1-962a-e40efd092bdd",
      "system_configuration": {
        "driver_version": "4.0.4",
        "os_name": "Darwin",
        "os_version": "24.5.0",
        "os_arch": "arm64",
        "runtime_name": "Python 3.13.3",
        "runtime_version": "3.13.3",
        "runtime_vendor": "CPython",
        "driver_name": "Databricks SQL Python Connector",
        "char_set_encoding": "utf-8",
        "locale_name": "en_US"
      },
      "driver_connection_params": {
        "http_path": "/sql/1.0/warehouses/864004c1b3961382",
        "mode": "THRIFT",
        "host_info": {
          "host_url": "e2-dogfood.staging.cloud.databricks.com",
          "port": 443
        },
        "auth_mech": "PAT"
      },
      "sql_statement_id": "01f04f5e-b1d6-1a1e-afe4-e99e1ccb8805",
      "sql_operation": {
        "statement_type": "sql",
        "is_compressed": true,
        "execution_result": "inline_arrow",
        "retry_count": 0
      },
      "operation_latency_ms": 518
    }
  }
}
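For readers who want a feel for how a log like the one above could be produced, here is a minimal sketch. It is not the connector's actual code. The names `log_latency` and `_safe_call` are borrowed from the commit messages, but their signatures here are assumptions. `build_latency_event`, `_drop_nulls`, the `export` callback, and `FakeCursor` are hypothetical helpers invented for illustration. The sketch times a method call, reads attributes defensively, omits null fields, and never lets a telemetry failure reach the caller.

```python
import functools
import time
import uuid


def _safe_call(getter):
    """Return getter(), or None if it raises (assumed behaviour, per the commit log)."""
    try:
        return getter()
    except Exception:
        return None


def _drop_nulls(value):
    """Hypothetical helper: recursively omit keys whose value is None."""
    if isinstance(value, dict):
        return {k: _drop_nulls(v) for k, v in value.items() if v is not None}
    return value


def build_latency_event(session_id, statement_id, latency_ms):
    """Hypothetical builder covering a subset of the fields in the example log."""
    return _drop_nulls({
        "frontend_log_event_id": str(uuid.uuid4()),
        "context": {"client_context": {"timestamp_millis": int(time.time() * 1000)}},
        "entry": {
            "sql_driver_log": {
                "session_id": session_id,
                "sql_statement_id": statement_id,
                "operation_latency_ms": latency_ms,
            }
        },
    })


def log_latency(export):
    """Decorator factory (assumed shape): time the call, then pass an event to export."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            start = time.perf_counter()
            try:
                return func(self, *args, **kwargs)
            finally:
                latency_ms = int((time.perf_counter() - start) * 1000)
                try:
                    export(build_latency_event(
                        _safe_call(lambda: self.session_id_hex),
                        _safe_call(lambda: self.statement_id),
                        latency_ms,
                    ))
                except Exception:
                    pass  # telemetry failures must not block the operation
        return wrapper
    return decorator


exported = []


class FakeCursor:
    """Stand-in object; it deliberately has no statement_id attribute."""
    session_id_hex = "01f04f5e-b1a8-1dd1-962a-e40efd092bdd"

    @log_latency(exported.append)
    def execute(self, query):
        return "ok"


result = FakeCursor().execute("SELECT 1")
```

Because `FakeCursor` has no `statement_id`, `_safe_call` yields `None` for it and the `sql_statement_id` key is left out of the exported event rather than sent as null.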
How is this tested?
Related Tickets & Documents
PECOBLR-554