Skip to content

Commit 9504c87

Browse files
committed
doc: instrumentation: add documentation for the instrumentation subsystem
Add documentation for the instrumentation subsystem as this was missing from the initial contribution. Signed-off-by: Benjamin Cabé <benjamin@zephyrproject.org>
1 parent 91b1b84 commit 9504c87

File tree

4 files changed

+252
-0
lines changed

4 files changed

+252
-0
lines changed

doc/services/index.rst

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -17,6 +17,7 @@ OS Services
1717
file_system/index.rst
1818
formatted_output.rst
1919
input/index.rst
20+
instrumentation/index.rst
2021
ipc/index.rst
2122
llext/index.rst
2223
logging/index.rst
Lines changed: 240 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,240 @@
1+
.. _instrumentation:
2+
3+
Instrumentation
4+
###############
5+
6+
Overview
7+
********
8+
9+
The instrumentation subsystem provides compiler-managed runtime system instrumentation capabilities
10+
for Zephyr applications. It enables developers to trace function calls, observe context switches,
11+
and profile application performance with minimal manual instrumentation effort.
12+
13+
Unlike the :ref:`tracing <tracing>` subsystem, which provides RTOS-aware tracing with structured
14+
event APIs, the instrumentation subsystem works at a lower level by leveraging compiler
15+
instrumentation hooks. This approach makes it possible to capture virtually any function entry and
16+
exit events without requiring manual tracing calls in the code.
17+
18+
.. admonition:: Tracing vs. Instrumentation
19+
:class: hint
20+
21+
**When to use Tracing**: Choose the tracing subsystem when you need RTOS-aware event tracing
22+
(e.g. thread switches, semaphore operations, etc.) and want to minimize overhead.
23+
24+
**When to use Instrumentation**: Choose instrumentation when you need a detailed view of
25+
function-level execution to better understand code flow, or to identify performance bottlenecks
26+
without adding manual trace points.
27+
28+
The instrumentation subsystem relies on compiler support for automatic function instrumentation.
29+
When enabled, the compiler automatically inserts calls to special instrumentation handler functions
30+
at the entry and exit of every function in your application (excluding those explicitly marked with
31+
``__no_instrumentation__``). Currently, only GCC is supported with the ``-finstrument-functions``
32+
compiler flag.
33+
34+
The subsystem initializes automatically after RAM initialization and uses trigger/stopper functions
35+
to control when recording is active. The default trigger and stopper functions are both set to
36+
``main()`` (configurable via Kconfig), meaning instrumentation captures the entire execution from
37+
when ``main()`` starts until it returns.
38+
39+
The recorded data is stored in RAM and can be accessed from a host computer thanks to a UART backend
40+
that exposes a set of simple commands. :zephyr_file:`scripts/instrumentation/zaru.py` script allows
41+
to execute these commands through a high-level command-line interface and makes it easy to obtain
42+
data in a format suitable for further analysis (e.g. using `Perfetto`_).
43+
44+
Operational Modes
45+
*****************
46+
47+
The instrumentation subsystem supports two modes that can be enabled independently or together:
48+
49+
Callgraph Mode (Tracing)
50+
========================
51+
52+
In callgraph mode (enabled with :kconfig:option:`CONFIG_INSTRUMENTATION_MODE_CALLGRAPH`), the
53+
subsystem records function entry and exit events along with timestamps and context information in a
54+
memory buffer. This enables:
55+
56+
- Reconstruction of the complete function call graph
57+
- Observation of thread context switches
58+
- Analysis of execution flow and timing relationships
59+
60+
The trace buffer can operate in ring buffer mode (default, overwrites old entries) or fixed buffer
61+
mode (stops when full). Buffer size is configurable via
62+
:kconfig:option:`CONFIG_INSTRUMENTATION_MODE_CALLGRAPH_TRACE_BUFFER_SIZE`.
63+
64+
.. code-block:: console
65+
:caption: Example of callgraph mode output. See :ref:`zaru_usage` for more details.
66+
67+
$ ./scripts/instrumentation/zaru.py trace
68+
69+
Thread Name Thread ID CPU Mode Timestamp Function(s)
70+
------------------------------------------------------------------------------------------------
71+
... (truncated) ...
72+
73+
main 0x20001a38 0) 0 | 187837720 ns | sys_dlist_append();
74+
main 0x20001a38 0) 0 | 188802680 ns | }; /* z_priq_simple_add */
75+
main 0x20001a38 0) 0 | 189282840 ns | }; /* add_to_waitq_locked */
76+
main 0x20001a38 0) 0 | 189770000 ns | add_thread_timeout();
77+
main 0x20001a38 0) 0 | 190732920 ns | }; /* pend_locked */
78+
main 0x20001a38 0) 0 | 191198480 ns | k_spin_release();
79+
main 0x20001a38 0) 0 | 192125560 ns | z_swap() {
80+
main 0x20001a38 0) 0 | 192590080 ns | k_spin_release();
81+
main 0x20001a38 0) 0 | 193520000 ns | z_swap_irqlock() {
82+
main 0x20001a38 0) 0 | 193987840 ns | __set_BASEPRI() {
83+
main 0x20001a38 0) 0 | 194474640 ns | /* --> Scheduler switched OUT from thread 'main' */
84+
thread-none none-thread 0) 0 | 195178000 ns | /* <-- Scheduler switched IN thread 'thread-none' */
85+
thread-none none-thread 0) 0 | 195851520 ns | z_thread_entry() {
86+
thread-none none-thread 0) 0 | 196312600 ns | k_sched_current_thread_query() {
87+
thread-none none-thread 0) 0 | 196774680 ns | z_impl_k_sched_current_thread_query();
88+
thread-none none-thread 0) 0 | 197694480 ns | }; /* k_sched_current_thread_query */
89+
thread_A 0x200000d8 0) 7 | 198160000 ns | thread_A() {
90+
thread_A 0x200000d8 0) 7 | 198443400 ns | get_sem_and_exec_function() {
91+
thread_A 0x200000d8 0) 7 | 198727440 ns | k_sem_take() {
92+
thread_A 0x200000d8 0) 7 | 199011840 ns | z_impl_k_sem_take() {
93+
thread_A 0x200000d8 0) 7 | 199397520 ns | k_spin_lock() {
94+
thread_A 0x200000d8 0) 7 | 199784200 ns | __get_BASEPRI();
95+
thread_A 0x200000d8 0) 7 | 200557840 ns | __set_BASEPRI_MAX();
96+
thread_A 0x200000d8 0) 7 | 201333640 ns | __ISB();
97+
thread_A 0x200000d8 0) 7 | 202111360 ns | z_spinlock_validate_pre();
98+
thread_A 0x200000d8 0) 7 | 202891000 ns | z_spinlock_validate_post();
99+
thread_A 0x200000d8 0) 7 | 203664760 ns | }; /* k_spin_lock */
100+
thread_A 0x200000d8 0) 7 | 204058000 ns | k_spin_unlock() {
101+
thread_A 0x200000d8 0) 7 | 204450840 ns | __set_BASEPRI();
102+
thread_A 0x200000d8 0) 7 | 205231640 ns | __ISB();
103+
thread_A 0x200000d8 0) 7 | 206009600 ns | }; /* k_spin_unlock */
104+
thread_A 0x200000d8 0) 7 | 206291600 ns | }; /* z_impl_k_sem_take */
105+
thread_A 0x200000d8 0) 7 | 206572920 ns | }; /* k_sem_take */
106+
107+
... (truncated) ...
108+
109+
Statistical Mode (Profiling)
110+
============================
111+
112+
In statistical mode (enabled with :kconfig:option:`CONFIG_INSTRUMENTATION_MODE_STATISTICAL`), the
113+
subsystem accumulates timing statistics for each unique function executed between the trigger and
114+
stopper points. This provides total execution time per function and helps identify performance
115+
bottlenecks. The subsystem tracks up to
116+
:kconfig:option:`CONFIG_INSTRUMENTATION_MODE_STATISTICAL_MAX_NUM_FUNC` unique functions.
117+
118+
.. code-block:: console
119+
:caption: Example of statistical mode output (top 10 most expensive functions). See
120+
:ref:`zaru_usage` for more details.
121+
122+
$ ./scripts/instrumentation/zaru.py profile -n 10
123+
124+
9.45% 0000061d main
125+
6.00% 0000049d k_msleep
126+
5.98% 00000469 k_sleep
127+
5.95% 0000aea1 z_impl_k_sleep
128+
5.93% 0000ad6d z_tick_sleep
129+
5.66% 00000431 k_sem_take
130+
5.65% 00007e65 z_impl_k_sem_take
131+
5.51% 0000ac29 z_pend_curr
132+
2.83% 000063ed sys_clock_isr
133+
2.67% 0000d361 sys_clock_announce
134+
135+
Configuration
136+
*************
137+
138+
Enable instrumentation with:
139+
140+
.. code-block:: kconfig
141+
142+
CONFIG_INSTRUMENTATION=y
143+
CONFIG_INSTRUMENTATION_MODE_CALLGRAPH=y # For tracing
144+
CONFIG_INSTRUMENTATION_MODE_STATISTICAL=y # For profiling
145+
146+
The instrumentation subsystem uses :ref:`retained memory <retention_api>` to persist trigger/stopper
147+
function addresses across reboots. This must be configured in the devicetree:
148+
149+
.. code-block:: devicetree
150+
151+
/ {
152+
sram@2003FC00 {
153+
compatible = "zephyr,memory-region", "mmio-sram";
154+
reg = <0x2003FC00 DT_SIZE_K(1)>;
155+
zephyr,memory-region = "RetainedMem";
156+
157+
retainedmem {
158+
compatible = "zephyr,retained-ram";
159+
status = "okay";
160+
161+
instrumentation_triggers: retention@0 {
162+
compatible = "zephyr,retention";
163+
status = "okay";
164+
reg = <0x0 0x10>;
165+
};
166+
};
167+
};
168+
};
169+
170+
/* Adjust main SRAM to exclude retained region */
171+
&sram0 {
172+
reg = <0x20000000 DT_SIZE_K(255)>;
173+
};
174+
175+
See the :zephyr:code-sample:`instrumentation` sample for complete configuration examples.
176+
Additional options include buffer sizes, trigger functions, and function/file exclusion lists (see
177+
Kconfig options starting with :kconfig:option-regex:`CONFIG_INSTRUMENTATION_*`).
178+
179+
.. _zaru_usage:
180+
181+
``zaru.py`` Usage
182+
*****************
183+
184+
The ``zaru.py`` command-line tool (located in :zephyr_file:`scripts/instrumentation/zaru.py`)
185+
provides an interface for controlling instrumentation and extracting data from the target over UART.
186+
187+
The tool offers several commands:
188+
189+
- ``status``: Check if the target device supports callgraph (tracing) and statistical (profiling)
190+
modes.
191+
- ``trace``: Capture and display function call traces.
192+
- ``profile``: Capture and display function profiling data.
193+
- ``reboot``: Reboot the target device.
194+
195+
You can get help for each command by running ``zaru.py <command> --help``.
196+
197+
By default, ``zaru.py`` attempts to connect to the target device using ``/dev/ttyACM0``. You can
198+
specify a different serial port using the ``--serial`` option:
199+
200+
.. code-block:: console
201+
202+
$ ./scripts/instrumentation/zaru.py --serial /dev/ttyACM1 status
203+
204+
The ``--build-dir`` option can be used to specify the Zephyr build directory, which is needed to
205+
locate the ELF file for symbol resolution. If not provided, ``zaru.py`` will attempt to find it
206+
automatically.
207+
208+
See the :zephyr:code-sample:`instrumentation` sample documentation for detailed usage instructions.
209+
210+
Limitations and Considerations
211+
******************************
212+
213+
Compiler support
214+
The instrumentation subsystem requires GCC with ``-finstrument-functions`` support. Other
215+
compilers are not supported.
216+
217+
Stack size requirements
218+
Instrumentation adds overhead to every function call, which increases stack usage. You will likely
219+
need to increase thread stack sizes to accommodate the additional space required by instrumentation
220+
handlers and nested function calls.
221+
222+
Execution overhead
223+
All function calls incur instrumentation overhead. Code size will increase due to added
224+
instrumentation calls, and performance will be impacted.
225+
226+
Initialization constraints
227+
Code that runs before RAM initialization (e.g., early boot functions) is not captured as it runs
228+
before the instrumentation subsystem is initialized.
229+
230+
To reduce overhead, use trigger/stopper functions to instrument only code regions of interest, and
231+
exclude performance-critical functions via
232+
:kconfig:option:`CONFIG_INSTRUMENTATION_EXCLUDE_FUNCTION_LIST` and
233+
:kconfig:option:`CONFIG_INSTRUMENTATION_EXCLUDE_FILE_LIST`.
234+
235+
API Reference
236+
*************
237+
238+
.. doxygengroup:: instrumentation_api
239+
240+
.. _Perfetto: https://perfetto.dev/

include/zephyr/instrumentation/instrumentation.h

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,12 @@
77
#ifndef ZEPHYR_INCLUDE_INSTRUMENTATION_INSTRUMENTATION_H_
88
#define ZEPHYR_INCLUDE_INSTRUMENTATION_INSTRUMENTATION_H_
99

10+
/**
11+
* @defgroup instrumentation_api Instrumentation
12+
* @ingroup os_services
13+
* @{
14+
*/
15+
1016
#include <zephyr/kernel.h>
1117

1218
#ifdef __cplusplus
@@ -212,4 +218,8 @@ void *instr_get_stop_func(void);
212218
}
213219
#endif
214220

221+
/**
222+
* @}
223+
*/
224+
215225
#endif /* ZEPHYR_INCLUDE_INSTRUMENTATION_INSTRUMENTATION_H_ */

samples/subsys/instrumentation/README.rst

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,6 @@
11
.. zephyr:code-sample:: instrumentation
22
:name: Instrumentation
3+
:relevant-api: instrumentation_api
34

45
Demonstrate the instrumentation subsystem tracing and profiling features.
56

0 commit comments

Comments
 (0)