4 changes: 2 additions & 2 deletions source/layers/validation/README.md
@@ -106,7 +106,7 @@ Currently checked things:
- check whether created immediate command lists are using in order queues
- check whether in order command lists are using copy offload

### `ZEL_ENABLE_SYSTEM_RESOURCE_TRACKER_CHECKER` (Linux Only)
### `ZEL_ENABLE_SYSTEM_RESOURCE_TRACKER_CHECKER`

The System Resource Tracker monitors both Level Zero API resources and system resources in real-time. It tracks:

@@ -137,7 +137,7 @@ export ZEL_LOADER_LOGGING_LEVEL=debug
- Debugging and benchmarking
- CI/CD integration for automated resource monitoring

**Platform Support:** This checker is Linux-only and uses `/proc/self/status` for system metrics. It is automatically excluded from Windows and macOS builds.
**Platform Support:** This checker supports Linux and Windows. On Linux it reads `/proc/self/status`; on Windows it uses the Win32 `PSAPI` and `Toolhelp32` APIs. macOS is not supported.

See [System Resource Tracker documentation](checkers/system_resource_tracker/system_resource_tracker.md) for detailed usage and CSV format.

6 changes: 1 addition & 5 deletions source/layers/validation/checkers/CMakeLists.txt
@@ -4,8 +4,4 @@ add_subdirectory(events_checker)
add_subdirectory(performance)
add_subdirectory(parameter_validation)
add_subdirectory(template)

# System resource tracker is Linux-only (uses /proc/self/status)
if(UNIX AND NOT APPLE)
add_subdirectory(system_resource_tracker)
endif()
add_subdirectory(system_resource_tracker)
@@ -1,8 +1,5 @@
# System resource tracker is Linux-only (uses /proc/self/status)
if(UNIX AND NOT APPLE)
target_sources(${TARGET_NAME}
PRIVATE
${CMAKE_CURRENT_LIST_DIR}/zel_system_resource_tracker_checker.h
${CMAKE_CURRENT_LIST_DIR}/zel_system_resource_tracker_checker.cpp
)
endif()
target_sources(${TARGET_NAME}
PRIVATE
${CMAKE_CURRENT_LIST_DIR}/zel_system_resource_tracker_checker.h
${CMAKE_CURRENT_LIST_DIR}/zel_system_resource_tracker_checker.cpp
)
@@ -4,7 +4,7 @@

The System Resource Tracker is a Level Zero validation layer checker that monitors both Level Zero API resources and system resources in real-time. It tracks resource allocation and deallocation across all Level Zero API calls that create or destroy resources, providing detailed insights into memory usage, resource lifecycles, and system-level metrics.

**Platform Support:** Linux only. This checker uses `/proc/self/status` for system metrics and is not available on Windows or macOS.
**Platform Support:** Linux and Windows. On Linux the checker reads `/proc/self/status` for system metrics; on Windows it uses the Win32 `GetProcessMemoryInfo`, `CreateToolhelp32Snapshot`, and `GetProcessHandleCount` APIs. macOS is not supported.

## Features

@@ -14,13 +14,13 @@ The System Resource Tracker is a Level Zero validation layer checker that monito
- Logs warnings when memory increases during destroy operations
- Reports cumulative leaks per resource type at program exit
- Provides detailed per-handle leak information
- **System Resource Monitoring**: Tracks real system metrics via `/proc/self/status` including:
- Virtual memory size (VmSize)
- Resident set size (VmRSS)
- Data segment size (VmData)
- Peak virtual memory (VmPeak)
- **System Resource Monitoring**: Tracks real system metrics including:
- Virtual memory size (VmSize / PrivateUsage on Windows)
- Resident set size (VmRSS / WorkingSetSize on Windows)
- Data segment size (VmData / PrivateUsage on Windows)
- Peak virtual memory (VmPeak / PeakWorkingSetSize on Windows)
- Thread count
- File descriptor count
- File descriptor / handle count
- **Signed Delta Tracking**: Calculates both positive and negative resource changes (deltas) for each API call with proper signed arithmetic
- **Cumulative Summaries**: Maintains running totals of all resource types and leak totals
- **CSV Export**: Optionally exports timestamped data for graphing and analysis
@@ -34,6 +34,7 @@ The System Resource Tracker is a Level Zero validation layer checker that monito

Enable the checker to log resource usage to the Level Zero debug log:

**Linux:**
```bash
export ZE_ENABLE_VALIDATION_LAYER=1
export ZEL_ENABLE_SYSTEM_RESOURCE_TRACKER_CHECKER=1
@@ -44,10 +45,22 @@ export ZEL_LOADER_LOGGING_LEVEL=debug
./my_level_zero_app
```

**Windows (PowerShell):**
```powershell
$env:ZE_ENABLE_VALIDATION_LAYER=1
$env:ZEL_ENABLE_SYSTEM_RESOURCE_TRACKER_CHECKER=1
$env:ZEL_ENABLE_LOADER_LOGGING=1
$env:ZEL_LOADER_LOGGING_LEVEL="debug"

# Run your Level Zero application
.\my_level_zero_app.exe
```

### CSV Output for Graphing

Set the `ZEL_SYSTEM_RESOURCE_TRACKER_CSV` environment variable to specify the output CSV file path. A relative path is resolved against the application's current working directory:

**Linux:**
```bash
export ZE_ENABLE_VALIDATION_LAYER=1
export ZEL_ENABLE_SYSTEM_RESOURCE_TRACKER_CHECKER=1
@@ -57,6 +70,16 @@ export ZEL_SYSTEM_RESOURCE_TRACKER_CSV=tracker_output.csv
./my_level_zero_app
```

**Windows (PowerShell):**
```powershell
$env:ZE_ENABLE_VALIDATION_LAYER=1
$env:ZEL_ENABLE_SYSTEM_RESOURCE_TRACKER_CHECKER=1
$env:ZEL_SYSTEM_RESOURCE_TRACKER_CSV="tracker_output.csv"

# Run your Level Zero application
.\my_level_zero_app.exe
```

**Note:** The actual output file will include the process ID (e.g., `tracker_output_pid12345.csv`) to ensure each process creates a unique file. This prevents conflicts when multiple processes use the tracker simultaneously.
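The file naming described in the note can be sketched as follows. This is a minimal illustration rather than the checker's actual code: `makeUniquePath` is a hypothetical helper name, and the `_pid<PID>` suffix format is inferred from the `tracker_output_pid12345.csv` example above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the checker's API): inserts "_pid<PID>"
// before the file extension, or appends it when there is no extension.
std::string makeUniquePath(const std::string& path, unsigned long pid) {
    const std::string suffix = "_pid" + std::to_string(pid);
    const std::size_t slash = path.find_last_of("/\\");
    const std::size_t dot = path.find_last_of('.');

    // A dot counts as an extension separator only when it sits inside the
    // final path component and is not that component's first character.
    const bool hasExtension =
        dot != std::string::npos && dot != 0 &&
        (slash == std::string::npos || dot > slash + 1);

    if (!hasExtension) {
        return path + suffix;
    }
    return path.substr(0, dot) + suffix + path.substr(dot);
}
```

With this sketch, `makeUniquePath("tracker_output.csv", 12345)` yields `tracker_output_pid12345.csv`, and a path without an extension simply gets the suffix appended.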

## Tracked API Calls
@@ -317,7 +340,7 @@ The System Resource Tracker is implemented as a validation layer checker that us
- Per-resource-type leak counters
- Thread-local pre-call metrics storage for append operations
- `getResourceTracker()`: Function-local static singleton accessor ensuring proper initialization order
- `getSystemResourceMetrics()`: Parses `/proc/self/status` to read current system metrics
- `getSystemResourceMetrics()`: Reads current system metrics — parses `/proc/self/status` on Linux; uses `GetProcessMemoryInfo`, `CreateToolhelp32Snapshot`, and `GetProcessHandleCount` on Windows
- `checkForLeak()`: Compares creation metrics to destruction metrics and logs warnings if memory increased
- `writeCsvData()`: Atomic CSV line writer using ostringstream with signed delta support
- `logResourceSummary()`: Formats and logs cumulative resource usage
@@ -335,13 +358,20 @@ The tracker uses multiple mechanisms to ensure thread safety:
### Performance Considerations

- Tracking overhead is typically under 1 ms per API call
- System metrics are read by parsing a small text file (`/proc/self/status` on Linux)
- System metrics are read from `/proc/self/status` on Linux or via Win32 `PSAPI` / `Toolhelp32` on Windows
- CSV writes are buffered and flushed after each call to ensure crash safety
- The tracker only runs when explicitly enabled via environment variable
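To illustrate why the Linux path is cheap, the sketch below parses text in the `/proc/self/status` format from any input stream. It is an assumption-laden illustration, not the checker's implementation: `StatusMetrics` and `parseProcStatus` are hypothetical names, and only the fields named in this document are extracted.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical subset of the tracked metrics; memory values are in kB,
// matching the units reported by /proc/self/status.
struct StatusMetrics {
    std::size_t vmSize = 0;
    std::size_t vmRSS = 0;
    std::size_t vmData = 0;
    std::size_t vmPeak = 0;
    std::size_t numThreads = 0;
};

// Reads "Key:\t<value> [kB]" lines and keeps the fields of interest.
StatusMetrics parseProcStatus(std::istream& in) {
    StatusMetrics m;
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t colon = line.find(':');
        if (colon == std::string::npos) {
            continue;  // Not a key/value line.
        }
        const std::string key = line.substr(0, colon);
        std::istringstream value(line.substr(colon + 1));
        std::size_t n = 0;
        if (!(value >> n)) {
            continue;  // Value is not numeric (e.g. the "Name" field).
        }
        if (key == "VmSize") m.vmSize = n;
        else if (key == "VmRSS") m.vmRSS = n;
        else if (key == "VmData") m.vmData = n;
        else if (key == "VmPeak") m.vmPeak = n;
        else if (key == "Threads") m.numThreads = n;
    }
    return m;
}
```

On Linux this could be called with a `std::ifstream` opened on `/proc/self/status`; taking a generic `std::istream` keeps the parser testable with an in-memory string.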

### Platform Support

The System Resource Tracker is **Linux-only** and relies on `/proc/self/status` for system resource metrics. The checker is automatically excluded from builds on Windows and macOS.
The System Resource Tracker supports **Linux and Windows**:

| Platform | Metrics source | VmSize | VmRSS | VmData | VmPeak | Threads | FDs/Handles |
|----------|---------------|--------|-------|--------|--------|---------|-------------|
| Linux | `/proc/self/status` | VmSize | VmRSS | VmData | VmPeak | Threads field | `getrlimit(RLIMIT_NOFILE)` |
| Windows | Win32 PSAPI / Toolhelp32 | `PrivateUsage` | `WorkingSetSize` | `PrivateUsage` | `PeakWorkingSetSize` | thread snapshot | `GetProcessHandleCount` |

macOS is not supported.

## Troubleshooting

@@ -12,8 +12,15 @@
#include <iomanip>
#include <fstream>
#include <sstream>
#ifdef _WIN32
#include <windows.h>
#include <psapi.h>
#include <tlhelp32.h>
typedef ptrdiff_t ssize_t;
#else
#include <sys/resource.h>
#include <unistd.h>
#endif
#include <cstring>
#include <mutex>
#include <chrono>
@@ -162,6 +169,50 @@ namespace validation_layer
static SystemResourceMetrics getSystemResourceMetrics();
static void writeCsvData(const std::string& apiCall, const SystemResourceMetrics& current, const SystemResourceMetrics& delta, bool checkLeak);

#ifdef _WIN32
// Helper function to read system resource metrics (Windows)
static SystemResourceMetrics getSystemResourceMetrics() {
SystemResourceMetrics metrics;

PROCESS_MEMORY_COUNTERS_EX pmc;
ZeroMemory(&pmc, sizeof(pmc));
pmc.cb = sizeof(pmc);
if (GetProcessMemoryInfo(GetCurrentProcess(),
reinterpret_cast<PROCESS_MEMORY_COUNTERS*>(&pmc),
sizeof(pmc))) {
metrics.vmSize = static_cast<size_t>(pmc.PrivateUsage) / 1024;
metrics.vmRSS = static_cast<size_t>(pmc.WorkingSetSize) / 1024;
metrics.vmData = static_cast<size_t>(pmc.PrivateUsage) / 1024;
metrics.vmPeak = static_cast<size_t>(pmc.PeakWorkingSetSize) / 1024;
}

// Count threads belonging to this process
DWORD currentPid = GetCurrentProcessId();
HANDLE hSnapshot = CreateToolhelp32Snapshot(TH32CS_SNAPTHREAD, 0);
if (hSnapshot != INVALID_HANDLE_VALUE) {
THREADENTRY32 te;
te.dwSize = sizeof(THREADENTRY32);
size_t threadCount = 0;
if (Thread32First(hSnapshot, &te)) {
do {
if (te.th32OwnerProcessID == currentPid) {
threadCount++;
}
} while (Thread32Next(hSnapshot, &te));
}
CloseHandle(hSnapshot);
metrics.numThreads = threadCount;
}

// Count open handles
DWORD handleCount = 0;
if (GetProcessHandleCount(GetCurrentProcess(), &handleCount)) {
metrics.numFDs = static_cast<size_t>(handleCount);
}

return metrics;
}
#else
// Helper function to read system resource metrics from /proc/self/status
static SystemResourceMetrics getSystemResourceMetrics() {
SystemResourceMetrics metrics;
@@ -207,6 +258,7 @@ namespace validation_layer

return metrics;
}
#endif

// Helper function to write CSV data with signed deltas (assumes mutex is already held by caller)
static void writeCsvData(const std::string& apiCall, const SystemResourceMetrics& current, const SystemResourceMetrics& delta, bool checkLeak = false) {
@@ -402,11 +454,21 @@ namespace validation_layer
std::to_string(reinterpret_cast<uintptr_t>(&getResourceTracker())));

// Check if CSV output is requested
#ifdef _WIN32
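// Note: _dupenv_s allocates the returned buffer; it must be released with free() once the path has been copied.
char* csvPath = nullptr;
size_t csvPathLen = 0;
_dupenv_s(&csvPath, &csvPathLen, "ZEL_SYSTEM_RESOURCE_TRACKER_CSV");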
#else
const char* csvPath = getenv("ZEL_SYSTEM_RESOURCE_TRACKER_CSV");
#endif
if (csvPath && csvPath[0] != '\0') {
try {
// Create unique filename per process by appending PID
#ifdef _WIN32
DWORD pid = GetCurrentProcessId();
#else
pid_t pid = getpid();
#endif
std::string uniquePath(csvPath);

// Insert PID before file extension or at end