fix(invocation stats): Report delta VRAM for each invocation; fix RAM cache reporting #8746
+18
−12
Summary
The VRAM peak usage information included in the invocation performance statistics printed at the end of each generation was not very useful, as it did not indicate how much additional VRAM was actually being used by each node. This PR changes this information to delta VRAM, allowing you to see when a node's execution caused the allocated VRAM to increase over the course of the generation. Because of the VRAM cache allocation algorithm, there can also be decreases in VRAM when the execution of a node causes part of a model to be moved back to RAM. I think this is useful information as well.
This PR also fixes a bug that was causing the RAM cache size to be reported as 0.00G.
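As a rough illustration of the delta VRAM idea described above, here is a minimal sketch, not the code in this PR. It reads the allocated memory before and after each node runs and accumulates the difference per node type. The names `DeltaVramTracker`, `NodeStats`, and `read_allocated_bytes` are hypothetical; on a CUDA system one could pass `torch.cuda.memory_allocated` as the reader, which is an assumption about usage rather than something this PR states.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator

GB = 2**30


@dataclass
class NodeStats:
    """Accumulated statistics for one node type (hypothetical structure)."""
    calls: int = 0
    delta_vram_gb: float = 0.0


@dataclass
class DeltaVramTracker:
    """Records how much allocated VRAM changes while each node executes."""
    # Callable returning the currently allocated bytes; injected so the
    # sketch runs without a GPU.
    read_allocated_bytes: Callable[[], int]
    stats: Dict[str, NodeStats] = field(default_factory=dict)

    @contextmanager
    def track(self, node_type: str) -> Iterator[None]:
        before = self.read_allocated_bytes()
        try:
            yield
        finally:
            after = self.read_allocated_bytes()
            entry = self.stats.setdefault(node_type, NodeStats())
            entry.calls += 1
            # The delta can be negative when part of a model is moved
            # back to RAM during the node's execution.
            entry.delta_vram_gb += (after - before) / GB


# Usage with a fake allocator standing in for the GPU.
allocated = [0]
tracker = DeltaVramTracker(lambda: allocated[0])

with tracker.track("denoise_latents"):
    allocated[0] += 2 * GB   # node loads 2 GB onto the device
with tracker.track("string"):
    pass                     # non-GPU node: no change
with tracker.track("l2i"):
    allocated[0] -= 1 * GB   # 1 GB is offloaded back to RAM

for name, s in tracker.stats.items():
    print(f"{name:<16} calls={s.calls} delta VRAM={s.delta_vram_gb:+.2f}G")
```

With a delta rather than a peak, a node that allocates nothing reports zero, and an offload shows up as a negative value.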
Related Issues / Discussions
None
QA Instructions
Before this PR: run a generation and check the invocation stats. Notice that even non-GPU operations like "string" appear to use VRAM, and that the RAM cache size (last line) is shown as 0.00G.
With this PR: run the same generation again. Since the same nodes are executing, there will ordinarily be no change in VRAM usage (unless you are short of cache memory). The RAM cache size is now reported correctly and should match the cache size dynamically calculated at startup.
Merge Plan
Simple merge.
Checklist
What's New copy (if doing a release after this PR)