add callback: mem_snapshot for profile #9411

Open
mugglewei97 wants to merge 1 commit into modelscope:main from mugglewei97:feat/callback_mem_snapshot

@mugglewei97

PR type

  • Bug Fix
  • [Y] New Feature
  • Document Updates
  • More Models or Datasets Support

PR information

Add a new callback that helps profile memory usage.

Experiment results

Tested locally; it works.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a MemorySnapshotCallback to record and visualize CUDA memory history at specified intervals during training, along with new configuration arguments mem_snapshot_path and mem_snapshot_interval. Review feedback suggests improving the implementation's robustness by validating that the snapshot interval is a positive integer to prevent division by zero errors, using os.path.splitext for safer file path manipulation, and ensuring the target directory exists before saving snapshots.
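
The three robustness suggestions above can be sketched with the standard library alone. This is a minimal illustration, not the code in this PR: the helper names `validate_interval`, `snapshot_path_for_step`, and `should_snapshot`, and the `_step{N}` file naming, are hypothetical, and the actual snapshot dump is left out because the diff is not shown here.

```python
import os


def validate_interval(interval):
    """Reject values that would break `step % interval` (hypothetical helper)."""
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(interval, bool) or not isinstance(interval, int) or interval <= 0:
        raise ValueError(
            f"snapshot interval must be a positive integer, got {interval!r}")
    return interval


def snapshot_path_for_step(base_path, step):
    """Build a per-step file name and make sure its directory exists.

    The `_step{N}` suffix is an assumed naming scheme for illustration.
    """
    # splitext splits on the last extension only, so dots in directory
    # names or earlier in the file name are left alone.
    root, ext = os.path.splitext(base_path)
    path = f"{root}_step{step}{ext}"
    parent = os.path.dirname(path)
    if parent:
        # exist_ok avoids an error when the directory is already there.
        os.makedirs(parent, exist_ok=True)
    return path


def should_snapshot(step, interval):
    """True on every `interval`-th step, assuming a validated interval."""
    return step > 0 and step % interval == 0
```

Validating the interval once at construction time, rather than on every step, would keep the modulo check in the training loop free of error handling.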

Comment thread swift/callbacks/mem_snapshot.py Outdated
Comment thread swift/callbacks/mem_snapshot.py Outdated
Comment thread swift/callbacks/mem_snapshot.py
Comment thread swift/trainers/arguments.py Outdated
@mugglewei97 mugglewei97 force-pushed the feat/callback_mem_snapshot branch from 40c948e to 748dc52 Compare May 25, 2026 08:30
@mugglewei97 mugglewei97 force-pushed the feat/callback_mem_snapshot branch from 748dc52 to 57df710 Compare May 25, 2026 09:51