gpu-memory

Detailed VRAM profiler for transformer inference with per-layer breakdown, activation analysis, and a predictive memory model that estimates VRAM with <1.2% error. Shows that FFN layers dominate static memory and that measured runtime VRAM exceeds KV-cache estimates by 2-4x.

benchmarking capacity-planning transformers pytorch gpu-memory profiling memory-analysis vram kv-cache llm-inference

Updated Jul 14, 2026
Python

joe0731 / hf_vram_calc

A CLI tool for estimating GPU VRAM requirements for Hugging Face models, supporting various data types, parallelization strategies, and fine-tuning scenarios like LoRA.

gpu-memory vram huggingface pipeline-parallelism memory-estimation huggingface-models hugging-face-transformers huggingface-datasets vram-monitoring vram-calculator vram-memory-estimation

Updated Oct 22, 2025
Python

omkhairate / Nomad

Research-oriented Metal path tracer for macOS with dynamic geometry residency, GPU memory budgeting, benchmark automation, and neural CGVQM residency studies.

macos benchmarking machine-learning research metal computer-graphics renderer gpu-memory path-tracing neural-rendering

Updated Jul 7, 2026
C++

manishklach / ghostkv-lab

Research harness for evaluating query-time bounded elimination of reconstructable KV-cache witnesses in long-context transformer inference workloads. Related provisional filing: IN 202641062451.

transformer gpu-memory memory-systems kv-cache cxl long-context llm-inference transformer-memory ai-infrastructure flashattention transformer-optimization systems-research long-context-inference attention-optimization

Updated May 18, 2026
Python

manishklach / kv_deadline_scheduler

Deadline-aware KV-cache scheduling for protecting decode-critical request-state under long-context LLM inference pressure.

inference gpu-memory memory-management nvme hbm kv-cache memory-tiering cxl llm long-context vllm pagedattention ai-infrastructure systems-research

Updated Jun 19, 2026
Python

Improve this page

Add a description, image, and links to the gpu-memory topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the gpu-memory topic, visit your repo's landing page and select "manage topics."

gpu-memory

Here are 29 public repositories matching this topic...

NVIDIA / gdrcopy

eyalroz / cuda-api-wrappers

parasj / checkmate

LiyuanLucasLiu / Torch-Scope

Lin-Mao / DrGPUM

mis-wut / feathergpu

pvjosue / OpenCV-Spout

jonlamb-gh / rpi3-rust-fel4-workspace

Alex188dot / GPU-VRAM-Calculator

Fangyh09 / gpustatus

GPUforLLM / llm-vram-calculator

eklitzke / tf-slice

0-u-0 / nvidia-gpu-monitor

obisin / dgls

hongshibao / kubernetes

JohnScheuer / gpu-memory-profiler

joe0731 / hf_vram_calc

omkhairate / Nomad

manishklach / ghostkv-lab

manishklach / kv_deadline_scheduler
