
Proposal for memory limiting in the scrape loop #76

Open
dashpole wants to merge 4 commits into prometheus:main from dashpole:memory_limiter_proposal

Conversation

@dashpole
Contributor

@dashpole dashpole commented Mar 10, 2026

As @bwplotka presented in his "Scrape Trolley Dilemma" talk at PromCon last year, Prometheus could use a mechanism to prevent OOMs caused by short-term memory usage of scrape targets. This has also been a request from some users of the OTel Collector's Prometheus receiver.

PoC: prometheus/prometheus@main...dashpole:prometheus:memory_limiter_simple

cc @bernot-dev @ArthurSens

@dashpole dashpole force-pushed the memory_limiter_proposal branch from d421c14 to 5e8d860 Compare March 10, 2026 19:24
Signed-off-by: David Ashpole <dashpole@google.com>
@dashpole dashpole force-pushed the memory_limiter_proposal branch from 5e8d860 to 9076c48 Compare March 10, 2026 19:33
@bwplotka
Member

FYI: Talk link: https://youtu.be/ulHQUCarjjo?list=PLoz-W_CUquUlHOg314_YttjHL0iGTdE3O

Member

@bwplotka bwplotka left a comment


IMO this is a great plan, something I'd like to see 👍🏽

@dashpole dashpole marked this pull request as ready for review March 11, 2026 13:49
Signed-off-by: David Ashpole <dashpole@google.com>
Member

@bwplotka bwplotka left a comment


I'm supportive. It should break the ice on the non-trivial issues you listed, and it's elegant. This would also help with OTel Collector Prometheus scrapes.

Disclaimer: @dashpole works with me at Google, so I'm not going to merge until we have buy-in from other maintainers.

WDYT @krajorama @roidelapluie @bboreham @ArthurSens ?

Member

@saswatamcode saswatamcode left a comment


Would love to see this!


**2. The Prometheus Server Operator:**
Server operators need to understand the global impact of memory limiting so they can take corrective action (e.g., increasing memory limits, adding Prometheus replicas, or investigating massive targets).
* **Counter for aborted scrapes:** A new internal Prometheus metric (e.g., `prometheus_target_scrapes_skipped_memory_limit_total`) will be introduced to track the total number of aborted scrapes globally. Operators can set alerts on this metric to be notified of memory pressure, allowing them to intervene if data loss becomes too widespread.
Contributor Author


Note to reviewers: The name can be debated/changed during review.
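
As a rough illustration of the behavior the excerpt above describes, here is a self-contained Go sketch. This is not the PoC's actual implementation: the function name, the byte-budget accounting, and the in-process counter standing in for `prometheus_target_scrapes_skipped_memory_limit_total` are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// scrapesSkippedMemoryLimit stands in for the proposed
// prometheus_target_scrapes_skipped_memory_limit_total counter,
// which operators could alert on to detect widespread data loss.
var scrapesSkippedMemoryLimit atomic.Int64

var errMemoryLimitExceeded = errors.New("scrape aborted: memory limit exceeded")

// scrapeWithLimit simulates appending samples from one scrape while
// tracking their size against a per-scrape byte budget. If the budget
// would be exceeded mid-scrape, the scrape is aborted and the global
// skipped-scrapes counter is incremented.
func scrapeWithLimit(sampleSizes []int, limitBytes int) error {
	used := 0
	for _, sz := range sampleSizes {
		if used+sz > limitBytes {
			scrapesSkippedMemoryLimit.Add(1)
			return errMemoryLimitExceeded
		}
		used += sz
	}
	return nil
}

func main() {
	// A small target fits under the budget; a massive one is aborted.
	fmt.Println(scrapeWithLimit([]int{100, 200}, 1024))   // <nil>
	fmt.Println(scrapeWithLimit([]int{4096, 4096}, 1024)) // scrape aborted: memory limit exceeded
	fmt.Println(scrapesSkippedMemoryLimit.Load())         // 1
}
```

In a real deployment the counter would be a `prometheus/client_golang` counter rather than an `atomic.Int64`, so it is exposed on the server's own `/metrics` endpoint for alerting.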

Signed-off-by: David Ashpole <dashpole@google.com>
Signed-off-by: David Ashpole <dashpole@google.com>
@dashpole dashpole force-pushed the memory_limiter_proposal branch from 9cbc215 to 677f3a0 Compare March 30, 2026 20:16

5 participants