@magras magras commented Nov 28, 2025

The backoff function's wait time calculation completely overflows time.Duration on the 55th retry (after approximately 6 hours). From then on the wait time is zero, which leads to the uncontrolled spawning of hundreds of goroutines and can cause memory exhaustion and an OOM kill on Linux.

magras commented Nov 28, 2025

Here is a test program that demonstrates the problem:

package main

import (
	"fmt"
	"math"
	"time"
)

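func main() {
	n := 60
	for i := 0; i < n; i++ {
		// x is 2^i, the intended wait in seconds.
		x := math.Pow(2, float64(i))
		// The int64 nanosecond product wraps silently once 2^i seconds
		// no longer fits in a time.Duration.
		d := time.Duration(x) * time.Second
		fmt.Println(i, x, d)
	}
}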

playground: https://go.dev/play/p/wBxP0SpRRkt

magras commented Dec 8, 2025

Hey @hwh33, @myleshorton, @reflog, could one of you please take a look at this PR?

reflog commented Dec 8, 2025

@magras I'm on it, thank you for reporting

@reflog reflog self-requested a review December 8, 2025 15:04

@reflog reflog left a comment

There may be a subtle bug: if the first request fails before any success, nextWait is still zero (the zero value of time.Duration):

s.nextWait = retryWaitSeconds // only set on success

On first failure → wait = 0 → immediate retry with no backoff.

Recommendation: Initialize nextWait to retryWaitSeconds in the struct or handle the zero case in backoff():

if s.nextWait == 0 {
	s.nextWait = retryWaitSeconds
}
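
A minimal sketch of the second option, assuming a struct that owns a `nextWait` field. The type name `session`, the method body, and the 5-second value of `retryWaitSeconds` are placeholders for illustration, not the code under review.

```go
package main

import (
	"fmt"
	"time"
)

// retryWaitSeconds is a placeholder value; the real constant may differ.
const retryWaitSeconds = 5 * time.Second

// session is a hypothetical stand-in for the struct that owns nextWait.
type session struct {
	nextWait time.Duration // zero until it is first assigned
}

// backoff returns the wait before the next retry and doubles the stored
// value. The zero check covers a failure that happens before any success.
func (s *session) backoff() time.Duration {
	if s.nextWait == 0 {
		s.nextWait = retryWaitSeconds
	}
	wait := s.nextWait
	s.nextWait *= 2
	return wait
}

func main() {
	var s session            // zero value: nextWait == 0
	fmt.Println(s.backoff()) // first failure waits 5s instead of 0s
	fmt.Println(s.backoff()) // then 10s
}
```

This only illustrates the zero check; unbounded doubling still needs an upper limit to avoid overflowing time.Duration.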

magras commented Dec 9, 2025

> Recommendation: Initialize nextWait to retryWaitSeconds in the struct

I don't know Go, but I found no native way to do that.

So I took the second approach, with a slight deviation: the retry wait is now clamped between a minimum and a maximum.


Copilot AI left a comment

Pull request overview

This PR fixes a critical exponential backoff overflow bug that could lead to memory exhaustion and an OOM kill. The previous implementation used math.Pow(2, float64(failCount)), which overflows time.Duration after approximately 55 retries (6 hours), resulting in zero wait times and uncontrolled goroutine spawning.

Key Changes:

  • Replaced counter-based exponential calculation with direct duration tracking to prevent overflow
  • Renamed retryWaitSeconds to minRetryWait for clarity
  • Changed struct field from failCount int to nextWait time.Duration

reflog and others added 2 commits December 9, 2025 11:00
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@reflog reflog merged commit 332d3e3 into getlantern:main Dec 9, 2025
@magras magras deleted the fix-oom branch December 9, 2025 13:21