thermal: surgically excise power-down timeout#2585

Open
hawkw wants to merge 1 commit into master from eliza/stop-shutting-yourself-off

hawkw (Member) commented Jul 2, 2026

Presently, the thermal loop will power down the system if any component remains above its "critical" threshold for 60 seconds. We believe this behavior to be excessively conservative, and unnecessary, as an uncontrollable thermal runaway event will eventually result in something tripping its "power-down" threshold and shutting the system down immediately.

This commit makes the smallest possible change to the thermal loop to remove the timeout. This allows us to operate at the edge of our thermal envelope for as long as is necessary, while still ensuring that the fans go full-throttle when we hit 90 Funny AMD Temperature Units, or when anything else hits its crit threshold. The rest of the control loop behavior should be identical.
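As a rough illustration of the intended behavior after this change, here is a minimal sketch of a timeout-free loop. It is not the actual thermal task code: the `State`, `Thresholds`, and `step` names are hypothetical, the real loop has more states and likely some hysteresis, and the 100-degree power-down value in the usage note is made up for the example.

```rust
/// Hypothetical, simplified thermal states; the real control loop has more.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    /// Normal closed-loop fan control.
    Running,
    /// Something is at or above its critical threshold: fans at full throttle.
    Overheated,
    /// Something is at or above its power-down threshold: shut the system down.
    Uncontrollable,
}

/// Hypothetical per-component thresholds, in whatever unit the sensor reports.
#[derive(Debug, Clone, Copy)]
struct Thresholds {
    critical: f32,
    power_down: f32,
}

/// One iteration of the sketched loop. There is deliberately no `now_ms`
/// argument: with the timeout removed, time spent in `Overheated` no
/// longer affects the outcome.
fn step(state: State, temps: &[(f32, Thresholds)]) -> State {
    // Assumption: once we decide to power down, the decision is latched.
    if state == State::Uncontrollable {
        return State::Uncontrollable;
    }
    // Any component past its power-down threshold shuts us down immediately.
    if temps.iter().any(|(t, th)| *t >= th.power_down) {
        return State::Uncontrollable;
    }
    // Any component past its critical threshold keeps the fans pinned,
    // for as long as it stays there.
    if temps.iter().any(|(t, th)| *t >= th.critical) {
        return State::Overheated;
    }
    State::Running
}
```

With `Thresholds { critical: 90.0, power_down: 100.0 }`, calling `step` repeatedly with a reading of 95.0 stays in `Overheated` indefinitely, while a single reading of 101.0 moves to `Uncontrollable`.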

Comment on lines -1440 to -1443

```rust
        } else if now_ms > *start_time + self.overheat_timeout_ms {
            // If blasting the fans hasn't cooled us down in this amount
            // of time, then something is terribly wrong - abort!
            self.transition_to_uncontrollable(now_ms)
```

Member Author

this is the important part
