Commit 1a931be

Fix race condition in config-manager when label is unset

When the node label (nvidia.com/device-plugin.config) is not set, a race condition could cause the config-manager to hang indefinitely on startup.

The issue occurred when the informer's AddFunc fired before the first Get() call, setting current="" and broadcasting. When Get() was subsequently called, it found lastRead == current (both empty strings) and waited forever, as no future events would wake it up.

This fix adds an 'initialized' flag to SyncableConfig to ensure the first Get() call never waits, regardless of timing. Subsequent Get() calls still wait properly when the value hasn't changed.

Signed-off-by: Uri Sternik <uri.sternik@wiz.io>
1 parent 624b771 commit 1a931be

1 file changed: +7 -5 lines changed

cmd/config-manager/main.go

Lines changed: 7 additions & 5 deletions
@@ -79,10 +79,11 @@ type Flags struct {
 // Multiple calls to Set() do not queue, meaning that only calls to Get() made
 // *before* a call to Set() will be notified.
 type SyncableConfig struct {
-	cond     *sync.Cond
-	mutex    sync.Mutex
-	current  string
-	lastRead string
+	cond        *sync.Cond
+	mutex       sync.Mutex
+	current     string
+	lastRead    string
+	initialized bool
 }
 
 // NewSyncableConfig creates a new SyncableConfig
@@ -106,9 +107,10 @@ func (m *SyncableConfig) Set(value string) {
 func (m *SyncableConfig) Get() string {
 	m.mutex.Lock()
 	defer m.mutex.Unlock()
-	if m.initialized == false && m.lastRead == m.current {
+	if m.initialized && m.lastRead == m.current {
 		m.cond.Wait()
 	}
+	m.initialized = true
 	m.lastRead = m.current
 	return m.lastRead
 }
