Fix tc class EEXIST error during TAP device creation#178
Draft
sjmiller609 wants to merge 1 commit intomainfrom
Draft
Fix tc class EEXIST error during TAP device creation#178sjmiller609 wants to merge 1 commit intomainfrom
sjmiller609 wants to merge 1 commit intomainfrom
Conversation
tc class add fails with RTNETLINK EEXIST when a class with the same ID already exists. This happens when removeVMClass cleanup silently fails (all tc teardown is best-effort) leaving orphaned classes, or on restore/re-creation paths where the TAP is gone but the class persists. tc class replace is the idempotent equivalent — it creates the class if missing or updates it in place if it already exists, eliminating the 'File exists' error without changing any other behavior.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes the
failed to create instanceerror caused by:Root cause:
addVMClassusestc class addwhich fails withEEXISTwhen a class with the same ID already exists on the bridge. This happens because:Orphaned tc classes from failed cleanup.
removeVMClass(called during TAP teardown) is entirely best-effort — filter, qdisc, and class deletions all silently swallow errors. If the filter deletion fails (fragile string parsing oftc filter showoutput),tc class delalso fails silently because the class still has references. The TAP device gets deleted, but the tc class persists as an orphan. On the next allocation that maps to the same class ID,tc class addhits the orphan and fails.16-bit hash collision risk.
deriveClassIDhashes TAP names via FNV-1a truncated to 16 bits (65,536 possible IDs). By birthday paradox, collision probability reaches ~50% at 256 concurrent TAPs per host. Two different live instances can map to the same class ID.Restore/re-creation paths.
createTAPDevicechecks for an existing TAP (and deletes it if found), but does not check for an existing tc class independently. When the TAP is gone but the class persists, the idempotency check is bypassed.CleanupOrphanedClasseshandles this but only runs atInitialize()time, not before each allocation.Fix: Change
tc class addtotc class replaceinaddVMClass. This is the idempotent equivalent — creates the class if missing, updates it in place if it already exists. No behavioral change for the happy path; eliminates the EEXIST failure on all three root cause paths.One-line change in
lib/network/bridge_linux.go.