Skip to content

Add unit tests for SONiC config_generator orchestrator and service ca…#2237

Open
berendt wants to merge 1 commit intomainfrom
gh2221
Open

Add unit tests for SONiC config_generator orchestrator and service ca…#2237
berendt wants to merge 1 commit intomainfrom
gh2221

Conversation

@berendt
Copy link
Copy Markdown
Member

@berendt berendt commented Apr 28, 2026

…ches

Covers natural_sort_key, generate_sonic_config (base-config loading, deepcopy, primary-IP / AS routing, OOB management, breakout merge, config_version normalization), the metalbox / NTP / DNS caches, and the log-server / SNMP helpers. Helpers from companion sub-issues are patched at their import sites so this file exercises only the orchestrator's own glue.

Also hardens generate_sonic_config so the top-level scaffold keys it and its helpers index into directly are always present, even when /etc/sonic/config_db.json is absent or fails to load.

Closes #2221

AI-assisted: Claude Code

Copy link
Copy Markdown

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've left some high level feedback:

  • The list of top-level scaffold keys is duplicated between generate_sonic_config and the _make_base_config test helper; consider extracting this list into a single constant or helper used by both to avoid drift when keys change.
  • The test_config_generator_orchestrator.py file has grown very large and spans multiple distinct concerns (orchestrator, metalbox cache, NTP, DNS, log, SNMP); consider splitting it into smaller test modules per concern to keep future changes more manageable.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The list of top-level scaffold keys is duplicated between `generate_sonic_config` and the `_make_base_config` test helper; consider extracting this list into a single constant or helper used by both to avoid drift when keys change.
- The `test_config_generator_orchestrator.py` file has grown very large and spans multiple distinct concerns (orchestrator, metalbox cache, NTP, DNS, log, SNMP); consider splitting it into smaller test modules per concern to keep future changes more manageable.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Copy link
Copy Markdown
Contributor

@ideaship ideaship left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[config_generator.py:319] Dead guard introduced by this PR

The if "VERSIONS" not in config: guard at line 319 is now dead: adding "VERSIONS" to TOP_LEVEL_SCAFFOLD_KEYS guarantees it is present via setdefault at line 193. Remove lines 319–320 only — the inner guard at line 321 (if "DATABASE" not in config["VERSIONS"]:) is still live because the scaffold only creates an empty dict, not the nested key.

"PORTCHANNEL_MEMBER",
"VRF",
"VERSIONS",
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TOP_LEVEL_SCAFFOLD_KEYS is missing sections the orchestrator writes into directly: BGP_NEIGHBOR_AF (e.g. line 1020), BGP_NEIGHBOR (line 1202), BGP_GLOBALS_AF_NETWORK (line 1819), BGP_GLOBALS_AF (line 2014), BGP_GLOBALS_ROUTE_ADVERTISE (line 2035), VXLAN_TUNNEL, VXLAN_EVPN_NVO, VXLAN_TUNNEL_MAP (lines 2067–2086). With a missing or sparse config_db.json, any device with connected ports, Loopback0 routes, or VRFs with VNI will raise KeyError at runtime. These eight keys need to be added to the tuple.

from osism.tasks.conductor.sonic import config_generator
from osism.tasks.conductor.sonic.config_generator import (
_add_ntp_configuration,
_get_ntp_servers,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_get_ntp_servers is not called from production code — _add_ntp_configuration reaches the metalbox IP via _get_metalbox_ip_for_device directly, never through this function. The associated _ntp_servers_cache global and clear_ntp_cache() are also unreachable. These tests exercise an orphaned private function. Recommend either removing _get_ntp_servers, _ntp_servers_cache, and clear_ntp_cache() if the NetBox-crawl NTP path is obsolete, or wiring the function back into _add_ntp_configuration if it is intended behaviour.

…ches

Covers natural_sort_key, generate_sonic_config (base-config loading,
deepcopy, primary-IP / AS routing, OOB management, breakout merge,
config_version normalization), the metalbox / DNS caches, and the
log-server / SNMP helpers. Helpers from companion sub-issues are patched
at their import sites so this file exercises only the orchestrator's own
glue.

Also hardens generate_sonic_config so the top-level scaffold keys it and
its helpers index into directly are always present, even when
/etc/sonic/config_db.json is absent or fails to load. The scaffold key
list is exposed as TOP_LEVEL_SCAFFOLD_KEYS and reused by the test helper
so the production code and the tests cannot drift apart when keys are
added or removed. Extends the tuple with BGP_GLOBALS_AF,
BGP_GLOBALS_AF_NETWORK, BGP_GLOBALS_ROUTE_ADVERTISE, BGP_NEIGHBOR,
BGP_NEIGHBOR_AF, VXLAN_TUNNEL, VXLAN_EVPN_NVO and VXLAN_TUNNEL_MAP, all
of which the orchestrator writes into directly and which previously
raised KeyError on devices with connected ports, Loopback0 routes or
VRFs with VNI when config_db.json was missing or sparse.

Removes the orphaned NetBox-crawl NTP path (_get_ntp_servers,
_ntp_servers_cache, clear_ntp_cache); _add_ntp_configuration reaches the
metalbox IP via _get_metalbox_ip_for_device directly, so the crawl
helper and its cache had no production callers.

The test module is split per concern — orchestrator, metalbox cache,
NTP, DNS, log-server, SNMP — into sibling test_config_generator_*.py
files sharing a small _config_generator_helpers.py module, with the
cache-reset and mock_nb fixtures lifted into conftest.py and opted into
via pytestmark.usefixtures so unrelated SONiC tests stay unaffected.

Closes #2221

AI-assisted: Claude Code

Signed-off-by: Christian Berendt <berendt@osism.tech>
Copy link
Copy Markdown
Contributor

@ideaship ideaship left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reminder — dead VERSIONS guard (flagged in previous round, still open)

The if "VERSIONS" not in config: guard at config_generator.py:324–325 was called out in the last review and has not been addressed. The scaffold setdefault loop added by this PR guarantees VERSIONS is always present, so lines 324–325 are now unreachable. Only the inner guard on line 326 (if "DATABASE" not in config["VERSIONS"]:) is still live.


New finding — STATIC_ROUTE wholesale replacement (correctness bug, config_generator.py:303)

Line 303 does a full dict replacement:

config["STATIC_ROUTE"] = {}                            # replaces the dict
config["STATIC_ROUTE"]["mgmt|0.0.0.0/0"] = {"nexthop": metalbox_ip}

The scaffold setdefault on line 197–198 preserves any STATIC_ROUTE entries from config_db.json, but line 303 then throws that away. Any custom static routes in the base config are silently dropped for every device that has an OOB IP. The fix is to drop line 303 — the dict is already present and any pre-existing entries should survive.


assert config["MGMT_INTERFACE"]["eth0"] == {"admin_status": "up"}
assert "eth0|10.42.0.5/24" in config["MGMT_INTERFACE"]
assert config["STATIC_ROUTE"]["mgmt|0.0.0.0/0"] == {"nexthop": "10.42.0.1"}
Copy link
Copy Markdown
Contributor

@ideaship ideaship Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing test — this test uses an empty scaffold so the STATIC_ROUTE overwrite in the production code is invisible. The overwrite itself is fixed in #2238; a companion test is still needed here: seed a base config with a non-empty STATIC_ROUTE, trigger the OOB path, and assert the pre-existing entries survive alongside the new mgmt|0.0.0.0/0 route.

config_b = generate_sonic_config(device_b, "HWSKU-B")

assert config_b["DEVICE_METADATA"]["shared"] == {"flag": "original"}
assert config_a["DEVICE_METADATA"] is not config_b["DEVICE_METADATA"]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weak test — the docstring says this guards against cross-device base-config contamination, "a real bug we got bitten by", but that scenario cannot occur with the current code: base_config is a local variable produced by json.load inside generate_sonic_config, so each call already gets an independent dict. Removing deepcopy here would not cause this test to fail.

The test passes but for the wrong reason — it verifies isolation that json.load already provides, not the isolation that deepcopy adds.

@ideaship
Copy link
Copy Markdown
Contributor

Note: the STATIC_ROUTE production bug section in the review body above is covered by #2238 (already open against main). The inline comment on the test file remains — a regression test for that fix still belongs in this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Unit tests for osism/tasks/conductor/sonic/config_generator.py — orchestrator, caches, log/SNMP

3 participants