Conversation
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- The list of top-level scaffold keys is duplicated between
generate_sonic_configand the_make_base_configtest helper; consider extracting this list into a single constant or helper used by both to avoid drift when keys change. - The
test_config_generator_orchestrator.pyfile has grown very large and spans multiple distinct concerns (orchestrator, metalbox cache, NTP, DNS, log, SNMP); consider splitting it into smaller test modules per concern to keep future changes more manageable.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The list of top-level scaffold keys is duplicated between `generate_sonic_config` and the `_make_base_config` test helper; consider extracting this list into a single constant or helper used by both to avoid drift when keys change.
- The `test_config_generator_orchestrator.py` file has grown very large and spans multiple distinct concerns (orchestrator, metalbox cache, NTP, DNS, log, SNMP); consider splitting it into smaller test modules per concern to keep future changes more manageable.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
ideaship
left a comment
There was a problem hiding this comment.
[config_generator.py:319] Dead guard introduced by this PR
The if "VERSIONS" not in config: guard at line 319 is now dead: adding "VERSIONS" to TOP_LEVEL_SCAFFOLD_KEYS guarantees it is present via setdefault at line 193. Remove lines 319–320 only — the inner guard at line 321 (if "DATABASE" not in config["VERSIONS"]:) is still live because the scaffold only creates an empty dict, not the nested key.
| "PORTCHANNEL_MEMBER", | ||
| "VRF", | ||
| "VERSIONS", | ||
| ) |
There was a problem hiding this comment.
TOP_LEVEL_SCAFFOLD_KEYS is missing sections the orchestrator writes into directly: BGP_NEIGHBOR_AF (e.g. line 1020), BGP_NEIGHBOR (line 1202), BGP_GLOBALS_AF_NETWORK (line 1819), BGP_GLOBALS_AF (line 2014), BGP_GLOBALS_ROUTE_ADVERTISE (line 2035), VXLAN_TUNNEL, VXLAN_EVPN_NVO, VXLAN_TUNNEL_MAP (lines 2067–2086). With a missing or sparse config_db.json, any device with connected ports, Loopback0 routes, or VRFs with VNI will raise KeyError at runtime. These eight keys need to be added to the tuple.
| from osism.tasks.conductor.sonic import config_generator | ||
| from osism.tasks.conductor.sonic.config_generator import ( | ||
| _add_ntp_configuration, | ||
| _get_ntp_servers, |
There was a problem hiding this comment.
_get_ntp_servers is not called from production code — _add_ntp_configuration reaches the metalbox IP via _get_metalbox_ip_for_device directly, never through this function. The associated _ntp_servers_cache global and clear_ntp_cache() are also unreachable. These tests exercise an orphaned private function. Recommend either removing _get_ntp_servers, _ntp_servers_cache, and clear_ntp_cache() if the NetBox-crawl NTP path is obsolete, or wiring the function back into _add_ntp_configuration if it is intended behaviour.
…ches Covers natural_sort_key, generate_sonic_config (base-config loading, deepcopy, primary-IP / AS routing, OOB management, breakout merge, config_version normalization), the metalbox / DNS caches, and the log-server / SNMP helpers. Helpers from companion sub-issues are patched at their import sites so this file exercises only the orchestrator's own glue. Also hardens generate_sonic_config so the top-level scaffold keys it and its helpers index into directly are always present, even when /etc/sonic/config_db.json is absent or fails to load. The scaffold key list is exposed as TOP_LEVEL_SCAFFOLD_KEYS and reused by the test helper so the production code and the tests cannot drift apart when keys are added or removed. Extends the tuple with BGP_GLOBALS_AF, BGP_GLOBALS_AF_NETWORK, BGP_GLOBALS_ROUTE_ADVERTISE, BGP_NEIGHBOR, BGP_NEIGHBOR_AF, VXLAN_TUNNEL, VXLAN_EVPN_NVO and VXLAN_TUNNEL_MAP, all of which the orchestrator writes into directly and which previously raised KeyError on devices with connected ports, Loopback0 routes or VRFs with VNI when config_db.json was missing or sparse. Removes the orphaned NetBox-crawl NTP path (_get_ntp_servers, _ntp_servers_cache, clear_ntp_cache); _add_ntp_configuration reaches the metalbox IP via _get_metalbox_ip_for_device directly, so the crawl helper and its cache had no production callers. The test module is split per concern — orchestrator, metalbox cache, NTP, DNS, log-server, SNMP — into sibling test_config_generator_*.py files sharing a small _config_generator_helpers.py module, with the cache-reset and mock_nb fixtures lifted into conftest.py and opted into via pytestmark.usefixtures so unrelated SONiC tests stay unaffected. Closes #2221 AI-assisted: Claude Code Signed-off-by: Christian Berendt <berendt@osism.tech>
ideaship
left a comment
There was a problem hiding this comment.
Reminder — dead VERSIONS guard (flagged in previous round, still open)
The if "VERSIONS" not in config: guard at config_generator.py:324–325 was called out in the last review and has not been addressed. The scaffold setdefault loop added by this PR guarantees VERSIONS is always present, so lines 324–325 are now unreachable. Only the inner guard on line 326 (if "DATABASE" not in config["VERSIONS"]:) is still live.
New finding — STATIC_ROUTE wholesale replacement (correctness bug, config_generator.py:303)
Line 303 does a full dict replacement:
config["STATIC_ROUTE"] = {} # replaces the dict
config["STATIC_ROUTE"]["mgmt|0.0.0.0/0"] = {"nexthop": metalbox_ip}The scaffold setdefault on line 197–198 preserves any STATIC_ROUTE entries from config_db.json, but line 303 then throws that away. Any custom static routes in the base config are silently dropped for every device that has an OOB IP. The fix is to drop line 303 — the dict is already present and any pre-existing entries should survive.
|
|
||
| assert config["MGMT_INTERFACE"]["eth0"] == {"admin_status": "up"} | ||
| assert "eth0|10.42.0.5/24" in config["MGMT_INTERFACE"] | ||
| assert config["STATIC_ROUTE"]["mgmt|0.0.0.0/0"] == {"nexthop": "10.42.0.1"} |
There was a problem hiding this comment.
Missing test — this test uses an empty scaffold so the STATIC_ROUTE overwrite in the production code is invisible. The overwrite itself is fixed in #2238; a companion test is still needed here: seed a base config with a non-empty STATIC_ROUTE, trigger the OOB path, and assert the pre-existing entries survive alongside the new mgmt|0.0.0.0/0 route.
| config_b = generate_sonic_config(device_b, "HWSKU-B") | ||
|
|
||
| assert config_b["DEVICE_METADATA"]["shared"] == {"flag": "original"} | ||
| assert config_a["DEVICE_METADATA"] is not config_b["DEVICE_METADATA"] |
There was a problem hiding this comment.
Weak test — the docstring says this guards against cross-device base-config contamination, "a real bug we got bitten by", but that scenario cannot occur with the current code: base_config is a local variable produced by json.load inside generate_sonic_config, so each call already gets an independent dict. Removing deepcopy here would not cause this test to fail.
The test passes but for the wrong reason — it verifies isolation that json.load already provides, not the isolation that deepcopy adds.
|
Note: the STATIC_ROUTE production bug section in the review body above is covered by #2238 (already open against |
…ches
Covers natural_sort_key, generate_sonic_config (base-config loading, deepcopy, primary-IP / AS routing, OOB management, breakout merge, config_version normalization), the metalbox / NTP / DNS caches, and the log-server / SNMP helpers. Helpers from companion sub-issues are patched at their import sites so this file exercises only the orchestrator's own glue.
Also hardens generate_sonic_config so the top-level scaffold keys it and its helpers index into directly are always present, even when /etc/sonic/config_db.json is absent or fails to load.
Closes #2221
AI-assisted: Claude Code