Build DOCA-OFED kernel modules for Rocky 10#2310
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces support for DOCA OFED on Rocky Linux 10 for both x86_64 and aarch64 architectures. The changes involve updating repository configurations, kernel version matrices, and adding retry logic for Pulp-related tasks. Feedback focuses on improving maintainability and consistency by replacing hardcoded version strings with variables in repository definitions and variable names. Additionally, it is recommended to use kolla_base_arch instead of os_arch in push-ofed.yml to ensure correct repository selection during cross-architecture builds and to maintain consistency in failure messages.
DOCA OFED support has now been added for Rocky 10
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces support for DOCA OFED on Rocky Linux 10 for both x86_64 and aarch64 architectures. Key changes include updating the DOCA version matrix, refactoring repository lookups to be more dynamic, and adding Pulp repository configurations and DNF custom repositories for RHEL 10. Feedback focuses on correcting the use of non-standard architecture variables, ensuring Ansible retry logic is correctly implemented with 'until' and 'delay' parameters, and utilizing generic version variables in repository URLs to improve maintainability.
| - name: Get OFED module repo | ||
| ansible.builtin.set_fact: | ||
| doca_modules_repos: "{{ stackhpc_pulp_rpm_repos | selectattr('name', 'search', 'OFED') | selectattr('base_path', 'search', '/' ~ kolla_base_arch ~ '/') | list }}" | ||
| doca_modules_repos: "{{ stackhpc_pulp_rpm_repos | selectattr('name', 'search', 'OFED') | selectattr('base_path', 'search', '/rhel' ~ os_release) | selectattr('base_path', 'search', '/' ~ os_arch ~ '/') | list }}" |
There was a problem hiding this comment.
The variable os_arch is used here, but it is not a standard Ansible fact or a commonly defined variable in Kayobe. It should likely be kolla_base_arch, which is the standard variable for architecture in this project and was used in the previous version of this task.
doca_modules_repos: "{{ stackhpc_pulp_rpm_repos | selectattr('name', 'search', 'OFED') | selectattr('base_path', 'search', '/rhel' ~ os_release) | selectattr('base_path', 'search', '/' ~ kolla_base_arch ~ '/') | list }}"| that: | ||
| - doca_modules_repos | length == 1 | ||
| fail_msg: "Expected exactly one OFED module repo for {{ kolla_base_arch }}" | ||
| fail_msg: "Expected exactly one OFED module repo for {{ os_arch }}" |
| name: "{{ doca_modules_repo_name }}" | ||
| state: present | ||
| retries: "{{ pulp_timeout_retries | default(3) }}" | ||
| retries: "{{ pulp_timeout_retries | default(3) | int }}" |
There was a problem hiding this comment.
The retries keyword in Ansible only takes effect when used in conjunction with the until keyword. Without until, this task will not retry on failure. Additionally, adding a delay is recommended to avoid overwhelming the service during retries. Ensure numeric variables use the | int filter and provide sensible defaults for timeouts and delays.
register: result
until: result is success
retries: "{{ pulp_timeout_retries | default(3) | int }}"
delay: "{{ pulp_delay | default(5) | int }}"References
- Use the | int filter for numeric variables in Ansible tasks to ensure they are correctly interpreted as integers, especially when variables might be overridden by external string inputs.
- Define configuration variables such as delays or timeouts in the vars section of an Ansible playbook to allow for overrides, and use sensible defaults that work across different environments.
| distribution_name: "doca-3.2.2-rhel10-" | ||
| base_path: "doca/3.2.2/rhel10/x86_64/" | ||
| - name: DOCA Online Repo {{ stackhpc_pulp_doca_version }} - RHEL 10 | ||
| url: "{{ stackhpc_release_pulp_content_url }}/doca/{{ stackhpc_pulp_doca_version }}/rhel10/x86_64/{{ stackhpc_pulp_repo_doca_3_2_2_rhel10_x86_64_version }}" |
There was a problem hiding this comment.
The URL uses a hardcoded version variable name (stackhpc_pulp_repo_doca_3_2_2_rhel10_x86_64_version). This should be replaced with the generic variable stackhpc_pulp_repo_rhel10_doca_version defined in ofed.yml, which dynamically resolves to the correct version based on the DOCA version matrix. This improves maintainability when updating DOCA versions.
url: "{{ stackhpc_release_pulp_content_url }}/doca/{{ stackhpc_pulp_doca_version }}/rhel10/x86_64/{{ stackhpc_pulp_repo_rhel10_doca_version }}"| distribution_name: "doca-3.2.2-rhel10-aarch64-" | ||
| base_path: "doca/3.2.2/rhel10/aarch64/" | ||
| - name: DOCA Online Repo {{ stackhpc_pulp_doca_version }} - RHEL 10 aarch64 | ||
| url: "{{ stackhpc_release_pulp_content_url }}/doca/{{ stackhpc_pulp_doca_version }}/rhel10/aarch64/{{ stackhpc_pulp_repo_doca_3_2_2_rhel10_aarch64_version }}" |
There was a problem hiding this comment.
Similar to the x86_64 repository, this URL should use the generic variable stackhpc_pulp_repo_rhel10_doca_aarch64_version instead of the hardcoded version variable name.
url: "{{ stackhpc_release_pulp_content_url }}/doca/{{ stackhpc_pulp_doca_version }}/rhel10/aarch64/{{ stackhpc_pulp_repo_rhel10_doca_aarch64_version }}"
DOCA OFED support has now been added for Rocky 10