Skip to content

Build DOCA-OFED kernel modules for Rocky 10#2310

Draft
owenjones wants to merge 1 commit into
stackhpc/2025.1from
rl10/doca-ofed
Draft

Build DOCA-OFED kernel modules for Rocky 10#2310
owenjones wants to merge 1 commit into
stackhpc/2025.1from
rl10/doca-ofed

Conversation

@owenjones
Copy link
Copy Markdown
Member

DOCA OFED support has now been added for Rocky 10

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for DOCA OFED on Rocky Linux 10 for both x86_64 and aarch64 architectures. The changes involve updating repository configurations, kernel version matrices, and adding retry logic for Pulp-related tasks. Feedback focuses on improving maintainability and consistency by replacing hardcoded version strings with variables in repository definitions and variable names. Additionally, it is recommended to use kolla_base_arch instead of os_arch in push-ofed.yml to ensure correct repository selection during cross-architecture builds and to maintain consistency in failure messages.

Comment thread etc/kayobe/ansible/tools/push-ofed.yml
Comment thread etc/kayobe/ansible/tools/push-ofed.yml
Comment thread etc/kayobe/dnf.yml Outdated
Comment thread etc/kayobe/dnf.yml Outdated
Comment thread etc/kayobe/ofed.yml Outdated
Comment thread etc/kayobe/ofed.yml Outdated
Comment thread etc/kayobe/pulp.yml Outdated
DOCA OFED support has now been added for Rocky 10
@owenjones
Copy link
Copy Markdown
Member Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for DOCA OFED on Rocky Linux 10 for both x86_64 and aarch64 architectures. Key changes include updating the DOCA version matrix, refactoring repository lookups to be more dynamic, and adding Pulp repository configurations and DNF custom repositories for RHEL 10. Feedback focuses on correcting the use of non-standard architecture variables, ensuring Ansible retry logic is correctly implemented with 'until' and 'delay' parameters, and utilizing generic version variables in repository URLs to improve maintainability.

- name: Get OFED module repo
ansible.builtin.set_fact:
doca_modules_repos: "{{ stackhpc_pulp_rpm_repos | selectattr('name', 'search', 'OFED') | selectattr('base_path', 'search', '/' ~ kolla_base_arch ~ '/') | list }}"
doca_modules_repos: "{{ stackhpc_pulp_rpm_repos | selectattr('name', 'search', 'OFED') | selectattr('base_path', 'search', '/rhel' ~ os_release) | selectattr('base_path', 'search', '/' ~ os_arch ~ '/') | list }}"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The variable os_arch is used here, but it is not a standard Ansible fact or a commonly defined variable in Kayobe. It should likely be kolla_base_arch, which is the standard variable for architecture in this project and was used in the previous version of this task.

        doca_modules_repos: "{{ stackhpc_pulp_rpm_repos | selectattr('name', 'search', 'OFED') | selectattr('base_path', 'search', '/rhel' ~ os_release) | selectattr('base_path', 'search', '/' ~ kolla_base_arch ~ '/') | list }}"

that:
- doca_modules_repos | length == 1
fail_msg: "Expected exactly one OFED module repo for {{ kolla_base_arch }}"
fail_msg: "Expected exactly one OFED module repo for {{ os_arch }}"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The variable os_arch is used in the failure message but is likely undefined. Please use kolla_base_arch for consistency with the rest of the configuration.

        fail_msg: "Expected exactly one OFED module repo for {{ kolla_base_arch }}"

name: "{{ doca_modules_repo_name }}"
state: present
retries: "{{ pulp_timeout_retries | default(3) }}"
retries: "{{ pulp_timeout_retries | default(3) | int }}"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The retries keyword in Ansible only takes effect when used in conjunction with the until keyword. Without until, this task will not retry on failure. Additionally, adding a delay is recommended to avoid overwhelming the service during retries. Ensure numeric variables use the | int filter and provide sensible defaults for timeouts and delays.

      register: result
      until: result is success
      retries: "{{ pulp_timeout_retries | default(3) | int }}"
      delay: "{{ pulp_delay | default(5) | int }}"
References
  1. Use the | int filter for numeric variables in Ansible tasks to ensure they are correctly interpreted as integers, especially when variables might be overridden by external string inputs.
  2. Define configuration variables such as delays or timeouts in the vars section of an Ansible playbook to allow for overrides, and use sensible defaults that work across different environments.

Comment thread etc/kayobe/pulp.yml
distribution_name: "doca-3.2.2-rhel10-"
base_path: "doca/3.2.2/rhel10/x86_64/"
- name: DOCA Online Repo {{ stackhpc_pulp_doca_version }} - RHEL 10
url: "{{ stackhpc_release_pulp_content_url }}/doca/{{ stackhpc_pulp_doca_version }}/rhel10/x86_64/{{ stackhpc_pulp_repo_doca_3_2_2_rhel10_x86_64_version }}"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The URL uses a hardcoded version variable name (stackhpc_pulp_repo_doca_3_2_2_rhel10_x86_64_version). This should be replaced with the generic variable stackhpc_pulp_repo_rhel10_doca_version defined in ofed.yml, which dynamically resolves to the correct version based on the DOCA version matrix. This improves maintainability when updating DOCA versions.

    url: "{{ stackhpc_release_pulp_content_url }}/doca/{{ stackhpc_pulp_doca_version }}/rhel10/x86_64/{{ stackhpc_pulp_repo_rhel10_doca_version }}"

Comment thread etc/kayobe/pulp.yml
distribution_name: "doca-3.2.2-rhel10-aarch64-"
base_path: "doca/3.2.2/rhel10/aarch64/"
- name: DOCA Online Repo {{ stackhpc_pulp_doca_version }} - RHEL 10 aarch64
url: "{{ stackhpc_release_pulp_content_url }}/doca/{{ stackhpc_pulp_doca_version }}/rhel10/aarch64/{{ stackhpc_pulp_repo_doca_3_2_2_rhel10_aarch64_version }}"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Similar to the x86_64 repository, this URL should use the generic variable stackhpc_pulp_repo_rhel10_doca_aarch64_version instead of the hardcoded version variable name.

    url: "{{ stackhpc_release_pulp_content_url }}/doca/{{ stackhpc_pulp_doca_version }}/rhel10/aarch64/{{ stackhpc_pulp_repo_rhel10_doca_aarch64_version }}"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant