Skip to content

Compute Worker- reorganizing and cleaning of compute worker#2294

Merged
Didayolo merged 6 commits intodevelopfrom
compute_worker_cleaning
Apr 1, 2026
Merged

Compute Worker- reorganizing and cleaning of compute worker#2294
Didayolo merged 6 commits intodevelopfrom
compute_worker_cleaning

Conversation

@ihsaan-ullah
Copy link
Copy Markdown
Collaborator

@ihsaan-ullah ihsaan-ullah commented Mar 26, 2026

Description

This PR cleans the compute worker code and uses settings and other new classes for better readability and debugging

Checklist

  • Code review by me
  • Hand tested by me
  • I'm proud of my work
  • Code review by reviewer
  • Hand tested by reviewer
  • CircleCi tests are passing
  • Ready to merge

@ihsaan-ullah
Copy link
Copy Markdown
Collaborator Author

ihsaan-ullah commented Mar 26, 2026

Some points for discussion:

  • At 2 places in the compute worker code logger.debug(...) is used withtout checking if LOG_LEVEL is set to debug. Should this be fixed?

Update: Ignore the question below because I added more stuff to settins and now this class has env variables and other settings e.g. directories and constants.

  • Do you think the following 3 constants are ok to be placed in the new Settings class or should they be placed somewhere else?
 # Constants
  DOCKER = "docker"
  PODMAN = "podman"
  LOG_LEVEL_DEBUG = "debug"

@Didayolo Didayolo changed the title Compute Worker- reorganzing and cleaning of compute worker Compute Worker- reorganizing and cleaning of compute worker Apr 1, 2026
@Didayolo
Copy link
Copy Markdown
Member

Didayolo commented Apr 1, 2026

Hi @ihsaan-ullah,

Could we replace:

    def prepare(self):
        hostname = utils.nodenames.gethostname()
        if self.is_scoring:
            self._update_status(SubmissionStatus.RUNNING, extra_information=f"scoring_hostname-{hostname}")
        else:
            self._update_status(SubmissionStatus.RUNNING, extra_information=f"ingestion_hostname-{hostname}")
        if not self.is_scoring:
            # Only during prediction step do we want to announce "preparing"
            self._update_status(SubmissionStatus.PREPARING)

By:

    def prepare(self):
        hostname = utils.nodenames.gethostname()
        if self.is_scoring:
            self._update_status(SubmissionStatus.RUNNING, extra_information=f"scoring_hostname-{hostname}")
        else:
            # Only during prediction step do we want to announce "preparing"
            self._update_status(SubmissionStatus.PREPARING, extra_information=f"ingestion_hostname-{hostname}")

?

@ihsaan-ullah
Copy link
Copy Markdown
Collaborator Author

ihsaan-ullah commented Apr 1, 2026

Yes why not. I would suggest to change RUNNING to SCORING like this

def prepare(self):
    hostname = utils.nodenames.gethostname()
    if self.is_scoring:
        self._update_status(SubmissionStatus.SCORING, extra_information=f"scoring_hostname-{hostname}")
    else:
        # Only during prediction step do we want to announce "preparing"
        self._update_status(SubmissionStatus.PREPARING, extra_information=f"ingestion_hostname-{hostname}")

@Didayolo
Copy link
Copy Markdown
Member

Didayolo commented Apr 1, 2026

Are we sure this initial "Running" state is not useful?

@ihsaan-ullah
Copy link
Copy Markdown
Collaborator Author

This Running status is confusing because when ingestion runs at the end it changes status to Scoring but when scoring starts it changes it back to Running.

@ihsaan-ullah
Copy link
Copy Markdown
Collaborator Author

We can note this down and change it in another PR along with some other small updates.

@Didayolo
Copy link
Copy Markdown
Member

Didayolo commented Apr 1, 2026

At 2 places in the compute worker code logger.debug(...) is used withtout checking if LOG_LEVEL is set to debug. Should this be fixed?

I think this is fine, it is normal to have some logger.debug calls that we always want, and others that are optional.

@Didayolo
Copy link
Copy Markdown
Member

Didayolo commented Apr 1, 2026

@ihsaan-ullah :

I tried what we just discussed, but then the status remained stuck in "Scoring" instead of "Finished".

I did the most preservative change:

    def prepare(self):
        hostname = utils.nodenames.gethostname()
        if self.is_scoring:
            self._update_status(SubmissionStatus.RUNNING, extra_information=f"scoring_hostname-{hostname}")
        else:
            self._update_status(SubmissionStatus.RUNNING, extra_information=f"ingestion_hostname-{hostname}")
            # Only during prediction step do we want to announce "preparing"
            self._update_status(SubmissionStatus.PREPARING)

It should be the same as before as the conditions else and not self.is_scoring were equal.

However it looks like we stay in "Preparing" status for the whole ingestion program, never "Running". After some tests, this is already present in develop.

I also:

  • Removed a duplicated loop.close()
  • Added a check if folder exists in detele_files_in_folder
  • Replace yaml.load by yaml.safe_load
  • Removed a useless for kind, logs in ... for loop

@Didayolo
Copy link
Copy Markdown
Member

Didayolo commented Apr 1, 2026

I fixed this by adding a update_status to RUNNING at the end of the prepare function.

The status updates workflow can be improved, but I think this is for another PR.

This one is good to go on my side.

@ihsaan-ullah
Copy link
Copy Markdown
Collaborator Author

I would suggest to move your changes to another PR because if something goes wrong then we revert a very small change.

But if you are confident, I let you proceed.

@Didayolo Didayolo merged commit 1ad888d into develop Apr 1, 2026
2 checks passed
@Didayolo Didayolo deleted the compute_worker_cleaning branch April 1, 2026 15:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants