[ENH] Improve Global Config Architecture #1564

@SimonBlanke

Description

The config module relies heavily on module-level globals, and I agree with this comment: https://github.com/openml/openml-python/blob/main/openml/config.py#L395

This is problematic for the following reasons:

  • Any code can run openml.config.apikey = "x", so there is little or no encapsulation.
  • Global state persists between tests, which can cause flaky tests.
  • openml.config.apikey = 123 silently accepts a value of the wrong type, so there is no validation.

And I am sure there are more reasons.
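
To make the last two points concrete, here is a minimal sketch that uses a throwaway module object as a stand-in for a config module built on plain globals. The names `fake_config`, `test_a` and `test_b` are made up for illustration and are not part of openml.

```python
import types

# Hypothetical stand-in for a config module that keeps settings as plain globals.
config = types.ModuleType("fake_config")
config.apikey = ""


def test_a():
    # A test that changes the key and forgets to restore it.
    config.apikey = "key-from-test-a"


def test_b():
    # A later test that expects the default, but sees whatever ran before it.
    return config.apikey


test_a()
leaked = test_b()  # the value set in test_a is still there

# A value of the wrong type is stored without any complaint.
config.apikey = 123
wrong_type_accepted = isinstance(config.apikey, int)
```

Whether `test_b` passes depends on the order in which the tests run, which is the usual source of flakiness with shared global state.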

I think this was done to allow attribute-style assignment like this, instead of function-call syntax:

openml.config.apikey = "my-key"
openml.config.server = "https://test.openml.org/api/v1/xml"

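We can preserve this API and still get rid of most of the globals()/global usage by defining a module-level __getattr__, intercepting assignment, and using a dataclass for encapsulation and validation. One caveat: Python only honors __getattr__ and __dir__ as module-level functions (PEP 562). A module-level __setattr__ function is never called, so assignment has to be intercepted by swapping the module's __class__ for a types.ModuleType subclass. A dataclass also does not check types by itself, so the validation goes into __post_init__:

  import sys
  import types
  from dataclasses import dataclass, fields, replace

  @dataclass
  class OpenMLConfig:
      apikey: str = ""
      server: str = "https://www.openml.org/api/v1/xml"
      # ... 

      def __post_init__(self):
          # Dataclasses do not enforce annotations; this check assumes plain class annotations such as str.
          for f in fields(self):
              if not isinstance(getattr(self, f.name), f.type):
                  raise TypeError(f"{f.name} must be of type {f.type.__name__}")

  _config = OpenMLConfig()

  def __getattr__(name: str):
      # Called only when normal module attribute lookup fails (PEP 562).
      if hasattr(_config, name):
          return getattr(_config, name)
      raise AttributeError(f"module 'openml.config' has no attribute '{name}'")

  class _ConfigModule(types.ModuleType):
      # Assignment is dispatched on the module's type, so __setattr__ must live on a class.
      def __setattr__(self, name: str, value):
          global _config
          if hasattr(_config, name):
              # replace() builds a new object, so __post_init__ validates every change.
              _config = replace(_config, **{name: value})
          else:
              raise AttributeError(f"module 'openml.config' has no attribute '{name}'")

  # Make this module an instance of the subclass so that assignments are intercepted.
  sys.modules[__name__].__class__ = _ConfigModule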
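
One detail worth verifying for this approach: Python consults a module-level __getattr__ (PEP 562), but attribute assignment is dispatched on the module's type, so a function that is merely named __setattr__ in the module namespace is not called. The sketch below checks this with two throwaway modules. The names `plain_demo`, `custom_demo`, `record` and `InterceptingModule` are made up for the demonstration.

```python
import types

calls = []


def record(name, value):
    # Remember every assignment that reaches this hook.
    calls.append((name, value))


# Case 1: a plain module that only *contains* a function named __setattr__.
plain = types.ModuleType("plain_demo")
plain.__dict__["__setattr__"] = record
plain.apikey = "x"  # stored directly in the module's __dict__, hook not called
plain_intercepted = ("apikey", "x") in calls


# Case 2: a module whose class overrides __setattr__.
class InterceptingModule(types.ModuleType):
    def __setattr__(self, name, value):
        record(name, value)
        super().__setattr__(name, value)


custom = InterceptingModule("custom_demo")
custom.apikey = "x"  # goes through InterceptingModule.__setattr__
custom_intercepted = ("apikey", "x") in calls
```

For a module that already exists, the same effect can be had by assigning a ModuleType subclass to `sys.modules[__name__].__class__` at the end of the module.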

This is still not a solution I am completely content with, but it is an improvement and not too difficult to implement.
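
Once the settings live in a single object, the flaky-test problem can be addressed by swapping that object temporarily and restoring it afterwards. This is only a sketch of the idea: `Settings`, `get_settings` and `overridden` are hypothetical names, not existing openml functions.

```python
from contextlib import contextmanager
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Settings:  # hypothetical stand-in for the config dataclass
    apikey: str = ""
    server: str = "https://www.openml.org/api/v1/xml"


_current = Settings()


def get_settings() -> Settings:
    return _current


@contextmanager
def overridden(**changes):
    """Temporarily replace selected fields and restore the old object on exit."""
    global _current
    previous = _current
    _current = replace(previous, **changes)
    try:
        yield _current
    finally:
        # Runs even if the body raises, so no state leaks into the next test.
        _current = previous


with overridden(apikey="test-key"):
    inside = get_settings().apikey
after = get_settings().apikey
```

A test fixture could wrap each test in such a context manager, so that changes made by one test never reach the next.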
