Merged
2 changes: 1 addition & 1 deletion docs/contributing/documentation.md
@@ -19,7 +19,7 @@ For larger changes, clone a fork of the repository as described in the
After cloning the repository, you may also build and serve the documentation through Docker:

```diff
-docker compose up docs
+docker compose -f compose.yaml -f compose.ports.yaml up docs
```


@@ -112,7 +112,7 @@ This is useful, for example, to run unit tests in the container:
python -m pytest -x -v -m "not php_api"
```

-## Unit tests
+## Running Unit tests

Our unit tests are written with the [`pytest`](https://pytest.org) framework.
An invocation could look like this:
217 changes: 127 additions & 90 deletions docs/contributing/tests.md
@@ -1,98 +1,135 @@
# Testing

This page covers running and writing tests for the REST API.
It assumes you have already followed the instructions on the ["Setup"](setup.md) page.
!!! note "Follow the documentation"

    Some tests and files in the current test suite still need to be updated to reflect these conventions.
    This documentation specifies the desired way of doing things, even if the source code currently does something else.

## Running Tests

After completing the setup, you can run the tests whenever the services are running.
Start the services using one of two commands:

```bash
docker compose --profile apis up -d  # if you run tests that need the PHP REST API
docker compose up python-api -d      # if you do not run tests that require the PHP API
```
and then invoke pytest:
```bash
docker compose exec python-api python -m pytest tests
```
At the time of writing, the full test suite run in sequence takes about 2 minutes.
We generally recommend running subsets of the tests during development.
In particular, running only the tests which do not use fuzzing or the PHP API takes under 3 seconds:
```bash
docker compose exec python-api python -m pytest tests -m "not php_api and not slow"
```

## Writing Tests

We use the ubiquitous [Pytest](https://docs.pytest.org) framework when writing tests.

### File Structure
When writing tests, we have the following additional conventions on the file structure:

- Use a `_test` suffix when naming test files (not a `test_` prefix). The tests already live in a `tests` directory, and in typical file-tree side panels only the start of a file name is visible, so the suffix is more informative.
- Use one dedicated test file per endpoint.


### General Test Guidelines
Some guidelines and things to keep in mind when writing tests:

- Try to keep tests small, so that they fail for one particular reason only.
- Mark tests that update the database in any way with the `mut` marker (`@pytest.mark.mut`).
- If a test is excessively slow (>0.1 s) and does not connect to PHP, add the `slow` marker. Tests that involve PHP always require round trips through other services, which makes them slow by default; they can be filtered out with the automatically generated `php_api` marker.
- Four common fixtures you might need when writing tests are:
    - `py_api`: an async client for the Python-based REST API.
    - `php_api`: an async client for the PHP-based REST API.
    - `expdb_test`: an `AsyncConnection` to the "expdb" OpenML database.
    - `user_test`: an `AsyncConnection` to the "openml" OpenML database.
- The fixtures above add considerable per-test overhead; use them only when you need them.
- When writing assertions, put the expected value (a constant, or a PHP response) on the right (`assert response == expected`).
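The marker and assertion conventions above can be sketched in a minimal, hypothetical test (the values and names are made up for illustration):

```python
import pytest


@pytest.mark.mut  # this test (hypothetically) updates the database
def test_rename_dataset() -> None:
    # Stand-in for a call that updates the database and returns the new record.
    response = {"id": 1, "name": "anneal-v2"}
    expected = {"id": 1, "name": "anneal-v2"}
    # Actual value on the left, expected value on the right.
    assert response == expected
```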

### Writing Tests for an Endpoint
Because the `py_api` and database fixtures add considerable per-test overhead,
follow these guidelines when writing a test suite for an endpoint.

Include tests against `py_api` for input validation specific to that endpoint. Validation in reused components should be tested centrally (e.g., Pagination).
```python
async def test_get_dataset_identifier_validation(py_api: httpx.AsyncClient) -> None:
    response = await py_api.get("/datasets/not-an-integer")
    assert response.status_code == HTTPStatus.UNPROCESSABLE_ENTITY
```

Include one test against the `py_api` that confirms a successful request has the expected result:

```python
async def test_get_dataset_success(py_api: httpx.AsyncClient) -> None:
    response = await py_api.get("/datasets/1")
    assert response.status_code == HTTPStatus.OK
    assert response.json() == {...}  # some expected data
```

For all other tests, do not use `py_api` but call the implementing function directly. For example, do not call `client.get("/datasets/1")` but instead `get_dataset`:

```python
async def test_get_dataset_private_success(expdb_test: AsyncConnection, user_test: AsyncConnection) -> None:
    private_dataset = 42
    owner_of_that_dataset = OWNER_USER
    dataset = await get_dataset(
        dataset_id=private_dataset, user=owner_of_that_dataset, user_db=user_test, expdb_db=expdb_test
    )
    assert dataset.id == private_dataset


async def test_get_dataset_private_access_denied(expdb_test: AsyncConnection, user_test: AsyncConnection) -> None:
    private_dataset = 42
    not_the_owner = SOME_USER  # test user defined in a common file
    with pytest.raises(DatasetNoAccessError) as e:
        await get_dataset(
            dataset_id=private_dataset, user=not_the_owner, user_db=user_test, expdb_db=expdb_test
        )
    assert e.value.status_code == HTTPStatus.FORBIDDEN
```

When individually adding/removing components, we measure (for 5000 repeats, n=1):

| expdb | user | api | exp call | user call | api get | time (s) |
|-------|------|-----|----------|-----------|---------|----------:|
| ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1.78 |
| ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | 3.45 |
| ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 3.22 |
| ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | 298.48 |
| ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | 4.44 |
| ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | 285.69 |
| ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | 4.91 |
| ❌ | ✅ | ❌ | ❌ | ✅ | ❌ | 5.81 |
| ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 307.91 |

Adding a fixture that just returns some value adds only minimal overhead (1.91s),
so the burden comes from establishing the database connection itself.

We make the following observations:

- Adding a database fixture adds roughly the same overhead as instantiating an entirely new test.
- Adding multiple database fixtures is not free: each one adds its own overhead.
- The `py_api` fixture adds two orders of magnitude more overhead.

We want our tests to be fast, so we avoid these fixtures whenever we reasonably can.
Because the `py_api` fixture is by far the slowest, we restrict it to integration/migration tests, which only run on CI before merges.
For database fixtures, we will provide lighter alternatives where possible: for example, fixtures that return a `User` without accessing the database.
The validity of these users will be tested against the database in only a single test.
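A sketch of what such a database-free fixture could look like (the class, field names, and values are hypothetical):

```python
from dataclasses import dataclass

import pytest


@dataclass
class User:
    id: int
    api_key: str


@pytest.fixture
def regular_user() -> User:
    # Built entirely in memory: no connection to the user database is made.
    return User(id=2, api_key="00000000000000000000000000000000")


def test_user_has_api_key(regular_user: User) -> None:
    assert regular_user.api_key
```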

### Mocking
Mocking can help us reduce the reliance on database connections in tests.
A mocked function can prevent accessing the database and return a predefined value instead.

It has a few upsides:
- It's faster than using a database fixture (see below).
- The test is not dependent on the database: you can run the test without a database.

But it also has downsides:
- Behavior changes in the database, such as schema changes, are not automatically reflected in the tests.
- The database layer (e.g., queries) is not actually tested.

In short, mocked behavior may not match real behavior when executed against a database.
For this reason, for each mocked entity we should add a test that verifies that the database layer,
when invoked against the real database, returns the output that the mock assumes.
This is additional development overhead, but it should pay off through more granular test feedback and faster tests.
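As an illustration using the standard library's `unittest.mock` (the project itself uses the `mocker` fixture from pytest-mock; the `db_get` callable and the dataset record here are hypothetical):

```python
from unittest import mock


def fetch_dataset_name(dataset_id: int, db_get) -> str:
    # `db_get` stands in for a database-layer function.
    record = db_get(dataset_id)
    return record["name"]


def test_fetch_dataset_name_mocked() -> None:
    # The mock replaces the database call with a predefined record.
    db_get = mock.Mock(return_value={"id": 2, "name": "anneal"})
    assert fetch_dataset_name(2, db_get) == "anneal"
    db_get.assert_called_once_with(2)


# A companion test (marked `slow`/`mut` as appropriate) would call the real
# database layer once and assert it returns a record matching the mock.
```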

On the speed of mocks, consider these two variants of the same test, shown as a diff:

```diff
 @pytest.mark.parametrize('execution_number', range(5000))
 def test_private_dataset_owner_access(
     execution_number,
     admin,
+    mocker,
-    expdb_test: Connection,
 ) -> None:
+    mock = mocker.patch('database.datasets.get')
+
+    class Dataset(NamedTuple):
+        uploader: int
+        visibility: Visibility
+
+    mock.return_value = Dataset(uploader=1, visibility=Visibility.PRIVATE)
+
     _get_dataset_raise_otherwise(
         dataset_id=1,
         user=admin,
-        expdb=expdb_test,
+        expdb=None,
     )
```

*Note:* We will likely mock the database layer at some point, but its design is still taking shape.

### Migration Tests

For the initial development of this API, we want to have a clear mapping from PHP API output to the new output.
We also want to be aware of quirks that the PHP API might have.
For both these reasons, we write what we call "migration" tests: they call both APIs with a variety of input and compare the result.
Note that in some cases there are quite significant differences between the PHP- and Python-based APIs.
That's okay, but then we want to "document" the behavior of both in the test.
Please refer to a few of the implemented migration tests for a better understanding; here is a high-level sketch:

```python
async def test_get_dataset(py_api: httpx.AsyncClient, php_api: httpx.AsyncClient) -> None:
    py_response, php_response = await asyncio.gather(
        py_api.get("/datasets/1"),
        php_api.get("/data/1"),
    )

    if py_response.status_code == HTTPStatus.OK and php_response.status_code == HTTPStatus.OK:
        _assert_success_response_equal(py_response.json(), php_response.json())
    else:
        _assert_error_response_equal(py_response, php_response)


def _assert_success_response_equal(py_json, php_json) -> None:
    # PHP API returns numbers as strings
    py_json = nested_num_to_str(py_json)

    # There might be more differences which need addressing
    # ...
    # and then finally we compare the results to ensure the remaining data is identical
    assert py_json == php_json


def _assert_error_response_equal(py_response, php_response) -> None:
    # There might be some translation of error codes
    if py_response.status_code == HTTPStatus.UNPROCESSABLE_ENTITY:
        assert php_response.status_code == HTTPStatus.PRECONDITION_FAILED
    elif ...:
        ...
    else:
        assert py_response.status_code == php_response.status_code

    # Python follows RFC 9457 while PHP has a custom system:
    assert py_response.json()["code"] == php_response.json()["error"]["code"]
```
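The `nested_num_to_str` helper used above is not shown in the sketch; a possible implementation (assuming the PHP API renders every number as a string, which may not hold for all endpoints) could be:

```python
def nested_num_to_str(value):
    """Recursively convert numbers to strings, mirroring the PHP API's output."""
    if isinstance(value, bool):
        return value  # bool is an int subclass; leave booleans untouched
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, list):
        return [nested_num_to_str(v) for v in value]
    if isinstance(value, dict):
        return {k: nested_num_to_str(v) for k, v in value.items()}
    return value
```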
Returning to the mocking comparison above: that test performs only a single database call, which fetches a record on an indexed field and requires no joins.
Even with such a light query, the database-backed test is ~50% slower than the mocked version (5.04 s vs 3.50 s).
2 changes: 1 addition & 1 deletion mkdocs.yml
@@ -21,7 +21,7 @@ nav:
- Changes: migration.md
- Contributing:
- contributing/index.md
-- Development: contributing/contributing.md
+- Setup: contributing/setup.md
- Tests: contributing/tests.md
- Documentation: contributing/documentation.md
- Project Overview: contributing/project_overview.md
10 changes: 7 additions & 3 deletions pyproject.toml
@@ -121,11 +121,15 @@ asyncio_mode = "auto"
asyncio_default_fixture_loop_scope = "session"
asyncio_default_test_loop_scope = "session"
markers = [
"slow: test or sets of tests which take more than a few seconds to run.",
# We exclude PHP tests because they are quite slow to begin with.
"slow: python-only test or sets of tests which take more than 0.1 second to run.",
# While the `mut`ation marker below is not strictly necessary as every change is
# executed within transaction that is rolled back, it can halt other unit tests which
# executed within transaction that is rolled back, it can halt other unit tests
# whose queries may depend on the execution or rollback of the transaction.
"mut: executes a mutation on the database (in a transaction which is rolled back)",
"mut: executes a mutation on the database (even when it is cleaned up).",
# *_api markers included for documentation, should not be added manually.
"py_api: requests the py_api fixture. Added automatically.",
"php_api: requests the php_api fixture. Added automatically.",
]
filterwarnings = [
'ignore:A private pytest class or function was used.:DeprecationWarning:tests.conftest:',