Update testing documentation #314 (Merged)
Commits (all by PGijsbers):

- 6ab8d2b: Better separate different subpages of contributing guidelines
- 311dc4a: Update developer documentation
- 272a074: Fix typos, elaborate on some points, give run example
- f45a7c8: Clarify the 'slow' marker is only for tests that do not use PHP
- c7fa0e1: Fix typos
- 36c4920: Add language tags
- 502ec03: Minor edits for clarity
# Testing
This page covers running and writing tests for the REST API.
It assumes you have already followed the instructions on the ["Setup"](setup.md) page.

tl;dr:

- Setting up the `py_api` fixture to test directly against a REST API endpoint is very slow; use it only for migration/integration tests.
- Getting a database fixture and making a database call is slow; consider mocking where appropriate.
## Overhead from Fixtures

Sometimes you want to interact with the REST API through the `py_api` fixture,
or need access to a database through the `user_test` or `expdb_test` fixtures.
Be warned that these come with considerable overhead, which adds up when running thousands of tests.

!!! note "Follow the documentation"

    Some tests and files in the current test suite still need to be updated to reflect these conventions.
    This documentation specifies the desired way of doing things, even where the source code currently does something else.
## Running Tests

Once the services from the ["Setup"](setup.md) page are running, you can run the tests.
Start the services with one of these two commands:

```bash
docker compose --profile apis up -d  # if you run tests that need the PHP REST API
docker compose up python-api -d      # if you do not run tests that require the PHP API
```

and then invoke pytest:

```bash
docker compose exec python-api python -m pytest tests
```

As of writing, the full test suite takes about 2 minutes to run sequentially.
We generally recommend running subsets of the tests during development.
In particular, running only the tests that do not use fuzzing or the PHP API takes under 3 seconds:

```bash
docker compose exec python-api python -m pytest tests -m "not php_api and not slow"
```
## Writing Tests

We use the ubiquitous [Pytest](https://docs.pytest.org) framework for writing tests.
### File Structure

When writing tests, we have the following additional conventions for the file structure:

- Name files with a `_test` suffix (not a `test_` prefix). Our tests already live in a `tests` directory, and in common tree-list side panels you can often see only the start of file names, so the suffix is more informative.
- Use one dedicated test file per endpoint.
### General Test Guidelines

Some guidelines and things to keep in mind when writing tests:

- Keep tests small, so that each fails for one particular reason only.
- Mark tests that update the database in any way with the `mut` marker (`@pytest.mark.mut`).
- If a test is excessively slow (>0.1 sec) and does not connect to PHP, use the `slow` marker. Tests that include PHP always require round trips through other services, which makes them slow by default; they can be filtered out with the automatically generated `php_api` marker.
- Four common fixtures you might need when writing tests are:
    - `py_api`: an async client for the Python-based REST API
    - `php_api`: an async client for the PHP-based REST API
    - `expdb_test`: an `AsyncConnection` to the "expdb" OpenML database
    - `user_test`: an `AsyncConnection` to the "openml" OpenML database
- The fixtures above have considerable per-test overhead. Use them only when you need them.
- When writing assertions, put the expected value (a constant, or a PHP response) on the right (`assert response == expected`).
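For illustration, applying the markers from the guidelines above might look like this (the test name and body are hypothetical, not taken from the actual suite):

```python
import pytest


@pytest.mark.mut  # this test modifies the database
@pytest.mark.slow  # excessively slow (>0.1 sec) without using the PHP API
def test_flag_dataset_updates_row() -> None:
    ...  # hypothetical body: update a row, then assert on the new state
```

Tests marked this way are excluded by the `-m "not php_api and not slow"` filter shown earlier.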
### Writing Tests for an Endpoint

Because the `py_api` and database fixtures add considerable per-test overhead,
follow these guidelines when writing a test suite for an endpoint.

Include tests against `py_api` for input validation specific to that endpoint.
Validation in reused components (e.g., pagination) should be tested centrally.

```python
async def test_get_dataset_identifier_validation(py_api: httpx.AsyncClient) -> None:
    response = await py_api.get("/datasets/not-an-integer")
    assert response.status_code == HTTPStatus.UNPROCESSABLE_ENTITY
```
Include one test against `py_api` that confirms a successful request has the expected result:

```python
async def test_get_dataset_success(py_api: httpx.AsyncClient) -> None:
    response = await py_api.get("/datasets/1")
    assert response.status_code == HTTPStatus.OK
    assert response.json() == {...}  # some expected data
```
For all other tests, do not use `py_api`; call the implementing function directly.
For example, do not call `client.get("/datasets/1")`, but instead call `get_dataset`:

```python
async def test_get_dataset_private_success(expdb_test: AsyncConnection, user_test: AsyncConnection) -> None:
    private_dataset = 42
    owner_of_that_dataset = OWNER_USER
    dataset = await get_dataset(dataset_id=private_dataset, user=owner_of_that_dataset, user_db=user_test, expdb_db=expdb_test)
    assert dataset.id == private_dataset


async def test_get_dataset_private_access_denied(expdb_test: AsyncConnection, user_test: AsyncConnection) -> None:
    private_dataset = 42
    user_without_access = SOME_USER  # test user defined in a common file
    with pytest.raises(DatasetNoAccessError) as e:
        await get_dataset(dataset_id=private_dataset, user=user_without_access, user_db=user_test, expdb_db=expdb_test)
    assert e.value.status_code == HTTPStatus.FORBIDDEN
```

To quantify the overhead of the different fixtures, we used a benchmark test that touches each of them:

```python
@pytest.mark.parametrize("execution_number", range(5000))
def test_private_dataset_owner_access(
    execution_number,
    expdb_test: Connection,
    user_test: Connection,
    py_api: TestClient,
) -> None:
    fetch_user(ApiKey.REGULAR_USER, user_test)  # accesses only the user db
    get_estimation_procedures(expdb_test)  # accesses only the experiment db
    py_api.get("/does/not/exist")  # only queries the api
```
When individually adding/removing components, we measure (for 5000 repeats, n=1):

| expdb | user | api | exp call | user call | api get | time (s) |
|-------|------|-----|----------|-----------|---------|---------:|
| ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1.78 |
| ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | 3.45 |
| ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 3.22 |
| ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | 298.48 |
| ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | 4.44 |
| ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | 285.69 |
| ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | 4.91 |
| ❌ | ✅ | ❌ | ❌ | ✅ | ❌ | 5.81 |
| ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 307.91 |
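As a sanity check on these numbers, the per-test cost of each fixture can be derived from the table (a back-of-the-envelope script, not part of the test suite):

```python
# per-test overhead of a fixture = (time with fixture - baseline) / number of tests
REPEATS = 5_000
BASELINE = 1.78  # seconds for 5000 empty tests, no fixtures

expdb_overhead_ms = (3.45 - BASELINE) / REPEATS * 1000
py_api_overhead_ms = (298.48 - BASELINE) / REPEATS * 1000

print(f"expdb fixture: {expdb_overhead_ms:.2f} ms per test")   # well under a millisecond
print(f"py_api fixture: {py_api_overhead_ms:.2f} ms per test")  # tens of milliseconds
```

The gap of roughly two orders of magnitude matches the observations below.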
Adding a fixture that just returns some value adds only minimal overhead (1.91s total),
so the burden comes from establishing the database connection itself.

We make the following observations:

- Adding a database fixture adds about as much overhead as instantiating an entirely new test.
- Adding multiple database fixtures incurs the overhead multiple times; the second fixture is not free.
- The `py_api` fixture adds two orders of magnitude more overhead.
We want our tests to be fast, so we avoid these fixtures where we reasonably can.
We restrict usage of the `py_api` fixture to integration/migration tests, since it is very slow;
these only run on CI before merges.
For database fixtures, we will write alternatives that can be used to, e.g., get a `User` without accessing the database.
The validity of these users will be tested against the database in only a single test.
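Such a database-free fixture could be sketched as follows (the names, values, and the simplified `User` stand-in are all hypothetical, not the project's real model):

```python
import pytest
from typing import NamedTuple


class User(NamedTuple):  # simplified stand-in for the real User model
    id: int
    api_key: str


def make_regular_user() -> User:
    # A static, known-valid user. Its validity against the database
    # is verified in a single dedicated test elsewhere.
    return User(id=2, api_key="00000000000000000000000000000000")


@pytest.fixture
def regular_user() -> User:
    return make_regular_user()
```

Tests that only need a `User` object can then depend on `regular_user` without paying for a database connection.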
### Mocking

Mocking can help us reduce the reliance on database connections in tests.
A mocked function avoids accessing the database and returns a predefined value instead.

It has a few upsides:

- It is faster than using a database fixture (see the comparison below).
- The test does not depend on the database: you can run it without one.

But it also has downsides:

- Behavior changes in the database, such as schema changes, are not automatically reflected in the tests.
- The database layer (e.g., the queries) is not actually tested.

In short, the mocked behavior may not match the real behavior when executed against a database.
For this reason, for each mocked entity we should add a test that verifies that the database layer,
when invoked against the real database, returns the output that the mock assumes.
This adds development overhead, but it should pay off in more granular test feedback and faster tests.

On the speed of mocks, consider these two tests:
```diff
 @pytest.mark.parametrize('execution_number', range(5000))
 def test_private_dataset_owner_access(
     execution_number,
     admin,
+    mocker,
-    expdb_test: Connection,
 ) -> None:
+    mock = mocker.patch('database.datasets.get')
+    class Dataset(NamedTuple):
+        uploader: int
+        visibility: Visibility
+    mock.return_value = Dataset(uploader=1, visibility=Visibility.PRIVATE)

     _get_dataset_raise_otherwise(
         dataset_id=1,
         user=admin,
-        expdb=expdb_test,
+        expdb=None,
     )
```

*Note:* we will likely mock the database layer at some point, but its design is still taking shape.
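The same pattern can be shown self-contained with the standard library's `unittest.mock` instead of the `mocker` fixture from `pytest-mock` (all names below are hypothetical stand-ins, not the real module layout):

```python
from typing import NamedTuple
from unittest.mock import patch


class Dataset(NamedTuple):
    uploader: int
    visibility: str


def get_dataset_record(dataset_id: int, expdb) -> Dataset:
    # stand-in for the real database layer; never reached when mocked
    raise RuntimeError("would query the database")


def dataset_visibility(dataset_id: int, expdb) -> str:
    # stand-in for code under test that relies on the database layer
    return get_dataset_record(dataset_id, expdb).visibility


def test_visibility_with_mocked_db() -> None:
    with patch(f"{__name__}.get_dataset_record") as mock_get:
        mock_get.return_value = Dataset(uploader=1, visibility="private")
        # no database is needed: expdb can simply be None
        assert dataset_visibility(42, expdb=None) == "private"


test_visibility_with_mocked_db()
```

Patching the module attribute means `dataset_visibility` transparently receives the predefined value, so the test exercises only the logic above the database layer.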
### Migration Tests

For the initial development of this API, we want a clear mapping from PHP API output to the new output.
We also want to be aware of any quirks the PHP API might have.
For both these reasons, we write so-called "migration" tests: they call both APIs with a variety of inputs and compare the results.
In some cases there are quite significant differences between the PHP-based and the Python-based API.
That is okay, but then we want to "document" the behavior of both in the test.
Please refer to a few implemented migration tests for a better understanding; here is a high-level sketch:
```python
async def test_get_dataset(py_api: httpx.AsyncClient, php_api: httpx.AsyncClient) -> None:
    py_response, php_response = await asyncio.gather(
        py_api.get("/datasets/1"),
        php_api.get("/data/1"),
    )

    if py_response.status_code == HTTPStatus.OK and php_response.status_code == HTTPStatus.OK:
        _assert_success_response_equal(py_response.json(), php_response.json())
    else:
        _assert_error_response_equal(py_response, php_response)


def _assert_success_response_equal(py_json, php_json) -> None:
    # PHP API returns numbers as strings
    py_json = nested_num_to_str(py_json)

    # There might be more differences which need addressing
    # ...
    # and then finally we compare the results to ensure the remaining data is identical
    assert py_json == php_json


def _assert_error_response_equal(py_response, php_response) -> None:
    # There might be some translation of error codes
    if py_response.status_code == HTTPStatus.UNPROCESSABLE_ENTITY:
        assert php_response.status_code == HTTPStatus.PRECONDITION_FAILED
    elif ...:
        ...
    else:
        assert py_response.status_code == php_response.status_code

    # Python follows RFC 9457 while PHP has a custom system:
    assert py_response.json()["code"] == php_response.json()["error"]["code"]
```
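The helper `nested_num_to_str` is used but not defined in the sketch above; one possible implementation (an assumption on our part, the real helper may differ) is:

```python
def nested_num_to_str(obj):
    """Recursively convert numbers to strings, mimicking the PHP API's output format."""
    if isinstance(obj, bool):  # bool is a subclass of int, so check it first
        return obj
    if isinstance(obj, (int, float)):
        return str(obj)
    if isinstance(obj, list):
        return [nested_num_to_str(item) for item in obj]
    if isinstance(obj, dict):
        return {key: nested_num_to_str(value) for key, value in obj.items()}
    return obj
```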
Coming back to the mocking comparison above: that test contains only a single database call, which fetches a record on an indexed field and requires no joins. Even though this call is very light, the database-backed test is ~50% slower than the mocked version (5.04s vs 3.50s).