MNT: Replace live server calls with mocks in tests/test_utils/test_utils.py#1679
Rahuldrabit wants to merge 5 commits into openml:main
Conversation
Pull request overview
This PR replaces live server network calls with local mocks in tests/test_utils/test_utils.py to make tests run deterministically and offline, addressing part of issue #1649. The changes are test-only with no production code modifications.
Changes:
- Added helper function `_create_mock_listing_call` to generate mock responses for paginated listing endpoints
- Modified `_mocked_perform_api_call` to return mock XML responses for dataset queries
- Updated 10 test functions to use mocks instead of live server calls, removing the `@pytest.mark.test_server()` and `@pytest.mark.flaky()` markers
@geetu040 Hi! Can you please review?
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 4 comments.
```python
def mock_listing_call(limit, offset, **kwargs):
    if offset >= total_items:
        return pd.DataFrame() if return_type == "dataframe" else []
    size = min(limit, total_items - offset)
    items = [item_factory(i) for i in range(offset, offset + size)]
    return pd.DataFrame(items) if return_type == "dataframe" else items

return mock_listing_call
```
_create_mock_listing_call appends an empty batch when offset >= total_items, which can leave an extra empty element in the _list_all() result list in cases where limit is None and total_items is an exact multiple of batch_size. This differs from real listing calls which often signal end-of-results via OpenMLServerNoResult, and can subtly change downstream behavior (e.g., callers that expect no empty final batch). Consider raising openml.exceptions.OpenMLServerNoResult when offset >= total_items, or ensure _list_all-style termination without adding an empty batch to results.
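A minimal sketch of the reviewer's suggestion, assuming the helper's factory signature (`total_items`, `item_factory`, `return_type`) from the diff above. The exception class here is a local stand-in for `openml.exceptions.OpenMLServerNoResult`; raising it on an exhausted offset mimics how the real listing endpoints signal end-of-results, so no empty batch is ever appended:

```python
import pandas as pd


class OpenMLServerNoResult(Exception):
    """Local stand-in for openml.exceptions.OpenMLServerNoResult."""


def create_mock_listing_call(total_items, item_factory, return_type="dataframe"):
    def mock_listing_call(limit, offset, **kwargs):
        if offset >= total_items:
            # Raise instead of returning an empty batch, matching how the
            # real listing calls terminate pagination.
            raise OpenMLServerNoResult(f"No results at offset {offset}")
        size = min(limit, total_items - offset)
        items = [item_factory(i) for i in range(offset, offset + size)]
        return pd.DataFrame(items) if return_type == "dataframe" else items

    return mock_listing_call
```

With this variant, a `_list_all`-style pagination loop stops cleanly even when `total_items` is an exact multiple of the batch size.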
| return """<oml:data xmlns:oml="http://openml.org/openml"> | ||
| <oml:dataset> | ||
| <oml:did>61</oml:did> | ||
| <oml:name>iris</oml:name> | ||
| <oml:version>1</oml:version> | ||
| <oml:status>active</oml:status> | ||
| </oml:dataset> | ||
| </oml:data>""" | ||
| raise ValueError(f"Unexpected call: {call}") | ||
|
|
||
|
|
_mocked_perform_api_call hard-codes the full expected call string (including parameter order). This makes the test brittle to harmless refactors in URL construction (e.g., reordering filters) while not changing behavior. Consider matching on structured components (e.g., prefix + presence of required segments) or parsing the endpoint into parts before deciding which response to return.
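One way to sketch this suggestion, under the assumption that listing endpoints look like `data/list/<filter>/<value>/...`: split the call into path segments and dispatch on the leading components, so reordering filters no longer breaks the match. The function name and XML fixture below mirror the mock under review but are illustrative, not the PR's actual code:

```python
from urllib.parse import urlsplit


def mocked_perform_api_call(call, request_method=None):
    # Split e.g. "data/list/data_name/iris/limit/2" into segments so the
    # dispatch depends on structure, not on one exact string.
    parts = [p for p in urlsplit(call).path.split("/") if p]
    if parts[:2] == ["data", "list"]:
        # Any dataset-listing call matches, regardless of filter order.
        return """<oml:data xmlns:oml="http://openml.org/openml">
            <oml:dataset>
                <oml:did>61</oml:did>
                <oml:name>iris</oml:name>
                <oml:version>1</oml:version>
                <oml:status>active</oml:status>
            </oml:dataset>
        </oml:data>"""
    raise ValueError(f"Unexpected call: {call}")
```

Calls that reorder the `data_name` and `limit` filters now hit the same branch, while genuinely unexpected endpoints still fail loudly.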
```python
mock_list_tasks.side_effect = _create_mock_listing_call(
    min_number_tasks_on_test_server, lambda i: {"tid": i}
)
tasks = openml.tasks.list_tasks(size=min_number_tasks_on_test_server)
assert min_number_tasks_on_test_server == len(tasks)
```
These tests now use local mocks rather than the test server, but the fixture/variable name min_number_tasks_on_test_server (and similar) still suggests a dependency on the remote test server state. Renaming the parameter/fixture (and updating its docstring) to reflect that it’s just a mocked item count would make the intent clearer and avoid confusion about network requirements.
| """This test verifies that the test server downloads the data from the correct source. | ||
|
|
||
| If this tests fails, it is highly likely that the test server is not configured correctly. | ||
| Usually, this means that the test server is serving data from the task with the same ID from the production server. | ||
| That is, it serves parquet files wrongly associated with the test server's task. | ||
| """ | ||
| mock_task = unittest.mock.Mock() | ||
| mock_dataset = unittest.mock.Mock() | ||
| mock_dataset.features = {0: "feature1", 1: "feature2"} | ||
| mock_dataset.get_data.return_value = (pd.DataFrame({"feature1": [1], "feature2": [2]}), None, None, None) | ||
| mock_task.get_dataset.return_value = mock_dataset | ||
| mock_get_task.return_value = mock_task | ||
|
|
||
| task = openml.tasks.get_task(119) |
test_correct_test_server_download_state no longer tests the stated behavior in its docstring: patching openml.tasks.get_task replaces the whole OpenML code path with a mock, so the assertion only validates the mock setup (features vs dataframe columns) rather than verifying the server/data-source configuration. Either (a) update/rename the test and docstring to reflect the new goal, or (b) mock at a lower level (e.g., _perform_api_call / download layer) so openml.tasks.get_task() and dataset parsing logic are still exercised offline.
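Option (b) can be illustrated in miniature. The snippet below is a self-contained stand-in, not openml's real internals: `perform_api_call` plays the role of the transport layer and `get_task` the role of the higher-level parsing code. Patching only the transport means `get_task` still executes its own logic offline, which is the property the reviewer wants preserved:

```python
import unittest.mock


def perform_api_call(endpoint):
    # Stand-in for the live transport layer; real tests would never reach it.
    raise RuntimeError("network disabled in tests")


def get_task(task_id):
    # Higher-level logic that the test should still exercise.
    xml = perform_api_call(f"task/{task_id}")
    # Trivial stand-in for the real XML-to-object parsing.
    return {"id": task_id, "raw": xml}


# Patch only the lowest layer; get_task() and its "parsing" still run.
with unittest.mock.patch(f"{__name__}.perform_api_call",
                         return_value="<oml:task/>") as mock_call:
    task = get_task(119)

assert task == {"id": 119, "raw": "<oml:task/>"}
mock_call.assert_called_once_with("task/119")
```

Patching `openml.tasks.get_task` directly, by contrast, would replace everything above the dashed line, so the assertions could only ever validate the mock itself.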
Reference Issue: Fixes #1649
- Replace tests' live-server network calls with local mocks/patches so tests run deterministically and offline.
- Add helpers to mock listing endpoints and API call responses (for datasets, tasks, flows, setups, runs, evaluations).
- Patch cache/config calls and use `tmp_path` where needed, so cache utilities are tested without touching the real filesystem config.
- No production code changes; test-only changes.

Result: Passed