fix(api): prevent background task garbage collection in lifespan startup#6551
Open
alexzhu0 wants to merge 1 commit into
Open
fix(api): prevent background task garbage collection in lifespan startup#6551alexzhu0 wants to merge 1 commit into
alexzhu0 wants to merge 1 commit into
Conversation
The two `asyncio.create_task(...)` calls in `keep/api/api.py` had no strong reference held to the returned Task object. Per the Python asyncio docs (https://docs.python.org/3/library/asyncio-task.html#asyncio.create_task), the event loop only keeps a weak reference to a task, so a fire-and-forget task can be garbage-collected mid-execution. When that happens the task is silently cancelled and the user sees no traceback, just missing work. Affected sites: - `startup()` line 183: `asyncio.create_task(process_watcher_task.async_process_watcher())` — the maintenance-window watcher had zero strong references. - `lifespan()` line 234: `asyncio.create_task(check_pending_tasks(background_tasks))` — the debug pending-task logger was passed `background_tasks` as an argument but never added itself to that set, so it had no strong reference either. Fix: hold a strong reference by adding each task to the existing `background_tasks` set that `lifespan()` already maintains for request-context use, and register a done-callback so the task removes itself on completion. `startup()` now accepts an optional `background_tasks` set; when present it is used to track the watcher task. Passing `None` keeps backward compatibility for any other caller.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #6552
Problem
Two
asyncio.create_task(...)calls inkeep/api/api.pycreate background tasks without holding a strong reference to the returnedTaskobject. The Python asyncio docs warn that the event loop only keeps a weak reference to a task, so a fire-and-forget task can be garbage-collected mid-execution and silently cancelled — no traceback, just missing work.Affected sites
startup()line 183 — the maintenance-window watcher:Zero strong references. The Task can be collected before it ever runs
async_process_watcher, and when it does the watcher just disappears.lifespan()line 234 — the debug pending-task logger:check_pending_tasksis passed thebackground_tasksset as an argument, but the Task wrapping the coroutine is never added to that set. So this task also lacks a strong reference and is GC-eligible.Root cause
CPython garbage-collects coroutine Tasks the same way it collects any other object. The asyncio event loop only keeps weak references to running tasks (
asyncio.all_tasks()is implemented via aWeakSet). If no other code holds a strong reference, the GC is free to reclaim the Task at any GC pass, which cancels it.Changes
keep/api/api.py— one file, +10 / -5 lines:startup()now accepts an optionalbackground_tasks: set | None = Noneparameter; when present the watcher task is added to it and registers a done-callback to discard itself on completion (task.add_done_callback(background_tasks.discard)).lifespan()passes its existingbackground_tasksset intostartup(), and now also adds the debugcheck_pending_taskstask to that same set with the same done-callback. The set was already being created here for request-context use; this just makes the lifespan tasks share it.settype annotation is tightened toset[asyncio.Task]for clarity.Backward compatible: any other caller of
startup()continues to work because the new parameter defaults toNone.Testing
Manual reasoning + asyncio docs reference; the GC-cancellation behaviour is established Python asyncio semantics and the fix matches the pattern recommended in the stdlib docs (hold strong reference in a set, discard via done-callback).
I considered adding a unit test but a test that reliably triggers
gc.collect()mid-task in pytest would be fragile and the bug class is well-documented. Happy to add a regression test that simply asserts the watcher Task ends up inapp.stateif maintainers prefer.