Add uvloop-inspired async task API by benoitc · Pull Request #25 · benoitc/erlang-python

benoitc · 2026-03-11T18:57:33Z

Summary

Add thread-safe async task API to py_event_loop module
py_event_loop:run/3,4 - Blocking run of async Python functions
py_event_loop:create_task/3,4 - Non-blocking task submission
py_event_loop:await/1,2 - Wait for task result with timeout
py_event_loop:spawn_task/3,4 - Fire-and-forget task execution

Uses enif_send for thread-safe wakeup from dirty schedulers. Task queue protected by mutex. Results delivered via Erlang message passing.

Also adds erlang.sleep() and erlang.call() for sync context, plus explicit scheduling API (erlang.schedule(), erlang.schedule_py()).

Add erlang.sleep() function that works in both async and sync contexts: - Async: returns asyncio.sleep() which uses Erlang timer system - Sync: uses erlang.call('_py_sleep') callback with receive/after, truly releasing the dirty scheduler for cooperative yielding Remove unused _erlang_sleep NIF which only released the GIL but blocked the pthread. The callback approach properly suspends the Erlang process. Changes: - Add sleep() to _erlang_impl and export to erlang module - Add _py_sleep callback in py_event_loop.erl - Remove py_erlang_sleep NIF and dispatch_sleep_complete - Remove sync_sleep fields from event loop struct - Remove sleep handlers from py_event_worker - Update tests to use erlang.sleep()

Update docstring and asyncio.md to clarify: - Both sync and async modes release the dirty NIF scheduler - Async: yields to event loop via asyncio.sleep()/call_later() - Sync: suspends Erlang process via receive/after callback Also fix outdated architecture diagram that referenced removed sleep_wait/dispatch_sleep_complete NIF.

- Make erlang.call() blocking (no replay) - Add erlang.schedule(), schedule_py(), consume_time_slice() - ScheduleMarker type for explicit dirty scheduler release

Add documentation for: - erlang.call() now blocks (no replay) - erlang.schedule() for Erlang callback continuation - erlang.schedule_py() for Python function continuation - erlang.consume_time_slice() for cooperative scheduling

Implement uvloop-inspired thread-safe task submission for running async Python coroutines from any Erlang dirty scheduler thread. Core changes: - Add task queue (ErlNifIOQueue) to event loop for atomic operations - nif_call_soon_threadsafe: serialize and enqueue task, send wakeup - nif_process_ready_tasks: dequeue and schedule tasks on event loop - py_event_worker handles task_ready message to process queue High-level Erlang API: - py_event_loop:run/3,4 - blocking run, wait for result - py_event_loop:create_task/3,4 - non-blocking, returns ref - py_event_loop:spawn/3,4 - fire-and-forget with optional notify - py_event_loop:await/1,2 - wait for task result Uses enif_send() for thread-safe wakeup from any dirty scheduler, avoiding the thread-local event loop issues with asyncio.

This reverts commit cba1903.

Implements a thread-safe async task queue that works from dirty schedulers: - Add task_queue (ErlNifIOQueue) and py_loop fields to erlang_event_loop_t - nif_submit_task: Thread-safe task submission via enif_ioq and enif_send - nif_process_ready_tasks: Dequeue tasks, create coroutines, schedule on loop - py_event_worker handles task_ready wakeup message - High-level Erlang API: run/3,4, create_task/3,4, await/1,2, spawn_task/3,4 - Python ErlangEventLoop registers with global loop via _set_global_loop_ref - Register callbacks early in supervisor to ensure availability

- Add uvloop-style lazy Python loop creation in process_ready_tasks - Only call _run_once when coroutines are scheduled (not for sync functions) - Use enif_send directly for sync function results (faster path) - Fix queue size tracking in task processing loop Before: 1003 ms/task (1 task/sec) After: 0.009 ms/task (117K tasks/sec)

Pass timeout_hint=0 to _run_once() when coroutines are scheduled, preventing the event loop from blocking for up to 1 second when work is already pending. This matches uvloop's approach of computing exact sleep times. Changes: - Add timeout_hint parameter to ErlangEventLoop._run_once() - Update C code to pass timeout=0 after scheduling coroutines - Add bench_channel_async.erl for sync vs async comparison

Reduces async task API overhead by: - Early exit before GIL acquisition when task queue is empty - Caching asyncio module and _run_and_send function across calls - Only calling _run_once when coroutines are actually scheduled Performance improvements: - create_task + await: ~40% faster (157K vs 113K tasks/sec) - Concurrent tasks: ~30% faster (360K vs 275K tasks/sec)

uvloop-style optimizations for the Python event loop: - Handle pooling: reuse Handle objects in call_soon() instead of allocating - Time caching: cache time.monotonic() at start of each _run_once iteration - Clear context references when returning handles to pool These reduce allocations and syscalls in the hot path.

Restructure nif_process_ready_tasks into two phases: - Phase 1: Dequeue all tasks WITHOUT GIL (NIF operations only) - Phase 2: Acquire GIL once, process entire batch, release Benefits: - GIL held only during Python operations, not NIF operations - Batch up to 64 tasks per GIL acquisition - Task queue mutex released before GIL acquired (no lock overlap)

SuspensionRequiredException inherits from BaseException, not Exception, so the except Exception block didn't catch it. This caused the suspension mechanism to replay the entire function, making time measurements show ~0 elapsed time instead of the actual sleep duration. The fix catches BaseException and falls back to time.sleep() for correct timing behavior in py:call contexts. For dirty scheduler release in sync contexts, py:exec/py:eval should be used instead.

Add 15 new tests covering: - Stdlib operations (math.sqrt, pow, floor, ceil) - Operator module functions (add, mul) - Error handling (invalid module, function, timeout) - Concurrency (multiple processes, batch tasks) - Edge cases (empty args, large results, nested data) Tests use stdlib modules to avoid context issues with __main__.

benoitc added 21 commits March 11, 2026 08:39

Add blocking erlang.call() and explicit scheduling API

2c93eee

- Make erlang.call() blocking (no replay) - Add erlang.schedule(), schedule_py(), consume_time_slice() - ScheduleMarker type for explicit dirty scheduler release

Clarify dirty scheduler behavior in channel.receive and sleep docs

800fa34

Revert "Add thread-safe async task API (call_soon_threadsafe pattern)"

d821938

This reverts commit cba1903.

Simplify cb_sleep timeout handling

ae39ce5

Document async task API in changelog and asyncio docs

22bdf0c

Add event loop architecture documentation with diagrams

173f498

Document two-phase task processing in architecture docs

63efe69

Merge fix/fix-erlang-call: Fix erlang.sleep() timing

d53114f

Merge origin/main into feature/async-task-api

f989c8b

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add uvloop-inspired async task API#25

Add uvloop-inspired async task API#25
benoitc wants to merge 21 commits intomainfrom
feature/async-task-api

benoitc commented Mar 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

benoitc commented Mar 11, 2026

Summary

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant