fix(voice): emit turn_ended when workflow yields no audio #3252
adityasingh2400 wants to merge 1 commit into
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 12a03672f4
local_queue = asyncio.Queue()
await local_queue.put(VoiceStreamEventLifecycle(event="turn_ended"))
self._ordered_tasks.append(local_queue)
Reset empty turns before accepting the next turn
In streamed sessions, an empty workflow yield followed immediately by another transcript now appends turn_ended but returns before the dispatcher calls _finish_turn(). _started_processing_turn stays true, so the next _add_text() suppresses turn_started, producing turn_started, turn_ended, audio, turn_ended and losing that turn's trace text.
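The sequence described in this comment can be reproduced with a toy model. This is a minimal sketch, not the SDK code: ToyTurnState, its methods, and the reset_now switch are hypothetical names that only mimic the role of _started_processing_turn, under the assumption that the flag is normally cleared when a turn with audio finishes.

```python
class ToyTurnState:
    """Hypothetical stand-in for the turn bookkeeping; not the SDK class."""

    def __init__(self):
        self.started_processing_turn = False
        self.events = []

    def add_text(self, text):
        # turn_started is emitted only when no turn is marked as in progress.
        if not self.started_processing_turn:
            self.started_processing_turn = True
            self.events.append("turn_started")
        if text.strip():
            self.events.append("audio")

    def turn_done(self, had_audio, reset_now):
        self.events.append("turn_ended")
        # Assumption: a turn with audio resets the flag as usual. For an
        # empty turn, reset_now models finishing the turn synchronously.
        if had_audio or reset_now:
            self.started_processing_turn = False


def run(reset_now):
    state = ToyTurnState()
    state.add_text("   ")    # empty workflow yield
    state.turn_done(had_audio=False, reset_now=reset_now)
    state.add_text("hello")  # next transcript arrives immediately
    state.turn_done(had_audio=True, reset_now=reset_now)
    return state.events
```

With reset_now=False the second turn loses its turn_started, matching the sequence quoted above; with reset_now=True both turns are balanced.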
Addressed in 9d27b959: Finish turn state synchronously before enqueuing the synthetic turn_ended and wait for the dispatcher to drain. A counter tells the dispatcher its _finish_turn is a no-op for these events. New tests cover empty-then-real and back-to-back empty turns.
The synthesized turn_ended for an empty workflow yield was enqueued in the dispatcher, but _started_processing_turn / tracing span stayed live until the dispatcher drained it. A follow-up _add_text could then skip turn_started, and the late _finish_turn could finish the next turn's span. Finish turn state synchronously, mark the synthetic event so the dispatcher's _finish_turn is skipped, and wait for the dispatcher to flush it before _turn_done returns so ordering is preserved.
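The ordering described in this commit message might look roughly like the following. It is a sketch under stated assumptions rather than the actual change: ToySession and all of its attributes are hypothetical, a single asyncio.Queue stands in for the ordered task queue, and Queue.join() is used here as one possible way to wait for the dispatcher to drain.

```python
import asyncio


class ToySession:
    """Hypothetical model of the ordering fix; names are illustrative only."""

    def __init__(self):
        self.ordered = asyncio.Queue()  # events in emission order
        self.out = []                   # what a consumer would observe
        self.turn_active = False
        self.finish_calls = 0
        self.synthetic_pending = 0      # turn_ended events already finished

    def finish_turn(self):
        self.finish_calls += 1
        self.turn_active = False

    async def dispatcher(self):
        while True:
            event = await self.ordered.get()
            if event is None:           # sentinel: stop dispatching
                self.ordered.task_done()
                break
            self.out.append(event)
            if event == "turn_ended":
                if self.synthetic_pending > 0:
                    # Turn state was already finished by the producer.
                    self.synthetic_pending -= 1
                else:
                    self.finish_turn()
            self.ordered.task_done()

    async def empty_turn_done(self):
        self.finish_turn()              # finish state synchronously first
        self.synthetic_pending += 1     # tell the dispatcher to skip its finish
        await self.ordered.put("turn_ended")
        await self.ordered.join()       # wait until the dispatcher has drained


async def demo():
    s = ToySession()
    task = asyncio.create_task(s.dispatcher())
    s.turn_active = True
    await s.ordered.put("turn_started")
    await s.empty_turn_done()
    snapshot = (s.turn_active, s.finish_calls, list(s.out))
    await s.ordered.put(None)
    await task
    return snapshot
```

In this model finish_turn runs exactly once per empty turn, and by the time empty_turn_done returns the consumer has already seen turn_ended, so a following turn starts from clean state.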
8f1f19c to 9d27b95
Summary
When a VoiceWorkflowBase.run() (or on_start()) yields only empty/whitespace deltas (common with LLM streaming keepalives), StreamedAudioResult._add_text calls _start_turn (emitting turn_started) but the sentence splitter leaves _text_buffer empty. _turn_done then sees an empty buffer and never schedules a TTS task with finish_turn=True, so the dispatcher exits the loop and emits session_ended without ever emitting turn_ended.
Consumers that pair turn_started/turn_ended lifecycle events (e.g. UI state machines, transcript boundary tracking) silently break.
The fix pushes a synthetic turn_ended lifecycle event onto the ordered task queue whenever a turn was started but no synthesizable text remains, preserving balanced lifecycle pairs.
Test plan
- test_voicepipeline_empty_workflow_yield_emits_turn_ended fails on main (asserts only [turn_started, session_ended] are emitted) and passes after the fix
- pytest tests/voice -v
🤖 Generated with Claude Code
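For consumers that depend on balanced lifecycle pairs, a small checker along these lines could flag the pre-fix sequence. It is only an illustration: unbalanced_turns is a hypothetical helper, and it assumes lifecycle events have already been reduced to their plain event-name strings.

```python
def unbalanced_turns(events):
    """Return a list of problems found in a lifecycle event-name sequence."""
    problems = []
    open_turn = False
    for i, ev in enumerate(events):
        if ev == "turn_started":
            if open_turn:
                problems.append(f"{i}: turn_started while a turn is open")
            open_turn = True
        elif ev == "turn_ended":
            if not open_turn:
                problems.append(f"{i}: turn_ended without turn_started")
            open_turn = False
        elif ev == "session_ended" and open_turn:
            problems.append(f"{i}: session_ended with a turn still open")
            open_turn = False
    return problems
```

Passing the pre-fix sequence [turn_started, session_ended] reports one problem at index 1, while [turn_started, turn_ended, session_ended] reports none.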