Skip to content

fix: recording issue on chatui#8440

Merged
Soulter merged 1 commit into
masterfrom
fix/8364
May 30, 2026
Merged

fix: recording issue on chatui#8440
Soulter merged 1 commit into
masterfrom
fix/8364

Conversation

@Soulter
Copy link
Copy Markdown
Member

@Soulter Soulter commented May 30, 2026

#8364

Modifications / 改动点

  • This is NOT a breaking change. / 这不是一个破坏性变更。

Screenshots or Test Results / 运行截图或测试结果


Checklist / 检查清单

  • 😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
    / 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。

  • 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
    / 我的更改经过了良好的测试,并已在上方提供了“验证步骤”和“运行截图”

  • 🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
    / 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到 requirements.txtpyproject.toml 文件相应位置。

  • 😮 My changes do not introduce malicious code.
    / 我的更改没有引入恶意代码。

Summary by Sourcery

Improve chat voice recording reliability and integrate it with the existing media upload flow.

Bug Fixes:

  • Ensure audio recording works across more browsers by selecting a supported MIME type and handling unsupported environments gracefully.
  • Prevent crashes and bad states when starting or stopping recordings fails or produces empty audio data.
  • Avoid leaking media tracks and blob URLs by consistently stopping recorder streams and revoking object URLs for removed recordings.

Enhancements:

  • Return recorded audio as a File object and reuse the shared media upload pipeline for chat voice messages instead of performing inline uploads.
  • Extend media handling to treat recorded audio as a first-class staged file type, with proper duplicate detection and removal behavior.
  • Refine chat session list UI so progress indicators are positioned consistently and animate when session actions are visible.

@dosubot dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. feature:chatui The bug / feature is about astrbot's chatui, webchat labels May 30, 2026
@Soulter Soulter merged commit b8cf2ef into master May 30, 2026
20 checks passed
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 2 issues, and left some high level feedback:

  • In useRecording.stopRecording, the cleanup logic for stopping tracks and resetting mediaRecorder is duplicated in the onstop, onerror, and catch paths; consider extracting this into a small helper so that future changes don’t need to be kept in sync across three locations.
  • In useRecording.stopRecording, recorder.onerror currently rejects with the raw event; wrapping this in an Error (with the event attached as metadata) will make downstream error handling and logging more consistent with the other error paths.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `useRecording.stopRecording`, the cleanup logic for stopping tracks and resetting `mediaRecorder` is duplicated in the `onstop`, `onerror`, and `catch` paths; consider extracting this into a small helper so that future changes don’t need to be kept in sync across three locations.
- In `useRecording.stopRecording`, `recorder.onerror` currently rejects with the raw event; wrapping this in an `Error` (with the event attached as metadata) will make downstream error handling and logging more consistent with the other error paths.

## Individual Comments

### Comment 1
<location path="dashboard/src/composables/useRecording.ts" line_range="55-57" />
<code_context>
+            mediaRecorder.value?.stream.getTracks().forEach(track => track.stop());
+            audioChunks.value = [];
+
             const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
-            mediaRecorder.value = new MediaRecorder(stream);
+            const mimeType = getSupportedMimeType();
+            mediaRecorder.value = new MediaRecorder(
+                stream,
+                mimeType ? { mimeType } : undefined
</code_context>
<issue_to_address>
**issue (bug_risk):** Stream from getUserMedia is not cleaned up if MediaRecorder construction fails.

If `new MediaRecorder(stream, ...)` throws (e.g., due to unsupported options), the `stream` remains active because the `catch` only logs, updates `isRecording`, and rethrows. Consider either:

- An inner `try/catch` around `new MediaRecorder(...)` that stops `stream.getTracks()` on failure, or
- Managing `stream` separately (e.g., `let stream: MediaStream | null`) and using a `finally` to stop tracks when `mediaRecorder.value` was never set.

This prevents leaving the microphone open when recorder construction fails.
</issue_to_address>

### Comment 2
<location path="dashboard/src/composables/useRecording.ts" line_range="81" />
<code_context>
     }

-    async function stopRecording(onStop?: (label: string) => void): Promise<string> {
+    async function stopRecording(onStop?: (label: string) => void): Promise<File> {
         return new Promise((resolve, reject) => {
-            if (!mediaRecorder.value) {
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting recorder cleanup, MIME type storage, and ID generation into small helpers plus one extra ref to flatten control flow and remove duplication between start/stop recording.

You can keep all the new robustness while reducing duplication and branching with a couple of small helpers and one extra ref.

### 1. Extract recorder cleanup into a helper

Right now `stopRecording` repeats:

- `recorder.stream.getTracks().forEach(track => track.stop());`
- `if (mediaRecorder.value === recorder) { mediaRecorder.value = null; }`
- `audioChunks.value = [];` (in some paths)

This can be centralized:

```ts
const recordingMimeType = ref<string | null>(null);

function cleanupRecorder(recorder: MediaRecorder, options?: { clearChunks?: boolean }) {
  try {
    recorder.stream.getTracks().forEach(track => track.stop());
  } catch {
    // ignore
  }
  if (mediaRecorder.value === recorder) {
    mediaRecorder.value = null;
  }
  if (options?.clearChunks) {
    audioChunks.value = [];
  }
}
```

Then `stopRecording` becomes flatter and easier to follow:

```ts
async function stopRecording(onStop?: (label: string) => void): Promise<File> {
  return new Promise((resolve, reject) => {
    const recorder = mediaRecorder.value;
    if (!recorder) {
      reject(new Error('No media recorder'));
      return;
    }

    isRecording.value = false;
    onStop?.('聊天输入框');

    const rejectOnce = (err: unknown) => {
      cleanupRecorder(recorder, { clearChunks: true });
      reject(err);
    };

    recorder.onstop = () => {
      try {
        const mimeType = recordingMimeType.value || getRecordingMimeType();
        const audioBlob = new Blob(audioChunks.value, { type: mimeType });
        cleanupRecorder(recorder, { clearChunks: true });

        if (!audioBlob.size) {
          rejectOnce(new Error('Recording is empty'));
          return;
        }

        const filename = getRecordingFilename(mimeType);
        const audioFile = new File([audioBlob], filename, {
          type: mimeType,
          lastModified: Date.now(),
        });
        resolve(audioFile);
      } catch (e) {
        rejectOnce(e);
      }
    };

    recorder.onerror = (event) => {
      rejectOnce(event);
    };

    try {
      recorder.stop();
    } catch (error) {
      rejectOnce(error);
    }
  });
}
```

This preserves all current behavior but:

- Cleanup happens in one helper.
- There is effectively a single reject path (`rejectOnce`).
- `onstop` / `onerror` are registered before `stop()`.

### 2. Store the chosen mime type once

You can avoid re-deriving MIME type later by storing the chosen type from `getSupportedMimeType()` when starting:

```ts
const recordingMimeType = ref<string | null>(null);

async function startRecording(onStart?: (label: string) => void) {
  try {
    if (!navigator.mediaDevices?.getUserMedia || typeof MediaRecorder === 'undefined') {
      throw new Error('Audio recording is not supported in this browser');
    }

    mediaRecorder.value?.stream.getTracks().forEach(track => track.stop());
    audioChunks.value = [];

    const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
    const mimeType = getSupportedMimeType();
    recordingMimeType.value = mimeType || null;

    mediaRecorder.value = new MediaRecorder(
      stream,
      mimeType ? { mimeType } : undefined
    );

    mediaRecorder.value.ondataavailable = (event) => {
      if (event.data.size > 0) {
        audioChunks.value.push(event.data);
      }
    };

    mediaRecorder.value.start();
    isRecording.value = true;
    onStart?.('录音中...');
  } catch (error) {
    console.error('Failed to start recording:', error);
    isRecording.value = false;
    throw error;
  }
}
```

Then `stopRecording` can prefer `recordingMimeType.value` and use `getRecordingMimeType()` only as a fallback (as shown above), which reduces branching and the conceptual duplication between the two functions.

### 3. Extract ID generation

To keep `getRecordingFilename` focused on extension logic, move the fallback ID generation into a helper:

```ts
function generateId(): string {
  return crypto.randomUUID?.() || `${Date.now()}-${Math.random().toString(16).slice(2)}`;
}

function getRecordingFilename(mimeType: string): string {
  const extensionMap: Record<string, string> = {
    'audio/webm': 'webm',
    'audio/webm;codecs=opus': 'webm',
    'audio/ogg': 'ogg',
    'audio/ogg;codecs=opus': 'ogg',
    'audio/mp4': 'm4a',
    'audio/mpeg': 'mp3',
    'audio/wav': 'wav'
  };
  const normalizedMimeType = mimeType.toLowerCase();
  const extension =
    extensionMap[normalizedMimeType] ||
    normalizedMimeType.split('/')[1]?.split(';')[0] ||
    'webm';

  return `${generateId()}.${extension}`;
}
```

This keeps all existing behavior (including older-environment robustness) while simplifying the main functions that most readers care about.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines 55 to +57
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
mediaRecorder.value = new MediaRecorder(stream);
const mimeType = getSupportedMimeType();
mediaRecorder.value = new MediaRecorder(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Stream from getUserMedia is not cleaned up if MediaRecorder construction fails.

If new MediaRecorder(stream, ...) throws (e.g., due to unsupported options), the stream remains active because the catch only logs, updates isRecording, and rethrows. Consider either:

  • An inner try/catch around new MediaRecorder(...) that stops stream.getTracks() on failure, or
  • Managing stream separately (e.g., let stream: MediaStream | null) and using a finally to stop tracks when mediaRecorder.value was never set.

This prevents leaving the microphone open when recorder construction fails.

}

async function stopRecording(onStop?: (label: string) => void): Promise<string> {
async function stopRecording(onStop?: (label: string) => void): Promise<File> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (complexity): Consider extracting recorder cleanup, MIME type storage, and ID generation into small helpers plus one extra ref to flatten control flow and remove duplication between start/stop recording.

You can keep all the new robustness while reducing duplication and branching with a couple of small helpers and one extra ref.

1. Extract recorder cleanup into a helper

Right now stopRecording repeats:

  • recorder.stream.getTracks().forEach(track => track.stop());
  • if (mediaRecorder.value === recorder) { mediaRecorder.value = null; }
  • audioChunks.value = []; (in some paths)

This can be centralized:

const recordingMimeType = ref<string | null>(null);

function cleanupRecorder(recorder: MediaRecorder, options?: { clearChunks?: boolean }) {
  try {
    recorder.stream.getTracks().forEach(track => track.stop());
  } catch {
    // ignore
  }
  if (mediaRecorder.value === recorder) {
    mediaRecorder.value = null;
  }
  if (options?.clearChunks) {
    audioChunks.value = [];
  }
}

Then stopRecording becomes flatter and easier to follow:

async function stopRecording(onStop?: (label: string) => void): Promise<File> {
  return new Promise((resolve, reject) => {
    const recorder = mediaRecorder.value;
    if (!recorder) {
      reject(new Error('No media recorder'));
      return;
    }

    isRecording.value = false;
    onStop?.('聊天输入框');

    const rejectOnce = (err: unknown) => {
      cleanupRecorder(recorder, { clearChunks: true });
      reject(err);
    };

    recorder.onstop = () => {
      try {
        const mimeType = recordingMimeType.value || getRecordingMimeType();
        const audioBlob = new Blob(audioChunks.value, { type: mimeType });
        cleanupRecorder(recorder, { clearChunks: true });

        if (!audioBlob.size) {
          rejectOnce(new Error('Recording is empty'));
          return;
        }

        const filename = getRecordingFilename(mimeType);
        const audioFile = new File([audioBlob], filename, {
          type: mimeType,
          lastModified: Date.now(),
        });
        resolve(audioFile);
      } catch (e) {
        rejectOnce(e);
      }
    };

    recorder.onerror = (event) => {
      rejectOnce(event);
    };

    try {
      recorder.stop();
    } catch (error) {
      rejectOnce(error);
    }
  });
}

This preserves all current behavior but:

  • Cleanup happens in one helper.
  • There is effectively a single reject path (rejectOnce).
  • onstop / onerror are registered before stop().

2. Store the chosen mime type once

You can avoid re-deriving MIME type later by storing the chosen type from getSupportedMimeType() when starting:

const recordingMimeType = ref<string | null>(null);

async function startRecording(onStart?: (label: string) => void) {
  try {
    if (!navigator.mediaDevices?.getUserMedia || typeof MediaRecorder === 'undefined') {
      throw new Error('Audio recording is not supported in this browser');
    }

    mediaRecorder.value?.stream.getTracks().forEach(track => track.stop());
    audioChunks.value = [];

    const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
    const mimeType = getSupportedMimeType();
    recordingMimeType.value = mimeType || null;

    mediaRecorder.value = new MediaRecorder(
      stream,
      mimeType ? { mimeType } : undefined
    );

    mediaRecorder.value.ondataavailable = (event) => {
      if (event.data.size > 0) {
        audioChunks.value.push(event.data);
      }
    };

    mediaRecorder.value.start();
    isRecording.value = true;
    onStart?.('录音中...');
  } catch (error) {
    console.error('Failed to start recording:', error);
    isRecording.value = false;
    throw error;
  }
}

Then stopRecording can prefer recordingMimeType.value and use getRecordingMimeType() only as a fallback (as shown above), which reduces branching and the conceptual duplication between the two functions.

3. Extract ID generation

To keep getRecordingFilename focused on extension logic, move the fallback ID generation into a helper:

function generateId(): string {
  return crypto.randomUUID?.() || `${Date.now()}-${Math.random().toString(16).slice(2)}`;
}

function getRecordingFilename(mimeType: string): string {
  const extensionMap: Record<string, string> = {
    'audio/webm': 'webm',
    'audio/webm;codecs=opus': 'webm',
    'audio/ogg': 'ogg',
    'audio/ogg;codecs=opus': 'ogg',
    'audio/mp4': 'm4a',
    'audio/mpeg': 'mp3',
    'audio/wav': 'wav'
  };
  const normalizedMimeType = mimeType.toLowerCase();
  const extension =
    extensionMap[normalizedMimeType] ||
    normalizedMimeType.split('/')[1]?.split(';')[0] ||
    'webm';

  return `${generateId()}.${extension}`;
}

This keeps all existing behavior (including older-environment robustness) while simplifying the main functions that most readers care about.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the audio recording and media handling system. It introduces a dedicated useRecording composable to manage recording states and device streams, and updates useMediaHandling to treat voice recordings as staged files of type 'record'. In Chat.vue, the recording functions are updated to use these composables. The reviewer recommends passing 'record' as an argument to processAndUploadFile in Chat.vue to ensure the uploaded audio is correctly classified as a voice recording on the frontend.

Comment on lines +1320 to 1331
async function stopRecording() {
try {
const audioFile = await stopRecorder();
const uploaded = await processAndUploadFile(audioFile);
if (!uploaded) {
toast.error(tm("voice.error"));
}
} catch (error) {
console.error("Failed to stop recording:", error);
toast.error(tm("voice.error"));
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Pass 'record' as the typeOverride parameter to processAndUploadFile so that the uploaded audio file is correctly identified as a voice recording on the frontend.

async function stopRecording() {
  try {
    const audioFile = await stopRecorder();
    const uploaded = await processAndUploadFile(audioFile, 'record');
    if (!uploaded) {
      toast.error(tm("voice.error"));
    }
  } catch (error) {
    console.error("Failed to stop recording:", error);
    toast.error(tm("voice.error"));
  }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature:chatui The bug / feature is about astrbot's chatui, webchat size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant