fix: detect Tencent SILK (\x02 prefix) in audio magic bytes to avoid ffmpeg failure #8009

Open
Tom8266 wants to merge 2 commits into AstrBotDevs:master from Tom8266:master

Tom8266 commented May 5, 2026

Root Cause

The QQ official bot sends voice messages in Tencent SILK format, which puts a leading \x02 byte before the #!SILK_V3 magic. _get_audio_magic_type() had two defects that prevented SILK detection:

  1. Standard SILK (line 353): the check was header[:8] == b"#!SILK_V3", but #!SILK_V3 is 9 bytes, so an 8-byte slice never matched (an off-by-one slice error).
  2. Tencent SILK: QQ sends \x02#!SILK_V3, which was not detected at all.
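
Both defects can be reproduced in isolation. The snippet below is a minimal sketch rather than the code from media_utils.py; the payload bytes after the magic are invented for the demonstration.

```python
MAGIC = b"#!SILK_V3"  # 9 bytes long

# Hypothetical headers: the bytes after the magic are made up for this demo.
standard_header = MAGIC + b"\x0c\x00payload"
tencent_header = b"\x02" + MAGIC + b"\x0c\x00payload"

# An 8-byte slice can never equal a 9-byte literal.
old_check = standard_header[:8] == MAGIC

# Widening the slice to 9 bytes matches standard SILK.
new_check = standard_header[:9] == MAGIC

# The Tencent variant shifts the magic by one byte, so it needs its own check.
tencent_with_plain_check = tencent_header[:9] == MAGIC
tencent_with_shifted_check = (
    tencent_header[:1] == b"\x02" and tencent_header[1:10] == MAGIC
)

print(len(MAGIC), old_check, new_check)
print(tencent_with_plain_check, tencent_with_shifted_check)
```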

Fix

#  Change                                                  Location
1  Import tencent_silk_to_wav                              L18
2  Standard SILK: header[:9] (was [:8])                    L353
3  Tencent SILK: header[1:10] (new detection)              L357
4  ensure_wav() routes silk type → tencent_silk_to_wav()   L308

Relationship to #7832

#7832 adds silk/amr routing in ensure_wav(), but does not fix the magic byte detection, so _get_audio_magic_type() still never returns "silk". This PR completes the fix by correcting the standard SILK slice and adding Tencent SILK detection, so that the routing is actually reached.

Before / After

Before: QQ voice → magic returns "" → ffmpeg → "Invalid data found" error
After:  QQ voice → magic returns "silk" → tencent_silk_to_wav → WAV OK

Testing

  • ✅ Tencent SILK file detected as "silk" (verified with real QQ voice files)
  • ✅ Standard SILK file detected as "silk"
  • ✅ WAV files still short-circuit correctly
  • ✅ All other audio formats (mp3, ogg, flac, amr, opus, mp4) unaffected
  • ✅ No circular imports

Fixed by: Tom8266 and deepseek-v4-pro

dosubot (bot) added the labels size:S (This PR changes 10-29 lines, ignoring generated files.) and area:core (The bug / feature is about astrbot's core, backend) on May 5, 2026

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've left some high level feedback:

  • In ensure_wav, the temp output path creation for SILK appears to duplicate logic likely already handled by convert_audio_to_wav for a None output_path; consider centralizing temp path creation to a single place to avoid divergence in behavior.
  • In _get_audio_magic_type, the SILK detection logic could be made clearer and less error-prone by using header.startswith(b"#!SILK_V3") and header.startswith(b"\x02#!SILK_V3") instead of explicit slice indices.
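
As a minimal sketch of the startswith approach, assuming a standalone helper (the name looks_like_silk and the constant SILK_PREFIXES are hypothetical and do not appear in the PR):

```python
# Both known SILK magics: standard, and Tencent's variant with a leading \x02.
SILK_PREFIXES = (b"#!SILK_V3", b"\x02#!SILK_V3")


def looks_like_silk(header: bytes) -> bool:
    """Return True if the header starts with either SILK magic."""
    # bytes.startswith accepts a tuple and handles short input safely,
    # so no slice bounds have to be kept in sync with the literal lengths.
    return header.startswith(SILK_PREFIXES)


standard_ok = looks_like_silk(b"#!SILK_V3\x0c\x00")
tencent_ok = looks_like_silk(b"\x02#!SILK_V3\x0c\x00")
wav_rejected = not looks_like_silk(b"RIFF\x00\x00\x00\x00WAVE")
truncated_rejected = not looks_like_silk(b"\x02#!SILK")
```

Because the prefix lengths are taken from the literals themselves, adding another variant only means extending the tuple.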

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request adds support for processing Tencent SILK audio files by updating the magic byte detection logic and integrating a conversion helper. The review suggests refactoring the temporary file path generation to reduce code duplication and improve consistency using pathlib, as well as adopting the more idiomatic startswith method with a tuple for checking file headers.

Comment on lines +308 to 315
if audio_type == "silk":
    if output_path is None:
        temp_dir = get_astrbot_temp_path()
        os.makedirs(temp_dir, exist_ok=True)
        output_path = os.path.join(temp_dir, f"media_audio_{uuid.uuid4().hex}.wav")
    return await tencent_silk_to_wav(audio_path, output_path)

return await convert_audio_to_wav(audio_path, output_path)

medium

The logic for generating a temporary output path is duplicated here and in convert_audio_format. Using pathlib.Path for consistency and refactoring the path generation to the top of the conversion logic makes the function cleaner and ensures output_path is available for both silk and other formats, adhering to the rule of avoiding code duplication. Furthermore, please ensure this new attachment handling functionality is covered by unit tests.

Suggested change
- if audio_type == "silk":
-     if output_path is None:
-         temp_dir = get_astrbot_temp_path()
-         os.makedirs(temp_dir, exist_ok=True)
-         output_path = os.path.join(temp_dir, f"media_audio_{uuid.uuid4().hex}.wav")
-     return await tencent_silk_to_wav(audio_path, output_path)
- return await convert_audio_to_wav(audio_path, output_path)
+ if output_path is None:
+     temp_dir = Path(get_astrbot_temp_path())
+     temp_dir.mkdir(parents=True, exist_ok=True)
+     output_path = str(temp_dir / f"media_audio_{uuid.uuid4().hex}.wav")
+ if audio_type == "silk":
+     return await tencent_silk_to_wav(audio_path, output_path)
+ return await convert_audio_to_wav(audio_path, output_path)
References
  1. When implementing similar functionality for different cases (e.g., direct vs. quoted attachments), refactor the logic into a shared helper function to avoid code duplication.
  2. New functionality, such as handling attachments, should be accompanied by corresponding unit tests.
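
One way to centralize the temp path logic is a small shared helper. The following is only a sketch under stated assumptions: resolve_output_path is a hypothetical name, and the temp directory is passed in as a parameter instead of calling get_astrbot_temp_path(), so that the example runs on its own.

```python
import tempfile
import uuid
from pathlib import Path
from typing import Optional


def resolve_output_path(output_path: Optional[str], temp_dir: str) -> str:
    """Return output_path unchanged, or a fresh .wav path under temp_dir."""
    if output_path is not None:
        return output_path
    directory = Path(temp_dir)
    # Create the directory on demand so callers do not repeat this step.
    directory.mkdir(parents=True, exist_ok=True)
    return str(directory / f"media_audio_{uuid.uuid4().hex}.wav")


# Usage: one call serves both the SILK branch and the ffmpeg branch.
with tempfile.TemporaryDirectory() as tmp:
    generated = resolve_output_path(None, str(Path(tmp) / "nested"))
    explicit = resolve_output_path("custom.wav", tmp)
    nested_dir_created = (Path(tmp) / "nested").is_dir()
```

A helper of this shape is also straightforward to unit test, since it takes plain arguments and does no audio conversion.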

Comment thread astrbot/core/utils/media_utils.py Outdated
Comment on lines 353 to 358
if header[:9] == b"#!SILK_V3":
    return "silk"

# Tencent SILK: leading \x02 byte before #!SILK_V3
if header[:1] == b"\x02" and header[1:10] == b"#!SILK_V3":
    return "silk"

medium

Using startswith with a tuple of prefixes is more idiomatic in Python (PEP 8) and less error-prone than manual slicing for magic byte detection, especially when dealing with prefixes of different lengths.

Suggested change
- if header[:9] == b"#!SILK_V3":
-     return "silk"
- # Tencent SILK: leading \x02 byte before #!SILK_V3
- if header[:1] == b"\x02" and header[1:10] == b"#!SILK_V3":
-     return "silk"
+ if header.startswith((b"#!SILK_V3", b"\x02#!SILK_V3")):
+     return "silk"
References
  1. Using startswith with a tuple of prefixes is more idiomatic in Python (PEP 8) for checking multiple possible prefixes.

…ffmpeg failure

QQ official bot sends voice in Tencent SILK format (leading \x02 byte before
#!SILK_V3 magic). _get_audio_magic_type() had two off-by-one slice errors:

  1. Standard SILK:  header[:8]  vs b'#!SILK_V3' (8 != 9 bytes) — never matched
  2. Tencent SILK:   not detected at all

Fixes:
  - Standard SILK:  header[:9]  == b'#!SILK_V3'   (correct 9-byte slice)
  - Tencent SILK:   header[:1] == b"\x02" and header[1:10] == b'#!SILK_V3'
  - ensure_wav() routes detected silk to tencent_silk_to_wav()

Before: QQ voice → ffmpeg → 'Invalid data found'
After:  QQ voice → magic detects silk → tencent_silk_to_wav → WAV OK
Replace manual slice comparisons with startswith() — cleaner, less
error-prone, and immune to off-by-one slice errors.

Suggested by: sourcery-ai