fix: native llama reasoning stream and provider options #216

Open
JKobrynski wants to merge 3 commits into callstackincubator:main from JKobrynski:fix/native-llama-reasoning-stream-and-provider-options

@JKobrynski

Contributor

Related issue - #199

Summary

Improves the @react-native-ai/llama streaming adapter by preferring native parsed reasoning/content fields from llama.rn when they are available, instead of relying only on placeholder parsing of raw token text.
Previously, the adapter inferred reasoning boundaries entirely from streamed token markers like <think>...</think>. That fallback parser remains useful, but newer llama.rn versions can expose native structured fields such as tokenData.reasoning_content and tokenData.content, which are a more reliable source of truth.
This change updates the stream parser to use native parsed fields when present, emit proper incremental deltas from their accumulated values, preserve fallback placeholder parsing when native fields are absent, and keep tool-call behavior intact across both paths.
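
The delta emission described above can be sketched roughly as follows. This is a minimal illustration, not the adapter's actual code: `TokenDataSketch`, `StreamDelta`, `suffixDelta`, and `NativeDeltaTracker` are hypothetical names, and only the `content` and `reasoning_content` field names come from this description. It assumes the native fields carry the accumulated value so far, so each chunk's delta is whatever extends the previously seen value.

```typescript
// Hypothetical chunk shape; only `content` and `reasoning_content` are named in the PR text.
interface TokenDataSketch {
  token?: string;
  content?: string;
  reasoning_content?: string;
}

// Hypothetical delta events, named for illustration only.
type StreamDelta =
  | { type: "reasoning-delta"; delta: string }
  | { type: "text-delta"; delta: string };

// Returns the part of `next` that extends `previous`.
// Assumption: if `next` does not start with `previous`, all of it is treated as new.
function suffixDelta(previous: string, next: string): string {
  return next.startsWith(previous) ? next.slice(previous.length) : next;
}

class NativeDeltaTracker {
  private seenReasoning = "";
  private seenContent = "";

  // Returns null when the chunk has no native fields, so the caller can
  // fall back to marker-based parsing of the raw token text.
  process(tokenData: TokenDataSketch): StreamDelta[] | null {
    const { content, reasoning_content } = tokenData;
    if (content === undefined && reasoning_content === undefined) {
      return null;
    }

    const deltas: StreamDelta[] = [];
    if (reasoning_content !== undefined) {
      const delta = suffixDelta(this.seenReasoning, reasoning_content);
      this.seenReasoning = reasoning_content;
      if (delta) deltas.push({ type: "reasoning-delta", delta });
    }
    if (content !== undefined) {
      const delta = suffixDelta(this.seenContent, content);
      this.seenContent = content;
      if (delta) deltas.push({ type: "text-delta", delta });
    }
    // The raw `token` text is deliberately ignored here, which is what
    // avoids emitting the same text twice when both are present.
    return deltas;
  }
}
```

In this sketch, feeding `{ reasoning_content: "Let" }` and then `{ reasoning_content: "Let me", token: " me" }` yields the deltas `"Let"` and `" me"`, while a chunk with only `token` returns `null` and is left to the fallback parser.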

Changes

  • Prefer native tokenData.reasoning_content for reasoning stream output when available
  • Prefer native tokenData.content for normal text stream output when available
  • Track previously emitted native values and emit only new suffixes as deltas
  • Preserve fallback placeholder parsing when native fields are absent
  • Prevent duplicate text/reasoning emission when native fields and raw token text coexist on the same chunk
  • Ensure buffered fallback text is still flushed safely after native chunks
  • Preserve queued tool-call emission even when native parsed fields are present on a chunk
  • Keep <think> markers inside tool-call payload text untouched
  • Keep split <think> markers inside tool-call payload text untouched
  • Extract llama completion option building into buildLlamaCompletionOptions
  • Add focused unit tests for providerOptions.llama passthrough
  • Export the completion-options helper from the llama package entrypoint
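
For readers unfamiliar with the fallback path, the sketch below shows one way a marker-based parser can handle `<think>` and `</think>` markers that arrive split across chunks, and how leftover buffered text can be flushed at the end of the stream. It is an assumption-laden illustration rather than the code in `streamParser.ts`: `ThinkMarkerParser`, `Segment`, and `partialMarkerLength` are hypothetical names, and tool-call payload handling is intentionally left out.

```typescript
// Hypothetical segment type for illustration.
type Segment = { type: "reasoning" | "text"; text: string };

const OPEN_MARKER = "<think>";
const CLOSE_MARKER = "</think>";

// Length of the longest suffix of `text` that is a proper prefix of `marker`.
function partialMarkerLength(text: string, marker: string): number {
  const max = Math.min(text.length, marker.length - 1);
  for (let len = max; len > 0; len--) {
    if (text.endsWith(marker.slice(0, len))) return len;
  }
  return 0;
}

class ThinkMarkerParser {
  private buffer = "";
  private inReasoning = false;

  push(chunk: string): Segment[] {
    this.buffer += chunk;
    const out: Segment[] = [];

    // Consume every complete marker currently in the buffer.
    for (;;) {
      const marker = this.inReasoning ? CLOSE_MARKER : OPEN_MARKER;
      const index = this.buffer.indexOf(marker);
      if (index === -1) break;
      this.emit(out, this.buffer.slice(0, index));
      this.buffer = this.buffer.slice(index + marker.length);
      this.inReasoning = !this.inReasoning;
    }

    // Hold back a trailing fragment that could be the start of the next marker.
    const pending = this.inReasoning ? CLOSE_MARKER : OPEN_MARKER;
    const held = partialMarkerLength(this.buffer, pending);
    const cut = this.buffer.length - held;
    this.emit(out, this.buffer.slice(0, cut));
    this.buffer = this.buffer.slice(cut);
    return out;
  }

  // Emits whatever is still buffered once the stream has ended.
  flush(): Segment[] {
    const out: Segment[] = [];
    this.emit(out, this.buffer);
    this.buffer = "";
    return out;
  }

  private emit(out: Segment[], text: string): void {
    if (text) out.push({ type: this.inReasoning ? "reasoning" : "text", text });
  }
}
```

With this sketch, pushing `"Hi <th"`, `"ink>plan</thi"`, and `"nk>done"` in sequence produces a text segment `"Hi "`, a reasoning segment `"plan"`, and a text segment `"done"`. Holding back only the partial marker, rather than the whole chunk, keeps latency low while still never leaking marker fragments into the output.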

Testing

  • Ran bun test packages/llama/src/__tests__/streamParser.test.ts
  • Ran bun test packages/llama/src/__tests__/completionOptions.test.ts
  • Ran ./node_modules/.bin/tsc --noEmit --project packages/llama/tsconfig.json
  • Ran ./node_modules/.bin/tsc --noEmit --project packages/llama/tsconfig.build.json
  • Ran bun lint
  • Added regression coverage for:
    • native reasoning/content delta emission from accumulated values
    • native chunks that also include raw token text
    • fallback placeholder parsing when native fields are absent
    • buffered fallback flushing after native chunks
    • tool-call payloads containing <think> markers
    • split <think> markers inside tool-call payloads
    • queued tool-call emission from native chunks

@vercel

vercel Bot commented Jun 10, 2026

@JKobrynski is attempting to deploy a commit to the Callstack Team on Vercel.

A member of the Team first needs to authorize it.

@JKobrynski JKobrynski marked this pull request as ready for review June 10, 2026 11:19
@artus9033 artus9033 requested a review from Copilot June 10, 2026 11:29

Copilot AI left a comment

Contributor

Pull request overview

This PR updates the @react-native-ai/llama adapter to (1) prefer native structured streaming fields (tokenData.reasoning_content / tokenData.content) when available, and (2) forward providerOptions.llama into llama.rn completion options via a shared builder helper.

Changes:

  • Update the stream parser to emit incremental deltas based on native accumulated reasoning_content / content fields, while keeping the placeholder-based fallback path.
  • Extract completion option construction into buildLlamaCompletionOptions and add providerOptions.llama passthrough.
  • Add/extend unit tests for native streaming behavior and completion option passthrough.
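
The passthrough idea can be illustrated with a small sketch. The real `buildLlamaCompletionOptions` signature is not shown on this page, so everything here is hypothetical: `buildCompletionOptionsSketch`, `OptionsSketch`, and the option keys in the usage note are stand-ins. The merge policy (provider values override defaults, while `messages` always comes from the call itself) is an assumption, not a statement about the actual helper.

```typescript
// Hypothetical loose option bag; the real option types are not shown in this PR page.
type OptionsSketch = Record<string, unknown>;

function buildCompletionOptionsSketch(
  base: OptionsSketch & { messages: unknown[] },
  providerOptions?: { llama?: OptionsSketch },
): OptionsSketch {
  // Treat a missing `providerOptions.llama` as "nothing to pass through".
  const passthrough = providerOptions?.llama ?? {};

  // Assumption: provider-specific values win over defaults,
  // but `messages` is never overridden by the passthrough.
  return { ...base, ...passthrough, messages: base.messages };
}
```

For example, calling the sketch with a base of `{ messages, temperature: 0.7 }` and `{ llama: { temperature: 0.2, top_k: 40 } }` returns an object whose `temperature` is `0.2` and whose `top_k` is `40`; calling it without provider options returns the base values unchanged.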

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| packages/llama/src/streamParser.ts | Prefer native parsed content/reasoning_content and emit suffix deltas; adjust tool-call flushing logic. |
| packages/llama/src/completionOptions.ts | New helper for preparing messages (multimodal) and building completion options with providerOptions.llama passthrough. |
| packages/llama/src/ai-sdk.ts | Refactor to use shared completion-options builder and message preparation helper. |
| packages/llama/src/index.ts | Export completion-options helpers from the package entrypoint. |
| packages/llama/src/__tests__/streamParser.test.ts | Add coverage for native parsed streaming and tool-call edge cases. |
| packages/llama/src/__tests__/completionOptions.test.ts | Add focused tests for providerOptions.llama passthrough behavior. |

Comment on lines +328 to +335
    const hasNativeParsedContent =
      tokenData.content !== undefined ||
      tokenData.reasoning_content !== undefined

    if (hasNativeParsedContent) {
      processNativeParsedContent(tokenData)
      return
    }

Contributor Author

Good point, addressed
