Conversation

@giladgd (Member) commented on Nov 28, 2025

Description of change

  • fix: adapt to llama.cpp changes
  • fix: pad the context size to align with the implementation in llama.cpp
  • fix: fallback to other available binaries on load failure in internal llama getter without backend
  • fix: bugs

You may notice that the context size that gets created is sometimes slightly larger than what you specify (by up to 256 tokens). This happens because llama.cpp pads the context size up to a multiple of 256 for performance reasons.
You can check the actual context size using the .contextSize property on the created context or on any of its sequences, as shown in the sketch below.
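For example, here is a minimal sketch using the getLlama/loadModel/createContext API of node-llama-cpp (the model path and the specific padded value are illustrative assumptions):

```typescript
import {fileURLToPath} from "url";
import path from "path";
import {getLlama} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();
const model = await llama.loadModel({
    // hypothetical model path - replace with a real .gguf file
    modelPath: path.join(__dirname, "models", "model.gguf")
});

// request a context size that is not a multiple of 256
const context = await model.createContext({contextSize: 4000});

// llama.cpp may pad the size up to the next multiple of 256,
// so this can print 4096 instead of 4000
console.log("context size:", context.contextSize);

// sequences report the same (padded) context size
const sequence = context.getSequence();
console.log("sequence context size:", sequence.contextSize);
```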

Pull-Request Checklist

  • Code is up-to-date with the master branch
  • npm run format to apply eslint formatting
  • npm run test passes with this change
  • This pull request links relevant issues as Fixes #0000
  • There are new or updated unit tests validating the change
  • Documentation has been updated to reflect this change
  • The new commits and pull request title follow conventions explained in pull request guidelines (PRs that do not follow this convention will not be merged)

@giladgd giladgd requested a review from ido-pluto November 28, 2025 17:03
@giladgd giladgd self-assigned this Nov 28, 2025
