Conversation

@giladgd (Member) commented on Nov 28, 2025

Description of change

  • fix: adapt to llama.cpp changes
  • fix: pad the context size to align with the implementation in llama.cpp
  • fix: fallback to other available binaries on load failure in internal llama getter without backend
  • fix: bugs

You may notice that the context size that gets created is sometimes slightly larger than what you specify (by up to 256 tokens). This happens because llama.cpp pads the context size up to a multiple of 256 for performance reasons.
You can check the actual context size using the .contextSize property on the created context or on any of its sequences, as shown in the sketch below.
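For example, here is a minimal sketch using the getLlama/loadModel/createContext API of node-llama-cpp (the model path and the specific padded value are illustrative assumptions):

```typescript
import {fileURLToPath} from "url";
import path from "path";
import {getLlama} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();
const model = await llama.loadModel({
    // hypothetical model path - replace with a real .gguf file
    modelPath: path.join(__dirname, "models", "model.gguf")
});

// request a context size that is not a multiple of 256
const context = await model.createContext({contextSize: 4000});

// llama.cpp may pad the size up to the next multiple of 256,
// so this can print 4096 instead of 4000
console.log("context size:", context.contextSize);

// sequences report the same (padded) context size
const sequence = context.getSequence();
console.log("sequence context size:", sequence.contextSize);
```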

Pull-Request Checklist

  • Code is up-to-date with the master branch
  • npm run format to apply eslint formatting
  • npm run test passes with this change
  • This pull request links relevant issues as Fixes #0000
  • There are new or updated unit tests validating the change
  • Documentation has been updated to reflect this change
  • The new commits and pull request title follow conventions explained in pull request guidelines (PRs that do not follow this convention will not be merged)

@giladgd giladgd requested a review from ido-pluto November 28, 2025 17:03
@giladgd giladgd self-assigned this Nov 28, 2025
