Fix/mkl build candle single batch #170

Open
donhardman wants to merge 4 commits into master from fix/mkl-build-candle-single-batch

@donhardman
Member

No description provided.

- Add bypass for batch-of-1 to skip padding and attention mask logic
- Use sequential tokenization for small batches to avoid rayon overhead
- Optimize BERT forward pass by removing intermediate batching wrappers
- Implement direct mean pooling for single sequences to reduce latency
- Upgrade rust-min-libc images to rust 1.95.0
- Add static symbol check for MKL on x86_64 builds
- Ensure build fails if MKL features are missing symbols
@github-actions

github-actions Bot commented May 13, 2026

Windows test results

  5 files    5 suites   20m 23s ⏱️
501 tests 482 ✅ 12 💤 7 ❌
509 runs  490 ✅ 12 💤 7 ❌

For more details on these failures, see this check.

Results for commit 2befc60.

♻️ This comment has been updated with latest results.

@github-actions

github-actions Bot commented May 13, 2026

Linux debug test results

  8 files    8 suites   14m 49s ⏱️
523 tests 511 ✅ 12 💤 0 ❌
537 runs  525 ✅ 12 💤 0 ❌

Results for commit 2befc60.

♻️ This comment has been updated with latest results.

@github-actions

github-actions Bot commented May 13, 2026

Linux release test results

  8 files    8 suites   7m 51s ⏱️
523 tests 511 ✅ 12 💤 0 ❌
537 runs  525 ✅ 12 💤 0 ❌

Results for commit 2befc60.

♻️ This comment has been updated with latest results.

- Update bitflags, cudarc, rand, rustls, tokio, tower-http, wasm-bindgen
- Remove unused iri-string and plain dependencies
- Set optimization level to z to minimize release binary size
- Update lockfile with latest dependency checksums and versions
- Build both MKL-optimized and baseline x86_64 binaries
- Add libiomp5.so packaging for Linux MKL variants
- Use separate target directories to prevent cache invalidation
- Update verification steps to check all generated variants
- Ensure baseline builds do not contain MKL symbols
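
The symbol checks mentioned in the commit notes above (MKL builds must contain MKL symbols, baseline builds must not) could be sketched along the following lines. This is a minimal illustration, not the script used in this PR: the function name `check_mkl_symbols`, the `mkl_` match pattern, and the library paths in the trailing comments are all assumptions. It reads a symbol listing from stdin, such as the output of `nm`, so the pass/fail logic does not depend on any particular binary.

```shell
#!/usr/bin/env bash
# Hypothetical helper: check whether a symbol listing (one symbol per line,
# e.g. the output of `nm`) contains MKL symbols.
# Usage: check_mkl_symbols <present|absent> < symbols.txt
# The pattern "mkl_" is an assumption about how MKL symbols are named.
check_mkl_symbols() {
    local expect="$1"
    local count
    # grep -c prints the number of matching lines. It exits non-zero when
    # there are no matches, so `|| true` keeps this safe under `set -e`.
    count=$(grep -ci 'mkl_' || true)

    case "$expect" in
        present)
            if [ "$count" -eq 0 ]; then
                echo "FAIL: expected MKL symbols, found none"
                return 1
            fi
            ;;
        absent)
            if [ "$count" -gt 0 ]; then
                echo "FAIL: expected no MKL symbols, found $count"
                return 1
            fi
            ;;
        *)
            echo "usage: check_mkl_symbols <present|absent>" >&2
            return 2
            ;;
    esac
    echo "OK: MKL symbols $expect ($count matching lines)"
}

# Example invocations (paths are placeholders, not taken from this PR):
#   nm target-mkl/release/libexample.so      | check_mkl_symbols present
#   nm target-baseline/release/libexample.so | check_mkl_symbols absent
```

Returning a non-zero status lets a CI step fail the build when the expectation does not hold, which matches the stated intent of the commits. A stripped binary may list no symbols at all, so a real check would need to run before stripping.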
@github-actions

clt

❌ CLT tests in test/clt-tests/mcl/
✅ OK: 23
❌ Failed: 1
⏳ Duration: 317s
👉 Check Action Results for commit 3076bce

Failed tests:

test/clt-tests/mcl/auto-embeddings-jina-remote.rec
––– input –––
rm -f /var/log/manticore/searchd.log; stdbuf -oL searchd --stopwait > /dev/null; stdbuf -oL searchd ${SEARCHD_ARGS:-} > /dev/null
––– output –––
OK
––– input –––
if timeout 10 grep -qm1 'accepting connections' <(tail -n 1000 -f /var/log/manticore/searchd.log); then echo 'Accepting connections!'; else echo 'Timeout or failed!'; fi
––– output –––
OK
––– input –––
cosine_similarity() {
    local file1="$1" file2="$2"

    awk '
    NR==FNR { a[NR]=$1; suma2+=$1*$1; next }
    {
        dot += a[FNR]*$1
        sumb2 += $1*$1
    }
    END {
        print dot / (sqrt(suma2) * sqrt(sumb2))
    }' "$file1" "$file2"
}
––– output –––
OK
––– input –––
export -f cosine_similarity
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test_invalid_model (title TEXT, embedding FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2' MODEL_NAME = 'jina/invalid-model-name-12345' FROM = 'title') " 2>&1
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test_valid_model_no_api_key (title TEXT, embedding FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2' MODEL_NAME = 'jina/jina-embeddings-v2-base-en' FROM = 'title') " 2>&1
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test_jina_remote (title TEXT, content TEXT, description TEXT, embedding FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2' MODEL_NAME = 'jina/jina-embeddings-v4' FROM = 'title, content' API_KEY='${JINA_API_KEY}') "; echo $?
––– output –––
OK
––– input –––
mysql -h0 -P9306 -E -e "SHOW CREATE TABLE test_jina_remote"
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "INSERT INTO test_jina_remote (id, title, content, description) VALUES(1, 'machine learning algorithms', 'deep neural networks and artificial intelligence', 'advanced AI research')"; echo $?
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "SELECT COUNT(*) as record_count FROM test_jina_remote WHERE id=1"
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "INSERT INTO test_jina_remote (id, title, content, description) VALUES(2, 'machine learning algorithms', 'deep neural networks and artificial intelligence', 'different description')"

mysql -h0 -P9306 -e "SELECT embedding FROM test_jina_remote WHERE id=1" | \
    grep -v embedding | \
    sed 's/[0-9]\+\(\.[0-9]\+\)\?/\n&\n/g' | \
    grep -E '^[0-9]+(\.[0-9]+)?$' | \
    awk '{printf "%.5f\n", $1}' > /tmp/vector1.txt

mysql -h0 -P9306 -e "SELECT embedding FROM test_jina_remote WHERE id=2" | \
    grep -v embedding | \
    sed 's/[0-9]\+\(\.[0-9]\+\)\?/\n&\n/g' | \
    grep -E '^[0-9]+(\.[0-9]+)?$' | \
    awk '{printf "%.5f\n", $1}' > /tmp/vector2.txt

SIMILARITY=$(cosine_similarity /tmp/vector1.txt /tmp/vector2.txt)

echo "Cosine similarity: $SIMILARITY"

RESULT=$(awk -v sim="$SIMILARITY" 'BEGIN {
    if (sim > 0.99)
        print "SUCCESS: Same FROM fields produce similar vectors (similarity: " sim ")"
    else
        print "FAIL: Different vectors (FROM does not include description field and should not change generated vector value) (similarity: " sim ")"
}')

echo "$RESULT"

rm -f /tmp/vector1.txt /tmp/vector2.txt
––– output –––
- Cosine similarity: #!/(1|0\.[0-9]+)/!#
+ ERROR 1064 (42000) at line 1: Failed to send request to remote model
- SUCCESS: Same FROM fields produce similar vectors (similarity: #!/(1|0\.[0-9]+)/!#)
+ Cosine similarity: -nan
+ FAIL: Different vectors (FROM does not include description field and should not change generated vector value) (similarity: -nan)
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test_jina_title_only (title TEXT, content TEXT, embedding FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2' MODEL_NAME = 'jina/jina-embeddings-v4' FROM = 'title' API_KEY='${JINA_API_KEY}') "; mysql -h0 -P9306 -e "INSERT INTO test_jina_title_only (id, title, content) VALUES(1, 'machine learning algorithms', 'completely different content here')"; MD5_MULTI=$(mysql -h0 -P9306 -e "SELECT embedding FROM test_jina_remote WHERE id=1" | grep -v embedding | md5sum | awk '{print $1}'); MD5_SINGLE=$(mysql -h0 -P9306 -e "SELECT embedding FROM test_jina_title_only WHERE id=1" | grep -v embedding | md5sum | awk '{print $1}'); echo "multi_field_md5: $MD5_MULTI"; echo "single_field_md5: $MD5_SINGLE"; if [ "$MD5_MULTI" != "$MD5_SINGLE" ]; then echo "SUCCESS: Different FROM specifications produce different vectors"; else echo "INFO: FROM field comparison result"; fi
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test_jina_invalid_field (title TEXT, embedding FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2' MODEL_NAME = 'jina/text-embedding-ada-002' FROM = 'nonexistent_field') " 2>&1
––– output –––
OK
––– input –––
if mysql -h0 -P9306 -e "SHOW TABLES LIKE 'test_jina_no_from'" | grep -q test_jina_no_from; then mysql -h0 -P9306 -e "INSERT INTO test_jina_no_from (id, title, embedding) VALUES(1, 'test title', '(0.1, 0.2, 0.3, 0.4, 0.5)')"; echo "insert_result: $?"; else echo "insert_result: skipped (table not created)"; fi
––– output –––
OK
––– input –––
if mysql -h0 -P9306 -e "SHOW TABLES LIKE 'test_jina_no_from'" | grep -q test_jina_no_from; then mysql -h0 -P9306 -e "SHOW CREATE TABLE test_jina_no_from"; else echo "table_structure: skipped (table not created)"; fi
––– output –––
OK
––– input –––
if [ -n "$JINA_API_KEY" ] && [ "$JINA_API_KEY" != "dummy_key_for_testing" ]; then echo "API key is available for testing"; else echo "API key not available - using dummy for error testing"; fi
––– output –––
OK
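
The `-nan` similarity in the failing step above is consistent with one of the vector files being empty after the failed INSERT, which leaves the awk helper dividing zero by zero. The following is a minimal sketch of a guarded variant, assuming the same one-component-per-line file format as the test. The name `cosine_similarity_checked` and the error messages are hypothetical and not part of the test suite.

```shell
#!/usr/bin/env bash
# Hypothetical variant of the test's cosine_similarity helper that refuses to
# divide by zero. Each input file holds one vector component per line.
cosine_similarity_checked() {
    local file1="$1" file2="$2"

    # -s is true only for files that exist and are non-empty.
    if [ ! -s "$file1" ] || [ ! -s "$file2" ]; then
        echo "ERROR: empty or missing vector file" >&2
        return 1
    fi

    awk '
    # First file: remember each component and accumulate its squared norm.
    NR==FNR { a[NR]=$1; suma2+=$1*$1; next }
    # Second file: accumulate the dot product and the second squared norm.
    {
        dot += a[FNR]*$1
        sumb2 += $1*$1
    }
    END {
        if (suma2 == 0 || sumb2 == 0) {
            print "ERROR: zero-norm vector" > "/dev/stderr"
            exit 1
        }
        print dot / (sqrt(suma2) * sqrt(sumb2))
    }' "$file1" "$file2"
}
```

With a guard like this, a missing embedding would surface as an explicit error and a non-zero exit status rather than as a `-nan` value that then falls through to the "Different vectors" branch.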
