|
11 | 11 | "1. OpenAI\n", |
12 | 12 | "2. HuggingFace\n", |
13 | 13 | "3. Vertex AI\n", |
| 14 | + "4. Cohere\n", |
14 | 15 | "\n", |
15 | 16 | "Before running this notebook, be sure to\n", |
16 | 17 | "1. Have installed ``redisvl`` and have that environment active for this notebook.\n", |
|
27 | 28 | }, |
28 | 29 | { |
29 | 30 | "cell_type": "code", |
30 | | - "execution_count": 1, |
| 31 | + "execution_count": 2, |
31 | 32 | "metadata": {}, |
32 | 33 | "outputs": [], |
33 | 34 | "source": [ |
|
298 | 299 | "test[:10]" |
299 | 300 | ] |
300 | 301 | }, |
| 302 | + { |
| 303 | + "cell_type": "markdown", |
| 304 | + "metadata": {}, |
| 305 | + "source": [ |
| 306 | + "### Cohere\n", |
| 307 | + "\n", |
| 308 | + "[Cohere](https://dashboard.cohere.ai/) lets you build language AI into your product. The `CohereTextVectorizer` makes it simple to use RedisVL with Cohere's embedding models. To use it, install the `cohere` package.\n", |
| 309 | + "\n", |
| 310 | + "```bash\n", |
| 311 | + "pip install cohere\n", |
| 312 | + "```" |
| 313 | + ] |
| 314 | + }, |
| 315 | + { |
| 316 | + "cell_type": "code", |
| 317 | + "execution_count": 2, |
| 318 | + "metadata": {}, |
| 319 | + "outputs": [], |
| 320 | + "source": [ |
| 321 | + "import getpass\n", |
| 322 | + "# set up the API key: read it from the environment, or prompt for it\n", |
| 323 | + "api_key = os.environ.get(\"COHERE_API_KEY\") or getpass.getpass(\"Enter your Cohere API key: \")" |
| 324 | + ] |
| 325 | + }, |
| 326 | + { |
| 327 | + "cell_type": "markdown", |
| 328 | + "metadata": {}, |
| 329 | + "source": [ |
| 330 | + "\n", |
| 331 | + "Pay attention to the `input_type` parameter of each `embed` call: when embedding\n", |
| 332 | + "queries, set `input_type='search_query'`; when embedding documents, set `input_type='search_document'`. See the\n", |
| 333 | + "[Cohere embed reference](https://docs.cohere.com/reference/embed) for more information." |
| 334 | + ] |
| 335 | + }, |
| 336 | + { |
| 337 | + "cell_type": "code", |
| 338 | + "execution_count": 3, |
| 339 | + "metadata": {}, |
| 340 | + "outputs": [ |
| 341 | + { |
| 342 | + "name": "stdout", |
| 343 | + "output_type": "stream", |
| 344 | + "text": [ |
| 345 | + "Vector dimensions: 1024\n", |
| 346 | + "[-0.010856628, -0.019683838, -0.0062179565, 0.003545761, -0.047943115, 0.0009365082, -0.005924225, 0.016174316, -0.03289795, 0.049194336]\n", |
| 347 | + "Vector dimensions: 1024\n", |
| 348 | + "[-0.010108948, -0.016693115, -0.0002310276, -0.022644043, -0.04147339, 0.0021324158, -0.033477783, -0.0005378723, -0.02619934, 0.058013916]\n" |
| 349 | + ] |
| 350 | + } |
| 351 | + ], |
| 352 | + "source": [ |
| 353 | + "from redisvl.vectorize.text import CohereTextVectorizer\n", |
| 354 | + "\n", |
| 355 | + "# create a vectorizer\n", |
| 356 | + "co = CohereTextVectorizer(\n", |
| 357 | + " model=\"embed-english-v3.0\",\n", |
| 358 | + " api_config={\"api_key\": api_key},\n", |
| 359 | + ")\n", |
| 360 | + "\n", |
| 361 | + "# embed a search query\n", |
| 362 | + "test = co.embed(\"This is a test sentence.\", input_type='search_query')\n", |
| 363 | + "print(\"Vector dimensions: \", len(test))\n", |
| 364 | + "print(test[:10])\n", |
| 365 | + "\n", |
| 366 | + "# embed a document\n", |
| 367 | + "test = co.embed(\"This is a test sentence.\", input_type='search_document')\n", |
| 368 | + "print(\"Vector dimensions: \", len(test))\n", |
| 369 | + "print(test[:10])" |
| 370 | + ] |
| 371 | + }, |
301 | 372 | { |
302 | 373 | "cell_type": "markdown", |
303 | 374 | "metadata": {}, |
|
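The new Cohere cells rely on the caller passing the right `input_type` on every `embed` call. A minimal sketch of one way to make that choice harder to get wrong is below. It is not part of this change: `embed_for`, `INPUT_TYPES`, and `FakeVectorizer` are hypothetical names introduced only for illustration, and the fake stands in for `CohereTextVectorizer` so the sketch runs without an API key. The only behaviour assumed of the real vectorizer is the `embed(text, input_type=...)` call shown in the notebook.

```python
from typing import List

# Hypothetical role names mapped to the two `input_type` values
# named in the notebook text.
INPUT_TYPES = {
    "query": "search_query",
    "document": "search_document",
}


def embed_for(vectorizer, text: str, role: str) -> List[float]:
    """Embed `text`, choosing `input_type` from a role name.

    `vectorizer` is any object with an `embed(text, input_type=...)`
    method, as used in the notebook's Cohere cell.
    """
    if role not in INPUT_TYPES:
        raise ValueError(
            f"unknown role {role!r}; expected one of {sorted(INPUT_TYPES)}"
        )
    return vectorizer.embed(text, input_type=INPUT_TYPES[role])


class FakeVectorizer:
    """Stand-in for a real vectorizer: records the `input_type` it was
    given and returns a dummy vector instead of calling any API."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def embed(self, text: str, input_type: str) -> List[float]:
        self.calls.append(input_type)
        return [0.0, 0.0, 0.0, 0.0]


fake = FakeVectorizer()
embed_for(fake, "This is a test sentence.", "query")
embed_for(fake, "This is a test sentence.", "document")
print(fake.calls)
```

With a real vectorizer, the same helper would be called as `embed_for(co, text, "query")`, where `co` is the `CohereTextVectorizer` instance created in the notebook.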