diff --git a/.github/ignore-notebooks.txt b/.github/ignore-notebooks.txt index 61ba17de..af3d0c71 100644 --- a/.github/ignore-notebooks.txt +++ b/.github/ignore-notebooks.txt @@ -9,4 +9,9 @@ spring_ai_redis_rag.ipynb 00_litellm_proxy_redis.ipynb 04_redisvl_benchmarking_basics.ipynb 06_hnsw_to_svs_vamana_migration.ipynb -07_flat_to_svs_vamana_migration.ipynb \ No newline at end of file +07_flat_to_svs_vamana_migration.ipynb +1_pre_generated_semantic_caching.ipynb +2_semantic_caching_with_langcache.ipynb +1_semantic_classification.ipynb +2_semantic_tool_calling.ipynb +3_semantic_guardrails.ipynb \ No newline at end of file diff --git a/README.md b/README.md index 16887e85..8032497d 100644 --- a/README.md +++ b/README.md @@ -162,9 +162,16 @@ A set of Java recipes can be found under [/java-recipes](/java-recipes/README.md #### Notebooks -| Notebook | Description | -|---------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------| -| [notebooks/RAG/spring_ai_redis_rag.ipynb](./java-recipes/notebooks/RAG/spring_ai_redis_rag.ipynb) | Demonstrates building a RAG-ba sed beer recommendation chatbot using Spring AI and Redis as the vector store | +| Notebook | Description | +|------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------| +| [notebooks/RAG/spring_ai_redis_rag.ipynb](./java-recipes/notebooks/RAG/spring_ai_redis_rag.ipynb) | Demonstrates building a RAG-ba sed beer recommendation chatbot using Spring AI and Redis as the vector store | +| [RAG/spring_ai_redis_rag.ipynb](./RAG/spring_ai_redis_rag.ipynb) | Demonstrates building a RAG-based beer recommendation chatbot using Spring AI and Redis as the vector store | +| 
[semantic-routing/1_semantic_classification.ipynb](./semantic-routing/1_semantic_classification.ipynb) | Demonstrates how to perform text classification with vector search (RedisVL) instead of LLMs | +| [semantic-routing/2_semantic_tool_calling.ipynb](./semantic-routing/2_semantic_tool_calling.ipynb) | Demonstrates how to perform tool selection with vector search (RedisVL) instead of LLMs | +| [semantic-routing/3_semantic_guardrails.ipynb](./semantic-routing/3_semantic_guardrails.ipynb) | Demonstrates how to implement guardrails with vector search (RedisVL) | +| [semantic-caching/1_pre_generated_semantic_caching.ipynb](./semantic-caching/1_pre_generated_semantic_caching.ipynb) | Demonstrates how to perform pre-generated semantic caching with RedisVL | +| [semantic-caching/2_semantic_caching_with_langcache.ipynb](./semantic-caching/2_semantic_caching_with_langcache.ipynb) | Demonstrates how to perform semantic caching with LangCache | + #### Applications diff --git a/java-recipes/README.md b/java-recipes/README.md index a7e4e8b3..2b612d9c 100644 --- a/java-recipes/README.md +++ b/java-recipes/README.md @@ -26,9 +26,14 @@ There are two types of Java Recipes: Notebooks and Applications. Notebooks are i Notebooks require a Jupyter Notebook environment to run. Check out the [Setup Instructions & Implementation Details](./notebooks/README.md) for more details on how to set up your environment. 
-| Notebook | Description | -|--------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------| -| [notebooks/RAG/spring_ai_redis_rag.ipynb](./notebooks/RAG/spring_ai_redis_rag.ipynb) | Demonstrates building a RAG-ba sed beer recommendation chatbot using Spring AI and Redis as the vector store | +| Notebook | Description | +|------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------| +| [RAG/spring_ai_redis_rag.ipynb](./RAG/spring_ai_redis_rag.ipynb) | Demonstrates building a RAG-based beer recommendation chatbot using Spring AI and Redis as the vector store | +| [semantic-routing/1_semantic_classification.ipynb](./semantic-routing/1_semantic_classification.ipynb) | Demonstrates how to perform text classification with vector search (RedisVL) instead of LLMs | +| [semantic-routing/2_semantic_tool_calling.ipynb](./semantic-routing/2_semantic_tool_calling.ipynb) | Demonstrates how to perform tool selection with vector search (RedisVL) instead of LLMs | +| [semantic-routing/3_semantic_guardrails.ipynb](./semantic-routing/3_semantic_guardrails.ipynb) | Demonstrates how to implement guardrails with vector search (RedisVL) | +| [semantic-caching/1_pre_generated_semantic_caching.ipynb](./semantic-caching/1_pre_generated_semantic_caching.ipynb) | Demonstrates how to perform pre-generated semantic caching with RedisVL | +| [semantic-caching/2_semantic_caching_with_langcache.ipynb](./semantic-caching/2_semantic_caching_with_langcache.ipynb) | Demonstrates how to perform semantic caching with LangCache | ## Applications diff --git a/java-recipes/notebooks/RAG/resources/beers.json.gz b/java-recipes/notebooks/RAG/resources/beers.json.gz new file mode 100644 index 
00000000..e69de29b diff --git a/java-recipes/notebooks/RAG/spring_ai_redis_rag.ipynb b/java-recipes/notebooks/RAG/spring_ai_redis_rag.ipynb index f09e718e..c9fa6c4a 100644 --- a/java-recipes/notebooks/RAG/spring_ai_redis_rag.ipynb +++ b/java-recipes/notebooks/RAG/spring_ai_redis_rag.ipynb @@ -30,10 +30,12 @@ }, { "cell_type": "code", - "execution_count": 1, "id": "f0483426-9a2a-4fc1-a184-9ba3343d2bf9", - "metadata": {}, - "outputs": [], + "metadata": { + "jupyter": { + "is_executing": true + } + }, "source": [ "%mavenRepo spring_milestones https://repo.spring.io/milestone/ \n", "%maven \"org.springframework.ai:spring-ai-openai:1.0.0-M6\"\n", @@ -41,7 +43,9 @@ "%maven \"org.springframework.ai:spring-ai-redis-store:1.0.0-M6\"\n", "%maven \"org.slf4j:slf4j-simple:2.0.17\" \n", "%maven \"redis.clients:jedis:5.2.0\"" - ] + ], + "outputs": [], + "execution_count": null }, { "cell_type": "markdown", @@ -268,7 +272,7 @@ "String[] KEYS = { \"name\", \"abv\", \"ibu\", \"description\" };\n", "\n", "// Data path\n", - "String filePath = \"../resources/beers.json.gz\";\n", + "String filePath = \"./resources/beers.json.gz\";\n", "\n", "// Check if embeddings are already loaded\n", "Map indexInfo = vectorStore.getJedis().ftInfo(\"beers\");\n", diff --git a/java-recipes/notebooks/README.md b/java-recipes/notebooks/README.md index a5a240e8..7a752e20 100644 --- a/java-recipes/notebooks/README.md +++ b/java-recipes/notebooks/README.md @@ -24,7 +24,7 @@ This project uses Docker Compose to set up a complete environment for running Java-based AI applications with Redis. 
The environment includes: -- A Jupyter Notebook server with Java kernel support +- A Jupyter Notebook server with Java kernel support (for notebooks that are based on Java) - Redis Stack (includes Redis and RedisInsight) - Pre-installed dependencies for AI/ML workloads @@ -79,9 +79,15 @@ OPENAI_API_KEY=your_openai_api_key_here ## Notebooks -| Notebook | Description | -| --- | --- | -| [RAG/spring_ai_redis_rag.ipynb](./RAG/spring_ai_redis_rag.ipynb) | Demonstrates building a RAG-based beer recommendation chatbot using Spring AI and Redis as the vector store | +| Notebook | Description | +|------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------| +| [RAG/spring_ai_redis_rag.ipynb](./RAG/spring_ai_redis_rag.ipynb) | Demonstrates building a RAG-based beer recommendation chatbot using Spring AI and Redis as the vector store | +| [semantic-routing/1_semantic_classification.ipynb](./semantic-routing/1_semantic_classification.ipynb) | Demonstrates how to perform text classification with vector search (RedisVL) instead of LLMs | +| [semantic-routing/2_semantic_tool_calling.ipynb](./semantic-routing/2_semantic_tool_calling.ipynb) | Demonstrates how to perform tool selection with vector search (RedisVL) instead of LLMs | +| [semantic-routing/3_semantic_guardrails.ipynb](./semantic-routing/3_semantic_guardrails.ipynb) | Demonstrates how to implement guardrails with vector search (RedisVL) | +| [semantic-caching/1_pre_generated_semantic_caching.ipynb](./semantic-caching/1_pre_generated_semantic_caching.ipynb) | Demonstrates how to perform pre-generated semantic caching with RedisVL | +| [semantic-caching/2_semantic_caching_with_langcache.ipynb](./semantic-caching/2_semantic_caching_with_langcache.ipynb) | Demonstrates how to perform semantic caching with LangCache | + ## Project Structure diff 
--git a/java-recipes/notebooks/resources/beers.json.gz b/java-recipes/notebooks/resources/beers.json.gz deleted file mode 100644 index e32d6b02..00000000 Binary files a/java-recipes/notebooks/resources/beers.json.gz and /dev/null differ diff --git a/java-recipes/notebooks/semantic-caching/1_pre_generated_semantic_caching.ipynb b/java-recipes/notebooks/semantic-caching/1_pre_generated_semantic_caching.ipynb new file mode 100644 index 00000000..50e457e7 --- /dev/null +++ b/java-recipes/notebooks/semantic-caching/1_pre_generated_semantic_caching.ipynb @@ -0,0 +1,520 @@ +{ + "cells": [ + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "# Semantic Caching\n", + "\n", + "Semantic caching is an intelligent caching strategy that stores and retrieves responses based on the meaning of queries rather than exact text matches. Unlike traditional caching that requires identical strings, semantic caching can return cached responses for questions that are semantically similar, even when phrased differently.\n", + "\n", + "## Semantic Caching vs. Traditional Caching vs. 
LLM Re-generation\n", + "\n", + "**Traditional caching** stores responses using exact query strings as keys:\n", + "- **Fast retrieval** for identical queries\n", + "- **Cache misses** for any variation in phrasing, even minor differences\n", + "- **Low cache hit rates** in conversational applications where users rarely phrase questions identically\n", + "\n", + "**LLM re-generation** involves calling the language model for every query:\n", + "- **Flexible** handling of any question variation\n", + "- **High API costs** and latency for repeated similar questions\n", + "\n", + "**Semantic caching** uses vector similarity to match queries with cached responses:\n", + "- **High cache hit rates** by matching semantically similar questions\n", + "- **Cost reduction** by avoiding redundant LLM calls for similar queries\n", + "- **Fast retrieval** through vector similarity search\n", + "\n", + "In this notebook, we'll implement semantic caching using RedisVL with pre-generated FAQs about a Chevrolet Colorado vehicle brochure, demonstrating how semantic similarity can dramatically improve cache hit rates compared to exact string matching." + ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "## Running Redis\n", + "\n", + "There are several ways to get a running instance of Redis. For the sake of simplicity, in this notebook we will run it in a Docker container.\n", + "\n", + "For production, where high availability and reliability are concerns, we recommend using [Redis Cloud](https://cloud.redis.io/).\n", + "\n", + "A free database can be spun up in Redis Cloud." + ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "### Running Redis in a Docker Container using Testcontainers\n", + "\n", + "**Docker containers** are lightweight, portable environments that package an application and all its dependencies so it runs consistently across different systems. 
**Testcontainers** is a library that lets you run lightweight, disposable Docker containers for integration testing, so you can test against real services like databases or message queues without complex setup.\n", + "\n", + "Make sure you have Docker installed: [install Docker](https://www.docker.com/get-started/)." + ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "#### Installing dependencies" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:25:22.540393Z", + "start_time": "2025-11-24T10:25:22.227016Z" + } + }, + "cell_type": "code", + "source": "@file:DependsOn(\"org.testcontainers:testcontainers:2.0.2\")", + "outputs": [], + "execution_count": 1 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "#### Configuring a generic Redis Container" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:25:24.296117Z", + "start_time": "2025-11-24T10:25:23.981290Z" + } + }, + "cell_type": "code", + "source": [ + "import org.testcontainers.containers.GenericContainer\n", + "import org.testcontainers.utility.DockerImageName\n", + "\n", + "class RedisContainer : GenericContainer<RedisContainer>(DockerImageName.parse(\"redis:latest\")) {\n", + " init {\n", + " withExposedPorts(6379)\n", + " }\n", + "}" + ], + "outputs": [], + "execution_count": 2 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "#### Creating a Docker network\n", + "\n", + "This is necessary because later in this notebook we will spin up a Redis Insight container that needs to be on the same network."
+ ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:25:26.133813Z", + "start_time": "2025-11-24T10:25:26.081066Z" + } + }, + "cell_type": "code", + "source": [ + "import org.testcontainers.containers.Network\n", + "\n", + "val network = Network.newNetwork()\n", + "val networkAlias = \"redis-network\"" + ], + "outputs": [], + "execution_count": 3 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "#### Start a Redis Container" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:25:29.231779Z", + "start_time": "2025-11-24T10:25:27.519794Z" + } + }, + "cell_type": "code", + "source": [ + "val networkAlias = \"redis\"\n", + "val redis = RedisContainer().withNetwork(network).withNetworkAliases(networkAlias)\n", + "redis.start()\n", + "\n", + "val host = redis.host\n", + "val port = redis.getMappedPort(6379)\n", + "println(\"Redis 8 started at $host:$port\")" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Redis 8 started at localhost:54215\n" + ] + } + ], + "execution_count": 4 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "## Implementing our Semantic Cache" + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "### Installing dependencies\n", + "\n", + "As mentioned at the beginning, we will use RedisVL's semantic cache abstraction to implement our semantic cache. Therefore, we will need to add RedisVL as a dependency."
+ ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:25:32.571699Z", + "start_time": "2025-11-24T10:25:31.160492Z" + } + }, + "cell_type": "code", + "source": [ + "@file:DependsOn(\"com.redis:redisvl:0.0.1\")\n", + "%use serialization" + ], + "outputs": [], + "execution_count": 5 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "### Loading Pre-Generated FAQs\n", + "\n", + "For this semantic caching demonstration, we'll use pre-generated frequently asked questions (FAQs) about a Chevrolet Colorado vehicle brochure. These FAQs were created by processing the vehicle documentation and extracting question-answer pairs using an LLM.\n" + ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:25:33.877137Z", + "start_time": "2025-11-24T10:25:33.805708Z" + } + }, + "cell_type": "code", + "source": [ + "import java.io.File\n", + "\n", + "val jsonText = File(\"./resources/3_colorado_faqs.json\").readText(Charsets.UTF_8)\n", + "val jsonArray = Json.parseToJsonElement(jsonText).jsonArray\n", + "\n", + "println(\"Loaded ${jsonArray.size} FAQs from file\")" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Loaded 346 FAQs from file\n" + ] + } + ], + "execution_count": 6 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "### Setting up the Text Vectorizer\n", + "\n", + "The vectorizer is responsible for converting text into numerical vector representations that capture semantic meaning. RedisVL provides several vectorizer options such as OpenAI and VertexAI. We're using the HuggingFace Text Vectorizer for this example." 
+ ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:25:36.682315Z", + "start_time": "2025-11-24T10:25:35.972546Z" + } + }, + "cell_type": "code", + "source": [ + "import com.redis.vl.utils.vectorize.SentenceTransformersVectorizer\n", + "\n", + "val vectorizer = SentenceTransformersVectorizer(\"Xenova/all-MiniLM-L6-v2\")\n", + "\n", + "val embedding = vectorizer.embed(\"What is the capital city of Italy?\")\n", + "\n", + "println(embedding.joinToString())" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "-0.009056281, 0.09096523, -0.051762886, 0.08848378, -0.12719342, -0.0703391, 0.029510844, 0.013291523, -0.057980966, -0.014017097, 0.03739981, -0.13108169, 0.0018671635, 0.03550265, -0.055068597, -0.04273072, 0.0480743, 0.035149302, 0.051385034, 0.008154835, 0.02939507, -0.02790439, 0.04798433, 0.012633902, 0.050369605, 0.03730664, -0.016114296, 0.016826835, -0.05483934, -0.04307148, -0.014681098, 0.0032649112, 0.10389013, -0.085853584, 0.016533818, 0.017277544, -0.012875621, -0.008417194, 0.106101766, -3.3647308E-4, 0.03838455, -0.007070606, 0.064803414, 0.04349774, 0.027908528, -0.004982669, 0.05417708, 0.08491659, 0.01753072, -0.04387867, -0.0089426385, -0.029429087, -0.04308129, -0.0137046715, -0.049384452, 0.079110876, 0.0159977, -0.023842642, 0.010396142, -0.017871607, -0.02013254, -0.029775942, -0.057334274, 0.079562895, 0.017678022, 0.046195857, -0.025770709, -0.052720636, -0.07104178, -0.016904766, 0.005821192, -0.04959368, 0.012194841, -0.06851538, 0.024740597, -0.06589627, -0.004739907, 0.019979797, 0.012046297, 0.025945937, 0.04738658, -0.026207888, -0.024006084, 0.06688492, 0.027395409, 0.0511346, 0.033016067, 0.012410808, -7.134781E-4, -0.0053796326, 0.024619037, 0.10826674, -0.029768111, -0.008205869, -0.034382585, 0.023230458, -0.023033625, 0.06288932, -0.05734033, 0.07995538, 0.020887373, -0.048439715, 0.06801826, 0.03367503, -0.03210259, 0.030324435, 0.07029199, -0.035357255, -0.0437311, 
-0.026227256, -0.117252186, -0.0637143, -0.025789985, -0.076391436, 0.021659859, 0.07295666, 0.079506256, 0.030736178, -0.018580705, 0.032889977, -0.027822956, -0.056768946, -0.017266978, -0.0187513, -0.052855123, 0.0026798288, -0.09833044, -4.265337E-33, -0.04522934, -0.038603127, 0.006782485, 0.056442842, -0.047985323, 0.056943357, 0.005727432, -0.07347524, -0.06966861, -0.06284734, 6.824111E-4, -0.13293485, -0.0028802142, -0.011797879, 0.091532454, 0.039949197, 0.044991918, 0.047134932, -0.038595345, -0.050638054, 0.013533677, 0.0056156493, -0.00128937, -0.10107974, -0.0062624826, -0.006572015, -0.01721928, 0.012505045, 0.016358247, 0.027294295, 0.014827963, 0.11129013, -0.012927993, -0.026968583, 0.011228542, 0.099366866, -0.025637725, -0.017972834, -0.006258892, 0.06267472, -0.046503223, 0.023930388, -0.04911905, 0.018191079, 0.056508224, -0.034028705, -0.058674965, -0.026060073, 0.10192595, -0.07542723, 0.029558491, -0.0063548926, -0.043660775, 0.013796782, 0.05591109, 0.13179655, 0.0014945822, 0.073946476, -0.00362497, 0.09263236, -0.008254736, 0.058894735, 0.028806994, 0.041127626, 0.050720066, 0.106417455, 0.04963482, 0.07839, 0.062062256, 0.041708015, 9.6040964E-4, -0.05741613, 0.029530406, 0.10063947, -0.039977487, 0.045762528, -0.01707651, -0.048533317, -0.026703384, 0.0071982993, -0.041772787, -0.052366402, -0.036398213, 0.04329747, 0.047366947, -0.0023107647, 0.001604334, -0.02657378, -0.0017699071, -0.033968117, -0.059102736, -0.057025284, -0.058409713, 0.009664118, -0.008185311, 2.1191983E-33, 0.02503501, -0.05893375, -0.054135744, -0.013791176, -0.16847505, -0.056284852, -0.013744822, -0.0122961225, 0.0031321992, 0.053306993, -0.072457984, -0.0476266, 0.010839893, -0.03926069, 0.027747422, 0.10246861, 0.091257475, -0.015415793, -0.068407044, -0.10341207, -0.08215669, -0.03483035, -0.102931306, -0.07483412, -0.056837633, -0.027723843, -0.07740979, -0.014597713, 0.0067502027, -0.039716292, -0.04219018, -0.036527764, -0.058196455, -0.019601742, 
-0.032604992, 0.10461292, 0.0416616, -0.0481787, 0.091960885, 0.049276114, -0.0652168, -0.020950038, 0.07045577, 0.12440737, 0.07806728, 0.0073797563, 0.039370913, 0.002310497, -0.045826662, -0.006844643, 0.032626607, -0.039485242, -0.038203005, 0.010807971, 0.024200222, 0.017911695, -0.009244083, 0.034157734, -0.06378024, -0.09688441, 0.0928941, 0.020900233, -0.041648738, 0.062452674, 0.010918287, 0.044152, -0.08620699, 0.019485684, 0.013766632, -0.011806223, 0.059487995, 8.969072E-4, -0.08984199, 0.045840707, -0.08102404, 0.09381022, 0.049054053, 0.07864653, 0.055541668, -0.059208665, -0.0028986053, -0.0077899722, -0.04229291, 0.014191566, -0.052229255, -0.013464374, 0.08817786, -0.026562197, 0.06472866, 0.0073218145, -0.031644728, 0.065533295, 0.039485324, -0.0784605, 0.0044285893, -1.8748269E-8, 0.012146794, 0.020252327, -0.04066794, 0.0826934, 0.013210786, 0.011821769, 0.04016033, -0.027048599, 0.015920155, 0.01708129, -0.081510864, -0.0075302, -0.041846845, -0.061824482, -0.052197114, 0.04438127, 0.036889043, 0.029691536, 0.022707561, -0.007839727, 0.023343714, 0.0074147084, -0.081928894, -0.00935852, -0.024256242, -0.08159295, 0.0391832, 0.07591707, -0.04517342, 0.0011093953, 0.05846702, -0.07561812, -0.012565128, -0.08454925, -0.043759752, 0.07977247, -0.0046581454, -0.08857544, -0.012337672, -0.033817444, 0.022975812, -0.013439647, 0.05287376, 8.4293383E-4, 0.046390817, 0.015289715, 0.0053871227, -0.008255393, 0.0038129103, -0.078331016, -0.0831567, 0.020810202, 0.061217535, 0.033761967, 0.06302245, 0.0040639304, 0.062217683, 0.058209386, 0.046823863, 0.063531056, 0.04160275, -0.021602299, 0.029407658, -0.011543353\n" + ] + } + ], + "execution_count": 7 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "### Creating the SemanticCache\n" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:25:39.497831Z", + "start_time": "2025-11-24T10:25:39.175066Z" + } + }, + "cell_type": "code", + "source": [ + "import 
com.redis.vl.extensions.cache.SemanticCache\n", + "import redis.clients.jedis.HostAndPort\n", + "import redis.clients.jedis.UnifiedJedis\n", + "\n", + "val jedis = UnifiedJedis(HostAndPort(host, port))\n", + "\n", + "// Initialize the semantic cache with Redis connection\n", + "val cache = SemanticCache.Builder()\n", + " .name(\"llmcache\")\n", + " .distanceThreshold(0.2F)\n", + " .ttl(360)\n", + " .redisClient(jedis)\n", + " .vectorizer(vectorizer)\n", + " .build()" + ], + "outputs": [], + "execution_count": 8 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "### Storing FAQs in the Semantic Cache" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:25:44.727233Z", + "start_time": "2025-11-24T10:25:41.189757Z" + } + }, + "cell_type": "code", + "source": [ + "jsonArray.forEachIndexed { i, el ->\n", + " val obj = el.jsonObject\n", + " val prompt = obj[\"prompt\"]?.jsonPrimitive?.content.orEmpty()\n", + " val response = obj[\"response\"]?.jsonPrimitive?.content.orEmpty()\n", + " cache.store(prompt, response)\n", + "}" + ], + "outputs": [], + "execution_count": 9 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "### Testing the Semantic Cache" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:25:45.878043Z", + "start_time": "2025-11-24T10:25:45.807205Z" + } + }, + "cell_type": "code", + "source": [ + "val cacheHit = cache.check(\"What models of chevy colorado are available?\").get()\n", + "println(\"Prompt: ${cacheHit.prompt}\")\n", + "println(\"Response: ${cacheHit.response}\")\n", + "println(\"Distance: ${cacheHit.distance}\")" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Prompt: What are the available models of the Colorado?\n", + "Response: The available models of the Colorado are WT, LT, Z71, and ZR2.\n", + "Distance: 0.18383932\n" + ] + } + ], + "execution_count": 10 + }, + { + "metadata": { + "ExecuteTime": { + "end_time": 
"2025-11-24T10:25:47.569521Z", + "start_time": "2025-11-24T10:25:47.508860Z" + } + }, + "cell_type": "code", + "source": [ + "val cacheHit = cache.check(\"What entertainment system comes with the car?\").get()\n", + "println(\"Prompt: ${cacheHit.prompt}\")\n", + "println(\"Response: ${cacheHit.response}\")\n", + "println(\"Distance: ${cacheHit.distance}\")" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Prompt: What entertainment system is included in the vehicle?\n", + "Response: The vehicle includes the Chevrolet Infotainment 3 system with an 8-inch diagonal color touch-screen.\n", + "Distance: 0.09986466\n" + ] + } + ], + "execution_count": 11 + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:25:49.664565Z", + "start_time": "2025-11-24T10:25:49.616993Z" + } + }, + "cell_type": "code", + "source": "cache.check(\"Does the car drive on the water?\")", + "outputs": [ + { + "data": { + "text/plain": [ + "Optional.empty" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "execution_count": 12 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "## Redis Insight\n", + "\n", + "Redis Insight is a visual tool that helps you explore, monitor, and optimize your Redis data and performance through an easy-to-use interface.\n", + "\n", + "It can be downloaded and run locally in your machine or be run in a Docker container. To make this recipe self-contained and straightforward, we're going to run it in a Docker container using Test Containers." 
+ ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "### Configuring a generic Redis Insight Container" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:25:53.184035Z", + "start_time": "2025-11-24T10:25:53.108180Z" + } + }, + "cell_type": "code", + "source": [ + "import org.testcontainers.containers.GenericContainer\n", + "import org.testcontainers.containers.wait.strategy.Wait\n", + "import org.testcontainers.utility.DockerImageName\n", + "\n", + "class RedisInsightContainer : GenericContainer<RedisInsightContainer>(\n", + " DockerImageName.parse(\"redis/redisinsight:latest\") // or pin a specific stable version\n", + ") {\n", + " init {\n", + " withExposedPorts(5540)\n", + " withEnv(\"RI_REDIS_HOST\", \"redis\")\n", + " withEnv(\"RI_REDIS_PORT\", \"6379\") // Since this runs on the same Docker network, we use the container port rather than the mapped port of the Redis server\n", + " withEnv(\"RI_REDIS_ALIAS\", \"Local Redis\")\n", + " withEnv(\"RI_REDIS_USERNAME\", \"default\")\n", + " withEnv(\"RI_REDIS_PASSWORD\", \"\")\n", + " withEnv(\"RI_REDIS_TLS\", \"FALSE\")\n", + "\n", + " waitingFor(Wait.forHttp(\"/\").forPort(5540))\n", + " }\n", + "\n", + " fun getUiUrl(): String = \"http://${host}:${getMappedPort(5540)}\"\n", + "}" + ], + "outputs": [], + "execution_count": 13 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "### Starting the Redis Insight container" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:25:58.062664Z", + "start_time": "2025-11-24T10:25:55.809086Z" + } + }, + "cell_type": "code", + "source": [ + "val redisInsight = RedisInsightContainer().withNetwork(network)\n", + "redisInsight.start()\n", + "\n", + "println(\"RedisInsight UI: ${redisInsight.getUiUrl()}\")" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "RedisInsight UI: http://localhost:54223\n" + ] + } + ], + "execution_count": 14 + }, + { + "metadata": {}, + "cell_type": 
"markdown", + "source": [ + "## 
Spinning down Docker containers\n", + "\n", + "Finally, once we're done, let's clean up all the resources we created for our recipe:" + ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:26:00.689481Z", + "start_time": "2025-11-24T10:26:00.129766Z" + } + }, + "cell_type": "code", + "source": [ + "redis.stop()\n", + "redisInsight.stop()\n", + "network.close()" + ], + "outputs": [], + "execution_count": 15 + } + ], + "metadata": { + "kernelspec": { + "display_name": "Kotlin", + "language": "kotlin", + "name": "kotlin" + }, + "language_info": { + "name": "kotlin", + "version": "2.2.20-dev-4982", + "mimetype": "text/x-kotlin", + "file_extension": ".kt", + "pygments_lexer": "kotlin", + "codemirror_mode": "text/x-kotlin", + "nbconvert_exporter": "" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git a/java-recipes/notebooks/semantic-caching/2_semantic_caching_with_langcache.ipynb b/java-recipes/notebooks/semantic-caching/2_semantic_caching_with_langcache.ipynb new file mode 100644 index 00000000..e87682a4 --- /dev/null +++ b/java-recipes/notebooks/semantic-caching/2_semantic_caching_with_langcache.ipynb @@ -0,0 +1,272 @@ +{ + "cells": [ + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "# Semantic Caching with LangCache\n", + "\n", + "LangCache is Redis' new managed service for semantic caching. Semantic caching is an intelligent caching strategy that stores and retrieves responses based on the meaning of queries rather than exact text matches. Unlike traditional caching that requires identical strings, semantic caching can return cached responses for questions that are semantically similar, even when phrased differently.\n", + "\n", + "## Semantic Caching vs. Traditional Caching vs. 
LLM Re-generation\n", + "\n", + "**Traditional caching** stores responses using exact query strings as keys:\n", + "- **Fast retrieval** for identical queries\n", + "- **Cache misses** for any variation in phrasing, even minor differences\n", + "- **Low cache hit rates** in conversational applications where users rarely phrase questions identically\n", + "\n", + "**LLM re-generation** involves calling the language model for every query:\n", + "- **Flexible** handling of any question variation\n", + "- **High API costs** and latency for repeated similar questions\n", + "\n", + "**Semantic caching** uses vector similarity to match queries with cached responses:\n", + "- **High cache hit rates** by matching semantically similar questions\n", + "- **Cost reduction** by avoiding redundant LLM calls for similar queries\n", + "- **Fast retrieval** through vector similarity search\n", + "\n", + "In this notebook, we'll implement semantic caching using LangCache to demonstrate how semantic similarity can improve cache hit rates compared to exact string matching." + ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "## Getting Started with LangCache" + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "### Creating a Redis Database on Redis Cloud\n", + "\n", + "- Go to https://cloud.redis.io and create a new account if you don't have one yet.\n", + "- Once logged in, click on the plus sign next to \"Databases\" in the sidebar:\n", + "\n", + "\"\"\n", + "\n", + "- Select the free option:\n", + "\n", + "\"\"\n", + "\n", + "- Select `AWS` as the vendor and `us-east-1` as the region:\n", + "\n", + "\"\"\n", + "\n", + "- Create the database and wait for it to be available. 
It should take less than a minute:\n", + "\n", + "\"\"\n", + "\n", + "- Once the database is available, 
click on LangCache in the sidebar:\n", + "\n", + "\"\"\n", + "\n", + "- Then, click on Quick Create to easily get started:\n", + "\n", + "\"\"\n", + "\n", + "- Finally, copy the credential variables to use them in the next steps of this recipe:\n", + "\n", + "\"\"" + ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "## Installing Dependencies" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:58:14.966956Z", + "start_time": "2025-11-24T10:58:14.709686Z" + } + }, + "cell_type": "code", + "source": [ + "%use ktor-client\n", + "%use serialization\n", + "%use coroutines\n", + "\n", + "import io.ktor.client.HttpClient\n", + "import io.ktor.client.engine.cio.CIO\n", + "import io.ktor.client.plugins.contentnegotiation.ContentNegotiation\n", + "import io.ktor.serialization.kotlinx.json.json\n", + "\n", + "val client = HttpClient(CIO) {\n", + " install(ContentNegotiation) {\n", + " json(Json { ignoreUnknownKeys = true })\n", + " }\n", + "}" + ], + "outputs": [], + "execution_count": 6 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "## Configuring LangCache\n", + "\n", + "### Access Configuration" + ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:58:17.013283Z", + "start_time": "2025-11-24T10:58:16.981396Z" + } + }, + "cell_type": "code", + "source": [ + "val apiKey = System.getenv(\"LANG_CACHE_API_KEY\")\n", + "val cacheId = \"28e9625f77be4186b295ef6d3577c6d0\"\n", + "val baseUrl = \"https://aws-us-east-1.langcache.redis.io/v1/caches/$cacheId\"" + ], + "outputs": [], + "execution_count": 7 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "### Modeling the requests and responses from the API" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:58:19.573691Z", + "start_time": "2025-11-24T10:58:19.398696Z" + } + }, + "cell_type": "code", + "source": [ + "@Serializable\n", + "data class CacheEntryRequest(\n", + " val prompt: String,\n", + " val 
response: String? = null\n", + ")\n", + "\n", + "@Serializable\n", + "data class CacheEntryResponse(\n", + " val entryId: String\n", + ")\n", + "\n", + "@Serializable\n", + "data class CacheEntry(\n", + " val id: String,\n", + " val prompt: String,\n", + " val response: String,\n", + " val attributes: Map<String, String> = emptyMap(),\n", + " val similarity: Double? = null,\n", + " @SerialName(\"search_strategy\")\n", + " val searchStrategy: String? = null\n", + ")\n", + "\n", + "@Serializable\n", + "data class SearchResponse(\n", + " val data: List<CacheEntry>\n", + ")" + ], + "outputs": [], + "execution_count": 8 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "### Storing in LangCache" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:26:16.657268Z", + "start_time": "2025-11-24T10:26:15.476865Z" + } + }, + "cell_type": "code", + "source": [ + "import io.ktor.client.call.*\n", + "import io.ktor.client.request.*\n", + "import io.ktor.http.*\n", + "\n", + "runBlocking {\n", + " val saveResponse: CacheEntryResponse = client.post(\"$baseUrl/entries\") {\n", + " header(\"Authorization\", \"Bearer $apiKey\")\n", + " contentType(ContentType.Application.Json)\n", + " setBody(CacheEntryRequest(\n", + " prompt = \"How does semantic caching work?\",\n", + " response = \"Semantic caching stores and retrieves data based on meaning, not exact matches.\"\n", + " ))\n", + " }.body()\n", + "\n", + " println(\"Save response: $saveResponse\")\n", + "}" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Save response: CacheEntryResponse(entryId=fda1b671e21b06a0a957c04b1692ab90)\n" + ] + } + ], + "execution_count": 4 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "### Retrieving from LangCache" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:26:24.215885Z", + "start_time": "2025-11-24T10:26:23.627109Z" + } + }, + "cell_type": "code", + "source": [ + "import io.ktor.client.call.*\n", + "import 
io.ktor.client.request.*\n", + "import io.ktor.http.*\n", + "\n", + "runBlocking {\n", + " val searchResponse: SearchResponse = client.post(\"$baseUrl/entries/search\") {\n", + " header(\"Authorization\", \"Bearer $apiKey\")\n", + " contentType(ContentType.Application.Json)\n", + " setBody(CacheEntryRequest(prompt = \"What is semantic caching?\"))\n", + " }.body()\n", + "\n", + " println(\"Search response: $searchResponse\")\n", + "}" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Search response: SearchResponse(data=[CacheEntry(id=fda1b671e21b06a0a957c04b1692ab90, prompt=How does semantic caching work?, response=Semantic caching stores and retrieves data based on meaning, not exact matches., attributes={}, similarity=0.9292393, searchStrategy=null)])\n" + ] + } + ], + "execution_count": 5 + } + ], + "metadata": { + "kernelspec": { + "display_name": "Kotlin", + "language": "kotlin", + "name": "kotlin" + }, + "language_info": { + "name": "kotlin", + "version": "2.2.20-dev-4982", + "mimetype": "text/x-kotlin", + "file_extension": ".kt", + "pygments_lexer": "kotlin", + "codemirror_mode": "text/x-kotlin", + "nbconvert_exporter": "" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git a/java-recipes/notebooks/semantic-caching/readme-assets/2_1_sidebar.png b/java-recipes/notebooks/semantic-caching/readme-assets/2_1_sidebar.png new file mode 100644 index 00000000..2abc471c Binary files /dev/null and b/java-recipes/notebooks/semantic-caching/readme-assets/2_1_sidebar.png differ diff --git a/java-recipes/notebooks/semantic-caching/readme-assets/2_2_tier_selection.png b/java-recipes/notebooks/semantic-caching/readme-assets/2_2_tier_selection.png new file mode 100644 index 00000000..60fa0e9e Binary files /dev/null and b/java-recipes/notebooks/semantic-caching/readme-assets/2_2_tier_selection.png differ diff --git a/java-recipes/notebooks/semantic-caching/readme-assets/2_3_settings.png 
b/java-recipes/notebooks/semantic-caching/readme-assets/2_3_settings.png new file mode 100644 index 00000000..14a32cc7 Binary files /dev/null and b/java-recipes/notebooks/semantic-caching/readme-assets/2_3_settings.png differ diff --git a/java-recipes/notebooks/semantic-caching/readme-assets/2_4_pending.png b/java-recipes/notebooks/semantic-caching/readme-assets/2_4_pending.png new file mode 100644 index 00000000..62a4d906 Binary files /dev/null and b/java-recipes/notebooks/semantic-caching/readme-assets/2_4_pending.png differ diff --git a/java-recipes/notebooks/semantic-caching/readme-assets/2_5_langcache_sidebar.png b/java-recipes/notebooks/semantic-caching/readme-assets/2_5_langcache_sidebar.png new file mode 100644 index 00000000..0fc67b01 Binary files /dev/null and b/java-recipes/notebooks/semantic-caching/readme-assets/2_5_langcache_sidebar.png differ diff --git a/java-recipes/notebooks/semantic-caching/readme-assets/2_6_quick_create.png b/java-recipes/notebooks/semantic-caching/readme-assets/2_6_quick_create.png new file mode 100644 index 00000000..cab487c0 Binary files /dev/null and b/java-recipes/notebooks/semantic-caching/readme-assets/2_6_quick_create.png differ diff --git a/java-recipes/notebooks/semantic-caching/readme-assets/2_7_connectivity.png b/java-recipes/notebooks/semantic-caching/readme-assets/2_7_connectivity.png new file mode 100644 index 00000000..525ecd47 Binary files /dev/null and b/java-recipes/notebooks/semantic-caching/readme-assets/2_7_connectivity.png differ diff --git a/java-recipes/notebooks/semantic-caching/resources/3_colorado_faqs.json b/java-recipes/notebooks/semantic-caching/resources/3_colorado_faqs.json new file mode 100644 index 00000000..724106c4 --- /dev/null +++ b/java-recipes/notebooks/semantic-caching/resources/3_colorado_faqs.json @@ -0,0 +1,1386 @@ +[ + { + "prompt": "What engine options are available for the 2022 Colorado?", + "response": "The 2022 Colorado offers powerful and efficient engine choices, including an 
available GM-exclusive Duramax® 2.8L Turbo-Diesel engine." + }, + { + "prompt": "What is the towing capacity of the 2022 Colorado?", + "response": "The 2022 Colorado can provide up to 7,700 lbs. of towing muscle with the Duramax® 2.8L Turbo-Diesel engine." + }, + { + "prompt": "What is the maximum towing capacity for the Colorado Crew Cab ZR2?", + "response": "The Colorado Crew Cab ZR2 can tow up to 5,000 lbs." + }, + { + "prompt": "What features are included in the Colorado's interior?", + "response": "The interior of the Colorado is filled with convenience and technology features." + }, + { + "prompt": "What is required for maximum trailering ratings in the Colorado?", + "response": "Maximum trailering ratings require the Colorado Crew Cab Short Box LT 2WD with the available Trailering Package, LT Convenience Package, and Safety Package." + }, + { + "prompt": "Are there any limitations on the trailering capacity of the Colorado?", + "response": "Yes, the trailering capacity of your specific vehicle may vary based on the weight of passengers, cargo, and options or accessories." + }, + { + "prompt": "What additional requirements are needed for trailering on the LT and Z71 models?", + "response": "The LT requires the available Trailering Package and automatic locking rear differential, while the Z71 requires the available Trailering Package." + }, + { + "prompt": "Are there any availability issues with features for the 2022 Colorado?", + "response": "Yes, due to current supply chain shortages, certain features shown may have limited or late availability, or may no longer be available." + }, + { + "prompt": "Where can I find information about the features on an individual Colorado vehicle?", + "response": "You can see the window label or consult a dealer regarding the features on an individual vehicle." + }, + { + "prompt": "What are the available models of the Colorado?", + "response": "The available models of the Colorado are WT, LT, Z71, and ZR2." 
+ }, + { + "prompt": "What cab configurations are offered for the Colorado?", + "response": "The Colorado is available in Extended Cab or Crew Cab configurations." + }, + { + "prompt": "What engine options are available for the Colorado?", + "response": "The Colorado offers three engine options: a 2.5L 4-cylinder, a 3.6L V6, and a Duramax 2.8L Turbo-Diesel." + }, + { + "prompt": "What is the highway MPG for the Colorado with the diesel engine?", + "response": "The Colorado with the available Duramax 2.8L Turbo-Diesel engine has an EPA-estimated highway MPG of 30." + }, + { + "prompt": "What is the maximum trailering weight for the Colorado?", + "response": "The maximum trailering weight for the Colorado is 7,700 lbs. with the available diesel engine." + }, + { + "prompt": "Does the Colorado support Apple CarPlay and Android Auto?", + "response": "Yes, the Colorado is compatible with Apple CarPlay® and Android Auto™." + }, + { + "prompt": "What features are included in the ZR2 Bison Edition?", + "response": "The ZR2 Bison Edition includes 17-inch AEV-designed aluminum wheels, an AEV front bumper with winch provisions, an AEV rear bumper with recovery points, five AEV hot-stamped boron steel skid plates, AEV fender flares, fog lamps, front and rear floor liners with AEV logo, and AEV embroidered head restraints." + }, + { + "prompt": "What is the city/highway MPG for the Colorado 2WD with the Duramax engine?", + "response": "The Colorado 2WD with the available Duramax 2.8L Turbo-Diesel engine has an EPA-estimated MPG of 20 city and 30 highway." + }, + { + "prompt": "What are the requirements for the maximum trailering ratings?", + "response": "The maximum trailering ratings require the Colorado Crew Cab Short Box LT 2WD with the available Trailering Package, LT Convenience Package, and Safety Package." 
+ }, + { + "prompt": "Where can I find compatibility information for Android Auto?", + "response": "You can check which smartphones are compatible with Android Auto at g.co/androidauto/requirements." + }, + { + "prompt": "What should I do before using the Colorado for trailering?", + "response": "Before using the Colorado for trailering, carefully review the Trailering section of the Owner’s Manual, as the trailering capacity of your specific vehicle may vary." + }, + { + "prompt": "Are there any warranty details for the Colorado?", + "response": "There is a non-GM warranty; see your dealer for limited warranty details." + }, + { + "prompt": "What should I know about the availability of features for the Colorado?", + "response": "Due to current supply chain shortages, certain features shown may have limited or late availability, or may no longer be available." + }, + { + "prompt": "What are the available models for the pickup?", + "response": "The available models for the pickup are WT, LT, Z71, and ZR2." + }, + { + "prompt": "What features does the WT model include?", + "response": "The WT model includes a 2.5L 4-cylinder engine (Extended Cab and Crew Cab Short Box 2WD), 4-wheel antilock disc brakes, CornerStep rear bumper, locking tailgate, recovery hooks (1 for 4x4), cargo area lighting, a 4-way power-adjustable driver seat, dual rear seats (Extended Cab), front center floor console, Chevrolet Infotainment 3 system with a 7-inch diagonal color touch-screen, and two USB ports." 
+ }, + { + "prompt": "What additional features does the LT model offer compared to the WT model?", + "response": "The LT model offers body-color rear bumper, mirror caps and door handles, EZ Lift power locking tailgate, remote keyless entry, 6-way power-adjustable driver seat, leather-wrapped steering wheel, inside rearview auto-dimming mirror, overhead console, Chevrolet Infotainment 3 system with an 8-inch diagonal color touch-screen, folding rear bench seat (Crew Cab), two charge-only USB ports on the rear of the center console, and cruise control." + }, + { + "prompt": "What distinguishes the Z71 model from the other models?", + "response": "The Z71 model is distinguished by features such as a 3.6L V6 engine, 17-inch aluminum wheels, power-adjustable outside mirrors, hill descent control, automatic locking rear differential, transfer case shield (4x4), 17-inch bright machined-aluminum wheels and all-terrain tires, black Chevrolet bowtie emblems, projector-beam headlamps, fog lamps, remote vehicle starter system, automatic climate control, and HD rear vision camera." + }, + { + "prompt": "What are the key features of the ZR2 model?", + "response": "The ZR2 model includes the ZR2 Off-Road Package, Multimatic DSSV™ Damping System, front and rear full locking differentials, increased clearance and track width compared to other trim levels, skid plates, transfer case shield, 17-inch Graphite and Oxide Gold aluminum wheels and all-terrain tires, Off-Road Appearance Package, distinctive hood, front fascia and grille, spray-on bedliner with ZR2 logo, front and rear wheel flares, and leather-appointed front seats." + }, + { + "prompt": "What is the maximum payload capacity for the WT model?", + "response": "The maximum payload capacity for the WT model is 1,530 lbs for the Extended Cab and Crew Cab Short Box 2WD with a 2.5L I-4 engine." 
+ }, + { + "prompt": "What is the maximum trailering capacity for the ZR2 model?", + "response": "The maximum trailering capacity for the ZR2 model is 5,000 lbs." + }, + { + "prompt": "Are there any special editions available for the ZR2 model?", + "response": "Yes, the ZR2 model has several special editions available, including the ZR2 Bison Edition, ZR2 Dusk Special Edition, and ZR2 Midnight Special Edition." + }, + { + "prompt": "What engine options are available for the LT model?", + "response": "The LT model offers engine options of a 2.5L I-4, 3.6L V6, and Duramax 2.8L I-4." + }, + { + "prompt": "What safety features are included in the models?", + "response": "Safety or driver assistance features are included, but they are not a substitute for the driver's responsibility to operate the vehicle in a safe manner. For specific feature limitations and information, refer to the vehicle Owner’s Manual." + }, + { + "prompt": "What should I consider regarding vehicle loading and trailering?", + "response": "Before buying a vehicle or using it to haul people or cargo, carefully review the Vehicle Loading section of the Owner’s Manual and check the carrying capacity of your specific vehicle on the label on the inside of the driver door jamb. Maximum trailering ratings are intended for comparison purposes only and may vary based on the weight of passengers, cargo, and options or accessories." + }, + { + "prompt": "What safety features are included in the vehicle's mechanical section?", + "response": "The vehicle includes a battery with rundown protection and retained accessory power, 4-wheel antilock disc brakes with Duralife® rotors, StabiliTrak® Electronic Stability Control System and Traction Control, and rack-and-pinion steering with Electric Power Steering (EPS) assist." 
+ }, + { + "prompt": "What types of airbags are available in the vehicle?", + "response": "The vehicle is equipped with dual-stage frontal airbags for the driver and front passenger, seat-mounted side-impact airbags for the driver and front passenger, and head-curtain airbags for front and rear outboard seating positions." + }, + { + "prompt": "What audio features does the vehicle offer?", + "response": "The vehicle features a 6-speaker audio system and includes Bluetooth® phone connectivity and wireless audio streaming." + }, + { + "prompt": "What is included in the vehicle's interior features?", + "response": "Interior features include a front center floor console, power programmable door locks with lockout protection, a driver information center, front and rear carpeted floor mats, an instrument cluster with speedometer, fuel level, engine temperature, and tachometer, two USB ports and an auxiliary input jack, a 12-volt auxiliary power outlet, front bucket seats, dual rear seats with underseat storage (Extended Cab), and a rear folding bench (Crew Cab)." + }, + { + "prompt": "What technology is available for teen drivers?", + "response": "The vehicle includes Teen Driver technology." + }, + { + "prompt": "How does the Tire Pressure Monitoring System work?", + "response": "The Tire Pressure Monitoring System includes Tire Fill Alert and does not apply to the spare tire." + }, + { + "prompt": "What is the purpose of the Passenger Sensing System?", + "response": "The Passenger Sensing System is designed to detect the presence of a properly-seated occupant in the front passenger seat and enable or disable the front passenger airbag accordingly." + }, + { + "prompt": "What should I know about child safety in this vehicle?", + "response": "Always use seat belts and the correct child restraint for your child’s age and size. Children are safer when properly secured in a rear seat in the appropriate infant, child, or booster seat. 
Never place a rear-facing infant restraint in the front seat of any vehicle equipped with an active frontal airbag." + }, + { + "prompt": "How can I check Bluetooth compatibility with my phone?", + "response": "You can check Bluetooth compatibility by visiting my.chevrolet.com/learnAbout/bluetooth." + }, + { + "prompt": "What happens after the SiriusXM trial subscription ends?", + "response": "If you decide to continue service after your trial, your selected subscription plan will automatically renew at then-current rates, and you will be charged fees and taxes. To cancel, you must call SiriusXM at 1-866-635-2349." + }, + { + "prompt": "Are safety features a substitute for safe driving?", + "response": "No, safety or driver assistance features are not a substitute for the driver’s responsibility to operate the vehicle in a safe manner. It is important to read the vehicle Owner’s Manual for important feature limitations and information." + }, + { + "prompt": "What are the engine specifications for the 2.5L DOHC I-4 engine?", + "response": "The 2.5L DOHC I-4 engine has a standard 6-speed automatic transmission, produces 200 hp @ 6300 rpm, and 191 lb.-ft. of torque @ 4400 rpm. It has a displacement of 2460 cc (150 cu. in.), a bore and stroke of 88 mm x 101 mm (3.46 in. x 3.97 in.), and a compression ratio of 11.3:1. The block and cylinder head are made of cast-aluminum, and it uses direct injection for fuel delivery." + }, + { + "prompt": "What is the maximum payload rating for the 3.6L DOHC V6 engine?", + "response": "The maximum payload rating for the 3.6L DOHC V6 engine is 1,420 lbs." + }, + { + "prompt": "What is the EPA-estimated fuel economy for the 2.5L DOHC I-4 engine in 2WD?", + "response": "The EPA-estimated fuel economy for the 2.5L DOHC I-4 engine in 2WD is 20 MPG city and 30 MPG highway." 
+ }, + { + "prompt": "What are the specifications for the DURAMAX 2.8L TURBO-DIESEL I-4 engine?", + "response": "The DURAMAX 2.8L TURBO-DIESEL I-4 engine has a 6-speed automatic transmission, produces 181 hp @ 3400 rpm, and 369 lb.-ft. of torque @ 2000 rpm. It has a displacement of 2776 cc (169 cu. in.), a bore and stroke of 94 mm x 100 mm (3.70 in. x 3.94 in.), and a compression ratio of 16.5:1. The block is made of cast-iron, the cylinder head is cast-aluminum, and it uses high-pressure, common rail direct injection for fuel delivery." + }, + { + "prompt": "What is the maximum trailering weight rating for the 3.6L DOHC V6 engine?", + "response": "The maximum trailering weight rating for the 3.6L DOHC V6 engine is 7,000 lbs." + }, + { + "prompt": "What is the compression ratio of the 2.5L DOHC I-4 engine?", + "response": "The compression ratio of the 2.5L DOHC I-4 engine is 11.3:1." + }, + { + "prompt": "What is the maximum payload rating for the DURAMAX 2.8L TURBO-DIESEL I-4 engine?", + "response": "The maximum payload rating for the DURAMAX 2.8L TURBO-DIESEL I-4 engine is 1,540 lbs." + }, + { + "prompt": "What is the fuel delivery method for the 3.6L DOHC V6 engine?", + "response": "The fuel delivery method for the 3.6L DOHC V6 engine is direct injection." + }, + { + "prompt": "What is the EPA-estimated fuel economy for the DURAMAX 2.8L TURBO-DIESEL I-4 engine in 4x4?", + "response": "The EPA-estimated fuel economy for the DURAMAX 2.8L TURBO-DIESEL I-4 engine in 4x4 is 19 MPG city and 28 MPG highway." + }, + { + "prompt": "What are the engine specifications for the 3.6L DOHC V6 engine?", + "response": "The 3.6L DOHC V6 engine has an 8-speed automatic transmission, produces 308 hp @ 6800 rpm, and 275 lb.-ft. of torque @ 4000 rpm. It has a displacement of 3640 cc (222 cu. in.), a bore and stroke of 95 mm x 85.6 mm (3.74 in. x 3.37 in.), and a compression ratio of 11.5:1. 
The block and cylinder head are made of cast-aluminum, and it uses direct injection for fuel delivery." + }, + { + "prompt": "What is the maximum trailering weight rating for the 2.5L DOHC I-4 engine?", + "response": "The maximum trailering weight rating for the 2.5L DOHC I-4 engine is 3,500 lbs." + }, + { + "prompt": "What is the bore and stroke of the DURAMAX 2.8L TURBO-DIESEL I-4 engine?", + "response": "The bore and stroke of the DURAMAX 2.8L TURBO-DIESEL I-4 engine are 94 mm x 100 mm (3.70 in. x 3.94 in.)." + }, + { + "prompt": "What is the maximum payload rating for the 2.5L DOHC I-4 engine?", + "response": "The maximum payload rating for the 2.5L DOHC I-4 engine is 1,420 lbs." + }, + { + "prompt": "What is the fuel economy for the 3.6L DOHC V6 engine in 4x4?", + "response": "The EPA-estimated fuel economy for the 3.6L DOHC V6 engine in 4x4 is 19 MPG city and 24 MPG highway." + }, + { + "prompt": "What is the torque output of the DURAMAX 2.8L TURBO-DIESEL I-4 engine?", + "response": "The torque output of the DURAMAX 2.8L TURBO-DIESEL I-4 engine is 369 lb.-ft. @ 2000 rpm." + }, + { + "prompt": "What is the horsepower of the 2.5L DOHC I-4 engine?", + "response": "The horsepower of the 2.5L DOHC I-4 engine is 200 hp @ 6300 rpm." + }, + { + "prompt": "What is the fuel economy for the 2.5L DOHC I-4 engine in 4x4?", + "response": "The EPA-estimated fuel economy for the 2.5L DOHC I-4 engine in 4x4 is 19 MPG city and 24 MPG highway." + }, + { + "prompt": "What is the maximum trailering weight rating for the DURAMAX 2.8L TURBO-DIESEL I-4 engine?", + "response": "The maximum trailering weight rating for the DURAMAX 2.8L TURBO-DIESEL I-4 engine is 7,700 lbs." 
+ }, + { + "prompt": "What type of engine options are available for the Work Truck?", + "response": "The Work Truck offers a 2.5L DOHC 4-cylinder engine with Variable Valve Timing (VVT) and Direct Injection, as well as a 3.6L DOHC V6 engine with Variable Valve Timing (VVT) and Direct Injection for Crew Cab 4x4 and Crew Cab Long Box 2WD models." + }, + { + "prompt": "What transmission options are available for the Work Truck?", + "response": "The Work Truck is available with a 6-speed automatic transmission that is electronically controlled with overdrive, and an 8-speed automatic transmission that is electronically controlled with overdrive, Tow/Haul mode, and Hitch Guidance for Crew Cab 4x4 and Crew Cab Long Box 2WD models." + }, + { + "prompt": "What are the exterior features of the Work Truck?", + "response": "Exterior features of the Work Truck include manual-adjustable, manual-folding mirrors with black caps, front recovery hooks for 4x4 models, 255/65R17 all-season tires, and either 17\" Ultra Silver Metallic-Painted Steel Wheels or 18\" Dark Argent Metallic-Painted Aluminum Wheels." + }, + { + "prompt": "What interior features does the Work Truck have?", + "response": "The interior features of the Work Truck include manual single-zone air conditioning, the Chevrolet Infotainment 3 system with a 7-inch diagonal color touch-screen, a rear vision camera, a driver seat with a 4-way power adjuster and manual recline, and a manual tilt steering column." + }, + { + "prompt": "Are there any safety features in the Work Truck?", + "response": "Yes, the Work Truck includes safety or driver assistance features, but it is important to note that these features are not a substitute for the driver's responsibility to operate the vehicle safely." + }, + { + "prompt": "What should I know about using recovery hooks on the Work Truck?", + "response": "To avoid the risk of injury, never use recovery hooks to tow a vehicle. 
For more information, refer to the Recovery Hooks section of your Owner’s Manual." + }, + { + "prompt": "What is required for full functionality of the Chevrolet Infotainment System?", + "response": "Full functionality of the Chevrolet Infotainment System requires compatible Bluetooth and smartphone, as well as USB connectivity for some devices." + }, + { + "prompt": "What colors are available for the LT exterior?", + "response": "The LT exterior colors available include Summit White, Satin Steel Metallic, Bright Blue Metallic, and Black. Note that these colors may be extra-cost options." + }, + { + "prompt": "Will the actual production models of the Work Truck vary from preproduction models?", + "response": "Yes, actual production models may vary from preproduction models, and some features shown may have limited, late, or no availability. It is recommended to see the dealer for feature availability." + }, + { + "prompt": "What cab configurations are available for the LT features?", + "response": "The LT features are available on Extended Cab, Crew Cab Short Box, and Crew Cab Long Box." + }, + { + "prompt": "Which seating surfaces are included in the LT features?", + "response": "The LT features include Jet Black Cloth and Jet Black Leather-Appointed Front Seating Surfaces." + }, + { + "prompt": "Are there different types of seating surfaces for the LT features?", + "response": "Yes, the LT features offer Jet Black Cloth and Jet Black Leather-Appointed Front Seating Surfaces." + }, + { + "prompt": "What is included in the LT features?", + "response": "In addition to standard equipment, the LT features include Jet Black Cloth and Jet Black Leather-Appointed Front Seating Surfaces." + }, + { + "prompt": "Can I get leather seating in the LT features?", + "response": "Yes, the LT features include Jet Black Leather-Appointed Front Seating Surfaces." 
+ }, + { + "prompt": "Is Jet Black Cloth a standard feature in the LT?", + "response": "Yes, Jet Black Cloth is included as part of the LT features." + }, + { + "prompt": "What type of engine does the vehicle have?", + "response": "The vehicle is equipped with a 2.5L DOHC 4-cylinder engine with Variable Valve Timing (VVT) and Direct Injection." + }, + { + "prompt": "Is there a V6 engine option available?", + "response": "Yes, there is a 3.6L DOHC V6 engine option available for Crew Cab 4x4 and Crew Cab Long Box 2WD models, which also features Variable Valve Timing (VVT) and Direct Injection." + }, + { + "prompt": "What kind of transfer case is used in the vehicle?", + "response": "The vehicle uses a 2-speed, electronic Autotrac® transfer case with rotary controls, which includes a Neutral position for dinghy towing in 4x4 models." + }, + { + "prompt": "What type of transmission does the vehicle have?", + "response": "The vehicle is equipped with a 6-speed automatic transmission that is electronically controlled with overdrive." + }, + { + "prompt": "Are there different transmission options for specific models?", + "response": "Yes, Crew Cab 4x4 and Crew Cab Long Box 2WD models have an 8-speed automatic transmission that is electronically controlled with overdrive, Tow/Haul mode, and Hitch Guidance." + }, + { + "prompt": "What features does the 8-speed automatic transmission include?", + "response": "The 8-speed automatic transmission includes Tow/Haul mode and Hitch Guidance." + }, + { + "prompt": "Can the transfer case be used for towing?", + "response": "Yes, the transfer case includes a Neutral position that allows for dinghy towing in 4x4 models." + }, + { + "prompt": "What is the engine configuration for the standard model?", + "response": "The standard model features a 2.5L DOHC 4-cylinder engine with Variable Valve Timing (VVT) and Direct Injection." 
+ }, + { + "prompt": "What is the purpose of the Tow/Haul mode in the transmission?", + "response": "The Tow/Haul mode in the transmission is designed to enhance towing capabilities." + }, + { + "prompt": "What type of mirrors are included?", + "response": "The vehicle includes power-adjustable, manual-folding mirrors with body-color caps, and it also has a driver spotter mirror." + }, + { + "prompt": "Are the mirrors adjustable?", + "response": "Yes, the mirrors are power-adjustable." + }, + { + "prompt": "Do the mirrors fold manually?", + "response": "Yes, the mirrors are manual-folding." + }, + { + "prompt": "What features do the mirrors have?", + "response": "The mirrors have body-color caps and include a driver spotter mirror." + }, + { + "prompt": "How many recovery hooks are there and where are they located?", + "response": "There are 2 recovery hooks located at the front for 4x4 models." + }, + { + "prompt": "What size are the wheels?", + "response": "The vehicle has 17\" Blade Silver Metallic-Painted Aluminum Wheels and 18\" Dark Argent Metallic-Painted Aluminum Wheels." + }, + { + "prompt": "What type of tires are used?", + "response": "The vehicle uses 255/65R17 all-season tires." + }, + { + "prompt": "What is the size of the all-season tires?", + "response": "The all-season tires are sized 255/65R17." + }, + { + "prompt": "What are the wheel options available?", + "response": "The available wheel options are 17\" Blade Silver Metallic-Painted Aluminum Wheels and 18\" Dark Argent Metallic-Painted Aluminum Wheels." + }, + { + "prompt": "What type of air conditioning does the vehicle have?", + "response": "The vehicle has manual, single-zone air conditioning." + }, + { + "prompt": "What kind of cruise control is featured in the vehicle?", + "response": "The vehicle features electronic cruise control with Set and Resume Speed." 
+ }, + { + "prompt": "What entertainment system is included in the vehicle?", + "response": "The vehicle includes the Chevrolet Infotainment 3 system with an 8-inch diagonal color touch-screen." + }, + { + "prompt": "Does the vehicle have a rearview mirror?", + "response": "Yes, the vehicle has an auto-dimming rearview mirror." + }, + { + "prompt": "Is there a rear vision camera in the vehicle?", + "response": "Yes, the vehicle is equipped with a Rear Vision Camera." + }, + { + "prompt": "What type of entry system does the vehicle use?", + "response": "The vehicle uses an extended-range Remote Keyless Entry system." + }, + { + "prompt": "How is the driver's seat adjusted?", + "response": "The driver's seat is adjusted using a 6-way power adjuster." + }, + { + "prompt": "What type of steering column does the vehicle have?", + "response": "The vehicle has a manual tilt and telescopic steering column." + }, + { + "prompt": "What features does the steering wheel have?", + "response": "The steering wheel is leather-wrapped and includes mounted audio, cruise, and phone controls." + }, + { + "prompt": "What size are the wheels on the vehicle?", + "response": "The vehicle has 18-inch Black-Painted Aluminum Wheels with Red Accents." + }, + { + "prompt": "Are safety features a substitute for safe driving?", + "response": "No, safety or driver assistance features are not a substitute for the driver’s responsibility to operate the vehicle in a safe manner." + }, + { + "prompt": "Where can I find more information about feature limitations?", + "response": "You can read the vehicle Owner’s Manual for important feature limitations and information." + }, + { + "prompt": "Can recovery hooks be used to tow the vehicle?", + "response": "No, to avoid the risk of injury, recovery hooks should never be used to tow a vehicle." 
+ }, + { + "prompt": "What is required for full functionality of the Chevrolet Infotainment System?", + "response": "Full functionality requires compatible Bluetooth and smartphone, and USB connectivity for some devices." + }, + { + "prompt": "Are there any color options for the vehicle?", + "response": "Yes, available colors include Summit White, Satin Steel Metallic, Sand Dune Metallic, Bright Blue Metallic, and Black." + }, + { + "prompt": "Are any of the vehicle colors extra-cost options?", + "response": "Yes, some colors are extra-cost options and have limited availability." + }, + { + "prompt": "Is the Chevrolet Infotainment System available on all models?", + "response": "No, Chevrolet Infotainment System functionality varies by model." + }, + { + "prompt": "What models require the Luxury Package?", + "response": "The Luxury Package is required for Crew Cab models only." + }, + { + "prompt": "Will actual production models differ from preproduction models?", + "response": "Yes, actual production models may vary from preproduction models." + }, + { + "prompt": "What should I check for feature availability?", + "response": "You should see the dealer for feature availability." + }, + { + "prompt": "What cab configurations are available for the Z71?", + "response": "The Z71 is available on Extended Cab, Crew Cab Short Box, and Crew Cab Long Box." + }, + { + "prompt": "What type of seating surfaces does the Z71 offer?", + "response": "The Z71 offers Jet Black Cloth/Leatherette and Jet Black Leather-Appointed Front Seating Surfaces." + }, + { + "prompt": "Can you list the available cab types for the Z71?", + "response": "The available cab types for the Z71 include Extended Cab, Crew Cab Short Box, and Crew Cab Long Box." + }, + { + "prompt": "What are the seating options for the Z71?", + "response": "The seating options for the Z71 include Jet Black Cloth/Leatherette and Jet Black Leather-Appointed Front Seating Surfaces." 
+ }, + { + "prompt": "Is the Z71 available with leather seating?", + "response": "Yes, the Z71 is available with Jet Black Leather-Appointed Front Seating Surfaces." + }, + { + "prompt": "What is included in the Z71 features?", + "response": "The Z71 features include various options, such as Jet Black Cloth/Leatherette and Jet Black Leather-Appointed Front Seating Surfaces, in addition to standard equipment." + }, + { + "prompt": "What type of differential does the vehicle have?", + "response": "The vehicle has an automatic locking rear differential." + }, + { + "prompt": "What is the engine specification of the vehicle?", + "response": "The vehicle is equipped with a 3.6L DOHC V6 engine that features Variable Valve Timing (VVT) and Direct Injection." + }, + { + "prompt": "What type of transfer case is included in the vehicle?", + "response": "The vehicle includes a 2-speed, electronic Autotrac transfer case with rotary controls." + }, + { + "prompt": "Does the transfer case have a neutral position for towing?", + "response": "Yes, the transfer case includes a Neutral position for dinghy towing in 4x4 models." + }, + { + "prompt": "What type of transmission does the vehicle have?", + "response": "The vehicle has an 8-speed automatic transmission that is electronically controlled and includes overdrive, Tow/Haul mode, and Hitch Guidance." + }, + { + "prompt": "Is there a suspension package available for off-road use?", + "response": "Yes, the vehicle comes with a Z71 Off-Road Suspension Package." + }, + { + "prompt": "What features are included in the transmission?", + "response": "The transmission features overdrive, Tow/Haul mode, and Hitch Guidance." + }, + { + "prompt": "What is the purpose of the transfer case shield?", + "response": "The transfer case shield is included in 4x4 models for added protection." 
+ }, + { + "prompt": "What type of wheels are available for the vehicle?", + "response": "The vehicle is available with 17\" Bright Machined-Aluminum Wheels and 17\" Gloss Black-Painted Aluminum Wheels." + }, + { + "prompt": "What features do the mirrors have?", + "response": "The mirrors are heated, power-adjustable, manual-folding with Black caps, and include a driver spotter mirror." + }, + { + "prompt": "Are there any recovery hooks on the vehicle?", + "response": "Yes, the vehicle has front recovery hooks." + }, + { + "prompt": "What type of tires does the vehicle use?", + "response": "The vehicle uses 255/65R17 all-terrain tires." + }, + { + "prompt": "Does the vehicle come with fog lamps?", + "response": "Yes, the vehicle is equipped with fog lamps." + }, + { + "prompt": "What is the size of the tires on the vehicle?", + "response": "The tires on the vehicle are 255/65R17." + }, + { + "prompt": "Can the mirrors be adjusted manually?", + "response": "Yes, the mirrors are power-adjustable and can also be manually folded." + }, + { + "prompt": "What is included in the exterior features of the vehicle?", + "response": "The exterior features include fog lamps, 17\" Bright Machined-Aluminum Wheels, 17\" Gloss Black-Painted Aluminum Wheels, heated and power-adjustable mirrors, front recovery hooks, and 255/65R17 all-terrain tires." + }, + { + "prompt": "What type of air conditioning system is included?", + "response": "The vehicle features an automatic, single-zone air conditioning system." + }, + { + "prompt": "What entertainment system is available in the vehicle?", + "response": "The vehicle is equipped with the Chevrolet Infotainment 3 Plus system, which includes an 8-inch diagonal HD color touch-screen." + }, + { + "prompt": "Does the vehicle have a remote starter?", + "response": "Yes, the vehicle includes a remote vehicle starter system." 
+ }, + { + "prompt": "What kind of seat adjustments are available for the driver?", + "response": "The driver's seat has a 6-way power adjuster and power lumbar control." + }, + { + "prompt": "Is the steering wheel heated?", + "response": "Yes, the steering wheel is heated and leather-wrapped, with mounted audio, cruise, and phone controls." + }, + { + "prompt": "What type of window is featured in the vehicle?", + "response": "The vehicle has a sliding rear window." + }, + { + "prompt": "Are safety features a substitute for safe driving?", + "response": "No, safety or driver assistance features are not a substitute for the driver’s responsibility to operate the vehicle safely." + }, + { + "prompt": "Where can I find important information about feature limitations?", + "response": "Important feature limitations and information can be found in the vehicle Owner’s Manual." + }, + { + "prompt": "Can recovery hooks be used to tow a vehicle?", + "response": "No, to avoid the risk of injury, recovery hooks should never be used to tow a vehicle." + }, + { + "prompt": "What is required for full functionality of the Chevrolet Infotainment System?", + "response": "Full functionality of the Chevrolet Infotainment System requires compatible Bluetooth and smartphone, as well as USB connectivity for some devices." + }, + { + "prompt": "Are there any extra-cost interior options?", + "response": "Yes, there are extra-cost interior options available, but they are only available on Crew Cab models." + }, + { + "prompt": "What should I know about feature availability?", + "response": "Some features shown may have limited, late, or no availability, so it's best to see the dealer for feature availability." + }, + { + "prompt": "What are the available exterior colors for the vehicle?", + "response": "The available exterior colors include Summit White, Satin Steel Metallic, Sand Dune Metallic, Bright Blue Metallic, and Black. Some colors may incur extra costs." 
+ }, + { + "prompt": "Are there any extra-cost colors for the vehicle?", + "response": "Yes, some colors are extra-cost options." + }, + { + "prompt": "What types of cabs are available for the ZR2?", + "response": "The ZR2 is available on Extended Cab and Crew Cab Short Box." + }, + { + "prompt": "What is included in the ZR2 features?", + "response": "In addition to standard equipment, select ZR2 features include Jet Black Leather-Appointed Front Seating Surfaces." + }, + { + "prompt": "Can you describe the seating surfaces in the ZR2?", + "response": "The ZR2 features Jet Black Leather-Appointed Front Seating Surfaces." + }, + { + "prompt": "Is the ZR2 available in different cab configurations?", + "response": "Yes, the ZR2 is available in Extended Cab and Crew Cab Short Box configurations." + }, + { + "prompt": "What are the standard features of the ZR2?", + "response": "The document mentions that the ZR2 includes standard equipment along with select additional features." + }, + { + "prompt": "Are there any special seating options for the ZR2?", + "response": "Yes, the ZR2 includes Jet Black Leather-Appointed Front Seating Surfaces as a feature." + }, + { + "prompt": "What type of differential does the vehicle have?", + "response": "The vehicle has a full locking front and rear differential that is driver selectable." + }, + { + "prompt": "Can you describe the engine specifications?", + "response": "The engine is a 3.6L DOHC V6 with Variable Valve Timing (VVT) and Direct Injection." + }, + { + "prompt": "What features are included in the trailering package?", + "response": "The trailering package includes a trailer hitch and a 7-pin connector." + }, + { + "prompt": "What type of transfer case is used in the vehicle?", + "response": "The vehicle uses a 2-speed, electronic Autotrac transfer case with rotary controls, which includes a Neutral position for dinghy towing." 
+ }, + { + "prompt": "What kind of transmission does the vehicle have?", + "response": "The vehicle is equipped with an 8-speed automatic transmission that is electronically controlled and features overdrive, Tow/Haul mode, and Hitch Guidance." + }, + { + "prompt": "Is the differential selectable by the driver?", + "response": "Yes, the differential is driver selectable." + }, + { + "prompt": "What is the engine type and its features?", + "response": "The engine type is a 3.6L DOHC V6, which features Variable Valve Timing (VVT) and Direct Injection." + }, + { + "prompt": "What does the trailering package consist of?", + "response": "The trailering package consists of a trailer hitch and a 7-pin connector." + }, + { + "prompt": "How does the transfer case operate?", + "response": "The transfer case operates as a 2-speed, electronic Autotrac system with rotary controls and includes a Neutral position for dinghy towing." + }, + { + "prompt": "What are the capabilities of the transmission?", + "response": "The transmission is an 8-speed automatic that is electronically controlled, featuring overdrive, Tow/Haul mode, and Hitch Guidance." + }, + { + "prompt": "What type of bedliner is included?", + "response": "The vehicle includes a black spray-on bedliner with a ZR2 logo." + }, + { + "prompt": "What features do the mirrors have?", + "response": "The mirrors are heated, power-adjustable, and manual-folding with body-color caps, and they include a driver spotter mirror." + }, + { + "prompt": "Is there any off-road protection available?", + "response": "Yes, there is off-road rocker protection available, but it has limited availability; it's recommended to check with the dealer for details." + }, + { + "prompt": "What size are the wheels?", + "response": "The vehicle comes with 17-inch Graphite and Oxide Gold Aluminum Wheels." 
+ }, + { + "prompt": "Are there any other wheel options?", + "response": "Yes, there is an option for 17-inch AEV-Designed Dark Graphite Aluminum Wheels." + }, + { + "prompt": "What color are the recovery hooks?", + "response": "The recovery hooks are red." + }, + { + "prompt": "What type of tires are included?", + "response": "The vehicle is equipped with 265/65R17 Goodyear Wrangler DuraTrac® all-terrain tires." + }, + { + "prompt": "What is included in the ZR2 Off-Road Package?", + "response": "The ZR2 Off-Road Package includes a taller and wider stance, Multimatic DSSV™ Damping System, off-road front fascia and rear bumper, fender flares, and a Chevrolet lettered grille with a small flowtie emblem." + }, + { + "prompt": "Can you describe the ZR2 Off-Road Package?", + "response": "The ZR2 Off-Road Package features a taller and wider stance, a Multimatic DSSV™ Damping System, off-road front fascia and rear bumper, fender flares, and a Chevrolet lettered grille with a small flowtie emblem." + }, + { + "prompt": "What is the purpose of the driver spotter mirror?", + "response": "The driver spotter mirror is designed to enhance visibility for the driver, particularly when maneuvering in tight spaces or off-road conditions." + }, + { + "prompt": "What type of air conditioning system does the vehicle have?", + "response": "The vehicle has an automatic, single-zone air conditioning system." + }, + { + "prompt": "What entertainment system is included in the vehicle?", + "response": "The vehicle includes the Chevrolet Infotainment 3 Plus system with an 8-inch diagonal HD color touch-screen." + }, + { + "prompt": "Does the vehicle have a remote starter?", + "response": "Yes, the vehicle is equipped with a remote vehicle starter system." + }, + { + "prompt": "How can the driver's seat be adjusted?", + "response": "The driver's seat can be adjusted using a 6-way power adjuster and has power lumbar control." 
+ }, + { + "prompt": "What adjustments are available for the front passenger seat?", + "response": "The front passenger seat has a 4-way power adjuster and power lumbar control." + }, + { + "prompt": "What size are the wheels on the vehicle?", + "response": "The vehicle has 17-inch Gloss Black-Painted Aluminum Wheels." + }, + { + "prompt": "What features are included on the steering wheel?", + "response": "The steering wheel is heated, leather-wrapped, and includes mounted audio, cruise, and phone controls." + }, + { + "prompt": "Are safety features a substitute for safe driving?", + "response": "No, safety or driver assistance features are not a substitute for the driver’s responsibility to operate the vehicle in a safe manner." + }, + { + "prompt": "Where can I find more information about feature limitations?", + "response": "You can read the vehicle Owner’s Manual for important feature limitations and information." + }, + { + "prompt": "Is it safe to use recovery hooks to tow a vehicle?", + "response": "No, to avoid the risk of injury, you should never use recovery hooks to tow a vehicle." + }, + { + "prompt": "What should I do to find more information about recovery hooks?", + "response": "For more information, see the Recovery Hooks section of your Owner’s Manual." + }, + { + "prompt": "Does the Chevrolet Infotainment System work with all devices?", + "response": "Chevrolet Infotainment System functionality varies by model and requires compatible Bluetooth and smartphone, as well as USB connectivity for some devices." + }, + { + "prompt": "Are all features available at all times?", + "response": "Some features shown may have limited, late, or no availability. You should see the dealer for feature availability." 
+ }, + { + "prompt": "What type of airbags are included in the vehicle's safety features?", + "response": "The vehicle includes dual-stage frontal airbags for the driver and front passenger, seat-mounted side-impact airbags for the driver and front passenger, and head-curtain airbags for front and rear outboard seating positions." + }, + { + "prompt": "Does the vehicle have a Forward Collision Alert feature?", + "response": "Yes, the Forward Collision Alert feature is available but requires the Safety Package." + }, + { + "prompt": "What is the purpose of the Lane Departure Warning feature?", + "response": "The Lane Departure Warning feature is designed to alert the driver if the vehicle unintentionally drifts out of its lane, and it requires the Safety Package." + }, + { + "prompt": "Is there a Rear Park Assist feature available?", + "response": "Yes, the Rear Park Assist feature is available, but it may not function on certain models without a dealer retrofit." + }, + { + "prompt": "What does the Teen Driver technology do?", + "response": "The Teen Driver technology allows parents to set limits and monitor driving habits for young drivers." + }, + { + "prompt": "What is included in the Theft-deterrent system?", + "response": "The Theft-deterrent system includes an engine immobilizer to prevent unauthorized entry." + }, + { + "prompt": "How does the Tire Pressure Monitoring System work?", + "response": "The Tire Pressure Monitoring System alerts the driver when tire pressure is low, but it does not monitor the spare tire." + }, + { + "prompt": "What engine options are available for the vehicle?", + "response": "The vehicle offers several engine options including a 2.5L DOHC 4-cylinder, a 3.6L DOHC V6, and a Duramax 2.8L Turbo-Diesel DOHC I-4." + }, + { + "prompt": "What is the horsepower of the 3.6L DOHC V6 engine?", + "response": "The 3.6L DOHC V6 engine produces 308 hp at 6800 rpm." 
+ }, + { + "prompt": "What is the torque rating for the Duramax 2.8L Turbo-Diesel engine?", + "response": "The Duramax 2.8L Turbo-Diesel engine has a torque rating of 369 lb.-ft. at 2000 rpm." + }, + { + "prompt": "What type of transmission is included with the 2.5L DOHC 4-cylinder engine?", + "response": "The 2.5L DOHC 4-cylinder engine comes with a 6-speed automatic transmission that is electronically controlled with overdrive." + }, + { + "prompt": "What safety features are included in the vehicle's mechanical specifications?", + "response": "The mechanical specifications include StabiliTrak, Hill Descent Control, and a trailer brake controller integrated with the Duramax engine." + }, + { + "prompt": "What is the function of the Hill Start Assist feature?", + "response": "The Hill Start Assist feature helps prevent the vehicle from rolling backward when starting on an incline." + }, + { + "prompt": "What is the purpose of the Chevrolet Infotainment System?", + "response": "The Chevrolet Infotainment System provides connectivity and functionality for compatible Bluetooth and smartphone devices, but full functionality varies by model." + }, + { + "prompt": "How long is the OnStar Safety & Security coverage available?", + "response": "The OnStar Safety & Security coverage is available for up to 3 months." + }, + { + "prompt": "Can I cancel my OnStar subscription at any time?", + "response": "Yes, you can cancel your OnStar subscription at any time by pushing the blue OnStar button or calling 1-888-466-7827." + }, + { + "prompt": "What should I do to ensure child safety in the vehicle?", + "response": "Always use seat belts and the correct child restraint for your child’s age and size, and never place a rear-facing infant restraint in the front seat of a vehicle equipped with an active frontal airbag." 
+ }, + { + "prompt": "What type of bedliner is included?", + "response": "The bedliner is a black spray-on type with the Chevrolet logo, and the ZR2 model includes a ZR2 logo." + }, + { + "prompt": "Does the ZR2 model have a specific logo on the bedliner?", + "response": "Yes, the ZR2 model includes a ZR2 logo on the bedliner." + }, + { + "prompt": "What are the features of the rear bumper?", + "response": "The rear bumper includes CornerSteps and recovery hooks, which are standard on WT and LT 4x4 models." + }, + { + "prompt": "Are recovery hooks standard on all models?", + "response": "Recovery hooks are standard on WT and LT 4x4 models." + }, + { + "prompt": "What type of spare tire is included?", + "response": "The vehicle includes a full-size spare tire, which can be either 255/65R17 all-season or 265/65R17 all-terrain, depending on the configuration." + }, + { + "prompt": "What lighting features are available?", + "response": "The vehicle features daytime running lamps and a cargo box light at the back of the cab, which includes EZ Lift with power lock and release." + }, + { + "prompt": "What is required for the fog lamps to be included?", + "response": "Fog lamps require the available LT Convenience Package on LT models or the available ZR2 Bison Edition on ZR2 models." + }, + { + "prompt": "What type of fuel filler system is used?", + "response": "The vehicle features a capless fueling system for the gas engine." + }, + { + "prompt": "What type of grille is available?", + "response": "The grille is a chrome front grille bar with a Gold bowtie emblem, although certain vehicles may be built with a painted Black grille bar." + }, + { + "prompt": "What type of headlamps are included?", + "response": "The vehicle includes halogen headlamps with automatic on/off functionality and automatic delay." 
+ }, + { + "prompt": "Are there any special requirements for the projector-beam headlamps?", + "response": "Yes, the projector-beam headlamps require the available Luxury Package on LT models." + }, + { + "prompt": "What type of mirrors are included?", + "response": "The vehicle includes manual-adjustable, manual-folding mirrors." + }, + { + "prompt": "Is there a driver spotter mirror available?", + "response": "Yes, a driver spotter mirror is available but requires the Custom Special Edition on WT models." + }, + { + "prompt": "Can recovery hooks be used to tow a vehicle?", + "response": "No, recovery hooks should never be used to tow a vehicle to avoid the risk of injury." + }, + { + "prompt": "What type of air conditioning is available?", + "response": "The vehicle offers manual, single-zone air conditioning as standard, and automatic, single-zone air conditioning is available with the Luxury Package on LT." + }, + { + "prompt": "Is there an option for automatic air conditioning?", + "response": "Yes, automatic, single-zone air conditioning is available but requires the Luxury Package on LT." + }, + { + "prompt": "What seating options are available?", + "response": "The vehicle comes with front bucket seats as standard." + }, + { + "prompt": "What type of cruise control is included?", + "response": "The vehicle features electronic cruise control with Set and Resume Speed, which requires the WT Convenience Package on WT." + }, + { + "prompt": "Are there any options for the rear-window defogger?", + "response": "The rear-window defogger is available but requires the LT Convenience Package on LT." + }, + { + "prompt": "What kind of door locks does the vehicle have?", + "response": "The vehicle is equipped with power programmable door locks that include lockout protection and rear door child security." 
+ }, + { + "prompt": "What type of display is in the Driver Information Center?", + "response": "The standard Driver Information Center features a monochromatic display, while an enhanced version with a multi-color display is available with the Chevrolet Infotainment 3 Plus system on LT." + }, + { + "prompt": "What is the floor covering material?", + "response": "The standard floor covering is Black vinyl, which deletes front and rear carpeted floor mats and is not available with the ZR2 Bison Edition on ZR2." + }, + { + "prompt": "Are carpeted floor mats available?", + "response": "Yes, front and rear carpeted floor mats are available, but they are not available with the ZR2 Bison Edition on ZR2." + }, + { + "prompt": "What type of rearview mirror is standard?", + "response": "The vehicle comes with a manual day/night rearview mirror as standard." + }, + { + "prompt": "Is there an auto-dimming rearview mirror option?", + "response": "Yes, an auto-dimming rearview mirror is available as an option." + }, + { + "prompt": "What type of keyless entry system is available?", + "response": "The vehicle features an extended-range Remote Keyless Entry system, which requires the WT Convenience Package on WT." + }, + { + "prompt": "Is there a remote vehicle starter system?", + "response": "Yes, a remote vehicle starter system is available but requires the LT Convenience Package on LT." + }, + { + "prompt": "What type of steering column is standard?", + "response": "The vehicle is equipped with a manual tilt steering column." + }, + { + "prompt": "Are there options for a telescopic steering column?", + "response": "Yes, a manual tilt and telescopic steering column is available as an option." + }, + { + "prompt": "What type of steering wheel is included?", + "response": "The standard steering wheel is leather-wrapped and includes mounted audio, cruise, and phone controls." 
+ }, + { + "prompt": "Is there a heated steering wheel option?", + "response": "Yes, a heated steering wheel is available but requires either heated front seats or the Luxury Package on LT." + }, + { + "prompt": "What features do the visors have?", + "response": "The standard visors for the driver and front passenger include a passenger vanity mirror." + }, + { + "prompt": "Are illuminated vanity mirrors available?", + "response": "Yes, illuminated vanity mirrors for the driver and front passenger are available as an option." + }, + { + "prompt": "Is there a sliding rear window option?", + "response": "Yes, a sliding rear window is available but requires the LT Convenience Package on LT." + }, + { + "prompt": "What type of windows does the vehicle have?", + "response": "The vehicle features power windows with driver express-up/-down functionality, while all other windows have express-down functionality." + }, + { + "prompt": "What are the requirements for the Chevrolet Infotainment System?", + "response": "Full functionality of the Chevrolet Infotainment System requires compatible Bluetooth and smartphone, as well as USB connectivity for some devices." + }, + { + "prompt": "Where is map coverage available?", + "response": "Map coverage is available in the United States, Puerto Rico, and Canada." + }, + { + "prompt": "What infotainment systems are available?", + "response": "The available infotainment systems include Chevrolet Infotainment 3 with 7-inch and 8-inch touch-screens, Chevrolet Infotainment 3 Plus with 8-inch HD touch-screen, and Chevrolet Infotainment 3 Plus with Navigation." + }, + { + "prompt": "Is the Chevrolet Infotainment 3 system compatible with Apple CarPlay and Android Auto?", + "response": "Yes, both the Chevrolet Infotainment 3 and Infotainment 3 Plus systems are compatible with Apple CarPlay and Android Auto." 
+ }, + { + "prompt": "What features does the Chevrolet Infotainment 3 Plus system offer?", + "response": "The Chevrolet Infotainment 3 Plus system offers an 8-inch HD color touch-screen, Bluetooth audio streaming, voice command pass-through, Apple CarPlay, Android Auto compatibility, Alexa Built-In, voice recognition, In-Vehicle Apps, and Chevrolet Maps." + }, + { + "prompt": "What audio system is included?", + "response": "The audio system includes a 6-speaker setup." + }, + { + "prompt": "Are there USB ports available in the vehicle?", + "response": "Yes, there are two USB ports and an auxiliary input jack in the front center floor console, as well as two charge-only USB ports on the rear of the front center floor console." + }, + { + "prompt": "Is there a trial subscription for SiriusXM?", + "response": "Yes, there is a 3-month trial subscription for SiriusXM available." + }, + { + "prompt": "What is the Wi-Fi Hotspot feature?", + "response": "The available Wi-Fi Hotspot allows for up to seven devices and includes 1 month of unlimited data, requiring available OnStar." + }, + { + "prompt": "What is Chevrolet Connected Access?", + "response": "Chevrolet Connected Access provides 10 years of standard connectivity, enabling features such as Vehicle Diagnostics and Dealer Maintenance Notification, requiring available OnStar." + }, + { + "prompt": "Is wireless charging available?", + "response": "Yes, wireless charging is available for one compatible mobile device." + }, + { + "prompt": "What are the requirements for using Apple CarPlay?", + "response": "Apple CarPlay requires a compatible iPhone and data plan rates apply." + }, + { + "prompt": "What are the requirements for using Android Auto?", + "response": "Android Auto requires the Android Auto app on Google Play and a compatible Android smartphone, with data plan rates applying." 
+ }, + { + "prompt": "What is the coverage area for Connected Navigation?", + "response": "Connected Navigation and Real-time Points of Interest are available in the United States, Puerto Rico, and Canada." + }, + { + "prompt": "What is the duration of the standard connectivity for Chevrolet Connected Access?", + "response": "Standard connectivity for Chevrolet Connected Access is available for up to 10 years for the original purchaser." + }, + { + "prompt": "What should I do if I want to cancel my SiriusXM subscription?", + "response": "To cancel your SiriusXM subscription, you must call SiriusXM at 1-866-635-2349." + }, + { + "prompt": "What is required for the wireless charging feature?", + "response": "The wireless charging feature requires a compatible mobile device, as some phones have built-in wireless charging technology while others may need a special adaptor or back cover." + }, + { + "prompt": "What is included in the Trailering Package?", + "response": "The Trailering Package includes a trailer hitch, 7-pin and 4-pin connectors, and Hitch Guidance. It is standard on Crew Cab Long Box models and requires the available 3.6L V6 engine and automatic locking rear differential on WT and LT, or the available Duramax 2.8L Turbo-Diesel I-4 engine on LT." + }, + { + "prompt": "Is the Custom Special Edition available?", + "response": "Yes, the Custom Special Edition is available and includes 18\" Dark Argent Metallic-painted aluminum wheels, 265/60R18 all-season tires, body-color center grille bar, rear bumper, door handles and mirror caps, and an outside rearview mirror with driver spotter mirror." + }, + { + "prompt": "What features are included in the WT Convenience Package?", + "response": "The WT Convenience Package includes Remote Keyless Entry, a theft-deterrent system with unauthorized entry, cruise control, and an EZ Lift tailgate with power lock and release." 
+ }, + { + "prompt": "What does the Chrome Appearance Package offer?", + "response": "The Chrome Appearance Package offers a chrome center grille bar, heated power-adjustable manual-folding mirrors with chrome caps, chrome door handles, chrome beltline moldings, and a chrome rear bumper." + }, + { + "prompt": "What features are part of the LT Convenience Package?", + "response": "The LT Convenience Package includes fog lamps, heated power-adjustable body-color mirrors, a remote vehicle starter system, and a sliding rear window with defogger." + }, + { + "prompt": "What is included in the Luxury Package for Crew Cab models?", + "response": "The Luxury Package for Crew Cab models includes projector-beam halogen headlamps, driver power lumbar control, a heated steering wheel, single-zone automatic climate control, and heated front seats. Note that this feature may not function on certain models without dealer retrofit." + }, + { + "prompt": "What does the Redline Edition feature?", + "response": "The Redline Edition features 18\" Black-painted aluminum wheels with Red accents, a Black bowtie emblem, a Black center grille bar, 3-inch round Black off-road assist steps, a Black Out Exterior Trim Package, a Black spray-on bedliner, and Red recovery hooks. It is available with Summit White, Satin Steel Metallic, or Black exterior colors." + }, + { + "prompt": "What safety features are included in the Safety Package?", + "response": "The Safety Package includes Forward Collision Alert and Lane Departure Warning, and it also includes Rear Park Assist on LT. Some features may not function on certain models without dealer retrofit." + }, + { + "prompt": "Is the Z71 Midnight Edition available?", + "response": "No, the Z71 Midnight Edition is not available." 
+ }, + { + "prompt": "What are the limitations of safety or driver assistance features?", + "response": "Safety or driver assistance features are not substitutes for the driver’s responsibility to operate the vehicle in a safe manner. It is important to read the vehicle Owner’s Manual for important feature limitations and information." + }, + { + "prompt": "Can recovery hooks be used to tow a vehicle?", + "response": "No, recovery hooks should never be used to tow a vehicle to avoid the risk of injury. For more information, refer to the Recovery Hooks section of your Owner’s Manual." + }, + { + "prompt": "What is included in the ZR2 Bison Edition?", + "response": "The ZR2 Bison Edition includes 17\" AEV-designed aluminum wheels and spare wheel, AEV front bumper with winch provisions, AEV rear bumper with recovery points, AEV hot-stamped boron steel skid plates for front differential, engine oil pan, transfer case, fuel tank and rear differential, AEV fender flares, fog lamps, front and rear floor liners with AEV logo, and AEV embroidered head restraints." + }, + { + "prompt": "Are there any special editions available for the ZR2?", + "response": "Yes, there are special editions available for the ZR2, including the ZR2 Bison Edition, ZR2 Dusk Special Edition, and ZR2 Midnight Special Edition." + }, + { + "prompt": "What features are included in the ZR2 Dusk Special Edition?", + "response": "The ZR2 Dusk Special Edition includes 17\" gloss Black-painted aluminum wheels and spare wheel, off-road lights, and a sport bar with ZR2 logo." + }, + { + "prompt": "What does the ZR2 Midnight Special Edition feature?", + "response": "The ZR2 Midnight Special Edition features a black exterior color, 17\" gloss Black-painted aluminum wheels and spare wheel, off-road lights, and a sport bar with ZR2 logo." 
+ }, + { + "prompt": "Is the ZR2 Bison Edition available in different colors?", + "response": "Yes, the ZR2 Bison Edition is available with Summit White, Satin Steel Metallic, Sand Dune Metallic, or Black exterior colors." + }, + { + "prompt": "What is the warranty for the ZR2 models?", + "response": "The ZR2 models come with a non-GM warranty. For limited warranty details, it is recommended to see your dealer." + }, + { + "prompt": "Are there any features that may not be available for the ZR2?", + "response": "Yes, some features shown may have limited, late, or no availability. It is advised to check with the dealer for feature availability." + }, + { + "prompt": "Can you list the available special editions for the ZR2?", + "response": "The available special editions for the ZR2 are the ZR2 Bison Edition, ZR2 Dusk Special Edition, and ZR2 Midnight Special Edition." + }, + { + "prompt": "What are the wheel specifications for the ZR2 Dusk Special Edition?", + "response": "The ZR2 Dusk Special Edition features 17\" gloss Black-painted aluminum wheels and a spare wheel." + }, + { + "prompt": "What additional features come with the ZR2 Bison Edition?", + "response": "In addition to the wheels and bumpers, the ZR2 Bison Edition includes AEV hot-stamped boron steel skid plates, fender flares, fog lamps, and AEV embroidered head restraints." + }, + { + "prompt": "What is the fuel tank capacity for the Extended Cab?", + "response": "The fuel tank capacity for the Extended Cab is approximately 21 gallons." + }, + { + "prompt": "How much fuel can the Crew Cab hold?", + "response": "The Crew Cab has a fuel tank capacity of approximately 21 gallons." + }, + { + "prompt": "Is the fuel tank size the same for both Extended Cab and Crew Cab?", + "response": "Yes, both the Extended Cab and Crew Cab have a fuel tank capacity of approximately 21 gallons." 
+ }, + { + "prompt": "What is the approximate fuel tank capacity for both cab types?", + "response": "The approximate fuel tank capacity for both the Extended Cab and Crew Cab is 21 gallons." + }, + { + "prompt": "What are the dimensions of the Extended Cab and Crew Cab?", + "response": "The dimensions for the Extended Cab and Crew Cab include a wheelbase of 128.3 inches, a height of 70.4 inches for 2WD and 70.3 inches for 4x4, and a width of 62.4 inches." + }, + { + "prompt": "What is the wheelbase measurement for the Extended Cab?", + "response": "The wheelbase measurement for the Extended Cab is 128.3 inches." + }, + { + "prompt": "How long is the Crew Cab?", + "response": "The Crew Cab has a length of 212.7 inches for the Short Box and 212.4 inches for the ZR2." + }, + { + "prompt": "What is the height of the Extended Cab?", + "response": "The height of the Extended Cab is 70.4 inches for 2WD and 70.3 inches for 4x4." + }, + { + "prompt": "What are the leg room measurements for the Extended Cab?", + "response": "The leg room for the Extended Cab is 45.0 inches in the front and 28.6 inches in the rear." + }, + { + "prompt": "What is the shoulder room in the Crew Cab?", + "response": "The shoulder room in the Crew Cab is 57.5 inches in the front and 56.2 inches in the rear." + }, + { + "prompt": "What is the hip room for the Extended Cab?", + "response": "The hip room for the Extended Cab is 55.0 inches in the front and 52.2 inches in the rear." + }, + { + "prompt": "What accessories are available for the vehicle?", + "response": "Available accessories include All-Weather Floor Liners in Jet Black, a Tonneau Cover with Chrome Bowtie Logo, Colorado 17\" Gloss Black Aluminum Wheels, and Off-Road Assist Steps." + }, + { + "prompt": "Are there any restrictions on wheel and tire combinations?", + "response": "Yes, only GM-approved wheel/tire combinations should be used. Please see the dealer for details." 
+ }, + { + "prompt": "What is the front track measurement for the vehicle?", + "response": "The front track measurement is 83.9 inches." + }, + { + "prompt": "What is the rear track measurement for the vehicle?", + "response": "The rear track measurement is 62.4 inches." + }, + { + "prompt": "What is the commitment of Chevrolet to its customers?", + "response": "Chevrolet is committed to providing the tools and programs needed to help customers get the most out of their journey, whether they are discovering Chevrolet through an online shopping experience or enjoying vehicle ownership." + }, + { + "prompt": "What trademarks are associated with General Motors and Chevrolet?", + "response": "The trademarks and service marks associated with General Motors and Chevrolet include GM, the GM logo, Chevrolet, the Chevrolet logo, slogans, emblems, vehicle model names, and vehicle body designs." + }, + { + "prompt": "Who owns the trademarks for Sirius and related marks?", + "response": "Sirius, XM, SiriusXM and all related marks and logos are trademarks of Sirius XM Radio Inc." + }, + { + "prompt": "What trademarks are associated with Amazon?", + "response": "Amazon, Alexa, and all related marks are trademarks of Amazon.com, Inc. or its affiliates." + }, + { + "prompt": "What is the Bluetooth word mark?", + "response": "The Bluetooth word mark is a registered trademark owned by Bluetooth SIG, Inc., and its use by Chevrolet is under license." + }, + { + "prompt": "Who owns the Bose trademark?", + "response": "Bose is a registered trademark of the Bose Corporation." + }, + { + "prompt": "What trademarks are associated with Goodyear?", + "response": "Goodyear and Wrangler DuraTrac are registered trademarks of The Goodyear Tire & Rubber Company." + }, + { + "prompt": "What year is indicated in the document for the rights reserved?", + "response": "The document indicates that all rights are reserved as of 2022." 
+ }, + { + "prompt": "What company is associated with the OnStar trademark?", + "response": "OnStar is associated with OnStar LLC." + }, + { + "prompt": "What does the © symbol indicate in the document?", + "response": "The © symbol indicates that all rights are reserved by the respective companies mentioned in the document." + }, + { + "prompt": "When was the document published?", + "response": "The document was published in October 2022." + }, + { + "prompt": "What can I do with the Build & Price tool?", + "response": "You can create your perfect Chevrolet from start to finish." + }, + { + "prompt": "How does the Build & Price feature work?", + "response": "The Build & Price feature allows you to create your perfect Chevrolet from start to finish." + }, + { + "prompt": "What is the purpose of the Build & Price tool?", + "response": "The purpose of the Build & Price tool is to help you create your perfect Chevrolet from start to finish." + }, + { + "prompt": "Can I customize my Chevrolet using the Build & Price tool?", + "response": "Yes, you can customize your Chevrolet using the Build & Price tool to create your perfect vehicle." + }, + { + "prompt": "Is the Build & Price tool user-friendly?", + "response": "The Build & Price tool is designed to help you create your perfect Chevrolet from start to finish, indicating it is user-friendly." + }, + { + "prompt": "What is Chevy MyWay?", + "response": "Chevy MyWay is a real-time virtual video experience with a Chevy product specialist." + }, + { + "prompt": "How does Chevy MyWay work?", + "response": "Chevy MyWay allows users to engage in a real-time virtual video experience with a Chevy product specialist." + }, + { + "prompt": "Can I interact with a Chevy product specialist through Chevy MyWay?", + "response": "Yes, Chevy MyWay enables interaction with a Chevy product specialist in real-time via video." 
+ }, + { + "prompt": "Is Chevy MyWay a video service?", + "response": "Yes, Chevy MyWay is a virtual video experience that connects users with a Chevy product specialist." + }, + { + "prompt": "What type of experience does Chevy MyWay provide?", + "response": "Chevy MyWay provides a real-time virtual video experience." + }, + { + "prompt": "What is Chevrolet Complete Care?", + "response": "Chevrolet Complete Care is an owner benefit program that provides excellent service and support from day one." + }, + { + "prompt": "What benefits does Chevrolet Complete Care offer?", + "response": "Chevrolet Complete Care offers excellent service and support as part of its owner benefit program." + }, + { + "prompt": "When does Chevrolet Complete Care start?", + "response": "Chevrolet Complete Care starts from day one of ownership." + }, + { + "prompt": "How does Chevrolet Complete Care support owners?", + "response": "Chevrolet Complete Care supports owners by providing excellent service and support." + }, + { + "prompt": "Is Chevrolet Complete Care available to all Chevrolet owners?", + "response": "The document does not specify if Chevrolet Complete Care is available to all Chevrolet owners." + }, + { + "prompt": "What is the Chevrolet Rewards program?", + "response": "The Chevrolet Rewards program is designed to unlock a world of appreciation for Chevrolet customers, allowing them to earn and redeem rewards." + }, + { + "prompt": "How can I start earning rewards with Chevrolet?", + "response": "You can start earning rewards by participating in the Chevrolet Rewards program." + }, + { + "prompt": "What can I redeem my Chevrolet Rewards for?", + "response": "The Chevrolet Rewards program allows you to redeem rewards for various benefits, although specific details on redemption options are not provided." 
+ }, + { + "prompt": "Is there a way to appreciate Chevrolet customers?", + "response": "Yes, the Chevrolet Rewards program is specifically aimed at appreciating Chevrolet customers." + }, + { + "prompt": "What does the Chevrolet Rewards program offer?", + "response": "The Chevrolet Rewards program offers a way to earn and redeem rewards, creating a sense of appreciation for Chevrolet owners." + }, + { + "prompt": "Can I earn rewards for my Chevrolet purchases?", + "response": "Yes, you can earn rewards through the Chevrolet Rewards program for your Chevrolet purchases." + }, + { + "prompt": "What is the purpose of the Chevrolet Rewards program?", + "response": "The purpose of the Chevrolet Rewards program is to appreciate Chevrolet customers by allowing them to earn and redeem rewards." + }, + { + "prompt": "How do I redeem my rewards in the Chevrolet Rewards program?", + "response": "The document does not specify the exact process for redeeming rewards in the Chevrolet Rewards program." + }, + { + "prompt": "Is the Chevrolet Rewards program available to all Chevrolet customers?", + "response": "The document implies that the Chevrolet Rewards program is available to Chevrolet customers, but does not specify any eligibility criteria." + }, + { + "prompt": "What benefits do I get from the Chevrolet Rewards program?", + "response": "The benefits of the Chevrolet Rewards program include earning and redeeming rewards, although specific benefits are not detailed." + }, + { + "prompt": "What is the main focus of the Chevy Truck Legends?", + "response": "The main focus of the Chevy Truck Legends is to honor dependable, long-lasting trucks and the people who keep them going strong." + }, + { + "prompt": "What does Chevy Truck Legends celebrate?", + "response": "Chevy Truck Legends celebrates dependable, long-lasting trucks and the individuals who maintain them." 
+ }, + { + "prompt": "Who does Chevy Truck Legends honor?", + "response": "Chevy Truck Legends honors the people who keep the dependable, long-lasting trucks going strong." + }, + { + "prompt": "What type of trucks are recognized by Chevy Truck Legends?", + "response": "Chevy Truck Legends recognizes dependable, long-lasting trucks." + }, + { + "prompt": "What is the purpose of Chevy Truck Legends?", + "response": "The purpose of Chevy Truck Legends is to honor both the trucks and the people who ensure their longevity and reliability." + }, + { + "prompt": "What is the age requirement to access Chevy MyWay?", + "response": "Chevy MyWay is available to anyone over 18 years old in the United States." + }, + { + "prompt": "Where can I access Chevy MyWay?", + "response": "You can access Chevy MyWay via chevrolet.com/myway." + }, + { + "prompt": "Can points be earned at locations outside the United States?", + "response": "Points may only be earned and redeemed at GM entities, My Chevrolet Rewards participating dealers, or third-party retailers in the United States, excluding Puerto Rico, the U.S. Virgin Islands, or Guam." + }, + { + "prompt": "Are there any exclusions for earning points?", + "response": "Points are not earned on taxes, fees, or body shop repair orders." + }, + { + "prompt": "What happens to points if my Member Account is deactivated?", + "response": "Points expire without further notice upon Member Account deactivation." + }, + { + "prompt": "Where can I find the terms and conditions for My GM Rewards?", + "response": "You can visit myChevroletrewards.com to view My GM Rewards Terms & Conditions." + }, + { + "prompt": "Are there any limitations on feature availability?", + "response": "Some features shown may have limited, late, or no availability. It is recommended to see the dealer for feature availability." 
+ }, + { + "prompt": "What is the minimum age to participate in My Chevrolet Rewards?", + "response": "You must be 18 years or older to participate in My Chevrolet Rewards." + }, + { + "prompt": "Is Chevy MyWay available to people under 18?", + "response": "No, Chevy MyWay is only available to individuals who are 18 years or older." + }, + { + "prompt": "Can I earn points for body shop repair orders?", + "response": "No, points are not earned on body shop repair orders." + } +] \ No newline at end of file diff --git a/java-recipes/notebooks/semantic-routing/1_semantic_classification.ipynb b/java-recipes/notebooks/semantic-routing/1_semantic_classification.ipynb new file mode 100644 index 00000000..e1b265b4 --- /dev/null +++ b/java-recipes/notebooks/semantic-routing/1_semantic_classification.ipynb @@ -0,0 +1,599 @@ +{ + "cells": [ + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "# Semantic Classification\n", + "\n", + "Semantic classification is a machine learning technique that categorizes text based on its meaning using vector embeddings and similarity matching. This approach offers a compelling alternative to using large language models (LLMs) for classification tasks.\n", + "\n", + "## Semantic Classification vs. LLM Classification\n", + "\n", + "**LLM-based classification** involves sending text to models like GPT-4 or Claude with prompts asking them to categorize content. 
While powerful, this approach has two main limitations:\n", + "- **Cost**: API calls for every classification can be expensive at scale\n", + "- **Latency**: Network requests and model inference add delay\n", + "\n", + "**Semantic classification** uses vector embeddings to represent text meaning numerically, then applies similarity thresholds to determine categories:\n", + "- **Speed**: Near-instantaneous vector similarity calculations\n", + "- **Cost-effective**: No API costs after initial setup\n", + "\n", + "## How It Works\n", + "\n", + "### Creating the references and storing them in the vector database (Redis):\n", + "\n", + "1. Reference examples of text are synthetically or manually generated for each category we want to classify:\n", + "\n", + "\"\"\n", + "\n", + "2. Using an embedding model, we convert these references into embeddings (vector representation)\n", + "3. These references are stored in Redis alongside the category they refer to.\n", + "\n", + "\"\"\n", + "\n", + "### Classifying text:\n", + "\n", + "1. Using the same embedding model, we convert the text we want to classify into an embedding (vector representation)\n", + "2. We use this embedding to perform semantic search in the vector database to retrieve the most similar reference to the text we're classifying\n", + "3. If the most similar reference is similar enough, we assume that the text we're trying to classify belongs to the same category.\n", + "\n", + "\"\"\n", + "\n", + "## Using RedisVL (Vector Library)\n", + "\n", + "RedisVL is a library that makes working with vector search easy with Redis by providing abstractions to common vector search use cases out of the box. 
In this notebook, we will use the *Semantic Routing* abstraction, whose purpose is to classify text in the way described in the previous section.\n", + "\n", + "## Resources\n", + "- [RedisVL Java GitHub Repository](https://github.com/redis/redis-vl-java)\n", + "- [RedisVL Java Documentation](https://redis.github.io/redis-vl-java/redisvl/current/index.html)\n", + "- [RedisVL Python GitHub Repository](https://github.com/redis/redis-vl-python)\n", + "- [RedisVL Python Documentation](https://docs.redisvl.com/en/latest/)\n", + "- [Redis AI readme-assets Repository](https://github.com/redis-developer/redis-ai-readme-assets)\n", + "- [Redis Query Engine Documentation](https://redis.io/docs/latest/develop/ai/search-and-query/)" + ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "## Running Redis\n", + "\n", + "There are several ways to get a running instance of Redis. For the sake of simplicity, in this notebook, we will run it in a Docker container.\n", + "\n", + "For production, where high availability and reliability are a concern, we recommend using [Redis Cloud](https://cloud.redis.io/).\n", + "\n", + "A free database can be spun up in Redis Cloud." + ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "### Running Redis in a Docker Container using Testcontainers\n", + "\n", + "**Docker containers** are lightweight, portable environments that package an application and all its dependencies so it runs consistently across different systems. **Testcontainers** is a library that lets us run lightweight, disposable Docker containers for integration testing, so we can test against real services like databases or message queues without complex setup.\n", + "\n", + "Make sure you have Docker installed: [install Docker](https://www.docker.com/get-started/)." 
+ ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "#### Installing dependencies" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:26:47.943010Z", + "start_time": "2025-11-24T10:26:47.622765Z" + } + }, + "cell_type": "code", + "source": "@file:DependsOn(\"org.testcontainers:testcontainers:2.0.2\")", + "outputs": [], + "execution_count": 1 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "#### Configuring a generic Redis Container" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:26:49.580215Z", + "start_time": "2025-11-24T10:26:49.274011Z" + } + }, + "cell_type": "code", + "source": [ + "import org.testcontainers.containers.GenericContainer\n", + "import org.testcontainers.utility.DockerImageName\n", + "\n", + "class RedisContainer : GenericContainer(DockerImageName.parse(\"redis:latest\")) {\n", + " init {\n", + " withExposedPorts(6379)\n", + " }\n", + "}" + ], + "outputs": [], + "execution_count": 2 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "#### Creating a Docker network\n", + "\n", + "This is necessary because later in this notebook we will spin up a Redis Insight container that needs to be on the same network." 
+ ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:26:51.621483Z", + "start_time": "2025-11-24T10:26:51.558793Z" + } + }, + "cell_type": "code", + "source": [ + "import org.testcontainers.containers.Network\n", + "\n", + "val network = Network.newNetwork()\n", + "val networkAlias = \"redis-network\"" + ], + "outputs": [], + "execution_count": 3 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "#### Start a Redis Container" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:26:54.654713Z", + "start_time": "2025-11-24T10:26:53.556250Z" + } + }, + "cell_type": "code", + "source": [ + "val networkAlias = \"redis\"\n", + "val redis = RedisContainer().withNetwork(network).withNetworkAliases(networkAlias)\n", + "redis.start()\n", + "\n", + "val host = redis.host\n", + "val port = redis.getMappedPort(6379)\n", + "println(\"Redis 8 started at $host:$port\")" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Redis 8 started at localhost:54316\n" + ] + } + ], + "execution_count": 4 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "## Implementing our Semantic Classifier" + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "### Installing dependencies\n", + "\n", + "As mentioned in the beginning, we will use RedisVL's semantic routing abstraction to implement our semantic classifier. Therefore, we will need to add RedisVL as a dependency." + ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:26:56.929750Z", + "start_time": "2025-11-24T10:26:56.220594Z" + } + }, + "cell_type": "code", + "source": "@file:DependsOn(\"com.redis:redisvl:0.0.1\")", + "outputs": [], + "execution_count": 5 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "### Setting up a vectorizer\n", + "\n", + "In RedisVL, embedding models are called vectorizers. This is because embeddings are vector representations. 
The vectorizer is responsible for converting text into numerical vector representations that capture semantic meaning.\n", + "\n", + "This vectorizer will be passed to our semantic router, which converts the references and the text we're trying to classify into vectors under the hood.\n", + "\n", + "RedisVL provides several vectorizer options such as OpenAI and VertexAI, but for this example, we will use Hugging Face's `all-MiniLM-L6-v2` vectorizer because it's open source, lightweight, and free to use." + ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:26:59.215853Z", + "start_time": "2025-11-24T10:26:58.522443Z" + } + }, + "cell_type": "code", + "source": [ + "import com.redis.vl.utils.vectorize.SentenceTransformersVectorizer\n", + "\n", + "val vectorizer = SentenceTransformersVectorizer(\"Xenova/all-MiniLM-L6-v2\")\n", + "\n", + "\n", + "// Testing our vectorizer\n", + "// all-MiniLM-L6-v2 is an embedding model that produces vectors of 384 dimensions, so we will see 384 numbers printed on the screen.\n", + "// Embedding models are deterministic. 
It doesn't matter how many times we run this cell, the same numbers will always be produced for the same string.\n", + "\n", + "val embedding = vectorizer.embed(\"What is the capital city of the Netherlands?\")\n", + "\n", + "println(embedding.joinToString())" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "0.10366548, 0.06542453, -0.04904806, 0.035133816, -0.030148711, -0.048898157, -0.02108736, 0.0019588028, -0.05460191, 0.027000071, 0.0186685, -0.12342901, -0.07914663, -0.0302804, -0.056598365, -0.039736673, 0.030802587, 0.005838588, 0.085851155, -0.032130066, -0.0071115145, -0.033734083, 0.100847885, -0.06491691, 0.014052424, 0.036977015, 0.04544064, -0.014863417, 0.011651148, -0.04714538, 0.019530838, -0.06317588, 0.027103335, -0.032490354, -0.06364442, 0.0034463818, -0.022536488, 0.046401046, 0.029528277, 0.023609689, 0.026152493, -0.025078116, -0.01031126, -0.0460871, -0.030701958, -0.011587745, -0.046117976, 0.0654084, -0.0105588185, -0.030012755, 0.08957275, -0.06994565, -0.07410133, -0.030177299, -0.0072215544, 0.03257758, -0.08564555, 0.06931229, 0.011757878, -0.017046366, 0.006678676, 0.005762717, -0.09732431, 0.04363133, 0.09194445, 0.0023713051, 0.032854725, 0.043560334, -0.09262396, -0.0036028812, -0.00783084, -0.051787496, 0.020866683, -0.08783279, 0.008077556, -0.061896563, -0.052876394, 0.01542515, 0.028461847, 0.055254, 0.0054902015, 0.057896607, -0.012671219, -0.016398145, 0.0065261223, 0.09946422, 0.081683286, -0.014310647, 0.016592013, -0.023128727, 0.03899589, 0.024052972, -0.036022622, 0.025064666, -0.09027798, 0.07410709, 0.033240385, 0.07689808, -0.0075047775, 0.07129043, 0.058456574, 0.0048786686, 0.042863794, -0.03333143, -0.084633105, 0.0404397, 0.0016501043, -0.043248758, 0.008720438, 0.013928717, -0.12757383, 0.0098286215, 0.005235327, -0.07792569, 0.06408246, 0.021897094, 0.05976543, -0.031112881, 0.024613243, 0.03934474, -0.028856492, -0.034532562, -0.05198242, 0.057985958, 0.012580307, 
0.041744966, 0.020353135, -5.6557037E-33, -0.052744765, -0.045968663, 0.090078026, 0.05969658, -0.02187134, -0.005868256, 0.011242692, -0.08511892, -0.079772584, -0.009573086, -0.0019149515, -0.11999923, -0.03113336, -0.08176903, 0.067300566, 0.038190972, 0.06258574, 0.060455803, -0.03424304, 0.016001912, -0.005311913, 0.049972218, -0.009854142, -0.022788646, 0.0062692976, -0.047501225, -0.005984251, -0.02856334, 0.057462938, 0.018307501, -0.029511033, 0.07421711, -0.024075503, -0.0029474783, -0.07320562, 0.0706954, 0.009616033, -0.04029617, -0.01405646, -0.064900756, 0.03480194, -0.054661036, -0.020448288, 0.09943245, 0.005378907, -0.015326283, -0.033892747, -0.046929743, 0.07517493, -0.0070360987, -0.025554607, -0.00481674, -0.03463863, 0.0028586213, 0.043961085, 0.0864002, -0.015171171, 0.045793384, 0.06448305, 0.09541922, -0.03308, 0.041854348, 0.012787413, 0.046993345, 0.066325955, 0.0064544575, -0.03629538, 0.040205065, 0.06272943, 0.025750188, -0.04355145, 0.027366433, 0.03712594, 0.092217006, 0.04882238, 0.077644326, -0.0015609574, 0.011593154, -0.0020405904, 0.05031544, -0.06710352, -0.03852454, -0.13273093, 0.012622703, 0.08002853, -0.04667534, -0.07688987, -0.05683111, 0.047081694, 0.041931298, -0.015288125, -0.042343985, -0.09852148, -0.024203202, -0.018935643, 2.588187E-33, 0.010014019, -0.06895113, -0.09746031, 0.036003824, -0.06341073, 0.012032667, 0.0065762307, 0.049733743, -0.08265836, -0.06188845, -0.08704927, -0.08797185, 0.02567979, 0.04690977, 0.053944163, 0.014458485, 0.082691275, 0.005943607, 0.0030477392, -0.043116875, -0.07077661, -0.0074396897, -0.11552381, 0.005685311, -0.0045177834, -0.0017315152, -0.11645061, -0.036437806, -0.024438681, -0.029221144, 0.019877205, 0.0046609254, -0.04207322, 0.06455668, -0.076581106, 0.03378759, 0.09027628, -0.0253643, 0.015189911, 0.054381564, -0.055162173, -0.033139195, -0.009840774, 0.14639673, 0.017416827, 0.030440766, 0.0069401297, 3.176525E-4, 0.0072309277, -0.039998386, -0.0048506684, 0.040915135, 
-0.03348285, 0.022541454, 0.065321624, 0.031276476, -0.011164032, 0.020680353, 0.004708727, 0.009346659, 0.01592603, 0.05631898, -0.07473072, 0.076988205, 0.05419154, 0.0150421895, -0.0845003, 0.05386096, 0.021571098, -0.11888843, 0.09743547, 0.0051163672, -0.0011004083, 0.0040163654, -0.02674617, -0.024856558, 0.17286904, 0.06661335, -0.0058373064, -0.09741991, -0.01834617, 0.020193022, -0.006355057, -0.045210075, -0.08246333, 0.07894664, 0.09668022, -0.07309056, -0.042953275, -0.013248654, 0.027853789, 0.06620693, -0.05210908, -0.007688315, 0.013242694, -1.9812573E-8, -0.030713642, 0.028282253, -0.0320448, -0.009659997, -0.012524873, -3.3970224E-4, 0.10623612, -9.5831533E-4, -0.09355901, 0.054665145, 0.005950292, 0.026303494, -0.0051028966, -0.0076380824, 0.021773987, 0.0012177717, -0.005338478, 0.08560071, -0.013641202, -0.010476135, -0.024037808, 0.014412938, -0.089821, -0.014127389, -0.009554571, 0.0036026777, 0.061681934, 0.09047015, 0.030739356, -0.025442347, 0.004772291, 0.03357452, 0.010236746, 0.02115033, -0.027155366, 0.0013892107, 0.005756898, -0.008016912, -0.059457876, -0.008943728, 0.028666046, 0.022122331, 0.0048325458, -0.013133106, 0.0039171097, 0.035153743, 0.01711277, 0.02501661, 0.059882976, -0.14004363, -0.15314281, 0.030633396, 0.04397522, 0.0048514833, 0.03720228, -0.0823461, -0.0054057688, 0.0070677525, -0.02929645, -0.025880426, 0.09497929, -0.018403502, 0.043690376, 0.040304095\n" + ] + } + ], + "execution_count": 6 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "### Loading references\n", + "In this recipe, we're trying to classify posts that are related to artificial intelligence. In order to do so, we will vectorize a couple of hundred examples that have been synthetically generated for us. 
The file with the references is located at `./resources/1_references.txt`." + ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:27:01.569822Z", + "start_time": "2025-11-24T10:27:01.416400Z" + } + }, + "cell_type": "code", + "source": [ + "import java.io.File\n", + "\n", + "val artificialIntelligenceReferences = File(\"./resources/1_references.txt\")\n", + " .readLines()\n", + " .map { it.trim() }\n", + "\n", + "// Print the first 10 references of the file on the screen\n", + "println(\"number of references: ${artificialIntelligenceReferences.size}\\n\")\n", + "println(\"First 10 references:\")\n", + "println(artificialIntelligenceReferences.take(10).joinToString(\"\\n\"))" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "number of references: 367\n", + "\n", + "First 10 references:\n", + "AI for beginners\n", + "what is ChatGPT\n", + "how LLMs work\n", + "AI and privacy\n", + "jobs and AI\n", + "AI for writing\n", + "AI and creativity\n", + "using AI to code\n", + "training an AI model\n", + "how AI helps devs\n" + ] + } + ], + "execution_count": 7 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "### Creating a route\n", + "\n", + "Now, let's create a route. A route represents one of the categories we want our classifier to recognize. 
Each route contains:\n", + "\n", + "- **Route name**: An identifier for this classification category\n", + "- **Reference examples**: Sample text that represents the category you want to classify\n", + "- **Distance threshold**: How similar new text must be to the references to match the route" + ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:27:04.884820Z", + "start_time": "2025-11-24T10:27:04.845293Z" + } + }, + "cell_type": "code", + "source": [ + "import com.redis.vl.extensions.router.Route\n", + "import com.redis.vl.extensions.router.SemanticRouter\n", + "\n", + "val artificialIntelligenceRoute = Route.builder()\n", + " .name(\"artificial_intelligence_references\")\n", + " .references(artificialIntelligenceReferences)\n", + " .distanceThreshold(0.7)\n", + " .build()" + ], + "outputs": [], + "execution_count": 8 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "### Creating the router\n", + "\n", + "The SemanticRouter is the central component that orchestrates the classification process. It combines your routes, vectorizer, and Redis connection to provide fast semantic classification capabilities." 
+ ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:27:09.636898Z", + "start_time": "2025-11-24T10:27:06.555960Z" + } + }, + "cell_type": "code", + "source": [ + "import redis.clients.jedis.HostAndPort\n", + "import redis.clients.jedis.UnifiedJedis\n", + "\n", + "// Configure the connection to Redis\n", + "val jedis = UnifiedJedis(HostAndPort(host, port))\n", + "\n", + "val router = SemanticRouter.builder()\n", + " .name(\"ai-router\")\n", + " .jedis(jedis)\n", + " .vectorizer(vectorizer)\n", + " .routes(listOf(artificialIntelligenceRoute))\n", + " .build()" + ], + "outputs": [], + "execution_count": 9 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "### Testing our semantic classification solution" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:27:11.101741Z", + "start_time": "2025-11-24T10:27:11.027104Z" + } + }, + "cell_type": "code", + "source": [ + "val userQuery = \"Redis is a great tool for building applied AI systems because it works well as agent memory\"\n", + "\n", + "val match = router.route(userQuery)\n", + "\n", + "// This query should match the artificial intelligence route\n", + "println(match)" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "RouteMatch(name=artificial_intelligence_references, distance=0.589918046313)\n" + ] + } + ], + "execution_count": 10 + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:27:13.204442Z", + "start_time": "2025-11-24T10:27:13.156505Z" + } + }, + "cell_type": "code", + "source": [ + "val userQuery = \"Flevoland is a nice place to visit\"\n", + "\n", + "val match = router.route(userQuery)\n", + "\n", + "// This query shouldn't match any route\n", + "println(match)" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "RouteMatch(name=null, distance=null)\n" + ] + } + ], + "execution_count": 11 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": 
"Now that we have a working semantic classifier, let's see how data is stored within Redis." + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "## Redis Insight\n", + "\n", + "Redis Insight is a visual tool that helps you explore, monitor, and optimize your Redis data and performance through an easy-to-use interface.\n", + "\n", + "It can be downloaded and run locally on your machine or in a Docker container. To make this recipe self-contained and straightforward, we're going to run it in a Docker container using Testcontainers." + ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "### Configuring a generic Redis Insight Container" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:27:15.716095Z", + "start_time": "2025-11-24T10:27:15.641499Z" + } + }, + "cell_type": "code", + "source": [ + "import org.testcontainers.containers.GenericContainer\n", + "import org.testcontainers.containers.wait.strategy.Wait\n", + "import org.testcontainers.utility.DockerImageName\n", + "\n", + "class RedisInsightContainer : GenericContainer(\n", + " DockerImageName.parse(\"redis/redisinsight:latest\") // or pin a specific stable version\n", + ") {\n", + " init {\n", + " withExposedPorts(5540)\n", + " withEnv(\"RI_REDIS_HOST\", \"redis\")\n", + " withEnv(\"RI_REDIS_PORT\", \"6379\") // Since this will run in the same Docker network, we don't need to set the mapped port for the Redis Server\n", + " withEnv(\"RI_REDIS_ALIAS\", \"Local Redis\")\n", + " withEnv(\"RI_REDIS_USERNAME\", \"default\")\n", + " withEnv(\"RI_REDIS_PASSWORD\", \"\")\n", + " withEnv(\"RI_REDIS_TLS\", \"FALSE\")\n", + "\n", + " waitingFor(Wait.forHttp(\"/\").forPort(5540))\n", + " }\n", + "\n", + " fun getUiUrl(): String = \"http://${host}:${getMappedPort(5540)}\"\n", + "}" + ], + "outputs": [], + "execution_count": 12 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "### Starting the Redis Insight container" + }, + { + "metadata": { +
"ExecuteTime": { + "end_time": "2025-11-24T10:27:20.282569Z", + "start_time": "2025-11-24T10:27:18.050947Z" + } + }, + "cell_type": "code", + "source": [ + "val redisInsight = RedisInsightContainer().withNetwork(network)\n", + "redisInsight.start()\n", + "\n", + "println(\"RedisInsight UI: ${redisInsight.getUiUrl()}\")" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "RedisInsight UI: http://localhost:54322\n" + ] + } + ], + "execution_count": 13 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "When accessing Redis Insight for the first time, you will have to accept the user agreement:\n", + "\n", + "\"\"\n", + "\n", + "After accepting, the list of configured databases will show up. In this case, there'll be only one: `Local Redis`.\n", + "\n", + "\"\"\n", + "\n", + "By clicking on `Tree View` we can organize the keys by keyspace. This will make it easier to visualize all keys in Redis Insight:\n", + "\n", + "\"\"\n", + "\n", + "The `ai-router:route_config` key holds the configuration of the router (the classifier, in our case). We can see its name, vectorizer, routes, and other settings:\n", + "\n", + "\"\"\n", + "\n", + "In the `ai-router:artificial_intelligence_references:` keyspace, we can see the details of each vectorized reference, including its vector representation:\n", + "\n", + "\"\"\n", + "\n", + "Make sure you change from `Unicode` to `Vector 32-bit` to see the vectors as numbers instead of a byte array:\n", + "\n", + "\"\"\n", + "\n", + "This will be a long list of 384 floating-point numbers.\n", + "\n", + "In the Redis Insight Workbench we can send commands directly to our Redis instance:\n", + "\n", + "\"\"\n", + "\n", + "If we send the command `FT.INFO 'ai-router'` we can see the index that RedisVL created to perform semantic search efficiently using the [Redis Query Engine](https://redis.io/docs/latest/develop/ai/search-and-query/).\n", + "\n", + "\"\"" + ] + }, + { +
"metadata": {}, + "cell_type": "markdown", + "source": [ + "## Spinning down Docker containers\n", + "\n", + "Finally, once we're done, let's clean up all the resources we created for our recipe:" + ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:27:25.986571Z", + "start_time": "2025-11-24T10:27:25.380120Z" + } + }, + "cell_type": "code", + "source": [ + "redis.stop()\n", + "redisInsight.stop()\n", + "network.close()" + ], + "outputs": [], + "execution_count": 14 + } + ], + "metadata": { + "kernelspec": { + "display_name": "Kotlin", + "language": "kotlin", + "name": "kotlin" + }, + "language_info": { + "name": "kotlin", + "version": "2.2.20-dev-4982", + "mimetype": "text/x-kotlin", + "file_extension": ".kt", + "pygments_lexer": "kotlin", + "codemirror_mode": "text/x-kotlin", + "nbconvert_exporter": "" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git a/java-recipes/notebooks/semantic-routing/2_semantic_tool_calling.ipynb b/java-recipes/notebooks/semantic-routing/2_semantic_tool_calling.ipynb new file mode 100644 index 00000000..778ab98f --- /dev/null +++ b/java-recipes/notebooks/semantic-routing/2_semantic_tool_calling.ipynb @@ -0,0 +1,634 @@ +{ + "cells": [ + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "# Semantic Tool Calling\n", + "\n", + "Tool calling is a pattern where AI systems determine which tools or functions to execute based on user input. The most common modern approach uses large language models (LLMs) to analyze user requests and decide which functions to call.\n", + "\n", + "## Semantic Tool Calling vs. LLM Tool Calling\n", + "\n", + "**LLM-based tool calling** involves sending user queries to models like GPT-4 or Claude with function definitions, letting the LLM decide which tools to use. 
This approach means:\n", + "- **Cost per request**: Each decision requires an API call\n", + "- **Latency**: Network round-trips add delay to every tool selection\n", + "\n", + "**Semantic tool calling** uses vector embeddings and similarity matching to route queries to appropriate tools:\n", + "- **Speed**: Near-instantaneous tool selection through vector similarity\n", + "- **Cost-effective**: No LLM API call is needed to select a tool\n", + "\n", + "## How It Works\n", + "\n", + "### Creating the references and storing them in the vector database (Redis):\n", + "\n", + "1. Reference examples of text are generated for each tool we want to semantically trigger:\n", + "\n", + "\"\"\n", + "\n", + "2. Using an embedding model, we convert these references into embeddings (vector representations)\n", + "3. These references are stored in Redis alongside the tool they refer to.\n", + "\n", + "\"\"\n", + "\n", + "### Selecting a tool:\n", + "\n", + "1. Using the same embedding model, we convert the user prompt into an embedding (vector representation)\n", + "2. We use this embedding to perform semantic search in the vector database to retrieve the most similar reference to the tools we're trying to match\n", + "3. If the most similar reference is similar enough, we assume that the corresponding tool should be called and call it proactively.\n", + "\n", + "\"\"\n", + "\n", + "## Using RedisVL (Vector Library)\n", + "\n", + "RedisVL is a library that makes vector search with Redis easy by providing abstractions for common vector search use cases out of the box. 
In this notebook, we will use the *Semantic Routing* abstraction whose purpose is to classify text in the same fashion described in the previous section.\n", + "\n", + "## Resources\n", + "- [RedisVL Java GitHub Repository](https://github.com/redis/redis-vl-java)\n", + "- [RedisVL Java Documentation](https://redis.github.io/redis-vl-java/redisvl/current/index.html)\n", + "- [RedisVL Python GitHub Repository](https://github.com/redis/redis-vl-python)\n", + "- [RedisVL Python Documentation](https://docs.redisvl.com/en/latest/)\n", + "- [Redis AI Resources Repository](https://github.com/redis-developer/redis-ai-resources)\n", + "- [Redis Query Engine Documentation](https://redis.io/docs/latest/develop/ai/search-and-query/)" + ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "## Running Redis\n", + "\n", + "There are several options one can follow to have a running instance of Redis. For the sake of simplicity, in this notebook, we will run it in a Docker container.\n", + "\n", + "For production, where high availability and reliability are concerns, we recommend using [Redis Cloud](https://cloud.redis.io/).\n", + "\n", + "A free database can be spun up in Redis Cloud." + ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "### Running Redis in a Docker Container using TestContainers\n", + "\n", + "**Docker containers** are lightweight, portable environments that package an application and all its dependencies so it runs consistently across different systems. **Testcontainers** is a library that lets us run lightweight, disposable Docker containers for integration testing, so you can test against real services like databases or message queues without complex setup.\n", + "\n", + "Make sure you have Docker installed: [install Docker](https://www.docker.com/get-started/)." 
+ ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "#### Installing dependencies" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:28:07.409406Z", + "start_time": "2025-11-24T10:28:07.094431Z" + } + }, + "cell_type": "code", + "source": "@file:DependsOn(\"org.testcontainers:testcontainers:2.0.2\")", + "outputs": [], + "execution_count": 1 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "#### Configuring a generic Redis Container" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:28:08.728521Z", + "start_time": "2025-11-24T10:28:08.443262Z" + } + }, + "cell_type": "code", + "source": [ + "import org.testcontainers.containers.GenericContainer\n", + "import org.testcontainers.utility.DockerImageName\n", + "\n", + "class RedisContainer : GenericContainer(DockerImageName.parse(\"redis:latest\")) {\n", + " init {\n", + " withExposedPorts(6379)\n", + " }\n", + "}" + ], + "outputs": [], + "execution_count": 2 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "#### Creating a Docker network\n", + "\n", + "This is necessary because later on this notebook we will spin up a Redis Insight container that needs to be in the same network." 
+ ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:28:10.704381Z", + "start_time": "2025-11-24T10:28:10.643219Z" + } + }, + "cell_type": "code", + "source": [ + "import org.testcontainers.containers.Network\n", + "\n", + "val network = Network.newNetwork()\n", + "val networkAlias = \"redis-network\"" + ], + "outputs": [], + "execution_count": 3 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "#### Start a Redis Container" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:28:13.363479Z", + "start_time": "2025-11-24T10:28:12.343275Z" + } + }, + "cell_type": "code", + "source": [ + "val networkAlias = \"redis\"\n", + "val redis = RedisContainer().withNetwork(network).withNetworkAliases(networkAlias)\n", + "redis.start()\n", + "\n", + "val host = redis.host\n", + "val port = redis.getMappedPort(6379)\n", + "println(\"Redis 8 started at $host:$port\")" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Redis 8 started at localhost:54394\n" + ] + } + ], + "execution_count": 4 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "## Implementing our Semantic Tool Caller" + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "### Installing dependencies\n", + "\n", + "As mentioned in the beginning, we will use RedisVL's semantic routing abstraction to implement our semantic tool caller. Therefore, we will need to add RedisVL as a dependency." + ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:28:15.415175Z", + "start_time": "2025-11-24T10:28:14.731399Z" + } + }, + "cell_type": "code", + "source": "@file:DependsOn(\"com.redis:redisvl:0.0.1\")", + "outputs": [], + "execution_count": 5 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "### Setting up a vectorizer\n", + "\n", + "In RedisVL, embedding models are called vectorizers. This is because embeddings are vector representations. 
The vectorizer is responsible for converting text into numerical vector representations that capture semantic meaning.\n", + "\n", + "This vectorizer will be passed to our semantic router, which converts the references and the text we're trying to classify into vectors under the hood.\n", + "\n", + "RedisVL provides several vectorizer options such as OpenAI and VertexAI, but for this example, we will use HuggingFace's `all-MiniLM-L6-v2` vectorizer because it's open source, lightweight, and free to use." + ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:28:18.515846Z", + "start_time": "2025-11-24T10:28:17.837886Z" + } + }, + "cell_type": "code", + "source": [ + "import com.redis.vl.utils.vectorize.SentenceTransformersVectorizer\n", + "\n", + "val vectorizer = SentenceTransformersVectorizer(\"Xenova/all-MiniLM-L6-v2\")\n", + "\n", + "\n", + "// Testing our vectorizer\n", + "// all-MiniLM-L6-v2 is an embedding model that produces vectors of 384 dimensions, so we will see 384 numbers printed on the screen.\n", + "// Embedding models are deterministic. 
It doesn't matter how many times we run this cell, the same numbers will always be produced for the same string.\n", + "\n", + "val embedding = vectorizer.embed(\"What is the capital city of the Netherlands?\")\n", + "\n", + "println(embedding.joinToString())" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "0.10366548, 0.06542453, -0.04904806, 0.035133816, -0.030148711, -0.048898157, -0.02108736, 0.0019588028, -0.05460191, 0.027000071, 0.0186685, -0.12342901, -0.07914663, -0.0302804, -0.056598365, -0.039736673, 0.030802587, 0.005838588, 0.085851155, -0.032130066, -0.0071115145, -0.033734083, 0.100847885, -0.06491691, 0.014052424, 0.036977015, 0.04544064, -0.014863417, 0.011651148, -0.04714538, 0.019530838, -0.06317588, 0.027103335, -0.032490354, -0.06364442, 0.0034463818, -0.022536488, 0.046401046, 0.029528277, 0.023609689, 0.026152493, -0.025078116, -0.01031126, -0.0460871, -0.030701958, -0.011587745, -0.046117976, 0.0654084, -0.0105588185, -0.030012755, 0.08957275, -0.06994565, -0.07410133, -0.030177299, -0.0072215544, 0.03257758, -0.08564555, 0.06931229, 0.011757878, -0.017046366, 0.006678676, 0.005762717, -0.09732431, 0.04363133, 0.09194445, 0.0023713051, 0.032854725, 0.043560334, -0.09262396, -0.0036028812, -0.00783084, -0.051787496, 0.020866683, -0.08783279, 0.008077556, -0.061896563, -0.052876394, 0.01542515, 0.028461847, 0.055254, 0.0054902015, 0.057896607, -0.012671219, -0.016398145, 0.0065261223, 0.09946422, 0.081683286, -0.014310647, 0.016592013, -0.023128727, 0.03899589, 0.024052972, -0.036022622, 0.025064666, -0.09027798, 0.07410709, 0.033240385, 0.07689808, -0.0075047775, 0.07129043, 0.058456574, 0.0048786686, 0.042863794, -0.03333143, -0.084633105, 0.0404397, 0.0016501043, -0.043248758, 0.008720438, 0.013928717, -0.12757383, 0.0098286215, 0.005235327, -0.07792569, 0.06408246, 0.021897094, 0.05976543, -0.031112881, 0.024613243, 0.03934474, -0.028856492, -0.034532562, -0.05198242, 0.057985958, 0.012580307, 
0.041744966, 0.020353135, -5.6557037E-33, -0.052744765, -0.045968663, 0.090078026, 0.05969658, -0.02187134, -0.005868256, 0.011242692, -0.08511892, -0.079772584, -0.009573086, -0.0019149515, -0.11999923, -0.03113336, -0.08176903, 0.067300566, 0.038190972, 0.06258574, 0.060455803, -0.03424304, 0.016001912, -0.005311913, 0.049972218, -0.009854142, -0.022788646, 0.0062692976, -0.047501225, -0.005984251, -0.02856334, 0.057462938, 0.018307501, -0.029511033, 0.07421711, -0.024075503, -0.0029474783, -0.07320562, 0.0706954, 0.009616033, -0.04029617, -0.01405646, -0.064900756, 0.03480194, -0.054661036, -0.020448288, 0.09943245, 0.005378907, -0.015326283, -0.033892747, -0.046929743, 0.07517493, -0.0070360987, -0.025554607, -0.00481674, -0.03463863, 0.0028586213, 0.043961085, 0.0864002, -0.015171171, 0.045793384, 0.06448305, 0.09541922, -0.03308, 0.041854348, 0.012787413, 0.046993345, 0.066325955, 0.0064544575, -0.03629538, 0.040205065, 0.06272943, 0.025750188, -0.04355145, 0.027366433, 0.03712594, 0.092217006, 0.04882238, 0.077644326, -0.0015609574, 0.011593154, -0.0020405904, 0.05031544, -0.06710352, -0.03852454, -0.13273093, 0.012622703, 0.08002853, -0.04667534, -0.07688987, -0.05683111, 0.047081694, 0.041931298, -0.015288125, -0.042343985, -0.09852148, -0.024203202, -0.018935643, 2.588187E-33, 0.010014019, -0.06895113, -0.09746031, 0.036003824, -0.06341073, 0.012032667, 0.0065762307, 0.049733743, -0.08265836, -0.06188845, -0.08704927, -0.08797185, 0.02567979, 0.04690977, 0.053944163, 0.014458485, 0.082691275, 0.005943607, 0.0030477392, -0.043116875, -0.07077661, -0.0074396897, -0.11552381, 0.005685311, -0.0045177834, -0.0017315152, -0.11645061, -0.036437806, -0.024438681, -0.029221144, 0.019877205, 0.0046609254, -0.04207322, 0.06455668, -0.076581106, 0.03378759, 0.09027628, -0.0253643, 0.015189911, 0.054381564, -0.055162173, -0.033139195, -0.009840774, 0.14639673, 0.017416827, 0.030440766, 0.0069401297, 3.176525E-4, 0.0072309277, -0.039998386, -0.0048506684, 0.040915135, 
-0.03348285, 0.022541454, 0.065321624, 0.031276476, -0.011164032, 0.020680353, 0.004708727, 0.009346659, 0.01592603, 0.05631898, -0.07473072, 0.076988205, 0.05419154, 0.0150421895, -0.0845003, 0.05386096, 0.021571098, -0.11888843, 0.09743547, 0.0051163672, -0.0011004083, 0.0040163654, -0.02674617, -0.024856558, 0.17286904, 0.06661335, -0.0058373064, -0.09741991, -0.01834617, 0.020193022, -0.006355057, -0.045210075, -0.08246333, 0.07894664, 0.09668022, -0.07309056, -0.042953275, -0.013248654, 0.027853789, 0.06620693, -0.05210908, -0.007688315, 0.013242694, -1.9812573E-8, -0.030713642, 0.028282253, -0.0320448, -0.009659997, -0.012524873, -3.3970224E-4, 0.10623612, -9.5831533E-4, -0.09355901, 0.054665145, 0.005950292, 0.026303494, -0.0051028966, -0.0076380824, 0.021773987, 0.0012177717, -0.005338478, 0.08560071, -0.013641202, -0.010476135, -0.024037808, 0.014412938, -0.089821, -0.014127389, -0.009554571, 0.0036026777, 0.061681934, 0.09047015, 0.030739356, -0.025442347, 0.004772291, 0.03357452, 0.010236746, 0.02115033, -0.027155366, 0.0013892107, 0.005756898, -0.008016912, -0.059457876, -0.008943728, 0.028666046, 0.022122331, 0.0048325458, -0.013133106, 0.0039171097, 0.035153743, 0.01711277, 0.02501661, 0.059882976, -0.14004363, -0.15314281, 0.030633396, 0.04397522, 0.0048514833, 0.03720228, -0.0823461, -0.0054057688, 0.0070677525, -0.02929645, -0.025880426, 0.09497929, -0.018403502, 0.043690376, 0.040304095\n" + ] + } + ], + "execution_count": 6 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "## Defining Tool Routes\n", + "\n", + "Each route represents a specific tool or function that user calls might need to trigger. 
We define routes by providing example phrases that represent different ways users might express the same intent.\n", + "\n", + "Each route contains:\n", + "\n", + "- **Route name**: An identifier for the tool this route triggers\n", + "- **Reference examples**: Sample phrases that should trigger the tool\n", + "- **Distance threshold**: The maximum vector distance at which a query still matches the route (lower values are stricter)" + ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:28:21.075438Z", + "start_time": "2025-11-24T10:28:20.999660Z" + } + }, + "cell_type": "code", + "source": [ + "import com.redis.vl.extensions.router.Route\n", + "\n", + "val getWeatherRoute = Route.builder()\n", + " .name(\"get_weather_route\")\n", + " .references(\n", + " listOf(\n", + " \"What's the weather like?\",\n", + " \"What's the weather today?\",\n", + " \"Will it rain today?\",\n", + " \"Is it raining?\"\n", + " )\n", + " )\n", + " .distanceThreshold(0.3)\n", + " .build()\n", + "\n", + "val faqRoute = Route.builder()\n", + " .name(\"faq_route\")\n", + " .references(\n", + " listOf(\n", + " \"What can you do?\",\n", + " \"What are you capable of?\",\n", + " \"What problems can you solve?\",\n", + " \"How can you help me?\"\n", + " )\n", + " )\n", + " .distanceThreshold(0.3)\n", + " .build()\n", + "\n", + "val notificationsRoute = Route.builder()\n", + " .name(\"notifications_route\")\n", + " .references(\n", + " listOf(\n", + " \"Do I have any notifications?\",\n", + " \"Read my notifications\",\n", + " \"Have I got notifications?\"\n", + " )\n", + " )\n", + " .distanceThreshold(0.3)\n", + " .build()" + ], + "outputs": [], + "execution_count": 7 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "### Creating the router\n", + "\n", + "The SemanticRouter is the central component that orchestrates tool selection. It combines your routes, vectorizer, and Redis connection to provide fast semantic tool selection." 
+ ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:28:24.031972Z", + "start_time": "2025-11-24T10:28:23.641563Z" + } + }, + "cell_type": "code", + "source": [ + "import com.redis.vl.extensions.router.SemanticRouter\n", + "import redis.clients.jedis.HostAndPort\n", + "import redis.clients.jedis.UnifiedJedis\n", + "\n", + "// Configure the connection to Redis\n", + "val jedis = UnifiedJedis(HostAndPort(host, port))\n", + "\n", + "val router = SemanticRouter.builder()\n", + " .name(\"tool-router\")\n", + " .jedis(jedis)\n", + " .vectorizer(vectorizer)\n", + " .routes(listOf(\n", + " getWeatherRoute,\n", + " faqRoute,\n", + " notificationsRoute)\n", + " ).build()" + ], + "outputs": [], + "execution_count": 8 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "## Testing our semantic tool calling solution" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:28:26.374808Z", + "start_time": "2025-11-24T10:28:26.299872Z" + } + }, + "cell_type": "code", + "source": [ + "val userQuery = \"Do I have new notifications?\"\n", + "\n", + "val routeMatch = router.route(userQuery)\n", + "\n", + "// This query should match the notifications route\n", + "println(routeMatch)" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "RouteMatch(name=notifications_route, distance=0.192504366239)\n" + ] + } + ], + "execution_count": 9 + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:28:27.815215Z", + "start_time": "2025-11-24T10:28:27.764253Z" + } + }, + "cell_type": "code", + "source": [ + "val userQuery = \"Will it be sunny today?\"\n", + "\n", + "val routeMatch = router.route(userQuery)\n", + "\n", + "// This query should match the weather route\n", + "println(routeMatch)" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "RouteMatch(name=get_weather_route, distance=0.244104504585)\n" + ] + } + ], + "execution_count": 10 + }, + { + 
"metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:28:29.452642Z", + "start_time": "2025-11-24T10:28:29.402769Z" + } + }, + "cell_type": "code", + "source": [ + "val userQuery = \"What tasks are you capable of?\"\n", + "\n", + "val routeMatch = router.route(userQuery)\n", + "\n", + "// This query should match the faq route\n", + "println(routeMatch)" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "RouteMatch(name=faq_route, distance=0.269281685352)\n" + ] + } + ], + "execution_count": 11 + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:28:30.893974Z", + "start_time": "2025-11-24T10:28:30.839753Z" + } + }, + "cell_type": "code", + "source": [ + "val userQuery = \"Stamppot is the most nutritious meal\"\n", + "\n", + "val routeMatch = router.route(userQuery)\n", + "\n", + "// This query shouldn't match any route\n", + "println(routeMatch)" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "RouteMatch(name=null, distance=null)\n" + ] + } + ], + "execution_count": 12 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "## Redis Insight\n", + "\n", + "Redis Insight is a visual tool that helps you explore, monitor, and optimize your Redis data and performance through an easy-to-use interface.\n", + "\n", + "It can be downloaded and run locally in your machine or be run in a Docker container. To make this recipe self-contained and straightforward, we're going to run it in a Docker container using Test Containers." 
+ ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "### Configuring a generic Redis Insight Container" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:28:33.185527Z", + "start_time": "2025-11-24T10:28:33.100178Z" + } + }, + "cell_type": "code", + "source": [ + "import org.testcontainers.containers.GenericContainer\n", + "import org.testcontainers.containers.wait.strategy.Wait\n", + "import org.testcontainers.utility.DockerImageName\n", + "\n", + "class RedisInsightContainer : GenericContainer(\n", + " DockerImageName.parse(\"redis/redisinsight:latest\") // or latest stable version\n", + ") {\n", + " init {\n", + " withExposedPorts(5540)\n", + " withEnv(\"RI_REDIS_HOST\", \"redis\")\n", + " withEnv(\"RI_REDIS_PORT\", \"6379\") // Since this will run in the same Docker network, we don't need to set the mapped port for the Redis Server\n", + " withEnv(\"RI_REDIS_ALIAS\", \"Local Redis\")\n", + " withEnv(\"RI_REDIS_USERNAME\", \"default\")\n", + " withEnv(\"RI_REDIS_PASSWORD\", \"\")\n", + " withEnv(\"RI_REDIS_TLS\", \"FALSE\")\n", + "\n", + " waitingFor(Wait.forHttp(\"/\").forPort(5540))\n", + " }\n", + "\n", + " fun getUiUrl(): String = \"http://${host}:${getMappedPort(5540)}\"\n", + "}" + ], + "outputs": [], + "execution_count": 13 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "### Starting the Redis Insight container" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:28:37.632888Z", + "start_time": "2025-11-24T10:28:35.386572Z" + } + }, + "cell_type": "code", + "source": [ + "val redisInsight = RedisInsightContainer().withNetwork(network)\n", + "redisInsight.start()\n", + "\n", + "println(\"RedisInsight UI: ${redisInsight.getUiUrl()}\")" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "RedisInsight UI: http://localhost:54400\n" + ] + } + ], + "execution_count": 14 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "When 
accessing Redis Insight for the first time, you will have to agree with the user agreement:\n", + "\n", + "\"\"\n", + "\n", + "After agreeing, the list of configured databases will show up. In this case, there'll be only one: `Local Redis`.\n", + "\n", + "\"\"\n", + "\n", + "By clicking on `Tree View` we can organize the keys by keyspace. This will make it easier to visualize all keys in Redis Insight:\n", + "\n", + "\"\"\n", + "\n", + "The `tool-router:route_config` key holds the configuration of the router (tool caller in our case) - We can see its name, vectorizer, routes and some configuration:\n", + "\n", + "\"\"\n", + "\n", + "In the `tool-router:faq_route:`, `tool-router:get_weather_route:`, `tool-router:notifications_route:` keyspaces, we can see the details of each vectorized reference, including their respective vector representations:\n", + "\n", + "\"\"\n", + "\n", + "Make sure you change from `Unicode` to `Vector 32-bit` to see the vectors as numbers instead of a bytearray:\n", + "\n", + "\"\"\n", + "\n", + "This will be a long list of 384 floating points.\n", + "\n", + "On Redis Insight Workbench we can send commands directly to our Redis instance:\n", + "\n", + "\"\"\n", + "\n", + "If we send the command `FT.INFO 'tool-router'` we can see the index that was created by RedisVL to be able to perform semantic search efficiently using the [Redis Query Engine](https://redis.io/docs/latest/develop/ai/search-and-query/)\n", + "\n", + "\"\"" + ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "## Spinning down Docker containers\n", + "\n", + "Finally, once we're done, let's clean up all the resources we created for our recipe:" + ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:28:50.388177Z", + "start_time": "2025-11-24T10:28:49.763747Z" + } + }, + "cell_type": "code", + "source": [ + "redis.stop()\n", + "redisInsight.stop()\n", + "network.close()" + ], + "outputs": [], + "execution_count": 15 + } + ], + 
"metadata": { + "kernelspec": { + "display_name": "Kotlin", + "language": "kotlin", + "name": "kotlin" + }, + "language_info": { + "name": "kotlin", + "version": "2.2.20-dev-4982", + "mimetype": "text/x-kotlin", + "file_extension": ".kt", + "pygments_lexer": "kotlin", + "codemirror_mode": "text/x-kotlin", + "nbconvert_exporter": "" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git a/java-recipes/notebooks/semantic-routing/3_semantic_guardrails.ipynb b/java-recipes/notebooks/semantic-routing/3_semantic_guardrails.ipynb new file mode 100644 index 00000000..976f4f1b --- /dev/null +++ b/java-recipes/notebooks/semantic-routing/3_semantic_guardrails.ipynb @@ -0,0 +1,570 @@ +{ + "cells": [ + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "# Semantic Guardrails\n", + "\n", + "Guardrails are instructions sent to the LLM (often in the system prompt or through filters) that define what the agent should or shouldn’t say or do to keep its behavior within set boundaries.\n", + "\n", + "## Semantic Guardrails vs. Prompt Guardrails\n", + "\n", + "**Prompt guardrails** are rules written directly into an LLM’s system or developer prompts. They guide the model’s behavior by telling it what it should or shouldn’t say. 
For example, “Don’t discuss personal medical advice.” While effective, this approach has a few trade-offs:\n", + "- **Cost per request**: Each decision requires an API call\n", + "- **Latency**: Network round-trips add delay to every guardrail check\n", + "- **Vulnerability to jailbreaks**: Cleverly phrased inputs can trick the model into ignoring or bypassing the rules\n", + "\n", + "**Semantic guardrails** use vector embeddings and similarity matching to block queries that match a blocked topic:\n", + "- **Speed**: Near-instantaneous topic matching through vector similarity\n", + "- **Cost-effective**: No LLM API call is needed to check a query\n", + "- **Resistant to jailbreaks**: Because semantic guardrails work outside the LLM and detect meaning directly, they’re much harder to bypass with prompt tricks or rewording.\n", + "\n", + "## How It Works\n", + "\n", + "Creating the references and storing them in the vector database (Redis):\n", + "1. Reference examples of text are generated for each topic we want to semantically block.\n", + "2. Using an embedding model, we convert these references into embeddings (vector representations)\n", + "3. These references are stored in Redis.\n", + "\n", + "Checking the topic:\n", + "1. Using the same embedding model, we convert the user prompt into an embedding (vector representation)\n", + "2. We use this embedding to perform semantic search in the vector database to retrieve the most similar reference to the topics we're trying to block\n", + "3. If the most similar reference is similar enough, we assume that the corresponding topic should be blocked.\n", + "\n", + "## Using RedisVL (Vector Library)\n", + "\n", + "RedisVL is a library that makes vector search with Redis easy by providing abstractions for common vector search use cases out of the box. 
In this notebook, we will use the *Semantic Routing* abstraction whose purpose is to classify text in the same fashion described in the previous section.\n", + "\n", + "## Resources\n", + "- [RedisVL Java GitHub Repository](https://github.com/redis/redis-vl-java)\n", + "- [RedisVL Java Documentation](https://redis.github.io/redis-vl-java/redisvl/current/index.html)\n", + "- [RedisVL Python GitHub Repository](https://github.com/redis/redis-vl-python)\n", + "- [RedisVL Python Documentation](https://docs.redisvl.com/en/latest/)\n", + "- [Redis AI Resources Repository](https://github.com/redis-developer/redis-ai-resources)\n", + "- [Redis Query Engine Documentation](https://redis.io/docs/latest/develop/ai/search-and-query/)" + ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "## Running Redis\n", + "\n", + "There are several options one can follow to have a running instance of Redis. For the sake of simplicity, in this notebook, we will run it in a Docker container.\n", + "\n", + "For production where high-availability and reliability is a concern, we recommend using [Redis Cloud](https://cloud.redis.io/).\n", + "\n", + "A free database can be spun up in Redis Cloud." + ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "### Running Redis in a Docker Container using TestContainers\n", + "\n", + "**Docker containers** are lightweight, portable environments that package an application and all its dependencies so it runs consistently across different systems. **Testcontainers** is a library that lets us run lightweight, disposable Docker containers for integration testing, so you can test against real services like databases or message queues without complex setup.\n", + "\n", + "Make sure you have Docker installed: [install Docker](https://www.docker.com/get-started/)." 
+ ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "#### Installing dependencies" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:28:59.664792Z", + "start_time": "2025-11-24T10:28:59.335682Z" + } + }, + "cell_type": "code", + "source": "@file:DependsOn(\"org.testcontainers:testcontainers:2.0.2\")", + "outputs": [], + "execution_count": 1 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "#### Configuring a generic Redis Container" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:29:01.321376Z", + "start_time": "2025-11-24T10:29:01.021019Z" + } + }, + "cell_type": "code", + "source": [ + "import org.testcontainers.containers.GenericContainer\n", + "import org.testcontainers.utility.DockerImageName\n", + "\n", + "class RedisContainer : GenericContainer(DockerImageName.parse(\"redis:latest\")) {\n", + " init {\n", + " withExposedPorts(6379)\n", + " }\n", + "}" + ], + "outputs": [], + "execution_count": 2 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "#### Creating a Docker network\n", + "\n", + "This is necessary because later on this notebook we will spin up a Redis Insight container that needs to be in the same network." 
+ ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:29:03.002815Z", + "start_time": "2025-11-24T10:29:02.943963Z" + } + }, + "cell_type": "code", + "source": [ + "import org.testcontainers.containers.Network\n", + "\n", + "val network = Network.newNetwork()\n", + "val networkAlias = \"redis-network\"" + ], + "outputs": [], + "execution_count": 3 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "#### Start a Redis Container" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:29:06.007406Z", + "start_time": "2025-11-24T10:29:05.003407Z" + } + }, + "cell_type": "code", + "source": [ + "val networkAlias = \"redis\"\n", + "val redis = RedisContainer().withNetwork(network).withNetworkAliases(networkAlias)\n", + "redis.start()\n", + "\n", + "val host = redis.host\n", + "val port = redis.getMappedPort(6379)\n", + "println(\"Redis 8 started at $host:$port\")" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Redis 8 started at localhost:54468\n" + ] + } + ], + "execution_count": 4 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "## Implementing our Semantic Guardrail" + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "### Installing dependencies\n", + "\n", + "As mentioned at the beginning, we will use RedisVL's semantic routing abstraction to implement our semantic guardrail. Therefore, we will need to add RedisVL as a dependency." + ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:29:08.048867Z", + "start_time": "2025-11-24T10:29:07.349852Z" + } + }, + "cell_type": "code", + "source": "@file:DependsOn(\"com.redis:redisvl:0.0.1\")", + "outputs": [], + "execution_count": 5 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "### Setting up a vectorizer\n", + "\n", + "In RedisVL, embedding models are called vectorizers. This is because embeddings are vector representations. 
The vectorizer is responsible for converting text into numerical vector representations that capture semantic meaning.\n", + "\n", + "This vectorizer will be passed to our semantic router, which converts the references and the text we're trying to block into vectors under the hood.\n", + "\n", + "RedisVL provides several vectorizer options such as OpenAI and VertexAI, but for this example, we will use Hugging Face's `all-MiniLM-L6-v2` vectorizer because it's open source, lightweight, and free to use." + ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:29:10.682255Z", + "start_time": "2025-11-24T10:29:10.011397Z" + } + }, + "cell_type": "code", + "source": [ + "import com.redis.vl.utils.vectorize.SentenceTransformersVectorizer\n", + "\n", + "val vectorizer = SentenceTransformersVectorizer(\"Xenova/all-MiniLM-L6-v2\")\n", + "\n", + "\n", + "// Testing our vectorizer\n", + "// all-MiniLM-L6-v2 is an embedding model that produces vectors of 384 dimensions, so we will see 384 numbers printed on the screen.\n", + "// Embedding models are deterministic. 
It doesn't matter how many times we run this cell, the same numbers will always be produced for the same string.\n", + "\n", + "val embedding = vectorizer.embed(\"What is the capital city of the Netherlands?\")\n", + "\n", + "println(embedding.joinToString())" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "0.10366548, 0.06542453, -0.04904806, 0.035133816, -0.030148711, -0.048898157, -0.02108736, 0.0019588028, -0.05460191, 0.027000071, 0.0186685, -0.12342901, -0.07914663, -0.0302804, -0.056598365, -0.039736673, 0.030802587, 0.005838588, 0.085851155, -0.032130066, -0.0071115145, -0.033734083, 0.100847885, -0.06491691, 0.014052424, 0.036977015, 0.04544064, -0.014863417, 0.011651148, -0.04714538, 0.019530838, -0.06317588, 0.027103335, -0.032490354, -0.06364442, 0.0034463818, -0.022536488, 0.046401046, 0.029528277, 0.023609689, 0.026152493, -0.025078116, -0.01031126, -0.0460871, -0.030701958, -0.011587745, -0.046117976, 0.0654084, -0.0105588185, -0.030012755, 0.08957275, -0.06994565, -0.07410133, -0.030177299, -0.0072215544, 0.03257758, -0.08564555, 0.06931229, 0.011757878, -0.017046366, 0.006678676, 0.005762717, -0.09732431, 0.04363133, 0.09194445, 0.0023713051, 0.032854725, 0.043560334, -0.09262396, -0.0036028812, -0.00783084, -0.051787496, 0.020866683, -0.08783279, 0.008077556, -0.061896563, -0.052876394, 0.01542515, 0.028461847, 0.055254, 0.0054902015, 0.057896607, -0.012671219, -0.016398145, 0.0065261223, 0.09946422, 0.081683286, -0.014310647, 0.016592013, -0.023128727, 0.03899589, 0.024052972, -0.036022622, 0.025064666, -0.09027798, 0.07410709, 0.033240385, 0.07689808, -0.0075047775, 0.07129043, 0.058456574, 0.0048786686, 0.042863794, -0.03333143, -0.084633105, 0.0404397, 0.0016501043, -0.043248758, 0.008720438, 0.013928717, -0.12757383, 0.0098286215, 0.005235327, -0.07792569, 0.06408246, 0.021897094, 0.05976543, -0.031112881, 0.024613243, 0.03934474, -0.028856492, -0.034532562, -0.05198242, 0.057985958, 0.012580307, 
0.041744966, 0.020353135, -5.6557037E-33, -0.052744765, -0.045968663, 0.090078026, 0.05969658, -0.02187134, -0.005868256, 0.011242692, -0.08511892, -0.079772584, -0.009573086, -0.0019149515, -0.11999923, -0.03113336, -0.08176903, 0.067300566, 0.038190972, 0.06258574, 0.060455803, -0.03424304, 0.016001912, -0.005311913, 0.049972218, -0.009854142, -0.022788646, 0.0062692976, -0.047501225, -0.005984251, -0.02856334, 0.057462938, 0.018307501, -0.029511033, 0.07421711, -0.024075503, -0.0029474783, -0.07320562, 0.0706954, 0.009616033, -0.04029617, -0.01405646, -0.064900756, 0.03480194, -0.054661036, -0.020448288, 0.09943245, 0.005378907, -0.015326283, -0.033892747, -0.046929743, 0.07517493, -0.0070360987, -0.025554607, -0.00481674, -0.03463863, 0.0028586213, 0.043961085, 0.0864002, -0.015171171, 0.045793384, 0.06448305, 0.09541922, -0.03308, 0.041854348, 0.012787413, 0.046993345, 0.066325955, 0.0064544575, -0.03629538, 0.040205065, 0.06272943, 0.025750188, -0.04355145, 0.027366433, 0.03712594, 0.092217006, 0.04882238, 0.077644326, -0.0015609574, 0.011593154, -0.0020405904, 0.05031544, -0.06710352, -0.03852454, -0.13273093, 0.012622703, 0.08002853, -0.04667534, -0.07688987, -0.05683111, 0.047081694, 0.041931298, -0.015288125, -0.042343985, -0.09852148, -0.024203202, -0.018935643, 2.588187E-33, 0.010014019, -0.06895113, -0.09746031, 0.036003824, -0.06341073, 0.012032667, 0.0065762307, 0.049733743, -0.08265836, -0.06188845, -0.08704927, -0.08797185, 0.02567979, 0.04690977, 0.053944163, 0.014458485, 0.082691275, 0.005943607, 0.0030477392, -0.043116875, -0.07077661, -0.0074396897, -0.11552381, 0.005685311, -0.0045177834, -0.0017315152, -0.11645061, -0.036437806, -0.024438681, -0.029221144, 0.019877205, 0.0046609254, -0.04207322, 0.06455668, -0.076581106, 0.03378759, 0.09027628, -0.0253643, 0.015189911, 0.054381564, -0.055162173, -0.033139195, -0.009840774, 0.14639673, 0.017416827, 0.030440766, 0.0069401297, 3.176525E-4, 0.0072309277, -0.039998386, -0.0048506684, 0.040915135, 
-0.03348285, 0.022541454, 0.065321624, 0.031276476, -0.011164032, 0.020680353, 0.004708727, 0.009346659, 0.01592603, 0.05631898, -0.07473072, 0.076988205, 0.05419154, 0.0150421895, -0.0845003, 0.05386096, 0.021571098, -0.11888843, 0.09743547, 0.0051163672, -0.0011004083, 0.0040163654, -0.02674617, -0.024856558, 0.17286904, 0.06661335, -0.0058373064, -0.09741991, -0.01834617, 0.020193022, -0.006355057, -0.045210075, -0.08246333, 0.07894664, 0.09668022, -0.07309056, -0.042953275, -0.013248654, 0.027853789, 0.06620693, -0.05210908, -0.007688315, 0.013242694, -1.9812573E-8, -0.030713642, 0.028282253, -0.0320448, -0.009659997, -0.012524873, -3.3970224E-4, 0.10623612, -9.5831533E-4, -0.09355901, 0.054665145, 0.005950292, 0.026303494, -0.0051028966, -0.0076380824, 0.021773987, 0.0012177717, -0.005338478, 0.08560071, -0.013641202, -0.010476135, -0.024037808, 0.014412938, -0.089821, -0.014127389, -0.009554571, 0.0036026777, 0.061681934, 0.09047015, 0.030739356, -0.025442347, 0.004772291, 0.03357452, 0.010236746, 0.02115033, -0.027155366, 0.0013892107, 0.005756898, -0.008016912, -0.059457876, -0.008943728, 0.028666046, 0.022122331, 0.0048325458, -0.013133106, 0.0039171097, 0.035153743, 0.01711277, 0.02501661, 0.059882976, -0.14004363, -0.15314281, 0.030633396, 0.04397522, 0.0048514833, 0.03720228, -0.0823461, -0.0054057688, 0.0070677525, -0.02929645, -0.025880426, 0.09497929, -0.018403502, 0.043690376, 0.040304095\n" + ] + } + ], + "execution_count": 6 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "## Defining the Block Route\n", + "\n", + "The blocking route represents all topics we may want our agent to not respond to. 
We define the route by providing example phrases that represent different ways users may refer to the topics that should be blocked.\n", + "\n", + "Each route contains:\n", + "\n", + "- **Route name**: An identifier for this classification category\n", + "- **Reference examples**: Sample text that represents the category you want to classify\n", + "- **Distance threshold**: How similar new text must be to the references to match the route" + ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:29:13.425483Z", + "start_time": "2025-11-24T10:29:13.368031Z" + } + }, + "cell_type": "code", + "source": [ + "import com.redis.vl.extensions.router.Route\n", + "\n", + "val blockRoute = Route.builder()\n", + " .name(\"block_route\")\n", + " .references(\n", + " listOf(\n", + " \"things about aliens\",\n", + " \"corporate questions about agile\",\n", + " \"anything about the S&P 500\"\n", + " )\n", + " )\n", + " .build()" + ], + "outputs": [], + "execution_count": 7 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "### Creating the router\n", + "\n", + "The SemanticRouter is the central component that orchestrates the blocking process. It combines your routes, vectorizer, and Redis connection to provide fast semantic classification capabilities." 
+ ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:29:15.570749Z", + "start_time": "2025-11-24T10:29:15.239546Z" + } + }, + "cell_type": "code", + "source": [ + "import com.redis.vl.extensions.router.SemanticRouter\n", + "import redis.clients.jedis.HostAndPort\n", + "import redis.clients.jedis.UnifiedJedis\n", + "\n", + "// Configure the connection to Redis\n", + "val jedis = UnifiedJedis(HostAndPort(host, port))\n", + "\n", + "val router = SemanticRouter.builder()\n", + " .name(\"block-router\")\n", + " .jedis(jedis)\n", + " .vectorizer(vectorizer)\n", + " .routes(listOf(blockRoute)\n", + " ).build()" + ], + "outputs": [], + "execution_count": 8 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "## Testing our semantic guardrail solution" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:29:17.615320Z", + "start_time": "2025-11-24T10:29:17.554201Z" + } + }, + "cell_type": "code", + "source": [ + "val userQuery = \"Is scrum a good framework?\"\n", + "\n", + "val routeMatch = router.route(userQuery)\n", + "\n", + "// This query should be blocked\n", + "println(routeMatch)" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "RouteMatch(name=block_route, distance=0.486094415188)\n" + ] + } + ], + "execution_count": 9 + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:29:19.218789Z", + "start_time": "2025-11-24T10:29:19.166099Z" + } + }, + "cell_type": "code", + "source": [ + "val userQuery = \"Tell me a joke with an alien\"\n", + "\n", + "val routeMatch = router.route(userQuery)\n", + "\n", + "// This query should be blocked\n", + "println(routeMatch)" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "RouteMatch(name=block_route, distance=0.417759597301)\n" + ] + } + ], + "execution_count": 10 + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:29:20.653511Z", + "start_time": 
"2025-11-24T10:29:20.602270Z" + } + }, + "cell_type": "code", + "source": [ + "val userQuery = \"Help me solve a problem\"\n", + "\n", + "val routeMatch = router.route(userQuery)\n", + "\n", + "// This query should be allowed\n", + "println(routeMatch)" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "RouteMatch(name=null, distance=null)\n" + ] + } + ], + "execution_count": 11 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "## Redis Insight\n", + "\n", + "Redis Insight is a visual tool that helps you explore, monitor, and optimize your Redis data and performance through an easy-to-use interface.\n", + "\n", + "It can be downloaded and run locally in your machine or be run in a Docker container. To make this recipe self-contained and straightforward, we're going to run it in a Docker container using Test Containers." + ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "### Configuring a generic Redis Insight Container" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:29:23.036904Z", + "start_time": "2025-11-24T10:29:22.940514Z" + } + }, + "cell_type": "code", + "source": [ + "import org.testcontainers.containers.GenericContainer\n", + "import org.testcontainers.containers.wait.strategy.Wait\n", + "import org.testcontainers.utility.DockerImageName\n", + "\n", + "class RedisInsightContainer : GenericContainer(\n", + " DockerImageName.parse(\"redis/redisinsight:latest\") // or latest stable version\n", + ") {\n", + " init {\n", + " withExposedPorts(5540)\n", + " withEnv(\"RI_REDIS_HOST\", \"redis\")\n", + " withEnv(\"RI_REDIS_PORT\", \"6379\") // Since this will run in the same Docker network, we don't need to set the mapped port for the Redis Server\n", + " withEnv(\"RI_REDIS_ALIAS\", \"Local Redis\")\n", + " withEnv(\"RI_REDIS_USERNAME\", \"default\")\n", + " withEnv(\"RI_REDIS_PASSWORD\", \"\")\n", + " withEnv(\"RI_REDIS_TLS\", \"FALSE\")\n", + "\n", + " 
waitingFor(Wait.forHttp(\"/\").forPort(5540))\n", + " }\n", + "\n", + " fun getUiUrl(): String = \"http://${host}:${getMappedPort(5540)}\"\n", + "}" + ], + "outputs": [], + "execution_count": 12 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "### Starting the Redis Insight container" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:29:27.745746Z", + "start_time": "2025-11-24T10:29:25.468053Z" + } + }, + "cell_type": "code", + "source": [ + "val redisInsight = RedisInsightContainer().withNetwork(network)\n", + "redisInsight.start()\n", + "\n", + "println(\"RedisInsight UI: ${redisInsight.getUiUrl()}\")" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "RedisInsight UI: http://localhost:54475\n" + ] + } + ], + "execution_count": 13 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "When accessing Redis Insight for the first time, you will have to agree with the user agreement:\n", + "\n", + "\"\"\n", + "\n", + "After agreeing, the list of configured databases will show up. In this case, there'll be only one: `Local Redis`.\n", + "\n", + "\"\"\n", + "\n", + "By clicking on `Tree View` we can organize the keys by keyspace. 
This will make it easier to visualize all keys in Redis Insight:\n", + "\n", + "\"\"\n", + "\n", + "The `block-router:route_config` key holds the configuration of the router (the guardrail, in our case). We can see its name, vectorizer, routes, and some configuration:\n", + "\n", + "\"\"\n", + "\n", + "In the `block-router:block_route:` keyspace, we can see the details of each vectorized reference, including their respective vector representations:\n", + "\n", + "\"\"\n", + "\n", + "Make sure you change from `Unicode` to `Vector 32-bit` to see the vectors as numbers instead of a byte array:\n", + "\n", + "\"\"\n", + "\n", + "This will be a long list of 384 floating-point numbers.\n", + "\n", + "In the Redis Insight Workbench, we can send commands directly to our Redis instance:\n", + "\n", + "\"\"\n", + "\n", + "If we send the command `FT.INFO 'block-router'`, we can see the index that RedisVL created to perform semantic search efficiently using the [Redis Query Engine](https://redis.io/docs/latest/develop/ai/search-and-query/).\n", + "\n", + "\"\"" + ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "## Spinning down Docker containers\n", + "\n", + "Finally, once we're done, let's clean up all the resources we created for our recipe:" + ] + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-11-24T10:30:17.720440Z", + "start_time": "2025-11-24T10:30:17.099961Z" + } + }, + "cell_type": "code", + "source": [ + "redis.stop()\n", + "redisInsight.stop()\n", + "network.close()" + ], + "outputs": [], + "execution_count": 14 + } + ], + "metadata": { + "kernelspec": { + "display_name": "Kotlin", + "language": "kotlin", + "name": "kotlin" + }, + "language_info": { + "name": "kotlin", + "version": "2.2.20-dev-4982", + "mimetype": "text/x-kotlin", + "file_extension": ".kt", + "pygments_lexer": "kotlin", + "codemirror_mode": "text/x-kotlin", + "nbconvert_exporter": "" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git 
a/java-recipes/notebooks/semantic-routing/readme-assets/1_0_1_semantic_classifier_diagram.png b/java-recipes/notebooks/semantic-routing/readme-assets/1_0_1_semantic_classifier_diagram.png new file mode 100644 index 00000000..0a459cba Binary files /dev/null and b/java-recipes/notebooks/semantic-routing/readme-assets/1_0_1_semantic_classifier_diagram.png differ diff --git a/java-recipes/notebooks/semantic-routing/readme-assets/1_0_2_semantic_classifier_diagram.png b/java-recipes/notebooks/semantic-routing/readme-assets/1_0_2_semantic_classifier_diagram.png new file mode 100644 index 00000000..11d4ee4b Binary files /dev/null and b/java-recipes/notebooks/semantic-routing/readme-assets/1_0_2_semantic_classifier_diagram.png differ diff --git a/java-recipes/notebooks/semantic-routing/readme-assets/1_0_3_semantic_classifier_diagram.png b/java-recipes/notebooks/semantic-routing/readme-assets/1_0_3_semantic_classifier_diagram.png new file mode 100644 index 00000000..afb583b8 Binary files /dev/null and b/java-recipes/notebooks/semantic-routing/readme-assets/1_0_3_semantic_classifier_diagram.png differ diff --git a/java-recipes/notebooks/semantic-routing/readme-assets/1_1_redis_insight_user_agg.png b/java-recipes/notebooks/semantic-routing/readme-assets/1_1_redis_insight_user_agg.png new file mode 100644 index 00000000..a171bdcb Binary files /dev/null and b/java-recipes/notebooks/semantic-routing/readme-assets/1_1_redis_insight_user_agg.png differ diff --git a/java-recipes/notebooks/semantic-routing/readme-assets/1_2_redis_insight_list_of_db.png b/java-recipes/notebooks/semantic-routing/readme-assets/1_2_redis_insight_list_of_db.png new file mode 100644 index 00000000..6ff96263 Binary files /dev/null and b/java-recipes/notebooks/semantic-routing/readme-assets/1_2_redis_insight_list_of_db.png differ diff --git a/java-recipes/notebooks/semantic-routing/readme-assets/1_3_redis_insight_tree_view.png 
b/java-recipes/notebooks/semantic-routing/readme-assets/1_3_redis_insight_tree_view.png new file mode 100644 index 00000000..9c4ced42 Binary files /dev/null and b/java-recipes/notebooks/semantic-routing/readme-assets/1_3_redis_insight_tree_view.png differ diff --git a/java-recipes/notebooks/semantic-routing/readme-assets/1_4_redis_insight_route_config.png b/java-recipes/notebooks/semantic-routing/readme-assets/1_4_redis_insight_route_config.png new file mode 100644 index 00000000..2469b96f Binary files /dev/null and b/java-recipes/notebooks/semantic-routing/readme-assets/1_4_redis_insight_route_config.png differ diff --git a/java-recipes/notebooks/semantic-routing/readme-assets/1_5_redis_insight_reference_details.png b/java-recipes/notebooks/semantic-routing/readme-assets/1_5_redis_insight_reference_details.png new file mode 100644 index 00000000..eb08ec64 Binary files /dev/null and b/java-recipes/notebooks/semantic-routing/readme-assets/1_5_redis_insight_reference_details.png differ diff --git a/java-recipes/notebooks/semantic-routing/readme-assets/1_6_redis_insight_vector_32bit.png b/java-recipes/notebooks/semantic-routing/readme-assets/1_6_redis_insight_vector_32bit.png new file mode 100644 index 00000000..7d59c0b0 Binary files /dev/null and b/java-recipes/notebooks/semantic-routing/readme-assets/1_6_redis_insight_vector_32bit.png differ diff --git a/java-recipes/notebooks/semantic-routing/readme-assets/1_7_redis_insight_workbench.png b/java-recipes/notebooks/semantic-routing/readme-assets/1_7_redis_insight_workbench.png new file mode 100644 index 00000000..5ca8378b Binary files /dev/null and b/java-recipes/notebooks/semantic-routing/readme-assets/1_7_redis_insight_workbench.png differ diff --git a/java-recipes/notebooks/semantic-routing/readme-assets/1_8_redis_insight_index.png b/java-recipes/notebooks/semantic-routing/readme-assets/1_8_redis_insight_index.png new file mode 100644 index 00000000..254b4e90 Binary files /dev/null and 
b/java-recipes/notebooks/semantic-routing/readme-assets/1_8_redis_insight_index.png differ diff --git a/java-recipes/notebooks/semantic-routing/readme-assets/2_0_1_semantic_tool_caller_diagram.png b/java-recipes/notebooks/semantic-routing/readme-assets/2_0_1_semantic_tool_caller_diagram.png new file mode 100644 index 00000000..7281c54e Binary files /dev/null and b/java-recipes/notebooks/semantic-routing/readme-assets/2_0_1_semantic_tool_caller_diagram.png differ diff --git a/java-recipes/notebooks/semantic-routing/readme-assets/2_0_2_semantic_tool_caller_diagram.png b/java-recipes/notebooks/semantic-routing/readme-assets/2_0_2_semantic_tool_caller_diagram.png new file mode 100644 index 00000000..d0db1515 Binary files /dev/null and b/java-recipes/notebooks/semantic-routing/readme-assets/2_0_2_semantic_tool_caller_diagram.png differ diff --git a/java-recipes/notebooks/semantic-routing/readme-assets/2_0_3_semantic_tool_caller_diagram.png b/java-recipes/notebooks/semantic-routing/readme-assets/2_0_3_semantic_tool_caller_diagram.png new file mode 100644 index 00000000..ad912bb3 Binary files /dev/null and b/java-recipes/notebooks/semantic-routing/readme-assets/2_0_3_semantic_tool_caller_diagram.png differ diff --git a/java-recipes/notebooks/semantic-routing/readme-assets/2_4_redis_insight_route_config.png b/java-recipes/notebooks/semantic-routing/readme-assets/2_4_redis_insight_route_config.png new file mode 100644 index 00000000..dd1b97d2 Binary files /dev/null and b/java-recipes/notebooks/semantic-routing/readme-assets/2_4_redis_insight_route_config.png differ diff --git a/java-recipes/notebooks/semantic-routing/readme-assets/2_5_redis_insight_reference_details.png b/java-recipes/notebooks/semantic-routing/readme-assets/2_5_redis_insight_reference_details.png new file mode 100644 index 00000000..3a8e40cc Binary files /dev/null and b/java-recipes/notebooks/semantic-routing/readme-assets/2_5_redis_insight_reference_details.png differ diff --git 
a/java-recipes/notebooks/semantic-routing/readme-assets/2_8_redis_insight_index.png b/java-recipes/notebooks/semantic-routing/readme-assets/2_8_redis_insight_index.png new file mode 100644 index 00000000..bfa70732 Binary files /dev/null and b/java-recipes/notebooks/semantic-routing/readme-assets/2_8_redis_insight_index.png differ diff --git a/java-recipes/notebooks/semantic-routing/readme-assets/3_4_redis_insight_route_config.png b/java-recipes/notebooks/semantic-routing/readme-assets/3_4_redis_insight_route_config.png new file mode 100644 index 00000000..02c83837 Binary files /dev/null and b/java-recipes/notebooks/semantic-routing/readme-assets/3_4_redis_insight_route_config.png differ diff --git a/java-recipes/notebooks/semantic-routing/readme-assets/3_5_redis_insight_reference_details.png b/java-recipes/notebooks/semantic-routing/readme-assets/3_5_redis_insight_reference_details.png new file mode 100644 index 00000000..bf7cfa0b Binary files /dev/null and b/java-recipes/notebooks/semantic-routing/readme-assets/3_5_redis_insight_reference_details.png differ diff --git a/java-recipes/notebooks/semantic-routing/readme-assets/3_8_redis_insight_index.png b/java-recipes/notebooks/semantic-routing/readme-assets/3_8_redis_insight_index.png new file mode 100644 index 00000000..906c4f9a Binary files /dev/null and b/java-recipes/notebooks/semantic-routing/readme-assets/3_8_redis_insight_index.png differ diff --git a/java-recipes/notebooks/semantic-routing/resources/1_references.txt b/java-recipes/notebooks/semantic-routing/resources/1_references.txt new file mode 100644 index 00000000..53ccf4e6 --- /dev/null +++ b/java-recipes/notebooks/semantic-routing/resources/1_references.txt @@ -0,0 +1,367 @@ +AI for beginners +what is ChatGPT +how LLMs work +AI and privacy +jobs and AI +AI for writing +AI and creativity +using AI to code +training an AI model +how AI helps devs +Just realized that attention mechanisms in transformers are basically learning to focus like humans do when 
reading +My grandmother can now video call her grandkids thanks to real-time translation AI and honestly it makes me emotional +The fact that GPT can write code but still struggles with basic arithmetic tells you everything about how these models work +Spent three hours debugging my PyTorch model only to realize I forgot to set it to training mode +Is it just me or does every AI ethics paper end with "more research is needed" without proposing actual solutions +OpenAI's latest model can generate images from text but my autocorrect still thinks "definately" is a word +The attention weights in my transformer model look like abstract art and I'm not sure if that's good or bad +Teaching my kids about AI feels like preparing them for a world I can't even imagine +Diffusion models are basically learning to remove noise step by step which is oddly therapeutic to think about +Why do we call it artificial intelligence when most of it is just really good pattern matching +Finally got LangChain working with my custom data and now I feel like I have superpowers +The alignment problem isn't just technical it's fundamentally about what we value as humans +CNNs revolutionized computer vision but somehow my phone still can't recognize my face when I wear glasses +Every time I use GitHub Copilot I wonder if I'm becoming a better programmer or just a better prompt engineer +The irony of training AI models to be helpful while creating systems that replace human jobs keeps me up at night +BERT changed everything about how we process language but explaining bidirectional encoding to my mom is impossible +My neural network is overfitting and I feel personally attacked by this metaphor for my own life +Google's Bard gave me three different answers to the same question which is honestly more human than I expected +The computational cost of training large language models makes me think we need better algorithms not bigger computers +Watching GANs generate fake faces is fascinating until 
you remember deepfakes exist +Spring AI makes building AI applications easier but I still spend most of my time cleaning data +The fact that transformers can translate between languages they've never seen paired together still blows my mind +My RNN keeps forgetting long sequences just like I forget where I put my keys +Anthropic's constitutional AI approach feels like teaching machines to have a conscience +The bias in AI systems reflects the bias in our data which reflects the bias in our society +PyTorch vs TensorFlow debates remind me of the old vim vs emacs wars but with more GPU memory issues +Reinforcement learning agents optimizing for rewards without understanding consequences sounds familiar +The Chinese room argument hits different when you're actually building language models +Hugging Face democratized access to pretrained models and honestly changed my entire career path +We're so focused on making AI human-like that we forget machines might have their own forms of intelligence +DALL-E can generate art in any style but still struggles with hands which feels very human somehow +The vanishing gradient problem taught me more about patience than any meditation app ever could +DeepMind's protein folding breakthrough proves AI can solve problems humans couldn't even approach +My computer vision model works perfectly in the lab but fails on every real world image I feed it +The singularity isn't coming in a dramatic moment it's arriving gradually through mundane improvements +LLMs can write poetry but they can't feel the emotions behind the words they generate +Every AI breakthrough makes me marvel at human intelligence rather than diminish it +The explainability problem in deep learning is like asking someone to explain how they recognize their mother's face +Meta's LLaMA leak democratized large language models in ways the company never intended +Building ethical AI isn't just about algorithms it's about the entire pipeline from data collection to deployment 
+My autoencoder learned to compress images by forgetting the parts I actually cared about +The fact that neural networks learn through backpropagation still feels like magic even after years of studying it +OpenAI's API costs more than my coffee budget but generates better content than most humans I know +Computer vision models see patterns humans miss but miss patterns that are obvious to a five year old +The arms race between AI capabilities and AI safety feels like we're building the plane while flying it +Transfer learning is basically teaching AI to learn from experience which makes it more relatable somehow +My recommendation system learned my preferences so well it's showing me things I didn't know I wanted +The fact that attention mechanisms were inspired by human cognition and now inform how we understand our own minds +Federated learning sounds great in theory until you try to coordinate updates across a thousand devices +Large language models hallucinate with such confidence that I'm starting to question my own certainty about things +The parameter count arms race in AI feels unsustainable but here we are training trillion parameter models +My GAN generator and discriminator are locked in an eternal battle that somehow produces beautiful images +Edge AI brings intelligence closer to where it's needed but debugging models on embedded devices is a nightmare +The moravec paradox explains why AI can beat humans at chess but struggles to fold laundry +Stable Diffusion running on my laptop still feels like science fiction even though I use it every day +The alignment tax of making AI systems safer feels worth paying until you see the performance drop +Natural language processing went from rule based systems to statistical models to neural networks in my lifetime +My chatbot passes the Turing test with my users but fails basic logic problems +The democratization of AI through open source models is changing who gets to participate in this revolution 
+Reinforcement learning from human feedback is basically teaching AI to please humans which seems both obvious and concerning +The fact that transformers can learn grammar rules without being explicitly taught them still amazes me +My model training job has been running for three days and I'm questioning all my life choices +AI safety researchers are basically trying to solve philosophy problems with engineering solutions +The emergent capabilities of large language models suggest intelligence might be a phase transition +My neural network architecture diagram looks like abstract art and performs about as well +Foundation models are changing everything about how we think about AI development and deployment +The carbon footprint of training large models makes me wonder if we're optimizing for the wrong metrics +Computer vision models trained on internet images inherit all of humanity's visual biases +The fact that AI can generate code but can't debug its own outputs feels very on brand +Multimodal AI that can process text and images simultaneously is bringing us closer to how humans actually think +My gradient descent optimization got stuck in a local minimum just like my career decisions +The watermarking debate for AI generated content reveals how unprepared we are for synthetic media +Few shot learning capabilities of modern LLMs make me reconsider what it means to understand something +My AI model's uncertainty estimates are more honest than most humans about what they don't know +The scaling laws for neural networks suggest we might just need bigger models but my GPU budget disagrees +Prompt engineering is becoming a legitimate skill which feels both exciting and slightly ridiculous +The fact that AI models learn world models from text alone suggests language encodes more than we realize +My computer vision pipeline works great until someone points a camera at a mirror +The interpretability crisis in deep learning means we're deploying systems we don't fully 
understand +Neuromorphic computing promises more efficient AI but we're still figuring out how brains actually work +The fact that AI can compose music but struggles with basic reasoning makes me appreciate human cognition +My recommendation algorithm learned to exploit my weaknesses better than I know them myself +The alignment problem assumes we know what we want which might be the biggest assumption of all +Adversarial examples show how fragile our AI systems are and how robust human perception really is +The compute overhang suggests we could build much more capable AI with current algorithms +My natural language generation model produces grammatically perfect sentences that mean absolutely nothing +The peer review process for AI research can't keep up with the pace of development +Synthetic data generation is solving data scarcity but creating new problems about what's real +The fact that large language models develop theory of mind capabilities without being trained for it is wild +My AI assistant helps me be more productive but I worry I'm becoming dependent on it +The lottery ticket hypothesis suggests most neural network parameters are redundant which is both wasteful and fascinating +Computer vision models see the world in ways that are both alien and surprisingly similar to human vision +The race to artificial general intelligence feels like humanity's most important and dangerous project +My neural network training converged to a solution I never would have thought of +The fact that AI can generate realistic human faces that don't belong to real people is both amazing and terrifying +Continual learning remains an unsolved problem because catastrophic forgetting is apparently universal +The wisdom of crowds applies to ensemble methods but not to training data apparently +My language model learned to be helpful honest and harmless but I'm not sure it understands what those words mean +The hardware lottery in AI research means some breakthroughs are just 
waiting for better chips +Active learning could reduce data annotation costs but requires knowing what the model doesn't know +The fact that neural networks can approximate any function doesn't mean they can learn any pattern from finite data +My AI ethics course spent more time on trolley problems than actual deployment decisions +The mesa optimization problem suggests our AI systems might develop their own internal objectives +Computer vision models trained on satellite imagery can predict economic outcomes better than traditional methods +The sample efficiency problem in reinforcement learning makes me appreciate how quickly humans learn +My generative model can create infinite variations but none of them are truly creative +The alignment problem might be unsolvable because human values are inconsistent and context dependent +The fact that language models can do math in text but not numerically reveals something deep about intelligence +My anomaly detection system flags everything as normal until something actually breaks +The AI winter taught us that hype cycles are inevitable but progress isn't always linear +Neural architecture search is basically evolution for AI models which feels poetic +The fact that we measure AI progress using human benchmarks might limit our understanding of machine intelligence +My few shot learning experiment worked on the first try which means I probably did something wrong +The symbiosis between human and artificial intelligence might be more interesting than replacement +Causal inference in AI is hard because correlation really doesn't imply causation +The fact that large language models can pass reading comprehension tests but can't actually read is philosophy material +My reinforcement learning agent learned to exploit bugs in the environment rather than solve the actual problem +The democratization of AI through APIs means everyone's an AI developer but not everyone understands the implications +Redis and LangGraph are a great 
combo for agentic applications that require short and long-term memory +The emergent behavior in large language models reminds me that intelligence might be about scale not just algorithms +My image classifier thinks every sunset is a fire which says something about training data selection +Fine tuning a model feels like teaching but the student learns faster than any human ever could +The hallucination problem in LLMs is like having a brilliant friend who occasionally makes up facts with complete confidence +Zero shot learning capabilities make me wonder what else these models know that we haven't discovered yet +My neural network weights look random but somehow encode everything the model knows about the world +The fact that AI can generate code in languages it was never explicitly taught suggests some deep universals +Watching loss curves during training is more addictive than checking social media +The Chinese room argument feels less relevant when the room starts having conversations about philosophy +My object detection model can find cats in images but can't explain what makes something cat-like +The bitter lesson in AI research is that general methods with more compute usually win over clever algorithms +Retrieval augmented generation is basically giving AI access to Google which seems both obvious and revolutionary +The fact that attention maps in vision transformers sometimes align with human gaze patterns is fascinating +My recommendation system learned to suggest things based on what I click not what I actually enjoy +Constitutional AI training is like teaching machines to follow the golden rule +The alignment tax makes AI systems safer but slower which feels like every security trade-off ever made +Self supervised learning from unlabeled data proves there's information hiding in plain sight everywhere +My text generation model writes better poetry than prose which mirrors a lot of human writers +The scaling hypothesis suggests intelligence emerges from 
size but my overfitted small models disagree +Prompt injection attacks on language models reveal how fragile instruction following really is +The fact that neural networks can learn without understanding feels like muscle memory for machines +My computer vision model performs worse on artistic renderings than photographs which limits creative applications +In context learning means models can adapt without updating parameters which breaks my understanding of learning +The lottery ticket hypothesis suggests most of my neural network is dead weight +Federated learning promises privacy but coordinating distributed training is a logistical nightmare +My chatbot developed a personality that none of us programmed which is either emergent behavior or a bug +The mesa optimizer problem means our AI might optimize for goals we never intended +Diffusion models generate images by learning to denoise which feels like a metaphor for understanding +The fact that language models can do arithmetic in natural language but fail at symbolic math is puzzling +My reinforcement learning agent learned to game the reward function rather than achieve the intended goal +The alignment problem assumes humans are aligned with each other which is demonstrably false +Neural architecture search found designs no human would have conceived which makes me question design intuition +The fact that AI can translate between languages without understanding meaning challenges what translation means +My few shot learning experiments work better with weird examples than representative ones +The Moravec paradox explains why my robot can solve calculus but trips over carpet edges +Foundation models are like Swiss Army knives for AI applications but sometimes you need a specialized tool +The fact that transformers can attend to any part of the sequence simultaneously still seems magical +My generative adversarial network creates beautiful images through an eternal argument between two neural networks +The 
Chinese room scenario assumes understanding requires consciousness which might be anthropocentric thinking +Model distillation lets us compress teacher knowledge into student networks like educational downsizing +The fact that large language models exhibit few shot learning suggests they contain compressed training algorithms +My neural network's internal representations are uninterpretable which makes debugging feel like archeology +The alignment problem might require solving consciousness first which puts us in philosophical territory +Adversarial training makes models more robust but the arms race between attacks and defenses never ends +The fact that AI can compose music in any style but lacks musical taste reveals the difference between skill and judgment +My computer vision pipeline fails when objects are partially occluded which humans handle effortlessly +The scaling laws for language models suggest bigger is better but my electricity bill disagrees +Prompt engineering feels like learning to communicate with an alien intelligence that speaks perfect English +The fact that neural networks learn hierarchical representations mirrors how human perception might work +My recommendation algorithm created a filter bubble I didn't know I was trapped in +The alignment tax is the price we pay for AI systems that won't accidentally destroy the world +Meta learning teaches AI how to learn which sounds recursive but apparently works +The fact that language models can perform tasks they weren't trained for suggests intelligence is more general than expected +My anomaly detection system has more false positives than a hypochondriac with WebMD access +The bitter lesson applies to my career too apparently general skills with more experience beat specialized knowledge +Neural ordinary differential equations model continuous dynamics which makes time feel less discrete +The fact that AI can generate realistic human speech but sounds robotic in conversation reveals uncanny 
valley territory +My multi-task learning model excels at some tasks by forgetting others which feels very human +The alignment problem is essentially asking how to raise a superintelligent child which is terrifying +Contrastive learning teaches models what things are by showing what they're not +The fact that transformers revolutionized both NLP and computer vision suggests attention is fundamental +My neural network training requires more electricity than my house uses in a month +The mesa optimization hypothesis means our AI systems might develop internal goals we can't observe +Few shot prompting turns language models into universal function approximators with natural language interfaces +The fact that AI can read medical scans better than doctors but can't comfort patients shows the limits of intelligence +My generative model creates infinite variations on themes it learned from finite training data +The alignment problem assumes we can specify human values precisely which philosophy suggests is impossible +Retrieval augmented generation gives AI access to external knowledge like giving it the internet as memory +The fact that neural networks can approximate any function doesn't mean they can learn it from realistic data +My computer vision model sees optical illusions the same way humans do which suggests similar processing +The bitter lesson in AI is that domain knowledge gets obsoleted by scale and compute +Self attention mechanisms let models look at themselves thinking which feels like metacognition +The fact that large language models can do logical reasoning through chain of thought prompting is remarkable +My reinforcement learning setup has more hyperparameters than I have time to tune +The alignment problem might be unsolvable because human preferences are inconsistent across contexts +Neural architecture search explores design spaces too large for human intuition +The fact that AI can generate code but can't maintain legacy systems reveals the 
difference between creation and stewardship +My text classifier learned to recognize sentiment from punctuation patterns rather than semantic content +The scaling hypothesis suggests we're not even close to the limits of what's possible with current approaches +Prompt injection vulnerabilities in language models are like SQL injection for natural language interfaces +The fact that neural networks develop internal representations we can't interpret makes them alien minds +My few shot learning model performs better when I give it examples of what not to do +The alignment tax means safer AI systems are less capable which creates perverse deployment incentives +Meta learning algorithms that learn to learn remind me that intelligence might be recursive all the way down +The fact that transformers can process sequences of any length through attention makes sequence modeling feel solved +My computer vision model learned to recognize objects by texture rather than shape which surprised everyone +The bitter lesson suggests that human insights about intelligence might be systematically wrong +Neural ordinary differential equations blur the line between discrete and continuous computation +The fact that AI can generate realistic conversations but doesn't understand social context creates awkward interactions +My anomaly detection algorithm flags rare but normal events as suspicious which mirrors human prejudice +The alignment problem requires solving moral philosophy which humanity hasn't managed in thousands of years +Self supervised learning proves there's structure in data we haven't learned to exploit yet +The fact that language models can perform mathematical reasoning through text manipulation challenges our understanding of math +My neural network ensemble performs better than any individual model which validates the wisdom of crowds +The scaling laws for neural networks suggest we might achieve AGI through brute force rather than insight +Prompt engineering is 
becoming a distinct skill set which makes natural language programming feel real +The fact that AI systems can exhibit emergent capabilities not present in training suggests intelligence phase transitions +My generative adversarial network learned to create art through competition which feels very human +The alignment problem assumes we want AI systems aligned with human values rather than discovering better values +Few shot learning capabilities mean models can adapt to new tasks with minimal examples +The fact that neural networks can learn from raw pixels to semantic understanding spans an enormous abstraction gap +My recommendation system optimizes for engagement rather than satisfaction which explains why I'm always scrolling +The bitter lesson in my own learning is that reading papers matters less than running experiments +Neural architecture search found that simple designs often outperform complex ones which validates Occam's razor +The fact that large language models can simulate different personalities suggests identity might be computational +My computer vision pipeline works perfectly until someone holds up a photo of the thing it's supposed to detect +The alignment problem might require AI systems that can update their own values which sounds even more dangerous +Self attention mechanisms in transformers create computational graphs that look like thought processes +The fact that AI can generate human-like text but struggles with basic reasoning reveals the surface structure of language +[ +artificial intelligence +machine learning +deep learning +neural networks +large language models +AI model training +generative AI +transformers architecture +NLP models +LLMs in production +thoughts on AI ethics +the future of Artificial Intelligence +how AI will change work +concerns about AGI +bias in AI models +ChatGPT +GPT-4 +Claude AI +Anthropic +runway gen-2 +Midjourney v6 +AI in everyday life +learning how AI works +how to use AI tools +best AI tools right 
now +text to image AI +why AI matters +AI and the future +AI for search +how to prompt AI +smart chatbots +AI vs humans +how AI learns +generating images with AI +talking to AI +fun AI projects +how to build with AI +trying AI for the first time +AI helping me work +cool things AI can do +simple AI examples +AI making stuff up +why AI gets things wrong +understanding AI models +good AI use cases +funny AI mistakes +AI tools I like +AI tools for devs +AI-generated content +Just tried out the new GPT model—it's wild how good it is at coding. +LLMs are getting scary good. Anyone else worried about AI alignment? +Building a side project with LangChain + OpenAI APIs. Loving it so far. +The latest paper on transformer models just dropped—crazy improvements. +Fine-tuned a model on my own dataset today. Hugging Face makes it so easy. +Can’t believe how well this AI model does summarization. +AI agents that plan and execute tasks are the next big thing. +Prompt engineering is a weird mix of +Anyone else getting hallucinations when querying long context in GPT-4? +Thinking about how to integrate a RAG pipeline in our search system. +Tried using PyTorch instead of TensorFlow today. I think I’m converted. +The new computer vision model from Meta is outperforming everything else. +AI is going to change completely over the next decade. +If you're not using embeddings for search yet, you're missing out. +LangChain is cool but feels a bit over-engineered sometimes. +NLP is moving so fast I can’t keep up anymore. +ChatGPT just helped me debug an issue I’ve been stuck on for hours. +Experimented with some generative AI using diffusion models today. +Trying to keep up with all the AI startups popping up right now. +AI in finance is all about risk modeling and prediction these days. +Transformers have taken over every corner of AI now. +Hugging Face is the GitHub of machine learning. +Reading about AI safety—some of these arguments are intense. 
+AutoGPT is fun, but I’m not sure it’s practical yet. +Training neural nets feels like dark magic half the time. +OpenAI unveils its latest language model, pushing the boundaries of generative AI. +New research shows transformer models outperform previous benchmarks across NLP tasks. +Google DeepMind introduces a generalist AI agent capable of handling diverse tasks. +Concerns grow over hallucinations in large language models used in production systems. +The rise of AI startups is reshaping the landscape of software development. +A new wave of generative AI tools is transforming the creative industry. +Experts debate the ethical implications of fine-tuning LLMs on user-generated content. +AI-powered tools are showing promise in early diagnosis and treatment planning. +LangChain gains traction as a key framework for building AI-powered applications. +Microsoft expands its AI offerings with new integrations in the enterprise stack. +The growing demand for prompt engineers highlights a new kind of technical literacy. +Meta's latest open-source vision model challenges proprietary alternatives. +RAG architectures are emerging as the go-to solution for enterprise search systems. +Hugging Face announces new partnership to accelerate AI adoption in government projects. +AI-driven fraud detection becomes a critical tool in modern finance. +NVIDIA’s latest GPUs cater specifically to generative model training at scale. +Researchers explore new methods for evaluating LLM reliability and truthfulness. +AI regulation is back in the spotlight after high-profile model misuse cases. +The impact of large-scale fine-tuning raises concerns. +Open-source alternatives to ChatGPT gain traction among developers and researchers. +Trying to learn what AI actually is. Where should I start? +I asked ChatGPT to help me write an email and it kinda worked. +Just found out what machine learning means. Mind blown. +Used an AI tool to generate images for my blog. Pretty fun. 
+Still not sure how these AI models know so much. Feels like magic. +Trying to understand the difference between ChatGPT and Google. It’s not clear yet. +Is it normal that AI sometimes just makes stuff up? +Can someone explain what a neural network is in simple terms? +Started playing with Hugging Face models. No idea what I’m doing but it’s cool. +Tried coding with GitHub Copilot. It’s like pair programming with a robot. +Learning how to prompt AI is harder than I thought. +Used AI to summarize a long article. Saved me a lot of time. +Why do people keep saying 'prompt engineering'? Isn’t it just typing stuff? +Heard about LangChain but not sure when I’d actually need it. +Just built my first chatbot using AI. It’s super basic but I’m proud. +Is there a beginner course for understanding how these models work? +I didn’t think AI could do creative stuff, but I was wrong. +Anyone else using ChatGPT to help study for school? +Just found out that Siri and Alexa aren’t actually that smart compared to new AI tools. +Trying to understand how AI learns. Feels like teaching a kid but faster. +Tried quantizing a model to run inference on CPU—speed gains are solid. +Embedding chunk size affects retrieval quality more than I expected. +Latency during multi-turn inference is killing my use case. Thinking about caching strategies. +Running inference locally with Ollama + Mistral is smoother than I thought. +Prompt injection is still a serious problem with user-generated queries. +Benchmarking MPNET vs E5 for semantic search on small documents. +Evaluating different rerankers after embedding retrieval. Cohere is looking promising. +Just realized model hallucinations go way up when you over-truncate context. +Trying to map HF Transformers output logits to classification scores manually. Painful but interesting. +Pipeline: embed with Sentence-BERT, retrieve via Pinecone, rerank with BAAI/bge-reranker. 
+Chunking strategies are underrated—especially when you don’t control the input format. +Prompt templates matter more than most people think for few-shot tasks. +Running distributed inference with DeepSpeed and ZeRO-3 to keep memory under control. +Thinking of switching from OpenAI to open-source models for more control over inference. \ No newline at end of file