diff --git a/python-recipes/semantic-cache/04_langcache_semantic_caching.ipynb b/python-recipes/semantic-cache/04_langcache_semantic_caching.ipynb new file mode 100644 index 00000000..98a84f5b --- /dev/null +++ b/python-recipes/semantic-cache/04_langcache_semantic_caching.ipynb @@ -0,0 +1,1459 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![Redis](https://redis.io/wp-content/uploads/2024/04/Logotype.svg?auto=webp&quality=85,75&width=120)\n", + "\n", + "# LangCache: Semantic Caching with Redis Cloud\n", + "\n", + "This notebook demonstrates end-to-end semantic caching using **LangCache** - a managed Redis Cloud service accessed through the RedisVL library. LangCache provides enterprise-grade semantic caching with zero infrastructure management, making it ideal for production LLM applications.\n", + "\n", + "\"Open\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Introduction\n", + "\n", + "**LangCache** is a fully managed semantic cache service built on Redis Cloud. It was integrated into RedisVL in version 0.11.0 as an `LLMCache` interface implementation, making it easy for RedisVL users to:\n", + "\n", + "- Transition to a fully managed caching service\n", + "- Reduce LLM API costs by caching similar queries\n", + "- Improve application response times\n", + "- Access enterprise features without managing infrastructure\n", + "\n", + "### What You'll Learn\n", + "\n", + "In this tutorial, you will:\n", + "1. Set up LangCache with Redis Cloud\n", + "2. Load and process a knowledge base (PDF documents)\n", + "3. Generate FAQs using the Doc-to-Cache technique\n", + "4. Pre-populate a semantic cache with tagged FAQs\n", + "5. Test different cache matching strategies and thresholds\n", + "6. Integrate the cache into a RAG pipeline\n", + "7. Measure performance improvements\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 1. 
Environment Setup\n", + "\n", + "First, we'll install the required packages and set up our environment.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Install Required Packages\n" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.0\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m25.3\u001b[0m\n", + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n", + "Note: you may need to restart the kernel to use updated packages.\n", + "\n", + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.0\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m25.3\u001b[0m\n", + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n", + "Note: you may need to restart the kernel to use updated packages.\n" + ] + } + ], + "source": [ + "%pip install -q \"redisvl>=0.11.0\" \"langcache\" \"sentence-transformers\"\n", + "%pip install -q \"pypdf\" \"openai>=1.0.0\" \"langchain>=0.3.0\" \"langchain-community\" \"langchain-openai\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Import Dependencies\n" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "import time\n", + "import json\n", + "from typing import List, Dict, Any\n", + "\n", + "# RedisVL imports\n", + "from redisvl.extensions.cache.llm import LangCacheSemanticCache" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 2. LangCache setup\n", + "\n", + "### Sign up for LangCache\n", + "\n", + "If you haven't already, sign up for a free Redis Cloud account:\n", + "\n", + "**[Log in or sign up for Redis Cloud →](https://cloud.redis.io/#/)**\n", + "\n", + "After signing up:\n", + "1. Create a new database\n", + "2. Create a new LangCache service (Select 'LangCache' on the left menu bar)\n", + "3. Copy your **API Key**\n", + "4. Copy your **Cache ID**\n", + "5. Copy your **URL**\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Configure Environment Variables\n", + "You'll need the LangCache API Key, Cache ID, URL\n", + "You will also need access to an LLM. 
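One way to provide these values without hardcoding them in the notebook is to set the environment variables up front. A minimal sketch (the variable names match the cells below):\n",
+    "\n",
+    "```python\n",
+    "import os\n",
+    "from getpass import getpass\n",
+    "\n",
+    "# Prompt for secrets instead of pasting them into code cells\n",
+    "os.environ[\"LANGCACHE_API_KEY\"] = getpass(\"LangCache API key: \")\n",
+    "os.environ[\"LANGCACHE_ID\"] = input(\"LangCache cache ID: \")\n",
+    "os.environ[\"OPENAI_API_KEY\"] = getpass(\"OpenAI API key: \")\n",
+    "```\n",
+    "\n",
+    "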
In this notebook we'll be using OpenAI" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Initialize Semantic Cache with LangCache-Embed Model\n", + "\n", + "We'll create a cache instance using the `redis/langcache-embed-v1` model, which is specifically optimized for semantic caching tasks.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "wy4ECQMIVUCcYGbZr_Lg007Cifh4GkgiIRNAf3S4ITMWQ4puuq-OStyjMvH-iD1m0oIB6hg5EVYQye5r1xajEFL7e0AUw5Gn_UEksTQdSm-Hwzu3wXsJJ4emhp8OopEJfHx6JnPlW36LDkCf6ne4Kj8CWiQkphQHqaEeKV9mdgbml-8qOv19AFr0y5vmTtkU_Xt5ByfGMTO-mI9wMKXNLOfwZixM1kiE8KAL_JM7dJN_EHQh\n", + "50eb6a09acf5415d8b68619b1ccffd9a\n", + "https://aws-us-east-1.langcache.redis.io\n" + ] + } + ], + "source": [ + "langcache_api_key = os.environ.get('LANGCACHE_API_KEY') # found on your cloud console\n", + "langcache_id = os.environ.get('LANGCACHE_ID') # found on your cloud console\n", + "server_url = \"https://aws-us-east-1.langcache.redis.io\" # found on your cloud console\n", + "\n", + "\n", + "print(langcache_api_key)\n", + "print(langcache_id)\n", + "print(server_url)\n", + "\n", + "# Create Semantic Cache instance\n", + "cache = LangCacheSemanticCache(\n", + " server_url=server_url,\n", + " cache_id=langcache_id,\n", + " api_key=langcache_api_key,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "10:35:45 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "[]\n", + "10:35:45 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:35:45 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "[{'entry_id': '5eb63bbbe01eeed093cb22bb8f5acdc3', 'prompt': 'hello world', 'response': 'hello world from langcache', 'vector_distance': 0.07242219999999999, 'inserted_at': 0.0, 'updated_at': 0.0}]\n" + ] + } + ], + "source": [ + "# Check your cache is working\n", + "r = cache.check('hello world')\n", + "print(r) # should be empty on first run\n", + "\n", + "cache.store('hello world', 'hello world from langcache')\n", + "result = cache.check('hi world')\n", + "\n", + "print(result)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# RAG with semantic caching\n", + "\n", + "Now that we have a working semantic cache service running and we're connected to it, let's use it in an application.\n", + "\n", + "We'll build a simple Retrieval Augmented Generation (RAG) app using a PDF of NVidia's 2023 10k filing report.\n", + "\n", + "To get the full benefit of semantic caching we'll preload our cache with Frequently Asked Questions (FAQs) generated by an LLM about our PDF." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 3. Generate FAQs Using Doc-to-Cache Technique\n", + "\n", + "The Doc-to-Cache approach uses an LLM to generate frequently asked questions from document chunks. These FAQs are then used to pre-populate the semantic cache with high-quality, factual responses.\n", + "\n", + "We'll work with three types of data:\n", + "1. **Knowledge Base**: PDF document(s) that contain factual information\n", + "2. 
**FAQs**: Derived from the knowledge base using Doc-to-Cache technique\n", + "3. **Test Dataset**: For evaluating and optimizing cache performance\n" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/Users/justin.cechmanek/.pyenv/versions/3.11.9/envs/redis-ai-res/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", + " from .autonotebook import tqdm as notebook_tqdm\n" + ] + } + ], + "source": [ + "# LangChain imports\n", + "from langchain_community.document_loaders import PyPDFLoader\n", + "from langchain_text_splitters import RecursiveCharacterTextSplitter\n", + "from langchain_openai import ChatOpenAI\n", + "from langchain_core.prompts import PromptTemplate, ChatPromptTemplate\n", + "from langchain_core.output_parsers import JsonOutputParser\n", + "\n", + "from pydantic import BaseModel, Field\n", + "import getpass" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [], + "source": [ + "if \"OPENAI_API_KEY\" not in os.environ:\n", + " os.environ[\"OPENAI_API_KEY\"] = getpass(\"Enter your OpenAI API key: \")\n", + "\n", + "# Initialize OpenAI LLM for FAQ generation and RAG\n", + "llm = ChatOpenAI(\n", + " model=\"gpt-4o-mini\",\n", + " temperature=0.3,\n", + " max_tokens=2000\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Load PDF Knowledge Base\n" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [], + "source": [ + "# Download sample PDF if not already present\n", + "!mkdir -p data\n", + "!wget -q -O data/nvidia-10k.pdf https://raw.githubusercontent.com/redis-developer/redis-ai-resources/main/python-recipes/RAG/resources/nvd-10k-2023.pdf" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Loaded PDF: data/nvidia-10k.pdf\n", + " Total pages: 169\n", + " Created chunks: 388\n", + "\n", + "Sample chunk preview:\n", + "Table of Contents\n", + "The world’s leading cloud service providers, or CSPs, and consumer internet companies use our GPUs and broader data center-scale\n", + "accelerated computing platforms to enable, accelerate or enrich the services they deliver to billions of end-users, including search,\n", + "recommendations, so...\n" + ] + } + ], + "source": [ + "# Load and chunk the PDF\n", + "pdf_path = \"data/nvidia-10k.pdf\"\n", + "\n", + "# Configure text splitter for optimal chunk sizes\n", + "text_splitter = RecursiveCharacterTextSplitter(\n", + " chunk_size=2000,\n", + " chunk_overlap=200,\n", + " separators=[\"\\n\\n\", \"\\n\", \". 
\", \" \", \"\"]\n", + ")\n", + "\n", + "# Load and split the document\n", + "loader = PyPDFLoader(pdf_path)\n", + "documents = loader.load()\n", + "chunks = text_splitter.split_documents(documents)\n", + "\n", + "print(f\"Loaded PDF: {pdf_path}\")\n", + "print(f\" Total pages: {len(documents)}\")\n", + "print(f\" Created chunks: {len(chunks)}\")\n", + "print(f\"\\nSample chunk preview:\")\n", + "print(f\"{chunks[10].page_content[:300]}...\")\n" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [], + "source": [ + "# Define the FAQ data model\n", + "class QuestionAnswer(BaseModel):\n", + " question: str = Field(description=\"A frequently asked question derived from the document content\")\n", + " answer: str = Field(description=\"A factual answer to the question based on the document\")\n", + " category: str = Field(description=\"Category of the question (e.g., 'financial', 'products', 'operations')\")\n", + "\n", + "class FAQList(BaseModel):\n", + " faqs: List[QuestionAnswer] = Field(description=\"List of question-answer pairs extracted from the document\")\n", + "\n", + "# Set up JSON output parser\n", + "json_parser = JsonOutputParser(pydantic_object=FAQList)\n" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "FAQ generation chain configured\n" + ] + } + ], + "source": [ + "# Create the FAQ generation prompt\n", + "faq_prompt = PromptTemplate(\n", + " template=\"\"\"You are a document analysis expert. Extract 3-5 high-quality FAQs from the following document chunk.\n", + "\n", + "Guidelines:\n", + "- Generate diverse, specific questions that users would realistically ask\n", + "- Provide accurate, complete answers based ONLY on the document content\n", + "- Assign each FAQ to a category: 'financial', 'products', 'operations', 'technology', or 'general'\n", + "- Avoid vague or overly generic questions\n", + "- If the chunk lacks substantial content, return fewer FAQs\n", + "\n", + "{format_instructions}\n", + "\n", + "Document Chunk:\n", + "{doc_content}\n", + "\n", + "FAQs JSON:\"\"\",\n", + " input_variables=[\"doc_content\"],\n", + " partial_variables={\"format_instructions\": json_parser.get_format_instructions()}\n", + ")\n", + "\n", + "# Create the FAQ generation chain\n", + "faq_chain = faq_prompt | llm | json_parser\n", + "\n", + "print(\"FAQ generation chain configured\")\n" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Testing FAQ generation on sample chunk...\n", + "\n", + "10:36:14 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "Generated 5 FAQs:\n", + "\n", + "1. Q: What industries are leveraging NVIDIA's GPUs for automation?\n", + " Category: operations\n", + " A: A rapidly growing number of enterprises and startups across a broad range of industries, including transportation for autonomous driving, healthcare f...\n", + "\n", + "2. Q: What was the reason for the termination of the Arm Share Purchase Agreement?\n", + " Category: general\n", + " A: The Share Purchase Agreement between NVIDIA and SoftBank Group Corp. was terminated due to significant regulatory challenges that prevented the comple...\n", + "\n", + "3. 
Q: What are some applications of NVIDIA's GPUs in professional design?\n", + " Category: products\n", + " A: Professional designers use NVIDIA's GPUs and software to create visual effects in movies and to design buildings and products ranging from cell phones...\n" + ] + } + ], + "source": [ + "# Test FAQ generation on a single chunk\n", + "print(\"Testing FAQ generation on sample chunk...\\n\")\n", + "test_faqs = faq_chain.invoke({\"doc_content\": chunks[10].page_content})\n", + "\n", + "print(f\"Generated {len(test_faqs.get('faqs', []))} FAQs:\")\n", + "for i, faq in enumerate(test_faqs.get('faqs', [])[:3], 1):\n", + " print(f\"\\n{i}. Q: {faq['question']}\")\n", + " print(f\" Category: {faq['category']}\")\n", + " print(f\" A: {faq['answer'][:150]}...\")\n" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "Generating FAQs from document chunks...\n", + "\n", + "Processing chunk 1/25...\n", + "10:36:23 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:36:34 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:36:43 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:36:52 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:36:58 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "Processing chunk 6/25...\n", + "10:37:10 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:37:14 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:37:21 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:37:29 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:37:38 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "Processing chunk 11/25...\n", + "10:37:49 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:37:57 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:38:06 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:38:14 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:38:23 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "Processing chunk 16/25...\n", + "10:38:32 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:38:39 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:38:57 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:39:05 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:39:15 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "Processing chunk 21/25...\n", + "10:39:23 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:39:31 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + 
"10:39:40 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:39:47 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:39:54 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "\n", + "Generated 113 FAQs total\n", + "\n", + "Category distribution:\n", + " technology: 29\n", + " products: 27\n", + " financial: 20\n", + " operations: 20\n", + " general: 17\n" + ] + } + ], + "source": [ + "# Generate FAQs from all chunks (limited to first 25 for demo purposes)\n", + "def extract_faqs_from_chunks(chunks: List[Any], max_chunks: int = 25) -> List[Dict]:\n", + " \"\"\"Extract FAQs from document chunks using LLM.\n", + " \n", + " chunks: list of document chunks\n", + " max_chunks: maximum number of chunks to process\n", + " \n", + " Returns: A list of question-answer pairs\n", + " \"\"\"\n", + " all_faqs = []\n", + "\n", + " for i, chunk in enumerate(chunks[:max_chunks]):\n", + " if i % 5 == 0:\n", + " print(f\"Processing chunk {i+1}/{min(len(chunks), max_chunks)}...\", flush=True)\n", + "\n", + " try:\n", + " result = faq_chain.invoke({\"doc_content\": chunk.page_content})\n", + " if result and result.get(\"faqs\"):\n", + " all_faqs.extend(result[\"faqs\"])\n", + " except Exception as e:\n", + " print(f\" Warning: Skipped chunk {i+1} due to error: {str(e)[:100]}\")\n", + " continue\n", + "\n", + " return all_faqs\n", + "\n", + "# Extract FAQs\n", + "print(\"\\nGenerating FAQs from document chunks...\\n\")\n", + "faqs = extract_faqs_from_chunks(chunks, max_chunks=25)\n", + "\n", + "print(f\"\\nGenerated {len(faqs)} FAQs total\")\n", + "print(f\"\\nCategory distribution:\")\n", + "categories = {}\n", + "for faq in faqs:\n", + " cat = faq.get('category', 'unknown')\n", + " categories[cat] = categories.get(cat, 0) + 1\n", + "for cat, count in sorted(categories.items(), key=lambda x: x[1], reverse=True):\n", + " print(f\" {cat}: {count}\")\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 4. Pre-load semantic cache with FAQs\n", + "\n", + "Now we'll populate the cache instance with our generated FAQs. 
We'll use the `store()` API with metadata tags for filtering and organization.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Storing FAQs in cache...\n", + "\n", + " Stored 0/113 FAQs...\n", + "10:39:54 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:54 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:54 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:54 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:55 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:55 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:55 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:55 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:55 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:55 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:55 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:56 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:56 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:56 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:56 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:56 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:56 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:56 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:56 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:57 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 
Created\"\n", + " Stored 20/113 FAQs...\n", + "10:39:57 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:57 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:57 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:57 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:57 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:57 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:57 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:58 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:58 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:58 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:58 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:58 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:58 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:58 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:58 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:59 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:59 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:59 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:59 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:59 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + " Stored 40/113 FAQs...\n", + "10:39:59 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:59 httpx INFO HTTP Request: POST 
https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:59 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:00 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:00 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:00 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:00 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:00 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:00 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:00 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:00 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:01 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:01 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:01 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:01 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:01 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:01 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:01 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:01 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:02 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + " Stored 60/113 FAQs...\n", + "10:40:02 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:02 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:02 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries 
\"HTTP/1.1 201 Created\"\n", + "10:40:02 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:02 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:02 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:02 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:03 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:03 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:03 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:03 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:03 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:03 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:03 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:03 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:04 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:04 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:04 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:04 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:04 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + " Stored 80/113 FAQs...\n", + "10:40:04 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:04 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:04 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:05 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:05 httpx INFO HTTP Request: POST 
https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:05 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:05 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:05 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:05 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:05 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:05 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + " Stored 100/113 FAQs...\n", + "10:40:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries 
\"HTTP/1.1 201 Created\"\n", + "10:40:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "\n", + "Stored 113 FAQs in cache\n", + "\n", + "Example cache entries:\n", + "\n", + "1. Key: eb461a36940c04a1d307d33a595188af\n", + " Q: What is the fiscal year end date for NVIDIA Corporation as reported in the Form 10-K?...\n", + "\n", + "2. Key: 0daa2589e67ab291d447f3e103435706\n", + " Q: What is the trading symbol for NVIDIA Corporation's common stock?...\n" + ] + } + ], + "source": [ + "# Store FAQs in cache with metadata tags\n", + "print(\"Storing FAQs in cache...\\n\")\n", + "\n", + "stored_count = 0\n", + "cache_keys = {} # Map questions to their cache keys\n", + "\n", + "for i, faq in enumerate(faqs):\n", + " if i % 20 == 0:\n", + " print(f\" Stored {i}/{len(faqs)} FAQs...\", flush=True)\n", + "\n", + " try:\n", + " # Store with metadata - note that metadata is stored but not used for filtering in basic SemanticCache\n", + " key = cache.store(prompt=faq['question'], response=faq['answer'], metadata={'category': faq['category']})\n", + " cache_keys[faq['question']] = key\n", + " stored_count += 1\n", + " except Exception as e:\n", + " print(f\" Warning: Failed to store FAQ {i+1}: {str(e)[:100]}\")\n", + "\n", + "print(f\"\\nStored {stored_count} FAQs in cache\")\n", + "\n", + "print(f\"\\nExample cache entries:\")\n", + "for i, (q, k) in enumerate(list(cache_keys.items())[:2], 1):\n", + " print(f\"\\n{i}. Key: {k}\")\n", + " print(f\" Q: {q[:150]}...\")\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 5. Evaluating our semantic cache\n", + "Now that we have a semantic cache populated with question answer pairs we can evaluate its effectiveness.\n", + "\n", + "Unlike standard caching that uses exact key:value look ups, semantic caches relies on the notion of semantic embedding similarity.\n", + "The benefits of semantic matching are that similar questions such as, \"who is the king of England?\", and, \"who is the monarch of Britain?\" can be matched together.\n", + "This flexibility comes at the cost of occasional mismatches. 
A question like, \"who is the queen of England?\" is also similar and likely to match.\n", + "\n", + "Let's create a dataset of test questions to see which match and which don't to evaluate our cache hit rate, and our accuracy.\n", + "### Create test/evaluation dataset\n", + "\n", + "We'll create a test dataset with:\n", + "- **Positive examples**: Questions that should match cached FAQs\n", + "- **Negative examples**: Questions that should NOT match cached FAQs\n", + "- **Edge cases**: Slightly different phrasings to test threshold sensitivity\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Test dataset created\n", + " Positive examples: 5\n", + " Negative examples: 5\n", + " Edge cases: 5\n" + ] + } + ], + "source": [ + "# Create test dataset with positive examples (should match NVIDIA FAQs). We'll take the first 5 from our generated FAQs and modify them slightly.\n", + "positive_examples = [\n", + "{'query': \"What's the fiscal year end for NVIDIA Corporation?\",\n", + " 'expected_answer': 'The fiscal year ended January 29, 2023.',\n", + " 'category': 'general',\n", + " 'expected_match': True} ,\n", + "{'query': \"What is the trading symbol of NVIDIA Corporation's common stock in the market?\",\n", + " 'expected_answer': \"The trading symbol for NVIDIA Corporation's common stock is NVDA.\",\n", + " 'category': 'financial' ,\n", + " 'expected_match': True} ,\n", + "{'query': 'Where is the location of the executive office?',\n", + " 'expected_answer': 'The principal executive office of NVIDIA Corporation is located at 2788 San Tomas Expressway, Santa Clara, California 95051.',\n", + " 'category': 'operations' ,\n", + " 'expected_match': True} ,\n", + "{'query': 'Does the SEC consider NVIDIA Corporation a well-known seasoned issuer?',\n", + " 'expected_answer': 'No, NVIDIA Corporation is not considered a well-known seasoned issuer as indicated by the check mark in the document.',\n", + " 'category': 'financial' ,\n", + " 'expected_match': True} ,\n", + "{'query': \"In what exchange platform is NVIDIA's stock traded in?\",\n", + " 'expected_answer': \"NVIDIA Corporation's common stock is registered on The Nasdaq Global Select Market.\",\n", + " 'category': 'financial' ,\n", + " 'expected_match': True} ,\n", + "]\n", + "\n", + "# Create test dataset with negative examples\n", + "negative_examples = [\n", + " {\"query\": \"Where are these reports being submitted and who is reading them?\", \"expected_match\": False, \"category\": \"off-topic\"},\n", + " {\"query\": \"What is Jensen Huang's net worth?\", \"expected_match\": False, \"category\": \"off-topic\"},\n", + " {\"query\": \"What games run best on the RTX 4090? 
NVIDIA GPU?\", \"expected_match\": False, \"category\": \"off-topic\"},\n", + " {\"query\": \"What time is it?\", \"expected_match\": False, \"category\": \"off-topic\"},\n", + " {\"query\": \"Should I invest my life savings in this organization?\", \"expected_match\": False, \"category\": \"general\"},\n", + "]\n", + "\n", + "# Create test dataset with edge cases (slightly different phrasings)\n", + "edge_cases = [\n", + " {\"query\": \"What's the fiscal year end for Microsoft Corporation?\", \"expected_match\": False, \"category\": \"general\"},\n", + " {\"query\": \"What is the company total revenue for the last 5 years?\", \"expected_match\": False, \"category\": \"financial\"},\n", + " {\"query\": \"What's the location of the manufacturing plant for NVIDIA?\", \"expected_match\": False, \"category\": \"general\"},\n", + " {\"query\": \"Where are the locations of each office of NVIDIA Corporation?\", \"expected_match\": False, \"category\": \"off-topic\"},\n", + " {\"query\": \"What is the trading symbold of NVIDIA Corporation on the Japan exchange?\", \"expected_match\": False, \"category\": \"general\"},\n", + "]\n", + "\n", + "print(f\"Test dataset created\")\n", + "print(f\" Positive examples: {len(positive_examples)}\")\n", + "print(f\" Negative examples: {len(negative_examples)}\")\n", + "print(f\" Edge cases: {len(edge_cases)}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Test semantic similarity\n", + "Let's test how the cache performs with different types of queries and matching thresholds.\n", + "\n", + "We'll run through our 15 sample questions and track which ones get a hit and which ones don't. We'll also track if they should have hit.\n", + "\n", + "This will give us a baseline of our cache performance." + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Testing semantic similarity:\n", + "\n", + "10:41:05 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "0. Cache HIT (distance: 0.0842)\n", + " Original query: What's the fiscal year end for NVIDIA Corporation?\n", + " Matched: What is the fiscal year end date for NVIDIA Corporation as reported in the Form ...\n", + "10:41:05 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "1. Cache HIT (distance: 0.0174)\n", + " Original query: What is the trading symbol of NVIDIA Corporation's common stock in the market?\n", + " Matched: What is the trading symbol for NVIDIA Corporation's common stock?...\n", + "10:41:05 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "2. Cache MISS\n", + " Original query: Where is the location of the executive office?\n", + " Expected match: True\n", + "10:41:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "3. 
Cache HIT (distance: 0.0352)\n", + " Original query: Does the SEC consider NVIDIA Corporation a well-known seasoned issuer?\n", + " Matched: Is NVIDIA Corporation considered a well-known seasoned issuer?...\n", + "10:41:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "4. Cache HIT (distance: 0.1314)\n", + " Original query: In what exchange platform is NVIDIA's stock traded in?\n", + " Matched: What is the trading symbol for NVIDIA Corporation's common stock?...\n", + "10:41:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "5. Cache MISS\n", + "10:41:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "6. Cache MISS\n", + "10:41:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "7. Cache MISS\n", + "10:41:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "8. Cache MISS\n", + "10:41:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "9. Cache MISS\n", + "10:41:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10. Cache MISS\n", + "10:41:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "11. Cache HIT (distance: 0.1066)\n", + " Original query: What's the location of the manufacturing plant for NVIDIA?\n", + " MISS MATCHED: Where is NVIDIA headquartered?...\n", + "10:41:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "12. Cache HIT (distance: 0.0973)\n", + " Original query: What date does NVIDIA use as it's year end for acounting purposes?\n", + " Matched: When do NVIDIA's consumer products typically see stronger revenue?...\n", + "10:41:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "13. Cache HIT (distance: 0.0480)\n", + " Original query: Where are the locations of each office of NVIDIA Corporation?\n", + " MISS MATCHED: Where is the principal executive office of NVIDIA Corporation located?...\n", + "10:41:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "14. 
Cache HIT (distance: 0.0870)\n", + " Original query: What is the trading symbold of NVIDIA Corporation on the NASDAQ exchange?\n", + " Matched: What is the trading symbol for NVIDIA Corporation's common stock?...\n", + "\n", + "Summary Metrics:\n", + " Accuracy: 80.000%\n", + " Precision: 75.000%\n", + " Recall: 85.714%\n", + " F1 Score: 80.000%\n" + ] + } + ], + "source": [ + "# Test with semantically similar queries\n", + "print(\"Testing semantic similarity:\\n\")\n", + "\n", + "full_test_data = positive_examples + negative_examples + edge_cases\n", + "\n", + "# Track our metrics\n", + "true_positives = 0 # we have a hit and it hould match\n", + "false_positives = 0 # we have a hit and it SHOULD NOT match\n", + "false_negatives = 0 # we have a miss and it SHOULD match\n", + "true_negatives = 0 # we have a miss and it SHOULD NOT match\n", + "\n", + "for i, question in enumerate(full_test_data):\n", + " result = cache.check(prompt=question['query'], return_fields=[\"prompt\", \"response\", \"distance\"])\n", + "\n", + " if result and question['expected_match']:\n", + " true_positives += 1\n", + " print(f\"{i}. Cache HIT (distance: {result[0].get('vector_distance', 'N/A'):.4f})\")\n", + " print(f\" Original query: {question['query']}\")\n", + " print(f\" Matched: {result[0]['prompt'][:80]}...\")\n", + " elif result and not question['expected_match']:\n", + " false_positives += 1\n", + " print(f\"{i}. Cache HIT (distance: {result[0].get('vector_distance', 'N/A'):.4f})\")\n", + " print(f\" Original query: {question['query']}\")\n", + " print(f\" MISS MATCHED: {result[0]['prompt'][:80]}...\")\n", + " elif not result and question['expected_match']:\n", + " false_negatives += 1\n", + " print(f\"{i}. Cache MISS\")\n", + " print(f\" Original query: {question['query']}\")\n", + " print(f\" Expected match: {question['expected_match']}\")\n", + " elif not result and not question['expected_match']:\n", + " true_negatives += 1\n", + " print(f\"{i}. Cache MISS\")\n", + "\n", + "# Calculate our summary metrics\n", + "accuracy = (true_positives + true_negatives) / len(full_test_data)\n", + "precision = true_positives / (true_positives + false_positives)\n", + "recall = true_positives / (true_positives + false_negatives)\n", + "f1_score = 2 * (precision * recall) / (precision + recall)\n", + "\n", + "print(f\"\\nSummary Metrics:\")\n", + "print(f\" Accuracy: {100*accuracy:.3f}%\")\n", + "print(f\" Precision: {100*precision:.3f}%\")\n", + "print(f\" Recall: {100*recall:.3f}%\")\n", + "print(f\" F1 Score: {100*f1_score:.3f}%\")\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 6. Tune cache threshold\n", + "\n", + "Using sample questions, we can find the optimal distance threshold based on our test dataset." 
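,
+    "\n",
+    "The sweep below calls `cache.check()` once per test query and then compares the returned `vector_distance` against each candidate threshold locally, so the cache is not re-queried for every threshold value. Once a threshold is chosen, it can be passed per request, as the RAG section later does. A minimal sketch, with a made-up query and the threshold selected by the sweep below:\n",
+    "\n",
+    "```python\n",
+    "hits = cache.check(\n",
+    "    prompt=\"What is NVIDIA's ticker symbol?\",  # hypothetical user query\n",
+    "    distance_threshold=0.2,  # value selected by the sweep below\n",
+    ")\n",
+    "if hits:\n",
+    "    print(hits[0][\"response\"])\n",
+    "```"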
+ ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Testing cache performance across different similarity thresholds...\n", + "10:41:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:09 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:09 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:09 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:09 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "THRESHOLD OPTIMIZATION RESULTS\n", + "====================================================================================================\n", + "\n", + "Performance Metrics by Threshold:\n", + " Threshold Total Hits Total Misses True Positives False Positives True Negatives False Negatives Precision Recall F1 Score Accuracy\n", + " 0.20 8 7 6 2 6 1 0.75 0.857143 0.8 0.8\n", + " 0.30 8 7 6 2 6 1 0.75 0.857143 0.8 0.8\n", + " 0.40 8 7 6 2 6 1 0.75 0.857143 0.8 0.8\n", + " 0.50 8 7 6 2 6 1 0.75 0.857143 0.8 0.8\n", + " 0.60 8 7 6 2 6 1 0.75 0.857143 0.8 0.8\n", + " 0.70 8 7 6 2 6 1 0.75 0.857143 0.8 0.8\n", + " 0.80 8 7 6 2 6 1 0.75 0.857143 0.8 0.8\n", + " 0.85 8 7 6 2 6 1 0.75 0.857143 0.8 0.8\n", + " 0.90 8 7 6 2 6 1 0.75 0.857143 0.8 0.8\n", + " 0.95 8 7 6 2 6 1 0.75 0.857143 0.8 0.8\n", + " 1.00 8 7 6 2 6 1 0.75 0.857143 0.8 
0.8\n", + "\n", + "====================================================================================================\n", + "OPTIMAL THRESHOLD: 0.2\n", + " F1 Score: 0.800\n", + " Precision: 0.750\n", + " Recall: 0.857\n", + " Accuracy: 0.800\n", + "====================================================================================================\n", + "\n", + "Detailed breakdown at optimal threshold (0.2):\n", + "\n" + ] + } + ], + "source": [ + "# Test a range of different cache similarity thresholds\n", + "import pandas as pd\n", + "\n", + "# Define threshold ranges to test\n", + "thresholds_to_test = [0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.85, 0.90, 0.95, 1.00]\n", + "\n", + "print(\"Testing cache performance across different similarity thresholds...\")\n", + "\n", + "# Store results for all queries to reuse across thresholds\n", + "query_results = []\n", + "for test_case in full_test_data:\n", + " result = cache.check(prompt=test_case['query'], return_fields=[\"prompt\", \"response\", \"vector_distance\", \"entry_id\"])\n", + "\n", + " query_results.append({\n", + " 'query': test_case['query'],\n", + " 'expected_match': test_case['expected_match'],\n", + " 'cache_result': result[0] if result else None,\n", + " 'distance': result[0].get('vector_distance') if result else float('inf')\n", + " })\n", + "\n", + "# Evaluate each threshold\n", + "results = []\n", + "\n", + "for threshold in thresholds_to_test:\n", + " true_positives = 0\n", + " false_positives = 0\n", + " true_negatives = 0\n", + " false_negatives = 0\n", + "\n", + " for query_data in query_results:\n", + " # Determine if this would be a cache hit at this threshold\n", + " is_cache_hit = query_data['distance'] < threshold\n", + " should_match = query_data['expected_match']\n", + "\n", + " if is_cache_hit and should_match:\n", + " true_positives += 1\n", + " elif is_cache_hit and not should_match:\n", + " false_positives += 1\n", + " elif not is_cache_hit and not should_match:\n", + " true_negatives += 1\n", + " elif not is_cache_hit and should_match:\n", + " false_negatives += 1\n", + "\n", + " # Calculate metrics\n", + " total_hits = true_positives + false_positives\n", + " total_misses = true_negatives + false_negatives\n", + "\n", + " precision = true_positives / (true_positives + false_positives) if (true_positives + false_positives) > 0 else 0\n", + " recall = true_positives / (true_positives + false_negatives) if (true_positives + false_negatives) > 0 else 0\n", + " f1_score = 2 * (precision * recall) / (precision + recall) if (precision + recall) > 0 else 0\n", + " accuracy = (true_positives + true_negatives) / len(full_test_data)\n", + "\n", + " results.append({\n", + " 'Threshold': threshold,\n", + " 'Total Hits': total_hits,\n", + " 'Total Misses': total_misses,\n", + " 'True Positives': true_positives,\n", + " 'False Positives': false_positives,\n", + " 'True Negatives': true_negatives,\n", + " 'False Negatives': false_negatives,\n", + " 'Precision': precision,\n", + " 'Recall': recall,\n", + " 'F1 Score': f1_score,\n", + " 'Accuracy': accuracy\n", + " })\n", + "\n", + "# Display results in a formatted table\n", + "df_results = pd.DataFrame(results)\n", + "\n", + "print(\"THRESHOLD OPTIMIZATION RESULTS\")\n", + "print(\"=\"*100)\n", + "print(\"\\nPerformance Metrics by Threshold:\")\n", + "print(df_results.to_string(index=False))\n", + "\n", + "# Find optimal threshold based on F1 score\n", + "optimal_idx = df_results['F1 Score'].idxmax()\n", + "optimal_threshold = df_results.loc[optimal_idx, 
'Threshold']\n",
+    "optimal_f1 = df_results.loc[optimal_idx, 'F1 Score']\n",
+    "\n",
+    "print(\"\\n\" + \"=\"*100)\n",
+    "print(f\"OPTIMAL THRESHOLD: {optimal_threshold}\")\n",
+    "print(f\" F1 Score: {optimal_f1:.3f}\")\n",
+    "print(f\" Precision: {df_results.loc[optimal_idx, 'Precision']:.3f}\")\n",
+    "print(f\" Recall: {df_results.loc[optimal_idx, 'Recall']:.3f}\")\n",
+    "print(f\" Accuracy: {df_results.loc[optimal_idx, 'Accuracy']:.3f}\")\n",
+    "print(\"=\"*100)\n",
+    "\n",
+    "# Show detailed breakdown for optimal threshold\n",
+    "print(f\"\\nDetailed breakdown at optimal threshold ({optimal_threshold}):\\n\")\n",
+    "for query_data in query_results:\n",
+    "    is_cache_hit = query_data['distance'] < optimal_threshold\n",
+    "    should_match = query_data['expected_match']\n",
+    "\n",
+    "    status = \"\"\n",
+    "    if is_cache_hit and should_match:\n",
+    "        status = \"✓ TP (True Positive)\"\n",
+    "    elif is_cache_hit and not should_match:\n",
+    "        status = \"✗ FP (False Positive)\"\n",
+    "    elif not is_cache_hit and not should_match:\n",
+    "        status = \"✓ TN (True Negative)\"\n",
+    "    elif not is_cache_hit and should_match:\n",
+    "        status = \"✗ FN (False Negative)\"\n",
+    "\n",
+    "    # Print each query with its classification so the breakdown is actually displayed\n",
+    "    print(f\"  {status}: {query_data['query']}\")\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 7. RAG pipeline integration\n",
+    "\n",
+    "Now let's integrate the semantic cache into a complete RAG pipeline and measure the performance improvements."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Build a simple RAG chain\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 17,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "RAG chain created\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Create a simple RAG prompt template\n",
+    "rag_template = ChatPromptTemplate.from_messages([\n",
+    "    (\"system\", \"You are a helpful assistant answering questions about NVIDIA based on their 10-K filing. 
Provide accurate, concise answers.\"),\n", + " (\"user\", \"{question}\")\n", + "])\n", + "\n", + "# Create RAG chain\n", + "rag_chain = rag_template | llm\n", + "\n", + "print(\"RAG chain created\")\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Create cached RAG function\n" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Cached RAG function ready\n" + ] + } + ], + "source": [ + "def rag_with_cache(question: str, use_cache: bool = True) -> tuple:\n", + " \"\"\"\n", + " Process a question through RAG pipeline with optional semantic caching.\n", + "\n", + " Returns: A tuple of (answer, cache_hit, response_time)\n", + " \"\"\"\n", + " start_time = time.time()\n", + " cache_hit = False\n", + "\n", + " # Check cache first if enabled\n", + " if use_cache:\n", + " cached_result = cache.check(prompt=question, distance_threshold=optimal_threshold)\n", + " if cached_result:\n", + " answer = cached_result[0]['response']\n", + " cache_hit = True\n", + " response_time = time.time() - start_time\n", + " return answer, cache_hit, response_time\n", + "\n", + " # Cache miss - use LLM\n", + " answer = rag_chain.invoke({\"question\": question})\n", + " response_time = time.time() - start_time\n", + "\n", + " # Store in cache for future use\n", + " if use_cache and hasattr(answer, 'content'):\n", + " cache.store(prompt=question, response=answer.content)\n", + " elif use_cache:\n", + " cache.store(prompt=question, response=str(answer))\n", + "\n", + " return answer.content if hasattr(answer, 'content') else str(answer), cache_hit, response_time\n", + "\n", + "print(\"Cached RAG function ready\")\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Performance comparison: with vs without cache\n" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "================================================================================\n", + "PERFORMANCE COMPARISON: With Cache vs Without Cache\n", + "================================================================================\n", + "\n", + "[FIRST PASS - Populating Cache]\n", + "\n", + "10:40:12 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "1. What is NVIDIA's primary business?\n", + " Cache: HIT | Time: 0.116s\n", + " Answer: NVIDIA has expanded into several large and important computationally intensive fields including scie...\n", + "\n", + "10:40:13 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "2. How much revenue did NVIDIA generate?\n", + " Cache: HIT | Time: 0.120s\n", + " Answer: NVIDIA's consumer products usually see stronger revenue in the second half of their fiscal year, wit...\n", + "\n", + "10:40:13 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "3. 
What are NVIDIA's main products?\n", + " Cache: HIT | Time: 0.125s\n", + " Answer: NVIDIA specializes in four large markets: Data Center, Gaming, Professional Visualization, and Autom...\n", + "\n", + "\n", + "[SECOND PASS - Cache Hits with Paraphrased Questions]\n", + "\n", + "10:40:13 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "1. What does NVIDIA do as a business?\n", + " Cache: HIT ✓ | Time: 0.122s\n", + " Answer: NVIDIA's business has evolved from a primary focus on gaming products to broader markets, including ...\n", + "\n", + "10:40:13 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "2. Can you tell me NVIDIA's revenue figures?\n", + " Cache: HIT ✓ | Time: 0.119s\n", + " Answer: NVIDIA announces material financial information to investors through its investor relations website,...\n", + "\n", + "10:40:13 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "3. What products does NVIDIA sell?\n", + " Cache: HIT ✓ | Time: 0.125s\n", + " Answer: NVIDIA's Graphics segment includes GeForce GPUs for gaming and PCs, the GeForce NOW game streaming s...\n", + "\n", + "\n", + "[THIRD PASS - Without Cache (Baseline)]\n", + "\n", + "10:40:15 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "1. What is NVIDIA's primary business?\n", + " Cache: DISABLED | Time: 1.640s\n", + "\n", + "10:40:17 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "2. How much revenue did NVIDIA generate?\n", + " Cache: DISABLED | Time: 2.014s\n", + "\n", + "10:40:21 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "3. What are NVIDIA's main products?\n", + " Cache: DISABLED | Time: 4.001s\n", + "\n", + "\n", + "================================================================================\n", + "PERFORMANCE SUMMARY\n", + "================================================================================\n", + "Average time - First pass (cache miss): 0.120s\n", + "Average time - Second pass (cache hit): 0.122s\n", + "Average time - Without cache: 2.552s\n", + "\n", + "Speedup with cache: 1.0x faster\n", + " Cache hit rate: 0%\n" + ] + } + ], + "source": [ + "# Test questions for RAG evaluation\n", + "test_questions_rag = [\n", + " \"What is NVIDIA's primary business?\",\n", + " \"How much revenue did NVIDIA generate?\",\n", + " \"What are NVIDIA's main products?\",\n", + "]\n", + "\n", + "print(\"\\n\" + \"=\"*80)\n", + "print(\"PERFORMANCE COMPARISON: With Cache vs Without Cache\")\n", + "print(\"=\"*80)\n", + "\n", + "# First pass - populate cache (cache misses, must call LLM)\n", + "print(\"\\n[FIRST PASS - Populating Cache]\\n\")\n", + "first_pass_times = []\n", + "\n", + "for i, question in enumerate(test_questions_rag, 1):\n", + " answer, cache_hit, response_time = rag_with_cache(question, use_cache=True)\n", + " first_pass_times.append(response_time)\n", + " print(f\"{i}. 
{question}\")\n",
+    "    print(f\" Cache: {'HIT' if cache_hit else 'MISS'} | Time: {response_time:.3f}s\")\n",
+    "    print(f\" Answer: {answer[:100]}...\\n\")\n",
+    "\n",
+    "# Second pass - test cache hits with similar questions\n",
+    "print(\"\\n[SECOND PASS - Cache Hits with Paraphrased Questions]\\n\")\n",
+    "second_pass_times = []\n",
+    "second_pass_hits = []\n",
+    "\n",
+    "similar_questions = [\n",
+    "    \"What does NVIDIA do as a business?\",\n",
+    "    \"Can you tell me NVIDIA's revenue figures?\",\n",
+    "    \"What products does NVIDIA sell?\",\n",
+    "]\n",
+    "\n",
+    "for i, question in enumerate(similar_questions, 1):\n",
+    "    answer, cache_hit, response_time = rag_with_cache(question, use_cache=True)\n",
+    "    second_pass_times.append(response_time)\n",
+    "    second_pass_hits.append(cache_hit)\n",
+    "    print(f\"{i}. {question}\")\n",
+    "    print(f\" Cache: {'HIT ✓' if cache_hit else 'MISS ✗'} | Time: {response_time:.3f}s\")\n",
+    "    print(f\" Answer: {answer[:100]}...\\n\")\n",
+    "\n",
+    "# Third pass - without cache (baseline)\n",
+    "print(\"\\n[THIRD PASS - Without Cache (Baseline)]\\n\")\n",
+    "no_cache_times = []\n",
+    "\n",
+    "for i, question in enumerate(test_questions_rag, 1):\n",
+    "    answer, _, response_time = rag_with_cache(question, use_cache=False)\n",
+    "    no_cache_times.append(response_time)\n",
+    "    print(f\"{i}. {question}\")\n",
+    "    print(f\" Cache: DISABLED | Time: {response_time:.3f}s\\n\")\n",
+    "\n",
+    "# Summary\n",
+    "print(\"\\n\" + \"=\"*80)\n",
+    "print(\"PERFORMANCE SUMMARY\")\n",
+    "print(\"=\"*80)\n",
+    "avg_first = sum(first_pass_times)/len(first_pass_times)\n",
+    "avg_second = sum(second_pass_times)/len(second_pass_times)\n",
+    "avg_no_cache = sum(no_cache_times)/len(no_cache_times)\n",
+    "\n",
+    "print(f\"Average time - First pass (original questions): {avg_first:.3f}s\")\n",
+    "print(f\"Average time - Second pass (paraphrased questions): {avg_second:.3f}s\")\n",
+    "print(f\"Average time - Without cache: {avg_no_cache:.3f}s\")\n",
+    "\n",
+    "# Compare cached responses against the no-cache LLM baseline\n",
+    "if avg_second > 0:\n",
+    "    speedup = avg_no_cache / avg_second\n",
+    "    print(f\"\\nSpeedup with cache vs. no cache: {speedup:.1f}x faster\")\n",
+    "\n",
+    "# Hit rate comes from the cache_hit flags returned by rag_with_cache\n",
+    "cache_hit_rate = sum(second_pass_hits) / len(similar_questions)\n",
+    "print(f\" Cache hit rate (second pass): {cache_hit_rate*100:.0f}%\")\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 8. Best Practices and Tips\n",
+    "\n",
+    "### Key Takeaways\n",
+    "\n",
+    "1. **Threshold Optimization**: Start conservative (0.10-0.15) and optimize based on real usage data\n",
+    "2. **Doc-to-Cache**: Pre-populate your cache with high-quality FAQs for immediate benefits\n",
+    "3. **Monitoring**: Track cache hit rates and adjust thresholds as user patterns emerge\n",
+    "4. **Model Selection**: The `langcache-embed-v1` model is specifically optimized for caching tasks\n",
+    "5. **Cost-Performance Balance**: Even a 50% cache hit rate provides significant cost savings\n",
+    "\n",
+    "### When to Use Semantic Caching\n",
+    "\n",
+    "✅ **Good Use Cases:**\n",
+    "- High-traffic applications with repeated question patterns\n",
+    "- Customer support chatbots\n",
+    "- FAQ systems\n",
+    "- Documentation Q&A\n",
+    "- Product information queries\n",
+    "- Educational content Q&A\n",
+    "\n",
+    "❌ **Less Suitable:**\n",
+    "- Highly dynamic content requiring real-time data\n",
+    "- Creative writing tasks needing variety\n",
+    "- Personalized responses based on user-specific context\n",
+    "- Time-sensitive queries (use TTL if needed)\n",
+    "\n",
+    "### Performance Tips\n",
+    "\n",
+    "1. **Batch Loading**: Pre-populate cache with Doc-to-Cache for immediate value\n",
+    "2. **Monitor Hit Rates**: Track and adjust thresholds based on production metrics (see the sketch below)\n",
+    "3. **A/B Testing**: Test different thresholds with a subset of traffic\n",
+    "4. **Cache Warming**: Regularly update cache with trending topics\n",
+    "5. **TTL Management**: Set time-to-live for entries that may become stale\n"
+   ]
+  },
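+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Optional: a hit-rate monitoring sketch\n",
+    "\n",
+    "As a quick illustration of the \"Monitor Hit Rates\" tip above, the sketch below re-checks the paraphrased questions against the cache and reports the hit rate at the chosen threshold. It is a minimal sketch that only reuses `cache.check()`, `similar_questions`, and `optimal_threshold` from earlier cells; the helper name `measure_hit_rate` is an illustrative choice, not part of the LangCache API.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def measure_hit_rate(probe_questions, threshold):\n",
+    "    \"\"\"Return the fraction of probe questions served from the cache at the given threshold.\"\"\"\n",
+    "    hits = 0\n",
+    "    for question in probe_questions:\n",
+    "        # cache.check returns an empty/falsy result when nothing is within the threshold\n",
+    "        if cache.check(prompt=question, distance_threshold=threshold):\n",
+    "            hits += 1\n",
+    "    return hits / len(probe_questions)\n",
+    "\n",
+    "# Example: re-check the paraphrased questions from the comparison above\n",
+    "hit_rate = measure_hit_rate(similar_questions, optimal_threshold)\n",
+    "print(f\"Cache hit rate at threshold {optimal_threshold}: {hit_rate*100:.0f}%\")"
+   ]
+  },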
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 9. Cleanup\n",
+    "\n",
+    "Clean up resources when done.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 20,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Clear cache contents\n",
+    "# cache.clear()\n",
+    "# print(\"Cache contents cleared\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Summary\n",
+    "\n",
+    "Congratulations! You've completed this comprehensive guide on semantic caching with LangCache and RedisVL.\n",
+    "\n",
+    "**What You've Learned:**\n",
+    "- ✅ Set up and configure LangCache with Redis Cloud\n",
+    "- ✅ Load and process PDF documents into knowledge bases\n",
+    "- ✅ Generate FAQs using the Doc-to-Cache technique with LLMs\n",
+    "- ✅ Pre-populate a semantic cache with tagged entries\n",
+    "- ✅ Test different cache matching strategies and thresholds\n",
+    "- ✅ Optimize cache performance using test datasets\n",
+    "- ✅ Leverage the `redis/langcache-embed-v1` embedding model\n",
+    "- ✅ Integrate semantic caching into RAG pipelines\n",
+    "- ✅ Measure performance improvements and cost savings\n",
+    "\n",
+    "**Next Steps:**\n",
+    "- Experiment with different distance thresholds for your use case\n",
+    "- Try other embedding models and compare performance\n",
+    "- Implement cache analytics and monitoring in production\n",
+    "- Explore advanced features like TTL, metadata filtering, and cache warming strategies\n",
+    "- Scale your semantic cache to handle production traffic\n",
+    "\n",
+    "**Resources:**\n",
+    "- [RedisVL Documentation](https://docs.redisvl.com/en/stable/index.html)\n",
+    "- [LangCache Sign Up](https://redis.io/langcache/)\n",
+    "- [Redis AI Resources](https://github.com/redis-developer/redis-ai-resources)\n",
+    "- [Semantic Caching Paper](https://arxiv.org/abs/2504.02268)\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "redis-ai-res",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.9"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}